In the last part of this series we'll write a real program in assembly language.
We'll implement the Caesar cipher from Rosetta Code. This is a very simple cipher where the key indicates how much to rotate the alphabet by. So with a key of 2,
ABC XYZ would be encoded as
CDE ZAB. The solution should handle both encoding and decoding; decoding simply needs a rotation by the key in the opposite direction.
Finding a algorithm
This appears quite simple at first. Take each letter's code, add the key and return the new letter, accounting for wrap around at Z.
However, this assumes that letters are coded in a contiguous manner, as they are in ASCII. The System/360 uses EBCDIC, and if you look at the code chart on Wikipedia you'll see that A, B, C, ... I are contiguous, but there's then a gap of 7 characters before J, K, ... R, then another gap and so on.
Now you could account for this but it would be very fiddly in assembler. Instead, let's try another way.
What if we transformed A, B, C into offsets 0, 1, 2. Next, add the key and use that as an index into a table of letters repeated to allow wrap around. This should work, but how to implement it?
Turning letters into offsets
The first step is turning A, B, C into 0, 1, 2. There's a useful opcode called
TR that can help us here.
This will take each byte in WORK, look up its position in OFFSETS and update WORK to be the value found there. So if WORK is 'A' (193 in EBCDIC), find the value at OFFSETS, which we could arrange to be 0, and then update WORK to be that value.
For this to work, we need to set up OFFSETS correctly. This can be done with some trickery with
OFFSETS DC 256AL1(*-OFFSETS) ORG OFFSETS+C'A' DC X'000102030405060708' ORG OFFSETS+C'J' DC X'090A0B0C0D0E0F1011' ORG OFFSETS+C'S' DC X'1213141516171819' ORG
The first line defines
- an address
A, 32 bits
- converted to length of 1 byte with
- repeated 256 times
- with value at each time
*is the current location
The effect of this is to create a block of memory 256 bytes long with values 0, 1, 2, ..., 255.
The next line uses
ORG to set the location counter to
OFFSETS+C'A', ie the position where
A would be in the EBCDIC table. We then overwrite a part of the 256 byte block with the bytes 00, 01, 02, ..., 09, creating the offset to replace letters A - I with. We do this again for the letter run starting with J and S.
Turning offsets into letters
To go the other way, ie change an offset like 0, 1 into letters A, B, we define a block of letters:
FLAT0 DC C'ABCDEFGHIJKLMNOPQRSTUVWXYZ' FLAT DC C'ABCDEFGHIJKLMNOPQRSTUVWXYZ' FLAT1 DC C'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
We do this contiguously three times to allow for forward and reverse rotations with the key, so FLAT+1 gives us B and FLAT-1 gives us Z. Doing it this way allows easy access; we could just use one block of letters and deal with rotated offsets past the ends with extra logic, but I don't mind giving up memory for ease of access here.
Byte access to memory
The final piece we need is a way to access bytes in memory. For this we can read a byte (insert a character) with
This will set register 4 to the value in memory at the byte addressed by
WORK + 1.
To store a character back into memory,
STC can be used in a similar way.
The ROT subroutine
With this we have enough to define the subroutine to rotate a message by a given key:
* ROT subroutines implements Caesar cipher message letter rotation * * Parameters: * R0 - Key for cipher. 1 to 25 to encode, -1 to -25 to decode * WORK - Buffer containing message to encode/decode of length MSGLEN * Will be overwritten with encoded/decoded message * Return values: * None * ENTRY ROT Subroutine entry for ROT ROT ENTER 12,SA=RSAVE Use R12 for base address TR WORK,OFFSETS Translate letters into offsets L 1,=F'0' Loop start value L 2,=F'1' Loop increment LA 3,L'MESSAGE-1 Loop final value LOOP IC 4,WORK(1) Get message byte at loop index C 4,MAXOFF Is it an offset? BH STORE If no, don't change it LA 10,FLAT Load base of flat letter array AR 10,0 Add the key displacement IC 4,0(4,10) Fetch into R4 FLAT+KEY+OFFSET STORE STC 4,WORK(1) Store char back into work BXLE 1,2,LOOP Loop if index <= final EXIT
We use a predefined area of memory
WORK to do the transformation rather than passing in an address, again just for convenience.
Note the explicit use of addressing in
IC 4,0(4,10). The second parameter means use register 10 as the base address and add the contents of register 4 to the displacement 0.
One other observation is that the
BXLE loop is only needed as we want to skip over non-alphabetic characters in the message. If we knew the message can only contain A-Z, we could do something like
TR WORK,FLAT+KEY to do the whole message in one go.
Putting it together
We need a couple more things: first the definition of the message and key, and working areas:
* * Inputs to program * MESSAGE DC C'THE FIVE BOXING WIZARDS JUMP QUICKLY!' KEY DC F'7' * * Working buffers * ENC DS CL(L'MESSAGE) DEC DS CL(L'MESSAGE) WORK DS CL(L'MESSAGE) RSAVE DS 18F Save area * * Constants * MAXOFF DC F'25'
And finally, a test call to encode and decode the message, using
MVC to copy a block of memory to another location:
CAESAR START 0 Program at relative address 0 USING CAESAR,12 R12 will contain program addr LR 12,15 Load R12 with absolute addr * * Encode the plain-text message to ENC MVC WORK,MESSAGE Copy message to WORK L 0,KEY Set key CALL ROT Encode the message MVC ENC,WORK Copy encoded message to ENC * * Decode the encoded message to DEC L 0,KEY Set the key LNR 0,0 Negate the key for decoding CALL ROT Decode the message MVC DEC,WORK Copy decoded message to DEC * * Print results and exit SPRINT MESSAGE Print the original message SPRINT ENC Print the decoded message SPRINT DEC Print the encoded message SYSTEM Exit program
Assembling and running the program
$ run *asmg scards=caesar.asm spunch=-load sercom=*sink* par=test ... $run -load Execution begins 18:29:02 THE FIVE BOXING WIZARDS JUMP QUICKLY! AOL MPCL IVEPUN DPGHYKZ QBTW XBPJRSF! THE FIVE BOXING WIZARDS JUMP QUICKLY! Execution terminated 18:29:02 T=0 RC=0 $0.00
Thoughts on the program and assembly
I think that was the hardest program to write of all the languages so far. The documentation is very good, but terse - I can imagine reading the Principle of Operation several times would pay off greatly. Also having to flip between Principles, the Assembler F reference and MTS documentation rather than having everything in one place was tough. But it was very rewarding to get the program running.
I'm not entirely happy with the final design I came up with - I'm sure there's an easier or more efficient way. I'd love to hear from experience System/360 assembly language programmers out there how you'd implement this.
Full source code for this program can be found on github.