System/360 Assembly Language - Caesar Cipher

In the last part of this series we’ll write a real program in assembly language.

The problem

We’ll implement the Caesar cipher from Rosetta Code. This is a very simple cipher where the key indicates how much to rotate the alphabet by. So with a key of 2, ABC XYZ would be encoded as CDE ZAB. The solution should handle both encoding and decoding; decoding simply needs a rotation by the key in the opposite direction.

Finding a algorithm

This appears quite simple at first. Take each letter’s code, add the key and return the new letter, accounting for wrap around at Z.

However, this assumes that letters are coded in a contiguous manner, as they are in ASCII. The System/360 uses EBCDIC, and if you look at the code chart on Wikipedia you’ll see that A, B, C, … I are contiguous, but there’s then a gap of 7 characters before J, K, … R, then another gap and so on.

Now you could account for this but it would be very fiddly in assembler. Instead, let’s try another way.

What if we transformed A, B, C into offsets 0, 1, 2. Next, add the key and use that as an index into a table of letters repeated to allow wrap around. This should work, but how to implement it?

Turning letters into offsets

The first step is turning A, B, C into 0, 1, 2. There’s a useful opcode called TR that can help us here.

TR WORK,OFFSETS

This will take each byte in WORK, look up its position in OFFSETS and update WORK to be the value found there. So if WORK[0] is ‘A’ (193 in EBCDIC), find the value at OFFSETS[193], which we could arrange to be 0, and then update WORK[0] to be that value.

For this to work, we need to set up OFFSETS correctly. This can be done with some trickery with DC:

OFFSETS  DC    256AL1(*-OFFSETS)
         ORG   OFFSETS+C'A'
         DC    X'000102030405060708'
         ORG   OFFSETS+C'J'
         DC    X'090A0B0C0D0E0F1011'
         ORG   OFFSETS+C'S'
         DC    X'1213141516171819'
         ORG

The first line defines

The effect of this is to create a block of memory 256 bytes long with values 0, 1, 2, …, 255.

The next line uses ORG to set the location counter to OFFSETS+C'A', ie the position where A would be in the EBCDIC table. We then overwrite a part of the 256 byte block with the bytes 00, 01, 02, …, 09, creating the offset to replace letters A - I with. We do this again for the letter run starting with J and S.

Turning offsets into letters

To go the other way, ie change an offset like 0, 1 into letters A, B, we define a block of letters:

FLAT0    DC    C'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
FLAT     DC    C'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
FLAT1    DC    C'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

We do this contiguously three times to allow for forward and reverse rotations with the key, so FLAT+1 gives us B and FLAT-1 gives us Z. Doing it this way allows easy access; we could just use one block of letters and deal with rotated offsets past the ends with extra logic, but I don’t mind giving up memory for ease of access here.

Byte access to memory

The final piece we need is a way to access bytes in memory. For this we can read a byte (insert a character) with IC, eg:

         IC    4,WORK(1)

This will set register 4 to the value in memory at the byte addressed by WORK + 1.

To store a character back into memory, STC can be used in a similar way.

The ROT subroutine

With this we have enough to define the subroutine to rotate a message by a given key:

* ROT subroutines implements Caesar cipher message letter rotation
*
* Parameters:
* R0 - Key for cipher. 1 to 25 to encode, -1 to -25 to decode
* WORK - Buffer containing message to encode/decode of length MSGLEN
*        Will be overwritten with encoded/decoded message
* Return values:
* None
*
         ENTRY ROT                Subroutine entry for ROT
ROT      ENTER 12,SA=RSAVE        Use R12 for base address
         TR    WORK,OFFSETS       Translate letters into offsets
         L     1,=F'0'            Loop start value
         L     2,=F'1'            Loop increment
         LA    3,L'MESSAGE-1      Loop final value
LOOP     IC    4,WORK(1)          Get message byte at loop index
         C     4,MAXOFF           Is it an offset?
         BH    STORE              If no, don't change it
         LA    10,FLAT            Load base of flat letter array
         AR    10,0               Add the key displacement
         IC    4,0(4,10)          Fetch into R4 FLAT+KEY+OFFSET
STORE    STC   4,WORK(1)          Store char back into work
         BXLE  1,2,LOOP           Loop if index <= final
         EXIT

We use a predefined area of memory WORK to do the transformation rather than passing in an address, again just for convenience.

Note the explicit use of addressing in IC 4,0(4,10). The second parameter means use register 10 as the base address and add the contents of register 4 to the displacement 0.

One other observation is that the BXLE loop is only needed as we want to skip over non-alphabetic characters in the message. If we knew the message can only contain A-Z, we could do something like TR WORK,FLAT+KEY to do the whole message in one go.

Putting it together

We need a couple more things: first the definition of the message and key, and working areas:

*
* Inputs to program
*
MESSAGE  DC    C'THE FIVE BOXING WIZARDS JUMP QUICKLY!'
KEY      DC    F'7'
*
* Working buffers
*
ENC      DS    CL(L'MESSAGE)
DEC      DS    CL(L'MESSAGE)
WORK     DS    CL(L'MESSAGE)
RSAVE    DS    18F                Save area
*
* Constants
*
MAXOFF   DC    F'25'

And finally, a test call to encode and decode the message, using MVC to copy a block of memory to another location:

CAESAR   START 0                  Program at relative address 0
         USING CAESAR,12          R12 will contain program addr
         LR    12,15              Load R12 with absolute addr
*
* Encode the plain-text message to ENC
         MVC   WORK,MESSAGE       Copy message to WORK
         L     0,KEY              Set key
         CALL  ROT                Encode the message
         MVC   ENC,WORK           Copy encoded message to ENC
*
* Decode the encoded message to DEC
         L     0,KEY              Set the key
         LNR   0,0                Negate the key for decoding
         CALL  ROT                Decode the message
         MVC   DEC,WORK           Copy decoded message to DEC
*
* Print results and exit
         SPRINT MESSAGE           Print the original message
         SPRINT ENC               Print the decoded message
         SPRINT DEC               Print the encoded message
         SYSTEM                   Exit program

Assembling and running the program

$ run *asmg scards=caesar.asm spunch=-load sercom=*sink* par=test

...

$run -load
Execution begins   18:29:02
THE FIVE BOXING WIZARDS JUMP QUICKLY!
AOL MPCL IVEPUN DPGHYKZ QBTW XBPJRSF!
THE FIVE BOXING WIZARDS JUMP QUICKLY!
Execution terminated   18:29:02  T=0  RC=0  $0.00

Thoughts on the program and assembly

I think that was the hardest program to write of all the languages so far. The documentation is very good, but terse - I can imagine reading the Principle of Operation several times would pay off greatly. Also having to flip between Principles, the Assembler F reference and MTS documentation rather than having everything in one place was tough. But it was very rewarding to get the program running.

I’m not entirely happy with the final design I came up with - I’m sure there’s an easier or more efficient way. I’d love to hear from experience System/360 assembly language programmers out there how you’d implement this.

Further information

Full source code for this program can be found on github.

Comments

comments powered by Disqus