#### assembly

3 posts

In the last part of this series we'll write a real program in assembly language.

## The problem

We'll implement the Caesar cipher from Rosetta Code. This is a very simple cipher where the key indicates how much to rotate the alphabet by. So with a key of 2, `ABC XYZ` would be encoded as `CDE ZAB`. The solution should handle both encoding and decoding; decoding simply needs a rotation by the key in the opposite direction.

## Finding a algorithm

This appears quite simple at first. Take each letter's code, add the key and return the new letter, accounting for wrap around at Z.

However, this assumes that letters are coded in a contiguous manner, as they are in ASCII. The System/360 uses EBCDIC, and if you look at the code chart on Wikipedia you'll see that A, B, C, ... I are contiguous, but there's then a gap of 7 characters before J, K, ... R, then another gap and so on.

Now you could account for this but it would be very fiddly in assembler. Instead, let's try another way.

What if we transformed A, B, C into offsets 0, 1, 2. Next, add the key and use that as an index into a table of letters repeated to allow wrap around. This should work, but how to implement it?

## Turning letters into offsets

The first step is turning A, B, C into 0, 1, 2. There's a useful opcode called `TR` that can help us here.

``````TR WORK,OFFSETS
``````

This will take each byte in WORK, look up its position in OFFSETS and update WORK to be the value found there. So if WORK[0] is 'A' (193 in EBCDIC), find the value at OFFSETS[193], which we could arrange to be 0, and then update WORK[0] to be that value.

For this to work, we need to set up OFFSETS correctly. This can be done with some trickery with `DC`:

``````OFFSETS  DC    256AL1(*-OFFSETS)
ORG   OFFSETS+C'A'
DC    X'000102030405060708'
ORG   OFFSETS+C'J'
DC    X'090A0B0C0D0E0F1011'
ORG   OFFSETS+C'S'
DC    X'1213141516171819'
ORG
``````

The first line defines

• an address `A`, 32 bits
• converted to length of 1 byte with `L1`
• repeated 256 times
• with value at each time `*-OFFSETS`, where `*` is the current location

The effect of this is to create a block of memory 256 bytes long with values 0, 1, 2, ..., 255.

The next line uses `ORG` to set the location counter to `OFFSETS+C'A'`, ie the position where `A` would be in the EBCDIC table. We then overwrite a part of the 256 byte block with the bytes 00, 01, 02, ..., 09, creating the offset to replace letters A - I with. We do this again for the letter run starting with J and S.

## Turning offsets into letters

To go the other way, ie change an offset like 0, 1 into letters A, B, we define a block of letters:

``````FLAT0    DC    C'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
FLAT     DC    C'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
FLAT1    DC    C'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
``````

We do this contiguously three times to allow for forward and reverse rotations with the key, so FLAT+1 gives us B and FLAT-1 gives us Z. Doing it this way allows easy access; we could just use one block of letters and deal with rotated offsets past the ends with extra logic, but I don't mind giving up memory for ease of access here.

The final piece we need is a way to access bytes in memory. For this we can read a byte (insert a character) with `IC`, eg:

``````         IC    4,WORK(1)
``````

This will set register 4 to the value in memory at the byte addressed by `WORK` + 1.

To store a character back into memory, `STC` can be used in a similar way.

## The ROT subroutine

With this we have enough to define the subroutine to rotate a message by a given key:

``````* ROT subroutines implements Caesar cipher message letter rotation
*
* Parameters:
* R0 - Key for cipher. 1 to 25 to encode, -1 to -25 to decode
* WORK - Buffer containing message to encode/decode of length MSGLEN
*        Will be overwritten with encoded/decoded message
* Return values:
* None
*
ENTRY ROT                Subroutine entry for ROT
ROT      ENTER 12,SA=RSAVE        Use R12 for base address
TR    WORK,OFFSETS       Translate letters into offsets
L     1,=F'0'            Loop start value
L     2,=F'1'            Loop increment
LA    3,L'MESSAGE-1      Loop final value
LOOP     IC    4,WORK(1)          Get message byte at loop index
C     4,MAXOFF           Is it an offset?
BH    STORE              If no, don't change it
LA    10,FLAT            Load base of flat letter array
AR    10,0               Add the key displacement
IC    4,0(4,10)          Fetch into R4 FLAT+KEY+OFFSET
STORE    STC   4,WORK(1)          Store char back into work
BXLE  1,2,LOOP           Loop if index <= final
EXIT
``````

We use a predefined area of memory `WORK` to do the transformation rather than passing in an address, again just for convenience.

Note the explicit use of addressing in `IC 4,0(4,10)`. The second parameter means use register 10 as the base address and add the contents of register 4 to the displacement 0.

One other observation is that the `BXLE` loop is only needed as we want to skip over non-alphabetic characters in the message. If we knew the message can only contain A-Z, we could do something like `TR WORK,FLAT+KEY` to do the whole message in one go.

## Putting it together

We need a couple more things: first the definition of the message and key, and working areas:

``````*
* Inputs to program
*
MESSAGE  DC    C'THE FIVE BOXING WIZARDS JUMP QUICKLY!'
KEY      DC    F'7'
*
* Working buffers
*
ENC      DS    CL(L'MESSAGE)
DEC      DS    CL(L'MESSAGE)
WORK     DS    CL(L'MESSAGE)
RSAVE    DS    18F                Save area
*
* Constants
*
MAXOFF   DC    F'25'
``````

And finally, a test call to encode and decode the message, using `MVC` to copy a block of memory to another location:

``````CAESAR   START 0                  Program at relative address 0
USING CAESAR,12          R12 will contain program addr
*
* Encode the plain-text message to ENC
MVC   WORK,MESSAGE       Copy message to WORK
L     0,KEY              Set key
CALL  ROT                Encode the message
MVC   ENC,WORK           Copy encoded message to ENC
*
* Decode the encoded message to DEC
L     0,KEY              Set the key
LNR   0,0                Negate the key for decoding
CALL  ROT                Decode the message
MVC   DEC,WORK           Copy decoded message to DEC
*
* Print results and exit
SPRINT MESSAGE           Print the original message
SPRINT ENC               Print the decoded message
SPRINT DEC               Print the encoded message
SYSTEM                   Exit program
``````

## Assembling and running the program

``````\$ run *asmg scards=caesar.asm spunch=-load sercom=*sink* par=test

...

Execution begins   18:29:02
THE FIVE BOXING WIZARDS JUMP QUICKLY!
AOL MPCL IVEPUN DPGHYKZ QBTW XBPJRSF!
THE FIVE BOXING WIZARDS JUMP QUICKLY!
Execution terminated   18:29:02  T=0  RC=0  \$0.00
``````

## Thoughts on the program and assembly

I think that was the hardest program to write of all the languages so far. The documentation is very good, but terse - I can imagine reading the Principle of Operation several times would pay off greatly. Also having to flip between Principles, the Assembler F reference and MTS documentation rather than having everything in one place was tough. But it was very rewarding to get the program running.

I'm not entirely happy with the final design I came up with - I'm sure there's an easier or more efficient way. I'd love to hear from experience System/360 assembly language programmers out there how you'd implement this.

## Further information

Full source code for this program can be found on github.

Functions of the central processing unit, from A Programmer's Introduction to IBM System/360 Assembler Language

In the previous post we got a simple assembly language program running. Let's now look in more detail at how to program in System/360 assembly language. Although a lot of this material is common across other System/360 operating systems and assemblers, it does contain specifics for MTS and Assembler G.

This only covers the surface of the subject: see the end of this post for some of what was missed and how to find more information.

## Architecture overview

The IBM System/360 and its derivatives has the following basic architecture.

• 32 bit words, with access to 16 bit halfwords and 8 bit bytes.
• Big-endian, using 2's compliment for signed values.
• 16 general purpose 32 bit registers, 4 64 bit floating point registers, plus support for BCD arithmetic.
• A 64 bit program status word (PSW), which contains the address of the current instruction being executed, condition codes and interrupt masks.

Most System/360 models did not support virtual memory, but the one used for MTS did.

## Assembly language

The assembly language defined by IBM is line based, with the following components separated by whitespace:

• An optional label, which must start in column 1 if present.
• An opcode, assembler directive or macro indicating the operation to perform.
• An optional set of operands separated by commas.
• An optional comment.
• An optional identification sequence, in columns 73-80, commonly used to number each statement.

Apart from the first and last element, which columns are used is not important, but by convention they are lined up.

## An example

Let's look at lines from the 'Hello, world' example from the last post to illustrate these features.

This line loads registers 12 with the contents of register 15.

``````         LR      12,15             LOAD R12 WITH ABSOLUTE ADDR
``````

The next line loads register 3 with the value found at the address with label `RUNS`.

``````         L       3,RUNS            R3 COUNTS DOWN NUMBER OF RUNS
``````

Looking at the line labeled `RUNS`, we can see the value pointed to defined as a constant using the assembler directive `DC`. Constants of different types are represented with quotes, so `F'5'` means a fullword (32 bit quantity) with a value of 5. `X'BEEF0'` would be a 20 bit value composed of the 5 hex digits `BEEF0` each taking up 4 bits.

``````RUNS     DC      F'5'              NUMBER OF RUNS TO MAKE
``````

`DC` stands for data constant, and arranges a value to be placed at a location in memory after the previous line. `DS` can be used to arrange data storage, allowing the program to write values back to memory.

Finally, here is a macro call, which will take the string as parameter and arrange a subroutine call to the `SPRINT` service provided by MTS:

``````LOOP     SPRINT 'Hello, world!'    PRINT THE MESSAGE
``````

## Operation formats, base registers and addressing

Next let's see how the assembler constructs machine instructions from parts of this program. The listing sent to `sprint` when running the assembler contains a dump of memory locations and assembled code for each line. For example, here's the line where the `RUNS` constant is stored (I've omitted some columns from the listing for clarity). Here we can see the fullword quantity 5 being stored at address 3c.

``````LOC    OBJECT CODE  SOURCE STATEMENT
00003C 00000005     RUNS     DC      F'5'
``````

The load statement `L 3,RUNS` looks like this:

``````LOC    OBJECT CODE   SOURCE STATEMENT
000002 5830 C03C     L       3,RUNS
``````

Breaking down the object code:

• Opcodes on System/360 are a single byte: here `58` is the opcode for `L`, load.
• The next four bits, `3`, represents the register to load into.
• The next four bits, `0` is the index register
• The next four bits, `C`, is the base register.
• The final 12 bits, `03C`, is the displacement.

The address to load from is base register + index register + displacement. In this example, the index register is 0 which indicates it is not being used, so at run time the system will find the value at offset `3c` from whatever is in the base register.

The base register is present in nearly all addressing operations and ensures that System/360 is relocatable: at run time, the start of this piece of code is loaded into memory somewhere and a register is set to that address so all further address arithmetic can use it.

In our example, this was set up with assembler directives at the start of the program:

``````HELLO    START   0
USING   HELLO,12
LR      12,15
``````

Here, `START 0` means the program will be placed at offset 0 in the block of memory being composed. `USING HELLO,12` states that register 12 (`C`) will be the base register. Finally, `LR 12,15` loads register 12 with the contents of register 15: this is part of the calling convention for MTS, so when the program starts it knows it can find its own start address in register 15. Using `USING` allows the assembler to take care of the base register for you, but you can also specify it and the index register directly.

This sequence of opcode and operands is called RX format. Other opcodes use other formats; for example, the `LR 12,15` is in RR format, where the one byte opcode is followed by two 4 bit register operands. Other formats can take an immediate value and an address, or two addresses.

## Loads and stores and binary arithmetic

Armed with this knowledge, it's fairly easy to understand the basic instruction set described in the IBM System/360 Principles of Operation manual. Let's look at the load instructions for example:

NameMnemonicType

We've seen `L` and `LR` already. `LH` loads a halfword (16 bit) quantity from the given address, sign extending it to a fullword. `LTR` will load a value and set condition codes, which will look at in the next section. `LCR` will load a value and toggle the sign; `LPR` and `LNR` will load and force positive or negative.

`LM`, load multiple, uses a format we've not seen so far, RS. This format is composed like this:

• One byte opcode
• Four bits register R1
• Four bits register R3
• Four bits register B2
• 12 bits displacement

For LM, data is loaded from the address computed from the base register B2 + displacement into a sequence of registers starting at R1 and ending at R3. So the instruction

``````LM 3,5,12,0
``````

will load up the registers R3, R4 and R5 with data from the memory address in base register 12 + 0 offset, reading 3 fullword values in sequence.

For stores, we have `ST` for store fullword quantity in a register to memory, `STH` for storing a halfword and `SM` for storing multiple registers to a sequence in memory.

Addition opcodes are `A` and `AR` for add memory to register and add register to register. `AH` to add halfwords and `AL`/`ALR` to add ignoring the sign bit. Subtraction is orthogonal.

Multiplication and division result in a 64 bit value so as not to lose precision. The input register must have an even number and the results are stored in that plus the next sequential register, so

``````M 2,LABEL
``````

will multiply the value of register 2 and the value found at `LABEL`, then store the results in registers 2 and 3.

## The PSW and branches

The PSW, or program status word, is a doubleword that defines the current state of the processor. It includes fields that indicate interrupt masks, state (eg problem state), condition codes and the current instruction address.

The condition code occupies bits 34 and 35. They have the following meaning:

• `00` - zero
• `01` - negative
• `10` - positive
• `11` - carry/overflow

The condition code is set after many operations, for example add, subtract or load and test.

The branch on condition opcode can be used to branch based on its value. This comes in two formats, a RR instruction `BCR` where if the branch is made, the address is taken from one register, or `BC` which uses the base+index+displacement system.

Whether to take the branch or not is determine by a four bit mask. A mask value of 8 means branch if condition code `00` is set, mask value 4 for `01` etc. These can be added together, so a mask value of 12 means branch on either condition code `00` or `01` (zero or negative). If all four mask bits are set, the branch is unconditional, if unset, then the instruction is a no-op.

As specifying the mask each time can be tedious, the assembler provides directives for common cases such as branch if zero or branch if positive. We can see this in the 'hello, world' program:

``````         S       3,DECR            DECREMENT R3
BP      LOOP              IF R3 POSITIVE, LOOP AGAIN
``````

`BP` stands for branch if positive, and will test the condition code from the previous `S` subtract: if the value is positive, then jump to address labeled as `LOOP`.

Internally, the assembler turns `BP LOOP` into `BC 2,LOOP(0,12)`, using mask 2 = condition code `10` and if positive branching to the address given by the base register 12 + `LOOP`.

There are also higher level branch instructions suitable for loops, for example:

``````         BXLE 1,2,LABEL
``````

BXLE stands for branch on index low or equal. In this example, register 1 is incremented by the value of register 2. If register 1 is less than equal to the value in register 3, then the branch is taken.

## Subroutines

Supporting subroutines or functions requires a facility to pass in parameters, jump to the called code, do an operation, and return to the calling code with any results. On most modern systems this is done via a stack, but you may have noticed this was not part of the architecture overview above, so how is this done?

The answer is that a combination of a save area in memory is used to store data plus a calling convention so the caller and callee can exchange information via registers. The actual change in location address is done by a branch when this is set up. The calling convention used is up to the programmer, but in practice IBM defines a standard calling convention so that code in different languages can operate together.

(24 March 2018: See the comment below from Jeff Ogden for more details on the calling convention.)

MTS Volume 3: System Subroutine Descriptions defines this calling convention. Simplifying greatly, for the common S-type routines used by most MTS library functions:

• Register 1 points to an address where a list of parameters for the called program is stored..
• Register 13 points to a save area, described below.
• Register 14 points to the next instruction in the calling program that the called program should return to after execution is done.
• Register 15 points to the entry point of the called program.

The save area is a portion of memory with a specific format, owned by the calling program where the called program can save registers and other data. For example, word 6 is used to store the contents of register 0. It is the responsibility of the called program to save data in the save area and restore it before returning execution to the caller.

A byproduct of this is that recursion is not freely available like on stack based architectures, as there is a single save area per program call, multiple calls would overwrite this area. Special code is needed to dynamically allocate save areas if recursion is required.

## Macros and calling the MTS system

Setting up these subroutine calls by hand would be tedious. Luckily, the assembler includes a powerful macro facility that helps abstract these away, and macros have been defined as follows:

• `CALL` to pass parameters to a subroutine
• `ENTER` to start a subroutine and set up registers and save area
• `EXIT` to restore saved values and pass back a result code.

Macros are also defined for MTS system facilities. We saw this in the 'hello world' program in two places:

``````LOOP     SPRINT 'Hello, world!'    PRINT THE MESSAGE
...
SYSTEM                    EXIT PROGRAM
``````

We can see how these were assembled by looking at the listing file. The `SYSTEM` call is easiest as it takes and returns no parameters:

``````L     15,=V(SYSTEM)
BR    15
``````

Register 15 is loaded with the address of the `SYSTEM` subroutine (which is determined at link time) and an unconditional branch is made.

The `SPRINT` call is more complex:

``````L     15,=V(SPRINT)                SUBROUTINE TO DO I/O
BAL   1,*+18+((L'###1+1)/2*2)      AROUND CONSTANTS
DC    A(*+8)                       LENGTH
DC    A(0)                         NO MODIFIERS SPECIFIED
DC    Y(L'###1)                    LENGTH
DC    C'Hello, world!'
BALR  14,15                        BRANCH TO SUBROUTINE
``````

The `L` sets up the subroutine address in register 15. A parameter list is defined with the `DC`s; the `BAL` puts the address of this in register 1 and jumps over the constants.. Finally, `BALR` is used to jump to the address stored in register 15, storing the current address (and other bits from the PSW) in register 14.

## Other features

The information presented above should be enough to get started with assembler programming on MTS. We have not covered a number of other features supported by the system, including

• Decimal and floating point arithmetic
• Logical operations like `AND`
• Byte access and character operations
• I/O operations and channels
• Other parts of the PSW and their use in system programs.

## Further information

See the end of the last post for links to documentation to find out more about assembly language.

In the next post we'll write a complete program in assembly.

Sample assembly language program form, from 'A Programmer's Introduction to IBM System/360 Assembler Language

After the heights of APL, let's turn to the lowest level language possible: System/360 assembly language.

## System/360 Assembly Language

MTS runs on the IBM System/360, designed from scratch by IBM in the early 1960s as a unified successor to a number of different architectures. System programmers, and application programmers looking for maximum performance used assembler to write code as close to the bare metal as possible.

As the System/360 was a new design, the architecture is clean and fairly simple. It's a 32 bit architecture with 24 bit addresses, 16 full word registers and 4 64 bit floating point registers. Binary Coded Decimal (BCD) arithmetic and I/O operations are also supported.

## Assemblers on MTS

MTS ran originally on a System/360-67, and later on 370 and Amdahl CPUs which provided extensions to the basic instruction set that we will not consider here.

The main assembler available to us today is Assembler G (`*ASMG`). This is a basic assembler, derived from Assembler F provided by IBM for OS/360 with extensions by the University of Waterloo for improved performance.

At the time of MTS D6, the most common assembler in use was Assembler H. This has a considerable number of improvements to the assembly language supported in Assembler G; however, this is not available in the MTS distribution due to copyright reasons.

ASSIST, Assembler System for Student Instruction & Systems Teaching, was an assembler and emulator used by students to learn assembly. As it emulates the underlying machine it can provide additional debug information and run time control, at the expense of performance. It is available on the MTS distribution as `*ASSIST`. Finally, `*ASMT` is a specialised assembler compatible with features on IBM's time sharing system TSS. We will not consider either of these further in these blog posts.

## Prerequisites

No special installation instructions to get this language running - just do the standard D6.0 setup as described in this guide and then sign on as a regular user such as `ST01`.

## Running a program using `*ASMG`

`*ASMG` will take a file of assembly language instructions from `scards` and write output to `spunch`. A program listing can be sent to `sprint` and errors to `sercom` if these are set on the command line. Extra parameters can be set with `par`, for example `par=test` will add debugging information to the object file.

Linking is done at run time by MTS, so if you are just using the system libraries the object file can be run directly with `\$run`.

## Hello world

Let's see how to run a simple program to print 'Hello, world!' five times using assembly language.

The source file (with line numbers) looks like this:

``````# list hello.asm
1     HELLO    START   0                 PROGRAM AT RELATIVE ADDRESS 0
2              USING   HELLO,12          R12 WILL CONTAIN PROGRAM ADDR
4              L       3,RUNS            R3 COUNTS DOWN NUMBER OF RUNS
5     LOOP     SPRINT 'Hello, world!'    PRINT THE MESSAGE
6              S       3,DECR            DECREMENT R3
7              BP      LOOP              IF R3 POSITIVE, LOOP AGAIN
8              SYSTEM                    EXIT PROGRAM
9     RUNS     DC      F'5'              NUMBER OF RUNS TO MAKE
10     DECR     DC      F'1'              DECREMENT FOR LOOP
11              END     HELLO             END OF CODE
``````

Here's how to run the assembler. We send errors to `*sink*` so they are displayed immediately and a full program listing to `-hello.l`.

``````# \$run *asmg scards=hello.asm spunch=-load sercom=*sink* sprint=-hello.l par=test
. *** *ASMG has been changed to use *ASMGSYSMAC for the default macro
. *** library instead of *SYSMAC.  It will no longer work with *SYSMAC.
# Execution begins   18:19:08

ASSEMBLER (G) DONE           18:19:08   30 SEP 17

NO STATEMENTS FLAGGED IN THIS ASSEMBLY
# Execution terminated   18:19:08  T=0.091
``````

We can ignore the `*ASMG has been changed...` message; `NO STATEMENTS FLAGGED` means that it worked OK.

Finally, let's run the assembled program.

``````# \$run -load
# Execution begins   18:19:17
Hello, world!
Hello, world!
Hello, world!
Hello, world!
Hello, world!
# Execution terminated   18:19:17  T=0
``````

## Further information

A good place to start is A Programmer's Introduction to IBM System/360 Assembler Language, which gives an overview of the System/360 architecture and then teaches different aspects of assembly language programming using examples.

A full description of how the System/360 works and the opcodes available is in IBM System/360 Principles of Operation.

OS Assembler Language is a reference guide to assembly language. `*ASMG` is documented in MTS Volume 2X from page 27 onwards.

MTS Volume 14: 360/370 Assemblers in MTS details the differences between Assembler G and Assembler H, macro libraries, structured programming macros and the ASSIST assembler.

MTS Volume 3: System Subroutine Descriptions describes the 'standard library' of routines available to assembler programmers on MTS.