PL/I - Radix sort

For the final part of this series on the PL/I programming language as implemented on MTS, let's implement radix sort from Rosetta Stone.

The problem

Radix sort is intended as a fast way to sort an array of integers by grouping them by digits that share the same value. Digits can be defined depending on the numeric base and the sort can take place from either the most significant digit or the least. Here we'll sort by most significant digit and use base 2. so we are comparing by bits.

The algorithm we'll use sorts in place. Starting at the largest bit it will group the numbers into one range where the bit is 1 and another where it is 0 by swapping values in the array. It will then recursively sort each range by the next smallest bit.

The solution

Here's how the implementation looks; this was based on the C version at Rosetta Code.

RADIX_SORT: PROCEDURE(FROM, TO, BIT, ARR) RECURSIVE;  
  DECLARE (FROM, TO) FIXED DECIMAL;
  DECLARE BIT FIXED BINARY(16);
  DECLARE ARR(*) FIXED BINARY(16);

  DECLARE (LEFT, RIGHT) FIXED DECIMAL;
  DECLARE TEMP FIXED BINARY(16);

  /* Exit early if bit down to zero or range is zero */
  IF (BIT = 0) | (TO <= FROM) THEN RETURN;

  LEFT = FROM;
  RIGHT = TO;

  LOOP:
  /* Find the leftmost with the bit set */
  DO WHILE ((LEFT < RIGHT) & ((ARR(LEFT) & BIT) = 0));
    LEFT = LEFT + 1;
  END;

  /* Find the rightmost with the bit not set */
  DO WHILE ((LEFT < RIGHT) & ((ARR(RIGHT) & BIT) ¬= 0));
    RIGHT = RIGHT - 1;
  END;

  /* If found one of each, swap and loop again to look for more */
  IF LEFT < RIGHT THEN BEGIN;
    TEMP = ARR(LEFT);
    ARR(LEFT) = ARR(RIGHT);
    ARR(RIGHT) = TEMP;
    GO TO LOOP;
  END;

  /* If nothing found, make the first sub search do the full range */
  IF ((BIT & ARR(LEFT)) = 0) & (LEFT <= TO) THEN
    LEFT = LEFT + 1;

  /* The two ranges bounded by LEFT are now sorted by BIT.
   * Sort each range by the next least significant bit */
  BIT = BIT / 2;
  CALL RADIX_SORT(FROM, LEFT-1, BIT, ARR);
  CALL RADIX_SORT(LEFT, TO, BIT, ARR);
END RADIX_SORT;  

The definition of the function is marked as RECURSIVE, similar to other languages like ALGOL, as otherwise calling the function from itself will not work.

We pass in an array of fixed point numbers of up to 16 bits in length with the dimensions of the array bounded by FROM and TO. The initial call will set BIT to be 2^15. To handle an array of any size we can use * in the declaration DECLARE ARR(*) FIXED BINARY(16);.

Although PL/I has logical operators - and is &, or is |, not is ¬ - it does not seem to have the boolean not operator like ! in C, so we have to write (ARR(LEFT) & BIT) = 0 rather than something like ! (ARR(LEFT) & BIT).

We'll also need a procedure to print the array which will be called before and after the sort.

SHOW_ARR: PROCEDURE(LABEL, ARR);  
  DECLARE LABEL CHARACTER(80) VARYING;
  DECLARE ARR(*) FIXED BINARY(16);
  PUT FILE (SCARDS) LIST(LABEL);
  PUT FILE (SCARDS) SKIP;
  PUT FILE (SCARDS) LIST(ARR);
  PUT FILE (SCARDS) SKIP;
END SHOW_ARR;  

Finally, we set up the main function

TEST: PROCEDURE OPTIONS(MAIN);  
  DECLARE RADIX_SORT ENTRY (FIXED DECIMAL, FIXED DECIMAL,
                            FIXED BINARY(16),);

  DECLARE NUMS(10) FIXED BINARY(16);
  NUMS(1) = 3;       NUMS(6) = 1630;
  NUMS(2) = 42;      NUMS(7) = 4;
  NUMS(3) = 910;     NUMS(8) = 2721;
  NUMS(4) = 3241;    NUMS(9) = 684;
  NUMS(5) = 99;      NUMS(10) = 11;

  CALL SHOW_ARR('Before', NUMS);
  CALL RADIX_SORT(1, 10, 1000000000000000B, NUMS);
  CALL SHOW_ARR('After', NUMS);

  /* Definition of RADIX_SORT and SHOW_ARR follow inside TEST here. */
END TEST;  

Note that we need to declare the signature of RADIX_SORT before we use it.

We can now compile and run the program.

# $run *pl1f scards=radix.pl1 spunch=-load
# Execution begins   20:03:01 

  PROCEDURE TEST: SYNTAX CHECK COMPLETED. COMPILATION CONTINUES.


   NO ERRORS OR WARNINGS DETECTED.
# Execution terminated   20:03:02  T=0.128 

# $run -load+*pl1flib
# Execution begins   20:03:03 
  'Before' 
          3        42       910      3241        99      1630         4      2721
        684        11 
  'After' 
          3         4        11        42        99       684       910      1630
       2721      3241 

# Execution terminated   20:03:03  T=0.003 

Final thoughts about PL/I

To a C programmer, PL/I appears the most familiar of the languages we've seen so far, from the comments style and semicolon as line terminators to the storage classes and system level access, but with a more ALGOLy syntax. Writing the above I certainly found it expressive and easy to use, but it lacks even the minimal type checking found in C and I wasn't quite sure what conversions were going on behind the scenes.

In its historical context, PL/I never really took off - it was tough to implement a compiler and too much identified with IBM to be adopted widely. It was important in the development of MULTICS and it nearly had a renaissance when it was made available for OS/2 in the 1990s, but ultimately it did not gain enough adoption from programmers to be truly successful.

For MTS at UM, according to the Introduction to programming and debugging in MTS, it was the third most popular language (8% by frequency of use) in 1984, but way behind FORTRAN and PASCAL. It would be interesting to know what types of users it had - teaching, business or scientific research? It does not seem to have been used by MTS system programmers much, going by the lack of components in PL/I on the D6.0 tapes.

Further information

Full source code for this program is on github.

Go back to the first article in ther series on PL/I or see the other languages being looked at.

PL/I - Language features 2

Let's continue our look at the PL/I language.

Storage classes

There are four storage classes in PL/I. AUTOMATIC (the default) and STATIC are much like in C:

CONTROLLED variables are assigned space dynamically with ALLOCATE and released with FREE. Unlike C malloc/free, allocations are stacked so you can reallocate space and the program will remember the last allocation after it is freed. For example, the below will print 20 then 10.

TEST: PROCEDURE OPTIONS(MAIN);  
  DECLARE A FIXED BINARY(16) CONTROLLED;

  ALLOCATE A;
  A = 10;
  ALLOCATE A;
  A = 20;
  PUT FILE (SCARDS) LIST(A);
  FREE A;
  PUT FILE (SCARDS) LIST(A);
  FREE A;
END TEST;  

Use of controlled variables before they are allocated or after they are freed will lead to a runtime error.

BASED variables are similar to CONTROLLED variables but also allow pointer style access.

Variables (and procedures) can also be declared EXTERNAL if they are defined elsewhere in the program.

Procedures and functions

Procedures can be declared at file level, within other procedures, or even within DO blocks. They are introduced with PROCEDURE and executed using CALL. Parameters are passed by name so the procedure can alter them; you can also pass an expression and the compiler will introduce a temporary variable. The below will print 41.

TEST: PROCEDURE OPTIONS(MAIN);  
  DECLARE A FIXED BINARY(16);
  A = 41;
  CALL INCR(A);
  CALL INCR(A+1);
  PUT FILE (SCARDS) LIST(A);

  INCR: PROCEDURE(X);
    DECLARE X FIXED BINARY(16);
    X = X + 1;
  END INCR;

END TEST;  

A function is a procedure that returns a value, so the above could also be written as:

TEST: PROCEDURE OPTIONS(MAIN);  
  DECLARE A FIXED BINARY(16);
  A = 41;
  A = INCR(A);
  PUT FILE (SCARDS) LIST(A);

  INCR: PROCEDURE(X) RETURNS(FIXED BINARY(16));
    DECLARE X FIXED BINARY(16);
    RETURN(X + 1);
  END INCR;

END TEST;  

Note that brackets are needed around the expression being returned.

Preprocessor

The source file can be augmented at compile time by using preprocessor directives, which start with %. On MTS, to use this facility you must set the MACRO parameter in the compiler command (eg $run *pl1f scards=in.pl1 spunch=-load par=macro).

%INCLUDE otherfile copies the contents of otherfile into the file currently being compiled.

%DECLARE can be used for macro replacement. %DEACTIVATE will undefine a macro variable allowing it to be used by the program again. The following will print 42.

%DECLARE A CHARACTER, B FIXED;
%A = 'B + 2';
%B = 40;

TEST: PROCEDURE OPTIONS(MAIN);  
  DECLARE X FIXED BINARY(16);
  X = A;
  PUT FILE (SCARDS) LIST(A);
END TEST;  

So far so much like the C preprocessor. But you can also do compile time programming using labels, %IF and %GOTO. The below will substitute the middle section with X = X + 1;, X = X + 2; etc and prints 10.

TEST: PROCEDURE OPTIONS(MAIN);  
  DECLARE X FIXED BINARY(16);
  X = 0;

  %DECLARE N FIXED;
  %N = 1;
  %LABEL1:;
  X = X + N;
  %N = N + 1;
  %IF N < 5 %THEN %GO TO LABEL1;

  PUT FILE (SCARDS) LIST(X);
END TEST;  

There is also a %DO loop and the ability to create compile time %PROCEDURES.

Further information

There's a lot more we haven't covered. PL/I has extensive I/O operations in both stream and record formats. There is support for multi-tasking, although much of this is not available on MTS.

There is surprisingly little on the internet about PL/I, outside of IBM. There's an (opinionated) comparison between C and PL/I. The experience Multics had using PL/I is also interesting.

PL/I - Language features 1

Let's look at the PL/I language in more details. Before we start, PL/I is a big language, and we are going to need two posts to even scratch the surface.

Format

PL/I is free format, but the IBM implementation requires use of upper case letters for keywords and identifiers. By default, characters after the 72nd column are ignored by the IBM compiler but this can be changed by a command line switch.

Comments are introduced with /* and ended with */ like in C.

Keywords are not reserved words, so you can do things like the below, where IF is an identifier; this will print 4. Also note that = is used for assignment or comparison, depending on context.

TEST: PROCEDURE OPTIONS(MAIN);  
  DECLARE IF FIXED BINARY(16);
  IF = 3;
  IF IF = 3 THEN IF = 4; ELSE IF = 5;
  PUT FILE (SCARDS) LIST(IF);
END TEST;  

Variables and constants

Arithmetic data

Arithmetic data can be either fixed point or floating, both using in decimal or binary as its base. An example of each of these combinations:

TEST: PROCEDURE OPTIONS(MAIN);  
  DECLARE XB FIXED BINARY(16);
  DECLARE XD FIXED DECIMAL(8,2);
  DECLARE FB FLOAT BINARY(8);
  DECLARE FD FLOAT DECIMAL(5);
  XB = 42;
  XD = 123.45;
  FB = 10110.01B;
  FD = 123.45678;
  PUT FILE (SCARDS) LIST(XB, XD, FB, FD);
END TEST;  

which will print

42      123.45  2.22E+01  1.2345E+02  

Note that for the binary items, the number in brackets specifies the number of bits to be used for storage; for decimals, it is the precision, so FD has been truncated.

In the above, you can see that B can be used as a suffix for binary constants. There is also a L suffix for pre-decimalisation British currency where the elements designate pounds, shillings and pence. This is converted into pence for storage.

TEST: PROCEDURE OPTIONS(MAIN);  
  DECLARE AMT FIXED BINARY(16);
  AMT = 1.5.2L;
  PUT FILE (SCARDS) LIST(AMT);
END TEST;  

I think no other language has this, and it may reflect that PL/I was originally designed at IBM's labs in England.

You can also declare complex variables and use the I suffix in constants, and there are PICTURE definitions for string representations of numeric data similar to COBOL.

Strings

Strings can be defined with either a fixed length or maximum length (with the VARYING keyword). A repeating constant can be introduced with a prefix number, as the below:

TEST: PROCEDURE OPTIONS(MAIN);  
  DECLARE A CHARACTER(20);
  DECLARE B CHARACTER(20) VARYING;
  A = 'HELLO';
  B = (3)'PL1 ';
  PUT FILE (SCARDS) LIST(A,B);
END TEST;  

which prints:

'HELLO               ' 'PL1 PL1 PL1 '  

There are also bit strings, eg DECLARE C BIT(8); C = '1010'B.

Arrays and structures

Arrays can be defined by adding dimensions after the variable name, eg DECLARE TABLE(10,20) FIXED DECIMAL(16);

Structures are collections of variables using a hierarchical format, for example

TEST: PROCEDURE OPTIONS(MAIN);  
  DECLARE 1 MEAL,
            2 NAME CHARACTER(20) VARYING,
            2 CUISINE CHARACTER(20) VARYING,
            2 RESTAURANT,
              3 NAME CHARACTER(20) VARYING,
              3 ADDRESS CHARACTER(80) VARYING,
            2 COST FIXED DECIMAL(5,2);
  MEAL.NAME = 'Dinner';
  MEAL.CUISINE = 'Italian';
  MEAL.RESTAURANT.NAME = 'Alfredo';
  MEAL.ADDRESS = '123 Foo Road';
  MEAL.COST = 123.45;
  PUT FILE (SCARDS) LIST(MEAL);
END TEST;  

Note that when qualifying a structure member we can omit levels if it is not ambiguous, eg MEAL.ADDRESS.

Once a structure has been set up, other variables can be instantiated with the same type using LIKE, eg DECLARE PARTY LIKE MEAL;.

Other types and attributes

As well as statement labels, which can be used for GO TO targets, there are special types for multitasking called events and tasks, but these are not supported by MTS.

An alias to another variable can be set up with DEFINED, eg DECLARE A FIXED BINARY(16); DECLARE B DEFINED A; means B and A point at the same variable.

Other attributes on types include ALIGNED and UNALIGNED and an initial value set at declaration with INITIAL;

Not covered here are variable lifetimes and pointers, which I will come back to in the next part.

Expressions

Expressions work mostly as expected, with conversions allowed implicitly between types - which leads to some complex rules given the number of built in types. || can be used as the string concatenation operator. Whole arrays or structures can be used as parameters to an expression.

Statements

Multiple assignments can be made on the left hand side of an expression, eg A,B = 2*21.

Control flow statements include IF ... THEN ... ELSE and looping via DO, which allows variable and condition testing as in the below:

TEST: PROCEDURE OPTIONS(MAIN);  
  DECLARE I FIXED BINARY(16);
  DECLARE S FIXED BINARY(16);

  DO I = 1 TO 5, 10 TO 15;
    PUT FILE (SCARDS) LIST(I);
  END;
  PUT FILE (SCARDS) SKIP;

  S = 0;
  DO WHILE (S < 10);
    S = S + 1;
    PUT FILE (SCARDS) LIST(S);
  END;
  PUT FILE (SCARDS) SKIP;

  S = 0;
  DO I = 2 TO 10 BY 2 WHILE (S < 10);
    S = S + I;
    PUT FILE (SCARDS) LIST(S);
  END;
  PUT FILE (SCARDS) SKIP;
END TEST;  

while will print

          1         2         3         4         5        10        11  
         13        14        15 

          1         2         3         4         5         6         7  
          9        10 

          2         6        12 

The STOP and EXIT statements can be used to terminate program execution immediately. There is also exception support for a set of predefined cases, such as end of file, using ON, eg ON ENDFILE(SCARDS) GO TO DONE;

In the next post we will look at pointers, block structure and the pre-processor along with other topics.

Further information

The original IBM documentation for the PL/I language and PL/I F compiler can be found at bitsavers.

PL/I - Introduction

In this series we'll look at PL/I, or Programming Language 1, an ambitious multi-purpose language developed by IBM.

PL/I overview

PL/I was intended to be a programming language for all users - scientific, business and systems. It was originally developed by IBM in the mid '60s and spread to other mainframes, Multics and DEC systems. A subset of the language was used to create CP/M and IBM produced a version for OS/2 on PCs that was competitive with C. A standardised specification was produced by ECMA and ANSI from the early '70s. Although it gained some popularity (it was even used as a teaching language at some universities) it never really took off in any of the fields it was intended for.

Language-wise, it is a procedural imperative language drawing inspiration from FORTRAN, COBOL and ALGOL. Due to its intended wide coverage, it is a very large language with a number of interesting features, some of which were incorporated into later languages such as C.

PL/I on MTS

There were three PL/I environments originally available on MTS:

  • *PL1F was derived from the IBM OS/360 F version 5 compiler with modifications for MTS. This is the only version available on the D6.0 tapes today and the one we will look at here. The MTS docuimentation refers to this as *PL1 but this name is not available; use *PL1F.

  • *PL1OPT was the IBM optimising compiler for PL/I and is not available due to copyright reasons. It took many of the compiler optimisation techniques developed for the FORTRAN H compiler and also offered a checkout facility for improved debugging.

  • *PLC was a load-and-go processor developer at Cornell; according to the MTS Archive the source and object code for this has been lost. It had the unique feature that any input would compile (but maybe not produce the desired results) due to automatic substitution of syntax errors.

There are also a number of utility programs for PL/I *PL1SCAN does a quick syntax check of a program and *PL1TIDY formats a PL/I source file.

Prerequisites

No special installation instructions to get PL/I running - just do the standard D6.0 setup as described in this guide and then sign on as a regular user such as ST01.

Using *PLIF

*PLIF will read source from scards (by default *source* ie standard input) and will write object files to spunch. Unlike some other MTS compilers you need to provide a filename for spunch as there is no default. You should destroy or empty the contents of the object file before recompiling.

Other compilation parameters are listed in MTS Volume 7.

To run PL/I programs you need to concatenate the runtime library *PL1LIB with the object file.

Hello world

Here's a transcript of a session where we run a Hello world program. This assumes the source code is contained in the file hello.pl1.

# $list hello.pl1

      1     /* Hello world program in PL/I */
      2     HELLO: PROCEDURE OPTIONS(MAIN);
      3       DECLARE I FIXED BINARY(16);
      4       DO I = 1 TO 5;
      5           PUT FILE (SCARDS) LIST('Hello, world!');
      6           PUT FILE (SCARDS) SKIP;
      7     END HELLO;

# $destroy -load
 File "-LOAD" has been destroyed.

# $run *pl1f scards=hello.pl1 spunch=-load
# Execution begins   12:18:31


  PROCEDURE HELLO: SYNTAX CHECK COMPLETED. COMPILATION CONTINUES.


  NO ERRORS OR WARNINGS DETECTED.
# Execution terminated   12:18:31  T=0.087

# $run -load+*pl1lib
# Execution begins   12:18:34
 'Hello, world!'
 'Hello, world!'
 'Hello, world!'
 'Hello, world!'
 'Hello, world!'

# Execution terminated   12:18:34  T=0.002

Further information

MTS Volume 7 describes the PL/I compilers available on MTS.

The original IBM documentation for the PL/I language and PL/I F compiler can be found at bitsavers.

A good introduction to PL/I is the PL/I Primer.

The Wikipedia article gives an extensive history and list of implementations.

The great FORTRAN-LISP shootout

One of the reasons LISP did not become more popular for scientific computing was the perception that it was slower than FORTRAN. In this post I will compare the performance of the LISP and FORTRAN implementations on MTS using a simple problem.

The problem and some caveats

We'll use the finding emirp primes problem introduced in the FORTRAN posts on this blog as the subject. To make it simpler we will just look at the third test, finding the 10,000th emirp.

So some caveats before we start. Benchmarks are notoriously difficult to get right and this is only a single sample. I'm not an expert at optimising LISP or FORTRAN so there may well be better ways of implementing this. Finally, the UTILISP implementation is from the 1980s, MTS LISP is from the 1970s and the FORTRAN G and H compilers date from at least the 1960s so this is not a historically accurate recreation.

Having said that, it'll be interesting to see how they stack up, so let's start.

FORTRAN

We'll use the source code for the FORTRAN emirp implementation discussed before, taking out the first and second test.

We compile without optimisations first (ie using FORTRAN G) and then try again with FORTRAN H. The results look like this:

# run *ftn scards=emirp3.f
# Execution begins   11:39:08 
  No errors in PRIME 
  No errors in REVRSE 
  No errors in EMIRP 
  No errors in SHOW 
  No errors in TEST3 
  No errors in MAIN 
# Execution terminated   11:39:08  T=0.045 
# run -load
# Execution begins   11:39:10 
     948349
# Execution terminated   11:39:19  T=9.05 
#
# $run *ftn scards=emirp3.f par=opt=h
# Execution begins   11:39:49 

  **** No errors for  PRIME ****      SAT MAY 09/15 11:39:49 

  **** No errors for REVRSE ****      SAT MAY 09/15 11:39:49 

  **** No errors for  EMIRP ****      SAT MAY 09/15 11:39:49 

  **** No errors for   SHOW ****      SAT MAY 09/15 11:39:49 

  **** No errors for  TEST3 ****      SAT MAY 09/15 11:39:49 

  **** No errors for   MAIN ****      SAT MAY 09/15 11:39:49 
# Execution terminated   11:39:49  T=0.074 
# run -load
# Execution begins   11:39:54 
     948349
# Execution terminated   11:39:59  T=4.281 

So 9.05s for FORTRAN G and 4.28s for FORTRAN H.

MTS LISP

Let's try the original MTS LISP first. It's fairly easy to translate using the (GO) form to do looping.

;; Checks if n is prime using a simple division test
(DEFUN PRIME-P (N)
  (COND
   ;; Deal with numbers <= 3
   ((LESS N 2) NIL)
   ((LESS N 4) T)
   ;; Check if divisible by 2 or 3
   ((EQUAL (REMAIN N 2) 0) NIL)
   ((EQUAL (REMAIN N 3) 0) NIL)
   ;; See if divisible by 5, 7, ..., up to approx sqrt(n)
   (T (PROG (I RESULT)
            (SETQ I 5)
A           (SETQ RESULT (COND ((GREATER (TIMES I I) N) 'PRIME)  
                               ((EQUAL (REMAIN N I) 0) 'NOT-PRIME)
                               (T (SETQ I (ADD I 2)) NIL)))
            (COND ((NOT RESULT) (GO A)))
            (EQ RESULT 'PRIME)))))

;; Return a number that has the reversed digits in n
(DEFUN REVERSE-DIGITS (N)
  (PROG (REVERSED)
        (SETQ REVERSED 0)
        ;; Take last digit from n and append to reversed
        (PROG ()
B             (SETQ REVERSED (TIMES REVERSED 10))  
              (SETQ REVERSED (ADD REVERSED (REMAIN N 10)))
              (SETQ N (IDIVIDE N 10))
              (COND ((GREATER N 0) (GO B))))
        ;; CAR needed to evaluate REVERSED otherwise PROG will return
        ;; the symbol REVERSED
        (CAR REVERSED)))

;; Check if a number is an emirp, ie both it and its reversed digits
;; are prime.
(DEFUN EMIRP-P (N)
  (PROG (REVERSED)
        (SETQ REVERSED (REVERSE-DIGITS N))
        (AND (NOT (EQUAL N REVERSED)) (PRIME-P N) (PRIME-P REVERSED))))

(DEFUN EMIRP-10K ()
  (PROG (N COUNT LAST)
        (SETQ N 0)
        (SETQ COUNT 0)
        (SETQ LAST 0)
C       (COND ((EMIRP-P N)  
               (SETQ COUNT (ADD COUNT 1)) (SETQ LAST N)))
        (SETQ N (ADD N 1))
        (COND ((LESS COUNT 10000) (GO C)))
        (CAR LAST)))

(EMIRP-10K)
(STOP)

We'll first execute it through the interpreter:

# $run *lisp scards=emirp.l
# Execution begins   12:41:27
      WELCOME TO LISP/MTS
 >    PRIME-P
 >    REVERSE-DIGITS
 >    EMIRP-P
 >    EMIRP-10K
* GC: COLLECTED 23370 CELLS
...
* GC: COLLECTED 23370 CELLS
>    948349
# Execution terminated   12:47:15  T=347.517 

Next we'll compile the functions. Add (COMPILE PRIME-P) and so on for all the functions and rerun

# $run *lisp scards=emirp.l
     WELCOME TO LISP/MTS
>    PRIME-P
>    REVERSE-DIGITS
>    EMIRP-P
>    EMIRP-10K
* COMPILE        04-17-79 RESTORED
>    (PRIME-P)
>    (REVERSE-DIGITS)
>    (EMIRP-P)
>    (EMIRP-10K)
* GC: COLLECTED 23370 CELLS
...
* GC: COLLECTED 23370 CELLS
>    948349
# Execution terminated   12:56:39  T=72.347 

OK, so we are down from 348s to 72s (the compilation itself takes less than 0.5s). Not bad, but still 17 times slower than FORTRAN H.

UTILISP

Let's see how UTILISP compares. We do need to change the program in quite a few places to match the different built in functions (eg + instead of ADD) but at least we can now use the do macro to create more expressive loops.

;; Checks if n is prime using a simple division test
(defun prime-p (n)
  (cond
   ;; Deal with numbers <= 3
   ((<= n 1) nil)
   ((<= n 3) t)
   ;; Check if divisible by 2 or 3
   ((= (remainder n 2) 0) nil)
   ((= (remainder n 3) 0) nil)
   ;; See if divisible by 5, 7, ..., up to approx sqrt(n)
   (t (do ((i 5 (+ i 2)) (result nil))
          (result (eq result 'prime))
          (setq result (cond ((> (* i i) n) 'prime)
                             ((= (remainder n i) 0) 'not-prime)
                             (t nil)))))))


;; Return a number that has the reversed digits in n
(defun reverse-digits (n)
  (do ((reversed 0) (m n (// m 10)))
      ((< m 1) reversed)
      ;; Take last digit from n and append to reversed
      (setq reversed (* reversed 10))
      (setq reversed (+ reversed (remainder m 10)))))


;; Check if a number is an emirp, ie both it and its reversed digits
;; are prime.
(defun emirp-p (n)
  (let ((reversed (reverse-digits n)))
    (and (neq n reversed) (prime-p n) (prime-p reversed))))


;; Return the 10,000th emirp
(defun emirp-10k ()
  (do ((n 0 (+ n 1)) (count 0) (last 0))
      ((eq count 10000) last)
      (cond ((emirp-p n) (setq count (+ count 1)) (setq last n)))))

;; Run the test
(print (emirp-10k))
(quit)

And let's run it through the interpreter.

# $run *utilisp 0=emirp.ul
# Execution begins   18:18:02 
  948349
# Execution terminated   18:21:30  T=208.35 

So that's 40% quicker than the MTS LISP interpreted version. Let's try compiling the forms, which we can just do by putting

(compile prime-p reverse-digits emirp-p emirp-10k)

after the definitions and then adding the fix parameter to the command line to reserve memory for the compiler (again compilation takes less than 0.5s of the total):

$ run *utilisp 0=emirp.ul par=fix=100
# Execution begins   18:32:27 
  ; Loading Utilisp Compiler System Version(82-11-09)
  948349
# Execution terminated   18:32:54  T=27.788 

Much better - 3 times faster than the MTS LISP compiled version but 7 times slower still than FORTRAN H.

Summary of results and final thoughts

Program Time (s)
FORTRAN G 9.05
FORTRAN H 4.28
MTS LISP interpreted 347.52
MTS LISP compiled 72.35
UTILISP interpreted 208.35
UTILISP compiled 27.79

Compiled UTILISP puts up a good showing and it's definitely usable for light numerical tasks but in the end FORTRAN wins on pure performance. The UTILISP implementation is much faster than MTS LISP for both interpreted and compiled code, which you'd expect given the amount of knowledge creating fast LISP implementations that was built up between the release of the two programs.

Another factor evident from this test was the lack of LISP standardisation (pre-Common Lisp) makes porting programs between implementations a chore, whereas FORTRAN was relatively standardised.

Any suggestions for optimising the code? Please let me know in the comments.

In the next post, in a couple of weeks, we'll look at PL/1.

Further information

Full source code for the programs shown here is on github.

UTILISP - Language features

Let's look at some of the language features in UTILISP compared to the original MTS LISP.

Strings

UTILISP brings support for strings, which can also be treated as a sequence of bytes that can be accessed independently.

> (setq s "hello world")
  "hello world"
> (string-length s)
  11
> (sref s 1)
  133
> (sset s 1 134)
  134
> s
  "hfllo world"
> (string-search "world" s)
  6
> (cutout s 0 2)
  34950

In the last expression, we take the first two bytes of s and extract it as an integer. There's also bref and bset which can be used to access individual bits in a string.

Numbers

UTILISP has both fixed and floating point numbers; arithmetic operators for floating point numbers have $ at the end of the keyword to distinguish them.

> (setq f 0.0)
  +0.0000000?+00
> (setq i 0)
  0
> (0= i)
  T
> (0= f)

  @@@ ILLEGAL ARGUMENT TYPE
  +0.0000000?+00 -- C#/0=
> (0=$ f)
  T

Vectors and references

Vectors are implemented as a primitive data type: they are created with vector and addressed with vref and vset. Vectors can contain any LISP data types. Here we create a four element vector, initially filled with the number 42 in each slot.

> (setq v (vector 4 42))
  V#-8057044
> v
  V#-8057044
> (vector-length v)
  4
> (vref v 0)
  42
> (vset v 0 43)
  43
> (vset v 1 "hello")
  "hello"

You can set references, which are pointers to elements of the vector which can then be dereferenced or updated through the reference.

> (setq r (reference v 3))
  R#-8057028
> (deref r)
  42
> (setref r 99)
  99
> (vref v 3)
  99

mapv can be used to operate on each element of a vector; it passes a reference to each element.

> (mapv v (function (lambda (r) (print (deref r)))))
  43
  "hello"
  42
  99
  V#-8057044

There is also a hash function that could be used to build efficient lookup tables using vectors.

> (hash 32)
  32
> (hash "hello")
  373277

Macros

UTILISP has macro support similar to modern LISPs, with selective quoting and evaluation via backtick and comma. Here we define a macro that will change a variable to nil if it is currently t.

> (defmacro t-to-nil (v) 
      `(cond ((eq ,v t) (setq ,v nil))))
  T-TO-NIL
> (setq a 1)
  1
> (setq b t)
  T
> (t-to-nil a)
  NIL
> (t-to-nil b)
  NIL
> a
  1
> b
  NIL

Scope and closures

UTILISP has dynamic scope and no lexical-let. There also seems to be no funarg implementation although this was available in Maclisp. It is possible to mimic closures with macros:

> (defun make-adder (init)
   (let ((sym (gensym)))
     (set sym init)
     `(lambda ( val) (cond (val (setq ,sym (+ ,sym val))) (t ,sym)))))
  MAKE-ADDER
> (setq aa (make-adder 40))
  (LAMBDA (VAL) (COND (VAL ?) (T G0002)))
> (funcall aa 1)
  41
> (funcall aa 2)
  43

INFANT

One interesting macro package that comes with UTILISP is INFANT (Infix Antidote), which allows infix notation for arithmetic expressions via !:

> (! 2 + 3 * 4)
  14

Further information

MTS Volume 22 describes the UTILISP implementation and how to run it on MTS.