SNOBOL - Date formats

Let's implement a simple program in SNOBOL on MTS to print today's date in different formats.

The problem

The problem is quite simple: take today's date and display it in ISO format (eg 2015-10-11) and a human readable format (eg Sunday, October 11, 2015). Further details and implementations in other languages can be found on Rosetta Code.

Getting today's date

There's a built-in function in SNOBOL to return the date as an eight character string. Interestingly, The SNOBOL4 Programming Language says this returns 'MM/DD/YY' but on MTS it returns 'MM-DD-YY'. Let's get this and store it in a variable.

        NOW = DATE()

Breaking the date into components

We take the date and extract the month, day and year using SNOBOL's pattern matching facility, assigning them to the variables MONTH, DAY and YEAR.

        PART = SPAN("0123456789")
        SEP  = "-"
        NOW (PART . MONTH) SEP (PART . DAY) SEP (PART . YEAR)
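
As a quick check of the pattern, the sketch below runs the same match against a fixed string instead of DATE(), so the result does not depend on the day it is run. The sample value "10-11-15" and the label BAD are assumptions made for illustration and are not part of the original program; the :F(BAD) branch shows one way an unexpected format could be caught.

```snobol
* Sketch only: a fixed sample date stands in for DATE()
        NOW  = "10-11-15"
        PART = SPAN("0123456789")
        SEP  = "-"
        NOW (PART . MONTH) SEP (PART . DAY) SEP (PART . YEAR)  :F(BAD)
        OUTPUT = "MONTH=" MONTH " DAY=" DAY " YEAR=" YEAR  :(END)
* BAD is a hypothetical label for the failure case
BAD     OUTPUT = "Unexpected date format: " NOW
END
```

With the sample value this should print MONTH=10 DAY=11 YEAR=15. Note that the components are still strings at this point, so any leading zeros are kept.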

Y2K strikes again

The value returned from DATE() has a two digit year, so let's assume we are running this in the 21st century.

        CENTURY = 2000
        CYEAR = YEAR + CENTURY

Displaying in ISO format

Displaying the date in ISO format is now simply a case of concatenating the four digit year, month and day, with the separator between them, and then outputting the result.

        ISO = CYEAR SEP MONTH SEP DAY
        OUTPUT = ISO

Day of the week

We now turn to the human readable form but there is a slight problem - we need to know which day of the week it is, eg Monday, and there is no facility in SNOBOL to calculate this. There may be an MTS external library we could call to get this, but instead we will use Gauss' algorithm to derive the day as a number from 0 (Sunday) to 6 (Saturday).

* GYEAR is the 4 digit year, unless Jan or Feb then subtract 1
* GMONTH counts months from March = 1, so Jan is 11 and Feb is 12
        GT(MONTH, 2)               :S(G1)F(G2)
G1      GYEAR = CYEAR  
        GMONTH = MONTH - 2         :(GX)
G2      GYEAR = CYEAR - 1  
        EQ(MONTH, 1)               :S(G3)F(G4)
G3      GMONTH = 11                :(GX)  
G4      GMONTH = 12                :(GX)  
GX      WDAY = REMDR(DAY, 7)  
* Calculate the month term 
        MT = (2.6 * GMONTH) - 0.2
* Add the month term - the 0.00005 is needed due to lack of FP precision
        WDAY = WDAY + REMDR(CONVERT(MT + 0.00005, 'INTEGER'), 7)
        WDAY = WDAY + 5 * REMDR(REMDR(GYEAR, 4), 7)
        WDAY = WDAY + 4 * REMDR(REMDR(GYEAR, 100), 7)
        WDAY = WDAY + 6 * REMDR(REMDR(GYEAR, 400), 7)
        WDAY = REMDR(WDAY, 7)
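
To try the calculation on dates other than today, the same steps could be wrapped in a function. This is only a sketch of the code above restructured for testing: the names WEEKDAY, WD1 and MAIN are hypothetical, and the three sample dates were chosen because their weekdays are well known.

```snobol
* Sketch only: WEEKDAY, WD1 and MAIN are hypothetical names
        DEFINE("WEEKDAY(Y,M,D)GY,GM,MT")        :(MAIN)
* Returns 0 (Sunday) to 6 (Saturday) for year Y, month M, day D
WEEKDAY GY = Y
        GM = M - 2
        GT(M, 2)                                :S(WD1)
* Jan and Feb count as months 11 and 12 of the previous year
        GY = Y - 1
        GM = M + 10
* Month term, with the same 0.00005 nudge before truncating
WD1     MT = CONVERT((2.6 * GM) - 0.2 + 0.00005, 'INTEGER')
        WEEKDAY = D + MT + 5 * REMDR(GY, 4)
        WEEKDAY = WEEKDAY + 4 * REMDR(GY, 100)
        WEEKDAY = WEEKDAY + 6 * REMDR(GY, 400)
        WEEKDAY = REMDR(WEEKDAY, 7)             :(RETURN)
MAIN    OUTPUT = WEEKDAY(2015, 10, 11)
        OUTPUT = WEEKDAY(2000, 1, 1)
        OUTPUT = WEEKDAY(2024, 2, 29)
END
```

Working through the arithmetic by hand gives 0, 6 and 4, which matches Sunday 11 October 2015, Saturday 1 January 2000 and Thursday 29 February 2024. The sketch reduces modulo 7 only once at the end rather than at each step, which gives the same result as the numbers involved stay small.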

Month and day names

We will need a way of translating a month and day number into a name, eg January or Monday. SNOBOL's arrays can be used for this. Note that the DAYS array is indexed from 0 to 6 instead of 1 to 7.

        MONTHS = ARRAY("12")
        MONTHS<1> = "January"
        MONTHS<2> = "February"
* ...
        MONTHS<11> = "November"
        MONTHS<12> = "December"


        DAYS = ARRAY("0:6")
        DAYS<0> = "Sunday"
        DAYS<1> = "Monday"
* ...
        DAYS<5> = "Friday"
        DAYS<6> = "Saturday"

Displaying in readable format

We now have all the components to display the date in readable format.

        READABLE = DAYS<WDAY> ", " MONTHS<MONTH> " " DAY ", " CYEAR
        OUTPUT = READABLE

Running the program

Here's what the output of the program looks like.

# $run *snobol4 5=date.sn
# Execution begins   21:09:05

 SNOBOL4 (VERSION 3.10, APRIL 1, 1973)
 (MTS IMPLEMENTATION MAY 1, 1975)

 0 SYNTACTIC ERROR(S) IN SOURCE PROGRAM


 2015-10-11
 Sunday, October 11, 2015


 NORMAL TERMINATION AT LEVEL  0
 LAST STATEMENT EXECUTED WAS   53


 SNOBOL4 STATISTICS SUMMARY
              38 MS. COMPILATION TIME
               1 MS. EXECUTION TIME
              41 STATEMENTS EXECUTED,       0 FAILED
              21 ARITHMETIC OPERATIONS PERFORMED
               1 PATTERN MATCHES PERFORMED
               0 REGENERATIONS OF DYNAMIC STORAGE
               0 READS PERFORMED
               2 WRITES PERFORMED
            0.02 MS. AVERAGE PER STATEMENT EXECUTED

# Execution terminated   21:09:05  T=0.045

Final thoughts on SNOBOL

Using SNOBOL feels close to modern scripting languages such as Perl, Python or Ruby. I really like the pattern matching facilities where you can do things like EXPR = TERM | *EXPR OP TERM which is much more powerful than regular expressions; I don't think there is any modern language that has this built in. The lack of control flow processing apart from GOTO is annoying; later versions of the language such as SPITBOL addressed this. I imagine that it was also rather slow when running on a mainframe, especially as it had to be compiled each time it was run.

I don't think SNOBOL is much in use today, but the maintainer of SPITBOL is still active.

Further information

Full source code for this program is on GitHub.

SNOBOL - Language features

Hello again. Let's take a closer look at the SNOBOL language. To run the examples on MTS, see the instructions in the previous article.

Statements

There is only one statement format in SNOBOL, but each part of the statement is optional:

label subject pattern = replacement :goto  

An example will make this clearer. This will read lines from input and print to output until QUIT is found anywhere in the input line.

BEGIN  LINE = INPUT  
       LINE "QUIT"     :S(END)
       OUTPUT = LINE   :(BEGIN)
END  

The first line BEGIN LINE = INPUT has a label (BEGIN) and assigns the replacement value INPUT to the subject variable LINE. INPUT is a special variable: reading it takes a line from the input device.

The second line has the subject LINE and matches that against the pattern "QUIT". If the match succeeds, the goto :S(END) will jump to the final line with the END label.

The third line prints the value of the replacement, LINE, by assigning it to the subject, which is the special variable OUTPUT. The goto here is unconditionally back to BEGIN.

The fourth line contains just the label END.

Dynamic variables

SNOBOL is the first language we've seen on MTS that has dynamic typing. Variables do not need to be pre-declared and their values can change types easily. In the below, J starts off containing a string but then is changed so it contains an integer; 42 is printed.

J = "ABC"  
A = 20  
B = "22"  
J = A + B  
OUTPUT = J  
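
The change of type can be observed with the built-in DATATYPE function, which returns the name of the type a value currently has. A minimal sketch, reusing the variable J from above:

```snobol
* Sketch only: DATATYPE reports the type currently held by J
        J = "ABC"
        OUTPUT = DATATYPE(J)
* The string "22" is converted to an integer by the addition
        J = 20 + "22"
        OUTPUT = DATATYPE(J) " " J
END
```

This should print STRING followed by INTEGER 42.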

You can also refer to and create variables indirectly via the $ operator. In the below, $A uses the value of A, the string B, as the name of the variable to read, so it will print "123".

A = "B"
B = "123"
OUTPUT = $A

Arrays, tables and data types

Arrays can be defined and initialised with ARRAY, eg for a tic-tac-toe board you could do:

BOARD = ARRAY("3,3", " ")  
BOARD<2,2> = "O"  
BOARD<1,1> = "X"  

Associative arrays can be defined with TABLE:

DIRECTORY = TABLE()  
DIRECTORY<"John"> = 123  

User defined data types can be created, eg for a 2D point type, the below will print 4:

DATA("POINT(X,Y)")  
P = POINT(3, 4)  
OUTPUT = Y(P)  

Control flow

The only control flow available in SNOBOL is goto. Each statement evaluates to either success or failure, and a jump to another statement can be made based on this result or unconditionally. In the below, a different jump for success and failure is defined on the second line.

START X = INPUT  
      X "YES" :S(START)F(END)
      OUTPUT = "This is not reached"
END  
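
Loops are built the same way, from a test that succeeds or fails plus a goto. The following is a minimal sketch that sums the integers from 1 to 10; the variable names and the labels LOOP and DONE are chosen for illustration.

```snobol
* Sketch only: sum the integers 1 to 10 using gotos
        I = 0
        SUM = 0
* GE succeeds once I reaches 10, ending the loop
LOOP    GE(I, 10)                :S(DONE)
        I = I + 1
        SUM = SUM + I            :(LOOP)
DONE    OUTPUT = SUM
END
```

This should print 55.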

Pattern matching

The above example is a simple pattern matching test: if the variable X contains YES then the statement is successful. SNOBOL has many more pattern matching constructs: some are shown below along with a string that would be a successful match.

  • Simple concatenation of strings matches the sequence, either | or ANY can be used for alternation.
"HAMLET"      "ML" "ET"
"THE TEMPEST" "TOMP" | "TEMP"
"MACBETH"     "MA" ANY("CDY") "BETH"
  • Patterns can be grouped together with brackets:
"A MIDSUMMER NIGHT'S DREAM"   "MID" ("SUMMER" | "WINTER")
  • ARB matches an arbitrary number of characters
"OTHELLO"     "H" ARB "LO"
  • LEN matches a fixed length run of characters
"HENRY IV PART I"     "HENRY " LEN(2) " PART " LEN(1)
  • SPAN matches a run of anything from a set of characters, BREAK the opposite
"PERICLES"   "PER" SPAN("CIXZ") BREAK("ABCDE") "ES"
  • BAL matches a string which has balanced parentheses (including no parentheses), so the pattern
"TO BE" BAL "."

would match "TO BE (OR NOT TO BE)." and "TO BE OR NOT TO BE." but not "TO BE ((OR NOT TO BE)."

By default, SNOBOL will match at any position in the subject; it can be forced to match only from the first character by setting the keyword &ANCHOR to a non-zero value.
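
The effect of &ANCHOR can be seen by running the same match twice. This is a minimal sketch using the OTHELLO pattern from the list above; the label ANCH is hypothetical.

```snobol
* Sketch only: the same pattern, first unanchored and then anchored
        S = "OTHELLO"
        S "H" ARB "LO"                   :F(ANCH)
        OUTPUT = "Unanchored: match"
* With &ANCHOR set, the match must begin at the first character
ANCH    &ANCHOR = 1
        S "H" ARB "LO"                   :S(END)
        OUTPUT = "Anchored: no match"
END
```

Both messages should be printed: the unanchored match finds the H in the middle of the string, while the anchored one fails because the string starts with O.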

Patterns can be defined and referred to later; a pattern can be referred to recursively in the same pattern with *.

A more complex example is below, which will match simple arithmetic expressions, eg Z=21 or X+Y*Z=42.

BEGIN LINE = INPUT  
      &ANCHOR = 1
      NUM     = SPAN("0123456789")
      TERM    = ANY("XYZ")
      OP      = ANY("+-*/")
      EXPR    = TERM | *EXPR OP TERM
      LINE    EXPR "=" NUM            :S(END)
      OUTPUT = LINE                   :(BEGIN)
END  

Replacement and assignment

If any of the above patterns matches, simple replacement can be done by using pattern = replacement, so the below will replace the first occurrence of A with Z.

 LINE = INPUT
 LINE "A" = "Z"
 OUTPUT = LINE
END  
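
To replace every occurrence rather than just the first, the statement can jump back to itself for as long as the match succeeds. A minimal sketch, with a fixed string in place of INPUT and a hypothetical label LOOP:

```snobol
* Sketch only: repeat the replacement until the match fails
        LINE = "BANANA"
LOOP    LINE "A" = "Z"           :S(LOOP)
        OUTPUT = LINE
END
```

This should print BZNZNZ. The loop only terminates because the replacement does not itself contain the pattern; replacing "A" with "AA" in this way would never finish.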

Assignment of a substring to a variable can be done with the binary operator . (which assigns only if the whole pattern matches) or $ (which assigns as soon as its part of the pattern matches, even if the whole pattern then fails). So for the line below, AQQQZ will cause FIRST to be A and LAST to be Z, but AQQQ will cause neither to be set.

LINE ANY("ABC") . FIRST BREAK("XYZ") ANY("XYZ") . LAST  

Instead, if you do

LINE ANY("ABC") $ FIRST BREAK("XYZ") ANY("XYZ") $ LAST  

then AQQQ will cause FIRST to be set to A. Note that QQQZ will not match to LAST in either case.
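
The difference between the two operators can be checked with a short test. This is a sketch using the AQQQ case from above, with square brackets added around the output so that an empty value is visible.

```snobol
* Sketch only: conditional (.) versus immediate ($) assignment
        LINE = "AQQQ"
* The whole pattern fails, so . assigns nothing to FIRST
        LINE ANY("ABC") . FIRST BREAK("XYZ") ANY("XYZ") . LAST
        OUTPUT = "Conditional: [" FIRST "]"
* ANY("ABC") matches A, so $ assigns it before the rest fails
        LINE ANY("ABC") $ FIRST BREAK("XYZ") ANY("XYZ") $ LAST
        OUTPUT = "Immediate: [" FIRST "]"
END
```

This should print Conditional: [] and then Immediate: [A].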

Built in functions

SNOBOL has a number of built in functions. Function parameters are passed by value and the function can return a value.

  • LT, GT, EQ etc for numeric comparison
  • LGT compares two values lexically and succeeds if the first is after the second
  • INTEGER to test if a value is an integer
  • IDENT and DIFFER to compare two values: IDENT succeeds if they have the same type and value, DIFFER succeeds if they do not
  • SIZE for string size, DUPL(x, n) to create a string by repeating the value of x n times
  • TRIM to remove trailing blanks
  • EVAL(x) will evaluate the expression in the string x at run time; APPLY(f, a, b...) will take the string f and run its value as a function, passing in variables a, b etc.
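
A few of these functions in use, as a minimal sketch with fixed values rather than input:

```snobol
* Sketch only: a few of the built-in functions in use
        S = TRIM("ABC   ")
        OUTPUT = SIZE(S)
* DUPL takes the string first and the repeat count second
        OUTPUT = DUPL("AB", 3)
* IDENT succeeds only if both values have the same type and value
        IDENT(S, "ABC")          :F(END)
        OUTPUT = "S is ABC"
END
```

This should print 3, ABABAB and S is ABC. Note that the test functions do not return a true or false value: they succeed or fail, and the goto acts on that.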

User defined functions

It is possible to define functions, but the syntax is clumsy. First you need to define a function with DEFINE("name(args)locals", "entrypoint"). args and locals are lists of arguments and local variables used by the function. The label entrypoint sets the start of the function; if omitted, name is used as the entry point.

Next, write the body of the function starting at the entry point label. The return value can be set by assigning to the function name, and the function is exited by goto-ing RETURN.

A simple example:

       DEFINE("CENTRE(S)L,P")

BEGIN  LINE = TRIM(INPUT)  
       CENTRED = CENTRE(LINE)
       OUTPUT = CENTRED        :(END)

CENTRE L = (80 - SIZE(S)) / 2  
       P = DUPL(".", L)
       CENTRE = P S P          :(RETURN)
END  

This will input a string and then pad it with leading and trailing dots to make it display on the centre of a line.

Functions can be recursive in SNOBOL.
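
The usual illustration of recursion is a factorial function. A minimal sketch, where FACT and MAIN are hypothetical names:

```snobol
* Sketch only: FACT and MAIN are hypothetical names
        DEFINE("FACT(N)")                :(MAIN)
* Base case: the result is 1 when N is 1 or less
FACT    FACT = 1
        LE(N, 1)                         :S(RETURN)
* Otherwise multiply N by the factorial of N - 1
        FACT = N * FACT(N - 1)           :(RETURN)
MAIN    OUTPUT = FACT(5)
END
```

This should print 120. Inside the body, FACT without brackets is the variable holding the return value, while FACT(N - 1) is a call to the function.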

Further information

The language reference manual "The SNOBOL4 Programming Language", linked at snobol4.org, has complete information on the language and was used to assemble this. Take a look at the example programs in the appendix for a taste of what can be done in SNOBOL.

SNOBOL - Introduction

In this series we'll look at SNOBOL, a unique pattern matching language, and its implementation on MTS.

SNOBOL overview

SNOBOL (StriNg Oriented and symBOlic Language) was developed at Bell Labs in the 1960s to help with a symbolic manipulation project. It has powerful pattern matching and string manipulation features but a simple syntax: there are no control flow instructions apart from goto, and variables are dynamically typed and don't need declarations. It started to spread to other sites and was taught at some universities in the 1970s. The original implementation was for the IBM 7090 but versions were ported to the IBM S/360 and DEC PDP-10. Its use started to die out in the 1980s but its creators went on to work on the Icon language and it influenced later text manipulation languages such as AWK and Perl.

SNOBOL on MTS

The main implementation that we will run here is the *SNOBOL4 interpreter. Also available on the D6.0 tapes is *SNOBOL4B which has an extension to the core language for printing blocks, two- and three-dimensional visualisations of data.

MTS originally had a number of other implementations of SNOBOL that are not available on the D6.0 tapes due to copyright reasons:

  • *SPITBOL - a fast SNOBOL 4 compiler from the Illinois Institute of Technology.
  • *SNOSTORM - a SNOBOL preprocessor written at UM to add structured programming features

Prerequisites

No special installation is needed to get SNOBOL running - just do the standard D6.0 setup as described in this guide and then sign on as a regular user such as ST01.

Using *SNOBOL4

*SNOBOL4 will read the source code for the program and then any input from unit 5 (by default *source* ie standard input). If you want to take the source code from a file prog.sn and then enter input from the keyboard you could do something like:

# $run *snobol4 5=prog.sn+*source*

Other parameters to *SNOBOL4 are listed in MTS Volume 9.

Hello world

Here's a transcript of a session where we run a Hello world program. This assumes the source code is contained in the file hello.sn. Note that the code is not free format: only goto labels and comments (starting with *) are allowed in the first column.

# $list hello.sn

      1     * SNOBOL program to print Hello World
      2           I = 1
      3     LOOP  OUTPUT = "Hello, world!"
      4           I = I + 1
      5           LE(I, 5) : S(LOOP)
      6     END

# $run *snobol4 5=hello.sn

 SNOBOL4 (VERSION 3.10, APRIL 1, 1973)
 (MTS IMPLEMENTATION MAY 1, 1975)

         * SNOBOL program to print Hello World
 *1            I = 1
 *2      LOOP  OUTPUT = "Hello, world!"
 *3            I = I + 1
 *4            LE(I, 5) : S(LOOP)
 *5      END

    0 SYNTACTIC ERROR(S) IN SOURCE PROGRAM

 Hello, world!
 Hello, world!
 Hello, world!
 Hello, world!
 Hello, world!

 NORMAL TERMINATION AT LEVEL  0
 LAST STATEMENT EXECUTED WAS    4

SNOBOL4 STATISTICS SUMMARY

               5 MS. COMPILATION TIME
               2 MS. EXECUTION TIME
              16 STATEMENTS EXECUTED,       1 FAILED
               5 ARITHMETIC OPERATIONS PERFORMED
               0 PATTERN MATCHES PERFORMED
               0 REGENERATIONS OF DYNAMIC STORAGE
               0 READS PERFORMED
               5 WRITES PERFORMED
            0.13 MS. AVERAGE PER STATEMENT EXECUTED

Further information

MTS Volume 9 describes the SNOBOL compilers available on MTS and includes a basic tutorial on the language.

snobol4.org has lots of information about SNOBOL's history, implementations and links to books including the main reference manual for the language, "The SNOBOL4 Programming Language".