SNOBOL - Language features
Hello again. Let’s take a closer look at the SNOBOL language. To run these examples on MTS, see the instructions in the previous article.
Statements
There is only one statement format in SNOBOL, but each part of the statement is optional:
label subject pattern = replacement :goto
An example will make this clearer. This will read lines from input and print them to output until QUIT is found anywhere in the input line.
BEGIN   LINE = INPUT
        LINE "QUIT"             :S(END)
        OUTPUT = LINE           :(BEGIN)
END
The first line, BEGIN LINE = INPUT, has a label (BEGIN) and assigns to the subject variable LINE the replacement value INPUT, a special variable that reads a line from the input device.
The second line has the subject LINE and matches it against the pattern "QUIT". If the match succeeds, the goto field :S(END) jumps to the final line, which carries the END label.
The third line prints the value of the replacement, LINE, by assigning it to the subject, which is the special variable OUTPUT. The goto here jumps unconditionally back to BEGIN.
The fourth line contains just the label END, which marks the end of the program.
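The program above never checks for the end of the input. In SNOBOL4 a reference to INPUT fails once the input is exhausted, so a variant that also stops cleanly at end of file might look like the sketch below. The only change is the added :F(END) goto on the first line.

```snobol
* Copy input to output until a line containing QUIT, or end of input.
BEGIN   LINE = INPUT            :F(END)
        LINE "QUIT"             :S(END)
        OUTPUT = LINE           :(BEGIN)
END
```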
Dynamic variables
SNOBOL is the first language we’ve seen on MTS that has dynamic typing. Variables do not need to be declared and a variable's value can change type freely. In the example below, J starts off containing a string but is then assigned an integer; the string "22" in B is converted to a number for the addition, so 42 is printed.
        J = "ABC"
        A = 20
        B = "22"
        J = A + B
        OUTPUT = J
You can also refer to and create variables indirectly via the $ operator. The below will print “123”.
        A = "B"
        B = "123"
        OUTPUT = $A
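Indirect reference also works on the left-hand side of an assignment, which is how a variable can be created from a name held in a string. A minimal sketch, where NAME and COUNT are names chosen purely for illustration:

```snobol
* NAME holds the string "COUNT", so $NAME refers to the variable COUNT.
        NAME = "COUNT"
        $NAME = 10
        OUTPUT = COUNT
END
```

This should print 10.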
Arrays, tables and data types
Arrays can be defined and initialised with ARRAY, eg for a tic-tac-toe board you could do:
        BOARD = ARRAY("3,3", " ")
        BOARD<2,2> = "O"
        BOARD<1,1> = "X"
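A reference to an array element outside the declared bounds fails, which gives a natural way to end a loop over an array. The sketch below is an illustration only (the names A, I, FILL and SHOW are made up for it): it fills a three-element array with squares and stops when the subscript runs off the end.

```snobol
* Fill A<1>..A<3> with squares; the reference A<4> fails and ends the loop.
        A = ARRAY(3)
        I = 1
FILL    A<I> = I * I            :F(SHOW)
        I = I + 1               :(FILL)
SHOW    OUTPUT = A<1> " " A<2> " " A<3>
END
```

This should print 1 4 9.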
Associative arrays can be defined with TABLE:
        DIRECTORY = TABLE()
        DIRECTORY<"John"> = 123
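Looking up a key that has never been assigned yields the null string rather than an error. A small sketch of reading entries back, assuming the one-argument form of IDENT (which compares its argument with the null string); the key "Mary" is made up for the example:

```snobol
* Read back one stored entry and test for a missing one.
        DIRECTORY = TABLE()
        DIRECTORY<"John"> = 123
        OUTPUT = DIRECTORY<"John">
* IDENT succeeds here because the entry for Mary is null.
        OUTPUT = IDENT(DIRECTORY<"Mary">) "no entry for Mary"
END
```

This should print 123 followed by the "no entry" message.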
User defined data types can be created, eg for a 2D point type, the below will print 4:
        DATA("POINT(X,Y)")
        P = POINT(3, 4)
        OUTPUT = Y(P)
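The field functions created by DATA can be used on the left of an assignment as well as for reading. A brief sketch using the same POINT type:

```snobol
* Update the X field of a POINT in place, then print both fields.
        DATA("POINT(X,Y)")
        P = POINT(3, 4)
        X(P) = X(P) + 10
        OUTPUT = X(P) "," Y(P)
END
```

This should print 13,4.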
Control flow
The only control flow available in SNOBOL is goto. Each statement evaluates to either success or failure, and a jump to another statement can be made based on this result or unconditionally. In the example below, different jumps for success and failure are defined on the second line.
START   X = INPUT
        X "YES"                 :S(START)F(END)
        OUTPUT = "This is not reached"
END
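Because a failing function call makes the whole statement fail, a counting loop can be built from nothing more than a comparison and a conditional goto. A minimal sketch using the built-in numeric comparison LT (the names N and LOOP are arbitrary):

```snobol
* Print the integers 1 to 5.
        N = 1
LOOP    OUTPUT = N
* LT returns the null string on success, so N becomes N + 1;
* when N reaches 5 it fails, the assignment is skipped and the loop ends.
        N = LT(N, 5) N + 1      :S(LOOP)
END
```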
Pattern matching
The above example is a simple pattern matching test: if the variable X contains YES then the statement is successful. SNOBOL has many more pattern matching constructs: some are shown below, each with a subject string that would match successfully.
- Simple concatenation of strings matches the sequence; either | or ANY can be used for alternation (ANY matches any single character from a set).
        "HAMLET" "ML" "ET"
        "THE TEMPEST" "TOMP" | "TEMP"
        "MACBETH" "MA" ANY("CDY") "BETH"
- Patterns can be grouped together with brackets:
        "A MIDSUMMER NIGHT'S DREAM" "MID" ("SUMMER" | "WINTER")
- ARB matches an arbitrary number of characters
        "OTHELLO" "H" ARB "LO"
- LEN matches a fixed length run of characters
        "HENRY IV PART I" "HENRY " LEN(2) " PART " LEN(1)
- SPAN matches a run of characters that are all from a given set; BREAK matches a run of characters up to, but not including, the first character from a given set
        "PERICLES" "PER" SPAN("CIXZ") BREAK("ABCDE") "ES"
- BAL matches a non-empty string which has balanced parentheses (including one with no parentheses at all), so the pattern
        "TO BE" BAL "."
would match “TO BE (OR NOT TO BE).” and “TO BE OR NOT TO BE.” but not “TO BE ((OR NOT TO BE).”
By default, SNOBOL will try to match starting at any position in the subject; the match can be forced to start at the first character by setting the keyword &ANCHOR to a non-zero value.
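The effect of &ANCHOR can be seen by running the same match twice. This is an illustrative sketch only; the variable S and the label NEXT are made up for it.

```snobol
* Unanchored, "TEMP" is found in the middle of the subject.
        S = "THE TEMPEST"
        S "TEMP"                        :F(NEXT)
        OUTPUT = "unanchored: matched"
* Anchored, the match must begin at the first character, so it fails.
NEXT    &ANCHOR = 1
        S "TEMP"                        :S(END)
        OUTPUT = "anchored: no match"
END
```

Both messages should be printed.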
Patterns can be assigned to variables and referred to later; a pattern can refer to itself recursively by prefixing its name with *, which defers evaluation until the match is performed.
A more complex example is below, which will match simple arithmetic expressions, eg Z=21 or X+Y*Z=42.
BEGIN   LINE = INPUT
        &ANCHOR = 1
        NUM = SPAN("0123456789")
        TERM = ANY("XYZ")
        OP = ANY("+-*/")
        EXPR = TERM | *EXPR OP TERM
        LINE EXPR "=" NUM       :S(END)
        OUTPUT = LINE           :(BEGIN)
END
Replacement and assignment
If a pattern matches, the matched part of the subject can be replaced by using pattern = replacement, so the below will replace the first occurrence of A with Z.
        LINE = INPUT
        LINE "A" = "Z"
        OUTPUT = LINE
END
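Since only the first occurrence is replaced, replacing every occurrence takes a loop: repeat the statement for as long as the match succeeds. A minimal sketch with a fixed string instead of INPUT so the result is predictable:

```snobol
* Replace every A with Z by looping until the match fails.
        LINE = "BANANA"
LOOP    LINE "A" = "Z"          :S(LOOP)
        OUTPUT = LINE
END
```

This should print BZNZNZ.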
Assignment of a matched substring to a variable can be done with the binary operator . (which assigns only if the whole pattern matches) or $ (which assigns as soon as its sub-pattern matches, even if the whole pattern then fails). So for the line below, AQQQZ will cause FIRST to be A and LAST to be Z, but AQQQ will cause neither to be set.
        LINE ANY("ABC") . FIRST BREAK("XYZ") ANY("XYZ") . LAST
Instead, if you do
        LINE ANY("ABC") $ FIRST BREAK("XYZ") ANY("XYZ") $ LAST
then AQQQ will cause FIRST to be set to A. Note that QQQZ will not set LAST in either case, because the match fails at ANY("ABC") before the final Z is ever reached.
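A common use of conditional assignment is splitting a line into fields. The sketch below is an illustration under a couple of assumptions: the names NAME and VAL are made up, and REM is a built-in pattern (not covered above) that matches the remainder of the subject.

```snobol
* Split "KEY=VALUE" at the equals sign.
        LINE = "KEY=VALUE"
        LINE BREAK("=") . NAME "=" REM . VAL    :F(END)
        OUTPUT = NAME
        OUTPUT = VAL
END
```

This should print KEY and then VALUE on separate lines.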
Built in functions
SNOBOL has a number of built in functions. Function parameters are passed by value and the function can return a value.
- LT, GT, EQ etc for numeric comparison
- LGT compares two strings lexically and succeeds if the first comes after the second
- INTEGER to test if a value is an integer
- IDENT and DIFFER to compare two values: IDENT succeeds if they have the same type and value, DIFFER succeeds if they do not
- SIZE for string size, DUPL(x, n) to create a string by repeating the value of x n times
- TRIM to remove trailing blanks
- EVAL(x) will evaluate the expression in the string x at run time; APPLY(f, a, b…) will take the string f and run its value as a function, passing in variables a, b etc.
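A few of these functions in action, as a rough sketch with literal values chosen for illustration. Predicates such as LGT return the null string when they succeed, so they can be placed in front of a value without changing it.

```snobol
* TRIM removes the trailing blanks, leaving "ABC".
        S = TRIM("ABC   ")
        OUTPUT = SIZE(S)
        OUTPUT = DUPL(S, 2)
* The message is only printed because the comparison succeeds.
        OUTPUT = LGT("B", "A") "B sorts after A"
* The string is evaluated as an expression at run time.
        OUTPUT = EVAL("2 + 3")
END
```

The expected output is 3, ABCABC, the message, and 5.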
User defined functions
It is possible to define functions, but the syntax is clumsy. First you need to declare the function with DEFINE("name(args)locals", "entrypoint"). args and locals are comma-separated lists of the arguments and local variables used by the function. The label entrypoint marks the start of the function body; if omitted, name is used as the entry point.
Next, write the function body starting at that label. The return value is set by assigning to the function name, and the function is exited with a goto to RETURN.
A simple example:
        DEFINE("CENTRE(S)L,P")
BEGIN   LINE = TRIM(INPUT)
        CENTRED = CENTRE(LINE)
        OUTPUT = CENTRED        :(END)
CENTRE  L = (80 - SIZE(S)) / 2
        P = DUPL(".", L)
        CENTRE = P S P          :(RETURN)
END
This will input a string and pad it with leading and trailing dots so that it is displayed in the centre of an 80-column line.
Functions can be recursive in SNOBOL.
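As an illustration of recursion, here is a sketch of the usual factorial function (FACT is a name chosen for the example; LE is the built-in "less than or equal" numeric comparison). Each call gets its own copy of the argument N.

```snobol
* Recursive factorial: FACT(N) = N * FACT(N - 1), with FACT(1) = 1.
        DEFINE("FACT(N)")
        OUTPUT = FACT(5)                        :(END)
* Base case: if N <= 1 the result is 1 and we return immediately.
FACT    FACT = LE(N, 1) 1                       :S(RETURN)
        FACT = N * FACT(N - 1)                  :(RETURN)
END
```

This should print 120.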
Further information
The language reference manual “The SNOBOL4 Programming Language”, linked at snobol4.org, has complete information on the language and was used to assemble this. Take a look at the example programs in the appendix for a taste of what can be done in SNOBOL.