APL - Knuth Shuffle

In the final part of this series, let’s create a real program in APL. As before, we’ll show both the original and transliterated version so we can run the program on MTS’s APL.

The problem

We’ll implement Knuth Shuffle (also known as Fisher/Yates shuffle) from Rosetta Code. This produces a random permutation of a vector.

Using deal

Dyadic ?, or deal, looks an ideal candidate here. Recall from the previous part that x ? y means take x unique items from the population 1 … y. This means that x ? x will give a random permutation of all the values 1 … x, which we can use as vector indices.

What else will we need?

Putting this together:

∇S ← SHUFFLE1 V                     "S = SHUFFLE1 V    
S ← V[(⍴ V) ? ⍴ V]                  S = V[($RH V) ? $RH
∇                                   "                  

Let’s try to run this and see if it works:

        SHUFFLE1 11 22 33
11  33  22  
        SHUFFLE1 11 22 33
22  11  33  
        SHUFFLE1 11 22
22  11  
        SHUFFLE1 11 22
22  11  
        SHUFFLE1 11 22
11  22  
        SHUFFLE1 ,11
11  

In the last example we need to use ravel (,) to change the input scalar 11 into a vector of length 1 containing 11.

This function will also work on strings, which are treated as vectors of characters:

        SHUFFLE1 'FACE'
FCEA  
        SHUFFLE1 'FACE'
ACFE  

Implementing the algorithm

What if we wanted to implement the algorithm itself, rather then just use deal? Let’s start by looking at the pseudocode given on Rosetta Code. For a vector with indices 0 - last:

for i from last downto 1 do:
    let j = random integer in range 0 ≤ j ≤ i
    swap items[i] with items[j]

What language features will we need from APL?

Here’s the complete program:

∇S ← SHUFFLE2 V;I;J                 "S = SHUFFLE2 V;I;J
I ← ⍴ V                             I = $RH V
→ (3,7)[1 + I ≤ 1]                  $GO (3,7)[1 + I $LE 1]
J ← ? I                             J = ? I
V[I,J] ← V[J,I]                     V[I,J] = V[J,I]
I ← I - 1                           I = I - 1
→ 2                                 $GO 2
S ← V                               S = V
∇                                   "

Line 2 is tricky: it branches to lines in the vector in round brackets based on the expression in the square brackets. If I is less than or equal to 1 then the expression evaluates to 2, so control jumps to line 7 (S ← V) where the output value is assigned and the function ends. Otherwise it will continue to line 3 and swap an element.

Running this on some test cases shows this works as expected.

Ideally we’d like to use idiomatic APL and avoid a loop altogether. Modern APLs have control structures like repeat, but I can’t see an easy way to do this in APL\360 - if you can, please add a comment!

Performance

Let’s do a test by running each of these implementations 60,000 times with the same input (1 2 3). We can use code like this:

∇RUN1;I                             "RUN1;I
I ← 60000                           I = 60000
SHUFFLE1 1 2 3                      SHUFFLE1 1 2 3
I ← I - 1                           I = I - 1
→ 2 × I > 0                         $GO 2 * I $GT 0
∇                                   "

On my system, SHUFFLE1 takes 7.3s to run and SHUFFLE2 24.6s, which we’d expect as SHUFFLE1 uses the system provided deal function.

I also ran a quick check for randomness by printing the results to a file and seeing how many times each permutation occurred. Over 60,000 runs we’d expect each permutation to appear around 10,000 times; SHUFFLE1 was within 0.5% of that and SHUFFLE2 0.6%.

Internally, ? uses a pseudo-random number generator so true randomness cannot be achieved. Also with a 32 bit word, the number of possible states is 2^32, which limits the number of combinations that can occur. For example, if we wanted to shuffle a deck of 52 cards there are 52! combinations, much more than 2^32.

Final thoughts on APL

APL’s use of a large number of symbols is what first strikes you when learning the language. I found by going through the tutorials that you start picking them up quite quickly, and it’s not hard to write simple programs. However, reading other people’s programs can be difficult, given its compact form and right-to-left structure; I imagine that understanding a large APL code base would take some time.

The language itself is unique and even beautiful; some times I feel like I’m writing mathematics rather than coding. If you have the opportunity then I’d recommend trying it out.

The transliterations needed on emulated MTS add another level of difficulty and unless you want to experience the historical aspects of running APL like this I’d recommend learning APL with a modern implementation that can use APL symbols directly. There are several commercial implementations that run on modern hardware, for example Dynalog. The GNU project also has GNU APL. Several languages have been derived from APL, including K, used for financial analysis.

Further information

Full source code for this program can be found on github.

Jeff Atwood’s Coding Horror blog has a good post on shuffling with a follow-up on the dangers of mis-implementing the algorithm.

Eugene McDonnell wrote an interesting article on How the Roll Function Works in APL\360 and other APLs, giving insight on its implementation.

To see some amazing APL one-liners in action, check out Sixteen APL Amuse-Bouches.

Comments

comments powered by Disqus