Calculation v Look-Up

(⎕io←0 and timings are done in Dyalog version 15.0.)

Table Look-Up

Some functions can be computed faster by table look-up than a more traditional and more conventional calculation. For example:


      cmpx '*b' '(*0 1)[b]'
 *b        → 1.32E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
 (*0 1)[b] → 1.23E¯3 | -91% ⎕⎕⎕

Some observations about this benchmark:

(a) The advantage of table look-up depends on the time to calculate the function directly, versus the time to do indexing, ⍺[⍵] or ⍵⌷⍺, plus a small amount of time to calculate the function on a (much) smaller representative set.

Indexing is particularly efficient when the indices are boolean:

      cmpx '2 3[bb]' '2 3 3[bi]'
 2 3[bb]   → 1.12E¯4 |    0% ⎕⎕⎕⎕⎕
 2 3 3[bi] → 7.39E¯4 | +558% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

bi is the same as bb except that the last element of 2 forces it to be 1-byte integers (for bi) instead of 1-bit booleans (for bb).

(b) Although the faster expression works best with 0-origin, the timing for 1-origin is similar.

      cmpx '*b' '{⎕io←0 ⋄ (*0 1)[⍵]}b'
 *b                   → 1.32E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
 {⎕io←0 ⋄ (*0 1)[⍵]}b → 1.22E¯3 | -91% ⎕⎕⎕

That is, if the current value of ⎕io is 1, the implementation can use a local ⎕io setting of 0.

(c) The fact that *b is currently slower than (*0 1)[b] represents a judgment by the implementers that *b is not a sufficiently common computation to warrant writing faster code for it. Other similar functions already embody the table look-up technique:

      cmpx '⍕b' '¯1↓,(2 2⍴''0 1 '')[b;]'
 ⍕b                   → 6.95E¯4 |  0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
 ¯1↓,(2 2⍴'0 1 ')[b;] → 6.92E¯4 | -1% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

Small Domains

Table look-up works best for booleans, but it also works for other small domains such as 1-byte integers:

      i1←?1e6⍴128    ⍝ 1-byte integers

      cmpx '*i1' '(*⍳128)[i1]'
 *i1         → 1.48E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
 (*⍳128)[i1] → 9.30E¯4 | -94% ⎕⎕

      cmpx '○i1' '(○⍳128)[i1]'
 ○i1         → 2.78E¯3 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
 (○⍳128)[i1] → 9.25E¯4 | -67% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

In general, a table for 1-byte integers needs to have values for ¯128+⍳256. Here ⍳128 is used to simulate that in C indexing can be done on 1-byte indices without extra computation on the indices. In APL, general 1-byte indices are used as table[i1+128]; in C, the expression is (table+offset)[i].

Domains considered “small” get bigger all the time as machines come equipped with ever larger memories.

Larger Domains

Table look-up is applicable when the argument is not a substantial subset of a large domain and there is a fast way to detect that it is not.

      i4s←1e7+?1e6⍴1e5    ⍝ small-range 4-byte integers

      G←{(⊂u⍳⍵)⌷⍺⍺ u←∪⍵}

      cmpx '⍟i4s' '⍟G i4s'
 ⍟i4s   → 1.44E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
 ⍟G i4s → 9.59E¯3 | -34% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

The definition of G implements that the argument is not used directly as indices (as in (⊂⍵)⌷⍺⍺ u) but needed to be looked up, the u⍳⍵ in (⊂u⍳⍵)⌷⍺⍺ u. Thus both index-of and indexing are key enablers for making table look-up competitive.

§4.3 of Notation as a Tool of Thought presents a different way to distribute the results of a calculation on the “nub” (unique items). But it takes more time and space and is limited to numeric scalar results.

      H←{(⍺⍺ u)+.×(u←∪⍵)∘.=⍵}

      i4a←1e7+?1e4⍴1e3    ⍝ small-range 4-byte integers

      cmpx '⍟i4a' '⍟G i4a' '⍟H i4a'
 ⍟i4a   → 1.56E¯4 |       0%
 ⍟G i4a → 6.25E¯5 |     -60%
 ⍟H i4a → 1.78E¯1 | +113500% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

The benchmark is done on the smaller i4a as a benchmark on i4s founders on the ≢∪⍵ by ≢⍵ boolean matrix (for i4s, 12.5 GB = ×/ 0.125 1e5 1e6) created by H.

Mathematical Background

New Math” teaches that a function is a set of ordered pairs whose first items are unique:

*⍵: {(⍵,*⍵)|⍵∊C}      The set of all (⍵,*⍵) where is a complex number
⍟⍵: {(⍵,⍟⍵)|⍵∊C~{0}}  The set of all (⍵,⍟⍵) where is a non-zero complex number
○⍵: {(⍵,○⍵)|⍵∊C}      The set of all (⍵,○⍵) where is a complex number
⍕⍵: {(⍵,⍕⍵)|0=⍴⍴⍵}    The set of all (⍵,⍕⍵) where is a scalar

When is restricted (for example) to the boolean domain, the functions, the sets of ordered pairs, are more simply represented by enumerating all the possibilities:

*⍵: {(0,1), (1,2.71828)}
⍟⍵: {(0,Err), (1,0)}
○⍵: {(0,0), (1,3.14159)}
⍕⍵: {(0,'0'), (1,'1')}

Realized and Potential Performance Improvements

realized (available no later than 15.0)

1-byte integer
⍕ ∘.f
⍕     a∘⍕

potential (non-exhaustive list)

1-byte integer
2-byte integer
small-range 4-byte integer
* ⍟ | - ○   ! ⍳ a∘f f.g
* ⍟ | - ○ × ! ⍳ a∘f f.g ∘.f
* ⍟ | - ○ ×   ⍳ a∘f
* ⍟ |     ×

a is a scalar; f and g are primitive scalar dyadic functions.

Supporting the Community – Update

You may have been wondering what has been happening with Jo Shaw (daughter of our Customer Account Manager, Karen) the England pool player who Dyalog have sponsored in the past, so we thought it was time for an update.

After getting a new job in December 2014, Jo moved from Hampshire to Wiltshire and joined the Wiltshire Ladies county pool team. After a successful first season with them, Jo again qualified to play for England as a reserve player in May 2016 but the exciting news that she was expecting a baby meant that she was not able to take up her place in the team – unfortunately the Nation’s Cup event was happening at the exact same time the baby was due. Jo also found that she was not able to practice or play pool during the later stages of her pregnancy as she was just not able to bend over the table(!) so this meant she had to take a back seat with the Wiltshire Ladies county team too.

Jo’s daughter Remy was born on 18 October 2016 and Remy has been a keen supporter of Wiltshire Ladies since she was born – she even has her own team shirt!

Jo and Remy have recently been to the EBPF National Finals to support Jo’s Wiltshire teammates as they became National Ladies Champions for the first time. Jo was disappointed that she wasn’t able to play a bigger part in the victory but being a mother is now her top priority and she was pleased that her team did so well.

After a lot of thought Jo has now made the very difficult decision to take some time out from both county and national pool to concentrate on being a mum. She plans to be back in a year or so once Remy is a bit older and Dyalog fully expect her to win back her place in the England team – we will be there to support her when she does!

Beauty and the Beast

beauty-and-the-beastFinally, the last accessory I ordered for my Raspberry Pi Zero (that’s the little red thing behind my keyboard) has arrived – an Acer 43″ ET430K monitor. The Zero won’t quite drive this monitor at its maximum resolution of 3840×2160 pixels, but as you can see, you get enough real estate to do real graphics with “ASCII Art” (well, Unicode Art, anyway)!

Some readers will recognise the image as the famous Mandelbrot Set, named after the mathematician Benoit Mandelbrot, who studied fractals. According to Wikipedia a fractal is a mathematical set that exhibits a repeating pattern displayed at every scale – they are interesting because they produce patterns that are very similar to things we can see around us – both in living organisms and landscapes – they seem to be part of the fundamental fabric of the universe.

Fractals are are also interesting because they produce images of staggering beauty, as you can see on the pages linked to above and one of our rotating banner pages, where a one-line form of the APL expression which produced the image is embedded:


The colourful images of the Mandelbrot set are produced by looking at a selection of points in the complex plane. For each point c, start with a value of 0 for z, repeat the computation z = c + z², and colour the point according to the number of iterations required before the function becomes “unbounded in value”.

In APL, the function for each iteration can be expressed as {c+⍵*2}, where c is the point and ⍵ (the right argument of the function) is z (initially 0, and subsequently the result of the previous iteration):

      f←{c+⍵*2} ⍝ define the function
      c←1J1     ⍝ c is 1+i (above and to the right of the set)
      (f 0)(f f 0)(f f f 0)(f f f f 0) (f f f f f 0)
1J1 1J3 ¯7J7 1J¯97 ¯9407J¯193

If you are not familiar with complex numbers, those results may look a bit odd. While complex addition just requires adding the real and the imaginary parts independently, the result of multiplication of two complex numbers (a + bi) and (c + di) is defined as (ac-db) + (bc+ad)i. Geometrically, complex multiplication is a combined stretching and rotation operation.

Typing all those f’s gets a bit tedious, fortunately APL has a power operator , a second-order function that can be used to repeatedly apply a function. We can also compute the magnitude of the result number using |:

8.853E7 7.837E15

Let’s take a look at what happens if we choose a point inside the Mandelbrot set:

      {|(f⍣⍵)0}¨ 1 2 3 4 5 6 7 8 9 10
0.1414 0.1562 0.1566 0.1552 0.1547 0.1546 0.1546 0.1546 0.1546 0.1546

Above, I used an anonymous function so I could pass the number of iterations in as a parameter, and use the each operator ¨ to generate all the results up to ten iterations. In this case, we can see that the magnitude of the result stabilises, which is why the point 0.1 + 0.1i is considered to be inside the Mandelbrot set.

Points which are just outside the set will require varying numbers of applications of f before the magnitude “explodes”, and if you colour points according to how many iterations are needed and pick interesting areas along the edge of the set, great beauty is revealed.


The above image is in the public domain and was created by Jonathan J. Dickau using ChaosPro 3.3 software, which was created by Martin Pfingstl.

Our next task is to create array of points in the complex plane. A helper function unitstep generates values between zero and 1 with a desired number of steps, so I can vary the resolution and size of the image. Using it, I can create two ranges, multiply one of them by i (0J1 in APL) and use an addition table (∘.+) to generate the array:

      ⎕io←0 ⍝ use index origin zero
      unitstep 6
0 0.1667 0.3333 0.5 0.6667 0.8333 1
      c←¯3 × 0.7J0.5 - (0J1×unitstep 6)∘.+unitstep 6
¯2.1J¯1.5 ¯1.6J¯1.5 ¯1.1J¯1.5 ¯0.6J¯1.5 ¯0.1J¯1.5 0.4J¯1.5 0.9J¯1.5
¯2.1J¯1   ¯1.6J¯1   ¯1.1J¯1   ¯0.6J¯1   ¯0.1J¯1   0.4J¯1   0.9J¯1  
¯2.1J¯0.5 ¯1.6J¯0.5 ¯1.1J¯0.5 ¯0.6J¯0.5 ¯0.1J¯0.5 0.4J¯0.5 0.9J¯0.5
¯2.1      ¯1.6      ¯1.1      ¯0.6      ¯0.1      0.4      0.9     
¯2.1J00.5 ¯1.6J00.5 ¯1.1J00.5 ¯0.6J00.5 ¯0.1J00.5 0.4J00.5 0.9J00.5
¯2.1J01   ¯1.6J01   ¯1.1J01   ¯0.6J01   ¯0.1J01   0.4J01   0.9J01  
¯2.1J01.5 ¯1.6J01.5 ¯1.1J01.5 ¯0.6J01.5 ¯0.1J01.5 0.4J01.5 0.9J01.5

The result is subtracted from 0.7J0.5 (0.7 + 0.5i) to get the origin of the complex plane slightly off centre in the middle, and multiplied the whole thing by 3 to get a set of values that brackets the Mandelbrot set.

Note that APL expressions are typically rank and shape invariant, so our function f can be applied to the entire array without changes. Since our goal is only to produce ASCII art, we don’t need to count iterations, we can just compare the magnitude of the result with 9 to decide whether the point has escaped:

      9<|f 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
      9<|(f⍣3) 0 ⍝ Apply f 3 times
1 1 1 0 0 0 1
1 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
1 0 0 0 0 0 0
1 1 1 0 0 0 1

We can see that points around the edge start escaping after 3 iterations. Using an anonymous function again, we can observe more and more points escape, until we recognise the (very low resolution) outline of the Mandelbrot set:

      ]box on
      {((⍕⍵)@(⊂0 0))' #'[9>|(f⍣⍵)0]}¨2↓⍳9
│2######│3  ### │4      │5      │6      │7      │8      │
│#######│ ######│   ### │    #  │    #  │    #  │    #  │
│#######│#######│ ##### │  #### │  #### │   ### │   ### │
│#######│#######│###### │ ##### │ ##### │ ##### │ ####  │
│#######│#######│ ##### │  #### │  #### │   ### │   ### │
│#######│ ######│   ### │    #  │    #  │    #  │    #  │
│#######│   ### │       │       │       │       │       │

The last example allowed me to sneak in a preview of the new “at” operator coming in Dyalog v16.0. It is a functional merge operator that I am using to insert the formatted right argument (the number of iterations) into position (0 0) of each matrix.

If I use our remote IDE (RIDE) on my Windows machine and connect to the Pi, I can have an APL session on the Pi with 3840×2160 resolution. In this example, I experimented with grey scale “colouring” the result by rounding down and capping the result at 10, and using the character ⍟ (darkest) for 0, ○ for 1-5, × for 6-9, and blank for points that “escaped”, by indexing into an array of ten characters:

      '⍟○○○○○×××× '[10⌊⌊|(f⍣9)c]

Who needs OpenGL?! (click on the image to enlarge)


Charting Reaction Times on the Raspberry Pi

Earlier this week I collected some reaction timer data on my Pi using the BBC micro:bit as an input device. I only produced an “ASCII art” chart at the time:

251 305 294 415 338 298 294 251 378
      ReactionTimer.AsciiChart times
400|    * 
375|         *
325|     * 
300|  * 
275|   *  ** 
250+ *      * 

Retro is back in style – but of course I should point out that we can produce “proper” graphics using SharpPlot, a cross-platform graphics package that is included with Dyalog APL on all platforms. I’ve enhanced the ReactionTimer namespace with a function called SPHistogram. If you are using RIDE as the front end to APL on the Pi, this will render output from SharpPlot in a window:

    ReactionTimer.SPHistogram times

Reaction Time Histogram

The original ASCII chart simply plotted the observations in the order that they occurred. Above, the shaded areas show how many observations there were in each bucket (2 between 250ms and 275ms, 3 between 275ms and 300ms, and so on). At the same time, the individual observations can be seen along the X axis as vertical red lines. It is a little unfortunate that, presumably due to some artifact of the timing process, the values 251 and 294 occur twice – and these lines are drawn on top of each other.

The example highlights one of the things that makes SharpPlot a bit special: We have overlaid a histogram showing the frequency of reactions in each bucket, and used a “scatterplot” to where the Y value is always 0, to mark the individual observations along the X axis.

     ∇ SPHistogram times;heading;renderHtml;sp;svg;z 
[1]   heading←'Reaction Times' 
[2]   renderHtml←3500⌶ ⍝ Render HTML in Window 
[4]   :If 0=⎕NC'#.SharpPlot' ⋄ #.⎕CY'sharpplot.dws' ⋄ :EndIf
[6]   sp←⎕NEW #.SharpPlot(432 250) 
[7]   sp.Heading←heading 
[9]   ⍝ Draw histogram 
[10]  sp.ClassInterval←25 
[11]  sp.SetXTickMarks 25 
[12]  sp.HistogramStyle←#.HistogramStyles.SurfaceShading 
[13]  sp.SetFillStyles #.FillStyle.Opacity30 
[14]  sp.DrawHistogram⊂times 
[16]  ⍝ Add observations using ScatterPlot 
[17]  sp.SetMarkers #.Marker.UpTick 
[18]  sp.SetPenWidths 1 
[19]  sp.SetMarkerScales 3 
[20]  sp.DrawScatterPlot(times×0)(times) 
[22]  ⍝ Render SVG and display in window 
[23]  svg←sp.RenderSvg #.SvgMode.FixedAspect 
[24]  z←heading renderHtml svg 

Hopefully the code is more or less self-explanatory, but if you’d like to learn more about SharpPlot there is excellent documentation at The documentation is actually written for C# users, but there is an illustration of how to translate the documentation to APL (and VB) at

micro:bit Reaction Timer in APL on the Pi and BBC micro:bit

BBC micro:bit displaying a happy face

BBC micro:bit displaying a happy face

I have a bit of a cold today, so I decided that instead of hopping in an icy car and driving to the office in order to spend the day drinking coffee and answering e-mail, I should stay at home, turn up the radiators, make lots of tea (with honey!) and have some fun writing code on my Raspberry Pi! Can there be a better way to ensure a speedy recovery?

By the way, if you are already a user of Dyalog APL but you haven’t got a Pi yet, you should read The APLer’s Quick-start Guide to the Raspberry Pi, which Romilly Cocking completed a few days ago. This explains what your options are for buying hardware, and how to get going with a free copy of Dyalog APL. If you don’t read it now, your cold may be over before all the bits are delivered, so hurry up!

I wasn’t feeling THAT energetic this morning, so I decided to ease back into things by trying to replicate Romilly’s reaction timer, which is written in MicroPython. To begin with, I extended the microbit class that I developed for the Morse code display function with a few new methods to cover the API calls that allow me to check the state of the two buttons on each side of the micro:bit display (see the image above).

First, I added a method called is_true, which takes a MicroPython expression as an argument, evaluates it using the PyREPL function, and returns 0 or 1 depending on whether the expression returns False or True (and fails if it returns something else):

 ∇ r←is_true expr;t;z
 :Access Public
 r←1⊃t←'True' 'False'∊⊂z←PyREPL expr
 :If ~∨/t ⋄ ⎕←'True/False expected, got: ',z ⋄ ∘∘∘ ⋄ :EndIf

This is a useful building block, which allows the simple construction of functions like was_pressed, which takes ‘a’ or ‘b’ as an argument and tells you whether the button in question has been pressed since you last asked:

∇ r←was_pressed button
 :Access Public
 r←is_true 'button_',button,'.was_pressed()' 

Once this is done, writing the Play function in the ReactionTimer namespace is easy, you can find the rest of the code on GitHub. The APL version is different from the MicroPython version in that – once the user presses button B and stops the game – the reaction times (in milliseconds) are returned as an integer vector. So now we can have some fun with data!

In the spirit of the micro:bit, I thought I’d produce a low tech character based chart, trying to remember the skills that were start-of-the-art when I started using APL. The AsciiChart function in the ReactionTimer namespace takes a vector of timings as the right argument, and produces a chart:

251 305 294 415 338 298 294 251 378
      ReactionTimer.AsciiChart times
400|    * 
375|         *
325|     * 
300|  * 
275|   *  ** 
250+ *      * 

The code (which is also on GitHub) is listed below. Because this is a pure function, I used the modern dfns style of function definition rather than the old style procedural form that I’ve used for the object oriented code. The function works by allocating each value to a bucket of the size defined by the variable scale. The fun part is the use of the outer product with equals (∘.=) between the list of buckets on the left (rb) and the scaled values on the right (⌊⍵÷scale) – and then using this Boolean array to index into a two-element character vector to produce the chart. The rest is scaling calculations and finally decorating with “tik marks”:

     scale←25 ⍝ size of each row
     tiks←4 ⍝ tik spacing
     (max min)←(⌈/ , ⌊/) ⍵ ⍝ maximum and minimum
     base←⌊min÷scale ⍝ round down to nearest scale unit
     rb←base+0,⍳⌈(max-min)÷scale ⍝ row base values
     r←' *'[1+rb∘.=⌊⍵÷scale] ⍝ our chart
     r←((≢rb)⍴'+',(tiks-1)⍴'|'),' ',r ⍝ add tiks
     r←(⍕⍪rb×scale),r ⍝ add base values
     ⊖r ⍝ low values last (humans!)

Although I might be feeling better tomorrow, I expect to be back soon. I have a couple of plastic bags full of resistors, LEDs, potentiometers and cameras (yes, plural) that I have not opened yet!

Morse Code – Revisited using the BBC micro:bit

As mentioned last week, I have found a new way to provide a front end processor for my Pi, the BBC micro:bit. This week, I have started putting the new system through its paces: The microbit class in our GitHub repository has been beefed up to make it more resilient in the case where serial I/O is not initialised, or the MicroPython REPL is not running, and I have added a trivial Python program which can be used to intialise the REPL using the mu editor.

To test it all, I have revisited the Morse Code example, originally written to use GPIO support provided by Quick2Wire, to see if I could get it up and running with my new infrastructure. The micro:bit has a 5×5 LED display which makes the result a bit more impressive to look at:


The new code is significantly simpler than the original, for a couple of reasons:

  1. Making calls to the micro:bit API via MicroPython is easier than using GPIO via a C library.
  2. Version 15.0 of Dyalog APL contains new system functions for working with text files, so we no longer need the “Files” library, we can just use the built-in ⎕NGET function to read the contents of a file.