# Calculation v Look-Up

(`⎕io←0` and timings are done in Dyalog version 15.0.)

## Table Look-Up

Some functions can be computed faster by table look-up than a more traditional and more conventional calculation. For example:

``````      b←?1e6⍴2

cmpx '*b' '(*0 1)[b]'
*b        → 1.32E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
(*0 1)[b] → 1.23E¯3 | -91% ⎕⎕⎕``````

(a) The advantage of table look-up depends on the time to calculate the function directly, versus the time to do indexing, `⍺[⍵]` or `⍵⌷⍺`, plus a small amount of time to calculate the function on a (much) smaller representative set.

Indexing is particularly efficient when the indices are boolean:

``````      bb←b,1
bi←b,2
cmpx '2 3[bb]' '2 3 3[bi]'
2 3[bb]   → 1.12E¯4 |    0% ⎕⎕⎕⎕⎕
2 3 3[bi] → 7.39E¯4 | +558% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕``````

`bi` is the same as `bb` except that the last element of `2` forces it to be 1-byte integers (for `bi`) instead of 1-bit booleans (for `bb`).

(b) Although the faster expression works best with 0-origin, the timing for 1-origin is similar.

``````      cmpx '*b' '{⎕io←0 ⋄ (*0 1)[⍵]}b'
*b                   → 1.32E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
{⎕io←0 ⋄ (*0 1)[⍵]}b → 1.22E¯3 | -91% ⎕⎕⎕``````

That is, if the current value of `⎕io` is 1, the implementation can use a local `⎕io` setting of 0.

(c) The fact that `*b` is currently slower than `(*0 1)[b]` represents a judgment by the implementers that `*b` is not a sufficiently common computation to warrant writing faster code for it. Other similar functions already embody the table look-up technique:

``````      cmpx '⍕b' '¯1↓,(2 2⍴''0 1 '')[b;]'
⍕b                   → 6.95E¯4 |  0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
¯1↓,(2 2⍴'0 1 ')[b;] → 6.92E¯4 | -1% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕``````

## Small Domains

Table look-up works best for booleans, but it also works for other small domains such as 1-byte integers:

``````      i1←?1e6⍴128    ⍝ 1-byte integers

cmpx '*i1' '(*⍳128)[i1]'
*i1         → 1.48E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
(*⍳128)[i1] → 9.30E¯4 | -94% ⎕⎕

cmpx '○i1' '(○⍳128)[i1]'
○i1         → 2.78E¯3 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
(○⍳128)[i1] → 9.25E¯4 | -67% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕``````

In general, a table for 1-byte integers needs to have values for `¯128+⍳256`. Here `⍳128` is used to simulate that in C indexing can be done on 1-byte indices without extra computation on the indices. In APL, general 1-byte indices are used as `table[i1+128]`; in C, the expression is `(table+offset)[i]`.

Domains considered “small” get bigger all the time as machines come equipped with ever larger memories.

## Larger Domains

Table look-up is applicable when the argument is not a substantial subset of a large domain and there is a fast way to detect that it is not.

``````      i4s←1e7+?1e6⍴1e5    ⍝ small-range 4-byte integers

G←{(⊂u⍳⍵)⌷⍺⍺ u←∪⍵}

cmpx '⍟i4s' '⍟G i4s'
⍟i4s   → 1.44E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
⍟G i4s → 9.59E¯3 | -34% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕``````

The definition of `G` implements that the argument is not used directly as indices (as in `(⊂⍵)⌷⍺⍺ u`) but needed to be looked up, the `u⍳⍵` in `(⊂u⍳⍵)⌷⍺⍺ u`. Thus both index-of and indexing are key enablers for making table look-up competitive.

§4.3 of Notation as a Tool of Thought presents a different way to distribute the results of a calculation on the “nub” (unique items). But it takes more time and space and is limited to numeric scalar results.

``````      H←{(⍺⍺ u)+.×(u←∪⍵)∘.=⍵}

i4a←1e7+?1e4⍴1e3    ⍝ small-range 4-byte integers

cmpx '⍟i4a' '⍟G i4a' '⍟H i4a'
⍟i4a   → 1.56E¯4 |       0%
⍟G i4a → 6.25E¯5 |     -60%
⍟H i4a → 1.78E¯1 | +113500% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕``````

The benchmark is done on the smaller `i4a` as a benchmark on `i4s` founders on the `≢∪⍵` by `≢⍵` boolean matrix (for `i4s`, 12.5 GB = `×/ 0.125 1e5 1e6`) created by `H`.

## Mathematical Background

New Math” teaches that a function is a set of ordered pairs whose first items are unique:

`*⍵: {(⍵,*⍵)|⍵∊C}     ` The set of all `(⍵,*⍵)` where `⍵` is a complex number
`⍟⍵: {(⍵,⍟⍵)|⍵∊C~{0}} ` The set of all `(⍵,⍟⍵)` where `⍵` is a non-zero complex number
`○⍵: {(⍵,○⍵)|⍵∊C}     ` The set of all `(⍵,○⍵)` where `⍵` is a complex number
`⍕⍵: {(⍵,⍕⍵)|0=⍴⍴⍵}   ` The set of all `(⍵,⍕⍵)` where `⍵` is a scalar

When `⍵` is restricted (for example) to the boolean domain, the functions, the sets of ordered pairs, are more simply represented by enumerating all the possibilities:

```*⍵: {(0,1), (1,2.71828)} ⍟⍵: {(0,Err), (1,0)} ○⍵: {(0,0), (1,3.14159)} ⍕⍵: {(0,'0'), (1,'1')}```

## Realized and Potential Performance Improvements

realized (available no later than 15.0)

 boolean 1-byte integer `⍕ ∘.f` `⍕     a∘⍕`

potential (non-exhaustive list)

 boolean 1-byte integer 2-byte integer small-range 4-byte integer `* ⍟ | - ○   ! ⍳ a∘f f.g` `* ⍟ | - ○ × ! ⍳ a∘f f.g ∘.f` `* ⍟ | - ○ ×   ⍳ a∘f` `* ⍟ |     ×`

`a` is a scalar; `f` and `g` are primitive scalar dyadic functions.

# APL Exercises

These exercises are designed to introduce APL to a high school senior adept in mathematics. The entire set of APL Exercises has the following sections:

 Introduction 0. Beginnings 1. Utilities 2. Recursion 3. Proofs 4. Rank Operator 5. Index-Of 6. Key Operator 7. Indexing 8. Grade and Sort 9. Power Operator 10. Arithmetic 11. Combinatorial Objects The Fine Print

Sections 2 and 10 are the most interesting ones and are as follows. The full set is too lengthy (and too boring?) to be included here. Some students may be able to make up their own exercises from the list of topics. As well, it may be useful to know what the “experts” think are important aspects of APL.

## 2. Recursion

`⎕io←0` throughout.

2.0 Factorial

Write a function to compute the factorial function on integer `⍵`, `⍵≥0`.

``````      fac¨ 5 6 7 8
120 720 5040 40320

n←1+?20
(fac n) = n×fac n-1
1
fac 0
1``````

2.1 Fibonacci Numbers

Write a function to compute the `⍵`-th Fibonacci number, `⍵≥0`.

``````      fib¨ 5 6 7 8
5 8 13 21

n←2+?20
(fib n) = (fib n-2) + (fib n-1)
1
fib 0
0
fib 1
1``````

If the function written above is multiply recursive, write a version which is singly recursive (only one occurrence of `∇`). Use `cmpx` to compare the performance of the two functions on the argument `35`.

Let `⍺` and `⍵` be natural numbers (non-negative integers). Write a function `padd` to compute `⍺+⍵` without using `+` itself (or `-` or `×` or `÷` …). The only functions allowed are comparison to `0` and the functions `pre←{⍵-1}` and `suc←{⍵+1}`.

2.3 Peano Multiplication

Let `⍺` and `⍵` be natural numbers (non-negative integers). Write a function `ptimes` to compute `⍺×⍵` without using `×` itself (or `+` or `-` or `÷` …). The only functions allowed are comparison to 0 and the functions `pre←{⍵-1}` and `suc←{⍵+1}` and `padd`.

2.4 Binomial Coefficients

Write a function to produce the binomal coefficients of order `⍵`, `⍵≥0`.

``````      bc 0
1
bc 1
1 1
bc 2
1 2 1
bc 3
1 3 3 1
bc 4
1 4 6 4 1``````

2.5 Cantor Set

Write a function to compute the Cantor set of order `⍵`, `⍵≥0`.

``````      Cantor 0
1
Cantor 1
1 0 1
Cantor 2
1 0 1 0 0 0 1 0 1
Cantor 3
1 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 1 0 1``````

The classical version of the Cantor set starts with the interval `[0,1]` and at each stage removes the middle third from each remaining subinterval:

``````[0,1] →
[0,1/3] ∪ [2/3,1] →
[0,1/9] ∪ [2/9,1/3] ∪ [2/3,7/9] ∪ [8/9,1] → ….

{(+⌿÷≢)Cantor ⍵}¨ 3 5⍴⍳15
1         0.666667  0.444444   0.296296   0.197531
0.131687  0.0877915 0.0585277  0.0390184  0.0260123
0.0173415 0.011561  0.00770735 0.00513823 0.00342549

(2÷3)*3 5⍴⍳15
1         0.666667  0.444444   0.296296   0.197531
0.131687  0.0877915 0.0585277  0.0390184  0.0260123
0.0173415 0.011561  0.00770735 0.00513823 0.00342549``````

2.6 Sierpinski Carpet

Write a function to compute the Sierpinski Carpet of order `⍵`, `⍵≥0`.

``````      SC 0
1
SC 1
1 1 1
1 0 1
1 1 1
SC 2
1 1 1 1 1 1 1 1 1
1 0 1 1 0 1 1 0 1
1 1 1 1 1 1 1 1 1
1 1 1 0 0 0 1 1 1
1 0 1 0 0 0 1 0 1
1 1 1 0 0 0 1 1 1
1 1 1 1 1 1 1 1 1
1 0 1 1 0 1 1 0 1
1 1 1 1 1 1 1 1 1

' ⌹'[SC 3]
⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹
⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹
⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹
⌹⌹⌹   ⌹⌹⌹⌹⌹⌹   ⌹⌹⌹⌹⌹⌹   ⌹⌹⌹
⌹ ⌹   ⌹ ⌹⌹ ⌹   ⌹ ⌹⌹ ⌹   ⌹ ⌹
⌹⌹⌹   ⌹⌹⌹⌹⌹⌹   ⌹⌹⌹⌹⌹⌹   ⌹⌹⌹
⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹
⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹
⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹
⌹⌹⌹⌹⌹⌹⌹⌹⌹         ⌹⌹⌹⌹⌹⌹⌹⌹⌹
⌹ ⌹⌹ ⌹⌹ ⌹         ⌹ ⌹⌹ ⌹⌹ ⌹
⌹⌹⌹⌹⌹⌹⌹⌹⌹         ⌹⌹⌹⌹⌹⌹⌹⌹⌹
⌹⌹⌹   ⌹⌹⌹         ⌹⌹⌹   ⌹⌹⌹
⌹ ⌹   ⌹ ⌹         ⌹ ⌹   ⌹ ⌹
⌹⌹⌹   ⌹⌹⌹         ⌹⌹⌹   ⌹⌹⌹
⌹⌹⌹⌹⌹⌹⌹⌹⌹         ⌹⌹⌹⌹⌹⌹⌹⌹⌹
⌹ ⌹⌹ ⌹⌹ ⌹         ⌹ ⌹⌹ ⌹⌹ ⌹
⌹⌹⌹⌹⌹⌹⌹⌹⌹         ⌹⌹⌹⌹⌹⌹⌹⌹⌹
⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹
⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹
⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹
⌹⌹⌹   ⌹⌹⌹⌹⌹⌹   ⌹⌹⌹⌹⌹⌹   ⌹⌹⌹
⌹ ⌹   ⌹ ⌹⌹ ⌹   ⌹ ⌹⌹ ⌹   ⌹ ⌹
⌹⌹⌹   ⌹⌹⌹⌹⌹⌹   ⌹⌹⌹⌹⌹⌹   ⌹⌹⌹
⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹
⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹
⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹``````

3-D analogs of the Cantor set and the Sierpinski carpet are the Sierpinski sponge or the Menger sponge.

2.7 Extended H

Write a function for the extend H algorithm for `⍵`, `⍵≥0`, which embeds the complete binary tree of depth `⍵` on the plane. In `×H ⍵` there are `2*⍵` leaves, instances of the letter `o` with exactly one neighbor (or no neighbors for `0=⍵`). The root is at the center of the matrix.

``````      xH¨ ⍳5
┌─┬─────┬─────┬─────────────┬─────────────┐
│o│o-o-o│o   o│o-o-o   o-o-o│o   o   o   o│
│ │     │|   |│  |       |  │|   |   |   |│
│ │     │o-o-o│  o---o---o  │o-o-o   o-o-o│
│ │     │|   |│  |       |  │| | |   | | |│
│ │     │o   o│o-o-o   o-o-o│o | o   o | o│
│ │     │     │             │  |       |  │
│ │     │     │             │  o---o---o  │
│ │     │     │             │  |       |  │
│ │     │     │             │o | o   o | o│
│ │     │     │             │| | |   | | |│
│ │     │     │             │o-o-o   o-o-o│
│ │     │     │             │|   |   |   |│
│ │     │     │             │o   o   o   o│
└─┴─────┴─────┴─────────────┴─────────────┘``````

Write a function that has the same result as `xH` but checks the result using the `assert` utility. For example:

``````assert←{⍺←'assertion failure' ⋄ 0∊⍵:⍺ ⎕SIGNAL 8 ⋄ shy←0}

xH1←{
h←xH ⍵
assert 2=⍴⍴h:
assert h∊'○ -|':
assert (¯1+2*1+⍵)=+/'o'=,h:
...
h
}``````

2.8 Tower of Hanoi

(By André Karwath aka Aka (Own work) [CC BY-SA 2.5 (http://creativecommons.org/licenses/by-sa/2.5)], via Wikimedia Commons)

The Tower of Hanoi problem is to move a set of `⍵` different-sized disks from one peg to another, moving one disk at a time, using an intermediate peg if necessary. At all times no larger disk may sit on top of a smaller disk.

Write a function Hanoi `⍵` to solve the problem of moving `⍵` disks from peg 0 to peg 2. Since it’s always the disk which is at the top of a peg which is moved, the solution can be stated as a 2-column matrix with column 0 indicating the source peg and column 1 the destination peg.

``````      Hanoi¨ ⍳5
┌───┬───┬───┬───┬───┐
│   │0 2│0 1│0 2│0 1│
│   │   │0 2│0 1│0 2│
│   │   │1 2│2 1│1 2│
│   │   │   │0 2│0 1│
│   │   │   │1 0│2 0│
│   │   │   │1 2│2 1│
│   │   │   │0 2│0 1│
│   │   │   │   │0 2│
│   │   │   │   │1 2│
│   │   │   │   │1 0│
│   │   │   │   │2 0│
│   │   │   │   │1 2│
│   │   │   │   │0 1│
│   │   │   │   │0 2│
│   │   │   │   │1 2│
└───┴───┴───┴───┴───┘``````

Prove that `Hanoi ⍵` has `¯1+2*⍵` rows.

## 10. Arithmetic

`⎕io←0` throughout.

10.0 Primality Testing

Write a function `prime ⍵` which is 1 or 0 depending on whether non-negative integer `⍵` is a prime number. For example:

``````      prime 1
0
prime 2
1
prime 9
0
prime¨ 10 10⍴⍳100
0 0 1 1 0 1 0 1 0 0
0 1 0 1 0 0 0 1 0 1
0 0 0 1 0 0 0 0 0 1
0 1 0 0 0 0 0 1 0 0
0 1 0 1 0 0 0 1 0 0
0 0 0 1 0 0 0 0 0 1
0 1 0 0 0 0 0 1 0 0
0 1 0 1 0 0 0 0 0 1
0 0 0 1 0 0 0 0 0 1
0 0 0 0 0 0 0 1 0 0``````

10.1 Primes Less Than n

Write a function `primes ⍵` using the sieve method which produces all the primes less than `⍵`. Use `cmpx` to compare the speed of `primes n` and `(prime¨⍳n)/⍳n`. For example:

``````      primes 7
2 3 5
primes 50
2 3 5 7 11 13 17 19 23 29 31 37 41 43 47

(prime¨⍳50)/⍳50
2 3 5 7 11 13 17 19 23 29 31 37 41 43 47``````

10.2 Factoring

Write a function factor `⍵` which produces the prime factorization of `⍵`, such that `⍵ = ×/factor ⍵` and `∧/ prime¨ factor ⍵` is 1. For example:

``````      factor 144
2 2 2 2 3 3
×/ factor 144
144
∧/ prime¨ factor 144
1

factor 1

factor 2
2``````

10.3 pco

Read about the `pco` function in http://dfns.dyalog.com/n_pco.htm and experiment with it.

``````      )copy dfns pco

pco ⍳7
2 3 5 7 11 13 17

1 pco ⍳10
0 0 1 1 0 1 0 1 0 0``````

G.H. Hardy states in A Mathematician’s Apology (chapter 14, page 23) that the number of primes less than `1e9` is `50847478`. Use `pco` to check whether this is correct.

10.4 GCD

Write a function `⍺ gcd ⍵` to compute the greatest common divisor of non-negative integers `⍺` and `⍵` using the Euclidean algorithm.

Write a function `⍺ egcd ⍵` to compute the GCD of `⍺` and `⍵` as coefficients of `⍺` and `⍵`, such that `(⍺ gcd ⍵) = (⍺,⍵)+.×⍺ egcd ⍵`.

``````      112 gcd 144
16
112 egcd 144
4 ¯3
112 144 +.× 112 egcd 144
16``````

10.5 Ring Extension

`Z[√k]` is the ring of integers `Z` extended with `√k` where `k` is not a perfect square. The elements of `Z[√k]` are ordered pairs `(a,b)` which can be interpreted as `a+b×√k` (`a+b×k*0.5`).

Write a d-operator `⍺(⍺⍺ rplus)⍵` which does addition in `Z[√⍺⍺]`.

``````      3 4 (5 rplus) ¯2 7
1 11``````

Write a d-operator `⍺(⍺⍺ rtimes)⍵` which does multiplication in `Z[√⍺⍺]`.

``````      3 4 (5 rtimes) ¯2 7
134 13``````

Both of the above can be done using integer operations only.

10.6 Repeated Squaring I

Write a function `⍺ pow ⍵` that computes `⍺*⍵` using repeated squaring. `⍺` is a number and `⍵` a non-negative integer. Hint: the binary representation of an integer `n` obtains as `2⊥⍣¯1⊢n`.

10.7 Repeated Squaring II

Write a d-operator `⍺(⍺⍺ rpow)⍵` that computes `⍺` raised to the power `⍵` using repeated squaring. `⍺` is in `Z[√⍺⍺]` and `⍵` is a non-negative integer.

``````      ⊃ (5 rtimes)/ 13 ⍴ ⊂ 1 1
2134016 954368

1 1 (5 rpow) 13
2134016 954368
1 ¯1 (5 rpow) 13
2134016 ¯954368``````

The last two expressions are key steps in an `O(⍟n)` computation of the `n`-th Fibonacci number using integer operations, using Binet’s formula:

# 50847534

`⎕io←0` throughout.

I was re-reading A Mathematician’s Apology before recommending it to Suki Tekverk, our summer intern, and came across a statement that the number of primes less than `1e9` is `50847478` (§14, page 23). The function `pco` from the dfns workspace does computations on primes; `¯1 pco n` is the number of primes less than `n`:

``````      )copy dfns pco

¯1 pco 1e9
50847534``````

The two numbers `50847478` and `50847534` cannot both be correct. A search of `50847478` on the internet reveals that it is incorrect and that `50847534` is correct. `50847478` is erroneously cited in various textbooks and even has a name, Bertelsen’s number, memorably described by MathWorld as “an erroneous name erroneously given the erroneous value of `π(1e9) = 50847478`.”

Although several internet sources give `50847534` as the number of primes less than `1e9`, they don’t provide a proof. Relying on them would be doing the same thing — arguing from authority — that led other authors astray, even if it is the authority of the mighty internet. Besides, I’d already given Suki, an aspiring mathematician, the standard spiel about the importance of proof.

How do you prove the `50847534` number? One way is to prove correct a program that produces it. Proving `pco` correct seems daunting — it has 103 lines and two large tables. Therefore, I wrote a function from scratch to compute the number of primes less than `⍵`, a function written in a way that facilitates proof.

``````sieve←{
b←⍵⍴{∧⌿↑(×/⍵)⍴¨~⍵↑¨1}2 3 5
b[⍳6⌊⍵]←(6⌊⍵)⍴0 0 1 1 0 1
49≥⍵:b
p←3↓{⍵/⍳⍴⍵}∇⌈⍵*0.5
m←1+⌊(⍵-1+p×p)÷2×p
b ⊣ p {b[⍺×⍺+2×⍳⍵]←0}¨ m
}

+/ sieve 1e9
50847534``````

`sieve ⍵` produces a boolean vector `b` with length `⍵` such that `b/⍳⍴b` are all the primes less than `⍵`. It implements the sieve of Eratosthenes: mark as not-prime multiples of each prime less than `⍵*0.5`; this latter list of primes obtains by applying the function recursively to `⌈⍵*0.5`. Some obvious optimizations are implemented:

• Multiples of `2 3 5` are marked by initializing `b` with `⍵⍴{∧⌿↑(×/⍵)⍴¨~⍵↑¨1}2 3 5` rather than with `⍵⍴1`.
• Subsequently, only odd multiples of primes `> 5` are marked.
• Multiples of a prime to be marked start at its square.

Further examples:

``````      +/∘sieve¨ ⍳10
0 0 0 1 2 2 3 3 4 4

+/∘sieve¨ 10*⍳10
0 4 25 168 1229 9592 78498 664579 5761455 50847534``````

There are other functions which are much easier to prove correct, for example, `sieve1←{2=+⌿0=(⍳3⌈⍵)∘.|⍳⍵}`. However, `sieve1` requires `O(⍵*2)` space and `sieve1 1e9` cannot produce a result with current technology. (Hmm, a provably correct program that cannot produce the desired result …)

We can test that `sieve` is consistent with `pco` and that `pco` is self-consistent. `pco` is a model of the p: primitive function in J. Its cases are:
`   pco n   `the `n`-th prime
`¯1 pco n   `the number of primes less than `n`
` 1 pco n   ``1` iff `n` is prime
` 2 pco n   `the prime factors and exponents of `n`
` 3 pco n   `the prime factorization of `n`
`¯4 pco n   `the next prime `< n`
` 4 pco n   `the next prime `> n`
`10 pco r,s `a boolean vector `b` such that `r+b/⍳⍴b` are the primes in the half-open interval `[r,s)`.

``````      ¯1 pco 10*⍳10      ⍝ the number of primes < 1e0 1e1 ... 1e9
0 4 25 168 1229 9592 78498 664579 5761455 50847534

+/ 10 pco 0 1e9    ⍝ sum of the sieve between 0 and 1e9
50847534
⍝ sum of sums of 10 sieves
⍝ each of size 1e8 from 0 to 1e9
+/ t← {+/10 pco ⍵+0 1e8}¨ 1e8×⍳10
50847534
⍝ sum of sums of 1000 sieves
⍝ each of size 1e6 from 0 to 1e9
+/ s← {+/10 pco ⍵+0 1e6}¨ 1e6×⍳1000
50847534

t ≡ +/ 10 100 ⍴ s
1
⍝ sum of sums of sieves with 1000 randomly
⍝ chosen end-points, from 0 to 1e9
+/ 2 {+/10 pco ⍺,⍵}/ 0,({⍵[⍋⍵]}1000?1e9),1e9
50847534

¯1 pco 1e9         ⍝ the number of primes < 1e9
50847534
pco 50847534       ⍝ the 50847534-th prime
1000000007
¯4 pco 1000000007  ⍝ the next prime < 1000000007
999999937
4 pco 999999937    ⍝ the next prime > 999999937
1000000007
4 pco 1e9          ⍝ the next prime > 1e9
1000000007

⍝ are 999999937 1000000007 primes?
1 pco 999999937 1000000007
1 1
⍝ which of 999999930 ... 1000000009
⍝ are prime?
1 pco 999999930+4 20⍴⍳80
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1

+/∘sieve¨ 999999937 1000000007 ∘.+ ¯1 0 1
50847533 50847533 50847534
50847534 50847534 50847535``````

# Constructions

Suki Tekverk, a summer intern, spent the last two weeks here studying APL. She will be a high school senior next September and is adept in math, so in addition to APL we also considered some math problems, proofs, and proof techniques, including the following:

Given line segments `x` and `y`, construct (using compass and straight-edge) line segments for the following values:
`   x+y`
`   x-y`
`   x×y`
`   x÷y`

The first two are immediate. Constructing the last two are straightforward if you are also given `1` (or some other reference length from which to construct `1`). Can you construct `x×y` and `x÷y` without `1`?

Constructing `x÷y` is impossible if you are not given `1` : From `x` and `y` alone you can not determine how they compare to `1`. If you can construct `x÷y`, then you can construct `x÷x` and therefore relate `x` and `y` to `1`, contradicting the previous statement.

Can you construct `x×y` without `1`? I got stuck (and lazy) and posed the question to the J Chat Forum, and received a solution from Raul Miller in less than half an hour. Miller’s proof recast in terms similar to that for `x÷y` is as follows:

From `x` and `y` alone you can not determine how they compare to `1`. If you can construct `x×y`, then you can construct `x×x`, whence:
`   `if `x<x×x` then `x>1`
`   `if `x=x×x` then `x=1`
`   `if `x>x×x` then `x<1`

I last thought about this problem in my first year of college many decades ago. At the time Norman M. (a classmate) argued that there must be a `1` and then did the construction for `x÷y` using `1`. I recall he said “there must be a 1” in the sense of “`1` has to exist if `x` and `y` exist” rather than that “you have to use a `1` in the construction” or “you can not construct `x÷y` without using `1`”. I don’t remember what we did with `x×y`; before Miller’s message I had some doubt that perhaps we were able to construct `x×y` without using `1` all those years ago. (Norman went on to get a Ph.D. at MIT and other great things.)

# A Memo Operator

Abstract

Some functions are simply stated and easily understood as multiply-recursive functions. For example, the Fibonacci numbers are `{1≥⍵:⍵ ⋄ (∇ ⍵-2)+∇ ⍵-1}`. A drawback of multiple recursion is abysmal performance even for moderate-sized arguments, consequently motivating resorts to circumlocutions. The memo operator addresses the performance problem and restores multiple recursion to the toolkit of thought. We present a memo operator modelled as a 6-line d-operator.

`⎕io←0` throughout.

The Man Who Knew Infinity

Jay sent out an e-mail asking who would be interested to go see the film The Man Who Knew Infinity. Fiona’s response: “Ah, Mr. 1729; I’m in.” John was game, and so was I.

The Number of Partitions
Nick found an essay by Stephen Wolfram on Ramanujan written on occasion of the film, wherein Wolfram said:

A notable paper [Ramanujan] wrote with Hardy concerns the partition function (`PartitionsP` in the Wolfram Language) that counts the number of ways an integer can be written as a sum of positive integers. …

In Ramanujan’s day, computing the exact value of `PartitionsP[200]` was a big deal — and the climax of his paper. But now, thanks to Ramanujan’s method, it’s instantaneous:

``````In[1]:= PartitionsP[200]
Out[1]= 3 972 999 029 388
``````

A partition of a non-negative integer `n` is a vector `v` of positive integers such that `n = +/v`, where the order in `v` is not significant. For example, the partitions of the first seven non-negative integers are:

``````┌┐
││
└┘
┌─┐
│1│
└─┘
┌─┬───┐
│2│1 1│
└─┴───┘
┌─┬───┬─────┐
│3│2 1│1 1 1│
└─┴───┴─────┘
┌─┬───┬───┬─────┬───────┐
│4│3 1│2 2│2 1 1│1 1 1 1│
└─┴───┴───┴─────┴───────┘
┌─┬───┬───┬─────┬─────┬───────┬─────────┐
│5│4 1│3 2│3 1 1│2 2 1│2 1 1 1│1 1 1 1 1│
└─┴───┴───┴─────┴─────┴───────┴─────────┘
┌─┬───┬───┬─────┬───┬─────┬───────┬─────┬───────┬─────────┬───────────┐
│6│5 1│4 2│4 1 1│3 3│3 2 1│3 1 1 1│2 2 2│2 2 1 1│2 1 1 1 1│1 1 1 1 1 1│
└─┴───┴───┴─────┴───┴─────┴───────┴─────┴───────┴─────────┴───────────┘
``````

Can we compute the partition counting function “instantaneously” in APL?

We will use a recurrence relation derived from the pentagonal number theorem, proved by Euler more than 160 years before Hardy and Ramanujan:

equation (11) in http://mathworld.wolfram.com/PartitionFunctionP.html. In APL:

``````      pn  ← {0≥⍵:0=⍵ ⋄ -/+⌿∇¨rec ⍵}
rec ← {⍵ - (÷∘2 (×⍤1) ¯1 1 ∘.+ 3∘×) 1+⍳⌈0.5*⍨⍵×2÷3}

pn¨0 1 2 3 4 5 6
1 1 2 3 5 7 11

pn 30
5604
``````

Warning: don’t try `pn 200` because that would take a while! Why? `pn 200` would engender `pn` being applied to each element of `rec 200` and:

``````   rec 200
199 195 188 178 165 149 130 108 83 55 24 ¯10
198 193 185 174 160 143 123 100 74 45 13 ¯22
``````

Each of the non-negative values would itself engender further recursive applications.

A Memo Operator

I recalled from J the memo adverb (monadic operator) `M`. Paraphrasing the J dictionary: `f M` is the same as `f` but may keep a record of the arguments and results for reuse. It is commonly used for multiply-recursive functions. Sounds like just the ticket!

The functionistas had previously implemented a (dyadic) memo operator. The version here is more restrictive but simpler. The restrictions are as follows:

• The cache is constructed “on the fly” and is discarded once the operand finishes execution.
• The operand is a dfn `{condition:basis ⋄ expression}` where none of `condition`, `base` and `expression` contain a `⋄` and recursion is denoted by `∇` in `expression`
• The arguments are scalar integers.
• The operand function does not have `¯1` as a result.
• A recursive call is on a smaller integer.

With these restrictions, our memo operator can be defined as follows:

``````M←{
f←⍺⍺
i←2+'⋄'⍳⍨t←3↓,⎕CR'f'
0=⎕NC'⍺':⍎' {T←(1+  ⍵)⍴¯1 ⋄  {',(i↑t),'¯1≢T[⍵]  :⊃T[⍵]   ⋄ ⊃T[⍵]  ←⊂',(i↓t),'⍵}⍵'
⍎'⍺{T←(1+⍺,⍵)⍴¯1 ⋄ ⍺{',(i↑t),'¯1≢T[⍺;⍵]:⊃T[⍺;⍵] ⋄ ⊃T[⍺;⍵]←⊂',(i↓t),'⍵}⍵'
}

pn M 10
42

pn 10
42
``````

Obviously, the `⍎` lines in `M` are crucial. When the operand is `pn`, the expression which is executed is as follows; the characters inserted by `M` are shown in red.

```{T←(1+⍵)⍴¯1 ⋄ {0≥⍵:0=⍵ ⋄ ¯1≢T[⍵]:⊃T[⍵] ⋄ ⊃T[⍵]←⊂-/+⌿∇¨rec ⍵}⍵}⍵
```

The machinations produce a worthwhile performance improvement:

``````      pn M 30
5604
cmpx 'pn M 30'
3.94E¯4
cmpx 'pn 30'
4.49E1
4.49e1 ÷ 3.94e¯4
113959
``````

That is, `pn M 30` is faster than `pn 30` by a factor of 114 thousand.

Can APL compute `pn M 200` “instantaneously”? Yes it can, for a suitable definition of “instantaneous”.

``````      cmpx 'm←pn M 200'
4.47E¯3
m
3.973E12
0⍕m
3972999029388
``````

`M` also works on operands which are anonymous, dyadic, or have non-scalar results.

Fibonacci Numbers

``````   fib←{1≥⍵:⍵ ⋄ (∇ ⍵-2)+∇ ⍵-1}

fib¨ ⍳15
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377
fib M¨ ⍳15
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377

cmpx 'fib 30'
1.45E0
cmpx 'fib M 30'
1.08E¯4
``````

Anonymous function:

``````      {1≥⍵:⍵ ⋄ (∇ ⍵-2)+∇ ⍵-1}M¨ ⍳15
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377
``````

Combinations
`⍺ comb ⍵` produces a matrix of all the size-`⍺` combinations of `⍳⍵`. The algorithm is from the venerable APL: An Interactive Approach by Gilman and Rose.

``````      comb←{(⍺≥⍵)∨0=⍺:((⍺≤⍵),⍺)⍴⍳⍺ ⋄ (0,1+(⍺-1)∇ ⍵-1)⍪1+⍺ ∇ ⍵-1}

3 comb 5
0 1 2
0 1 3
0 1 4
0 2 3
0 2 4
0 3 4
1 2 3
1 2 4
1 3 4
2 3 4

3 (comb ≡ comb M) 5
1

cmpx '8 comb M 16' '8 comb 16'
8 comb M 16 → 1.00E¯3 |     0% ⎕
8 comb 16   → 3.63E¯2 | +3526% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
``````

A faster `comb` is possible without a memo operator, but it’s not easy. (See for example section 4.1 of my APL87 paper.) A memo operator addresses the performance problem with multiple recursion and restores it to the toolkit of thought.

Presented at the BAA seminar at the Royal Society of Arts on 2016-05-20

Andy reported that in the shuffle QA some functions take a long time:

`m9249   `“4½ days so far”
`rankop  `21.5 hours
`m12466  `26.3 hours
`points  `7.2 hours

Background: Shuffle QA is an intensification of the normal QA. The suite of over 1800 QA functions is run under shuffle, whereby every `getspace` (memory allocation) is followed by every APL array in the workspace being shifted (“shuffled”) and by a simple workspace integrity check. The purpose is to uncover memory reads/writes which are rendered unsafe by events which in normal operations are rare.

`m9249`

`m9249` tests `+\[a]`, `×\[a]`, `⌈\[a]`, and `⌊\[a]`. It ran in under 3 seconds in normal operations; for it to take more than 4½ days under shuffle, `getspace` is clearly the dominant resource and hereafter we focus on that.

`m9249` used expressions like:

``````assert (+⍀x) ≡ {⍺+⍵}⍀x
``````

Since `{⍺+⍵}` is not recognized by the scan operator `⍀` and therefore not supported by special code, `{⍺+⍵}⍀x` is done by the general `O(n*2)` algorithm for scans, AND on each scalar the operand `{⍺+⍵}` is invoked and a `getspace` is done. The improved, faster QA does this:

``````F ← {1≥≢⍵:⍵ ⋄ x ⍪ (¯1↑⍵) ⍺⍺⍨ ¯1↑x←⍺⍺ ∇∇ ¯1↓⍵}
assert (+⍀x) ≡ +F x
``````

With such a change, the faster `m9249` runs in 22 minutes under shuffle instead of over 4½ days, without any reduction in coverage.

The number of `getspace` are as follows: `+⍀x` takes `O(1)`. The second expression `{⍺+⍵}⍀x` does `×\1↓⍴x` separate scans, and each one takes `O((≢x)*2) getspace`; the total is therefore `O(((≢x)*2)××/1↓⍴x) ←→ O((≢x)××/⍴x)`. The third expression `+F` is linear in `≢x` and is therefore `O(≢x)` in the number of `getspace`. In `m9249` the largest `x` has size `33 5 7`. The following table shows the orders of complexity and what they imply for this largest `x`:

 `+⍀x` `O(1)` `1` `{⍺+⍵}⍀x` `O((≢x)××/⍴x)` `38115 = 33 × ×/ 33 5 7` `+F x` `O(≢x)` `33`

The ratio between 38115 and 33 goes a long way towards explaining why the time to run `m9249` under shuffle was over 4½ days and is now 22 minutes. (Why does `m9249` still take 22 minutes? It tests for the four functions `+ × ⌈ ⌊`, for various axes, for different datatypes, and for different axis lengths.)

`rankop`

Another shuffle sluggard `rankop` took 47 hours. `rankop` tests expressions of the form `f⍤r⊢y`and `x f⍤r⊢y` for various functions `f`. The techniques used for reducing the number of `getspace` differ from function to function. We look at `x↑⍤r⊢y` as an illustration.

``````assert (4↑⍤1⊢y) ≡ 4{⍺↑⍵}⍤1⊢y
``````

`⍴y` was `2 7 1 8 2 8`. The expression took `7.6e¯5` seconds normally and `7.5` seconds under shuffle. The left hand side requires `O(1) getspace`, the right hand side `O(×/¯1↓⍴x)`. The expression was changed to:

``````assert (4↑⍤1⊢y) ≡ 2 7 1 8 2 4↑y
``````

Feels like cheating, doesn’t it? 🙂 The new expression takes 0.115 seconds under shuffle.

`m12466`

`m12466` tests `n⌈/[a]x` (and `n⌊/[a]x`). Previously, it did this by comparing `n⌈/[a]x` against `n{⍺⌈⍵}/[a]x` for `x` with shape `7 17 13 11`, for each of the four axes, for two different values of `n`, and for the 1-, 2-, 4-, and 8-byte numeric datatypes. It required 5238426 `getspace` and 26.3 hours to run under shuffle.

``F←{p⍉⍺⍺⌿(⊂(⎕io-⍨⍳|⍺)∘.+⍳(1-|⍺)+(⍴⍵)[⍵⍵])⌷(⍋p←⍒⍵⍵=⍳⍴⍴⍵)⍉⍵}``

`F` is a dyadic operator. The left operand `⍺⍺` is `⌈` or `⌊`; the right operand `⍵⍵` is the axis; and `n(⌈ F a)x` is the same as `n⌈/[a]x`. `n(⌈ F a)x` mainly does `n⌈⌿x`; other axes are handled by transposing the axis of interest to be the first, then doing `n⌈⌿x`, then transposing the axes back into the required order. `n(⌈ F a)x` is much more complicated than `n{⍺⌈⍵}/[a]x` but has the advantage of using only `O(1) getspace`.

The revamped `m12466` function uses 13717 `getspace` and 2 minutes to run under shuffle.

`points`

`points` is a function to test monadic format (`⍕`) for different values of `⎕pp`.

``````points←{⍺←17 ⋄ ⎕pp←⍺
tt←{(⌽∨\⌽⍵≠⍺)/⍵}                   ⍝ trim trailing ⍺s.
~0∊∘.{                             ⍝ all pairs (⍳⍵)
num←'0'tt↑,/⍕¨⍺'.'⍵              ⍝ eg '9.3'
num≡⍕⍎num                        ⍝ check round trip
}⍨⍳⍵
}``````

The expression is `14 15 16 17 points¨ 100`, meaning that for each of the 100 numbers `num` as text,

```  1.1   1.2   1.3   1.4   1.5     1.100   2.1   2.2   2.3   2.4   2.5     2.100   3.1   3.2   3.3   3.4   3.5 ... 3.100 ...  99.1  99.2  99.3  99.4  99.5    99.100 100.1 100.2 100.3 100.4 100.5   100.100```

a “round-trip” `num≡⍕⍎num` is evaluated. The expression requires 1.3 million `getspace` and 7.2 hours to run under shuffle.

A much more efficient version with respect to `getspace` is possible. Let `x` be a vector. `⍕¨x` requires `O(≢x) getspace`; `⍕x` requires `O(1) getspace` and, like `⍕¨x`, formats each individual element of `x` independently.

``````points1←{
⎕io←⎕ml←1
tt←{(⌽∨\⌽⍵≠⍺)/⍵}                   ⍝ trim trailing ⍺s
x←1↓∊(' ',¨s,¨'.')∘.,'0'tt¨s←⍕¨⍳⍵  ⍝ numbers as text
y←⍎x
{⎕pp←⍵ ⋄ x≡⍕y}¨⍺                   ⍝ check round trip
}``````

`14 15 16 17 points1 100` requires 11050 `getspace` and less than 2 minutes to run under shuffle.

What to do?

Andy says that the shuffle QA takes over 2000 hours (!). The shuffle sluggards will be tackled as done here by using more parsimonious APL expressions to compare against the aspect being QA-ed. Going forward, a new QA function will be run under shuffle as it is being developed. The developers will be unlikely to tolerate a running time of 4½ days.