Stencil Lives

⎕io←0 throughout. ⎕io delenda est.

Stencil

A stencil operator is available with Dyalog version 16.0. In brief, stencil is a dyadic operator f⌺s which applies f to (possibly overlapping) rectangles. The size of the rectangle and its movement are controlled by s. For example, enclosing 3-by-3 rectangles with default movements of 1:

   ⊢ a←4 5⍴⎕a
ABCDE
FGHIJ
KLMNO
PQRST
   {⊂⍵}⌺3 3 ⊢a
┌───┬───┬───┬───┬───┐
│   │   │   │   │   │
│ AB│ABC│BCD│CDE│DE │
│ FG│FGH│GHI│HIJ│IJ │
├───┼───┼───┼───┼───┤
│ AB│ABC│BCD│CDE│DE │
│ FG│FGH│GHI│HIJ│IJ │
│ KL│KLM│LMN│MNO│NO │
├───┼───┼───┼───┼───┤
│ FG│FGH│GHI│HIJ│IJ │
│ KL│KLM│LMN│MNO│NO │
│ PQ│PQR│QRS│RST│ST │
├───┼───┼───┼───┼───┤
│ KL│KLM│LMN│MNO│NO │
│ PQ│PQR│QRS│RST│ST │
│   │   │   │   │   │
└───┴───┴───┴───┴───┘

Stencil is also known as stencil code, tile, tessellation, and cut. It has applications in artificial neural networks, computational fluid dynamics, cellular automata, etc., and of course in Conway’s Game of Life.

The Rules of Life

Each cell of a boolean matrix has 8 neighbors adjacent to it horizontally, vertically, or diagonally. The Game of Life concerns the computation of the next generation boolean matrix.

(0) A 0-cell with 3 neighboring 1-cells becomes a 1.

(1) A 1-cell with 2 or 3 neighboring 1-cells remains at 1.

(2) All other cells remain or become a 0.

There are two main variations on the treatment of cells on the edges of the matrix: (a) the matrix is surrounded by a border of 0s; or (b) cells on an edge are adjacent to cells on the opposite edge, as on a torus.

There is an implementation of life in the dfns workspace and explained in a YouTube video. It assumes a toroidal topology.

   life←{↑1 ⍵∨.∧3 4=+/,¯1 0 1∘.⊖¯1 0 1∘.⌽⊂⍵}    ⍝ John Scholes

   ⊢ glider←5 5⍴0 0 1 0 0 1 0 1 0 0 0 1 1,12⍴0
0 0 1 0 0
1 0 1 0 0
0 1 1 0 0
0 0 0 0 0
0 0 0 0 0
   life glider
0 1 0 0 0
0 0 1 1 0
0 1 1 0 0
0 0 0 0 0
0 0 0 0 0

   {'.⍟'[⍵]}¨ (⍳8) {life⍣⍺⊢⍵}¨ ⊂glider
┌─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┐
│..⍟..│.⍟...│..⍟..│.....│.....│.....│.....│.....│
│⍟.⍟..│..⍟⍟.│...⍟.│.⍟.⍟.│...⍟.│..⍟..│...⍟.│.....│
│.⍟⍟..│.⍟⍟..│.⍟⍟⍟.│..⍟⍟.│.⍟.⍟.│...⍟⍟│....⍟│..⍟.⍟│
│.....│.....│.....│..⍟..│..⍟⍟.│..⍟⍟.│..⍟⍟⍟│...⍟⍟│
│.....│.....│.....│.....│.....│.....│.....│...⍟.│
└─────┴─────┴─────┴─────┴─────┴─────┴─────┴─────┘

Stencil Lives

It has long been known that stencil facilitates Game of Life computations. Eugene McDonnell explored the question in the APL88 paper Life: Nasty, Brutish, and Short [Ho51]. The shortest of the solutions derive as follows.

By hook or by crook, find all the 3-by-3 boolean matrices U which lead to a middle 1. A succinct Game of Life then obtains.

   B ← {1 1⌷life 3 3⍴(9⍴2)⊤⍵}¨ ⍳2*9
   U ← {3 3⍴(9⍴2)⊤⍵}¨ ⍸B  ⍝ ⍸ ←→ {⍵/⍳⍴⍵}

   life1 ← {U ∊⍨ {⊂⍵}⌺3 3⊢⍵}    ⍝ Eugene McDonnell

Comparing life and life1, and also illustrating that the toroidal and 0-border computations can be expressed one with the other.

   b←1=?97 103⍴3

   x←1 1↓¯1 ¯1↓ life 0,0,⍨0⍪0⍪⍨b
   y←life1 b
   x≡y
1

   g←{(¯1↑⍵)⍪⍵⍪1↑⍵}
   p←life b
   q←1 1↓¯1 ¯1↓ life1 (g b[;102]),(g b),(g b[;0])
   p≡q
1

Adám Brudzewsky points out that life can be terser as a train (fork):

   life1a ← U ∊⍨ ⊢∘⊂⌺3 3    ⍝ Adám Brudzewsky
   life1b ← U ∊⍨ {⊂⍵}⌺3 3

   (life1 ≡ life1a) b
1
   (life1 ≡ life1b) b
1

life1 is an example of implementing a calculation by look-up rather than by a more conventional computation, discussed in a recent blog post. There is a variation which is more efficient because the look-up is effected with integers rather than boolean matrices:

   A←3 3⍴2*⌽⍳9
   life2 ← {B[{+/,A×⍵}⌺3 3⊢⍵]}

   (life1 ≡ life1b) b
1

Jay Foad offers another stencil life, translating an algorithm in k by Arthur Whitney:

   life3 ← {3=s-⍵∧4=s←{+/,⍵}⌺3 3⊢⍵}    ⍝ Jay Foad

   (life1 ≡ life3) b
1

The algorithm combines the life rules into a single expression, wherein s←{+/,⍵}⌺3 3 ⊢⍵

(0) for 0-cells s is the number of neighbors; and
(1) for 1-cells s is the number of neighbors plus 1, and the plus 1 only matters if s is 4.

The same idea can be retrofit into the toroidal life:

   lifea←{3=s-⍵∧4=s←⊃+/,¯1 0 1∘.⊖¯1 0 1∘.⌽⊂⍵}

   (life ≡ lifea) b
1

Collected Definitions and Timings

life   ← {↑1 ⍵∨.∧3 4=+/,¯1 0 1∘.⊖¯1 0 1∘.⌽⊂⍵}
lifea  ← {3=s-⍵∧4=s←⊃+/,¯1 0 1∘.⊖¯1 0 1∘.⌽⊂⍵}

  B←{1 1⌷life 3 3⍴(9⍴2)⊤⍵}¨ ⍳2*9
  U←{3 3⍴(9⍴2)⊤⍵}¨ ⍸ B
  A←3 3⍴2*⌽⍳9

life1  ← {U ∊⍨ {⊂⍵}⌺3 3⊢⍵}
life1a ← U ∊⍨ ⊢∘⊂⌺3 3
life1b ← U ∊⍨ {⊂⍵}⌺3 3
life2  ← {B[{+/,A×⍵}⌺3 3⊢⍵]}
life3  ← {3=s-⍵∧4=s←{+/,⍵}⌺3 3⊢⍵}

   cmpx (⊂'life') ,¨ '1' '1a' '1b' '2' '3' '' 'a' ,¨ ⊂' b'
  life1 b  → 2.98E¯3 |    0% ⎕⎕⎕⎕⎕
  life1a b → 1.97E¯2 | +561% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
  life1b b → 2.99E¯3 |    0% ⎕⎕⎕⎕⎕
  life2 b  → 2.71E¯4 |  -91%
  life3 b  → 6.05E¯5 |  -98%
* life b   → 1.50E¯4 |  -95%
* lifea b  → 1.41E¯4 |  -96%

The * indicate that life and lifea give a different result (toroidal v 0-border).

life1a is much slower than the others because ⊢∘⊂⌺ is not implemented by special code.

{+/,⍵}⌺ is the fastest of the special codes because the computation has mathematical properties absent from {⊂⍵}⌺ and {+/,A×⍵}⌺.

The effect of special code v not, can be observed (for example) by use of redundant parentheses:

   cmpx '{+/,⍵}⌺3 3⊢b' '{+/,(⍵)}⌺3 3⊢b'
  {+/,⍵}⌺3 3⊢b   → 3.13E¯5  |      0%
  {+/,(⍵)}⌺3 3⊢b → 2.63E¯2  | +83900% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

   cmpx '{+/,A×⍵}⌺3 3⊢b' '{+/,A×(⍵)}⌺3 3⊢b'
  {+/,A×⍵}⌺3 3⊢b   → 2.42E¯4 |      0%
  {+/,A×(⍵)}⌺3 3⊢b → 2.98E¯2 | +12216% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

   ⍝ no special code in either of the following expressions
   cmpx '{+/,2×⍵}⌺3 3⊢b' '{+/,2×(⍵)}⌺3 3⊢b'
  {+/,2×⍵}⌺3 3⊢b   → 2.92E¯2 |      0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕ 
  {+/,2×(⍵)}⌺3 3⊢b → 3.03E¯2 |     +3% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

That life and lifea do not use stencil and yet are competitive, illustrates the efficacy of boolean operations and of letting primitives “see” large arguments.

Finally, if you wish to play with stencil a description and a hi-fi model of it can be found here.

Krypto

In the 2016 Year Game, the task was to generate the numbers 0 to 100 using APL primitives and the digits 2 0 1 6 in that order. For example,

   20=16
   ×2016
   2⌊016
   2+×016
   …

This “puzzle of the year” brings to mind Krypto, a game I played many years ago while in grade school.

Krypto

Krypto is a mathematical card game. The Krypto deck has 56 cards: 3 each of numbers 1-6, 4 each of the numbers 7-10, 2 each of 11-17, 1 each of 18-25.

   ⎕io←0
   DECK ← (3/1+⍳6),(4/7+⍳4),(2/11+⍳7),18+⍳8

Six cards are dealt: an objective card and five other cards. A player must use all five of the latter cards’ numbers exactly once, using any combination of arithmetic operations (+, -, ×, and ÷) to form the objective card’s number. The first player to come up with a correct formula is the winner. The stricter “International Rules” specify the use of non-negative integers only; fractional or negative intermediate results are not permitted.

For example, if the objective card were 17 and the other cards were 2, 8, 14, 19, and 21, then a Krypto solution can be as follows. Without loss of generality, we use APL notation and syntax.

   8 - 19 + 14 - 2 × 21

In this article we present functions to find all solutions to a Krypto hand.

A Solution

There are a maximum of !5 permutations of the 5 cards and 4 possibilities in each of the 4 places where an operation can be put, for (!5)×4*4 or 30720 total possibilities. This number is small enough for an exhaustive approach.

deal←{DECK ⌷⍨⊂ 6?≢DECK}

Krypto←{
  perm←{0=⍵:1 0⍴0 ⋄ ,[⍳2](⊂0,1+∇ ¯1+⍵)⌷⍤1⍒⍤1∘.=⍨⍳⍵}
  a←256⌿5 0⍕⍵[perm 5]
  a[;6+5×⍳4]←'+-×÷'[((256×!5),4)⍴⍉(4⍴4)⊤⍳256]
  ⊣⌸ a ⌿⍨ ⍺ = ⍎⍤1 ⊢a
}

   deal ⍬
17 8 19 14 2 21

   ⊢ t← 17 Krypto 8 19 14 2 21
    8 - 19 + 14 -  2 × 21
    8 - 19 + 14 - 21 ×  2
    8 - 19 - 21 + 14 ÷  2
    8 - 14 + 19 -  2 × 21
        ...
   21 +  8 - 19 - 14 ÷  2
   21 - 19 -  8 + 14 ÷  2

   ⍴ t
24 25

Intermediate Steps

The function perm is from the Dyalog blog post Permutations, 2015-07-20. perm n generates the sorted table of all permutations of ⍳n.

The local variable a in Krypto is a 30720-row character matrix computed from the 5 non-objective numbers. It consists of all !5 permutations of the 5 numbers interspersed with all 4*4 arrangements of the four operations + - × ÷.

Executing the rows of a produces a 30720-element numeric vector. Comparison of this vector with the objective yields a boolean vector that selects the rows of a which are correct formulas.

   ⍴ a
30720 25

   8↑a
    8 + 19 + 14 +  2 + 21
    8 + 19 + 14 +  2 - 21
    8 + 19 + 14 +  2 × 21
    8 + 19 + 14 +  2 ÷ 21
    8 + 19 + 14 -  2 + 21
    8 + 19 + 14 -  2 - 21
    8 + 19 + 14 -  2 × 21
    8 + 19 + 14 -  2 ÷ 21
   ¯5↑a
   21 ÷  2 ÷ 14 × 19 ÷  8
   21 ÷  2 ÷ 14 ÷ 19 +  8
   21 ÷  2 ÷ 14 ÷ 19 -  8
   21 ÷  2 ÷ 14 ÷ 19 ×  8
   21 ÷  2 ÷ 14 ÷ 19 ÷  8

   +/ b ← 17 = ⍎⍤1 ⊢a
24

   t ≡ b⌿a
1

We note that use of the @ operator, new to Dyalog version 16.0, obviates the need to name intermediate results for reasons for syntax. Instead, names are only used to break the code down into more comprehensible components.

Krypto1←{
  perm←{0=⍵:1 0⍴0 ⋄ ,[⍳2](⊂0,1+∇ ¯1+⍵)⌷⍤1⍒⍤1∘.=⍨⍳⍵}
  ⊣⌸ ⍺ (⊢(⌿⍨)=∘(⍎⍤1)) '+-×÷'[((256×!5),4)⍴⍉(4⍴4)⊤⍳256]⊣@(6+5×⍳4)⍤1 ⊢256⌿5 0⍕⍵[perm 5]
}

Krypto2←{
  perm←{0=⍵:1 0⍴0 ⋄ ,[⍳2](⊂0,1+∇ ¯1+⍵)⌷⍤1⍒⍤1∘.=⍨⍳⍵}
  fns  ← '+-×÷'[((256×!5),4)⍴⍉(4⍴4)⊤⍳256]
  nums ← 256⌿5 0⍕⍵[perm 5]
  ⊣⌸ ⍺ (⊢(⌿⍨)=∘(⍎⍤1)) fns ⊣@(6+5×⍳4)⍤1 ⊢nums
}

   17 (Krypto ≡ Krypto1) 8 19 14 2 21
1

   17 (Krypto ≡ Krypto2) 8 19 14 2 21
1

International Rules

The international rules (intermediate results must be non-negative integers) can be enforced readily:

irules←{⍵⌿⍨∧/(i=⌊i)∧0≤i←8 13 18{⍎¨⍺↓¨⊂⍵}⍤1⊢⍵}

   irules 17 Krypto 8 19 14 2 21
    8 +  2 + 14 ÷ 21 - 19
    8 + 21 - 19 - 14 ÷  2
   19 -  2 ×  8 - 21 - 14
   19 -  2 ÷  8 - 21 - 14
   19 -  2 × 14 - 21 -  8
   19 -  2 ÷ 14 - 21 -  8
    2 +  8 + 14 ÷ 21 - 19
   21 - 19 -  8 + 14 ÷  2

It is instructive to look at how the local variable i in irules is formed:

   ⊢ t←17 Krypto 8 19 14 2 21
    8 - 19 + 14 -  2 × 21
    8 - 19 + 14 - 21 ×  2
    8 - 19 - 21 + 14 ÷  2
    8 - 14 + 19 -  2 × 21
        ...
   21 +  8 - 19 - 14 ÷  2
   21 - 19 -  8 + 14 ÷  2

   8 13 18 {⍺↓¨⊂⍵}⍤1 ⊢t
┌─────────────────┬────────────┬───────┐
│19 + 14 -  2 × 21│14 -  2 × 21│ 2 × 21│
├─────────────────┼────────────┼───────┤
│19 + 14 - 21 ×  2│14 - 21 ×  2│21 ×  2│
├─────────────────┼────────────┼───────┤
│19 - 21 + 14 ÷  2│21 + 14 ÷  2│14 ÷  2│
├─────────────────┼────────────┼───────┤
│14 + 19 -  2 × 21│19 -  2 × 21│ 2 × 21│
├─────────────────┼────────────┼───────┤
            ...
├─────────────────┼────────────┼───────┤
│ 8 - 19 - 14 ÷  2│19 - 14 ÷  2│14 ÷  2│
├─────────────────┼────────────┼───────┤
│19 -  8 + 14 ÷  2│ 8 + 14 ÷  2│14 ÷  2│
└─────────────────┴────────────┴───────┘

      ⍎¨ 8 13 18 {⍺↓¨⊂⍵}⍤1 ⊢t
¯9 ¯28  42
¯9 ¯28  42
¯9  28   7
¯9 ¯23  42
   ...
¯4  12   7
 4  15   7

(An earlier version of the text appeared as the Jwiki essay Krypto, 2013-07-04.)

Stencil Maneuvers

Introduction

The e-mail arrived in the early afternoon from Morten, in Finland attending the FinnAPL Forest Seminar.

How do I speed this up and impress the Finns?

0 cmpx 'e←⊃∨/0.2 edges¨r g b'
6.4E¯1
edges
{⍺←0.7 ⋄ 1 1↓¯1 ¯1↓⍺<(|EdgeDetect apply ⍵)÷1⌈(+⌿÷≢),⍵}
apply
{stencil←⍺ ⋄ {+/,⍵×stencil}⌺(⍴stencil)⊢⍵}
EdgeDetect
¯1 ¯1 ¯1
¯1 8 ¯1
¯1 ¯1 ¯1

(r g b in the attached ws)

Background

is stencil, a new dyadic operator which will be available in version 16.0. In brief, f⌺s applies the operand function f to windows of size s. The window is moved over the argument centered over every possible position. The size is commonly odd and the movement commonly 1. For example:

   {⊂⍵}⌺5 3⊢6 5⍴⍳30
┌───────┬────────┬────────┬────────┬───────┐
│0  0  0│ 0  0  0│ 0  0  0│ 0  0  0│ 0  0 0│
│0  0  0│ 0  0  0│ 0  0  0│ 0  0  0│ 0  0 0│
│0  1  2│ 1  2  3│ 2  3  4│ 3  4  5│ 4  5 0│
│0  6  7│ 6  7  8│ 7  8  9│ 8  9 10│ 9 10 0│
│0 11 12│11 12 13│12 13 14│13 14 15│14 15 0│
├───────┼────────┼────────┼────────┼───────┤
│0  0  0│ 0  0  0│ 0  0  0│ 0  0  0│ 0  0 0│
│0  1  2│ 1  2  3│ 2  3  4│ 3  4  5│ 4  5 0│
│0  6  7│ 6  7  8│ 7  8  9│ 8  9 10│ 9 10 0│
│0 11 12│11 12 13│12 13 14│13 14 15│14 15 0│
│0 16 17│16 17 18│17 18 19│18 19 20│19 20 0│
├───────┼────────┼────────┼────────┼───────┤
│0  1  2│ 1  2  3│ 2  3  4│ 3  4  5│ 4  5 0│
│0  6  7│ 6  7  8│ 7  8  9│ 8  9 10│ 9 10 0│
│0 11 12│11 12 13│12 13 14│13 14 15│14 15 0│
│0 16 17│16 17 18│17 18 19│18 19 20│19 20 0│
│0 21 22│21 22 23│22 23 24│23 24 25│24 25 0│
├───────┼────────┼────────┼────────┼───────┤
│0  6  7│ 6  7  8│ 7  8  9│ 8  9 10│ 9 10 0│
│0 11 12│11 12 13│12 13 14│13 14 15│14 15 0│
│0 16 17│16 17 18│17 18 19│18 19 20│19 20 0│
│0 21 22│21 22 23│22 23 24│23 24 25│24 25 0│
│0 26 27│26 27 28│27 28 29│28 29 30│29 30 0│
├───────┼────────┼────────┼────────┼───────┤
│0 11 12│11 12 13│12 13 14│13 14 15│14 15 0│
│0 16 17│16 17 18│17 18 19│18 19 20│19 20 0│
│0 21 22│21 22 23│22 23 24│23 24 25│24 25 0│
│0 26 27│26 27 28│27 28 29│28 29 30│29 30 0│
│0  0  0│ 0  0  0│ 0  0  0│ 0  0  0│ 0  0 0│
├───────┼────────┼────────┼────────┼───────┤
│0 16 17│16 17 18│17 18 19│18 19 20│19 20 0│
│0 21 22│21 22 23│22 23 24│23 24 25│24 25 0│
│0 26 27│26 27 28│27 28 29│28 29 30│29 30 0│
│0  0  0│ 0  0  0│ 0  0  0│ 0  0  0│ 0  0 0│
│0  0  0│ 0  0  0│ 0  0  0│ 0  0  0│ 0  0 0│
└───────┴────────┴────────┴────────┴───────┘
   {+/,⍵}⌺5 3⊢6 5⍴⍳30
 39  63  72  81  57
 72 114 126 138  96
115 180 195 210 145
165 255 270 285 195
152 234 246 258 176
129 198 207 216 147

In addition, for matrix right arguments with movement 1, special code is provided for the following operand functions:

{∧/,⍵}   {∨/,⍵}   {=/,⍵}   {≠/,⍵} 

{⍵}      {,⍵}     {⊂⍵}     {+/,⍵}

{+/,A×⍵}     {E<+/,A×⍵}       
{+/⍪A×⍤2⊢⍵}  {E<+/⍪A×⍤2⊢⍵}

The comparison < can be one of < ≤ = ≠ ≥ >.

The variables r g b in the problem are integer matrices with shape 227 316 having values between 0 and 255. For present purposes we can initialize them to random values:

   r←¯1+?227 316⍴256
   g←¯1+?227 316⍴256
   b←¯1+?227 316⍴256

Opening Moves

I had other obligations and did not attend to the problem until 7 PM. A low-hanging fruit was immediately apparent: {+/,⍵×A}⌺s is not implemented by special code but {+/,A×⍵}⌺s is. The difference in performance is significant:

   edges1←{⍺←0.7 ⋄ 1 1↓¯1 ¯1↓⍺<(|EdgeDetect apply1 ⍵)÷1⌈(+⌿÷≢),⍵}
   apply1←{stencil←⍺ ⋄ {+/,stencil×⍵}⌺(⍴stencil)⊢⍵}

   cmpx 'e←⊃∨/0.2 edges1¨r g b' 'e←⊃∨/0.2 edges ¨r g b'
  e←⊃∨/0.2 edges1¨r g b → 8.0E¯3 |     0% ⎕
  e←⊃∨/0.2 edges ¨r g b → 6.1E¯1 | +7559% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

I fired off an e-mail reporting this good result, then turned to other more urgent matters. An example of the good being an enemy of the better, I suppose.

When I returned to the problem at 11 PM, the smug good feelings have largely dissipated:

  • Why should {+/,A×⍵}⌺s be so much faster than {+/,⍵×A}⌺s? That is, why is there special code for the former but not the latter? (Answer: all the special codes end with … ⍵}.)
  • I can not win: If the factor is small, then why isn’t it larger; if it is large, it’s only because the original code wasn’t very good.
  • The large factor is because, for this problem, C (special code) is still so much faster than APL (magic function).
  • If the absolute value can somehow be replaced, the operand function can then be in the form {E<+/,A×⍵}⌺s, already implemented by special code. Alternatively, {E<|+/,A×⍵}⌺s can be implemented by new special code.

Regarding the last point, the performance improvement potentially can be:

   edges2←{⍺←0.7 ⋄ 1 1↓¯1 ¯1↓(⍺ EdgeDetect)apply2 ⍵}
   apply2←{threshold stencil←⍺ ⋄ 
                    {threshold<+/,stencil×⍵}⌺(⍴stencil)⊢⍵}

   cmpx (⊂'e←⊃∨/0.2 edges'),¨'21 ',¨⊂'¨r g b'
  e←⊃∨/0.2 edges2¨r g b → 4.8E¯3 |      0%
* e←⊃∨/0.2 edges1¨r g b → 7.5E¯3 |    +57%
* e←⊃∨/0.2 edges ¨r g b → 7.4E¯1 | +15489% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

The * in the result of cmpx indicates that the expressions don’t give the same results. That is expected; edges2 is not meant to give the same results but is to get a sense of the performance difference. (Here, an additional factor of 1.57.)

edges1 has the expression ⍺<matrix÷1⌈(+⌿÷≢),⍵ where and the divisor are both scalars. Reordering the expression eliminates one matrix operation: (⍺×1⌈(+⌿÷≢),⍵)<matrix. Thus:

   edges1a←{⍺←0.7 ⋄ 1 1↓¯1 ¯1↓(⍺×1⌈(+⌿÷≢),⍵)<|EdgeDetect apply1 ⍵}

   cmpx (⊂'e←⊃∨/0.2 edges'),¨('1a' '1 ' '  '),¨⊂'¨r g b'
  e←⊃∨/0.2 edges1a¨r g b → 6.3E¯3 |      0%
  e←⊃∨/0.2 edges1 ¨r g b → 7.6E¯3 |    +22%
  e←⊃∨/0.2 edges  ¨r g b → 6.5E¯1 | +10232% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

The time was almost 1 AM. To bed.

Further Maneuvers

Fresh and promising ideas came with the morning. The following discussion applies to the operand function {+/,A×⍵}.

(0) Scalar multiple: If all the elements of A are equal, then {+/,A×⍵}⌺(⍴A)⊢r ←→ (⊃A)×{+/,⍵}⌺(⍴A)⊢r.

   A←3 3⍴?17
   ({+/,A×⍵}⌺(⍴A)⊢r) ≡ (⊃A)×{+/,⍵}⌺(⍴A)⊢r
1

(1) Sum v inner product: {+/,⍵}⌺(⍴A)⊢r is significantly faster than {+/,A×⍵}⌺(⍴A)⊢r because the former exploits mathematical properties absent from the latter.

   A←?3 3⍴17
   cmpx '{+/,⍵}⌺(⍴A)⊢r' '{+/,A×⍵}⌺(⍴A)⊢r'
  {+/,⍵}⌺(⍴A)⊢r   → 1.4E¯4 |    0% ⎕⎕⎕
* {+/,A×⍵}⌺(⍴A)⊢r → 1.3E¯3 | +828% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

(The * in the result of cmpx is expected.)

(2) Linearity: the stencil of the sum equals the sum of the stencils.

   A←?3 3⍴17
   B←?3 3⍴17
   ({+/,(A+B)×⍵}⌺(⍴A)⊢r) ≡ ({+/,A×⍵}⌺(⍴A)⊢r) + {+/,B×⍵}⌺(⍴A)⊢r 
1

(3) Middle: If B is zero everywhere except the middle, then {+/,B×⍵}⌺(⍴B)⊢r ←→ mid×r where mid is the middle value.

   B←(⍴A)⍴0 0 0 0 9
   B
0 0 0
0 9 0
0 0 0
   ({+/,B×⍵}⌺(⍴B)⊢r) ≡ 9×r
1

(4) A faster solution.

   A←EdgeDetect
   B←(⍴A)⍴0 0 0 0 9
   C←(⍴A)⍴¯1
   A B C
┌────────┬─────┬────────┐
│¯1 ¯1 ¯1│0 0 0│¯1 ¯1 ¯1│
│¯1  8 ¯1│0 9 0│¯1 ¯1 ¯1│
│¯1 ¯1 ¯1│0 0 0│¯1 ¯1 ¯1│
└────────┴─────┴────────┘
   A ≡ B+C
1

Whence:

   ({+/,A×⍵}⌺(⍴A)⊢r) ≡ ({+/,B×⍵}⌺(⍴A)⊢r) + {+/,C×⍵}⌺(⍴A)⊢r ⍝ (2)
1
   ({+/,A×⍵}⌺(⍴A)⊢r) ≡ ({+/,B×⍵}⌺(⍴A)⊢r) - {+/,⍵}⌺(⍴A)⊢r   ⍝ (0)
1
   ({+/,A×⍵}⌺(⍴A)⊢r) ≡ (9×r) - {+/,⍵}⌺(⍴A)⊢r                ⍝ (3)
1

Putting it all together:

edges3←{
  ⍺←0.7
  mid←⊃EdgeDetect↓⍨⌊2÷⍨⍴EdgeDetect
  1 1↓¯1 ¯1↓(⍺×1⌈(+⌿÷≢),⍵)<|(⍵×1+mid)-{+/,⍵}⌺(⍴EdgeDetect)⊢⍵
}

Comparing the various edges:

   x←(⊂'e←⊃∨/0.2 edges'),¨('3 ' '2 ' '1a' '1 ' '  '),¨⊂'¨r g b'
   cmpx x
  e←⊃∨/0.2 edges3 ¨r g b → 3.4E¯3 |      0%
* e←⊃∨/0.2 edges2 ¨r g b → 4.3E¯3 |    +25%
  e←⊃∨/0.2 edges1a¨r g b → 6.4E¯3 |    +88%
  e←⊃∨/0.2 edges1 ¨r g b → 7.5E¯3 |   +122%
  e←⊃∨/0.2 edges  ¨r g b → 6.5E¯1 | +19022% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
edges original
edges1 uses {+/,stencil×⍵} instead of {+/,⍵×stencil}
edges1a uses (⍺×mean,⍵)<matrix instead of ⍺<matrix÷mean,⍵
edges2 uses {threshold<+/,stencil×⍵}
edges3 uses the faster {+/,⍵} instead of {+/,stencil×⍵} and other maneuvers

Fin

I don’t know if the Finns would be impressed. The exercise has been amusing in any case.

Calculation v Look-Up

(⎕io←0 and timings are done in Dyalog version 15.0.)

Table Look-Up

Some functions can be computed faster by table look-up than a more traditional and more conventional calculation. For example:

      b←?1e6⍴2

      cmpx '*b' '(*0 1)[b]'
 *b        → 1.32E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
 (*0 1)[b] → 1.23E¯3 | -91% ⎕⎕⎕

Some observations about this benchmark:

(a) The advantage of table look-up depends on the time to calculate the function directly, versus the time to do indexing, ⍺[⍵] or ⍵⌷⍺, plus a small amount of time to calculate the function on a (much) smaller representative set.

Indexing is particularly efficient when the indices are boolean:

      bb←b,1
      bi←b,2
      cmpx '2 3[bb]' '2 3 3[bi]'
 2 3[bb]   → 1.12E¯4 |    0% ⎕⎕⎕⎕⎕
 2 3 3[bi] → 7.39E¯4 | +558% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

bi is the same as bb except that the last element of 2 forces it to be 1-byte integers (for bi) instead of 1-bit booleans (for bb).

(b) Although the faster expression works best with 0-origin, the timing for 1-origin is similar.

      cmpx '*b' '{⎕io←0 ⋄ (*0 1)[⍵]}b'
 *b                   → 1.32E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
 {⎕io←0 ⋄ (*0 1)[⍵]}b → 1.22E¯3 | -91% ⎕⎕⎕

That is, if the current value of ⎕io is 1, the implementation can use a local ⎕io setting of 0.

(c) The fact that *b is currently slower than (*0 1)[b] represents a judgment by the implementers that *b is not a sufficiently common computation to warrant writing faster code for it. Other similar functions already embody the table look-up technique:

      cmpx '⍕b' '¯1↓,(2 2⍴''0 1 '')[b;]'
 ⍕b                   → 6.95E¯4 |  0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
 ¯1↓,(2 2⍴'0 1 ')[b;] → 6.92E¯4 | -1% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

Small Domains

Table look-up works best for booleans, but it also works for other small domains such as 1-byte integers:

      i1←?1e6⍴128    ⍝ 1-byte integers

      cmpx '*i1' '(*⍳128)[i1]'
 *i1         → 1.48E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
 (*⍳128)[i1] → 9.30E¯4 | -94% ⎕⎕

      cmpx '○i1' '(○⍳128)[i1]'
 ○i1         → 2.78E¯3 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
 (○⍳128)[i1] → 9.25E¯4 | -67% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

In general, a table for 1-byte integers needs to have values for ¯128+⍳256. Here ⍳128 is used to simulate that in C indexing can be done on 1-byte indices without extra computation on the indices. In APL, general 1-byte indices are used as table[i1+128]; in C, the expression is (table+offset)[i].

Domains considered “small” get bigger all the time as machines come equipped with ever larger memories.

Larger Domains

Table look-up is applicable when the argument is not a substantial subset of a large domain and there is a fast way to detect that it is not.

      i4s←1e7+?1e6⍴1e5    ⍝ small-range 4-byte integers

      G←{(⊂u⍳⍵)⌷⍺⍺ u←∪⍵}

      cmpx '⍟i4s' '⍟G i4s'
 ⍟i4s   → 1.44E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
 ⍟G i4s → 9.59E¯3 | -34% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

The definition of G implements that the argument is not used directly as indices (as in (⊂⍵)⌷⍺⍺ u) but needed to be looked up, the u⍳⍵ in (⊂u⍳⍵)⌷⍺⍺ u. Thus both index-of and indexing are key enablers for making table look-up competitive.

§4.3 of Notation as a Tool of Thought presents a different way to distribute the results of a calculation on the “nub” (unique items). But it takes more time and space and is limited to numeric scalar results.

      H←{(⍺⍺ u)+.×(u←∪⍵)∘.=⍵}

      i4a←1e7+?1e4⍴1e3    ⍝ small-range 4-byte integers

      cmpx '⍟i4a' '⍟G i4a' '⍟H i4a'
 ⍟i4a   → 1.56E¯4 |       0%
 ⍟G i4a → 6.25E¯5 |     -60%
 ⍟H i4a → 1.78E¯1 | +113500% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

The benchmark is done on the smaller i4a as a benchmark on i4s founders on the ≢∪⍵ by ≢⍵ boolean matrix (for i4s, 12.5 GB = ×/ 0.125 1e5 1e6) created by H.

Mathematical Background

New Math” teaches that a function is a set of ordered pairs whose first items are unique:

*⍵: {(⍵,*⍵)|⍵∊C}      The set of all (⍵,*⍵) where is a complex number
⍟⍵: {(⍵,⍟⍵)|⍵∊C~{0}}  The set of all (⍵,⍟⍵) where is a non-zero complex number
○⍵: {(⍵,○⍵)|⍵∊C}      The set of all (⍵,○⍵) where is a complex number
⍕⍵: {(⍵,⍕⍵)|0=⍴⍴⍵}    The set of all (⍵,⍕⍵) where is a scalar

When is restricted (for example) to the boolean domain, the functions, the sets of ordered pairs, are more simply represented by enumerating all the possibilities:

*⍵: {(0,1), (1,2.71828)}
⍟⍵: {(0,Err), (1,0)}
○⍵: {(0,0), (1,3.14159)}
⍕⍵: {(0,'0'), (1,'1')}

Realized and Potential Performance Improvements

realized (available no later than 15.0)

boolean
1-byte integer
⍕ ∘.f
⍕     a∘⍕

potential (non-exhaustive list)

boolean
1-byte integer
2-byte integer
small-range 4-byte integer
* ⍟ | - ○   ! ⍳ a∘f f.g
* ⍟ | - ○ × ! ⍳ a∘f f.g ∘.f
* ⍟ | - ○ ×   ⍳ a∘f
* ⍟ |     ×

a is a scalar; f and g are primitive scalar dyadic functions.

APL Exercises

These exercises are designed to introduce APL to a high school senior adept in mathematics. The entire set of APL Exercises has the following sections:

Introduction
0. Beginnings
1. Utilities
2. Recursion
3. Proofs
4. Rank Operator
5. Index-Of
6. Key Operator
7. Indexing
8. Grade and Sort
9. Power Operator
10. Arithmetic
11. Combinatorial Objects
The Fine Print

Sections 2 and 10 are the most interesting ones and are as follows. The full set is too lengthy (and too boring?) to be included here. Some students may be able to make up their own exercises from the list of topics. As well, it may be useful to know what the “experts” think are important aspects of APL.

2. Recursion

⎕io←0 throughout.

2.0 Factorial

Write a function to compute the factorial function on integer , ⍵≥0.

      fac¨ 5 6 7 8
120 720 5040 40320

      n←1+?20
      (fac n) = n×fac n-1
1
      fac 0
1

2.1 Fibonacci Numbers

Write a function to compute the -th Fibonacci number, ⍵≥0.

      fib¨ 5 6 7 8
5 8 13 21    

      n←2+?20
      (fib n) = (fib n-2) + (fib n-1)
1
      fib 0
0
      fib 1
1

If the function written above is multiply recursive, write a version which is singly recursive (only one occurrence of ). Use cmpx to compare the performance of the two functions on the argument 35.

2.2 Peano Addition

Let and be natural numbers (non-negative integers). Write a function padd to compute ⍺+⍵ without using + itself (or - or × or ÷ …). The only functions allowed are comparison to 0 and the functions pre←{⍵-1} and suc←{⍵+1}.

2.3 Peano Multiplication

Let and be natural numbers (non-negative integers). Write a function ptimes to compute ⍺×⍵ without using × itself (or + or - or ÷ …). The only functions allowed are comparison to 0 and the functions pre←{⍵-1} and suc←{⍵+1} and padd.

2.4 Binomial Coefficients

Write a function to produce the binomal coefficients of order , ⍵≥0.

      bc 0
1
      bc 1
1 1
      bc 2
1 2 1
      bc 3
1 3 3 1
      bc 4
1 4 6 4 1

2.5 Cantor Set

Write a function to compute the Cantor set of order , ⍵≥0.

      Cantor 0
1
      Cantor 1
1 0 1
      Cantor 2
1 0 1 0 0 0 1 0 1
      Cantor 3
1 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 1 0 1

The classical version of the Cantor set starts with the interval [0,1] and at each stage removes the middle third from each remaining subinterval:

[0,1] →
[0,1/3] ∪ [2/3,1] →
[0,1/9] ∪ [2/9,1/3] ∪ [2/3,7/9] ∪ [8/9,1] → ….

      {(+⌿÷≢)Cantor ⍵}¨ 3 5⍴⍳15
1         0.666667  0.444444   0.296296   0.197531  
0.131687  0.0877915 0.0585277  0.0390184  0.0260123 
0.0173415 0.011561  0.00770735 0.00513823 0.00342549

      (2÷3)*3 5⍴⍳15
1         0.666667  0.444444   0.296296   0.197531  
0.131687  0.0877915 0.0585277  0.0390184  0.0260123 
0.0173415 0.011561  0.00770735 0.00513823 0.00342549

2.6 Sierpinski Carpet

Write a function to compute the Sierpinski Carpet of order , ⍵≥0.

      SC 0
1
      SC 1
1 1 1
1 0 1
1 1 1
      SC 2
1 1 1 1 1 1 1 1 1
1 0 1 1 0 1 1 0 1
1 1 1 1 1 1 1 1 1
1 1 1 0 0 0 1 1 1
1 0 1 0 0 0 1 0 1
1 1 1 0 0 0 1 1 1
1 1 1 1 1 1 1 1 1
1 0 1 1 0 1 1 0 1
1 1 1 1 1 1 1 1 1

      ' ⌹'[SC 3]
⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹
⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹
⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹
⌹⌹⌹   ⌹⌹⌹⌹⌹⌹   ⌹⌹⌹⌹⌹⌹   ⌹⌹⌹
⌹ ⌹   ⌹ ⌹⌹ ⌹   ⌹ ⌹⌹ ⌹   ⌹ ⌹
⌹⌹⌹   ⌹⌹⌹⌹⌹⌹   ⌹⌹⌹⌹⌹⌹   ⌹⌹⌹
⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹
⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹
⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹
⌹⌹⌹⌹⌹⌹⌹⌹⌹         ⌹⌹⌹⌹⌹⌹⌹⌹⌹
⌹ ⌹⌹ ⌹⌹ ⌹         ⌹ ⌹⌹ ⌹⌹ ⌹
⌹⌹⌹⌹⌹⌹⌹⌹⌹         ⌹⌹⌹⌹⌹⌹⌹⌹⌹
⌹⌹⌹   ⌹⌹⌹         ⌹⌹⌹   ⌹⌹⌹
⌹ ⌹   ⌹ ⌹         ⌹ ⌹   ⌹ ⌹
⌹⌹⌹   ⌹⌹⌹         ⌹⌹⌹   ⌹⌹⌹
⌹⌹⌹⌹⌹⌹⌹⌹⌹         ⌹⌹⌹⌹⌹⌹⌹⌹⌹
⌹ ⌹⌹ ⌹⌹ ⌹         ⌹ ⌹⌹ ⌹⌹ ⌹
⌹⌹⌹⌹⌹⌹⌹⌹⌹         ⌹⌹⌹⌹⌹⌹⌹⌹⌹
⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹
⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹
⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹
⌹⌹⌹   ⌹⌹⌹⌹⌹⌹   ⌹⌹⌹⌹⌹⌹   ⌹⌹⌹
⌹ ⌹   ⌹ ⌹⌹ ⌹   ⌹ ⌹⌹ ⌹   ⌹ ⌹
⌹⌹⌹   ⌹⌹⌹⌹⌹⌹   ⌹⌹⌹⌹⌹⌹   ⌹⌹⌹
⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹
⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹⌹ ⌹
⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹⌹

3-D analogs of the Cantor set and the Sierpinski carpet are the Sierpinski sponge or the Menger sponge.

2.7 Extended H

Write a function for the extend H algorithm for , ⍵≥0, which embeds the complete binary tree of depth on the plane. In ×H ⍵ there are 2*⍵ leaves, instances of the letter o with exactly one neighbor (or no neighbors for 0=⍵). The root is at the center of the matrix.

      xH¨ ⍳5
┌─┬─────┬─────┬─────────────┬─────────────┐
│o│o-o-o│o   o│o-o-o   o-o-o│o   o   o   o│
│ │     │|   |│  |       |  │|   |   |   |│
│ │     │o-o-o│  o---o---o  │o-o-o   o-o-o│
│ │     │|   |│  |       |  │| | |   | | |│
│ │     │o   o│o-o-o   o-o-o│o | o   o | o│
│ │     │     │             │  |       |  │
│ │     │     │             │  o---o---o  │
│ │     │     │             │  |       |  │
│ │     │     │             │o | o   o | o│
│ │     │     │             │| | |   | | |│
│ │     │     │             │o-o-o   o-o-o│
│ │     │     │             │|   |   |   |│
│ │     │     │             │o   o   o   o│
└─┴─────┴─────┴─────────────┴─────────────┘

Write a function that has the same result as xH but checks the result using the assert utility. For example:

assert←{⍺←'assertion failure' ⋄ 0∊⍵:⍺ ⎕SIGNAL 8 ⋄ shy←0}

xH1←{
  h←xH ⍵
  assert 2=⍴⍴h:
  assert h∊'○ -|':
  assert (¯1+2*1+⍵)=+/'o'=,h:
  ...
  h
}

2.8 Tower of Hanoi

(By André Karwath aka Aka (Own work) [CC BY-SA 2.5 (http://creativecommons.org/licenses/by-sa/2.5)], via Wikimedia Commons)

The Tower of Hanoi problem is to move a set of different-sized disks from one peg to another, moving one disk at a time, using an intermediate peg if necessary. At all times no larger disk may sit on top of a smaller disk.

Write a function Hanoi to solve the problem of moving disks from peg 0 to peg 2. Since it’s always the disk which is at the top of a peg which is moved, the solution can be stated as a 2-column matrix with column 0 indicating the source peg and column 1 the destination peg.

      Hanoi¨ ⍳5
┌───┬───┬───┬───┬───┐
│   │0 2│0 1│0 2│0 1│
│   │   │0 2│0 1│0 2│
│   │   │1 2│2 1│1 2│
│   │   │   │0 2│0 1│
│   │   │   │1 0│2 0│
│   │   │   │1 2│2 1│
│   │   │   │0 2│0 1│
│   │   │   │   │0 2│
│   │   │   │   │1 2│
│   │   │   │   │1 0│
│   │   │   │   │2 0│
│   │   │   │   │1 2│
│   │   │   │   │0 1│
│   │   │   │   │0 2│
│   │   │   │   │1 2│
└───┴───┴───┴───┴───┘

Prove that Hanoi ⍵ has ¯1+2*⍵ rows.

10. Arithmetic

⎕io←0 throughout.

10.0 Primality Testing

Write a function prime ⍵ which is 1 or 0 depending on whether non-negative integer is a prime number. For example:

      prime 1
0
      prime 2
1
      prime 9
0
      prime¨ 10 10⍴⍳100
0 0 1 1 0 1 0 1 0 0
0 1 0 1 0 0 0 1 0 1
0 0 0 1 0 0 0 0 0 1
0 1 0 0 0 0 0 1 0 0
0 1 0 1 0 0 0 1 0 0
0 0 0 1 0 0 0 0 0 1
0 1 0 0 0 0 0 1 0 0
0 1 0 1 0 0 0 0 0 1
0 0 0 1 0 0 0 0 0 1
0 0 0 0 0 0 0 1 0 0

10.1 Primes Less Than n

Write a function primes ⍵ using the sieve method which produces all the primes less than . Use cmpx to compare the speed of primes n and (prime¨⍳n)/⍳n. For example:

      primes 7
2 3 5
      primes 50
2 3 5 7 11 13 17 19 23 29 31 37 41 43 47

      (prime¨⍳50)/⍳50
2 3 5 7 11 13 17 19 23 29 31 37 41 43 47

10.2 Factoring

Write a function factor which produces the prime factorization of , such that ⍵ = ×/factor ⍵ and ∧/ prime¨ factor ⍵ is 1. For example:

      factor 144
2 2 2 2 3 3
      ×/ factor 144
144
      ∧/ prime¨ factor 144
1

      factor 1

      factor 2
2

10.3 pco

Read about the pco function in http://dfns.dyalog.com/n_pco.htm and experiment with it.

      )copy dfns pco

      pco ⍳7
2 3 5 7 11 13 17

      1 pco ⍳10
0 0 1 1 0 1 0 1 0 0

G.H. Hardy states in A Mathematician’s Apology (chapter 14, page 23) that the number of primes less than 1e9 is 50847478. Use pco to check whether this is correct.

10.4 GCD

Write a function ⍺ gcd ⍵ to compute the greatest common divisor of non-negative integers and using the Euclidean algorithm.

Write a function ⍺ egcd ⍵ to compute the GCD of and as coefficients of and , such that (⍺ gcd ⍵) = (⍺,⍵)+.×⍺ egcd ⍵.

      112 gcd 144
16
      112 egcd 144
4 ¯3
      112 144 +.× 112 egcd 144
16

10.5 Ring Extension

Z[√k] is the ring of integers Z extended with √k where k is not a perfect square. The elements of Z[√k] are ordered pairs (a,b) which can be interpreted as a+b×√k (a+b×k*0.5).

Write a d-operator ⍺(⍺⍺ rplus)⍵ which does addition in Z[√⍺⍺].

      3 4 (5 rplus) ¯2 7
1 11

Write a d-operator ⍺(⍺⍺ rtimes)⍵ which does multiplication in Z[√⍺⍺].

      3 4 (5 rtimes) ¯2 7
134 13

Both of the above can be done using integer operations only.

10.6 Repeated Squaring I

Write a function ⍺ pow ⍵ that computes ⍺*⍵ using repeated squaring. is a number and a non-negative integer. Hint: the binary representation of an integer n obtains as 2⊥⍣¯1⊢n.

10.7 Repeated Squaring II

Write a d-operator ⍺(⍺⍺ rpow)⍵ that computes raised to the power using repeated squaring. is in Z[√⍺⍺] and is a non-negative integer.

      ⊃ (5 rtimes)/ 13 ⍴ ⊂ 1 1
2134016 954368

      1 1 (5 rpow) 13
2134016 954368
      1 ¯1 (5 rpow) 13
2134016 ¯954368

The last two expressions are key steps in an O(⍟n) computation of the n-th Fibonacci number using integer operations, using Binet’s formula:

50847534

⎕io←0 throughout.

I was re-reading A Mathematician’s Apology before recommending it to Suki Tekverk, our summer intern, and came across a statement that the number of primes less than 1e9 is 50847478 (§14, page 23). The function pco from the dfns workspace does computations on primes; ¯1 pco n is the number of primes less than n:

      )copy dfns pco

      ¯1 pco 1e9
50847534

The two numbers 50847478 and 50847534 cannot both be correct. A search of 50847478 on the internet reveals that it is incorrect and that 50847534 is correct. 50847478 is erroneously cited in various textbooks and even has a name, Bertelsen’s number, memorably described by MathWorld as “an erroneous name erroneously given the erroneous value of π(1e9) = 50847478.”

Although several internet sources give 50847534 as the number of primes less than 1e9, they don’t provide a proof. Relying on them would be doing the same thing — arguing from authority — that led other authors astray, even if it is the authority of the mighty internet. Besides, I’d already given Suki, an aspiring mathematician, the standard spiel about the importance of proof.

How do you prove the 50847534 number? One way is to prove correct a program that produces it. Proving pco correct seems daunting — it has 103 lines and two large tables. Therefore, I wrote a function from scratch to compute the number of primes less than , a function written in a way that facilitates proof.

sieve←{
  b←⍵⍴{∧⌿↑(×/⍵)⍴¨~⍵↑¨1}2 3 5
  b[⍳6⌊⍵]←(6⌊⍵)⍴0 0 1 1 0 1
  49≥⍵:b
  p←3↓{⍵/⍳⍴⍵}∇⌈⍵*0.5
  m←1+⌊(⍵-1+p×p)÷2×p
  b ⊣ p {b[⍺×⍺+2×⍳⍵]←0}¨ m
}

      +/ sieve 1e9
50847534

sieve ⍵ produces a boolean vector b with length such that b/⍳⍴b are all the primes less than . It implements the sieve of Eratosthenes: mark as not-prime multiples of each prime less than ⍵*0.5; this latter list of primes obtains by applying the function recursively to ⌈⍵*0.5. Some obvious optimizations are implemented:

  • Multiples of 2 3 5 are marked by initializing b with ⍵⍴{∧⌿↑(×/⍵)⍴¨~⍵↑¨1}2 3 5 rather than with ⍵⍴1.
  • Subsequently, only odd multiples of primes > 5 are marked.
  • Multiples of a prime to be marked start at its square.

Further examples:

      +/∘sieve¨ ⍳10
0 0 0 1 2 2 3 3 4 4

      +/∘sieve¨ 10*⍳10
0 4 25 168 1229 9592 78498 664579 5761455 50847534

There are other functions which are much easier to prove correct, for example, sieve1←{2=+⌿0=(⍳3⌈⍵)∘.|⍳⍵}. However, sieve1 requires O(⍵*2) space and sieve1 1e9 cannot produce a result with current technology. (Hmm, a provably correct program that cannot produce the desired result …)

We can test that sieve is consistent with pco and that pco is self-consistent. pco is a model of the p: primitive function in J. Its cases are:
   pco n   the n-th prime
¯1 pco n   the number of primes less than n
 1 pco n   1 iff n is prime
 2 pco n   the prime factors and exponents of n
 3 pco n   the prime factorization of n
¯4 pco n   the next prime < n
 4 pco n   the next prime > n
10 pco r,s a boolean vector b such that r+b/⍳⍴b are the primes in the half-open interval [r,s).

      ¯1 pco 10*⍳10      ⍝ the number of primes < 1e0 1e1 ... 1e9
0 4 25 168 1229 9592 78498 664579 5761455 50847534

      +/ 10 pco 0 1e9    ⍝ sum of the sieve between 0 and 1e9
50847534
                         ⍝ sum of sums of 10 sieves
                         ⍝ each of size 1e8 from 0 to 1e9
      +/ t← {+/10 pco ⍵+0 1e8}¨ 1e8×⍳10
50847534
                         ⍝ sum of sums of 1000 sieves
                         ⍝ each of size 1e6 from 0 to 1e9
      +/ s← {+/10 pco ⍵+0 1e6}¨ 1e6×⍳1000
50847534

      t ≡ +/ 10 100 ⍴ s
1
                         ⍝ sum of sums of sieves with 1000 randomly
                         ⍝ chosen end-points, from 0 to 1e9
      +/ 2 {+/10 pco ⍺,⍵}/ 0,({⍵[⍋⍵]}1000?1e9),1e9
50847534

      ¯1 pco 1e9         ⍝ the number of primes < 1e9
50847534
      pco 50847534       ⍝ the 50847534-th prime
1000000007
      ¯4 pco 1000000007  ⍝ the next prime < 1000000007
999999937
      4 pco 999999937    ⍝ the next prime > 999999937
1000000007     
      4 pco 1e9          ⍝ the next prime > 1e9
1000000007

                         ⍝ are 999999937 1000000007 primes?
      1 pco 999999937 1000000007
1 1
                         ⍝ which of 999999930 ... 1000000009
                         ⍝ are prime?
      1 pco 999999930+4 20⍴⍳80
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1

      +/∘sieve¨ 999999937 1000000007 ∘.+ ¯1 0 1
50847533 50847533 50847534
50847534 50847534 50847535