Linear Interpolation

⎕io=0 assumed throughout; works in 1-origin with the obvious modifications.

Introduction

On Wednesday, a question arrived via Dyalog Support from an intern in Africa: If M is the matrix on the left, use linear interpolation to compute the result on the right.

   1 20         1 20
   4 80         2 40
   6 82         3 60
                4 80
                5 81
                6 82

Linear Interpolation

Two points (x0,y0) and (x1,y1) specify a line; for any x there is a unique y on that line (assuming x0≠x1). The equation for the line derives as follows, starting from its slope m:

   m = (y1-y0) ÷ (x1-x0)
   (y-y0) = m × (x-x0)
   y = y0 + m × (x-x0)

Therefore, if is a 2-by-2 matrix of the two points and are the x-values to be interpolated, then:

   g ← {(⊃⌽⍺)+(⍵-⊃⍺)÷÷/-⌿⍺}

   ⊢ M←1 4 6,⍪20 80 82
1 20
4 80
6 82

   M[0 1;] g 2 3
40 60
   M[1 2;] g 5
81

A New Twist, A New Solution

The problem as posed implicitly required that:

  • The x-values are the positive integers bounded by ⊃⊖M.
  • Appropriate rows of the matrix are selected for a given x-value.
  • The missing x-values and their interpolations are “slotted back” into the argument matrix.

These requirements are best met by , interval index, a relatively new primitive function introduced in Dyalog APL version 16.0. The left argument must be sorted and partitions the universe into disjoint contiguous intervals; ⍺⍸⍵ finds the index of the interval which contains an item of . The result is ⎕io dependent.

For the given matrix M, the partition (of the real numbers in this case) is depicted below. As in conventional mathematical notation, [ denotes that the interval includes the left end-point and ) denotes that the interval excludes the right end-point.

          1        4      6
─────────)[───────)[─────)[──────────
     ¯1       0       1       2

   v←¯5 0 1 2.5 6 3 4 5 9 8 7

   1 4 6 ⍸ v
¯1 ¯1 0 0 2 0 1 1 2 2 2

   v ,[¯0.5] 1 4 6 ⍸ v
¯5  0 1 2.5 6 3 4 5 9 8 7
¯1 ¯1 0 0   2 0 1 1 2 2 2

With in hand, the problem can be solved as follows:

interpol←{
  (x y)←↓⍉⍵
  m←m,⊃⌽m←(2-/y)÷(2-/x)
  j←0⌈x⍸i←1+⍳⊃⌽x
  i,⍪y[j]+m[j]×i-x[j]
}

   interpol M
1 20
2 40
3 60
4 80
5 81
6 82

The problem of x-values less than the first end-point is finessed by applying 0⌈ to the interval indices, and that of x-values greater than or equal to the last end-point is finessed by repeating the last slope m←m,⊃⌽m.

It is possible to do the interpolation only on the missing indices (2 3 5 in this case) and insert them into the argument matrix. It seems neater to simply interpolate everything, and in so doing provide a check that the interpolated values equal the values given in the argument.

An Alternative Interpolation

Interpolating according to two selected rows of a matrix of points treats the function as piecewise linear, with sharp inflection points where the lines join (different slopes between adjacent lines). A “holistic” alternative approach is possible: the matrix can be interpreted as specifying a single line and the interpolation is according to this single line. The primitive function computes the coefficients of the line which best fits the points:

   ⎕rl←7*5  ⍝ for reproducible random numbers

   ⊢ M←t,⍪(?7⍴5)+¯17+3×t←?7⍴100
35  89
98 278
19  44
 4  ¯5
62 170
49 133
25  59

   M[;1] ⌹ 1,M[;,0]    ⍝ y-intercept and slope
¯15.3164 2.99731

   interpola ← {(1,⍤0⊢⍵)+.×⍺[;1]⌹1,⍺[;,0]}

   M[;1] ,[¯0.5] M interpola M[;0]
89      278    44      ¯5       170     133     59     
89.5895 278.42 41.6325 ¯3.32713 170.517 131.552 59.6164

   M interpola 33 35 37 39.7
83.5949 89.5895 95.5841 103.677

Finally

Our best wishes to the intern. Welcome to APL!

Stencil Lives

⎕io←0 throughout. ⎕io delenda est.

Stencil

A stencil operator is available with Dyalog version 16.0. In brief, stencil is a dyadic operator f⌺s which applies f to (possibly overlapping) rectangles. The size of the rectangle and its movement are controlled by s. For example, enclosing 3-by-3 rectangles with default movements of 1:

   ⊢ a←4 5⍴⎕a
ABCDE
FGHIJ
KLMNO
PQRST
   {⊂⍵}⌺3 3 ⊢a
┌───┬───┬───┬───┬───┐
│   │   │   │   │   │
│ AB│ABC│BCD│CDE│DE │
│ FG│FGH│GHI│HIJ│IJ │
├───┼───┼───┼───┼───┤
│ AB│ABC│BCD│CDE│DE │
│ FG│FGH│GHI│HIJ│IJ │
│ KL│KLM│LMN│MNO│NO │
├───┼───┼───┼───┼───┤
│ FG│FGH│GHI│HIJ│IJ │
│ KL│KLM│LMN│MNO│NO │
│ PQ│PQR│QRS│RST│ST │
├───┼───┼───┼───┼───┤
│ KL│KLM│LMN│MNO│NO │
│ PQ│PQR│QRS│RST│ST │
│   │   │   │   │   │
└───┴───┴───┴───┴───┘

Stencil is also known as stencil code, tile, tessellation, and cut. It has applications in artificial neural networks, computational fluid dynamics, cellular automata, etc., and of course in Conway’s Game of Life.

The Rules of Life

Each cell of a boolean matrix has 8 neighbors adjacent to it horizontally, vertically, or diagonally. The Game of Life concerns the computation of the next generation boolean matrix.

(0) A 0-cell with 3 neighboring 1-cells becomes a 1.

(1) A 1-cell with 2 or 3 neighboring 1-cells remains at 1.

(2) All other cells remain or become a 0.

There are two main variations on the treatment of cells on the edges of the matrix: (a) the matrix is surrounded by a border of 0s; or (b) cells on an edge are adjacent to cells on the opposite edge, as on a torus.

There is an implementation of life in the dfns workspace and explained in a YouTube video. It assumes a toroidal topology.

   life←{↑1 ⍵∨.∧3 4=+/,¯1 0 1∘.⊖¯1 0 1∘.⌽⊂⍵}    ⍝ John Scholes

   ⊢ glider←5 5⍴0 0 1 0 0 1 0 1 0 0 0 1 1,12⍴0
0 0 1 0 0
1 0 1 0 0
0 1 1 0 0
0 0 0 0 0
0 0 0 0 0
   life glider
0 1 0 0 0
0 0 1 1 0
0 1 1 0 0
0 0 0 0 0
0 0 0 0 0

   {'.⍟'[⍵]}¨ (⍳8) {life⍣⍺⊢⍵}¨ ⊂glider
┌─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┐
│..⍟..│.⍟...│..⍟..│.....│.....│.....│.....│.....│
│⍟.⍟..│..⍟⍟.│...⍟.│.⍟.⍟.│...⍟.│..⍟..│...⍟.│.....│
│.⍟⍟..│.⍟⍟..│.⍟⍟⍟.│..⍟⍟.│.⍟.⍟.│...⍟⍟│....⍟│..⍟.⍟│
│.....│.....│.....│..⍟..│..⍟⍟.│..⍟⍟.│..⍟⍟⍟│...⍟⍟│
│.....│.....│.....│.....│.....│.....│.....│...⍟.│
└─────┴─────┴─────┴─────┴─────┴─────┴─────┴─────┘

Stencil Lives

It has long been known that stencil facilitates Game of Life computations. Eugene McDonnell explored the question in the APL88 paper Life: Nasty, Brutish, and Short [Ho51]. The shortest of the solutions derive as follows.

By hook or by crook, find all the 3-by-3 boolean matrices U which lead to a middle 1. A succinct Game of Life then obtains.

   B ← {1 1⌷life 3 3⍴(9⍴2)⊤⍵}¨ ⍳2*9
   U ← {3 3⍴(9⍴2)⊤⍵}¨ ⍸B  ⍝ ⍸ ←→ {⍵/⍳⍴⍵}

   life1 ← {U ∊⍨ {⊂⍵}⌺3 3⊢⍵}    ⍝ Eugene McDonnell

Comparing life and life1, and also illustrating that the toroidal and 0-border computations can be expressed one with the other.

   b←1=?97 103⍴3

   x←1 1↓¯1 ¯1↓ life 0,0,⍨0⍪0⍪⍨b
   y←life1 b
   x≡y
1

   g←{(¯1↑⍵)⍪⍵⍪1↑⍵}
   p←life b
   q←1 1↓¯1 ¯1↓ life1 (g b[;102]),(g b),(g b[;0])
   p≡q
1

Adám Brudzewsky points out that life can be terser as a train (fork):

   life1a ← U ∊⍨ ⊢∘⊂⌺3 3    ⍝ Adám Brudzewsky
   life1b ← U ∊⍨ {⊂⍵}⌺3 3

   (life1 ≡ life1a) b
1
   (life1 ≡ life1b) b
1

life1 is an example of implementing a calculation by look-up rather than by a more conventional computation, discussed in a recent blog post. There is a variation which is more efficient because the look-up is effected with integers rather than boolean matrices:

   A←3 3⍴2*⌽⍳9
   life2 ← {B[{+/,A×⍵}⌺3 3⊢⍵]}

   (life1 ≡ life1b) b
1

Jay Foad offers another stencil life, translating an algorithm in k by Arthur Whitney:

   life3 ← {3=s-⍵∧4=s←{+/,⍵}⌺3 3⊢⍵}    ⍝ Jay Foad

   (life1 ≡ life3) b
1

The algorithm combines the life rules into a single expression, wherein s←{+/,⍵}⌺3 3 ⊢⍵

(0) for 0-cells s is the number of neighbors; and
(1) for 1-cells s is the number of neighbors plus 1, and the plus 1 only matters if s is 4.

The same idea can be retrofit into the toroidal life:

   lifea←{3=s-⍵∧4=s←⊃+/,¯1 0 1∘.⊖¯1 0 1∘.⌽⊂⍵}

   (life ≡ lifea) b
1

Collected Definitions and Timings

life   ← {↑1 ⍵∨.∧3 4=+/,¯1 0 1∘.⊖¯1 0 1∘.⌽⊂⍵}
lifea  ← {3=s-⍵∧4=s←⊃+/,¯1 0 1∘.⊖¯1 0 1∘.⌽⊂⍵}

  B←{1 1⌷life 3 3⍴(9⍴2)⊤⍵}¨ ⍳2*9
  U←{3 3⍴(9⍴2)⊤⍵}¨ ⍸ B
  A←3 3⍴2*⌽⍳9

life1  ← {U ∊⍨ {⊂⍵}⌺3 3⊢⍵}
life1a ← U ∊⍨ ⊢∘⊂⌺3 3
life1b ← U ∊⍨ {⊂⍵}⌺3 3
life2  ← {B[{+/,A×⍵}⌺3 3⊢⍵]}
life3  ← {3=s-⍵∧4=s←{+/,⍵}⌺3 3⊢⍵}

   cmpx (⊂'life') ,¨ '1' '1a' '1b' '2' '3' '' 'a' ,¨ ⊂' b'
  life1 b  → 2.98E¯3 |    0% ⎕⎕⎕⎕⎕
  life1a b → 1.97E¯2 | +561% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
  life1b b → 2.99E¯3 |    0% ⎕⎕⎕⎕⎕
  life2 b  → 2.71E¯4 |  -91%
  life3 b  → 6.05E¯5 |  -98%
* life b   → 1.50E¯4 |  -95%
* lifea b  → 1.41E¯4 |  -96%

The * indicate that life and lifea give a different result (toroidal v 0-border).

life1a is much slower than the others because ⊢∘⊂⌺ is not implemented by special code.

{+/,⍵}⌺ is the fastest of the special codes because the computation has mathematical properties absent from {⊂⍵}⌺ and {+/,A×⍵}⌺.

The effect of special code v not, can be observed (for example) by use of redundant parentheses:

   cmpx '{+/,⍵}⌺3 3⊢b' '{+/,(⍵)}⌺3 3⊢b'
  {+/,⍵}⌺3 3⊢b   → 3.13E¯5  |      0%
  {+/,(⍵)}⌺3 3⊢b → 2.63E¯2  | +83900% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

   cmpx '{+/,A×⍵}⌺3 3⊢b' '{+/,A×(⍵)}⌺3 3⊢b'
  {+/,A×⍵}⌺3 3⊢b   → 2.42E¯4 |      0%
  {+/,A×(⍵)}⌺3 3⊢b → 2.98E¯2 | +12216% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

   ⍝ no special code in either of the following expressions
   cmpx '{+/,2×⍵}⌺3 3⊢b' '{+/,2×(⍵)}⌺3 3⊢b'
  {+/,2×⍵}⌺3 3⊢b   → 2.92E¯2 |      0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕ 
  {+/,2×(⍵)}⌺3 3⊢b → 3.03E¯2 |     +3% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

That life and lifea do not use stencil and yet are competitive, illustrates the efficacy of boolean operations and of letting primitives “see” large arguments.

Finally, if you wish to play with stencil a description and a hi-fi model of it can be found here.

Stencil Maneuvers

Introduction

The e-mail arrived in the early afternoon from Morten, in Finland attending the FinnAPL Forest Seminar.

How do I speed this up and impress the Finns?

0 cmpx 'e←⊃∨/0.2 edges¨r g b'
6.4E¯1
edges
{⍺←0.7 ⋄ 1 1↓¯1 ¯1↓⍺<(|EdgeDetect apply ⍵)÷1⌈(+⌿÷≢),⍵}
apply
{stencil←⍺ ⋄ {+/,⍵×stencil}⌺(⍴stencil)⊢⍵}
EdgeDetect
¯1 ¯1 ¯1
¯1 8 ¯1
¯1 ¯1 ¯1

(r g b in the attached ws)

Background

is stencil, a new dyadic operator which will be available in version 16.0. In brief, f⌺s applies the operand function f to windows of size s. The window is moved over the argument centered over every possible position. The size is commonly odd and the movement commonly 1. For example:

   {⊂⍵}⌺5 3⊢6 5⍴⍳30
┌───────┬────────┬────────┬────────┬───────┐
│0  0  0│ 0  0  0│ 0  0  0│ 0  0  0│ 0  0 0│
│0  0  0│ 0  0  0│ 0  0  0│ 0  0  0│ 0  0 0│
│0  1  2│ 1  2  3│ 2  3  4│ 3  4  5│ 4  5 0│
│0  6  7│ 6  7  8│ 7  8  9│ 8  9 10│ 9 10 0│
│0 11 12│11 12 13│12 13 14│13 14 15│14 15 0│
├───────┼────────┼────────┼────────┼───────┤
│0  0  0│ 0  0  0│ 0  0  0│ 0  0  0│ 0  0 0│
│0  1  2│ 1  2  3│ 2  3  4│ 3  4  5│ 4  5 0│
│0  6  7│ 6  7  8│ 7  8  9│ 8  9 10│ 9 10 0│
│0 11 12│11 12 13│12 13 14│13 14 15│14 15 0│
│0 16 17│16 17 18│17 18 19│18 19 20│19 20 0│
├───────┼────────┼────────┼────────┼───────┤
│0  1  2│ 1  2  3│ 2  3  4│ 3  4  5│ 4  5 0│
│0  6  7│ 6  7  8│ 7  8  9│ 8  9 10│ 9 10 0│
│0 11 12│11 12 13│12 13 14│13 14 15│14 15 0│
│0 16 17│16 17 18│17 18 19│18 19 20│19 20 0│
│0 21 22│21 22 23│22 23 24│23 24 25│24 25 0│
├───────┼────────┼────────┼────────┼───────┤
│0  6  7│ 6  7  8│ 7  8  9│ 8  9 10│ 9 10 0│
│0 11 12│11 12 13│12 13 14│13 14 15│14 15 0│
│0 16 17│16 17 18│17 18 19│18 19 20│19 20 0│
│0 21 22│21 22 23│22 23 24│23 24 25│24 25 0│
│0 26 27│26 27 28│27 28 29│28 29 30│29 30 0│
├───────┼────────┼────────┼────────┼───────┤
│0 11 12│11 12 13│12 13 14│13 14 15│14 15 0│
│0 16 17│16 17 18│17 18 19│18 19 20│19 20 0│
│0 21 22│21 22 23│22 23 24│23 24 25│24 25 0│
│0 26 27│26 27 28│27 28 29│28 29 30│29 30 0│
│0  0  0│ 0  0  0│ 0  0  0│ 0  0  0│ 0  0 0│
├───────┼────────┼────────┼────────┼───────┤
│0 16 17│16 17 18│17 18 19│18 19 20│19 20 0│
│0 21 22│21 22 23│22 23 24│23 24 25│24 25 0│
│0 26 27│26 27 28│27 28 29│28 29 30│29 30 0│
│0  0  0│ 0  0  0│ 0  0  0│ 0  0  0│ 0  0 0│
│0  0  0│ 0  0  0│ 0  0  0│ 0  0  0│ 0  0 0│
└───────┴────────┴────────┴────────┴───────┘
   {+/,⍵}⌺5 3⊢6 5⍴⍳30
 39  63  72  81  57
 72 114 126 138  96
115 180 195 210 145
165 255 270 285 195
152 234 246 258 176
129 198 207 216 147

In addition, for matrix right arguments with movement 1, special code is provided for the following operand functions:

{∧/,⍵}   {∨/,⍵}   {=/,⍵}   {≠/,⍵} 

{⍵}      {,⍵}     {⊂⍵}     {+/,⍵}

{+/,A×⍵}     {E<+/,A×⍵}       
{+/⍪A×⍤2⊢⍵}  {E<+/⍪A×⍤2⊢⍵}

The comparison < can be one of < ≤ = ≠ ≥ >.

The variables r g b in the problem are integer matrices with shape 227 316 having values between 0 and 255. For present purposes we can initialize them to random values:

   r←¯1+?227 316⍴256
   g←¯1+?227 316⍴256
   b←¯1+?227 316⍴256

Opening Moves

I had other obligations and did not attend to the problem until 7 PM. A low-hanging fruit was immediately apparent: {+/,⍵×A}⌺s is not implemented by special code but {+/,A×⍵}⌺s is. The difference in performance is significant:

   edges1←{⍺←0.7 ⋄ 1 1↓¯1 ¯1↓⍺<(|EdgeDetect apply1 ⍵)÷1⌈(+⌿÷≢),⍵}
   apply1←{stencil←⍺ ⋄ {+/,stencil×⍵}⌺(⍴stencil)⊢⍵}

   cmpx 'e←⊃∨/0.2 edges1¨r g b' 'e←⊃∨/0.2 edges ¨r g b'
  e←⊃∨/0.2 edges1¨r g b → 8.0E¯3 |     0% ⎕
  e←⊃∨/0.2 edges ¨r g b → 6.1E¯1 | +7559% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

I fired off an e-mail reporting this good result, then turned to other more urgent matters. An example of the good being an enemy of the better, I suppose.

When I returned to the problem at 11 PM, the smug good feelings have largely dissipated:

  • Why should {+/,A×⍵}⌺s be so much faster than {+/,⍵×A}⌺s? That is, why is there special code for the former but not the latter? (Answer: all the special codes end with … ⍵}.)
  • I can not win: If the factor is small, then why isn’t it larger; if it is large, it’s only because the original code wasn’t very good.
  • The large factor is because, for this problem, C (special code) is still so much faster than APL (magic function).
  • If the absolute value can somehow be replaced, the operand function can then be in the form {E<+/,A×⍵}⌺s, already implemented by special code. Alternatively, {E<|+/,A×⍵}⌺s can be implemented by new special code.

Regarding the last point, the performance improvement potentially can be:

   edges2←{⍺←0.7 ⋄ 1 1↓¯1 ¯1↓(⍺ EdgeDetect)apply2 ⍵}
   apply2←{threshold stencil←⍺ ⋄ 
                    {threshold<+/,stencil×⍵}⌺(⍴stencil)⊢⍵}

   cmpx (⊂'e←⊃∨/0.2 edges'),¨'21 ',¨⊂'¨r g b'
  e←⊃∨/0.2 edges2¨r g b → 4.8E¯3 |      0%
* e←⊃∨/0.2 edges1¨r g b → 7.5E¯3 |    +57%
* e←⊃∨/0.2 edges ¨r g b → 7.4E¯1 | +15489% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

The * in the result of cmpx indicates that the expressions don’t give the same results. That is expected; edges2 is not meant to give the same results but is to get a sense of the performance difference. (Here, an additional factor of 1.57.)

edges1 has the expression ⍺<matrix÷1⌈(+⌿÷≢),⍵ where and the divisor are both scalars. Reordering the expression eliminates one matrix operation: (⍺×1⌈(+⌿÷≢),⍵)<matrix. Thus:

   edges1a←{⍺←0.7 ⋄ 1 1↓¯1 ¯1↓(⍺×1⌈(+⌿÷≢),⍵)<|EdgeDetect apply1 ⍵}

   cmpx (⊂'e←⊃∨/0.2 edges'),¨('1a' '1 ' '  '),¨⊂'¨r g b'
  e←⊃∨/0.2 edges1a¨r g b → 6.3E¯3 |      0%
  e←⊃∨/0.2 edges1 ¨r g b → 7.6E¯3 |    +22%
  e←⊃∨/0.2 edges  ¨r g b → 6.5E¯1 | +10232% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

The time was almost 1 AM. To bed.

Further Maneuvers

Fresh and promising ideas came with the morning. The following discussion applies to the operand function {+/,A×⍵}.

(0) Scalar multiple: If all the elements of A are equal, then {+/,A×⍵}⌺(⍴A)⊢r ←→ (⊃A)×{+/,⍵}⌺(⍴A)⊢r.

   A←3 3⍴?17
   ({+/,A×⍵}⌺(⍴A)⊢r) ≡ (⊃A)×{+/,⍵}⌺(⍴A)⊢r
1

(1) Sum v inner product: {+/,⍵}⌺(⍴A)⊢r is significantly faster than {+/,A×⍵}⌺(⍴A)⊢r because the former exploits mathematical properties absent from the latter.

   A←?3 3⍴17
   cmpx '{+/,⍵}⌺(⍴A)⊢r' '{+/,A×⍵}⌺(⍴A)⊢r'
  {+/,⍵}⌺(⍴A)⊢r   → 1.4E¯4 |    0% ⎕⎕⎕
* {+/,A×⍵}⌺(⍴A)⊢r → 1.3E¯3 | +828% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

(The * in the result of cmpx is expected.)

(2) Linearity: the stencil of the sum equals the sum of the stencils.

   A←?3 3⍴17
   B←?3 3⍴17
   ({+/,(A+B)×⍵}⌺(⍴A)⊢r) ≡ ({+/,A×⍵}⌺(⍴A)⊢r) + {+/,B×⍵}⌺(⍴A)⊢r 
1

(3) Middle: If B is zero everywhere except the middle, then {+/,B×⍵}⌺(⍴B)⊢r ←→ mid×r where mid is the middle value.

   B←(⍴A)⍴0 0 0 0 9
   B
0 0 0
0 9 0
0 0 0
   ({+/,B×⍵}⌺(⍴B)⊢r) ≡ 9×r
1

(4) A faster solution.

   A←EdgeDetect
   B←(⍴A)⍴0 0 0 0 9
   C←(⍴A)⍴¯1
   A B C
┌────────┬─────┬────────┐
│¯1 ¯1 ¯1│0 0 0│¯1 ¯1 ¯1│
│¯1  8 ¯1│0 9 0│¯1 ¯1 ¯1│
│¯1 ¯1 ¯1│0 0 0│¯1 ¯1 ¯1│
└────────┴─────┴────────┘
   A ≡ B+C
1

Whence:

   ({+/,A×⍵}⌺(⍴A)⊢r) ≡ ({+/,B×⍵}⌺(⍴A)⊢r) + {+/,C×⍵}⌺(⍴A)⊢r ⍝ (2)
1
   ({+/,A×⍵}⌺(⍴A)⊢r) ≡ ({+/,B×⍵}⌺(⍴A)⊢r) - {+/,⍵}⌺(⍴A)⊢r   ⍝ (0)
1
   ({+/,A×⍵}⌺(⍴A)⊢r) ≡ (9×r) - {+/,⍵}⌺(⍴A)⊢r                ⍝ (3)
1

Putting it all together:

edges3←{
  ⍺←0.7
  mid←⊃EdgeDetect↓⍨⌊2÷⍨⍴EdgeDetect
  1 1↓¯1 ¯1↓(⍺×1⌈(+⌿÷≢),⍵)<|(⍵×1+mid)-{+/,⍵}⌺(⍴EdgeDetect)⊢⍵
}

Comparing the various edges:

   x←(⊂'e←⊃∨/0.2 edges'),¨('3 ' '2 ' '1a' '1 ' '  '),¨⊂'¨r g b'
   cmpx x
  e←⊃∨/0.2 edges3 ¨r g b → 3.4E¯3 |      0%
* e←⊃∨/0.2 edges2 ¨r g b → 4.3E¯3 |    +25%
  e←⊃∨/0.2 edges1a¨r g b → 6.4E¯3 |    +88%
  e←⊃∨/0.2 edges1 ¨r g b → 7.5E¯3 |   +122%
  e←⊃∨/0.2 edges  ¨r g b → 6.5E¯1 | +19022% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
edges original
edges1 uses {+/,stencil×⍵} instead of {+/,⍵×stencil}
edges1a uses (⍺×mean,⍵)<matrix instead of ⍺<matrix÷mean,⍵
edges2 uses {threshold<+/,stencil×⍵}
edges3 uses the faster {+/,⍵} instead of {+/,stencil×⍵} and other maneuvers

Fin

I don’t know if the Finns would be impressed. The exercise has been amusing in any case.

Calculation v Look-Up

(⎕io←0 and timings are done in Dyalog version 15.0.)

Table Look-Up

Some functions can be computed faster by table look-up than a more traditional and more conventional calculation. For example:

      b←?1e6⍴2

      cmpx '*b' '(*0 1)[b]'
 *b        → 1.32E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
 (*0 1)[b] → 1.23E¯3 | -91% ⎕⎕⎕

Some observations about this benchmark:

(a) The advantage of table look-up depends on the time to calculate the function directly, versus the time to do indexing, ⍺[⍵] or ⍵⌷⍺, plus a small amount of time to calculate the function on a (much) smaller representative set.

Indexing is particularly efficient when the indices are boolean:

      bb←b,1
      bi←b,2
      cmpx '2 3[bb]' '2 3 3[bi]'
 2 3[bb]   → 1.12E¯4 |    0% ⎕⎕⎕⎕⎕
 2 3 3[bi] → 7.39E¯4 | +558% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

bi is the same as bb except that the last element of 2 forces it to be 1-byte integers (for bi) instead of 1-bit booleans (for bb).

(b) Although the faster expression works best with 0-origin, the timing for 1-origin is similar.

      cmpx '*b' '{⎕io←0 ⋄ (*0 1)[⍵]}b'
 *b                   → 1.32E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
 {⎕io←0 ⋄ (*0 1)[⍵]}b → 1.22E¯3 | -91% ⎕⎕⎕

That is, if the current value of ⎕io is 1, the implementation can use a local ⎕io setting of 0.

(c) The fact that *b is currently slower than (*0 1)[b] represents a judgment by the implementers that *b is not a sufficiently common computation to warrant writing faster code for it. Other similar functions already embody the table look-up technique:

      cmpx '⍕b' '¯1↓,(2 2⍴''0 1 '')[b;]'
 ⍕b                   → 6.95E¯4 |  0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
 ¯1↓,(2 2⍴'0 1 ')[b;] → 6.92E¯4 | -1% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

Small Domains

Table look-up works best for booleans, but it also works for other small domains such as 1-byte integers:

      i1←?1e6⍴128    ⍝ 1-byte integers

      cmpx '*i1' '(*⍳128)[i1]'
 *i1         → 1.48E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
 (*⍳128)[i1] → 9.30E¯4 | -94% ⎕⎕

      cmpx '○i1' '(○⍳128)[i1]'
 ○i1         → 2.78E¯3 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
 (○⍳128)[i1] → 9.25E¯4 | -67% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

In general, a table for 1-byte integers needs to have values for ¯128+⍳256. Here ⍳128 is used to simulate that in C indexing can be done on 1-byte indices without extra computation on the indices. In APL, general 1-byte indices are used as table[i1+128]; in C, the expression is (table+offset)[i].

Domains considered “small” get bigger all the time as machines come equipped with ever larger memories.

Larger Domains

Table look-up is applicable when the argument is not a substantial subset of a large domain and there is a fast way to detect that it is not.

      i4s←1e7+?1e6⍴1e5    ⍝ small-range 4-byte integers

      G←{(⊂u⍳⍵)⌷⍺⍺ u←∪⍵}

      cmpx '⍟i4s' '⍟G i4s'
 ⍟i4s   → 1.44E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
 ⍟G i4s → 9.59E¯3 | -34% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

The definition of G implements that the argument is not used directly as indices (as in (⊂⍵)⌷⍺⍺ u) but needed to be looked up, the u⍳⍵ in (⊂u⍳⍵)⌷⍺⍺ u. Thus both index-of and indexing are key enablers for making table look-up competitive.

§4.3 of Notation as a Tool of Thought presents a different way to distribute the results of a calculation on the “nub” (unique items). But it takes more time and space and is limited to numeric scalar results.

      H←{(⍺⍺ u)+.×(u←∪⍵)∘.=⍵}

      i4a←1e7+?1e4⍴1e3    ⍝ small-range 4-byte integers

      cmpx '⍟i4a' '⍟G i4a' '⍟H i4a'
 ⍟i4a   → 1.56E¯4 |       0%
 ⍟G i4a → 6.25E¯5 |     -60%
 ⍟H i4a → 1.78E¯1 | +113500% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

The benchmark is done on the smaller i4a as a benchmark on i4s founders on the ≢∪⍵ by ≢⍵ boolean matrix (for i4s, 12.5 GB = ×/ 0.125 1e5 1e6) created by H.

Mathematical Background

New Math” teaches that a function is a set of ordered pairs whose first items are unique:

*⍵: {(⍵,*⍵)|⍵∊C}      The set of all (⍵,*⍵) where is a complex number
⍟⍵: {(⍵,⍟⍵)|⍵∊C~{0}}  The set of all (⍵,⍟⍵) where is a non-zero complex number
○⍵: {(⍵,○⍵)|⍵∊C}      The set of all (⍵,○⍵) where is a complex number
⍕⍵: {(⍵,⍕⍵)|0=⍴⍴⍵}    The set of all (⍵,⍕⍵) where is a scalar

When is restricted (for example) to the boolean domain, the functions, the sets of ordered pairs, are more simply represented by enumerating all the possibilities:

*⍵: {(0,1), (1,2.71828)}
⍟⍵: {(0,Err), (1,0)}
○⍵: {(0,0), (1,3.14159)}
⍕⍵: {(0,'0'), (1,'1')}

Realized and Potential Performance Improvements

realized (available no later than 15.0)

boolean
1-byte integer
⍕ ∘.f
⍕     a∘⍕

potential (non-exhaustive list)

boolean
1-byte integer
2-byte integer
small-range 4-byte integer
* ⍟ | - ○   ! ⍳ a∘f f.g
* ⍟ | - ○ × ! ⍳ a∘f f.g ∘.f
* ⍟ | - ○ ×   ⍳ a∘f
* ⍟ |     ×

a is a scalar; f and g are primitive scalar dyadic functions.

Beauty and the Beast

beauty-and-the-beastFinally, the last accessory I ordered for my Raspberry Pi Zero (that’s the little red thing behind my keyboard) has arrived – an Acer 43″ ET430K monitor. The Zero won’t quite drive this monitor at its maximum resolution of 3840×2160 pixels, but as you can see, you get enough real estate to do real graphics with “ASCII Art” (well, Unicode Art, anyway)!

Some readers will recognise the image as the famous Mandelbrot Set, named after the mathematician Benoit Mandelbrot, who studied fractals. According to Wikipedia a fractal is a mathematical set that exhibits a repeating pattern displayed at every scale – they are interesting because they produce patterns that are very similar to things we can see around us – both in living organisms and landscapes – they seem to be part of the fundamental fabric of the universe.

Fractals are are also interesting because they produce images of staggering beauty, as you can see on the pages linked to above and one of our rotating banner pages, where a one-line form of the APL expression which produced the image is embedded:

Website_banner_MbrotW

The colourful images of the Mandelbrot set are produced by looking at a selection of points in the complex plane. For each point c, start with a value of 0 for z, repeat the computation z = c + z², and colour the point according to the number of iterations required before the function becomes “unbounded in value”.

In APL, the function for each iteration can be expressed as {c+⍵*2}, where c is the point and ⍵ (the right argument of the function) is z (initially 0, and subsequently the result of the previous iteration):

      f←{c+⍵*2} ⍝ define the function
      c←1J1     ⍝ c is 1+i (above and to the right of the set)
      (f 0)(f f 0)(f f f 0)(f f f f 0) (f f f f f 0)
1J1 1J3 ¯7J7 1J¯97 ¯9407J¯193

If you are not familiar with complex numbers, those results may look a bit odd. While complex addition just requires adding the real and the imaginary parts independently, the result of multiplication of two complex numbers (a + bi) and (c + di) is defined as (ac-db) + (bc+ad)i. Geometrically, complex multiplication is a combined stretching and rotation operation.

Typing all those f’s gets a bit tedious, fortunately APL has a power operator , a second-order function that can be used to repeatedly apply a function. We can also compute the magnitude of the result number using |:

      (|(f⍣6)0)(|(f⍣7)0)
8.853E7 7.837E15

Let’s take a look at what happens if we choose a point inside the Mandelbrot set:

      c←0.1J0.1
      {|(f⍣⍵)0}¨ 1 2 3 4 5 6 7 8 9 10
0.1414 0.1562 0.1566 0.1552 0.1547 0.1546 0.1546 0.1546 0.1546 0.1546

Above, I used an anonymous function so I could pass the number of iterations in as a parameter, and use the each operator ¨ to generate all the results up to ten iterations. In this case, we can see that the magnitude of the result stabilises, which is why the point 0.1 + 0.1i is considered to be inside the Mandelbrot set.

Points which are just outside the set will require varying numbers of applications of f before the magnitude “explodes”, and if you colour points according to how many iterations are needed and pick interesting areas along the edge of the set, great beauty is revealed.

1200px-GalaxyOfGalaxies

The above image is in the public domain and was created by Jonathan J. Dickau using ChaosPro 3.3 software, which was created by Martin Pfingstl.

Our next task is to create array of points in the complex plane. A helper function unitstep generates values between zero and 1 with a desired number of steps, so I can vary the resolution and size of the image. Using it, I can create two ranges, multiply one of them by i (0J1 in APL) and use an addition table (∘.+) to generate the array:

      ⎕io←0 ⍝ use index origin zero
      unitstep←{(⍳⍵+1)÷⍵}
      unitstep 6
0 0.1667 0.3333 0.5 0.6667 0.8333 1
      c←¯3 × 0.7J0.5 - (0J1×unitstep 6)∘.+unitstep 6
      c
¯2.1J¯1.5 ¯1.6J¯1.5 ¯1.1J¯1.5 ¯0.6J¯1.5 ¯0.1J¯1.5 0.4J¯1.5 0.9J¯1.5
¯2.1J¯1   ¯1.6J¯1   ¯1.1J¯1   ¯0.6J¯1   ¯0.1J¯1   0.4J¯1   0.9J¯1  
¯2.1J¯0.5 ¯1.6J¯0.5 ¯1.1J¯0.5 ¯0.6J¯0.5 ¯0.1J¯0.5 0.4J¯0.5 0.9J¯0.5
¯2.1      ¯1.6      ¯1.1      ¯0.6      ¯0.1      0.4      0.9     
¯2.1J00.5 ¯1.6J00.5 ¯1.1J00.5 ¯0.6J00.5 ¯0.1J00.5 0.4J00.5 0.9J00.5
¯2.1J01   ¯1.6J01   ¯1.1J01   ¯0.6J01   ¯0.1J01   0.4J01   0.9J01  
¯2.1J01.5 ¯1.6J01.5 ¯1.1J01.5 ¯0.6J01.5 ¯0.1J01.5 0.4J01.5 0.9J01.5

The result is subtracted from 0.7J0.5 (0.7 + 0.5i) to get the origin of the complex plane slightly off centre in the middle, and multiplied the whole thing by 3 to get a set of values that brackets the Mandelbrot set.

Note that APL expressions are typically rank and shape invariant, so our function f can be applied to the entire array without changes. Since our goal is only to produce ASCII art, we don’t need to count iterations, we can just compare the magnitude of the result with 9 to decide whether the point has escaped:

      9<|f 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
      9<|(f⍣3) 0 ⍝ Apply f 3 times
1 1 1 0 0 0 1
1 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
1 0 0 0 0 0 0
1 1 1 0 0 0 1

We can see that points around the edge start escaping after 3 iterations. Using an anonymous function again, we can observe more and more points escape, until we recognise the (very low resolution) outline of the Mandelbrot set:


      ]box on
      {((⍕⍵)@(⊂0 0))' #'[9>|(f⍣⍵)0]}¨2↓⍳9
┌───────┬───────┬───────┬───────┬───────┬───────┬───────┐
│2######│3  ### │4      │5      │6      │7      │8      │
│#######│ ######│   ### │    #  │    #  │    #  │    #  │
│#######│#######│ ##### │  #### │  #### │   ### │   ### │
│#######│#######│###### │ ##### │ ##### │ ##### │ ####  │
│#######│#######│ ##### │  #### │  #### │   ### │   ### │
│#######│ ######│   ### │    #  │    #  │    #  │    #  │
│#######│   ### │       │       │       │       │       │
└───────┴───────┴───────┴───────┴───────┴───────┴───────┘

The last example allowed me to sneak in a preview of the new “at” operator coming in Dyalog v16.0. It is a functional merge operator that I am using to insert the formatted right argument (the number of iterations) into position (0 0) of each matrix.

If I use our remote IDE (RIDE) on my Windows machine and connect to the Pi, I can have an APL session on the Pi with 3840×2160 resolution. In this example, I experimented with grey scale “colouring” the result by rounding down and capping the result at 10, and using the character ⍟ (darkest) for 0, ○ for 1-5, × for 6-9, and blank for points that “escaped”, by indexing into an array of ten characters:

      '⍟○○○○○×××× '[10⌊⌊|(f⍣9)c]

Who needs OpenGL?! (click on the image to enlarge)

mandelbrot-ride4

Charting Reaction Times on the Raspberry Pi

Earlier this week I collected some reaction timer data on my Pi using the BBC micro:bit as an input device. I only produced an “ASCII art” chart at the time:

      times←ReactionTimer.Play
      times
251 305 294 415 338 298 294 251 378
      ReactionTimer.AsciiChart times
425| 
400|    * 
375|         *
350+ 
325|     * 
300|  * 
275|   *  ** 
250+ *      * 

Retro is back in style – but of course I should point out that we can produce “proper” graphics using SharpPlot, a cross-platform graphics package that is included with Dyalog APL on all platforms. I’ve enhanced the ReactionTimer namespace with a function called SPHistogram. If you are using RIDE as the front end to APL on the Pi, this will render output from SharpPlot in a window:

    ReactionTimer.SPHistogram times

Reaction Time Histogram

The original ASCII chart simply plotted the observations in the order that they occurred. Above, the shaded areas show how many observations there were in each bucket (2 between 250ms and 275ms, 3 between 275ms and 300ms, and so on). At the same time, the individual observations can be seen along the X axis as vertical red lines. It is a little unfortunate that, presumably due to some artifact of the timing process, the values 251 and 294 occur twice – and these lines are drawn on top of each other.

The example highlights one of the things that makes SharpPlot a bit special: We have overlaid a histogram showing the frequency of reactions in each bucket, and used a “scatterplot” to where the Y value is always 0, to mark the individual observations along the X axis.

     ∇ SPHistogram times;heading;renderHtml;sp;svg;z 
[1]   heading←'Reaction Times' 
[2]   renderHtml←3500⌶ ⍝ Render HTML in Window 
[3] 
[4]   :If 0=⎕NC'#.SharpPlot' ⋄ #.⎕CY'sharpplot.dws' ⋄ :EndIf
[5] 
[6]   sp←⎕NEW #.SharpPlot(432 250) 
[7]   sp.Heading←heading 
[8] 
[9]   ⍝ Draw histogram 
[10]  sp.ClassInterval←25 
[11]  sp.SetXTickMarks 25 
[12]  sp.HistogramStyle←#.HistogramStyles.SurfaceShading 
[13]  sp.SetFillStyles #.FillStyle.Opacity30 
[14]  sp.DrawHistogram⊂times 
[15] 
[16]  ⍝ Add observations using ScatterPlot 
[17]  sp.SetMarkers #.Marker.UpTick 
[18]  sp.SetPenWidths 1 
[19]  sp.SetMarkerScales 3 
[20]  sp.DrawScatterPlot(times×0)(times) 
[21] 
[22]  ⍝ Render SVG and display in window 
[23]  svg←sp.RenderSvg #.SvgMode.FixedAspect 
[24]  z←heading renderHtml svg 
     ∇ 

Hopefully the code is more or less self-explanatory, but if you’d like to learn more about SharpPlot there is excellent documentation at http://sharpplot.com. The documentation is actually written for C# users, but there is an illustration of how to translate the documentation to APL (and VB) at http://www.sharpplot.com/Languages.htm.