Exploring Key

In ⍺ f⌸ ⍵, major cells of specify keys for the corresponding major cells of , and f applies to each unique major cell of and the major cells of having that key. The monadic case f⌸⍵ is equivalent to ⍵ f⌸ ⍳≢⍵. Key is similar to the GROUP BY clause in SQL.

For example:

   ⎕io←0

   (⍳11) ,[¯0.5] 'Mississippi'
0 1 2 3 4 5 6 7 8 9 10
M i s s i s s i p p  i

   {⍺}⌸'Mississippi'                 {⍺⍵}⌸'Mississippi'
Misp                              ┌─┬────────┐
                                  │M│0       │
   {⊂⍵}⌸'Mississippi'             ├─┼────────┤
┌─┬────────┬───────┬───┐          │i│1 4 7 10│
│0│1 4 7 10│2 3 5 6│8 9│          ├─┼────────┤
└─┴────────┴───────┴───┘          │s│2 3 5 6 │
                                  ├─┼────────┤
   {≢⍵}⌸'Mississippi'             │p│8 9     │
1 4 4 2                           └─┴────────┘

We illustrate key by using it to model some APL primitives and to motivate and solve a programming puzzle.

Index-Of and Friends

Since key is defined in terms of unique and indices, we expect to be able to effect members of the index-of family. And so we can.

The specifications say that f applies to each unique major cell of . Therefore, {⍺}⌸ and equivalently ⊣⌸ computes unique. Since the specifications are for unique major cells, these are what ∪⍵ would be if it were extended to non-vector arguments, analagous to the way was extended to non-vector left arguments.

   x←↑'Aspen' 'John' 'Susan' 'Roger' 'Opal' 'John' 'Aspen'
   x
Aspen
John 
Susan
Roger
Opal 
John 
Aspen

   {⍺}⌸x                  ⊣⌸x
Aspen                  Aspen
John                   John 
Susan                  Susan
Roger                  Roger
Opal                   Opal 

A little surprisingly, key can also be used to model extended index-of and extended membership. For reasons which will become clear, we model (AKA ∊⍨) instead of . Both models use the merge operator, whimsically denoted here by .

ix  ← {(≢⍺) ↓ (0⍴⍨(≢⍺)+≢⍵) ((∊i)→)⍨ (≢¨i)/(≢⍺)⌊⊃¨i←{⊂⍵}⌸⍺⍪⍵}
has ← {(≢⍺) ↓ (0⍴⍨(≢⍺)+≢⍵) ((∊i)→)⍨ (≢¨i)/(≢⍺)>⊃¨i←{⊂⍵}⌸⍺⍪⍵}

   y←↑'China' 'Susan' 'John' 'John' 'Anne' 'Roger'
   y
China
Susan
John 
John 
Anne 
Roger

   x ix y
7 2 1 1 7 3

   x has y
0 1 1 1 0 1

To focus on essentials, ix and has do not directly handle the case where the right argument has higher rank than the left argument (the principal argument). Such higher rank arguments can be handled by pre-application of {↑,⊂⍤(¯1+⍴⍴⍺)⊢⍵} and post-application of ((1-⍴⍴⍺)↓⍴⍵)⍴.

Partition

Since key can effect non-contiguous partitions, we expect to be able to use it to effect contiguous partitions, a special case of the former. And so we can:

   part ← {(+\b/⍺){⊂⍵}⌸b⌿⍵⊣b←∨\⍺}

   x
Aspen
John 
Susan
Roger
Opal 
John 
Aspen
   0 1 0 1 0 0 1 part x
┌─────┬─────┬─────┐
│John │Roger│Aspen│
│Susan│Opal │     │
│     │John │     │
└─────┴─────┴─────┘
   0 1 0 1 0 0 1 (⊂[0] ≡ part) x
1
   (7⍴0) (⊂[0] ≡ part) x
1

Sort and Grade

sort  ← {⍺⌿⍨¯1+{≢⍵}⌸⍺⍪⍵}
grade ← {(≢⍺)-⍨∊1↓¨{⊂⍵}⌸⍺⍪⍵}

If is an array from a small “alphabet” , then ⍺ sort ⍵ sorts and ⍺ grade ⍵ grades it. For example:

   x←?1e6⍴256                        a←⎕a[?1e6⍴26]

   (⍋x)  ≡ (⍳256) grade x            (⍋a)  ≡ ⎕a grade a
1                                 1
   x[⍋x] ≡ (⍳256) sort  x            a[⍋a] ≡ ⎕a sort  a
1                                 1

The Maximum of a Vector

In Dyalog ’13 session D11, I included a slide giving relative timings on the then new key operator:

x←?1e6⍴1000

a. ¯1+{≢⍵}⌸(⍳1000),x     1.00
b. ¯1+{≢⍵}⌸(⍳1+⌈/x),x    1.62
c. ...

The timings show that key with tally takes less than a factor of 2 over finding the maximum. This is fast; unbelievably fast even. The following puzzle arose from investigations into this timing: Find the maximum of a vector of unsigned 1-byte integers without using multicore, vector instructions, loop unrolling, etc. Can you do it faster in C than the following?

max=*x++; 
for(i=1;i<n;++i){if(max<*x)max=*x; ++x;}

I posed the puzzle to some expert C programmers, and they failed to solve it. An “obvious” solution obtains by considering an obvious implementation of {≢⍵}⌸x in C:

M=256;
memset(c,0,M*4);              // initialize table of counts
for(i=0;i<n;++i)c[*x++]++;    // analoguous to c[x]+←1 in APL

Therefore, one way to find the maximum of x is to have an M-element (byte) Boolean table b, and:

memset(b,0,M);
for(i=0;i<n;++i)b[*x++]=1;    // analoguous to b[x]←1 in APL
c=b+M;                        // scan from the back
for(i=0;i<M;++i)if(*--c){max=c-b; break;}

Timing show that the table method is faster than the comparison method by a factor of 1.5 for 1-byte integers and 1.3 for 2-byte integers. The table method requires more code, but the main n-loop, the time for which dominates the overall time, is simpler for the table method.

Jay Foad countered that finding the maximum (and the minimum) is much faster still with the vector instructions available in modern CPUs. Dyalog has now (version 14.1) implemented such finding, with performance benefits for key, the index-of family, grade, and sort.