The 2021 APL Problem Solving Competition: Phase I – Best of Breed

By: Stefan Kruger
Stefan works for IBM making databases. He tries to learn at least one new programming language a year, and a few years ago he got hooked on APL and participated in the competition. This is his perspective on some solutions that the judges picked out – call it the “Judges’ Pick”, if you like; smart, novel, or otherwise noteworthy solutions that can serve as an inspiration.


Congratulations to all the winners of the 2021 APL Problem Solving Competition (you can learn more about the phase 2 winners in this article) and well done to Dzintars Klušs who won the Grand Prize. At the recent Dyalog ’21 user meeting, we got to enjoy the runner-up, Victor Ogunlokun, walking us through his solutions live.

In this post I’ll go through some great solutions that were submitted (and some that weren’t submitted) to the Phase I problems so that we can all marvel in the ingenuity and perhaps learn a thing or two. If you’re feeling inspired by the end, go ahead and participate in this year’s round which just launched.

If you’re new to the APL Problem Solving Competition, Phase I problems tend to be short and the expectation is that solutions will be “one-liners” (dfns). However, although it might seem like it from some of the solutions here, this isn’t a code golf competition! Solutions are judged holistically: do they solve the problem, are they efficient, and are they clear? Even though a few test cases are given, there is no guarantee that your solution is correct just because it works for the example data. The judging process involves running the code on many hidden test cases too. Crucially, just because your code is accepted, it doesn’t necessarily mean that you’ll get full marks.

As with my blog post that reviewed the 2020 Phase II solutions, I’ve included a more in-depth examination of one or two problems.

Problem 1: Are You a Bacteria?

Something from the excellent Project Rosalind problem collection, the task is to compute the combined percentage of guanine (G) and cytosine (C) in a given DNA-string.

Efficiency can vary a lot, depending on whether summation or multiplication (or even division!) is performed first. Some solutions were also leading-axis oriented.

Here’s my solution:

      {100×(+⌿⍵∊'CG')÷≢⍵} 'ACGTACGTACGTACGT'
50

which several competitors made more tacit with:

      {100×(+⌿÷≢)⍵∊'GC'} 'ACGTACGTACGTACGT'
50

or even went further:

      (100×≢÷⍨1⊥∊∘'GC') 'ACGTACGTACGTACGT'
50

If you’re unfamiliar with the 1⊥ trick, it’s a way of summing a vector:

      1⊥6 3 9 8 12 62
100

It’s perhaps not immediately obvious why this should work. Here’s one explanation. Assume we want to sum the vector 1 0 2 0 0. We can do this in a very convoluted way by using a sum inner product with a vector of exponentials: [14, 13, 12, 11, 10]:

      (1*4 3 2 1 0)+.×1 0 2 0 0
3

If we expand the exponentials to the left we get a vector of 1s. We can then break apart the inner product by turning +. to a +⌿ to the left:

      +⌿1 1 1 1 1×1 0 2 0 0
3

This is the textbook definition of 1⊥! Look:

      1⊥1 0 2 0 0
3

which, to be clear, is just the sum-reduce-first:

      +⌿1 0 2 0 0
3

Using 1⊥ to sum has two advantages over the more obvious formulation +⌿. Firstly, it’s easier to use in tacit formulations as it doesn’t require an operator, and secondly, it’s usually faster. The reasons for it being quicker is somewhat beyond the scope of this post, but it’s to do with 1⊥ making no guarantees about the ordering of operations, meaning that the interpreter is free to vectorise more efficiently.

Problem 2: Index-Of Modified

This problem wanted us to write a function that behaves like the APL Index Of function R←X⍳Y except that it should return 0 for elements of Y not found in X.

I wrote:

      p2 ← {0@((≢⍺)∘<)⊢⍺⍳⍵}
      2 3 p2 ⍳5
0 1 2 0 0

which is basically saying “change all instances of numbers greater than the length of the argument to zero”, which is how X⍳Y presents values that are not found.

Some very different solutions were submitted, for example:

      p2 ← ⍳|⍨1+≢⍤⊣
      2 3 p2 ⍳5
0 1 2 0 0

which is simply:

      p2 ← {(1+≢⍺)|⍺⍳⍵} ⍝ dfn of the above
      2 3 p2 ⍳5
0 1 2 0 0

Another option would have been to multiply ⍺⍳⍵ with ≢⍺, although no-one submitted exactly this:

      p2 ← ≢⍤⊣(≥×⊢)⍳
      2 3 p2 ⍳5
0 1 2 0 0

which could have been written explicitly as:

      p2 ← {m×(≢⍺)≥m←⍺⍳⍵} ⍝ dfn of the above
      2 3 p2 ⍳5
0 1 2 0 0

Problem 3: Multiplicity

Write a function that:

  • has a right argument Y which is an integer vector or scalar
  • has a left argument X which is also an integer vector or scalar
  • finds which elements of Y are multiples of each element of X and returns them as a vector (in the order of X) of vectors (in the order of Y).

Some test data was provided:

      X ← 2 4 7 3 9
      Y ← 5 7 8 1 12 10 20 16 11 4 2 15 3 18 14 19 13 9 17 6

I wrote something that, in retrospect, looks somewhat clumsy:

      p3 ← {⍵∘{⍵/⍺}¨↓0=⍺∘.|,⍵}
      X p3 Y
┌─────────────────────────┬────────────┬────┬──────────────┬────┐
│8 12 10 20 16 4 2 18 14 6│8 12 20 16 4│7 14│12 15 3 18 9 6│18 9│
└─────────────────────────┴────────────┴────┴──────────────┴────┘

which can be expressed more compactly as:

      p3 ← {/∘⍵¨↓0=⍺∘.|⍥,⍵}
      X p3 Y
┌─────────────────────────┬────────────┬────┬──────────────┬────┐
│8 12 10 20 16 4 2 18 14 6│8 12 20 16 4│7 14│12 15 3 18 9 6│18 9│
└─────────────────────────┴────────────┴────┴──────────────┴────┘

or:

      X (⊢⊂⍤/⍤1⍨0=∘.|⍥,) Y
┌─────────────────────────┬────────────┬────┬──────────────┬────┐
│8 12 10 20 16 4 2 18 14 6│8 12 20 16 4│7 14│12 15 3 18 9 6│18 9│
└─────────────────────────┴────────────┴────┴──────────────┴────┘

although no-one actually submitted that, to everyone’s credit.

Problem 4: Square Peg, Round Hole

Write a function that:

  • takes a right argument which is an array of positive numbers representing circle diameters
  • returns a numeric array of the same shape as the right argument representing the difference between the areas of the circles and the areas of the largest squares that can be inscribed within each circle.

I had to read that many times before it sank in. The key to achieve something snappy is to really work through the maths until it is as compact as possible, which, if you’re anything like me, you didn’t bother to do.

My attempt was:

      p4 ← {(○2*⍨⍵÷2)-2÷⍨⍵*2}

but there are much neater solutions if you did your homework. Here’s one that no-one found:

      p4 ← (○-+⍨)4÷⍨×⍨

and a nice explicit version:

      p4 ← {⍵×⍵×0.5-⍨○÷4}

which can be derived from this simplified mathematical expression, suggested by Rodrigo:


Explanation: The area of the circle is ○r*2, which is ○(⍵÷2)*2, in turn equivalent to ⍵×⍵×○÷4. The area of the square [ABCD] is twice the area of the triangle [ABC]. Given that the area of the triangle is 0.5×⍵×⍵÷2, the area of the square becomes 0.5×⍵×⍵. Putting both together, we get (⍵×⍵×○÷4)-⍵×⍵×0.5, the same as ⍵×⍵×(○÷4)-0.5, which is ⍵×⍵×0.5-⍨○÷4.

Square inside a circle with its diagonal as the circle's diameter

Problem 5: Rect-ify

For this problem, we were asked to plant a number of trees in a rectangular pattern with complete rows and columns, meaning all rows have the same number of trees. That rectangular pattern also needed to be as “square as possible”, meaning there is a minimal difference between the number of rows and columns in the pattern.

Here’s a smart solution, based on the observation that the “most square” choice must have one factor being the largest factor less than or equal to the square root:

      p5 ← {N,⍵÷1⌈N←⌈/0,⍵∨⍳⌊⍵*÷2}

This solution works well on large numbers of trees, too:

      p5 98776512304
280888 351658

Someone even offered up a recursive solution:

      p5rec ← {⍵=0:2⍴0 ⋄ ⍵ {0=⍵|⍺: ⍵,⍺÷⍵ ⋄ ⍺∇⍵-1} ⌊⍵*÷2}

So is one solution better than the other? Well, they both work correctly, but one is a lot faster than the other. Do you want to guess which was faster before we test it?

      'cmpx'⎕CY'dfns'
      cmpx 'p5 98776512304' 'p5rec 98776512304'
  p5 98776512304    → 8.7E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
  p5rec 98776512304 → 6.1E¯3 | -94% ⎕⎕⎕

Surprised? I was! So, what is going on here? The non-recursive solution relies on a rather crude way to find the factors, which is a fairly large number to factorise even if it only needs to go up to the square root. The recursive version just tries each number in turn, up to the square root.

Can we be even smarter? This version was offered up by APL Orchard regular @rak1507:

      p5rak1507 ← {a,⍵÷1⌈a←⊃⌽⍸0=⍵|⍨⍳⌊⍵*.5} ⍝ @rak1507
      cmpx 'p5 98776512304' 'p5rec 98776512304' 'p5rak1507 98776512304'
  p5 98776512304        → 8.7E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
  p5rec 98776512304     → 6.3E¯3 | -93% ⎕⎕⎕                                     
  p5rak1507 98776512304 → 4.2E¯3 | -96% ⎕⎕

Basically, (⊢∨⍳) is neat as a code-golf trick, but not great in terms of efficiency.

Problem 6: Fischer Random Chess

According to Wikipedia, Fischer random chess is a variation of the game of chess invented by former world chess champion Bobby Fischer. Fischer random chess employs the same board and pieces as standard chess, but the starting position of the non-pawn pieces on the players’ home ranks is randomised, following certain rules.

White’s non-pawn pieces are placed on the first rank according to the following rules:

  • the Bishops must be placed on opposite-colour squares
  • the King must be placed on a square between the rooks.

The task was to write a function that verifies that a given board placement is valid according to these rules.
This was my solution for this:

      q6 ← {(1=+/(⍵⍳'K')>⍸'R'=⍵)∧1=+/2|⍸'B'=⍵}

but there was a lot of variety in the solutions submitted to this problem. For example:

      q6i  ← {≠/2|⍸'B'=⍵}∧'RKR'≡∩∘'RK'        ⍝ Intersection
      q6ii ← {(≠/2|⍸'B'=⍵)∧1=(⍸'R'=⍵)⍸⍵⍳'K'}  ⍝ Interval index
      q6w  ← {(≠/2|⍸'B'=⍵)∧≠/(⍸'K'=⍵)<⍸'R'=⍵} ⍝ Where (similar to mine above)

The last version there is very amenable to Over:

            q6over ← {I←⍸=∘⍵ ⋄ (2|I'B')∧⍥(≠/)'K'<⍥I'R'}

And for masochists, there is always the famous Progressive Dyadic Iota:

      pd ← {((⍴⍺)⍴⍋⍋⍺,⍵)⍺⍺(⍴⍵)⍴⍋⍋⍵,⍺}
      q6pdi ← {(∧/2\</⍵⍳pd'RKR')∧≠/2|⍵⍳pd'BB'}

Problem 7: Can You Feel the Magic?

A square matrix is ‘magic’ if all of its rows and columns and both diagonals sum to the same number.

One hero came up with the following:

      q7 ← (∧/2=/∘∊+/,(+/1 2 2∘⍉))⍉,[0.5]⌽

Here is how it works:

      q7 magic←⎕←3 3⍴4 9 2 3 5 7 8 1 6
4 9 2
3 5 7
8 1 6
1
      q7 nonmagic←⎕←3 3⍴4 9 2 7 5 3 8 1 6
4 9 2
7 5 3
8 1 6
0

The problem statement suggested that dyadic transpose might come in handy, but that’s just showing off! So, how does it work? It’s certainly tacit:

              q7            ⍝ Ouch...
    ┌─────────┴─────────┐
  ┌─┴─┐               ┌─┼─────┐
  / ┌─┼──────┐        ⍉ [0.5] ⌽
┌─┘ 2 ∘    ┌─┼───┐    ┌─┘
∧    ┌┴┐   / , ┌─┴──┐ ,
     / ∊ ┌─┘   /    ∘
   ┌─┘   +   ┌─┘ ┌──┴──┐
   =         +   1 2 2 ⍉

The fork ⍉,[0.5]⌽ takes the argument matrix – a square array of rank-2, shape A A – and returns an array of rank-3, shape 2 A A, where the first cell is the transposed original array and the second is the original array with its rows reversed:

      ]display (⍉,[0.5]⌽) magic
┌┌→────┐
↓↓4 3 8│
││9 5 1│
││2 7 6│
││     │
││2 9 4│
││7 5 3│
││6 1 8│
└└~────┘

We only need to know about the main diagonal of each cell; as you can see, the main diagonal in the second cell is the reverse diagonal of the first cell. We can extract both diagonals with a single dyadic transpose:

      1 2 2⍉(⍉,[0.5]⌽) magic
4 5 6
2 5 8

The same result can be achieved using slightly less showy instead, which has the same byte count but is a little easier to understand when first seen:

      1 1⍉⍤2(⍉,[0.5]⌽) magic ⍝ Diagonals of each major cell love ⍤
4 5 6
2 5 8

The remaining part of the tacit formulation untangles easily. Impressive and creative.

Here’s another good one that is slightly shorter:

      q7 ← (1=≢∘∪)⍉+⌿⍤,⍥(1 1∘⍉,⊢)⌽
      q7 magic
1
      q7 nonmagic
0

How does that work? The phrase 1 1∘⍉,⊢ prepends the diagonal as a column to the argument matrix:

      (1 1∘⍉,⊢)magic ⍝ Explicit: {(1 1⍉⍵),⍵}
4 4 9 2
5 3 5 7
6 8 1 6

Clever application of says “take the matrix, append its reverse over the diagonal-append operation”:

      {⍵,⍥{(1 1⍉⍵),⍵}⌽⍵} magic ⍝ love ⍥
4 4 9 2 2 2 9 4
5 3 5 7 5 7 5 3
6 8 1 6 8 6 1 8

We can emphasise the location of the diagonals by using Partitioned Enclose to make them stand out a bit:

      1 1 0 0 1 1 0 0 ⊂ {⍵,⍥{(1 1⍉⍵),⍵}⌽⍵} magic
┌─┬─────┬─┬─────┐
│4│4 9 2│2│2 9 4│
│5│3 5 7│5│7 5 3│
│6│8 1 6│8│6 1 8│
└─┴─────┴─┴─────┘

Summing along the leading axis gives:

      {+⌿⍵,⍥{(1 1⍉⍵),⍵}⌽⍵} magic
15 15 15 15 15 15 15 15

Finally, check that all items are equal:

      {1=≢∪+⌿⍵,⍥{(1 1⍉⍵),⍵}⌽⍵} magic ⍝ Length of vector of unique values = 1?
1

In summary, there are two things to note here: using to get both diagonals and the use of 1=≢∘∪ to check that all items are equal. If you attended the APL Seeds ’21 conference last March, you’ll recognise this as one of the many ways of solving this problem that Conor Hoekstra presented – see https://dyalog.tv/APLSeeds21/?v=GZuZgCDql6g to watch his presentation.

Any solution that makes use of both of my favourite glyphs ( and ) is a winner in my book.

Problem 8: Time to Make a Difference

Write a function that:

  • has a right argument that is a numeric scalar or vector of length up to 3, representing a number of [[[days] hours] minutes] – a single number represents minutes, a 2-element vector represents hours and minutes, and a 3-element vector represents days, hours, and minutes
  • has a similar left argument, although not necessarily the same length as the right argument
  • returns a single number representing the magnitude of the difference between the arguments in minutes.

Here’s a cool version (several submissions were similar):

      p8 ← |-⍥(1 24 60⊥¯3∘↑)

Nothing too mysterious here. A slight complication is the need to handle a right argument that can be a scalar or a vector of length 2 or 3. The decode function expects the argument vector to always be length 3, so we use the take function, dyadic , with ¯3 as the left argument to ensure that the argument is always a vector of the correct length, padding from the left with zeros as required. The mixed radix vector 1 24 60 as the left argument to decode converts to minutes.

Problem 9: In the Long Run

Write a function that:

  • has a right argument that is a numeric vector of 2 or more elements representing daily prices of a stock
  • returns an integer singleton that represents the highest number of consecutive days where the price increased, decreased, or remained the same, relative to the previous day.

I’d like to compare and contrast two solutions, neither of which are tacit for a change:

      p9a ← {≢⍉↑⊂⍨1,2≠/×2-/⍵}
      p9b ← {⌈/¯2-/0,⍸1,⍨2≠/×2-/⍵} ⍝ Flat efficiency

Starting with the first of the two (p9a), from the right, we use a windowed difference reduction to calculate pairwise differences:

      2-/1 3 5 6 6 6 6 6 3 2 1
¯2 ¯2 ¯1 0 0 0 0 3 1 1

and then apply the direction function, monadic ×, to turn this into a vector of ¯1, 0 and 1 if the corresponding item is negative, zero or positive respectively:

      ×2-/1 3 5 6 6 6 6 6 3 2 1
¯1 ¯1 ¯1 0 0 0 0 1 1 1

Another pairwise windowed reduction, this time with , gives us the points of change:

      2≠/×2-/1 3 5 6 6 6 6 6 3 2 1
0 0 1 0 0 0 1 0 0

Prepending a 1, this Boolean vector can be used as the left argument to partitioned enclose, ; a common pattern. But what of the right argument? We can use the same vector as the right argument by using a clever commute, :

      ⊢m←⊂⍨1,2≠/×2-/1 3 5 6 6 6 6 6 3 2 1 ⍝ Commute to use the same argument left and right
┌─────┬───────┬─────┐
│1 0 0│1 0 0 0│1 0 0│
└─────┴───────┴─────┘

What remains is to find the longest cell in this vector. We could do ⌈/≢¨, but instead this submission found the length of the transpose-mix:

      ≢⍉↑ m
4

A code-golfer’s trick shot, perhaps, and somewhat dubious in terms of efficiency, but certainly cute. If you don’t see why it works, work it through right to left!

The second solution (p9b) uses a lot of the same ideas, but this time we add a 1 to the end of the points-of-change vector:

      {1,⍨2≠/×2-/⍵}1 3 5 6 6 6 6 6 3 2 1
0 0 1 0 0 0 1 0 0 1

and use where, monadic , to get the indices, prepending a 0 so that we can calculate the length of each segment:

      {0,⍸1,⍨2≠/×2-/⍵}1 3 5 6 6 6 6 6 3 2 1
0 3 7 10

The pairwise difference now represents the length of each segment, and by using a negative window we can commute each pair to get a positive number out for each pair:

      {¯2-/0,⍸1,⍨2≠/×2-/⍵}1 3 5 6 6 6 6 6 3 2 1
3 4 3

and so, for the maximum:

      {⌈/¯2-/0,⍸1,⍨2≠/×2-/⍵}1 3 5 6 6 6 6 6 3 2 1
4

Shall we race them? Of course!

      data ← 10000?10000
      cmpx 'p9a data'
  p9a data → 2.7E¯4 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
  p9b data → 2.1E¯5 | -92% ⎕⎕⎕

The second version is faster for several reasons. We suspected already that the ‘cute’ way to find the longest vector in a nested vector was likely to be slow, as it has to create a huge matrix first, chasing pointers. The second version uses flat numeric vectors throughout, and cuts the work considerably by using where initially to do length calculations on the shorter vector of indices. Flat is fast.

Problem 10: On the Right Side

Write a function that:

  • has a right argument T that is a character scalar, vector or vector of character vectors/scalars
  • has a left argument W that is a positive integer specifying the width of the result
  • returns a right-aligned character array R of shape ((2=|≡T)/≢T),W meaning that R is one of the following:
    • a W-wide vector if T is a simple vector or scalar
    • a W-wide matrix with the same number rows as elements of T if T is a vector of vectors/scalars
  • if an element of T has length greater than W, truncate it after W characters.

The last point is perhaps a bit misleading, but the intention is clear from one of the examples given:

      ⍴⎕←8 (your_function) 'Longer Phrase' 'APL' 'Parade'
r Phrase
     APL
  Parade
3 8

In this case, “truncate after W characters” means “remove from the left”.
Conceptually, we need to (over)take W characters from the right of each element and mix that into a rank-2 array. To make it work for the edge cases, we should ensure that we can always treat the right argument as a vector of character vectors, using nest, monadic . This works because if we take more characters than the vector contains, it gets padded using a character-vector’s prototype element, a space.

            8 {↑(-⍺)↑¨⊆⍵} 'Longer Phrase' 'APL' 'Parade'
r Phrase
     APL
  Parade

An equivalent tacit formulation would be:

            8 (↑-⍤⊣↑¨⊆⍤⊢) 'Longer Phrase' 'APL' 'Parade'
r Phrase
     APL
  Parade

Here’s a slight variation:

            8 {⌽⍉⍺↑⍉↑⌽¨⊆⍵}'Longer Phrase' 'APL' 'Parade'
r Phrase
     APL
  Parade

This starts by reversing each cell, then applies a mix and transpose. We then take items from the left before backing out of the transpose and reverse by applying them again.
It can be done in a flatter manner, too:

            8{⍉(-⍺)↑⍉(⊆⍵)⌽∘↑⍨(⊢-⌈/)≢¨⊆⍵} 'Longer Phrase' 'APL' 'Parade'
r Phrase
     APL
  Parade

If we flip the selfie and add a few spaces it gets a bit easier to see what’s going on:

            8 {⍉(-⍺)↑⍉ ((⊢-⌈/)≢¨⊆⍵) ⌽↑⊆⍵} 'Longer Phrase' 'APL' 'Parade'
r Phrase
     APL
  Parade

From the right, we turn our input into a character array and then Rotate each row by its length minus the length of the longest row, which implements the right alignment:

            {((⊢-⌈/)≢¨⊆⍵) ⌽↑⊆⍵} 'Longer Phrase' 'APL' 'Parade'
Longer Phrase
          APL
       Parade

What remains is the truncation, which follows similar lines to the earlier versions.

For completeness we can race a couple of these variants. Let’s generate a chunkier data set first: 10,000 random strings of varying lengths up to 50:

      data←{⎕A[?(?50)⍴26]}¨⍳10000      'cmpx'⎕CY'dfns'
      cmpx '20{↑(-⍺)↑¨⊆⍵}data' '20{⍉(-⍺)↑⍉(⊆⍵)⌽∘↑⍨(⊢-⌈/)≢¨⊆⍵}data' '20{⌽⍉⍺↑⍉↑⌽¨⊆⍵}data'
  20{↑(-⍺)↑¨⊆⍵}data                 → 9.3E¯4 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕             
  20{⍉(-⍺)↑⍉(⊆⍵)⌽∘↑⍨(⊢-⌈/)≢¨⊆⍵}data → 9.0E¯4 |  -3% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕              
  20{⌽⍉⍺↑⍉↑⌽¨⊆⍵}data                → 1.4E¯3 | +46% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

The complex, flat version wins, but not by a significant amount.

With that we’ve reached the end. A nice set of problems, with a lot of creative solutions submitted. Watch this space for a review of the Phase II problems…

Code Golf: Generating Empty Arrays

By: Stefan Kruger

Recently, I was looking for a Dyalog problem to pit my wits against, and I asked in the APL Orchard if Adám could make one up:

A CMC, if you’re unfamiliar with the APL Orchard, is a Chat Mini Challenge – typically an informal code golf challenge to produce the shortest possible bit of code that solves the set problem. Code golf practitioners usually delve into the dustiest corners of a language and, if practiced diligently, this can lead to a deep mastery over time, even if some dirty hacks techniques employed may be frowned upon in production code.

Anyway. Adám responded with the following:

The Mysterious Case of 0 0⍴0

Hmm. What even is that thing‽ It’s only 5 characters already – not much scope to shrink that, surely? Let’s have a look at it with full boxing:

      ⎕IO←0
      ]Box on -style=max
┌→────────────────┐
│Was ON -style=max│
└─────────────────┘
      0 0⍴0
┌⊖┐
⌽0│
└~┘

In the boxed output, the tells us that the leading axis has length 0, the means that the trailing axis has length 0, and the ~ means that the array is non-nested.

This is a simple numeric array of two dimensions, each of length 0 – think of it as a spreadsheet or table before you’ve put any rows or columns of data in it.

I found this problem difficult to get started with, so I searched for the expression on APL Cart, and discovered that:

      ]Box off
Was ON
      ]APLCart 0 0⍴0   ⍝ Dyalog 18.1 has APLCart built in! Fancy.
X,Y,Z:any M,N:num I,J:int A,B:Bool C,D:char f,g,h:fn ax:axis s:scal v:vec m:mat
───────────────────────────────────────────────────────────────────────────────
⍬⊤⍬  zero-by-zero numeric matrix                                               
───────────────────────────────────────────────────────────────────────────────
Showing 1 of 1 matches                                                         
      ]Box on
┌→──────┐
│Was OFF│
└───────┘

Surely not? Let’s try that suggestion:

      (0 0⍴0)≡⍬⊤⍬
1

Well, it clearly works, but why? Let’s see if we can figure that one out.

In the basic case for encode, X⊤Y, if X and Y are vectors, the result will have one column for each element in Y and one row for each element in X. Using the example from the language bar:

      2 2 2 2 ⊤ 5 7 12 ⍝ Convert decimal to binary
┌→────┐
↓0 0 1│
│1 1 1│
│0 1 0│
│1 1 0│
└~────┘

we have a shape of 4 3, as the length of the vector to the left (2 2 2 2) is 4 and the length of the vector to the right (5 7 12) is 3.

Returning to ⍬⊤⍬, given the above, we can deduce that the result should have a rank of 2 with a shape of 0 0 which, of course, is what we wanted to achieve:

      ⍴⍬⊤⍬ ⍝ Shape 0 0
┌→──┐
│0 0│
└~──┘

Disappointingly, I had to cheat by looking up the answer. However, are there any more length-3 solutions? Apparently, ⍬⊤⍬ is “reasonably well known”. I decided to write a simple brute-force search function to look for any other solutions.

This works as follows:

  1. Create all possible length-3 combinations of APL glyphs
  2. Evaluate each in turn, skipping those that error
  3. Save those that evaluate to 0 0⍴0

Here’s what I ended up with:

∇ r←Bruter target;glyphs;combo;eval 
  glyphs ← '0⍬+-×÷*⍟⌹○|⌈⌊⊥⊤⊣⊢=≠≤<>≥≡≢∨∧⍲⍱↑↓⊂⊃⊆⌷⍋⍒⍳⍸∊⍷∪∩~/\⌿⍀,⍪⍴⌽⊖⍉¨⍨⍣.∘⍤⍥@⌸⌺⍎⍕¯'
  r ← ⍬
  :For combo :In ,∘.,⍣2⍨glyphs
      :Trap 0
          eval ← ⍎combo
      :Else
          :Continue
      :EndTrap
      :If eval≡target
          r ,← ⊂combo
      :EndIf
  :EndFor
∇

I decided to leave out a few glyphs that I guessed would be unlikely to feature (←→⎕⍠⍞⍝⋄), and I only added the number 0. Let’s see what that generates:

      7 5⍴Bruter 0 0⍴0 ⍝ 7×5 layout added for clarity after the fact
┌→──────────────────────────────┐
↓ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ │
│ │⍬⊤⍬│ │+⌸⍬│ │-⌸⍬│ │×⌸⍬│ │÷⌸⍬│ │
│ └───┘ └───┘ └───┘ └───┘ └───┘ │
│ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ │
│ │*⌸⍬│ │⍟⌸⍬│ │○⌸⍬│ │|⌸⍬│ │⌈⌸⍬│ │
│ └───┘ └───┘ └───┘ └───┘ └───┘ │
│ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ │
│ │⌊⌸⍬│ │⊤⍨⍬│ │⊤⌸⍬│ │⊢⌸⍬│ │=⌸⍬│ │
│ └───┘ └───┘ └───┘ └───┘ └───┘ │
│ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ │
│ │≠⌸⍬│ │≤⌸⍬│ │<⌸⍬│ │>⌸⍬│ │≥⌸⍬│ │
│ └───┘ └───┘ └───┘ └───┘ └───┘ │
│ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ │
│ │∨⌸⍬│ │∧⌸⍬│ │⍲⌸⍬│ │⍱⌸⍬│ │↑⍸0│ │
│ └───┘ └───┘ └───┘ └───┘ └───┘ │
│ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ │
│ │↑⌸⍬│ │↓⌸⍬│ │⍷⌸⍬│ │∩⌸⍬│ │/⌸⍬│ │
│ └───┘ └───┘ └───┘ └───┘ └───┘ │
│ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ │
│ │⌿⌸⍬│ │⍴⌸⍬│ │⌽⌸⍬│ │⊖⌸⍬│ │⍉⌸⍬│ │
│ └───┘ └───┘ └───┘ └───┘ └───┘ │
└∊──────────────────────────────┘

Wow! There are 35 of them (for ⎕IO←0)!

There are clearly patterns here – we can see our friend ⍬⊤⍬ right at the beginning, and also its equivalent ⊤⍨⍬. We can also see ↑⍸0 which, in retrospect, perhaps I should have thought of in the first place.

The remaining 32 solutions all feature key. The majority of those have a scalar function operand; we can treat this whole group as equivalent. Let’s tackle that group first, with the operand + as the example:

      +⌸⍬
┌⊖┐
⌽0│
└~┘

Let’s recap what key does. The operand dyadic function is called with each unique element of the argument in turn as its left argument, and a vector of indices of occurrences as its right. Key then returns a rank 2 array of the results. For example:

      {⍺ ⍵}⌸'Mississippi'
┌→─────────────┐
↓   ┌→┐        │
│ M │1│        │
│ - └~┘        │
│   ┌→───────┐ │
│ i │2 5 8 11│ │
│ - └~───────┘ │
│   ┌→──────┐  │
│ s │3 4 6 7│  │
│ - └~──────┘  │
│   ┌→───┐     │
│ p │9 10│     │
│ - └~───┘     │
└∊─────────────┘

With an argument of the empty vector , in the operand function, will always be 0 – the prototype element of our empty numeric vector:

      {⎕←⍺}⌸⍬
0

What about ? Well, it must be an empty numeric vector (no found indices):

      {⎕←⍵}⌸⍬
┌⊖┐
│0│
└~┘

So, with a scalar function like + as the operand, we end up with 0+⍬. Key returns an array where each major cell is the result of the function applied to the unique element and its indices, which in this case is an array with major cells of the structure . How many such major cells? 0. So the result is 0⌿1 0⍴⍬, which is, of course, 0 0⍴0.

So for 0 f ⍬ for any scalar dyadic function f, the above holds. For example:

      0>⍬
┌⊖┐
│0│
└~┘
      >⌸⍬
┌⊖┐
⌽0│
└~┘

Then we have a lot of non-scalar operands, like ⊖/⍷↑↓. All we need to understand is why they produce when applied as 0 f ⍬, as key will call them. Some of these are pretty obvious, like find, :

      0⍷⍬ ⍝ Find all locations of 0 in the empty vector
┌⊖┐
│0│
└~┘

or take and drop, ↑↓:

      0↑⍬ ⍝ Take no elements from the beginning of the empty vector
┌⊖┐
│0│
└~┘
      0↓⍬ ⍝ Drop no elements from the end of the empty vector
┌⊖┐
│0│
└~┘

However, gives us a dyadic transpose – and, perhaps unexpectedly, this one is ⎕IO-dependent:

      0⍉⍬
┌⊖┐
│0│
└~┘

At first glance, this made no sense to me. The reason this works is that transposing a vector is an identity operation:

      ⍉'Transposing a vector returns the vector'
┌→──────────────────────────────────────┐
│Transposing a vector returns the vector│
└───────────────────────────────────────┘

A vector has a single axis, so “reordering the axes” doesn’t do anything. The same holds true for the dyadic form, assuming the left argument is ⎕IO:

      0⍉'Same for the dyadic form. Sometimes.'
┌→───────────────────────────────────┐
│Same for the dyadic form. Sometimes.│
└────────────────────────────────────┘

This is the reason why this only works for ⎕IO←0: Key always provides 0 for the left argument. For ⎕IO←1 we will get a DOMAIN ERROR:

      ⎕IO←1
      0⍉'Same for the dyadic form. Sometimes.'
DOMAIN ERROR
      0⍉'Same for the dyadic form. Sometimes.'
       ∧

A few others stand out. Replicate (and its sibling replicate-first), for example:

      /⌸⍬
┌⊖┐
⌽0│
└~┘

will end up as:

      0/⍬ ⍝ zero empty vectors
┌⊖┐
│0│
└~┘

in the left operand – zero empty vectors is still the empty vector. Reshape is similar:

      0⍴⍬ ⍝ zero empty vectors
┌⊖┐
│0│
└~┘

Encode tries to express in the 0 radix; also an empty vector:

      0⊤⍬
┌⊖┐
│0│
└~┘

My favourite, though, is probably ↑⍸0, and, as I said earlier, I’m a bit disappointed I didn’t find that before resorting to brute force and ignorance. Where () returns a vector of indices with 1 for its Boolean array right argument. If we call it with a right argument of 0, we’ll get an empty vector of empty numeric vectors. This is because the one (and only) valid index into a scalar is the empty vector, and none of them (!) are non-zero.

      ⍸0
┌⊖────┐
│ ┌⊖┐ │
│ │0│ │
│ └~┘ │
└∊────┘

Trading depth for rank, we get the shape we want:

      ↑⍸0
┌⊖┐
⌽0│
└~┘

What About 0⍴⊂⍬?

The expression ⍸0 above is the only length-2 way to produce an empty vector of empty numeric vectors:

      (⍸0)≡0⍴⊂⍬
1

However, there are several interesting length-3 solutions. Deploying the Bruter again:

      8 7⍴res←Bruter 0⍴⊂⍬
┌→──────────────────────────────────────────┐
↓ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ │
│ │0⊂0│ │0⊂⍬│ │0⊆⍬│ │⍬⊂0│ │⍬⊂⍬│ │⍬⊆⍬│ │+⍸0│ │
│ └───┘ └───┘ └───┘ └───┘ └───┘ └───┘ └───┘ │
│ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ │
│ │-⍸0│ │×⍸0│ │÷⍸0│ │*⍸0│ │⍟⍸0│ │○⍸0│ │|⍸0│ │
│ └───┘ └───┘ └───┘ └───┘ └───┘ └───┘ └───┘ │
│ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ │
│ │⌈⍸0│ │⌊⍸0│ │⊣⍸0│ │⊢⍸0│ │⊂⍨0│ │⊂⍨⍬│ │⊆⍸0│ │
│ └───┘ └───┘ └───┘ └───┘ └───┘ └───┘ └───┘ │
│ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ │
│ │⊆⍨⍬│ │⌷⍸0│ │⍳¨⍬│ │⍸00│ │⍸0.│ │⍸+0│ │⍸-0│ │
│ └───┘ └───┘ └───┘ └───┘ └───┘ └───┘ └───┘ │
│ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ │
│ │⍸×0│ │⍸○0│ │⍸|0│ │⍸⌈0│ │⍸⌊0│ │⍸⊣0│ │⍸⊢0│ │
│ └───┘ └───┘ └───┘ └───┘ └───┘ └───┘ └───┘ │
│ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ │
│ │⍸≡0│ │⍸≢⍬│ │⍸↑0│ │⍸↓0│ │⍸⊂0│ │⍸⊃0│ │⍸⊃⍬│ │
│ └───┘ └───┘ └───┘ └───┘ └───┘ └───┘ └───┘ │
│ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ │
│ │⍸⊆0│ │⍸⌷0│ │⍸⌽0│ │⍸⊖0│ │⍸⍉0│ │⍸.0│ │⍸¯0│ │
│ └───┘ └───┘ └───┘ └───┘ └───┘ └───┘ └───┘ │
│ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐ │
│ │∪⍸0│ │~⍸0│ │,⍸0│ │⍴¨⍬│ │⌽⍸0│ │⊖⍸0│ │⍉⍸0│ │
│ └───┘ └───┘ └───┘ └───┘ └───┘ └───┘ └───┘ │
└∊──────────────────────────────────────────┘

Most are ⍸0 combined with an additional primitive that has no effect, or selfie-mirrors, so let’s restrict ourselves to the interesting subset:

      4 2⍴res/⍨~'⍸⍨'∘(∨/∊)¨res ⍝ Skip variants of ⍸0 and selfies
┌→────────────┐
↓ ┌→──┐ ┌→──┐ │
│ │0⊂0│ │0⊂⍬│ │
│ └───┘ └───┘ │
│ ┌→──┐ ┌→──┐ │
│ │0⊆⍬│ │⍬⊂0│ │
│ └───┘ └───┘ │
│ ┌→──┐ ┌→──┐ │
│ │⍬⊂⍬│ │⍬⊆⍬│ │
│ └───┘ └───┘ │
│ ┌→──┐ ┌→──┐ │
│ │⍳¨⍬│ │⍴¨⍬│ │
│ └───┘ └───┘ │
└∊────────────┘

We can see a few patterns again – partition () or partitioned enclose (), with 0 as left or right argument and as left or right argument, plus two variants using each (¨) on . Let’s look at first.

Partitioned enclose groups, or partitions, stretches its right argument as specified by its Boolean vector left argument, with each new partition starting on 1, and returns a nested vector of the resulting partitions:

      1 0 0 0 0 1 1 0 0 0 0⊂'Hello World'
┌→────────────────────┐
│ ┌→────┐ ┌→┐ ┌→────┐ │
│ │Hello│ │ │ │World│ │
│ └─────┘ └─┘ └─────┘ │
└∊────────────────────┘

In the first case, 0⊂0, we’re saying that we want 0 partitions of 0 (which gets treated as a 1-element vector), returned as a nested vector – in other words, exactly the “empty vector of empty numeric vectors” that we’re after. In fact, with a 0 as its left argument returns an empty vector of empty vectors of the prototype element of the right, which also helps to explain our 0⊂⍬:

      0⊂'hello world' ⍝ Empty vector of empty character vectors
┌⊖────┐
│ ┌⊖┐ │
│ │ │ │
│ └─┘ │
└∊────┘
      0⊂1 2 3 4 5 ⍝ Empty vector of empty numeric vectors
┌⊖────┐
│ ┌⊖┐ │
│ │0│ │
│ └~┘ │
└∊────┘
      0⊂⍬ ⍝ Empty vector of empty numeric vectors
┌⊖────┐
│ ┌⊖┐ │
│ │0│ │
│ └~┘ │
└∊────┘

Swapping the order of that last expression:

      ⍬⊂0
┌⊖────┐
│ ┌⊖┐ │
│ │0│ │
│ └~┘ │
└∊────┘

is really the same thing: the left side is still a numeric vector with no 1s.

What about ? If we restrict ourselves to a Boolean left argument again, partition is similar to replicate, but instead of just skipping elements of the right side corresponding to 0s in the left side, it encloses groups of elements corresponding to 1s, and skips the rest:

      1 1 1 1 1 0 1 1 1 1 1⊆'Hello World'
┌→────────────────┐
│ ┌→────┐ ┌→────┐ │
│ │Hello│ │World│ │
│ └─────┘ └─────┘ │
└∊────────────────┘

Compare with replicate:

      1 1 1 1 1 0 1 1 1 1 1/'Hello World'
┌→─────────┐
│HelloWorld│
└──────────┘

As with , returns a vector of vectors and, by the same reasoning, if the left argument contains no 1s, then the inner vector will be an empty vector of the prototype element of the right argument:

      0⊆1 2 23 34 
┌⊖────┐
│ ┌⊖┐ │
│ │0│ │
│ └~┘ │
└∊────┘

We should understand the remaining variant, ⍬⊆⍬, too. You might wonder why there is no 0⊆0, analogous to the 0⊂0 we saw earlier; a fair question. However, it results in an error:

      0⊆0 ⍝ RANK ERROR
RANK ERROR
      0⊆0 ⍝ RANK ERROR
       ∧

The reason for this is that is IBM’s APL2 primitive, supplied for compatibility reasons. It simply doesn’t treat a scalar right argument as a 1-element vector, unlike .

What remains are the two variants using the each operator: ⍳¨⍬ and ⍴¨⍬. Each always returns an array of the same shape as its right argument, in which each element is the result of applying the operand to the corresponding element in the argument.

By that reasoning, with an argument of we’ll always end up with an empty vector. The prototype element is itself a vector, as both and returns vectors – and, in our specific case, empty numeric vectors, as given the argument .

Closing Remark

Many people – certainly including myself – find reasoning about the behaviour of empty arrays in Dyalog APL difficult. Going through exercises such as these can help to make this clearer.

Highlights of the 2020 Problem Solving Competition – Phase II

With Dyalog’s APL Problem Solving Competition 2021 in full swing, it’s time to highlight some of the excellent solutions that were submitted to last year’s edition.

Stefan Kruger works for IBM making databases. While he tries to learn at least one new programming language a year, he got hooked on APL and participated in the competition. This is his perspective on some solutions that the judges picked out – call it the “Judges’ Pick”, if you like; smart, novel, or otherwise noteworthy solutions that can serve as an inspiration.

This blog post is also available as an interactive Jupyter Notebook document.


By Stefan Kruger

I’ll show a cool solution or two to each Phase II problem and dive into the details of a couple. If you need to refresh your memory with what the problems looked like, there’s a PDF of the Phase II problems.

Oh, and note that at the time of writing there is still plenty time to take part in the current edition of the competition (and really, who knew bowling was so complicated?) – there are some juicy cash prices to be won.

Problem 1: Take a Dive (1 task)

Level of Difficulty: Low

So let’s kick off with problem 1. The task was to calculate the score of an Olympic dive, consisting of a technical difficulty rating and a vector containing either 3, 5 or 7 judges’ scores. Only the central three ordered judges’ scores should be considered, which should be summed and multiplied by the technical difficulty rating.

Here is a cunning trick that wasn’t at all obvious:

∇ score←dd DiveScore scores;sorted;cenzored;rotator
  ⍝ 2020 APL Problem Solving Competition Phase II
  ⍝ Problem 1, Task 1 - DiveScore
   
  sorted←{⍵[⍋⍵]}scores
   
  ⍝  0 1 2 rotates score indexes to 123, 23451 or 3456712
  ⍝  So three center values always goes first
  ⍝  51 = (0 1 2∧.= 3 5 7 ∘.|⍳100) ⍳ 1
  rotator←51
 
  cenzored←3↑rotator⌽sorted
  score←⍎2⍕dd+.×cenzored
∇
      2.9 2.6 2.7 DiveScore¨(7 7.5 6.5 8 8 7.5 7)(9.5 8 8.5)(7.5 7 7 8.5 8)
63.8 67.6 60.75

This contestant figured out that if a vector of length 3, 5 or 7 is rotated 51 steps, then the original central three items will always end up at the beginning. No, really. It turns out that 51 is the first number X such that 0 1 2≡3 5 7|X. They tabulated the options and picked the first solution, guessing that it’d be less than 100:

      ⍸0 1 2∧.=3 5 7∘.|⍳100
51

But there is another way – this is one of those situations where the Chinese Remainder Theorem comes in handy, especially since it’s available on APLcart:

      3 5 7 {m|⍵+.×⍺(⊣×⊢|∘⊃{0=⍵:1 0 ⋄ (⍵∇⍵|⍺)+.×0 1,⍪1,-⌊⍺÷⍵})¨⍨⍺÷⍨m←×/⍺} 0 1 2 ⍝ https://aplcart.info?q=chinese
51

If you figured that out, award yourself a well-deserved pat on the back. For us mortals, we probably all did something rather more pedestrian:

DiveScore ← {
    d ← 2-2÷⍨7-≢⍵       ⍝ How many items should we drop each side?
    ⍺+.×(-d)↓d↓⍵[⍋⍵]
}

Problem 2 – Another Step in the Proper Direction (1 task)

Level of Difficulty: Medium

Problem 2 builds upon Problem 5 from Phase I. In short, we are asked to write a function Steps that takes a two-element vector to the right, defining a start and end value, and an optional left integer argument that tweaks how we generate values from start to end. The complexity here comes from the many combinations of behaviours from what exactly is given as the left argument: integer or float? positive or negative? Also, the range must be inclusive, even if a floating-point step size means that the end point is overshot. I took this on thinking it would be trivial – it wasn’t.

Here’s a great solution that manages to combine this functionality with a call to a single dfn:

∇ steps←{p}Steps fromTo;segments;width
  width ← |-/fromTo
  :If 0=⎕NC'p' ⍝ No left argument: same as Problem 5 of Phase I
      segments ← 0,⍳width
  :ElseIf p0 ⍝ p is the step size
      segments ← p {⍵⌊⍺×0,⍳⌈⍵÷⍺} width
  :ElseIf p=0 ⍝ As if we took zero step
      segments ← 0
  :EndIf
  ⍝ Take into account the start point and the direction.
  steps ← fromTo {(⊃⍺)+(-×-/⍺)×⍵} segments
∇

I ended up with something more convoluted, with a few ugly special cases, and shamelessly borrowing from dfns.iotag:

Steps ← {
    range ← {
        r ← ⍺-s×⎕IO-⍳⌊1-(⍺-⊃⍵)÷s←×/1↓⍵,(⍺>⊃⍵)/¯1 ⍝ "inspired" by dfns.iotag
        (⊃⍵)≠⊃⊖r: r,⊃⍵ ⋄ r   ⍝ Ensure endpoint is included – yeuch :(
    }
    ⍺ ← ⍬
    (b e) ← ⍵
    ⍺≡⍬: b range e        ⍝ No ⍺
    ⍺=0: b                ⍝ Zero step; return start point
    ⍺>0: b range e ⍺      ⍝ Positive ⍺
    len ← (e-b)÷count←⌊-⍺ ⍝ Negative ⍺
    len=0: b/⍨1+count     
    b range e len
}

Problem 3 – Past Tasks Blast (1 task)

Level of Difficulty: Medium

The task here was to scrape the Dyalog APL Problem Solving Competition webpage to extract all links to PDF files. We get the suggestion to use either Dyalog’s HttpCommand or shell out to a system mechanism for fetching a web page.

To use HttpCommand, we first need to load it:

      ]load HttpCommand
#.HttpCommand

Here’s a slightly tweaked competition submission, showing great flair in how to process XML:

PastTasks ← {
    url ← ⍵
    r ← (HttpCommand.Get url).Data  ⍝ get page contents
    (d n c a t) ← ↓⍉⎕XML r          ⍝ depth; name; content; attributes; type
    (k v) ← ↓⍉ ⊃⍪/ ((,'a')∘≡¨n)/a   ⍝ extract key-value pairs of <a> elements
    urls ← ('href'∘≡¨k)/v           ⍝ get URLs
    pdfs ← ('.pdf'∘≡¨¯4↑¨urls)/urls ⍝ filter .pdfs
    base ← ⊃⌽⊃('base'∘≡¨n)/a        ⍝ base URL
    base∘,¨pdfs
}

The problem statement suggests that a regex-based solution might be tolerable. Here’s a stab at that approach:

PastTasks ← {
    body ← (HttpCommand.Get ⍵).Data
    pdfs ← '<a href="(.+?\.pdf)"'⎕S'\1'⊢body
    base ← '<base href="(.+?)"'⎕S'\1'⊢body
    base,¨pdfs
}

So which is the “better” solution? Well, the first approach has a number of advantages: firstly, is much more robust (provided that the web page is valid XHTML, which we are told is a given), meaning that we can abdicate responsibility for dealing with markup quirks (single vs double quotes, whitespace etc) to the built-in ⎕XML system function, and secondly, there is that (in)famous quote from Jamie Zawinski:

Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems. – jwz

Mixing in a liberal helping of regular expressions in with APL is perhaps not helping APL’s unfair reputation for being write-only.

However, when dealing with patterns in textual data, as we unquestionably are here, regular expressions – even in a powerful language like APL – are sharp tools that are hard to beat, and any programmer worth their salt owes it to themselves to master them. In the case above, had the data not neatly been parseable as XML, it would have been more awkward to solve a problem like this relying only on APL primitives.

Problem 4 – Bioinformatics (2 tasks)

Level of Difficulty: Medium

The two tasks making up Problem 4 are borrowed from Project Rosalind, which is a Bioinformatics problem collection that often has great APL affinity:

and a hint that one benefits from understanding modular multiplication, as this isn’t built into Dyalog APL.

Here is a great example:

revp ← {                    ⍝ r ← revp dna
   dnaNum ← 'ACGT'⍳⍵        ⍝ Convert to 1..4 so that A+T = C+G = 5
   FindRevp ← {             ⍝ Given chunk size, extract positions and build the output format
       chunks ← ⍵,/dnaNum
       isRevp ← (⊢≡5-⌽)¨chunks
       ⍵,⍨⍪⍸isRevp
   }
   ⊃⍪/FindRevp¨4 6 8 10 12  ⍝ Test against all chunk sizes and collect results
}
sset ← {          ⍝ r←sset n
   bin ← 2⊥⍣¯1⊢⍵  ⍝ Binary digits
   arr ← ⌽2*bin   ⍝ Repeated squaring: Starting from MSB and 1, square ⍵, multiply ⍺, modulo m
   mod ← 1000000
   {mod|⍺×⍵*2}/arr,1
}

This contestant also saw fit to include their test suite; a nice touch! Roger Hui’s version of assert has become the de facto standard, and the contestant puts it to good use:

Assert ← {⍺←'assertion failure' ⋄ 0∊⍵:⍺ ⎕SIGNAL 8 ⋄ shy←0} ⍝ Roger Hui's Assert
RevpTest ← {
   s ← 'TCAATGCATGCGGGTCTATATGCAT'
   ans ← revp s
   Assert 8 2≡⍴ans:
   Assert 5 4 7 4 17 4 18 4 21 4 4 6 6 6 20 6≡∊ans:
  
   header ← 'Contest2020/Data/'  ⍝ Change as needed
   data1 ← ∊1↓⊃⎕NGET (header,'rosalind_revp_1_dataset.txt') 1
   ans1 ← ↑⍎¨⊃⎕NGET (header,'rosalind_revp_1_output.txt') 1
   data2 ← ∊1↓⊃⎕NGET (header,'rosalind_revp_2_dataset.txt') 1
   ans2 ← ↑⍎¨⊃⎕NGET (header,'rosalind_revp_2_output.txt') 1
   Assert ans1 ≡ revp data1:
   Assert ans2 ≡ revp data2:
   'Test passed'
}
SsetTest ← {
   Assert 8 = sset 3:
   Assert 551872 = sset 857:
   Assert 935424 = sset 870:
   'Test passed'
}

Problem 5 – Future and Present Value (2 tasks)

Level of Difficulty: Medium

Problem 5 is some hedge fund maths, or something where my eyes glazed over before I fully understood the ask. What is this, K‽

This solution is impressively compact – I removed the comments to highlight the APL artistry on display: no less than three scans, count ’em!

rr ← {AR×+\⍺÷AR←×\1+⍵} 
pv ← {+/⍺÷×\1+⍵}

Here’s how the competitor outlined how their solution works:

This can be calculated elegantly with the following operations:

  1. Find the accumulated interest rate (AR) for each term (AR←×\1+⍵).
  2. Deprecate the cashflow amounts by dividing them by AR. This finds the present value of all the amounts.
  3. Accumulate all the present values of the amounts to find the total present value at each term.
  4. Multiply by AR to find future values at each term.

This way the money that was invested or withdrawn in a term is not changed for that term, but the money that came from the previous terms is multiplied by the current interest rate for each term arriving to the correct recurrent relation:

Step 2) amounts[i]/AR[i] ⍝ ≡ PV[i]
Step 3) amounts[i]/AR[i] + APV[i-1]
Step 4) amounts[i] + APV[i-1]×AR[i]
amounts[i] + APV[i-1]×AR[i-1]×(1+rate[i])
amounts[i] + r[i-1]×(1+rate[i]) ⍝ ≡ r[i]

Problem 6 – Merge (1 task)

Level of Difficulty: Medium

Mail merge – gotta love it. Your spam folder is full of bad examples of this: “Dear $FIRSTNAME, do you want to purchase a bridge?” We’re given a template file with patterns such as @firstname@ which are to be replaced with values stored in a JSON file. Here’s a smart approach from a competitor who knows their way around the @ operator:

Merge ← {
   templateFile ← ⍺
   jsonFile ← ⍵
   template ← ⊃⎕NGET templateFile
   ns ← ⎕JSON⊃⎕NGET jsonFile

   getValue ← {
       0=⍴⍵:,'@'   ⍝ '@@'         → ,'@'
       6::'???'    ⍝ ~⍵∊ns.⎕NL ¯2 → '???'
       ⍕ns⍎⍵       ⍝  ⍵∊ns.⎕NL ¯2 → ⍕ns.⍵
   }
   ∊getValue¨@(⍴⍴1 0⍨)'@'(1↓¨=⊂⊢)template
}

The key insight here is that since each template starts and ends with the same marker, we can partition the data on sections beginning with @ and then we’ll have a vector where every other element is a template to be substituted. Here’s an example of this:

      ↑('@'(1↓¨=⊂⊢) '@title@ @firstname@ @lastname@, would you be interested in the Brooklyn Bridge?') (1 0 1 0 1 0)
┌─────┬─┬─────────┬─┬────────┬─────────────────────────────────────────────────┐
│title│ │firstname│ │lastname│, would you be interested in the Brooklyn Bridge?│
├─────┼─┼─────────┼─┼────────┼─────────────────────────────────────────────────┤
│1    │0│1        │0│1       │0                                                │
└─────┴─┴─────────┴─┴────────┴─────────────────────────────────────────────────┘

I added the second row for clarity to show the alternating templates. Cool, huh? However, this only works correctly if the data leads with a template. Consider:

      '@'(1↓¨=⊂⊢) 'Dear @firstname@ @lastname@, or maybe the Golden Gate?'
┌─────────┬─┬────────┬───────────────────────────┐
│firstname│ │lastname│, or maybe the Golden Gate?│
└─────────┴─┴────────┴───────────────────────────┘

We still have the alternating templates, but the prefix (Dear ) is lost. We can tweak the Merge function a bit to cater for this if we need to:

Merge ← {
    templateFile ← ⍺
    jsonFile ← ⍵
    template ← ⊃⎕NGET templateFile
    ns ← ⎕JSON⊃⎕NGET jsonFile
    first ← templ⍳'@'
    first>≢templ: templ    ⍝ No templates at all
    prefix ← first↑templ   ⍝ Anything preceding the first '@'?

    getValue ← {
        0=⍴⍵:,'@'   ⍝ '@@'         → ,'@'
        6::'???'    ⍝ ~⍵∊ns.⎕NL ¯2 → '???'
        ⍕ns⍎⍵       ⍝  ⍵∊ns.⎕NL ¯2 → ⍕ns.⍵
    }
    ∊prefix,getValue¨@(⍴⍴1 0⍨)'@'(1↓¨=⊂⊢)template
}

Now, the competition is pitched such that “proper array solutions” are preferred – and for good reasons, most of the time. However, it’s hard to overlook some industrial regex action in this case. Strictly for Perl-fans:

Merge ← {
    mrg ← ⎕JSON⊃⎕NGET ⍵
    keys ← mrg.⎕NL¯2
    vals ← mrg.⍎¨keys

    ('@',¨(keys,'' '[^@]+'),¨'@')⎕R((⍕¨vals),'@' '???')⊃⎕NGET ⍺
}

Problem 7 – UPC (3 tasks)

Level of Difficulty: Medium

Problem 7 had us learning more about bar codes than we ever thought necessary. Read them, write them, verify them, scan them – forwards and backwards no less. Good scope for stretching your array muscles on this one. The eagle-eyed amongst you may have spotted that the verification aspect is a simplified version of Luhn’s algorithm, which a certain Morten Kromberg used to illustrate APL’s array capabilities at JIO a while back.

Here’s a good solution:

CheckDigit ← (10|∘-+.×∘(11⍴3 1))          ⍝ Computes the check digit for a UPC-A barcode.

UPCRD ← 114 102 108 66 92 78 80 68 72 116 ⍝ Right digits of a UPC-A barcode, base 10.
bUPCRD ← ⍉2∘⊥⍣¯1⊢UPCRD                    ⍝ Bit matrix with one right digit per row.
WriteUPC ← {
   ⍝ Writes the bits of a UPC-A barcode.  
   ~((11∘=≢)∧(∧/0∘≤∧≤∘9))⍵: ¯1            ⍝ Check for simple errors
   b ← bUPCRD[⍵,CheckDigit ⍵;]  
   1 0 1, (,~6↑b), 0 1 0 1 0, (,6↓b), 1 0 1 
}
ReadUPC ← {
   ⍝ Reads a UPC-A barcode into its digits.
   ~(∧/0∘≤∧≤∘1)⍵: ¯1                 ⍝ Input isn't a bit vector
   95≠≢⍵: ¯1                         ⍝ Number of bits must be 95
   (b l m r e) ← ⍵ ⊂⍨ (∊¯1∘↓,⌽) (3↑1)(42↑1)(5↑1)
   
   b ∨⍥(≢∘1 0 1) e: ¯1               ⍝ Wrong patterns for the guards
   m≢0 1 0 1 0: ¯1
   bits ← ↓12 7⍴ l,r
   C ← (↓bUPCRD)∘⍳ ~@(⍳6)            ⍝ Convert bits to digits
   tf ← ~∧/10 > nums ← C bits        ⍝ Should we try flipping the bits?
   nums ← (nums×1-tf) + tf×C⌽↓⌽↑bits
   ∨/10=nums: ¯1                     ⍝ Bits simply aren't right
   (¯1↑nums)≠CheckDigit 11↑nums: ¯1  ⍝ Bad check digit
   nums
}

Problem 8 – Balancing the Scales (1 task)

Level of Difficulty: Hard

Our task is to partition a set of numbers into two groups of equal sum if this is possible, or return if not. This is a well-known NP-complete problem called The Partition Problem and, as such, has no polynomial time exact solutions. The problem statement indicates that we only need to consider a set of 20 numbers or fewer, which is a bit of a hint on what kind of solution is expected.

This problem, in common with many other NP problems, also has a plethora of interesting heuristic solutions: polynomial algorithms that whilst not guaranteed to always find the optimal solution will either get close, or be correct for a significant subset of the problem domain in a fraction of the time the exact algorithms would take.

However, it’s clear that Dyalog expects us to give an exact solution, and has given us an upper bound on the input data length. Finally, we’re offered the cryptic advice that

Understanding the nuances of the problem is the key to developing a good algorithm.

Yes, thank you, master Yoda.

Here’s a great, efficient solution:

Balance←{
   sum←1⊥⍵
   2|sum: ⍬   ⍝ Lists with an odd sum cannot be split into equal parts.
   halfsum←sum÷2
  
   ⍝ A partitioning method based on the algorithm by Horowitz and Sahni.
   ⍝ The basic idea of the algorithm is to split the input into two parts,
   ⍝ and then generate all subset sums for these parts. Then the problem
   ⍝ becomes finding a sum of two subset sums from different parts
   ⍝ equal to the desired value. Instead of sorting the sums and comparing
   ⍝ them like in the original algorithm, standard APL searching primitives
   ⍝ ∊ and ⍳ are used. Another key idea is to generate the subset sums
   ⍝ in a specific order, so that the nth subset sum in the vectors a and b
   ⍝ is the sum of the elements chosen by the binary representation of n.
   ⍝ This means that we can get the elements of the solution sum
   ⍝ without having to generate anything but the sums.
   horowitzsahni←{
       s←⍵(↑{⍺⍵}↓)⍨⌊2÷⍨≢⍵                          ⍝ Split the input.
       a b←⊃¨(⊢,+)/¨s,¨0                           ⍝ Generate the subset sums.
       indexes←a {(⊢,⍵⍳⍺⌷⍨(≢⍺)⌊⊢)1⍳⍨⍺∊⍵} halfsum-b ⍝ Search for solution indexes.
       indexes[2]>≢b: ⍬
       ⍵ {(⍺/⍨~⍵)(⍵/⍺)} ∊(2⍴¨⍨≢¨s)⊤¨indexes-1      ⍝ Get the solution from the indexes.
   }
  
   ⍝ A simple exhaustive search. It uses the same binary representation
   ⍝ idea as the horowitzsahni function.
   exhaustive←{
       i←halfsum⍳⍨⊃(⊢,+)/⍵,0
       i>2*≢⍵: ⍬
       ⍵ {(⍺/⍨~⍵)(⍵/⍺)} (2⍴⍨≢⍵)⊤i-1
   }

   ⍝ The exhaustive method performs better than the Horowitz-Sahni method
   ⍝ for small input sizes. 14 seems to be a reasonable cutoff point.
   14>≢⍵: exhaustive ⍵
   horowitzsahni ⍵
}

There are a number of clever touches here – there are actually two different solutions, an exhaustive search and an implementation of the algorithm due to Horowitz and Sahni, which, although still exponential, is known to be one of the fastest for certain subsets and input sizes. A switch based on input size checks for the crossover point and chooses the fastest option. And this is fast – five times faster than that of the Grand Prize winner, and four orders of magnitude faster than the slowest solution.

Such a performance spread is intriguing, so there are clearly lessons to be learned here. When I tried this problem, I ended up with a pretty straight-forward (a.k.a. naive) brute force search:

Balance ← {⎕IO←0
    total ← +/⍵
    2|total: ⍬             ⍝ Sum must be divisible by 2
    psum ← total÷2         ⍝ Our target partition sum
    bitp ← ⍉2∘⊥⍣¯1⍳2*≢⍵    ⍝ All possible bit patterns up to ≢⍵
    idx ← ⍸<\psum=bitp+.×⍵ ⍝ First index of partition sum = target
    ⍬≡idx: ⍬               ⍝ If we have no 1s, there is no solution
    part ← idx⌷bitp        ⍝ Partition corresponding to solution index
    (part/⍵)(⍵/⍨~part)     ⍝ Compress input by solution pattern and inverse
}

If you come to APL from a scalar language, that approach must seem incredibly wasteful: make all bit patterns. Try all sums. Search for the right one, if it exists. But as it turns out, this is APL home turf advantage. Let’s try to demonstrate this point. If you did this “loop and branch”, you’d iterate over the bit patterns and stop once you find the first solution – in fact, for the test data in the problem specification, the first solution appears at around the 1500th bit pattern if you generate them as I do above. The vector version would need to consider the whole space of around

      ¯1+2*20
1048575

a million or so, so quite a difference. Surely, in this case the scalar approach should be way faster? Only one way to find out. We can make a scalar version in several ways – here’s the “Scheme” version:

BalanceScalar ← {⎕IO←0     ⍝ Warning: this is not the APL Way, as we shall see.
    total ← +/⍵
    2|total: ⍬             ⍝ Sum must be divisible by 2
    psum ← total÷2         ⍝ Our target partition sum
    data ← ⍵
    bitp ← ↓⍉2∘⊥⍣¯1⍳2*≢⍵   ⍝ Pre-compute the bit patterns
    {                      ⍝ Try one sum after the other, halt on first solution
        0=⍵: ⍬
        patt ← ⍵⊃bitp
        psum=patt+.×data: (patt/data)(data/⍨~patt) ⍝ Exit on first solution found
        ∇¯1+⍵
    } ¯1+≢bitp
}

Dyalog’s got game when it comes to tail call optimisation, right? OK, let’s race:

      'cmpx'⎕CY'dfns'
      d ← 10 81 98 27 28 5 1 46 63 99 25 39 84 87 76 85 78 64 41 93
      cmpx 'Balance d' 'BalanceScalar d'
  Balance d       → 2.7E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕            
* BalanceScalar d → 3.9E¯2 | +43% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

Vectorisation, Boolean vectors and primitive functions wins the day. We didn’t go completely scalar, to be fair, as we still pre-computed all the binary patterns.

But back to the task at hand – let’s pit ourselves against the intellectual might of Horowitz and Sahni:

horowitzsahni←{
    sum←1⊥⍵
    2|sum: ⍬   ⍝ Lists with an odd sum cannot be split into equal parts.
    halfsum←sum÷2
    s←⍵(↑{⍺⍵}↓)⍨⌊2÷⍨≢⍵                          ⍝ Split the input.
    a b←⊃¨(⊢,+)/¨s,¨0                           ⍝ Generate the subset sums.
    indexes←a {(⊢,⍵⍳⍺⌷⍨(≢⍺)⌊⊢)1⍳⍨⍺∊⍵} halfsum-b ⍝ Search for solution indexes.
    indexes[2]>≢b: ⍬
    ⍵ {(⍺/⍨~⍵)(⍵/⍺)} ∊(2⍴¨⍨≢¨s)⊤¨indexes-1      ⍝ Get the solution from the indexes.
}
      cmpx 'horowitzsahni d' 'Balance d' 'BalanceScalar d'
  horowitzsahni d → 4.7E¯5 |      0%                                         
* Balance d       → 2.8E¯2 | +59266% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕            
  BalanceScalar d → 4.0E¯2 | +84466% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

Ouch! Well, told you my exhaustive search was naive. An impressive performance from the competitor – but also an impressive performance from Dyalog APL – even my knocked up exhaustive search runs in a pretty decent 25–30ms or so, about half the time of my shoddy Python attempt (although out-speeding Python is a low bar). I’m keeping the above implementation of Horowitz/Sahni handy for next edition of Advent of Code, where this problem always seems to crop up in some shape or form.

Problem 9 – Upwardly Mobile (1 task)

Level of Difficulty: Hard

And so for the final question. We were offered strong hints that a neat array-oriented solution might not be possible, but that the judges were prepared to be proven wrong.

Here’s a nicely compact, recursive solution:

∇ weights ← Weights filename;diag;FindWeights;start
    diag ← ↑(≠∘(⎕UCS 10)⊆⊢)⊃⎕NGET filename
    FindWeights ← {
        '┌┐│'∊⍨⊃⍵: ∇1↓⍵                    ⍝ if on any of these, go down        
        ⎕A∊⍨⊃⍵: ⎕A=⊃⍵                      ⍝ if on a letter, give weights
        r_disp ← '┐'⍳⍨0⌷⍵                  ⍝ otherwise, (i.e. on '┴'), find the displacement of right branch,
        l_disp ← -1+'┌'⍳⍨⌽0⌷⍵              ⍝ ...and the left branch
        wts ← ↑(∇r_disp⌽⍵)(∇l_disp⌽⍵)      ⍝ recurse,
        +⌿wts×[0]⌽(+/wts)×r_disp (-l_disp) ⍝ ...and calculate new weights
    }
    start ← diag⌽⍨⍸'┴│'∊⍨0⌷diag            ⍝ starting position attained by ⌽'ing to '┴' or '│'
    weights ← (~∘0÷∨/)FindWeights start    ⍝ remove 0s and get lowest weights
∇

Finally, someone took the suggestion that an array-based solution might not be possible as a personal challenge and produced the following:

Weights ← {
    m  ← ↑(⎕UCS 10)(≠⊆⊢)⊃⎕NGET ⍵ ⍝ no empty lines midway through so this is fine
    fm ← m='┴'               ⍝ fulcrum mask
    ER ← {+\1-⍵\¯2-⌿0⍪⍸⍵}    ⍝ distance to closest 1 to the left
      
    wa ← +/,m∊⎕A             ⍝ weight amount
    wi ← (⍳wa)@{⍵} m∊⎕A      ⍝ weight indexes
    fa ← +/,fm               ⍝ fulcrum amount
    fir← wa + ⍳fa            ⍝ fulcrum indexes (reduced)
    fi ← fir@{⍵} fm          ⍝ fulcrum indexes
    ai ← fi+wi               ⍝ all indexes
    ai+← ⍉(m∊'┌┐') {⍺\⍵/⍨⍵≠0}⍤1⍥⍉ 0@1⊢ai ⍝ extend indexes upwards to the ┌┐s that need them (exclude top ┴ as it isn't matched)
      
    ld ←  ER⍤1⊢ m='┌'        ⍝ distance to left
    rd ← ⌽ER⍤1⌽ m='┐'        ⍝ distance to right
    xp ← (⍴m)⍴⍳2⊃⍴m          ⍝ x position
    fml← ↓fm                 ⍝ fulcrum mask & its lines
    ail← ↓ai                 ⍝ all index lines
    GET← {⊃,/ail⌷⍨∘⊂¨fml/¨⍵} ⍝ get an item of ai for each fulcrum at x position ⍵
    lir← GET ↓xp-ld          ⍝ left indexes (reduced)
    rir← GET ↓xp+rd          ⍝ right indexes (reduced)
    ldr← fm /⍥, ld           ⍝ left distance (reduced)
    rdr← fm /⍥, rd           ⍝ right distance (reduced)
      
    in ← ↑⊃{(+/⍵[⍺])@(⊃⍺)⊢ ⍵}/ (↓⍉↑fir lir rir) , ⊂↓(⍳fa+wa)∘.=⍳wa ⍝ included weights for each index
    cf ← (ldr ×⍤¯1⊢ in[lir;]) - rdr ×⍤¯1⊢ in[rir;] ⍝ coefficients
    ws ← (1,(≢cf)⍴0) ⌹ ((2⊃⍴cf)↑1)⍪cf              ⍝ unscaled weights
    (⊢÷∨/) ws                                      ⍝ scale weights to integers
}

I take my hat off in admiration of the audacity: “An array solution might not be possible, eh? Hold my beer.”

So there we have it, a smörgåsbord of clever solutions to serve as an inspiration for us all. The 2020 edition of the competition sported a slightly simplified format where you were expected to tackle every problem instead of the approach in previous years where you had to make a subset selection from themed groups – this new approach remains for the current (2021) edition.

You are taking part, aren’t you?

Dyalog ’17: Day 2 (Monday 11 September)

by Vibeke Ulmann

Focus on Dyalog APL the language – Monday 11th September 2017

Where Sunday is traditionally filled with workshops and hands-on experiences – the first proper day of the annual user meeting is Monday – and this year was no different.

CEO, Gitte Christensen opened the meeting and emphasised a few of the major new things that have been achieved since last year, namely:

  • RIDE Version 4
  • Embedded HMTL Rendering Engine and – as always –
  • Performance enhancements.

She highlighted the fact that Version 16.0 was the beginning of a tool chain for developing distributed applications – including Cloud computing.

The licence for the SyncFusion library has been renewed for another 5 years. So, for those working with SyncFusion, you will have the usual widgets for dashboarding and graphing to hand.

A new multi-platform developer licence is now available allowing for development on all platforms; Windows, Linux, Mac, and soon, Android.

The Tools Group has been expanded, and they are producing more and more examples and templates.

Dyalog is now producing live content (outside of the user meeting) in the form of webcasts – currently one a month – and Podcasts are also planned.

CXO Morten Kromberg gave us a look at the next generation APL.

Authors note: if you are wondering what the X in CXO stands for it is ‘Experience’.

Morten established that there is a general ‘climate change’ in the world of computing especially as cloud computing is now ‘THERE’. This means that performance once again becomes key, as the true cost of cloud computing is measured in Watts – meaning CPU and memory consumption. So, if you can reduce the footprint of your application, you can reduce the costs of cloud hosting. Another point made is that Cloud computing generally means Linux – as it uses less memory and, therefore, fewer Watts. Whereas macOS and Androids can be considered UNIXes.

Morten focused on the demand for a new generation of APL developers and more to the point, managers who are comfortable using APL, and APL programmers. There are a number of criteria that both groups need; the developer need a modern set of libraries to build upon and to be able to find them easily for example using Git. Whereas managers need test driven development, source code management, and continuous development cycles.

Not everyone is a familiar with Git, and some are even a tad intimidated by it. But there is much good to be said for Git. You can have a private area, as well as a public shared area. Dyalog currently has 25 public repositories on Git. More will follow over time.

CTO Jay Foad proceeded to outline how we can make some of the Dyalog dreams for the future come true.

Version 16.0 of Dyalog APL was released in June this year, and work on version 17.0 continues apace. Speculatively Jay highlighted some of the key areas in version 17.0 to be: scripting, language, performance, and bridges to Python, Julia, MATLAB and Haskell. More work on RIDE, GPUs (and Xion Phi), portability and Android, the Cloud, shuffle testing and PQA.

The Key Note speech before Lunch was presented by Aaron Hsu from Indiana University (USA). Aaron went through how you can escape the beginner’s plateau when starting to work with APL.

The key takeaways for yours truly was that Aaron has observed two attitudes to APL

1)     Never in my life

2)     Can’t imagine life without it

He also observed that there seems to be a ‘learning wall’ which we need to find a way to overcome.

A directory of best practices can give insight into why computer scientists, or those trained in traditional programming methods, often find APL jarring and difficult, whereas those with no prior training fall in love with APL and take to it like ducks to water.

Watch his presentation in which he walks through 8 patterns, he considers to be key for newbie APL programmers. We will announce when the presentation is available on Dyalog’s YouTube Channel later in the autumn.

After lunch, most of the afternoon was dedicated to looking deeper into some of the new feature/functionality and topics many APL programmers find of particular interest.

In the interest of enticing you to watch the presentations online when they’re posted on Dyalog’s YouTube Channel, this blog only touches the basics on a couple of the presentations.

John Scholes went through re-coding from procedural to denotative style and showed us how pure functions opens for code reduction when implemented. ‘Massaging’ the code was the new expression I came away with.

Roger Hui showed us how he has now managed to solve a 20-year-old problem: ‘Index-of’ on Multiple Floats. After having initially established – to much hilarity – that the best way of solving the problem at hand was to not introduce it in the first place in your code, Roger proceeded to show how it can now be solved. Intentionally I am not giving away what Rogers 20-year-old ‘Problem’ was. Let me just briefly mention: it has to do with X and Y.

The afternoon was rounded off with a user presentation by Kostas Blekos from the University of Patras (Greece), where a group of physicists have used APL for the research they did for a paper.

His initial premise was that Physicists + Programming = Disaster. On the other hand, physicists need to do a lot of programming, so when they were developing the basis for the paper, they wanted to find a (new) language that made it easier to do better (and faster) prototyping.

Lots of Kostas’ and his colleagues’ previous work had been done in FORTRAN and, as he said, we needed something a bit easier to work with and the choice was Dyalog APL. Outside of the ability to do fast prototyping, the terseness of the language was attractive, as were the close mathematical relations, which made it easy to understand.

What they learned was that APL is GREAT, suitable for fast prototyping, and for avoiding making mistakes. The quote of the day surely must be

In FORTRAN I could spend a whole day trying to find a missing comma……..

Asked if there were any downsides to APL, Kostas said no, not really, except it is difficult to convince people to use it.

Dyalog ’17: Day 1 (Sunday 10 September)

by Vibeke Ulmann

SharpPlot and SharpLeaf – the graphics and publication tools included with Dyalog on all platforms

(Workshop SA4)

A lot of work has gone into SharpPlot and SharpLeaf over the past couple of years*. The workshop on Sunday 10th September was run by Nicolas Delcros, the Dyalog software programmer behind the work.

SharpPlot:

The documentation has been significantly improved, comes across as comprehensive, and is available on right-click anywhere when you use the tool. It is also available on www.SharpPlot.com and can be downloaded as a PDF. Nic strongly recommends that you walk through the documentation and tutorials and familiarise yourself with the capabilities and the different graph types that are available for automatic graphical representation of data before you start using SharpPlot. That way you will save yourself a lot of time when you start working with the tool, and you will also quickly discover which types of graphs are ‘best’ to show your particular data.

It’s possible to click on the graph widget within a Dyalog Session to immediately see how the data displays graphically. If it doesn’t work to your satisfaction, or clearly show the data the way you want in one type of graph, you can right-click on the data and choose the graph wizard.

Output can be Raster or Vector format, and you can publish to the Web or UI (User Interface).

Chart types are 2D and 3D. There are also a number of miscellaneous charts such as Gantt, Triangle, Table and Venn diagrams.

There are obviously a lot of options you can tweak for each graph and the graph wizard presents them to you and lets you see the effects immediately. Once you are happy with the graph you can turn the specs into a script which can be incorporated in your program.

All in all, SharpPlot now comes across as highly intuitive and easy to use.

We recommend you review the blogpost on www.dyalog.com from April 2017 where Nic explains some of the concepts further: https://www.dyalog.com/blog/2017/04/displaying-cross-references-with-sharpplot-2/

SharpLeaf:

SharpLeaf is an automated publication tool, which comes in extremely handy if, for example, you have to create the same report on a regular basis, but have to use/show different data from the previous report. A good example could be Board reports with varying data from Finance and/or Product Sales.

A lot of people will be pleased to find that their standard multi page report – which hitherto they have had to develop for example in Word and Excel each month – can now be generated automatically, with changes made to the embedded graphs and tables, based on data accrued since the previous month.

If you want this level of automation, you will have need to swap to SharpLeaf, develop your standard report, and then carry on from there. Importing an old report developed in, for example, Word or Excel is, unfortunately, not an option (but cutting and pasting text is).

* Author’s note: The original software programming work on SharpPlot and SharpLeaf was done by Causeway. Causeway was acquired by Dyalog Ltd in May 2007.