The 2021 APL Problem Solving Competition: Phase I – Best of Breed

By: Stefan Kruger
Stefan works for IBM making databases. He tries to learn at least one new programming language a year, and a few years ago he got hooked on APL and participated in the competition. This is his perspective on some solutions that the judges picked out – call it the “Judges’ Pick”, if you like; smart, novel, or otherwise noteworthy solutions that can serve as an inspiration.


Congratulations to all the winners of the 2021 APL Problem Solving Competition (you can learn more about the phase 2 winners in this article) and well done to Dzintars Klušs who won the Grand Prize. At the recent Dyalog ’21 user meeting, we got to enjoy the runner-up, Victor Ogunlokun, walking us through his solutions live.

In this post I’ll go through some great solutions that were submitted (and some that weren’t submitted) to the Phase I problems so that we can all marvel in the ingenuity and perhaps learn a thing or two. If you’re feeling inspired by the end, go ahead and participate in this year’s round which just launched.

If you’re new to the APL Problem Solving Competition, Phase I problems tend to be short and the expectation is that solutions will be “one-liners” (dfns). However, although it might seem like it from some of the solutions here, this isn’t a code golf competition! Solutions are judged holistically: do they solve the problem, are they efficient, and are they clear? Even though a few test cases are given, there is no guarantee that your solution is correct just because it works for the example data. The judging process involves running the code on many hidden test cases too. Crucially, just because your code is accepted, it doesn’t necessarily mean that you’ll get full marks.

As with my blog post that reviewed the 2020 Phase II solutions, I’ve included a more in-depth examination of one or two problems.

Problem 1: Are You a Bacteria?

Something from the excellent Project Rosalind problem collection, the task is to compute the combined percentage of guanine (G) and cytosine (C) in a given DNA-string.

Efficiency can vary a lot, depending on whether summation or multiplication (or even division!) is performed first. Some solutions were also leading-axis oriented.

Here’s my solution:

      {100×(+⌿⍵∊'CG')÷≢⍵} 'ACGTACGTACGTACGT'
50

which several competitors made more tacit with:

      {100×(+⌿÷≢)⍵∊'GC'} 'ACGTACGTACGTACGT'
50

or even went further:

      (100×≢÷⍨1⊥∊∘'GC') 'ACGTACGTACGTACGT'
50

If you’re unfamiliar with the 1⊥ trick, it’s a way of summing a vector:

      1⊥6 3 9 8 12 62
100

It’s perhaps not immediately obvious why this should work. Here’s one explanation. Assume we want to sum the vector 1 0 2 0 0. We can do this in a very convoluted way by using a sum inner product with a vector of exponentials: [14, 13, 12, 11, 10]:

      (1*4 3 2 1 0)+.×1 0 2 0 0
3

If we expand the exponentials to the left we get a vector of 1s. We can then break apart the inner product by turning +. to a +⌿ to the left:

      +⌿1 1 1 1 1×1 0 2 0 0
3

This is the textbook definition of 1⊥! Look:

      1⊥1 0 2 0 0
3

which, to be clear, is just the sum-reduce-first:

      +⌿1 0 2 0 0
3

Using 1⊥ to sum has two advantages over the more obvious formulation +⌿. Firstly, it’s easier to use in tacit formulations as it doesn’t require an operator, and secondly, it’s usually faster. The reasons for it being quicker is somewhat beyond the scope of this post, but it’s to do with 1⊥ making no guarantees about the ordering of operations, meaning that the interpreter is free to vectorise more efficiently.

Problem 2: Index-Of Modified

This problem wanted us to write a function that behaves like the APL Index Of function R←X⍳Y except that it should return 0 for elements of Y not found in X.

I wrote:

      p2 ← {0@((≢⍺)∘<)⊢⍺⍳⍵}
      2 3 p2 ⍳5
0 1 2 0 0

which is basically saying “change all instances of numbers greater than the length of the argument to zero”, which is how X⍳Y presents values that are not found.

Some very different solutions were submitted, for example:

      p2 ← ⍳|⍨1+≢⍤⊣
      2 3 p2 ⍳5
0 1 2 0 0

which is simply:

      p2 ← {(1+≢⍺)|⍺⍳⍵} ⍝ dfn of the above
      2 3 p2 ⍳5
0 1 2 0 0

Another option would have been to multiply ⍺⍳⍵ with ≢⍺, although no-one submitted exactly this:

      p2 ← ≢⍤⊣(≥×⊢)⍳
      2 3 p2 ⍳5
0 1 2 0 0

which could have been written explicitly as:

      p2 ← {m×(≢⍺)≥m←⍺⍳⍵} ⍝ dfn of the above
      2 3 p2 ⍳5
0 1 2 0 0

Problem 3: Multiplicity

Write a function that:

  • has a right argument Y which is an integer vector or scalar
  • has a left argument X which is also an integer vector or scalar
  • finds which elements of Y are multiples of each element of X and returns them as a vector (in the order of X) of vectors (in the order of Y).

Some test data was provided:

      X ← 2 4 7 3 9
      Y ← 5 7 8 1 12 10 20 16 11 4 2 15 3 18 14 19 13 9 17 6

I wrote something that, in retrospect, looks somewhat clumsy:

      p3 ← {⍵∘{⍵/⍺}¨↓0=⍺∘.|,⍵}
      X p3 Y
┌─────────────────────────┬────────────┬────┬──────────────┬────┐
│8 12 10 20 16 4 2 18 14 6│8 12 20 16 4│7 14│12 15 3 18 9 6│18 9│
└─────────────────────────┴────────────┴────┴──────────────┴────┘

which can be expressed more compactly as:

      p3 ← {/∘⍵¨↓0=⍺∘.|⍥,⍵}
      X p3 Y
┌─────────────────────────┬────────────┬────┬──────────────┬────┐
│8 12 10 20 16 4 2 18 14 6│8 12 20 16 4│7 14│12 15 3 18 9 6│18 9│
└─────────────────────────┴────────────┴────┴──────────────┴────┘

or:

      X (⊢⊂⍤/⍤1⍨0=∘.|⍥,) Y
┌─────────────────────────┬────────────┬────┬──────────────┬────┐
│8 12 10 20 16 4 2 18 14 6│8 12 20 16 4│7 14│12 15 3 18 9 6│18 9│
└─────────────────────────┴────────────┴────┴──────────────┴────┘

although no-one actually submitted that, to everyone’s credit.

Problem 4: Square Peg, Round Hole

Write a function that:

  • takes a right argument which is an array of positive numbers representing circle diameters
  • returns a numeric array of the same shape as the right argument representing the difference between the areas of the circles and the areas of the largest squares that can be inscribed within each circle.

I had to read that many times before it sank in. The key to achieve something snappy is to really work through the maths until it is as compact as possible, which, if you’re anything like me, you didn’t bother to do.

My attempt was:

      p4 ← {(○2*⍨⍵÷2)-2÷⍨⍵*2}

but there are much neater solutions if you did your homework. Here’s one that no-one found:

      p4 ← (○-+⍨)4÷⍨×⍨

and a nice explicit version:

      p4 ← {⍵×⍵×0.5-⍨○÷4}

which can be derived from this simplified mathematical expression, suggested by Rodrigo:


Explanation: The area of the circle is ○r*2, which is ○(⍵÷2)*2, in turn equivalent to ⍵×⍵×○÷4. The area of the square [ABCD] is twice the area of the triangle [ABC]. Given that the area of the triangle is 0.5×⍵×⍵÷2, the area of the square becomes 0.5×⍵×⍵. Putting both together, we get (⍵×⍵×○÷4)-⍵×⍵×0.5, the same as ⍵×⍵×(○÷4)-0.5, which is ⍵×⍵×0.5-⍨○÷4.

Square inside a circle with its diagonal as the circle's diameter

Problem 5: Rect-ify

For this problem, we were asked to plant a number of trees in a rectangular pattern with complete rows and columns, meaning all rows have the same number of trees. That rectangular pattern also needed to be as “square as possible”, meaning there is a minimal difference between the number of rows and columns in the pattern.

Here’s a smart solution, based on the observation that the “most square” choice must have one factor being the largest factor less than or equal to the square root:

      p5 ← {N,⍵÷1⌈N←⌈/0,⍵∨⍳⌊⍵*÷2}

This solution works well on large numbers of trees, too:

      p5 98776512304
280888 351658

Someone even offered up a recursive solution:

      p5rec ← {⍵=0:2⍴0 ⋄ ⍵ {0=⍵|⍺: ⍵,⍺÷⍵ ⋄ ⍺∇⍵-1} ⌊⍵*÷2}

So is one solution better than the other? Well, they both work correctly, but one is a lot faster than the other. Do you want to guess which was faster before we test it?

      'cmpx'⎕CY'dfns'
      cmpx 'p5 98776512304' 'p5rec 98776512304'
  p5 98776512304    → 8.7E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
  p5rec 98776512304 → 6.1E¯3 | -94% ⎕⎕⎕

Surprised? I was! So, what is going on here? The non-recursive solution relies on a rather crude way to find the factors, which is a fairly large number to factorise even if it only needs to go up to the square root. The recursive version just tries each number in turn, up to the square root.

Can we be even smarter? This version was offered up by APL Orchard regular @rak1507:

      p5rak1507 ← {a,⍵÷1⌈a←⊃⌽⍸0=⍵|⍨⍳⌊⍵*.5} ⍝ @rak1507
      cmpx 'p5 98776512304' 'p5rec 98776512304' 'p5rak1507 98776512304'
  p5 98776512304        → 8.7E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
  p5rec 98776512304     → 6.3E¯3 | -93% ⎕⎕⎕                                     
  p5rak1507 98776512304 → 4.2E¯3 | -96% ⎕⎕

Basically, (⊢∨⍳) is neat as a code-golf trick, but not great in terms of efficiency.

Problem 6: Fischer Random Chess

According to Wikipedia, Fischer random chess is a variation of the game of chess invented by former world chess champion Bobby Fischer. Fischer random chess employs the same board and pieces as standard chess, but the starting position of the non-pawn pieces on the players’ home ranks is randomised, following certain rules.

White’s non-pawn pieces are placed on the first rank according to the following rules:

  • the Bishops must be placed on opposite-colour squares
  • the King must be placed on a square between the rooks.

The task was to write a function that verifies that a given board placement is valid according to these rules.
This was my solution for this:

      q6 ← {(1=+/(⍵⍳'K')>⍸'R'=⍵)∧1=+/2|⍸'B'=⍵}

but there was a lot of variety in the solutions submitted to this problem. For example:

      q6i  ← {≠/2|⍸'B'=⍵}∧'RKR'≡∩∘'RK'        ⍝ Intersection
      q6ii ← {(≠/2|⍸'B'=⍵)∧1=(⍸'R'=⍵)⍸⍵⍳'K'}  ⍝ Interval index
      q6w  ← {(≠/2|⍸'B'=⍵)∧≠/(⍸'K'=⍵)<⍸'R'=⍵} ⍝ Where (similar to mine above)

The last version there is very amenable to Over:

            q6over ← {I←⍸=∘⍵ ⋄ (2|I'B')∧⍥(≠/)'K'<⍥I'R'}

And for masochists, there is always the famous Progressive Dyadic Iota:

      pd ← {((⍴⍺)⍴⍋⍋⍺,⍵)⍺⍺(⍴⍵)⍴⍋⍋⍵,⍺}
      q6pdi ← {(∧/2\</⍵⍳pd'RKR')∧≠/2|⍵⍳pd'BB'}

Problem 7: Can You Feel the Magic?

A square matrix is ‘magic’ if all of its rows and columns and both diagonals sum to the same number.

One hero came up with the following:

      q7 ← (∧/2=/∘∊+/,(+/1 2 2∘⍉))⍉,[0.5]⌽

Here is how it works:

      q7 magic←⎕←3 3⍴4 9 2 3 5 7 8 1 6
4 9 2
3 5 7
8 1 6
1
      q7 nonmagic←⎕←3 3⍴4 9 2 7 5 3 8 1 6
4 9 2
7 5 3
8 1 6
0

The problem statement suggested that dyadic transpose might come in handy, but that’s just showing off! So, how does it work? It’s certainly tacit:

              q7            ⍝ Ouch...
    ┌─────────┴─────────┐
  ┌─┴─┐               ┌─┼─────┐
  / ┌─┼──────┐        ⍉ [0.5] ⌽
┌─┘ 2 ∘    ┌─┼───┐    ┌─┘
∧    ┌┴┐   / , ┌─┴──┐ ,
     / ∊ ┌─┘   /    ∘
   ┌─┘   +   ┌─┘ ┌──┴──┐
   =         +   1 2 2 ⍉

The fork ⍉,[0.5]⌽ takes the argument matrix – a square array of rank-2, shape A A – and returns an array of rank-3, shape 2 A A, where the first cell is the transposed original array and the second is the original array with its rows reversed:

      ]display (⍉,[0.5]⌽) magic
┌┌→────┐
↓↓4 3 8│
││9 5 1│
││2 7 6│
││     │
││2 9 4│
││7 5 3│
││6 1 8│
└└~────┘

We only need to know about the main diagonal of each cell; as you can see, the main diagonal in the second cell is the reverse diagonal of the first cell. We can extract both diagonals with a single dyadic transpose:

      1 2 2⍉(⍉,[0.5]⌽) magic
4 5 6
2 5 8

The same result can be achieved using slightly less showy instead, which has the same byte count but is a little easier to understand when first seen:

      1 1⍉⍤2(⍉,[0.5]⌽) magic ⍝ Diagonals of each major cell love ⍤
4 5 6
2 5 8

The remaining part of the tacit formulation untangles easily. Impressive and creative.

Here’s another good one that is slightly shorter:

      q7 ← (1=≢∘∪)⍉+⌿⍤,⍥(1 1∘⍉,⊢)⌽
      q7 magic
1
      q7 nonmagic
0

How does that work? The phrase 1 1∘⍉,⊢ prepends the diagonal as a column to the argument matrix:

      (1 1∘⍉,⊢)magic ⍝ Explicit: {(1 1⍉⍵),⍵}
4 4 9 2
5 3 5 7
6 8 1 6

Clever application of says “take the matrix, append its reverse over the diagonal-append operation”:

      {⍵,⍥{(1 1⍉⍵),⍵}⌽⍵} magic ⍝ love ⍥
4 4 9 2 2 2 9 4
5 3 5 7 5 7 5 3
6 8 1 6 8 6 1 8

We can emphasise the location of the diagonals by using Partitioned Enclose to make them stand out a bit:

      1 1 0 0 1 1 0 0 ⊂ {⍵,⍥{(1 1⍉⍵),⍵}⌽⍵} magic
┌─┬─────┬─┬─────┐
│4│4 9 2│2│2 9 4│
│5│3 5 7│5│7 5 3│
│6│8 1 6│8│6 1 8│
└─┴─────┴─┴─────┘

Summing along the leading axis gives:

      {+⌿⍵,⍥{(1 1⍉⍵),⍵}⌽⍵} magic
15 15 15 15 15 15 15 15

Finally, check that all items are equal:

      {1=≢∪+⌿⍵,⍥{(1 1⍉⍵),⍵}⌽⍵} magic ⍝ Length of vector of unique values = 1?
1

In summary, there are two things to note here: using to get both diagonals and the use of 1=≢∘∪ to check that all items are equal. If you attended the APL Seeds ’21 conference last March, you’ll recognise this as one of the many ways of solving this problem that Conor Hoekstra presented – see https://dyalog.tv/APLSeeds21/?v=GZuZgCDql6g to watch his presentation.

Any solution that makes use of both of my favourite glyphs ( and ) is a winner in my book.

Problem 8: Time to Make a Difference

Write a function that:

  • has a right argument that is a numeric scalar or vector of length up to 3, representing a number of [[[days] hours] minutes] – a single number represents minutes, a 2-element vector represents hours and minutes, and a 3-element vector represents days, hours, and minutes
  • has a similar left argument, although not necessarily the same length as the right argument
  • returns a single number representing the magnitude of the difference between the arguments in minutes.

Here’s a cool version (several submissions were similar):

      p8 ← |-⍥(1 24 60⊥¯3∘↑)

Nothing too mysterious here. A slight complication is the need to handle a right argument that can be a scalar or a vector of length 2 or 3. The decode function expects the argument vector to always be length 3, so we use the take function, dyadic , with ¯3 as the left argument to ensure that the argument is always a vector of the correct length, padding from the left with zeros as required. The mixed radix vector 1 24 60 as the left argument to decode converts to minutes.

Problem 9: In the Long Run

Write a function that:

  • has a right argument that is a numeric vector of 2 or more elements representing daily prices of a stock
  • returns an integer singleton that represents the highest number of consecutive days where the price increased, decreased, or remained the same, relative to the previous day.

I’d like to compare and contrast two solutions, neither of which are tacit for a change:

      p9a ← {≢⍉↑⊂⍨1,2≠/×2-/⍵}
      p9b ← {⌈/¯2-/0,⍸1,⍨2≠/×2-/⍵} ⍝ Flat efficiency

Starting with the first of the two (p9a), from the right, we use a windowed difference reduction to calculate pairwise differences:

      2-/1 3 5 6 6 6 6 6 3 2 1
¯2 ¯2 ¯1 0 0 0 0 3 1 1

and then apply the direction function, monadic ×, to turn this into a vector of ¯1, 0 and 1 if the corresponding item is negative, zero or positive respectively:

      ×2-/1 3 5 6 6 6 6 6 3 2 1
¯1 ¯1 ¯1 0 0 0 0 1 1 1

Another pairwise windowed reduction, this time with , gives us the points of change:

      2≠/×2-/1 3 5 6 6 6 6 6 3 2 1
0 0 1 0 0 0 1 0 0

Prepending a 1, this Boolean vector can be used as the left argument to partitioned enclose, ; a common pattern. But what of the right argument? We can use the same vector as the right argument by using a clever commute, :

      ⊢m←⊂⍨1,2≠/×2-/1 3 5 6 6 6 6 6 3 2 1 ⍝ Commute to use the same argument left and right
┌─────┬───────┬─────┐
│1 0 0│1 0 0 0│1 0 0│
└─────┴───────┴─────┘

What remains is to find the longest cell in this vector. We could do ⌈/≢¨, but instead this submission found the length of the transpose-mix:

      ≢⍉↑ m
4

A code-golfer’s trick shot, perhaps, and somewhat dubious in terms of efficiency, but certainly cute. If you don’t see why it works, work it through right to left!

The second solution (p9b) uses a lot of the same ideas, but this time we add a 1 to the end of the points-of-change vector:

      {1,⍨2≠/×2-/⍵}1 3 5 6 6 6 6 6 3 2 1
0 0 1 0 0 0 1 0 0 1

and use where, monadic , to get the indices, prepending a 0 so that we can calculate the length of each segment:

      {0,⍸1,⍨2≠/×2-/⍵}1 3 5 6 6 6 6 6 3 2 1
0 3 7 10

The pairwise difference now represents the length of each segment, and by using a negative window we can commute each pair to get a positive number out for each pair:

      {¯2-/0,⍸1,⍨2≠/×2-/⍵}1 3 5 6 6 6 6 6 3 2 1
3 4 3

and so, for the maximum:

      {⌈/¯2-/0,⍸1,⍨2≠/×2-/⍵}1 3 5 6 6 6 6 6 3 2 1
4

Shall we race them? Of course!

      data ← 10000?10000
      cmpx 'p9a data'
  p9a data → 2.7E¯4 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
  p9b data → 2.1E¯5 | -92% ⎕⎕⎕

The second version is faster for several reasons. We suspected already that the ‘cute’ way to find the longest vector in a nested vector was likely to be slow, as it has to create a huge matrix first, chasing pointers. The second version uses flat numeric vectors throughout, and cuts the work considerably by using where initially to do length calculations on the shorter vector of indices. Flat is fast.

Problem 10: On the Right Side

Write a function that:

  • has a right argument T that is a character scalar, vector or vector of character vectors/scalars
  • has a left argument W that is a positive integer specifying the width of the result
  • returns a right-aligned character array R of shape ((2=|≡T)/≢T),W meaning that R is one of the following:
    • a W-wide vector if T is a simple vector or scalar
    • a W-wide matrix with the same number rows as elements of T if T is a vector of vectors/scalars
  • if an element of T has length greater than W, truncate it after W characters.

The last point is perhaps a bit misleading, but the intention is clear from one of the examples given:

      ⍴⎕←8 (your_function) 'Longer Phrase' 'APL' 'Parade'
r Phrase
     APL
  Parade
3 8

In this case, “truncate after W characters” means “remove from the left”.
Conceptually, we need to (over)take W characters from the right of each element and mix that into a rank-2 array. To make it work for the edge cases, we should ensure that we can always treat the right argument as a vector of character vectors, using nest, monadic . This works because if we take more characters than the vector contains, it gets padded using a character-vector’s prototype element, a space.

            8 {↑(-⍺)↑¨⊆⍵} 'Longer Phrase' 'APL' 'Parade'
r Phrase
     APL
  Parade

An equivalent tacit formulation would be:

            8 (↑-⍤⊣↑¨⊆⍤⊢) 'Longer Phrase' 'APL' 'Parade'
r Phrase
     APL
  Parade

Here’s a slight variation:

            8 {⌽⍉⍺↑⍉↑⌽¨⊆⍵}'Longer Phrase' 'APL' 'Parade'
r Phrase
     APL
  Parade

This starts by reversing each cell, then applies a mix and transpose. We then take items from the left before backing out of the transpose and reverse by applying them again.
It can be done in a flatter manner, too:

            8{⍉(-⍺)↑⍉(⊆⍵)⌽∘↑⍨(⊢-⌈/)≢¨⊆⍵} 'Longer Phrase' 'APL' 'Parade'
r Phrase
     APL
  Parade

If we flip the selfie and add a few spaces it gets a bit easier to see what’s going on:

            8 {⍉(-⍺)↑⍉ ((⊢-⌈/)≢¨⊆⍵) ⌽↑⊆⍵} 'Longer Phrase' 'APL' 'Parade'
r Phrase
     APL
  Parade

From the right, we turn our input into a character array and then Rotate each row by its length minus the length of the longest row, which implements the right alignment:

            {((⊢-⌈/)≢¨⊆⍵) ⌽↑⊆⍵} 'Longer Phrase' 'APL' 'Parade'
Longer Phrase
          APL
       Parade

What remains is the truncation, which follows similar lines to the earlier versions.

For completeness we can race a couple of these variants. Let’s generate a chunkier data set first: 10,000 random strings of varying lengths up to 50:

      data←{⎕A[?(?50)⍴26]}¨⍳10000      'cmpx'⎕CY'dfns'
      cmpx '20{↑(-⍺)↑¨⊆⍵}data' '20{⍉(-⍺)↑⍉(⊆⍵)⌽∘↑⍨(⊢-⌈/)≢¨⊆⍵}data' '20{⌽⍉⍺↑⍉↑⌽¨⊆⍵}data'
  20{↑(-⍺)↑¨⊆⍵}data                 → 9.3E¯4 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕             
  20{⍉(-⍺)↑⍉(⊆⍵)⌽∘↑⍨(⊢-⌈/)≢¨⊆⍵}data → 9.0E¯4 |  -3% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕              
  20{⌽⍉⍺↑⍉↑⌽¨⊆⍵}data                → 1.4E¯3 | +46% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

The complex, flat version wins, but not by a significant amount.

With that we’ve reached the end. A nice set of problems, with a lot of creative solutions submitted. Watch this space for a review of the Phase II problems…

Highlights of the 2020 Problem Solving Competition – Phase II

With Dyalog’s APL Problem Solving Competition 2021 in full swing, it’s time to highlight some of the excellent solutions that were submitted to last year’s edition.

Stefan Kruger works for IBM making databases. While he tries to learn at least one new programming language a year, he got hooked on APL and participated in the competition. This is his perspective on some solutions that the judges picked out – call it the “Judges’ Pick”, if you like; smart, novel, or otherwise noteworthy solutions that can serve as an inspiration.

This blog post is also available as an interactive Jupyter Notebook document.


By Stefan Kruger

I’ll show a cool solution or two to each Phase II problem and dive into the details of a couple. If you need to refresh your memory with what the problems looked like, there’s a PDF of the Phase II problems.

Oh, and note that at the time of writing there is still plenty time to take part in the current edition of the competition (and really, who knew bowling was so complicated?) – there are some juicy cash prices to be won.

Problem 1: Take a Dive (1 task)

Level of Difficulty: Low

So let’s kick off with problem 1. The task was to calculate the score of an Olympic dive, consisting of a technical difficulty rating and a vector containing either 3, 5 or 7 judges’ scores. Only the central three ordered judges’ scores should be considered, which should be summed and multiplied by the technical difficulty rating.

Here is a cunning trick that wasn’t at all obvious:

∇ score←dd DiveScore scores;sorted;cenzored;rotator
  ⍝ 2020 APL Problem Solving Competition Phase II
  ⍝ Problem 1, Task 1 - DiveScore
   
  sorted←{⍵[⍋⍵]}scores
   
  ⍝  0 1 2 rotates score indexes to 123, 23451 or 3456712
  ⍝  So three center values always goes first
  ⍝  51 = (0 1 2∧.= 3 5 7 ∘.|⍳100) ⍳ 1
  rotator←51
 
  cenzored←3↑rotator⌽sorted
  score←⍎2⍕dd+.×cenzored
∇
      2.9 2.6 2.7 DiveScore¨(7 7.5 6.5 8 8 7.5 7)(9.5 8 8.5)(7.5 7 7 8.5 8)
63.8 67.6 60.75

This contestant figured out that if a vector of length 3, 5 or 7 is rotated 51 steps, then the original central three items will always end up at the beginning. No, really. It turns out that 51 is the first number X such that 0 1 2≡3 5 7|X. They tabulated the options and picked the first solution, guessing that it’d be less than 100:

      ⍸0 1 2∧.=3 5 7∘.|⍳100
51

But there is another way – this is one of those situations where the Chinese Remainder Theorem comes in handy, especially since it’s available on APLcart:

      3 5 7 {m|⍵+.×⍺(⊣×⊢|∘⊃{0=⍵:1 0 ⋄ (⍵∇⍵|⍺)+.×0 1,⍪1,-⌊⍺÷⍵})¨⍨⍺÷⍨m←×/⍺} 0 1 2 ⍝ https://aplcart.info?q=chinese
51

If you figured that out, award yourself a well-deserved pat on the back. For us mortals, we probably all did something rather more pedestrian:

DiveScore ← {
    d ← 2-2÷⍨7-≢⍵       ⍝ How many items should we drop each side?
    ⍺+.×(-d)↓d↓⍵[⍋⍵]
}

Problem 2 – Another Step in the Proper Direction (1 task)

Level of Difficulty: Medium

Problem 2 builds upon Problem 5 from Phase I. In short, we are asked to write a function Steps that takes a two-element vector to the right, defining a start and end value, and an optional left integer argument that tweaks how we generate values from start to end. The complexity here comes from the many combinations of behaviours from what exactly is given as the left argument: integer or float? positive or negative? Also, the range must be inclusive, even if a floating-point step size means that the end point is overshot. I took this on thinking it would be trivial – it wasn’t.

Here’s a great solution that manages to combine this functionality with a call to a single dfn:

∇ steps←{p}Steps fromTo;segments;width
  width ← |-/fromTo
  :If 0=⎕NC'p' ⍝ No left argument: same as Problem 5 of Phase I
      segments ← 0,⍳width
  :ElseIf p0 ⍝ p is the step size
      segments ← p {⍵⌊⍺×0,⍳⌈⍵÷⍺} width
  :ElseIf p=0 ⍝ As if we took zero step
      segments ← 0
  :EndIf
  ⍝ Take into account the start point and the direction.
  steps ← fromTo {(⊃⍺)+(-×-/⍺)×⍵} segments
∇

I ended up with something more convoluted, with a few ugly special cases, and shamelessly borrowing from dfns.iotag:

Steps ← {
    range ← {
        r ← ⍺-s×⎕IO-⍳⌊1-(⍺-⊃⍵)÷s←×/1↓⍵,(⍺>⊃⍵)/¯1 ⍝ "inspired" by dfns.iotag
        (⊃⍵)≠⊃⊖r: r,⊃⍵ ⋄ r   ⍝ Ensure endpoint is included – yeuch :(
    }
    ⍺ ← ⍬
    (b e) ← ⍵
    ⍺≡⍬: b range e        ⍝ No ⍺
    ⍺=0: b                ⍝ Zero step; return start point
    ⍺>0: b range e ⍺      ⍝ Positive ⍺
    len ← (e-b)÷count←⌊-⍺ ⍝ Negative ⍺
    len=0: b/⍨1+count     
    b range e len
}

Problem 3 – Past Tasks Blast (1 task)

Level of Difficulty: Medium

The task here was to scrape the Dyalog APL Problem Solving Competition webpage to extract all links to PDF files. We get the suggestion to use either Dyalog’s HttpCommand or shell out to a system mechanism for fetching a web page.

To use HttpCommand, we first need to load it:

      ]load HttpCommand
#.HttpCommand

Here’s a slightly tweaked competition submission, showing great flair in how to process XML:

PastTasks ← {
    url ← ⍵
    r ← (HttpCommand.Get url).Data  ⍝ get page contents
    (d n c a t) ← ↓⍉⎕XML r          ⍝ depth; name; content; attributes; type
    (k v) ← ↓⍉ ⊃⍪/ ((,'a')∘≡¨n)/a   ⍝ extract key-value pairs of <a> elements
    urls ← ('href'∘≡¨k)/v           ⍝ get URLs
    pdfs ← ('.pdf'∘≡¨¯4↑¨urls)/urls ⍝ filter .pdfs
    base ← ⊃⌽⊃('base'∘≡¨n)/a        ⍝ base URL
    base∘,¨pdfs
}

The problem statement suggests that a regex-based solution might be tolerable. Here’s a stab at that approach:

PastTasks ← {
    body ← (HttpCommand.Get ⍵).Data
    pdfs ← '<a href="(.+?\.pdf)"'⎕S'\1'⊢body
    base ← '<base href="(.+?)"'⎕S'\1'⊢body
    base,¨pdfs
}

So which is the “better” solution? Well, the first approach has a number of advantages: firstly, is much more robust (provided that the web page is valid XHTML, which we are told is a given), meaning that we can abdicate responsibility for dealing with markup quirks (single vs double quotes, whitespace etc) to the built-in ⎕XML system function, and secondly, there is that (in)famous quote from Jamie Zawinski:

Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems. – jwz

Mixing in a liberal helping of regular expressions in with APL is perhaps not helping APL’s unfair reputation for being write-only.

However, when dealing with patterns in textual data, as we unquestionably are here, regular expressions – even in a powerful language like APL – are sharp tools that are hard to beat, and any programmer worth their salt owes it to themselves to master them. In the case above, had the data not neatly been parseable as XML, it would have been more awkward to solve a problem like this relying only on APL primitives.

Problem 4 – Bioinformatics (2 tasks)

Level of Difficulty: Medium

The two tasks making up Problem 4 are borrowed from Project Rosalind, which is a Bioinformatics problem collection that often has great APL affinity:

and a hint that one benefits from understanding modular multiplication, as this isn’t built into Dyalog APL.

Here is a great example:

revp ← {                    ⍝ r ← revp dna
   dnaNum ← 'ACGT'⍳⍵        ⍝ Convert to 1..4 so that A+T = C+G = 5
   FindRevp ← {             ⍝ Given chunk size, extract positions and build the output format
       chunks ← ⍵,/dnaNum
       isRevp ← (⊢≡5-⌽)¨chunks
       ⍵,⍨⍪⍸isRevp
   }
   ⊃⍪/FindRevp¨4 6 8 10 12  ⍝ Test against all chunk sizes and collect results
}
sset ← {          ⍝ r←sset n
   bin ← 2⊥⍣¯1⊢⍵  ⍝ Binary digits
   arr ← ⌽2*bin   ⍝ Repeated squaring: Starting from MSB and 1, square ⍵, multiply ⍺, modulo m
   mod ← 1000000
   {mod|⍺×⍵*2}/arr,1
}

This contestant also saw fit to include their test suite; a nice touch! Roger Hui’s version of assert has become the de facto standard, and the contestant puts it to good use:

Assert ← {⍺←'assertion failure' ⋄ 0∊⍵:⍺ ⎕SIGNAL 8 ⋄ shy←0} ⍝ Roger Hui's Assert
RevpTest ← {
   s ← 'TCAATGCATGCGGGTCTATATGCAT'
   ans ← revp s
   Assert 8 2≡⍴ans:
   Assert 5 4 7 4 17 4 18 4 21 4 4 6 6 6 20 6≡∊ans:
  
   header ← 'Contest2020/Data/'  ⍝ Change as needed
   data1 ← ∊1↓⊃⎕NGET (header,'rosalind_revp_1_dataset.txt') 1
   ans1 ← ↑⍎¨⊃⎕NGET (header,'rosalind_revp_1_output.txt') 1
   data2 ← ∊1↓⊃⎕NGET (header,'rosalind_revp_2_dataset.txt') 1
   ans2 ← ↑⍎¨⊃⎕NGET (header,'rosalind_revp_2_output.txt') 1
   Assert ans1 ≡ revp data1:
   Assert ans2 ≡ revp data2:
   'Test passed'
}
SsetTest ← {
   Assert 8 = sset 3:
   Assert 551872 = sset 857:
   Assert 935424 = sset 870:
   'Test passed'
}

Problem 5 – Future and Present Value (2 tasks)

Level of Difficulty: Medium

Problem 5 is some hedge fund maths, or something where my eyes glazed over before I fully understood the ask. What is this, K‽

This solution is impressively compact – I removed the comments to highlight the APL artistry on display: no less than three scans, count ’em!

rr ← {AR×+\⍺÷AR←×\1+⍵} 
pv ← {+/⍺÷×\1+⍵}

Here’s how the competitor outlined how their solution works:

This can be calculated elegantly with the following operations:

  1. Find the accumulated interest rate (AR) for each term (AR←×\1+⍵).
  2. Deprecate the cashflow amounts by dividing them by AR. This finds the present value of all the amounts.
  3. Accumulate all the present values of the amounts to find the total present value at each term.
  4. Multiply by AR to find future values at each term.

This way the money that was invested or withdrawn in a term is not changed for that term, but the money that came from the previous terms is multiplied by the current interest rate for each term arriving to the correct recurrent relation:

Step 2) amounts[i]/AR[i] ⍝ ≡ PV[i]
Step 3) amounts[i]/AR[i] + APV[i-1]
Step 4) amounts[i] + APV[i-1]×AR[i]
amounts[i] + APV[i-1]×AR[i-1]×(1+rate[i])
amounts[i] + r[i-1]×(1+rate[i]) ⍝ ≡ r[i]

Problem 6 – Merge (1 task)

Level of Difficulty: Medium

Mail merge – gotta love it. Your spam folder is full of bad examples of this: “Dear $FIRSTNAME, do you want to purchase a bridge?” We’re given a template file with patterns such as @firstname@ which are to be replaced with values stored in a JSON file. Here’s a smart approach from a competitor who knows their way around the @ operator:

Merge ← {
   templateFile ← ⍺
   jsonFile ← ⍵
   template ← ⊃⎕NGET templateFile
   ns ← ⎕JSON⊃⎕NGET jsonFile

   getValue ← {
       0=⍴⍵:,'@'   ⍝ '@@'         → ,'@'
       6::'???'    ⍝ ~⍵∊ns.⎕NL ¯2 → '???'
       ⍕ns⍎⍵       ⍝  ⍵∊ns.⎕NL ¯2 → ⍕ns.⍵
   }
   ∊getValue¨@(⍴⍴1 0⍨)'@'(1↓¨=⊂⊢)template
}

The key insight here is that since each template starts and ends with the same marker, we can partition the data on sections beginning with @ and then we’ll have a vector where every other element is a template to be substituted. Here’s an example of this:

      ↑('@'(1↓¨=⊂⊢) '@title@ @firstname@ @lastname@, would you be interested in the Brooklyn Bridge?') (1 0 1 0 1 0)
┌─────┬─┬─────────┬─┬────────┬─────────────────────────────────────────────────┐
│title│ │firstname│ │lastname│, would you be interested in the Brooklyn Bridge?│
├─────┼─┼─────────┼─┼────────┼─────────────────────────────────────────────────┤
│1    │0│1        │0│1       │0                                                │
└─────┴─┴─────────┴─┴────────┴─────────────────────────────────────────────────┘

I added the second row for clarity to show the alternating templates. Cool, huh? However, this only works correctly if the data leads with a template. Consider:

      '@'(1↓¨=⊂⊢) 'Dear @firstname@ @lastname@, or maybe the Golden Gate?'
┌─────────┬─┬────────┬───────────────────────────┐
│firstname│ │lastname│, or maybe the Golden Gate?│
└─────────┴─┴────────┴───────────────────────────┘

We still have the alternating templates, but the prefix (Dear ) is lost. We can tweak the Merge function a bit to cater for this if we need to:

Merge ← {
    templateFile ← ⍺
    jsonFile ← ⍵
    template ← ⊃⎕NGET templateFile
    ns ← ⎕JSON⊃⎕NGET jsonFile
    first ← templ⍳'@'
    first>≢templ: templ    ⍝ No templates at all
    prefix ← first↑templ   ⍝ Anything preceding the first '@'?

    getValue ← {
        0=⍴⍵:,'@'   ⍝ '@@'         → ,'@'
        6::'???'    ⍝ ~⍵∊ns.⎕NL ¯2 → '???'
        ⍕ns⍎⍵       ⍝  ⍵∊ns.⎕NL ¯2 → ⍕ns.⍵
    }
    ∊prefix,getValue¨@(⍴⍴1 0⍨)'@'(1↓¨=⊂⊢)template
}

Now, the competition is pitched such that “proper array solutions” are preferred – and for good reasons, most of the time. However, it’s hard to overlook some industrial regex action in this case. Strictly for Perl-fans:

Merge ← {
    mrg ← ⎕JSON⊃⎕NGET ⍵
    keys ← mrg.⎕NL¯2
    vals ← mrg.⍎¨keys

    ('@',¨(keys,'' '[^@]+'),¨'@')⎕R((⍕¨vals),'@' '???')⊃⎕NGET ⍺
}

Problem 7 – UPC (3 tasks)

Level of Difficulty: Medium

Problem 7 had us learning more about bar codes than we ever thought necessary. Read them, write them, verify them, scan them – forwards and backwards no less. Good scope for stretching your array muscles on this one. The eagle-eyed amongst you may have spotted that the verification aspect is a simplified version of Luhn’s algorithm, which a certain Morten Kromberg used to illustrate APL’s array capabilities at JIO a while back.

Here’s a good solution:

CheckDigit ← (10|∘-+.×∘(11⍴3 1))          ⍝ Computes the check digit for a UPC-A barcode.

UPCRD ← 114 102 108 66 92 78 80 68 72 116 ⍝ Right digits of a UPC-A barcode, base 10.
bUPCRD ← ⍉2∘⊥⍣¯1⊢UPCRD                    ⍝ Bit matrix with one right digit per row.
WriteUPC ← {
   ⍝ Writes the bits of a UPC-A barcode.  
   ~((11∘=≢)∧(∧/0∘≤∧≤∘9))⍵: ¯1            ⍝ Check for simple errors
   b ← bUPCRD[⍵,CheckDigit ⍵;]  
   1 0 1, (,~6↑b), 0 1 0 1 0, (,6↓b), 1 0 1 
}
ReadUPC ← {
   ⍝ Reads a UPC-A barcode into its digits.
   ~(∧/0∘≤∧≤∘1)⍵: ¯1                 ⍝ Input isn't a bit vector
   95≠≢⍵: ¯1                         ⍝ Number of bits must be 95
   (b l m r e) ← ⍵ ⊂⍨ (∊¯1∘↓,⌽) (3↑1)(42↑1)(5↑1)
   
   b ∨⍥(≢∘1 0 1) e: ¯1               ⍝ Wrong patterns for the guards
   m≢0 1 0 1 0: ¯1
   bits ← ↓12 7⍴ l,r
   C ← (↓bUPCRD)∘⍳ ~@(⍳6)            ⍝ Convert bits to digits
   tf ← ~∧/10 > nums ← C bits        ⍝ Should we try flipping the bits?
   nums ← (nums×1-tf) + tf×C⌽↓⌽↑bits
   ∨/10=nums: ¯1                     ⍝ Bits simply aren't right
   (¯1↑nums)≠CheckDigit 11↑nums: ¯1  ⍝ Bad check digit
   nums
}

Problem 8 – Balancing the Scales (1 task)

Level of Difficulty: Hard

Our task is to partition a set of numbers into two groups of equal sum if this is possible, or return if not. This is a well-known NP-complete problem called The Partition Problem and, as such, has no polynomial time exact solutions. The problem statement indicates that we only need to consider a set of 20 numbers or fewer, which is a bit of a hint on what kind of solution is expected.

This problem, in common with many other NP problems, also has a plethora of interesting heuristic solutions: polynomial algorithms that whilst not guaranteed to always find the optimal solution will either get close, or be correct for a significant subset of the problem domain in a fraction of the time the exact algorithms would take.

However, it’s clear that Dyalog expects us to give an exact solution, and has given us an upper bound on the input data length. Finally, we’re offered the cryptic advice that

Understanding the nuances of the problem is the key to developing a good algorithm.

Yes, thank you, master Yoda.

Here’s a great, efficient solution:

Balance←{
   sum←1⊥⍵
   2|sum: ⍬   ⍝ Lists with an odd sum cannot be split into equal parts.
   halfsum←sum÷2
  
   ⍝ A partitioning method based on the algorithm by Horowitz and Sahni.
   ⍝ The basic idea of the algorithm is to split the input into two parts,
   ⍝ and then generate all subset sums for these parts. Then the problem
   ⍝ becomes finding a sum of two subset sums from different parts
   ⍝ equal to the desired value. Instead of sorting the sums and comparing
   ⍝ them like in the original algorithm, standard APL searching primitives
   ⍝ ∊ and ⍳ are used. Another key idea is to generate the subset sums
   ⍝ in a specific order, so that the nth subset sum in the vectors a and b
   ⍝ is the sum of the elements chosen by the binary representation of n.
   ⍝ This means that we can get the elements of the solution sum
   ⍝ without having to generate anything but the sums.
   horowitzsahni←{
       s←⍵(↑{⍺⍵}↓)⍨⌊2÷⍨≢⍵                          ⍝ Split the input.
       a b←⊃¨(⊢,+)/¨s,¨0                           ⍝ Generate the subset sums.
       indexes←a {(⊢,⍵⍳⍺⌷⍨(≢⍺)⌊⊢)1⍳⍨⍺∊⍵} halfsum-b ⍝ Search for solution indexes.
       indexes[2]>≢b: ⍬
       ⍵ {(⍺/⍨~⍵)(⍵/⍺)} ∊(2⍴¨⍨≢¨s)⊤¨indexes-1      ⍝ Get the solution from the indexes.
   }
  
   ⍝ A simple exhaustive search. It uses the same binary representation
   ⍝ idea as the horowitzsahni function.
   exhaustive←{
       i←halfsum⍳⍨⊃(⊢,+)/⍵,0
       i>2*≢⍵: ⍬
       ⍵ {(⍺/⍨~⍵)(⍵/⍺)} (2⍴⍨≢⍵)⊤i-1
   }

   ⍝ The exhaustive method performs better than the Horowitz-Sahni method
   ⍝ for small input sizes. 14 seems to be a reasonable cutoff point.
   14>≢⍵: exhaustive ⍵
   horowitzsahni ⍵
}

There are a number of clever touches here – there are actually two different solutions, an exhaustive search and an implementation of the algorithm due to Horowitz and Sahni, which, although still exponential, is known to be one of the fastest for certain subsets and input sizes. A switch based on input size checks for the crossover point and chooses the fastest option. And this is fast – five times faster than that of the Grand Prize winner, and four orders of magnitude faster than the slowest solution.

Such a performance spread is intriguing, so there are clearly lessons to be learned here. When I tried this problem, I ended up with a pretty straight-forward (a.k.a. naive) brute force search:

Balance ← {⎕IO←0
    total ← +/⍵
    2|total: ⍬             ⍝ Sum must be divisible by 2
    psum ← total÷2         ⍝ Our target partition sum
    bitp ← ⍉2∘⊥⍣¯1⍳2*≢⍵    ⍝ All possible bit patterns up to ≢⍵
    idx ← ⍸<\psum=bitp+.×⍵ ⍝ First index of partition sum = target
    ⍬≡idx: ⍬               ⍝ If we have no 1s, there is no solution
    part ← idx⌷bitp        ⍝ Partition corresponding to solution index
    (part/⍵)(⍵/⍨~part)     ⍝ Compress input by solution pattern and inverse
}

If you come to APL from a scalar language, that approach must seem incredibly wasteful: make all bit patterns. Try all sums. Search for the right one, if it exists. But as it turns out, this is APL home turf advantage. Let’s try to demonstrate this point. If you did this “loop and branch”, you’d iterate over the bit patterns and stop once you find the first solution – in fact, for the test data in the problem specification, the first solution appears at around the 1500th bit pattern if you generate them as I do above. The vector version would need to consider the whole space of around

      ¯1+2*20
1048575

a million or so, so quite a difference. Surely, in this case the scalar approach should be way faster? Only one way to find out. We can make a scalar version in several ways – here’s the “Scheme” version:

BalanceScalar ← {⎕IO←0     ⍝ Warning: this is not the APL Way, as we shall see.
    total ← +/⍵
    2|total: ⍬             ⍝ Sum must be divisible by 2
    psum ← total÷2         ⍝ Our target partition sum
    data ← ⍵
    bitp ← ↓⍉2∘⊥⍣¯1⍳2*≢⍵   ⍝ Pre-compute the bit patterns
    {                      ⍝ Try one sum after the other, halt on first solution
        0=⍵: ⍬
        patt ← ⍵⊃bitp
        psum=patt+.×data: (patt/data)(data/⍨~patt) ⍝ Exit on first solution found
        ∇¯1+⍵
    } ¯1+≢bitp
}

Dyalog’s got game when it comes to tail call optimisation, right? OK, let’s race:

      'cmpx'⎕CY'dfns'
      d ← 10 81 98 27 28 5 1 46 63 99 25 39 84 87 76 85 78 64 41 93
      cmpx 'Balance d' 'BalanceScalar d'
  Balance d       → 2.7E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕            
* BalanceScalar d → 3.9E¯2 | +43% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

Vectorisation, Boolean vectors and primitive functions wins the day. We didn’t go completely scalar, to be fair, as we still pre-computed all the binary patterns.

But back to the task at hand – let’s pit ourselves against the intellectual might of Horowitz and Sahni:

horowitzsahni←{
    sum←1⊥⍵
    2|sum: ⍬   ⍝ Lists with an odd sum cannot be split into equal parts.
    halfsum←sum÷2
    s←⍵(↑{⍺⍵}↓)⍨⌊2÷⍨≢⍵                          ⍝ Split the input.
    a b←⊃¨(⊢,+)/¨s,¨0                           ⍝ Generate the subset sums.
    indexes←a {(⊢,⍵⍳⍺⌷⍨(≢⍺)⌊⊢)1⍳⍨⍺∊⍵} halfsum-b ⍝ Search for solution indexes.
    indexes[2]>≢b: ⍬
    ⍵ {(⍺/⍨~⍵)(⍵/⍺)} ∊(2⍴¨⍨≢¨s)⊤¨indexes-1      ⍝ Get the solution from the indexes.
}
      cmpx 'horowitzsahni d' 'Balance d' 'BalanceScalar d'
  horowitzsahni d → 4.7E¯5 |      0%                                         
* Balance d       → 2.8E¯2 | +59266% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕            
  BalanceScalar d → 4.0E¯2 | +84466% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

Ouch! Well, told you my exhaustive search was naive. An impressive performance from the competitor – but also an impressive performance from Dyalog APL – even my knocked up exhaustive search runs in a pretty decent 25–30ms or so, about half the time of my shoddy Python attempt (although out-speeding Python is a low bar). I’m keeping the above implementation of Horowitz/Sahni handy for next edition of Advent of Code, where this problem always seems to crop up in some shape or form.

Problem 9 – Upwardly Mobile (1 task)

Level of Difficulty: Hard

And so for the final question. We were offered strong hints that a neat array-oriented solution might not be possible, but that the judges were prepared to be proven wrong.

Here’s a nicely compact, recursive solution:

∇ weights ← Weights filename;diag;FindWeights;start
    diag ← ↑(≠∘(⎕UCS 10)⊆⊢)⊃⎕NGET filename
    FindWeights ← {
        '┌┐│'∊⍨⊃⍵: ∇1↓⍵                    ⍝ if on any of these, go down        
        ⎕A∊⍨⊃⍵: ⎕A=⊃⍵                      ⍝ if on a letter, give weights
        r_disp ← '┐'⍳⍨0⌷⍵                  ⍝ otherwise, (i.e. on '┴'), find the displacement of right branch,
        l_disp ← -1+'┌'⍳⍨⌽0⌷⍵              ⍝ ...and the left branch
        wts ← ↑(∇r_disp⌽⍵)(∇l_disp⌽⍵)      ⍝ recurse,
        +⌿wts×[0]⌽(+/wts)×r_disp (-l_disp) ⍝ ...and calculate new weights
    }
    start ← diag⌽⍨⍸'┴│'∊⍨0⌷diag            ⍝ starting position attained by ⌽'ing to '┴' or '│'
    weights ← (~∘0÷∨/)FindWeights start    ⍝ remove 0s and get lowest weights
∇

Finally, someone took the suggestion that an array-based solution might not be possible as a personal challenge and produced the following:

Weights ← {
    m  ← ↑(⎕UCS 10)(≠⊆⊢)⊃⎕NGET ⍵ ⍝ no empty lines midway through so this is fine
    fm ← m='┴'               ⍝ fulcrum mask
    ER ← {+\1-⍵\¯2-⌿0⍪⍸⍵}    ⍝ distance to closest 1 to the left
      
    wa ← +/,m∊⎕A             ⍝ weight amount
    wi ← (⍳wa)@{⍵} m∊⎕A      ⍝ weight indexes
    fa ← +/,fm               ⍝ fulcrum amount
    fir← wa + ⍳fa            ⍝ fulcrum indexes (reduced)
    fi ← fir@{⍵} fm          ⍝ fulcrum indexes
    ai ← fi+wi               ⍝ all indexes
    ai+← ⍉(m∊'┌┐') {⍺\⍵/⍨⍵≠0}⍤1⍥⍉ 0@1⊢ai ⍝ extend indexes upwards to the ┌┐s that need them (exclude top ┴ as it isn't matched)
      
    ld ←  ER⍤1⊢ m='┌'        ⍝ distance to left
    rd ← ⌽ER⍤1⌽ m='┐'        ⍝ distance to right
    xp ← (⍴m)⍴⍳2⊃⍴m          ⍝ x position
    fml← ↓fm                 ⍝ fulcrum mask & its lines
    ail← ↓ai                 ⍝ all index lines
    GET← {⊃,/ail⌷⍨∘⊂¨fml/¨⍵} ⍝ get an item of ai for each fulcrum at x position ⍵
    lir← GET ↓xp-ld          ⍝ left indexes (reduced)
    rir← GET ↓xp+rd          ⍝ right indexes (reduced)
    ldr← fm /⍥, ld           ⍝ left distance (reduced)
    rdr← fm /⍥, rd           ⍝ right distance (reduced)
      
    in ← ↑⊃{(+/⍵[⍺])@(⊃⍺)⊢ ⍵}/ (↓⍉↑fir lir rir) , ⊂↓(⍳fa+wa)∘.=⍳wa ⍝ included weights for each index
    cf ← (ldr ×⍤¯1⊢ in[lir;]) - rdr ×⍤¯1⊢ in[rir;] ⍝ coefficients
    ws ← (1,(≢cf)⍴0) ⌹ ((2⊃⍴cf)↑1)⍪cf              ⍝ unscaled weights
    (⊢÷∨/) ws                                      ⍝ scale weights to integers
}

I take my hat off in admiration of the audacity: “An array solution might not be possible, eh? Hold my beer.”

So there we have it, a smörgåsbord of clever solutions to serve as an inspiration for us all. The 2020 edition of the competition sported a slightly simplified format where you were expected to tackle every problem instead of the approach in previous years where you had to make a subset selection from themed groups – this new approach remains for the current (2021) edition.

You are taking part, aren’t you?

Highlights of the 2020 APL Problem Solving Competition – Phase I

We received some excellent competition entries this year. Once again, thank you to all those who participated, congratulations to this year’s winners, and thank you to the Grand Prize Winner Andrii Makukha for his presentation at this year’s user meeting.

This post contains some suggested Phase I solutions along with some comments from the judges. Before each solution there is a brief summary of the problem and a link to the full problem on the practice problems site; you can also download the full set of problem descriptions as a PDF.

This page contains spoilers. The suggested solutions are not the only correct solutions, and may not necessarily be best practice depending on your application or most optimal in terms of performance, but they do solve the problems in some particularly interesting and elegant ways. If you’d like to try to solve the problems yourself before looking at these example solutions, you can use problems.tryapl.org to check your solutions.

1: Let’s Split

The first problem was to write a function splitting a vector right argument into a nested vector of vectors according to a signed integer left argument. For example:

      ¯3 (your_function) 'DyalogAPL'
┌──────┬───┐
│Dyalog│APL│
└──────┴───┘

Most entrants successfully solved this problem using dyadic take and drop .

{c←⍺+(⍺<0)×≢⍵ ⋄ (c↑⍵)(c↓⍵)}

It was common to use the left argument as-is with and , and swap the two parts of the result using ⌽⍣condition or condition⌽array, but the solution above avoids the swap by computing appropriate arguments to and .

Try it now with TryAPL

2: Character Building

In problem 2, we asked participants to partition a vector of UTF-8 character encodings, similarly to 'UTF-8'∘⎕UCS¨, without using ⎕UCS. For example:

      (your_function) 68 194 165 226 141 186 226 140 138 240 159 148 178 57   
┌──┬───────┬───────────┬───────────┬───────────────┬──┐
│68│194 165│226 141 186│226 140 138│240 159 148 178│57│
└──┴───────┴───────────┴───────────┴───────────────┴──┘

Eight participants submitted this (or a variation thereof):

{⍵⊂⍨1≠128 192⍸⍵}

Instead of doing multiple comparisons, this neatly uses interval index to check which range the argument is in. It then uses partitioned-enclose to create partitions beginning where code points are either below 128 or above 192.

Try it now with TryAPL

3: Excel-lent Columns

Problem 3 was simply to convert Microsoft Excel-style column letters to an integer. For example:

      (your_function) 'APL'
1104

Thirty-five participants submitted variations on this:

{26⊥⎕A⍳⍵}

While simple at first glance, it is actually quite involved because ⎕A⍳⍵ can give 26 (for Z) which isn’t a valid digit in base-26. However, decode handles out-of-bounds digits by carrying.

Try it now with TryAPL

4: Take a Leap

The task for problem 4 was to write a function to verify which of an array of integer years are leap years. For example:

      (your_function) 1900+10 10⍴⍳100
0 0 0 1 0 0 0 1 0 0
0 1 0 0 0 1 0 0 0 1
0 0 0 1 0 0 0 1 0 0
0 1 0 0 0 1 0 0 0 1
0 0 0 1 0 0 0 1 0 0
0 1 0 0 0 1 0 0 0 1
0 0 0 1 0 0 0 1 0 0
0 1 0 0 0 1 0 0 0 1
0 0 0 1 0 0 0 1 0 0
0 1 0 0 0 1 0 0 0 1

We had eight solutions like this one:

{0≠.=4 100 400∘.|⍵}

At first, it generates a 3-element vector showing whether the argument is divisible by 4, 100 or 400.

      0=4 100 400∘.|1900
1 1 0

The cleverness then is that ≠⌿ is used to express the logic of the leap-year algorithm. From Wikipedia:

if (year is not divisible by 4) then (it is a common year)
else if (year is not divisible by 100) then (it is a leap year)
else if (year is not divisible by 400) then (it is a common year)
else (it is a leap year)

We can check this with all possible length-3 boolean arguments:

      2⊥⍣¯1⍳¯1+2*3
0 0 0 1 1 1 1
0 1 1 0 0 1 1
1 0 1 0 1 0 1
      ≠⌿2⊥⍣¯1⍳¯1+2*3
1 1 0 1 0 0 1

Consider each case in turn:
1. Leap year, return 1
2. Can never occur
3. Not a leap year, return 0
4. Can never occur
5. Can never occur
6. Can never occur
7. Leap year, return 1

It is good because it uses no explicit loops and keeps intermediate values flat (no nesting). The solution leverages that each leap year rule is an exception to the previous one, and this particular formulation employs an unusual inner product ≠.= (equivalent to {≠/⍺=⍵} for vector arguments) to compute the parity of the divisibilities.

Try it now with TryAPL

5: Stepping in the Proper Direction

Problem 5 was to create a list generator somewhat similar to iota . However, this list generator takes a 2-element integer right argument and returns a list starting from the first integer and either increasing or decreasing in steps of 1 until the last integer inclusively. For example:

      (your_function) 4 ¯3
4 3 2 1 0 ¯1 ¯2 ¯3

Only one person had this exact solution, though many solutions were not too far off:

{(⊃⍵)-0,(××⍳∘|)-/⍵}

This dfn contains a 3-train or fork. Having seen contestants use the following format before, we feel compelled to provide you with a commented version of the above:

{
               -/⍵   ⍝ The length of the result is 1 more than the difference
           ⍳∘|)      ⍝ Integers up to the absolute difference
          ×          ⍝ times
        (×           ⍝ The sign of the difference
      0,             ⍝ Make the range inclusive
     -               ⍝ Use arithmetic to compute the correct result
 (⊃⍵)                ⍝ From the first value
                  }

Alternatively:

{
 (⊃⍵)                ⍝ From the first value
     -               ⍝ to
      0,             ⍝ inclusively
        (×           ⍝ The sign of...
          ×          ⍝ times
           ⍳∘|)      ⍝ Integers in the range of...
               -/⍵   ⍝ The difference
                  }    

This one excels in only computing necessary values once, and cleverly adjusts the generated values to rise or fall as needed, using the sign of the difference between the beginning and end points of the target range.

Try it now with TryAPL

6: Move to the Front

The task for problem 6 was to move all elements in the right argument vector equal to the left argument scalar to the start of that vector. For example:

      'a' (your_function) 'dyalog apl for all'
aaadylog pl for ll

Only one participant found this train, though two others submitted dfns using the same idea:

∩⍨,~⍨

Instead of computing indices or selecting elements, this simply employs two set functions, intersection and without ~. The asymmetry of intersection, namely that it preserves duplicates from its left argument, is here used to great advantage.

Try it now with TryAPL

7: See You in a Bit

Problem 7 involved writing a function to compare set bits in the base-2 representations of its integer arguments. For example:

      2 (your_function) 7   ⍝ is 2 in 7 (1+2+4)?
1

Eleven solutions used this method:

∧/(≤/2⊥⍣¯1,)

Indeed, the problem is about finding particular set bits in a binary number, hence the 2⊥⍣¯1. The overall function is a 2-train or atop, where the right-hand function is itself a 3-train or fork.

We can break it down as follows:

∧/             ⍝ Are all of (0 if any are *not*)
  (≤/          ⍝ Set left bits also set in right
     2⊥⍣¯1     ⍝ in The base-2 representation of
	      ,)   ⍝ The left and right arguments?

The function less than or equal to only returns 0 where a left bit is not found in the right argument:

      5((⊢,⍥⊂⍪⍤(≤/))2⊥⍣¯1,)9   ⍝ Just a fancy way of visualising the intermediate and final result
┌───┬─┐
│0 1│1│
│1 0│0│
│0 0│1│
│1 1│1│
└───┴─┘

This is pretty impressive, as it both demonstrates array orientation (in treating both arguments together) and uses Dyalog APL’s fancy inverse operator ⍣¯1 to use as many bits as necessary, while keeping the two representations aligned.

Try it now with TryAPL

8: Zigzag Numbers

The solution to problem 8 returns a 1 if its integer argument’s digits consecutively rise and fall throughout. For example, 12121 is a zigzag number, but 1221 is not.

We saw a handful of solutions of this type:

∧/0>2×/2-/10∘⊥⍣¯1

We can decompose it like so:

∧/                  ⍝ Are all
  0>                ⍝ Negative for...
    2×/             ⍝ Consecutively different signs of...
       2-/          ⍝ The pairwise difference of...
          10∘⊥⍣¯1   ⍝ The digits of the input?

It constitutes a good example of how the pattern in trains often is a natural one (this is actually an 8-train), and also shows off two uses of pairwise application 2f/ to compute the pairwise difference (the sign of which indicates the direction from digit to digit) and then the pairwise product (which due to the rules for multiplication of signed numbers indicates if a change has happened or not).

Try it now with TryAPL

9: Rise and Fall

Problem 9 involved writing a function to verify if a numeric vector has two properties:

  • The elements increase or stay the same until the “apex” (highest value) is reached
  • After the apex, any remaining values decrease or remain the same

For example:

      (your_solution)¨(1 2 2 3 1)(1 2 3 2 1)(1 3 2 3 1)
1 1 0

Actually, nobody had this exact solution, however, a handful came very close:

⊢≡⌈\⌊∘⌽⌈\∘⌽

Instead of trying to analyse the numbers, it does a running maximum from the left and from the right. If the minimum of those matches the original numbers, then we have exactly one peak.

⊢             ⍝ The input vector
 ≡            ⍝ matches
  ⌈\          ⍝ The max-scan from the left (msl)
    ⌊∘⌽       ⍝ The lower of msl and ⌽msr
       ⌈\∘⌽   ⍝ The max-scan from the right (msr)

We can visualise ⌊∘⌽ by stacking its arguments on top of one another:

      1 3 5,[0.5]⌽2 5 4
1 3 5
4 5 2
      ⌊⌿1 3 5,[0.5]⌽2 5 4
1 3 2
      1 3 5⌊∘⌽2 5 4
1 3 2

When used with the two max-scans, we can see how this solution works.

      (⌈\,[0.5]∘⌽⌈\∘⌽)1 3 2  
1 3 3  
3 3 2  
      (⌈\,[0.5]∘⌽⌈\∘⌽)1 0 2  
1 1 2  
2 2 2

Try it now with TryAPL

10: Stacking It Up

The task for problem 10 was to format a nested vector of simple arrays as if displayed using {⎕←⍵}¨, and then to return the formatted character matrix ({⎕←⍵}¨ simply returns its argument). For example:

      (your_function) (3 3⍴⍳9)(↑'Adam' 'Michael')(⍳10) '*'(5 5⍴⍳25)
1 2 3               
4 5 6               
7 8 9               
Adam                
Michael             
1 2 3 4 5 6 7 8 9 10
*                   
 1  2  3  4  5      
 6  7  8  9 10      
11 12 13 14 15      
16 17 18 19 20      
21 22 23 24 25

We had a couple of entries like this:

{¯1↓∊(⍕¨⍵),¨⎕UCS 13}

This was a tricky problem, especially as the automated testing didn’t include the 'a'1 test case, and many didn’t catch that one. Whilst most people wrote complicated code to get matrices for the element arrays, two participants thought outside the box, and simply joined the arrays with newlines after converting them to text.

Try it now with TryAPL

Conclusion

As always, not only are we deeply impressed by the ingenuity and cleverness of your submissions, but we also continue to be amazed by the number of people successfully solving most, if not all, of the problems in Phase I.

If you’d like to be notified when next year’s competition launches, go to dyalogaplcompetition.com and submit your email address.

2019 APL Problem Solving Competition: Phase I Problems Sample Solutions

The following are my attempts at the Phase I problems of the 2019 APL Problem Solving Competition. There are not necessarily “right answers” as personal style and taste come into play. More explanation of the code is provided here than common practice. All solutions pass all the tests specified in the official problem description.

1. Chunky Monkey

Write a function that, given a scalar or vector as the right argument and a positive (>0) integer chunk size n as the left argument, breaks the array’s items up into chunks of size n. If the number of elements in the array is not evenly divisible by n, then the last chunk will have fewer than n elements.

💡Hint: The partitioned enclose function could be helpful for this problem.

   f1←{((≢⍵)⍴⍺↑1)⊂⍵}

Basically, the problem is to construct an appropriate boolean left argument to. For this the reshape function ⍺⍴⍵ is apt, which repeats the items of up to length .

   9 ⍴ 1 0 0                     (9 ⍴ 1 0 0) ⊂ 'ABCDEFGHI'
1 0 0 1 0 0 1 0 0             ┌───┬───┬───┐
                              │ABC│DEF│GHI│
                              └───┴───┴───┘

   11 ⍴ 1 0 0                    (11 ⍴ 1 0 0) ⊂ 'ABCDEFGHIJK'
1 0 0 1 0 0 1 0 0 1 0         ┌───┬───┬───┬──┐
                              │ABC│DEF│GHI│JK│
                              └───┴───┴───┴──┘

2. Making the Grade

Score Range Letter Grade
0-64 F
65-69 D
70-79 C
80-89 B
90-100 A
 

Write a function that, given an array of integer test scores in the inclusive range 0–100, returns an identically-shaped array of the corresponding letter grades according to the table to the left.

💡Hint: You may want to investigate the interval index function.

   f2← {'FDCBA'[0 65 70 80 90⍸⍵]}

For example:

   range← 0 65 70 80 90
   score← 0 65 89 64 75 100

   range ⍸ score                      range ⍸ 2 3⍴score
1 2 4 1 3 5                        1 2 4
                                   1 3 5
   'FDCBA'[1 2 4 1 3 5]               'FDCBA'[2 3⍴1 2 4 1 3 5]
FDBFCA                             FDB
                                   FCA
    f2 score                          f2 2 3⍴score
FDBFCA                             FDB
                                   FCA

The examples on the right illustrate that the functionsand [] extend consistently to array arguments.

In APL, functions take array arguments, and so too indexing takes array arguments, including the indices (the “subscripts”). This property is integral to the template

   Y indexing (X index ⍵)

where

 
X   domain for looking things up
Y range where you want to end up; “aliases” corresponding to X
index a function to do the looking up, such asor
indexing   a function to do indexing into X , such as [] or or (dyadic)

3. Grade Distribution

Given a non-empty character vector of single-letter grades, produce a 3-column, 5-row, alphabetically-sorted matrix of each grade, the number of occurrences of that grade, and the percentage (rounded to 1 decimal position) of the total number of occurrences of that grade. The table should have a row for each grade even if there are no occurrences of a grade. Note: due to rounding the last column might not total 100%.

💡Hint: The key operator could be useful for this problem.

   f3←{a,k,1⍕⍪100×k÷+⌿k←¯1+{≢⍵}⌸⍵⍪⍨a←'ABCDF'}

The result of f⌸ is ordered by the unique major cells in the keys. If a particular order is required, or if a particular set of keys is required (even when some keys don’t occur in the argument), the computation can be effected by prefacing keys to the argument (here ,⍨a←'ABCDF') and then applying an inverse function (here ¯1+) to the result of .

For the key operator, in particular cases, for example the letter distribution in a corpus of English text, the universe of letters and their ordering are known (A-Z); in principle, it is not possible to “know” the complete universe of keys, or their ordering.

The function f3x illustrates the complications. f3 is the same as above; extra spaces are inserted into both functions to facilitate comparison.

   f3 ← {a,   k,1⍕⍪100×k÷+⌿k←¯1+{≢⍵}⌸⍵⍪⍨a←'ABCDF'}
   f3x← {(∪⍵),k,1⍕⍪100×k÷+⌿k←   {≢⍵}⌸⍵           }

   ⊢ g1← 9 3 8 4 7/'DABFC'
DDDDDDDDDAAABBBBBBBBFFFFCCCCCCC

   f3x g1                            f3 g1
D 9  29.0                         A 3   9.7
A 3   9.7                         B 8  25.8
B 8  25.8                         C 7  22.6
F 4  12.9                         D 9  29.0
C 7  22.6                         F 4  12.9
                  
   ⊢ g2← ('F'≠grade)⌿grade
DDDDDDDDDAAABBBBBBBBCCCCCCC

   f3x g2                            f3 g2
D 9  33.3                         A 3  11.1
A 3  11.1                         B 8  29.6
B 8  29.6                         C 7  25.9
C 7  25.9                         D 9  33.3
                                  F 0   0.0

4. Knight Moves

┌───┬───┬───┬───┬───┬───┬───┬───┐
│1 1│1 2│1 3│1 4│1 5│1 6│1 7│1 8│
├───┼───┼───┼───┼───┼───┼───┼───┤
│2 1│2 2│2 3│2 4│2 5│2 6│2 7│2 8│
├───┼───┼───┼───┼───┼───┼───┼───┤
│3 1│3 2│3 3│3 4│3 5│3 6│3 7│3 8│
├───┼───┼───┼───┼───┼───┼───┼───┤
│4 1│4 2│4 3│4 4│4 5│4 6│4 7│4 8│
├───┼───┼───┼───┼───┼───┼───┼───┤
│5 1│5 2│5 3│5 4│5 5│5 6│5 7│5 8│
├───┼───┼───┼───┼───┼───┼───┼───┤
│6 1│6 2│6 3│6 4│6 5│6 6│6 7│6 8│
├───┼───┼───┼───┼───┼───┼───┼───┤
│7 1│7 2│7 3│7 4│7 5│7 6│7 7│7 8│
├───┼───┼───┼───┼───┼───┼───┼───┤
│8 1│8 2│8 3│8 4│8 5│8 6│8 7│8 8│
└───┴───┴───┴───┴───┴───┴───┴───┘
  Consider a chess board as an 8×8 matrix with square (1 1) in the upper left corner and square (8 8) in the lower right corner. For those not familiar with the game a chess, the knight, generally depicted as a horse (♞), can move 2 spaces right or left and then 1 space up or down, or 2 spaces up or down and then 1 space right or left. For example, this means that a knight on the square (5 4) can move to any of the underscored squares.

Given a 2-element vector representing the current square for a knight, return a vector of 2-element vectors representing (in any order) all the squares that the knight can move to.

💡Hint: The outer product operator ∘. could be useful for generating the coordinates.

   f4← {↓(∧/q∊⍳8)⌿q←⍵+⍤1⊢(3=+/|t)⌿t←↑,∘.,⍨¯2 ¯1 1 2}

f4 derives as follows: First, generate all 16 combinations t of moves involving 1 and 2 steps, left and right and up and down, then select move combinations which total exactly 3 squares regardless of direction.

   (3=+/|t)⌿t←↑,∘.,⍨¯2 ¯1 1 2
¯2 ¯1
¯2  1
¯1 ¯2
¯1  2
 1 ¯2
 1  2
 2 ¯1
 2  1

The resultant 8-row matrix (call this mv) is added to, the coordinates of the current square, and then pruned to discard squares which fall outside of the chess board. The following examples illustrate the computation for ⍵≡5 4 and ⍵≡1 2 :

   mv←(3=+/|t)⌿t←↑,∘.,⍨¯2 ¯1 1 2

   ⊢ q←5 4+⍤1⊢mv                             ⊢ q←1 2+⍤1⊢mv
3 3                                       ¯1 1
3 5                                       ¯1 3
4 2                                        0 0
4 6                                        0 4
6 2                                        2 0
6 6                                        2 4
7 3                                        3 1
7 5                                        3 3

   ↓(∧/q∊⍳8)⌿q                               ↓(∧/q∊⍳8)⌿q
┌───┬───┬───┬───┬───┬───┬───┬───┐         ┌───┬───┬───┐
│3 3│3 5│4 2│4 6│6 2│6 6│7 3│7 5│         │2 4│3 1│3 3│
└───┴───┴───┴───┴───┴───┴───┴───┘         └───┴───┴───┘

An alterative solution is to precomputing an 8×8 table of the possible knight moves for each chess square, and then picking from the table:

   f4i← (f4¨ ⍳8 8) ⊃⍨ ⊂

The table look-up version would be more efficient in situations (such as in the Knight’s Tour puzzle) where the knight moves are computed repeatedly.
 

5. Doubling Up

Given a word or a list of words, return a Boolean vector where 1 indicates a word with one or more consecutive duplicated, case-sensitive, letters. Each word will have at least one letter and will consist entirely of either uppercase (A-Z) or lowercase (a-z) letters. Words consisting of a single letter can be scalars.

💡Hint: The nest functioncould be useful.

   f5← (∨⌿2=⌿' ',⊢)¨∘⊆

A solution obtains by solving it for one word and then applying it to each word via the each operator. Since a single word argument can be a string of letters, and we don’t want to apply the single word solution to each letter, that argument must first be converted in an enclosed word with nest. Thus the overall solution is of the form f¨∘⊆.

For a single word, what is required is to detect consecutive duplicate letters, whence the operator 2=⌿⍵ is apt.

   2 =⌿ 'bookkeeper'                  2 =⌿ 'radar'
0 1 0 1 0 1 0 0 0                  0 0 0 0

   ∨⌿ 2 =⌿ 'bookkeeper'               ∨⌿ 2 =⌿ 'radar'
1                                  0

As usual, the link function {⍺⍵} can be used as a generic dyadic operand function to gain additional insight into the workings of an operator:

   2 {⍺⍵}⌿ 'bookkeeper'               2 {⍺⍵}⌿ 'radar'
┌──┬──┬──┬──┬──┬──┬──┬──┬──┐       ┌──┬──┬──┬──┐
│bo│oo│ok│kk│ke│ee│ep│pe│er│       │ra│ad│da│ar│
└──┴──┴──┴──┴──┴──┴──┴──┴──┘       └──┴──┴──┴──┘

2 f⌿⍵ signals error on single-item arguments; moreover, it is problematic to compare a single letter against itself. Both problems are finessed by first prefacing the argument with a space ' '.

In f5, the train (∨⌿2=⌿' ',⊢) can also be written as the equivalent dfn {∨⌿2=⌿' ',⍵} as a matter of personal style. The display of a train does provide more information about how it is structured than the display of a dfn.

   (∨⌿2=⌿' ',⊢)                       {∨/2=⌿' ',⍵}
┌─────┬─────────────────┐          {∨⌿2=⌿' ',⍵}
│┌─┬─┐│┌─┬─────┬───────┐│
││∨│⌿│││2│┌─┬─┐│┌─┬─┬─┐││
│└─┴─┘││ ││=│⌿│││ │,│⊢│││
│     ││ │└─┴─┘│└─┴─┴─┘││
│     │└─┴─────┴───────┘│
└─────┴─────────────────┘

6. Telephone Names

┌────┬───┬────┐
│    │ABC│DEF │
│ 1  │ 2 │ 3  │
├────┼───┼────┤
│GHI │JKL│MNO │
│ 4  │ 5 │ 6  │
├────┼───┼────┤
│PQRS│TUV│WXYZ│
│ 7  │ 8 │ 9  │
├────┼───┼────┤
│    │   │    │
│ *  │ 0 │ #  │
└────┴───┴────┘
  Some telephone keypads have letters of the alphabet embossed on their keytops. Some people like to remember phone numbers by converting them to an alphanumeric form using one of the letters on the corresponding key. For example, in the keypad shown, 'ALSMITH' would correspond to the number 257-6484 and '1DYALOGBEST' would correspond to 1-392-564-2378. Write an APL function that takes a character vector right argument that consists of digits and uppercase letters and returns an integer vector of the corresponding digits on the keypad.

💡Hint: Your solution might make use of the membership function .

   f6← {(⍵⍸⍨⎕d,'ADGJMPTW')-9*⍵∊⎕a}

Letters and digits alike are mapped to integer indices using the interval index function, which neatly handles the irregularly-sized intervals (see problem 2 above). The indices are then decremented by 9 for letters and by 1 for digits.

The expression 9*⍵∊⎕a illustrates a common technique in APL used to implement array logic, effecting control flow without using control structures or explicit branching. In the following, c and d are scalars (usually numbers) and is a boolean array.

c*⍵ c whereis 1 and 1 whereis 0.
c×⍵ c whereis 1 and 0 whereis 0.
c+⍵×d-c   c whereis 0 and d whereis 1.
(c,d)[1+⍵]   Same as c+⍵×d-c, but c and d can be any scalars. The 1+ is omitted if the index origin ⎕io is 0.

7. In the Center of It All

Given a right argument of a list of words (or possibly a single word) and a left argument of a width, return a character matrix that has width columns and one row per word, with each word is centered within the row. If width is smaller than the length of a word, truncate the word from the right. If there are an odd number of spaces to center within, leave the extra space on the right.

💡Hint: The mixand rotatefunctions will probably be useful here.

   f7← {(⌈¯0.5×0⌈⍺-≢¨⍵)⌽↑⍺↑¨⍵}∘⊆

As in problem 5, a prefatory application of nestconverts an argument of a single word into a more manageable standard of a list of words. Subsequently, the right argument is turned into a matrix, each row padded with spaces on the right (or truncated). Each row is then rotated so that the non-blank characters are centered. The finicky detail of an odd number of spaces is resolved by usingorin the calculation of the amounts of rotation.

8. Going the Distance

Given a vector of (X Y) points, or a single X Y point, determine the total distance covered when travelling in a straight line from the first point to the next one, and so on until the last point, then returning directly back to the start. For example, given the points

   (A B C)← (¯1.5 ¯1.5) (1.5 2.5) (1.5 ¯1.5)

the distance A to B is 5, B to C is 4 and C back to A is 3, for a total of 12.

💡Hint: The rotateand power * functions might be useful.

   f8← {+⌿ 2 {0.5*⍨+.×⍨⍺-⍵}⌿ ⍵⍪1↑⍵}

The result obtains by applying the distance function d←{0.5*⍨+.×⍨⍺-⍵} between pairs of points, taking care to return to the start.

As in problem 5, the expression 2 f⌿⍵ is just the ticket for working with consecutive items in the argument and, again, using the link function {⍺⍵} elucidates the workings of an operator:

   (A B C)← (¯1.5 ¯1.5) (1.5 2.5) (1.5 ¯1.5)

   2 {⍺⍵}⌿ A B C A
┌───────────────────┬──────────────────┬────────────────────┐
│┌─────────┬───────┐│┌───────┬────────┐│┌────────┬─────────┐│
││¯1.5 ¯1.5│1.5 2.5│││1.5 2.5│1.5 ¯1.5│││1.5 ¯1.5│¯1.5 ¯1.5││
│└─────────┴───────┘│└───────┴────────┘│└────────┴─────────┘│
└───────────────────┴──────────────────┴────────────────────┘
   2 d⌿ A B C A
5 4 3

   A d B              B d C              C d A
5                  4                   3

   f8 A B C
12

9. Area Code à la Gauss

Gauss’s area formula, also known as the shoelace formula, is an algorithm to calculate the area of a simple polygon (a polygon that does not intersect itself). It’s called the shoelace formula because of a common method using matrices to evaluate it. For example, the area of the triangle described by the vertices (2 4) (3 ¯8) (1 2) can be calculated by “walking around” the perimeter back to the first vertex, then drawing diagonals between the columns. The pattern created by the intersecting diagonals resembles shoelaces, hence the name “shoelace formula”.

💡Hint: You may want to investigate the rotate firstfunction.

First place the vertices in order above each other:
2 4
3 ¯8
1 2
2 4
Sum the products of the numbers connected by the diagonal lines going down and to the right:

      (2ׯ8)+(3×2)+(1×4)
¯6
      
2 4
3 ¯8
1 2
2 4
Next sum the products of the numbers connected by the diagonal lines going down and to the left:

      (4×3)+(¯8×1)+(2×2)
8
      
2 4
3 ¯8
1 2
2 4

Finally, halve the absolute value of the difference between the two sums:

      0.5 × | ¯6 - 8
7
      
2 4
3 ¯8
1 2
2 4

Given a vector of (X Y) points, or a single X Y point, return a number indicating the area circumscribed by the points.

   f9← {0.5×|(+/×/¯1↓0 1⊖t)-+/×/1↓0 ¯1⊖t←↑(⊢,1∘↑)⊆⍵}

There is an alternative solution using the determinant function and the stencil operator  :

   )copy dfns det    ⍝ or  det← (-/)∘(×/)∘(0 1∘⊖)
   x← (2 4) (3 ¯8) (1 2)

   {det ⍵}⌺2 ↑x⍪1↑x
¯28 14 0

   2 ÷⍨| +/ {det ⍵}⌺2 ↑x⍪1↑x
7
   f9 x
7

Putting it together:

   f9a← {2÷⍨|+/ {det ⍵}⌺2 ↑⍵⍪1↑⍵}
   f9b← {2÷⍨ +/ {det ⍵}⌺2 ↑⍵⍪1↑⍵}

   f9a x
7
   f9b x
¯7
   f9b ⊖t
7

f9a computes the absolute area as specified by the problem. f9b computes the signed area by omitting the absolute value function | . Commonly, the signed area is positive if the vertices are ordered counterclockwise and is negative otherwise. See the Wikipedia article on polygons for more details.

Similar to 2 f⌿⍵ (problem 5), the workings of stencil can be elucidated by using {⊂⍵} as a generic monadic operand function:

   {⊂⍵}⌺2 ↑x⍪1↑x
┌────┬────┬───┐
│2  4│3 ¯8│1 2│
│3 ¯8│1  2│2 4│
└────┴────┴───┘
   {det ⍵}⌺2 ↑x⍪1↑x
¯28 14 0

   det ↑ (2 4) (3 ¯8)       det ↑ (3 ¯8) (1 2)       det ↑ (1 2) (2 4)
¯28                      14                        0

10. Odds & Evens

Given a vector of words, separate the words into two vectors—one containing all the words that have an odd number of letters and the other containing all the words that have an even number of letters.

💡Hint: You may want to look into the dyadic form of the key operator.

   f10← 1 ↓¨ (1 0,2|≢¨) {⊂⍵}⌸ 1 0∘,

The solution is required to have exactly two items, words of odd lengths and words of even lengths. This required form is ensured by prefacing the left and right argument to key by 1 0, then dropping the first item from each of the resultant two parts. (See also problem 3 above.)

Editor’s Addendum: Phase II Questions

You can watch the 2019 Grand Prize Winner Jamin Wu’s Dyalog ’19 acceptance presentation and explanation of how he approached the phase II questions on Dyalog.tv.

2018 APL Problem Solving Competition: Phase I Problems Sample Solutions

The following are my attempts at the Phase I problems of the 2018 APL Problem Solving Competition. There are not necessarily “right answers” as personal style and taste come into play. More explanation of the code is provided here than common practice. All solutions pass all the tests specified in the official problem description.

The solutions for problems 1 and 3 are due to Brian Becker, judge and supremo of the competition. They are better than the ones I had originally.

1. Oh Say Can You See?

Given a vector or scalar of skyscraper heights, compute the number of skyscrapers which can be seen from the left. A skyscraper hides one further to the right of equal or lesser height.

   visible ← {≢∪⌈\⍵}

Proceeding from left to right, each new maximum obscures subsequent equal or lesser values. The answer is the number of unique maxima. A tacit equivalent is ≢∘∪∘(⌈\).

2. Number Splitting

Split a non-negative real number into the integer and fractional parts.

   split ← 0 1∘⊤

The function ⍺⊤⍵ encodes the number in the number system specified by numeric vector (the bases). For example, 24 60 60⊤sec expresses sec in hours, minutes, and seconds. Such expression obtains by repeated application of the process, starting from the right of : the next “digit” is the remainder of the number on division by a base, and the quotient of the division feeds into the division by the next base. Therefore, 0 1⊤⍵ divides by 1; the remainder of that division is the requisite fractional part and the quotient the integer part. That integer part is further divided by the next base, 0. In APL, remaindering by 0 is defined to be the identity function.

You can have a good argue about the philosophy (theology?) of division by 0, but the APL definition in the context of ⍺⊤⍵ gives practically useful results: A 0 in essentially says, whatever is left. For example, 0 24 60 60⊤sec expresses sec as days/hours/minutes/seconds.

3. Rolling Along

Given an integer vector or scalar representing a set of dice, produce a histogram of the possible totals that can be rolled.

   roll ← {{⍺('*'⍴⍨≢⍵)}⌸,+/¨⍳⍵}

   roll 5 3 4
3  *
4  ***
5  ******
6  *********
7  ***********
8  ***********
9  *********
10 ******
11 ***
12 *

⍳⍵ produces all possible rolls for the set of dice (the cartesian product of ⍳¨⍵) whence further application of +/¨ and then , produce a vector of all possible totals. With the vector of possible totals in hand, the required unique totals and corresponding histogram of the number of occurrences obtain readily with the key operator . (And rather messy without the key operator.) For each unique total, the operand function {⍺('*'⍴⍨≢⍵)} produces that total and a vector of * with the required number of repetitions.

The problem statement does not require it, but it would be nice if the totals are listed in increasing order. At first, I’d thought that the totals would need to be explicitly sorted to make that happen, but on further reflection realized that the unique elements of ,+/¨⍳⍵ (when produced by taking the first occurrence) are guaranteed to be sorted.

4. What’s Your Sign?

Find the Chinese zodiac sign for a given year.

   zodiac_zh ← {(1+12|⍵+0>⍵) ⊃ ' '(≠⊆⊢)' Monkey Rooster Dog Pig Rat Ox Tiger Rabbit Dragon Snake Horse Goat'}

Since the zodiac signs are assigned in cycles of 12, the phrase 12| plays a key role in the solution. The residue (remainder) function | is inherently and necessarily 0-origin; the 1+ accounts for the 1-origin indexing required by the competition. Adding 0>⍵ overcomes the inconvenient fact that there is no year 0.

Essentials of the computation are brought into sharper relief if each zodiac sign is denoted by a single character:

   zzh ← {(1+12|⍵+0>⍵)⊃'猴雞狗豬鼠牛虎兔龍蛇馬羊'}

The Chinese Unicode characters were found using https://translate.google.com and then copied and pasted into the Dyalog APL session.

5. What’s Your Sign? Revisited

Find the Western zodiac sign for a given month and day.

   zodiac_en←{
     d←12 2⍴ 1 20  2 19  3 21  4 20  5 21  6 21  7 23  8 23  9 23  10 23  11 22  12 22
     s←13⍴' '(≠⊆⊢)' Capricorn Aquarius Pisces Aries Taurus Gemini Cancer Leo Virgo Libra Scorpio Sagittarius'
     (1+d⍸⍵)⊃s
   }

For working with irregular-sized non-overlapping intervals, the pre-eminent function is , interval index. As with the Chinese zodiac, essentials of the computation are brought into sharper relief if each zodiac sign is denoted by a single character:

   zen←{
     d←12 2⍴ 1 20  2 19  3 21  4 20  5 21  6 21  7 23  8 23  9 23  10 23  11 22  12 22
     (1+d⍸⍵)⊃13⍴'♑♒♓♈♉♊♋♌♍♎♏♐'
   }

The single-character signs, Unicode U+2648 to U+2653, were found by Google search and then confirmed by https://www.visibone.com/htmlref/char/cer09800.htm. It is possible that the single-character signs do not display correctly in your browser; the line of code can be expressed alternatively as (1+d⍸⍵)⊃13⍴⎕ucs 9800+12|8+⍳12.

6. What’s Your Angle?

Check that angle brackets are balanced and not nested.

   balanced ← {(∧/c∊0 1)∧0=⊃⌽c←+\1 ¯1 0['<>'⍳⍵]}

In APL, functions take array arguments, and so too indexing takes array arguments, including the indices (the “subscripts”). This fact is exploited to transform the argument string into a vector c of 1 and ¯1 and 0, according to whether a character is < or > or “other”. For the angles to be balanced, the plus scan (the partial sums) of c must be a 0-1 vector whose last element is 0.

The closely related problem where the brackets can be nested (e.g. where the brackets are parentheses), can be solved by checking that c is non-negative, that is, ∧/(×c)∊0 1 (and the last element is 0).

7. Unconditionally Shifty

Shift a boolean vector by , where positive means a right shift and negative means a left shift.

   shift ← {(≢⍵)⍴(-⍺)⌽⍵,(|⍺)⍴0}

The problem solution is facilitated by the rotate function ⍺⌽⍵, where a negative means rotate right and positive means rotate left. Other alternative unguarded code can use or (take or drop) where a negative means take (drop) from the right and positive means from the left.

8. Making a Good Argument

Check whether a numeric left argument to ⍺⍉⍵ is valid.

   dta ← {0::0 ⋄ 1⊣⍺⍉⍵}

This is probably the shortest possible solution: Return 1 if ⍺⍉⍵ executes successfully, otherwise the error is trapped and a 0 is returned. A longer but more insightful solution is as follows:

   dt ← {((≢⍺)=≢⍴⍵) ∧ ∧/⍺∊⍨⍳(≢⍴⍵)⌊(×≢⍺)⌈⌈⌈/⍺}

The first part of the conjunction checks that the length of is the same as the rank of . (Many APL authors would write ⍴⍴⍵; I prefer ≢⍴⍵ because the result is a scalar.) The second part checks the following properties on :

  • all elements are positive
  • the elements (if any) form a dense set of integers (from 1 to ⌈/⍺)
  • a (necessarily incorrect) large element would not cause the to error

The three consecutive ⌈⌈⌈, each interpreted differently by the system, are quite the masterstroke, don’t you think? ☺

9. Earlier, Later, or the Same

Return ¯1, 1, or 0 according to whether a timestamp is earlier than, later than, or simultaneous with another.

   ts ← {⊃0~⍨×⍺-⍵}

The functions works by returning the first non-zero value in the signum of the difference between the arguments, or 0 if all values are zero. A tacit solution of same: ts1 ← ⊃∘(~∘0)∘× - .

10. Anagrammatically Correct

Determine whether two character vectors or scalars are anagrams of each other. Spaces are not significant. The problem statement is silent on this, but I am assuming that upper/lower case is significant.

   anagram ← {g←{{⍵[⍋⍵]}⍵~' '} ⋄ (g ⍺)≡(g ⍵)}

The function works by first converting each argument to a standard form, sorted and without spaces, using the local function g, and then comparing the standard forms. In g, the idiom {⍵[⍋⍵]} sorts a vector and the phrase ⍵~' ' removes spaces and finesses the problem of scalars.

A reasonable tacit solution depends on the over operator , contemplated but not implemented:

     f⍥g ⍵  ←→  f g ⍵
   ⍺ f⍥g ⍵  ←→  (g ⍺) f (g ⍵)

A tacit solution written with over: ≡⍥((⊂∘⍋ ⌷ ⊢)∘(~∘' ')). I myself would prefer the semi-tacit ≡⍥{{⍵[⍋⍵]}⍵~' '}.

 

Solving the 2014 APL Problem Solving Competition – How Tweet It Is

This post is the continuation of the series where we examine some of the problems selected for the 2014 APL Problem Solving Competition.

The problems presented in Phase 1 of the competition were selected because they could be solved succinctly, generally in a single line of APL code. This makes them well suited for experimentation on TryAPL.org.

Problem 2 of Phase 1, entitled “How tweet it is” reads

“Twitter messages have a 140 character limit – what if the limit was even shorter? One way to shorten the message yet retain most readability is to remove interior vowels from its words. Write a dfn which takes a character vector and removes the interior vowels from each word.”

Test cases:
      {your_solution} 'if you can read this, it worked!'
if yu cn rd ths, it wrkd!
      {your_solution} 'APL is REALLY cool'
APL is RLLY cl
      {your_solution} '' ⍝ an empty vector argument should return an empty vector

      {your_solution} 'a' ⍝ your solution should work with a single character message
a

We’ll examine a couple of approaches to this problem – one that’s more “traditional APL” code, and another that makes use of a really helpful Dyalog feature.

This problem could be restated as “find and remove the vowels that aren’t at the beginning or end of a word”. To start with, we need to determine where the words are and where the vowels are. A word is a contiguous set of letters; multiple words are separated by spaces or punctuation. For simplicity’s sake, we’ll ignore contractions and possessives.

The “Traditional APL” Approach

This approach employs a technique that is not commonly found outside of APL and its brethren – using a Boolean vector to determine which elements to remove or keep. First, let’s find where all the vowels are:

      string←'If you can read this, it worked!'
      vowels←{⍵∊'aeiouAEIOU'}
      vowels string
1 0 0 0 1 1 0 0 1 0 0 0 1 1 0 0 0 0 1 0 0 0 1 0 0 0 1 0 0 1 0 0

To help illustrate what’s happening, I’ll write a little operator called “show” to compactly display the string, the Boolean vector, and the elements that would be selected by applying the Boolean to the string.

      show←{⍵⍪⍵{↑(1 0⍕ ⍵)(⍵\⍵/⍺)}⍺⍺ ⍵}
      vowels show string
If you can read this, it worked!
10001100100011000010001000100100
I   ou  a   ea    i   i   o  e

Next we want to remove vowels that aren’t at either end of a word. First, find where the words are by finding where the letters are.  There are several ways to do this; the most obvious may be to use a character vector constant:

      letters←{⍵∊'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'}

Long character constants seem a bit awkward to me.  So, another technique uses the Unicode Conversion system function to return the 26 characters starting at the code points for each of ‘a’ and ‘A’:

      letters←{⍵∊⎕UCS (⍳26)∘.+¯1+⎕UCS'aA'}

Yet another way might be to use the code point values directly and do numerical operations:

      letters←{{((⍵≥65)∧⍵≤90)∨(⍵≥97)∧⍵≤122}⎕UCS ⍵}

Which technique you choose is largely a matter of taste and style. All three return the same result and have comparable performance. My personal preference is the second one – it has fewer characters for me to mistype 🙂

      letters show string
If you can read this, it worked!
11011101110111101111001101111110
If you can read this  it worked 

So now let’s mark the interior letters of the words. This employs a technique known as shift and compare that I learned in the early 1980s when I was privileged to work with Bob Smith. Among Bob’s many contributions to the APL world was a book on Boolean Functions and Techniques. To mark the interior letters, we’ll do both a right and left shift:

      interior←{⍵∧(¯1↓0,⍵)∧1↓⍵,0}
      {interior letters ⍵} show string
If you can read this, it worked!
00001000100011000110000000111100
    o   a   ea   hi       orke  

The last step is to find interior vowels and negate:

      {(vowels ⍵)∧interior letters ⍵} show string
If you can read this, it worked!
00001000100011000010000000100100
    o   a   ea    i       o  e  

      {(vowels ⍵)⍲interior letters ⍵} show string
If you can read this, it worked!
11110111011100111101111111011011
If y u c n r  d th s, it w rk d!

Putting it all together…

      tweet←{⍵/⍨~(⍵∊'aeiouAEIOU')∧{(1↓⍵,0)∧¯1↓0,⍵}⍵∊⎕UCS(⍳26)∘.+¯1+⎕UCS'aA'}
      tweet string
If yu cn rd ths, it wrkd!

The Other Technique – Using Regular Expressions

In version 13.0, Dyalog introduced the system functions ⎕S and ⎕R as interfaces to the PCRE (Perl Compatible Regular Expression) library. Like APL, regular expressions may seem a bit alien at first, but in the years since their incorporation into Dyalog, I’ve grown to appreciate their power and flexibility – they can frequently accomplish complex string manipulations more succinctly than their APL equivalents thus furthering Dyalog’s power as a tool of thought, notation and execution.

      tweet←{('\B[AEIOU]\B' ⎕R '' ⍠ 1) ⍵}
      tweet string
If yu cn rd ths, it wrkd!

The expression above replaces any vowel (⍠ 1means case-insensitive) that is not at the beginning or end of a word with the empty vector, effectively removing the interior vowels. A blog post is not enough space to give an adequate overview of regular expressions. But I hope the expression above piques your interest and encourages you to experiment with ⎕S and ⎕R on TryAPL.org or with a Dyalog system of your own.