Dyalog ’18 Videos, Week 2

Each week until early January, we will be releasing a selection of recordings of presentations from Dyalog’18, which was held in Belfast at the end of October 2018. Last week we kicked off with the opening keynote talks and the prize ceremony and acceptance speech by the winner of our annual problem-solving competition.

Just under half of the presentations at Dyalog User Meetings are by users who have volunteered – or sometimes been commandeered – to share stories about how they have used APL for fun or profit. These user stories provide significant motivation to the Dyalog team for future direction.

Aaron Hsu’s talk on “High Performance Tree Wrangling, the APL Way” is a pearl. Back in 2015 I gave a talk at Google on APL. One of the Google engineers asked about working with trees in APL and I was unable to give him a useful answer. Aaron is working on a compiler for APL, and trees that represent the code that is being compiled are his most important type of data structure.

In this talk Aaron demonstrates that APL is an elegant – and highly efficient – notation for working with trees, if you just pick the right representation!


Most of the talks at Dyalog User Meetings are fairly technical. The subject at the core of Ilaria Piccirilli’s talk – the fair pricing of financial instruments and subsequent evaluation of portfolios – is no exception. Mercifully, Ilaria spares us the details of the calculations – as she dryly notes, there is no “Fair Pricing for Dummies”. Instead, she offers humorous insights into the way her team used APL to deal with the explosion of computations required by regular additions to legislation requiring health checks – and the day that negative interest pulled the rug out from under most standard pricing calculations.

The other, slightly larger half of the talks at Dyalog Users meetings are by members of the Dyalog Team, talking about work that has recently been done on our products or presenting designs for future enhancements.


Adám Brudzewsky’s talk, titled Array Notation Mk III, is about a potential future extension to the APL language, which will make it possible to easily and clearly describe arrays of high rank, or with deeply nested structure, without using APL primitives to “construct” them, as is common practice today. In addition to making application code easier to read and write, a literal notation for data structures will make it easy to use text files to describe data structures which are essentially part of the source code of an application, and should be managed by a source code management system. As the name suggests, this work has been ongoing for some time, with the initial inspiration coming from a user presentation by Phil Last, back at Dyalog ’15 in Sicily. Watch the presentation and give us feedback on whether you think this idea is now sufficiently baked to become part of Dyalog APL, or we’ll need a “Mk IV” talk next year!


With the growth in usage of Dyalog APL under macOS and Linux – especially in server or cloud environments – the Dyalog Remote Integrated Development Environment is becoming a “mainstream” tool, rather than the curiosity that it was during the first few years of development. Our partners at Optima Systems are developing RIDE on Dyalog’s behalf, and Gilgamesh Athoraya is now the lead developer. In his talk on “RIDE 4.1 and Next Generation Integrations”, Gil talks first about significant new features and performance improvements to RIDE in 4.1 – and then continues to talk about how components of the RIDE technology may be re-purposed to provide APL add-ins for popular development frameworks like the new Microsoft VS Code.

Summary of this week’s videos:

Diane’s Lasagne Problem

Making Lasagne

Participants in the SA2 Performance Tuning workshop at the Dyalog ’18 User Meeting were encouraged to bring their own problems for the group to work on. Diane Hymas of ExxonMobil brought a good one. The one-liner computation is as follows:

lasagne0 ← {groups {+⌿⍵}⌸ amts ×[⎕io] spices[inds;]}

where

   n ← 8e5
   spices ← ?6000 44⍴0
   groups ← +\(16↑1 2)[?n⍴16]
   inds   ← ?n⍴≢spices
   amts   ← ?n⍴0

Applying lasagne0 to this dataset:

   ⍴ lasagne0 ⍬
100015 44
   ≢ ∪ groups
100015

   )copy dfns wsreq cmpx

   wsreq 'lasagne0 ⍬'
844799820
   cmpx  'lasagne0 ⍬'
2.12E¯1

The problem with lasagne0 is space rather than time. The 845 MB required for this dataset may be acceptable, but we can be called upon to cook up large batches of lasagne in a smallish workspace, on a machine with limited RAM. (Large n and large ≢∪groups.)

All benchmarks in this document were run in Dyalog APL version 17.0, in a 2 GB workspace, on a machine with generous RAM.

Solutions

Marshall Lochbaum solved the problem. The alternative solutions are as follows:

lasagne0 ← {groups {+⌿⍵}⌸ amts ×[⎕io] spices[inds;]}
lasagne1 ← {↑ (groups{⊂⍵}⌸amts) {+⌿⍺×[⎕io]spices[⍵;]}¨ groups{⊂⍵}⌸inds}
lasagne2 ← {↑ (groups{⊂⍵}⌸amts)      {⍺+.×spices[⍵;]}¨ groups{⊂⍵}⌸inds}
lasagne3 ← {↑ {amts[⍵]+.×spices[inds[⍵];]}¨ {⊂⍵}⌸groups}

lasagne0 is the original expression; lasagne1 and lasagne2 were derived by Marshall during the workshop; lasagne3 was suggested by a participant in the workshop. The four functions produce matching results. Comparing the space and time:

space (MB)           time
lasagne0 845 2.29e¯1 ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
lasagne1 74 3.60e¯1 ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
lasagne2 74 2.39e¯1 ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
lasagne3 74 2.93e¯1 ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

lasagne0 v  lasagne1

Nearly all of the space required to evaluate lasagne0 is accounted for by the space for computing the right argument to key:

   wsreq 'lasagne0 ⍬'
844799820

   wsreq 'amts ×[⎕io] spices[inds;]'
844799548

In fact, the array spices[inds;], all by itself, is already very large. It has shape (⍴inds),1↓⍴spices (≡ 8e5 44), each item requiring 8 bytes.

   wsreq 'spices[inds;]'
281599556

   ⍴ spices[inds;]
800000 44

   8 × ×/ ⍴ spices[inds;]
281600000

   qsize←{⎕size '⍵'}    ⍝ # bytes for array ⍵
   qsize spices[inds;]
281600040

lasagne1 avoids creating these large intermediate results, by first partitioning the arguments (groups{⊂⍵}⌸amts and groups{⊂⍵}⌸inds) and then applying a computation to each partition. In that computation, the operand function {+⌿⍺×[⎕io]spices[⍵;]}, is a partition of amts and is the corresponding partition of inds.

lasagne1, lasagne2 and lasagne3 require the same amount of space to run, so the comparison among them is on time.

lasagne1 v  lasagne2

The only change is from +⌿⍺×[⎕io]spices[⍵;] to ⍺+.×spices[⍵;], which are equivalent when is a vector. The interpreter can compute +.× in one go rather than doing +⌿ separately after doing ×[⎕io]; in such computation the interpreter can and does exploit the additional information afforded by +.× and is faster by a factor of 1.5 (= 2.39 ÷⍨ 3.60).

lasagne2 v  lasagne3

The idea in lasagne3 is doing one key operation rather than the two in lasagne2. Therefore, the changes between lasagne2 v lasagne3 are:

lasagne2 groups{⊂⍵}⌸amts
groups{⊂⍵}⌸inds
spices[⍵;]
lasagne3 {⊂⍵}⌸groups amts[⍵] spices[inds[⍵];]

All three key operations involve {⊂⍵}⌸ with groups as the key, and are roughly equally fast, each taking up no more than 7% of the total time.

   cmpx 'groups{⊂⍵}⌸amts' 'groups{⊂⍵}⌸inds' '{⊂⍵}⌸groups'
  groups{⊂⍵}⌸amts → 1.69E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
* groups{⊂⍵}⌸inds → 1.39E¯2 | -18% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
* {⊂⍵}⌸groups     → 1.36E¯2 | -20% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

lasagne3 is doing one less key operation than lasagne2 in exchange for doing, for each of the ≢∪groups (= 100015) executions of the operand function, amt[⍵] v and spices[ind[⍵];] v spices[⍵;]. Indexing is by no means slow, but it’s not as fast as references to and . Therefore, lasagne2 is faster.

The trade-off may differ for different values of groups. In this case groups are small-range integers so operations using it as the key value are fast.

Welcome to the Dyalog ’18 Videos!

Three weeks have gone by since we waved goodbye to the last Dyalog ’18 delegates in Belfast. We’ve had time to catch up on sleep, half of us have had colds and recovered from them. Jason Rivers and Richard Park have started mixing and improving the audio and video recordings, and we are ready to release the first group of processed videos.


Our plan is to release batches of 3-5 videos, with enough variety for everyone to find at least one topic of interest each week. We have not reviewed all of the material yet; there are always one or two where something went wrong and we are unable to publish the recordings (or the presenter asks that we refrain from making the talk public), but we do expect to be able to make the vast majority of the talks available over the next couple of months.

Each week, I’ll be doing my best to introduce each set with a blog entry: The first batch contains cleaned-up versions of the presentations that were streamed live from Belfast. The audio and video quality is significantly enhanced compared to the live stream, and the most confusing gaffes in my own live demo have been removed 😊.


As usual, the user meeting opened with the traditional trio of keynotes by Dyalog’s CEO Gitte Christensen, the CXO Morten Kromberg (that’s me) and CTO Jay Foad. Gitte introduces a couple of new faces at Dyalog, and the contest winners, so everyone can plan to buy the winners drinks during the week. Gitte then discusses high level direction – announcing our intention to make the Linux version available to download, and included in public Docker containers and Cloud VM images, with no questions asked.


My own session mostly consists of a live demo of the potential consequences of making Linux licences really easy to get hold of. In an imaginary conversation with a data scientist, I demonstrate the use of Dyalog APL to implement an (admittedly silly) analytical function, and subsequently make it available as a web service and via a web site, finally deplying it to the cloud using a set of Public Docker containers, without once installing Dyalog APL itself.


Jay Foad rounded Monday’s live stream off with a review of the features of the recently released version 17.0, before moving on to talk about the work that the development team is planning for versions 17.1 and 18.0, scheduled for the spring of 2019 and 2020, respectively.


In accordance with tradition, we also streamed the Prize Ceremony for the International Problem-Solving Competition and – often the most interesting talk of the year – the acceptance speech where this year’s winner talked about his code, and the experience of learning APL. This year’s winner did not let us down; it is amazing how quickly you can learn to write really, really good APL code!

Summary of this week’s videos: