Monday 22 September at Dyalog ’14

This was originally posted to Catalyst PR’s Facebook page and is reproduced here to make it accessible to people who don’t have Facebook accounts.

For information on the presentations at Dyalog ’14, see http://www.dyalog.com/user-meetings/dyalog14.htm.

After spending yesterday in workshops with lots of hands-on examples, today was the kick-off for the user meeting proper. The last time Dyalog held a global user meeting in the UK was in 2003, and 2014 has given us an all-time record attendance in terms of delegates.

DNA

My posts from Dyalog ’14 will focus on extracts from User Presentations, and the first presentation I would like to highlight is There’s DNA Everywhere – an Opportunity for APL by Charles Brenner PhD (http://dna-view.com).

Charles Brenner

Charles describes himself as consulting in forensic mathematics, and his DNA-View software programme is used by 100 laboratories on every continent (except Antarctica) for academic activities in mathematics, biostatistics and various aspects of population genetics.

Charles used the example of gun crime to illustrate: “Use a gun in a crime and, thanks to recent biochemical advances, you’ll likely leave enough DNA to be detected. However, often the gun has also been handled by others and, consequently, the analysis of DNA mixtures has become increasingly important. The market for good DNA mixture software, therefore, has led me to recent and rapid progress in my long-delayed venture to convert my forensic DNA interpretation work into the modern world of Dyalog. Many people enjoy the scientific aspects of the story and I enjoy relating how, in my view, this project exemplifies the thesis that APL is a tool of thought. I credit APL with leading me to elegant and, therefore, very fast and flexible solutions to a problem for which the competing solutions are lumbered by complicated statistical and Monte Carlo methods.”

Elegance and simplicity lead to several concrete benefits. The first is conceptual development.

Slide from talk I01

“I made notes in APL as solutions to various aspects of the problem occurred to me last year. These brief APL notes ensured that the ideas really made sense, that they worked together and that I would not forget them. The brevity also revealed a simple but important point that others had overlooked: nesting the computation loops in the right order saves orders of magnitude in computation time. For another, execution speed makes it much easier to see the forest in many ways – testing, developing, designing. Competing programs take hours to find a partial answer, then having worked that hard succumb to a natural tendency to call it a day. The APL “DNA-VIEW Mixture Solution” program does the same in seconds, which makes it much easier to think through to the fact that there’s a lot more work to do before the result is logically defensible,” Charles explained.

At the end of 2013 Charles implemented the Mixture Solution and tested it on examples including a set of five proficiency exercises created at the National Institute of Standards and Technology. More than one hundred entrants, including the leading competition, had contributed analyses. One of the competitors (supported by years and millions of dollars of government grants) got four problems right and came close on the fifth, viewing it as a three person mixture although in fact it was four. Mixture Solution alone correctly analysed all five – as a bonus it correctly diagnosed one suspect as a mixed race person.

“We APLers try to be modest but it’s not always easy. Sometimes there’s just not much to be modest about!” Charles stated.

Having now reached a point where he has developed a superior, fast, elegant software solution to a massively complex DNA Mixture analysis problem, Charles ended the presentation with a plea for help. First and foremost he is looking for someone who can help carry on with the further software development using Dyalog. This could be a fantastic opportunity for someone with an interest in forensics, maths, computer science and software programming in Dyalog – “living in the Bay area”.

ELSI

The Finnish Pension System is currently undergoing a reform and there is a strong need to microsimulate various parameters to help make decisions regarding future pension schemes.

Slide from Talk U02

Heikki Tikanmäki from the Finnish Centre for Pensions gave us a presentation of ELSI – the Pension microsimulation model that has been built in the Finnish Centre for Pensions using Dyalog.

ELSI models the statutory pension system of Finland. It is used for reform analysis and long-term projections. In a typical simulation run, they model 500,000 individuals over 70 years.

By using Dyalog not only do they retain compatibility with older macromodels, population and employment projections, they also gain flexibility for modelling various proposals for a new pension scheme.

The model is built in modules with each module running as independently as possible. All the simulation data are held in component files. Depending on the complexity of each module, it takes 10-40 minutes to run using version 12.1 – without doing any performance optimisation.

Heikki told us that upgrading to Dyalog version 14.0 made the model run 20% faster, and that the simulation data now take up only 16% of the 5 GB they previously occupied.

Future plans include using parallel processors for the computational tasks in ELSI, at which point he expects at least a further 50% reduction in the time it takes to run the entire microsimulation model.

Postcard from Dyalog ’14 – Monday

Delegates by country

Monday opened with registration for what is the biggest Dyalog user meeting on record – it could have been even bigger but we had to turn some people away because there was no more room available. In total there are 126 attendees (even more than when we went to print with the programme!) made up of 95 delegates, 12 spouses and partners, and 19 Dyalog employees from around the world as shown.

The Conference Hall

Discussion Point: DNA Analysis

Charles Brenner performs forensic analysis of DNA and DNA mixtures – an intensive mathematical process. Others use complicated statistical and Monte Carlo methods but by using APL as his tool of thought, Charles has devised techniques which are more accurate and many times faster than competing applications supported by multi-million dollar funding.

Discussion Point: A Different Kind of Selfie

In today’s session “The Tuning Pipeline”, Roger Hui noted that for the index-of family of functions, an algorithm that is faster by up to a factor of two is often possible if the left and right arguments are “the same”. “The same” means that the values have identical references, so that they can be compared with a simple and quick check. For example, x⍳x is a selfie but x⍳0+x and x⍳(⍴x)⍴x are not.

The effect can be seen with the inverted-table index-of 8⌶:

      u←(' ',⎕a,⎕d)[?8e4 30⍴37]
      x←u[?1e5⍴≢u;]
      p←,⊂x
      q←,⊂x
      cmpx 'p(8⌶)p' 'p(8⌶)q'
  p(8⌶)p → 3.98E¯3 |    0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕                      
  p(8⌶)q → 8.72E¯3 | +118% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

Selfies occur in less obvious places – when finding uniques (∪x), in the key operator, and in x∧.=⍉y as well as in x⍳x and ⍳⍨x. You can try this yourself for various datatypes. A selfie which is not faster is an opportunity for a performance improvement.
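
To try the same comparison on a simple integer vector, a minimal sketch along the following lines could be used. It assumes that cmpx is copied from the dfns workspace supplied with Dyalog; the names x and y and the array sizes are arbitrary choices for illustration. The argument y holds the same values as x but is a separate array (built with 0+x, which the text above says is not a selfie), so only x⍳x qualifies for the faster path.

```apl
      'cmpx' ⎕CY 'dfns'        ⍝ timing utility from the dfns workspace
      x←?1e6⍴1e5               ⍝ one million random integers
      y←0+x                    ⍝ equal values, but a different array
      (x⍳x)≡x⍳y                ⍝ both forms must give the same answer
1
      cmpx 'x⍳x' 'x⍳y'         ⍝ selfie versus non-selfie; timings vary
```

No timings are shown here because they depend on the machine and the interpreter version. Changing the first line that builds x (for example to characters or floating-point numbers) is a quick way to check other datatypes.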

Discussion Point: Sorting is Faster Than Grading

Also in “The Tuning Pipeline” session, Roger Hui and Kimmo Kekäläinen looked at some of the recent performance improvements in Dyalog. Dyalog has long had idioms for sorting ({⍵[⍋⍵]}, {⍵[⍋⍵;]}, etc.). Interestingly, for some common datatypes, sorting is faster than grading. You can try this yourself for various datatypes:

      cmpx '{⍵[⍋⍵]}x' '⍋x' ⊣ x←?1e6⍴2e9
  {⍵[⍋⍵]}x → 3.23E¯2 |    0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕                         
* ⍋x       → 8.34E¯2 | +158% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

The point is written up in this essay on the J website. The webpage uses J but the explanation applies to APL (or any other language). Quoting from the webpage: “Grade needs to keep track of the argument array and the list of indices. Sort just needs to keep track of the first. … When the argument items can be manipulated efficiently, as would be the case when the items are machine units (1, 2, 4, or 8 bytes), then sort can eschew a separate index array.”
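
As a starting point for trying other datatypes, the sketch below repeats the comparison for three kinds of item. It assumes that cmpx is copied from the dfns workspace; the names c, s and f and the array sizes are illustrative only. The sorting idiom is written out literally in each timed expression, as in the session above, rather than being given a name.

```apl
      'cmpx' ⎕CY 'dfns'        ⍝ timing utility from the dfns workspace
      c←⎕A[?1e6⍴26]            ⍝ one million upper-case letters
      s←?1e6⍴30000             ⍝ integers small enough to fit in 2 bytes
      f←?1e6⍴0                 ⍝ floating-point numbers between 0 and 1
      cmpx '{⍵[⍋⍵]}c' '⍋c'     ⍝ sort versus grade on characters
      cmpx '{⍵[⍋⍵]}s' '⍋s'     ⍝ ... on small integers
      cmpx '{⍵[⍋⍵]}f' '⍋f'     ⍝ ... on floats
```

Timings are omitted because they vary by machine and interpreter version. Whether sort beats grade for a given datatype is exactly what the experiment is meant to reveal, so no particular outcome is assumed here.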

John Scholes’ “Distractions” – An Objective Review

An awed silence gripped the room as John unveiled ground-breaking techniques for minimising life’s distractions and maximising programming productivity. It is too early to say how far-reaching this approach will turn out to be in real life situations and whether it will affect The Global Economy as a whole. John appeared to have all categories of distraction covered. Let’s see. Like Woodstock ’69 and The Isle of Wight ’70, in years to come you may be able to say “I was there”.

John Scholes

And Also …

Some of the many other things we saw and heard today:

  • Dyalog has taken on two new employees since Dyalog ’13
  • John Daintree can’t take selfies but embraces high resolution touch devices
  • Data files used in Finnish pension microsimulations reduced to one sixth of their size when component file compression was enabled
  • MyDyalog launched live at the user meeting
  • Fiona wants to promote your APL application in a banner on the Dyalog homepage
  • Morten computed Mandelbrot set images, performing the calculation in parallel with isolates running on servers in Holland and Hong Kong

Tomorrow…

Tomorrow’s schedule features five user presentations and four Dyalog presentations covering the themes of code management and reuse, performance, presentation tools and cryptography. In the evening Morten will demonstrate the progress he has made with the ‘bots (something that should be very familiar to regular readers of the Dyalog blog!).