Index-Of on Multiple Floats

Roger K.W. Hui
 




⎕io←0

Here, the index origin is 0, although the ideas, algorithms, code, examples, …, work in 1-origin with obvious modifications.
 

The Problem

The problem is to efficiently compute x index-of y on real or complex numbers, where the major cells of x have multiple elements, using tolerant comparison. It’s a problem I’ve worked on for more than 20 years [0] but didn’t have a good solution. Until now.

For this problem, the best solution is to set ⎕ct to 0, that is, use exact comparisons. The APL user can make this choice depending on the application. The APL implementer can not.

We were motivated to revisit the problem due to a bug in tolerant unique, which will be discussed later in this User Meeting [1].
 

Non-Linearity

The current implementation takes time O(n*2) .

f creates real matrices x and y withrows andcolumns, and runs a benchmark on x⍳y . The first expressions are benchmarks on 3-column matrices, with 1e3 , 2e3 , …, 32e3 rows. The behavior is O(n*2) because the time quadruples when the argument size doubles.

In contrast, when the arguments are vectors, the time doubles when the argument size doubles. The last number shows that on 32e3 rows, the vector case is about 2600 times faster than the 3-column case.
 

Clustering 0

If the arguments are real vectors, you can sort them so that (tolerantly) equal elements are contiguous. You can then put the elements into clusters so that equal elements are in the same cluster, using interval index () and key (). Within each cluster, the red indices refer to x and the blue indices refer to y . The red if any always precedes the blue, if any.
 

Clustering 1

If the argument cells have multiple elements, you can not sort them so that equal cells are contiguous.
 

Clustering 2

Jay Foad offered the key insight that the columns can be clustered independently, resulting in an integer array k . You can then cluster the arguments according to k or, equivalently, according to k⍳k , and in the result equal cells are in the same cluster.

The monadic case of key can be used to get the indices in each cluster instead of the actual data.
 

Hashing 0

Putting it altogether, we get the operator HASH . The most significant point is actually the name HASH , recognizing that what we are doing is hashing, and therefore the clustering calculation is a hash function. Looking at the definition:

•   ⍺⍺ ⍺⍪⍵ applies the hash function to the arguments.
{⊂⍵}⌸ computes the clusters c in terms of indices into the arguments.
An anonymous local function is applied to each cluster. Within the local function,
p are indices into X (the original left   argument)
q are indices into Y (the original right argument)
An O(n*2) search is done on the selected cells of X and Y .

On real matrices of size 1e4 by 3 , the APL model is 16 times faster than the current APL primitive.
 

Hashing 1

The most pleasing part is that the present code is much clearer and shorter than code from 2010 [2].
 

Complex Numbers 0

If the arguments are complex, the clustering can not be used as is, because it requires sorting and a complex array can not be sorted so that equal numbers are contiguous. But if you sort the magnitudes, then cells with equal magnitudes are contiguous. This works as long as equal complex numbers have equal magnitudes.

Using this idea, we get a nice speed-up, a factor of 40 in this benchmark.
 

Complex Numbers 1

We were pleased about this until the code ran afoul of some QA written in 2010. The repro can be simplified to the following case: a is a number with real part 1 and imaginary part 1+⎕ct . b is a number with real part 1+⎕ct and imaginary part 1+⎕ct×2 . x and y are random vectors of a and b .

a and b are equal so x⍳y should be all 0s. The clustering/hashing algorithm gives wrong answers!
 

Complex Numbers 2

The problem comes down to that a and b are equal but their magnitudes are not, whence clustering by magnitudes leads to an incorrect result.

But it should not be possible that equal numbers have unequal magnitudes. A little algebra on the definition of tolerant equality shows that the two comparisons together contradict the triangle inequality, a fact proven by Euclid more than two thousand years ago.

Don’t you just hate it when the QA contradicts proven facts?
 

Complex Numbers 3

Calculations show that |a-b is indeed less than |(|a)-|b , which is supposed to be impossible.

I did some extended precision arithmetic in J [3], using 120 decimal digits in the calculations and 40 decimal digits in the display. The magnitude of the difference is not less than the difference of the magnitudes, but this is apparent only after 28 decimal places.

What to do? We can not afford to resort to heroic efforts like doing the calculations with 120 decimal digits.
 

Complex Numbers 4

Fortunately, there is an easy “trick” which works: use double the tolerance to do the clustering. Clustering is a hash function; lumping together unequal cells merely reduces the efficiency, whereas failing to lump together equal cells would give incorrect answers.

Is doubling the tolerance enough? We don’t (yet) have a proof, but there is empirical evidence of its correctness.
 

Timings

The HASH APL model has been implemented in C for in version 17. The table presents the v16 times divided by v17 times to do x⍳y , for 3-column matrices with 1e3 , 4e3 , 16e3 , and 64e3 rows. The timings depend on the number of clusters.

In the (unusual) case of a single cluster, you pay a penalty for the wasted effort of clustering. In other cases, the v17 code exhibits linear behavior: quadrupling the argument size only quadruples the time, and so the ratios also quadruple versus the O(n*2) v16 code. That is, the v17 code is “infinitely faster” than the v16 code.
 

References

[0]   Hui, Roger K.W., Index-Of, A 30-Year Quest, J Conference 2014, 2014-07-25.
[1]   Hui, Roger K.W., Tolerant Unique, Dyalog User Meeting, Helsingør, Denmark, 2017-09-12.
[2] Hui, Roger K.W., Hashing for Tolerant Index-Of, APL2010, Berlin, 2010-09-16.
[3] Hui, Roger K.W., Extended Precision Functions,, J Wiki Essay, 2007-01-01.



created:  2017-06-14 15:45
updated: