|
Hashing
for Tolerant Index-Of, APL 2010, Berlin.
∇ z←x diota y;h;hj;i;j;k;m0;m1;mask;ne0;q;yi;yy
⍝ model of x⍳y where x and y are 64-bit floating point numbers and ⎕ct>0
q←Primes[(Primes>2×⍴x)⍳1] ⍝ size of hash table, a prime >2×⍴x
h←q⍴¯1 ⍝ hash table; ¯1 means "free"
m1←÷m0←1-⎕ct ⍝ upper & lower multipliers
mask←(○1)hashmask ⎕ct ⍝ bits in a 64-bit number to hash
ne0←{⎕ct←0 ⋄ ⍺≠⍵} ⍝ exact not-equal
:For i :In ⍳⍴x ⍝ hash x
j←q dhashfn mask∧binary x[i] ⍝ index into hash table
:While (0≤h[j])∧x[i] ne0 x[0⌈h[j]] ⍝ don't table exact duplicates
j←q|1+j ⍝ the next hash table entry
:EndWhile
⍎(0>h[j])/'h[j]←i' ⍝ new hash continuation
:EndFor
z←(⍴y)⍴¯1
:For i :In ⍳⍴y ⍝ main loop
k←⍴x ⍝ index of y[i] as far as we now know
yi←{mask∧binary ⍵}¨y[i]×m0,m1 ⍝ masked variants
:For yy :In ∪yi ⍝ do for each unique variant
j←q dhashfn yy ⍝ where to start looking in hash table
:While 0≤hj←h[j] ⍝ loop until no more hash continuations
⍎((k>hj)∧y[i]=x[hj])/'k←hj' ⍝ look for tolerant match with smaller index
j←q|1+j ⍝ do next hash continuation
:EndWhile
:EndFor
z[i]←k
:EndFor
∇
|