Hashing Index   <<   >>

 

Hashing for Tolerant Index-Of, APL 2010, Berlin.

∇ z←x diota y;h;hj;i;j;k;m0;m1;mask;ne0;q;yi;yy
 ⍝ model of x⍳y where x and y are 64-bit floating point numbers and ⎕ct>0

  q←Primes[(Primes>2×⍴x)⍳1]                 ⍝ size of hash table, a prime >2×⍴x
  h←q⍴¯1                                    ⍝ hash table; ¯1 means "free"
  m1←÷m0←1-⎕ct                              ⍝ upper & lower multipliers
  mask←(○1)hashmask ⎕ct                     ⍝ bits in a 64-bit number to hash
  ne0←{⎕ct←0 ⋄ ⍺≠⍵}                         ⍝ exact not-equal

  :For i :In ⍳⍴x                            ⍝ hash x
      j←q dhashfn mask∧binary x[i]          ⍝ index into hash table
      :While (0≤h[j])∧x[i] ne0 x[0⌈h[j]]    ⍝ don't table exact duplicates
          j←q|1+j                           ⍝ the next hash table entry
      :EndWhile
      ⍎(0>h[j])/'h[j]←i'                    ⍝ new hash continuation
  :EndFor

  z←(⍴y)⍴¯1
  :For i :In ⍳⍴y                            ⍝ main loop
      k←⍴x                                  ⍝ index of y[i] as far as we now know
      yi←{mask∧binary ⍵}¨y[i]×m0,m1         ⍝ masked variants
      :For yy :In ∪yi                       ⍝ do for each unique variant
          j←q dhashfn yy                    ⍝ where to start looking in hash table
          :While 0≤hj←h[j]                  ⍝ loop until no more hash continuations
              ⍎((k>hj)∧y[i]=x[hj])/'k←hj'   ⍝ look for tolerant match with smaller index
              j←q|1+j                       ⍝ do next hash continuation
          :EndWhile
      :EndFor
      z[i]←k
  :EndFor
∇