`⎕io=0 `

is assumed throughout.

A recent Forum post motivated investigations into the progressive index-of functions in the FinnAPL Idiom Library:

```
pix ← {((⍴⍺)⍴⍋⍋⍺⍳⍺,⍵) ⍳ ((⍴⍵)⍴⍋⍋⍺⍳⍵,⍺)} ⍝ FinnAPL Idiom 1
pixa ← {((⍋⍺⍳⍺,⍵)⍳⍳⍴⍺) ⍳ ((⍋⍺⍳⍵,⍺)⍳⍳⍴⍵)} ⍝ FinnAPL Idiom 5
```

In this note, we:

- explain what is progressive index-of
- explain why the two functions work
- investigate the performance of the two functions
- provide a more general solution

## Progressive Index-Of

Progressive index-of is like index-of (`⍳`

) except that each find “uses up” the target of that find. There are no duplicates in the result with the possible exception of `≢⍺`

(for “not found”). Thus:

```
x←'mississippi'
y←'dismiss'
x pix y
11 1 2 0 4 3 5
```

The following chart illustrates a step-by-step derivation of each progressive index:

```
0 1 2 3 4 5 6 7 8 9 10
m i s s i s s i p p i d i s m i s s
11
m i s s i s s i p p i d i s m i s s
11 1
m i s s i s s i p p i d i s m i s s
11 1 2
m i s s i s s i p p i d i s m i s s
11 1 2 0
m i s s i s s i p p i d i s m i s s
11 1 2 0 4
m i s s i s s i p p i d i s m i s s
11 1 2 0 4 3
m i s s i s s i p p i d i s m i s s
11 1 2 0 4 3 5
```

It is possible to compute the progressive index without looping or recursion, as the two FinnAPL functions demonstrate.

## Why It Works

The basic idea of `⍺ pix ⍵`

is to substitute for each item of `⍺`

and `⍵`

an equivalent representative, collectively `c`

and `d`

, whence the result obtains as `c⍳d`

. The equivalent representative used here is *ranking*, specifically the ranking of the indices in `⍺`

.

The ranking of an array `⍵`

is a permutation of order `≢⍵`

. The smallest major cell is assigned 0; the next smallest is assigned 1; and so on. Ties are resolved by favoring the earlier-occurring cell. The ranking can be computed by `⍋⍋⍵`

. For example:

```
x ⍪ ⍉⍪ ⍋⍋x
m i s s i s s i p p i
4 0 7 8 1 9 10 2 5 6 3
y ⍪ ⍉⍪ ⍋⍋y
d i s m i s s
0 1 4 3 2 5 6
```

`⍺ pix ⍵`

works on two different rankings of indices in `⍺`

:

`⍋⍋⍺⍳⍺,⍵ `

rankings of indices in `⍺`

of `⍺`

and `⍵`

, favoring `⍺`

`⍋⍋⍺⍳⍵,⍺ `

rankings of indices in `⍺`

of `⍵`

and `⍺`

, favoring `⍵`

The first `⍴⍺`

items of the former are those for `⍺`

and the first `⍴⍵`

of the latter are those for `⍵`

, and we get

`pix ← {((⍴⍺)⍴⍋⍋⍺⍳⍺,⍵) ⍳ ((⍴⍵)⍴⍋⍋⍺⍳⍵,⍺)}`

The second version depends on the following properties of permutations. Let `p`

be a permutation. Then `p[⍋p] ←→ ⍳≢p`

, the identity permutation, and therefore `⍋p`

is the inverse of `p`

. Furthermore, `p[p⍳⍳≢p] ←→ ⍳≢p`

and so `p⍳⍳≢p`

is also the inverse of `p`

. The inverse is unique (that’s why it’s called *the* inverse), therefore `⍋p ←→ p⍳⍳≢p`

.

```
p←97?97 ⍝ a random permutation
p[⍋p] ≡ ⍳≢p
1
p[p⍳⍳≢p] ≡ ⍳≢p
1
(⍋p) ≡ p⍳⍳≢p
1
```

The two rankings are permutations (because the leftmost functions are `⍋`

) and we just need the first `⍴⍺`

items of the former and the first `⍴⍵`

items of the latter. Thus:

```
pixa ← {((⍋⍺⍳⍺,⍵)⍳⍳⍴⍺) ⍳ ((⍋⍺⍳⍵,⍺)⍳⍳⍴⍵)}
```

## Performance

We note that both versions of `pix`

contain the expressions `⍺⍳⍺,⍵`

and `⍺⍳⍵,⍺`

, but the latter is just a rotation of the former. Thus:

```
pixb ← {i←⍺⍳⍺,⍵ ⋄ ((⍴⍺)⍴⍋⍋i) ⍳ ((⍴⍵)⍴⍋⍋(⍴⍺)⌽i)}
pixc ← {i←⍺⍳⍺,⍵ ⋄ ((⍋i)⍳⍳⍴⍺) ⍳ ((⍋(⍴⍺)⌽i)⍳⍳⍴⍵)}
```

Which is faster? The answer may surprise.

```
x←?1e6⍴3e5
y←?2e5⍴3e5
cmpx 'x pixb y' 'x pixc y'
x pixb y → 9.15E¯2 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
x pixc y → 9.21E¯2 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
```

A few factors about the Dyalog APL interpreter are relevant to this performance:

- Computing
`⍺⍳⍵,⍺`

as a rotation of an already computed`i←⍺⍳⍺,⍵`

produces a worthwhile speed-up, although only on a relatively small part of the overall computation.`i←x⍳x,y cmpx '(⍴x)⌽i' 'x⍳y,x' (⍴x)⌽i → 5.00E¯4 | 0% ⎕⎕ x⍳y,x → 7.19E¯3 | +1337% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕`

- Both
`⍳`

and`⍋`

have special code for small range data.`s←?1e6⍴5e5 ⍝ small range t←s ⋄ t[t⍳⌈/t]←2e9 ⍝ large range cmpx 's⍳s' 't⍳t' s⍳s → 5.87E¯3 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕ t⍳t → 2.00E¯2 | +240% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕ cmpx '⍋s' '⍋t' ⍋s → 3.25E¯2 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕ ⍋t → 3.84E¯2 | +18% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕`

`⍋⍵`

has special code when`⍵`

is a permutation.`p←1e6?1e6 ⍝ p is a permutation q←p ⋄ q[999999]←⊃q ⍝ q is not; both are small-range cmpx '⍋p' '⍋q' ⍋p → 5.81E¯3 | 0% ⎕⎕⎕ * ⍋q → 5.71E¯2 | +882% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕`

- We saw previously that if
`p`

is a permutation then`⍋p ←→ p⍳⍳⍴p`

. The special code for`⍋p`

makes the two expressions run at roughly the same speed. The slight advantage for`⍋⍋x`

versus`(⍋x)⍳⍳⍴x`

would increase if and when`⍋⍋`

is recognized as an idiom.`cmpx '⍋p' 'p⍳⍳⍴p' ⍋p → 6.02E¯3 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕ p⍳⍳⍴p → 6.57E¯3 | +9% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕ cmpx '⍋⍋x' '(⍋x)⍳⍳⍴x' ⍋⍋x → 3.16E¯2 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕ (⍋x)⍳⍳⍴x → 3.25E¯2 | +2% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕`

## A General Solution

Index-of works on cells rather than just scalars. Likewise, progressive index-of can also be extended to work on cells. The core algorithm remains the same. The generalization obtains by first reshaping `⍵`

to have the same rank as `⍺`

(having major cells with the same shape), applying the core algorithm, and then reshaping its result to have the same leading shape as the original `⍵`

. Thus:

```
pixd←{
m←≢⍺
r←0⌊1-⍴⍴⍺
n←×/r↓⍴⍵
i←⍺⍳⍺⍪(n,1↓⍴⍺)⍴⍵
(r↓⍴⍵) ⍴ ((⍋i)⍳⍳m) ⍳ ((⍋m⌽i)⍳⍳n)
}
xx yy
mmmm dddd
iiii iiii
ssss ssss
ssss mmmm
iiii iiii
ssss ssss
ssss ssss
iiii
pppp
pppp x
iiii mississippi
⍴xx ⍴yy y
11 4 7 4 dismiss
xx pixd yy x pixd y
11 1 2 0 4 3 5 11 1 2 0 4 3 5
xx pixd 3 5 4⍴yy x pixd 3 5⍴y
11 1 2 0 4 11 1 2 0 4
3 5 11 7 6 3 5 11 7 6
11 10 11 11 11 11 10 11 11 11
```

**Postscript**

After having written the above, I discovered an alternative exposition on progressive index-of by Bob Smith entitled *Anatomy of an Idiom*. Adám Brudzewsky has produced a Stack Exchange lesson and a Jupyter Notebook based on Smith’s text.

There is also an exposition in J on the same topic, with a more verbose but easier-to-understand derivation.