Boolean transpose
- Originally Dyalog APL transposed one bit at a time
- This is very wasteful: moving each bit costs a full byte read and a full byte write
- Jay Foad developed a faster solution for matrices whose dimensions are multiples of 8, but only a small fraction of matrices qualify
- Dyalog version 16.0 has good performance on all matrices
- Transpose for large matrices is up to 20 times faster!
- This requires some casework…
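
To make the cost of the original bit-at-a-time approach concrete, here is a minimal sketch in C. It is not Dyalog's source code. It assumes the matrix is packed in ravel order, 8 bits per byte, most significant bit first, with no padding between rows; the names get_bit, set_bit and transpose_bits_naive are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Read bit i of a packed bit array.
   Assumed layout: 8 bits per byte, most significant bit first. */
static int get_bit(const unsigned char *bits, size_t i) {
    return (bits[i / 8] >> (7 - i % 8)) & 1;
}

/* Write bit i. Note that this reads and rewrites a whole byte
   in order to change a single bit. */
static void set_bit(unsigned char *bits, size_t i, int v) {
    unsigned char mask = (unsigned char)(1u << (7 - i % 8));
    if (v)
        bits[i / 8] |= mask;
    else
        bits[i / 8] &= (unsigned char)~mask;
}

/* Naive transpose of a rows-by-cols Boolean matrix, one bit at a time.
   src and dst are separate buffers of (rows * cols + 7) / 8 bytes each. */
void transpose_bits_naive(const unsigned char *src, unsigned char *dst,
                          size_t rows, size_t cols) {
    assert(src != dst);                     /* out-of-place only */
    memset(dst, 0, (rows * cols + 7) / 8);  /* also clears padding bits */
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            set_bit(dst, j * rows + i, get_bit(src, i * cols + j));
}
```

Every iteration of the inner loop touches one source byte and one destination byte to move a single bit, so a matrix of n bits costs on the order of n byte reads and n byte read-modify-writes. Under this layout, the 2-by-3 matrix with rows 1 0 1 and 0 1 1 is the single byte 0xAC, and its transpose is 0x9C.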