Long and short axes
To get the most out of the processor, reduction algorithms must work on large enough units.
Possible unit sizes are given by ⌽×\⌽axes.
- If the rows are long enough, we can add in rows one at a time.
- If the cells are large enough, we can reduce one cell at a time.
- If cells are small, we have to consider the array as a whole.
- If the entire array is small, there's no way to achieve high performance!
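
As a rough illustration of the unit sizes mentioned above, the following Python sketch computes the same values as ⌽×\⌽axes: the suffix products of the shape. The function name `unit_sizes` is made up for this example and is not part of any implementation. The first result is the size of the whole array, the following ones are the sizes of successively smaller cells, and the last is the row length.

```python
from itertools import accumulate
from operator import mul

def unit_sizes(axes):
    # Reverse the shape, take running products, then reverse again,
    # which is what the APL expression does with the two reversals
    # around a multiply-scan.
    return list(accumulate(reversed(axes), mul))[::-1]

# For a shape of 2 3 4: the whole array has 24 elements,
# each major cell has 12, and each row has 4.
print(unit_sizes([2, 3, 4]))  # [24, 12, 4]
```

An algorithm could then compare these sizes against whatever it considers "large enough" to pick one of the cases in the list; that threshold depends on the processor and operation and is not specified here.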