An overview of topics related to writing performant code and optimising existing code.
APLers
In APL, the ability to express similar ideas (or even the exact same idea) in multiple ways is quite pronounced.
This double-edged sword of language is both one of the most enjoyable parts of writing (choosing an expression which suits oneself), but it is also a source of frustration ("How can I express that better?" "What is a better way to put that?" "What is the best way to express this idea?").
Reference: Dyalog Webinars: APL CodeGolf Autumn Tournament
Here we advocate for balanced code, as this is desirable in production.
⎕IO
¶Do not optimise code which has not been measured as slow in realistic situations.
Optimised code is often longer and much less readable.
dfns.life
R←¯2⌽¯1⊖5 7↑(3 3⍴⍳9)∊2 3 4 5 8
lifeNested←{↑1 ⍵∨.∧3 4=+/,¯1 0 1∘.⌽¯1 0 1∘.⊖⊂⍵}
lifeFlat←{(3=c)∨⍵∧4=c←+⌿,[1 2]¯1 0 1⊖⍤0 99⍤99 2⊢¯1 0 1⌽⍤0 99⊢⍵}
]runtime -c "lifeFlat R" "lifeNested R"
⎕PROFILE ]Profile
dfns.cmpx ]Runtime
$${(\sum_{n=1}^{N}A_n})\div{N}$$
$$\sum_{n=1}^{N}({A_n}\div{N})$$
$${(\sum_{n=1}^{N}A_n})\div{N}$$
]dinput
avg1←{
N←≢⍵ ⍝ Count elements
s←+⌿⍵ ⍝ Sum elements
s÷N ⍝ Sum divided by count
}
$$\sum_{n=1}^{N}({A_n}\div{N})$$
]dinput
avg2←{
N←≢⍵ ⍝ Count elements
n←⍵÷N ⍝ Array divided by count
+⌿n ⍝ Sum
}
)copy dfns cmpx
n←?⍨1000
cmpx 'avg1 n' 'avg2 n'
⎕PROFILE 'clear'
⎕PROFILE 'start'
avg1 n
⎕PROFILE 'stop'
]Profile -lines
repeat←1e4
_Profile←{⍺←1 ⋄ _←⎕PROFILE'clear' ⋄ _←⎕PROFILE 'start' ⋄ r←⍺⍺¨⍺⍴⊂⍵ ⋄ _←⎕PROFILE 'stop'}
repeat avg1 _Profile n
⎕VR'avg1'
]profile -lines
repeat avg2 _Profile n
⎕VR'avg2'
]profile -lines
Relatively easy gains
Avoid nested arrays or mixed-type arrays
3 4⍴⎕A ⍝ 3 4 ⍴ 'A' 'B' 'C' 'D' 'E' 'F' 'G' 'H' 'I' 'J' 'K' 'L'
⍝ 2 3 ⍴ p0 p1 p2 p3 p4 p5
⍝ p0 → 2 2 ⍴ 1 2 3 4
⍝ p1 → 5 ⍴ 'a' 'b' 'c' 'd' 'e'
⍝ p2 → 2 3 ⍴ p6 p6 p6 p6
⍝ p3 → ⍬ ⍴ 1
⍝ p4 → 2 ⍴ p7 p8
⍝ p5 → ⍬ ⍴ 3
⍝ p6 → 4 ⍴ 'w' 'o' 'r' 'd'
⍝ p7 → ⍬ ⍴ 2
⍝ p8 → ⍬ ⍴ 'b'
2 3⍴(2 2⍴⍳4)'abcde'(2 3⍴⊂'word')1 (2'b') 3
⎕←5↑posNest←?500⍴⊂3⍴10
⎕←5↑posFlat←↑posNest
]runtime -c "0.5*⍨+/2*⍨-⍤1⍤1 99⍨posFlat" "0.5*⍨+/¨2*⍨∘.-⍨posNest"
Use inverted tables: Dyalog '18: Inverted Tables
8⌶
Do work on large arrays where possible
b←1=?100 100⍴2
]runtime -c "+/,3<{+/,⍵}⌺3 3⊢b" "+/,{3<+/,⍵}⌺3 3⊢b"
A←?1e4⍴1e2
]runtime -c "{≢⍵}⌸A" "{≢⊢⍵}⌸A"
Search: dyalog help idiom list
⍝ Sorting idioms
]runtime -c "{(⊂⍋⍵)⌷⍵}A" "{⍵⌷⍨⊂⍋⍵}A"
Use ⎕CT←0
if possible
Hsu, A.W., 2019. A data parallel compiler hosted on the GPU.
APL makes it easy to reason about algorithms.
https://en.wikipedia.org/wiki/Computational_complexity_of_mathematical_operations
Primitive complexity:
+ ⍝ O(n)
| ⍝ O(n)
∘.f ⍝ O(n*2)
PT0←PrimesTil←{⍸2=+⌿0=∘.|⍨⍳⍵} ⍝ Primes from 1 to ⍵ using Modulo and Reduction
PT1←(⊢~∘.×⍨)1↓⍳ ⍝ Without Products
)copy dfns sieve pco
PT2←sieve 1↓⍳ ⍝ Sieve of Eratosthenes
PT3←⍸10 pco 1,⊢ ⍝ dfns.pco (lookup table)
⎕VR'sieve'
_Time←{⍎0 0 0 0.2 cmpx ⍺⍺,' ',⍕⍵}
)copy sharpplot
∇ {key}Plot data;d;n;s
:If 0=⎕NC'key'
key←''
:EndIf
s←⎕NEW SharpPlot
n←⊃data
:For d :In ⊆⊃⌽data
s.DrawLineGraph d n
:EndFor
s.SetKeyText key
View s
∇
⊢n←5↓⌊10*0.1×⍳20
'Modulo reduction' 'Without products' 'Sieve' 'dfns.pco' Plot n{⍺⍵}('PT0'_Time¨n)('PT1'_Time¨n)('PT2'_Time¨n)('PT3'_Time¨n)
⊢n←5↓⌊10*0.1×5+⍳18
⍝ 'Without products' 'Sieve' Plot n{⍺⍵}('PT1'_Time¨n)('PT2'_Time¨n)
'Modulo reduction' 'Without products' 'Sieve' 'dfns.pco' Plot n{⍺⍵}('PT0'_Time¨n)('PT1'_Time¨n)('PT2'_Time¨n)('PT3'_Time¨n)