4.4. Allocation and sharing
Lean uses reference counting with destructive update for unique references. This matters two different ways for benchmarks.
Allocation cost in Nat arithmetic and structural data. Every
non-trivial Nat operation allocates (a fresh bignum result), as
does every Array.push, List.cons, String concatenation, and
record construction with at least one field changed. A loop that
looks like arr := arr.push x is O(1) amortised when arr is
unshared (Lean re-uses the buffer in place) and O(n) per push
when it is shared (Lean copies first). Code that benchmarks fine
in isolation can flip between regimes when used in a context that
holds a second reference to the same array.
A concrete example:
def pushN (n : Nat) : Array Nat := Id.run do
let mut a : Array Nat := #[]
for i in [0:n] do a := a.push i
return a
-- Looks like O(n). It is, when called as below.
setup_benchmark pushN n => n
-- But this version, which keeps a snapshot every step, is O(n^2):
def (n : Nat) : Array Nat := Id.run do
let mut a : Array Nat := #[]
let mut snapshots : Array (Array Nat) := #[]
for i in [0:n] do
snapshots := snapshots.push a -- shares `a`!
a := a.push i -- next push must copy
return a
If you suspect sharing-induced regressions, write a smaller allocation-only variant that exercises just the suspect operation and benchmark it alongside the full algorithm — the relative per-call cost will tell you whether the unique-reference path is being hit.
Allocation churn vs. arithmetic in tight loops. Even when a single operation is cheap, allocating a fresh bignum / array per iteration generates GC-style pressure that doesn't map cleanly onto a textbook complexity. The harness measures wall time including this pressure, so it shows up as "constant factor C is bigger than you expected" rather than as a slope deviation. If you're chasing a 2× constant-factor difference between two algorithms with the same declared complexity, allocation patterns are a strong suspect.
The alloc_bytes and peak_rss_kb schema fields are reserved
for a future feature
(#6) that will
surface this directly. Until then, allocation effects are visible
only through the C = perCallNanos / complexity(n) constant.