6.9. Ladder semantics
The ladder shape is per-benchmark and chosen via
BenchmarkConfig.paramSchedule:
-
.doubling—[paramFloor, 1, 2, 4, …, paramCeiling]. Stops at the first non-okrung. Right for polynomial complexity: it spreads samples evenly in log-space, which is what the verdict's log-log slope fit needs. -
.linear samples— runs the doubling ladder as a probe (rows markedpartOfVerdict := false), records the bracket(lastOk, firstFail), then refinesfirstFailagainst the declared complexity model and the wallclock cap (so we don't burn the sweep budget on rungs guaranteed to time out), then walkssamplesconsecutive rungs fromlastOk + 1upward inside the refined bracket. Right for exponential complexity, where the cap kills doubling within 1–2 useful samples. -
.custom params— walks an explicit user-suppliedArray Natof param values in order, applying the wallclock cap rung-by-rung. Right for benchmarks where the meaningful inputs aren't a log-spaced family (e.g. realistic instance sizes from a corpus, or a small set of regression-pinnedns). Opt-in only via declaration-timewhere { paramSchedule := .custom #[...] };.autonever resolves to it. Issue #15. -
.auto(default) — at runtime, evaluates the declared complexity at probe points 8/16/32/64 and resolves to.doublingor.linear. The test comparescomplexity(64)/complexity(32)againstcomplexity(16)/complexity(8): equal for anyn^k(→ doubling), super-linearly larger for anyb^n(→ linear, threshold 4×).
Probe rows in .linear mode are kept in the report (so the user
sees the cold-regime context that determined the bracket) but
filtered out of cMin/cMax/slope by Stats.ratiosFromPoints.
Two functions in the same compare may stop at very different
params; the report preserves both ladders fully and computes
agreeOnCommon only over the intersection.
6.9.1. Verdict on narrow ladders
The slope-based verdict (|β| ≤ slopeTolerance) only works when
the log-x range is wide enough to make OLS well-conditioned.
Linear ladders span a narrow log-x range (e.g. n ∈ 33..40 is
log 40/33 ≈ 0.19), so Stats.fitSlope returns none for spans
below 1.0 and deriveVerdict falls back to a multiplicative
range check:
cMax / cMin ≤ max(narrowRangeNoiseFloor, exp(slopeTolerance × xRange))
Two ingredients:
-
The scale-adjusted slope bound
exp(slopeTolerance × xRange)matches the slope semantics: it converges to1 + slopeTolerance · xRangefor small spans and admits exactly the constant-Cdata the slope check would. NaivecMax/cMin ≤ 1 + slopeToleranceis unsound (admits e.g.n² log ndeclared asn²becauselogvaries slowly over any narrow interval). -
The
narrowRangeNoiseFloor(default 1.50) admits the realistic 15–25% spread of single-shot near-cap measurements. At the upper rungs of an exponential-complexity ladder the autotuner runs 1–4 inner repeats per spawn (because each batch is already at the cap), so per-call timings are unsmoothed. Without a noise floor the boundexp(slopeTolerance · xRange) ≈ 1.03overxRange ≈ 0.2would reject the actual data run after run;1.50lets a flat-with-noise C pass with comfortable margin on noisy hardware while still flagging gross deviations.
The tradeoff: with a 1.50 noise floor on a narrow ladder, an
actual n^(2.1) declared as n^2 over n ∈ 33..40 would pass
(cMax/cMin ≈ 1.20). On wider ladders that benchmark would use
doubling and the slope check would catch it. The noise floor
only relaxes the narrow-range path.