6.5. What we left out
-
Elaboration-time subprocess sanity check ("does
f 0finish in < 1s?"). Considered and rejected: it makes the elaborator depend on the compiled binary, which is circular and phase-dependent. v0.1 static checks only verify symbol existence, arity, and complexity type. Compiled-code sanity checks live inlean-bench verify, which invokes the registered runner in-process againstf 0andf 1for each registered benchmark and reports any non-okoutcome. -
Strong verdict labels. The verdict fits the log-log slope β of
Cvsparamover the trimmed tail (leading 20% of ratios dropped by default).|β| ≤ 0.15→ "consistent with declared complexity", otherwise "inconclusive". β's sign is reported alongside the verdict so callers can see direction, but the verdict itself stays binary: the rawratiosarray is the source of truth and the verdict is decoration. The labels deliberately don't say "matches" or "doesn't match" because the heuristic isn't defended statistically. -
Statistical machinery. No outlier rejection, confidence intervals, repeated outer trials. v0.1 trusts that
targetInnerNanos := 500ms× auto-tuned inner-repeats is enough signal-to-noise for the orders of magnitude we care about.