lean-bench

3.7. CI integration🔗

A typical CI step runs the bench exe with a low per-call budget so each individual measurement is bounded:

- name: Benchmark sanity
  timeout-minutes: 10
  run: |
    lake build bench
    lake exe bench list
    lake exe bench run goodFib \
      --max-seconds-per-call 0.5 --param-ceiling 1024

run, compare, and verify enforce their own per-call wallclock cap via the in-Lean kill path, so they don't need an external timeout(1) wrapper — and the example above runs unchanged on Windows, which lacks GNU timeout. list and lake build aren't covered by the per-call cap (they don't run user code), so for an overall job-level bound use the CI system's own timeout (e.g. GitHub Actions' timeout-minutes shown above) rather than a shell wrapper. The same per-call flags work on compare, so you can pin a comparison run to a tight wallclock budget without recompiling. See Quickstart for the full flag list and the matching declaration-time where { … } syntax.

When a benchmark's natural ladder is far larger than CI allows, ship a tighter default in the source itself rather than putting flags in every CI job:

def expensive (_ : Nat) : Nat := 0 setup_benchmark expensive n => 2 ^ n where { maxSecondsPerCall := 0.25 paramCeiling := 256 }

CLI flags still layer on top, so the ergonomic split is "declaration defaults what's true forever; CLI flags what's true today."