2.7. Export format
The export format is a separate, user-facing JSON document
produced by --export FILE on run and compare. It is
distinct from the JSONL wire format described above:
aspect | JSONL wire format | Export format |
|---|---|---|
scope | one row per child invocation |
one document per |
audience | internal (child → parent) | external (CI, dashboards, baselines) |
structure | newline-delimited JSON objects | single JSON object |
version key |
|
|
source of truth |
|
|
The two version counters are independent. Changes to the JSONL
wire format do not force a bump of export_schema_version, and
vice versa.
2.7.1. Export document structure
{
"export_schema_version": 1,
"lean_bench_version": "0.1.0",
"env": { ... },
"results": [ ... ],
"baseline_comparison": [ ... ],
"budget": { ... }
}
-
export_schema_version(integer, required): current version is 1. -
lean_bench_version(string):LeanBench.libraryVersion. -
env(object | null): top-level reproducibility metadata (same schema as the JSONLenvfield). All results in a single export share the same parent env. -
results(array): one entry per benchmark result. Each entry has a"kind"discriminator ("parametric"or"fixed") and carries the full result data: points, ratios, verdict, config, trial summaries, etc. Every result also carries"budget_truncated"(boolean):truewhen the run was a CI-budget run and that benchmark's ladder was cut short between rungs by the deadline. -
baseline_comparison(array, optional): present when--baselinewas used. One entry per matched function with per-param regression / improvement / stable classification. -
budget(object, optional): present only when--total-secondswas used. See Export budget summary below.
2.7.2. Export budget summary
When the run was launched with --total-seconds N (issue
#9), the export
document carries a top-level "budget" object summarising which
benchmarks ran, which were skipped, and how long the suite
actually took:
{
"budget": {
"total_seconds": 60.0,
"elapsed_seconds": 58.91,
"completed": 7,
"truncated": 1,
"skipped": [
{"function": "MyProject.Sort.runMergeSort",
"kind": "parametric",
"status": "budget_skip"},
{"function": "MyProject.Sort.runInsertion",
"kind": "parametric",
"status": "budget_skip"}
]
}
}
Field semantics:
-
total_seconds(number, required): the value of the--total-secondsflag. -
elapsed_seconds(number, required): wallclock seconds the orchestrator actually spent before stopping. -
completed(integer, required): number of benchmarks that ran to a result (whether or not their ladders were truncated mid-flight). -
truncated(integer, required): number of completed benchmarks whose ladder was cut short between rungs by the deadline. Each truncated result also carries"budget_truncated": truein itsresultsentry. -
skipped(array, required): one entry per benchmark that didn't run at all because the budget was exhausted before it could start. Each entry hasfunction(string),kind("parametric"or"fixed"), andstatus(always"budget_skip"). Order matches the order benchmarks would have been scheduled in.
CI consumers should treat skipped and truncated > 0 as
actionable information rather than failure: the run completed
within the configured budget on purpose.
2.7.3. Export versioning rules
The same breaking/non-breaking rules as the JSONL format apply:
-
Adding a new optional field is non-breaking.
-
Removing or renaming a field, or changing its type, is breaking and bumps
export_schema_version. -
Readers MUST reject documents with
export_schema_versionoutside theirsupportedExportVersionsset.
2.7.4. Baseline comparison via --baseline
--baseline FILE on run loads a previous export file and
compares the current run against it at shared parameter values:
-
Parametric: for each shared param, computes the percentage change in
per_call_nanos(using the trial-summary median when outer trials > 1). Params where the current run is slower than baseline by more than--regression-threshold(default 10%) are flagged as regressions. Params faster by the same margin are improvements. -
Fixed: compares median wall time.
-
Exit code: non-zero when any regression is detected, making the feature CI-usable without parsing terminal output.