5.2. Workflow 1 perf stat
perf stat is the right tool when you want to know "is this a
CPU problem, a memory-bandwidth problem, or a
branch-misprediction problem?" before reaching for a profile. It
has no output file — results print to stderr at the end of the
run.
$ lake exe bench profile myFib --param 4096 \
--profiler "perf stat -e cycles,instructions,cache-misses,branch-misses --"
Expected output:
lean-bench profile: perf stat -e cycles,instructions,... -- /path/to/bench _child ...
{"schema_version":1,"kind":"parametric","function":"myFib","param":4096,...}
Performance counter stats for ' /path/to/bench _child ... ':
12,345,678,901 cycles
18,234,567,890 instructions # 1.48 insn per cycle
12,345,678 cache-misses
2,345,678 branch-misses
0.512345678 seconds time elapsed
The benchmark's own JSONL row stays interleaved on stdout — the
per_call_nanos field gives you the per-iteration wall time, and
inner_repeats tells you how many calls the autotuner ran inside
this single profiler invocation.