5.2. Workflow 1 perf stat🔗

perf stat is the right tool when you want to know "is this a CPU problem, a memory-bandwidth problem, or a branch-misprediction problem?" before reaching for a profile. It has no output file — results print to stderr at the end of the run.

$ lake exe bench profile myFib --param 4096 \
    --profiler "perf stat -e cycles,instructions,cache-misses,branch-misses --"

Expected output:

lean-bench profile: perf stat -e cycles,instructions,... -- /path/to/bench _child ...
{"schema_version":1,"kind":"parametric","function":"myFib","param":4096,...}

 Performance counter stats for ' /path/to/bench _child ... ':

   12,345,678,901      cycles
   18,234,567,890      instructions             #    1.48  insn per cycle
       12,345,678      cache-misses
        2,345,678      branch-misses

       0.512345678 seconds time elapsed

The benchmark's own JSONL row stays interleaved on stdout — the per_call_nanos field gives you the per-iteration wall time, and inner_repeats tells you how many calls the autotuner ran inside this single profiler invocation.