6.1. Subprocess per measurement
Every measurement of f at a chosen param spawns a child
process. The child does the timing AND the auto-tuned-repeat
loop; the parent spawns the child via IO.Process.spawn, reads
stdout, and runs a sibling killer Task that calls
IO.Process.Child.kill if the batch exceeds
maxSecondsPerCall + killGraceMs.
Why subprocess: Lean 4 has no portable in-process interrupt for
arbitrary computation. Task cancellation is cooperative;
IO.sleep doesn't help. The only reliable way to enforce a
wallclock cap on a single measurement is to give it its own
process and kill the process.
The kill path is pure Lean — no dependency on GNU coreutils
timeout(1) or any other external utility.
IO.Process.Child.kill is implemented for every platform Lean
supports, so the harness runs on Linux, macOS, and Windows out of
the box.
6.1.1. Killer task: parent / killer coordination
The parent spawns the child and immediately starts a sibling
Task whose job is to kill the child if the wall budget runs out.
A BaseMutex plus two IO.Ref Bool flags (mainDoneRef,
killedRef) coordinate the two:
-
Parent reads stdout (one JSONL row).
-
Parent takes the lock and sets
mainDoneRef := true, then releases. -
Parent calls
child.wait(which on POSIX reaps the pid). -
Parent inspects
killedRef.true→ synthesize akilledAtCaprow;false→ use the parsed JSONL row (or synthesize anerrorrow if the exit code is non-zero).
Meanwhile, the killer:
-
Polls
mainDoneRefwhile sleeping in 25 ms increments up to the deadline (maxSecondsPerCall + killGraceMs). Polling lets it exit promptly when the parent finishes early — without it, every batch would block for the full deadline. -
If the deadline elapses without
mainDoneRefbecomingtrue, the killer takes the lock, re-readsmainDoneRef, and if stillfalsesetskilledRef := trueand callsChild.kill.
The lock guarantees the dangerous race is impossible: the parent
sets mainDoneRef := true before calling child.wait, so by
the time the kernel can reap the pid, any subsequent killer
critical-section will observe mainDoneRef = true and skip the
kill. We never call kill(2) on a pid that has already been
reaped (which on POSIX could hit a recycled pid and signal an
unrelated process).
There is a microsecond-scale logical (not safety) race: if the
child happens to exit naturally just as the killer fires, the
killer can grab the lock first, see mainDoneRef = false, and
mark the row killedAtCap even though the child finished within
the budget. The row is dropped from consistentWith… ratio
analysis anyway, and the cap was effectively reached, so this is
benign.