Alpha. Vary is under active development and not ready for production use. Syntax, APIs, performance, and behaviour may change between releases.

Survivor tail

Each mutation run consists of two passes: a main loop that runs every mutant against the suite once, and a survivor tail that re-runs each apparent survivor a single time to detect flaky kills. The survivor tail used to be folded into the per-file executionTimeMs; it is now reported as its own bucket so that files with dense survivor sets are no longer mistaken for files with slow main loops.

For the basics, see Introduction. For the full pipeline, see Advanced overview.

What the survivor tail is

After the main mutant loop completes, the runner walks the result list and re-runs every result whose result.survived is true (no test killed it on the first pass) and whose result.error is null (the mutant compiled and ran without infrastructure error).

Each rerun reuses the same warm worker as the main loop; there is no classloader rebuild between passes. If the rerun kills the mutant, it is re-classified as killed = true, flaky = true, with the killing test recorded in killedBy. If the rerun also survives, the original survivor classification stands.

This pass is serial by construction: parallel reruns would dilute the very flake signal being measured. That is also why a noisy file can spend a large share of its wall-clock time in the tail even when its main loop ran in parallel.
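The rerun pass described above can be sketched as follows. This is a minimal illustration, not Vary's runner: the `MutantResult` class, the `rerun` hook, and the choice to clear `survived` on a flaky kill are all assumptions made for the example. Only the field names `survived`, `error`, `killed`, `flaky`, and `killedBy` come from the text.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class MutantResult:
    # Hypothetical container; the field names mirror those used in the text.
    mutant_id: str
    survived: bool
    error: Optional[str] = None
    killed: bool = False
    flaky: bool = False
    killed_by: Optional[str] = None


def run_survivor_tail(
    results: List[MutantResult],
    rerun: Callable[[MutantResult], Optional[str]],
) -> int:
    """Re-run each clean survivor once, serially; return how many were reclassified.

    `rerun` is a hypothetical hook that returns the name of the killing test,
    or None if the mutant survived again.
    """
    reclassified = 0
    for result in results:  # serial on purpose: one rerun at a time
        if not result.survived or result.error is not None:
            continue  # killed on the first pass, or failed with an infrastructure error
        killing_test = rerun(result)  # exactly one rerun per survivor
        if killing_test is not None:
            result.survived = False  # assumption: the survivor flag is cleared as well
            result.killed = True
            result.flaky = True
            result.killed_by = killing_test
            reclassified += 1
    return reclassified


# Example: m1 survives both passes, m3 is skipped because of its error,
# and m4 is killed on the rerun and therefore reclassified as flaky.
results = [
    MutantResult("m1", survived=True),
    MutantResult("m2", survived=False, killed=True, killed_by="testA"),
    MutantResult("m3", survived=True, error="compile failure"),
    MutantResult("m4", survived=True),
]
flaky_kills = {"m4": "testB"}
reclassified_count = run_survivor_tail(results, lambda r: flaky_kills.get(r.mutant_id))
```

Because the loop is strictly sequential, its cost grows linearly with the number of first-pass survivors, which is what makes a survivor-heavy file tail-dominated.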

What the tail is NOT

| Property | Detail |
| --- | --- |
| Not a budget-aware retry loop | Every survivor is re-run exactly once |
| Does not change per-mutant executionTimeMs | The main-loop field reflects the original measurement |
| Does not run on bytecode backends | hot-swap and redefine report survivorRerunMs = 0 and are therefore never tail-dominated |

Per-file accounting

Every files[].phaseTimings block now includes:

| Field | Meaning |
| --- | --- |
| mainLoopMs | The per-mutant test loop only |
| survivorTailMs | The post-loop rerun pass |
| tailFraction | survivorTailMs / (mainLoopMs + survivorTailMs), in [0, 1] |
| tailDominated | tailFraction > threshold (default 0.5) |

The aggregate over a project run uses the same arithmetic against summed mainLoopMs and survivorTailMs.
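As a worked example of this arithmetic, the sketch below computes the per-file and aggregate values from the definitions in the table. The function names are hypothetical, and returning 0.0 when both timings are zero is an assumption; the text does not say how an empty measurement is reported.

```python
from typing import Iterable, Tuple

DEFAULT_TAIL_THRESHOLD = 0.5  # default stated in the text


def tail_fraction(main_loop_ms: float, survivor_tail_ms: float) -> float:
    """survivorTailMs / (mainLoopMs + survivorTailMs), in [0, 1]."""
    total = main_loop_ms + survivor_tail_ms
    if total <= 0:
        return 0.0  # assumption: no measured time means no tail
    return survivor_tail_ms / total


def tail_dominated(
    main_loop_ms: float,
    survivor_tail_ms: float,
    threshold: float = DEFAULT_TAIL_THRESHOLD,
) -> bool:
    """Strictly greater than the threshold, as in the table."""
    return tail_fraction(main_loop_ms, survivor_tail_ms) > threshold


def aggregate_tail_fraction(timings: Iterable[Tuple[float, float]]) -> float:
    """Sum mainLoopMs and survivorTailMs across files, then apply the same formula."""
    main_total = 0.0
    tail_total = 0.0
    for main_loop_ms, survivor_tail_ms in timings:
        main_total += main_loop_ms
        tail_total += survivor_tail_ms
    return tail_fraction(main_total, tail_total)


# Two illustrative files as (mainLoopMs, survivorTailMs) pairs.
per_file = [(1200.0, 400.0), (300.0, 900.0)]
fractions = [tail_fraction(m, t) for m, t in per_file]      # 0.25 and 0.75
dominated = [tail_dominated(m, t) for m, t in per_file]     # False and True
project_fraction = aggregate_tail_fraction(per_file)        # 1300 / 2800
```

Note that the aggregate is computed from summed times, not by averaging per-file fractions: here the two fractions average to 0.5, while the project-level value is 1300 / 2800, roughly 0.46.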

Where it surfaces

| Surface | What appears |
| --- | --- |
| mutation.json | files[].phaseTimings contains the four fields alongside the pre-existing pipeline phases |
| telemetry.json / telemetry.csv | Written when --telemetry is passed. The same four fields appear per file; the artifact root carries the tailThreshold actually used so downstream tooling can reproduce the classification |
| --output log progress stream | Emits a file_tail event after each file_done with the four fields plus tailPct for human reading |
| --output json progress stream | Emits a file_tail event as a JSON object with the same payload |
| CLI summary | Project runs append a one-line tail summary reporting aggregate tail / total time and the count of tail-dominated files |

Configuring the threshold

The default tail-dominated threshold is 0.5. Programmatic callers can pass an explicit tailThreshold: Double to tighten or loosen it. The threshold the artifact was generated under is recorded as tailThreshold in telemetry.json.
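Downstream tooling that wants to reproduce the classification might look roughly like the sketch below. The document layout is an assumption: only the root-level `tailThreshold` and the four per-file field names come from the text, while the `files` array, the `path` key, and the sample values are hypothetical.

```python
import json
from typing import List, Optional

# Hypothetical telemetry.json content, inlined so the example is self-contained.
SAMPLE_TELEMETRY = """
{
  "tailThreshold": 0.6,
  "files": [
    {"path": "fileA", "mainLoopMs": 300, "survivorTailMs": 900,
     "tailFraction": 0.75, "tailDominated": true},
    {"path": "fileB", "mainLoopMs": 450, "survivorTailMs": 550,
     "tailFraction": 0.55, "tailDominated": false}
  ]
}
"""


def dominated_files(telemetry: dict, threshold: Optional[float] = None) -> List[str]:
    """Recompute tailDominated per file, defaulting to the recorded threshold."""
    if threshold is None:
        threshold = telemetry["tailThreshold"]
    paths = []
    for entry in telemetry["files"]:
        total = entry["mainLoopMs"] + entry["survivorTailMs"]
        fraction = entry["survivorTailMs"] / total if total > 0 else 0.0
        if fraction > threshold:
            paths.append(entry["path"])
    return paths


telemetry = json.loads(SAMPLE_TELEMETRY)
with_recorded = dominated_files(telemetry)      # uses tailThreshold = 0.6
with_default = dominated_files(telemetry, 0.5)  # what the 0.5 default would report
recorded_flags = [f["path"] for f in telemetry["files"] if f["tailDominated"]]
```

In this sample, fileB has a tail fraction of 0.55, so it is tail-dominated under the 0.5 default but not under the recorded 0.6. Reading the threshold from the artifact rather than assuming the default avoids that mismatch.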

Reading a tail-dominated file

A tail-dominated file signals that the mutant set produces many first-pass survivors and that the rerun pass, not the per-mutant test execution, is consuming the file's budget. Next steps:

| Step | What to check |
| --- | --- |
| 1 | Inspect the survivor list. A high survivor rate often points at missing coverage, not flake. |
| 2 | Confirm whether flaky reclassifications were actually produced by the tail. If so, the tests are non-deterministic and the rerun is doing real work. |
| 3 | If neither applies, the file is paying tail cost for no information: a candidate for future tail-reduction work. |
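The three checks above could be encoded as a small triage helper, for example in a script that post-processes run results. This is only a sketch: the function, its labels, and especially the `high_survivor_rate` cutoff are hypothetical, since the text does not define what counts as a high survivor rate.

```python
from typing import List


def triage_tail_dominated_file(
    mutants: int,
    first_pass_survivors: int,
    flaky_reclassified: int,
    high_survivor_rate: float = 0.5,  # hypothetical cutoff, not a Vary setting
) -> List[str]:
    """Return the findings that apply to a tail-dominated file."""
    findings = []
    survivor_rate = first_pass_survivors / mutants if mutants > 0 else 0.0
    if survivor_rate >= high_survivor_rate:
        # Step 1: many survivors usually means missing coverage, not flake.
        findings.append("missing-coverage")
    if flaky_reclassified > 0:
        # Step 2: the tail produced flaky kills, so the rerun did real work.
        findings.append("flaky-tests")
    if not findings:
        # Step 3: neither applies, so the tail cost bought no information.
        findings.append("tail-cost-only")
    return findings


coverage_case = triage_tail_dominated_file(100, 60, 0)
flaky_case = triage_tail_dominated_file(100, 20, 5)
both_case = triage_tail_dominated_file(100, 60, 3)
no_info_case = triage_tail_dominated_file(100, 20, 0)
```

Returning a list rather than a single label reflects that steps 1 and 2 are not mutually exclusive: a file can have both thin coverage and non-deterministic tests.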

Non-goals

| Non-goal | Detail |
| --- | --- |
| Adaptive rerun under load | Every survivor is re-run regardless of file size or tail fraction |
| Per-mutant rerun timing | Tail time is a per-file property by construction; a per-mutant breakdown would need an additional accumulator inside the rerun loop |