tl;dr: Use `vary check` while drafting, `vary test` when the structure is clean, `vary mutate --quick` when passing tests may still be weak, and `vary validate` only when you need a final local or CI bar.


Most AI coding loops waste time running expensive verifiers too early. The model writes some code, the tool runs the full test suite, half the tests fail for structural reasons, and then the loop thrashes, trying to fix a noisy failure surface all at once.

There is a simpler policy: run the cheapest command that can actually tell you something, and only escalate when that stage is clean.

The ladder

| Stage | Command | What it answers |
| --- | --- | --- |
| 1 | `vary check` | Is the code structurally sane enough to keep working on? |
| 2 | `vary test` | Does the code behave correctly on the cases we wrote down? |
| 3 | `vary mutate --quick` | Are those tests strong enough to catch realistic faults? |
| 4 | `vary validate` | Has this change met the final local or CI policy bar? |
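
The ladder can be sketched as a small shell function that runs the stages in order and stops at the first one that is not clean. This is only a sketch: it assumes every `vary` subcommand exits non-zero when it reports a problem, which this post does not state. `run_ladder` and its arguments are hypothetical names, and the default paths are the example paths used in this post.

```shell
# Hypothetical helper: run the cheapest stage first, escalate only when it is clean.
# Assumption: every vary subcommand exits non-zero when it reports a problem.
run_ladder() {
    src="${1:-src/}"             # code to check
    tests="${2:-tests/}"         # tests to run
    target="${3:-src/foo.vary}"  # file to mutate

    # Each stage runs only if the one before it succeeded.
    # The return code (1-4) identifies the stage that stopped the run.
    vary check "$src"               || { echo "stopped at: check"; return 1; }
    vary test "$tests"              || { echo "stopped at: test"; return 2; }
    vary mutate "$target" --quick   || { echo "stopped at: mutate"; return 3; }
    vary validate . --profile local || { echo "stopped at: validate"; return 4; }
    echo "all stages clean"
}
```

Calling `run_ladder` with no arguments uses the defaults; `run_ladder src/ tests/ src/foo.vary` spells them out. Stopping at the first failing stage keeps the feedback tied to one kind of problem at a time.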

How to use it

Start with `vary check` while the code is still changing shape.

vary check src/
vary check src/ --plan
vary check src/ --fix

If the checker is still finding structural problems, stay there. That feedback is cheaper and more local than test failures. If a rule is unclear, ask the toolchain directly:

vary explain VCI001

When the structure is clean enough, move to behaviour:

vary test tests/
vary test tests/ --only auth::test_login --trace

If tests fail, do not jump ahead. Replay the narrowest failing behaviour and fix that. Only after tests pass should you ask whether the tests themselves are any good.
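
A minimal sketch of replaying the narrowest failing behaviour, assuming you already know the identifier of the failing test. `replay_one` is a hypothetical wrapper, not part of Vary; it only combines the `--only` and `--trace` flags shown above.

```shell
# Hypothetical wrapper: re-run exactly one test with tracing instead of the whole suite.
replay_one() {
    if [ -z "$1" ]; then
        echo "usage: replay_one <test-id> [tests-dir]" >&2
        return 64
    fi
    # The second argument is optional and defaults to the tests/ directory used in this post.
    vary test "${2:-tests/}" --only "$1" --trace
}
```

For example, `replay_one auth::test_login` re-runs just the login test from the example above.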

vary mutate src/foo.vary --quick

This catches confidence theatre. A suite can pass and still be too weak to notice small but realistic faults. Mutation tells you whether "tests pass" actually means anything.

When the change is ready for handoff, run the policy bar:

vary validate . --profile local
vary validate . --profile ci

`vary validate` is the closeout step, not something you run on every edit.
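
One way to keep the closeout step to a single command is to let the environment pick the profile. The sketch below assumes the `CI` environment variable is set on CI runners, as many CI services do by default. `pick_profile` and `closeout` are hypothetical names; the two profiles are the ones shown above.

```shell
# Hypothetical helper: choose the validate profile from the environment.
# Assumption: CI runners export a non-empty CI variable.
pick_profile() {
    if [ -n "${CI:-}" ]; then
        echo ci
    else
        echo local
    fi
}

# Run the final gate once, with whichever profile applies.
closeout() {
    vary validate . --profile "$(pick_profile)"
}
```

Run `closeout` when the change is ready for handoff, rather than wiring it into every edit.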

The point

`check` narrows shape problems. `test` checks behaviour. `mutate` checks test strength. `validate` applies the final gate. Use the cheapest next verifier that can actually reduce uncertainty, and stop tools from bouncing between vague edits and expensive checks.

| Article | Focus |
| --- | --- |
| From generated code to confidence at scale | The confidence workflow that strengthens generated code before deeper verification |
| Human-readable, AI-written, confidence at scale | The product direction behind confidence-building in AI-assisted Vary |