
Most AI coding loops waste time running expensive verifiers too early. The model writes some code, the tool runs the full test suite, half the tests fail for structural reasons, and the loop thrashes, trying to fix a noisy failure surface all at once.

There is a simpler policy: run the cheapest command that can actually tell you something, and only escalate when that stage is clean.

## The ladder

| Stage | Command | What it answers |
|---|---|---|
| 1 | `vary check` | Is the code structurally sane enough to keep working on? |
| 2 | `vary test` | Does the code behave correctly on the cases we wrote down? |
| 3 | `vary mutate --quick` | Are those tests strong enough to catch realistic faults? |
| 4 | `vary validate` | Has this change met the final local or CI policy bar? |
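
The escalation policy itself can be sketched as a small shell driver. This is a hedged sketch, not part of Vary: `run_ladder` and its placeholder stages are hypothetical, and the only assumption is the usual convention that a clean stage exits 0.

```shell
# Run verifier stages from cheapest to most expensive, stopping at the
# first stage that is not clean. Assumes each stage exits 0 when clean.
run_ladder() {
  for stage in "$@"; do
    if ! eval "$stage"; then
      echo "stay at stage: $stage"
      return 1
    fi
  done
  echo "all stages clean"
}

# In a real loop the stages would be the commands from the table, e.g.:
#   run_ladder "vary check src/" "vary test tests/" \
#              "vary mutate src/ --quick" "vary validate . --profile local"
```

The point of the driver is the early return: a dirty cheap stage means the expensive stages never run at all.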

## How to use it

Start with `vary check` while the code is still changing shape.

```bash
vary check src/
vary check src/ --plan
vary check src/ --fix
```

If the checker is still finding structural problems, stay there. That feedback is cheaper and more local than test failures. If a rule is unclear, ask the toolchain directly:

```bash
vary explain VCI001
```

When the structure is clean enough, move to behaviour:

```bash
vary test tests/
vary test tests/ --only auth::test_login --trace
```

If tests fail, do not jump ahead. Replay the narrowest failing behaviour and fix that. Only after tests pass should you ask whether the tests themselves are any good.

```bash
vary mutate src/foo.vary --quick
```

This catches confidence theatre. A suite can pass and still be too weak to notice small but realistic faults. Mutation tells you whether "tests pass" actually means anything.
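
A hypothetical illustration of the failure mode, in plain shell rather than Vary: the first test executes the code but asserts nothing about its result, so it passes even though the implementation is wrong; the second pins the expected output and catches the fault.

```shell
# A deliberately wrong implementation: subtracts instead of adds.
sum() { echo $(( $1 - $2 )); }

# Weak test: runs the code, asserts nothing about the result.
# It passes here, and would keep passing under almost any mutation.
sum 2 2 >/dev/null && echo "weak test: pass"

# Strong test: pins the expected output, so the fault is caught.
[ "$(sum 2 2)" = "4" ] && echo "strong test: pass" || echo "strong test: FAIL"
```

A mutation run is essentially this experiment automated: plant small faults like the one above and count how many the suite actually notices.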

When the change is ready for handoff, run the policy bar:

```bash
vary validate . --profile local
vary validate . --profile ci
```

`vary validate` is the closeout step, not something you run on every edit.

## The point

`check` narrows shape problems. `test` checks behaviour. `mutate` checks test strength. `validate` applies the final gate. Use the cheapest next verifier that can actually reduce uncertainty, and stop tools from bouncing between vague edits and expensive checks.

## Related reading

| Article | Focus |
|---|---|
| [From generated code to confidence at scale](/articles/from-generated-code-to-confidence-at-scale/) | The confidence workflow that strengthens generated code before deeper verification |
| [Human-readable, AI-written, confidence at scale](/articles/human-readable-ai-written-confidence-at-scale/) | The product direction behind confidence-building in AI-assisted Vary |
