
> `vary vast` tests the compiler. `vary mutate` tests your tests. Both involve running programs multiple ways and comparing results, but they answer different questions.

## Two commands, two questions

Vary ships two verification tools that sound similar but solve different problems: `vary vast` validates the compiler, and `vary mutate` validates your tests.

## `vary vast`: compiler semantic validation

VAST asks one question: does the compiler preserve the meaning of programs?

It generates random valid Vary programs and runs each one through multiple independent execution paths. Every path executes the same program, so any disagreement points to a bug somewhere in the compiler pipeline.

| Technique | What it does |
|-----------|-------------|
| Random program generation | Produces valid Vary programs from a seeded, deterministic generator |
| Metamorphic transforms | Rewrites a program in a semantically equivalent way and checks that the result stays the same |
| Mutation expansion | Injects known-wrong changes into the compiler pipeline to verify that the comparison infrastructure detects them |
| Three-path comparison | Runs each program through the AST interpreter, the IR interpreter, and JVM bytecode; all three should agree |
| Shrinking and corpus replay | Minimizes a failing program to isolate the bug, then replays known-interesting programs from a saved corpus |

The target is the compiler pipeline itself: the optimizer, codegen, type checker, and every stage between parsing and execution.
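
The generate-and-compare loop can be illustrated with a toy. The sketch below is not VAST's implementation: it uses a tiny arithmetic language instead of Vary, and the names `gen_expr`, `eval_tree`, `eval_stack`, and `eval_host` are hypothetical stand-ins for the AST interpreter, IR interpreter, and JVM bytecode paths.

```python
import random

# Toy programs: an int literal, or a tuple (op, left, right).
OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}

def gen_expr(rng, depth=3):
    """Seeded generator: the same seed always yields the same program."""
    if depth == 0 or rng.random() < 0.3:
        return rng.randint(-9, 9)
    return (rng.choice(sorted(OPS)), gen_expr(rng, depth - 1), gen_expr(rng, depth - 1))

def eval_tree(e):
    """Path 1: walk the tree directly."""
    if isinstance(e, int):
        return e
    op, left, right = e
    return OPS[op](eval_tree(left), eval_tree(right))

def lower(e, code=None):
    """Lower the tree to postfix instructions (a stand-in for an IR)."""
    code = [] if code is None else code
    if isinstance(e, int):
        code.append(("push", e))
    else:
        op, left, right = e
        lower(left, code)
        lower(right, code)
        code.append(("op", op))
    return code

def eval_stack(code):
    """Path 2: interpret the postfix instructions on a stack."""
    stack = []
    for kind, arg in code:
        if kind == "push":
            stack.append(arg)
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(OPS[arg](a, b))
    return stack.pop()

def to_source(e):
    """Render the tree as fully parenthesized source text."""
    if isinstance(e, int):
        return str(e)
    return f"({to_source(e[1])} {e[0]} {to_source(e[2])})"

def eval_host(e):
    """Path 3: hand the rendered source to the host language."""
    return eval(to_source(e), {"__builtins__": {}})

def check_seed(seed):
    """Run one generated program through all three paths and compare."""
    e = gen_expr(random.Random(seed))
    results = {"tree": eval_tree(e), "stack": eval_stack(lower(e)), "host": eval_host(e)}
    return len(set(results.values())) == 1, results

mismatches = [seed for seed in range(200) if not check_seed(seed)[0]]
print(f"{len(mismatches)} mismatching seeds out of 200")
```

Because the generator is seeded, any mismatching seed reproduces the exact program that exposed the disagreement.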

```bash
vary vast --count 1000 --seed 42
vary vast --profile control --count 100
```
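
The metamorphic-transform row in the table above can be sketched in the same toy style. This is an illustration under stated assumptions, not VAST's transform set: `transform` is a hypothetical rewrite that swaps the operands of commutative operators, turns `a - b` into `a + (0 - b)`, and wraps literals as `n + 0`. All of these preserve integer results, so the original and the rewritten program must evaluate to the same value.

```python
import random

def gen_expr(rng, depth=3):
    """Seeded generator for toy programs: ints or (op, left, right) tuples."""
    if depth == 0 or rng.random() < 0.3:
        return rng.randint(-9, 9)
    return (rng.choice("+-*"), gen_expr(rng, depth - 1), gen_expr(rng, depth - 1))

def evaluate(e):
    """The single execution path under test."""
    if isinstance(e, int):
        return e
    op, left, right = e
    a, b = evaluate(left), evaluate(right)
    return a + b if op == "+" else a - b if op == "-" else a * b

def transform(e):
    """Rewrite a program into a different but semantically equivalent one."""
    if isinstance(e, int):
        return ("+", e, 0)                 # n  ->  n + 0
    op, left, right = e
    left, right = transform(left), transform(right)
    if op in "+*":
        return (op, right, left)           # commutative: swap operands
    return ("+", left, ("-", 0, right))    # a - b  ->  a + (0 - b)

failures = []
for seed in range(200):
    program = gen_expr(random.Random(seed))
    if evaluate(program) != evaluate(transform(program)):
        failures.append(seed)
print(f"{len(failures)} metamorphic failures out of 200")
```

Unlike the multi-path comparison, this check needs only one execution path: the program is compared against a rewritten version of itself.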

## `vary mutate`: user-facing mutation testing

Mutation testing asks a different question: would your tests notice if something in the code changed?

It makes small changes to compiled bytecode (and optionally the AST), runs your tests against each changed version, and reports which changes your tests missed.

| Technique | What it does |
|-----------|-------------|
| Bytecode mutation | Swaps single instructions in compiled bytecode, with no recompilation |
| AST-level mutation | Applies higher-level structural changes (optional) |
| Test execution | Runs your tests against each mutant |
| Survivor analysis | Reports which mutations your tests failed to catch |
| Observability diagnostics | Determines whether the mutation's effect reached a test boundary at all |

The target is your test suite: the tests you wrote for your own code.

```bash
vary mutate math.vary
vary mutate src/ --why "add:LIT_CHANGE:abc123"
```
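
The mutate, test, and report cycle can be sketched as follows. This is a minimal illustration, not how `vary mutate` works internally: it mutates a Python syntax tree rather than JVM bytecode, and the `price` function, the `SWAPS` catalogue, and the helper names are all hypothetical.

```python
import ast

SOURCE = '''
def price(qty, unit):
    total = qty * unit
    if qty >= 10:
        total = total - 5
    return total
'''

# Hypothetical operator swaps; real tools ship a much larger catalogue.
SWAPS = {ast.Mult: ast.Add, ast.Sub: ast.Add, ast.GtE: ast.Gt}

def run_tests(price):
    """A thin suite: it never checks the qty == 10 boundary."""
    try:
        assert price(2, 3) == 6
        assert price(20, 1) == 15
        return True
    except Exception:
        return False   # any failed assertion or crash kills the mutant

def sites(tree):
    """Yield (node, index) for every operator that has a swap."""
    for node in ast.walk(tree):
        if isinstance(node, ast.BinOp) and type(node.op) in SWAPS:
            yield node, None
        elif isinstance(node, ast.Compare):
            for i, op in enumerate(node.ops):
                if type(op) in SWAPS:
                    yield node, i

def mutants(source):
    """Yield (description, function) pairs, one operator swap per mutant."""
    site_count = len(list(sites(ast.parse(source))))
    for n in range(site_count):
        tree = ast.parse(source)            # fresh tree for each mutant
        node, i = list(sites(tree))[n]
        old = node.op if i is None else node.ops[i]
        new = SWAPS[type(old)]()
        if i is None:
            node.op = new
        else:
            node.ops[i] = new
        namespace = {}
        exec(compile(tree, "<mutant>", "exec"), namespace)
        desc = f"line {node.lineno}: {type(old).__name__} -> {type(new).__name__}"
        yield desc, namespace["price"]

results = [(desc, run_tests(fn)) for desc, fn in mutants(SOURCE)]
survivors = [desc for desc, passed in results if passed]
killed = len(results) - len(survivors)
print(f"killed {killed}/{len(results)}")
for desc in survivors:
    print("survived:", desc)
```

With this thin suite, two of the three mutants are killed and the `>=` to `>` swap survives. The survivor points at an untested boundary: a check such as `price(10, 1) == 5` would kill it.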

## Side by side

| | `vary vast` | `vary mutate` |
|---|---|---|
| Question | Does the compiler preserve semantics? | Would your tests notice a code change? |
| Target | The Vary compiler | Your test suite |
| Input | Generated programs (nobody wrote them) | Your source files and tests |
| Mutation role | Self-check: confirms that VAST's comparison infrastructure catches injected faults | Primary technique: each mutant probes for a test gap |
| Execution paths | AST interpreter, IR interpreter, JVM bytecode | Original bytecode vs. mutated bytecode |
| Output | Agreement/mismatch verdicts per seed | Kill rate, survivor analysis, fix suggestions |
| Finds | Miscompilations, optimizer bugs, codegen errors | Weak tests, missing assertions, untested branches |
| Who benefits | Compiler developers | Anyone writing Vary code |

## Why mutation appears in both

The word "mutation" shows up in both tools, but it means different things.

In `vary mutate`, mutation is the point. You mutate compiled code to probe whether tests detect the change. A surviving mutant means your tests have a gap.

In `vary vast`, mutation is a self-test. VAST mutates generated programs or compiler internals to verify that its own comparison infrastructure works. If VAST's three-path comparison fails to detect an injected fault, that is a bug in VAST, not in your code. This is sometimes called mutation expansion: using mutation as a confidence check on the testing tool rather than on user code.
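
The self-test idea can be sketched with a deliberately broken execution path. This is an illustration only: the fixed `CORPUS`, the `make_evaluator` helper, and the off-by-one fault are hypothetical, and a two-path comparison stands in for VAST's three paths.

```python
# Toy programs: an int literal, or a tuple (op, left, right).
CORPUS = [
    ("+", 1, 2),
    ("-", 5, 3),
    ("*", 2, 4),
    ("+", ("*", 2, 3), ("-", 9, 4)),
]

def apply_op(op, a, b):
    return a + b if op == "+" else a - b if op == "-" else a * b

def make_evaluator(faulty_op=None):
    """Build an execution path; if faulty_op is set, that operator is
    deliberately wrong (its result is off by one)."""
    def run(e):
        if isinstance(e, int):
            return e
        op, left, right = e
        value = apply_op(op, run(left), run(right))
        return value + 1 if op == faulty_op else value
    return run

def mismatches(path_a, path_b):
    """The comparison under test: programs on which two paths disagree."""
    return [e for e in CORPUS if path_a(e) != path_b(e)]

reference = make_evaluator()

# Healthy paths must agree, otherwise the comparison raises false alarms.
false_alarms = mismatches(reference, make_evaluator())

# Every injected fault must be flagged, otherwise the comparison has a blind spot.
undetected = [op for op in "+-*" if not mismatches(reference, make_evaluator(faulty_op=op))]

print("false alarms:", false_alarms)
print("undetected injected faults:", undetected)
```

An entry in `undetected` would mean the comparison, or the corpus feeding it, is too weak to notice that class of fault. That is a finding about the testing tool, not about the programs it runs.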

## Where PIT fits

There is a third mutation-testing tool in the repository: PIT.

PIT is not a Vary command. It mutation-tests the Kotlin code that implements the Vary compiler.

| Tool | What it tests |
|------|---------------|
| `./gradlew :compiler:pitest` / `make kotlin-mutation` | The compiler's Kotlin code and its Kotlin unit tests |
| `vary mutate` | Vary user code and the tests written for that code |
| `vary vast` | Compiler semantic correctness through generated Vary programs |

This means the three tools answer different questions:

| Tool | Core question |
|------|---------------|
| PIT | Do our Kotlin tests catch mistakes in the compiler code? |
| `vary vast` | Does the compiler preserve the meaning of Vary programs? |
| `vary mutate` | Do your tests catch mistakes in your Vary code? |

PIT tests the compiler implementation from the inside. VAST tests the compiler from the outside. `vary mutate` tests user test suites.

## They work together

`vary mutate` validates your tests. `vary vast` validates the compiler that runs your tests. Neither replaces the other.

A high mutation score means little if the compiler miscompiles the code under test. A clean VAST run means little if nobody tested the user code. Both layers need to hold.
