## What is mutation testing?

Code coverage tells you which lines ran. It does not tell you whether your tests checked anything. A test that calls a function and ignores the return value gets 100% coverage and catches zero bugs.

Mutation testing answers a different question: if something in the code changed, would any test notice?

The compiler makes small changes to the compiled bytecode and program semantics (flipping conditions, mutating field access, altering null checks, shifting loop boundaries) and runs your tests against each changed version. Each changed version is a mutant. If the tests still pass, the mutant survived, and your tests have a gap. If a test fails, the mutant was killed.
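To make the killed/survived distinction concrete, here is a minimal sketch written in Python rather than Vary. The function, the two tests, the hand-written mutant, and the `survives` helper are all hypothetical. A real run generates mutants automatically.

```python
# Illustrative sketch only: Python stands in for Vary, and every name is hypothetical.

def add(a, b):
    return a + b

def add_mutant(a, b):
    # The kind of change a mutation tool makes: `+` becomes `-`.
    return a - b

def weak_test(fn):
    # Executes every line of fn (full coverage) but asserts nothing.
    fn(2, 3)

def strong_test(fn):
    # Checks the result, so a wrong result fails the test.
    assert fn(2, 3) == 5

def survives(test, mutant):
    """Return True if the test still passes when run against the mutant."""
    try:
        test(mutant)
        return True
    except AssertionError:
        return False

weak_survives = survives(weak_test, add_mutant)      # mutant survived
strong_survives = survives(strong_test, add_mutant)  # mutant killed
```

Both tests give the same line coverage for `add`, but only the one that asserts on the return value kills the mutant.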

If you want the smallest concrete walkthrough before the internals, read [Smallest example](/docs/mutation/smallest-example/).

## Bytecode mutation

Vary compiles to JVM bytecode, and mutation happens at the bytecode level. Compiled bytecode is a flat stream of instructions (`IADD` for addition, `ISUB` for subtraction, `IF_ICMPGT` for greater-than), so swapping one opcode for another is a single-byte change with no re-parsing and no re-compiling. The mutated class loads in an isolated classloader, tests run against it, and the classloader is discarded. Each cycle takes milliseconds per mutant, and mutants run in parallel.
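The single-byte nature of an opcode swap can be sketched with a toy model. The opcode values below are the real JVM ones, but the tiny interpreter and the `mutate` and `run` helpers are hypothetical and written in Python for illustration. This is not how Vary loads or executes classes.

```python
# Toy model of bytecode mutation. Opcode values match the JVM instruction set;
# the interpreter and helper names are hypothetical.
ILOAD_0, ILOAD_1 = 0x1A, 0x1B   # push int local variable 0 / 1
IADD, ISUB = 0x60, 0x64         # int add / int subtract
IRETURN = 0xAC                  # return the int on top of the stack

# Body of a method equivalent to: add(a, b) = a + b
ORIGINAL = bytes([ILOAD_0, ILOAD_1, IADD, IRETURN])

def mutate(code, offset, new_opcode):
    """Return a copy of code with the single byte at offset replaced."""
    mutated = bytearray(code)
    mutated[offset] = new_opcode
    return bytes(mutated)

def run(code, a, b):
    """Interpret the five opcodes above (ignores 32-bit overflow)."""
    stack = []
    for op in code:
        if op == ILOAD_0:
            stack.append(a)
        elif op == ILOAD_1:
            stack.append(b)
        elif op == IADD:
            y, x = stack.pop(), stack.pop()
            stack.append(x + y)
        elif op == ISUB:
            y, x = stack.pop(), stack.pop()
            stack.append(x - y)
        elif op == IRETURN:
            return stack.pop()
        else:
            raise ValueError(f"unsupported opcode {op:#x}")
    raise ValueError("method ended without returning")

MUTANT = mutate(ORIGINAL, 2, ISUB)   # IADD -> ISUB at offset 2
```

Every instruction in this toy is one byte with no operands. Real bytecode mixes opcodes with operand bytes, so a real tool has to decode instruction boundaries before choosing which byte to change.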

Most mutation testing tools rewrite source text or modify a syntax tree, which means re-compiling for every mutant. For a project with hundreds of mutants, that overhead adds up fast. Bytecode mutation avoids it: a file with 30 mutants finishes in seconds, which makes mutation testing practical during development rather than something you run overnight in CI.

Bytecode mutation is also more precise. Source-level rewriting has to deal with formatting, comments, and syntax ambiguity. Two source changes that look different can produce the same bytecode, or one source change can accidentally affect multiple operations. Bytecode has none of that. Each instruction has a fixed meaning, and swapping one is an unambiguous change.

## AST mutation

> **Note:** AST mutation is not a focus of Vary. Bytecode mutation is the primary approach and covers the vast majority of use cases. AST mutation exists as a secondary tool for specific situations where bytecode operators are not enough.

Vary also supports AST-level mutation (`--level ast`) for those cases. AST mutation modifies the parsed syntax tree before compilation. Each mutant goes through constant folding, type checking, and bytecode generation, so it is slower (a full compile pass per mutant). But it has access to higher-level program structures that bytecode cannot see: removing entire statements, swapping function arguments, dropping list elements, and skipping control flow blocks.

Use it when investigating specific survivor patterns or when you want the semantic operators (skip-effect, skip-block, drop-element, swap-args) that have no bytecode equivalent. You can also run both levels with `--level both`, which combines the results and deduplicates them.
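As an analogy for what a tree-level operator such as swap-args does, the sketch below uses Python's standard `ast` module. It is not Vary's implementation: the `SwapArgs` transformer, the sample source, and `compile_and_run` are hypothetical, and the transformer rewrites every call in the tree, whereas a mutation tool would produce one mutant per call site.

```python
import ast

class SwapArgs(ast.NodeTransformer):
    """Swap the first two positional arguments of every call that has them."""
    def visit_Call(self, node):
        self.generic_visit(node)
        if len(node.args) >= 2:
            node.args[0], node.args[1] = node.args[1], node.args[0]
        return node

SOURCE = """
def sub(a, b):
    return a - b

def result():
    return sub(10, 3)
"""

def compile_and_run(tree):
    # A full compile pass for the (possibly mutated) tree, then call result().
    namespace = {}
    exec(compile(tree, "<mutant>", "exec"), namespace)
    return namespace["result"]()

original_value = compile_and_run(ast.parse(SOURCE))

# Mutate the parsed tree, then compile it again from scratch.
mutant_tree = ast.fix_missing_locations(SwapArgs().visit(ast.parse(SOURCE)))
mutant_value = compile_and_run(mutant_tree)
```

The mutated tree has to be compiled again before it can run. That recompile is the per-mutant cost described above, and it is also why a structural change like this is easy to express at the tree level.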

| Level | Speed | Operators | Parallelism |
|-------|-------|-----------|-------------|
| `bytecode` (default) | Fast (milliseconds per mutant) | 6 bytecode operators | Parallel |
| `ast` | Moderate (recompiles per mutant) | 27 AST operators (17 classic + 10 semantic) | Sequential |
| `both` | Slower (runs both) | All 33 operators, deduplicated | Mixed |

## Running it

```bash
vary mutate calc.vary
```

The output shows a mutation score: the percentage of mutations your tests detected.

You can also run mutation testing across an entire directory:

```bash
vary mutate src/
```

## What mutators do

The bytecode level has 6 operators:

| Mutator | Example |
|---------|---------|
| Arithmetic | `IADD` becomes `ISUB` (covers int, long, float, double) |
| Conditional | `IF_ICMPLT` becomes `IF_ICMPLE` or `IF_ICMPGE` |
| Return value | Functions return `0`, `0L`, `0.0`, or `null` instead of computed results |
| Negation | `INEG` removed (negation has no effect) |
| Call skip | Method calls removed, replaced with default return values |
| Return poison | Functions return adversarial values like `-1` or `MAX_VALUE` |

The AST level has 27 operators: 17 classic operators, plus 10 semantic operators that understand program meaning:

| Mutator | Example |
|---------|---------|
| Arithmetic | `+` becomes `-`, `*` becomes `/` |
| Comparison | `>` becomes `>=`, `==` becomes `!=` |
| Boolean | `True` becomes `False`, `and` becomes `or` |
| Literal | `60` becomes `61`, `""` becomes `"mutant"` |
| Statement removal | Statements replaced with `pass` |
| Boundary | `<` becomes `<=` (off-by-one errors) |
| Return default | `return expr` becomes `return 0`, `return ""`, etc. |
| Skip effect | Side-effecting calls like `validate(data)` replaced with `pass` |
| Skip block | `if cond { body }` becomes `if cond { pass }` |
| Drop element | `[a, b, c]` becomes `[b, c]` or `[a, c]` |
| Swap arguments | `f(a, b)` becomes `f(b, a)` |
| Contract precondition | Mutates expressions inside `in {}` blocks |
| Contract postcondition | Mutates expressions inside `out(r) {}` / `post {}` blocks |
| Enum replace | `Color.Red` becomes `Color.Green` or `Color.Blue` |
| Contract remove | Entire `in {}` or `out(r) {}` block removed |
| Match swap | Match case bodies swapped with each other |
| Match pattern | Match guard removed or pattern replaced with wildcard |
| Boundary shift | Shifts loop bound and comparison together |
| Guard mismatch | Checks wrong field in a guard condition |
| Field swap | Reads a sibling field instead of the intended one |
| Omitted read | Removes a field from a calculation |
| Duplicate field | Uses one field twice instead of two distinct fields |
| Misbound constructor | Swaps constructor arguments with compatible types |
| Null weaken | Removes or weakens a null check |
| Null strengthen | Removes a null-safe fallback |
| Collection simplify | Weakens collection emptiness or membership check |
| Numeric boundary | Shifts a numeric boundary or division type |
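The boundary operator shows why the choice of test inputs matters. The sketch below is a hypothetical Python stand-in (the `is_minor` function, both tests, and the `passes` helper are invented for illustration): a `<` to `<=` mutant is only killed by a test that exercises the boundary value itself.

```python
# Hypothetical example: Python stands in for Vary.

def is_minor(age):
    return age < 18

def is_minor_mutant(age):
    # Boundary mutant: `<` became `<=`.
    return age <= 18

def far_from_boundary_test(fn):
    # Both inputs behave the same under `<` and `<=`.
    assert fn(10) is True
    assert fn(30) is False

def boundary_test(fn):
    # 18 is the only input where `<` and `<=` disagree.
    assert fn(17) is True
    assert fn(18) is False

def passes(test, fn):
    """Return True if the test passes against the given implementation."""
    try:
        test(fn)
        return True
    except AssertionError:
        return False
```

Here `passes(far_from_boundary_test, is_minor_mutant)` is true, so the mutant survives, while `passes(boundary_test, is_minor_mutant)` is false, so the mutant is killed.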

## Reading the output

After a mutation run, you see which mutants were killed (your tests caught them) and which survived (your tests missed them).

A surviving mutant means the compiler changed something and no test noticed. If a real bug made the same change, your tests would not catch it either.

Alongside the score, the output reports three metrics:

| Metric | Meaning |
|--------|---------|
| **Kill rate** | Fraction of mutants killed by tests |
| **Observability** | Fraction where the behavioural change reached an oracle boundary at all (killed mutants plus weak-oracle survivors) |
| **Actionable survivors** | Survivors worth investigating (excludes likely-equivalent mutants) |

A survivor breakdown shows the composition: weak-oracle (behaviour was seen but assertions were too weak), unobserved (behaviour never reached a test oracle), equivalent-likely (mutation probably has no observable effect), and other.
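The arithmetic behind these metrics can be sketched as follows. This assumes every metric uses the total number of mutants as its denominator and that actionable survivors are all survivors except the equivalent-likely ones. The `RunCounts` type, the function names, and the counts are made up for illustration and are not Vary's output format.

```python
from dataclasses import dataclass

@dataclass
class RunCounts:
    killed: int
    weak_oracle: int        # survivors: behaviour seen, assertions too weak
    unobserved: int         # survivors: behaviour never reached a test oracle
    equivalent_likely: int  # survivors: probably no observable effect
    other: int

    @property
    def total(self):
        return (self.killed + self.weak_oracle + self.unobserved
                + self.equivalent_likely + self.other)

def kill_rate(c):
    # Percentage of mutants killed by tests (assumes at least one mutant).
    return 100.0 * c.killed / c.total

def observability(c):
    # Killed mutants plus weak-oracle survivors, as a percentage of all mutants.
    return 100.0 * (c.killed + c.weak_oracle) / c.total

def actionable_survivors(c):
    # All survivors except the likely-equivalent ones.
    return c.weak_oracle + c.unobserved + c.other

# Made-up counts: 40 mutants in total.
counts = RunCounts(killed=30, weak_oracle=5, unobserved=3, equivalent_likely=1, other=1)
```

With these counts, the kill rate is 75.0, observability is 87.5, and there are 9 actionable survivors out of 10 survivors.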

You can gate CI on these metrics in `vary.toml`:

```toml
[mutation]
min_observability = 70.0       # Minimum observability score (0-100)
max_unobserved_survivors = 5   # Maximum unobserved survivors allowed
```
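One plausible reading of these two thresholds is sketched below in Python. The `gate_failures` function is hypothetical, and the inclusive comparisons (exactly 70.0 passes, exactly 5 passes) are an assumption rather than documented behaviour.

```python
# Hypothetical gate check; the keys mirror the [mutation] table above.
CONFIG = {"min_observability": 70.0, "max_unobserved_survivors": 5}

def gate_failures(observability, unobserved_survivors, config):
    """Return a list of threshold violations; an empty list means the gate passes."""
    failures = []
    min_obs = config.get("min_observability")
    max_unobserved = config.get("max_unobserved_survivors")
    # Assumption: a value equal to the threshold passes.
    if min_obs is not None and observability < min_obs:
        failures.append(f"observability {observability} is below {min_obs}")
    if max_unobserved is not None and unobserved_survivors > max_unobserved:
        failures.append(
            f"{unobserved_survivors} unobserved survivors exceeds {max_unobserved}"
        )
    return failures
```

A run with observability 87.5 and 3 unobserved survivors returns an empty list, while a run with observability 65.0 and 6 unobserved survivors returns two violations.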

## Drilling into survivors

The output includes a survivor groups table. To see individual mutants in a group, use `--expand`:

```bash
vary mutate src/ --expand "math#add"
```

To understand why a specific mutant survived, use `--why` with its ID (shown in the `--expand` output):

```bash
vary mutate math.vary --why "add:LIT_CHANGE:abc123"
```

This shows what was changed, where in the code, why your tests missed it, and what assertion would catch it.

For the full step-by-step workflow, see [Golden path](/docs/mutation/workflow/).

For all operators, flags, and advanced features, see [Advanced overview](/docs/mutation/advanced/).

## What score to aim for

There is no universal target. 100% is not always practical. But below 60% usually means tests are not checking return values or are missing branches.

The `--why` output and the leverage fixes tell you more than the score itself. They point to exactly where the gaps are and what to write next.