
> When VAST finds a bug, the failing program might be hundreds of lines long. The reducer automatically strips it down to a minimal program that still triggers the same bug, often just a few lines.

## The problem with generated programs

When VAST finds a mismatch, the failing program can be large. A `complete` profile program might have enum definitions, helper functions, nested loops, match statements, and dozens of variables. Most of that code is irrelevant to the bug.

Debugging a 300-line generated program is slow because you first have to find the handful of lines that actually trigger the mismatch.

## Automatic reduction

VAST includes an automatic reducer that shrinks failing programs while preserving the mismatch. The reducer repeatedly tries simplifications and keeps the ones that still trigger the same failure.

```text
Failing program (156 AST nodes)
       |
       v
  Try removing a statement
  Try simplifying an expression
  Try replacing a variable with a constant
  Try eliminating a function
  Try pruning a branch
       |
       v
  Still triggers the same mismatch?
       |
  yes --> keep the simplification, try more
  no  --> revert, try a different simplification
       |
       v
  Minimal reproducer (12 AST nodes)
```

The reducer stops when no further simplification preserves the mismatch, or when it hits the attempt limit or time budget.
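
The loop above can be sketched in a few lines of Python. This is a minimal illustration, not VAST's implementation: it models a program as a list of statement strings, tries only statement removal, and takes the oracle as a hypothetical `still_fails` callback. The default limits mirror the `--max-shrink` and `--max-reduce-time` defaults listed under CLI flags.

```python
import time
from typing import Callable, Iterable, List

Program = List[str]  # assumption: a program modeled as a list of statements


def candidates(program: Program) -> Iterable[Program]:
    """Yield every program obtained by deleting exactly one statement."""
    for i in range(len(program)):
        yield program[:i] + program[i + 1:]


def reduce_program(
    program: Program,
    still_fails: Callable[[Program], bool],  # hypothetical oracle callback
    max_attempts: int = 1000,
    time_budget_s: float = 30.0,
) -> Program:
    """Greedily keep simplifications that preserve the failure."""
    deadline = time.monotonic() + time_budget_s
    attempts = 0
    progress = True
    while progress:
        progress = False
        for candidate in candidates(program):
            if attempts >= max_attempts or time.monotonic() > deadline:
                return program  # limit hit: return the best program so far
            attempts += 1
            if still_fails(candidate):
                program = candidate  # keep the simplification
                progress = True
                break  # restart from the smaller program
            # otherwise "revert" by simply not adopting the candidate
    return program
```

With a toy oracle that fails whenever both `"bug1"` and `"bug2"` are present, `reduce_program(["a", "b", "bug1", "c", "bug2"], ...)` converges on `["bug1", "bug2"]`.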

## Reduction passes

The reducer applies several kinds of simplification:

| Pass | What it tries |
|------|--------------|
| Statement removal | Delete a statement and check if the mismatch persists |
| Expression simplification | Replace a complex expression with a simpler one (e.g., `(x + y) * z` to `x`) |
| Literal replacement | Replace a variable reference with a constant |
| Function elimination | Remove a helper function and inline or replace its calls |
| Loop unrolling | Replace a while loop with its first iteration |
| Branch pruning | Replace an if/else with just one branch |

Each pass runs repeatedly until it stops making progress, then the next pass runs. Because one pass can expose new opportunities for another, the reducer cycles through all passes until none can reduce the program further.
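
The pass schedule can be pictured as two nested fixpoint loops. The sketch below is an assumption about the control flow only. The pass interface (a function returning a smaller program or `None`) and the two toy passes are hypothetical, and the oracle check that a real pass would perform is left out for brevity.

```python
from typing import Callable, List, Optional

Program = List[str]
# Assumption: a pass returns a smaller program, or None if it made no progress.
Pass = Callable[[Program], Optional[Program]]


def run_passes(program: Program, passes: List[Pass]) -> Program:
    changed = True
    while changed:  # cycle until a full round changes nothing
        changed = False
        for reduction_pass in passes:
            while True:  # repeat one pass until it stalls
                smaller = reduction_pass(program)
                if smaller is None:
                    break
                program = smaller
                changed = True
    return program


def drop_noop(program: Program) -> Optional[Program]:
    """Toy pass: delete the first 'noop' statement."""
    if "noop" not in program:
        return None
    i = program.index("noop")
    return program[:i] + program[i + 1:]


def unwrap(program: Program) -> Optional[Program]:
    """Toy pass: replace the first 'wrap:<stmt>' with '<stmt>'."""
    for i, stmt in enumerate(program):
        if stmt.startswith("wrap:"):
            return program[:i] + [stmt[len("wrap:"):]] + program[i + 1:]
    return None
```

Running `run_passes(["noop", "wrap:noop", "bug"], [drop_noop, unwrap])` needs a second round: `unwrap` exposes a new `noop` that only `drop_noop` can remove, which is why a single sweep over the passes is not enough.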

## Cost metric

The reducer measures program size using a composite cost:

| Metric | What it counts |
|--------|---------------|
| Node count | Total AST nodes (primary metric) |
| Statement count | Top-level and nested statements |
| Max depth | Deepest nesting level |
| Function count | Number of helper functions |

A reduction is accepted if the overall cost decreases and the mismatch still occurs.
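
One way to make "overall cost decreases" concrete is a lexicographic comparison. The sketch below assumes that node count dominates (the table marks it as the primary metric) and that the other metrics only break ties. The tie-break order and the `Cost` and `accept` names are illustrative assumptions, not VAST's documented formula.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Cost:
    node_count: int
    statement_count: int
    max_depth: int
    function_count: int

    def key(self) -> Tuple[int, int, int, int]:
        # Assumption: node count is compared first; the remaining
        # metrics break ties in this (hypothetical) order.
        return (
            self.node_count,
            self.statement_count,
            self.max_depth,
            self.function_count,
        )


def accept(before: Cost, after: Cost, mismatch_persists: bool) -> bool:
    """Accept a candidate only if it is strictly cheaper and still fails."""
    return mismatch_persists and after.key() < before.key()
```

Under this ordering, a candidate with the same node count but fewer statements is still accepted, while a cheaper candidate that no longer triggers the mismatch is rejected.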

## The oracle

The reducer needs a way to test whether a simplified program still exhibits the same bug. That test is the oracle.

The oracle runs the reduced program through all execution paths and checks two things:

| Step | Check |
|------|-------|
| 1 | The program still compiles and runs (it was not reduced into nonsense) |
| 2 | The mismatch verdict matches the original (same paths disagree, same kind of failure) |

The oracle uses the failure signature (a hash of the verdict, blame, and path outcomes) to detect when a reduction accidentally changes the bug being triggered.
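
A failure signature of this kind could be computed as follows. This is a sketch under stated assumptions: the JSON serialization, the choice of SHA-256, and the verdict, blame, and path names in the usage note are all hypothetical. Only the three hashed inputs come from the description above.

```python
import hashlib
import json
from typing import Dict


def failure_signature(
    verdict: str, blame: str, path_outcomes: Dict[str, str]
) -> str:
    """Hash the verdict, blame, and per-path outcomes into one signature."""
    payload = json.dumps(
        {"verdict": verdict, "blame": blame, "paths": path_outcomes},
        sort_keys=True,  # stable output regardless of dict insertion order
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def same_bug(
    original_signature: str,
    verdict: str,
    blame: str,
    path_outcomes: Dict[str, str],
) -> bool:
    """True if a reduced program fails in the same way as the original."""
    return failure_signature(verdict, blame, path_outcomes) == original_signature
```

For example, if the original program produced `failure_signature("MISMATCH", "codegen", {"interp": "4", "jit": "5"})`, a candidate whose blame shifts to a different component yields a different signature, and the reducer would revert that simplification.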

## Reduction outcomes

| Status | Meaning |
|--------|---------|
| `REDUCED` | The program was successfully minimized |
| `NO_IMPROVEMENT` | No attempted simplification preserved the mismatch, so the program could not be reduced further |
| `MAX_ATTEMPTS` | The reducer hit its attempt limit before converging |
| `TIME_BUDGET_EXCEEDED` | The reducer ran out of time |
| `ORACLE_ERROR` | The oracle itself threw an exception |

## Example

Before reduction (156 nodes):

```vary
def helper_0(a: Int, b: Int) -> Int {
    if a > b {
        return a - b
    } else {
        return b - a
    }
}

def __vast_compute() -> Int {
    mut x = 10
    let y = 3
    mut i = 0
    while i < 5 {
        x = helper_0(x, y)
        i = i + 1
    }
    let z = (x - y) + y
    return z
}
```

After reduction (12 nodes):

```vary
def __vast_compute() -> Int {
    let x = 4
    let y = 3
    return (x - y) + y
}
```

The reducer found that the loop, the helper function, and most variables were irrelevant. The bug is in how `(x - y) + y` is compiled.

## Reproducer corpus

Reduced programs are saved to a reproducer corpus. Each reproducer includes:

| Field | Contents |
|-------|----------|
| Source code | Original and reduced versions |
| Metadata | Seed, profile, and verdict |
| Fingerprint | Hash of program, blame, and verdict (for deduplication) |
| Size | Original and reduced node counts |
| Timestamp | When the failure was first recorded |

The corpus can be replayed against future compiler versions:

```bash
vary vast --replay-corpus .vast-corpus/
```

This checks whether known bugs are still present or have been fixed.
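
The fingerprint field is what lets the corpus skip duplicates. The sketch below shows one possible scheme. It assumes the fingerprint covers the reduced program (the table does not say which version is hashed), and the in-memory `Corpus` class and the entry keys are hypothetical.

```python
import hashlib
from typing import Dict


def fingerprint(program: str, blame: str, verdict: str) -> str:
    """Hash program, blame, and verdict into a deduplication key."""
    h = hashlib.sha256()
    for part in (program, blame, verdict):
        h.update(part.encode("utf-8"))
        h.update(b"\x00")  # separator so ("ab", "c") and ("a", "bc") differ
    return h.hexdigest()


class Corpus:
    """Hypothetical in-memory corpus keyed by fingerprint."""

    def __init__(self) -> None:
        self.entries: Dict[str, dict] = {}

    def add(self, entry: dict) -> bool:
        """Store the entry unless the same failure is already recorded."""
        # Assumption: the reduced program is the one that gets hashed.
        key = fingerprint(entry["reduced"], entry["blame"], entry["verdict"])
        if key in self.entries:
            return False
        self.entries[key] = entry
        return True
```

Adding the same reduced program with the same blame and verdict a second time returns `False`, so repeated runs that rediscover one bug do not grow the corpus.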

## CLI flags

| Flag | Default | Description |
|------|---------|-------------|
| `--reduce` | off | Enable automatic shrinking when a mismatch is found |
| `--max-shrink` | 1000 | Maximum number of oracle calls per reduction |
| `--max-reduce-time` | 30s | Maximum wall-clock time for reduction |
| `--corpus-dir` | none | Directory to save/load reproducer corpus |
| `--replay-corpus` | off | Replay saved reproducers before exploring new programs |

## Why reduction matters

Without reduction, every mismatch requires manual investigation of a large generated program. With reduction, the developer sees a minimal reproducer that isolates the bug. The difference is between "somewhere in these 300 lines" and "this 5-line program triggers the bug."

In CI, reduced programs are saved as [regression artifacts](/docs/vast/ci-integration/) for tracking and verification.
