When VAST finds a bug, the failing program might be hundreds of lines long. The reducer automatically strips it down to the smallest program that still triggers the same bug, often just a few lines.
When VAST finds a mismatch, the failing program can be large. A complete profile program might have enum definitions, helper functions, nested loops, match statements, and dozens of variables. Most of that code is irrelevant to the bug.
Debugging a 300-line generated program is slow. You need to find the 5 lines that actually trigger the mismatch.
VAST includes an automatic reducer that shrinks failing programs while preserving the mismatch. The reducer repeatedly tries simplifications and keeps the ones that still trigger the same failure.
```
Failing program (156 AST nodes)
        |
        v
  Try removing a statement
  Try simplifying an expression
  Try replacing a constant
  Try eliminating a function
  Try pruning a branch
        |
        v
  Still triggers the same mismatch?
        |
    yes --> keep the simplification, try more
    no  --> revert, try a different simplification
        |
        v
Minimal reproducer (31 AST nodes)
```
The reducer stops when no further simplification preserves the mismatch, or when it hits the attempt limit or time budget.
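The keep-or-revert loop and its two budgets can be sketched in a few lines of Python. This is a minimal illustration, not VAST's implementation: the program is modelled as a tuple of statement strings, the only simplification is statement removal, and `reduce_program`, `candidates`, and `still_fails` are hypothetical names. The default limits mirror the documented defaults for `--max-shrink` and `--max-reduce-time`.

```python
import time
from typing import Callable, Iterable, Tuple

# Toy stand-in (assumption): a program is a tuple of statement strings.
Program = Tuple[str, ...]


def candidates(program: Program) -> Iterable[Program]:
    """Yield every program obtained by deleting exactly one statement."""
    for i in range(len(program)):
        yield program[:i] + program[i + 1:]


def reduce_program(
    program: Program,
    still_fails: Callable[[Program], bool],
    max_attempts: int = 1000,
    time_budget_s: float = 30.0,
) -> Tuple[Program, str]:
    """Greedy reduction: keep any simplification that preserves the failure."""
    original = program
    deadline = time.monotonic() + time_budget_s
    attempts = 0
    progress = True
    while progress:
        progress = False
        for candidate in candidates(program):
            if attempts >= max_attempts:
                return program, "MAX_ATTEMPTS"
            if time.monotonic() > deadline:
                return program, "TIME_BUDGET_EXCEEDED"
            attempts += 1
            if still_fails(candidate):
                program = candidate  # keep the simplification, try more
                progress = True
                break                # restart from the smaller program
            # Otherwise the candidate is discarded (revert) and the next is tried.
    status = "REDUCED" if program != original else "NO_IMPROVEMENT"
    return program, status


# Hypothetical oracle: the failure needs both "bug1" and "bug2" to be present.
needs_both = lambda p: "bug1" in p and "bug2" in p
reduced, status = reduce_program(("a", "b", "bug1", "c", "bug2"), needs_both)
print(reduced, status)
```

Restarting from the smaller program after every accepted change keeps the sketch simple. A real reducer would likely be more careful about which candidates it retries.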
The reducer applies several kinds of simplification:
| Pass | What it tries |
|---|---|
| Statement removal | Delete a statement and check if the mismatch persists |
| Expression simplification | Replace a complex expression with a simpler one (e.g., (x + y) * z to x) |
| Literal replacement | Replace a variable reference with a constant |
| Function elimination | Remove a helper function and inline or replace its calls |
| Loop unrolling | Replace a while loop with its first iteration |
| Branch pruning | Replace an if/else with just one branch |
Each pass is tried repeatedly until it stops making progress. Then the next pass runs. The reducer cycles through all passes until no pass can reduce the program further.
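The scheduling described above is a nested fixpoint loop. The sketch below assumes each pass is a function that returns a smaller program or `None` when it cannot make progress, and that the pass has already checked the mismatch itself. The two toy passes (`drop_nop`, `drop_dup`) are invented for illustration and are not VAST passes.

```python
from typing import Callable, List, Optional, Sequence

Program = List[str]
# Assumption: a pass returns a smaller program, or None when it has stalled.
Pass = Callable[[Program], Optional[Program]]


def run_passes(program: Program, passes: Sequence[Pass]) -> Program:
    changed = True
    while changed:                  # cycle through all passes until none helps
        changed = False
        for apply_pass in passes:
            while True:             # repeat one pass until it stops making progress
                smaller = apply_pass(program)
                if smaller is None:
                    break
                program = smaller
                changed = True
    return program


def drop_nop(program: Program) -> Optional[Program]:
    """Toy pass: delete the first 'nop' statement, if any."""
    if "nop" in program:
        i = program.index("nop")
        return program[:i] + program[i + 1:]
    return None


def drop_dup(program: Program) -> Optional[Program]:
    """Toy pass: delete one of two adjacent identical statements."""
    for i in range(len(program) - 1):
        if program[i] == program[i + 1]:
            return program[:i] + program[i + 1:]
    return None


print(run_passes(["x", "nop", "x"], [drop_dup, drop_nop]))
```

The example shows why cycling matters: `drop_dup` finds nothing on its first run, but once `drop_nop` has removed the statement in between, a second cycle lets `drop_dup` succeed.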
The reducer measures program size using a composite cost:
| Metric | What it counts |
|---|---|
| Node count | Total AST nodes (primary metric) |
| Statement count | Top-level and nested statements |
| Max depth | Deepest nesting level |
| Function count | Number of helper functions |
A reduction is accepted if the overall cost decreases and the mismatch still occurs.
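The text does not say how the four metrics are combined into one cost. One plausible reading, shown here purely as an assumption, is a lexicographic comparison with node count first and the other metrics as tie-breakers. The `Cost` class and `accept` function are hypothetical names, and the figures other than the 156 and 12 node counts are made up.

```python
from dataclasses import dataclass


# Assumption: metrics are compared lexicographically in field order, so node
# count dominates and the remaining metrics only break ties.
@dataclass(frozen=True, order=True)
class Cost:
    nodes: int
    statements: int
    max_depth: int
    functions: int


def accept(before: Cost, after: Cost, mismatch_persists: bool) -> bool:
    """Accept a reduction only if it is cheaper and the mismatch still occurs."""
    return mismatch_persists and after < before


before = Cost(nodes=156, statements=20, max_depth=4, functions=1)
after = Cost(nodes=12, statements=3, max_depth=1, functions=0)
print(accept(before, after, mismatch_persists=True))
```

A weighted sum would be an equally valid design. The lexicographic form is used here only because it keeps node count strictly primary.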
The reducer needs a way to test whether a simplified program still exhibits the same bug. This is called the oracle.
The oracle runs the reduced program through all execution paths and checks two things:
| Step | Check |
|---|---|
| 1 | The program still compiles and runs (it was not reduced into nonsense) |
| 2 | The mismatch verdict matches the original (same paths disagree, same kind of failure) |
The oracle uses the failure signature (a hash of the verdict, blame, and path outcomes) to detect when a reduction accidentally changes the bug being triggered.
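A failure signature of this kind could be computed roughly as follows. The field names, the JSON serialization, the choice of SHA-256, and the example values (`"MISMATCH"`, `"codegen"`, the path names) are all assumptions for illustration. The source only states that the signature is a hash of the verdict, blame, and path outcomes.

```python
import hashlib
import json
from typing import Dict, Optional


def failure_signature(verdict: str, blame: str, path_outcomes: Dict[str, str]) -> str:
    """Hash the parts of a result that identify which bug was triggered."""
    payload = json.dumps(
        {"verdict": verdict, "blame": blame, "paths": path_outcomes},
        sort_keys=True,  # stable serialization regardless of dict order
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def oracle(candidate_result: Optional[dict], original_signature: str) -> bool:
    """Check 1: the candidate compiled and ran. Check 2: same failure signature."""
    if candidate_result is None:  # hypothetical marker for "did not compile or run"
        return False
    return failure_signature(**candidate_result) == original_signature


original = {
    "verdict": "MISMATCH",
    "blame": "codegen",
    "path_outcomes": {"interpreter": "4", "compiled": "3"},
}
signature = failure_signature(**original)
print(oracle(original, signature))
```

Comparing signatures instead of just "did something fail" is what stops the reducer from drifting onto a different bug halfway through.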
| Status | Meaning |
|---|---|
| REDUCED | The program was successfully minimized |
| NO_IMPROVEMENT | The program was already minimal (no simplification preserved the mismatch) |
| MAX_ATTEMPTS | The reducer hit its attempt limit before converging |
| TIME_BUDGET_EXCEEDED | The reducer ran out of time |
| ORACLE_ERROR | The oracle itself threw an exception |
Before reduction (156 nodes):
```
def helper_0(a: Int, b: Int) -> Int {
    if a > b {
        return a - b
    } else {
        return b - a
    }
}

def __vast_compute() -> Int {
    mut x = 10
    let y = 3
    mut i = 0
    while i < 5 {
        x = helper_0(x, y)
        i = i + 1
    }
    let z = (x - y) + y
    return z
}
```
After reduction (12 nodes):
```
def __vast_compute() -> Int {
    let x = 4
    let y = 3
    return (x - y) + y
}
```
The reducer found that the loop, the helper function, and most variables were irrelevant. The bug is in how (x - y) + y is compiled.
Reduced programs are saved to a reproducer corpus. Each reproducer includes:
| Field | Contents |
|---|---|
| Source code | Original and reduced versions |
| Metadata | Seed, profile, and verdict |
| Fingerprint | Hash of program, blame, and verdict (for deduplication) |
| Size | Original and reduced node counts |
| Timestamp | When the failure was first recorded |
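A corpus entry and its deduplication could look roughly like this. The record layout follows the table above, but the class and function names, the SHA-256 hash, and the decision to fingerprint the reduced source (not the original) are assumptions.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Reproducer:
    original_source: str
    reduced_source: str
    seed: int
    profile: str
    verdict: str
    blame: str
    original_nodes: int
    reduced_nodes: int
    first_seen: str  # timestamp of the first recorded failure, e.g. ISO 8601

    @property
    def fingerprint(self) -> str:
        # Assumption: the "program" in the fingerprint is the reduced source.
        h = hashlib.sha256()
        for part in (self.reduced_source, self.blame, self.verdict):
            h.update(part.encode("utf-8"))
            h.update(b"\x00")  # separator so adjacent fields cannot run together
        return h.hexdigest()


def add_to_corpus(corpus: Dict[str, Reproducer], repro: Reproducer) -> bool:
    """Store a reproducer unless the same bug is already recorded."""
    if repro.fingerprint in corpus:
        return False  # duplicate: keep the earlier record and its timestamp
    corpus[repro.fingerprint] = repro
    return True
```

Two different seeds that reduce to the same minimal program with the same blame and verdict then collapse into one entry, which keeps the first timestamp.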
The corpus can be replayed against future compiler versions:
```
vary vast --replay-corpus .vast-corpus/
```
This checks whether known bugs are still present or have been fixed.
| Flag | Default | Description |
|---|---|---|
| --reduce | off | Enable automatic shrinking when a mismatch is found |
| --max-shrink | 1000 | Maximum number of oracle calls per reduction |
| --max-reduce-time | 30s | Maximum wall-clock time for reduction |
| --corpus-dir | none | Directory to save/load reproducer corpus |
| --replay-corpus | off | Replay saved reproducers before exploring new programs |
Without reduction, every mismatch requires manual investigation of a large generated program. With reduction, the developer sees a minimal reproducer that isolates the bug. The difference is between "somewhere in these 300 lines" and "this 5-line program triggers the bug."
In CI, reduced programs are saved as regression artifacts for tracking and verification.