VAST

Reduction

When VAST finds a bug, the failing program might be hundreds of lines long. The reducer automatically strips it down to the smallest program that still triggers the same bug, often just a few lines.

The problem with generated programs

When VAST finds a mismatch, the failing program can be large. A complete profile program might have enum definitions, helper functions, nested loops, match statements, and dozens of variables. Most of that code is irrelevant to the bug.

Debugging a 300-line generated program is slow. You need to find the 5 lines that actually trigger the mismatch.

Automatic reduction

VAST includes an automatic reducer that shrinks failing programs while preserving the mismatch. The reducer repeatedly tries simplifications and keeps the ones that still trigger the same failure.

Failing program (156 AST nodes)
       |
       v
  Try removing a statement
  Try simplifying an expression
  Try replacing a variable with a constant
  Try eliminating a function
  Try pruning a branch
       |
       v
  Still triggers the same mismatch?
       |
  yes --> keep the simplification, try more
  no  --> revert, try a different simplification
       |
       v
  Minimal reproducer (31 AST nodes)

The reducer stops when no further simplification preserves the mismatch, or when it hits the attempt limit or time budget.
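The loop above can be sketched in a few lines of Python. This is a minimal illustration, not VAST's actual implementation: the program is modeled as a plain list of statement strings, the only simplification is statement deletion, and `reduce_program`, `still_fails`, and `toy_oracle` are hypothetical names. The returned status strings reuse the outcome names listed under "Reduction outcomes".

```python
import time
from typing import Callable, List, Tuple

Program = List[str]  # toy stand-in: a program is a list of statement strings


def reduce_program(
    program: Program,
    still_fails: Callable[[Program], bool],
    max_attempts: int = 1000,
    time_budget_s: float = 30.0,
) -> Tuple[Program, str]:
    """Greedily delete statements while the oracle keeps reporting the failure."""
    start = time.monotonic()
    original_size = len(program)
    attempts = 0
    current = list(program)
    progress = True
    while progress:  # keep sweeping until a full sweep removes nothing
        progress = False
        i = 0
        while i < len(current):
            if attempts >= max_attempts:
                return current, "MAX_ATTEMPTS"
            if time.monotonic() - start > time_budget_s:
                return current, "TIME_BUDGET_EXCEEDED"
            candidate = current[:i] + current[i + 1:]
            attempts += 1
            if still_fails(candidate):
                current = candidate  # keep the simplification
                progress = True
            else:
                i += 1  # revert: statement i stays, try the next one
    status = "REDUCED" if len(current) < original_size else "NO_IMPROVEMENT"
    return current, status


# Hypothetical oracle: pretend the bug needs these two statements to be present.
def toy_oracle(p: Program) -> bool:
    return "let y = 3" in p and "return (x - y) + y" in p


failing = ["mut x = 10", "let y = 3", "mut i = 0", "let z = 1", "return (x - y) + y"]
reduced, status = reduce_program(failing, toy_oracle)
print(reduced, status)
```

A real oracle would also reject candidates that no longer compile; the toy oracle skips that check, which is why it accepts a program that uses `x` without defining it.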

Reduction passes

The reducer applies several kinds of simplification:

| Pass | What it tries |
| --- | --- |
| Statement removal | Delete a statement and check if the mismatch persists |
| Expression simplification | Replace a complex expression with a simpler one (e.g., (x + y) * z to x) |
| Literal replacement | Replace a variable reference with a constant |
| Function elimination | Remove a helper function and inline or replace its calls |
| Loop unrolling | Replace a while loop with its first iteration |
| Branch pruning | Replace an if/else with just one branch |

Each pass is tried repeatedly until it stops making progress. Then the next pass runs. The reducer cycles through all passes until no pass can reduce the program further.
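The scheduling described here (repeat one pass until it stalls, move to the next, and cycle until nothing changes) might look like the following sketch. It is an illustration under simplifying assumptions: programs are lists of statement strings, the two passes are toy versions of expression simplification and statement removal, and the oracle check that a real pass would perform before accepting a change is omitted. All function names are hypothetical.

```python
import re
from typing import Callable, List, Optional

Program = List[str]  # toy stand-in: one string per statement
Pass = Callable[[Program], Optional[Program]]  # returns None when it cannot help


def simplify_expression(p: Program) -> Optional[Program]:
    """Rewrite the first '(x + y) * z' to 'x'."""
    for i, stmt in enumerate(p):
        if "(x + y) * z" in stmt:
            return p[:i] + [stmt.replace("(x + y) * z", "x")] + p[i + 1:]
    return None


def remove_unused_let(p: Program) -> Optional[Program]:
    """Delete the first 'let' whose variable no other statement mentions."""
    for i, stmt in enumerate(p):
        m = re.match(r"let (\w+) =", stmt)
        if m is None:
            continue
        others = p[:i] + p[i + 1:]
        if not any(re.search(rf"\b{m.group(1)}\b", s) for s in others):
            return others
    return None


def run_to_fixpoint(program: Program, passes: List[Pass]) -> Program:
    current = program
    changed = True
    while changed:  # cycle through all passes again
        changed = False
        for apply_pass in passes:
            while True:  # repeat this pass until it stops making progress
                result = apply_pass(current)
                if result is None:
                    break
                current = result
                changed = True
    return current


program = ["let x = 4", "let y = 3", "let z = 2", "return (x + y) * z"]
result = run_to_fixpoint(program, [remove_unused_let, simplify_expression])
print(result)
```

In this run the removal pass finds nothing on the first cycle because every variable is still used. Only after the expression is simplified do `y` and `z` become dead, so the second cycle removes them. That interaction is the reason for cycling through the passes rather than running each one once.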

Cost metric

The reducer measures program size using a composite cost:

| Metric | What it counts |
| --- | --- |
| Node count | Total AST nodes (primary metric) |
| Statement count | Top-level and nested statements |
| Max depth | Deepest nesting level |
| Function count | Number of helper functions |

A reduction is accepted if the overall cost decreases and the mismatch still occurs.
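As a rough sketch of how such a composite cost could be computed and compared, the example below walks a toy AST and collects the four metrics into a tuple. How VAST actually combines the metrics is not specified here; comparing the tuple lexicographically, with node count first, is an assumption made for illustration. The `Node` class, the node kinds, and the function names are also hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

STATEMENT_KINDS = {"let", "assign", "return", "while", "if"}  # assumed kinds


@dataclass
class Node:
    kind: str
    children: List["Node"] = field(default_factory=list)


def node_count(n: Node) -> int:
    return 1 + sum(node_count(c) for c in n.children)


def statement_count(n: Node) -> int:
    own = 1 if n.kind in STATEMENT_KINDS else 0
    return own + sum(statement_count(c) for c in n.children)


def max_depth(n: Node) -> int:
    return 1 + max((max_depth(c) for c in n.children), default=0)


def helper_count(n: Node) -> int:
    own = 1 if n.kind == "helper_func" else 0
    return own + sum(helper_count(c) for c in n.children)


def cost(n: Node) -> Tuple[int, int, int, int]:
    # Assumption: tuples compare lexicographically, so node count dominates.
    return (node_count(n), statement_count(n), max_depth(n), helper_count(n))


def accept(before: Node, after: Node, mismatch_persists: bool) -> bool:
    """Accept only if the cost went down and the mismatch is still there."""
    return mismatch_persists and cost(after) < cost(before)


before = Node("program", [
    Node("helper_func", [Node("return", [Node("lit")])]),
    Node("entry_func", [
        Node("let", [Node("lit")]),
        Node("return", [Node("binop", [Node("var"), Node("lit")])]),
    ]),
])
after = Node("program", [Node("entry_func", [Node("return", [Node("lit")])])])

print(cost(before), cost(after), accept(before, after, mismatch_persists=True))
```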

The oracle

The reducer needs a way to test whether a simplified program still exhibits the same bug. This is called the oracle.

The oracle runs the reduced program through all execution paths and checks two things:

| Step | Check |
| --- | --- |
| 1 | The program still compiles and runs (it was not reduced into nonsense) |
| 2 | The mismatch verdict matches the original (same paths disagree, same kind of failure) |

The oracle uses the failure signature (a hash of the verdict, blame, and path outcomes) to detect when a reduction accidentally changes the bug being triggered.
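One plausible way to build such a signature is to serialize the three ingredients in a canonical form and hash the result. The sketch below assumes SHA-256 over sorted JSON; the actual hash and encoding VAST uses are not stated here. The path names (`interpreter`, `compiled`), the outcome labels, and the verdict string are hypothetical placeholders.

```python
import hashlib
import json
from typing import Dict


def failure_signature(verdict: str, blame: str, path_outcomes: Dict[str, str]) -> str:
    """Hash verdict, blame, and per-path outcomes into a stable hex digest."""
    payload = json.dumps(
        {"verdict": verdict, "blame": blame, "paths": path_outcomes},
        sort_keys=True,  # canonical key order, so dict ordering cannot change the hash
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def same_bug(original_signature: str, verdict: str, blame: str,
             path_outcomes: Dict[str, str]) -> bool:
    """Oracle step 2: does the candidate fail in the same way as the original?"""
    return failure_signature(verdict, blame, path_outcomes) == original_signature


original = failure_signature(
    "MISMATCH", "compiled", {"interpreter": "returned", "compiled": "returned"}
)

# Same failure, keys listed in a different order: still the same signature.
print(same_bug(original, "MISMATCH", "compiled",
               {"compiled": "returned", "interpreter": "returned"}))

# The reduction turned a wrong-value bug into a crash: a different signature.
print(same_bug(original, "MISMATCH", "compiled",
               {"interpreter": "returned", "compiled": "trapped"}))
```

The sketch records outcome kinds rather than concrete return values. Reduction routinely changes the values a program computes, so a signature that included them would reject almost every simplification.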

Reduction outcomes

| Status | Meaning |
| --- | --- |
| REDUCED | The program was successfully minimized |
| NO_IMPROVEMENT | The program was already minimal (no simplification preserved the mismatch) |
| MAX_ATTEMPTS | The reducer hit its attempt limit before converging |
| TIME_BUDGET_EXCEEDED | The reducer ran out of time |
| ORACLE_ERROR | The oracle itself threw an exception |

Example

Before reduction (156 nodes):

def helper_0(a: Int, b: Int) -> Int {
    if a > b {
        return a - b
    } else {
        return b - a
    }
}

def __vast_compute() -> Int {
    mut x = 10
    let y = 3
    mut i = 0
    while i < 5 {
        x = helper_0(x, y)
        i = i + 1
    }
    let z = (x - y) + y
    return z
}

After reduction (12 nodes):

def __vast_compute() -> Int {
    let x = 4
    let y = 3
    return (x - y) + y
}

The reducer found that the loop, the helper function, and most variables were irrelevant. The bug is in how (x - y) + y is compiled.

Reproducer corpus

Reduced programs are saved to a reproducer corpus. Each reproducer includes:

| Field | Contents |
| --- | --- |
| Source code | Original and reduced versions |
| Metadata | Seed, profile, and verdict |
| Fingerprint | Hash of program, blame, and verdict (for deduplication) |
| Size | Original and reduced node counts |
| Timestamp | When the failure was first recorded |
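To illustrate how a fingerprint enables deduplication, the sketch below models a reproducer record and a corpus keyed by fingerprint. It is not VAST's on-disk format. The field names, the use of SHA-256, the choice to hash the reduced program rather than the original, and the sample field values are all assumptions.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict


@dataclass
class Reproducer:
    original_source: str
    reduced_source: str
    seed: int
    profile: str
    verdict: str
    blame: str
    original_nodes: int
    reduced_nodes: int
    timestamp: str

    def fingerprint(self) -> str:
        # Assumption: hash the reduced program together with blame and verdict.
        h = hashlib.sha256()
        for part in (self.reduced_source, self.blame, self.verdict):
            h.update(part.encode("utf-8"))
            h.update(b"\x00")  # separator so adjacent fields cannot run together
        return h.hexdigest()


def add_to_corpus(corpus: Dict[str, Reproducer], r: Reproducer) -> bool:
    """Store r unless an entry with the same fingerprint already exists."""
    fp = r.fingerprint()
    if fp in corpus:
        return False  # duplicate: keep the first recording and its timestamp
    corpus[fp] = r
    return True


reduced = "def __vast_compute() -> Int {\n    return (4 - 3) + 3\n}"
first = Reproducer("<long program A>", reduced, 101, "complete", "MISMATCH",
                   "compiled", 156, 12, "2024-01-01T00:00:00Z")
second = Reproducer("<long program B>", reduced, 202, "complete", "MISMATCH",
                    "compiled", 240, 12, "2024-01-02T00:00:00Z")

corpus: Dict[str, Reproducer] = {}
print(add_to_corpus(corpus, first), add_to_corpus(corpus, second), len(corpus))
```

Two different seeds that shrink to the same minimal program with the same blame and verdict collapse into one corpus entry, so the same bug is not reported repeatedly.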

The corpus can be replayed against future compiler versions:

vary vast --replay-corpus .vast-corpus/

This checks whether known bugs are still present or have been fixed.

CLI flags

| Flag | Default | Description |
| --- | --- | --- |
| --reduce | off | Enable automatic shrinking when a mismatch is found |
| --max-shrink | 1000 | Maximum number of oracle calls per reduction |
| --max-reduce-time | 30s | Maximum wall-clock time for reduction |
| --corpus-dir | none | Directory to save/load reproducer corpus |
| --replay-corpus | off | Replay saved reproducers before exploring new programs |

Why reduction matters

Without reduction, every mismatch requires manual investigation of a large generated program. With reduction, the developer sees a minimal reproducer that isolates the bug. The difference is between "somewhere in these 300 lines" and "this 5-line program triggers the bug."

In CI, reduced programs are saved as regression artifacts for tracking and verification.
