VAST

How it works

A generator builds random programs. Each program runs through three independent engines. A comparator checks whether they all got the same answer. If one disagrees, something is broken, and the system can usually tell you which engine is at fault.

Architecture

The VAST program is built around seven components in a pipeline:

Generator --> Validator --> AST Executor -+
                       |                  |
                       +-->  IR Executor -+--> Comparator --> Reporter
                       |                  |
                       +--> JVM Executor -+

Three independent execution paths run every generated program. When two paths agree and one differs, the comparator can localize the fault to a specific compiler stage (blame localization).

Program generation

The generator builds real AST nodes (the same FunctionDef, VariableDecl, BinaryExpr types the parser produces) using a seeded java.util.Random. The same seed always produces the same program, so a failing program can be regenerated from its seed.

Generation is type-directed. When the generator needs an Int expression, it picks from: integer literals, variables of type Int, binary operators (+, -, *, //, %), unary negation, if-expressions, or function calls returning Int. When it needs a Bool, it picks from: boolean literals, variables of type Bool, comparisons, logical operators (and, or), or not.

A budget parameter controls expression depth. At budget 1, only terminals (literals and variables) appear. Higher budgets allow nested expressions.
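A minimal sketch of seeded, type-directed generation with a depth budget. The Expr node types, the ExprGenerator class, and the literal ranges are hypothetical stand-ins, not the compiler's real AST or the real generator. Variables, if-expressions, and function calls are left out to keep the example short.

```kotlin
import java.util.Random

// Hypothetical, simplified AST; the real generator builds the compiler's own node types.
sealed class Expr
data class IntLit(val value: Long) : Expr()
data class BoolLit(val value: Boolean) : Expr()
data class UnaryExpr(val op: String, val operand: Expr) : Expr()
data class BinaryExpr(val op: String, val left: Expr, val right: Expr) : Expr()

enum class Type { INT, BOOL }

class ExprGenerator(seed: Long) {
    // All randomness flows through one seeded source, so a seed fixes the program.
    private val random = Random(seed)
    private val intOps = listOf("+", "-", "*", "//", "%")

    // Type-directed: the requested type decides which productions are legal.
    fun generate(type: Type, budget: Int): Expr {
        // At budget 1 only terminals are allowed.
        if (budget <= 1) return terminal(type)
        return when (type) {
            Type.INT -> when (random.nextInt(3)) {
                0 -> terminal(Type.INT)
                1 -> UnaryExpr("-", generate(Type.INT, budget - 1))
                else -> BinaryExpr(
                    intOps[random.nextInt(intOps.size)],
                    generate(Type.INT, budget - 1),
                    generate(Type.INT, budget - 1)
                )
            }
            Type.BOOL -> when (random.nextInt(4)) {
                0 -> terminal(Type.BOOL)
                1 -> UnaryExpr("not", generate(Type.BOOL, budget - 1))
                // A comparison is a Bool expression built from Int operands.
                2 -> BinaryExpr("<", generate(Type.INT, budget - 1), generate(Type.INT, budget - 1))
                else -> BinaryExpr(
                    if (random.nextBoolean()) "and" else "or",
                    generate(Type.BOOL, budget - 1),
                    generate(Type.BOOL, budget - 1)
                )
            }
        }
    }

    private fun terminal(type: Type): Expr = when (type) {
        Type.INT -> IntLit(random.nextInt(100).toLong())
        Type.BOOL -> BoolLit(random.nextBoolean())
    }
}
```

Two generators constructed with the same seed and asked for the same type and budget return structurally equal trees, which is the property that makes a reported seed reproducible.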

Profiles

Profiles control what the generator is allowed to produce:

| Profile | Constructs | Use |
| --- | --- | --- |
| core | Literals, variables, arithmetic, comparisons, if/else, return | Straight-line programs |
| control | Everything in core plus helper functions and bounded while loops | Control flow and call stacks |
| text to generics | Incremental feature expansion: strings, enums, data types, collections, nullable, match, exceptions, generics | Type system coverage |
| types | Enums, data types, lists, nullable | Common type combinations |
| complete | All features including match, exceptions, generics | Full language coverage |

Each profile caps AST node count, nesting depth, function count, parameter count, and while loop iterations.

Bounded while loops

While loops are always bounded. The generator emits:

mut __i_0 = 0
while __i_0 < 20 {
    # body
    __i_0 = __i_0 + 1
}

The iteration count is random, up to the profile's maxWhileIterations limit. This prevents infinite loops from eating test time.
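The loop shape above can be sketched as a small helper. This version renders text for readability, whereas the real generator builds AST nodes. The function name emitBoundedWhile, its parameters, and the choice of a bound between 1 and the limit are assumptions made for illustration.

```kotlin
import java.util.Random

// Hypothetical helper: renders a bounded loop in the surface syntax shown above.
fun emitBoundedWhile(
    random: Random,
    loopIndex: Int,
    maxWhileIterations: Int,
    body: List<String>
): String {
    require(maxWhileIterations >= 1) { "profile limit must be positive" }
    // Assumption: the bound is drawn uniformly from 1..maxWhileIterations.
    val bound = 1 + random.nextInt(maxWhileIterations)
    // A per-loop counter name keeps nested loops from sharing a counter.
    val counter = "__i_$loopIndex"
    return buildString {
        appendLine("mut $counter = 0")
        appendLine("while $counter < $bound {")
        body.forEach { appendLine("    $it") }
        // The increment is always the last statement, so the loop must terminate.
        appendLine("    $counter = $counter + 1")
        append("}")
    }
}
```

Because the counter is generated rather than chosen by the random body, the body cannot reset it as long as the generator never emits assignments to names starting with __i_ (an assumption of this sketch).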

Program validation

Before execution, every generated program passes through a validator that checks:

| Rule | What it verifies |
| --- | --- |
| Entry point | __vast_compute() exists exactly once with return type Int |
| Scoping | All identifiers are declared before use |
| Mutability | Assignment targets are mutable |
| Profile conformance | All expression and statement forms are within the active profile |
| Depth | Nesting depth is within limits |

Programs that fail validation are counted as invalid (a generator bug) and excluded from the pass/fail count.
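As an illustration of the scoping and mutability rules only, here is a sketch over a deliberately tiny statement model. The Decl and Assign types and the validate function are hypothetical. The real validator works on the compiler's AST and also checks the entry point, profile conformance, and depth.

```kotlin
// Hypothetical statement forms, just enough to show two of the rules.
sealed class Stmt
data class Decl(val name: String, val mutable: Boolean, val uses: List<String>) : Stmt()
data class Assign(val target: String, val uses: List<String>) : Stmt()

// Returns a list of rule violations; an empty list means the program passed.
fun validate(stmts: List<Stmt>): List<String> {
    val errors = mutableListOf<String>()
    val declared = mutableMapOf<String, Boolean>() // name -> is mutable

    for (stmt in stmts) {
        val uses = when (stmt) {
            is Decl -> stmt.uses
            is Assign -> stmt.uses
        }
        // Scoping: every identifier read by this statement must already be declared.
        for (name in uses) {
            if (name !in declared) errors.add("use before declaration: $name")
        }
        when (stmt) {
            is Decl -> declared[stmt.name] = stmt.mutable
            is Assign -> {
                // Mutability: the target must exist and must be mutable.
                val mutable = declared[stmt.target]
                if (mutable == null) {
                    errors.add("assignment to undeclared: ${stmt.target}")
                } else if (!mutable) {
                    errors.add("assignment to immutable: ${stmt.target}")
                }
            }
        }
    }
    return errors
}
```

A non-empty result would correspond to the "invalid" bucket described above: the program is dropped and the failure is attributed to the generator rather than to the compiler.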

AST interpreter

The AST interpreter walks the syntax tree directly. It uses sealed value types:

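sealed class VastValue {
    // Each variant extends VastValue so it belongs to the sealed hierarchy.
    data class VInt(val v: Long) : VastValue()
    data class VBool(val v: Boolean) : VastValue()
}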

No Any types, no casting. Every operation dispatches on the sealed type. Division by zero returns RuntimeError(DIVISION_BY_ZERO). A while loop that runs past the iteration cap returns RuntimeError(INFINITE_LOOP). Exceeding the maximum call depth returns RuntimeError(STACK_OVERFLOW).

The interpreter is simple on purpose. Its job is to be obviously correct, not fast.
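A minimal sketch of this style of interpreter, restricted to integer literals and four operators. The Outcome and Expr types and the eval function are hypothetical. Treating // as truncating division is also an assumption, since the language's division semantics are not stated here.

```kotlin
sealed class VastValue {
    data class VInt(val v: Long) : VastValue()
    data class VBool(val v: Boolean) : VastValue()
}

enum class VastErrorCategory { DIVISION_BY_ZERO, STACK_OVERFLOW, INFINITE_LOOP }

// Hypothetical outcome and expression types for this sketch.
sealed class Outcome {
    data class Success(val value: VastValue) : Outcome()
    data class RuntimeError(val category: VastErrorCategory) : Outcome()
}

sealed class Expr {
    data class IntLit(val value: Long) : Expr()
    data class Binary(val op: String, val left: Expr, val right: Expr) : Expr()
}

fun eval(expr: Expr): Outcome = when (expr) {
    is Expr.IntLit -> Outcome.Success(VastValue.VInt(expr.value))
    is Expr.Binary -> evalBinary(expr)
}

fun evalBinary(expr: Expr.Binary): Outcome {
    val left = eval(expr.left)
    if (left !is Outcome.Success) return left    // propagate errors unchanged
    val right = eval(expr.right)
    if (right !is Outcome.Success) return right
    val a = left.value
    val b = right.value
    // Dispatch on the sealed type; ill-typed operands would be a generator or validator bug.
    if (a !is VastValue.VInt || b !is VastValue.VInt) error("ill-typed operands for ${expr.op}")
    return when (expr.op) {
        "+" -> Outcome.Success(VastValue.VInt(a.v + b.v))
        "-" -> Outcome.Success(VastValue.VInt(a.v - b.v))
        "*" -> Outcome.Success(VastValue.VInt(a.v * b.v))
        // Errors are returned as values, not thrown, so the comparator can inspect them.
        "//" -> if (b.v == 0L) {
            Outcome.RuntimeError(VastErrorCategory.DIVISION_BY_ZERO)
        } else {
            Outcome.Success(VastValue.VInt(a.v / b.v)) // assumption: truncating division
        }
        else -> error("unsupported operator ${expr.op}")
    }
}
```

Returning errors as Outcome values rather than throwing keeps every result comparable with the other paths using plain equality.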

IR interpreter

The IR interpreter (added in Phase 2) provides a third execution path. The AST is lowered to a flat intermediate representation (VastIr nodes) and interpreted. This creates a middle layer between the high-level AST interpreter and the JVM bytecode path.

The three-path architecture enables blame localization:

| AST | IR | JVM | Likely fault |
| --- | --- | --- | --- |
| agree | agree | differs | Codegen or bytecode emission |
| agree | differs | differs | IR lowering |
| differs | agree | agree | AST interpreter bug |
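The blame table can be encoded directly as a lookup. The Mark, Pattern, and localize names are hypothetical, and how each path gets labeled agree or differs is outside this sketch. Patterns not listed in the table return null instead of a guessed fault.

```kotlin
enum class Mark { AGREE, DIFFERS }

// One label per execution path, in the same column order as the table.
data class Pattern(val ast: Mark, val ir: Mark, val jvm: Mark)

// The three rows of the blame table, copied as data.
val likelyFault: Map<Pattern, String> = mapOf(
    Pattern(Mark.AGREE, Mark.AGREE, Mark.DIFFERS) to "Codegen or bytecode emission",
    Pattern(Mark.AGREE, Mark.DIFFERS, Mark.DIFFERS) to "IR lowering",
    Pattern(Mark.DIFFERS, Mark.AGREE, Mark.AGREE) to "AST interpreter bug"
)

// Returns the likely fault, or null when the pattern is not in the table.
fun localize(pattern: Pattern): String? = likelyFault[pattern]
```

Keeping the table as data means a new row (for example, for a future fourth path) is a one-line change and does not touch the lookup logic.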

JVM executor

The JVM executor takes the same AST through the real compiler pipeline:

| Step | Stage | Action |
| --- | --- | --- |
| 1 | ConstantFolder | Optimizes constant expressions |
| 2 | DeadCodeEliminator | Removes unreachable code |
| 3 | TypeChecker | Validates types and produces type info |
| 4 | BytecodeGenerator | Emits JVM bytecode |
| 5 | ClassLoader | Loads the generated class |
| 6 | Reflection | Invokes __vast_compute() and captures the result |

No formatter or parser is involved. The AST goes straight into the compiler backend, so this tests the real compilation pipeline, not a serialized round-trip.

Each execution has a timeout. If the JVM path times out and the AST path detected an infinite loop, VAST treats that as agreement (same root cause, different detection mechanism).
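One way a per-execution timeout could be enforced on the JVM is with a single-thread executor and Future.get. This is a sketch under assumptions: the JvmOutcome type and runWithTimeout function are hypothetical, and the text does not say which mechanism VAST actually uses.

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.ExecutionException
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit
import java.util.concurrent.TimeoutException

// Hypothetical result type for one JVM-path execution.
sealed class JvmOutcome {
    data class Success(val value: Long) : JvmOutcome()
    data class Failed(val cause: Throwable) : JvmOutcome()
    data class Timeout(val millis: Long) : JvmOutcome()
}

fun runWithTimeout(timeoutMillis: Long, task: Callable<Long>): JvmOutcome {
    // Daemon thread: a task that ignores interruption cannot keep the JVM alive.
    val executor = Executors.newSingleThreadExecutor { r ->
        Thread(r).apply { isDaemon = true }
    }
    try {
        val future = executor.submit(task)
        return try {
            JvmOutcome.Success(future.get(timeoutMillis, TimeUnit.MILLISECONDS))
        } catch (e: TimeoutException) {
            future.cancel(true) // best effort; a tight loop may not observe the interrupt
            JvmOutcome.Timeout(timeoutMillis)
        } catch (e: ExecutionException) {
            // The task threw; unwrap to the original exception when available.
            JvmOutcome.Failed(e.cause ?: e)
        }
    } finally {
        executor.shutdownNow()
    }
}
```

In this sketch the task would wrap the reflective call to __vast_compute(). Interruption is cooperative on the JVM, so generated code stuck in a tight loop keeps running on its daemon thread after the timeout is reported.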

Outcome comparison

The comparator works on pairs of paths and classifies each pair based on what the two paths returned:

| Path A | Path B | Verdict |
| --- | --- | --- |
| Success(42) | Success(42) | AGREE_SUCCESS |
| RuntimeError(DIVISION_BY_ZERO) | RuntimeError(DIVISION_BY_ZERO) | AGREE_RUNTIME_ERROR |
| Success(7) | Success(9) | MISMATCH_VALUE |
| Success(42) | RuntimeError(DIVISION_BY_ZERO) | MISMATCH_OUTCOME_KIND |
| RuntimeError(DIVISION_BY_ZERO) | RuntimeError(STACK_OVERFLOW) | MISMATCH_ERROR_CATEGORY |
| Success(42) | CompileError(...) | PATH_FAILURE |
| RuntimeError(INFINITE_LOOP) | Timeout(...) | AGREE_RUNTIME_ERROR |

Agreements mean the compiler got it right for that program. Mismatches are bug candidates. Path failures point to infrastructure issues (the compiler rejected a program the interpreter accepted).
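The verdict table can be sketched as a pairwise classifier. The Outcome shape and the compare and normalize functions are hypothetical; the verdict names come from the table. Two choices here are assumptions the text does not confirm: a CompileError on either side yields PATH_FAILURE, and a Timeout is folded into INFINITE_LOOP before any comparison.

```kotlin
enum class VastErrorCategory { DIVISION_BY_ZERO, STACK_OVERFLOW, INFINITE_LOOP }

// Hypothetical outcome type covering the cases that appear in the table.
sealed class Outcome {
    data class Success(val value: Long) : Outcome()
    data class RuntimeError(val category: VastErrorCategory) : Outcome()
    data class CompileError(val message: String) : Outcome()
    data class Timeout(val millis: Long) : Outcome()
}

enum class Verdict {
    AGREE_SUCCESS, AGREE_RUNTIME_ERROR, MISMATCH_VALUE,
    MISMATCH_OUTCOME_KIND, MISMATCH_ERROR_CATEGORY, PATH_FAILURE
}

// Assumption: a timeout is treated as an infinite loop before comparison.
fun normalize(outcome: Outcome): Outcome =
    if (outcome is Outcome.Timeout) Outcome.RuntimeError(VastErrorCategory.INFINITE_LOOP) else outcome

fun compare(a: Outcome, b: Outcome): Verdict {
    val x = normalize(a)
    val y = normalize(b)
    return when {
        // Assumption: a compile error on either side is a path failure, not a mismatch.
        x is Outcome.CompileError || y is Outcome.CompileError -> Verdict.PATH_FAILURE
        x is Outcome.Success && y is Outcome.Success ->
            if (x.value == y.value) Verdict.AGREE_SUCCESS else Verdict.MISMATCH_VALUE
        x is Outcome.RuntimeError && y is Outcome.RuntimeError ->
            if (x.category == y.category) Verdict.AGREE_RUNTIME_ERROR
            else Verdict.MISMATCH_ERROR_CATEGORY
        // One side succeeded and the other raised a runtime error.
        else -> Verdict.MISMATCH_OUTCOME_KIND
    }
}
```

Checking for CompileError first keeps infrastructure problems out of the mismatch counts, so only genuine disagreements are reported as bug candidates.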

Error normalization

Both executors map their exceptions to the same VastErrorCategory enum:

| Category | AST interpreter | JVM executor |
| --- | --- | --- |
| DIVISION_BY_ZERO | Caught at divide/modulo | ArithmeticException |
| STACK_OVERFLOW | Call depth exceeded | StackOverflowError |
| INFINITE_LOOP | Iteration cap exceeded | Execution timeout |

This normalization keeps the comparison fair. The AST interpreter catches infinite loops by counting iterations; the JVM executor catches them by timeout. Both map to the same category.
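The JVM column of the table could be implemented along these lines. The categorize function is hypothetical, and using TimeoutException to stand for an execution timeout is an assumption. The InvocationTargetException case reflects standard JVM behavior: Method.invoke wraps anything thrown by the invoked method in that exception.

```kotlin
import java.lang.reflect.InvocationTargetException
import java.util.concurrent.TimeoutException

enum class VastErrorCategory { DIVISION_BY_ZERO, STACK_OVERFLOW, INFINITE_LOOP }

// Maps a JVM-side failure to the shared category, or null if it is not a known runtime error.
fun categorize(t: Throwable): VastErrorCategory? = when (t) {
    // Reflection wraps the real failure; classify the wrapped exception instead.
    is InvocationTargetException -> t.targetException?.let { categorize(it) }
    is ArithmeticException -> VastErrorCategory.DIVISION_BY_ZERO
    is StackOverflowError -> VastErrorCategory.STACK_OVERFLOW
    // Assumption: the executor signals its timeout with TimeoutException.
    is TimeoutException -> VastErrorCategory.INFINITE_LOOP
    else -> null
}
```

Returning null for unrecognized failures leaves the caller free to report them as path failures and avoids forcing them into a runtime-error category.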
