The VAST Program

Vary Automated Semantic Testing. Internal compiler verification.

Alpha. Vary is under active development and not ready for production use. Syntax, APIs, performance, and behaviour may change between releases.

VAST generates random valid Vary programs, runs them through multiple independent execution paths (including an AST interpreter and the real JVM compiler backend), and fails whenever the paths disagree. If the compiler changes program semantics, VAST catches it.
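The core loop is easy to see in miniature. The sketch below is illustrative only, not VAST's actual code: it generates a random arithmetic expression, evaluates it along two independent paths (a tree-walking interpreter, and a "compile to source, then execute" path), and fails if they disagree.

```python
import random

def gen_expr(rng, depth=0):
    """Generate a random arithmetic expression as a nested tuple."""
    if depth >= 3 or rng.random() < 0.3:
        return rng.randint(0, 9)
    op = rng.choice(["+", "-", "*"])
    return (op, gen_expr(rng, depth + 1), gen_expr(rng, depth + 1))

def interpret(e):
    """Path 1: direct tree-walking interpreter."""
    if isinstance(e, int):
        return e
    op, l, r = e
    a, b = interpret(l), interpret(r)
    return a + b if op == "+" else a - b if op == "-" else a * b

def compile_expr(e):
    """Path 2: 'compile' the tree to source text, to be executed separately."""
    if isinstance(e, int):
        return str(e)
    op, l, r = e
    return f"({compile_expr(l)} {op} {compile_expr(r)})"

def differential_test(seed):
    """Run one generated program down both paths; disagreement means a bug."""
    rng = random.Random(seed)
    expr = gen_expr(rng)
    via_interp = interpret(expr)
    via_compile = eval(compile_expr(expr))
    assert via_interp == via_compile, f"seed {seed}: paths disagree on {expr}"
    return via_interp
```

The key property is that neither path is trusted: a disagreement only tells you that at least one of them is wrong, which is exactly the signal a compiler team needs to investigate.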

Every run is deterministic. A seed produces the same program, the same execution, and the same result. If seed 842193 finds a bug, vary vast --seed 842193 --count 1 reproduces it exactly.
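The determinism guarantee rests on one discipline: every random decision is drawn from a single RNG seeded up front. A minimal sketch of that pattern (hypothetical names, not VAST's API):

```python
import random

def generate_program(seed):
    """All random choices flow through one seeded RNG, so the same seed
    always yields byte-identical output."""
    rng = random.Random(seed)
    lines = []
    for i in range(rng.randint(1, 5)):
        a, b = rng.randint(0, 99), rng.randint(0, 99)
        op = rng.choice(["+", "-", "*"])
        lines.append(f"val x{i} = {a} {op} {b}")
    return "\n".join(lines)

# Reproducibility: rerunning with the same seed gives the same program.
assert generate_program(842193) == generate_program(842193)
```

Because no choice escapes the seeded RNG, replaying a seed replays the whole run, which is what makes a one-line reproduction command possible.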

We call it a program because we maintain and expand it as the language grows. When Vary adds a new feature, VAST gets a new generator for it. vary vast ships with the compiler, so differential testing is part of the normal development workflow.

Documentation

Overview: What VAST is and why it exists, explained without jargon
Introduction: What the VAST program is and why it matters for compiler trust
How it works: The VAST pipeline of generation, three-path execution, and comparison
Why it matters: What changes when a compiler project runs its own verification program
CLI reference: Running the VAST program, with flags, profiles, modes, and usage examples
Differential testing: How VAST uses multiple independent execution paths to find compiler bugs through disagreement
Metamorphic testing: How VAST verifies compiler correctness by applying semantics-preserving transforms and checking that results stay the same
Mutation testing: How VAST uses mutation expansion to validate its own detection infrastructure
Reduction: How VAST automatically shrinks failing programs to minimal reproducers
Value pools: How VAST uses value pools to generate programs with richer data flow and deeper semantic coverage
CI integration: Running VAST in CI, covering fast, deep, and continuous modes, dashboards, coverage, and seed rotation
VAST vs mutate: How vary vast and vary mutate differ in purpose, technique, and what they find
Comparison with other systems: How VAST compares to Csmith, Alive2, CompCert, QuickCheck, LangFuzz, and other compiler testing tools
Coverage and confidence: Semantic coverage, feature interactions, confidence scoring, and stress testing
Testing playbook: When to run what, how to interpret results, and how to act on failures
Future phases: What we are considering for future VAST work
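The metamorphic testing entry above is easy to see in miniature: apply a transform that must not change a program's meaning, then check that the observed result is unchanged. A toy sketch (illustrative only, not VAST's code), using operand reordering of a commutative addition as the semantics-preserving transform:

```python
def evaluate(expr):
    """Evaluate a nested ('+', left, right) / int expression tree."""
    if isinstance(expr, int):
        return expr
    op, l, r = expr
    assert op == "+"
    return evaluate(l) + evaluate(r)

def swap_operands(expr):
    """Semantics-preserving transform: a + b  ->  b + a, applied recursively."""
    if isinstance(expr, int):
        return expr
    op, l, r = expr
    return (op, swap_operands(r), swap_operands(l))

def metamorphic_check(expr):
    # The transform changes the program's shape but must not change its value;
    # a difference here would point at a bug in the evaluator.
    assert evaluate(expr) == evaluate(swap_operands(expr))

metamorphic_check(("+", 1, ("+", 2, 3)))
```

The strength of this style is that it needs no external oracle: the original program is its own reference, so it composes naturally with randomly generated inputs.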