The VAST Program
Vary Automated Semantic Testing. Internal compiler verification.
Alpha. Vary is under active development and not ready for production use. Syntax, APIs, performance, and behaviour may change between releases.
VAST generates random valid Vary programs, runs them through two independent execution paths (an AST interpreter and the real JVM compiler backend), and fails when they disagree. If the compiler changes program semantics, VAST catches it.
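The idea can be sketched on a toy scale. The following is a minimal illustration, not VAST's implementation: the expression language, the stack machine standing in for the JVM backend, and every function name here are assumptions made up for the example, as is the choice to derive per-program seeds by adding an offset to the base seed.

```python
import random

# Hypothetical toy language: integer expressions as nested tuples.
#   ("num", n) | ("add", l, r) | ("sub", l, r) | ("mul", l, r)
OPS = ("add", "sub", "mul")


def generate(rng, depth=3):
    """Build a random expression tree that is valid by construction."""
    if depth == 0 or rng.random() < 0.3:
        return ("num", rng.randint(-9, 9))
    op = rng.choice(OPS)
    return (op, generate(rng, depth - 1), generate(rng, depth - 1))


def apply_op(op, left, right):
    """Shared arithmetic semantics for the three operators."""
    if op == "add":
        return left + right
    if op == "sub":
        return left - right
    return left * right


def interpret(node):
    """Path 1: evaluate the tree directly (stand-in for an AST interpreter)."""
    if node[0] == "num":
        return node[1]
    return apply_op(node[0], interpret(node[1]), interpret(node[2]))


def compile_to_stack(node, code=None):
    """Path 2, step 1: lower the tree to postfix stack instructions."""
    if code is None:
        code = []
    if node[0] == "num":
        code.append(("push", node[1]))
    else:
        compile_to_stack(node[1], code)
        compile_to_stack(node[2], code)
        code.append((node[0],))
    return code


def run_stack(code):
    """Path 2, step 2: execute the instructions on a tiny stack machine."""
    stack = []
    for instr in code:
        if instr[0] == "push":
            stack.append(instr[1])
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(apply_op(instr[0], left, right))
    return stack.pop()


def differential_run(seed, count):
    """Return the seeds for which the two execution paths disagree."""
    failures = []
    for offset in range(count):
        program_seed = seed + offset  # assumed seed scheme, for illustration
        program = generate(random.Random(program_seed))
        if interpret(program) != run_stack(compile_to_stack(program)):
            failures.append(program_seed)
    return failures
```

Because both toy paths are correct, `differential_run` returns an empty list here. Introducing a bug into `compile_to_stack`, such as emitting the operands of `sub` in the wrong order, would make some seeds show up as failures, which is the signal a differential tester looks for.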
Every run is deterministic: a given seed always produces the same program, the same execution, and the same result. If seed 842193 finds a bug, `vary vast --seed 842193 --count 1` reproduces it exactly.
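Seed-based replay generally depends on the generator drawing all of its randomness from a single seeded source. The sketch below illustrates that property under stated assumptions: `generate_source` is a hypothetical helper, and the text it emits is placeholder assignment syntax, not real Vary code.

```python
import random


def generate_source(seed, statements=4):
    """Produce program text whose content depends only on the seed."""
    # A private Random instance keeps generation independent of any
    # global random state touched elsewhere in the process.
    rng = random.Random(seed)
    lines = []
    for index in range(statements):
        left = rng.randint(0, 99)
        right = rng.randint(0, 99)
        op = rng.choice(["+", "-", "*"])
        lines.append(f"x{index} = {left} {op} {right}")
    return "\n".join(lines)


# Generating twice from the same seed yields identical program text,
# so a failure found once can be regenerated and inspected later.
first = generate_source(842193)
replay = generate_source(842193)
assert first == replay
```

The design point is that nothing outside the seed, such as wall-clock time, hash ordering, or shared global state, feeds into generation. Any such input would make a reported seed unreliable as a reproducer.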
We call it a program because we maintain and expand it as the language grows. When Vary adds a new feature, VAST gets a new generator for it. `vary vast` ships with the compiler, so differential testing is part of the normal development workflow.
| Page | Description |
| --- | --- |
| Overview | What VAST is and why it exists, explained without jargon |
| Introduction | What the VAST program is and why it matters for compiler trust |
| How it works | The VAST pipeline: generation, three-path execution, and comparison |
| Why it matters | What changes when a compiler project runs its own verification program |
| CLI reference | Running the VAST program: flags, profiles, modes, and usage examples |
| Differential testing | How VAST uses multiple independent execution paths to find compiler bugs through disagreement |
| Metamorphic testing | How VAST verifies compiler correctness by applying semantics-preserving transforms and checking that results stay the same |
| Mutation testing | How VAST uses mutation expansion to validate its own detection infrastructure |
| Reduction | How VAST automatically shrinks failing programs to minimal reproducers |
| Value pools | How VAST uses value pools to generate programs with richer data flow and deeper semantic coverage |
| CI integration | Running VAST in CI: fast, deep, and continuous modes, dashboards, coverage, and seed rotation |
| VAST vs mutate | How `vary vast` and `vary mutate` differ in purpose, technique, and what they find |
| Comparison with other systems | How VAST compares to Csmith, Alive2, CompCert, QuickCheck, LangFuzz, and other compiler testing tools |
| Coverage and confidence | Semantic coverage, feature interactions, confidence scoring, and stress testing |
| Testing playbook | When to run what, how to interpret results, and how to act on failures |
| Future phases | What we are considering for future VAST work |