Value Pools Make Random Programs Better

If you ask a random program generator to invent code from scratch every time, you get a lot of variety. You also get a lot of nonsense.

Not invalid nonsense. VAST only generates valid, well-typed Vary programs. But without some memory, the generated programs tend to look like this:

def __vast_compute() -> Int {
    let a = 41
    let b = 7
    let c = 13
    return a + 1
}

That is random code, but it is shallow. The values do not interact: b and c are declared but never read, so they cannot affect the result. The program has syntax, but almost no data flow.

Real programs are not like that. Real programs reuse earlier values. They compare one result to another. They accumulate state. They pass a value through several steps and then make a decision based on it.

That is what value pools are for.

The basic idea

The easiest way to understand a value pool is to imagine a person writing code with a scratchpad beside them.

Every time they create a useful value, they jot it down:

x = 10
y = 3
flag = x > y

Later, when they need another expression, they do not have to start from zero. They can reuse something from the scratchpad:

z = x + y
ok = flag and x != 0

A value pool is that scratchpad, but for the random program generator inside VAST.

Instead of generating a fresh literal every time it needs an Int or a Bool, the generator has three options:

| Option | What it means |
| --- | --- |
| Fresh generation | Make a new literal or expression from scratch |
| Reuse | Pick a value that already exists and use it again |
| Composition | Combine two earlier values into a new expression |

That one change makes generated programs look much more like software a human might actually write.
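To make the three options concrete, here is a minimal Python sketch of a generator loop that picks between them. It is not VAST's implementation: the names next_int_expr and generate_body, the v0, v1, ... variable naming, the literal range, and the equal weighting between options are all assumptions made for illustration.

```python
import random


def next_int_expr(pool, rng):
    """Return Vary-like source text for an Int expression.

    `pool` is a list of names of Int variables already in scope.
    Equal weighting between the options is an assumption of this
    sketch, not a documented VAST policy.
    """
    options = ["fresh"]
    if len(pool) >= 1:
        options.append("reuse")
    if len(pool) >= 2:
        options.append("compose")
    choice = rng.choice(options)

    if choice == "fresh":
        # Fresh generation: invent a new literal.
        return str(rng.randint(0, 99))
    if choice == "reuse":
        # Reuse: pick a value that already exists.
        return rng.choice(pool)
    # Composition: combine two distinct earlier values.
    left, right = rng.sample(pool, 2)
    op = rng.choice(["+", "-", "*"])
    return f"{left} {op} {right}"


def generate_body(n_lets, rng):
    """Emit `n_lets` let-bindings followed by a return statement."""
    pool = []
    lines = []
    for i in range(n_lets):
        name = f"v{i}"
        lines.append(f"    let {name} = {next_int_expr(pool, rng)}")
        pool.append(name)  # jot the new value on the scratchpad
    lines.append(f"    return {next_int_expr(pool, rng)}")
    return lines


print("\n".join(generate_body(4, random.Random(0))))
```

The first binding is always a literal because the pool starts empty. Every later binding has a growing chance of reading something that came before it.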

What problem they solve

Early random generation has a common weakness: every expression is independent.

def __vast_compute() -> Bool {
    let x = 5
    let y = 99
    return True
}

This can still catch parser, type checker, and codegen bugs. But it misses a whole class of semantic problems because there is very little interaction between values.

Compilers break more often on relationships than on isolated literals:

| Weak generator pattern | Stronger generator pattern |
| --- | --- |
| Fresh constant every time | Reuse values across statements |
| Flat expressions | Dependent expressions |
| No accumulation | Loop-carried state |
| Incidental edge cases | Repeated pressure on the same values |

Value pools push the generator toward the right column.

What a value pool actually is

A value pool is a collection of expressions that are available for reuse at a given point in generation.

If VAST generates this:

let x = 10
let y = 3

then both x and y can be added to the pool as reusable Int values.

Later, if the generator needs another Int, it has choices:

Need an Int expression
    |
    +-- generate 42
    +-- reuse x
    +-- reuse y
    +-- compose x + y
    +-- compose x - y

The result is variety that actually matters.
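The choice tree above can be reproduced with a small typed pool. This is a sketch under assumptions: the ValuePool class, its add and names methods, and the int_candidates helper are hypothetical names, and a real pool would likely track scopes and richer expressions than bare variable names.

```python
from collections import defaultdict


class ValuePool:
    """Names of in-scope variables, grouped by type (hypothetical API)."""

    def __init__(self):
        self._by_type = defaultdict(list)

    def add(self, name, type_name):
        self._by_type[type_name].append(name)

    def names(self, type_name):
        # .get avoids creating an empty entry for unknown types.
        return list(self._by_type.get(type_name, []))


def int_candidates(pool, fresh_literal):
    """List every way this sketch can satisfy 'need an Int expression'."""
    names = pool.names("Int")
    candidates = [str(fresh_literal)]      # generate a literal
    candidates += names                    # reuse an existing value
    for i, left in enumerate(names):       # compose each pair once
        for right in names[i + 1:]:
            candidates.append(f"{left} + {right}")
            candidates.append(f"{left} - {right}")
    return candidates


pool = ValuePool()
pool.add("x", "Int")
pool.add("y", "Int")
pool.add("flag", "Bool")  # a Bool never shows up as an Int candidate

print(int_candidates(pool, 42))
# ['42', 'x', 'y', 'x + y', 'x - y']
```

Keying the pool by type is what keeps reuse well-typed: a request for an Int can only ever see Int values.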

A small example

Without value pools, VAST might produce:

def __vast_compute() -> Int {
    let a = 4
    let b = 9
    let c = 2
    return a * 3
}

With value pools, it can produce:

def __vast_compute() -> Int {
    let x = 4
    let y = 9
    let delta = y - x
    let doubled = delta + delta
    return doubled
}

The second program is still small, but it has real dependencies:

| Value | Depends on |
| --- | --- |
| delta | y, x |
| doubled | delta |
| return value | doubled |

That dependency chain matters. A compiler bug in variable loading, temporary storage, arithmetic lowering, optimization, or register allocation now has more chances to surface.
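The difference between the two programs can be checked mechanically. The Python sketch below models each program as an ordered list of (name, expression) pairs and works out which bindings actually reach the return value. The helpers direct_deps and live_names are hypothetical, and scanning expression text with a regular expression is a simplification that only suits toy expressions like these.

```python
import re

IDENT = re.compile(r"\b[A-Za-z_]\w*")


def direct_deps(bindings):
    """Map each bound name to the earlier names its expression reads."""
    seen = []
    deps = {}
    for name, expr in bindings:
        used = IDENT.findall(expr)
        deps[name] = [n for n in seen if n in used]
        seen.append(name)
    return deps


def live_names(bindings, return_expr):
    """Names that influence the return value, directly or transitively."""
    deps = direct_deps(bindings)
    work = [n for n in IDENT.findall(return_expr) if n in deps]
    live = set()
    while work:
        name = work.pop()
        if name not in live:
            live.add(name)
            work.extend(deps[name])
    return live


# The two programs from this section, as (name, expression) pairs.
flat = [("a", "4"), ("b", "9"), ("c", "2")]
pooled = [("x", "4"), ("y", "9"),
          ("delta", "y - x"), ("doubled", "delta + delta")]

print(direct_deps(pooled))
print(sorted(live_names(flat, "a * 3")))      # ['a']
print(sorted(live_names(pooled, "doubled")))  # ['delta', 'doubled', 'x', 'y']
```

In the flat program only one of three bindings matters to the result. In the pooled program all four do.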

Why this helps find bugs

Value pools make VAST better at finding exactly the sort of bug that random literals often miss.

For example, suppose the generator creates:

def __vast_compute() -> Int {
    let x = 0
    let y = x + 1
    if y > x {
        return y
    }
    return x
}

This program pressures several things at once:

| Feature under pressure | Why the pool helps |
| --- | --- |
| Variable reuse | The same value appears in multiple places |
| Branch semantics | The branch depends on a derived value |
| Arithmetic lowering | y is computed from x rather than from a fresh literal |
| Optimizer correctness | y > x may look simplifiable |

Without a pool, x and y are less likely to be connected in a useful way. With a pool, the generator deliberately creates those connections.

Pools create edge cases on purpose

One of the main benefits of value pools is that they make important edge cases appear regularly instead of accidentally.

If the pool already contains 0, 1, False, or an empty string, later expressions can reuse them in ways that stress the compiler:

| Pattern | Example |
| --- | --- |
| Zero interaction | x * 0, x + 0, x / y where y came from earlier state |
| Identity behaviour | x * 1, x - 0 |
| Repeated aliasing | Using the same variable in several expressions |
| Loop accumulation | total = total + x |
| Branch joins | Reusing values created on one side of a conditional |

This matters because many compiler bugs live in those interactions, not in isolated constants.
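One way to make those edge cases show up regularly is to bias operand selection toward pooled values known to be 0 or 1. The Python sketch below illustrates the idea only. The pick_operand and edge_biased_expr helpers, the edge_bias parameter, and the (name, literal_value) pool layout are assumptions, not a description of how VAST weights its choices. Division is left out so the sketch never emits a divide by a known zero.

```python
import random

EDGE_VALUES = (0, 1)


def pick_operand(pool, rng, edge_bias):
    """Pick a pooled Int name, preferring names bound to 0 or 1.

    `pool` holds (name, literal_value) pairs. The value is None when
    the binding was computed rather than written as a literal.
    """
    edges = [name for name, value in pool if value in EDGE_VALUES]
    if edges and rng.random() < edge_bias:
        return rng.choice(edges)
    return rng.choice([name for name, _ in pool])


def edge_biased_expr(pool, rng, edge_bias=0.5):
    """Build `left op right` where the right side leans toward 0 and 1."""
    left = rng.choice([name for name, _ in pool])
    right = pick_operand(pool, rng, edge_bias)
    op = rng.choice(["+", "-", "*"])
    return f"{left} {op} {right}"


pool = [("zero", 0), ("one", 1), ("x", 41), ("total", None)]
rng = random.Random(0)
for _ in range(5):
    print(edge_biased_expr(pool, rng))
```

With edge_bias set to 0.5, roughly half of the generated expressions use zero or one on the right-hand side, which yields shapes such as x * zero and total + one far more often than uniform selection would.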

Why not just make programs bigger?

Bigger random programs help, but size alone is not the answer.

A 300-line program made of unrelated constants is often less useful than a 20-line program where values actually flow through the code.

That is the real job of a value pool. It increases semantic density.

Variables depend on earlier choices, expressions participate in the final result, and the generated code exercises the compiler in ways that flat programs cannot. VAST gets more out of every line.

The main idea

Value pools are a simple idea with a large effect.

They let VAST remember what it has already created, then reuse and recombine those values to generate programs with richer data flow. That makes the generated code feel less like random noise and more like real software. And real-software-shaped programs are better at finding real compiler bugs.

For the technical details, see the Value pools docs. For a broader introduction to VAST, see Introduction.