Value pools

Why random constants are not enough

Early VAST phases generated fresh literals for every expression. Need an integer? Pick a random number. Need a boolean? Flip a coin.

This produces syntactically diverse programs, but the data flow is shallow. Every value is independent. There is no reuse, no composition, and no dependency between variables. Real programs are different: values flow through assignments, get passed to functions, and combine in expressions that reference earlier results.

Value pools address this by tracking values as they are generated and making them available for reuse.

What a value pool is

A value pool is a collection of expressions available at a given point in the program. As the generator creates variables, loop counters, and branch results, each one is added to the pool. When the generator needs a new expression, it can choose to reuse a value from the pool instead of generating a fresh literal.

Generator needs an Int expression
       |
       +-- 30% chance: reuse a value from the pool
       |
       +-- 20% chance: compose two pool values (e.g., x + y)
       |
       +-- 50% chance: generate a fresh literal or expression

The probabilities shown are the defaults, and all of them are configurable. A higher reuse bias produces programs with more data dependencies. A lower reuse bias produces programs closer to the original, purely random generation.
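The three-way choice above can be sketched in a few lines of Python. This is an illustration only, not VAST's implementation: the function name `choose_strategy`, the strategy labels, and the rule that the two biases may not sum to more than 100 are assumptions made for the example.

```python
import random
from collections import Counter


def choose_strategy(rng: random.Random,
                    reuse_bias: int = 30,
                    composition_bias: int = 20) -> str:
    """Return 'reuse', 'compose', or 'fresh' from percentage biases.

    Hypothetical helper: the remaining percentage goes to fresh generation.
    """
    if reuse_bias < 0 or composition_bias < 0:
        raise ValueError("biases must not be negative")
    if reuse_bias + composition_bias > 100:
        raise ValueError("reuse_bias + composition_bias must not exceed 100")
    roll = rng.randrange(100)  # uniform integer in 0..99
    if roll < reuse_bias:
        return "reuse"
    if roll < reuse_bias + composition_bias:
        return "compose"
    return "fresh"


# Tally the strategies chosen over many draws with the default biases.
rng = random.Random(0)
counts = Counter(choose_strategy(rng) for _ in range(10_000))
```

With the default biases, the tallies in `counts` should land near a 30/20/50 split. A real generator would also need a fallback to fresh generation when the pool is empty or holds no value of the required type.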

Pool entries

Each entry in the pool tracks metadata about the value:

Field	What it records
Type	The Vary type (`Int`, `Bool`, `Str`, etc.)
Expression	The AST expression that produces the value
Origin	Where the value came from (local binding, loop-carried value, branch join, argument, temporary)
Scope depth	How deeply nested the value is (for visibility checking)
Complexity	Expression depth (simpler values are preferred for reuse)
Dependency depth	How many other pool entries this value transitively depends on
Use count	How many times the value has been reused
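
One way to picture a pool entry is as a small record type. The sketch below is an assumption-laden illustration, not VAST's data structure: the names `PoolEntry` and `Origin`, the use of a string in place of a real AST node, and the simple depth comparison for visibility are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Origin(Enum):
    LOCAL_BINDING = "local binding"
    LOOP_CARRIED = "loop carried"
    BRANCH_JOIN = "branch join"
    ARGUMENT = "argument"
    TEMPORARY = "temporary"


@dataclass
class PoolEntry:
    type_name: str         # Vary type, e.g. "Int", "Bool", "Str"
    expression: str        # stand-in for the AST expression (source text here)
    origin: Origin         # where the value came from
    scope_depth: int       # nesting depth at which the value was created
    complexity: int        # expression depth
    dependency_depth: int  # transitive dependencies on other pool entries
    use_count: int = 0     # how many times the value has been reused

    def is_visible_at(self, current_scope_depth: int) -> bool:
        # Simplification: assumes entries are dropped from the pool when
        # their scope closes, so a depth comparison is sufficient.
        return self.scope_depth <= current_scope_depth


entry = PoolEntry("Int", "x", Origin.LOCAL_BINDING,
                  scope_depth=1, complexity=1, dependency_depth=0)
```

A value bound at scope depth 1 is visible from depth 2 but not from depth 0 under this simplified rule.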

Pool origins

Values enter the pool from different sources:

Origin	Example	Priority
Local binding	`let x = 42`	High (named, easy to reference)
Loop carried	`x = x + 1` inside a while loop	Medium
Branch join	Value computed in an if/else branch	Medium
Argument	Function parameter	High
Temporary	Intermediate expression result	Low

Named values (local bindings and function arguments) rank highest for reuse because referencing them by name produces cleaner, more readable generated programs.

How reuse works

When the generator decides to reuse a pool value, it selects from entries that match the required type and are visible at the current scope depth. Selection is biased toward:

Priority	Preference
1	Named locals over temporaries
2	Lower complexity over higher complexity
3	Less frequently used values over heavily used ones
4	Deeper scope (closer to the current position)

This bias produces realistic data flow patterns: values tend to be used near where they are defined, simpler values get reused more than complex expressions, and the generator avoids over-referencing any single variable.
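
The four preferences can be combined into a single selection weight. The sketch below assumes one possible weighting. The source gives the order of the preferences but no formula, so the multipliers, the `Entry` type, and the function names are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    name: str
    type_name: str
    named: bool      # True for named locals, False for temporaries
    scope_depth: int
    complexity: int
    use_count: int = 0


def weight(entry: Entry, diversity_penalty: float = 1.0) -> float:
    """Illustrative weight: a larger value makes the entry more likely."""
    w = 4.0 if entry.named else 1.0                # 1. named over temporary
    w /= 1 + entry.complexity                      # 2. simpler is better
    w /= 1 + diversity_penalty * entry.use_count   # 3. less-used is better
    w *= 1 + entry.scope_depth                     # 4. deeper scope is better
    return w


def pick_for_reuse(pool: List[Entry], type_name: str, current_depth: int,
                   rng: random.Random,
                   diversity_penalty: float = 1.0) -> Optional[Entry]:
    # Keep only entries of the required type that are visible here.
    candidates = [e for e in pool
                  if e.type_name == type_name
                  and e.scope_depth <= current_depth]
    if not candidates:
        return None  # caller would fall back to a fresh expression
    weights = [weight(e, diversity_penalty) for e in candidates]
    chosen = rng.choices(candidates, weights=weights, k=1)[0]
    chosen.use_count += 1  # record the reuse so later picks diversify
    return chosen


pool = [
    Entry("x", "Int", True, 0, 1),
    Entry("t1", "Int", False, 0, 1),
    Entry("flag", "Bool", True, 0, 1),
    Entry("inner", "Int", True, 3, 1),  # too deep to be visible at depth 1
]
picked = pick_for_reuse(pool, "Int", current_depth=1, rng=random.Random(0))
```

Weighted random choice keeps selection probabilistic: a temporary can still be picked, just less often than a named local. A strict sort by priority would always pick the same entry and lose that variety.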

Composition

Instead of reusing a single pool value, the generator can compose two values:

let x = 10
let y = 3
# composed value: x + y (combines two pool entries)
let z = x + y

Composition creates expressions that reference multiple earlier values, producing richer dependency graphs. The maximum composition complexity is configurable to prevent deeply nested expressions.
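
The complexity limit can be illustrated with a small Python sketch. It assumes that a bare variable or literal has depth 1 and that a binary operation is one level deeper than its deeper operand. The `Expr` type and `compose` function are hypothetical names, not part of VAST.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Expr:
    text: str
    depth: int  # expression depth; a bare variable or literal counts as 1


def compose(left: Expr, right: Expr, op: str = "+",
            max_complexity: int = 3) -> Optional[Expr]:
    """Combine two pool values, or return None if the result is too deep."""
    depth = 1 + max(left.depth, right.depth)
    if depth > max_complexity:
        return None  # caller would fall back to reuse or a fresh literal
    return Expr(f"({left.text} {op} {right.text})", depth)


x = Expr("x", 1)
y = Expr("y", 1)
z = compose(x, y)         # (x + y), depth 2
w = compose(z, x, "*")    # ((x + y) * x), depth 3
too_deep = compose(w, y)  # depth would be 4, over the limit of 3
```

Rejecting an over-deep composition, rather than truncating it, keeps every emitted expression well-formed.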

Edge cases that pools enable

Value pools improve VAST's ability to find edge-case bugs because they create programs with specific value patterns:

Pattern	How pools produce it	What it tests
Zero interactions	Reuse a variable that holds `0` in arithmetic	Division by zero, multiply by zero
Identity operations	Compose `x + 0`, `x * 1` with pool values	Optimizer correctness for identities
Aliasing	Reuse the same variable in multiple contexts	Variable scoping and register allocation
Accumulation	Loop-carried values that grow over iterations	Loop variable overflow, accumulator correctness

Without pools, these patterns only appear by coincidence. With pools, they appear regularly because the generator deliberately reuses values.

Configuration

Pool behaviour is controlled by five parameters:

Parameter	Default	Description
Reuse bias	30	Chance, in percent (0-100), of reusing a pool value
Composition bias	20	Chance, in percent (0-100), of composing two pool values
Diversity penalty	1	Weight that penalizes heavily used entries, which favors less-used ones
Max dependency depth	4	Maximum transitive dependencies for a pool entry
Max composition complexity	3	Maximum expression depth when composing

These defaults produce a balance between fresh generation and value reuse. Higher reuse and composition biases produce programs with deeper data flow at the cost of less syntactic variety.
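
The five parameters and their defaults could be grouped into a single configuration object, sketched below. The class name `PoolConfig`, the field names, the validation rules, and the derived `fresh_bias` are assumptions for illustration. The source does not say how VAST stores or validates these settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolConfig:
    reuse_bias: int = 30                 # percent, 0-100
    composition_bias: int = 20           # percent, 0-100
    diversity_penalty: int = 1
    max_dependency_depth: int = 4
    max_composition_complexity: int = 3

    def __post_init__(self) -> None:
        # Assumed rule: each bias is a percentage, and together they
        # must leave a non-negative share for fresh generation.
        for name in ("reuse_bias", "composition_bias"):
            value = getattr(self, name)
            if not 0 <= value <= 100:
                raise ValueError(f"{name} must be in 0-100, got {value}")
        if self.reuse_bias + self.composition_bias > 100:
            raise ValueError("reuse_bias + composition_bias exceeds 100")

    @property
    def fresh_bias(self) -> int:
        """Percentage left over for fresh literals and expressions."""
        return 100 - self.reuse_bias - self.composition_bias


defaults = PoolConfig()
data_flow_heavy = PoolConfig(reuse_bias=50, composition_bias=30)
```

With the defaults, `fresh_bias` is 50, matching the 30/20/50 split described earlier. The `data_flow_heavy` variant leaves only 20 percent for fresh generation.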

Metrics

VAST tracks pool statistics during generation:

Metric	What it measures
Pool size over time	How many entries are available at each generation step
Reuse and composition ratios	How often each strategy is selected
Dependency depth distribution	How deep the transitive chains get
Named local prevalence	What fraction of reused values are named locals

These metrics help tune pool configuration and verify that value pools are actually producing the intended data flow patterns.
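
A minimal sketch of how such statistics might be collected is shown below. The `PoolMetrics` class and its method names are hypothetical, and the recorded values are made-up inputs that only exercise the counters. They are not measurements from VAST.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PoolMetrics:
    pool_sizes: List[int] = field(default_factory=list)
    strategy_counts: Counter = field(default_factory=Counter)
    dependency_depths: Counter = field(default_factory=Counter)
    reused_total: int = 0
    reused_named_locals: int = 0

    def record_step(self, pool_size: int, strategy: str) -> None:
        """Log the pool size and the strategy chosen at one step."""
        self.pool_sizes.append(pool_size)
        self.strategy_counts[strategy] += 1

    def record_entry(self, dependency_depth: int) -> None:
        """Log the dependency depth of an entry added to the pool."""
        self.dependency_depths[dependency_depth] += 1

    def record_reuse(self, is_named_local: bool) -> None:
        """Log one reuse and whether it hit a named local."""
        self.reused_total += 1
        if is_named_local:
            self.reused_named_locals += 1

    def strategy_ratios(self) -> Dict[str, float]:
        total = sum(self.strategy_counts.values())
        if total == 0:
            return {}
        return {name: n / total for name, n in self.strategy_counts.items()}

    def named_local_prevalence(self) -> float:
        if self.reused_total == 0:
            return 0.0
        return self.reused_named_locals / self.reused_total


metrics = PoolMetrics()
metrics.record_step(pool_size=0, strategy="fresh")
metrics.record_step(pool_size=1, strategy="reuse")
metrics.record_reuse(is_named_local=True)
metrics.record_step(pool_size=2, strategy="compose")
metrics.record_step(pool_size=3, strategy="reuse")
metrics.record_reuse(is_named_local=False)
metrics.record_entry(dependency_depth=0)
metrics.record_entry(dependency_depth=2)
```

Comparing `strategy_ratios()` against the configured biases is one way to check that the generator selects strategies at the intended rates.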