Introducing Vary

Large language models can write code that looks correct, compiles cleanly, and passes tests, while still being wrong. A condition is flipped, a value is off by one, an error is silently swallowed. The test suite reports green. Nothing in the toolchain notices.

LLMs are probabilistic systems that generate plausible programs, not correct ones. Programming with AI-generated code requires stronger guardrails than we had before, and the compiler is the most reliable place to put them.

Vary is a statically typed language that compiles to JVM bytecode and ships as a single CLI. The syntax will feel familiar if you have used Python, but Vary uses braces, explicit function types, compile-time null safety, and contracts. It also includes mutation testing directly in the compiler: instead of only running tests, the compiler systematically changes the program and checks whether the tests detect the change. Mutations that survive reveal places where tests pass but fail to verify behaviour.

When machines help write code, the language has to verify that the programs, and the tests meant to check them, are actually telling the truth.

What it looks like

Vary uses braces, explicit function types, and type inference for locals.

def greet(name: Str) -> Str {
    return f"Hello, {name}!"
}

Nullable values are explicit and tracked by the compiler.

let message: Str? = None

if message is not None {
    print(message)    # narrowed from Str? to Str
}

Inside the if, the compiler knows message cannot be None. No casts are required.

Function signatures require explicit types. Local variables can be inferred.

Why another language?

Developers already know how to deal with incorrect code: they write tests. But tests have their own blind spot. A test can run code without actually verifying its behaviour.

For example:

| What the test does | What it misses |
| --- | --- |
| Calls a function but ignores its return value | Whether the function returned the right thing |
| Asserts a constant rather than the computed result | Whether the computation is correct |
| Checks a side effect but not the result | Whether the return value matters |
| Runs the code path but never checks what changed | Whether the code did anything useful |

Coverage tools report this as success because the lines executed. But execution is not verification.

Mutation testing flips the question. Instead of asking "did this code run?", it asks "if the code were wrong, would the tests notice?"
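The gap between execution and verification is easy to reproduce. The sketch below is written in Python rather than Vary, and the names (apply_discount and both tests) are hypothetical, chosen only for illustration. The first test gives the function full line coverage without checking anything; the second actually pins down the result.

```python
# Hypothetical example: apply_discount and both tests are illustrative names,
# not part of Vary or any real library.
def apply_discount(price: float, percent: float) -> float:
    """Return the price reduced by the given percentage."""
    return price * (1 - percent / 100)


def test_runs_but_verifies_nothing() -> None:
    # Executes every line of apply_discount, so a coverage tool reports
    # the function as fully covered. The return value is discarded and
    # nothing is asserted, so a wrong formula would still pass.
    apply_discount(200.0, 25.0)


def test_verifies_result() -> None:
    # Checks the computed value, so a wrong formula fails here.
    assert apply_discount(200.0, 25.0) == 150.0


test_runs_but_verifies_nothing()
test_verifies_result()
```

If the body of apply_discount were changed to return price unchanged, the first test would keep passing and only the second would fail.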

Why mutation testing?

The compiler takes your program and creates slightly altered versions of it. Each altered version contains a small change, a mutant:

| Mutation | Example |
| --- | --- |
| Replace operator | > becomes >= |
| Substitute return value | Return 0 instead of a computed value |
| Remove call | Delete a function call entirely |
| Flip condition | true becomes false |

Your test suite is then executed against each mutant.

If a mutant causes a test to fail, it is killed. If all tests still pass, it survives. A surviving mutant means the tests did not detect a real behavioural change, which is exactly the kind of bug that reaches production unnoticed. Coverage tools cannot detect this.
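The kill-or-survive loop can be sketched in a few lines. This is a simplified Python illustration, not Vary's implementation: a real mutation tester rewrites the program itself, whereas here the mutants are written out by hand, and is_over_limit, MUTANTS, and both suites are hypothetical names.

```python
from typing import Callable, Dict, List

Predicate = Callable[[int], bool]


def is_over_limit(amount: int) -> bool:
    """Original program: true only when amount is strictly above 100."""
    return amount > 100


# Hand-written mutants standing in for what a mutation tester would generate.
MUTANTS: Dict[str, Predicate] = {
    "> becomes >=": lambda amount: amount >= 100,
    "condition flipped": lambda amount: amount <= 100,
    "return value replaced with False": lambda amount: False,
}


def weak_suite(fn: Predicate) -> bool:
    # Checks values far from the boundary only.
    return fn(150) is True and fn(50) is False


def strong_suite(fn: Predicate) -> bool:
    # Adds the boundary case that separates > from >=.
    return weak_suite(fn) and fn(100) is False


def surviving_mutants(suite: Callable[[Predicate], bool]) -> List[str]:
    # A mutant survives when the whole suite still passes against it.
    return [name for name, mutant in MUTANTS.items() if suite(mutant)]


# Both suites pass on the original program.
assert weak_suite(is_over_limit) and strong_suite(is_over_limit)

print(surviving_mutants(weak_suite))    # ['> becomes >=']
print(surviving_mutants(strong_suite))  # []
```

Both suites are green on the original code, yet only the mutation run reveals that the weak suite never exercises the boundary at 100.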

Mission

Dynamic languages defer too many checks until runtime. Advanced type systems can prevent many errors, but at a cognitive cost that slows people down. Vary picks a small set of guarantees that pay for themselves:

| Guarantee | What it catches |
| --- | --- |
| Static typing | Type mismatches at compile time |
| Null safety | Null pointer errors before the code runs |
| Contracts | Precondition and postcondition violations |
| Mutation testing | Tests that cannot actually detect bugs |