Mutation

Smallest Example

Mutation testing asks: if we mess with the code in a small way, like changing + to -, will the tests catch it?

That is the whole idea. Mutation testing takes code that is currently correct, makes one small deliberate wrong change, and reruns the test. If the test fails, the test noticed the change. If the test still passes, the test did not notice. That usually means the test is too weak.

Original code:

def add(a: Int, b: Int) -> Int {
    return a + b
}

Test:

from calc import add

test "add" {
    observe add(2, 3) == 5
}

This test means: call add with 2 and 3, and fail if the answer is not 5. In other words, this test is supposed to catch it if add stops returning the right answer.

Run mutation testing with:

vary mutate calc.vary

When you run mutation testing, Vary keeps your original source file, picks one small part of the code, and temporarily changes it. In this example, it changes the + operator to -.

So this line:

return a + b

temporarily becomes this:

return a - b

That is the mutation. It is a deliberate tiny change to the code.

The reason for making this swap is not to be random. The point is to change the code in a small realistic way and see whether the test catches it. If a test only passes when the code is correct and fails when the code is slightly wrong, that test is doing real work. If we can mess with the code and the test still passes, the test is probably not checking the behavior well enough.

Nothing else changes here. The function name stays the same. The inputs stay the same. The test stays the same. The only thing that changes is the operator in the middle of the calculation.

One example mutant:

def add(a: Int, b: Int) -> Int {
    return a - b
}

Now Vary reruns the same test against this temporary broken version.

This time, add(2, 3) no longer computes 2 + 3. It computes 2 - 3, which is -1. But the test still expects 5. So the actual check becomes -1 == 5, which is false, and the test fails. That failure is the test catching the change.

So Vary marks that mutant as killed.

In mutation testing language, a failing test means the mutant was killed. That is good. It means the test caught the wrong behavior after we changed the code.

Test resultMeaning
The test failsThe mutant is killed
The test still passesThe mutant survived

That is the whole idea of mutation testing. It asks: if we mess with the code in a small way, will the tests catch it? In this example, the answer is yes. Swapping + to - changed the result from 5 to -1, and the test failed immediately.

If you want the conceptual version first, read Overview. If you want the technical details after this, continue to Introduction.

← Introduction
Golden path →