Making Mutation Testing Faster

Background
There have been huge improvements in the performance and efficiency of mutation testing systems over the last few years. My conference talks show how pitest is able to analyse in 3 minutes codebases that took earlier systems 3 days, and how arcmutate makes things faster still.
But, while things may be faster, they aren’t always fast enough. For some codebases a full analysis can still be impractical. Fortunately, git integration means that most teams don’t need to analyse all the code. And for the teams that do need a full analysis, arcmutate’s history feature means they only have to do it once.
And, sometimes, you can significantly speed up analysis by making a few tweaks. If you know what to look for.
What Makes Analysis Slow?
Many different factors affect how quickly a codebase can be analysed, including:
- The number of lines of code
- The speed of the unit tests
- How good the tests are at killing mutants
- The number of infinite loops created
The ways these factors interact can be hard to predict. Although larger codebases with slower tests generally have longer analysis times, this is not always the case. You need to look at the details underneath.
From least to most expensive, we can categorise mutants as follows:
- A - Surviving mutant with no coverage
- B - Mutant killed by a fast test
- C - Surviving mutant covered by one or more fast tests
- D - Mutant killed by a slow test
- E - Surviving mutant covered by one or more slow tests
Category A mutants have almost no cost, but are not desirable.
Category B mutants are the best case scenario. Ideally, all our mutants would fall into this category.
The most expensive mutants are category E: surviving mutants covered by slow tests.
So, to make our analysis faster, we want to avoid category E mutants as much as possible.
There are two ways to do this.
- Write fast tests that kill the mutants
- Remove slow ineffective tests from the suite
Faster tests will be run before slower ones so, if they kill the mutants, the slower ones will never execute.
Adding fast, effective tests to a codebase takes time. For a quick win, the second option may work. If we have very slow tests in our suite, excluding them could make a significant difference to analysis times, particularly if they exercise a large number of surviving mutants.
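To make the categories concrete, here is a rough cost model in Java. It is only a sketch under simplifying assumptions: it ignores JVM start-up, timeouts and other overheads, and the class and method names (`MutantCostSketch`, `CoveringTest`, `costMillis`) are made up for illustration and are not part of pitest. It assumes the covering tests run fastest first and stop at the first test that kills the mutant.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class MutantCostSketch {

    /** Hypothetical model of one test that covers a mutant. */
    static final class CoveringTest {
        final String name;
        final long millis;   // how long the test takes to run
        final boolean kills; // whether it fails when the mutant is active

        CoveringTest(String name, long millis, boolean kills) {
            this.name = name;
            this.millis = millis;
            this.kills = kills;
        }
    }

    /** Time spent on one mutant: run fastest first, stop at the first kill. */
    static long costMillis(List<CoveringTest> tests) {
        List<CoveringTest> ordered = new ArrayList<>(tests);
        ordered.sort(Comparator.comparingLong((CoveringTest t) -> t.millis));
        long total = 0;
        for (CoveringTest t : ordered) {
            total += t.millis;
            if (t.kills) {
                break; // mutant killed, the slower tests never execute
            }
        }
        return total;
    }

    public static void main(String[] args) {
        CoveringTest fastKiller = new CoveringTest("fastKiller", 5, true);
        CoveringTest fastMiss = new CoveringTest("fastMiss", 5, false);
        CoveringTest slowKiller = new CoveringTest("slowKiller", 2000, true);
        CoveringTest slowMiss = new CoveringTest("slowMiss", 2000, false);
        CoveringTest slowerMiss = new CoveringTest("slowerMiss", 3000, false);

        System.out.println("A " + costMillis(List.of()));                               // A 0
        System.out.println("B " + costMillis(List.of(slowMiss, fastKiller)));           // B 5
        System.out.println("D " + costMillis(List.of(fastMiss, slowKiller)));           // D 2005
        System.out.println("E " + costMillis(List.of(fastMiss, slowMiss, slowerMiss))); // E 5005
    }
}
```

In this model, removing `slowMiss` and `slowerMiss` from the last case drops the cost of that mutant from 5005 ms to 5 ms, which is the effect excluding slow, ineffective tests is meant to have.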
Pitest and arcmutate have some features to help.
Test Stats
Since release 1.20.0, pitest prints some basic test stats to the console, including:
- The slowest test
- The test that executed the largest number of code blocks.
This information is collected before any mutants are challenged, so dry-run mode can be used to gather stats without analysing any mutants.
mvn -Ppitest test -DextraFeatures=+csv_test_stats -Dpit.dryRun=true
If, when you examine the tests, they have a high cost but don’t add much value, it might be pragmatic to exclude them from the analysis with pitest’s excludedTestClasses parameter.
There may be mutants that only these tests could kill, so excluding them could result in more survivors. Analysis times could go up as well as down.
More Test Stats
Arcmutate users get a little more help.
Since release 1.5.0, if the base plugin identifies a test that it thinks may slow the analysis, it will print it to the console.
com.example.problem.ProblemTest may slow the analysis
Although this is often the same as the class with the slowest test, the heuristics the plugin uses are more complex and less likely to report classes that have little impact on the analysis time.
The release also adds the option to export information about the test suite.
If you install the plugin and activate the ‘+csv_test_stats’ feature, arcmutate will generate two files.
- tests.csv
- killing_tests.csv
The first file is generated before mutants are challenged. For each test it shows the execution time and how many code blocks it executes. Slow tests that execute a large number of blocks are easy to identify, and these are the ones most likely to slow the analysis.
The second file is generated once analysis is complete. It shows the same information, but also shows how many mutants each test killed.
Armed with this information we can generate a list of long running tests that do not kill any mutants. These are candidates for exclusion from the analysis.
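A small script is enough to build that list. The Java sketch below is an illustration only: the column layout it assumes (a header line, then `testName,millis,blocks,kills` with no quoted fields or commas inside test names) is a guess, so check the header of the real killing_tests.csv and adjust the column indexes. The class name, method name and the 1000 ms threshold are also made up for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

final class ExclusionCandidates {

    /**
     * Returns the names of tests that took at least thresholdMillis and
     * killed no mutants.
     * ASSUMPTION: the first line is a header and every other row is
     * "testName,millis,blocks,kills". The real file layout may differ.
     */
    static List<String> slowNonKillers(List<String> csvLines, long thresholdMillis) {
        List<String> candidates = new ArrayList<>();
        // Skip the (assumed) header row; an empty file yields an empty list
        int firstDataRow = Math.min(1, csvLines.size());
        for (String line : csvLines.subList(firstDataRow, csvLines.size())) {
            if (line.trim().isEmpty()) {
                continue; // ignore blank lines
            }
            String[] cols = line.split(",");
            long millis = Long.parseLong(cols[1].trim());
            int kills = Integer.parseInt(cols[3].trim());
            if (millis >= thresholdMillis && kills == 0) {
                candidates.add(cols[0].trim());
            }
        }
        return candidates;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java ExclusionCandidates path/to/killing_tests.csv
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        slowNonKillers(lines, 1000).forEach(System.out::println);
    }
}
```

The names it prints are only candidates. A test that kills nothing in one run may still be the only thing covering code that changes later, so review the list before excluding anything.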
Thanks for reading. Check out our industrial quality mutation testing tools for the JVM.