According to Eivind Jahren, property-based testing is an invaluable tool for its ease of use and effectiveness. It is flexible in what requirements one can formulate and is simple and lightweight enough to put in the hands of software developers to perform iterative testing on a daily basis. It can discover bugs as the code is being written, less test code is required, and it’s easier to reuse test data generators for complex structured data.
Eivind Jahren gave a talk about property-based testing at NDC TechTown.
Fuzzing is a testing method where one repeatedly gives random inputs to the program, Jahren said. You provide both valid and invalid inputs, and use various strategies for guiding the inputs toward new behaviour. The purpose is to determine whether the program has unwanted behaviour (memory leaks, invalid operations, crashes, etc.).
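As a rough illustration of the idea, the sketch below throws random strings at a small parser and records any exception other than the documented rejection. It is not taken from the talk: `parse_percentage`, its planted bug, and the `fuzz` helper are hypothetical, and the inputs are purely random rather than guided toward new behaviour.

```python
import random

ALPHABET = "0123456789%- x"


def parse_percentage(text):
    """Hypothetical function under test: parses '42%' into 42."""
    if text[-1] != "%":  # planted bug: IndexError on the empty string
        raise ValueError("missing percent sign")
    return int(text[:-1])  # int() raises ValueError on junk


def fuzz(target, iterations=1000, seed=0):
    """Feed random strings to target and collect unexpected exceptions."""
    rng = random.Random(seed)
    crashes = set()
    for _ in range(iterations):
        length = rng.randrange(0, 6)
        text = "".join(rng.choice(ALPHABET) for _ in range(length))
        try:
            target(text)
        except ValueError:
            pass  # documented way of rejecting invalid input
        except Exception as exc:  # anything else counts as a crash
            crashes.add((text, type(exc).__name__))
    return crashes


crashes = fuzz(parse_percentage)
print(crashes)
```

Valid and invalid inputs are mixed on purpose: rejecting `"x"` with a `ValueError` is fine, while an `IndexError` on the empty string is the kind of unwanted behaviour a fuzzer is meant to surface.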
Jahren mentioned that they started using property-based testing in a group of enthusiasts seven years ago. Since then they have had several workshops and a course to spread the knowledge of how to use this tool. He mentioned that they use property-based testing as fuzzy unit testing:
Property tests are written alongside other unit and integration tests, together with some more heavyweight tests that fuzz large parts of the code.
Jahren gave an example using property-based testing for the Python sorting function `sorted()`:
```python
from hypothesis import given
import hypothesis.strategies as st


@given(st.lists(elements=st.integers()))
def test_sorted(values):
    sorted_values = sorted(values)
    # Sorting is defined by two properties: same elements, in order.
    assert_is_permutation(values, sorted_values)
    assert_is_ordered(sorted_values)


def assert_is_permutation(list1, list2):
    # Every element must occur equally often in both lists.
    for element in list1 + list2:
        assert list1.count(element) == list2.count(element)


def assert_is_ordered(values):
    # Each element must not exceed its successor.
    for i in range(len(values) - 1):
        assert values[i] <= values[i + 1]
```
This example uses the Python property-based testing library Hypothesis. The functions that generate random values are called "strategies" in Hypothesis, and here we have created a generator for a list of integers, `st.lists(elements=st.integers())`, Jahren explained:
The `@given` decorator calls the test 100 times (by default) with randomly generated values from that strategy. Each time we sort the numbers with `sorted()` and assert the defining properties of sorting, namely that the output is an ordered permutation of the input list.
There are several libraries for doing property-based testing. They usually consist of two parts: a combinator library for generating test data with the desired shape, and a test runner that injects the test data into a unit test, Jahren mentioned. Common features of property-based testing libraries include shrinking (reducing a failing input to a simpler one), stateful tests, and mechanisms for reproducing test runs and controlling the amount of test data. Jahren referred to a list of property-based testing libraries in different languages and what features they provide.
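To make the two parts concrete, here is a deliberately tiny, standard-library-only sketch of generator combinators, a test runner, and a naive shrinker. All names (`integers`, `lists`, `shrink_list`, `check`) are invented for this illustration and do not mirror how Hypothesis or any other real library is implemented.

```python
import random


# Part 1: combinators. A "generator" is a function from a random.Random to a value.
def integers(lo=-1000, hi=1000):
    return lambda rng: rng.randint(lo, hi)


def lists(element, max_size=10):
    return lambda rng: [element(rng) for _ in range(rng.randint(0, max_size))]


def shrink_list(xs):
    """Yield simpler candidates: the list with one element removed."""
    for i in range(len(xs)):
        yield xs[:i] + xs[i + 1:]


# Part 2: the runner. Returns None if the property held on every run,
# otherwise a counterexample shrunk greedily by removing elements.
def check(prop, gen, runs=100, seed=0):
    rng = random.Random(seed)  # a fixed seed makes failures reproducible
    for _ in range(runs):
        value = gen(rng)
        if not prop(value):
            shrunk = True
            while shrunk:
                shrunk = False
                for candidate in shrink_list(value):
                    if not prop(candidate):
                        value, shrunk = candidate, True
                        break
            return value
    return None


def reversing_changes_nothing(xs):
    return list(reversed(xs)) == xs  # deliberately false property


counterexample = check(reversing_changes_nothing, lists(integers()))
print(counterexample)
```

The runner's `runs` and `seed` parameters stand in for the mechanisms real libraries offer for controlling the amount of test data and reproducing a failure.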
In comparison to formal verification, other fuzzing techniques, and linting, property-based testing is very flexible in what requirements one can formulate and is simple and lightweight enough to put in the hands of all developers to perform iterative testing on a daily basis, Jahren said.
Jahren mentioned that the immediate effect they saw was that property-based testing, more than regular unit tests, can discover bugs as the code is being written, because the tool generates test data that the developer had not otherwise considered.
Jahren mentioned that his impression is that less test code is required for the same amount of assurance. One other effect that he noticed was that reusing test data generators for complex structured data is easier than coming up with new dummy test data, and it does not have the problems of reusing test data.
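A hedged sketch of what reusing a generator for structured data can look like with Hypothesis follows. The `LineItem` class, the `order_total` function, and both properties are hypothetical and were not part of the talk; only `st.builds`, `st.lists`, `st.text`, `st.integers`, and `@given` are Hypothesis APIs.

```python
from dataclasses import dataclass

from hypothesis import given
import hypothesis.strategies as st


@dataclass
class LineItem:  # hypothetical domain object
    name: str
    quantity: int
    unit_price_cents: int


# Defined once, reused by every test that needs line items or orders.
line_items = st.builds(
    LineItem,
    name=st.text(min_size=1, max_size=20),
    quantity=st.integers(min_value=1, max_value=100),
    unit_price_cents=st.integers(min_value=0, max_value=100_000),
)
orders = st.lists(line_items, max_size=10)


def order_total(items):  # hypothetical function under test
    return sum(item.quantity * item.unit_price_cents for item in items)


@given(orders)
def test_total_is_non_negative(items):
    assert order_total(items) >= 0


@given(orders, line_items)
def test_adding_an_item_never_lowers_total(items, extra):
    assert order_total(items + [extra]) >= order_total(items)
```

Because each test draws fresh values from the shared strategies, the tests do not end up coupled to one hand-written fixture.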
InfoQ interviewed Eivind Jahren about doing property-based testing.
InfoQ: How does property-based testing compare to traditional testing methods and verification techniques?
Eivind Jahren: With testing (or any dynamic verification technique with human-generated input) the danger is that the author of the test is not sufficiently imaginative in creating test data. This is a point where property-based testing really shines.
Take the example from "When multiple bugs attack". Here the mean of a list of floating-point numbers is tested not to exceed the largest number in the list. However, this is not generally valid, as computing the mean can result in catastrophic loss of precision!
`mean([174763.00000000006, 174763.00000000006, 174763.00000000006]) = 174763.0000000001`
Providing inputs where calculating the mean has a catastrophic loss of precision is not at the top of my mind when I write tests.
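The effect can be reproduced with a few lines of Python. The sketch assumes the mean is computed naively as a running sum divided by the length; `naive_mean` is a name chosen for this illustration, and the implementation in the original example may differ.

```python
import statistics


def naive_mean(xs):
    """Assumed implementation: accumulate left to right, then divide."""
    total = 0.0
    for x in xs:
        total += x  # each addition rounds to the nearest double
    return total / len(xs)


xs = [174763.00000000006] * 3

# The rounded sum is slightly too large, so the "mean" exceeds every input.
print(naive_mean(xs) > max(xs))

# statistics.mean works with exact fractions internally and stays in range here.
print(statistics.mean(xs) <= max(xs))
```

A property such as `min(xs) <= mean(xs) <= max(xs)` therefore only holds for implementations that control rounding, which is exactly the kind of requirement a generated input can expose.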
InfoQ: What have you learned?
Jahren: What has been the most surprising and consistent feedback is that people find working with property-based testing really fun. The kinds of failures you get are interesting or surprising, and one has a sense that code that passes the tests is more resilient, which adds to a sense of accomplishment.