# A 5 minute Intro to Hypothesis

October 14, 2017 | 7 min Read

Let’s say you have a Python module called `example.py` where you have a function called `add_numbers` that you want to test.

``````# example.py
def add_numbers(a, b):
return a + b``````

This function adds two numbers, so among other things you might want to check that the commutative property holds for all the inputs that the function receives. You create a file called `test_example.py` and start writing a simple unit test to prove it.

## A simple unit test

``````# test_example.py
import unittest
from your_python_module.example import add_numbers

class TestAddNumbers(unittest.TestCase):

def test_add_numers_is_commutative(self):
self.assertEqual(add_numbers(1.23, 4.56), add_numbers(4.56, 1.23))

if __name__ == '__main__':
unittest.main()``````

That’s cool, but you actually didn’t prove that the commutative property holds in general. You have just proved that for this specific case such property holds.

You realize that the combination of `1.23` and `4.56` is a very tiny subset of the entire input space of numbers that your function can receive, so you write more tests.

## A common solution: write more test cases

``````# test_example.py

# more tests here...
def test_add_numers_is_commutative_another_case(self):
self.assertEqual(add_numbers(0.789, 321), add_numbers(321, 0.789))
# more tests here...``````

Not a huge gain. You have just proved that for these other specific cases that you wrote the commutative property holds. And obviously you don’t want to write a million test cases by hand.

Maybe you have heard about fuzzing, and you want to use it to create random test cases every time you run the test.

## A better solution: fuzzing and ddt

You can create a pair of random floats, so every time you run the test you have a new test case. That’s a bit dangerous though, because if you have a failure you cannot reproduce it easily.

``````# test_example.py
import random
import unittest

class TestAddNumbers(unittest.TestCase):

def test_add_numers_is_commutative(self):
a = random.random() * 10.0
b = random.random() * 10.0
self.assertEqual(add_numbers(a, b), add_numbers(b, a))

if __name__ == '__main__':
unittest.main()``````

You can also use a library called ddt to generate many random test cases.

``````# test_example.py
import random
import unittest
from ddt import ddt, idata, unpack

def float_pairs_generator():
num_test_cases = 100
for i in range(num_test_cases):
a = random.random() * 10.0
b = random.random() * 10.0
yield (a, b)

@ddt
class TestAddNumbers(unittest.TestCase):

@idata(float_pairs_generator())
@unpack
def test_add_floats_ddt(self, a, b):
self.assertEqual(add_numbers(a, b), add_numbers(b, a))

if __name__ == '__main__':
unittest.main()``````

Here the `float_pairs_generator` generator function creates 100 random pairs of floats. With this trick you can multiply the number of test cases while keeping your tests easy to maintain.

That’s definitely a step in the right direction, but if you think about it we are still testing some random combinations of numbers between 0.0 and 10.0 here. Not a very extensive portion of the input domain of the function `add_numbers`.

You have two options:

1. find a way to generate domain objects that your function can accept. In this case the domain objects are the floats that `add_numbers` can receive.
2. use hypothesis

I don’t know about you, but I’m going for the second one.

## The best solution: Hypothesis

Here is how you write a test that checks that the commutative property holds for a pair of floats.

``````# test_example.py
from hypothesis import given
from hypothesis.strategies import floats
from your_python_module.example import add_numbers

@given(a=floats(), b=floats())
def test_add_numbers(a, b):
assert add_numbers(a, b) == add_numbers(b, a)

if __name__ == '__main__':
test_add_numbers()``````

When I ran this test I was shocked. It failed!

WTF! How it that possible that this test fails, after I have tried 100 test cases with ddt?

Luckily with hypothesis you can increase the verbosity level of your test by using the `@settings` decorator.

Let’s say you also want to test a specific test case: `a == 1.23` and `b == 4.56`. For this you can use the `@example` decorator. This is nice because now your test provides some documentation to anyone who wants to use the `add_numbers` function, and at the same time you are testing a specific case that you know about or that might be particularly hard to hit.

``````# test_example.py
from hypothesis import given, example, settings, Verbosity
from hypothesis.strategies import floats
from your_python_module.example import add_numbers

@settings(verbosity=Verbosity.verbose)
@example(a=1.23, b=4.56)
@given(a=floats(), b=floats())
def test_add_numbers(a, b):
assert add_numbers(a, b) == add_numbers(b, a)

if __name__ == '__main__':
test_add_numbers()``````

Obviously the test fails again, but this time you get more insights about it.

``````Trying example: test_add_numbers(a=1.23, b=4.56)
Trying example: test_add_numbers(a=0.0, b=nan)
Traceback (most recent call last):
# Traceback here...
AssertionError
Trying example: test_add_numbers(a=0.0, b=nan)
Traceback (most recent call last):
# Traceback here...
AssertionError
Trying example: test_add_numbers(a=0.0, b=1.0)
Trying example: test_add_numbers(a=0.0, b=4293918720.0)
Trying example: test_add_numbers(a=0.0, b=281406257233920.0)
Trying example: test_add_numbers(a=0.0, b=7.204000185188352e+16)
Trying example: test_add_numbers(a=0.0, b=inf)
# more test cases...
You can add @seed(247616548810050264291730850370106354271) to this test to reproduce this failure.``````

That’s really helpful. You can see all the test case that were successful and the ones that caused a failure. You get also a `seed` that you can use to reproduce this very specific failure at a later time or on a different computer. This is so awesome.

Anyway, why does this test fail? It fails because in Python `nan` and `inf` are valid floats, so the function `floats()` might create some test cases that have `a == nan` and/or `b == inf`.

Are `nan` and `inf` valid inputs for your application? Maybe. It depends on your application.

If you are absolutely sure that `add_numbers` will never receive a `nan` or a `inf` as inputs, you can write a test that never generates either `nan` or `inf`. You just have to set `allow_nan` and `allow_infinity` to `False` in the `@given` decorator. Easy peasy.

``````# test_example.py
from hypothesis import given, example, settings, Verbosity
from hypothesis.strategies import floats
from your_python_module.example import add_numbers

@given(
a=floats(allow_nan=False, allow_infinity=False),
b=floats(allow_nan=False, allow_infinity=False))
def test_add_numbers(a, b):
assert add_numbers(a, b) == add_numbers(b, a)

if __name__ == '__main__':
test_add_numbers()``````

But what if `add_numbers` could in fact receive `nan` or `inf` as inputs (a much more realistic assumption). In this case the test should be able to generate `nan` or `inf`, your function should raise specific exceptions that you will have to handle somewhere else in your application, and the test should not consider such specific exceptions as failures.

Here is how `example.py` might look:

``````# example.py
import math

class NaNIsNotAllowed(ValueError):
pass

class InfIsNotAllowed(ValueError):
pass

def add_numbers(a, b):
if math.isnan(a) or math.isnan(b):
raise NaNIsNotAllowed('nan is not a valid input')
elif math.isinf(a) or math.isinf(b):
raise InfIsNotAllowed('inf is not a valid input')
return a + b``````

And here is how you write the test with hypothesis:

``````# test_example.py
from hypothesis import given
from hypothesis.strategies import floats
from your_python_module.example import add_numbers, NaNIsNotAllowed, InfIsNotAllowed

@given(a=floats(), b=floats())
def test_add_numbers(a, b):
try:
assert add_numbers(a, b) == add_numbers(b, a)
except (NaNIsNotAllowed, InfIsNotAllowed):
reject()

if __name__ == '__main__':
test_add_numbers()``````

This test will generate some `nan` and `inf` inputs, `add_numbers` will raise either `NaNIsNotAllowed` or `InfIsNotAllowed` and the test will catch these exceptions and reject them as test failures (i.e. the test case will be considered a success when either `NaNIsNotAllowed` or `InfIsNotAllowed` occurs).

Can you really afford to reject `nan` as an input value for `add_numbers`? Maybe not. Let’s say your code needs to sum two samples in a time series, and one sample of the time series is missing: `nan` would be a perfectly valid input for `add_numbers` in such case.

## References Written by Giacomo Debidda, Pythonista & JS lover (D3, React). You can find me on Twitter & Github