My first Property test - a brief retrospective
Recently I found myself having to implement a fairly simple algorithm, the orthodromic (great-circle) distance between 2 latitude/longitude points. I've long looked at property based testing and it implementations (QuickCheck) with a sense of longing, but couldn't find something that would easily allow me to bring this to my work in python. Luckily I stumbled across hypothesis which seems to accomplish just this!
Great I thought, this will allow me to do some property tests with ease! But then I hit the first (and possibly biggest) stumbling block, what is a property?
The task at hand
Before we delve further let's quickly set out our algorithm.
# Spherical law of Cosines calculation (the shorter method)
import maths
def orth_distance(point1, point2):
lat1, long1 = point1
lat2, long2 = point2
abs_long_distance = long2 - long1
central_angle = math.acos(
(math.sin(lat1) * math.sin(lat2)) + \
(math.cos(lat1) * math.cos(lat2) * \
math.cos(abs_long_distance))
)
return central_angle * EARTH_RADIUS
So given 2 points (in radians) (lat1, long1), (lat2, long2) we get back the distance between the 2.
Properties?
So what are the properties of this system?
Think of all of the laws of Physics / Maths / Science that you know of. Most are quite simple, but when applied together allow much more complex meaning be expressed. We're trying to come up with something similar here. What's the simplest set of assertions that we can define that will hold true?
First it looks to be a reasonably pure function, so we don't have to content with side effects. What are some of the other obvious properties?
- It's commutative f(x, y) = f(y, x)
- Distance between the same point is zero f(x,x) = 0
Great! That gives us something to start with.
Generating data
To help us along lets define how we would generate a single point
from hypothesis.strategies import floats, tuples
point_test_gen = tuples(
floats(allow_infinity=False, allow_nan=False, min_value=-90.0, max_value=90.0),
floats(allow_infinity=False, allow_nan=False, min_value=-180.0, max_value=180.0)
)
So, a co-ordinate point is 2 floating point numbers, ranging from -180 - 180 (representing lines of latitude / longitude).
The tests
So for the actual tests, it was simply a case of just gluing everything together!
@given(point_test_gen, point_test_gen)
def test_orth_distance_commutative(point1, point2):
"""
Test that orth_distance doesn't care about argument order
"""
assert orth_distance(point1, point2) == orth_distance(point2, point1)
@given(point_test_gen)
def test_orth_distance_same_point(point1):
"""
Test that orth_distance gives the distance between the same point
as 0
"""
assert orth_distance(point1, point1) == 0
It is as simple as it looks, hypothesis takes care of generating out all of the test cases for each value. And if any failures occur it will automatically try and find the smallest possible problem set.
A successful test run
========================================= test session starts ==========================================
platform linux -- Python 3.4.2, pytest-2.9.2, py-1.4.31, pluggy-0.3.1 -- /home/dave/projects/dist/.venv/bin/python
cachedir: .cache
rootdir: /home/dave/projects/dist, inifile:
plugins: hypothesis-3.4.2
collected 3 items
dist/test_distance.py::test_orth_distance_commutative PASSED
dist/test_distance.py::test_orth_distance_same_point PASSED
dist/test_distance.py::test_orth_distance_dublin_cork PASSED
======================================= 3 passed in 0.48 seconds =======================================
And just to compare a not-so successful test run
========================================= test session starts ==========================================
platform linux -- Python 3.4.2, pytest-2.9.2, py-1.4.31, pluggy-0.3.1 -- /home/dave/projects/dist/.venv/bin/python
cachedir: .cache
rootdir: /home/dave/projects/dist, inifile:
plugins: hypothesis-3.4.2
collected 3 items
dist/test_distance.py::test_orth_distance_commutative FAILED
dist/test_distance.py::test_orth_distance_same_point PASSED
dist/test_distance.py::test_orth_distance_dublin_cork FAILED
======================================= short test summary info ========================================
FAIL dist/test_distance.py::test_orth_distance_commutative
FAIL dist/test_distance.py::test_orth_distance_dublin_cork
=============================================== FAILURES ===============================================
____________________________________ test_orth_distance_commutative ____________________________________
@given(point_test_gen, point_test_gen)
> def test_orth_distance_commutative(point1, point2):
"""
Test that orth_distance doesn't care about argument order
dist/test_distance.py:32:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
.venv/lib/python3.4/site-packages/hypothesis/core.py:520: in wrapped_test
print_example=True, is_final=True
.venv/lib/python3.4/site-packages/hypothesis/executors.py:58: in default_new_style_executor
return function(data)
.venv/lib/python3.4/site-packages/hypothesis/core.py:110: in run
return test(*args, **kwargs)
dist/test_distance.py:36: in test_orth_distance_commutative
assert orth_distance(point1, point2) == orth_distance(point2, point1)
dist/distance.py:58: in orth_distance
return impl_fn(lat1, long1, lat2, long2) * EARTH_RADIUS
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
lat1 = 0.0, long1 = 0.0, lat2 = 1e-323, long2 = 0.0
def _haversine_distance(lat1, long1, lat2, long2):
"""
Distance (in Rads) between 2 lat / long via the Haversine function
(computationally better for smaller distances)
:returns float distance: Distance in Radians
"""
#: TODO should we check for 0N 0W / InfN InfW ? (invalid data)
abs_lat_distance = lat1 - lat2
abs_long_distance = long1 - long2
return 2 * math.asin(math.sqrt(
> math.pow(math.sin(abs_lat_distance/2), 2.1) + \
(math.cos(lat1) * math.cos(lat2) * math.pow(math.sin(abs_long_distance/2), 2))
))
E ValueError: math domain error
dist/distance.py:23: ValueError
---------------------------------------------- Hypothesis ----------------------------------------------
Falsifying example: test_orth_distance_commutative(point1=(0.0, 0.0), point2=(4.25e-322, 0.0))
To do this I fudged one of the math.pow calls to raise to the power of 2.1. Interestingly enough the property f(x, x) = 0 still holds true!
Looking back
So after I implemented the tests I was quite amazed by their result. Normally when Unit-Testing I'd like to cover a few of the following points:
- Successful call
- Simple failure
- No data
- Exception / unusual input (NaN etc)
Which would take 8-10 Unit-Tests to leave me with a reasonable confidence in what I'd done.
I'd always been worried about having to define all of the properties of a function, ensuring that I didn't miss anything. With the properties I'd defined I had a much greater confidence in my coding after just 1! It'd hit a number of edge cases that I hadn't ever thought about before and forced my hand to fix them right there.
After this I think I'll be relying much more on implementing more property based testing.
A replacement for Unit-Tests?
Possibly, however I wasn't 100% confident in my skills so ended up writing this Unit-Test.
def test_orth_distance_dublin_cork():
"""
Test the pre-calculcated distance between
Dublin and Cork (from wolfram alpha)
"""
dublin_pos = (53.3498, 6.2603)
cork_pos = (51.8969, 8.4863)
dist = orth_distance(dublin_pos, cork_pos)
# Wolfram Alpha has a much more accurate implementation
# So check we are +/- 500m away
pre_calculated_distance_upper = to_round_dec(221 + 0.5)
pre_calculated_distance_lower = to_round_dec(221 - 0.5)
assert to_round_dec(dist) >= pre_calculated_distance_lower
assert to_round_dec(dist) <= pre_calculated_distance_upper
To check at least 1 set of values was what I expected. And with a bit of fudging it was (Wolfram Alpha has a much better algorithm!).
Conclusion
Property testing allows you to cover a large space of possible tests very quickly, with good feedback. It gave me a lot of confidence in my algorithm, especially as it found a couple of corner cases around NaN, 0 and inf that I would not have thought of when creating a Unit-Test.
Over the last few years I've been drifting more towards functional programming, writing discrete pure functions and composing them into higher-order functionality. This has even merged into my day to day work with Python. (I use a lot more map, filter, reduce over data).
And it's now clear to me why property based testing originated in that world, and to a certain extent has stayed there. I'd find this a lot harder to implement if I had to bundle state with functionality, but that's a post for a different day.
If you take away one thing from this post it should be to give it a try, at the very least it will enlighten you as to the much deeper set of Unit-Test values possible for even a simple algorithm. (And hypothesis is a most excellent library)
Thinking about the 2 testing approaches I'd summarize them as:
- Unit-Testing - checking what I expect.
- Property Testing - checking the laws/properties/truths of the function
Each as valuable as the other.