Hypothesis 1.0: A property-based testing library for Python

bowyakka · on March 30, 2015

This library is awesome. I am a massive fan of randomized and quick-check style testing. This type of testing has found so many bugs for me it's surreal.

I love the fact that hypothesis is a proper quickcheck and not just the fuzzing part, and that the author has a crazy stack-based-vm-language-testing thing project that might be useful.

I have been using this library with py.test for 6 months now and it's been a godsend.

If I ever meet the author I would buy him beers.

DRMacIver · on March 30, 2015

Thanks so much for the kind words!

The best part of this message is that if you were someone I knew and thus could trust to only say nice things about me you would offer to buy me gin instead of beer. ;-)

bowyakka · on March 30, 2015

Gin is fine, perhaps meet at a PyCon one day :P

wyldfire · on March 30, 2015

The search strategy is described in the docs [1]. I wonder if they could capitalize on the algorithm NIST uses in their ACTS software [2]? It's great for maximizing time spent testing parameters in N different dimensions.

[1] http://hypothesis.readthedocs.org/en/latest/details.html#sea...

[2] http://csrc.nist.gov/groups/SNS/acts/

DRMacIver · on March 30, 2015

This looks super interesting and I'll check it out when I'm not doing release management stuff, thanks!

If you haven't seen it, http://hypothesis.readthedocs.org/en/latest/internals.html#p... describes the current algorithm for generation. It's smarter than is typical for quickcheck, but could definitely be improved.

e12e · on March 30, 2015

Interesting stuff. Can't wait to play with it.

One immediate reaction from reading about things like "@given(float)... def testFun(x): ... assume(not isnan(x))..." -- I would prefer to decorate/document my functions (that could be test functions, but more generally my "real" functions) with these pre-/post-conditions and assumptions, and generate the tests, something like:

    @assert(idempotent) #(inv(inv(x))==x)
    def inv(x):
      "Return -x"
      @given(float):
        @assume(not isnan(x))
      return -x

Typing this out, I can see why they go in the tests (avoids going down the rabbit hole of creating a partially typed version of Python...) -- which is why @assume(float) isn't on top; it'd imply inv didn't work for integers. Maybe a weaker:

    @assume(not isnan(x)) #implies x is number, but not nan
    @idempotent # inv(inv(x)) == x
    def inv(x):
      return -x

Now it should be possible to infer that inv(x) should be checked with numbers, and that it is idempotent. Might be a weak test unless one can specify a few other things - but maybe such pre/post "hints" could be combined to make better short tests? E.g. that one could give a different expression to check against rather than the function itself: def test_inv(): assert(inv(x), lambda y: -y)?

At any rate, interesting framework.

reverius42 · on March 30, 2015

I don't think that's what idempotent means. The only way f can be idempotent when f(f(x)) == x is if f(x) == x. (This implies that f is the identity function).

The property that makes the function idempotent is that successive application produces the same result as initial application: f(f(x)) == f(x).

e12e · on March 31, 2015

You're right, of course. I typed that on my cellphone, and was a little quick looking up alternatives for @double_application_turns_this_into_an_identity_function -- that made more sense, and was a little more succinct.

Maybe @left_inverse would be more appropriate? It's been a while since I had to classify functions.

jfarmer · on March 31, 2015

If you're curious, the math-y term for an operation that is its own inverse is involution. I don't know that I've ever heard the word in a programming context, but there it is! :)
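To make the distinction in this subthread concrete, here is a minimal sketch that checks both properties against sampled floats. It uses only the standard library rather than Hypothesis itself, and the helper names (`random_floats`, `is_involution`, `is_idempotent`) are made up for illustration. Negation should pass the involution check and fail the idempotence check, while `abs` does the opposite.

```python
import random


def random_floats(n, seed=0):
    """Yield n finite floats across many magnitudes (a crude stand-in for a real generator)."""
    rng = random.Random(seed)
    for _ in range(n):
        yield rng.uniform(-1.0, 1.0) * 10.0 ** rng.randint(-10, 10)


def is_involution(f, samples):
    """f is its own inverse: f(f(x)) == x for every sample."""
    return all(f(f(x)) == x for x in samples)


def is_idempotent(f, samples):
    """A second application changes nothing: f(f(x)) == f(x) for every sample."""
    return all(f(f(x)) == f(x) for x in samples)


def inv(x):
    """Return -x."""
    return -x


# A few hand-picked values guarantee that a negative and a zero are present.
samples = [-2.5, 0.0, 1.0] + list(random_floats(1000))

print(is_involution(inv, samples))  # True
print(is_idempotent(inv, samples))  # False
print(is_idempotent(abs, samples))  # True
print(is_involution(abs, samples))  # False
```

NaN is left out of the samples on purpose, for the same reason the original example calls assume(not isnan(x)): NaN never compares equal to itself, so both checks would fail on it regardless of the function.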

tmoertel · on March 31, 2015

An involution played a rather important role in this programming context:

http://blog.moertel.com/posts/2013-12-14-great-old-timey-gam...

e12e · on March 31, 2015

Thank you! A quick look through my bookshelf reveals that out of four books on discrete mathematics, only one contains involution in the index, and there it is mentioned once in a supplementary exercise... I guess that explains why I couldn't seem to recall a term for "self-inverse" [ed: No, that's wrong too, lol. Let's just stick with involution].

jfarmer · on March 31, 2015

The word "involution" finds more use in analysis and algebra, where self-inverses have more interesting (often geometric) properties. The only interesting property I can think of in the context of discrete mathematics would be that if S is a finite set and f:S → S is an involution then the parity of |S| is equal to the parity of the number of fixed points of f.

That is,

    |S| ≡ |Fix(f)| (mod 2)

where Fix(f) = {x in S : f(x) = x}.
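The reason is that every element that is not a fixed point is swapped with exactly one partner, so the non-fixed elements come in pairs. A small sketch that checks the congruence on randomly built involutions, where `random_involution` and `fixed_points` are hypothetical helper names and the involution is stored as a dict:

```python
import random


def random_involution(elements, rng):
    """Build an involution by pairing up some elements and fixing the rest."""
    items = list(elements)
    rng.shuffle(items)
    f = {}
    while items:
        x = items.pop()
        if items and rng.random() < 0.5:
            y = items.pop()
            f[x], f[y] = y, x  # a 2-cycle: x <-> y
        else:
            f[x] = x  # a fixed point
    return f


def fixed_points(f):
    """Fix(f) = {x in S : f(x) = x}."""
    return {x for x, y in f.items() if x == y}


rng = random.Random(0)
for size in range(50):
    f = random_involution(range(size), rng)
    assert all(f[f[x]] == x for x in f)            # f really is an involution
    assert len(f) % 2 == len(fixed_points(f)) % 2  # |S| ≡ |Fix(f)| (mod 2)
print("parity property held for every size tried")
```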

tekacs · on March 30, 2015

Might be more useful pointed [here][1], perhaps?

[1]: http://hypothesis.readthedocs.org/en/latest/

pyre · on March 31, 2015

Well, it is a link to the "official" announcement, and the announcement directly links there, but I can see your point too.

michaelmior · on March 30, 2015

GitHub https://github.com/DRMacIver/hypothesis

lgierth · on March 30, 2015

Is this similar to Mutation Testing, where the code being tested is mutated, in order to identify unspecified behavior?

Mutant is a Ruby library for this: https://github.com/mbj/mutant

DRMacIver · on March 30, 2015

No, it's much closer to classic fuzz testing, where the test is held constant and the data fed to it is varied. I've been meaning to see if I can figure out a way to use mutation testing to feed into Hypothesis, but it's on the long list of things I'll try "at some point"

baq · on March 30, 2015

Could you use something like the recently open-sourced Z3 to assist in finding minimal examples? (http://en.wikipedia.org/wiki/Concolic_testing)

DRMacIver · on March 30, 2015

This is also on the "at some point" list. :-)

Concolic testing is quite hard in Python because of its extremely flexible semantics. I will probably look into doing something with z3 at some point when I need an interesting problem to entertain me, but I don't hold out a massive amount of hope for it being useful.

ayrx · on March 31, 2015

This is a very awesome library. I have been integrating it into the test suites of my various projects this past week, and the author has been more than helpful, even when I filed bug reports that turned out to be issues in my own code.

IanCal · on March 31, 2015

Slightly OT ramble ahead :)

> even when I filed bug reports that turns out to be issues in my own code.

Every time I've used or built a property based testing system I've had problems when starting because there's something broken in it. This has then turned out to be an actual bug in the system being tested.

Building a quickcheck style tester in AS3 drove me nuts at one point before finding out that the conversions between strings and floating point numbers have a bunch of weird issues. There are some obvious ones, but then also things like certain numbers convert differently if you add a trailing zero (so 0.7362856270 is turned into a different number to 0.736285627 and 0.73628562700). So my little "decode(encode(x)) == x" test example broke!

My favourite thing that it found was in a menuing library we were building (and why I built the tester). I set it up to test a library by making library calls that a developer might make and test various properties of the overall menu (all elements reachable, for some if you move left then right you're back on the same item, etc). One property was that if there was at least one item in the list and the list had focus, then one item in that list had focus. This was found to be broken by starting with only one item, removing it and adding a new one (reduced test cases are incredibly useful).

That's a fairly boring bug, but the interesting thing (to me) was that when I fixed it my unit tests broke. I had an explicit test to ensure that this behaviour was happening, and I'd also written it down in my spec for what should happen.

The testing tool forced me to consider higher level things of what should be true and drove out an inconsistency in my library. It's a 'bug' that would have bitten me many times, but rarely enough that it would probably have regularly ended up going live.
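The menu story above can be sketched in a few lines of Python. Everything here is hypothetical: the `Menu` class is a toy with a deliberately planted bug, not the AS3 library, and `run`, `shrink` and `find_counterexample` are illustrative names for a very naive random tester with greedy shrinking. The invariant is the one described above: if the list has at least one item, one of its items has focus.

```python
import random


class Menu:
    """Toy menu with a planted bug (hypothetical, for illustration only)."""

    def __init__(self):
        self.items = []
        self.focused = None
        self._next_id = 0
        self._had_focus = False

    def add(self):
        item = self._next_id
        self._next_id += 1
        self.items.append(item)
        if not self._had_focus:  # planted bug: focus is only assigned on the first add ever
            self.focused = item
            self._had_focus = True

    def remove(self):
        if not self.items:
            return  # removing from an empty menu is a no-op
        removed = self.items.pop()
        if self.focused == removed:
            self.focused = self.items[0] if self.items else None


def run(ops):
    """Replay ops on a fresh menu; return True if the invariant held throughout."""
    menu = Menu()
    for op in ops:
        getattr(menu, op)()
        if menu.items and menu.focused not in menu.items:
            return False
    return True


def shrink(ops):
    """Greedily delete single operations while the sequence still fails."""
    changed = True
    while changed:
        changed = False
        for i in range(len(ops)):
            candidate = ops[:i] + ops[i + 1:]
            if not run(candidate):
                ops, changed = candidate, True
                break
    return ops


def find_counterexample(tries=1000, seed=0):
    """Try random operation sequences and return a shrunk failing one, if any."""
    rng = random.Random(seed)
    for _ in range(tries):
        ops = [rng.choice(["add", "remove"]) for _ in range(rng.randint(1, 20))]
        if not run(ops):
            return shrink(ops)
    return None


print(find_counterexample())  # ['add', 'remove', 'add']
```

The shrunk sequence matches the reduced case from the story: start with one item, remove it, add a new one. The raw random failure is usually much longer, which is why the reduction step matters.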

DRMacIver · on March 31, 2015

> Building a quickcheck style tester in AS3 drove me nuts at one point before finding out that the conversions between strings and floating point numbers has a bunch of weird issues. There are some obvious ones, but then also things like certain numbers convert differently if you add a trailing zero (so 0.7362856270 is turned into a different number to 0.736285627 and 0.73628562700). So my little "decode(encode(x)) == x" test example broke!

I have in fact had to deal with exactly this problem in Hypothesis internals. The example saving code didn't work correctly with floats in the first edition, because it was serializing them as JSON, which loses some information encoded in the actual float. In the end I serialize floats (which are actually doubles) by converting them to a 64-bit integer with the same bitwise representation first.
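As a rough illustration of that trick (a sketch using the standard `struct` module, not the actual Hypothesis code; `float_to_int` and `int_to_float` are made-up names), a double can be reinterpreted as an unsigned 64-bit integer and back without losing any bits:

```python
import math
import struct


def float_to_int(x):
    """Reinterpret the 8 bytes of a double as an unsigned 64-bit integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def int_to_float(n):
    """Inverse of float_to_int: reinterpret a 64-bit integer as a double."""
    return struct.unpack("<d", struct.pack("<Q", n))[0]


for x in [0.0, -0.0, 0.1, 1e308, 5e-324, math.inf, -math.inf, math.nan]:
    n = float_to_int(x)
    y = int_to_float(n)
    # Compare bit patterns rather than values: nan != nan, and 0.0 == -0.0 under ==.
    assert float_to_int(y) == n

print(hex(float_to_int(1.0)))  # 0x3ff0000000000000
```

The integer survives any text format that handles 64-bit integers exactly, so the sign of zero and the NaN case come back as they went in.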

The bug ayrx is talking about, though, actually illustrates an amusing feature of Hypothesis, which is that the code in it is just weird enough that it tends to do unexpected things that trigger bugs which have nothing to do with the properties being tested. :-) In this case it was putting unicode objects onto sys.path, which is 100% allowed but causes problems for certain code running on Python 2.7 on Windows that previously appeared to work.

IanCal · on March 31, 2015

> I have in fact had to deal with exactly this problem in Hypothesis internals.

Hah, wonderful. It's a great example of how this type of testing can really dig out odd bugs. Once you've tried property based testing, I think you never really trust things quite the same :)

> The bug aryx is talking about though actually illustrates an amusing feature of Hypothesis, which is that the code in it is just weird enough that it tends to do unexpected things that trigger bugs which have nothing to do with the properties being tested

Nice :)

Thanks for releasing this library, it's great to see more work being done in this area. There are usually a few random testing libraries floating about but adding things like minimisation (and I'm still reading through the templating & other new stuff) really makes it stand out.

The API testing example in particular actually comes at a perfect time for me, so it'll definitely be getting some use.

DRMacIver · on March 31, 2015

> Once you've tried property based testing, I think you never really trust things quite the same :)

Oh god, tell me about it. So far I've hit 4 pypy bugs, 3 cpython bugs, two pytz bugs (one caused by a cpython bug), and I've learned far more about the edge cases of the language than I ever wanted to know.

Let's just not talk about the number of bugs I've found in Hypothesis itself in the course of testing bits of the system I was sure worked perfectly.

evancordell · on March 30, 2015

This is awesome! And it comes right as I've been getting my feet wet with Haskell (and property-based testing).

It was really easy to get up and running with a simple test: https://github.com/ecordell/pymacaroons/blob/property-tests/...

and I'm excited to use it in more places. Though I'll have to configure tox/travis not to run hypothesis tests on Python 2.6.

DRMacIver · on March 31, 2015

Cool! I've added a comment to your commit as I think you've made a mistake in the strategy setup, but it looks good as a test.

Sorry about the lack of 2.6 support. I looked into it and I could probably do it but I just don't care enough about 2.6 to put in the work and make everything uglier. :-)

coolrhymes · on March 30, 2015

I love this library and, like others, have found it a godsend. One thing I want to try is using it with the Python mock library to mock services that return randomized data of some kind.

dirtyaura · on March 31, 2015

I read about QuickCheck over 10 years ago at uni, was reintroduced to property-based testing this week at the HelsinkiJS meetup, and now this Python implementation has hit Hacker News.

Question: most of my backend code is rather dull CRUD and business logic code. Are there good tutorials about randomly generating objects with a few fields and relationships between them, and testing business logic using Hypothesis?

DRMacIver · on March 31, 2015

Have you seen http://hypothesis.readthedocs.org/en/latest/examples.html#fu... ? It doesn't test very much - mostly just that the API doesn't produce a 500 error - but it's a decent example of how you can generate structured data with some constraints. http://hypothesis.readthedocs.org/en/latest/examples.html#co... is also a decent example where the data is more uniform but requires a bunch of massaging to satisfy the constraints.

leondutoit · on March 30, 2015

Thanks for a great library, can't wait to use it.