Hedgehog will eat all your bugs.
Hedgehog is a modern property based testing system in the spirit of QuickCheck, originally written in Haskell, but now also available in R. One of the key benefits of Hedgehog is integrated shrinking of counterexamples, which allows one to quickly find the cause of bugs, given salient examples when incorrect behaviour occurs.
To get a quick look of how Hedgehog feels, hereβs an example showing
some of the properties a function which reverses a vector should have.
Weβll be testing the rev
function from
package:base
.
test_that( "Reverse of reverse is identity",
forall( gen.c( gen.element(1:100) ), function(xs) expect_equal(rev(rev(xs)), xs))
)
## Test passed πΈ
The property above tests that if I reverse a vector twice, the result should be the same as the vector that I began with. Hedgehog has generated 100 examples, and checked that this property holds in all of these cases.
As one can see, there is not a big step from using vanilla
testthat
to including hedgehog in oneβs process. Inside a
test_that
block, one can add a forall
and set
expectations within it.
We use the term forall
(which comes from predicate
logic) to say that we want the property to be true no matter what the
input to the tested function is. The first argument to
forall
is function to generate random values (the
generator); while the second is the properties we wish to test.
The property above doesnβt actually completely specify that the
rev
function is accurate though, as one could replace
rev
with the identity function and still observe this
result. We will therefore write one more property to thoroughly test
this function.
test_that( "Reversed of concatenation is flipped concatenation of reversed",
forall( list( as = gen.c( gen.element(1:100) )
, bs = gen.c( gen.element(1:100) ))
, function(as,bs) expect_equal ( rev(c(as, bs)), c(rev(bs), rev(as)))
)
)
## Test passed π₯
This is now a well tested reverse function. Notice that the property
function now accepts two arguments: as
and bs
.
A list of generators in Hedgehog is treated as a generator of lists, and
shrinks all members independently. We do however do our best to make
sure that properties can be specified naturally if the generator is
specified in this manner as a list of generators.
Now letβs look at an assertion which isnβt true so we can see what our counterexamples looks like
test_that( "Reverse is identity",
forall( gen.c( gen.element(1:100) ), function(xs) expect_equal(rev(xs), c(xs)))
)
## ββ Failure: Reverse is identity ββββββββββββββββββββββββββββββββββββββββββββββββ
## Falsifiable after 2 tests, and 6 shrinks
## <expectation_failure/expectation/error/condition>
## rev(xs) not equal to c(xs).
## 2/2 mismatches (average diff: 1)
## [1] 2 - 1 == 1
## [2] 1 - 2 == -1
## Backtrace:
## β
## 1. ββhedgehog::forall(...)
## 2. ββhedgehog:::run.prop(property, counterexample$smallest, curry)
## 3. ββbase::tryCatch(...)
## 4. β ββbase (local) tryCatchList(expr, classes, parentenv, handlers)
## 5. β ββbase (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
## 6. β ββbase (local) doTryCatch(return(expr), name, parentenv, handler)
## 7. ββbase::withCallingHandlers(...)
## 8. ββbase::do.call(property, arguments)
## 9. ββ`<fn>`(`<int>`)
## 10. ββtestthat::expect_equal(rev(xs), c(xs))
## Counterexample:
## [1] 1 2
## Error:
## ! Test failed
This test says that the reverse of a vector should equal the vector,
which is obviously not true for all vectors. Here, hedgehog has run this
expectation with random input, and found it to not be true. Instead of
reporting it directly, it has shrunk the bad test case to the smallest
counterexample it could find: c(1,2)
. Hedgehog then reΓ«mits
this test error to testthat
, which handles it as per usual
and displays it to the user.
Hedgehog exports some basic generators and plenty of combinators for
making new generators. Hereβs an example analogous to calling
sample
on a small list, using the hedgehog generator
gen.sample
:
## Hedgehog generator:
## Example:
## [1] 3 4 5 1 2
## Initial shrinks:
## [1] 1 2 3 4 5
## [1] 1 3 4 5 2
## [1] 1 4 5 3 2
## [1] 2 4 5 1 3
## [1] 3 1 5 4 2
## [1] 3 2 5 1 4
## [1] 3 4 1 5 2
## [1] 3 4 2 1 5
This generator shrinks back towards the original list ordering the user supplied. Although only a few shrinks are shown above, these are actually just the first layer of a rose tree of possible shrinks. This integrated shrinking property is a key component of hedgehog, and gives us an excellent chance of reducing to the minimum possible counterexample.
test_that( "a is less than b + 1",
forall(list(a = gen.element(1:100), b = gen.unif(1,100, shrink.median = F))
, function(a, b) expect_lt( a, b + 1 ))
)
## ββ Failure: a is less than b + 1 βββββββββββββββββββββββββββββββββββββββββββββββ
## Falsifiable after 2 tests, and 8 shrinks
## <expectation_failure/expectation/error/condition>
## `a` is not strictly less than b + 1. Difference: 0
## Backtrace:
## β
## 1. ββhedgehog::forall(...)
## 2. ββhedgehog:::run.prop(property, counterexample$smallest, curry)
## 3. ββbase::tryCatch(...)
## 4. β ββbase (local) tryCatchList(expr, classes, parentenv, handlers)
## 5. β ββbase (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
## 6. β ββbase (local) doTryCatch(return(expr), name, parentenv, handler)
## 7. ββbase::withCallingHandlers(...)
## 8. ββbase::do.call(property, arguments)
## 9. ββ`<fn>`(a = 2L, b = 1)
## 10. ββtestthat::expect_lt(a, b + 1)
## Counterexample:
## $a
## [1] 2
##
## $b
## [1] 1
## Error:
## ! Test failed
The generators gen.c
, gen.element
, and
gen.unif
, are related to standard R functions:
c
, to create a vector; sample
, to sample from
a list or vector; and runif
, to sample from a uniform
distribution. We try to maintain a relationship to Rβs well known
functions inside Hedgehog.
Generators are also monads, meaning that one can use the result of a generator to build a generator. An example of this is a list generator, which first randomly chooses a length, then generates a list of said length.
The gen.map
function can be used to apply an arbitrary
function to the output of a generator, while gen.and_then
is useful in chaining the results of a generator.
In the following example, weβll create a generator which builds two
lists of length n
, then turn them into a
data.frame
with gen.with
.
gen.df.of <- function ( n )
gen.with (
list( as = gen.c(of = n, gen.element(1:10) )
, bs = gen.c(of = n, gen.element(10:20) )
)
, as.data.frame
)
test_that( "Number of rows is 5",
forall( gen.df.of(5), function(df) expect_equal(nrow(df), 5))
)
## Test passed π
While this is good, but we would also like to be able to create
data.frames
with a varying number of rows. Here, weβll
again test a property which is false in order to show how hedgehog will
find the minimum shrink.
gen.df <-
generate(for (e in gen.element(1:100)) {
gen.df.of(e)
})
test_that( "All data frames are of length 1",
forall( gen.df, function(x) expect_equal(nrow(x), 1))
)
## ββ Failure: All data frames are of length 1 ββββββββββββββββββββββββββββββββββββ
## Falsifiable after 1 tests, and 5 shrinks
## <expectation_failure/expectation/error/condition>
## nrow(x) not equal to 1.
## 1/1 mismatches
## [1] 2 - 1 == 1
## Backtrace:
## β
## 1. ββhedgehog::forall(gen.df, function(x) expect_equal(nrow(x), 1))
## 2. ββhedgehog:::run.prop(property, counterexample$smallest, curry)
## 3. ββbase::tryCatch(...)
## 4. β ββbase (local) tryCatchList(expr, classes, parentenv, handlers)
## 5. β ββbase (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
## 6. β ββbase (local) doTryCatch(return(expr), name, parentenv, handler)
## 7. ββbase::withCallingHandlers(...)
## 8. ββbase::do.call(property, arguments)
## 9. ββ`<fn>`(`<data.frame>`)
## 10. ββtestthat::expect_equal(nrow(x), 1)
## Counterexample:
## as bs
## 1 1 10
## 2 1 10
## Error:
## ! Test failed
Technically, that we can sequence generators is this way implies they
are monads, and we provide a number combinators for manipulating them in
this manner. Indeed, generate
is simply syntactic sugar for
monadic bind, sometimes referred to as βand thenβ.
The gen.with
function can be used to apply an arbitrary
function to the output of a generator, while gen.and_then
is useful in chaining the results of a generator.
R is a multi-paradigm programming language, while all the tests we have seen so far have tested functions which have no side effects (pure functions).
To deal with more complex situations which might arise in practice, Hedgehog also supports testing stateful system using a state machine model under random actions.
The general idea is that we can generate a model of the system, with requirements and post-conditions for every action we can take. With a random sequence of actions, we can test our model of the system against the true implementation. Hedgehog will then be able to identify inconsistencies between the true implementation and the model, from which the programmer can ask whether this is a bug in the model or a true bug in the system.
John Hughes has a series of excellent talks regarding testing of state based and non-deterministic systems using QuviQβs proprietary QuickCheck implementation, which has been using these techniques to great effect for many years.
Hedgehogβs current implementation in R is still quite young, and not nearly as feature rich, but does still allow for interesting properties in stateful systems to be investigated. See the state-machines vignette for more information.