--- title: "Hedgehog - State Machine Testing" author: "Huw Campbell" date: "`r Sys.Date()`" output: github_document vignette: > %\VignetteIndexEntry{Hedgehog state machine tutorial} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, echo = FALSE, message = FALSE} library(hedgehog) set.seed(1014) snoc <- function (xs, x) { unlist ( list ( xs, list( x)) , recursive = F ) } ```

> Hedgehog will eat all your bugs. State machine testing in [Hedgehog](http://hedgehog.qa/) is an interesting and expressive testing paradigm for R. The goal is to test object oriented code or complex systems involving state and hidden or abstracted functionality. John Hughes has a series of excellent [talks][jh-dropbox] regarding testing of state based and non-deterministic systems using QuviQ's proprietary QuickCheck implementation, which has been using these techniques to great effect for many years. Here's a quick example: the following code is a reference class which maintains a lookup for values, with integer keys. One could also imagine writing a fast implementation of this in c. If you haven't seen reference classes before, that's ok – one can think of them as akin to C++ classes. ```{r echo = T, message = F} refs <- setRefClass("Refs", fields = list( num = "numeric" , refs = "list" ) , methods = list( initialize = function() .self$reset() , newRef = function() { .self$num <- .self$num + 1 .self$refs[[.self$num]] <- 0 return ( .self$num ) } , readRef = function(i) { return ( .self$refs[[i]] ) } , writeRef = function(i, a) { .self$refs[[i]] <- a invisible(NULL) } , reset = function() { .self$num = 0 .self$refs = list() invisible(NULL) } ) ) grefs <- refs$new() ``` We use the `grefs` value as a global system which we will interact with. How would we test this object? Using hedgehog, we will simulate a model for this class as a list of tuples, one for naming the map entry, and one for its value. For example: ```{r echo = T, message = F} list(key = 2, val = 0) ``` The commands we can use to interact with the reference class api are creating a new map entry, writing to a map entry, and reading a map entry. We will define a hedgehog command which will be used to test the `newRef` functionality of the class. ```{r echo = T, message = F} new <- command ( "New", generator = function( state ) list() , execute = function() grefs$newRef() , update = function( state, output ) snoc( state, list(key = output, val = 0)) ) ``` The `command` function takes a name of the function, which will be shown when presenting counterexamples, as well as a set of functions which are used when building and exercising this class's functionality. The generator function is provided with the current state of the system while we are running the generators and generates the list of arguments which should be passed to the `new` function while it is being executed. Here, the `new` command doesn't take an input, so this is the empty list (the arguments are curried to the property functions using `do.call`). The `execute` function will take the inputs generated by the generator, and applies them. In our case, there is no input (as mentioned above), so this function takes no arguments. Inside the `execute` function we actually call the `newRef()` function of the class. Finally the `update` function shown will add a new value to the test's model of the system. This function is run both at generation time and during execution of the property. This is interesting because during generation, there isn't actually a value for the output which we can pass to this function. Instead, we pass a symbolic value representing the output and what it will be during execution. This does have one interesting side effect, in that we can not permit one to look at or examine the value of the output in this function. It must be agnostic to whether the value is the type it expects. For `new` the update function is rather simple, and just adds a new map entry with value `0` to the model. We can now test this functionality with an initial state. ```{r echo = T, message = F} initialmodel <- list() ``` ```{r echo = T, message = F} test_that( "Registry State Machine Model", forall( gen.actions ( initialmodel, list(new) ), function( actions ) { grefs$reset() expect_sequential( initialmodel, actions ) }) ) ``` And it passes. Hedgehog has generated a random list of commands (all of which are `new`) and run them. Let's look at the functions we have used: `gen.actions` is a generator which will build a list of commands to call, ensuring that the preconditions are sound, and providing sensible shrinks; the test, `expect_sequential`, is an expectation which will run the actions and the post condition expectations. Notice that inside the `forall`, we also call `grefs$reset()`; this ensures that the global state is in a pristine condition before the tests are run. There aren't actually any expectations in this model, so it's not too surprising that the test passes. Now we can make this test more interesting and will add a command for the `readRef` functionality of the system. ```{r echo = T, message = F} read <- command ( "Read", generator = function( state ) { if ( length(state) == 0 ) return(NULL) list( key = gen.with(gen.element( state ), function(i) i$key ) )} , require = function( state, key ) !is.null ( Find( function( proc ) { proc$key == key } , state ) ) , execute = function( key ) grefs$readRef(key) , ensure = function( state, output, key ) { expected <- Find( function( proc ) { proc$key == key } , state )$val expect_equal( expected, output) } ) ``` There quite a few interesting things going on here. Firstly the generator now is a function which will choose one of the `keys` to read. Notice that if the model is currently empty, we return `NULL`. This is one of the ways we ensure we only run `read` after a call to `new` has been performed. The `require` function is also applicable here, but furthermore tests that the generated function inputs are valid – this is important during shrinking, as we don't want to cull a command during shrinking and then end up with an invalid state. The `execute` function now takes the `key` to read, and can pass it to `readRef`. Finally, we have an `ensure` function. After execution, we run this function, and any `testthat` expectations within it. Ok, let's add the final command: ```{r echo = T, message = F} write <- command ( "Write", generator = function( state ) { if ( length(state) == 0 ) return(NULL) list ( key = gen.map( function(i) i$key, gen.element( state )) , val = gen.int(10) )} , require = function( state, key, val ) !is.null ( Find( function( proc ) { proc$key == key } , state ) ) , execute = function( key, val ) grefs$writeRef( key, val ) , update = function( state, output, key, val ) lapply( state, function(proc) if (proc$key == key) list(key = proc$key, val = val) else proc ) ) ``` Now we can run the tests with `new`, `read`, and `write` commands. ```{r echo = T, message = F} test_that( "Registry State Machine Model", forall( gen.actions ( initialmodel, list(new, read, write) ), function( actions ) { grefs$reset() expect_sequential( initialmodel, actions ) }) ) ``` These tests now pass, meaning our model and our system are consistent. Let's now subtly break something, so we can show how counterexamples and shrinking work. What we'll do is write an incorrect `write` function, which writes a value incremented by `1`, instead of the correct value. This will cause the model and reality to clash. ```{r echo = T, message = F} writeIncorrect <- command ( "Write (Broken)", generator = function( state ) { if ( length(state) == 0 ) return(NULL) list ( key = gen.with( gen.element( state ), function(i) i$key) , val = gen.int(10) )} , require = function( state, key, val ) !is.null ( Find( function( proc ) { proc$key == key } , state ) ) , execute = function( key, val ) grefs$writeRef( key, val + 1) , update = function( state, output, key, val ) lapply( state, function(proc) if (proc$key == key) list(key = proc$key, val = val) else proc ) ) ``` Now we'll run the expectation with the incorrect write function as well. ```{r echo = T, message = F, error=TRUE} test_that( "Registry State Machine Model", forall( gen.actions ( initialmodel, list(new, read, write, writeIncorrect) ), function( actions ) { grefs$reset() expect_sequential( initialmodel, actions ) }) ) ``` One can see that the minimal shrink for incorrect behaviour has been found: where we create a new value; write to it with the broken function; and read the value, finding it is inconsistent with our model. [jh-dropbox]: https://www.youtube.com/watch?v=H18vxq-VsCk