Writing a Model Checker in 80 Days:Reusable Libraries and Custom Implementation

Electronic Communications of the EASST
Volume 076 (2019)

Automated Verification of Critical Systems 2018
(AVoCS 2018)

Writing a Model Checker in 80 Days:
Reusable Libraries and Custom Implementation

Jessica Petrasch, Jan-Hendrik Oepen, Sebastian Krings, Moritz Gericke

18 pages

Guest Editors: David Pichardie, Mihaela Sighireanu
ECEASST Home Page: http://www.easst.org/eceasst/ ISSN 1863-2122

http://www.easst.org/eceasst/

ECEASST

Writing a Model Checker in 80 Days:
Reusable Libraries and Custom Implementation

Jessica Petrasch, Jan-Hendrik Oepen, Sebastian Krings, Moritz Gericke

Institut für Informatik, Universität Düsseldorf
Universitätsstr. 1, D-40225 Düsseldorf

{jessica.petrasch,jan-hendrik.oepen,moritz.gericke}@hhu.de
krings@cs.uni-duesseldorf.de

Abstract: During a course on model checking we developed BMoth, a full-stack
model checker for classical B, featuring both explicit-state and symbolic model
checking. Given that we only had a single university term to finish the project, a
particular focus was on reusing existing libraries to reduce implementation work-
load.

In the following, we report on a selection of reusable libraries, which can be com-
bined into a prototypical model checker relatively easily. Additionally, we discuss
where custom code depending on the specification language to be checked is needed
and where further optimization can take place. To conclude, we compare to other
model checkers for classical B.

Keywords: Model Checking, SMT Solving, Libraries, University Course

1 Introduction and Motivation

During a course on model checking we developed BMoth, a prototypical model checker for
classical B [Abr96]. BMoth is based on a translation to SMT-LIB, using the well-known SMT
solver Z3 [MB08] as backend. Our model checker is mostly written in Java and a clean-room
implementation not built upon other model checkers. This allows us to use BMoth as a secondary
toolchain for other B method tools. BMoth’s source code, benchmarks used in this article and
further information can be found on https://github.com/bmoth-mc.

A typical university course on model checking features different techniques such as explicit-
state and symbolic model checking as well as temporal model checking, e.g., using LTL for-
mulas. All algorithms and techniques discussed in the lectures were supposed to be at least
rudimentarily implemented by the students in the programming project, whereas parser and type
checker were given by the tutors. To achieve the goal of developing the full model checker during
a single university term, we focused on the integration of different publicly available general pur-
pose libraries. The course itself has been documented individually [KKS19], this article mainly
deals with the technical details of the model checker we implemented.

We believe that our selection of libraries and their composition can be useful to other devel-
opers and help developing model checkers for new languages. To do so, the following article

• introduces the B language and some of its key concepts,

1 / 18 Volume 076 (2019)

https://github.com/bmoth-mc

Writing a Model Checker in 80 Days

• identifies several publicly available libraries that can be used to implement various parts
of a typical model checker,

• highlights how these libraries can be integrated,

• exemplarily discusses language-dependent implementation on top, and

• evaluates BMoth by comparing to the state-of-the-art model checker PROB.

2 A Primer on the B-method

The formal specification language B and the corresponding development method called the B-
method [Abr96] follow the correct-by-construction approach. Their models consist of a set of
machines. A machine consists of variable and type definitions together with a predicate describ-
ing the initial states. By defining machine operations, one is able to specify transitions between
states. A machine operation has a unique name and consists of B substitutions [Abr96] defining
the machine state after its execution. An operation can have a precondition allowing or prohibit-
ing execution based on the current state.

Transitions can be non-deterministic and might be nested. Furthermore, B features differ-
ent constructs that are executed atomically to define operations, ranging from simple variable
assignment to if-then-else and while loops as well as parallel execution.

To ensure correctness of a specification, the user can define machine invariants, i.e., safety
properties that have to hold in every state. Inside predicates and expressions, one can make use
of a multitude of high-level language constructs featuring arithmetics, set logic and set compre-
hensions, sequences, as well as existential and universal quantification.

Besides using data types explicitly provided by the B language, one can define custom types in
the form of sets. A set is defined by a unique name and may be initialized by a finite enumeration
of distinct elements. Sets not enumerated are called deferred sets which are assumed to be non-
empty and finite.

Below, we focus on classical B [Abr96], but our approach also works for Event-B [Abr10] and
could be extended to other state-based specification languages such as TLA+ [Lam94]. While
a simpler language than B would certainly be a more appropriate target for a university course,
we decided to stick with what is used at our chair. In particular, this allows us to compare to our
existing tools and to transfer knowledge gained while developing BMoth.

3 Libraries and Tools Used

As stated in the introduction, we focused on integrating different reusable libraries, avoiding to
reinvent the wheel as much as possible. Below, we introduce the libraries and tools we used and
discuss how they contribute to our model checker. Figure 1 gives an overview of the libraries
and where they are employed.

AVoCS 2018 2 / 18

ECEASST

3.1 ANTLR

ANTLR [Par13] is a parser generator getting a grammar as input. In our case, we were able to
closely follow a subset of the B language grammar given by Abrial [Abr96]. See Table 1 for
a list of supported and unsupported parts of classical B. ANTLR then automatically generates
code, in our case Java classes, for lexing and parsing using accepting finite-state machines.

While parser and lexer are generated automatically, the resulting AST is often too concrete
and has to be restructured into a more abstract syntax tree. This is done manually using rewriting
rules on top of the generated classes. Details are given in Subsection 4.1.

3.2 JGraphT

JGraphT [Nav] handles storage of the state space used for explicit-state and LTL model check-
ing. As the state space is just a directed graph, using a general purpose graph library seems
appropriate. Furthermore, this allowed us to use efficient (graph focused) algorithms provided
by JGraphT for finding shortest paths between two nodes, e.g. an initialisation node and a node
occurring after a few steps, and for finding strongly connected components in the state space.

3.3 Z3

We use the SMT solver Z3 [MB08] as a backend for checking the satisfiability and computing
valuations of formulas. This is done both for computing states as well as for invariant checking.
Furthermore, for LTL model checking, Z3 computes state transitions in a Büchi automaton. This
will be discussed in Subsection 4.4. We connect to Z3 using its Java API.

Of course, Z3 and its input language SMT-LIB do not support all high-level constructs avail-
able in B. In consequence, we have to translate B to SMT-LIB. More concrete, we translate into a
logic supporting quantifiers, integer arithmetic and set theory. While for most language features
the translation can be implemented using a simple AST walker on top of ANTLR, translation is
more complicated for others. As an example, we will look at the translation of exponentiation
in Subsection 4.1, where we will discuss different translation alternatives and their performance
impact.

3.4 UI libraries

We decided to create a graphical user interface for more flexibility (especially while testing), as it
allows the user to edit machines directly in BMoth. For this GUI we used JavaFX in combination
with MvvmFX [CM]. In particular, we follow the MVVM pattern. To create an efficient editor
for B models we used RichTextFX [Mik], which also allows custom syntax highlighting.

3.5 Infrastructure

For source code management we worked with GitHub. Code quality and performance where
surveyed and improved using Travis CI [Tra] and SonarQube [Son]. While not as crucial as
using the right set of libraries, infrastructure had a tremendous impact on progress. While GitHub
provided us with the means of remote synchronization and project management, Travis CI and

3 / 18 Volume 076 (2019)

Writing a Model Checker in 80 Days

Frontend

Backend

Libraries

Controlling UI

B code Result

Parser
AST

Rewrite
Translation

SMC

ESMC LTL
State Space

ANTLR

JGraphT

JavaFX

MvvmFX

RichTextFX

Figure 1: Architecture overview and libraries used

SonarQube helped us to detect bugs early and thus avoid spending time fixing them. Without the
right infrastructure, our project would have been delayed way beyond the scope of a university
term.

4 Implementation Details

In this section, we will discuss several parts of our implementation, focusing on the libraries
used to realize different aspects of our model checker. In particular, we will discuss custom, e.g.,
language-dependent, implementation where needed.

4.1 Backend and General Translation of B to Z3

A key question within the development process of BMoth was how to connect a suitable backend
to the JVM-based parts of the project. Our approach consisted in using the Z3 solver, connected
via the Z3 Java API for solving B formulas translated to SMT. In particular, we translate into a
logic natively supporting sets and quantifiers.

The underlying logic is in general undecidable, i.e., we have to account for Z3 timing out
or returning with an unknown result. However, we often call Z3 on predicates that are at least
partially valuated, e.g., when computing successor states to already known states. This can be
exploited by using encodings of B into SMT-LIB that perform well on known values. As an
example, we will discuss how we encoded exponentiation in Subsection 4.2.

Each input instance of BMoth contains a specification of a machine written in B and a property
given as a B or LTL formula to be checked. As of course B and LTL cannot be directly processed
by Z3, a translation process from B to an SMT-LIB constraint is needed. An overview of the full
translation process is visualized in Figure 2.

If an input instance contains an LTL formula, it is normalized, reducing available features in
order to simplify translation rules. LTL conversion is covered in more detail in Subsection 4.4.
An input instance may contain both LTL properties (given as separate formula) and B properties
(as machine invariants or as separate formula), which will individually be checked by BMoth.

In order to be evaluated, a B predicate is parsed to a concrete syntax tree using the lexer and

AVoCS 2018 4 / 18

ECEASST

Input Formula as String

LTL Part B Part

BMoth LTL CST representation BMoth B CST representation

BMoth LTL AST representation BMoth B AST representation

Z3 Java API AST representation

ANTLR ANTLR

Java objects

Custom Code Custom Code

Java objects

Custom Code

Figure 2: Translation process from user input to SMT

parser generated by ANTLR. Afterwards, the CST is simplified to an AST using common com-
piler construction techniques [ASU86]. Following, we implemented a syntax-directed translation
to SMT-LIB using Z3’s Java API.

While the syntax-directed conversion visits each node of the AST and replaces it with an
equivalent node of Z3’s AST, in some cases additional constraints are generated, e.g., for well-
definedness. Any additional constraints are conjoined appropriately after translation.

The approach chosen allows direct conversion of nodes in many cases, namely where BMoth’s
B node has a semantically equivalent node in the Z3 AST and no additional constraints are re-
quired. Examples are, among others, =/2, ≤/2, +/2, -/2, -/1, ∈/2, ⊆/2, ∪/2 and >/0. A coun-
terexample would be the division :/2, which requires an additional constraint, stating that the
divisor may not be zero, in B. A significant number of node types can be transformed by simple
additions or rewrites. Examples are B’s 6=/2, which corresponds to Z3’s ¬(=/2), or ⊂/2, which
can be split into ∧(6=/2, ⊆/2).

The AST rewrites are easily implemented on top of the AST provided by ANTLR. Listing 1
shows the code used to rewrite ex pr ∈N into ex pr ≥0. Essentially, we just have to check whether
the node to be rewritten has the ∈ operator and whether the right argument is N. If so, we return
a new AST node with the target predicate, which is inserted into the AST replacing the old node.

Just like formulas, BMoth parses machines into an abstract syntax tree representation before
further conversion. Afterwards, before-after-predicates are computed for all operations and Z3
representations of all occurring predicates are precomputed and stored.

Note that there is no closed representation of B’s while loops in terms of a before-after-
predicate. The definition given by Abrial [Abr96] only states that the loop condition has to
evaluate to false in the after state and that the invariant should evaluate to true. The effect of the

5 / 18 Volume 076 (2019)

Writing a Model Checker in 80 Days

Listing 1: AST rewriting
i f ( n o d e . g e t O p e r a t o r ( ) == ELEMENT OF ) {

E x p r N o d e l e f t = n o d e . g e t E x p r e s s i o n N o d e s ( ) . g e t ( 0 ) ;
E x p r N o d e r i g h t = n o d e . g e t E x p r e s s i o n N o d e s ( ) . g e t ( 1 ) ;
i f ( r i g h t i n s t a n c e o f O p e r a t o r N o d e

&& ( ( O p e r a t o r N o d e ) r i g h t ) . g e t O p e r a t o r ( ) == NATURAL) {
r e t u r n new O p e r a t o r W i t h A r g s N o d e ( . . . , GREATER EQUAL ,

A r r a y s . a s L i s t ( l e f t , new NumberNode ( . . . , ZERO ) ) ) ;
}

}
r e t u r n n o d e ;

loop however has to be computed using a fixpoint operation. In case of BMoth, this means that
we cannot model check machines using the current evaluation architecture. However, one could
think of a small-step semantic for B, in which the body of a while loop is computed individually
until the loop condition eventually evaluates to false. This would enable explicit-state model
checking using our architecture. For the symbolic model checking algorithms, other approaches
are needed and should be part of future research both in BMoth and PROB.

Once all additional predicates are computed, all model checking algorithms solely rely on the
SMT-LIB translation of predicates, while the B representation is only used for textual represen-
tation and user feedback.

As can be seen in Table 1, the list of language features implemented in BMoth only covers
a subset of B. Of those not yet implemented, some can be considered variants of implemented
features that they can be rewritten to to be expressed. For instance, relations can be expressed via
set theory constraints, while the missing kinds of functions can be constructed in a more general
way as relations with additional constraints (consisting of already implemented features).

What remains among those features not implemented so far, but yet has to be taken care of
in an efficient way, is the cardinality of sets. In principle, it could be dealt with by finding a
bijection to an integer interval. However, that yields an expensive solution, which often does not
perform well [KL16b].

4.2 Custom Encodings

The system design and selection of libraries we presented facilitates experimenting with differ-
ent encodings and other optimization techniques. An example illustrating the flexibility of our
compositional approach is the exponentiation expression **/2 in BMoth.

Due to the undecidability of the underlying logic, Z3 performed rather poorly when confronted
with a combination of quantifiers and non-linear arithmetic (in particular exponentiation). While
both individually could often be solved, test cases combining both usually resulted in the con-
straints being unsolvable for Z3. We investigated different possible solutions:

• use other backends, possibly in parallel with Z3,

• changing the way BMoth handles quantified formulas, e.g., try unrolling them where pos-

AVoCS 2018 6 / 18

ECEASST

Table 1: Overview of implemented and unimplemented features

implemented not implemented
comparison
predicates

=, <, >, ≤, ≥, 6=

arithmetics +, *, -, /, %, ˆ
basic logical
predicates

∧, ∨, true, false, ∀, ∃, ⇒

set predicates ∈, /∈, ⊆, *, ⊂, 6⊂, ×

basic sets
deferred and enumerated sets, /0,
intervals, Z, N, N+, INT, NAT,

NAT1, BOOL

set operators
\, ∪, ∩,

⋃
, domain, range, set

enumeration, min, max
P, finite subsets,

⋂
, cardinality,

set summation, set product
tuples/se-
quences

couples, enumerated sequences,
empty sequence

sequence
operators

front, last, first

set of finite subsequences,
permutations, concat, prepend,

append, size, reverse, take,
drop, tail, strings

relations inverse

forward composition, backward
composition, identity, relational
image, override, domain/range

restriction/subtraction

functions function call
partial functions (and less
general variants), lambda

abstraction

sible,

• changing the way BMoth handles exponentiation, and

• search for ways to alter Z3’s behavior by configuration of Z3.

Since the undecidability limits Z3’s support for non-linear integer arithmetic combined with
quantifiers, relying on a different backend might have helped. However, additional solvers would
have been a rather severe change in design, quite impossible to perform within our time restric-
tions. We thus decided to evaluate the other approaches first.

In comparison, examining various encodings of exponentiation was a comparatively light-
weight solution and thus appeared more promising. Our alternative approach to Z3’s internal
exponentiation consists in translating the exponentiation nodes of the internal B AST to a recur-
sive function. By doing so, we can force the solver to use distinct ways of evaluation.

More concretely, exponentiation is realized using the well-known square-and-multiply method1.
1 Square-and-multiply significantly reduces the number of multiplications, thus improving performance when com-

7 / 18 Volume 076 (2019)

Writing a Model Checker in 80 Days

Square-and-multiply is introduced in Z3 by defining a function and recursively assigning a
value to certain invocations by means of universal quantification. In particular, the special cases
x0 and x1 are used as termination cases of the recursion. The recursive cases implement the
square-and-multiply pattern.

Of course this alternative encoding does not help Z3 proving formulas involving exponen-
tiation in every case, but some of the problematic combinations ceased posing an obstacle to
checking satisfiability and finding a valuation. For certain models, the approach was able to
compensate for Z3’s inability to solve combinations of quantifiers and non-linear arithmetic,
showing a notable advantage of the flexible architecture used in BMoth.

4.3 Explicit-State Model Checker

BMoth’s explicit-state model checker (ESMC) works on a state space, a directed graph made up
of the possible states of the system. These states are successively checked for invariant violations.
To build this state space we use Z3. For each new state found by evaluating the conjunction of
an already discovered state and the before-after-predicate, a new vertex is added to JGraphT’s
representation of the state space.

The basic explicit-state model checking algorithm [CGP99] is simple: Determine the initial
states by finding all solutions to the initial state predicate using Z3, put them in a queue, check
each one for invariant violations, add all successors to the queue and repeat the last two steps
until the queue is empty. All solutions to the initial state predicate and the before-after-predicate
are computed by calling Z3 repeatedly (using its incremental solving modes), conjoining the
negations of the previously found models.

In case of a violation the found counterexample is returned and presented to the user together
with the path to the offending state. If desired, we could return the shortest path (so far) using
the implementation of Dijkstra’s algorithm in JGraphT. If we encounter undecidability because
of used quantifiers, ESMC may run into a timeout. It then returns a partial state space and the
result Unknown.

To improve the efficiency of the implementation BMoth aggregates different before-after-
predicates disjunctively for finding successors. Further optimization in the future could include
the aggregation of states.

4.4 LTL Model Checker

So far, we only discussed how to implement checking of safety properties, i.e., the absence of
faulty states. However, to be considered a fully fledged model checker, BMoth has to also process
liveness properties.

As LTL is more common in the B community, we decided to implement an LTL rather than
a CTL model checker. To do so, we follow an automaton-based approach, i.e., the formula in
question is negated and transformed into a Labeled generalized Büchi automaton (LGBA) [Bü66]
using the algorithm suggested by Gerth et al. [GPVW96]. Afterwards, BMoth searches for a
counterexample by combining the automaton with the model’s state space. As BMoth executes
the LTL Model Checking subsequent to ESMC, we can reuse the same state space.

puting on given values as done for computing successor states.

AVoCS 2018 8 / 18

ECEASST

Listing 2: State space cycle detection using JGraphT
i f ( l g b a ! = n u l l ) {

l a b e l S t a t e S p a c e ( ) ; / / a s s i g n s LGBA n o d e s t o s t a t e n o d e s
L i s t <L i s t <S t a t e >> c y c l e s = new

T a r j a n S i m p l e C y c l e s <>( s t a t e S p a c e ) . f i n d S i m p l e C y c l e s ( ) ;
f o r ( L i s t <S t a t e > c y c l e : c y c l e s ) {

f o r ( S t a t e s t a t e : c y c l e ) {
i f ( l g b a . i s A c c e p t i n g S e t ( s t a t e . g e t B u e c h i N o d e s ( ) ) ) {

r e t u r n L T L C o u n t e r E x a m p l e ( s t a t e ) ;
}

}
}

}
r e t u r n v e r i f i e d ( s t a t e S p a c e ) ;

For BMoth, LTL formulae over a set AP of atomic propositions are formed according to the
basic grammar

φ ::= true | a | φ ∧φ |¬φ | X φ | φ U φ

for a ∈ AP, together with the LTL operators ♦ (finally), � (globally), W (weak-until) and R
(release), which are all derived from X and U . As of now, BMoth does not support past-
LTL. However, this could be added using rewriting rules (increasing the size of the formulas to
check [LMS02]).

As discussed for basic B predicates, we use ANTLR for parsing and rewriting of LTL formu-
lae. The algorithm introduced by Gerth et al. [GPVW96] can only process the negation normal
form, which just allows the operators ∧,∨,X ,U ,R and ¬. Furthermore negations may only
appear directly in front of a predicate. In consequence, the input formula has to be normalized,
again following the patterns introduced for the normalization of a regular B predicate.

BMoth searches for counterexamples differently than proposed by Gerth et al. [GPVW96].
Instead of checking the product automaton of state space and LGBA for emptiness, BMoth finds
loops in the state space that are accepted in the LGBA. This is realized by assigning each state
of the state space the LGBA nodes the model can be in in this state. Initial state space states
are assigned the initial LGBA nodes. Next, successors are assigned LGBA nodes recursively.
Based on the assigned LGBA nodes of the current state we determine which LGBA nodes can be
reached. For every successor in the state space we check whether the labels of all those reachable
LGBA nodes do not contradict the propositions supposed to be true in the successor state. If Z3
states that labels and propositions match, that LGBA node is assigned to the successor. State
space traversal is realized using the accessors provided for graph nodes by JGraphT.

As shown in Listing 2, BMoth uses Tarjan’s strongly connected components algorithm pro-
vided by JGraphT to find loops in the state space created during ESMC. For each loop in the state
space, BMoth checks whether there is a node from each accepting set of the LGBA assigned to
the states in the loop. If so, a counterexample is found and the model does not satisfy the LTL
property.

9 / 18 Volume 076 (2019)

Writing a Model Checker in 80 Days

4.5 Symbolic Model Checker

Selecting states from the state space one by one can be very inefficient, particularly for systems
with a large number of transitions. To counteract this inefficiency symbolic model checkers
analyze multiple states at once. BMoth includes two symbolic model checking algorithms: BMC
and k-induction.

4.5.1 Bounded Model Checker

A bounded model checker looks for counterexamples to the invariant, which can be reached after
a number k of steps. This implies that a counterexample that occurs later than after k steps will
not be found unless the system is checked again with an increased k [BCC+03].

BMoth’s BMC works the same way: sequentially increasing the bound k from 1 to K, where
K is the maximum number of steps, all transitions as well as the negated invariant are added to
the constraint in question. The query is given to Z3 to check the satisfiability. If a valuation is
found, the model is not correct and the path to the counterexample is extracted from the model
and returned. To increase efficiency, we add a constraint enforcing states to be distinct, ensuring
that new states are discovered with every increment of the bound. In the future, this could be
used to detect if the system has been checked exhaustively, i.e., if no new states can be computed.

Implementing BMC on top of our library selection is simple. The complete implementa-
tion of BMC for invariant checking is given in Listing 3. As can be seen, the amount of cus-
tom code is quite small. Essentially, we combine Z3 (calls to the solver object) with the re-
sults of the AST transformations mentioned earlier. Most notably, init, transition and
negatedInvariant return three predicates translated from the B machine to Z3’s AST: the
initial state predicate, the conjunction of the before-after-predicates and the negated invariant.
Finally, stateFromModel accesses the valuation found by Z3 and extracts the values of the
state variables.

4.5.2 K-Induction Model Checker

BMC can only refute the correctness of a system by finding a counterexample in the first k steps.
However, the absence of a counterexample cannot be proven, i.e., BMC can never verify a system
as correct unless there is some upper bound known for k [CKOS04].

This limitation can be avoided by employing induction in the model checker, leading to an
algorithm called k-induction [SSS00]. Essentially, k-induction uses mathematical induction to
conclude the entire absence of an invariant violation from the absence of an invariant violation
in the first k steps. The implementation uses an additional constraint and otherwise is similar to
the one of BMC presented above.

5 Empirical Evaluation

To detect flaws and strengths of our implementation we compared it to PROB [LBD+14, LB08,
LB03] in various benchmarks, mostly using examples provided with PROB. In addition to PROB’s
explicit-state model checker, we compare to PROB’s symbolic model checker introduced by

AVoCS 2018 10 / 18

ECEASST

Listing 3: Implementation of BMC
f o r ( i n t k = 0 ; k < m a x S t e p s ; k + + ) {

/ / g e t a c l e a n s o l v e r
s o l v e r . r e s e t ( ) ;

/ / I N I T ( V0 )
s o l v e r . a d d ( i n i t ( ) ) ;

/ / CONJUNCTION i f r o m 1 t o k T ( V i −1 , V i )
f o r ( i n t i = 1 ; i <= k ; i + + ) {

s o l v e r . a d d ( t r a n s i t i o n ( i − 1 , i ) ) ;
}

/ / n o t INV ( Vk )
s o l v e r . a d d ( n e g a t e d I n v a r i a n t ( k ) ) ;

/ / i f r o m 1 t o k , j f r o m i + 1 t o k ( V i != V j )
s o l v e r . a d d ( d i s t i n c t V e c t o r s ( k ) ) ;

S t a t u s c h e c k = s o l v e r . c h e c k ( ) ;
i f ( c h e c k == S t a t u s . SATISFIABLE ) {

/ / c o u n t e r e x a m p l e f o u n d !
S t a t e c e S t a t e = s t a t e F r o m M o d e l ( s o l v e r . g e t M o d e l ( ) , k ) ;
r e t u r n c r e a t e C o u n t e r E x a m p l e F o u n d ( k , c e S t a t e , n u l l ) ;
}

}

/ / no c o u n t e r e x a m p l e f o u n d a f t e r m a x S t e p s t e p s
r e t u r n c r e a t e E x c e e d e d M a x S t e p s ( m a x S t e p s ) ;

Krings et al. [KL16a, KL17]. Benchmarks were executed on a laptop running Ubuntu 16.04 and
featuring a 2.4 GHz i7 CPU and 8 GB of RAM. Most of our benchmarks are comparably small.
As BMoth does not support all of B (see Subsection 4.1), there are not that many larger machines
available. The used B machines can be assorted into the following classes:

• Counters Counters are suitable to investigate qualities like speed and the capability of a
model checker to detect invariant violations in a potentially large (but linear) state space.
Even though symbolic model checkers (especially BMC) often have trouble verifying
these models because of the possibly tremendous length of paths to counterexamples, we
were also curious to find out how long it would take them to examine the k first steps. Pre-
sented machines of this class are SimpleCounter, DoubleAndHalve, FaultyCounter, Get-
ToOneMillion and IncTwoDecTwo.

• Puzzles Solving a puzzle that was translated to a B machine is done by finding an invariant
violation in the initial state. As this happens very fast for the SendMoreMoney machine,

11 / 18 Volume 076 (2019)

Writing a Model Checker in 80 Days

BMoth PROB
ESMC 92.89s 1.92s
BMC 0.64s 1.7s

k-induction 1.58s 2.08s

(a) Average runtime of model
checking algorithms

Machine Parser Model checker
DoubleAndHalve 5% 95%
DecOneOfTwo 9.3% 90.7%
BooleanLaws 13% 87%
FaultyCounter 1% 99%

(b) Average shares of parser and model checker in
the runtime of BMoth’s ESMC

Table 2: Comparison and analysis of average runtimes

model checking it to find a solution of the puzzle demonstrates how fast the model checkers
kick off without actually having much to do.

• Laws ArithmeticLaws and BooleanLaws are two more extensive machines consisting of
invariants stating basic laws of arithmetics and Boolean algebra.

• Miscellaneous Apart from these special machines we included simple models to get more
results for basic machines containing violations. Presented here are SetVarToConstant
(which assigns the given value of a constant to a state variable and thus violates the invari-
ant) and DecOneOfTwo.

Results are given in Table 3. The average runtimes of explicit-state model checking (ESMC),
bounded model checking (BMC) and k-induction applied to our examples are shown in Table 2a
for both BMoth and PROB. Only the executions leading to a clear result are considered, whereas
those ending in a timeout or an exceeding of the maximum steps were left out of the calculation.

Summarizing, BMoth is not as fast as PROB. This is unsurprising, as we did focus on ease
of implementation and code reuse rather than performance. Furthermore, PROB has been in
development for more than 10 years and is thus more mature than our prototypical implementa-
tion. While the comparisons generally show that PROB’s model checking algorithms are faster
than our counterparts, especially for more extensive models, there are some interesting outcomes
when looking at single results in Table 3.

5.1 Startup

For machines with small state spaces or counterexamples reachable within few steps, our imple-
mentation performed better than PROB. In contrast, for larger machines (e.g. see ESMC’s results
for FaultyCounter and IncTwoDecTwo in Table 3), BMoth took longer than PROB. We assume
that BMoth’s startup is faster, while the actual model checking is quicker in PROB. The poor
performance of BMoth’s ESMC is mainly caused by the computation of successor states via Z3.
The fast startup might be owing to a short parsing time. The average shares of parser and model
checker in the total runtime of BMoth for some examples in Table 3 are depicted in Table 2b.
PROB only reports the time consumed overall, we thus cannot compare the parsers individually.

AVoCS 2018 12 / 18

ECEASST

Table 3: Runtimes in minutes and results - number of steps performed in parentheses

Machine Value
ESMC BMC k-induction

BMoth PROB BMoth PROB BMoth PROB

Simple
Counter

Runtime 00:00.68 00:01.72 00:00.96 00:07.11 00:01.10 00:01.72

Result
Verified

(6)
Verified

Exceeded
(20)

Timeout
(13)

Verified
(6)

Verified
(0)

Faulty
Counter

Runtime 01:18.81 00:02.91 00:01.13 00:05.77 00:01.27 07:27.97

Result
CE found

(5001)
CE found

(5001)
Exceeded

(20)
Timeout

(9)
Exceeded

(20)
Timeout

(9)

Double
And

Halve

Runtime 00:00.64 00:01.65 00:03.31 00:08.37 00:05.74 00:03.58

Result
Verified

(9)
Verified

Exceeded
(20)

Timeout
(17)

Verified
(14)

Verified
(11)

Arith-
metic
Laws

Runtime 00:00.82 00:07.10 00:00.78 00:05.31 00:00.81 00:01.94

Result Unknown Verified Unknown
Timeout

(6)
Unknown

Timeout
(0)

Boolean
Laws

Runtime 00:00.72 00:01.71 00:01.04 00:12.43 00:00.72 00:01.73

Result
Verified

(8)
Verified

Exceeded
(20)

Exceeded
(24)

Verified
(0)

GetTo
One

Million

Runtime - 02:58.50 00:00.89 00:07.14 00:01.14 08:49.15

Result -
CE found
(999999)

Exceeded
(20)

Timeout
(16)

Exceeded
(20)

Timeout
(16)

IncTwo
DecTwo

Runtime 11:00.31 00:02.33 00:01.35 00:04.28 00:01.40 02:55.55

Result
CE found
(19819)

CE found
(203)

Exceeded
(20)

Timeout
(6)

Exceeded
(20)

Timeout
(6)

SetVar
ToCon-

stant

Runtime 00:00.62 00:01.66 00:00.61 00:01.67 00:00.62 00:01.69

Result
CE found

(2)
CE found

(1)
CE found

(1)

DecOne
OfTwo

Runtime 00:00.64 00:01.67 00:00.64 00:01.70 00:00.63 00:01.74

Result
CE found

(2)
CE found

(1)
CE found

(1)

Send
More

Money

Runtime 00:00.66 00:01.72 00:00.66 00:01.72 00:00.64 00:01.71

Result
CE found

(1)
CE found

(0)
CE found

(0)

13 / 18 Volume 076 (2019)

Writing a Model Checker in 80 Days

5.2 Solver Limits

The backends of both tools were overstrained by different machines respectively when perform-
ing BMC and k-induction. Occasionally, PROB aborted the model check due to a timeout caused
by an error that is described as “Solvers too weak for predicate”. In these cases, BMoth could
conduct the model check further and execute more steps (e.g. see IncTwoDecTwo in Table 3).

While this did not lead to more results, as BMoth’s BMC and k-induction still always reached
the specified maximum number of steps to be executed without a falsification or verification, it
shows that the backend of PROB has issues with certain machines that BMoth can handle. We
assume that these differences are either due to different approaches to encoding or due to the
synchronization of different backends in PROB. A more thorough analysis should be performed
in the future and might aid the development of both BMoth and PROB.

The other way around, we found machines troubling BMoth but not PROB. The explicit-state
model checking could not be completed because of limitations of Z3, e.g. incomplete quantifiers
when using exponentiation.

Some results are due to the behavior of PROB’s k-induction in presence of timeouts. While
BMC stops as soon as it runs into a timeout, k-induction continues, hoping for an inductivity to
be proven eventually. This implies a long runtime (e.g. see FaultyCounter and GetToOneMillion
in Table 3, where the timeout was encountered after just a few steps, but the algorithm continued
for minutes reaching a timeout again and again).

Table 3 also contains the example ArithmeticLaws for which our model checker, contrary to
PROB, could not come to a result. However, Figure 3 shows a machine that could be proven
right by BMoth, but not by PROB. BMoth immediately realizes that the guard for the operation
op, x < y & y < x, can not be satisfied, making the operation infeasible. It hence comes to the
conclusion that v is never changed and particularly not set to 5, and that the machine is correct.
PROB’s Prolog backend on the other hand searches for an assignment for x and y that will satisfy
the predicate until it runs into a timeout.

Altogether, the runtime results for BMoth were very satisfactory, given that we had no time
for fine-tuning and further experimentation. Often enough our implementation was about as fast
or even faster than PROB for small models.

MACHINE BMothWins
VARIABLES

v
INVARIANT

n o t ( v = 5 )
INITIALISATION

v : = 1
OPERATIONS

op = ANY y , x WHERE x < y & y < x THEN v : = 5 END

Figure 3: Model where BMoth outperforms PROB

AVoCS 2018 14 / 18

ECEASST

6 Related Work

Several model checkers for classical B and Event-B have been developed: PROB [LBD+14]
features both explicit-state [LB08, LB03] and symbolic model checking [KL16a, KL17]. Sym-
bolic model checking in PROB uses the same algorithms implemented in BMoth. In contrast to
BMoth, PROB supports different backends, such as SICStus Prolog’s CLP(FD) library [COC97],
Z3 [MB08] or SAT via Kodkod [TJ07]. Furthermore, PROB’s LTL model checker [PL10] uses a
tableau based approach [LP85] rather than Büchi automata. PROB supports fairness constraints
in LTL, which are currently unsupported in BMoth but could be implemented as done by Do-
brikov et al. [DLP16].

pyB [WL14] is another clean-room implementation of an explicit-state model checker for B.
Originally, pyB was designed as a second toolchain to verify results generated by PROB. Fur-
thermore, Yang et al. implemented JeB [YJS13], an animation framework for Event-B written in
JavaScript. While it does not yet reach the level of maturity of PROB, JeB certainly shows that
JavaScript can be a viable alternative to Prolog, which is used in PROB’s kernel. The research
done in JeB could influence BMoth’s further development, especially regarding alternative back-
ends to overcome the limitations outlined in Subsection 4.1.

Model checking aside, B and Event-B have been translated to SMT-LIB for other purposes.
A translation from classical B to SMT-LIB has been suggested by Krings et al. [KL16b] mostly
aiming at (interactive) animation of B models, but also supporting proof. In addition, the authors
also introduce a backend combining SMT solvers and constraint logic programming in the style
of the CLP(FD) libraries of SICStus and SWI Prolog [Tri12, COC97]. This approach could help
overcome limitations of BMoth.

Event-B has been translated to SMT-LIB via the SMT plug-in [DFGV14] for Rodin [ABH+10].
The plugin is used inside Rodin’s proving perspective. As B and TLA+ [Lam94] feature con-
siderable parallels, the translation of TLA+ [MV16] to SMT-LIB also influenced the translation
approach employed in BMoth.

7 Discussion and Conclusion

In summary, we presented a selection of libraries and their combination into a model checker for
classical B. Even though BMoth remains a student project and is merely a prototypical imple-
mentation, it shows that reinventing the wheel is as unnecessary for model checkers as it is in
general software development.

Of course, we had to lower our sights regarding completeness and performance. Due to the
short development period, BMoth does not support all features of classical B. However, as B is
among the most high-level specification languages, we suppose a simpler language could have
been implemented exhaustively. To gain more benchmarks, existing B specifications could be
rewritten to constructs that BMoth supports. Regarding performance, our evaluations show that
BMoth lacks the years of fine-tuning that went into PROB [LBD+14].

In consequence, we should reconsider BMoth’s backend. While we are pleased with Z3’s API
and performance, its input language lacks the expressiveness needed to translate B. This has been
pointed out by Krings and Leuschel [KL16b] and is a major area of work in the B community. We

15 / 18 Volume 076 (2019)

Writing a Model Checker in 80 Days

want to explore further backends regarding ease of integration and performance. In particular,
following the spirit of code reuse, projects such as JavaSMT [KFB16] and Why3 [BFMP11] can
be the way to go. A backend-agnostic library, connecting to multiple solvers at once, would fit
nicely. However, since we translate into Z3’s set-theory for now, both approaches would not be a
drop-in replacement. Additionally, we could integrate an interpreter for B and use it to compute
transitions on given values, i.e., where no constraint solving is needed.

While BMoth cannot compete with PROB, it shows that developing a prototype for an exper-
imental language can be done quickly, by an inexperienced development team. This does not
only help language experimentation, but also might be useful to the development of other model
checkers, as a prototype can serve as playground for techniques not easily implemented in an old
codebase.

References

[ABH+10] J.-R. Abrial, M. Butler, S. Hallerstede, T. S. Hoang, F. Mehta, L. Voisin. Rodin: an
open toolset for modelling and reasoning in Event-B. STTT 12(6):447–466, 2010.

[Abr96] J.-R. Abrial. The B-book: Assigning Programs to Meanings. Cambridge University
Press, New York, NY, USA, 1996.

[Abr10] J.-R. Abrial. Modeling in Event-B: System and Software Engineering. Cambridge
University Press, 2010.

[ASU86] A. V. Aho, R. Sethi, J. D. Ullman. Compilers principles, techniques, and tools.
Addison-Wesley, Reading, MA, 1986.

[BCC+03] A. Biere, A. Cimatti, E. M. Clarke, O. Strichman, Y. Zhu. Bounded Model Check-
ing. In Advances in Computers, Volume 58. Academic Press, 2003.

[BFMP11] F. Bobot, J.-C. Filliâtre, C. Marché, A. Paskevich. Why3: Shepherd Your Herd
of Provers. In Boogie 2011: First International Workshop on Intermediate Verifi-
cation Languages. Pp. 53–64. Wrocław, Poland, August 2011. https://hal.inria.fr/
hal-00790310.

[Bü66] J. R. Büchi. Symposium on Decision Problems: On a Decision Method in Restricted
Second Order Arithmetic. In Logic, Methodology and Philosophy of Science. Pp. 1–
11. Elsevier, 1966.

[CGP99] E. M. Clarke, Jr., O. Grumberg, D. A. Peled. Model Checking. MIT Press, Cam-
bridge, MA, USA, 1999.

[CKOS04] E. Clarke, D. Kroening, J. Ouaknine, O. Strichman. Completeness and Complexity
of Bounded Model Checking. In Verification, Model Checking, and Abstract Inter-
pretation. Pp. 85–96. Springer Berlin Heidelberg, 2004.

[CM] A. Casall, M. Mauky. mvvmFX. https://github.com/sialcasa/mvvmFX.

AVoCS 2018 16 / 18

https://hal.inria.fr/hal-00790310
https://hal.inria.fr/hal-00790310
https://github.com/sialcasa/mvvmFX

ECEASST

[COC97] M. Carlsson, G. Ottosson, B. Carlson. An open-ended finite domain constraint
solver. In Glaser et al. (eds.), Programming Languages: Implementations, Logics,
and Programs. Pp. 191–206. Springer Berlin Heidelberg, Berlin, Heidelberg, 1997.

[DFGV14] D. Déharbe, P. Fontaine, Y. Guyot, L. Voisin. Integrating SMT solvers in Rodin.
Science of Computer Programming 94, Part 2(0):130–143, 2014.

[DLP16] I. Dobrikov, M. Leuschel, D. Plagge. LTL Model Checking under Fairness in ProB .
In De Nicola and Kühn (eds.), Software Engineering and Formal Methods. Pp. 204–
211. Springer International Publishing, Cham, 2016.

[GPVW96] R. Gerth, D. Peled, M. Vardi, P. Wolper. Simple On-the-Fly Automatic Verification
of Linear Temporal Logic. In Dembiński and Średniawa (eds.), Proceedings PSTV.
IFIP Conference Proceedings, pp. 3–18. Chapman & Hall, 1996.

[KFB16] E. G. Karpenkov, K. Friedberger, D. Beyer. JavaSMT: A Unified Interface for SMT
Solvers in Java. In Blazy and Chechik (eds.), Verified Software. Theories, Tools, and
Experiments. Pp. 139–148. Springer, Cham, 2016.

[KKS19] S. Krings, P. Körner, J. Schmidt. Experience Report on An Inquiry-Based Course
on Model Checking. In Proceedings SEUH. CEUR Workshop Proceedings. CEUR-
WS.org, 2019.

[KL16a] S. Krings, M. Leuschel. Proof Assisted Symbolic Model Checking for B and Event-
B. In Proceedings ABZ. LNCS 9675. Springer, 2016.

[KL16b] S. Krings, M. Leuschel. SMT Solvers for Validation of B and Event-B models. In
Proceedings iFM. LNCS 9681. Springer, 2016.

[KL17] S. Krings, M. Leuschel. Proof assisted bounded and unbounded symbolic model
checking of software and system models. Science of Computer Programming, 2017.

[Lam94] L. Lamport. The Temporal Logic of Actions. ACM Trans. Program. Lang. Syst.
16(3):872–923, May 1994.

[LB03] M. Leuschel, M. Butler. ProB: A Model Checker for B. In Proceedings FME’03.
LNCS 2805, pp. 855–874. Springer, 2003.

[LB08] M. Leuschel, M. Butler. ProB: an automated analysis toolset for the B method. Int.
J. Softw. Tools Technol. Transf. 10(2):185–203, 2008.

[LBD+14] M. Leuschel, J. Bendisposto, I. Dobrikov, S. Krings, D. Plagge. From Animation
to Data Validation: The ProB Constraint Solver 10 Years On. In Boulanger (ed.),
Formal Methods Applied to Complex Systems: Implementation of the B Method.
Chapter 14, pp. 427–446. Wiley ISTE, Hoboken, NJ, 2014.

[LMS02] F. Laroussinie, N. Markey, P. Schnoebelen. Temporal Logic with Forgettable Past.
In Proceedings of the 17th Annual IEEE Symposium on Logic in Computer Science.
LICS ’02, pp. 383–392. IEEE Computer Society, Washington, DC, USA, 2002.

17 / 18 Volume 076 (2019)

Writing a Model Checker in 80 Days

[LP85] O. Lichtenstein, A. Pnueli. Checking That Finite State Concurrent Programs Sat-
isfy Their Linear Specification. In Proceedings of the 12th ACM SIGACT-SIGPLAN
Symposium on Principles of Programming Languages. POPL ’85, pp. 97–107.
ACM, New York, NY, USA, 1985.

[MB08] L. de Moura, N. Bjørner. Z3: An Efficient SMT Solver. In Proceedings TACAS.
LNCS 4963, pp. 337–340. Springer, 2008.

[Mik] T. Mikula. RichTextFX. https://github.com/FXMisc/RichTextFX.

[MV16] S. Merz, H. Vanzetto. Encoding TLA+ into Many-Sorted First-Order Logic. In Pro-
ceedings ABZ. LNCS 9675, pp. 54–69. Springer, 2016.

[Nav] B. Naveh. JGraphT. http://jgrapht.org/.

[Par13] T. Parr. The Definitive ANTLR 4 Reference. Pragmatic Bookshelf, 2nd edition, 2013.

[PL10] D. Plagge, M. Leuschel. Seven at One Stroke: LTL Model Checking for High-level
Specifications in B, Z, CSP, and More. Int. J. Softw. Tools Technol. Transf. 12(1):9–
21, Jan. 2010.

[Son] SonarSource SA. SonarQube. https://www.sonarqube.org/.

[SSS00] M. Sheeran, S. Singh, G. Stålmarck. Checking Safety Properties Using Induction
and a SAT-Solver. In Formal Methods in Computer-Aided Design. Pp. 127–144.
2000.

[TJ07] E. Torlak, D. Jackson. Kodkod: A Relational Model Finder. In Grumberg and Huth
(eds.), Tools and Algorithms for the Construction and Analysis of Systems. Pp. 632–
647. Springer, 2007.

[Tra] Travis CI GmbH. Travis CI. https://travis-ci.org/.

[Tri12] M. Triska. The Finite Domain Constraint Solver of SWI-Prolog. In Schrijvers and
Thiemann (eds.), Functional and Logic Programming. Pp. 307–316. Springer Berlin
Heidelberg, Berlin, Heidelberg, 2012.

[WL14] J. Witulski, M. Leuschel. Checking Computations of Formal Method Tools - A
Secondary Toolchain for ProB. In Proceedings 1st Workshop on Formal Integrated
Development Environment, F-IDE 2014, Grenoble, France, April 6, 2014. Pp. 93–
105. 2014.

[YJS13] F. Yang, J. Jacquot, J. Souquières. JeB: Safe Simulation of Event-B Models in
JavaScript. In 20th Asia-Pacific Software Engineering Conference, APSEC 2013,
Ratchathewi, Bangkok, Thailand, December 2-5, 2013 - Volume 1. Pp. 571–576.
2013.

AVoCS 2018 18 / 18

https://github.com/FXMisc/RichTextFX
http://jgrapht.org/
https://www.sonarqube.org/
https://travis-ci.org/

Introduction and Motivation
A Primer on the B-method
Libraries and Tools Used
ANTLR
JGraphT
Z3
UI libraries
Infrastructure

Implementation Details
Backend and General Translation of B to Z3
Custom Encodings
Explicit-State Model Checker
LTL Model Checker
Symbolic Model Checker
Bounded Model Checker
K-Induction Model Checker

Empirical Evaluation
Startup
Solver Limits

Related Work
Discussion and Conclusion