Distributed Verification of Rare Properties using Importance Splitting Observers

Cyrille Jegourel, Axel Legay, Sean Sedwards and Louis-Marie Traonouez
INRIA Rennes – Bretagne Atlantique

Electronic Communications of the EASST, Volume 72 (2015): Proceedings of the 15th International Workshop on Automated Verification of Critical Systems (AVoCS 2015)

Abstract: Rare properties remain a challenge for statistical model checking (SMC) due to the quadratic scaling of variance with rarity. We address this with a variance reduction framework based on lightweight importance splitting observers. These expose the model-property automaton to allow the construction of score functions for high performance algorithms. The confidence intervals defined for importance splitting make it appealing for SMC, but optimising its performance in the standard way makes distribution inefficient. We show how it is possible to achieve equivalently good results in less time by distributing simpler algorithms. We first explore the challenges posed by importance splitting and present an algorithm optimised for distribution. We then define a specific bounded time logic that is compiled into memory-efficient observers to monitor executions. Finally, we demonstrate our framework on a number of challenging case studies.

Keywords: Rare events, statistical model checking, importance splitting, scalable verification, lightweight observers.

1 Introduction

Failure in critical systems is required to be very infrequent. Numerical model checking can quantify the probability of such failure with certainty, but is limited in its application to real systems because of the ‘state explosion problem’ [CES09]. This is addressed by statistical model checking (SMC) [YKNP06], which includes a number of approximative techniques based on Monte Carlo sampling [HH64]. Using SMC, only a subset of system states are generated on the fly during stochastic simulation, while results converge in a predictable way. Performance is typically independent of the size of the state space [Nie92] and simulations may be efficiently divided on parallel computation architectures. SMC has therefore been successfully applied to real systems in a critical context, such as to the cabin communication system of an aeroplane [BBB+10].

Rare properties (those with probability close to zero) nevertheless pose a problem because the standard and relative estimation errors scale quadratically with rarity [HH64, RT09]. For example, 4000 simulations would be sufficient to estimate a probability of 0.1 ± 10% with 95% confidence, whereas 4×10^13 simulations would be necessary to estimate a probability of 10^-6 ± 10% with the same confidence. Desirable failure rates in critical systems may be orders of magnitude lower, so we seek to enhance SMC with variance reduction techniques, such as importance sampling and importance splitting [KH51, HH64, RT09], without sacrificing the easy distribution that SMC affords.

Importance sampling weights the executable model of a system so that the rare property occurs more frequently in simulations.
The proportion of simulations that satisfy the property using the weighted model overestimates the true probability, but the estimate may be exactly compensated by the weights. It is generally not feasible to implement a perfectly weighted executable model for importance sampling because (i) the perfect model may not actually exist as a re-parametrisation of the original model and (ii) a perfect re-parametrisation typically requires an iteration over all the transitions, defeating the benefits of sampling. Practical approaches tend to use a low dimensional vector of parameters to weight the model [JLS12b, JLS12a]. Given such a parametrisation, importance sampling can be implemented with minimal memory and may be distributed efficiently on parallel computational architectures. The principal limitation of importance sampling is that without a guarantee that the simulation model is perfect, it is difficult to formally bound the error of estimates. In contrast, useful confidence intervals have been defined for importance splitting [CG07, CDFG12].

Importance splitting divides a rare property into a set of less rare sub-properties that correspond to an increasing sequence of disjoint levels: the initial state corresponds to the lowest level, while states that satisfy the rare property correspond to the final level. Importance splitting algorithms use a series of easy simulation experiments to estimate the conditional probabilities of going from one level to the next. Since relatively few simulations fail to satisfy the sub-properties, the overall simulation budget may be reduced. Each experiment comprises simulations initialised with the terminal states of previous simulations that reached the current level. The overall probability is the product of the estimates, with the best performance (lowest variance) achieved with many levels of equal conditional probability.

Importance splitting poses several challenges for optimisation and distribution. In the context of SMC, importance splitting algorithms repeatedly initialise simulations with states of the model-property product automaton. For arbitrary properties this may have size proportional to the length of a simulation trace. At the same time, increasing the number of levels to maximise performance reduces the number of simulation steps in each simulation experiment. The cost of sending the model-property state across slow communication channels may be significantly greater than the cost of short simulations. In addition, to specify levels with equal conditional probabilities it is necessary to define a ‘score function’ that maps the states of the product automaton to a value. This cannot easily be automated, so a syntactic description of the property automaton must be accessible for the user to construct a score function manually.

To address the above challenges we present an importance splitting framework for SMC, specifically considering the problems of distribution. We first discuss the problems of distributing importance splitting algorithms and present a fixed level algorithm optimised for distribution. We then define an expressive bounded time temporal logic and describe the system of efficient lightweight observers that implement it. These make the product automaton (i) accessible to the user, (ii) efficient to construct, (iii) efficient to distribute and (iv) efficient to execute.
Finally, we demonstrate the performance and flexibility of our framework on a number of case studies that are intractable to numerical methods.

We believe the present work is the first to describe a practical importance splitting framework for SMC and is therefore the first to consider the problems of distributing importance splitting for SMC.

Related Work

There have been many ad hoc implementations of importance splitting based on the original ideas of [Kah50, KH51]. The algorithm of [VV91] is a relatively recent example that is often cited. The work of [CG07, CDFG12] is novel because the authors define efficient adaptive importance splitting algorithms that also include confidence intervals. To our knowledge, [JLS13] is the first work to explicitly link importance splitting to arbitrary logical properties.

SMC tools construct an automaton (a monitor) to accept traces that satisfy a temporal logic formula, typically based on a time bounded variant of temporal logic. The proportion of independent simulations of a stochastic model that satisfy the property is then used to estimate the probability of the property or to test hypotheses about the probability. There have been several works that construct runtime verification monitors from temporal logic (e.g., [Gei01, GH01, HR02, FS04, BLS06]). Such monitors typically comprise tableau-based automata [GPVW95] whose states represent the combinations of subformulas of the overall property. While some have considered timed properties (e.g., [BLS06]), the focus is predominantly unbounded LTL properties interpreted on finite paths [EFH+03]. In contrast, SMC typically checks formulas with explicit time bounds (see, e.g., (1)), which are inherently defined on finite traces. To avoid the combinatorial explosion of subformulas caused by including time in this way, the monitors used by [JLS12b, BCLS13] and other high performance tools are compact "programs" that generate the states of an automaton on the fly and do not store them. Such programs incorporate notions of optimality that may be subtly different from those that apply in other contexts. Since states of the automaton are generated on the fly, it is not necessary for the automaton to have the minimum number of states. The actual requirements are that the automaton reaches a conclusion with the minimum number of input states and that its programmatic representation is as compact as possible. We adapt this "lightweight" approach to allow importance splitting for SMC to be efficiently distributed on high performance parallel computational architectures.

2 Technical Background

Our SMC tools (PLASMA [JLS12b], PLASMA-LAB [BCLS13]) implement a bounded linear temporal logic having the following syntactic form:

ϕ = X^k ϕ | F^k ϕ | G^k ϕ | ϕ U^k ϕ | ¬ϕ | ϕ ∨ ϕ | ϕ ∧ ϕ | ϕ ⇒ ϕ | α    (1)

This syntax allows arbitrary combinations and nesting of temporal and atomic properties (i.e., those which may be evaluated in a single state and denoted by α). The time bound k may denote discrete steps or continuous time, but in this work we consider only discrete time semantics. Given a finite trace ω, comprising a sequence of states ω_0 ω_1 ω_2 ···, ω(i) denotes the suffix ω_i ω_{i+1} ω_{i+2} ···.
The semantics of the satisfaction relation |= is constructed inductively as follows:

ω(i) |= true
ω(i) |= α          ⇐⇒  α is true in state ω_i
ω(i) |= ¬φ         ⇐⇒  ω(i) ⊭ φ
ω(i) |= φ1 ∨ φ2    ⇐⇒  ω(i) |= φ1 or ω(i) |= φ2
ω(i) |= X^k φ      ⇐⇒  ω(k+i) |= φ
ω(i) |= φ1 U^k φ2  ⇐⇒  ∃ j ∈ {i, ..., i+k} : ω(j) |= φ2 ∧ (j = i ∨ ∀ l ∈ {i, ..., j−1} : ω(l) |= φ1)    (2)

Other elements of the relation are constructed using the equivalences false ≡ ¬true, φ1 ∧ φ2 ≡ ¬(¬φ1 ∨ ¬φ2), F^k φ ≡ true U^k φ, G^k φ ≡ ¬(true U^k ¬φ). Hence, given a property φ with syntax according to (1), ω |= φ is evaluated by ω(0) |= φ.

Importance Splitting and Score Functions

The neutron shield model of [Kah50, KH51] is illustrative of how importance splitting works. The distance travelled by a neutron in a shield defines a monotonic sequence of levels 0 = s_0 < s_1 < s_2 < ··· < s_m = shield thickness, such that reaching a given level implies having reached all the lower levels. While the overall probability γ of passing through the shield is small, the probability of passing from one level to another can be made arbitrarily close to 1 by reducing the distance between levels. Denoting the abstract level of a neutron as s, the probability of a neutron reaching level s_i can be expressed as P(s ≥ s_i) = P(s ≥ s_i | s ≥ s_{i−1}) P(s ≥ s_{i−1}). Defining γ = P(s ≥ s_m) and P(s ≥ s_0) = 1,

γ = ∏_{i=1}^{m} P(s ≥ s_i | s ≥ s_{i−1}).    (3)

Each term of (3) is necessarily greater than or equal to γ, making their estimation easier. By writing γ_i = P(s ≥ s_i | s ≥ s_{i−1}) and denoting the estimates of γ and γ_i as respectively γ̂ and γ̂_i, [JLS13] defines the unbiased confidence interval

CI = [ γ̂ / (1 + z_α σ/√n) , γ̂ / (1 − z_α σ/√n) ]    with    σ² ≥ ∑_{i=1}^{m} (1 − γ_i)/γ_i.    (4)

Confidence is specified via z_α, the 1 − α/2 quantile of the standard normal distribution, while n is the per-level simulation budget. We infer from (4) that for a given γ the confidence is maximised by making both the number of levels m and the simulation budget large, with all γ_i equal.

The concept of levels can be generalised to arbitrary systems and properties in the context of SMC, treating s and s_i in (3) as values of a score function over the model-property product automaton. Intuitively, a score function discriminates good paths from bad, assigning higher scores to paths that more nearly satisfy the overall property. Since the choice of levels is crucial to the effectiveness of importance splitting, various ways to construct score functions from a temporal logic property are proposed in [JLS13]. Formally, given a set of finite trace prefixes ω ∈ Ω, an ideal score function S : Ω → ℝ has the characteristics S(ω) > S(ω′) ⇐⇒ P(|= φ | ω) > P(|= φ | ω′), where P(|= φ | ω) is the probability of eventually satisfying φ given prefix ω. Intuitively, ω has a higher score than ω′ iff there is more chance of satisfying φ by continuing ω than by continuing ω′. The minimum requirement of a score function is S(ω) ≥ s_φ ⇐⇒ ω |= φ, where s_φ is an arbitrary value denoting that φ is satisfied. Any trace that satisfies φ must have a score of at least s_φ and any trace that does not satisfy φ must have a score less than s_φ. In what follows we assume that (3) refers to scores.
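To make (3) and (4) concrete, the following minimal sketch (Python) combines the per-level results of a splitting run into the estimate γ̂ and its confidence interval. The function name and the list-of-counts input format are illustrative assumptions rather than part of our tools.

import math

def splitting_confidence(level_counts, n, z_alpha=1.96):
    """Combine per-level results of an importance splitting run.

    level_counts[i] is the number of the n simulations that passed from
    level i to level i+1, so gamma_i is estimated by level_counts[i] / n.
    Returns gamma_hat (the product of the estimates, as in (3)) and the
    confidence interval of (4), using sigma^2 = sum_i (1 - gamma_i) / gamma_i.
    """
    gammas = [c / n for c in level_counts]
    gamma_hat = math.prod(gammas)
    sigma = math.sqrt(sum((1.0 - g) / g for g in gammas))
    half_width = z_alpha * sigma / math.sqrt(n)
    return gamma_hat, (gamma_hat / (1 + half_width), gamma_hat / (1 - half_width))

# Example: 6 levels with roughly equal conditional probabilities and budget n = 1000.
print(splitting_confidence([100, 95, 102, 98, 97, 101], 1000))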
3 Distributing Importance Splitting

Simple Monte Carlo SMC may be efficiently distributed because once initialised, simulations are executed independently and the result is communicated at the end with just a single bit of information (i.e., whether the property was satisfied or not). By contrast, the simulations of importance splitting are dependent because scores generated during the course of each simulation must be processed centrally. The amount of central processing can be minimised by reducing the number of levels, but this generally reduces the variance reduction performance. Alternatively, entire instances of the importance splitting algorithm may be distributed and their estimates averaged, with each instance using a proportionally reduced simulation budget. We use this approach to generate some of the results in Section 6, but note that if the budget is reduced too far, the algorithm will fail to pass from one level to the next (because no trace achieves a high enough score) and no valid estimate will be produced.

Distribution of importance splitting is thus possible, but its efficiency is dependent on the particular problem. In this work we therefore provide the framework to explore different approaches. In Section 3.1 we first describe the concept of an adaptive importance splitting algorithm and then explain why this otherwise optimised technique is unsuitable for distribution. In Section 3.2 we motivate the use of a fixed level algorithm for "lightweight" distribution and provide a suitable algorithm. The results we present in Section 6 demonstrate that this simpler approach can be highly effective.

3.1 The Adaptive Algorithm

The basic notion of importance splitting described in Section 2 can be directly implemented in a so-called fixed level algorithm, i.e., an algorithm in which the levels are pre-defined by the user. With no a priori information, such levels will typically be chosen to subdivide the maximum score equally. In general, however, this will not equally divide the conditional probabilities of the levels, as required by (4) to minimise variance. In the worst case, one or more of the conditional probabilities will be too low for the algorithm to pass between levels. Finding good or even reasonable levels by trial and error may be computationally expensive and has prompted the development of adaptive algorithms that discover optimal levels on the fly [CG07, JLS13, JLS14]. Instead of pre-defining levels, the user specifies the proportion of simulations to retain after each iteration. This proportion generally defines all but the final conditional probability in (3).

The adaptive importance splitting algorithm first performs a number of simulations until the overall property is decided, storing the resulting traces of the model-property automaton. Each trace induces a sequence of scores and a corresponding maximum score. The algorithm finds a level that is less than or equal to the maximum score of the desired proportion of simulations to retain. The simulations whose maximum score is below this current level are discarded. New simulations to replace the discarded ones are initialised with states corresponding to the current level, chosen at random from the retained simulations. The new simulations are continued until the overall property is decided and the procedure is repeated until a sufficient proportion of simulations satisfy the overall property.
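The level-selection step just described can be pictured with the following minimal sketch (Python), assuming that completed traces are held in memory and that a function returning the maximum score of a trace is available. The names and the trace representation are illustrative assumptions, not the PLASMA-LAB implementation.

import random

def adaptive_level_step(traces, max_score, keep_fraction):
    """One iteration of adaptive level selection (simplified sketch).

    traces        - completed traces of the model-property automaton
    max_score     - function returning the maximum score reached by a trace
    keep_fraction - proportion of traces to retain at each iteration

    Returns the new level, the retained traces and the traces from which the
    discarded simulations should be restarted (and then re-simulated until
    the overall property is decided).
    """
    scores = sorted(max_score(t) for t in traces)
    level = scores[int((1 - keep_fraction) * len(traces))]
    kept = [t for t in traces if max_score(t) >= level]
    # Branch each discarded simulation from a retained trace chosen at random,
    # restarting it from the state in which that trace first reached the level.
    restart_from = [random.choice(kept) for _ in range(len(traces) - len(kept))]
    return level, kept, restart_from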
The principal advantage of the adaptive algorithm is that by simply rejecting the minimum number of simulations at each level it is possible to maximise confidence for a given score function. The principal disadvantage is that it stores simulation traces, severely limiting the size of model and simulation budget. The use of lightweight computational threads is effectively prohibited. Moreover, minimising the number of rejected simulations reduces the number of simulations performed between levels, thus reducing the possibility to perform computations in parallel. Minimising the rejected simulations also maximises the number of levels, which in turn minimises the number of simulation steps between each level. This further limits the feasibility of dividing the algorithm, since sending a model-property state over a slow communication channel may be orders of magnitude more costly than performing a short simulation locally.

3.2 A Fixed Level Algorithm for Distribution

In contrast to the adaptive algorithm, the fixed level importance splitting algorithm does not need to store traces, making it lightweight and suitable for distribution. Scores are calculated on the fly and only the states that achieve the desired level are retained for further consideration. While the choice of levels remains a problem, an effective strategy is to first use the adaptive algorithm with a relatively high rejection rate to find good fixed levels. An estimate with high confidence can then be generated efficiently by distributing the fixed level algorithm.

Algorithm 1 is our fixed level importance splitting algorithm optimised for distribution. We use the terms server and client to refer to the root and leaf nodes of a network of computational devices or to mean independent computational threads on the same machine. In essence, the server manages the job and the clients perform the simulations. The server initially sends compact representations of the model and property to each client. Thereafter, only the state of the product automaton is communicated. In general, each client returns terminal states of simulations that reached the current level and the server distributes these as initial states for the next round of simulations. Algorithm 1 optimises this. The server requests and distributes only the number of states necessary to restart the simulations that failed to reach the current level, while maintaining the randomness of the selection. Despite this optimisation, however, the performance of this and other importance splitting algorithms will be confounded by the combination of large state size and properties having short time bounds. Under such circumstances it may be preferable to distribute entire instances of the algorithm, as described above.

The memory requirements of Algorithm 1 are minimal. Each client need only store the state of n simulations. As such, it is conceivable to distribute simulations on lightweight computational threads, such as those provided by GPGPU (general purpose computing on graphics processing units).
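Algorithm 1 below gives the distributed version; the following sequential sketch (Python) shows the same per-level bookkeeping on a single machine. The simulate and score interfaces and all names are assumptions made for illustration.

import random

def fixed_level_splitting(initial_state, simulate, score, levels, n):
    """Sequential sketch of fixed level importance splitting.

    simulate(state, s) continues the model-property automaton from `state`
    until its score reaches level s or the property is decided; score(state)
    returns the current score.  `levels` is the increasing sequence
    s_1 < ... < s_m, where s_m is the score at which the property is satisfied.
    """
    states = [initial_state] * n
    estimate = 1.0
    for s in levels:
        states = [simulate(state, s) for state in states]
        reached = [state for state in states if score(state) >= s]
        if not reached:
            raise RuntimeError("no simulation reached level %s; levels are too coarse" % s)
        estimate *= len(reached) / n                      # conditional probability of this level
        # Replace failed simulations by copies branched from states that reached s.
        states = reached + [random.choice(reached) for _ in range(n - len(reached))]
    return estimate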
Algorithm 1: Distributed Fixed Level Importance Splitting
input: s_1 < s_2 < ··· < s_m is a sequence of scores, with s_m = s_φ the score necessary to satisfy property φ
1   γ̂ ← 1 is the initial estimate of γ = P(ω |= φ)
2   server sends compact description of model and observer to k clients
3   each client initialises n simulations
4   for s ← s_1, ..., s_m do
5       each client continues its n simulations from their current state; simulations halt as soon as their scores reach s
6       ∀ clients, client i sends server the number of traces n_i that reached s
7       server calculates γ̂ ← γ̂ n′/(kn), where n′ = ∑ n_i
8       for j ← 1, ..., kn − n′ do
9           server chooses client i at random, with probability n_i/n′
10          client i sends server a state chosen uniformly at random from those that reached s
11          server sends state to the client corresponding to failed simulation j, as the initial state of a new simulation to replace simulation j
output: γ̂

4 Linear Temporal Logic for Importance Splitting

High performance SMC tools, such as [JLS12b, BCLS13], avoid the complexity of standard model checking by compiling the property to a program of size proportional to the formula and memory proportional to the maximum sum of nested time bounds. This program implicitly encodes the model checking automaton, but is exponentially smaller. For example, the property X^k φ can be implemented as a loop that generates k simulation steps before returning the truth of φ in the last state; the property ϑ U^k φ can be implemented as a loop that generates up to k simulation steps while ϑ is true and φ is not true, returning the value of φ in the last state otherwise. If ϑ and φ are atomic, the programs require just O(log k) bits of memory to hold a loop counter (see the sketch at the end of this section). In contrast, the nested property F^{k1}(ϑ ∨ G^{k2} φ) has an O(k2) memory requirement. If ϑ is not true on step i < k1 it may be necessary to simulate up to step i + k2 to decide subformula G^{k2} φ. If ϑ ∨ G^{k2} φ turns out to be false on step i, it will then be necessary to consider the truth of ϑ on step i + 1, noting that the last simulated step could be i + k2. To evaluate this formula it is effectively necessary to remember the truth of ϑ on O(k2) simulation steps. Similar requirements can arise when the until operator (U) is a subformula of a temporal operator. In all such cases the sequence of stored truth values becomes part of the state of the property automaton.

SMC using importance splitting requires that simulations are repeatedly and frequently initialised with the state of the model-property product automaton. If the size of this state is proportional to the time bounds of temporal operators, initialisation may have comparable complexity to simulation. This becomes especially problematic if the state is to be transmitted across relatively slow communication channels for the purposes of distribution. We therefore define a subset of (1), the size of whose automata is not dependent on the bounds of temporal operators:

ϕ = X^k ϕ | ψ U^k ψ | ¬ϕ | ϕ ∨ ϕ | ϕ ∧ ϕ | ϕ ⇒ ϕ | ψ
ψ = X^k ψ | F^k ψ | G^k ψ | α    (5)

The semantics of (5) is the same as (1), but (5) restricts how temporal operators may be combined. In particular, U may not be the subformula of a temporal operator other than X, and temporal operators that are subformulas of other temporal operators may not be combined with Boolean connectives. Temporal operators containing other temporal operators as subformulas may, however, be combined.
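To illustrate the loop-based implementation referred to above, here is a minimal sketch of a monitor for ϑ U^k φ with atomic operands (Python). The next_state, theta and phi interfaces are assumptions made for illustration, not the monitor code of PLASMA-LAB.

def check_bounded_until(state, next_state, theta, phi, k):
    """Decide theta U^k phi over states generated on the fly.

    Follows the semantics of (2): phi must hold at some position j <= k and
    theta must hold at every earlier position.  Only the loop counter is
    stored, so the monitor needs O(log k) bits of memory.
    """
    for j in range(k + 1):
        if phi(state):
            return True            # phi holds at position j <= k
        if not theta(state):
            return False           # theta fails before phi has held
        if j < k:
            state = next_state()   # generate the next simulation state on demand
    return False                   # bound exhausted without satisfying phi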
This logic expresses many useful properties, including nested bounded temporal properties that are not implemented in the numerical model checker PRISM (www.prismmodelchecker.org).

5 Lightweight Observers for Importance Splitting

To facilitate the construction of score functions we implement the logic given by (5) as a set of nested observers. Each observer corresponds to either a temporal operator, a Boolean operator acting on temporal operators, or a predicate describing an atomic property. In our implementation observers are written in a syntax based on the commonly used reactive modules language [AH99], using the notion of ‘guarded commands’ [Dij75] with sequential semantics. The observers are easily implemented in other modelling languages.

An observer comprises a set of guarded commands, any number of which may be enabled and executed on a given simulation step. Updates are performed in syntactic order after all guards have been evaluated, hence the update of one command does not affect the guards of commands in the same observer. In general, the output of one observer is the input to another and observers are therefore executed in reverse order of their nesting.

Observers evaluate states as they are generated by the simulation. Since it may not be possible to decide a property before seeing a certain number of states, observers implement a three valued logic. In Figs. 1, 2 and 3 we use the symbols ?, ⊤ and ⊥ to denote the three values undecided, true and false, respectively. The state of an observer changes only when at least one of its inputs is decided. An observer may reach a deadlock state (no commands enabled) once its output is decided and cannot be changed by further input. A simulation terminates when the output of the root observer is decided, i.e., the property is decided. Simulations may also be paused by the importance splitting algorithm if the score reaches a desired level.

Observers implementing the same temporal operator behave differently according to their level of nesting within a formula. We therefore distinguish outer and inner temporal observers. The temporal operators closest to the root of any branch of the syntax tree induced by a formula are implemented by outer observers. Their output proceeds from undecided to either true or false and then does not change. Inner observers encode temporal operators that are the subformulas of other temporal operators. Their output proceeds from undecided to a possibly alternating sequence of true, false and undecided values because their enclosing operator(s) cause them to evaluate a moving window of states in the execution trace. The inner and outer variants of X, F and G are closely related: outer observers are essentially simplified inner observers. When U is a subformula of X, however, the X is implemented as a delay within the U observer.

In what follows we describe the important aspects of the various observers that implement (5). The accompanying figures include diagrammatic representations of how the observers work and sets of commands written in the form predicate : update.

Figure 1: Connective observers. Initially d = false. (State diagrams omitted; the guarded commands are listed.)
(a) o ← o′ ∧ o′′
  1. ¬d ∧ (¬d′ ∨ ¬d′′) ∧ ¬(¬o′ ∧ d′ ∨ ¬o′′ ∧ d′′)
  2. ¬d ∧ d′ ∧ o′ ∧ d′′ ∧ o′′ : d ← true, o ← true
  3. ¬d ∧ (¬o′ ∧ d′ ∨ ¬o′′ ∧ d′′) : d ← true, o ← false
(b) o ← o′ ⇒ o′′
  1. ¬d ∧ (¬d′ ∧ ¬(d′′ ∧ o′′) ∨ d′ ∧ o′ ∧ ¬d′′)
  2. ¬d ∧ (¬o′ ∧ d′ ∨ o′′ ∧ d′′) : d ← true, o ← true
  3. ¬d ∧ d′ ∧ o′ ∧ d′′ ∧ ¬o′′ : d ← true, o ← false
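As an indication of how such guarded commands translate into executable monitoring code, the following is a minimal three-valued sketch of the conjunction observer of Fig. 1a (Python). The class and method names are assumptions for illustration and do not reproduce the reactive modules syntax used by our tools.

class AndObserver:
    """Three-valued conjunction observer, a sketch of Fig. 1a.

    step() receives the output and decidedness of the two operand observers
    and decides the conjunction as soon as one operand is decidedly false
    (command 3) or both are decidedly true (command 2); otherwise the output
    stays undecided (command 1).  Once decided, the output never changes.
    """
    def __init__(self):
        self.decided = False
        self.output = None            # None encodes the undecided value '?'

    def step(self, o1, d1, o2, d2):
        if not self.decided:
            if (d1 and not o1) or (d2 and not o2):
                self.decided, self.output = True, False   # command 3
            elif d1 and o1 and d2 and o2:
                self.decided, self.output = True, True    # command 2
        return self.output, self.decided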
Each observer has Boolean output variables o and d to indicate respectively the result and whether the property has been decided (observers for atomic formulas omit d). Observers for temporal operators take discrete time bound k as a parameter and use a counter variable w (U uses counter variables w′ and w′′). Inner temporal operators make use of an additional counter, t (U uses t′ and t′′). The inputs of observers are Boolean variables o′ and o′′, with corresponding decidedness d′ and d′′.

Connective Observers

These observers implement Boolean connectives at syntactic level ϕ in (5) and take advantage of the equivalences false ∧ ? = false, true ∨ ? = true, false ⇒ ? = false and ? ⇒ true = true, for any truth value of ?. Figure 1a describes the observer for conjunction and Fig. 1b describes the observer for implication. The observer for disjunction may be derived from that of conjunction by negating all instances of o′ and o′′, and by exchanging o ← true and o ← false. Negation is implemented by inverting the truth assignment of the observer to which it applies, i.e., by exchanging o ← true and o ← false. The connectives may be combined with themselves and with outer temporal operators. Boolean connectives that apply only to atomic properties (i.e., syntactic level α) are implemented directly in formulas within observers for atomic properties.

Inner Temporal Observers

These observers act on a moving window of states created by an enclosing temporal operator. The output may pass from one decided value to the other and also become undecided.

Figure 2: Observers for inner temporal operators. Initially w = t = 0, d = false. (State diagrams omitted; the guarded commands are listed.)
(a) o ← X^k o′
  1. ¬d ∧ d′ ∧ w < k : w ← w + 1
  2. d′ ∧ w = k : d ← true, o ← o′
  3. d ∧ ¬d′ : d ← false
(b) o ← F^k o′
  1. ¬d ∧ d′ ∧ ¬o′ ∧ w < k : w ← w + 1
  2. d′ ∧ o′ : o ← true, d ← true, t ← w
  3. d′ ∧ ¬o′ ∧ t = 0 ∧ w = k : d ← true, o ← false
  4. d ∧ d′ ∧ ¬o′ ∧ t = 0 ∧ w < k : d ← false
  5. d ∧ d′ ∧ ¬o′ ∧ t > 0 : t ← t − 1
  6. d ∧ ¬d′ : d ← false

Figure 2a describes the observer for X^k. Command 1 counts decided input states until bound k is reached. Thereafter command 2 sets the output decided and equal to the value of the input. Figure 2b describes the observer for F^k. While decided inputs are not true, command 1 increments w from 0 to k. If at any time the input is true, command 2 sets the output to true and the "true-counter" t is set to w. Command 5 decrements t on subsequent false inputs. The output remains true while t > 0. If w reaches k while t = 0, command 3 sets the output to false. The observer for G^k may be derived from that of F^k by negating all instances of o′ and ¬o′, and by exchanging o ← true and o ← false.

Outer Temporal Observers

The outer observers for X^k and F^k are not illustrated but may be derived from their respective inner observers given in Fig. 2. For X^k, command 3 is removed and the guard of command 2 is strengthened with ¬d. For F^k, commands 4, 5 and 6, together with all references to counter t, are removed, while the guards of commands 2 and 3 are strengthened by ¬d. The outer observer for G^k can be derived from that of F^k in the same way as described for inner temporal observers. Figure 3 describes the observer for properties of the form X^{kX}(ϑ U^k φ), which can be simplified to implement properties of the form ϑ U^k φ.
Since ϑ and φ may be temporal formulas that are satisfied on different simulation steps in arbitrary order, the observer employs variables w′ and w′′ to respectively count the sequences of ¬φ and ϑ (commands 3 and 5). Variable t′ then records the position of the first φ (command 4), while t′′ records the position of the last ϑ (command 5). Using t′ and t′′, commands 7 and 8 are able to determine if the property is satisfied or falsified, respectively. The X^{kX} part of the formula is implemented by initialising variables w′ and w′′ to −kX, forcing the observer to ignore the first kX decided values of ϑ and φ. In the case of properties of the form ϑ U^k φ, w′ and w′′ are initialised to 0 and the automaton may be simplified by removing commands 1 and 2 and all instances of the expressions w′ ≥ 0 and w′′ ≥ 0.

Figure 3: Observer for o ← X^{kX}(o′′ U^k o′). Initially t′′ = 0, t′ = −1, d = dX = false and w′ = w′′ = −kX (see text). (State diagram omitted; the guarded commands are listed.)
  1. d′ ∧ w′ < 0 : w′ ← w′ + 1
  2. d′′ ∧ w′′ < 0 : w′′ ← w′′ + 1
  3. ¬d ∧ d′ ∧ ¬o′ ∧ w′ ≥ 0 ∧ w′ ≤ k : w′ ← w′ + 1
  4. ¬d ∧ d′ ∧ o′ ∧ w′ ≥ 0 ∧ w′ ≤ k : t′ ← w′, w′ ← k + 1
  5. ¬d ∧ d′′ ∧ o′′ ∧ w′′ ≥ 0 ∧ w′′ < k : w′′ ← w′′ + 1, t′′ ← w′′
  6. ¬d ∧ d′′ ∧ ¬o′′ ∧ w′′ ≥ 0 ∧ w′′ < k : w′′ ← k
  7. ¬d ∧ t′ ≥ 0 ∧ t′′ ≥ t′ − 1 : d ← true, o ← true
  8. ¬d ∧ (t′ < 0 ∧ w′ = k + 1 ∨ w′′ = k ∧ (t′′ < t′ − 1 ∨ t′ < 0 ∧ t′′ ≤ w′ − 1)) : d ← true, o ← false

6 Case Studies

We have implemented our importance splitting framework in PLASMA-LAB [BCLS13] and demonstrate its use on three case studies whose state space is intractable to numerical model checking. The following results do not seek to promote a particular methodology (adaptive or fixed level algorithm, distributed or single machine), but serve to illustrate the flexibility of our platform. The software, models and observers can be downloaded from our website (projects.inria.fr/plasma-lab/importance-splitting). The leader election and dining philosophers models are also illustrated on the PRISM case studies website (www.prismmodelchecker.org/casestudies).

For each model we performed a number of experiments to compare the performance of the fixed and adaptive importance splitting algorithms with and without distribution, using different simulation budgets and levels. Our results are illustrated in the form of empirical cumulative probability distributions of 100 estimates, noting that a perfect (zero variance) estimator distribution would be represented by a single step. The results are also summarised in Table 1. The probabilities we estimate are all close to 10^-6 and are marked on the figures with a vertical line. Since we are not able to use numerical techniques to calculate the true probabilities, we use the average of 200 low variance estimates as our best overall estimate.

As a reference, we applied the adaptive algorithm to each model using a single computational thread. We chose parameters to maximise the number of levels and thus minimise the variance for a given score function and budget. The resulting distributions, sampled at every tenth percentile, are plotted with circular markers in the figures. Over these points we superimpose the results of applying a single instance of the fixed level algorithm with just a few levels. We also superimpose the average estimates of five parallel threads running the fixed level algorithm, using the same levels.

The figures confirm our expectation that the fixed level algorithm with few levels is outperformed by the adaptive algorithm.
The figures also demonstrate that the average of parallel instances of the fixed level algorithm is very close to the performance of the adaptive algorithm. The timings given in Table 1 show that the distributed approach achieves these results in less time. For comparison we also include the estimated time of using a simple Monte Carlo (MC) estimator to achieve the same standard deviation. Importance splitting gives more than three orders of magnitude improvement in all cases. All results were generated using an Intel Core i7-3740 CPU with 4 cores running at 2.7 GHz.

Figure 4: Leader election (cumulative distribution of estimates ×10^7 for the adaptive, single fixed and parallel fixed algorithms).
Figure 5: Dining philosophers (cumulative distribution of estimates ×10^7 for the adaptive, single fixed and parallel fixed algorithms).
Figure 6: Dependent counters (cumulative distribution of estimates ×10^7 for the adaptive, single fixed and parallel fixed algorithms).

                  Adaptive       Single         Parallel
Leader
  Std. dev.       4.8×10^-8      1.3×10^-7      5.2×10^-8
  Levels          20             6              6
  Budget          1000           1000           5×1000
  Time (MC)       7.3s (30h)     2.5s (4.4h)    5.8s (5.0h)
Philosophers
  Std. dev.       4.2×10^-7      7.7×10^-7      2.8×10^-7
  Levels          109            5              5
  Budget          1000           1000           5×1000
  Time (MC)       5.4s (2.3h)    1.7s (41m)     3.7s (1.4h)
Counters
  Std. dev.       2.1×10^-7      5.0×10^-7      2.3×10^-7
  Levels          3942           4              4
  Budget          500            500            5×500
  Time (MC)       15s (7.5h)     2.8s (1.2h)    4.8s (1.9h)

Table 1: Summary of results.

In the remainder of this section we briefly describe our models and their associated properties and score functions.

Leader Election

Our leader election case study is based on the PRISM model of the synchronous leader election protocol of [IR90]. With N = 20 processes and K = 6 probabilistic choices the model has approximately 1.2×10^18 states. We consider the probability of the property G^420 ¬elected, where elected denotes the state where a leader has been elected. Our chosen score function uses the time bound of the G operator to give nominal scores between 0 and 420. The model constrains these to only 20 actual levels (some scores are equivalent with respect to the model and property), but with evenly distributed probability. For the fixed level algorithm we use scores of 70, 140, 210, 280, 350 and 420.

Dining Philosophers

Our dining philosophers case study extends the PRISM model of the fair probabilistic protocol of [LR81]. With 150 philosophers our model contains approximately 2.3×10^144 states. We consider the probability of the property F^30 Phil eats, where Phil is the name of an arbitrary philosopher. The adaptive algorithm uses the heuristic score function described in [JLS14], which includes the five logical levels used by the fixed level algorithm. Between these levels the heuristic favours short paths, based on the assumption that as time runs out the property is less likely to be satisfied.

Dependent Counters

Our dependent counters case study comprises ten counters, initially set to zero, that with some probability dependent on the values of the other counters are either incremented or reset to zero. This can be viewed as modelling an abstract computational process, a set of reservoirs of finite capacity, or as the failure and repair of ten different types of components in a system, etc.
With a maximum count of 10, the model has approximately 2.6×10^10 states. We consider the probability of the property X^1(¬init U^1000 complete), where init and complete denote the initial state and the state where all counters have reached their maximum value. Our score function ranges over values between 0 and 99, but the probabilities are not evenly distributed. With a budget of 500, uniformly distributed fixed scores fail to produce traces that satisfy the property until the difference between the last two levels is about 5. Note that our budget is limited to only 500 simulations due to the length of the traces that must be stored by the adaptive algorithm. We maintain this budget for the fixed level algorithm to simplify comparison. After a small amount of trial and error, we adopted fixed scores of 80, 90, 95 and 99.

7 Challenges and Prospects

Our results demonstrate the effectiveness and flexibility of our framework with discrete time properties applied to standard case studies. Future challenges include industrial scale examples and the implementation of continuous time properties. We also intend to provide proofs of the correctness of our observers and of our logic's memory requirements.

Although the manual construction of score functions adds to the overall cost of using importance splitting, we believe that distribution relaxes the need for these to be highly optimised. We nevertheless expect that it will be possible to construct good score functions automatically using statistical learning techniques.

Acknowledgement

This work was partially supported by the European Union Seventh Framework Programme under grant agreement number 318490 (SENSATION).

Bibliography

[AH99] R. Alur, T. A. Henzinger. Reactive Modules. Formal Methods in System Design 15(1):7–48, 1999.

[BBB+10] A. Basu, S. Bensalem, M. Bozga, B. Caillaud, B. Delahaye, A. Legay. Statistical Abstraction and Model Checking of Large Heterogeneous Systems. In Hatcliff and Zucca (eds.), Formal Techniques for Distributed Systems. LNCS 6117, pp. 32–46. Springer, 2010.

[BCLS13] B. Boyer, K. Corre, A. Legay, S. Sedwards. PLASMA-lab: A Flexible, Distributable Statistical Model Checking Library. In Joshi et al. (eds.), Quantitative Evaluation of Systems. LNCS 8054, pp. 160–164. Springer, 2013.

[BLS06] A. Bauer, M. Leucker, C. Schallhart. Monitoring of Real-Time Properties. In FSTTCS 2006: Foundations of Software Technology and Theoretical Computer Science. Pp. 260–272. Springer, 2006.

[CDFG12] F. Cérou, P. Del Moral, T. Furon, A. Guyader. Sequential Monte Carlo for rare event estimation. Statistics and Computing 22:795–808, 2012.

[CES09] E. M. Clarke, E. A. Emerson, J. Sifakis. Model checking: algorithmic verification and debugging. Commun. ACM 52(11):74–84, November 2009.

[CG07] F. Cérou, A. Guyader. Adaptive multilevel splitting for rare event analysis. Stochastic Analysis and Applications 25:417–443, 2007.

[Dij75] E. W. Dijkstra. Guarded commands, nondeterminacy and formal derivation of programs. Commun. ACM 18(8):453–457, August 1975.

[EFH+03] C. Eisner, D. Fisman, J. Havlicek, Y. Lustig, A. McIsaac, D. Van Campenhout. Reasoning with temporal logic on truncated paths. In Computer Aided Verification. Pp. 27–39. Springer, 2003.

[FS04] B. Finkbeiner, H. Sipma. Checking Finite Traces Using Alternating Automata. Formal Methods in System Design 24(2):101–127, 2004.

[Gei01] M. C. W. Geilen.
On the Construction of Monitors for Temporal Logic Properties. Electronic Notes in Theoretical Computer Science 55(2):181–199, 2001.

[GH01] D. Giannakopoulou, K. Havelund. Automata-based verification of temporal properties on running programs. In Proceedings of the 16th Annual International Conference on Automated Software Engineering. Pp. 412–416. IEEE, Nov 2001.

[GPVW95] R. Gerth, D. Peled, M. Y. Vardi, P. Wolper. Simple On-the-fly Automatic Verification of Linear Temporal Logic. In Protocol Specification Testing and Verification. Pp. 3–18. Chapman & Hall, 1995.

[HH64] J. M. Hammersley, D. C. Handscomb. Monte Carlo Methods. Methuen & Co., 1964.

[HR02] K. Havelund, G. Roşu. Synthesizing Monitors for Safety Properties. In Katoen and Stevens (eds.), Tools and Algorithms for the Construction and Analysis of Systems. LNCS 2280, pp. 342–356. Springer, 2002.

[IR90] A. Itai, M. Rodeh. Symmetry breaking in distributed networks. Information and Computation 88(1):60–87, 1990.

[JLS12a] C. Jegourel, A. Legay, S. Sedwards. Cross-Entropy Optimisation of Importance Sampling Parameters for Statistical Model Checking. In Madhusudan and Seshia (eds.), Computer Aided Verification. LNCS 7358, pp. 327–342. Springer, 2012.

[JLS12b] C. Jegourel, A. Legay, S. Sedwards. A Platform for High Performance Statistical Model Checking – PLASMA. In Flanagan and König (eds.), Tools and Algorithms for the Construction and Analysis of Systems. LNCS 7214, pp. 498–503. Springer, 2012.

[JLS13] C. Jegourel, A. Legay, S. Sedwards. Importance Splitting for Statistical Model Checking Rare Properties. In Computer Aided Verification. LNCS 8044, pp. 576–591. Springer, 2013.

[JLS14] C. Jegourel, A. Legay, S. Sedwards. An Effective Heuristic for Adaptive Importance Splitting in Statistical Model Checking. In Margaria and Steffen (eds.), Leveraging Applications of Formal Methods, Verification and Validation. Specialized Techniques and Applications. LNCS 8803, pp. 143–159. Springer, 2014.

[Kah50] H. Kahn. Random sampling (Monte Carlo) techniques in neutron attenuation problems. Nucleonics 6(5):27, 1950.

[KH51] H. Kahn, T. E. Harris. Estimation of Particle Transmission by Random Sampling. In Applied Mathematics, Series 12, 5. National Bureau of Standards, 1951.

[LR81] D. Lehmann, M. O. Rabin. On the Advantage of Free Choice: A Symmetric and Fully Distributed Solution to the Dining Philosophers Problem. In Proc. 8th Ann. Symposium on Principles of Programming Languages. Pp. 133–138. 1981.

[Nie92] H. Niederreiter. Random Number Generation and Quasi-Monte Carlo Methods. Society for Industrial and Applied Mathematics, 1992.

[RT09] G. Rubino, B. Tuffin (eds.). Rare Event Simulation using Monte Carlo Methods. John Wiley & Sons, Ltd, 2009.

[VV91] M. Villén-Altamirano, J. Villén-Altamirano. RESTART: A Method for Accelerating Rare Event Simulations. In Cohen and Pack (eds.), Queueing, Performance and Control in ATM. Pp. 71–76. Elsevier, 1991.

[YKNP06] H. S. Younes, M. Kwiatkowska, G. Norman, D. Parker. Numerical vs. statistical probabilistic model checking. International Journal on Software Tools for Technology Transfer 8(3):216–228, 2006.