Model Checking Communicating Processes: Run Graphs, Graph Grammars, and MSO Electronic Communications of the EASST Volume 47 (2012) Proceedings of the 11th International Workshop on Graph Transformation and Visual Modeling Techniques (GTVMT 2012) Model Checking Communicating Processes: Run Graphs, Graph Grammars, and MSO Alexander Heußner 15 pages Guest Editors: Andrew Fish, Leen Lambers Managing Editors: Tiziana Margaria, Julia Padberg, Gabriele Taentzer ECEASST Home Page: http://www.easst.org/eceasst/ ISSN 1863-2122 http://www.easst.org/eceasst/ ECEASST Model Checking Communicating Processes: Run Graphs, Graph Grammars, and MSO Alexander Heußner Université Libre de Bruxelles – Belgium Abstract: The formal model of recursive communicating processes (RCPS) is im- portant in practice but does not allows to derive decidability results for model check- ing questions easily. We focus a partial order representation of RCPS’s execution by graphs—so called run graphs, and suggest an under-approximative verification approach based on a bounded-treewidth requirement. This allows to directly derive positive decidability results for MSO model checking (seen as partial order logic on run graphs) based on a context-freeness argument for restricted classes run graphs. Keywords: Pushdown Systems, Hyperedge Replacement Grammars, MSO Communicating ProcesseS (CPS) allow to model an important class of distributed processes that asynchronously communicate, e.g., via TCP-based channels or MPI, over a network topology. Hence, synchronization takes place on unbounded, lossless fifo channels. As CPS subsume com- municating finite-state machines, they inherit their well-known undecidability results, e.g., for reachability [BZ83]. Local infinite data and unbounded recursion suggests to model each process by a finite control flow automaton with a pushdown stack leading to recursive CPS (RCPS). The latter allow to reduce the Post correspondence problem or the emptiness-testing of the intersec- tion of two context free languages to RCPS’s reachability question [Ram00], and thus give rise to a second source of undecidability. Hence, classical verification questions become a challenging problem on these models, especially when leaving behind simple reachability questions. Our Contribution Extending positive decidability results from the (control-state) reachabil- ity problem of a (semantically) restricted class of RCPS introduced in [HLMS11] to the model checking problem of linear temporal logics (LTL) proves to be impossible [Ram00], as one can (ab)use LTL interpreted on an interleaving run to locally synchronize events that take place on different processes and that are a priori independent. Thus, we propose to leave CPS’s interleaving-based semantics behind. We represent the underlying partial order of events of a CPS’s (finite) execution as graphs—so called run graphs—and interpret monadic second order logic (MSO) thereupon. However, it is well-known that MSO is undecidable on general graphs. We avoid this intricacy by extending some of our previous insights, e.g., the context-freeness of runs of eager RCPS on non-confluent architectures [HLMS11], to the graph setting. We show that one can represent the run graphs of several (semantically) restricted classes of finite CPS/RCPS with the help of context-free graph grammars, in our case hyperedge replacement grammars [DKH97]. As graphs generated by these grammars have bounded treewidth, we can easily transfer the fundamental positive decidability result on MSO model checking of bounded- treewidth graphs [CE11] to our setting. The latter was already proposed in [Heu10] but is first detailed in this article. This gives rise to a new way of approaching CPS from a partial-order per- spective, and proposes an unifying view on known semantic boundedness conditions via graph measures. 1 / 15 Volume 47 (2012) Model Checking Communicating Processes Overview In the following section, we introduce the formal model of (R)CPS as well as give an overview of known restrictions and decidability results for a suite of reachability questions. Next, we recall some notions on MSO interpreted on graphs, and hyperedge replacement gram- mars, before introducing run graphs. Last, we give decidability results for MSO interpreted on run graphs of two important classes of (R)CPS. We refer to the appendix for additional de- tails/proofs. Notation Given a set S, let |S| denote its cardinality. For an I-indexed family of sets (Si)i∈I , we write elements of ∏i∈I S i in bold face, i.e., s ∈ ∏i∈I Si. The i-component of s is written si ∈ Si, and we identify s with the indexed family of elements (si)i∈I . An alphabet Σ is a finite set of letters. We write Σ∗ for the set of all finite words over Σ. Let ε denote the empty word. Let w = w1w2 ···∈ Σ∗, then its forgetful projection on Γ ⊆ Σ is the word w′ = w′1w ′ 2 ... with wi = w ′ i if wi ∈ Γ, else wi = ε (for i ∈{1,2,...}). A labeled transition system (LTS) S = 〈S,s0,Act,→〉 is given by a set of states S, an initial state s0, an action alphabet Act, and a (labeled) transition relation →⊆ S×Act×S. We usually write s a−→ s′ in place of (s,a,s′)∈→. A pushdown system is given by a tuple 〈Z,z0,Act,Γ,∆〉 where Z is a finite set of (control) states, z0 ∈ Z the initial state, Γ a finite stack alphabet, and Act = {push(γ),pop(γ)|γ ∈ Γ} the set of pushdown actions where transition rules are given by ∆ ⊆ Z ×Act×Z. The semantics of a pushdown system is given by an infinite transition system over configurations in Z ×Γ∗, i.e., tuples of a control state and a word representation of the stack’s content. Note that a push(γ) action adds the letter γ to the top of the current stack content, and that a pop(γ) action blocks if the topmost letter is not equal to γ . 1 Verifying (Recursive) Communicating Processes A topology T is a directed graph whose vertices are processes Proc and whose edges are chan- nels Ch. The set of possible communication actions of a process p ∈ Proc are defined by Comp(T,M) ={c!m | c ∈Ch,src(c) = p,m ∈ M}∪{c?m | c ∈C,dst(c) = p,m ∈ M}. We denote by c!m the sending of message m into channel c, and by c?m the reception of message m from c. Definition 1 A system of communicating processes (CPS) Q = 〈T,M,(Sp)p∈Proc〉 is given by a topology T, a message alphabet M, and an LTS Sp for each process p ∈ Proc of the form 〈Sp,sp0,Act p,→p〉 such that: – the action alphabets Act p, p ∈ Proc, are pairwise disjoint, and – Act pcom = Act p ∩(C×{!,?}×M)⊆ Comp(T,M) for each p ∈ Proc. States sp ∈ Sp are called local states of p. We write S = ∏p∈Proc Sp for the set of global states. Note that the sets Sp, and hence S, may be infinite. A finite CPS is a CPS where for all p ∈ Proc the Sp are finite. The operational semantics of Q is defined by a global LTS JQK = 〈X,x0,Act,→〉, where X = S×(M∗)C is the set of configurations, x0 = (s0,(ε)c∈Ch) is the initial configuration (note that channels are initially empty), Act = ⋃ p∈Proc Act p is the set of actions, and →⊆X ×Act×X is Proc. GTVMT 2012 2 / 15 ECEASST the transition relation; the latter is defined as follows: for a ∈ Act p, (s1,w1) a−→ (s2,w2) if the subsequent conditions are satisfied: (i) sp1 a−→p sp2 and s q 1 = s q 2 for all q ∈ Proc with q 6= p, (ii) if a ∈ Act ploc then w1 = w2, (iii) if a = c!m then wc2 = w c 1 ·m and w d 2 = w d 1 for all d ∈C with d 6= c, (iv) if a = c?m then m·wc2 = w c 1 and w d 2 = w d 1 for all d ∈C with d 6= c. A (finite) run in the LTS Q is an alternating sequence ρ = (x0,a1,x1,...,an,xn) of configu- rations xi ∈ X and actions ai ∈ Act that satisfies, for all 1 ≤ i ≤ n, xi−1 ai−→ xi. A configuration x ∈ X is reachable in a CPS Q if there exists a finite run of Q from the initial configuration x0 to x. We define the reachability set of Q as Reach(Q) = {x ∈ X | x is reachable in Q}. For a given CPS Q and a global state s ∈ S, the state reachability problem for CPS asks whether Reach(Q)∩{s}×(M∗)C is non-empty. A pair of send/receive actions ai = c!m,a j = c?m is called matching in ρ if i < j and the number of receives on c within ai ···a j equals the length of c in xi. In the following we will mainly focus runs of a certain form, so called eager runs. Definition 2 A run ρ = (x0,a1,x1,...,an,xn) is eager if for all ai with 1 ≤ i ≤ n when ai is a receive action then i > 1 and ai−1 is its matching send action. A configuration x ∈ X is eager-reachable in a CPS Q if there exists a finite, eager run from the initial configuration x0 to x. The eager-reachability set of Q is the set Reacheager(Q) of eager-reachable configurations. We say that a CPS Q is eager when Reacheager(Q) = Reach(Q). One can show that eagerness of CPS arises under some natural (and decidable) restriction on cyclic communication, called mutex; the latter is a generalization of the well-known half-duplex restriction from mutual channels to larger topologies. Deciding the eagerness of a finite CPS proves to be undecidable, however, deciding whether a given finite CPS is mutex is complete in PSPACE [HLMS11]. Eager CPS subsume globally existentially 1-bounded finite-state ma- chines [LM04]. Recursive communicating processes (RCPS) R = 〈T,M,(Dp)p∈Proc〉 are CPS whose local transition systems Dp are pushdown systems. Formally, each Dp is given by 〈Z p,zp0,Act p,Γp,∆p〉 where Z p is a finite set of control states, zp0 an initial state, Act p an alphabet of actions, Γp a stack alphabet and ∆p ⊆ Z p ×Act p ×Z p the set of actions. The (local) semantics of each Dp is the common one and we assert the stack operations to be expressed by Act pstack = {push(γ),pop(γ) | γ ∈ Γ p}⊆ Aploc; local configurations are written (zp,up) ∈ Z p ×(Γp)∗. Note that each pop action must have a matching push action before, such that the forgetful projection of a local run of Dp on Act pstack is a Dyck word, i.e., a word where the push(γ) and pop(γ) actions for γ ∈ Γ are “well-nested”. Let Z = ∏p∈Proc Z p the set of global control states. A state of R is written s = (z,u) where sp = (zp,up) for each p ∈ Proc. For a given RCPS R and a global control state z ∈ Z, the state reachability problem for RCPS asks whether Reach(R)∩{z}×(∏p∈Proc((Γp)∗))×(M∗)C is empty. We recall the following decidability results for (R)CPS: 3 / 15 Volume 47 (2012) Model Checking Communicating Processes Fact 1 The state reachability problem is undecidable for finite CPS, even if we restrict the topology to two processes that are mutually connected by two channels. [BZ83] Fact 2 The state eager-reachability problem for finite CPS is decidable. [HLMS11] Fact 3 The RCPS state eager-reachability problem is undecidable when the underlying topol- ogy contains at least two pushdown processes that are connected by one channel. [HLMS11] Even if eagerness is sufficient for arriving at positive results for finite CPS, we need additional architectural constraints to gain the following positive reachability result for RCPS. Fact 4 A typed topology has a deciable RCPS state eager-reachability problem if and only if it is non-confluent. The problem is EXPTIME-complete in the latter case. [HLMS11] The previous result considers a semantic restriction against the direct synchronization of a RCPS’s local pushdown systems, inspired by, e.g., [LMP08, CV09]. A typed topology restricts the usage of a (point-to-point) channels as follows: either the process sending on this channel can use its channel end only when its local stack empty, or the receiving process can read only when its local stack empty, or both of the previous two restrictions (i.e., sending and receiving requires the active process’s local stacks to be empty). A typed topology is confluent if there exists a sequence (p0,c1, p1,...,cn, pn) of distinct pro- cesses pi ∈Proc (for 0≤ i≤n) and channels c j ∈Ch (1≤ j ≤n) such that p0 and pn can use their channel ends of c1 and cn without restriction. A non-confluent typed topology is not confluent. Informally, non-confluent topologies do not allow to let the information generated/read by two pushdown systems to “flow together”. The proof of Fact 4 relies on the following: for each eager run of a non-confluent RCPS one can efficiently construct an order-equivalent run whose forgetful projection onto Actstack is globally a Dyck word, i.e., which can be simulated on one exponentially larger global pushdown process. We capture the latter by the following definition. Definition 3 A finite run ρ of an RCPS is one-stack if the forgetful projection of a′1 ···a ′ n on⋃ p∈P Act p stack is a Dyck word. Lemma 1 All runs of eager RCPS over non-confluent architectures are one-stack. Lemma 2 Given an RCPS, it is undecidable whether all its runs are one-stack. Remark 1 Directly extending the previous positive results from safety/reachability verification to the model checking of ω -regular properties is however not possible. Let us take a look at an RCPS consisting of two local pushdown process with no channel in between, hence, an RCPS on a non-confluent architecture where all runs are eager. Interpreting linear temporal logics (LTL) on runs of this RCPS allows to express the synchronization of two pushdown processes via a logical formula [Ram00]. Restricting the logic to a “weaker” (from the standpoint of expressiveness) fragment does not help either, as the negative result can already derived for the X -less fragement of LTL interpreted only on the sequence of actions induced by a run. The Proc. GTVMT 2012 4 / 15 ECEASST culprit can be found in the possibility to express the synchronization of independent events via a formula that is interpreted on the global interleaving. Regarding our example: when there is no communication between the underlying local processes, all fireings of actions on both processes are assumed independent. Thus we propose to abandon interleaving-based semantics in favor of a partial-order one and to consider (temporal) logics based on this structure. In the following, we will focus monadic second order logic (MSO) on a rendering of the underlying partial of a run of a CPS into graphs. 2 Run Graphs and MSO Definition 4 For finite alphabets Λ and Π, a (Λ,Π)-labeled directed graph is a tuple G = 〈V,A,arc,λ ,π〉 where V is the finite, non-empty set of vertices, A the set of arcs (often called edges), the function arc maps each arc to an ordered pair of vertices, as well as λ : V → 2Λ labels each vertex by an element from 2Λ and π : A → Π labels each arc by an element from Π. For arc(a) = (v,v′) where a ∈ A and v,v′ ∈ V , we say that v is the source vertex of a and v′ its destination. Given a graph G = 〈V,A,arc,λ ,π〉, then a graph G′ = 〈V ′,A′,arc′,λ ′,π′〉 is a subgraph of G if V ′ ⊆ V and A′ ⊆ V , and arc′, λ ′, π′ can be derived by restricting the original counterparts to V ′ and A′. An induced subgraph demands in addition that for all a ∈ A with arc(a) = (x,y) and x,y ∈ V ′ it holds that a ∈ A′. For a ∈ A let G\a be the graph obtained by removing a from G and by contracting the source and destination of a to one new vertex. A graph H is a minor of a graph G if H is isomorphic to a graph G′′ that can be derived from G by finitely many arc contractions, as well as deletions of arcs and vertices (the latter implies the deletion of arcs connected to these vertices). Given a graph G, we define the treewidth of G as usual via graph decompositions [RS86]. The treewidth measures by a natural number how “close” G is to a tree where trees have treewidth 1. (Recall that the treewidth of a graph corresponds to the size of a separator for the graph, i.e., the number of edges one has to delete to separate the graph into components, which equals 1 for trees.) A class of graphs has bounded treewidth if there exists a constant k ∈ N such that all graphs in this class have a treewidth lower or equal to k; else it has unbounded treewidth. Fact 5 For k ∈N, let Gk be the k×k grid; then (i) Gk has treewidth k, thus (ii) the class of all grids {Gk | k ∈N} has unbounded treewidth; further, (iii) if H is a minor of G then the treewidth of H is lower or equal to the one of G. [Bod98] Monadic Second Order Logic on Graphs We introduce MSOV interpreted over the previ- ously introduced graphs where formulas are constructed as below for first-order variables x,y over vertices, and X is a second-order variable over sets of vertices. Correct MSOV formulae are given by the following BNF expression: ϕ ::= x = y | Labl(x) | Arc(x,y) | x ∈ X | ϕ ∧ϕ | ¬ϕ | ∃x(ϕ) | ∃X(ϕ) Informally, we interpret = as equality on V , the predicate Labl(x) for l ∈ 2Λ is true if the vertex x is labeled by l, and x ∈ X is true if x is in the set represented by the second order variable X . The core of MSOV is the incidence predicate Arc(x,y) that is true if there exists an arc whose 5 / 15 Volume 47 (2012) Model Checking Communicating Processes source vertex is represented by x and whose destination is represented by y. Note that we have existential quantification over both first and second order variables. (A formal introduction of the semantics can be found in [CE11].) We write G |= ϕ if a graph G satisfies a formula ϕ . We add the usual additional syntactic sugar by introducing (ϕ ∨φ), (ϕ → φ), and (∀x(ϕ)). MSOA (often written MSO2 or MSO-2 in the literature) is introduced analogously whereas we additionally allow first and second order variables over arcs, and first- and second-order quan- tification over arcs. The central difference between MSOA and MSOV is the ternary incidence predicate Arc(x,y,z) in MSOA which is true if x is an arc with source y and destination z. We extend Labl(x) to include arc labellings when x is an arc and l ∈ Π, as well as demand the =-relation to extend to arcs. Note that ( ∃x ( Arc(x,y,z) )) is equivalent to the incidence predicate of MSOV . Thus, we can directly translate any formula from MSOV to MSOA. The reverse direction only holds if we restrict ourselves to the class of graphs of bounded treewidth [CE11]. As this is the case in the following, we will not distinguish between MSOV and MSOA, thus only refer to “MSO”. Example 1 The following MSOA formula χ(x,y,l) (with free variables x,y,l) expresses a “causal” reachability problem: there exists an l-labeled path from vertex x to vertex y: χ(x,y,l)≡∀X ( x ∈ X ∧∀u,v ( u ∈ X ∧arc(z,u,v)∧Labl(z)→ v ∈ X ) → y ∈ X ) Then again, we can characterize classes of graphs that are “incompatible” with MSO as follows: Fact 6 If a class of graphs contains for each k ∈N a graph G that has Gk as induced subgraph, then MSO is undecidable on this class of graphs. [See75] Hyperedge Replacement Grammars We introduce hypergraphs as extension of “classical” graphs. A ((Λ,Π)-labeled) hypergraph is tuple H = 〈V,A,arc,λ ,π〉, analogous to the previous definition of graphs, whereas A is a finite sorted set A = ⋃ i∈N Ai (i.e., the Ai are disjoint), whose elements are called hyperarcs (or hyperedges), and we define arity(a) = i if a ∈ Ai; further, arc maps an arc a ∈ Ai to a sequence of nodes of size i, i.e., arc(a) = (v1,··· ,vn) for n = arity(a). Note that the previously introduced graphs are hypergraphs where the only possible arity of (hyper-)arcs is 2. Embeddable hypergraphs are a tuple 〈H,ext〉 where ext is a sequence of distinct “external” vertices (v1,··· ,vk) of H (with v1,··· ,vk ∈ V ). We abuse notation and write H for the tuple 〈H,ext〉 and ext(H) for the sequence ext; further let type map an embeddable hypergraph H to the size of its sequence ext(H). Let us assert that all the hypergraphs in the remainder are over the same two set of labels Λ and Π. We depict external vertices by • and others by ◦. We define the operation of hyperedge replacement as follows: let H be a hypergraph, a∈A one of its arcs, and H′ an embeddable hypergraph disjoint from H with arity(a) = type(H′) (if they are not disjoint, we take disjoint copies the usual way), then H[a ←H′] is the hypergraph that is equivalent to the disjoint union of H and H′ where we deleted the arc a in H and fused arc(a) = (v1,··· ,vn) componentwise with ext(H′) = (v′1,··· ,v ′ n). Further, we adapted the labellings of vertices accordingly as disjoint union of the labellings of H and H′ where we assign the union of the original labels to the fused vertices. Proc. GTVMT 2012 6 / 15 ECEASST A hyperedge replacement grammar (HRG) is a tuple G = 〈N,T,P,H0〉 where N and T are the sets of non-terminal and terminal hyperarcs. H0 is the initial hypergraph. P is a finite set of production rules that map a hyperedge a ∈ N to an embeddable hypergraph H such that arity(a) = type(H) and all vertices in ext(H) are labeled by /0. We write rules as R : a ↪→ H where we add the syntactic sugar of an identifier R. The width of an HRG G is the maximal number of vertices of the hypergraphs at the right-hand sides of all rules in P minus one. Given an HRG G = 〈N,T,P,H0〉, we define the derivation relation ⇒G between two hyper- graphs H,H′ as follows: H⇒G H′ iff H′ = H[a ←Hi], and a ↪→Hi is a rule in P for a ∈ N and an embedded hypergraph Hi. Let L(G) be the class of hypergraphs that can be derived from H0 in a finite number of ⇒G-derivations, and that do only contain arcs in T . We refer to [DKH97] for more details on HRG and HRG languages. Fact 7 Given an HRG G of width k (for k ∈ N), then the treewidth of the class of graphs L(G) is bounded by k. [Lau88] As graphs generated by HRG have bounded treewidth, one can draw a bridge to the known fundamental connection of decidability of MSO on classes of graphs with bounded treewidth: Fact 8 Given a hyperedge replacement grammar G that describes a class of graphs, and an MSO formula ϕ , then checking whether there exists a graph G in this class with G |= ϕ is decidable. (follows directly by combining Fact 7 with [CE11]) From CPS to Run Graphs In Section 1 we focused an interleaving-based semantics by view- ing CPS runs as alternating sequences of configurations and actions. Alternatively, we can regard runs as linear sequence of events that are labeled by actions. Formally, let E be a set of events, λ : E → Act their labelling by actions, then a finite run ρ is a word in E∗. Analogously to the partition of Act into Act p for p ∈ Proc, we partition E into sets E p. Due to the underlying semantics of a CPS, the events in the sets E p (for p ∈ Proc) are linearly ordered; interprocess dependencies arise via communication, i.e., via events labeled by matching pairs of send and receive actions. For RCPS, we also need to take a closer look on events labeled by matching push/pop actions. We thus introduce the three order relations succp,