Detecting Deadlocks in Formal System Models with Condition Synchronization Electronic Communications of the EASST Volume 076 (2019) Automated Verification of Critical Systems 2018 (AVoCS 2018) Detecting Deadlocks in Formal System Models with Condition Synchronization Eduard Kamburjan 19 pages Guest Editors: David Pichardie, Mihaela Sighireanu ECEASST Home Page: http://www.easst.org/eceasst/ ISSN 1863-2122 http://www.easst.org/eceasst/ ECEASST Detecting Deadlocks in Formal System Models with Condition Synchronization Eduard Kamburjan Department of Computer Science, Technische Universität Darmstadt, Germany kamburjan@cs.tu-darmstadt.de Abstract: We present a novel notion of deadlock for synchronization on arbitrary boolean conditions and a sound, fully automatic deadlock analysis. Contrary to other approaches, our analysis aims to detect deadlocks caused by faulty system design, rather than implementation bugs. We analyze synchronization on boolean conditions on the fields of an object instead of targeting specific synchronization primitives. As usual, a deadlock is a circular dependency between multiple tasks. A task depends on a second task if the execution of this second task has a side- effect that makes the blocking guard-condition of the first one evaluate to true. This requires an analysis of the computations in a method beyond syntactic properties and we integrate a logical validity calculus to do so. Keywords: Deadlock, Dependency, Active Object 1 Introduction Deadlock is an essential notion of error in distributed systems and is commonly defined as a blocked configuration with circular dependencies among multiple tasks. Deadlocks have been examined for different notions of dependency: Resource dependencies are defined between ac- quire and release actions, or message dependencies, which are defined between receiving and sending actions on channels, and other notions based on other synchronization primitives. The control flow of a system design or model can be described using common synchronization primitives like locks, in most languages. The primitives reduce to one of the aforementioned dependency notions and allow to use dependency analyses for deadlock detection. Systems however are rarely directly designed with concrete primitives in mind – the design makes use of the more natural and abstract synchronization patterns with synchronization on conditions, which can be compiled into more concrete synchronization primitives. Condition synchronization can be expressed with, e.g., a statement await i>0, which suspends the active task until the guard i>0 becomes true. Such a statement is not available in most languages, but can be seen as an abstraction to the established monitor pattern. Using a condition synchronization statement is nearer to the modelers intention of when a task will resume. Condition synchronization can be compiled into other synchronization patterns, e.g., monitors, but requires the addition of low-level synchronization primitives, such as condition variables and locks. Performing a deadlock analysis on the compiled code is thus less helpful for the system designer, as it operates on a lower level of abstraction than his system model. We propose a formalization of dependency that fits the intuition of the system designer better than approaches 1 / 19 Volume 076 (2019) mailto:kamburjan@cs.tu-darmstadt.de Detecting Deadlocks in Formal System Models with Condition Synchronization based on translation into low-level synchronization primitives. A task t1 depends on t2 if the continuation of t2 would make the guard b of t1 true at a point where t1 may be scheduled. This notion of dependency requires to evaluate the guard b and to analyze all side-effects of the continuation of t2. The analysis builds a dependency graph: First, for each field new dependencies are added from each read in a condition synchronization to each write. In the second step, all those edges are removed, for which we can show that the execution of the writing method will never make the guard in question true. An await statement for condition synchronization is implemented by the object-oriented Ab- stract Behavioral Specification (ABS) modeling language [JHS+10]. However, the available deadlock analysis tools for ABS [FAG13, GLL16, GDJ+17] do not consider such await state- ments. Motivated by this shortcoming, we implemented our approach for ABS, while the theory can be applied to other languages. Our implementation extends the Deadlock Checker for Con- current Objects (DECO) tool [FAG13] in the Static Analysis for Concurrent Objects (SACO) toolsuite, and integrates the KeY-ABS theorem prover [DBH15a] to discard dependencies. We evaluate our approach on industrial case studies. The precision depends on the commu- nication structure and the complexity of required SMT-theories for the types occurring in the program. While false positive potential deadlocks are detected, the analysis is precise enough to provide enough user feedback to manually discard the remaining potential deadlocks: The number of false positives in all but one case studies is small enough to check them all manually. As condition synchronization is nearer to the modelers intention when a task will resume, it allows to detect deadlocks caused by errors in the modeled system rather than errors caused by the wrong use of synchronization primitives. Consider the dining philosophers: The possible deadlock is caused by an error in the modeled control flow of the system (the conditions on putting down the fork depend on the conditions on picking up a fork and vice versa). A wrong use of synchronization would, e.g., be a philosopher that never puts his fork down. Our main contributions are (1) a novel notion of dependency and deadlock for condition syn- chronization and (2) a sound deadlock analysis for full coreABS that integrates a theorem prover into a dependency-based deadlock analysis. Paper Overview. Section 2 introduces condition synchronization and gives examples of its usage as an abstraction for low-level synchronization primitives. Section 3 introduces a simple lan- guage with condition synchronization. Section 4 defines our notion of deadlock and Section 5 describes our analysis for it, Section 6 reports on the implementation and Section 7 concludes. 2 On Condition Synchronization in System Design Our aim is to analyze the control flow of a system design to ensure that it does not include cir- cular dependencies. For this, we concentrate on the boolean conditions on which processes may synchronize to achieve the intended control flow. We are not concerned with the usage of low- level primitives like locks: Deadlocks caused by low-level primitives are indications of incorrect usage of the concurrency model, e.g., forgetting to unlock or unlocking twice. Deadlocks caused by the boolean conditions are signs of errors in the design: the program cannot progress because its designed control flow itself contains a bug. AVoCS 2018 2 / 19 ECEASST 1 public void m1() { 2 lock.lock(); 3 a = 1; 4 aIsOne.signalAll(); 5 6 } 7 public void m2(){ 8 lock.lock(); 9 while(a != 1) aIsOne.await(); 10 a = 0; 11 lock.unlock(); 12 } 1 public void m1() { 2 lock.lock(); 3 a = 2; //bug 4 aIsOne.signalAll(); 5 lock.unlock(); 6 } 7 public void m2(){ 8 lock.lock(); 9 while(a != 1) aIsOne.await(); 10 a = 0; 11 lock.unlock(); 12 } Figure 1: Two Java snippets for monitors. The methods run in separate threads. Example 1 Consider the Java code on the left in Fig. 1. If both described methods are running in parallel in two threads on the same object, they need not progress, as m1 fails to unlock. Now consider the Java code on the right. Again both methods may not progress — however the reason is that m2 waits for m1 to change the internal state. The first example is an implementation bug: The lock is used wrongly. The second example is a design bug: the combined control flow of m1 and m2 is designed wrongly — m2 does not continue after m1. Both kinds of errors cannot be sharply distinguished — a wrong usage of synchronization primitives may also be a result of erroneous design, e.g., if unlocking twice is a consequence of the intended control flow. Deadlocks caused by synchronization primitives have been studied extensively [NY16, PVB+13, CGHA18, HJ18, GDJ+17, GLL16] and focus for the most part on syntactic properties, not information in the guard. In this work we concentrate on condition synchronization. We aim to detect bugs in the design itself, helping the software architect, not the implementing programmer and use this system model: We abstract away from the low-level primitives and only consider the aforementioned await statement with a cooperative scheduling concurrency model, where every context switch is explicit. Condition Synchronization as Abstraction To illustrate the difference between condition synchronization and synchronization via low-level primitives, we use condition synchroniza- tion as an abstraction of monitors. A monitor is a predicate associated with a condition variable and a lock. All threads waiting for the condition are notified by the condition variable once the predicate may become true. If the guard evaluates to false, the notified threads become inactive again. The Java code on the left in Fig. 2 shows the uses condition variables to add an element to a bounded queue once the queue is not full. Here, the thread waits for the list to be below its maximal capacity. Otherwise it waits on the monitor notFull until the state changes and it is notified. If it modifies the state itself, it notifies all threads waiting for the list to be not empty by calling signalAll on the monitor notEmpty. Deadlock analysis can be performed by analyzing the possible sequences of calls to the synchronization primitives lock, await and signalAll [LMS10], as every call to signalAll causes the process to execute one more 3 / 19 Volume 076 (2019) Detecting Deadlocks in Formal System Models with Condition Synchronization 1 public void put(Object x) { 2 lock.lock(); 3 while (count >= items.length) 4 notFull.await(); 5 //add x to queue here 6 notEmpty.signalAll(); 7 lock.unlock(); 8 } 1 put(Object x){ 2 await count < items.length; 3 //add x to queue here 4 } Figure 2: A Java method and its Abstraction loop iteration. With condition synchronization and cooperative scheduling, we can express this method as shown on the right in Fig. 2. The lock and the monitors are not part of the code, it is thus not necessary to check their correct usage already in the design: The method only switches context at the await statement and continues execution once its guard evaluates to true. Condition Synchronization as a Modeling Tool Condition synchronization is not only a use- ful tool for modeling, it clarifies reasoning about control flow by abstracting from implementa- tion details. E.g., in the above example the notEmpty and notFull monitors are not part of the code. This makes it unnecessary to ensure that the correct monitors are used. Notions of deadlock for monitors based on the correct use of the involved primitives have two down-sides: First, the condition itself is not determining the dependencies – dependencies are determined by the additional structure the programmer assumes to guarantee deadlock freedom. This structure (1) leads to a large overhead, as for each condition a monitor has to be added and (2) adds another layer between the system design and the analyzed artifact. Secondly, in the sketched situation in Java, the waiting thread may progress, if another process was active and called signalAll; as it must execute the loop to reevaluate its guard-condition. We abstract away from the reevaluation, and assume it is handled by the runtime environment – by abstracting to condition synchronization, the results are nearer to the intuition of the designer. The ABS implementation of await b evaluates the guard for scheduling in the runtime envi- ronment. The guard is side-effect free, thus evaluation does not alter the state of the object and can safely be done in the scheduler. Replacing Condition Synchronization in Presence of Timed Behavior In ABS, it is not possible to replace condition synchronization with other synchronization primitives, even if a suspend statement (with the semantics of await True) is given. As the condition has to be checked at every point the thread could be scheduled, the check must be repeated inside the method and a busy waiting translation for await g; would be the following: 1 suspend; while(!g){ suspend; } The first suspension ensures that the thread releases at leats once. This translation has a different semantics than await g; as it allows to reschedule the thread multiple times if g does not hold, thus the configuration never gets stuck, as it can always make another loop iteration. The await statement only evaluates the guard once, until the heap memory of the object is changed. AVoCS 2018 4 / 19 ECEASST While this may be acceptable (because the guard cannot have side effects in ABS), it does livelock the system if timed behavior is modeled. ABS has a await duration(t,t’) statement which suspends a method for t to t’ symbolic time steps. Consider the following two methods: 1 mWait(){ 2 await x == 1; 3 } 1 mSet(){ 2 await (1,1); x = 1; 3 } A program running these two methods in parallel terminates. It does not terminate if await x==1 is translated into the loop above: The semantics of timed behavior is that time is only advanced if nothing else in the system can be scheduled – but as seen above the loop in mWait could always be executed once more. Thus, the translated program does not terminate, because it never advances time to execute x = 1;. While condition synchronization may be compiled into monitors, monitors require low-level primitives. Some concurrency models, such as the Active Objects of ABS, where locks are handled implicitly, cannot remove condition synchronization by translating it into other synchro- nization primitives. 3 A Language with Boolean Guards and Dynamic Logic We introduce a simple language SYNC with cooperative scheduling, asynchronous communica- tion and conditional synchronization. SYNC is a simplified version of ABS [JHS+10], following the formalization of the semantics in [FAG13]. We ignore futures, which synchronize processes on termination similar to thread joins, and return values for presentation’s sake, as those depen- dencies have been described by Flores-Montoya et al. [FAG13]. Our implementation considers full coreABS. A SYNC-program is a set of objects and a main block. Each object has fields and methods. All objects are running in parallel and share no state. An object may only change its active task, if the active task explicitly releases control. Control can be released by termination or a special statement await b; which suspends the active task, and allows its reactivation only once the boolean expression b evaluates to true. This statement models condition synchronization within one object. Between multiple objects, only asynchronous method calls of the form async X.m(e) are possible. Such an statement has been introduced and examined earlier [OG76]. As seen in the previous Section, await can be compiled into monitors, if condition variables and locks are synchronization primitives provided by the concurrency model. Definition 1 (Syntax) We underspecify the sets of types and expressions. For the examples, we assume types for booleans, integers, lists and Object, as well as the usual operations and literals for their elements. Let e range over expressions, T over types, v over variable names, f over field names and X over object names. · denotes possibly empty lists. A program Prgm is defined as follows: Prgm ::= O main{s} O ::= object X {M T f = e} M ::= m(T v){s} s ::= async e.m(e) | f = e | T v = e | await e | if(e)s else s fi | skip | s;s Example 2 In the following code, the object Queue models a queue with maximal length of 5 5 / 19 Volume 076 (2019) Detecting Deadlocks in Formal System Models with Condition Synchronization and the main block pushes a number into the queue and afterwards removes it. It is not guaran- teed that the push method will start to execute first. The synchronization with await, however, guarantees that it terminates first. 1 object Queue{ 2 List list = Nil; 3 push(Int i){ await size(list) < 5; list = [i]::list;} 4 pop(){ await size(list) > 0; list = tail(list);} 5 } 6 main{ 7 async Queue!push(1); 8 async Queue!pop(); 9 } Definition 2 (Runtime Syntax) D is the value domain, with {tt,ff}⊆ D. Let X range over object names, i over N, s over statements, σ over functions that map variable names to domain elements and ρ over functions that map field names to domain elements. We define configura- tions C as follows: C ::= tsk(X,i,s,σ) | obj(X,i,ρ) | C C The composition of configurations is associative and commutative, i.e. C (C′ C′′) = (C C′) C′′ and C C′ = C′ C. Well-formedness conditions can be found in [JHS+10]. A configuration contains tasks and objects. An object obj(X,i,ρ) has a unique name X, an active task id i and a store ρ . If inactive, the task id is the special symbol ⊥. A task tsk(X,i,s,σ) has a unique id i, a local store σ , the id of its object X and the remaining statement. A terminated task has the special symbol ⊥ as its statement. We denote the initial configuration of a program Prgm with I(Prgm). The definition is straightforward and the main block is running in a special object. We assume that each store ρ is initialized with a special field X f for each object X with ρ(X f ) = X. The method body of a method m is denoted M(m) and the initial local store of a task executing m with parameters d with M̂(d). The most important rules are shown in Fig 3: The rule (wait) suspends a process by setting the task id of the corresponding object to ⊥. The await statement is not removed. The rule (cont) removes the await statement when reactivating a process – the corresponding object must be inactive and the guard must hold. The rule (call) starts a new process, which is not set as active upon creation. A configuration is terminated if all tasks and objects have the forms tsk(Xi,i,⊥,σi) obj(X,⊥,ρX) A configuration is stuck, if it can not be reduced further but it is not terminated. We denote with JeKσ,ρ the evaluation of e with the stores σ and ρ . We write C |= e iff JeKσ,ρ = tt and the object whose store ρ is evaluated is understood. We index the reduction relation with a tuple of active tasks. A singleton tuple (i) expresses that only i is active, a tuple (i, j) expresses that i is active and launches j. This allows us to reason AVoCS 2018 6 / 19 ECEASST tsk(X,i, await e,σ) obj(X,i,ρ) C →(i) tsk(X,i, await e,σ) obj(X,⊥,ρ) C (wait) JeKσ,ρ = tt tsk(X,i, await e,σ) obj(X,⊥,ρ) C →(i) tsk(X,i,⊥,σ) obj(X,i,ρ) C (cont) JeKσ,ρ = X′ j does not appear in C C = obj(X′,l,ρ′) obj(X,i,ρ) C′ tsk(X,i, async e.m(e),σ) C →(i, j) tsk(X,i,⊥,σ) tsk(X′, j,M(m),M̂(JeKσ,ρ )) C (call) Figure 3: Selected Small-Step Operational Semantics Rules about restricted behavior, i.e. C ⇒(i) C′ expresses that C′ is reachable from C only by executing the task with id i. Definition 3 (Run) Let C1,...,Cn be configurations. A run from C1 to Cn is denoted C1 ⇒ Cn and defined as a tuple C1,...,Cn with C1 →I1 C2 →I2 ...→In−1 Cn for some tuples of task-ids I1,...,In−1. We say that the run is annotated with I1,...,In−1. For simplicity, we assume that all runs are finite. Using the annotated tuples, we can define rooted runs: A run rooted in a task-id i is a run which only executes task i and tasks started by task i. Rooted runs allow one to reason about system behavior caused by a certain task. Definition 4 (Rooted Runs) Let the following be the graph of some tuple of tuples of task ids I = (I1,...,In): G ( I ) = ( V,E ) V ={i | id i occurs in some Ik} E = { (i, j) | ∃k < n. Ik = (i, j) } A tuple I is rooted in i, if G (I ′) is a tree with root i for each prefix I ′ of I . A run rooted in i, denoted C1 ⇒i Cn, is a run C1,...,Cn annotated with I = I1,...,In−1, such that I is rooted in i. We write C ≡eX C ′ if two configurations are equal everywhere, except for the values of fields occurring in an expression e of object X: C≡eX C ′ ⇐⇒ ∃C′′. C = C′′ obj(X,i,ρ) ∧ C′ = C′′ obj(X,i,ρ′) ∧ ∧ f 6∈fields(e) ρ(f) = ρ′(f) 3.1 Dynamic Logic We use a dynamic logic, called SDL based on ABSDL [DO15] to reason about programs. SDL extends first-order logic with a modality for SYNC programs and allows us to reason about all possible runs of a method. We refer to [DO15, DBH15a] for full formal details about ABSDL. 7 / 19 Volume 076 (2019) Detecting Deadlocks in Formal System Models with Condition Synchronization Definition 6 (Syntax) Let v range over logical variables and f over function symbols. SDL- formulas φ and terms t are defined by the following syntax: φ ::=∃ v. φ | ¬φ | φ ∨φ | [s]φ | 〈[s]〉φ | t .= t t ::= f (t) | v The modality [s]φ expresses that φ holds after the execution of s and at every suspension point within. We introduce 〈[s]〉φ below. A formula is valid if it holds in all models. The other formulas express constraints on given configurations. We assume a formalization of the heap with two function symbols store and select with the connecting axiom select(store(heap,o,f,value),o,f) . = value for every heap heap, object o, field f and value value. A modality-free formula holds in a configuration if the constraints are satisfied – select is interpreted such that select(heap,o,f) . = value) is satisfied in a configuration C if C has the form obj(X,i,ρ) C′ and ρ(f) = JvalueKρ,σ ∧ JoKρ,σ = X holds. The local store σ is also modeled globally, with one special function for each local variable. We assume for simplicity that all local variables have unique names. Example 3 The following formula states that if in the beginning o.f is positive, then after the execution of f = f+1; in o, o.f is strictly positive. o.f≥ 0 → [f = f+1;]o.f > 0 The full semantics and a sequent calculus for validity are presented in [DO15, DBH15a]. A sequent calculus operates on sequents of the form Γ ⇒ ∆, where Γ,∆ are sets of SDL-formulas. Contrary to [DO15] we use the sequent calculus not to ensure that an invariant is preserved by a method, but only to check that the method establishes a certain post-condition at all suspension points. We only show the rule for the await statement. The following rule is taken from [DO15, DBH15a] and replaces the heap by a new function symbol to erase all knowledge. Afterwards, i.e., once the method continues execution, only the guard expression can be assumed. This mirrors the concurrency model, as other tasks may modify the heap during the suspension of this task and the task can only be scheduled if the guard condition holds. It also proves that the post-condition holds at each suspension point. Γ ⇒ φ,∆ Γ ⇒{heap := newHeap}e→ [s]φ,∆ Γ ⇒ [await e;s]φ,∆ (await) We require a way to verify that a property holds at all suspension points, except the first one. This is needed to verify that a method will fulfill a post-condition after being suspended at least once – it is not relevant whether the execution up to the first suspension satisfies the post- condition. Thus we use a special modality 〈[s]〉φ that expresses that φ holds after the execution of s and at each suspension point in s, except the first one. We use 〈[·]〉φ to verify that a method never releases a guard φ , after it was suspended once and thus this guard does not depend on the method in question. AVoCS 2018 8 / 19 ECEASST The calculus is the same, except that for the await statement, we use the following rule which does not check the post-condition. Note that afterwards the usual modality is used. Γ ⇒{heap := newHeap}e→ [s]φ,∆ Γ ⇒〈[await e;s]〉φ,∆ (await) The connection to the language’s SOS semantics follows from the correctness of the underly- ing validity calculus [DO15]. Lemma 1 Let φ be a modality-free formula which contains function symbols only for the fields of object X, 〈[s]〉φ a formula and C1 a configuration of the form C1 = tsk(X,i,s,σ) obj(X,i,ρ) C′ If the proof for 〈[s]〉φ can be closed, then for every run C1 ⇒i Cn with intermediate configura- tions C1,C2,...,Cn the following holds: At every position, except the very first, with a transition Ck →(i) Ck+1 such that i is active in X in Ck, but not in Ck+1, (i.e., these configurations execute suspension points) φ holds in Ck. Sketch. Our system differs from the ABSDL calculus developed by [DO15] only by the (await) rules. The semantics of [s]φ is that φ holds at all suspension points of s. In the original calculus, it also assumed φ when continuing the process, as their system proved class invariants for all methods. Our rule for [·] does not assume φ after continuing execution, the formula [s]φ is thus valid if s ensures that φ holds at the continuation at all suspension points within s, independent of how the heap is changed during the suspension, but does not use information from possible invariants. The rule for (await) for the 〈[·]〉 modality differs only in the missing first premisse, which is exactly proving that φ holds at the suspension point in question. As the premisse continues symbolic execution with the normal modality, all following suspension points have to guarantee φ – only the very first suspension point does not. 4 Dependencies for Condition Synchronization A deadlock describes a stuck configuration, where tasks circularly depend on each other. To fix the notion of deadlock, we need to fix the notion of dependency. Intuitively, a stuck task t depends on a task t′ in configuration C, if the continued execution of t′ leads to a configuration where t can continue its execution. If t′ is stuck at some guard b too, then t depends on t′ if the continuation of t′ in some configuration C′ where b holds leads to a configuration where t can continue its execution. We demand that C and C′ are as similar as possible: they are equal everywhere but in the fields occurring in b, as defined in Def. 5. Definition 7 We formally define a predicate dep(C,i, j) which expresses that i depends on j in configuration C. The formalization is not in SDL but references SDL-formulas. To do so, we first define a family of predicates n-dep(C,i, j), to model that i depends on j with at most n enforced continuations. Let C be a configuration of the form tsk(X,i, await e;si,σi) tsk(X, j,s j,σ j) obj(X,⊥,ρ) C0 9 / 19 Volume 076 (2019) Detecting Deadlocks in Formal System Models with Condition Synchronization The base predicate models that by executing only j, a configuration can be reached, such that e evaluates to true: 0-dep(C,i, j)≡C 6|= e∧∃C′. (C⇒ j C′∧C′ |= e) The other predicates handle the case that both i and j are blocked and j has the guard e′: and by choosing a configuration C′′ w.r.t. C′, the guard e′ evaluates to true and in this configuration i depends on j. n-dep(C,i, j)≡∃e′. ∃C′,C′′. s j = await e′;s′j ∧C 6|= e∧C 6|= e ′∧C≡e ′ X C ′ ∧C′ |= e′∧C′ 6|= e∧C′ ⇒ j C′′∧ ( C′′ |= e∨(n−1)-dep(C′′,i, j) ) Task i depends on j in C, written dep(C,i, j), if some n-dep(C,i, j) holds. We can now distinguish between deadlock and starvation. Definition 8 (Deadlock and Starvation) The dependency graph of a configuration has its task ids as nodes and its dependencies as edges. A stuck configuration is deadlocked if its dependency graph contains a dependency cycle. A configuration is starving, if it is stuck, but not deadlocked. A starving configuration requires some condition e to become true, but no task can have such an effect. Sometimes an active process which tries to acquire a resource is also said to be starving, but in our framework this is abstracted to await isAvailable(this.resource) — all starving processes are stuck. Example 4 (Deadlock and Starvation) Consider the program on the left in Figure 4. Its execu- tion leads to the configuration tsk(X,1, await f1; f2 = True;,σ1) tsk(X,2, await f2; f1 = True;,σ2) obj(X,⊥,ρX) tsk(X0,0,⊥,σ0) obj(X0,⊥,ρX0) This configuration is deadlocked as for the dependency of task 1 on task 2 we can set X.f1 = True and for the dependency of task 2 on task 1 we can set X.f2 = True. Now consider the right program is Figure 4. Its execution leads to C =tsk(X,1, await f1; f2 = True;,σ1) tsk(X,2, await f2; f1 = False;,σ2) obj(X,⊥,ρX) tsk(X0,0,⊥,σ0) obj(X0,⊥,ρX0) C is starving, as task 1 does not depend on task 2: no configuration can be chosen to continue task 2, so it leads to a configuration that evaluates X1.f1 to True. It is undecidable in general whether a configuration is deadlocked, as the computation of the dependency includes the computation of all effects caused by the program following a guard. Program and guard are both turing-complete, thus one can define a function encoding the uni- versal turing machine and check in the guard for some property of the output of another turing machine, which is computed/encoded in the code of another method. Proposition 1 Given a stuck configuration C, it is not decidable whether C is deadlocked or starving. AVoCS 2018 10 / 19 ECEASST 1 object X{ 2 Bool f1 = False; Bool f2 = False; 3 m(){ await f1; f2 = True; } 4 n(){ await f2; f1 = True; } 5 } 6 main{async X.m(); async X.n();} 1 object X{ 2 Bool f1 = False; Bool f2 = False; 3 m(){ await f1; f2 = True; } 4 n(){ await f2; f1 = False; } 5 } 6 main{async X.m(); async X.n();} Figure 4: Two example programs: The left will deadlock, the right will starve. Indeed even the dependency relation is undecidable. This result may appear discouraging, but the presented notion of deadlock captures the intent of the designer more precisely than notions which do no take the information flow through the heap into account and do not differentiate be- tween deadlock and starvation. The aim of our analysis is to present clues to the designer where the intended control flow has circular dependencies. It does not aim to catch any kind of error and is not supposed to catch implementation bugs, where every erroneous state is undesirable. The aim is to catch specific logical errors in the design of the control flow. Under these assump- tions, undecidability is not a deal-breaker. Indeed, if the notion would be decidable, it would restrict the possible guards – our aim however is to give the designer full freedom and support him with clues where it might deadlock, not guarantee complete error-freedom. Similarly, it is useful to distinguish between deadlock and starvation. Both notions describe erroneous states, but the reason are different design flaws. Also, starvation is not always unde- sirable. Consider the following method: 1 server(){ 2 await requestList != Nil; 3 //handle requests 4 async this.server(); 5 } Here, the object buffers and handles multiple requests at once. This pattern is used in prac- tice [KH18]. Starvation is only caused by a lack of requests, not erroneous control flow. Simi- larly, the right code in Figure 1 will terminate in a starving configuration, as m1 does not depend on m2. A starvation analysis would also be useful, but is out of the scope for this work. Partially Deadlocked Configurations Processes may depend on multiple other processes, and a situation can occur where a process is deadlocked and starving at the same time: Consider the following configuration: C =tsk(X,1, await x > 0&&y > 0; z = 1, /0) tsk(X,2, await z > 0; x = 1, /0) tsk(X,3, await z > 0; y = 1, /0) This configuration is deadlocked with the following dependencies 1-dep(C,1,2),1-dep(C,1,3),1-dep(C,2,1),1-dep(C,3,1) 11 / 19 Volume 076 (2019) Detecting Deadlocks in Formal System Models with Condition Synchronization Now consider the case that process 3 is not part of the configuration: C′ =tsk(X,1, await x > 0&&y > 0; z = 1, /0) tsk(X,2, await z > 0; x = 1, /0) The configuration is still deadlocked, as it contains the dependencies 1-dep(C′,1,2),1-dep(C′,2,1) if the deadlock would be resolved for process 1 (i.e., the guard x > 0 would hold), process 1 would not progress, as the second conjunct of its guard is starving. This example shows a further point where our definition of deadlock does not coincide with the simple notion of getting stuck: we consider configurations as C′ as deadlocked, even if they contain further reasons than only dependency cycles for getting stuck. If one wants to define a notion of deadlock where dependency cycles are the only reasons for getting stuck (e.g., C above), a possible definition where to bring the guard of the process in question into CNF, derive dependency edges for every conjunct and demand that each conjunct is part of a dependency cycle. In this case C′ would be not considered as deadlocked (as y>0 is not part of any cycle). 5 Analyzing Condition Synchronization To detect deadlocks, the abstract dependency graph is computed. The abstract dependency graph subsumes all dependency graphs of reachable stuck configurations in a program: If the depen- dency graph of a reachable stuck configuration has a circular dependency, then the abstract de- pendency graph also has one. Our approach extends the one of Flores-Montoya et al. [FAG13] and the implementation thus handles a language with condition synchronization and synchronization on futures, i.e., termina- tion of tasks. For presentation’s sake, we only define the object-insensitive abstract dependency graph. Improvements of [FAG13] can still be applied, e.g., their main improvement relies on a may-happen-in-parallel analysis, which is extended for condition synchronization in [AFG15]. The abstract dependency graph is defined syntactically. Let P be a program. Definition 9 Let X1,...,Xn be all objects in P and mi,1,...,mi,o the methods of Xi. The abstract dependency graph A(P)=(V,E) is defined as follows: • The nodes are all methods, i.e. V = ( mi, j ) i≤n j≤o • Edges connect methods with writes into a field with methods which synchronize on this field: (mi, j,mk,l) ∈ E iff there is a field f such that mi, j contains a guard with f and mk,l contains f = e or a call to a method doing so. Note that a guard may contain multiple fields and that methods on different objects may de- pend on each other. At this point, we do not analyse here whether call or write statement are in a branch or in dead code. AVoCS 2018 12 / 19 ECEASST Example 5 Consider the following code and abstract dependency graph. The nodes in the graph correspond to the guards in X.m and X.n. The edge (X.m, X.n) is by the write to field b2 in X.n and its read in the guard of X.m. 1 object X { 2 Bool b1 = False; Bool b2 = False; 3 m(){ b1 = True; await b2; } 4 n(){ await !b1; b2 = True; } 5 } 6 main { async X.m(); async X.n();} X.mX.n To incorporate the side-effects of computations, we make two additional steps. The first im- provement aims to discards cycles because there is no reachable deadlock configuration to which they correspond. In Definition 9, the whole method was checked for written fields. A cycle, how- ever, only represents some concrete configuration where the processes hold at specific guards: every field in a deadlocked configuration must be written after some synchronization statement. Definition 10 (Feasibility) A cycle m1,...,mn in A(P) is feasible, if for each k < n, every write causing the edge (mk,mk+1) is after the first guard of mk+1. Nonfeasible cycles contain edges that refer to heap changes of the execution of an involved method, but happen before the stuck configuration is reached: Example 6 Consider again Example 5. The edge from X.n to X.m is added, because the field b1 is written in m and read in a guard in n. This edge is missing in all concrete dependency graphs of reachable stuck configurations, because in the stuck configurations X.m has already reached its guard and thus will not change b1. I.e., the cycle (X.n,X.m,X.n) is not feasible. We increase the accuracy further by analyzing the transmitted information and ensure that every edge is refering to a write statement which actually may release the guard. Given a guard e and a method mk,l with method body s, we check that after some suspension of inside of s the guard e evaluates to true. We may ignore the first suspension, as all side effects before it cannot influence the heap afterwards. I.e., if the formula 〈[s]〉¬e is valid, then after no execution of mk,l can resolve the blocking guard and we can remove the dependency edge. Definition 11 (Refined Abstract Dependency Graphs) Let G = (V,E) be an abstract depen- dency graph. Let (mi, j,mk,l) be an edge, added because of a statement await e in mi, j. Let s be the body of mk,l The edge (mi, j,mk,l) is dispensable if the formula 〈[s]〉¬e holds. The refined abstract dependency graph of a program is the graph that results from removing all dispensable edges from its abstract dependency graph. The use of the 〈[·]〉 is necessary, as we only reason about stuck configurations, thus we can ignore any side effects that happen before the first guard - they do not refer to information flow that may release another guard afterwards. Example 7 Consider the right program in Figure 4. As discussed this program will starve, but deadlock. The left graph below is its abstract dependency graph, the right graph the refined 13 / 19 Volume 076 (2019) Detecting Deadlocks in Formal System Models with Condition Synchronization abstract dependency graph: X.mX.n X.m X.n Theorem 1 (Soundness) If a program has a reachable deadlocked configuration, then its re- fined abstract dependency graph has a feasible cycle. Sketch. The proof follows the soundness proof of [FAG13]: If a reachable configuration is dead- locked, then it has a dependency cycle: We show that for any dependency in the configuration between two tasks t1, t2, there is a dependency between the methods m1 and m2 executed by t1 (resp. t2) in the refined abstract dependency graph. If there is a dependency between t1 and t2 (note that as the configuration is deadlock it cannot be a 0-dependency), the t2 must have a rooted run that releases the guard of t1, as we may only change the fields in the guard to do the step in the unrolling of n-dependencies for n > 0. Thus t2 must change one of the fields occuring in t1, or call a method doing so. This is exactly the condition used to derive the dependecy between m1 and m2. Next, we show that the edge between m1 and m2 cannot be dispensable, and the whole cycle is feasable. If it were, 〈[s]〉¬φ (where s is the statement of t2 and φ the guard of t1) would not be valid. As t2 is doing the change in the heap after being unblock once in the definition of dep(,,t,h)e statement making the change (write or call) is after the first await. Thus it is safe to use 〈[·]〉 in the check for dispensability. The argument is analogous for feasable cycles. 6 Evaluation We implemented our approach in the SACO [AAF+14] framework for coreABS and use KeY- ABS [DBH15a] as the theorem prover to check for dispensable edges1. Existing tools did not support conditional synchronization, so only six coreABS case studies made use of this feature and we rely on micro-benchmarks to evaluate on a wider code base. Case studies and micro- benchmarks cover full coreABS, including loops. The implementation is fully automatic. 1 class Server implements S{ 2 List wList = Nil; 3 Int status = 0; 4 Unit in_pool(){ 5 await status == 1; 6 wList = 1;} 7 Unit add_worker(Fut f){ 8 await f?; 9 wList = [new Work()| wList];} 10 Unit run(){ 11 this!init_all(); 12 Fut f = this!in_pool(); 13 this!add_worker(f) } 14 Unit init_all(){ 15 status = 1; 16 await length(wList) >= 2; 17 wList = [new Work()| wList]; 18 status = 2; }} Figure 5: An ABS class modeling the internal synchronization structure of a server during ini- tialization. The main block is omitted and the ABS code is prettified. We can show that the right example in Figure 4 is deadlock free. Figure 5 mixes conditional synchronization and future-synchronization. The implementation can deal with deadlocks where 1 Available under formbar.raillab.de/deadlock AVoCS 2018 14 / 19 formbar.raillab.de/deadlock ECEASST some dependencies are caused by futures and some dependencies are caused by condition syn- chronization. This example (and, e.g., Example 2) requires the application of the theorem prover to detect and discard dispensable edges: two false positive deadlock risks are found otherwise. We analyzed the two largest examples in ABS, the FredHopper trading and replicate systems which model industrial software systems [DBH+15b], and found 20 deadlock risks in the trad- ing system and 52 in the replicate system. One reason for this is that the replicate system uses deployment components [JST12] modeling cloud architecture, which are not supported by KeY- ABS. In [AFG15, FAG13] the trading system was already analyzed in a setting with conditional synchronization as deadlock free. In that work, only the MHP analysis was adjusted for con- ditional synchronization, the deadlock analysis however was not sound and does not detect the deadlock in the left program in Figure 4. We were able to manually identify all 20 deadlock risks as false postives and confirm that the trading system is deadlock free. Manual post-processing is acceptable as the tool outputs the methods involved in the deadlock risk and 18 of the 20 deadlock share one edge. Selected Microbenchmarks (5 of 42) Name LoC deadlock-free time found deadlocks back_dead 39 × 8ms 1 OneQueue 37 × 13ms 2 Figure 5 43 X 7ms 0 Transitive 52 × 10ms 1 Loop 39 × 8ms 1 Case Studies Study LoC potential deadlocks time critical edges BlockChain 620 0 1312ms - Compugene 860 1 83ms 1 Memory 351 3 49ms 2 YARN 199 3 144ms 3 HyVar 632 6 200ms 2 trading 1466 20 31s 3 replicate 2101 52 5s 11 FormbaR 2200 Timeout Table 1: Evaluation on selected examples We analyzed the non-trivial models for industrial architecture from HyVar [LMRY17] and FormbaR [KH16]. Additionally, we evaluated an ABS model for weak memory [KH18], an ABS model for resource consumption in YARN clusters [LYJL16], and the Compugene model for computational biology2. The analysis returns 3 (resp. 3 and 1) potential deadlocks, which are easily manually identified as false positives. The false positives in the YARN model are again due to the use of deployment components. The analysis confirms deadlock-freedom of an ABS Blockchain model [Nak08]. The right side of Table 1 summarizes our evaluation on these case studies. The critical edge column shows how many edges needs to be removed from the graph to remove all cycles. The lower the number, the more feasible manual post-processing is. Except the mentioned industrial examples, the Compugene, HyVar and weak memory models, all examples are written by us. The tool was run on a Intel E5-2643 with 6 cores 3,4 GHz and 64 GB RAM. Our analysis does not scale only for the FormbaR model. This has two main reasons: (1) Form- 2 http://www.compugene.tu-darmstadt.de 15 / 19 Volume 076 (2019) http://www.compugene.tu-darmstadt.de Detecting Deadlocks in Formal System Models with Condition Synchronization baR models communication during railway operations and contains little computation, while the other example systems are less communication-heavy. (2) FormbaR makes heavy use of maps and contains several guards that read 4 fields of the class and every field is written in several methods - the model contains a lot of information flow through fields. Maps complicate analysis, as they require to ensure that the keys are passed around correctly. Such global properties cannot be derived by analyzing methods in isolation. 7 Conclusion We presented an approach for deadlock detection in presence of conditional synchronization, which integrates a theorem prover to analyze side-effects. The implementation is the first sound deadlock analysis for full coreABS. We are able to analyze all ABS case studies, but are not precise if models contain synchronization points that access many fields of a class: the abstract dependency graph subsumes all information flow in a program and is highly connected in those cases. This reflects the inherent difficulty of reasoning about arbitrary side-effects. Discussion of the Use of Deductive Verification Our analysis integrates a heavy-weight de- ductive verification tool into a light-weight static analysis. This allows us to reason about heap memory beyond analyzing the field names occurring in a method, but also offers other beneficial features from a design perspective. Theorem provers have a clear interface and our implementation is not monolithic: Our dead- lock tool benefits from any future advance in the precision or performance of KeY-ABS. Every invocation of KeY-ABS is caused by a pair of one guard and one method. This gives us modu- larity of the analysis results: If method and guard are unchanged, the prover does not have to be run again. We are able to handle unbounded data types and recursion without performance loss: KeY-ABS works on symbolic values and analyzes single methods. Non-termination is handled implicitly and we do not need to provide a maximal number of unrolling for loops or similar. Contrary to that, model checking would involve rerunning the whole program after each change and relies on finite domains and traces. We are still fully automatic, but we propose that in some situations it would be acceptable to interact with the theorem prover. Related Work To the author’s best knowledge, no deadlock analysis for condition synchro- nization in an object-oriented setting has been proposed. Some work has been done on simpler concurrency models, e.g., Owicki and Gries [OG76], which does not to models with an arbi- trary number of threads. For Active Objects (without condition synchronization) the following approaches are proposed: (1) The discussed approach of Flores-Montoya et al. [FAG13]. (2) Giachino et al. [GLL16] use behavioral types to detect deadlocks in ABS code. Contracts, de- scriptions of the dependency-structure of methods, are inferred and cycles are detected in their composition. For boolean guards, manual annotations are proposed, but not implementeted and no inference algorithm is given. (3) Gkolfi et al. [GDJ+17] use Petri Nets for deadlock detection and do not consider or discuss boolean guards. As described, conditional synchronization is similar to condition variables and monitors. Leino et al. [LMS10] presented an approach to deadlock detection of locks that generalizes to AVoCS 2018 16 / 19 ECEASST condition variables. Deadlocks are checked on a manually annotated global order for releasing and acquiring locks, receiving and sending messages over channels, and joining on threads. de Carvalho-Gomes et al. [CGHA18] translate Java programs into colored Petri nets for deadlock detection. While translation into Petri nets and the analysis of these Petri nets are automatic, the approach requires manual annotations. Recently, Hamin and Jacobs [HJ18] presented an approach that works directly on condition variables in C code, based on symbolic execution and verified in Coq. Java PathFinder [PVB+13] also uses symbolic execution, but does so on low-level primitives in Java bytecode. Future Work We aim to integrate user-provided specifications in SDL to use more information about newHeap in the await rule of the SDL-calculus, at the cost of no longer being fully auto- matic. To automate rejection of assumed false positives, we plan to adopt the approach of Albert et al. [AGI16] to generate tests. We plan an incremental approach to summarize detected cycles based on critical edges. We did not discuss starvation analysis, which is also an open question. Acknowledgments This work is supported by FormbaR, part of AG Signalling/DB RailLab. We thank Antonio Flores-Montoya and Michael Lienhardt for insightful discussions and the anonymous reviewers for valueable feedback. References [AAF+14] E. Albert, P. Arenas, A. Flores-Montoya, S. Genaim, M. Gómez-Zamalloa, E. Martin-Martin, G. Puebla, G. Román-Díez. SACO: Static Analyzer for Con- current Objects. In TACAS’14, Proceedings. 2014. doi:10.1007/978-3-642-54862-8_46 [AFG15] E. Albert, A. Flores-Montoya, S. Genaim. May-Happen-in-Parallel Analysis with Condition Synchronization. In FOPARA’15, Proceedings. 2015. doi:10.1007/978-3-319-46559-3_1 [AGI16] E. Albert, M. Gómez-Zamalloa, M. Isabel. Combining Static Analysis and Testing for Deadlock Detection. In iFM 2016, Proceedings. Pp. 409–424. 2016. doi:10.1007/978-3-319-33693-0_26 [CGHA18] P. de Carvalho Gomes, D. Gurov, M. Huisman, C. Artho. Specification and verifica- tion of synchronization with condition variables. Sci. Comput. Program. 163:174– 189, 2018. doi:10.1016/j.scico.2018.05.001 [DBH15a] C. C. Din, R. Bubel, R. Hähnle. KeY-ABS: A Deductive Verification Tool for the Concurrent Modelling Language ABS. In Felty and Middeldorp (eds.), CADE’15, Proceedings. Pp. 517–526. 2015. doi:10.1007/978-3-319-21401-6_35 17 / 19 Volume 076 (2019) http://dx.doi.org/10.1007/978-3-642-54862-8_46 http://dx.doi.org/10.1007/978-3-319-46559-3_1 http://dx.doi.org/10.1007/978-3-319-33693-0_26 http://dx.doi.org/10.1016/j.scico.2018.05.001 http://dx.doi.org/10.1007/978-3-319-21401-6_35 Detecting Deadlocks in Formal System Models with Condition Synchronization [DBH+15b] C. C. Din, R. Bubel, R. Hähnle, E. Giachino, C. Laneve, M. Lienhardt. Deliverable D3.2 Verification of project FP7-610582 (ENVISAGE). Mar. 2015. available at http://www.envisage-project.eu. [DO15] C. C. Din, O. Owe. Compositional reasoning about active objects with shared fu- tures. Formal Asp. Comput. 27(3):551–572, 2015. doi:10.1007/s00165-014-0322-y [FAG13] A. Flores-Montoya, E. Albert, S. Genaim. May-Happen-in-Parallel Based Dead- lock Analysis for Concurrent Objects. In FORTE’13, Proceedings. 2013. doi:10.1007/978-3-642-38592-6_19 [GDJ+17] A. Gkolfi, C. C. Din, E. B. Johnsen, M. Steffen, I. C. Yu. Translating Active Objects into Colored Petri Nets for Communication Analysis. In FSEN’17, Proceedings. LNCS 10522. Springer, 2017. doi:10.1007/978-3-319-68972-2_6 [GLL16] E. Giachino, C. Laneve, M. Lienhardt. A framework for deadlock detection in core- ABS. Software & Systems Modeling 15(4):1013–1048, Oct 2016. doi:10.1007/s10270-014-0444-y [HJ18] J. Hamin, B. Jacobs. Deadlock-Free Monitors. In ESOP’18, Proceedings. LNCS 10801, pp. 415–441. Springer, 2018. doi:10.1007/978-3-319-89884-1_15 [JHS+10] E. B. Johnsen, R. Hähnle, J. Schäfer, R. Schlatte, M. Steffen. ABS: A Core Lan- guage for Abstract Behavioral Specification. In Aichernig et al. (eds.), FMCO’10, Proceedings. Pp. 142–164. 2010. doi:10.1007/978-3-642-25271-6_8 [JST12] E. B. Johnsen, R. Schlatte, S. L. T. Tarifa. Modeling Resource-Aware Virtual- ized Applications for the Cloud in Real-Time ABS. In Aoki and Taguchi (eds.), ICFEM’12, Proceedings. Pp. 71–86. 2012. doi:10.1007/978-3-642-34281-3_8 [KH16] E. Kamburjan, R. Hähnle. Uniform Modeling of Railway Operations. In Artho and Ölveczky (eds.), FTSCS’16, Proceedings. CCIS 694. Springer, 2016. doi:10.1007/978-3-319-53946-1_4 [KH18] E. Kamburjan, R. Hähnle. Prototyping Formal System Models with Active Objects. In ICE’18, Proceedings. EPTCS 279, pp. 52–67. 2018. doi:10.4204/EPTCS.279.7 [LMRY17] J.-C. Lin, J. Mauro, T. B. Røst, I. C. Yu. A Model-Based Scalability Optimization Methodology for Cloud Applications. In 7th IEEE International Symposium on Cloud and Service Computing, IEEE SC2 2017. 2017. doi:10.1109/SC2.2017.32 AVoCS 2018 18 / 19 http://www.envisage-project.eu http://dx.doi.org/10.1007/s00165-014-0322-y http://dx.doi.org/10.1007/978-3-642-38592-6_19 http://dx.doi.org/10.1007/978-3-319-68972-2_6 http://dx.doi.org/10.1007/s10270-014-0444-y http://dx.doi.org/10.1007/978-3-319-89884-1_15 http://dx.doi.org/10.1007/978-3-642-25271-6_8 http://dx.doi.org/10.1007/978-3-642-34281-3_8 http://dx.doi.org/10.1007/978-3-319-53946-1_4 http://dx.doi.org/10.4204/EPTCS.279.7 http://dx.doi.org/10.1109/SC2.2017.32 ECEASST [LMS10] K. R. M. Leino, P. Müller, J. Smans. Deadlock-Free Channels and Locks. In ESOP’10, Proceedings. Pp. 407–426. 2010. doi:10.1007/978-3-642-11957-6_22 [LYJL16] J. Lin, I. C. Yu, E. B. Johnsen, M. Lee. ABS-YARN: A Formal Framework for Modeling Hadoop YARN Clusters. In FASE’16, Proceedings. LNCS 9633, pp. 49– 65. Springer, 2016. doi:10.1007/978-3-662-49665-7_4 [Nak08] S. Nakamoto. Bitcoin: A peer-to-peer electronic cash system. 2008. [NY16] N. Ng, N. Yoshida. Static deadlock detection for concurrent go by global session graph synthesis. In CC’16, Proceedings. Pp. 174–184. ACM, 2016. doi:10.1145/2892208.2892232 [OG76] S. S. Owicki, D. Gries. An Axiomatic Proof Technique for Parallel Programs I. Acta Inf. 6:319–340, 1976. doi:10.1007/BF00268134 [PVB+13] C. S. Pasareanu, W. Visser, D. H. Bushnell, J. Geldenhuys, P. C. Mehlitz, N. Rungta. Symbolic PathFinder: integrating symbolic execution with model checking for Java bytecode analysis. Autom. Softw. Eng. 20(3):391–425, 2013. doi:10.1007/s10515-013-0122-2 19 / 19 Volume 076 (2019) http://dx.doi.org/10.1007/978-3-642-11957-6_22 http://dx.doi.org/10.1007/978-3-662-49665-7_4 http://dx.doi.org/10.1145/2892208.2892232 http://dx.doi.org/10.1007/BF00268134 http://dx.doi.org/10.1007/s10515-013-0122-2 Introduction On Condition Synchronization in System Design A Language with Boolean Guards and Dynamic Logic Dynamic Logic Dependencies for Condition Synchronization Analyzing Condition Synchronization Evaluation Conclusion