Derivation Languages of Graph Grammars

Nils Erik Flick
flick@informatik.uni-oldenburg.de
Carl von Ossietzky Universität, D-26111 Oldenburg

Electronic Communications of the EASST, Volume 61 (2013)
Selected Revised Papers from the 4th International Workshop on Graph Computation Models (GCM 2012)

Abstract: We investigate sequential derivation languages associated with graph grammars, as a loose generalisation of the languages of free-labeled Petri nets and of Szilard languages. The grammars are used to output strings of rule labels, and the applicability of a special rule determines the acceptance of the preceding derivation. Due to the great power of such grammars, this family of languages is quite large and endowed with many closure properties. All derivation languages are decidable in nondeterministic polynomial time and space O(n log n), by simulation of the graph grammar on a Turing machine.

Keywords: derivation language, closure properties, graph grammars, double pushout

1 Introduction

Graph grammars are well known and well studied [Roz97]. This paper deals with their derivation languages: though a graph grammar is usually employed to generate graphs, we use it to output strings in an automaton-like fashion. The graph is thus the state of a nondeterministic machine, and the labels of the rules used in a derivation are the output symbols.

The original motivation for this work is the extension of notions already known in some restricted contexts. Petri nets are graph grammars with no edges [Cor95, KKK06], and their languages of labeled transition sequences are well known [Jan87]. Graph grammars can also simulate string grammars, and results about the derivation languages of string grammars exist [Mäk98]. However, a description of the derivation languages of more general graph grammars appears to be missing. This, by itself, would perhaps not be motivation enough to consider such a vast generalisation; yet we have found the subject to be intricate enough to justify further investigation. The language class studied in this paper, which we name L$, has a relatively concise description in terms of graph grammars, and it appears to be a fairly natural class with strong closure properties. Some new techniques and non-trivial results arise from the definitions.

Recasting other known formalisms as special cases of graph grammars, after forging some generic tools for understanding their languages, is a further motivation to deal with graph grammars; Petri nets are sometimes too limiting from the modeling point of view. The dynamics of many kinds of systems, for instance systems modeled as high-level Petri nets, can be understood in terms of local operations on a graph-like system state. Edges are necessary, for instance, to ensure the coherent transport of a set of tokens which logically belong to a single modeled entity. We argue that understanding the languages of such systems, as a simple form of semantics, can lead to steps towards the analysis, verification and possibly synthesis from behavioural specifications of complex systems.

(This work was supported by a University of Hamburg (HmbNFG) doctoral fellowship and is supported by the German Research Foundation (DFG), grant GRK 1765 (Research Training Group "System Correctness under Adverse Conditions").)
The paper is structured as follows. Section 2 reviews the basic definitions and introduces derivation languages. Section 3 presents closure properties, constructively obtained via operations on graphs and grammars. We present a proof of the closure of L$ under letter-to-letter homomorphisms, which express (possibly ambiguous) rule relabelings. We use the closure properties to establish a first hierarchy result. In Section 4 we present an upper bound on the nondeterministic complexity of the word problem for a derivation language. We show that many questions about general derivation languages are undecidable, and that applying deleting homomorphisms leads to the recursively enumerable languages. Section 5 concludes with an overview of our results and an outlook.

2 Graph Grammars

In this section, we introduce the notions of graphs, rules, grammars and derivation languages. We write − for set difference, X − Y = {x ∈ X | x ∉ Y}, and f|Y for the restriction of the function f : X → Z to Y ⊆ X. ◦ is function or relation composition. X + Y denotes the union of the sets X and Y when these are assumed to be disjoint. If A ⊆ X ∩ Y, X +A Y also denotes a union of X and Y, with X − A and Y − A assumed disjoint.

Assumption 1 Let Λ be a label alphabet.

Definition 1 (Graphs and Morphisms) A graph is a tuple G = (VG, EG, sG, tG, λGV, λGE) of two disjoint finite sets VG of nodes and EG of edges and four total functions sG, tG : EG → VG and λGV : VG → Λ, λGE : EG → Λ, called the source and target mappings and the label functions, respectively. A morphism f : G → H from a graph G to a graph H is a pair f = (fV, fE) of total functions fV : VG → VH and fE : EG → EH which preserves the connectivity and the labels: sH ◦ fE = fV ◦ sG, tH ◦ fE = fV ◦ tG, λHE ◦ fE = λGE and λHV ◦ fV = λGV. f is called injective, or an inclusion (usually denoted by f : G ↪→ H), iff both fV and fE are injective. An isomorphism is a morphism with bijective components. Being related by an isomorphism is an equivalence relation, denoted ≅. Nodes and edges are collectively called items. When the meaning can be inferred from the context, we will denote both components fV and fE of a morphism f just by the symbol f. We write λG for both λGV and λGE when it is convenient to do so. Graphs is the set of all graphs.

Definition 2 (Rule) A rule is a tuple % = (L ←↩ K ↪→ R) of graphs, where K is called the interface and is contained both in the left hand side graph L and in the right hand side graph R. A rule % is called identical, and denoted by idK, if L = K = R.

Definition 3 (Direct Derivation Step) Let % be a rule, G a graph and g : L ↪→ G an injective morphism. The dangling condition requires that for every node n ∈ VG in the range of gV that is not the image of a node in K, every edge e ∈ EG with sG(e) = n or tG(e) = n is in the range of gE. If g satisfies the dangling condition, then g is called a match of L in G. A direct derivation step G ⇒%,g H from a graph G to a graph H with a rule % via a match g consists of a double pushout diagram with pushout squares (1) and (2) over the rows L ←↩ K ↪→ R and G ←↩ D ↪→ H, where g : L ↪→ G and h : R ↪→ H are the vertical morphisms, (1) D = G − g(L − K), and (2) H = D + h(R − K). The equation (1) stands for VD = VG − gV(VL − VK) and ED = EG − gE(EL − EK); (2) means VH = VD + hV(VR − VK) and EH = ED + hE(ER − EK). We may omit the match in the notation: G ⇒% H. Given a set R of rules, a derivation G0 ⇒*R,v Gn is a sequence of derivation steps G0 ⇒%1 G1 ⇒%2 ... ⇒%n Gn with v = %1 ... %n.
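To make the dangling condition and the two halves of a derivation step concrete, here is a small Python sketch on a naive dictionary-based graph representation. It is only an illustration of Definition 3 under our own encoding conventions (the Graph class, identifier scheme and helper names are not part of the paper), not an implementation of the categorical construction.

# A minimal, illustrative encoding of Definition 3 (not the paper's categorical
# formalism). Graphs are dictionaries: nodes maps node-id -> label, edges maps
# edge-id -> (source-id, target-id, label). A rule (L, K, R) shares item
# identifiers, so the interface K is a literal subgraph of L and R.
from itertools import permutations, count

_fresh = count(10**6)              # supply of fresh identifiers for new items

class Graph:
    def __init__(self, nodes, edges):
        self.nodes = dict(nodes)   # node-id -> label
        self.edges = dict(edges)   # edge-id -> (src, tgt, label)

def injective_matches(L, G):
    """Enumerate injective, label-preserving morphisms L -> G (brute force)."""
    for image in permutations(G.nodes, len(L.nodes)):
        gV = dict(zip(L.nodes, image))
        if any(L.nodes[v] != G.nodes[gV[v]] for v in L.nodes):
            continue                           # node labels must agree
        gE, ok = {}, True
        for e, (s, t, lab) in L.edges.items():
            # pick an unused G-edge with the required endpoints and label
            cands = [f for f, d in G.edges.items()
                     if d == (gV[s], gV[t], lab) and f not in gE.values()]
            if not cands:
                ok = False
                break
            gE[e] = cands[0]
        if ok:
            yield gV, gE

def dangling_ok(L, K, G, gV, gE):
    """Dangling condition: every G-edge touching a node that the rule deletes
    must itself be in the image of g (and is then deleted with it)."""
    deleted = {gV[v] for v in L.nodes if v not in K.nodes}
    matched = set(gE.values())
    return all(f in matched or not ({s, t} & deleted)
               for f, (s, t, _) in G.edges.items())

def apply_rule(rule, G, gV, gE):
    """One DPO step G => H: D = G - g(L - K), then H = D + h(R - K)."""
    L, K, R = rule
    if not dangling_ok(L, K, G, gV, gE):
        raise ValueError("g does not satisfy the dangling condition")
    del_nodes = {gV[v] for v in L.nodes if v not in K.nodes}
    del_edges = {gE[e] for e in L.edges if e not in K.edges}
    nodes = {v: l for v, l in G.nodes.items() if v not in del_nodes}
    edges = {e: d for e, d in G.edges.items() if e not in del_edges}
    hV = {v: gV[v] for v in K.nodes}           # interface keeps its image
    for v, lab in R.nodes.items():             # glue in the new nodes of R - K
        if v not in K.nodes:
            hV[v] = next(_fresh)
            nodes[hV[v]] = lab
    for e, (s, t, lab) in R.edges.items():     # and the new edges of R - K
        if e not in K.edges:
            edges[next(_fresh)] = (hV[s], hV[t], lab)
    return Graph(nodes, edges)

Because matches are required to be injective, no identification condition is needed; the dangling condition alone guarantees that the intermediate object D is again a graph.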
Definition 4 (Graph Grammars and Derivation Languages) A graph grammar, short grammar, is a tuple GG = (T, R, r, S) of an alphabet T of (rule) labels, a finite set R of rules, a surjective function r : T → R and a start graph S. We write G ⇒*GG,w H for G ⇒*R,r(w) H, where w = w1 ... wn and r : T* → R* denotes the extension of r : T → R with r(w1 ... wn) = r(w1) ... r(wn) for w = w1 ... wn ∈ T*. The subscript may be omitted when GG can be inferred from the context. If G ⇒*w H and H ⇒r(t) J, we say that w may enable t from G. We will also use the terms rule application and applicability of a rule for a derivation step and the existence of a derivation step, respectively.

The prefix-closed derivation language of GG is the set LD(GG) := {w ∈ T* | S ⇒*GG,w} of all sequences w of rule labels such that a derivation starting from S with the corresponding rules exists. The $-terminated derivation language of GG and $ ∈ T is the set of all strings over T − {$} in LD(GG) which may enable $:

L$(GG, $) := {w ∈ (T − {$})* | w$ ∈ LD(GG)}

Unless otherwise stated, the terminating character is $, and we simply write L$(GG). LD denotes the class of all languages LD(GG) for some grammar GG, and L$ is defined analogously.

(Figure: a derivation G0 ⇒%1 G1 ⇒%2 ... ⇒%n Gn with r(wi) = %i yields the word w1 ... wn ∈ LD(GG); if in addition a further step Gn ⇒%n+1 Gn+1 with %n+1 = r($) exists, then w1 ... wn ∈ L$(GG), where w1, ..., wn ≠ $.)

For every derivation sequence there exist in general one or several rule label sequences. For every rule label sequence in T* there exist in general zero, one or several derivations.

Let us fix conventions for the graphical representation of graphs, rules and grammars. Node colours and shapes shall represent labels. Node inscriptions represent node names or labels, according to the legend given in the text. Edges are also decorated (dashed or thick lines) according to their label. Rules are represented with the left hand side, interface and right hand side in that order from left to right; the common items are placed at the same location relative to the bounding box in all three graphs of the rule. A special representation is introduced later to abstractly represent subgraphs. In two instances, a subgraph will be highlighted by rendering its nodes and edges in bold and in bright colours.

Example 1 Let the rules R of a grammar GG be the four rules a, b, c and $ shown in the figure (each given as left hand side ⊇ interface ⊆ right hand side). Then, from the given start graph, there is a derivation whose rule label sequence is a a b b c c $ (the intermediate graphs are shown step by step in the figure), and the word aabbcc is a word of L$(GG).
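Definition 4 can be read operationally: a word w belongs to L$(GG) iff some derivation labeled w leads from S to a graph in which the end rule has a match. The following sketch, reusing the hypothetical Graph helpers from the previous listing, decides this by brute-force exploration of all derivations labeled w; it is meant only to make the acceptance condition explicit, not as an efficient procedure, and the grammar encoding (r, S) is again our own convention.

# Illustrative only: decide membership in LD(GG) and L$(GG) by exhaustively
# exploring every derivation labeled w. Reuses the hypothetical Graph,
# injective_matches, dangling_ok and apply_rule sketched above. A grammar is
# modelled as a labeling dict r (letter -> rule) plus the start graph S.

def successors(G, rule):
    """All graphs reachable from G by one application of `rule`."""
    L, K, _ = rule
    for gV, gE in injective_matches(L, G):
        if dangling_ok(L, K, G, gV, gE):
            yield apply_rule(rule, G, gV, gE)

def _states_after(r, S, w):
    """All graphs reachable from S via derivations labeled w (no pruning of
    isomorphic copies, so this is exponential in general)."""
    states = [S]
    for letter in w:
        states = [H for G in states for H in successors(G, r[letter])]
    return states

def in_prefix_language(r, S, w):
    """w in LD(GG): some derivation from S is labeled w."""
    return bool(_states_after(r, S, w))

def in_dollar_language(r, S, w, end='$'):
    """w in L$(GG): w avoids '$' and some derivation labeled w ends in a graph
    in which the end rule r('$') has a match."""
    if end in w:
        return False
    L, K, _ = r[end]
    return any(dangling_ok(L, K, G, gV, gE)
               for G in _states_after(r, S, w)
               for gV, gE in injective_matches(L, G))

The search keeps every reachable graph and is therefore exponential in general; Proposition 3 in Section 4 gives much better nondeterministic bounds.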
3 Closure Properties and Relation to Other Families

This section presents several features of L$ in two subsections. In Subsection 3.1, we constructively prove that L$ is closed under homomorphisms which assign only single letters (several letters may have the same image). For the purpose of language generation, it is possible to simulate such a non-injective relabeling of the rules with a modified grammar. The modification of the start graph and rules to this effect will be explained in detail and proved correct. The resulting theorem is a property of L$ not shared by the similar but less general mechanisms that had so far been investigated in the literature. It has non-trivial applications to the further development of the derivation language theory of graph grammars: in Subsection 3.2, we use it as an ingredient for the proof of the inclusion of the context free languages CF in L$. From that proof, the stronger statement of the proper inclusion CF ⊂ L$ already follows, because L$ is closed under intersections and CF is not. We then demonstrate that all Szilard languages are in L$, and finally provide another valuable example of a non-context-free language in L$.

3.1 Closure under Rule Relabelings

We start with a construction that serves to extend a graph with an extra node, which is connected to all nodes of the original graph. The construction is then extended to rules. Because the extra node is given a special label, it cannot be confused with any other node, and the matches of the modified rule in the modified graph will correspond exactly to the matches of the original rule in the original graph. This construction, which we call suspension, may appear almost trivial but is essential in the sequel: the extra node can serve as a single gluing point to attach the whole graph to a specific place in a much larger graph. This, in turn, is essential for our approach to constructing a grammar that runs a simulation of another grammar such that the language of the simulating grammar is a homomorphic image of the language of the simulated grammar.

Definition 5 (Suspension) Given a graph G and a new label ∗ not in the range of λGV nor λGE, the suspended graph Gx is obtained from G by adding a new node x with label ∗, and edges from x to each node of VG, also with label ∗. We will usually represent Gx graphically as a box standing for G with the ∗-labeled node x attached to it. Given a morphism f : G → H, fx,y denotes the morphism from Gx to Hy such that fx,y(u) = f(u) if u is an item of G, fx,y(x) = y, and each new edge of Gx (from x to a node of G) is mapped to the corresponding new edge of Hy (see the figure).

We fix a label ∗ to be implicitly used as the new label for all suspensions in the rest of the paper, and assume it does not occur in any graph subjected to the construction. Let %x denote the rule obtained by suspending the graphs of % on x. Given a grammar GG, GGx denotes the grammar obtained by applying suspension to the start graph and to the interfaces, left hand and right hand sides of the rules of GG.

Lemma 1 (Suspension Lemma) For every match g : L → G of a rule % in G, there is a match gx,y : Lx → Gy of %x in Gy such that Gy ⇒%x,gx,y Hy if G ⇒%,g H. Every match q : Lx → Gy of a rule %x in Gy is of the form q = gx,y for some morphism g : L → G, and g is a match of % such that G ⇒%,g H if Gy ⇒%x,gx,y Hy.

Proof. G ⇒%,g H implies Gy ⇒%x,gx,y Hy: Extend the matches g and h on x to gx,y resp. hx,y. This results in a derivation because gx,y is again a match: the dangling condition is satisfied because every node n to be deleted has a supplementary edge attached to it, but that edge from the ∗-labeled node of Gy to n is also present in the modified left hand side graph of the rule %x.

Gy ⇒%x,gx,y Hy implies G ⇒%,g H: Since there is exactly one node labeled ∗ in Gy, and one edge from it to every node from G if there are any nodes in G, the images of the ∗-labeled node and edges of D in H are determined by the source and target consistency requirements for a morphism. The removal of all such ∗-labeled items from all graphs involved in the derivation step does not impose any new restrictions. □
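Under the same assumed dict-based representation as in Section 2, the suspension of Definition 5 is a one-liner per item; the sketch below is ours and only illustrates the construction.

# A sketch of Definition 5 with the dict-based graphs above: suspend G on a
# fresh node x labeled '*', connected by a '*'-labeled edge to every node of G.
# The edge identifier ('susp', x, v) is our convention; it makes K^x literally
# a subgraph of L^x and R^x when a whole rule is suspended on the same x.

STAR = '*'

def suspend(G, x='x'):
    """Return G^x: G plus a node x labeled '*' and an edge x -> v per node v."""
    assert x not in G.nodes, "the suspension node must be fresh"
    nodes = dict(G.nodes)
    nodes[x] = STAR
    edges = dict(G.edges)
    for v in G.nodes:
        edges[('susp', x, v)] = (x, v, STAR)
    return Graph(nodes, edges)

def suspend_rule(rule, x='x'):
    """The rule %^x = (L^x <- K^x -> R^x): all three graphs suspended on x."""
    L, K, R = rule
    return (suspend(L, x), suspend(K, x), suspend(R, x))

The Suspension Lemma then says that matches of suspend_rule(%) in suspend(G) correspond exactly to matches of % in G, because the unique ∗-labeled node fixes the image of x.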
It is expedient to define gluing constructions for graphs and rules and to prove several properties that will allow us to use those definitions later in this section for the proof of the main theorem.

Definition 6 (Gluing) The gluing of two graphs G and H along the graph A ⊆ G, H is the graph denoted G +A H with node set (VG +VA VH) and edge set (EG +EA EH), with VG − VA and VH − VA assumed disjoint, and similarly for the edges. If A has no edges, we simply write G +VA H for G +A H, also omitting the set brackets if no confusion is possible, and G + H if VA is empty. If A is a set of nodes which is not a subset of VG ∩ VH, then G +A H is an abbreviation for G +A∩VG∩VH H.

Definition 7 (Parallel Composition) Given two rules %1 and %2 and a set A ⊆ VK1 ∩ VK2 of common nodes, the parallel composition of %1 and %2 is the rule %1 +A %2 := (L1 +A L2 ←↩ K1 +A K2 ↪→ R1 +A R2). If %2 = (L2 ←↩ K2 ↪→ R2) is identical (L2 = K2 = R2), then %1 +A %2 is called an enlargement of %1.

We recall two properties of the parallel composition collectively known as the Parallelism Theorem (Theorem 4.7 in [Ehr79]).

Lemma 2 (Parallel Step Lemma [Ehr79]) If G1 ⇒%1 H1 and G2 ⇒%2 H2, then G1 +A G2 ⇒%1+A′%2 H1 +A H2, where A′ is the common preimage of A via both matches.

Lemma 3 (Sequential Decomposition [Ehr79]) A derivation step G ⇒% H of the parallel composition of two rules % = %1 +A %2 can be decomposed sequentially into two derivation steps G ⇒%1 J ⇒%2 H, assigning the same image to the nodes of A in both matches.

It will frequently become necessary in the subsequent constructions to transfer a rule application to a larger graph, or to restrict attention to a certain part of a derivation step. The next lemma allows us, under certain conditions, to conclude that the graphs involved in derivation steps may be enlarged or shrunk without interfering with the derivation step. It is partly similar to the Embedding and Restriction Theorems (Theorems 6.14 and 6.18 in [EEPT06]) but additionally allows us to enlarge and shrink rules, not just their contexts of application.

Lemma 4 (Embedding Lemma) Let X be a graph, % a rule, A a subgraph of K, and %′ the enlargement % +A idX. Then, for any match g : L → G, writing g′ for the enlarged version of g, and L′ for L +A X and so on (the situation is shown in a diagram with rows L ←↩ K ↪→ R and L′ ←↩ K′ ↪→ R′, connected by the inclusions l′, r′, and the bottom row G′ ←↩ D′ ↪→ H′, connected to the middle row by g′ and h′):

(1) G ⇒%,g H implies G′ ⇒%,g′◦l′ H′.
(2) G ⇒%,g H implies G′ ⇒%′,g′ H′.

Conversely,

(3) G′ ⇒%,g′◦l′ H′ implies G ⇒%,g H.
(4) G′ ⇒%′,g′ H′ implies G ⇒%,g H.

Proof. We notice that the presence of items of X in the graph to which the rule is applied does not influence the satisfaction of the dangling condition, because all edges from X are attached to nodes of K, and those are not deleted.

G ⇒%,g H implies G′ ⇒%,g′◦l′ H′: the intermediary graph is constructed as D = G − g(L − K). Since X is glued to nodes of K, the dangling condition is respected, and the outer difference in D′ = (G − g(L − K))′ = (G′ − g(L − K)) is a graph. Completing the derivation step results in H′.

G ⇒%,g H implies G′ ⇒%′,g′ H′: the only difference to the first case is that now g(L − K) is replaced with g(L′ − K′) and analogously R − K with R′ − K′ when completing the derivation step. But L′ − K′ = L − K and R′ − K′ = R − K, so the enlarged rule has the same effect as the original rule.

G′ ⇒%,g′◦l′ H′ implies G ⇒%,g H: as argued before, gluing in X along A does not interfere with the dangling condition.
One weakens the conclusion of the derivation step to obtain a proposition about G′−(X −A), D′−(X −A), H′−(X −A) only, which are indeed the graphs G, D and H. G′ %′,g′ ⇒ H′ implies G %,g ⇒H: one proceeds as in the third case and ignores what is known about the nodes and edges of X from the definition of the derivation step, to obtain a statement about a derivation step with (L′−(X−A) ←↩ K′−(X−A) ↪→ R′−(X−A)) = % from G′−(X−A) = G. The result of the first half of the derivation step is D′−(X −A) = D since (G′−g(L−K))−(X −A) is equal to (G′−(X−A)−g(L−K)) = D, and this is also a graph (no dangling edges) for the reason stated in the introductory remark. H is obtained as before, as H = (D + h(R−K)). � A comparison with Petri net and Szilard languages suggests the question whether the image of a derivation language under a non-injective relabeling of the rules (assigning ambiguous labels to some rules) is still a derivation language. In the cases of Petri net [Jan87] and Szilard [Hö75] languages, this increases the generative power. We devote the next pages to showing how a grammar GG can be modified such that the derivation language of the modified grammar GGh indeed simulates two distinct rules with the same label. Two letters a1 and a2 used for rules in GG would be merged to a common image h(a1) = h(a2) = a, and LD(GGh) = h(LD(GG)) and L$(GGh) = h(L$(GG)). Closure under arbitrary letter-to-letter homomorphisms then follows by repeated application of such mergers. Since the effect of a rule is always the same, when simulating the effects of two different rules originally labeled a1, a2 with a single compound rule %a, it is impossible to choose between the two in a derivation step. Both will necessarily be applied. But since one of the effects 7 / 22 Volume 61 (2013) Derivation Languages of Graph Grammars is unwanted, the idea is to neutralise it by extending the graph of the simulated grammar with a specially constructed sandbox which contains the left hand side of the unwanted rule to guarantee its applicability, and then forcing the match of the unwanted rule into the sandbox, where it cannot interfere with the simulation. Since the sandbox will be depleted after each usage, it must also be replenished by the compound rule. We now introduce three control graphs (Control, Control′ and Control′′) that guide the applica- tion of the compound rule and the other rules in the simulation. Roughly, a suspended copy of the simulated grammar’s current graph G will be glued to a node of Control. The other nodes of Control will serve as a sandbox, initialised by gluing to them the suspended left hand sides of a1 and a2 (as one does not know in advance which of a1 or a2 will be unwanted, both left hand sides must be available in the sandbox). We call the result of this construction an augmented graph G̃. It is the graph on which the modified grammar operates to perform the simulation. In the first control graph, the role of node 0 is to provide a handle for suspending the simulation state. Control has no symmetry exchanging 0 with another node. The nodes 1 and 2 will serve as the sandbox to execute the unwanted rule, while 3 only collects garbage. Control′, the control graph for unambiguously relabeled rules, has a node that can only match to 0 and this ensures the correct simulation of these rules. On the other hand, Control′′ will be used for compound rules. This graph has two nodes (4 and 5) that can match to 0. 
The special construction of Control′′ directs the match of the a1 part of the compound rule to the sandbox when the a2 part is matched on the graph of the simulated grammar and vice versa. Construction 1 (Control Graphs for the Simulation) The basic control graph Control is defined as the graph with node set {0,1,2,3}, all nodes labeled ∗ and the edges shown: edges represented in dotted lines bear the label new and edges represented in thick lines have the label link. 0 1 2 3 The control graph Control′ for unambiguously relabeled rules is defined as the graph with node set {8,9,10}, label ∗ for all three nodes and just two new edges: 8 9 10 The control graph Control′′ for ambiguously relabeled rules is defined as the graph with node set {4,5,6,7}, label ∗ for all four nodes, and the edges shown. 4 5 6 7 Selected Revised Papers from GCM 2012 8 / 22 ECEASST We also assume that in the subsequent constructions, no other new and link labeled edges occur except those introduced with these three graphs. Construction 2 (Augmented Graphs) Given a grammar GG = (T,R,r,S), two letters a1,a2 ∈ T , r(a1) = %1, r(a2) = %2, and a label ∗ not occurring in any rule of R nor in S, for i, j ∈ {1,2}, let G̃(i j) := Control + 0,1,2,3 ( G0 + ( L1 i + 1,2 L2 j )) , be the graph consisting of one copy of G suspended to the node 0 of Control, and one supple- mentary copy of the left hand side of each one of the rules %1, %2 suspended either on 1 or on 2. All four possibilities (i, j) = (1,1), (1,2), (2,1) or (2,2) are represented in a schematic drawing; an actual G̃(i j) graph is obtained by choosing one L1 and one L2 copy and striking the other. 0 1 2 3 G L1 L2 L1 L2 G̃(1,2) and G̃(2,1) are isomorphic, and so are G̃(1,1) and G̃(2,2), but since we have chosen to con- sistently track the nodes of the Control graph through a derivation sequence by their identities, there are four variants, not two. The role of i and j is as follows: since one does not know in advance whether the simulation requires a copy of L1 or a copy of L2 at a given step to neutralise the collateral effect, both must be “on stock”; however, just placing a new left hand side where the current one was consumed is not a solution, because one of the rule effects will be applied “for real” on the graph suspended on node 0. Placing a new left hand side on 0, however, could interfere with the simulation. Therefore, a more complicated scheme is necessary. Our solution requires, for compound rules, an alternation between the cases i = j and i , j. Unambiguously relabeled rules ignore the sandbox part and leave i and j unchanged after the derivation step. Construction 3 (Augmented Rules, First Case) A rule % = (L ←↩ K ↪→ R) is changed into %′ := ( Control′ + 8 L8 ←↩ Control′ + 8 K8 ↪→ Control′ + 8 R8 ) . 8 9 10 L 8 9 10 K 8 9 10 R ⊇ ⊆ 9 / 22 Volume 61 (2013) Derivation Languages of Graph Grammars Lemma 5 (Unique Simulation Step in the First Case) If a rule % is transformed to %′ according to Construction 3, and Rest contains no ∗ labeled nodes or edges besides 1,2,3, there is a % derivation from G to H iff there is a corresponding %′ derivation between the augmented graphs: G % ⇒H iff∃Rest ∈ Graphs and i, j ∈ {1,2} such that ( G̃(i j) + 1,2,3 Rest %′ ⇒ H̃(i j) + 1,2,3 Rest ) . Proof. 
G % ⇒H implies G̃(i j) + 1,2,3 Rest %′ ⇒H̃(i j) + 1,2,3 Rest: By Lemma 1, G %,g ⇒H implies G0 %8,g8,0 =====⇒ H0, which in turn implies G̃(i j) %8,g8,0 =====⇒ H̃(i j) by Lemma 4 (1), and G̃(i j) +1,2,3 Rest %′,g′8,0 =====⇒ H̃(i j) +1,2,3 Rest by Lemma 4 (2), where %′ is obtained from rule % as in Construction 3, and g′ is the en- largement of the match (as in Lemma 4) by the graph Control′, glued in along 8. G̃(i j) +1,2,3 Rest %′ ⇒H̃(i j) +1,2,3 Rest implies G % ⇒H: let q be the match used in the derivation step in the antecedent. The only possible image for the node 8 is the node 0 (no new ∗ labeled nodes nor control edges have been added, and there is only one morphism from Control′ to Control). This constrains the images of the nodes from L% to lie within G, because all edges e of L%′ with q(s(e)) = 0 in G̃(i j) +1,2,3 Rest not labeled new are suspension edges from G0, by construction. Rest + ∅̃(i j) is linked only to nodes of g(K), therefore an application of Lemma 4 (3) with X = Rest + ∅̃(i j) glued in along {1,2,3}, followed by the second proposition of Lemma 1 to reverse the suspension, concludes the proof. � Construction 4 (Augmented Rules, Second Case) From two rules r(a1) = %1 and r(a2) = %2, a rule %a is constructed by gluing along the ∗ nodes the following rules, in the sequel referred to as the sub-rules of %a, %14, %25, %add1 := (∅←↩∅ ↪→ L1) 7, %add2 := (∅←↩∅ ↪→ L2) 6, idControl′′: %a := ( %1 4 + %2 5 + %add1 + %add2 ) + 4,5,6,7 idControl′′. 4 5 6 7 L1 L2 4 5 6 7 K1 K2 4 5 6 7 R1 R2 L1L2 ⊇ ⊆ Lemma 6 (Essentially Unique Simulation Step in the Second Case) If the rules %1 and %2 are transformed into %a according to Construction 4, then if λ−1Rest(∗) ⊆{1,2,3} and ∃i, j,k,l ∈ {1,2}, (G %1 ⇒H ∨ G %2 ⇒H) iff ( ∃Rest,Rest′ ∈ Graphs : G̃(i j) + 1,2,3 Rest %a ⇒ H̃(kl) + 1,2,3 Rest′ ) . Proof. G %1 ⇒H ∨G %2 ⇒H ⇒ ∃Rest,Rest′ ∈ Graphs : G̃(i j) +1,2,3 Rest %a ⇒H̃(kl) +1,2,3 Rest′: Assume first that G %1 ⇒H. Due to the symmetrical construction of %a, the case of G %2 ⇒H is com- pletely analogous. Selected Revised Papers from GCM 2012 10 / 22 ECEASST There is a match g such that G %1,g ===⇒ H. We apply the Suspension Lemma (Lemma 1) to suspend %1 as %14 and the match as g4,0, G0 %1 4,g4,0 =====⇒ H0, and repeat the same with the derivation step L2 %2, f ===⇒ R2 to obtain L2 j %2 5, f 5, j =====⇒ R2 j. Fix matches of the sub-rules %add1 and %add2 which assign 7 to 3 and 6 to 3− j (6 is mapped one of 1 and 2, but not the same as j). Applying proposition (1) of the Embedding Lemma to each of the sub-rule derivations in turn results in a derivation of four steps. Leaving the matches implicit, G̃(i j) +1,2,3 Rest = ( Control +0,1,2,3 ( G0 + (L1i +1,2 L2 j) )) +1,2,3 Rest %1 4 ==⇒ ( Control +0,1,2,3 ( H0 + (L1i +1,2 L2 j) )) +1,2,3 Rest %2 5 ==⇒ ( Control +0,1,2,3 ( H0 + (L1i +1,2 R2 j) )) +1,2,3 Rest (∅←↩∅↪→L1) 7 =========⇒ ( Control +0,1,2,3 ( H0 + (L1i +1,2 R2 j) )) +1,2,3 (Rest +3 L13) (∅←↩∅↪→L2) 6 =========⇒ ( Control +0,1,2,3 ( H0 + (L1i +1,2 R2 j) )) +1,2,3 ( Rest +i,3 ( L13 + L23− j )) = ( Control +0,1,2,3 ( H0 + (L1i +1,2 L23− j) )) +1,2,3 ( Rest +i,3 ( L1 3 + R2 j )) ︸ ︷︷ ︸ Rest′ Here, k = i because the subgraph L1i is not used up and can be re-used as a constituent of the augmented graph H̃(kl). The new copy of L1 which has been suspended to 3 by the application of the sub-rule %add1 becomes part of Rest ′. l = 2 when j = 1 and l = 1 when j = 2, because the subgraph L2 j has been used up and transformed into R2 j by the application of the sub-rule %25. 
The subgraph R2 j thus created becomes part of Rest′. A new copy of L2 has instead been suspended to the node l by the rule %add2 . Note that k = l if i , j and k , l if i = j. After applying the Embedding Lemma, forms (3) and (2), to enlarge the rules by the control graph Control′′ (the image of Control′′ in Control is determined by the node mapping), this yields the desired derivation step by Parallel Composition (Lemma 2), by the construction of %a. ∃Rest,Rest′ ∈ Graphs : G̃(i j) +1,2,3 Rest %a ⇒H̃(kl) +1,2,3 Rest′ ⇒ G %1 ⇒H ∨G %2 ⇒H: We proceed again by case analysis. The following is true for any match q of %a in G̃(i j): By construction of Control′′ and Control, q must assign either 4 to 0 and 5 to j, or 5 to 0 and 4 to i. In the first case, we seek to arrive at the conclusion that G %1 ⇒H, the second case is again completely analogous and will, mutandis mutatis, lead to G %2 ⇒H. The derivation step using %a is equivalent to a parallel composition of the following two deriva- tion steps: the first step is an application of an embedded form of %01 with a match mapping 0 to 4. The other step leaves unchanged the graph suspended on 0, but performs the rest of %a. By Lemma 3, we can extract the first derivation step by sequential decomposition, with suitable matches both mapping 4,5,6,7 to 0, j,i,3 in that order. G̃(i j) + 1,2,3 Rest %1 4 + 4,5,6,7 idControl′′ =============⇒ H̃(i j) + 1,2,3 Rest %′′′ ==⇒ H̃(kl) + 1,2,3 Rest′, 11 / 22 Volume 61 (2013) Derivation Languages of Graph Grammars where Rest′ = Rest +1,2,3 ( R2 j + L13 ) , k = i, l = 3− j, %′′′ = ( (%25 + %add1 + %add2 ) + 4,5,6,7 idControl′′ ) , because from H̃(i j) +1,2,3 Rest = (Control +1,2,3,4 (H0 +1,2,3 (L1i +1,2 L2 j))) +1,2,3 Rest, with the given information about the match, a derivation step %′′′ must yield (Control +1,2,3,4 (H0 +1,2,3 (L1i +1,2 R2 j))) +1,2,3 (Rest +3,i (L13 + L2i)). This can again be rearranged into the desired form H̃(kl) +1,2,3 Rest′. An application of the Embedding Lemma (4) and (3) to the first step, followed by the Suspension Lemma, allows us to arrive at the conclusion G %1 ⇒H. � We have represented the graph G̃(21) schematically and highlighted a match of %a in blue, in the case G ⇒%1 H. The second drawing represents the situation after the derivation step induced by this match. We have highlighted the resulting subgraph H̃(22) and represented Rest′ in gray. 0 1 2 3 GGG L2L2L2 L1 0 1 2 3 0 1 2 3 H R2 L1 L2 L1 0 1 2 3 Construction 5 (Merging Two Letters) Given a homomorphism h : T∗ → T′∗, T′ = (T −{a}+ {a1,a2}), and a grammar GG = (T,R,r,S), let GGh := (T′,R′,r′,S̃(12)), r′(a) := %a built from r′(a1), r′(a2) according to Construction 4, ∀b , a, r′(b) := r(b)′ according to Construction 3. Closure under strictly letter-to-letter homomorphisms is an operation on classes of languages often considered in the literature on formal languages and sometimes denoted by H1. Theorem 1 (Closure under Letter-to-Letter Homomorphisms) L$ is closed under H1. Proof. Assume, w.l.o.g., that the homomorphism h merges two letters a1 and a2 to a single image a and maps each other letter to itself. Any letter-to-letter homomorphism between finite alphabets can be factored into finitely many such elementary mergers. Let L = L$(GG). We shall show that Construction 5 results in a grammar GGh such that L$(GGh) = h(L$(GG)). 
We prove the proposition by induction over the length of a derivation, on the hypothesis that G⇒∗wGG H iff there are two graphs Rest, Rest′ such that G̃ +1,2,3 Rest⇒∗ h(w) GGh H̃ +1,2,3 Rest′, where G̃ is one of Selected Revised Papers from GCM 2012 12 / 22 ECEASST G̃(11), G̃(12), G̃(21), G̃(22) and H̃ is one of H̃(11), H̃(12), H̃(21), H̃(22). The induction step is already proven in Lemmata 5 and 6, since the required property that the nodes labeled ∗ are precisely 0,1,2,3 is fulfilled by all graphs reachable in GG as no derivation step can introduce a label occurring neither in the graph being transformed nor in the rule. � 3.2 Context Free Languages and Beyond From formal language theory, we know that REG ⊂ CF ⊂ CS ⊂ RE, where REG stands for the regular languages, CF for the context free languages, CS for the context sensitive languages and RE for all recursively enumerable languages. To each of the four classes corresponds a type of grammar, known as type 3, 2, 1 and 0, respectively [HU79]. In this subsection, we locate L$ above the context free languages in a first hierarchy result, CF ⊆L$. This result will immediately exposed as being quite weak, because our proof already shows the closure of L$ under intersections, from which follows CF ⊂L$. We also establish that the Szilard languages are a subset of L$, and provide a grammar for the language {a 2n | n ∈N}. We shall start with the inclusion REG ⊆L$, which is also useful subsequently. The construction for transforming a nondeterministic finite automaton A into a graph grammar GG whose derivation language L$(GG) is the language L(A) accepted by A is straightforward. Lemma 7 (Regular Languages) Every regular language is in L$: REG ⊆L$. Proof. Given a finite automaton A = (Q,T,δ,q0, QF ) with state set Q, one initial state q0 ∈ Q, alphabet T , transition relation δ ⊆ Q×T × Q, final states QF ⊆ Q, let sA : Q → Graphs be the function assigning to each state q̂ a graph sA(q̂) with edge set V = Q ∪ δ ∪ {ma | a ∈ T} ∪ {m∗} ∪ {m$}, the union assumed disjoint. That is, every state of A, as well as every transition, becomes a node; besides, some auxiliary nodes are introduced. The edge set E = {eqma | (q,a,q ′) ∈ δ} ∪ {emaq′ | (q,a,q ′) ∈ δ} ∪ {e(q,a,q′)ma | (q,a,q ′) ∈ δ} ∪ {eqm$ | q ∈ QF} ∪ {eq̂m∗} with s(eab) = a and t(eab) = b attaches the state nodes to the transition nodes and to the special marker nodes m∗, m$ and the transition nodes to state nodes and alphabet nodes. The marker nodes are distinguishable: ∀a ∈ T ∪{∗,$} : λ(ma) = a, all other nodes obtain a nondescript label x < T : ∀v ∈ V −{ma | a ∈ T ∪{∗,$}} : λ(v) = x. All edges also have the same label, λ(e) = x. We have represented an automaton A and the graph sA(q0). q0start q1 a b a b (q0,a,q0) q0 (q0,b,q1) q1 m∗ m$ The rules are the same for every automaton. For each letter a ∈ T , let %a and %å be the rules (the latter being an identical rule): 13 / 22 Volume 61 (2013) Derivation Languages of Graph Grammars q q′ (q,a,q′) mam∗ q q′ (q,a,q′) ma q q′ (q,a,q′) ma m∗ ⊇ ⊆ q (q,a,q) mam∗ The end rule %$ removes the nodes labeled $ and ∗ and the edges attached to them, when they are linked to a common node q. q m$m∗ q q ⊇ ⊆ It is easy to see that (q,a,q′)∈δ iff sA(q) r(a) ⇒ sA(q′) or sA(q) r(å) ⇒ sA(q′). By induction over the length of an accepted or generated word, the fact that the end rule is applicable in sA(q) iff q ∈ QF , and an application of Theorem 1 with the homomorphism mapping å and a to a for each a ∈ Σ, Hence the languages of the automaton A and the grammar are equal. 
� The extended Dyck language [Okh12] over m pairs of parentheses [1]1 ...[m]m and n extra symbols x1 ... xn, D̂m,n, is the language generated by the context free grammar with variables V = {S}, terminal alphabet ⋃ i∈{1,...,m}{]i,[i} ∪ {x j | j ∈ {1,...,n}} and productions {(S,S S ),(S,�)} ∪ {(S,[iS ]i) | i ∈ {1,...,m}} ∪ {(S, x j) | j ∈ {1,...,n}}. These are a building block for the context free languages. We show that every such language is the derivation language of a graph grammar. Lemma 8 (Extended Dyck languages) The extended Dyck languages are in L$. Proof. Let Xn = {x j | j ∈ {1...n}}, Σm,n = {[i]i | i ∈ {1...m}}∪Xn. Given an alphabet X, let G(w) denote the graph encoding the word w ∈ X∗ in the following manner: there is one node ma labeled a for each a ∈ X, one node v−1 and one node v|w|, and one node vi for each i ∈{0,..., |w|−1}. Each node vi, i ∈ {−1,..., |w|−1} is linked with an edge to its successor vi+1, and if i > 0 also to the node ma such that the i-th letter wi = a. The edges and the nodes vi are all labeled •. As an example, we have represented the graph G([1[2[1[1[2). [1 [2 Let GGm,n = (Σm,n,R,r,S) be a grammar where S is the graph with m + n unconnected nodes ma, one for each opening bracket a ∈{[i| i ∈{1...m}} and two nodes v0 and v1 linked by a single edge. The end rule r($) is the rule (S ←↩∅ ↪→∅) that just deletes a connected component isomorphic to S. All rules r(x j) are identical rules that check for the existence of a single • labeled node. All rules r([i) and r(]i) are mirror images of each other, L[i = R]i , K[i = K]i , L]i = R[i . This is r([i), which extends a chain of nodes by one node linked to m[i , using the graph as a stack. r([i) pushes a symbol on the stack while r(]i) removes the top stack symbol: Selected Revised Papers from GCM 2012 14 / 22 ECEASST [i [i [i ⊇ ⊆ We use a well-known alternative characterisation of the Dyck languages: Dm := D̂m,0 is the equivalence class [�]∼m of � under the congruence ∼m induced by {([i]i,�) | i ∈ {1,...,m}} (see Berstel [Ber79]). This is easily extended to the extended Dyck languages, the congruence ∼m,n now being induced by {([i]i,�) | i ∈ {1,...,m}}∪{(x j,�) | j ∈ {1,...,n}}. This relation can be used as a confluent rewriting system, and there is a unique reduced word wred ∈ [w]∼m,n for any word w ∈ Σm,n and w ∈ D̂m,n iff wred = �. We show by induction over the length n of a word w that w ∈LD(GGm,n) implies S w ⇒∗ GGm,n G(wred ), and the derivation is uniquely determined by the word w. The statement is obvious for w = �; it is also true for w = ux j for any x j ∈ Xn and u ∈ Σ∗m,n of length n since the rules r(x j) do not modify the graph. It is also true for any u[i, [i∈ Σm,n because an opening bracket does not create any new opportunity for applying a reduction to the string. It is also true for any u]i, since r(]i) can only be applied when the last rule was r([i) or any r(x j), j ∈N (which can again be shown by induction, examining the effects and applicability of the rules) and has a unique effect: ]i takes back the effect of the last [i, which is to extend the graph by one node encoding an opening bracket. L$(GGm,n) ⊆ D̂m,n: We have just shown that any derivation labeled w ends in state G(wred ). The end rule can be applied only to a graph having G(λ) as a connected component. This can clearly not occur until the stack is cleared again, therefore w ∈ D̂m,n. D̂m,n ⊆L$(GGm,n): Any word w ∈ D̂m,n is related to λ by the relations introduced at the start of this proof. 
We show how to construct a derivation by applying these relations in any direction, by induction over the number of rules needed to go from λ to w. The induction step hinges on the fact that r(x j) rules are always applicable and have no effect on the graph, while any sequence [i]i can also be inserted at any position in a derivation because starting at any reachable graph (induction) it can be applied, uniquely corresponding to a derivation with no effect. � The next definition serves to show closure of L$ under intersections, after which we will be able to prove the inclusion of the context free languages. Definition 8 (Label-Disjoint Union) Given two graphs G and H, let G ÷ H := G(0) + H(1), where X(i) denotes the graph X with λX (x) = (x,i). Let this operation be extended to morphisms and rules: if f1 : G1 → H1 and f2 : G2 → H2, let f1 ÷ f2 be the morphism such that ( f1 ÷ f2)v(n,i) = (( f1)v(n),i) and likewise for edges. Let it be extended to grammars by removing all rules which are not present in one of the grammars: Given GG1 = (T1,R1,r1,S1) and GG2 = (T2,R2,r2,S2), GG1 ÷GG2 is the grammar with rule labels T1 ∩T2, rules %1 ÷%2 for any pair of rules %1 and %2 with the same label via r1 and r2, start graph S1 ÷ S2. In the illustration, the background colour is used to depict the alteration of the labels. It must be thought of as a contribution to the edge and node colours. 15 / 22 Volume 61 (2013) Derivation Languages of Graph Grammars Rules labeled a in the original grammars from GG1: ⊇ ⊆ from GG2: ⊇ ⊆ Rule labeled a in the resulting grammar in GG1 ÷GG2: ⊇ ⊆ By forming such disjoint unions of graphs and rules, parallel steps combining the individual effects on the component graphs are possible. The disjointness of the label sets prevents spurious derivation steps. Lemma 9 The definition of ÷ for morphisms establishes a bijective correspondence between the morphisms from G1 ÷G2 to H1 ÷ H2 and the pairs of morphisms from G to H1 and G2 to H2. Proof. Every morphism from G1 ÷G2 to H1 ÷ H2 assigns the items of (G1)0 to items of (H1)0 and those of (G2)1 to items of (H1)1, independently, determining two morphisms. Likewise, every pair of morphisms from G to H1 and G2 to H2 determines one from G1 ÷G2 to H1 ÷ H2. � The ÷ constructions are now leveraged to build a grammar that simulates two grammars GG1, GG2 simultaneously. It can perform a parallel derivation step that corresponds to one step in GG1 and one step with the same label in GG2, whenever these are applicable individually. Lemma 10 (Intersection Closure) L$ is closed under intersections K, L ∈L$ implies K ∩L ∈L$, and the operation ÷ provides a construction for (L∈{LD,L$}) L(GG1)∩L(GG2) = L(GG1 ÷GG2). Proof. LD(GG1)∩LD(GG2) = LD(GG1 ÷GG2), because the existence of any morphism in the combined grammar is now tantamount by Lemma 9 to the conjunction of the existence of one morphism in the first grammar and one in the second grammar, rule applicability and effects clearly being preserved under bijective relabelings. The same holds for the end rule, therefore also L$($,GG1)∩L$($,GG2) = L$($,GG1 ÷GG2). � Selected Revised Papers from GCM 2012 16 / 22 ECEASST Closure under intersections with regular languages follows from Lemmata 7 and 10. Corollary 1 (Regular Intersection Closure) L$ is closed under intersections with regular lan- guages. Theorem 2 (Context Free Languages) The context free languages are properly contained in L$: CF ⊂L$. Proof. 
The context free languages are characterised as the images under letter-to-letter ho- momorphisms of the intersection of an extended Dyck and a regular language (Theorem 3 in [Okh12]). L$ contains the extended Dyck languages and is closed under the letter-to-letter ho- momorphism by Theorem 1 and regular intersection by Corollary 1. The inclusion is proper by the closure under all intersections (Lemma 10), which the context free languages lack. � We now go from the languages defined by Chomsky grammars to the derivation languages of these grammars, the Szilard languages [MS97]. Definition 9 (Szilard Languages) The Szilard language of a Chomsky type 0 grammar G is the set of images of the terminal derivations of G under a bijective labeling of the productions. By Mateescu and Salomaa [MS97], it is known that all Szilard languages are context sensitive. Proposition 1 (Szilard Languages) The Szilard languages are in L$: Sz ⊂L$. Proof. This proposition can be seen by encoding each word composed of terminals and non- terminals as a linearly shaped graph with two dummy nodes at the ends, a chain of nodes repre- senting the positions in the currently derived word, which are linked to alphabet nodes: for each terminal or non-terminal a one node ma with λ(ma) = a, the links representing the letters at each position. It is the same encoding as in the proof of Lemma 8. The start state for a grammar with non-terminals N = {S, A} and terminals T = {a,b} is shown in the following picture. S A a b Each production as a rule replacing a fixed, known portion of the string by another sequence of terminals and non-terminals. A production AA → Aa, for instance, gives the following rule: A a A a A a ⊇ ⊆ Termination, that is derivation of a terminal word, can be recognised by using as the left hand side of the end rule a graph consisting of |N| nodes, one labeled with each non-terminal, and 17 / 22 Volume 61 (2013) Derivation Languages of Graph Grammars deleting these. When no more nodes use these labels, a terminal word has clearly been derived, and only then there are no more edges inhibiting the application of the end rule. Clearly, the Szilard language of any given type 0 (or type 1, type 2) grammar is obtained in this way. � To finish the section, we demonstrate that another frequently seen non-context-free language is the derivation language of a graph grammar. Proposition 2 (Powers of 2) The graph grammar GG from the following Construction 6 has the derivation language L$(GG) = {a 2n | n ∈N}. Construction 6 (Powers of 2) Let GG be the grammar with rule alphabet {a,$}, start graph S and end rule as follows (node labels are indicated, white nodes are labeled •, as are edges): ∗ $ $ $ ∗ $ ∗ $ ∗ $ ⊇ ⊆ The rule r(a) moves the ∗ labeled node along an edge, deletes the edge and replaces it with a new node labeled • and two edges: ∗ ∗ ∗ ⊇ ⊆ Proof. By inspection, the word aa corresponds to a single possible derivation sequence and enables $ at every step; it can be shown by induction over the length of a word that when the ∗ labeled marker has reached the cycle, it can only travel around the cycle, doubling its length every time it reaches the node linked to the $ labeled marker. The induction hypothesis is: Hypothesis After n ≥ 2 derivation steps with the rule a, a unique graph is reached. 
In this state, the unique node m∗ labeled ∗ is connected with an edge to the node which is on a cycle of length n = 2k + j, 2k > j, and there are no outgoing edges from the cycle to • labeled nodes, and the node connected to m∗ is 2k − j mod 2k edges away in forward direction from the unique node on the cycle connected to a node labeled $, and 2 j edges in backward direction. Therefore if j = 0, then the word ak has been read and the node connected with ∗ is also connected to a node labeled $ and so the end rule can be applied. Otherwise, ak does not enable the end rule. The hypothesis is true for n = 2 by inspection; If it is true for n = 2k + j, then also for n + 1 because the cycle makes only one match of rule a possible, whose effect is to reduce the number of edges in the forward direction by one and increase the number of edges in the backward direction by two. The cycle structure is preserved, and the number of ∗ labeled nodes is not changed. The number of and edges connected to $ labeled nodes are not changed. � Selected Revised Papers from GCM 2012 18 / 22 ECEASST 4 Limitations Graph grammars provide a very powerful mechanism for the generation of string languages as derivation languages. In this section, we examine some of the limitations of this mechanism. We present an upper bound on the nondeterministic complexity for the recognition of a derivation language. We then use known results about the context free languages to show that many other questions about general derivation languages, such as the inclusion problem, are undecidable, and that applying deleting homomorphisms to them yields the recursively enumerable languages. Proposition 3 (Space and Time Bound) Polynomial time and O(n log n) space bounds for the problem w∈?L$(GG) are attainable simultaneously with a nondeterministic Turing machine: L$ ⊆ NP, L$ ⊆ NSPACE(n log n) Proof. A direct simulation of a derivation of length n of the grammar by a nondeterministic Turing machine is possible in O(n log n) space and polynomial time. Each finite graph with v nodes and e edges can be represented as a list of node identifiers and labels, followed by an adjacency list carrying also the edge labels. If node identifiers are numbers not exceeding v (or at least bounded by v times a constant factor), this takes O((v + e) log v) space. A potential match g of a rule % is represented by flagging |VL| nodes and |EL| edges with an annotation: either a node or edge is not in g(L), or it is to be deleted, or preserved. The preimage via g is also in the annotation. Since rules have constant size throughout the derivation, this takes only linearly more space over the bare graph representation. Rule applicability comes down to checking the dangling condition, which is possible in polynomial time and very little space by searching for unflagged edges attached to nodes flagged for deletion. Rule application involves adding at most |VR| new nodes (precisely c% = |VR|−|VL| new nodes) and |ER| new edges, which may increase the space needed for node identifiers: therefore the Turing machine first deletes the entries for the obsolete nodes and edges in polynomial time and no excess space, then calls a routine to increase the space allocated per node identifier to dlog v + c%e, for example in linear space and quadratic time by using a work tape to perform the operation while copying. Then it adds the c% new nodes (for which space has been left blank at the end of the node list) and |ER|− |EL| new edges and clears all flags. 
Looking up the source and target node labels for the new edges can be done in polynomial time by going through the annotations. The number of nodes and edges involved (added) in a derivation step is constant. So during and after the application of n rules, the length of the representation is increased to at most O((v + e) log v), with v and e linear in n. □

Undecidability of the termination of graph rewriting is already well known (see Plump [Plu98]). Indeed, a graph grammar can simulate the workings of a Turing machine, but only with a derivation sequence of length proportional to the length of the computation. If unlabeled steps are allowed, then the whole set of recursively enumerable languages is obtained: Closure under arbitrary homomorphisms, which may also delete letters, is an operation on language classes often considered in the literature on formal languages, sometimes denoted by Ĥ.

Lemma 11 (The Effect of Deleting Homomorphisms) The closure of the derivation languages L$ under deleting homomorphisms is the family of recursively enumerable languages: RE = Ĥ(L$).

Proof. We slightly modify the usual proof [HU79, Kud04] of the fact that the emptiness problem for the intersection of two context free languages is undecidable, by simulating the elementary steps of any given Turing machine with two context free languages, one producing sequences of correct steps at even positions (2i, 2i + 1) and the other one doing so at odd positions (2i + 1, 2i + 2). We refer to the literature and follow the proof in [HU79], modifying it to write out the input word in a special alphabet at the beginning of the computation. Upon reaching a configuration where the Turing machine halts in an accepting state, the derivation in the simulating grammar terminates with $. Application of the deleting homomorphism to remove the whole subsequent calculation, however long it may be, then results in L ∈ Ĥ(L$). On the other hand, all images under deleting homomorphisms of L$ languages are recursively enumerable: simulate the given graph grammar breadth-first, output the images (if any) under the homomorphism of the labels of the rules used, and accept once there is a valid match for the termination rule. □

Proposition 4 (Undecidability) The following questions are undecidable for L$ languages given by grammars GG, GG′ in general: totality, equivalence, inclusion, regularity, emptiness.

Proof. Decidability of the word problem follows from Proposition 3. All questions undecidable [Kud04] for context free languages (given as push-down automata or, equivalently, as context free grammars) are undecidable for L$ languages, since the constructions behind Theorem 2 are effective. Because of Lemma 10 there is furthermore an effective way of constructing a grammar whose derivation language is the intersection of two context free languages, but it is undecidable whether the intersection of two context free languages is empty. □

The following table recapitulates the decidability results.

Question        Formal description           Decidable?
Word problem    w ∈? L$(GG)                  +
Totality        L$(GG) =? T*                 –
Equivalence     L$(GG) =? L$(GG′)            –
Inclusion       L$(GG) ⊆? L$(GG′)            –
Regularity      L$(GG) ∈? REG                –
Emptiness       L$(GG) =? ∅                  –
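The positive entry in the table, and the NP bound of Proposition 3, can also be phrased in terms of certificates: if a derivation labeled w is supplied together with the match used at every step, it can be re-checked deterministically in polynomial time, since each step only inspects and changes a part of the graph whose size is bounded by the rules. The sketch below reuses the hypothetical helpers from Section 2; the certificate format is our own choice, not the paper's.

# Illustrative sketch of the NP upper bound of Proposition 3: verify a
# derivation labeled w from a "certificate" listing the match of every step.
# Reuses the hypothetical Graph, injective_matches, dangling_ok, apply_rule.

def is_match(L, K, G, gV, gE):
    """(gV, gE) is an injective, label-preserving morphism L -> G that also
    satisfies the dangling condition."""
    if len(set(gV.values())) < len(gV) or len(set(gE.values())) < len(gE):
        return False                                   # injectivity
    if any(G.nodes.get(gV.get(v)) != lab for v, lab in L.nodes.items()):
        return False                                   # node labels preserved
    if any(G.edges.get(gE.get(e)) != (gV.get(s), gV.get(t), lab)
           for e, (s, t, lab) in L.edges.items()):
        return False                                   # structure and edge labels
    return dangling_ok(L, K, G, gV, gE)

def check_certificate(r, S, w, matches, end='$'):
    """True iff the listed matches define a derivation S =w=> G that enables $."""
    if len(matches) != len(w):
        return False
    G = S
    for letter, (gV, gE) in zip(w, matches):
        L, K, _ = r[letter]
        if not is_match(L, K, G, gV, gE):
            return False
        G = apply_rule(r[letter], G, gV, gE)
    L, K, _ = r[end]                                   # acceptance: $ has a match
    return any(dangling_ok(L, K, G, gV, gE)
               for gV, gE in injective_matches(L, G))

The nondeterministic machine of Proposition 3 effectively guesses such a certificate one step at a time, which is where the O(n log n) space bound comes from.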
5 Conclusion and Outlook

These are the relations of L$ to the Chomsky families and to the Szilard languages: REG ⊂ CF ⊂ CS ⊂ RE; CF ⊂ L$ and Sz ⊂ L$; H1(L$) = L$; and Ĥ(L$) = RE.

For easy reference, we tabulate the main results of the paper and their dependencies.

Reference        Result                                Dependencies
Theorem 1        L$ = H1(L$)
Lemma 7          REG ⊆ L$                              Theorem 1
Lemma 8          D̂m,n ∈ L$                             –
Lemma 10         K, L ∈ L$ ⇒ K ∩ L ∈ L$                –
Theorem 2        CF ⊂ L$                               Theorem 1, Lemmata 7, 8, 10
Proposition 1    Sz ⊂ L$                               –
Proposition 2    {a^(2^n) | n ∈ N} ∈ L$                –
Proposition 3    L$ ⊆ NP, L$ ⊆ NSPACE(n log n)         –
Lemma 11         RE = Ĥ(L$)                            Theorem 2, Lemma 10
Proposition 4    decidability issues                   Theorem 2, Lemma 10, Proposition 3

Further work in progress suggests a large array of closure properties, and the inclusion of several well studied supersets of the context free languages in L$. The comparison with Petri net languages under various acceptance conditions has also been an object of our attention, since Petri nets are an important subclass of graph grammars.

Several questions remain open. First of all, we do not know of a criterion, like a pumping lemma, to place a language outside of L$. Also, answers to the following questions are still missing:

L ∈ L$ ⇒ (Σ* − L) ∈ L$?   Is L$ closed under complementation?
L$ ⊆ CS?                  Are all derivation languages context sensitive?
CS ⊆ L$?                  Are all context sensitive languages derivation languages?

The second question would be answered in the affirmative if the space requirement for recognition from Proposition 3 could be reduced from O(n log n) to O(n). The statement of the third question should not be expected to hold, since then there would be a graph grammar whose derivation language is the PSPACE-complete language of true quantified Boolean formulae [HU79], which is highly unlikely in the light of Proposition 3. If the language Lp ∈ CS of all words from {a}* of prime length could be shown not to be a derivation language, the first and third questions would have a negative answer, since its complement {a}* − Lp is indeed a derivation language (although we have not shown it here).

Acknowledgements The author gratefully acknowledges the anonymous reviewers, whose comments helped greatly towards improving this paper. Many thanks to Annegret Habel for providing valuable criticism at several readings.

Bibliography

[Ber79] J. Berstel. Transductions and Context-Free Languages. Teubner, 1979.
[Cor95] A. Corradini. Concurrent Computing: from Petri Nets to Graph Grammars. Volume 2, pp. 56–70. Elsevier, 1995.
[EEPT06] H. Ehrig, K. Ehrig, U. Prange, G. Taentzer. Fundamentals of Algebraic Graph Transformation. Springer, 2006.
[Ehr79] H. Ehrig. Introduction to the Algebraic Theory of Graph Grammars (a Survey). LNCS 73, pp. 1–69. Springer, 1979.
[HU79] J. E. Hopcroft, J. D. Ullman. Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, 1979.
[Hö75] M. Höpner. Eine Charakterisierung der Szilardsprachen und ihre Verwendung als Steuersprachen. In GI – 4. Jahrestagung. LNCS 26, pp. 113–121. Springer, 1975.
[Jan87] M. Jantzen. Language Theory of Petri Nets. LNCS 254, pp. 397–412. Springer, 1987.
[KKK06] H.-J. Kreowski, R. Klempien-Hinrichs, S. Kuske. Some Essentials of Graph Transformation. In Recent Advances in Formal Languages and Applications. Studies in Computational Intelligence 25, pp. 229–254. Springer, 2006.
[Kud04] M. Kudlek. Context-Free Languages. In Formal Languages and Applications. Volume 148, pp. 97–116. Springer, 2004.
[Mäk98] E. Mäkinen. A Bibliography on Szilard Languages. Bulletin of the EATCS 65:143–148, 1998.
[MS97] A. Mateescu, A. Salomaa. Aspects of Classical Language Theory. In Rozenberg and Salomaa (eds.), Handbook of Formal Languages, Vol. 1: Word, Language, Grammar. Chapter 7. Springer, 1997.
[Okh12] A. Okhotin. Non-erasing Variants of the Chomsky–Schützenberger Theorem. In Developments in Language Theory. LNCS 7410, pp. 121–129. Springer, 2012.
[Plu98] D. Plump. Termination of Graph Rewriting is Undecidable. Volume 33(2), pp. 201–209. IOS Press, 1998.
[Roz97] G. Rozenberg (ed.). Handbook of Graph Grammars and Computing by Graph Transformation, Volume 1: Foundations. World Scientific, 1997.