Concatenation and other Closure Properties of Recognizable Languages in Adhesive Categories Electronic Communications of the EASST Volume 58 (2013) Proceedings of the 12th International Workshop on Graph Transformation and Visual Modeling Techniques (GTVMT 2013) Concatenation and other Closure Properties of Recognizable Languages in Adhesive Categories H.J. Sander Bruggink, Barbara König, Sebastian Küpper 14 pages Guest Editors: Matthias Tichy, Leila Ribeiro Managing Editors: Tiziana Margaria, Julia Padberg, Gabriele Taentzer ECEASST Home Page: http://www.easst.org/eceasst/ ISSN 1863-2122 http://www.easst.org/eceasst/ ECEASST Concatenation and other Closure Properties of Recognizable Languages in Adhesive Categories∗ H.J. Sander Bruggink1, Barbara König2, Sebastian Küpper3 1 sander.bruggink@uni-due.de 2 barbara koenig@uni-due.de 3 sebastian.kuepper@uni-due.de Fakultät für Ingenieurwissenschaften Abteilung für Informatik und Angewandte Kognitionswissenschaften Universität Duisburg-Essen, Germany Abstract: We consider recognizable languages of cospans in adhesive categories, of which recognizable graph languages are a special case. We show that such languages are closed under concatenation, i.e. under cospan composition, by providing a con- crete construction that creates a concatenation automaton from two given automata. The construction is considerably more complex than the corresponding construction for finite automata. We conclude by showing negative closure properties for Kleene star and substitution. Keywords: closure properties, recognizable languages, recognizable graph lan- guages, concatenation, adhesive categories 1 Introduction Regular languages for words and trees are an indispensable concept in many areas of computer science: they are used for instance for parsers, text editors or verification tools. Hence a natural question to ask is whether their counterpart in the world of graphs, recognizable graph languages à la Courcelle [Cou94], can serve the same purpose. In order to use such languages in practice it is necessary to prove closure properties and to give effective procedures in order to constructively realize such closure properties. Here we focus on the closure of recognizable graph languages under concatenation. More concretely, we use an automaton model introduced by us in [BK08, BBEK12], which accepts exactly the recognizable graph languages of Courcelle (for an exact comparison see [BK08]). Such automata accept graphs equipped with a fixed inner and outer interface (categorically: cospans of graphs). Given two languages LX,K , LK,Y of cospans, where the cospans are equipped with interface X,K, respectively K,Y , by concatenating every cospan of the first language with every cospan of the second language, one obtains a language LX,K ; LK,Y of cospans equipped with interfaces X,Y . For finite automata on words concatenation can be implemented by a straightforward construc- tion [HMU06]), however for graph languages the situation is considerably more complex, due to the fact that graphs are not freely generated, as opposed to words which are the free monoid ∗ This work was supported by the DFG-project GaReV. 1 / 14 Volume 58 (2013) mailto:sander.bruggink@uni-due.de mailto:barbara_koenig@uni-due.de mailto:sebastian.kuepper@uni-due.de Closure Properties of Recognizable Languages over a given alphabet. In other words: there is no canonical way to decompose a graph into atomic components (or cospans) and hence several decompositions generate the same graph. For graph automata this requires the condition that a graph is accepted independently of its specific decomposition and that two different decompositions of a cospan yield the same transitions in the automaton. We can assume that two automata AX,K,AK,Y satisfying this requirement and accepting the languages LX,K , LK,Y are given. Now the challenge is to construct an automaton for the con- catenation language that also adheres to this constraint. Intuitively the problem is the following: when accepting a cospan of the concatenation language, one might read it in such a way that the automaton already sees parts of the second cospan before completely processing the first. This is visualized below where cospans are represented by “tube-like” structures. Instead of reading first the cospan from X to K and then the cospan from K to Y , it is possible to start with a cospan from X to some new interface J that overlaps with K and already reaches into the second cospan, without fully containing the first. X YK J This effect has to be taken into account in the construction of the new automaton. Especially we have to record states of both automata (since we can be in both automata at the same time) and we have to record the current union of the interface graphs K and J (this union will be denoted by U ) and the parts of U that are the current interfaces for the two automata. This construction will not only be given for graph automata, but more generally for languages of cospans of (monic) arrows in adhesive categories [LS05]. The properties of adhesive categories enable us to work in distributive subobject lattices, which are the key to the construction and the proof of its correctness. This greater generality allows us to prove the result for several classes of graph-like structures (directed graphs, hypergraphs, attributed graphs, etc.). Concatenation for recognizable graph languages was already studied by Courcelle in [Cou94], but for a different setting where graphs have only one interface and are glued over their joint interface. Furthermore Courcelle’s results applies only to concrete graphs, whereas ours is shown in a general categorical setting. We conclude the paper by giving counterexamples for closure properties that fail to hold for graph languages, while being true for word languages: closure under Kleene star and closure under substitution. 2 Recognizable Languages and Automata 2.1 Category Theory We presuppose basic knowledge of category theory. The identity arrow of an object G will be denoted by idG. If f and g are composable arrows, we write f ; g for the morphism f postcomposed Proc. GTVMT 2013 2 / 14 ECEASST with g, i.e. f ; g = g◦ f . Let C be a category in which all pushouts exist. A cospan c in C is a pair of arrows 〈cL,cR〉 with the same codomain: c : J cL−→ G cR←− K. J is the domain of c and K is the codomain of c. We often call J the inner interface of c and K the outer interface of c. Let c : J → F ← M and d : M → H ← K be cospans where the codomain of c equals the domain of d. Then, the composition of c and d, written as c ; d : J → M′ ← K is defined via the following commuting diagram where the middle diamond is a pushout J G M M′ H K cL cR f dL g dR (PO) Two cospans c : J cL−→ G cR←− K and d : J dL−→ H dR←− K are isomorphic, written c ∼ d, if there exists an isomorphism k from G to H, such that cL ; k = dL and cR ; k = dR. A semi-abstract cospan is a ∼-equivalence class of cospans. We will often identify a ∼-equivalence class with one of its representatives. The category Cospan(C) is defined as the category that has the objects of C as objects and semi-abstract cospans as arrows. The identity arrows are given as the equivalence class of identity cospans G idG−−→ G idG←−− G. 2.2 Adhesive Categories For this paper we want to investigate a specific type of categories, so called adhesive categories, defined as in [LS05]. Definition 2.1. A category C is called adhesive if • C has pullbacks • C has pushouts along monomorphisms • pushouts along monomorphims are Van Kampen-pushouts, i.e. whenever the lower square in the following picture is a pushout along a monomorphim and the front and the left square are pullbacks it holds that the top square is a pushout iff the right and back squares are pullbacks. A A′ D D′ C C′ B B′ 3 / 14 Volume 58 (2013) Closure Properties of Recognizable Languages Definition 2.2. An example of an adhesive category which we will use in examples is the category HGraph. For a set A we call A∗ the set of sequences of elements from A and for a function f : A → B we write f∗ : A∗ → B∗ for the function that acts as f on each element of a sequence. HGraph has hypergraphs as objects and graph morphisms as arrows. Let Λ be a fixed set of labels with an arity function ar : Λ →N0. A hypergraph is a four-tuple G = (VG,EG,attG,labG) where VG is a set (the set of vertices), EG is a set (the set of edges), attG : EG → V∗G and labG : EG → Λ are functions, where for each edge e ∈ EG it holds that ar(labG(e)) = |attG(e)|. We only consider finite graphs where the set of vertices and the set of edges is finite. A graph morphism is a pair of functions ( fV , fE): G → H, fV : VG →VH , fE : EG → EH where labH ◦ fE = labG and attH ◦ fE = f∗V ◦attG. We say a graph D is discrete if ED = /0. We are particularly interested in subobject lattices in adhesive categories, defined as in [LS05, BBC+11]. Definition 2.3 (Subobject). Let U be an object in C. Two monomorphisms a : A � U and b : B � U are called isomorphic if there is an isomorphism ψ : A → B such that ψ ; b = a. A subobject of U is an isomorphism class of monomorphisms into U . It is denoted by [a : A � U] or [a] where a : A � U is any representative. For a fixed object U we consider the category Sub(U) that has subobjects of U as objects and that has an arrow between two objects [a],[b] if there exists an arrow c in C such that c ; b = a. In this case we write [a]⊆ [b]. An important result regarding subobject lattices due to [LS05] is that each Sub(U) with partial order ⊆ forms a distributive lattice, the so-called subobject lattice of U . The meet of two subobjects [a], [b] is realized by taking their pullback in C (see diagram on the left below) and is denoted by [a]∩[b] = [a∩b]. A join [a]∪[b] = [a∪b] is obtained by taking the pushout over their meet (see diagram on the right below). A∩B A B U a b a∩b A∩B A B A∪B U a b a∪b Distributivity means that ([a]∪ [b])∩ [c] = ([a]∩ [c])∪([b]∩ [c]) and ([a]∩ [b])∪ [c] = ([a]∪ [c])∩([b]∪[c]) for subobjects [a],[b],[c]. In order to structure our results more clearly, we define: Definition 2.4. Let d : A → B ← C be a cospan consisting of monos and U be an object with monos a : A →U , b : B →U , c : C →U , such that a,b,c commute with the two legs of the cospan. That is, [a],[b],[c] are subobjects of U and [a]⊆ [b], [c]⊆ [b]. In the following we write [d̂]U = ([a],[b],[c]) in order to represent a cospan as a triple of subobjects. When U is clear from the context, we omit it and just write [d̂] = ([a],[b],[c]). Union and intersection of such triples are defined pairwise. Proc. GTVMT 2013 4 / 14 ECEASST 2.3 Recognizability Courcelle [Cou90, Cou94] introduced the notion of recognizability of graph languages as an analogon to regularity of word languages. This notion can be extended to arbitrary categories C that are locally small, i.e. for every two objects the class of arrows between them is a set. Nondeterministic finite automata are commonly used to investigate properties of regular languages. An analogous notion for recognizable sets of arrows in a category is given by an automaton functor [BK08]. The intuition behind such automaton functors is the following: a transition function δ : Z×Σ → P(Z) (where Z is the set of states and Σ is an alphabet) for non-deterministic word automata can be extended to a transition function δ̂ : Z ×Σ∗ →P(Z) on words. With some currying and rearranging we can view δ̂ as a function that maps a word from Σ∗ to a relation on Z and which furthermore satisfies δ̂ (ε) = idZ and δ̂ (w1w2) = δ̂ (w1)δ̂ (w2) (functoriality conditions). Similarly we define an automaton as a mapping from cospans of graphs to relations on states, such that the functoriality conditions above hold. Alternatively one could define automata only on so-called atomic cospans as in [BBEK12], but – different from word languages – it is necessary to require that a cospan generates the same relation, independently of its decomposition, which is already implicit in the functoriality conditions. Definition 2.5 (Automaton). Let C be any category and let Rel be the category of sets and relations. Furthermore let J,K be objects of C. A (J,K)-automaton is a tuple A= (A,I,F), where A is a functor A : C→Rel that maps every object X of C to a finite set A(X) (called the set of states of X ) and every arrow f : X →Y to a relation A( f ) ⊆ A(X)×A(Y ). Furthermore there is a set of initial states I ⊆ A(J) and a set of final states F ⊆ A(K). The functor A is called automaton functor; we often identify A with its functor A. An automaton A is deterministic whenever every relation A( f ) is a function and I contains exactly one element. The language L(A) of A (which contains arrows from J to K) is defined as follows: f : J → K is contained in L(A) if and only if there exist s ∈ I, t ∈ F which are related by A( f ), i.e., s A( f ) t. A language LJ,K of arrows from J to K is recognizable in C if it is the language of a (J,K)- automaton. Many properties of regular languages can be shown for recognizable languages in a straightfor- ward way; see [BK08] for proofs. Proposition 2.6. For every automaton, there exists an equivalent deterministic automaton. Proof. (Sketch.) The construction is more or less equivalent to the case of finite automata: we replace every set of states by its powerset. Proposition 2.7. Suppose we have two recognizable languages of arrows, L1J,K and L 2 J,K . Then also L1J,K ∩L 2 J,K , L 1 J,K ∪L 2 J,K and (L 1 J,K) C (the complement of L1J,K ) are recognizable. 5 / 14 Volume 58 (2013) Closure Properties of Recognizable Languages Proof. (Sketch.) Again the construction resembles the case of finite automata: For union and intersection, we take the cross product of two deterministic automata and define the final states accordingly. For the complement we exchange final and non-final states of a deterministic automaton. 3 Closure under Concatenation From now on we consider as category, over which to define automaton functors, the category of cospans of monos of a given adhesive category. We want to prove that recognizable languages over adhesive categories are closed under concatenation. Let LX,K be a language of cospans from X to K and LK,Y a language of cospans from K to Y . Then LX,K ; LK,Y ={c1 ; c2 | c1 ∈ LX,K,c2 ∈ LK,Y}, consists of all possible compositions of cospans of the two languages. For regular (word) languages this closure property is well known and given two (non deter- ministic finite) automata A1 and A2 it is easy to construct an automaton that accepts exactly L(A1) ; L(A2) by first simulating A1 for the first part of a given word, non-deterministically switching to A2 and then simulating A2 for the rest of the word. For recognizable languages on adhesive categories we cannot retain this easy construction because an automaton functor has to behave identically on different decompositions of the same cospan, as we will show in the following example. Example 3.1. We concatenate two rather similar graph automata over the label set Λ ={a,b}. Automaton 1 accepts a cospan J → G ← K with discrete interfaces of size one of the category HGraph. The graph G must contain an even number of a-labelled edges, no b-edges and one vertex. Automaton 2 accepts a cospan J → G ← K with discrete interfaces of size one of the category HGraph. The graph G must contain an even number of b-labelled edges, no a-edges and one vertex. We first specify both automata. As they do essentially the same, we will only specify A1. We will use the notation #a(G) for the number of a-labelled edges in a graph G. The first automaton is a 3-tuple A1 = (A1,I1,F1) with A1(G) = { {0,1} if |VG|= 1 and |EG|= 0 /0 otherwise A1(J → G ← K) =   {(0,0),(1,1)} if #a(G)≡ 0 (mod 2), #b(G) = 0 and |VG|= 1 {(0,1),(1,0)} if #a(G)≡ 1 (mod 2), #b(G) = 0 and |VG|= 1 /0 otherwise for each cospan of hypergraphs with discrete interfaces of size one. For cospans J → G ← K that do not have discrete interfaces of size 1, A1(J → G ← K) = /0. Furthermore, I1 ={0} and F1 ={0}. The second automaton A2 = (A2,I2,F2) is constructed analogously. To show why we have to use a more complex construction than in the case of regular word languages, consider the following cospan: Proc. GTVMT 2013 6 / 14 ECEASST c = a a b b This cospan is in the concatenation language because it can be decomposed into a a ; b b and the right cospan is clearly accepted by A1, while the right cospan is accepted by A2. However, it is not possible in general to run the two automata one after the other. The same cospan c can also be decomposed in a b ; a b . Because of the functoriality condition, the concatenation automaton must also accept the cospan in this case, although none of the two constituents cospans are accepted by A1 or A2. In general, A1 and A2 have to be run in parallel. We will sketch a constructive proof that recognizable languages on adhesive categories are closed under concatenation, some technical details are omitted, though. We will start by defining a concatenation automaton. Definition 3.2 (Concatenation automaton). Let A1 = (A1,I1,F1) and A2 = (A2,I2,F2) be au- tomata over cospans of monos of an adhesive category C where A1 is an (X,K)-automaton and A2 is a (K,Y )-automaton. We define the concatenation automaton A= (A,I,F) of A1 and A2 as follows. • A assigns to each object J a set of states A(J), where each state consists of four monos as shown below, and two states z1 ∈A1(U1),z2 ∈A2(U2).1 K U J U1〈z1〉 U2〈z2〉 ϕK s1 s2 ϕJ In addition we require that [s1]∪[s2] = [idU ] as well as [ϕJ]∪[ϕK] = [idU ] and that [ϕJ]∩ [ϕK] = [s1]∩[s2]. We want to simulate A1 and A2, so in such a state, U represents the union of the current interface J of A and the final interface K of A1. Furthermore U1 represents the current interface of A1 and U2 represents the current interface of A2.2 1 We will indicate the two states within the categorical diagrams in angular brackets. 2 In order to obtain a finite set of states we do not take all monos into U , but only one representative of every subobject equivalence class. 7 / 14 Volume 58 (2013) Closure Properties of Recognizable Languages • A state z as above is initial (z ∈ I) if z1 ∈ I1, z2 ∈ I2, s1 = ϕJ and s2 = ϕK . • Analogously, a state z as above is final (z ∈ F ) if z1 ∈ F1, z2 ∈ F2, s1 = ϕK and s2 = ϕJ . • On cospans the functor A is defined as follows: let c : J � J � J′ be a cospan consisting of two monos. Two states (s1,s2,ϕJ,ϕK,z1,z2), (s′1,s ′ 2,ϕ ′ J,ϕ ′ K,z ′ 1,z ′ 2) are related by A(c) whenever there are cospans d1,d2,d (consisting of monos) and additional monos such that the diagram below commutes U K J U1〈z1〉 U2〈z2〉 U K J U1 U2 U′ K J′ U′1〈z ′ 1〉 U′2〈z ′ 2〉 υ1 υ′1 s1 u1 s′1 idK idK ϕK k ϕ′K u u′ ϕJ j ϕ′J ι ι ′ s2 u2 s′2 υ2 υ′2 idK : d1: d: d2: c: and the following five conditions are satisfied. 1. [d̂1]∪[d̂2] = [d] 2. [d̂1]∩[d̂2] = [ĉ]∩[îdK] 3. [ĉ]∪[îdK] = [d] 4. It holds that z1A1(d1)z′1 and z2A2(d2)z ′ 2. 5. [u1]∩[k] = [u′1]∩[k] and [u2]∩[k] = [u2]∩[k] where u2 = υ2 ; u2 and u ′ 1 = υ ′ 1 ; u1 The meaning of the five objects in the diagram above describing a state can be intuitively explained with the diagram depicted in the introduction (Section 1): the objects K, J from the definition above are indicated in that picture and U would be the union of K and J. If we intersect U with the cospan from X to K we obtain U1 (fully contained in the first cospan) and if we intersect U with the cospan from K to Y we obtain U2 (fully contained in the second cospan). Conditions 1–3 above use the notation of Definition 2.4 and extend the conditions a state has to satisfy to transitions. The fourth condition means that d1 is a cospan that allows a transition from z1 to z′1 in A1 and that d2 is a cospan that allows a transition from z2 to z ′ 2 in A2. Therefore, this condition ensures that the concatenation automaton behaves like A1 on U1 and like A2 on U2. The last condition is in place to guarantee that U1 can only grow regarding K whereas U2 can only diminish regarding K. Intuitively, we do not want A1 to forget a part of K it has already read and we do not want A2 to read a part of K anew that it has already forgotten. To show that the concatenation automaton is indeed an automaton functor, we have to prove two lemmas. For the remainder of this section we assume that C is an adhesive category and that each cospan is a cospan consisting of monos in this category. Proc. GTVMT 2013 8 / 14 ECEASST Lemma 3.3. Let A= (A,I,F) be a concatenation automaton over C and c be a cospan that can be decomposed as c = c1 ; c2. 1. If there are states z,z,z′ such that z A(c1) z and z A(c2) z′ then a transition from z to z′ via c must exist as well, that is z A(c) z′. 2. If z A(c) z′ for states z,z′, then there is a state z such that z A(c1) z and z A(c2) z′, that is, one can reach every state that is reachable via c by first making a c1-step and then making a c2-step. The proof to this lemma is rather technical, although it mainly uses computations of unions and intersections, and lengthy, so it is omitted here. To allow working just on the subobject-lattice of U when Lemma 3.2, we used the following supplemental result. Lemma 3.4. Let a : A →U , b : B →U , c : C →U , d : D →U a′ : A →U 1, b′ : B →U 1, c′ : C → U 1, d′ : D →U 1 and u : U 1 →U be monomorphisms in an adhesive category. Furthermore, let a = a′ ; u, b = b′ ; u, c = c′ ; u and d = d′ ; u. Then (i) [a]∩[b] = [c]∩[d] iff [a′]∩[b′] = [c′]∩[d′] (ii) [a]∪[b] = [c]∪[d] iff [a′]∪[b′] = [c′]∪[d′]. Corollary 3.5. Concatenation automata preserve cospan composition. The “functors” of concatenation automata do not neccessarily preserve identities, though a simple factorization of the state space over the images of identities yields a proper functor. Next we want to show that concatenation automata accept exactly the language we expect them to accept. Lemma 3.6. Let A1 be an (X,K)-automaton, let A2 be a (K,Y )-automaton, let A be the con- catenation automaton of A1 and A2 and let c be a cospan from X to Y . 1. Whenever c ∈ L(A1) ; L(A2) then c ∈ L(A). 2. Whenever c ∈ L(A) then c ∈ L(A1) ; L(A2). Proof. Part 1: Let c ∈ L(A1); L(A2), then there must be a c1 ∈ L(A1) and a c2 ∈ L(A2) such that c = c1 ; c2. We will now give a transition for c1 in A that starts in an initial state, then a composable transition for c2 in A that ends in a final state. As we have already shown that A is a graph automaton, we have shown that c is accepted by A. The cospan c1 can be written as c1 = U1 → U ← K, hence we can construct the transition depicted in the following diagram, K K KidK : U U Kd: d1: d2: U1 U Kc1: U1〈zs1〉 U K〈ze1〉 K〈zs2〉 K K〈zs2〉 9 / 14 Volume 58 (2013) Closure Properties of Recognizable Languages where zs1 is the initial and ze1 the final state of the c1-transition in A1, zs2 is the initial and ze2 the final state of the c2-transition in A2. Apart from U , all objects and corresponding arrows are already known, U and its arrows are defined by first taking the pullback of U1 →U and K →U and then taking the pushout of the resulting arrows, so it is the join of U1 →U and K →U . As before, for each object A the equivalence class of arrows going from A to U is denoted by [a]. We now have to show that the five properties of state transitions in A hold. 1. [u] = [u]∪ [k], [k]∪ [k] = [k] are obvious, [u] is defined as the join of [u1] and [k], so [u] = [u1]∪[k] holds per definition. 2. [k]∩[k] = [k]∩[k], [u]∩[k] = [u]∩[k] and [k]∩[u1] = [k]∩[u1] clearly hold. 3. Once again, [u] = [k]∪[u1] holds per definition and [u]∪[k] = [u] as well as [k]∪[k] = [k] are obvious. 4. d1 was assumed to be a cospan with zs1 A1 (c1)ze1 . As d2 is just the identity and does not change the state, it has to be a cospan with zs1 A2(d2)zs1 . 5. [u]∩[k] = [k]∩[k] and [k]∩[k] = [k]∩[k] hold obviously. A transition for c2 in A can be found analogously: K K KidK : K U U′d: d1: d2: K U J′c2: K〈ze1〉 K K〈ze1〉 K〈zs2〉 U J ′〈ze2〉 Apart from U′, all objects and corresponding arrows are known, U′ can be obtained as the join of J′ and K in the same way as U was constructed for the c1-transition. The five properties are easily proven in the same way as for the c1-transition. Part 2: Let c ∈ L(A) and an accepting c-transition in A be given as follows: K K KidK : U U U′d: d1: d2: J J J′c: J〈z1〉 U1 K〈z ′ 1〉 K〈z2〉 U2 J ′〈z′2〉 We will show that c = d1 ; d2. As c is in L(A), d1 and d2 must be in L(A1) respectively L(A2), by the definition of A, hence this is sufficient to show that c ∈ L(A1); L(A2). We first observe that [ j] = [u], because from d2 it follows that [k] ⊆ [u2] and from d1 it follows that [k] ⊆ [u1] and as [u1]∩[u2] = [ j]∩[k] also [k]⊆ [ j] holds. Proc. GTVMT 2013 10 / 14 ECEASST We will now continue to show that d1 and d2 are in fact composable over U , which means that the pushout can be obtained as the union of U1 and U2. The outer interface of d1 equals the inner interface of d2 and as we have just seen, [k]⊆ [u1]∩[u2] holds. Moreover, [u1]∩[u2] = [ j]∩[k]⊆ [k], so [u1]∩[u2] = [k] and therefore the cospans are indeed composable in this sense. To conclude, we have to show that c and d1 ; d2 are equal. From the conditions a state transition in A has to fulfill, we know that [u1]∪[u2] = [u] and thus [u1]∪[u2] = [ j], hence c = d1 ; d2. And thus we conclude: Corollary 3.7. A concatenation automaton accepts exactly the concatenation of the languages of its constituent automata. So we have shown: Theorem 3.8. Recognizable languages in adhesive categories are closed under concatenation and an automaton accepting the concatenation of two recognizable languages can be computed from their respective automata via the concatenation automaton. An important adhesive category is the category HGraph so we have particularily shown that the concatenation of recognizable graph languages is recognizable. A similar result for recognizable graph languages has been shown in [Cou94], for languages of graphs with a single interface. Note that here we also generalized this result to the setting of adhesive categories. Example 3.9. We give a short example of our construction. For this, we will reuse the automata from example 3.1. We want to construct the concatenation automaton A = (A,I,F) of A1 = (A1,I1,F1) and A2 = (A2,I2,F2). In the following we will use the names c,d,J,U,... as in Definition 3.2. We start by defining the states of the automaton: K = D1, the discrete graph with one node, because the right interface of a cospan accepted by A1 always has one vertex. The equality J = U = U1 = U2 has to hold, as we are working with discrete interfaces and neither A1 nor A2 accept a graph that has more or less than one vertex. Furthermore we set ϕJ = ϕK = s1 = s2 = idU . For each combination of z1 ∈{0,1} and z2 ∈{0,1} this forms a state. A state is initial whenever z0 = z1 = 0 and a state is final whenever z0 = z1 = 0. So now we can define the state transitions of A. For a given cospan c we have to define z1,z′1,z2,z ′ 2 as well as d1,d,d2, idK is already uniquely defined. Let c = d and c = c1 ; c2 then d1 = c1, d2 = c2, z1 A1(c1) z′1, z2 A2(c2) z ′ 2 yields a transtion in A and all transitions in A arise in this way. Looking at the states, one can see that the concatenation automaton reads a given cospan in such a way that the a-edges are counted separately in z1 and the b-edges are counted in z2 (both modulo 2). The resulting automaton accepts exactly the cospans of hypergraphs that have discrete interfaces of size one, exactly one vertex and an even number of a-edges, as well as an even number of b-edges. 4 Negative Results for Closure Properties As shown above recognizable languages over adhesive categories are closed under concatenation, just like regular languages. We already know that recognizable languages are closed under 11 / 14 Volume 58 (2013) Closure Properties of Recognizable Languages complementation, union and intersection. Unfortunately, there are some closure properties that regular languages enjoy but which do not hold for recognizable languages. To show two negative results, we will work in the category of cospans over HGraph. We start by showing that recognizable languages are not closed under Kleene star and we will start by defining the notion of Kleene star in our setting. Definition 4.1. Let L be a (K,K)-graph language. Then the Kleene-star of L, written L∗, is the (K,K)-graph language defined as: L∗ = { σ | there are σ1,...,σn such that σ = σ1 ; ··· ; σn and σi ∈ L for all 1 ≤ i ≤ n } . Hence in our setting we call the language of all cospans of graphs that can be decomposed into cospans that are in a language L, the language L∗. We will now show that recognizable graph languages are not closed under Kleene star. Theorem 4.2. Recognizable graph languages are not closed under Kleene-star. Proof. Let L be the graph language that contains only the following cospan: a b Since this language consists of only one graph it is recognizable. However, L∗ consists of all cospans of graphs with one vertex (which is both in the inner and outer interface), any number of a-edges and the same amount of b-edges. But we cannot find a graph automaton over the set of atomic cospans that accepts L∗, because the set of reachable states has to be the same for any decomposition of a given cospan in L∗. Thus, it is possible to decompose an accepted cospan in a way that first reads all a-edges and then all b-edges. Therefore there have to be infinitely many equivalence classes for an interface of one vertex. We will now see that recognizable graph languages are not closed under edge substitution (i.e., simultaneous substitution of all edges with a certain label, as in hyperedge replacement [Hab92]) either. Definition 4.3. Let L be a set of graphs over the label alphabet Λ, a ∈ Λ a label for edges of arity k and G be a fixed graph where a sequence s of k vertices are marked as merge-vertices. The set L′ of all graphs that are generated by substituting all a-labelled edges by G is the set of graphs that includes all graphs that can be generated from any graph G′ ∈ L by repeating the following steps until there is no more original a-edge in G′. • Let e be an a-edge in G′. Delete e from G′ and remember the sequence of vertices attG′(e). • Insert a copy of G into G′. Identify the merge-vertices of G with attG′(e), the vertices previously incident to the a-edge. Theorem 4.4. Recognizable graph languages are not closed under substitution. Proc. GTVMT 2013 12 / 14 ECEASST Proof. Again we will show this by providing a counter-example. First, let L be the language of all graphs that have exactly one vertex v and an arbitrary number of a-edges of arity 1. As a substitution graph we will use the graph G that consists of one vertex (that also is the merge- vertex), one b-edge and one c-edge, both of arity 1. This substitution rule applied to L generates the language of all graphs with one vertex, any number of b-edges and the same amount of c-edges. As before we see that this language is not recognizable, whereas L ist recognizable. 5 Conclusion We gave an explicit construction for an automaton functor which recognizes the concatenation of the languages of two given automaton functors. More precisely, given an automaton functor AX,K accepting a language LX,K of cospans with domain X and codomain K and an automaton functor AK,Y accepting a language LK,X of cospans with domain K and codomain Y , we constructed an automaton functor A which accepts the concatenation of LX,K and LK,Y . The construction is non-trivial because we have to make sure that the constructed automaton functor is a functor, that is, if a cospan c can be decomposed into c = c1 ; c2, it must hold that A(c) =A(c1) ; A(c2). By giving this construction, we have shown that recognizable languages (of arrows in an adhesive category) are closed under concatenation. One application of the construction could be the calculation of strongest post-conditions. Let a property be given as an automaton functor A, and let p = (`,r) be a graph rewriting rule in the format of reactive systems [LM00], that is, r and ` are cospans with the same domain and codomain, and the reduction relation is generated by ` ; c ⇒ r ; c for all context cospans c. We construct an automaton A′ which accepts c if and only if A accepts ` ; c by changing the initial states of A to the states which are reachable by ` from an original initial state (see also [BBEK12]). Then we have two recognizable languages, {r} and L(A′) and we obtain the strongest post-condition by constructing the composition automaton. It is future research to make this idea more concrete. Note that in this application one of the recognizable languages is quite simple, consisting of only a single cospan. Another future plan is to find a more efficient construction for such simple concatenations. In the final part of the paper we showed, by giving counter-examples, that recognizable languages are not closed under (a suitable notion of) Kleene star and substitution. Especially the second negative result is unfortunate, as a similar property for regular word languages played a key role in the match-bound technique for showing that a string rewrite system terminates [GHW03]. This makes it hard to transfer the technique to graph transformation systems. Additional closure properties were considered in [Küp12], which forms the basis of this paper. Especially [Küp12] contains additional negative results and the proof that recognizable graph languages are closed under application of inverse context-free rules. Furthermore the construction of the concatenation automaton, given in this paper, is broken down to atomic cospans, a necessary requirement for the concrete construction of such automata, which we plan to implement in the tool RAVEN [BBEK12]. 13 / 14 Volume 58 (2013) Closure Properties of Recognizable Languages Bibliography [BBC+11] Paolo Baldan, Filippo Bonchi, Andrea Corradini, Tobias Heindel, and Barbara König. A lattice-theoretical perspective on adhesive categories. Journal of Symbolic Compu- tation, 46:222–245, 2011. [BBEK12] Christoph Blume, H.J. Sander Bruggink, Dominik Engelke, and Barbara König. Efficient symbolic implementation of graph automata with applications to invariant checking. In Proc. of ICGT ’12 (International Conference on Graph Transformation), pages 264–278. Springer, 2012. LNCS 7562. [BK08] H.J. Sander Bruggink and Barbara König. On the recognizability of arrow and graph languages. In Proc. of ICGT ’08 (International Conference on Graph Transformation), pages 336–350. Springer, 2008. LNCS 5214. [Cou90] Bruno Courcelle. The monadic second-order logic of graphs I. Recognizable sets of finite graphs. Information and Computation, 85:12–75, 1990. [Cou94] Bruno Courcelle. Recognizable sets of graphs: Equivalent definitions and closure properties. Mathematical Structures in Computer Science, 4(1):1–32, 1994. [GHW03] Alfons Geser, Dieter Hofbauer, and Johannes Waldmann. Match-bounded string rewriting systems. In Branislav Rovan and Peter Vojtas, editors, Mathematical Foundations of Computer Science 2003, volume 2747 of Lecture Notes in Computer Science, pages 449–459. Springer Berlin / Heidelberg, 2003. [Hab92] Annegret Habel. Hyperedge Replacement: Grammars and Languages. Lecture Notes in Computer Science, 643, 1992. [HMU06] John E. Hopcroft, Rajeev Motwani, and Jeffrey D. Ullman. Introduction to Automata Theory, Languages, and Computation. Prentice Hall, 2006. [Küp12] Sebastian Küpper. Abschlusseigenschaften für Graphsprachen mit Anwendungen auf Terminierungsanalyse. Master’s thesis, Universität Duisburg-Essen, August 2012. [LM00] James J. Leifer and Robin Milner. Deriving bisimulation congruences for reactive systems. In Proc. of CONCUR 2000, pages 243–258. Springer, 2000. LNCS 1877. [LS05] Stephen Lack and Paweł Sobociński. Adhesive and quasiadhesive categories. ITA, 39(3):511–545, 2005. Proc. GTVMT 2013 14 / 14 Introduction Recognizable Languages and Automata Category Theory Adhesive Categories Recognizability Closure under Concatenation Negative Results for Closure Properties Conclusion