Int. J. of Computers, Communications & Control, ISSN 1841-9836, E-ISSN 1841-9844 Vol. VI (2011), No. 2 (June), pp. 266-277

On the Power of Small Size Insertion P Systems

Alexander Krassovitskiy
University Rovira i Virgili
Research Group on Mathematical Linguistics
Av. Catalunya, 35, Tarragona 43002, Spain
E-mail: alexander.krassovitskiy@estudiants.urv.cat

Abstract: In this article we investigate insertion systems of small size in the framework of P systems. We consider P systems with insertion rules having one-symbol contexts and we show that they have the computational power of context-free matrix grammars. If contexts of length two are permitted, then any recursively enumerable language can be generated. In both cases a squeezing mechanism, an inverse morphism, and a weak coding are applied to the output of the corresponding P systems. We also show that if no membranes are used then the corresponding family is equal to the family of context-free languages.

Keywords: P systems, insertion-deletion systems, context-free languages, matrix grammars, computational completeness

1 Introduction

The study of insertion-deletion operations on strings has a long history; we just mention [2, 5, 11, 16, 20]. Insertion-deletion systems motivated by molecular computing have been studied in [1, 3, 10, 19, 21]. With some linguistic motivation they may be found in [9]. In general form, an insertion operation means adding a substring to a given string in a specified (left and right) context, while a deletion operation means removing a substring of a given string from a specified (left and right) context. A finite set of insertion/deletion rules, together with a set of axioms, provides a language generating device: starting from the set of initial strings and iterating the insertion-deletion operations as defined by the given rules, we get a language.
The number of axioms, the length of the inserted or deleted strings, as well as the length of the contexts where these operations take place, are natural descriptional complexity measures of insertion-deletion systems.

Inspired by the structure and the functioning of a living cell, especially by its local information processing, P systems were brought to light a few years ago [15]. This is a highly distributed computational model which combines local processing (in membranes) and communication (between them). It is natural to consider insertion and deletion operations in the framework of P systems, and this was first done by Gh. Păun in [17]. Such a combination permits to define programmed-like insertion-deletion systems, which, as expected, have increased computational power. Some combinations of parameters for pure insertion-deletion lead to systems which are not computationally complete [6, 12] or are even decidable [22], while in [7, 8] it was shown that the P systems framework can easily increase the computational power compared to ordinary insertion-deletion systems.

Traditionally, language generating devices having only insertion or only deletion rules were studied. Early computational models based only on insertion appear already in [9], and are discussed in [19] and [17] (with a membrane tree structure). It was proved that pure insertion systems having one-letter contexts are always context-free. Yet, there are insertion systems with two-letter contexts which generate non-semilinear languages (see Theorem 6.5 in [19]). On the other hand, it appears that by using only insertion operations the obtained language classes with contexts greater than one are incomparable with many known language classes. For example, there is a simple linear language {a^n b a^n | n ≥ 1} which cannot be generated by any insertion system (see Theorem 6.6 in [19]).
In order to overcome this obstacle one can use codings to “interpret” the generated strings. The computational power of insertion systems with morphisms and intersection with special languages was investigated in [14] and [18]. In [11] two additional mappings were used: a morphism h and a weak coding φ. The strings of a language are obtained by applying h^{-1} and then φ to the generated strings. More precisely, the squeezing mechanism selects only those output strings of the corresponding (P) systems on which h^{-1} and φ are defined. As expected, the language generating mechanisms gain expressivity, and the corresponding language family is larger. It appears that with the help of morphisms and codings one can obtain every RE language if insertion rules have sufficiently large contexts. It is proved in [11] that for every recursively enumerable language L there exist a morphism h, a weak coding φ and a language L′ generated by an insertion system whose rules use contexts of length at most 7, such that L = h(φ^{-1}(L′)). The result was improved in [13], showing that rules having at most 5-letter contexts are sufficient to encode every recursively enumerable language. Recently, in [4] it was shown that the same result can be obtained with contexts of length 3.

Our aim is to reduce the length of the contexts in insertion rules by regulating derivations in terms of membranes. Unlike the previous works, our article considers the encoding as a part of insertion P systems. The obtained model is quite powerful and has the power of matrix languages if contexts of length one are used. We also show that if no encoding is used, then the corresponding family is strictly included in the family of matrix languages, and that it is equal to the family of context-free languages if no membranes are used.
If an insertion of two symbols in two-letter contexts is used, then all recursively enumerable languages can be generated (using, of course, the inverse morphism, the weak coding, and the squeezing mechanism).

2 Prerequisites

Here we use standard formal language theoretic notions and notations. The reader can consult any of the many monographs in this area (see, e.g., [20] for the unexplained details). We denote by |w| the length of a word w and by card(A) the cardinality of the set A, while ε denotes the empty string. By CF and RE we denote the classes of context-free and recursively enumerable languages, respectively.

A language L is context-free if there exists a context-free grammar G such that L(G) = L. A context-free grammar is a construct G = (N, T, S, P), where N and T are disjoint alphabets, S ∈ N, and P is a finite set of context-free rules of the form A −→ v, where A ∈ N and v ∈ (N ∪ T)*. We say that a context-free grammar G = (N, T, S, P) is in Chomsky normal form if each rule in P has one of the forms A −→ α or A −→ BC, for A, B, C ∈ N, α ∈ T ∪ {ε}.

A language L is recursively enumerable if there exists a type 0 grammar G such that L(G) = L. A type 0 grammar is a construct G = (N, T, S, P), where N and T are disjoint alphabets, S ∈ N, and P is a finite set of rules of the form u −→ v, where u ∈ (N ∪ T)*N(N ∪ T)* and v ∈ (N ∪ T)*. We say that a type 0 grammar G = (N, T, S, P) is in Penttonen normal form if each rule in P can be written either as A −→ α or AB −→ AC, for A, B, C ∈ N, α ∈ (T ∪ N)*, |α| ≤ 2. It is well known that for every type 0 language there is a modified Penttonen normal form where each rule can be written as either A −→ α, A −→ AC, A −→ CA, AB −→ AC, or AB −→ CB, for A, B, C ∈ N, α ∈ T ∪ {ε}.

We also recall the following definition from [17].
A context-free matrix grammar (without appearance checking) is a construct G = (N, T, S, M), where N, T are disjoint alphabets (of nonterminals and terminals, respectively), S ∈ N (the axiom), and M is a finite set of matrices, that is, sequences of the form (A_1 −→ x_1, . . . , A_n −→ x_n), n ≥ 1, of context-free rules over N ∪ T. For a string w, a matrix m : (r_1, . . . , r_n) is executed by applying the productions r_1, . . . , r_n one after the other, following the order in which they appear in the matrix. Formally, we write w =⇒_m u if there are a matrix m : (A_1 −→ u_1, . . . , A_n −→ u_n) ∈ M and strings w_1, w_2, . . . , w_{n+1} ∈ (N ∪ T)* such that w = w_1, w_{n+1} = u, and for each i = 1, 2, . . . , n we have w_i = w′A_iw′′ and w_{i+1} = w′u_iw′′. If the matrix m is understood, then we write =⇒ instead of =⇒_m. As usual, the reflexive and transitive closure of this relation is denoted by =⇒*. Then, the language generated by G is L(G) = {w ∈ T* | S =⇒* w}. The family of languages generated by context-free matrix grammars is denoted by MAT^λ (the superscript indicates that erasing rules are allowed). It is a well-known fact that every language from MAT^λ can be generated by a grammar in a modified binary normal form (similar to the binary normal form, see, e.g., [17]), having each matrix m of the form m : (A −→ α, A′ −→ α′), for A, A′ ∈ N, α, α′ ∈ (N ∪ T)*, max(|α|, |α′|) ≤ 2.

An insertion system is a construct Γ = (V, A, I), where V is an alphabet, A is a finite language over V, and I is a finite set of triples of the form (u, α, v), where u, α, and v are strings from V*. The elements of A are called axioms, and the triples in I are insertion rules. An insertion rule (u, α, v) ∈ I indicates that the string α can be inserted in between u and v. Stated otherwise, (u, α, v) ∈ I corresponds to the rewriting rule uv −→ uαv. We denote by =⇒ the relation defined by an insertion rule (formally, x =⇒ y iff x = x_1uvx_2, y = x_1uαvx_2, for some (u, α, v) ∈ I and x_1, x_2 ∈ V*).
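As an illustration (not part of the original formalism), the one-step relation =⇒ just defined, and its iteration from the axioms, can be sketched in Python. The example system ({a, b}, {ab}, {(a, ab, b)}) is a hypothetical toy; iterating the single rule generates {a^n b^n | n ≥ 1}, bounded here by a maximal length to keep the computation finite.

```python
def insert_step(w, rules):
    """All strings reachable from w by one application of =⇒."""
    out = set()
    for (u, alpha, v) in rules:
        # try every position i where u ends at i and v starts at i
        for i in range(len(w) + 1):
            if w[max(0, i - len(u)):i] == u and w[i:i + len(v)] == v:
                out.add(w[:i] + alpha + w[i:])
    return out

def generate(axioms, rules, max_len):
    """Close the axioms under =⇒, keeping only words up to max_len."""
    lang, frontier = set(axioms), set(axioms)
    while frontier:
        nxt = set()
        for w in frontier:
            for y in insert_step(w, rules):
                if len(y) <= max_len and y not in lang:
                    lang.add(y)
                    nxt.add(y)
        frontier = nxt
    return lang
```

For instance, `generate({"ab"}, [("a", "ab", "b")], 8)` yields {"ab", "aabb", "aaabbb", "aaaabbbb"}.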
We denote by =⇒* the reflexive and transitive closure of =⇒ (as usual, =⇒+ is its transitive closure). The language generated by Γ is defined by L(Γ) = {w ∈ V* | x =⇒* w, x ∈ A}. We say that an insertion system (V, A, I) has weight (n, m, m′) if m = max{|u| | (u, α, v) ∈ I}, m′ = max{|v| | (u, α, v) ∈ I}, and n = max{|α| | (u, α, v) ∈ I}. We denote by INS_n^{m,m′} the corresponding families of languages generated by insertion systems.

An insertion P system is a construct Π = (V, µ, M_1, . . . , M_k, R_1, . . . , R_k), where

• V is a finite alphabet,

• µ is a (cell-like, i.e., hierarchical) membrane structure with k membranes. This structure will be represented by a word containing correctly nested marked parentheses, i.e., by a word from the Dyck language. The skin membrane is labeled with “1”.

• M_i, for each 1 ≤ i ≤ k, is a finite language associated to membrane i.

• R_i, for each 1 ≤ i ≤ k, is a set of insertion rules with target indicators associated to membrane i, which have the following form: (u, x, v; tar), where (u, x, v) is an insertion rule, and tar, called the target indicator, is from the set {here, in_j, out}, 1 ≤ j ≤ k.

A configuration of Π is a k-tuple (N_1, . . . , N_k) of finite languages over V. For two configurations (N_1, . . . , N_k) and (N′_1, . . . , N′_k) of Π we write (N_1, . . . , N_k) =⇒ (N′_1, . . . , N′_k) if we can pass from (N_1, . . . , N_k) to (N′_1, . . . , N′_k) by applying the insertion rules nondeterministically to all strings which can be rewritten in the corresponding regions, following the target indications associated with the rules. More specifically, w′ ∈ N′_j if either w′ ∈ N_j or there is a word w ∈ N_i with w =⇒_r w′, where r = (u, x, v; tar) ∈ R_i. Moreover, the membrane labeled by j is immediately outside the membrane labeled by i if tar = out; the membrane labeled by j is immediately below the membrane labeled by i if tar = in_j; and i = j if tar = here.
No other words are present in N′_j, 1 ≤ j ≤ k. We say that a word w′ is sent out of the system if there are a configuration (N_1, . . . , N_k) and a word w ∈ N_1 with w =⇒_r w′, where r = (u, x, v; out) ∈ R_1.

We use the definition of the language generated by Π according to [17]. This language is denoted by L(Π) and is defined as follows: we start from the initial configuration (M_1, . . . , M_k) of the system and proceed iteratively, by transition steps performed by applying the rules in parallel to all strings which can be rewritten. All strings over the alphabet V sent out of the system (i.e., sent from the skin membrane) during any step of any computation form the language L(Π).

Insertion tissue P systems are defined in an analogous manner. As tissue P systems use arbitrary graph structures, we write the target indicator in the form tar = go_j, j = 1, . . . , k. We remark that the result of a computation consists of all strings over V which are sent to one selected output cell.

The weight (n, m, m′) of the insertion rules and the membrane degree k describe the complexity of the system. We denote by LSP_k(ins_n^{m,m′}) (see [17]) the family of languages L(Π) generated by insertion P systems of degree at most k ≥ 1, having weight at most (n, m, m′). If some of the parameters n, m, m′, or k is not specified, we write “*” instead. We say that a language L′ is from MorINS_n^{m,m′} (from MorLSP_k(ins_n^{m,m′}), respectively) if there exist a morphism h, a weak coding φ and a language L ∈ INS_n^{m,m′} (L ∈ LSP_k(ins_n^{m,m′}), respectively) such that φ(h^{-1}(L)) = L′. For every language L ∈ MorLSP_k(ins_n^{m,m′}), L = φ(h^{-1}(L(Π))), we add h and φ to the system, and we write Π in the form Π = (V, µ, M_1, . . . , M_n, R_1, . . . , R_n, h, φ). Hereafter h and φ are naturally extended to strings: h(a_1a_2 . . . a_t) = h(a_1)h(a_2) . . . h(a_t), and φ(a_1a_2 . . . a_t) = φ(a_1)φ(a_2) . . . φ(a_t), a_i ∈ V. We insert “t” before P to denote the classes of languages corresponding to the tissue case (e.g., LStP).
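The transition relation on configurations described above can be sketched in a few lines of Python. This is a minimal sketch under assumed simplifications: each region holds a set of strings, rules carry the target indicators of the definition, and the two-membrane structure, labels, and rules in the usage example are hypothetical.

```python
def insertions(w, u, alpha, v):
    """All results of applying the insertion rule (u, alpha, v) to w."""
    res = []
    for i in range(len(w) + 1):
        if w[max(0, i - len(u)):i] == u and w[i:i + len(v)] == v:
            res.append(w[:i] + alpha + w[i:])
    return res

def p_step(config, rules, parent):
    """One transition of a (simplified) insertion P system.

    config: dict membrane label -> set of strings in that region.
    rules:  dict membrane label -> list of (u, alpha, v, tar), where tar
            is 'here', 'out', or ('in', j) for an inner membrane j.
    parent: dict mapping each non-skin membrane to its enclosing membrane.
    Returns (new configuration, strings sent out of the skin membrane 1).
    """
    new = {m: set(ws) for m, ws in config.items()}   # strings persist
    emitted = set()
    for m, ws in config.items():
        for (u, alpha, v, tar) in rules.get(m, []):
            for w in ws:
                for y in insertions(w, u, alpha, v):
                    if tar == "here":
                        new[m].add(y)
                    elif tar == "out":
                        if m == 1:
                            emitted.add(y)           # leaves the system
                        else:
                            new[parent[m]].add(y)
                    else:                            # ('in', j)
                        new[tar[1]].add(y)
    return new, emitted
```

For a structure [1 [2 ]2 ]1, e.g. `p_step({1: {"ab"}, 2: set()}, {1: [("a", "c", "b", ("in", 2))], 2: [("c", "d", "b", "out")]}, {2: 1})` sends the derived string "acb" into membrane 2.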
We also write “[t]” (e.g., LS[t]P) if we do not distinguish between the tissue and tree classes.

We say that a letter a is marked in a sentential form w a w′′ if it is followed by #, i.e., |w′′| > 0 and # is a prefix of w′′. In the following proofs we use a marking technique introduced in [17]. The technique works as follows: in order to simulate a rewriting production A −→ B we insert, adjacently to the right of A, the word #B, specifying that the letter A has already been rewritten. As soon as the derivation of the simulated sentential form is completed, every nonterminal A is marked, and each pair A# is subject to the inverse morphism.

3 Main results

Let us consider insertion systems (without membranes) with one-letter context rules, i.e., the family MorINS_*^{1,1}. Applying the marking technique we get a characterization of the context-free languages.

Theorem 3.1. MorINS_*^{1,1} = CF.

Proof: First we show that CF ⊆ MorINS_3^{1,1}. Let G = (V, T, S, P) be a context-free grammar in Chomsky normal form. Consider the following insertion system Π = (T ∪ V ∪ {#}, {S}, R, h, φ), where R = {(A, #γ, α) | α ∈ T ∪ V, A −→ γ ∈ P, γ ∈ (T ∪ V)*, 1 ≤ |γ| ≤ 2}, the morphism h is defined by h(a) = a# if a ∈ V, and h(a) = a if a ∈ T, and the weak coding φ is defined by φ(a) = ε if a ∈ V, and φ(a) = a if a ∈ T.

It is clear that L(Π) ∈ MorINS_3^{1,1}. We claim that L(Π) = L(G). Indeed, each rule (A, #γ, α) ∈ R can be applied in a sentential form wAαw′ only if A is unmarked (not yet rewritten). Hence, the production A −→ γ ∈ P can be simulated by the corresponding insertion, and, by applying the counterpart rules, we get equivalent derivations. At the end of the computation every nonterminal is marked, and no rule can be applied any more. (Indeed, if the system produces a word having some unmarked nonterminal, then h^{-1} is not defined on this word.) At this point h^{-1} removes all marks, and φ removes all nonterminal symbols. Hence L(Π) = L(G). We get CF ⊆ MorINS_3^{1,1}.
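The squeezing mechanism φ(h^{-1}(·)) used in the proof can be illustrated concretely. The sketch below uses hypothetical placeholder alphabets, not the grammar of the theorem; it shows how h^{-1} filters out words containing unmarked nonterminals and how φ then erases the remaining nonterminal symbols.

```python
# Hypothetical alphabets of some grammar: nonterminals and terminals.
NONTERMS = {"S", "A", "B"}
TERMS = {"a", "b"}

def h_inverse(w):
    """Inverse of the morphism h (h(X) = X# for X in NONTERMS,
    h(x) = x for x in TERMS). Returns None when w is outside the image
    of h, i.e. when some nonterminal is unmarked: such words are filtered out."""
    out, i = [], 0
    while i < len(w):
        c = w[i]
        if c in NONTERMS:
            if w[i + 1:i + 2] != "#":
                return None          # unmarked nonterminal: h^{-1} undefined
            out.append(c)
            i += 2                   # consume the nonterminal and its mark
        elif c in TERMS:
            out.append(c)
            i += 1
        else:
            return None              # stray '#': also outside the image of h
    return "".join(out)

def phi(w):
    """Weak coding: keep terminals, erase nonterminals."""
    return "".join(c for c in w if c in TERMS)

def squeeze(w):
    """phi(h^{-1}(w)), defined only on fully marked output strings."""
    u = h_inverse(w)
    return None if u is None else phi(u)
```

For example, `squeeze("S#aA#b")` yields "ab", while `squeeze("SaA#b")` is undefined (None) because S is unmarked.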
The equivalence of the two classes follows from Theorem 6.4 in [19], stating that INS_*^{1,1} ⊆ CF, and from the fact that the family of context-free languages is closed under inverse morphisms and weak codings. □

Now we consider insertion P systems with left and right contexts of at most one letter. It is known from Theorem 5.5.1 in [17] that LSP_2(ins_2^{1,1}) contains non-context-free languages. We prove that the more general family LSP_*(ins_*^{1,1}) is bounded by the class of languages generated by context-free matrix grammars:

Lemma 3.2. LStP_*(ins_*^{1,1}) ⊂ MAT^λ.

Proof: The proof uses a technique similar to that of Theorem 6.4 in [19] for context-free grammars. Let Π = (V, µ, M_1, . . . , M_n, R_1, . . . , R_n) be an insertion P system such that L(Π) ∈ LStP_n(ins_*^{1,1}) for some n ≥ 1. Consider the matrix grammar G = (D ∪ Q ∪ {S}, V, S, P), where Q = {Q_i | i = 1, . . . , n}, D = {D_{a,b} | a, b ∈ V ∪ {ε}}, and P is constructed as follows:

1. For every rule (a, b_1 . . . b_k, c; go_j) ∈ R_i, a, c ∈ V ∪ {ε}, b_1, . . . , b_k ∈ V, k > 0, we add to P the matrix (Q_i −→ Q_j, D_{ā,c̄} −→ D_{ā,b_1}D_{b_1,b_2} . . . D_{b_{k-1},b_k}D_{b_k,c̄}), where ā = a if a ∈ V, and ā ranges over all t ∈ V ∪ {ε} if a = ε; similarly, c̄ = c if c ∈ V, and c̄ ranges over all t ∈ V ∪ {ε} if c = ε.

2. For every rule (a, ε, c; go_j) ∈ R_i, a, c ∈ V ∪ {ε}, we add to P the matrix (Q_i −→ Q_j, D_{ā,c̄} −→ D_{ā,c̄}), where ā and c̄ are defined as in the previous case.

3. Next, for every w = b_1 . . . b_k ∈ M_i, i = 1, . . . , n, k > 0, we add to P the matrix (S −→ Q_iD_{ε,b_1}D_{b_1,b_2} . . . D_{b_{k-1},b_k}D_{b_k,ε}).

4. As a special case, if ε ∈ M_i we add (S −→ Q_iD_{ε,ε}) to P.

5. Also, for every D_{a,b} ∈ D, a, b ∈ V ∪ {ε}, we add (D_{a,b} −→ a) to P.

6. Finally, we add (Q_1 −→ ε) to P (we assume that the first cell is the output cell).

The simulation of Π by the matrix grammar is straightforward. We store the label of the current cell by means of the nonterminals from Q. Every nonterminal D_{a,c} ∈ D, a, c ∈ V ∪ {ε}, represents a pair of adjacent letters, so we can use it as a context. A rule (a, b_1 . . . b_k, c; go_j) ∈ R_i, a, c ∈ V, b_1 . . . b_k ∈ V^k, can be simulated by the grammar iff the sentential form contains both Q_i and D_{a,c}. As a result, the label of the current cell is rewritten to Q_j and D_{a,c} is rewritten to the string D_{a,b_1}D_{b_1,b_2} . . . D_{b_{k-1},b_k}D_{b_k,c}. We note that, in order to simulate rules that have no context, we introduce productions with every possible context symbol, writing ā ∈ V ∪ {ε} and c̄ ∈ V ∪ {ε}. Clearly, since indexed symbols are duplicated for adjacent nonterminals, e.g., D_{b_{i-1},b_i}D_{b_i,b_{i+1}}, every nonterminal preserves one-symbol right and left contexts.

The simulation of Π by the grammar starts with a nondeterministic choice of the axiom from M_1, . . . , M_n. Then, during the derivation, any rule from R_1, . . . , R_n having the context (a, b) is applied in one-to-one correspondence with the grammar productions having D_{a,b} in the left-hand side. Finally, a string over V is produced by the grammar iff Q_1 has been deleted from the simulated sentential form. The deletion of Q_1 specifies that Π has reached the output cell. So we obtain L(Π) = L(G). Hence, LStP_*(ins_*^{1,1}) ⊆ MAT^λ.

The strictness of the inclusion follows from the fact that there are languages from MAT^λ which cannot be generated by any insertion P system from LStP_*(ins_n^{1,1}), for any n ≥ 1. Indeed, consider the context-free language L_a = {ca^k ca^k c | k ≥ 0}. Since every context-free language is a matrix language, we have L_a ∈ MAT^λ. On the other hand, L_a ∉ LStP_*(ins_n^{1,1}), for any n ≥ 1. Assume, for the sake of contradiction, that there is such a system Π′. We note that the system cannot delete or rewrite any letter, so every insertion is terminal. As the languages of axioms are finite, we need an insertion rule inserting letters a. Consider the final insertion step in a derivation which has at least one step and derives a word w = ca^k ca^k c, for some k ≥ n + 1: w_0 =⇒* w′ =⇒ w, where w_0 is an axiom. Since |w_0|_c ≤ 3, the letter c may or may not be inserted by the last insertion. Assume first that |w′|_c = 3.
In the latter case, let a^p be the inserted string, p ≤ n. Because we may insert a^p at distinct positions of w′, we get that either ca^{k-p}ca^{k+p}c ∈ L(Π′) or ca^{k+p}ca^{k-p}c ∈ L(Π′). This is a contradiction.

Now assume that a letter c is inserted by the last insertion. We note that the insertion of two c's is not possible, since k ≥ n + 1. Consider three cases: the last applied rule inserts c (1) in the middle, (2) at the end, or (3) at the beginning of w′.

(1) Let w_c = a^{p′}ca^{p′′} be the inserted string, where p′ + p′′ ≤ n − 1. Hence, w′ = ca^{k′+k′′}c, where k′ + p′ = k′′ + p′′ = k, and k′ + k′′ = 2k − p′ − p′′ ≥ 2n + 2 − n + 1 ≥ 4. Obviously, regardless of the contexts of the last insertion rule, there are at least two positions at which w_c can be inserted. So we get a contradiction, because either ca^{k′+p′+1}ca^{k′′+p′′−1}c ∈ L(Π′) or ca^{k′+p′−1}ca^{k′′+p′′+1}c ∈ L(Π′).

(2) Let a^q c be the inserted string, where q ≤ n − 1. The corresponding insertion rule has one of the following forms: (ε, a^q c, ε; go_j) or (a, a^q c, ε; go_j), where j is the index of the final membrane. In either case, a^q c may also be inserted in w′ before the last letter a. This is a contradiction.

Case (3) is a mirror of case (2) and can be treated similarly. So we have proved that L_a ∉ LStP_*(ins_n^{1,1}) for any n ≥ 1 and, hence, LStP_*(ins_*^{1,1}) ⊂ MAT^λ. □

Since trees are special cases of graphs, we get the following result.

Corollary 3.3. LSP_*(ins_*^{1,1}) ⊂ MAT^λ.

Lemma 3.4. MAT^λ ⊆ MorLSP_*(ins_2^{1,1}).

Proof: We prove the lemma by a direct simulation of a context-free matrix grammar G = (N, T, S, P). We assume that G is in the modified binary normal form, i.e., every matrix has the form i : (A −→ BC, A′ −→ B′C′) ∈ P, where A, A′ ∈ N; B, B′, C, C′ ∈ N ∪ T ∪ {ε}, and i = 1, . . . , n. Consider the insertion P system Π defined as follows: Π = (V, [_1 [_2 [_3 [_4 ]_4 . . . [_{n+3} ]_{n+3} ]_3 ]_2 ]_1, {S$}, ∅, . . . , ∅, R_1, . . . , R_{n+3}, h, φ), where V = N ∪ T ∪ {C_i, C′_i | i = 1, . . . , n} ∪ {#, $}.
For every matrix i : (A −→ BC, A′ −→ B′C′) we add the rules

r.1.1 : (A, #C_i, α; in_2) to R_1;
r.2.1 : (C_i, BC, α; in_3), r.2.2 : (C′_i, #, α; out) to R_2;
r.3.1 : (C_i, #, α; in_{i+3}), r.3.2 : (C′_i, B′C′, α; out) to R_3;
r.(i+3).1 : (A′, #C′_i, α; out) to R_{i+3},

for every α ∈ V \ {#}. In addition, we add (ε, $, ε; out) to R_1. We define the morphism h and the weak coding φ by: h(a) = a if a ∈ T, and h(a) = a# if a ∈ V \ (T ∪ {#}); φ(a) = a if a ∈ T, and φ(a) = ε if a ∈ V \ T.

Clearly, L(Π) ∈ MorLSP_{n+3}(ins_2^{1,1}). We claim that L(Π) = L(G). To prove this it is enough to prove that w ∈ L(G) iff w′ ∈ L(Π) and w = φ(h^{-1}(w′)).

First we show that for every w ∈ L(G) there exists w′ ∈ L(Π) with w = φ(h^{-1}(w′)). Consider the simulation of the i-th matrix (A −→ BC, A′ −→ B′C′) ∈ P. It is controlled by the letters C_i and C′_i. First, we insert #C_i in the context of an unmarked A and send the obtained string to the second membrane. Then we use C_i as a context to insert, adjacently to the right, the word BC. After that, we mark the control letter C_i and send the sentential form to the (i+3)-rd membrane. Here we choose nondeterministically one letter A′, mark it, write adjacently to the right the new control letter C′_i, and after that send the obtained string to the third membrane. (We remark that the third membrane is immediately outside of the (i+3)-rd membrane.) It is clear that it is not possible to apply the rule r.(i+3).1 : (A′, #C′_i, α; out) in the (i+3)-rd membrane and to reach the skin membrane if the sentential form does not contain an unmarked A′. So, such a branch of the computation cannot influence the result and may be omitted from consideration. Next, in the third membrane, B′C′ is inserted in the context of the unmarked C′_i and the form is sent to the second membrane. Finally, we mark C′_i and send the resulting string back to the skin membrane.

At the beginning of the simulation the sentential form in the skin membrane does not contain unmarked C_i, C′_i. Hence, the insertions in the second and third membranes are deterministic.
The derivation preserves this property, since after the sentential form is sent back to the skin membrane, the introduced C_i and C′_i are marked. At the end of the computation we send the resulting form out of the system by the rule (ε, $, ε; out).

Let w be a string in the skin region which contains some unmarked A and A′. If the letter A precedes A′, then we can write w = w_1Aα_1w_2A′α_2w_3. The simulation of the matrix is the following:

w_1Aα_1w_2A′α_2w_3 =⇒_{r.1.1, r.2.1, r.3.1} w_1A#C_i#BCα_1w_2A′α_2w_3 =⇒_{r.(i+3).1, r.3.2, r.2.2} w_1A#C_i#BCα_1w_2A′#C′_i#B′C′α_2w_3,

where w_1, w_2, w_3 ∈ V*, α_1, α_2 ∈ V \ {#}. We can write a similar derivation if A′ precedes A. Hence, as a result of the simulation of the i-th matrix, we get both A and A′ marked and BC, B′C′ inserted at the correct positions. The derivation in Π may terminate by the rule (ε, $, ε; out) only in the first membrane. Hence it is guaranteed that the simulation of each matrix has been completed. According to the definition of MorLSP, a string w′ belongs to the language if w′ = φ(h^{-1}(w)), where w is the generated string. We may consider only the final derivations of Π in which each nonterminal is marked. Hence, we have L(G) ⊆ φ(h^{-1}(L(Π))). The inverse inclusion is obvious, since every rule in Π has its counterpart in G. Moreover, the case when the derivation in Π is blocked corresponds to the case in which the simulation of a matrix cannot be completed. Hence, we get MAT^λ ⊆ MorLSP_*(ins_2^{1,1}). □

Remark 3.5. A similar result can be obtained with a smaller number of membranes at the price of the maximal length of the inserted words. Precisely, for any context-free matrix grammar G′ there is an insertion P system Π′ such that L(Π′) ∈ MorLSP_{n+1}(ins_3^{1,1}) and L(G′) = L(Π′), where n is the number of matrices in G′. To prove this we can use the same argument as in the previous lemma and replace the rules in R_1, . . .
, R_{n+3} by the rules (A, #BC, α; in_{i+1}) added to R_1 and (A′, #B′C′, α; out) added to R_{i+1}, for every α ∈ V \ {#}, i = 1, . . . , n.

Since trees are special cases of graphs, we get the following result.

Corollary 3.6. MAT^λ ⊆ MorLStP_*(ins_2^{1,1}).

Taking into account Lemmas 3.2 and 3.4, and the fact that the class of languages generated by context-free matrix grammars is closed under inverse morphisms and weak codings, we get a characterization of MAT^λ:

Theorem 3.7. MorLS[t]P_*(ins_*^{1,1}) = MAT^λ.

Now we increase the maximal size of the contexts of insertion rules to two letters. It is known from [19] that INS_2^{2,2} contains non-semilinear languages. By considering these systems with membrane regulation we obtain the following result.

Theorem 3.8. MorLSP_3(ins_2^{2,2}) = RE.

Proof: We prove the theorem by simulating a type 0 grammar in the modified Penttonen normal form. Let G = (N, T, S, P) be such a grammar. Suppose that the rules in P are ordered and n = card(P). Now consider the following insertion P system Π = (V, [_1 [_2 [_3 ]_3 ]_2 ]_1, {S$}, ∅, ∅, R_1, R_2, R_3, h, φ), where V = T ∪ N ∪ F ∪ F̄ ∪ {#, #̄, $}, F = {F_A | A ∈ N}, F̄ = {F̄_A | A ∈ N}.

We include in R_1 the following rules:

(AB, #C, α; here), if AB −→ AC ∈ P;
(A, #C, Bα; here), if AB −→ CB ∈ P;
(A, C, α; here), if A −→ AC ∈ P;
(ε, C, Aα; here), if A −→ CA ∈ P;
(A, #δ, α; here), if A −→ δ ∈ P;
($, ε, ε; out),

where α ∈ V \ {#}, A, B, C ∈ N, and δ ∈ T ∪ {ε}.

It may happen that a pair of letters AB subject to be rewritten by a production AB −→ AC or AB −→ CB ∈ P is separated by letters that have already been marked. We use the two additional membranes to transfer a letter over the marked ones. In order to transfer A ∈ N, we add to the skin membrane, for each α ∈ V \ {#}, the rule r.1.1 : (A, #F_A, α; in_2).
Then we add to the second membrane the rules

r.2.1 : (F_A, #̄A, α′; out), r.2.2 : (F̄_A, #̄A, α′; out),
r.2.3 : (F_AX, #F̄_A, #; in_3), r.2.4 : (F̄_AF_B, #F_A, #; in_3),
r.2.5 : (F_A#, F̄_A, α; in_3), r.2.6 : (F̄_A#, F_A, α; in_3),
r.2.7 : (F_A#, F̄_A, #̄; in_3), r.2.8 : (F̄_A#, F_A, #̄; in_3),
r.2.9 : (F_A#̄, F̄_A, #; in_3), r.2.10 : (F̄_A#̄, F_A, #; in_3),

for every X ∈ F̄ ∪ N, F_B ∈ F, α ∈ V \ {#, #̄}, α′ ∈ {ab | a ∈ N ∪ T, b ∈ N ∪ T ∪ {$}} ∪ {$}. Finally, we add to the third membrane the rules

r.3.1 : (F_A, #̄, α; out), r.3.2 : (F̄_A, #̄, α′; out),

for every α, α′ ∈ V \ {#̄}.

The morphism h and the weak coding φ are defined as h(a) = a if a ∈ V \ N, h(a) = a# if a ∈ N; φ(a) = a if a ∈ T, φ(a) = ε if a ∈ V \ T.

We simulate the productions in P by marking nonterminals from N and inserting the corresponding right-hand sides of the productions. This can be done by insertions in the skin membrane with rules of weight (2, 2, 2), since the grammar is in a form where each production rewrites/adds at most one letter. The simulation of a transfer is done in the second and third membranes. The idea of the simulation is (1) to mark the nonterminal we want to transfer, (2) to jump over the marked letters with the help of one special letter, and, at the end, (3) to mark the special letter and insert the original nonterminal. Since we use two-letter contexts, in one step we can jump only over a single letter. We also need to jump over the marking letter # as well as over the marked nonterminals and the letters inserted previously. In order to jump over # we introduce one additional marking symbol #̄. We mark letters from F ∪ F̄ by #̄, and all the other letters in V \ {#, #̄} by #; e.g., in a word F̄_A#, the letter F̄_A is unmarked.

(1) The rule r.1.1 : (A, #F_A, α; in_2) specifies that every unmarked letter from N may be subjected to the transfer. (2) The rules r.2.3−r.2.10 in the second membrane specify that F_A or F̄_A is copied to the right in such a way that the inserted letter is not marked.
In order to do so, the appropriate rule chooses to insert either the overlined copy F̄_A or the simple copy F_A. The rules r.2.3, r.2.4 describe jumps over one letter not in {#, #̄}, and r.2.5−r.2.10 describe jumps over # and #̄. Every rule among r.2.3−r.2.10 sends the sentential form to the third membrane, and the rules r.3.1, r.3.2 in the third membrane send the sentential form back to the second membrane after marking one symbol F_A ∈ F or F̄_A ∈ F̄.

(3) The rules r.2.1 and r.2.2 may terminate the transferring procedure and send the sentential form to the first membrane if the letter $ or two letters from {ab | a ∈ N ∪ T, b ∈ N ∪ T ∪ {$}} appear in the right context.

For example, consider the transfer of A in the string AX#C$:

AX#C$ =⇒_{r.1.1} A#F_AX#C$ =⇒_{r.2.3} A#F_AX#F̄_A#C$ =⇒_{r.3.1} A#F_A#̄X#F̄_A#C$ =⇒_{r.2.6} A#F_A#̄X#F̄_A#F_AC$ =⇒_{r.3.2} A#F_A#̄X#F̄_A#̄#F_AC$ =⇒_{r.2.1} A#F_A#̄X#F̄_A#̄#F_A#̄AC$.

The sentential form preserves the following property: (i) the first membrane does not contain unmarked letters from F ∪ F̄; there is exactly one unmarked letter from F ∪ F̄ in the second membrane; and there are always two unmarked letters from F ∪ F̄ in the third membrane. We mention that property (i) is preserved by every derivation. Indeed, we start the derivation from the axiom S$, which satisfies the property; then one unmarked symbol is inserted by r.1.1. Rules r.2.3−r.2.10 always add one more unmarked letter, whereas rules r.2.1, r.2.2, r.3.1, r.3.2 always mark one letter from F ∪ F̄.

In order to verify that Π generates the same language as G, we note that every sentential form reachable in G is also reachable in Π by simulating the same production. We also note that the derivation in Π may terminate by the rule ($, ε, ε; out) only in the first membrane. Hence, every transfer will be completed. It follows from property (i) that the simulation of the transfer is deterministic in the second membrane.
Note also that there is a nondeterministic choice in the third membrane, where the rules r.3.1, r.3.2 may mark either of the two unmarked letters. In the case when the rule marks the rightmost letter, the derivation has to “jump” again over the inserted letter. In the special case when r.1.1 starts the transfer of a letter adjacently left of an unmarked one, the rules r.1.1, r.2.1 produce two marked symbols which do not affect the result of the simulation.

The output string w is in the language iff w′ = φ(h^{-1}(w)) is defined. Hence, the resulting output of Π does not contain unmarked nonterminals. On the other hand, every final derivation in Π has its counterpart in G. By applying the inverse morphism h^{-1} we filter out every sentential form with unmarked nonterminals from N. Hence, the corresponding derivation in G is completed. Finally, the weak coding φ filters away all supplementary letters. Hence, we have L(G) = L(Π). □

4 Conclusions

This article investigates the generative power of insertion P systems with encodings. The length of the insertion rules and the number of membranes are used as parameters of the descriptional complexity of the system. In the article we exploit the fact that a morphism and a weak coding are incorporated into insertion P systems. The obtained family MorLS[t]P_*(ins_*^{1,1}) characterizes the matrix languages. When no membranes are used, the class MorINS_*^{1,1} is equal to the family of context-free languages. We also proved the universality result regarding the family MorLSP_*(ins_*^{2,2}). Also, for the family LSP_*(ins_*^{2,2}) one can get an analogous computational completeness result by applying a right (left) quotient with respect to a regular language.

We recall the open problem posed in [4], namely, whether MorINS_*^{2,2} is computationally complete. Our work gives a partial solution of the problem by using the membrane computing framework. One may see that in order to solve the problem completely (by the technique promoted in this article), it is enough to find a concise way to transfer a letter over a marked context. In our case this can be reduced to the question whether it is possible to compute the membrane regulation in the skin membrane. One may mention that there is a trade-off between the number of membranes and the maximal length of productions. By introducing additional nonterminals and fitting the grammars into the normal forms we decrease the number of membranes used. On the other hand, by raising the number of membranes we can simulate larger production rules. Moreover, the descriptional complexity used in the paper may be extended by such internal system parameters as, e.g., the size of the alphabet, the number of rules per membrane, etc. It may be promising to continue the research on minimal systems with respect to these parameters. We are also interested in the computational power of insertion P systems having only one-sided (right or left) contexts.

Acknowledgments

The author acknowledges the support of the PIF program of University Rovira i Virgili and of project no. MTM2007-63422 of the Ministry of Science and Education of Spain. The author sincerely thanks Yurii Rogozhin and Sergey Verlan, without whose supervision this work would not have been done. The author also warmly thanks Gheorghe Păun and Erzsébet Csuhaj-Varjú, who drew attention, during the meeting BWMC-09, to the point of using different normal forms of grammars to simulate P insertion-deletion systems.

Bibliography

[1] M. Daley, L. Kari, G. Gloor, R. Siromoney, Circular Contextual Insertions/Deletions with Applications to Biomolecular Computation, In: Proc. SPIRE'99, pp. 47-54, 1999.

[2] L. Kari, From Micro-soft to Bio-soft: Computing with DNA, In: Proc. of Biocomputing and Emergent Computations, pp. 146-164, 1997.

[3] L. Kari, Gh. Păun, G. Thierrin, S. Yu, At the Crossroads of DNA Computing and Formal Languages: Characterizing RE Using Insertion-Deletion Systems, In: Proc.
of 3rd DIMACS, pp. 318-333, 1997.
[4] L. Kari, P. Sosík, On the Weight of Universal Insertion Grammars, TCS, 396(1-3), pp. 264-270, 2008.
[5] L. Kari, G. Thierrin, Contextual Insertion/Deletion and Computability, Information and Computation, 131, 1, pp. 47-61, 1996.
[6] A. Krassovitskiy, Yu. Rogozhin, S. Verlan, Further Results on Insertion-Deletion Systems with One-Sided Contexts, In: Proc. of 2nd LATA08, LNCS, 5196, pp. 333-344, 2008.
[7] A. Krassovitskiy, Yu. Rogozhin, S. Verlan, One-Sided Insertion and Deletion: Traditional and P Systems Case, CBM08, pp. 53-64, 2008.
[8] A. Krassovitskiy, Yu. Rogozhin, S. Verlan, Computational Power of P Systems with Small Size Insertion and Deletion Rules, In: Proc. of CSP08, pp. 137-148, 2008.
[9] S. Marcus, Contextual Grammars, Rev. Roum. Math. Pures Appl., 14, pp. 1525-1534, 1969.
[10] M. Margenstern, Gh. Păun, Yu. Rogozhin, S. Verlan, Context-Free Insertion-Deletion Systems, TCS, 330, pp. 339-348, 2005.
[11] C. Martin-Vide, Gh. Păun, A. Salomaa, Characterizations of Recursively Enumerable Languages by Means of Insertion Grammars, TCS, 205, 1-2, pp. 195-205, 1998.
[12] A. Matveevici, Yu. Rogozhin, S. Verlan, Insertion-Deletion Systems with One-Sided Contexts, LNCS, 4664, pp. 205-217, 2007.
[13] M. Mutyam, K. Krithivasan, A. S. Reddy, On Characterizing Recursively Enumerable Languages by Insertion Grammars, Fundamenta Informaticae, 64(1-4), pp. 317-324, 2005.
[14] K. Onodera, New Morphic Characterization of Languages in Chomsky Hierarchy Using Insertion and Locality, In: Proc. of 3rd LATA09, LNCS, 5457, pp. 648-659, 2009.
[15] Gh. Păun, Computing with Membranes, Journal of Computer and System Sciences, 61, 1, pp. 108-143, 2000.
[16] Gh. Păun, Marcus Contextual Grammars, Kluwer, Dordrecht, 1997.
[17] Gh. Păun, Membrane Computing. An Introduction, Springer-Verlag, Berlin, pp. 226-230, 2002.
[18] Gh. Păun, M. J. Pérez-Jiménez, T.
Yokomori, Representations and Characterizations of Languages in Chomsky Hierarchy by Means of Insertion-Deletion Systems, International Journal of Foundations of Computer Science, 19, 4, pp. 859-871, 2008.
[19] Gh. Păun, G. Rozenberg, A. Salomaa, DNA Computing. New Computing Paradigms, Springer-Verlag, Berlin, 1998.
[20] G. Rozenberg, A. Salomaa, eds., Handbook of Formal Languages, Springer-Verlag, Berlin, 1997.
[21] A. Takahara, T. Yokomori, On the Computational Power of Insertion-Deletion Systems, In: Proc. of DNA 2002, Sapporo, LNCS, 2568, pp. 269-280, 2003.
[22] S. Verlan, On Minimal Context-Free Insertion-Deletion Systems, Journal of Automata, Languages and Combinatorics, 12, 1/2, pp. 317-328, 2007.