Microsoft Word - BRAIN_8_2017_Issue_1_VERS3.docx 42 Right-Linear Languages Generated in Systems of Knowledge Representation based on LSG Daniela Dănciulescu University of Craiova, Alexandru Ioan Cuza 13, Craiova 200585, Romania danadanciulescu@gmail.com Mihaela Colhon University of Craiova, Alexandru Ioan Cuza 13, Craiova 200585, Romania mcolhon@inf.ucv.ro Gheorghe Grigoraş Alexandru Ioan Cuza University of Iaşi, Bulevardul Carol I 11, Iaşi 700506, Romania grigoras@info.uaic.ro Abstract In Tudor (Preda) (2010) a method for formal languages generation based on labeled stratified graph representations is sketched. The author proves that the considered method can generate regular languages and context-sensitive languages by considering an exemplification of the proposed method for a particular regular language and another one for a particular context- sensitive language. At the end of the study, the author highlights some open problems for future research among which we remind: (1) The study of the language families that can be generated by means of these structures; (2) The study of the infiniteness of the languages that can be represented in stratified graphs. In this paper, we extend the method presented in Tudor (Preda)(2010), by considering the stratified graph formalism in a system of knowledge representation and reasoning. More precisely, we propose a method that can be applied for generating any Right Linear Language construction. Our method is proved and exemplified in several cases. Keywords: graph-based representation and reasoning, formal language generation, right- linear language generation, accepted structured paths, stratified graph. 1. Introduction A non-empty and finite set of symbols noted with Σ is considered in what follows the alphabet set. A word over an alphabet is a sequence of symbols  = 1 … n, where i Σ, . The length of a word will be denoted by ||. Obviously, for  = 1 … n, we have || = n and || = 0, where for we mark an empty word. The set of all nonempty words over an alphabet is denoted (instead of the set) Σ+. If we want to consider the set of all the words over an alphabet (including the empty word) we denote Σ* = Σ +{}. For two words:1 = 1 … n and 2 = 1 … m, we note by  the operation of string concatenation. We will have (1). 12 = 1 … n1 … m (1) This operation is not commutative, that is(2), but it is associative (3) with  the neutral element (4). 1221 (2) (12) 3 = 1 (23) (3)  =  =  (4) D. Dănciulescu, M. Colhon, G. Grigoraş - Right-Linear Languages Generated in Systems of Knowledge Representation based on LSG 43 We obtain that (Σ+, ) is a semi-group and (Σ*, ) is a free monoid generated by Σ. If we take 1, 2 Σ* we consider:  2 a sub-word of 1 if there are 1, 2 Σ* such that 1 = 122;  2 is a prefix of 1 if 1 = 2 and  Σ*;  2 is a suffix of 1 if 1 = 2 and  Σ*. A language L over an alphabet Σ is defined as a subset of Σ*, that is (5). Due to the fact that Σ* is an infinite set, we can have the language (5) - an empty set, a finite or an infinite set. We will note by  = n the word (6), obviously  = 0. L Σ* (5)  = (6) A grammar is considered as a tuple (7) such that (8), for Σ - the set of terminal symbols and N - the set of non-terminal symbols. S0 is called the initial symbol and P is the set of grammar productions (rules). G = (N, Σ, S0, P) (7) Σ N =  (8) The elements of “P” are of the form (9). (ij) P (9) Where (10), such that i contains at least a symbol from N and (11). In the rest of the paper, we call the rules of the grammar G the elements (12).A direct derivation in the grammar G is defined over (N Σ)* using the binary relation (13). i (N Σ) + (10) j (N Σ)* (11) (ij)  P (12)  as: ipjiqj if i, j , p, q (N Σ)* and (pq) P (13) By * we will note the reflexive and transitive closure of the relation  and by + the transitive closure of the same relation. Using these notations, the language generated by the grammar G is considered (14). L (G) = { Σ* | S0 +} (14) In this paper, we extend the method presented in Tudor (Preda)(2010), by considering the stratified graph formalism in a system for knowledge representation and reasoning. More precisely, we propose a method that can be applied for generating any Right Linear Language construction. Our method is proved and exemplified in several cases. The paper is organized as follows: the second section provides the theoretical background behind the presented study. Section three presents the manner in which the language generation mechanism is designed by means of a system of knowledge representation and reasoning. The final section draws the concluding ideas and our future study. BRAIN: Broad Research in Artificial Intelligence and Neuroscience Volume 8, Issue 1, April 2017, ISSN 2068-0473, E-ISSN 2067-3957 44 2. System of knowledge representation based on stratified graph for formal language generation We consider two finite sets S and L0 such that (15), for S - a set of nodes labels and L0 - a set of arcs labels. We note by (16) a non-empty set of binary relations on S and by f0 a surjective function where (17). Following the definitions and the notations firstly introduced by Ţăndăreanu (Ţăndăreanu, 2000; Ţăndăreanu, 2003), the system (18) will denote a labeled graph structure. S \ L0 =  (15) T0 2 S×S (16) f0: L0T0 (17) G0 = (S, L0, T0, f0) (18) By STR(G0) we note the set of the structured paths of the graph G0, this concept being defined (Dănciulescu and Ţăndăreanu, 2013) as follows:  for every aL0 and for every x, yS such that (19), we have (20); (x, y) f0(a) (19) ([x, y], a) STR(G0) (20)  if (21) and (22) then (23). ([x1, …, xn], u) STR(G0) (21) ([y1, …, ym], v) STR(G0) (22) ([x1, …, xn, y1, …, ym], [u, v]) STR(G0) (23) We take the projection of STR(G0) on the second axis (24). STR2(G0) = {u | ([x1, …, xn], u) STR(G0), n 2} (24) If we take T the composition operation of binary relations from the set 2 S×S defined as (25). T(1, 2) = {(x, z)  2 S×S | yS: (x, y) 1, (y, z) 2 } (25) The pair (2S×S, T) becomes a partial algebra. Following (Ţăndăreanu, 2000), by the set (26) we will note the closure of the set T0 in the algebra (2 S×S, T). T = ClT(T0) (26) For L - a binary relation symbol, we consider L the smallest set satisfying the following notations (Ţăndăreanu, 2000): 1) L0L; 2) if (27) then (28). a, bL0 (27) L(a, b) L (28) The pair (L, L) also defines a partial algebra. Following Ţăndăreanu (Ţăndăreanu, 2000; Ţăndăreanu, 2003) we take (29). f: (L, L)  (2 S×S, T ) (29) D. Dănciulescu, M. Colhon, G. Grigoraş - Right-Linear Languages Generated in Systems of Knowledge Representation based on LSG 45 A morphism of partial algebras such that (30) and if (31), then (32) (see Figure 1). We obtain f(L) = T which means that “for every element of L the associated element of T is computed by the morphism f” (Ţăndăreanu, 2000). aL0, f(a) = f0(a) (30) f(u), f(v)) dom(T ) (31) (u, v) dom(L) (32) Figure 1. The graphical representation of the morphism f: (L, L)  (2 S×S, T) (Ţăndăreanu, 2000) From (Ţăndăreanu, 2000) for every (33) and for each (T, T ) we can construct a Stratified Graph (shortly, SG) over G0: (34). G0 = (S, L0, T0, f0) (33) G = (G0, L, T, T , f) (34) The elements of L represent the labels for the binary relations of T defined in the stratified graph structure by means of T - the composition operation of binary relations. For each (35), we define trace(u) as the list of symbols from L0, upon which the label u was made up, by considering the symbols of L0 in the exactly the same order they appear in the label u. uL (35) We have:  if uL0 then trace(u) = [u]  if (36) with a, bL0 then trace(u) = [a, b]  if u = L(u1, u2) then trace(u) = trace(u1) trace(u2),for we note the reunion of lists the operation (37). u = L(a, b) (36) [x1, . . . , xn]  [y1, . . . , ym] = [x1, . . . , xn, y1, . . . , ym]. (37) As it is considered in the literature, a formal language is a set of words which are represented by finite strings of letters, symbols or tokens. The set of all the letters or symbols upon which the words are made of is called the alphabet of the formal language. Usually, a formal language is defined by means of some formation rules or a “formal grammar” (Dănciulescu, 2015). BRAIN: Broad Research in Artificial Intelligence and Neuroscience Volume 8, Issue 1, April 2017, ISSN 2068-0473, E-ISSN 2067-3957 46 An interpretation for a stratified graph provides a representation in a domain of knowledge. In the presented approach, we take the domain of knowledge to be a particular formal language. In what follows we will note by V the alphabet of the generated formal language (Dănciulescu, 2015). Therefore, let us consider the stratified graph (38) over a labeled graph (39). We defined the interpretation of G as a tuple: (40) (Dănciulescu, 2015). G = (G0, L, T, u, f) (38) G0 = (S0, L0, T0, f0) (39) I = (V*, ob, D, Y) (40) such that:  V* is the set of all strings constructed over the characters of V; in this formalism V* represents the knowledge domain of I  ob: S V* a injective function that maps the symbols of S into the strings of V*  D = (V*, ) is a partial algebra over V*. In what follows we will consider the operation  as a partial operation defined in the following manner: for every natural number m we take m+1 = m  the set of algorithms Y = {Algu}uL generate the elements of the interpretation domain V*, Algu: (41).  V* × V* V*, uL (41) According to this interpretation system, the valuation mapping generated by I is defined as (42). valI : ASP(G) Y (42) such that (Dănciulescu, 2015):  valI([x, y], u) = Algu(ob(x), ob(y)), uL0  valI([x1, …, xn], L(u, v)) = valI([x1, …, xk], u) valI([xk, …, xn], v), for L(u, v) L Remark. (Dănciulescu, 2015) BecauseD = (V*, ) is a partial algebra, results that(43) if and only if: (44) and (45). valI([x1, …, xk], u) valI([xk, …, xn], v) V* (43) valI([x1, . . . , xk], u), valI([xk, . . . , xn], v) V* (44) valI([x1, . . . , xk], u), valI([xk, . . . , xn], v) dom() (45). Remark (Dănciulescu, 2015) For each (46), the string (47), is an element ofV*, therefore (48) represents the formal language overVconstructed by means of the abstract notations ofG. (ob1, ob2) Y×Y (46) Algu(ob1, ob2) foruL (47) (48) Definition (Dănciulescu, 2015) We define a Stratified Graph Representation System for formal language generation as follows: SGRS = (G, (V*, ), ob, Y) D. Dănciulescu, M. Colhon, G. Grigoraş - Right-Linear Languages Generated in Systems of Knowledge Representation based on LSG 47 The inference process IPSGRS generated by the system of knowledge representation SGRS is (49) (Dănciulescu, 2015). IPSGRS: ASP(G) Y (49) We obtain that the set Y that includes all the words constructed over the alphabet V by means of the algorithms defined in the interpretation system I by considering the representations of the stratified graph G. Therefore (50) is an accepted structured path of the structure G, we have (51) (Dănciulescu, 2015). dASP(G) (50) IPSGRS(d) = valI(d) (51) Following the inference process given above, the output elements of Y are defined as follows (Dănciulescu, 2015): - if we take (52) results (53). C(x, y) = {dASP(G) | first(d) = x, last(d) = y} (52) (53) where: for (54) with (55) and (56). IPSGRS = (54) dASP(G) (55) d = ([x1, …, xn], u), first(d) = x1 (56) last(d) = xn (57) 3. Right-linear languages generated in systems of knowledge representation based on stratified graphs A Right Linear Grammar is a Context Free GrammarG = (N, Σ, S, P) where each rule has one of the following forms: (58) and (59). S1S2 or S1 for  Σ*  {} (58) S1, S2N (59) A grammar is considered as a Left Linear Grammar if all the productions are of the form: (60) and (61). S1S2 or S2, for  Σ*  {} (60) A Regular Grammar is a grammar that is either a Right Linear or a Left Linear Grammar. In literature there are several generation mechanisms for Regular Languages like grammars, automata, transition networks and so on. In this study, we will consider the case of Right-Linear Languages generation; we will prove that we can emulate a Right-Linear Grammar using a system of knowledge representation based on stratified graphs by considering appropriate interpretations. Indeed, let us consider V - a vocabulary. We can define a new mechanism for V* words generation in a System of Knowledge Representation based on SG if we take (61). BRAIN: Broad Research in Artificial Intelligence and Neuroscience Volume 8, Issue 1, April 2017, ISSN 2068-0473, E-ISSN 2067-3957 48 SGRS = (G, (V*, ), {Algu}uL) (61) For G = (G0, L, T, T , f) a Stratified Graph over the labeled graph (62) for (63) - a partial algebra over the symbols of the vocabulary V and by  we note the string concatenation operation defined in above. G0 = (S, L0, T0, f0) (62) (V*, ) (63) Based on the outputs given by the set of algorithms (64) we defined the interpretationmapping for the elements of STR(G0) as the function (65) such that: - if (x, y) f(a) then (66) - if (x, y) f(L(u, v)) then (67). {Algu}uL (64) val: STR(G0) V* (65) val(Patha(x, y), a) = Alga(x, y) (66) val( (x, y), trace(L(u, v))) = (1, 2) (67) Theorem. If (68) is a Right Linear Grammar, there exists a System of Knowledge Representation based on LSG such that (69). G = (N, Σ0, S0, P) (68) L(G) = L(SGRS) (69) Proof. Let us considered (70) labeled graph such that: - (71) is the set of nodes, each non-terminal symbol of the grammar G having a correspondent node in the graph G0. In addition, for G0 – the set of nodes we consider a special node noted with F that will be considered as the “final node”. - the set of arc labels of G0 is the set of terminal symbols of the grammar G. - for every production (72) there are (73) such that (74). - for every production (75) we take (76). G0 = (X, Σ0, T0, f0) (70) X = N {F} (71) (S1S2) P (72) S1, S2X (73) (S1, S2) f0() (74) (S1) P (75) (S1, F) f0() (76) We take (77) a stratified graph structure over the graph G0. We will prove that for the Stratified Graph Representation System SGRS, defined based on the structure G: (78). G = (G0, L, T, f) (77) SGRS = (G, (Σ0, ), {Algu}uL) (78) We have L(SGRS) = L(G). Indeed, let us consider the following n step of derivation in the grammar G (79) 12 … nSn12 … nn+1Sn+1 (79) D. Dănciulescu, M. Colhon, G. Grigoraş - Right-Linear Languages Generated in Systems of Knowledge Representation based on LSG 49 This step is modeled in the stratified graph G given in Figure 1 as it will be proved in what follows: If (80), results that there are the following productions in the grammar G: (81) and by the production of the grammar G we have (82).  = 12 … nL(G) (80) S01 S1 S12 S2 … Skk+1Sk+1 … (81) S01S112S2 … 1 … k+1Sk+1 … 1 … n (*) (82) In order to model these derivations in the stratified graph G, each production of the grammar will be represented in the labeled graph G0 by a direct arc of the form given in Figure 2. Figure 2. The representation of the rule (Skk+1 Sk+1)  P in the labeled graph G0 For each non-terminal symbol Sk of the grammar G there is a node Sk in the graph G0 and for each production (83). Sk1 … k+1Sk+1 (83) The labeled graph G0 will have a relation (84), corresponding, in the stratified labeled graph G, we will have (85). (Sk, Sk+1) f0(k+1) (84) (85) Suppose (86), so that the Equation (*) is satisfied. We obtain (87) which implies the existence of (88) such that (89). This means that (90).  = 1 … nL(G) (86) (S0, S1) f0(1) … (Sn-1, F) f0(n) (87) d (88) d = ([S0, S1, …, Sn, F], (1, 2), …, n)) (89) 1 … nIPSGRS(d) L(SGRS) (90) Conversely, assume that (91). We have that (92), which implies that there is the sequence of non-terminals symbols (93) in the grammar G such that (94). We obtain that (95) and the theorem is proved.  = 1 … nL(SGRS) (91) BRAIN: Broad Research in Artificial Intelligence and Neuroscience Volume 8, Issue 1, April 2017, ISSN 2068-0473, E-ISSN 2067-3957 50 (92) S0, …, Sn-1 (93) S01S112S2 … 1 … n (94) that L(G) (95) Example 1. Let us consider the following Right Linear Grammar (96), where (97). We obtain that (98) with a, n 0. G1 = ({S0}, {a, b}, S0, P) (96) P = {(S0abS0), (S0a)} (97) L(G1) = (ab) n (98) Figure 3. Representation of the grammar G1 in the labelled graph G0 1 If we take the labeled graph G0 1 given in Figure 3 and construct the stratified graph structure over (99) such that (100) we obtain (101), (102). G0 1, G = (G0 1, L, T, f) (99) L = {ab, a, L(ab, ab), L(L(ab, ab), a), L(L(ab, ab), ab), L(L(L(ab, ab), ab), a), …} (100) (S0, S0) f(ab) f(L(ab, ab)) f(L(L(ab, ab), ab)) f( (ab, ab), …, ab)), (101) (S0, F) f(a) f(L(ab, a)) f(L(L(ab, ab), a)) f( (ab, ab), …, a)) (102) Results that the language generated in the resulted system of knowledge representation over G, (103) is (104). That is (105) with a, n 0. SGRS = (G, ({ab, a}*, ), {Algu}uL) (103) L(SGRS) = (104) L(SGRS) = (ab)n (105) 4. Conclusions and future work In this paper, we proposed a new system for formal language generation by means of stratified graphs structures. This mechanism can generate languages of the first type and of the second type. More precisely, we propose a new system for formal language generation by means of a system of knowledge based on stratified graphs. We exemplified that, using an interpretation D. Dănciulescu, M. Colhon, G. Grigoraş - Right-Linear Languages Generated in Systems of Knowledge Representation based on LSG 51 system specially defined for stratified graphs representations, a particular formal language can be obtained by means of the resulted accepted structured paths. From our last work (Dănciulescu, 2015), several open problems still remain unsolved: the investigation of the manner in which, by imposing a set of restrictions in the inference process computations, the generated formal language sequences could be affected; the study the formal languages families that can be generated using this type of knowledge system: regular languages, context-sensitive languages, etc. References Dănciulescu, D. (2015). Formal Languages Generation in Systems of Knowledge Representation Based on Stratified Graphs. Informatica,26(3), 407-417. doi:10.15388/informatica.2015.55. Dănciulescu, D., & Colhon, M. (2014). Splitting the structured paths in stratified graphs. Application in Natural Language Generation. Analele Universitatii "Ovidius" Constanta - Seria Matematica,22(2), 59-69. doi:10.2478/auom-2014-0031. Dănciulescu, D. (2013) Systems Of Knowledge Representation Based On Stratified Graphs And Their Inference Process, In Proceedings of 9th International Conference of Applied Mathematics, Abstracts and Pre-Proceedings, Baia Mare. Dănciulescu, D., Ţăndăreanu, N. (2013) Splitting the structured paths in stratified graphs, Modelling and Development of Intelligent Systems. In Proceedings of the 3th Int Conference on Modeling and Development of Intelligent Systems, University of Lucian Blaga from Sibiu Publishing House, ISSN 2067-3965. Negru, V., Grigoraş, G., Dănciulescu, D. (2015) Natural Language Agreement in the Generation Mechanism based on Stratified Graphs. In Proceedings of the 7th Balkan Conference in Informatics (BCI 2015), Craiova, Romania. Tudor, I. (2016) (2016). Properties of stratified languages. Annals of the University of Craiova,43(2), Mathematics and Computer Science series, 218-230. ISSN: 1223-6934 Tudor, (Preda), I.-V. (2010) Labelled Stratified Graphs can Generate Formal Languages. European Conference for the Applied Mathematics and Informatics, ISBN 978-960-474- 260-8, 184-189. Tudor (Preda), I.-V. (2011). Considerations Regarding Formal Languages Generation Using Labelled Stratified Graphs. International Journal of Applied Mathematics and Informatics, University Press, 2(5), 101-108. Tudor (Preda), I.-V. (2011). Generating Natural Language by Labelled Stratified Graphs. Scientific Bulletin, University of Piteşti, Matematics and Informatics Series, 17, 89-100. Ţăndăreanu, N. (2000). Knowledge bases with Output. Knowledge ans Information Systems, 2(4), 438-460. Ţăndăreanu, N. (2003). Collaborations between distinguished representatives for labelled stratified graphs. Annals of the University of Craiova, Mathematics and Computer Science Series, 30(2), 184-192. Ţăndăreanu, N. (2004). Knowledge representation by labelled stratified graphs. In Proceedings of the 8th World Multi-Conference on Systemics, Cybernetics and Informatics, 5, 345-350. Ţăndăreanu, N. (2005). Labelled Stratified Graphs and their Applications in Knowledge Representation, Technical Report, Research Centre for Artificial Intelligence, University of Craiova. Ţăndăreanu, N. (2006). Lecture Notes on Universal Algebra, Basic Concepts of Peano Algebras and Lattices, Research Report in Artificial Intelligence, Research Centre for Artificial Intelligence, University of Craiova, Universitaria Publishing House.