Extending Graph Query Languages by Reduction Electronic Communications of the EASST Volume 10 (2008) Proceedings of the Seventh International Workshop on Graph Transformation and Visual Modeling Techniques (GT-VMT 2008) Extending Graph Query Languages by Reduction Erhard Weinell 14 pages Guest Editors: Claudia Ermel, Reiko Heckel, Juan de Lara Managing Editors: Tiziana Margaria, Julia Padberg, Gabriele Taentzer ECEASST Home Page: http://www.easst.org/eceasst/ ISSN 1863-2122 http://www.easst.org/eceasst/ ECEASST Extending Graph Query Languages by Reduction Erhard Weinell RWTH Aachen University of Technology, Department of Computer Science 3, Ahornstrasse 55, D-52074 Aachen, Germany, Weinell@cs.rwth-aachen.de Abstract: Graph grammars are a well-founded technology for visually specifying computations or the processing of complex data structures. Up to now, numerous languages and tools for graph transformations exist, whilst new ones are proposed regularly. However, these tools have no technical basis such as an execution frame- work or data storage in common. Instead, graph transformation machineries are usually implemented anew each time. The DRAGOS graph database is especially well-suited for building graph transfor- mation systems, as it is able to store complex graph structures directly. Besides its storage functionality, the database also provides a Query & Transformation Mecha- nism which is able to handle complex queries upon the stored graphs, and to modify them accordingly. Being designed as a basis for graph and model transformation tools, this mechanism is required to allow a flexible adaptation and extension ac- cording to the respective applications’ needs. The present paper discusses how this requirement is covered by the proposed Query & Transformation Mechanism. Keywords: graph database, extensibility, constraint satisfaction 1 Introduction Graph transformations have shown to provide appropriate means to approach various challenges related to software engineering. They not only support the visual programming paradigm, but are also able to express model-to-model transformations, just to mention two prominent scenarios. Up to now, a lot of languages and tools related to graph transformations have been developed, differing w.r.t. application domains and provided language expressiveness. Nevertheless, the all- encompassing system does not seem to exist, as new ones are proposed regularly. All of these tools share common requirements: A data repository to store graph structures persistently, and an according execution framework to carry out the specified transformation rules. In our project, we strive to support the construction of tools related to graph transformations by means of a common platform. Persistent storage of graph structures is provided by the DRA- GOS database [Böh04], which continues a series of graph databases dating back to the eighties [LS88]. The ability to query and transform the stored graphs is currently being added [Wei08]. In this regard, we do not provide an enclosed language and environment, e.g. as applied by PROGRES [SWZ99]. Instead, a core Query & Transformation Language (QTL) is provided which can be easily applied as base-layer for other graph languages. This not only enables to specify graph transformations e.g. using PROGRES syntax, but also allows to construct spe- cialized languages comprising sophisticated query functionality, or model-to-model translations. 1 / 14 Volume 10 (2008) mailto:Weinell@cs.rwth-aachen.de Extending Graph Query Languages by Reduction Tool developers therefore can refer to high-level functionality provided by the QTL, instead of developing proprietary interpreters or code generation modules required without such a platform. Figure 6c gives an architectural overview of the pursued approach. As the figure suggests, Graph Transformation Tools have to provide a language-specific editor, and a transformation mod- ule. The latter’s responsibility is to translate rules of the specialized language to the QTL. These rules are stored in a Rule Repository, from where they are loaded into the Query & Transforma- tion Mechanism (QTM) for processing. The QTM can be controlled by UI frameworks to invoke rules on the stored data. Furthermore, the QTM is able to handle multiple data storages at a time, although all depicted DRAGOS instances can be integrated into a single one at runtime. GraTra Tool Editor Transformer DRAGOS: Rule Repository UI Framework Graph Transformation System s p e c if ic a ti o n t im e ru n t im e DRAGOS: Data Repository DRAGOS: Data Repository... retrieving retrieving querying / transformingprocessing Rule selection trans- lating Rule Processing Engine - QTM Rule set, in QTL Figure 1: DRAGOS / QTL interaction The critical step in constructing specialized graph languages is the concise representation of language transformations, e.g. an appropriately modeled transformation module. Despite the QTL’s ability of declarative modeling, such translations can become complex and hard to main- tain, depending on the specialized language’s complexity. Therefore, we allow to extend the QTL by additional constructs to align both languages’ conceptual levels. For example, an extension can comprise constructs to handle traceability links, which is useful if a model-transformation language is considered as specialized graph language. Language constructs can be realized by extending the QTL’s implementation, i.e. by textual programming. However, this approach requires precise technical knowledge on the QTM’s API, whilst the added constructs’ effects are hard to validate. Therefore, we alternatively allow to define language constructs by reducing them to basic ones. As result, extensions can be specified in a model-based way, easing development and making their effects easier to comprehend due to the known basic constructs’ effects. The present paper introduces this extension mechanism by means of a simple example, namely the detailed modeling of type checking in graph languages. This represents another application of language extensions, besides conceptual alignment: Core language extensions can also be applied to precisely model a specialized language’s semantics, e.g. which types of instances are considered valid for a specific query. The rest of this paper is structured as follows: We first introduce the basic functionality of DRAGOS in Section 2, and afterwards the QTL in Section 3. The following Section 4 presents how the language can be extended by additional language constructs. The paper finally discusses relations to other projects in Section 5 and gives an outlook on future work in Section 6. Proc. GT-VMT 2008 2 / 14 ECEASST 2 Graph database DRAGOS The DRAGOS database1 is a data repository for the management of graph structures. Using graphs as fundamental data model, even complex data structures can be represented without need for technical helper elements. In contrast, the relational data model often requires additional elements, such as extra tables to store many-to-many relations. Architecture. The DRAGOS architecture allows flexbile adaption to a given application do- main, amongst other things by exchanging the underlying graph storage module. This way, developers may choose between fast in-memory solutions, and transactional storages based on relational or object-oriented databases. Furthermore, the graph storage module can be adapted to use existing model repositories, too. Graph model. DRAGOS offers a rich graph model originally inspired by the Graph eXchange Language (GXL) [HWS00]. Among other things, the graph model supports hierarchical graphs including graph-crossing connections. Nodes, graphs, edges and relations are treated as first- class citizens, and thus can be identified and attributed. This enables flexible connections be- tween entities, e.g. edges connecting edges and the attribution of all entities. All entities need to be typed by some graph entity class. Type structuring is supported, including multiple inheri- tance. 3 Queries & Transformations for DRAGOS In this section, we present the Query & Transformation Language by means of an example, re- lating it to the well-known graph transformation language PROGRES [SWZ99]. The language’s abstract syntax is presented and its semantics are sketched. Unfortunately, no comprehensive definition of the QTL can be given here due to the lack of space. Also, only the query aspect of the language is handled in this paper. For the transformation of graphs, the reader is referred to [Wei08]. 3.1 Example Query Figure 2a shows a simple visual query modeled using the PROGRES graph transformation lan- guage. This query checks whether three nodes connected by edges of proper type and direction exist in the host graph. Another (intuitive, but rather implicit) condition is that indeed two dif- ferent nodes ‘1 resp. ‘2 exist. As the DRAGOS graph model is a lot more complex than the PROGRES model, queries according to the PROGRES syntax would be hard to represent. Therefore, the QTL separates between graph entities to be searched from the conditions that need to be fulfilled by these enti- ties. The DRAGOS query shown in Figure 2b contains a set of variables (middle row, depicted as circles). In order to confirm the query, each of these variables has to be bound to a graph entity from the host graph, otherwise the query fails. 1 Database Repository for Applications using Graph-Oriented Storage, previously called Gras/GXL. 3 / 14 Volume 10 (2008) Extending Graph Query Languages by Reduction query sampleQuery = end; `1 : A `2 : A `3 : Be e (a) PROGRES Query Node Var Node Var Node Var Edge Var Edge Var Type Constr reqType=“A“ Type Constr reqType=“e“ Type Constr reqType=“B“ Incid Constr s r c conn trg Incid Constr t r g con n src Iso Constr (b) DRAGOS Query Figure 2: Sample query searching three connected nodes Figure 3: Meta-Model of the Query & Transformation Language (simplified) Constraints (depicted as diamonds in Figure 2b) are used to restrict the queries’ results in sev- eral ways: IncidenceConstraints demand connectivity of entities, using role names to distinguish between variables for the source, the target and the connector. This distinction is necessary as DRAGOS allows edges to be connected to other edges, and so querying these structures needs to be supported. TypeConstraints restrict legal values to a certain type, where the desired type is indicated by the reqType attribute. The IsomorphismConstraint is used to ensure that attached variables are bound to pairwise different entities. Its name stems from the theoretical concept of searching an isomorphic mapping of queried entities to host graph entities, although it could be called NonIdentityConstraint as well. It is only added between variables of the same type, as inheritance is not considered in the current example. 3.2 Syntax & Semantics The language’s abstract syntax is depicted in Figure 3 by means of its meta-model. According to this model, each Pattern consists of a set of PatternElements, which are sub-divided into Variables and Constraints. Constraints are connected to at least one Variable via Restricts edges, which can Proc. GT-VMT 2008 4 / 14 ECEASST be distinguished using the role attribute. To support manipulation of graphs, the complete meta- model additionally provides Operators, which are not discussed in this paper. Figure 3 only defines the basic structure of patterns, but does neither define static semantics (e.g. well-formedness of patterns) nor dynamic semantics (the actual meaning of the pattern). Here, these two kinds of semantics are introduced for a small subset of the QTL. We utilize the OMG’s Object Constraint Language (OCL), as it allows to combine first-order predicate formulae with object-oriented concepts. Nevertheless, it should be noted that the OCL has not been comprehensively defined in a formal way, so that no unique interpretation of the presented formulae can be given. However, several research activities [BW02] strive to define the OCL’s semantics, which would lead to an unambiguous understanding. Besides the language’s meta-model depicted above, several well-formedness conditions for patterns exist, which cannot be expressed using class-diagrams in a convenient way. For exam- ple, the following OCL invariant defines conditions on the IncidenceConstraint: c o n t e x t I n c i d e n c e C o n s t r a i n t d e f : S : C o l l e c t i o n ( V a r i a b l e ) = s e l f . r e s t r i c t s →s e l e c t ( r | r . r o l e = ” s r c ” ) d e f : T : C o l l e c t i o n ( V a r i a b l e ) = s e l f . r e s t r i c t s →s e l e c t ( r | r . r o l e = ” t r g ” ) d e f : C : C o l l e c t i o n ( V a r i a b l e ) = s e l f . r e s t r i c t s →s e l e c t ( r | r . r o l e = ” c o n n ” ) i n v : w e l l f o r m e d n e s s = s e l f . C →s i z e ( ) = 1 and s e l f . C . s o r t = V a r i a b l e S o r t . EDGE and s e l f . S →s i z e ( ) ≤ 1 and s e l f . T →s i z e ( ) ≤ 1 and 1 ≤ s e l f . S →s i z e ( ) + s e l f . T →s i z e ( ) This invariant requires that the constraint is connected to exactly one Variable via a Restricts edge with role conn (connector). This variable has to specify the meta-class EDGE, i.e. it must query edges from the database. In addition, either a unique source variable (role src), or an unique target variable (role trg), or both, have to be given. An assignment of graph entities to a Pattern’s Variables not violating any Constraints is called a Match. Matches are instantiated by the language implementation according to the given Pattern and the contents of the graph database. As specified by the class diagram, each Match holds a (possibly empty) set of Assignments, each of which points to a Variable and its corresponding value. In addition, Matches have to comply to the following invariants. c o n t e x t A s s i g n m e n t i n v : v a l i d i t y = ( s e l f . v a r i a b l e . s o r t = V a r i a b l e S o r t . NODE i m p l i e s s e l f . v a l u e . o c l I s T y p e O f ( Node ) ) and ( s e l f . v a r i a b l e . s o r t = V a r i a b l e S o r t . EDGE i m p l i e s s e l f . v a l u e . o c l I s T y p e O f ( Edge ) ) and [ . . . ] The validity invariant requires that each Assignment relates Variables to proper entities in the database. Therefore, i.e. a Variable of sort EDGE may only be related to an Edge in the database. c o n t e x t Match i n v : c o m p l e t e n e s s = l e t V : C o l l e c t i o n ( V a r i a b l e ) = s e l f . p a t t e r n . c o n t a i n s→s e l e c t ( o c l I s K i n d O f ( C o n s t r a i n t ) )→c o l l e c t ( c | c . v a r i a b l e ) i n s e l f . a s s i g n m e n t s→c o l l e c t ( a | a . v a r i a b l e )→ i n c l u d e s A l l ( V ) i n v : u n i q u e n e s s = s e l f . a s s i g n m e n t s→f o r A l l ( a1 | s e l f . a s s i g n m e n t s→f o r A l l ( a2 | a1 . v a r i a b l e = a2 . v a r i a b l e i m p l i e s a1 = a2 ) ) 5 / 14 Volume 10 (2008) Extending Graph Query Languages by Reduction i n v : c o r r e c t n e s s = s e l f . p a t t e r n . c o n t a i n s→s e l e c t ( o c l I s K i n d O f ( C o n s t r a i n t ) )→f o r A l l ( c | c . f u l F i l l e d ( s e l f ) ) Besides the Assignments’ validity, a Match has to be complete, unique, and correct. • For completeness, an Assignment has to exist for all Variables which are referred to by any of the Pattern’s Constraints. Hence, all restricted Variables must have a value assigned. • The uniqueness invariant demands that each Match holds at most one Assignment for each Variable. This restriction eases the definition of Constraints. • correctness means that a Match fulfills every Constraint of its Pattern. Fulfilledness is defined depending on the respective Constraint’s type (see below). c o n t e x t T y p e C o n s t r a i n t d e f : f u l F i l l e d (m: Match ) : B o o l e a n = s e l f . v a r i a b l e→ f o r A l l ( v | m . a s s i g n m e n t s→s e l e c t ( a | v = a . v a r i a b l e ) . v a l u e . t y p e = s e l f . r e q T y p e ) c o n t e x t I n c i d e n c e C o n s t r a i n t d e f : f u l F i l l e d (m: Match ) : B o o l e a n = l e t c = m . a s s i g n m e n t s→s e l e c t ( a | C = a . v a r i a b l e ) . v a l u e i n ( S →i s E m p t y ( ) o r m . a s s i g n m e n t s→s e l e c t ( a | S = a . v a r i a b l e ) . v a l u e = c . s o u r c e ) and ( T →i s E m p t y ( ) o r m . a s s i g n m e n t s→s e l e c t ( a | T = a . v a r i a b l e ) . v a l u e = c . t a r g e t ) A TypeConstraint is fulfilled iff the values of all attached variables are of the type demanded by its reqType attribute. This definition does not consider any type hierarchy. The IncidenceCon- straint demands that the edge assigned to the connector variable (the singleton collection C ) is the source resp. the target of the corresponding variables. This restriction only applies if an according variable is connected to the constraint. The presented invariants (partially) define the validity of Matches, but do not state how such an assignment can be computed. Language implementations therefore need to provide an opera- tional implementation of these invariants. 4 Extending the Query & Transformation language The core language defined in the previous section allows to model queries using a basic set of language constructs. This section introduces a technique to add additional constructs to the language, e.g. to represent special semantics of a high-level language. As example, the Type- Constraint mentioned above is extended to support type inheritance. This is achieved by adding an additional constraint to the language’s meta-model, and by reducing its intended semantics to those of existing constraints. 4.1 Type-level reasoning The reduction of constraints usually requires to reason on the entities’ types and their relations. For this purpose, we added a mechanism which reflects the database graph schema into the run- time graph, as shown in Figure 4. On the left side (Figure 4a), the standard situation using sep- arate instance and schema models is shown. Dashed arrows indicate an entities’ type. However, Proc. GT-VMT 2008 6 / 14 ECEASST Instances Schema A B e B' B'' (a) Standard instance graph & schema Instances Schema A B e name= “B“ name= “A“ name= “B''“ name= “e“ Legend Edge Instance Node Instance represents instance of Node Class Edge Class Reflection Graph B' B'' name= “B'“ instanceof EdgeClass superclass (b) Explicit type relation using reflection Figure 4: Reflection graph to query types the QTM is not able to traverse this relation or examine the entities’ types. Therefore, Figure 4b reflects the graph schema into the runtime data as special Reflection Graph. Node classes and edge classes are represented by nodes in this graph, with attributes storing the types’ names. Edges model the inheritance relations. Additional edges connect entities of the regular instance graph to nodes representing their types in the Reflection Graph. The QTM can therefore traverse and an- alyze this graph in the same way as regular instance graphs are handled. For the sake of clarity, some represents and instance of lines are omitted in the figure. 4.2 Basic constraint reduction To revive the initial example, Figure 5 (left side) shows an additional SubTypeConstraint used to check an entities type compatibility. Just like the regular type constraint, it receives the type’s name by the reqType attribute. In order to evaluate this constraint based on the core language, it is reduced to the query on the right. Node variable h1 corresponds to the original variable. An IncidenceConstraint is used to traverse the instanceof relation, as checked by the TypeConstraint of edge variable h2 . The value of h3 is a node in the reflection graph representing the entities class. Node Var SbType Constr reqType=“B“ Node Var Incid Constr src trg Node Var 3 Edge Varconn Type Constr reqType= “instanceof“ 2 IClsr Constr src trg Node Var 4 Attr Constr expression= “name=B“ name 1 reqType= “superclass“ Figure 5: Definition of the SubTypeConstraint From variable h3 , a so-called IncidenceClosureConstraint traverses an arbitrary (including zero) 7 / 14 Volume 10 (2008) Extending Graph Query Languages by Reduction number of edges, just like the Kleene star operator does for regular expressions. In contrast to the IncidenceConstraint, this constraint is not connected to any edge variable, as an unknown number is traversed during pattern matching. To restrict the traversed edges to a certain type, the IncidenceClosureConstraint expects an edge class passed as value of the reqType attribute. In this case, the type superclass is given, whose instances model inheritance relations in the reflection graph. According to this relation, entities assigned to the target variable h4 are again nodes of the reflection graph representing classes. Another AttributeConstraint checks the respective class name, only retaining the class named B as valid assignment. As result of this transition, the SubTypeConstraint is fulfilled iff the pattern on the right of Fig- ure 5 is fulfilled. For variable h1 , the value’s type h3 is retrieved, and all reachable supertypes are checked whether they carry the requested name B. Variables h3 and h4 may get the same entity assigned, so the case that the value of h1 is an instance of class B is covered, too. Furthermore, the reachability check also supports multiple inheritance offered by DRAGOS. The replacement shown in Figure 5 can be expressed easily by a graph transformation rule. This replacement rule is run in a pre-processing phase before invoking the resulting query. 4.3 Nested pattern matching The previous subsection demonstrated a simple conversion rule to replace extended language constructs by basic ones. The proposed QTL additionally allows to replace parts of a rule recur- sively, which is necessary to define the IncidenceClosureConstraint used above. Our approach for recursive replacements is based on the idea of nested queries, which is presented in the following. On the syntactic level, the QTL meta-model is extended by adding Pattern to the subclasses of PatternElement (c.f. Figure 3), so that its instances may contain other patterns. Furthermore, class Match gains a reflexive association to model nested matches. For all matches, this relation has to be coherent with their respective patterns’ nesting. This condition implies that a child pattern is evaluated only if a match of its parent pattern exists. Semantically, nested patterns are matched independently from each other if constraints only refer to variables of the same pattern. The resulting set of matches (if “joining” assignments of parent and child matches) is the cross-product of matches of non-nested patterns. However, there are two possible interactions between parent and child patterns: Firstly, constraints can restrict variables of child patterns. As the child pattern’s variable is not bound when checking fulfilled- ness of the parent pattern, such constraints cannot be verified. Fulfilledness of the constraint’s pattern therefore only demands that no constraint is violated, thus allowing unevaluable con- straints to persist. In addition, a pattern is matched only if no constraint of any ancestor pattern is violated by its variable assignments. Secondly, variables can be restricted by constraints of child patterns. Here, the common conditions for non-nested queries suffice, demanding fulfilled- ness (more generally non-violatedness) of a pattern’s constraints. However, references between entities of sibling patterns are forbidden to keep matches independent of each other. A final aspect on nested queries that needs to be addressed here is the processing of the re- sulting match structure. As result, we determine the validity of a match with regards to its child matches. From the application’s point of view, an invalid match is treated as non-existent. Match validity can be specified w.r.t. two criteria: The pattern condition ensures that a match contains appropriate child matches for a specific child pattern. One usage of this condition is to reason on Proc. GT-VMT 2008 8 / 14 ECEASST the number of these matches, e.g. at most zero matches to model negative application conditions. In the following, nested patterns are treated according to the intuitive at least one cardinality. An- other approach is the group condition, which specifies the treatment of distinct child patterns (if any, otherwise the condition is true). Here, e.g. a boolean operator such as ∨ or ∧ can be applied on the pattern conditions’ results. In the following, we assume an ∨ condition, so that at least one match for at least one child pattern has to be found. Figure 6a shows a nested pattern searching for paths of length 0 or 1. The outer variables are assumed to be bound before, in surrounding parent pattern. Pattern h1 contains a single IsomorphismConstraint. As only the outer variables are bound when searching for matches ofh1 , this constraint is always fulfilled. Therefore, a single match without any assignments exists for this pattern. The inner pattern h2 checks whether the outer variables have the same value assigned, which represents a path of length 0. In contrast, pattern h3 traverses an edge (according variable and type check are omitted here), and checks whether the reached node equal the outer- right variable’s value. The IsomorphismConstraint of h1 requires that the target node is not identical to the outer-left variable’s value, to exclude reflexive edges. Processing rules discussed above state that at least one of these nested patterns need to be matched to obtain a valid match for h1 . 4.4 Recursive constraint reduction Although nested queries allow to express alternative patterns, they can only be used to check a limited number of variables. Usually, this number cannot be given in advance, e.g. the Incidence- ClosureConstraint requires to check for paths of an arbitrary length. The only, albeit impossible, solution would be the specification of an infinite number of patterns. Therefore, we apply a mechanism for recursive expansion of queries at runtime. The language’s meta-model is extended by a PatternReference class, which references a pattern defined by the developer. References are replaced by the corresponding pattern when its con- tainer pattern is matched successfully. Recursion is achieved by copying a pattern into itself, also copying the reference being expanded. If multiple references exist within the same pattern, their order of replacement is undefined. However, consistency conditions introduced below ensure that the result is indeed independent of this order. Furthermore, reference expansion should be guaranteed to terminate in recursive situations. Although this property is not ensured directly, expansion can only occur whenever a pattern is matched. Therefore, termination of reference expansion is given if only a finite number of patterns can be matched. Obviously, this can be achieved by an IsomorphismConstraint limiting at least one variable per pattern to an entity not assigned to other variables. Therefore, finiteness of the host graph implies finiteness of matched patterns and expansion steps. The actual application of pattern references is introduced by referring to the IncidenceClosure- Constraint. Figure 6b shows a variant of the nested pattern introduced above. In contrast to Figure 6a, pattern h3 does not contain an own IdentityConstraint to check the connectedness of the path ends. Instead, two PatternReferences are given: The upper one refers to pattern h2 , which means that this pattern is copied into pattern h3 if the latter can be matched. Furthermore, the lower reference copies pattern h3 into itself. Reference expansion is conducted as follows: Each PatternReference is replaced by a new pat- tern created inside the reference’s container, and filled with copies of the referenced pattern’s 9 / 14 Volume 10 (2008) Extending Graph Query Languages by Reduction Ident Constr Node Var Node Var Incid Constr src trg Node Var Ident Constr 1 2 3 PatternPatternPattern Iso Constr (a) Nested patterns checking length 0 and 1 Ident Constr Node Var Node Var Incid Constr src trg Node Var 1 2 3 Iso Constr ReferenceReference  (b) Recursive pattern, initial state Ident Constr Node Var Node Var Incid Constr trg Node Var 1 2 3 Iso Constr src   (c) External links and according mappings marked Ident Constr Node Var Node Var Incid Constr src trg Node Var 1 2 3 Iso Constr Incid Constr src trg Node Var 4 Ident Constr 5   (d) After 1st replacement Figure 6: Patterns for incidence closure entities. This covers the entities’ types, attribute values, and connectedness to other copied en- tities. However, the question remains how the copied pattern’s context should be handled. This context is defined by the edges connecting its contained entities to entities not contained in the pattern being copied, the so-called external links. Figure 6c highlights the external links of Fig- ure 6b for both copied patterns. Here, this concerns Restricts edges (four times), but also the pattern referred to by the upper PatternReference. For each pattern reference, the developer has to specify a mapping of entities connected to external links, relating them to entities that should be connected to the referred pattern. Identical mapping of an element to itself is a valid choice. Mappings are copied along with other pattern entities during reference expansion, which is required for recursive expansions. In order to achieve the desired replacement in case of the IncidenceClosureConstraint, the fol- lowing mappings are required (c.f. broad arrows in Figure 6c): • The upper reference copies pattern h2 into pattern h3 to check value-identity of the outer- right variable and the variable of h3 . Therefore, the outer-left variable referenced by h2 is mapped to the variable of h3 , whereas the outer-right variable is mapped identically. • Expansion of the lower reference should yield a query for path of length 2. Therefore, the same mapping of the outer-left variable to the variable of h3 applies here, such that the IncidenceConstraint of the copy of pattern h3 refers to the original’s variable as source. • The traversed node should not have been visited before, so all node variables are connected to the IsomorphismConstraint of h1 , which is mapped identically for this purpose. This constraint ensures termination of the replacement, as discussed above. Proc. GT-VMT 2008 10 / 14 ECEASST • A last external link of the lower reference is the pattern referred to by the upper reference. Here, pattern h2 is mapped to its reference, such that the copied reference will refer to the expanded upper reference of h3 . In this case, identical mapping would lead to broken copied mappings in later expansion steps, if the lower reference is expanded first. Using these mappings, expanding both references yields the pattern structure shown in Figure 6d. Expanding the upper reference results in h5 , whereas the lower one is expanded to h4 . The resulting query checks for paths of length 0 by matching h1 and h2 , and 1 by matching h1 , h3 , and h5 , respectively. Paths of length 2 can be found after the next step, using h1 , h3 , h4 , and the expanded reference to h5 . This section showed how complex or application-specific language constructs (represented by constraints) can be reduced to basic ones. With the presented nested query mechanism, recur- sive expression can be captured as well. Although its evaluation might be inefficient, it serves as the guideline for implementing the QTL. This is required by the fact that the actual storage backend of DRAGOS is exchangeable, and so is the implementation of its language. As dis- cussed in [Wei08], such implementations may either rely on the DRAGOS core graph model, or convert rules into a backend-specific format. e.g. SQL statements. To provide an efficient imple- mentation, language extensions might also be converted into such a backend-specific language. The modeled reduction rules in this case serve as the formal definition and as reference used in test-based validation of the specific implementations. 5 Related Work In contrast to previous publications on DRAGOS [Böh04] and the according QTL [Wei08], this paper focusses on the language’s extensibility. In this section, we give a brief comparison to other research in the area of graph transformations. Graph transformations based on constraint satisfaction. The QTL is based on the theory of constraint satisfaction problems (CSP) known from the area of artificial intelligence. CSPs are well-suited to model graph pattern matching by solving the subgraph-isomorphism problem [LV02]. In our work, we aim to implement the QTL based on existing systems, and therefore extensive development of a basic constraint solver is not of crucial importance. This would only improve the generic implementation based on the core graph model, which should be considered as fallback solution only. Instead, we focus on implementations based on sophisticated storage backends like databases. CSP-like representations of graph transformation rules have also been applied in [HVV07], where search-plan optimization is discussed for such a rule model. As this approach is not concerned with the evaluation of expressions, dynamic aspects such as matches need not to be considered. In contrast, our approach also incorporates matches to model the result of a query. Furthermore, the cited work includes negative application conditions directly into the language, whereas NACs are treated as special case of nested patterns in our QTL. Graph transformations on relational databases. Implementing GTS on established rela- tional databases has been presented initially by the authors of [VFV06]. Basically, the authors 11 / 14 Volume 10 (2008) Extending Graph Query Languages by Reduction transform a graph schema to a set of database relations, and implement pattern matching by de- riving views on these tables. One difference to our approach is the applied meta-level (M 1), as the DRAGOS graph model constitutes a common meta model for all applications (thus M 2). Furthermore, we apply the basic idea of generating SQL code in a language-independent envi- ronment, with the QTL forming a common basis. The separation between variables, constraints, and operators applied in the QTL is indeed closer to the SQL than traditional graph languages considered by [VFV06], which simplifies the translation process for us. The authors also mention a specialized query optimizer developed for the applied relational databases, which, unfortunately, is not discussed any further. We agree that an optimizer special- ized on graph queries is indeed necessary. Inspection of the internal search plans of the database backend showed that the standard optimizer already prefers table joins with small result sets, e.g. traversing edges instead of global searches over the graph. However, the order of edge traver- sal is not optimized effectively, which causes inefficient behavior for large queries. To relieve this drawback, a specialized query optimizer should adapt results from search-plan driven code generation found in common graph transformation tools. Graph transformations for visual programming. Graph transformation languages like PRO- GRES provide similar functionality like the DRAGOS QTL. However, the latter has been de- signed as base layer of specialized graph languages, and therefore especially focuses on two aspects: First, the language has provide common functionality, yet being extensible, as discussed in Section 1. In contrast, the architecture of PROGRES has not been designed for extensibility at all. Second, the provided platform has to be able to represent and translate specialized lan- guages. The QTM meets this requirement by applying graph-oriented data storages not only for the runtime data, but also for representing rules modeled using specialized languages and there counterparts in the QTL. In contrast to common graph transformation languages, the low-level DRAGOS QTL is not feasible for direct use by developers. Therefore, it should not be considered as competitor to existing languages, but as a common core for existing and new languages to build on. 6 Conclusion In this paper, we introduced the QTL currently being developed for the DRAGOS graph database. This language especially focuses on extensibility, which is the core aspect of this publication. Developers may choose to add new constructs to the language in case existing ones do not suffice the application’s needs or do not match its semantics. These are implemented by reduction to existing ones, also allowing recursive substitutions. In addition, language constructs may be converted into a storage-specific query such as SQL statements. The presented work is fully implemented based on the DRAGOS graph model interface, des- ignated as generic implementation in [Wei08]. Currently, we are working on an SQL-based solution. Interesting problems remain in the recursive evaluation of queries, which cannot be expressed directly in many database systems2. Upon completion, we will conduct performance 2 Altough recursive SELECT statements are defined by SQL3, support is optional and obviously not very popular. Proc. GT-VMT 2008 12 / 14 ECEASST evaluations comparing the QTL to DRAGOS applied in the code generation approach. Further- more, comparisons to other graph transformation solutions based on databases are of interest. Currently, we are embedding support for control flow into the language definition and its generic implementation. Core features of this mechanism include hierarchical rule composition, optional dataflow and rule invocation. Rule application strategies will allow non-deterministic and random (with or without backtracking) processing of multiple matches. Using this mecha- nism, rules can be combined to complex graph transformation systems. As next step, we will support the development of specialized graph languages on top of the QTL. For this purpose, a set of parser specifications are being developed to import textual doc- uments from graph languages into the DRAGOS database. Currently, we are working on sup- port for the general-purpose rewriting language PROGRES and a derivate of the query language GReQL [KW99]. To enact the corresponding specifications, we also develop a transformation language to translate them to the QTL. Bibliography [Böh04] B. Böhlen. Specific Graph Models and Their Mappings to a Common Model. In Pfaltz et al. (eds.), 2nd Intl. Workshop on Applications of Graph Transformations with In- dustrial Relevance, (AGTIVE). Lect. Notes in Comp. Sci. 3062, pp. 45–60. Springer, 2004. [BW02] A. D. Brucker, B. Wolff. A Proposal for a Formal OCL Semantics in Isabelle/HOL. In Muñoz et al. (eds.), Theorem Proving in Higher Order Logics. Lect. Notes in Comp. Sci. 2410, pp. 99–114. Springer, Hampton, VA, USA, 2002. [HVV07] Á. Horváth, G. Varró, D. Varró. Generic Search Plans for Matching Advanced Graph Patterns. In Ehrig and Giese (eds.), Graph Transformation and Visual Modeling Tech- niques, 6th Intl. Workshop. Elec. Comm. of the EASST 6, pp. 57–68. 2007. [HWS00] R. Holt, A. Winter, A. Schürr. GXL: Towards a Standard Exchange Format. In Proc. of the 7th Working Conference on Reverse Engineering (WCRE). Pp. 162–171. IEEE Computer Society Press, 2000. [KW99] B. Kullbach, A. Winter. Querying as an enabling technology in software reengineer- ing. In Proc. of the 3rd Europ. Conf. on Software Maintenance and Reengineering. Pp. 42–50. IEEE Computer Society Press, 1999. [LS88] C. Lewerentz, A. Schürr. GRAS, a Management System for Graph-Like Documents. In Proc. of the 3rd International Conference on Data and Knowledge Bases. Pp. 19– 31. Morgan Kaufmann, 1988. [LV02] J. Larrosa, G. Valiente. Constraint Satisfaction Algorithms for Graph Pattern Match- ing. Mathematical Structures in Computer Science 12(4):403–422, 2002. [SNZ08] A. Schürr, M. Nagl, A. Zündorf (eds.). Proc. of the 3rd Intl. Workshop on Applications of Graph Transformation with Industrial Relevance (AGTIVE). Lect. Notes in Comp. Sci. 5088. Springer, 2008. 13 / 14 Volume 10 (2008) Extending Graph Query Languages by Reduction [SWZ99] A. Schürr, A. J. Winter, A. Zündorf. The PROGRES Approach: Language and En- vironment. In Ehrig et al. (eds.), Handbook on Graph Grammars and Computing by Graph Transformation: Applications, Languages, and Tools. Pp. 487–550. Volume 2. World Scientific, 1999. [VFV06] G. Varró, K. Friedl, D. Varró. Implementing a Graph Transformation Engine in Rela- tional Databases. Journal on Software and Systems Modeling 5(3):313–341, 2006. [Wei08] E. Weinell. Adaptable Support for Queries and Transformations for the DRAGOS Graph-Database. Pp. 369–411 in [SNZ08]. Proc. GT-VMT 2008 14 / 14 Introduction Graph database DRAGOS Queries & Transformations for DRAGOS Example Query Syntax & Semantics Extending the Query & Transformation language Type-level reasoning Basic constraint reduction Nested pattern matching Recursive constraint reduction Related Work Conclusion