
Textual Modeling Languages
Slides 4-31 and 38-40 of this lecture are reused from the Model
Engineering course at TU Vienna with the kind permission of
Prof. Gerti Kappel (head of the Business Informatics Group)
24 May 2012 | Real-Time Systems Lab | Prof. Dr. Andy Schürr | Dr. Gergely Varró | 1
Workflow Overview
 Text comprehension + transformation
 Text
 Lexing
 Token stream (tokens = lexical units; comments and whitespace removed)
 Parsing
 Concrete syntax tree (syntactical units, e.g., for loop, if statement)
 Abstraction
 Abstract syntax tree (AST)
 Transformation
 Model
 Editor specification
 How to provide useful feedback to the user in a GUI
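The lexing step can be sketched in a few lines of Python. This is only an illustration under assumptions: the token kinds (COMMENT, WS, ID, SYMBOL) and the function name lex are invented for this example and are not taken from any tool mentioned in the lecture.

```python
import re

# Token kinds for a toy language; the names are illustrative assumptions.
TOKEN_SPEC = [
    ("COMMENT", r"/\*.*?\*/"),               # recognised, then dropped
    ("WS",      r"\s+"),                     # recognised, then dropped
    ("ID",      r"[A-Za-z_][A-Za-z_0-9]*"),
    ("SYMBOL",  r"\[\]|[{}:]"),
]
MASTER = re.compile("|".join(f"(?P<{k}>{p})" for k, p in TOKEN_SPEC), re.DOTALL)


def lex(text):
    """Turn raw text into (kind, lexeme) tokens, skipping comments and whitespace."""
    tokens, pos = [], 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup not in ("COMMENT", "WS"):
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

The resulting token stream is what a parser would consume in the next step of the workflow.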
Table of Contents
 EBNF
 XText
 ANTLR + M2M transformation
 Online demo
Language Specification Basics
 Extended Backus-Naur-Form (EBNF)
 Originally introduced by Niklaus Wirth to specify the syntax of Pascal
 In general, EBNF can be used to specify any context-free grammar
 ISO standard (ISO/IEC 14977)
 Fundamental assumption: A text consists of a sequence of terminal
symbols (visible characters).
 EBNF specifies all valid terminal symbol sequences using a compact and
finite representation: the grammar
 Grammar
 Terminal symbols (lexical units)
 Non-terminal symbols (syntactical units)
 Starting non-terminal symbol (root of the syntax tree)
 Production rules consist of a left side non-terminal and a right side (valid
symbol sequences)
EBNF
 Production rules consist of:
 Terminal
 NonTerminal
 Choice
 Optional
 Repetition
 Grouping
 Comment
…
Example: Entity DSL
 Language constructs
 Arbitrary character sequences
 Keywords
 Separation characters
 Scope borders
 References
type String
type Boolean
entity Conference {
property name : String
property attendees : Person[]
property speakers : Speaker[]
}
entity Person {
property name : String
}
entity Speaker extends Person {
}
Example: Entity DSL II.
 EBNF grammar
Model := Type*;
Type := SimpleType | Entity;
SimpleType := 'type' Identifier;
Entity := 'entity' Identifier
('extends' Identifier)? '{' Property* '}';
Property := 'property' Identifier ':'
Identifier ('[]')?;
Identifier := ('a'..'z'|'A'..'Z'|'_')
('a'..'z'|'A'..'Z'|'_'|'0'..'9')*;
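To see how such a grammar drives a parser, the following is a minimal hand-written recursive-descent recognizer with one method per production rule. It is a sketch, not generated code. Two assumptions go beyond the grammar: keywords are not accepted as identifiers, and identifiers may have any length. The names tokenize, Recognizer and is_valid are invented for this example.

```python
import re

TOKEN = re.compile(r"\s*(\[\]|[{}:]|[A-Za-z_][A-Za-z_0-9]*)")
KEYWORDS = {"type", "entity", "extends", "property"}  # assumption: reserved words


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class Recognizer:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, literal):
        if self.peek() != literal:
            raise SyntaxError(f"expected {literal!r}, found {self.peek()!r}")
        self.pos += 1

    def identifier(self):
        tok = self.peek()
        if tok is None or tok in KEYWORDS or not (tok[0].isalpha() or tok[0] == "_"):
            raise SyntaxError(f"expected identifier, found {tok!r}")
        self.pos += 1

    def model(self):                 # Model := Type*;
        while self.peek() is not None:
            self.type_()

    def type_(self):                 # Type := SimpleType | Entity;
        if self.peek() == "type":    # SimpleType := 'type' Identifier;
            self.expect("type")
            self.identifier()
        else:
            self.entity()

    def entity(self):                # Entity := 'entity' Identifier ('extends' Identifier)? '{' Property* '}';
        self.expect("entity")
        self.identifier()
        if self.peek() == "extends":
            self.expect("extends")
            self.identifier()
        self.expect("{")
        while self.peek() == "property":
            self.property_()
        self.expect("}")

    def property_(self):             # Property := 'property' Identifier ':' Identifier ('[]')?;
        self.expect("property")
        self.identifier()
        self.expect(":")
        self.identifier()
        if self.peek() == "[]":
            self.expect("[]")


def is_valid(text):
    """Return True if text is a sentence of the Entity DSL grammar."""
    try:
        Recognizer(tokenize(text)).model()
        return True
    except SyntaxError:
        return False
```

The recognizer only answers yes or no. Building a syntax tree would mean returning a node from each rule method instead of just advancing the position.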
Example: Entity DSL III.
 EBNF-to-Ecore mapping
Model := Type*;
Type := SimpleType | Entity;
SimpleType := 'type' Identifier;
Entity := 'entity' Identifier
('extends' Identifier)? '{' Property* '}';
Property := 'property' Identifier ':'
Identifier ('[]')?;
Identifier := ('a'..'z'|'A'..'Z'|'_')
('a'..'z'|'A'..'Z'|'_'|'0'..'9')*;
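One possible target of such a mapping, written as Python dataclasses instead of Ecore classes. The feature names (elements, name, extends, properties, type, many) follow the Xtext grammar shown later in this lecture. Expressing the metamodel in Python is an assumption made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Type:                      # abstract supertype for the rule Type
    name: str


@dataclass
class SimpleType(Type):
    pass


@dataclass
class Property:
    name: str
    type: Optional[Type] = None  # non-containment reference to a Type
    many: bool = False           # True when the property is declared with '[]'


@dataclass
class Entity(Type):
    extends: Optional["Entity"] = None                         # non-containment reference
    properties: List[Property] = field(default_factory=list)   # containment


@dataclass
class Model:
    elements: List[Type] = field(default_factory=list)         # containment
```

Containment (elements, properties) mirrors the nesting of the text, while extends and type point to objects that are contained elsewhere in the model.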
EBNF vs. Ecore
 EBNF
+ Specifies concrete syntax
+ Linear order of elements
– No reusability
– Only containment relationships
 Ecore
+ Reusability by inheritance
+ Non-containment references
+ Predefined data types and user-defined enumerations
~ Specifies only abstract syntax
 Conclusion
 A meaningful EBNF cannot be generated from a metamodel and vice versa!
 Challenge
 How to overcome the gap between these two worlds?
Solution Overview
 Generic Syntax
 Like XML
 Metamodel is sufficient, i.e., no concrete syntax is needed
 For instance: HUTN (OMG Standard) - www.omg.org/spec/HUTN
 Metamodel First!
 Step 1: Specify metamodel
 Step 2: Specify textual syntax additionally
 For instance: TCS (Eclipse Plug-in) - www.eclipse.org/gmt/tcs
 Grammar First!
 Step 1: Syntax is specified by a grammar (concrete syntax & abstract syntax)
 Step 2: Metamodel is derived from step 1
 For instance: Xtext (Eclipse Plug-in) - www.eclipse.org/Xtext
 Xtext meanwhile also supports importing existing metamodels
 Separate metamodel and grammar
XText Introduction
 XText was originally part of openArchitectureWare (oAW); it is now a separate
project
 Used to develop textual domain specific languages
 Grammar definition similar to EBNF, but with extra features for metamodel
generation
 Creates metamodel, parser, and editor from grammar definition
 Editor supports syntax check, highlighting, and code completion
 Context-sensitive constraints on the grammar described in OCL-like language
XText Workflow Overview
[Figure: Xtext workflow within oAW. Artifacts: Grammar, Constraints, Metamodel, Model. Components: Xtext Parser (textual DSLs), Editor, Xpand (M2C), Xtend (M2M).]
XText Grammar I.
 Xtext grammar similar to EBNF
 Extended by
 Object-oriented concepts
 Information necessary to derive the metamodel
 Editor
 Example
A : (type = B);
[Figure: derived metamodel with class A holding a reference "type" (multiplicity 0..1) to class B]
XText Grammar II.
 Terminal rules
 Similar to EBNF rules
 Return value is String by default
 EBNF expressions
 Cardinalities
 Zero to one: ?
 Zero to many: *
 One to many: +
 Character range: '0'..'9'
 Wildcard: 'f'.'o'
 Until Token: '/*' -> '*/'
 Negated Token: '#' (!'#')* '#'
 Predefined rules
 ID, String, Int, URI
XText Grammar III.
terminal ID : ('^')?('a'..'z'|'A'..'Z'|'_')
('a'..'z'|'A'..'Z'|'_'|'0'..'9')*;
terminal INT returns ecore::EInt : ('0'..'9')+;
terminal ML_COMMENT : '/*' -> '*/';
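For readers who know regular expressions better than terminal rules, these three terminals correspond roughly to the Python patterns below. The equivalence is an approximation for illustration, and int_value is a hypothetical helper that mimics the "returns ecore::EInt" conversion.

```python
import re

# Rough regex counterparts of the three terminal rules (assumed equivalences).
ID = re.compile(r"\^?[A-Za-z_][A-Za-z_0-9]*")     # optional '^', then letter/underscore, then letters/digits
INT = re.compile(r"[0-9]+")                       # ('0'..'9')+
ML_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)  # '->' (until token): stop at the first '*/'


def int_value(lexeme: str) -> int:
    """Mimic 'returns ecore::EInt': convert the matched text to an integer."""
    if INT.fullmatch(lexeme) is None:
        raise ValueError(f"not an INT: {lexeme!r}")
    return int(lexeme)
```

A terminal without a returns clause, such as ID, simply yields the matched text as a string.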
XText Grammar IV.
 Type rules
 For each type rule a class is generated in the metamodel
 Class name corresponds to rule name
 Type rules contain
 Terminals -> Keywords
 Assignments -> Attributes or containment references
 Cross References -> Non-containment references
…
 Assignment Operators
 Single-valued feature: =
 Multi-valued feature: +=
 Boolean features: ?=
 Enum rules
 Map Strings to enumeration literals
XText Grammar V.
 Assignment
State : 'state' name = ID (transitions += Transition)* 'end';
 Cross References
Transition : event = [Event] '=>' state = [State];
 Enum rules
enum ChangeKind : ADD | MOVE | REMOVE;
enum ChangeKind :
ADD = 'add' | ADD = '+' |
MOVE = 'move' | MOVE = '->' |
REMOVE = 'remove' | REMOVE = '-';
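The second enum rule maps several concrete strings to the same literal. In plain Python the same idea could look like the sketch below. The table LITERALS and the function parse_change_kind are names invented for this illustration.

```python
from enum import Enum


class ChangeKind(Enum):
    ADD = "ADD"
    MOVE = "MOVE"
    REMOVE = "REMOVE"


# Several concrete spellings map to one enumeration literal,
# mirroring "ADD = 'add' | ADD = '+'" in the enum rule.
LITERALS = {
    "add": ChangeKind.ADD, "+": ChangeKind.ADD,
    "move": ChangeKind.MOVE, "->": ChangeKind.MOVE,
    "remove": ChangeKind.REMOVE, "-": ChangeKind.REMOVE,
}


def parse_change_kind(text: str) -> ChangeKind:
    try:
        return LITERALS[text]
    except KeyError:
        raise ValueError(f"not a ChangeKind literal: {text!r}") from None
```

The abstract syntax only sees ChangeKind.ADD, regardless of whether the user typed add or +.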
XText Tooling I.
 Xtext Blueprint Projects
[Screenshot callouts: grammar definition, code generator, IDE functionality]
XText Tooling II.
 Xtext Grammar Project
[Screenshot callouts: workflow file for generating the DSL editor; properties (file extension, …); grammar definition]
XText Tooling III.
 Xtext Grammar Definition
[Screenshot callouts: grammar name, default terminals (ID, STRING, …), metamodel URI]
XText Tooling IV.
 Xtext Grammar Definition for State Machines
XText Tooling V.
 Generated Ecore-based Metamodel
XText Tooling VI.
 Generated DSL Editor
[Screenshot callouts: outline view, error marker, error description, code completion (Ctrl+Space), highlighting of keywords]
Entity DSL Example
EBNF Grammar
Model := Type*;
Type := SimpleType | Entity;
SimpleType := 'type' Identifier;
Entity := 'entity' Identifier
('extends' Identifier)? '{' Property* '}';
Property := 'property' Identifier ':'
Identifier ('[]')?;
Identifier := ('a'..'z'|'A'..'Z'|'_')
('a'..'z'|'A'..'Z'|'_'|'0'..'9')*;
Example Model
type String
type Boolean
entity Conference {
property name : String
property attendees : Person[]
property speakers : Speaker[]
}
entity Person {
property name : String
}
entity Speaker extends Person {
}
EBNF to XText Transition
EBNF Grammar
Model := Type*;
Type := SimpleType | Entity;
SimpleType := 'type' Identifier;
Entity := 'entity' Identifier
('extends' Identifier)? '{' Property* '}';
Property := 'property' Identifier ':'
Identifier ('[]')?;
Identifier := ('a'..'z'|'A'..'Z'|'_')
('a'..'z'|'A'..'Z'|'_'|'0'..'9')*;
XText Grammar
grammar MyDsl with org.eclipse.xtext.common.Terminals
generate myDsl "http://MyDsl"
Model : elements += Type*;
Type : SimpleType | Entity;
SimpleType : 'type' name = ID;
Entity : 'entity' name = ID
('extends' extends = [Entity])?
'{' properties += Property* '}';
Property : 'property' name = ID ':'
type = [Type](many ?= '[]')?;
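The main change from EBNF is that an Identifier after 'extends' becomes a cross reference [Entity]: the parser reads only a name, which is resolved to the actual object afterwards. A two-pass sketch of this linking idea follows. It is a simplification, not Xtext's actual linker, and the function link and its input format are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Entity:
    name: str
    extends: Optional["Entity"] = None


def link(parsed: List[Tuple[str, Optional[str]]]) -> Dict[str, Entity]:
    """parsed holds (entity name, name after 'extends' or None) pairs in file order."""
    # Pass 1: create every entity, so that forward references can be resolved.
    entities = {name: Entity(name) for name, _ in parsed}
    # Pass 2: replace each referenced name by the object it denotes.
    for name, super_name in parsed:
        if super_name is None:
            continue
        if super_name not in entities:
            raise LookupError(f"cannot resolve reference to {super_name!r}")
        entities[name].extends = entities[super_name]
    return entities
```

Because all entities exist before any name is resolved, Speaker may extend Person even if Person is declared later in the file.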
Specifying Context Sensitive Constraints for
Textual DSLs I.
 Examples
 Entity names must start with an uppercase
character
 Entity names must be unique
 Property names must be unique within one
entity
 Answer
 Use the same techniques as for
metamodels!
XText Grammar
grammar MyDsl with org.eclipse.xtext.common.Terminals
generate myDsl "http://MyDsl"
Model : elements += Type*;
Type : SimpleType | Entity;
SimpleType : 'type' name = ID;
Entity : 'entity' name = ID
('extends' extends = [Entity])?
'{' properties += Property* '}';
Property : 'property' name = ID ':'
type = [Type](many ?= '[]')?;
Specifying Context Sensitive Constraints for
Textual DSLs II.
 Examples
1. Entity names must start with an uppercase character
2. Entity names must be unique
3. Property names must be unique within one entity
 Solutions
context myDsl::Entity
WARNING "Name should start with a capital":
name.toFirstUpper() == name;
context myDsl::Entity
ERROR "Name must be unique":
((Model)this.eContainer).elements.name.select(e|e == this.name).size == 1;
context myDsl::Property
ERROR "Name must be unique":
((Entity)this.eContainer).properties.name.select(p|p == this.name).size == 1;
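The same three rules can be expressed in plain Python over a simplified model, which may help when reading the constraint expressions. This is a sketch only: the input format (a list of entity name and property names pairs) and the function validate are assumptions, and each duplicated property name is reported once rather than once per property.

```python
from collections import Counter


def validate(entities):
    """entities: list of (entity_name, [property_names]); returns (severity, message) pairs."""
    issues = []
    entity_counts = Counter(name for name, _ in entities)
    for name, props in entities:
        # Rule 1 (warning): the name is unchanged by upper-casing its first character.
        if name[:1].upper() + name[1:] != name:
            issues.append(("WARNING", f"{name}: name should start with a capital"))
        # Rule 2 (error): entity names are unique within the model.
        if entity_counts[name] > 1:
            issues.append(("ERROR", f"{name}: name must be unique"))
        # Rule 3 (error): property names are unique within one entity.
        for prop, count in Counter(props).items():
            if count > 1:
                issues.append(("ERROR", f"{name}.{prop}: name must be unique"))
    return issues
```

An empty result means the model satisfies all three constraints.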
Specifying Context Sensitive Constraints for
Textual DSLs III.
 Constraints can be checked at different points, depending on their cost:
 On every edit operation: cheap constraints
 On every save operation: cheap to expensive constraints
 On every generation operation: very expensive constraints
Specifying Context Sensitive Constraints for
Textual DSLs IV.
 Code Generation Projects are automatically generated
 Code 2 Model Injector is used in the workflow file
<component class="org.eclipse.xtext.MweReader" uri="${modelFile}">
<!-- this class will be generated by the xtext generator -->
<register class="org.xtext.example.MyDslStandaloneSetup"/>
</component>
 Xpand templates can be developed as usual
«IMPORT myDsl»
«DEFINE main FOR Model»
«FOREACH this.elements.typeSelect(Entity) AS e-»
«FILE e.name + ".class"-»
class «e.name» ...
«ENDFILE»
«ENDFOREACH»
«ENDDEFINE»
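What the template does can be mimicked in a few lines of Python: select the entities and emit one file per entity. This is only an analogy to the Xpand template, not an Xpand feature. The function generate and the (kind, name) input format are assumptions.

```python
def generate(elements):
    """elements: list of (kind, name) pairs; returns {file_name: content}."""
    files = {}
    for kind, name in elements:
        if kind != "entity":                        # like typeSelect(Entity): skip simple types
            continue
        files[f"{name}.class"] = f"class {name} ...\n"   # like FILE ... ENDFILE
    return files
```

Returning a dictionary instead of writing to disk keeps the sketch free of side effects.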
XText: Standard Framework for EMF?
 Tight integration with Eclipse and oAW
 Powerful editors, constraint checks, code generation, model transformation, …
 Allows switching between EMF models and text-based artefacts
 Both are supported in Eclipse: Textual and Graphical Modeling!
 Allows customizing the editor through many extension points
 Pretty printing, code completion, …
 Strong community
Strengths and Weaknesses of XText
(+) Compact:
 Minimal effort for establishing "small/simple" DSLs
 EMF/Eclipse integration
 EBNF-like (easy to learn)
 Editor "for free"
[Figure labels: text comprehension, transformation, editor specification]
Strengths and Weaknesses of XText
(-) Mixing of concerns:
Text comprehension, actual transformation and editor specification are implicit and
woven tightly together in one specification with the following consequences:
 Bidirectionality is hard to support
 Impossible to reuse e.g., the transformation part with a different textual
comprehension part (e.g., for XML, regular expressions, directory structures, handwritten parsers, …).
 Complex specifications become difficult to maintain because both parts, the
text comprehension and the transformation, grow complex together.
Strengths and Weaknesses of XText
(-) Generality:
 Not all languages can be expressed (easily) with an Xtext grammar (mixing in Java is
not possible!)
 Target model is usually quite low-level. In general a metamodel first approach for an
arbitrary model requires a subsequent M2M transformation leading to a high
complexity of the complete transformation chain due to different languages.
 Fixed interpretation of EBNF
 Using existing or preferred (e.g., bottom-up) parsers is not possible
 No parts of the transformation can be reused if the modeling standard (Ecore) is
changed
Alternative "Tree-Based" Approach
Separate transformation into two distinct parts:
1. Text-to-tree (text comprehension) with a parser (e.g., generated from an
EBNF specification)
2. Tree-to-model (actual transformation or interpretation) with a model
transformation language (e.g., SDM, TGG)



 Step (1) can be achieved with just an EBNF specification and no extra effort.
 Step (2) involves (i) adding extra typing information to the homogeneous tree
structure from a parser, and (ii) deducing context-sensitive relationships to
yield a (regular) graph structure.
 Arbitrary metamodels can be targeted with increasing complexity of Step (2).
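The separation into two steps can be illustrated with a toy language that consists only of lines of the form "type X". Step 1 yields a homogeneous tree in which every node has the same shape, and step 2 adds typing information. Both functions are hand-written stand-ins: a real setup would use a generated parser for step 1 and a model transformation language such as SDM or TGG for step 2. All names here are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:                      # homogeneous tree: every node has the same shape
    name: str
    children: List["Node"] = field(default_factory=list)


def text_to_tree(text: str) -> Node:
    """Step 1 (text comprehension): build an untyped tree from lines 'type X'."""
    root = Node("FILE")
    for line in text.splitlines():
        words = line.split()
        if not words:
            continue
        if len(words) != 2 or words[0] != "type":
            raise SyntaxError(f"cannot parse line: {line!r}")
        root.children.append(Node("TYPE", [Node(words[1])]))
    return root


@dataclass
class SimpleType:                # element of the (typed) target model
    name: str


def tree_to_model(tree: Node) -> List[SimpleType]:
    """Step 2 (actual transformation): turn untyped TYPE nodes into typed objects."""
    return [SimpleType(child.children[0].name)
            for child in tree.children if child.name == "TYPE"]
```

Since step 2 depends only on the tree, the text-to-tree part could be replaced, for example by an XML reader that produces the same kind of tree.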
Workflow with ANTLR + eMoflon
[Figure: text → Text-to-Tree (Lexer + Parser, specified with EBNF) → tree (Simple Tree Metamodel (AST) [Moca Tree]) → Tree-to-Model (Transformation, specified with SDM, TGG) → model (Target Metamodel)]
Advantages of Textual DSLs
 Textual languages have specific strengths compared to graphical
languages
 Scalability
 Pretty-printing
 Compact and expressive syntax
 Productivity for experienced users
 IDE support softens learning curve
 Configuration management/versioning
 Concurrent work on a model, especially with a version control system
 Diff, merge, search, replace, …
 Working without a sophisticated editor
State-of-the-Art DSL Development Techniques
 DSLs are very common: CSS, regular expressions, ant, SQL, HQL, Rails
 Two types of DSLs: internal vs. external
 Internal DSLs
 Embedded languages in existing host languages
 Explicit internal DSLs: becoming mainstream through Ruby and Groovy
 Implicit internal DSLs: fluent interfaces simulate DSLs in Java and C#
 External DSLs
 Have their own custom syntax
 Own parser to process them
 Own editor to build sentences
 Own compiler to executable language or own interpreter
 Many XML-based languages have ended up as external DSLs (not user
friendly)
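As an illustration of an implicit internal DSL, a fluent interface can describe an entity in the host language itself and, if desired, render it in the external Entity DSL syntax used in this lecture. The sketch is in Python rather than Java or C#, and the class EntityBuilder with its methods prop and render is invented for this example.

```python
class EntityBuilder:
    """Fluent interface: each call returns the builder so that calls can be chained."""

    def __init__(self, name):
        self.name = name
        self.properties = []

    def prop(self, name, type_name, many=False):
        self.properties.append((name, type_name, many))
        return self                                  # enables chaining

    def render(self):
        """Render the description in the external Entity DSL syntax."""
        lines = [f"entity {self.name} {{"]
        for name, type_name, many in self.properties:
            suffix = "[]" if many else ""
            lines.append(f"property {name} : {type_name}{suffix}")
        lines.append("}")
        return "\n".join(lines)
```

The internal variant needs no parser or editor of its own, but it is bound to the syntax of the host language.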
References
 XText related
 Project Site: http://www.eclipse.org/Xtext
 Xtext Webinar: http://live.eclipse.org/node/705
 Text-based language meta-frameworks
 Monticore: www.monticore.org
 MGrammar: http://msdn.microsoft.com/en-us/library/dd857654(VS.85).aspx
 TCS: www.eclipse.org/gmt/tcs
 HUTN: http://www.omg.org/spec/HUTN/
 Domain specific (modeling) languages
 Fowler, M.: Domain Specific Languages. 2009.
 Mernik, M., Heering, J., and Sloane, A. M.: When and how to develop domain-specific languages. ACM Computing Surveys 37(4), pp. 316-344, 2005.
 Kelly, S., and Tolvanen, J.P.: Domain-Specific Modeling: Enabling Full Code
Generation. Wiley-IEEE Computer Society, 2008.