Vesta Beef Risotto Pro Points

An object-oriented metalanguage

In computer magazines and books about model-driven software development in recent times often described the framework
Xtext
and praise - which is mainly due to the fact that it is used for the integration of programming in the Eclipse IDE [. 1] Xtext (or its underlying parser
Antlr
) is an easy-to-learn tool to meet the broad development of domain-specific languages - that is why it is also suitable for entry into the area. But there are interesting alternatives that should at least know.

A more interesting option to make run-time maintainable models for code generation and for the development of DSLs in my view, the so-called
Parsing Expression Grammars (PEG) is
, especially developed by Alessandro Warth object-oriented metalanguage
OMeta
. The basic idea of all PEGs is a generic term the pattern matching, such as it is used in regular expressions. PEGs use the knowledge that all usually are the construction of compilers and interpreters related tasks in a way, only variations of a common mechanism - the pattern recognition [2]

The
Tokenizer
(also
Lexer
or
Scanner
called) transforms an array of characters (the input, the source code) into a sequence of basic syntactic units, tokens. Classically used for this purpose a tool like lex

or its Flex development

. In principle, a Lex or Flex file provides a list of regular expressions that will be applied sequentially to the input until a suitable expression is found. Each regular expression is an "action" in the form of C code assigned, which must eventually end with the return of a well-defined tokens, together with a possible token argument. To illustrate how such a flex file is listed, like my serve
Flex Definition File apc_lexer.lex
project
astro
patterns.

The

parser operates on the array of tokens to generate an internal data structure that is suitable for the efficient processing machine particularly well. Usually the result of parsing an abstract syntax tree (AST). Also, the parsing is basically Pattern recognition, only on a higher level: it uses a different kind of "alphabet" consisting of terminal and nonterminal symbols, and the patterns are the syntax rules, called production rules. Parser has a long history, represented in which
yacc
(

y
et
a nother

c
ompiler
c
ompiler
) a first milestone.
yacc
, although developed over 30 years, is still used for the design of programming languages. I have implemented with
Bison
, the successor to yacc, a
astrological DSL
which serve also to illustrate here should. The
Bison grammar of these DSL
shows a structural similarity that is compared here with the input of a number of so-called production rules. If a suitable rule is found, the associated semantic action

executed.

Type Checker
and
optimizer
are transformations of the AST, which recognize certain patterns and replace them with more specific or more efficient designs. As the product of the parser is usually a syntax tree, one often uses the

visitor design pattern to traverse the tree, run depending on the application-specific operations. [3] The same applies

Finally, the code generator

, who transformed the AST program in source code or binary code for a processor or a virtual machine. Also be used for this purpose to the visitor pattern ajar methods, although this task can be formulated in the form of a grammar with production rules.

OMeta The programming language provides a mechanism to formulate all these tasks in the form of a PEG grammar and fix them. The concept is so general that in principle any transformation of code and data-structures in other target structures. The OMeta syntax can be embedded in different host languages, for example, there OMeta implementations many others,
C #
,

JavaScript, Ruby
- for
Squeak
, COLA (a mixture of Scheme and Smalltalk used the first host language) - increases the number of implementations. I find it particularly attractive to the cooperation with host dynamic languages such as JavaScript or Python. For the interpreted languages are obviously very well prepared for the vision of model-driven software development, at run time in a domain-specific language, the system behavior can be. The JavaScript language also has the additional advantage of being a "default language of the Web Browser" at no extra cost in web applications available. Because I think the language OMeta so important, I have posted on my homepage an area for experimentation at the link
http://ruediger-plantiko.net/ometa/

. It is similar to the http://tinlizzie.org OMeta is, as its name says, an object-oriented metalanguage that has many interesting features:

Entering OMeta must not source or even be a string, but can an array of arbitrary objects be the host language. This allows for example the transformation of abstract syntax trees. For each syntax tree iw as any sequence of nested arrays, tokens, and objects can be displayed. The JavaScript statement document.getElementById ("btnSave") click ();

for example, could be represented by the AST.

[APPLY, document, "getElementById", ["btnSave"]] , "click", is []] This, formally speaking, an array that consists of objects - namely, in turn, from arrays of symbols as APPLY (for example, strings and feasible would be) from ordinary JavaScript objects to the host language, as here, the DOM document object , and data types such as strings, numbers, etc. [4] In this form of AST could turn a OMeta program to be submitted for processing. allows object orientation, separation of production rules in different namespaces. Thus, each grammar is free in their choice of name for the symbols, unaffected by any already existing grammars in the same process. Exception of course is the desired explicit reference to an existing grammar through the inheritance mechanism. It is possible to OMeta grammar to inherit from another and can be obtained in the formulation of the production rules to the rules of the superclass. This allows to extend existing parsers, for example, the necessarily existing parser to the host language, with a few lines of code to their own constructs. This is an ideal tool for developers OMeta of programming languages, to try new syntactic idioms.
In contrast to context-free grammars (CFG) of the selection operator is equal rights. On the other hand, it is also an advantage, because the interpretation of a priosierten selection operator produces no ambiguity, so that the PEGs are easier to understand. The last feature of the prioritized choice operator, it usually brings the problem of endless loops on left recursive rules with them. The rule expr = expr "-" number to an infinite descent. In more complicated cases, a closed Cycles forms of several interdependent rules, known as indirect left recursion. Now let all the rules that come across the problem of left recursion, rewriting by hand so that the left recursion does not occur. But the expressions are cumbersome and thus are not so easily understand. From the perspective of a grammar developer it is an unnecessary complication to have to avoid in his grammar Links recursion. Now it is Alessandro Warth OMeta succeeded to solve this problem. In OMeta can also record left recursive rules like the one above. The efficiency suffers including not significant - the Parsezeiten usually depend on the continued linearly on the length of the input. The possibility of recursive rule definitions increases the expression of OMeta in the use of PEG as alternative implementations. OMeta allows the definition of so-called parameterized rules . They are, as the name says, rules that depend on parameters. Warth explains this idiom with reference to a rule for character sets. Character sets can be defined by the traditional selection operator. If it's the built-in rule digit for the recognition of digits in OMeta not exist already, you could define it like this: digit = '0 ' uppercase = CHARRANGE ('A', 'Z')
The example can be seen that the parameterized rules introduce a higher level of abstraction. Such as inheritance, they allow an "up-down separation, a separation of the universal from the concrete.
How to do this yet to offer support OMeta even higher order rules: These are rules that have different rules as an argument! This feature is made possible by a combination of parameterized rules with a special built-in rule called
apply.
apply takes the name of a rule against
and executes them. As a useful, practical Warth following example returns OMeta rule:

listof: p = apply (p) ("," apply p ()) *
would match one or more, separated by commas occurrence of "something that usually p enough '. For example, recognizes the term listof ('expression') a comma separated list of expressions (meet a previously defined rule expression ), while
listof ('name')
on a list of names ( the name of the previously defined rule meet
) fits.

I can not find the entire Language OMeta
explain, but only to stimulate their studies. The above statements summarize key parts of Warth's thesis. Who wants to learn more: they have

least I still want a small show in the scene of the "compiler design" favorite example: a "desktop calculator". This concrete calculator can even work with variables, which shows the way that you can build with OMeta also stateful parser. On my test page
reach its definition by selecting
```
 Calculator 
```
in the list of available grammars. This example is the dissertation of Warth removed. I will discuss briefly here.

The first line
```
 
 ometa Calc 
 
```
a new grammar (a new transformation, a new parser, a new OMeta object) is declared. In this case, it will inherit from the built-in grammar parser
. This built-in class currently contains only one basic parser functionality - namely, a token rule, can be identified with the space-separated string. This token
rule is automatically highlighted in all of

The next line
var = letter spaces: x -> x,

- separated> , a semantic action
be specified. A semantic action is listed in the code of the host language and will be evaluated for a recognized pattern. The result of the evaluation is the return value of the so-called

In this case we have a rule
```
 var 
```
detection of variable names. First, specify that spaces should be ignored just before a variable name. This is achieved with the built-in rule
spaces, which recognizes "zero or more spaces." Next comes the actual determination of what is to be recognized as a variable: variables should be single letter in our calculator. For this purpose, we use the built-in rule-based OMeta
letter. After the colon, then followed by a variable name that contains in the successful application of the rule evaluation result of the semantic action (or, if no semantic action is recognized as fitting the part of the input). These variables that are assigned during parsing, then you can in the semantic Action draw - as here. We determine that the return value of the rule is exactly the recognized character.

A simpler way would be completely equivalent formulation

var = letter spaces

sets the following variables according to the rule determine the permitted numbers - positive whole numbers: num = num: digit n: d -> (n * 10 + d * 1) The declaration of the alternatives. We also learn that even parts of a rule is already a semantic action, may be assigned. The first part of the rule also contains the already discussed left recursion.
```
 This rule could have been equivalent to write again: 
```
digit num = +: d -> (1 * d.join ('')),
```
 new here is the quantifier <= c && c <= y) -> + 
```
with the same semantics as in regular expressions . Enriched with the Quantifier
+ or *
expression always evaluates to an array with the individual hits as elements. The array, we can by in the semantic action the method
```
 join () 
 the JavaScript Array class 
 
```
zusammenspleissen to a string, and finally to force by multiplying by 1 (type coercion) in the conversion of the numeric data type. This variant usually
num
is certainly instructive.

primary expressions : Entering a variable name will return the contents of the input of a number of strings is said to have just that number to answer, and complex expressions are intended to be clamped: primaryExpr = spaces var: x -> self . vars [x] ")" -> R, new here is the access to the variable self.vars to initialize in a special, in the so-called Parserinstanziierung
()
```
 method is defined as an empty hash and us as a container ( more precisely, serves as a symbol table) for the variables entered in the calculator website. 
```
There are now the multiplication and addition rules: mulExpr = mulExpr: x "*" primaryExpr: y -> (x * y)
mulExpr: y -> (x + y)
```
  be encoded. 
```
The addExpr is thus more general than
mulExpr
```
 and more general than the 
```
primaryExpr
. You latter contains as special cases. The following rule = expr var: x "=" expr: r -> (self.vars [x] = r)
Expression should be output.

follows Here - outside the OMeta grammar that already mentioned initialize () method by which the computer is put into the initial state: Calc.initialize = function () {this.vars = {};}

In my "Workbench" I can now rule on the input doit

x = 1 y = 2 x + y

 get and apply the output <: Parser {

3. This is because the doit rule several expressions separated by blank space identifies and evaluates. I would not have three Entries can be made in sequence: First x = 1 then y = 2

and finally

 x + y

Even then I would get 3 the output. The reason is that my "workbench" in each dialog step with the same Parserinstanz works and has noticed so all variable assignments already made in the symbol table vars . It's just a stateful parser - and with this property is also suitable for OMeta REPL Shells (
read-evaluate-print
loops). You might want to OMeta my workbench in my OMeta implemented XML parser View: A well-read in my view, only 42 lines of code allows the transformation of an XML document into a JavaScript object modeled tree. These are the variables, with which one can work in OMeta, it is also one of the issues with which the Viewpoints Research Institute is taken up, developed under the roof also OMeta was: To how many orders of magnitude can consist of source code with the use of appropriate, possible expressive programming environments will be reduced to still achieve that for which it now uses computer (word processing, graphics, Internet, etc.)?

This may be a first insight into a very interesting parser Comply with the I can only recommend to closer study. The fields, around the area of domain-specific languages (DSL) seem to me to be very promising. The language is compact and expressive. Its powerful features, such as object orientation, and parameterized rules of recursion is to thank that grammars can be written much more elegantly than previous tools of this kind

[1] Thus, in Thomas Stahl, Markus Volter , Sven Efftinge, Arno Haase:
Model-Driven Software Development
, 2 Dpunkt edition, Verlag, Heidelberg 2007th

 Jan Köhnlein and Sebastian Zarnekow, 
 Xtext practice

, eclipse-Magazin 1.2010, p. 50
[2] The following ideas of the thesis by Alessandro Warth Experimenting with Programming Languages are removed.
[3] It is, however, even without the visitor design pattern, which unfortunately is intrusive and in the implementation of a certain awkwardness attached (as I have shown in my blog

).

 [4] Here, the application of a method m is

of the object o with the argument array [args ...] by the construct [APPLY, o, m, [args ...]] modeled.

Rüdiger Plantiko

Tuesday, November 23, 2010

Vesta Beef Risotto Pro Points

0 comments:

Post a Comment