Software Language Engineering (SLE)
Summary of Most Relevant Topic Papers
Identifying or engineering appropriate languages for the various activities in software and systems development is one of the most important issues in software engineering. Even programming languages are still subject to improvement. For many other activities, such as architectural design, behavioral modeling, and data structure specifications, we use the general purpose Unified Modeling Language (UML) [Rum16] [Rum17]. Nevertheless, UML and its tooling still are much less elaborate and hence subject to extensive syntactic, semantic, and methodical improvement.
In various domains, however, it is more appropriate to employ Domain Specific Languages (DSLs) that enable non-software developers to specify properties, configure their systems, etc. in terms of established domain concepts and corresponding language elements. DSLs have already achieved a significant degree of penetration in industry [HRW15]. With the upcoming age of digitization, we expect the use of DSLs to grow even further and, with it, the effort that goes into their efficient engineering, integration, and composition.
Designing a DSL is a complex task because, on the one hand, the language needs to be precise enough to be processed by a computer and, on the other hand, comprehensible to humans. Monolithic design of a language can already benefit from methods for language engineering in the small, including design guidelines and tooling. The MontiCore language workbench [HKR21] is such a tool to assist the development of languages. It provides, e.g., techniques for an integrated definition of the concrete and abstract syntax of a language [KRV07b] [Kra10], but is much more a framework for compositional language design [KRV10] [HRW18].
Language Engineering in the Large
To efficiently engineer languages in the large, we need all the help that we can get. As software languages are software too, it is not surprising that the following techniques largely discussed in [CFJ+16] help:
- Elaborate tooling to assist language development.
- Reuse of tools, e.g. for parsing and for parameterizable pretty printing.
- Reuse of language components.
- Decomposition of the language to be designed into smaller components.
- Refinement and adaptation of existing languages.
- Automatic derivation of new languages from existing ones.
To improve the understanding of language engineering, we have defined the terms language and language component in [CBCR15] [BEK+18b] and discussed how to capitalize on this from a global perspective in [CCF+15a]. Additionally, we discuss the possibilities and challenges of using metamodels for language definition [SRVK10], identifying, for instance, the need for metamodel merging and inference, as well as assistance for their evolution. As a bridge between the two technical spaces, namely grammars and metamodels, we examine a mapping between both in [BJRW18].
Of course, we also consider variability for modeling languages and have investigated a method to model syntactic language variability through language product lines [BEK+18b] [BEK+19] [BPR+20].
Language and Tool Composition
Divide and conquer is one of the core concepts for managing large and complex tasks. Language design therefore needs to be decomposed along several dimensions: First, we want to decompose the language into language components [BEK+18b]. Some of these components, for example the basic languages for fully qualified names, constants, expressions, or imperative statements, should be designed in a reusable form.
In a second dimension, we decompose the tooling along the activities (front-end: model processing, context conditions, internal transformations, backend: printing) and decompose each of these activities along the individual language components. MontiCore 3 [HR17], e.g., was already able to decompose the front-end language processing along the decomposition of the language itself [KRV10] [Voe11] [KRV08] [HNRW16] [Naz17] [RRRW15b].
MontiCore also assists modular decomposition of the backend code generation based on different targets and different sublanguages [RRRW15b] [BBC+18] (see also Compositionality/Modularity of Models).
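The idea of reusable language components can be illustrated with a minimal sketch. It assumes a toy representation in which a component is just a named set of grammar productions; the names `LanguageComponent`, `compose`, and `unresolved_symbols` are hypothetical and are not MontiCore's API or its composition mechanism.

```python
from dataclasses import dataclass


@dataclass
class LanguageComponent:
    """Hypothetical language component: a name plus its grammar productions."""
    name: str
    # nonterminal -> list of alternatives; each alternative is a tuple of symbols
    productions: dict


def compose(name, *components):
    """Merge components; alternatives for a shared nonterminal are combined."""
    merged = {}
    for component in components:
        for nonterminal, alternatives in component.productions.items():
            target = merged.setdefault(nonterminal, [])
            for alternative in alternatives:
                if alternative not in target:
                    target.append(alternative)
    return LanguageComponent(name, merged)


def unresolved_symbols(component, terminals):
    """Symbols that are used but neither defined here nor known terminals."""
    used = {
        symbol
        for alternatives in component.productions.values()
        for alternative in alternatives
        for symbol in alternative
    }
    return used - set(component.productions) - set(terminals)


# Assumed toy terminals and components, for illustration only.
TERMINALS = {"Name", "Literal", ".", "+", "->", "[", "]"}

names = LanguageComponent("Names", {
    "QualifiedName": [("Name",), ("Name", ".", "QualifiedName")],
})
expressions = LanguageComponent("Expressions", {
    "Expr": [("QualifiedName",), ("Literal",), ("Expr", "+", "Expr")],
})
automata = LanguageComponent("Automata", {
    "Transition": [("Name", "->", "Name", "[", "Expr", "]")],
})

# On its own, the automata component has an open reference to "Expr".
open_before = unresolved_symbols(automata, TERMINALS)

# Composing it with the reusable components closes all references.
full = compose("AutomataWithExpressions", names, expressions, automata)
open_after = unresolved_symbols(full, TERMINALS)
```

In this sketch, the names and expressions components know nothing about automata, so they could be composed into other languages unchanged.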
Language Derivation
Language derivation is, in our view, a promising technique for developing new languages for a specific purpose that rely on existing base languages [HHK+13] [HHK+15] [HRW15] [GLRR15] [BDL+18] [BJRW18]. Formally, a language derivation is a mapping D that maps a base language B into another language D(B). This mapping produces new languages, not models. To automatically derive such new languages D(B) or, at least, assist such derivation with tools, the base language B itself has to be modeled explicitly, for instance as a metamodel or as a grammar together with its well-formedness rules in a reasonably explicit form. Thus, language derivation can be partially understood as model transformation on a metalanguage. So far, we have successfully conceived three language derivation techniques, described below.
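The notion of a derivation as a mapping on languages rather than on models can be sketched as a function from grammar to grammar. The following is a minimal illustration under strong assumptions: the grammar is a plain dictionary, and the operator `derive_operation_language` and its added nonterminals are hypothetical and do not reproduce any of the cited derivation techniques.

```python
def derive_operation_language(base, operations=("add", "remove", "modify")):
    """Hypothetical derivation operator D: maps a grammar B to a grammar D(B).

    A grammar is a dict from nonterminal to a list of alternatives (tuples
    of symbols). The result keeps all productions of B and adds one
    'Operation' alternative per (keyword, nonterminal) pair.
    """
    derived = {nt: list(alts) for nt, alts in base.items()}
    derived["Operation"] = [
        (keyword, nonterminal)
        for nonterminal in sorted(base)
        for keyword in operations
    ]
    # A model of the derived language is a non-empty sequence of operations.
    derived["OperationModel"] = [("Operation",), ("Operation", "OperationModel")]
    return derived


# Assumed toy base language B for component models.
base_grammar = {
    "Component": [("component", "Name", "{", "Port", "}")],
    "Port": [("port", "Name")],
}

# D(B): a new language definition; no model of B was touched or created.
derived_grammar = derive_operation_language(base_grammar)
```

The function takes and returns a language definition, which is the sense in which the derivation acts on the metalanguage level.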
Transformation Languages in Concrete Syntax
Instead of using a fully generic transformation language that is applicable to a base language B, we automatically derive a transformation language T(B) that merges elements of the concrete syntax of B with generic, and thus reusable, elements for defining transformations on B models. The result is a comprehensible and easily applicable transformation language that modelers find familiar, because it systematically reuses the syntax of the base language B. Automatic derivation of such transformation languages using the concrete syntax of the base language is described in [HRW15] [Wei12] [Hoe18].
As the language derivation operator T is applicable to any language, we have successfully applied it to, e.g., class diagrams, object diagrams, MontiArc, and automata. The operator T not only derives the new language T(B), but the tool infrastructure behind T also generates the transformation engine necessary to execute transformations defined in T(B) (which finally transform models of the base language B).
Tagging Languages
A tagging model is used in the context of a base model M and adds additional information in the form of tags to the elements defined in M. This can be used, for example, to add technology-specific information or advice on how code generation, model merging, and other algorithmic transformations have to handle the tagged elements. Tags can, for example, tell a persistence generator which data model classes are mapped into single transportable DAOs, or add security restrictions to data objects. For activity diagrams, tags can describe where to find the appropriate activity implementation, etc.
Tagging models share the idea of UML’s stereotypes, but are not part of the base model. Instead, the separate tagging model references the base model. This has the advantages (1) that the base model can be reused without technology specific pollution, (2) several different tag models are possible for the same base model in different technological spaces (e.g., iPhone, Android or Windows clients), and (3) a tag model can also be reused for different base models.
A tagging language is the language of the tagging models and thus is highly dependent on the base language that it tags (i.e., it must be aware of the modeling elements of the base language). [GLRR15] describes how to systematically derive tagging languages from a base language and how code for processing tagging models can be generated automatically.
This also rests on the concept of a tag definition language, which allows defining the possible form and values that tags may have, as well as the kinds of model elements they can be applied to. It therefore acts as a type definition for tags.
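The separation of base model, tagging model, and tag definition can be sketched as follows. This is a minimal illustration with assumed data structures; `TagType`, `check_tagging`, and the example elements are hypothetical and not taken from [GLRR15].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagType:
    """Hypothetical tag definition: where a tag may go and what it may hold."""
    name: str
    applies_to: frozenset      # kinds of base model elements
    allowed_values: frozenset  # permitted tag values


def check_tagging(base_model, tag_types, tagging_model):
    """Return a list of error messages; an empty list means well-formed."""
    errors = []
    for element, tag_name, value in tagging_model:
        if element not in base_model:
            errors.append(f"{element}: not an element of the base model")
            continue
        tag_type = tag_types.get(tag_name)
        if tag_type is None:
            errors.append(f"{element}: undefined tag '{tag_name}'")
            continue
        kind = base_model[element]
        if kind not in tag_type.applies_to:
            errors.append(f"{element}: '{tag_name}' not applicable to {kind}")
        if value not in tag_type.allowed_values:
            errors.append(f"{element}: '{value}' is not a valid {tag_name} value")
    return errors


# Base model: element name -> element kind. It contains no tags itself.
base_model = {"Person": "class", "Person.name": "attribute", "Order": "class"}

# Tag definitions act as types for the tags used below.
tag_types = {
    "Persistence": TagType("Persistence", frozenset({"class"}),
                           frozenset({"table", "embedded"})),
}

# Two separate tagging models that reference the same, unchanged base model.
server_tags = [("Person", "Persistence", "table"),
               ("Order", "Persistence", "embedded")]
broken_tags = [("Person.name", "Persistence", "table"),
               ("Invoice", "Persistence", "table")]

server_errors = check_tagging(base_model, tag_types, server_tags)
broken_errors = check_tagging(base_model, tag_types, broken_tags)
```

Because the tagging models only reference element names, the base model stays free of technology-specific information, and further tagging models can be added without changing it.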
Delta Languages
Another way of deriving new languages from existing languages is described in [HHK+15] and [HHK+13], where a base language B is used to derive a delta language Delta(B). The delta language Delta(B) makes it possible to explicitly describe the differences between a base model of B and a model variant (also of B). This helps to define system variability in a bottom-up fashion. A delta model describes which model elements are added to, modified in, or deleted from the base model. Thus, the delta approach is popular for the management of variability and software product lines (see Variability and Software Product Lines (SPL)). Again, the delta operator transforms a base language B into a language Delta(B) in which delta models can be described. Each delta model can be applied individually, and therefore n deltas amount to 2^n variants (modulo application dependencies and orders).
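How n independently applicable deltas yield 2^n variants can be made concrete with a small sketch. It assumes that a model is a dictionary of named elements and that the deltas are independent of each other, so application order does not matter; `apply_delta`, `all_variants`, and the example deltas are hypothetical.

```python
from itertools import combinations


def apply_delta(model, delta):
    """Apply one delta (remove, modify, add) to a copy of the model."""
    variant = dict(model)
    for name in delta.get("remove", ()):
        del variant[name]  # raises KeyError if the element is missing
    for name, value in delta.get("modify", {}).items():
        if name not in variant:
            raise KeyError(f"cannot modify missing element {name}")
        variant[name] = value
    for name, value in delta.get("add", {}).items():
        if name in variant:
            raise KeyError(f"element {name} already exists")
        variant[name] = value
    return variant


def all_variants(base, deltas):
    """Apply every subset of the deltas (in sorted name order) to the base."""
    names = sorted(deltas)
    variants = {}
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            model = base
            for name in subset:
                model = apply_delta(model, deltas[name])
            variants[subset] = model
    return variants


# Assumed toy architecture model and three independent deltas.
base = {"Sensor": "component", "Controller": "component"}
deltas = {
    "logging": {"add": {"Logger": "component"}},
    "redundancy": {"add": {"BackupSensor": "component"}},
    "fast": {"modify": {"Controller": "component (fast)"}},
}

# Three deltas give 2^3 = 8 subsets, including the empty one (the base model).
variants = all_variants(base, deltas)
```

With dependent deltas, for example one that modifies an element another one adds, some subsets or orders would fail in `apply_delta`, which corresponds to the "modulo application dependencies and orders" restriction.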
Delta language techniques are specifically suited for architectural languages such as MontiArc, where they are used to add and modify components as well as channels, but they have also been applied to Simulink in an industrial context.
Key Statements
- Software Language Engineering requires elaborate tooling to assist (partially automated) language development.
- “Language Engineering in the Large” is based on techniques well known from Software Engineering:
- Decomposition of a language enables language components to be designed independently.
- Reuse of language components is essential for quality improvement, workload reduction, and giving users a familiar-looking language that is easier to adopt.
- Techniques for systematic refinement and adaptation of existing languages help white-box reuse.
- Automatic derivation of new languages is based on existing base languages.
- Transformation, tagging and delta languages can be automatically derived from base languages.
Selected Topic-Specific Publications
- [HKR21] Aachener Informatik-Berichte, Software Engineering, Band 48, ISBN 978-3-8440-8010-0, Shaker Verlag, May 2021.
- [BPR+20] In: Proceedings of the 23rd ACM/IEEE International Conference on Model Driven Engineering Languages and Systems, pp. 35-46, ACM, Oct. 2020.
- [BEK+19] In: Journal of Systems and Software (JSS), R. C. Sevilla, L. Fuentes, M. Lochau (Eds.), Volume 152, pp. 50-69, Elsevier, Jun. 2019.
- [Hoe18] Aachener Informatik-Berichte, Software Engineering, Band 36, ISBN 978-3-8440-6322-6, Shaker Verlag, Dec. 2018.
- [BEK+18b] In: International Conference on Systems and Software Product Line (SPLC’18), ACM, Sep. 2018.
- [BDL+18] In: International Conference on Software Language Engineering (SLE’18), pp. 187-199, ACM, 2018.
- [BJRW18] In: International Conference on Software Language Engineering (SLE’18), pp. 174-186, ACM, 2018.
- [BBC+18] In: Journal Frontiers in Neuroinformatics, Volume 12, 2018.
- [HRW18] In: Journal Computer Languages, Systems & Structures, Volume 54, pp. 386-405, Elsevier, 2018.
- [HR17] Aachener Informatik-Berichte, Software Engineering, Band 32, ISBN 978-3-8440-5713-3, Shaker Verlag, Dec. 2017.
- [Naz17] Aachener Informatik-Berichte, Software Engineering, Band 29, ISBN 978-3-8440-5320-3, Shaker Verlag, Jun. 2017.
- [Rum17] Springer International, May 2017.
- [CFJ+16] Chapman & Hall/CRC Innovations in Software Engineering and Software Development Series, Nov. 2016.
- [HNRW16] In: Conference on Modelling Foundations and Applications (ECMFA), pp. 67-82, LNCS 9764, Springer, Jul. 2016.
- [Rum16] Springer International, Jul. 2016.
- [HHK+15] In: Journal on Software Tools for Technology Transfer (STTT), Volume 17(5), pp. 601-626, Springer Berlin Heidelberg, Oct. 2015.
- [RRRW15b] In: Journal of Software Engineering for Robotics (JOSER), Volume 6(1), pp. 33-57, 2015.
- [CBCR15] In: Globalizing Domain-Specific Languages, pp. 7-20, LNCS 9400, Springer, 2015.
- [CCF+15a] LNCS 9400, Springer, 2015.
- [GLRR15] In: Conference on Model Driven Engineering Languages and Systems (MODELS’15), pp. 34-43, ACM/IEEE, 2015.
- [HRW15] In: Conference on Model Driven Engineering Languages and Systems (MODELS’15), pp. 136-145, ACM/IEEE, 2015.
- [HHK+13] In: Software Product Line Conference (SPLC’13), pp. 22-31, ISBN 978-1-4503-1968-3, ACM, 2013.
- [Wei12] Aachener Informatik-Berichte, Software Engineering, Band 12, ISBN 978-3-8440-1191-3, Shaker Verlag, 2012.
- [Voe11] Aachener Informatik-Berichte, Software Engineering, Band 9, ISBN 978-3-8440-0328-4, Shaker Verlag, 2011.
- [KRV10] In: International Journal on Software Tools for Technology Transfer (STTT), Volume 12(5), pp. 353-372, Springer, Sep. 2010.
- [SRVK10] In: Model-Based Engineering of Embedded Real-Time Systems Workshop (MBEERTS’10), pp. 57-76, LNCS 6100, Springer, 2010.
- [Kra10] Aachener Informatik-Berichte, Software Engineering, Band 1, ISBN 978-3-8322-8948-5, Shaker Verlag, Mar. 2010.
- [KRV08] In: Conference on Objects, Models, Components, Patterns (TOOLS-Europe’08), pp. 297-315, LNBIP 11, Springer, 2008.
- [KRV07b] In: Conference on Model Driven Engineering Languages and Systems (MODELS’07), pp. 286-300, LNCS 4735, Springer, 2007.
Related Topics
- Compositionality/Modularity of Models
- Domain-Specific Languages (DSLs)
- Evolution & Transformation of Models
- MontiCore - Language Workbench
- State-Based Modeling (Automata)
- Unified Modeling Language (UML)
- Variability & Software Product Lines (SPL)