A tool environment for quality assurance based on the Eclipse Modeling Framework
- Cite this article as:
- Arendt, T. & Taentzer, G. Autom Softw Eng (2013) 20: 141. doi:10.1007/s10515-012-0114-7
The paradigm of model-based software development has become more and more popular since it promises an increase in the efficiency and quality of software development. Following this paradigm, models become primary artifacts in the software development process. Therefore, software quality and quality assurance frequently lead back to the quality and quality assurance of the involved models. In our approach, we propose a model quality assurance process that can be adapted to project-specific and domain-specific needs. This process is based on static model analysis using model metrics and model smells. Based on the outcome of the model analysis, appropriate model refactoring steps can be performed. In this paper, we present a tool environment conveniently supporting the proposed model quality assurance process. In particular, the presented tools support metrics reporting, smell detection, and refactoring for models based on the Eclipse Modeling Framework, a widely used open source technology in model-based software development.
Keywords: Modeling, Model-based software development, Model quality, Model quality assurance, Eclipse Modeling Framework
1 Introduction
In modern software development, models play an increasingly important role promising a growth in efficiency and quality of software development. In particular, this is true for model-driven software development where models are used directly for automatic code generation. High code quality can be reached only if the quality of input models is already high.
In our approach, we concentrate on quality aspects to be checked on the model syntax. They include not only the consistency with the language syntax definition, but also e.g. the conceptual integrity in using patterns and principles in similar situations, and the conformity with modeling conventions often defined and adapted to specific software projects. In Mohagheghi et al. (2009), six classes of quality goals for software models are identified. We take them as conceptual basis for a goal-question-metrics approach (Basili et al. 1994) to our quality assurance process for software models.
In the literature, well-known quality assurance techniques for models are model metrics and refactorings, see e.g. Genero et al. (2005), Sunyé et al. (2001), Markovic and Baar (2008), Zhang et al. (2005), Porres (2003), Lange (2007). They originate from corresponding techniques for software code and have been lifted to models. Class models in particular are closely related to programmed class structures in object-oriented programming languages such as C++ and Java. For behavior models, the relation between models and code is less obvious. Furthermore, the concept of code smells (Fowler 1999) can be lifted to models leading to model smells (compare e.g. Lange 2007). Again, code smells for class structures can be easily adapted to model smells, but smells of behavior models cannot directly be deduced from code smells.
In Arendt et al. (2011), we present the integration of these techniques in a predefined quality assurance process that can be adapted to specific project needs. It consists of two sub-processes: Before a software project starts, project- and domain-specific quality checks and refactorings have to be defined. Quality checks are formulated using model smells which can be specified e.g. by model metrics and anti-patterns. After formulating quality checks and refactorings, the specified quality assurance process can be applied to concrete software models by computing model metrics, reporting all model smells and applying model refactorings to erase smells that indicate clear model defects.
Since the process of manual model reviews is very time consuming and error prone, the proposed project-specific model quality assurance process should be automated as effectively as possible. In this article, we present a flexible tool environment for metrics reporting, smell detection, and refactoring of models being based on the Eclipse Modeling Framework (EMF) (Steinberg et al. 2008), a widely used open source technology in model-based software development. We integrated the entire tool set into the Eclipse incubation project EMF Refactor (EMF Refactor 2012).
The paper is organized as follows: In the next section, we motivate the development of the presented tool environment by a discussion of selected model quality aspects and an overview on the general approach of our contribution. In Sect. 3, we present the running example of this paper whereas Sect. 4 describes the state-of-the-art in model quality assurance tooling as well as the deduced requirements for our tool environment. Thereafter, we describe the architecture of our tools in Sect. 5. The application of our quality assurance tool environment to the running example is illustrated in Sect. 6. Afterwards, we present supported specification approaches for model metrics, smells, and refactorings in Sect. 7. We evaluate the tools in Sect. 8 and finally conclude in Sect. 9.
2 Model quality and quality assurance
In this section, we present the definition and application of a structured model quality assurance process that can be used to address project-specific and domain-specific needs (compare Arendt et al. 2011). The general approach uses known model quality assurance techniques, namely model metrics, model smells, and model refactorings. They are combined in an overall process for structured model quality assurance focusing on syntactical model issues. We start by discussing selected model quality aspects in software development that serve as the basis for our general approach.
2.1 The 6C model quality goals presented by Mohagheghi et al.
A model is correct if it includes the right elements and correct relations between them, and if it includes correct statements about the domain. Furthermore, a model must not violate rules and conventions. This definition includes syntactic correctness relative to the modeling language as well as semantic correctness related to the understanding of the domain.
A model is complete if it contains all relevant information, and if it is detailed enough according to the purpose of modeling. For example, requirement models are said to be complete when they specify all the black-box behavior of the modeled entity, and when they do not include anything that is not in the real world.
A model is consistent if it does not contain contradictions. This definition covers horizontal consistency concerning models/diagrams on the same level of abstraction, vertical consistency concerning modeled aspects on different levels of abstraction as well as semantic consistency concerning the meaning of the same element in different models or diagrams.
A model is comprehensible if it is understandable by the intended users, being humans or tools. In most of the literature, the focus is on comprehensibility by humans including aspects like aesthetics of a diagram, model simplicity or complexity, and the use of the correct type of diagram for the intended audience. Several authors also call this goal pragmatic quality.
A model is confined if it suits the modeling purpose and the type of system. This definition also includes relevant diagrams on the right abstraction level. Furthermore, a confined model does not have unnecessary information and is not more complex or detailed than necessary. Developing the right model for a system or purpose of a given kind also depends on selecting an adequate modeling language. This means that the modeler uses language concepts that are suitable for the intended purpose of the modeling activity. Further concepts should be used very sparingly or even omitted deliberately.
A model is changeable if it can be evolved rapidly and continuously. This is important since both the domain and its understanding as well as system requirements evolve over time. Furthermore, changeability should be supported by modeling languages and modeling tools as well.
2.2 Specification and application processes for customized model quality assurance techniques
For a first rough model overview, a report on model metrics might be helpful. Furthermore, a model can be checked against the existence (respectively absence) of specified model smells. Each model smell found has to be interpreted in order to evaluate whether it should be eliminated by a suitable model modification (either by a manual model change or a refactoring). However, we have to take into account that also new model smells can be induced by refactorings and care should be taken to minimize this effect. This check-improve cycle should be performed as long as needed to get a reasonable model quality.
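The check-improve cycle described above can be sketched as a small loop. This is only an illustration under stated assumptions: the function `check_improve`, its callback parameters, and the toy model are hypothetical names introduced here, not part of the presented tools. The toy repair mimics the case where eliminating one smell induces another one, which is why detection is repeated after every single repair.

```python
from typing import Callable


def check_improve(
    model: dict,
    detect: Callable[[dict], list[str]],
    should_fix: Callable[[str], bool],
    fix: Callable[[dict, str], None],
    max_rounds: int = 10,
) -> int:
    """Repeat detection and repair; return the number of repairs applied."""
    for applied in range(max_rounds):
        # Every smell found has to be interpreted before it is eliminated.
        accepted = [smell for smell in detect(model) if should_fix(smell)]
        if not accepted:
            return applied
        # Repair one smell only, then detect again: a repair may induce new smells.
        fix(model, accepted[0])
    return max_rounds


def toy_fix(model: dict, smell: str) -> None:
    """Hypothetical repair: removing a redefined attribute leaves a class unused."""
    model["smells"].remove(smell)
    if smell == "Redefined Attribute":
        model["smells"].append("Unused Class")


toy_model = {"smells": ["Redefined Attribute"]}
repairs = check_improve(toy_model, lambda m: list(m["smells"]), lambda s: True, toy_fix)
```

In this toy run, two repairs are needed although only one smell was present at the start. The `max_rounds` bound is a design choice of the sketch that guards against repairs that keep inducing each other.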
Ideally, a quality assurance process is fully specified before it is used within model-based software development projects. However, it is not uncommon that the process has to be adapted during the model development phase. Our process allows straightforward adaptation to new model checks and refactorings.
Another factor that influences the significance of a model quality aspect is the corresponding application domain. This means that software models are used in various domains like web applications or embedded systems having different impacts on the significance of a certain model quality aspect. For example, models of safety-critical embedded systems need to be more correct than models of usual web applications.
The preceding discussions show that it is appropriate to set up a specific model quality assurance process for each software project being dependent on the modeling purpose as well as the corresponding modeling domain.
In the next step, static syntax checks for these quality aspects are defined. This is done by formulating questions that should lead to so-called model smells hinting at model parts that might violate a specific model quality aspect. Here, we adopt the goal-question-metrics approach (GQM) that is widely used for defining measurable goals for quality and has been well established in practice (Basili et al. 1994). In our approach, we consider the syntax of the model in order to give answers to these questions. Some of these answers can be based on metrics. Other questions may be better answered by considering specific patterns which can be formulated on the abstract syntax of the model. However, further static analysis techniques could be incorporated to find additional potential model smells. Furthermore, the project-specific process can (re-)use general metrics and smells as well as special metrics and smells specific to the intended modeling purpose.
Refactoring is the technique of choice for fixing a recognized model smell. A specified smell serves as precondition of at least one model refactoring that can be used to restructure models in order to improve model quality aspects without appreciably influencing the semantics of the model. In this context, it is also recommended to analyze whether the application of a specified refactoring may cause the occurrence of a specific model smell.
Support for the implementation of new model metrics, smells, and refactorings using several concrete specification languages.
Calculation of implemented model metrics, detection of implemented model smells, and application of implemented model refactorings.
User-friendly support for project-specific configurations of model metrics, smells, and refactorings.
Generation of model metrics reports.
Suggestion of suitable refactorings in case of specific smell occurrences.
Provision of warnings in cases where new model smells come in by applying a certain refactoring.
The following sections present a flexible tool environment for model metrics reports, smell detection, and refactoring for models that is based on the Eclipse Modeling Framework (EMF) (Steinberg et al. 2008).
3 Running example
After giving an overview on our approach for model quality assurance, this section discusses a simple example of this process that serves as running example throughout this paper. After presenting the application of a sample model quality assurance process and the used modeling language, we describe the definition of such a process in detail.
3.1 Application of a project-specific model quality assurance process
In our example, we consider a software project for the development of an accounting system of a vehicle rental company. This company has a headquarters and owns some cars, trucks, and motorbikes which can be rented by customers via a vehicle rental service. A car is specified by its manufacturer, its registration number, its engine power, and the number of provided seats. A truck is specified by its manufacturer, its registration number, its engine power, and its weight. Finally, a motorbike is specified by its manufacturer, its registration number, its engine power, and its cylinder capacity. Each customer has a name and an email address and is related to a consultant being an employee of the company. Furthermore, the company has some subcontractors being specific employees and customers.
We assume that software models are used in the domain analysis phase in order to get an overview on real world entities in the problem domain. The modeling of this problem domain is done using SCM, a simple domain specific modeling language (DSML) being simplified UML class models (UML 2012). We discuss this language in more detail in the next section.
3.2 Domain specific modeling language SimpleClassModel (SCM)
A SimpleClassModel consists of a number of ScmPackages where each one consists of a number of packaged elements being Types or Associations. A Type is either a PrimitiveType, e.g. Integer or String, or an ScmClass. A (potentially abstract) class can have an arbitrary number of parent classes realized by model element Generalization. The derived reference superclasses subsumes the total set of ancestor classes of a given class in its inheritance hierarchy. Furthermore, a class can have several (potentially constant) Attributes, whereas attributes inherited from ancestor classes are subsumed by a corresponding derived reference. Each attribute has a visibility and an optional type. Additionally, an attribute can redefine another attribute within the inheritance hierarchy of the owning class. Relationships between classes can be modeled using unidirectional Associations.
As mentioned above, valid SCM instance models must conform to some additional well-formedness constraints, e.g. unique names of owned elements in a ScmPackage, or acyclic inheritance hierarchies.
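The structure of SCM described above can be approximated by a few data classes. This is a minimal sketch, not the actual SCM meta model: the Python class and field names are assumptions, types are represented by their names only, and packages and associations are left out. It shows the derived reference superclasses as a transitive closure over the parent classes and two of the mentioned well-formedness constraints.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Attribute:
    name: str
    visibility: str = "public"
    type_name: Optional[str] = None          # the type is optional in SCM
    constant: bool = False
    redefines: Optional["Attribute"] = None  # redefined attribute, if any


@dataclass(eq=False)
class ScmClass:
    name: str
    abstract: bool = False
    parents: list["ScmClass"] = field(default_factory=list)  # one Generalization each
    attributes: list[Attribute] = field(default_factory=list)

    def superclasses(self) -> list["ScmClass"]:
        """Derived reference: every ancestor class, listed once."""
        found: list[ScmClass] = []
        todo = list(self.parents)
        while todo:
            cls = todo.pop()
            if cls not in found:  # identity comparison because of eq=False
                found.append(cls)
                todo.extend(cls.parents)
        return found

    def inherited_attributes(self) -> list[Attribute]:
        """Derived reference: attributes owned by ancestor classes."""
        return [a for cls in self.superclasses() for a in cls.attributes]


def has_acyclic_inheritance(cls: ScmClass) -> bool:
    """Well-formedness: a class must not be its own ancestor."""
    return cls not in cls.superclasses()


def has_unique_names(package: list[ScmClass]) -> bool:
    """Well-formedness: owned elements of a package have unique names."""
    names = [cls.name for cls in package]
    return len(names) == len(set(names))


customer = ScmClass("Customer", attributes=[Attribute("name", type_name="String"),
                                            Attribute("email", type_name="String")])
employee = ScmClass("Employee")
subcontractor = ScmClass("Subcontractor", parents=[employee, customer])
```

The sample classes follow the running example, in which subcontractors are specific employees and customers. `subcontractor.inherited_attributes()` then yields the name and email attributes owned by `Customer`.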
3.3 Specification of project-specific model quality assurance techniques
In this section, we demonstrate how the specification process for the used model quality assurance techniques (see Fig. 2) is applied along our running example. Please note that this process does not need to be applied for each individual project in its full extent. Once these techniques are defined they can be reused in future projects as well.
3.3.1 Specification of relevant model quality aspects
In our example, we use the 6C quality goals described in Sect. 2.1 as quality model and determine those aspects which are most relevant as follows.
The most important property of a domain analysis model is that it models the problem domain in the right way, i.e. choosing the right elements and claiming the right statements. So, 6C goal Correctness is an essential quality aspect that has to be considered when applying a model quality assurance process. Since an analysis model is used for communicating with problem domain experts who are typically inexperienced in software modeling, it is also important that the model is easily understandable. This implies that the model must not allow different interpretations. Furthermore, the analysis model must not have unnecessary information that makes it more complex than necessary. So, 6C goals Comprehensibility, Consistency, and Confinement can be seen as essential quality aspects.
Since the modeling purpose in our example is to get an overview on the problem domain, it is not crucial if less important information is missing. So, 6C goal Completeness is a less important quality aspect in our example. Furthermore, since SCM is very simple and manageable, model reviewers do not have to prioritize the quality goal Changeability.
Please note that we are arguing from a very specific point of view only to keep the argumentation compact. Of course, the selection of main quality aspects may vary dependent on the intended modeling purpose and demonstrates the complexities and challenges of this basic task.
3.3.2 Formulation of questions leading to static quality checks
Q1: Are there classes not used by any other model element? This is a typical case of unnecessarily modeled information.
Q2: Are there classes inheriting from another class several times? This would indicate that the modeler uses the inheritance concept in a too complex way, i.e. the model is more detailed than necessary.
Q3: Are there abstract classes not doing much? Again, this might be an indicator for unnecessary information within the model.
Q4: Are there at least three similar attributes staying together in more than one class? This might be a hint that the modeler does not use the inheritance concept of the SCM language which might be more suitable in this case.
Q5: Are there attributes redefining other ones within the inheritance hierarchy? Since the purpose of the model is to get an overview of the problem domain, the use of this language construct might be too complex, i.e. it does not suit the modeling purpose.
3.3.3 Specification of project-specific SCM smells
- Unused Class (deduced from question Q1):
Unused classes often stand alone in the model without any references to other classes. This smell is adapted from Riel who analyzed object-oriented design (Riel 1996) and can be detected by two different mechanisms. First, we can define the absence of child classes, associated classes, and attributes with class type as anti-patterns based on the abstract syntax of SCM and check whether they do not match on a concrete instance class. Second, we can define a constraint that uses three metrics (Number of direct children, Number of associated classes, and Number of times the class is externally used as attribute type) and that checks whether each metric is evaluated to zero. Nevertheless, the former alternative seems to be the most appropriate one. We discuss this pattern in Sect. 7.
- Diamond Inheritance (deduced from question Q2):
This smell is based on the multiple inheritance concept of SCM. It occurs when a class inherits the same ancestor several times. It is known in the literature as the ‘diamond’ inheritance problem for object-oriented techniques using multiple inheritance and was first discussed by Sakkinen (1989). An adequate mechanism to detect this smell is to specify a corresponding pattern on the abstract syntax of SCM and to find matches in concrete SCM instance models.
- Speculative Generality (deduced from question Q3):
This smell occurs if there is an abstract class inherited by one single class only. It is based on the corresponding code smell introduced by Fowler (1999) and refined by Zhang et al. (2008). To detect this smell we can use metric Number of direct children and check whether this metric is evaluated to 1 on an arbitrary SCM class. Of course, the corresponding constraint must check whether this class is abstract. Furthermore, it is possible to specify this smell by a corresponding pattern based on the abstract syntax of SCM and try to match this pattern on classes of a concrete SCM instance model.
- Data Clumps (deduced from question Q4):
An SCM model holds this smell if interrelated data items often occur as ‘clump’. More precisely, this smell can be defined as follows:
  - At least three attributes stay together in more than one class.
  - These attributes should have the same signatures (same names, same types, and same visibility).
  - The order of these attributes may vary.
This smell is also based on the corresponding code smell introduced by Fowler (1999) and refined by Zhang et al. (2008). To detect this smell there must be a mechanism to detect similarities in SCM models. This is due to the fact that one cannot predict how many attributes are involved in this smell. Furthermore, there might be variants w.r.t. similar attributes when using a more general definition of this smell than here (think of attribute names that need not be equal but just similar or attributes with different visibilities). Another possibility to detect this smell is to define a metric for an ScmClass counting all attributes it has in common with other classes. Nevertheless, using a strict definition with exactly three attributes and equal signatures it is possible to define this smell as pattern based on the abstract syntax of SCM.
- Redefined Attribute (deduced from question Q5):
SCM allows for redefining attributes owned by ancestor classes. However, using this language feature could lead to misunderstandings of the modeled aspect and so might be confusing for model readers. It can be checked by matching a corresponding pattern or by testing whether metric Number of redefined attributes evaluates to zero; any value greater than zero indicates the smell.
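The five smell definitions above can be sketched as simple checks over a simplified model representation. This is an illustration only and not the specification mechanism of the presented tools: the representation (`Cls`, class names as dictionary keys, associations as pairs) is an assumption, both ends of an association are counted as associated, Data Clumps uses "at least three" equal signatures, and the classes `Person`, `Service`, and `Invoice` are hypothetical additions to the running example that make each smell occur once.

```python
from collections import Counter
from dataclasses import dataclass, field
from itertools import combinations

Signature = tuple[str, str, str]  # (name, type, visibility)


@dataclass
class Cls:
    abstract: bool = False
    parents: list[str] = field(default_factory=list)
    attributes: set[Signature] = field(default_factory=set)
    redefining: set[str] = field(default_factory=set)  # owned attributes redefining another


def children(classes: dict[str, Cls], name: str) -> list[str]:
    """Metric 'Number of direct children' is the length of this list."""
    return [n for n, c in classes.items() if name in c.parents]


def unused_classes(classes: dict[str, Cls], associations: set[tuple[str, str]]) -> set[str]:
    """No children, no associated classes, never used externally as attribute type."""
    associated = {end for pair in associations for end in pair}
    used_as_type = {t for owner, c in classes.items()
                    for (_n, t, _v) in c.attributes if t != owner}
    return {n for n in classes
            if not children(classes, n) and n not in associated and n not in used_as_type}


def diamond_inheritance(classes: dict[str, Cls]) -> set[str]:
    """Classes reaching some ancestor on more than one path (acyclic input assumed)."""
    def count_paths(name: str, counts: Counter) -> Counter:
        for parent in classes[name].parents:
            counts[parent] += 1
            count_paths(parent, counts)
        return counts
    return {n for n in classes if any(k > 1 for k in count_paths(n, Counter()).values())}


def speculative_generality(classes: dict[str, Cls]) -> set[str]:
    """Abstract classes inherited by exactly one class."""
    return {n for n, c in classes.items() if c.abstract and len(children(classes, n)) == 1}


def data_clumps(classes: dict[str, Cls], minimum: int = 3) -> set[frozenset]:
    """Pairs of classes sharing at least `minimum` equal signatures; sets ignore order."""
    return {frozenset((a, b)) for a, b in combinations(classes, 2)
            if len(classes[a].attributes & classes[b].attributes) >= minimum}


def redefined_attribute(classes: dict[str, Cls]) -> set[str]:
    """Classes whose metric 'Number of redefined attributes' is greater than zero."""
    return {n for n, c in classes.items() if len(c.redefining) > 0}


shared = {("manufacturer", "String", "public"), ("registrationNumber", "String", "public"),
          ("enginePower", "Integer", "public")}
classes = {
    "Company": Cls(),
    "Car": Cls(attributes=shared | {("seats", "Integer", "public")}),
    "Truck": Cls(attributes=shared | {("weight", "Integer", "public")}),
    "Motorbike": Cls(attributes=shared | {("cylinderCapacity", "Integer", "public")}),
    "Person": Cls(),                           # hypothetical common ancestor
    "Customer": Cls(parents=["Person"], attributes={("name", "String", "public"),
                                                    ("email", "String", "public")}),
    "Employee": Cls(parents=["Person"]),
    "Subcontractor": Cls(parents=["Employee", "Customer"],
                         attributes={("name", "String", "public")}, redefining={"name"}),
    "Service": Cls(abstract=True),             # hypothetical abstract class
    "VehicleRentalService": Cls(parents=["Service"]),
    "Invoice": Cls(),                          # hypothetical, unconnected class
}
associations = {("Company", "Car"), ("Company", "Truck"), ("Company", "Motorbike"),
                ("Company", "Subcontractor"), ("Customer", "Employee"),
                ("Customer", "VehicleRentalService")}
```

On this sample model, each check reports one finding, except Data Clumps, which reports the three pairs formed by `Car`, `Truck`, and `Motorbike`. Note that the metric-based reading of Unused Class would also flag a leaf class that merely inherits from others, which is one reason why each reported smell still has to be interpreted by a reviewer.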
3.3.4 Specification of project-specific SCM refactorings
[Table: Suitable SCM refactorings to erase specific SCM model smells. Surviving row labels: Extract Intermediate Superclass, Remove Intermediate Superclass, Remove Redefined Attribute, Remove Unused Class]
To eliminate SCM smell Unused Class, suitable refactorings can hardly be deduced since one cannot determine whether this class is useless or whether some relationships are missing. So, this smell can either be eliminated by removing the class (i.e. by using the simple refactoring Remove Unused Class) or by adding further information to the model, which is not considered a refactoring.
Smell Diamond Inheritance can be eliminated by applying refactorings Remove Superclass or Remove Intermediate Superclass. Both refactorings can also be used to eliminate SCM smell Speculative Generality. Here, the unnecessarily modeled abstract class has to be removed by one of those refactorings, depending on whether this class has a parent class or not. A further applicable refactoring addresses missing information, more precisely missing subclasses of the abstract class. This refactoring is called Extract Subclass. It creates a new subclass and applies refactoring Push Down Attribute to a set of attributes of the contextual class (which is empty in our case).
The elimination of smell Data Clumps can be done in two different ways, either by moving corresponding attributes to a new associated class or by moving them to a new class that is a common superclass of the owning classes. The first option uses SCM refactoring Extract Class that internally uses refactorings Create Associated Class and Move Attribute. The second alternative uses either refactoring Extract Superclass or Extract Intermediate Superclass if the owning classes have a common superclass already. Besides the creation of an empty (intermediate) superclass, both refactorings use refactoring Pull Up Attribute to move equal attributes to this newly created class.
Last but not least, SCM smell Redefined Attribute can be eliminated using refactoring Remove Redefined Attribute that removes the redefinition relationship as well as the contextual attribute if and only if the redefined attribute is visible to the owning class of the redefining attribute.
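The relations between smells and refactorings discussed in the preceding paragraphs can be collected in a lookup table, for example to suggest refactorings for a detected smell. The following sketch only encodes what the text states; the names `SUITABLE_REFACTORINGS` and `suggest_refactorings` are hypothetical.

```python
# Smell -> refactorings named in the text as suitable for eliminating it.
SUITABLE_REFACTORINGS: dict[str, list[str]] = {
    "Unused Class": ["Remove Unused Class"],
    "Diamond Inheritance": ["Remove Superclass", "Remove Intermediate Superclass"],
    "Speculative Generality": ["Remove Superclass", "Remove Intermediate Superclass",
                               "Extract Subclass"],
    "Data Clumps": ["Extract Class", "Extract Superclass",
                    "Extract Intermediate Superclass"],
    "Redefined Attribute": ["Remove Redefined Attribute"],
}


def suggest_refactorings(smell: str) -> list[str]:
    """Return the refactorings to offer for a detected smell (empty if unknown)."""
    return SUITABLE_REFACTORINGS.get(smell, [])


suggestions = suggest_refactorings("Data Clumps")
```

Which of the suggested refactorings fits a concrete occurrence still depends on the model, e.g. on whether the owning classes already have a common superclass.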
On the complementary website of this article, you find a structured definition of each SCM refactoring including a name, a short description, an illustrating example, the contextual meta model element for applying the refactoring, and the input parameters. Furthermore, we use a three-part specification preparing the implementation of refactorings in Eclipse using the Language Toolkit (LTK) technology (Frenzel 2006). The parts of a refactoring specification reflect a primary application check for a selected refactoring without input parameters, a second one with parameters, and the proper refactoring execution steps. Please note that some of the SCM refactorings are adapted from corresponding UML refactorings, for example discussed in Thongmak and Muenchaisri (2004), Zhang et al. (2005), and Markovic and Baar (2008).
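The three-part structure of a refactoring specification can be illustrated for refactoring Extract Subclass. This is a sketch in Python rather than an LTK-based implementation, and the concrete preconditions (the context must be a class of the model, the new name must be unused, pushed-down attributes must be owned by the context) are assumptions chosen for illustration. The class `Service` and its attribute `fee` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Cls:
    name: str
    parents: list[str] = field(default_factory=list)
    attributes: list[str] = field(default_factory=list)


class ExtractSubclass:
    """Hypothetical three-part specification of refactoring Extract Subclass."""

    def __init__(self, model: dict[str, Cls], context: str) -> None:
        self.model = model
        self.context = context

    def initial_check(self) -> list[str]:
        """Part 1: applicability of the selected context, without parameters."""
        if self.context not in self.model:
            return [f"'{self.context}' is not a class of the model"]
        return []

    def final_check(self, new_name: str, pushed_down: list[str]) -> list[str]:
        """Part 2: applicability once the input parameters are known."""
        errors = []
        if new_name in self.model:
            errors.append(f"a class named '{new_name}' already exists")
        owned = self.model[self.context].attributes
        errors += [f"'{a}' is not owned by '{self.context}'"
                   for a in pushed_down if a not in owned]
        return errors

    def execute(self, new_name: str, pushed_down: list[str]) -> None:
        """Part 3: create the subclass, then push the chosen attributes down."""
        parent = self.model[self.context]
        subclass = Cls(new_name, parents=[parent.name])
        for attribute in pushed_down:
            parent.attributes.remove(attribute)
            subclass.attributes.append(attribute)
        self.model[new_name] = subclass


model = {"Service": Cls("Service", attributes=["fee"])}
refactoring = ExtractSubclass(model, "Service")
if not refactoring.initial_check() and not refactoring.final_check("VehicleRentalService", []):
    refactoring.execute("VehicleRentalService", [])
```

Both checks return a list of messages instead of a boolean, so that a failed precondition can be reported to the user in a meaningful way before anything is changed.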
[Table: Possible impacts of SCM refactorings on SCM model smells. Surviving row labels: Extract Intermediate Superclass, Remove Intermediate Superclass, Remove Redefined Attribute, Remove Unused Class]
Each Extract … Class refactoring may cause SCM smell Data Clumps if appropriate attributes are moved to the newly created class. Please note that this smell already existed before the refactoring but in another context (without the newly inserted class). We mark this kind of smell with × whereas completely new smell occurrences are marked with ⊗. Furthermore, smell Data Clumps can also be introduced by refactorings Remove Superclass and Remove Intermediate Superclass when moved attributes complete an equivalent set of attributes in some subclasses.
The application of refactoring Extract Superclass can introduce smell Diamond Inheritance if the contextual classes have a common subclass. Refactoring Extract Subclass can lead to an unused class if no attribute is pushed down to the new class. Furthermore, if this refactoring is applied on an abstract class not inherited so far, SCM smell Speculative Generality is introduced. Refactoring Remove Redefined Attribute can lead to an unused class if the type class of the removed attribute has been the only use of this class. Finally, refactoring Remove Unused Class does not cause any smell from the analyzed list.
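The impacts described in the two preceding paragraphs can be encoded so that a warning is produced before a refactoring is applied. The sketch below is an assumption-laden reading of the text: "Each Extract … Class refactoring" is taken to cover the four Extract refactorings named in this section, the distinction between the × and ⊗ markings is not modeled, and the names `POSSIBLY_INDUCED_SMELLS` and `warnings_before_applying` are hypothetical.

```python
# Refactoring -> smells that, according to the text, its application may cause.
POSSIBLY_INDUCED_SMELLS: dict[str, set[str]] = {
    "Extract Class": {"Data Clumps"},
    "Extract Superclass": {"Data Clumps", "Diamond Inheritance"},
    "Extract Intermediate Superclass": {"Data Clumps"},
    "Extract Subclass": {"Data Clumps", "Unused Class", "Speculative Generality"},
    "Remove Superclass": {"Data Clumps"},
    "Remove Intermediate Superclass": {"Data Clumps"},
    "Remove Redefined Attribute": {"Unused Class"},
    "Remove Unused Class": set(),
}


def warnings_before_applying(refactoring: str) -> list[str]:
    """Return one warning per smell the given refactoring may introduce."""
    smells = POSSIBLY_INDUCED_SMELLS.get(refactoring, set())
    return [f"'{refactoring}' may introduce smell '{smell}'" for smell in sorted(smells)]


messages = warnings_before_applying("Extract Superclass")
```

Sorting the smell names keeps the order of the warnings stable, which makes the output easier to compare between runs.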
4 Tool environment: general approach
In this section, we present the general concepts of our tool environment for quality assurance of EMF-based models. After giving an overview on the state-of-the-art of model quality assurance tooling, we discuss the requirements on our tool set which are deduced from this survey and from the model quality assurance process presented in Sect. 2.2.
4.1 State-of-the-art: tool support for model quality assurance
The existing tool support for model quality assurance is mainly aiming at UML and EMF modeling.
4.1.1 UML modeling
Considering UML modeling, quality assurance tools are integrated in standard UML CASE tools to a certain extent. In the following, we give a rough overview on existing UML model quality assurance tools: In UML CASE tools such as the IBM Rational Software Architect (RSA 2012) and MagicDraw (MD 2012), a number of metrics and validation rules are predefined and can be configured in metrics and validation suites. MD supports class model metrics (e.g. measuring the number of classes, inheritance tree depth, and coupling), so-called system metrics such as Halstead and McCabe, and requirements metrics based on function points and use cases. Validation rules comprise completeness and correctness constraints such as all essential information fields are filled, properties have types specified, etc. Further validation rules can be specified using Java or a restricted form of OCL. RSA also supports predefined metrics. In addition, models can be checked against validation rules being based on metrics. A tool dedicated to the calculation of model metrics is SDMetrics (SDM 2012). SDMetrics analyzes the structural properties of UML models and uses object-oriented measures as well as design rule checking to automatically detect design and style problems in UML models. Measurement data is displayed in different views (e.g., tables, histograms, and kiviat diagrams) and can be exported in various formats like HTML and XML. Furthermore, SDMetrics supports custom definitions of UML metrics and design rules using XML-based configuration files.
Considering UML model refactoring, there is no mature tool support available yet. However, some research prototypes for model refactoring are discussed in the literature, e.g. in Porres (2003), Boger et al. (2003), Markovic and Baar (2008). Most of them are no longer maintained. For example, Porres (2003) describes the execution of UML model refactorings as a sequence of transformation rules and guarded actions. He presents an execution algorithm for these transformation rules and constructs an experimental, meta-model driven refactoring tool that uses SMW, a scripting language based on Python, for specifying the UML model refactorings.
To summarize, UML CASE tools and further model analysis tools for UML provide model analyses by predefined metrics and validation rules and support the custom configuration of metrics and validation suites as well as the definition of further custom techniques but do not offer an integrated, custom configured quality assurance environment for UML models based on metrics, smells (validations), and refactorings.
4.1.2 EMF modeling
Since EMF has evolved into a well-known and widely used modeling technology, it is worthwhile to provide model quality assurance tools for this technology. To the best of our knowledge, explicit tool support for metrics calculation on EMF-based models is not yet available. However, there is the EMF Model Query Framework (EMF Query 2012) to construct and execute query statements that can be used to compute metrics and to check constraints. These queries have the form of select statements similar to SQL and can also be formulated based on OCL. Specified queries are triggered from the context menu. Neither the configuration of queries in suites nor reports on query results in various forms are provided. The EMF Validation Framework (EMF Validation 2012) supports the construction and assurance of well-formedness constraints for EMF models. Two modes are distinguished: batch and live. While batch validations are explicitly triggered by the client, live validations listen to change notifications to model objects to immediately check that the change does not violate any well-formedness constraint.
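The difference between batch and live validation can be illustrated with a small sketch. It does not use the API of the EMF Validation Framework; the class `ValidatedElement`, the constraint `name_is_set`, and the dictionary-based element are hypothetical stand-ins.

```python
from typing import Callable

Constraint = Callable[[dict], list[str]]


def name_is_set(features: dict) -> list[str]:
    """Sample well-formedness constraint: the element needs a non-empty name."""
    return [] if features.get("name") else ["element has no name"]


class ValidatedElement:
    def __init__(self, constraints: list[Constraint], live: bool = False) -> None:
        self.features: dict = {}
        self.constraints = constraints
        self.live = live
        self.problems: list[str] = []

    def validate(self) -> list[str]:
        """Batch mode: the client triggers this call explicitly."""
        self.problems = [p for check in self.constraints for p in check(self.features)]
        return self.problems

    def set(self, feature: str, value: object) -> None:
        self.features[feature] = value
        if self.live:  # live mode: every change is checked immediately
            self.validate()


batch = ValidatedElement([name_is_set])
batch.set("name", "")
live = ValidatedElement([name_is_set], live=True)
live.set("name", "")
```

After the same change, the live element already knows about the violation, whereas the batch element reports it only once `validate()` is called.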
The Epsilon language family (Epsilon 2012) provides the Epsilon Validation Language (EVL) to validate EMF-based models with respect to constraints that are, in their simplest form, quite similar to OCL constraints. Furthermore, EVL supports dependencies between constraints, customizable error messages to be displayed to the user and the specification of fixes to be invoked by the user to repair inconsistencies. For reporting purposes, EVL supports a specific validation view reporting the identified inconsistencies in a textual way. Suitable quick fixes are formulated in the Epsilon Object Language (EOL) being the core language of Epsilon and therefore not specifically dedicated to model refactoring. Here, Epsilon provides the Epsilon Wizard Language (EWL) (Kolovos et al. 2007), a textual domain-specific language for in-place transformations of EMF. We compare our first refactoring prototype with EWL in Arendt et al. (2009). The comparison shows that refactoring EMF-based models using EWL has some strengths but also weaknesses. Refactoring specifications in EWL are very compact, each refactoring is triggered from within the context menu of a contextual model element, and redo/undo functionality is supported. Nevertheless, EWL does not follow the homogeneous refactoring execution structure used in Eclipse. For example, a refactoring is provided only if all preconditions hold (i.e., no meaningful error message is provided), and a preview of the results of a refactoring is missing. Furthermore, EWL does not support reuse of existing refactoring specifications. Finally, there are no predefined EVL inconsistency checks and EWL refactorings (for more general languages like Ecore and UML2, for example) as well as no support for custom configurations of validation suites.
Another approach for EMF model refactoring is presented in Reimann et al. (2010), Refactory (2012). Here, the authors propose the definition of EMF-based refactoring in a generic way but do not consider the comprehensive specification of preconditions. Our experiences in refactoring specification show that it is mainly the preconditions that cannot be defined generically. (See Arendt et al. (2010b) for a more complex refactoring with elaborated precondition checks.) Furthermore, there are no attempts to analyze EMF models w.r.t. model smell detection.
Finally, the MoDisco framework (Barbier et al. 2010) provides a model-driven reverse engineering process for legacy systems in order to document, maintain, improve, or migrate them. Here, several specific models are deduced (for example, Java models are deduced from Java code) which can be analyzed in order to detect anti-patterns and then be manually improved, for example by refactorings. Like the UML and EMF tooling discussed so far, MoDisco supports the specification and computation of custom metrics and queries on models as well as metrics visualization. The main difference between MoDisco and our tool suite is the intended purpose (reverse engineering vs. modeling).
As for UML modeling, there is various tool support to perform EMF model analyses and to improve EMF models by refactoring. However, there is not yet a comprehensive tool environment for specifying and applying predefined and custom metrics, smells, and refactorings to EMF models in an integrated way where metrics, smells, and refactorings are tightly inter-related. We are heading towards such a tool environment in the following.
4.2 Requirements on the tool environment for quality assurance in EMF
The analysis of existing model quality assurance tools presented in the previous section and the definition of the proposed model quality assurance process presented in Sect. 2.2 lead to the following requirements on our supporting tool set concerning model metrics, model smells, and model refactorings.
4.2.1 Requirements common to all model quality assurance tools
Each tool should be based on the Eclipse Modeling Framework (EMF), i.e. the corresponding functionality should be provided on any model that is based on EMF since EMF is a well-established format for models.
The tool environment should reuse existing Eclipse and EMF components as far as possible, e.g. EMF Compare (EMF Compare 2012) for refactoring preview and BIRT (BIRT 2012) for metric reporting. Furthermore, the implemented quality assurance techniques should be reusable since many of them are likely to recur in several projects even if modeling purposes differ.
It should be possible to integrate QA plugins into EMF-based UML CASE tools like the IBM Rational Software Architect (RSA 2012).
4.2.2 Requirements on the application of specific model quality assurance tools (metrics calculation, smell detection, and refactoring execution)
The modeler (respectively model reviewer) should be provided with a project-specific configuration of model metrics, smells, and refactorings suites. For model smells being based on metrics it should be possible to specify project-specific thresholds.
- Integrated application
The corresponding functionality should be triggered from within several views in Eclipse like files in the project explorer (for metrics calculation and smell detection) and model elements in the standard tree-based EMF instance editor (for refactoring execution).
Calculated metric values and detected model smells should be reported in specific integrated views. Model elements being involved in a specific smell occurrence should be highlighted in the standard tree-based EMF instance editor. Furthermore, it should be possible to export metric results in various formats (e.g., HTML, PDF, and XML).
- Refactoring features
The application of refactorings should follow the homogeneous refactoring execution structure in Eclipse including a preview of the resulting model. This includes a transactional execution of refactorings. Furthermore, the refactoring tool should provide undo and redo functionality as well as an optional analysis of smell occurrences before and after refactoring application. Finally, smells should be related to refactorings being suitable to erase the smell, and refactorings should be related to smells potentially occurring after applying the refactoring.
- Quick-fix mechanism
It should be possible to invoke a suitable refactoring from within the context menu of a concrete smell occurrence in the smell results view.
4.2.3 Requirements on specification components for metrics, smells, and refactoring
- Flexible specification approaches
It should be possible to define custom metrics, smells, and refactorings for arbitrary EMF-based models. Here, the tools should support various concrete specification approaches like OCL, Java, and the EMF model transformation language Henshin (Arendt et al. 2010b; Henshin 2012). Furthermore, a designer should be provided with tool support for composing metrics and refactorings from existing ones.
- QA tool code generation
The tools should provide a comfortable input mechanism for specification-related information like the meta model, the name, and a description of an arbitrary metric, smell, or refactoring. Afterwards, each tool should generate Java code that can be used by the application component in order to provide the corresponding functionality (metrics calculation, smell detection, and refactoring execution).
5 Tool environment: architecture
This section discusses the architecture of our tool environment for EMF model quality assurance and summarizes the used components. Each tool is based on the Eclipse Modeling Framework (Steinberg et al. 2008; EMF 2012), i.e. each tool can be used for arbitrary models whose meta models are instances of EMF Ecore, for example domain-specific languages, common languages like UML2 used by Eclipse Papyrus (Papyrus 2012) and the Java EMF model used by JaMoPP (JaMoPP 2012) and MoDisco (Barbier et al. 2010; MoDisco 2012), or even Ecore instance models themselves.
Our tool environment mainly consists of two kinds of modules: For calculating model metrics, detecting smells, and executing refactorings there is an application module each. Similarly there are three specification modules for generating metrics, smell, and refactoring plugins containing Java code that can be used by the corresponding application module. For simplicity reasons, we refer to these plugins as custom QA plugins in the remainder of this section.
Java (Java 2012); version 6.
Henshin (Henshin 2012), a model transformation engine for the Eclipse Modeling Framework based on graph transformation concepts. Henshin uses pattern-based rules that can be structured into nested transformation units with well-defined operational semantics. For further information about Henshin we refer to Arendt et al. (2010b).
CoMReL (Arendt and Taentzer 2012b), a model-based language for the combination of EMF model refactorings.
Model metrics can be concretely specified in Java, as OCL expressions, by Henshin pattern rules, or as a combination of existing metrics using a binary operator.
Model smells can be concretely specified in Java, as OCL invariants, by Henshin pattern rules, or as a combination of an existing metric and a comparator like greater than (>).
The three parts of a model refactoring can be concretely specified in Java, as OCL invariants (only precondition checks), in Henshin (pattern rules for precondition checks; transformations for the proper model change), or as a combination of existing refactorings using the CoMReL language.
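The two compositional variants mentioned above (a metric built from existing metrics and a binary operator, and a smell built from a metric and a comparator) can be illustrated by a minimal Java sketch. It does not use the API of the tool environment: the class MetricComposition and its methods combine and exceeds are hypothetical names, and a metric is modeled simply as a function from a context object to a double value.

```java
import java.util.function.DoubleBinaryOperator;
import java.util.function.Predicate;
import java.util.function.ToDoubleFunction;

// Hypothetical helper, not part of the tool environment.
final class MetricComposition {

    // Combines two existing metrics with a binary operator (e.g. division for a ratio).
    static <T> ToDoubleFunction<T> combine(ToDoubleFunction<T> left,
                                           ToDoubleFunction<T> right,
                                           DoubleBinaryOperator operator) {
        return context -> operator.applyAsDouble(
                left.applyAsDouble(context), right.applyAsDouble(context));
    }

    // Derives a smell from a metric and the comparator "greater than" with a threshold.
    static <T> Predicate<T> exceeds(ToDoubleFunction<T> metric, double threshold) {
        return context -> metric.applyAsDouble(context) > threshold;
    }
}
```

With two counting metrics as inputs, `combine(abstractClasses, allClasses, (a, b) -> a / b)` yields a ratio metric, and `exceeds(ratio, 0.2)` turns it into a threshold-based smell check; the threshold plays the role of the project-specific configuration value.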
Extension point descriptions for EMF model metrics, smells, and refactorings
Model metrics:
- Name of the EMF model metric
- Unique identifier of the EMF model metric
- Description of the EMF model metric (optional)
- Namespace URI of the corresponding meta model
- Name of the context element type
- Java class that implements IMetricCalculator
Model smells:
- Name of the EMF model smell
- Unique identifier of the EMF model smell
- Description of the EMF model smell (optional)
- Namespace URI of the corresponding meta model
- Java class that implements IModelSmellFinder
Model refactorings:
- Name of the EMF model refactoring
- Unique identifier of the EMF model refactoring
- Namespace URI of the corresponding meta model
- Java class that implements IController
- Java class that implements IGuiHandler
IMetricCalculator: This interface provides the calculation of the corresponding EMF model metric on a given model element. Here, two methods have to be implemented: method void setContext(List<EObject> context) for setting the model element on which the metric should be calculated, and method double calculate() for the proper calculation of the metric value on this element.
IModelSmellFinder: This interface provides the detection of the corresponding model smell in a given EMF model. It has one method which must be implemented by the corresponding Java class: LinkedList<LinkedList<EObject>> findSmell(EObject root). Here, the model is specified by parameter root. The method returns a list of detected smell occurrences where each occurrence is given by a list of the model elements involved in the detected smell.
IController: This interface is responsible for executing the corresponding model refactoring. Here, the main method which has to be implemented is RefactoringProcessor getLtkRefactoringProcessor() that returns an instance of class RefactoringProcessor from the Language Toolkit (LTK) API (Frenzel 2006). Within this class, the refactoring-specific preconditions are checked by methods checkInitialConditions(…) and checkFinalConditions(…) whereas the refactoring is finally executed by method createChange(…).
IGuiHandler: This interface checks whether the refactoring can be executed on the given context elements (method boolean showInMenu(List<EObject> selection)); the refactoring process is started by method RefactoringWizard show(). As above, RefactoringWizard is a class of the LTK API.
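To make the contracts of IMetricCalculator and IModelSmellFinder more concrete, the following Java sketch mirrors the method signatures given above. It is a simplified illustration under stated assumptions, not the generated code of the tool environment: Element is a hypothetical stand-in for EMF's EObject so that the example compiles without EMF, and the interfaces MetricCalculator and ModelSmellFinder as well as both implementations are invented for this example.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical stand-in for org.eclipse.emf.ecore.EObject.
final class Element {
    final String name;
    final List<Element> children = new ArrayList<>();
    Element(String name) { this.name = name; }
    Element add(Element child) { children.add(child); return this; }
}

// Same method shapes as described in the text, using the stand-in type.
interface MetricCalculator {
    void setContext(List<Element> context);
    double calculate();
}

interface ModelSmellFinder {
    LinkedList<LinkedList<Element>> findSmell(Element root);
}

// Example metric: number of direct children of the context element.
final class NumberOfChildren implements MetricCalculator {
    private Element context;
    @Override public void setContext(List<Element> context) { this.context = context.get(0); }
    @Override public double calculate() { return context.children.size(); }
}

// Example smell: two equally named siblings; each occurrence lists both elements.
final class EqualSiblingNamesFinder implements ModelSmellFinder {
    @Override public LinkedList<LinkedList<Element>> findSmell(Element root) {
        LinkedList<LinkedList<Element>> occurrences = new LinkedList<>();
        collect(root, occurrences);
        return occurrences;
    }

    private void collect(Element parent, LinkedList<LinkedList<Element>> occurrences) {
        List<Element> kids = parent.children;
        for (int i = 0; i < kids.size(); i++) {
            for (int j = i + 1; j < kids.size(); j++) {
                if (kids.get(i).name.equals(kids.get(j).name)) {
                    LinkedList<Element> occurrence = new LinkedList<>();
                    occurrence.add(kids.get(i));
                    occurrence.add(kids.get(j));
                    occurrences.add(occurrence);
                }
            }
        }
        for (Element child : kids) {
            collect(child, occurrences); // descend into the containment hierarchy
        }
    }
}
```

The nested list returned by findSmell matches the description above: the outer list holds the occurrences, and each inner list holds the model elements involved in one occurrence.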
For manually defining the relationships between model smells and model refactorings, our tool environment uses the Eclipse extension point technology again to provide information about these relationships globally. Therefore, two extension points for the manual definition of relations between model smells and model refactorings are provided. Since our tools identify smells respectively refactorings by distinct identifiers (see Table 3), these extension points require relations from smell IDs to a list of refactoring IDs (in case of providing suitable refactorings for a given smell) and relations from refactoring IDs to a list of smell IDs (in case of possible new smells when applying a given refactoring). To serve these extension points in a user-friendly way, we extend the property page of a certain Eclipse plugin project in the workspace by providing graphical user interfaces for (de-)activating appropriate relations.
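The two relations described above (smell ID to suitable refactoring IDs, and refactoring ID to potentially new smell IDs) amount to two lookup tables. The following Java sketch shows this data structure only; it is not the extension point mechanism itself, and the class RelationRegistry, its method names, and the IDs used in the usage note are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory registry for smell-refactoring relations.
final class RelationRegistry {
    private final Map<String, List<String>> refactoringsForSmell = new HashMap<>();
    private final Map<String, List<String>> smellsAfterRefactoring = new HashMap<>();

    // Relates a smell ID to a refactoring ID that is suitable to erase the smell.
    void addFix(String smellId, String refactoringId) {
        refactoringsForSmell.computeIfAbsent(smellId, k -> new ArrayList<>()).add(refactoringId);
    }

    // Relates a refactoring ID to a smell ID that may occur after applying the refactoring.
    void addRisk(String refactoringId, String smellId) {
        smellsAfterRefactoring.computeIfAbsent(refactoringId, k -> new ArrayList<>()).add(smellId);
    }

    List<String> suitableRefactorings(String smellId) {
        return refactoringsForSmell.getOrDefault(smellId, Collections.<String>emptyList());
    }

    List<String> possibleNewSmells(String refactoringId) {
        return smellsAfterRefactoring.getOrDefault(refactoringId, Collections.<String>emptyList());
    }
}
```

For instance, after `addFix("scm.smell.dataClumps", "scm.refactoring.extractSuperclass")`, a quick-fix menu for that smell could be filled from `suitableRefactorings("scm.smell.dataClumps")`.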
In the following two sections, we present how to work with both kinds of modules. For simplicity reasons and to relate the application of our tools to the process presented in Sect. 2.2, we first present how to work with the application module and its implemented quality assurance techniques. Thereafter, Sect. 7 presents how to specify new metrics, smells, and refactorings for our example language SCM.
6 Tool environment: application of project-specific model quality assurance techniques in EMF
In this section, we present the application of the model quality assurance techniques defined in Sect. 3.3 on our example model as described in Sect. 3.1 supported by our tool environment for EMF model quality assurance.
6.1 Calculation of project-specific SCM model metrics
For the first overview on a model, a report on project-specific model metrics might be helpful. In Sect. 3.3, several metrics for SCM models have been identified that can be used for detecting corresponding smells. In the following, we do not calculate those smell-related metrics only but also other common metrics to get an overview on interesting model properties.
The first three metrics within the results view in Fig. 9 are calculated using these ‘basic’ metrics. The abstractness (A) of the package is 0.18 (ratio between the number of abstract classes in the package and the total number of classes in the package), the attribute inheritance factor (AIF) is 0.12 (ratio between the number of inherited attributes in all concrete classes in the package and the total number of attributes in all concrete classes in the package), and the average number of attributes in concrete classes within the package (AvNAtP) is 1.89. As a first evaluation of these metric results, one can state that the model might not be complete since (1) only 11 classes are modeled for the vehicle company domain, and (2) these classes have fewer than two attributes on average. Furthermore, the language concepts of abstractness and inheritance are used only sparingly, so the model is less complex and easier to understand. On the other hand, the low values of A and AIF can be interpreted as a hint that the modeling purpose is not yet achieved since the modelers make insufficient use of the provided language features.
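The three reported values are plain ratios of element counts, which the following Java sketch reproduces. Only the class count of 11 and the three metric values come from the text; the remaining counts (2 abstract classes, 9 concrete classes, 17 attributes in concrete classes, 2 of them inherited) are assumptions chosen so that the ratios match the reported values, and the class PackageMetrics is a hypothetical helper.

```java
// Hypothetical helper that computes ratio-based metrics from element counts.
final class PackageMetrics {

    // Ratio of two counts; defined as 0 for an empty denominator.
    static double ratio(int part, int total) {
        return total == 0 ? 0.0 : (double) part / total;
    }

    // Rounds to two decimal places, as in the reported values.
    static double round2(double value) {
        return Math.round(value * 100.0) / 100.0;
    }

    public static void main(String[] args) {
        // Assumed counts, see the lead-in; only the total of 11 classes is given in the text.
        int abstractClasses = 2, classes = 11;
        int inheritedAttributes = 2, attributes = 17, concreteClasses = 9;

        System.out.println("A      = " + round2(ratio(abstractClasses, classes)));
        System.out.println("AIF    = " + round2(ratio(inheritedAttributes, attributes)));
        System.out.println("AvNAtP = " + round2(ratio(attributes, concreteClasses)));
    }
}
```

Under these assumed counts the sketch yields 0.18, 0.12, and 1.89, which illustrates how small the underlying numbers are and why the model may be considered incomplete.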
6.2 Detection of project-specific SCM model smells
Similar to the calculation process for model metrics, a smell analysis can be triggered either for the entire model or for a concrete model element. In the latter case, all smells occurring within the containment hierarchy of the selected model element are reported. Note, however, that some model smells may be distributed across several subtrees (like Multiple Definition of Classes with equal Names, which looks for equally named classes in different packages) and may therefore not be found by an analysis restricted to one subtree. Our framework offers smell analysis on subtrees solely to narrow the scope of the analysis, for example on large-scale models.
Concerning concrete smell occurrences, the smell detection tool provides a highlighting mechanism for involved model elements within the standard tree-based EMF instance editor. For example, selecting the occurrence of smell Diamond Inheritance in the smell view (compare left-hand side of Fig. 13) highlights classes Subcontractor and Person in the instance editor as shown in the right-hand side of Fig. 13.
Use refactoring Extract Superclass on classes Car, Truck, and Motorbike to insert a common parent class Vehicle and pull up attributes manufacturer, power, and regNo to it.
The diamond inheritance smell detected on class Subcontractor should not be eliminated since this seems to be an important detail that has to be addressed in the domain model.
Smell Speculative Generality should be removed by using refactoring Remove Superclass on class Service since the company does not offer further services.
Class RentalPeriod is unused up to now. It should be associated to class VehicleRental and shall refer to a new class Date twice (named from and to).
6.3 Application of project-specific SCM model refactorings
Besides manual changes, model refactoring is the technique of choice to eliminate occurring smells. In our tool environment for model quality assurance, this task is provided by the primary functionality of EMF Refactor as presented in Arendt et al. (2010a). Again, this component provides a configuration mechanism to select refactorings being relevant for the given modeling project. The configuration user interface is similar to that of the metrics component (see Fig. 8) and is not shown here.
The application of a certain model refactoring can be triggered in two alternative ways: First, it can be invoked from within the context menu of at least one model element in the standard tree-based EMF instance editor. Depending on the selected element(s), the menu provides only those refactorings that are defined for the corresponding model element type(s).
The second way to trigger a model refactoring is to use the quick fix mechanism of the smell results view as shown on the left-hand side of Fig. 13. Starting from this view, our tool environment provides a suggestion for potential refactorings according to pre-defined smell-refactoring relations (see Fig. 12) and a dynamic analysis of applicable model refactorings.
Besides the model change preview, our tool environment provides the opportunity to get a quantitative analysis on changes of model smell occurrences. In contrast to the manual configuration of potential refactoring-smell-relations, this preview provides a concrete overview on smell occurrence changes when applying the refactoring. For a detailed discussion of model refactoring-smell-relations respectively their implementation in our tool environment we refer to Arendt and Taentzer (2012a).
From the detected smell occurrences only one is left (smell Diamond Inheritance in class hierarchy Subcontractor ⇒ Person). Nevertheless, some model parts remain suspicious with respect to several model quality aspects. For example, there are two elements indicating incorrect modeling. First, class Vehicle is concrete even though it should represent a generic term for concrete vehicle kinds, hence should be abstract. Moreover, the association between classes Company and VehicleRentalService has too general a name and should be named vehicleRentalService instead. Furthermore, there are associations from class Company to classes Car, Truck and Motorbike as well as from class VehicleRentalService to these classes, hinting at some kind of redundant modeling.
The former discussion shows that project-specific model quality assurance techniques need not be completely defined before a project starts. In our example, the quality assurance process should be adapted during the model development phase in order to be steadily improved. SCM smells Concrete Superclass and Association Clumps as well as SCM refactorings Rename Association and Pull Up Association would extend the suite of project-specific model quality assurance techniques in a meaningful way.
7 Tool environment: specification of project-specific model quality assurance techniques in EMF
Our tool environment for EMF model quality assurance provides a wizard-based specification process for each supported quality assurance technique, i.e. model metrics, model smells, and model refactorings. In this section, we present several of the supported concrete specification mechanisms for model quality assurance techniques, discussed along the SCM example.
7.1 Specification of project-specific model metrics
For the specification of model metrics, our tool environment currently supports four concrete techniques. As basic approaches, pure Java code using the modeling language API generated by EMF and OCL expressions can be used. Another approach is to define a pattern using the abstract model syntax first and to count its occurrences in a concrete model thereafter. These patterns are formulated as rules in a language included in the EMF model transformation tool Henshin (Arendt et al. 2010b; Henshin 2012). To define compositional metrics, our tool environment supports a combination of existing ones. Here, the involved metrics as well as appropriate arithmetic operations have to be specified.
7.2 Specification of project-specific model smells
Our tool environment currently supports four concrete mechanisms for model smell specification. Again, pure Java code and OCL expressions can be used as basic approaches. Some smells can be detected well by metric benchmarks. Here, appropriate model metrics are used together with suitable benchmarks being set by project-specific configurations. Pattern-based smells (i.e., smells that are detectable by the existence of specific anti-patterns) can be specified by Henshin rules. The specification process for model smells is similar to that for metrics specification as shown in Fig. 20. After inserting smell-specific information like the name or the corresponding meta-model, our tool environment generates Java code to be completed. Again, the list of supported model smells is extended using the extension point technology of Eclipse.
The first negative application condition (NAC) 〈〈 forbid:subclass〉〉 looks for a direct subclass of the contextual class; the second NAC 〈〈 forbid:attributetype〉〉 looks for an attribute owned by another class that has the contextual class as type; NAC 〈〈 forbid:outgoingassociation〉〉 looks for an outgoing association of the contextual class that targets another class; the last NAC 〈〈 forbid:incomingassociation〉〉 looks for an incoming association of the contextual class originating from another class. All NACs have to hold, i.e. if one of the specified relations is found, the SCM class is not unused and the Henshin rule is not applicable to that class. Our smell detection tool uses Henshin’s pattern matching algorithm to detect rule matches. The matches found represent the existence of model smells in the model.
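The effect of the four NACs can be mimicked in plain Java, which may help readers unfamiliar with graph transformation rules. The sketch below is not the Henshin rule and does not use Henshin's pattern matching; ScmClass is a hypothetical, heavily simplified stand-in for an SCM class, and UnusedClassFinder is an invented name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified SCM class: superclasses, attribute types, association targets.
final class ScmClass {
    final String name;
    final List<ScmClass> superclasses = new ArrayList<>();
    final List<ScmClass> attributeTypes = new ArrayList<>();
    final List<ScmClass> associationTargets = new ArrayList<>();
    ScmClass(String name) { this.name = name; }
}

final class UnusedClassFinder {

    // Returns all classes of the model for which none of the four forbidden patterns exists.
    static List<ScmClass> findUnused(List<ScmClass> model) {
        List<ScmClass> unused = new ArrayList<>();
        for (ScmClass candidate : model) {
            if (isUnused(candidate, model)) {
                unused.add(candidate);
            }
        }
        return unused;
    }

    static boolean isUnused(ScmClass context, List<ScmClass> model) {
        // forbid:outgoingassociation - an association from the context to another class
        for (ScmClass target : context.associationTargets) {
            if (target != context) return false;
        }
        for (ScmClass other : model) {
            if (other == context) continue;
            // forbid:subclass - another class directly specializes the context
            if (other.superclasses.contains(context)) return false;
            // forbid:attributetype - another class owns an attribute typed by the context
            if (other.attributeTypes.contains(context)) return false;
            // forbid:incomingassociation - another class has an association to the context
            if (other.associationTargets.contains(context)) return false;
        }
        return true;
    }
}
```

Each early `return false` corresponds to one violated NAC; a class is reported only if all four checks pass, just as the rule is applicable only if no forbidden pattern is found.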
7.3 Specification of project-specific model refactorings
The specification process for model refactorings is started from the context menu of an arbitrary model element. This way, required information such as the meta-model and the type of the contextual element is obtained automatically. The wizard is similar to that for metrics specification as shown in Fig. 20.
Since EMF Refactor uses the LTK technology (Frenzel 2006) as described in Sect. 6.3, a concrete refactoring specification requires up to three parts (i.e., specifications for initial checks, final checks, and the proper model changes). EMF Refactor currently supports four concrete mechanisms for EMF model refactoring specification. As for metrics and smells, refactorings can be specified using Java and OCL. A straightforward way to specify a model refactoring is to use Henshin. Here, EMF Refactor uses Henshin’s model transformation engine for executing the refactoring as well as Henshin’s pattern matching algorithm to detect violated preconditions. Finally, our current work concentrates on the combination of existing refactorings into more complex ones by using a domain-specific language called CoMReL (Arendt and Taentzer 2012b).
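The three-part structure (initial checks, final checks, model change) can be sketched in plain Java as follows. The example deliberately avoids the real LTK classes: the method names echo checkInitialConditions, checkFinalConditions, and createChange, but their signatures here are simplified assumptions (lists of error messages instead of LTK status objects), and RenameClassRefactoring with its list-of-names model is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical refactoring on a simplified model: a package given as a list of class names.
final class RenameClassRefactoring {
    private final List<String> classNames;
    private final String oldName;
    private String newName;

    RenameClassRefactoring(List<String> classNames, String oldName) {
        this.classNames = classNames;
        this.oldName = oldName;
    }

    void setNewName(String newName) { this.newName = newName; }

    // Initial check: can the refactoring be started on the selected element at all?
    List<String> checkInitialConditions() {
        List<String> errors = new ArrayList<>();
        if (!classNames.contains(oldName)) errors.add("Class '" + oldName + "' does not exist.");
        return errors;
    }

    // Final check: are the user-provided parameters valid?
    List<String> checkFinalConditions() {
        List<String> errors = new ArrayList<>();
        if (newName == null || newName.isEmpty()) {
            errors.add("New name must not be empty.");
        } else if (classNames.contains(newName)) {
            errors.add("A class named '" + newName + "' already exists.");
        }
        return errors;
    }

    // Model change: performed only after both checks have passed.
    void createChange() {
        classNames.set(classNames.indexOf(oldName), newName);
    }

    // Runs the three parts in order; an empty result means the change was applied.
    List<String> run() {
        List<String> errors = new ArrayList<>(checkInitialConditions());
        if (errors.isEmpty()) errors.addAll(checkFinalConditions());
        if (errors.isEmpty()) createChange();
        return errors;
    }
}
```

Separating the checks from the change is what allows a tool to report meaningful error messages before touching the model.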
In Sect. 6.3, SCM model refactoring Extract Superclass is applied to eliminate smell Data Clumps. In the following, we demonstrate the use of each specification approach mentioned above for specifying this refactoring.
SCM refactoring Extract Superclass internally uses refactoring Pull Up Attribute to move equal attributes to the newly created parent class (compare Sect. 3.3.4). The model change part of Pull Up Attribute moves the contextual attribute to the specified superclass and removes all equal attributes from their corresponding sibling classes. To specify these changes we can use the amalgamation concept provided by Henshin. This concept contains an interaction scheme consisting of one rule acting as a kernel rule and multiple rules acting as multi-rules. The effect is that the modification defined in the kernel rule is applied exactly once while modifications defined in the multi-rules are applied as often as matches are found.
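The "once plus for-all" behaviour of such an interaction scheme can be imitated in plain Java. The sketch below is not a Henshin specification and does not call the Henshin engine; it only mirrors the effect described above. Classes are modeled as sets of attribute names, and the class PullUpAttribute and its method apply are hypothetical.

```java
import java.util.List;
import java.util.Set;

// Hypothetical plain-Java imitation of the kernel rule / multi-rule effect.
final class PullUpAttribute {

    // Moves the attribute from its owner to the superclass (kernel part, exactly once)
    // and removes equal attributes from all siblings (multi part, once per match).
    // Returns the number of multi-rule matches.
    static int apply(String attribute, Set<String> owner, Set<String> superclass,
                     List<Set<String>> siblings) {
        if (!owner.contains(attribute)) {
            throw new IllegalArgumentException("Owner has no attribute '" + attribute + "'.");
        }
        // Kernel modification: applied exactly once.
        owner.remove(attribute);
        superclass.add(attribute);

        // Multi modification: applied as often as matches are found.
        int matches = 0;
        for (Set<String> sibling : siblings) {
            if (sibling.remove(attribute)) {
                matches++;
            }
        }
        return matches;
    }
}
```

Applied to classes Car, Truck, and Motorbike that all own an attribute manufacturer, the kernel part moves the attribute from Car to Vehicle once, while the multi part fires twice, once for Truck and once for Motorbike.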
8 Tool environment: evaluation
In this section, we evaluate our tool environment for EMF model quality assurance along two different perspectives: suitability and performance (resp. scalability). More information about the test design and results can be found on the complementary website of this article.
Number of proof-of-concept implementations of metrics, smells, and refactorings for Ecore, UML2, and SCM models
Used specification approaches for UML2 metrics, smells, and refactorings
Java specifications of UML2 metrics use 15.2 LoC (Lines of Code) on average (min. 1 LoC; max. 36 LoC), whereas UML2 smells are implemented in 20.5 LoC on average (min. 13 LoC; max. 74 LoC). Refactoring specifications require 99.7 LoC on average (min. 8 LoC; max. 269 LoC). Here, about 20 % (20.2 LoC on average) are used for specifying the model change part only, but almost 80 % (79.5 LoC on average) for specifying the initial and final precondition checks. This shows that the complexity of refactoring specifications is particularly hidden in checking the corresponding preconditions.
As a last topic in our proof-of-concept implementation we have related altogether 16 UML2 smells to 18 potentially suitable refactorings and 14 refactorings to 6 potentially occurring smells.
In summary, our implementations show that our tool environment supports metrics calculation, smell detection, and refactoring of EMF-based models to a large extent. Specifications are compact and concentrate purely on the QA technique to be specified. All further functionality, such as metrics reports, is provided by the framework. Furthermore, it is shown that each supported specification approach is suited to specify metrics, smells, and refactorings which can be used by our tool environment. Our experience with the various specification approaches shows that Java has been the favorite approach for implementing specifications, especially refactoring specifications. This may be due to the preferences of the designer and the maturity of the tool support for each approach. Independent of the preferred specification language, we feel confident that OCL is particularly suited for specifying metrics which can be directly deduced from the contextual model element using adequate meta attributes or references. Henshin transformations have proven well suited especially for specifying the model change part of a refactoring. The specification of new metrics, smells, and refactorings is a straightforward task since it is well supported by comfortable wizards and several concrete specification languages.
8.2 Performance and scalability
To evaluate the scalability of our tool environment, we implemented several performance tests of all three application modules. We performed our tests on a Lenovo ThinkPad W500, Intel Centrino vPro 2.8 GHz, 4 GB RAM.
8.2.1 Metrics calculation
TNME—Total number of elements in the model.
MaxDIT—Maximum of all depths of inheritance trees (context: model).
MaxHAgg—Maximum of aggregation trees (context: model).
DNH—Depth in the nesting hierarchy (context: package).
NATIP—Number of inherited attributes in classes within the package.
NOPIP—Number of inherited operations in classes within the package.
HAgg—Length of the longest path to the leaves in the aggregation hierarchy (context: class).
MaxDITC—Depth of Inheritance Tree (maximum due to multiple inheritance; context: class).
NSUBC2—Number of all children of the class.
NSUPC2—Total number of ancestors of the class.
Results of the performance tests for calculating 10 UML2 metrics on model instances with 100 to 100 000 elements (columns: calculated UML2 metrics, average time needed; reported times include 8 min 36 sec and 33 min 54 sec)
8.2.2 Smell detection
Concrete Superclass—The model contains an abstract class with a concrete superclass.
Equal Attributes in Sibling Classes—Each sibling class of the owning class of an attribute contains an equal attribute.
Specialization Aggregation—The model contains a generalization hierarchy between associations.
Speculative Generality (Abstract Class)—The model contains an abstract class that is inherited by one single class only.
Speculative Generality (Interface)—The model contains an interface that is implemented by one single class only.
Unused Class—The model contains a class that has no child or parent classes, that is not associated to any other classes, and that is not used as attribute or parameter type.
Unused Interface—The model contains an interface that is not specialized by another interface, and not realized or used by any classes.
Results of the performance tests for the detection of 7 UML2 model smells on model instances with 100 to 100 000 elements (columns: detected UML2 smells, average time needed; reported times include 5 min 05 sec and 20 min 50 sec)
8.2.3 Refactoring execution
Results of the performance tests for the application of 7 UML2 model refactorings on model instances having a larger refactoring context
- Refactoring application on a class having 10 attributes and 10 operations
- Refactoring application on a class having 10 attributes and 10 operations. The selected class has 10 child classes already. Each child class has 10 attributes and 10 operations
- Refactoring application on 10 classes having 10 equal attributes and 10 equal operations
- Refactoring application on a class having 10 attributes and 10 operations
- Introduce Parameter Object: Refactoring application on 9 parameters of an operation with 10 input parameters. The owning class has altogether 10 operations with 10 parameters each. Each operation has parameters equal to the selected ones
- Refactoring application on a state with 5 incoming transitions. The parameter state has entry, doAction, and exit behaviour. The parameter state has 5 incoming transitions equal to the selected state. The owning region has 20 further states
- Refactoring application on a class having 10 attributes and 10 operations. The selected class has 10 child classes already. Each child class has 10 attributes and 10 operations
8.2.4 Interpretation of results
The results show that the application modules for metrics calculation and smell detection are well-suited for small and mid-sized EMF-based models. For large-scale models, reporting a high number of calculated metrics (or detected smells) takes noticeably longer and is no longer provided within a satisfying time. However, since static analyses normally do not need to be performed time-critically, this is no crucial limitation of our tool set. Furthermore, the configuration mechanism of our tools can be used to deal efficiently even with large-scale models. For example, the configuration of only a small number of relevant metrics and smells reduces the overall execution time. Moreover, a smell search can be performed on a subtree of the model only, again reducing the overall execution time. Concerning model refactoring, the results show that the refactoring execution module is well-suited for applying refactorings even on large-scale refactoring contexts.
In this article, we present a tool environment for model quality assurance based on the Eclipse Modeling Framework (EMF), a common open source technology in model-based software development. It has been designed to support a syntax-oriented model quality assurance process that can be easily adapted to specific needs in model-based projects. This means that dependent on the modeling language and the modeling purpose, specific quality goals, and hence specific metrics, smells, and refactorings may be defined. In such a tailored process, smell detection and model refactoring can be iterated as long as a reasonable model quality has not been reached.
Our tool environment supports the model designer respectively reviewer by obtaining metrics reports, by checking for potential model deficiencies (called model smells) and by systematically restructuring models using refactorings. Automatically proposed refactorings as quick fixes for occurring smells and information on implications of a selected refactoring concerning new model smells widen the provided functionality and support an integrated use of the quality assurance tools.
Model checks and refactorings can be specified by several specification mechanisms. In this paper, we present Java, OCL, and the model transformation language Henshin as possible specification approaches. However, other model transformation approaches such as EWL (Kolovos et al. 2007) are interesting alternatives to be used. In our tool environment, metrics can be composed into more complex metrics and refactorings can be composed by using a dedicated language named CoMReL. It is up to future work to analyze the preconditions of component refactorings w.r.t. their execution order and to deduce a composite precondition therefrom. A first approach for in-depth composition of refactorings is available for Henshin-specified ones using algebraic graph transformations and critical pair analysis (Ehrig et al. 2006).
As a next step, we plan to evaluate the proposed model quality assurance process in larger case studies using UML models. To do so, we intend to use the UML2 model as language definition and to provide a set of well-known smells and refactorings for class models. A comprehensive catalog of UML metrics, smells and refactorings that have been extracted from literature has already been implemented. A list of implemented techniques can be found on the complementary website of this article (Arendt 2012).
The entire tool set presented belongs to the Eclipse incubation project EMF Refactor (EMF Refactor 2012) and is available under the Eclipse Public License. Furthermore, we integrated our tool environment into the widely used EMF-based UML CASE tool IBM Rational Software Architect (RSA). Here, each version additionally highlights the model elements affected by smells in the graphical model view. It is up to future work to also present the preview of refactoring effects graphically. Both the version for the open source UML tool Eclipse Papyrus and the version for the commercial tool IBM RSA can be installed from the download area of the EMF Refactor homepage. Further information about the integration in Papyrus and RSA can be found in Arendt and Taentzer (2012c).
In future releases, we will continue to make our quality assurance tools more user-friendly. Besides support for further available QA techniques and further specification languages, performance and scalability shall be optimized. To this end, potential inefficiencies in the framework need to be analyzed, and performance-oriented technologies for metric computation and smell detection need to be discussed. Another open issue is how to deal with false positives during model smell detection. These are concrete smell occurrences that are actually non-issues and should be ignored. Here, we think of using mechanisms like @SuppressWarnings in Java to indicate areas to be excluded from a specific smell search. In the context of EMF, EAnnotations might be useful.
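The suppression idea for false positives can be sketched as follows. This is a minimal illustration in plain Java under stated assumptions, not an implemented feature of our tools: the class `Element` is a hypothetical stand-in for an annotatable model element, the annotation source `SUPPRESS_SOURCE`, the smell name "LargeClass", and the attribute threshold are invented for the example. In an EMF-based implementation, the lookup would presumably go through `EModelElement.getEAnnotation(String)` and the details map of the returned EAnnotation instead of the plain maps used here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SmellSuppressionSketch {

    /** Hypothetical annotation source; not defined by EMF or EMF Refactor. */
    static final String SUPPRESS_SOURCE = "http://example.org/qa/suppressSmells";

    /** Minimal stand-in for an annotatable model element. */
    static final class Element {
        final String name;
        final int attributeCount;
        // annotation source -> details (key/value pairs), loosely mirroring EAnnotation details
        final Map<String, Map<String, String>> annotations = new HashMap<>();

        Element(String name, int attributeCount) {
            this.name = name;
            this.attributeCount = attributeCount;
        }

        /** Marks the given smell as suppressed on this element and returns the element. */
        Element suppress(String smellName) {
            annotations.computeIfAbsent(SUPPRESS_SOURCE, k -> new HashMap<>()).put(smellName, "true");
            return this;
        }

        /** Returns true if the given smell has been marked as suppressed on this element. */
        boolean isSuppressed(String smellName) {
            Map<String, String> details = annotations.get(SUPPRESS_SOURCE);
            return details != null && "true".equals(details.get(smellName));
        }
    }

    /** Reports elements with more than 'limit' attributes, skipping suppressed elements. */
    static List<String> detectLargeClass(List<Element> model, int limit) {
        List<String> findings = new ArrayList<>();
        for (Element e : model) {
            if (e.attributeCount > limit && !e.isSuppressed("LargeClass")) {
                findings.add(e.name);
            }
        }
        return findings;
    }

    public static void main(String[] args) {
        List<Element> model = Arrays.asList(
                new Element("Order", 12),
                new Element("LegacyDto", 30).suppress("LargeClass"),
                new Element("Address", 4));
        System.out.println(detectLargeClass(model, 10)); // prints [Order]
    }
}
```

Storing the suppression per smell name, rather than as a single flag, keeps other smell searches active on the same element, which matches the intent of excluding areas from one specific smell search only.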
We are convinced that performing quality assurance processes is an essential task for obtaining software products of high quality. Using the structured model quality assurance process and the corresponding tools presented in this article, model-based and model-driven development can be made more mature, yielding software of higher quality.
In this article, we use UML2 to refer to the standard EMF-based representation of UML2, i.e. org.eclipse.emf.uml2.uml.
This work has been partially funded by Siemens Corporate Technology, Germany. Furthermore, we thank the students Jan Baart, Matthias Burhenne, Gerrit H. Freise, Florian Mantz, Pawel Stepien, and Alexander Weber for their work on our tools. Last but not least, we would like to thank the anonymous reviewers for their valuable comments on the previous version of this article.