This section briefly introduces the notions and concepts related to feature models, explains their importance, and presents various synthesis approaches. The original study is also discussed in detail. A subsection is dedicated to how replication is conducted in Software Engineering, whereas the last part presents the motivation for this replication.
Feature Model Synthesis
A feature model is a compact representation of all the products of a Software Product Line (SPL) in terms of “features”. The variability of software systems is designed with the help of feature models. A feature model is composed of a hierarchical tree of parent-child relationships between features together with a set of constraints between features, which can be expressed as Boolean constraints. A feature model thus tells us which combinations of features are allowed and which are prohibited. In practice, companies typically resort to reverse engineering a feature model of an SPL from its existing product variants only when the number of possible configurations of a product becomes difficult to manage.
Feature models are important for managing the variability of a system because they describe all the possible and valid combinations of features. The features of a system can be mandatory or optional and are depicted as labelled boxes connected in a tree structure. In a feature model, each feature has a parent feature, except for the root of the tree, which is always included in a model.
Several types of relationships can be established between a child feature and its parent feature. Whenever a parent feature is selected, its mandatory child features are also selected, whereas its optional child features may or may not be selected. In an “exclusive-or” relationship, exactly one child feature is selected along with the selected parent. In an “inclusive-or” relationship, at least one child feature is selected along with the selected parent. Along with these relations between parent and child features, the so-called Cross-Tree Constraints (CTCs) help to create relationships between different branches of the feature model. A minimal sketch of these selection rules is given below.
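To make the relation semantics concrete, the following minimal sketch checks a set of selected features against the four relation types. The data layout (tuples of kind, parent, children) and all names are our own illustrative assumptions, not code from the original study.

```python
# Hypothetical encoding of parent-child selection rules; illustrative only.
def valid_under_relations(selected, relations):
    """Check a set of selected feature names against parent-child relations.

    `relations` is a list of (kind, parent, children) tuples, where `kind`
    is one of "mandatory", "optional", "alternative" (xor) or "or".
    """
    for kind, parent, children in relations:
        chosen = [c for c in children if c in selected]
        if parent not in selected:
            if chosen:
                return False      # a child may never appear without its parent
            continue
        if kind == "mandatory" and len(chosen) != len(children):
            return False          # all mandatory children must be selected
        if kind == "alternative" and len(chosen) != 1:
            return False          # exclusive-or: exactly one child
        if kind == "or" and not chosen:
            return False          # inclusive-or: at least one child
        # "optional": any subset of children is allowed
    return True
```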
A Feature Set is a combination of features. This set is valid if all the constraints imposed by the feature model are respected. A feature set table contains all the valid feature sets of the feature model. A feature set table therefore represents a set of products, each product having a different combination of features (i.e., feature set).
Figure 1 shows the feature diagram for the “SmartHome” feature model. As mentioned above, features are depicted as labelled boxes. The set of Cross-Tree Constraints is shown below the feature diagram. The mandatory features are WindowsControl, ManualWindows, HeatingControl, UI, and InhomeScreen, and the optional features are AutomatedWindows, SmartHeating, InternetApp, Security, BurglarAlarm, Siren, Bell, Light, Notification, Authorization, Authentication, LAN, and DoorLock. An inclusive-or relationship can be observed between Siren, Light, Bell, and Notification.
Related Work
Various researchers have proposed solutions to this reverse engineering problem. Haslinger et al. (2011) presented an algorithm that reverse engineers a single feature model by identifying patterns in the selected and non-selected features. This work was later extended with requires and excludes Cross-Tree Constraints (CTCs) (Haslinger et al. 2013). Acher et al. (2012) proposed mapping each product into a feature model and building a single merged feature model that combines all the mapped feature models.
In Lopez-Herrejon et al. (2012), where reverse engineering of feature models was done using evolutionary algorithms, two fitness functions were designed: one describes the desired feature sets but disregards any surplus ones, and the other aims at obtaining exactly the desired number of feature sets.
Another approach (Acher et al. 2013) that investigated the synthesis of a feature model from a set of configurations used two main steps: characterising the different meanings of feature models and identifying the key properties that allow discriminating between them. The synthesis procedure was able to restore the intended meanings of feature models based on inferred or user-specified knowledge.
Andersen et al. (She et al. 2014) proposed algorithms for the synthesis of feature models from propositional constraints by deriving symbolic representations of all candidate diagrams and then deriving instances from these diagrams.
Three standard search-based techniques (evolutionary algorithms, hill climbing, and random search) with two objective functions were evaluated for the reverse engineering of feature models in Lopez-Herrejon et al. (2015). The two objective functions were refined into a third one, an information retrieval measure.
In Assunção et al. (2016), the authors used a set of feature sets to determine precision and recall, and a dependency graph to compute the variability safety of a feature model. This approach takes a multi-objective perspective and helps to obtain feature models that represent the desired feature combinations while ensuring that these combinations are well-formed, more precisely variability safe.
Dynamic Software Product Lines are another approach, presented in Larissa Luciano Carvalho et al. (2017), which, however, does not address the problem of reverse engineering feature models. The focus is on developing dynamically adaptable software that manages reconfiguration during execution. This approach aims to reuse components and to adapt to environment changes or user requests, and is implemented with the help of Object Oriented Programming (OOP) as well as Aspect Oriented Programming (AOP).
We would like to emphasize that our replication targets a paper (Linsbauer et al. 2014) that was published earlier than some of the newer work on feature model synthesis mentioned above. The investigations (Lopez-Herrejon et al. 2015; Assunção et al. 2016; Larissa Luciano Carvalho et al. 2017) that followed this paper improve various aspects of the approach, such as: the use of several standard search-based techniques with two objective functions for the reverse engineering of feature models, multi-objective algorithms to obtain the desired feature combinations under consideration of variability safety, and the development of dynamically adaptable software that manages reconfiguration during execution. The aim of our replication study is not to improve aspects of the proposed approach but to confirm the already obtained results and to contribute information regarding the relation between the number of features, the number of generations, and the population size, providing insights into the best values for several parameters of the genetic programming algorithm.
The Original Study
We conduct a replication study of the approach of Linsbauer et al. (2014), where the authors applied genetic programming to the problem of reverse engineering feature models in the realm of SPLs. They started from previous work (Lopez-Herrejon et al. 2012) in which a genetic algorithm was used for reverse engineering feature models. In Linsbauer et al. (2014), the authors started with a set of 17 feature models of actual SPLs. The implementation was made using the ECJ framework for evolutionary computation. For each of those feature models they computed the respective set of valid feature sets and used these sets as input for their Genetic Programming (GP) pipeline, the Random Search (RS) baseline, and the genetic algorithm approach.
After obtaining the results, a statistical analysis was performed using the Wilcoxon Signed-Rank Test. The test was applied to the average fitness function values to compare the genetic programming approach against the random search baseline and against the genetic algorithm approach.
In the end, the test indicated a significant difference between the three approaches, and by comparing the results it was established that the genetic programming approach outperforms both the genetic algorithm approach and the random search baseline.
The investigation considered the following GP parameters: population size = 100, maximum number of generations = 100, crossover probability = 0.7, feature tree mutation probability = 0.5, and CTCs mutation probability = 0.5.
To compare their results, the authors used as a baseline a Random Search (RS) that simply creates feature models at random in the hope of finding a good solution. The number of random tries is set to the product of the maximum number of generations and the population size of the GP. Thus the number of evaluated candidate feature model individuals is the same for both approaches under consideration (GP and RS), and the number of performed evaluations is:
maxGenerations × populationSize = 100 × 100 = 10000.
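For illustration, the parameter values and the shared evaluation budget can be written down directly; the variable names below are our own, not from the original ECJ-based implementation.

```python
# Parameter values reported in the original study; names are ours.
GP_PARAMS = {
    "population_size": 100,
    "max_generations": 100,
    "crossover_prob": 0.7,
    "tree_mutation_prob": 0.5,
    "ctc_mutation_prob": 0.5,
}

# GP and RS evaluate the same number of candidate feature models.
EVALUATION_BUDGET = GP_PARAMS["max_generations"] * GP_PARAMS["population_size"]
assert EVALUATION_BUDGET == 10_000
```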
The investigation also examined the results for feature models with different numbers of features: 4 models with [6, 10] features, 4 models with (10, 15] features, 5 models with (15, 20] features, and 4 models with more than 20 features.
The GP pipeline that was applied in the original study is also used in our replication study. This GP pipeline includes a set of operators, namely Builder, Selection, Crossover, Reproduction, Mutation, and Breeding, which are described next.
Feature Model Representation
In the initial study, as well as in other studies, the Model Driven Engineering (MDE) approach was used. This approach uses a metamodel to define the structure and semantics of the models that can be derived from it. The metamodel is provided in Fig. 2. A simplified version of the SPLX metamodel (which is a common representation for feature models) was chosen; it describes the structure of a feature model individual. The metamodel shows that there is exactly one root feature. We can also see the Mandatory and Optional child features as well as the Alternative and Or group relations, which must have at least one GroupedFeature as a child. Along with these are the CTCs of a feature model individual.
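To illustrate the structure described above, the following minimal sketch encodes it as plain data classes. The class and field names are our own assumptions derived from this description, not the original metamodel.

```python
# Minimal sketch of a simplified feature model structure; names are ours.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Group:
    kind: str                               # "alternative" (xor) or "or"
    children: List["Feature"] = field(default_factory=list)  # >= 1 grouped feature

@dataclass
class Feature:
    name: str
    mandatory: List["Feature"] = field(default_factory=list)
    optional: List["Feature"] = field(default_factory=list)
    groups: List[Group] = field(default_factory=list)

@dataclass
class FeatureModel:
    root: Feature                           # exactly one root feature
    ctcs: List[frozenset] = field(default_factory=list)      # CTC clauses
```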
Fitness Function
This pipeline uses a fitness function based on information retrieval metrics, used by the Evaluator to describe the fitness of an individual. To obtain this function, the authors of the original study defined two auxiliary functions called Precision and Recall. According to the initial study, Precision is the fraction of the retrieved feature sets that are relevant to the search, and Recall is the fraction of the relevant feature sets that are successfully retrieved. They are calculated using the following expressions:
$$ precision(sfs,fm)=\frac{\# containedFeatureSets(sfs,fm)}{\# featureSets(fm)}; $$
$$ recall(sfs,fm)=\frac{\# containedFeatureSets(sfs,fm)}{|sfs|}. $$
Here “\(\# containedFeatureSets : SFS \times FM \rightarrow \mathbb {N}\)” represents the number of feature sets from the first argument sfs that are valid according to a feature model fm, and “\(\# featureSets : FM \rightarrow \mathbb {N}\)” represents the number of feature sets denoted by a feature model fm. \(F_{\beta}\) is a weighted harmonic mean of precision and recall; \(\beta\) indicates how many times recall is weighted more than precision. \(F_{\beta}\) is obtained using the expression:
$$ F_{\beta}=\frac{(1 + \beta^{2}) \times precision \times recall}{\beta^{2} \times precision + recall}. $$
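A minimal sketch of these measures, assuming feature sets are represented as frozensets of feature names and that the feature sets denoted by the candidate model are already enumerated; the function and parameter names are ours:

```python
# Illustrative computation of precision, recall and F_beta; not the
# original study's code. `sfs` and `fm_sets` are sets of frozensets.
def f_beta(sfs, fm_sets, beta=1.0):
    contained = len(sfs & fm_sets)       # feature sets in sfs valid under fm
    precision = contained / len(fm_sets) if fm_sets else 0.0
    recall = contained / len(sfs) if sfs else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0                       # avoid division by zero
    return ((1 + beta**2) * precision * recall) / (beta**2 * precision + recall)
```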
Every node of a feature model is implemented as a function that manipulates a set of feature sets in order to compute the final feature sets that are represented by the whole feature model. This implementation is used to obtain the required metrics.
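As an illustration of this node-as-function view, the sketch below enumerates the feature sets denoted by a feature tree built from the data classes sketched earlier, ignoring CTC filtering. It is our own simplified reading of the description, not the original implementation.

```python
from itertools import combinations, product

def feature_sets(feature):
    """Enumerate the feature sets denoted by the subtree rooted at `feature`.

    Each relation acts as a function combining its children's feature sets;
    CTC filtering would be applied to the result afterwards. Sketch only.
    """
    results = [frozenset({feature.name})]
    for child in feature.mandatory:
        # Mandatory: every set must include one of the child's sets.
        results = [r | c for r in results for c in feature_sets(child)]
    for child in feature.optional:
        # Optional: either skip the child or include one of its sets.
        results = [r | c for r in results
                   for c in [frozenset()] + feature_sets(child)]
    for group in feature.groups:
        child_sets = [feature_sets(ch) for ch in group.children]
        if group.kind == "alternative":
            choices = [c for sets in child_sets for c in sets]  # exactly one
        else:
            choices = []                                        # "or": >= 1
            for k in range(1, len(child_sets) + 1):
                for combo in combinations(child_sets, k):
                    choices += [frozenset().union(*pick) for pick in product(*combo)]
        results = [r | c for r in results for c in choices]
    return results
```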
Operators, Mutation and Crossover
The operators necessary for GP were developed based on the tree structures derived from the metamodel and on the domain constraints. These operators are the Builder, the Crossover, and the Mutator.
The Builder creates random feature trees and random CTCs according to the metamodel and the domain constraints. In the original study it was implemented using the FaMa (Benavides et al. 2007) and BeTTy (Segura et al. 2012) frameworks.
A feature model individual undergoes small random changes made by the Mutator. According to the initial study (Linsbauer et al. 2014), the mutations that can be performed are:
- Randomly swaps two features in the feature tree.
- Randomly changes an Alternative relation to an Or relation or vice versa.
- Randomly changes an Optional or Mandatory relation to any other kind of relation (Mandatory, Optional, Alternative, Or).
- Randomly selects a subtree in the feature tree and puts it somewhere else in the tree without violating the metamodel or any of the domain constraints.
Also, the mutations performed on the CTCs, which are applied with equal probability, are listed next (a short sketch follows the list):
- Adds a new, randomly created CTC (i.e., a clause) that does not contradict the other CTCs and does not already exist.
- Randomly removes a CTC (i.e., a clause).
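A minimal sketch of the two CTC mutations, assuming clauses are frozensets of (feature, sign) literals; all names are our own, and the contradiction check the original study requires is only hinted at:

```python
import random

def mutate_ctcs(ctcs, features, rng=random):
    """Apply one CTC mutation with equal probability: remove or add a clause."""
    if ctcs and rng.random() < 0.5:
        ctcs.remove(rng.choice(ctcs))           # randomly remove one clause
    else:
        literals = [(f, sign) for f in features for sign in (True, False)]
        clause = frozenset(rng.sample(literals, k=2))  # e.g. requires/excludes
        # The original study also requires that the new clause does not
        # contradict the existing CTCs; here we only avoid duplicates.
        if clause not in ctcs:
            ctcs.append(clause)
    return ctcs
```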
The Crossover deals with the creation of two new individuals (offspring) from two individuals that belong to the current population (parents). The new individuals should maintain desirable traits from both parents.
The original study describes the crossover operator for feature model individuals as follows:
- In the first step, the offspring is initialized with the root feature of Parent 1; the root feature of Parent 2 is added to the offspring if it differs from that of Parent 1, in which case it is added as a mandatory child feature of the offspring's root feature.
- The second step is to traverse the levels of the first parent, starting from the root node, and add a random number r of features (that are not already contained) to the offspring. This is done by appending the features to their respective parent feature already contained in the offspring, using the same relation type between them.
- In the third step, the second step is repeated, but this time using the second parent.
- In the last step, the previous two steps are repeated until every feature is contained in the offspring.
- The second offspring is obtained in the same way but with the positions of the parents reversed. Regarding the crossover for CTCs, the union of the CTCs of the two parents is computed; a random subset is assigned to the first offspring and the remainder to the second offspring (a sketch is given below).
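The CTC part of the crossover is simple to sketch: union the parents' clauses and split them randomly between the two offspring. The feature tree part is omitted; the names below are our own.

```python
import random

def crossover_ctcs(ctcs1, ctcs2, rng=random):
    """Split the union of the parents' CTC clauses between two offspring."""
    pool = list(set(ctcs1) | set(ctcs2))   # union of both parents' clauses
    rng.shuffle(pool)
    cut = rng.randint(0, len(pool))        # random subset for offspring 1
    return pool[:cut], pool[cut:]          # the remainder goes to offspring 2
```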
Replication in Software Engineering
Experimentation plays a major role in scientific progress, and replication is one of the essentials of the experimental method: experiments are repeated in order to check their results. Successful replication increases the validity of the outcomes observed in the experiments.
For an experiment to be considered a replication (Shepperd et al. 2018), the following elements are required: the authors must explicitly state which original experiment is being replicated; the purpose of the replication study must include extending the external validity of the experiment (i.e., adding to our understanding of how the results generalise); both experiments must have research questions or hypotheses in common (i.e., there are shared constructs and interventions); and the analysis must contain a comparison of original and new results with a view to confirming or disconfirming the original experiment. Note that we intentionally avoid judgements such as “successful”. Internal replications (the replication team includes members from the original experiment) or external replications (the entire replication team is independent of the original team) may be performed; some researchers express a preference for external replications, which are more independent but may be unintentionally less exact.
Other research investigations concerning how to do replications of software engineering experiments (Carver 2010; Carver et al. 2014) emphasized guidelines that suggest four types of information to include in a replication report: (1) information about the original study to provide enough context for understanding the replication; (2) information about the replication to help readers understand specific important details about replication itself; (3) comparison of replication results with original study results to illustrate commonalities and differences in the results obtained, and (4) conclusions across studies to provide readers with important insights that can be drawn from the series of studies that may not be obvious from a single study.
The elements of the software engineering experimental configuration are investigated and established in Gómez et al. (2014). There are four dimensions: operationalization, population, protocol, and experimenters. A common practice is that the authors should rigorously document their replication designs (Fagerholm et al. 2019).
- Dimension: Operationalization. Operationalization describes the act of translating a construct into its manifestation, resulting in cause and effect constructs: cause constructs are operationalized into treatments (they indicate how similar the replication is to the baseline experiment), and effect constructs are operationalized into metrics and measurement procedures.
- Dimension: Population. Replications should also study the properties of experimental objects. Specifications, design documents, source code, and programs are all examples of experimental objects. Replications examine the limits of the properties of experimental objects for which the results hold.
- Dimension: Protocol. The elements of the experimental protocol that can vary in a replication are: experimental design, experimental objects, guides, measuring instruments, and data analysis techniques.
- Dimension: Experimenters. This dimension refers to the people involved in the experiment. A replication (Gómez et al. 2014) should verify whether the observed results are independent of the experimenters by varying the people who perform each role (designer, trainer, monitor, measurer, analyst).
Each of the stated dimensions will be discussed in the next section, particularly for our replication design.
Replication in software engineering is of paramount importance and should be conducted both to confirm results and to discover additional knowledge about the investigated method.
Motivation for the Replication
Conducting a replication study (Dyba et al. 2005) has two main aims: first, to confirm the results of the original study, and second, to generate new knowledge that otherwise could not be created.
Thus, our replication further investigates the GP approach and the “behavior” of the algorithm when considering various numbers of features with different characteristics, examining the relation between the number of features and the number of generations, and between the number of features and the population size, as well as exploring the impact of the GP parameters (crossover and mutation probabilities) on the obtained solutions.