Empirical Software Engineering, Volume 21, Issue 4, pp 1794–1841

Breathing ontological knowledge into feature model synthesis: an empirical study

  • Guillaume Bécan
  • Mathieu Acher
  • Benoit Baudry
  • Sana Ben Nasr


Feature Models (FMs) are a popular formalism for modeling and reasoning about the configurations of a software product line. As the manual construction of an FM is time-consuming and error-prone, management operations have been developed for reverse engineering, merging, slicing, or refactoring FMs from a set of configurations/dependencies. Yet the synthesis of meaningless ontological relations in the FM – as defined by its feature hierarchy and feature groups – may arise and cause severe difficulties when reading, maintaining or exploiting it. Numerous synthesis techniques and tools have been proposed, but only a few consider both configuration and ontological semantics of an FM. There are also few empirical studies investigating ontological aspects when synthesizing FMs. In this article, we define a generic, ontologic-aware synthesis procedure that computes the likely siblings or parent candidates for a given feature. We develop six heuristics for clustering and weighting the logical, syntactical and semantical relationships between feature names. We then perform an empirical evaluation on hundreds of FMs, coming from the SPLOT repository and Wikipedia. We provide evidence that a fully automated synthesis (i.e., without any user intervention) is likely to produce FMs far from the ground truths. As the role of the user is crucial, we empirically analyze the strengths and weaknesses of heuristics for computing ranking lists and different kinds of clusters. We show that a hybrid approach mixing logical and ontological techniques outperforms state-of-the-art solutions. We believe our approach, environment, and empirical results support researchers and practitioners working on reverse engineering and management of FMs.
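As a rough illustration of what a syntactical heuristic for ranking parent candidates might look like, the sketch below scores each candidate by a normalized edit distance to the feature name. It is not the authors' implementation. The function names, the normalization, and the example feature names are all assumptions made for this illustration, and the candidate list is assumed to have been narrowed beforehand by the logical constraints.

```python
from typing import List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1]; 1.0 means identical names."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def rank_parent_candidates(feature: str,
                           candidates: List[str]) -> List[Tuple[str, float]]:
    """Return candidates sorted by decreasing similarity to the feature name.

    `candidates` is hypothetical input: the features that the logical
    constraints still allow as parents of `feature`.
    """
    scored = [(c, similarity(feature, c)) for c in candidates]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    # Hypothetical feature names, chosen only for illustration.
    for name, score in rank_parent_candidates(
            "Storage", ["Wiki", "Storage Engine", "License"]):
        print(f"{name}: {score:.2f}")
```

A user would then pick a parent from the top of such a list rather than accept it blindly, in line with the abstract's finding that fully automated synthesis is likely to drift far from the ground truth.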


Keywords: Software product lines, Feature model, Variability, Model management, Reverse engineering, Refactoring



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Guillaume Bécan (1)
  • Mathieu Acher (1)
  • Benoit Baudry (1)
  • Sana Ben Nasr (1)

  1. Inria / IRISA, University of Rennes 1, Rennes, France
