Abstract
Almost every sufficiently complex software system today is configurable. Conditional compilation is a simple variability-implementation mechanism that is widely used in open-source projects and industry. In particular, the C preprocessor (CPP) is very popular in practice, and it is attracting renewed interest in academia. Although there have been several attempts to understand and improve CPP, it is not well understood how it is used in open-source and industrial systems and whether different usage patterns have emerged. The underlying problem is that much research on configurable systems and product lines concentrates on open-source systems, simply because they are available for study in the first place. As a result, it is unclear whether the results obtained from these studies are transferable to industrial systems. We aim to narrow this gap by comparing the use of CPP in open-source projects and in industry, especially in the embedded-systems domain, based on a substantial set of subject systems and well-known variability metrics, including size, scattering, and tangling metrics. A key result of our empirical study is that the analyzed open-source systems and the considered embedded systems from industry are similar in almost all aspects we studied, including systems that have been developed in industry and made open source at some point. Our study thus indicates that, for CPP as a variability-implementation mechanism, insights, methods, and tools developed based on studies of open-source systems are transferable to industrial systems, at least with respect to the metrics we considered.
Notes
In a violin plot, the white dot indicates the median, the small black horizontal bar shows the mean value, the wide black vertical bar spans from first to third quartile, and the shape describes the kernel density.
Acknowledgments
This work was partially supported by the DFG (German Research Foundation; AP 206/4, AP 206/5, AP 206/6) under the Priority Programme SPP1593 (Design For Future-Managed Software Evolution) and by NSF grant CCF-1318808. Furthermore, this work was partially sponsored by the Innovation Center Applied System Modeling, which is funded by Fraunhofer and the state of Rhineland-Palatinate of the Federal Republic of Germany.
Additional information
Communicated by: Ebrahim Bagheri, David Benavides, Per Runeson and Klaus Schmid
Janet Siegmund published previous work as Janet Feigenspan.
Cite this article
Hunsen, C., Zhang, B., Siegmund, J. et al. Preprocessor-based variability in open-source and industrial software systems: An empirical study. Empir Software Eng 21, 449–482 (2016). https://doi.org/10.1007/s10664-015-9360-1
DOI: https://doi.org/10.1007/s10664-015-9360-1