Preprocessor-based variability in open-source and industrial software systems: An empirical study

Empirical Software Engineering

Abstract

Almost every sufficiently complex software system today is configurable. Conditional compilation is a simple variability-implementation mechanism that is widely used in open-source projects and industry. The C preprocessor (CPP) in particular is very popular in practice, and it is attracting renewed interest in academia. Although there have been several attempts to understand and improve CPP, it is still poorly understood how it is used in open-source and industrial systems and whether different usage patterns have emerged. The background is that much research on configurable systems and product lines concentrates on open-source systems, simply because they are available for study in the first place. As a result, it is unclear whether the results obtained from these studies transfer to industrial systems. We aim to narrow this gap by comparing the use of CPP in open-source projects and in industry, especially in the embedded-systems domain, based on a substantial set of subject systems and well-known variability metrics, including size, scattering, and tangling metrics. A key result of our empirical study is that the analyzed open-source systems and the considered embedded systems from industry are similar with respect to most of the metrics we studied; this also holds for systems that were developed in industry and made open source at some point. Our study thus indicates that, for CPP as a variability-implementation mechanism, insights, methods, and tools developed based on studies of open-source systems are transferable to industrial systems, at least with respect to the metrics we considered.
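To make the mechanism concrete, the following is a minimal sketch of conditional compilation with CPP. It is not taken from any of the studied systems: the feature constants `FEATURE_LOGGING` and `FEATURE_CHECKSUM` and the functions `enabled_feature_count` and `process` are hypothetical names chosen for illustration. The sketch assumes the common reading of the metrics named above, in which scattering refers to a feature constant guarding code in several places and tangling refers to several feature constants appearing in one preprocessor expression.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * FEATURE_LOGGING and FEATURE_CHECKSUM are hypothetical feature constants,
 * invented for this illustration. A variant is selected at compile time,
 * for example: cc -DFEATURE_LOGGING -DFEATURE_CHECKSUM example.c
 */

/* Reports how many of the optional features were compiled into this variant. */
int enabled_feature_count(void)
{
    int count = 0;
#ifdef FEATURE_LOGGING
    count++;
#endif
#ifdef FEATURE_CHECKSUM
    count++;
#endif
    return count;
}

/* Base behavior: the length of the input. Optional features extend it. */
unsigned process(const char *data)
{
    assert(data != NULL);
    unsigned result = (unsigned)strlen(data);

#ifdef FEATURE_CHECKSUM
    /* Optional feature: add the byte values of the input to the length. */
    for (size_t i = 0; data[i] != '\0'; i++) {
        result += (unsigned char)data[i];
    }
#endif

#if defined(FEATURE_LOGGING) && defined(FEATURE_CHECKSUM)
    /* Tangling: this expression refers to two feature constants at once. */
    fprintf(stderr, "checksum-adjusted result: %u\n", result);
#elif defined(FEATURE_LOGGING)
    fprintf(stderr, "processed %u bytes\n", result);
#endif

    return result;
}
```

Compiled without either flag, both optional features are removed before the compiler sees the code, so `process("abc")` returns the plain length 3. Each constant also guards code in more than one location, which is the kind of scattering that the size, scattering, and tangling metrics are meant to quantify.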

Acknowledgments

This work was partially supported by the DFG (German Research Foundation, AP 206/4, AP 206/5, AP 206/6) under the Priority Programme SPP1593 (Design For Future-Managed Software Evolution) and by NSF grant CCF-1318808. Furthermore, this work was partially sponsored by the Innovation Center Applied System Modeling, which is funded by Fraunhofer and the state of Rhineland-Palatinate of the Federal Republic of Germany.

Author information

Corresponding author

Correspondence to Claus Hunsen.

Additional information

Communicated by: Ebrahim Bagheri, David Benavides, Per Runeson and Klaus Schmid

Janet Siegmund published previous work as Janet Feigenspan.

About this article

Cite this article

Hunsen, C., Zhang, B., Siegmund, J. et al. Preprocessor-based variability in open-source and industrial software systems: An empirical study. Empir Software Eng 21, 449–482 (2016). https://doi.org/10.1007/s10664-015-9360-1

  • DOI: https://doi.org/10.1007/s10664-015-9360-1
