An investigation of misunderstanding code patterns in C open-source software projects

  • Flávio Medeiros
  • Gabriel Lima
  • Guilherme Amaral
  • Sven Apel
  • Christian Kästner
  • Márcio Ribeiro
  • Rohit Gheyi

Abstract

Maintenance consumes 40% to 80% of software development costs, so writing source code that is easy to understand is essential to reducing maintenance costs. Improving code understanding matters because developers often mistake the meaning of code and misjudge program behavior, which can lead to errors. Some patterns in source code, such as operator precedence and the comma operator, have been shown to influence code understanding negatively. Despite these initial results, the patterns have not been evaluated in a real-world setting. Thus, it is not clear whether developers agree that the patterns studied by researchers can cause substantial misunderstandings in real-world practice. To better understand the relevance of misunderstanding patterns, we applied a mixed-methods research approach, combining repository mining and a survey with developers, to evaluate misunderstanding patterns in 50 C open-source projects, including Apache, OpenSSL, and Python. Overall, we found more than 109K occurrences of the 12 patterns in practice. Our study shows that, according to developers, only some of the patterns previously considered by researchers may cause misunderstandings. Our results complement previous studies by taking the perception of developers into account.
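
As a rough illustration of the two patterns named above, the following C sketch pairs each with a more explicit equivalent. The function names and expressions are hypothetical, written for this example only; they are not taken from the study or from any of the analyzed projects.

```c
/* Illustrative sketch only: hypothetical functions, not code from the study. */

/* Operator precedence: && binds tighter than ||, so the expression
   parses as a || (b && c), not as (a || b) && c. */
int precedence_implicit(int a, int b, int c) {
    return a || b && c;
}

/* Same behavior with the grouping spelled out. */
int precedence_explicit(int a, int b, int c) {
    return a || (b && c);
}

/* Comma operator: the left operand is evaluated first (x is incremented),
   then the right operand is evaluated and becomes the value of the
   whole expression. */
int comma_implicit(int x) {
    int y = (x++, x * 2);
    return y;
}

/* Same behavior written as two separate statements. */
int comma_explicit(int x) {
    x++;
    return x * 2;
}
```

A reader who groups the first expression as (a || b) && c would expect precedence_implicit(1, 0, 0) to be 0, while it actually evaluates to 1. Some compilers can warn about the unparenthesized form, which hints at how easily it is misread.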

Keywords

Misunderstanding patterns · Repository mining · Survey

Notes

Acknowledgments

We would like to thank Dan Gopstein for the useful feedback regarding our study. Apel’s work has been supported by the German Research Foundation (AP 206/6). This work was funded by CNPq (308380/2016-9, 477943/2013-6, 460883/2014-3, 465614/2014-0, 306610/2013-2, 307190/2015-3, and also CNPq 409335/2016-9), FAPEAL (PPG 14/2016), and CAPES grants (175956 and 117875).

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Federal Institute of Alagoas (IFAL), Maceió, Brazil
  2. Federal University of Alagoas (UFAL), Maceió, Brazil
  3. Universität Passau, Passau, Germany
  4. Carnegie Mellon University (CMU), Pittsburgh, USA
  5. Federal University of Campina Grande (UFCG), Paraíba, Brazil
