Empirical Software Engineering, Volume 21, Issue 1, pp 104–158

Linguistic antipatterns: what they are and how developers perceive them

  • Venera Arnaoudova
  • Massimiliano Di Penta
  • Giuliano Antoniol


Antipatterns are known as poor solutions to recurring problems. For example, Brown et al. and Fowler define practices concerning poor design or implementation solutions. However, the source code lexicon is also among the factors that affect the psychological complexity of a program, i.e., the factors that make a program difficult for humans to understand and maintain. The aim of this work is to identify recurring poor practices, called Linguistic Antipatterns (LAs), that consist of inconsistencies among the naming, documentation, and implementation of an entity and that may impair program understanding. To this end, we first mine examples of such inconsistencies in real open-source projects and abstract them into a catalog of 17 recurring LAs related to methods and attributes. Then, to understand the relevance of LAs, we perform two empirical studies with developers: 30 external (i.e., not familiar with the code) and 14 internal (i.e., people developing or maintaining the code). Results indicate that the majority of the participants (69 % of the external and 51 % of the internal developers) perceive LAs as poor practices that should therefore be avoided. As further evidence of LAs’ validity, open-source developers who were made aware of LAs reacted to the issue by making code changes in 10 % of the cases. Finally, to facilitate the use of LAs in practice, we identify a subset of LAs that were universally agreed upon as problematic: those with a clear dissonance between code behavior and lexicon.
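To make the notion of an inconsistency between naming and implementation concrete, the following is a minimal illustrative sketch. It is not taken from the paper's catalog or tooling: the `Account` class, its methods, and the `find_non_boolean_predicates` helper are hypothetical, and the check assumes that a method whose name starts with `is_` is expected to be annotated as returning `bool`.

```python
from typing import List, get_type_hints


class Account:
    """Hypothetical class used only to illustrate a naming inconsistency."""

    def __init__(self, balance: int) -> None:
        self.balance = balance

    def is_overdrawn(self) -> int:
        # The name promises a yes/no answer, but the method returns an amount.
        return max(0, -self.balance)

    def is_empty(self) -> bool:
        # Here the name and the declared return type agree.
        return self.balance == 0


def find_non_boolean_predicates(cls: type) -> List[str]:
    """List methods named 'is_*' whose annotated return type is not bool.

    This is an assumed, simplistic heuristic, not the detection approach
    described by the authors.
    """
    flagged = []
    for name, member in vars(cls).items():
        if not name.startswith("is_") or not callable(member):
            continue
        return_type = get_type_hints(member).get("return")
        if return_type is not bool:
            flagged.append(name)
    return sorted(flagged)


flagged_methods = find_non_boolean_predicates(Account)
```

In this sketch, only `is_overdrawn` is flagged, because its name suggests a Boolean while its declared return type is `int`. A reader who relies on the name alone would misjudge what the method returns.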


Source code identifiers; Linguistic antipatterns; Empirical study; Developers’ perception



The authors would like to thank the participants in the two studies for their valuable time and effort. They made this work possible.


  1. Abbes M, Khomh F, Guéhéneuc YG, Antoniol G (2011) An empirical study of the impact of two antipatterns, Blob and Spaghetti Code, on program comprehension. In: Proceedings of the European Conference on Software Maintenance and Reengineering (CSMR), pp 181–190
  2. Abebe S, Tonella P (2011) Towards the extraction of domain concepts from the identifiers. In: Proceedings of the Working Conference on Reverse Engineering (WCRE), pp 77–86
  3. Abebe S, Tonella P (2013) Automated identifier completion and replacement. In: Proceedings of the European Conference on Software Maintenance and Reengineering (CSMR), pp 263–272
  4. Abebe SL, Haiduc S, Tonella P, Marcus A (2011) The effect of lexicon bad smells on concept location in source code. In: Proceedings of the International Working Conference on Source Code Analysis and Manipulation (SCAM), pp 125–134
  5. Abebe SL, Arnaoudova V, Tonella P, Antoniol G, Guéhéneuc YG (2012) Can lexicon bad smells improve fault prediction? In: Proceedings of the Working Conference on Reverse Engineering (WCRE), pp 235–244
  6. Anquetil N, Lethbridge T (1998) Assessing the relevance of identifier names in a legacy software system. In: Proceedings of the International Conference of the Centre for Advanced Studies on Collaborative Research (CASCON), pp 213–222
  7. Arnaoudova V, Di Penta M, Antoniol G, Guéhéneuc YG (2013) A new family of software anti-patterns: Linguistic anti-patterns. In: Proceedings of the European Conference on Software Maintenance and Reengineering (CSMR), pp 187–196
  8. Arnaoudova V, Eshkevari L, Di Penta M, Oliveto R, Antoniol G, Guéhéneuc YG (2014) Repent: Analyzing the nature of identifier renamings. IEEE Trans Softw Eng (TSE) 40(5):502–532
  9. Brooks R (1983) Towards a theory of the comprehension of computer programs. Int J Man-Machine Stud 18(6):543–554
  10. Brown WJ, Malveau RC, Brown WH, McCormick III HW, Mowbray TJ (1998a) Anti patterns: refactoring software, architectures, and projects in crisis, 1st edn. Wiley, New York
  11. Brown WJ, Malveau RC, HWM III, Mowbray TJ (1998b) AntiPatterns: refactoring software, architectures, and projects in crisis. Wiley, New York
  12. Caprile B, Tonella P (1999) Nomen est omen: Analyzing the language of function identifiers. In: Proceedings of Working Conference on Reverse Engineering (WCRE), pp 112–122
  13. Caprile B, Tonella P (2000) Restructuring program identifier names. In: Proceedings of the International Conference on Software Maintenance (ICSM), pp 97–107
  14. Chaudhary BD, Sahasrabuddhe HV (1980) Meaningfulness as a factor of program complexity. In: Proceedings of the ACM Annual Conference, ACM, ACM ’80, pp 457–466
  15. De Lucia A, Di Penta M, Oliveto R (2011) Improving source code lexicon via traceability and information retrieval. IEEE Trans Softw Eng 37(2):205–227
  16. Deissenbock F, Pizka M (2005) Concise and consistent naming. In: Proceedings of the International Workshop on Program Comprehension (IWPC), pp 97–106
  17. Fowler M (1999) Refactoring: improving the design of existing code. Addison-Wesley, MA
  18. Gamma E, Helm R, Johnson R, Vlissides J (1995) Design patterns: elements of reusable object oriented software. Addison-Wesley, Boston
  19. Glaser BG (1992) Basics of grounded theory analysis. Sociology Press
  20. Grissom RJ, Kim JJ (2005) Effect sizes for research: a broad practical approach, 2nd edn. Lawrence Erlbaum Associates
  21. Groves RM, Fowler Jr FJ, Couper MP, Lepkowski JM, Singer E, Tourangeau R (2009) Survey methodology, 2nd edn. Wiley, New York
  22. Hintze JL, Nelson RD (1998) Violin plots: a box plot-density trace synergism. Am Stat 52(2):181–184
  23. Jedlitschka A, Pfahl D (2005) Reporting guidelines for controlled experiments in software engineering. In: International symposium on empirical software engineering
  24. Khomh F, Di Penta M, Guéhéneuc YG (2009) An exploratory study of the impact of code smells on software change-proneness. In: Proceedings of the working conference on reverse engineering (WCRE), pp 75–84
  25. Khomh F, Di Penta M, Guéhéneuc YG, Antoniol G (2012) An exploratory study of the impact of antipatterns on class change- and fault-proneness. Empir Softw Eng 17(3):243–275
  26. Kitchenham B, Pfleeger S, Pickard L, Jones P, Hoaglin D, El Emam K, Rosenberg J (2002) Preliminary guidelines for empirical research in software engineering. IEEE Trans Softw Eng (TSE) 28(8):721–734
  27. Lawrie D, Morrell C, Feild H, Binkley D (2006) What’s in a name? a study of identifiers. In: Proceedings of the International Conference on Program Comprehension (ICPC), pp 3–12
  28. Lawrie D, Morrell C, Feild H, Binkley D (2007) Effective identifier names for comprehension and memory. Innovations Syst Softw Eng 3(4):303–318
  29. Merlo E, McAdam I, De Mori R (2003) Feed-forward and recurrent neural networks for source code informal information analysis. J Softw Maint 15(4):205–244
  30. Miller GA (1995) WordNet: a lexical database for English. Commun ACM 38(11):39–41
  31. Moha N, Guéhéneuc YG, Duchien L, Le Meur AF (2010) DECOR: a method for the specification and detection of code and design smells. IEEE Trans Softw Eng (TSE’10) 36(1):20–36
  32. Nagappan M, Zimmermann T, Bird C (2013) Diversity in software engineering research. In: Proceedings of the joint meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE), pp 466–476
  33. Oppenheim AN (1992) Questionnaire design, interviewing and attitude measurement. Pinter, London
  34. Palomba F, Bavota G, Di Penta M, Oliveto R, De Lucia A, Poshyvanyk D (2013) Detecting bad smells in source code using change history information. In: Proceedings of the international conference on automated software engineering (ASE), pp 268–278
  35. Palomba F, Bavota G, Di Penta M, Oliveto R, De Lucia A (2014) Do they really smell bad? A study on developers’ perception of code bad smells. In: International conference on software maintenance and evolution (ICSME), to appear
  36. Parsons J, Saunders C (2004) Cognitive heuristics in software engineering: applying and extending anchoring and adjustment to artifact reuse. IEEE Trans Softw Eng (TSE) 30(12):873–888
  37. Prechelt L, Unger-Lamprecht B, Philippsen M, Tichy W (2002) Two controlled experiments assessing the usefulness of design pattern documentation in program maintenance. IEEE Trans Softw Eng (TSE) 28(6):595–606
  38. Raţiu D, Ducasse S, Girba T, Marinescu R (2004) Using history information to improve design flaws detection. In: Proceedings of the European conference on software maintenance and reengineering (CSMR), pp 223–232
  39. Sheil BA (1981) The psychological study of programming. ACM Comput Surv (CSUR) 13(1):101–120
  40. Shneiderman B (1977) Measuring computer program quality and comprehension. Int J Man-Machine Stud 9(4):465–478
  41. Shneiderman B, Mayer R (1975) Towards a cognitive model of programmer behavior, Tech Rep, vol 37. Indiana University, Bloomington
  42. Shull F, Singer J, Sjøberg DI (eds) (2007) Guide to advanced empirical software engineering. Springer, New York
  43. Strauss AL (1987) Qualitative analysis for social scientists. Cambridge University Press
  44. Takang A, Grubb PA, Macredie RD (1996) The effects of comments and identifier names on program comprehensibility: an experiential study. J Program Lang 4(3):143–167
  45. Tan L, Yuan D, Krishna G, Zhou Y (2007) /*iComment: bugs or bad comments?*/, Proceedings of the ACM SIGOPS Symposium on Operating Systems Principles (SOSP) 41(6):145–158
  46. Tan L, Zhou Y, Padioleau Y (2011) Acomment: mining annotations from comments and code to detect interrupt related concurrency bugs. In: Proceedings of the International Conference on Software Engineering (ICSE)
  47. Tan SH, Marinov D, Tan L, Leavens GT (2012) @tComment: Testing Javadoc comments to detect comment-code inconsistencies. In: Proceedings of the international conference on software testing, verification and validation (ICST), pp 260–269
  48. Torchiano M (2002) Documenting pattern use in java programs. In: Proceedings of the international conference on software maintenance (ICSM), pp 230–233
  49. Toutanova K, Manning CD (2000) Enriching the knowledge sources used in a maximum entropy part-of-speech tagger. In: Proceedings of the Joint SIGDAT conference on empirical methods in natural language processing and very large corpora (EMNLP/VLC-2000), association for computational linguistics, pp 63–70
  50. Weissman L (1974a) Psychological complexity of computer programs: an experimental methodology. SIGPLAN Not 9(6):25–36
  51. Weissman LM (1974b) A methodology for studying the psychological complexity of computer programs. PhD thesis
  52. Wohlin C, Runeson P, Höst M, Ohlsson MC, Regnell B, Wesslén A (2000) Experimentation in software engineering - an introduction. Kluwer, Boston
  53. Woodfield SN, Dunsmore HE, Shen VY (1981) The effect of modularization and comments on program comprehension. In: Proceedings of the international conference on software engineering (ICSE), pp 215–223
  54. Yamashita A, Moonen L (2013) Do developers care about code smells? - An exploratory survey. In: Proceedings of the working conference on reverse engineering (WCRE), pp 242–251
  55. Zhong H, Zhang L, Xie T, Mei H (2011) Inferring specifications for resources from natural language api documentation. Autom Softw Eng 18(3–4):227–261

Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Venera Arnaoudova (1)
  • Massimiliano Di Penta (2)
  • Giuliano Antoniol (3)

  1. Soccer Lab., DGIGL, Polytechnique Montréal, Montréal, Canada
  2. Department of Engineering, University of Sannio, Benevento, Italy
  3. Soccer Lab., DGIGL, Polytechnique Montréal, Montréal, Canada
