Abstract
Static analysis tools check code against rules to reveal potential defects. Many studies have evaluated these tools by measuring their ability to detect known defects, but such studies capture the current state of the tools rather than their future potential to find more defects. To investigate this potential, we conducted a study in which we formulated each issue raised by a code reviewer as the violation of a rule and compared these rules to what static analysis tools might potentially check. We first gathered a corpus of 1323 defects found through code review. Through a qualitative analysis process, we identified for each defect a violated rule and the type of Static Analysis Tool (SAT) that might check this rule. We found that SATs might, in principle, be used to detect as many as 76% of code review defects, considerably more than current tools have been demonstrated to detect. Among the types of SATs, Style Checkers and AST Pattern Checkers had the broadest coverage of defects, each with the potential to detect 25% of all code review defects. Our findings suggest that static analysis tools might detect more code review defects by better supporting the creation of project-specific rules. We also investigated the characteristics of code review defects that are not detectable by traditional static analysis techniques and whose detection might require tools that simulate human judgments about code.
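As a concrete illustration of the rule formulation described above, consider a hypothetical review comment such as "create Connection objects through ConnectionFactory rather than with new". This is a project-specific rule that an AST Pattern Checker could, in principle, enforce. The sketch below shows one possible form of such a check; it assumes the JavaParser library, and the names FactoryRuleCheck, Connection, and ConnectionFactory are invented for illustration rather than drawn from the study.

import com.github.javaparser.StaticJavaParser;
import com.github.javaparser.ast.CompilationUnit;
import com.github.javaparser.ast.expr.ObjectCreationExpr;

// Hypothetical AST pattern check for a project-specific rule:
// Connection objects must be created via ConnectionFactory, not 'new'.
public class FactoryRuleCheck {
    public static void main(String[] args) {
        CompilationUnit cu = StaticJavaParser.parse(
            "class A { void m() { Connection c = new Connection(); } }");
        // Find every 'new' expression in the AST and flag those that
        // instantiate Connection directly.
        cu.findAll(ObjectCreationExpr.class).stream()
            .filter(e -> e.getType().getNameAsString().equals("Connection"))
            .forEach(e -> System.out.println("Rule violation at "
                + e.getBegin().orElse(null)
                + ": create Connection via ConnectionFactory, not 'new'"));
    }
}

A production checker would register such a pattern with a SAT's rule engine (for example, as a custom PMD or Error Prone rule) rather than run it as a standalone program.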
Data Availability
Our dataset is publicly available at https://doi.org/10.6084/m9.figshare.14925222
Notes
For example, some tools offer quick fixes for limited types of defects found by FindBugs: https://github.com/kjlubick/fb-contrib-eclipse-quick-fixes
Funding
This work was supported in part by the National Science Foundation under grant NSF CCF-1703734.
Ethics declarations
Conflict of Interest
The authors have no relevant financial or non-financial interests to disclose.
Additional information
Communicated by: Emerson Murphy-Hill
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Mehrpour, S., LaToza, T.D. Can static analysis tools find more defects? Empir Software Eng 28, 5 (2023). https://doi.org/10.1007/s10664-022-10232-4