Abstract
Static analysis tools check code against rules to reveal potential defects. Many studies have evaluated these tools by measuring their ability to detect known defects, but such studies capture the current state of the tools rather than their future potential to find more defects. To investigate this potential, we conducted a study in which we formulated each issue raised by a code reviewer as the violation of a rule and compared these rules to what static analysis tools might potentially check. We first gathered a corpus of 1323 defects found through code review. Through a qualitative analysis process, we identified for each defect a violated rule and the type of Static Analysis Tool (SAT) that might check this rule. We found that SATs might, in principle, be used to detect as many as 76% of code review defects, considerably more than current tools have been demonstrated to detect. Among the types of SATs, Style Checkers and AST Pattern Checkers had the broadest coverage of defects, each with the potential to detect 25% of all code review defects. Our findings suggest that static analysis tools might detect more code review defects by better supporting the creation of project-specific rules. We also investigated the characteristics of code review defects that are not detectable by traditional static analysis techniques and whose detection might require tools that simulate human judgments about code.
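As a concrete illustration of the rule formulation described above, consider a hypothetical review comment such as "create Connection objects through ConnectionFactory rather than with new". This is a project-specific rule that an AST Pattern Checker could, in principle, enforce. The sketch below shows one possible form of such a check; it assumes the JavaParser library, and the names FactoryRuleCheck, Connection, and ConnectionFactory are invented for illustration rather than drawn from the study.

import com.github.javaparser.StaticJavaParser;
import com.github.javaparser.ast.CompilationUnit;
import com.github.javaparser.ast.expr.ObjectCreationExpr;

// Hypothetical AST pattern check for a project-specific rule:
// Connection objects must be created via ConnectionFactory, not 'new'.
public class FactoryRuleCheck {
    public static void main(String[] args) {
        CompilationUnit cu = StaticJavaParser.parse(
            "class A { void m() { Connection c = new Connection(); } }");
        // Find every 'new' expression in the AST and flag those that
        // instantiate Connection directly.
        cu.findAll(ObjectCreationExpr.class).stream()
            .filter(e -> e.getType().getNameAsString().equals("Connection"))
            .forEach(e -> System.out.println("Rule violation at "
                + e.getBegin().orElse(null)
                + ": create Connection via ConnectionFactory, not 'new'"));
    }
}

A production checker would register such a pattern with a SAT's rule engine (for example, as a custom PMD or Error Prone rule) rather than run it as a standalone program.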
Data Availability
Our dataset is publicly available at https://doi.org/10.6084/m9.figshare.14925222
Notes
For example, some tools offer quick fixes for limited types of defects found by FindBugs: https://github.com/kjlubick/fb-contrib-eclipse-quick-fixes
Funding
This work was supported in part by the National Science Foundation under grant NSF CCF-1703734.
Ethics declarations
Conflict of Interest
The authors have no relevant financial or non-financial interests to disclose.
Additional information
Communicated by: Emerson Murphy-Hill
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Mehrpour, S., LaToza, T.D. Can static analysis tools find more defects? Empir Software Eng 28, 5 (2023). https://doi.org/10.1007/s10664-022-10232-4