Learning to recognize actionable static code warnings (is intrinsically easy)

Empirical Software Engineering 26, 56 (2021)

Abstract

Static code warning tools often generate warnings that programmers ignore. Such tools can be made more useful via data mining algorithms that select the “actionable” warnings; i.e., the warnings that are usually not ignored. In this paper, we look for actionable warnings in a sample of 31,058 static code warnings from FindBugs, 5,675 of which are actionable. We find that data mining algorithms can find actionable warnings with remarkable ease. Specifically, a range of data mining methods (deep learners, random forests, decision tree learners, and support vector machines) all achieved very good results (recalls and AUC(TNR, TPR) measures usually over 95% and false alarms usually under 5%). Given that all these learners succeeded so easily, it is appropriate to ask if there is something about this task that is inherently easy. We report that while our data sets have up to 58 raw features, those features can be approximated by less than two underlying dimensions. For such intrinsically simple data, many different kinds of learners can generate useful models with similar performance. We therefore conclude that learning to recognize actionable static code warnings is easy, for a wide range of learning algorithms, since the underlying data is intrinsically simple. If we had to pick one particular learner for this task, we would suggest linear SVMs (since, at least in our sample, that learner ran relatively quickly and achieved the best median performance) and we would not recommend deep learning (since this data is intrinsically very simple).
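
The abstract's central claim, that 58 raw features collapse to less than two underlying dimensions, can be checked with a fractal-style intrinsic-dimensionality estimate. The sketch below uses the standard correlation-dimension estimator (the slope of log C(r) against log r); it is an illustration under our own assumptions on synthetic data, not the authors' exact procedure, which is available in their replication package (Note 1):

```python
# A minimal sketch of a correlation-dimension estimate of intrinsic
# dimensionality. This is one standard estimator, shown here on synthetic
# data; the paper's exact procedure lives in its replication package.
import numpy as np
from scipy.spatial.distance import pdist

def correlation_dimension(X, n_radii=20):
    """Estimate the intrinsic dimension of the rows of X."""
    d = pdist(X)                                   # all pairwise Euclidean distances
    d = d[d > 0]
    radii = np.logspace(np.log10(d.min()), np.log10(d.max()), n_radii)
    C = np.array([np.mean(d < r) for r in radii])  # C(r): fraction of pairs within r
    ok = (C > 0) & (C < 1)                         # keep the linear part of the log-log curve
    slope, _ = np.polyfit(np.log(radii[ok]), np.log(C[ok]), 1)
    return slope                                   # intrinsic dimension ~= this slope

# Synthetic check: 58 raw features that secretly vary along one dimension.
rng = np.random.default_rng(0)
t = rng.uniform(size=(1000, 1))
X = t @ rng.normal(size=(1, 58)) + 0.001 * rng.normal(size=(1000, 58))
print(correlation_dimension(X))                    # a value near 1, far below 58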
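
And since the paper's bottom line is a recommendation for linear SVMs, here is a minimal baseline of that kind in scikit-learn, reporting recall, false alarm, and AUC. The synthetic data (58 features driven by two informative dimensions, with roughly the paper's 18% actionable rate) and all settings are our own assumptions, not the authors' experimental setup:

```python
# A minimal linear-SVM baseline for actionable-warning classification.
# Synthetic data stands in for the FindBugs warning features; nothing
# here reproduces the paper's setup.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC
from sklearn.metrics import confusion_matrix, roc_auc_score

# 58 raw features, only 2 informative (the rest are linear combinations),
# with ~18% positives, echoing 5,675 actionable out of 31,058 warnings.
X, y = make_classification(n_samples=31058, n_features=58, n_informative=2,
                           n_redundant=56, weights=[0.82], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          stratify=y, random_state=0)

model = make_pipeline(StandardScaler(), LinearSVC())   # scale, then linear SVM
model.fit(X_tr, y_tr)

tn, fp, fn, tp = confusion_matrix(y_te, model.predict(X_te)).ravel()
recall = tp / (tp + fn)        # true positive rate
false_alarm = fp / (fp + tn)   # false positive rate
# Plotting TPR against TNR just mirrors the ROC's x-axis (TNR = 1 - FPR),
# so AUC(TNR, TPR) equals the ordinary ROC AUC over decision scores.
auc = roc_auc_score(y_te, model.decision_function(X_te))
print(f"recall={recall:.2f}  false_alarm={false_alarm:.2f}  auc={auc:.2f}")
```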

Notes

  1. https://github.com/XueqiYang/intrinsic_dimension

  2. https://pmd.github.io/latest/index.html

  3. https://checkstyle.sourceforge.io/

  4. http://findbugs.sourceforge.net

  5. https://github.com/XueqiYang/intrinsic_dimension

Acknowledgment

This work was partially funded by NSF award #1703487.

Author information

Correspondence to Tim Menzies.

Additional information

Communicated by: Andy Zaidman

Cite this article

Yang, X., Chen, J., Yedida, R. et al. Learning to recognize actionable static code warnings (is intrinsically easy). Empir Software Eng 26, 56 (2021). https://doi.org/10.1007/s10664-021-09948-6
