Abstract
Static code analysis tools often generate warnings that programmers ignore. Such tools can be made more useful via data-mining algorithms that select the "actionable" warnings, i.e., those warnings that are usually not ignored. In this paper, we look for actionable warnings within 31,058 static code warnings from FindBugs, 5,675 of which were actionable. We find that data-mining algorithms can find actionable warnings with remarkable ease. Specifically, a range of data-mining methods (deep learners, random forests, decision-tree learners, and support vector machines) all achieved very good results (recalls and AUC(TNR, TPR) measures usually over 95%, and false-alarm rates usually under 5%). Given that all these learners succeeded so easily, it is appropriate to ask whether there is something about this task that is inherently easy. We report that while our data sets have up to 58 raw features, those features can be approximated by fewer than two underlying dimensions. For such intrinsically simple data, many different kinds of learners can generate useful models with similar performance. Based on the above, we conclude that learning to recognize actionable static code warnings is easy for a wide range of learning algorithms, since the underlying data is intrinsically simple. If we had to pick one particular learner for this task, we would suggest linear SVMs (since, at least in our sample, that learner ran relatively quickly and achieved the best median performance), and we would not recommend deep learning (since this data is intrinsically very simple).
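The abstract's two central claims — that many off-the-shelf learners perform well on this task, and that the many raw features collapse to very few underlying dimensions — can be illustrated on synthetic data. The sketch below is illustrative only: the dataset is a made-up stand-in for the FindBugs warning data (2,000 samples, 58 observed features generated from 2 latent dimensions), and the PCA variance count is a crude proxy for the dedicated intrinsic-dimension estimators the paper draws on.

```python
# Illustrative sketch only: synthetic stand-in data, not the paper's
# FindBugs warning sets. Shows (a) several off-the-shelf learners doing
# well on data with a simple latent structure and (b) a crude PCA-based
# probe of intrinsic dimensionality.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# 58 observed features generated from just 2 latent dimensions,
# mimicking the "many raw features, few underlying dimensions" setup.
rng = np.random.default_rng(0)
latent = rng.normal(size=(2000, 2))
W = rng.normal(size=(2, 58))
X = latent @ W + 0.05 * rng.normal(size=(2000, 58))
y = (latent[:, 0] + latent[:, 1] > 0).astype(int)

Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)

learners = {
    "linear SVM": LinearSVC(max_iter=5000),
    "random forest": RandomForestClassifier(random_state=0),
    "decision tree": DecisionTreeClassifier(random_state=0),
}
aucs = {}
for name, clf in learners.items():
    clf.fit(Xtr, ytr)
    if hasattr(clf, "decision_function"):
        score = clf.decision_function(Xte)
    else:
        score = clf.predict_proba(Xte)[:, 1]
    aucs[name] = roc_auc_score(yte, score)
    print(f"{name}: AUC = {aucs[name]:.3f}")

# How many principal components explain 90% of the variance?
# (A rough proxy; the paper uses dedicated intrinsic-dimension
# estimators, not PCA, so the exact count here is not comparable.)
pca = PCA().fit(X)
n_dims = int(np.searchsorted(np.cumsum(pca.explained_variance_ratio_), 0.90)) + 1
print("components for 90% of variance:", n_dims)
```

On data like this, every learner scores a high AUC and only a couple of components carry essentially all the variance — the same pattern the paper reports for its real warning data.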
Acknowledgment
This work was partially funded by NSF award #1703487.
Communicated by: Andy Zaidman
Cite this article
Yang, X., Chen, J., Yedida, R. et al. Learning to recognize actionable static code warnings (is intrinsically easy). Empir Software Eng 26, 56 (2021). https://doi.org/10.1007/s10664-021-09948-6