Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 217)

Abstract

Code review has become an increasingly mainstream method for detecting defects early in a codebase. Practitioners now routinely peer-review code written by co-located team members or by authors in distributed or dispersed teams. In a distributed or dispersed team, a codebase must be reviewed to inspect patches before they are merged. Code review can also serve as a means of validating functional and non-functional requirements. In some situations, reviewers do not invest enough time to comment in an organized manner, which becomes a bottleneck for the other developers who must address the findings or suggestions raised by the peer reviewers. To make the review process more effective and better organized, constructive comments are essential. We extracted 2185 human code review comments from five commercial projects by mining the respective projects' repositories. Six machine learning classifiers were used to train our model. The Stochastic Gradient Descent (SGD) vector machine achieves the highest accuracy among them, 63.89%. By categorizing code review comments, this work will help practitioners build an organized and effective code review culture among programmers and software engineers worldwide.
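
As a rough illustration of the kind of pipeline the abstract describes, the sketch below trains a bag-of-words linear classifier with stochastic gradient descent on a handful of review comments. Everything in it is an assumption made for illustration: the comments, the three category names, the hyperparameters, and the functions `tokenize`, `train_sgd`, and `predict` are invented here and are not the paper's dataset, label set, or implementation. The hinge-loss, one-vs-rest model is only one plausible reading of "SGD vector machine".

```python
import random
from collections import defaultdict

# Hypothetical toy data: comments and category names are invented for illustration.
TRAIN = [
    ("please rename this variable to something descriptive", "naming"),
    ("rename the method, the name is misleading", "naming"),
    ("variable name is unclear", "naming"),
    ("this loop can throw a null pointer exception", "defect"),
    ("possible null dereference here", "defect"),
    ("this will crash on empty input", "defect"),
    ("add a docstring explaining the parameters", "documentation"),
    ("missing comment for this function", "documentation"),
    ("please document the return value", "documentation"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train_sgd(data, epochs=20, lr=0.1, reg=0.001, seed=0):
    """Train one-vs-rest linear classifiers with hinge loss, one example at a time."""
    rng = random.Random(seed)
    labels = sorted({label for _, label in data})
    weights = {label: defaultdict(float) for label in labels}
    data = list(data)
    for _ in range(epochs):
        rng.shuffle(data)
        for text, true_label in data:
            tokens = tokenize(text)
            for label in labels:
                target = 1.0 if label == true_label else -1.0
                w = weights[label]
                score = sum(w[t] for t in tokens)
                # Light L2 shrinkage, applied only to features present in this comment.
                for t in set(tokens):
                    w[t] *= 1.0 - lr * reg
                # Hinge loss: update only when the margin is violated.
                if target * score < 1.0:
                    for t in tokens:
                        w[t] += lr * target
    return weights


def predict(weights, text):
    """Return the category whose linear score is highest for this comment."""
    tokens = tokenize(text)
    return max(weights, key=lambda label: sum(weights[label].get(t, 0.0) for t in tokens))


weights = train_sgd(TRAIN)
print(predict(weights, "rename variable"))  # prints "naming"
```

A real study would use a proper feature extractor, a held-out test split, and far more data; this toy version only shows the shape of the training loop.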

Notes

  1. https://github.com/SshShamma/Categorizing-Code-Review-Comments-Using-Machine-Learning

Corresponding author

Correspondence to Yeasir Arafat.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Arafat, Y., Sumbul, S., Shamma, H. (2022). Categorizing Code Review Comments Using Machine Learning. In: Yang, XS., Sherratt, S., Dey, N., Joshi, A. (eds) Proceedings of Sixth International Congress on Information and Communication Technology. Lecture Notes in Networks and Systems, vol 217. Springer, Singapore. https://doi.org/10.1007/978-981-16-2102-4_18
