A study of common bug fix patterns in Rust

Robati Shirzad, Mohammad; Lam, Patrick

doi:10.1007/s10664-023-10437-1

Published: 12 February 2024

Empirical Software Engineering, volume 29, article number 44 (2024)

Abstract

Rust is a relatively new programming language that allows programmers to write programs with low-level control over resources while still ensuring high-level safety guarantees (for programs written in safe Rust). Rust’s ownership framework enables programs to meet these two seemingly contradictory goals. The Rust compiler’s Borrow-Checker component enforces the ownership framework requirements that ensure Rust’s safety guarantees. Rust is popular: as of 2022, it has ranked first, for the seventh consecutive year, in Stack Overflow’s annual Developer Survey as the most-loved programming language. The number of Rust developers is growing as the need for faster and safer software increases. Yet, to our knowledge, no research has sought to identify the most pervasive bug fix patterns within Rust programs. In this project, we introduce Ruxanne, a tool for analyzing and extracting fix patterns in Rust. Ruxanne implements a novel embedding of Rust code into fixed-sized vectors. Using Ruxanne, we mined the 18 most-starred Rust projects on GitHub to discover the most common bug fix patterns committed to their repositories. We analyzed 87,726 code changes drawn from 57,214 commits across these 18 projects. After clustering the code changes and conducting a manual analysis, we identified 20 groups of cross-project bug fix patterns, which we categorize as (1) general patterns and (2) borrow-checker-related patterns. Among the general patterns, the most frequently observed is the addition or removal of struct fields. Among the borrow-checker-related patterns, the most common is the removal of a clone() call. We describe all detected patterns and their implications for automated program repair.
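To make the most common borrow-checker-related pattern concrete, the following is a minimal sketch of what removing a clone() call might look like. It is a hypothetical illustration, not code taken from the paper or from the mined projects; the names `Config`, `label_len_owned`, `label_len_borrowed`, and `describe` are assumptions made up for this example.

```rust
// Hypothetical illustration; none of these names come from the paper or its dataset.

/// A small record type. Adding or removing a field here is the kind of
/// struct-field change the abstract reports as the most frequent general pattern.
#[derive(Debug, Clone, PartialEq)]
pub struct Config {
    pub name: String,
    pub retries: u32,
}

/// "Before" version: takes ownership of the `String`. A caller that still needs
/// its value afterwards has to write `label_len_owned(cfg.name.clone())`.
pub fn label_len_owned(name: String) -> usize {
    name.len()
}

/// "After" version: borrows the string instead, so the caller's `clone()`
/// becomes unnecessary and can be removed.
pub fn label_len_borrowed(name: &str) -> usize {
    name.len()
}

/// Uses the borrowing version; `&cfg.name` coerces from `&String` to `&str`,
/// so no allocation or copy is made.
pub fn describe(cfg: &Config) -> String {
    format!(
        "{} ({} chars, {} retries)",
        cfg.name,
        label_len_borrowed(&cfg.name),
        cfg.retries
    )
}
```

In this sketch the fix changes the callee's signature from `String` to `&str`, which is only one of several ways a clone() call can become removable; the paper's own categorization of such fixes may differ.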

Data Availability Statement

The datasets generated and analyzed during the current study are available in the Zenodo repository, https://zenodo.org/record/8052979.

Author information

Authors and Affiliations

University of Waterloo, Address: 200 University Ave W, Waterloo, ON, N2L 3G1, Canada
Mohammad Robati Shirzad & Patrick Lam

Corresponding author

Correspondence to Mohammad Robati Shirzad.

Ethics declarations

Competing Interests

We have no competing interests. This work is funded by a Discovery Grant from Canada’s Natural Sciences and Engineering Research Council.

Additional information

Communicated by: Martin Monperrus.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Robati Shirzad, M., Lam, P. A study of common bug fix patterns in Rust. Empir Software Eng 29, 44 (2024). https://doi.org/10.1007/s10664-023-10437-1

Accepted: 10 December 2023
Published: 12 February 2024
DOI: https://doi.org/10.1007/s10664-023-10437-1
