Automated Program Repair Using Generative Models for Code Infilling

Koutcheme, Charles; Sarsa, Sami; Leinonen, Juho; Hellas, Arto; Denny, Paul

doi:10.1007/978-3-031-36272-9_74

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13916)

Included in the following conference series:

International Conference on Artificial Intelligence in Education

Abstract

In educational settings, automated program repair techniques serve as a feedback mechanism to guide students working on their programming assignments. Recent work has investigated using large language models (LLMs) for program repair. In this area, most attention has focused on proprietary systems accessible through APIs. However, limited access to, and control over, these systems remains a barrier to their adoption and use in education. The present work studies the repair capabilities of open large language models. In particular, we focus on a recent family of generative models which, on top of standard left-to-right program synthesis, can also predict missing spans of code at any position in a program. We experiment with one of these models on four programming datasets and show that we can obtain good repair performance even without additional training.
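
To make the idea of repair by infilling concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical `infill(prefix, suffix)` callable that stands in for an infilling model and returns candidate replacements for a masked line; the toy version here returns hard-coded candidates so that the example runs without a model. Each line of the buggy program is masked in turn, and the first candidate that passes the provided test cases is accepted. All names (`mask_line`, `passes_tests`, `repair`, `toy_infill`) are illustrative assumptions.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# A test case is (positional arguments, expected return value).
TestCase = Tuple[tuple, object]


def mask_line(source: str, index: int) -> Tuple[str, str]:
    """Return the code before and after line `index`, dropping that line."""
    lines = source.splitlines(keepends=True)
    return "".join(lines[:index]), "".join(lines[index + 1:])


def passes_tests(source: str, func_name: str, tests: List[TestCase]) -> bool:
    """Run `source` and check `func_name` against every test case.

    Caution: exec() on untrusted student code is unsafe and has no timeout;
    a real system would sandbox this step.
    """
    namespace: dict = {}
    try:
        exec(source, namespace)
        func = namespace[func_name]
        return all(func(*args) == expected for args, expected in tests)
    except Exception:
        # Syntax errors, missing functions, and runtime errors all count as failures.
        return False


def repair(
    buggy: str,
    func_name: str,
    tests: List[TestCase],
    infill: Callable[[str, str], Iterable[str]],
) -> Optional[str]:
    """Mask each line in turn and return the first patched program that passes."""
    for i in range(len(buggy.splitlines())):
        prefix, suffix = mask_line(buggy, i)
        for candidate in infill(prefix, suffix):
            patched = prefix + candidate + suffix
            if passes_tests(patched, func_name, tests):
                return patched
    return None


def toy_infill(prefix: str, suffix: str) -> List[str]:
    """Hypothetical stand-in for a model call; ignores its context."""
    return ["    return a + b\n", "    return a * b\n"]


BUGGY = "def add(a, b):\n    return a - b\n"
TESTS: List[TestCase] = [((1, 2), 3), ((0, 0), 0), ((2, 2), 4)]

if __name__ == "__main__":
    print(repair(BUGGY, "add", TESTS, toy_infill))
```

In this sketch the test suite acts as the correctness oracle, so the quality of an accepted repair is bounded by the quality of the tests. Masking a single line at a time is a simplification chosen for brevity; masking larger spans follows the same pattern.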

Notes

1. https://github.com/KoutchemeCharles/aied2023

Author information

Authors and Affiliations

Aalto University, Espoo, Finland
Charles Koutcheme, Sami Sarsa & Arto Hellas
The University of Auckland, Auckland, New Zealand
Juho Leinonen & Paul Denny

Corresponding author

Correspondence to Charles Koutcheme.

Editor information

Editors and Affiliations

University of Southern California, Los Angeles, CA, USA
Ning Wang
University of British Columbia, Vancouver, BC, Canada
Genaro Rebolledo-Mendez
North Carolina State University, Raleigh, NC, USA
Noboru Matsuda
Despacho 3.01, UNED-Grupo de Investigación aDeNu, Madrid, Spain
Olga C. Santos
University of Leeds, Leeds, UK
Vania Dimitrova

Cite this paper

Koutcheme, C., Sarsa, S., Leinonen, J., Hellas, A., Denny, P. (2023). Automated Program Repair Using Generative Models for Code Infilling. In: Wang, N., Rebolledo-Mendez, G., Matsuda, N., Santos, O.C., Dimitrova, V. (eds) Artificial Intelligence in Education. AIED 2023. Lecture Notes in Computer Science, vol 13916. Springer, Cham. https://doi.org/10.1007/978-3-031-36272-9_74

DOI: https://doi.org/10.1007/978-3-031-36272-9_74
Published: 26 June 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-36271-2
Online ISBN: 978-3-031-36272-9
eBook Packages: Computer Science, Computer Science (R0)
