
Automated Program Repair Using Generative Models for Code Infilling

  • Conference paper
Artificial Intelligence in Education (AIED 2023)

Abstract

In educational settings, automated program repair techniques serve as a feedback mechanism that guides students working on their programming assignments. Recent work has investigated using large language models (LLMs) for program repair. In this area, most of the attention has focused on proprietary systems accessible through APIs. However, limited access to and control over these systems remains a barrier to their adoption and use in education. The present work studies the repair capabilities of open large language models. In particular, we focus on a recent family of generative models which, on top of standard left-to-right program synthesis, can also predict missing spans of code at any position in a program. We experiment with one of these models on four programming datasets and show that we can obtain good repair performance even without additional training.
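To make the idea of repair by infilling concrete, the following is a minimal sketch and not the procedure evaluated in the paper. It assumes a line-level strategy: each line of an incorrect program is removed in turn, an infilling model is asked to predict the missing span from the surrounding code, and the first candidate that passes the tests is returned. The names `masked_variants`, `repair`, `toy_infill`, and `passes_tests` are hypothetical, and `toy_infill` is a hard-coded stand-in for a real infilling model.

```python
from typing import Callable, Iterator, Optional, Tuple


def masked_variants(program: str) -> Iterator[Tuple[str, str]]:
    """Yield one (prefix, suffix) pair per line, with that line removed."""
    lines = program.splitlines(keepends=True)
    for i in range(len(lines)):
        yield "".join(lines[:i]), "".join(lines[i + 1:])


def repair(
    program: str,
    infill: Callable[[str, str], str],
    passes_tests: Callable[[str], bool],
) -> Optional[str]:
    """Return the first infilled candidate that passes the tests, else None."""
    for prefix, suffix in masked_variants(program):
        # The model only sees the code around the removed line.
        candidate = prefix + infill(prefix, suffix) + suffix
        if passes_tests(candidate):
            return candidate
    return None


# Hypothetical student submission with a wrong operator.
BUGGY = "def add(a, b):\n    return a - b\n"


def toy_infill(prefix: str, suffix: str) -> str:
    """Stand-in for an infilling model (assumption): always proposes one line."""
    return "    return a + b\n"


def passes_tests(candidate: str) -> bool:
    """Run the candidate and check a single hypothetical unit test."""
    namespace: dict = {}
    try:
        exec(candidate, namespace)  # only safe for trusted toy code
        return namespace["add"](2, 3) == 5
    except Exception:
        # Syntax errors, missing functions, and runtime errors all count as failures.
        return False


fixed = repair(BUGGY, toy_infill, passes_tests)
```

In this toy run, removing the first line yields a candidate that fails to parse, while removing the second line yields a program that passes the test. In practice, `toy_infill` would be replaced by a call to an actual infilling model, and student code would be executed in a sandbox rather than with `exec`.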

Notes

  1. https://github.com/KoutchemeCharles/aied2023

Author information

Corresponding author

Correspondence to Charles Koutcheme.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Koutcheme, C., Sarsa, S., Leinonen, J., Hellas, A., Denny, P. (2023). Automated Program Repair Using Generative Models for Code Infilling. In: Wang, N., Rebolledo-Mendez, G., Matsuda, N., Santos, O.C., Dimitrova, V. (eds) Artificial Intelligence in Education. AIED 2023. Lecture Notes in Computer Science, vol 13916. Springer, Cham. https://doi.org/10.1007/978-3-031-36272-9_74

  • DOI: https://doi.org/10.1007/978-3-031-36272-9_74

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-36271-2

  • Online ISBN: 978-3-031-36272-9

  • eBook Packages: Computer Science, Computer Science (R0)
