Abstract
An Automatic Program Repair (APR) technique is an implementation of a repair model that fixes a given bug by modifying program behavior. Recently, repair models that collect source code and code changes from software history and use these resources for patch generation have become more popular. The collected resources are used to expand the patch search space and to increase the probability that correct patches are included in it. However, it has also been shown that navigating such an expanded patch search space is difficult because correct patches are sparse in the space. In this study, we evaluate the effectiveness of the Context-based Change Application (CCA) technique on change selection, fix location selection and change concretization, which are the key aspects of navigating the patch search space. CCA collects abstract subtree changes and their AST contexts, and applies them to fix locations only if their contexts match. The CCA repair model can address both the search space expansion and the navigation issues: it expands the search space with collected changes while using contexts to narrow down the areas of the space that must be searched. Since CCA applies a change to a fix location only if their contexts match, it only needs to consider the changes that share a context with each fix location. Also, if no change has the same context as a fix location, that fix location can be ignored, since this means that past patches did not modify such locations. In addition, CCA uses fine-grained changes that preserve the structure of the changed code but normalize user-defined names. Hence, change concretization can be done simply by replacing normalized names with concrete names available in the buggy code. We evaluated CCA’s effectiveness with over 54K unique collected changes (221K in total) from about 5K human-written patches.
Results show that, using contexts, CCA correctly found 90.1% of the changes required for test set patches, while fewer than 5% of the changes were found without contexts. We discovered that collecting more changes is only helpful if it is supported by contexts for effective search space navigation. In addition, the CCA repair model found 44–70% of the actual fix locations of Defects4j patches more quickly than spectrum-based fault localization (SBFL) techniques alone. We also found that about 48% of the patches can be fully concretized using concrete names from the buggy code.
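The two mechanisms described above, context-matched change lookup and name concretization, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the context representation (a tuple of AST node type names), the placeholder scheme (`var0`, `method0`, `type0`), and the names `Change`, `ChangePool`, and `concretize` are all assumptions made for illustration.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    # Hypothetical context: a tuple of AST node type names around the change.
    context: tuple
    operation: str  # e.g. "insert", "delete", "update"
    template: str   # changed code with user-defined names normalized


class ChangePool:
    """Collected changes, indexed by context for direct lookup."""

    def __init__(self):
        self._by_context = defaultdict(list)

    def add(self, change):
        self._by_context[change.context].append(change)

    def candidates(self, context):
        # Only changes with an identical context are considered.
        # An empty result means the fix location can be skipped.
        return list(self._by_context.get(context, []))


# Assumed placeholder naming scheme for normalized names.
PLACEHOLDER = re.compile(r"\b(?:var|method|type)\d+\b")


def concretize(template, bindings):
    """Replace normalized names with concrete names from the buggy code.

    Returns None when some placeholder has no concrete name available,
    i.e. the change cannot be fully concretized.
    """
    missing = {p for p in PLACEHOLDER.findall(template) if p not in bindings}
    if missing:
        return None
    return PLACEHOLDER.sub(lambda m: bindings[m.group(0)], template)


pool = ChangePool()
pool.add(Change(("IfStatement", "Block"), "insert",
                "if (var0 == null) return var1;"))

for change in pool.candidates(("IfStatement", "Block")):
    print(concretize(change.template, {"var0": "name", "var1": "fallback"}))
```

Indexing changes by context means that each fix location only inspects the changes sharing its context, rather than the whole pool, which is the narrowing effect the abstract attributes to contexts.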
Notes
Apache’s JIRA issue tracker (https://issues.apache.org/jira).
Additional information
Communicated by: Martin Monperrus
About this article
Cite this article
Kim, J., Kim, J., Lee, E. et al. The effectiveness of context-based change application on automatic program repair. Empir Software Eng 25, 719–754 (2020). https://doi.org/10.1007/s10664-019-09770-1
DOI: https://doi.org/10.1007/s10664-019-09770-1