FixMiner: Mining relevant fix patterns for automated program repair

Empirical Software Engineering

Abstract

Patching is a common activity in software development. It is generally performed on a source code base to address bugs or add new functionality. In this context, given the recurrence of bugs across projects, the associated similar patches can be leveraged to extract generic fix actions. While the literature includes various approaches leveraging similarity among patches to guide program repair, these approaches often do not yield fix patterns that are tractable and reusable as actionable input to automated program repair (APR) systems. In this paper, we propose a systematic and automated approach to mining relevant and actionable fix patterns based on an iterative clustering strategy applied to atomic changes within patches. The goal of FixMiner is thus to infer separate and reusable fix patterns that can be leveraged in other patch generation systems. Our technique, FixMiner, leverages the Rich Edit Script, a specialized tree structure of edit scripts that captures the AST-level context of code changes. FixMiner uses a different tree representation of Rich Edit Scripts for each round of clustering to identify similar changes: abstract syntax trees, edit actions trees, and code context trees. We have evaluated FixMiner on thousands of software patches collected from open source projects. Preliminary results show that we are able to mine accurate patterns, efficiently exploiting change information in Rich Edit Scripts. We further integrated the mined patterns into an automated program repair prototype, PARFixMiner, with which we are able to correctly fix 26 bugs of the Defects4J benchmark. Beyond this quantitative performance, we show that the mined fix patterns are sufficiently relevant to produce patches with a high probability of correctness: 81% of PARFixMiner’s generated plausible patches are correct.
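The iterative clustering strategy described above can be illustrated with a minimal sketch. It is not FixMiner's implementation: the `RichEditScript` class, its three tuple-encoded views, the exact-equality comparison, the `min_size` threshold, and the order of the rounds are all assumptions made for illustration. Each round refines the clusters of the previous round using a different tree view, and groups that fall below the threshold are discarded.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Hashable, List, Sequence


@dataclass(frozen=True)
class RichEditScript:
    """Hypothetical stand-in: three hashable views of one atomic change."""
    patch_id: str
    ast_view: tuple       # nested AST node types (illustrative encoding)
    action_view: tuple    # nested edit actions (illustrative encoding)
    context_view: tuple   # abstracted code tokens (illustrative encoding)


def split_by(cluster: Sequence[RichEditScript],
             key: Callable[[RichEditScript], Hashable]) -> List[List[RichEditScript]]:
    """Group scripts whose selected view is identical (assumed criterion)."""
    groups = defaultdict(list)
    for script in cluster:
        groups[key(script)].append(script)
    return list(groups.values())


def iterative_clustering(scripts: Sequence[RichEditScript],
                         keys: Sequence[Callable[[RichEditScript], Hashable]],
                         min_size: int = 2) -> List[List[RichEditScript]]:
    """Refine clusters round by round, one tree view per round."""
    clusters = [list(scripts)]
    for key in keys:
        refined = []
        for cluster in clusters:
            # Keep only groups that recur at least min_size times.
            refined.extend(g for g in split_by(cluster, key) if len(g) >= min_size)
        clusters = refined
    return clusters


# Toy data: p1 and p2 agree on every view; p3 differs in its actions;
# p4 differs already at the AST level.
if_ast = ("IfStatement", ("InfixExpression",))
scripts = [
    RichEditScript("p1", if_ast, ("UPD", ("INS",)), ("if", "!=", "null")),
    RichEditScript("p2", if_ast, ("UPD", ("INS",)), ("if", "!=", "null")),
    RichEditScript("p3", if_ast, ("UPD", ("DEL",)), ("if", "!=", "null")),
    RichEditScript("p4", ("ReturnStatement",), ("UPD",), ("return",)),
]
rounds = [lambda s: s.ast_view, lambda s: s.action_view, lambda s: s.context_view]
patterns = iterative_clustering(scripts, rounds)
print([[s.patch_id for s in cluster] for cluster in patterns])
```

In this toy run only the changes that agree on all three views survive as a candidate pattern. A real system would likely compare trees with a matching algorithm rather than strict equality.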

Notes

  1. The initial version of this paper was written concurrently with SimFix and CapGen.

  2. https://github.com/gwtproject/gwt/issues/676

  3. The order of AST subtrees follows the order of hunks of the GNU diff format.

  4. https://commons.apache.org/proper/commons-text/

  5. In this experiment, we excluded 34 patches from the Defects4J dataset that affect more than one file.

  6. Semantic Patch Language

  7. We used GZoltar version 0.1.1.

  8. Version 1.2.0 - https://github.com/rjust/defects4j/releases/tag/v1.2.0

  9. https://github.com/xuanbachle/bugfixes/blob/master/fixed.txt


Acknowledgements

This work is supported by the Fonds National de la Recherche (FNR), Luxembourg, through RECOMMEND 15/IS/10449467 and FIXPATTERN C15/IS/9964569.

Author information

Corresponding author

Correspondence to Anil Koyuncu.

Additional information

Communicated by: Paolo Tonella

About this article

Cite this article

Koyuncu, A., Liu, K., Bissyandé, T.F. et al. FixMiner: Mining relevant fix patterns for automated program repair. Empir Software Eng 25, 1980–2024 (2020). https://doi.org/10.1007/s10664-019-09780-z

  • DOI: https://doi.org/10.1007/s10664-019-09780-z
