
Empirical Software Engineering, Volume 23, Issue 5, pp 2901–2947

Do automated program repair techniques repair hard and important bugs?

  • Manish Motwani
  • Sandhya Sankaranarayanan
  • René Just
  • Yuriy Brun
Article

Abstract

Existing evaluations of automated repair techniques focus on the fraction of defects for which a technique can produce a patch, the time needed to produce patches, and how well patches generalize to the intended specification. However, these evaluations have not examined the applicability of repair techniques or the characteristics of the defects that these techniques can repair. Questions such as “Can automated repair techniques repair defects that are hard for developers to repair?” and “Are automated repair techniques less likely to repair defects that involve loops?” have not yet been answered. To address such questions, we annotate two large benchmarks totaling 409 C and Java defects in real-world software, ranging from 22K to 2.8M lines of code, with measures of each defect’s importance, the complexity of the developer-written patch, and the quality of the test suite. We then analyze the relationships between these measures and the ability of seven automated repair techniques (AE, GenProg, Kali, Nopol, Prophet, SPR, and TrpAutoRepair) to produce patches for the defects. We find that automated repair techniques are less likely to produce patches for defects that required developers to write a lot of code or edit many files, or that have many tests relevant to the defect. Java techniques are more likely to produce patches for high-priority defects. Neither the time it took developers to fix a defect nor the test suite’s coverage correlates with the automated repair techniques’ ability to produce patches. Finally, automated repair techniques are less capable of fixing defects that require developers to add loops and new function calls, or to change method signatures. These findings identify strengths and shortcomings of the state of the art in automated program repair along new dimensions. The presented methodology can drive research toward improving the applicability of automated repair techniques to hard and important bugs.
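To make the kind of analysis described above concrete, the following is a minimal sketch of how one might quantify the association between a defect measure and whether a technique produced a patch. The abstract does not name the statistic used, so the choice of Cliff's delta here is an assumption, and the function name and the data are hypothetical illustrations rather than values from the study.

```python
from typing import Sequence


def cliffs_delta(patched: Sequence[float], unpatched: Sequence[float]) -> float:
    """Cliff's delta effect size, in [-1, 1].

    Negative values mean the measure tends to be smaller for defects
    that were patched than for defects that were not.
    (Assumed statistic; the abstract does not specify one.)
    """
    if not patched or not unpatched:
        raise ValueError("both groups must be non-empty")
    # Compare every patched defect against every unpatched defect.
    greater = sum(1 for p in patched for u in unpatched if p > u)
    less = sum(1 for p in patched for u in unpatched if p < u)
    return (greater - less) / (len(patched) * len(unpatched))


# Hypothetical data: lines changed by the developer-written patch,
# split by whether a repair technique produced a patch for the defect.
lines_changed_patched = [1, 2, 2, 3, 5]
lines_changed_unpatched = [4, 8, 15, 20, 40]

delta = cliffs_delta(lines_changed_patched, lines_changed_unpatched)
print(f"Cliff's delta: {delta:.2f}")
```

With this made-up data the result is strongly negative, which would correspond to the abstract's finding that defects requiring developers to write more code are less likely to be patched automatically.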

Keywords

Automated program repair, Repairability

Notes

Acknowledgements

This work is supported by the National Science Foundation under grants CCF-1453474 and CCF-1564162.

Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. College of Information and Computer Science, University of Massachusetts, Amherst, USA
