Current challenges in automatic software repair

Abstract

The abundance of defects in existing software systems is unsustainable. Addressing them is a dominant cost of software maintenance, which in turn dominates the life cycle cost of a system. Recent research has made significant progress on the problem of automatic program repair, using techniques such as evolutionary computation, instrumentation and run-time monitoring, and sound synthesis with respect to a specification. This article serves three purposes. First, we review current work on evolutionary computation approaches, focusing on GenProg, which uses genetic programming to evolve a patch to a particular bug. We summarize algorithmic improvements and recent experimental results. Second, we review related work in the rapidly growing subfield of automatic program repair. Finally, we outline important open research challenges that we believe should guide future research in the area.

Notes

  1. GenProg can also effect repairs in assembly code, binary files, and (recently) the LLVM intermediate representation.

  2. In practice, we use several test cases to express program requirements. We describe only one here for brevity.

Author information

Corresponding author

Correspondence to Westley Weimer.

About this article

Cite this article

Le Goues, C., Forrest, S. & Weimer, W. Current challenges in automatic software repair. Software Qual J 21, 421–443 (2013). https://doi.org/10.1007/s11219-013-9208-0

Keywords

  • Automatic program repair
  • Software engineering
  • Evolutionary computation