Empirical Software Engineering

, Volume 23, Issue 5, pp 2980–3006 | Cite as

Improved representation and genetic operators for linear genetic programming for automated program repair

  • Vinicius Paulo L. OliveiraEmail author
  • Eduardo Faria de Souza
  • Claire Le Goues
  • Celso G. Camilo-Junior


Genetic improvement for program repair can fix bugs or otherwise improve software via patch evolution. Consider GenProg, a prototypical technique of this type. GenProg’s crossover and mutation operators manipulate individuals represented as patches. A patch is composed of high-granularity edits that indivisibly comprise an edit operation, a faulty location, and a fix statement used in replacement or insertions. We observe that recombination and mutation of such high-level units limits the technique’s ability to effectively traverse and recombine the repair search spaces. We propose a reformulation of program repair representation, crossover, and mutation operators such that they explicitly traverse the three subspaces that underlie the search problem: the Operator, Fault and Fix Spaces. We provide experimental evidence validating our insight, showing that the operators provide considerable improvement over the baseline repair algorithm in terms of search success rate and efficiency. We also conduct a genotypic distance analysis over the various types of search, providing insight as to the influence of the operators on the program repair search problem.


Automatic software repair Automated program repair Genetic improvement Genetic programming Crossover operator Mutation operator 



Acknowledgements to be added to support a camera-ready.


  1. Ackling T, Alexander B, Grunert I (2011) Evolving patches for software repair. In: Genetic and Evolutionary Computation, pp 1427–1434Google Scholar
  2. Arcuri A (2011) Evolutionary repair of faulty software. Appl Soft Comput 11 (4):3494–3514CrossRefGoogle Scholar
  3. Arcuri A, Briand L (2011) A practical guide for using statistical tests to assess randomized algorithms in software engineering. In: International Conference on Software Engineering, ACM, New York, NY, USA, ICSE ’11, pp 1–10Google Scholar
  4. Arcuri A, Yao X (2008) A novel co-evolutionary approach to automatic software bug fixing. In: 2008 IEEE Congress on Evolutionary Computation. CEC 2008. (IEEE World Congress on Computational Intelligence), IEEE, pp 162–168Google Scholar
  5. Barr ET, Brun Y, Devanbu P, Harman M, Sarro F (2014) The plastic surgery hypothesis. In: Proceedings of the 22nd ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE), pp 306–317Google Scholar
  6. Barr ET, Harman M, Jia Y, Marginean A, Petke J (2015) Automated software transplantation. In: International Symposium on Software Testing and Analysis (ISSTA), pp 257–269Google Scholar
  7. Brameier MF, Banzhaf W (2007) Linear genetic programming, 1st edn. Springer, BerlinzbMATHGoogle Scholar
  8. Britton T, Jeng L, Carver G, Cheak P, Katzenellenbogen T (2013) Reversible debugging software. Tech rep., University of Cambridge, Judge Business SchoolGoogle Scholar
  9. Bruce BR, Petke J, Harman M (2015) Reducing energy consumption using genetic improvement. In: Annual Conference on Genetic and Evolutionary Computation (GECCO), pp 1327–1334Google Scholar
  10. Burlacu B, Affenzeller M, Winkler S, Kommenda M, Kronberger G (2015) Methods for genealogy and building block analysis in genetic programming. In: Computational Intelligence and Efficiency in Engineering Systems. Springer, pp 61–74Google Scholar
  11. Debroy V, Wong WE (2010) Using mutation to automatically suggest fixes for faulty programs. In: International Conference on Software Testing, Verification, and Validation, pp 65–74Google Scholar
  12. DeJong K (1975) An analysis of the behavior of a class of genetic adaptive systems. Ph D Thesis, University of MichiganGoogle Scholar
  13. Fast E, Le Goues C, Forrest S, Weimer W (2010) Designing better fitness functions for automated program repair. In: Pelikan M, Branke J (eds) Genetic and Evolutionary Computation Conference (GECCO). ACM, pp 965–972Google Scholar
  14. Forrest S (1993) Genetic algorithms: principles of natural selection applied to computation. Science 261:872–878CrossRefGoogle Scholar
  15. Forrest S, Nguyen T, Weimer W, Le Goues C (2009) A genetic programming approach to automated software repair. In: Genetic and evolutionary computation conference (GECCO), pp 947–954Google Scholar
  16. Freitas E, Camilo CG Jr, Vincenzi AMR (2016) SCOUT: a multi-objective method to select components in designing unit testing. In: IEEE 27th International Symposium on Software Reliability Engineering (ISSRE), pp 36–46Google Scholar
  17. Goldberg DE (1989) Genetic algorithms in search, optimization and machine learning, 1st edn. Addison-Wesley Longman Publishing Co., Inc, ReadingzbMATHGoogle Scholar
  18. Gupta D, Ghafir S (2012) An overview of methods maintaining diversity in genetic algorithms. Int J Emerg Technol Adv Eng 2(5):56–60Google Scholar
  19. Harik GR, Lobo FG, Goldberg DE (1999) The compact genetic algorithm. IEEE Trans Evol Comput 3(4):287–297CrossRefGoogle Scholar
  20. Harman M, Mansouri SA, Zhang Y (2012) Search-based software engineering: trends, techniques and applications. ACM Comput Surv 45(1):11:1–11:61. CrossRefGoogle Scholar
  21. Holland JH (1992) Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control and artificial intelligence. MIT Press, CambridgeGoogle Scholar
  22. Jones JA, Harrold MJ, Stasko J (2002) Visualization of test information to assist fault localization. In: International conference on software engineering (ICSE), Orlando, FL, USA., pp 467–477
  23. Ke Y, Stolee KT, Le Goues C, Brun Y (2015) Repairing programs with semantic code search. In: 30th IEEE/ACM International Conference on Automated Software Engineering (ASE), pp 295–306Google Scholar
  24. Kim D, Nam J, Song J, Kim S (2013) Automatic patch generation learned from human-written patches. In: ACM/IEEE International Conference on Software Engineering (ICSE), San Francisco, CA, USA, pp 802–811Google Scholar
  25. Kim YH, Moon BR (2004) Distance measures in genetic algorithms. In: Genetic and evolutionary computation conference (GECCO). Springer, pp 400–401Google Scholar
  26. Koza JR (1992) Genetic programming: on the programming of computers by means of natural selection. MIT Press, CambridgezbMATHGoogle Scholar
  27. Koza JR (1994) Genetic programming II: automatic discovery of reusable programs. MIT Press, CambridgezbMATHGoogle Scholar
  28. Langdon WB, Harman M (2015) Grow and graft a better CUDA pknotsRG for RNA pseudoknot free energy calculation. In: Genetic and Evolutionary Computation Conference, GECCO Companion ’15, pp 805–810Google Scholar
  29. Le XBD, Lo D, Le Goues C (2016) History driven program repair. In: IEEE 23Rd International Conference on Software Analysis, Evolution, and Reengineering (SANER), vol 1. IEEE, pp 213–224Google Scholar
  30. Le Goues C, Dewey-Vogt M, Forrest S, Weimer W (2012a) A systematic study of automated program repair: Fixing 55 out of 105 bugs for $8 each. In: International Conference on Software Engineering (ICSE), pp 3–13Google Scholar
  31. Le Goues C, Nguyen T, Forrest S, Weimer W (2012b) Genprog: a generic method for automatic software repair. IEEE Transactions on Software Engineering (TSE) 38:54–72.
  32. Le Goues C, Weimer W, Forrest S (2012) Representations and operators for improving evolutionary software repair. In: Genetic and evolutionary computation conference (GECCO), pp 959–966Google Scholar
  33. Le Goues C, Forrest S, Weimer W (2013) Current challenges in automatic software repair. Softw Qual J 21(3):421–443. CrossRefGoogle Scholar
  34. Le Goues C, Holtschulte N, Smith EK, Brun Y, Devanbu P, Forrest S, Weimer W (2015) The ManyBugs and IntroClass benchmarks for automated repair of C programs. IEEE Transactions on Software Engineering (TSE)Google Scholar
  35. Liblit B, Naik M, Zheng AX, Aiken A, Jordan MI (2005) Scalable statistical bug isolation. SIGPLAN Not 40(6):15–26. CrossRefGoogle Scholar
  36. Long F, Rinard M (2015) Staged program repair with condition synthesis. In: Joint Meeting of the European Software Engineering Conference and ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE), ACM, New York, NY, USA, ESEC/FSE 2015, pp 166–178Google Scholar
  37. Long F, Rinard M (2016) Automatic patch generation by learning correct code. In: Principles of Programming Languages, POPL ’16, pp 298–312Google Scholar
  38. Louis SJ, Rawlins GJE (1992) Syntactic analysis of convergence in genetic algorithms. In: Foundations of Genetic Algorithms 2, Morgan Kaufmann, pp 141–151Google Scholar
  39. Luke S, Spector L (1997) A comparison of crossover and mutation in genetic programming. Genet Program 97:240–248Google Scholar
  40. Machado BN, Camilo CG Jr, Rodrigues CL, Quijano EHD (2016) Sbstframe: a framework to search-based software testing. In: 2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp 004,106–004,111Google Scholar
  41. Martinez M, Monperrus M (2015) Mining software repair models for reasoning on the search space of automated program fixing. Empir Softw Eng 20(1):176–205CrossRefGoogle Scholar
  42. Mattfeld DC (2013) Evolutionary search and the job shop: Investigations on genetic algorithms for production scheduling. Springer, BerlinzbMATHGoogle Scholar
  43. Mechtaev S, Yi J, Roychoudhury A (2016) Angelix: Scalable multiline program patch synthesis via symbolic analysis. In: International Conference on Software Engineering, ICSE ’16, pp 691–701Google Scholar
  44. Moncao ACBL, Camilo CG, Queiroz LT, Rodrigues CL, de Sa Leitao P, Vincenzi AMR (2013) Shrinking a database to perform SQL mutation tests using an evolutionary algorithm. In: IEEE Congress on Evolutionary Computation (CEC), pp 2533–2539Google Scholar
  45. Morrison RW, De Jong KA (2001) Measurement of population diversity. In: International Conference on Artificial Evolution (Evolution Artificielle). Springer, pp 31–41Google Scholar
  46. Nguyen HDT, Qi D, Roychoudhury A, Chandra S (2013) SemFix: program repair via semantic analysis. In: International Conference on Software Engineering (ICSE), pp 772–781Google Scholar
  47. Oliveira AAL, Camilo CG Jr, Vincenzi AMR (2013) A coevolutionary algorithm to automatic test case selection and mutant in mutation testing. In: IEEE Congress on Evolutionary Computation (CEC), pp 829–836Google Scholar
  48. Oliveira VPL, Souza EF, Le Goues C, Camilo CG Jr (2016) Improved crossover operators for genetic programming for program repair. In: International Symposium on Search Based Software Engineering (SSBSE). Springer, pp 112–127Google Scholar
  49. Orlov M, Sipper M (2011) Flight of the FINCH through the Java wilderness. IEEE Trans Evol Comput 15(2):166–182CrossRefGoogle Scholar
  50. Petke J, Harman M, Langdon WB, Weimer W (2014) Using genetic improvement and code transplants to specialise a C++ program to a problem class. In: Genetic Programming, pp 137–149Google Scholar
  51. Pressman RS (2001) Software engineering: a practitioner’s approach, 5th edn. McGraw-Hill Higher Education, Burr RidgezbMATHGoogle Scholar
  52. Qi Y, Mao X, Lei Y, Dai Z, Wang C (2014) The strength of random search on automated program repair. In: International Conference on Software Engineering (ICSE), pp 254–265Google Scholar
  53. Qi Z, Long F, Achour S, Rinard M (2015) An analysis of patch plausibility and correctness for generate-and-validate patch generation systems. In: International Symposium on Software Testing and Analysis (ISSTA), pp 24–36Google Scholar
  54. Rawlins GJE (1991) Foundations of genetic algorithms. Morgan Kaufmann, San FranciscoGoogle Scholar
  55. Rothlauf F (2011) Design of modern heuristics: principles and application. Springer, BerlinCrossRefzbMATHGoogle Scholar
  56. Saha D, Nanda MG, Dhoolia P, Nandivada VK, Sinha V, Chandra S (2011) Fault localization for data-centric programs. In: Joint Meeting of the European Software Engineering Conference and ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE), pp 157–167Google Scholar
  57. Schulte E, Forrest S, Weimer W (2010) Automated program repair through the evolution of assembly code. In: Automated software engineering (ASE), pp 313–316Google Scholar
  58. Schulte E, Dorn J, Harding S, Forrest S, Weimer W (2014) Post-compiler software optimization for reducing energy. SIGARCH Comput Archit News 42(1):639–652Google Scholar
  59. Silva S, Esparcia-Alcázar AI (eds.) (2015) Genetic and evolutionary computation conference companion material proceedings, Workshop on Genetic Improvement, ACMGoogle Scholar
  60. Smith EK, Barr E, Le Goues C, Brun Y (2015) Is the cure worse than the disease? Overfitting in automated program repair. In: Joint Meeting of the European Software Engineering Conference and ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE), pp 532–543Google Scholar
  61. Weimer W, Nguyen T, Le Goues C, Forrest S (2009) Automatically finding patches using genetic programming. In: International Conference on Software Engineering (ICSE), pp 364–374Google Scholar
  62. Weimer W, Fry ZP, Forrest S (2013) Leveraging program equivalence for adaptive program repair: models and first results. In: Automated Software Engineering (ASE), pp 356–366Google Scholar
  63. Wong WE, Gao R, Li Y, Abreu R, Wotawa F (2016) A survey on software fault localization. IEEE Transactions on Software Engineering (TSE) 42(8):707–740CrossRefGoogle Scholar
  64. Zeller A (1999) Yesterday, my program worked. Today, it does not. Why?. In: Joint Meeting of the European Software Engineering Conference and ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE), pp 253–267Google Scholar

Copyright information

© Springer Science+Business Media, LLC 2018

Authors and Affiliations

  • Vinicius Paulo L. Oliveira
    • 1
    Email author
  • Eduardo Faria de Souza
    • 1
  • Claire Le Goues
    • 2
  • Celso G. Camilo-Junior
    • 1
  1. 1.Instituto de InformaticaUniversidade Federal de Goias (UFG)GoiâniaBrazil
  2. 2.School of Computer ScienceCarnegie Mellon University (CMU)PittsburghUSA

Personalised recommendations