Abstract
Genetic improvement for program repair can fix bugs or otherwise improve software via patch evolution. Consider GenProg, a prototypical technique of this type. GenProg’s crossover and mutation operators manipulate individuals represented as patches. A patch is composed of high-granularity edits, each of which indivisibly combines an edit operation, a faulty location, and a fix statement used in replacements or insertions. We observe that recombining and mutating such high-level units limits the technique’s ability to traverse the repair search space effectively. We propose a reformulation of the program repair representation, crossover, and mutation operators such that they explicitly traverse the three subspaces that underlie the search problem: the Operator, Fault, and Fix Spaces. We provide experimental evidence validating our insight, showing that the new operators considerably improve on the baseline repair algorithm in terms of search success rate and efficiency. We also conduct a genotypic distance analysis over the various types of search, providing insight into the influence of the operators on the program repair search problem.
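To make the contrast concrete, the following is a minimal Python sketch, not the implementation evaluated in the article. It assumes a patch is a list of edits with three components and compares a crossover that swaps whole edits with one that recombines a single component. The names `Edit`, `one_point_whole_edits`, and `one_point_in_subspace` are hypothetical, as is the use of integer statement ids.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Edit:
    """One edit of a patch (hypothetical encoding for illustration)."""
    op: str     # Operator Space: "delete", "insert" or "replace"
    fault: int  # Fault Space: id of the statement to modify
    fix: int    # Fix Space: id of the donor statement (ignored by "delete")


Patch = List[Edit]


def one_point_whole_edits(a: Patch, b: Patch, cut: int) -> Tuple[Patch, Patch]:
    """Baseline-style crossover: edits are exchanged as indivisible units."""
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def one_point_in_subspace(
    a: Patch, b: Patch, field: str, cut: int
) -> Tuple[Patch, Patch]:
    """Exchange only one component ("op", "fault" or "fix") past the cut.

    Positions beyond the shorter parent are left untouched.
    """
    n = min(len(a), len(b))

    def mix(x: Patch, y: Patch) -> Patch:
        child = list(x)
        for i in range(cut, n):
            # Copy a single component from the other parent's edit.
            child[i] = replace(x[i], **{field: getattr(y[i], field)})
        return child

    return mix(a, b), mix(b, a)


if __name__ == "__main__":
    p1 = [Edit("delete", 1, 0), Edit("insert", 2, 7)]
    p2 = [Edit("replace", 3, 8), Edit("delete", 4, 0)]
    print(one_point_whole_edits(p1, p2, 1))
    print(one_point_in_subspace(p1, p2, "fix", 1))
```

In the first operator, a child can only contain edits that already exist in a parent. In the second, a child can pair one parent's operation and fault location with the other parent's fix statement, which is the kind of recombination the high-granularity representation rules out.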
Notes
Note that GenProg manipulates C programs at the statement-level, but the formulation generalizes to arbitrary languages and granularity levels.
Both available from the GenProg project, http://genprog.cs.virginia.edu/
IntroClass and ManyBugs are available at http://repairbenchmarks.cs.umass.edu/
Acknowledgements
Acknowledgements to be added to support a camera-ready.
Additional information
Communicated by: Martin Monperrus and Westley Weimer
About this article
Cite this article
Oliveira, V.P.L., Souza, E.F.d., Goues, C.L. et al. Improved representation and genetic operators for linear genetic programming for automated program repair. Empir Software Eng 23, 2980–3006 (2018). https://doi.org/10.1007/s10664-017-9562-9
DOI: https://doi.org/10.1007/s10664-017-9562-9