Genetic Programming and Evolvable Machines, Volume 15, Issue 3, pp 281–312

Software mutational robustness

  • Eric Schulte
  • Zachary P. Fry
  • Ethan Fast
  • Westley Weimer
  • Stephanie Forrest


Abstract

Neutral landscapes and mutational robustness are believed to be important enablers of evolvability in biology. We apply these concepts to software, defining mutational robustness to be the fraction of random mutations to program code that leave a program’s behavior unchanged. Test cases are used to measure program behavior, and mutation operators are taken from earlier work on genetic programming. Although software is often viewed as brittle, with small changes leading to catastrophic changes in behavior, our results show surprising robustness in the face of random software mutations. The paper describes empirical studies of the mutational robustness of 22 programs, including 14 production software projects, the Siemens benchmarks, and four specially constructed programs. We find that over 30% of random mutations are neutral with respect to their test suite. The results hold across all classes of programs, for mutations at both the source code and assembly instruction levels, across various programming languages, and bear only a limited relation to test suite coverage. We conclude that mutational robustness is an inherent property of software, and that neutral variants (i.e., those that pass the test suite) often fulfill the program’s original purpose or specification.

Based on these results, we conjecture that neutral mutations can be leveraged as a mechanism for generating software diversity. We demonstrate this idea by generating a population of neutral program variants and showing that the variants automatically repair latent bugs. Neutral landscapes also provide a partial explanation for recent results that use evolutionary computation to automatically repair software bugs.
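The measurement procedure the abstract describes — apply a random mutation, run the test suite, and count the mutant as neutral if every test still passes — can be sketched in a few lines. The toy program, test suite, and statement-level operators below are illustrative assumptions for the sketch only; the paper itself mutates real C source and assembly programs using genetic-programming operators.

```python
import random

# A toy straight-line "program": a list of statements. This stand-in
# representation is an assumption of the sketch, not the paper's setup.
PROGRAM = [
    "a = x + y",
    "b = a * 2",
    "unused = 42",       # dead code: mutations touching it are neutral
    "result = b - y",    # result == 2*x + y
]

# The test suite defines "behavior": (x, y) -> expected result.
TESTS = [((1, 2), 4), ((0, 5), 5), ((3, 3), 9)]

def run(program, x, y):
    """Execute the statement list; return `result`, or None on any error."""
    env = {"x": x, "y": y}
    try:
        exec("\n".join(program), {}, env)
    except Exception:
        return None
    return env.get("result")

def passes(program):
    """A mutant is neutral iff it passes every test case."""
    return all(run(program, x, y) == want for (x, y), want in TESTS)

def mutate(program):
    """Apply one GP-style mutation: delete, copy, or swap a statement."""
    p = list(program)
    op = random.choice(["delete", "copy", "swap"])
    i, j = random.randrange(len(p)), random.randrange(len(p))
    if op == "delete":
        del p[i]
    elif op == "copy":
        p.insert(j, p[i])
    else:
        p[i], p[j] = p[j], p[i]
    return p

def mutational_robustness(program, trials=200):
    """Estimate the fraction of random single mutations that are neutral."""
    neutral = sum(passes(mutate(program)) for _ in range(trials))
    return neutral / trials
```

Even in this tiny example, a nontrivial fraction of mutants is neutral (e.g. deleting or duplicating the dead statement, or swapping a statement with itself), echoing the paper's finding that neutrality is common rather than exceptional.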


Keywords: Mutational robustness · Genetic programming · Mutation testing · Proactive diversity · N-version programming · Neutral landscapes



Acknowledgments

The authors would like to thank Lauren Ancel Meyers, William B. Langdon and Peter Schuster for their thoughtful comments. This work was supported by the National Science Foundation (SHF-0905236), Air Force Office of Scientific Research (FA9550-07-1-0532, FA9550-10-1-0277), DARPA (P-1070-113237), DOE (DE-AC02-05CH11231) and the Santa Fe Institute.



Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Eric Schulte (1), corresponding author
  • Zachary P. Fry (2)
  • Ethan Fast (3)
  • Westley Weimer (2)
  • Stephanie Forrest (1, 4)

  1. Department of Computer Science, University of New Mexico, Albuquerque, USA
  2. Department of Computer Science, University of Virginia, Charlottesville, USA
  3. Department of Computer Science, Stanford University, Palo Alto, USA
  4. Santa Fe Institute, Santa Fe, USA
