Applied Biochemistry and Biotechnology, Volume 162, Issue 2, pp 321–328

A Law of Mutation: Power Decay of Small Insertions and Small Deletions Associated with Human Diseases

  • Jia Zhang
  • Li Xiao
  • Yufang Yin
  • Pierre Sirois
  • Hanlin Gao
  • Kai Li


In evolutionary studies, the frequency of indels decays rapidly with increasing length, following a power law. The present study analyzed the length distribution of small insertions and small deletions associated with human diseases and confirmed that, when the mutation datasets are large enough, these small mutations show a decay pattern similar to that of evolutionary indels. This describable decay pattern of somatic mutations may be applicable to evaluating the varied penetrance of different mutations and to association studies linking gene mutations with carcinogenesis.
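
As a rough illustration of what a power-law length distribution looks like in practice, the sketch below fits f(L) = c · L^(−α) to indel counts by ordinary least squares on log-transformed values. The function name `fit_power_law` and the synthetic counts are assumptions made for illustration; they are not the authors' method or data, and a log-log regression is only one simple way to estimate the exponent.

```python
import math

def fit_power_law(lengths, counts):
    """Fit count = c * length**(-alpha) by least squares on log-log values.

    Hypothetical helper for illustration; returns (alpha, c).
    """
    # Keep only pairs where both logarithms are defined.
    pts = [(math.log(l), math.log(n))
           for l, n in zip(lengths, counts) if l > 0 and n > 0]
    if len(pts) < 2:
        raise ValueError("need at least two positive (length, count) pairs")
    k = len(pts)
    mean_x = sum(x for x, _ in pts) / k
    mean_y = sum(y for _, y in pts) / k
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    if sxx == 0:
        raise ValueError("lengths must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx                 # slope of the log-log line
    alpha = -slope                    # decay exponent
    c = math.exp(mean_y - slope * mean_x)
    return alpha, c

# Synthetic counts that follow an exact power law (assumed values, not real data).
lengths = list(range(1, 21))
counts = [1000.0 * l ** -1.5 for l in lengths]
alpha, c = fit_power_law(lengths, counts)
print(f"alpha = {alpha:.2f}, c = {c:.0f}")
```

On real mutation data the counts at larger lengths are sparse and noisy, which may be one reason the decay pattern only becomes apparent when the dataset is large enough; in that setting a maximum-likelihood estimate is often preferred over a straight-line fit.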


Keywords: Indels; Cancer; Length distribution



The authors are indebted to Dr. Gabriel Gutiérrez of the Departamento de Genética, Universidad de Sevilla, Sevilla, Spain, for proofreading this manuscript. This work was partially supported by the Department of Personnel of Jiangsu Province “Liu Da Ren Cai Gao Feng” grant (07-B-033) and the National Natural Science Foundation of China (No. 30970877).



Copyright information

© Humana Press 2009

Authors and Affiliations

  • Jia Zhang (1)
  • Li Xiao (1)
  • Yufang Yin (2)
  • Pierre Sirois (3)
  • Hanlin Gao (4)
  • Kai Li (5), corresponding author

  1. Clinical Molecular Diagnostic Center, the Second Affiliated Hospital of Soochow University, Suzhou, China
  2. SNP Institute, University of South China, Hengyang, China
  3. Department of Pharmacology, University of Sherbrooke, Sherbrooke, Canada
  4. Beckman Research Institute, City of Hope, Duarte, USA
  5. Department of Pharmacology, Medical College, Soochow University, Suzhou, China
