Inferring Weak Adaptations and Selection Biases in Proteins from Composition and Substitution Matrices

  • Steinar Thorvaldsen
  • Elinor Ytterstad
  • Tor Flå
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4463)


There is a growing need for statistical methods in the analysis of the increasing volume of biological sequence data. We present statistical methods that are useful when a protein alignment can be divided into two groups based on known features or traits. The approach is based on stratification of the data, and to demonstrate its applicability we analyse genomic data from several proteobacterial orders. A dataset of 25 periplasmic/extracellular bacterial endonuclease I proteins was compiled to identify genotypic characteristics that separate the cold-adapted proteins from orthologous sequences with a higher optimal growth temperature. Our results reveal that the cold-adapted protein has a significantly more positively charged exterior. Life in a cold climate seems to be enabled by many minor structural modifications rather than by any single amino acid substitution. Redistribution of charge might be one of the most important signatures of cold adaptation.


Keywords: Stratified data, Two-way ANOVA, Mantel-Haenszel test, cold adaptation
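
To make the stratified approach named in the abstract more concrete, the following is a minimal sketch of a Mantel-Haenszel chi-square test over a set of 2x2 tables, one table per stratum. It is not the authors' implementation. The function name `mantel_haenszel_chi2`, the choice of strata, the table layout (rows: cold-adapted vs. other sequences; columns: a binary residue property present vs. absent) and the toy counts are all assumptions made for illustration, and none of the numbers come from the paper.

```python
import math

def mantel_haenszel_chi2(tables, continuity=True):
    """Mantel-Haenszel chi-square (1 degree of freedom) for stratified 2x2 tables.

    Each table is ((a, b), (c, d)), one table per stratum.
    Hypothetical layout: rows = group, columns = property present / absent.
    Returns (chi2, p_value, pooled_odds_ratio).
    """
    sum_a = sum_expected = sum_var = 0.0
    or_num = or_den = 0.0
    for (a, b), (c, d) in tables:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum this small carries no information
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        sum_a += a
        sum_expected += row1 * col1 / n                        # E[a] under no association
        sum_var += row1 * row2 * col1 * col2 / (n * n * (n - 1))  # hypergeometric variance
        or_num += a * d / n
        or_den += b * c / n
    if sum_var == 0:
        raise ValueError("no informative strata: every table has a zero margin")
    diff = abs(sum_a - sum_expected)
    if continuity:
        diff = max(diff - 0.5, 0.0)  # continuity correction
    chi2 = diff * diff / sum_var
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = or_num / or_den if or_den > 0 else float("inf")
    return chi2, p_value, odds_ratio

# Toy counts for two hypothetical strata (illustrative only, not from the paper).
toy_tables = [((8, 2), (3, 7)), ((6, 4), (2, 8))]
chi2, p_value, odds_ratio = mantel_haenszel_chi2(toy_tables)
print(f"chi2={chi2:.3f}  p={p_value:.4f}  pooled OR={odds_ratio:.2f}")
```

Pooling evidence across strata in this way tests for a consistent association between group and property while keeping differences between strata from being mistaken for a group effect. When many such tests are run, for example one per amino acid or per substitution type, the resulting p-values would normally be adjusted for multiple testing.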





Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Steinar Thorvaldsen (1)
  • Elinor Ytterstad (2)
  • Tor Flå (2)
  1. Tromsø University College, AFL-Informatics, 9293 Tromsø, Norway
  2. Dept of Mathematics and Statistics, University of Tromsø, 9037 Tromsø, Norway
