Abstract
Traditional bioinformatics methods have systematically compared halophilic proteins with their non-halophilic homologues to investigate the features related to hypersaline adaptation. Quantitative models that explain the sequence-characteristic relationship of halophilic proteins might therefore shed new light on haloadaptation and help to design new biocatalysts adapted to high salt concentrations. Five machine learning algorithms, three linear and two non-linear, were used to discriminate halophilic proteins from their non-halophilic counterparts, and the prediction accuracy was encouraging. The best prediction reliability for halophilic proteins was achieved by the artificial neural network and the support vector machine and reached 80 %; for non-halophilic proteins, it was achieved by linear regression and reached 100 %. In addition, the linear models captured some clues to protein halo-stability. Among them, the lower frequency of Ser in halophilic proteins has not been reported before.
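As an illustration of the kind of sequence-derived feature such models can work with, the following minimal sketch computes the amino acid composition of a protein sequence and compares the Ser frequency of two sequences. It assumes that plain residue frequencies serve as the input features; the abstract does not spell out the exact descriptors, and the function name `composition` and both toy sequences are hypothetical, invented only for this example.

```python
from collections import Counter

# The 20 standard residues in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence):
    """Return the fraction of each standard amino acid in a protein sequence."""
    seq = sequence.upper()
    # Count only standard residues; ambiguous codes such as X are ignored.
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


# Toy sequences invented for illustration only; they are not real proteins.
toy_a = "DDEEDAVSDE"
toy_b = "SSKKSLSIKS"

comp_a = composition(toy_a)
comp_b = composition(toy_b)

# Difference in Ser frequency between the two toy sequences.
ser_diff = comp_a["S"] - comp_b["S"]
print(f"Ser frequency: toy_a={comp_a['S']:.2f}, toy_b={comp_b['S']:.2f}")
```

Each sequence then becomes a 20-dimensional vector that could be passed to a linear or non-linear classifier; that step is left out here because it depends on the chosen library and on data not given in this section.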
Abbreviations
- PCA: Principal component analysis
- LR: Linear regression
- PLSR: Partial least-square regression
- ANN: Artificial neural networks
- SVM: Support vector machine
- LVs: Latent variables
- PCs: Principal components
Acknowledgments
This work was supported by the Cultivation Project of Huaqiao University for the China National Funds for Distinguished Young Scientists (No. JB-GJ1006) and the Program for New Century Excellent Talents in Universities of Fujian Province (No. 07176C02).
Cite this article
Zhang, G., Ge, H. Protein Hypersaline Adaptation: Insight from Amino Acids with Machine Learning Algorithms. Protein J 32, 239–245 (2013). https://doi.org/10.1007/s10930-013-9484-3
DOI: https://doi.org/10.1007/s10930-013-9484-3