Amino Acids, Volume 42, Issue 5, pp 1947–1953

Detecting thermophilic proteins through selecting amino acid and dipeptide composition features

Original Article


Detecting thermophilic proteins is an important task for engineering proteins that are stable at the temperatures of interest. In this work, we develop a simple but efficient method to distinguish thermophilic proteins from mesophilic ones using amino acid and dipeptide compositions. Because most of the amino acid and dipeptide compositions are redundant, we propose a new forward floating selection technique that selects only a useful subset of these compositions as features for support vector machine-based classification. We test the proposed method on a benchmark data set of 915 thermophilic and 793 mesophilic proteins. Using 28 amino acid and dipeptide compositions, our method achieves an accuracy of 93.3% in the jackknife cross-validation test, which is higher not only than that of existing methods but also than that obtained using all amino acid and dipeptide compositions.
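For readers unfamiliar with the feature space, the following is a minimal sketch of how amino acid and dipeptide compositions are commonly computed, giving 20 + 400 = 420 candidate features per protein. It is an illustration only and is not taken from the article: the function name `composition_features`, the feature ordering, and the choice to skip non-standard residue symbols are all assumptions, and the article's selection step and classifier are not reproduced here.

```python
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in the same alphabetical order.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def composition_features(sequence):
    """Return 420 values: 20 amino acid fractions, then 400 dipeptide fractions.

    Hypothetical helper for illustration; the normalisation used in the
    article may differ.
    """
    # Assumption: symbols outside the 20 standard residues are dropped
    # before counting, so residues on either side become adjacent.
    residues = [c for c in sequence.upper() if c in AMINO_ACIDS]
    n = len(residues)

    aa_counts = dict.fromkeys(AMINO_ACIDS, 0)
    for c in residues:
        aa_counts[c] += 1

    dp_counts = dict.fromkeys(DIPEPTIDES, 0)
    for first, second in zip(residues, residues[1:]):
        dp_counts[first + second] += 1

    n_pairs = max(n - 1, 0)  # a sequence of n residues has n - 1 adjacent pairs
    aa = [aa_counts[a] / n if n else 0.0 for a in AMINO_ACIDS]
    dp = [dp_counts[d] / n_pairs if n_pairs else 0.0 for d in DIPEPTIDES]
    return aa + dp


features = composition_features("ACAC")
print(len(features))  # 420 candidate features
```

A feature selection procedure such as the one described above would then search this 420-dimensional space for a small subset (28 features in the reported result) before training the classifier.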


Amino acid composition; Dipeptide composition; Feature selection; Floating search method; Protein thermostability

Supplementary material

726_2011_923_MOESM1_ESM.pdf (57 kb)
Supplementary material 1 (PDF 56 kb)

Copyright information

© Springer-Verlag 2011

Authors and Affiliations

  1. Key Laboratory of Systems Biology, SIBS-Novo Nordisk Translational Research Centre for PreDiabetes, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
  2. Department of Electrical and Computer Engineering, Thammasat University, Pathumthani, Thailand
