Using Over-Represented Tetrapeptides to Predict Protein Submitochondria Locations

  • Regular Article
Acta Biotheoretica

Abstract

The mitochondrion is a key organelle of eukaryotic cells that provides the energy for cellular activities. Correctly identifying the submitochondria locations of proteins can provide valuable information for understanding their functions. However, using wet-lab experimental methods to determine the submitochondria locations of proteins is time-consuming and costly. Thus, it is highly desirable to develop a bioinformatics method to predict the submitochondria locations of mitochondrion proteins. In this work, a novel method based on the support vector machine was developed to predict the submitochondria locations of mitochondrion proteins from over-represented tetrapeptides selected with the binomial distribution. A reliable and rigorous benchmark dataset of 495 mitochondrion proteins with sequence identity ≤25 % was constructed for testing and evaluating the proposed model. Jackknife cross-validation showed that 91.1 % of the 495 mitochondrion proteins were correctly predicted. Subsequently, our model was evaluated on three existing benchmark datasets. The overall accuracies were 94.0, 94.7 and 93.4 %, respectively, suggesting that the proposed model is potentially useful in mitochondrion proteome research. Based on this model, we built a predictor called TetraMito, which is freely available at http://lin.uestc.edu.cn/server/TetraMito.
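
The abstract does not give the selection formula, so the following is only a minimal sketch of one common way to rank tetrapeptides with the binomial distribution. It assumes that a tetrapeptide occurring n times in one location class and N times over all classes is scored by the upper-tail probability P(X >= n) for X ~ Binomial(N, q), where q is that class's share of all tetrapeptide occurrences; a smaller probability suggests stronger over-representation. The function names, the class labels and the toy sequences are hypothetical and are not taken from the article.

```python
from collections import Counter
from math import comb


def tetrapeptide_counts(seqs):
    """Count overlapping tetrapeptides (length-4 windows) in a list of sequences."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - 3):
            counts[s[i:i + 4]] += 1
    return counts


def binomial_tail(n, N, q):
    """Return P(X >= n) for X ~ Binomial(N, q)."""
    return sum(comb(N, m) * q**m * (1 - q) ** (N - m) for m in range(n, N + 1))


def over_represented(class_seqs, top_k=10):
    """Rank (probability, tetrapeptide, class) triples, smallest probability first.

    class_seqs maps a class label to a list of protein sequences.
    Assumption: q for a class is its share of all tetrapeptide occurrences.
    """
    per_class = {c: tetrapeptide_counts(s) for c, s in class_seqs.items()}
    totals = Counter()
    for cnt in per_class.values():
        totals.update(cnt)
    grand_total = sum(totals.values())

    scored = []
    for c, cnt in per_class.items():
        q = sum(cnt.values()) / grand_total
        for pep, n in cnt.items():
            scored.append((binomial_tail(n, totals[pep], q), pep, c))
    scored.sort()
    return scored[:top_k]


# Toy data: placeholder class labels and made-up sequences.
toy = {"class_A": ["AAAAAAAA"], "class_B": ["CDEFGHIK"]}
ranking = over_represented(toy, top_k=3)
```

In a pipeline of the kind the abstract describes, the top-ranked tetrapeptides would define the feature vector (for example, their frequencies in each protein) passed to a support vector machine; the cutoff for how many tetrapeptides to keep is a tuning choice that this sketch leaves open.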

Acknowledgments

We are grateful to Dr. Loris Nanni for his help. This work was supported by the National Natural Science Foundation of China (Nos. 61202256 and 61100092), the Project of Education Department in Sichuan (12ZA112), the Fundamental Research Funds for the Central Universities (ZYGX2012J113) and the Scientific Research Startup Foundation of UESTC.

Author information

Correspondence to Hao Lin or Wei Chen.

About this article

Cite this article

Lin, H., Chen, W., Yuan, LF. et al. Using Over-Represented Tetrapeptides to Predict Protein Submitochondria Locations. Acta Biotheor 61, 259–268 (2013). https://doi.org/10.1007/s10441-013-9181-9

  • DOI: https://doi.org/10.1007/s10441-013-9181-9
