Prediction of mitochondrial proteins of malaria parasite using split amino acid composition and PSSM profile

  • Original Article
  • Amino Acids

Abstract

The rate of human death due to malaria continues to rise, so the malaria-causing parasite Plasmodium falciparum (PF) remains a cause of concern. With the wealth of data now available, it is imperative to understand the localization of its proteins in order to gain deeper insight into their functional roles. In this study, we describe a machine-learning method for predicting mitochondrial proteins of the malaria parasite. All models were trained and tested on 175 proteins (40 mitochondrial and 135 non-mitochondrial) and evaluated using five-fold cross-validation. We first developed Support Vector Machine (SVM) models for predicting mitochondrial proteins of P. falciparum using amino acid composition and dipeptide composition, which achieved maximum Matthews correlation coefficient (MCC) values of 0.38 and 0.51, respectively. We then used split amino acid composition (SAAC), in which the composition of the N-terminus, the C-terminus, and the rest of the protein is computed separately. The performance of the SVM model improved significantly, from MCC 0.38 to 0.73, when SAAC was used as input instead of simple amino acid composition. In addition, an SVM model developed using the composition of the PSSM profile achieved MCC 0.75 and accuracy 91.38%. A hybrid model, which combines the PSSM profile and SAAC, achieved the maximum MCC of 0.81 with accuracy 92%. When evaluated on an independent dataset, our method performs better than existing methods. A web server, PFMpred, has been developed for predicting mitochondrial proteins of malaria parasites (http://www.imtech.res.in/raghava/pfmpred/).
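The two quantities the abstract relies on, split amino acid composition and MCC, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the 25-residue terminal windows (`n_term`, `c_term`), and the rejection of sequences too short to leave a middle segment are all assumptions made for illustration, since the abstract does not state these details.

```python
from collections import Counter
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(segment):
    """Return the fraction of each standard amino acid in a segment."""
    counts = Counter(segment)
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def saac(sequence, n_term=25, c_term=25):
    """Split amino acid composition as a 60-dimensional feature vector.

    The composition of the N-terminal window, the remaining middle part,
    and the C-terminal window is computed separately and concatenated.
    The window sizes are assumed values, not taken from the abstract.
    """
    seq = sequence.upper()
    if len(seq) <= n_term + c_term:
        # Assumption: sequences with no middle segment are rejected.
        raise ValueError("sequence too short for the chosen windows")
    head = seq[:n_term]
    middle = seq[n_term:len(seq) - c_term]
    tail = seq[len(seq) - c_term:]
    return composition(head) + composition(middle) + composition(tail)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column is empty
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Toy sequence, made up for illustration only.
    toy = "M" * 25 + "A" * 10 + "K" * 25
    features = saac(toy)
    print(len(features))  # 60 values: 20 per segment
    print(mcc(tp=30, tn=125, fp=10, fn=10))
```

A vector like `features` would then serve as SVM input. Because the three segments are described separately, a signal concentrated at one terminus is not averaged out over the whole sequence, which is one plausible reading of why SAAC outperforms simple composition here.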

Acknowledgments

The authors gratefully acknowledge the financial support provided by the Council of Scientific and Industrial Research (CSIR) and the Department of Biotechnology (DBT), Government of India. This paper has IMTECH communication number 048/2007.

Author information

Corresponding author

Correspondence to G. P. S. Raghava.

About this article

Cite this article

Verma, R., Varshney, G.C. & Raghava, G.P.S. Prediction of mitochondrial proteins of malaria parasite using split amino acid composition and PSSM profile. Amino Acids 39, 101–110 (2010). https://doi.org/10.1007/s00726-009-0381-1

  • DOI: https://doi.org/10.1007/s00726-009-0381-1
