DPROT: prediction of disordered proteins using evolutionary information

Sethi, Deepti; Garg, Aarti; Raghava, G. P. S.

doi:10.1007/s00726-008-0085-y

Original Article
Published: 19 April 2008

Amino Acids, Volume 35, pages 599–605 (2008)

Abstract

The association of structurally disordered proteins with a number of diseases has generated considerable interest and calls for a prediction method that would facilitate their rapid study at the molecular level. The present study describes the development of a computational method for predicting disordered proteins, using sequence and profile compositions as input features for training support vector machine (SVM) models. First, we developed SVM modules based on amino acid and dipeptide compositions, which yielded sensitivities of 75.6% and 73.2% with Matthews correlation coefficient (MCC) values of 0.75 and 0.60, respectively. Next, using predicted secondary structure content (coil, sheet and helix), expressed as composition values, gave a sensitivity of 76.8% and an MCC of 0.77. Finally, training SVM models on the evolutionary information contained in multiple sequence alignment profiles improved prediction performance further, achieving a sensitivity of 78% and an MCC of 0.78. When evaluated on an independent dataset of partially disordered proteins, the same SVM module gave a correct prediction rate of 86.6%. Based on this study, a web server, DPROT, was developed for the prediction of disordered proteins; it is available at http://www.imtech.res.in/raghava/dprot/.

Acknowledgments

The authors are thankful to the Council of Scientific and Industrial Research (CSIR) and the Department of Biotechnology, Government of India for financial assistance.

Author information

Authors and Affiliations

Scientist and Head Bioinformatics Centre, Institute of Microbial Technology, Sector 39A, Chandigarh, India
Deepti Sethi, Aarti Garg & G. P. S. Raghava

Corresponding author

Correspondence to G. P. S. Raghava.

About this article

Cite this article

Sethi, D., Garg, A. & Raghava, G.P.S. DPROT: prediction of disordered proteins using evolutionary information. Amino Acids 35, 599–605 (2008). https://doi.org/10.1007/s00726-008-0085-y

Received: 19 February 2008
Accepted: 17 March 2008
Published: 19 April 2008
Issue Date: October 2008
DOI: https://doi.org/10.1007/s00726-008-0085-y
