PluriPred: A Web server for predicting proteins involved in pluripotent network

Journal of Biosciences

Abstract

Pluripotency is a unique property of stem cells that allows them either to differentiate into all adult cell types or to continue self-renewing. PluriPred predicts whether a protein is involved in pluripotency from its primary sequence, using manually curated pluripotent proteins as training datasets. Machine learning techniques (MLTs), namely Support Vector Machine (SVM), Naïve Bayes (NB), and Random Forest (RF), and the sequence alignment tool BLAST were used in our study. Our best model combined SVM with PSI-BLAST and obtained a sensitivity of 77.40%, a specificity of 79.72%, an accuracy of 79.2%, and an area under the ROC curve of 0.82 in 5-fold cross-validation. PluriPred also reports a confidence for each prediction, derived from the SVM score distribution of the training dataset and the p-value from BLAST. We validated the proposed model against existing high-throughput studies using blind/independent datasets. Using PluriPred, 233 novel core and 323 novel extended core pluripotent proteins from the mouse proteome, and 167 novel core and 385 extended core pluripotent proteins from the human proteome, were predicted with high confidence. The Web application of PluriPred is available from bicresources.jcbose.ac.in/ssaha4/pluripred/. Many pluripotent genes/proteins take part in protein-protein networks associated with stem cell, cancer, and developmental biology, and we believe that PluriPred will aid research in these areas.
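
As a reading aid for the performance figures quoted above, the following minimal sketch shows how sensitivity, specificity, and accuracy are conventionally computed from binary predictions. It is not the PluriPred implementation; the function names and the toy labels are assumptions made purely for illustration, and the numbers it produces are unrelated to the results reported in the abstract.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts of a binary confusion matrix (1 = pluripotent, 0 = not)."""
    tp: int
    fp: int
    tn: int
    fn: int


def confusion(y_true, y_pred):
    """Tally true/false positives and negatives from two label lists."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return Confusion(tp, fp, tn, fn)


def sensitivity(c):
    """Fraction of true positives recovered: TP / (TP + FN)."""
    if c.tp + c.fn == 0:
        raise ValueError("no positive examples")
    return c.tp / (c.tp + c.fn)


def specificity(c):
    """Fraction of true negatives recovered: TN / (TN + FP)."""
    if c.tn + c.fp == 0:
        raise ValueError("no negative examples")
    return c.tn / (c.tn + c.fp)


def accuracy(c):
    """Fraction of all examples classified correctly."""
    total = c.tp + c.fp + c.tn + c.fn
    if total == 0:
        raise ValueError("no examples")
    return (c.tp + c.tn) / total


# Hypothetical toy labels, not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
c = confusion(y_true, y_pred)
print(sensitivity(c), specificity(c), accuracy(c))
```

In a 5-fold cross-validation, these quantities would typically be computed on each held-out fold and then averaged, or computed once on the pooled held-out predictions; the abstract does not state which convention was used.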


Acknowledgements

We thank Tanmoy Jana, Debasree Sarkar, Sumit Mukherjee, and Souvik Sinha for their valuable comments during the development of the server. We also specially thank the Bioinformatics Centre, Bose Institute, for providing the computational facilities for this work.

Author information

Corresponding author

Correspondence to Sudipto Saha.

Additional information

[Mandal SD and Saha S 2016 PluriPred: A Web server for predicting proteins involved in pluripotent network. J. Biosci.]

Supplementary materials pertaining to this article are available on the Journal of Biosciences Website.

Electronic supplementary material

Below is the link to the electronic supplementary material.

ESM 1

(PDF 731 kb)

About this article

Cite this article

Mandal, S.D., Saha, S. PluriPred: A Web server for predicting proteins involved in pluripotent network. J Biosci 41, 743–750 (2016). https://doi.org/10.1007/s12038-016-9649-2

  • DOI: https://doi.org/10.1007/s12038-016-9649-2
