Annotation of Alternatively Spliced Proteins and Transcripts with Protein-Folding Algorithms and Isoform-Level Functional Networks

  • Protocol
Protein Bioinformatics

Part of the book series: Methods in Molecular Biology (MIMB, volume 1558)

Abstract

Tens of thousands of splice isoforms of proteins have been catalogued as predicted sequences from transcripts in humans and other species. Relatively few have been characterized biochemically or structurally. With the extensive development of protein bioinformatics, the characterization and modeling of isoform features, isoform functions, and isoform-level networks have advanced notably. Here we present applications of the I-TASSER family of algorithms for folding and functional predictions and the IsoFunc, MIsoMine, and Hisonet data resources for isoform-level analyses of network and pathway-based functional predictions and protein-protein interactions. We hope that predictions and insights from protein bioinformatics will stimulate many experimental validation studies.


Copyright information

© 2017 Springer Science+Business Media LLC

About this protocol

Cite this protocol

Li, H., Zhang, Y., Guan, Y., Menon, R., Omenn, G.S. (2017). Annotation of Alternatively Spliced Proteins and Transcripts with Protein-Folding Algorithms and Isoform-Level Functional Networks. In: Wu, C., Arighi, C., Ross, K. (eds) Protein Bioinformatics. Methods in Molecular Biology, vol 1558. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-6783-4_20

  • DOI: https://doi.org/10.1007/978-1-4939-6783-4_20

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-6781-0

  • Online ISBN: 978-1-4939-6783-4

  • eBook Packages: Springer Protocols
