GOAL: the comprehensive gene ontology analysis layer

Abstract

The homogeneity or heterogeneity of cells is among the most fundamental and important features in analyzing biological associations of genes and gene products. Modern bioinformatics requires automated, high-throughput analysis applications that can handle the massive volumes of data produced by next-generation sequencing and the dramatically increased size of public proteomic and genomic databases. Although the Gene Ontology (GO) database has recently drawn attention for its wide coverage of machine-readable terminology, its complex database schema and the large number of applications that use GO without careful consideration of GO term relations dilute the actual power of GO-based analysis and have resulted in misleading or underestimated outcomes. Meanwhile, our recent studies showed that the BSM score, a new measure of functional similarity, clearly outperformed existing conventional methods. However, implementing the BSM score, which requires integrating multiple databases and calculating a scoring matrix, is not trivial and is difficult even for bioinformatics experts. We therefore introduce the Gene Ontology Analysis Layer (GOAL: http://www.ittc.ku.edu/chenlab/goal), a web-based graphical user interface (GUI) tool that provides a user-friendly GO application powered by a state-of-the-art functional similarity metric, the BSM score.

Author information

Corresponding author

Correspondence to Xue-Wen Chen.

Cite this article

Jeong, J.C., Li, G. & Chen, XW. GOAL: the comprehensive gene ontology analysis layer. Sci. China Inf. Sci. 59, 070108 (2016). https://doi.org/10.1007/s11432-016-5581-1

Keywords

  • gene ontology
  • molecular function
  • functional similarity
  • network-based analysis
  • BSM score