A Context-Driven Gene Prioritization Method for Web-Based Functional Genomics

  • Jeremy J. Jay
  • Erich J. Baker
  • Elissa J. Chesler
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7875)


Functional genomics experiments often produce large sets of gene-centered results associated with biological concepts such as diseases. Prioritizing and interpreting these results involves evaluating the relevance of genes to various annotations or associated terms, and is often done using prior information in biological databases. These diverse databases are frequently disconnected or only loosely federated data stores. Consequently, assessing the relations among biological entities and constructs, including genes, gene products, diseases, and model organism phenotypes, is a challenging task that typically requires manual intervention, and as such only limited information is considered. Extracting and quantifying relations among genes and disease-related concepts can be improved by quantifying the entire contextual similarity of gene representations across the landscape of biological data. We have devised a suitable metric for this analysis which, unlike most similar methods, requires no user-defined input parameters. We have demonstrated improved gene prioritization relative to existing metrics and commonly used software systems for gene prioritization. Our approach is implemented as an enhancement to the flexible integrative genomics platform,


Semantic Similarity, Autistic Disorder, Rand Index, Gene Prioritization, Human Phenotype Ontology
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Jeremy J. Jay (1)
  • Erich J. Baker (2)
  • Elissa J. Chesler (1)

  1. The Jackson Laboratory, Bar Harbor, USA
  2. Baylor University, Waco, USA
