Cross-Ontological Analytics: Combining Associative and Hierarchical Relations in the Gene Ontologies to Assess Gene Product Similarity

  • C. Posse
  • A. Sanfilippo
  • B. Gopalan
  • R. Riensche
  • N. Beagley
  • B. Baddeley
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3992)


Similarity between genes and gene products is a fundamental diagnostic measure for analyzing biological data and constructing predictive models in functional genomics. With the rising influence of the gene ontologies, two complementary approaches have emerged in which the similarity between two genes or gene products is obtained by comparing the Gene Ontology (GO) annotations associated with them. One approach captures GO-based similarity in terms of hierarchical relations within each gene ontology. The other identifies GO-based similarity in terms of associative relations across the three gene ontologies. We propose a novel methodology that merges the two approaches, with ensuing benefits in coverage and accuracy.
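
To make the two kinds of relation concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes an information-content (Lin-style) measure for the hierarchical side and a cosine over term annotation vectors for the associative side. The toy hierarchy, the annotation counts, the vectors, and all function names are hypothetical. The abstract does not say how the two scores are merged, so the sketch stops at computing them separately.

```python
import math

# Hypothetical toy hierarchy inside one ontology: term -> list of parent terms.
PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "dna_binding": ["binding"],
    "rna_binding": ["binding"],
}
# Hypothetical number of gene products annotated directly to each term.
DIRECT_COUNTS = {"root": 0, "binding": 2, "catalysis": 4,
                 "dna_binding": 3, "rna_binding": 1}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen, stack = {term}, [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = log(N / n_t), with n_t counting annotations to t or its descendants."""
    cumulative = dict.fromkeys(PARENTS, 0)
    for term, count in DIRECT_COUNTS.items():
        for anc in ancestors(term):
            cumulative[anc] += count
    total = cumulative["root"]
    # Every term in this toy data has a nonzero cumulative count.
    return {t: math.log(total / c) for t, c in cumulative.items()}


IC = information_content()


def hierarchical_similarity(a, b):
    """Lin-style score: IC of the most informative shared ancestor, normalized."""
    if a == b:
        return 1.0
    shared = max(IC[t] for t in ancestors(a) & ancestors(b))
    denom = IC[a] + IC[b]
    return 2 * shared / denom if denom > 0 else 0.0


def associative_similarity(u, v):
    """Cosine between two term vectors, e.g. terms from different ontologies."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical vectors: which of four gene products carry each term.
mf_term = [1, 1, 0, 1]   # a molecular function term
bp_term = [1, 1, 0, 0]   # a biological process term

print(hierarchical_similarity("dna_binding", "rna_binding"))
print(associative_similarity(mf_term, bp_term))
```

In this sketch the hierarchical score can only relate terms inside one ontology, while the cosine can relate terms from different ontologies that tend to annotate the same gene products, which is one way to read the coverage benefit the abstract mentions.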


Keywords (added by machine, not by the authors): Gene Ontology; Semantic Similarity; Protein Pair; Spearman Rank Order Correlation; Hierarchical Relation



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • C. Posse (1)
  • A. Sanfilippo (1)
  • B. Gopalan (1)
  • R. Riensche (1)
  • N. Beagley (1)
  • B. Baddeley (1)
  1. Pacific Northwest National Laboratory, Richland, USA
