Methods of Gene Ontology Term Similarity Analysis in Graph Database Environment
The article presents and analyses three graph processing problems that arise in three methods of GO term similarity evaluation. Solutions to these problems are implemented in the Neo4j graph database environment. Each problem can be solved directly by a single Cypher query or divided into several queries whose results have to be merged. The introduced solutions are compared in terms of time and memory efficiency. The results show how to implement effective solutions for this class of problems.
Keywords: graph database, Neo4j, Gene Ontology, GO term similarity
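The abstract contrasts one query that answers a question directly with several simpler queries whose results are merged afterwards. That contrast can be illustrated without a database. The following is a minimal Python sketch, not the authors' implementation and not Cypher. It assumes a hypothetical toy DAG of term identifiers (`PARENTS`) and finds the common ancestors of two terms, a step that many GO term similarity measures rely on. It does this once by merging two independent ancestor lookups and once in a single combined traversal. All names here are illustrative assumptions.

```python
from collections import deque

# Hypothetical toy ontology: child term -> list of parent terms (is_a edges).
PARENTS = {
    "T5": ["T3", "T4"],
    "T4": ["T2"],
    "T3": ["T1", "T2"],
    "T2": ["T0"],
    "T1": ["T0"],
    "T0": [],
}


def ancestors(term):
    """Return the term itself plus every term reachable via parent edges."""
    seen = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for parent in PARENTS.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def common_ancestors_merged(a, b):
    """'Several queries' style: two independent lookups, merged afterwards."""
    return ancestors(a) & ancestors(b)


def common_ancestors_single(a, b):
    """'Single query' style: one traversal tracking which start term reached each node."""
    reached = {}  # term -> set of start terms that can reach it
    queue = deque([(a, a), (b, b)])
    while queue:
        current, origin = queue.popleft()
        origins = reached.setdefault(current, set())
        if origin in origins:
            continue  # this node was already visited from this start term
        origins.add(origin)
        for parent in PARENTS.get(current, []):
            queue.append((parent, origin))
    return {term for term, origins in reached.items() if origins == {a, b}}
```

Both functions return the same set for any pair of terms in the toy graph. Which form costs less time and memory is the kind of question the article evaluates for its Neo4j solutions.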