Impact of the Continuous Evolution of Gene Ontology on Similarity Measures
Gene Ontology (GO) is a taxonomy of biological terms describing the properties of genes and gene products. It can be used to define a similarity measure between two gene products and to assign a confidence score to protein-protein interactions (PPIs). GO evolves regularly through the addition, deletion, and merging of terms. However, no study has evaluated the robustness of a particular similarity measure over the evolution of GO. We call a similarity measure robust if its performance either improves or stays similar as GO evolves. In this paper, we systematically study this robustness for the task of scoring the confidence of PPIs using GO-based similarity measures. We observe that the performance of similarity measures is affected by the regular updates of GO. We find that similarity measures are not robust in all conditions; rather, their performance stays quite similar over the evolution of GO in certain conditions.
Keywords: Gene Ontology (GO), Protein-protein interaction (PPI), Similarity measures