Impact of the Continuous Evolution of Gene Ontology on Similarity Measures

  • Madhusudan Paul
  • Ashish Anand
  • Saptarshi Pyne
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11942)


Gene Ontology (GO) is a taxonomy of biological terms that describe the properties of genes and gene products. It can be used to define a similarity measure between two gene products and to assign a confidence score to protein-protein interactions (PPIs). GO evolves regularly through the addition, deletion, and merging of terms. However, no study has evaluated the robustness of a given similarity measure over the evolution of GO. By robustness, we mean that a similarity measure should either improve or maintain its performance as GO evolves. In this paper, we systematically study this robustness for the task of scoring the confidence of PPIs with GO-based similarity measures. We observe that the regular updates of GO affect the performance of similarity measures. We find that similarity measures are not robust under all conditions; rather, they keep their performance fairly stable over the evolution of GO under certain conditions.


Keywords: Gene Ontology (GO), Protein-protein interaction (PPI), Similarity measures



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Computer Science and Engineering, IIT Guwahati, Guwahati, India
  2. Department of Computer and System Sciences, Visva-Bharati, Santiniketan, India
