GO-WAR: A Tool for Mining Weighted Association Rules from Gene Ontology Annotations

  • Giuseppe Agapito (corresponding author)
  • Mario Cannataro
  • Pietro H. Guzzi
  • Marianna Milano
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8623)

Abstract

The Gene Ontology (GO) is a controlled vocabulary of concepts (called GO Terms) organized into three main ontologies. Each GO Term describes a biological concept and is associated with one or more gene products through a process known as annotation. Annotations may be derived using different methods, and an Evidence Code (EC) records how each annotation was produced. The importance and specificity of both GO Terms and annotations are often measured by their Information Content (IC). Mining annotations and annotated data can extract knowledge that is meaningful from a biological standpoint. For instance, analyzing annotated data with association rules provides evidence for the co-occurrence of annotations. However, classical association rule algorithms take into account neither the source of an annotation nor its importance, which leads to the generation of candidate rules with low IC. This paper presents a methodology for extracting Weighted Association Rules from GO, implemented in a tool named GO-WAR (Gene Ontology-based Weighted Association Rules). GO-WAR is able to extract association rules with a high level of IC, without loss of Support and Confidence, from a dataset of annotated data. A case study applying GO-WAR to a publicly available GO annotation dataset is used to demonstrate that our method outperforms current state-of-the-art approaches.
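To make the quantities named in the abstract concrete, the following is a minimal Python sketch of Support, Confidence, IC, and an IC-weighted support on a toy annotation set. It is not the GO-WAR implementation: the gene product names, the term identifiers, the corpus-based IC definition (negative log of annotation frequency), and the weighting scheme (mean IC of the itemset multiplied by its support) are all assumptions chosen for illustration. Evidence Codes are ignored here.

```python
import math
from collections import Counter

# Hypothetical toy corpus: gene product -> set of GO term IDs (placeholder IDs).
annotations = {
    "P1": {"GO:A", "GO:B", "GO:C"},
    "P2": {"GO:A", "GO:B"},
    "P3": {"GO:A", "GO:C"},
    "P4": {"GO:A"},
}

def information_content(annotations):
    """Assumed IC: -log of the fraction of gene products annotated with a term."""
    n = len(annotations)
    counts = Counter(t for terms in annotations.values() for t in terms)
    return {t: -math.log(c / n) for t, c in counts.items()}

def support(itemset, annotations):
    """Fraction of gene products whose annotations contain every term in itemset."""
    hits = sum(1 for terms in annotations.values() if itemset <= terms)
    return hits / len(annotations)

def confidence(antecedent, consequent, annotations):
    """Classical confidence of the rule antecedent -> consequent."""
    return support(antecedent | consequent, annotations) / support(antecedent, annotations)

def weighted_support(itemset, annotations, ic):
    """Assumed weighting: mean IC of the terms in itemset times classical support."""
    mean_ic = sum(ic[t] for t in itemset) / len(itemset)
    return mean_ic * support(itemset, annotations)

ic = information_content(annotations)

if __name__ == "__main__":
    for rule in [({"GO:B"}, {"GO:A"}), ({"GO:B"}, {"GO:C"})]:
        items = rule[0] | rule[1]
        print(sorted(rule[0]), "->", sorted(rule[1]),
              "support:", support(items, annotations),
              "confidence:", confidence(rule[0], rule[1], annotations),
              "weighted support:", weighted_support(items, annotations, ic))
```

In this toy data the term annotated to every gene product has an IC of zero, so a rule involving it has high classical support but contributes little weight. This is the kind of low-IC candidate rule that the abstract says classical algorithms tend to generate.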

Keywords

Gene Ontology, Weighted Association Rules

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Giuseppe Agapito (1), corresponding author
  • Mario Cannataro (1)
  • Pietro H. Guzzi (1)
  • Marianna Milano (1)
  1. Department of Medical and Surgical Sciences, Magna Graecia University, Catanzaro, Italy
