GO-WAR: A Tool for Mining Weighted Association Rules from Gene Ontology Annotations

  • Conference paper
Computational Intelligence Methods for Bioinformatics and Biostatistics (CIBB 2014)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 8623)

Abstract

The Gene Ontology (GO) is a controlled vocabulary of concepts (called GO terms) organized into three main ontologies. Each GO term describes a biological concept and is associated with one or more gene products through a process known as annotation. Annotations may be derived using different methods, and an Evidence Code (EC) records how each annotation was produced. The importance and specificity of both GO terms and annotations are often measured by their Information Content (IC). Mining annotations and annotated data can extract knowledge that is meaningful from a biological standpoint. For instance, analyzing annotated data with association rules provides evidence for the co-occurrence of annotations. However, classical association rule algorithms take into account neither the source of an annotation nor its importance, which leads to the generation of candidate rules with low IC. This paper presents a methodology for extracting weighted association rules from GO, implemented in a tool named GO-WAR (Gene Ontology-based Weighted Association Rules). GO-WAR extracts association rules with a high level of IC, without loss of Support and Confidence, from a dataset of annotated data. A case study applying GO-WAR to a publicly available GO annotation dataset is used to show that the method outperforms current state-of-the-art approaches.
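The quantities named in the abstract can be illustrated with a minimal sketch. It is not the GO-WAR implementation: the toy dataset, the function names, the annotation-frequency definition of IC (negative log of the fraction of gene products annotated with a term), and the `weighted_support` formula (plain support scaled by the mean IC of the terms) are all assumptions made here for illustration only.

```python
import math

# Toy annotation dataset (hypothetical): gene product -> set of GO terms.
annotations = {
    "P1": {"GO:A", "GO:B", "GO:C"},
    "P2": {"GO:A", "GO:B"},
    "P3": {"GO:A", "GO:C"},
    "P4": {"GO:A"},
}


def support(itemset, transactions):
    """Fraction of gene products annotated with every term in itemset."""
    itemset = set(itemset)
    hits = sum(1 for terms in transactions.values() if itemset <= terms)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(antecedent | consequent) / support(antecedent)."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / base


def information_content(term, transactions):
    """Assumed IC: -log2 of the term's annotation frequency."""
    p = support({term}, transactions)
    return -math.log2(p) if p > 0 else float("inf")


def weighted_support(itemset, transactions):
    """Assumed weighting: support scaled by the mean IC of the terms."""
    ics = [information_content(t, transactions) for t in itemset]
    return (sum(ics) / len(ics)) * support(itemset, transactions)


if __name__ == "__main__":
    for terms in (["GO:A"], ["GO:B"], ["GO:A", "GO:B"]):
        print(terms,
              support(terms, annotations),
              weighted_support(terms, annotations))
    print(confidence(["GO:B"], ["GO:A"], annotations))
```

In this toy data the term annotated to every gene product has maximal support but zero IC, so any weighting of this kind ranks it below rarer, more specific terms. That is the effect the abstract attributes to weighting: frequent but uninformative terms stop dominating the candidate rules.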

Author information

Corresponding author

Correspondence to Giuseppe Agapito.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Agapito, G., Cannataro, M., Guzzi, P.H., Milano, M. (2015). GO-WAR: A Tool for Mining Weighted Association Rules from Gene Ontology Annotations. In: Di Serio, C., Liò, P., Nonis, A., Tagliaferri, R. (eds) Computational Intelligence Methods for Bioinformatics and Biostatistics. CIBB 2014. Lecture Notes in Computer Science, vol 8623. Springer, Cham. https://doi.org/10.1007/978-3-319-24462-4_1

  • DOI: https://doi.org/10.1007/978-3-319-24462-4_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-24461-7

  • Online ISBN: 978-3-319-24462-4

  • eBook Packages: Computer Science, Computer Science (R0)
