GO-WAR: A Tool for Mining Weighted Association Rules from Gene Ontology Annotations

Agapito, Giuseppe; Cannataro, Mario; Guzzi, Pietro H.; Milano, Marianna

doi:10.1007/978-3-319-24462-4_1

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 8623)

Included in the following conference series:

International Meeting on Computational Intelligence Methods for Bioinformatics and Biostatistics

Abstract

The Gene Ontology (GO) is a controlled vocabulary of concepts (called GO terms) structured into three main ontologies. Each GO term describes a biological concept and is associated with one or more gene products through a process known as annotation. Annotations may be derived using different methods, and an Evidence Code (EC) records which method was used. The importance and specificity of both GO terms and annotations are often measured by their Information Content (IC). Mining annotations and annotated data can extract knowledge that is meaningful from a biological standpoint. For instance, analysing annotated data with association rules provides evidence for the co-occurrence of annotations. However, classical association rule algorithms take into account neither the source of an annotation nor its importance, which leads to the generation of candidate rules with low IC. This paper presents a methodology for extracting Weighted Association Rules from GO, implemented in a tool named GO-WAR (Gene Ontology-based Weighted Association Rules). GO-WAR is able to extract association rules with a high level of IC without loss of Support and Confidence from a dataset of annotated data. A case study applying GO-WAR to a publicly available GO annotation dataset demonstrates that our method outperforms current state-of-the-art approaches.
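
To make the quantities named in the abstract concrete, the following is a minimal Python sketch of Support, Confidence, and a corpus-based Information Content over a toy annotation set. It is not GO-WAR's implementation: the term identifiers, the data, and the `weighted_support` formula (mean IC of the terms multiplied by plain support) are assumptions chosen only for illustration, since the abstract does not state the weighting scheme the tool uses.

```python
import math

# Hypothetical toy data: gene product -> set of GO term identifiers.
# The identifiers are placeholders, not real GO accessions.
annotations = {
    "P1": {"GO:A", "GO:B", "GO:C"},
    "P2": {"GO:A", "GO:B"},
    "P3": {"GO:A", "GO:C"},
    "P4": {"GO:A"},
}


def support(itemset, transactions):
    """Fraction of gene products annotated with every term in itemset."""
    hits = sum(1 for terms in transactions.values() if itemset <= terms)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of the whole rule divided by support of its antecedent."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    return support(antecedent | consequent, transactions) / base


def information_content(term, transactions):
    """Corpus-based IC: -log2 of the term's annotation frequency."""
    p = support({term}, transactions)
    return -math.log2(p) if p > 0 else float("inf")


def weighted_support(itemset, transactions):
    """Assumed weighting (not taken from the paper): mean IC times support."""
    mean_ic = sum(information_content(t, transactions) for t in itemset) / len(itemset)
    return mean_ic * support(itemset, transactions)


rule = ({"GO:B"}, {"GO:A"})
print(support(rule[0] | rule[1], annotations))
print(confidence(rule[0], rule[1], annotations))
print(weighted_support(rule[0] | rule[1], annotations))
```

In this toy set the term `GO:A` annotates every gene product, so its IC is zero. A rule such as `GO:B => GO:A` reaches full confidence while saying little, which is the kind of low-IC candidate rule that an unweighted algorithm would rank highly.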

Author information

Authors and Affiliations

Department of Medical and Surgical Sciences, Magna Graecia University, Catanzaro, Italy
Giuseppe Agapito, Mario Cannataro, Pietro H. Guzzi & Marianna Milano

Corresponding author

Correspondence to Giuseppe Agapito.

Editor information

Editors and Affiliations

CUSSB, University "Vita-Salute" San Raffaele, Milano, Italy
Clelia Di Serio
The Computer Laboratory, University of Cambridge, Cambridge, United Kingdom
Pietro Liò
CUSSB, Università Vita-Salute San Raffaele, Milano, Italy
Alessandro Nonis
Dipartimento di Informatica, Università degli Studi di Salerno, Fisciano, Salerno, Italy
Roberto Tagliaferri

About this paper

Cite this paper

Agapito, G., Cannataro, M., Guzzi, P.H., Milano, M. (2015). GO-WAR: A Tool for Mining Weighted Association Rules from Gene Ontology Annotations. In: Di Serio, C., Liò, P., Nonis, A., Tagliaferri, R. (eds) Computational Intelligence Methods for Bioinformatics and Biostatistics. CIBB 2014. Lecture Notes in Computer Science, vol 8623. Springer, Cham. https://doi.org/10.1007/978-3-319-24462-4_1

DOI: https://doi.org/10.1007/978-3-319-24462-4_1
Published: 18 November 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24461-7
Online ISBN: 978-3-319-24462-4
eBook Packages: Computer Science, Computer Science (R0)
