Integration and Mining of Genomic Annotations: Experiences and Perspectives in GFINDer Data Warehousing

  • Conference paper
Data Integration in the Life Sciences (DILS 2009)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 5647)

Abstract

Many tasks in bioinformatics require the comprehensive evaluation of different types of data, which are generally available in distributed and heterogeneous data sources. Several approaches, including federated databases, multidatabases, and mediator-based systems, have been proposed to integrate data from multiple sources. Yet data warehousing seems to be the most adequate approach when large amounts of data need to be integrated, efficiently processed, and mined comprehensively. To support the biological interpretation of high-throughput gene lists, we previously developed GFINDer (Genome Functional INtegrated Discoverer, http://www.bioinformatics.polimi.it/GFINDer/), a web server that statistically analyzes and mines functional and phenotypic gene annotations, which are sparsely available in numerous databanks, to highlight annotation categories significantly enriched or depleted in the considered gene lists. GFINDer includes a data warehouse that integrates gene and protein annotations of several organisms, expressed through various controlled terminologies and ontologies. Here, we describe the GFINDer data warehouse and discuss the lessons learned during its construction and its five years of maintenance and development.
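To make the notion of annotation categories "significantly enriched or depleted" in a gene list concrete, the following is a minimal sketch of one common way to test it: a hypergeometric test on the number of list genes that carry a given annotation. The abstract does not state which statistical tests GFINDer applies, so the choice of test, the function names, and the toy numbers below are all assumptions made for illustration only.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """Probability that exactly k of the n genes in a list carry an annotation,
    when K of the N genes in the reference set carry it (sampling without
    replacement). Hypothetical helper, not part of GFINDer."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def enrichment_p_values(k, N, K, n):
    """Return (p_enriched, p_depleted) for observing k annotated genes.

    p_enriched = P(X >= k), small when the category is over-represented.
    p_depleted = P(X <= k), small when the category is under-represented.
    """
    lo = max(0, n - (N - K))  # fewest annotated genes the list can contain
    hi = min(K, n)            # most annotated genes the list can contain
    p_enriched = sum(hypergeom_pmf(i, N, K, n) for i in range(max(k, lo), hi + 1))
    p_depleted = sum(hypergeom_pmf(i, N, K, n) for i in range(lo, min(k, hi) + 1))
    return p_enriched, p_depleted


# Toy numbers (assumed): 10 reference genes, 4 annotated, list of 3 genes,
# 2 of which carry the annotation.
p_enriched, p_depleted = enrichment_p_values(k=2, N=10, K=4, n=3)
```

In practice such a test would be repeated for every annotation category in the warehouse, and the resulting p-values would need a multiple-testing correction, which this sketch omits.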

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Masseroli, M., Ceri, S., Campi, A. (2009). Integration and Mining of Genomic Annotations: Experiences and Perspectives in GFINDer Data Warehousing. In: Paton, N.W., Missier, P., Hedeler, C. (eds) Data Integration in the Life Sciences. DILS 2009. Lecture Notes in Computer Science, vol 5647. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02879-3_8

  • DOI: https://doi.org/10.1007/978-3-642-02879-3_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-02878-6

  • Online ISBN: 978-3-642-02879-3

  • eBook Packages: Computer Science, Computer Science (R0)
