Distributed Management and Analysis of Omics Data

  • Mario Cannataro
  • Pietro Hiram Guzzi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7156)

Abstract

The term omics refers to a family of biology disciplines such as genomics, proteomics, and interactomics. The suffix -ome denotes the object of study of each discipline (the genome, proteome, or interactome) and usually refers to a totality of some sort. This paper introduces omics data and the main computational techniques for their storage, preprocessing, and analysis. The increasing availability of omics data, driven by the advent of high-throughput technologies, poses novel data management and analysis problems that can be addressed by parallel and distributed storage systems and algorithms. After surveying the main omics databases, preprocessing techniques, and analysis approaches, the paper describes some recent bioinformatics tools in genomics, proteomics, and interactomics that use a distributed approach.

Keywords

Omics Data; Genomics; Proteomics; Interactomics; Distributed Computing

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Mario Cannataro (1)
  • Pietro Hiram Guzzi (1)
  1. Department of Medical and Surgical Sciences, Bioinformatics Laboratory, University Magna Græcia of Catanzaro, Catanzaro, Italy
