SWAMI: Integrating Biological Databases and Analysis Tools Within User Friendly Environment

  • Rami Rifaieh
  • Roger Unwin
  • Jeremy Carver
  • Mark A. Miller
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4544)

Abstract

In the last decade, many projects have addressed the integration of biological resources. Web portals have flourished online, each providing data from a public provider. Although these online resources come with a set of manipulation tools, scientists, researchers, and students often have to shift from one resource to another to accomplish a particular task. Making a rich tool set available along with a variety of databases, data formats, and computational capabilities is a complex task. It requires building a versatile environment for data integration, data manipulation, and data storage. In this paper, we study the requirements and report the architectural design of a web application, code-named SWAMI, which aims to integrate a rich tool set and a variety of biological databases. The suggested architecture is highly scalable with respect to adding new databases and manipulation tools.

Keywords

  • Biology Workbench
  • Biological data integration
  • Biological tool integration

Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Rami Rifaieh (1)
  • Roger Unwin (1)
  • Jeremy Carver (1)
  • Mark A. Miller (1)
  1. The San Diego Supercomputer Center, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0505
