A Higher-Order Component for Efficient Genome Processing in the Grid

  • Chapter
Making Grids Work

Abstract

Computational grids combine computers across the Internet for distributed data processing and are an attractive platform for the data-intensive applications of bioinformatics. We present extensible genome processing software for the grid and evaluate its performance. Our software discovered previously unknown circular permutations (CPs) in the ProDom database, which contains more than 70 MB of protein data. A specific feature of our software is its design as a component: the Alignment HOC, a Higher-Order Component that uses the latest Globus toolkit as grid middleware. Besides genome data, the Alignment HOC accepts as input the plugin code for processing this data, and it contains all the configuration required to run the component on top of Globus, thus freeing the non-grid-expert user from dealing with grid middleware. Instead of writing data distribution procedures and configuring the middleware appropriately for every new algorithm, Alignment HOC users reuse the existing component and write only application-specific plugins. To store plugins persistently and make them reusable, we built a web-accessible plugin database with a convenient administration GUI. The flexible component-based implementation makes it easy to study CPs in other databases (e.g. UniProt/Swiss-Prot) or to use an alignment algorithm other than the standard Needleman-Wunsch. For the efficient distribution of workload, we developed a library of group communication operations for HOCs.
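The plugin idea described in the abstract can be illustrated with a small, purely local sketch. It is not the authors' implementation: the names `alignment_component`, `best_rotation_score` and the `Plugin` type are hypothetical, no Globus or grid code is involved, and the rotation-based check is only a toy stand-in for real circular-permutation detection. The sketch shows a higher-order function that receives sequence data together with a scoring plugin, here a textbook Needleman-Wunsch global alignment score with a linear gap penalty.

```python
from itertools import combinations
from typing import Callable, Dict, Tuple

# Hypothetical plugin type: takes two sequences, returns an alignment score.
Plugin = Callable[[str, str], int]


def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1) -> int:
    """Global alignment score (linear gap penalty), computed row by row."""
    # Row 0: aligning the empty prefix of a against prefixes of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of a against the empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def best_rotation_score(a: str, b: str) -> int:
    """Toy CP check (an assumption, not the paper's method):
    best global score of a against every rotation of b."""
    if not b:
        return needleman_wunsch(a, b)
    return max(needleman_wunsch(a, b[k:] + b[:k]) for k in range(len(b)))


def alignment_component(sequences: Dict[str, str],
                        plugin: Plugin) -> Dict[Tuple[str, str], int]:
    """Higher-order function: applies the plugin to every pair of sequences.
    A plain loop stands in for the grid-side workload distribution."""
    return {
        (x, y): plugin(sequences[x], sequences[y])
        for x, y in combinations(sorted(sequences), 2)
    }
```

Swapping the scoring algorithm then means passing a different function, for example `alignment_component(data, needleman_wunsch)` versus `alignment_component(data, best_rotation_score)`, while the pairing and distribution logic stays untouched.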

Copyright information

© 2008 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Lüdeking, P., Dünnweber, J., Gorlatch, S. (2008). A Higher-Order Component for Efficient Genome Processing in the Grid. In: Making Grids Work. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-78448-9_28

  • DOI: https://doi.org/10.1007/978-0-387-78448-9_28

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-78447-2

  • Online ISBN: 978-0-387-78448-9

  • eBook Packages: Computer Science, Computer Science (R0)
