Many-Core Processor Bioinformatics and Next-Generation Sequencing

Esteban, Francisco J.; Díaz, David; Hernández, Pilar; Caballero, Juan Antonio; Dorado, Gabriel; Gálvez, Sergio

doi:10.1007/978-3-642-32304-1_15

Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 82)

Included in the following conference series:

International Conference on IT Revolutions

Abstract

New massive DNA sequencing methods demand both computer hardware and bioinformatics software capable of handling huge amounts of data. This paper shows how many-core processors (in which each core can run a whole operating system) can be exploited to address problems that previously required expensive supercomputers. Needleman-Wunsch/Smith-Waterman pairwise alignments of long DNA sequences (>100 kb) are described, including their implications for progressive multiple alignments. Likewise, the assembly algorithms used to generate contigs in sequencing projects (which therefore work with short sequences) and the future of peptide (protein) folding computing methods are described. The study also integrates the latest trends in many-core processors and their applications in the field of bioinformatics.
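
As an illustration of the pairwise alignment problem named in the abstract, the following is a minimal sketch of Needleman-Wunsch global alignment scoring. It is not the authors' many-core implementation: the function name, the scoring values (match, mismatch, gap) and the two-row layout are assumptions chosen for brevity. Keeping only two rows of the dynamic-programming matrix gives memory linear in the length of the second sequence, which is what makes sequences beyond 100 kb tractable at all; recovering the alignment itself would need a further technique such as Hirschberg's.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the global alignment score of a and b (linear gap penalty).

    The scoring values are illustrative assumptions, not taken from the paper.
    """
    # prev holds row i-1 of the dynamic-programming matrix;
    # row 0 corresponds to aligning a prefix of b against gaps only.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0 corresponds to aligning a prefix of a against gaps only.
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            )
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(needleman_wunsch_score("GATTACA", "GATTACA"))
```

Each cell depends only on its left, upper and upper-left neighbours, so cells on the same anti-diagonal are independent of one another; that property is what parallel implementations typically exploit.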

Author information

Authors and Affiliations

Servicio de Informática, Universidad de Córdoba, Campus Rabanales, 14071, Córdoba, Spain
Francisco J. Esteban
Dep. Lenguajes y Ciencias de la Computación, Universidad de Málaga, Boulevard Louis Pasteur 17, 29071, Málaga, Spain
David Díaz & Sergio Gálvez
Instituto de Agricultura Sostenible (IAS – CSIC), Alameda del Obispo s/n, 14080, Córdoba, Spain
Pilar Hernández
Dep. Estadística, Universidad de Córdoba, Campus Rabanales, 14071, Córdoba, Spain
Juan Antonio Caballero
Dep. Bioquímica y Biología Molecular, Universidad de Córdoba, Campus Rabanales C6-1-E17, Campus de Excelencia Internacional Agroalimentario (ceiA3), 14071, Córdoba, Spain
Gabriel Dorado

Editor information

Editors and Affiliations

Universidad de Córdoba, Spain
Matías Liñán Reyes, José M. Flores Arias, Francisco J. Bellido Outeiriño & Antonio Moreno-Muñoz
University of Cádiz, Spain
Juan J. González de la Rosa
Fachhochschule Hagenberg, Austria
Josef Langer

About this paper

Cite this paper

Esteban, F.J., Díaz, D., Hernández, P., Caballero, J.A., Dorado, G., Gálvez, S. (2012). Many-Core Processor Bioinformatics and Next-Generation Sequencing. In: Liñán Reyes, M., Flores Arias, J.M., González de la Rosa, J.J., Langer, J., Bellido Outeiriño, F.J., Moreno-Muñoz, A. (eds) IT Revolutions. IT Revolutions 2011. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 82. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32304-1_15

DOI: https://doi.org/10.1007/978-3-642-32304-1_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32303-4
Online ISBN: 978-3-642-32304-1
eBook Packages: Computer Science, Computer Science (R0)
