Parallel Computation on Large-Scale DNA Sequences

Part of the book series: EAI/Springer Innovations in Communication and Computing (EAISICC)

Abstract

With the advent of next-generation DNA sequencing technology, the field of bioinformatics and computational biology is becoming increasingly complex and computationally intensive. The bioinformatics community faces the challenge of finding suitable methods to meet growing computational demands, such as processing massive volumes of DNA sequences. Such methods can be found in the field of high-performance computing, through parallel processing. In this paper, we propose a parallel approach built on top of a modified vector space model (VSM). The proposed method parallelizes computation across the available processing cores in order to minimize computation time and support the analysis of a large number of DNA sequences.
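The abstract does not describe the modified VSM itself, so the following is only a minimal sketch of the general idea, under stated assumptions: each DNA sequence is represented as a vector of overlapping k-mer counts, similarity to a query is measured by cosine similarity, and the per-sequence scoring is distributed over worker processes. The function names (`kmer_vector`, `cosine`, `score`, `rank_sequences`), the choice of k-mers as terms, and the parameters `k`, `workers`, and `executor_cls` are illustrative assumptions, not the authors' method.

```python
from collections import Counter
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from functools import partial
from math import sqrt


def kmer_vector(seq, k=3):
    """Count overlapping k-mers; the counts act as a sparse term vector (assumed representation)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def score(query_vec, seq, k=3):
    """Worker task: similarity of one database sequence to the query vector."""
    return cosine(query_vec, kmer_vector(seq, k))


def rank_sequences(query, database, k=3, workers=2,
                   executor_cls=ProcessPoolExecutor):
    """Score every database sequence in parallel and rank by similarity.

    executor_cls is a hypothetical knob: ProcessPoolExecutor spreads the work
    over CPU cores; ThreadPoolExecutor is convenient for small quick checks.
    """
    query_vec = kmer_vector(query, k)
    with executor_cls(max_workers=workers) as pool:
        # pool.map preserves input order, so scores line up with database.
        scores = list(pool.map(partial(score, query_vec, k=k), database))
    return sorted(zip(scores, database), reverse=True)
```

Calling `rank_sequences("ACGTACGT", ["ACGTACGT", "TTTTTTTT", "ACGTTCGT"])` returns `(score, sequence)` pairs ordered from most to least similar. When the process-based executor is used from a script, the call should sit under an `if __name__ == "__main__":` guard so that worker processes can import the module safely.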



Author information

Corresponding author

Correspondence to Mukhtaj Khan.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Majid, A., Khan, M., Khan, M., Ahmad, J., Li, M., Paracha, R.Z. (2019). Parallel Computation on Large-Scale DNA Sequences. In: Khan, F., Jan, M., Alam, M. (eds) Applications of Intelligent Technologies in Healthcare. EAI/Springer Innovations in Communication and Computing. Springer, Cham. https://doi.org/10.1007/978-3-319-96139-2_6

  • DOI: https://doi.org/10.1007/978-3-319-96139-2_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-96138-5

  • Online ISBN: 978-3-319-96139-2

  • eBook Packages: Engineering, Engineering (R0)
