Deep Learning Approach for Pathogen Detection Through Shotgun Metagenomics Sequence Classification

  • Ying-Feng Hsu
  • Makiko Ito
  • Takumi Maruyama
  • Morito Matsuoka
  • Nicolas Jung
  • Yuki Matsumoto
  • Daisuke Motooka
  • Shota Nakamura
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11526)

Abstract

Studies have shown that shotgun metagenomics sequencing facilitates the evaluation of diverse viruses, bacteria, and eukaryotic microbes and assists in exploring their abundances in complex samples. Because of the large number of sequences to process and the overall computational complexity, analyzing these data through traditional database sequence comparison approaches is time-consuming. Deep learning has been widely used to solve many classification problems, including those in the bioinformatics field, and has demonstrated its accuracy and efficiency in analyzing large-scale datasets. The purpose of this work is to explore how a long short-term memory (LSTM) network can be used to learn sequential genome patterns for pathogen detection from metagenome data. Our experimental results show that this approach achieves accuracy similar to that of the conventional BLAST method while running about 36 times faster.

Keywords

  • Shotgun metagenomics sequencing
  • Sequence classification
  • Deep learning
  • LSTM
  • GPU acceleration
  • Parallel computing

Notes

Acknowledgments

The authors are members of Fujitsu next generation Cloud Research Alliance Laboratory (FCRAL). This research and development work was partially supported by the MIC/SCOPE #172107106 and by Fujitsu Ltd.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Cybermedia Center, Osaka University, Osaka, Japan
  2. Fujitsu Laboratories Ltd., Kanagawa, Japan
  3. Fujitsu Limited, Tokyo, Japan
  4. Genome Information Research Center, Research Institute for Microbial Diseases, Osaka University, Osaka, Japan
