Deep Learning Approach for Pathogen Detection Through Shotgun Metagenomics Sequence Classification

  • Ying-Feng Hsu
  • Makiko Ito
  • Takumi Maruyama
  • Morito Matsuoka
  • Nicolas Jung
  • Yuki Matsumoto
  • Daisuke Motooka
  • Shota Nakamura
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11526)


Studies have shown that shotgun metagenomics sequencing facilitates the evaluation of diverse viruses, bacteria, and eukaryotic microbes and helps in exploring their abundances in complex samples. Because of the large number of sequences involved and the overall computational complexity, analyzing these data with traditional database sequence comparison approaches is time-consuming. Deep learning has been widely used to solve many classification problems, including those in the bioinformatics field, and has proven accurate and efficient for analyzing large-scale datasets. The purpose of this work is to explore how a long short-term memory (LSTM) network can learn sequential genome patterns for pathogen detection from metagenome data. Our experimental results show that this approach achieves accuracy similar to that of the conventional BLAST method while running about 36 times faster.


Shotgun metagenomics sequencing · Sequence classification · Deep learning · LSTM · GPU acceleration · Parallel computing



The authors are members of the Fujitsu Next Generation Cloud Research Alliance Laboratory (FCRAL). This research and development work was partially supported by MIC/SCOPE #172107106 and by Fujitsu Ltd.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Cybermedia Center, Osaka University, Osaka, Japan
  2. Fujitsu Laboratories Ltd., Kanagawa, Japan
  3. Fujitsu Limited, Tokyo, Japan
  4. Genome Information Research Center, Research Institute for Microbial Diseases, Osaka University, Osaka, Japan
