Discovering Trends in Environmental Time-Series with Supervised Classification of Metatranscriptomic Reads and Empirical Mode Decomposition

  • Enzo Acerbi
  • Caroline Chénard
  • Stephan C. Schuster
  • Federico M. Lauro
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1024)


In metagenomic and metatranscriptomic studies, reads are typically assigned to taxonomic bins by sequence-similarity or phylogeny-based approaches. Such methods become less effective when the sequences are closely related and/or short. Here, we propose an approach for multi-class supervised classification of short metatranscriptomic reads (100–300 bp) that uses k-mer frequencies as discriminating features. In addition, we take a first step toward addressing the lack of established methods for analysing periodic features in environmental time-series by proposing Empirical Mode Decomposition as a way of extracting information on heterogeneity and population dynamics in natural microbial communities. To demonstrate that our computational approach is an effective tool for generating new biological insights, we applied it to the transcriptional dynamics of viral infection in the ocean. We used data extracted from a previously published metatranscriptome profile of a naturally occurring oceanic bacterial assemblage sampled in a Lagrangian manner over 3 days. We discovered light-dark oscillations in the expression patterns of auxiliary metabolic genes in cyanophages, which follow the harmonic diel transcription of both oxygenic photoautotrophic and heterotrophic members of the community, in agreement with recent findings from other studies. Our proposed methodology can be extended to many other datasets, opening opportunities for a better understanding of the structure and function of microbial communities in their natural environment.
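As a rough illustration of what using k-mer frequencies as discriminating features can look like, the sketch below turns each read into a normalized k-mer frequency vector and assigns a class with a nearest-centroid rule. This is only a minimal sketch under stated assumptions: the abstract does not specify the classifier, the value of k, or the class labels, so the nearest-centroid rule, `k=3`, the toy reads, and the names `kmer_frequencies`, `centroid`, `distance`, and `classify` are all hypothetical stand-ins rather than the authors' method.

```python
from itertools import product
from math import sqrt

ALPHABET = "ACGT"


def kmer_frequencies(read, k=3):
    """Return the normalized k-mer frequency vector of a read (length 4**k)."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    read = read.upper()
    total = 0
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        if kmer in counts:  # skip windows containing N or other ambiguity codes
            counts[kmer] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[kmer] / total for kmer in kmers]


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(read, centroids, k=3):
    """Assign the read to the class whose centroid is closest (stand-in classifier)."""
    features = kmer_frequencies(read, k)
    return min(centroids, key=lambda label: distance(features, centroids[label]))


# Toy training data; labels and sequences are invented for illustration only.
training = {
    "AT_rich": ["ATATATTATAATATTA", "TTATAATATATTAATA"],
    "GC_rich": ["GCGCGGCCGCGGCGCC", "CCGGCGCGCCGGCGGC"],
}
centroids = {
    label: centroid([kmer_frequencies(r) for r in reads])
    for label, reads in training.items()
}
prediction = classify("ATTATATAATTATATA", centroids)
print(prediction)
```

In practice the feature vectors produced this way could be passed to any multi-class supervised learner; the nearest-centroid rule is used here only because it keeps the example dependency-free.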


Empirical mode decomposition, Metatranscriptomics, Metagenomics, Marine microbial ecology, Environmental time-series, Microbial communities, K-mers



The authors would like to acknowledge financial support from Singapore’s Ministry of Education Academic Research Fund Tier 3 under the research grant MOE2013-T3-1-013, Singapore’s National Research Foundation under its Marine Science Research and Development Programme (Award No. MSRDP-P13) and the Singapore Centre for Environmental Life Sciences Engineering (SCELSE), whose research is supported by the National Research Foundation Singapore, Ministry of Education, Nanyang Technological University and National University of Singapore, under its Research Centre of Excellence Program. The authors would like to thank Fabio Stella, Rohan Williams and James Houghton for their valuable feedback.

Supplementary material 1 (zip, 2891 KB)



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Enzo Acerbi (1)
  • Caroline Chénard (2)
  • Stephan C. Schuster (1)
  • Federico M. Lauro (1, 2), corresponding author
  1. Singapore Centre for Environmental Life Sciences Engineering (SCELSE), Nanyang Technological University, Jurong West, Singapore
  2. Asian School of the Environment, Nanyang Technological University, Jurong West, Singapore
