Conservation Genetics, Volume 16, Issue 3, pp 513–522

Rare biosphere exploration using high-throughput sequencing: research progress and perspectives



Identifying rare species and mapping their distributions is crucial for understanding natural species distributions and the causes and consequences of accelerating species declines. However, detecting rare species in terrestrial and, especially, aquatic communities, which are typically dominated by numerous microscopic species (i.e. the rare biosphere), represents a formidable technical challenge. Rapid advances in high-throughput sequencing (HTS) technologies have revolutionized biodiversity studies of the rare biosphere and have also stimulated associated debates. Here we summarize research progress, discuss debates and problems, and propose possible solutions and future studies to address these issues. In addition, we provide take-home messages for experimental design and data interpretation when utilizing HTS techniques for rare biosphere exploration in ecology and conservation biology.


Keywords: Biodiversity; Metabarcoding; Next-generation sequencing; Rare species; Type I error; Type II error



This work was supported by the National Natural Science Foundation of China (31272665), the One-Three-Five Program (YSW2013B02) of the Research Center for Eco-Environmental Sciences and the 100-Talent Program of the Chinese Academy of Sciences to A.Z., and by Discovery grants from the Natural Sciences and Engineering Research Council of Canada (NSERC), the NSERC Canadian Aquatic Invasive Species Network (CAISN), and a Canada Research Chair to H.J.M.



Copyright information

© Springer Science+Business Media Dordrecht 2014

Authors and Affiliations

  1. Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing, China
  2. Great Lakes Institute for Environmental Research, University of Windsor, Windsor, Canada
