Genes & Genomics, Volume 40, Issue 3, pp 281–288

Comparison of methods for library construction and short read annotation of shellfish viral metagenomes

  • Hong-Ying Wei
  • Sheng Huang
  • Jiang-Yong Wang
  • Fang Gao
  • Jing-Zhe Jiang
Research Article


The emergence and widespread use of high-throughput sequencing technologies have promoted metagenomic studies of environmental or animal samples. Library construction for metagenome sequencing and annotation of the resulting sequence reads are important steps in such studies and influence the quality of metagenomic data. In this study, we collected marine mollusk samples, including Crassostrea hongkongensis, Chlamys farreri, and Ruditapes philippinarum, from coastal areas in South China. These samples were divided into two batches to compare two library construction methods for shellfish viral metagenomes. Our analysis showed that reverse-transcribing RNA into cDNA and then amplifying it simultaneously with DNA by whole genome amplification (WGA) yielded a larger amount of DNA than using WGA or whole transcriptome amplification (WTA) alone. Moreover, agarose gel extraction produced higher-quality libraries than AMPure bead size selection. However, the latter can also provide good results if combined with adjustment of the filter parameters. This, together with its simplicity, makes it a viable alternative. Finally, we compared three annotation tools (BLAST, DIAMOND, and Taxonomer) and two reference databases (NCBI’s NR and UniProt’s UniRef). Considering the limitations of computing resources and data transfer speed, we propose the use of DIAMOND with UniRef for annotating metagenomic short reads, as it runs quickly while still achieving a good annotation rate. This study may serve as a useful reference for selecting methods for shellfish viral metagenome library construction and read annotation.
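
To make the proposed annotation step concrete, the following is a minimal Python sketch, not the authors' pipeline. It assumes reads are searched with `diamond blastx` against a pre-built UniRef database and that results are written in BLAST-style tabular format. The function names, the e-value cutoff, the thread count, and the example rows (including the subject IDs) are illustrative assumptions and do not come from the study.

```python
import csv
import io


def diamond_blastx_command(reads, db, out, evalue=1e-5, threads=4):
    """Build an argument list for a `diamond blastx` search.

    The e-value cutoff and thread count are placeholder defaults,
    not the parameters used in the study.
    """
    return [
        "diamond", "blastx",
        "--query", reads,          # nucleotide short reads (FASTA/FASTQ)
        "--db", db,                # database built beforehand with `diamond makedb`
        "--out", out,
        "--outfmt", "6",           # BLAST-style tabular output
        "--evalue", str(evalue),
        "--max-target-seqs", "1",  # keep only the best hit per read
        "--threads", str(threads),
    ]


def annotation_rate(tabular_hits, total_reads):
    """Return the fraction of reads with at least one hit.

    `tabular_hits` is an open text handle over tab-separated output
    whose first column is the query (read) identifier.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    annotated = set()
    for row in csv.reader(tabular_hits, delimiter="\t"):
        if row:
            annotated.add(row[0])  # count each read once, however many hits it has
    return len(annotated) / total_reads


# Made-up rows for illustration only: two of four reads have a hit.
example_hits = io.StringIO(
    "read1\tUniRef90_EXAMPLE1\t90.0\t50\t5\t0\t1\t150\t1\t50\t1e-20\t100\n"
    "read1\tUniRef90_EXAMPLE2\t85.0\t50\t7\t0\t1\t150\t1\t50\t1e-15\t90\n"
    "read3\tUniRef90_EXAMPLE3\t70.0\t48\t14\t0\t4\t147\t10\t57\t1e-8\t60\n"
)
command = diamond_blastx_command("reads.fasta", "uniref90", "hits.tsv")
rate = annotation_rate(example_hits, total_reads=4)
```

The argument list can be passed to `subprocess.run(command, check=True)` on a machine where DIAMOND is installed. Comparing tools or databases in this way would mean computing `annotation_rate` on each tool's output for the same read set.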


Keywords: Shellfish; Viral metagenome; Whole genome amplification; Size selection; Library construction; Read annotation



We express our thanks to Dr. Wang Rui-Xuan, Dr. Ye Ling-Tong and Dr. Yao Tuo (South China Sea Fisheries Research Institute, Guangzhou, China) for the collection of samples. This work was supported by the “Central Public-interest Scientific Institution Basal Research Fund, CAFS” (2016RC-LX05), the “Earmarked Fund for Modern Agro-industry Technology Research System” (CARS-49), the “Guangdong Province Marine Fishery Development Projects” and the “Guangdong Special Support Program” (00-201620641).

Compliance with ethical standards

Conflict of interest

Hong-Ying Wei, Sheng Huang, Jiang-Yong Wang, Fang Gao, and Jing-Zhe Jiang have each declared that no conflict of interest exists.

Ethical approval

All animal work was conducted according to relevant national and international guidelines. The Academic Committee of the South China Sea Fisheries Research Institute approved this research.



Copyright information

© The Genetics Society of Korea and Springer Science+Business Media B.V., part of Springer Nature 2017

Authors and Affiliations

  • Hong-Ying Wei (1, 2)
  • Sheng Huang (1, 2)
  • Jiang-Yong Wang (1)
  • Fang Gao (1, 2)
  • Jing-Zhe Jiang (1)

  1. Key Laboratory of Aquatic Product Processing, Ministry of Agriculture, South China Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Guangzhou, China
  2. Shanghai Ocean University, Shanghai, China
