Abstract
Next-generation sequencing methods provide comprehensive data for structural and functional analysis of the genome. Draft genomes with a low contig number and a high N50 value can give insight into genome structure and provide information for genome annotation. In this study, we designed a pipeline for assembling prokaryotic draft genomes with a low number of contigs and a high N50 value. We aimed to combine two de novo assembly tools (SPAdes and IDBA-Hybrid) and to evaluate the impact of this approach on the quality metrics of the assemblies. The pipeline was tested on raw short-read sequence data (< 300 bp) for a total of 10 species from four different genera. To obtain the final draft genomes, we first assembled the reads using SPAdes and used the 16S rRNA sequence extracted from this assembly to identify a closely related organism. The IDBA-Hybrid assembler was then used to produce a second assembly, guided by the genome of the closely related organism. Finally, SPAdes was run again using the second assembly, produced by IDBA-Hybrid, as a hint. The results were evaluated using QUAST and BUSCO. The pipeline reduced the number of contigs and increased the N50 values of the draft genome assemblies while preserving the coverage of the draft genomes.
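The two headline metrics, contig number and N50, can be illustrated with a minimal sketch. It uses the common definition of N50 (the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length). The function names and the contig lengths below are hypothetical, chosen only for illustration. They are not data from this study, and this is not how QUAST computes its report.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


def summarize(contig_lengths):
    """Collect contig count, total length, and N50 in one dictionary."""
    return {
        "contigs": len(contig_lengths),
        "total_length": sum(contig_lengths),
        "n50": n50(contig_lengths),
    }


# Hypothetical contig lengths (not taken from the study): the same total
# length split into many short contigs versus a few long ones.
fragmented = [400, 300, 200, 100, 100, 100]
merged = [700, 300, 200]

print(summarize(fragmented))
print(summarize(merged))
```

In this toy comparison both assemblies have the same total length, but the second has fewer contigs and a higher N50, which is the kind of improvement the pipeline aims for.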
Author information
Contributions
Uğur Çabuk contributed to study conception and design, data analysis, evaluation of results, and writing/editing the manuscript. Ercan Selçuk Ünlü contributed to study conception and design, mentoring Uğur Çabuk throughout the data analysis, evaluation of results, and writing/editing the manuscript.
Ethics declarations
Ethics approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Conflict of interest
The authors declare no competing interests.
Cite this article
Çabuk, U., Ünlü, E.S. A combined de novo assembly approach increases the quality of prokaryotic draft genomes. Folia Microbiol 67, 801–810 (2022). https://doi.org/10.1007/s12223-022-00980-7
DOI: https://doi.org/10.1007/s12223-022-00980-7