Comparison of Genome Sequencing Technology and Assembly Methods for the Analysis of a GC-Rich Bacterial Genome
Improvements in technology and decreases in price have made de novo bacterial genomic sequencing a reality for many researchers, but they have also created a need to evaluate the methods for generating a complete and accurate genome assembly. We sequenced the GC-rich Caulobacter henricii genome using the Illumina MiSeq, Roche 454, and Pacific Biosciences RS II sequencing systems. To generate a complete genome sequence, we performed assemblies using eight readily available programs and found that builds using the Illumina MiSeq and the Roche 454 data produced accurate yet numerous contigs. SPAdes performed best, followed by PANDAseq. In contrast, the Celera assembler produced a single genomic contig using the Pacific Biosciences data after error correction with the Illumina MiSeq data. In addition, we reproduced this build using the Pacific Biosciences data with HGAP2.0. The accuracy of these builds was verified by pulsed-field gel electrophoresis of genomic DNA cut with restriction enzymes.
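
The logic of the pulsed-field gel check can be illustrated with an in silico digest: if an assembly is correct, cutting its sequence at every recognition site should predict fragment sizes that match the bands seen on the gel. The following is a minimal sketch under stated assumptions, not the procedure used in the study. The function name `circular_digest_fragments`, the toy sequence, and the choice of `GAATTC` as the recognition site are all hypothetical; the abstract does not name the enzymes used. The sketch also assumes a circular chromosome and a palindromic recognition site, so only one strand is searched.

```python
from typing import List


def circular_digest_fragments(genome: str, site: str, cut_offset: int = 0) -> List[int]:
    """Predict fragment lengths (bp) from digesting a circular sequence.

    Assumptions (illustrative only): the molecule is circular and the
    recognition site is palindromic, so a forward-strand search finds
    every site. `cut_offset` is the cut position within the site.
    """
    genome = genome.upper()
    site = site.upper()
    n = len(genome)
    if n == 0 or not site:
        return []

    # Append the first len(site) - 1 bases so that a site spanning the
    # origin of the circular sequence is still found.
    extended = genome + genome[: len(site) - 1]

    cuts = set()
    start = extended.find(site)
    while start != -1:
        cuts.add((start + cut_offset) % n)
        start = extended.find(site, start + 1)

    if not cuts:
        return [n]  # no site: the circle stays intact

    ordered = sorted(cuts)
    fragments = []
    for i, cut in enumerate(ordered):
        next_cut = ordered[(i + 1) % len(ordered)]
        # Distance to the next cut, wrapping around the origin.
        # A single cut linearizes the circle into one full-length fragment.
        fragments.append((next_cut - cut) % n or n)
    return sorted(fragments, reverse=True)


if __name__ == "__main__":
    # Toy 22 bp "genome" with two GAATTC sites; cut after the first base of the site.
    toy = "AAGAATTCAAAAAAGAATTCAA"
    print(circular_digest_fragments(toy, "GAATTC", cut_offset=1))  # [12, 10]
```

In practice the predicted sizes from each candidate assembly would be compared against the observed band sizes; an assembly with a misjoin or a collapsed repeat would generally predict a fragment pattern that disagrees with the gel.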