Abstract
Whole-genome sequence (WGS) analysis provides the best resolution for reconstructing bacterial phylogeny. However, the resulting tree can vary with the parameters used in the WGS pipeline, making it difficult to compare results across studies. This study compares the effects on phylogenies of applying different parameter stringencies. To optimize parameters, we used as the study model strains of Mycobacteroides abscessus serially isolated at various intervals, isolates known to represent persistent infection (PI) or re-infection (RI) cases, and isolates from different subspecies. Between paired strains isolated 1 day apart from PI cases, un-optimized, low-stringency parameters yielded an excessive number of SNPs (823) compared with the optimized setting (3 SNPs); they also produced discordant tree topology and misclassified subspecies and instances of RI. We demonstrated that using high-quality variants improves accuracy in distinguishing serial isolates of the same clone from different clones and in phylogenetic analysis of M. abscessus. Our approach might serve as a model for analyses requiring phylogenetic reconstruction of other bacteria.
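The SNP counts between paired isolates quoted in the abstract are pairwise distances over an alignment of variant sites. The following is a minimal Python sketch of that calculation, not the authors' pipeline: the function names, the isolate labels and the toy sequences are all hypothetical, and positions with a missing call (N or a gap) are simply skipped.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b, missing="N-"):
    """Count aligned positions where both isolates have a call and the calls differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and a not in missing and b not in missing
    )


def pairwise_distances(alignment):
    """Return {(name_x, name_y): SNP distance} for every pair of isolates."""
    return {
        (x, y): snp_distance(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    }


# Hypothetical toy alignment; labels and sequences are invented for illustration.
toy = {
    "patient1_day0": "ACGTACGTAC",
    "patient1_day1": "ACGTACGTAT",
    "patient2_day0": "TCGAACNTGC",
}
distances = pairwise_distances(toy)
for pair, dist in distances.items():
    print(pair, dist)
```

In a comparison like the one described above, the same distance function would be run on the variant sets produced by each parameter setting, so that the distance between isolates expected to be the same clone can be compared across settings.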
Data availability
The datasets generated and/or analyzed during the current study are available in the NCBI repository, containing 69 biosamples under the bioproject Accession no. PRJNA523980.
Acknowledgements
We would like to acknowledge Prof. David Blair for editing the manuscript via Publication Clinic KKU, Thailand.
Funding
Thailand and General Supportive Grant (KKU 6200039), Khon Kaen University, Thailand 2019; the National University of Singapore Start-Up Grant (to RTHO) and the Royal Golden Jubilee (RGJ)-Ph.D. program Grant (PHD/0115/2559) of the Thailand Research Fund (TRF). These funding sources had no role in the design of the study; in the collection, analysis or interpretation of data; or in writing the manuscript.
Author information
Contributions
KF conceived the ideas and designed the methodology; ROTH managed whole-genome sequencing and provided raw sequence data; KF and ROTH supervised the research assistant (OK); OK, KF, ROTH and ST curated data; OK and KF analyzed data; KF, ROTH, ST and OK interpreted the results and wrote the manuscript; KF, ROTH and ST edited the manuscript. All authors contributed critically to the drafts and gave final approval for publication.
Ethics declarations
Conflict of interest
The authors declare that there are no conflicts of interest.
Ethical approval
This study was approved by Khon Kaen University Ethics Committee for Human Research (EC.No. HE591454).
Additional information
Communicated by Erko Stackebrandt.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Below is the link to the electronic supplementary material.
203_2022_2792_MOESM1_ESM.tif
Fig. S1. Number of variants called using different base-quality scores (BaseQ; Q). As BaseQ values were increased from Q10 to Q50, the number of variants called decreased. The pattern differed somewhat between the two subspecies: M. abscessus subsp. abscessus (MAB) and M. abscessus subsp. massiliense (MMAS). BaseQ values Q10 to Q40 provided a distinct SNP set differentiating between MAB and MMAS, but Q50 did not. Hence, Q30 and Q40 were selected for further analysis in the optimization process. (TIF 466 kb)
203_2022_2792_MOESM2_ESM.tif
Fig. S2. Optimization of mapping-quality (C) and depth-of-coverage (d) values. A. Numbers of variants called using different mapping-quality (C) and depth-of-coverage (d) values based on Q30. Q30C40 was the lowest stringency that could differentiate between MMAS and MAB and was selected for further optimization in the next step. Increasing the d parameter value decreased the number of variants called. However, the optimal value of d was unknown at this step and was adjusted in the next step. B. Numbers of variants called using different mapping-quality (C) and depth-of-coverage (d) values based on Q40. Q40, when combined with certain C and d parameters, provided too few variants and could not differentiate between the two subspecies. (TIF 894 kb)
203_2022_2792_MOESM3_ESM.tif
Fig. S3. Adjustment of SNP quality (QSNPs). SNP quality was optimized using parameter values ranging from QSNP10 to 50. QSNP40 and 50 were too stringent and dramatically decreased the number of variants called. QSNP10 to 30 provided enough variants to differentiate between the two subspecies. Therefore, QSNP30 was combined with depth of coverage for adjustment in subsequent steps. (TIF 626 kb)
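As an illustration of how a SNP-quality threshold such as QSNP30 might be combined with a minimum depth of coverage, here is a minimal Python sketch. It is not the authors' pipeline: the Variant record, its field names, the function name and the example calls are hypothetical, and the depth value of 60 is used only as an example cutoff.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal variant record (not a real VCF parser)."""
    position: int
    qual: float   # SNP quality score of the call
    depth: int    # read depth at the site


def filter_variants(variants, min_qual=30.0, min_depth=0):
    """Keep calls whose quality and depth both meet the given thresholds."""
    return [v for v in variants if v.qual >= min_qual and v.depth >= min_depth]


# Invented example calls, for illustration only.
calls = [
    Variant(position=101, qual=12.0, depth=80),
    Variant(position=250, qual=45.0, depth=15),
    Variant(position=977, qual=60.0, depth=120),
]

kept_by_quality = filter_variants(calls, min_qual=30.0)
kept_by_both = filter_variants(calls, min_qual=30.0, min_depth=60)
print([v.position for v in kept_by_quality])
print([v.position for v in kept_by_both])
```

Keeping the two thresholds as separate arguments makes it straightforward to hold the quality cutoff fixed while the depth cutoff is varied, which mirrors the stepwise adjustment described in the caption.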
203_2022_2792_MOESM4_ESM.xlsx
Fig. S4. Optimization of minimum depth of coverage (d). The optimal value for minimum depth of coverage was sought over values ranging from d=0X to d=150X. The number of called variants decreased as d increased but changed little at values above d=60X. Therefore, d=60X was selected as the optimal value for this parameter. (TIF 352 kb)
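The selection rule described in this caption (pick the depth beyond which the variant count changes little) can be sketched as a simple plateau search. This is only one possible formalization, under stated assumptions: the function name, the 5% relative tolerance and the variant counts below are invented for illustration and are not taken from the study.

```python
def first_plateau(depths, counts, rel_tol=0.05):
    """Return the smallest depth from which every later step changes the
    variant count by at most rel_tol (relative to the count before the step)."""
    if len(depths) != len(counts) or len(depths) < 2:
        raise ValueError("need two or more matching depth/count values")
    last_step = len(depths) - 1
    for i in range(last_step):
        stable = all(
            abs(counts[j + 1] - counts[j]) <= rel_tol * max(counts[j], 1)
            for j in range(i, last_step)
        )
        if stable:
            return depths[i]
    # No stable run was found; fall back to the most stringent depth tested.
    return depths[-1]


# Invented counts for illustration; these are NOT the values from Fig. S4.
depths = [0, 30, 60, 90, 120, 150]
counts = [5000, 3200, 2100, 2050, 2020, 2000]
chosen = first_plateau(depths, counts)
print(chosen)
```

The tolerance is the main judgment call in such a rule: a stricter tolerance pushes the chosen depth higher and discards more variants, so it would need to be checked against the ability to separate the two subspecies.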
About this article
Cite this article
Kaewprasert, O., Tongsima, S., Ong, R.TH. et al. Optimized analysis parameters of variant calling for whole genome-based phylogeny of Mycobacteroides abscessus. Arch Microbiol 204, 190 (2022). https://doi.org/10.1007/s00203-022-02792-2
DOI: https://doi.org/10.1007/s00203-022-02792-2