Optimized analysis parameters of variant calling for whole genome-based phylogeny of Mycobacteroides abscessus

  • Original Paper
  • Archives of Microbiology

Abstract

Whole-genome sequence (WGS) analysis provides the best resolution for reconstructing bacterial phylogeny. However, the resulting tree can vary according to the parameters used in the WGS pipeline, making it difficult to compare results across studies. This study compares the effects on phylogenies of applying different parameter stringencies. As the study model for parameter optimization, we used strains of Mycobacteroides abscessus serially isolated at various intervals, isolates known to represent persistent infection (PI) or re-infection (RI) cases, and isolates from different subspecies. Between paired strains isolated 1 day apart from PI cases, un-optimized low-stringency parameters yielded an excessive number of SNPs (823) compared with the optimized setting (3 SNPs); they also produced discordant tree topology and misclassified subspecies and instances of RI. We demonstrated that using high-quality variants gives greater accuracy in distinguishing serial isolates of the same clone from different clones and in phylogenetic analysis of M. abscessus. Our approach might serve as a model for analyses requiring phylogenetic reconstruction of other bacteria.
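The pairwise SNP counts quoted in the abstract (823 versus 3 between isolates taken 1 day apart) can be illustrated with a minimal Python sketch. It assumes each isolate's variant calls have already been reduced to a mapping from reference position to called base; the function name `snp_distance`, the data layout and the toy positions are hypothetical and are not taken from the study's pipeline.

```python
from typing import Dict

# Hypothetical layout: reference position -> alternate base called at that position.
CallSet = Dict[int, str]


def snp_distance(a: CallSet, b: CallSet) -> int:
    """Count reference positions at which two isolates' calls differ.

    A position called in only one isolate counts as a difference,
    because the other isolate is assumed to carry the reference base there.
    """
    positions = set(a) | set(b)
    return sum(1 for pos in positions if a.get(pos) != b.get(pos))


# Toy call sets (illustrative only, not data from the paper).
isolate_day0 = {1200: "A", 45210: "T", 98000: "G"}
isolate_day1 = {1200: "A", 45210: "T", 130500: "C"}

print(snp_distance(isolate_day0, isolate_day1))  # 2
```

Under this view, low-stringency calling inflates the distance by admitting unreliable calls at positions where the two isolates do not truly differ, which is how serial isolates of one clone can come to resemble different clones.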

Data availability

The datasets generated and/or analyzed during the current study are available in the NCBI repository as 69 BioSamples under BioProject accession no. PRJNA523980.

Acknowledgements

We would like to acknowledge Prof. David Blair for editing the manuscript via Publication Clinic KKU, Thailand.

Funding

Thailand and General Supportive Grant (KKU 6200039), Khon Kaen University, Thailand 2019; the National University of Singapore Start-Up Grant (to RTHO) and the Royal Golden Jubilee (RGJ)-Ph.D. program Grant (PHD/0115/2559) of the Thailand Research Fund (TRF). These funding sources had no role in the design of the study, in the collection, analysis or interpretation of data, or in writing the manuscript.

Author information

Contributions

KF conceived the ideas and designed the methodology; RTHO managed whole-genome sequencing and provided raw sequence data; KF and RTHO supervised the research assistant (OK); OK, KF, RTHO and ST curated data; OK and KF analyzed data; KF, RTHO, ST and OK interpreted the results and wrote the manuscript; KF, RTHO and ST edited the manuscript. All authors contributed critically to the drafts and gave final approval for publication.

Corresponding author

Correspondence to Kiatichai Faksri.

Ethics declarations

Conflict of interest

The authors declare that there are no conflicts of interest.

Ethical approval

This study was approved by Khon Kaen University Ethics Committee for Human Research (EC.No. HE591454).

Additional information

Communicated by Erko Stackebrandt.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

203_2022_2792_MOESM1_ESM.tif

Fig. S1. Number of variants called using different base-quality scores (BaseQ; Q). As BaseQ values were increased from Q10 to Q50, the number of variants called decreased. The pattern differed somewhat between the two subspecies, M. abscessus subsp. abscessus (MAB) and M. abscessus subsp. massiliense (MMAS). BaseQ values Q10 to Q40 provided a distinct SNP set differentiating between MAB and MMAS, but Q50 did not. Hence, Q30 and Q40 were selected for further analysis in the optimization process. (TIF 466 kb)

203_2022_2792_MOESM2_ESM.tif

Fig. S2. Optimization of mapping-quality (C) and depth-of-coverage (d) values. A. Numbers of variants called using different mapping-quality (C) and depth-of-coverage (d) values based on Q30. Q30C40 was the lowest stringency that could differentiate between MMAS and MAB and was selected for further optimization in the next step. Increasing the d parameter value decreased the number of variants called. However, the optimal value of d was unknown at this step and was adjusted in the next step. B. Numbers of variants called using different mapping-quality (C) and depth-of-coverage (d) values based on Q40. Q40, when combined with certain C and d parameters, provided too few variants and could not differentiate between the two subspecies. (TIF 894 kb)

203_2022_2792_MOESM3_ESM.tif

Fig. S3. Adjustment of SNP quality (QSNPs). SNP quality was optimized using parameter values ranging from QSNP10 to 50. QSNP40 and 50 were too stringent and dramatically decreased the number of variants called. QSNP10 to 30 provided enough variants to differentiate between the two subspecies. Therefore, QSNP30 was combined with depth of coverage for adjustment in subsequent steps. (TIF 626 kb)

203_2022_2792_MOESM4_ESM.xlsx

Fig. S4. Optimization of minimum depth of coverage (d). The optimal value for minimum depth of coverage was found using values ranging from d=0X to d=150X. The number of called variants decreased as d increased but changed little at values above d=60. Therefore, d=60 was selected as the optimal value for this parameter. (TIF 352 kb)
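The last two selection steps described in these captions (QSNP30 and d=60) amount to a per-variant threshold filter. The following is a minimal Python sketch of such a filter, not the authors' pipeline: the `Variant` record, the `filter_variants` function and the toy records are hypothetical, and treating the thresholds as inclusive is an assumption. The base-quality (Q30) and mapping-quality (C40) settings act on reads before variants are called and are not modeled here.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal variant record (not a full VCF line)."""
    pos: int      # reference position
    ref: str      # reference base
    alt: str      # called alternate base
    qual: float   # SNP quality (QSNP)
    depth: int    # depth of coverage at the site (d)


def filter_variants(
    variants: Iterable[Variant],
    min_qual: float = 30.0,   # QSNP30, as selected in Fig. S3
    min_depth: int = 60,      # d=60, as selected in Fig. S4
) -> List[Variant]:
    """Keep variants meeting both thresholds (inclusive bounds are an assumption)."""
    return [v for v in variants if v.qual >= min_qual and v.depth >= min_depth]


# Toy records (illustrative only, not data from the paper).
calls = [
    Variant(1200, "G", "A", qual=48.0, depth=95),   # passes both thresholds
    Variant(45210, "C", "T", qual=12.0, depth=110), # fails SNP quality
    Variant(98000, "A", "G", qual=55.0, depth=20),  # fails depth of coverage
]

kept = filter_variants(calls)
print([v.pos for v in kept])  # [1200]
```

Re-running the same filter over a range of `min_depth` values and counting the survivors would reproduce the kind of sweep summarized in Fig. S4.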

Supplementary file5 (XLSX 61 kb)

Supplementary file6 (DOCX 55 kb)

About this article

Cite this article

Kaewprasert, O., Tongsima, S., Ong, R.TH. et al. Optimized analysis parameters of variant calling for whole genome-based phylogeny of Mycobacteroides abscessus. Arch Microbiol 204, 190 (2022). https://doi.org/10.1007/s00203-022-02792-2

  • DOI: https://doi.org/10.1007/s00203-022-02792-2
