Ensemble-Based Somatic Mutation Calling in Cancer Genomes

  • Weitai Huang
  • Yu Amanda Guo
  • Mei Mei Chang
  • Anders Jacobsen Skanderup
Protocol
Part of the Methods in Molecular Biology book series (MIMB, volume 2120)

Abstract

Identification of somatic mutations in tumor tissue is complicated by technical artifacts, diverse somatic mutational processes, and genetic heterogeneity within tumors. Indeed, recent independent benchmark studies have revealed low concordance between different somatic mutation callers. Here, we describe the Somatic Mutation calling method using a Random Forest (SMuRF), a portable ensemble method that combines the predictions and auxiliary features from individual mutation callers using supervised machine learning. SMuRF improves prediction accuracy for both somatic point mutations (single nucleotide variants; SNVs) and small insertions/deletions (indels) in cancer genomes and exomes. We present the method and provide a tutorial on installing and applying SMuRF.
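
The ensemble idea in the abstract can be illustrated with a minimal sketch. This is not the SMuRF implementation: it is a pure-Python bagged ensemble of one-split decision stumps, a simplified stand-in for a random forest. The caller names (`caller_a` to `caller_d`), the auxiliary features (`tumor_vaf`, `mean_mapq`), and the training rows are all hypothetical, hand-made values chosen only to show how binary caller predictions and auxiliary features might be combined into a single somatic score.

```python
import random

# Toy, hand-made training set (NOT real data). Each row is one candidate site:
# [caller_a, caller_b, caller_c, caller_d, tumor_vaf, mean_mapq]
# The four caller columns are 0/1 calls; the last two are auxiliary features.
TRAIN_ROWS = [
    [1, 1, 1, 1, 0.35, 60], [1, 1, 0, 1, 0.22, 58], [1, 0, 1, 1, 0.18, 57],
    [0, 1, 1, 1, 0.27, 59], [1, 1, 1, 0, 0.41, 60], [1, 1, 1, 1, 0.15, 55],
    [0, 0, 0, 0, 0.02, 30], [1, 0, 0, 0, 0.04, 42], [0, 1, 0, 0, 0.03, 25],
    [0, 0, 1, 0, 0.05, 38], [0, 0, 0, 1, 0.06, 45], [0, 0, 0, 0, 0.01, 20],
]
TRAIN_LABELS = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # 1 = true somatic (toy)


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def majority(labels):
    """Majority class of 0/1 labels (ties go to 1)."""
    return 1 if 2 * sum(labels) >= len(labels) else 0


def fit_stump(rows, labels, features):
    """Return (feature, threshold, left_label, right_label) for the split
    with the lowest weighted Gini impurity among the given features."""
    best = None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t, majority(left), majority(right))
    if best is None:  # no usable split: always predict the majority class
        m = majority(labels)
        return (features[0], float("inf"), m, m)
    return best[1:]


def fit_forest(rows, labels, n_trees=51, n_features=2, seed=0):
    """Train stumps on bootstrap samples, each seeing a random feature subset."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feats = rng.sample(range(len(rows[0])), n_features)
        forest.append(
            fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats)
        )
    return forest


def somatic_score(forest, row):
    """Fraction of stumps that vote 'somatic' for one candidate site."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return sum(votes) / len(votes)


forest = fit_forest(TRAIN_ROWS, TRAIN_LABELS)
# A site supported by all callers vs. a site supported by none.
print(somatic_score(forest, [1, 1, 1, 1, 0.30, 60]))
print(somatic_score(forest, [0, 0, 0, 0, 0.01, 20]))
```

A site would then be reported as somatic when its score exceeds a chosen cutoff. In practice a library random forest with full-depth trees trained on validated calls would replace this toy ensemble.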

Key words

Somatic mutation calling, Next-generation sequencing


Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2020

Authors and Affiliations

  • Weitai Huang (1, 2), corresponding author
  • Yu Amanda Guo (1)
  • Mei Mei Chang (1)
  • Anders Jacobsen Skanderup (1)
  1. Computational and Systems Biology 3, Genome Institute of Singapore, A∗STAR (Agency for Science, Technology and Research), Singapore, Singapore
  2. National University of Singapore Graduate School for Integrative Sciences and Engineering, National University of Singapore, Singapore, Singapore
