Ensemble-Based Somatic Mutation Calling in Cancer Genomes
Identification of somatic mutations in tumor tissue is challenged by technical artifacts, diverse somatic mutational processes, and genetic heterogeneity within tumors. Indeed, recent independent benchmark studies have revealed low concordance between different somatic mutation callers. Here, we describe the Somatic Mutation calling method using a Random Forest (SMuRF), a portable ensemble method that combines the predictions and auxiliary features from individual mutation callers using supervised machine learning. SMuRF achieves improved prediction accuracy for both somatic point mutations (single nucleotide variants; SNVs) and small insertions/deletions (indels) in cancer genomes and exomes. We also provide a tutorial on the installation and application of SMuRF.
Key words: Somatic mutation calling, Next-generation sequencing