A Mixture Model for Signature Discovery from Sparse Mutation Data
Mutational signatures and their exposures are key to understanding the processes that shape cancer genomes, with applications to diagnosis and treatment. Yet current signature discovery or refitting approaches are limited to relatively rich mutation data from whole-genome or whole-exome sequencing. Recently, data sets from gene panel sequencing, which are orders of magnitude sparser, have become increasingly available in the clinical setting. Such data typically have fewer than 10 mutations per sample, making them challenging to handle with current approaches. Here we propose a novel mixture model for sparse mutation data. Applied to synthetic sparse data sets and real gene panel sequences, it outperforms current approaches and yields mutational signatures and patient stratifications that agree more closely with the literature.