Background

Cancer genomes carry many different genomic alterations, including somatic mutations. Most notably, driver mutations are causally implicated in oncogenesis and confer a growth advantage on cancer cells [1]. An increasing number of mutations are being characterized to improve treatment selection and prognosis [2]. In non-small cell lung cancer, for example, patients with epidermal growth factor receptor (EGFR) mutations are likely to benefit from tyrosine kinase inhibitors (TKIs) [3]. Conversely, colorectal cancer patients with KRAS or NRAS mutations should not be treated with the EGFR inhibitors cetuximab [4] or panitumumab [5].

Several methods are available for the detection of tumor somatic mutations. Next-generation sequencing (NGS) is widely adopted as a discovery tool for somatic mutation analysis in cancer [6]. Using NGS, The Cancer Genome Atlas (TCGA) presents genomic alterations identified from over 11,000 tumor samples from 33 cancer types, providing valuable insights into new targets for drug development, treatment selection, or even combination therapies for personalized medicine [7]. While NGS methods can analyze hundreds to tens of thousands of mutations [8], methods such as qPCR or digital PCR are ideal for analyzing a few clinically important mutations with high efficiency and cost-effectiveness [9, 10]. However, as biologically and clinically important mutations continue to be identified for many cancer types, there is a growing unmet need to analyze tens to a few hundred mutations cost-effectively and quantitatively in highly heterogeneous cancer tissues, where mutant allele frequency can be highly variable.

The MassARRAY platform, based on automated MALDI-TOF mass spectrometry, is well suited to meet this growing need because of its high multiplex capability, flexibility in both sample size and mutation number, quantification capability, and automation of sample processing and data analysis [11]. The MassARRAY can detect up to 40 mutations in a single well of a 96- or 384-well plate.

In this report, we developed and validated an 8-well panel covering the 299 most common somatic mutations in colorectal cancer (CRC). This level of multiplexing may be sufficient to detect virtually all clinically relevant mutations in a cancer type, with the cost-effectiveness and turnaround time desirable in real-life clinical settings.

Methods

Patient recruitment

Patients diagnosed with colorectal cancer were recruited with informed consent between July 2015 and June 2019 from the First Affiliated Hospital and the Second Affiliated Hospital of Wenzhou Medical University, China. Exclusion criteria were more than two pathological types, metastasis, neoadjuvant chemotherapy or immunotherapy before surgery, or any cancer within the past 5 years. Tumor stages were determined according to the 8th edition of the American Joint Committee on Cancer (AJCC) staging system. The study was approved by the ethics committee of Wenzhou Medical University and its affiliated hospital.

Sample collection and DNA extraction

Resections or biopsies of primary solid tumors and adjacent normal tissues located 2 cm away from the tumor tissue were taken immediately after surgery, snap-frozen in liquid nitrogen, and stored at −80 °C. DNA was extracted from tissue using the QIAamp DNA Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer’s instructions and stored at −20 °C until use. DNA concentration was quantified with a NanoDrop One (Thermo Fisher Scientific, Foster City, CA, USA).

Formalin-fixed paraffin-embedded (FFPE) tumor samples were analyzed by hematoxylin-and-eosin (H&E) staining to determine tumor purity. For each FFPE tissue, twenty 10-μm-thick sections were used for DNA extraction with the QIAamp DNA FFPE Tissue Kit (Qiagen). DNA concentration was determined with a Qubit 3.0 Fluorometer (Thermo Fisher Scientific).

CRC mutation panel design

Mutations were selected if they met any of the following three criteria: 1) present in at least two of the following three sources for CRC samples: TCGA (https://cancergenome.nih.gov/); the International Cancer Genome Consortium (ICGC) database (https://icgc.org/); and the publication by Muzny DM et al. [12]; 2) potential resistance to CRC targeted therapy according to My Cancer Genome (http://www.mycancergenome.org); or 3) reported in a commercial cancer panel [13].

The CRC mutation panel was designed with Assay Designer software (MassARRAY Typer, version 4.0; Agena Bioscience, San Diego, CA, USA), with primer sequences avoiding known single nucleotide polymorphisms whenever possible.

CRC mutation detection and quantification by MALDI-TOF MS

CRC mutation detection was optimized and performed on a MassARRAY Analyzer 4 with CPM (Agena Bioscience). All primers were purchased from Integrated DNA Technologies (Coralville, IA, USA). All reagents were purchased from Agena Bioscience unless otherwise specified. Briefly, tissue genomic DNA was amplified by multiplex PCR. Shrimp alkaline phosphatase treatment was performed to inactivate surplus nucleotides. A primer extension reaction (iPLEX Pro) was performed with mass-modified terminator nucleotides, and the product was spotted on a SpectroCHIP. Wild-type and mutant alleles were then discriminated by the molecular weights determined by the MassARRAY Analyzer.

Allele calls were made with MassARRAY Typer Analyzer software (Typer 4.0.26). In addition, at least two investigators independently reviewed the mass spectra to confirm the automated calls. To estimate mutant allele frequency, the heights of the raw spectral peaks corresponding to the wild-type and mutant alleles were quantified. Mutant allele frequency was estimated as mutant peak height / (mutant peak height + wild-type peak height).
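The frequency estimate described here is a simple ratio of two peak heights. A minimal Python sketch of that calculation follows; the function name `mutant_allele_frequency` and the example peak heights are hypothetical and are not taken from the Typer software or from the study data.

```python
def mutant_allele_frequency(mutant_peak: float, wild_type_peak: float) -> float:
    """Estimate mutant allele frequency from two raw spectral peak heights.

    Implements: mutant peak / (mutant peak + wild-type peak).
    """
    if mutant_peak < 0 or wild_type_peak < 0:
        raise ValueError("peak heights must be non-negative")
    total = mutant_peak + wild_type_peak
    if total == 0:
        raise ValueError("no signal: both peak heights are zero")
    return mutant_peak / total


# Hypothetical peak heights in arbitrary intensity units.
example = mutant_allele_frequency(mutant_peak=12.0, wild_type_peak=36.0)
print(f"Estimated mutant allele frequency: {example:.1%}")
```

Because the estimate uses raw peak heights only, it assumes that the mutant and wild-type extension products ionize with similar efficiency; any correction for unequal response is outside this sketch.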

Targeted NGS and data analysis

Genomic DNA was fragmented to an average size of 200 to 500 bp using a Bioruptor Pico sonication device (Diagenode, Denville, NJ, USA). Sequencing libraries were prepared using the KAPA LTP Library Preparation Kit for Illumina (Kapa Biosystems, Wilmington, MA, USA) according to the manufacturer’s recommendations. Hybridization-based target enrichment was carried out with the xGen Pan-Cancer Panel v2.4 (532 cancer-relevant genes) and the xGen Lockdown Hybridization and Wash Reagents Kit (Integrated DNA Technologies). Captured libraries were amplified and purified using Agencourt AMPure XP Beads (Beckman Coulter, Atlanta, GA, USA). The concentration and quality of the DNA libraries were analyzed with the Agilent High Sensitivity DNA Kit (Agilent Technologies, Santa Clara, CA, USA).

The libraries were paired-end sequenced on the Illumina HiSeq X platform (Illumina, San Diego, CA, USA) according to the manufacturer’s instructions. Sequencing adapters and low-quality bases were trimmed from raw sequencing reads using Trim Galore (v0.4.1; https://github.com/FelixKrueger/TrimGalore). The filtered reads were then mapped to the human reference genome (hg38) using BWA-MEM (v0.7.12; https://github.com/lh3/bwa/tree/master/bwakit) with the default settings. GATK (v4.0.12.0; https://software.broadinstitute.org/gatk/) was used for single nucleotide variation (SNV) identification. SNV calls with a variant allele frequency (VAF) of at least 2.5% were retained and then annotated using ANNOVAR [14].
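The VAF cutoff can be illustrated with a short Python sketch. It assumes that VAF is computed as variant-supporting reads divided by total reads at the site; the `SnvCall` record, the `retain_calls` helper, and the example calls are hypothetical and do not reproduce GATK output or the study pipeline.

```python
from dataclasses import dataclass

MIN_VAF = 0.025  # 2.5% threshold stated in the text


@dataclass
class SnvCall:
    """Hypothetical minimal record for one SNV call."""
    gene: str
    change: str
    alt_reads: int    # reads supporting the variant allele
    total_reads: int  # total reads covering the position

    @property
    def vaf(self) -> float:
        # Assumed definition: variant reads / total reads.
        return self.alt_reads / self.total_reads if self.total_reads else 0.0


def retain_calls(calls, min_vaf=MIN_VAF):
    """Keep only the calls whose VAF is at least min_vaf."""
    return [call for call in calls if call.vaf >= min_vaf]


# Made-up calls at a depth of 1400 reads.
calls = [
    SnvCall("GENE_A", "c.100G>A", alt_reads=70, total_reads=1400),  # 5.0%
    SnvCall("GENE_B", "c.200C>T", alt_reads=14, total_reads=1400),  # 1.0%
    SnvCall("GENE_C", "c.300A>G", alt_reads=35, total_reads=1400),  # 2.5%
]
kept = retain_calls(calls)
print([call.gene for call in kept])
```

The comparison is inclusive, so a call at exactly 2.5% is retained, matching the "at least 2.5%" wording.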

Mutation validation by capillary sequencing

A subset of samples was selected for Sanger sequencing to validate the MS results. Capillary sequencing was performed with the BigDye Terminator Cycle Sequencing Kit and an ABI 3730 Genetic Analyzer (Thermo Fisher Scientific).

Statistical analysis

Statistical analyses were performed with IBM SPSS Statistics 20.0. The χ2 test or Fisher exact test was used to compare baseline categorical variables, and the Kruskal-Wallis test was used to analyze the association between mutation and tumor size. A multivariate Cox proportional-hazards model was used to analyze covariables such as age, gender, clinical stage, location, differentiation, metastasis, treatment after surgery, and tumor size.
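For readers without access to SPSS, the Fisher exact test on a 2 × 2 table can be sketched in a few lines of standard-library Python. This is an illustrative stand-in, not the software used in the study; the function name `fisher_exact_two_sided` and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    tolerance = 1 + 1e-9  # guard against floating-point ties
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * tolerance
    )


# Hypothetical counts: rows = mutation present/absent, columns = two groups.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"two-sided p = {p_value:.4f}")
```

The exact test is usually preferred over the χ2 test when expected cell counts are small, which is the common reason for reporting both.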

Results

Patient information

A total of 300 patients diagnosed with colorectal cancer between July 2015 and June 2019 were recruited. Seventy-one patients were excluded (see exclusion criteria in the Methods section), leaving 229 patients for analysis. Frozen tissues (paired tumor and adjacent normal tissues) from 136 patients, frozen tumor tissues from 47 patients, and formalin-fixed paraffin-embedded (FFPE) tissues (paired tumor and adjacent normal tissues) from 46 patients with CRC were analyzed. Table 1 summarizes the patients’ baseline characteristics, tumor stage, location, differentiation, size, follow-up treatment after surgery, and tumor marker (CEA) level.

Table 1 Patient characteristics

Selection of somatic mutations and multiplex assay design

Somatic mutations for CRC were chosen based on their prevalence and their biological or clinical relevance. The final mutation panel consists of 299 mutations from 109 genes (Additional file 1: Supplementary Table 1).

We designed an 8-well multiplex assay for the panel of 299 mutations (36-plex in wells 1 and 2, 34-plex in wells 3 and 5, 30-plex in well 4, 29-plex in well 6, 26-plex in well 7, and 12-plex in well 8) for MALDI-TOF MS analysis. The sequences of the PCR primers and extension primers are provided in Additional file 1: Supplementary Table 2. Wild-type and mutant alleles were discriminated by the molecular weights of the extension products measured by MALDI-TOF MS (example results are shown in Fig. 1).

Fig. 1

Representative MS results for paired tumor and adjacent normal tissues from frozen and FFPE samples. Panels (a), (b), and (c) show three different mutations in the TP53, APC, and KRAS genes from three frozen tissues; panels (d), (e), and (f) show the mass spectra for three FFPE tissues. Mutant allele frequency can be estimated by comparing the peak signals for the mutant and wild-type alleles. Wt: wild type, Mut: mutation

To determine the sensitivity of the assays, we constructed plasmids for 35 different mutations. Plasmid mutations were confirmed by Sanger sequencing. Mixtures of plasmids containing wild-type and mutant sequences were prepared to mimic samples with 5% and 10% mutation frequencies and were used to analyze the sensitivity and accuracy of the assay. Of the 35 assays, 19 achieved a sensitivity of 5% and 16 achieved a sensitivity of 10% (Fig. 2 and Supplementary Figure 1).
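The text does not describe how the plasmid mixtures were prepared. As a rough illustration only, the following Python sketch computes the stock volumes needed to reach a target mutant fraction, assuming that both stock concentrations are known in copies per microliter. The function name `mixing_volumes` and all numbers are hypothetical.

```python
def mixing_volumes(target_fraction: float, total_copies: float,
                   mutant_conc: float, wild_type_conc: float):
    """Return (mutant_volume, wild_type_volume) in microliters.

    target_fraction: desired mutant copies / total copies (e.g. 0.05)
    total_copies:    total plasmid copies wanted in the mixture
    *_conc:          stock concentrations in copies per microliter
    """
    if not 0.0 <= target_fraction <= 1.0:
        raise ValueError("target_fraction must be between 0 and 1")
    if mutant_conc <= 0 or wild_type_conc <= 0:
        raise ValueError("stock concentrations must be positive")
    mutant_copies = target_fraction * total_copies
    wild_type_copies = total_copies - mutant_copies
    return mutant_copies / mutant_conc, wild_type_copies / wild_type_conc


# Hypothetical stocks at 1e5 copies/uL each, aiming for 1e6 copies in total.
for fraction in (0.05, 0.10):
    v_mut, v_wt = mixing_volumes(fraction, 1e6, 1e5, 1e5)
    print(f"{fraction:.0%} mutant: {v_mut:.2f} uL mutant + {v_wt:.2f} uL wild type")
```

With equal stock concentrations the volume ratio equals the copy ratio, so a 5% mixture is 1 part mutant stock to 19 parts wild-type stock.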

Fig. 2

Evaluation of the performance of the MS assay. DNA mixture samples with 5% and 10% mutation frequencies were prepared to analyze the sensitivity and accuracy of the assay. Of the 35 assays, 19 achieved a sensitivity of 5% and 16 achieved a sensitivity of 10%. Four assays that achieved a sensitivity of 5% are shown here

Mutation profiles of 229 patients from frozen and FFPE tissues

Using the multiplex assay, we analyzed 411 samples from 229 patients (319 frozen tissues from 183 patients and 92 FFPE samples from 46 patients) (Fig. 3). In the frozen tissue cohort, we identified and quantified 52 different somatic mutations in 107 patients. The most frequently mutated genes were KRAS (52 patients, 28.4%), TP53 (48 patients, 26.2%), and APC (32 patients, 17.5%). We chose 20 mutations from 31 patients for Sanger sequencing validation. Sanger sequencing showed complete concordance with MALDI-TOF MS (Additional file 2: Supplementary Figure 2).
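The reported percentages are consistent with the 183 patients of the frozen tissue cohort as the denominator. The short Python check below reproduces them from the patient counts given in the text; the helper name `prevalence_percent` is hypothetical.

```python
COHORT_SIZE = 183  # patients in the frozen tissue cohort

# Patients carrying at least one mutation in each gene, as reported in the text.
patients_with_mutation = {"KRAS": 52, "TP53": 48, "APC": 32}


def prevalence_percent(n_mutated: int, n_total: int) -> float:
    """Percentage of patients with a mutation, rounded to one decimal place."""
    return round(100 * n_mutated / n_total, 1)


prevalence = {
    gene: prevalence_percent(count, COHORT_SIZE)
    for gene, count in patients_with_mutation.items()
}
for gene, percent in prevalence.items():
    print(f"{gene}: {percent}%")
```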

Fig. 3

Mutation profiles quantified by the multiplex CRC panel. Each colored bar represents one mutation (orange) or two mutations in the same gene (blue). The heights of the colored bars represent mutant allele frequencies

In the cohort with 46 paired FFPE tissues (tumor and adjacent normal tissues), we first performed histopathological analysis with H&E staining (Supplementary Figure 3). With the multiplex MS assay, we detected a total of 39 mutations in 35 samples. As in the frozen tissue cohort, APC, TP53, and KRAS were the most frequently mutated genes. We chose 15 different mutations (19 mutations in total) from 13 patients for Sanger sequencing validation (Additional file 2: Supplementary Figure 4). Eighteen of the 19 mutations (95%) were validated successfully. The samples chosen for validation, as well as the sequences of the PCR primers and sequencing primers, are provided in Additional file 1: Supplementary Table 3.

Both frozen and FFPE tissues are commonly used for molecular pathology analysis. Tissue sampling differences may affect the somatic mutation frequencies in individual samples. Of the three genes most commonly mutated in CRC (TP53, APC, and KRAS), two (TP53 and APC) showed higher mutation frequencies in the frozen tissues than in the FFPE tissues. The average TP53 mutation frequencies were 50.9% and 38.3% for the frozen and FFPE tissues, respectively (p = 0.006, Mann-Whitney U test). The average APC mutation frequencies were 47.5% and 26.8% for the frozen and FFPE tissues, respectively (p = 0.005, Mann-Whitney U test).

We next examined the association between the molecular alterations and patient characteristics (Table 2). Among the 142 patients with at least one somatic mutation identified, the 63 patients without a RAS mutation showed a different clinical stage distribution from the 79 patients with a RAS mutation (p = 0.019, χ2 test). TP53 mutations were found more frequently in male patients by univariate analysis (p = 0.002, χ2 test).

Table 2 Patient characteristics in different molecular subgroups

Comparison between targeted NGS and MS assays

We chose 13 patient samples and performed targeted NGS using the xGen Pan-Cancer Panel v2.4, which covers 532 genes. Deep sequencing was performed to an average sequencing depth of 1400×. Among these 13 patients, the MS method identified 11 somatic mutations. Targeted NGS detected 14 mutations, including the 11 mutations detected by MS. Three mutations were identified only by NGS. These three discordant mutations were further analyzed by DNA cloning and capillary sequencing. Two mutations (KRAS_c.G35A, APC_c.C2626T) were verified by cloning and sequencing. One mutation (KRAS_c.G34A) was present at 3% by targeted NGS. We sequenced 37 clones and did not observe any mutant clone.
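When interpreting the clone sequencing result, it can help to ask how likely it is to see no mutant clone at all if the mutation were truly present at the NGS-estimated frequency. The Python sketch below makes that calculation under a strong simplifying assumption: each sequenced clone independently carries the mutant allele with probability equal to the allele frequency, with no cloning or amplification bias. The function name `prob_no_mutant_clones` is hypothetical.

```python
def prob_no_mutant_clones(allele_frequency: float, n_clones: int) -> float:
    """Probability of observing zero mutant clones among n_clones.

    Assumes independent clones, each mutant with probability
    allele_frequency (a simple binomial model).
    """
    if not 0.0 <= allele_frequency <= 1.0:
        raise ValueError("allele_frequency must be between 0 and 1")
    if n_clones < 0:
        raise ValueError("n_clones must be non-negative")
    return (1.0 - allele_frequency) ** n_clones


# Values from the text: 3% allele frequency by NGS, 37 clones sequenced.
p_zero = prob_no_mutant_clones(0.03, 37)
print(f"P(no mutant clone among 37) = {p_zero:.2f}")
```

Under this model the probability is about 0.32, so the absence of a mutant among 37 clones neither confirms nor excludes a mutation present at 3%.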

Discussion

We developed an 8-well multiplexed assay based on automated MALDI-TOF mass spectrometry to detect 299 CRC-related mutations and applied it to 229 patients from southern China.

The 299 selected mutations cover the genes that the National Comprehensive Cancer Network (NCCN) colon cancer guidelines recommend as relevant to treatment and prognosis, such as KRAS, NRAS, and BRAF V600E mutations [15,16,17]. The MALDI-TOF MS assay is sensitive and quantitative in analyzing somatic mutations, as shown by extensive validation with capillary sequencing and by a head-to-head comparison with a targeted NGS panel covering 532 genes. We analyzed both frozen tissues and FFPE samples because both sample types are commonly used for molecular pathology analysis. In our cohorts, the frozen tissues may have contained higher tumor content, as suggested by the higher mutation frequencies for TP53 and APC, which may be due to sampling differences.

NGS-based methods can detect both known and unknown mutations, whereas the MALDI-TOF MS assay is better suited to detecting pre-selected, functionally relevant known mutations. The cost of the multiplex MS assay (about USD 60 per sample) is substantially lower than that of targeted NGS (about USD 150 to 200 per sample). The MS approach is highly automated and supports high-throughput analysis of large numbers of samples, making it suitable as a routine testing platform. The overall time needed for the MS assay is about 9 h, with a hands-on time of about 60 min. The MS panel can also be expanded flexibly to include additional important mutations.

Conclusions

We have developed and validated a highly multiplexed assay for the quantification of 299 somatic mutations in colorectal cancer tissues, offering a tool for studying the biological and clinical significance of somatic mutations in large numbers of cancer tissues. The multiplex assay may also be useful in the clinical management of colorectal cancer patients.