Background

Cervical cancer is the fourth most common malignancy in women worldwide. In November 2020, the WHO launched an initiative to accelerate the elimination of cervical cancer via vaccination, screening, and treatment [1]. Cervical cancer screening, aiming to identify women with pre-invasive dysplastic lesions which can be surgically excised, is one of the most successful personalised cancer prevention strategies to date [2].

Primary HPV testing has been consistently shown to be superior to other screening methods [3]. As a result, most countries are currently changing screening from primary cytology to primary HPV testing, with cytology as the triage for colposcopic assessment of oncogenic HPV-positive (oncHPV+) women [4]. In Europe, cervical cancer screening participation rates vary between 40.5% and 81.4%, and efforts to increase participation to ≥ 85% are essential. Self-collection of cervicovaginal samples may be more widely acceptable than collection of a cervical screening sample by a healthcare professional (HCP) and offers an alternative for individuals who may experience trauma, embarrassment, or pain, thereby increasing attendance [5]. Self-collected and HCP-collected samples show comparable HPV testing results [6], but cytology is not feasible in self-collected samples, so triage would require a follow-up HCP-collected sample. Fewer than 60% of women who provide a self-collected sample attend follow-up appointments [7,8,9], and cytology achieves a sensitivity of ~ 80% but a specificity of only ~ 60% for detection of CIN3+ [10,11,12,13,14,15,16]. An accurate test to triage oncHPV+ women that could be performed on the same original sample could therefore be highly beneficial, reducing loss to triage follow-up and improving detection while limiting false positives.

The feasibility of utilising DNA methylation (DNAme) markers for the detection of pre-invasive or invasive gynaecological cancers has been shown by us [17,18,19,20] and others (reviewed in [21, 22]), including in self-collected samples. As outlined in our recent publication [20] describing the use of a methylation array-based signature for detection of cervical malignancies, premalignancies, and risk of malignancy, the clinical use of DNAme markers to triage oncHPV+ women has so far been impeded for several reasons. These include a lack of sensitivity for detection of CIN3+ in women below age 30, who have a high prevalence of oncHPV [23], a lack of assessment of specificity in young women, and the fact that it has largely been unknown whether DNAme markers are able to outperform cytology as an indicator of future disease risk in oncHPV+ women. Our recent work has addressed these issues: a DNA methylation array-based signature, the WID-CIN, is able to identify women with, or at risk of, cervical malignancies, offers high sensitivity and specificity even below the age of 30, and outperforms cytology as an indicator of future disease risk [20]. The advantage of the array-based WID-CIN signature is the ability to detect, or predict the risk of, other women’s cancers using a single cervical sample and a single assay, as demonstrated recently for breast, ovarian, and endometrial cancer [24, 25] (the latter currently under review).

Despite these clear advantages of an array-based signature, current screening, which uses HCP-collected or self-collected samples, requires a simple and accurate PCR-based test to triage oncHPV+ women.

Here, we therefore aimed to develop a PCR-based triage test capable of both detecting current CIN3+ and identifying future CIN3+, and potentially suitable for use with self-collected samples. Using cervical liquid-based cytology samples in a nested case–control setting for generalisability of our findings, we developed and validated the three-marker PCR-based DNAme WID™-qCIN test for triage of oncHPV+ women.

Results

WID™-qCIN test development

An overview of the WID™-qCIN test development is shown in Fig. 1A, with relevant sample sets shown in Additional file 1: Figure S1. To identify suitable regions for MethyLight reactions, we assessed DNA methylation at ~ 850,000 CpG sites in cervical smear samples from women with CIN3+ (n = 170) and normal cytology (n = 202) (“CpG identification set”) using the Infinium MethylationEPIC array and a previously established workflow [24]. We designed MethyLight reactions for 28 regions amongst the top 50 differentially methylated positions (DMPs) whose index CpG and surrounding CpGs showed the largest difference between cases and controls and where at least one CpG showed no or very low methylation in controls regardless of the cellular composition of the samples (example regions in Fig. 1B, C). We evaluated DNA methylation with MethyLight reactions in 20 current and 40 future cases and 60 controls (LBC-CIN Discovery Set) and ranked reactions according to their area under the receiver operating characteristic curve (AUC of the ROC) (Fig. 1D). We selected the final WID™-qCIN regions via a mutual information approach [26], aiming to identify those regions which carry maximum information about the outcome (current/future CIN3) but are minimally redundant (i.e. aiming to reduce the number of markers that would only show the same information and are therefore not deemed “independent”) (Fig. 1E). This indicated that the combination of three regions in the human genes DPP6, RALYL, and GSX1 was most suitable for discriminatory performance. We defined the sum of the three PMR values for DPP6, RALYL, and GSX1 (ΣPMR), without any additional weighting, as the WID™-qCIN test. The WID™-qCIN (ΣPMR) achieved an AUC of 0.94 for current cases and 0.64 for future cases in the LBC-CIN Discovery Set.
Based on prior knowledge from our WID™-qEC test, where we identified a threshold using Youden’s index on the ROC curve [19], we set the test threshold to 0.63 for subsequent diagnostic and predictive validation.
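The test definition above (an unweighted sum of three PMR values compared against a fixed threshold) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' R code: the `Sample` class and function names are hypothetical, and treating a value strictly above 0.63 as positive is an assumption, since the text does not state how a value exactly at the threshold is handled.

```python
from dataclasses import dataclass

# Pre-specified ΣPMR threshold taken from the text.
THRESHOLD = 0.63


@dataclass
class Sample:
    """PMR values of the three MethyLight reactions (hypothetical container)."""
    dpp6: float
    ralyl: float
    gsx1: float


def sum_pmr(sample: Sample) -> float:
    """Unweighted sum of the three PMR values (ΣPMR)."""
    return sample.dpp6 + sample.ralyl + sample.gsx1


def is_positive(sample: Sample, threshold: float = THRESHOLD) -> bool:
    """Assumption: a sample is called positive if ΣPMR is strictly above the threshold."""
    return sum_pmr(sample) > threshold


if __name__ == "__main__":
    # Made-up PMR values, for illustration only.
    example = Sample(dpp6=0.5, ralyl=0.25, gsx1=0.0)
    print(sum_pmr(example), is_positive(example))
```

Because the three values are simply added, a strong signal in a single region is enough to produce a positive call.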

Fig. 1

Overview of the study and selection of WID™-qCIN MethyLight reactions. A Overview of the study setup. B Example plots of CpG methylation beta values in cervical samples of controls and cervical intraepithelial neoplasia grade 3+ (CIN3+) cases versus immune cell proportion (ic) and C mean methylation values of CpGs in close genomic proximity (within 500 base pairs) to the index CpG, to identify differentially methylated regions. D AUC of individual MethyLight reactions for discrimination of (current or future) CIN3+ cases from controls in the LBC-CIN Discovery Set. E Three reactions were selected for the WID™-qCIN test using a minimal conditional mutual information maximisation filter for feature selection (maximum information with the outcome, minimal redundancy with each other). CIN cervical intraepithelial neoplasia, LBC liquid-based cytology

WID™-qCIN test in cervical smear samples

Validation of the WID™-qCIN test in 506 cervical smear samples in the LBC-CIN Diagnostic Set (Table 1) with CIN3+ as the outcome resulted in an AUC of 0.89, comparing CIN3+ versus ≤ CIN1. Stratification by ages ≥ 30 and < 30 years led to AUCs of 0.9 and 0.89, respectively (Fig. 2A). The WID™-qCIN ΣPMR values of the LBC-CIN Diagnostic Set are displayed on a log scale in Fig. 2B for illustrative purposes. (Additional file 1: Figure S2 shows values of individual regions.) WID™-qCIN identification of CIN3+ cases from controls and CIN1 cases was not dependent on HPV subtype (Fig. 2C), although identification was slightly superior in samples from HPV16+ or HPV18+ women compared to women with other oncogenic HPV (oncHPV) subtypes (Fig. 2D). HPV16/18+ CIN3 cases had a small but significant increase in the WID™-qCIN compared with other oncHPV+ CIN3 cases.

Table 1 Overview of datasets
Fig. 2

WID™-qCIN test performance in relation to age, diagnosis, and HPV status in the LBC-CIN Diagnostic Set. A The WID™-qCIN test discriminates well between controls (up to CIN1) and cases (CIN3/AIS and invasive cervical cancers) in the LBC-CIN Diagnostic Set regardless of age, although performance is slightly higher in individuals aged ≥ 30 years. B WID™-qCIN values (log scale) in LBC-CIN Diagnostic Set samples. The WID™-qCIN threshold of 0.63 (− 0.46 on a log scale, dashed line) was selected a priori based on prior knowledge. C ROC curve analysis of different HPV subtypes (HPV16/18+ vs. other oncHPV+). Curves are not significantly different (DeLong p value = 0.23). D The WID™-qCIN is slightly higher in CIN3 samples from HPV16/HPV18+ individuals compared with other oncHPV subtypes (p = 0.04 in Wilcoxon signed-rank test). AUC area under the curve, PMR percentage methylated reference, oncHPV oncogenic human papillomavirus

Applying the pre-specified threshold of 0.63 resulted in sensitivities of 14%, 55%, 69%, and 100% for the detection of CIN1, CIN2, CIN3, and invasive cervical cancers, respectively (Additional file 1: Table S1). The performance was highly similar when restricting to HPV+ samples only (Additional file 1: Table S2). The sensitivity for CIN3+ detection in the LBC-CIN Diagnostic Set was 78%, with 83% and 65% in women ≥ 30 and < 30 years, respectively (Table 2). Importantly, this outperformed HPV typing, which generally showed lower combined sensitivity and specificity (Additional file 1: Table S3).

Table 2 Sensitivity and specificity of the WID™-qCIN test

The specificity, based on HPV+ and HPV−/cyt− samples as well as HPV+/cyt+ samples which were diagnosed as CIN1 on histology, was 90% in the LBC-CIN Diagnostic Set, and 88% and 95% in women ≥ 30 and < 30 years of age, respectively. Importantly, the WID™-qCIN test classified only 14% of all women with CIN1, and 17% and 8% of those aged ≥ 30 and < 30 years, respectively, as positive (Additional file 1: Table S1).

Assuming a CIN3+ population prevalence of 21.5% in pre-screened women (e.g. HPV+ individuals), as previously reported [21], the positive (PPV) and negative (NPV) predictive values for CIN3+ and ≤ CIN1, respectively, in the LBC-CIN Diagnostic Set in all HPV+ women, regardless of age, were 59% and 93% (Table 2). The PPV and NPV (CIN3+ and ≤ CIN1) in women ≥ 30 years of age in this set were 57% and 95%, while they were 76% and 91% in women < 30 years, respectively (Table 2). Inclusion of CIN2 for computation of NPV (≤ CIN2) did not alter values much, although this will need to be further validated as the number of CIN2 samples in our set was relatively small (n = 66).
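For readers less familiar with prevalence-adjusted predictive values, the general relationship between sensitivity, specificity, prevalence, PPV, and NPV can be sketched as follows. This is a generic Bayes' rule calculation in Python with a hypothetical function name and illustrative inputs. It is not the bdpv-based computation used for Table 2, so it should not be expected to reproduce the tabulated values.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) for a test applied at an assumed disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    # Illustrative inputs only: the sensitivity and specificity are made up;
    # 21.5% is the assumed CIN3+ prevalence in pre-screened women from the text.
    ppv, npv = predictive_values(sensitivity=0.80, specificity=0.90, prevalence=0.215)
    print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

At a fixed sensitivity and specificity, a higher assumed prevalence raises the PPV and lowers the NPV, which is why the prevalence assumption is stated explicitly.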

WID™-qCIN to predict future cancer risk

To interpret a positive test in the absence of current cervical pathologies, we furthermore aimed to assess whether the WID™-qCIN test could identify future disease in HPV+/cyt− women donating samples 1–4 years in advance of a CIN3 diagnosis.

Amongst HPV+/cyt− women aged ≥ 30 and < 30 years diagnosed with CIN3 1–4 years after sample donation, 52% and 15% were WID™-qCIN positive, respectively (Table 2). The specificity for women ≥ 30 and < 30 years of age was 94% and 92%, respectively (Table 2, Fig. 3). WID™-qCIN values were not significantly different between HPV16/18+ individuals and other oncHPV+ individuals who were either future cyt− or CIN3+ (Fig. 3B, C).

Fig. 3

Influence of time to future CIN3 on WID™-qCIN in individuals below and above 30 years of age and in relation to HPV status in the LBC-CIN Predictive set. A Scatter plot of time to event (censoring/future cytology negative (cyt−) or CIN3+ diagnosis) versus logarithm of the raw PMR values of the WID™-qCIN. PMR values are visualised on log scale only for illustrative purposes. The dashed line indicates the threshold [log(0.63) = − 0.46]. B AUC of the WID™-qCIN stratified by HPV16/18+ or other oncHPV+ status. Curves are not significantly different (DeLong p value = 0.52). C WID™-qCIN values in HPV16/HPV18+ or other oncHPV+ individuals and future CIN3 cases. CIN3 cervical intraepithelial neoplasia grade 3, cyt− cytology negative, AUC area under the curve, PMR percentage methylated reference, oncHPV oncogenic human papillomavirus

Discussion

Here, we demonstrate that a PCR-based DNA methylation signature, the WID™-qCIN test, may outperform cytology as a triage test. Importantly, in addition to high sensitivity and specificity for current cases, the test is also able to identify future cases: 52% and 15% of HPV+/cyt− women ≥ 30 and < 30 years of age, respectively, diagnosed with CIN3 1–4 years after index sample donation, had a positive WID™-qCIN. While a negative cytology followed by a positive diagnosis 1–2 years later could be interpreted as reflecting issues with cytological sampling or classification at the initial visit, we argue that this is an inherent weakness of cytology as a standard of care, and that sensitivity for current or future disease could be improved with the WID™-qCIN. The WID™-qCIN test also offers high specificity: among HPV+/cyt+ women who eventually show only CIN1 on biopsy, and are therefore deemed to be false cyt+, it is positive in only 16% and 8% of those aged ≥ 30 and < 30 years, respectively (Additional file 1: Table S2).

Only a small number of studies have assessed the clinical validity of DNAme markers in cervical liquid-based cytology samples in a primary screening setting, as summarised in our recent publication [20]. Briefly, a prospective study by Verhoef et al., evaluating the use of DNAme in MAL and miR-124-2, reported a lower sensitivity of DNAme than cytology triage (67.5% versus 74.8%, respectively), required twice as many colposcopy referrals, and only included women ≥ 33 years [11]. The performance of the methylation markers described by Verhoef et al. would presumably have been worse in younger women [27]. Although we also observed an age-dependent performance of the WID™-qCIN test, we were able to achieve a sensitivity of 65% at a 95% specificity in women < 30 years of age (Table 2).

Although the tests were not carried out side by side in the same cohort, a comparison of sensitivities and specificities suggests that the new WID™-qCIN test performs better than QIAsure, a commercially available DNAme test (overall sensitivity and specificity = 77% and 91% vs. 77.2% and 78.3%). Of note, the majority of women in a recent QIAsure set were aged ≥ 29 years [28], with a mean age of 40.7 years, and all CIN/HPV tests to date perform substantially better in older women. Conversely, the mean age in our dataset was 33.7 years. In addition, DNAme of FAM19A4, a gene which we were the first to describe as aberrantly methylated in cervical carcinogenesis [29] and which is one of the two regions assessed in the QIAsure test, was not amongst the top differentially methylated regions in our analysis.

For payers and health policy decision makers in countries such as Austria and possibly in other European countries with similar standards for primary cervical cancer screening, long-term modelling results (Additional file 1) highlight that utilising WID™-qCIN testing at three-yearly intervals might be an effective and likely cost-effective cervical cancer screening strategy, even as a primary screening modality, although this is not the envisioned initial use of the WID™-qCIN test, which was initially designed as a triage test following HPV screening. As for all model-based studies, our analysis has potential limitations based on the assumptions made and data used. One limitation is that effectiveness data were based on different evidence levels. We used test accuracy data from international meta-analyses of randomised screening trials for cytology and HPV-based screening, whereas test accuracy of the WID™-qCIN test was derived from original data based on a case–control setting, and therefore, incremental cost-effectiveness results may be biased and have limited external validity. However, our sensitivity analyses showed that results are mostly robust when varying assumptions. Additional independent research on test accuracy may further reduce uncertainty.

The strengths of this study include the exclusive use of samples from a well-defined population-based screening cohort, a careful design to control for potential bias due to factors such as age, sample year, and time of storage, and a comprehensive registry linkage strategy that allowed for the identification of samples long preceding disease. In addition, we employed an epigenome-wide array-based approach for de novo identification of the most informative CpG sites to identify women with or at future risk of CIN3+ diagnosis and used a different modality (PCR) to validate the signature.

In summary, in addition to a recent array-based WID-CIN signature for detection and risk prediction of cervical cancer [20], here we have demonstrated the performance of the three-marker PCR-based DNA methylation WID™-qCIN test in triaging women with or at future risk of CIN3+ diagnosis. Whether the array-based WID-CIN or PCR-based WID™-qCIN should be utilised may depend on the setting and aim of screening. Our recent report on the feasibility of the use of cervicovaginal self-samples using MethyLight-based testing for endometrial cancer detection suggests that self-collected samples may also be suitable for the WID™-qCIN test, although this will need to be further validated in individual studies. A strength of this approach is that HPV+ self-collected samples could be rapidly followed up using an automatable platform, making use of the same original sample without the need for patient recall and repeated sample testing. Taken together, our data indicate that the WID™-qCIN test may represent a promising triage strategy for cervical cancer screening and may be prioritised for comprehensive cost-effectiveness analyses and potentially rapid implementation in the clinical arena.

Methods

Cervical liquid-based cytology sample collection

All cervical liquid-based cytology samples processed in the capital region of Stockholm in Sweden are biobanked through a state-of-the-art platform at the Karolinska University Laboratory, Karolinska University Hospital, as previously described [30]. Since the year 2013, virtually all of the ~ 150,000 LBCs per year have been compacted and stored in a 600 µl, 96-well plate format at − 27 °C. This allows for preservation of intact cells and subsequent analyses of DNA, RNA, and protein content, among others. The biobank is linked to the Swedish health register infrastructure for cytology/HPV results, histopathology tests and results, as well as cervical cancer diagnoses, through the individually unique personal identification number (PIN) [31]. We defined cohorts of women resident in Stockholm (Additional file 1: Figure S2) who participated in cervical screening or clinically indicated testing during the years 2013–2017 and had screening sample(s) stored in the biobank. An overview of the Swedish cervical cancer screening programme at the time of the sample collection for this study is shown in Additional file 1: Figure S3. We linked these cohorts to the National Cancer Register at the Swedish National Board of Health and Welfare, and the Swedish National Cervical Screening Registry, to identify all current or future cases of CIN3/adenocarcinoma in situ (AIS) or invasive cervical cancer (CIN3+) and defined datasets for discovery and diagnostic and predictive testing of the WID™-qCIN test.

Discovery (n = 465)

CpG Identification set. For epigenome-wide assessment of cervical cancer markers, we utilised cervical samples from 202 HPV+/cyt− women and 170 women with CIN3+ that were part of the LBC-CIN Discovery and Diagnostic Sets.

LBC-CIN Discovery set. The LBC-CIN Discovery set consisted of 20 samples from current CIN3+ cases and 20 age-matched controls as well as 40 samples of future CIN3+ cases and 40 age-matched controls, i.e. subsets selected from the samples above. Samples with sufficient DNA were included and current/future CIN3+ cases were selected.

Diagnostic validation (n = 506)

LBC-CIN Diagnostic set. All screening-derived samples that were cytology-positive and collected 1–90 days prior to CIN3+ diagnoses in 2013–2015 were defined as cases. Controls were randomly selected from samples that were cyt− in women with no history of cervical lesions and were frequency matched (to CIN3+) 1:1 on age group and calendar year of samples. We also identified samples collected 1–90 days prior to histologically diagnosed CIN1 and CIN2, with a similar age distribution, to assess the ability of the test to exclude low-risk lesions.

Predictive validation (n = 255)

LBC-CIN Predictive set. For assessment of CIN3+ prediction, all cervical samples that were oncHPV+ and cyt− during 2014–2016 from women who were subsequently diagnosed with CIN3+ up to the end of 2017 were defined as cases. The vast majority of future case samples were collected in 2014 (515 out of 669, 77%), the year Stockholm county initiated a randomised healthcare policy trial for primary HPV testing. Random oncHPV+ and cyt− samples of women who did not have a CIN3+ diagnosis up to the end of 2017 were selected as controls, frequency matched 1:1 on age group, calendar year, and type of sample (screening or clinically indicated). All women who tested oncHPV+ and cyt− were recalled after 3 years, and 85% attended the follow-up in the recall. All samples for which no HPV results were available were put through high-performance HPV testing on the cobas® 4800 assay [32].
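As an illustration of 1:1 frequency matching, the following Python sketch draws, for every stratum defined by the matching variables, as many random controls as there are cases. The field names (`age_group`, `year`, `sample_type`), the function name, and the fixed random seed are hypothetical, and the registry-based selection procedure actually used in the study is not described at this level of detail.

```python
import random
from collections import Counter, defaultdict


def frequency_match(cases, candidates, keys, seed=0):
    """Select controls so that each stratum holds as many controls as cases (1:1)."""
    rng = random.Random(seed)
    # Number of cases in each stratum (combination of the matching variables).
    needed = Counter(tuple(case[k] for k in keys) for case in cases)
    # Candidate controls grouped by the same strata.
    pool = defaultdict(list)
    for candidate in candidates:
        pool[tuple(candidate[k] for k in keys)].append(candidate)
    controls = []
    for stratum, n_cases in needed.items():
        available = pool.get(stratum, [])
        if len(available) < n_cases:
            raise ValueError(f"not enough candidate controls in stratum {stratum}")
        controls.extend(rng.sample(available, n_cases))
    return controls


if __name__ == "__main__":
    keys = ("age_group", "year", "sample_type")
    # Made-up records, for illustration only.
    cases = [
        {"id": "c1", "age_group": "<30", "year": 2014, "sample_type": "screening"},
        {"id": "c2", "age_group": ">=30", "year": 2014, "sample_type": "screening"},
    ]
    candidates = [
        {"id": f"k{i}", "age_group": group, "year": 2014, "sample_type": "screening"}
        for i, group in enumerate(["<30", "<30", ">=30", ">=30", ">=30"])
    ]
    print([control["id"] for control in frequency_match(cases, candidates, keys)])
```

Frequency matching balances the distribution of the matching variables between cases and controls without pairing individual samples.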

An overview of all datasets and corresponding characteristics is shown in Table 1 and Fig. 1.

WID™-qCIN assay development

The WID™-qCIN test is based on data from ~ 850,000 methylation sites from 202 HPV+/cyt− women and 170 women with CIN3+ generated on the Illumina MethylationEPIC array platform, which indicated CpGs of interest in an epigenome-wide manner. For development of the WID™-qCIN PCR-based assay, we assessed the top 50 differentially methylated positions between women with CIN3+ and those without, including regions 500 bp up- or downstream of the site of interest. Following visual inspection, 28 suitable regions, i.e. those which showed a methylation of 0 (or near 0) in controls and increased methylation in CIN3+ cases across several adjacent CpGs, were selected for development of MethyLight reactions (three exemplary regions shown in Fig. 1B). To account for cellular heterogeneity in cervical samples, which consist of both epithelial and immune cells, we plotted exemplary regions against inferred immune cell proportion and verified that methylation differences were present across different sample compositions. These reactions were tested in the LBC-CIN Discovery Set (see Additional file 1: Figure S1) consisting of samples from current CIN3 cases (n = 20), controls (n = 20), and future CIN3 cases with matched controls (n = 40 each). Three reactions were selected for the WID™-qCIN test using minimal conditional mutual information maximisation (CMIM) [26], which aims to identify those features that are maximally relevant to the outcome (diagnosis) but minimally redundant with each other, in order to maximise the information obtained from three combined regions.
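The CMIM selection step can be illustrated with a small Python sketch of the greedy criterion described by Fleuret [26]: the first feature is the one with the highest mutual information with the outcome, and each further feature is the one whose information about the outcome, conditioned on each already selected feature, is largest in the worst case. This is a simplified re-implementation for illustration, not the praznik function used in the study, and it assumes that features have already been discretised; the toy data and feature names are made up.

```python
from collections import Counter
from math import log


def entropy(*columns):
    """Joint Shannon entropy (in nats) of one or more discrete columns."""
    n = len(columns[0])
    counts = Counter(zip(*columns))
    return -sum(c / n * log(c / n) for c in counts.values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def conditional_mutual_information(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)


def cmim(features, outcome, k):
    """Greedy CMIM: return the names of k features, in order of selection."""
    # Initial score: marginal mutual information with the outcome.
    score = {name: mutual_information(col, outcome) for name, col in features.items()}
    selected = []
    while len(selected) < k and score:
        best = max(score, key=score.get)
        selected.append(best)
        del score[best]
        # A candidate's score drops to its worst-case conditional information,
        # which penalises features redundant with one already selected.
        for name in score:
            cmi = conditional_mutual_information(features[name], outcome, features[best])
            score[name] = min(score[name], cmi)
    return selected


if __name__ == "__main__":
    outcome = [0, 0, 0, 0, 1, 1, 1, 1]
    features = {
        "region_a": [0, 0, 0, 0, 1, 1, 1, 0],  # informative
        "region_b": [0, 0, 0, 0, 1, 1, 1, 0],  # exact copy of region_a (redundant)
        "region_c": [0, 0, 1, 0, 1, 1, 0, 1],  # weaker alone, but complementary
    }
    print(cmim(features, outcome, k=2))
```

In the toy data, the duplicate of the first selected feature carries no additional information once that feature is known, so the weaker but complementary feature is preferred.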

WID™-qCIN DNA methylation assay

DNA methylation-specific, quantitative real-time PCR (MethyLight) analysis was performed as previously described [33] with some modifications. Cervical DNA was extracted and normalised to 25 ng/μl using the Nucleo-Mag Blood 200 µl kit (Macherey Nagel, cat #744501.4). DNA concentration was measured using the Qubit™ 4 Fluorometer (Invitrogen™). 500 ng of extracted DNA was bisulphite modified and eluted to a concentration of 4 ng/µl using the EZ-96 DNA Methylation-Lightning™ Kit (Zymo Research corp, cat. #D5033) as per the manufacturer’s protocol. For the multiplex MethyLight assay, 20 ng of bisulphite-modified DNA was amplified in a 20 µl reaction containing 1× Luna® Universal Probe qPCR Master Mix (NEB®, cat. #M3004G) and one of the primer–probe sets listed in Additional file 1: Table S4. All PCR reactions were run in duplicate. To normalise for DNA input in each reaction, COL2A1 was selected as the reference gene. Human SssI-treated DNA or double-stranded gBlocks™ Gene Fragments (IDT™) containing known copy numbers of each analysed target and COL2A1 served as the fully methylated calibrator and as qPCR standard curve material. PCR reactions were run on the QuantStudio™ 7 Pro (Applied Biosystems™) and results were extracted via the Design & Analysis Software 2.5.0 (Applied Biosystems™). The percentage of methylated reference (PMR) at the target locus was calculated using an R script, dividing the TARGET:COL2A1 input amount ratio of a sample (input amounts derived using the COL2A1 standard curve; Eq. 1) by the TARGET:COL2A1 input amount ratio of the gBlocks™ Gene Fragments DNA and multiplying by 100 (Eq. 2).

$$\text{input amount} = 10^{\frac{\text{Ct target} - \text{intercept}[\text{COL2A1 standard curve}]}{\text{slope}[\text{COL2A1 standard curve}]}} \tag{1}$$
$$\text{PMR} = \frac{\frac{\text{input amount target}}{\text{input amount COL2A1}}[\text{Sample}]}{\frac{\text{input amount target}}{\text{input amount COL2A1}}[\text{gBlock}]} \times 100 \tag{2}$$
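Equations 1 and 2 can be translated into a short Python sketch. The authors used an R script that is not shown here, so the function and parameter names below are hypothetical, as are the example Ct values and standard curve parameters. Following the text, the COL2A1 standard curve is assumed to be used for both the target and the reference reaction.

```python
def input_amount(ct: float, intercept: float, slope: float) -> float:
    """Eq. 1: input amount from a Ct value using the COL2A1 standard curve."""
    return 10 ** ((ct - intercept) / slope)


def pmr(ct_target_sample: float, ct_col2a1_sample: float,
        ct_target_calibrator: float, ct_col2a1_calibrator: float,
        intercept: float, slope: float) -> float:
    """Eq. 2: TARGET:COL2A1 ratio of the sample relative to the calibrator, times 100."""
    sample_ratio = (input_amount(ct_target_sample, intercept, slope)
                    / input_amount(ct_col2a1_sample, intercept, slope))
    calibrator_ratio = (input_amount(ct_target_calibrator, intercept, slope)
                        / input_amount(ct_col2a1_calibrator, intercept, slope))
    return 100.0 * sample_ratio / calibrator_ratio


if __name__ == "__main__":
    # Made-up Ct values and standard curve parameters, for illustration only.
    value = pmr(ct_target_sample=31.0, ct_col2a1_sample=25.0,
                ct_target_calibrator=25.0, ct_col2a1_calibrator=25.0,
                intercept=40.0, slope=-3.3)
    print(f"PMR = {value:.2f}")
```

A sample with the same target and COL2A1 Ct values as the fully methylated calibrator yields a PMR of 100, and later target amplification relative to COL2A1 yields proportionally lower values.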

Statistical analyses

All statistical analyses were conducted using R version 4.0.2 (2020-06-22). Cellular composition of cervical samples analysed using the Illumina MethylationEPIC array was inferred using the EpiDISH algorithm, version 2.10.0 [34]. Three regions were selected using the minimal conditional mutual information maximisation filter function (CMIM) in the praznik package, version 9.0.0, based on the method developed by Fleuret [26]. Receiver operating characteristic curves, areas under the curve, and corresponding 95% confidence intervals were generated using the pROC package, version 1.18.0. Sensitivity and specificity including 95% confidence intervals were calculated using the epi.tests function in the epiR package, version 2.0.38, while PPV and NPV were obtained from the BDtest function (bdpv package, version 1.3) assuming a CIN3+ prevalence of 21.5% for pre-screened individuals (i.e. HPV+ individuals) following a previous systematic review for CIN3+ [21]. All statistical analyses were performed on original ΣPMR values. Original data are available in Additional file 2, Tables 1, 2 and 3.
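The two central performance measures can be illustrated without any statistics package. The Python sketch below computes the AUC as the proportion of case-control pairs in which the case has the higher value (ties counted as one half) and computes sensitivity and specificity at a fixed threshold. It is a plain re-implementation for illustration, not the pROC or epiR code used in the study; confidence intervals are omitted, the example values are made up, and calling values strictly above the threshold positive is an assumption.

```python
def auc(case_scores, control_scores):
    """AUC as the probability that a random case scores higher than a random control."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties count as half a concordant pair
    return wins / (len(case_scores) * len(control_scores))


def sensitivity_specificity(case_scores, control_scores, threshold):
    """Assumption: values strictly above the threshold are called positive."""
    sensitivity = sum(s > threshold for s in case_scores) / len(case_scores)
    specificity = sum(s <= threshold for s in control_scores) / len(control_scores)
    return sensitivity, specificity


if __name__ == "__main__":
    # Made-up ΣPMR values, for illustration only.
    cases = [0.9, 0.7, 0.5]
    controls = [0.6, 0.2, 0.1]
    print(auc(cases, controls))
    print(sensitivity_specificity(cases, controls, threshold=0.63))
```

Because the AUC depends only on the ranking of values, it is unchanged by the log transformation used for display in the figures.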

Table 3 Diagnostic and predictive WID™-qCIN assessment in distinct HPV subgroups