Introduction

Human papillomavirus (HPV) is generally accepted as the causative agent of cervical cancer (CC) [1], a link first unmasked by the landmark studies of Meisels and Fortin [2] and Purola and Savia [3]. Currently, there are 198 reference HPV types listed in the Papillomavirus Episteme (PaVE) database, and at least 12 are classified as high-risk by the World Health Organization (WHO) International Agency for Research on Cancer (IARC) Monographs Working Group [4,5,6]. HPV testing has been adopted by several European countries for primary CC screening, to augment cytology-based screening programs [7, 8]. A number of HPV assays are available commercially, which are mainly based on direct HPV genome detection, HPV DNA amplification and E6/E7 mRNA detection [9]. The recent advent of next-generation sequencing (NGS) technologies has provided high-throughput tools for infectious disease diagnostics and epidemiological research. Several research groups have explored the utility of Illumina MiSeq and Ion Torrent platforms for HPV genotyping, reporting sensitivity comparable to well-established line blot assays and a broader detection spectrum [10,11,12]. While the reagent cost is comparable to that of existing commercial assays for large sample batches, these NGS platforms may not be the best choice for medium sample throughput or for laboratories with fewer resources and less space. In this regard, portable Nanopore sequencers may allow more flexibility, with shorter sequencing time and lower reagent cost. We therefore developed a Nanopore HPV genotyping protocol using 2 published primer sets and compared its performance with 2 commercial HPV assays: cobas HPV Test and Roche Linear Array HPV Genotyping Test (LA).

Methods

Specimens

Two hundred and one cervicovaginal swabs were collected from March to July 2019 at Hong Kong Sanatorium & Hospital. The swabs were preserved in SurePath preservative fluid (Becton, Dickinson and Company, Sparks, MD, USA) and routinely tested by Papanicolaou smear (Pap smear, reported according to The Bethesda System), cobas HPV Test and LA (Roche Diagnostics, Mannheim, Germany). Routine test results are shown in Table 1.

Table 1 Results of Pap smear, cobas HPV Test, Roche Linear Array HPV Genotyping Test, and Nanopore sequencing

DNA extraction

DNA extraction and cobas HPV Test were performed using the cobas 4800 system (Roche Diagnostics, Rotkreuz, Switzerland). Briefly, 500 μL of cervicovaginal specimen was added to 500 μL of sample preparation buffer and heated at 120 °C for 20 min. The mixture was brought to ambient temperature for 10 min and processed on the cobas x 480 using the ‘high-risk HPV DNA PCR’ protocol. Real-time polymerase chain reaction (PCR) was performed on the cobas z 480. Fifty microliters of DNA extract were used for LA according to the manufacturer’s recommendations. Residual DNA remaining after routine testing was used for the Nanopore protocol.

HPV PCR

For each specimen, the L1 region of the HPV genome was amplified in 2 separate PCRs using the PGMY and MGP primer sets [13, 14]. Primer sequences and cycling conditions are shown in Tables 2 and 3. The human β-globin gene was used as an inhibition control, and contamination was monitored with a negative extraction control. Five microliters of each PCR amplicon were electrophoresed in a 2% agarose gel (Invitrogen, Carlsbad, CA, USA) and analyzed. PCR-positive specimens were sequenced using Nanopore MinION.

Table 2 Primer sequences
Table 3 Master mix constituents and PCR conditions

Nanopore sequencing library preparation

PGMY and MGP PCR amplicons of each positive specimen were pooled and purified using AMPure XP beads (Beckman-Coulter, Brea, CA, USA). Nanopore sequencing libraries were prepared from purified amplicons using Ligation Sequencing Kit 1D (SQK-LSK109) and PCR-free Native Barcoding Expansion Kit (EXP-NBD104/114) (Oxford Nanopore Technologies, Oxford, England). The barcoded libraries were loaded and sequenced on MinION flow cells (FLO-MIN106D R9.4.1, Oxford Nanopore Technologies, Oxford, England) after quality control runs.

Data analysis

Data from the first 2 h of each sequencing run were analyzed. FASTQ files generated by live basecalling (MinKNOW version 2.0) were demultiplexed using the ‘FASTQ Barcoding’ workflow on EPI2ME (Oxford Nanopore Technologies, Oxford, England) with the default minimum qscore of 7 and the ‘auto’ and ‘split by barcode’ options. FASTQ files of each specimen were concatenated into a single file and analyzed using a 2-step custom workflow on the Galaxy bioinformatics platform. Briefly, FASTQ files were converted into FASTA format, and the sequences were then aligned against HPV reference genomes from the PaVE database using NCBI BLAST+ blastn (Galaxy version 1.1.1). PGMY and MGP reads were sorted by sequence length and analyzed individually. The threshold for each run was set at the average number of background reads plus 10 standard deviations, both calculated using the interquartile rule, that is, excluding the first and last quartiles. A positive HPV call was made when either (1) the number of reads for a particular HPV type was above the threshold, or (2) the specimen had the highest number of reads for a particular HPV type. All positive calls were further assessed by aligning FASTQ sequences against HPV reference genomes using minimap2 (Galaxy version 2.17 + galaxy0), and consensus sequences were built from the BAM files using Unipro UGENE (version 1.29.0) to determine their percentage of identity to the reference genomes.
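The run threshold described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Galaxy workflow itself: "excluding first and last quartiles" is read here as keeping background counts that fall between Q1 and Q3, the sample standard deviation is used, and the function names and example counts are hypothetical. Only rule (1) is shown, because the exact scope of rule (2) is not specified in enough detail to reproduce.

```python
from statistics import mean, quantiles, stdev


def run_threshold(background_counts, n_sd=10):
    """Return mean + n_sd * SD of the background counts between Q1 and Q3.

    Assumption: "excluding first and last quartiles" is read as keeping only
    the values that lie within the interquartile range. Needs >= 2 counts.
    """
    counts = sorted(background_counts)
    q1, _, q3 = quantiles(counts, n=4, method="inclusive")
    middle = [c for c in counts if q1 <= c <= q3]
    if len(middle) < 2:  # too few values left after trimming; use all counts
        middle = counts
    return mean(middle) + n_sd * stdev(middle)


def types_above_threshold(reads_per_type, threshold):
    """Rule (1): report HPV types whose read count exceeds the run threshold."""
    return sorted(t for t, n in reads_per_type.items() if n > threshold)


# Hypothetical read counts, for illustration only.
background = [2, 3, 4, 5, 6, 7, 8, 200]
threshold = run_threshold(background)
calls = types_above_threshold({"HPV16": 500, "HPV52": 10}, threshold)
print(round(threshold, 2), calls)
```

Trimming before computing the mean and standard deviation keeps a single outlying count (200 in the example) from inflating the threshold.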

Results

As HPV 66 is categorized as ‘other high-risk’ by cobas HPV Test, all calculations were based on this grouping, although HPV 66 has been found as a single infection in cancers only with extreme rarity and has been re-classified as a possible carcinogen (Group 2B) by the IARC Monographs Working Group [6].

The results are summarized in Table 1. PCR was successful for 191 specimens (191/201, 95.02%); the remaining 10 specimens (10/201, 4.98%) lacked a β-globin band and were therefore regarded as unsuitable for further analysis. Seventy-six specimens (76/201, 37.81%) were negative in both PGMY and MGP PCRs, and 115 (115/201, 57.21%) were positive in at least one of the two. PCR-positive specimens were sequenced on 10 MinION flow cells with 145–890 active pores, generating 31,748–525,880 HPV reads in the first 2 h (Table 4). Of the 115 specimens sequenced, 19 were negative (7–522 reads, 113 on average) and 96 were positive (45–96,549 reads, 20,158 on average) for HPV. Taken together, there were 95 HPV-negative (95/201, 47.26%) and 96 HPV-positive (96/201, 47.76%) specimens by the Nanopore workflow.

Table 4 Details of Nanopore sequencing runs
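The specimen counts reported in this section can be cross-checked with a short script. The counts are taken from the text; the variable and function names are ours and purely illustrative.

```python
def pct(count, total):
    """Percentage rounded to two decimals, as reported in the text."""
    return round(100 * count / total, 2)


TOTAL = 201          # swabs collected
INVALID = 10         # no beta-globin band
PCR_NEGATIVE = 76    # negative in both PGMY and MGP PCRs
PCR_POSITIVE = 115   # positive in at least one PCR, sequenced
SEQ_NEGATIVE = 19    # sequenced but HPV-negative
SEQ_POSITIVE = 96    # sequenced and HPV-positive

valid = TOTAL - INVALID
hpv_negative = PCR_NEGATIVE + SEQ_NEGATIVE
hpv_positive = SEQ_POSITIVE

summary = {
    "valid": (valid, pct(valid, TOTAL)),
    "invalid": (INVALID, pct(INVALID, TOTAL)),
    "pcr_negative": (PCR_NEGATIVE, pct(PCR_NEGATIVE, TOTAL)),
    "pcr_positive": (PCR_POSITIVE, pct(PCR_POSITIVE, TOTAL)),
    "hpv_negative": (hpv_negative, pct(hpv_negative, TOTAL)),
    "hpv_positive": (hpv_positive, pct(hpv_positive, TOTAL)),
}
for name, (count, percent) in summary.items():
    print(f"{name}: {count}/{TOTAL} ({percent}%)")
```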

Table 5 shows concordance of Nanopore workflow with cobas HPV Test and LA, which was based on the 37 HPV types detectable by LA. For cobas HPV Test, our workflow achieved 93.19, 93.19 and 81.94% for perfect, total and positive agreement, respectively, with Cohen’s kappa of 0.85. For LA, Nanopore achieved a perfect agreement of 83.77% for both high-risk and non-high risk HPVs. For high-risk types, total and positive agreement were 96.86 and 91.78%, respectively, with Cohen’s kappa of 0.93. For non-high risk types, total and positive agreement were 93.19 and 77.59%, respectively, with Cohen’s kappa of 0.83.

Table 5 Agreement between cobas HPV Test, Roche Linear Array HPV Genotyping Test (LA) and Nanopore
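For readers who wish to reproduce agreement statistics of this kind, the sketch below computes total agreement, positive agreement and Cohen's kappa from a 2 × 2 table. The definitions are commonly used ones and are assumed here rather than taken from the text: total agreement is the fraction of concordant results, and positive agreement is the number of results positive by both methods divided by the number positive by either. The counts in the example are hypothetical and are not the study data.

```python
def agreement_stats(both_pos, test_only, ref_only, both_neg):
    """Agreement statistics for two binary tests from a 2 x 2 table.

    Assumes at least one positive result and that the two tests are not
    in trivially perfect chance agreement (otherwise a division by zero).
    """
    n = both_pos + test_only + ref_only + both_neg
    observed = (both_pos + both_neg) / n                 # total agreement
    positive = both_pos / (both_pos + test_only + ref_only)
    test_pos = both_pos + test_only
    ref_pos = both_pos + ref_only
    # Agreement expected by chance from the marginal totals.
    expected = (test_pos * ref_pos + (n - test_pos) * (n - ref_pos)) / n**2
    kappa = (observed - expected) / (1 - expected)
    return {
        "total_agreement": observed,
        "positive_agreement": positive,
        "kappa": kappa,
    }


# Hypothetical counts, for illustration only.
stats = agreement_stats(both_pos=40, test_only=5, ref_only=5, both_neg=50)
print({name: round(value, 3) for name, value in stats.items()})
```

Kappa corrects the observed agreement for the agreement expected by chance, which is why it is lower than total agreement when negative results dominate.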

Table 6 shows per-type concordance of Nanopore and LA. A total of 13 high-risk and 19 non-high risk HPV types were evaluated. Positive agreement for HPV 16 (n = 8) and 18 (n = 1) was 87.5 and 100%, respectively. Positive agreement was 75–100% for high-risk HPV 31, 33, 35, 39, 51, 52, 56, 58, 59 and 66, and 20% for HPV 68 (n = 5). For non-high risk HPVs, positive agreement was 37.5–100% for HPV 6, 11, 40, 42, 53, 54, 55, 61, 62, 70, 72, 73, 81, 82, 83, 84 and 89. Two non-high risk types had 0% positive agreement (HPV 26 and 71): HPV 26 (n = 1) was detected only by the Nanopore workflow, whereas HPV 71 (n = 2) was detected only by LA.

Table 6 Per HPV type positive agreement between Roche Linear Array Genotyping Test (LA) and Nanopore

Table 7 reveals the percentage of identity of Nanopore consensus sequences to HPV reference genomes. In general, Nanopore consensus sequences showed an average identity of 98% to the best matches, with an average difference of 15% from second BLAST hits.

Table 7 Percentage of identity of Nanopore consensus sequences to HPV reference genomes

Table 8 summarizes the HPV status for each cytology grade. For high-grade and low-grade squamous intraepithelial lesions (HSIL and LSIL), nearly all specimens were positive for high-risk HPV (HSIL: 4/4, 100%; LSIL: 16/18, 88.89%). For atypical squamous/glandular cells, about half of the specimens were positive for high-risk HPV (by LA: 19/41, 46.34%; by Nanopore: 18/41, 43.90%). For cases without observable abnormalities, 22.12% (25/113) and 21.24% (24/113) were positive for high-risk HPV by LA and Nanopore, respectively.

Table 8 Results of Pap smear, LA and Nanopore workflow. The calculations were based on 176 quality control-valid specimens with Pap smear results available

Discussion

Hong Kong has been one of the Asian regions with the lowest incidence and mortality rate of CC [16]. This might be attributable to the territory-wide cervical screening program implemented by Department of Health since 2004. The program is well-organized, which involves public education, regular cervical smear and follow-up service for eligible women, and a quality assurance mechanism on key components of the program [17]. Cytology is the mainstay of primary screening, and high-risk HPV testing may be performed for triage to colposcopy.

Cytology and HPV testing each have their own value for CC screening. High-quality cytology has high specificity for CC but lower sensitivity, ranging from 50% in cross-sectional studies to 75% when estimated longitudinally [18]. For HPV testing, sensitivity was reported to be about 10% higher than that of cytology, yet with lower specificity [18]. Complementary use of both tests could raise sensitivity to nearly 100% while retaining high specificity (92.5%) [19]. In fact, this combined approach has been adopted by several European countries and may become the future trend of primary CC screening in developed countries.

Compared with HPV assays on the market, HPV genotyping by NGS offers a broader detection spectrum which, despite the minimal benefit of non-high risk HPV information for CC screening, may provide important etiologic clues for other HPV-associated infections and a more complete picture of HPV epidemiology. For the latter, Nanopore identified more HPV types per sample (Fig. 1) and 5 extra HPV types (HPV 43, 44, 74, 87 and 90, n = 34) not detectable by LA (Fig. 2), with an unexpectedly high incidence of HPV 90 (n = 12), which has been reported in North America and Belgium but not in Hong Kong [20, 21]. Another advantage offered by NGS is its potential utility for simultaneous characterization of the cervicovaginal microbiome, whose possible role in dysplasia and carcinogenesis has been revealed by accumulating research evidence [22,23,24,25]. These merits may facilitate a multifaceted approach to the evaluation of women's health in the near future.

Fig. 1 Number of HPV types detected per sample by Nanopore workflow and LA

Fig. 2 Diversity of HPV types detected by Nanopore workflow and LA

In general, Nanopore had substantial agreement with cobas HPV Test and LA. Compared with cobas HPV Test, Nanopore appeared to be more sensitive for HPV 52 (n = 7) and 59 (n = 4), and 81.82% (9/11) of these discrepant results matched LA. Compared with LA, concordance was higher for high-risk HPV than for non-high risk types. Among the 37 discrepant results, 22 were false negatives by Nanopore and 15 were not detected by LA.

For the false negatives by Nanopore, more than half (12/22, 54.55%) were mixed infections, and a similar finding was reported by other research groups using HPV consensus primers for NGS-based genotyping [10, 11]. Other possible causes of false negatives included (1) low viral load, as evidenced by Specimen 182, in which HPV 16 was missed by both Nanopore and cobas HPV Test; (2) a substantial difference in DNA input (50 μL for LA versus 5 μL for PGMY/MGP PCR); and (3) lower sensitivity due to the reduced magnesium chloride concentration of the PGMY PCR (from 4 mM to 1.5 mM), which was fine-tuned to minimize non-specific amplification.

For the 15 HPV types missed by LA, the average identity of Nanopore consensus sequences was 98.27%, with an average difference of 16% from the second BLAST hits (Table 7). As distinct HPV types generally differ by more than 10% in L1 sequence [26, 27], the discrepant positive calls were unlikely to have been caused by the high sequencing error rate of Nanopore. More specifically, 5 of these positive calls were identified solely by MGP PCR (5/15, 33.33%), 5 by PGMY PCR only (5/15, 33.33%), and 5 by both PCRs (5/15, 33.33%). This revealed differential sensitivities of the PGMY and MGP primers, which might complement each other and enhance the overall performance of the Nanopore assay. On the other hand, Nanopore sequencing might improve the resolution of genotyping, which might not be attained by the line blot method owing to cross-hybridization of certain probes. For instance, Nanopore identified HPV 52 in Specimens 5, 40 and 74, which could not be confirmed by LA because of cross-hybridization with HPV 33 or 58. Another example was Specimen 125, which was HPV 84-positive by LA and HPV 87-positive by Nanopore. In the literature, Artaza-Irigaray and colleagues reported cross-hybridization between these 2 HPV types by LA, with 11.5% of cervical specimens that were HPV 84-positive by LA found to be HPV 87-positive by NGS [28].
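The reasoning above, that a best-hit identity far exceeding the second hit argues against a sequencing-error artifact, can be illustrated with a small sketch. It is not the analysis pipeline used in this study: the function names are hypothetical, identity is assumed to be matching columns divided by alignment length, the "difference from second hit" is assumed to be the difference in percentage identity, and the example values are invented.

```python
TYPE_GAP = 10.0  # distinct HPV types generally differ by > 10% in L1


def percent_identity(aligned_query, aligned_ref):
    """Percentage of alignment columns where both sequences carry the same base."""
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(
        q == r and q != "-" for q, r in zip(aligned_query, aligned_ref)
    )
    return 100 * matches / len(aligned_query)


def best_hit_margin(identity_by_type):
    """Return (best type, its identity, margin over the second-best type)."""
    ranked = sorted(identity_by_type.items(), key=lambda kv: kv[1], reverse=True)
    (best_type, best_id), (_, second_id) = ranked[0], ranked[1]
    return best_type, best_id, best_id - second_id


# Hypothetical identities, for illustration only.
hits = {"HPV87": 98.3, "HPV84": 82.1, "HPV86": 80.0}
best_type, best_id, margin = best_hit_margin(hits)
unambiguous = margin > TYPE_GAP
print(best_type, best_id, round(margin, 1), unambiguous)
```

A margin above the roughly 10% gap that separates distinct types suggests the consensus cannot plausibly be a miscalled version of the second-best type.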

The Nanopore method and LA revealed very similar high-risk HPV positivity in each cytology grading. The goal of combined cytology-HPV testing approach is to enhance cost effectiveness of CC screening. While minimizing unnecessary referral for colposcopy, HPV genotyping may identify high-risk individuals before observable cytological abnormalities, for instance, the 4 HPV 16-positive patients without abnormal cytology findings in this study. This may facilitate an early detection approach for cancer prevention.

Our study had several limitations. First, the sample size of certain HPV types, for example, HPV 18 (n = 1), was less satisfactory for evaluating type-specific performance. Second, as residual DNA was used after routine testing, DNA input for PGMY and MGP PCRs was constrained which might lower the sensitivity. In addition, as flow cells with suboptimal number of active pores were used, sequencing time and depth might be further improved if new flow cells were used.

Conclusions

We developed a Nanopore workflow for HPV genotyping, with performance comparable to or better than that of 2 reference methods on the market. Our method was economical, with a reagent cost of about USD 50.77 per patient specimen for 24-plex runs, which was competitive when compared with an average price of USD 106.14 (from 4 randomly selected laboratories) for HPV genotyping referral service in our region (Table 9). The protocol was also straightforward, with a reasonable turnaround time of about 12 h from sample to answer. The small size and portability of MinION sequencers may well suit remote or resource-limited laboratories with space constraints. A future prospective study with a larger sample size is warranted to further evaluate test performance and streamline the protocol. As LA has been discontinued in Hong Kong, the Nanopore workflow described here may provide an economical option for broad-range HPV genotyping.

Table 9 Comparison of estimated reagent cost of Nanopore workflow (24-plex) and randomly-selected prices of HPV genotyping referral service in Hong Kong
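The cost comparison quoted in the Conclusions amounts to simple arithmetic, sketched below. The two prices are taken from the text; the variable names are illustrative, and the comparison sets a reagent cost against a referral service price, so the difference should be read only as a rough indication.

```python
NANOPORE_REAGENT_COST_USD = 50.77   # per specimen, 24-plex run
REFERRAL_SERVICE_PRICE_USD = 106.14  # average of 4 laboratories

difference_usd = REFERRAL_SERVICE_PRICE_USD - NANOPORE_REAGENT_COST_USD
difference_pct = 100 * difference_usd / REFERRAL_SERVICE_PRICE_USD

print(f"Difference: USD {difference_usd:.2f} ({difference_pct:.1f}% of referral price)")
```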