Background

Cervical cancer is caused by human papillomavirus (HPV) and develops over many years through a series of precancerous steps [1, 2]. The disease can be prevented by HPV vaccination or by screening with HPV tests or Pap smears [3, 4]. Since 2009, Norway has had an HPV vaccination program for 12-year-old girls, with coverage of around 80% [5]. Since November 2016, there has been an ongoing two-year catch-up vaccination program for women aged 20–25 years, with an expected coverage rate of 40–45% (www.fhi.no). Since 2015, HPV testing in primary screening has been piloted in four counties [6]. In this pilot, women 34 years and older are randomized to a Pap smear every three years or an HPV test every five years [6]. However, in most parts of Norway, the cervical screening program is still based on cervical cytology [5].

Since 1995, Norway has had a national cervical cancer screening program in which women aged 25–69 years are recommended to have a Pap smear every three years [5]. Women with high-grade cytology (ASC-H / HSIL) are referred to a gynecologist for colposcopy and biopsy. HPV testing is used to triage women with low-grade cytology (ASC-US / LSIL). The cervical screening program has a coverage of 60% after 3.5 years. The Norwegian Cancer Registry sends a reminder to women without a Pap smear after three years and a second reminder after four years. The coverage is 80% after 5 years [5]. Most Pap smears are taken by GPs, while some are taken by gynecologists. There are 17 different laboratories involved in the screening program, and most of these use liquid-based cytology (ThinPrep or SurePath).

It is well known that cervical cytology has limited sensitivity and reproducibility [7,8,9,10,11,12]. Diagnoses may vary from cytotechnician to cytotechnician, from pathologist to pathologist and from lab to lab [9, 11, 12]. All cervical cytology diagnoses, results of HPV tests and biopsies from all laboratories in Norway are reported to the Norwegian Cancer Registry, which drafts annual reports with feedback to each laboratory, including the distribution of their diagnoses compared with the national average [5] (Table 1).

Table 1 Distribution (%) of selected cytological diagnoses in different labs in Norway in 2015

Detection rates vary widely across hospitals. This may reflect higher sensitivity or lower specificity at some laboratories, regional differences in the prevalence of HPV, cervical dysplasia and cancer, or a combination of these causes. We wanted to investigate the accuracy of cytology diagnoses made by four different pathologists at three different hospitals in Norway.

Methods

One hundred cervical cytological samples screened at UNN in 2015 with the diagnoses normal, ASC-US, LSIL, ASC-H and HSIL were sent to the Departments of Pathology in Bergen (HUS), Bodø (Nordland), Fredrikstad (Østfold), Stavanger (SUS) and Tønsberg (Vestfold). The pathologist at the Department of Pathology in Bergen did not have time to participate in the study, and he forwarded the slides to Stavanger without looking at them. Two cytotechnologists at the Department of Pathology in Fredrikstad diagnosed the slides, but they were trained to screen SurePath samples. Because this study was based on ThinPrep samples, their results were excluded.

All slides were first screened by a cytotechnologist at UNN and then evaluated by a pathologist at UNN (P1, reference). The abnormal cells were marked on the slides before they were dispatched for the study. The slides were not screened at the other hospitals. The four other pathologists (P2–P5) at the other hospitals were asked to evaluate only the abnormal cells marked on the slides. They were blinded to age, previous findings, clinical information and HPV result. Diagnoses from each of the four pathologists were compared with diagnoses from the three other pathologists. Women with abnormal findings at UNN were followed up according to national guidelines. In Norway, the Bethesda System for Reporting Cervical Cytology is used by all laboratories. All patients were followed up through December 2016. Histologically confirmed high-grade dysplasia (CIN2+) was considered the study endpoint (gold standard). When calculating sensitivity and specificity, women with normal Pap smears, and women with low-grade cytology (ASC-US / LSIL) and a negative HPV test without histology, were considered free of high-grade dysplasia (CIN1-).

All analyses were done in IBM SPSS Statistics, version 23, using the Chi-square test for categorical variables and the t-test for continuous variables. To assess agreement in cytological diagnoses between observers, we used weighted kappa with linear weights.
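The linearly weighted kappa named above can be illustrated with a minimal Python sketch. This is not the SPSS procedure used in the study. It assumes the five Bethesda categories are ordered Normal < ASC-US < LSIL < ASC-H < HSIL, and the function name `linear_weighted_kappa` and the example ratings are hypothetical. Disagreement between two categories is weighted by their distance on this ordinal scale, so a Normal versus HSIL disagreement counts four times as much as a Normal versus ASC-US disagreement.

```python
from typing import Sequence

# Assumed ordinal order of the cytology categories (an assumption of this sketch).
CATEGORIES = ["Normal", "ASC-US", "LSIL", "ASC-H", "HSIL"]


def linear_weighted_kappa(
    rater_a: Sequence[str],
    rater_b: Sequence[str],
    categories: Sequence[str] = CATEGORIES,
) -> float:
    """Cohen's kappa with linear disagreement weights |i - j| / (k - 1)."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("Both raters must rate the same, non-empty set of samples")

    k = len(categories)
    index = {name: i for i, name in enumerate(categories)}
    n = len(rater_a)

    # Observed joint proportions: observed[i][j] is the share of samples
    # that rater A put in category i and rater B put in category j.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n

    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Weighted observed disagreement and weighted disagreement expected by chance.
    obs_dis = 0.0
    exp_dis = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)
            obs_dis += weight * observed[i][j]
            exp_dis += weight * row[i] * col[j]

    if exp_dis == 0.0:
        # Both raters used one and the same category throughout: kappa is undefined.
        return float("nan")
    return 1.0 - obs_dis / exp_dis


# Hypothetical ratings of six slides by two observers (not study data).
example_a = ["Normal", "ASC-US", "LSIL", "ASC-H", "HSIL", "HSIL"]
example_b = ["Normal", "LSIL", "LSIL", "HSIL", "HSIL", "ASC-H"]
example_kappa = linear_weighted_kappa(example_a, example_b)
```

With two categories the linear weights reduce to ordinary Cohen's kappa, which gives a convenient sanity check for the function.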

Results

Of the 100 cervical cytology samples, 20 were diagnosed as Normal, 20 as ASC-US, 20 as LSIL, 20 as ASC-H and 20 as HSIL at UNN. There were 32 women with high-grade histology (CIN2+) in the follow-up, including 19 with CIN2, 12 with CIN3 and one with squamous cell carcinoma (SCC). There were no CIN2+ cases among women with a Normal diagnosis, one in the ASC-US group, three in the LSIL group, 10 in the ASC-H group and 18 in the HSIL group (Table 2). Using high-grade cytology (ASC-H+) as cut-off, the sensitivity for CIN2+ at UNN was 87.5% (28/32).

Table 2 Cytology diagnoses at UNN with HPV tests and biopsies

The number of samples diagnosed as “Normal” varied from 15 to 39 among the four pathologists, with a mean of 28.8. One pathologist (P2) had significantly fewer “Normal” cases than the average of the four pathologists (p<0.05) (Table 3). The corresponding ranges for ASC-US, LSIL, ASC-H and HSIL were 17 to 24 (mean 19.8), 9 to 20 (mean 14.0), 10 to 18 (mean 13.3) and 16 to 32 (mean 24.0), respectively (Table 3); none of these differences was statistically significant. There was moderate agreement between the observers (weighted kappa 0.45–0.58) (Table 4). The kappa values did not differ significantly from one another.

Table 3 Distribution of diagnoses per pathologist
Table 4 Agreement between observers (weighted kappa)

The agreement was higher for “Normal” and “HSIL” samples than for the other diagnoses (ASC-US, LSIL and ASC-H) (Additional file 1: Tables S1–S5). The number of samples diagnosed as high-grade cytology (ASC-H+) varied from 26 (P4) to 50 (P2). Of 61 women with at least one high-grade cytology diagnosis, 17 (27.9%) were considered high-grade by all four observers (Additional file 1: Figure S1).

The number of true positives (CIN2+) using ASC-H+ as a cut-off varied from 22 to 30 (mean 24.8) (Additional file 1: Figure S2 and Table 5). The corresponding sensitivity for CIN2+ varied from 68.8% to 93.8% (mean 77.4%). One pathologist (P2) had significantly higher sensitivity than the average of the four pathologists (p<0.05) (Table 5). Of 32 women with CIN2+, 15 (46.9%) were considered high-grade by all four observers (Additional file 1: Figure S2). One woman with CIN2 was not considered to have high-grade cytology by any of the four observers (patient 57, Additional file 1: Table S3).

The number of true negatives (CIN1-) using LSIL- as a cut-off varied from 48 to 65 (mean 55.3). The corresponding specificity ranged from 70.6% to 95.6% (mean 81.3%) (Table 5). One pathologist (P2) had significantly lower specificity and one pathologist (P4) had significantly higher specificity than the average of the four pathologists (p<0.05) (Table 5). The pathologist with the highest sensitivity for CIN2+ (P2) also had the highest false positive rate and the lowest specificity (Table 5). The accuracy for CIN2+ varied from 74.1% to 83.8% (mean 79.4%), with no statistically significant differences between pathologists (Table 5).

The Pap smear from the woman with cervical cancer (SCC) was diagnosed as high-grade (ASC-H+) by one of the four pathologists (P2), while the other three diagnosed it as ASC-US (Additional file 1: Table S5). The woman had a positive HPV test for HPV type 16 (data not shown).
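The sensitivity and specificity ranges reported above follow directly from the true positive and true negative counts. The short Python sketch below recomputes them as an arithmetic check. It assumes that the 68 women without histologically confirmed CIN2+ (100 minus 32) form the CIN1- denominator, which is inferred from the text rather than stated explicitly. The function and variable names are illustrative only.

```python
# Counts taken from the text: 32 of the 100 women had CIN2+.
# The CIN1- denominator of 68 is inferred as 100 - 32 (an assumption of this sketch).
N_TOTAL = 100
N_CIN2_PLUS = 32
N_CIN1_MINUS = N_TOTAL - N_CIN2_PLUS


def sensitivity(true_positives: int, n_diseased: int = N_CIN2_PLUS) -> float:
    """Share of women with CIN2+ whose cytology was called high-grade (ASC-H+)."""
    return true_positives / n_diseased


def specificity(true_negatives: int, n_healthy: int = N_CIN1_MINUS) -> float:
    """Share of women without CIN2+ whose cytology was called LSIL or lower."""
    return true_negatives / n_healthy


# Extremes of the reported ranges: 22 to 30 true positives, 48 to 65 true negatives.
for tp in (22, 30):
    print(f"TP = {tp}: sensitivity = {100 * sensitivity(tp):.1f}%")
for tn in (48, 65):
    print(f"TN = {tn}: specificity = {100 * specificity(tn):.1f}%")

# Consistency check for P2, who had the highest sensitivity and lowest specificity:
# true positives plus false positives should equal the number of ASC-H+ calls.
p2_tp = 30
p2_tn = 48
p2_fp = N_CIN1_MINUS - p2_tn
print(f"P2 ASC-H+ calls implied by TP and TN: {p2_tp + p2_fp}")
```

For P2 the implied number of high-grade calls is 30 true positives plus 20 false positives, which agrees with the 50 ASC-H+ diagnoses reported for that pathologist.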

Table 5 True positive, true negative, sensitivity and specificity for CIN2+ per pathologist using ASC-H+ as cut-off

Discussion

The study’s purpose was to investigate the accuracy of cytology diagnoses by four different pathologists at three hospitals using 100 Pap smears with different cytological diagnoses screened at UNN. The agreement of the cytological diagnoses between the four pathologists in this study was “moderate.” A moderate agreement is better than “fair,” but worse than “substantial.” The kappa values did not differ significantly from one another.

In Norway there are 17 cytology laboratories covering a population of 5 million people [5]. All the laboratories receive most of their samples from general practitioners in primary screening. The population of Norway is quite homogeneous, with little difference between women in different parts of the country. The differences between the various laboratories are therefore probably caused by differing interpretations of the Bethesda criteria. Two pathologists (P4 and P5) were from the same laboratory but still gave very different diagnoses for the same patients.

In the ATHENA study, the sensitivity of cytology varied from 42.0% to 73.0% [12]. In our study, the sensitivity for CIN2+ varied from 68.8% to 93.8%, but all the smears were first screened at the same hospital, and abnormal cells were marked on the slides. Abnormal cells are easier to find when they have already been marked. In a population with a given prevalence of CIN2+, the sensitivity of cytology depends on the detection rate. In the ATHENA study, the positivity rate of cytology in primary screening varied from 3.8% to 9.9%, while the detection rate of the HPV DNA test (Cobas 4800) varied from 10.9% to 13.4% [12]. In our study, the detection rate of high-grade cytology (ASC-H / HSIL) varied from 26.0% to 50.0%, while the detection rate of the HPV DNA test (Cobas 4800) was 74.3% (52/70).

In our study, the accuracy varied from 74.1% to 83.8% (mean 79.4%). In five published studies the accuracy varied from 64.2% to 78.4% (mean 76.1%) (Table 6). There was less variation between the four pathologists in our study than between the five published studies. The mean accuracy of the four pathologists in our study was significantly higher than the mean of the five published studies (79.4% vs 76.1%, p<0.05).

Table 6 True positive, true negative, sensitivity and specificity for CIN2+ in different studies using ASC-US+ as cut-off

There is a trade-off between sensitivity and specificity in cervical cancer screening. In our study, the pathologist with the highest sensitivity for CIN2+ also had the lowest specificity, and both values differed significantly from the average. In general, laboratories with a high detection rate in cytology also have higher sensitivity for CIN2+. If the sensitivity is higher, the hospital detects more women with CIN2/3 who can be treated, and fewer women develop cervical cancer before the next screening round. When women with low-grade cytology (ASC-US / LSIL) are triaged with an HPV test, a high detection rate of low-grade cytology should not be considered a major problem. A woman with a false positive ASC-US will have a negative HPV test and does not need follow-up. By contrast, a false negative “Normal” cytology gives no indication for HPV testing, according to Norwegian guidelines (www.kreftregisteret.no).

Cytology is subjective, with poorly reproducible criteria. HPV testing is more objective, with strictly defined criteria. Co-testing with both cytology and an HPV test may reduce the risk of false negative cytology when the pathologists take the HPV result into consideration when evaluating the cytological slide. In our study, only the observer at UNN (P1, reference) knew the HPV result. All other observers were blinded to clinical information and HPV result, which might explain the lower sensitivity for CIN2+ for some of the other pathologists. Originally, in the ATHENA study, cytology was reviewed blinded to HPV status. When the same slides were re-reviewed unblinded to HPV status, the sensitivity for CIN3+ of co-testing increased from 54.1% to 62.4% (P = 0.0015) [13]. In our study, the mean sensitivity for CIN2+ for the four external pathologists was 77.4% based on slides screened at the same hospital.

The present study also has other weaknesses. For P1 the diagnoses were set in normal routine work, while the other four pathologists had to diagnose the cases in addition to their normal workload. This might affect the interpretation. In addition, only P1 at UNN had access to the initial diagnoses suggested by the cytotechnician. In daily practice the pathologist usually compares his or her initial impression with the diagnosis suggested by the cytotechnician. If there is a discrepancy, the slide is reviewed. This might explain the lower sensitivity of some of the other pathologists. In normal routine work, difficult cases will be discussed with other pathologists. In this study, the pathologists reviewed all the slides alone.

Out of the 100 women in this study, there was one woman with cervical cancer. Three of the four pathologists diagnosed her cytology as ASC-US. According to Norwegian guidelines, women with ASC-US and a positive HPV result should be followed up with a new cytology and HPV test after 6–12 months. Only women with persistent HPV infection should be referred to a gynecologist for colposcopy and biopsy (www.kreftregisteret.no). For a woman with cancer, this may delay diagnosis and treatment and worsen the prognosis.

There were statistically significant differences in sensitivity and specificity (p<0.05) for CIN2+ between the observers, but not in accuracy. In a low resource setting, specificity is important to reduce the colposcopy workload. In a high resource setting like Norway, sensitivity is more important to reduce the number of cervical cancer cases. The specificity of cytology can be improved by using an HPV test to triage ASC-US / LSIL. The costs of a high number of HPV tests are of minor importance in a high resource setting. In the USA, co-testing (cytology and HPV test) every five years is recommended for women 30–60 years of age [10, 14].

Conclusions

Cervical cancer screening based on cytology has limited accuracy. The study revealed moderate agreement between the observers, along with a trade-off between sensitivity and specificity. This might indicate that hospitals with a high detection rate in cervical cytology have higher sensitivity for CIN2+, but lower specificity.