Introduction

The common symptoms of ovarian cancer are vague and similar to those observed in benign conditions [13], so most patients are diagnosed at advanced stages. This explains why ovarian cancer is the fifth most common cause of cancer death in women [14]. Despite advances in treatment, there has been little change in the mortality rate of ovarian cancer [13]. A diagnostic approach based on CA 125 in association with ultrasonography has been suggested for the early diagnosis of ovarian cancer [2, 4–14]. However, this approach has several drawbacks, including low sensitivity and specificity [4–16]. Abnormal CA 125 serum levels can be found in malignancies of different origin, including epithelial (endometrial, endocervical and lung cancer) and non-epithelial malignancies (lymphomas) [4–6, 13–21]. Abnormal CA 125 serum levels may also be found in several benign diseases, mainly those with effusions, liver or renal failure and benign gynaecological conditions (ovarian cysts, myomas and endometriosis) [4, 6, 13–22]. The sensitivity of CA 125 in ovarian cancer is related to tumour stage, with abnormal CA 125 serum levels in approximately 50% of stage I patients and 80–90% of patients in stages III–IV [2, 4–6, 11–16].

Recently, another tumour marker for ovarian cancer has been proposed: the HE4 protein, which is frequently overexpressed in ovarian cancers, especially those of serous and endometrioid histology [23–29]. However, HE4 is not specific to ovarian cancer, and some expression has also been found in other malignancies, mainly pulmonary and endometrial adenocarcinomas [30, 31]. Our group recently reported that HE4 was more specific than CA 125 in benign and malignant conditions [31]. HE4 serum levels may be abnormal mainly in patients with renal failure or effusions and in patients with lung carcinomas. Studies suggest that HE4 has a sensitivity similar to that of CA 125 but an increased specificity in patients with gynaecological malignancies as compared with those with benign gynaecological disease [31–37]. Likewise, different studies propose the use of a Risk of Ovarian Malignancy Algorithm (ROMA) to improve the sensitivity and specificity of the combined use of both tumour markers in patients with abdominal masses [9, 28, 32–36].

The aims of this study were:

  1. To evaluate the HE4 and CA 125 serum levels in healthy subjects and in patients with benign and malignant gynaecological diseases

  2. To compare the utility of the three parameters, HE4, CA 125 and ROMA, for risk stratification and diagnostic purpose in patients with gynaecological diseases.

Material and methods

Patient population

We determined HE4 serum levels in 66 healthy women (20–91 years, median 49 ± SE 2.2 years; 34 premenopausal, 32 postmenopausal), 285 patients with benign gynaecological diseases (17–90 years, median 40 ± SE 0.8 years), 143 patients with active gynaecological cancer (23–87 years, median 61 ± SE 1.2 years) and 33 patients without active disease (NED) after radical treatment (23 adenocarcinomas of endometrium or endocervix, six squamous cervical cancers and four ovarian cancers). The group with benign diseases included 137 patients with ovarian cysts, 56 patients with myomas, 68 patients with endometriosis, 14 patients with endometrial polyps and ten patients with other diseases. Fifty-nine of these patients were postmenopausal (median 61, range 48–90 years old), and the remaining 226 patients were premenopausal (median 37.5, range 17–51 years old).

The group with malignant diseases, classified according to the International Federation of Gynaecology and Obstetrics [38], included 143 patients with active cancer and without renal failure or creatinine serum levels >1.5 mg/dl, including 111 ovarian cancers (19 in stages I–II, 48 in stage III and 44 in stage IV), 17 endometrial cancers (4 in stages I–II, 13 in stages III–IV), 7 endocervical cancers (one in stages I–II, six in stages III–IV) and 8 squamous cell carcinomas of the cervix (one in stages I–II, seven in stages III–IV).

Laboratory methods

The serum levels of CA 125 and HE4 were determined using a chemiluminescent enzyme immunoassay on the Architect® Analyzer (Abbott Laboratories, Chicago, IL). The inter-assay precision of CA 125 was 2.85% (35.6 mU/mL), 2.5% (268.3 mU/mL) and 1.96% (623 mU/mL). The inter-assay precision of HE4 was 3.5% (49.7 pmol/L), 3.6% (168.1 pmol/L) and 3.8% (648.2 pmol/L). We considered 35 U/mL and 150 pmol/L as the upper limits of normality for CA 125 and HE4, respectively. The ROMA algorithm was calculated according to the formulae described previously [9, 33] using logistic regression analysis for premenopausal women (predictive index: \( -12 + 2.38 \times \text{LN}(\text{HE4}) + 0.0626 \times \text{LN}(\text{CA 125}) \)) and postmenopausal women (predictive index: \( -8.09 + 1.04 \times \text{LN}(\text{HE4}) + 0.732 \times \text{LN}(\text{CA 125}) \)). We considered PI results ≥13.1 and ≥27.7 as positive (risk) for premenopausal and postmenopausal women, respectively [9, 33]. This protocol was approved by the ethical committee of Hospital Clinic.
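As an illustration of the calculation described above, the following Python sketch evaluates the two predictive index formulae and applies the stated cut-offs. The coefficients and cut-offs are those given in the text. The logistic transform of the predictive index into a percentage, and the reading of the 13.1 and 27.7 cut-offs as percentages on that scale, are assumptions based on how the ROMA score is usually defined; the function names are hypothetical.

```python
import math


def predictive_index(he4_pmol_l, ca125_u_ml, postmenopausal):
    """Predictive index (PI) with the coefficients quoted in the text."""
    if he4_pmol_l <= 0 or ca125_u_ml <= 0:
        raise ValueError("marker concentrations must be positive")
    ln_he4 = math.log(he4_pmol_l)      # natural logarithm (LN)
    ln_ca125 = math.log(ca125_u_ml)
    if postmenopausal:
        return -8.09 + 1.04 * ln_he4 + 0.732 * ln_ca125
    return -12.0 + 2.38 * ln_he4 + 0.0626 * ln_ca125


def roma_percent(pi):
    """Assumption: predicted probability (%) via the logistic transform of PI."""
    return 100.0 * math.exp(pi) / (1.0 + math.exp(pi))


def is_high_risk(he4_pmol_l, ca125_u_ml, postmenopausal):
    """Apply the cut-offs from the text (assumed to be on the percentage scale)."""
    cutoff = 27.7 if postmenopausal else 13.1
    pi = predictive_index(he4_pmol_l, ca125_u_ml, postmenopausal)
    return roma_percent(pi) >= cutoff


if __name__ == "__main__":
    # Hypothetical premenopausal patient with both markers at the upper limits.
    print(is_high_risk(150.0, 35.0, postmenopausal=False))
```

The separate `predictive_index` and `roma_percent` steps keep the published coefficients apart from the assumed transform, so that either can be replaced if a laboratory uses different formulae or cut-offs.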

Statistical analysis

Values were reported as median (range). All data were analysed with SPSS statistical software (version 14.0; SPSS Inc., Chicago, IL). Tumour marker levels between groups were compared using the Kruskal–Wallis and Mann–Whitney tests. The level of statistical significance was set at p < 0.05. Sensitivity was calculated as the number of patients with malignancy and elevated marker levels divided by the total number of patients with malignancy. Specificity was calculated as the number of patients without malignancy and with normal tumour marker values divided by the total number of patients without malignancy. The positive predictive value was calculated as the number of patients with elevated tumour markers and malignancy divided by the total number of patients with elevated tumour markers. The negative predictive value was calculated as the number of patients with negative results and without malignancy divided by the total number of patients with negative results. The efficacy value was calculated as the number of patients with cancer and positive results plus the number of patients without cancer and negative results, divided by the total number of patients studied. Youden indices (J) were calculated according to the equation \( J = \text{sensitivity} + \text{specificity} - 1 \). Receiver operating characteristic (ROC) curves were assessed for HE4 and CA 125 (cut-offs: 150 pmol/L and 35 U/mL, respectively) and compared using the DeLong method [39].
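The definitions above can be written as a small Python sketch operating on the four cells of a 2 × 2 table. The function and key names are hypothetical. The example counts are inferred from figures reported in the Results for HE4 at 150 pmol/L (88 of 111 ovarian cancers and 3 of 285 benign patients with abnormal values) and should be read as illustrative rather than as a reproduction of Table 3.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Diagnostic indices from true/false positives and negatives."""
    sensitivity = tp / (tp + fn)                 # elevated marker among malignant
    specificity = tn / (tn + fp)                 # normal marker among non-malignant
    ppv = tp / (tp + fp)                         # malignant among elevated marker
    npv = tn / (tn + fn)                         # non-malignant among normal marker
    efficacy = (tp + tn) / (tp + fp + fn + tn)   # all correct classifications
    youden = sensitivity + specificity - 1       # J = sensitivity + specificity - 1
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "efficacy": efficacy,
        "youden": youden,
    }


if __name__ == "__main__":
    # Illustrative HE4 counts: 88 true positives, 3 false positives,
    # 23 false negatives (111 - 88) and 282 true negatives (285 - 3).
    for name, value in diagnostic_metrics(88, 3, 23, 282).items():
        print(f"{name}: {value:.3f}")
```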

Results

The results of CA 125 and HE4 measurements in the different populations studied are shown in Table 1. Significantly higher CA 125 serum levels were found in premenopausal women than in postmenopausal women (p = 0.001). For HE4, higher concentrations were found in postmenopausal women, but the difference was not statistically significant.

Table 1 HE4 and CA 125 serum levels in the different studied populations

In patients with benign diseases, abnormal serum levels of HE4 and CA 125 were found in 1.1% (3 of 285) and 30.2% (86 of 285) of patients, respectively. CA 125 false positive results were mainly found in premenopausal women, with abnormal serum levels in 32.3% of premenopausal patients with benign gynaecological diseases compared with 22% of postmenopausal patients. The use of the ROMA algorithm reduced the proportion of CA 125 false positive results in the total group as well as in pre- and postmenopausal women (Table 1).

For CA 125, the most common cause of elevated results was endometriosis, with abnormal values in 30 of 68 studied patients (44.1%). However, only 10.3% of the patients with endometriosis were classified as at risk by the ROMA algorithm.

Significantly higher serum concentrations of CA 125 and HE4 (p = 0.005) were found in patients with cancer than in those with benign diseases (Table 1). Both tumour markers showed a similar sensitivity in ovarian cancer, slightly higher with CA 125 (82.9%) than with HE4 (79.3%). Likewise, significantly higher concentrations of HE4 were found in patients with ovarian cancer than in those with other gynaecological malignancies. CA 125 was frequently abnormal in gynaecological adenocarcinomas and in 25% of squamous cell tumours; a trend towards higher concentrations in ovarian carcinoma was found, but the differences were not statistically significant. The ROMA algorithm was not related to the tumour origin: results above the cut-off suggested high risk in the majority of ovarian cancers and adenocarcinomas as well as in a high proportion of patients with squamous tumours.

Table 2 shows the CA 125 and HE4 serum levels in patients with ovarian cancer, subdivided according to tumour stage and histological type. Both tumour markers were clearly related to stage with significantly higher concentrations in advanced stages III–IV than in stages I–II (p = 0.001). However, no differences were found between stages III and IV with either tumour marker. Likewise, the use of both tumour markers together improved the sensitivity obtained with only one tumour marker in all the stages (Table 2). Risk established by the ROMA algorithm showed a higher sensitivity than either tumour marker individually and slightly lower than the sensitivity obtained with the combined use of both tumour markers (either positive) in all stages (Table 2).

Table 2 HE4 and CA 125 serum levels in patients with ovarian cancer, subdivided according to tumour stage and histological type

HE4 and CA 125 were related to the histological type, with significantly higher serum concentrations of CA 125 (p = 0.024) and HE4 (p = 0.001) in serous papillary ovarian cancer. No differences in the serum levels of these tumour markers were found between mucinous tumours and other non-serous papillary histologies. CA 125 had a higher sensitivity than HE4 in non-serous histologies, mainly in mucinous tumours (Table 2).

Figure 1 shows the ROC curves evaluating the utility of HE4 and CA 125 serum levels and the ROMA algorithm in the diagnosis of ovarian malignancy, comparing patients with ovarian cancer and those with benign gynaecological diseases. The ROMA algorithm showed a slightly higher area under the curve than HE4, and both were significantly higher than that of CA 125.
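For readers less familiar with the area under the ROC curve, the sketch below computes it as the proportion of cancer/benign pairs in which the cancer patient has the higher marker value, with ties counted as one half. This rank-based estimate equals the trapezoidal area under the empirical ROC curve. It is not the SPSS or DeLong procedure used in this study, the function name is hypothetical, and the marker values are invented for illustration only.

```python
def auc_by_pairs(cancer_values, benign_values):
    """Empirical AUC: fraction of (cancer, benign) pairs ranked correctly."""
    if not cancer_values or not benign_values:
        raise ValueError("both groups must contain at least one value")
    score = 0.0
    for c in cancer_values:
        for b in benign_values:
            if c > b:
                score += 1.0    # cancer patient has the higher marker level
            elif c == b:
                score += 0.5    # ties count as half a correct ranking
    return score / (len(cancer_values) * len(benign_values))


if __name__ == "__main__":
    # Hypothetical HE4 values in pmol/L, not taken from the study data.
    cancer = [420.0, 95.0, 310.0, 160.0]
    benign = [45.0, 60.0, 110.0, 52.0]
    print(auc_by_pairs(cancer, benign))
```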

Fig. 1 HE4, CA 125 and ROMA in the differential diagnosis of ovarian cancer and benign gynaecological diseases

Table 3 shows the efficacy of HE4, CA 125 and ROMA in the differential diagnosis between ovarian cancer and benign gynaecological conditions. HE4 showed the highest specificity and positive predictive value (PPV) in the total group as well as in premenopausal and postmenopausal women. By contrast, ROMA showed the highest sensitivity and negative predictive value (NPV) in all groups, mainly in postmenopausal women. HE4 had the highest efficacy in the total group and in premenopausal women, and ROMA in postmenopausal women.

Table 3 Evaluation of HE4, CA 125 and ROMA efficacy in the determination of the risk of ovarian cancer (including only patients with benign gynaecological diseases and ovarian cancer)

The PPV was 96.7% (88/91) in patients with abnormal HE4 and 97.8% (88/90) using the ROMA algorithm in this subgroup of patients. Sensitivity was the same (79.3%), and only one patient (1.1%), diagnosed with an ovarian cyst, had abnormal HE4 and no risk according to the ROMA algorithm. Table 4 shows the accuracy of the risk of ovarian cancer according to CA 125 serum levels and ROMA, subdividing the patients according to HE4 serum levels. It is worth noting that among patients with HE4 in the normal range, a significant number of correctly detected cancers was found only in those with abnormal CA 125 and risk according to the ROMA algorithm: 54.5% in the total group, 37.5% in premenopausal women and 64.3% in postmenopausal women. Likewise, there was a group of 23 patients (8% of the benign studied population) with risk according to the ROMA algorithm and with normal values of both tumour markers (Table 4). None of these patients had ovarian cancer.

Table 4 Evaluation of CA 125 and ROMA efficiency in the diagnosis of ovarian cancer (excluding other gynaecological malignancies) according to HE4 results (> or <150 pmol/L)

Discussion

Different authors have suggested the use of serial CA 125 measurements in combination with ultrasonography in asymptomatic postmenopausal women as an aid in the early diagnosis of ovarian cancer [2, 4–14]. These studies are mainly performed in postmenopausal women because of the lack of specificity in premenopausal women, in whom abnormal levels occur in different benign diseases, mainly endometriosis. Different studies reported a positive predictive value in the diagnosis of ovarian cancer in asymptomatic women ranging from 10% to 21% [2, 4–8, 12–16]. The major drawback of using CA 125 as an initial step in such a screening strategy is that up to 20% of ovarian cancers lack expression of the antigen [2, 4–8, 12–16]. It is therefore necessary to combine CA 125 with new tumour markers that provide better diagnostic efficiency [4, 6, 8, 13, 15, 16, 28, 33–36].

Specificity is a significant issue for CA 125. Abnormal serum levels of CA 125 may be found in several benign and malignant diseases other than ovarian cancer [2, 4–6, 13–22]. Despite these issues, CA 125 is used to differentiate benign from malignant pelvic masses and is used as a prognostic factor, in the early diagnosis of recurrence or to assess response to treatment [11–13, 15, 16, 32].

HE4 belongs to a family of protease inhibitors that function in protective immunity and is overexpressed in ovarian cancers, especially in the serous and endometrioid histotypes [26–28, 31, 40, 41]. It is secreted by the cell and is then detectable by enzyme immunoassay in the bloodstream of patients with ovarian carcinoma. Preliminary studies of HE4 reported a higher specificity than CA 125 in different benign and malignant conditions, excluding renal failure [31]. Patients with renal failure had very high HE4 serum levels, indistinguishable from those found in ovarian cancer. For this reason, patients with renal failure were excluded from our study. Apart from renal failure, slightly elevated HE4 serum levels were found in only one third of patients with effusions and in 5% of patients with chronic liver diseases [31].

Most published studies of serum HE4 indicate that its sensitivity and specificity in gynaecological diseases are better than those of CA 125 and that both tumour markers are complementary [9, 10, 32–36, 40, 41]. Our results confirm these previous studies and clearly show that HE4 may be important in the differential diagnosis between ovarian cancer and other gynaecological conditions, including in premenopausal women [9, 10, 32–36, 40, 41]. However, neither HE4 nor CA 125 is elevated only in ovarian carcinoma; abnormal levels may also be found in gynaecological adenocarcinomas or in lung cancer, as previously reported. Moore et al. [28, 33] have previously indicated that HE4 expression in patients with endometrial cancer had a clear relationship to tumour stage. In our experience, HE4 is more useful in the differential diagnosis of ovarian cancer: abnormal levels were found in only one third of the patients with endometrial or endocervical cancer and in none with squamous cervical cancer. By contrast, CA 125 is frequently abnormal in all these malignancies.

To improve sensitivity and specificity, an algorithm that uses both tumour markers, ROMA, has been suggested to indicate the ovarian cancer risk in patients with abdominal masses. The ROC curve shows that the ROMA algorithm is the parameter with the highest discrimination between ovarian cancer and benign gynaecological diseases. The smaller difference in the area under the ROC curve between HE4 and ROMA compared with other studies might be explained by the fact that in our population, the group of premenopausal women was much larger than the group of postmenopausal women. Due to the lower specificity of CA 125 in this group of patients, the superiority of ROMA is diminished. Further optimisation of the ROMA cut-offs might require investigation in a multicentre setting assessing different populations. Also, our results clearly show that the ROMA algorithm improves on the specificity of CA 125 but is less specific than HE4. Twelve percent of patients without malignancy were classified as at risk by ROMA, in contrast to the low proportion of false positive results found with HE4 (1.1%). Likewise, the ROMA algorithm is not useful for establishing the tumour origin, as it has a high sensitivity of 90.1% for ovarian cancer but also of 94.1% for endometrial cancer and higher than 60% for cervical cancer.

HE4 is a tumour marker with higher efficacy than CA 125. The positive predictive value of HE4 is high: 96.7% in the total group, 100% in premenopausal women and 96% in postmenopausal women. However, the main problem with using HE4 alone as an aid in the diagnosis of pelvic masses is that its sensitivity in ovarian cancer is insufficient, with 20.7% false negative results [4, 14, 15, 33–36]. It is of note that HE4 showed a higher sensitivity than CA 125 in early stages, whereas CA 125 was the most sensitive tumour marker in advanced stages. CA 125 can be used together with HE4, but the problem is the high proportion of false positive results (PPV 52.6%), mainly in premenopausal women (PPV 22.2%). The main advantage of the ROMA algorithm is its sensitivity: it classified 90.1% of patients with ovarian cancer as at risk, 7.2% more than CA 125 and 10.8% more than HE4. This increase in sensitivity is independent of menopausal status (Table 3). However, the false positive results associated with CA 125 decrease the ROMA PPV, mainly in premenopausal women. These data explain why ROMA is mainly useful (higher efficacy and Youden indices) in postmenopausal women. It is nevertheless interesting to note that although ROMA had a low PPV in premenopausal women, its NPV was high, indicating that it is very unusual to find ovarian cancer when ROMA or HE4 is negative. These data are interesting if we consider that the ovarian cancer prevalence in our study was 10.7% (27/253 patients) in the premenopausal group and 58.7% (84/143 patients) in the postmenopausal group (Table 3). These results agree with the recent publication of Montagnana et al. [41], who reported that ROMA is mainly useful in postmenopausal women. Likewise, Montagnana et al. [41] reported that HE4 is the most efficient tumour marker, with no clear advantage from adding CA 125, as we found in our study. More studies are necessary to clarify the HE4 cut-off as well as the best cut-offs for ROMA.
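The dependence of predictive values on prevalence noted above can be illustrated with a short Python sketch based on Bayes' rule. It assumes, purely for illustration, that the pooled ROMA sensitivity (90.1%) and specificity (88%, from the 12% of non-malignant patients classified as at risk) apply unchanged to both menopausal groups. The stratum-specific values in Table 3 differ, so the output does not reproduce the study results. The function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    # Assumption: pooled ROMA performance applied to both groups.
    sens, spec = 0.901, 0.88
    groups = {
        "premenopausal": 27 / 253,    # prevalence 10.7%
        "postmenopausal": 84 / 143,   # prevalence 58.7%
    }
    for name, prevalence in groups.items():
        ppv, npv = predictive_values(sens, spec, prevalence)
        print(f"{name}: PPV {ppv:.3f}, NPV {npv:.3f}")
```

With identical sensitivity and specificity, the lower prevalence in the premenopausal group alone yields a lower PPV and a higher NPV, which is the pattern described in the text.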

The results obtained in our study can be summarised by the fact that HE4 alone is useful in supporting the diagnosis of ovarian masses. In our population, adding CA 125 and ROMA to increase sensitivity is not optimal in patients with abnormal HE4, because the high specificity of HE4 already indicates a high risk of ovarian cancer. The PPV of HE4 is 96.7% and increases to 97.8% when ROMA is also abnormal. In summary, the use of the ROMA algorithm in HE4-positive patients does not increase sensitivity but only increases the PPV by 3.2% (three patients). These results disagree with those published by van Gorp et al. [40], which suggested that ROMA and HE4 did not increase the detection rate of malignant diseases compared to CA 125. Their results are surprising in relation to our data and other publications, because there is agreement that HE4 is more specific than CA 125, that its sensitivity is at least similar, and that the rate of CA 125 false positive results in patients with gynaecological diseases is high, mainly in premenopausal women [9, 31–35, 41].

An optimised application of the ROMA algorithm may improve the specificity of CA 125 in patients with HE4-negative results. Among these patients, 44.4% of those with abnormal CA 125 were classified as at risk by the ROMA algorithm, which added 12 patients with ovarian cancer (10.8%) to those detected by HE4. By contrast, 23 patients (8% of the benign studied population) were classified as at risk by the ROMA algorithm although both markers used in the calculation were negative. None of these patients had ovarian cancer. The ROMA algorithm was developed to optimise the surgical referral of patients with abdominal masses to either oncologists or non-oncologists. These 23 patients with risk by ROMA and negative results for both tumour markers had benign gynaecological diseases, the majority of them abdominal masses or endometriosis. However, the risk of missing early-stage cancers in this group needs to be determined in larger populations.

Our results indicate that the best algorithm to suggest ovarian cancer risk may be to classify patients with HE4-positive results as high risk and to determine the risk by the ROMA algorithm for patients with normal HE4 but abnormal CA 125 values. These criteria allow us to obtain a high sensitivity (90.1%, 100/111 ovarian cancers) and to increase the specificity, with a lower proportion of false positive results (PPV 82.6%, 100/121 patients). Only 11 patients with ovarian cancer were missed using these criteria, and six of them had mucinous tumours. In the assessment of histological types, HE4 and CA 125 show their lowest sensitivity and median concentrations in mucinous tumours [26, 28, 31, 40, 41]. Using this approach, we were able to suggest the diagnosis in 94.9% (93/98) of patients with non-mucinous ovarian cancer. Other markers or approaches, for example CA 19.9, may be used when mucinous tumours are suspected [11, 16].
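The two-step rule proposed above can be sketched in Python as follows. The cut-offs (150 pmol/L, 35 U/mL, 13.1 and 27.7) and the predictive index coefficients are those given in the Laboratory methods. The logistic transform to a percentage, the treatment of values strictly above the upper limit of normality as abnormal, and all names are assumptions made for this illustration; it is not a validated clinical tool.

```python
import math

HE4_UPPER_LIMIT = 150.0     # pmol/L, upper limit of normality from the text
CA125_UPPER_LIMIT = 35.0    # U/mL, upper limit of normality from the text


def roma_percent(he4, ca125, postmenopausal):
    """ROMA score (%); the logistic transform of the PI is an assumption."""
    if postmenopausal:
        pi = -8.09 + 1.04 * math.log(he4) + 0.732 * math.log(ca125)
    else:
        pi = -12.0 + 2.38 * math.log(he4) + 0.0626 * math.log(ca125)
    return 100.0 * math.exp(pi) / (1.0 + math.exp(pi))


def classify(he4, ca125, postmenopausal):
    """Two-step rule: HE4 first, ROMA only if HE4 is normal and CA 125 abnormal."""
    if he4 > HE4_UPPER_LIMIT:
        return "high risk"                       # step 1: abnormal HE4
    if ca125 > CA125_UPPER_LIMIT:
        cutoff = 27.7 if postmenopausal else 13.1
        if roma_percent(he4, ca125, postmenopausal) >= cutoff:
            return "high risk"                   # step 2: ROMA above cut-off
    return "low risk"                            # both normal, or ROMA below cut-off


if __name__ == "__main__":
    # Hypothetical postmenopausal patient: normal HE4, markedly raised CA 125.
    print(classify(100.0, 500.0, postmenopausal=True))
```

Restricting ROMA to the HE4-normal, CA 125-abnormal group means that patients with both markers normal are never classified as at risk, which avoids the 23 false positive classifications discussed earlier.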

In conclusion, HE4 is the tumour marker of choice in ovarian cancer, with higher sensitivity in early stages and higher specificity and efficiency than CA 125 and the ROMA algorithm. Our data indicate a potential to improve this algorithm, mainly by using it in patients with HE4-negative and CA 125-positive results. The use of this combination increases the utility of tumour markers in the diagnosis of pelvic masses, with a sensitivity of 90.1% (95% in non-mucinous tumours) and a specificity of 82.1%.