Introduction

Patients with advanced-stage malignant melanoma (MM) face a poor outlook, with 5-year survival rates ranging from 86% for stage IIIa patients to around 15% for stage IV patients [1, 2].

To improve these survival rates, adjuvant immune checkpoint inhibitor treatment has been introduced for high-risk patients [3]. Both ipilimumab (CTLA-4 inhibitor) and nivolumab/pembrolizumab (PD-1 inhibitors) have been shown to reduce the risk of recurrence when used as adjuvant treatment in high-risk MM patients [4, 5].

While improving prognosis, treatment with PD-1 inhibitors comes with a risk of serious immune-related adverse events [6]. Furthermore, these new treatments are expensive [7]. To mitigate both of these costs as much as possible, it is necessary to closely monitor the patients during treatment, to detect MM recurrence as well as adverse effects as early as possible.

Positron emission tomography/computed tomography (PET/CT) with 2-deoxy-2-[18F]fluoro-d-glucose (FDG) is frequently used to detect hypermetabolic cells such as cancer cells. Studies have shown the high accuracy of FDG-PET/CT in diagnosing active malignant disease in MM [8], with a sensitivity for detecting MM recurrence of 86% and a specificity of 91% [9].

Treatment with PD-1 inhibitors leads to a high rate of inflammatory changes, and some of these cannot be discriminated from MM recurrence on FDG-PET/CT scans [10]. Treatment monitoring using FDG-PET/CT, therefore, has been associated with challenges of distinguishing disease recurrence from inflammatory side effects.

In Denmark, high-risk patients (stage III-IV MM) with resected MM are offered treatment with adjuvant nivolumab, a PD-1 inhibitor [3, 11, 12]. As part of the follow-up program, patients are recommended to undergo an FDG-PET/CT scan every 3 months during the first year after treatment has been initiated [13].

While the use of FDG-PET/CT in diagnosing MM recurrence and the value of nivolumab in adjuvant treatment have been described [5, 8], the diagnostic accuracy and optimal frequency of FDG-PET/CT in patients treated with adjuvant nivolumab have, to the best of our knowledge, not yet been examined [3, 14].

There is at present little consensus on the frequency of follow-up examinations and imaging modalities in these high-risk MM patients [3]. In this study, we report the incidence of recurrence in patients in the Danish follow-up program after macroscopically radical surgery during adjuvant immunotherapy. The study aims to evaluate the sensitivity, specificity, and predictive values of FDG-PET/CT in the first year of the follow-up program. Secondarily, the study aims to evaluate the clinical impact of the follow-up program through the number of further diagnostics resulting from the FDG-PET/CT scans.

Methods

We conducted a register-based, retrospective, longitudinal, diagnostic accuracy study including patients with high-risk MM who received adjuvant immunotherapy at Odense University Hospital in the Region of Southern Denmark between November 2018 and February 2021. Patients were monitored with FDG-PET/CT scans according to the Danish follow-up program and were identified from the Danish Metastatic Melanoma Database (DAMMED) [15], which contains data on stage III and IV MM patients in Denmark. Data was collected from DAMMED on the 8th of February 2021.

Ethics

This was a non-interventional retrospective cohort study that had no impact on the patients’ treatment or outcome. The enrolled patients gave written consent to participate in research relevant to MM upon being registered in DAMMED. Of the patients eligible for inclusion in DAMMED, around 95% were included [15].

Patients

Patients were included in the study based on having undergone radical (microscopic/R0 or macroscopic/R1) resection for high-risk MM (stage III or IV), having received at least one dose of adjuvant PD-1 inhibitor (nivolumab or pembrolizumab), and being followed for at least one control FDG-PET/CT scan. Patients were followed until recurrence, end time of data collection, or having completed one year of follow-up after the first immunotherapeutic treatment.

Due to side effects, not all patients received all 13 immunotherapeutic treatments. All patients were included in the study on an intention-to-treat basis. Further statistical information is available in Table 1.

Table 1 - Baseline characteristics

All patients with macroscopically discovered disease received a preoperative FDG-PET/CT scan to rule out metastatic disease. Patients found to have inoperable disease were allocated to a medical treatment protocol and not included in this study.

The date of operation was collected from DAMMED. Data regarding scan results, biopsy results, and patient outcomes were collected using the Region of Southern Denmark’s electronic patient record and imaging systems. Study data were collected and managed using REDCap electronic data capture tools hosted at OPEN, Denmark [16, 17]. Baseline data and treatment data were collected from both sources and cross-referenced. Any data collected by one investigator were cross-referenced by another (JASA and ADS). Disagreements were settled by consensus. In the absence of consensus, disagreements were settled by consulting a nuclear medicine specialist (MHV or PG).

Scan report interpretation

The scan reports were written prospectively, at the time the scans were performed in the clinic, by a nuclear medicine physician and a radiologist. The reported findings were later graded on a 5-point Likert scale based on the described malignancy suspicion of the lesions: 1 = clearly benign lesions; 2 = likely benign lesions; 3 = either benign or malignant lesions; 4 = likely malignant lesions; and 5 = clearly malignant lesions. Grading of the scan reports was performed retrospectively by two investigators (JASA and ADS). Disagreements were settled by consensus. In the absence of consensus, disagreements were settled by consulting a nuclear medicine specialist (MHV or PG).

Outcome measures

To calculate sensitivity, specificity, and predictive values, each scan was assessed as true positive (TP), true negative (TN), false positive (FP), or false negative (FN). A positive scan was defined as having at least one PET-positive lesion with a Likert score of 3 or above. A negative scan was defined as having either no PET-positive lesions or having a PET-positive lesion with a Likert score of 2 or below. These definitions were based on clinical practice, as patients with 3 or above were generally further examined for recurrence. Recurrence was defined as a malignant lesion verified by biopsy, progression on subsequent FDG-PET/CT scan, or magnetic resonance imaging. In selected cases, patients with multiple PET-positive lesions were clinically diagnosed with MM recurrence without having a biopsy performed. Non-recurrence was assumed when no metastases were observed until the subsequent follow-up scan. MM carcinoma in situ was not considered a recurrence. Any lesions verified as metastasis were traced on earlier scans and considered to be malignant (and TP) if the lesion was reported in the same location.

The distinction between TP and FN was made based on the timing of the recurrence. If the recurrence was first detected by the FDG-PET/CT scan, it was defined as TP, while any recurrences detected in the time interval until the next planned scan were defined as FN.
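The classification rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' actual procedure: the function names and the single boolean flag `recurrence_confirmed` (meaning that recurrence was verified by the reference standard for the interval covered by the scan) are assumptions introduced here, and the tracing of verified lesions back to earlier scans is not modelled.

```python
from typing import Optional

# Cut-off taken from the text: a Likert score of 3 or above counts as positive.
POSITIVE_CUTOFF = 3


def is_positive(max_likert: Optional[int], cutoff: int = POSITIVE_CUTOFF) -> bool:
    """Return True if the most suspicious lesion scores at or above the cut-off.

    max_likert is None when the scan reported no PET-positive lesion.
    """
    return max_likert is not None and max_likert >= cutoff


def classify_scan(max_likert: Optional[int], recurrence_confirmed: bool) -> str:
    """Label one scan as TP, FP, TN, or FN (hypothetical helper).

    recurrence_confirmed is an assumed flag: True if recurrence was verified
    for this scan or in the interval before the next planned scan.
    """
    if is_positive(max_likert):
        return "TP" if recurrence_confirmed else "FP"
    return "FN" if recurrence_confirmed else "TN"


# Example: a scan whose worst lesion is graded 2 is treated as negative.
label = classify_scan(max_likert=2, recurrence_confirmed=False)
```

Keeping the cut-off as a parameter makes it straightforward to repeat the classification at other thresholds, for example when exploring the trade-off between sensitivity and specificity.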

FDG-PET/CT

FDG-PET/CT scans were performed at three different locations (Vejle, Odense, and Esbjerg). Scan procedures performed at Lillebaelt Hospital, Vejle, and the Hospital of South West Jutland, Esbjerg, can be seen in the Supplementary Material. A plurality (49.5%) was performed at Odense University Hospital, using the following scan procedure. PET/CT data was acquired on a GE Discovery 710 PET/CT scanner. The PET scan was performed using a standard whole-body acquisition protocol with an acquisition time of 2½ min per bed position. The scan field of view was 70 cm. PET data was reconstructed into transaxial slices with a matrix size of 256 × 256 (pixel size 2.74 mm) and a slice thickness of 3.75 mm using iterative 3D OS-EM (3 iterations, 24 subsets) with corrections for time-of-flight (GE VPFX) and point-spread-blurring (GE SharpIR). Attenuation correction was based on a dedicated ultra-low-dose helical CT attenuation correction scan. A helical diagnostic CT scan was acquired after the PET scan with intravenous contrast (ULTRAVIST 370 I/ml) using a standard CT protocol with a scan field of view of 70 cm. Data was reconstructed with a standard filter into transaxial slices with a field of view of 50 cm, matrix size of 512 × 512 (pixel size 0.98 mm) and a slice thickness of 3.75 mm. Analysis of CT, PET, and fused PET/CT data was done on a GE Advantage Workstation v. 4.4. The CT scan was described by a radiologist and the PET scan by a nuclear medicine specialist.

Statistics

Descriptive statistics were done according to data type: categorical variables were shown as frequencies and respective percentages, and continuous variables as median and range (minimum–maximum). Diagnostic accuracy measures comprised sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). These were estimated and supplemented by exact (Clopper-Pearson type) 95% confidence intervals (95% CIs). The discriminatory power of the nuclear medicine physicians’ confidence in the FDG-PET/CT scan was explored with receiver operating characteristics (ROC) curve analysis. Recurrence-free survival was visualized by a Kaplan–Meier survival estimate. All statistical analyses were done in STATA/IC 16.1 (StataCorp, College Station, Texas 77845, USA) and Microsoft Excel for Windows.
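For readers who want to reproduce this type of analysis outside Stata, the following is a minimal sketch of the four accuracy measures and an exact (Clopper-Pearson) interval using only the Python standard library. It is not the authors' code: the function names are introduced here for illustration, and the interval bounds are found by simple bisection on the binomial distribution rather than through a statistics package.

```python
from math import comb


def accuracy_measures(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Sensitivity, specificity, PPV, and NPV from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, lo: float = 0.0, hi: float = 1.0, iters: int = 100) -> float:
    """Root of a decreasing function f on [lo, hi], found by bisection."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> tuple:
    """Exact two-sided (1 - alpha) confidence interval for k successes in n."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _bisect(
        lambda p: alpha / 2 - (1 - binom_cdf(k - 1, n, p))
    )
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _bisect(
        lambda p: binom_cdf(k, n, p) - alpha / 2
    )
    return lower, upper


# Example: interval for a proportion of 27 out of 124.
low, high = clopper_pearson(27, 124)
```

The exact interval is conservative by construction, which is one reason it is often preferred for the small per-scan and per-stage subgroups typical of diagnostic accuracy studies.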

Results

The study included 124 patients with a total of 366 FDG-PET/CT scans. Baseline characteristics are described in Table 1. The patient population was mostly male (60.5%). The median age was 62 years (range: 17–83 years); that is, most patients were towards the older end of the range, with a few younger outliers. All patients included with primary disease were offered sentinel lymph node biopsy, and none of these received completion lymph node dissection. Completion lymph node dissection was reserved for patients with recurrent disease in regional lymph nodes. A total of eight patients received completion lymph node dissection.

The patients received a median of seven treatments with immunotherapy during the study period ranging from one treatment to the full program of 13 treatments. All patients except two received a baseline scan before starting immunotherapy. The two remaining patients received baseline scans at three and four days after the first treatment with immunotherapy. Seven patients received more than the standard four scans due to clinical suspicions of malignancy. Figure 1 shows the distribution of the number of scans the patients received.

Fig. 1

Bar chart showing the number of FDG-PET/CT scans each patient received

Incidence of recurrence

The incidence rate of MM recurrence was 0.27 [95% CI 0.17–0.37] per person-year during the first year after radical surgery for the entire population. For patients with stage IIIA disease, the incidence rate was 0.0815 [95% CI 0.012–0.53], for stage IIIB patients 0.19 [95% CI 0.094–0.40], for stage IIIC patients 0.32 [95% CI 0.21–0.49], and for stage IV patients 0.55 [95% CI 0.31–0.99]. A prevalence of 21.8% was found, reflecting that 27 out of 124 patients had MM recurrence within the study timeframe.
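An incidence rate of this kind is the number of events divided by the accumulated person-time. The sketch below shows the arithmetic with a normal-approximation (Wald) interval based on a Poisson standard error. Note the assumptions: the person-time denominator is not reported in this section, so the value of 100 person-years used in the example is purely hypothetical, and the interval method may differ from the one the authors used.

```python
from math import sqrt


def incidence_rate(events: int, person_years: float, z: float = 1.96) -> tuple:
    """Events per person-year with an approximate (Wald) confidence interval.

    Assumes the event count is Poisson distributed, so the standard error
    of the rate is sqrt(events) / person_years.
    """
    rate = events / person_years
    se = sqrt(events) / person_years
    return rate, rate - z * se, rate + z * se


# Hypothetical example: 27 recurrences over an assumed 100 person-years.
rate, lower, upper = incidence_rate(events=27, person_years=100.0)
```

For subgroups with very few events, such as stage IIIA, a Wald interval can extend below zero, so a log-transformed or exact Poisson interval would usually be more appropriate there.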

Recurrence was first detected in 13 patients (10.5%) at the 3-month FDG-PET/CT scan, in 10 patients (8.1%) at the 6-month scan, in 1 patient (0.8%) at the 9-month scan, and in 3 patients (2.4%) at the 12-month scan. Ninety-seven patients (78.2%) had no recurrence within the study timeframe.

A Kaplan–Meier estimate depicting the time from surgery to recurrence in days is shown in Fig. 2. The figure illustrates a high rate of recurrences early in the follow-up program until around 200 days after surgery, after which the recurrence rate slowed. The estimate beyond 400 days should be interpreted with caution, as it is based on a low number of patients, reflecting the variable interval between surgery and the first immunotherapeutic treatment.

Fig. 2

Kaplan–Meier curve showing time in days from surgery until recurrence or end-of-study (recurrence-free survival)
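As an illustration of how a recurrence-free survival curve of this kind is computed, the following is a minimal Kaplan–Meier sketch in plain Python. The function name and the toy follow-up data are assumptions made for this example; they are not study data, and a statistics package would normally be used in practice.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of recurrence-free survival.

    times:  follow-up time in days for each patient
    events: True if recurrence was observed at that time, False if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        recurrences = 0  # recurrences observed exactly at time t
        leaving = 0      # everyone leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            recurrences += int(data[i][1])
            leaving += 1
            i += 1
        if recurrences > 0:
            survival *= 1 - recurrences / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy data (hypothetical): five patients, two of them censored.
curve = kaplan_meier(
    times=[90, 180, 180, 270, 365],
    events=[True, True, False, False, True],
)
```

Because the risk set shrinks as patients recur or are censored, each late event produces a large step in the curve, which is why the tail of such a plot rests on few patients.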

Accuracy of FDG-PET/CT

Overall and time-related accuracy results are presented in Table 2. The lowest specificity was found at the 3-month scan (78%), which also presented a low PPV (35%). The 6-month scan presented the highest specificity (87%) and PPV (56%).

Table 2 - Sensitivity, specificity, and predictive values with 95% CIs

Results by disease stage are presented in Table 3. The PPV trended towards higher values as disease stage increased. Sensitivity, specificity, and NPV remained consistent across disease stages.

Table 3 - Sensitivity, specificity, and predictive values with 95% CIs for the entire program stratified by disease stage

Lesions were detected in 141 of 366 (38.5%) scans: 6 (4.3%) were graded 1; 37 (26.2%) were graded 2; 37 (26.2%) were graded 3; 55 (39.0%) were graded 4; and 6 (4.3%) were graded 5.

The ROC curve (Fig. 3) shows the relationship between the applied Likert scale (grade 1–5) and the sensitivity and specificity at each cut-off value for the entire program. The area under the ROC curve was 0.94 (95% CI: 0.90–0.97), and the cut-off value (grade 3) used in this study provided a sensitivity of 97% and a specificity of 82% (see point (1-spec, sens) = (0.18, 0.97) in Fig. 3).

Fig. 3

ROC curve exploring the discriminatory power of the applied Likert scale for recurrence
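The ROC analysis of an ordinal score such as the Likert grade can be sketched as follows. Each cut-off yields one sensitivity/specificity pair, and the area under the curve equals the probability that a randomly chosen scan with recurrence is graded higher than a randomly chosen scan without recurrence, with ties counted as one half. The function names, the convention of scoring scans without lesions as 0, and the toy data are assumptions made for this example.

```python
def _split(scores, labels):
    """Separate scores into those with and without recurrence."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    return pos, neg


def roc_points(scores, labels, cutoffs=(1, 2, 3, 4, 5)):
    """(cutoff, sensitivity, specificity) when score >= cutoff is positive."""
    pos, neg = _split(scores, labels)
    points = []
    for c in cutoffs:
        sensitivity = sum(s >= c for s in pos) / len(pos)
        specificity = sum(s < c for s in neg) / len(neg)
        points.append((c, sensitivity, specificity))
    return points


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos, neg = _split(scores, labels)
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 0 means no PET-positive lesion was reported.
scores = [0, 0, 2, 3, 4, 5, 3, 1]
labels = [False, False, False, False, True, True, True, False]
points = roc_points(scores, labels)
area = auc(scores, labels)
```

Reading the resulting points from the lowest to the highest cut-off shows the expected pattern: sensitivity falls and specificity rises as the threshold for calling a scan positive is raised.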

The vast majority of verified malignant lesions (20/27, 74.1%) reported on FDG-PET/CT were confirmed by a biopsy performed after the PET scan, but in some cases, malignant lesions showing on FDG-PET/CT were verified by a progression of the lesion on a subsequent scan, rather than immediate confirmation of recurrence by biopsy. This was the case in 7 of 27 patients (25.9%); 5 patients at the 3-month scan; 2 patients at the 6-month scan; and no patients at the 9-month scan.

Clinical consequences and FDG-positive findings

FDG-positive lesions reported on FDG-PET/CT at the 3-month scan are illustrated in Fig. 4, which shows the locations of TP (a) and FP (b) lesions. Examples of true positive and false positive lesions are included in Fig. 5. While the PET-positive lesions appeared in many locations, malignancy was mainly confirmed at sites local or locoregional to the primary MM. At the 3-month scan, all patients diagnosed with distant recurrence had at least one locoregional recurrence as well.

Fig. 4

Distribution of true and false positive lesions at the 3-month scan. (a) A total of 26 true positive lesions were detected; 11 lesions were verified by biopsy. (b) A total of 60 false positive lesions were reported

Fig. 5

(a) A patient operated for malignant melanoma (stage IIIc) located centrally at the top of the back. The 3-month FDG-PET/CT shows false positive lesions in mediastinal lymph nodes as seen in an axial FDG-PET/CT section of the thorax. (b) A patient radically operated for malignant melanoma (stage IIIb) located at the right foot sole. The 12-month FDG-PET/CT shows a true positive lesion in an inguinal lymph node in the right groin as seen in an axial FDG-PET/CT section of the pelvis

Out of 98 FDG-PET/CT scans with malignancy-suspicious lesions, 29 (29.6%) did not lead to any clinical action. In 25 (25.5%) of 98 cases, the follow-up FDG-PET/CT scan was expedited, 11 (11.2%) patients had MRI scans, 5 (5.1%) resulted in an immediate diagnosis of recurrence, 28 (28.6%) were sent to surgery, 5 (5.1%) went to an otorhinolaryngological department for examination, and finally, 6 (6.1%) received a variety of other diagnostics. Six scans (6.1%) were described and graded but not addressed clinically at end time of data collection, and thus could not contribute to this statistic. The percentages add to more than 100%, as some scans led to more than one form of clinical action.

Discussion

Summary of main findings

In this study, we observed a high rate of disease recurrence during the first year of follow-up. FDG-PET/CT detected recurrence with a sensitivity of 97%, while the specificity was moderate at 82%. The time point with the greatest clinical effect was the 3-month scan, which detected the most recurrences (13/27) yet also had the lowest specificity (78%), reflecting the difficulty of separating false and true positive results at this time point. The 3-month scan showed a tendency for recurrences to appear locoregionally, as distant metastases only occurred concurrently with locoregional recurrences. This suggests that increased attention should be given to the areas locoregional to surgery, at least at the 3-month time point. The 9-month scan had both the lowest PPV (31%) and the fewest detected recurrences (1/27). These values might indicate that the recurrence prevalence is lowest around the 9-month scan, but given the low number of cases, this should be interpreted with caution. The incidence rate of recurrence was, as expected, lowest in patients with stage IIIa disease and increased with disease stage, as did the PPV. Hence, the importance of monitoring the patients using scans seemed to increase with disease stage.

One of the main strengths of the follow-up program is that the high sensitivity suggests that the high frequency of FDG-PET/CT scans effectively detects almost every recurrence. Furthermore, the high NPV suggests that a negative scan reliably indicates no recurrence during treatment with immunotherapy.

Not all patients in the study were immediately diagnosed with recurrence after detection of PET-positive lesions. A number of the patients were followed on subsequent scans to confirm malignancy. This proportion was highest at the 3-month scan (5/13), which may be an indication of the difficulties of differentiating between benign and malignant lesions at this time, in this population.

Out of the 92 positive scans that had been addressed clinically by the end of data collection, 29 led to no clinical action at all. The remaining positive scans (63) resulted in some form of additional diagnostics, ranging from expedited control scans to surgery of possible recurrence.

Comparison with literature

In the current study, we found an incidence rate of 0.27 per person-year. This is comparable with the CheckMate 238 study, which found a 12-month recurrence-free survival of 70.5% in the nivolumab group, reflecting an incidence of 29.5% [5]. Our study included a patient population similar to that of CheckMate 238, as both studies examined adjuvant nivolumab in high-risk MM patients. While CheckMate 238 included stage IIIB-IV patients, this study included stage IIIA patients as well.

Our results were also quite similar to those of the EORTC 1325 study [18], which found a 12-month recurrence-free survival of 75.4%. That study included only patients with stage III disease, while our study included both stage III (110/124) and stage IV (14/124) patients. Additionally, EORTC 1325 used pembrolizumab, while our study mainly used nivolumab.

Our study found an overall PPV of 39%. This compares poorly with the results from Xing et al., a large meta-review examining high-risk MM patients not treated with immunotherapy [9]. Xing et al. found a PPV for detection of distant recurrence in high-risk patients of 80% and a PPV of 97% for lymph node metastases for FDG-PET/CT in surveillance. As Xing et al. define high-risk patients as patients with a recurrence risk of 30% during the first five years, and our patient population has an incidence rate of 0.27 per person-year during the first year of follow-up, our patient population is at significantly higher risk of recurrence, which would lead to a higher PPV. As such, the contrasting low PPV of our study may reflect the increased rates of false positives brought about by recent surgery in our population and the addition of immunotherapy with the risk of inflammatory side effects.

The low PPV might also be interpreted differently, since the high-risk nature of this patient population may incentivise action against PET-positive lesions. This approach to scan results is equivalent to lowering the cut-off value to maximise sensitivity at the expense of specificity. As illustrated in Fig. 3, it is possible to increase specificity in this study by increasing the cut-off value, which would result in fewer FP results. In this study, a cut-off value above 3 (that is, a Likert score of 4 or 5) would result in a PPV of 55%.
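The dependence of PPV on both the underlying prevalence and the chosen specificity follows directly from Bayes' theorem, and can be sketched as below. The sensitivity and specificity in the example are the overall values reported in this study; the prevalence values and the higher specificity are hypothetical inputs chosen only to illustrate the direction of the effect, and the function name is introduced here for illustration.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from sensitivity, specificity, and prevalence."""
    true_positive = sensitivity * prevalence
    false_positive = (1 - specificity) * (1 - prevalence)
    return true_positive / (true_positive + false_positive)


# Reported overall accuracy, evaluated at hypothetical per-scan prevalences.
ppv_low_prevalence = ppv(0.97, 0.82, prevalence=0.10)
ppv_high_prevalence = ppv(0.97, 0.82, prevalence=0.30)

# Hypothetical higher specificity at the same prevalence, as would follow
# from raising the cut-off value.
ppv_higher_specificity = ppv(0.97, 0.90, prevalence=0.10)
```

With sensitivity held fixed, PPV rises both when prevalence increases and when specificity increases, which is the relationship the two preceding paragraphs rely on.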

Strengths and limitations

One strength of this study is the fact that our patient population is representative of clinical practice in the Region of Southern Denmark, including eligible patients treated in the region up until the date of data collection, hence diminishing the risk of selection bias.

The study included FDG-PET/CT scans from three different sites, and the scan protocols differed slightly between sites. Although this somewhat compromises internal validity, it is true to the clinical setting in which the study was performed.

For this study, the reference standard defined the true positive findings well by histologic verification, but for false positive cases, this does not necessarily reflect the truth since some of the inflammatory changes seen on FDG-PET/CT that spontaneously regressed may have represented immune reactions targeting cancerous cells or micro-metastases. Further complicating matters, it could be that some true positive cases would have regressed during further treatment. This muddles the distinction between a positive scan and the effective treatment outcome of a patient who would otherwise develop recurrence, especially within the first 3 months of immunotherapy. In clinical practice, this incentivises a “wait-and-see” approach to establish a more accurate diagnosis, which may be contraindicated by the time-sensitive nature of the malignant disease.

While the results focus on the accuracy of the follow-up program in diagnosing recurrence, this may not be possible to validate. The absence of a reliable gold standard for detecting malignancy makes it difficult to categorise scans as false negative, which introduces the possibility of overestimation of NPV and sensitivity.

It should be noted that patients in the study received a varying number of treatments. This was mainly due to the adverse events which caused some patients to stop treatment early. The patients who stopped immunotherapy but continued with the follow-up FDG-PET/CT scans were included to avoid bias.

Towards the end of the study timeframe, COVID-19 vaccines were beginning to be distributed in Denmark. Potentially, 45 patients could have received a vaccine; however, information regarding vaccination status was limited. One patient is known to have received a COVID-19 vaccine and had PET-positive lesions in the lymph nodes local to the injection site. As there was no widespread availability of the COVID-19 vaccines in Denmark at this time, it is unlikely that the number of false positives was significantly affected by the vaccines. Despite this, it is a clear limitation that this information is not available.

Perspectives

The size of our study population may limit the conclusions drawn from this study. Future studies would have the benefit of more patients to include, as this treatment and follow-up program is ongoing. Furthermore, such a study might be able to examine long-term outcomes for the patient population.

The adverse effects relating to immunotherapy may be severe, and the relationship between these adverse effects and their possible manifestations on FDG-PET/CT scans has not been well described in the literature. A study examining the relationship between PET-positive lesions and the clinically manifested immune-related adverse effects could be clinically relevant.

As the prevalence of recurrence seems low around the 9-month scan, this scan may provide only a small clinical benefit. For future studies, a policy of optionally using an FDG-PET/CT scan at 9 months to monitor suspicious lesions present at the 6-month scan might be examined.

Conclusion

In conclusion, the relatively high rate of recurrence emphasizes the need for follow-up to detect recurrence in high-risk MM patients treated with adjuvant immunotherapy. The relatively low specificity reflects a high number of false-positive results, and their potential clinical harm must be weighed against the benefit of early detection of recurrence. FDG-PET/CT is a valuable method for detecting recurrent disease in high-risk MM patients treated with immunotherapy, especially at the 3-, 6-, and 12-month marks. However, considering the limited timeframe and number of patients, verification in a larger sample is necessary.