Patient population
This prospective clinical study was approved by the institutional ethics committee and performed in accordance with the guidelines of the Helsinki II declaration. Written informed consent was obtained from all subjects. Eligible patients were identified at the Clinic for Otorhinolaryngology, Head and Neck Surgery of the University Hospital of Geneva. Over 36 months, hybrid PET/MRI examinations were performed in a consecutive series of 76 adult patients previously treated with curative radio(chemo)therapy ± surgery (delay between treatment end and imaging: mean ± SD = 15.2 ± 12.8 months, median [quartiles] = 12 months [5–22.5]). Indications for PET/MRI were persisting or newly developed symptoms after radio(chemo)therapy (pain, reflex otalgia, hoarseness, dysphagia). Exclusion criteria were standard MRI contraindications. None of the potentially eligible patients refused to participate. Two PET/MRI examinations were excluded from the study due to poor image quality (n = 1) or absent follow-up (n = 1). Therefore, a total of 74 PET/MRI examinations formed the basis of this series. Most patients (50/74, 67.5%) were males (mean age ± SD 62.1 ± 12.6 years). A small proportion of this cohort (15/74 patients) was included in a study comparing image quality and whole-body FDG uptake detectability with PET/MRI versus PET/CT [16].
Image acquisition
PET/MRI examinations were performed on a Philips Ingenuity time-of-flight (TF) hybrid PET/MRI system (Philips Healthcare, Cleveland, OH, USA) [12]. All patients fasted >4 h prior to injection of 3.5 MBq/kg body weight FDG. The time interval necessary for FDG uptake was used for HN MRI scanning, including a sequence for PET attenuation correction (AC). HN MRI, obtained with a 16-channel SENSE neurovascular coil, covered the area from the roof of the frontal sinuses to the aortic arch. The following high-resolution sequences were acquired: coronal STIR (TR/TE/TI = 5,043/80/200 ms; voxel = 0.45 × 0.45 × 4 mm³; 3 min 30 s), axial T2 (TR/TE = 3,528/90 ms; voxel = 0.45 × 0.45 × 3 mm³; 2 min 40 s), axial SE EPI-DWI with six diffusion gradient b values (TR/TE/TI = 6,803/72/230 ms; b = 0, 50, 100, 500, 750 and 1,000 s/mm²; voxel = 1.3 × 1.3 × 3 mm³; 4 min 05 s) with apparent diffusion coefficient (ADC) map calculation by mono-exponential fitting [1, 5–7], axial and coronal T1 (TR/TE = 683/16 ms; voxel = 0.45 × 0.45 × 3 mm³; 3 min 45 s) before and after injection of gadoterate meglumine (0.1 mmol/kg Dotarem, Guerbet, Aulnay-sous-Bois, France), and contrast-enhanced axial 3DT1GE Dixon (flip angle 10°; TE1/TE2/TR = 1.44/2.6/5.7 ms; voxel = 0.45 × 0.45 × 1.5 mm³; 4 min 12 s). We used a six-b-value SE EPI-DWI sequence because similar sequences have been used successfully by other authors [7, 21–23]. All commercially available DWI sequences use fat saturation, which can be achieved by chemical shift selective fat saturation, water excitation or STIR methods. Based on the literature [24, 25] and on our experience, DWI with STIR-based fat saturation is more robust in the HN than classical spectral fat saturation and yields good-quality images. After HN MRI, a whole-body 3DT1GE Dixon (flip angle 10°; TE1/TE2/TR = 1.12/2.1/3.3 ms; voxel = 0.78 × 0.78 × 6 mm³; 19 s/stack, 8–10 stacks) and an AC sequence (2 min 30 s) were acquired.
Whole-body PET acquisition was started 60 min post-injection (10 beds, acquisition = 32 min). PET images were corrected for attenuation using the segmented MRI-based AC procedure described in the literature [26]. PET reconstruction was performed using a 3D-LOR-TF-blob-based OSEM algorithm (3 iterations, 33 subsets, voxel = 2 × 2 × 2 mm³ for HN).
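The mono-exponential ADC calculation mentioned for the DWI sequence can be illustrated with a minimal sketch. It assumes the standard model S(b) = S0 · exp(−b · ADC) and estimates ADC for a single voxel by ordinary least squares on the log-transformed signal. The fitting procedure actually used by the scanner software is not specified here, so the log-linear approach, the function name `fit_adc` and the synthetic signal values are illustrative assumptions only.

```python
import math

# b values (s/mm^2) of the six-b-value DWI sequence described above
B_VALUES = [0, 50, 100, 500, 750, 1000]


def fit_adc(signals, b_values=B_VALUES):
    """Estimate ADC (mm^2/s) for one voxel.

    Assumes S(b) = S0 * exp(-b * ADC), so ln S(b) is linear in b with
    slope -ADC. The slope is obtained by ordinary least squares.
    (Hypothetical helper; not the vendor's implementation.)
    """
    if len(signals) != len(b_values):
        raise ValueError("one signal value is required per b value")
    if any(s <= 0 for s in signals):
        raise ValueError("signals must be positive to take the logarithm")

    log_signals = [math.log(s) for s in signals]
    n = len(b_values)
    mean_b = sum(b_values) / n
    mean_log = sum(log_signals) / n

    # Least-squares slope of ln(S) against b
    sxy = sum((b - mean_b) * (y - mean_log)
              for b, y in zip(b_values, log_signals))
    sxx = sum((b - mean_b) ** 2 for b in b_values)
    return -sxy / sxx


# Synthetic, noise-free voxel (illustrative values, not study data)
true_adc = 1.0e-3  # mm^2/s
s0 = 800.0         # arbitrary signal intensity at b = 0
signals = [s0 * math.exp(-b * true_adc) for b in B_VALUES]

print(f"fitted ADC = {fit_adc(signals):.6f} mm^2/s")
```

With noise-free input the fit recovers the simulated ADC. On real data, low b values are additionally influenced by perfusion, which a single-exponential model does not separate from diffusion.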
Image evaluation, diagnostic criteria and measurements
Two board-certified radiologists with substantial experience in HN MRI and PET/CT (>15 years) and a board-certified nuclear medicine physician with substantial experience in PET/CT and HN MRI (>10 years) evaluated the images separately and were blinded to all clinical/histopathological data. Discrepant evaluations were resolved by consensus. Findings were recorded on pre-defined evaluation sheets using a five-point scale for receiver operating characteristic (ROC) analysis as follows: 1, definitely negative for recurrence; 2, probably negative; 3, indeterminate and therefore suspicious/possibly positive; 4, probably positive; and 5, definitely positive.
The three readers evaluated morphological MRI first, then DWI and PET. All images (MRI, DWI and PET) were assessed according to the diagnostic criteria established in the literature, taking into consideration diagnostic pitfalls related to radiation-induced changes [1, 9]. Internationally established qualitative and quantitative criteria were applied [1, 5–9, 11, 27, 28]. Tumours involving the upper aero-digestive tract, the neopharynx (after total laryngectomy) or flaps in the oral cavity/pharynx were considered as local recurrence [27]. On MRI, recurrent tumours were diagnosed in the presence of well-defined or ill-defined mass-like lesions with intermediate T2 signal (‘evil grey’), moderate contrast enhancement and restricted diffusivity (high signal on b1000, low signal on ADC) [1, 5–7, 27, 28]. Lesions with high signal on T2, strong contrast enhancement and high signal on b1000 and ADC were interpreted as suggesting post-treatment inflammatory oedema. Mature scar tissue/long-standing fibrosis was diagnosed in the presence of an elongated lesion with very low signal on T2, no/minor contrast enhancement, and low signal on b1000 and ADC [1]. If localised artefacts on a DWI sequence were confined to slices outside the lesion to be measured, the sequence was regarded as being of acceptable quality and ADC measurements were carried out. Qualitative DWI assessment (visual assessment of b1000 and ADC) and quantitative assessment with an ADC threshold were obtained for all lesions. The ADC threshold was calculated after completion of the radiological-histological correlation, based on ROC analysis of prospectively measured ADCs [22]. Focal FDG uptake (visual tracer accumulation exceeding the adjacent background activity) was rated as PET positive, taking into account physiological FDG accumulation and pitfalls in the HN, such as muscular, salivary gland and physiological Waldeyer’s ring uptake or post-treatment inflammatory changes [1, 8, 9, 20, 29–31].
Qualitative and quantitative PET assessment (the latter with a standardised uptake value (SUV) threshold) was obtained. The SUV threshold was calculated analogously to the ADC threshold.
Benign post-treatment lesions and complications (oedema, scar/fibrosis, soft tissue- and osteonecrosis, ulceration, denervation atrophy) were diagnosed on combined PET/DWIMRI taking into consideration established criteria [1]. As FDG uptake can be variable in post-radiotherapy changes/complications, increased focal FDG uptake was not necessarily regarded as indicating recurrence, and MRI and DWI characteristics were taken into consideration for the combined PET/DWIMRI interpretation [1].
The following measurements were obtained: diameters of tumours and benign lesions, mean/minimum ADC values (ADCmean/ADCmin), and mean/maximum standardised uptake values (SUVmean/SUVmax). Tumour ADCs were measured with small elliptical regions of interest (ROIs) placed over several tumour sections on b1000 images and copied onto the corresponding ADC maps, while carefully avoiding areas of apparent necrosis [1, 27, 31]. Average ADCmean/ADCmin values were then calculated for each measured tumour. Analogously, SUV measurements were performed with ROIs placed on anatomically matched areas [16, 31].
Standard of reference and correlation with imaging findings
The standard of reference consisted of histology and follow-up ≥24 months after PET/DWIMRI. Histology was obtained within 2 weeks: (1) in lesions with a rating ≥3 on PET/DWIMRI, (2) in endoscopically suspicious lesions or (3) whenever there was a discrepancy between clinical/endoscopic examination and imaging. Histology included endoscopic biopsy and salvage surgery. Histological analysis of the resected tumours was based on serial whole-organ sections as described in the literature [28]. It served as the gold standard for the assessment of the pathological T-stage (pT) according to UICC [32]. Two experienced pathologists (>12 years) interpreted histology prospectively and were blinded to imaging findings.
Patients with negative examinations or negative histology were followed for ≥24 months to determine whether negative readings corresponded to true-negative assessments and to detect false-negative evaluations. Follow-up consisted of clinical evaluation and fiberoptic endoscopy every month during the first year, every 2 months in the second year, every 3 months in the third year and every 6 months in the fourth year, as well as additional cross-sectional imaging. If follow-up was negative during the entire period, negative assessments were considered as true negatives. If recurrence was proven ≤3 months after PET/MRI, negative assessments were considered as false negatives. If recurrence was proven >3 months after a negative PET/MRI, the case was re-evaluated at the interdisciplinary HN tumour board to distinguish between a false-negative evaluation and a metachronous tumour unrelated to the initial PET/DWIMRI.
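The decision rule applied to negative readings during follow-up can be summarised in a short sketch. The function name, argument names and returned labels are hypothetical. The branch for follow-up shorter than 24 months is an added assumption, as the rule above only covers patients who completed follow-up.

```python
def classify_negative_reading(months_to_recurrence, follow_up_months):
    """Classify a negative PET/MRI reading according to the follow-up rule.

    months_to_recurrence: months between PET/MRI and proven recurrence,
        or None if no recurrence was proven (hypothetical encoding).
    follow_up_months: total duration of follow-up in months.
    """
    if months_to_recurrence is None:
        if follow_up_months >= 24:
            return "true negative"
        # Assumption: incomplete follow-up cannot confirm a negative reading
        return "insufficient follow-up"
    if months_to_recurrence <= 3:
        return "false negative"
    # Recurrence proven >3 months after imaging: the tumour board decides
    # between a false-negative reading and a metachronous tumour
    return "tumour board review"


# Illustrative cases (not study data)
print(classify_negative_reading(None, 30))
print(classify_negative_reading(2, 30))
print(classify_negative_reading(9, 30))
```

The last category is deliberately not resolved in code, because the distinction between a missed recurrence and a metachronous tumour was made by interdisciplinary review rather than by a fixed time threshold.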
After completion of the image analysis, follow-up, histopathological and imaging findings were correlated. Correlation between imaging and whole-organ surgical specimens was made on a slice-by-slice basis.
Statistical analysis
Statistical analysis was carried out by an experienced biomedical statistician (>15 years). Diameters, ADCmean/ADCmin and SUVmean/SUVmax for benign lesions and tumour recurrence were compared using a linear mixed-effects regression model with a random intercept to account for data clustering. The diagnostic performance of combined PET/DWIMRI was assessed globally by calculating the area under the ROC curve (AUC) and specifically at a cut-off of 3 (sensitivity, specificity, predictive values, accuracy). Statistical comparisons considered paired clustered data [33, 34]. An optimal cut-off value for ADCmean/SUVmean was calculated by minimising the distance between the corresponding point of the ROC curve and the upper left corner of the graph [35]. Multivariable logistic regression analysis (with mixed effects to account for clustering) was performed to assess the association between histology and ADCmean/SUVmean binarised according to the optimal cut-off values. Cohen’s kappa coefficient was used to assess the concordance between PET/DWIMRI and the pathological T-classification (pT) [36]. All statistical analyses were conducted with R 3.3.1 (R Foundation for Statistical Computing, Vienna, Austria), and statistical tests were two-sided with a significance level of 0.05.
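The cut-off selection described above, minimising the distance between a point on the ROC curve and the upper left corner, can be sketched as follows. The analyses in this study were run in R. The Python code below is only an illustration of the criterion, with hypothetical function names and made-up ADCmean values. It ignores data clustering, which the actual analysis accounted for.

```python
import math


def sens_spec(values, is_tumour, threshold, positive_if_low=False):
    """Sensitivity and specificity when values are dichotomised at threshold.

    positive_if_low=True suits ADC (recurrence is called when ADC is at or
    below the threshold); the default suits SUV or the five-point rating.
    """
    tp = fp = tn = fn = 0
    for value, tumour in zip(values, is_tumour):
        called = value <= threshold if positive_if_low else value >= threshold
        if called and tumour:
            tp += 1
        elif called:
            fp += 1
        elif tumour:
            fn += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def closest_to_corner_cutoff(values, is_tumour, positive_if_low=False):
    """Return (cut-off, sensitivity, specificity) for the observed value whose
    ROC point (1 - specificity, sensitivity) lies closest to (0, 1)."""
    best = None
    for threshold in sorted(set(values)):
        sens, spec = sens_spec(values, is_tumour, threshold, positive_if_low)
        distance = math.hypot(1 - sens, 1 - spec)
        if best is None or distance < best[0]:
            best = (distance, threshold, sens, spec)
    return best[1], best[2], best[3]


# Made-up ADCmean values (x 10^-3 mm^2/s), for illustration only
adc_mean = [0.8, 0.9, 1.0, 1.1, 1.4, 1.2, 1.3, 1.5, 1.6, 1.8]
recurrence = [True] * 5 + [False] * 5

cutoff, sens, spec = closest_to_corner_cutoff(
    adc_mean, recurrence, positive_if_low=True)
print(f"cut-off = {cutoff}, sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

The same `sens_spec` helper, called with a threshold of 3 on the five-point ratings, gives the sensitivity and specificity at the cut-off used for PET/DWIMRI.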