Introduction

Squamous cell carcinoma of the anus (SCCA) is a rare cancer; however, the incidence is increasing [1]. The curative treatment is chemoradiotherapy (CRT) with mitomycin (MMC) and 5‑fluorouracil (5-FU) or capecitabine [2,3,4,5,6]. Although overall survival is good, locoregional recurrence is still of significant concern, especially for patients with locally advanced disease [3, 7, 8]. Treatment of locally recurrent disease is curative-intent salvage surgery, usually with extensive pelvic surgery [9]. CRT is associated with considerable late side effects that have an impact on quality of life [10, 11], and there is a delicate balance between tumor dose escalation and toxicity. There is a need to clarify tumor characteristics and identify prognostic biomarkers to promote the future development of personalized CRT for SCCA to improve outcomes [12].

Magnetic resonance imaging (MRI) and positron-emission tomography computed tomography (PET/CT) are used for diagnosis and staging of SCCA and for radiotherapy treatment planning, and have a role in response evaluation after CRT completion [5]. Considering that advanced pelvic MRI is well established and plays an important role in the clinical workflow for SCCA, it is valuable to investigate whether pelvic MRI may provide imaging-based prognostic biomarkers [13]. As part of the MRI examination of the pelvis in SCCA and rectal cancer, high-resolution T2-weighted sequences (T2W) are used to depict the tumor, tumor size, and anatomic relations [5]. In rectal cancer, extramural tumor depth (EMTD) is an important prognostic parameter [14]. Although rectal cancer and SCCA have very different pathological features despite their anatomic neighborhood, we sought to evaluate this metric for SCCA.

Diffusion-weighted imaging (DWI) enables quantification of the motion of water molecules in tissues. In solid cancers, the free movement of water molecules is often restricted due to high tissue density and interstitial fluid pressure, resulting in a low apparent diffusion coefficient (ADC). In oncology, DWI is used to identify malignant tumors, characterize tumor aggressiveness, and evaluate treatment response [15]. Moreover, identification of an imaging-based early biomarker using an additional second MRI scan during CRT might provide information that has the potential to guide treatment modification. Early identification during CRT of the subgroup of patients at risk of treatment failure would allow us to personalize treatment. Although there are no established alternative personalized modified treatment concepts yet, conceivable trial options could be, for instance, dose escalation or intensified treatment with chemotherapy or immunotherapy.

The use of DWI to predict outcome has been investigated in squamous cell carcinomas of other anatomic sites, such as head and neck squamous cell carcinoma (HNSCC) [16,17,18] and cervical squamous cell carcinoma [19]. Changes in the ADC between a baseline MRI scan prior to CRT and a second scan in the early phase of radiotherapy (1–3 weeks) were consistently correlated with local control. In addition, there is an increasing number of studies indicating the usefulness of DWI to predict the response to CRT in rectal cancer [20, 21]. There are few studies on MRI-based metrics as biomarkers for SCCA [22,23,24,25]. Given the rarity of SCCA, it is challenging to obtain a sample size with sufficient statistical power for biomarker development and validation.

This study aimed to assess MRI-based tumor characteristics of SCCA prior to CRT (baseline scan) and in the early phase of CRT in week 2 (second scan), with the secondary aim of identifying a marker that could be used to predict treatment response.

Materials and methods

Patient inclusion

The present investigation is part of the “Anal cancer radiotherapy—prospective study of treatment outcome, patient-reported outcomes, utility of imaging and biomarkers, and cancer survivorship (ANCARAD)” study, a prospective multidisciplinary observational trial (NCT01937780). Histologically proven SCCA, planned CRT, and adequate performance status (ECOG 0–2) were the main inclusion criteria. A total of 141 eligible patients referred to Oslo University Hospital (OUS) between October 2013 and September 2017 were included in the study. The study was approved by the Regional Ethical Committee South–East (2012/2274) and the local data protection officer. All patients provided written informed consent. As part of the study protocol, patients underwent pelvic MRI, CT of the thorax/abdomen/pelvis, and, in most cases, PET/CT prior to CRT for staging. Due to logistical reasons, approximately half of the MRI scans at baseline were performed at OUS with the study protocol. The remaining patients were examined at regional hospitals and institutions with varying MRI protocols and were not included in the current study. A prospective cohort was formed from a subset of patients with baseline scans following the 3T MRI study protocol at OUS; these patients were invited to participate in the study and consented to additional imaging during the second week of CRT (second scan) with pelvic MRI.

The total number of patients eligible for a dedicated study 3T MRI scan at OUS was 52 prior to CRT (baseline scan). Of these patients, 39 had an additional second scan during the second week of CRT (Fig. 1).

Fig. 1

Flowchart showing the inclusion and exclusion of patients in the ANCARAD study and for the present MRI analyses

Magnetic resonance imaging

All pelvic MRI scans included in this analysis were performed with a 3T Philips Ingenia MRI scanner (Philips Healthcare, Amsterdam, The Netherlands). Patients were scanned in the supine position and a pelvic phased array coil was used. If not contraindicated, all patients received 1 mg glucagon (Glucagon®, Novo Nordisk, Bagsvaerd, Denmark) intramuscularly prior to the examination and 20 mg butylscopolamine (Buscopan®, Opella Healthcare, Gentilly, France) intravenously during the examination to reduce bowel movement artifacts. The MRI study protocol included a combination of T2-weighted imaging (T2WI) and DWI sequences (Table 1). ADC maps were generated using the standard algorithm provided on the console of the scanner using b0 and b1200 values.
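As an illustration of how an ADC value follows from two b-values, the sketch below applies the conventional monoexponential two-point model, ADC = ln(S_b0 / S_b1200) / (1200 − 0). This is an assumption about what a standard console algorithm computes; the vendor implementation is not described in this paper, and the function name and the example signal intensities are hypothetical.

```python
import math

def adc_two_point(s_b0, s_b1200, b_low=0.0, b_high=1200.0):
    """Monoexponential ADC (mm^2/s) from two diffusion-weighted signals.

    Assumes S(b) = S(0) * exp(-b * ADC); b-values are in s/mm^2.
    Returns NaN when a signal is not positive (e.g. background voxels).
    """
    if s_b0 <= 0 or s_b1200 <= 0:
        return float("nan")
    return math.log(s_b0 / s_b1200) / (b_high - b_low)

# Hypothetical voxel: signal drops from 1000 to 300 between b = 0 and b = 1200
adc = adc_two_point(1000.0, 300.0)
print(f"ADC = {adc * 1e3:.3f} x 10^-3 mm^2/s")
```

Tissue with restricted diffusion loses less signal at b1200 and therefore yields a lower ADC under this model.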

Table 1 MRI protocol with acquisition parameters for the Philips-Ingenia 3T MR scanner (study protocol)

Chemoradiotherapy

All patients were discussed in a multidisciplinary team (MDT) meeting and treated according to national guidelines. Radiotherapy was delivered using 3D conformal radiotherapy, intensity-modulated radiotherapy (IMRT), or volumetric modulated arc therapy (VMAT). The radiotherapy doses administered to the primary tumor and the metastatic lymph nodes were 54.0 or 58.0 Gy, depending on stage, while the dose to the noninvolved nodal regions was 46.0 Gy. Chemotherapy was delivered with MMC 10 mg/m²/day (one patient received cisplatin instead of MMC) on day 1 and 5-FU 1000 mg/m²/day on days 1–4; a new cycle of MMC/5-FU began on day 29 for patients with advanced disease. Further details and the main results of the ANCARAD study have been published previously [26].

Follow-up

Tumor response was routinely assessed 3 months after CRT by clinical examination, anoscopy/proctoscopy, imaging with pelvic MRI, and either PET/CT or CT thorax/abdomen/pelvis. All patients were followed for at least 5 years or until death or recurrence. Patients who had residual disease or later developed recurrence were considered for salvage surgery. The main outcome in our study was locoregional treatment failure, defined as failure to demonstrate a complete response (CR) 6 months after CRT or evidence of local or regional disease after CR had been achieved. Using these clinical follow-up data, patients were divided into a (locoregional) failure group or a nonfailure group, depending on the outcome.

Characteristics, data analysis, and statistics

The MR images were anonymized, and a board-certified radiologist (B.A.H.) with extensive experience in pelvic MRI delineated a region of interest (ROI) encompassing the macroscopic tumor on T2W images together with DW images on a multimodality reading platform (syngo.via VB30®, Siemens Healthineers, Erlangen, Germany). Areas of suspected necrosis were not excluded, and tumor volume was calculated. The greatest dimension (mm) of the tumor, tumor infiltration to other organs and structures, external anal sphincter infiltration (EASI), and extramural tumor depth (EMTD) were assessed on T2W images as part of the MR study protocol and for TNM staging. Similar to measurement in rectal cancer [27], the maximum depth of extramural infiltration was measured from the outer edge of the internal sphincter of the anus/muscularis propria to the outer margin of the tumor (mm). To evaluate ADC values in the tumor, the ROI for each T2W image was propagated to the corresponding ADC map. The mean ADC in the tumor volume was calculated, and skewness, kurtosis, entropy, and standard deviation (SD) were extracted from the ADC histogram analysis. The histogram analysis was performed with LIFEx, freeware for radiomic feature calculation in multimodality imaging [28].
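To make the histogram characteristics concrete, the following sketch computes the mean, SD, skewness, kurtosis, and entropy from a list of ADC values inside an ROI using conventional textbook definitions. It is not the LIFEx implementation: the population (divide-by-n) moments, the non-excess kurtosis, the base-2 entropy, and the `bin_width` parameter are all assumptions, and the voxel values are made up for illustration.

```python
import math
from collections import Counter

def adc_histogram_features(adc_values, bin_width=50.0):
    """First-order features of the ADC values inside a tumor ROI.

    adc_values: ADC of each ROI voxel (here in units of 10^-6 mm^2/s).
    bin_width:  hypothetical bin size, used only for the entropy estimate.
    """
    n = len(adc_values)
    mean = sum(adc_values) / n
    # Central moments in population form (dividing by n, not n - 1)
    m2 = sum((v - mean) ** 2 for v in adc_values) / n
    m3 = sum((v - mean) ** 3 for v in adc_values) / n
    m4 = sum((v - mean) ** 4 for v in adc_values) / n
    sd = math.sqrt(m2)
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    # Non-excess kurtosis: equals 3 for a normal distribution
    kurtosis = m4 / m2 ** 2 if m2 > 0 else 0.0
    # Shannon entropy (bits) of the binned histogram
    counts = Counter(math.floor(v / bin_width) for v in adc_values)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {"mean": mean, "sd": sd, "skewness": skewness,
            "kurtosis": kurtosis, "entropy": entropy}

# Made-up voxel values for illustration only, not study data
features = adc_histogram_features([800.0, 900.0, 1000.0, 1100.0, 1700.0])
print(features)
```

In this toy example a single high-ADC voxel pulls the tail of the distribution to the right, which gives a positive skewness. Radiomics packages differ in binning and normalization, so absolute values from this sketch are not comparable with values from other software.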

All statistical analyses were performed using STATA (Statistical Software: Release 16, StataCorp LLC, TX, USA). To assess the differences in MRI characteristics between the two timepoints (baseline and second scan during CRT), the Wilcoxon signed-rank test was used. Pearson’s correlation coefficient between different baseline variables was estimated. The median differences in the relative change of the MR characteristics between the scans were compared between the failure group and the nonfailure group by quantile regression. Univariate logistic regression analysis with estimation of odds ratios (ORs) and area under the curve (AUC) values from receiver operating characteristic (ROC) curves was used to assess the association of locoregional failure with the MRI characteristics at baseline scan, at second scan, and the relative change between the scans. Optimal cutoff points from ROC curves to predict treatment failure were estimated by the Liu method [29]. P-values < 0.05 were considered significant.
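The ROC-based steps can be sketched as follows. The AUC is computed as the probability that a failure case has a higher score than a nonfailure case, and the Liu method is taken to select the cutoff that maximizes the product of sensitivity and specificity. The function names, the decision rule "score at or above the cutoff predicts failure", and the example data are assumptions for illustration; this is not the Stata routine used in the study.

```python
def roc_auc(scores, labels):
    """AUC as P(score of a failure > score of a nonfailure); ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def liu_cutoff(scores, labels):
    """Cutoff maximizing sensitivity * specificity.

    Assumed decision rule: score >= cutoff predicts failure (label 1).
    Returns (cutoff, sensitivity, specificity).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best = None
    for c in sorted(set(scores)):
        sens = sum(p >= c for p in pos) / len(pos)
        spec = sum(q < c for q in neg) / len(neg)
        if best is None or sens * spec > best[1]:
            best = (c, sens * spec, sens, spec)
    return best[0], best[2], best[3]

# Made-up relative volume changes (%) and outcomes (1 = failure), not study data
changes = [-70.0, -65.0, -60.0, -55.0, -40.0, -30.0, -20.0, -7.0]
failure = [0, 0, 0, 0, 1, 0, 1, 1]
auc = roc_auc(changes, failure)
cutoff, sens, spec = liu_cutoff(changes, failure)
print(auc, cutoff, sens, spec)
```

Maximizing the product rather than the sum of sensitivity and specificity penalizes cutoffs that achieve a high value on one measure at the expense of the other.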

Results

The median age of the 52 patients included in the MRI study was 61 years (range 40–90); 77% were women, 48% had T3–T4 tumors, 58% had N1–N3 disease, and 86% had human papillomavirus (HPV)-positive tumors. Patient characteristics, treatment, and follow-up data corresponded well with data from the main study [26], except for a higher percentage of N1–N3 disease in the MRI study (58% versus 45%). The median follow-up was 60 months (range 5–85).

Treatment failure occurred in 8/52 patients (15%); of these, locoregional failure occurred in 7/52 patients (13%). At baseline MRI, 36/52 patients (69%) had extramural tumor infiltration with a median EMTD of 5 mm, and 29/52 (56%) had external sphincter infiltration. Baseline tumor size characteristics were significantly correlated with each other: the median baseline diameter was 40 mm (range 14–140), and the median volume was 14.5 cm³ (range 1.5–97). The baseline MRI characteristics of all 52 included patients are shown in Table 2. The changes in parameters between the baseline scan and the second scan in week 2 for the subgroup of 39 patients are given in Table 3. During the second week of CRT, all T2W-based tumor size parameters decreased significantly, while ADC mean significantly increased and ADC skewness decreased.

Table 2 Baseline MRI tumor characteristics for all included patients (n = 52) on baseline scan
Table 3 MRI tumor characteristics on baseline scan and second scan during week 2 of chemoradiotherapy in the patient subgroup (n = 39); Wilcoxon signed-rank test

Patients in the subgroup (n = 39) who received two 3T MRI scans (baseline scan and second scan) were divided into a group without locoregional failure (nonfailure; 32/39) and a group with locoregional failure (failure; 7/39). There was no significant difference in the baseline characteristics and TNM staging between the groups. Differences in the relative change in the MRI characteristics from baseline to the second week scan between the groups are shown in Table 4. The decrease in tumor size was smaller in the failure group, although the difference narrowly missed statistical significance. The results of logistic regression analysis to assess associations between MRI characteristics and their relative degree of change during CRT with locoregional failure are shown in Table 5. Small relative changes in the volume and diameter of the tumor between the scans were associated with treatment failure (Fig. 2), and these variables had the highest AUCs (0.73 and 0.76, respectively). The optimal cutoff point was −50% for the relative change in tumor volume and −12.5% for the relative change in tumor diameter (sensitivity 0.71, specificity 0.75).
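The relative change and its comparison with a cutoff can be sketched as below. The sign convention (negative values mean shrinkage) matches the reported cutoffs of −50% and −12.5%; the function names, the strict inequality at the cutoff, and the example volumes are assumptions for illustration, not patient data.

```python
def relative_change_percent(baseline, second):
    """Relative change from baseline to second scan; negative = shrinkage."""
    return 100.0 * (second - baseline) / baseline

def shrinks_less_than_cutoff(change_pct, cutoff_pct=-50.0):
    """True when the tumor shrank less than the cutoff (assumed strict rule)."""
    return change_pct > cutoff_pct

# Hypothetical tumor volumes in cm^3 at baseline and in week 2
good_response = relative_change_percent(20.0, 6.0)   # about -70 %
poor_response = relative_change_percent(20.0, 18.6)  # about -7 %
print(good_response, shrinks_less_than_cutoff(good_response))
print(poor_response, shrinks_less_than_cutoff(poor_response))
```

With the volume cutoff of −50%, a tumor that shrinks by about 70% is not flagged, whereas one that shrinks by about 7% is flagged as being at higher risk of locoregional failure.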

Table 4 Relative changes in MRI tumor characteristics from baseline to the second week scan (n = 39) in the failure group versus the nonfailure group. Median differences and p-values were estimated by quantile regression
Table 5 Univariate regression models for the association of MRI characteristics before CRT (baseline scan), during week two of CRT (second scan) and their relative changes between the scans with locoregional failure
Fig. 2

Transverse T2 image of a patient who had a 70% relative change in tumor volume and no recurrence: a baseline scan, b second scan in week 2. Transverse T2 image of a patient who had a 7% relative change in tumor volume and recurrence: c baseline scan, d second scan in week 2. Tumor delineated with colored contours. White arrows point to the delineated tumor

Discussion

By conducting an additional early MRI examination during CRT, we have shown that T2W-based tumor size and the size-related characteristic EMTD decreased compared to the baseline scan. For ADC-based characteristics, skewness decreased, while the mean increased. Many of the baseline MRI characteristics were significantly correlated with each other, probably reflecting similar tumor features. None of the ADC-derived characteristics or EMTD and EASI were correlated with outcome, nor were any other MRI characteristics at baseline or during week 2 of CRT (second scan). The relative decrease in tumor size was smaller in the treatment failure group, although the difference was not statistically significant, probably because of the small number of patients.

For T staging in the UICC TNM classification, the greatest dimension of the tumor is used [30]. In our study, we additionally used T2W images to estimate tumor volume on a baseline MRI scan and found that the values were highly correlated with each other. In rectal cancer, EMTD is an important prognostic parameter [31]. We wanted to evaluate whether this metric could provide additional information on SCCA, being aware that rectal cancer and SCCA are two very different tumors despite their anatomic neighborhood. Apart from its correlation with baseline tumor size, EMTD did not provide any additional information and was not correlated with treatment failure.

The majority of our study patients (39/52) underwent additional pelvic MRI during CRT. We chose week 2 as the timepoint for the early second scan according to previous studies on SCCA and squamous cell carcinoma of other sites [16, 19, 24, 25], enabling potential early changes in treatment plans during CRT.

Several MRI characteristics changed significantly between the two scans: tumor size and ADC skewness decreased while ADC mean increased. These changes probably reflect early tumor regression, most likely due to decreasing cellularity and heterogeneity, combined with increasing oedema. An increase in ADC mean and a decrease in tumor size during CRT are common findings for squamous cell carcinoma of other sites and SCCA [16, 17, 24, 25, 32]. Decreasing skewness in ADC histograms after CRT has also been reported in rectal cancer [33], metastatic ovarian cancer, and primary peritoneal cancer during chemotherapy [34].

The secondary aim of our study was to identify a response metric with the potential to predict outcome. Early identification of the subgroup of patients with treatment failure in the early phase during CRT would allow us to personalize treatment. Although there are no established alternative personalized and modified treatment concepts yet, one could consider, for instance, dose escalation, intensified treatment with chemotherapy, or immunotherapy for patients with a predicted risk for treatment failure.

None of the ADC-based histogram characteristics (mean, skewness, kurtosis, SD, entropy) or any of the baseline (n = 52) or week-2 (n = 39) MRI characteristics correlated significantly with outcome in our study. Low relative changes in volume and diameter from the baseline scan to the second scan during CRT were associated with treatment failure, so these variables may represent easily assessable imaging-based biomarkers. This finding is in line with previous trials on SCC of other sites that reported that tumor size-based characteristics assessed on T2W MRI at different timepoints prior to, during, or after CRT were correlated with outcome [18, 35], but such correlations have not yet been reported for SCCA.

Two previous studies on SCCA assessed the relative change in ADC mean between baseline scans and scans during CRT, but the results were different. Muirhead et al. [25] found that the median percentage change in ADC mean between baseline and week-2 scans was lower in the failure group, while Jones et al. [24] described no significant correlation of the relative change in ADC mean with local recurrence. In the latter study, several other features of the ADC histogram (baseline skewness and SD, week-2 skewness and SD, week-4 kurtosis and SD) were correlated with recurrence, in contrast to the results of our study. The results of the few existing studies for SCCA vary, and there is a need for further trials with more patients to assess ADC histogram-based characteristics. Future trials should also evaluate T2W images, as recent studies [22, 23, 36] found different first- or higher-order texture histogram characteristics on pretreatment T2W images to be correlated with outcome in SCCA.

One limitation of our study was the small sample size and the small number of locoregional treatment failures due to the rarity of SCCA. Future studies should focus on including larger numbers of patients in multicenter studies or meta-analyses. A feasible approach could be the use of distributed learning [37]. Another limitation is manual tumor delineation, which is a subjective process prone to intra- and interobserver variability. To improve the objectivity of tumor delineation, future studies should favor semiautomated methods (e.g., those using threshold ADC values) [38]. As DWI is especially prone to artifacts related to bowel gas and motion, future studies should include both DWI and the more stable T2W-MRI. Finally, the choice of the timepoint for the second scan in the early phase (during week 2 of CRT) might not be ideal for SCCA, and future research should strive to identify the optimal timepoint for the second scan.

In conclusion, the relative change in tumor size between a baseline MRI scan prior to CRT and an early second scan during CRT might have potential as an easily assessable imaging-based biomarker for SCCA without the need to assess more complex MRI characteristics.