Introduction

Standardized interpretation in oncological imaging has gained increasing importance as it provides reproducible and consistent reports, facilitates communication with the referring clinician, and minimizes misinterpretation of imaging pitfalls [1,2,3]. Numerous Reporting and Data Systems (RADS) have been established for different organs and diagnostic modalities, such as LI-RADS for hepatocellular carcinoma in MRI and CT; BI-RADS for breast lesions in mammography, ultrasound, and MRI; or PI-RADS for prostate cancer in MRI (https://www.acr.org/Clinical-Resources/Reporting-and-Data-Systems). For patients with well-differentiated neuroendocrine tumors (NET), a novel standardized framework for the interpretation of somatostatin receptor (SSTR)–positron emission tomography/computed tomography (PET/CT), titled SSTR-RADS 1.0, has been introduced [3]. SSTR are overexpressed on the cell membrane of NET, which forms the basis for the affinity of radiolabeled SSTR analogs [4, 5]. SSTR expression makes NET lesions accessible not only for functional imaging but also for targeted therapy (peptide receptor radionuclide therapy, PRRT), which is a systemic treatment option in patients with inoperable, metastatic NET. The extent of SSTR expression on PET/CT indicates a patient's eligibility for treatment [6]. Large studies, such as the NETTER-1 trial, demonstrated that PRRT significantly prolongs progression-free survival and the time to deterioration of health-related quality of life, and showed a clinically meaningful (but not statistically significant) increase in overall survival of 11.7 months [7,8,9]. In the evaluation of SSTR-PET/CT scans, the SSTR expression of NET lesions has so far been reported descriptively, mainly based on tumor uptake relative to liver uptake, the so-called Krenning’s score [10,11,12].
The proposed standardized reporting and data system SSTR-RADS 1.0 for SSTR-PET/CT showed promising first results in the assessment of diagnosis and treatment planning with PRRT in patients with NET [13, 14]. This study aims to determine the interreader agreement of four readers with different experience levels and the intrareader agreement in a second read 6 weeks later using SSTR-RADS 1.0 to further assess the feasibility of the proposed framework in routine clinical practice.

Methods

Study patients

For this retrospective study, patients with histologically confirmed or suspected NET who underwent SSTR-PET/CT between April and November 2020 were selected from an institutional database. For homogeneity reasons, only patients who received DOTA(0)-Phe(1)-Tyr(3)-octreotide (DOTA-TOC), the predominantly used tracer at the time of inclusion, were selected; patients receiving other tracers were excluded. Imaging was performed for initial staging or follow-up examination. Further inclusion criteria were complete clinical and imaging data. Patient characteristics are presented in Table 1. Almost all patients were pretreated (n = 95) at the time of the reading, depending on numerous factors such as tumor grading, size and site of the primary tumor, Ki-67, and presence of metastases. Therapy included surgery, somatostatin analogs, chemotherapy, and locoregional procedures, either as single therapy or in combination. Patients who underwent PRRT before PET/CT imaging were excluded.

Table 1 Patient characteristics

DOTA-TOC-PET/CT imaging

SSTR-PET/CT scans were acquired on Biograph 64 TruePoint w/TrueV and Biograph mCT Flow 20-4R PET/CT scanners (Siemens Healthcare GmbH) approximately 60 min after intravenous administration of 232 ± 36 MBq of the radiolabelled somatostatin analog 68Ga-DOTA-TOC. After intravenous injection of contrast agent (n = 96; Ultravist 300, Bayer Vital GmbH or Imeron 350 mg I/mL, 2.5 mL/s, Bracco Imaging), diagnostic CT scans of the neck, thorax, abdomen, and pelvis (100–190 mAs; 120 kV) were acquired. Patients received diagnostic CT scans without contrast enhancement (n = 4) in case of known allergic reactions to iodinated contrast agent, renal impairment/failure, or hyperthyroidism. All acquired PET/CT scans were analyzed using dedicated software packages (syngo.via, Siemens Healthcare or Hermes Hybrid Viewer, Hermes Medical Solutions).

Readers

The PET/CT scans of all 100 included study patients (one scan per patient) were evaluated by one board-certified radiologist and one board-certified nuclear medicine physician (experienced readers, ER 1 and ER 2; > 7 years of experience in PET/CT imaging) as well as by one radiology resident and one nuclear medicine resident (inexperienced readers, IR 1 and IR 2; < 2 years of experience in PET/CT imaging). Readers were masked to the clinical patient data except for the age and sex of the patient. All readers were familiar with the workstations and software from clinical routine and were introduced to SSTR-RADS 1.0 before the first read.

SSTR-RADS version 1.0 and image interpretation

Lesions classified as SSTR-RADS 1 are definitely benign. SSTR-RADS 2 defines lesions with a minor level of SSTR expression or non-specific radiotracer uptake at an atypical site for NET, indicating that the lesions are almost certainly benign. Further workup (subsequent biopsy or follow-up imaging) is required for SSTR-RADS 3 lesions. These imaging findings are suggestive of, but not definitive for, NET. SSTR-RADS 4 includes those findings having an enhanced SSTR expression in sites typical for NET lesions, but without definitive findings on conventional imaging, whereas SSTR-RADS 5 shows intense uptake in sites typical for NET lesions with corresponding findings on conventional imaging. A detailed overview of the SSTR-RADS 1.0 is described in the original work [3].

For the evaluation of interreader agreement, all four readers were asked to choose a maximum of five target lesions (TLs) for each scan, with no more than three of the five TLs assigned to the same compartment. The imaging findings that were most apparent on CT or had the highest tracer uptake on PET were to be included in the selection. Predefined organ compartments were liver, lymph nodes (LNs), soft tissue (other than LNs), skeleton, and lung. An overall scan score was determined, which corresponded to the highest SSTR-RADS score of all individual TLs. After each TL had been assigned an SSTR-RADS score, the readers decided whether PRRT was reasonable for the patient based on the assigned scores and the general image impression. To evaluate a higher number of lesions and to determine intrareader agreement, all scans were examined a second time by the four readers 6 weeks after the first read under the same conditions.
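The selection rule and the overall scan score described above can be illustrated with a minimal Python sketch. The function names, the tuple layout, and the example lesions are hypothetical and are not part of the study protocol; as a simplifying assumption, scores are represented by their major category only (integers 1 to 5), so subcategories such as 3C would need an explicit ordering that is not defined here.

```python
from collections import Counter

# Compartments and limits as stated in the text; names are illustrative.
COMPARTMENTS = {"liver", "lymph nodes", "soft tissue", "skeleton", "lung"}
MAX_TLS_PER_SCAN = 5
MAX_TLS_PER_COMPARTMENT = 3


def is_valid_selection(target_lesions):
    """Check a list of (compartment, score) tuples against the selection rule:
    at most five TLs per scan and at most three TLs in any one compartment."""
    if len(target_lesions) > MAX_TLS_PER_SCAN:
        return False
    per_compartment = Counter(compartment for compartment, _ in target_lesions)
    if any(name not in COMPARTMENTS for name in per_compartment):
        return False
    return all(n <= MAX_TLS_PER_COMPARTMENT for n in per_compartment.values())


def overall_scan_score(target_lesions):
    """Return the highest score among all TLs, or None if no TL was chosen."""
    if not target_lesions:
        return None
    return max(score for _, score in target_lesions)


# Hypothetical example (not study data): four TLs in three compartments.
example = [("liver", 5), ("liver", 4), ("lymph nodes", 5), ("lung", 2)]
print(is_valid_selection(example), overall_scan_score(example))  # True 5
```

Taking the maximum means that a single high-scoring lesion determines the scan score regardless of how many low-scoring lesions were selected.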

Statistical analysis

Continuous variables were expressed as mean ± SD and categorical variables as N (%). Agreement for SSTR-RADS 1.0 was evaluated using the intraclass correlation coefficient (ICC) with its 95% CI. For the analysis of intra- and interreader agreement, the Shrout & Fleiss ICC(2,1) was used. According to Cicchetti, ICC values < 0.40 indicate poor agreement, 0.40–0.59 fair agreement, 0.60–0.74 good agreement, and ≥ 0.75 excellent agreement [15]. A p value < 0.05 was considered statistically significant. All analyses were performed using SPSS (SPSS Statistics 25, IBM). ICCs of two groups were compared with “cocron” [16].
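For readers unfamiliar with ICC(2,1), the following Python sketch shows how the two-way random-effects, absolute-agreement, single-rater coefficient is commonly computed from the ANOVA mean squares, together with the Cicchetti categories quoted above. It is an illustration only: the study itself used SPSS, the function names are hypothetical, confidence intervals are not computed, and the ratings matrix is made up for demonstration and is not study data.

```python
import numpy as np


def icc_2_1(ratings):
    """ICC(2,1) for an (n subjects x k raters) matrix without missing values."""
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()
    # Sums of squares of the two-way ANOVA without replication.
    ss_rows = k * ((x.mean(axis=1) - grand_mean) ** 2).sum()  # between subjects
    ss_cols = n * ((x.mean(axis=0) - grand_mean) ** 2).sum()  # between raters
    ss_error = ((x - grand_mean) ** 2).sum() - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def cicchetti_category(icc):
    """Map an ICC value (as a fraction, e.g. 0.89 for 89%) to Cicchetti's labels."""
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"


# Illustrative ratings: 6 lesions (rows) scored by 4 readers (columns).
demo = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
value = icc_2_1(demo)
print(round(value, 2), cicchetti_category(value))  # 0.29 poor
```

Because ICC(2,1) measures absolute agreement, systematic offsets between readers (the rater term in the denominator) lower the coefficient even when readers rank the lesions similarly, which is why the demonstration matrix scores poorly.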

Results

Interreader agreement for compartments

In total, the 4 readers made 3037 TL selections, corresponding to 1058 distinct TLs that were each selected at least once. Identical TLs were selected by all four readers in 127 cases in the first read and in 115 cases in the second read. The distribution of the TLs among the compartments is shown in the supplementary material in Table S1.

The interreader agreement for scoring identically chosen TLs by all four readers was excellent with an ICC of 89% in the first read and 91% in the second read. Even when evaluated separately for readers classified by their level of experience, the interreader agreement showed excellent results with an ICC of 92% in the first read and 91% in the second read for ERs and 83% in the first and 82% in the second read for IRs.

In the compartment-based analysis, excellent agreement was found for most organs, with ICCs ≥ 76% in both reads, as presented in Table 2. LN scoring according to SSTR-RADS was the exception, with an ICC of 76% in the first read but only 50% in the second read.

Table 2 Interreader agreement of SSTR-RADS for 4 identical target lesions (TL) among all 4 readers regarding reader types and organ system

Interreader agreement for the overall scan score

In the evaluation of the overall scan score, the ICC for all four readers was 91% in the first read and 93% in the second read. Even among IRs, ICC was excellent with 87% in the first read and 85% in the second read as seen in Table 3. From the 100 evaluated SSTR-PET/CT scans, most of the scans were rated as SSTR-RADS score 4 or 5 by all four readers as presented in Fig. 1. Dedicated results are presented in the supplements in Table S2.

Table 3 Interreader agreement for the overall scan score among experienced (ER) and inexperienced readers (IR)

Fig. 1 Distribution of SSTR-RADS for the overall scan score of experienced (ER) and inexperienced readers (IR)

Interreader agreement for treatment decision with PRRT

All 4 readers were asked whether they would consider PRRT for each patient based on the assigned SSTR-RADS scores and the general image impression. Among ERs, excellent results were achieved for the recommendation of PRRT in both reads (ICC 77% and 79%; Table 4). Among IRs, the agreement was good in the first read (ICC 68%) as well as in the second read (ICC 66%). The overall agreement on treatment decision was high in both reads (ICC 81% and 86%). However, among all 4 readers, IRs decided more frequently on PRRT in both reads (n = 228) compared to ERs (n = 188) as illustrated in Fig. 2.

Table 4 Interreader agreement on the decision for peptide receptor radionuclide therapy (PRRT) among experienced (ER) and inexperienced readers (IR)

Fig. 2 Treatment decision “functional imaging fulfils requirements for PRRT and qualifies patient as potential candidate for peptide receptor radionuclide therapy (PRRT)” among experienced (ER) and inexperienced readers (IR)

Intrareader agreement for compartments, overall scan score, and decision for PRRT

Intrareader agreement was excellent among ERs and IRs for the scoring of compartments and the overall scan score, with ICCs ≥ 92%.

For the decision of treatment with PRRT, slightly lower ICC values were observed, with an ICC of 87% for ERs, 89% for IRs, and 88% for all 4 readers (Table 5). A patient example with assigned SSTR-RADS scores is presented in Fig. 3.

Table 5 Intrareader agreement on organ system–/target lesion–based, overall scan score and decision for peptide receptor radionuclide therapy (PRRT) scoring among experienced (ER) and inexperienced readers (IR)

Fig. 3 50-year-old woman with neuroendocrine tumor of the pancreas. The patient underwent contrast-enhanced diagnostic CT. The thyroid gland was defined as SSTR-RADS 1, with no abnormal tracer uptake. Axial CT, 68Ga-DOTA-TOC PET, and fused PET/CT show a visible lesion with moderate tracer uptake (dashed circle) in the left breast compatible with fibroadenoma. SSTR-RADS 3C was assigned by all readers except one ER in one read. There is intense focal uptake in the liver dome (arrow) with a corresponding finding on CT. This lesion was classified as SSTR-RADS 5 by all 4 readers. However, there are more lesions (red square) in segments II and VIII with no corresponding finding on CT (assigned SSTR-RADS 4 by only 2 readers in both reads). Intense uptake in a mesenteric lymph node can be noted (red circle). All readers identified the corresponding finding on CT, so this lesion was classified as SSTR-RADS 5 by all readers. All 4 readers except one ER in one read recommended PRRT according to the SSTR-RADS 1.0 criteria in this patient.

Discussion

A novel framework for the standardized interpretation of SSTR-PET/CT and treatment planning of NET patients has been introduced in analogy to previously established RADS, titled SSTR-RADS version 1.0 [3]. The present multireader study was conducted to validate the reader-dependent reproducibility of the standardized reporting system SSTR-RADS. SSTR-RADS was applied to SSTR-PET/CT scans of 100 NET patients by readers with low and high levels of experience to evaluate interreader agreement. Moreover, SSTR-PET/CT scans were presented a second time to the readers after 6 weeks to assess intrareader agreement.

When SSTR-RADS was applied to the SSTR-PET/CT scans, excellent inter- and intrareader agreement was observed for the overall scan score in both the first and the second reads. These results were consistent among readers with different levels of experience, indicating high reproducibility of SSTR-RADS and simple application even for inexperienced readers, which is essential to provide the clinician with reliable information. Our results are in line with previously published studies by Fendler et al and Werner et al, who reported ICCs ≥ 85% in the assessment of the overall scan score [14, 17].

Since the theranostic approach for NET has developed into a standardized diagnostic and therapeutic procedure in recent years, accurate assessment of the overall scan score is of utmost importance for selecting eligible patients for PRRT [6, 12, 18,19,20]. Our analysis showed that less experienced readers considered PRRT more often overall (n = 228) than experienced readers (n = 188), which is consistent with previously published data. Fendler et al reported that IRs inappropriately considered PRRT more frequently than ERs and therefore recommended that SSTR-PET/CT scans be interpreted by ERs in this setting [21]. In contrast to our study, Fendler et al used the primary nuclear medicine physician, who had access to all clinical data, as the reference standard. Werner et al reported significantly varying results for considering PRRT among ERs and IRs, which is in keeping with the discrepancy we observed. However, even though decision-making for PRRT seems to require experience and training, the overall interreader agreement among all four readers was excellent in both reads. Therefore, the present study supports considering the proposed framework for implementation into clinical routine, as SSTR-RADS seems to serve as a guide for nuclear medicine physicians in the consideration of PRRT.

Currently, Krenning’s score is most commonly used, but novel molecular imaging reporting and data systems (MI-RADS) such as SSTR-RADS 1.0 might be promising. However, further clinical studies are required to evaluate the clinical outcome of patient selection for PRRT based on the SSTR-RADS 1.0 score. An appropriate scan score is of utmost importance for selecting eligible patients for PRRT [12, 20] and for the nuclear medicine physician's general statement that “functional imaging fulfills requirements for PRRT and qualifies patient as potential candidate for PRRT.” Moreover, in all cases in which PRRT is considered, double reading by a senior physician should be implemented. However, a definite treatment decision for PRRT in a theranostic center requires clinical case discussion in a multidisciplinary team (MDT) [20], and MDTs are considered a quality performance indicator [22]. The MDT board discussion should include multiple parameters such as patient history, tumor load, tumor dynamics, primary tumor location, tumor grading, and alternative systemic and local treatment options, and thus provide a sound basis for an MDT decision.

Roughly one third of identical TLs were chosen by all four readers. The compartment-based assessment of SSTR-RADS scoring on SSTR-PET/CT scans mostly showed excellent interreader agreement among all readers. In the assessment of LN scoring, the interreader agreement varied between excellent (ICC 76%) in the first read and fair (ICC 50%) in the second read. This finding can be explained by the fact that readers mostly scored LNs as SSTR-RADS 4 or 5, but in differing numbers. Although this difference affects the statistical result, from a clinical perspective both scores lead to the consideration of PRRT according to SSTR-RADS. Moreover, this finding further emphasizes the relevance of functional imaging, especially in evaluating small target lesions such as lymph nodes, which can be overlooked on anatomical imaging (CT). Based on these results, it seems reasonable to use SSTR-RADS to describe single lesions from the SSTR-PET/CT findings, given that there was mostly excellent agreement not just for the overall scan score but also for single lesions. However, since LN scoring showed difficulties, we support the proposal of Werner et al to select TLs in a stricter and more standardized manner [14], e.g., by selecting loco-regional lymph nodes, to further improve SSTR-RADS scoring.

Since SSTR-PET/CT plays an increasingly important role in the diagnosis of NETs, such as 68Ga-DOTA-TOC-PET/CT for diagnosing and staging of pancreatic NET, and given the increasing availability of PET/CT, several pitfalls in the interpretation of SSTR-PET/CT have been reported in recent years, such as the physiological expression of SSTR on the cell surface of the pituitary gland or adrenal gland and on macrophages in the case of inflammation [1,2,3, 23]. Minimizing these pitfalls is expected to be another characteristic of the SSTR-directed framework. A study by Weich et al showed that aiding the interpretation of image findings with SSTR-RADS reduced anxiety, especially in inexperienced readers, and increased readers’ confidence [13]. Moreover, the study reported high motivation to learn such a standardized framework and to implement it in clinical routine.

The readers received a brief introduction to SSTR-RADS before the study was conducted. Due to the simplicity and good comprehensibility of SSTR-RADS, the readers were able to familiarize themselves with it in a very short time. Since the 5-point SSTR-RADS scale is structured in a reciprocal fashion with PSMA-RADS for the interpretation of PSMA-PET/CT, both frameworks are summarized under the term molecular imaging (MI)–RADS and can apparently be implemented into clinical routine without significant additional effort [24, 25].

This study has a few limitations. First, no histopathological comparison was available to validate each TL. Second, all readers were blinded to the clinical status of the study patients, which may have reduced interreader agreement; with a better understanding of the clinical situation, interreader agreement may increase even further. Further studies could also evaluate the performance of inexperienced readers against a reference standard established by a consensus interpretation of several experienced readers or by an experienced reader provided with all clinical information.

In conclusion, SSTR-RADS 1.0 represents a highly reproducible and accurate system for stratifying SSTR-targeted PET/CT imaging in NET patients, with high inter- and intrareader agreement among readers with different levels of experience. The proposed scoring system represents a useful tool for simplifying and improving the management of NET patients in clinical practice by standardizing diagnosis and treatment planning. However, in the compartment-based assessment of the SSTR-RADS score, lymph nodes should be carefully selected and scored. Furthermore, image-based decisions on PRRT should preferably be made by experienced physicians.