Introduction

Worldwide, the prevalence of overweight and obesity is on the rise, with more than 1.9 billion adults affected in 2016 [1]. Abdominal obesity is a major risk factor for cardiovascular disease (CVD) development and premature mortality [2, 3]. However, not all obese individuals are at high risk of CVD [4, 5]. Abdominal adipose tissue can be divided into visceral adipose tissue (VAT) and subcutaneous adipose tissue (SAT) [6]. Interestingly, VAT is related to an increased CVD risk, while SAT is not [2, 3, 6, 7]. VAT not only stores lipids but also functions as an endocrine organ, with adipocytes secreting bioactive factors and pro-atherogenic cytokines (adipokines) [2, 8]. Consequently, measurement of VAT volume improves the accuracy of CVD risk profiling [3, 9]. However, the link between abdominal obesity and CVD may also be influenced by the metabolic activity of VAT in the individual patient, with inflammation caused by overproduction of adipokines [10,11,12,13,14]. Therefore, it is likely that not only VAT volume but also the metabolic activity of VAT is linked to CVD risk [15] and may be useful for determining targets for treatment to reduce CV risk.

Imaging modalities such as magnetic resonance imaging (MRI) and X-ray computed tomography (CT) are both reliable methods for the assessment of abdominal adipose tissue volume [16], although both modalities are limited for the assessment of metabolic activity. Previous studies have assessed the metabolic activity of abdominal adipose tissue with 2-deoxy-2-[18F]fluoro-D-glucose ([18F]FDG) positron emission tomography (PET) [17,18,19,20,21,22,23,24,25,26,27]. Overall, there is a growing interest in quantifying VAT as a CV risk marker and as a readout for therapeutic approaches [22, 28, 29]. The most common method to measure metabolic VAT activity is by manually drawing regions of interest (ROIs). However, the mean standardized uptake values (SUVmean) in VAT measured by ROIs in different studies range from 0.22 to 0.88 [17, 26, 27], indicating great variability with this manual method. Consequently, there is a considerable need for a robust (semi)automated method with good accuracy and repeatability for the assessment of VAT [18F]FDG uptake on [18F]FDG-PET/CT scans. The objective of this study is to evaluate the reproducibility and repeatability of a semi-automated method for the assessment of VAT [18F]FDG uptake using an [18F]FDG PET/CT scan and to compare its performance with commonly applied manual methods.

Materials and Methods

Literature Search

Prior to developing a semi-automated method for the metabolic assessment of VAT, a literature search was performed (details are described in the Supplementary material). The purpose of this search was to systematically review published data on reported [18F]FDG uptake values in VAT and the methods used to measure them.

Study Design

To assess the reproducibility and repeatability of the automated method, test-retest scans obtained from an existing study in patients with non-small cell lung cancer were used [30]. This study was approved by the institutional review board and was registered in the Dutch trial register (trialregister.nl, NTR3508). All procedures performed in this study were in accordance with the Ethical Standards of the institutional research committee and carried out according to the principles of the Declaration of Helsinki. Written informed consent for all subjects was obtained before study enrolment.

Patients

Per patient, two whole-body [18F]FDG PET/low-dose (LD) CT scans at 60 min uptake time were performed within 1 week. There were no significant differences in patient preparation and PET acquisition between the test and retest scan. In the current study, only scans obtained 60 min after [18F]FDG injection were included, as is recommended by the European Association of Nuclear Medicine (EANM). The reproducibility and repeatability of VAT [18F]FDG uptake measurements were analyzed in 10 patients who had not received chemotherapy in the past 4 weeks and had no known diabetes mellitus (60 % men, median age 61 years (IQR, 45–66), weight 75 kg (IQR, 67–77), median BMI 24.6 (IQR, 23.1–26.9), injected activity test scan 248 MBq (IQR, 194–377), injected activity retest scan 238 MBq (IQR, 192–392), test scan glucose 5.8 mmol/L (IQR, 5.5–6.1), retest scan glucose 5.9 mmol/L (IQR, 5.4–6.4)).

PET/CT Imaging

All scans were performed on a Gemini TF PET/CT scanner (Philips Healthcare, Best, Netherlands). Low-dose CT scans were performed (120 kV; 50 mAs and pitch: 0.829). The PET acquisition procedures and reconstruction conformed to the EANM recommendations [31]. Patients underwent a low-dose (LD) CT during tidal breathing for attenuation correction purposes, followed by a whole-body [18F]FDG PET/CT scan (skull vertex to mid-thigh) 60 min after [18F]FDG injection, using 2 min per bed position. Weight, height, plasma glucose levels, total injected activity, time of injection, residual activity, and scan start times were recorded.
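The recorded quantities are those typically needed to convert a PET activity concentration into an SUV. The following is a minimal sketch only, assuming a body-weight-normalised SUV, a tissue density of 1 g/mL, and image values that are decay-corrected to scan start; the function name and arguments are hypothetical and the normalisation actually used in this study is not stated here.

```python
# Minimal sketch of a body-weight-normalised SUV (assumed normalisation).
# Assumes PET voxel values are decay-corrected to the scan start time.
F18_HALF_LIFE_MIN = 109.77  # physical half-life of fluorine-18 in minutes


def suv_body_weight(tissue_kbq_per_ml, injected_mbq, residual_mbq,
                    weight_kg, minutes_injection_to_scan):
    """Return SUV = tissue concentration / (net decayed activity / body weight)."""
    # Net injected activity: what was drawn up minus what stayed in the syringe.
    net_kbq = (injected_mbq - residual_mbq) * 1000.0
    # Decay the injected activity from injection time to scan start.
    decay_factor = 0.5 ** (minutes_injection_to_scan / F18_HALF_LIFE_MIN)
    activity_at_scan_kbq = net_kbq * decay_factor
    # Body weight in grams; with 1 g/mL density the ratio is dimensionless.
    weight_g = weight_kg * 1000.0
    return tissue_kbq_per_ml / (activity_at_scan_kbq / weight_g)


# Illustrative (made-up) values: 250 MBq drawn up, 2 MBq residual, 75 kg, 60 min.
example_suv = suv_body_weight(1.5, 250.0, 2.0, 75.0, 60.0)
```

An SUV of 1 then corresponds to the concentration expected if the net activity were distributed uniformly over the body.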

Data Analysis

All measurements were performed using MATLAB software (version R2015b, The MathWorks, Inc., Natick, MA, USA). PET and LDCT data were loaded into MATLAB and PET data were realigned to match the LDCT. The quality of the image fusion was visually verified and approved for all data sets prior to the fat segmentation and analysis. In order to analyze the entire abdomen, all slices from vertebral levels L1 to L5 were manually selected. Two observers (SdB and MR, both trained PhD students) independently analyzed all PET/LDCT scans twice at different time points in order to test both inter- and intra-observer variability. Both observers were blinded to the prior analyses and results.

Adipose tissue was initially segmented by thresholding the CT images between −174 and −24 Hounsfield Units (HU) [18, 32,33,34,35] (refer to the supplementary material for more details on fat segmentation). The abdominal muscular layer was used as a boundary to separate VAT and SAT. Because the abdominal muscular layer did not always completely separate the VAT and SAT on the LDCT, for instance at the linea alba, a line was manually drawn as a reference in all slices in order to separate VAT and SAT.
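The initial CT-based step amounts to a window on the HU values. A minimal sketch is given below, assuming inclusive bounds and a single 2D slice stored as nested lists; the function name and the toy slice are illustrative and are not taken from the study's MATLAB implementation. The separation of VAT from SAT along the muscular layer is not shown.

```python
# Minimal sketch of HU windowing for adipose tissue (bounds from the text,
# inclusiveness of the bounds is an assumption).
FAT_HU_MIN = -174
FAT_HU_MAX = -24


def segment_fat(ct_slice):
    """Return a binary mask (1 = adipose candidate) for a 2D slice of HU values."""
    return [[1 if FAT_HU_MIN <= hu <= FAT_HU_MAX else 0 for hu in row]
            for row in ct_slice]


# Toy slice with typical HU values: air (-1000), fat (about -100),
# soft tissue (about 40 to 50) and bone (about 700).
toy_slice = [
    [-1000, -120, -100, 50],
    [-174, -24, -23, 700],
    [-175, -90, 40, -60],
]
fat_mask = segment_fat(toy_slice)
```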

The metabolic activity was expressed as SUV of [18F]FDG [36]. High SUV inside VAT and SAT can be due to spillover from metabolically active organs such as kidneys and intestines. Therefore, the initial CT-based segmentation was further adapted using an SUV threshold and a morphological erosion in order to exclude spillover of signal from [18F]FDG-avid structures. Because in previous studies SUVmean in VAT ranged from 0.22 to 0.89 [25, 27] and SUVmax from 0.53 to 1.21 [17, 26], the effect of using SUV thresholds ranging from 1.0 to 2.5 on VAT and SAT uptake assessments was analyzed. In addition, the effects of different erosions ranging from 0 to 5 pixels (pixel size of 1.17 × 1.17 mm²) on VAT and SAT uptake assessments were analyzed. The mean and median SUV generated with the automated method are referred to as ASUVmean and ASUVmedian. The ASUVmean in VAT and SAT were compared with SUVmean assessed with a manual ROI selection. For the manual method, all slices between L1 and L5, selected by the observers during the automated method, were used. This area was divided into 4 equally sized regions and ROIs were placed in the middle slice of every region. On each of these slices, 3 ROIs (diameter 10.5 mm) were placed in both VAT and SAT. Observers were instructed to place ROIs in homogeneous areas covering the surface of the ROI, to avoid spillover effects, and to maximize the distance between ROIs. SUVmean values across these slices were averaged and referred to as MSUVmean. Furthermore, the percentage of VAT volume depicted with CT that remained after thresholding and erosion was calculated. A schematic overview of the semi-automated method is shown in Fig. 1.

Fig. 1. Schematic overview of the most important steps of adipose tissue segmentation on CT and SUV analysis on [18F]FDG PET/LDCT scan. SAT subcutaneous adipose tissue, SUV standardized uptake values, VAT visceral adipose tissue.
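The refinement of a CT-based fat mask with an SUV threshold and an erosion can be sketched as follows. This is an illustration under stated assumptions, not the study's MATLAB code: the threshold is applied before the erosion, the erosion uses a 4-connected (cross-shaped) element in 2D, image borders count as background, and all names and toy values are hypothetical.

```python
from statistics import mean, median


def erode(mask, iterations=1):
    """Binary erosion with a 4-connected element; borders count as background."""
    rows, cols = len(mask), len(mask[0])
    for _ in range(iterations):
        out = [[0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                if not mask[r][c]:
                    continue
                neighbours = ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                # A pixel survives only if all four neighbours are foreground.
                if all(0 <= i < rows and 0 <= j < cols and mask[i][j]
                       for i, j in neighbours):
                    out[r][c] = 1
        mask = out
    return mask


def refine_and_measure(fat_mask, suv, suv_threshold, erosion_px):
    """Drop voxels at or above the SUV threshold, erode, then summarise uptake."""
    kept = [[1 if m and s < suv_threshold else 0 for m, s in zip(mrow, srow)]
            for mrow, srow in zip(fat_mask, suv)]
    kept = erode(kept, erosion_px)
    values = [s for krow, srow in zip(kept, suv)
              for k, s in zip(krow, srow) if k]
    n_ct = sum(map(sum, fat_mask))
    return {
        "asuv_mean": mean(values) if values else None,
        "asuv_median": median(values) if values else None,
        "remaining_fraction": len(values) / n_ct if n_ct else 0.0,
    }


# Toy 5 x 5 fat region: one hot voxel (2.4) with elevated neighbours (1.2)
# mimicking spillover; background fat uptake is 0.6 (made-up values).
fat = [[1] * 5 for _ in range(5)]
suv_map = [[0.6] * 5 for _ in range(5)]
suv_map[2][2] = 2.4
for r, c in ((1, 2), (3, 2), (2, 1), (2, 3)):
    suv_map[r][c] = 1.2
result = refine_and_measure(fat, suv_map, suv_threshold=1.9, erosion_px=1)
```

In this toy case the threshold removes the hot voxel and the erosion removes its elevated neighbours, which is the intended effect of combining the two steps; the price is a smaller remaining volume.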

Criteria Optimal Settings for Automated Metabolic Assessment of VAT

According to an expert panel, the optimal threshold and erosion settings for the automated metabolic assessment of VAT had to fulfill the following criteria: (1) the measurement is highly reproducible and repeatable (ICC > 0.80); (2) the VAT volume that remains for analysis is as large as possible (at least 50 % of the CT-based segmented VAT) while spillover effects are ruled out by visual inspection; and (3) the change in ASUVmedian/ASUVmean VAT is smaller than 0.01 SUV, a difference that was not considered relevant.
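The quantitative parts of these criteria can be expressed as a simple filter over candidate settings. The sketch below is hypothetical: the class, field names, and candidate values are made up for illustration and are not study results, the change in SUV is assumed to be taken between neighbouring settings, and the visual check for spillover cannot be automated here.

```python
from dataclasses import dataclass


@dataclass
class Setting:
    suv_threshold: float
    erosion_px: int
    icc: float                 # reproducibility/repeatability ICC
    remaining_fraction: float  # share of CT-segmented VAT left for analysis
    delta_suv: float           # change in ASUV versus the neighbouring setting (assumed)


def passes_criteria(s, min_icc=0.80, min_fraction=0.50, max_delta=0.01):
    """Check criteria (1) to (3), excluding the visual inspection for spillover."""
    return (s.icc > min_icc
            and s.remaining_fraction >= min_fraction
            and abs(s.delta_suv) < max_delta)


# Made-up candidates, for illustration only.
candidates = [
    Setting(1.2, 1, 0.99, 0.70, 0.030),  # fails: change in SUV too large
    Setting(1.6, 1, 0.99, 0.74, 0.008),  # passes all three checks
    Setting(1.6, 4, 0.99, 0.35, 0.004),  # fails: too little VAT remains
]
eligible = [s for s in candidates if passes_criteria(s)]
```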

Statistical Analysis

All analyses were performed using SPSS (Released 2013. IBM SPSS Statistics for Windows, Version 22.0. Armonk, NY: IBM Corp). For the reproducibility analysis, only the measurements of the first (test) [18F]FDG-PET/LDCT scan were used, as the second (retest) scan is related to the first scan and therefore cannot be used as an independent measurement. For the repeatability analysis (test-retest), measurements of the same observer were used to exclude the inter-observer variability.

The influence of threshold and erosion on ASUVmean VAT was evaluated using a generalized estimating equations approach with an unstructured covariance matrix. ASUVmean VAT was used as the dependent variable in the model; erosion and threshold were used as factors. An interaction between erosion and threshold was also added to the model. Effects were evaluated and compared with appropriate correction for pairwise comparisons. The effects of threshold and erosion on ASUVmedian VAT were analyzed in the same way as for ASUVmean VAT.

The automated measurement of metabolic activity (with the optimal threshold and erosion settings) was compared to manually placed ROIs with a Wilcoxon signed-rank test. To explore whether the automated and manual measurements were correlated, a Spearman correlation coefficient (r) was calculated.
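For readers unfamiliar with the rank correlation, a minimal standard-library sketch of the Spearman coefficient is given below (Pearson correlation of average ranks, which handles ties). The function names and the paired values are illustrative only and are not study data; the Wilcoxon signed-rank test is not shown.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Made-up paired measurements (manual vs. automated), for illustration only.
manual = [0.40, 0.45, 0.50, 0.55, 0.60]
automated = [0.62, 0.70, 0.68, 0.75, 0.80]
r_example = spearman_r(manual, automated)
```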

The inter- and intra-observer reproducibility and the repeatability were quantified using intraclass correlation coefficients (ICCs; based on absolute agreement). Bland-Altman plots [37] were used to evaluate the reproducibility and repeatability. The measurement error for the reproducibility and repeatability was calculated according to the formula of Bland and Altman [38]. The variation coefficients (%) were calculated as the measurement error divided by the mean of the measurements.
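These agreement statistics can be sketched for paired measurements as follows. Several choices here are assumptions, since the text does not specify them: the ICC is the two-way, single-measure, absolute-agreement form, ICC(A,1); the measurement error is taken as the within-subject standard deviation for two measurements per subject; and the limits of agreement use 1.96 standard deviations. All names and values are illustrative, not study data.

```python
from math import sqrt
from statistics import mean, stdev


def icc_absolute_agreement(a, b):
    """ICC(A,1) for two measurement series on the same subjects (assumed form)."""
    n, k = len(a), 2
    rows = list(zip(a, b))
    grand = mean(a + b)
    ss_rows = k * sum((mean(row) - grand) ** 2 for row in rows)
    ss_cols = n * sum((mean(col) - grand) ** 2 for col in (a, b))
    ss_total = sum((v - grand) ** 2 for v in a + b)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)


def bland_altman(a, b):
    """Return bias and 95 % limits of agreement of the paired differences."""
    diffs = [x - y for x, y in zip(a, b)]
    bias, sd = mean(diffs), stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def measurement_error(a, b):
    """Within-subject SD for two measurements per subject: sqrt(sum(d^2) / 2n)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / (2 * len(a)))


def variation_coefficient_pct(a, b):
    """Measurement error divided by the mean of all measurements, in percent."""
    return 100.0 * measurement_error(a, b) / mean(a + b)


# Made-up test and retest values, for illustration only.
first = [0.70, 0.74, 0.72, 0.76]
second = [0.71, 0.73, 0.73, 0.75]
icc_example = icc_absolute_agreement(first, second)
error_example = measurement_error(first, second)
cv_example = variation_coefficient_pct(first, second)
bias_example, lower_example, upper_example = bland_altman(first, second)
```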

Results

Semi-automated Metabolic Assessment of VAT Threshold and Erosion

For every combination of threshold and erosion, the reproducibility (inter- and intra-observer) and repeatability for the ASUVmean VAT and ASUVmedian VAT were calculated. As a result of 16 thresholds and 6 sizes of erosion, each 3D plot represents 96 ICCs (Suppl. Fig. 2 in Electronic Supplementary Material (ESM)). Since ASUVmean VAT and ASUVmedian VAT were highly reproducible and repeatable for all combinations of threshold and erosion, both parameters could be used to report [18F]FDG uptake (see also Suppl. Fig. 3).
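The count of 96 follows from the tested ranges: SUV thresholds from 1.0 to 2.5 in steps of 0.1 give 16 values, and erosions from 0 to 5 pixels give 6. A short sketch of that grid, with illustrative variable names:

```python
from itertools import product

# SUV thresholds 1.0, 1.1, ..., 2.5 (rounded to avoid floating-point drift).
thresholds = [round(1.0 + 0.1 * i, 1) for i in range(16)]
# Erosion sizes 0 to 5 pixels.
erosions = list(range(0, 6))
# One ICC is computed per (threshold, erosion) pair.
grid = list(product(thresholds, erosions))
```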

Fig. 2. The influence of threshold and erosion on (a) ASUVmean VAT, (b) ASUVmedian VAT, and (c) the percentage of VAT volume for metabolic analysis. Pixel size is 1.17 × 1.17 mm².

Fig. 3. Reproducibility. a MSUVmean VAT of observer 1 plotted against those of observer 2 and d corresponding Bland-Altman plot. b ASUVmean VAT of observer 1 plotted against those of observer 2 and e corresponding Bland-Altman plot. c ASUVmedian VAT of observer 1 plotted against those of observer 2 and f corresponding Bland-Altman plot. SD standard deviation, SUV standardized uptake values, VAT visceral adipose tissue; ASUVmean = automatically generated with the method with setting SUV threshold 1.9, erosion; MSUVmean = manually generated by drawing regions of interest.

The influence of the threshold and erosion on ASUVmean VAT, ASUVmedian VAT, and the percentage of VAT volume remaining after threshold and erosion is shown in Fig. 2. Supplemental Figs. 4 and 5 show an example of the influence of different thresholds and erosions on the remaining abdominal adipose tissue analyzed. ASUVmean VAT and ASUVmedian VAT decreased significantly for every increase in erosion (all p < 0.001) and increased significantly for every 0.1 SUV increase in threshold (all p < 0.001). According to the criteria described earlier (Materials and Methods), an SUV threshold of 1.9 and an erosion of 1 turned out to be the optimal setting for automated assessment of ASUVmean VAT and as such was used for further analysis. For ASUVmedian VAT, an SUV threshold of ≥ 1.5 with an erosion of 1 or at most 2 turned out to be optimal (see also Fig. 2). For further analysis, ASUVmedian VAT was defined using an SUV threshold of 1.5 and an erosion of 2.

Reproducibility and Repeatability PET/CT Data

The characteristics of the PET/CT data for observers 1 and 2 are shown in Table 1. The inter- and intra-observer reproducibility ICCs and the repeatability ICCs are shown in Table 2. The automated assessment yielded significantly higher SUVmean and SUVmedian in VAT and SAT than the manual ROIs (both p < 0.01). The MSUVmean VAT was correlated with ASUVmean VAT (r = 0.71, p = 0.02) and ASUVmedian VAT (r = 0.79, p < 0.01). The MSUVmean SAT was correlated with ASUVmean SAT (r = 0.79, p < 0.01) and ASUVmedian SAT (r = 0.86, p < 0.01). The ASUVmean VAT correlated with ASUVmedian VAT (r = 0.94, p < 0.01) and ASUVmean SAT correlated with ASUVmedian SAT (r = 0.88, p < 0.01).

Table 1 PET/CT data characteristics for both observers
Table 2 Intraclass correlation coefficients of PET/LDCT data reproducibility and repeatability

Figure 3 shows the intra-observer reproducibility for MSUVmean VAT and ASUVmean VAT and the corresponding Bland-Altman plots. The intra-observer mean MSUVmean VAT was 0.48, with a measurement error of 0.091 SUV and a variation coefficient of 19.2 %. The mean ASUVmean VAT was 0.73, with a measurement error of 0.004 SUV and a variation coefficient of 0.6 %. The mean ASUVmedian VAT was 0.60, with a measurement error of 0.003 SUV and a variation coefficient of 0.5 %.

Figure 4 shows the repeatability, test-retest data, for MSUVmean VAT and ASUVmean VAT and corresponding Bland-Altman plots. The mean MSUVmean VAT was 0.55 with a measurement error of 0.069 SUV and variation coefficient of 12.6 %. The mean ASUVmean VAT was 0.73 with a measurement error of 0.019 SUV and variation coefficient of 2.5 %. The mean ASUVmedian VAT was 0.60 with a measurement error of 0.010 SUV and variation coefficient of 1.7 %.

Fig. 4. Repeatability. a MSUVmean VAT of scan 1 (test) plotted against those of scan 2 (retest) and d corresponding Bland-Altman plot. b ASUVmean VAT of scan 1 (test) plotted against those of scan 2 (retest) and e corresponding Bland-Altman plot. c ASUVmedian VAT of scan 1 (test) plotted against those of scan 2 (retest) and f corresponding Bland-Altman plot. SD standard deviation, SUV standardized uptake values, VAT visceral adipose tissue; ASUVmean = automatically generated with the method with setting SUV threshold 1.9, erosion; MSUVmean = manually generated by drawing regions of interest.

Discussion

The present study assessed the reproducibility and repeatability of the metabolic assessment of VAT and SAT on [18F]FDG-PET/CT imaging using both manual and semi-automated segmentation. The automated metabolic assessment of VAT was highly reproducible and repeatable. Moreover, the ICCs of the automated metabolic assessment of VAT were superior to those of the manual method. The automated and manual metabolic assessments of SAT were also highly reproducible. However, the repeatability of the automated metabolic assessment of SAT was lower than that of the manual method and lower than that of the automated metabolic assessment of VAT.

The present study investigated the repeatability of [18F]FDG uptake in VAT and SAT with a semi-automated segmentation which included an SUV threshold and erosion approach, with settings optimized for analysis of VAT. As expected, the SUVmean/median in VAT increased with higher SUV thresholds and decreased with larger erosions. However, the increase in SUVmean/median became smaller with every 0.1 SUV increase in threshold. As a difference of < 0.01 SUV was not considered relevant, this was used as a criterion to determine the optimal threshold. In addition, since SUVs in VAT are almost normally distributed (Suppl Fig. 3), the ASUVmean as well as the ASUVmedian are reliable parameters. As both parameters were highly reproducible and repeatable, we report both. Overall, SUVmean is the most common parameter used to report [18F]FDG uptake in VAT [18, 20, 23,24,25, 27, 29].

Another criterion for the optimal threshold and erosion was that at least 50 % of the CT-based segmented VAT should remain for SUV analysis. This was based on the assumption that not more than 50 % of the CT-based segmented VAT would be influenced by spillover effects. Based on the results of this study, we suggest that for automated assessment of the metabolic activity of VAT, the SUV threshold should be 1.9 for ASUVmean or 1.5 for ASUVmedian and the erosion should be 1 pixel and at most 2 pixels.

The method was not fully automated since two manual actions were needed: selection of the slices corresponding to vertebral levels L1 to L5 and drawing a line to close the abdominal muscular layer to separate VAT and SAT. However, these manual actions barely affected the outcomes, as VAT and SAT volume measurements were highly reproducible and repeatable (all ICC > 0.97).

VAT is recognized as an important contributor to obesity-associated CVD risk and is therefore relevant to improving CVD risk management. Clearly, VAT volume and metabolic activity are both linked to CVD risk and have become targets of imaging modalities [3, 9, 15, 39]. However, a note of caution is due here, since the interaction of insulin resistance, which is common in obesity, with [18F]FDG uptake in VAT is not fully understood. In addition, some studies showed a decreased [18F]FDG VAT uptake in obese subjects, which suggests an inverse association with insulin resistance and CVD risk [18, 20]. With the availability of an automated method for the assessment of VAT and SAT with known reproducibility and repeatability, this interaction can be taken into account in future studies.

Two other studies used an automated method, in which a VOI generated on CT was transferred to PET, to report [18F]FDG uptake in VAT [23, 27]. Interestingly, one of these studies showed that VAT [18F]FDG uptake was associated with the degree of intestinal uptake on PET/CT [27]. Those findings confirm the need for a threshold and erosion in the automated metabolic assessment of VAT to overcome spillover effects from surrounding organs.

The ASUVmean VAT/ASUVmedian VAT was higher than the MSUVmean VAT. This result may be explained by the fact that manual ROIs were placed in areas of low [18F]FDG uptake, in an attempt to avoid spillover, and therefore potentially suffer from selection bias. Furthermore, the current study showed that automated measurements of VAT were more accurate than manually drawn ROIs, as the reproducibility ICCs, especially intra-observer, and the repeatability ICCs were much higher. A possible explanation for this might be that the uptake of [18F]FDG in VAT is not uniform. Therefore, the uptake in an ROI may not be representative of the effective mean uptake of [18F]FDG in the whole VAT region. However, it could be argued that in the current study the SUVmean is still influenced by spillover effects from other structures and may represent an overestimation of VAT activity. As the uptake of [18F]FDG in VAT is almost normally distributed, it seems likely that SUVmean is a reliable parameter. Nevertheless, for a less normal distribution, SUVmedian can be used. In addition, [18F]FDG uptake in VAT measured as SUVmedian is also higher than that measured with ROIs (0.59 vs. 0.49, respectively). It may be the case that ROIs underestimate [18F]FDG uptake in VAT. As the combined volume of all ROIs is approximately 0.1 % of the total VAT volume, it is unlikely that ROIs reliably represent the overall [18F]FDG uptake in VAT, let alone provide an adequate method for readout of therapeutic approaches. Moreover, the variation coefficients of the automated measurements for the reproducibility between observers (0.6 %/0.5 %) and the repeatability (2.5 %/1.7 %) were far lower than those of manual ROIs (19.2 % and 12.6 %, respectively).

Our study also has some limitations. First, this study included predominantly patients with a healthy BMI (< 25) and no obese patients (BMI > 30). Therefore, it is uncertain whether the automated method is equally accurate in obese subjects. Secondly, in the current study the settings were optimized for analysis of VAT. As a result, the repeatability of the automated metabolic assessment of SAT was lower than that of VAT. Thirdly, the [18F]FDG uptake in VAT was not compared with levels of adipokines or macrophage infiltration. Therefore, the hypothesis that the inflammatory state measured by [18F]FDG uptake in VAT is correlated with macrophage infiltration or is positively associated with adipokine levels could not be investigated. Further studies, which take levels of adipokines and macrophage infiltration in adipocytes into account, will need to be performed. Finally, it is unclear whether the optimal SUV threshold and level of erosion found in this research are also optimal for scanners from other vendors. This requires a sensitivity analysis, which was not achievable within the scope of the current study.

Conclusion

In summary, we conclude that a (semi-)automated method is feasible and should be the preferred approach for metabolic assessment of VAT in PET/CT [18F]FDG data. Furthermore, the metabolic assessment of VAT may be useful for determining targets for treatment to reduce CV risk.