Performance Evaluation of a Semi-automated Method for [18F]FDG Uptake in Abdominal Visceral Adipose Tissue

Severity of abdominal obesity and possibly levels of metabolic activity of abdominal visceral adipose tissue (VAT) are associated with an increased risk for cardiovascular disease (CVD). In this context, the purpose of the current study was to evaluate the reproducibility and repeatability of a semi-automated method for assessment of the metabolic activity of VAT using 2-deoxy-2-[18F]fluoro-D-glucose ([18F]FDG) positron emission tomography (PET)/x-ray computed tomography (CT). Ten patients with lung cancer who underwent two baseline whole-body [18F]FDG PET/low-dose (LD) CT scans within 1 week were included. Abdominal VAT was automatically segmented using CT between levels L1–L5. The initial CT-based segmentation was further optimized using PET data with a standardized uptake value (SUV) threshold approach (range 1.0–2.5) and morphological erosion (range 0–5 pixels). The [18F]FDG uptake in SUV that was measured by the automated method was compared with manual analysis. The reproducibility and repeatability were quantified using intraclass correlation coefficients (ICCs). The metabolic assessment of VAT on [18F]FDG PET/LDCT scans, expressed as SUVmean, showed high inter- and intra-observer reproducibility (all ICCs > 0.99) and high overall repeatability (ICC = 0.98) with the automated method. The manual method showed good inter-observer reproducibility (all ICCs > 0.92), but lower intra-observer reproducibility (ICC = 0.57) and lower overall repeatability (ICC = 0.78) than the automated method. Our proposed semi-automated method provided reproducible and repeatable quantitative analysis of [18F]FDG uptake in VAT. We expect this method to aid future research regarding the role of VAT in development of CVD.


Introduction
Worldwide, the prevalence of overweight and obesity is on the rise, with more than 1.9 billion adults affected in 2016 [1]. Abdominal obesity is a major risk factor for cardiovascular disease (CVD) development and premature mortality [2,3]. However, not all obese individuals are at high risk of CVD [4,5]. Abdominal adipose tissue can be divided into visceral adipose tissue (VAT) and subcutaneous adipose tissue (SAT) [6]. Interestingly, VAT is related to an increased CVD risk, while SAT is not [2,3,6,7]. VAT not only provides storage of lipids but also functions as an endocrine organ, with adipocytes secreting bioactive factors and pro-atherogenic cytokines (adipokines) [2,8]. Consequently, measurement of VAT volume improves accuracy of CVD risk profiling [3,9]. However, the link between abdominal obesity and CVD may also be influenced by the metabolic activity of VAT in the individual patient, with inflammation caused by overproduction of adipokines [10-14]. Therefore, it is likely that not only VAT volume but also the metabolic activity of VAT is linked to CVD risk [15] and may be useful for determining targets for treatment to reduce CV risk.
Imaging modalities such as magnetic resonance imaging (MRI) and X-ray computed tomography (CT) are both reliable methods for the assessment of abdominal adipose tissue volume [16], although both modalities are limited for the assessment of metabolic activity. Previous studies have assessed the metabolic activity of abdominal adipose tissue with 2-deoxy-2-[18F]fluoro-D-glucose ([18F]FDG) positron emission tomography (PET) [17-27]. Overall, there is a growing interest in quantifying VAT as a CV risk marker and as a readout for therapeutic approaches [22,28,29]. The most common method to measure metabolic VAT activity is by manually drawing regions of interest (ROIs). However, the mean standardized uptake values (SUVmean) in VAT measured by ROIs in different studies range from 0.22 to 0.88 [17,26,27], indicating great variability with this manual method. Consequently, there is a considerable need for a robust (semi-)automated method with good accuracy and repeatability for assessment of VAT [18F]FDG uptake on [18F]FDG PET/CT scans. The objective of this study is to evaluate the reproducibility and repeatability of a semi-automated method for assessment of VAT [18F]FDG uptake using an [18F]FDG PET/CT scan and to compare its performance with commonly applied manual methods.

Literature Search
Prior to developing a semi-automated method for the metabolic assessment of VAT, a literature search was performed (details are described in the Supplementary material). The purpose of this search was to systematically review published data on the reported [18F]FDG uptake values and methods in VAT.

Study Design
To assess the reproducibility and repeatability of the automated method, test-retest scans obtained from an existing study in patients with non-small cell lung cancer were used [30]. This study was approved by the institutional review board and was registered in the Dutch trial register (trialregister.nl, NTR3508). All procedures performed in this study were in accordance with the Ethical Standards of the institutional research committee and carried out according to the principles of the Declaration of Helsinki. Written informed consent for all subjects was obtained before study enrolment.

PET/CT Imaging
All scans were performed on a Gemini TF PET/CT scanner (Philips Healthcare, Best, Netherlands). Low-dose CT scans were performed (120 keV; 50 mAs and pitch: 0.829). The PET acquisition procedures and reconstruction conformed to the EANM recommendations [31]. Patients underwent a low-dose (LD) CT during tidal breathing for attenuation correction purposes, followed by a whole-body [18F]FDG PET/CT scan (skull vertex to mid-thigh) 60 min after [18F]FDG injection, using 2 min per bed position. Weight, height, plasma glucose levels, total injected activity, time of injection, residual activity, and scan start times were recorded.
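The recorded quantities (weight, injected and residual activity, injection and scan start times) are the inputs typically needed to convert a PET activity concentration into an SUV. The following is a minimal sketch of a body-weight-normalized SUV, not the study's actual implementation. The function names, the units, the assumption that tissue density is 1 g/mL, and the simplification that injected and residual activity refer to the same time point are all illustrative assumptions.

```python
import math

# Physical half-life of fluorine-18 in minutes.
F18_HALF_LIFE_MIN = 109.77


def decay_factor(elapsed_min, half_life_min=F18_HALF_LIFE_MIN):
    """Fraction of the activity that remains after elapsed_min minutes."""
    return math.exp(-math.log(2) * elapsed_min / half_life_min)


def suv_body_weight(conc_kbq_per_ml, injected_mbq, residual_mbq,
                    weight_kg, minutes_injection_to_scan):
    """Body-weight SUV (hypothetical helper, assumes 1 mL of tissue = 1 g).

    conc_kbq_per_ml: activity concentration measured at scan start.
    injected_mbq / residual_mbq: syringe activity before and after
    injection, both assumed to be referenced to the injection time.
    """
    net_mbq = injected_mbq - residual_mbq
    # Decay-correct the net injected activity to the scan start time.
    dose_at_scan_kbq = net_mbq * 1000.0 * decay_factor(minutes_injection_to_scan)
    weight_g = weight_kg * 1000.0
    return conc_kbq_per_ml / (dose_at_scan_kbq / weight_g)
```

An SUV of 1 then corresponds to the concentration that would result if the net injected activity were spread uniformly over the body.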

Data Analysis
All measurements were performed using MATLAB software (version R2015b, The MathWorks, Inc., Natick, MA, USA). PET and LDCT data were loaded into MATLAB and PET data were realigned to match the LDCT. The quality of the image fusion was visually verified and approved for all data sets prior to the fat segmentation and analysis. In order to analyze the entire abdomen, all slices from vertebral levels L1 to L5 were manually selected. Two observers (SdB and MR, both trained PhD students) independently analyzed all PET/LDCT scans twice at different time points in order to test both inter- and intra-observer variability. Both observers were blinded to the prior analyses and results.
Adipose tissue was initially segmented by thresholding the CT images between −174 and −24 Hounsfield Units (HU) [18, 32-35] (refer to the supplementary material for more details on fat segmentation). The abdominal muscular layer was used as a boundary to separate VAT and SAT. Because the abdominal muscular layer did not always completely separate the VAT and SAT on the LDCT, for instance at the linea alba, a line was manually drawn as a reference in all slices in order to separate VAT and SAT.
The metabolic activity was expressed as SUV of [18F]FDG [36]. High SUV inside VAT and SAT can be due to spillover from metabolically active organs such as the kidneys and intestines. Therefore, the initial CT-based segmentation was further adapted using an SUV threshold and a morphological erosion in order to exclude spillover of signal from [18F]FDG-avid structures. Because in previous studies SUVmean in VAT ranged from 0.22 to 0.89 [25,27] and SUVmax from 0.53 to 1.21 [17,26], the effect of using SUV thresholds ranging from 1.0 to 2.5 on VAT and SAT uptake assessments was analyzed. In addition, the effects of different erosions ranging from 0 to 5 pixels (pixel size of 1.17 × 1.17 mm²) on VAT and SAT uptake assessments were analyzed. The mean and median SUV generated with the automated method are referred to as A SUVmean and A SUVmedian. The A SUVmean in VAT and SAT were compared with SUVmean assessed with a manual ROI selection. For the manual method, all slices between L1 and L5, selected by the observers during the automated method, were used. This area was divided into 4 equally sized regions and ROIs were placed in the middle slice of every region. On each of these slices, 3 ROIs (diameter 10.5 mm) were placed in both VAT and SAT. Observers were instructed to place ROIs in homogeneous areas covering the surface of the ROI, to avoid spillover effects, and to maximize the distance between ROIs. SUVmean values across these slices were averaged and referred to as M SUVmean. Furthermore, the percentage of VAT volume depicted with CT that remained after thresholding and erosion was calculated. A schematic overview of the semi-automated method is shown in Fig. 1.
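The CT-based step can be illustrated with a minimal Python sketch (the study itself used MATLAB, and this is not the authors' code). It assumes that the HU window bounds are inclusive and that a hypothetical boolean mask `inside_wall` is already available, marking the pixels enclosed by the abdominal muscular layer and the manually drawn closing line.

```python
# Adipose tissue window in Hounsfield Units, as stated in the text.
FAT_HU_MIN = -174
FAT_HU_MAX = -24


def fat_mask(ct_slice, hu_min=FAT_HU_MIN, hu_max=FAT_HU_MAX):
    """Mark pixels whose HU value lies in the adipose window.

    ct_slice is a 2D list of HU values. Inclusive bounds are an assumption.
    """
    return [[hu_min <= hu <= hu_max for hu in row] for row in ct_slice]


def split_vat_sat(mask, inside_wall):
    """Split a fat mask into VAT (inside the wall) and SAT (outside).

    inside_wall is a hypothetical boolean mask of the same shape as mask.
    """
    vat = [[m and inside for m, inside in zip(mrow, irow)]
           for mrow, irow in zip(mask, inside_wall)]
    sat = [[m and not inside for m, inside in zip(mrow, irow)]
           for mrow, irow in zip(mask, inside_wall)]
    return vat, sat
```

Every fat pixel ends up in exactly one of the two compartments, which mirrors the role of the manually closed muscular boundary.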
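The refinement of the CT-based mask can be sketched as follows. This is an illustrative reading of the text rather than the published implementation: it assumes that pixels with an SUV at or above the threshold are removed first, that the erosion is then applied to the remaining mask with a 4-neighbour (cross-shaped) structuring element, and that pixels outside the image count as background. All function names are hypothetical.

```python
from statistics import mean, median


def apply_suv_threshold(mask, suv, threshold):
    """Keep only mask pixels whose SUV is strictly below the threshold."""
    return [[m and s < threshold for m, s in zip(mrow, srow)]
            for mrow, srow in zip(mask, suv)]


def _survives(mask, r, c):
    """True if pixel (r, c) and its four direct neighbours are all set."""
    if not mask[r][c]:
        return False
    rows, cols = len(mask), len(mask[0])
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if not (0 <= rr < rows and 0 <= cc < cols) or not mask[rr][cc]:
            return False
    return True


def erode(mask, iterations=1):
    """Binary erosion by `iterations` pixels (cross-shaped element)."""
    for _ in range(iterations):
        mask = [[_survives(mask, r, c) for c in range(len(mask[0]))]
                for r in range(len(mask))]
    return mask


def summarize(ct_mask, suv, threshold, erosion_px):
    """Return SUV mean, SUV median and percentage of the CT mask kept."""
    refined = erode(apply_suv_threshold(ct_mask, suv, threshold), erosion_px)
    values = [s for mrow, srow in zip(refined, suv)
              for m, s in zip(mrow, srow) if m]
    n_ct = sum(sum(row) for row in ct_mask)
    return {
        "suv_mean": mean(values) if values else None,
        "suv_median": median(values) if values else None,
        "percent_remaining": 100.0 * len(values) / n_ct if n_ct else 0.0,
    }
```

Eroding after thresholding also discards the pixels directly adjacent to a removed hot spot, which is where partial-volume spillover would be expected to be strongest.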

Criteria Optimal Settings for Automated Metabolic Assessment of VAT
According to an expert panel, the optimal threshold and erosion settings for the automated metabolic assessment of VAT had to fulfill the following criteria: (1) highly reproducible and repeatable (ICC > 0.80), (2) the VAT volume that remains for analysis should be as large as possible (at least 50 % of the CT-based segmented VAT) while ruling out spillover effects by visual inspection, (3) the change in A SUVmedian / A SUVmean VAT should be smaller than 0.01, which was not considered a relevant difference.
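The three criteria can be expressed as a simple screening rule. The sketch below is one possible reading, not the panel's actual procedure: it assumes that the "change" in criterion (3) is the difference in SUV between two consecutive threshold steps at a fixed erosion, and it cannot represent the visual inspection for spillover. The function name and the tuple layout are hypothetical.

```python
def select_threshold(rows, min_icc=0.80, min_percent=50.0, max_delta=0.01):
    """Return the lowest threshold that satisfies the numeric criteria.

    rows: list of (threshold, icc, percent_remaining, suv) tuples for one
    erosion setting, sorted by ascending threshold.
    Returns None if no threshold qualifies.
    """
    for previous, current in zip(rows, rows[1:]):
        threshold, icc, percent_remaining, suv = current
        # Change in SUV relative to the previous threshold step.
        delta = abs(suv - previous[3])
        if (icc > min_icc
                and percent_remaining >= min_percent
                and delta < max_delta):
            return threshold
    return None
```

Because the first row has no predecessor, it can never be selected under this reading; the visual check for residual spillover would still have to be performed separately.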

Statistical Analysis
All analyses were performed using SPSS (Released 2013. IBM SPSS Statistics for Windows, Version 22.0. Armonk, NY: IBM Corp). For the reproducibility analysis, only the measurements of the first (test) [18F]FDG PET/LDCT scan were used, as the second (retest) scan is related to the first scan and therefore cannot be used as an independent measurement. For the repeatability analysis (test-retest), measurements of the same observer were used to exclude the intra-observer variability.
The influence of threshold and erosion on A SUVmean VAT was evaluated using a generalized estimating equations approach with an unstructured covariance matrix. A SUVmean VAT was used as the dependent variable in the model; erosion and threshold were used as factors. An interaction between erosion and threshold was also added to the model. Effects were evaluated and compared with appropriate correction for pairwise comparisons. The effects of threshold and erosion on A SUVmedian VAT were analyzed in the same way as for A SUVmean VAT.
The automated measurement of metabolic activity (with the optimal threshold and erosion settings) was compared to manually placed ROIs with a Wilcoxon signed-rank test. To explore whether the automated and manual measurements were correlated, a Spearman correlation coefficient (r) was calculated.
de Boer S.A. et al.: Automated Fat Analysis
The inter- and intra-observer reproducibility and the repeatability were quantified using intraclass correlation coefficients (ICCs; based on absolute agreement). Bland-Altman plots [37] were used to evaluate the reproducibility and repeatability. The measurement error for the reproducibility and repeatability was calculated according to the formula of Bland and Altman [38]. The variation coefficients (%) were calculated as the measurement error divided by the mean of the measurements.
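These agreement statistics can be sketched in plain Python. The text specifies only "absolute agreement", so the two-way, single-measure form of the ICC used here is an assumption, as is the reading of the measurement error as the within-subject standard deviation of paired measurements. The analysis in the study was done in SPSS; the function names below are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def icc_absolute_agreement(data):
    """Two-way, single-measure ICC for absolute agreement.

    data: one list per subject, each holding k ratings
    (for example, observer 1 and observer 2).
    """
    n, k = len(data), len(data[0])
    grand = mean([v for row in data for v in row])
    row_means = [mean(row) for row in data]
    col_means = [mean([row[j] for row in data]) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in data for v in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n)


def measurement_error(pairs):
    """Within-subject SD from paired measurements: sqrt(sum(d^2) / 2n)."""
    return sqrt(sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs)))


def variation_coefficient_percent(pairs):
    """Measurement error divided by the mean of all measurements, in %."""
    overall_mean = mean([v for pair in pairs for v in pair])
    return 100.0 * measurement_error(pairs) / overall_mean


def bland_altman_limits(pairs):
    """Return (lower limit, bias, upper limit) using bias +/- 1.96 SD."""
    diffs = [a - b for a, b in pairs]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias - spread, bias, bias + spread
```

Unlike a consistency-type ICC, the absolute-agreement form is lowered by a systematic offset between observers or scans, which is why it suits a comparison in which the measured values themselves must match.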

Semi-automated Metabolic Assessment of VAT Threshold and Erosion
For every combination of threshold and erosion, the reproducibility (inter- and intra-observer) and repeatability for the A SUVmean VAT and A SUVmedian VAT were calculated. As a result of 16 thresholds and 6 sizes of erosion, each 3D plot represents 96 ICCs (Suppl. Fig. 2 in Electronic Supplementary Material (ESM)). Since A SUVmean VAT and A SUVmedian VAT were highly reproducible and repeatable for all combinations of threshold and erosion, both parameters could be used to report [18F]FDG uptake (see also Suppl. Fig. 3).
The influence of the threshold and erosion on A SUVmean VAT, A SUVmedian VAT, and the percentage of VAT volume remaining after threshold and erosion is shown in Fig. 2. See also supplemental Figs. 4 and 5 for an example of the influence of different thresholds and erosions on the remaining abdominal adipose tissue analyzed. A SUVmean VAT and A SUVmedian VAT decreased significantly for every increase in erosion (all p < 0.001) and increased significantly for every 0.1 SUV increase in threshold (all p < 0.001). According to the criteria described earlier in this article, an SUV threshold of 1.9 and an erosion of 1 pixel turned out to be the optimal setting for automated assessment of A SUVmean VAT and as such was used for further analysis. For A SUVmedian VAT, an SUV threshold of ≥ 1.5 with an erosion of 1 or at most 2 pixels turned out to be optimal (see also Fig. 2). For further analysis, A SUVmedian VAT was defined using an SUV threshold of 1.5 and an erosion of 2 pixels.
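The size of the evaluated parameter grid follows from the stated ranges: thresholds from 1.0 to 2.5 in steps of 0.1 SUV and erosions from 0 to 5 pixels. A short sketch that enumerates it (variable names are illustrative):

```python
from itertools import product

# SUV thresholds 1.0, 1.1, ..., 2.5 (16 values); rounding avoids float drift.
thresholds = [round(1.0 + 0.1 * i, 1) for i in range(16)]
# Erosion sizes 0 to 5 pixels (6 values).
erosions = list(range(6))

# Every (threshold, erosion) combination gets its own ICC.
grid = list(product(thresholds, erosions))
```

Each of the 16 × 6 = 96 combinations corresponds to one ICC in a 3D plot.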

Reproducibility and Repeatability PET/CT Data
The characteristics of the PET/CT data for observers 1 and 2 are shown in Table 1. The inter- and intra-observer reproducibility ICCs and the repeatability ICCs are shown in Table 2. The automated assessment of SUVmean and SUVmedian in VAT and SAT yielded significantly higher values than manual ROIs (both p < 0.01). The M SUVmean VAT was correlated with A SUVmean VAT (r = 0.71, p = 0.02) and A SUVmedian VAT (r = 0.79, p < 0.01). The M SUVmean SAT was correlated with A SUVmean SAT (r = 0.79, p < 0.01) and A SUVmedian SAT (r = 0.86, p < 0.01). The A SUVmean VAT correlated with A SUVmedian VAT (r = 0.94, p < 0.01) and A SUVmean SAT correlated with A SUVmedian SAT (r = 0.88, p < 0.01).
Figure 3 shows the intra-observer reproducibility for M SUVmean VAT and A SUVmean VAT and the corresponding Bland-Altman plots. For the intra-observer analysis, the mean M SUVmean VAT was 0.48, with a measurement error of 0.091 SUV and a variation coefficient of 19.2 %. The mean A SUVmean VAT was 0.73, with a measurement error of 0.004 SUV and a variation coefficient of 0.6 %. The mean A SUVmedian VAT was 0.60, with a measurement error of 0.003 SUV and a variation coefficient of 0.5 %.
Figure 4 shows the repeatability (test-retest data) for M SUVmean VAT and A SUVmean VAT and the corresponding Bland-Altman plots. The mean M SUVmean VAT was 0.55, with a measurement error of 0.069 SUV and a variation coefficient of 12.6 %. The mean A SUVmean VAT was 0.73, with a measurement error of 0.019 SUV and a variation coefficient of 2.5 %. The mean A SUVmedian VAT was 0.60, with a measurement error of 0.010 SUV and a variation coefficient of 1.7 %.

Discussion
The present study assessed the reproducibility and repeatability of the metabolic assessment of VAT and SAT using [18F]FDG PET/CT imaging with both manual and semi-automated segmentation. The automated metabolic [...] SAT was lower than the manual method and lower than the automated metabolic assessment of VAT.
The present study investigated the repeatability of [18F]FDG uptake in VAT and SAT with a semi-automated segmentation that included an SUV threshold and erosion approach, with settings optimized for analysis of VAT. As expected, the SUVmean/median in VAT increased with higher SUV thresholds and decreased with larger erosions. However, the increase in SUVmean/median became smaller with every 0.1 SUV increase in threshold. As a difference of < 0.01 SUV was not considered relevant, this was used as a criterion to determine the optimal threshold. In addition, since SUVs in VAT are almost normally distributed (Suppl. Fig. 3), the A SUVmean as well as the A SUVmedian are reliable parameters. As both parameters were highly reproducible and repeatable, we report both. Overall, SUVmean is the most common parameter used to report [18F]FDG uptake in VAT [18, 20, 23-25, 27, 29].
Another criterion for the optimal threshold and erosion was that at least 50 % of the CT-based segmented VAT should remain for SUV analysis. This was based on the assumption that not more than 50 % of the CT-based segmented VAT would be influenced by spillover effects. Based on the results of this study, we suggest that for automated assessment of the metabolic activity of VAT, the SUV threshold should optimally be 1.9 for A SUVmean or 1.5 for A SUVmedian, and the erosion should be 1 pixel and at most 2 pixels.
The method was not fully automated since two manual actions were needed: selection of the slices corresponding to vertebral levels L1 to L5 and drawing a line to close the abdominal muscular layer to separate VAT and SAT. However, these manual actions barely affected the outcomes, as VAT and SAT volume measurements were highly reproducible and repeatable (all ICC > 0.97).
VAT is recognized as an important contributor to obesity-associated CVD risk and is therefore relevant for improving CVD risk management. Clearly, VAT volume and metabolic activity are both linked to CVD risk and have become targets of imaging modalities [3,9,15,39]. However, a note of caution is due here, since the interaction of insulin resistance, which is common in obesity, with [18F]FDG uptake in VAT is not fully understood. In addition, some studies showed a decreased [18F]FDG VAT uptake in obese subjects, which suggests an inverse association with insulin resistance and CVD risk [18,20]. Nevertheless, with the availability of an automated method for VAT and SAT with established reproducibility and repeatability, this interaction can be taken into account in future studies.
Two other studies used an automated method, in which a VOI generated on CT was transferred to PET, to report [18F]FDG uptake in VAT [23,27]. Interestingly, one of these studies showed that VAT [18F]FDG uptake was associated with the degree of intestinal uptake on PET/CT [27]. Those findings confirm the need for a threshold and erosion in the automated metabolic assessment of VAT to overcome spillover effects from surrounding organs.
The A SUVmean VAT/A SUVmedian VAT was higher than the M SUVmean VAT. This result may be explained by the fact that manual ROIs were placed in areas of low [18F]FDG uptake, in an attempt to avoid spillover, and therefore potentially suffer from selection bias. Furthermore, the current study showed that automated measurements of VAT were more accurate than manually drawn ROIs, as the reproducibility ICCs, especially intra-observer, and the repeatability ICCs were much higher. A possible explanation for this might be that the uptake of [18F]FDG in VAT is not uniform. Therefore, the uptake in an ROI may not be representative of the effective mean uptake of [18F]FDG in the whole VAT region. However, it could be argued that in the current study the SUVmean is still influenced by spillover effects from other structures and may represent an overestimation of VAT activity. As the uptake of [18F]FDG in VAT is almost normally distributed, it seems likely that SUVmean is a reliable parameter. Nevertheless, for a less normal distribution SUVmedian can be used. In addition, [18F]FDG uptake in VAT measured as SUVmedian is also higher than that measured with ROIs (0.59 vs. 0.49, respectively).
It may be the case that ROIs underestimate [18F]FDG uptake in VAT. As the combined volume of all ROIs is approximately 0.1 % of the total VAT volume, it is unlikely that ROIs reliably represent the overall [18F]FDG uptake in VAT, let alone constitute an adequate method for readout of therapeutic approaches. Moreover, the variation coefficients of the automated measurement for the reproducibility between observers (0.6 %/0.5 %) and the repeatability (2.5 %/1.7 %) were far lower than those of manual ROIs (19.2 and 12.6 %, respectively).
Our study also has some limitations. First, this study included predominantly patients with a healthy BMI (< 25) and no obese patients (BMI > 30). Therefore, it is uncertain whether the automated method is equally accurate in obese subjects. Secondly, in the current study the settings were optimized for analysis of VAT. As a result, the repeatability of automated metabolic assessment of SAT was lower than that of VAT. Thirdly, the [18F]FDG uptake in VAT was not compared with levels of adipokines or macrophage infiltration. Therefore, the hypothesis that the inflammatory state measured by [18F]FDG uptake in VAT is correlated with macrophage infiltration or is positively associated with adipokine levels could not be investigated. Further studies, which take levels of adipokines and macrophage infiltration in adipocytes into account, will need to be performed. Finally, it is unclear whether the optimal SUV threshold and level of erosion found in this research are also optimal for scanners from other vendors. This requires a sensitivity analysis, which was not achievable within the scope of the current study.

Conclusion
In summary, we conclude that a (semi-)automated method is feasible and should be the preferred approach for metabolic assessment of VAT in [18F]FDG PET/CT data. Furthermore, the metabolic assessment of VAT may be useful for determining targets for treatment to reduce CV risk.

Fig. 1. Schematic overview of the most important steps of adipose tissue segmentation on CT and SUV analysis on [18F]FDG PET/LDCT scan. SAT subcutaneous adipose tissue, SUV standardized uptake values, VAT visceral adipose tissue.

Fig. 2. The influence of threshold and erosion on (a) A SUVmean VAT, (b) A SUVmedian VAT, and (c) the percentage of VAT volume for metabolic analysis. Pixel size is 1.17 × 1.17 mm².

Fig. 4. Repeatability. a M SUVmean VAT of scan 1 (test) plotted against those of scan 2 (retest) and d corresponding Bland-Altman plot. b A SUVmean VAT of scan 1 (test) plotted against those of scan 2 (retest) and e corresponding Bland-Altman plot. c A SUVmedian VAT of scan 1 (test) plotted against those of scan 2 (retest) and f corresponding Bland-Altman plot. SD standard deviation, SUV standardized uptake values, VAT visceral adipose tissue; A SUVmean = automatically generated with the method with setting SUV threshold 1.9, erosion 1; M SUVmean = manually generated by drawing regions of interest.

Table 1. PET/CT data characteristics for both observers
A SUVmedian = automatically generated with the method with setting SUV threshold 1.5, erosion 2. M SUVmean = manually generated by drawing regions of interest. L lumbar vertebral body, SAT subcutaneous adipose tissue, SUV standardized uptake values, VAT visceral adipose tissue

Table 2. Intraclass correlation coefficients of PET/LDCT data reproducibility and repeatability
Data presented as intraclass correlation coefficients and 95 % confidence interval. A SUVmean = automatically generated with the method with setting SUV threshold 1.9, erosion 1. M SUVmean = manually generated by drawing regions of interest. L lumbar vertebral body, SAT subcutaneous adipose tissue, SUV standardized uptake values, VAT visceral adipose tissue. *P value < 0.001; ‡ P value < 0.05