Introduction

Incidentally discovered adrenal masses, also called incidentalomas, are relatively frequent findings on computed tomography (CT) examinations. For their further management, it is crucial to distinguish benign adenomas from so-called non-adenomas (adrenal metastases, pheochromocytomas, and adrenal carcinomas), which require further active management [1]. A mean unenhanced attenuation with a cut-off value of 10 Hounsfield units (HU) is a simple tool that can reliably diagnose about two thirds of adenomas [2, 3]. Unfortunately, about a third of adenomas have an unenhanced attenuation higher than 10 HU. A CT histogram analysis was proposed to distinguish these, which are termed lipid-poor adenomas, from non-adenomas [4]. This analysis is based on calculating the proportion of pixels inside an adrenal mass with a negative CT number, i.e., below 0 HU.

So far, there has been no consensus on the optimization of CT scanning and post-processing parameters. One of the variable parameters is slice thickness: a thinner slice provides better detail and spatial resolution, but it also increases the noise in the CT image [5]. The exact relationship between slice thickness and the ability to detect adipose content inside adrenal tissue has so far only been investigated in a phantom study using animal tissues [6].

The aim of this work was to determine the optimal slice thickness of a CT image for CT histogram analysis and to verify the optimum threshold of the percentage of negative voxels in the CT histogram, so that adenomas and non-adenomas can be distinguished with the best accuracy in a study group of indeterminate adrenal lesions with a mean unenhanced attenuation higher than 10 HU.

Methods

Study design

CT data from the period 2013–2018 were analyzed retrospectively. The inclusion criteria were 1) an indeterminate adrenal mass with a mean unenhanced attenuation > 10 HU, 2) an available unenhanced phase reconstructed in three different slice thicknesses (5 mm, 2.5 mm, and 1.25 mm) or available raw data allowing such reconstructions, and 3) a final diagnosis based on histology or follow-up. Lipid-rich adenomas with mean attenuation less than 10 HU and examinations without reconstructions with different slice thicknesses were not considered. The ethics committee approved the study; informed consent was waived.

Study group

In total, we discovered 195 indeterminate adrenal masses meeting the inclusion criteria in 172 subjects in the given time period. Out of these, 32 masses in 31 subjects were excluded because follow-up imaging examination or histology was missing and thus it was impossible to determine the diagnostic conclusion (Fig. 1).

Fig. 1 A flowchart displaying creation of the study group and means of final diagnosis

Retrospective analysis was performed on 163 masses in 141 subjects. The study group included 68 males and 73 females; the age range was 18–89 years and the mean age was 61.9 years. The adrenal masses were located on the right side in 45 cases and on the left side in 74 cases and were bilateral in 22 cases.

The study group included 83 lipid-poor adenomas, out of which 12 were confirmed by histology and 71 by stable size during a follow-up after at least 6 months. The remaining adrenal masses were non-adenomas (21 pheochromocytomas, 55 metastases and four adrenocortical carcinomas), out of which 36 were confirmed by histology (11 metastases and all the pheochromocytomas and carcinomas), and 44 metastases were confirmed by the rapid growth of the adrenal mass in subjects with known malignancy.

CT protocols

All CT examinations were performed on a multi-detector CT scanner (LightSpeed RT16, LightSpeed VCT, or Discovery CT750HD; GE Healthcare, Milwaukee, WI, USA). Various parameters were applied using the usual routine settings for the imaging of thoracic and abdominal organs. To provide technical details, we selected the four CT protocols that were most frequent in our study group: the abdominal CT protocol (tube voltage 120 kV, rotation 0.75 s, pitch 1.375:1, tube current modulation with noise index 16.66, estimated volume CT dose index (CTDIvol) 9.53 mGy, image reconstructed with 50% filtered back projection (FBP) and 50% adaptive statistical iterative reconstruction (ASIR)); the thoracic CT protocol (tube voltage 120 kV, rotation 0.55 s, pitch 1.375:1, tube current modulation with noise index 15.38, estimated CTDIvol 10.48 mGy, image reconstructed with 60% FBP and 40% ASIR); the urolithiasis protocol (tube voltage 120 kV, rotation 0.8 s, pitch 0.984:1, tube current modulation with noise index 19.51, estimated CTDIvol 6.33 mGy, image reconstructed with 60% FBP and 40% ASIR); and the adrenal protocol (tube voltage 120 kV, rotation 0.8 s, pitch 1.375:1, tube current modulation with noise index 14.23, estimated CTDIvol 14.40 mGy, image reconstructed with 70% FBP and 30% ASIR). Axial reconstructions were already available or were created using 5-mm, 2.5-mm, and 1.25-mm slice thicknesses and the “Soft” reconstruction kernel. The field of view (FOV) was adjusted individually to cover the patient’s body and the matrix was 512 × 512.

CT image analysis

The CT images were loaded into an Advantage Windows workstation supplied by the CT vendor. The evaluation included the measurement of three dimensions of each lesion and hand-traced, slice-by-slice extraction of the volume of interest (VOI) inside a lesion, covering as large a part of the soft tissue volume as possible while avoiding the edges and any necrotic parts of the lesion. The mean attenuation and standard deviation (SD) of the HU values were recorded, as was the fraction (%) of voxels with attenuation < 0 HU in a histogram using the “Volume Histogram” tool (Fig. 2). Splenic tissue is often used as a reference for adrenal tissue [7, 8]; therefore, a VOI similar to that of the adrenal mass was drawn inside a central part of the spleen to obtain the mean attenuation, SD and percentage of negative voxels as a reference. These measurements were performed in all lesions and in the spleen using three different slice thickness values (5 mm, 2.5 mm, and 1.25 mm). The analysis was performed by PK (1 year of experience) and PS (5 years of experience); discrepancies were resolved by FČ (15 years of experience). All three radiologists were blinded to the final diagnosis.
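The quantities recorded for each VOI can be illustrated with a minimal sketch. This is not the vendor's “Volume Histogram” tool; the function name `histogram_summary`, the use of the population SD, and the ten HU values are assumptions made purely for illustration.

```python
from statistics import mean, pstdev


def histogram_summary(hu_values, threshold_hu=0.0):
    """Return mean HU, SD and percentage of voxels below threshold_hu for one VOI."""
    if not hu_values:
        raise ValueError("VOI contains no voxels")
    # Count voxels with a negative CT number (strictly below the threshold).
    negative = sum(1 for hu in hu_values if hu < threshold_hu)
    return {
        "mean_hu": mean(hu_values),
        # Population SD is an assumption; the workstation's definition is not stated.
        "sd_hu": pstdev(hu_values),
        "negative_pct": 100.0 * negative / len(hu_values),
    }


# Hypothetical VOI of ten voxels (values invented for illustration only).
voi = [25, 18, -4, 30, 12, -9, 41, 22, 7, 15]
summary = histogram_summary(voi)
print(summary)
```

In this invented VOI the mean attenuation is above 10 HU, yet two of the ten voxels are negative, which is the situation that histogram analysis is meant to capture.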

Fig. 2 a Methodology of manual extraction of an adrenal mass; b evaluation of the histogram curve

Statistical analysis

Shapiro–Wilk tests of normality revealed a non-normal distribution of the data. The data were expressed as median, quartiles, and minimum and maximum values (mean and standard deviation). The differences between adenomas and non-adenomas were analyzed by the Mann–Whitney U test. The comparison between three different slice thickness values was performed using the Friedman test and post hoc tests with Bonferroni correction. Correlations were sought using Spearman’s correlation analysis. A value of p < 0.05 was adopted as the level of statistical significance. All statistical analyses were conducted with IBM SPSS Statistics for Windows, Version 23.0 (IBM Corp., Armonk, NY, USA).
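For readers unfamiliar with rank correlation, the following is a minimal sketch of Spearman's rho computed as the Pearson correlation of average ranks. It is not the SPSS implementation used in the study. The helper names and the sample data are hypothetical, and the cut-offs of 0.4 and 0.6 in `describe_strength` are an assumption: they match the way this paper labels its coefficients but are not stated in the text.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0  # mean of the 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


def describe_strength(rho):
    """Verbal label for |rho|; the 0.4 and 0.6 cut-offs are assumed, not sourced."""
    magnitude = abs(rho)
    if magnitude < 0.4:
        return "weak"
    if magnitude < 0.6:
        return "moderate"
    return "strong"


# Hypothetical paired measurements: SD of HU values and % of negative voxels.
sd_values = [10, 20, 30, 40, 50]
negative_pct = [1, 3, 2, 5, 4]
rho = spearman_rho(sd_values, negative_pct)
print(rho, describe_strength(rho))
```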

Using three different slice thickness options and applying three different thresholds for the percentage of negative voxels (10%, 5% and 1%), we obtained nine combinations for which we calculated the sensitivity (Sens), specificity (Spec), and positive predictive value (PPV) for the diagnosis of an adenoma.

Results

The mean unenhanced attenuation was 22.12 HU in lipid-poor adenomas and 31.62 HU in non-adenomas; this difference was statistically significant (p < 0.001) but there was a large overlap of the mean attenuation between both groups (Fig. 3).

Fig. 3 A boxplot comparing mean attenuation values in adenomas and non-adenomas. Despite a statistically significant difference between the two groups, the boxplot reveals a large overlap of mean attenuation values

The comparison of the mean attenuation between different slice thicknesses revealed no significant change in adenomas (p = 0.075), non-adenomas (p = 0.661) or splenic tissue (p = 0.610). On the other hand, a statistically significant increase in the percentage of negative voxels and SD was observed with a thinner slice thickness in adenomas, non-adenomas and splenic tissue (p < 0.001 in all groups). Results comparing the mean attenuation, percentage of negative voxels, and SD in slices of different thicknesses are summarized in Table 1 and graphically displayed in Fig. 4.

Table 1 Summary of differences in mean attenuation, standard deviation of HU values, and percentage of negative voxels between three different options of slice thickness

Fig. 4 A bar graph summarizing differences in mean attenuation, standard deviation of HU values, and percentage of negative voxels between three different options of slice thickness. Whereas mean attenuation shows no difference, SD and percentage of negative voxels reveal a significant increase with thinner slice thickness in all three groups

Spearman’s rho coefficients comparing the percentage of negative voxels and SD of HU values as a measure of CT image noise are summarized in Table 2a. We observed a weak correlation (r = 0.371) in adenomas with a 5-mm slice thickness, but the strength increased to moderate with 2.5-mm and 1.25-mm slice thicknesses (r = 0.408 and r = 0.474, respectively). In non-adenomas, the correlation of the percentage of negative voxels and SD was moderate with a 5-mm slice (r = 0.565), but increased to strong with 2.5-mm and 1.25-mm slice thicknesses (r = 0.694 and r = 0.713, respectively).

Table 2 Results of correlation analysis between percentage of negative voxels in adrenal lesions and (a) SD as a measure of image noise and (b) percentage of negative voxels in referential splenic tissue

Spearman’s rho coefficients comparing the percentage of negative voxels in an adrenal lesion and the percentage of negative voxels in splenic tissue as the reference tissue are summarized in Table 2b. The correlation of adenomas with splenic tissue was weak with a 5-mm slice (r = 0.295), but the strength increased to moderate with 2.5-mm and 1.25-mm slice thicknesses (r = 0.378 and r = 0.415, respectively). In non-adenomas, the correlation of adrenal tissue and splenic tissue was also weak with a 5-mm slice (r = 0.261), but increased to strong with 2.5-mm and 1.25-mm slice thicknesses (r = 0.646 and r = 0.669, respectively).

Our results in Table 3 also indicate that a thinner slice thickness leads to higher image noise and thus to an increased number of measurements with SD > 30 in adenomas, non-adenomas and also splenic tissue.

Table 3 Number of adenomas, non-adenomas, and splenic samples exceeding SD 30 as unacceptable value of CT image noise

The diagnostic accuracy of CT histogram analysis using different combinations of slice thicknesses and thresholds of the percentage of negative voxels is summarized in Table 4. The combination of a 5-mm slice thickness and a 10% threshold achieved the best Spec (98.8%) and PPV (97.8%) because only one non-adenoma (a metastasis of a renal cell carcinoma) exceeded the 10% threshold and reached 17.4% of negative voxels. Thinner slices and lower thresholds achieved higher Sens (up to 100%), but Spec and PPV dropped substantially, down to 3.7% and 51.3%, respectively, because of the increasing number of non-adenomas exceeding the threshold and thus the increasing number of false positive findings.

Table 4 Diagnostic test performance data using different slice thickness reconstructions and cut-off values of negative voxels

Discussion

The increasing availability of imaging methods increases the likelihood of finding adrenal masses incidentally. Such incidental lesions (i.e., incidentalomas) are found in approximately 5–10% of abdominal CT scans [9] and represent an emerging clinical problem because of the increasing technical quality of imaging and the increasing use of imaging in clinical practice. It is always necessary to rule out secretory hormonal activity and the malignant nature of each incidentaloma. Even the initial imaging examination that discovers an incidentaloma may already reveal morphological features that not only indicate whether it is a malignant process but can also directly raise suspicion of a particular type of adrenal tumor [2]. The decision is especially important in patients with already proven extra-adrenal malignancy. In such cases, it is crucial to distinguish an adenoma from a metastasis because the therapeutic approach is entirely different and the prognosis differs substantially. CT is the method that finds incidentalomas most frequently. It is also an important tool for determining the biological nature of lesions and for the morphological differentiation of the most frequent adrenal pathologies. A generally accepted parameter for distinguishing benign from malignant lesions is the mean unenhanced attenuation measured in HU. The unenhanced attenuation of adenomas is often low because of the higher content of lipids in the cytoplasm, whereas the density of metastases, adrenocortical carcinomas, and pheochromocytomas is higher since lipids rarely occur in these lesions. Adrenal lesions can be roughly divided into two groups according to unenhanced density values: adenomas and other lesions, i.e., non-adenomas. An unenhanced density of 10 HU is generally considered to be the crucial cut-off value between adenomas and non-adenomas [2, 3].
In clinical practice, the simplicity of this measure is a great advantage, as is the fact that the measurement can be performed even when CT was not primarily aimed at the adrenal glands. Approximately 30% of adenomas, however, have mean attenuation higher than 10 HU [2, 3]. Such adenomas are called “lipid-poor” and represent a significant clinical problem, because it is impossible to reliably distinguish lipid-poor adenomas from other adrenal masses by means of unenhanced attenuation.

A CT histogram analysis method was proposed by Bae et al. to help identify lipid-poor adrenal adenomas [4]. The principle of the analysis is to quantify the number of pixels with attenuation of less than 0 HU in a specified region of interest (ROI) on a CT image. An unenhanced CT scan is most suitable for the analysis, and the analysis can be performed retrospectively, which is important in lesions that are discovered incidentally. A cut-off value of 10% or more negative pixels to reliably identify an adrenal adenoma was proposed by Bae et al. and generally accepted by other researchers [4, 7, 8, 10, 11, 12, 13]. It is important to note that the lipid content is not the only factor influencing the percentage of negative pixels/voxels. Increasing noise in the CT image spreads the CT values toward both extremes, and thus more voxels will fall below 0 HU just because of this phenomenon [5]. Noise in the CT image can be quantified by calculating the SD of HU values inside an ROI/VOI [5]. Noise in the CT image is further influenced by the number of photons absorbed and by the voxel size, which is determined by three parameters: slice thickness, matrix and FOV [5]. The standard matrix is 512 × 512, and FOV is adjusted to the patient’s body size in clinical practice. In contrast, a wide range of slice thicknesses is used to reconstruct CT images in clinical practice; a thinner slice means a smaller voxel size and a higher level of noise [5, 6]. On the other hand, a smaller voxel size means better image detail and reduced influence of the partial volume effect. Despite the obvious fact that the slice thickness directly influences the percentage of negative pixels/voxels, most published papers regarding histogram analysis mention this topic only superficially, if at all. The relationship between noise and the result of histogram analysis is discussed only theoretically or vaguely.
Previous papers reporting the results of histogram analysis in adrenal masses used slice thicknesses in a very wide range, from 2 to 8 mm, and it is unclear how these differences could have influenced the results of histogram analysis [4, 7, 8, 10, 11, 12, 13]. Only one phantom study, which used animal tissues to imitate adrenal masses and was published as a meeting abstract, reports the relationship between noise and slice thickness [6]. Therefore, the aim of our study was to evaluate how the slice thickness influences histogram analysis in confirmed adrenal tumors and whether there is a compromise between slice thickness, image detail and the noise of the CT image that provides the best diagnostic accuracy. We had axial CT images reconstructed with three different slice thickness values (5, 2.5 and 1.25 mm). Splenic tissue was used as a reference because it theoretically contains no lipid fraction and is routinely used for the evaluation of adenomas and chemical shift calculations in magnetic resonance images [7, 8]. Mean unenhanced attenuation of adrenal masses and splenic tissue revealed no significant change with different slice thicknesses; this result confirms the findings of the previous phantom study using animal tissues [6]. Conversely, the percentage of negative voxels and SD increased significantly with thinner slices in adenomas, non-adenomas and splenic tissue, which also corresponded to the phantom study [6]. Both adenomas and non-adenomas increased their SD and percentage of negative voxels in a similar proportion to splenic tissue, in which case the increase is undoubtedly caused by an increase in the image noise.
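The way noise alone can push voxels below 0 HU can be illustrated with a simplified model. The sketch below assumes homogeneous, lipid-free tissue whose HU values are normally distributed, and noise SD that scales with the inverse square root of slice thickness. Both are textbook idealizations rather than findings of this study, and iterative reconstruction such as ASIR will deviate from the scaling rule. The function names and the starting values (mean 30 HU, SD 12 HU at 5 mm) are hypothetical.

```python
import math


def expected_negative_pct(mean_hu, sd_hu):
    """Percentage of voxels expected below 0 HU for normally distributed HU values."""
    if sd_hu <= 0:
        raise ValueError("sd_hu must be positive")
    z = (0.0 - mean_hu) / sd_hu
    # Standard normal cumulative distribution function evaluated at z.
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def scaled_noise_sd(sd_ref, thickness_ref_mm, thickness_mm):
    """Noise SD at another slice thickness, assuming SD ~ 1 / sqrt(thickness)."""
    if thickness_ref_mm <= 0 or thickness_mm <= 0:
        raise ValueError("slice thickness must be positive")
    return sd_ref * math.sqrt(thickness_ref_mm / thickness_mm)


# Hypothetical lipid-free tissue: mean 30 HU, noise SD 12 HU on a 5-mm slice.
for thickness in (5.0, 2.5, 1.25):
    sd = scaled_noise_sd(12.0, 5.0, thickness)
    pct = expected_negative_pct(30.0, sd)
    print(f"{thickness} mm: SD {sd:.1f} HU, negative voxels {pct:.2f}%")
```

Under these assumptions the expected share of negative voxels rises from roughly 0.6% at 5 mm to roughly 10.6% at 1.25 mm, without any lipid being present and without any change in mean attenuation.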

If we compare the results of the correlation of the percentage of negative voxels and image noise expressed as SD, we can see only a weak correlation in adenomas using a 5-mm slice thickness, but with 2.5-mm and 1.25-mm slice thicknesses the strength of the correlation gradually increases. In non-adenomas, the correlation is stronger and also increases with thinner slices (Table 2a). The correlation of the percentage of negative voxels in adrenal tumors and splenic tissue revealed similar results (Table 2b). Splenic tissue includes no lipid particles, so any negative voxels must be attributed to image noise. With a 5-mm slice thickness, the correlation was weak both in adenomas and non-adenomas; however, while the correlation in adenomas gradually strengthened to moderate with 2.5-mm and 1.25-mm slices, the correlation in non-adenomas increased to strong with thinner slices. We attribute the weaker correlation between non-adenomas and spleen on a 5-mm slice to the “floor effect” of the 0 HU threshold for negative voxels. Summarizing the results of the correlation analyses, we can conclude that the increase in the percentage of negative voxels in adrenal tumors with thinner slices is caused by increasing CT image noise and is not the result of lipid content detected thanks to better image detail and spatial resolution. An increasing number of false positive results with increasing image noise was also reported by Lin et al. [12]; according to their conclusions, a CT image with noise exceeding an SD of 30 is inappropriate for histogram analysis. In our study, this threshold was exceeded by 12 adrenal masses with a 2.5-mm slice and by 68 masses with a 1.25-mm slice. There was also an increasing number of spleen measurements exceeding the SD threshold of 30 (Table 3). VOIs exceeding an SD of 30 also exceeded the threshold of the percentage of negative pixels in adenomas, non-adenomas and splenic tissue.
Therefore, our results confirm the observation of Lin et al. [12]. Interestingly, Lin et al. used slice thicknesses ranging from 2 to 5 mm; the lower value is the thinnest slice thickness among all the papers referenced in our work, so there was theoretically a higher probability of higher image noise resulting from the use of thinner slices. However, it should be stressed again that CT image noise is not influenced by slice thickness and voxel size alone. CT image noise also depends strongly on the tube current applied during data acquisition; it is theoretically possible to increase the number of photons (and thus the radiation dose) to obtain similar image quality with, e.g., a 1-mm slice as with a 5-mm slice acquired with a lower tube current. Conversely, a low-dose CT protocol with a very low tube current could be too noisy and inappropriate for histogram analysis even if reconstructed with a 5-mm slice thickness. On the other hand, CT image noise can be reduced by applying the iterative reconstruction algorithms available in contemporary CT scanners [14, 15]. There are also differences in CT image noise between reconstruction kernels; e.g., lung and bone kernels provide CT images with increased noise compared to a soft tissue kernel [6]. Our main aim was to evaluate whether a radiologist could benefit from CT images reconstructed with a certain slice thickness when using CT histogram analysis to identify an adenoma discovered on an already performed examination. We therefore simplified the situation to slice thickness only, but all the circumstances listed above should be kept in mind when performing a CT histogram analysis.

Another goal of our study was to confirm whether the cut-off value of 10% of negative voxels proposed by Bae et al. [4] is optimal or whether there is a more appropriate value. Therefore, we calculated the sensitivity, specificity and PPV for the cut-off values of 10, 5, and 1% negative voxels combined with three different slice thickness values, resulting in nine different combinations. Table 4 shows that the sensitivity of the diagnosis of an adenoma increases with a thinner slice thickness to as much as 100%; however, the specificity decreases to as little as 3.7%. In our opinion, for the needs of clinical practice, the combination with the highest specificity and PPV is the most appropriate, as it is necessary to establish with the highest degree of certainty that an incidentaloma exceeding a threshold of negative voxels is an adenoma, which does not require prompt management, unlike a pheochromocytoma, carcinoma or metastasis. In other words, it is a lesser mistake to investigate a lipid-poor adenoma with a false negative result of histogram analysis intensively than to leave a non-adenoma with a false positive result without confirmation and timely adequate treatment. On the basis of this rationale, we propose a 5-mm slice thickness and a threshold of 10% of negative voxels as the best combination for the performance of the histogram analysis. Using this combination, 44 lipid-poor adenomas out of 83 were correctly identified and only one non-adenoma out of 80 was incorrectly identified as an adenoma. A positive result under these conditions identifies an incidentaloma as an adenoma with a high degree of confidence and can prevent a patient from undergoing further adrenal imaging, such as MRI, CT wash-out study or PET/CT. It is worth mentioning that the only non-adenoma that exceeded the threshold of 10% negative voxels using a 5-mm slice thickness was a metastasis of a renal carcinoma.
A paper reporting a case of renal carcinoma metastasis that met the imaging criteria for an adenoma has already been published [16].
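The performance figures for the proposed combination can be reproduced from the counts given above. The sketch below is only an arithmetic check; the function name `diagnostic_metrics` is hypothetical, and the false negative and true negative counts are derived here by subtraction (83 − 44 = 39 and 80 − 1 = 79) rather than quoted from Table 4.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity and PPV, with adenoma as the positive class."""
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0:
        raise ValueError("a required denominator is zero")
    return {
        "sensitivity": tp / (tp + fn),  # adenomas exceeding the threshold
        "specificity": tn / (tn + fp),  # non-adenomas staying below it
        "ppv": tp / (tp + fp),          # positives that are truly adenomas
    }


# 5-mm slice, 10% threshold: 44 of 83 adenomas and 1 of 80 non-adenomas positive.
metrics = diagnostic_metrics(tp=44, fn=83 - 44, fp=1, tn=80 - 1)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

The resulting specificity (79/80) and PPV (44/45) agree with the values of 98.8% and 97.8% reported in the Results, while the sensitivity works out to about 53%.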

Limitations

Our work has several limitations. One of them is the retrospective nature of the study; in such a setting, lesions were examined with a wide range of CT protocols dedicated to the assessment of chest and abdominal organs. On the other hand, a prospective study of incidentally found lesions is basically impossible, and our study thus reflects the daily clinical routine. Another limitation could be the relatively low number of histologically verified lesions in our study group, which may lead to uncertainty about lesions in which the diagnosis was determined only by follow-up. However, it should be noted that the ratio of histologically verified lesions is higher than that found in other published works [4, 7, 10, 11] and there is only one paper including histological verification of all lesions [13]. Further, the PPV values in Table 4 should be understood only as values allowing relative comparison among different combinations of slice thickness and threshold within our study group. PPV depends on the prevalence of the given pathology, and the prevalence of lipid-poor adenomas was artificially increased in our study group because of the inclusion and exclusion criteria; our PPV values, therefore, are also artificially increased and cannot be applied to a general population.
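The dependence of PPV on prevalence follows from Bayes' theorem and can be sketched as below. The sensitivity and specificity are those implied by the counts for the 5-mm, 10% combination (44/83 and 79/80). The function name and the alternative prevalence of 20% are hypothetical and chosen only to show the direction of the effect, not to estimate PPV in any real population.

```python
def ppv_at_prevalence(sensitivity, specificity, prevalence):
    """PPV from Bayes' theorem for a given prevalence of the positive class."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


sens, spec = 44 / 83, 79 / 80

# Prevalence of lipid-poor adenomas in the study group: 83 of 163 masses.
ppv_study = ppv_at_prevalence(sens, spec, 83 / 163)

# Hypothetical lower prevalence, for illustration only.
ppv_lower = ppv_at_prevalence(sens, spec, 0.20)

print(f"study group: {100 * ppv_study:.1f}%, at 20% prevalence: {100 * ppv_lower:.1f}%")
```

With the study group's own prevalence the formula returns the same PPV as the direct count (44/45); at the lower hypothetical prevalence the PPV falls even though sensitivity and specificity are unchanged.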

Conclusion

Slice thickness significantly affects the results of histogram analysis. Thin slice reconstructions are not appropriate for histogram analysis because of increased noise in the CT image, resulting in a high rate of false positive results in non-adenomas. In clinical practice, the combination of a 5-mm slice thickness and a threshold of 10% of negative voxels seems to be most appropriate.