Abstract
Objectives
Quantitative CT imaging is an important emphysema biomarker, especially in smoking cohorts, but does not always correlate with radiologists’ visual CT assessments. The objectives were to develop a neural network-based slice-wise whole-lung emphysema score (SWES) for chest CT, to validate SWES on unseen CT data, and to compare SWES with a conventional quantitative CT method.
Materials and methods
Separate cohorts were used for algorithm development and validation. For validation, thin-slice CT stacks from 474 participants in the prospective cross-sectional Swedish CArdioPulmonary bioImage Study (SCAPIS) were included, 395 randomly selected and 79 from an emphysema cohort. Spirometry (FEV1/FVC) and radiologists’ visual emphysema scores (sum-visual) obtained at inclusion in SCAPIS were used as reference tests. SWES was compared with a commercially available quantitative emphysema scoring method (LAV950) using Pearson’s correlation coefficients and receiver operating characteristics (ROC) analysis.
Results
SWES correlated more strongly with the visual scores than LAV950 (r = 0.78 vs. r = 0.41, p < 0.001). The area under the ROC curve for the prediction of airway obstruction was larger for SWES than for LAV950 (0.76 vs. 0.61, p = 0.007). SWES correlated more strongly with FEV1/FVC than either LAV950 or sum-visual in the full cohort (r = − 0.69 vs. r = − 0.49/r = − 0.64, p < 0.001/p = 0.007), in the emphysema cohort (r = − 0.77 vs. r = − 0.69/r = − 0.65, p = 0.03/p = 0.002), and in the random sample (r = − 0.39 vs. r = − 0.26/r = − 0.25, p = 0.001/p = 0.007).
Conclusion
The slice-wise whole-lung emphysema score (SWES) correlates better than LAV950 with radiologists’ visual emphysema scores and correlates better with airway obstruction than do LAV950 and radiologists’ visual scores.
Clinical relevance statement
The slice-wise whole-lung emphysema score provides quantitative emphysema information for CT imaging that avoids the disadvantages of threshold-based scores and is correlated more strongly with reference tests than LAV950 and reader visual scores.
Key Points
• A slice-wise whole-lung emphysema score (SWES) was developed to quantify emphysema in chest CT images.
• SWES identified visual emphysema and spirometric airflow limitation significantly better than a threshold-based score (LAV950).
• SWES improved emphysema quantification in CT images, which is especially useful in large-scale research.
Introduction
Chronic obstructive pulmonary disease (COPD) is the third leading cause of death worldwide [1]. The main symptoms are shortness of breath, exertional dyspnea, and cough. COPD is mainly caused by cigarette smoking that induces a variable combination of bronchiolitis and emphysema resulting in chronic airflow limitation. Chronic bronchitis, i.e., cough with phlegm, may also be present.
Emphysema is characterized by destruction of alveolar walls with impaired gas exchange and hyperinflation, visible on chest CT images as low-density regions [2]. COPD is irreversible and should be diagnosed as early as possible to prevent progression. The diagnosis is confirmed by spirometry (pulmonary function testing, PFT) revealing chronic airway obstruction that cannot be normalized with bronchodilators or other therapy [3].
Computed tomography is the modality of choice for emphysema visualization. The extent of emphysema can be estimated by a radiologist, but quantitative image information is desirable, for example, as outcome in clinical trials [4, 5]. Quantitative and visual CT image scores add information to PFT and help predict morbidity and mortality in COPD, independent of PFT results [6,7,8,9,10].
The Swedish CArdioPulmonary bioImage Study (SCAPIS) is a multi-center, cross-sectional study including chest CT scans and spirometry for over 30,000 individuals. The aim of SCAPIS is to predict and prevent cardiovascular disease and COPD [11]. Inter-observer variation in CT evaluation is a well-known challenge in clinical trials [12]. For multi-center trials such as SCAPIS, an unbiased analytical tool for core-lab use can reduce inter-observer variation.
The fraction of low-attenuation pixels in the lungs, i.e., low-attenuation volume below − 950 HU (LAV950), is a frequently used quantitative CT emphysema metric in COPD and smoking cohorts [6, 7, 13,14,15,16]. However, LAV950 correlates weakly with visual emphysema scores in cross-sectional cohorts [17]. Furthermore, even in prediction models that include LAV950, the visual emphysema score remains a significant predictor, suggesting that LAV950 captures only part of the CT image information [8].
In this study, we aim to create a refined reader-independent quantitative emphysema metric for CT images that measures what radiologists identify as emphysema. Using machine learning and detailed radiologist emphysema annotations, we introduce a slice-wise whole-lung CT emphysema score (SWES) to combine the predictive value of visual scores with the objective assessment of quantitative CT.
The purposes of the current study were (a) to develop a machine learning SWES method for lung CT; (b) to externally validate the method against radiologists’ emphysema scoring on unseen CT data with the test hypothesis of better correlation with radiologists’ scores for SWES compared to LAV950; and (c) to compare the correlation with PFT between the SWES method, a commercial LAV950 application, and visual emphysema scoring.
Materials and methods
The Swedish Ethical Review Authority approved the study protocol. Written informed consent was obtained at inclusion. SCAPIS prospectively included 30,154 participants at six Swedish university hospitals including Gothenburg from 2013 to 2018. A pilot study, SCAPIS Pilot, recruited 1111 participants in 2012 [17]. In this study, the development data originated from SCAPIS Pilot, while the external validation was performed on data from the main SCAPIS Gothenburg cohort. The inclusion process is described in Fig. 1.
Fig. 1 Inclusion flowchart. a Development cohort. Only six of the matched controls were selected to avoid worsening the already skewed distribution of regional emphysema scores. b The total validation cohort (n = 474) consisted of a random (n = 395) and an emphysema (n = 79) cohort. LAV950, low-attenuation volume below − 950 HU; PFT, pulmonary function testing
In the present work, clinical terminology is used, defining external validation as the application of the machine learning method on the unseen CT data. The machine learning term test set is used only for the limited test set used during algorithm development.
Image data
The image data were thin-slice unenhanced CT image stacks capturing the lungs at full inspiration, approximately 500 slices per examination. All images were acquired on a Siemens Somatom Definition Flash CT system. Acquisition parameters in development and validation cohorts were identical: 120 kVp, CARE dose quality reference mAs 25, pitch 0.9, rotation time 0.5 s, reference patient CTDI 1.7 mGy.
In the development cohort (Fig. 1a), the 102 CT stacks were reconstructed as 0.6-mm/0.6-mm (slice thickness/increment) slices. The reconstruction filters were a medium smooth soft tissue filtered back projection algorithm, B31f (n = 42), or medium smooth soft tissue iterative reconstructions (SAFIRE) strength 2/5 (I31f2, n = 12) or strength 3/5 (I31f3, n = 48).
In the validation cohort (Fig. 1b), all images were reconstructed as 0.75-mm/0.6-mm (slice thickness/increment) slices using a smooth soft tissue filtered back projection algorithm, B20f.
Algorithm development
Detailed slice-wise annotations
The machine learning emphysema prediction model was developed using only thin-slice data from the SCAPIS Pilot cohort.
Detailed slice-wise emphysema annotations were acquired in a three-step process including multiple readers in the first (26 readers) and third steps (16 readers) (see supplemental Fig S1–S3). In the first step, each centimeter of each lung was classified according to a 4-degree emphysema scale as previously described [18]. The second step consisted of median z-direction filtering of the 4-degree annotations and the third step was a refinement algorithm, increasing the granularity of the emphysema labels from a 4-degree scale to 10 degrees (see Appendix A for details).
Machine learning slice-wise emphysema scoring (SWES) method
The 10-degree annotations in the development cohort were used in developing a machine learning method based on a convolutional neural network (CNN) and a linear regressor. The method processed a chest CT scan, with separate outputs from each slice and lung.
For each patient, the SWES was constructed as the average emphysema score for each lung and slice, weighted by the segmented lung area (see Fig. 2).
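The area-weighted aggregation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `swes` and the flat list layout (one entry per slice and lung) are assumptions made for the example.

```python
def swes(slice_scores, lung_areas):
    """Area-weighted mean of per-slice, per-lung emphysema scores.

    slice_scores: predicted emphysema score for each (slice, lung) pair.
    lung_areas:   segmented lung area for the same (slice, lung) pairs.
    Both names and the flat layout are hypothetical.
    """
    if len(slice_scores) != len(lung_areas):
        raise ValueError("scores and areas must have the same length")
    total_area = sum(lung_areas)
    if total_area <= 0:
        raise ValueError("total segmented lung area must be positive")
    weighted_sum = sum(s * a for s, a in zip(slice_scores, lung_areas))
    return weighted_sum / total_area
```

With this weighting, slices near the apex or base, where little lung is visible, contribute less to the patient score than slices through the mid-lung.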
The intra-scan repeatability of SWES was analyzed on a subset of 63 CT stacks that were randomly rotated up to ± 7.5° in the x–y-, x–z-, and y–z-planes simultaneously. The Bland–Altman limits of agreement between SWES computed in the original and in the rotated image volume were computed.
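For readers unfamiliar with Bland–Altman analysis, the sketch below computes the conventional limits of agreement (mean difference ± 1.96 standard deviations of the paired differences). The function name is hypothetical, and the use of the sample standard deviation with a 1.96 multiplier is the textbook convention, assumed here rather than taken from the article.

```python
import statistics


def bland_altman_limits(original, rotated):
    """Return (bias, lower limit, upper limit) for paired measurements.

    original, rotated: scores for the same scans, e.g. SWES before and
    after rotation (hypothetical argument names).
    """
    diffs = [a - b for a, b in zip(original, rotated)]
    bias = statistics.mean(diffs)
    # Sample standard deviation of the paired differences.
    half_width = 1.96 * statistics.stdev(diffs)
    return bias, bias - half_width, bias + half_width
```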
Neural network and CT scan pre-processing
The neural network was trained on axial CT images with pre-processing intended to focus the training on the lung-texture features of emphysema and to mitigate the data imbalance caused by the few examples of advanced emphysema in the training cohort (Supplemental Figure S4). For prediction, only segmentation, contrast windowing, cropping, and resizing were used. Pre-processing details are given in Appendix B.
A ResNet-18 architecture was adopted, with the final linear layer consisting of a single neuron predicting the emphysema score for the slice [19]. Of the 102 CT stacks in the development cohort, 82 were used for training/validation and 20 were held out for testing. The binned accuracy was computed on the test data, where the outcome measure was the fraction of predictions within ± 1.5 of the slice emphysema annotation.
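The binned accuracy defined above can be illustrated with a few lines of code. The function name and the inclusive treatment of the ± 1.5 boundary are assumptions for this sketch.

```python
def binned_accuracy(predictions, annotations, tolerance=1.5):
    """Fraction of predictions within +/- tolerance of the annotation.

    A prediction exactly `tolerance` away counts as a hit; whether the
    boundary is inclusive in the original analysis is not stated.
    """
    if len(predictions) != len(annotations) or not annotations:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1 for p, y in zip(predictions, annotations) if abs(p - y) <= tolerance
    )
    return hits / len(annotations)
```

Because the labels lie on a 10-degree scale, a tolerance of 1.5 accepts predictions that land in the annotated degree or the one adjacent to it.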
The regression network was optimized on the training set with oversampling to partially compensate for the label imbalance (see supplemental Table S1). Optimization was performed using back-propagation and the Adam algorithm with default parameters. The loss function was the sum of the mean-square-error and the mean-absolute-error, weighted by the inverse of the label proportion in the oversampled training dataset to balance rare labels.
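One possible reading of this loss is sketched below in plain Python. It assumes that the same inverse-proportion weight multiplies both the squared and the absolute error of each sample and that the result is averaged over the batch; the article does not spell out these details, and the function names are hypothetical.

```python
from collections import Counter


def inverse_proportion_weights(labels):
    """Map each label to 1 / (its proportion in the dataset)."""
    counts = Counter(labels)
    n = len(labels)
    return {label: n / count for label, count in counts.items()}


def weighted_mse_mae_loss(predictions, labels, weights):
    """Mean over samples of weight * (squared error + absolute error)."""
    total = 0.0
    for p, y in zip(predictions, labels):
        err = p - y
        total += weights[y] * (err * err + abs(err))
    return total / len(labels)
```

Under this weighting, an error on a rare, high emphysema label costs more than the same error on a frequent, low label, which discourages the network from collapsing toward the majority scores.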
The model was evaluated after each epoch, and the final model was the one with the lowest validation loss, with the aim of selecting the model with the best fit on unseen data. The model was trained five times with different random seeds to assess stability. Each time, the ResNet-18 was trained for 75 epochs with a batch size of 32 and a learning rate of 0.001, decayed exponentially at each epoch (decay rate 0.95). The network’s parameters were regularized with an L2 penalty with a weight of 10^−6.
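The schedule and model selection can be sketched as follows, assuming that the decay means multiplying the learning rate by 0.95 after every epoch (the usual exponential schedule) and that the checkpoint with the lowest validation loss is kept. Both function names are hypothetical.

```python
def learning_rate(epoch, base_lr=1e-3, decay=0.95):
    """Learning rate used in a given epoch (epoch 0 is the first)."""
    return base_lr * decay ** epoch


def best_epoch(validation_losses):
    """Index of the epoch with the lowest validation loss."""
    return min(range(len(validation_losses)), key=validation_losses.__getitem__)
```

Under this assumption, the learning rate in the last of the 75 epochs is roughly 2% of its initial value.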
The source code for the slice-wise predictions is available as supplementary material.
Validation reference metrics
The main Gothenburg SCAPIS cohort was used for validation, with electronic case report forms (eCRF) visual emphysema scoring and PFT for comparison. None of the validation data was used in algorithm development or training. The validation data consisted of two cohorts made available by SCAPIS: 395 randomly selected cases, and 79 selected cases of emphysema according to eCRF visual scores (see Fig. 1b). There was no overlap between the cohorts.
Pulmonary function testing
Each SCAPIS participant was tested using dynamic spirometry 15 min after inhaling 400 µg of salbutamol [11]. A post-bronchodilator ratio of forced expiratory volume in 1 s (FEV1) to forced vital capacity (FVC) below 0.7 confirms chronic airway obstruction compatible with COPD [3], and FEV1/FVC is also the PFT parameter with the strongest correlation with LAV950 and visual emphysema scoring [14, 20]. It was therefore chosen as the validation reference. Correlations with post-bronchodilator carbon monoxide diffusing capacity in percent of predicted (DLcoPred%), in participants with available data, were also used [11].
Visual scoring
Three regions in each lung were reviewed at inclusion in SCAPIS: upper, middle, and lower using a Syngo.Via (Siemens Healthineers) thin-slice workstation. In each region, emphysema was graded on a 4-point scale: none, mild, moderate, or severe [17]. In this study, the eCRF emphysema score in each region was coded 0–3, and the sum of codes for all regions was used as a patient score of 0–18 (sum-visual score).
Significant visual emphysema was defined as a sum-visual score > 2, corresponding to more than two regions with mild emphysema or more than one region with moderate emphysema.
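The coding of the regional grades and the threshold for significant visual emphysema can be expressed compactly. The names below are hypothetical; the grade-to-code mapping, the six regions, and the > 2 cutoff follow the description above.

```python
GRADE_CODES = {"none": 0, "mild": 1, "moderate": 2, "severe": 3}


def sum_visual(region_grades):
    """Sum of coded grades over the six lung regions (range 0-18)."""
    if len(region_grades) != 6:
        raise ValueError("expected three regions in each of two lungs")
    return sum(GRADE_CODES[grade] for grade in region_grades)


def has_significant_emphysema(score):
    """Significant visual emphysema: sum-visual score greater than 2."""
    return score > 2
```

For example, two mild regions give a score of 2 (not significant), while one moderate and one mild region give a score of 3 (significant).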
LAV950
LAV950 was assessed using AI-Rad Companion Chest CT (Siemens Healthineers). The LAV950 analysis is threshold-based; the algorithm determines the fraction of all lung voxels with attenuation below − 950 Hounsfield units (HU). The analysis was performed using a fully automated workflow without any manual adjustments. The automated results were verified with the segmentations in the SWES algorithm (see Appendix C).
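The thresholding principle behind LAV950 is simple enough to show in a few lines. This is a generic sketch of the principle only and does not reproduce the commercial application; the function name and the flat voxel list with a boolean lung mask are assumptions.

```python
def lav950(hu_values, lung_mask, threshold=-950):
    """Fraction of lung voxels with attenuation below the threshold.

    hu_values: voxel attenuations in Hounsfield units (flattened).
    lung_mask: booleans marking which voxels belong to the lungs.
    """
    lung_voxels = [hu for hu, inside in zip(hu_values, lung_mask) if inside]
    if not lung_voxels:
        raise ValueError("lung mask selects no voxels")
    below = sum(1 for hu in lung_voxels if hu < threshold)
    return below / len(lung_voxels)
```

Each voxel is classified on its attenuation alone, so noise or hyperinflation that pushes a voxel below − 950 HU is counted in the same way as emphysema.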
Statistics
Pearson’s correlation coefficients between SWES and sum-visual were used to assess whether SWES measures what radiologists identify as emphysema, with Meng’s test for dependent correlation coefficients to test the hypothesis of stronger correlation for SWES compared to LAV950 [21].
The SWES, LAV950, and sum-visual scores were correlated to FEV1/FVC and DLcoPred% in the random (n = 395), emphysema (n = 79), and total validation (n = 474) cohorts, separately. Pearson’s correlation coefficients were compared using Meng’s test.
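As an illustration of the comparison of dependent correlations, the sketch below follows the Z statistic given by Meng, Rosenthal, and Rubin [21]. The function name and argument names are hypothetical, and a two-sided p-value from the standard normal distribution is assumed.

```python
import math


def meng_test(r1, r2, r12, n):
    """Compare two correlations that share one variable.

    r1, r2: correlations of predictor 1 and predictor 2 with the outcome.
    r12:    correlation between the two predictors.
    n:      number of participants.
    Returns (Z, two-sided p-value).
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)  # Fisher z-transform
    mean_r_sq = (r1 ** 2 + r2 ** 2) / 2
    f = min((1 - r12) / (2 * (1 - mean_r_sq)), 1.0)  # f is capped at 1
    h = (1 - f * mean_r_sq) / (1 - mean_r_sq)
    z = (z1 - z2) * math.sqrt((n - 3) / (2 * (1 - r12) * h))
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p
```

The test requires the correlation between the two predictors (here, for example, between SWES and LAV950), so it cannot be reproduced from the two outcome correlations alone.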
Receiver operating characteristic statistics were used for SWES and LAV950, considering prediction of significant visual emphysema, and airway obstruction, as defined by the GOLD criteria for COPD (FEV1/FVC < 0.7). ROC curves were compared using DeLong’s test and 95% CI were obtained through bootstrapping.
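A minimal sketch of the AUC and a bootstrap confidence interval is given below. It computes the AUC through its rank (Mann–Whitney) interpretation and uses a percentile bootstrap; the article does not state which bootstrap variant was used, so that choice, the function names, and the default of 1000 resamples are assumptions. DeLong's test is not shown.

```python
import random


def auc(scores, labels):
    """Probability that a positive case scores higher than a negative one."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(scores)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        resampled_labels = [labels[i] for i in idx]
        # Skip resamples that contain only one class; AUC is undefined there.
        if all(resampled_labels) or not any(resampled_labels):
            continue
        aucs.append(auc([scores[i] for i in idx], resampled_labels))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```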
Statistics were computed with Matlab R2020a (The Mathworks) and STATA 17.0 (StataCorp LLC).
Results
Baseline characteristics
Baseline characteristics of included participants are presented in Table 1. The development and validation cohorts were different regarding PFT as well as CT emphysema metrics (SWES, LAV950, and sum-visual). Also, the reconstruction parameters used were different in the development cohort compared to the validation cohorts.
Algorithm development
The slice-wise predictions on the 10-degree scale in the development cohort were stable on the unseen test set of 20 CT stacks not used in training. The model reached a mean slice-wise binned accuracy (within ± 1.5) of 83.7% over the replicates on the training-validation set, and 82.5% on the held-out test set.
Intra-scan repeatability comparing SWES on original and randomly rotated CT scans showed narrow limits of agreement, 0.06 ± 0.11.
Validation against visual emphysema
In the total validation cohort, SWES correlated strongly with the sum-visual regional score, while LAV950 correlated weakly (r = 0.78 vs. r = 0.41, p < 0.001 for difference) (see Fig. 3). The strong correlation indicates that SWES measures what radiologists identify as emphysema. Figure 4 shows the gradual increase in emphysema in randomly chosen slices with slice scores distributed between 0 and 10.
Fig. 4 Examples of per-slice predictions in the validation dataset. The slices and sides are randomly selected from multiple participants to obtain uniformly distributed examples with scores between 0 and 10. Inserted numbers show the per-slice emphysema score for the lung shown. All images are shown in window level/width − 500/1200 HU
With an area under the curve (AUC) of 0.85 (95% confidence interval (CI) 0.74–0.96), SWES was an excellent predictor of significant emphysema (sum-visual > 2) in the random cohort, while LAV950 did not discriminate between cases with and without significant emphysema (AUC 0.49, 95% CI 0.29–0.70; p < 0.001 for difference) (see Fig. 5a).
Fig. 5 ROC curves. a Prediction of significant visual emphysema, defined as sum-visual > 2. The AUC for SWES was higher than that for LAV950 (p < 0.001). b Prediction of airway obstruction, defined as FEV1/FVC < 0.7. The AUC for SWES was higher than that for LAV950 (p = 0.007) and for sum-visual (p = 0.004). SWES, slice-wise whole-lung emphysema score; LAV950, low-attenuation volume below − 950 HU
Correlation with PFT compared to LAV950 and sum-visual
Airway obstruction
SWES showed a strong inverse correlation with FEV1/FVC, the measure of airway obstruction, in the full cohort (r = − 0.69, p < 0.001). The correlations of SWES, sum-visual, and LAV950 with FEV1/FVC, with pair-wise comparisons using Meng’s test, are shown in Table 2. The correlation between SWES and FEV1/FVC was significantly stronger than the correlation between LAV950 and FEV1/FVC or between sum-visual and FEV1/FVC in all cohorts. Scatter plots illustrating the correlations are shown in Fig. 6.
SWES was a better predictor of airway obstruction (defined as FEV1/FVC < 0.7) than either LAV950 (p = 0.007) or sum-visual (p = 0.004). The AUC for SWES, LAV950, and sum-visual for prediction of GOLD criteria for COPD (FEV1/FVC < 0.7) in the random cohort was 0.76 (95% CI 0.67–0.85), 0.61 (95% CI 0.50–0.72), and 0.62 (95% CI 0.54–0.70), respectively (see Fig. 5b).
Diffusing capacity
DLcoPred% was available in 61 and 328 participants in the emphysema and random cohorts, respectively (see Table 3). In the emphysema cohort, SWES, sum-visual, and LAV950 were correlated with DLcoPred% (r = − 0.74, r = − 0.74, and r = − 0.52, respectively, each p < 0.001). In the random cohort, with low emphysema frequency, SWES and sum-visual were weakly correlated with DLcoPred% (r = − 0.21, p < 0.001 and r = − 0.20, p < 0.001, respectively), while LAV950 was not correlated with DLcoPred% (r = 0.06, p = 0.27).
The correlation with DLcoPred% was significantly stronger for SWES than for LAV950 in all validation cohorts (Meng’s test p < 0.001). Compared to sum-visual, the correlation was approximately equal in all cohorts (all p > 0.05).
Discussion
In this study, a slice-wise whole-lung CT emphysema score was developed to obtain a method for the rapid identification of emphysema suitable for population-based large cohorts. We compared SWES with quantitative CT and the sum of regional visual emphysema scores. SWES was a significantly better approximation of the readers’ visual score than LAV950 and correlated significantly more strongly with pulmonary function testing than either sum-visual or LAV950.
Emphysema is visible in CT imaging as low-density regions and is an important predictor of mortality and morbidity in COPD, independent of lung function [2, 6,7,8]. The development of quantitative CT metrics for emphysema in recent decades parallels attempts to include the additional value of CT imaging in COPD models, especially in research and clinical trials [4,5,6,7,8].
A threshold-based emphysema score such as LAV950 is a reasonable quantitative metric for emphysema, especially in cohorts with advanced disease [6,7,8, 10, 22]. However, in cross-sectional cohorts, there is considerable overlap in LAV950 between subjects with and without visual emphysema [17]. Furthermore, the readers’ visual emphysema estimation has been shown to provide additional predictive value, even in models that include thresholding [8].
The output measure of LAV950—the fraction of low-attenuation volume—is appealing as it may be interpreted as the proportion of lung parenchyma affected by emphysema. However, in diffusely distributed lung diseases, there is generally no clear cutoff in CT images between healthy and affected parenchyma [23], and the delineation based on attenuation alone is oversimplified. Low-density pixels can appear, for example, because of air-trapping, hyperinflated lungs, and image noise, phenomena that may be distinguished from emphysema by experienced readers.
The difficult delineation of emphysema also makes 3D CNN architectures, which are computationally logical for thin-slice CT data, challenging to apply with detailed visual scores as ground truth. 3D CNN may be applied on a global score basis such as PFT [22], but fine-grained global scores that truly represent visual emphysema are even more difficult to obtain than slice-wise scores. Instead, to address the continuous aspect of emphysema, we developed SWES as an aggregated score for each lung on a slice-wise basis. Acquisition of quality annotations is a challenge for supervised machine learning in radiology [24]. Multiple image comparisons by a large number of trained readers and approximate sorting enabled the creation of detailed training data.
We present two major results: First, SWES is a good approximation of the radiologist’s assessment, which shows that the algorithm measures what the radiologist identifies as emphysema. Second, SWES is a better predictor of chronic airway obstruction than either the reader’s visual score or LAV950. In addition, the correlation with reduced diffusing capacity was stronger compared to LAV950 and equal compared to visual scores. The absence of correlation between LAV950 and DLcoPred% in the random cohort is likely caused by the overlap in LAV950 between participants with and without emphysema and the low number of participants with severe emphysema in the random cohort.
The improved performance in predicting obstruction compatible with COPD may be explained by the greater granularity of SWES and the absence of the inter-observer variations inevitable in visual scoring [17, 25, 26]. While readers estimated the emphysema extent on a 4-point scale in three regions in each lung, SWES is an aggregated continuous score for each 0.6-mm slice of each lung, using 10-point scale training data. The improved performance compared with LAV950 indicates that counting low-attenuation pixels does not capture the complete information regarding emphysema in CT images [8, 17].
Most previous studies using deep learning to detect emphysema in chest CT images have used smoking cohorts with a high frequency of COPD [22, 27,28,29]. Given the aim of the study, an important feature was the cross-sectional test cohort with low emphysema frequency. Comparison with a study by Singla et al illustrates the association between airway obstruction and visual emphysema scoring [22]. While Singla et al used PFT results as image labels and showed that the machine learning method also predicted visual emphysema, we took the opposite approach, using visual emphysema labels and showing that we could also predict the physiological airway obstruction [22].
Limitations
The results indicate that the algorithm performs well with fixed imaging parameters in the validation cohort from the main SCAPIS study, although the reconstruction parameters were different in the training dataset. However, the differences in magnitude, as seen in Table 1, indicate that, similarly to LAV950, the magnitude of SWES is highly dependent on the reconstruction parameters and cannot be directly compared using different settings. For routine clinical use or with other image parameters, additional training to equalize the output would be necessary, but the demonstrated principle of an aggregated visual slice-wise score is valid.
Although we demonstrate that SWES correlates with visual emphysema scoring, the specific image features that the algorithm detects are unknown. Airway wall thickening, the other main component of COPD, was not assessed. Airway involvement as seen on CT also has clinical predictive value [30, 31] and should be included in future work.
The SWES scale, developed using ordinal data from radiologists’ annotations, is arbitrary and has no fixed reference, which makes score interpretation more difficult. The emphysema type, which may add predictive value, was not assessed in the study, but could be assessed with further developments.
In conclusion, SWES is a quantitative emphysema score for CT imaging that avoids the disadvantages of threshold-based scores and is correlated more strongly with reference tests than LAV950 and visual scores. Aggregated slice-wise emphysema quantification is especially suited for pulmonary research use in large-scale cross-sectional CT multi-center image cohorts.
Abbreviations
- CNN: Convolutional neural network
- COPD: Chronic obstructive pulmonary disease
- DLcoPred%: Post-bronchodilator carbon monoxide diffusing capacity in percent of predicted
- eCRF: Electronic case report form
- FEV1/FVC: Forced expiratory volume in 1 s divided by forced vital capacity
- LAV950: Percentage low-attenuation volume below − 950 HU
- PFT: Pulmonary function test (spirometry)
- ROC: Receiver operating characteristics
- SCAPIS: Swedish CArdioPulmonary bioImage Study
- Sum-visual: Sum of regional visual emphysema grading in one participant
- SWES: Slice-wise whole-lung emphysema score
References
The top 10 causes of death. World Health Organization. (2022) Available via https://www.who.int/news-room/fact-sheets/detail/the-top-10-causes-of-death. Accessed 3 Aug 2022
Hansell DM, Bankier AA, MacMahon H, McLoud TC, Müller NL, Remy J (2008) Fleischner Society: Glossary of Terms for Thoracic Imaging. Radiology 246:697–722. https://doi.org/10.1148/radiol.2462070712
Vestbo J, Hurd SS, Agustí AG et al (2013) Global strategy for the diagnosis, management, and prevention of chronic obstructive pulmonary disease: GOLD executive summary. Am J Respir Crit Care Med 187:347–365. https://doi.org/10.1164/rccm.201204-0596PP
Dirksen A, Piitulainen E, Parr DG et al (2009) Exploring the role of CT densitometry: a randomised study of augmentation therapy in alpha1-antitrypsin deficiency. Eur Respir J 33:1345–1353. https://doi.org/10.1183/09031936.00159408
McElvaney NG, Burdon J, Holmes M et al (2017) Long-term efficacy and safety of α1 proteinase inhibitor treatment for emphysema caused by severe α1 antitrypsin deficiency: an open-label extension trial (RAPID-OLE). Lancet Respir Med 5:51–60. https://doi.org/10.1016/S2213-2600(16)30430-1
Andrianopoulos V, Celli BR, Franssen FME et al (2016) Determinants of exercise-induced oxygen desaturation including pulmonary emphysema in COPD: Results from the ECLIPSE study. Respir Med 119:87–95. https://doi.org/10.1016/j.rmed.2016.08.023
Martinez CH, Chen Y-H, Westgate PM et al (2012) Relationship between quantitative CT metrics and health status and BODE in chronic obstructive pulmonary disease. Thorax 67:399–406. https://doi.org/10.1136/thoraxjnl-2011-201185
Lynch DA, Moore CM, Wilson C et al (2018) CT-based Visual Classification of Emphysema: Association with Mortality in the COPDGene Study. Radiology 288:859–866. https://doi.org/10.1148/radiol.2018172294
Han MK, Kazerooni EA, Lynch DA et al (2011) Chronic obstructive pulmonary disease exacerbations in the COPDGene study: associated radiologic phenotypes. Radiology 261:274–282. https://doi.org/10.1148/radiol.11110173
Labaki WW, Xia M, Murray S et al (2021) Quantitative Emphysema on Low-Dose CT Imaging of the Chest and Risk of Lung Cancer and Airflow Obstruction: An Analysis of the National Lung Screening Trial. Chest 159:1812–1820. https://doi.org/10.1016/j.chest.2020.12.004
Bergström G, Berglund G, Blomberg A et al (2015) The Swedish CArdioPulmonary BioImage Study: objectives and design. J Intern Med 278:645–659. https://doi.org/10.1111/joim.12384
Wesdorp NJ, Kemna R, Bolhuis K et al (2022) Interobserver Variability in CT-based Morphologic Tumor Response Assessment of Colorectal Liver Metastases. Radiol Imaging Cancer 4:e210105. https://doi.org/10.1148/rycan.210105
Mascalchi M, Camiciottoli G, Diciotti S (2017) Lung densitometry: why, how and when. J Thorac Dis 9:3319–3345. https://doi.org/10.21037/jtd.2017.08.17
Schroeder JD, McKenzie AS, Zach JA et al (2013) Relationships between airflow obstruction and quantitative CT measurements of emphysema, air trapping, and airways in subjects with and without chronic obstructive pulmonary disease. AJR Am J Roentgenol 201:W460–W470. https://doi.org/10.2214/AJR.12.10102
Dijkstra AE, Postma DS, ten Hacken N et al (2013) Low-dose CT measurements of airway dimensions and emphysema associated with airflow limitation in heavy smokers: a cross sectional study. Respir Res 14:11. https://doi.org/10.1186/1465-9921-14-11
Hoffman EA, Ahmed FS, Baumhauer H et al (2014) Variation in the percent of emphysema-like lung in a healthy, nonsmoking multiethnic sample. The MESA lung study. Ann Am Thorac Soc 11:898–907. https://doi.org/10.1513/AnnalsATS.201310-364OC
Vikgren J, Khalil M, Cederlund K et al (2019) Visual and Quantitative Evaluation of Emphysema: A Case-Control Study of 1111 Participants in the Pilot Swedish CArdioPulmonary BioImage Study (SCAPIS). Acad Radiol. https://doi.org/10.1016/j.acra.2019.06.019
Lidén M, Hjelmgren O, Vikgren J, Thunberg P (2020) Multi-Reader-Multi-Split Annotation of Emphysema in Computed Tomography. J Digit Imaging. https://doi.org/10.1007/s10278-020-00378-2
He K, Zhang X, Ren S, Sun J (2015) Deep Residual Learning for Image Recognition. arXiv:1512.03385. https://doi.org/10.48550/arXiv.1512.03385
Nambu A, Zach J, Schroeder J et al (2016) Quantitative computed tomography measurements to evaluate airway disease in chronic obstructive pulmonary disease: Relationship to physiological measurements, clinical index and visual assessment of airway disease. Eur J Radiol 85:2144–2151. https://doi.org/10.1016/J.EJRAD.2016.09.010
Meng XL, Rosenthal R, Rubin DB (1992) Comparing correlated correlation coefficients. Psychol Bull 111:172–175. https://doi.org/10.1037/0033-2909.111.1.172
Singla S, Gong M, Riley C, Sciurba F, Batmanghelich K (2021) Improving clinical disease subtyping and future events prediction through a chest CT-based deep learning approach. Med Phys 48:1168–1181. https://doi.org/10.1002/mp.14673
Längkvist M, Widell J, Thunberg P, Loutfi A, Lidén M (2019) Interactive user interface based on convolutional auto-encoders for annotating CT-scans. arXiv:1904.11701, 2019. https://doi.org/10.48550/arXiv.1904.11701
Choy G, Khalilzadeh O, Michalski M et al (2018) Current Applications and Future Impact of Machine Learning in Radiology. Radiology 288:318–328. https://doi.org/10.1148/radiol.2018171820
Widell J, Lidén M (2020) Interobserver variability in high-resolution CT of the lungs. Eur J Radiol Open 7:100228. https://doi.org/10.1016/j.ejro.2020.100228
Walsh SLF, Calandriello L, Sverzellati N, Wells AU, Hansell DM (2016) Interobserver agreement for the ATS/ERS/JRS/ALAT criteria for a UIP pattern on CT. Thorax 71:45–51. https://doi.org/10.1136/thoraxjnl-2015-207252
Hasenstab KA, Yuan N, Retson T et al (2021) Automated CT Staging of Chronic Obstructive Pulmonary Disease Severity for Predicting Disease Progression and Mortality with a Deep Learning Convolutional Neural Network. Radiol Cardiothorac Imaging 3:e200477. https://doi.org/10.1148/ryct.2021200477
Humphries SM, Notary AM, Centeno JP et al (2020) Deep Learning Enables Automatic Classification of Emphysema Pattern at CT. Radiology 294:434–444. https://doi.org/10.1148/radiol.2019191022
González G, Ash SY, Vegas-Sánchez-Ferrero G et al (2018) Disease Staging and Prognosis in Smokers Using Deep Learning in Chest Computed Tomography. Am J Respir Crit Care Med 197:193–203. https://doi.org/10.1164/rccm.201705-0860OC
Orlandi I, Moroni C, Camiciottoli G et al (2005) Chronic obstructive pulmonary disease: thin-section CT measurement of airway wall thickness and lung attenuation. Radiology 234:604–610. https://doi.org/10.1148/radiol.2342040013
Xie X, Dijkstra AE, Vonk JM, Oudkerk M, Vliegenthart R, Groen HJM (2014) Chronic respiratory symptoms associated with airway wall thickening measured by thin-slice low-dose CT. AJR Am J Roentgenol 203:W383–W390. https://doi.org/10.2214/AJR.13.11536
Acknowledgements
Many thanks to the participating readers at the Radiology Department, Örebro University Hospital, Sweden. We are very grateful to all the participants in this study and the staff at the SCAPIS test center in Gothenburg, Sweden.
Funding
Open access funding provided by Örebro University. This study has received funding from Nyckelfonden, Örebro, Sweden (OLL-881491), Analytic Imaging Diagnostics Arena (AIDA), Linköping, Sweden (2104_Lidén) and Region Örebro län, Sweden (OLL-959996).
The main funding body of The Swedish CArdioPulmonary bioImage Study (SCAPIS) is the Swedish Heart and Lung Foundation. SCAPIS is also funded by the Knut and Alice Wallenberg Foundation, the Swedish Research Council and VINNOVA (Sweden’s Innovation Agency). In addition, the SCAPIS pilot received support from the Sahlgrenska Academy at University of Gothenburg and Region Västra Götaland.
Ethics declarations
Guarantor
The scientific guarantor of this publication is Mats Lidén, MD, Assoc Prof.
Conflict of interest
No industry support was provided for the project, and the authors have no conflicts of interest to declare related to the project. Not related to this project, MS has received consultancies from Roche, Boehringer Ingelheim, Novartis, Pfizer, AstraZeneca, GlaxoSmithKline, and Chiesi.
Statistics and biometry
One of the authors (ML) has significant statistical expertise.
Informed consent
Written informed consent was obtained from all subjects (patients) in this study.
Ethical approval
Institutional Review Board approval was obtained.
Study subjects or cohorts overlap
Concerning machine learning CT emphysema assessment, the validation cohort has not been previously reported. Related to the present work, the subjects in the development cohort have previously been reported by Vikgren et al [17] and Lidén et al [18].
Methodology
• Prospective inclusion
• Cross-sectional
• Single inclusion site
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Lidén, M., Spahr, A., Hjelmgren, O. et al. Machine learning slice-wise whole-lung CT emphysema score correlates with airway obstruction. Eur Radiol 34, 39–49 (2024). https://doi.org/10.1007/s00330-023-09985-3
DOI: https://doi.org/10.1007/s00330-023-09985-3