Automatic quantitative computed tomography measurement of longitudinal lung volume loss in interstitial lung diseases

Objectives
To compare the lung CT volume (CTvol) and pulmonary function tests in an interstitial lung disease (ILD) population, then to evaluate the CTvol loss between idiopathic pulmonary fibrosis (IPF) and non-IPF and explore a prognostic value of annual CTvol loss in IPF.

Methods
We conducted a retrospective study in an expert center between 2005 and 2018 on consecutive patients with ILD. CTvol was measured automatically using commercial software based on a deep learning algorithm. In the first group, Spearman correlation coefficients (r) between forced vital capacity (FVC), total lung capacity (TLC), and CTvol were calculated. In a second group, annual CTvol loss was calculated using linear regression analysis and compared with the Mann–Whitney test. In a last group of IPF patients, annual CTvol loss was calculated between baseline and 1-year CTs to investigate, with the Youden index, its prognostic value for major adverse events at 3 years. Univariate analysis and log-rank tests were performed.

Results
In total, 560 patients (4610 CTs) were analyzed. For 1171 CTs, CTvol was correlated with FVC (r: 0.86) and TLC (r: 0.84) (p < 0.0001). In 408 patients (3332 CTs), median annual CTvol loss was 155.7 mL in IPF versus 50.7 mL in non-IPF (p < 0.0001) over 5.03 years. In 73 IPF patients, a relative annual CTvol loss of 7.9% was associated with major adverse events (log-rank, p < 0.0001) in univariate analysis (p < 0.001).

Conclusions
Automated lung CT volume may be an alternative or a complementary biomarker to pulmonary function tests for the assessment of lung volume loss in ILD.

Key Points
• There is a good correlation between lung CT volume and forced vital capacity, as well as total lung capacity (r of 0.86 and 0.84 respectively, p < 0.0001).
• Median annual CT volume loss is significantly higher in patients with idiopathic pulmonary fibrosis than in patients with other fibrotic interstitial lung diseases (155.7 versus 50.7 mL, p < 0.0001).
• In idiopathic pulmonary fibrosis, a relative annual CT volume loss higher than 7.9% is associated with a significantly reduced mean survival time of 2.0 years versus 2.8 years (log-rank, p < 0.0001).


Introduction
Interstitial lung diseases (ILDs) encompass a heterogeneous group of chronic and fibrotic lung diseases with distinct disease courses and prognoses [1]. They may be associated with progressive lung volume loss with impaired quality of life and, in advanced stages, respiratory failure. Idiopathic pulmonary fibrosis (IPF), an inexorably progressive disease, is the most severe and lethal of them [2,3]. Survival is shortened in patients with lower forced vital capacity (FVC) at baseline and an annual FVC decline ≥ 10%. FVC has thus been proposed as a surrogate marker for disease progression and mortality in all ILDs and has been advocated as a primary outcome in major clinical trials in IPF [4][5][6]. However, FVC measurement is subject to inherent variability and might be inaccurate in frail patients, advanced disease stages, and subjects with intractable cough [7,8]. Therefore, a new feasible, reproducible, and effortless surrogate biomarker is still needed.
Chest high-resolution computed tomography (HRCT) is mandatory for disease evaluation in patients with ILD and is used for diagnostic, monitoring, and prognostic purposes. The current guidelines recommend both pulmonary function tests (PFT) and chest HRCT as fundamental in patient follow-up [9][10][11]. Moreover, CT volume loss derived from visual or automated quantification of the lung volumes on HRCT has shown promise for assessing the degree of severity, disease progression, and mortality in IPF and systemic sclerosis-associated ILD, albeit in relatively small ILD populations [12][13][14][15]. Yet, there is a dearth of data on longitudinal CT lung volume loss, its annual decline, and its prognostic value in the IPF population.
Therefore, using a newly available commercial deep learning algorithm for automatic quantification of lung CT volume, we compare the lung CT volume and pulmonary function tests in a large ILD population. Then, we evaluate the longitudinal CT volume loss between IPF and non-IPF populations and explore a predictive value of annual CT volume loss in the IPF population.

Study population
We conducted a retrospective, observational, longitudinal study between February 2005 and July 2018 in an ILD expert center (Louis Pradel Hospital, Hospices Civils de Lyon, Lyon, France). Clinical, functional, and imaging data for consecutive patients that underwent at least one unenhanced HRCT study for fibrotic ILD were collected. The diagnosis of ILD was made in multidisciplinary discussion according to international guidelines at the time of the patient's presentation. Data usage policy of the "Hospices Civils de Lyon" in terms of confidentiality, anonymization, and security was applied for each study, and approval was obtained from our local committee. Institutional review board approval was obtained for the study, and patient consent was waived.

CT studies
For the comparison between the PFT and CT volume (CTvol), only patients having a CT study within two weeks of the PFT were considered eligible. For the longitudinal CTvol loss evaluation within non-IPF and IPF groups, patients who underwent at least four CT examinations were included. Finally, for the predictive analysis of CT volume loss in IPF patients, patients who had a baseline CT and a 1-year follow-up CT (± 10 days) were included. For all CT examinations, patients with a recent history (within 3 months) of acute exacerbation, pneumothorax, pleural effusion, or lower respiratory tract infection, as confirmed by two senior radiologists (S.S-M. and L.B., with 6 and 20 years of experience in chest imaging, respectively), were excluded so that the cohort would be representative of the chronic disease course of these fibrotic ILDs.

CT protocols
All HRCT acquisitions were performed at the end of deep inspiration. The data were collected retrospectively from CT examinations performed on several systems over the years: GE Medical Systems (Revolution GSI), Philips (Brilliance 40, Brilliance 64, iCT 256, Ingenuity CT, IQon Spectral CT), and Siemens (Somatom Definition AS and AS+). The scanning parameters were as follows: helical scan mode, tube voltage of 100-140 kVp (mean ± SD: 121 ± 9 kVp). The mean slice thickness was 1.8 ± 0.8 mm (range: 0.9-3 mm).

Automatic quantification of lung CTvol
We used commercially available software implemented in a clinical workstation (CT Pulmo Auto Results, provided under a research contract; IntelliSpace Portal ISP11.1, Philips Healthcare). This software relies on a U-net-based deep learning algorithm and segments the lungs while excluding the main airways (trachea, main stem and lobar bronchi) and the main vessels. Lung CTvol was expressed in liters (L) (Fig. 1).

Clinical respiratory functional test data
Patients' data on demographics and PFTs were collected from patients' records. PFTs were performed in all patients according to ATS/ERS official statement [16]. Total lung capacity (TLC) was measured with body plethysmography, and forced vital capacity (FVC) by spirometry, both expressed in liters (L).

Statistical analysis
The data are expressed as mean ± standard deviation (SD) and median with interquartile range (IQR) for continuous variables. Categorical variables are presented as frequency (percentage). Distributions were tested for normality using the D'Agostino-Pearson test. Variables were compared using paired Student t-tests or Wilcoxon rank-sum tests, as appropriate. Differences between lung CTvol and functional volumes were tested using the Wilcoxon matched-pairs signed-rank test. For comparison between CTvol and respiratory volumes (FVC and TLC), a Bland-Altman analysis (bias, limits of agreement), a linear regression analysis (95% confidence interval, R²), and Spearman correlation coefficients with their 95% confidence intervals were calculated.
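As an illustration of the Bland-Altman step, the following minimal Python sketch computes the bias and the 95% limits of agreement for paired volumes. It is not the SPSS or R code used in the study: the function name `bland_altman`, the use of bias ± 1.96 SD for the limits, and the sample values are assumptions made for illustration only.

```python
from statistics import mean, stdev


def bland_altman(ct_vol, pft_vol):
    """Return (bias, lower_limit, upper_limit) for paired volumes in liters.

    Assumption: limits of agreement are taken as bias +/- 1.96 * SD of the
    paired differences (CT minus PFT), the usual Bland-Altman convention.
    """
    if len(ct_vol) != len(pft_vol) or len(ct_vol) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [c - p for c, p in zip(ct_vol, pft_vol)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired measurements (liters), not taken from the study.
ct = [3.9, 4.4, 2.8, 5.1, 3.3]
fvc = [3.0, 3.2, 2.1, 3.9, 2.5]
bias, lower, upper = bland_altman(ct, fvc)
print(f"bias = {bias:.2f} L, limits of agreement = {lower:.2f} to {upper:.2f} L")
```

A positive bias in this convention means that CTvol is, on average, larger than the functional volume it is compared with.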
For estimation of the daily and annual CTvol loss during follow-up, a linear regression across all individual CT studies over time was calculated. The Mann-Whitney test was used to compare daily and annual CTvol loss between IPF and non-IPF groups.
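The slope-based estimate described above can be sketched as follows. This is a hedged illustration, not the study's implementation: the function name `annual_ctvol_loss`, the per-patient ordinary least-squares fit, the conversion factor of 365.25 days per year, and the example time points are all assumptions.

```python
def annual_ctvol_loss(days, volumes_l):
    """Estimate annual CTvol loss (mL/year) from one patient's serial CTs.

    days      : time of each CT in days since the first CT
    volumes_l : CTvol in liters at each time point
    The daily slope comes from an ordinary least-squares fit of volume on
    time; a positive return value means volume is being lost.
    Assumption: one year is counted as 365.25 days.
    """
    n = len(days)
    if n != len(volumes_l) or n < 2:
        raise ValueError("need at least two (day, volume) pairs")
    mean_t = sum(days) / n
    mean_v = sum(volumes_l) / n
    sxx = sum((t - mean_t) ** 2 for t in days)
    if sxx == 0:
        raise ValueError("all CTs have the same date")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(days, volumes_l))
    slope_l_per_day = sxy / sxx
    return -slope_l_per_day * 365.25 * 1000.0


# Hypothetical patient with four CTs losing 0.0004 L per day.
days = [0, 200, 400, 800]
volumes = [4.00, 3.92, 3.84, 3.68]
print(f"{annual_ctvol_loss(days, volumes):.1f} mL/year")
```

Because the CTs are not acquired at fixed intervals, a regression over all time points uses every available scan rather than only the first and last.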
For determining a predictive CTvol loss in the IPF population for major adverse events (MAE), i.e., death and transplantation, we first calculated the absolute and relative annual CTvol loss between the baseline CT and the 1-year follow-up CT of each patient who had a minimum 3-year follow-up or had died or undergone transplantation within 3 years. Then, a receiver operating characteristic (ROC) analysis was used to determine the best threshold values of baseline CTvol and CTvol loss with the Youden index, after having dichotomized the population with or without MAE at 4 years after the first CT study (baseline). According to this value, a chi-square test with Yates' continuity correction was used to test the association between categorical variables (MAE at 4 years, baseline CTvol, annual and relative CTvol loss). Univariate Cox regressions were performed. A Kaplan-Meier analysis was used to compare survival in the IPF population between patients above and below the threshold of the best predictor.
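The threshold selection with the Youden index can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the definition of relative loss as (baseline − 1-year) / baseline, the "value ≥ threshold" positivity rule, and the sample data are assumptions and are not taken from the study.

```python
def relative_annual_loss(baseline_l, one_year_l):
    """Relative CTvol loss in percent of baseline (assumed definition)."""
    return 100.0 * (baseline_l - one_year_l) / baseline_l


def youden_threshold(values, events):
    """Return (threshold, sensitivity, specificity) maximizing J.

    J = sensitivity + specificity - 1. A patient is counted as test-positive
    when value >= threshold; events holds 1 (MAE) or 0 (no MAE).
    On ties in J, the lowest threshold is kept.
    """
    positives = sum(events)
    negatives = len(events) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both outcome classes are required")
    best = None
    for cut in sorted(set(values)):
        tp = sum(1 for v, e in zip(values, events) if e and v >= cut)
        tn = sum(1 for v, e in zip(values, events) if not e and v < cut)
        sens = tp / positives
        spec = tn / negatives
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cut, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical relative annual losses (%) and MAE status, for illustration.
losses = [2.0, 3.5, 5.0, 6.0, 8.0, 9.5, 11.0, 12.5]
mae = [0, 0, 0, 1, 0, 1, 1, 1]
threshold, sens, spec = youden_threshold(losses, mae)
print(threshold, sens, spec)
```

The Youden index weighs sensitivity and specificity equally, which is one common way to pick a single cut-off from a ROC curve before dichotomizing a continuous predictor.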
Statistical analysis was performed using the SPSS® software v23 (IBM) and R software v3.5. A p value less than 0.05 was considered significant. The Bonferroni correction was used to adjust the p values in the longitudinal and predictive studies, i.e., less than 0.01, respectively.

Relation between CTvol and pulmonary function tests
We analyzed 424 patients corresponding to 1171 CT studies (Table 1). There was a good correlation between CTvol and FVC (Spearman coefficient of 0.86; 95% CI: 0.84-0.87, p < 0.0001) as well as between CTvol and TLC (Spearman coefficient of 0.84; 95% CI: 0.82-0.86, p < 0.0001) (Table 2). Bland-Altman analysis between CTvol and FVC revealed a positive proportional bias of 0.97 L, with 95% limits of agreement from −0.27 to 2.21 L, and a negative proportional bias between CTvol and TLC of −0.68 L, with 95% limits of agreement from −2.05 to 0.69 L (Table 2 and Fig. 3).
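For readers less familiar with the Spearman coefficient reported here, it is the Pearson correlation computed on ranks, which makes it sensitive to monotonic rather than strictly linear association. The minimal Python sketch below illustrates the computation with average ranks for ties; the function names and sample values are hypothetical, and the sketch does not reproduce the confidence intervals or p values reported in this section.

```python
def average_ranks(xs):
    """Return 1-based ranks of xs, giving tied values their average rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical CTvol and FVC values (liters), for illustration only.
ctvol = [3.9, 4.4, 2.8, 5.1, 3.3]
fvc = [3.0, 3.2, 2.1, 3.9, 2.5]
print(round(spearman(ctvol, fvc), 2))
```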

Longitudinal lung CTvol loss between IPF and non-IPF groups
We analyzed 408 patients corresponding to 3332 CT studies, which were distributed in two groups: 81 (19.9%) patients in the IPF group and 327 (80.1%) in the non-IPF group (

One-year lung volume loss from baseline in the IPF group
Annual CTvol loss between the baseline CT and the 1-year follow-up CT of 73 patients with IPF was analyzed. Twenty-one MAEs during the 3-year follow-up period were reported with a mean ± SD delay of 440 ± 288 days after the annual CT. Sixteen patients died with a mean ± SD delay of 416 ± 288 days after the annual CT from pneumonia (6, 38%), respiratory failure (3, 19%), acute exacerbation (4, 25%), lung cancer (1, 6%), sepsis (1, 6%), and pulmonary embolism (1, 6%). Five patients underwent single lung transplantation with a mean ± SD delay of 542 ± 197 days after the annual CT. Eight patients were lost to follow-up before 3 years and had previously been excluded from the initial population. Four patients underwent transplantation before the annual CT follow-up and were excluded from the predictive analysis. Baseline characteristics of the study population, dichotomized by presence of MAE, are summarized in Table 4. Based on a ROC analysis performed on baseline CTvol and on absolute and relative annual CTvol loss, we found that a relative CTvol loss of 7.9% offered the best trade-off between sensitivity and specificity, respectively of 81. 0% (Table 5), no difference was observed with respect to age, sex, and baseline CTvol. But we found a significant effect for annual absolute CTvol loss (p < 0.01) and relative annual CTvol loss (p < 0.001), both dichotomized, on patient survival. The categorical variable (death at 4 years) was significantly associated with the annual CTvol loss with a p < 0.001 using a chi-square test with Yates' continuity correction. Log-rank test demonstrated a significant

Discussion
In the present study, we found that lung CT volume measurement, enabled by an automatic approach based on a deep learning algorithm, correlated strongly with FVC and TLC. Using longitudinal lung CT volume loss, we found that patients with IPF had a disease course distinct from that of other ILDs. In addition, in the IPF group, higher annual CT volume loss was associated with a worse prognosis. The first result was the collinear relationship of CTvol with TLC, as previously shown [14,17], but also with FVC among 1171 pairs of CT and PFT in a large cohort of ILD patients. CTvol was lower than TLC, as previously demonstrated [18], which can be explained by patient positioning, e.g., the difference between CT exams (decubitus) and plethysmography (sitting position). Moreover, TLC measurement (body plethysmography) sums up all thoracic airspaces (including anatomic dead space) while CTvol only provides the measurement of parenchymal lung volume [19]. Thus, the delta between TLC and CTvol represents an anatomic dead space of 680 mL, in keeping with the bias found in the Bland-Altman analysis. CTvol, on the other hand, was higher than FVC, which can be explained by the contribution of the residual volume, which FVC does not capture. In addition, the Bland-Altman analysis showed a positive proportional bias, suggesting a larger difference between CTvol and FVC for high lung volumes.
From a clinical perspective, longitudinal assessment of volume loss may be more useful than a single measurement at baseline. The results of the present longitudinal study demonstrated a greater and faster CTvol loss among IPF than non-IPF patients, associated with shorter follow-up time (due to greater mortality) but a comparable number of CT exams. These findings are consistent with higher morbidity and mortality in IPF as compared to other fibrotic ILDs, leading to more frequent acute complications requiring more exams, and earlier death [20]. In addition, results from the longitudinal analysis showed in the IPF population a good agreement between annual CTvol loss and annual FVC loss, as reported previously [21]. Of note, PFTs are time-consuming and require multiple measurements along with technical expertise in order to be reproducible [22][23][24][25][26][27].

Fig. 4 Notched box-and-whiskers plot showing the annual CT volume (CTvol) in IPF and non-IPF groups. The lower and upper margins of each box indicate the 25th and 75th percentile. Median is marked by the line in the box, and mean by the red dot. 95% confidence interval of the median is represented by the notches and outliers indicate the minimal and maximal values
Finally, in our attempt to clarify the value of lung CTvol in the IPF population, we showed a significant association between the annual CTvol loss and the prognosis. We found a relative CTvol loss predictive value of 7.9% for death at 3 years after the 1-year CT follow-up, which is close to the 10% FVC decline threshold currently used as an endpoint, i.e., indicating disease progression in IPF clinical studies [28]. Using the 7.9% threshold, we found a mean survival time of 2.8 versus 2.0 years, which is in line with the average 2.5-year survival time reported in IPF patients [29]. Interestingly, we also found an association between the absolute CTvol loss and mortality. In addition, the baseline CT volume was not associated with the prognosis, contrary to the CT volume loss, although there was a trend towards a lower baseline CTvol in patients with poor prognosis. This result supports the findings of previous studies that highlighted the stronger prognostic value of changes in clinical and physiological parameters compared to the baseline lung function in fibrotic ILD [30,31]. However, the meaning of this finding may be limited due to the absence of normalization as a function of the age, sex, and height of the patient.
A strength of this study was the use of an automated solution based on a deep learning algorithm, which allowed the analysis of a large cohort, previously limited by exclusive manual processing [21][22][23]. This solution is a commercially available, fully automatic application that does not require any user interaction besides installation, and that is part of the software suite available in the IntelliSpace Portal (Philips Healthcare). In our study, it was made available as a standalone version to be deployed in batch mode, but lung volumetric measurements can be done on demand or in pre-processing mode as soon as the data from the scanner are available. Running time (including additional measurements and reporting) varies depending on the IT network and system workload, and is typically less than 2-3 min. Analysis of the inter-observer variability between manual and automatic segmentation reported by the vendor showed a median absolute volume difference of 19.6 mL (IQR, Q1-Q3: 9.1-32.0 mL) [32], which is small in comparison to the CTvol loss values reported in our study. Taken together, this explains the feasibility of investigating lung volumes at CT in a large representative cohort of consecutive patients with fibrotic ILD in a center of expertise. In addition, this automatic approach may be of great interest in the many cases where measuring PFTs can be challenging. PFTs, although generally reproducible, are also person-dependent (i.e., on the patient and on the technician or physician performing the test). Measuring lung volumes using a different method could help interpret the findings, as do complementary methods in other areas. One additional result from the CT that cannot be obtained by PFTs is the measurement of lobar volumes, which may help to define the clinical evolution in some fibrotic ILDs, as suggested recently in pleuroparenchymal fibroelastosis [33].
Hence, our study represents a first step towards the identification of a new biomarker predicting physiological outcomes in order to consider the design of a new IPF mortality-risk score [34]. By taking into account additional CT parameters, automatic quantitative CT analysis could become a valid alternative or a complementary tool to pulmonary function tests in patients with ILD. However, the implementation of this technique in clinical practice warrants further prospective and controlled studies.
This study, besides its retrospective and monocentric design, has several limitations. First, CTvol measurement may vary with the individual's degree of inspiration during the CT examination. However, a previous study reported an acceptable variability (< 10%) and a good repeatability of CTvol in patients with restrictive lung diseases, explained by reduced pulmonary compliance leading to less variable inspiratory volumes [36]. Hence, we assumed that inspiratory CT would be sufficient for patient follow-up, although combined inspiratory and expiratory CT exams are often recommended [37]. Second, in the presence of concomitant emphysema in ILD, disease progression can be underestimated because FVC and TLC may remain stable [7]; estimation of lung volume solely with PFT can therefore sometimes be insufficient to monitor restrictive physiology. Nevertheless, our study did not take into account patients with concomitant emphysema. Hence, it would be interesting to perform additional lobar segmentation, since pulmonary fibrosis takes place mainly in the basal lobes while emphysematous changes preferentially affect the upper lobes [38]. Third, in the longitudinal study, patients with at least 4 CT studies were included to guarantee the quality of the linear regressions, which may vary depending on the sampling pattern across time, i.e., may be driven by influential points. However, in this real-life study, the CT scans were not performed at fixed intervals, which can represent a bias.
Fourth, our comparative study between the PFT results and CT volume values did not take into account the PFT values expressed as a percentage of the predicted value. Nevertheless, our aim was to evaluate the absolute precision of the CT for volume quantification. Fifth, the annual CT volume loss was assessed only in the IPF patients. Currently, there is no recommendation for performing an annual CT follow-up in non-IPF patients, contrary to IPF patients, who undergo annual follow-up CT in our expert center according to the French recommendations for the management of IPF [39]. This explains the difficulty of assessing the annual CT lung volume loss in non-IPF patients. In addition, non-IPF patients who had a follow-up CT probably had it for a clinical indication, including the occurrence of acute exacerbations, which may bias our analysis of the chronic disease course of the fibrotic ILD. Finally, cases were enrolled over a period of 13 years, and several different CT scanners were used; this in fact demonstrates that our approach can be applied to various scenarios in a real-life setting.
In conclusion, automatic evaluation of the lung CT volume, in patients with ILD and particularly with IPF, may be an alternative or a complementary biomarker to pulmonary function tests for assessment of lung volume loss, in clinical care as in randomized trials.

Fig. 6 Three-year survival Kaplan-Meier curve and log-rank test, as a function of a 7.9% threshold for the relative CTvol loss in the idiopathic pulmonary fibrosis population. Mean survival times are 1039 (SEM: 28) and 746 (SEM: 73) days respectively (2.8 versus 2.0 years)