Introduction

Chronic fibrosing interstitial lung diseases (ILDs) encompass a large and heterogeneous group of parenchymal lung disorders that may be related to either systemic diseases, environmental exposures, or have no known cause. Idiopathic pulmonary fibrosis (IPF) is a common form of chronic, progressive fibrosing ILD that requires treatment with anti-fibrotic drugs [1]. In secondary fibrosing ILDs, such as connective tissue disease-associated ILD and fibrotic hypersensitivity pneumonitis (HP), initial inflammation progresses to fibrosis [2]. Extensive inflammatory cell infiltration is an important histological feature suggestive of secondary ILDs, including connective tissue disease (CTD). Moreover, inflammatory cell infiltration is also part of the hallmark histologic findings of interstitial pneumonia with autoimmune features (IPAF) [3, 4]. In these secondary fibrosing ILDs, immunomodulatory medications play an essential role in the treatment strategy [5,6,7,8]. However, fibrotic ILDs frequently have non-specific and overlapping clinical and radiological features, and approximately 10–20% of patients with ILD are diagnosed with unclassifiable ILD [9]. Therefore, an accurate ILD diagnosis may sometimes require invasive procedures such as a surgical lung biopsy (SLB). However, SLB is associated with a substantial risk of complications [10], and thus, an alternative technique to evaluate cellular infiltration is warranted in the management of patients with ILD.

Radiomics involves converting medical images into high-dimensional data that reflect the underlying pathophysiology [11]. Radiomics features can effectively predict histologic subtypes of lung cancers [12, 13]. Moreover, some radiomics features have already been used for segmenting ILD lesions on computed tomography (CT) images [14]. Some reports have evaluated radiomics features in the patients with ILD [15, 16]. However, a direct comparison between CT radiomics features and histological specimens in the ILD patients has not been thoroughly performed. We hypothesized that CT radiomics features are related to cellular infiltration in SLB specimens of the ILD patients. Thus, this study aimed to create radiomics models to quantitatively and qualitatively predict the degree of cellularity based on the direct comparison of the CT radiomics features with the histology of fibrotic ILD.

Materials and methods

Study design and patients

This retrospective, single-center study was approved by the Institutional Review Board (Ethics Committee number KCRC-21-0037). The requirement for informed consent was waived owing to the retrospective nature of the study.

Patients with ILD who underwent thoracoscopic SLB between October 2018 and January 2022 were eligible (Fig. 1). We included 122 consecutive patients with fibrosing ILD who underwent CT examinations within 6 months before surgery. Patients with lung cancer (n = 12) and pulmonary infection (n = 3) were excluded, because these involve neoplastic cell proliferations. The patients who previously underwent thoracic surgery (n = 4), one with a foreign body in the lung (n = 1) and two with inadequate CT image quality (n = 2), were also excluded because of the potential undesirable artifacts. Baseline clinical measurements, including pulmonary function tests, were obtained within 3 months of the CT examinations. All baseline CT images were non-contrast, obtained using a thin-slice CT scanner (Aquilion Precision, Canon Medical Systems, Otawara, Japan) at full inspiration in the supine position with a tube voltage of 120 kVp and automatic tube current modulation. The median effective dose (interquartile range) of the CT scans was 8.23 (7.24–9.1) mSv. These data were calculated based on the dose–length products (mGy · cm) and a k-factor of 0.0140 (mSv · mGy−1 · cm−1). We used the CT images reconstructed using filtered-back projection and soft kernel, with 512 × 512 matrices and 0.5 mm slices. We automatically measured the total lung volume and the extent of each lesion using artificial intelligence-based software [Quantification by Ziosoft Informatics Platform for Interstitial Lung Disease (QZIP-ILD), Ziosoft, Tokyo, Japan] [17]. Two board-certified chest radiologists performed IPF and HP pattern classification according to the American Thoracic Society guidelines [1, 4] without clinical and pathological information. The findings were agreed upon in a consensus between the two radiologists.

Fig. 1
figure 1

Study flowchart. SLB, video-assisted thoracic surgery; SLB, surgical lung biopsy

Biopsy and histopathological/morphometrical analysis

One pathologist with 20 years of experience in ILD selected a representative hematoxylin–eosin-stained slide for each SLB site. The slides were scanned using a virtual slide scanner (NanoZoomer, Hamamatsu Photonics, Hamamatsu, Japan). Then, we counted the number of nucleated cells and measured the tissue area using public morphometric software (QuPath v0.2.3) [18]. We calculated the number of nucleated cells per unit area (1000 μm2). Among specimens from the 99 patients with two SLB sites, those with the higher cell count were used for the analysis. We compared the cell counts between patients treated with and without anti-inflammatory therapy. A multidisciplinary discussion (MDD) that included specialists in respiratory medicine, diagnostic radiology, and pathology determined whether anti-inflammatory therapy was indicated. Based on a comprehensive review of the patient's clinical course, encompassing a detailed chronological medical history, thorough physical examination, and a series of laboratory and imaging tests subsequent to the initial diagnosis of MDD, the diagnosis of unclassifiable disease was revised during the departmental conference in respiratory medicine.

Radiomics analysis and radiomics-based models

A schematic diagram of the region-of-interest (ROI) placement and construction of the specimen-based model and patient-based model is shown in Fig. 1. We positioned the ROI using surgical staples as a guide to indicate the biopsy site. The staple line measured 2–4 cm in length in the post-operative CT images. Therefore, we placed two volumes of interest (VOIs) to cover the staple line. Moreover, we measured the radiomics features. All VOIs were placed by an expert radiologist with 29 years of experience or a pneumologist with 18 years of experience. The center points of the two spherical VOIs with 2 cm diameter were manually placed to include a maximum volume of the interstitial lesion, as well as the portion removed by SLB in each patient. In usual interstitial pneumonia (UIP) pattern fibrosis, the lesion existed just below the pleura. Therefore, VOIs included some extrapulmonary regions. When the sphere contained extrapulmonary regions, these regions were automatically removed using QZIP-ILD [17]. All radiomics features were calculated using in-house software based on PyRadiomics (version 3.0.1; https://pyradiomics.readthedocs.io/en/latest/). In total, 107 radiomics features from the category of the histogram and gray-level co-occurrence matrix were extracted from each VOI and the whole lung.

The radiomics features obtained from the two VOIs placed for each SLB site were labeled as “SLB.a” and “SLB.b.” The order of labeling was arbitrary. Radiomics features extracted from regions obtained from whole lungs were labeled “_WL.” We built two radiomics-based models. One model predicts cell counts in the histological specimens (specimen-based model) with “SLB.a” and “SLB.b.”-labeled radiomics features, and another one predicts patients with abundant cellular infiltration in their histology specimens (patient-based model) with “SLB.a”, “SLB.b.” and “WL.”-labeled radiomics features.

Step-1: relationship of specimen-based model and histological cell count

To build the specimen-based model, 199 VOIs were divided into two datasets: training (140 specimens) and validation (59 specimens). A linear regression model with the least absolute shrinkage and selection operator (LASSO) [19] was applied to simultaneously construct a support vector machine (SVM) to predict the number of cells per unit area in surgical specimens and select significant radiomics features in the training dataset. A discordant pathological diagnosis between different biopsy sites is frequently observed in the same patient [20]. Therefore, we treated the specimens obtained from different biopsy sites in one patient as different samples.

Step-2: relationship of patient-based model and the dichotomized classification of cellularity

In clinical settings, the level of cellularity, whether high or low, is one of the crucial factors in justifying the indication for an anti-inflammatory therapy. Therefore, the patient-based model to classify the patient to high or low cellularity was developed. We divided patients into two groups, one with high cell counts and the other one with low cell counts. The median value of the histological cell counts across all patients was used to define the cut-off value to dichotomize the degree of cellularity: 50 high-cellularity cases and 50 low-cellularity cases. Seventy of these 100 cases were randomly selected and used as training datasets to build the model. Thirty separate cases were used as test data for external validation. We constructed this model using SVM with LASSO, the same as for the specimen-based model in Step 1.

Statistical analysis

Continuous patient characteristics are summarized using descriptive statistics (n, mean, median, standard deviation, and interquartile range), while categorical patient characteristics are summarized using frequency (percentage). The histological cell counts were compared between the two groups, i.e., the anti-inflammatory therapy group and the non-anti-inflammatory therapy group, using Student’s t test.

In the specimen-based prediction model, for the evaluation of the relationship between the actual and predicted number of cells, the estimated model was evaluated using a root-mean-square error in the validation dataset, and the correlation coefficient (R) and p value of the linear fit were calculated. The R package ‘glmnet’ (https://cran.r-project.org/) was used for the model construction.

In the patient-based prediction model, the median cell count was used as a cut-off to classify patients into two groups, and the model was constructed using the R package ‘glmnet’. The model’s performance was evaluated by the F1 score and the accuracy of the result of validation.

To compare the radiomics features between the specimen-based and patient-based models, the top five features in regression coefficient value were selected for each of the specimen-based model and patient-based model in the order of absolute value, and a heat map was created to show the correlation between them.

Results

Patient characteristics

A total of 100 patients (61 men and 39 women; mean age, 62 ± 8.9 years) were included in this study (Fig. 1). None of the patients had received treatment with steroids or other immunosuppressants. Table 1 shows the patient characteristics, including respiratory function test and MDD results. Table 2 shows the CT pattern classification according to the American Thoracic Society guidelines [1, 4] and ILD volume analysis using Quantification by Ziosoft Informatics Platform for Interstitial Lung Disease. Based on the visual interpretation by the radiologists, 66 patients had indeterminate UIP, 2 showed probable UIP patterns, and 32 showed alternative diagnoses. In all the cases of alternative diagnoses, fibrosis was observed in histological findings. After MDD, the diagnosis of these patients was established as CTD (n = 16), HP (n = 27), IPF (n = 19), unclassifiable ILD (n = 32) that included two cases that met IPAF criteria, idiopathic non-specific interstitial pneumonia (n = 5), and desquamative interstitial pneumonia (n = 1). Two patients initially determined to have unclassifiable disease in the MDD were later diagnosed with connective tissue diseases, including rheumatoid arthritis and systemic scleroderma, during follow-up. Similarly, one patient was later identified as having fibrotic HP. Among these 100 patients, 45 were treated with anti-inflammatory therapy according to the MDD.

Table 1 Patient characteristics
Table 2 CT characteristics

Cell number in SLB and anti-inflammatory therapy

Ninety-nine patients had two SLB sites (upper and lower), and 1 had only one SLB site due to adhesions. The median and interquartile range of the number of cells were 2.5/1000 μm2 (2.16–3.06) and 3.1/1000 μm2 (2.39–3.79) in the upper and lower SLB sites, respectively. Finally, 199 pathological specimens from 100 patients were included in the study. Figure 2 shows two representative cases: a patient with unclassifiable ILD with a large amount of cellular infiltration (Fig. 2a) and a patient with IPF showing peri-lobular fibrosis without marked cellular infiltration (Fig. 2b). The cell count in the specimens of the patients with indication for steroid treatment in the MDD was significantly greater than that in the patients without indications for anti-inflammatory therapy (p < 0.001) (Fig. 3).

Fig. 2
figure 2

CT and histological slides of representative cases. (a) A 68-year-old man with high cell counts per unit area (4.32 cells/1000 μm2) necessitating anti-inflammatory therapy and is classified by the model as belonging to the high-cellularity group. (b) A 63-year-old man with low cell counts per unit area (2.32 cells/1000 μm2), for whom anti-inflammatory therapy is not indicated, who is classified by the model into the low-cellularity group

Fig. 3
figure 3

Box-and-whisker plot of cell counts for the groups according to eligibility for anti-inflammatory therapy in the multidisciplinary discussion

Step-1: relationship of specimen-based model with histological cell count

We constructed a model to predict cell count in the specimens. In the training dataset, 14 radiomics features were selected (Supplementary Table 1). Figure 4 shows the top 10 radiomics features used in this model. The most significant coefficient was “original_glcm_ClusterShade_SLB.a,” followed by “original_firstorder_Median_SLB.a.” Fig. 5 shows the correlation between the predicted cell count using this radiomics model and the actual cell count measured in the external validation set consisting of 59 specimens. The root-mean-square error was 0.797. The p value between predicted and measured values was less than 0.001, and the correlation coefficient was 0.48.

Fig. 4
figure 4

The top 10 radiomics features of the specimen-based model. gldm, Gray Level Dependence Matrix; glcm, Gray Level Co-occurrence Matrix; IMC, Informational measure of correlation; glszm, Gray Level Size Zone Matrix; gldm, Gray Level Dependence Matrix

Fig. 5
figure 5

External validation of the specimen-based radiomics model. The horizontal axis is the number of cells predicted by the model, and the vertical axis is the actual number of cells. SLB, surgical lung biopsy

Step-2: relationship of patient-based model with the dichotomized classification of cellularity

The median number of cells in the patient was 3.27 (/1000 μm2). The patients were divided into two groups, 50 patients in each, one with the higher cell count and the other one with lower cell count. We created a patient-based model predicting higher or lower cellularity using 70 training sets chosen randomly from all sets. Supplementary Table 2 shows patient backgrounds in the training and test dataset. The training dataset had a significantly higher percentage of DLCO, but there was no difference in other patient backgrounds including quantitative CT results by QZIP-ILD. As a result, 22 radiomics features were selected to predict patients with higher cellularity (Supplementary Table 3). The top 10 selected radiomics features are shown in Fig. 6. The performance of this model was evaluated using external validation. The accuracy was 70.0%, F1 score 0.73, and the area under the ROC curve 0.70.

Fig. 6
figure 6

Top 10 radiomics features of the patient-based model. glcm, Gray Level Co-occurrence Matrix; gldm, Gray Level Dependence Matrix; ngtdm, Neighboring Gray Tone Difference Matrix

Radiomics features’ correlation

Some radiomics features in these models were correlated with each other. Specifically, “original_firstorder_Median_SLB.a” and “original_firstorder_Median_SLB.b” were negatively correlated with “original_gldm_LowGrayLevelEmphasis_SLB.b”, “original_firstorder_TotalEnergy_SLB.b” and “original_gldm_SmallDependenceLowGrayLevelEmphasis_SLB.b”, and positively correlated with “original_firstorder_InterquartileRange_SLB.a” (Supplementary Fig. 1).

Discussion

This study directly compared the cell count in histological specimens obtained using surgical biopsy with the CT radiomics features of the biopsy sites in patients with ILD. We constructed a CT radiomics model to predict the cell count in 140 histological specimens used as training datasets. This model predicted the actual cell count measured in 59 specimens used for external validation (root-mean-square error: 0.797). Given that the median cell count per 1000 μm2 was 2.5 in the group without indication for anti-inflammatory therapy and 3.1 in that with indication, these numbers are not sufficient to determine a treatment strategy on their own. However, it is suggested that radiomics can predict cell counts, albeit with insufficient accuracy in this particular case. It is expected that the accuracy will increase with the increase in the number of n and number of facilities in the future and that the method will become more practical. The CT radiomics model, trained using the biopsy site and whole lung data, to predict higher cellular infiltration showed good performance, with F1 score of 0.73 in the external validation. To the best of our knowledge, this is the first study to compare CT radiomics features with histological specimens directly in patients with ILD.

In this study, approximately two-thirds of the patients showed an “indeterminate for UIP” CT pattern; these patients could have had either IPF, HP, CTD, or unclassifiable ILDs [21]. In these situations, histological findings may support the diagnosis, e.g., granuloma and giant cells may suggest HP [22, 23], and the presence of inflammatory cellular infiltration may support CTD or IPAF [3]. Mäkelä et al. reported that high numbers of interstitial mononuclear inflammatory cells and intra-alveolar macrophages were associated with prolonged survival in patients with IPF in the Finnish IPF registry [24]. Treatment has not been completely established for these unclassifiable ILDs. However, Yamano et al. reported that immunosuppressive therapy provided better survival for the patients with IPAF whose histological findings were similar to UIP and inflammatory cellular infiltration than anti-fibrotic treatment [5].” These studies support the importance of inflammatory cellular infiltration in patients with fibrotic ILD. Our results showed that CT radiomics could predict the degree of inflammatory cell infiltration, and we posit that CT radiomics has the potential to help clinicians choose treatment strategies for ILDs.

In the specimen-based model, using CT radiomics features at the biopsy sites, original_glcm_ClusterShade_SLB.a, original_firstorder_Median_SLB.a, original_firstorder_Median_SLB.b, and other features were selected. Original_firstorder_Median directly reflects the CT attenuation values. Higher cellularity will likely increase CT attenuation values in the lesion. Our result is consistent with that of a previous report that showed a positive correlation between the cellularity of the surgical specimens and CT attenuation values at the biopsy site in patients with ILDs [25]. Original_glcm_ClusterShade was one of the features that showed a sharp boundary of the lesion, which tended to be slightly lower in SLB sites with higher cellularity. The typical histological feature of UIP encompasses spatial heterogeneity, where normal lung tissue exists next to peri-lobular fibrosis with a sharp boundary, and cellular infiltration is low, culminating in elevated levels of Original_glcm_ClusterShade [23]. Conversely, it was postulated that regions exhibiting higher cellularity might display diminished demarcation, consequently yielding reduced levels of Original_glcm_ClusterShade.

Meanwhile, non-specific interstitial pneumonia commonly seen in ILD with higher inflammatory cellularity shows a uniform lesion distribution, and the boundary between fibrosis and normal lung tissue is ill defined [26]. In the patient-based model, the top 10 selected features were from SLB. The most significant feature selected was original_gldm_LowGrayLevelEmphasis_SLB.b. This feature exhibited higher values when low CT values were uniformly distributed throughout the area. Ground-glass opacities, commonly seen in lesions with substantial cellular infiltration, are well known. Therefore, it is plausible that this imaging characteristic effectively captures the essence of such lesions, indicating a homogenous presence of cellular infiltrates as reflected in uniformly low CT values.

After creating the models, we needed to analyze the correlation of each of the automatically selected radiomics features in each model, because it was necessary to examine why different radiomics features were selected. Most of the features were different between the specimen-based and patient-based models. These radiomics features were correlated with each other, and some degree of correlation was observed when the influential features of the two models were compared (Supplementary Fig. 1). Specifically, “original_firstorder_Median_SLB (.a and.b)” and “original_firstorder_TotalEnergy_SLB.b” were strongly negatively correlated with each other. These correlated features may have a similar role, which explains the inconsistent feature selection.

This study had some limitations. First, this was a retrospective study performed in a single center, and we used CT images obtained with one type of CT scanner, which limited the generalizability of our results. Although cohort reliability is questionable due to the bias from being a single site, it was difficult to calibrate, because we used SVM to build the predictive model. Other than a significant difference in DLco, there were no differences between the training and test datasets. Future studies should include multicenter studies and predictive models that allow for calibration to show sufficient reliability for the cohort. Second, the patients with high inflammatory cell infiltration were grouped together in our study. It should be more practical to develop a model that detects cases with a UIP pattern fibrosis but with a severe lymphocytic infiltration. However, it was difficult to create such a model owing to the small sample size. In addition, SLB was performed in our study population for the diagnosis of ILD, because non-invasive imaging diagnosis was difficult, and the initial treatment was not effective enough even in the patients who were finally confirmed as HP and DIP. Therefore, our radiomics model could be biased toward the patients in the critical or complicated situation. In the patients with typical HP and DIP, antigen avoidance for HP and smoking cessation for DIP are the basic first treatment strategy, but anti-inflammatory therapy may be a treatment option in a case with high inflammatory cell infiltration, and we believe that our model might potentially provide a useful information in HP and DIP. Moreover, our study's limited population size and technical limitations precluded the development of a more sophisticated model, such as one involving optimal subset causality regression [27]. This is in contrast with Schniering et al., who employed CT radiomics to analyze scleroderma-associated ILDs in a dataset of 156 samples with 1386 features [15]. Furthermore, the cut-off values for the two categories in the patient-based model might have been better had they been set to optimize the more clinically important criteria, such as the indication for anti-inflammatory treatment. However, owing to the low number of cases, a biased number of cases in the test data for the two categories caused by an off-median cut-off would have prevented a predictive model from being created. Third, our model could not distinguish between high lymphocytic and high macrophage infiltration. Further studies should be conducted for more detailed evaluation of lung inflammation. Despite this, the significant differences observed in the indications for anti-inflammatory treatment when total nucleated cells were evaluated indicate that our analysis method is still valid. Finally, our study was a pilot study; a prospective, multicenter study with a larger population is necessary to overcome the study limitations.

In conclusion, our direct comparison of CT radiomics features with histological specimens from SLB enabled the development of predictive models to assess cellular infiltration in ILD. The radiomics features identified provide valuable insights for both quantitative and qualitative evaluations of cellular infiltration severity in ILD. Leveraging these features in quantitative CT analysis could potentially compensate for the limitations of visual CT interpretation and SLB.