Introduction

In recent years, deep learning approaches have been explored in various fields in radiology [1, 2]. With its excellent natural contrast to the surrounding structures, the lung is one of the most promising organs for the application of deep learning algorithms. Specifically, the relatively high Hounsfield unit (HU) values of vessels compared with the lung parenchyma and airway may make pulmonary vessel segmentation more favorable than in other organs. Moreover, extracting pulmonary vessels from noncontrast chest CT may reduce the workload for central nodule detection or mediastinal lymph node evaluation, and could be applied to various volume measurement tasks.

Pulmonary vessel extraction has been attempted in previous studies, mainly using mathematical modeling on contrast-enhanced CT images, including HU thresholding and connection-detecting techniques [3, 4]. However, these techniques are hardly applicable to noncontrast CT scans, in which the HU contrast between the lung and vessels is reduced. A deep learning approach might be attempted, but a major challenge is producing enough vessel maps on noncontrast CT images for training [3, 4]. If spatiotemporally matched noncontrast and contrast-enhanced CT scans could be obtained simultaneously, vessel maps generated from the enhanced CT images could be used in place of maps drawn on the noncontrast scans. In this respect, dual-energy CT may provide a solution, as virtual noncontrast images can be generated from enhanced scans.

The purpose of our study was to develop and validate a deep learning–based automatic pulmonary vessel segmentation algorithm for noncontrast chest CT images (DLVS). We generated virtual noncontrast scans from CT pulmonary angiography images using a dual-source CT, and used them to train a deep learning algorithm to segment vessel maps from noncontrast CT scans. To examine the clinical role of the algorithm, we additionally explored the impact of DLVS in assessing vascular remodeling in chronic obstructive pulmonary disease (COPD) patients, in whom the loss of microvasculature is known to be associated with the pathogenesis of the disease [5,6,7].

Materials and methods

This retrospective study was approved by Seoul National University Hospital institutional review board, and the requirement for patients’ informed consent was waived. One coauthor (S.J.P.) is a founder and CEO of MedicalIp, but did not have control over any of the data submitted for publication.

Development of DLVS

For the development of DLVS, 104 pulmonary CT angiograms (49,054 slices) scanned using a dual-source scanner (Somatom Force; Siemens Healthineers) from 104 patients taken between September 2017 and February were collected. From the 80-keV and 150-keV angiography images, virtual 0.7-mm-section 50-keV contrast-enhanced images and virtual noncontrast images were produced. From the 50-keV CT images, the pulmonary vessels were segmented in a semi-automatic manner, using a thresholding technique, followed by a novel graph-cut algorithm (eFigure 1) [8].

DLVS was trained using each virtual noncontrast CT image as an input and the spatiotemporally matched vessel map as an output. All input images were windowed with a window width of 2,500 and a window level of 150 before normalization. To decrease false-positive results, 5-fold data augmentation was performed by adding false nodules to each scan (number of false nodules: 20–60; size: 2–10 mm). Training was conducted in 2 steps: (a) an algorithm generating vessel maps from virtual noncontrast scans was trained (pre-DLVS), and (b) another algorithm producing vessel maps from the union of pre-DLVS results and ground-truth vessel maps was trained. For both training steps, a 3-dimensional (3D) U-Net based neural network was used, receiving an input size of 512 × 512 × 8 and using 3 encoders and 3 decoders (eFigure 2). Except for the final convolution (1 × 1 × 1 convolution), every convolutional layer consisted of a 3 × 3 × 3 convolution, followed by a rectified linear unit and group normalization [9]. Detailed information is presented in the supplement.
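The windowing step described above can be illustrated with a minimal sketch. It assumes that the window values are in HU and that "normalization" means linear rescaling of the clipped values to the range [0, 1]; the exact normalization used for DLVS is not stated here, and the function name is hypothetical.

```python
def window_and_normalize(hu_values, width=2500.0, level=150.0):
    """Clip HU values to a window and rescale them linearly to [0, 1].

    Assumption: min-max rescaling over the window; the source only states
    that windowing (width 2,500, level 150) preceded normalization.
    """
    lower = level - width / 2.0  # -1100 for the stated window
    upper = level + width / 2.0  # 1400 for the stated window
    normalized = []
    for hu in hu_values:
        clipped = min(max(hu, lower), upper)
        normalized.append((clipped - lower) / width)
    return normalized


if __name__ == "__main__":
    # Hypothetical voxel values: below the window, at its edges and center, above it.
    sample = [-2000.0, -1100.0, 150.0, 1400.0, 3000.0]
    print(window_and_normalize(sample))
```

With such a wide window, air, lung parenchyma, soft tissue, and most bone all fall inside the clipping range, so little intensity information is discarded before the network sees the image.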

Internal validation

Internal validation was performed using 10 pulmonary CT angiography scans from 10 patients, whose inclusion criteria were the same as those in the development dataset. Vessel maps from these 10 images were generated by the same method used for the development dataset. To validate the vessel segmentation performance of DLVS, the Dice coefficient was calculated for each case [10, 11]. The total vascular volume and the volume of the vessels with a cross-sectional area < 5 mm² were measured and compared between the ground-truth vessel maps and the DLVS results. Additionally, 1,000 points were randomly selected from both inside and outside the area of the pulmonary vessels, and the probability score for each point was measured [12, 13].
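For reference, the Dice coefficient between a predicted and a ground-truth binary mask is 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch on flattened masks; the function name and the handling of two empty masks are illustrative choices, not details taken from the study.

```python
def dice_coefficient(pred_mask, truth_mask):
    """Dice = 2 * |A intersect B| / (|A| + |B|) for two flattened binary masks."""
    if len(pred_mask) != len(truth_mask):
        raise ValueError("masks must contain the same number of voxels")
    intersection = sum(1 for p, t in zip(pred_mask, truth_mask) if p and t)
    total = sum(1 for p in pred_mask if p) + sum(1 for t in truth_mask if t)
    if total == 0:
        # Both masks empty: treated as perfect agreement (a convention, assumed here).
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    # Hypothetical 6-voxel masks: 2 voxels overlap, each mask has 3 vessel voxels.
    predicted = [1, 1, 0, 0, 1, 0]
    reference = [1, 0, 0, 0, 1, 1]
    print(dice_coefficient(predicted, reference))
```

In practice the 3D masks would be flattened (or the sums taken over the full volume) before applying the same formula.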

Validation of DLVS

External validation

For external validation, a temporally and vendor-independent dataset was collected (SNUH dataset). Among 63 patients who underwent both pre- and post-contrast-enhanced chest CT simultaneously for the purpose of pre-bronchoscopy evaluation at Seoul National University Hospital between March and December 2019, CT scans from 14 patients (mean age 67.4 ± 10.9 years [range 41–82 years]; 5 men and 9 women) were selected, all with lung parenchymal diseases. For each case, 200 points (100 intravascular and 100 extravascular) were selected from noncontrast CT images and labeled as either intravascular or extravascular, referring to the simultaneously taken contrast-enhanced CT images. Among the 100 intravascular points, 40 points were selected within the segmental artery (n = 20) or vein (n = 20) and 60 points within small subsegmental vessels with a diameter of less than 2 mm. The 100 extravascular points were selected within the lung parenchyma (n = 60), bronchial wall (n = 20), or intra-lesional area (n = 20). Intra-lesional points were selected inside parenchymal abnormalities (i.e., consolidation, ground-glass opacities, nodules, or atelectasis). For each point, the probability score of DLVS, its decision (intravascular vs. extravascular), and the HU value were recorded. Additionally, an open dataset from the VESSEL12 challenge (n = 3) was used [14]. The probability scores of DLVS for the referenced 876 points were used (278 intravascular points and 598 pulmonary parenchymal points).

Assessment of vascular remodeling in the COPD low-dose CT cohort

To include COPD patients, all patients whose forced expiratory volume in 1 s (FEV1) divided by forced vital capacity (FVC) was less than 0.7 on a post-bronchodilator pulmonary function test (PFT) performed between 2014 and 2015 were identified (n = 2,204) and classified using the Global Initiative for Chronic Obstructive Lung Disease (GOLD) criteria (GOLD 1, FEV1% > 80; GOLD 2, FEV1% 50–80; GOLD 3, FEV1% 30–50; and GOLD 4, FEV1% < 30). Of these patients, 372 underwent low-dose chest CT within 1 month after the PFT examination. Ninety-one patients with superimposed active lung diseases that may affect pulmonary vascularity were excluded, as follows: pneumonia or active tuberculosis (n = 71), malignancy (n = 11), empyema (n = 4), interstitial lung disease (n = 4), and pneumothorax (n = 1). Finally, 281 low-dose CT scans from 281 patients were included (mean age 67.3 ± 9.31 years [range 42–88 years]; 256 men and 25 women). Among them, 234 patients had a measured diffusing capacity of the lung for carbon monoxide (DLCO). Vendor and CT parameter information is provided in eTable 1. To evaluate vascular remodeling, the total pulmonary vessel volume, the volume of vessels with a cross-sectional area < 5 mm² (PVV5), and %PVV5, defined as PVV5 divided by the total pulmonary vascular volume, were calculated from thin-slice images (< 1.5 mm). The intra-lung area was calculated using a lung segmentation algorithm [15]. Additionally, the emphysema index was calculated for each patient from 3-mm-thick soft kernel images, as the percentage of lung voxels showing attenuation below −950 HU [16, 17]. The correlations of PVV5 and %PVV5 with GOLD categories, DLCO, and the emphysema index were explored and compared.
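The cohort definitions above can be expressed as a short sketch. Several details are assumptions made for illustration: boundary values of FEV1% (exactly 80, 50, or 30) are assigned to the milder category, %PVV5 is expressed as a percentage, and all function names are hypothetical.

```python
def gold_stage(fev1_percent_predicted):
    """Map FEV1 (% predicted) to a GOLD category for patients with FEV1/FVC < 0.7.

    Assumption: boundary values fall into the milder category.
    """
    if fev1_percent_predicted >= 80:
        return 1
    if fev1_percent_predicted >= 50:
        return 2
    if fev1_percent_predicted >= 30:
        return 3
    return 4


def percent_pvv5(pvv5_ml, total_vessel_volume_ml):
    """%PVV5: small-vessel volume divided by total vessel volume (as a percentage)."""
    if total_vessel_volume_ml <= 0:
        raise ValueError("total vessel volume must be positive")
    return 100.0 * pvv5_ml / total_vessel_volume_ml


def emphysema_index(lung_hu_values, threshold_hu=-950.0):
    """Percentage of lung voxels with attenuation below the threshold (-950 HU)."""
    if not lung_hu_values:
        raise ValueError("no lung voxels provided")
    below = sum(1 for hu in lung_hu_values if hu < threshold_hu)
    return 100.0 * below / len(lung_hu_values)


if __name__ == "__main__":
    # Hypothetical patient values, for illustration only.
    print(gold_stage(65.0))
    print(percent_pvv5(60.0, 120.0))
    print(emphysema_index([-980.0, -960.0, -900.0, -850.0]))
```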

Statistical analysis

The classification performance of DLVS in terms of detecting intravascular areas was evaluated using the area under the receiver operating characteristic curve (AUROC). Upper/lower reproducibility limits (URL/LRL) were evaluated for PVV5, %PVV5, and the total vascular volume on internal validation, and Bland-Altman analysis was conducted. The correlations of PVV5 and %PVV5 with the GOLD indices were assessed using the Spearman rho coefficient. Differences in PVV5 and %PVV5 between GOLD 1–2 and GOLD 3–4 patients were evaluated using the independent t test. Statistical analyses were performed with scikit-learn 0.19.0 [18] and MedCalc version 15.8. Comparison of significant Spearman rho coefficients was conducted using an online calculator (http://quantpsy.org) following Steiger’s method [19].
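To make two of these measures concrete, the sketch below computes the AUROC as the probability that a randomly chosen intravascular point scores higher than a randomly chosen extravascular point (ties counted as one half), and the Bland-Altman 95% limits of agreement as the mean difference ± 1.96 standard deviations. This is a plain-Python illustration with hypothetical function names, not the scikit-learn or MedCalc routines used in the study.

```python
import math


def auroc(scores_positive, scores_negative):
    """AUROC via pairwise comparison (equivalent to the Mann-Whitney U statistic)."""
    if not scores_positive or not scores_negative:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


def bland_altman_limits(measured, reference):
    """Return (lower limit, mean difference, upper limit) of agreement."""
    diffs = [m - r for m, r in zip(measured, reference)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two paired measurements are required")
    mean_diff = sum(diffs) / n
    sd = math.sqrt(sum((d - mean_diff) ** 2 for d in diffs) / (n - 1))
    return mean_diff - 1.96 * sd, mean_diff, mean_diff + 1.96 * sd


if __name__ == "__main__":
    # Hypothetical probability scores and volumes, for illustration only.
    print(auroc([0.9, 0.8, 0.4], [0.5, 0.3, 0.1]))
    print(bland_altman_limits([10.0, 12.0, 14.0], [9.0, 12.0, 13.0]))
```

The pairwise loop is quadratic in the number of points, which is acceptable for a few thousand points per class; a rank-based formulation gives the same value more efficiently.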

Results

Internal validation

Among the 10 cases included in the internal validation dataset, only 2 showed relatively clean lungs, while the other 8 cases showed parenchymal abnormalities, including numerous metastatic nodules (n = 1), ground-glass infiltration (n = 3), multifocal consolidation (n = 2), mass with pneumonia (n = 1), and multiple embolization coils (n = 1). The mean Dice coefficient between the DLVS results and ground-truth vessel maps was 91.5 ± 3.17 (93.1 ± 0.17 for healthy lungs and 91.1 ± 3.47 for diseased lungs). Both the total vascular volume and PVV5 measured from DLVS results showed < 2% error rates relative to the ground-truth vessel maps, and showed strong correlations (Spearman rho coefficient > 0.96, p < .001 for both; eTable 1). On Bland-Altman plots, all 10 points were located within the 95% CI limits of difference for total vascular volume, PVV5, and %PVV5 (eFigure 3). For discriminating the 2,000 randomly selected points per case (1,000 intravascular and 1,000 extravascular points), DLVS yielded an AUROC of 0.995. DLVS correctly classified 94.3% of all points (18,867/20,000), 89.5% of points from the intravascular area (8,952/10,000), and 99.2% from the extravascular area (9,915/10,000) (Table 1).
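The point-wise figures reported above follow directly from the counts. As a worked check, the sketch below derives sensitivity, specificity, and overall accuracy from the reported numbers of correctly classified intravascular and extravascular points; the function name is hypothetical.

```python
def pointwise_metrics(true_pos, false_neg, true_neg, false_pos):
    """Sensitivity, specificity, and accuracy from point-level counts."""
    positives = true_pos + false_neg
    negatives = true_neg + false_pos
    if positives == 0 or negatives == 0:
        raise ValueError("both classes need at least one point")
    return {
        "sensitivity": true_pos / positives,
        "specificity": true_neg / negatives,
        "accuracy": (true_pos + true_neg) / (positives + negatives),
    }


if __name__ == "__main__":
    # Counts from the internal validation: 8,952 of 10,000 intravascular and
    # 9,915 of 10,000 extravascular points were classified correctly.
    metrics = pointwise_metrics(true_pos=8952, false_neg=1048,
                                true_neg=9915, false_pos=85)
    for name, value in metrics.items():
        print(f"{name}: {100 * value:.2f}%")
```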

Table 1 Vendor and CT parameter information for the COPD low-dose CT cohort

External validation results

In external validation performed with 14 noncontrast CT scans (SNUH dataset), the AUROC of DLVS was 0.977 for 2,800 manually selected points (Table 2). It successfully classified 99.1% (1,387/1,400) of intravascular points, including 99.0% (832/840) of the points within small vessels (diameter < 2 mm). For the extravascular areas, 93.5% (1,309/1,400) of the points were correctly classified as non-vessel by DLVS. Specifically, 100% of normal lung points were correctly mapped, while 15.7% (44/280) of the points within the bronchial wall were misclassified. For the intra-lesional areas, 83.2% (233/280) of the points were accurately classified. Although > 90% of points were correctly classified for calcified nodules (92.6% [25/27]), consolidation (90.7% [49/54]), and ground-glass opacities (95.4% [83/87]), DLVS showed decreased accuracy for points within linear atelectasis (86.5% [32/37]) and demonstrated suboptimal results for noncalcified nodules (accuracy 58.7% [44/75]; Table 3). Representative cases are presented in Fig. 1.

Table 2 Internal and external validation results of DLVS
Table 3 Detailed performance of DLVS on the SNUH dataset
Fig. 1

Examples of DLVS results in the external validation dataset. a DLVS successfully segmented pulmonary vessels inside a part-solid nodule on noncontrast CT. b DLVS successfully segmented small vessels, without yielding false positive results for nodules or consolidation. c DLVS detected small vessels passing through the multicystic mass

For the VESSEL12 challenge dataset, DLVS showed an AUROC of 0.969. Its diagnostic accuracy was 84.1% (736/876), 45.6% from the intravascular areas (127/278) and 100% from the non-vessel areas (598/598) (Table 2).

Assessment of vascular remodeling from low-dose CT of COPD patients

Among 281 COPD patients confirmed from post-bronchodilator PFT, 166 were categorized as GOLD 1, 98 as GOLD 2, and 17 as GOLD 3. No patients were categorized as GOLD 4. Both DLVS-derived volume parameters (PVV5 and %PVV5) tended to be lower in patients with a higher GOLD index, and statistical significance was found for PVV5, although the correlation was weak (Spearman rho, 0.20). A marked difference was found between GOLD 1 or 2 patients and GOLD 3 patients: the mean values of both volume parameters differed significantly (p < .01 for both; Table 2). PVV5 showed an AUROC of 0.804 (optimal threshold, 61.6 mL) in differentiating GOLD 3 from GOLD 1 or 2 patients (Table 2 and Fig. 2). Examples are shown in Fig. 3. PVV5 showed a significant correlation with both absolute and %predicted DLCO. Both PVV5 and %PVV5 were significantly correlated with the emphysema index (Table 4 and Fig. 4), and %PVV5 showed a significantly stronger correlation (Spearman rho, 0.37 vs. 0.17; p < .001). Among the indices that were significantly correlated with PVV5, DLCO (%predicted) showed a higher Spearman rho (0.32) than the GOLD criteria (0.20; p = .02) or the emphysema index (0.17; p = .004).
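The Spearman rho values above are Pearson correlations computed on ranks. Because GOLD categories contain many tied values, ties must receive average ranks. The following plain-Python sketch illustrates the computation; it is not the software used in the study, and the example values are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rho: Pearson correlation of the average ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical GOLD categories and PVV5 values (mL), for illustration only.
    gold = [1, 1, 2, 2, 3, 3]
    pvv5 = [88.0, 92.0, 80.0, 75.0, 60.0, 58.0]
    print(spearman_rho(gold, pvv5))
```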

Fig. 2

Performance of PVV5 and %PVV5 in differentiating GOLD 3 patients from GOLD 1–2 patients. The areas under the receiver operating characteristic curve were 0.804 and 0.715, respectively

Fig. 3

Examples of DLVS results for low-dose CT of COPD patients. PVV5 and %PVV5 both tended to decrease as the patients’ GOLD categorization increased. The red vessels have a cross-sectional area ≥ 5 mm², while the green vessels have a cross-sectional area < 5 mm²

Table 4 Vascular volume analysis of DLVS from the COPD low-dose CT cohort
Fig. 4

Plots showing correlations between the emphysema index and vascular volume parameters calculated from DLVS for the COPD low-dose CT cohort: a PVV5 and b %PVV5

Discussion

We developed an automatic pulmonary vessel segmentation algorithm from noncontrast chest CT by utilizing spatiotemporally matched CT pulmonary angiography images for vessel map generation. DLVS showed promising results in discriminating intravascular and extravascular areas on the internal and external validation datasets, even for diseased lungs. On low-dose CT scans from COPD patients, the DLVS-measured pulmonary small vessel area was significantly correlated with patients’ GOLD criteria, DLCO, and emphysema index, and successfully differentiated GOLD 3 patients from GOLD 1 or 2 patients.

Vessel map generation has been a major hurdle in developing an automatic vessel segmentation algorithm [3, 4], especially from noncontrast CT images. We were able to produce vessel maps paired with spatially matched noncontrast CT by generating virtual noncontrast scans from CT pulmonary angiography images acquired with a dual-source CT scanner. The training dataset of DLVS was inevitably homogeneous, as all scans were obtained using a device from a single vendor with the same protocol. However, DLVS worked well in the external validation dataset and the COPD low-dose CT dataset, which were acquired with scanners from various vendors under heterogeneous settings (including different reconstruction kernels), suggesting that generalization was accomplished. We used a 3D U-Net model with group normalization and a modified first max pooling size, which successfully enhanced the performance of DLVS. Group normalization is important for 3D U-Net training with a small batch size, and the first max pooling size was modified to 1 × 2 × 2 to preserve the data in the z-axis [9]. We also performed 2-step training, adding 1 more neural network that took the pre-DLVS result as an input, to reduce false positives. Although the detailed results are not presented, passing the output through this additional network yielded a roughly 8% reduction in false positives on noncalcified nodules.
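The reason group normalization suits small-batch training is that its statistics are computed within each sample, over groups of channels, rather than across the batch. The sketch below illustrates this for a single sample in plain Python. It omits the learnable scale and shift parameters, and the function name and data layout (a list of channels, each a flat list of voxel values) are assumptions made for illustration, not the DLVS implementation.

```python
import math


def group_norm(channels, num_groups, eps=1e-5):
    """Normalize one sample's feature maps group by group.

    channels: list of C flat lists (one per channel) of voxel activations.
    Each group of C / num_groups channels is shifted to zero mean and scaled
    to (approximately) unit variance using only that sample's own values.
    """
    num_channels = len(channels)
    if num_groups <= 0 or num_channels % num_groups != 0:
        raise ValueError("number of channels must be divisible by num_groups")
    per_group = num_channels // num_groups
    normalized = []
    for g in range(num_groups):
        group = channels[g * per_group:(g + 1) * per_group]
        values = [v for channel in group for v in channel]
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        scale = 1.0 / math.sqrt(variance + eps)
        for channel in group:
            normalized.append([(v - mean) * scale for v in channel])
    return normalized


if __name__ == "__main__":
    # Hypothetical 4-channel feature map with 2 voxels per channel, 2 groups.
    features = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0], [30.0, 40.0]]
    for channel in group_norm(features, num_groups=2):
        print(channel)
```

Because no statistic depends on other samples in the batch, the result is identical whether the batch holds one volume or many, which matters when large 3D inputs force very small batches.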

A strength of DLVS is that it showed good performance in diseased lungs. Several studies have reported good performance in automatic pulmonary vessel segmentation, but the performance of the algorithms for diseased lungs was not thoroughly evaluated [13, 20]. DLVS exhibited an excellent Dice score for both healthy and diseased lungs on internal validation and showed < 10% false positives for most parenchymal lung lesions, including consolidation and ground-glass opacity, while detecting distinguishable intra-lesional vessels (Fig. 1). Combined with HU thresholding, DLVS can be utilized for accurate volume segmentation of intra-parenchymal lesions, for applications such as the volumetric evaluation of COVID-19 pneumonia burden [21]. DLVS could be applied in various other clinical settings, e.g., quantitatively assessing disease progression in diffuse lung disease or assisting readers’ performance in mediastinal lymph node evaluation, thromboembolism detection, or lung nodule detection. However, DLVS showed suboptimal performance for noncalcified lung nodules (accuracy 58.7%). We tried to minimize false positives by applying specific data augmentation (5-fold augmentation of cases with false nodules of varying sizes) and conducting 2-step training, but DLVS still misclassified 41.3% (31/75) of the points in noncalcified lung nodules as intravascular. Since nodule detection is an expected indication of automatic vessel segmentation, this result is quite disappointing. Further modification of the algorithm by removing nodules through connectivity evaluation could be beneficial.

In the COPD low-dose CT cohort, DLVS yielded some potentially meaningful parameters, including PVV5 and %PVV5. Endothelial dysfunction and loss of microvasculature, leading to increased vascular resistance and reduced oxygen delivery capacity, are well-known histopathologic phenomena observed in patients with COPD [5,6,7, 22, 23]. Both increased emphysema burden and decreased DLCO are known to be associated with losses of pulmonary microvasculature [23], and the histology-radiology correlation of pulmonary microvasculature has been established in autopsy cases [5]. Significant correlations of CT-assessed PVV5 with other COPD-associated indices, including emphysema index, FEV1, and DLCO, have also been reported [5, 24, 25]. Similar to the prior reports, our DLVS-computed PVV5 or %PVV5 correlated significantly with patients’ GOLD categorization, emphysema index, and DLCO. DLVS-computed PVV5 successfully differentiated GOLD 3 patients from GOLD 1–2 patients with an AUROC of 0.804. However, consistent with the previous study [24], the correlations among these indices were not very strong (Spearman rho < 0.4). As all the COPD-associated indices we evaluated (PVV5, emphysema index, FEV1, and DLCO) represent a certain part of the pathogenesis of COPD, we believe each factor may have its own clinical meaning. Although the correlations between the indices were not strong, these indices may act as independent factors predicting disease progression. A strength of our study is that we derived noninvasive and automatic parameters that reflect the pathogenesis of COPD. Further investigation of these volume-related factors would be beneficial as a way to explore patients’ survival or disease prognosis.

Our validation process for DLVS has some weaknesses. First, the VESSEL12 dataset contained contrast-enhanced CT images, while DLVS was designed for noncontrast CT. As we could not find any relevant dataset of noncontrast CT scans for external validation of the pulmonary vessel segmentation, we had to use this contrast-enhanced dataset. Most likely due to this discrepancy, the diagnostic accuracy of DLVS in the detection of intravascular areas was unsatisfactory, even though the AUROC was high (0.969). Threshold adjustment should be considered when applying DLVS to scans obtained using different CT protocols. Indeed, a sensitivity and specificity of 90.6% and 95.3%, respectively, could have been achieved if the threshold had been adjusted in accordance with the Youden index J [26]. Second, our COPD dataset was retrospectively collected, and the reasons for the examinations were diverse. As COPD patients usually do not undergo low-dose CT for COPD evaluation at our institution, most patients who underwent low-dose CT might have had respiratory symptoms or other underlying thoracic diseases. As a result, a considerable portion of the initial consecutively collected cohort was excluded (91 out of 372 patients). Additionally, most of the patients were GOLD 1 or 2, and none were classified as GOLD 4. This inevitably yielded selection bias, and a well-designed prospective study is warranted to confirm the clinical significance of DLVS-assessed pulmonary vascular remodeling.
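Threshold adjustment by the Youden index can be sketched as follows: every observed probability score is tried as a cut-off, and the one maximizing J = sensitivity + specificity − 1 is retained. This is a minimal illustration with a hypothetical function name and hypothetical data; it assumes that a point is called intravascular when its score is at or above the threshold.

```python
def youden_threshold(scores, labels):
    """Return (threshold, sensitivity, specificity) maximizing Youden's J.

    scores: probability scores; labels: 1 for intravascular, 0 for extravascular.
    Assumption: score >= threshold is classified as intravascular.
    """
    positives = sum(1 for label in labels if label)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes need at least one point")
    best = None
    for threshold in sorted(set(scores)):
        true_pos = sum(1 for s, l in zip(scores, labels) if l and s >= threshold)
        true_neg = sum(1 for s, l in zip(scores, labels) if not l and s < threshold)
        sensitivity = true_pos / positives
        specificity = true_neg / negatives
        j = sensitivity + specificity - 1.0
        if best is None or j > best[0]:
            best = (j, threshold, sensitivity, specificity)
    return best[1], best[2], best[3]


if __name__ == "__main__":
    # Hypothetical scores in which vessel points score lower than a default
    # cut-off of 0.5 would expect, as might occur with a protocol mismatch.
    scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.9]
    labels = [0, 0, 0, 1, 1, 1]
    print(youden_threshold(scores, labels))
```

In this toy example a fixed cut-off of 0.5 would miss the vessel point scored 0.4, whereas the Youden-selected cut-off recovers it; a high AUROC with low accuracy at the default threshold is the same pattern on a larger scale.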

Our study has some other limitations. First, as vessel map generation is labor-intensive, the size of the training dataset was limited. We tried to maximize the performance by using spatiotemporally matched vessel maps and augmenting images by adding false pulmonary nodules. Second, the role of DLVS in routine practice was not explored. As vessel segmentation is expected to improve radiologists’ performance in pulmonary nodule detection or mediastinal lymph node evaluation, a further study that assesses its effect on diagnostic performance would be beneficial. Third, our training dataset was homogeneous, which may limit the generalizability of the algorithm. Finally, we did not compare the performance of DLVS with that of other vessel segmentation tools, such as ImageJ or Aview [25, 27].

In conclusion, DLVS successfully segmented pulmonary vessels from noncontrast chest CT images and showed promising results in assessing the loss of small vessel density in COPD patients.