Idiopathic pulmonary fibrosis (IPF) is a progressive disease with a variable clinical course but poor prognosis [1]. Several clinical and radiological characteristics have been associated with mortality in patients with IPF; however, the course of disease for an individual patient remains difficult to predict [2]. The identification and validation of blood biomarkers that are predictive of clinically relevant outcomes in patients with IPF would be of value in improving patient care.

Proteomic profiling plays an important role in the discovery of biomarkers. Patients with IPF have been shown to have a unique peripheral blood proteome [3, 4]. Furthermore, a recent report suggested that the circulating proteome may differentiate patients with IPF who will experience progression over the following 80 weeks from those who will remain stable over this period [5]. We examined the associations between circulating proteins and respiratory death or lung transplant, and the variable importance of circulating proteins as predictors of this outcome, in a cohort of patients from the IPF-PRO Registry.


Study Cohort

The IPF-PRO Registry is a multi-center US registry that enrolled patients with IPF that was diagnosed or confirmed at the enrolling center in the past 6 months, based on the 2011 ATS/ERS/JRS/ALAT diagnostic guidelines [6]. The design of the IPF-PRO Registry has been described [7]. The current analyses were based on data from 300 patients enrolled between March 2016 and February 2017. Outcomes were ascertained from enrollment to June 2019.

Proteomic Assays

Plasma samples taken at enrollment were assayed using an aptamer-based platform encompassing 1305 proteins (SOMAscan, SomaLogic Inc., Boulder, CO). Protein data were log2-transformed prior to analysis.


The univariable association between each protein and the composite outcome of respiratory death or lung transplant was determined using Cox proportional hazards modelling. Linearity and proportional hazards assumptions were assessed prior to fitting each model. Analyses were performed in an unadjusted fashion and adjusted for sex, age, % predicted forced vital capacity (FVC), % predicted diffusing capacity for carbon monoxide (DLco), oxygen use at rest and oxygen use with activity (all assessed at enrollment). p-values were corrected for multiple comparisons using the Benjamini–Hochberg method to control the false discovery rate (FDR) at 5%. Proteins for which the hazard ratio was > 2 or < 0.5 and the FDR-corrected p-value was ≤ 0.05 were regarded as significantly associated with the outcome.
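The selection rule described above can be sketched in a few lines. This is an illustrative sketch only, not the code used for the registry analyses: the function names (`benjamini_hochberg`, `select_proteins`) and the input layout are assumptions, and the hazard ratios and p-values are taken as already estimated from the Cox models. If a log2-transformed protein entered a model as a single linear term, its hazard ratio would correspond to a doubling of the protein level.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling each by m / rank and
    # enforcing that adjusted values never increase as rank decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_proteins(
    names: Sequence[str],
    hazard_ratios: Sequence[float],
    p_values: Sequence[float],
    hr_upper: float = 2.0,
    hr_lower: float = 0.5,
    fdr: float = 0.05,
) -> List[str]:
    """Keep proteins with HR > hr_upper or < hr_lower and adjusted p <= fdr."""
    adjusted = benjamini_hochberg(p_values)
    return [
        name
        for name, hr, q in zip(names, hazard_ratios, adjusted)
        if (hr > hr_upper or hr < hr_lower) and q <= fdr
    ]
```

Requiring both an effect-size threshold and an FDR threshold means that a protein with a small but statistically robust hazard ratio is not flagged, which keeps the list focused on associations of larger magnitude.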

Multivariable analyses using Cox regression modelling with the elastic net penalty identified a set of candidate predictors for the composite outcome of respiratory death or lung transplant. First, only proteins were considered in the pool of potential predictors and second, both proteins and clinical factors (sex, age, % predicted FVC, % predicted DLco, oxygen use at rest, oxygen use with activity [all assessed at enrollment]) were considered. The variable importance of the predictors selected by each model was plotted. The performance of each model was assessed using Harrell’s C-index and the optimism-corrected C-index. For the model including both proteins and clinical factors, the C-indices were also computed in groups based on antifibrotic drug use (i.e. taking or not taking an approved antifibrotic drug for IPF at enrollment). A multivariable model that included only the clinical factors was constructed and the C-index computed, such that its performance could be compared with that of the protein-inclusive models.
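For readers unfamiliar with the performance measures named above, the following is a minimal sketch of Harrell's C-index and of an optimism correction. It is not the implementation used in this study. The sketch skips pairs with tied event times for simplicity, and it assumes that optimism is estimated by bootstrap resampling (refit the model in each resample, then compare its C-index in the resample with its C-index in the original data), which the text does not specify. All function and parameter names are hypothetical.

```python
from typing import Sequence


def harrell_c_index(
    times: Sequence[float],
    events: Sequence[bool],
    risk_scores: Sequence[float],
) -> float:
    """Fraction of comparable pairs in which the higher-risk subject fails first.

    A pair (i, j) is comparable when subject i has an observed event and
    subject j is still under observation at a strictly later time.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[j] > times[i]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def optimism_corrected_c(
    apparent_c: float,
    c_in_bootstrap_sample: Sequence[float],
    c_in_original_data: Sequence[float],
) -> float:
    """Subtract the mean bootstrap optimism from the apparent C-index.

    For each resample, optimism is the C-index of the refitted model in the
    resample minus its C-index when applied to the original data.
    """
    gaps = [b - o for b, o in zip(c_in_bootstrap_sample, c_in_original_data)]
    return apparent_c - sum(gaps) / len(gaps)
```

A C-index of 0.5 corresponds to discrimination no better than chance and 1.0 to perfect ranking. The gap between the apparent and the optimism-corrected value gives a sense of how much of the apparent performance may reflect overfitting to the cohort used for model building.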



A total of 300 patients were included. At enrollment, median (Q1, Q3) age was 70 (65, 75) years, 74% were male, 94% were white, and 99% were former or current smokers. Median (Q1, Q3) FVC % predicted and DLco % predicted were 69.7 (61.0, 80.2) and 40.5 (31.1, 49.3), respectively. The majority of patients were taking an approved antifibrotic medication for IPF (35% pirfenidone, 19% nintedanib). Median (Q1, Q3) duration of follow-up was 30.4 (20.1, 41.1) months. In total, 76 respiratory deaths and 26 lung transplants occurred.

Relationship Between Circulating Proteins and Respiratory Death or Lung Transplant

In unadjusted univariable analyses, 61 proteins were significantly associated with the composite of respiratory death or lung transplant. After adjustment for clinical factors, 22 proteins remained significantly associated with the composite outcome (Table 1).

Table 1 Circulating proteins associated with respiratory death or lung transplant in patients with IPF in univariable analyses adjusted for clinical factors

In multivariable analyses considering proteins only, a set of 54 proteins predicted the probability of the composite of respiratory death or lung transplant with a C-index of 0.83 (optimism-corrected C-index of 0.76). The variable importance of the selected proteins is shown in Fig. 1. Among the proteins of greatest importance in discriminating the outcome were spondin-1 (SPON1), intercellular adhesion molecule 5 (ICAM5), C-X-C motif chemokine 13 (CXCL13), alpha-2-HS-glycoprotein (AHSG) and protein inhibitor of activated STAT 4 (PIAS4).

Fig. 1

Variable importance of 54 proteins selected using a multivariable model to identify candidate proteins associated with the outcome of respiratory death or lung transplant in patients with IPF. For CKM, two piece-wise linear components are shown

Multivariable analyses considering both proteins and clinical factors identified a set of 51 predictors (47 proteins, 4 clinical factors) with a C-index of 0.84 (optimism-corrected C-index of 0.76). Model performance was similar in patients who were and were not taking antifibrotic therapy at enrollment (C-index 0.84 and optimism-corrected C-index 0.77 in treated patients; C-index 0.82 and optimism-corrected C-index 0.74 in untreated patients). The variable importance of the selected predictors is shown in Fig. 2. In general, the same protein predictors were retained, but all were of lower importance than oxygen use and measures of lung function. Notably, the performance of the model including both proteins and clinical factors was superior to a model that considered only the clinical factors, for which the C-index was 0.75 and the optimism-corrected C-index was 0.73.

Fig. 2

Variable importance of 47 proteins and 4 clinical factors selected using a multivariable model to identify candidate predictors of the outcome of respiratory death or lung transplant in patients with IPF. For CKM, two piece-wise linear components are shown


In this analysis of data from 300 patients with IPF, we identified several circulating proteins that were strongly associated with a composite outcome of respiratory death or lung transplant, after adjusting for clinical variables known to be associated with mortality in this population [8]. Many of these proteins have functions in inflammation, immune activation/regulation, cell–cell adhesion, or pathways reported to play a role in fibrogenesis (e.g. TGF-β signaling, bone morphogenetic protein signaling, Janus kinase signaling).

While some of our findings are consistent with previous data, such as the association between elevated levels of chemokine CXCL13 and reduced survival [9], our analyses identified several additional candidate proteins as biomarkers of mortality risk, including proteins not measured in previous studies. These results extend previous analyses of data from the IPF-PRO Registry that identified several proteins that were associated with clinical measures of IPF severity (% predicted FVC, % predicted DLco, composite physiologic index) at enrollment [3]. In the current analyses, each of the proteins that was associated with all three disease severity measures in this prior work (SPON1, ICAM5, roundabout homolog-2 [ROBO2], polymeric immunoglobulin receptor [PIGR]) was selected by the multivariable model that considered both proteins and clinical factors. While none of these proteins has been well characterized in lung fibrosis, it has been shown that ROBO2 is overexpressed in a mouse model of toxin-induced liver fibrosis, and that the interaction between ROBO2 and its ligand promotes fibrogenic activity within stellate cells [10]. Notably, inclusion of the proteins along with the clinical measures enhanced the discriminatory ability of the model compared with a model that included only clinical factors. This suggests that circulating proteins may provide information that is independent of that captured by measures commonly performed in the clinic.

Among the top protein predictors of the composite of respiratory death or lung transplant were AHSG and PIAS4. Higher AHSG levels and lower PIAS4 levels were associated with reduced risk. These proteins have opposing roles in regulating TGF-β signaling, a pathway known to be important in IPF. Thus, it is plausible that they may contribute to the development or progression of IPF. In experimental models, AHSG is an antagonist of TGF-β, with animals genetically lacking AHSG expression showing increased SMAD2 phosphorylation [11, 12]. Furthermore, TGF-β-mediated suppression of immune cell function was exaggerated in AHSG-deficient animals, as shown by inhibition of macrophage activation [12]. In an experimental model of liver fibrosis, PIAS4 silencing blocked recruitment of SMAD3, decreasing pro-fibrotic gene expression and ameliorating hepatic fibrosis [13]. In the context of these experimental data, our findings warrant mechanistic and clinical studies to define the contribution of these proteins to the pathogenesis of IPF and clarify their potential as biomarkers of IPF progression.

Strengths of our analysis include the multi-center nature of the cohort and the adjustment for clinical variables known to influence survival in patients with IPF. Our analyses also have limitations. First, the cohort was a population of mainly white patients enrolled at expert centers in the US; thus, our findings may not be applicable to all patients with IPF. Second, while a broad array of proteins was analyzed, some potentially important proteins may have been missed as they were not included on the platform. Third, an aptamer-based approach to protein detection does not always produce results that are reproducible using ELISA; analyses using ELISA are planned.

In conclusion, we identified several novel candidate circulating protein biomarkers for predicting respiratory death or lung transplant in patients with IPF. These data underscore the opportunity to develop biomarker-inclusive algorithms that provide meaningful risk stratification for patients with IPF.