Background

Idiopathic pulmonary fibrosis (IPF) is a chronic, progressive lung disease that results in fibrotic scarring of the alveolar tissues. Globally, the incidence of IPF is increasing, with approximately 3–9 cases per 100,000 individuals being reported each year [1]. Anti-fibrotic drugs, such as pirfenidone and nintedanib, which suppress disease progression, have been clinically approved for the treatment of IPF [2,3,4,5]. However, the overall survival of IPF patients remains poor, ranging from 2 to 3 years [6, 7]. Although a few patients with IPF and severe respiratory failure have been treated with lung transplantation [8, 9], strict eligibility criteria and a shortage of organ donors often limit transplantation therapy.

Multiple studies have shown that damage to the respiratory epithelium and impairment in its repair mechanism play a central role in the development of IPF [10]. In particular, alveolar type II (AT2) cells play important roles in the pathogenesis of IPF because they act as progenitor cells and help in regeneration of the respiratory epithelium [11, 12]. Currently, multiple gene mutations affecting the function or survival of AT2 cells have been reported in IPF lung tissues [13, 14]. In addition, single nucleotide polymorphisms in mucin 5B (MUC5B), resulting in the abnormal production of mucin, are known to play a role in IPF pathogenesis [15]. Furthermore, the incidence of IPF increases with aging, which suggests the existence of a complex relationship between chronic environmental exposure, infection, host defense/repair pathways, and disease progression.

Currently, there are multiple clinically approved kinase inhibitors for a wide range of diseases, including fibrosis and malignant diseases. Furthermore, there are a few similarities in the pathogenesis of IPF and non-small cell lung cancer (NSCLC), a chronic respiratory disease with abnormal cell proliferation. In addition, the activation of tyrosine kinases and overexpression of growth factors have been known to play important roles in the progression of both pulmonary fibrosis [16, 17] and lung cancers [18]. Nintedanib has the potential to inhibit the activity of multiple kinases, including vascular endothelial growth factor receptor, fibroblast growth factor receptor, and platelet-derived growth factor receptor [19]. Interestingly, nintedanib has also demonstrated a beneficial effect on tumor suppression in clinical trials involving patients with advanced NSCLC [20]. Based on these findings, we hypothesized that other kinase inhibitors may also have the potential to inhibit the progression of IPF.

In this study, we analyzed the expression of 612 kinase and cancer-related genes to identify the potential therapeutic targets of IPF. We used next-generation sequencing to perform gene expression analysis of 13 and 8 surgically resected lung tissues from patients with and without IPF, respectively. Further, we validated the expression of selected genes at the protein level in fibrotic lesions using immunostaining.

Methods

Patients and sample preparation

This study (registered number: K1505-033) was approved by the Ethics Committee of Okayama University Graduate School of Medicine, Dentistry and Pharmaceutical Sciences and Okayama University Hospital.

Patients with or without IPF were enrolled in this study between April 2015 and November 2016 after obtaining written informed consent. The tissue samples with or without IPF were obtained from the organs removed for lung transplantation or resected during the treatment of lung cancer at Okayama University Hospital. The diagnosis of IPF was based on the official ATS/ERS/JRS/ALAT statement [21]. The collected samples were immediately cut into small sections and fixed using PAXgene Tissue System (PreAnalytiX, Hombrechtikon, Switzerland) or RNAlater™ (Sigma-Aldrich, St. Louis, MO, USA). RNA was extracted from RNAlater™-fixed samples using the RNeasy Micro Kit according to the manufacturer’s protocol. The concentration and quality of RNA were measured using an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA).

Targeted RNA sequencing and data analysis

The SureSelect RNA Human Kinome Kit (Agilent Technologies), which targets 612 genes, including 517 protein kinase-coding genes and 46 cancer-related genes, was used for library preparation from RNA samples with an RNA integrity number > 7. Sequencing was performed on an Illumina MiSeq Sequencing System using the V2 Reagent Kit (Illumina, San Diego, CA, USA). The sequencing data were analyzed using the CLC Genomics Workbench (CLC bio, Aarhus, Denmark). Gene expression data were normalized as reads per kilobase per million mapped reads (RPKM). Hierarchical clustering analysis was performed using Cluster 3.0 software after the following adjustments: values were centered on the median of each sample, and the centered values were then normalized by dividing them by the standard deviation of all samples. Gene expression diversity was calculated as follows: the difference between the maximum and minimum RPKM values was calculated for each gene in each patient, and the obtained values were divided by the standard deviation of all samples.
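The RPKM normalization and the diversity measure described above can be illustrated with a minimal Python sketch. It is not the CLC Genomics Workbench or Cluster 3.0 implementation. The function names are hypothetical, and the sketch assumes that "the standard deviation of all samples" means the sample standard deviation of one gene's RPKM values across every sample from every patient.

```python
from statistics import stdev


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene length per million mapped reads."""
    # reads / (length in kb * mapped reads in millions)
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def expression_diversity(values_by_patient):
    """Per-patient diversity of a single gene.

    values_by_patient maps a patient label to the list of RPKM values
    of one gene in each sample taken from that patient.
    Assumption: the scaling factor is the sample standard deviation of
    the gene over all samples pooled across patients. It must be
    non-zero, otherwise the division below fails.
    """
    pooled = [v for values in values_by_patient.values() for v in values]
    sd_all = stdev(pooled)
    return {
        patient: (max(values) - min(values)) / sd_all
        for patient, values in values_by_patient.items()
    }


if __name__ == "__main__":
    print(rpkm(500, 2000, 10_000_000))
    print(expression_diversity({"ipf_case": [2.0, 4.0, 6.0],
                                "control_case": [4.0, 4.0]}))
```

Under this reading, a patient whose samples all show the same expression of a gene has a diversity of zero for that gene, and larger values indicate greater spread between sampling sites.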

Histological analysis

The paraffin-embedded tissue blocks fixed using PAXgene Tissue System were cut into 5-µm thick sections and stained with hematoxylin and eosin. The severity of fibrosis in these samples was evaluated using the Ashcroft score, as previously described [22].

Immunohistochemistry (IHC)

The paraffin-embedded tissue blocks fixed using PAXgene Tissue System were cut into 5-µm thick sections, placed on glass slides, and deparaffinized in d-limonene and graded alcohol. The tissue sections were then incubated in 1 mM EDTA buffer (pH 8.0) for 10 min at 95 °C in a water bath and blocked for endogenous peroxidase activity with 3% hydrogen peroxide in methanol for 5 min. Following the incubation, the slides were rinsed with Tris-buffered saline containing 0.1% Tween 20 and blocked with normal goat serum or normal horse serum for 60 min. The sections were then incubated with primary antibodies overnight at 4 °C. The primary antibodies against doublecortin-like kinase 1 (DCLK1; ab31704), pyruvate dehydrogenase kinase 4 (PDK4; ab71240), and spleen-associated tyrosine kinase (SYK; ab40781) were obtained from Abcam (Cambridge, MA, USA). Anti-Pim-2 proto-oncogene, serine/threonine kinase (anti-PIM2; HPA000285) and anti-serine/threonine kinase 33 (anti-STK33; HPA015742) antibodies were purchased from Sigma-Aldrich (St. Louis, MO, USA), and anti-Erb-B2 receptor tyrosine kinase 4 (anti-ERBB4) antibody (19943-1-AP) was obtained from Proteintech Japan (Tokyo, Japan). Following overnight incubation, the sections were incubated with EnVision + Single Reagents HRP Rabbit (Dako, Glostrup, Denmark) or ImmPRESS Reagent Anti-Mouse IgG (Vector Laboratories, Burlingame, CA, USA) secondary antibodies for 20 min. Finally, the sections were stained with 3,3′-diaminobenzidine and counterstained with hematoxylin.

Signal-to-noise weighted-voting score

Samples of IPF case 1 were used to select the genes for weighted-voting score calculation. Samples of IPF cases 2 and 3 were used as an independent validation set to evaluate the versatility of the model. The signal-to-noise statistic (Sx) was calculated as described previously [23, 24]. Briefly, Sx, the weight for gene x, is calculated as Sx = (μAshcroft≥6 − μAshcroft<6)/(σAshcroft≥6 + σAshcroft<6), where, for each gene, μAshcroft≥6 and σAshcroft≥6 represent the mean and standard deviation of its expression in all samples with an Ashcroft score ≥ 6, and μAshcroft<6 and σAshcroft<6 represent the corresponding values in all samples with an Ashcroft score < 6. In all, 26 genes showing (μAshcroft≥6 − μAshcroft<6) > 1 were selected as “induced” genes, whereas 15 genes showing (μAshcroft≥6 − μAshcroft<6) < −1 were selected as “suppressed” genes in Ashcroft score ≥ 6 samples (Additional file 1: Table S1). A weighted-voting classification algorithm was employed to classify samples as Ashcroft score ≥ 6 or Ashcroft score < 6 using the genes selected as described above, and the resulting classifiers were tested using the independent dataset (IPF cases 2 and 3). “Weights” were calculated based on triplicate Ashcroft score ≥ 6 samples and triplicate Ashcroft score < 6 samples of case 1. In this scheme, gene x in the predictive gene set casts a vote for a test sample γ based on its expression in this sample (gxγ), using the weight Sx, the boundary bx = (μAshcroft≥6 + μAshcroft<6)/2, and the weighted voting score Vx = Sx (gxγ − bx). The final voting score is the sum over all genes (∑xVx).
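The weighted-voting scheme can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used in the study: the function names and gene labels (GENE_A, GENE_B) are hypothetical, the numbers are made up, and the sample standard deviation is assumed because the text does not specify which estimator was used.

```python
from statistics import mean, stdev


def signal_to_noise(high, low):
    """Weight Sx = (mean_high - mean_low) / (sd_high + sd_low).

    high: training values of one gene in Ashcroft score >= 6 samples.
    low:  training values of the same gene in Ashcroft score < 6 samples.
    Assumption: sample standard deviation (n - 1 denominator).
    """
    return (mean(high) - mean(low)) / (stdev(high) + stdev(low))


def boundary(high, low):
    """Decision boundary bx = (mean_high + mean_low) / 2."""
    return (mean(high) + mean(low)) / 2


def voting_score(training, test_sample):
    """Sum of per-gene votes Vx = Sx * (g_x - bx) for one test sample.

    training:    {gene: (high_values, low_values)} from the training case.
    test_sample: {gene: expression value in the test sample}.
    """
    total = 0.0
    for gene, (high, low) in training.items():
        vote = signal_to_noise(high, low) * (test_sample[gene] - boundary(high, low))
        total += vote
    return total


if __name__ == "__main__":
    # Hypothetical triplicate training values: one induced, one suppressed gene.
    training = {
        "GENE_A": ([5.0, 6.0, 7.0], [1.0, 2.0, 3.0]),
        "GENE_B": ([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]),
    }
    print(voting_score(training, {"GENE_A": 6.5, "GENE_B": 2.0}))
    print(voting_score(training, {"GENE_A": 2.0, "GENE_B": 5.0}))
```

With these definitions, a positive summed score means the test sample lies on the Ashcroft score ≥ 6 side of the boundaries overall, and a negative score means it lies on the Ashcroft score < 6 side.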

Statistical analysis

Statistical analyses were performed using STATA software version 15.1 (StataCorp, College Station, TX, USA). Expression of targetable kinase genes was compared between each individual case and the control lung samples using two-tailed paired Student’s t-tests.

Results

Characteristics of IPF and control patients

The characteristics of five IPF and four control patients are shown in Table 1. The median age of IPF patients was 57 years (range, 56–67 years). IPF lung tissues were harvested from two patients who underwent lung transplantation and from three patients who underwent surgical lung resection for NSCLC. Multiple samples (2–6 samples) were obtained from three IPF patients (cases 1, 2, and 3), whereas a single sample was collected from each of the remaining two patients (cases 4 and 5). Thus, in total, 13 lung tissue samples were obtained from five IPF patients. The median age of the control patients was 64 years (range, 45–82 years). The lung tissues without IPF were harvested from the control patients who underwent surgical lung resection for NSCLC. A total of eight lung tissue samples were collected from four control patients (two from each patient).

Table 1 Patient characteristics and number of obtained samples

Gene expression signatures are altered with the progression of fibrosis

Gene expression analysis was performed using 13 IPF and 8 control (without IPF) lung tissue samples. Clustering analysis indicated that the IPF and control samples clustered together (Fig. 1a). Hence, considering that the severity of fibrosis may affect gene expression, we divided the IPF tissue samples into two subgroups based on the Ashcroft score, which estimates the severity of pulmonary fibrosis on a numerical scale (Ashcroft score < 6: normal to moderate fibrosis; Ashcroft score ≥ 6: severe fibrosis) [22]. As expected, the severe fibrotic samples (Ashcroft score ≥ 6) showed a gene signature distinct from that of the moderate fibrotic (Ashcroft score < 6) and control samples, whereas the gene expression signature of the moderate fibrotic samples (Ashcroft score < 6) was not distinct from that of the control lung samples (Fig. 1a). Additionally, only five genes were found to be differentially expressed by more than twofold in moderate fibrotic samples compared to the control samples, whereas 51 genes were differentially expressed by more than twofold in severe fibrotic samples compared to the control samples (Fig. 1b, Additional file 1: Table S2). Moreover, we performed an independent clustering analysis of the six IPF samples obtained from a single patient (case 1) and confirmed the correlation between the gene expression signature and Ashcroft score (Fig. 1c). Clustering analysis confirmed the correlation between the gene expression signature and Ashcroft score in IPF cases 2 and 3 as well (Additional file 2: Fig. S1a, b). Further, we examined whether the change in gene expression according to the severity of fibrosis in IPF case 1 was also observed in IPF cases 2 and 3. As expected, the signal-to-noise weighted-voting score analysis showed that the gene expression signature in the fibrotic tissues with Ashcroft score ≥ 6 or < 6 in IPF case 1 was reproduced in both IPF cases 2 and 3 (Additional file 2: Fig. S1c). Altogether, these results suggest that the gene expression signatures are altered with the progression of fibrosis.

Fig. 1

Expression analysis of 612 kinase-coding and cancer-related genes in IPF and control lung samples. a Clustering analysis of all samples. b Number of genes differentially expressed by more than twofold in moderate and severe fibrotic samples compared to control lung samples. c Clustering analysis of six samples from IPF case 1. d Gene expression diversity in IPF and control lung samples from each patient. IPF cases 4 and 5 were excluded from the gene expression diversity analysis as they contributed one sample each

Gene expression diversity in IPF lung tissues

As IPF lungs typically present with temporally and spatially heterogeneous histological findings, we assessed the gene expression diversity in IPF lung tissues harvested from different segments in the same patient. Therefore, only patients contributing multiple lung samples were included in this analysis (i.e., IPF cases 1–3 and control cases 1–4). As expected, greater diversity was observed in IPF lung tissues than in control lung tissues, suggesting that IPF lungs are heterogeneous not only histologically but also in gene expression (Fig. 1d).

Expression of targetable kinase-coding genes in IPF lung

To explore the targetable kinase-coding genes in IPF lung tissues, we sought to identify genes that were upregulated in IPF lung tissues (n = 13) compared to control lung tissues (without IPF, n = 8) with a fold change of > 2. The integrated analysis indicated three genes (DCLK1; STK33; and cyclin-dependent kinase 1, CDK1) to be upregulated with the considered threshold of > twofold in 13 IPF samples compared to the 8 control samples (Table 2).
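The twofold threshold used here corresponds to a log2 ratio greater than 1. The following Python sketch shows one way such a filter could be applied. It assumes that fold change is the ratio of the mean RPKM in IPF samples to the mean RPKM in control samples, which the text does not state explicitly. The function names, gene labels, and values are hypothetical.

```python
from math import log2
from statistics import mean


def log2_ratio(ipf_values, control_values):
    """log2 of mean IPF expression over mean control expression.

    Assumption: fold change is a ratio of group means; the control
    mean must be greater than zero.
    """
    return log2(mean(ipf_values) / mean(control_values))


def upregulated(expression, fold_change=2.0):
    """Return {gene: log2 ratio} for genes exceeding the fold-change cutoff.

    expression: {gene: (ipf_rpkm_values, control_rpkm_values)}.
    """
    cutoff = log2(fold_change)
    hits = {}
    for gene, (ipf, control) in expression.items():
        ratio = log2_ratio(ipf, control)
        if ratio > cutoff:
            hits[gene] = ratio
    return hits


if __name__ == "__main__":
    # Made-up RPKM values for two placeholder genes.
    example = {
        "GENE_A": ([8.0, 8.0], [2.0, 2.0]),  # fourfold higher in IPF
        "GENE_B": ([3.0, 3.0], [2.0, 2.0]),  # 1.5-fold higher in IPF
    }
    print(upregulated(example))
```

Only GENE_A passes the filter in this example, because a 1.5-fold difference falls below the twofold cutoff.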

Table 2 Upregulated kinases in idiopathic pulmonary fibrosis compared to control samples

Further, considering the heterogeneous nature of IPF, we independently compared the gene expression profiles between the IPF lung samples collected from each of the three cases and control lung. In addition, we evaluated the expression of the 46 selected genes encoding kinases having clinically available kinase inhibitors (Additional file 1: Table S3) [25]. However, of the 46, six genes (ALK receptor tyrosine kinase, Ret proto-oncogene, neurotrophic receptor tyrosine kinase 1, neurotrophic receptor tyrosine kinase 3, Fms-related tyrosine kinase 3, and protein kinase C gamma) were excluded from further analysis because their RPKM values were lower than the overall median RPKM value. The findings from each of the IPF cases are shown below.

Case 1: A 57-year-old male with a smoking habit (132 pack-years) and a history of right upper lobectomy for squamous NSCLC. High-resolution computed tomography (HRCT) revealed bilateral honeycombing, indicating a usual interstitial pneumonia (UIP) pattern (Fig. 2a). Pathological examination confirmed that the fibrosis was UIP. Based on these findings and exclusion of other causes of pulmonary fibrosis, such as collagen disease, the patient was finally diagnosed with IPF. Lung fibrosis gradually progressed, and total pneumonectomy and bilateral cadaveric lung transplantation were performed. We obtained six lung tissue samples from each of the bilateral lung segments from this patient (Fig. 2b). Figure 2c shows the top ten kinase-coding genes upregulated in the IPF samples. DCLK1, PDK4, ERBB4, CDK1, and ribosomal protein S6 kinase A6 were upregulated by more than twofold (Fig. 2c). STK33, which was significantly upregulated in the integrated analysis, was also upregulated (log2 ratio 1.35), but the difference was not statistically significant. Furthermore, IHC revealed that DCLK1 and PDK4 proteins were mainly expressed in the epithelial layer and smooth muscle cells of fibrotic lesions in IPF lungs, whereas they were expressed in the airway epithelium of control lungs (Fig. 2d, e). Additionally, PDK4 expression was observed in the alveolar macrophages.

Fig. 2

Clinical features and kinase expression profile of IPF lung tissues from case 1. a High-resolution computed tomography images of the lung. b Sampling site and hematoxylin and eosin staining of each lung tissue. c Top 10 kinase-coding genes upregulated in IPF case 1 lung samples compared to control lung samples. d Immunohistochemistry of DCLK1. Scale bar = 200 µm. e Immunohistochemistry of PDK4. Red arrows indicate alveolar macrophages. Scale bar = 200 µm. f Expression profile of the 40 selected genes encoding kinases having clinically available kinase inhibitors. Error bars indicate standard error. g Immunohistochemistry of ERBB4. Scale bar = 200 µm

Of the 40 genes that had clinically available kinase inhibitors, only ERBB4 was found to be significantly upregulated by more than twofold in IPF lung samples compared to the control lung samples (Fig. 2f). The ERBB4 protein had similar expression pattern to that of DCLK1 and PDK4 proteins, and it was mainly expressed in the epithelial layer and smooth muscle cells of fibrotic lesions in IPF lungs and in the airway epithelium of control lungs (Fig. 2g, Additional file 2: Fig. S2).

Case 2: A 67-year-old male with a smoking history (23.5 pack-years) who was diagnosed with lung squamous cell carcinoma (T2N0M0, stage IB) along with IPF. HRCT revealed lung cancer of the right lower lobe and bilateral honeycombing, indicating a UIP pattern (Fig. 3a). The patient underwent right lower lobectomy, and the pathological examination confirmed UIP in the lung tissue. Three samples were collected from the resected right lower lobe (Fig. 3b). Figure 3c shows the top ten genes upregulated by more than twofold in IPF lung tissues compared to control lung samples. STK33 and PIM2 were the top two genes based on fold change in IPF lung tissues. IHC indicated that STK33 protein was mainly expressed in the epithelial layer of fibrotic lesions in IPF, whereas it was observed in the airway epithelium of control lungs (Fig. 3d). Furthermore, PIM2 protein was mainly detected in the epithelial layer of fibrotic lesions, smooth muscle cells, and alveolar macrophages in IPF tissues, whereas it was expressed in the airway epithelium and alveolar macrophages in control lungs (Fig. 3e). Of the 40 genes having clinically available kinase inhibitors, SYK; Bruton tyrosine kinase; cyclin-dependent kinase 4; FGR proto-oncogene, Src family tyrosine kinase; and colony-stimulating factor 1 receptor were upregulated by more than twofold in IPF lung tissues (Fig. 3f). IHC indicated that SYK protein was mainly expressed in alveolar macrophages and epithelial layer of fibrotic lesions (Fig. 3g).

Fig. 3

Clinical features and kinase expression profile of IPF lung tissues from case 2. a High-resolution computed tomography images of the lung. Blue arrow indicates lung cancer. b Sampling site and hematoxylin and eosin staining of each lung tissue. c Top ten upregulated kinase-coding genes in IPF lung samples compared to control lung samples. d Immunohistochemistry of STK33. Scale bar = 200 µm. e Immunohistochemistry of PIM2. Scale bar = 200 µm. f Expression profile of the 40 selected genes encoding kinases having clinically available kinase inhibitors. Error bars indicate standard error. g Immunohistochemistry of SYK. Scale bar = 200 µm

Case 3: A 61-year-old male with a smoking history (82.5 pack-years) who was diagnosed with IPF. HRCT revealed bilateral honeycombing, indicating a UIP pattern (Fig. 4a), and pathological examination confirmed UIP. The patient underwent a cadaveric transplant of the right lung. We obtained two lung samples from the right upper lobe and lower lobe (Fig. 4b). The top 10 upregulated kinase-coding genes are shown in Fig. 4c, with DCLK1 being upregulated by more than twofold in IPF samples compared to the control lung samples. STK33 was also upregulated (log2 ratio 1.30), but without statistical significance. IHC indicated the expression of DCLK1 protein in the epithelial layer of fibrotic lesions and smooth muscle cells, similar to that observed in case 1 (Fig. 4d). Of the 40 genes having clinically available kinase inhibitors, Janus kinase 3 was upregulated by more than twofold (Fig. 4e); however, the fold change was not statistically significant. In cases 4 and 5, no genes were upregulated by more than twofold in IPF lung samples compared to the control samples (Additional file 2: Fig. S3a, b).

Fig. 4

Clinical features and kinase expression profile of IPF lung tissues from case 3. a High-resolution computed tomography images of the lung. b Sampling site and hematoxylin and eosin staining of each lung tissue. c Top ten upregulated kinase-coding genes in IPF lung samples compared to control lung samples. d Immunohistochemistry of DCLK1. Scale bar = 200 µm. e Expression profile of the 40 selected genes encoding kinases having clinically available kinase inhibitors. Error bars indicate standard error

Discussion

The current study revealed a correlation between the gene expression signatures and the degree of fibrosis, as assessed by the Ashcroft score, and indicated heterogeneity among IPF lung samples based on gene expression. In addition, we identified potentially targetable kinases, such as DCLK1, PDK4, ERBB4, STK33, PIM2, and SYK, which were overexpressed in IPF.

Our results demonstrated that DCLK1 followed by STK33 were the most upregulated genes in IPF lung tissues compared to control lung tissues. Consistent with our data, other studies have also reported the increased expression of these genes in IPF lungs [26,27,28]. Therefore, these genes may be universally upregulated in IPF lung tissues. DCLK1 regulates epithelial-mesenchymal transition (EMT) [29]. STK33 has been reported to be associated with cell proliferation as well as EMT in various cancer types [30]. In the current study, DCLK1 and STK33 proteins were expressed in the epithelial layer of fibrotic lesions in IPF lungs. Based on the evidence that epithelial cells differentiate into myofibroblasts through the EMT and that myofibroblasts promote lung fibrosis [31], DCLK1 and STK33 may serve as therapeutic candidates for IPF. In addition, recently, selective DCLK1 [32] and STK33 inhibitors [33] have been reported, which may provide alternative therapeutic strategies for IPF by suppressing the proliferation of aberrant epithelial cells and inhibiting EMT, thus hindering the progression of fibrosis.

Owing to the heterogeneous nature of IPF, its pathogenesis may vary in each patient. In some patients, IPF may be caused by the dysfunction of AT2 cells, whereas in others it may be caused by MUC5B gene aberration. Thus, the development of fibrosis and expression of genes may vary across individuals. In the present study, each patient showed a different gene expression pattern (e.g., case 1: DCLK1, PDK4, and ERBB4 expression; case 2: STK33, PIM2, and SYK expression). Except for DCLK1 and STK33, these genes were neither identified by our integrated analysis nor found in the IPF Gene Explorer database, indicating that these genes are not universally upregulated in IPF. However, these genes could serve as potential targets for personalized IPF therapy because they were uniquely upregulated in individual patients.

Among the selected genes having clinically available specific kinase inhibitors, ERBB4 was the most upregulated gene in IPF case 1. Dreymueller et al. reported that the release of inflammatory cytokines, such as CXCL8 and IL-6, from the smooth muscle cells was suppressed by inhibiting ERBB4 expression [34]. In addition, ERBB4 is reportedly associated with EMT in lung and gastric cancer cells [35, 36]. Therefore, ERBB4 could be a potential therapeutic target for IPF. Moreover, preclinical studies on SYK expression (upregulated in IPF case 2) have reported that its inhibition suppresses TGF-β1-induced myofibroblast activation and progression of fibrosis in the liver, kidney, skin, and lung [37,38,39]. Collectively, these results suggest that ERBB4 and SYK are attractive targets for IPF treatment; however, further preclinical studies are needed to confirm the suppression of lung fibrosis following the inhibition of the expression and activation of these kinases.

This study had several limitations. First, the sample size was small, potentially leading to skewed results because of selection bias. Second, of the 21 samples analyzed in this study, 13 were obtained from residual specimens of lung cancer surgery, which included a relatively high proportion of patients with lung cancer. Although we microscopically confirmed that the samples for this analysis did not contain lung cancer cells, we cannot completely rule out the possibility that the use of lung tissue from cancer-bearing lungs biased our results. Further, IPF case 3 was treated with pirfenidone prior to tissue collection, which may have affected gene expression [40]. Third, we used bulk RNA sequencing analysis to explore the kinome expression profile in IPF lung tissues, as opposed to single-cell sequencing (scRNA-Seq). Unlike scRNA-Seq [41, 42], the current study does not provide information on cell type-specific expression of the genes (e.g., in fibroblasts and alveolar epithelium). Thus, our study results should be cautiously interpreted. However, the upregulation of the genes detected by RNA sequencing was confirmed at the protein level by IHC in the epithelial layer and smooth muscle cells of fibrotic lesions. We believe that future studies using scRNA-Seq will delineate the cell type-specific dynamic changes in the expression of genes during the process of fibrosis and identify better therapeutic targets. Fourth, when selecting a therapeutic target based on genetic profile, it remains unclear which part of the heterogeneous IPF lung should be biopsied. A large-scale integrated analysis of multiple patients in whom tissue can be sampled from each lobe of both lungs, as in case 1, may provide clues regarding whether areas with strong fibrosis or weak fibrosis are more appropriate biopsy sites.

We performed a comprehensive kinase expression analysis using RNA sequencing to explore potential therapeutic targets for IPF and found that DCLK1 and STK33 may serve as potential candidates for molecular targeted therapy of IPF. In addition, PDK4, ERBB4, PIM2, and SYK might also be attractive targets in individual cases. Additional large-scale studies are warranted to develop personalized therapies for patients with IPF.