Review

Introduction

Biological markers, often referred to as biomarkers, are commonly defined as objectively measured and evaluated indicators of physiological or pathological processes or of pharmacological responses to therapeutic intervention [1], although several other definitions exist. In recent years, blood-derived biomarkers from serum, plasma, or cells have been the most extensively reviewed with respect to idiopathic pulmonary fibrosis (IPF) [2,3]. Biomarkers have been postulated to be useful in several ways, e.g., in making a differential diagnosis between IPF and other interstitial lung diseases (ILDs), in estimating prognosis and survival, in revealing the course of disease, and in monitoring drug efficacy. In addition, biomarkers could be helpful in distinguishing between various phenotypes of IPF.

Rationale for lung tissue biomarkers

It has been estimated that about one third of IPF patients require a surgical lung biopsy (SLB) to reach a final diagnosis; thus, it may be feasible to obtain lung tissue samples from a relatively high proportion of patients [4]. One benefit of lung tissue biomarkers is that tissue is probably the most appropriate source for linking cell biological phenomena to the pathogenetic mechanisms of the disease. Many biomarkers can presumably be detected in several sample types. Blood, sputum, and even broncho-alveolar lavage (BAL) samples can be collected repeatedly, which is usually not possible for lung tissue samples taken by surgical operation, because these procedures always carry a potential risk of serious complications [4]. The novel, less invasive transbronchial cryo-biopsy technique for obtaining lung tissue samples is expected to become more common in clinical practice; this may mean that in the future, lung tissue samples could be obtained not only for diagnostics but also for follow-up [5]. Some blood biomarkers have also been investigated in BAL and lung tissue, but very few reports describe the simultaneous examination of blood and BAL or lung tissue samples. The recent study of Seibold et al. combined multiple sources of material and showed that a polymorphism in the promoter of mucin-5 subtype B (MUC5B) was associated with familial interstitial pneumonia (IP) and IPF [6].

This review article focuses on lung tissue biomarkers in IPF, i.e., idiopathic usual interstitial pneumonia (UIP), concentrating on studies with relevant clinical endpoints. Studies focusing on either IPF or UIP were included because the classification has changed during the past decades, although not all UIP cases necessarily represent IPF. Publications comparing IPF with major types of ILDs such as nonspecific interstitial pneumonia (NSIP) and connective tissue disease-associated ILD (CTD-ILD), which are the most common differential diagnostic dilemmas, were included. In addition, studies conducted on lung tissue samples using modern large-scale transcriptomic and proteomic technologies were included, although not all of them used clinical or radiological endpoints.

Studies on lung tissue samples with clinical endpoints

Studies of fibroblast foci

The specific aggregates of fibroblasts, myofibroblasts, and extracellular matrix (ECM) proteins in fibrotic lung are called fibroblast foci (FF), and these structures are more common in IPF than in other types of lung fibrosis. Several studies have demonstrated that a high number of FF in lung tissue correlates with shortened survival of IPF patients [7-12], as previously reviewed elsewhere [4]. At present, the number of FF is the only histological biomarker that reproducibly correlates with the prognosis of IPF.

Lung tissue biomarkers with clinical or radiological endpoints

The majority, 75.8%, of patients with IPF were found to be positive for protease-activated receptor 2 (PAR-2) in the study of Park et al. (Table 1). Blood neutrophil counts were lower, whereas blood lymphocyte counts and honeycombing scores in chest CT were higher, in PAR-2-positive patients than in PAR-2-negative patients. All of the fatal cases belonged to the PAR-2-positive group, although this difference between groups did not reach statistical significance [13]. In the study of Tzouvelekis and co-authors, lung tissue samples of IPF, cryptogenic organizing pneumonia (COP), and NSIP patients were analyzed for epidermal growth factor receptor (EGFR). EGFR mRNA levels correlated negatively with forced vital capacity (FVC) and diffusion capacity (DLCO) [14].

Table 1 Compilation of studies using lung tissue biomarkers with clinical endpoints

Todd et al. studied IPF cases using both SLB and subsequent lung transplantation samples from each patient, which made it possible to compare the histological features of the early and late phases of the disease. The number of lymphocytes in lung tissue increased during disease progression, being higher in the lung explants than in the SLB samples [15].

IPF cases were investigated for the expression levels of alpha smooth muscle actin (α-SMA), telomerase, interleukin 4 (IL-4), transforming growth factor-beta (TGF-β), and beta fibroblast growth factor (β-FGF). The expression levels of myofibroblast α-SMA and IL-4 were negatively associated with patient survival [16]. Calabrese et al. examined explanted lungs of IPF patients; those with high-grade dysplasia or carcinoma showed a greater increase in serpin B3/B4 expression in metaplastic epithelial cells than the patients without these diseases. The expression level of serpin B3/B4 was linearly and positively associated with age. Furthermore, the patients with greater impairment in DLCO displayed significantly higher expression of serpin B3/B4 [17].

In a study of Cha and co-authors, the number of mast cells was increased in IPF, and a high mast cell number was also associated with a slower rate of decline in FVC [18]. Nagata and others evaluated Krebs von den Lungen-6 antigen (KL-6) and surfactant protein A (SP-A) in idiopathic interstitial pneumonia (IIP). In patients with IIPs as a whole and also in those with UIP, the SP-A positive ratio was significantly lower in those who died from the progression of disease than in patients with another outcome, i.e., stable, improved, or deteriorating but living [19].

Myllärniemi et al. investigated UIP and NSIP cases for gremlin and bone morphogenetic protein 4 (BMP-4), revealing that the area of gremlin-positive staining correlated negatively with FVC. When UIP and NSIP patients were analyzed, the levels of gremlin mRNA correlated negatively with the specific diffusion capacity corrected for alveolar volume (DLCO/VA), whereas BMP-4 mRNA correlated positively with FVC and DLCO [20].

Parra et al. revealed that the total density of inflammatory cells was significantly higher in patients with NSIP and diffuse alveolar damage (DAD) than in those with UIP. In UIP, forced expiratory volume in 1 s (FEV1) and survival correlated with the numbers of CD3-positive T lymphocytes (TLs), the numbers of CD68-positive cells correlated with FEV1, and the numbers of neutrophil elastase-positive cells correlated with residual volume, the ratio of residual volume to total lung capacity (TLC), and carbon monoxide transfer factor. The most important predictor of survival in UIP/IPF was the number of CD3-positive TLs [21]. Another study found that in IPF, the numbers of CD8-positive TLs correlated inversely with FVC% predicted, TLC% predicted, DLCO% predicted, and arterial oxygen tension (PaO2). Positive and statistically significant correlations were found between the numbers of CD8-positive TLs and the alveolar-arterial gradient (P(A-a)O2) as well as the Medical Research Council (MRC) score. Furthermore, the CD8-positive TLs displayed significant negative correlations with the FVC% predicted and the FEV1% predicted [22].

Tsukamoto et al. observed that epithelial cells stained positively for Epstein-Barr virus-associated latent membrane protein 1 (LMP1) in 31% of the patients with IPF, whereas none of the patients with systemic sclerosis (SSc)-associated ILD or the controls showed positive staining. Death from respiratory failure was significantly more common in LMP1-positive patients than in LMP1-negative patients. The use of systemic steroids after lung biopsy was more frequent in the LMP1-positive than in the LMP1-negative patients [23].

The amount of tenascin-C was analyzed in patients with UIP and with other types of ILDs [24]. The mean survival of UIP patients with high tenascin-C scores was significantly shorter than that of UIP patients with a lower tenascin-C sum score. Testing of tenascin-C scores in different locations revealed that increased accumulation of tenascin-C underneath metaplastic bronchiolar-type epithelium was also associated with shorter survival.

Differential diagnostic biomarkers

Cipriani et al. investigated CTD-UIP and IPF to evaluate the count and area of both FF and lymphocyte aggregates (LAs) (Table 2). They found that FF counts and areas were lower, and LA counts and areas greater, in the patients with CTD-UIP, although the differences did not quite reach statistical significance. The only marked difference was observed in NSIP features, which were more prevalent in CTD-UIP than in idiopathic UIP [25]. NSIP and IPF cases were evaluated for levels of the chemokine receptors CXCR3 and CCR4. The number of CXCR3-positive lymphocytes was significantly greater in NSIP patients than in IPF patients. The number of CCR4-positive lymphocytes was significantly lower in NSIP than in IPF, and thus, the CXCR3 to CCR4 ratio was significantly elevated in the NSIP patients [26].

Table 2 Examples of studies focusing on differential diagnostics between IPF and NSIP or CTD-ILD

In the study of Terasaki and co-authors, the levels of epimorphin protein and mRNA expression were significantly higher in patients with NSIP than in those with UIP [27]. Nakashima et al. evaluated IPF and NSIP cases for signaling molecules associated with tumor protein p53-mediated apoptosis [28]. Western blotting revealed that the expression of p53, phosphorylated p53, and mouse double minute 2 homolog (Mdm2) protein was significantly higher in IPF and NSIP than in the controls. The numbers of cells positive for the p53, phosphorylated p53, Mdm2, and apoptosis regulator Bax proteins as well as the number of TUNEL-positive cells were higher in IPF than in NSIP. Suga et al. investigated cases with IPF, NSIP, and bronchiolitis obliterans organizing pneumonia (BOOP) for various matrix metalloproteinases (MMPs) and the specific tissue inhibitors of metalloproteinases (TIMPs) [29]. Intense expression of MMP-9 in metaplastic epithelial cells was a special characteristic of UIP.

Studies focusing on omics techniques

Zuo et al. conducted a microarray analysis of lung specimens from patients with IPF and CTD-UIP (Table 3). A marked increase in the expression of genes encoding muscle proteins was observed. The expression of genes encoding proteins associated with cell contraction and actin filament organization was also increased, as was that of genes encoding collagens I, III, and VI, tenascin-C, osteopontin, and fibronectin [30]. Selman et al. investigated lung tissues from patients with IPF, hypersensitivity pneumonitis (HP), and NSIP by microarray [31]. IPF cases were enriched for genes involved in development, extracellular matrix structure and turnover, and cellular growth and differentiation. Several epithelium-related genes were also upregulated in IPF lungs.

Table 3 Omic studies using lung tissue in IPF research

Yang et al. profiled lung tissue from patients with sporadic pulmonary fibrosis and patients with familial pulmonary fibrosis. The expression of genes involved in ECM turnover and of genes encoding ECM structural constituents, proteins involved in ECM degradation, and cell adhesion molecules was increased. Most of the genes that were differentially expressed in familial IIP belonged to the same functional categories as those that distinguished IIP from control samples, but they were over- or under-expressed to a greater extent in familial IIP than in all cases of IIP [32].

Konishi et al. evaluated lungs from patients with stable IPF and patients with acute exacerbation of IPF (IPF-AEx) by microarrays. When compared with control samples, the global gene expression patterns of IPF-AEx were almost identical to those of stable IPF. In the direct comparison of IPF-AEx and stable IPF, the differentially expressed genes included those related to stress responses, such as heat shock proteins and α-defensins, as well as mitosis-related genes including histones and cyclin-A2 protein (CCNA2) [33]. The study of Boon and co-authors compared stable or slowly progressing IPF patients with those suffering from progressive IPF. About 100 transcripts were upregulated in the progressive group [34].

Yang et al. conducted a microarray analysis of 119 IPF cases and 50 controls. There was elevated expression of the cilium genes associated with microscopic honeycombing as well as higher expression of MUC5B and MMP-7. Two novel subtypes of IPF/UIP could be defined by the expression of cilium-associated genes [35]. Patients with high cilium gene expression demonstrated more microscopic honeycombing, but not more FF, and displayed elevated tissue expression of MUC5B and MMP-7.

A proteome analysis of explanted lungs from IPF patients showed that many proteins upregulated in IPF fell into the related categories of unfolded protein response (UPR), endoplasmic reticulum (ER) stress, proteasome, degradation, and general cell stress response [36]. Subsequently, the same researchers evaluated IPF and fibrotic NSIP patients with proteomics and noted that the majority of the proteins upregulated in IPF and NSIP fell into related categories such as chaperone/protein folding, protein processing, energy generation/glycolysis, and antioxidant function [37].

Patel et al. studied IPF patients with pulmonary hypertension (PH-IPF) and IPF patients without PH (NPH-IPF). Comparison of PH-IPF arterioles with NPH-IPF arterioles yielded no separation between the two groups. When gene expression of the combined IPF samples was compared with that of the controls, a total of 255 genes were differentially expressed in IPF arterioles [38]. In a gene expression microarray study of DePianto et al., microscopic pathological heterogeneity in IPF lung tissue corresponded to expression patterns related to bronchiolization and lymphoid aggregates [39]. Recently, researchers identified 2,130 differentially methylated regions in IPF, of which 738 were associated with significant changes in gene expression [40].

Discussion

The current guidelines recommend that in the diagnostics of IPF, high resolution computed tomography (HRCT) findings be classified into three categories, i.e., 1) UIP, 2) possible UIP, or 3) not UIP [41]. If other diseases manifesting as UIP are excluded, and if typical UIP is revealed in HRCT, then the diagnosis of IPF can be established on the basis of clinical and radiological investigations. When HRCT is categorized as possible UIP or not UIP, examination of a surgical lung biopsy is needed to make a reliable diagnosis, which is a multidisciplinary process and needs to be based on input from clinicians, radiologists, and pathologists [41].

The recent workshop on IPF emphasized three research foci, i.e., 1) alveolar injury, 2) cellular origins of myofibroblasts, and 3) role of stem/progenitor cells [42]. The present hypothesis of the pathogenesis of IPF holds that injured and hyperplastic alveolar epithelia containing dysfunctional type II alveolar epithelial cells (AECII) release factors that cause proliferation of fibroblasts and myofibroblasts and deposition of ECM [43]. The pathogenesis of IPF resembles abnormal wound healing, a process involving many cell biological mechanisms including TGF-β and cellular stress [44]. The origin of the myofibroblasts in IPF is not yet entirely clear, although several sources have been proposed, e.g., that myofibroblasts arise from resident tissue fibroblasts and bone marrow-derived cells, by epithelial mesenchymal transition (EMT) from epithelium, endothelium and pericytes, or from fibrocytes [43].

The levels of SP-A in alveolar epithelium have been shown to be lower in UIP patients dying from progressive disease than in patients with stable disease, a finding which may reflect dysfunction of AECII [19]. The theory of epithelial damage has received further support from a study that used proteomics to reveal that many of the upregulated proteins in IPF belonged to the related categories of UPR and ER stress [36]. The proteomic finding was also confirmed by immunohistochemistry, which showed that UPR markers were encountered in AECII cells in IPF but not in control cells [36]. The discoveries at the protein level are supported by a transcriptomic investigation, which showed that several epithelium-related genes were upregulated in IPF lungs [31]. A recent study observed that the levels of cilium genes and MUC5B were increased in IPF and that cilium gene expression was associated with microscopic honeycombing, suggesting that not only alveolar but also bronchiolar epithelial alteration may have a role in the pathogenesis of IPF [35].

TGF-β-induced enhancement of ECM plays a fundamental role in the pathogenetic theories of IPF. Gremlin, an antagonist of BMP, which is a member of the TGF-β superfamily, has been shown to associate with lung function parameters in IPF [20]. High expression of tenascin-C, an ECM protein, has been shown to correlate with shortened survival of the patients [24]. Surprisingly few other immunohistochemical studies have related ECM alterations to clinical endpoints, whereas several transcriptomic studies have confirmed that the expression of genes encoding various ECM proteins, including collagens, tenascin, osteopontin, and fibronectin, and of genes involved in ECM regulation is upregulated in IPF [30-32]. The amount of α-SMA, which is a marker of myofibroblasts, has been shown to associate negatively with patient survival in an immunohistochemical study [16], and increased expression of genes encoding myofibroblast-associated factors has been reported in microarray studies [30,31]. Moreover, a high number of FF in lung tissue, in which myofibroblasts and ECM proteins are localized, has been shown to correlate with shortened survival of patients with IPF [4].

Many immunohistochemical studies have detected changes in the numbers of inflammatory cells in the lung tissue of IPF patients, a finding which is not strongly supported by the current hypotheses. A recent study revealed that the number of inflammatory cells did not decrease during the progression of IPF, as previously assumed by many [15]. Moreover, the number of mast cells has been shown to correlate with the progression of the disease [18], CD3-positive lymphocytes with FEV1 and survival [21], and CD8-positive T lymphocytes with lung function changes [22]. Furthermore, histological heterogeneity in IPF lung tissue corresponded to gene expression patterns related to lymphoid aggregates [39].

During the past decade, research on IPF has focused on blood biomarkers, which is understandable since blood samples can be obtained and analyzed easily and repeatedly. At present, the most widely studied blood-derived biomarkers are KL-6, SP-A, and SP-D [3]. As outlined in a recent review, a molecular biomarker in IPF should be quick, non-invasive, inexpensive, easy to repeat, and ideally blood or urine based [45]. Some blood biomarkers have also been investigated in BAL and lung tissue, but so far, there is a paucity of publications that have simultaneously used blood and BAL or lung tissue samples. It could be hypothesized, however, that the most reliable blood-derived biomarker could be found on the basis of lung tissue analyses, with knowledge of its cell-specific localization and with an established association with the course and the phenotype of the disease. In addition to the routine clinical follow-up investigations, it would be important to incorporate standardized staging and predictor models, such as the GAP (gender, age, physiology) index, in any comparisons involving lung tissue biomarkers [46].

Conclusions

Studying lung tissue in IPF is not an easy task, because the key cell type or event in the pathogenesis of IPF is still unknown, which leads to discrepant analysis protocols in histological studies. Can we afford to ignore lung tissue samples when the pathogenesis of the disease and the characterization of its phenotypes still leave so many unanswered questions? In addition to supplementing lung tissue material with non-solid samples such as blood, urine, and BAL, the combination of multiple methods such as omics, immunohistochemistry, protein assays, and mRNA quantification together with clinically meaningful endpoints may well prove to be the most beneficial approach for discovering relevant non-invasive biomarkers. With this kind of approach, rapid and reliable blood-based biomarkers will hopefully become available within 10–15 years, not only for research purposes but also as routine procedures in clinical practice.