Background

Chronic obstructive pulmonary disease (COPD) is one of the most common chronic diseases, affecting over 10% of adults older than 65 years worldwide [1]. It is characterized by incompletely reversible airflow obstruction, which results in reduced quality of life and increased mortality. COPD is currently the fourth leading cause of death worldwide, according to the World Health Organization [2].

Some genetic factors, such as alpha-1 antitrypsin deficiency, and environmental exposures, such as cigarette smoking, are well-established contributors to COPD, yet the precise molecular mechanisms of the disease remain to be elucidated. Genome-wide gene expression profiling offers opportunities to develop new insights into the origins of COPD based on molecular features.

The airway epithelium is the initial site of exposure to cigarette smoke and reflects molecular changes associated with both cigarette smoke exposure and smoking-associated lung disease. We have previously demonstrated that the bronchial airway epithelium responds to cigarette smoke exposure with characteristic alterations in gene expression [3]. Building upon this observation, we identified a bronchial airway gene expression signature associated with COPD and disease severity that is similarly altered in COPD-affected lung tissue [4]. These data suggest that bronchial gene expression obtained by bronchoscopy may be applied to monitor disease activity in COPD. However, the relative invasiveness of this procedure precludes its use in large populations. We therefore characterized the upper airway genomic response to cigarette smoke exposure by profiling epithelial cells collected by brushing the inferior turbinate of the nose. Using paired bronchial and nasal airway samples obtained from the same healthy individuals, we showed that smoking-induced changes in bronchial airway gene expression are similarly altered in the nasal epithelium [5]. Therefore, nasal epithelial gene expression profiling could potentially function as a non-invasive biomarker in COPD.

In the present study, we investigated whether gene expression profiling of the nasal epithelium can distinguish between patients with and without COPD. Additionally, we assessed whether COPD-associated gene expression changes in nasal epithelium reflect those occurring in the bronchial epithelium.

Methods

The online supplement provides additional details and a flowchart of the identification cohort and comparator cohorts (online Additional file 1: Figure E1).

Identification cohort, sample collection and processing

The local ethical committees approved the studies and all included subjects gave their written informed consent. COPD patients were recruited in the University Medical Center Groningen (UMCG) and seven other participating hospitals in The Netherlands between 2011 and 2012. From this study, we only included current smokers aged 40 years and older with at least 10 pack years, a forced expiratory volume in 1 s (FEV1)/forced vital capacity (FVC) < 0.7 and an FEV1 < 60% predicted. Controls were recruited in the UMCG between 2009 and 2012, including only current smokers aged 40 years and older with normal pulmonary function (i.e. post-bronchodilator FEV1/FVC > lower limit of normal, absence of bronchial hyperresponsiveness and reversibility of FEV1 to salbutamol <10% predicted). Spirometry and body plethysmography measurements were performed according to international guidelines [6, 7]. Symptom scores were assessed using the Clinical COPD Questionnaire (CCQ) [8].

Nasal epithelial brushings were obtained using a soft cytology brush sampling the inferior turbinate of the nose [5, 9]. Total RNA was isolated and microarray hybridization to Affymetrix Human Gene 1.0 ST Arrays was performed.

Data normalization, preprocessing and analysis

Statistical analyses were performed using R version 3.3.2. The quality of the microarray hybridization was assessed as previously described [4]. Nasal gene expression profiles associated with COPD were identified using a linear regression model with the log2-transformed expression of each gene as dependent variable, and COPD, age, gender, packyears, RNA integrity number (RIN), and the first 4 principal components (PCs) as independent variables. We adjusted for PCs to reduce technical variation in the microarray data (online Additional file 1). A Benjamini-Hochberg False Discovery Rate (FDR) procedure was applied to account for multiple testing, with an FDR < 0.01 indicating statistical significance [10].
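The multiple-testing step described above can be illustrated with a minimal sketch of the Benjamini-Hochberg adjustment. The analyses in this study were run in R; the Python function below, its name, and the toy p-values are illustrative assumptions and are not taken from the study data or code.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-gene p-values for the COPD term of the model (not study data).
toy_p = [0.0001, 0.003, 0.03, 0.2, 0.8]
q = benjamini_hochberg(toy_p)
# Genes passing the FDR < 0.01 threshold used in this study.
significant = [i for i, value in enumerate(q) if value < 0.01]
```

In practice the input would be one p-value per gene for the COPD coefficient of the regression model, with the covariates listed above already included in the fit.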

Comparison of COPD-associated gene expression changes in the nasal and bronchial epithelium

We compared nasal COPD-associated gene expression with bronchial COPD-associated gene expression in two independent datasets of bronchial brushings (comparator cohort 1 and comparator cohort 2 as described below), using Gene Set Enrichment Analysis (GSEA) version 2.2.4 [11]. The local ethical committees approved the two studies and all included subjects gave their written informed consent.

  • Comparator cohort 1 was a previously published dataset of bronchial airway gene expression in current and former smokers with and without moderate-to-severe COPD (GSE37147) [4].

  • Comparator cohort 2 was an independent cohort of current and former smokers with and without moderate-to-severe COPD who participated in a previous study in the UMCG [12, 13]. COPD was defined as FEV1/FVC ≤ 0.7. Bronchial brushes were taken during bronchoscopy and RNA was isolated and processed as described in the online supplement.

We performed GSEA for genes associated with COPD in nasal and bronchial epithelium. First, we compared COPD-associated nasal gene expression with COPD-associated bronchial gene expression. To this end, bronchial genes were ranked according to the strength of their association with COPD (t-value), and compared to significantly COPD-associated up- and downregulated genes in nasal epithelium.
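The ranking-and-comparison step can be sketched as the running-sum enrichment score that underlies GSEA. This is a simplified illustration, not the GSEA software used in the study: the function name, the weighting default, and the toy genes and t-values are assumptions, and the permutation-based FDR that GSEA reports is omitted.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Running-sum enrichment score for one gene set.

    ranked_genes: gene IDs sorted from most positive to most negative score.
    scores: ranking statistic (e.g. t-value) for each gene, in the same order.
    gene_set: set of gene IDs to test against the ranked list.
    Returns (enrichment score, index of the maximum deviation from zero).
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must overlap with, but not cover, the ranked list")
    # Normalizing constant: total weighted score of the genes in the set.
    hit_norm = sum(abs(s) ** weight for s, h in zip(scores, hits) if h)
    running, best, best_idx = 0.0, 0.0, -1
    for i, (s, h) in enumerate(zip(scores, hits)):
        if h:
            running += abs(s) ** weight / hit_norm  # step up at a set member
        else:
            running -= 1.0 / n_miss                 # step down otherwise
        if abs(running) > abs(best):
            best, best_idx = running, i
    return best, best_idx


# Hypothetical bronchial ranking (t-values) and nasal gene sets, not study data.
bronchial_genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
bronchial_t = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
es_up, peak_up = enrichment_score(bronchial_genes, bronchial_t, {"g1", "g2"})
es_down, peak_down = enrichment_score(bronchial_genes, bronchial_t, {"g5", "g6"})
```

A positive score indicates that the nasal gene set clusters among genes upregulated in bronchial epithelium, and a negative score that it clusters among downregulated genes.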

Pathway analysis

We compared COPD-associated nasal and bronchial gene expression to gene sets generated from the Kyoto Encyclopedia of Genes and Genomes (KEGG). An FDRGSEA < 0.25 was considered statistically significant. In addition, to identify more specific biological processes, we performed Gene Ontology enrichment analyses (GOrilla) on COPD-associated genes exhibiting differential expression in nasal samples.

Results

Study populations

The identification cohort consisted of 31 COPD patients and 22 controls with nasal epithelial brushes. The comparator cohorts contained in total 97 COPD patients and 171 controls with bronchial epithelial brushes. Clinical characteristics are presented in Table 1.

Table 1 Characteristics of the identification cohort (nasal epithelial brushes) and comparator cohorts (bronchial epithelial brushes)

COPD-associated genes in nasal epithelium

Nasal epithelial gene expression levels of 135 genes were significantly altered in individuals with COPD versus controls (FDR <0.01; 21 upregulated and 114 downregulated). The top-10 most significantly upregulated genes were SHROOM1, STARD13, CMTM1, SHC4, RRBP1, MUC1, ARHGEF16, TEP1, TRIM3 and GPRC5C. The top-10 most significantly downregulated genes were NPHP1, CFAP206, C11orf70, CCDC113, CSE1L, FAM83B, LTV1, GMNN, SERPINB5 and AKAP14. Figure 1 shows a heatmap of all 135 significantly differentially expressed genes between COPD patients and controls, while Table E1 in online Additional file 1 presents all 135 differentially expressed genes.

Fig. 1

Heatmap of gene expression significantly associated with COPD status. Between COPD and controls, 135 genes were significantly differentially expressed: 114 genes were significantly down- and 21 genes were significantly upregulated in COPD (FDR < 0.01)

COPD patients were recruited at multiple study sites within the Netherlands, whereas controls were recruited in the UMCG only. To exclude the possibility that our results were due to study site-specific factors in nasal epithelial sampling, we performed a sub-analysis comparing nasal epithelial samples from individuals with COPD recruited in the UMCG (n = 12) with those from the non-COPD controls also recruited in the UMCG (n = 22). We also compared individuals with COPD recruited at other study sites (n = 19) with the non-COPD controls recruited in the UMCG. We used GSEA to compare COPD-associated gene expression in non-UMCG COPD patients with COPD-associated gene expression in UMCG COPD patients. GSEA showed that the COPD-associated gene expression profiles were comparable between UMCG and non-UMCG COPD patients (FDRGSEA < 0.001; online Additional file 1: Figure E2). In line with these GSEA results, plots of the principal component analysis including only COPD patients revealed an even distribution of subjects across all study recruitment sites (online Additional file 1: Figure E3), indicating that the study site where the nasal epithelial brushes were collected did not have a detectable effect on gene expression levels.

Comparison of COPD-associated gene expression in the nasal and bronchial epithelium

We assessed whether COPD-associated gene expression in the nasal epithelium is related to the COPD-associated gene expression changes in the bronchial epithelium. To this end, we explored the direct overlap between COPD-associated genes in nasal and bronchial epithelium in comparator cohort 1. Of the 21 genes significantly upregulated with COPD in nasal epithelium (FDR < 0.01), 9 were also significantly upregulated with COPD in bronchial epithelium (FDR < 0.01). Of the 114 genes significantly downregulated with COPD in nasal epithelium, 19 were also significantly downregulated with COPD in bronchial epithelium (FDR < 0.01; online Additional file 1: Table E2). By chance, we would have expected an overlap of 1 gene at an FDR < 0.01, yet we identified 28 overlapping genes in nasal and bronchial epithelium (chi-square test, p < 0.01), providing suggestive evidence that our findings are not merely due to chance.
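The chance-overlap reasoning can be made concrete with a small worked example. The sketch below assumes the two gene lists are independent draws from a common background. The 135 nasal genes and the 28 observed overlaps come from the text, whereas the background size of 20,000 genes and the bronchial list size of 150 are hypothetical values chosen only for illustration. A hypergeometric tail is used here in place of the chi-square test reported above, and the direction of change is ignored.

```python
from math import comb


def expected_overlap(n_total, n_list_a, n_list_b):
    """Expected size of the overlap if both lists are independent random subsets."""
    return n_list_a * n_list_b / n_total


def hypergeom_tail(n_total, n_list_a, n_list_b, k_observed):
    """P(overlap >= k_observed) when list A is drawn at random from the background."""
    upper = min(n_list_a, n_list_b)
    denom = comb(n_total, n_list_a)
    numer = sum(
        comb(n_list_b, k) * comb(n_total - n_list_b, n_list_a - k)
        for k in range(k_observed, upper + 1)
    )
    return numer / denom


N_BACKGROUND = 20000   # hypothetical number of genes measured on both platforms
N_NASAL = 135          # from the text: nasal genes at FDR < 0.01
N_BRONCHIAL = 150      # hypothetical size of the bronchial gene list
OBSERVED = 28          # from the text: overlapping genes

expected = expected_overlap(N_BACKGROUND, N_NASAL, N_BRONCHIAL)
p_value = hypergeom_tail(N_BACKGROUND, N_NASAL, N_BRONCHIAL, OBSERVED)
```

Under these assumed list sizes the expected overlap is close to one gene, so an observed overlap of 28 would be far out in the tail of the chance distribution.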

Next, we used GSEA to compare COPD-associated nasal epithelial gene expression with COPD-associated bronchial epithelial gene expression in the 2 independent comparator cohorts of subjects with and without COPD. Genes significantly upregulated in nasal epithelium of individuals with COPD were significantly enriched among genes upregulated in bronchial epithelium in COPD in both cohorts (FDRGSEA < 0.001, Fig. 2a and b). Similarly, genes significantly downregulated in nasal epithelium in COPD were significantly enriched among genes downregulated in bronchial epithelium in COPD (FDRGSEA < 0.001, Fig. 2c and d).

Fig. 2

Gene set enrichment analysis showing that nasal gene expression associated with COPD resembles bronchial gene expression. The colored bars represent the ranked t-values of the association of bronchial gene expression with COPD of ~20,000 genes: red represents a positive association whereas blue represents a negative association with COPD. The black vertical lines each represent a significantly differentially expressed gene in nasal epithelium, which are ordered across the ranked bronchial genes. The height of the black lines represents the running enrichment scores of the gene set enrichment analysis. Significantly differentially expressed genes at an FDR cut-off of <0.01 are shown. a Upregulated genes in nasal epithelium (n = 21) were significantly enriched among upregulated genes in bronchial epithelium in cohort 1, b Upregulated genes in nasal epithelium (n = 21) were significantly enriched among upregulated genes in bronchial epithelium in cohort 2, c Downregulated genes in nasal epithelium (n = 114) were significantly enriched among downregulated genes in bronchial epithelium in cohort 1, d Downregulated genes in nasal epithelium (n = 114) were significantly enriched among downregulated genes in bronchial epithelium in cohort 2

Similarities in pathway enrichment among COPD-associated gene expression in nasal and bronchial epithelium

We identified 6 KEGG pathways that were significantly enriched (FDRGSEA ≤ 0.25) in the nasal epithelium from the identification cohort and in the bronchial epithelium of both comparator cohorts (Table 2). These included 2 pathways enriched among genes upregulated in COPD (O-glycan biosynthesis and glycosphingolipid biosynthesis – lacto and neolacto series) and 4 pathways enriched among genes downregulated in COPD (RNA degradation, DNA replication, propanoate metabolism and tight junction) (online Additional file 1: Tables E3 and E4). Additionally, gene ontology enrichment analyses confirmed significant enrichment of downregulated genes involved in cell cycle and translation, such as negative regulation of G2/M transition of mitotic cell cycle and translational initiation, as well as pathways involved in ciliary function (online Additional file 1: Table E5). For upregulated genes in nasal epithelium, no significant biological processes were identified. These results may reflect a slower rate of epithelium renewal and ciliary dysfunction in COPD.

Table 2 Common KEGG pathways and leading-edge genes associated with COPD identified by GSEA (FDRGSEA ≤ 0.25) in nasal and bronchial epithelium

We aimed to identify genes that play an important role in these 6 KEGG pathways in nasal and bronchial epithelium. Using GSEA, we identified the leading-edge subset of genes (online Additional file 1: Figure E4) for each of the pathways significantly enriched in the three cohorts, and took the overlap of the leading-edge genes from these analyses that were also significantly differentially expressed both in nasal epithelium (p-value < 0.05) and in bronchial epithelium (cohort 1; FDR < 0.05) (online Additional file 1: Tables E6 and E7). Fifteen genes fulfilled the above criteria, suggesting they may play a role in COPD (Table 2).
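The gene selection described above amounts to a set intersection followed by two significance filters. The sketch below illustrates this for a single pathway; the function name, gene identifiers and statistics are hypothetical placeholders, not study data.

```python
def shared_leading_edge(leading_edges, nasal_p, bronchial_fdr,
                        p_cutoff=0.05, fdr_cutoff=0.05):
    """Genes in every cohort's leading edge that also pass both cut-offs.

    leading_edges: one iterable of gene IDs per cohort for a single pathway.
    nasal_p: gene ID -> nominal p-value in nasal epithelium.
    bronchial_fdr: gene ID -> FDR in bronchial epithelium (comparator cohort 1).
    """
    shared = set.intersection(*(set(edge) for edge in leading_edges))
    return sorted(
        gene for gene in shared
        # Genes without a statistic are treated as non-significant.
        if nasal_p.get(gene, 1.0) < p_cutoff
        and bronchial_fdr.get(gene, 1.0) < fdr_cutoff
    )


# Hypothetical leading-edge subsets for one pathway in the three cohorts.
edges = [
    ["GENE_A", "GENE_B", "GENE_C"],   # nasal identification cohort
    ["GENE_A", "GENE_B", "GENE_D"],   # bronchial comparator cohort 1
    ["GENE_A", "GENE_B", "GENE_E"],   # bronchial comparator cohort 2
]
toy_nasal_p = {"GENE_A": 0.01, "GENE_B": 0.20}
toy_bronchial_fdr = {"GENE_A": 0.03, "GENE_B": 0.01}
selected = shared_leading_edge(edges, toy_nasal_p, toy_bronchial_fdr)
```

Repeating this per pathway and pooling the results would give the final list of candidate genes.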

Discussion

We performed gene expression profiling of the nasal epithelium in smoking individuals with and without COPD and identified a nasal epithelial gene expression profile associated with COPD. Furthermore, we provide suggestive evidence that the COPD-associated nasal gene expression profile overlaps with the COPD-associated bronchial gene expression profile, supporting the concept of a ‘united airway field of injury’. These findings suggest that sampling of nasal epithelium may be useful to study underlying mechanisms of COPD and to develop easily accessible non-invasive biomarkers for disease classification, prognosis and therapeutic monitoring.

There is a high need for a non-invasive biomarker that provides information on disease activity of respiratory diseases, but studies addressing this are limited. A recent study by Obeidat et al. explored expression of 127 emphysema-related genes in lung tissue, bronchial epithelium and peripheral blood in relation to lung function (FEV1 and FEV1/FVC) [14]. Of the 40 genes associated with lung function in peripheral blood (FDR 0.1), 29 (73%) were overlapping with lung function-associated genes in lung tissue. Of these 29 genes, 4 (13%) had a similar direction in blood and lung tissue. Additionally, they found 13 (33%) lung function-associated genes overlapping between peripheral blood and bronchial epithelium, of which 8 (20%) exhibited changes in a similar direction. Of interest, Obeidat et al. used the same bronchial epithelium cohort as investigated in our study [4]. We found that out of 135 genes associated with COPD in nasal epithelium, 28 genes (21%) overlapped with genes associated with COPD in bronchial epithelium in a similar direction (FDR < 0.01), which is in line with the results of Obeidat et al. Nevertheless, there are some considerable differences between the studies. In the first place, our study reports results of a genome-wide gene expression analysis to identify COPD-associated genes, while Obeidat et al. conducted a targeted analysis of 127 genes known to be related to emphysema in lung tissue. The targeted approach might have increased the chance of finding differentially expressed genes, but, on the other hand, it disregards expressed genes that are related to features of COPD other than emphysema, such as small airways disease and bronchitis. Additionally, we investigated genes associated with COPD as a dichotomous outcome, while Obeidat et al. did not investigate COPD as a disease entity, but investigated FEV1 and FEV1/FVC as continuous measures. The genes identified by Obeidat et al. are therefore more likely to be related to the severity of airway obstruction, while the genes identified in our study are likely to be related to COPD in general. Of importance, we applied a stringent FDR of 0.01 while the former study applied an FDR of 0.1, which might have resulted in more false-positive findings among genes identified in peripheral blood. Also, since smoking status significantly affects gene expression, we only selected current smokers for our gene expression analysis in nasal epithelium, while Obeidat et al. investigated both current and ex-smokers. In conclusion, gene expression in both nasal epithelium and peripheral blood appears promising as a biomarker in COPD. Future studies should directly compare gene expression in nasal epithelium and blood in relation to COPD, with special attention to a similar study design. These studies will be necessary to determine which of these tissues is most suitable to serve as a biomarker in respiratory diseases.

A number of the genes identified in the present study have been previously considered as candidates that associate with features of COPD, such as MUC1, CREB3L1, DSP, NPHP1, CFAP206 and CCDC113 [15,16,17,18,19,20,21,22,23,24,25,26,27,28]. Among the top-10 significantly upregulated genes in nasal epithelium of COPD patients is Mucin 1 (MUC1). MUC1 is a membrane-associated mucin which is expressed in nearly all human glandular epithelial cells including the airway epithelium [15]. It is thought that MUC1 exerts anti-inflammatory effects upon infection with pathogens by suppressing pro-inflammatory cytokines such as Tumor Necrosis Factor-alpha (TNF-α) [16, 17]. Of interest, plasma and sputum levels of KL-6, a glycoprotein classified as a human MUC1 mucin, are higher in COPD patients than in healthy smokers and non-smokers [18], which is in line with our finding that MUC1 is upregulated in nasal epithelium of COPD patients. Additionally, cAMP Responsive Element Binding Protein 3-Like 1 (CREB3L1) is a transcription factor involved in the unfolded protein response during endoplasmic reticulum stress. It has been proposed that CREB3L1 has a central role in mucus production, since it is associated with expression of MUC5AC, an important mucin secreted by the airway epithelium [19]. Furthermore, CREB3L1 contributes to collagen-containing extracellular matrix production upon stimulation with transforming growth factor beta (TGF-β) [20]. Increased mucus production and airway remodeling are key features of COPD [21, 22]; therefore, upregulation of CREB3L1 can be envisaged to play a role in the pathophysiology of COPD.

Among the top-10 significantly downregulated genes in nasal epithelium in COPD are nephrocystin 1 (NPHP1), cilia and flagella associated protein 206 (CFAP206) and coiled-coil domain containing 113 (CCDC113). These 3 genes have in common that they are all involved in ciliogenesis or cilia function. For example, nephrocystin 1 is localized at the ciliary transition zone of respiratory cilia [23]. Next, CFAP206 has recently been identified as a player in cilium motility through its role in the assembly of the axonemal radial spokes [24]. Finally, CCDC113 is a centrosome-associated protein and has a function in cilia formation, as depletion of CCDC113 in retinal pigmented epithelial (RPE1) cells led to a reduction of cilium formation [25]. In COPD, ciliary function is impaired, which leads to decreased mucociliary transport [26]. We found cilia-associated genes to be downregulated in COPD, which might contribute to this process. Another COPD-associated downregulated gene in nasal epithelium is Desmoplakin (DSP), a gene recently discovered to be associated with COPD in a large GWAS [27]. Of interest, in the latter study COPD-linked variants in DSP were associated with decreased expression of DSP in lung tissue based on expression quantitative trait loci (eQTL) analyses, compatible with our results showing decreased expression in nasal epithelium of COPD patients. DSP is one of the important components of desmosomes, structures that are important in (epithelial) cell adhesion and barrier function [28]. Although studies on the functional role of DSP in COPD are lacking, one could speculate that downregulation of DSP and, therefore, decreased epithelial barrier function contributes to the mechanisms underlying COPD.

We identified 6 KEGG-pathways that are significantly enriched for genes associated with COPD in both nasal and bronchial epithelium. We identified 4 KEGG pathways that are universally and significantly downregulated in COPD: RNA degradation, DNA replication, propanoate metabolism and tight junction. Of interest, our gene ontology enrichment analysis confirmed downregulation of pathways involved in translation and cell cycle, such as translational initiation and negative regulation of G2/M transition of mitotic cell cycle. It has been demonstrated that cell proliferation is reduced in bronchial epithelial cells and fibroblasts of patients with emphysema compared to controls [29,30,31], which is in line with our results from the KEGG and gene ontology pathway analyses.

Two pathways are significantly upregulated in both nasal and bronchial epithelium: O-linked glycan biosynthesis and glycosphingolipid biosynthesis. O-linked glycan biosynthesis is a posttranslational process during which carbohydrates (glycans) are attached to proteins. Interestingly, mucin-type O-glycans are the most common form of glycans present in humans [32]. Mucins are highly glycosylated proteins, which form the core structure of mucus. It could be speculated that the observed upregulation of the O-linked glycan biosynthesis pathway in epithelial cells contributes to an increase in mucus production, one of the key features of COPD pathophysiology. Glycosphingolipids are membrane lipids (ceramide) to which a glycan is attached. They function as structural elements in the cell membrane and as mediators in cell-cell interaction [33]. Recent studies by our group have shown that concentrations of (glyco)sphingolipids are increased in sputum of COPD patients [34]. Another study describing gene expression in peripheral blood mononuclear cells of COPD patients showed that the sphingolipid metabolism pathway is significantly upregulated in individuals with COPD [35]. Additionally, higher blood (glyco)sphingolipid levels are associated with more extensive emphysema and more frequent exacerbations [36]. In combination with our novel findings in nasal epithelial cells, glycosphingolipids appear to play an important role in COPD, and their exact contribution to (sub)phenotypes of COPD and their potential as treatment targets need to be further unraveled.

Fifteen genes were of specific interest as they were associated with COPD in both nasal and bronchial epithelium and additionally contributed to significantly enriched pathways associated with COPD. Of the upregulated genes, fucosyltransferase 3 (FUT3) and fucosyltransferase 6 (FUT6) are involved in the modification of proteins and lipids, i.e. the attachment of fucose. Fucosylation is increasingly recognized as an important player in cell-cell communication [37, 38]. For example, TNF-α, a proinflammatory cytokine, increases expression of FUT3 and FUT6 in human bronchial mucosa, suggesting a role for fucosyltransferases in airway inflammation [39]. Of interest, expression of FUT3 and FUT6 is associated with the major airway mucin MUC5AC [19]. Of the downregulated genes, Replication Factor C-3 (RFC3) forms, together with its family members RFC1, RFC2, RFC4 and RFC5, a protein complex involved in the regulation of DNA replication and DNA damage repair. RFC complexes can function as checkpoints that delay DNA replication in case of DNA damage [40, 41]. Of interest, RFC1, RFC2 and RFC4 were significantly downregulated (nominal p-value < 0.05) in nasal epithelium as well; however, only RFC1 was also significantly downregulated in bronchial epithelium. Downregulation of RFC3 has been associated with lung, gastric and colorectal cancer [42, 43], but it might also play a role in COPD, as oxidative stress due to cigarette smoke induces DNA damage.

The strength of our study is the use of an identification cohort to determine a COPD-associated nasal gene expression signature and the verification of this signature in two independent cohorts. We selected only current smokers for the analyses, which avoids confounding of the results by differences in smoking status among patients and thus gives a cleaner reflection of COPD-associated gene expression. Furthermore, by selecting pathways and genes involved in all three cohorts, the probability is high that our findings reflect true biological mechanisms. A limitation of our study is the use of inhaled corticosteroids (ICS) among COPD patients, which was absent in healthy controls. Although ICS use may confound the primary analysis of nasal gene expression in this study, we addressed this issue by assessing whether COPD-associated nasal gene expression changes are similar to those observed in the bronchial airway using an independent dataset of 238 individuals (comparator cohort 1) where only a minority of individuals with COPD used ICS. Another possible limitation of our study is that we compared nasal epithelium of severe COPD patients with bronchial epithelium of moderate-to-severe COPD patients. It would be of added value to study matched nasal and bronchial samples of the same individual, in order to obtain the most appropriate comparison between nasal and bronchial gene expression. These data were not available in our study. However, despite the dissimilarity in COPD severity of the patients with either nasal or bronchial samples, we still found overlap in gene expression between the nose and the bronchus, suggesting a ‘shared’ gene expression signature that is relevant to COPD, regardless of disease severity. It is possible that distinct COPD phenotypes, such as emphysema, small airway disease and chronic bronchitis, have their own specific gene expression changes next to this shared signature, which needs to be explored in future research.

Conclusion

In conclusion, we demonstrate that the nasal epithelium is a suitable site to detect COPD-associated gene expression alterations. Of interest, we show supportive evidence that this nasal epithelial gene expression profile overlaps with COPD-associated bronchial epithelial gene expression. Our findings support the hypothesis that the upper and lower airways have a shared COPD-associated gene expression profile, a so-called ‘united airway field of injury’. Thus, nasal gene expression profiling, which is feasible given the ease of access to nasal epithelium, provides the opportunity to explore other applications in future research, such as the diagnosis of distinct molecular phenotypes of COPD and monitoring of disease progression and interventions.