Introduction

Emphysema and chronic bronchitis are important features of chronic obstructive pulmonary disease (COPD), and recent studies show that imaging features are associated with adverse clinical outcomes of COPD1. Computed tomography (CT) evidence of emphysema is associated with worse prognosis even among subjects without airflow obstruction2,3. However, there is a lack of specific emphysema prevention and treatment techniques in part due to a limited understanding of pathobiological mechanisms.

Smoking is a major risk factor for emphysema, but emphysema is identified in non-smokers and the severity of emphysema among smokers varies greatly4,5,6. Emphysema has a genetic component with significant heritability and a previous study estimated its heritability to be approximately 25%7,8. Previous reports have described genetic determinants of emphysema and airway phenotypes in smokers with or without COPD9, emphysema in the general population10, distinct local histogram emphysema pattern11, and emphysema distribution12.

However, most emphysema genome-wide association studies (GWAS) were performed either exclusively or predominantly in European ancestry individuals. The Korean Obstructive Lung Disease (KOLD) and COPD in dusty areas (CODA) cohorts were established in South Korea and collected CT imaging and blood samples, enabling assessment of genetic associations with CT features. We investigated genetic determinants of emphysema severity in these Korean cohorts and sought to relate them to gene expression and DNA methylation.

Methods

Study sample

Blood samples from 1056 subjects in the KOLD and CODA cohorts were genotyped. KOLD is a prospective cohort that recruited COPD patients from 16 university hospitals in South Korea. CODA is a prospective cohort of subjects with airflow limitation and healthy volunteers living in dusty areas near cement plants in the Kangwon and Chungbuk provinces of South Korea. Details of the cohorts were described in previous papers13,14. Written informed consent was given by each participant. This study received ethical approval from the Kangwon National University Hospital IRB (KNUH 2012-06-007) and the Asan Medical Center IRB (Approval No. 2005-0345). This study conformed to the tenets of the Declaration of Helsinki.

Computed tomographic measurements

In the KOLD cohort, CT measurements were obtained using a 16-channel multidetector row CT scanner (SOMATOM Sensation; Siemens Medical Systems, Erlangen, Germany). In the CODA cohort, CT measurements were obtained using a dual-source CT scanner (SOMATOM Definition; Siemens Healthcare, Forchheim, Germany). In both studies, all subjects were scanned at full inspiration in a supine position. Emphysema was calculated as the percent of lung area at or below the −950 HU threshold and was log-transformed.
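
As an illustration of the emphysema measure described above, the following minimal sketch computes the percentage of lung samples at or below −950 HU and its natural logarithm. It is not the software used in the cohorts; the function names and the flat list of HU values are assumptions made for the example, and how the study handled an index of zero is not stated, so the sketch simply raises an error in that case.

```python
import math

def emphysema_index(lung_hu, threshold=-950):
    """Percent of lung samples with attenuation at or below `threshold` (HU).

    `lung_hu` is a hypothetical flat list of HU values from segmented lung.
    """
    if not lung_hu:
        raise ValueError("no lung samples supplied")
    low = sum(1 for hu in lung_hu if hu <= threshold)
    return 100.0 * low / len(lung_hu)

def log_emphysema_index(lung_hu, threshold=-950):
    """Natural log of the emphysema index (undefined when the index is 0)."""
    index = emphysema_index(lung_hu, threshold)
    if index <= 0:
        # Assumption: the handling of zero values is not described in the text.
        raise ValueError("log transform undefined for an index of 0")
    return math.log(index)
```

For example, `emphysema_index([-1000, -950, -900, -800])` counts two of four samples at or below the threshold.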

Genotyping and quality control

We genotyped all subjects on the Axiom KoreanChip 1.0 platform15. Using Affymetrix Power Tools and PLINK, we applied quality control filters for low variant call rate, excessive heterozygosity, singletons, gender discrepancy, Hardy–Weinberg equilibrium P < 0.001, and minor allele frequency < 0.01. After quality control, 586,966 SNPs with a minor allele frequency of 1% or more remained (Fig. 1). Genotype imputation was performed at the Michigan Imputation Server using the HRC r1.1 reference panel. After imputation, SNPs with low imputation quality (R² < 0.8) or minor allele frequency < 0.01 were removed. We combined the original genotype data and the imputed data, and 5,992,248 SNPs were finally analyzed in the GWAS. SNP locations were based on National Center for Biotechnology Information (NCBI) Build 37.
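
To make the two numeric SNP filters concrete, the sketch below applies a minor allele frequency cutoff of 0.01 and a Hardy–Weinberg P cutoff of 0.001 to genotype counts. It is only an illustration: the study used Affymetrix Power Tools and PLINK, whereas this example substitutes a simple one-degree-of-freedom chi-square test for Hardy–Weinberg equilibrium, and all function names are hypothetical.

```python
import math

def maf(n_aa, n_ab, n_bb):
    """Minor allele frequency from counts of the three genotypes."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    return min(p, 1 - p)

def hwe_chi2_p(n_aa, n_ab, n_bb):
    """Chi-square (1 df) P-value for departure from Hardy-Weinberg equilibrium.

    Assumption: a chi-square approximation stands in for the test PLINK runs.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        return 1.0  # monomorphic SNP: nothing to test
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of chi-square with 1 df.
    return math.erfc(math.sqrt(chi2 / 2))

def passes_qc(n_aa, n_ab, n_bb, maf_min=0.01, hwe_p_min=0.001):
    """Keep a SNP only if it clears both thresholds quoted in the text."""
    return maf(n_aa, n_ab, n_bb) >= maf_min and hwe_chi2_p(n_aa, n_ab, n_bb) >= hwe_p_min
```

A SNP with no heterozygotes despite a common minor allele, such as counts of 50, 0 and 50, fails the Hardy–Weinberg filter in this sketch.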

Figure 1 Study workflow.

Genome-wide association analysis

We performed linear regression on the natural log-transformed emphysema index, adjusted for age, sex, smoking status (never smoker, ex-smoker and current smoker), pack-years of smoking, and study center, using PLINK (version 1.9)16. We also performed analyses in COPD patients only. We defined genome-wide significance as P < 5 × 10⁻⁸ and defined 'candidate' markers of interest at P < 5 × 10⁻⁶. Local association plots were generated for 600 kb in either direction of the lead SNPs using LocusZoom with the Asian reference (hg19/1000 Genomes Nov 2014 ASN)17. Recombination rates were obtained from HapMap Phase II data18.
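
The per-SNP model described above can be sketched as an ordinary least-squares fit of the log emphysema index on allele dosage plus covariates. This is a minimal illustration rather than PLINK's implementation: it returns only the SNP coefficient (PLINK also reports standard errors and P-values), and the function names and the generic covariate rows are assumptions for the example.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def snp_effect(log_ei, dosage, covariates):
    """OLS coefficient of `dosage` in log_ei ~ intercept + dosage + covariates.

    `covariates` holds one row of numeric values per subject (hypothetical
    stand-ins for age, sex, smoking status, pack-years and study center).
    """
    rows = [[1.0, float(g)] + [float(c) for c in cov]
            for g, cov in zip(dosage, covariates)]
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, log_ei)) for i in range(k)]
    return solve(xtx, xty)[1]
```

Categorical covariates such as smoking status or study center would need to be expanded into indicator columns before being passed in.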

Epigenetic marks and DNase I hypersensitivity regions in top SNPs

Potential functional information on SNPs in high linkage disequilibrium (LD) (r² > 0.8, 1000 Genomes Phase 1, Asian population as reference) with the top associated SNPs was obtained using the HaploReg database (version 4.1) and the single-tissue expression quantitative trait loci (eQTL) data from the Genotype-Tissue Expression (GTEx) consortium (lung and whole blood)19,20. The P-value threshold for a significant eQTL was set at 5 × 10⁻⁴ in the GTEx database.
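
For readers unfamiliar with the r² > 0.8 criterion, the sketch below shows one common approximation of pairwise LD: the squared Pearson correlation between the allele dosages of two SNPs. This is an assumption made for illustration only. HaploReg reports LD precomputed from 1000 Genomes reference data, and the function name is hypothetical.

```python
def r_squared(dosage_a, dosage_b):
    """Squared Pearson correlation between two SNPs' allele dosages (0, 1, 2).

    Assumption: genotype correlation is used as a stand-in for
    haplotype-based r-squared.
    """
    n = len(dosage_a)
    if n != len(dosage_b) or n < 2:
        raise ValueError("need two dosage vectors of equal length >= 2")
    mean_a = sum(dosage_a) / n
    mean_b = sum(dosage_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(dosage_a, dosage_b))
    var_a = sum((a - mean_a) ** 2 for a in dosage_a)
    var_b = sum((b - mean_b) ** 2 for b in dosage_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("monomorphic SNP: correlation undefined")
    return cov * cov / (var_a * var_b)

def in_high_ld(dosage_a, dosage_b, threshold=0.8):
    """True when the approximate r-squared exceeds the threshold from the text."""
    return r_squared(dosage_a, dosage_b) > threshold
```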

Epigenome-wide association study (EWAS) related to emphysema

We performed EWAS of emphysema in blood DNA from 100 CODA subjects using the Infinium HumanMethylation450 platform. EWAS methods have been previously described in a study of emphysema index21.

Comparison with previously published results

We performed look-ups of top SNPs in previously reported quantitative emphysema GWAS results which combined four study populations (COPDGene, ECLIPSE, National Emphysema Treatment Trial/Normative Aging Study, and GenKOLS)9. We also investigated whether the nearest genes of top SNPs were reported in a previous differentially expressed gene (DEG) analysis related to emphysema22.

Results

Baseline characteristics of study population

Baseline characteristics of study subjects are shown in Table 1. The mean age of all subjects was 71 years and 83.9% of all subjects were ever-smokers. Of the 548 subjects with quantitative CT emphysema data, 514 were COPD patients, defined by a post-bronchodilator FEV1/FVC below 0.7, and 34 had normal spirometry (Fig. 1).

Table 1 Characteristics of subjects included in Genome-Wide Association Study.

GWAS in all subjects and COPD patients only

The Q–Q plot did not show systematic inflation in GWAS test statistics (lambda value 0.99 in both all subjects and COPD patients; Supplementary Fig. S1). rs117084279 (MAF, 0.021), near the PIBF1 gene, was genome-wide significant in all subjects (Table 2, Fig. 2). Fifty-seven SNPs in 19 loci in all subjects, and 106 SNPs in 16 loci in COPD patients, reached the pre-defined suggestive significance level (P < 5.0 × 10⁻⁶) and formed the candidate SNPs. Among the candidate SNPs, 24 were shared by both groups. The top 20 SNPs of each group are shown in Tables 2 and 3.
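
The lambda value quoted above is conventionally the median observed association chi-square statistic divided by the median expected under the null for one degree of freedom. The sketch below computes it from P-values and labels each P-value against the two thresholds used in this study. It is an assumed, minimal implementation; the function names are hypothetical, and the study's exact calculation is not described in the text.

```python
from statistics import NormalDist, median

_NORMAL = NormalDist()
# Median of a chi-square distribution with 1 df (about 0.455).
CHI2_1DF_MEDIAN = _NORMAL.inv_cdf(0.75) ** 2

def genomic_inflation(p_values):
    """Lambda: median observed chi-square (1 df) over its null expectation.

    P-values must lie strictly between 0 and 1 here, except that 1.0 is allowed.
    """
    # Two-sided P-value -> chi-square statistic via the squared normal quantile.
    chi2 = [_NORMAL.inv_cdf(p / 2) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN

def classify_p(p, genome_wide=5e-8, candidate=5e-6):
    """Label a P-value using the thresholds defined in the Methods."""
    if p < genome_wide:
        return "genome-wide"
    if p < candidate:
        return "candidate"
    return "none"
```

A lambda close to 1, as reported here, indicates that the bulk of the test statistics follow the null distribution.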

Table 2 Top 20 SNPs of GWAS results in all subjects for emphysema.
Figure 2 Regional plots for top SNPs in all subjects. Regional plots spanning ±600 kb around the top SNPs at PIBF1 (a), KLF12 (b), KCNJ3 (c) and rs11214944 (d) in the main model for all subjects.

Table 3 Top 20 SNPs of GWAS results in COPD patients for emphysema.

eQTL results

Using the GTEx database, we sought to determine whether candidate SNPs are located in regions that affect the regulation of gene expression in lung and whole blood. Three candidate SNPs in all subjects and 8 candidate SNPs in COPD patients were identified as eQTLs in lung or whole blood in GTEx (Tables 4 and 5).

Table 4 eQTL among GWAS candidate SNPs in all subjects.
Table 5 eQTL among GWAS candidate SNPs in COPD patients.

Comparing enhancer regions with GWAS results

To further explore the functional role of candidate SNPs, we examined our candidate SNPs and SNPs in LD with them using the HaploReg database. We found that 67 lead variants among the candidate SNPs of all subjects and 69 lead variants among the candidate SNPs of COPD patients were in LD (r² > 0.8) with SNPs in promoter or enhancer histone marks or DNase I hypersensitivity sites in lung tissue (Supplementary Table S1-A,B).

Look-ups in DEG and EWAS data

When compared with DEGs identified in a preceding emphysema RNA-seq analysis, MACROD2, which was upregulated in emphysema, overlapped with a region identified in the GWAS in all subjects. Similarly, DOCK1, LARGE and ERAL1 overlapped with genes annotated to SNPs identified in COPD patients (Table 6). In addition, several annotated genes of our candidate SNPs were also identified in our emphysema EWAS at a nominal level of significance (Supplementary Table S2). We plotted the results of the GWAS and functional studies together in Fig. 3.

Table 6 Differentially expressed genes in previous COPD transcriptome analysis.
Figure 3 Integrating results of GWAS and functional studies.

Look-ups in COPDGene data

None of our top SNPs replicated in the COPDGene emphysema results (Supplementary Table S3).

Discussion

In this study, we identified one genome-wide significant SNP near a novel candidate gene (PIBF1) and candidate SNPs (P < 5.0 × 10⁻⁶) for quantitative emphysema on CT in all subjects and in COPD patients in Korean cohorts. We further found that candidate SNPs (at this more liberal threshold of significance) were often eQTLs in lung tissue, or in linkage disequilibrium with variants in promoter or enhancer histone marks or DNase I hypersensitivity sites in lung tissue or whole blood. Moreover, several SNPs were located near genes reported in a preceding emphysema RNA-seq analysis or in preceding EWAS studies.

In all subjects, we identified 2 candidate SNPs which were also identified as eQTL in lung tissue, near CYP2A6. CYP2A6 has been associated with COPD and emphysema, and also smoking habits23,24. Genetic polymorphisms of CYP2A6 result in altered activity of the CYP2A6 protein, affecting nicotine metabolism and smoking behavior9,25,26. This effect was also evaluated in the Asian population27,28. However, a relationship between this region and COPD and emphysema has not yet been described. Smoking is an important risk factor for the development of emphysema; therefore, it is possible that this variant contributes to the development of emphysema through smoking. Further research is needed to elucidate the causal relationship.

One of our candidate SNPs (rs11214944) near NNMT was identified as an eQTL and was in LD with SNPs lying in enhancer histone marks in lung tissue. NNMT has also been identified as a gene differentially expressed according to the severity of COPD29,30. Moreover, NNMT has been identified as differentially expressed by more than sixfold in moderate emphysema compared to mild emphysema31. Intriguingly, in a previous study, NNMT was one of the differentially expressed genes in IL-6 signaling related to airway inflammation and remodeling32. Although that study focused on airway inflammation rather than emphysema, IL-6 and its signaling play a major role in emphysema pathogenesis, so NNMT could plausibly be associated with emphysema through IL-633.

Although its MAF was very low, we identified one SNP, rs117084279, near PIBF1, that was genome-wide significant. Interestingly, PIBF1 has been identified as a component of centriolar satellite proteins and has an essential role in primary cilia formation and ciliary protein recruitment34,35. A whole-genome siRNA-based functional genomics screen implicated mutations in PIBF1 in hereditary ciliopathy36. It is well known that cigarette smoking causes structural and functional abnormalities of cilia in bronchial epithelial cells37,38,39,40. Cigarette smoke has also been suggested to affect genes associated with altered ciliary growth41. The relationship between ciliary function in airway epithelial cells and emphysema has not been well elucidated, but our novel SNP could provide a potential link between them.

Likewise, KLF12 and KCNJ3 are genes near our candidate SNPs that were also identified in our EWAS results. Of these, KCNJ3 was associated with lung function and airway obstruction in previous studies, though there are no data on emphysema24,42. There are insufficient data to establish that KLF12 is related to emphysema in the lung. However, previous studies suggested that the KLF12 gene, also known as the AP-2rep gene, functions as a transcriptional repressor of the AP-2α gene through a set of overlapping cis-regulatory promoter elements and reciprocal regulation of both genes43,44. AP-2α is known to be involved in ras oncogene-mediated transformation and myc-mediated apoptotic cell death45,46. Of interest, one study indicated that AP-2α protein was increased in the lungs of a rat model of COPD induced by cigarette smoke exposure, and that this was associated with increased cell apoptosis47. Several mechanisms likely contribute to the pathogenesis of emphysema48,49,50,51,52, and apoptosis is one of them. Both animal models of COPD and human lung models suggest that apoptosis might be involved in the development of emphysema53,54,55. Although it remains unclear whether there is a direct relationship between AP-2α and apoptosis of lung cells, our results may offer insight into the development and progression of emphysema, and further experimental study is warranted.

In addition, we identified SNPs at a suggestive level of significance near MACROD2, which is a DEG identified in a preceding emphysema RNA-seq analysis. MACROD2 was associated with COPD and lung function in previous studies, but data on a direct association with emphysema are lacking56,57,58. Although the reference study identified DEGs according to the presence of emphysema rather than a quantitative value, integration with our GWAS results helps to identify meaningful genes among numerous candidates22.

Although our top SNPs did not replicate in look-ups of COPDGene results, differential gene expression and DNA methylation at loci in high linkage disequilibrium with these SNPs could be suggested as potential functional mechanisms.

Our study has several limitations. First, our SNPs did not replicate in a previous study. Second, the sample size was small, which can increase the false positive rate and decrease statistical power, and may explain the lack of replication. Even though a genome-wide significant SNP was identified, given both the relatively low MAF of the SNP and the small sample size of our study, further replication studies in larger populations are needed. Ethnic differences could also contribute to the lack of replication, and further Asian studies on emphysema are needed. Third, we were not able to perform a meta-analysis with GWAS of emphysema or COPD in Asian populations owing to the lack of data. Comparing the results of such a meta-analysis with GWAS in other ethnicities would help elucidate ethnic specificity. Fourth, the functional and biological impacts of the SNPs on emphysema were not identified in our study. Functional and integrated analyses may lead to a better understanding of the pathophysiology of emphysema in Asian populations. In addition, we could not establish causal effects on emphysema for the SNPs identified in our study. Further studies or meta-analyses that include our data are needed to explore the causal effects of the SNPs using Mendelian randomization59,60,61,62,63. Finally, we were interested in exploring whether genetic causes could be a determining factor in emphysema regardless of smoking, and therefore focused on the results of the population that includes more never-smokers. However, owing to the small number of non-COPD subjects in the total population, the characteristics of the total and COPD populations may not differ significantly.

Conclusions

In a genome-wide association study of emphysema in Korean COPD cohorts, we identified a new genome-wide significant association and several associations at suggestive significance. Ours is the first GWAS of quantitative emphysema in a Korean population. Further analysis, including replication in other independent cohorts and functional studies, would yield insights into the development of emphysema. In particular, this work may be a starting point for investigating the aspects of the pathobiology of emphysema that are shared or unique across differing ancestries.