Whole exome sequencing analysis identifies genes for alcohol consumption

Kang, Jujiao; Deng, Yue-Ting; Wu, Bang-Sheng; Liu, Wei-Shi; Li, Ze-Yu; Xiang, Shitong; Yang, Liu; You, Jia; Gong, Xiaohong; Jia, Tianye; Yu, Jin-Tai; Cheng, Wei; Feng, Jianfeng

doi:10.1038/s41467-024-50132-3

Article
Open access
Published: 10 July 2024

Volume 15, article number 5777, (2024)

Abstract

Alcohol consumption is a heritable behavior that seriously endangers human health. However, genetic studies on alcohol consumption have primarily focused on common variants, while insights from rare coding variants are lacking. Here we leverage whole exome sequencing data across 304,119 white British individuals from UK Biobank to identify protein-coding variants associated with alcohol consumption. Twenty-five variants are associated with alcohol consumption through single variant analysis and thirteen genes through gene-based analysis, ten of which have not been reported previously. Notably, the two unreported alcohol consumption-related genes GIGYF1 and ANKRD12 show enrichment in brain function-related pathways including glial cell differentiation and are strongly expressed in the cerebellum. Phenome-wide association analyses reveal that alcohol consumption-related genes are associated with brain white matter integrity and risk of digestive and neuropsychiatric diseases. In summary, this study enhances the comprehension of the genetic architecture of alcohol consumption and implies biological mechanisms underlying alcohol-related adverse outcomes.

Introduction

Alcohol consumption is a prominent risk factor for death and disability worldwide, accounting for over two million deaths each year¹. It poses a tremendous threat to human health through multiple mechanisms, including cumulative organ damage and an increased risk of self-harm or violence^2,3. Notably, these adverse effects are largely dependent on the average volume of alcohol consumption⁴. Identifying the risk factors that influence one’s level of alcohol consumption can contribute to the prevention, identification, and treatment of adverse outcomes from alcohol consumption⁵.

Over the recent decades, comprehensive genome-wide association studies (GWAS) have indicated the potential influence of genetic factors on one’s alcohol consumption volume and identified over 100 related variants^6,7. However, a predominant proportion of the identified variants are localized within noncoding regions, and their effect sizes tend to be small, making interpretation and identification of the causal gene challenging⁸. In addition, previous GWAS mainly utilized imputed genotype data, which only cover limited regions of the genome, and thus may have missed many potential genes. Furthermore, GWAS studies focused mainly on common variants, and few studies have investigated rare variants associated with alcohol consumption, which yield greater potential to interpret biological function and elucidate mechanisms⁹. Although there are studies that have attempted to leverage exome chip data to identify rare variants contributing to alcohol consumption, the sample size was small and limited regions of the whole exome were examined¹⁰.

The introduction of whole exome sequencing (WES) provides a great chance to overcome the limitations of previous genetic studies on alcohol consumption with a substantially larger amount of rare and ultra-rare protein-coding variants^11,12,13. Collapsing of loss-of-function (LOF) variants helps estimate the effect direction of associated genes^13,14. When combined with large-scale population cohorts with multi-modal phenotypic data, WES would greatly facilitate our understanding of the genetic underpinnings of alcohol consumption as well as its implication on physical and mental health⁶. However, to our knowledge, there have been few large-scale WES studies on alcohol consumption, let alone elucidating the potential implications of the identified genes^10,15. Meanwhile, as indicated by a previous genome-wide association study, significant genetic associations existed between alcohol consumption and several body health phenotypes⁷. The application of phenome-wide analysis for alcohol-related genes can help extend and deepen our current comprehension of the association between alcohol consumption and human health.

Hence, aiming to refine the genetic architecture of alcohol consumption, we conduct an exome-wide association study (ExWAS) for alcohol consumption among 304,119 individuals from the UK Biobank (UKB). We also examine the rare-variant associations with genes reported by previous GWAS^6,7,16,17. Finally, we provide biological insights into the identified genes via bioinformatics analyses and phenome-wide association analysis (PheWAS).

Results

Study population and data description

We leveraged exome sequencing data and phenotypic data from UKB and excluded low-quality variants and samples (Methods)^13,18. For the main analysis, we included 304,119 unrelated white British participants. The average age was 56.87 years at enrollment and 54.09% of participants were female. Information about weekly alcohol drinking was obtained from self-completed touchscreen interviews at baseline (Methods and Supplementary Data 1). The average alcohol consumption score (natural logarithm of weekly alcohol units plus one) of the whole sample was 2.06 (standard deviation (SD) = 1.44), with means of 2.47 (SD = 1.41) and 1.72 (SD = 1.38) for males and females, respectively (Supplementary Data 2). Finally, the exome-wide association analysis included 100,101 common variants (minor allele frequency (MAF) ≥ 1%) and 13,018,630 rare variants (MAF < 1%). Figure 1 provides the general schema of our study.

ExWAS for alcohol consumption

To test whether alcohol consumption was associated with damaging coding variants, we conducted ExWAS using a linear mixed model with adjustments for ten principal components, age, and sex (Methods). The analysis discovered two rare variants and 23 independent common variants linked to alcohol consumption (P < 5 × 10⁻⁸) (Table 1, Fig. 2a, b). The genomic control lambda was 1.04, indicating that the association statistics were not systematically inflated (see Supplementary Fig. 1 for the corresponding quantile-quantile plot). The top rare variant, rs283413 (MAF = 0.8%; β_A = −0.15, P = 2.73 × 10⁻³¹), is a stop-gain variant in ADH1C, the well-known gene related to alcohol metabolism. Among the 23 common variants, three were not reported previously (rs41288799, rs4975020 and rs77623289). Most of the identified variants were intronic (46%) or missense (19%) (Fig. 2c, Supplementary Data 3, Methods). Additionally, 15 of the 22 identified variants that were examined in an independent alcohol consumption GWAS¹⁹ showed nominal significance (P < 0.05) (Table 1, Supplementary Data 4). Further, 17 of the 24 identified variants available in the FinnGen study²⁰ exhibited nominal associations with alcohol use disorder (AUD) (P < 0.05) (Supplementary Data 5). To assess the robustness of the main analysis, we adjusted for rs1229984, a well-established marker strongly linked to alcohol consumption^6,21. Notably, 23 of 25 variants (92%) retained the same association directions, with 22 variants (88%) maintaining their significance (P < 5 × 10⁻⁸) (Supplementary Data 6). Additionally, the main analysis remained robust after excluding former drinkers and non-drinkers: all effect directions remained the same, and 20 of the initially identified 25 variants (80%) retained their significance (P < 5 × 10⁻⁸) (Supplementary Data 7). Finally, the ExWAS for scores of the alcohol use disorders identification test (AUDIT) identified a rare variant (rs283413) and two independent common variants (rs13107325 and rs201168482) associated with alcohol use problems (Supplementary Figs. 2–4, Supplementary Data 8).
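
The genomic control lambda quoted above is commonly computed as the median observed 1-degree-of-freedom chi-square statistic divided by its expected median under the null. The following is a minimal sketch of that standard definition, assuming two-sided P values as input; the function name is illustrative, and the authors' own calculation may differ in detail.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control_lambda(p_values):
    """Observed median 1-df chi-square statistic divided by its null median."""
    nd = NormalDist()
    # A two-sided P value p corresponds to chi-square = z**2, where z is the
    # normal quantile at p / 2 (the lower tail keeps tiny P values stable).
    chi2 = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN
```

A value close to 1 (such as the reported 1.04) suggests little systematic inflation of the test statistics.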

Table 1 Exome-wide significant variants for alcohol consumption

**Fig. 2: Single-variant ExWAS of alcohol consumption.**

Since a single rare variant tends to be of insufficient power to identify significant signals, we further performed gene-based collapsing analysis to detect genes related to alcohol consumption. LOF and missense rare variants of each gene and three MAF thresholds (< 1%, < 0.1% and < 0.01%) were utilized. In total, we identified 19 associations (covering seven genes) after Bonferroni correction (Table 2 and Fig. 3a; Supplementary Data 9; P < 0.05/19852 = 2.5 × 10⁻⁶). Rare variants in the known alcohol consumption-related gene, ADH1C, showed the most significant gene-based association at P = 1.91 × 10⁻³⁰. The maximum genomic control lambda was 1.076 (see Supplementary Fig. 5 for the corresponding quantile-quantile plots). The total rare burden heritability of alcohol consumption was 0.88% (Fig. 3b and Supplementary Data 10). We additionally identified six putative alcohol consumption-related genes under the threshold of overall false discovery rate (FDR) < 0.05 (P < 1.69 × 10⁻⁵). Among these rare-variant genes, seven (GIGYF1, ANKRD12, KDM5B, APC2, LGI2, ATP1A2, and ENSG00000224076 (not officially designated and excluded from further analysis)) were not previously reported in GWAS studies for alcohol consumption. The LOF and missense burden in eleven of the rare-variant genes reduced alcohol consumption (β = −0.003 to −0.023; Fig. 3c, Table 2). In addition, 2.03% (n = 8825) of participants carried a LOF variant located in ADH1C exons, and GIGYF1 variants were carried by 1.72% (n = 7449) of participants (Fig. 3d). After excluding the former drinkers and non-drinkers, 19 out of the initially identified 39 associations retained significance (P < 1.69 × 10⁻⁵, Supplementary Data 11). Following adjustment for rs1229984, the identified associations were robust except for ADH1C, ADH1A, SNX17 and ADH5 (Supplementary Data 12). Additionally, we performed ExWAS for AUDIT and identified two genes (ADH1C and CA1) associated with alcohol use problems (Supplementary Figs. 6–8, Supplementary Data 13).
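
The two significance thresholds used in this section can be illustrated with a short sketch. The Bonferroni threshold follows directly from the numbers in the text (0.05/19852). The text does not name the FDR procedure, so the Benjamini-Hochberg step-up rule below is an assumption, and both function names are illustrative.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test P value threshold under Bonferroni correction."""
    return alpha / n_tests


def benjamini_hochberg_cutoff(p_values, fdr=0.05):
    """Largest P value declared significant by the BH step-up rule, or None.

    Assumption: the study's 'overall FDR < 0.05' is treated here as a
    Benjamini-Hochberg procedure over all supplied tests.
    """
    m = len(p_values)
    cutoff = None
    for rank, p in enumerate(sorted(p_values), start=1):
        # Keep the largest p that sits under its rank-specific line.
        if p <= fdr * rank / m:
            cutoff = p
    return cutoff
```

Applied to every gene, mask and MAF combination at once, the returned cutoff would play the role of the P < 1.69 × 10⁻⁵ threshold reported above.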

Table 2 Genes associated with alcohol consumption at FDR < 0.05

**Fig. 3: Gene-based ExWAS of alcohol consumption.**

Leave-one-variant-out (LOVO) and conditional analysis

To investigate whether a single variant dominated the gene-based associations, we firstly conducted LOVO analysis. While the maximum P-value for ADH1C was P = 0.802 after the removal of rs283413, P = 0.873 for ADH1A after the removal of rs190428650, P = 0.446 for SNX17 after the removal of rs147740391, and P = 0.016 for ADH5 after the removal of rs62325244, the other associations did not exhibit substantial attenuation (Supplementary Figs. 9–20, Supplementary Data 14). Hence, even a single variant, i.e. of ADH1C, ADH1A, and ADH5, may critically influence alcohol consumption, whereas the other significant associations were based on a burden of multiple rare variants. Subsequently, conditional analysis was performed to assess whether the significant associations with rare variants were influenced by adjacent common variants (Methods). Seven genes were found to have nearby common variants exhibiting significant associations with alcohol consumption. The associations of GIGYF1, ANKRD12 and APC2 did not exhibit substantial attenuation, whereas the associations of ADH1C, ADH5 and SNX17 exhibited attenuation, though still nominally significant, and the association of ADH1A lost its significance after adjustment for the nearby common variants (Supplementary Data 15).
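
The LOVO procedure described above amounts to a simple loop: drop one variant, re-run the gene-level test, and record the resulting P value. The sketch below assumes a caller-supplied `gene_test` function (hypothetical, standing in for a real collapsing test such as the one run in SAIGE-GENE+) that maps a list of variant IDs to a P value.

```python
def leave_one_variant_out(variant_ids, gene_test):
    """Re-run a gene-level test with each variant removed in turn.

    gene_test is a hypothetical callable: list of variant IDs -> P value.
    Returns a dict mapping each removed variant to the P value without it.
    """
    results = {}
    for removed in variant_ids:
        kept = [v for v in variant_ids if v != removed]
        results[removed] = gene_test(kept)
    return results


def driving_variant(lovo_results):
    """Variant whose removal yields the largest (least significant) P value."""
    return max(lovo_results, key=lovo_results.get)
```

If removing one variant pushes the P value far above the significance threshold, as reported for rs283413 in ADH1C, the gene-level signal is dominated by that variant rather than by a burden of many rare variants.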

Sex-specific analysis of the associations

As the average alcohol consumption showed a significant difference between males and females, we conducted gene-based collapsing analyses on participants separated by sex to explore whether the genetic contributions to alcohol consumption also differed by sex. While the KDM5B gene’s association with alcohol consumption was only observed in males (P = 3.04 × 10⁻⁷ for males and P = 0.170 for females), the other genes were significantly associated with alcohol consumption in both males and females (P < 0.05, Supplementary Data 16).

Associations of rare variants in alcohol-related genes

We then examined the impact of rare variants based on previous GWAS findings on alcohol consumption. We assessed a total of 174 alcohol consumption-related genes identified by the most recent GWAS studies^6,7,16,17. Although 25 genes showed nominal significance, only the ADH1C gene was significant after Bonferroni correction (Supplementary Data 17). The influence of coding variants within the GWAS regions did not exhibit substantial effects, potentially due to the limited statistical power of ExWAS.

Biological function and tissue expression of the alcohol consumption-related genes

We further conducted a series of bioinformatics analyses to investigate the biological functions of the alcohol consumption-related genes. We first performed pathway enrichment analyses. We found the enrichment of gene ontology (GO) pathways relevant to alcohol dehydrogenase activity, oxidoreductase activity, ethanol oxidation and ethanol metabolism (Fig. 4a, Supplementary Data 18). Also, the analysis of Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways identified the enrichment of these genes in tyrosine metabolism, fatty acid degradation, and pyruvate metabolism. These results hence supported the biological validity of our genetic findings.

**Fig. 4: Biological function of the alcohol consumption-related genes.**

We further analyzed tissue-specific expression enrichment of the identified genes based on the Human Protein Atlas project using the TissueEnrich R package²². We observed six, four, and two genes enriched in the liver, duodenum, and adipose tissue, respectively (Fig. 4b, Supplementary Fig. 21, and Supplementary Data 19). Genes, including SERPINA1, ADH1C, ADH1A, MLXIPL, MTTP, and KLB were specifically enriched in the liver (Supplementary Fig. 22). Subsequently, we evaluated the expression levels of these six genes across various cell types in the liver with single-cell RNA sequencing (scRNA-seq) data. While SERPINA1 was widely expressed in all cell types, ADH1A, ADH1C, MTTP, and MLXIPL were all predominantly expressed in the hepatocytes (Fig. 4c, d).

We further estimated the similarities between genes based on the association results of collapsing analyses across 1419 quantitative traits in UKB using Gene-SCOUT²³. Notably, the GIGYF1 gene exhibited the highest similarity to ANKRD12 (Fig. 5a and Supplementary Data 20). Interestingly, the top 10 similar genes of ANKRD12 are enriched in brain function-related pathways including glial cell differentiation, cognitive function, and glutamate secretion (Fig. 5b and Supplementary Data 21). Thus, to gain more insights into how these rare-variant genes may be related to alcohol use, we further examined the expression of ANKRD12 and GIGYF1 across tissues within the Human Protein Atlas²⁴. Notably, both ANKRD12 and GIGYF1 exhibited strong expression in the brain, particularly in the cerebellum (Fig. 5c, Supplementary Fig. 23). In addition, both ANKRD12 and GIGYF1 showed broad expression in all cell types in the brain (Supplementary Fig. 24). We subsequently characterized the spatiotemporal expression trajectories of ANKRD12 and GIGYF1 in the human brain, using mRNA sequencing (mRNA-seq) data from the PsychEncode study²⁵. Our findings revealed unique temporal expression patterns of these genes in the cerebellum compared to other regions of the brain (Fig. 5d, e). These results imply that these two genes associated with alcohol consumption may alter the function of the brain, which is an important target organ of alcohol intake, providing clues for future research on alcohol-related brain injury.

**Fig. 5: Functional analysis of the rare-variant genes identified in our study.**

Phenotypic associations with alcohol consumption-related genes

Alcohol consumption has been documented to correlate with various biological markers, including metabolites, and health outcomes^7,26,27,28. To systematically assess the relationship between genetic variation in alcohol consumption and a broad spectrum of health phenotypes, we performed PheWAS for the identified alcohol consumption-related genes across blood indices, major diseases, body function, and brain structures from the UKB (Methods and Supplementary Data 22).

Among the 82 significant gene-phenotype and 380 variant-phenotype associations (P < 0.05/316/12 = 1.32 × 10⁻⁵, P < 0.05/316/25 = 6.33 × 10⁻⁶, respectively), 81.7% and 47.4% were related to inflammatory and blood biochemistry indices (Fig. 6 and Supplementary Data 23, 24, Supplementary Figs. 25–61). Indicators of inflammation and disturbance of lipid metabolism showed significant associations with alcohol consumption-related genes. GIGYF1 and ANKRD12 showed the most phenotypic associations. GIGYF1 showed strong positive associations with HbA1c (β_burden = 0.029, P = 2.51 × 10⁻¹³) and glucose (β_burden = 0.027, P = 1.92 × 10⁻¹⁰), and negative associations with total cholesterol level (β_burden = −0.029, P = 4.52 × 10⁻¹³), low-density lipoprotein cholesterol level (LDLC) (β_burden = −0.026, P = 1.33 × 10⁻¹⁰) and Apolipoprotein B (β_burden = −0.024, P = 6.15 × 10⁻¹⁰). ANKRD12 showed strong positive associations with neutrophil percentage (β_burden = 0.021, P = 1.19 × 10⁻¹³) and neutrophil-lymphocyte ratio (β_burden = 0.020, P = 7.34 × 10⁻¹³), and negative associations with lymphocyte percentage (β_burden = −0.021, P = 2.75 × 10⁻¹³), total protein level (β_burden = −0.019, P = 1.36 × 10⁻¹⁰), and monocyte percentage (β_burden = −0.015, P = 3.04 × 10⁻⁸).

**Fig. 6: Phenotypic associations of the rare-variant genes linked to alcohol consumption.**

Interestingly, the gene-phenotype associations also extended to cognitive function and white matter. ANKRD12 showed significant associations with lower fluid intelligence scores (β_burden = −0.028, P = 6.03 × 10⁻¹⁰) and worse performance in the pairs matching task (β_burden = 0.010, P = 2.93×10⁻⁷). GIGYF1 showed nominal associations with lower fractional anisotropy (FA) in the fornix tract (β_burden = −0.059, P = 1.06 × 10⁻⁴), and longer reaction time (β_burden = 0.014, P = 1.40×10⁻⁴). The Mendelian randomization analyses failed to uncover any causal relationship between cognition and alcohol consumption (Supplementary Data 25), in line with results from previous studies²⁹. Given the limited evidence supporting causal links between cognition and alcohol consumption, it is plausible that the observed associations may stem from the pleiotropic effects of ANKRD12 and GIGYF1.

The variant-phenotype association analyses revealed significant correlations with various white matter tracts. Notably, significant correlations were observed for FA in specific regions, including left anterior limb of the internal capsule (β = −0.072, P = 9.52 × 10⁻¹³), genu of corpus callosum (β = −0.069, P = 9.68 × 10⁻¹³), and left superior frontal-occipital fasciculus (β = −0.067, P = 6.13 × 10⁻¹²).

ExWAS in all white British participants and unrelated non-white British participants

Since SAIGE can handle sample relatedness in the regression model, we included all 373,152 white British participants (including both unrelated and related participants) in the analyses to increase statistical power. In the ExWAS for single variants, we identified 26 independent significant variants associated with alcohol consumption, including four variants not detected in the unrelated white British sample, of which two were not previously linked to alcohol consumption (Supplementary Data 26). The gene-based collapsing analysis identified 23 potential alcohol consumption-related genes with an overall FDR < 0.05. Of the 23 genes, 13 were not found in the unrelated white British participants, and among these, eight were not previously associated with alcohol consumption (Supplementary Data 27).

Moreover, ExWAS was conducted in 61,076 unrelated non-white British participants. While the ExWAS for single variants identified one locus significantly linked to alcohol consumption (Supplementary Data 28), the gene-based collapsing analysis did not uncover any significant associations after FDR correction, potentially attributed to the constrained sample size among non-white British participants.

Discussion

Herein we describe the largest comprehensive ExWAS of alcohol consumption to date and provide deep biological insights into the identified genes via functional analysis and phenome-wide association analysis with health-related data from the UKB. We identified ten previously unreported genes associated with alcohol consumption as well as replicated several known genes, which may shed light on pathophysiological processes in alcohol use. Furthermore, bioinformatics analyses supported the biological validity of the genetic associations and gene expression analysis highlighted the role of the cerebellum in alcohol consumption. PheWAS analyses provide strong support for the pleiotropic and consequent effects of alcohol consumption-related genes on human health, especially on inflammation, lipid metabolism, and white matter integrity.

Previous GWAS studies have enabled the identification of alcohol consumption-related genes, but our study extended previous findings via the discovery of more genes as well as the identification of more common and rare variants in the reported genes of alcohol consumption. We have identified thirteen genes at exome-wide significance based on rare variants using gene-based collapsing analysis, seven of which (GIGYF1, ANKRD12, KDM5B, APC2, LGI2, ENSG00000224076 and ATP1A2) were not reported by previous GWAS studies. Moreover, among the 174 reported genes from the most recent GWAS studies^6,7,16,17, twenty-five showed nominal significance and ADH1C passed Bonferroni correction. Notably, utilizing the LOVO analysis, we found that for those reported GWAS genes including ADH1C, ADH1A, SNX17, and ADH5, removal of a single SNP led to loss of significance in gene-based collapsing analysis, while for the genes not previously reported in the GWAS studies, removal of any single SNP did not influence the significance. The results indicated that the significance of these genes reflects the cumulative effect of a group of rare SNPs, which may explain why they were not detected by previous GWAS studies. This is further supported by the single variant analysis, where a significant signal was detected in ADH1C but not in those genes. Our results emphasized the value of rare variants as well as the necessity of gene-based collapsing analysis in WES studies on alcohol consumption.

For the two rare-variant genes (GIGYF1 and ANKRD12) associated with alcohol consumption, GIGYF1, identified as a risk gene for diabetes in earlier research^13,30, is a protein-coding gene intricately involved in the regulation of cell growth and division. One meta-analysis of 38 studies demonstrated that a moderate level of alcohol intake was linked to a lower risk of type 2 diabetes compared to abstainers³¹. The association might be mediated by the beneficial metabolic effect of alcohol consumption such as altered HDL cholesterol and inflammation levels²⁶. Meanwhile, results of the PheWAS showed that the alcohol consumption-related gene, GIGYF1, was significantly associated with blood levels of HDL cholesterol and several inflammatory biomarkers. Therefore, GIGYF1 may participate in the metabolic disturbance caused by alcohol consumption. As for the other gene, ANKRD12, less evidence was found on its possible role in alcohol consumption. However, the Gene-SCOUT analysis showed that GIGYF1 and ANKRD12 had highly similar biomarker profiles, which suggests that they might execute similar biological functions. Interestingly, ANKRD12 and GIGYF1 are associated with a higher Townsend deprivation index, which could possibly lead to reduced access to alcohol³². Given that those genes were associated with cognitive function in our PheWAS results and in previous studies^33,34, it is possible that reduced cognitive function in the gene carriers results in increased material deprivation and in turn reduced alcohol consumption. Nevertheless, the findings may be confounded by many factors and the causality has not been validated by mechanistic studies, so further research is needed to clarify the potential associations of GIGYF1 and ANKRD12 with alcohol consumption.

In addition to the discovery of genetic associations, we also provide insights into alcohol metabolism-related brain alterations based on the two rare-variant genes identified in this study. The Gene-SCOUT analysis identified a series of genes that were highly similar to these two genes. These genes displayed significant enrichment in the regulation of glial cell differentiation and observational learning. As evidenced by previous human and animal studies, disrupted differentiation of glial cells (astrocytes and oligodendrocytes) is one of the alcohol-related neuropathologies in humans³⁵ and heavy alcohol exposure could result in cognitive impairment³⁶. Since alcohol consumption influences intracellular signaling mechanisms, causing alterations in gene expression that gradually produce long-lasting damage in the brain³⁷, these identified genes might be involved in the pathological process. In addition, glial dysfunction is known to cause white matter atrophy, and these two genes are significantly expressed in white matter, further hinting that they might mediate alcohol-related brain damage. Another finding lies in their dominant expression in the cerebellum, one of the major target organs of alcohol abuse. Moreover, ANKRD12 and GIGYF1 are well-known genes for reduced cognitive function and intellectual disability as evidenced by previous studies^33,34. Consistent with previous findings, our PheWAS analyses indicated strong correlations between these genes and cognitive decline as well as altered white matter integrity, which suggests that these genes play a significant role in brain function and structure. The findings are plausible as prior studies have observed associations between heavy alcohol consumption and changes in brain structure^38,39. More interestingly, alcohol consumption-related white matter microstructure changes have been considered a hallmark of AUD^40,41. Therefore, ANKRD12, which is significantly associated with alcohol consumption, AUDIT, and white matter integrity alterations, might serve as a therapeutic target for the prevention of AUD.

We observed sex heterogeneity for KDM5B: its association with alcohol consumption was observed only in males. As heterogeneity was observed for only one gene, it is possibly due to the sex-specific biological function of KDM5B. KDM5B encodes a lysine-specific histone demethylase, which is an important regulator of liver molecular pathways after alcohol consumption⁴². Previous studies found sex-specific roles of KDM5B in the alcohol-induced hepatic response: it regulates a fibrogenic program in females, whereas it contributes to hepatocyte dedifferentiation and fatty acid synthesis in males^43,44. However, the sex-specific mechanisms underlying the influence of KDM5B on alcohol consumption are still unclear. Future studies will be necessary to identify these mechanisms.

Despite these significant findings, our study has some limitations. First, as WES can only detect variants in protein-coding regions, possible genetic associations in non-coding regions were not well investigated in this work. Second, because of the scarcity of a comparable population cohort with genetic sequencing and phenotype data for replication, we relied on existing GWAS data for alcohol consumption and AUD to support our findings. Further whole-exome studies are needed to replicate the identified genes. Third, the causality between the reported genes and alcohol use remains largely unknown. Further research is needed to verify the identified genes and their potential relationship with alcohol consumption. Lastly, participants who drank three times a month or less were assigned a weekly drinking level of zero, following a previous study⁴⁵. While this simplified approach may introduce some error into their drinking levels, the error is expected to be relatively small given the infrequency of their alcohol consumption.

In conclusion, by sequencing the protein-coding regions, we were able to replicate the genes previously reported and identify common and rare coding variants that have a strong effect on alcohol consumption. Additionally, functional analysis of the identified genes not only recapitulated known biological processes in alcohol consumption but also provided insights into the brain’s role in alcohol consumption. We anticipate that our findings of the alcohol consumption-related genes will facilitate the identification of individuals that are vulnerable or intolerant to alcohol consumption, contributing eventually to the prevention as well as treatment of alcohol-related adverse outcomes.

Methods

UK Biobank

The UKB includes phenotypic and genetic information for approximately 500,000 participants aged between 40 and 69 years^46,47. All participants provided informed consent. The UKB cohort was approved by the NHS National Research Ethics Service North West (reference number: 16/NW/0274). The data utilized in the study included demographic data, alcohol-related phenotypes, neuropsychiatric diseases, cardiovascular diseases, cognition, brain grey matter and white matter phenotypes, heart function, lung function, biochemistry, and inflammation phenotypes. The research was performed under application number 19542.

Study phenotypes

The alcohol consumption score was determined through a self-administered touchscreen interview conducted during the baseline appointment. Initial data acquisition involved obtaining mean weekly alcohol consumption data, taking into account various beverage types, from participants reporting alcohol consumption more than once or twice weekly. Each alcoholic drink type was measured in specific units: spirits in measures, wines in glasses, and beer/cider in pints, approximately equating to 1, 2, and 2.5 units, respectively. For respondents indicating intake frequencies of “one to three times a month,” “special occasions only,” or “never” (for whom weekly alcohol consumption data were unavailable), a weekly volume of 0 units was assigned. The determination of alcoholic units per week involved aggregating the intakes for these five drink types, consistent with a previous study⁴⁵. The median alcoholic units per week of the whole sample was 10 (Supplementary Data 2). The alcohol consumption score was the log(units + 1)-transformed number of alcoholic units per week. Detailed information is available in Supplementary Data 1.
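
The scoring rule described above can be sketched as follows. This is a simplified illustration, not the authors' code: the dictionary keys and the example frequency string are hypothetical, only the three measures named in the text are modelled (the study aggregated five drink types, detailed in Supplementary Data 1), and the natural logarithm is used, as stated in the Results.

```python
import math

# Approximate units per reported measure, as stated in the text.
# Keys are hypothetical names, not UK Biobank field identifiers.
UNITS_PER_DRINK = {
    "spirits_measures": 1.0,
    "wine_glasses": 2.0,
    "beer_cider_pints": 2.5,
}

# Intake frequencies for which weekly consumption was set to 0 units.
ZERO_FREQUENCIES = {
    "one to three times a month",
    "special occasions only",
    "never",
}


def weekly_units(frequency, weekly_counts):
    """Total alcohol units per week from per-drink weekly counts."""
    if frequency.lower() in ZERO_FREQUENCIES:
        return 0.0
    return sum(UNITS_PER_DRINK[drink] * n for drink, n in weekly_counts.items())


def alcohol_consumption_score(frequency, weekly_counts):
    """Natural log of (weekly units + 1)."""
    return math.log(weekly_units(frequency, weekly_counts) + 1.0)
```

For instance, the reported sample median of 10 units per week corresponds to a score of ln(11), roughly 2.4, while any participant in the three low-frequency categories receives a score of 0.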

Whole exome sequencing data

WES was performed for 454,756 individuals from the UKB with the IDT xGen Exome Research Panel v1.0^11,18. Following the centralized quality control, we implemented extensive quality control procedures in line with previous research¹³. Concisely, multi-allelic sites were split into bi-allelic sites, and calls with poor genotype quality or excessively low/high genotype depth were marked as no-call. Next, we excluded variants located in Ensembl low-complexity regions, along with variants with a call rate ≤ 90% or a Hardy-Weinberg Equilibrium (HWE) P-value ≤ 10⁻¹⁵. Finally, we removed participants who withdrew from the UKB, duplicates, participants exhibiting discrepancies between self-reported and genetically indicated sex, and participants whose Ti/Tv ratio, Het/Hom ratio, SNV/indel ratio, or number of singletons lay more than 8 standard deviations from the mean. Additionally, we excluded individuals who were genetically related at the 3rd degree or closer in the main analysis. Overall, a total of 304,119 individuals with available alcohol consumption data and genetic data passed the initial quality check and were used in the main analysis. We additionally conducted ExWAS in all (both genetically related and unrelated) white British participants and unrelated non-white British individuals. White British individuals were identified as the intersection of participants who self-reported as ‘White British’ and those who exhibited very similar genetic ancestry based on genetic principal components. To control for population stratification, we generated the top 10 ancestral principal components (PCs) using a subset of high-quality independent autosomal variants, as outlined in a prior study¹³. Specifically, this subset comprised variants with MAF > 0.1%, HWE P > 10⁻⁶, and missingness < 1%, and underwent two rounds of pruning (--indep-pairwise 200 100 0.1 and 200 100 0.05 in PLINK).
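
The variant-level and sample-level filters described above can be summarised in a short sketch. The thresholds are taken from the text; the function names, argument names and the use of the sample standard deviation are assumptions made for illustration.

```python
from statistics import mean, stdev


def variant_passes_qc(call_rate, hwe_p, in_low_complexity_region):
    """Keep variants with call rate > 90% and HWE P > 1e-15 that lie
    outside low-complexity regions (thresholds as stated in the text)."""
    return (not in_low_complexity_region) and call_rate > 0.90 and hwe_p > 1e-15


def sample_outliers(metric_by_sample, n_sd=8.0):
    """Sample IDs whose metric (e.g. Ti/Tv ratio) lies more than n_sd
    standard deviations from the mean across all samples."""
    values = list(metric_by_sample.values())
    mu, sd = mean(values), stdev(values)
    return {s for s, v in metric_by_sample.items() if abs(v - mu) > n_sd * sd}
```

In practice `sample_outliers` would be applied once per metric (Ti/Tv, Het/Hom, SNV/indel, singleton count), and the union of the returned sets would be excluded.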

Variant annotation

Rare variants were defined as those with MAF below 1%. SnpEff was used to annotate the variants⁴⁸, retaining the most detrimental consequence across gene transcripts for each variant. Subsequently, variants annotated as frameshift, splice donor, stop gain, splice acceptor, stop loss, and start loss were categorized as loss of function (LOF). Missense variants that were consistently predicted to be deleterious by SIFT⁴⁹, PolyPhen2 HDIV and PolyPhen2 HVAR⁵⁰, LRT⁵¹, and MutationTaster⁵² were defined as likely deleterious missense.
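The annotation rules above amount to a small decision procedure. The following sketch assumes the annotations are already available as plain strings and booleans; the consequence labels, predictor keys, and return values are hypothetical names chosen for illustration and are not SnpEff output terms.

```python
# Consequence classes that the text groups as loss of function (LOF).
LOF_CONSEQUENCES = {
    "frameshift", "splice_donor", "stop_gained",
    "splice_acceptor", "stop_lost", "start_lost",
}

# The five predictors that must all agree for "likely deleterious missense".
PREDICTORS = ("SIFT", "PolyPhen2_HDIV", "PolyPhen2_HVAR", "LRT", "MutationTaster")


def classify(consequence, maf, predictions):
    """Classify one variant.

    `predictions` maps predictor name -> True if predicted deleterious.
    """
    if maf >= 0.01:
        return "not_rare"
    if consequence in LOF_CONSEQUENCES:
        return "LOF"
    if consequence == "missense" and all(predictions.get(p) is True for p in PREDICTORS):
        return "likely_deleterious_missense"
    return "other"


print(classify("missense", 0.0005, {p: True for p in PREDICTORS}))
```

Requiring agreement across all five predictors is conservative: a single dissenting or missing prediction moves a missense variant into the "other" class.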

ExWAS

ExWAS analysis was conducted using the SKAT-O test in SAIGE-GENE+⁵³. In SAIGE-GENE+, ultra-rare variants (minor allele count (MAC) ≤ 10) are collapsed into a pseudo marker, which addresses the data sparsity caused by ultra-rare variants⁵³. Both rare and ultra-rare variants could therefore be investigated. First, single-variant association analyses were performed for all variants with MAC ≥ 20, as suggested by SAIGE-GENE+⁵³. Independent significant variants were identified using linkage disequilibrium (LD)-clumping (r² < 0.1), with the UKB WES data as the reference panel, and were subsequently mapped to genes using VEP⁵⁴. Then, in the gene-based collapsing analyses, SKAT-O tests were conducted using the minimum P-value method⁵³,⁵⁵. We used three maximum MAF cutoffs (0.01%, 0.1%, and 1%) and two annotation masks (LOF and LOF plus missense). We adjusted for age, sex, and the top ten ancestral PCs (calculated from the WES data). All quantitative phenotypes underwent inverse normalization in SAIGE-GENE+. A relatedness coefficient cutoff of 0.05 was applied to the sparse genetic relationship matrix for the estimation of variance ratios.
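To make the mask design concrete, the sketch below enumerates the six variant sets (three maximum MAF cutoffs by two annotation masks) for one gene and collapses ultra-rare variants into a single pseudo marker. It does not call SAIGE-GENE+; all names are hypothetical, the inclusive cutoff is an assumption, and taking the per-sample maximum as the pseudo-marker genotype is an assumption about how collapsing is done.

```python
from itertools import product

# Maximum-MAF cutoffs and annotation masks named in the text.
MAX_MAF_CUTOFFS = (0.0001, 0.001, 0.01)  # 0.01%, 0.1%, 1%
ANNOTATION_MASKS = {"LOF": {"LOF"}, "LOF+missense": {"LOF", "missense"}}


def build_masks(variants):
    """Group variant IDs into the six (annotation mask, MAF cutoff) sets.

    `variants` is a list of dicts with hypothetical keys 'id', 'maf', 'annotation'.
    Treating the cutoff as inclusive (maf <= cutoff) is an assumption.
    """
    masks = {}
    for (name, allowed), cutoff in product(ANNOTATION_MASKS.items(), MAX_MAF_CUTOFFS):
        masks[(name, cutoff)] = [
            v["id"] for v in variants
            if v["annotation"] in allowed and v["maf"] <= cutoff
        ]
    return masks


def collapse_ultra_rare(genotypes, mac_threshold=10):
    """Merge variants with MAC <= threshold into one pseudo marker.

    `genotypes` maps variant ID -> list of minor-allele counts (0/1/2) per sample.
    Using the per-sample maximum as the pseudo-marker genotype is an assumption.
    """
    kept, ultra_rare = {}, []
    for variant_id, calls in genotypes.items():
        if sum(calls) <= mac_threshold:
            ultra_rare.append(calls)
        else:
            kept[variant_id] = calls
    if ultra_rare:
        kept["pseudo_marker"] = [max(sample) for sample in zip(*ultra_rare)]
    return kept
```

Each of the six sets would then be passed to the gene-based test, so a gene receives one P-value per mask and cutoff combination.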

Genotype and imputation

Genotype data (version 3) were from the UKB cohort. The UKB conducted array design, genotyping, quality control, and imputation procedures⁴⁶. We performed quality control (excluding variants with MAF < 0.005, INFO < 0.3, call rate < 90% or HWE P < 10⁻⁵⁰) with PLINK v2⁵⁶ software. Additionally, participants with missingness less than 0.05, no sex mismatch, no abnormal sex chromosome aneuploidy, no outliers in heterozygosity rate, and estimated white British ancestry, with a maximum of ten putative third-generation relatives, were incorporated into the analysis.

ExWAS for AUDIT

To extend the implications of the alcohol consumption findings to alcohol use disorder, we conducted an ExWAS using measures from the Alcohol Use Disorders Identification Test (AUDIT)⁵⁷, which were obtained through an online mental health questionnaire and processed following the methodology detailed in a previous study⁵⁸. Specifically, the scores for the AUDIT subdomains, representing alcohol consumption (AUDIT-C) and alcohol dependence and problematic alcohol use (AUDIT-P), were calculated by summing the scores of items 1–3 and items 4–10, respectively. The total score (AUDIT-T) was the sum of items 1–10. Detailed information is available in Supplementary Data 1. A total of 101,240 participants with available AUDIT measurements, WES data, and covariate information were included in the analyses. We conducted ExWAS for both the total score and the subscores.
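The three AUDIT scores described above follow directly from the ten item scores. A minimal sketch, assuming the items are supplied in questionnaire order (the function name and the example responses are hypothetical):

```python
def audit_scores(items):
    """Compute AUDIT-C, AUDIT-P, and AUDIT-T from ten item scores.

    `items` holds the scores for items 1-10 in order.
    """
    if len(items) != 10:
        raise ValueError("AUDIT requires exactly 10 item scores")
    audit_c = sum(items[:3])   # items 1-3: consumption
    audit_p = sum(items[3:])   # items 4-10: dependence and problematic use
    return {"AUDIT-C": audit_c, "AUDIT-P": audit_p, "AUDIT-T": audit_c + audit_p}


print(audit_scores([2, 1, 1, 0, 0, 1, 0, 0, 2, 0]))
```

Any additional preprocessing applied in the cited methodology (for example, handling of skipped items) is not covered here.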

LOVO analysis

The leave-one-variant-out (LOVO) analysis was performed for associations identified in the gene-based analysis. For each gene-phenotype association, the collapsing test was repeated after excluding, one at a time, each variant initially included, yielding one P-value per excluded variant. This served three purposes: first, to examine the stability and consistency of the results across variant exclusions; second, to discern whether the gene-based collapsing association was predominantly driven by specific variants; and third, to investigate whether the association instead reflected numerous rare variants with relatively small effect sizes. If the collapsing analysis after removing a single variant yielded an attenuated significance (P > 0.01), that variant was considered to predominantly drive the gene-phenotype association¹³.
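The LOVO procedure is a simple loop around the collapsing test. In the sketch below the test itself is passed in as a callable, because the real test is run in external software; `collapsing_test` and `toy_test` are hypothetical stand-ins, and only the P > 0.01 rule comes from the text.

```python
def leave_one_variant_out(variant_ids, collapsing_test, threshold=0.01):
    """Re-run a collapsing test once per excluded variant.

    `collapsing_test` takes a list of variant IDs and returns a P-value.
    Returns the P-value obtained after excluding each variant, plus the
    variants whose removal attenuates the signal (P > threshold).
    """
    p_after_exclusion = {}
    for excluded in variant_ids:
        remaining = [v for v in variant_ids if v != excluded]
        p_after_exclusion[excluded] = collapsing_test(remaining)
    drivers = [v for v, p in p_after_exclusion.items() if p > threshold]
    return p_after_exclusion, drivers


def toy_test(ids):
    """Toy stand-in: the signal exists only while variant 'v2' is included."""
    return 1e-8 if "v2" in ids else 0.2


p_values, drivers = leave_one_variant_out(["v1", "v2", "v3"], toy_test)
print(p_values, drivers)
```

An empty `drivers` list corresponds to the case where no single variant accounts for the association.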

Conditional analysis

To test whether the significant rare variant associations were independent of nearby common variation, we repeated the gene-based collapsing analyses while additionally adjusting for nearby common variants associated with alcohol consumption¹³. First, we conducted association analyses for common variants (MAF > 0.5%) within the 500 kb genomic region of the identified genes, using the UKB imputed genotype data. Then, LD-clumping was performed to identify independent significant loci (P < 1 × 10⁻⁵ and r² < 0.01). Finally, we performed the collapsing analyses with these independent significant loci as additional covariates.
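The LD-clumping step can be illustrated with a greedy procedure: variants are visited in order of increasing P-value, and a variant is retained only if its r² with every variant already retained is below the threshold. This is a simplified sketch rather than the software actually used; it ignores the physical distance window, and the function names and the r² lookup are hypothetical.

```python
def ld_clump(p_values, r2, p_threshold=1e-5, r2_threshold=0.01):
    """Greedy LD clumping.

    `p_values` maps variant ID -> association P-value.
    `r2` is a callable returning the squared correlation of two variants.
    """
    candidates = sorted(
        (v for v, p in p_values.items() if p < p_threshold),
        key=lambda v: p_values[v],
    )
    index_variants = []
    for variant in candidates:
        if all(r2(variant, kept) < r2_threshold for kept in index_variants):
            index_variants.append(variant)
    return index_variants


# Toy LD table: rs1 and rs2 are in strong LD, rs3 is independent of both.
LD = {frozenset(("rs1", "rs2")): 0.8}


def toy_r2(a, b):
    return LD.get(frozenset((a, b)), 0.0)


print(ld_clump({"rs1": 1e-9, "rs2": 1e-7, "rs3": 1e-6, "rs4": 0.03}, toy_r2))
```

The retained index variants would then enter the collapsing model as covariates.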

Burden heritability estimation

We estimated the burden heritability based on rare coding variants (LOF and missense) using the burden heritability regression (BHR) method⁵⁹. The BHR performed regression of the burden test statistic on the burden score using summary statistics of the association analysis and allele frequencies at the variant level, and derived the burden heritability through estimation of the regression slope⁵⁹.

Pathway enrichment analysis

We used the g:Profiler⁶⁰ software to conduct the enrichment analysis, selecting the Gene Ontology and KEGG databases as the gene set sources. The g:SCS (Set Counts and Sizes) method was used for multiple testing correction.

Tissue enrichment and expression analysis

To determine whether the identified genes were enriched in multiple tissues, we conducted tissue enrichment analysis using the R package TissueEnrich²². The source data were from the Human Protein Atlas, and the hypergeometric test was used²².
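For readers unfamiliar with the hypergeometric test, the sketch below computes an upper-tail enrichment P-value from first principles. It is not the TissueEnrich implementation; the function name, parameter names, and the example counts are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) for a hypergeometric variable.

    population: total number of genes considered
    successes:  genes that are specific to the tissue
    draws:      size of the input gene list
    k:          input genes that are specific to the tissue
    """
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(population, draws)


# Toy example: 10 genes in total, 3 tissue-specific, 2 genes in the input list.
print(hypergeom_upper_tail(1, 10, 3, 2))
```

A small P-value indicates that the input list contains more tissue-specific genes than expected under random sampling.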

Transcript expression levels of the two genes (GIGYF1 and ANKRD12) in 256 tissues were determined using RNA sequencing data from the Human Protein Atlas²⁴. The dataset corresponds to Human Protein Atlas version 22.0 and Ensembl version 103.38. Additional details regarding the data are available at https://www.ebi.ac.uk/biostudies/arrayexpress/studies/E-MTAB-2836.

Lifespan spatio-temporal gene expression trajectory

The lifespan spatio-temporal brain expression trajectories of the alcohol consumption-related genes were characterized using the mRNA-seq data of human brain from the PsychENCODE study²⁵. The expression of each gene in each anatomical tissue was estimated. Gene expression levels were quantified using the reads per kilobase per million mapped reads (RPKM) metric.
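As a reminder of what the RPKM metric measures, the sketch below applies the standard definition to a single gene. The function name and the example numbers are illustrative; the PsychENCODE data used here were already processed.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and library size must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# 100 reads on a 2 kb gene in a library of 10 million mapped reads.
print(rpkm(100, 2000, 10_000_000))
```

Dividing by gene length and library size makes expression values comparable across genes of different lengths and across samples sequenced to different depths.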

Single-cell expression

We used liver scRNA-seq data from the Gene Expression Omnibus (GEO) database (accession ID: GSE115469)⁶¹ and processed them with the R package Seurat⁶². Low-quality cells, defined as cells with fewer than 200 expressed genes or more than 75% mitochondrial counts, were excluded. The gene expression matrix was then normalized using the NormalizeData function in Seurat⁶². Clustering was performed with the top 25 PCs and a resolution of 0.4, and the clusters were annotated according to the previous publication⁶¹.
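The cell-level quality filter described above reduces to two thresholds. The analysis itself was run in Seurat (R); the Python sketch below only restates the exclusion rule, with hypothetical function and field names.

```python
def passes_cell_qc(n_expressed_genes, mito_fraction,
                   min_genes=200, max_mito_fraction=0.75):
    """Keep a cell unless it has < 200 expressed genes or > 75% mitochondrial counts."""
    return n_expressed_genes >= min_genes and mito_fraction <= max_mito_fraction


cells = [
    {"id": "c1", "genes": 150, "mito": 0.10},   # too few expressed genes
    {"id": "c2", "genes": 1200, "mito": 0.80},  # too many mitochondrial counts
    {"id": "c3", "genes": 900, "mito": 0.05},   # passes both checks
]
kept = [c["id"] for c in cells if passes_cell_qc(c["genes"], c["mito"])]
print(kept)
```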

Additionally, brain scRNA-seq data from temporal cortex tissue were obtained from the GEO database under accession ID GSE173731⁶³. In this dataset, all cell types in the brain were isolated and sequenced⁶³. Analysis and visualization were performed using the metadata files with the R package Seurat⁶².

Gene similarity

We utilized Gene-SCOUT²³ to estimate the similarities between genes using association results of collapsing analyses across various quantitative traits in the UKB. In this tool, we searched the “seed gene” ANKRD12 to identify the similar genes. The top 10 similar genes and the “seed gene” were then employed in the enrichment analysis with Gene Ontology terms²³.

MRI data and preprocessing

Structural MRI data were obtained from three dedicated and identical imaging centers⁶⁴,⁶⁵. Preprocessing of these data followed a pipeline established in previous studies⁶⁶,⁶⁷, using SPM12 software and the CAT12 toolbox⁶⁸ with default settings. This included high-dimensional spatial normalization, nonlinear modulation, and smoothing (with an 8 mm full-width at half-maximum Gaussian kernel). For regional grey matter volume, we used the Automated Anatomical Labeling 3 (AAL3) atlas⁶⁹, a brain parcellation system that subdivides the brain into 166 distinct regions. We chose the AAL3 atlas for its finer parcellation, especially in the subcortical regions, which are closely linked to alcohol use and addiction.

We used fractional anisotropy (FA) of white matter tracts provided by the UKB. Detailed data processing and quality control procedures have been comprehensively outlined in a prior study⁶⁰. Specifically, diffusion-weighted images were acquired with two diffusion-weighted shells, with 50 distinct diffusion-encoding directions per shell and a resolution of 2 × 2 × 2 mm. TBSS⁷⁰ was used to align the FA images to a standard-space white matter skeleton. Alignment of the FA images was further improved with high-dimensional FNIRT-based warping⁷¹. Our analyses encompassed 48 distinct white matter tracts extracted based on the JHU ICBM-DTI-81 atlas⁷².

Phenome-wide association analysis

The phenotypes in PheWAS were centered around traits that are associated with alcohol consumption, including behavioral aspects and health outcomes. The disease-related analysis covered neuropsychiatric diseases, cardiovascular diseases, and digestive diseases, which can be impacted by alcohol consumption patterns. Additionally, the analysis incorporated cognitive tasks, inflammatory traits, blood biochemistry traits, neuroimaging traits (including grey and white matter measures), and cardiac and lung function measures, all of which are pertinent to understanding the impacts of genes related to alcohol consumption on human health and functioning. This comprehensive selection of phenotypes aligns with the aim of investigating the potential genetic influences on alcohol consumption and its related health implications. In the analysis of diseases, we investigated 10 neuropsychiatric diseases, 7 cardiovascular diseases, and 19 digestive diseases. For the analysis of continuous phenotypes, we examined 10 cognition tasks, 9 inflammatory traits, 30 blood biochemistry traits, 214 neuroimaging traits (including 166 grey matter measures and 48 white matter measures), 8 heart structure measures, and 9 spirometry measures. Comprehensive details regarding the phenotypes can be found in Supplementary Data 22. We used single-variant association tests for identified variants and SKAT-O tests for identified genes⁵³, adjusting for the top ten ancestral PCs, age, and sex.

For the cognitive function tasks, data were preprocessed in the same way as in a previous study⁷³. We incorporated cognitive tests from both baseline and imaging follow-up. Specifically, for each cognitive test we selected the timepoint that provided the maximum sample size.

Mendelian randomization analysis

To explore the mediating relationships between ANKRD12, GIGYF1, cognition, and alcohol consumption, we first conducted a bidirectional Mendelian randomization (MR) analysis between cognition and alcohol consumption using the TwoSampleMR R package. We used GWAS summary data for the general factor of intelligence, derived from seven distinct cognitive tests⁷⁴, all sourced from the UK Biobank. To avoid sample overlap, we used separate GWAS summary data for alcohol consumption that excluded UK Biobank participants¹⁹.
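The text does not state which MR estimator was used. As background only, the sketch below shows one common choice, the fixed-effect inverse-variance weighted (IVW) estimator, computed from per-variant summary statistics. It is written in Python for illustration and does not reproduce the TwoSampleMR R package; all names and numbers are hypothetical.

```python
import math


def wald_ratios(beta_exposure, beta_outcome):
    """Per-variant causal estimates: outcome effect divided by exposure effect."""
    return [by / bx for bx, by in zip(beta_exposure, beta_outcome)]


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate and its standard error.

    Equivalent to a weighted regression of outcome effects on exposure
    effects through the origin, with weights 1 / se_outcome**2.
    """
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(w * bx * by for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx * bx for w, bx in zip(weights, beta_exposure))
    return numerator / denominator, math.sqrt(1.0 / denominator)


bx = [0.1, 0.2]    # variant effects on the exposure (toy values)
by = [0.05, 0.1]   # variant effects on the outcome (toy values)
se = [0.01, 0.01]  # standard errors of the outcome effects (toy values)
print(wald_ratios(bx, by), ivw_estimate(bx, by, se))
```

In a bidirectional design the same calculation is run twice, swapping which trait is treated as the exposure and which as the outcome, each time with its own set of instruments.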

Sensitivity analysis

To evaluate the stability of the main results, we conducted multiple sensitivity analyses. First, we excluded participants who were former drinkers or non-drinkers (Field 20117) and performed association analysis for the identified genes. Additionally, we adjusted for rs1229984, a well-known alcohol consumption-related locus⁶,²¹, to identify independent associations.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The UKB data used in this study are available under restricted access (application number 19542). Access can be obtained by submitting an application through the UKB platform (https://www.ukbiobank.ac.uk/). The scRNA-seq data are available in the GEO database under accession codes GSE115469 and GSE173731. The transcript expression data are available in the Human Protein Atlas database (https://v22.proteinatlas.org/about/download). The processed human brain mRNA sequencing data are available from the PsychENCODE study (http://development.psychencode.org/files/processed_data/RNA-seq/). Summary GWAS statistics from FinnGen are available at https://storage.googleapis.com/finngen-public-data-r9/summary_stats/finngen_R9_AUD.gz and https://storage.googleapis.com/finngen-public-data-r9/summary_stats/finngen_R9_AUD_SWEDISH.gz. The KEGG database used in g:Profiler is available at https://www.genome.jp/kegg/pathway.html. The Gene Ontology database used in g:Profiler is available at https://geneontology.org/docs/download-ontology/. All data needed to assess the conclusions are contained in the paper and/or the Supplementary Information. Source data are provided with this paper.

Code availability

ExWAS and PheWAS analyses were performed with the R package SAIGE-GENE+, which is available at https://github.com/saigegit/SAIGE. Burden heritability regression analysis was performed with the R package BHR (v.0.1.0, https://github.com/ajaynadig/bhr). Annotation of significant variants was conducted with SnpEff (https://pcingola.github.io/SnpEff/). Gene ontology enrichment analysis was conducted using g:Profiler (https://biit.cs.ut.ee/gprofiler/gost), and tissue enrichment analysis was performed with the R package TissueEnrich (v.1.16.0, https://github.com/Tuteja-Lab/TissueEnrich). The scRNA-seq data were analyzed and visualized using the R package Seurat (v.4.3.0, https://satijalab.org/seurat/index.html). Gene-SCOUT is available through the website https://astrazeneca-cgr-publications.github.io/gene-scout/.

References

Griswold, M. G. et al. Alcohol use and burden for 195 countries and territories, 1990-2016: a systematic analysis for the Global Burden of Disease Study 2016. Lancet 392, 1015–1035 (2018).
Freisthler, B., Wolf, J. P., Hodge, A. I. & Cao, Y. Alcohol Use and Harm to Children by Parents and Other Adults. Child Maltreat 25, 277–288 (2020).
Friesen, E. L. et al. Hazardous alcohol use and alcohol-related harm in rural and remote communities: a scoping review. Lancet Public Health 7, e177–e187 (2022).
Rehm, J. et al. The relationship of average volume of alcohol consumption and patterns of drinking to burden of disease: an overview. Addiction 98, 1209–1228 (2003).
Witkiewitz, K., Litten, R. Z. & Leggio, L. Advances in the science and treatment of alcohol use disorder. Sci. Adv. 5, eaax4043 (2019).
Saunders, G. R. B. et al. Genetic diversity fuels gene discovery for tobacco and alcohol use. Nature 612, 720–724 (2022).
Kranzler, H. R. et al. Genome-wide association study of alcohol consumption and use disorder in 274,424 individuals from multiple populations. Nat. Commun. 10, 1499 (2019).
Visscher, P. M. et al. 10 Years of GWAS Discovery: Biology, Function, and Translation. Am. J. Hum. Genet 101, 5–22 (2017).
Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291 (2016).
Brazel, D. M. et al. Exome Chip Meta-analysis Fine Maps Causal Variants and Elucidates the Genetic Architecture of Rare Coding Variants in Smoking and Alcohol Use. Biol. Psychiatry 85, 946–955 (2019).
Van Hout, C. V. et al. Exome sequencing and characterization of 49,960 individuals in the UK Biobank. Nature 586, 749–756 (2020).
Cirulli, E. T. et al. Genome-wide rare variant analysis for thousands of phenotypes in over 70,000 exomes from two cohorts. Nat. Commun. 11, 542 (2020).
Jurgens, S. J. et al. Analysis of rare genetic variation underlying cardiometabolic diseases and traits among 200,000 individuals in the UK Biobank. Nat. Genet 54, 240–250 (2022).
Szustakowski, J. D. et al. Advancing human genetics research and drug discovery through exome sequencing of the UK Biobank. Nat. Genet. 53, 942–948 (2021).
Marees, A. T. et al. Exploring the role of low-frequency and rare exonic variants in alcohol and tobacco use. Drug Alcohol Depend. 188, 94–101 (2018).
Zhou, H. et al. Genome-wide meta-analysis of problematic alcohol use in 435,563 individuals yields insights into biology and relationships with other traits. Nat. Neurosci. 23, 809–818 (2020).
Schumann, G. et al. KLB is associated with alcohol drinking, and its gene product β-Klotho is necessary for FGF21 regulation of alcohol preference. Proc. Natl Acad. Sci. 113, 14372–14377 (2016).
Backman, J. D. et al. Exome sequencing and analysis of 454,787 UK Biobank participants. Nature 599, 628–634 (2021).
Liu, M. et al. Association studies of up to 1.2 million individuals yield new insights into the genetic etiology of tobacco and alcohol use. Nat. Genet. 51, 237–244 (2019).
Kurki, M. I. et al. FinnGen provides genetic insights from a well-phenotyped isolated population. Nature 613, 508–518 (2023).
Bierut, L. J. et al. ADH1B is associated with alcohol dependence and alcohol consumption in populations of European and African ancestry. Mol. psychiatry 17, 445–450 (2012).
Jain, A. & Tuteja, G. TissueEnrich: Tissue-specific gene enrichment analysis. Bioinformatics 35, 1966–1967 (2018).
Middleton, L. et al. Gene-SCOUT: identifying genes with similar continuous trait fingerprints from phenome-wide association analyses. Nucleic Acids Res. 50, 4289–4301 (2022).
Uhlén, M. et al. Tissue-based map of the human proteome. Science 347, 1260419 (2015).
Li, M. et al. Integrative functional genomic analysis of human brain development and neuropsychiatric risks. Science 362, eaat7615 (2018).
Brien, S. E., Ronksley, P. E., Turner, B. J., Mukamal, K. J. & Ghali, W. A. Effect of alcohol consumption on biological markers associated with risk of coronary heart disease: systematic review and meta-analysis of interventional studies. BMJ. 342, d636 (2011).
Peng, B. et al. Role of Alcohol Drinking in Alzheimer’s Disease, Parkinson’s Disease, and Amyotrophic Lateral Sclerosis. Int J. Mol. Sci. 21, 2316 (2020).
Biddinger, K. J. et al. Association of Habitual Alcohol Intake With Risk of Cardiovascular Disease. JAMA Netw. Open 5, e223849 (2022).
Mahedy, L. et al. Alcohol use and cognitive functioning in young adults: improving causal inference. Addiction 116, 292–302 (2021).
Curtis, D. Analysis of rare coding variants in 200,000 exome-sequenced subjects reveals novel genetic risk factors for type 2 diabetes. Diabetes Metab. Res Rev. 38, e3482 (2022).
Knott, C., Bell, S. & Britton, A. Alcohol Consumption and the Risk of Type 2 Diabetes: A Systematic Review and Dose-Response Meta-analysis of More Than 1.9 Million Individuals From 38 Observational Studies. Diabetes Care 38, 1804–1812 (2015).
Karczewski, K. J. et al. Systematic single-variant and gene-based association testing of thousands of phenotypes in 394,841 UK Biobank exomes. Cell Genom. 2, 100168 (2022).
Fu, J. M. et al. Rare coding variation provides insight into the genetic architecture and phenotypic context of autism. Nat. Genet. 54, 1320–1331 (2022).
Chen, C. Y. et al. The impact of rare protein coding genetic variation on adult cognitive function. Nat. Genet. 55, 927–938 (2023).
de la Monte, S. M. & Kril, J. J. Human alcohol-related neuropathology. Acta. Neuropathol. 127, 71–90 (2014).
Tiwari, V. & Chopra, K. Resveratrol abrogates alcohol-induced cognitive deficits by attenuating oxidative-nitrosative stress and inflammatory cascade in the adult rat brain. Neurochem. Int. 62, 861–869 (2013).
Egervari, G., Siciliano, C. A., Whiteley, E. L. & Ron, D. Alcohol and the brain: from genes to circuits. Trends Neurosci. 44, 1004–1015 (2021).
Daviet, R. et al. Associations between alcohol consumption and gray and white matter volumes in the UK Biobank. Nat. Commun. 13, 1175 (2022).
Sullivan, E. V. & Pfefferbaum, A. Brain-behavior relations and effects of aging and common comorbidities in alcohol use disorder: A review. Neuropsychology 33, 760–780 (2019).
Monnig, M. A., Tonigan, J. S., Yeo, R. A., Thoma, R. J. & McCrady, B. S. White matter volume in alcohol use disorders: a meta-analysis. Addict. Biol. 18, 581–592 (2013).
Pfefferbaum, A. & Sullivan, E. V. Disruption of brain white matter microstructure by excessive intracellular and extracellular fluid in alcoholism: evidence from diffusion tensor imaging. Neuropsychopharmacology 30, 423–432 (2005).
Schonfeld, M., O’Neil, M., Weinman, S. A. & Tikhanovich, I. Alcohol-induced epigenetic changes prevent fibrosis resolution after alcohol cessation in mice. Hepatology. https://doi.org/10.1097/HEP.0000000000000675 (9900).
Schonfeld, M., Averilla, J., Gunewardena, S., Weinman, S. A. & Tikhanovich, I. Alcohol‐associated fibrosis in females is mediated by female‐specific activation of lysine demethylases KDM5B and KDM5C. Hepatol. Commun. 6, 2042–2057 (2022).
Schonfeld, M., Averilla, J., Gunewardena, S., Weinman, S. A. & Tikhanovich, I. Male‐Specific Activation of Lysine Demethylases 5B and 5C Mediates Alcohol‐Induced Liver Injury and Hepatocyte Dedifferentiation. Hepatol. Commun. 6, 1373–1391 (2022).
Howe, L. J. et al. Genetic evidence for assortative mating on alcohol consumption in the UK Biobank. Nat. Commun. 10, 1–10 (2019).
Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203 (2018).
Sudlow, C. et al. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 12, e1001779 (2015).
Cingolani, P. et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff. Fly 6, 80–92 (2012).
Vaser, R., Adusumalli, S., Leng, S. N., Sikic, M. & Ng, P. C. SIFT missense predictions for genomes. Nat. Protoc. 11, 1–9 (2016).
Adzhubei, I., Jordan, D. M. & Sunyaev, S. R. Predicting Functional Effect of Human Missense Mutations Using PolyPhen-2. Curr. Protoc. Hum. Genet. 76, 7.20.1–7.20.41 (2013).
Chun, S. & Fay, J. C. Identification of deleterious mutations within three human genomes. Genome Res. 19, 1553–1561 (2009).
Schwarz, J. M., Rödelsperger, C., Schuelke, M. & Seelow, D. MutationTaster evaluates disease-causing potential of sequence alterations. Nat. methods 7, 575–576 (2010).
Zhou, W. et al. SAIGE-GENE+ improves the efficiency and accuracy of set-based rare variant association tests. Nat. Genet. 54, 1466–1469 (2022).
Martin, F. J. et al. Ensembl 2023. Nucleic Acids Res. 51, D933–D941 (2022).
Zhou, W. et al. Scalable generalized linear mixed model for region-based association tests in large biobanks and cohorts. Nat. Genet. 52, 634–639 (2020).
Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
Saunders, J. B., Aasland, O. G., Babor, T. F., De La Fuente, J. R. & Grant, M. Development of the Alcohol Use Disorders Identification Test (AUDIT): WHO Collaborative Project on Early Detection of Persons with Harmful Alcohol Consumption-II. Addiction 88, 791–804 (1993).
Sanchez-Roige, S. et al. Genome-Wide Association Study Meta-Analysis of the Alcohol Use Disorders Identification Test (AUDIT) in Two Population-Based Cohorts. Am. J. Psychiatry 176, 107–118 (2018).
Weiner, D. J. et al. Polygenic architecture of rare coding variation across 394,783 exomes. Nature 614, 492–499 (2023).
Raudvere, U. et al. g:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update). Nucleic Acids Res. 47, W191–W198 (2019).
MacParland, S. A. et al. Single cell RNA sequencing of human liver reveals distinct intrahepatic macrophage populations. Nat. Commun. 9, 4383 (2018).
Butler, A., Hoffman, P., Smibert, P., Papalexi, E. & Satija, R. Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat. Biotechnol. 36, 411–420 (2018).
Garcia, F. J. et al. Single-cell dissection of the human brain vasculature. Nature 603, 893–899 (2022).
Miller, K. L. et al. Multimodal population brain imaging in the UK Biobank prospective epidemiological study. Nat. Neurosci. 19, 1523 (2016).
Alfaro-Almagro, F. et al. Image processing and Quality Control for the first 10,000 brain imaging datasets from UK Biobank. NeuroImage 166, 400–424 (2018).
Chun, S. et al. Associations of Social Isolation and Loneliness With Later Dementia. Neurology 99, e164 (2022).
Kang, J. et al. Association between obesity, brain atrophy and accelerated brain aging and their genetic mechanisms. medRxiv, 2022.12.30.22284052 (2022).
Gaser, C. et al. CAT—A Computational Anatomy Toolbox for the Analysis of Structural MRI Data. bioRxiv, 2022.06.11.495736 (2023).
Rolls, E. T., Huang, C.-C., Lin, C.-P., Feng, J. & Joliot, M. Automated anatomical labelling atlas 3. NeuroImage 206, 116189 (2020).
Smith, S. M. et al. Tract-based spatial statistics: Voxelwise analysis of multi-subject diffusion data. NeuroImage 31, 1487–1505 (2006).
de Groot, M. et al. Improving alignment in Tract-based spatial statistics: Evaluation and optimization of image registration. NeuroImage 76, 400–411 (2013).
Wakana, S., Jiang, H., Nagae-Poetscher, L. M., Van Zijl, P. C. & Mori, S. Fiber tract-based atlas of human white matter anatomy. Radiology 230, 77–87 (2004).
Kang, J. et al. Increased brain volume from higher cereal and lower coffee intake: shared genetic determinants and impacts on cognition and metabolism. Cereb. Cortex 32, 5163–5174 (2022).
de la Fuente, J., Davies, G., Grotzinger, A. D., Tucker-Drob, E. M. & Deary, I. J. A general dimension of genetic sharing across diverse cognitive traits inferred from molecular data. Nat. Hum. Behav. 5, 49–58 (2021).

Acknowledgements

We express our gratitude to the participants of the UK Biobank for their valuable time, and we extend our appreciation to the dedicated team members of the UK Biobank for their efforts in data collection. We acknowledge the participants and investigators involved in the FinnGen study. We also acknowledge the contributions of MacParland, S.A. et al. and Garcia, F.J. et al. for providing the scRNA-seq matrix. We acknowledge the Human Protein Atlas project, the PsychENCODE Consortium and the FinnGen project for their unwavering dedication to advancing scientific research. W. Cheng received support through grants from the National Natural Science Foundation of China (No. 82071997) and the Shanghai Rising-Star Program (No. 21QA1408700). J.T. Yu received support through grants from the Science and Technology Innovation 2030 Major Projects (2022ZD0211600), National Natural Science Foundation of China (82071201, 81971032, 92249305), Shanghai Municipal Science and Technology Major Project (No. 2018SHZDZX01), Research Start-up Fund of Huashan Hospital (2022QD002), Excellence 2025 Talent Cultivation Program at Fudan University (3030277001), Shanghai Talent Development Funding for The Project (2019074), and ZHANGJIANG LAB, Tianqiao and Chrissy Chen Institute, and the State Key Laboratory of Neurobiology and Frontiers Center for Brain Science of Ministry of Education, Fudan University. J.F. Feng received support through grants from the National Key R&D Program of China (No. 2018YFC1312904 and No. 2019YFA0709502), the Shanghai Municipal Science and Technology Major Project (No. 2018SHZDZX01), the 111 Project (No. B18015), Shanghai Center for Brain Science and Brain-Inspired Technology and Zhangjiang Lab. T.Y. Jia received support through grants from the National Key R&D Program of China (No. 2019YFA0709501) and the National Natural Science Foundation of China (No. T2122005 and No. 81801773).

Author information

These authors contributed equally: Jujiao Kang, Yue-Ting Deng, Bang-Sheng Wu.

Authors and Affiliations

Institute of Science and Technology for Brain-Inspired Intelligence (ISTBI), Fudan University, Shanghai, 200433, China
Jujiao Kang, Ze-Yu Li, Shitong Xiang, Jia You, Tianye Jia, Wei Cheng & Jianfeng Feng
Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Fudan University), Ministry of Education, Shanghai, 200433, China
Jujiao Kang, Ze-Yu Li, Shitong Xiang, Jia You, Tianye Jia, Wei Cheng & Jianfeng Feng
Department of Neurology and National Center for Neurological Disorders, Huashan Hospital, State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Shanghai Medical College, Fudan University, Shanghai, 200433, China
Yue-Ting Deng, Bang-Sheng Wu, Wei-Shi Liu, Liu Yang, Jin-Tai Yu & Wei Cheng
School of Life Sciences, Fudan University, Shanghai, 200433, China
Xiaohong Gong
Social Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK
Tianye Jia
School of Psychology, University of Southampton, Southampton, UK
Tianye Jia
Fudan ISTBI—ZJNU Algorithm Centre for Brain-inspired Intelligence, Zhejiang Normal University, Zhejiang, China
Wei Cheng & Jianfeng Feng
Department of Computer Science, University of Warwick, Coventry, CV4 7AL, UK
Jianfeng Feng

Authors

Jujiao Kang
Yue-Ting Deng
Bang-Sheng Wu
Wei-Shi Liu
Ze-Yu Li
Shitong Xiang
Liu Yang
Jia You
Xiaohong Gong
Tianye Jia
Jin-Tai Yu
Wei Cheng
Jianfeng Feng

Contributions

All authors had complete access to the data in this study and accept responsibility for its submission for publication. W.C., J.T.Y., and J.F.F. designed the study. J.J.K. and Y.T.D. conducted the main analyses and drafted the manuscript. B.S.W., Z.Y.L., W.S.L., S.T.X., L.Y., and J.Y. contributed to data collection and analyses. X.H.G. and T.Y.J. contributed to data interpretation. J.T.Y., W.C., and J.F.F. provided critical revisions to the manuscript. All authors have reviewed and approved the final version.

Corresponding authors

Correspondence to Jin-Tai Yu, Wei Cheng or Jianfeng Feng.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Neelroop Parikshak and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Peer Review File

Description of Additional Supplementary Files

Supplementary Data 1-28

Reporting Summary

Source data

Source Data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Kang, J., Deng, YT., Wu, BS. et al. Whole exome sequencing analysis identifies genes for alcohol consumption. Nat Commun 15, 5777 (2024). https://doi.org/10.1038/s41467-024-50132-3

Received: 15 May 2023
Accepted: 26 June 2024
Published: 10 July 2024
DOI: https://doi.org/10.1038/s41467-024-50132-3
Springer Nature Limited
