Introduction

Bipolar disorder (BPD) is a severe neuropsychiatric disorder with a lifetime prevalence of ~0.75% worldwide [1, 2]. Recent studies have revealed relatively high heritability of BPD [3]. In addition, genetic analyses including genome-wide association studies (GWAS) have reported many common genetic variants showing moderate to strong associations with BPD, and these variants have been repeatedly highlighted in subsequent analyses of enlarged samples [4,5,6,7,8,9,10,11,12,13,14,15]. A landmark study in the field of BPD genetics was the meta-analysis of multiple BPD GWAS datasets, followed by independent replications, by the Bipolar Disorder Working Group of the Psychiatric Genomics Consortium (PGC1) in 2011 [13]. In this study, loci at CACNA1C, ODZ4 and several other genomic regions exhibited genome-wide significant associations with BPD (p < 5.0 × 10⁻⁸). Following this, the PGC2 investigators recently performed a larger GWAS with multiple European cohorts (29,764 cases and 169,118 non-psychiatric controls) and identified 30 genomic loci showing genome-wide significant associations with BPD, spanning genes including TRANK1, ANK3, NCAN and ITIH3 [15]. These discoveries quickly attracted wide attention and have prompted numerous replication studies.

The majority of follow-up efforts have focused on whether the genes highlighted in these GWAS are indeed susceptibility genes for BPD in populations other than Europeans [16,17,18,19,20,21], and such cross-population replications have not been limited to BPD [22,23,24,25]. For example, Gonzalez et al. conducted a replication analysis of the European GWAS risk loci in a Latino BPD cohort [16], and Zeng et al. replicated the associations of DGKH SNPs and haplotypes with BPD in Chinese populations [18]. While many of the highlighted genes (e.g., ANK3) have been replicated [20], whether SYNE1, a promising BPD susceptibility gene identified in GWAS by PGC1 [13, 26], also confers risk of BPD in other populations remains to be examined. SYNE1 emerged as a potential risk gene for BPD after a genome-wide significant association was identified between a single nucleotide polymorphism (SNP) within this gene, rs9371601, and the illness in the discovery stage of the aforementioned GWAS by Green et al. (7481 BPD individuals and 9250 control individuals) [13]. Although this association was not replicated in their replication samples, they subsequently validated the significant association between rs9371601 and BPD susceptibility in an independent sample from Great Britain (1527 cases and 1579 controls) [26]. A further meta-analysis combining these data with the previous GWAS samples confirmed the genome-wide significance of this association signal [26]. In the PGC2 BPD GWAS of 29,764 cases and 169,118 controls, rs9371601 did not achieve genome-wide significance, but it showed nominal associations in both the discovery and replication samples with the same direction of allelic effect (discovery, p = 1.80 × 10⁻⁶; replication, p = 0.0175) [15]. In addition to the genetic analysis of rs9371601, functional studies of its associated BPD candidate risk gene, SYNE1, have emerged.
The human SYNE1 gene contains 145 exons and encodes multiple proteins. A recent study investigated one of these proteins, CPG2 [27]. CPG2 is brain-specific and is localized primarily to excitatory postsynaptic sites, where it affects synaptic plasticity and function [27,28,29]. Rathje et al. found that the protein expression of CPG2 was significantly decreased in the postmortem brains of BPD cases versus control subjects, and that it was regulated by genetic variants in its promoter and coding regions [27]. Taken together, dissecting the roles of BPD risk variation and SYNE1 in the pathogenesis of the disease may provide valuable insights.

Since rs9371601 was discovered as a BPD risk SNP in European populations, we explored two recent GWAS in East Asians [8, 30] to examine its link with BPD in other populations. While this SNP was not highlighted in these two cohorts, further investigations are needed before conclusions can be drawn regarding its role in BPD in East Asians. We recruited a BPD case-control sample from Mainland China and tested whether rs9371601 was also associated with BPD in these subjects. Additionally, we analyzed the impact of rs9371601 on gene expression and DNA methylation using public datasets. We also compared the allelic frequencies and LD patterns of the genomic region encompassing rs9371601 between Europeans and Chinese.

Material and methods

BPD case-control samples

A total of 1315 BPD patients of Han Chinese origin were recruited from several provinces of Mainland China (e.g., Henan, Hunan, Shanghai, Zhejiang and Sichuan). Some of the samples have been described previously [21, 31]. To minimize the impact of confounding variables, patients were excluded if they (i) had a history of mental retardation, drug/alcohol abuse, or schizophrenia; or (ii) had a comorbid diagnosis of brain injury. Diagnoses were confirmed by a research psychiatrist via an Extensive Clinical Interview and the Structured Clinical Interview for DSM-IV Axis I Disorders, Patient Version (SCID-P). For control subjects, 1956 individuals were recruited from local communities, and those with any history of major mental illness or neurological disorder, or a family history of severe brain disorders, were excluded from further participation. The study protocol was approved by the ethics committee of the Second Affiliated Hospital of Xinxiang Medical University and the ethics committees of all participating hospitals and institutes. All participants provided written informed consent before any study-related procedures were performed.

SNP genotyping

Genomic DNA was isolated from the participants' peripheral blood leukocytes using a high-salt extraction method. The SNP was genotyped using the SNaPShot method. Briefly, the genomic fragments containing rs9371601 were amplified from 20 ng of genomic DNA in a 15 μL polymerase chain reaction (PCR) in 96-well plates. The amplified PCR fragments were purified by treatment with SAP and Exo-I, and specifically designed SNaPShot primers were used to amplify the SNP target sites. After one base extension, the reaction was terminated and the products were loaded on an ABI 3730 automatic sequencer. Genotypes were called automatically using GeneMarker V2.2.0 and verified manually. All genotypes were called blind to sample identity and affection status.

Statistical analysis

The Hardy-Weinberg equilibrium (HWE) test was conducted using the Haploview software [32]. PLINK (v1.07) was used to calculate p-values, odds ratios (OR) and 95% confidence intervals (CI) using the “logistic regression” option [33], with sex and region of participants included as covariates. We also conducted a power analysis of the current sample using the “Power and Sample Size” program [34] to examine whether it had sufficient statistical power. Linkage disequilibrium (LD) between rs9371601 and its nearby SNPs was analyzed in 1000 Genomes Project Phase 1 genotype data using tools on the SNAP website.
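As an illustration of what an HWE check involves, the following minimal Python sketch implements the textbook chi-square goodness-of-fit version of the test (1 degree of freedom). It is not the Haploview implementation, which may use a different (e.g., exact) procedure. The function name and the genotype counts are hypothetical and are not data from this study.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium (1 df).

    Assumes a biallelic, polymorphic SNP (both alleles observed).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variable with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical genotype counts (TT, TG, GG); not data from this study
chi2, p_value = hwe_chi_square(580, 360, 60)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

A p-value above 0.05 would be read as no evidence of departure from HWE, which is the criterion reported in the Results.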

Queries about the impact of rs9371601

To investigate the potential impact of rs9371601, we examined the association between this SNP and the mRNA expression of nearby genes in brain and blood tissues using several expression quantitative trait loci (eQTL) databases. In brief, we first examined rs9371601 in the Brain xQTL dataset (http://mostafavilab.stat.ubc.ca/xQTLServe/) [35], which contains RNA-seq results from human dorsolateral prefrontal cortex (DLPFC) tissues and the corresponding genome-wide SNP genotypes of the donors. According to the original report, the researchers performed polyA+ RNA-seq in the DLPFC of 494 European individuals. Detailed information, including the sample information, statistical methods, and the strategies taken to minimize the impact of known and hidden confounding factors, can be found in the original publication [35]. We also utilized the GTEx dataset of genome-wide SNP information and whole-transcriptome RNA-seq data (https://www.gtexportal.org/home/) [36]. GTEx contains RNA-seq data from multiple postmortem brain regions of primarily non-psychiatric Europeans, providing valuable eQTL information. Herein we utilized the eQTL data from frontal cortex (BA9) tissues collected from 118 donors. The details of sample collection, data processing and statistical analyses can be found in the original study [36]. Since many eQTL are population specific, and the above brain eQTL datasets mainly contain non-Asian samples, we also utilized a blood eQTL dataset that includes East Asian donors to understand the role of rs9371601 in this population. We extracted eQTL data of 85 East Asian individuals (including Han Chinese and Japanese) and 56 European subjects from the study by Stranger et al. [37], in which genome-wide mRNA expression in lymphoblastoid cell lines from HapMap3 global populations was analyzed using microarray technology.
Details of the demographic characteristics of participants, mRNA quantification methods and statistical analyses can be found in the original report [37]. In addition, we investigated the impact of rs9371601 on DNA methylation using the Brain xQTL dataset [35], in which global DNA methylation (420,103 methylation sites) was quantified using the Illumina 450K array in 468 European individuals. Spearman’s rank correlation was used to calculate the association between rs9371601 and DNA methylation status. Detailed information regarding this methylation quantitative trait loci (meQTL) analysis can be found in the original report [35].
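To make the meQTL statistic concrete, the sketch below computes Spearman's rank correlation between allele dosage and methylation level in plain Python, as the Pearson correlation of average ranks. It shows the statistic only: it does not reproduce the covariate adjustment or significance testing of the Brain xQTL pipeline. The function names and all data values are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: T-allele dosage (0, 1, 2) and methylation beta values
dosage = [0, 0, 1, 1, 1, 2, 2, 2]
beta = [0.41, 0.44, 0.47, 0.45, 0.50, 0.52, 0.55, 0.51]
print(f"rho = {spearman_rho(dosage, beta):.3f}")
```

Because genotype dosage takes only three values, ties are unavoidable, which is why the ranks are averaged rather than assigned arbitrarily.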

Results

In our case-control sample, the genotyping call rate for rs9371601 was satisfactory. Further analyses showed that this SNP was in Hardy-Weinberg equilibrium (p > 0.05) in both cases and controls. Based on the allelic frequencies of rs9371601 in Han Chinese (data from the 1000 Genomes Project [38]) and the reported ORs of this SNP in European GWAS [15, 26], we performed power analyses and found that the current sample size had sufficient statistical power (> 80%) for the purpose of this study. In the previous European GWAS analyses [13, 15, 26], the T-allele of rs9371601 was significantly overrepresented in BPD patients compared with controls (0.364 in cases and 0.349 in controls, Table 1), making it a “risk” allele for BPD. However, the results obtained from our Han Chinese case-control sample contrasted with those in Europeans, as the frequency of the T-allele was significantly lower in the 1315 BPD patients than in the 1956 non-psychiatric controls (0.743 in cases and 0.773 in controls, p = 0.0121, OR = 0.859, Table 1).
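As a rough consistency check on these figures, a crude (unadjusted) allelic odds ratio can be computed directly from the reported allele frequencies and sample sizes. The sketch below is not the PLINK logistic regression used in the study: it ignores the sex and region covariates, and it reconstructs allele counts from rounded frequencies, so its output should only approximate the reported OR. The function name and the Woolf-type confidence interval are choices made for this illustration.

```python
import math

def allelic_odds_ratio(freq_case, n_case, freq_ctrl, n_ctrl, z=1.96):
    """Crude allelic OR with a Woolf (log-scale) 95% confidence interval."""
    # Each individual carries two alleles; counts are approximate because
    # they are reconstructed from rounded allele frequencies.
    a = freq_case * 2 * n_case        # test allele in cases
    b = (1 - freq_case) * 2 * n_case  # other allele in cases
    c = freq_ctrl * 2 * n_ctrl        # test allele in controls
    d = (1 - freq_ctrl) * 2 * n_ctrl  # other allele in controls
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# T-allele frequencies and sample sizes as reported for the Han Chinese sample
or_t, ci_low, ci_high = allelic_odds_ratio(0.743, 1315, 0.773, 1956)
print(f"crude OR = {or_t:.3f}, 95% CI = {ci_low:.3f}-{ci_high:.3f}")
```

The crude estimate comes out at about 0.85, close to the covariate-adjusted OR of 0.859 reported above, with the small difference expected from the adjustment and from rounding.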

Table 1 Association of rs9371601[T-allele] with bipolar disorder in world populations

The opposite directions of allelic effects for rs9371601 in European and Han Chinese populations were unexpected and intriguing. To explore possible explanations for this inconsistency, we examined the genetic architecture of this SNP in both populations and found that it was always the “minor allele” that conferred genetic risk of BPD. However, the “minor allele” at rs9371601 was the T-allele in Europeans, whereas in Han Chinese it was the G-allele. This observation was made using data from the 1000 Genomes Project [38], in which the frequency of the rs9371601 T-allele was 0.371 in European populations but 0.716 in Han Chinese subjects. This result was reproduced using data on worldwide populations from HGDP (http://hgdp.uchicago.edu/) [39, 40] (Fig. 1).
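The minor-allele argument can be made concrete with a small Python sketch. It uses the T-allele frequencies quoted above to identify the minor allele in each population, and it re-expresses the Han Chinese odds ratio with respect to the G-allele, using the fact that for a biallelic SNP the OR of one allele is the reciprocal of the OR of the other. The function names are hypothetical.

```python
def minor_allele(freq_t, alleles=("T", "G")):
    """Return the allele with frequency below 0.5, given the T-allele frequency."""
    return alleles[0] if freq_t < 0.5 else alleles[1]

def flip_odds_ratio(odds_ratio):
    """OR for the alternative allele of a biallelic SNP is the reciprocal."""
    return 1.0 / odds_ratio

# T-allele frequencies from the 1000 Genomes Project, as quoted in the text
t_freq = {"European": 0.371, "Han Chinese": 0.716}
for population, freq in t_freq.items():
    print(population, "minor allele:", minor_allele(freq))

# OR = 0.859 for the T-allele in Han Chinese, re-expressed for the G-allele
or_g_han = flip_odds_ratio(0.859)
print(f"OR for G-allele in Han Chinese = {or_g_han:.3f}")
```

Expressed this way, the G-allele in Han Chinese has an OR of about 1.16, so the minor allele is the risk-increasing allele in both populations even though the allele itself differs.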

Fig. 1

Global distributions of rs9371601 in 53 world populations from HGDP dataset [39, 40]

We also performed LD analysis for rs9371601 using the SNAP website and identified numerous SNPs in high LD (r² > 0.8) with this SNP in both European and East Asian populations (Fig. 2). However, there were observable differences between its LD patterns in these two populations. For example, 128 SNPs were in high LD with rs9371601 in East Asians (r² > 0.8), whereas in Europeans, 34 of these 128 SNPs were not in strong LD with rs9371601 (r² < 0.8) (Additional file 1: Table S1).
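For readers unfamiliar with the r² statistic used here, the following minimal Python sketch computes it from haplotype and allele frequencies using the standard definition, r² = D² / (pA(1 − pA)pB(1 − pB)) with D = pAB − pA·pB. It is not the SNAP implementation, and all frequencies below are hypothetical values chosen to show how the same SNP pair can fall on different sides of the 0.8 threshold in two populations.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """Pairwise LD r^2 from haplotype frequency p_ab and allele frequencies p_a, p_b."""
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium, D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

def in_high_ld(r2, threshold=0.8):
    """Apply the r^2 > 0.8 criterion used in the text."""
    return r2 > threshold

# Hypothetical frequencies for one SNP pair in two populations
populations = {
    "population 1": (0.27, 0.28, 0.29),  # (p_ab, p_a, p_b)
    "population 2": (0.50, 0.63, 0.60),
}
for name, (p_ab, p_a, p_b) in populations.items():
    r2 = ld_r_squared(p_ab, p_a, p_b)
    print(f"{name}: r2 = {r2:.3f}, high LD: {in_high_ld(r2)}")
```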

Fig. 2

Comparisons of rs9371601 LD SNPs between European and East Asian populations

Although the BPD risk alleles at rs9371601 differ between populations, previous studies and our findings strongly suggest involvement of this SNP in the genetic risk of BPD. Investigating the potential biological impact conferred by the different alleles at rs9371601 is therefore necessary and valuable for understanding the mechanisms underlying the risk of BPD linked to this locus. Because risk SNPs for complex psychiatric disorders are generally thought to affect gene expression or epigenetic modifications in brain and blood tissues [41,42,43,44,45], we tested whether rs9371601 was an eQTL or meQTL for particular gene(s) or DNA methylation sites. In the brain eQTL datasets comprising European individuals (i.e., Brain xQTL [35] and GTEx [36]), rs9371601 was not associated with the mRNA expression of any gene in its surrounding region (Additional file 2: Table S2 and Additional file 3: Table S3). In the blood eQTL datasets containing European or East Asian subjects [37], rs9371601 was likewise not associated with the expression of any nearby gene (Additional file 4: Table S4 and Additional file 5: Table S5). It should be noted that our analyses focused mainly on the mRNA expression levels of annotated genes, and the possibility that this SNP is associated with the mRNA levels of specific exons or isoforms of certain genes, or even of previously uncharacterized RNAs, was not tested. SYNE1, the gene most likely affected by rs9371601, is a large gene that contains many exons and undergoes extensive alternative splicing (Additional file 6: Figure S1). Given that accumulating studies have demonstrated the important roles of alternative splicing in the pathogenesis of psychiatric disorders [44, 46, 47], further studies analyzing the association between rs9371601 and the mRNA levels of different SYNE1 transcripts are necessary.
In addition, we carried out meQTL analyses for rs9371601 and found that this SNP was significantly associated with the methylation of a CpG site (cg01844274, p = 5.05 × 10⁻⁶) within SYNE1 in the DLPFC in the Brain xQTL dataset (Additional file 7: Table S6) [35]. While the functional consequences of methylation at this site are still unclear, this result is in line with recent studies showing that psychiatric risk SNPs are usually associated with DNA methylation in human brains [42, 48, 49].

Discussion

Genetic analysis of BPD has advanced in recent years, and GWAS is a popular approach that has implicated many risk loci for this illness. Along with these discoveries, genes such as TRANK1, ANK3, NCAN and ODZ4 have been presumed to confer genetic risk of BPD in European populations [15]. However, it is widely accepted that genetic heterogeneity exists among different populations [50, 51], and genetic markers of an illness identified in one population do not always confer such risk in other populations. Therefore, the genetic risk architecture of BPD in Han Chinese, which has been relatively understudied compared with that in Europeans, remains to be characterized. Despite the previous achievements in genetic analyses of BPD in Han Chinese [30], replication of risk loci reported in Europeans has been a major task. However, owing to the limited sample sizes in early GWAS of East Asians [30], few loci with robust statistical associations with BPD have been found in these populations. We therefore recruited BPD cases and control subjects to provide insights into the genetic architecture of BPD in Han Chinese.

Our current study examined whether the SNP rs9371601 is a potential BPD risk marker in Han Chinese, and it implicates important roles of SYNE1 in this illness. The SYNE1 gene was initially reported in a genome-wide meta-analysis of diverse European GWAS datasets, in which one of its intronic SNPs, rs9371601, showed a genome-wide significant association with BPD in the discovery sample. Although this SNP was not significantly associated with BPD in the replication stage of that GWAS [13], it was found to be nominally associated with BPD in British populations in a later follow-up study, and a meta-analysis combining this sample and all available previous data yielded a genome-wide significant association of this locus with BPD [26]. In the latest PGC2 BPD GWAS, an association between rs9371601 and BPD was again observed, although it did not reach genome-wide significance despite approaching the threshold [15]. These lines of evidence suggest that rs9371601 is likely a risk locus for BPD, at least in European populations. In the present study using a relatively large Han Chinese sample, we did observe a significant association between this SNP and the illness, but the allelic effect was in the opposite direction compared with that in Europeans. This is not the first case in which a specific locus shows different association trends with an illness in distinct populations. For example, rs6265 in BDNF was reported to confer risk of BPD in Europeans but not in Asians [52]. However, it is intriguing that the “minor” allele of rs9371601 in Europeans differs from that in Han Chinese populations, which provides a possible explanation for the inconsistent allelic effects discussed earlier. Given that risk genes for one psychiatric disorder usually exhibit robust associations with multiple other psychiatric illnesses as well [53,54,55], researchers have also conducted a cross-diagnosis analysis of rs9371601.
This SNP was nominally associated with major depressive disorder (MDD) in the Green study of European individuals, and the T-allele (i.e., the BPD risk allele in Europeans) predicted increased risk of MDD [26]. We also examined this SNP in the GWAS of schizophrenia in European populations, and the T-allele was again associated with an increased risk (p = 0.0033, OR = 1.030) [56]. Intriguingly, consistent with our BPD analysis in Han Chinese, the frequency of the T-allele at rs9371601 was again lower in Chinese MDD patients than in controls in a recent GWAS, although the association did not achieve nominal significance (p = 0.493, OR = 0.971) [57]. Therefore, rs9371601 is likely an authentic risk marker for an array of psychiatric illnesses, with distinct genetic risk architectures between Europeans and Chinese.

We have also explored the possible biological impact of the BPD risk allele at rs9371601 in the brain. LD analysis showed that multiple SNPs were in high LD with rs9371601 in at least one population. However, eQTL analysis did not identify any association between this SNP and the expression of nearby genes, which makes it more difficult to elucidate the molecular mechanisms by which rs9371601 is involved in BPD. We also examined the mRNA expression of SYNE1 in recent transcriptome datasets of DLPFC tissues from BPD patients and controls; the total mRNA expression of SYNE1 did not differ significantly between cases and controls. Several SYNE1 isoforms showed reduced expression in BPD patients versus controls, but these signals did not survive multiple-testing correction [58]. This result is also partially consistent with that of a previous study [27]. Nevertheless, our meQTL analyses using the Brain xQTL dataset [35] showed that rs9371601 was significantly associated with the methylation of a CpG site (cg01844274, p = 5.05 × 10⁻⁶) in SYNE1, suggesting that this SNP (or its linked variants) may exert its function at the epigenomic level. Given the differences in population history and genetic architecture between Europeans and East Asians, the overall risk associations or their directions in this genomic region may genuinely differ between the two populations. However, we cannot exclude the possibility that one or more unidentified causal variants confer risk in the same direction in both populations. In that case, the inconsistent association between rs9371601 and BPD could result from different LD relationships between rs9371601 and the unidentified causal variant(s), owing to the differences in LD structure in this genomic region between Europeans and East Asians.

Our study presents the first replication analysis of SYNE1 rs9371601 in a Han Chinese population. Despite the observed significant association between this SNP and the illness, and the insights into its biological impact, certain limitations should be acknowledged. First, this study does not establish the reasons for the inconsistent association signals at this SNP between Europeans and Asians. We speculate that differences in LD structure in this genomic region might be the explanation, and further fine-scale mapping of these genetic markers is necessary to test this hypothesis. Second, the current analyses did not take into consideration the potential genetic background differences between BPD subtypes (BP-I, BP-II or BP-NOS) in Han Chinese individuals, as rs9371601 showed associations with all types of BPD in Europeans [15]. Future studies that stratify by BPD subtype might be of great interest. Third, although we identified a meQTL association of rs9371601 in the DLPFC, the regulatory mechanisms underlying this effect are unknown. To the best of our knowledge, establishing the functional connections underpinning meQTL associations remains a challenge in the field, and the functional consequences of the DNA methylation are yet to be identified.