Human Genetics, 125:63

Pathways-based analyses of whole-genome association study data in bipolar disorder reveal genes mediating ion channel activity and synaptic neurotransmission

Authors

  • K. Askland
    • Department of Psychiatry and Human Behavior, Warren Alpert Medical School of Brown University, Butler Hospital
  • Cynthia Read
    • OCD Research Group, Butler Hospital
  • Jason Moore
    • Department of Genetics, Dartmouth Medical School
Original Investigation

DOI: 10.1007/s00439-008-0600-y

Cite this article as:
Askland, K., Read, C. & Moore, J. Hum Genet (2009) 125: 63. doi:10.1007/s00439-008-0600-y

Abstract

Despite known heritability, the complex genetic architecture of bipolar disorder (likely including trait, locus and allelic heterogeneity, as well as genetic interactions) has confounded genetic discovery for many years. Even modern-day whole genome association studies (WGAS) using over half a million common SNPs have implicated only a handful of genes at the genomewide level. Temporally coincident with this series of WGAS, a host of pathways-based analyses (PBAs) have emerged as novel computational approaches in the examination of large-scale datasets, but thus far have rarely been applied to WGAS data in psychiatric disorders. Here, we report a series of PBAs conducted using exploratory visual analysis, an analytic and visualization software tool for examining genomic data, to examine results from the National Institute of Mental Health and Wellcome Trust Case Control Consortium WGAS in bipolar disorder. Consistent with a host of prior linkage findings, some candidate gene association studies, and recent WGAS, our strongest findings suggest involvement of ion channel structural and regulatory genes, including voltage-gated ion channels and the broader ion channel group that comprises both voltage- and ligand-gated channels. Moreover, we found only modest overlap in the particular genes driving the significance of these gene sets across the analyses. This observation strongly suggests that variation in ion channel genes, as a class of genes, may contribute to the susceptibility of bipolar disorder and that heterogeneity may figure prominently in the genetic architecture of this susceptibility.

Introduction

Bipolar disorder is a complex condition characterized by episodic mood, affective, cognitive and behavioral dysregulation. Despite substantial heritability (Cardno et al. 1999), a solid knowledge of the genetic bases and molecular substrates of bipolar disorder remains elusive. To date, linkage and candidate gene association studies have generally yielded weak and poorly replicated results. Although over 40 linkage regions have been implicated in bipolar susceptibility (Newton 2007), most remain unreplicated. Candidate genes most consistently implicated are those involved in neurotransmitter transport, biosynthesis and receptor activity (e.g., SLC6A4, COMT, DRD4, TPH2, SLC6A3, possibly DAOA), regulation of synaptic transmission, excitability and nervous system development (KCNN3, BDNF), amino acid metabolism (MTHFR), and chemotaxis (CCL2). But of approximately 240 genes implicated in at least one association study, most remain unreplicated (Yu et al. 2008).

Deciphering the genetics of bipolar disorder: whole-genome association studies (WGAS)

Poor genetic linkage and association replication and the absence of clear modes of transmission or genetic loci ultimately shifted the focus toward WGAS. In light of their success in non-psychiatric disorders, four WGAS have been recently conducted in bipolar disorder (Baum et al. 2008; Ferreira et al. 2008; Sklar et al. 2008; The Wellcome Trust Case Control Consortium 2007). From these relatively large, independent WGAS, only a few single marker associations have reached genomewide significance (generally defined as disease association significance of P < 5.0 × 10−07): rs420259 at 16p12, in or between partner and localizer of BRCA2 (PALB2) and dynactin 5 (DCTN5) (The Wellcome Trust Case Control Consortium 2007), and rs1012053 at 13q14.11 in diacylglycerol kinase, eta (DGKH) (Baum et al. 2008). No independent study has replicated genome-wide association of either these markers or the associated genes. Nonetheless, a more liberal examination of their results and consideration of other lines of evidence identified some important leads. In particular, Baum et al. (2008) note that their most significant association, DGKH, replicates previous linkage findings in bipolar disorder, and that the DGKH gene encodes a key protein in the lithium-sensitive phosphatidyl inositol (PPI) pathway. In their discussion, Sklar et al. (2008) observe that one out of their 200 most significant SNP associations, rs1006737 (located in calcium channel, voltage-dependent, L type, alpha 1C subunit (CACNA1C), intron 3) also showed modest association in the WTCCC study (NIMH P = 2.95 × 10−04, WTCCC P = 6.92 × 10−04) and that there were numerous other associated SNPs in both samples in an area of strong linkage disequilibrium (LD) with this SNP. Finally, Ferreira et al. (2008), in addition to the results of their WGAS in a new sample (which found no SNP association reaching genomewide significance), also report the results of an analysis which combined their new sample with previously reported samples from the NIMH and WTCCC WGAS (a total of 4,387 cases, 6,209 controls, genotyped on 325,690 overlapping SNPs). The strongest associations in the combined analysis implicated structural and regulatory ion channel genes, CACNA1C (P = 7.0 × 10−08) and ankyrin 3 (ANK3) (P = 9.1 × 10−09), both of which retained significance in their expanded control group analysis (see Ferreira et al. 2008). CACNA1C encodes an L-type calcium channel subunit protein, a channel subtype that mediates a variety of neuronal calcium-dependent processes. ANK3 encodes a membrane-cytoskeleton linker which may participate in the maintenance/targeting of ion channels at the nodes of Ranvier and initial axonal segments (GeneCards, 09/03/08), and has been shown to regulate the assembly of voltage-gated sodium channels, in particular (Kordeli et al. 1995; Poliak et al. 2003). In agreement with previous findings (Akagawa et al. 1980; Hokin-Neaverson and Jefferson 1989; Meyer et al. 2005; Mynett-Johnson et al. 1998; Wittekindt et al. 1998) and pathophysiologic hypotheses (Askland 2006; Askland and Parsons 2006; Gargus 2006), Ferreira et al. (2008) conclude that bipolar disorder may be, in part, an ion channelopathy.

Studying complex genetic disorders

Though the time and expense of candidate gene association studies have traditionally made selection of candidates with biological rationale or previous evidence imperative, they have also limited association and replication studies to a relatively small number of genes. One of the many strengths of WGAS is that they circumvent this bias. Nonetheless, assumptions inherent in undertaking candidate gene studies may have given way to assumptions about genetic architecture inherent in the design and interpretation of WGAS. Namely, as WGAS are designed specifically to detect moderate to large disease associations with common human SNP variants (i.e., the common-disease, common-variant model), findings are most often [though not exclusively (Sklar et al. 2008)] interpreted assuming this model. Thus, weak and inconsistent findings are most often attributed to the presumed likelihood of polygenic, interacting genetic mechanisms, suggesting that disease susceptibility is a result of the interactions of modest effects from many genetic variants in each affected individual. A key consequence of such assumptions has been the nearly exclusive reliance on replication studies using the same methods to validate initial findings. However, a similar pattern of modest findings across candidate and whole genome studies could equally implicate heterogeneous mechanisms, including the potential effects from many rare alleles (e.g., Lencz et al. 2007b; Weiss et al. 2003), copy number variants (CNVs) (Christian et al. 2008; Sebat et al. 2007; Walsh et al. 2008), or runs of homozygosity (Lencz et al. 2007a) now known to make substantial contributions to risk of other neuropsychiatric disorders.
A heterogeneous disease model suggests the possibility that disease susceptibility may result from sizeable effect(s) at a single or relatively small number of loci for each individual with many, and perhaps a great many, alleles and loci conferring risk across the affected population (i.e., disease susceptibility may derive from allelic and locus heterogeneity). To the extent that heterogeneity mediates the variability in findings across studies, it will confound replication attempts in both candidate gene and WGA studies. Moreover, in such a case, increases in sample size will not necessarily confer a corresponding increase in statistical power (McClellan et al. 2007). As demonstrated in the recent collaborative WGAS by Ferreira et al. (2008), despite the substantial increase in sample size, only two of the genotyped SNPs in the combined analysis reached genomewide significance. Importantly, neither of the previous genomewide findings from independent samples was replicated. Here, we suggest that relying only on single marker replication may ultimately delay discovery of etiologically-relevant genes.

Pathways-based analyses

Recognizing that genetic complexity will likely preclude reliance on traditional genetic methodologies and single marker replication, a host of novel conceptual and analytic approaches, often lumped under the term ‘pathways-based analyses (PBAs),’ have emerged (Hahne et al. 2008; Inada et al. 2008; Le-Niculescu et al. 2007a, b; Newton 2007; Niculescu et al. 2000; Ogden et al. 2004; Reif et al. 2005; Shriner et al. 2008; Subramanian et al. 2005; Yi et al. 2006). These analytic tools use computational methods to extract more biologically-meaningful information than might be apparent from simple observation of statistically-significant results in independent studies. They do so by defining sets of genes based upon common biological attributes from bioinformatics data (e.g., gene ontology or biological pathways) and then, by one of several methods, measuring the degree of overrepresentation or ‘enrichment’ of each gene set among nominally disease-associated markers.

Such “gene-centric” approaches will best play to the strengths of WGA data. Significant SNPs, while likely to implicate a genomic region of interest, are only rarely themselves causal variants (Lencz et al. 2007b; Lewinger et al. 2005; McCarthy et al. 2008; Seng and Seng 2008). Moreover, genes harboring one disease-predisposing mutation are likely to harbor more than one (McClellan et al. 2007) and locus and allelic heterogeneity appear substantial in complex disorders such as bipolar illness. Within a sample, the clustering of even nominally-significant single marker findings within genes sharing particular biological attributes provides evidence of the gene set’s significance which may not be detectable by examination of only the most highly-significant single marker findings. Additionally, PBAs conducted in independent samples may provide gene set replication whether or not the single marker findings replicate. Finally, in the case where two studies identify enrichment within the same biologically-defined gene set but find association with different markers or genes within that gene set, the study may implicate genetic heterogeneity. Thus, PBAs can complement and extend findings derived from single marker replication studies, yielding information relevant to our understanding of disease susceptibility and to our future discovery efforts.

Thus far, PBAs have been most widely employed for the examination of microarray expression data (e.g., Huang and Chow 2007; Inada et al. 2008; Subramanian et al. 2005). More recently, investigators have begun to employ PBA in the examination of genomic data and WGA datasets, in particular (e.g., Wang et al. 2007). We suggest that the computational approach described below circumvents several limitations of traditional analytic and interpretive approaches, yet capitalizes on the strengths of the collected data, and we postulate that genetic susceptibility to bipolar disorder, at the population level, is mediated by disruption of genes characterized by a limited set of discernible biological attributes. Though we will use data from WGAS, which themselves are designed to identify common alleles or haplotypes associated with increased risk of disease, our results illuminate genes/genomic regions likely to contain both common and rare alleles, and do not rely on the strength of single-marker associations.

Materials and methods

We conducted a series of PBAs to identify sets of biological attributes defining likely susceptibility genes across the affected bipolar samples examined in the NIMH WGAS and used the WTCCC WGAS data as a replication sample. Our PBA was conducted using exploratory visual analysis (EVA), developed and maintained by Moore and colleagues at Dartmouth Medical School, for PBA and visualization of multiple types of data, including WGA results (Reif et al. 2007). Briefly, this technique enables a biologically-informed statistical prioritization of analytic results, employing any type or combination of genomic or proteomic data, and producing a list of gene sets statistically ranked according to their association with disease risk.

Data acquisition

WGAS data

WGAS data were obtained from the Wellcome Trust website (WTCCC data) and from the UCSC Genome Browser (NIMH data). The original NIMH WGAS study results dataset contained affected and unaffected allele frequencies for each RefSeq probe identifier (e.g., rs2980300), and the corresponding chi square, odds ratio and P value for each of 372,193 (74.4%) of the original 500,568 probes. See original paper (Sklar et al. 2008) for details regarding missing probe-level data. Of the 372,193 probes with available analytic results, 4,251 do not map to an Affymetrix Probe Set ID in the Affymetrix 500 K annotation files (v. 26). The absence of probe set IDs (and respective genomic locations) precluded probe-to-gene mapping for these probes. Additionally, as these probes were not utilized in the WTCCC WGAS, they were excluded from the analysis. Thus, statistical results for 367,942 NIMH probes were available for the present analysis.

The original WTCCC WGAS dataset contained case and control genotype frequencies for each Affymetrix Probe Set ID, and six sets of analytic results: frequentist P values for both a general and an additive model, Bayesian (−log10 Bayes Factor) results under each model, as well as the sex-stratified frequentist P values for each model, for each of 469,927 (93.9%) probes out of the original 500,568 probes included in the Affymetrix 500 K chip. See original paper (The Wellcome Trust Case Control Consortium 2007) for details regarding missing probe-level data. All Affymetrix probes mapped to annotation data in the Affymetrix annotation files, thus statistical results for 469,927 WTCCC probes were available for the present analysis.

Results from at least one of the two studies were available for a total of 477,778 Affymetrix probes. Of these probes, 356,957 (74.7%) were ultimately mapped to a single gene using the Affymetrix ‘Associated Gene’ and dbRSid (e.g., rs10180517) identifiers from the Affymetrix 500 K annotation files, version 26 (AffyAnnotV26), by the methods described below. Thus, data corresponding to 120,821 probes could not be incorporated into our analyses because the associated gene could not be determined with certainty, either because there was no Affymetrix Associated Gene listed in the Affymetrix annotation file, or, much more commonly, more than one gene symbol mapped to the Probe ID. (See Supplementary Material for complete description of probe-to-gene mapping).

Annotation data

All annotation data for all analyses in EVA were obtained from the Affymetrix U133 2.0 plus array annotation file (HG-U133_Plus_2.na26.annot.csv, created 07-07-2008), which contains annotations for the HG-U133_Plus_2 chip probe identifiers. The sources of the genomic data are NCBI Build 36.1 and UCSC Genome version hg18. Pathways data are derived from Gene Map Annotator and Pathway Profiler (GenMAPP) and gene ontology data are extracted from both NCBI gene database and the Gene Ontology Annotation databases.

Gene ontologies comprise hierarchies of terms, each descending from one of three primary ontologies: molecular function (e.g., ion channel activity), biological process (e.g., synaptic transmission) and cellular component (e.g., synapse). The hierarchical structure of gene ontologies is such that a more specialized term (child) can be related to more than one less specialized term (parent). This allows for a number of gene groupings at various levels of specification.
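The parent/child structure described above can be illustrated with a small sketch. The term names below form a simplified, hypothetical fragment of the molecular function ontology, the gene symbols are placeholders, and propagating a gene's annotation up to every ancestor term is an assumption made for illustration rather than a documented EVA behavior.

```python
# Toy hierarchy (hypothetical, simplified): each term maps to its parent terms.
PARENTS = {
    "voltage-gated K+ channel activity": [
        "voltage-gated ion channel activity",
        "K+ channel activity",
    ],
    "voltage-gated ion channel activity": ["ion channel activity"],
    "K+ channel activity": ["ion channel activity"],
    "ion channel activity": [],
}


def ancestors(term, parents=PARENTS):
    """Return every less specialized term reachable from `term`."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def build_gene_sets(direct_annotations, parents=PARENTS):
    """Group genes by term, counting a gene under each ancestor as well."""
    gene_sets = {}
    for gene, terms in direct_annotations.items():
        for term in terms:
            for t in {term} | ancestors(term, parents):
                gene_sets.setdefault(t, set()).add(gene)
    return gene_sets


# Placeholder gene symbols, for illustration only.
annotations = {
    "GENE_X": ["voltage-gated K+ channel activity"],
    "GENE_Y": ["K+ channel activity"],
}
gene_sets = build_gene_sets(annotations)
```

Because the most specialized term here has two parents, GENE_X lands in both intermediate groupings as well as the broad ion channel set, which is how the same gene can contribute to several nested gene sets.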

The downloadable GenMAPP Biological Pathways (aka Biopathway) database comprises the annotation term (e.g., Calcium regulation in cardiac cells) assignments to genes. The term assignments are based upon known involvement of those genes in particular biological pathways, processes or any other functional grouping of genes.

Data processing

Using the SAS PROC MEANS procedure, the minimum P value was obtained for each gene (i.e., the lowest of all P values for the set of probes within each gene) for each of the seven experiments (1 P value from the NIMH dataset, and 6 P values from WTCCC). The new dataset containing all summary statistics for each of the 18,265 genes was then divided into seven experiment-specific datasets for upload into EVA. 14,819 genes from the NIMH dataset were mapped to annotations in EVA, as were 14,813 and 15,324 genes from the Bayesian and frequentist WTCCC datasets, respectively. Finally, as PBA is not possible without the relevant biological data, the final results of each individual PBA represent only those genes for which respective ontology annotation is available.
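The per-gene summary step can be sketched in a few lines. The study used SAS for this; the Python below is only a stand-in, and the function names, gene symbols and P values are hypothetical. The second helper shows one way a "top fraction of genes" cut-off could be applied to the collapsed values; ties at the boundary are not treated specially.

```python
def min_p_per_gene(probe_results):
    """Collapse (gene, P value) probe rows to the minimum P value per gene."""
    best = {}
    for gene, p_value in probe_results:
        if gene not in best or p_value < best[gene]:
            best[gene] = p_value
    return best


def top_fraction(gene_p, fraction=0.10):
    """Return the genes ranked in the best `fraction` by minimum P value.

    Rounds down, so no more than `fraction` of the genes are retained.
    """
    ranked = sorted(gene_p, key=gene_p.get)
    n_keep = int(len(ranked) * fraction)
    return set(ranked[:n_keep])


# Hypothetical probe-level results, for illustration only.
probes = [
    ("GENE_A", 0.20), ("GENE_A", 0.003),
    ("GENE_B", 0.50),
    ("GENE_C", 0.04), ("GENE_C", 0.60),
    ("GENE_D", 0.90),
]
gene_p = min_p_per_gene(probes)
selected = top_fraction(gene_p, fraction=0.25)
```

Taking the minimum favors genes covered by many probes, since more probes give more chances for a small P value; the sketch makes no attempt to adjust for that.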

The primary replication analysis was conducted using two of the six sets of analytic results published by WTCCC: those derived using a frequentist approach, under a genotypic model and those from the Bayesian analysis, under a genotypic model (abbreviated WTfreq and WTbaye, respectively). While the NIMH study utilizes a frequentist approach (i.e., allelic P values derived using the Cochran–Mantel–Haenszel test), we also included the WTCCC Bayesian results because this approach derives a single measure of the strength of evidence for an association the understanding of which, unlike P values, does not require a knowledge of power (The Wellcome Trust Case Control Consortium 2007). Thus, Bayes factors may better represent the potential contributions of SNPs with low minor allele frequencies (MAF) than P values derived from frequentist approaches as the latter will produce weaker evidence for association for SNPs with lower MAFs. (The WTCCC Bayesian analysis utilized prior information on allele frequencies from the CEU HapMap population). Since PBAs seek to identify potential contributions to disease susceptibility at the gene- and gene set-level, these data seem most apt.

Statistical and computational analyses

EVA was developed to address the limitations of other analytic approaches to genomic results (Reif et al. 2005; Reif et al. 2007; Reif and Moore 2006). The software can take as input any kind of statistical result(s) for any number of experiments, and the user can choose any statistic or define a custom statistic. The statistical significance of particular subsets of categories or particular genes within a category can be assessed through permutation testing. To complement the statistical analysis, EVA links to multiple annotation sources via Locus Link (Pruitt and Maglott 2001). To ensure that the user can replicate findings, EVA incorporates a printable command log feature. EVA uses a permutation testing strategy to assess the significance of the statistical results for a biological group. This feature complements visual inspection and provides statistical validation for the relative enrichment of particular groups. The significance range for permutation testing is selected by the user, and the test yields a P value based on the results across a user-selected number of permutations. The derived P value represents the probability of obtaining the observed number of genes within the selected significance range in a given category by chance alone. In EVA’s current implementation, the user must account for multiple hypothesis (i.e., gene set) testing.
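A permutation-based enrichment probability of the kind described above can be sketched as follows. This is not EVA's implementation, whose internals are not given here; it assumes that each permutation draws a random set of "significant" genes of the observed size and counts how many fall in the category. All names and numbers are hypothetical.

```python
import random


def permutation_p(all_genes, category, n_selected, observed, n_perm=10_000, seed=0):
    """Estimate P(at least `observed` category genes among random selections).

    all_genes  : list of every gene in the analysis
    category   : genes belonging to the gene set under test
    n_selected : number of genes inside the chosen significance range
    observed   : number of category genes actually inside that range
    """
    rng = random.Random(seed)
    category = set(category)
    hits = 0
    for _ in range(n_perm):
        draw = rng.sample(all_genes, n_selected)
        if sum(gene in category for gene in draw) >= observed:
            hits += 1
    return hits / n_perm


# Hypothetical data: 200 genes, a 20-gene category, 20 genes selected.
all_genes = [f"g{i}" for i in range(200)]
category = all_genes[:20]
p_enriched = permutation_p(all_genes, category, n_selected=20, observed=10, n_perm=2000)
p_null = permutation_p(all_genes, category, n_selected=20, observed=0, n_perm=100)
```

An estimate of exactly zero only means that no permutation matched the observed count; the true probability is then below roughly one over the number of permutations rather than literally zero.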

Three ontology-based PBAs [i.e., molecular function (PBA-MF), biological process (PBA-BP) and cellular components (PBA-CC)] were conducted in EVA for each set of WGAS analytic results (i.e., NIMH, WTbaye and WTfreq). In each analysis, parameter settings were as follows: (1) to allow a maximum of 500 ‘boxes’ (i.e., gene sets), corresponding to the testing of a maximum of 500 hypotheses about disease association with particular gene sets; (2) to restrict gene set size to those containing at least five genes; and (3) to utilize a gene significance cut-off which retained no more than the top 10% of genes in each analysis (i.e., NIMH P < 0.009, WTfreq P < 0.0049, WTbaye BF < −1.053). Both conservative (P < 0.01) and liberal (P < 0.05) gene set significance thresholds were used to present and evaluate our PBA results. These parameters were selected to maximize the potential to identify relevant biological attributes in this exploratory analysis. For each analysis, we selected EVA parameters to generate both a Fisher’s exact P value and a 100,000 permutation-based probability for each gene set. Fisher’s exact and permutation-based analytic results varied little. For many of the top-ranked results, permutation testing showed that these top results were produced by chance in 0 out of 100,000 permutation tests yielding probabilities of zero. Since P values of zero are less informative, we will present and discuss the full Fisher exact results making note of key permutation statistics in the text. The results of two Bonferroni corrections are noted in the tables for each primary analysis: one for an a priori gene set significance threshold of P < 0.01 and one for a threshold of P < 0.05. Although a maximum number of hypothesis tests can be set through EVA’s parameter settings, the actual number of hypotheses tested in each PBA depended upon the total number of unique terms from any given annotation dataset that have been mapped to human genes. 
Only gene sets surviving correction at the P < 0.01 threshold will be noted in the text. We employed these extremely conservative corrections to reduce false positive findings even though a less conservative correction may be advised in pathways-based analyses. Bonferroni correction assumes complete independence of each hypothesis tested, while gene ontologies are comprised of nested, correlated hierarchies of terms. In addition to the main effects analyses, and in an attempt to further clarify the most significantly enriched sets and subsets of genes implicated in the ontology-specific PBAs, we conducted a separate series of PBAs to examine two-way interactions between the molecular function ontologies and each of the other two ontologies, using the same parameters as described for the primary analyses. The interaction analyses assess for enrichment of gene sets defined by genes with membership in two ontologies from distinct lineages (e.g., molecular function and biological process). For example, the resulting test statistics estimate the extent to which a particular subset of the genes that belong to both the molecular function ‘ion channel activity’ set and the biological process ‘ion transport’ set may contribute to disease susceptibility, independent of the main effect of their membership in either one of the groupings alone. We conducted a similar interaction analysis of the molecular function ontology and the Biopathways annotations.
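The Fisher's exact right-tail probability and the Bonferroni thresholds used above can be sketched with the standard library alone. The right-tail test is computed here as the upper tail of the hypergeometric distribution. The function names are hypothetical, and the background counts in the example (14,819 genes, 1,437 of them selected) are assumptions chosen for illustration, so the result is not expected to reproduce any value reported in the tables.

```python
from math import comb


def fisher_right_tail(k, set_size, n_selected, n_genes):
    """P(X >= k): chance that a gene set of `set_size` genes contains at least
    `k` of the `n_selected` significant genes out of `n_genes` in total."""
    upper = min(set_size, n_selected)
    total = comb(n_genes, n_selected)
    tail = sum(
        comb(set_size, i) * comb(n_genes - set_size, n_selected - i)
        for i in range(k, upper + 1)
    )
    return tail / total


def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the family-wise error rate at `alpha`."""
    return alpha / n_tests


# 32 of 138 genes selected, as for the top-ranked set; the background
# counts below are assumed for illustration only.
p_example = fisher_right_tail(k=32, set_size=138, n_selected=1437, n_genes=14_819)
conservative = bonferroni_threshold(0.01, 500)  # 2 x 10^-5
liberal = bonferroni_threshold(0.05, 500)       # 1 x 10^-4
```

Dividing by the full 500 tests treats every gene set as independent, which, as noted above, overcorrects for nested ontology terms that share many genes.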

Results

Molecular function

NIMH

Among the most significant results across all NIMH Molecular Function PBAs (PBA-MF) were ion channel and ion binding gene sets and subsets (Table 1). The primary EVA analysis was run on the 9.7% of genes in the dataset that had P < 0.0090, and found that the most significantly-enriched gene set was voltage-gated ion channel activity (Fisher’s exact P = 2.73 × 10−06, and 0/100,000 permutations). More specifically, of 138 total genes in the voltage-gated ion channel activity ontology category, 32 (23.19%) had a minimum P < 0.01. This gene set retains significance under the extremely conservative Bonferroni correction (P < 2 × 10−05) for 500 gene set hypothesis tests under an a priori gene set significance threshold of P < 0.01.
Table 1

Results given for gene sets with P < 0.05 from the NIMH PBA and any corresponding WTCCC PBA results reaching P < 0.05

| NIMH: Selected/total | NIMH: Fisher exact right tail | GO molecular function | WTCCC Bayes Gen.: Selected/total | WTCCC Bayes Gen.: Fisher exact right tail | WTCCC Freq. Gen.: Selected/total | WTCCC Freq. Gen.: Fisher exact right tail |
|---|---|---|---|---|---|---|
| 32/138 | 2.73E-06§ | 0005244: voltage-gated ion channel act. | 22/134 | 0.01034 | 20/138 | 0.03892 |
| 53/292 | 6.99E-06§ | 0005216: ion channel act. | 51/283 | 1.27E-05§ | 43/294 | 0.003276 |
| 25/111 | 5.37E-05* | 0030955: K+ ion binding | 18/110 | 0.01964 | | |
| 8/18 | 1.43E-04 | 0005001: transmembrane receptor protein tyrosine phosphatase act. | 5/18 | 0.0254 | | |
| 17/66 | 1.46E-04 | 0005267: K+ channel act. | 15/65 | 0.00122 | 11/65 | 0.0427 |
| 108/788 | 1.88E-04 | 0005509: Ca2+ ion binding | 110/784 | 6.58E-05* | 113/809 | 2.77E-05* |
| 20/90 | 3.47E-04 | 0005249: voltage-gated K+ channel act. | 14/89 | 0.04876 | | |
| 6/12 | 4.66E-04 | 0005251: delayed rectifier K+ channel act. | | | 4/12 | 0.02191 |
| 15/64 | 0.001026 | 0005262: Ca2+ channel act. | 13/64 | 0.007965 | 14/65 | 0.002884 |
| 7/22 | 0.003764 | 0030414: protease inhibitor act. | | | | |
| 13/59 | 0.003865 | 0004714: transmembrane receptor protein tyrosine kinase act. | | | | |
| 16/80 | 0.004056 | 0004222: metalloendopeptidase act. | 14/80 | 0.02174 | | |
| 7/23 | 0.004962 | 0005245: voltage-gated Ca2+ channel act. | | | | |
| 7/24 | 0.006425 | 0004114: 3′,5′-cyclic-nucleotide phosphodiesterase act. | 6/24 | 0.02456 | | |
| 5/14 | 0.008229 | 0004697: protein kinase C act. | | | 5/14 | 0.007564 |
| 5/14 | 0.008229 | 0005158: insulin receptor binding | | | | |
| 6/22 | 0.01608 | 0019904: protein domain specific binding | | | | |
| 23/148 | 0.01662 | 0008237: metallopeptidase act. | | | | |
| 4/11 | 0.01689 | 0000155: two-component sensor act. | | | | |
| 7/30 | 0.02258 | 0008146: sulfotransferase act. | | | | |
| 4/12 | 0.02342 | 0005247: voltage-gated Cl- channel act. | | | | |
| 4/12 | 0.02342 | 0008536: Ran GTPase binding | | | | |
| 4/12 | 0.02342 | 0015269: Ca2+-activated K+ channel act. | | | 4/12 | 0.02191 |
| 8/37 | 0.02372 | 0051015: actin filament binding | | | | |
| 11/59 | 0.02564 | 0019992: diacylglycerol binding | | | 11/60 | 0.02524 |
| 42/321 | 0.0302 | 0005198: structural molecule act. | | | | |
| 4/13 | 0.03127 | 0004289: subtilase act. | | | | |
| 5/19 | 0.03174 | 0030165: PDZ domain binding | | | | |
| 3/8 | 0.03549 | 0003730: mRNA 3′-UTR binding | | | | |
| 3/8 | 0.03549 | 0050998: nitric-oxide synthase binding | | | | |
| 19/126 | 0.03624 | 0005516: calmodulin binding | | | 20/130 | 0.02216 |
| 4/14 | 0.04049 | 0005003: ephrin receptor act. | 5/14 | 0.00824 | | |
| 131/1161 | 0.04207 | 0004872: receptor act. | 142/1154 | 0.002263 | | |
| 10/56 | 0.04226 | 0031404: chloride ion binding | | | | |
| 7/34 | 0.04249 | 0008092: cytoskeletal protein binding | | | | |
| 9/49 | 0.04485 | 0005254: Cl- channel act. | | | | |
| 3/9 | 0.04946 | 0004745: retinol dehydrogenase act. | | | | |
| 3/9 | 0.04946 | 0005159: insulin-like growth factor receptor binding | | | | |
| 3/9 | 0.04946 | 0008188: neuropeptide receptor act. | | | 3/9 | 0.04701 |
| 3/9 | 0.04946 | 0015085: Ca2+ ion transmembrane transporter act. | 3/8 | 0.03552 | 3/9 | 0.04701 |
| 3/9 | 0.04946 | 0017048: Rho GTPase binding | | | | |
| 3/9 | 0.04946 | 0030553: cGMP binding | | | | |

Significantly enriched gene sets that were implicated in the WTCCC but not the NIMH analyses appear in Table 2. Results for each gene set include the number of genes selected out of the total number of genes in the set (i.e., of the total number of genes mapped to that ontologic term), the gene set P value, and the GO ID number and name. Italic indicates replication (at P < 0.05) in one of the two WTCCC WGA data analyses; bold indicates replication in both

act. Activity, K+ potassium, Ca2+ calcium, Cl chloride, Bayes Gen. Bayesian genotypic, Freq. Gen. frequentist genotypic

§Significance after Bonferroni correction for 500 hypotheses at P < 0.01

* Significance after Bonferroni correction for 500 hypotheses at P < 0.05

For the NIMH PBA-MF, a total of 16 gene sets met our conservative gene set criterion for statistical significance (P < 0.01). Nine of the 16 are ion channel/ion binding gene sets, with the second-ranked set, ion channel activity, also retaining significance under Bonferroni correction for gene set number. Gene set enrichment under a more liberal statistical cut-off (P < 0.05), identified an additional 26 enriched gene sets (for a total of 42), five of which comprise ion channel/binding/transporter genes.

WTCCC replication

The most significantly-enriched NIMH gene set, voltage-gated ion channel activity, was replicated with the WTCCC Bayesian results (Fisher’s exact P = 0.01034; 100,000 permutation-based P = 0.01028) and the WTCCC frequentist results (P = 0.03892; 100,000 permutation-based P = 0.03873). Additionally, the second-ranked NIMH gene set, ion channel activity (a broader group of ion channel genes subsuming the voltage-gated ion channels as well as several ligand-gated channel receptor genes), was replicated in the Bayesian PBA (Fisher’s exact P = 1.27 × 10−05; 100,000 permutation-based P = 1.00 × 10−05), in which it retained significance after Bonferroni correction, and in the frequentist PBA (Fisher’s exact P = 0.003276; 100,000 permutation-based P = 0.00341), in which it did not retain significance after Bonferroni correction. All six gene sets that replicated in both WTCCC PBAs were ion channel/binding gene sets. An additional 13 gene sets met the replication criterion (seven under the conservative criterion, and an additional six under the liberal criterion) in one WTCCC study. Of the seven with replication in only one study at P < 0.01, three are potassium channel/ion binding sets, while two comprise genes linked to the regulation of transmembrane receptors and intracellular signaling. (Table 2 lists the full results for the WTCCC analyses.)
Table 2

Results given for gene sets with P < 0.05 from the WTCCC PBAs

WTCCC Bayesian genotypic

| Selected/total | Fisher exact right tail | GO molecular function |
| --- | --- | --- |
| 51/283 | 1.272E-05§ | 0005216: ion channel act. |
| 110/784 | 6.581E-05* | 0005509: Ca2+ ion binding |
| 7/15 | 0.0002622 | 0004970: ionotropic glutamate receptor act. |
| 15/59 | 0.0004104 | 0019992: diacylglycerol binding |
| 7/16 | 0.0004269 | 0005234: extracellular-glutamate-gated ion channel act. |
| 5/9 | 0.000786 | 0045296: cadherin binding |
| 9/28 | 0.0009641 | 0004553: hydrolase act., hydrolyzing O-glycosyl compounds |
| 15/65 | 0.00122 | 0005267: K+ channel act. |
| 39/250 | 0.002235 | 0003779: actin binding |
| 142/1154 | 0.002263 | 0004872: receptor act. |
| 21/116 | 0.003908 | 0005085: guanyl-nucleotide exchange factor act. |
| 4/9 | 0.007567 | 0008188: neuropeptide receptor act. |
| 13/64 | 0.007965 | 0005262: Ca2+ channel act. |
| 5/14 | 0.00824 | 0005003: ephrin receptor act. |
| 22/134 | 0.01034 | 0005244: voltage-gated ion channel act. |
| 94/764 | 0.01115 | 0003700: transcription factor act. |
| 4/10 | 0.01165 | 0005261: cation channel act. |
| 50/377 | 0.0157 | 0043565: sequence-specific DNA binding |
| 16/92 | 0.01571 | 0003714: transcription corepressor act. |
| 10/48 | 0.01575 | 0004879: ligand-dependent nuclear receptor act. |
| 18/110 | 0.01964 | 0030955: K+ ion binding |
| 5/17 | 0.01991 | 0004890: GABA-A receptor act. |
| 14/80 | 0.02174 | 0004222: metalloendopeptidase act. |
| 9/44 | 0.02397 | 0003707: steroid hormone receptor act. |
| 6/24 | 0.02456 | 0004114: 3′,5′-cyclic-nucleotide phosphodiesterase act. |
| 5/18 | 0.0254 | 0005001: transmembrane receptor protein tyrosine phosphatase act. |
| 81/676 | 0.03033 | 0004871: signal transducer act. |
| 4/13 | 0.0313 | 0048037: cofactor binding |
| 3/8 | 0.03552 | 0004065: arylsulfatase act. |
| 3/8 | 0.03552 | 0015085: Ca2+ ion transmembrane transporter act. |
| 20/138 | 0.04643 | 0003774: motor act. |
| 14/89 | 0.04876 | 0005249: voltage-gated K+ channel act. |

WTCCC Frequentist genotypic

| Selected/total | Fisher exact right tail | GO molecular function |
| --- | --- | --- |
| 28/122 | 9.431E-06§ | 0005085: guanyl-nucleotide exchange factor act. |
| 113/809 | 2.769E-05* | 0005509: Ca2+ ion binding |
| 8/18 | 0.0001241 | 0004653: polypeptide N-acetylgalactosaminyltransferase act. |
| 8/22 | 0.0006398 | 0005544: Ca2+-dependent phospholipid binding |
| 14/65 | 0.002884 | 0005262: Ca2+ channel act. |
| 43/294 | 0.003276 | 0005216: ion channel act. |
| 11/46 | 0.003366 | 0003707: steroid hormone receptor act. |
| 8/28 | 0.0037 | 0015297: antiporter act. |
| 14/67 | 0.003863 | 0005089: Rho guanyl-nucleotide exchange factor act. |
| 4/8 | 0.004227 | 0004716: receptor signaling protein tyrosine kinase act. |
| 7/23 | 0.004441 | 0008067: metabotropic glutamate, GABA-B-like receptor act. |
| 28/179 | 0.006327 | 0016757: transferase act., transferring glycosyl groups |
| 11/50 | 0.006616 | 0004879: ligand-dependent nuclear receptor act. |
| 5/14 | 0.007564 | 0004697: protein kinase C act. |
| 45/347 | 0.0218 | 0004674: protein serine/threonine kinase act. |
| 4/12 | 0.02191 | 0005251: delayed rectifier K+ channel act. |
| 4/12 | 0.02191 | 0008301: DNA bending act. |
| 4/12 | 0.02191 | 0015269: Ca2+-activated K+ channel act. |
| 20/130 | 0.02216 | 0005516: calmodulin binding |
| 3/7 | 0.02264 | 0004952: dopamine receptor act. |
| 3/7 | 0.02264 | 0008239: dipeptidyl-peptidase act. |
| 5/18 | 0.02347 | 0005021: vascular endothelial growth factor receptor act. |
| 11/60 | 0.02524 | 0019992: diacylglycerol binding |
| 49/389 | 0.02781 | 0043565: sequence-specific DNA binding |
| 4/13 | 0.0293 | 0030246: carbohydrate binding |
| 134/1200 | 0.0316 | 0016740: transferase act. |
| 3/8 | 0.03369 | 0015277: kainate selective glutamate receptor act. |
| 3/8 | 0.03369 | 0016524: latrotoxin receptor act. |
| 4/14 | 0.03799 | 0008509: anion transmembrane transporter act. |
| 20/138 | 0.03892 | 0005244: voltage-gated ion channel act. |
| 55/454 | 0.03987 | 0004672: protein kinase act. |
| 11/65 | 0.0427 | 0005267: K+ channel act. |
| 6/28 | 0.04536 | 0005496: steroid binding |
| 6/28 | 0.04536 | 0016758: transferase act., transferring hexosyl groups |
| 3/9 | 0.04701 | 0004935: adrenoceptor act. |
| 3/9 | 0.04701 | 0008188: neuropeptide receptor act. |
| 3/9 | 0.04701 | 0015085: Ca2+ ion transmembrane transporter act. |
| 3/9 | 0.04701 | 0045296: cadherin binding |
| 4/15 | 0.048 | 0005248: voltage-gated Na+ channel act. |

Results for each gene set include the number of genes selected out of the total number of genes in the set, the P value, and the GO ID and name. Bold indicates gene sets that replicated NIMH significant (P < 0.01) gene sets; normal text indicates gene sets novel to the WTCCC data

act. activity, K+ potassium, Ca2+ calcium, Na+ sodium

§Significance after Bonferroni correction for 500 hypotheses at P < 0.01

* Significance after Bonferroni correction for 500 hypotheses at P < 0.05
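
As a minimal sketch of how the § and * flags in Table 2 follow from the footnotes above: with 500 allowed hypotheses, the Bonferroni thresholds are 0.01/500 = 2 × 10−05 and 0.05/500 = 1 × 10−04. The helper name `bonferroni_flag` is hypothetical and used for illustration only; it is not part of EVA.

```python
def bonferroni_flag(p_value, n_tests, alphas=(0.01, 0.05)):
    """Return the strictest family-wise alpha that p_value survives, or None."""
    for alpha in sorted(alphas):
        if p_value < alpha / n_tests:
            return alpha
    return None


# A few Fisher's exact P values copied from Table 2 (500 hypotheses allowed).
table2_examples = {
    "ion channel act. (Bayesian)": 1.272e-05,
    "Ca2+ ion binding (Bayesian)": 6.581e-05,
    "ionotropic glutamate receptor act. (Bayesian)": 0.0002622,
    "ion channel act. (frequentist)": 0.003276,
}

flags = {name: bonferroni_flag(p, n_tests=500) for name, p in table2_examples.items()}
for name, alpha in flags.items():
    print(name, "->", alpha)
```

A result of 0.01 corresponds to the § flag, 0.05 to the * flag, and `None` to a gene set that is only nominally significant.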

Comparing the lists of ‘selected’ voltage-gated ion channel genes (i.e., those with at least P < 0.01) in each PBA reveals only modest overlap in the particular genes driving the significance of this gene set across the three analyses (see Table 3). Fifty-five genes met the gene significance threshold for selection in one or more of the analyses. Though four (7.3%) appear in all three analyses and 10 (18.2%) in two of the three, the vast majority (75%) of the genes that drive the gene set’s significance are unique to a single PBA.
Table 3

Lists the significance levels (P value for NIMH and WTCCC Frequentist, Bayes factor for WTCCC Bayesian) for each voltage-gated ion channel activity gene meeting selection criteria in the molecular function PBAs

| Gene | NIMH | WTCCC Bayes Gen. | WTCCC Freq. Gen. |
| --- | --- | --- | --- |
| CACNA1A | 0.0044 | | |
| CACNA1B | | | 7.84E-04 |
| CACNA1C* | 2.95E-04 | | 2.77E-04 |
| CACNA1D | 0.004 | | |
| CACNA1E | 0.0033 | −1.0613 | 2.03E-04 |
| CACNA1S | 0.0052 | | |
| CACNA2D1 | | | 0.0049 |
| CACNA2D3 | | −1.0646 | |
| CACNA2D4 | 0.0051 | | |
| CACNB2 | 0.0029 | | 2.40E-09 |
| CACNG3 | 0.0084 | | |
| CATSPER3 | | | 0.0021 |
| CLCN6 | 0.0022 | | |
| CLCNKA | 0.0059 | | |
| CLIC4 | 0.0012 | | |
| CLIC5 | | −1.0653 | 5.45E-04 |
| CLIC6 | 0.0043 | | |
| HCN4 | | −1.064 | |
| KCNA1 | | −1.0844 | |
| KCNA2* | 0.0014 | | |
| KCNA3 | 0.0062 | | 0.0029 |
| KCNAB1 | 0.0021 | −1.0636 | |
| KCNAB2 | 0.0083 | | |
| KCNB1 | | | 0.0016 |
| KCNB2 | 1.23E-04 | | |
| KCNC2 | | | 1.26E-04 |
| KCNC4 | | −1.0584 | |
| KCND3 | 0.0065 | −1.0615 | |
| KCNE2 | | −1.0623 | |
| KCNE4 | | −1.0697 | |
| KCNH1 | 0.0072 | | 5.87E-04 |
| KCNH5 | 0.0045 | | 0.004 |
| KCNH8 | 0.0011 | | |
| KCNIP1 | 0.0019 | 1.064 | |
| KCNIP4 | 0.0043 | −1.0593 | 0.0025 |
| KCNJ3 | | −1.0766 | |
| KCNJ5 | 0.0065 | | |
| KCNJ6 | 0.0024 | | |
| KCNK10 | 0.0035 | | |
| KCNMA1 | 3.27E-04 | −1.0548 | 3.94E-05 |
| KCNQ1 | 0.0068 | | |
| KCNQ3 | 3.98E-04 | −1.0784 | 0.0014 |
| KCNQ4 | 0.004 | −1.0667 | |
| KCNQ5 | 0.0052 | | |
| KCNS1 | | | 2.66E-04 |
| KCNS2 | | −1.0727 | |
| KCNS3 | 0.0051 | | |
| KCNV1 | | −1.0789 | |
| KCNV2 | | −1.0591 | |
| NALCN* | | −1.0765 | 0.0014 |
| SCN10A | | −1.0699 | |
| SCN2A | | | 0.0013 |
| SCN4B | | | 0.004 |
| SCN7A | | −1.0591 | |
| SCN9A | | | 0.001 |

Italic indicates significance in more than one data analysis; bold indicates significance in all three data analyses

Bayes Gen. Bayesian genotypic, Freq. Gen. frequentist genotypic

* A gene previously implicated in a bipolar WGAS study. See text for references
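
The cross-analysis overlap summarized for Table 3 can be tallied mechanically. The sketch below counts, for each gene, how many analyses selected it, and reports the share of genes seen in one, two, or three analyses. Only a handful of Table 3 rows are included to keep the example short, so the printed percentages describe this subset rather than the full set of 55 genes; the function name `overlap_summary` is a hypothetical helper.

```python
from collections import Counter

# Genes selected in each analysis: a small subset of Table 3 rows, for illustration only.
selected = {
    "NIMH": {"CACNA1C", "CACNA1E", "KCNQ3", "KCNB2", "KCNH1"},
    "WTCCC_Bayes": {"CACNA1E", "KCNQ3", "HCN4", "NALCN"},
    "WTCCC_Freq": {"CACNA1C", "CACNA1E", "KCNQ3", "KCNH1", "NALCN", "SCN2A"},
}


def overlap_summary(selected_by_analysis):
    """Map 'number of analyses selecting a gene' -> (gene count, percent of all genes)."""
    per_gene = Counter(g for genes in selected_by_analysis.values() for g in genes)
    total = len(per_gene)
    by_n = Counter(per_gene.values())
    return {n: (by_n[n], 100.0 * by_n[n] / total) for n in sorted(by_n)}


summary = overlap_summary(selected)
for n_analyses, (count, pct) in summary.items():
    print(f"selected in {n_analyses} analysis/analyses: {count} genes ({pct:.1f}%)")
```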

Biological processes

NIMH

Among the most significant results in the Biological Process PBAs (PBA-BP) are gene sets mediating synaptic transmission, ion transport and nervous system development and organization. The most significantly-enriched gene set in the NIMH PBA-BP was synaptic transmission, with 40 of the 164 genes (24.39%) in that set having a minimum P value less than 0.01 (P = 3.65 × 10−08; 0/100,000 permutations), and retaining significance after Bonferroni correction.

For the NIMH PBA-BP, 15 gene sets met conservative (P < 0.01) criterion for statistical significance (Table 4). At least 8 of these 15 mediate processes related to synaptic transmission, ion transport and nervous system development/organization. Under our more liberal statistical cut-off (P < 0.05), an additional 18 enriched gene sets were identified for a total of 33 out of an allowed 500 (Table S1). Among the 18 gene sets are three additional sets mediating nervous system development/organization.
Table 4

Results given for gene sets with P < 0.01 from the NIMH PBA-BP data and any corresponding WTCCC PBA results reaching P < 0.05

| NIMH selected/total | NIMH Fisher exact right tail | GO biological process | WTCCC Bayes Gen. selected/total | WTCCC Bayes Gen. Fisher exact right tail | WTCCC Freq. Gen. selected/total | WTCCC Freq. Gen. Fisher exact right tail |
| --- | --- | --- | --- | --- | --- | --- |
| 40/164 | 3.65E-08§ | 0007268: synaptic transmission | 31/161 | 0.0002915 | 27/164 | 0.002368 |
| 15/44 | 9.43E-06§ | 0007411: axon guidance | 16/40 | 6.034E-07§ | | |
| 32/157 | 4.30E-05* | 0006813: K+ ion transport | | | | |
| 77/505 | 5.14E-05* | 0006811: ion transport | 62/493 | 0.03751 | 62/510 | 0.01678 |
| 51/307 | 1.08E-04 | 0007399: nervous system development | 48/293 | 0.0004766 | 50/309 | 7.48E-05* |
| 19/78 | 1.34E-04 | 0007417: central nervous system development | 20/75 | 0.00003477* | 15/81 | 0.007004 |
| 72/489 | 2.58E-04 | 0007155: cell adhesion | 91/481 | 2.335E-09§ | 82/495 | 1.78E-07§ |
| 105/795 | 7.95E-04 | 0007275: multicellular organismal development | 112/780 | 0.00006959* | 96/817 | 0.009417 |
| 6/14 | 0.001266 | 0007194: negative regulation of adenylate cyclase act. | | | | |
| 15/67 | 0.001657 | 0007420: brain development | 11/62 | 0.04284 | | |
| 56/396 | 0.002889 | 0030154: cell differentiation | 52/388 | 0.0192 | | |
| 5/12 | 0.003816 | 0016043: cellular component organization and biogenesis | | | | |
| 5/12 | 0.003816 | 0030890: positive regulation of B cell proliferation | | | | |
| 18/97 | 0.005422 | 0006816: Ca2+ ion transport | | | 20/100 | 0.000766 |
| 6/18 | 0.005567 | 0009411: response to UV | | | | |

Results include for each set the number of genes selected out of the total number of genes in the set, the Fisher’s exact P value, and the GO ID number and name. Italic indicates replication (at P < 0.05) in one of the two WTCCC WGA data analyses, bold indicates replication in both

act. activity, K+ potassium, Ca2+ calcium, Bayes Gen. Bayesian genotypic, Freq. Gen. Frequentist genotypic

§Significance after Bonferroni correction for 500 hypotheses at P < 0.01

* Significance after Bonferroni correction for 500 hypotheses at P < 0.05
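
The two steps behind each row of Table 4 can be sketched as follows: a gene counts as ‘selected’ when its minimum SNP P value falls below 0.01, and the gene set is then scored with a right-tailed Fisher’s exact (hypergeometric) test. This is a minimal sketch rather than EVA’s implementation. The background totals `N_GENES` and `N_SELECTED` are hypothetical placeholders, since this section does not report them, so the printed P value is not expected to match the table.

```python
from math import comb


def select_genes(snp_p_by_gene, threshold=0.01):
    """A gene is 'selected' when its minimum SNP P value is below the threshold."""
    return {gene for gene, ps in snp_p_by_gene.items() if ps and min(ps) < threshold}


def fisher_right_tail(k, n, K, N):
    """P(X >= k) when a set of n genes is drawn from N genes, K of which are selected."""
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total


# Hypothetical background counts: NOT taken from the paper.
N_GENES, N_SELECTED = 17_000, 1_700

# Observed counts for synaptic transmission in the NIMH PBA-BP (Table 4): 40 of 164.
p_synaptic = fisher_right_tail(k=40, n=164, K=N_SELECTED, N=N_GENES)
print(f"right-tail P under the assumed background: {p_synaptic:.3g}")
```

Exact integer arithmetic via `math.comb` avoids the underflow that a naive floating-point product of factorials would hit for gene sets of this size.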

WTCCC replication

The most significantly-enriched NIMH PBA-BP gene set, synaptic transmission, was replicated in both WTCCC PBAs (frequentist P = 0.002368, Bayesian P = 2.92 × 10−04), though this set retained significance after Bonferroni correction in the NIMH study only. Ten of the 15 gene sets meeting our conservative criteria and an additional 5 of the 18 under our more liberal criterion were replicated in one or both WTCCC PBAs (Table 4 and Table S1). The second-ranked gene set in the primary NIMH analysis, axon guidance, also retained significance after Bonferroni correction in both the NIMH and Bayesian WTCCC PBA.

Seventy genes contribute to the significance of the synaptic transmission gene set across all three studies, and 70% of them are unique to a single study (Table S2). Moreover, only eight genes were common to all three analyses. Thus, the appearance of synaptic transmission as a significantly-enriched gene set under conservative PBA significance criterion in each of the three analyses and the relatively minimal overlap in the genes responsible for the significance of the gene set across studies underscores the potential relevance of this gene set to bipolar susceptibility and suggests a role for heterogeneous loci within this biological pathway.

Cellular component

NIMH

Among the most significant results in the cellular component PBAs (PBA-CC) are membrane and submembrane cellular components, including membrane-bound protein complexes (see Table 5 and Table S3). The significance of the larger gene sets (e.g., membrane, integral to membrane) is difficult to interpret since they contain many gene subsets with unknown contributions to the significance of the parent set. Thus, smaller gene sets (e.g., those implicating membrane subcomponents) may be more illuminating. For example, of those gene sets with fewer than 300 genes, postsynaptic membrane (20 of 96 genes, P = 7.29 × 10−04; 100,000 permutation-based P = 7.50 × 10−04), voltage-gated potassium channel complex (19 of 91 genes, P = 9.51 × 10−04; 100,000 permutation-based P = 8.90 × 10−04), and voltage-gated calcium channel complex (7 of 20 genes, P = 0.001915; 100,000 permutation-based P = 0.00201), each a membrane subcomponent set, are among the most enriched sets.
Table 5

Results given for gene sets with P < 0.01 from the NIMH PBA-CC data and any corresponding WTCCC PBA results reaching P < 0.05

| NIMH selected/total | NIMH Fisher exact right tail | GO cellular component | WTCCC Bayes Gen. selected/total | WTCCC Bayes Gen. Fisher exact right tail | WTCCC Freq. Gen. selected/total | WTCCC Freq. Gen. Fisher exact right tail |
| --- | --- | --- | --- | --- | --- | --- |
| 547/4793 | 7.18E-06§ | 0016020: membrane | 554/4756 | 4.543E-06§ | 500/4917 | 0.0034 |
| 435/3738 | 1.15E-05§ | 0016021: integral to membrane | 433/3706 | 4.354E-05* | 381/3829 | 0.03101 |
| 20/96 | 7.29E-04 | 0045211: postsynaptic membrane | 26/92 | 4.863E-07§ | 17/98 | 0.007087 |
| 19/91 | 9.51E-04 | 0008076: voltage-gated K+ channel complex | | | | |
| 10/35 | 0.001303 | 0031012: extracellular matrix | | | | |
| 7/20 | 0.001915 | 0005891: voltage-gated Ca2+ channel complex | | | 5/21 | 0.03689 |
| 181/1539 | 0.002788 | 0005886: plasma membrane | 211/1508 | 7.309E-08§ | 178/1571 | 0.001365 |
| 29/177 | 0.003215 | 0045202: synapse | 34/172 | 5.766E-05* | 30/180 | 0.0008816 |
| 112/920 | 0.005989 | 0005887: integral to plasma membrane | 112/908 | 0.006815 | | |

Results include for each set the number of genes selected out of the total number of genes in the set, the Fisher’s exact P value, and the GO ID number and name. Italic indicates replication (at P < 0.05) in one of the two WTCCC WGA data analyses, bold indicates replication in both

K+ potassium, Ca2+ calcium, Bayes Gen. Bayesian genotypic, Freq. Gen. Frequentist genotypic

§Significance after Bonferroni correction for 302 hypotheses at P < 0.01

* Significance after Bonferroni correction for 302 hypotheses at P < 0.05
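
The permutation-based P values quoted alongside the Fisher’s exact P values can be approximated with a simple resampling scheme: draw random gene sets of the same size from the gene universe and count how often they contain at least as many selected genes as the observed set. This section does not describe EVA’s permutation procedure, so the sketch below is an assumption about its general shape, and the gene universe (2,000 genes, 200 selected) is hypothetical.

```python
import random


def permutation_p(observed_k, set_size, selected_flags, n_perm=10_000, seed=0):
    """Share of random gene sets with at least observed_k selected genes."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        draw = rng.sample(selected_flags, set_size)
        if sum(draw) >= observed_k:
            hits += 1
    return hits / n_perm


# Hypothetical universe: 1 marks a selected gene, 0 an unselected one (not from the paper).
universe = [1] * 200 + [0] * 1_800

# Observed counts for voltage-gated Ca2+ channel complex in the NIMH PBA-CC (Table 5): 7 of 20.
p_perm = permutation_p(observed_k=7, set_size=20, selected_flags=universe)
print(f"permutation-based P under the assumed universe: {p_perm:.4f}")
```

When no random draw reaches the observed count, the estimate is 0 out of `n_perm`, which is how a result such as “0/100,000 permutations” arises; it bounds the P value below 1/`n_perm` rather than showing it to be exactly zero.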

WTCCC replication

In the replication PBAs, all of the very large membrane-related gene sets replicated; among the smaller gene sets, postsynaptic membrane, synapse and voltage-gated calcium channel complex replicated in one or both WTCCC PBAs (see Table 5 and Table S3).

Ontology interactions

Notably, 11 gene sets showed very significant enrichment (P < 1.00 × 10−04) in the NIMH PBA-MF × BP analysis. All 11 related to interactions between ion channel and/or ion binding activity molecular functions and ion transport biological processes. Of these 11 highly significantly enriched interacting gene sets in the primary NIMH analysis, five were replicated (at P < 0.01) in one or both WTCCC PBA-MF × BP analyses. One of the two that replicated in both is actually a subset of the other, so they represent evidence of the same interacting set: ion channel activity (MF) × ion transport (BP).

In the PBA-MF × CC analyses, two gene sets were enriched at the P < 1.00 × 10−04 level and an additional eleven were enriched at P < 1.00 × 10−03 level. All 13 related to interactions between ion channel and/or ion binding activity molecular functions and membrane-related cellular components. Of these 13 most significant sets identified in the NIMH PBA-MF × CC, 6 were replicated (P < 0.01) in one or both WTCCC MF × CC PBAs.

Biopathway analyses

From our PBA using GenMAPP Pathways annotations (aka Biopathway), six met conservative gene set significance criterion in the NIMH PBA (with the top two retaining significance after Bonferroni correction), and all were replicated in at least one WTCCC PBA (Table 6). The top three gene sets in all three primary analyses were no annotation, calcium regulation in cardiac cells, and G-protein signaling. Given the significance and size (13,634 genes) of the set of genes missing GenMAPP annotation and the strong significance of cardiac calcium regulation pathway in bipolar WGAS results, we ran a PBA examining the interaction between molecular function and Biopathway. Of 8 gene set interactions showing enrichment at P < 1.00 × 10−04, 6 are ion channel/binding activity molecular function sets interacting with either calcium regulation in cardiac cells (3) or no annotation (3) (See Table S6).
Table 6

Results given for gene sets with P < 0.05 from NIMH PBA-Biopathways data and any corresponding WTCCC PBA results reaching P < 0.05

| NIMH selected/total | NIMH Fisher exact right tail | GenMAPP pathways | WTCCC Bayes Gen. selected/total | WTCCC Bayes Gen. Fisher exact right tail | WTCCC Freq. Gen. selected/total | WTCCC Freq. Gen. Fisher exact right tail |
| --- | --- | --- | --- | --- | --- | --- |
| 1267/13187 | 1.37E-06§ | <no annotation> | 1285/13172 | 7.56E-09§ | 1263/13635 | 5.45E-07§ |
| 28/126 | 5.79E-06§ | Ca2+ regulation in cardiac cells | 24/127 | 0.000388* | 22/129 | 0.001567 |
| 16/81 | 0.002102 | G Protein Signaling | 19/83 | 0.000119§ | 13/83 | 0.02565 |
| 5/15 | 0.008212 | GPCRDB Class C Metabotropic glutamate pheromone | | | 4/15 | 0.0352 |
| 21/140 | 0.01393 | Smooth muscle contraction | 20/142 | 0.03039 | | |
| 6/26 | 0.02535 | Prostaglandin synthesis regulation | | | 6/27 | 0.02534 |

Results include for each set the number of genes selected out of the total number of genes in the set, the Fisher’s exact P value, and the pathway name. Italic indicates replication (at P < 0.05) in one of the two WTCCC WGA data analyses, bold indicates replication in both

Ca2+ calcium, Bayes Gen. Bayesian genotypic, Freq. Gen. Frequentist genotypic

§Significance after Bonferroni correction for 62 hypotheses at P < 0.01

* Significance after Bonferroni correction for 62 hypotheses at P < 0.05

These results suggest that the significance of calcium regulation in cardiac cells across the main effects PBAs likely reflects the fact that a substantial proportion of the genes in this Biopathway set are also ion channel activity genes, most of which also have known or predicted activity in the brain. Perhaps more notably, examination of the MF × GenMAPP results shows that the majority of the genes comprising the top-ranked sets have ion channel-related molecular functional annotations and no GenMAPP annotations. For example, the K+ binding activity × no annotation gene set comprises 101 genes. Thus, 101 out of a total of 111 K+ binding activity genes (the Table 1 ‘selected/total’ column shows the total number of genes comprising each ranked gene set) have no GenMAPP annotations. Similarly, 127 out of 138 voltage-gated ion channel and 65 out of 66 K+ channel activity genes have no GenMAPP annotation.
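
The counts above imply that an interaction gene set behaves as the intersection of two annotation sets, with genes lacking any pathway annotation pooled into a catch-all group. A minimal sketch of that construction is given below. The gene IDs and set memberships are placeholders, and the helper names `with_no_annotation` and `interaction_sets` are hypothetical rather than taken from EVA.

```python
from itertools import product

NO_ANNOTATION = "<no annotation>"


def with_no_annotation(annotations, universe):
    """Add a catch-all set holding every gene that no annotation covers."""
    covered = set().union(*annotations.values()) if annotations else set()
    out = dict(annotations)
    out[NO_ANNOTATION] = set(universe) - covered
    return out


def interaction_sets(annot_a, annot_b):
    """Intersect every pair of gene sets drawn from two annotation schemes."""
    pairs = {}
    for (name_a, genes_a), (name_b, genes_b) in product(annot_a.items(), annot_b.items()):
        shared = genes_a & genes_b
        if shared:
            pairs[(name_a, name_b)] = shared
    return pairs


# Toy annotations with placeholder gene IDs (hypothetical, not from the paper).
universe = {f"g{i}" for i in range(1, 9)}
molecular_function = {"K+ ion binding": {"g1", "g2", "g3", "g4"}}
biopathway = with_no_annotation({"Ca2+ regulation in cardiac cells": {"g4", "g5"}}, universe)

mf_x_pathway = interaction_sets(molecular_function, biopathway)
unannotated = mf_x_pathway[("K+ ion binding", NO_ANNOTATION)]
print(len(unannotated), "of", len(molecular_function["K+ ion binding"]),
      "K+ ion binding genes lack a pathway annotation")
```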

Discussion

Summary of key findings

The primary NIMH PBA analysis reveals 16 statistically-significantly enriched gene sets at P < 0.01, and an additional 26 (out of 500 allowed by parameter settings) at a nominal significance level of P < 0.05. The most striking finding in the primary NIMH data analysis is that 9 of the 16 most significant NIMH gene sets (and 5 of 26 under liberal criterion) comprise ion channel/binding/transporter sets, eight of which were replicated in at least one (and five in both) of the two primary WTCCC analyses. Additionally, several gene sets involved in the regulation of receptors and/or intracellular signaling were also replicated by one or both WTCCC PBAs.

Ion channels and related regulatory proteins

Voltage-gated ion channel activity was the single most significantly enriched gene set in the NIMH PBA and was replicated in both WTCCC PBAs. Moreover, several additional ion channel gene sets, including several ion-selective and voltage-gated ion channel subsets, were implicated in the NIMH study and replicated in one or more WTCCC analyses (Table 1). In addition, the second-ranked NIMH gene set, ion channel activity, was replicated in both WTCCC analyses. These findings suggest that, relative to the NIMH sample, which appeared to be more confined to disease-associated variation in voltage-gated ion channel genes (and subsets thereof), the WTCCC sample may have contained proportionally more disease-associated variation in particular voltage-gated potassium channel subsets and in non-voltage-gated ion channels, including calcium-, ATP- and G-protein-gated channels.

Interestingly, in comparing the lists of ‘selected’ voltage-gated ion channel genes (i.e., at least P < 0.01) rendering the gene sets significant in each PBA, there is only modest overlap in the particular genes that drive the significance of this gene set across the three analyses. This observation strongly suggests that the significance of the voltage-gated ion channel activity gene set is not driven by a handful of very significant results common to the three studies. Rather, it suggests two non-exclusive explanations: first, that what might be important about the particular genes conferring susceptibility to bipolar disorder is that they share a molecular function, namely mediating voltage-gated ion channel activity; and second, that heterogeneity may figure prominently in the genetic architecture of this susceptibility. Ultimately, this suggests that the search for individual genes and genetic variation may only be effective if we first uncover the genetic architecture of susceptibility and the molecular mechanisms likely to be relevant to pathogenesis.

That said, it is also interesting to note the overlap that does exist in the significant genes within the voltage-gated ion channel activity gene set across the studies. Four genes encoding ion channel subunits (KCNQ3, KCNMA1, CACNA1E and KCNIP4, a regulatory subunit) met our gene significance threshold criterion in all three analyses, and ten additional genes (primarily voltage-gated potassium and calcium channel genes) met gene significance threshold criterion in two of the three analyses. KCNIP4 is a regulatory subunit of Shal-type voltage-gated rapidly inactivating potassium channels (e.g., KCND2) and probably modulates channel density and inactivation kinetics. KCNQ3 is a slowly activating/deactivating potassium channel and KCNMA1 is a potassium channel activated by both membrane depolarization and cytosolic calcium concentrations. All three subunits are known to be important in the regulation of neuronal excitability and responsiveness to synaptic inputs. CACNA1E mediates entry of calcium into excitable cells and is involved in the regulation of neurotransmitter release. CACNA1C, not surprisingly, was selected in the WTCCC frequentist and NIMH PBAs. The ion channel genes noted in the original WTCCC (2007) WGAS, KCNC2, and in the WGAS by Baum et al. (2008), NALCN (aka VGCNL1), were also among the 55 voltage-gated ion channel genes conferring the set’s enrichment in the PBAs. Thus, the full set of 55 genes may represent plausible candidates for future fine-mapping or resequencing efforts.

There are a number of additional ion channel/transporter gene sets or subsets that were found to be enriched in only one of the three primary analyses but bear close functional relationships to gene sets enriched in at least one other primary analysis (Table 2). These include several voltage-gated ion channel activity subsets (i.e., voltage-gated calcium, chloride and sodium channel activity), as well as chloride channel, cation channel, antiporter, and anion transmembrane transporter activity gene sets.

A substantial body of evidence supports a susceptibility role in many neuropsychiatric disorders, including the affective disorders, for genes involved in the regulation of synaptic neurotransmission (and monoaminergic systems, in particular). Many studies of functional pathways and neural networks, and a host of association studies demonstrate, on the one hand, the salience of such hypotheses, but on the other, their lack of full resolution (Abdolmaleky et al. 2008; Bloom 1984; Haavik et al. 2008; Jones and Craddock 2001; Lopes Aguiar et al. 2008; Mokrovic et al. 2008; Patrick 2000; Ressler and Nemeroff 2000; Talkowski et al. 2008; Vanyukov et al. 2007). Early molecular hypotheses and investigations also suggested a potential role for proteins involved in ion transport, including ion channels, ATPases and other ion transporters in the etiology and/or pathophysiology of bipolar and related disorders (Akagawa et al. 1980; Hokin-Neaverson and Jefferson 1989; Meyer et al. 2005; Mynett-Johnson et al. 1998; Wittekindt et al. 1998). In addition to our current PBA findings, consideration of the role of ion transport in the pathophysiology of neuropsychiatric disorders is warranted for several reasons. First, ion channels and transporters are implicated by a host of linkage findings (Askland 2006) and more recent WGAS (Ferreira et al. 2008; Sklar et al. 2008; The Wellcome Trust Case Control Consortium 2007). In addition to the WGAS findings related to ion channels noted in the Introduction, the WTCCC (2007) WGAS identified four regions with significant disease association at P < 5 × 10−07 in the expanded reference group analysis, one of which was most proximate to KCNC2, a gene encoding a Shaw-related voltage-gated potassium channel. Soon thereafter, Baum et al. (2008) found evidence for association with the VGCNL1 gene (official name now NALCN, a non-selective sodium-leak channel) encoding a voltage-gated ion channel. As these investigators note, VGCNL1 is highly expressed in the brain and lies within a chromosomal region with previous bipolar linkage evidence in several studies.

Second, in addition to being directly involved in the regulation of neuronal excitability, ion channel proteins demonstrate a remarkable and relatively unique simultaneity of phylogenetic sequence conservation, immense isoform diversity, and developmental and distributional specificity, especially in the brain (Abernethy and Soldatov 2002; Anderson and Greenberg 2001; Mechaly et al. 2005; Sailer et al. 2004; Strong et al. 1993; Triggle et al. 2006; Trimmer and Rhodes 2004; Waxman 2000; Waxman et al. 2002, 2000; Wolfart et al. 2001). In an attempt to assess the potential candidacy of ion channels in episodic nervous system disorders, Freudenberg et al. (2007) conducted a comparative genomic analysis using bioinformatics data to look for features of human CNS-expressed ion channels that might correlate with their known disease relevance in monogenic forms of episodic nervous system disorders. In addition to confirming the known properties noted above, the investigators demonstrated that ion channel genes show a combination of low nonsynonymous and high synonymous substitution rates, a pattern typical of genes causing monogenic neurological disease and indicative of more constrained protein sequence evolution. Notably, this pattern was most pronounced among the voltage-gated ion channel activity subset (see Freudenberg et al., Supp Table 2). Perhaps more interesting, the authors conducted a similar analysis on groups of non-ion channel genes independently suggested as affective disorders (AD) candidate genes by previous authors, predicting that these genes would display a phylogenetic pattern similar to that of ion channels if such a pattern is related to their central molecular roles in neuronal and CNS functioning. The authors found that the AD candidates actually displayed a similar, though less pronounced, phylogenetic pattern. In conclusion, the authors suggest a high potential relevance of mutations that regulate ion channel expression in episodic CNS disorders.

Third, numerous molecular studies conducted in ion transporting proteins have demonstrated that very small, non-lethal genetic alterations can confer a range, from subtle to substantial, of changes in protein functioning (Bracey and Wray 2006; Meisler et al. 2002). Fourth, multiple genetic disorders with both strong characteristic similarities and high comorbidity with bipolar disorder (e.g., epilepsy, migraines) are known to be caused by any one of a number of ion channel or ion-transporting ATPase mutations (Bahi-Buisson et al. 2007; Castro et al. 2007; Chandy et al. 2006; Dichgans et al. 2005; Doering and Zamponi 2005; Fernandez et al. 2008; Graves 2006; Grisar et al. 1992; Kors et al. 2002; Mossner et al. 2005; Nappi et al. 2000).

Neurotransmitter receptors and related regulatory proteins

GABA and glutamate signaling have been postulated to have a role in many neuropsychiatric diseases including bipolar disorder. Alongside clinical observations of the effectiveness of pro-GABAergic agents in the acute management of bipolar manias, GABA and glutamate are the brain’s respective primary inhibitory and excitatory neurotransmitters, modulating neurotransmission and neuronal excitability throughout the brain. In keeping with such clinical and functional observation, evidence from linkage, association and, most recently, WGAS lends support to the possible disruption of GABAergic functioning in some cases of bipolar disorder. The original WTCCC WGAS study found moderate genomewide statistical support for disease association with a marker in GABRB1 (P = 6.2 × 10−05), encoding a ligand-gated ion channel (GABA-A receptor, beta 1). Though not reaching liberal significance criterion in the NIMH PBA, several related neurotransmitter receptor gene sets were found to be significant in one of the WTCCC PBAs, including: ionotropic, metabotropic (GABA-B-like) and kainate-selective glutamate receptor, GABA-A receptor, dopamine receptor and adrenoceptor activity. These findings further validate the distinctions observed in the patterns of enrichment in ion channel gene sets between the NIMH and WTCCC analyses. Namely, ion channel activity was more significantly enriched than voltage-gated ion channel activity in the WTCCC analyses and this may be because variation in a host of ligand-gated ion channel genes (i.e., receptor genes) is more prominent in the WTCCC sample.

Finally, several additional gene sets, which do not encode structural receptor or channel subunits but do encode proteins involved in receptor and channel regulatory functions (i.e., guanyl-nucleotide exchange factor activity, transmembrane receptor protein tyrosine phosphatase activity, protein kinase C activity and diacylglycerol binding), were each replicated in at least one WTCCC PBA. By catalyzing the release of GDP from, and the subsequent uptake of GTP by, G proteins, guanyl exchange factors (GEFs) enable G proteins to activate downstream effectors such as kinases, adenylate cyclases and ion channels (Sprang 2001). Additionally, the functions of ion channels (e.g., the flow of ions across cell membranes) can be regulated by their phosphorylation state which, in turn, is tightly mediated by protein kinases and protein phosphatases. Such channel proteins can associate directly with kinases or phosphatases or through intermediate adaptor proteins (MacFarlane and Levitan 2001).

Limitations

Biopathways

It is clear that ion channels, neuroreceptors and other proteins involved in critical processes of neurotransmission are conspicuously absent within the Biopathway annotations. While we await pathways annotations for these genes, any analyses conducted using such annotations should be undertaken with the understanding that gene coverage is limited.

Gene sets

As might be readily observed from the listing of significantly-enriched gene sets across all analyses, gene sets derived from gene ontology annotations are not mutually exclusive. The hierarchical structure of gene ontologies allows for a number of gene groupings at various levels of specification. However, it also means that any gene set based upon an ontologic grouping may have any degree of overlap with any other such gene set. Furthermore, since EVA does not allow one to ‘control’ for one gene set while testing another, interpretation of the appearance of multiple nested gene subsets within the most significant results is challenging. This can be addressed, to some extent, by visual examination of the implicated gene set lists for overlap and relative rank.
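
The visual check for nested or overlapping gene sets described above could be partly automated by scoring each pair of gene sets for containment, that is, the fraction of the smaller set that also appears in the larger one. The sketch below is illustrative only: the set memberships are toy examples built from gene symbols mentioned in this paper, not the actual GO annotations, and `containment` is a hypothetical helper.

```python
from itertools import combinations


def containment(a, b):
    """Fraction of the smaller set that lies inside the larger one (1.0 means fully nested)."""
    small, large = sorted((a, b), key=len)
    return len(small & large) / len(small) if small else 0.0


# Toy memberships for illustration; real GO gene sets are far larger.
gene_sets = {
    "ion channel act.": {"KCNQ3", "KCNMA1", "CACNA1E", "GABRB1"},
    "voltage-gated ion channel act.": {"KCNQ3", "KCNMA1", "CACNA1E"},
    "K+ channel act.": {"KCNQ3", "KCNMA1"},
    "Ca2+ channel act.": {"CACNA1E", "CACNA1C"},
}

overlaps = {
    (a, b): containment(gene_sets[a], gene_sets[b])
    for a, b in combinations(gene_sets, 2)
}
for (a, b), score in sorted(overlaps.items(), key=lambda item: -item[1]):
    print(f"{a} vs {b}: {score:.2f}")
```

Pairs scoring 1.0 are fully nested, so their enrichment results are best read as evidence for a single underlying signal rather than as independent findings.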

In its current form, EVA allows the user to set a minimum, but not a maximum, gene set size for the analysis. Thus, as can be observed in the results (especially in the PBA-CC), several gene sets that appear in the top-ranked results are so large as to provide no useful information. Nonetheless, since the PBA assesses each gene set independently for enrichment relative to the likelihood of finding that proportion of enrichment by chance alone, the presence of large gene set annotations does not affect the statistics generated for any other gene set.
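
Because the maximum set size cannot be set within the analysis itself, one workaround is to filter the ranked results afterwards. The sketch below applies the cutoff of fewer than 300 genes used in the cellular component results to the NIMH rows of Table 5; the tuple layout and the function name `filter_by_size` are assumptions made for illustration.

```python
# NIMH PBA-CC rows from Table 5: (gene set, selected, total, Fisher exact right-tail P).
table5_nimh = [
    ("membrane", 547, 4793, 7.18e-06),
    ("integral to membrane", 435, 3738, 1.15e-05),
    ("postsynaptic membrane", 20, 96, 7.29e-04),
    ("voltage-gated K+ channel complex", 19, 91, 9.51e-04),
    ("extracellular matrix", 10, 35, 0.001303),
    ("voltage-gated Ca2+ channel complex", 7, 20, 0.001915),
    ("plasma membrane", 181, 1539, 0.002788),
    ("synapse", 29, 177, 0.003215),
    ("integral to plasma membrane", 112, 920, 0.005989),
]


def filter_by_size(rows, max_size):
    """Keep gene sets with fewer than max_size genes, preserving the original ranking."""
    return [row for row in rows if row[2] < max_size]


small_sets = filter_by_size(table5_nimh, max_size=300)
for name, selected, total, p in small_sets:
    print(f"{name}: {selected}/{total}, P = {p:g}")
```

Since each gene set is tested independently, dropping the large sets after the fact does not change the P values of the sets that remain.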

Future directions

It is our hope that the results of this and similar analyses will enable more circumscribed and therefore more cost-efficient follow-up association, resequencing and functional analyses. Ultimately, the identification of genetic susceptibility will provide a foundation for elucidating the intervening pathophysiologic mechanisms at cellular and system levels mediating behavioral phenotypes and, thereby, provide a number of biologically-rational targets for therapeutic development.

Ion channel and receptor-related gene sets encode groups of proteins comprising putative targets of the vast majority of the current arsenal of psychotropic medications for the treatment of bipolar disorder (and psychiatric illness, more generally). While the precise molecular mechanisms by which psychotropics effect symptom control in the human brain remain a matter of ongoing investigation and debate in the field, there is substantial agreement about the molecular targets and pharmacologic binding properties of these drugs in vitro. In particular, there is reasonable agreement that antidepressants and antipsychotics target particular monoaminergic receptors and transporters; that benzodiazepine anxiolytics (often used in the acute management of agitation in bipolar disorder) enhance GABAergic tone, while several mood stabilizing agents also have putative antiglutamatergic action; and that antiepileptic mood stabilizers, the mainstay (along with lithium) of acute and maintenance treatment in bipolar disorder, act in the brain via ion channel blockade, with varying levels of channel-type specificity and binding affinities. Not only does our PBA provide a potential rationale for the observed effectiveness of these agents in the management of bipolar disorder, but it also suggests that further drug development within these classes may be warranted. For example, if follow-up association, resequencing and/or functional analyses confirm the relevance of particular ion channel genes and of particular functional alterations to the etiology of bipolar disorder, the development of agents with more refined molecular targeting has the potential to enhance effectiveness and diminish side effects that arise from non-selective agents.

Acknowledgments

This manuscript is, in part, the product of preliminary analyses for an NIMH K08 application (K08 MH085810-01), currently under review. Additionally, the analyses in this paper would not have been possible without the primary WGAS data collection work of The Wellcome Trust Case Control Consortium and the Sklar et al. research groups. The authors would like to acknowledge Peter Andrews, B.S., EVA software engineer in the Computational Genetics Lab at Dartmouth Medical School, for his invaluable assistance in our analysis. We also acknowledge Drs. Benjamin Greenberg and Steven Rasmussen from The Warren Alpert School of Medicine at Brown University for their editorial reviews and suggestions. Finally, we thank the anonymous reviewers for their incisive and valuable reviews of our manuscript.

Supplementary material

Supplementary Material (DOC 274 kb)

Copyright information

© Springer-Verlag 2008