The genetics of gene expression in complex mouse crosses as a tool to study the molecular underpinnings of behavior traits

Hitzemann, Robert; Bottomly, Daniel; Iancu, Ovidiu; Buck, Kari; Wilmot, Beth; Mooney, Michael; Searles, Robert; Zheng, Christina; Belknap, John; Crabbe, John; McWeeney, Shannon

doi:10.1007/s00335-013-9495-6

Open access
Published: 31 December 2013

Volume 25, pages 12–22, (2014)
Mammalian Genome

Robert Hitzemann^1,2,
Daniel Bottomly³,
Ovidiu Iancu²,
Kari Buck^1,2,
Beth Wilmot^3,4,
Michael Mooney⁴,
Robert Searles⁵,
Christina Zheng⁴,
John Belknap^1,2,
John Crabbe^1,2 &
…
Shannon McWeeney^1,3,4,6

Abstract

Complex Mus musculus crosses provide increased resolution to examine the relationships between gene expression and behavior. While the advantages are clear, there are numerous analytical and technological concerns that arise from the increased genetic complexity that must be considered. Each of these issues is discussed, providing an initial framework for complex cross study design and planning.

Introduction

Sandberg et al. (2000), using Affymetrix microarrays, were the first to detect differences in genome-wide brain gene expression between two inbred mouse strains (C57BL/6J [B6] and 129SvEv [129; now 129S6/SvEvTac]). Importantly, these authors observed that some differentially expressed (DE) genes were found in chromosomal regions with known behavioral quantitative trait loci (bQTLs). For example, Kcnj9, which encodes GIRK3, an inwardly rectifying potassium channel, was differentially expressed (higher expression in the 129 strain) and is located on distal chromosome 1 in a region where QTLs had been identified for locomotor activity, alcohol and pentobarbital withdrawal, open-field emotionality, and certain aspects of fear-conditioned behavior. This study was unable to address the question of whether the elements regulating Kcnj9 expression were located within the QTL intervals and/or near the gene locus. However, it is possible to extract such causal relationships by combining gene expression and genotype data in genetically segregating populations. Jansen and Nap (2001) were among the first to suggest this approach, which they termed “genetical genomics”. Although originally described for Arabidopsis, the strategy was quickly used to examine gene expression in Drosophila, yeast, and the mouse (see Lum et al. 2006 and references therein). Schadt et al. (2003) and others defined expression QTLs (eQTLs) as either cis (mapping near the gene locus) or trans (mapping elsewhere in the genome). When bQTLs and cis-eQTLs overlap, the cis-eQTL genes are inferred to be strong quantitative trait gene (QTG) candidates (see e.g. Farris et al. 2010). The situation for trans-eQTLs is more complicated since the QTL confidence interval is generally larger and any gene within the QTL interval could have a regulatory role.
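The cis/trans distinction described above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the function name, the argument layout, the example coordinates, and in particular the 1 Mbp window are hypothetical choices made for this sketch, and published studies use a range of window definitions.

```python
# Minimal sketch: label an eQTL as cis or trans by its distance from the gene.
# The 1 Mbp default window is an illustrative assumption, not a value taken
# from the studies cited in the text.

def classify_eqtl(gene_chrom, gene_pos_bp, eqtl_chrom, eqtl_peak_bp,
                  cis_window_bp=1_000_000):
    """Return 'cis' if the eQTL peak lies near the gene locus, else 'trans'."""
    if gene_chrom != eqtl_chrom:
        return "trans"  # a different chromosome is necessarily trans
    if abs(gene_pos_bp - eqtl_peak_bp) <= cis_window_bp:
        return "cis"    # same chromosome and within the window
    return "trans"      # same chromosome but distant from the gene


# Hypothetical coordinates, used only to exercise the function.
print(classify_eqtl("1", 172_000_000, "1", 172_400_000))
print(classify_eqtl("1", 172_000_000, "7", 45_000_000))
```

In practice the window is often tied to the expected QTL confidence interval of the cross rather than fixed, which is one reason the cis/trans boundary differs between studies.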

The application of genetical genomics to the mouse has generally focused on segregating populations involving two inbred strains, one of which is very frequently the B6 strain. Descriptions of these applications are found in the following section. The data analysis is relatively straightforward, especially because good sequence data are available for essentially all strains that would ever be used in a behavioral experiment (Keane et al. 2011). There are, however, problems with the two-strain intercross approach. First, two strains will capture only a fraction of the genetic diversity that is available in Mus musculus (Roberts et al. 2007; Keane et al. 2011). Behavioral techniques and apparatus have been engineered for the placid, and some would argue somnambulant, laboratory strains of mice that are highly related (Roberts et al. 2007). Using SNPs as a surrogate for genetic diversity, a B6 x DBA/2J (D2) F2 intercross has only 1/6 the genetic diversity of a heterogeneous stock (HS) formed from the eight inbred strains used to form the collaborative cross (CC) (Churchill et al. 2004; Iancu et al. 2010); the CC founder strains include three wild-derived strains. Crosses of low genetic diversity are not optimal for systems biology applications (Churchill et al. 2004; Threadgill and Churchill 2012). Second, given high-quality sequence data and dense genotyping platforms, the use of complex crosses allows one to extract for any QTL a haplotype structure, which in turn can markedly reduce the QTL confidence interval, in some cases to less than 1 Mbp. Although QTLs of this size are still 1–2 orders of magnitude larger than QTLs detected in human association studies, the reduction in size, especially in gene-poor regions, is still sufficient to focus the analysis on a handful of candidates.

This article focuses on the use of complex crosses to examine the relationships between gene expression and behavior. Some historical background is provided as the field has moved from simple to complex segregating populations. While the advantages of complex crosses are obvious, there are several disadvantages, especially ones associated with data analysis. Microarray platforms were not designed for complex crosses, and thus RNA-Seq becomes the preferred strategy for assessing gene expression. While RNA-Seq allows one to examine not only gene expression but also the expression of non-coding RNAs, alternative splicing, and allele-specific expression, the data analysis is computationally intensive. An additional consideration is that the inclusion of wild-derived strains in the HS-CC has sometimes limited the application of this population for mapping certain behavioral responses. Behavioral testing protocols in mice have been primarily established for assessment in the common laboratory strains, and the increased locomotor activity associated with the inclusion of the wild-derived alleles has raised concerns about testing validity (see Logan et al. 2013 for a recent examination of the potential impact in the Diversity Outbred).

Model systems for complex populations

One could begin a discussion of brain gene expression, behavior, and complex crosses with Sandberg et al. (2000) (see above), but to fully understand the role of mouse complex crosses in this equation, it is perhaps best to start with a series of papers that appeared more than 20 years ago and demonstrated that it was possible to map QTLs for behavioral traits in recombinant inbred (RI) strains of mice (e.g. Gora-Maslak et al. 1991; Belknap 1992). While several RI panels were available, it was the BXD RI panel (Taylor 1978) that was most widely used. These papers and confirmatory F2 intercross studies clearly established two important and related points. The first was that QTL effect sizes were generally small; the second, as a consequence, was that QTL confidence intervals were typically very large, frequently more than 25 cM (or ~50 Mbp). As a result, it was almost impossible to know which gene or genes within the QTL interval were causally related to the phenotype of interest. This search was of course further complicated at the time by the poor annotation of the mouse genome. Several strategies were developed to reduce the QTL interval (see e.g. Darvasi 1998). These included the use of interval-specific congenic strains, mapping in advanced intercross populations, recombinant progeny testing, and the recombinant inbred segregation test. Talbot et al. (1999) used a variant of the advanced intercross strategy to map QTLs for open-field behavior in a heterogeneous stock (HS) created from eight inbred laboratory mouse strains. A subsequent analysis of these data (Mott et al. 2000) revealed that it would be possible to map QTLs with good precision and extract an approximate QTL haplotype structure. However, despite these and other improvements, only a very small number of behavioral quantitative trait genes (bQTGs) have been identified (see e.g. Shirley et al. 2004). Although QTL resolution at the gene level is not typical in some mouse populations, it can be approached in some commercially available outbred populations (Yalcin and Flint 2012) and in interval-specific congenic lines (Shirley et al. 2004).

Several approaches have been used to identify and prioritize candidate genes within a QTL interval. These initially focused on allelic sequence variation, but even just a decade ago this was possible only if one was willing to sequence individual genes. Today, given the availability of high-quality inbred strain sequence data (Keane et al. 2011), it is possible to interrogate a QTL interval and determine which genes harbor non-synonymous coding SNPs that match the QTL profile. An alternative approach, which was widely adopted, was to integrate QTL analysis and gene expression profiling, emphasizing the genetical genomics approach (Jansen and Nap 2001). The emphasis on this approach was key to the development of WebQTL (Wang et al. 2003). Gene expression data from multiple brain regions were made available for the B6 and D2 inbred strains and 32 BXD RI strains. Also posted at the website were a variety of RI strain behavioral and genotype data. For many investigators, this was the first portal for examining how the natural variation in gene expression and behavior were correlated. Over the years, the website has been updated by the inclusion of brain gene expression data from other RI panels, mouse F2 intercrosses, additional BXD RI strains, and a significant number of inbred mouse strains, including whole brain and brain regional data. The data have been used in a variety of ways, including detecting how patterns of gene co-expression have behavioral associations (Chesler et al. 2005).

Peirce et al. (2006) mined the data to address the question of “how reliable are eQTLs?”. These authors noted that for B6xD2 genotypes, cis-eQTLs are highly replicable but that there is an overabundance of eQTLs where the B6 strain is associated with higher expression. These data suggested that some of these eQTLs were artifacts due to SNPs and the poor hybridization of the D2 cDNA. Subsequent experiments showed that this was indeed the case (Walter et al. 2007, 2009). Flint and colleagues (see Solberg et al. 2006; Valdar et al. 2006a, b) mapped QTLs for a variety of behavioral phenotypes in >2,000 HS animals; this HS population (HS/NPT), also an eight-strain cross, differed from that used by Talbot et al. (1999). Importantly for this article, Flint and colleagues collected hippocampal gene expression data on 460 animals (Huang et al. 2009). Similar to Peirce et al. (2006), Huang et al. (2009) concluded that a significant proportion of the cis-eQTLs were hybridization artifacts. Nonetheless, and not unexpectedly, the number of “true” cis-eQTLs appeared to be significantly greater than the number previously detected in simpler crosses; i.e., in the HS population, additional regulatory alleles are detected. Similar results were obtained for gene expression in a simpler HS (HS4), derived from crossing four laboratory strains (Malmanger et al. 2006).

The CC (Churchill et al. 2004) was formed to provide a unique systems biology resource that addresses many shortcomings in available mouse strain resources, such as limited genetic diversity. The goal was to generate >1,000 RI strains formed from eight inbred strain founders that capture >90 % of the genetic diversity available in Mus musculus. Three of the CC founders are wild-derived strains. Although it appears that only several hundred RI strains will reach completion, the CC, like the BXD RI panel, will in time provide an important reference population for examining gene-behavior relationships. Two outbred versions of the CC have been created, the HS-CC and the Diversity Outbred (DO) (Iancu et al. 2010; Churchill et al. 2012). To date, brain gene expression data are only available for the HS-CC. Iancu et al. (2010) compared brain (striatum) gene expression in a B6xD2 F2 intercross, HS4, and HS-CC animals. Although it was assumed that the regulation of gene expression would differ in each of the populations, it was also assumed that, because striatal function is not cross dependent, function and gene expression should at some level overlap in a similar way for all three crosses. To address this issue, Iancu et al. (2010) utilized the Weighted Gene Co-expression Network Analysis (WGCNA) (Zhang and Horvath 2005). This analysis builds from the premise that (a) gene expression networks have scale-free properties (i.e. there are a few highly connected nodes) and (b) co-expressed genes share similar functions. The analysis revealed that while there were some cross-dependent differences, the overall modular substructure of the co-expression network was cross independent; the highly connected nodes remained intact. Iancu et al. (2013) next asked if selection for a behavioral phenotype (haloperidol-induced catalepsy) had similar effects on expression network structure across the three crosses. The results obtained are both interesting and cautionary as we press forward examining complex cross gene expression. The selection paradigm was short-term (3–4 generations), the rate of segregation of the responsive and non-responsive lines was similar, and the responsive and non-responsive lines all differed by greater than 20-fold in the haloperidol ED50. The difference in response was not pharmacokinetic. The first key observation was that there was no overlap of differential gene expression for the three selections. The second key observation was that as genetic diversity increased, the number of co-expression modules affected by selection also increased. It was possible to identify a core set of modules affected by selection. What is unknown is whether the additional modules affected by selection, e.g. in the HS-CC population, are relevant to our understanding of the gene-behavior relationship.

Phenotype measurements in eQTL analysis

Several technological advances have fundamentally altered the definition of phenotype in QTL studies. Mapping RNA transcript and protein abundance levels is widespread, and in principle any biologic characteristic of interest can be tested for association with genetic polymorphisms. In the context of neurobehavioral traits, examples include the number of neuronal cells in specific brain regions (Rosen and Williams 2001; Airey et al. 2001) and also brain morphometry (Li et al. 2005; Jan et al. 2008). The focus of this review is on high-throughput methodologies and in particular on measurements of gene expression using microarrays, qPCR, and RNA-Seq. While these technologies offer tremendous breadth to transcriptome analysis, several factors can adversely affect the quality of the results. All technologies assume intact RNA; the extent to which this assumption is true can be evaluated using the RNA integrity number (RIN) (Schroeder et al. 2006). From human studies, it has been shown that possible confounding factors include length of time post-mortem and the pH of the sample; statistical analysis can incorporate these as covariates (Liu 2011). For hybridization-based methods, factors affecting probe matching can strongly affect expression measurement (Walter et al. 2007); these errors can further propagate in the course of eQTL mapping (Iancu et al. 2012). PCR-based methods can also be affected by polymorphisms within the primer sequence. Taking these factors into account has beneficial effects on the downstream analysis.

Batch effects can introduce serious confounding factors in the analysis of expression levels; ideally, all samples should be processed at the same time. If separate batches are unavoidable, balancing cases/controls and sex within batches is important. Several techniques that alleviate batch effects have been proposed, with the ComBat package among the most popular (Johnson et al. 2007).
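The balancing recommendation above can be made concrete with a small sketch. The function below is a hypothetical helper written for this illustration (it is not part of ComBat or of any cited package): it groups samples into strata defined by experimental group and sex and then deals them out to batches in turn, so that each batch receives a near-equal share of every stratum.

```python
from collections import defaultdict

# Minimal sketch of stratified batch assignment. The function name and the
# (sample_id, group, sex) record layout are assumptions for this example.

def assign_balanced_batches(samples, n_batches):
    """samples: list of (sample_id, group, sex) tuples.
    Returns a dict mapping batch index to a list of sample ids."""
    strata = defaultdict(list)
    for sample_id, group, sex in samples:
        strata[(group, sex)].append(sample_id)

    batches = {b: [] for b in range(n_batches)}
    slot = 0
    # Walk the strata in a fixed order and deal samples out round-robin, so
    # members of the same stratum land in consecutive (different) batches.
    for key in sorted(strata):
        for sample_id in strata[key]:
            batches[slot % n_batches].append(sample_id)
            slot += 1
    return batches


samples = [
    ("s1", "case", "F"), ("s2", "case", "F"),
    ("s3", "case", "M"), ("s4", "case", "M"),
    ("s5", "control", "F"), ("s6", "control", "F"),
    ("s7", "control", "M"), ("s8", "control", "M"),
]
print(assign_balanced_batches(samples, 2))
```

In a real experiment one would normally shuffle the samples within each stratum before dealing them out, so that the assignment is random as well as balanced.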

A major limitation affecting microarray-based analyses is the limited dynamic range of the fluorescence signals. This problem is resolved by the RNA-Seq methodology, where the dynamic range is orders of magnitude above the microarray capacity (Nagalakshmi et al. 2008). The adverse effects of SNPs on probe hybridization are also completely alleviated by RNA-Seq. Count data are directly related to expression level, as opposed to microarrays, where the fluorescent intensity is an indirect measurement. Although RNA-Seq is more costly than array-based technologies, costs are steadily decreasing, which promises increased utilization of this technology.

Analytical approaches for eQTL

The analysis of eQTL in complex crosses mirrors that of traditional QTL mapping at its core. However, it also raises additional issues that require special care by the analyst, issues that are either not considered in the simplest forms of QTL mapping or are further exacerbated in the eQTL setting. We will briefly review some of the most common choices of statistical methodology, with an emphasis on methods for the analysis of crosses with more than two founders. First, we will consider issues common to high-dimensional eQTL techniques. Specifically, we will consider methods devised to deal with the multitude of statistical tests that need to be performed for a given experiment, through either corrections to significance measures or approaches that reduce the number of tests that need to be performed. We will then discuss specific statistical methodology devised for the analysis of the emerging RNA-Seq technology as related to more established microarray eQTL methods. Note that this review will mainly consider frequentist methods, though Bayesian approaches are becoming more prevalent in mouse genetics. See, for instance, the review by Stephens and Balding (2009) as an introduction to Bayesian methods in genetics. Also note that we focus on the case of a single QTL/eQTL underlying a given trait, though generalizations of the methodology below allow the examination of two or more loci.

Overview of genetic and statistical considerations

The analytical methods used for QTL/eQTL analysis depend on the cross as well as on other experimental factors such as the assumed genetic model and phenotype. It is important to note that there are a number of design considerations that should be taken into account early in the planning process, particularly for studies utilizing complex crosses (Fig. 1). For crosses involving two inbred progenitor mouse lines (i.e. F2 intercrosses or backcrosses), either a single marker analysis of variance (Broman and Speed 1999), interval mapping (Lander and Botstein 1989), or related regression-based approaches (Haley and Knott 1992) are typically applied when assuming the presence of a single QTL. For crosses with more than two inbred founders, such as heterogeneous stock (HS) (McClearn et al. 1970), CC (Churchill et al. 2004), or Diversity Outbred (DO) (Svenson et al. 2012) mice, multiple regression procedures are typically performed based on estimates of founder strain allelic contributions for a given marker/interval (Talbot et al. 1999; Mott et al. 2000; Svenson et al. 2012; Aylor et al. 2011; Durrant et al. 2011; Philip et al. 2011). These values are the result of haplotype reconstruction in terms of the founder lines using either the genotype calls (Mott et al. 2000; Liu et al. 2010) or the intensities of the genotyping arrays (Svenson et al. 2012; Collaborative Cross 2012). Haplotype reconstructions of this kind mainly draw on the use of a Hidden Markov Model, though alternate approaches have also been recently considered (Zhou et al. 2012). Hidden Markov Models are a machine learning approach designed for inferring underlying states of an unknown spatially/temporally ordered variable (Rabiner 1989). For this application, the states would correspond to founder inbred strain haplotypes and the end result would be a matrix of probabilities of descent from each pair of founder inbred strains, which can be further summarized per strain (Mott et al. 2000; Valdar et al. 2009). The basic multiple linear regression approach in this case would typically compare a model with the founder contributions to one without the founder contributions for each marker interval. The comparison of these two models allows the computation of an F statistic and accompanying p value (Valdar et al. 2009).
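The model comparison described above can be illustrated with its simplest special case, a single marker analysis of variance. The sketch below is an assumption-laden illustration rather than the procedure of any cited package: the function names are hypothetical, the "full" model fits one mean per genotype class, and the "null" model fits only the grand mean. In a multi-founder cross the genotype classes would be replaced by the founder probability columns of the regression design matrix, but the F statistic is formed from the two residual sums of squares in the same way.

```python
# Minimal sketch: F statistic comparing a model with a genotype effect to a
# model without it. Names and data layout are assumptions for illustration.

def nested_f_statistic(rss_null, rss_full, n_params_null, n_params_full, n_obs):
    """F statistic for two nested linear models given their residual sums of squares."""
    numerator = (rss_null - rss_full) / (n_params_full - n_params_null)
    denominator = rss_full / (n_obs - n_params_full)
    return numerator / denominator


def single_marker_f(phenotypes_by_genotype):
    """phenotypes_by_genotype: dict mapping genotype class to phenotype values."""
    values = [y for ys in phenotypes_by_genotype.values() for y in ys]
    n_obs = len(values)

    # Null model: a single grand mean (one parameter).
    grand_mean = sum(values) / n_obs
    rss_null = sum((y - grand_mean) ** 2 for y in values)

    # Full model: one mean per genotype class.
    rss_full = 0.0
    for ys in phenotypes_by_genotype.values():
        class_mean = sum(ys) / len(ys)
        rss_full += sum((y - class_mean) ** 2 for y in ys)

    return nested_f_statistic(rss_null, rss_full, 1,
                              len(phenotypes_by_genotype), n_obs)


# Hypothetical expression values for three F2 genotype classes.
print(single_marker_f({"BB": [1, 2, 3], "BD": [2, 3, 4], "DD": [3, 4, 5]}))
```

Converting the statistic to a p value requires the F distribution with (k - 1, n - k) degrees of freedom for k genotype classes and n animals; this is not in the Python standard library, but is available, for example, as `scipy.stats.f.sf`.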

Multiple testing considerations

One issue that is exacerbated in high-dimensional eQTL scans is how to pick a significance threshold once p values (or LOD scores) are generated for each expression phenotype. The ways in which these thresholds are chosen can be roughly divided into three categories ordered by decreasing conservativeness: familywise error rate, false discovery rate (FDR), and permutation/simulation procedures. The procedure used depends on the expected effect size as well as the type of downstream analysis desired. For instance, if the main goal is to confirm the top-ranked genes via qPCR, there is little benefit in incurring the increased computational and analytical time of generating and interpreting large lists of genes potentially regulated by a QTL. Therefore a familywise approach such as the Bonferroni correction would make sense (Bottomly et al. 2012). The Bonferroni correction has also been used as an approach to estimate the number of false positives (Schadt et al. 2003).
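As a concrete illustration of the familywise approach, the Bonferroni correction simply divides the desired error rate by the number of tests. The sketch below uses hypothetical function names and made-up p values; in an eQTL scan the number of tests would be on the order of the number of expression traits multiplied by the number of markers or intervals.

```python
# Minimal sketch of a Bonferroni threshold. Function names and the example
# p values are assumptions for illustration.

def bonferroni_threshold(alpha, n_tests):
    """Per-test p value threshold that controls the familywise error rate at alpha."""
    return alpha / n_tests


def significant_tests(p_values, alpha=0.05):
    """Indices of tests whose p value passes the Bonferroni threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [i for i, p in enumerate(p_values) if p <= threshold]


print(significant_tests([0.001, 0.02, 0.04, 0.0001]))
# Hypothetical scan size: 20,000 traits tested at 1,000 markers each.
print(bonferroni_threshold(0.05, 20_000 * 1_000))
```

Because neighboring markers are correlated, the effective number of independent tests is smaller than the raw count, which is one reason the correction is conservative in this setting.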

Controlling the false discovery rate has also been suggested (Storey and Tibshirani 2003; Carlborg et al. 2005). A common way to implement this control is through the computation of q values from the scan p values. A q value corresponds to the expected proportion of false positives when calling a given test significant (Storey and Tibshirani 2003). It has been used on top of permutation-based p values as a way to estimate the specificity of the given scan (Aylor et al. 2011; Chesler et al. 2005). In addition, FDR values have been estimated directly using subsets of the eQTL p values (Ghazalpour et al. 2008). One issue with FDR corrections is the presence of dependence if multiple p values are considered per expression trait (Kendziorski and Wang 2006). Dependence between two tests in this context means that, say, a low p value for trait A implies a low p value for trait B. For instance, the computation of q values relies on at most weak dependence between p values, and violations of this assumption may cause inaccuracies of the method (Storey and Tibshirani 2003). However, an approach such as surrogate variable analysis could be applied to remove dependencies between the test statistics, increasing the validity of the q values (Leek and Storey 2007, 2008).
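To make the FDR idea concrete, the sketch below computes Benjamini-Hochberg adjusted p values. Note the substitution: this is not the Storey and Tibshirani q value procedure cited above but a simpler relative of it, equivalent to fixing the estimated proportion of true null hypotheses at 1. The function name and the example p values are assumptions for illustration.

```python
# Minimal sketch: Benjamini-Hochberg adjusted p values (a conservative
# stand-in for q values, assuming the proportion of true nulls is 1).

def benjamini_hochberg(p_values):
    """Return BH-adjusted p values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Calling a test significant when its adjusted value is at most 0.05 controls the expected proportion of false discoveries at 5 % under independence or certain forms of positive dependence, which is the caveat raised in the text for correlated expression traits.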

Permutation testing is arguably the most common approach for significance assessment in eQTL studies. An approach similar to QTL studies would apply a permutation procedure to each expression trait separately (Churchill and Doerge 1994). However, as the number of tests is thousands of times greater than in a standard QTL analysis, a full permutation test is not desirable because it could increase computation time by at least an additional thousand-fold. One approach is to reduce the number of permutations necessary to compute the significance threshold through the use of a parametric model (Valdar et al. 2006a). Also, permutation testing procedures can be applied to only a subset of expression traits, with the result then used to choose thresholds for the remaining traits (Huang et al. 2009; Aylor et al. 2011). This approach needs to take into consideration distributional differences among the traits that can lead to large differences in threshold values (Carlborg et al. 2005). One approach to choosing representative threshold values is to interpolate based on a representative group of threshold values (Huang et al. 2009); another is to choose a global threshold based on the distribution of the thresholds (West et al. 2007). Regardless of the approach used to generate the significance thresholds, permutations need to be carried out appropriately with regard to experimental design (Churchill and Doerge 2008).
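The per-trait permutation procedure can be sketched as follows. This is an illustration under simplifying assumptions rather than the implementation used in any cited study: the scan statistic is the squared correlation between allele dosage and phenotype (instead of a LOD score or F statistic), animals are treated as exchangeable, and all function names are hypothetical.

```python
import random

# Minimal sketch of a genome-wide permutation threshold for one trait.
# The r-squared scan statistic and the function names are assumptions.

def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def genome_scan_max(genotypes, phenotype):
    """Largest statistic over all markers; genotypes is a list of per-marker dosage lists."""
    return max(r_squared(marker, phenotype) for marker in genotypes)


def permutation_threshold(genotypes, phenotype, n_perm=1000, alpha=0.05, seed=0):
    """Empirical (1 - alpha) quantile of the genome-wide maximum under permutation."""
    rng = random.Random(seed)
    shuffled = list(phenotype)
    null_maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the genotype-phenotype link
        null_maxima.append(genome_scan_max(genotypes, shuffled))
    null_maxima.sort()
    index = min(n_perm - 1, round((1 - alpha) * n_perm))
    return null_maxima[index]
```

A marker is then declared significant for the trait if its observed statistic exceeds the returned threshold. Repeating this for every expression trait multiplies the cost by the number of traits, which is the computational burden the shortcuts described above are meant to avoid; designs with family or batch structure would also need restricted rather than free shuffling.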

Dimension reduction

One strategy to reduce the number of tests being performed in an eQTL setting is to focus only on a subset of expression traits relevant to the phenotype(s) of interest. Relevance in this case is determined through differential expression analysis (Schadt et al. 2003). Other approaches take advantage of the fact that expression data are highly correlated to first form groups of genes with highly similar expression profiles, followed by a QTL mapping procedure; two common procedures for doing this are clustering and principal component analysis. Clustering algorithms are commonly used in microarray experiments (Eisen et al. 1998) and have been used successfully as a means to reduce the number of traits necessary to map (Chun and Keleş 2009; Lan et al. 2003; Yvert et al. 2003). Procedures based on principal component analysis, which seeks to find eigengenes or eigentraits that explain a certain amount of variability while being independent from one another (Alter et al. 2000), have also been applied to expression data prior to mapping (Lan et al. 2003; Biswas et al. 2008). Mapping expression traits by first clustering the expression data and then summarizing the clusters using the ‘eigengene’ has also been shown to be effective for finding QTL regions with a large effect on expression traits (Fuller et al. 2007).
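The eigengene summary mentioned above can be sketched without any numerical library. The code below is an illustrative assumption rather than the implementation used in the cited studies: it standardizes each gene in a cluster, finds the first principal component across samples by power iteration, and orients its sign to agree with the average expression profile. It assumes every gene varies across samples and that the first gene is not orthogonal to the leading component.

```python
import math

# Minimal sketch: module eigengene (first principal component across samples)
# computed by power iteration. Function names are assumptions.

def standardize(values):
    """Center to mean 0 and scale to unit sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def module_eigengene(expression, n_iter=100):
    """expression: list of genes, each a list of values across the same samples.
    Returns a unit-length vector with one entry per sample."""
    z = [standardize(gene) for gene in expression]
    n_samples = len(z[0])
    vec = list(z[0])  # starting vector: the first gene's standardized profile
    for _ in range(n_iter):
        # Project the current sample vector onto each gene, then map back.
        loadings = [sum(g[s] * vec[s] for s in range(n_samples)) for g in z]
        vec = [sum(loadings[k] * z[k][s] for k in range(len(z)))
               for s in range(n_samples)]
        norm = math.sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
    # Fix the arbitrary sign so the eigengene tracks the mean profile.
    mean_profile = [sum(g[s] for g in z) / len(z) for s in range(n_samples)]
    if sum(a * b for a, b in zip(vec, mean_profile)) < 0:
        vec = [-v for v in vec]
    return vec


# Three hypothetical, perfectly co-expressed genes measured in five animals.
print(module_eigengene([[1, 2, 3, 4, 5], [2, 4, 6, 8, 10], [0, 1, 2, 3, 4]]))
```

The resulting vector can be mapped as a single trait in place of the individual genes in the cluster, which is how the number of tests is reduced.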

RNA-Seq eQTL approaches

The advent of microarrays made eQTL approaches an attractive option to elucidate the genetic underpinnings of gene expression. However, microarrays have many issues that prevent them from being an ideal data source. For instance, microarrays have fixed probes/reporters that can limit expression estimates. This means both that a potential gene of interest may not be interrogated and that hybridization of the probes on the array may be affected by genomic differences, as is discussed later. A more recent approach is the high-throughput sequencing of the mRNA population in a given experimental condition for a given animal (Mortazavi et al. 2008). This data source is less constrained by annotation and does not rely on reporter hybridization, and it therefore allows additional types of analyses related to basic microarray-based eQTL to be performed.

The first type of analysis facilitated by RNA-Seq is the study of transcript-level expression, specifically alternative splicing QTL (sQTL), which has been found to be informative in humans (Heinzen et al. 2008; Kwan et al. 2008). This type of analysis has been examined using microarrays for complex mouse crosses (Alberts et al. 2005); however, in practice fixed microarray probe placement and genomic differences between probe sequence and RNA source were a major impediment (Huang et al. 2009; Ciobanu et al. 2010). From recent studies using RNA-Seq, it appears that the technology is better suited to assessing the genetics of alternative splicing in humans (Pickrell et al. 2010; Rakitsch et al. 2012). However, though it has been suggested as a promising avenue of research (Guryev and Cuppen 2009; Hitzemann et al. 2013), little work appears to have been done applying the method to mouse crosses.

Another potential benefit to the use of RNA-Seq is the direct study of allele-specific expression. These experiments have traditionally been performed through the use of RT-PCR based confirmation approaches (Cowles et al. 2002). Allele-specific expression is implemented in practice for RNA-Seq in a similar manner by essentially counting the number of sequence reads generated by the technology that overlap with either the reference or alternative allele(s) (Degner et al. 2009). Initial applications of this approach to study embryonic imprinting yielded promising (Gregg et al. 2010) though conflicting messages (DeVeale et al. 2012) about the additional power RNA-Seq lends to the problem.
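The read-counting idea behind allele-specific expression can be sketched as below. This is a deliberately simplified illustration: the function names are hypothetical, the input is assumed to be the base observed at one heterozygous SNP in each overlapping read, and the exact two-sided binomial test against a 50:50 expectation is an assumption of this sketch rather than the method of the cited studies. Real data are typically overdispersed and subject to read-mapping bias toward the reference allele, both of which this sketch ignores.

```python
from math import comb

# Minimal sketch of allele-specific expression at one heterozygous SNP.
# Function names and the binomial test are assumptions for illustration.

def count_allelic_reads(read_alleles, ref_allele, alt_allele):
    """Count reads carrying the reference or alternative allele; other bases are ignored."""
    ref = sum(1 for a in read_alleles if a == ref_allele)
    alt = sum(1 for a in read_alleles if a == alt_allele)
    return ref, alt


def binomial_two_sided_p(ref_count, alt_count, expected_ref_fraction=0.5):
    """Exact two-sided binomial p value for departure from the expected allelic ratio."""
    n = ref_count + alt_count
    probs = [comb(n, k) * expected_ref_fraction ** k
             * (1 - expected_ref_fraction) ** (n - k) for k in range(n + 1)]
    observed = probs[ref_count]
    # Sum the probabilities of all outcomes at least as unlikely as the observed one.
    return min(1.0, sum(p for p in probs if p <= observed * (1 + 1e-9)))


ref, alt = count_allelic_reads(list("AAAGGAT"), "A", "G")
print(ref, alt, binomial_two_sided_p(ref, alt))
```

Setting `expected_ref_fraction` to a value estimated from genomic DNA or from simulated reads, rather than 0.5, is one simple way to account for a known mapping bias.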

Computational issues

One of the central issues with eQTL mapping is the drastic increase in computational capabilities it requires over a similar QTL study. This is only exacerbated by increases in the marker density of new genotyping arrays (Yang et al. 2009) and in the number of expression traits in exon-level oligonucleotide arrays (Gardina et al. 2006) or RNA-Seq (Mortazavi et al. 2008). In order to gain computational efficiency, aspects of the underlying mathematics can be leveraged to provide essentially the same results using fewer computational resources. The simplest example of this is the ability to use a matrix of phenotypes in standard linear model fitting as opposed to a single phenotype vector as is typically used. This means that relatively computationally expensive matrix calculations are performed only once and can therefore be leveraged to perform batch processing of phenotypes at a significant decrease in computational time (Valdar et al. 2009). This type of batch processing also lends itself to parallel processing, either through a cluster computing environment or a single computer with multiple processors. A related example is the mixed effects model framework of EMMA (Kang et al. 2008). Similarly, analysis methods have also been developed for RNA-Seq that make computationally beneficial approximations to the underlying parameter estimation procedure (McCarthy et al. 2012).
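The saving from fitting many phenotypes against one marker can be seen even in the single-predictor case. In the sketch below (hypothetical function name, illustrative data), the quantities that depend only on the genotypes, namely the centered dosages and their sum of squares, are computed once and then reused for every expression trait; with several predictors the same role is played by the matrix factorization of the design matrix.

```python
# Minimal sketch: regression slopes of many expression traits on one marker,
# reusing the genotype-only computations. Names and data are assumptions.

def marker_slopes_for_all_traits(genotype, traits):
    """genotype: allele dosages per animal; traits: list of per-animal trait vectors.
    Returns the least-squares slope of each trait on the genotype."""
    n = len(genotype)
    mean_g = sum(genotype) / n
    centered = [g - mean_g for g in genotype]   # computed once
    sxx = sum(c * c for c in centered)          # computed once

    slopes = []
    for y in traits:
        # Because the centered genotypes sum to zero, y need not be centered.
        sxy = sum(c * v for c, v in zip(centered, y))
        slopes.append(sxy / sxx)
    return slopes


print(marker_slopes_for_all_traits([0, 1, 2, 1],
                                   [[1, 3, 5, 3], [2, 2, 2, 2]]))
```

Each trait costs only one pass over the animals, and the loop over traits is independent across iterations, which is why this form of batch processing parallelizes easily.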

Population substructure

Population substructure is a serious confounding factor in many QTL and eQTL mapping studies (Devlin et al. 2001; Pritchard and Donnelly 2001; Kang et al. 2008; Valdar et al. 2009; Listgarten et al. 2010). In brief, the problem can be summarized as follows: for a statistical test used to identify the causative genetic effects on a phenotype, the null hypothesis states that there is no association between the genetic locus and the phenotype. However, this assumption does not hold in cases where population substructure is present: differences in average phenotype value between the subpopulations will be detected as a QTL for each genetic locus that segregates between the subpopulations, even though the locus is not necessarily causative. It is therefore important to distinguish between causative associations and associations due solely to genetic linkage.

In mouse QTL studies, much of the uneven relatedness between individuals is due to the complex genetic history of the commonly used inbred strains. The most significant differences are between the classical inbred strains and the wild-derived inbred strains (Ideraabdullah et al. 2004; Yalcin et al. 2004). Classical inbred strains are derived from a limited number of individuals of the Mus musculus subspecies that have widely varying degrees of relatedness (Bonhomme et al. 1989). The wild-derived strains are derived from several Mus subspecies captured at different times and geographic locations (Bonhomme and Guenet 1989). Therefore, studies that evaluate phenotypic variability among several inbred strains need to account for the phylogenetic differences.

Heterogeneous stock mice are derived from inbred strains using various outbreeding procedures (Chia et al. 2005). QTL mapping in these populations offers markedly higher resolution as compared to simple intercrosses (Talbot et al. 1999; Svenson et al. 2012). However, despite efforts to randomize the mating process, individuals in outbred mouse populations display varying levels of relatedness (Aldinger et al. 2009; Iancu et al. 2012). Furthermore, an in-depth analysis of the structure of a heterogeneous stock mouse population revealed that relatedness is not evenly distributed across the genome and individual chromosomes can have effects on phenotype that are distinct from the whole genome kinship information (Iancu et al. 2012) adding another layer of complexity. Therefore, mapping strategies employed in outbred populations need to adjust for this confounding factor (e.g., Cheng et al. 2011 and references therein).

Attempts to adjust for population substructure fall into several categories. In human association studies, genomic control (Devlin et al. 2000), structured association (Pritchard et al. 2000), and principal component analysis (Patterson et al. 2006) are the most commonly employed procedures. In mouse populations, the relatively large effect size of the kinship structure seems to favor an alternative mixed-model approach (Kang et al. 2008). In a further refinement of this approach (Iancu et al. 2012), we recently demonstrated that it is possible to simultaneously detect strain-specific effects and correct for population structure.
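The idea behind a kinship-based mixed model can be sketched as follows. This is a simplified illustration, not the procedure of Kang et al. (2008) or Iancu et al. (2012): the simulated genotypes, the function names `kinship_matrix` and `marker_t_stat`, and in particular the fixed heritability `h2 = 0.5` are assumptions made for the example, whereas real mixed-model software estimates the variance components from the data. The phenotype covariance is modeled as V = h2·K + (1 − h2)·I, where K is a marker-based kinship matrix, and each marker is then tested by generalized least squares.

```python
import numpy as np

rng = np.random.default_rng(1)
n_per_group, n_markers = 100, 500
group = np.repeat([0, 1], n_per_group)

# Simulated genotypes: each subpopulation has its own allele frequencies.
freqs = rng.uniform(0.1, 0.9, size=(2, n_markers))
genotypes = rng.binomial(2, freqs[group])            # shape (n, n_markers)

def kinship_matrix(G):
    """Marker-based relatedness: cross-product of column-standardized genotypes."""
    Z = (G - G.mean(axis=0)) / G.std(axis=0)
    return Z @ Z.T / G.shape[1]

def marker_t_stat(x, y, V=None):
    """t statistic for marker x; if V is given, fit by generalized least squares."""
    X = np.column_stack([np.ones(len(y)), x.astype(float)])
    if V is not None:
        L = np.linalg.cholesky(V)        # V = L L^T
        X = np.linalg.solve(L, X)        # whiten the design matrix
        y = np.linalg.solve(L, y)        # whiten the phenotype
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    return beta[1] / np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])

K = kinship_matrix(genotypes)
h2 = 0.5                                 # ASSUMED for illustration, not estimated
V = h2 * K + (1.0 - h2) * np.eye(len(group))

# A non-causal test marker and a phenotype driven only by subpopulation.
test_marker = rng.binomial(2, np.where(group == 0, 0.1, 0.9))
phenotype = 1.0 * group + rng.normal(size=group.size)

naive_t = marker_t_stat(test_marker, phenotype)
mixed_t = marker_t_stat(test_marker, phenotype, V)
print(f"naive t = {naive_t:.2f}, kinship-adjusted t = {mixed_t:.2f}")
```

Because the dominant axis of K tracks subpopulation membership, whitening by V downweights that axis, and the test statistic for the non-causal marker shrinks relative to the naive fit.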

Causal inference

One of the main benefits of eQTL studies is the ability to form networks based on the correlation/covariation structure of the expression data across the experimental populations (Chesler et al. 2005). Such networks describe relationships between expression traits: for example, if Trait A and Trait B are correlated, there is potentially a relationship between the two traits. Without additional information or assumptions, however, one typically cannot state confidently whether Trait A causes Trait B (Trait A→Trait B), whether Trait A reacts to Trait B (Trait A←Trait B), or whether a confounding factor is responsible for the observed correlation. Co-expression networks by themselves therefore cannot usually be used to form ‘causal’ or ‘reactive’ hypotheses; when they are considered jointly with DNA variation data, however, such inference is possible (Schadt et al. 2005). The inclusion of DNA variation data in the context of experimental crosses is necessary because it can be assumed to be the main driver of variation in the traits under consideration (Schadt et al. 2005). Causal reasoning in the eQTL context is performed in several related ways: model selection approaches (Schadt et al. 2005; Chen et al. 2007; Millstein et al. 2009), structural equation modeling (SEM) (Liu et al. 2008; Aten et al. 2008), and Bayesian networks (Zhu et al. 2007). All of these approaches are similar in spirit in that they attempt to define local or global relationships of the form Marker A→Trait B→Trait C. Although causal inference approaches have shown promise, some cautions apply to the interpretation of causal modeling in eQTL studies. Specifically, large sample sizes, the removal of factors that can act as hidden confounders, and the consideration of comprehensive sets of models are seen as necessary steps for robust causal modeling (Li et al. 2010).
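A toy version of likelihood-based model selection among candidate graphs can be sketched as follows. It is written in the spirit of the model selection approaches cited above but does not reproduce any published implementation: the three Gaussian linear models, the function names `gaussian_loglik` and `rank_models`, and the simulated effect sizes are assumptions made for illustration. Each candidate graph factorizes into two simple regressions with the same number of free parameters, so the maximized log-likelihoods can be compared directly.

```python
import numpy as np

def gaussian_loglik(y, x):
    """Maximized log-likelihood of the linear regression y ~ 1 + x."""
    X = np.column_stack([np.ones(len(y)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / len(y)          # maximum-likelihood variance
    return -0.5 * len(y) * (np.log(2.0 * np.pi * sigma2) + 1.0)

def rank_models(marker, trait_a, trait_b):
    """Score three candidate graphs; each has six free parameters."""
    scores = {
        "causal: M->A->B": gaussian_loglik(trait_a, marker)
        + gaussian_loglik(trait_b, trait_a),
        "reactive: M->B->A": gaussian_loglik(trait_b, marker)
        + gaussian_loglik(trait_a, trait_b),
        "independent: A<-M->B": gaussian_loglik(trait_a, marker)
        + gaussian_loglik(trait_b, marker),
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Simulate data under the causal chain Marker -> Trait A -> Trait B.
rng = np.random.default_rng(2)
n = 1000
marker = rng.binomial(2, 0.5, size=n).astype(float)
trait_a = 1.0 * marker + rng.normal(size=n)
trait_b = 1.0 * trait_a + rng.normal(size=n)

for name, loglik in rank_models(marker, trait_a, trait_b):
    print(f"{name}: {loglik:.1f}")
```

In this simplified Gaussian setting, the causal-versus-reactive comparison reduces to asking which trait is more strongly correlated with the marker. Unequal measurement noise in the two traits, or an unmodeled confounder, can therefore mislead the choice, which is one reason the cautions above matter.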

Conclusion

The value of complex crosses for examining the relationship between behavior and gene expression is clear. However, the increased genetic complexity raises numerous considerations that must be addressed in the design of these studies. By highlighting each of these, we provide a conceptual framework to guide researchers in study planning and implementation.

References

Airey DC, Lu L, Williams R (2001) Genetic control of the mouse cerebellum: identification of quantitative trait loci modulating size and architecture. J Neurosci 21(14):5099–5109
Alberts R, Terpstra P, Bystrykh LV, de Haan G, Jansen RC (2005) A statistical multiprobe model for analyzing cis and trans genes in genetical genomics experiments with short-oligonucleotide arrays. Genetics 171(3):1437–1439
Aldinger KA, Sokoloff G, Rosenberg DM, Palmer AA, Millen KJ (2009) Genetic variation and population substructure in outbred CD-1 mice: implications for genome-wide association studies. PLoS One 4(3):e4729
Alter O, Brown PO, Botstein D (2000) Singular value decomposition for genome-wide expression data processing and modeling. Proc Natl Acad Sci 97(18):10101–10106
Aten J, Fuller T, Lusis A, Horvath S (2008) Using genetic markers to orient the edges in quantitative trait networks: the NEO software. BMC Syst Biol 2(1):34
Aylor DL, Valdar W, Foulds-Mathes W, Buus RJ, Verdugo RA, Baric RS, Ferris MT, Frelinger JA, Heise M, Frieman MB, Gralinski LE, Bell TA, Didion JD, Hua K, Nehrenberg DL, Powell CL, Steigerwalt J, Xie Y, Kelada SNP, Collins FS, Yang IV, Schwartz DA, Branstetter LA, Chesler EJ, Miller DR, Spence J, Liu EY, McMillan L, Sarkar A, Wang J, Wang W, Zhang Q, Broman KW, Korstanje R, Durrant C, Mott R, Iraqi FA, Pomp D, Threadgill D, Pardo-Manuel de Villena F, Churchill GA (2011) Genetic analysis of complex traits in the emerging collaborative cross. Genome Res 21(8):1213–1222
Belknap JK (1992) Empirical estimates of Bonferroni corrections for use in chromosome mapping studies with the BXD recombinant inbred strains. Behav Genet 22(6):677–684
Biswas S, Storey J, Akey J (2008) Mapping gene expression quantitative trait loci by singular value decomposition and independent component analysis. BMC Bioinformatics 9(1):244
Bonhomme F, Guenet JL (1989) The laboratory mouse and its wild relatives. In: Lyon MF (ed) Genetic variants and strains of the laboratory mouse. Oxford University Press, Oxford, p 876
Bonhomme F, Miyashita N, Boursot P, Catalan J, Moriwaki K (1989) Genetical variation and polyphyletic origin in Japanese Mus musculus. Heredity (Edinb) 63(Pt 3):299–308
Bottomly D, Ferris M, Aicher L, Rosenzweig E, Whitmore A, Aylor D, Haagmans B, Gralinski L, Bradel-Tretheway B, Bryan J, Threadgill D, Pardo-Manuel de Villena F, Baric R, Katze M, Heise M, McWeeney S (2012) Expression quantitative trait loci for extreme host response to Influenza A in pre-collaborative cross mice. G3: Genes Genomes Genet 2:213–221
Broman KW, Speed TP (1999) A review of methods for identifying QTLs in experimental crosses. Lecture Notes-Monograph Series: 114–142
Carlborg Ö, De Koning DJ, Manly KF, Chesler E, Williams RW, Haley CS (2005) Methodological aspects of the genetic dissection of gene expression. Bioinformatics 21(10):2383–2393
Chen L, Emmert-Streib F, Storey J (2007) Harnessing naturally randomized transcription to infer regulatory relationships among genes. Genome Biol 8(10):R219
Cheng R, Abney M, Palmer AA, Skol AD (2011) QTLRel: an R package for genome-wide association studies in which relatedness is a concern. BMC Genet 12:66
Chesler EJ, Lu L, Shou S, Qu Y, Gu J, Wang J, Hsu HC, Mountz JD, Baldwin NE, Langston MA et al (2005) Complex trait analysis of gene expression uncovers polygenic and pleiotropic networks that modulate nervous system function. Nat Genet 37(3):233–242
Chia R, Achilli F, Festing MF, Fisher EM (2005) The origins and uses of mouse outbred stocks. Nat Genet 37(11):1181–1186
Chun H, Keleş S (2009) Expression quantitative trait loci mapping with multivariate sparse partial least squares regression. Genetics 182(1):79–90
Churchill GA, Doerge RW (1994) Empirical threshold values for quantitative trait mapping. Genetics 138(3):963–971
Churchill GA, Doerge RW (2008) Naive application of permutation testing leads to inflated type I error rates. Genetics 178(1):609–610
Churchill GA, Airey DC, Allayee H, Angel JM, Attie AD, Beatty J, Beavis WD, Belknap JK, Bennett B, Berrettini W, Bleich A, Bogue M, Broman KW, Buck KJ, Buckler E, Burmeister M, Chesler EJ, Cheverud JM, Clapcote S, Cook MN, Cox RD, Crabbe JC, Crusio WE, Darvasi A, Deschepper CF, Doerge RW, Farber CR, Forejt J, Gaile D, Garlow SJ, Geiger H, Gershenfeld H, Gordon T, Gu J, Gu W, de Haan G, Hayes NL, Heller C, Himmelbauer H, Hitzemann R, Hunter K, Hsu HC, Iraqi FA, Ivandic B, Jacob HJ, Jansen RC, Jepsen KJ, Johnson DK, Johnson TE, Kempermann G, Kendziorski C, Kotb M, Kooy RF, Llamas B, Lammert F, Lassalle JM, Lowenstein PR, Lu L, Lusis A, Manly KF, Marcucio R, Matthews D, Medrano JF, Miller DR, Mittleman G, Mock BA, Mogil JS, Montagutelli X, Morahan G, Morris DG, Mott R, Nadeau JH, Nagase H, Nowakowski RS, O’Hara BF, Osadchuk AV, Page GP, Paigen B, Paigen K, Palmer AA, Pan HJ, Peltonen-Palotie L, Peirce J, Pomp D, Pravenec M, Prows DR, Qi Z, Reeves RH, Roder J, Rosen GD, Schadt EE, Schalkwyk LC, Seltzer Z, Shimomura K, Shou S, Sillanpaa MJ, Siracusa LD, Snoeck HW, Spearow JL, Svenson K, Tarantino LM, Threadgill D, Toth LA, Valdar W, de Villena FP, Warden C, Whatley S, Williams RW, Wiltshire T, Yi N, Zhang D, Zhang M, Zou F (2004) The collaborative cross, a community resource for the genetic analysis of complex traits. Nat Genet 36(11):1133–1137
Churchill GA, Gatti DM, Munger SC, Svenson KL (2012) The diversity outbred mouse population. Mamm Genome 23(9–10):713–718
Ciobanu DC, Lu L, Mozhui K, Wang X, Jagalur M, Morris JA, Taylor WL, Dietz K, Simon P, Williams RW (2010) Detection, validation, and downstream analysis of allelic variation in gene expression. Genetics 184(1):119–128
Collaborative Cross Consortium (2012) The genome architecture of the collaborative cross mouse genetic reference population. Genetics 190(2):389–401
Cowles CR, Hirschhorn JN, Altshuler D, Lander ES (2002) Detection of regulatory variation in mouse genes. Nat Genet 32(3):432–437
Darvasi A (1998) Experimental strategies for the genetic dissection of complex traits in animal models. Nat Genet 18(1):19–24
Degner JF, Marioni JC, Pai AA, Pickrell JK, Nkadori E, Gilad Y, Pritchard JK (2009) Effect of read-mapping biases on detecting allele-specific expression from RNA-sequencing data. Bioinformatics 25(24):3207–3212
DeVeale B, van der Kooy D, Babak T (2012) Critical evaluation of imprinted gene expression by RNA–Seq: a new perspective. PLoS Genet 8:e1002600
Devlin B, Roeder K, Wasserman L (2000) Genomic control for association studies: a semiparametric test to detect excess-haplotype sharing. Biostatistics 1(4):369–387
Devlin B, Roeder K, Wasserman L (2001) Genomic control, a new approach to genetic-based association studies. Theor Popul Biol 60(3):155–166
Durrant C, Tayem H, Yalcin B, Cleak J, Goodstadt L, de Villena FPM, Mott R, Iraqi FA (2011) Collaborative cross mice and their power to map host susceptibility to Aspergillus fumigatus infection. Genome Res 21(8):1239–1248
Eisen MB, Spellman PT, Brown PO, Botstein D (1998) Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci 95(25):14863–14868
Farris SP, Wolen AR, Miles MF (2010) Using expression genetics to study the neurobiology of ethanol and alcoholism. Int Rev Neurobiol 91:95–128
Fuller TF, Ghazalpour A, Aten JE, Drake TA, Lusis AJ, Horvath S (2007) Weighted gene coexpression network analysis strategies applied to mouse weight. Mamm Genome 18(6–7):463–472
Gardina P, Clark T, Shimada B, Staples M, Yang Q, Veitch J, Schweitzer A, Awad T, Sugnet C, Dee S, Davies C, Williams A, Turpaz Y (2006) Alternative splicing and differential gene expression in colon cancer detected by a whole genome exon array. BMC Genomics 7(1):325
Ghazalpour A, Doss S, Kang H, Farber C, Wen PZ, Brozell A, Castellanos R, Eskin E, Smith DJ, Drake TA et al (2008) High-resolution mapping of gene expression using association in an outbred mouse stock. PLoS Genet 4(8):e1000149
Gora-Maslak G, McClearn GE, Crabbe JC, Phillips TJ, Belknap JK, Plomin R (1991) Use of recombinant inbred strains to identify quantitative trait loci in psychopharmacology. Psychopharmacology 104(4):413–424
Gregg C, Zhang J, Weissbourd B, Luo S, Schroth GP, Haig D, Dulac C (2010) High-resolution analysis of parent-of-origin allelic expression in the mouse brain. Science 329(5992):643–648
Guryev V, Cuppen E (2009) Next-generation sequencing approaches in genetic rodent model systems to study functional effects of human genetic variation. Prague Special Issue: Funct Genomics and Proteomics 583(11):1668–1673
Haley CS, Knott SA (1992) A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69(4):315
Heinzen EL, Ge D, Cronin KD, Maia JM, Shianna KV, Gabriel WN, Welsh-Bohmer K, Hulette CM, Denny TN, Goldstein DB (2008) Tissue-specific genetic control of splicing: implications for the study of complex traits. PLoS Biol 6(12):e1000001
Hitzemann R, Bottomly D, Darakjian P, Walter N, Iancu O, Searles R, Wilmot B, McWeeney S (2013) Genes, behavior and next-generation RNA sequencing. Genes, Brain and Behavior 12(1):1–12
Huang G, Shifman S, Valdar W, Johannesson M, Yalcin B, Taylor MS, Taylor JM, Mott R, Flint J (2009) High resolution mapping of expression QTLs in heterogeneous stock mice in multiple tissues. Genome Res 19(6):1133–1140
Iancu OD, Darakjian P, Walter NA, Malmanger B, Oberbeck D, Belknap J, McWeeney S, Hitzemann R (2010) Genetic diversity and striatal gene networks: focus on the heterogeneous stock-collaborative cross (HS-CC) mouse. BMC Genomics 11:585
Iancu O, Darakjian P, Hitzemann R, McWeeney S (2012) Detection of expression quantitative trait loci in complex mouse crosses: impact and alleviation of data quality and complex population substructure. Front Genet 3:157
Iancu OD, Oberbeck D, Darakjian P, Kawane S, Erk J, McWeeney S, Hitzemann R (2013) Differential network analysis reveals genetic effects on catalepsy modules. PLoS One 8(3):e58951
Ideraabdullah FY, de la Casa-Esperon E, Bell TA, Detwiler DA, Magnuson T, Sapienza C, de Villena FP (2004) Genetic and haplotype diversity among wild-derived mouse inbred strains. Genome Res 14(10A):1880–1887
Jan T, Lu L, Li C, Williams R, Waters R (2008) Genetic analysis of posterior medial barrel subfield (PMBSF) size in somatosensory cortex (SI) in recombinant inbred strains of mice. BMC Neurosci 9:3
Jansen RC, Nap JP (2001) Genetical genomics: the added value from segregation. Trends Genet 17(7):388–391
Johnson W, Li C, Rabinovic A (2007) Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 8(1):118–127
Kang HM, Zaitlen NA, Wade CM, Kirby A, Heckerman D, Daly MJ, Eskin E (2008) Efficient control of population structure in model organism association mapping. Genetics 178(3):1709–1723
Keane TM, Goodstadt L, Danecek P, White MA, Wong K, Yalcin B, Heger A, Agam A, Slater G, Goodson M, Furlotte NA, Eskin E, Nellaker C, Whitley H, Cleak J, Janowitz D, Hernandez-Pliego P, Edwards A, Belgard TG, Oliver PL, McIntyre RE, Bhomra A, Nicod J, Gan X, Yuan W, van der Weyden L, Steward CA, Bala S, Stalker J, Mott R, Durbin R, Jackson IJ, Czechanski A, Guerra-Assuncao JA, Donahue LR, Reinholdt LG, Payseur BA, Ponting CP, Birney E, Flint J, Adams DJ (2011) Mouse genomic variation and its effect on phenotypes and gene regulation. Nature 477(7364):289–294
Kendziorski C, Wang P (2006) A review of statistical methods for expression quantitative trait loci mapping. Mamm Genome 17(6):509–517
Kwan T, Benovoy D, Dias C, Gurd S, Provencher C, Beaulieu P, Hudson TJ, Sladek R, Majewski J (2008) Genome-wide analysis of transcript isoform variation in humans. Nat Genet 40(2):225–231
Lan H, Stoehr JP, Nadler ST, Schueler KL, Yandell BS, Attie AD (2003) Dimension reduction for mapping mRNA abundance as quantitative traits. Genetics 164(4):1607–1614
Lander ES, Botstein D (1989) Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121(1):185–199
Leek JT, Storey JD (2007) Capturing heterogeneity in gene expression studies by surrogate variable analysis. PLoS Genet 3(9):e161
Leek JT, Storey JD (2008) A general framework for multiple testing dependence. Proc Natl Acad Sci 105(48):18718–18723
Li C, Wei X, Lu L, Peirce J, Williams R, Waters RS (2005) Genetic analysis of barrel field size in the first somatosensory area (SI) in inbred and recombinant inbred strains of mice. Somatosens Mot Res 22(3):141–150
Li Y, Tesson BM, Churchill GA, Jansen RC (2010) Critical reasoning on causal inference in genome-wide linkage and association studies. Trends Genet 26(12):493–498
Listgarten J, Kadie C, Schadt EE, Heckerman D (2010) Correction for hidden confounders in the genetic analysis of gene expression. Proc Natl Acad Sci USA 107(38):16465–16470
Liu C (2011) Brain eQTL mapping informs genetic studies of psychiatric diseases. Neurosci Bull 27(2):123–133
Liu B, de la Fuente A, Hoeschele I (2008) Gene network inference via structural equation modeling in genetical genomics experiments. Genetics 178(3):1763–1776
Liu EY, Zhang Q, McMillan L, de Villena FP-M, Wang W (2010) Efficient genome ancestry inference in complex pedigrees with inbreeding. Bioinformatics 26(12):i199–i207
Logan RW, Robledo RF, Recla JM, Philip VM, Bubier JA, Jay JJ, Harwood C, Wilcox T, Gatti DM, Bult CJ, Churchill GA, Chesler EJ (2013) High-precision genetic mapping of behavioral traits in the diversity outbred mouse population. Genes Brain Behav 12(4):424–437
Lum PY, Chen Y, Zhu J, Lamb J, Melmed S, Wang S, Drake TA, Lusis AJ, Schadt EE (2006) Elucidating the murine brain transcriptional network in a segregating mouse population to identify core functional modules for obesity and diabetes. J Neurochem 97(Suppl 1):50–62
Malmanger B, Lawler M, Coulombe S, Murray R, Cooper S, Polyakov Y, Belknap J, Hitzemann R (2006) Further studies on using multiple-cross mapping (MCM) to map quantitative trait loci. Mamm Genome 17(12):1193–1204
McCarthy DJ, Chen Y, Smyth GK (2012) Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Res 40(10):4288–4297
McClearn GE, Wilson JR, Meredith W (1970) The use of isogenic and heterogenic mouse stocks in behavioral research. In: Lindzey G, Thiessen DD (eds) Contributions to behavior-genetic analysis: the mouse as a prototype. Appleton-Century-Crofts, New York, pp 3–22
Millstein J, Zhang B, Zhu J, Schadt E (2009) Disentangling molecular relationships with a causal inference test. BMC Genet 10(1):23
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5(7):621–628
Mott R, Talbot CJ, Turri MG, Collins AC, Flint J (2000) A method for fine mapping quantitative trait loci in outbred animal stocks. Proc Natl Acad Sci USA 97(23):12649–12654
Nagalakshmi U, Wang Z, Waern K, Shou C, Raha D, Gerstein M, Snyder M (2008) The transcriptional landscape of the yeast genome defined by RNA sequencing. Science 320(5881):1344–1349
Patterson N, Price AL, Reich D (2006) Population structure and eigenanalysis. PLoS Genet 2(12):e190
Peirce JL, Li H, Wang J, Manly KF, Hitzemann RJ, Belknap JK, Rosen GD, Goodwin S, Sutter TR, Williams RW, Lu L (2006) How replicable are mRNA expression QTL? Mamm Genome 17(6):643–656
Philip VM, Sokoloff G, Ackert-Bicknell CL, Striz M, Branstetter L, Beckmann MA, Spence JS, Jackson BL, Galloway LD, Barker P et al (2011) Genetic analysis in the Collaborative Cross breeding population. Genome Res 21(8):1223–1238
Pickrell JK, Marioni JC, Pai AA, Degner JF, Engelhardt BE, Nkadori E, Veyrieras J-B, Stephens M, Gilad Y, Pritchard JK (2010) Understanding mechanisms underlying human gene expression variation with RNA sequencing. Nature 464(7289):768–772
Pritchard JK, Donnelly P (2001) Case-control studies of association in structured or admixed populations. Theor Popul Biol 60(3):227–237
Pritchard JK, Stephens M, Rosenberg NA, Donnelly P (2000) Association mapping in structured populations. Am J Hum Genet 67(1):170–181
Rabiner LR (1989) A tutorial on hidden markov models and selected applications in speech recognition. Proc IEEE 77:2
Rakitsch B, Lippert C, Topa H, Borgwardt K, Honkela A, Stegle O (2012) A mixed model approach for joint genetic analysis of alternatively spliced transcript isoforms using RNA-Seq data. arXiv preprint arXiv:1210.2850
Roberts A, Pardo-Manuel de Villena F, Wang W, McMillan L, Threadgill DW (2007) The polymorphism architecture of mouse genetic resources elucidated using genome-wide resequencing data: implications for QTL discovery and systems genetics. Mamm Genome 18(6–7):473–481
Rosen G, Williams R (2001) Complex trait analysis of the mouse striatum: independent QTLs modulate volume and neuron number. BMC Neurosci 2:5
Sandberg R, Yasuda R, Pankratz DG, Carter TA, Del Rio JA, Wodicka L, Mayford M, Lockhart DJ, Barlow C (2000) Regional and strain-specific gene expression mapping in the adult mouse brain. Proc Natl Acad Sci USA 97(20):11038–11043
Schadt EE, Monks SA, Drake TA, Lusis AJ, Che N, Colinayo V, Ruff TG, Milligan SB, Lamb JR, Cavet G, Linsley PS, Mao M, Stoughton RB, Friend SH (2003) Genetics of gene expression surveyed in maize, mouse and man. Nature 422(6929):297–302
Schadt EE, Lamb J, Yang X, Zhu J, Edwards S, GuhaThakurta D, Sieberts SK, Monks S, Reitman M, Zhang C, Lum PY, Leonardson A, Thieringer R, Metzger JM, Yang L, Castle J, Zhu H, Kash SF, Drake TA, Sachs A, Lusis AJ (2005) An integrative genomics approach to infer causal associations between gene expression and disease. Nat Genet 37(7):710–717
Schroeder A, Mueller O, Stocker S, Salowsky R, Leiber M, Gassmann M, Lightfoot S, Menzel W, Granzow M, Ragg T (2006) The RIN: an RNA integrity number for assigning integrity values to RNA measurements. BMC Mol Biol 7:3
Shirley RL, Walter NA, Reilly MT, Fehr C, Buck KJ (2004) Mpdz is a quantitative trait gene for drug withdrawal seizures. Nat Neurosci 7(7):699–700
Solberg LC, Valdar W, Gauguier D, Nunez G, Taylor A, Burnett S, Arboledas-Hita C, Hernandez-Pliego P, Davidson S, Burns P, Bhattacharya S, Hough T, Higgs D, Klenerman P, Cookson WO, Zhang Y, Deacon RM, Rawlins JN, Mott R, Flint J (2006) A protocol for high-throughput phenotyping, suitable for quantitative trait analysis in mice. Mamm Genome 17(2):129–146
Stephens M, Balding DJ (2009) Bayesian statistical methods for genetic association studies. Nat Rev Genet 10(10):681–690
Storey JD, Tibshirani R (2003) Statistical significance for genomewide studies. Proc Natl Acad Sci USA 100(16):9440–9445
Svenson KL, Gatti DM, Valdar W, Welsh CE, Cheng R, Chesler EJ, Palmer AA, McMillan L, Churchill GA (2012) High-resolution genetic mapping using the mouse diversity outbred population. Genetics 190(2):437–447
Talbot CJ, Nicod A, Cherny SS, Fulker DW, Collins AC, Flint J (1999) High-resolution mapping of quantitative trait loci in outbred mice. Nat Genet 21(3):305–308
Taylor B (1978) Recombinant inbred strains: Use in gene mapping. In: Morse HC (ed) Origins of inbred mice. Academic Press, New York, pp 423–438
Threadgill DW, Churchill GA (2012) Ten years of the collaborative cross. G3 (Bethesda) 2(2):153–156
Valdar W, Flint J, Mott R (2006a) Simulating the collaborative cross: power of quantitative trait loci detection and mapping resolution in large sets of recombinant inbred strains of mice. Genetics 172(3):1783–1797
Valdar W, Solberg LC, Gauguier D, Burnett S, Klenerman P, Cookson WO, Taylor MS, Rawlins JN, Mott R, Flint J (2006b) Genome-wide genetic association of complex traits in heterogeneous stock mice. Nat Genet 38(8):879–887
Valdar W, Holmes CC, Mott R, Flint J (2009) Mapping in structured populations by resample model averaging. Genetics 182(4):1263–1277
Walter NA, McWeeney SK, Peters ST, Belknap JK, Hitzemann R, Buck KJ (2007) SNPs matter: impact on detection of differential expression. Nat Methods 4(9):679–680
Walter NA, Bottomly D, Laderas T, Mooney MA, Darakjian P, Searles RP, Harrington CA, McWeeney SK, Hitzemann R, Buck KJ (2009) High throughput sequencing in mice: a platform comparison identifies a preponderance of cryptic SNPs. BMC Genomics 10:379
Wang J, Williams RW, Manly KF (2003) WebQTL: web-based complex trait analysis. Neuroinformatics 1(4):299–308
West MAL, Kim K, Kliebenstein DJ, van Leeuwen H, Michelmore RW, Doerge RW, St Clair DA (2007) Global eQTL mapping reveals the complex genetic architecture of transcript-level variation in Arabidopsis. Genetics 175(3):1441–1450
Yalcin B, Flint J (2012) Association studies in outbred mice in a new era of full-genome sequencing. Mamm Genome 23(9–10):719–726
Yalcin B, Fullerton J, Miller S, Keays DA, Brady S, Bhomra A, Jefferson A, Volpi E, Copley RR, Flint J, Mott R (2004) Unexpected complexity in the haplotypes of commonly used inbred strains of laboratory mice. Proc Natl Acad Sci USA 101(26):9734–9739
Yang H, Ding Y, Hutchins LN, Szatkiewicz J, Bell TA, Paigen BJ, Graber JH, de Villena FPM, Churchill GA (2009) A customized and versatile high-density genotyping array for the mouse. Nat Methods 6(9):663–666
Yvert G, Brem RB, Whittle J, Akey JM, Foss E, Smith EN, Mackelprang R, Kruglyak L (2003) Trans-acting regulatory variation in Saccharomyces cerevisiae and the role of transcription factors. Nat Genet 35(1):57–64
Zhang B, Horvath S (2005) A general framework for weighted gene co-expression network analysis. Stat Appl Genet Mol Biol 4:17
Zhou JJ, Ghazalpour A, Sobel EM, Sinsheimer JS, Lange K (2012) Quantitative trait loci association mapping by imputation of strain origins in multifounder crosses. Genetics 190(2):459–473
Zhu J, Wiener MC, Zhang C, Fridman A, Minch E, Lum PY, Sachs JR, Schadt EE (2007) Increasing the power to detect causal associations by combining genotypic and expression data in segregating populations. PLoS Comput Biol 3(4):e69

Acknowledgments

This study was supported in part by United States Public Health Service grants AA10760, AA11034, AA13484, MH 51372, AA 13519, AA 20245, DA005228, Oregon Clinical and Translational Research Institute [5UL1RR024140], Knight Cancer Institute [5 P30 CA069533], and grant support from the Department of Veterans Affairs.

Author information

Authors and Affiliations

Portland Alcohol Research Center, Veterans Affairs Medical Center, Portland, 97239, OR, USA
Robert Hitzemann, Kari Buck, John Belknap, John Crabbe & Shannon McWeeney
Department of Behavioral Neuroscience, Oregon Health & Science University, Portland, 97239-3098, OR, USA
Robert Hitzemann, Ovidiu Iancu, Kari Buck, John Belknap & John Crabbe
Oregon Clinical and Translational Research Institute, Oregon Health & Science University, Portland, 97239-3098, OR, USA
Daniel Bottomly, Beth Wilmot & Shannon McWeeney
Division of Bioinformatics and Computational Biology, Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, 97239-3098, OR, USA
Beth Wilmot, Michael Mooney, Christina Zheng & Shannon McWeeney
Integrated Genomics Laboratory, Oregon Health & Science University, Portland, 97239-3098, OR, USA
Robert Searles
Division of Biostatistics, Public Health & Preventative Medicine, Oregon Health & Science University, Portland, 97239-3098, OR, USA
Shannon McWeeney

Corresponding author

Correspondence to Shannon McWeeney.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.

About this article

Cite this article

Hitzemann, R., Bottomly, D., Iancu, O. et al. The genetics of gene expression in complex mouse crosses as a tool to study the molecular underpinnings of behavior traits. Mamm Genome 25, 12–22 (2014). https://doi.org/10.1007/s00335-013-9495-6

Received: 01 August 2013
Accepted: 25 November 2013
Published: 31 December 2013
Issue Date: February 2014
DOI: https://doi.org/10.1007/s00335-013-9495-6
