Informatics resources for the Collaborative Cross and related mouse populations

Morgan, Andrew P.; Welsh, Catherine E.

doi:10.1007/s00335-015-9581-z

Published: 02 July 2015

Mammalian Genome, volume 26, pages 521–539 (2015)

References

Aylor DL, Valdar W, Foulds-Mathes W et al (2011) Genetic analysis of complex traits in the emerging Collaborative Cross. Genome Res 21:1213–1222. doi:10.1101/gr.111310.110
Bailey JA, Gu Z, Clark RA et al (2002) Recent segmental duplications in the human genome. Science 297:1003–1007. doi:10.1126/science.1072047
Bailey JA, Baertsch R, Kent WJ et al (2004) Hotspots of mammalian chromosomal evolution. Genome Biol 5:R23. doi:10.1186/gb-2004-5-4-r23
Baker CL, Kajita S, Walker M et al (2015) PRDM9 drives evolutionary erosion of hotspots in Mus musculus through haplotype-specific initiation of meiotic recombination. PLoS Genet 11:e1004916. doi:10.1371/journal.pgen.1004916
Bauer MJ, Cox AJ, Rosone G et al (2013) Lightweight algorithms for constructing and inverting the BWT of string collections. Theor Comput Sci 483:134–148. doi:10.1016/j.tcs.2012.02.002
Baum LE, Petrie T (1966) Statistical inference for probabilistic functions of finite state Markov chains. Ann Math Stat 37:1554–1563
Beck JA, Lloyd S, Hafezparast M et al (2000) Genealogies of mouse inbred strains. Nat Genet 24:23–25. doi:10.1038/71641
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc B 57:289–300
Bennett BJ, Farber CR, Orozco L et al (2010) A high-resolution association mapping panel for the dissection of complex traits in mice. Genome Res 20:281–290. doi:10.1101/gr.099234.109
Boursot P, Auffray JC, Britton-Davidian J, Bonhomme F (1993) The evolution of house mice. Annu Rev Ecol Syst 24:119–152
Broman KW, Wu H, Sen S, Churchill GA (2003) R/qtl: QTL mapping in experimental crosses. Bioinformatics 19:889–890
Calaway JD, Lenarcic AB, Didion JP et al (2013) Genetic architecture of skewed X inactivation in the laboratory mouse. PLoS Genet 9:e1003853. doi:10.1371/journal.pgen.1003853
Collaborative Cross Consortium (2012) The genome architecture of the Collaborative Cross mouse genetic reference population. Genetics 190:389–401. doi:10.1534/genetics.111.132639
Chaisson MJ, Pevzner PA (2008) Short read fragment assembly of bacterial genomes. Genome Res 18:324–330. doi:10.1101/gr.7088808
Chesler EJ (2014) Out of the bottleneck: the Diversity Outcross and Collaborative Cross mouse populations in behavioral genetics research. Mamm Genome 25:3–11. doi:10.1007/s00335-013-9492-9
Church DM, Schneider VA, Steinberg KM et al (2015) Extending reference assembly models. Genome Biol 16:13. doi:10.1186/s13059-015-0587-3
Churchill GA, Doerge RW (1994) Empirical threshold values for quantitative trait mapping. Genetics 138:963–971
Churchill GA, Airey DC, Allayee H et al (2004) The Collaborative Cross, a community resource for the genetic analysis of complex traits. Nat Genet 36:1133–1137. doi:10.1038/ng1104-1133
Clark AG, Hubisz MJ, Bustamante CD et al (2005) Ascertainment bias in studies of human genome-wide polymorphism. Genome Res 15:1496–1502. doi:10.1101/gr.4107905
Cook MN, Bolivar V, McFadyen MP, Flaherty L (2002) Behavioral differences among 129 substrains: implications for knockout and transgenic mice. Behav Neurosci 116:600–611. doi:10.1037/0735-7044.116.4.600
Crowley JJ, Zhabotynsky V, Sun W et al (2015) Analyses of allele-specific gene expression in highly divergent mouse crosses identifies pervasive allelic imbalance. Nat Genet. doi:10.1038/ng.3222
Daetwyler HD, Calus MPL, Pong-Wong R et al (2013) Genomic prediction in animals and plants: simulation of data, validation, reporting, and benchmarking. Genetics 193:347–365. doi:10.1534/genetics.112.147983
Didion JP, Yang H, Sheppard K et al (2012) Discovery of novel variants in genotyping arrays improves genotype retention and reduces ascertainment bias. BMC Genom 13:34. doi:10.1186/1471-2164-13-34
Didion JP, de Villena FP-M (2013) Deconstructing Mus gemischus: advances in understanding ancestry, structure, and variation in the genome of the laboratory mouse. Mamm Genome 24:1–20. doi:10.1007/s00335-012-9441-z
Dobzhansky T (1936) Studies on hybrid sterility. II. Localization of sterility factors in Drosophila pseudoobscura hybrids. Genetics 21:113–135
Ferguson B, Ram R, Handoko HY et al (2014) Melanoma susceptibility as a complex trait: genetic variation controls all stages of tumor progression. Oncogene. doi:10.1038/onc.2014.227
Ferris MT, Aylor DL, Bottomly D et al (2013) Modeling host genetic regulation of influenza pathogenesis in the Collaborative Cross. PLoS Pathog 9:e1003196. doi:10.1371/journal.ppat.1003196
Flicek P, Ahmed I, Amode MR et al (2013) Ensembl 2013. Nucleic Acids Res 41:D48–D55. doi:10.1093/nar/gks1236
Forejt J, Ivanyi P (1974) Genetic studies on male sterility of hybrids between laboratory and wild mice (Mus musculus L.). Genet Res 24:189–206
Frazer KA, Eskin E, Kang HM et al (2007) A sequence-based variation map of 8.27 million SNPs in inbred mouse strains. Nature 448:1050–1053. doi:10.1038/nature06067
Fu C-P, Welsh CE, de Villena FP-M, McMillan L (2012) Inferring ancestry in admixed populations using microarray probe intensities. In: Proceedings of the ACM conference on bioinformatics, computational biology and biomedicine (BCB ’12). ACM Press, New York, pp 105–112
Gatti DM, Svenson KL, Shabalin A et al (2014) Quantitative trait locus mapping methods for Diversity Outbred mice. G3 4:1623–1633. doi:10.1534/g3.114.013748
Gelman A, Hill J (2007) Data analysis using regression and multilevel/hierarchical models. Cambridge University Press, Cambridge
Geraldes A, Basset P, Gibson B et al (2008) Inferring the history of speciation in house mice from autosomal, X-linked, Y-linked and mitochondrial genes. Mol Ecol 17:5349–5363. doi:10.1111/j.1365-294X.2008.04005.x
Ghazalpour A, Rau CD, Farber CR et al (2012) Hybrid Mouse Diversity Panel: a panel of inbred mouse strains suitable for analysis of complex genetic traits. Mamm Genome 23:680–692. doi:10.1007/s00335-012-9411-5
Gonzales NM, Palmer AA (2014) Fine-mapping QTLs in advanced intercross lines and other outbred populations. Mamm Genome 25:271–292. doi:10.1007/s00335-014-9523-1
Good JM, Dean MD, Nachman MW (2008) A complex genetic basis to X-linked hybrid male sterility between two species of house mice. Genetics 179:2213–2228. doi:10.1534/genetics.107.085340
Grabherr MG, Haas BJ, Yassour M et al (2011) Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol 29:644–652. doi:10.1038/nbt.1883
Grubb SC, Bult CJ, Bogue MA et al (2014) Mouse phenome database. Nucleic Acids Res 42:D825–D834. doi:10.1093/nar/gkt1159
Haley CS, Knott SA (1992) A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69:315–324. doi:10.1038/hdy.1992.131
Harrow J, Denoeud F, Frankish A et al (2006) GENCODE: producing a reference annotation for ENCODE. Genome Biol 7(Suppl 1):S4.1–S4.9. doi:10.1186/gb-2006-7-s1-s4
Holt J, McMillan L (2014) Merging of multi-string BWTs with applications. Bioinformatics 30:3524–3531. doi:10.1093/bioinformatics/btu584
Huang S, Holt J, Kao C-Y et al (2014) A novel multi-alignment pipeline for high-throughput sequencing data. Database 2014:bau057. doi:10.1093/database/bau057
Hudson RR, Kaplan NL (1985) Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111:147–164
Iraqi FA, Athamni H, Dorman A et al (2014) Heritability and coefficient of genetic variation analyses of phenotypic traits provide strong basis for high-resolution QTL mapping in the Collaborative Cross mouse genetic reference population. Mamm Genome 25:109–119. doi:10.1007/s00335-014-9503-5
Kang HM, Zaitlen NA, Wade CM et al (2008) Efficient control of population structure in model organism association mapping. Genetics 178:1709–1723. doi:10.1534/genetics.107.080101
Karolchik D, Barber GP, Casper J et al (2014) The UCSC genome browser database: 2014 update. Nucleic Acids Res 42:D764–D770. doi:10.1093/nar/gkt1168
Keane TM, Goodstadt L, Danecek P et al (2011) Mouse genomic variation and its effect on phenotypes and gene regulation. Nature 477:289–294. doi:10.1038/nature10413
Kelada SNP, Aylor DL, Peck BCE et al (2012) Genetic analysis of hematological parameters in incipient lines of the Collaborative Cross. G3 2:157–165. doi:10.1534/g3.111.001776
Kelada SNP, Carpenter DE, Aylor DL et al (2014) Integrative genetic analysis of allergic inflammation in the murine lung. Am J Respir Cell Mol Biol 51:436–445. doi:10.1165/rcmb.2013-0501OC
Lenarcic AB, Svenson KL, Churchill GA, Valdar W (2012) A general Bayesian approach to analyzing diallel crosses of inbred strains. Genetics 190:413–435. doi:10.1534/genetics.111.132563
Lippert C, Listgarten J, Liu Y et al (2011) FaST linear mixed models for genome-wide association studies. Nat Methods 8:833–835. doi:10.1038/nmeth.1681
Liu EY, Zhang Q, McMillan L et al (2010) Efficient genome ancestry inference in complex pedigrees with inbreeding. Bioinformatics 26:i199–i207. doi:10.1093/bioinformatics/btq187
Liu EY, Morgan AP, Chesler EJ et al (2014) High-resolution sex-specific linkage maps of the mouse reveal polarized distribution of crossovers in male germline. Genetics 197:91–106. doi:10.1534/genetics.114.161653
McLaren W, Pritchard B, Rios D et al (2010) Deriving the consequences of genomic variants with the Ensembl API and SNP effect predictor. Bioinformatics 26:2069–2070. doi:10.1093/bioinformatics/btq330
Mott R, Talbot CJ, Turri MG et al (2000) A method for fine mapping quantitative trait loci in outbred animal stocks. Proc Natl Acad Sci USA 97:12649–12654. doi:10.1073/pnas.230304397
Munger SC, Raghupathy N, Choi K et al (2014) RNA-seq alignment to individualized genomes improves transcript abundance estimates in multiparent populations. Genetics 198:59–73. doi:10.1534/genetics.114.165886
Orth A, Adama T, Din W, Bonhomme F (1998) Natural hybridization between two subspecies of the house mouse, Mus musculus domesticus and Mus musculus castaneus, near Lake Casitas, California. Genome 41:104–110
Patro R, Mount SM, Kingsford C (2014) Sailfish enables alignment-free isoform quantification from RNA-seq reads using lightweight algorithms. Nat Biotechnol 32:462–464. doi:10.1038/nbt.2862
Petkov PM, Ding Y, Cassell MA et al (2004) An efficient SNP system for mouse genome scanning and elucidating strain relationships. Genome Res 14:1806–1811. doi:10.1101/gr.2825804
Phillippi J, Xie Y, Miller DR et al (2014) Using the emerging Collaborative Cross to probe the immune system. Genes Immun 15:38–46. doi:10.1038/gene.2013.59
Rabiner LR (1989) A tutorial on hidden Markov models and selected applications in speech recognition. Proc IEEE 77:257–286
Rasmussen AL, Okumura A, Ferris MT et al (2014) Host genetic diversity enables Ebola hemorrhagic fever pathogenesis and resistance. Science. doi:10.1126/science.1259595
Rogala AR, Morgan AP, Christensen AM et al (2014) The Collaborative Cross as a resource for modeling human disease: CC011/Unc, a new mouse model for spontaneous colitis. Mamm Genome 25:95–108. doi:10.1007/s00335-013-9499-2
She X, Cheng Z, Zöllner S et al (2008) Mouse segmental duplication and copy number variation. Nat Genet 40:909–914. doi:10.1038/ng.172
Simecek P, Churchill GA, Yang H et al (2015) Genetic analysis of substrain divergence in NOD mice. G3 5:771–775. doi:10.1534/g3.115.017046
Soh YQS, Alföldi J, Pyntikova T et al (2014) Sequencing the mouse Y chromosome reveals convergent gene acquisition and amplification on both sex chromosomes. Cell 159:800–813. doi:10.1016/j.cell.2014.09.052
Svenson KL, Gatti DM, Valdar W et al (2012) High-resolution genetic mapping using the mouse Diversity Outbred population. Genetics 190:437–447. doi:10.1534/genetics.111.132597
Taylor BA, Heiniger HJ, Meier H (1973) Genetic analysis of resistance to cadmium-induced testicular damage in mice. Proc Soc Exp Biol Med 143:629–633
Ursin E (1952) Occurrence of voles, mice, and rats (Muridae) in Denmark, with a special note on a zone of intergradation between two subspecies of the house mouse (Mus musculus L.). Vid Medd Dansk Naturhist Foren 114:217–244
Valdar W, Flint J, Mott R (2006a) Simulating the Collaborative Cross: power of quantitative trait loci detection and mapping resolution in large sets of recombinant inbred strains of mice. Genetics 172:1783–1797. doi:10.1534/genetics.104.039313
Valdar W, Solberg LC, Gauguier D et al (2006b) Genome-wide genetic association of complex traits in heterogeneous stock mice. Nat Genet 38:879–887. doi:10.1038/ng1840
Valdar W, Holmes CC, Mott R, Flint J (2009) Mapping in structured populations by resample model averaging. Genetics 182:1263–1277. doi:10.1534/genetics.109.100727
Wade CM, Kulbokas EJ, Kirby AW et al (2002) The mosaic structure of variation in the laboratory mouse genome. Nature 420:574–578. doi:10.1038/nature01252
Wall JD, Pritchard JK (2003) Haplotype blocks and linkage disequilibrium in the human genome. Nat Rev Genet 4:587–597. doi:10.1038/nrg1123
Wang J, Moore KJ, Zhang Q et al (2010) Genome-wide compatible SNP intervals and their properties. In: Proceedings of the first ACM international conference on bioinformatics and computational biology (BCB ’10). ACM Press, New York, p 43
Wang JR, de Villena FP-M, Lawson HA et al (2012a) Imputation of single-nucleotide polymorphisms in inbred mice using local phylogeny. Genetics 190:449–458. doi:10.1534/genetics.111.132381
Wang JR, de Villena FP-M, McMillan L et al (2012b) Comparative analysis and visualization of multiple collinear genomes. BMC Bioinform 13(Suppl 3):S13. doi:10.1186/1471-2105-13-S3-S13
Waterston RH, Lindblad-Toh K, Birney E et al (2002) Initial sequencing and comparative analysis of the mouse genome. Nature 420:520–562. doi:10.1038/nature01262
Weiser M, Mukherjee S, Furey TS et al (2014) Novel distal eQTL analysis demonstrates effect of population genetic architecture on detecting and interpreting associations. Genetics 198:879–893. doi:10.1534/genetics.114.167791
Williams RW, Gu J, Qi S, Lu L (2001) The genetic structure of recombinant inbred mice: high-resolution consensus maps for complex trait analysis. Genome Biol 2:46. doi:10.1186/gb-2001-2-11-research0046
Williams RW, Bennett B, Lu L et al (2004) Genetic structure of the LXS panel of recombinant inbred mouse strains: a powerful resource for complex trait analysis. Mamm Genome 15:637–647. doi:10.1007/s00335-004-2380-6
Wilming LG, Gilbert JGR, Howe K et al (2008) The vertebrate genome annotation (Vega) database. Nucleic Acids Res 36:D753–D760. doi:10.1093/nar/gkm987
Yang H, Bell TA, Churchill GA, de Villena FP-M (2007) On the subspecific origin of the laboratory mouse. Nat Genet 39:1100–1107. doi:10.1038/ng2087
Yang H, Ding Y, Hutchins LN et al (2009) A customized and versatile high-density genotyping array for the mouse. Nat Methods 6:663–666. doi:10.1038/nmeth.1359
Yang H, Wang JR, Didion JP et al (2011) Subspecific origin and haplotype diversity in the laboratory mouse. Nat Genet 43:648–655. doi:10.1038/ng.847
Zhang Z, Wang W, Valdar W et al (2014) Bayesian modeling of haplotype effects in multiparent populations. Genetics 198:139–156. doi:10.1534/genetics.114.166249

URLs

BAGPIPE. http://valdarlab.unc.edu/software/bagpipe
BAGPHENOTYPE. http://valdarlab.unc.edu/bagphenotype.html
Collaborative Cross Status website. http://www.csbio.unc.edu/CCstatus/
Collaborative Cross Viewer. http://www.csbio.unc.edu/CCstatus/index.py?run=CCV
DOQTL. http://www.bioconductor.org/packages/release/bioc/html/DOQTL.html
GECCO gene expression browser. http://csbio.unc.edu/gecco/
MDA genotypes for 100 inbred strains. http://cgd.jax.org/datasets/popgen/diversityarray/yang2011.shtml
MegaMUGA genotypes for CC founder strains. http://csbio.unc.edu/CCstatus/index.py?run=GeneseekMM
modtools + lapels + suspenders pipeline. http://www.csbio.unc.edu/CCstatus/index.py?run=Pseudo
Mouse Imputation Resource. http://csbio.unc.edu/imputation/
Mouse Phylogeny Viewer. http://msub.csbio.unc.edu/
Sanger Mouse Genomes Project. http://www.sanger.ac.uk/resources/mouse/genomes/
Searchable index of sequencing reads from CC founder strains. http://www.csbio.unc.edu/CEGSseq/index.py?run=MsbwtTools
Seqnature. https://github.com/jaxcs/Seqnature

Acknowledgments

The development of the CC population and related tools at the UNC Systems Genetics Core Facility was supported by grants from the National Institutes of Health (U01CA134240, P50MH090338, P50HG006582, and U54AI081680), the Ellison Medical Foundation (grant AG-IA-0202-05), and the National Science Foundation (grants IIS0448392 and IIS0812464). Essential support was provided by the Dean of the UNC School of Medicine, the UNC Mutant Mouse Regional Resource Center (U42OD010924), the Lineberger Comprehensive Cancer Center at UNC (U01CA016086 from the National Cancer Institute), and the University Cancer Research Fund from the state of North Carolina. Development of the Mouse Diversity Array was supported by NIH grant P50GM076468. Support for development of the MegaMUGA array was provided by Neogen Corporation, Lincoln, NE. APM was supported by grants T32GM067553 and F30MH103925. The authors thank Darla Miller, the UNC Systems Genetics Group, and the Center for Genome Dynamics at the Jackson Laboratory for helpful discussions, and Fernando Pardo-Manuel de Villena and Leonard McMillan for their mentorship and for their comments on this manuscript.

Author information

Authors and Affiliations

Department of Genetics, University of North Carolina, Chapel Hill, NC, USA
Andrew P. Morgan
Department of Mathematics & Computer Science, Rhodes College, Memphis, TN, USA
Catherine E. Welsh

Corresponding author

Correspondence to Catherine E. Welsh.

Appendix: terms and definitions

Relatedness. In the genetic sense, relatedness refers to the proportion of alleles shared between two individuals. The degree to which two individuals are genetically related depends on the number of common ancestors they share and the number of generations that have elapsed since those ancestors. A pedigree describes the expected relatedness between individuals: first-degree relatives (parents and offspring, or full siblings) share, on average, half of their alleles; second-degree relatives (for example, grandparents and grandchildren) one-fourth; and so on. With dense genotype data, we can instead compute realized relatedness as the proportion of alleles shared at unlinked markers.

Using dense genotypes, we can define relatedness both at the genome-wide and at the local scale. In the presence of admixture or introgression (see below), local relatedness in different regions of the genome may deviate from the genome-wide average.
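As a rough illustration of realized relatedness at the genome-wide and local scales, the sketch below computes simple identity-by-state allele sharing between two diploid individuals. It assumes genotypes are coded as counts of one allele (0, 1, or 2) with no missing calls. The function names `allele_sharing` and `local_sharing` are hypothetical, and this estimator is only one of several possible choices, not one prescribed by the review.

```python
from typing import List, Sequence


def allele_sharing(g1: Sequence[int], g2: Sequence[int]) -> float:
    """Mean proportion of alleles shared (identity by state) across markers.

    Genotypes are allele counts 0/1/2; at each marker the two individuals
    share 1 - |g1 - g2| / 2 of their alleles under this simple measure.
    """
    if len(g1) != len(g2) or len(g1) == 0:
        raise ValueError("genotype vectors must be non-empty and equal in length")
    return sum(1 - abs(a - b) / 2 for a, b in zip(g1, g2)) / len(g1)


def local_sharing(g1: Sequence[int], g2: Sequence[int], window: int) -> List[float]:
    """Allele sharing in consecutive, non-overlapping windows of markers."""
    return [
        allele_sharing(g1[i:i + window], g2[i:i + window])
        for i in range(0, len(g1), window)
    ]
```

Comparing the output of `local_sharing` with the genome-wide value from `allele_sharing` is one way to see regions where local relatedness departs from the average.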

Population structure. A population is “structured” when it has experienced deviations from random mating, or equivalently, when it is divided into subpopulations with restricted genetic exchange between them. In a structured population, members of some groups are more closely related to (share more alleles with) each other than to members of other groups. Geography and mating behavior generate at least some degree of structure in most natural populations. Population structure in laboratory mouse strains is widespread: for instance, the 129 and C57BL strain groups form a genetic cluster distinct from so-called “Swiss mice” including FVB/NJ, the NOD substrains, and ICR outbred stock (Beck et al. 2000). Failure to account for population structure can lead to false-positive QTL in genetic mapping of complex traits.

Linkage disequilibrium (LD). Two loci are said to be in LD if the observed frequencies of two-locus allele combinations depart from those expected if alleles were sampled independently at each locus. Recombination erodes LD, so LD generally decays with time and with physical distance between loci. Unlinked markers are expected to be in linkage equilibrium, but non-random mating can produce “long-range” LD between unlinked loci in structured populations.
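The departure from independence described above is commonly summarized by the coefficient D and the squared correlation r². The sketch below computes both from phased two-locus haplotypes. It assumes biallelic loci and phased data. The function name `ld_stats` is hypothetical, and these particular statistics are standard choices rather than ones named in the review.

```python
from typing import Iterable, Sequence, Tuple


def ld_stats(haplotypes: Iterable[Sequence[str]]) -> Tuple[float, float]:
    """Return (D, r^2) for phased two-locus haplotypes such as "AB" or ("A", "B").

    The alleles carried by the first haplotype are used as the reference
    alleles at each locus; both loci are assumed to be biallelic.
    """
    haps = [tuple(h) for h in haplotypes]
    n = len(haps)
    if n == 0:
        raise ValueError("need at least one haplotype")
    ref_a, ref_b = haps[0]
    p_a = sum(h[0] == ref_a for h in haps) / n          # allele frequency, locus 1
    p_b = sum(h[1] == ref_b for h in haps) / n          # allele frequency, locus 2
    p_ab = sum(h == (ref_a, ref_b) for h in haps) / n   # joint haplotype frequency
    d = p_ab - p_a * p_b                                # D: observed minus expected
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else float("nan")   # undefined if a locus is fixed
    return d, r2
```

With only the haplotypes AB and ab present in equal numbers, the two loci are perfectly associated and r² reaches 1. With all four haplotypes at equal frequency, D and r² are both 0.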

Haplotype block. A haplotype block is a chromosomal segment in which there is no evidence for recombination during the history of a sample of individuals. Within a block, individuals in a population can be collapsed into one of a small (relative to the population size) number of ancestral haplotypes (Wall and Pritchard 2003). LD is relatively high between loci within a block, but relatively low between loci in adjacent blocks.

Although many schemes have been proposed for defining haplotype blocks, the one discussed in this review is the four-gamete test (Hudson and Kaplan 1985). Consider two loci A and B with alleles A,a and B,b, respectively. There are four possible haploid genotypes (gametes): AB, aB, Ab, and ab. If all four are observed in a sample, then, assuming no recurrent mutation, recombination between A and B must have occurred at least once in the past.
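The four-gamete test can be sketched in a few lines. The example below assumes phased, biallelic haplotypes with no missing data. The function names are hypothetical, and the left-to-right greedy partition in `greedy_blocks` is one simple way to turn the pairwise test into blocks, not necessarily the procedure used in the studies cited here.

```python
from typing import List, Sequence, Tuple


def four_gametes_observed(locus_a: Sequence[str], locus_b: Sequence[str]) -> bool:
    """True if all four two-locus gametes appear among the sampled haplotypes."""
    return len(set(zip(locus_a, locus_b))) == 4


def greedy_blocks(haplotypes: Sequence[Sequence[str]]) -> List[Tuple[int, int]]:
    """Partition loci into maximal runs in which no pair fails the test.

    `haplotypes` has one row per sampled chromosome and one column per locus.
    Returns inclusive (start, end) locus indices for each block.
    """
    if not haplotypes:
        return []
    n_loci = len(haplotypes[0])
    columns = [[h[j] for h in haplotypes] for j in range(n_loci)]
    blocks = []
    start = 0
    for j in range(1, n_loci):
        # Close the current block when locus j shows evidence of
        # recombination with any locus already in the block.
        if any(four_gametes_observed(columns[i], columns[j]) for i in range(start, j)):
            blocks.append((start, j - 1))
            start = j
    blocks.append((start, n_loci - 1))
    return blocks
```

For the four haplotypes `000`, `001`, `110`, and `111`, the first two loci show only two gametes, while the first and third loci show all four, so the partition places a block boundary before the third locus.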

Haplotype blocks are a useful means of investigating patterns of genetic diversity among individuals whose common ancestor lies at an intermediate timescale in the past, such as classical inbred strains of mice (Yang et al. 2011). But because recombination events accumulate and LD decreases with time, haplotype blocks shared between two individuals whose common ancestor lies far in the past (for example, a wild-derived inbred strain and a classical laboratory strain) will be very short. For this reason, haplotype blocks were not inferred for the wild mice and wild-derived strains in Yang et al. (2011).

Identity by descent (IBD). A chromosomal segment is shared identical-by-descent between two individuals if both inherited it from a common ancestor without intervening recombination. The notion of IBD is closely related to the haplotype block.

Admixture. Admixture refers to interbreeding between individuals from populations that were previously genetically isolated from one another. Admixture facilitates gene flow between populations, and in the process creates heterogeneity of relatedness across the genome.

Introgression. Introgression refers to the introduction of a chromosomal segment from one population into a separate, genetically distinct population. It is often used to describe gene flow between species or subspecies that can still form fertile hybrids. Unlike admixture, which describes ongoing interbreeding, introgression describes events that are episodic in nature. In this review, we refer to genetic exchange between mouse subspecies, which do not interbreed in the wild except at narrow hybrid zones (Ursin 1952), as introgression.

Ancestry inference. Broadly speaking, an ancestry-inference procedure steps along the genome of an individual and attempts to assign each segment to one of a few ancestral clusters. These clusters may represent ancestral population groups, for samples from natural populations, or founder haplotypes, for laboratory populations. Examples of ancestry inference discussed in this review include assignment of subspecific origin in wild mice (Yang et al. 2011), which labels genomic regions with one of three subspecies; and haplotype reconstruction in the CC and DO (Fu et al. 2012), which assigns genomic regions to one of those populations’ eight founder strains.

Hidden Markov model (HMM). A hidden Markov model is a probabilistic model that describes how an observed sequence can be generated from an underlying, unknown sequence of “hidden states” (Baum and Petrie 1966; Rabiner 1989). Efficient algorithms can be used to “decode” the most probable sequence of hidden states given an observed sequence. In this review, we discuss HMMs in which the observed sequences are genotypes along a chromosome, and the hidden states are founder haplotypes.
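To make the decoding step concrete, the sketch below applies the Viterbi algorithm to a deliberately simplified haploid case: each marker carries one observed allele, each hidden state is a founder, and the founders' alleles are known. The function name `viterbi_founders` and the parameters `switch_prob` and `error_rate` are illustrative assumptions. Published tools for the CC and DO use richer models (diploid states, map distances, array intensities) that are not reproduced here.

```python
import math
from typing import Dict, List, Sequence


def viterbi_founders(
    observed: Sequence[str],
    founder_alleles: Dict[str, Sequence[str]],
    switch_prob: float = 0.01,   # assumed chance of changing founder between markers
    error_rate: float = 0.01,    # assumed genotyping error rate
) -> List[str]:
    """Most probable founder at each marker for one haploid chromosome."""
    founders = list(founder_alleles)
    k = len(founders)
    if k < 2 or not observed:
        raise ValueError("need at least two founders and one marker")

    def log_emit(f: str, t: int) -> float:
        match = founder_alleles[f][t] == observed[t]
        return math.log(1 - error_rate if match else error_rate)

    log_stay = math.log(1 - switch_prob)
    log_switch = math.log(switch_prob / (k - 1))

    def log_trans(prev: str, cur: str) -> float:
        return log_stay if prev == cur else log_switch

    # Initialise with a uniform prior over founders at the first marker.
    score = {f: math.log(1 / k) + log_emit(f, 0) for f in founders}
    back = []
    for t in range(1, len(observed)):
        new_score, pointers = {}, {}
        for f in founders:
            best = max(founders, key=lambda g: score[g] + log_trans(g, f))
            new_score[f] = score[best] + log_trans(best, f) + log_emit(f, t)
            pointers[f] = best
        back.append(pointers)
        score = new_score

    # Trace back from the best final state.
    state = max(founders, key=lambda f: score[f])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

Because switching founders is penalized, a single discordant marker is absorbed as a genotyping error rather than decoded as two closely spaced recombination events, whereas a sustained run of discordant markers produces a switch.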

About this article

Cite this article

Morgan, A.P., Welsh, C.E. Informatics resources for the Collaborative Cross and related mouse populations. Mamm Genome 26, 521–539 (2015). https://doi.org/10.1007/s00335-015-9581-z

Received: 16 March 2015
Accepted: 23 June 2015
Published: 02 July 2015
Issue Date: October 2015
DOI: https://doi.org/10.1007/s00335-015-9581-z
