Mammalian Genome, Volume 23, Issue 9, pp 479–489

Mouse genomics programs and resources

The mouse: pushing the boundaries

DOI: 10.1007/s00335-012-9429-8

Cite this article as:
Bucan, M., Eppig, J.T. & Brown, S. Mamm Genome (2012) 23: 479. doi:10.1007/s00335-012-9429-8

This year we celebrate 10 years since the publication of the draft genome sequence of the C57BL/6J mouse inbred strain. Initial and ongoing comparative sequence analysis continues to highlight the mouse as a leading model for studying human biology and disease. The availability of the reference sequence transformed the efforts of individual laboratories to understand the function of specific genes by studying mouse mutants. Moreover, the mouse genetics community recognized the opportunity to build on the mouse genome sequence to generate large functional genomic resources that will enable the elucidation of entire pathways and biological systems. As a result, laboratories and programs around the world have developed a tremendously rich genetic toolbox. The focus of this special issue of Mammalian Genome is on these programs and resources, which underpin an often spectacular acceleration in novel insights into biology and disease.

The complete genome sequence of the C57BL/6J inbred strain along with the initial gene annotation revolutionized our ability to link the genome sequence to phenotypic diversity by facilitating genetic screens. The sequence offered a roadmap for the generation of loss-of-function alleles by gene-based mutagenesis. Also, the painstaking work of identifying altered gene sequence in mutants from phenotype-based screens was reduced to sifting through a list of annotated genes in a critical interval. However, having a reference sequence for only one inbred strain is very limiting. For decades, the biomedical community has explored the rich biology and phenotypic diversity of many inbred strains and their combined genomes. A recent report on the whole genome sequence of 17 inbred strains explored sequence variation for both single nucleotide and structural variants, significantly illuminating genetic variation and associated phenotype variation, as well as throwing light on inbred strain origin (Keane et al. 2011). In this issue, Yalcin et al. describe these exciting findings and Simon et al. focus on the application of next-generation sequencing methods (exome and whole genome sequence) to mutation discovery.

Detailed knowledge of the spatial and temporal expression patterns in the developing mouse embryo and in adult tissues is critical for deciphering complex regulatory networks and helping to define the genetic architecture of systems. The results of numerous genome-wide expression studies using microarray technologies and in situ hybridization are assembled in several data collections. Sophisticated molecular, microscopic and computational approaches were applied to create freely available databases and digital atlases with rigorously annotated expression patterns (Armit et al., Geffers et al., Henry & Hohmann). How to combine expression data from several sources and integrate them with other gene-centric and biological data resources is a major challenge as discussed by Ringwald et al.

Analysis of mutants represents a key step towards our understanding of how information encoded by the genome leads to specific phenotypes. Undoubtedly, a very significant milestone in mouse genetics is the current availability of, and access to, mouse mutants. The mouse genetics community has for many years invested heavily in animal facilities and repositories to handle and disseminate large collections of frozen embryos or sperm for the growing mouse mutant resource (see Donahue et al.). Moreover, there has been a parallel commitment to developing and improving the cryopreservation procedures that are critical for safe, efficient archiving of the mutant collection, as described by Guan et al. Nevertheless, despite the apparent richness of mouse genetic variants in the worldwide repositories, the availability of the mouse genome sequence impelled the community to tackle the challenge of developing a genome-wide comprehensive mutant resource, comprising a mouse mutant for every gene in the mouse genome. Thus, the International Knockout Mouse Consortium was born, tasked with generating mainly conditional but also constitutive alleles in over 17,000 genes in embryonic stem (ES) cells (Bradley et al.). The many months of bench and tissue culture work once necessary to prepare targeting vectors and select ES cells with the desired constructs are now replaced by an electronic request for individual ES cell clones that can be converted into mice within weeks. While there are still several thousand genes that need to be targeted in order to establish a complete library of mouse mutants for every protein-coding gene in the genome, the finish line is within sight. Additionally, the role of non-coding RNA, microRNAs, and regulatory elements will need to be tackled using similarly unbiased approaches.
The utility of this resource is amplified by the ability to create conditional mutants, an ability that is dependent upon the quality and diversity of lines carrying appropriate Cre drivers. Murray et al. review the current state of the art of Cre driver resources along with future developments. It is important that a concomitant effort in the production of well-annotated Cre lines accompanies the global mutagenesis efforts.

One of the greatest assets for the study of mammalian biology is the detailed knowledge generated by comprehensive phenotypic characterization of a wide range of genetic traits. Classically, the repertoire of investigated phenotypes in mouse mutants was narrow and driven by specific hypotheses. However, over the last two decades, reflecting the needs of both phenotype- and gene-based large-scale screens, many centers have focused their efforts on the coordination and standardization of protocols and procedures for comprehensive phenotypic assessments. In this issue, Ayadi et al. and Fuchs et al. describe centralized programs and facilities with wide-ranging expertise in physiological, biochemical and behavioral systems. Moreover, Laughlin et al. describe a more specialized network of phenotyping centers with a focus on metabolic traits. Particularly intriguing are early insights from the initial pilots in large-scale, highly standardized and comprehensive phenotyping efforts. Overall, around 80 % of homozygous knockout mutant lines reveal at least one phenotype. While over 30 % of these homozygous lines are not viable, the phenotypic hit rate in heterozygotes for these non-viable lines was 70 %. These programmes demonstrate the power of comprehensive phenotyping for defining gene function and uncovering pleiotropy. Moreover, they reinforce a growing view that the community should proceed with broad-based phenotyping of mutants for every gene in the mouse genome. The International Mouse Phenotyping Consortium (IMPC) has been formed to address this challenge, and Brown & Moore discuss the plans and progress that have been made since the consortium was formally launched in 2011.

Large-scale, genome-wide phenotyping will be an immense undertaking, but an equally difficult challenge will be the capture, annotation and dissemination of all the data generated, along with its integration with other datasets, not only from mouse but also from other model organisms and human. Mallon et al. describe past developments and future plans for large-scale phenotyping programmes, including the comprehensive informatics solutions required for the IMPC. These range from mouse production and tracking systems, to standardized phenotyping procedures and their data management, to data QC and annotation. Key to making progress in the area of phenotype annotation will be the development of a robust mammalian phenotype ontology, and Smith and Eppig discuss progress with the Mammalian Phenotype Ontology as a standard for mouse phenotype data. Central to the full understanding of gene function and genome interaction will be the incorporation of IMPC data into the Mouse Genome Informatics (MGI) data resource, where the range of genetic, genomic, and biological data for mouse are placed in an integrated context. Moreover, the mapping of phenotypic traits across species will be important in order to relate phenotype discovery in mouse to biological and clinical phenotypes detected and described in other organisms. Gkoutos et al. review progress in the area of comparative phenomics, which will be critical to ensuring that we can make powerful, intelligent queries about phenotypes between multiple species.

While programmes such as the IMPC will provide an extraordinary database of baseline function for mouse genes, there remains the problem of integrating genetic variation and its contribution to phenotypic traits, thus identifying and characterizing the basis for phenotypic variation in complex traits and common diseases. A large number of well-characterized inbred strains of laboratory mice and their phenotypic diversity have been extensively utilized for the identification of physiological, metabolic, and behavioral quantitative trait loci (QTL). These studies were combined with the analysis of recombinant inbred strains and congenic lines, but the resources developed and used were often underpowered. However, there has been an extraordinary resurgence in the development of new approaches to dissect and analyse the genetic and phenotypic variation between inbred lines. Ghazalpour et al. describe the application of a novel association-based mixed model algorithm for the analysis of the Hybrid Mouse Diversity Panel. Chromosome substitution strains (CSSs) have had a major impact on the identification of QTLs and their underlying genetic variants, and Nadeau et al. review their role in dissecting a diverse array of genetic phenomena, such as epistasis, parent-of-origin effects and heritable epigenetic changes. Several papers in this issue report current progress on the establishment of large multiparental panels: first, recombinant inbred lines with fixed and reproducible genotypes, the Collaborative Cross (Welsh et al.); and, second, heterogeneous outbred mice, which are unique and capture limitless combinations of segregating alleles (Churchill et al., Yalcin and Flint). The latter category includes the Diversity Outbred (DO) mice (Churchill et al.), as well as Heterogeneous Stocks (HS) and Commercial Outbreds (CO) (Yalcin and Flint).
These new resources have been established with an aim to dramatically boost the level of genetic diversity and phenotypic heterogeneity, thereby increasing the resolution of mapping complex traits. The assembled reviews along with the linked publications provide early insights into the genomic architecture of these lines, and also serve as a practical guide related to their distribution and data sharing policies.

Finally, it is important to reiterate that, as we allude to above, the success and impact of these new functional genomics resources in the mouse should be considered in the wider context of mammalian biology and human disease studies. More than 2000 single-gene Mendelian diseases have been molecularly defined. However, the understanding of the genetic basis of complex common disorders lags behind. The genome-wide association studies (GWAS) of many complex diseases have uncovered a large number of common risk-associated alleles. The loci found by GWAS are mostly of small effect and explain a relatively low proportion of the heritability of complex traits. Furthermore, dense genotyping and extensive sequencing of thousands of genomes now permit near complete ascertainment of genetic variation, including low-frequency single nucleotide or copy number variants. The identification of their functional impact and validation will demand the use of model organisms. The collected resources and approaches described in this special issue (see Table 1) demonstrate the unique power of mouse genetics and genomics research to deliver a comprehensive functional map of a mammalian genome. The abundance of these new tools and resources, including phenotypically and genetically characterized mouse lines, libraries of ES cells with conditional and loss-of-function alleles, as well as candidate loci (genes and regulatory elements) identified by systems biology analysis, will be vital for the interpretation of variants identified in human and animal genetic studies. Nevertheless, there is still much to be done. Looking to the future, it is becoming clear that we need to continue to improve the richness and variety of available mouse mutant alleles and consider rapid and efficient approaches to generate and phenotype allelic combinations, all in the context of improved tools for phenotyping and annotation. 
This and more will be needed for a more profound mechanistic understanding of mammalian biology and disease.

Table 1 Mouse genomics programs and online resources

|  | Full name | Description |
| --- | --- | --- |
| International Knockout Mouse Consortium information resources |  |  |
|  | International Knockout Mouse Consortium | Integrates data from all IKMC projects, including knockout vectors, ES cells and mice generated by the consortium. Links to specific IKMC participating sites and repositories |
|  | European Conditional Mouse Mutagenesis Programme | EUCOMM member site, IKMC consortium |
|  | EUCOMM Tools for Functional Annotation of the Mouse Genome | EUCOMMTOOLS member site, IKMC consortium |
|  | Knockout Mouse Project | KOMP network (CHORI, UC Davis, Sanger Institute, Velocigene/Regeneron) collectively comprise KOMP, member IKMC consortium |
|  | MicroRNA knockout project | mirKO member site, IKMC consortium |
|  | North American Conditional Mouse Mutagenesis Project | NorCOMM member site, IKMC consortium |
|  | Texas A&M Institute for Genomic Medicine | TIGM member site, IKMC consortium |
| Cre and other recombinase resources |  |  |
|  | Allen Brain Institute | Characterization of cre transgenes developed at ABI |
|  | Brain specific Cre mice | Project description (new resource) |
|  | Recombinase (Cre) Portal | Integrated data from publications, submissions, and large-scale projects on expression and activity for cre-containing transgenes and knock-ins; includes images, links to obtaining mice, references |
|  | Coordination of resources for conditional expression of mutated mouse alleles | Project overview, Biomart of data from CrePortal, Cre-X-Mice and CreZoo databases |
|  |  | Cre transgene excision from literature and authors |
| CreZoo (TgDb) | Transgenic Mice Database | Cre transgene excision data from literature and authors |
|  |  | European consortium creating new cre lines and targeted knockout-first alleles |
|  | GENSAT Cre mice | BAC-Cre lines made by Gensat, characterized for neuro expression |
|  |  | CreERT2 resource development, with characterization data |
|  | JAX Cre repository | Repository of Cre lines provided by JAX, with cre characterization data |
|  | IMSR Cre mouse repository listings | Consolidated listing of cre-bearing mice and frozen germplasm in repositories worldwide. Links to strain information and ordering |
|  | Pleiades Promoter Project, UBC | Hprt knock-ins of mini-promoters from human genes driving cre |
| Mice and mouse cell line public repository resources |  |  |
|  | International Mouse Strain Resource | Integrates mouse resource holdings from 47 individual repositories (mice, cryopreserved gametes, ES cell lines). Links to data and order forms from participating repositories |
|  | Australian Phenome Bank | Repository Site, Australia |
|  | Center for Animal Resources and Development, Japan | Repository Site, Japan |
| CC Repository | Collaborative Cross | Repository Site, U North Carolina |
|  | Canadian Mouse Mutant Repository | Repository Site, Toronto |
|  | European Mouse Mutant Archive | Repository Site consortium, Europe (multiple sites, coordinated in Monterotondo, Italy) |
|  | European Mouse Mutant Cell Repository | Repository Site, Germany |
|  | MRC Harwell MouseBook Catalog | Repository Site, UK |
|  | Jackson Laboratory Mice and Services | Repository Site, US |
| KOMP Repository | KOMP Repository | Repository Site, US |
|  | Mutant Mouse Regional Resource Center | Repository Site consortium, US (includes UC Davis, JAX, U Missouri, UNC sites) |
|  | National Cancer Institute Mouse Repository | Repository Site, US |
|  | National Institute of Genetics, Japan | Repository Site, Japan |
|  | National Resource Center for Mutant Mice, China | Repository Site, China |
|  |  | Repository Site, Japan |
|  | RIKEN BioResource Center | Repository Site, Japan |
|  | National Applied Research Laboratories, Taiwan, ROC | Repository Site, Taiwan |
|  | Texas A&M Institute for Genomic Medicine | Repository Site, US |
| Phenotype data resources and centers |  |  |
|  | Australian Phenomics Network | Australian network for creation, validation, characterization, and cryopreservation |
|  | Deltagen and Lexicon Knockout Mice (at MGI) | Data generated by Deltagen/Lexicon for knockout mice acquired by NIH for use by the scientific community |
|  | Diabetic Complications Consortium | Collections of primary data available for diabetes models |
|  | European Mouse Phenotyping Resource of Standardised Screens | Database of Standard Operating Procedures (SOPs) from EUMORPHIA |
|  | European Mouse Disease Clinic | Consortia of primary phenotyping groups in Europe with a goal of assessing 600 mutant lines |
|  | Europhenome Collaborative Mouse Phenotype Resource | Access to raw and annotated mouse phenotyping data generated from EuMODIC project |
|  | International Mouse Phenotyping Consortium | Built on the IKMC effort to make knockouts for all genes in mouse, IMPC will carry out high-throughput phenotyping of these mutants |
|  | International Mouse Phenotyping Resource of Standardised Screens | Successor of EMPReSS, contains SOPs for phenotyping protocols for IMPC |
|  | Infrastructure for Phenotyping and Archiving of model mammalian genomes | Inter-European infrastructure building to support IMPC efforts and archiving/distribution of animal resources |
| KOMP pilot | KOMP Phenotyping Pilot at UC Davis | Pilot project for limited phenotype protocols using KOMP allele homozygotes |
|  | Knockout Mouse Project 2 - Phenotyping Program | Program description from NIH |
|  | Mouse Genome Pipeline, Sanger Institute, UK | Mouse resource development and phenotyping from the Sanger Institute and collaborators |
| MGI Phenotypes & Disease Models | Phenotypes & Disease Models (at MGI) | Data on spontaneous, induced, and genetically-engineered mutations, their strain-specific phenotypes, and models of human disease |
|  | Mouse Tumor Database (at MGI) | Data on tumor incidence and latency in genetically defined mice (strains, mutants), pathology data/images, models for human cancer |
|  | Mouse Metabolic Phenotyping Centers | Consortium of 6 US universities providing metabolic testing for investigators. Links to test descriptions and ordering |
|  | Mouse Phenome Database | Raw and analyzed phenotype data for strain characteristics |
|  | European Mutant Pathology Database | Database of histopathology images from mutant and genetically engineered mice |
|  | Toronto Centre for Phenogenomics | Phenotyping, imaging, pathology for mutants, distribution of stocks |
| Genetic variation data (SNPs) |  |  |
|  | Center for Genome Dynamics Mouse SNPs | Mouse SNPs, including imputed SNPs calculated from strain SNP data |
|  | The Single Nucleotide Polymorphism Database | SNP archive at NCBI, includes mouse and other species |
|  | Database of Genomic Structural Variation | Structural and base pair variation archive, multiple species, data shared with DGVa |
|  | Database of Genomic Variants archive | Structural and base pair variation archive, multiple species, data shared with dbVar |
|  | Mouse SNP Query (at MGI) | Mouse SNPs advanced query, including selection of strain comparisons and attributes |
|  | Mouse Phenome Database Mouse SNPs | Mouse SNPs, including dbSNP data and imputed SNPs |
| SNPs: Sanger Sequenced Strains | Mouse Genomes Project Browser | SNPs reflecting the newly sequenced 17 inbred strains |
| Genetic variation data (QTL) |  |  |
|  | QTL mapping using HS mice | Data from Northport Heterogeneous Stock (HS) for QTL mapping |
|  | QTL mapping using CO mice | Data from commercially available outbreds (CO) for QTL mapping |
| Sequence CO | Full genome sequencing of CO mice | Sequences for commercially available outbreds (CO) |
|  | Collaborative Cross | Strains and genotype/phenotype data for CC |
|  | Diversity Outbred | Strains and genotype/phenotype data for DO |
| QTL Archive | QTL Archive (raw data from QTL studies) | Archive of published and submitted primary data for QTL studies |
| Sequence, genes, genome browsers |  |  |
|  | Ensembl Genome Browser | Automated Genome Annotation, Genome Browser |
| EBI Sequence Read Archives | European Bioinformatics Institute | Sequence Read Archives |
| Gene/Genome Features Unification | Gene/Genome Features Unification (at MGI) | MGI harmonization of Ensembl, NCBI, Vega, MGD gene annotations |
| Mouse Genome Browser | Mouse Genome Browser (at MGI) | Genome Browser |
|  | National Center for Biotechnology Information Reference Sequence | Curated Reference Sequence |
| NCBI Sequence Archive | National Center for Biotechnology Information Sequence Read Archive | Sequence Read Archive |
|  | The Vertebrate Genome Annotation Database | Curated Gene Models |
| 17 Strains Sequences | Sanger Mouse Genomes Project | Sequence Archive for 17 inbred strains |
| UCSC browser | University of California, Santa Cruz Genome Browser | Genome Browser |
| Gene expression data |  |  |
|  | Allen Mouse Brain Atlas (Allen Brain Institute) | RNA in situ gene expression patterns in developing mouse brain (mid-gestation to juvenile) and adult |
|  |  | Functional genomics experiments, including microarray and next-generation sequencing gene expression |
|  | Brain Gene Expression Map | RNA in situ gene expression patterns from wild-type mouse nervous tissues at E11.5, E15.5, P1, P42 |
|  |  | Gene annotation portal for gene and protein function |
|  | Edinburgh Mouse Atlas Gene Expression Database | Spatial and text-based in situ gene expression data; data from publications, submissions, and large-scale projects |
|  | Embryonic Gene Expression Database for Biomedical Research Source | Whole-mount in situ hybridization gene expression pattern in E9.5, E10.5, E11.5 mouse embryos (wild-type) |
|  | European consortium for mouse gene expression | RNA in situ gene expression patterns of E14.5 wild-type mouse embryos |
|  |  | RNA in situ gene expression patterns in wild-type mice (E10.5, E14.5), head (E15.5), and brain (P7, P56) |
|  | Gene Expression Nervous System Atlas | In situ gene expression maps of mouse brain and spinal cord based on EGFP BAC transgenic reporter patterns |
|  | Gene Expression Omnibus | Repository of microarray, next-generation sequencing and other high-throughput expression data; includes web interface for query and download |
|  | GenitoUrinary Molecular Anatomy Project | Molecular atlas of gene expression for the developing organs of the GenitoUrinary tract |
|  | Gene Expression Database (at MGI) | Integrated data from publications, submissions and large-scale projects; emphasizing mouse development; covers all developmental stages; data from wild-type and mutant mice |
|  | Molecular Anatomy of the Mouse Embryo Project | Whole-mount in situ hybridization gene expression pattern in mid-gestation mouse embryos (wild-type) |
| Gene function and pathways |  |  |
|  | Gene Ontology | Gene Ontology Consortium site (all species) |
|  | Gene Ontology (at MGI) | Mouse annotations at MGI. MGI is the authoritative source for mouse GO annotations |
|  | Kyoto Encyclopedia of Genes & Genomes, mouse page | Pathways, functions, interactions data |
| MGI Biochemical Pathways | Biochemical Pathways (at MGI) | Curated biochemical pathways for mouse built with Pathway Tools Software (P Karp) |
|  |  | Manually curated, peer-reviewed database of pathways and reactions (pathway steps) |

This table lists many important mouse resources described in this Special Issue, as well as other key resources relevant to particular topics. It is an overview of current key websites for mouse resources in each category. It is not meant to be an exhaustive listing of mouse related resources, nor does it include all websites mentioned in the articles of this Special Issue (see articles of interest for additional websites)

Lastly, we are aware that the mammalian genome community represents the primary readership of Mammalian Genome. However, with this special issue we attempt to acquaint a broader biomedical community with the genetic toolbox of the mouse. We hope that you enjoy and benefit from this issue and will share it with your colleagues from other fields. In the future, we look forward to reading more about your research inspired by the approaches, tools and resources described here.

Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  1. Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, USA
  2. The Jackson Laboratory, Bar Harbor, USA
  3. MRC Mammalian Genetics Unit, MRC Harwell, UK