Background

Comparative embryology of various animal groups has suggested the idea of a conserved stage in embryogenesis. In 1828, Karl von Baer was the first to describe revolutionary laws for animal development [1], inferring that the later the stage of embryogenesis of related organisms, the less they resemble each other [1, 2]. This "progressive divergence model" was challenged by Seidel, Sander, and Elinson [3–5], who noted that earlier stages of development (such as patterns of cleavage) have relatively divergent morphological patterns among species, and so do not always fit the rule. Today, the revised version of the laws, called the "hourglass model" [6] or the "egg-timer model" [7], is widely accepted [4, 8], and the bottleneck stage, representing maximum resemblance among various vertebrates, is commonly referred to as the "phylotypic stage" [9]. However, much controversy remains about when the phylotypic stage appears; Ballard first proposed the pharyngula stage [10] as the phylotypic stage, whereas Wolpert proposed the early somite segmentation stage [11], and Slack, the tailbud stage [12]. Moreover, a few detailed morphological studies have cast doubt on the existence of a phylotypic stage [13, 14]. These problems are due mainly to the difficulty in evaluating conserved or constrained developmental stages by morphological resemblance. In particular, heterochrony, or variations in the relative rate or timing of developmental events (e.g. pharyngeal arch formation and somitogenesis start almost at the same time in the mouse, but somitogenesis begins earlier in zebrafish), makes it extremely difficult to conclude which embryonic stage is the vertebrate phylotypic stage [15]. The problem of heterochrony prevents us from concluding whether each vertebrate really has a phylotypic stage, and whether any morphological features are associated with the stage.

Recent progress in developmental biology has revealed marked similarities of the molecular mechanisms of morphogenesis in bilaterians, leading to the "zootype" hypothesis [12]. The zootype is defined by a common pattern of Hox gene expression among bilaterians, and is considered to be displayed most clearly at the phylotypic stage. We note, however, that many conserved genes other than Hox are also involved in animal development, so comprehensive evaluation of these genes should be included for a more accurate definition of the zootype. Further, the fact that they are expressed differently during development among bilaterians prevents us from concluding which stage is constrained in relation to the zootype. Recently, using mouse and human genome information, Hazkani-Covo et al. evaluated the protein distances of each gene expressed during mouse embryogenesis to show the existence of the vertebrate phylotypic stage [16]. Although they could not specify the timing of the phylotypic stage, they raised the possibility of a genome-based approach to examining the phylotypic stage hypothesis. Here, we sought to develop a novel method to evaluate the evolutionary ancestral nature of each embryonic stage comprehensively and quantitatively, without the use of comparative morphology but instead by evaluating the expression of conserved genes. By using this method, we tackled two questions. First, we asked whether the mouse embryo really passes through a highly constrained stage that is in accord with the vertebrate phylotypic stage hypothesis. Second, we asked whether the hourglass model could be expanded to cover the embryogenesis of vertebrates as members of the bilaterians, in other words, whether vertebrates pass through stages conserved among bilaterians.

Results and discussion

Estimating the ancestral nature of mouse embryogenesis

On the basis that the vertebrate phylotypic stage should express the highest ratio of genes that conceivably existed in ancestral vertebrates, or Vertebrate genes (see Methods and Figure 1 for the definition of Vertebrate genes; see also Additional file 1: Taxonomic classification of homologues of mouse protein-coding genes according to taxonomic range), we established the following index:

Figure 1

Genome datasets used to define the evolutional classifications. The evolutional classifications defined in our analyses are indicated, together with arrows covering the range of their putative evolutional origin (e.g. Vertebrate genes represent the genes that are considered to have already existed before vertebrate evolution). Numbers in parentheses indicate developmental genes and total genes classified in each evolutional classification. These classifications are consistent with recent reports [25, 26] that most of the duplications of developmental genes occurred before bilaterian and vertebrate evolution, supporting the appropriateness of our homologue determination (see Additional file 1: Taxonomic classification of homologues of mouse protein-coding genes according to taxonomic range). See also Methods for the definition of evolutional classifications and "developmental genes".

Vertebrate ancestor index at developmental stage k = V_k / N_k

where V_k is the number of non-redundant Vertebrate genes expressed at stage k, and N_k is the number of non-redundant total genes expressed at stage k.
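The index above can be illustrated with a minimal Python sketch. The function name `ancestor_index`, the gene names, and the toy expression data are hypothetical and serve only to show the arithmetic; they are not taken from the study's datasets.

```python
def ancestor_index(expressed_genes, conserved_genes):
    """Return V_k / N_k for a single developmental stage.

    expressed_genes: iterable of gene IDs expressed at stage k
    conserved_genes: iterable of gene IDs in the class of interest
                     (e.g. Vertebrate genes)
    """
    expressed = set(expressed_genes)              # non-redundant genes, N_k
    if not expressed:
        raise ValueError("no genes expressed at this stage")
    conserved_expressed = expressed & set(conserved_genes)   # V_k
    return len(conserved_expressed) / len(expressed)


# Hypothetical toy data: gene IDs are placeholders, not real identifiers.
vertebrate_genes = {"geneA", "geneB", "geneC"}
stage_expression = {
    1: ["geneA", "geneB", "geneX", "geneY"],   # 2 of 4 expressed genes are Vertebrate genes
    2: ["geneA", "geneB", "geneC", "geneX"],   # 3 of 4 expressed genes are Vertebrate genes
}

indexes = {
    stage: ancestor_index(genes, vertebrate_genes)
    for stage, genes in stage_expression.items()
}
print(indexes)
```

Using sets mirrors the "non-redundant" wording of the definition: a gene represented by several ESTs at one stage is counted once.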

Similarly, analyses with Bilaterian, Chordate, Tetrapod, and Amniote genes would also reveal the nature of each embryonic stage in a taxonomic context (see Methods for conserved gene definition).

To analyze the ancestor indexes of mouse embryonic stages, we collected mouse gene sets from the ENSEMBL database [17] and gene expression profiles from the Expressed Sequence Tags Database (dbEST) [18], and considered a gene to be "expressed" if the corresponding UniGene [19] ID was present in any of the categorized EST libraries. EST libraries categorized to each stage are listed in Table 1. Additionally, to make this analysis tolerant to fluctuations in the timing of embryonic stages, we took advantage of moving group analysis (similar to the well-known "moving average analysis" technique) to calculate the ancestor index of two sequential stages:

Table 1 EST library IDs categorized by the description of developmental stages. Each categorized mouse developmental stage contained about 22,300 (409–71,504) EST counts on average, which corresponded to 2,900 (194–5,923) non-redundant ENSEMBL gene entries on average. Developmental status was based on Manipulating the Mouse Embryo [27] and The Mouse [28].

Vertebrate ancestor index at grouped stage k = V_{k,k+1} / N_{k,k+1}

where V_{k,k+1} is the number of non-redundant Vertebrate genes expressed at stage k or k+1, and N_{k,k+1} is the number of non-redundant total genes expressed at stage k or k+1.
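The moving group calculation can be sketched in Python as follows. This is an illustrative sketch only: the function name `moving_group_index`, the `window` parameter, and the toy data are assumptions, and the code presumes that stage keys sort into consecutive developmental order.

```python
def moving_group_index(stage_expression, conserved_genes, window=2):
    """Ancestor index for each group of `window` sequential stages.

    Genes expressed at stage k or k+1 are pooled (set union) before the
    ratio is taken, so each gene is counted once per grouped stage.
    The result is keyed by the earliest stage of each group.
    """
    stages = sorted(stage_expression)
    conserved = set(conserved_genes)
    result = {}
    for i in range(len(stages) - window + 1):
        group = stages[i:i + window]
        pooled = set().union(*(stage_expression[s] for s in group))
        if pooled:                                  # skip groups with no data
            result[group[0]] = len(pooled & conserved) / len(pooled)
    return result


# Hypothetical toy data: gene IDs are placeholders.
conserved = {"geneA", "geneB"}
expression = {
    1: ["geneA", "geneX"],
    2: ["geneA", "geneB", "geneX"],
    3: ["geneY", "geneZ"],
}

grouped = moving_group_index(expression, conserved)
print(grouped)   # grouped stage 1 pools stages 1-2; grouped stage 2 pools stages 2-3
```

Pooling by union, rather than averaging the two single-stage indexes, follows the definition above, in which both numerator and denominator count non-redundant genes expressed at stage k or k+1.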

With these methods, we calculated the bilaterian, chordate, vertebrate, tetrapod, and amniote ancestor indexes (Figure 2A–E).

Figure 2

Moving group analyses of individual ancestor indexes obtained during embryogenesis. A–E: Moving group analysis of the ancestor indexes within all genes (25,613 genes). Grouped stage number represents the earlier stage of a pair of developmental stages (e.g. gene expression of grouped stage 14 represents non-redundant genes expressed at stage 14 or 15; see Table 1 for staging). Each stage contained on average 22,744 EST counts (4,418–44,210), which corresponded to on average 4,396 (1,493–6,507) non-redundant ENSEMBL M. musculus genes. F–J: Moving group analysis of the ancestor indexes within developmental genes. Each stage contained on average 1,479 EST counts (282–3,406), which corresponded to 303 (94–491) non-redundant ENSEMBL M. musculus genes. E, M and L indicate the early, middle and late embryonic periods, covering non-redundant genes expressed in stages 2–7, 11–18 and 25–31, respectively. Statistical analysis was performed with Fisher's exact test. Significant differences are indicated by asterisks. n.s.: P > 0.05. Within each graph, relative areas of circles reflect the number of expressed (total or developmental) genes at each stage.

What surprised us initially was the appearance of an early peak around grouped stages 4 to 10 from the analyses of Bilaterian and Chordate genes (Figure 2A,B). Broadly evaluated analysis also supported this tendency; both the early (stages 2–7) and middle embryonic periods (stages 11–18) have significantly higher ancestor indexes than the late period (stages 25–31). The prominently higher ancestor indexes shown in Figure 2A,B indicate the existence of novel, well-conserved stages conceivably carrying characteristics of the bilaterians at around the cleavage to blastula stages in mouse embryogenesis.

Vertebrate phylotypic stage in mouse embryogenesis

We next calculated ancestor indexes to search for a putative phylotypic stage in mouse embryogenesis. In contrast to the bilaterian and chordate ancestor indexes, vertebrate, tetrapod, and amniote ancestor indexes did not show prominent peaks (Figure 2C–E). Instead, we obtained weakly curving graphs peaking around grouped stages 8 to 18 (Figure 2C–E), covering the presumptive periods of the vertebrate phylotypic stage proposed by comparative morphological studies (such as the pharyngula stage [10], the early somite segmentation stage [11], and the tailbud stage [12]). Importantly, the vertebrate ancestor index of the middle embryonic period was significantly higher than those of both the early and late embryonic periods (Figure 2C), as also seen in the tetrapod and amniote ancestor indexes (Figure 2D,E). All of these curves (Figure 2C–E) were significantly different from those obtained and shown in Figure 2A,B (P < 0.001; see Methods for the evaluation of ancestor index distribution).

If we were to see the body plan and morphological patterning as readouts of genetic programs for development, then the ancestor indexes calculated for developmental genes would refine our understanding of the conservation of the stages between 8 and 18. On this assumption, we focused on mouse developmental genes defined in the Gene Ontology database (see Methods for the definition of developmental genes) to derive ancestor indexes of individual stages. The highest ancestor index occurred at grouped stage 14 (stages 14–15), and was significantly higher than values of both the early and late embryonic periods (Figure 2H–J). Similar peaks in vertebrate, tetrapod, and amniote ancestor indexes indicate that E8.0–8.5 has been the stage most conserved throughout the evolution of vertebrates. Additionally, the peak of the amniote ancestor index at grouped stage 14 (Figure 2J) indicates that these stages express the smallest ratio of newest developmental genes. In light of the recent report that newer genes tend to evolve faster than older genes [20], the E8.0–8.5 period is the most genetically unmodifiable or contains the strongest developmental constraints in mouse embryogenesis. Thus, our analyses with developmental genes reveal that mouse embryogenesis indeed has a highly conserved stage characteristic of ancestral vertebrates, conceivably at around E8.0 to E8.5 (the period of onset of pharyngeal arch formation and somitogenesis), which is in accord with the concepts of a vertebrate phylotypic stage [6]. Notably, many of the mice with mutations in vertebrate developmental genes expressed at the putative vertebrate phylotypic stage have fatal or systemic phenotypes: embryonic or perinatal lethal phenotypes (52%), and abnormal phenotypes of the nervous (33%), craniofacial (14%), cardiovascular (24%), respiratory (12%), or skeletal (18%) system (see Additional file 2: Mutant phenotypes of vertebrate developmental genes expressed at the phylotypic stage). Thus, many of these gene products could be critical factors for defining fundamental morphological features of embryos. This information in turn implies that mutations of developmental genes expressed at the putative phylotypic stage are not easily accommodated, which also supports the idea that this stage tends to be evolutionarily conserved.

Existence of bilaterian-related stage

Bilaterian and chordate ancestor indexes calculated with developmental genes again showed patterns significantly different from those of vertebrate genes (Figure 2F,G vs 2H–J; P < 0.001), with their highest values at grouped stages 3 and 10. Similar to the results obtained with genome-level analysis, both of these peaks were earlier than the putative vertebrate phylotypic stage, spanning from the period of cleavage to the onset of gastrulation (Table 1). These results suggest that this early embryonic period carries the basic body plan of bilaterians (i.e. bilaterally symmetric), making it the "bilateriotypic" stage. Gene expression information of Ciona intestinalis embryogenesis further allowed us to verify the existence of the putative bilaterian-related stage in this organism at the early embryonic stage (see Additional file 3: Ancestor indexes obtained for genes expressed during Ciona intestinalis embryogenesis). The highest value of bilaterian ancestor indexes of genes expressed during C. intestinalis embryogenesis occurred at the cleavage stage, and was significantly higher than the ancestor index of the later embryonic stage (combined expression data of larval and juvenile stages). However, we could not obtain evidence at the level of "developmental genes" owing to the scarcity of gene ontologies associated with C. intestinalis genes.

Conclusion

We have developed a comprehensive quantitative method to evaluate evolutionarily constrained stages during the development of individual embryos. Our results show that mouse embryogenesis indeed passes through a highly constrained stage, as the hourglass model implies, at the mid-embryonic period (E8.0–8.5). They also show that the stage may be the most developmentally conserved stage since the evolution of ancestral vertebrates, making it a candidate for the phylotypic stage. Notably, it exhibits the major characteristics of the phylotypic stage proposed by morphological studies: pharyngula stage [10], early somite segmentation stage [11], and tailbud stage [12]. At the pharyngula stage, animals should contain a series of paired branchial grooves, a notochord, a post-anal tail, and a dorsal hollow nerve cord. In mice, maxillary components of the 1st branchial arch become prominent at about E8.0. The 3rd branchial arch is still absent when mandibular components appear at around E8.5. The first somite pairs appear at E8.0, and 8 to 13 somite pairs are found at E8.5. The tailbud stage is E9.5–10.5, later than the pharyngula and early somite segmentation stages. Thus, the period of E8.0–8.5 corresponds to the onset of the pharyngula stage and the early somite segmentation stage in mice. However, the exact characteristics of the phylotypic stage cannot be concluded from our results, as it should be common to all vertebrate embryos. Our method will be a powerful tool for comparison among diverse vertebrates. In contrast to morphological studies that evaluate developmental constraints qualitatively by morphological resemblance among embryos, our genome-based method evaluates developmental constraints quantitatively in a single species. This latter point is especially important to the testing of the phylotypic stage hypothesis: our method allows us to define constrained or conserved stages in each organism, and thus to compare data among various vertebrate embryos. This approach would clarify the common developmental events and gene sets that characterize the phylotypic stage (e.g., whether or not somitogenesis is dissociated [21] from the stage).

In addition, our results suggest the existence of a bilateriotypic stage within the cleavage to gastrulation period in mouse embryogenesis, and the possible conservation of this stage in C. intestinalis. These results demonstrate that although the early developmental period (e.g., cleavage) is relatively modifiable on an evolutionary time-scale compared with the phylotypic stage, the cleavage-to-gastrulation period still retains highly ancestral characteristics. On the basis of this perspective, we propose that mouse embryogenesis possesses two distinct, nested hourglasses, the vertebrate and bilaterian ones. These hourglasses may represent two major developmental constraints from distinct evolutional pasts – ancestors of vertebrates and bilaterians – in mouse embryogenesis. This extended version of the developmental hourglass model once again highlights the questioned, hierarchical nature of embryogenesis proposed by von Baer, albeit partly: "General features common to a large group of animals appear earlier in the embryo than do specialized features." Similar and further analyses in various animals will be required to verify this model, to explain how the bilateriotypic period defined by our analysis is related to the "zootype" hypothesis proposed by Slack [12], and to clarify certain aspects of the cleavage period shared among bilaterians [22]. Comprehensive gene expression data of various animals (e.g. fishes, birds, amphibians, insects, roundworms, and outgroup animals) and further development of gene ontology (GO) databases would make this kind of analysis more effective and definitive.

Defining conserved stages and extracting genes by our method would be a powerful tool to answer what kind of developmental events and gene sets are evolutionarily constrained. The information in turn would help explain how the conserved stages limit the possible variations of animal body plan in the context of animal evolution.

Methods

Genome and homologous gene dataset

Genome and homologue information for each gene were taken from the ENSEMBL database [17] (v37, February 2006) using the data mining system BioMart [23]. Only "protein coding" gene types were used in our analysis (total of 25,613 ENSMUSG IDs in the Mus musculus genome; 14,278 ENSCING IDs in the C. intestinalis genome). Homologues represent reciprocal BLAST hits (including UBRH: unique best reciprocal hits; MBRH: multiple best reciprocal hits; and RHS: reciprocal hits based on synteny information) with an expectation (E)-value < 1e-10 (WU BLASTP, BLOSUM62). The genome data sets were versions NCBI m34 for M. musculus and JGI 2 for C. intestinalis. Taking gene loss during evolution into consideration [24], we defined the following evolutional classifications, which depend on the presence or absence of at least one homologue in a taxon of interest (see also Figure 1). Bilaterian genes: genes that have at least one homologue in protostomes. Chordate genes: in addition to genes classified as Bilaterian genes, genes that have at least one homologue in urochordates. Vertebrate genes: in addition to genes classified as Chordate genes, genes that have at least one homologue in teleosts. Tetrapod genes: in addition to genes classified as Vertebrate genes, genes that have at least one homologue in amphibians. Amniote genes: in addition to genes classified as Tetrapod genes, genes that have at least one homologue in birds (Aves).
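The nested classification rules can be sketched in Python as shown below. This is a hedged illustration, not the pipeline used in the study: the function name `classify`, the taxon labels, and the input format (a set of taxa in which a gene has at least one homologue) are assumptions.

```python
# Taxa are ordered from the most distant to the closest relative of the mouse.
# Labels are illustrative; each taxon adds genes to all younger classes.
TAXON_ORDER = ["protostomes", "urochordates", "teleosts", "amphibians", "aves"]
CLASS_NAMES = ["Bilaterian", "Chordate", "Vertebrate", "Tetrapod", "Amniote"]


def classify(homologue_taxa):
    """Return every evolutional classification a gene belongs to.

    homologue_taxa: set of taxon labels in which the gene has at least one
    homologue. Classes are cumulative: a gene in an older class is also in
    every younger class, so a homologue lost in one intermediate taxon
    does not remove the gene from later classes.
    """
    taxa = set(homologue_taxa)
    classes = []
    member = False
    for taxon, name in zip(TAXON_ORDER, CLASS_NAMES):
        member = member or taxon in taxa
        if member:
            classes.append(name)
    return classes


print(classify({"protostomes"}))          # member of all five classes
print(classify({"teleosts", "aves"}))     # Vertebrate, Tetrapod, Amniote
print(classify({"aves"}))                 # Amniote only
```

The cumulative structure reflects the wording "in addition to genes classified as ...": for example, a gene with a protostome homologue but no detected teleost homologue is still counted as a Vertebrate gene.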

Expression profile dataset

All the stage-traceable EST libraries of M. musculus and C. intestinalis were manually downloaded from the NCBI UniGene database [19] and categorized by their developmental stage descriptions (Table 1). We extracted all UniGene IDs with at least one entry in each categorized EST library, and further linked them to the ENSEMBL gene IDs. ENSEMBL genes with at least one of these UniGene IDs were considered as "expressed".
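The linking step described above can be sketched in Python. The data structures (three plain dictionaries), the function name `expressed_genes_by_stage`, and all IDs in the toy data are hypothetical placeholders; the actual file formats of the UniGene and ENSEMBL downloads are not reproduced here.

```python
from collections import defaultdict


def expressed_genes_by_stage(library_stage, library_unigene, unigene_to_ensembl):
    """Collect the non-redundant set of expressed gene IDs for each stage.

    library_stage:      EST library ID -> developmental stage number
    library_unigene:    EST library ID -> UniGene IDs with at least one EST
    unigene_to_ensembl: UniGene ID -> linked ENSEMBL gene IDs
    """
    expressed = defaultdict(set)
    for library, unigene_ids in library_unigene.items():
        stage = library_stage.get(library)
        if stage is None:
            continue                      # library without a traceable stage
        for unigene_id in unigene_ids:
            # A gene counts as "expressed" if any linked UniGene ID is present.
            expressed[stage].update(unigene_to_ensembl.get(unigene_id, ()))
    return dict(expressed)


# Hypothetical toy data: all identifiers are placeholders.
library_stage = {"lib1": 1, "lib2": 1, "lib3": 2}
library_unigene = {
    "lib1": ["ug1", "ug2"],
    "lib2": ["ug2"],
    "lib3": ["ug3"],
    "lib4": ["ug1"],                      # not assigned to any stage, ignored
}
unigene_to_ensembl = {"ug1": ["geneA"], "ug2": ["geneB", "geneC"], "ug3": ["geneA"]}

by_stage = expressed_genes_by_stage(library_stage, library_unigene, unigene_to_ensembl)
print(by_stage)
```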

Statistical analysis

For comparing ancestor indexes between stages, we constructed a 2 × 2 contingency table and determined significant differences between the ratios (e.g. non-redundant Vertebrate gene numbers expressed at stages k or k+1 and non-redundant total genes expressed at stages k or k+1) by Fisher's exact test (two-tailed). To test the distribution pattern of ancestor indexes, we converted the number of expressed Bilaterian, Chordate, Vertebrate, Tetrapod, and Amniote genes in each grouped stage to a cumulative frequency distribution, and compared these survival curves (Kaplan-Meier method) by the Generalized Wilcoxon test. For all the statistical tests, P-values less than 0.05 were considered significant.
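The comparison of two ancestor indexes by Fisher's exact test can be sketched in Python using only the standard library. The layout of the 2 × 2 table (rows: the two stages or periods; columns: conserved genes and the remaining expressed genes) is an assumption made for illustration, as are the function names and the toy counts. The two-tailed P-value is computed by the common convention of summing the probabilities of all tables that are no more likely than the observed one.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of a table with x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum tables at least as extreme as the observed one; the small relative
    # tolerance guards against floating-point ties.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(p_value, 1.0)


def compare_ancestor_indexes(v1, n1, v2, n2):
    """Compare V1/N1 with V2/N2 (assumed table layout, see lead-in)."""
    return fisher_exact_two_tailed(v1, n1 - v1, v2, n2 - v2)


# Hypothetical counts: 8 of 10 genes conserved in one period, 1 of 10 in another.
p = compare_ancestor_indexes(8, 10, 1, 10)
print(p, "significant" if p < 0.05 else "n.s.")
```

With genome-scale counts, an established statistics library would normally be preferable; the hand-written version is shown only to make the test's logic explicit.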