Introduction

In multicellular organisms, the spatial organization and function of a variety of cell types are a direct consequence of information encoded in the genome. The specification of different cell fates during development depends on the activation of gene expression programs that endow cells with their specific cellular functions. Controlled changes in the transcriptional state of cells are therefore mechanistically responsible for the propagation of different cell fate trajectories. Gene regulatory networks (GRNs) provide a developmental control system for cell fate specification by regulating the expression of transcription factors and signaling molecules and thereby of any developmental gene expression program that defines the animal body plan1,2. In several developmental contexts, GRNs have now been experimentally characterized and shown to control various aspects of cell fate specification, including the spatial organization of early embryos, cell fate decisions, and the differentiation of cell types, for instance, in sea urchins, Ciona, Drosophila, and vertebrates1,2,3,4,5,6,7,8,9,10,11,12.

A GRN that is experimentally well resolved controls the specification of the endomesoderm in early sea urchin embryos3,4,13,14,15,16. This GRN includes ~50 transcription factors and signaling molecules, interconnected by regulatory interactions that together control the specification of endomesodermal cell fates. The sufficiency of this GRN to control cell fate-specific gene expression has been demonstrated by computational modeling. Thus, a Boolean logic model that computes discrete expression states of all regulatory genes in this network on the basis of their regulatory interactions correctly recapitulates gene expression temporally as well as spatially throughout a 30 h developmental process17,18. This model indicates that the developmental function of GRNs is established not by subtle changes in gene expression levels but by changes in the ON/OFF state of transcription factor expression and function.
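As a toy illustration of how a Boolean model of this kind computes discrete ON/OFF states from regulatory interactions, the following Python sketch may be helpful. The three genes (`geneA`, `geneB`, `geneC`), their logic rules, and the synchronous update scheme are all hypothetical assumptions chosen for brevity; this is not the published endomesoderm model.

```python
# Toy synchronous Boolean network. Gene names and rules are hypothetical
# and are not taken from the sea urchin endomesoderm GRN model.

def step(state):
    """Compute the next ON/OFF state of every gene from the current state."""
    return {
        "geneA": state["geneA"],                         # maintained input
        "geneB": state["geneA"] and not state["geneC"],  # activated by A, repressed by C
        "geneC": state["geneB"],                         # activated by B
    }

def simulate(initial, n_steps):
    """Return the list of states visited, starting with the initial state."""
    trajectory = [dict(initial)]
    for _ in range(n_steps):
        trajectory.append(step(trajectory[-1]))
    return trajectory

if __name__ == "__main__":
    start = {"geneA": True, "geneB": False, "geneC": False}
    for t, s in enumerate(simulate(start, 4)):
        # Print the genes that are ON at each time step.
        print(t, sorted(g for g, on in s.items() if on))
```

Even in this minimal sketch, the set of genes that are ON changes from step to step purely as a consequence of the wiring, which is the sense in which a regulatory state is computed from regulatory interactions.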

The implication of current insights is that cell fate specification depends on the expression of cell fate-specific combinations of transcription factors, or regulatory states19. Given the extensive regulatory transactions that are required for the specification of a few cell fates during early sea urchin development, and given that this process involves a substantial fraction of the roughly 300–350 regulatory genes encoded in the sea urchin genome, the question that arises is what extent of regulatory information will be necessary to specify all cell types during the development of a multicellular organism. If combinatorial regulatory states provide a major determinant of cell function, then it should be true that equivalent cell fates express equivalent regulatory states and that any differences in morphology and function must be caused by differences in the combinatorial expression of transcription factors. All developmental decisions leading to the diversification of cell fates should therefore involve changes in transcription factor expression that precede the appearance of morphological and functional differences. Accordingly, the expression of combinatorial regulatory states should reflect the cell fate specification process. Although there are many indications that this is indeed the case, the rate and extent of change in transcription factor expression that drives cells from precursor state to differentiated cell type and the differences among regulatory states expressed in cell types throughout an organism are, in most cases, not known.

Here we address these questions by identifying the combinatorial regulatory states that are expressed in response to global embryonic GRNs during development from pre-gastrular embryo to pluteus larva in the purple sea urchin Strongylocentrotus purpuratus (Fig. 1a). Unlike adult sea urchins, sea urchin larvae are bilateral and possess body parts similar to other bilaterian animals, including nervous system, tripartite gut, skeleton, muscle cells, and immune system, all formed within about 3 days of fertilization. The expression of just over 200 transcription factors in this process indeed indicates that temporally and spatially specific combinatorial regulatory states provide the molecular basis for embryonic cell fate specification.

Fig. 1: Identification of regulatory states based on genome-wide spatial expression data for transcription factors.

a Diagram showing the aim of this study to determine the putative differences in the combination of expressed transcription factors in cell fate domains of the sea urchin embryo, here represented by two domains in green and orange, at five developmental stages. b Examples of spatial expression profiles for selected regulatory genes during sea urchin embryogenesis, including annotation of expression state in distinct embryonic regions: ptf1a in the ciliated band and apical ectoderm (36–48 hpf) and in endoderm (48–72 hpf), nkx2.1 in the apical ectoderm and oral ectoderm, and mitf in skeletogenic cells. Images show representative expression patterns observed in >50 embryos. c Temporal expression of regulatory genes for which expression data were obtained by WMISH. d Diagram of sea urchin pluteus larva showing the broader embryonic regions used to annotate gene expression states. SKM skeletogenic mesoderm, MES non-skeletogenic mesoderm, END endoderm, APE apical plate ectoderm, CBE ciliary band ectoderm, OE oral ectoderm, ABO aboral ectoderm. Scale bars represent 20 μm.

Results

Transcription factor expression during sea urchin embryogenesis

The sea urchin genome includes about 300–350 regulatory genes encoding DNA binding transcription factors, of which approximately 80% are expressed during embryogenesis20,21,22,23. These represent all major families of transcription factors, including about 50 verified transcriptional regulators of the zinc finger family, although not counting zinc finger proteins of unknown function22,24. To identify transcription factors that are expressed during sea urchin embryogenesis, we analyzed available quantitative developmental transcriptome data20,22. Using a threshold expression level of >300 transcripts per embryo, corresponding to approximately 10–15 transcripts/cell in 20–30 cells, there are 240 regulatory genes in this dataset that are expressed during development from pre-gastrular embryo to pluteus larva stage, between 24 and 72 h post-fertilization (hpf). Of these, we analyzed the spatial expression of 230 regulatory genes (Supplementary Data 1), corresponding to >90% of expressed transcription factors, using whole-mount in situ hybridization (WMISH). In situ hybridizations were performed at five developmental stages that represent the mesenchyme blastula stage prior to gastrulation (24 hpf), early and late gastrulation (36 and 48 hpf), and differentiation of various cell types composing pluteus larvae (60 and 72 hpf; Fig. 1b). Gene expression at any developmental stage was observed for 210 regulatory genes, of which 120 are expressed prior to gastrulation, increasing to about 200 regulatory genes by larval stage (Fig. 1c)25.
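The threshold arithmetic in this paragraph can be checked with a short calculation. The sketch below is illustrative only: the function names and the example transcript counts are hypothetical, and only the cutoff of 300 transcripts per embryo and the range of 20–30 expressing cells are taken from the text.

```python
THRESHOLD = 300  # transcripts per embryo, as stated in the text

def transcripts_per_cell(per_embryo, n_cells):
    """Average transcripts per cell if expression is confined to n_cells."""
    return per_embryo / n_cells

def passes_threshold(peak_counts, threshold=THRESHOLD):
    """Return the genes whose peak abundance strictly exceeds the threshold."""
    return {gene for gene, count in peak_counts.items() if count > threshold}

if __name__ == "__main__":
    print(transcripts_per_cell(300, 30))  # 10.0
    print(transcripts_per_cell(300, 20))  # 15.0
    # Hypothetical peak transcript counts per embryo for three genes.
    example = {"tf_low": 120, "tf_edge": 300, "tf_high": 4500}
    print(sorted(passes_threshold(example)))  # ['tf_high']
```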

Cell fates expressing unique regulatory states at larval stage

Although cell lineage fate maps have been generated for early embryonic blastomeres26,27, the spatial organization of the pluteus larva has been described mostly at the level of broader embryonic regions, which include the skeletogenic mesoderm (SKM), non-skeletogenic mesoderm (MES), endoderm (END), apical plate ectoderm (APE), ciliated band ectoderm (CBE), and oral and aboral ectoderm (OE and ABO; Fig. 1d). The spatial expression of the >200 transcription factors, however, indicates that the sea urchin pluteus larva encompasses far more cell fates than previously described. To achieve a detailed characterization of cell fate-specific regulatory states, we therefore initially identified throughout the pluteus larva all spatial domains that show clear differences in the expression of transcription factors and that have the potential to establish unique cell fates. At the 72 hpf larval stage, such cell fate domains were identified separately in each embryonic region by comparing the expression domains of specifically expressed regulatory genes, as shown in Supplementary Fig. 1. We identified >70 potential cell fate domains throughout the sea urchin larva that were annotated using identifiers for each embryonic region and numerical codes for individual cell fates (Fig. 2a, Supplementary Fig. 1, Supplementary Data 2).

In the skeletogenic mesoderm, at least eight distinct regulatory states reflect the spatial organization of the skeleton28. Thus, the vegetal body rod is subdivided along the oral-aboral axis by expression of soxd (SKM1a), tbr (SKM1b), and nr1m3 (SKM2), and is distinct from the oral cluster of skeletogenic cells expressing nfia (SKM3), from skeletogenic cells of the lower arms expressing pitx1 (SKM4), from the transverse rod expressing alx1, tbx20 and ets1/2 (SKM5), and from the oral lateral rods expressing alx1 but not ets1/2 (SKM6). The non-skeletogenic mesoderm (MES) includes bilateral coelomic pouches, the left of which gives rise to the juvenile sea urchin later in development. Left/right asymmetry is established by expression of not and pitx2 in right (MES3) but not left (MES1) coelomic pouches and at least three regulatory states separate anterior, central, and posterior cell fates in both coelomic pouches (MES1/3a,c,b; Supplementary Figs. 1 and 2)29,30. Additional mesodermal cell fates include the hydropore canal expressing id and pax6 (MES2)31, nkx3-2 expressing esophageal muscles (MES4)32, and several mesenchymal cell fates that migrate within the blastocoel and establish the larval immune system33. These include at least two types of blastocoelar cells (MES5a,b) and pigment cells expressing scl and gcm that intercalate into the aboral ectoderm (MES6)34. A small group of mesenchymal cells expressing nfe2 furthermore contributes to the oral skeletal rod (MES7).

The endoderm (END) consists of a through gut that includes morphologically distinct foregut, midgut, and hindgut compartments. Based on transcription factor expression, the gut includes at least 11 cell fates along the anterior/posterior axis: the anterior foregut expressing hmg2 (FG1,2), posterior foregut expressing irxa (FG3,4), cardiac sphincter expressing osr and ahrl (CSP1,2), anterior midgut expressing six3 (MG1,2), anterior/center midgut expressing ptf1a (MG3,4), posterior/center midgut expressing cebpa (MG5,6), posterior midgut expressing nkx6.1 and lox/pdx1 (MG7,8)35, pyloric sphincter expressing osr and ese (PSP1,2), anterior hindgut expressing rhox3 (HG1,2), posterior hindgut expressing cdx and hox11/13b (HG3,4), and the anus expressing bra and osr (AN1,2). The gut is further subdivided into oral and aboral endoderm by expression of klf3/8/12 and hairy2/4 (FG1-AN1), and tbx2/3 (FG2-AN2).

The neurogenic ectoderm includes the apical plate ectoderm (APE), which forms 6–8 neurons, including serotonergic neurons, and the ciliary band ectoderm (CBE) which gives rise to 30–35 neurons36. In the apical organ, distinct regulatory states are expressed both along the oral/aboral and medial/lateral axis37. The oral apical domains (APE1-3) express foxg and gsc, with nkx3-2 expression restricted to medial cell fates (APE1) and rx expressing photoreceptors at the distal periphery (APE3)31. The central apical domains intersect with the ciliary band and express lhx2, paxc and nkx2.1 in the medial domains (APE4,5a) and rx and emx in the distal domains (APE6a,b)38. Serotonergic neurons (APE5b) form between central and aboral apical organ and express soxc, nkx2.1, nkx3-2, and lhx2, while cells in the aboral apical ectoderm express hbn, rx, and tbx2/3 (APE7,8a,b)37,39,40. The entire ciliary band, which surrounds the oral ectoderm, expresses hnf6 and z16641 and includes, in addition to the central apical domains, two lateral domains expressing msxl (CBE4) and sp5 (CBE5), domains across the lower arms expressing emx (CBE6) and nk2-2 (CBE7), and sensory neurons that are distributed throughout the ciliary band, here represented as neuronal regulatory state (CBE8) that may include several types of neurons.

Finally, the oral ectoderm includes cells positioned anteriorly (OE7), posteriorly (OE8), and laterally (OE6) to the oral opening, in addition to an anterior region expressing nkx2.1 and gsc (OE5) and two posterior regions expressing ese and gsc (OE9) and foxa, bra and glis1 (OE10). The outer boundary of the oral ectoderm is formed by foxg expressing cells aligning the ciliary band and includes cell fates expressing msxl (OE4), sp5 (OE11), emx (OE12), and nk2-2 (OE13), and postoral neurons (OE14) expressing ese, soxc and nk2-239. The aboral ectoderm is a structurally homogenous layer of cells that expresses five distinct regulatory states: an anterior domain expressing dmrta2 and irxa (ABO1), a more central domain (ABO2) expressing irxa and hox7, the tip of the aboral ectoderm (ABO3) expressing eve and hox7 but not irxa, a vegetal domain (ABO4) expressing eve, hox7 and irxa, and a domain next to the ciliary band (ABO5) expressing irxa and hmx but not hox7 or eve.

A summary diagram of all identified spatial domains expressing unique regulatory states throughout the sea urchin larva is shown in Fig. 2a, demonstrating that, without exception, larval domains with distinct morphology and function also express different sets of transcription factors. Differences in transcription factor expression were observed among the seven major embryonic regions, and within each region among morphologically recognizable substructures such as foregut, midgut, hindgut, aboral apex, and stomodeum, as well as among known cell types such as apical and ciliary neurons, mesenchymal blastocoelar cells and skeletal cells. These results confirm that cell fates are specified by unique combinatorial regulatory states that are captured by the set of transcription factors included in this study. Additional spatial domains might have remained undetected that are specified by transcription factors not included here or that express combinations of transcription factors not identified based on single gene WMISH. In several embryonic regions, for instance, in the apical organ and in the gut, we identified previously unknown spatial domains that likely represent functionally distinct cell types. However, distinct regulatory states are also expressed in some cases in cells that appear structurally and functionally homogenous, such as in the aboral ectoderm, where they might not control the specification of distinct cell types but the localized expression of different signaling ligands that in turn affect the specification of other cell fates. Although the function of regulatory states in each developmental context will have to be established elsewhere, we here consider the identified spatial domains as cell fate domains in terms of their unique regulatory potential.

Fig. 2: Spatial arrangement and trajectories of cell fates specified during sea urchin embryogenesis.

a Schematic summary showing the spatial arrangement of >70 cell fate domains that were identified based on the spatial expression of regulatory genes in the pluteus larva at 72 hpf. Regulatory state domains and morphological descriptions are listed in Supplementary Data 2. b Developmental lineage of cell fate domains that are specified during sea urchin embryogenesis to form the cell fates shown in (a). Cell fate lineages were identified based on molecular expression data shown in Supplementary Figs. 1 and 2. Lineages prior to 24 hpf are based on data from refs. 3,27,38,42. c Spatial arrangement of cell fate domains at the five developmental stages, color-coded as in (b). SKM skeletal mesoderm, MES mesoderm, END endoderm, APE apical plate ectoderm, CBE ciliated band ectoderm, OE oral ectoderm, ABO aboral ectoderm.

Developmental organization of cell fate specification

The seven major embryonic regions are distinctly specified by 24 hpf, before gastrulation starts1,3,4,13,37,38,42. Additional cell fate specification processes spatially organize the entire embryo throughout embryogenesis. To reconstruct the specification of cell fates during sea urchin development, we also identified cell fate domains that are present at stages between 24 and 60 hpf and matched corresponding cell fates linked by ancestry across the five developmental stages using both transcription factor expression and the location of cells within the embryo. Evidence for identifying corresponding cell fate domains at different developmental stages is shown in Supplementary Figs. 2 and 3. For instance, all skeletogenic cells express alx1 and tbr at 24 hpf, indicating a common skeletogenic precursor cell fate, but the more restricted expression of additional transcription factors at 36 hpf and beyond shows the distinct specification of cells of the different skeletal rods before the onset of skeletogenesis3,28 (Supplementary Fig. 2). Thus, for each cell fate present at larval stage, precursor cell fates were identified at all four preceding developmental stages, revealing in some instances the precursors of multiple cell fates that become spatially resolved later in development.

The resulting cell fate trajectories show unexpected differences in the specification of ectodermal versus endodermal and mesodermal cell fates (Fig. 2b). In the ectoderm, which undergoes few structural changes during gastrulation, 21 of the 32 cell fates are already distinctly specified by 24 hpf. In the endoderm and mesoderm, most cell fates are specified during gastrulation, when changes occur in the spatial arrangement of cells. The early spatial organization of ectodermal cell fates again suggests that ectodermal domains, through secretion of signaling molecules, serve as a spatial coordinate system for the subsequent specification of endodermal and mesodermal cell fates, as indicated by several previous studies29,43,44,45.

These results clearly show that the expression of cell fate-specific regulatory states precedes morphological and functional diversification. For example, anterior and posterior endoderm express distinct regulatory states at 18 hpf, foreshadowing differences in migratory behavior during gastrulation4,46. Endodermal foregut, midgut, and hindgut are molecularly specified by 36 hpf, although morphological distinctions between these compartments become apparent only at 48–60 hpf. Furthermore, the position of the ectodermal oral opening is defined by the expression of distinct regulatory states by 24 hpf, long before the oral opening forms at 60–72 hpf. Thus, although the sea urchin embryo shows few morphologically distinct domains by 36 hpf, over half of the cell fates present at the larval stage are already molecularly distinct (Fig. 2b, c).

Combinatorial expression of transcription factors during cell fate specification

The expression of the 230 regulatory genes was manually annotated in all cell fate domains throughout the embryo at the five developmental stages, by presence or absence of expression (Fig. 3a; see “Methods”). Approximately 20 transcription factors showed no clear patterns of expression. However, with expression levels of <3000 transcripts/embryo and approximately 1800–2000 cells present at 72 hpf, the expression of these genes amounts to <2 transcripts/cell, which is most likely not sufficient to produce functional levels of transcription factors47. For the remainder of this study, weakly expressed regulatory genes are considered not expressed. Remarkably, none of the transcription factors are expressed ubiquitously at 72 hpf, and fewer than 10 regulatory genes are broadly expressed in more than half of the cell fates at any stage in development (Fig. 3b, Supplementary Fig. 4)22,48. Thus >95% of transcription factors show specific spatial expression, particularly at larval stages. On the other hand, and perhaps even more importantly, <5% of regulatory genes are expressed exclusively in one cell fate throughout development, and ~15% are expressed specifically in 1–2 cell fates at 72 hpf (Fig. 3b, Supplementary Fig. 4). Furthermore, about a third of the transcription factors are expressed in <10 cell fates throughout development, also showing restricted spatial specificity. As shown in the expression matrix in Fig. 3a, a vast majority of transcription factors are therefore expressed in unique spatial and temporal expression patterns, with very few redundancies in this system.
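One way to summarize a presence/absence annotation of this kind is to count, for each regulatory gene, the number of cell fate domains in which it is expressed. The sketch below assumes a simple mapping from gene to the set of expressing domains; the gene names, domain names, and category labels are placeholders for illustration, not data or terminology from this study.

```python
# Presence/absence annotation: gene -> set of cell fate domains where it is ON.
# Gene and domain names are placeholders, not data from the study.
expression = {
    "tf1": {"D1"},
    "tf2": {"D1", "D2"},
    "tf3": {"D1", "D2", "D3", "D4"},
    "tf4": set(),  # weakly expressed, treated as not expressed
}
all_domains = {"D1", "D2", "D3", "D4"}

def breadth(expression):
    """Number of expressing domains per gene."""
    return {gene: len(domains) for gene, domains in expression.items()}

def classify(expression, all_domains):
    """Label each gene by how many of the domains express it."""
    labels = {}
    total = len(all_domains)
    for gene, domains in expression.items():
        n = len(domains)
        if n == 0:
            labels[gene] = "not expressed"
        elif n == total:
            labels[gene] = "ubiquitous"
        elif n > total / 2:
            labels[gene] = "broad"       # more than half of the domains
        elif n == 1:
            labels[gene] = "exclusive"   # a single domain
        else:
            labels[gene] = "restricted"
    return labels

if __name__ == "__main__":
    print(breadth(expression))
    print(classify(expression, all_domains))
```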

Fig. 3: Complexity of combinatorial regulatory states.

a Expression matrix showing hierarchically clustered spatial expression data of 230 regulatory genes in all cell fate domains at five developmental stages. Rows show temporal and spatial expression profiles of regulatory genes, columns show regulatory states expressed in each cell fate at 24, 36, 48, 60, and 72 hpf. b Distribution of the spatial expression of transcription factors, showing the number of expression domains per transcription factor between 24 and 72 hpf. c Venn diagram showing the number of regulatory genes expressed in mesoderm, endoderm, or ectoderm cell fates, with about a third expressed in all three. d Diagram showing for each cell fate domain, by embryonic region, the total number of transcription factors expressed between 24 and 72 hpf, indicating the complexity of cell fate-specific GRNs. Individual data points are shown in color, with mean values shown as bold lines, 25th to 75th percentiles represented by box plots and entire data range except outliers by whiskers, for SKM (n = 7), MES (n = 12), END (n = 22), APE (n = 11), CBE (n = 5), OE (n = 11), ABO (n = 5). P-values from two-sided t-test, *p ≤ 0.05; **p ≤ 0.01, ***p ≤ 0.001, ****p ≤ 0.0001 (p = 1.2e-4, 6.3e-4, not significant, 5.0e-2, 1.0e-3). e Distribution of the number of transcription factors per regulatory state at 24, 48, and 72 hpf, indicating at the top of each diagram the fraction of regulatory states that include 15–40 transcription factors. f Number of transcription factors expressed in each cell fate during embryogenesis, by embryonic region. SKM skeletal mesoderm, MES mesoderm, END endoderm, APE apical plate ectoderm, CBE ciliated band ectoderm, OE oral ectoderm, ABO aboral ectoderm. Source data are provided as a Source Data file.

In the following, this analysis focuses on the contribution of transcription factor expression to the spatial and temporal specification of cell fates. A first comparison shows that endoderm, mesoderm, and ectoderm each express ~120–145 transcription factors between 24 and 72 hpf, with 58 transcription factors expressed in all three (Fig. 3c). Similarly, individual embryonic regions express >100 transcription factors in the endoderm and non-skeletogenic mesoderm, and close to 80 in the apical and oral ectoderm at 72 hpf (Supplementary Fig. 5a). Remarkably, all regions that express a large number of transcription factors also include several cell fate domains, while fewer transcription factors are expressed in these regions at 24 hpf when few cell fates are specified, indicating that transcription factor expression relates to the complexity of cell fates specified within each embryonic region (Supplementary Fig. 5a). Even individual cell fates express a large number of transcription factors during specification from 24 to 72 hpf, with 60–80 transcription factors expressed in endodermal and mesodermal cell fates, and 40–60 transcription factors in most ectodermal and skeletogenic cell fates (Fig. 3d). Therefore, up to 30% of transcription factors encoded in the sea urchin genome are expressed in each region of this embryo during embryogenesis, and about 10–25% of all regulatory genes are expressed during the specification of each cell fate, indicating considerable overlap in transcription factor expression among different cell fates, different embryonic stages and different embryonic regions.

We identified the regulatory states that are expressed during sea urchin embryogenesis based on the combination of transcription factors co-expressed in any given cell fate at the five analyzed developmental time points. The identified regulatory states consist of 15–40 transcription factors in ~80% of states analyzed, corresponding to 5–10% of all regulatory genes encoded in the sea urchin genome (Fig. 3e, Supplementary Fig. 5b). Most cell fates also express a relatively constant number of transcription factors during developmental specification (Fig. 3f, Supplementary Fig. 5c). Exceptions expressing a higher number of transcription factors include in particular the progenitors of multiple cell fates, such as endodermal and mesodermal cells during early gastrulation, cell fates in the apical and ciliary ectoderm that are undergoing neurogenesis, and cell fates of the coelomic pouches which give rise to many additional cell fates as the rudiment develops (Fig. 3f). Fewer transcription factors, on the other hand, are expressed in differentiated cells, such as in the skeletogenic mesoderm and aboral ectoderm at later stages of development (Fig. 3f).
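The definition of a regulatory state as the set of transcription factors co-expressed in a given cell fate can be expressed by inverting a gene-by-domain annotation. The following is a minimal sketch under that assumption; the gene and domain names are hypothetical, and the genome sizes of 300 and 350 regulatory genes are the approximate bounds quoted in the text.

```python
from collections import defaultdict

def regulatory_states(expression):
    """Invert gene -> domains into domain -> set of co-expressed genes."""
    states = defaultdict(set)
    for gene, domains in expression.items():
        for domain in domains:
            states[domain].add(gene)
    return dict(states)

def genome_fraction(n_expressed, n_regulatory_genes):
    """Fraction of all regulatory genes contained in one regulatory state."""
    return n_expressed / n_regulatory_genes

if __name__ == "__main__":
    # Hypothetical annotation at a single developmental stage.
    expression = {"tf1": {"D1"}, "tf2": {"D1", "D2"}, "tf3": {"D2"}}
    for domain, state in sorted(regulatory_states(expression).items()):
        print(domain, sorted(state))
    # Bounds quoted in the text: 15-40 factors out of roughly 300-350 genes.
    print(genome_fraction(15, 300))
    print(round(genome_fraction(40, 350), 3))
```

With these bounds, the calculation gives fractions of about 0.05 and 0.11, in line with the 5–10% range stated above.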

Temporal changes of regulatory states during cell fate specification

As cells progress from progenitors to differentiated cell types, GRNs control the expression of temporally specific cellular functions, through intercellular signaling, intracellular transcriptional circuitry, or a combination of both. If temporal specification depends mostly on intercellular signaling, cell fate-specific GRNs should be activated within a short time period in response to signal induction. If, on the other hand, the temporal control of cell fate specification relies mostly on the sequential activation of transcriptional circuitry, GRN activation should occur more gradually and involve continuously propagating changes in transcription factor expression.

In this sea urchin species, embryos develop at 15 °C, and regulatory interactions typically occur within ~3 h from the initial transcriptional activation of a regulatory gene to the production of functional levels of transcription factors that regulate target gene expression17,47. Thus, if regulatory states continue to change at a similar rate, at least four transcription factors will change expression state from OFF to ON or ON to OFF per 12 h interval in each cell fate. Indeed, out of >200 temporal transitions in cell fate specification, almost all showed changes in regulatory state expression (Fig. 4a). On average, about 10–20 regulatory genes change expression state in each cell fate within 12 h intervals. However, up to 30–40 transcription factors change expression state in endodermal and mesodermal cell fates during early gastrulation, when many endomesodermal cell fates are specified, while fewer changes occur at later stages of development (Fig. 4a). Only two temporal transitions showed no change in transcription factor expression, both occurring at a late stage in development in cells that are near terminal differentiation (ABO3 and MES6, 60–72 hpf).
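Counting changes in expression state between two consecutive stages amounts to comparing two sets of expressed transcription factors. The sketch below assumes that each regulatory state is represented as a Python set; the gene names are hypothetical, and the expected minimum of four changes per 12 h simply restates the ~3 h step time given in the text.

```python
def state_changes(rs_early, rs_late):
    """Return (turned_on, turned_off) between two regulatory states."""
    turned_on = rs_late - rs_early    # OFF -> ON
    turned_off = rs_early - rs_late   # ON -> OFF
    return turned_on, turned_off

def minimum_changes(interval_h, step_h=3):
    """Changes expected if one regulatory step completes every step_h hours."""
    return interval_h // step_h

if __name__ == "__main__":
    # Hypothetical regulatory states of one cell fate at two stages.
    rs_24 = {"tfA", "tfB", "tfC"}
    rs_36 = {"tfB", "tfC", "tfD", "tfE"}
    on, off = state_changes(rs_24, rs_36)
    print(sorted(on), sorted(off), len(on) + len(off))
    print(minimum_changes(12))  # 4
```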

Fig. 4: Temporal specificity of regulatory states.

a Total number of transcription factors showing change in expression state in each cell fate during the four temporal transitions (a–d, left panel) in mesodermal, endodermal, and ectodermal cell fates. Changes in regulatory states are enhanced at 24–36 hpf in mesodermal and endodermal cell fates and decrease significantly on average in all cell fates between 60 and 72 hpf. Mean values shown as bold lines, 25th to 75th percentiles represented by box plots and entire data range except outliers by whiskers, for mesodermal (MESO, n = 19), endodermal (ENDO, n = 22), and ectodermal (ECTO, n = 32) cell fates. P-values from two-sided t-tests are indicated by: *p ≤ 0.05; **p ≤ 0.01, ***p ≤ 0.001, ****p ≤ 0.0001 (MESO: p = 1.6e-4, not significant, 1.9e-4; ENDO: p = 2.3e-17, 4.8e-2, 4.0e-2; ECTO: p = not significant, 1.3e-5). b Regulatory states show temporal specificity, with different compositions of regulatory states consisting of transcription factors expressed during earlier developmental stages. Shown are for each embryonic region the average number of transcription factors expressed per cell fate domain, with time of first expression color-coded as indicated in the legend. c Fraction of overlap between regulatory states expressed at different developmental stages. Shown are the fraction of transcription factors in each regulatory state at time ta that are also expressed at time tb (RS(ta) ∩ RS(tb) / RS(ta)). Source data are provided as a Source Data file.

Consistent with the relatively constant number of transcription factors expressed during cell fate specification, transcription factor expression is turned ON and OFF at comparable rates. Thus, during each 3 h interval, about 1–5 transcription factors change expression state by activation and deactivation of gene expression, although more regulatory genes are activated during early gastrulation (24–36 hpf) and more regulatory genes are deactivated during later development, particularly at 60–72 hpf (Supplementary Fig. 6). Throughout cell fate specification, cells therefore express temporally specific regulatory states not because of timed signaling events but because of a continuous turnover in transcription factor expression, leading to a continuous change in regulatory potential. This is demonstrated by comparing the temporal composition of regulatory states by the onset of transcription factor expression, showing that early active regulatory genes gradually turn off expression while additional regulatory genes become activated at every developmental stage (Fig. 4b, Supplementary Fig. 7). Importantly, most regulatory genes that cease expression in specific cell fates continue to be expressed in other cells. Thus, close to 80% of transcription factors expressed at 24 or 36 hpf remain expressed in the same embryonic region(s) by 72 hpf (Supplementary Fig. 8a, b).

These results show that regulatory states change extensively during early gastrulation, concomitant with the initial specification of many novel cell fates. In addition, however, regulatory states change at a relatively steady rate throughout embryogenesis, indicating that, for the most part, cell fate specification proceeds at the regulatory level as a continuous process, driven by the progressively changing activity of transcriptional circuitry. In consequence, there is a variable degree of overlap between regulatory states expressed at different stages during cell fate specification (Fig. 4c). Thus, endodermal and mesodermal cell fates express regulatory states that share fewer than 50% of transcription factors between 24 and 36 hpf, and as few as 20% between precursors and differentiated cell fates at 24 and 72 hpf, while most cell fate-specific regulatory states share 80% or more of expressed transcription factors between 60 and 72 hpf (Fig. 4c).
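The overlap measure used in Fig. 4c, the fraction of transcription factors in the regulatory state at time ta that are also expressed at time tb, can be written as |RS(ta) ∩ RS(tb)| / |RS(ta)|. A minimal sketch, assuming regulatory states are held as sets and using hypothetical gene names:

```python
def overlap_fraction(rs_a, rs_b):
    """Fraction of factors in rs_a that are also present in rs_b."""
    if not rs_a:
        raise ValueError("rs_a must contain at least one transcription factor")
    return len(rs_a & rs_b) / len(rs_a)

if __name__ == "__main__":
    # Hypothetical regulatory states at an early and a late stage.
    rs_early = {"tf1", "tf2", "tf3", "tf4", "tf5"}
    rs_late = {"tf4", "tf5", "tf6"}
    print(overlap_fraction(rs_early, rs_late))  # 0.4
    print(overlap_fraction(rs_late, rs_early))  # 2 of 3 factors shared
```

Because the denominator is the size of the first state only, the measure is not symmetric: the two calls above give different values for the same pair of states.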

Changes of regulatory states during cell fate decisions

Among the most important developmental decisions are those that determine the spatial arrangement of cell fates within the animal body plan. We therefore determined how changes in transcription factor expression contribute to the spatial specification of an increasing number of cell fates during embryogenesis. At least 32 cell fate decisions occur during development between 24 and 72 hpf, most of them during gastrulation, between 24 and 48 hpf (Fig. 2b). Of these, 24 involve binary cell fate decisions, five give rise to three cell fates, two give rise to four, and one gives rise to five cell fates during a 12 h developmental transition. To analyze the changes in transcription factor expression that lead to cell fate decisions, we focused on binary cell fate decisions where a progenitor cell fate A gives rise to two distinct cell fates B and C at a subsequent developmental stage (Fig. 5a).

Fig. 5: Changes in regulatory states during cell fate decisions.

a Binary cell fate decisions result in the specification of two cell fates B and C with differences in regulatory gene expression although both share an equivalent progenitor cell fate A at the previous stage. b Patterns of expression for transcription factors during binary cell fate decisions. Strictly temporal changes involve activation (New ON) or deactivation (New OFF) in both daughter states, while spatial differences between B and C are generated by activation or deactivation in only one daughter state. c Number of transcription factors in each of the categories defined in (b), shown for the 24 binary cell fate decisions. Mean values shown as bold lines, 25th to 75th percentiles represented by box plots and entire data range except outliers by whiskers. TC temporal changes, SC spatial changes. d Number of transcription factors differentially expressed between progenitor cells A and daughter cell fates B and C. In endodermal (n = 10) and ectodermal (n = 8), but not mesodermal (n = 6) cell fate decisions, spatial differences between B and C are significantly smaller than temporal differences to progenitor state A. Mean values shown by stars, medians by central lines, 25th to 75th percentiles by box plots and entire data range except outliers by whiskers. P-values from two-sided t-tests, *p ≤ 0.05; **p ≤ 0.01, ***p ≤ 0.001, ****p ≤ 0.0001 (MESO: p = not significant; ENDO: p = 3.5e-4, 2.7e-4; ECTO: p = 1.0e-5, 2.3e-4). e Contribution of different types of regulatory changes to spatial differences in regulatory states expressed in B and C. Shown are numbers of regulatory genes spatially regulated during cell fate decisions, either turned ON (Spat ON B/C) or specifically turned OFF in B or in C (Spat OFF B/C). f Comparison of regulatory changes that occur in the presence (CFD) or absence (nCFD) of cell fate decisions. Shown is the number of transcription factors changing in expression state during the four temporal transitions: 24–36 h, CFD (n = 51), nCFD (n = 22); 36–48 h, CFD (n = 34), nCFD (n = 39); 48–60 h, CFD (n = 18), nCFD (n = 55); 60–72 h, CFD (n = 0), nCFD (n = 73). Mean values shown as bold lines, 25th to 75th percentiles represented by box plots and entire data range except outliers by whiskers. P-values from two-sided t-tests, *p ≤ 0.05; **p ≤ 0.01, ***p ≤ 0.001, ****p ≤ 0.0001 (p = 5.5e-11, 4.7e-7, not significant). Source data are provided as a Source Data file.

During binary cell fate decisions, transcription factor expression changes temporally, to generate differences between progenitors and daughter cell fates, and/or spatially, to distinguish cell fates B and C (Fig. 5b). In a majority of cell fate decisions, in particular in the endoderm and ectoderm, the temporal differences between progenitor and daughter regulatory states by far exceed the spatial differences between regulatory states B and C (Fig. 5c, d, Supplementary Fig. 9a, b). Thus, in about two-thirds of cell fate decisions, <10 transcription factors are differentially expressed between regulatory states B and C, while temporal changes of >10 transcription factors occur in all but three cases (Supplementary Fig. 9c). Most cell fate decisions therefore initially introduce only a few spatial differences in transcription factor expression between newly specified cell fates.
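The distinction between temporal and spatial changes can be illustrated with a short sketch that assigns each transcription factor to a category based on its ON/OFF state in progenitor A and daughters B and C. The category names for changes follow Fig. 5b and 5e; the labels "Constant ON" and "Constant OFF", the function `classify`, and the example data are hypothetical and introduced only for illustration.

```python
from collections import Counter


def classify(a, b, c):
    """Classify one transcription factor by its ON/OFF state in A, B and C."""
    if not a:
        if b and c:
            return "New ON"        # temporal change: activated in both daughters
        if b:
            return "Spat ON B"     # spatial change: activated only in B
        if c:
            return "Spat ON C"     # spatial change: activated only in C
        return "Constant OFF"      # hypothetical label: never expressed
    if b and c:
        return "Constant ON"       # hypothetical label: expressed throughout
    if b:
        return "Spat OFF C"        # spatial change: turned OFF only in C
    if c:
        return "Spat OFF B"        # spatial change: turned OFF only in B
    return "New OFF"               # temporal change: deactivated in both daughters


# Hypothetical expression states (A, B, C) for one binary cell fate decision.
expression = {
    "tf1": (False, True, True),
    "tf2": (True, False, False),
    "tf3": (False, True, False),
    "tf4": (True, True, False),
    "tf5": (True, True, True),
}

counts = Counter(classify(*state) for state in expression.values())
temporal = counts["New ON"] + counts["New OFF"]
spatial = sum(n for label, n in counts.items() if label.startswith("Spat"))
print(temporal, spatial)  # 2 2
```

In this scheme, every spatial change also distinguishes one daughter from the progenitor, so the temporal count here covers only the strictly temporal categories of Fig. 5b.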

The specification of novel cell fates is often assumed to depend on the inductive activation of novel gene expression programs in one of the daughter cell fates downstream of intercellular signaling9,49,50,51. However, with few exceptions, changes in transcription factor expression occur during the specification of both cell fates B and C and involve both activation and deactivation of regulatory genes (Supplementary Fig. 9d, e). A majority of the 24 cell fate decisions deploy a combination of the four possible regulatory changes leading to spatial differences—specific gene activation in B or C, or specific gene deactivation in B or C—with 7 deploying all four mechanisms, 6 relying only on deactivation, 4 only on activation, and 7 where changes in regulatory gene expression occur exclusively in B or C (Fig. 5e, Supplementary Fig. 9f, g). Most cell fate decisions therefore involve more complex regulatory mechanisms beyond unilateral inductive signaling that affect both daughter cell fates and possibly vary among different developmental contexts.

We next addressed whether regulatory changes during cell fate decisions are substantially different from those that occur during temporal specification processes in the absence of spatial decisions. In cells that do not undergo a cell fate decision, regulatory states change at a relatively constant rate between 24 and 60 hpf, with fewer changes occurring after that (Supplementary Fig. 10a). Cell fate decisions that occur during gastrulation, a time of major changes in developmental specification and cellular behavior, involve a significantly higher number of transcription factors that change expression state in each cell fate (Fig. 5f). After 48 hpf, changes in regulatory states are comparable in the presence or absence of cell fate decisions, indicating that during later development, cell fate decisions depend on only a few spatially regulated transcription factors that presumably distinguish functionally similar cell types.

In sea urchins, several transcription factors that are expressed in endomesoderm progenitors are expressed more restrictedly in either endodermal or mesodermal cell fates later in development4,42. To determine if this is a general phenomenon, we analyzed whether transcription factors expressed in progenitor cells continue to be expressed in daughter cell fates and, if so, if expression occurs in all or just a subset of cell fates. Considering all 32 cell fate decisions, about 70% of transcription factors of the progenitor state remain expressed in daughter regulatory states at the subsequent developmental stage (Supplementary Fig. 10b, c). A majority of these transcription factors are expressed in all daughter cell states. By 72 hpf, about 50–60% of transcription factors expressed in progenitor cell fates are still expressed in daughter cell fates, however most of them in just a subset of regulatory states (Supplementary Fig. 10d, e). It is therefore common for multipotent progenitor cell fates to co-express transcription factors that later in development contribute to the differential specification of daughter cell fates. Cell fate decisions therefore initiate spatial differentiation by introducing relatively small spatial differences in transcription factor expression, while further differences among cell fate-specific regulatory states continue to accumulate during subsequent cell fate specification processes.
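The retention analysis described above can be sketched as follows, assuming the progenitor and daughter regulatory states are available as sets of transcription factor names. The function `progenitor_retention` and all gene names are hypothetical; the sketch only illustrates the distinction between factors retained in all daughter states and those retained in a subset.

```python
def progenitor_retention(progenitor, daughters):
    """Split progenitor factors by how many daughter states still express them."""
    retained_all = {tf for tf in progenitor if all(tf in d for d in daughters)}
    retained_any = {tf for tf in progenitor if any(tf in d for d in daughters)}
    retained_subset = retained_any - retained_all  # kept in some daughters only
    fraction = len(retained_any) / len(progenitor) if progenitor else 0.0
    return {
        "all": retained_all,
        "subset": retained_subset,
        "fraction_retained": fraction,
    }


# Hypothetical progenitor state and two daughter states.
progenitor_state = {"tf1", "tf2", "tf3", "tf4"}
daughter_states = [{"tf1", "tf2", "tf5"}, {"tf1", "tf3"}]

retention = progenitor_retention(progenitor_state, daughter_states)
print(sorted(retention["all"]))        # ['tf1']
print(sorted(retention["subset"]))     # ['tf2', 'tf3']
print(retention["fraction_retained"])  # 0.75
```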

Combinatorial regulatory states reflect the functional organization of cell fates

Transcription factors are frequently expressed in multiple cell types, in cells that belong to the same organ system as well as in entirely independent developmental contexts. To some extent, regulatory states therefore share transcription factors regardless of developmental or functional relationship. To quantify the overlap in transcription factor expression between cell fate-specific regulatory states, we performed pairwise comparisons at 24, 48, and 72 hpf (Fig. 6a). Perhaps contrary to expectations, very few regulatory states show no overlap in transcription factors, with the number of cell fate pairs expressing completely exclusive regulatory states being 0 at 24 hpf, 41 at 48 hpf, and 85 at 72 hpf. At 72 hpf, the average overlap in transcription factor expression between two cell fates is about 25% of the regulatory state, and about 75% of the 5329 pairwise comparisons (73 × 73) show less than 30% overlap in transcription factor expression. On the other hand, fewer than 10% of the comparisons identified regulatory states with more than 50% overlap, and fewer than 5% show more than 70% overlap in transcription factor expression at 72 hpf. Therefore, despite overlap, regulatory states typically consist of clearly distinct combinations of transcription factors.
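A minimal sketch of such a pairwise comparison is given here, using the directed similarity described for Fig. 6a (the fraction of transcription factors in regulatory state A that are also expressed in regulatory state B). The domain names, gene names and helper functions are hypothetical. As in the 73 × 73 comparison, the sketch includes self-comparisons and both directions of each pair, so n states yield n × n values.

```python
from itertools import combinations, product


def overlap_fraction(state_a, state_b):
    """Fraction of factors in state_a that are also expressed in state_b."""
    return len(state_a & state_b) / len(state_a) if state_a else 0.0


def pairwise_overlaps(states):
    """Directed overlap for every ordered pair of states, including self-pairs."""
    return {
        (i, j): overlap_fraction(states[i], states[j])
        for i, j in product(sorted(states), repeat=2)
    }


# Hypothetical regulatory states of three cell fate domains.
states = {
    "X": {"tf1", "tf2", "tf3", "tf4"},
    "Y": {"tf1", "tf2"},
    "Z": {"tf5"},
}

overlaps = pairwise_overlaps(states)
print(len(overlaps))         # 9
print(overlaps[("X", "Y")])  # 0.5
print(overlaps[("Y", "X")])  # 1.0

# Unordered pairs of domains with completely exclusive regulatory states.
exclusive = [
    (i, j) for i, j in combinations(sorted(states), 2)
    if states[i].isdisjoint(states[j])
]
print(exclusive)             # [('X', 'Z'), ('Y', 'Z')]
```

Because the measure is normalized by the size of the first state, it is not symmetric: a small regulatory state can be fully contained in a larger one while covering only part of it.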

Fig. 6: Global comparison of cell fate-specific regulatory states.

a Pairwise comparisons between regulatory states expressed throughout this embryo at 24, 48, and 72 hpf showing the similarity of regulatory states as the fraction of transcription factors in regulatory state A that are also expressed in regulatory state B. Common progenitor lineages at 24 hpf are indicated by perfect overlap (1.0) while strong overlap by 72 hpf is restricted to functionally related cell fates. b Representation of the major families of DNA binding domains among transcription factors in 72 hpf regulatory states. c Diagram showing the number of regulatory states representing different combinations of transcription factor families (top) and the number of regulatory states deploying each transcription factor family (right). d Summary diagram showing different scenarios of how changes in the combinatorial expression of transcription factors lead to temporally and spatially unique regulatory states during cell fate specification, as observed in this study. Regulatory states (theoretical) consisting of subsets of expressed transcription factors (black circles) change by activation and deactivation of expression of usually small numbers of transcription factors, as indicated. Source data are provided as a Source Data file.

Among the cell fates expressing highly similar regulatory states at 72 hpf are those that share a common developmental origin and that are functionally related. For instance, cell fates contributing to the three main skeletal rods (SKM2,5,6) share >60% of expressed transcription factors (Fig. 6a). In the coelomic pouches, anterior and central domains express regulatory states with >75% overlap and with >50% overlap among corresponding cell fates in the left and right coelomic pouches. Similarly, regulatory states expressed in the foregut share approximately 20 transcription factors (50–80% overlap), those expressed in the central apical organ (APE4,5a) share about 80% of transcription factors, cells surrounding the oral opening share up to 70% of transcription factors, and so do most cell fates of the aboral ectoderm.

Among the regulatory states that share just a few transcription factors and show <20% overlap are those expressed in cell fates that are located in different regions of the embryo and that are functionally distinct (Fig. 6a). However, even cell fates that share a similar developmental trajectory may end up expressing clearly distinct regulatory states. For example, regulatory states expressed in the coelomic pouches share <20% of transcription factors with those expressed in mesenchymal blastocoelar and pigment cells, although these cell fates share common mesodermal progenitors at 24 hpf. For comparison, mesenchymal pigment cells and blastocoelar cells with a last common progenitor at 18 hpf show 30–50% overlap in transcription factor expression. In the anterior endoderm, far more transcription factors are commonly expressed among cell fates of the foregut or midgut (20–30 transcription factors) than those that are commonly expressed in both compartments (10–15 transcription factors), although they all share a common progenitor at 24 hpf. Thus, differences in cell fate-specific regulatory states accumulate at different rates during developmental specification such that some mesodermal cell fates share no more transcription factors with each other than they do with any endodermal or ectodermal regulatory state.

Furthermore, perhaps even more remarkably, some cell fates with different developmental origins also end up expressing very similar regulatory states. For instance, the ciliary band consists of cells of the vegetal, animal and apical ectoderm that are differentially specified by 12 hpf41. Yet by 72 hpf, these regulatory states show >80% overlap in transcription factor expression and about 60% overlap with the neurogenic regulatory states of the central apical organ (Fig. 6a). Regulatory states expressed in the endodermal foregut and in the ectodermal oral opening also show extensive similarity. Thus by 72 hpf, when the oral opening is formed, up to 15 transcription factors and close to 50% of the regulatory state are commonly expressed in cells of the foregut and mouth, even though their progenitors, the anterior endoderm and stomodeal ectoderm, share just about 5–6 transcription factors (<25% of regulatory state) by 24 hpf.

These results show that similarity among regulatory states reflects not necessarily developmental origin but similarity in cellular function among functionally related cell types that occurs either within the same organ system or even in different parts of an organism. During cell fate specification, regulatory states therefore become increasingly different in cells of common developmental origin that assume different fates and increasingly similar in cells of different developmental origin that assume similar cell fates.

Representation of transcription factor families in cell fate-specific regulatory states

The ability of regulatory states to control distinct gene expression states depends on the representation of DNA binding domains in co-expressed transcription factors. The set of transcription factors analyzed in this study includes all major transcription factor families, such as Hox, basic helix-loop-helix (bHLH), forkhead, and Ets factors (Supplementary Fig. 11). We determined whether transcription factor families show any specificity in spatial expression at 72 hpf (Fig. 6b). Most transcription factor families are broadly represented across the sea urchin larva, with little specificity for particular developmental contexts. Exceptions include the small group of Smad transcription factors, which are only expressed in the ciliary band and in three mesodermal cell fates by 72 hpf. Furthermore, regulatory states expressed in the aboral ectoderm mainly consist of homeodomain, bHLH and zinc finger transcription factors, lacking expression of basic leucine zipper (bZip), Ets, and forkhead transcription factors. Most ectodermal regulatory states are devoid of nuclear hormone receptor (NHR) expression, and most regulatory states in the skeletal mesoderm lack expression of Sox transcription factors and express relatively few homeodomain factors compared to other regulatory states. Overall, however, transcription factor families show limited specificity for certain cell fates or even embryonic regions.

A comparison of transcription factor family composition nevertheless revealed 34 unique combinations of transcription factor families that distinguish the regulatory states expressed at 72 hpf (Fig. 6c). Most combinations are expressed in just one or two spatial domains and are likely to contribute to specific gene expression profiles of the respective cell fates. Almost all regulatory states include transcription factors of the homeodomain, bHLH, forkhead and zinc finger families, groups that are also strongly represented in the set of transcription factors analyzed here, with 18–60 members each (Supplementary Fig. 11). However, specificity among regulatory states is generated by the differential expression of members of the Sox, Ets, Tbox, bZip and nuclear hormone receptor families of transcription factors, groups that are represented by 5–20 members that are nevertheless expressed in a majority of regulatory states. Differences in cis-regulatory DNA sequence recognition among regulatory states are therefore generated by the specific absence of certain transcription factor families and of course by variations in the DNA binding specificity between members of each family52.
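The counting of family combinations can be sketched as follows, assuming a mapping from each transcription factor to its family and a set of expressed factors per cell fate domain. All names and data are hypothetical placeholders, and the sketch is not the intersecting-sets tool used in this study; it only shows how unique combinations and per-family usage could be tallied.

```python
from collections import Counter, defaultdict

# Hypothetical assignment of transcription factors to families.
family_of = {"tf1": "homeodomain", "tf2": "bHLH", "tf3": "Sox", "tf4": "Ets"}

# Hypothetical regulatory states of four cell fate domains.
states = {
    "D1": {"tf1", "tf2"},
    "D2": {"tf1", "tf2", "tf3"},
    "D3": {"tf2", "tf1"},
    "D4": {"tf1", "tf4"},
}

combinations_to_domains = defaultdict(list)  # family combination -> domains
family_usage = Counter()                     # family -> number of domains using it

for domain in sorted(states):
    families = frozenset(family_of[tf] for tf in states[domain])
    combinations_to_domains[families].append(domain)
    family_usage.update(families)

print(len(combinations_to_domains))  # 3 unique family combinations
print(combinations_to_domains[frozenset({"homeodomain", "bHLH"})])  # ['D1', 'D3']
print(family_usage["homeodomain"], family_usage["Sox"])             # 4 1
```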

Discussion

With as few as 200 transcription factors, the sea urchin GRN controls the formation of all major larval body parts, including a through gut, nervous system, skeleton, muscle, and immune system. Here we show that regulatory states expressed during sea urchin development not only distinguish all known body parts but further subdivide this organism into >70 cell fate domains with distinct regulatory potential. The developmental expression of regulatory states therefore provides insights into the genomic control system that globally defines animal body plans.

Most importantly, despite the limited number of transcription factors encoded in the genome, regulatory states expressed during cell fate specification in this embryo are sufficiently unique to enable the control of distinct cell fate- and developmental stage-specific gene expression programs. Thus, unique regulatory states not only distinguish cell fates with specific morphological or functional properties but also the same cells at different times in development (Fig. 6d). Throughout cell fate specification, combinatorial regulatory states change continuously in transcription factor composition, at least until cells undergo terminal differentiation. Small spatial differences in regulatory states are initially introduced during cell fate decisions that precede the functional and morphological diversification of cell fates. Furthermore, at larval stage, combinatorial differences among regulatory states are enhanced between functionally unrelated cell fates, while similar regulatory states are expressed in functionally similar cell fates regardless of developmental background. Since a large majority of transcription factors are expressed in several cell fates, several developmental stages, and even several embryonic regions, the specificity of regulatory states is typically not defined by any individual transcription factor but by their specific combination. Altogether, these results therefore indicate that the combinatorial information content of regulatory states is deterministic of the functional state of cells and responsible for the specification of functionally diverse cell types during embryogenesis.

Some of the most obvious changes in gene regulation during development affect quantitative levels of transcriptional gene expression, which change dynamically in time and often display variability among cells, never reaching what could be considered controlled stable states of transcriptional gene output. In some contexts, GRN circuits successfully explain these dynamic changes in gene expression levels9,15,53,54,55. In this study we focused exclusively on the presence or absence of transcription factors in each embryonic cell fate, at transcript levels detectable by in situ hybridization, which are likely sufficient also for the production of functional levels of transcription factors47. Thus we did not consider here the cell-to-cell variability in gene expression levels nor the quantitative differences in expression levels of transcription factors expressed in multiple cell fates. However, extensive combinatorial differences between regulatory states indicate that the developmental function of GRNs might rely less on gene regulation at steady state levels and instead controls cell fate-specific gene expression primarily through specific combinations of transcription factors, each expressed at functional levels even though absolute concentrations remain variable17,56,57.

Developmental transitions in cell function and cell behavior typically occur at specific times and places in development and involve changes in gene expression states, for instance, during the specification of new cell fates, the patterning of axes, cell proliferation, cell migration, and the differentiation of cell types of various form and function. Many of these transitions occur in response to signaling interactions, as demonstrated by countless experiments showing extensive phenotypic consequences when signaling interactions are interrupted8,58,59. Based on these observations one might expect that cells remain in a state of stasis until signaling interactions induce phenotypic specification. However, this study indicates that the genomic control system for cell fate specification operates in a far more continuous manner, with dynamic changes in the combinations of expressed transcription factors enabling the activation of different gene expression programs at any time throughout development. Even spatial decisions that typically involve intercellular signaling occur mostly by small changes in regulatory states followed by the accumulation of further differences during subsequent specification. The view that emerges is that the developmental progression of cell fate specification might therefore depend to a large extent on a continuous exchange of regulatory information that is propagated according to the architecture of GRN circuitry.

Methods

Animal cultures

Adult sea urchins were obtained from Pat Leahy, Kerckhoff Marine Laboratory, Caltech, and used to obtain gametes. After fertilization, large batches of sea urchin embryos were cultured in Petri dishes at 15 °C in filtered fresh seawater and collected at various stages. Ethical permits are not required for invertebrate species.

Preparation of RNA probes for in situ hybridization

DNA templates for the preparation of in situ probes were generated using cDNA prepared at various developmental stages for regulatory genes listed in Supplementary Data 1. For RNA extraction, embryos were collected, centrifuged, and resuspended in lysis buffer, and RNA was purified according to the manufacturer’s protocol using the Qiagen RNeasy Kit (Qiagen, Germany). cDNA was prepared using the iScript cDNA synthesis kit (BioRad) and primers listed in Supplementary Data 3. PCR products were purified and ligated into pGEM-TEZ or pCRII plasmid vectors or used directly for probe synthesis with T7-tailed primers. The quality and specificity of probe templates were confirmed by gel electrophoresis (fragment length) and by sequencing. Digoxigenin-labeled antisense probes were generated using the DIG RNA labeling kit (Roche).

Whole-mount in situ hybridization (WMISH)

The protocols for animal culture, collection, fixation, and whole-mount in situ hybridization (WMISH) to detect spatial gene expression have been described previously42,60. Briefly, sea urchin embryos were collected at 24, 36, 48, 60 and 72 hpf from a large culture grown at 15 °C and fixed for 48–72 h in 4% paraformaldehyde solution at 4 °C. Embryos were washed in 1 M MOPS solution and stored at −20 °C in 70% ethanol. For hybridization, 100–200 embryos were incubated with probes diluted to 1–2 ng/μL in hybridization buffer (50% formamide, 5× SSC, 1× Denhardt’s, 1 mg/mL yeast tRNA, 50 ng/mL heparin, and 0.1% Tween-20). Embryos were washed 2× in hybridization buffer, then in 2× SSCT (2× SSC, 0.1% Tween-20), 0.2× SSCT, and 0.1× SSCT, and finally in a solution of 0.1% Tween-20, 0.1 M MOPS, and 0.5 M NaCl. Embryos were incubated at RT with 1:2,000 diluted Anti-Digoxigenin-AP (Fab fragment, Roche). The embryos were washed in MABT buffer (0.1 M maleic acid, 0.15 M NaCl, and 0.1% Tween-20) and with AP buffer (100 mM Tris·Cl (pH 9.5), 100 mM NaCl, 50 mM MgCl2, and 1 mM levamisole). Embryos were stained with 5-bromo-4-chloro-3-indolyl-phosphate (BCIP) and nitro blue tetrazolium (NBT).

Imaging

WMISH of each regulatory gene was performed using 100–200 embryos and considered valid where observed expression patterns among assayed embryos were comparable. Assays were repeated in case of overstaining, weak expression, or inconsistent spatial expression among assayed embryos. At least three embryos were imaged per developmental time point from several views, including lateral, oral, aboral, apical and vegetal views at various focal depths to capture gene expression throughout the entire embryo. Expression patterns were validated by comparison to quantitative expression data from transcriptome analyses22 and to published gene expression patterns where available. Images were captured using a Zeiss AxioSkop microscope equipped with a Zeiss AxioCam camera.

Identification of cell fate domains and cell fate trajectories

Cell fate domains were identified separately for each embryonic region by comparing expression patterns of transcription factors. Spatial domains were defined according to specific gene expression patterns as groups of cells showing clear differences in the presence or absence of expression of one or more specific transcription factors compared to nearby cells outside the domain but within the same embryonic region. Differences in gene expression are furthermore supported by clear gene expression boundaries. Additional cell fates that are specified at the boundaries of these domains, for instance, by overlapping transcription factor expression, might have been missed in this comparison of single gene expression patterns. Furthermore, cells with variable spatial position, including mesenchymal cells and neurons, might also show additional differences in transcription factor expression beyond those identified here. However, the relatively small number of cells in each domain and the clear morphological structures at later stages contribute to the accuracy of the expression domains identified here.

Developmental trajectories of cell fate domains were analyzed by comparing spatial domains in each embryonic region at different developmental stages. For example, where three adjacent domains in comparable embryonic locations express similar transcription factors at 72 and also at 60 hpf, they are considered as corresponding cell fate domains present at different times. Where fewer domains are detected at earlier stages compared to subsequent stages, corresponding cell fates are identified by spatial arrangement of domains and by the expression of specific transcription factors to distinguish between corresponding domains and domains that are subdivided into two or more domains within a 12-h interval. Using this approach, cell fate domains are detected only once clearly distinct gene expression boundaries are established, while subtle changes in gene expression within a progenitor domain might occur at even earlier developmental stages.

Annotation of gene expression

Gene expression in each cell fate domain was annotated manually using five categories indicating (1) expression, where gene expression was detected by a clear signal in WMISH assays; (2) absence of expression, where gene expression was not detected; (3) partial expression, where gene expression was detected in a subset of cells within a domain, for example in individual neurons within a broader domain; (4) weak or unclear expression, where gene expression signals were too weak to consider expressed but also not clearly absent, or where expression was detectable only in some embryos; for all subsequent analyses in this study, weak expression was considered as below functional levels and therefore not expressed; (5) no data. Comparison of transcription factor families represented in regulatory states was performed using a tool for visualizing intersecting sets61.
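The conversion of the five annotation categories into binary expression states can be sketched as follows. Treating weak or unclear expression as not expressed follows the rule stated above; treating partial expression as expressed is an assumption made here for illustration, exposed through the hypothetical parameter `partial_is_expressed`. The names `Annotation`, `to_binary` and `regulatory_state`, and all gene names, are likewise hypothetical.

```python
from enum import Enum


class Annotation(Enum):
    EXPRESSED = "expression"
    ABSENT = "absence of expression"
    PARTIAL = "partial expression"
    WEAK = "weak or unclear expression"
    NO_DATA = "no data"


def to_binary(annotation, partial_is_expressed=True):
    """Return True/False for expressed/not expressed, or None where data are missing."""
    if annotation is Annotation.NO_DATA:
        return None
    if annotation is Annotation.EXPRESSED:
        return True
    if annotation is Annotation.PARTIAL:
        return partial_is_expressed  # assumption, not stated in the text
    return False  # ABSENT, and WEAK treated as below functional levels


def regulatory_state(annotations):
    """Set of transcription factors considered expressed in one domain."""
    return {tf for tf, a in annotations.items() if to_binary(a) is True}


# Hypothetical annotations for one cell fate domain.
domain_annotations = {
    "tf1": Annotation.EXPRESSED,
    "tf2": Annotation.WEAK,
    "tf3": Annotation.PARTIAL,
    "tf4": Annotation.ABSENT,
    "tf5": Annotation.NO_DATA,
}

print(sorted(regulatory_state(domain_annotations)))  # ['tf1', 'tf3']
```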

Statistics and reproducibility

To increase reproducibility, spatial expression patterns of each regulatory gene were compared between multiple embryos and data were excluded for genes with inconsistent or inconclusive gene expression data. The sample size was not statistically determined. Statistical analyses were performed using t-tests as indicated in figure captions.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.