Inter-embryo gene expression variability recapitulates the hourglass pattern of evo-devo

Liu, Jialin; Frochaux, Michael; Gardeux, Vincent; Deplancke, Bart; Robinson-Rechavi, Marc

doi:10.1186/s12915-020-00842-z

Inter-embryo gene expression variability recapitulates the hourglass pattern of evo-devo

Research article
Open access
Published: 19 September 2020

Volume 18, article number 129, (2020)
Cite this article

Download PDF

You have full access to this open access article

BMC Biology Aims and scope Submit manuscript

Inter-embryo gene expression variability recapitulates the hourglass pattern of evo-devo

Download PDF

Jialin Liu^1,2,
Michael Frochaux^2,3,
Vincent Gardeux^2,3,
Bart Deplancke^2,3 &
…
Marc Robinson-Rechavi ORCID: orcid.org/0000-0002-3437-3329^1,2

3623 Accesses
17 Citations
10 Altmetric
Explore all metrics

Abstract

Background

The evolution of embryological development has long been characterized by deep conservation. In animal development, the phylotypic stage in mid-embryogenesis is more conserved than either early or late stages among species within the same phylum. Hypotheses to explain this hourglass pattern have focused on purifying the selection of gene regulation. Here, we propose an alternative—genes are regulated in different ways at different stages and have different intrinsic capacities to respond to perturbations on gene expression.

Results

To eliminate the influence of natural selection, we quantified the expression variability of isogenetic single embryo transcriptomes throughout fly Drosophila melanogaster embryogenesis. We found that the expression variability is lower at the phylotypic stage, supporting that the underlying regulatory architecture in this stage is more robust to stochastic variation on gene expression. We present evidence that the phylotypic stage is also robust to genetic variations on gene expression. Moreover, chromatin regulation appears to play a key role in the variation and evolution of gene expression.

Conclusions

We suggest that a phylum-level pattern of embryonic conservation can be explained by the intrinsic difference of gene regulatory mechanisms in different stages.

Genetic variants regulating expression levels and isoform diversity during embryogenesis

Article 26 December 2016

Evolutionary Developmental Biology and Its Contribution to a New Synthetic Theory

Shifting the Black Box: Approaches to the Development and Evolution of the Vertebrate Mesoderm

Background

Both morphological and transcriptomic surveys have proposed an “hourglass” model of evo-devo [1, 2]. The mid-development phylotypic stage is more conserved than both early and late development [3,4,5,6,7]. Currently, the proposed mechanisms for this pattern are mainly based on how natural selection shapes the outcome of regulatory variation on gene expression. The first hypothesis interpreted high conservation as a result of negative selection [1, 2, 8]. For example, Raff suggested a high inter-dependence in signaling among developmental modules in middle development, so expression changes underlying this stage will generally be deleterious and under negative selection. An alternative hypothesis, however, argues that high conservation can also be the result of variation being less visible to positive selection [9, 10]. In this scenario, embryonic development is more likely to evolve when ecological niches demand it. For example, variation in early development can result from adaptation to diverse ecological circumstances [11]. To distinguish the two hypotheses, Zalts and Yanai [6] compared the expression variation of 20 Caenorhabditis elegans mutation accumulation strains across embryonic development and found that the nematode phylotypic stage has lower expression variation. Since mutation accumulation experiments mostly remove the effect of positive selection, this study indicates that positive selection is not necessary to obtain an hourglass pattern of expression evolution.

The hourglass pattern may also result from the regulatory mechanisms of genes at different stages having a different inherent tendency to respond to perturbations. Under this hypothesis, the outcome of regulatory mutations on gene expression would be biased, and this might impact the patterns of expression divergence between species, even in the absence of patterns of natural selection. For example, genes regulated by redundant enhancers are more robust to genetic variation on gene expression [12, 13]. In addition, broad promoters (which initiate transcription at dispersed regions) are more robust to mutations than narrow ones (which initiate transcription at precise positions) [14]. What is more, chromatin regulators can act as capacitors to buffer expression divergence between species [15]. Gene expression can vary among isogenic individuals in homogenous environments [16,17,18,19,20], suggesting widespread stochastic perturbations during transcription. For example, gene expression can be perturbed by the random distribution of molecules at cell division or by the inherent randomness of biochemical reactions with low molecule numbers [18, 21]. It has been suggested that mechanisms which confer robustness to environmental or stochastic perturbations on gene expression could also buffer the effects of genetic perturbations [22,23,24]. Under this hypothesis, we can use these stochastic perturbations of gene expression to estimate how expression responds to random genetic perturbations at different stages. If the phylotypic stage is more robust to genetic perturbations on gene expression, we should observe low inter-embryo expression variability at this stage, even among isogenic embryos in constant conditions.

In this study, we built a single embryo transcriptome time series across fly embryonic development, with a high number of isogenic replicates. Using this dataset, we investigated the developmental patterns of expression variability and found that the expression variability recapitulates an hourglass pattern, with the minimum of noise at extended germband, the phylotypic stage.

Results

Single embryo RNA-seq profile over embryogenesis

We generated 288 single embryo 3′ end transcriptomes using bulk RNA barcoding and sequencing (BRB-seq) [25], at eight developmental stages covering the whole fly embryogenesis, with 3 h intervals (Fig. 1a). After quality control, 239 samples were kept (Additional file 1: Figure S1, S2). On average, we obtained over 5 million uniquely mapped reads of protein-coding genes per embryo. Based on multidimensional scaling analysis (MDS), 150 embryos follow the developmental trajectory, while there is a small cluster of 89 embryos collected at different time points mixed together (Fig. 1b). The samples in this cluster appear to be unfertilized eggs (Additional file 1: Figure S3). All further analysis was performed only on the 150 fertilized embryos.

The phylotypic stage is robust to stochastic perturbations on gene expression

We measured expression variability as adjusted SD and standard deviation (SD) of expression between replicates corrected for expression level (the “Materials and methods” section; Additional file 1: Figure S4-6). This expression variability follows an hourglass pattern overall, with a global minimum at E3 (Fig. 2), corresponding to the phylotypic stage of flies [7]. There is also a local minimum at E6. This is consistent with the pattern of transcriptome divergence between fly and mosquito Anopheles gambiae, with the global minimum at E3 and a local minimum at E6 [26]. We did not find any significant functional enrichment for genes which follow the hourglass variability pattern. Essential genes, and highly connected genes, have lower variability (Additional file 1: Figure S7), as observed in yeasts [27, 28], which indicates that variability is generally detrimental. Overall, expression variability is not equally distributed throughout embryogenesis, and the variability is lower at the phylotypic stage, supporting the hypothesis that the regulatory mechanisms at the phylotypic stage are more robust to stochastic perturbations on gene expression.

Testing potential confounding factors of hourglass variability pattern

Our observations are robust to the use of different variability metrics (Additional file 1: Figure S8). Based on bootstrap analysis (“Materials and methods” section), we confirmed that all the stages, excepting E4, are robust to the choice of samples used to calculate noise (Additional file 1: Figure S9). Bootstrap results also suggest that the minimum variability extends over E3 to E4. To check whether high variability at E1 is related with a low number of embryos at this stage, we sampled the same number of embryos for each stage (8 embryos, without replacement) and re-calculated the median variability across genes. We repeated this process 500 times and compared the median of variability over development. The pattern is similar, with a minimum at E3 (Additional file 1: Figure S10). Another potential source of bias is maternal transcripts. The embryo transcriptome is dominated by zygotic transcripts as soon as 2.5 h after egg laying [29], so the high variability in E1 and E2 is not directly caused by maternal transcripts; we confirmed this by removing all maternally expressed genes (Additional file 1: Figure S11). Finally, the variation in the expression variability could either be due to changes in the set of active genes, with genes differing in their intrinsic variability levels, or to genome-wide changes in the regulation of variability. To test this, we reproduced our results restricted to the subset of genes which are expressed at all stages. Under the first explanation, we would expect to lose the hourglass variability pattern, but the pattern is maintained (Additional file 1: Figure S12), suggesting that the lower variability at E3 is due to genome-wide regulation mechanisms more than to changes in the gene set.

Mutational robustness shapes the hourglass expression divergence pattern

Several studies have suggested that mechanisms which confer robustness to stochastic variations can also buffer the effects of genetic variations [22,23,24]. This raises the possibility that the regulatory mechanisms at the phylotypic stage might also be characterized by stronger robustness to genetic perturbations on gene expression. In this scenario, we should expect relatively low regulatory sequence conservation for the stage with high gene expression conservation [7]. Indeed, mutations that are buffered would behave nearly neutrally. To test this hypothesis, we identified genes specifically expressed in each stage and compared sequence conservation of their core promoter regions between species (49 bp upstream transcription start site (TSS) to 10 bp downstream of the TSS as defined by Dreos et al. [31]; phastCons score [32]). We found that genes specific of E3 have a relatively weak promoter sequence conservation (Fig. 3a). This pattern remains using 200-bp or 400-bp regions around TSS but disappears using 1-kb regions (Additional file 1: Figure S13). Using transcriptome indexes of conservation (mean promoter sequence conservation weighted by expression), we can extend this observation to the full transcriptome (Fig. 3b). E3 has a relatively low index, again showing that genes expressed at this stage tend to have low promoter sequence conservation. These results support a role of buffering effects on regulatory mutations in the hourglass pattern of expression divergence in fly embryogenesis.

Histone modifications and expression variability

To investigate the mechanisms which minimize expression variability, we focused on histone modifications. It has been suggested that histone modifications play a prominent role in regulating expression variability between cells (cell-to-cell expression variability) [33, 34, 35,36,37], between individuals (individual-to-individual expression variability) [19], and between environments (expression plasticity) [38]. To systematically study the role histone modifications play in relation to expression variability between embryos, we analyzed three active histone modifications (H3K4Me3, H3K9Ac, and H3K27Ac) at six developmental stages from modENCODE [39]. For each gene, we calculated the mean modification signal (background-subtracted tag density) of its proximal promoter region (2 kb upstream to 2 kb downstream for TSS). We found that higher modification signal genes tend to have lower variability for all histone modifications (Fig. 4), supporting a role in minimizing expression variability.

It has been found that, relative to narrow promoters, broad promoters were highly enriched with H3K9Ac [40], and Additional file 1: Figure S14). In addition, genes with broader promoters tend to have lower expression variability [41], and Additional file 1: Figure S15). These observations indicate that the negative correlation between histone modification signal and expression variability could be explained by promoter shape. After controlling the effects of promoter shape, however, we still found a strong negative correlation (Spearman’s rho − 0.34, − 0.32, − 0.32 for H3K4Me3, H3K9Ac, and H3K27Ac, respectively). Interestingly, the correlation between promoter shape index and expression variability is much weaker (Spearman’s rho 0.08), after controlling for histone modification signal.

Histone modifications and promoter sequence conservation

To study the relationship between histone modifications and promoter sequence evolution, we analyzed the correlation between histone modification signal and promoter sequence conservation. We found that genes with higher histone modification signal tend to have less conserved core promoter regions between species (Fig. 4). Since histone modifications are enriched in highly expressed genes [42], and selective pressure against mutations changing gene expression level is stronger in highly expressed genes [43], genes with higher histone modification signal might be expected to have more conserved promoter sequence conservation. Our observation of the inverse pattern suggests that genes with higher histone modification signal are under stronger buffering effect, thus less sensitive to mutations in their regulatory regions, and are thus less conserved.

Higher histone modification signal at the phylotypic stage

The relation between histone modifications and expression variability raises the possibility that the pattern of expression variability across development could be partially driven by changes in histone modification signal. To compare histone modification signal between stages, we normalized the promoter signal by that on intergenic regions (the “Materials and methods” section), which are not expected to change histone modification signal between stages. All normalized histone marks present an hourglass-like pattern, with the highest signal at 8–12 h, corresponding to E3 and E4, i.e., the lowest expression variability (Fig. 5). Generally, these results show that histone modification signal changes over development, with a correspondence between stronger histone modification signal and lower expression variability, consistent with our hypothesis.

Discussion

In this work, we found an uneven distribution of isogenic inter-individual expression variability, and thus of robustness of the process of gene expression, across development. This mirrors the hourglass evo-devo pattern [1, 2]. Stage E3 is the most robust to a stochastic variation on gene expression, with lower expression variability, and is the phylotypic stage of fly, with a conservation between species [7].

The hourglass pattern has been mainly interpreted as a signal of increased negative selection on gene regulation at mid-development, the phylotypic stage [1, 2, 4, 6, 8], although it may also result from an increased positive selection at both early and late development [6, 9, 10, 44]. However, our finding that genes expressed at distinct stages show distinct patterns of variability suggests an alternative, although not exclusive, model: genes expressed at the phylotypic stage utilize different regulatory strategies, which respond differently to perturbations. Thus, we should not expect divergence in gene expression to be equally likely at all stages even without natural selection on this divergence. The natural selection, instead, could be in part in the control of expression variability in individuals.

Consistent with previous observations in Arabidopsis thaliana [19], we found that histone modification is an important determinant of expression variability between multicellular individuals. Interestingly, histone modification also plays a role in regulating expression variability between single cells [33, 34, 35,36,37]. For example, a gene’s expression variability can be changed by modulating histone acetylation level of its promoter [35]. In addition, histone modifications are also associated with expression variability between environments [38]. In line with these results, we found that higher histone modification genes have less conserved promoter sequences, suggesting that histone modification might also buffer genetic variations on gene expression. Although there is no direct experimental validation, it has been found that deleting chromatin regulators tends to increase the expression divergence between species [15]. Since the sources of expression variability discussed above are quite different, the chromatin structure appears to be an important regulator of the robustness of gene expression under different types of perturbations.

While we found that genes specific of embryonic stages with lower expression divergence (E3, E4, E5, E6) generally have higher sequence conservation of promoters, the phylotypic stage (E3) does not have the highest promoter sequence conservation, suggesting that the lowest expression divergence at this stage is driven at least in part by mutational robustness. Our analyses have two potential caveats. First, our results were based solely on promoter analysis, and it remains to be seen how much these observations extend to other regulatory elements. Second, we only considered nucleotide substitution changes, but not promoter turnover. For large effect mutations, such as insertion, deletion, and turnover, purifying selection may be more efficient than mutational robustness to keep conserved expression level. For example, Piasecka et al. [45] found the neighborhood of genes specifically expressed in the phylotypic stage of zebrafish to be enriched with transposon free regions and long conserved non-coding elements, indicating stronger purifying selection for large effect mutations in the phylotypic stage. So, it is possible that both purifying selection and mutational robustness together shape the lower expression divergence in the fly phylotypic stage.

Although mutational robustness can evolve under natural selection theoretically [46], the conditions are too restrictive to be relevant in practice. In contrast, selection for robustness to environmental or stochastic variations can have a clear fitness advantage (individual-based and immediate). Thus, we propose that mutational buffering is a by-product of selection for minimizing such expression variability. The exact roles of natural selection and of histone modifications in the patterns that we observe remain to be tested, as does the generality of our observations beyond D. melanogaster. Our results support an important role for the control of expression variability in embryonic gene expression and in the evolution of development. We propose that selection for robustness to stochastic and to environmental perturbations in a key embryonic stage has led to the evolutionary conservation over large time scales which characterizes the phylotypic stage.

Conclusions

Overall, we suggest that the phylotypic stage is characterized by selection for robustness to stochastic and environmental perturbations. This could lead to mutational robustness, thus evolutionary conservation of expression and the hourglass pattern.

Materials and methods

Embryo collection and RNA extraction

Fly lines (w¹¹¹⁸) were obtained from the Bloomington Stock Center and reared at room temperature on a standard fly medium with a 12-h light-dark cycle. The fly medium we used is composed of 6.2 g agar powder (ACROS N. 400400050), 58.8 g Farigel wheat (Westhove N. FMZH1), 58.8 g yeast (Springaline BA10), 100 mL grape juice; 4.9 mL propionic acid (Sigma N. P1386), 26.5 mL of methyl 4-hydroxybenzoate (VWR N. ALFAA14289.0) solution (400 g/L) in 95% ethanol, and 1 L water. One hundred to 150 flies were transferred to cages, which were sealed to a grape agar plate (1:1 mixture of 6% agar and grape juice). We used 4 separate cages to collect the embryos. The adults were kept overnight on this plate before being transferred to a new plate supplemented with yeast paste. Synchronization of eggs on this plate lasted for 2 h before being swapped with a new plate supplemented with yeast paste. We let the adults lay eggs for 30 min before removing the plate and letting the eggs incubate for the desired time.

Eggs were harvested using the following protocol. First a 1:1 bleach (Reactolab 99412) 1× PBS mix was poured on the plate and incubated for 2 min. During this incubation, we used a brush to lightly scrape the surface to mobilize the embryos. We then poured the PBS-bleach mixture through a sieve, washed the plate with 1× PBS, and poured the wash on the same sieve. We washed the sieve several times with 1× PBS until the smell of bleach disappeared. Single embryos were then manually transferred to Eppendorf containing 50 μL beads and 350 μL TRIzol (Life Technologies AM9738). The tubes were homogenized in a Precellys 24 Tissue Homogenizer at 6000 rpm for 30 s. Samples were then transferred to liquid nitrogen for flash freezing and stored at − 80 °C. For RNA extraction, tubes were thawed on ice, supplemented with 350 μL of 100% ethanol before homogenizing again with the same parameters. We then used the Direct-zol™ RNA Miniprep R2056 Kit, with the following modifications: we did not perform DNAse I treatment, we added another 2 min centrifugation into an empty column after the RNA Wash step, finally elution was performed by adding 8 μL of RNAse-free water to the column, incubation at room temperature for 2 min, and then centrifugation for 2 min. RNA was transferred to a low-binding 96-well plate and stored at − 80 °C.

Bulk RNA barcoding and sequencing

The BRB-seq is a technique for multiplexed RNA-seq [25] which is able to provide high-quality 3′ transcriptomic data at a low cost (e.g., 10-fold lower than Illumina Truseq Stranded mRNA-seq). The data (fastq files) generated from BRB-seq are multiplexed and asymmetrical paired reads. Read R1 contains a 6-bp sample barcode, while read R2 contains the fragment sequence to align to the reference genome.

Library preparation

RNA quantity was assessed using picogreen (Invitrogen P11496). Samples were then grouped according to their concentration in 96-well plates and diluted to a final concentration determined by the lowest sample concentration on the plate. RNA was then used for gene expression profiling using BRB-seq. In short, the BRB-seq protocol starts with oligo-dT barcoding, without TSO for the first-strand synthesis (reverse transcription), performed on each sample separately. Then, all samples are pooled together, after which the second strand is synthesized using DNA PolII Nick translation. The sequencing library is then prepared using cDNA augmented by an in-house-produced Tn5 transposase preloaded with the same adapters (Tn5-B/B), and further enriched by limited-cycle PCR with Illumina compatible adapters. Libraries are then size-selected (200–1000 bp), profiled using High Sensitivity NGS Fragment Analysis Kit (Advanced Analytical, #DNF-474), and measured using Qubit dsDNA HS Assay Kit (Invitrogen, #Q32851). In total, we generated four libraries. For details of the library information, please check Additional file 2: Table S13.

Sequencing

Libraries were mixed in equimolar quantities and were then sequenced on an Illumina Hi-Seq 2500 with pair-end protocol (read R2 with 101 bp) at the Lausanne Genomic Technologies Facility.

RNA-seq analysis

Generating expression matrix

The fastq files were first demultiplexed by using the “Demultiplex” tool from BRB-seqTools suite (available at https://github.com/DeplanckeLab/BRB-seqTools). Then, we trimmed the polyA sequences of the demultiplexed files by using the “Trim” tool. Next, the STAR aligner [47] was used to map the trimmed reads to the reference genome of fly Drosophila melanogaster (BDGP6, Ensembl release 91 [48]). Finally, the read count of each gene was obtained with HTSeq [49].

Filtering samples and genes

Low-quality samples need to be filtered out, since they might bias the results of downstream analyses. In order to assess sample quality, we calculated the number of uniquely mapped reads and of expressed genes for each sample [50]. We removed samples with < 0.3 million uniquely mapped reads or with < 4500 expressed genes (Additional file 1: Figure S1). We confirmed that these filtered samples are indeed outliers in a multidimensional scaling analysis (MDS) (Additional file 1: Figure S16). Since lowly expressed genes have a larger technical error, to minimize the technical noise, we need to remove lowly expressed genes as well. We first calculated counts per million (cpm) with the edgeR package [51] for each gene. Then, we removed genes with mean cpm across samples ≤ 1, as suggested by Lun et al. [50]. Finally, for the remaining genes, we re-transformed their cpm values to the original count values for the downstream normalization analysis. After filtering, we obtained an expression count matrix with 239 samples (Additional file 1: Figure S2) and 8004 protein-coding genes.

Normalization and batch effect correction

Because BRB-seq retains only the 3′ end of the transcript, we performed sample normalization by using quantile normalization with log transformation in the voom package [52], but without transcript length normalization. To remove potential batch effects across the four libraries, we applied the ComBat function in the sva package [53] to the normalized and log2 transformed expression data. For genes with expression values less than 0 after Combat, or with original expression values equal to 0, we change its values to 0 after Combat correction as suggested by Kolodziejczyk et al. [54].

Multidimensional scaling analysis

A number of factors could be invoked to explain the two groups observed in our multidimensional scaling analysis (MDS) (Fig. 1b), but they should also explain that only one group is structured according to the developmental time. The obvious hypothesis that they correspond to male and female embryos does not explain that structure and is also not supported by examining X/autosome gene expression ratios (Additional file 1: Figure S17). An alternative hypothesis is that samples in the small cluster are unfertilized eggs. If an egg is not fertilized, after completion of meiosis, the development will be arrested [55], but they are visually indistinguishable. This hypothesis is confirmed by at least two lines of evidence, in addition to the lack of developmental time structure. First, for expression correlation, all samples in the small cluster are highly correlated with unfertilized egg, while the correlations in the other samples gradually decrease with development (Additional file 1: Figure S3A). Second, all the samples from the small cluster are enriched with meiosis-related genes (Additional file 1: Figure S3B). Thus, we excluded the small cluster for downstream analyses, i.e., we used 150 embryos with an average of 18 individuals per developmental stage.

Metrics of expression variability

Expression variability is generally measured by the coefficient of variation (CV) [56]. However, a gene’s CV is strongly dependent on its RNA abundance (Additional file 1: Figure S4). While this is an inherent property of time-interval counting processes (such as a Poisson process), it makes the comparison of variability between different conditions difficult [54, 57]. Distance to median (DM, the distance between the squared CV of a gene and its running median) has been proposed as a variability metric that is independent of expression level [28, 54, 57]. However, the DM is still strongly negatively correlated with the mean expression level in our data (Additional file 1: Figure S5). To avoid this dependency, we developed another variability measure, the adjusted standard deviation (adjusted SD), by calculating the ratio between observed SD and expected SD. Following the same approach as Barroso et al. [58], we performed polynomial regressions to predict the expected SD from mean expression. Since the adjusted SD metric works much better than DM in terms of accounting for the confounding effects of mean expression (Additional file 1: Figure S6), we adopted it as a measure of expression variability in our study. As observed in yeasts [27, 28], we found that essential genes and hubs (proteins in the center of protein-protein interaction network) have lower expression variability than other genes (Additional file 1: Figure S7), indicating selection to reduce it. This observation provides a control that we are indeed measuring biologically relevant expression variability.

Detailed calculation of expression variability is as follows:

Adjusted SD

For each gene, we computed the standard deviation (SD) in each stage and over all stages. Then, we fitted a polynomial model to predict the global (across all stages) SD from the global mean expression. We increased the degrees of the model until there was no more significant improvement (tested with ANOVA, p < 0.05 as a significant improvement). Then, based on this best-fitting model, for each gene, we computed its predicted global SD based on its global mean expression. Finally, the adjusted SD of a gene in one stage is this gene’s SD in its corresponding stage divided by its predicted global SD. This method is derived from Barroso et al. [58], but computing adjusted SD rather than adjusted variance.

Distance to median: the distance between the squared coefficient of variation of a gene and its running median

For each gene, we computed its squared CV in each stage and over all stages. Then, we ordered genes based on their global (across all stages) mean expression. Next, we defined a series of sliding windows of 50 genes with 25 genes overlap, starting from the lowest global mean expression. Finally, the distance to median of a gene in one stage is the stage-specific log10 squared CV minus the median of global log10 squared CV in this gene’s corresponding window. R code was modified from the DM function of the scran package [50].

Bootstrap analysis

For each stage, we randomly sampled the same number of samples. Then, we computed the adjusted SD based on these random samples. We repeated the first two steps 500 times. Each time, we only kept the median of the adjusted SD for each stage. Thus, in each stage, we obtained 500 medians. Finally, we performed a Wilcoxon test to test the significance of the difference between the bootstrapped values of different stages.

ChIP-seq data analysis

Histone modification signal datasets

The signal data files of four euchromatin histone modification marks (H3K4me3, H3K9ac, and H3K27ac) at six developmental stages (0–4 h, 4–8 h, 8–12 h, 12–16 h, 16–20 h, 20–24 h) were downloaded from modENCODE [39] (NCBI GEO: GSE16013) (March 2018). The signal is smoothed, background-subtracted tag density. The signal was precomputed along the genome in 35-bp windows.

Histone modification signal for promoter

For each gene, we calculated the mean signal of its proximal promoter (2 kb upstream to 2 kb downstream for transcription start site (TSS)) by using the bedtools “map” command [59]. The TSS and transcription end site (TES) information was retrieved from Ensembl release 91 [48]. For a gene with several TSS and TES, we use its mean coordinates.

Histone modification signal Z score transformation

For each modification mark in each stage, the signal value was transformed into a Z score by subtracting the mean signal across intergenic regions and dividing by the standard deviation signal of the intergenic regions. The intergenic region was defined by removing all proximal promoter regions and gene body regions (TSS to TES) with the bedtools “subtract” command [59]. Since the three histone modification marks are largely enriched at promoter over intergenic regions in Drosophila [39], this allows to normalize between libraries. Then, for each gene, we re-calculated the mean signal (Z score) of its proximal promoter (2 kb upstream to 2 kb downstream for TSS) by using the bedtools “map” command [59].

Identification of stage specifically expressed genes

Following the same approach as previously [10], we first defined 8 stage-specific expressed artificial expression profile (Additional file 1: Figure S18A). Then, for each gene, we performed Pearson’s correlation between its real expression and this artificial expression. Finally, for each artificial expression profile, we kept genes with top 10% correlation coefficient as the corresponding stage specifically expressed genes (Additional file 1: Figure S18B).

Identification of hourglass expression variability genes

Similar to the stage specifically expressed gene identification approach, we correlated each gene’s variability profile with the median across all genes. Then, we kept genes with the top 10% correlation coefficient as the ones following the global hourglass variability profile.

Gene Ontology enrichment analysis

We performed Gene Ontology (GO) enrichment analysis for hourglass expression variability genes by using the topGO [60] R package with the “elim” algorithm.

Partial correlation

The R package “ppcor” [61] was used to compute Spearman’s partial correlation coefficient between histone modification signal and expression variability after controlling for the effect of promoter shape.

Transcriptome index analysis

A “transcriptome index” [62, 63] is a weighted mean of a feature over all genes, where the weights are the expression levels of the genes at each condition (e.g., developmental stage). The transcriptome index of phastCons was calculated as:

\( {\mathrm{TPI}}_s=\frac{\sum \limits_{i=l}^n{\mathrm{phastCons}}_i\times {e}_{is}}{\sum \limits_i^n{e}_{is}} \)

where s is the developmental stage, phastCons_i is the promoter sequence conservation score of gene i, n is the total number of genes, and e_is is the expression level (log-transformed) of gene i in developmental stage s.

Confidence interval analysis for transcriptome index

Firstly, we randomly sampled gene IDs from each original data set 10,000 times with replacement. Then, we computed transcriptome indexes for the 10,000 samples. Finally, the 95% confidence interval is defined as the range from quantile 2.5% to quantile 97.5% of the 10,000 transcriptome indexes.

Meiosis-related genes and transcription factors

The meiosis-related genes and transcription factors were downloaded from AmiGO [64] (May 2018).

Individual unfertilized eggs RNA-seq data

The normalized and log-transformed expression matrix of individual unfertilized eggs was downloaded from NCBI GEO: GSE68062 [65] (May 2018).

Promoter shape index

The promoter shape index was downloaded from [66]. (June 2019). Lower value means broader promoter.

Essential gene annotation and protein connectivity datasets

The gene essentiality and protein connectivity datasets were downloaded from OGEE v2 [67] (March 2018).

PhastCons score

The pre-computed sequence conservation score phastCons [32] of fly genome was downloaded from http://hgdownload.soe.ucsc.edu/goldenPath/dm3/phastCons15way/ (February 2018). Higher value means higher conservation.

Experimentally validated core promoter regions

Experimentally validated transcription start sites (TSSs) were downloaded from the Eukaryotic Promoter Database (EPD) [31] (May 2018). For a gene with several TSSs, we selected the most representative one (the TSS that has been validated by the largest number of samples). The core promoter region was defined as 49 bp upstream TSS to 10 bp downstream of the TSS [31]. We used EPD-defined TSSs here because they are more accurate for defining core promoters, whose function is expected to be related to sequence conservation. Whereas for histone modification signal for promoter, we used Ensembl-defined TSSs to be consistent with the source of transcription end site (TES) information, and precision was less important in defining broader proximal promoters.

Availability of data and materials

Data files and analysis scripts are available on GitHub: https://github.com/ljljolinq1010/expression-noise-across-fly-embryogenesis.

Expression datasets have been deposited to the Gene Expression Omnibus with accession number GSE128370 [68].

References

Raff RA. The shape of life: genes, development, and the evolution of animal form. Chicago: University of Chicago Press. 1996.
Duboule D. Temporal colinearity and the phylotypic progression: a basis for the stability of a vertebrate Bauplan and the evolution of morphologies through heterochrony. Development. 1994;1994:135–42.
Google Scholar
Irie N, Kuratani S. Comparative transcriptome analysis reveals vertebrate phylotypic period during organogenesis. Nat Commun. 2011;2:248.
Article PubMed CAS Google Scholar
Hu H, Uesaka M, Guo S, Shimai K, Lu T-M, Li F, et al. Constrained vertebrate evolution by pleiotropic genes. Nat Ecol Evol. 2017;1:1722–30.
Article PubMed Google Scholar
Levin M, Hashimshony T, Wagner F, Yanai I. Developmental milestones punctuate gene expression in the Caenorhabditis embryo. Dev Cell. 2012;22(5):1101–8.
Article CAS PubMed Google Scholar
Zalts H, Yanai I. Developmental constraints shape the evolution of the nematode mid-developmental transition. Nat Ecol Evol. 2017;1:0113.
Article Google Scholar
Kalinka AT, Varga KM, Gerrard DT, Preibisch S, Corcoran DL, Jarrells J, et al. Gene expression divergence recapitulates the developmental hourglass model. Nature. 2010;468(7325):811–4.
Article CAS PubMed Google Scholar
Galis F, Metz JA. Testing the vulnerability of the phylotypic stage: on modularity and evolutionary conservation. J Exp Zool. 2001;291(2):195–204.
Article CAS PubMed Google Scholar
Darwin C. The descent of man, and selection in relation to sex. Princeton: Princeton University Press. 1871.
Liu J, Robinson-Rechavi M. Adaptive evolution of animal proteins over development: support for the Darwin selection opportunity hypothesis of evo-devo. Mol Biol Evol. 2018;35(12):2862–72.
CAS PubMed PubMed Central Google Scholar
Kalinka AT, Tomancak P. The evolution of early animal embryos: conservation or divergence? Trends Ecol Evol. 2012;27(7):385–93.
Article PubMed Google Scholar
Frankel N, Davis GK, Vargas D, Wang S, Payre F, Stern DL. Phenotypic robustness conferred by apparently redundant transcriptional enhancers. Nature. 2010;466(7305):490–3.
Article CAS PubMed PubMed Central Google Scholar
Perry MW, Boettiger AN, Bothma JP, Levine M. Shadow enhancers foster robustness of Drosophila gastrulation. Curr Biol. 2010;20(17):1562–7.
Article CAS PubMed PubMed Central Google Scholar
Schor IE, Degner JF, Harnett D, Cannavò E, Casale FP, Shim H, et al. Promoter shape varies across populations and affects promoter evolution and expression noise. Nat Genet. 2017;49(4):550–8.
Article CAS PubMed Google Scholar
Tirosh I, Reikhav S, Sigal N, Assia Y, Barkai N. Chromatin regulators as capacitors of interspecies variations in gene expression. Mol Syst Biol. 2010;6:435.
Article CAS PubMed PubMed Central Google Scholar
Elowitz MB, Levine AJ, Siggia ED, Swain PS. Stochastic Gene Expression in a Single Cell. Science. 2002;297(5584):1183–6.
Kærn M, Elston TC, Blake WJ, Collins JJ. Stochasticity in gene expression: from theories to phenotypes. Nat Rev Genet. 2005;6(6):451–64.
Raj A, van Oudenaarden A. Nature, nurture, or chance: stochastic gene expression and its consequences. Cell. 2008;135(2):216–26.
Cortijo S, Aydin Z, Ahnert S, Locke JC. Widespread inter-individual gene expression variability in Arabidopsis thaliana. Mol Syst Biol. 2019;15(1):e8591.
Eling N, Morgan MD, Marioni JC. Challenges in measuring and understanding biological noise. Nat Rev Genet. 2019;20(9):536–48.
Article CAS PubMed PubMed Central Google Scholar
Becskei A, Kaufmann BB, van Oudenaarden A. Contributions of low molecule number and chromosomal positioning to stochastic gene expression. Nat Genet. 2005;37(9):937–44.
Article CAS PubMed Google Scholar
Waddington CH. Canalization of development and genetic assimilation of acquired characters. Nature. 1942;150(3811):563–5.
Article Google Scholar
Lehner B. Genes confer similar robustness to environmental, stochastic, and genetic perturbations in yeast. PLoS One. 2010;5(2):e9035.
Article PubMed PubMed Central CAS Google Scholar
Meiklejohn CD, Hartl DL. A single mode of canalization. Trends Ecol Evol. 2002;17(10):468–73.
Article Google Scholar
Alpern D, Gardeux V, Russeil J, Mangeat B, Meireles-Filho ACA, Breysse R, et al. BRB-seq: ultra-affordable high-throughput transcriptomics enabled by bulk RNA barcoding and sequencing. Genome Biol. 2019;20(1):71.
Article PubMed PubMed Central Google Scholar
Schep AN, Adryan B. A comparative analysis of transcription factor expression during metazoan embryonic development. PLoS One. 2013;8(6):e66826.
Article CAS PubMed PubMed Central Google Scholar
Lehner B. Selection to minimise noise in living systems and its implications for the evolution of gene expression. Mol Syst Biol. 2008;4:170.
Article PubMed PubMed Central Google Scholar
Newman JRS, Ghaemmaghami S, Ihmels J, Breslow DK, Noble M, DeRisi JL, et al. Single-cell proteomic analysis of S. cerevisiae reveals the architecture of biological noise. Nature. 2006;441(7095):840–6.
Article CAS PubMed Google Scholar
Tadros W, Lipshitz HD. The maternal-to-zygotic transition: a play in two acts. Development. 2009;136(18):3033–42.
Levin M, Anavy L, Cole AG, Winter E, Mostov N, Khair S, et al. The mid-developmental transition and the evolution of animal body plans. Nature. 2016;531(7596):637–41.
Article CAS PubMed PubMed Central Google Scholar
Dreos R, Ambrosini G, Périer RC, Bucher P. The Eukaryotic Promoter Database: expansion of EPDnew and new promoter analysis tools. Nucleic Acids Res. 2015;43:D92–6.
Article CAS PubMed Google Scholar
Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, Rosenbloom K, et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 2005;15(8):1034–50.
Article CAS PubMed PubMed Central Google Scholar
Benayoun BA, Pollina EA, Ucar D, Mahmoudi S, Karra K, Wong ED, et al. H3K4me3 breadth is linked to cell identity and transcriptional consistency. Cell. 2014;158(3):673–88.
Article CAS PubMed PubMed Central Google Scholar
Wu S, Li K, Li Y, Zhao T, Li T, Yang Y-F, et al. Independent regulation of gene expression level and noise by histone modifications. PLoS Comput Biol. 2017;13(6):e1005585.
Article PubMed PubMed Central Google Scholar
Nicolas D, Zoller B, Suter DM, Naef F. Modulation of transcriptional burst frequency by histone acetylation. Proc Natl Acad Sci U S A. 2018;115(27):7153–8.
Article PubMed PubMed Central CAS Google Scholar
Weinberger L, Voichek Y, Tirosh I, Hornung G, Amit I, Barkai N. Expression noise and acetylation profiles distinguish HDAC functions. Mol Cell. 2012;47(2):193–202.
Article CAS PubMed PubMed Central Google Scholar
Faure AJ, Schmiedel JM, Lehner B. Systematic analysis of the determinants of gene expression noise in embryonic stem cells. Cell Syst. 2017;5(5):471–484.e4.
Article CAS PubMed Google Scholar
Xiao L, Zhao Z, He F, Du Z. Multivariable regulation of gene expression plasticity in metazoans. Open Biol. 2019;9(12):190150.
Article PubMed PubMed Central Google Scholar
Nègre N, Brown CD, Ma L, Bristow CA, Miller SW, Wagner U, et al. A cis-regulatory map of the Drosophila genome. Nature. 2011;471(7339):527–31.
Nozaki T, Yachie N, Ogawa R, Kratz A, Saito R, Tomita M. Tight associations between transcription promoter type and epigenetic variation in histone positioning and modification. BMC Genomics. 2011;12(1):416.
Sigalova OM, Shaeiri A, Forneris M, Furlong EE, Zaugg JB. Predictive features of gene expression variation reveal mechanistic link with differential expression. Mol Syst Biol. 2020;16(8):e9539.
Karlić R, Chung H-R, Lasserre J, Vlahovicek K, Vingron M. Histone modification levels are predictive for gene expression. Proc Natl Acad Sci U S A. 2010;107(7):2926–31.
Article PubMed PubMed Central Google Scholar
Gout J-F, Kahn D, Duret L, Paramecium Post-Genomics Consortium PP-G. The relationship among gene expression, the evolution of gene dosage, and the rate of protein evolution. PLoS Genet. 2010;6(5):e1000944.
Article PubMed PubMed Central CAS Google Scholar
Coronado-Zamora M, Salvador-Martínez I, Castellano D, Barbadilla A, Salazar-Ciudad I. Adaptation and conservation throughout the Drosophila melanogaster life-cycle. Genome Biol Evol. 2019;11(5):1463–82.
Article CAS PubMed PubMed Central Google Scholar
Piasecka B, Lichocki P, Moretti S, Bergmann S, Robinson-Rechavi M. The hourglass and the early conservation models—co-existing patterns of developmental constraints in vertebrates. PLoS Genet. 2013;9(4):e1003476.
Article CAS PubMed PubMed Central Google Scholar
Wagner GP, Booth G, Bagheri-Chaichian H. A Population Genetic Theory of Canalization. Evolution. 1997;51(2):329-47.
Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29(1):15–21.
Article CAS PubMed Google Scholar
Zerbino DR, Achuthan P, Akanni W, Amode MR, Barrell D, Bhai J, et al. Ensembl 2018. Nucleic Acids Res. 2018;46(D1):D754–61.
Article CAS PubMed Google Scholar
Anders S, Pyl PT, Huber W. HTSeq--a Python framework to work with high-throughput sequencing data. Bioinformatics. 2015;31(2):166–9.
Article CAS PubMed Google Scholar
Lun ATL, McCarthy DJ, Marioni JC. A step-by-step workflow for low-level analysis of single-cell RNA-seq data. F1000Res. 2016;5:2122.
PubMed PubMed Central Google Scholar
Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26(1):139–40.
Article CAS PubMed Google Scholar
Law CW, Chen Y, Shi W, Smyth GK. voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biol. 2014;15(2):R29.
Article PubMed PubMed Central CAS Google Scholar
Johnson WE, Li C, Rabinovic A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics. 2007;8(1):118–27.
Article PubMed Google Scholar
Kolodziejczyk AA, Kim JK, Tsang JCH, Ilicic T, Henriksson J, Natarajan KN, et al. Single cell RNA-sequencing of pluripotent states unlocks modular transcriptional variation. Cell Stem Cell. 2015;17(4):471–85.
Article CAS PubMed PubMed Central Google Scholar
Avilés-Pagán EE, Orr-Weaver TL. Activating embryonic development in Drosophila. Semin Cell Dev Biol. 2018;84:100–10.
Article PubMed PubMed Central CAS Google Scholar
Raser JM, O’Shea EK. Noise in gene expression: origins, consequences, and control. Science. 2005;309(5743):2010–3.
Article CAS PubMed PubMed Central Google Scholar
Tung P-Y, Blischak JD, Hsiao CJ, Knowles DA, Burnett JE, Pritchard JK, et al. Batch effects and the effective design of single-cell gene expression studies. Sci Rep. 2017;7:39921.
Article CAS PubMed PubMed Central Google Scholar
Barroso GV, Puzovic N, Dutheil JY. The evolution of gene-specific transcriptional noise is driven by selection at the pathway level. Genetics. 2018;208(1):173–89.
Article CAS PubMed Google Scholar
Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26(6):841–2.
Article CAS PubMed PubMed Central Google Scholar
Alexa A, Rahnenfuhrer J, Lengauer T. Improved scoring of functional groups from gene expression data by decorrelating GO graph structure. Bioinformatics. 2006;22(13):1600–7.
Article CAS PubMed Google Scholar
Kim S. ppcor: an R package for a fast calculation to semi-partial correlation coefficients. Commun Stat Appl methods. 2015;22(6):665–74.
PubMed PubMed Central Google Scholar
Liu J, Robinson-Rechavi M. Developmental constraints on genome evolution in four bilaterian model species. Genome Biol Evol. 2018;10(9):2266–77.
Article CAS PubMed PubMed Central Google Scholar
Domazet-Loso T, Tautz D. A phylogenetically based transcriptome age index mirrors ontogenetic divergence patterns. Nature. 2010;468(7325):815–8.
Article CAS PubMed Google Scholar
Carbon S, Ireland A, Mungall CJ, Shu S, Marshall B, Lewis S, et al. AmiGO: online access to ontology and annotation data. Bioinformatics. 2009;25(2):288–9.
Article CAS PubMed Google Scholar
Paris M, Villalta JE, Eisen MB, Lott SE. Sex bias and maternal contribution to gene expression divergence in Drosophila blastoderm embryos. PLoS Genet. 2015;11(10):e1005592.
Hoskins RA, Landolin JM, Brown JB, Sandler JE, Takahashi H, Lassmann T, et al. Genome-wide analysis of promoter architecture in Drosophila melanogaster. Genome Res. 2011;21(2):182–92.
Article CAS PubMed PubMed Central Google Scholar
Chen W-H, Lu G, Chen X, Zhao X-M, Bork P. OGEE v2: an update of the online gene essentiality database with special focus on differentially essential genes in human cancer cell lines. Nucleic Acids Res. 2017;45(D1):D940–4.
Article CAS PubMed Google Scholar
Liu J, Frochaux M, Gardeux V, Deplancke B, Robinson-Rechavi M. Inter-embryo gene expression variability recapitulates the hourglass pattern of evo-devo. NCBI accession number GSE128370. 2020. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE128370.

Download references

Acknowledgements

We thank Virginie Braman for the help with the embryo collection and library preparation. We thank Daniel Alpern for the help with the BRB-seq technology. We thank Richard Benton, David Garfield, Laurent Keller, and Gunter Wagner for critically reading and commenting on the manuscript. We thank Eileen Furlong and Mattia Forneris for the help with understanding the promoter shape index. We thank Andrea Komljenovic and other members of the Robinson-Rechavi lab for the helpful discussions. Part of the computations were performed at the Vital-IT (http://www.vital-it.ch) Center for high-performance computing of the SIB Swiss Institute of Bioinformatics.

Funding

JL and MRR are supported by the Swiss National Science Foundation grants 31003A_153341/1 and 31003A_173048. MRR and BD are supported by SystemsX.ch grant AgingX.

Author information

Authors and Affiliations

Department of Ecology and Evolution, University of Lausanne, 1015, Lausanne, Switzerland
Jialin Liu & Marc Robinson-Rechavi
Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland
Jialin Liu, Michael Frochaux, Vincent Gardeux, Bart Deplancke & Marc Robinson-Rechavi
Laboratory of Systems Biology and Genetics, Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
Michael Frochaux, Vincent Gardeux & Bart Deplancke

Authors

Jialin Liu
View author publications
You can also search for this author in PubMed Google Scholar
Michael Frochaux
View author publications
You can also search for this author in PubMed Google Scholar
Vincent Gardeux
View author publications
You can also search for this author in PubMed Google Scholar
Bart Deplancke
View author publications
You can also search for this author in PubMed Google Scholar
Marc Robinson-Rechavi
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

JL designed the work with input from MRR, MF, and BD. MF led all experiments. JL performed all the computational analyses. VG and BD contributed expertise in the BRB-seq experiments. JL and MRR interpreted the results with input from all the other authors. JL wrote the first draft of the paper. JL and MRR finalized the paper with input from all the other authors. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Jialin Liu or Marc Robinson-Rechavi.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Competing interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Additional file 1:

Figure S1-S18. Figure S1. Relationship between uniquely mapped reads and expressed genes. Figure S2. Proportion of retained samples in each development stage. Figure S3. Evidence that the samples from the small cluster are unfertilized eggs. Figure S4. Relationship between average expression and coefficient of variation at each stage. Figure S5. Relationship between average expression and distance to median at each stage. Figure S6. Relationship between average expression and adjusted SD at each stage. Figure S7. Relationship between expression variability and protein importance. Figure S8. Variation of expression variability across development using alternate measures of variability. Figure S9. Bootstrap analysis of the variability calculation. Figure S10. Random sampling analysis of expression variability. Figure S11. Expression variability pattern for genes without maternally expressed genes. Figure S12. Expression variability pattern for genes expressed at all stages. Figure S13. Promoter sequence conservation across embryogenesis for different definitions of promoter width. Figure S14. Relationship between promoter shape index and histone modification signal. Figure S15. Relationship between promoter shape index and expression variability. Figure S16. Multidimensional scaling analysis for all samples. Figure S17. Mapping of X/autosome gene expression ratios to the multidimensional scaling analysis plot. Figure S18. Detection of stage specific genes.

Additional file 2:

Tables. S1-S13. Table S1-S12. multiple test corrected p-values of variability (adjusted SD) comparison between any two stages. Table S13. RNA-seq library information.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Cite this article

Liu, J., Frochaux, M., Gardeux, V. et al. Inter-embryo gene expression variability recapitulates the hourglass pattern of evo-devo. BMC Biol 18, 129 (2020). https://doi.org/10.1186/s12915-020-00842-z

Download citation

Received: 15 April 2020
Accepted: 07 August 2020
Published: 19 September 2020
DOI: https://doi.org/10.1186/s12915-020-00842-z

Inter-embryo gene expression variability recapitulates the hourglass pattern of evo-devo

Abstract

Background

Results

Conclusions

Similar content being viewed by others

Genetic variants regulating expression levels and isoform diversity during embryogenesis

Evolutionary Developmental Biology and Its Contribution to a New Synthetic Theory

Shifting the Black Box: Approaches to the Development and Evolution of the Vertebrate Mesoderm

Background

Results

Single embryo RNA-seq profile over embryogenesis

The phylotypic stage is robust to stochastic perturbations on gene expression

Testing potential confounding factors of hourglass variability pattern

Mutational robustness shapes the hourglass expression divergence pattern

Histone modifications and expression variability

Histone modifications and promoter sequence conservation

Higher histone modification signal at the phylotypic stage

Discussion

Conclusions

Materials and methods

Embryo collection and RNA extraction

Bulk RNA barcoding and sequencing

Library preparation

Sequencing

RNA-seq analysis

Generating expression matrix

Filtering samples and genes

Normalization and batch effect correction

Multidimensional scaling analysis

Metrics of expression variability

Adjusted SD

Distance to median: the distance between the squared coefficient of variation of a gene and its running median

Bootstrap analysis

ChIP-seq data analysis

Histone modification signal datasets

Histone modification signal for promoter

Histone modification signal Z score transformation

Identification of stage specifically expressed genes

Identification of hourglass expression variability genes

Gene Ontology enrichment analysis

Partial correlation

Transcriptome index analysis

Confidence interval analysis for transcriptome index

Meiosis-related genes and transcription factors

Individual unfertilized eggs RNA-seq data

Promoter shape index

Essential gene annotation and protein connectivity datasets

PhastCons score

Experimentally validated core promoter regions

Availability of data and materials

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Contributions

Corresponding authors

Ethics declarations

Ethics approval and consent to participate

Competing interests

Additional information

Publisher’s Note

Supplementary information

Additional file 1:

Additional file 2:

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation