Correlates of evolutionary rates in the murine sperm proteome
Protein-coding genes expressed in sperm evolve at different rates. To gain deeper insight into the factors underlying this heterogeneity we examined the relative importance of a diverse set of previously described rate correlates in determining the evolution of murine sperm proteins.
Using partial rank correlations we detected several major rate indicators: Phyletic gene age, numbers of protein-protein interactions, and survival essentiality emerged as particularly important rate correlates in murine sperm proteins. Tissue specificity, numbers of paralogs, and untranslated region lengths also correlate significantly with sperm genes’ evolutionary rates, albeit to a lesser extent. Multifunctionality, coding sequence or average intron lengths, and mean expression level have insignificant or virtually no independent effects on evolutionary rates in murine sperm genes. Gene ontology enrichment analyses of three equally sized murine sperm protein groups classified based on their evolutionary rates indicate strongest sperm-specific functional specialization in the most quickly evolving gene class.
We propose a model according to which slowly evolving murine sperm proteins tend to be constrained by factors such as survival essentiality, network connectivity, and/or broad expression. In contrast, evolutionary change may arise especially in less constrained sperm proteins, which might, moreover, be prone to specialize to reproduction-related functions. Our results should be taken into account in future studies on rate variations of reproductive genes.
Keywords: Protein evolution, Tissue specificity, Protein essentiality, Sperm proteins, Protein-protein interactions, Gene age, Multifunctionality, Paralogs, Gene compactness, dN/dS
DAVID: Database for Annotation, Visualization and Integrated Discovery
dN: nonsynonymous substitution rate
dS: synonymous substitution rate
I2D: Interologous Interaction Database
MGI: Mouse Genome Informatics
τ: tissue specificity index
The spermatozoon is a highly specialized cell type indispensable for male fitness. Different types of postmating sexual selection are thought to drive the diversification of sperm anatomy and function across taxa. For instance, sperm competition, the competition between sperm from several males for fertilization of a female’s ova, is considered to increase sperm viability in Drosophila and numbers of sperm produced in mice [3, 4]. In primates, sperm midpiece volume varies depending on mating systems and thus presumably in response to different levels of sexual selection (for rodents, see ). Furthermore, sperm competition and other types of sexual selection, such as cryptic female choice (see, e.g., ) and sexual conflict (reviewed in ), may exert selective pressures not only on sperm form or production, but also on the proteins constituting these cells. Accordingly, adaptive evolution of male reproductive proteins has been observed in a wide range of taxa (see, e.g., [9, 10, 11, 12]). However, studies with proteome-wide perspectives demonstrated that despite the rapid evolution of certain reproduction-related genes, the majority of male reproductive proteins are conserved [13, 14]. This variation in sequence evolution of male reproductive proteins is moreover characterised by temporal and spatial compartmentalization of sperm or testis proteomes, with higher rates in later steps of spermatogenesis and in proteins acting proximate to fertilization (see, e.g., [15, 16, 17, 18]). Dorus et al. hypothesized that the strong conservation of many sperm proteins might rely on either their involvement in critical cellular functions or their expression in nonreproductive tissues, which generates pleiotropic constraints. Consistent with the second prediction of Dorus et al., other authors indeed found rate acceleration of testis-specific genes in Drosophila compared with genes also expressed in other tissues; in the mouse sperm proteome, however, testis overexpression was insufficient to explain evolutionary rate variation.
Hence, although it may be assumed that all protein-coding genes expressed in sperm cells could potentially experience sexual selection, adaptive evolution affects only a relatively small fraction of them. The above mentioned studies were able to partially explain rate variations of male reproductive proteins. However, the influence of a larger set of potential rate determinants on the evolution of mammalian sperm proteins has not yet been studied.
In evolutionary biology, the quest for such rate determinants has long been a central issue and several correlates of evolutionary rates have been identified (reviewed in, e.g., [20, 21]): The evolutionary rate of a protein is thought to be mainly shaped by its level of essentiality and functional constraint . Essentiality manifests in the fitness effect of gene deletion and has been shown to correlate negatively with evolutionary rates (see, e.g., ). Functional constraint refers to the compatibility of substitutions with a protein’s function (see, e.g., ), which, however, is more difficult to quantify than essentiality. Still, several correlates of evolutionary rates, such as connectivity in protein-protein interaction (PPI) networks , multifunctionality , and expression breadth [26, 27] have been established, all of which might relate to pleiotropic [28, 29] and/or structural and functional constraints [30, 31]. Moreover, other variables including phyletic gene age [32, 33], gene compactness , expression level [35, 36, 37] or numbers of paralogs [38, 39] have been identified as substantial rate indicators of protein-coding genes.
In the present study we aim to uncover factors influencing rate variations of sperm proteins. We intend to unravel the relative importance of several proposed variables as correlates of sperm proteins’ evolutionary rates. Based on a published murine sperm proteome  we used zero-order and partial rank correlations to disentangle the effects of individual gene properties on dN/dS. This latter measure is the ratio of nonsynonymous (dN) and synonymous substitution rates (dS), with dN/dS = 1, < 1, and > 1 indicating neutral evolution, purifying, and positive selection, respectively. We included essentiality, multifunctionality, number of PPIs, tissue specificity (τ), mean expression level inferred over 22 mouse tissues, phyletic gene age, number of paralogs, and different measures of gene compactness (coding sequence (CDS) length, 5′ and 3′ untranslated region (UTR) length, average intron length) as potential correlates of sperm proteins’ dN/dS values. Additionally, we retrieved gene ontology (GO) information for sperm protein groups with different rates of sequence evolution. Based on our results, we propose a model for sperm protein evolution. Our findings underscore the relevance of tissue- or cell type-specific analyses of putative rate determinants.
Ensembl Gene IDs corresponding to proteins expressed in murine epididymal sperm according to Chauvin et al. [40] were identified using UniProt’s ID mapping tool (19 March 2015) and Ensembl BioMart version 79 (for details, see Additional file 1: Supplementary Methods).
For each sperm gene, we extracted pairwise dN/dS estimates from the orthologues view pages in Ensembl version 79. On these pages, dN/dS values potentially biased by dS saturation are hidden (http://mar2015.archive.ensembl.org/info/genome/compara/homology_method.html), so that our data should be largely unaffected by such bias. We considered dN/dS values which had been estimated between 1-to-1 orthologues of mouse (Mus musculus; genome assembly GRCm38.p3) and rat (Rattus norvegicus; genome assembly Rnor_5.0) using CodeML from the PAML package .
Descriptions of how we determined the number of paralogs, CDS, average intron as well as 5′ and 3′ UTR length for each gene can be found in Additional file 1: Supplementary Methods.
Genes coding for the sampled sperm proteins were classified into six age groups corresponding to their earliest phyletic origin and coded as numbers descending towards younger ages: cellular organisms (6), Eukaryota (5), Fungi/Metazoa (4), Metazoa (3), Chordata (2), Mammalia (1). Age classifications were taken from supplementary data of Chen et al.  and had been estimated applying a bit score cutoff of 80 in FASTA (for more details, see supplementary methods of ).
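The age coding described above can be written down as a small lookup, shown here as a minimal Python sketch. The dictionary and function names are hypothetical; only the six classes and their codes are taken from the text. Because larger codes denote older origins, a negative rank correlation between this code and dN/dS means that older genes evolve more slowly.

```python
# Age classes and numeric codes as given in the text (6 = oldest, 1 = youngest).
# The names AGE_CODE and age_code are hypothetical and used only for illustration.
AGE_CODE = {
    "cellular organisms": 6,
    "Eukaryota": 5,
    "Fungi/Metazoa": 4,
    "Metazoa": 3,
    "Chordata": 2,
    "Mammalia": 1,
}


def age_code(phyletic_origin):
    """Return the numeric age code for the earliest phyletic origin of a gene."""
    return AGE_CODE[phyletic_origin]


print(age_code("Eukaryota"))  # 5
```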
Numbers of direct and indirect PPI partners relating to Swiss-Prot accession numbers were extracted from downloadable data of I2D (Interologous Interaction Database; http://ophid.utoronto.ca/) version 2.9 (for details, see Additional file 1: Supplementary Methods).
For each gene, the tissue specificity index τ  and the mean expression level were gathered from supplementary data of Kryuchkova-Mostacci and Robinson-Rechavi . The index τ ranges from 0 to 1, with larger values indicating higher tissue specificity . Both τ and mean expression level had originally been calculated based on ENCODE  expression data comprising 22 mouse tissues .
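For orientation, the index can be sketched in a few lines of Python, assuming the commonly used definition τ = Σ(1 − x_i/x_max)/(n − 1) over n tissues. The function name and the example profiles are hypothetical. Any preprocessing of the expression values (for example, normalization or log transformation) applied in the cited supplementary data is not reproduced here.

```python
def tau(expression):
    """Tissue specificity index for one gene (assumed standard definition).

    expression: non-negative expression values, one per tissue.
    Returns 0 for perfectly uniform expression and 1 for expression
    restricted to a single tissue.
    """
    n = len(expression)
    if n < 2:
        raise ValueError("need at least two tissues")
    x_max = max(expression)
    if x_max <= 0:
        raise ValueError("gene is not expressed in any tissue")
    return sum(1 - x / x_max for x in expression) / (n - 1)


# Hypothetical expression profiles over 22 tissues
broad = [10.0] * 22               # same level everywhere
specific = [0.0] * 21 + [10.0]    # expressed in one tissue only
print(tau(broad), tau(specific))  # 0.0 1.0
```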
In order to measure multifunctionality, we assigned numbers of biological processes per protein  from GOSlim generic annotations to each Swiss-Prot ID (for details, see Additional file 1: Supplementary Methods).
MGI (Mouse Genome Informatics) IDs corresponding to Ensembl Gene IDs were identified via Ensembl BioMart version 79. Based on downloadable data from MGI (version 5.22; http://www.informatics.jax.org/; as of 19 August 2015), we determined the knockout phenotype for each gene in our dataset. We exclusively considered homozygous or, in the case of X-chromosomally encoded genes, hemizygous null alleles (with allele attributes containing the term “Null/knockout”) generated by homologous recombination (allele type “Targeted”). Furthermore, we incorporated only alleles affecting single genes. Essential genes were those associated with any of the MP (Mammalian Phenotype) IDs summarized under “preweaning lethality” (MP:0010770) or “lethality at weaning” (MP:0008569). All other genes with known phenotypes conforming to the criteria outlined above were classified as “nonessential”. Genes without such known phenotypes were excluded from analyses involving essentiality.
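The classification rules above can be summarized in a minimal Python sketch. The record layout (dictionary keys such as `allele_type` or `mp_terms`) is hypothetical and does not mirror the MGI download format. The sketch also assumes that each allele's phenotype terms have already been expanded to include their parent terms, so that any term summarized under MP:0010770 or MP:0008569 is represented by one of these two IDs.

```python
# MP IDs named in the text: preweaning lethality, lethality at weaning
LETHAL_TERMS = {"MP:0010770", "MP:0008569"}


def is_eligible(allele):
    """Check the allele criteria from the text (hypothetical record layout)."""
    zygosity_ok = allele["zygosity"] == "homozygous" or (
        allele["zygosity"] == "hemizygous" and allele["chromosome"] == "X"
    )
    return (
        allele["allele_type"] == "Targeted"
        and "Null/knockout" in allele["attributes"]
        and allele["genes_affected"] == 1
        and zygosity_ok
    )


def classify_essentiality(alleles):
    """Return 'essential', 'nonessential', or None for one gene.

    alleles: phenotype-annotated allele records of the gene. None means
    no eligible knockout phenotype is known, so the gene would be left
    out of analyses involving essentiality.
    """
    eligible = [a for a in alleles if is_eligible(a)]
    if not eligible:
        return None
    if any(LETHAL_TERMS & set(a["mp_terms"]) for a in eligible):
        return "essential"
    return "nonessential"


# Hypothetical record for illustration
example = {
    "allele_type": "Targeted",
    "attributes": ["Null/knockout"],
    "genes_affected": 1,
    "zygosity": "homozygous",
    "chromosome": "5",
    "mp_terms": ["MP:0010770"],
}
print(classify_essentiality([example]))  # essential
```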
Our statistical analyses conform to the principles applied in related exploratory studies (see, e.g., [34, 48, 49]). Spearman’s rank correlations and partial Spearman’s rank correlations were computed with MATLAB version R2015b. Genes lacking information on any variable were removed from the dataset, so that we included only those for which all variables were available. As many genes had to be excluded from analyses due to missing information on knockout phenotypes, we computed all correlations twice (except for those involving essentiality): in a dataset with the essentiality variable (681 proteins) and in a dataset without it (1557 proteins).
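The idea behind a partial rank correlation can be illustrated with a minimal, dependency-free Python sketch. This is not the MATLAB code used for the analyses. It assumes the standard approach of rank-transforming every variable and reading the partial correlation off the inverse of the rank correlation matrix. All function names and the toy data are hypothetical, and significance testing is omitted.

```python
import math


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(a, b):
    """Zero-order Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


def invert(matrix):
    """Matrix inverse by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_spearman(columns, i, j):
    """Rank correlation of columns[i] and columns[j], controlling for all others."""
    ranked = [ranks(c) for c in columns]
    k = len(ranked)
    corr = [[pearson(ranked[a], ranked[b]) for b in range(k)] for a in range(k)]
    prec = invert(corr)
    return -prec[i][j] / math.sqrt(prec[i][i] * prec[j][j])


# Hypothetical toy data: x and y both follow z closely
z = [1, 2, 3, 4, 5, 6, 7, 8]
x = [2, 1, 4, 3, 6, 5, 8, 7]
y = [1, 3, 2, 4, 5, 7, 6, 8]
print(spearman(x, y), partial_spearman([x, y, z], 0, 1))
```

In the toy data, the zero-order correlation between x and y is largely inherited from z, so the partial coefficient differs markedly from it. The same logic is why a property such as multifunctionality can correlate with dN/dS in zero-order analyses yet show no independent effect once the other gene properties are controlled for.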
To assess whether functional trends depend on evolutionary rate, we divided our protein sample (without essentiality data; n = 1557) into three equally sized bins of 519 proteins according to their genes’ dN/dS estimates: bin 1 (“low dN/dS”; dN/dS ≤ 0.0362); bin 2 (“medium dN/dS”; dN/dS 0.03633–0.12561); bin 3 (“high dN/dS”; dN/dS ≥ 0.12574). For each bin, we performed an enrichment analysis by functional annotation clustering based on GOTERM_BP_FAT using the Database for Annotation, Visualization and Integrated Discovery (DAVID; version 6.7; https://david.ncifcrf.gov/home.jsp; [50, 51]). To this end, we entered Ensembl Gene IDs, all of which could be matched to DAVID IDs, and specified the classification stringency as “high”.
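The binning step can be sketched as follows in Python. The function name, gene identifiers, and dN/dS values are hypothetical; the sketch only shows how sorting by dN/dS and cutting into thirds yields three groups of 519 genes from 1557. How ties at the bin boundaries were handled in the original analysis is not stated, so here they simply follow the sort order.

```python
def split_into_bins(dnds_by_gene, n_bins=3):
    """Sort genes by dN/dS and cut them into n_bins equally sized groups."""
    genes = sorted(dnds_by_gene, key=dnds_by_gene.get)
    if len(genes) % n_bins:
        raise ValueError("number of genes is not divisible by n_bins")
    size = len(genes) // n_bins
    return [genes[k * size:(k + 1) * size] for k in range(n_bins)]


# Hypothetical gene IDs and dN/dS values
toy = {f"gene{i}": i / 1557 for i in range(1557)}
low, medium, high = split_into_bins(toy)
print(len(low), len(medium), len(high))  # 519 519 519
```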
Results and discussion
Table 1 Spearman’s rank correlations between dN/dS and each gene property. Columns: ρ with dN/dS (n = 681); ρ with dN/dS (n = 1557).
Additionally, we computed partial rank correlations between each gene property and either dN or dS calculated for orthologous sequences of mouse and rat (see Additional file 1: Supplementary Methods). Results of these analyses demonstrate that the patterns observed for dN/dS are either driven by dN or a combination of dN and dS, but none of the significant partial correlations with dN/dS can solely be traced back to the impact of dS (see Additional file 1: Tables S1 and S2). Furthermore, results of partial rank correlations including dN/dS remained qualitatively unchanged when obtained in two datasets (with or without the essentiality variable) after exclusion of genes with signals of positive selection according to CodeML analyses (see Additional file 1: Supplementary Methods and Table S3). It can, thus, overall be concluded that none of our observations is the result of an overabundance of positively selected genes.
UTR and intron length
Liao et al.  reported that of the factors analysed in their study, gene compactness in terms of UTR and average intron length was the most important determinant of evolutionary rate in murine protein-coding genes. They found significant negative correlations between dN/dS and both UTR length and average intron length, whereas the zero-order correlation between evolutionary rate and CDS length was nonsignificant. In our sperm gene dataset we also detected significant negative partial correlations between pairwise dN/dS estimates and both 5′ (681 genes: Spearman’s partial ρ = −0.105, p < 0.01; 1557 genes: Spearman’s partial ρ = −0.101, p < 0.001) and 3′ UTR lengths (681 genes: Spearman’s partial ρ = −0.159, p < 0.001; 1557 genes: Spearman’s partial ρ = −0.122, p < 0.001) (Figs. 2 and 3). Untranslated regions are crucial for posttranscriptional regulation implemented via diverse mechanisms most of which utilize cis- and trans-acting elements (see e.g., [53, 54]). Against this background, genes with tightly adjusted expression patterns may have a tendency to contain more binding sites for regulatory molecules, which might entail longer UTRs. Accordingly, Cheng et al.  uncovered that human and mouse genes targeted by more distinct types of miRNAs (microRNAs) have longer 3′ UTRs and evolve at lower rates. This pattern applies to genes with critical functions whose mRNAs’ translation must be precisely adjusted to ensure correct temporal and spatial expression. Housekeeping genes, however, are less regulated by miRNAs and have short 3′ UTRs, but are also slowly evolving . Such opposing regulation patterns via UTR binding elements might have an impact on the correlations between UTR lengths and dN/dS. These correlations could presumably be stronger or weaker in different subsamples of genes with or without spatial or temporal expression variation, according to the patterns outlined above .
In contrast, average intron length showed no significant partial correlations with dN/dS in the current study (681 genes: Spearman’s partial ρ = 0.041, p > 0.05; 1557 genes: Spearman’s partial ρ = 0.028, p > 0.05) (see Figs. 2 and 3). This result disagrees with findings of previous investigations on mouse genes which revealed significant negative correlations [34, 56]. But as both partial and zero-order correlations (see Table 1, Figs. 2 and 3) were nonsignificant, average intron length does not seem to be an important rate indicator in our sperm protein dataset.
The partial correlation between CDS length and dN/dS was significant in the larger dataset of 1557 genes (Spearman’s partial ρ = 0.087, p < 0.001), but nonsignificant in the more restricted set (n = 681; Spearman’s partial ρ = 0.055, p > 0.05). In both cases, however, the partial correlation coefficient indicated a positive relationship between evolutionary rate and sequence length (see Figs. 2 and 3). The interplay of these two variables has been studied in different taxa, but the results were contradictory: While some authors detected no significant correlation between rates of sequence evolution and protein or CDS length (see, e.g., ), others reported significant positive (see, e.g., [49, 57, 58]) or negative correlations (see, e.g., [39, 59]). Chang and Liao proposed that the positive correlation between dN/dS and CDS length might rely on lower domain density and thus a lower percentage of functionally important residues in longer proteins. However, they concluded that this explanation is insufficient to fully account for this correlation, at least in the flagellated algae studied. Based on the current data, we are unable to definitively resolve the relation between CDS lengths and evolutionary rates of murine sperm proteins. But regardless of whether CDS lengths and evolutionary rates are related, other gene properties studied herein are considerably more strongly correlated with dN/dS (see Figs. 2 and 3).
Pleiotropy and essentiality
We measured different aspects of pleiotropy: degree of expressional tissue bias (τ), numbers of PPIs, and multifunctionality. The latter, defined as the number of biological processes a protein is involved in , correlated negatively with evolutionary rates of human genes in a study by Podder et al. . We found nonsignificant positive partial correlations (681 genes: Spearman’s partial ρ = 0.020, p > 0.05; 1557 genes: Spearman’s partial ρ = 0.042, p > 0.05) (Figs. 2 and 3) between multifunctionality and dN/dS, while the corresponding zero-order correlations were negative and significant (681 genes: Spearman’s ρ = −0.099, p < 0.01; 1557 genes: Spearman’s ρ = −0.051, p < 0.05) (Table 1). Together, these results illustrate that multifunctionality and sequence evolution of murine sperm proteins are largely unrelated when controlling for all other gene properties.
In contrast, we observed a positive partial correlation between tissue specificity index τ and dN/dS, which was significant in both datasets (681 genes: Spearman’s partial ρ = 0.124, p < 0.01; 1557 genes: Spearman’s partial ρ = 0.141, p < 0.001) (Figs. 2 and 3). This index ranges between 0 and 1, with higher values indicating more tissue-biased expression . Our observation of stronger purifying selection in genes expressed broadly is in accordance with numerous previous studies (see, e.g., [27, 60]). The same is true for number of PPIs: Results of partial correlations between evolutionary rates and this property indicated that proteins with higher connectivity in PPI networks tend to evolve more slowly (681 genes: Spearman’s partial ρ = −0.177, p < 0.001; 1557 genes: Spearman’s partial ρ = −0.229, p < 0.001) (Figs. 2 and 3), a pattern described for various taxa (see, e.g., [56, 61, 62]). This connection might be ascribed to the greater portion of functional sites within highly connected proteins . Additionally, deleterious mutations within proteins expressed in a wide range of tissues or having many interactors should have more serious consequences than such mutations in other proteins [27, 63]. This circumstance could also restrain their sequence evolution and might moreover underlie the correlations of tissue specificity and number of PPIs with essentiality (681 proteins; correlation between essentiality and τ: Spearman’s ρ = −0.318, p < 0.001; correlation between essentiality and number of PPIs: Spearman’s ρ = 0.256, p < 0.001; Fig. 1 and Additional file 1: Figure S1).
The relationship between dN/dS and survival essentiality was exclusively explored in the restricted dataset of 681 sperm protein-coding genes and we found a significant negative partial correlation (Spearman’s partial ρ = −0.194, p < 0.001; Fig. 2). In 1977 Wilson et al.  predicted that genes indispensable for survival or reproduction should evolve more slowly than nonessential genes. However, this postulate remained disputed after initial studies yielded contradictory results (see, e.g., [64, 65]), although the slower evolution of essential proteins could be confirmed in many following investigations (see, e.g., [34, 66]).
Overall, murine sperm genes which are indispensable for survival, broadly expressed, and/or whose encoded proteins are highly connected in PPI networks tend to evolve more slowly than other genes. Zero-order correlations highlight the interrelatedness among these variables (Fig. 1 and Additional file 1: Figure S1). Numbers of PPIs and gene essentiality are among the strongest independent rate correlates and τ appears to be an important determinant of dN/dS, too (Figs. 2 and 3); but beyond that, numbers of PPIs, survival essentiality, and expression breadth could influence evolutionary rate together with further properties as a compound factor which Choi and Hannenhalli  called the “function (fitness)-centred” variable.
Several studies found gene expression level to be an important determinant of evolutionary rates, especially in yeast [35, 37, 68, 69] and bacteria , but also in multicellular organisms [60, 70]. However, the correlation between rates of sequence evolution and expression level is rather weak in mammals , especially if other variables are controlled for  (see also ). Instead, breadth of expression seems to be a more prominent factor affecting mammalian protein evolution [34, 72]. Results of current analyses on murine sperm proteins underscore the greater importance of expression breadth than expression level in determining rates of protein evolution in mammalian species: While negative zero-order correlations between mean expression level of protein-coding genes and dN/dS were significant (681 genes: Spearman’s ρ = −0.259, p < 0.001; 1557 genes: Spearman’s ρ = −0.246, p < 0.001) (Table 1), the equivalent partial correlations became weaker (in the dataset of 1557 proteins; Spearman’s partial ρ = −0.060; p < 0.05; Fig. 3) or even nonsignificant (in the dataset of 681 proteins; Spearman’s partial ρ = −0.045; p > 0.05; Fig. 2). In contrast, the tissue specificity index τ was significantly positively correlated with evolutionary rate even when confounding factors were controlled for (see above; Figs. 2 and 3).
Feyertag et al.  discovered that the anticorrelation between evolutionary rate and expression level cannot be found in secreted proteins of diverse taxa. To investigate if the marginal or nonsignificant correlations between dN/dS and expression level observed in our two datasets relied on an enrichment of secreted proteins, we excluded these proteins and reran the analyses (see Additional file 1: Supplementary Methods). Results of partial correlations without secreted proteins remained qualitatively unchanged (see Additional file 1: Table S4). We conclude that our results are independent of the amount of secreted proteins. Hence, expression breadth remains the more relevant correlate of evolutionary rates in the murine sperm proteome compared with expression level.
Evolutionary gene properties: Numbers of paralogs and gene age
Gene duplication may have two opposing effects on evolutionary rates. A phase of fast evolution seems to follow immediately after a duplication event, presumably due to relaxed constraints and/or the action of positive selection. Later, if both copies are preserved in the genome, increasing constraints or functional relevance account for a slow-down of sequence evolution. In our data, the latter effect was evident in significant negative partial correlations between the number of paralogs and dN/dS in both datasets (Figs. 2 and 3). This apparently reflects that many of the paralogs studied herein are rather ancient, so that signatures of initial rate acceleration have already been overlaid by subsequent sequence conservation. The varying strength of correlation in the two datasets (681 genes: Spearman’s partial ρ = −0.162; p < 0.001; 1557 genes: Spearman’s partial ρ = −0.091; p < 0.001) might be a consequence of different sample compositions and sizes. However, the results altogether reveal a negative relationship between dN/dS and numbers of paralogs.
In the current study of murine sperm proteins, gene age also correlated negatively with dN/dS and was even among the strongest rate indicators according to partial correlation analyses (681 genes: Spearman’s partial ρ = −0.193, p < 0.001; 1557 genes: Spearman’s partial ρ = −0.183, p < 0.001) (Figs. 2 and 3). This agrees with findings by Albà and Castresana  who reported that older genes of mouse and human evolve more slowly than newer ones and proposed two models to explain this inverse relationship between evolutionary rates and gene age. Their model of “increasing constraint” predicts that beginning with weak selective pressures shortly after duplication, numbers of functional sites accumulate with time, thereby restraining evolutionary rates. According to their alternate model of “constant rate”, older evolutionary innovations (e.g., signal transduction or multicellularity) inherently entail more functional sites than newer ones. Neutral evolution should be more prevalent in younger genes containing fewer functionally constrained sites, resulting in higher evolutionary rates compared with older genes. Additionally, rapidly evolving genes are more likely to be lost from the genome and, thus, not to become “old”; therefore, they should be more numerous in younger age classes .
These theoretical considerations demonstrate that the possible mechanisms underlying the connections between rate of sequence evolution and both number of paralogs and gene age may be partly redundant. This overlap reflects the circumstance that probably most novel genes emerge from duplications [32, 74].
Top 10 GO annotation clusters for three protein groups with different dN/dS estimates. Columns (one per protein group): Cluster description (enrichment score). Cluster descriptions listed: Membrane lipid catabolism; Macromolecular complex assembly; Protein catabolic process; Response to oxidative stress; Protein complex assembly (listed twice); Regulation of protein complex assembly; Regulation of lipid transport.
Taken together, results of GO analyses point to strongest sperm-specific functional enrichment in the most rapidly evolving protein class. Thus, proteins which evolve at relatively high rates are functionally more specialized than others in the murine sperm proteome. They should furthermore evolve under rather relaxed constraints as suggested by results of partial rank correlations. This outcome underscores that relaxation of constraints could be a prerequisite for adaptive evolution, especially in a sex-specific manner (see, e.g. ; see also [78, 79]).
Based on our findings, we propose a model of sperm protein evolution taking into account rate correlates and functional aspects.
We conclude that the primary function of most murine sperm proteins is not reproduction-specific; instead, most members of the sperm proteome apparently engage in processes required in sperm as well as other cell types. Accordingly, results of GO enrichment analyses illustrate that only a few of the 1557 proteins considered are engaged in testis- and sperm-specific processes. Furthermore, most sperm proteins display high phyletic ages, with more than two thirds of genes (1078 of 1557) assigned to the two oldest age classes. This high proportion of pre-metazoan proteins corresponds to the findings of Freilich et al., according to which a relatively constant fraction (~65%) of proteins in each of the 14 mouse tissues investigated in their study originated before the emergence of Metazoa. In the current study, genes which have been retained in the genome over long evolutionary timescales are more likely to be broadly expressed, to have more PPI partners and paralogs, and to be essential for survival, as depicted in Fig. 1 (see also Additional file 1: Figure S1). Results of our partial rank correlations highlight the respective properties (gene age, numbers of PPIs and paralogs, τ, and essentiality) as some of the strongest correlates of dN/dS of murine sperm proteins. This observation is largely consistent in both datasets (n = 1557 and n = 681; exception: essentiality, which was only included in analyses of the smaller dataset). These results agree with the notion that slowly evolving genes are of “high status”, while rapidly evolving genes have “low status”, a pattern which, according to our analyses, apparently also applies to evolutionary rate distributions in the murine sperm proteome.
Thus, most slowly evolving sperm proteins are constrained by their high status properties, such as a multitude of PPIs and paralogs, essentiality, and/or high age (see ), while proteins with lower status are potentially more likely to evolve neutrally or in response to natural, in some cases sexual, selection. Status differences might also be the basis of the functional and evolutionary compartmentalization detected in male reproductive proteins in various taxa, with proteins expressed at early stages of reproduction (spermatogenesis, sperm assembly) being more constrained than those involved in interaction with the female (sperm motility and sperm-egg interaction) ([17, 18]; see also ). The current study reinforces the notion of Dorus et al. that conservation of sperm proteins might rely on their basal functions and/or their degree of pleiotropy. Moreover, we extend this view by adding factors such as phyletic gene age and the interdependencies among the variables mentioned. Additionally, our results underline the importance of incorporating various gene properties when analysing the evolution of gene groups or genomes to gain a more general picture of factors underlying evolutionary rate variation. Furthermore, they emphasize that although some gene properties are general correlates of evolutionary rates, their importance may vary not only in a lineage-specific, but to some extent also in a cell type-specific manner.
Acknowledgements
We gratefully acknowledge helpful comments and suggestions during the review process whose incorporation improved our manuscript. We are further indebted to Hans Zischler for fruitful discussions.
Availability of data and material
Sources of data analysed are specified in the Methods section of this article and its supplementary information files.
Funding
Financial support was provided by Johannes Gutenberg University (level 1) and German Research Foundation to HH (DFG; HE 3487/3–1).
Authors’ contributions
JS conceived the study, collected and analysed data and wrote the manuscript. HH contributed to conception of the study, interpretation of data, and co-wrote the manuscript. All authors read and approved the final manuscript.
Ethics approval and consent to participate
Consent for publication
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
33. Toll-Riera M, Bostick D, Albà MM, Plotkin JB. Structure and age jointly influence rates of protein evolution. PLoS Comput Biol. 2012;8:e1002542.
40. Chauvin T, Xie F, Liu T, Nicora CD, Yang F, Camp DG 2nd, et al. A systematic analysis of a deep mouse epididymal sperm proteome. Biol Reprod. 2012;87:141.
63. Jeong H, Mason SP, Barabási AL, Oltvai ZN. Lethality and centrality in protein networks. Nature. 2001;411:41–2.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.