Background

The progression through the cell cycle (the G1→S→G2→M transition) is mainly controlled by the enzymic activities of cyclin-dependent kinases (CDKs). The association of cyclins with CDKs is the condition requisite for their activation [13]. Cyclins have been discovered in sea urchin eggs as proteins whose synthesis and degradation oscillate during the cell cycle [4]. Periodic changes in the concentration of cyclins cause sequential activation/inactivation of the CDK catalytic partners resulting in periodic advancement of cells through the cell cycle [2, 5, 6]. Most metazoans have four cyclin types A, B, D and E, which regulate cell-cycle transitions. For example, cyclin D controls the G1 phase by interacting with CDK4 and CDK6; cyclin E interacts with CDK2 and controls the end of the G1 phase and the transition through G1/S; the cyclin A/CDK2 complex regulates the S phase and the exit from it and the cyclin A/CDC2(CDK1) complex regulates the G2/M transition; cyclin B interacts with CDC2 and also regulates the G2/M transition. The major cyclins are those that directly regulate the cell cycle, when in complex with CDKs. The others, which perform auxiliary functions, are united into the group of additional cyclins [7]. In fungi, of the four major cyclin families characteristic of animals, only B-type cyclins are present. In Schizosaccharomyces pombe, there is just one family of these major cyclins, CDC13. In S. pombe Cig1 and Cig2 null-mutants, Cdc13 can alone provide orderly progression through cell cycle [8]. In the other fungi, such as Saccharomyces cerevisiae, there are several cyclins of B family (CLB1-2, CLB3-4, CLB5-6), which control the different phases of the cell cycle (CLB5-6, S phase; CLB1-2, CLB3-4, G2 and M phases) [6].

Cyclins are highly conserved proteins identified in fungi, plants, animals and protists [9, 10]. In recent years, based on genome-wide and comparative phylogenetic analyses, numerous studies have been conducted to define cyclin relatedness. Thus, 49 cyclins, assigned to 10 families, were identified in the Arabidopsis thaliana genome [11], 49 cyclins forming 9 families were detected in the Oryza sativa genome [12], the number of cyclins, the members of 6 families, was determined as 59 in Zea mays [13]. The relatedness of cyclins in the unicellular green algae Ostreococcus tauri [14], and diatom algae Thalassiosira pseudonana and Phaeodactylum tricornutum [15] has been examined. Furthermore, using genome-wide data, the relationship of distinct cyclin families was studied: cyclins of D-type in the Angiosperms (Arabidopsis thaliana, Oryza sativa, Zea mays, Populus alba) and the Bryophytes (Physcomitrella patens) [16] and cyclins of A-type [17], B-type [17, 18], D-type [18, 19] and E-type [18] in animals and fungi. On the basis of phylogenetic tree analysis, the relationships among cyclins A, B, D and E in different animal taxa were investigated [20]. In these recent studies, a particular focus has been on assignments of cyclin sequences to subfamilies. However, detailed analysis of phylogeny and evolution modes of proteins (in this context, by evolution modes, we mean a statistically significant acceleration or deceleration of accumulation of amino acid replacements) would be useful not only in a reconciliation of classification issues, it would also enable us to identify, with more reasonable accuracy, protein function features. The narrow taxonomic diversity of sequenced plant species and their polyploidy makes their statistical analysis very difficult. For this reason, plants were disregarded. With this stipulation, here we analyze the phylogeny and evolution modes of distinct cyclin families belonging to animals and fungi.

Over the past decade, the relation between the level of the expression of the gene and its evolutionary rate generated great interest. In 2005-2009, it was shown for the first time that this relation is universal, resulting from the selection against the toxic effect of protein misfolding. Protein misfolding may possibly be induced by (1) translation errors; (2) misfolding during erroneous translation; (3) spontaneous misfolding and unfolding [2123]. All force selection applies to counteract protein misfolding, which is obviously due to amino acid replacements, affecting differently the protein structure and function. Hence, a promising study of changes in molecular functions of proteins and protein encoding genes during evolution would place emphasis on changes in accumulation rates of amino acid replacements, and their localization in protein 3D structure, thereby providing a better understanding how amino acid replacements became fixed during evolution.

In the current study, we examine the structural-functional features of the molecular evolution of the of A-, B-, D-, E-type cyclin proteins of animals and fungi. An analysis of the fixation of amino acid replacements atypical of the four cyclin families disclosed the evolutionary features of structural cyclin regions, where these replacements occurred. The results made it possible to relate the evolutionary features of cyclins with trends of the ecological specialization apparent in the course of evolution of fungi, vertebrates and invertebrates. We discerned a broad pattern, encompassing virtually all the atypical amino acid replacements, the more superficial relative to protein surface (the surface replacements) became more frequently fixed during evolution, whereas those further away from the surface (the interior replacements) were rarely, if at all, fixed during it. Altered protein-protein interactions, and changes in the number and/or nature of cyclin-binding proteins, in particular, possibly underlay the evolutionary relationship between changes in molecular structure of cyclins and their involvement in the molecular-genetic events in distinct taxa.

Methods

Species studied

Here, we analyze a total number of 54 species representing animals and fungi whose genomes have been completely sequenced. Protist and placozoan cyclins from 7 fully sequenced species served as outgroups in phylogenetic analysis. In so doing, we minimized the consequences of possible insufficient data on cyclin sequences for any organism. The following taxonomic groups of organisms were included in analysis (Table 1): Mammalia; Aves; Amphibia; Actinopterygii; Tunicata; Cephalochordata; Echinodermata; Insecta; Nematoda; Cnidaria; Ascomycota; Basidiomycota. The following taxonomic groups of organisms served only as outgroups in our analysis (Table 1): Placozoa; Choanoflagellida; Amoebozoa; Euglenozoa; Diplomonadida.

Table 1 Species studied and number of their cyclins analyzed

Orthologous groups of cyclins

The protein sequences we analyzed were retrieved from the KEGG 52.0 database [24]. Assignments of animal and fungal cyclins to orthologous groups were extracted from KEGG Orthology [24, 25]. The protein orthologous groups were as follows: K12760 (CLN1), K06650 (CLN2), K06646 (CLN3), K11115 (CDC13), K02220 (CLB1-2, CIG2), K06659 (CLB3-4), K06651 (CLB5-6), fungi and protists; K04503 (CCND1), K10151 (CCND2, CycD), K10152 (CCND3), K06626 (CCNE), K05868 (CCNB), K06627 (CCNA), animals and protists. Based on KEGG Orthology, protein sequences for each of the above listed groups of orthologs were taken from the KEGG Genes database [24, 25]. Then, to enrich the sample, five proteins, which were most similar to the query protein by primary structure, were extracted from the KEGG SSDB database. Duplicate sequences and/or identical protein sequences were discarded. The resultant sample contained 233 of A-, B-, D-, E-type cyclin proteins.

Multiple alignment, reconstruction of phylogeny and ancestral sequences

A PROMALS3D web server was used for the multiple alignment of amino acid sequences using protein structure data [26]. The primary alignment was done using data on the secondary and tertiary structure of cyclins [27]. The obtained alignment was divided into 4 secondary alignments for each cyclin type, A, B, D and E (Additional file 1).

It is well known that the accuracy of phylogenetic reconstruction, i.e., the precision of estimated branch lengths, considerably increases when models of relative replacement rates of amino acids, specific to a particular protein family, are examined [28, 29]. With this in mind, we built models for relative amino acid replacement rates using gapless secondary alignments by MODELESTIMATOR 1.1 for each cyclin family [30]. Then the phylogenetic trees for cyclins of A-, B-, D-, E-type were reconstructed using these models and gapless secondary alignments by PhyML 3.0 [31].

Phylogenetic trees derived from gapless secondary alignments, were used for the analysis of phylogenetic relationships. Support values for each node of the phylogenetic tree from the gapless secondary alignment was provided by the approximate likelihood-ratio test for branches [32] and the test, based on multiple comparisons of log-likelihoods elaborated by Shimodaira and Hasegava [33]. Because cyclins are an evolutionary conserved group of proteins, the trees derived from their sequences were multifurcated. In such confounding instances, the issue was resolved manually, relying on the literary data and on the internet resource Tree of Life [34]. To define the divergence order of the unicellular ancestor of multicellular animals and fungi, we took advantage of genome-wide data [3538]. The issue of phylogenetic relationships within fungal taxa was also resolved on the basis of genome-scale data [3840]. To define tree topology at the divergence level of nematodes, arthropods and deuterostomes, we had recourse to evidence in favor of the existence of the ecdysozoans [4148]. In the construction of the tree at the level of the divergence of echinodermates, cephalochordates, chordates and tunicates, additional information was used [43, 4951]. The divergence order of organisms within a mammalian class was verified [5256], and so was that within an arthropod group [57]. The divergence order of paralogous cyclins in fungi, the group most susceptible to the effects of heterotachy during evolution [58], was consistent with that reported earlier [59].

The phylogenetic trees with multifurcations resolved as described above were utilized for the accurate estimation of branch lengths and the gamma shape parameter α using gapless secondary alignments by PhyML 3.0 [31]. The above estimates and gapless secondary alignments, were used to reconstruct ancestral sequences in tree inner nodes for each of the A-, B-, D-, E-type cyclins by the maximum-likelihood method. Three algorithms were applied for the reconstruction of ancestors differing by their approach to the optimization of ancestral protein reconstruction: 1) ANCESCON [60] used the general WAG model for relative amino acid replacement rates [61]); 2) FASTML server [62] took into account the general LG model for relative amino acid replacement rates [63]; 3) the specific models for the relative amino acid replacement rates concerning each cyclin family, which we constructed by MODELESTIMATOR 1.1 [30], were used in the ancestral reconstruction by AAML software (the PAML package 4.3b, the CODEML program [64]). In the reconstruction of the ancestor, we used only the marginal approach because it is the one that enables us to reconstruct the most probable protein sequences for each inner node of phylogenetic tree [65]. Having reconstructed the ancestral sequences in all the inner nodes of the four cyclin phylogenetic trees by all the three algorithms (Additional file 2; Additional file 3), we reconstructed three common ancestors of the cyclin families by the ANCESCON program [60]: the ancestor of A and B, and the ancestor of D and E, the ancestor common to all the four cyclin families for each of the three methods of reconstruction.

Search of atypical amino acid replacements in sequences in the phylogenetic tree branches

In this analysis, a replacement of an amino acid a (or b) in an ancestral node by an amino acid b (or a) in a descendant node was designated as atypical, if the occurrence probability of the a→b replacements was small. Such replacements may reflect important changes in the protein function fixed during evolution; however, they may indicate the increasing rate of amino acid substitutions in a particular branch. To identify atypical amino acid replacements in sequences corresponding to each branch, we compared the number of all the observed 190 types of amino acid replacements with what was expected, accepting that cyclin evolution is a stationary homogeneous Markov process. The number of amino acid replacements observed for each type, n(Type R ), was calculated in the inner branches by pairwise comparison of ancestral-descendant sequences reconstructed by three methods: AAML [64], FASTML [62] and ANCESCON [60] (see above). For the terminal nodes, the observed number of amino acid replacements was calculated by pairwise comparison of ancestral sequence, reconstructed also by three different methods, with a descendant sequence taken from an extant organism. In further analysis, we considered the ancestral amino acids whose reconstruction probability was greater than 0.99 (using estimates from each of the reconstruction methods (Additional file 3)).

The expected values for replacements were calculated on the basis of the computer simulation of cyclin molecular evolution with the package INDELible 1.03 [66]. The sequence evolution was simulated on the basis of a phylogenetic tree of A-, B-, D-, E-type cyclins taking into consideration the gapless alignments of these proteins (length; proportion of conserved columns; the gamma shape parameter (α), which describes the heterogeneity of evolutionary reorganization), and the matrix for amino acid replacement rates, estimated for each cyclin family (Additional file 4). Proceeding on the data from these 1000 alignments for each branch (including the branch leading to the terminal nodes), the rates expected for amino acids of 190 types were calculated.

The expected and observed replacements of each type were compared by the permutation test we developed earlier [67]. This type of statistical test [68] allowed us to estimate the deviation and the significance of the observed amino acid replacement types from the expected ones for each branch. A set of observed replacements was always included in the expected set. Thus, for a simulated set of replacements, 105 random samples of the same size as those containing observed replacements were generated by random permutation. The number of expected amino acid replacements nrand(Type Rrand ), which belonged to a given type, 190 in all, was estimated for each random sample. Also the number of random samples M in which nrand(Type Rrand ) > n(Type R ) was estimated in the course of the test. Therefore, the M/105 value expressed the probability p at which the occurrence of the observed replacements of each type (Type R ) could have arisen by chance (Additional file 5). The strict threshold p ≤ 0.01 was used to define the atypical amino acid replacements (the less probable according to the stationary and homogeneous Markov model). In our analysis, we considered only tree branches with atypical replacements (p ≤ 0.01) identified using all the three reconstruction methods (AAML, FASTML and ANCESCON).

Estimation of accessibility of protein to solvent, prediction of secondary structure

The accessibility of amino acids to solvent and their reference to particular regions of secondary structure were determined in protein 3D structure applying the DSSP program [69]. Representative proteins with known 3D structures for the A, B1/B2, E and D cyclin families we taken from the PDB database [70]: 1H1R:B (KEGG ID 890) for cyclin A family; 2W9Z (KEGG ID 595) for cyclin D family; 2B9R:A (KEGG ID 891) for cyclins of B1, B2 families; 1W98:B (KEGG ID 898) for cyclins of E family. 3D structures of the representative proteins of B3 and fungal B cyclin families were reconstructed by Phyre server [71] using 2B9R:A as template structure: 3D model of Homo sapiens cyclin B3 (KEGG ID 898); 3D model of the Saccharomyces cerevisiae cyclin B (KEGG ID YGR108W). Protein tertiary structure alignment was performed by the CE program [72].

In determination of the relation between the accessible surface area (ASA) with atypical amino acid replacement types, the true ASA i values of residue i, obtained with the DSSP program [69], also 10 classes (C = 10(ASA i /ASA max ) of relative ASA values were used. To scale the true ASA values, the ASA of the extended states of Ala-X-Ala for every residue X (ASA max ) are used under the assumption that the absolute values include side chain and backbone surface area [73]. These values are (in Å2) 110.2 (Ala), 144.1 (Asp),140.4 (Cys), 174.7 (Glu), 200.7 (Phe), 78.7 (Gly), 181.9 (His), 185.0 (Ile), 205.7 (Lys), 183.1 (Leu), 200.1 (Met),146.4 (Asn), 141.9 (Pro), 178.6 (Gln), 229.0 (Arg), 117.2 (Ser), 138.7 (Thr), 153.7 (Val), 240.5 (Trp), and 213.7 (Tyr), respectively [73]. The relative ASA values were roughly assigned to three categories: amino acids belonging to the ASA classes C ≤ 2 were considered as buried (internal), those belonging to the classes 2 < C ≤ 4 as subsurface (intermediate), and those belonging to C > 4 as surface (external).

Comparison of the number of expected and observed replacements for each secondary structure

The number of expected replacements was estimated for each secondary structure element (for example, each α-helix or the β-strand) of protein of animal cyclin A, D and E families and for animal cyclin B, fungal cyclin B families and animal B3-cyclins subfamilies. For this purpose, the total number of amino acid replacements and their number in each structure element observed from the roots to the tips of the trees for animal A-, D-, B-, E-, B3- and fungal B cyclins were calculated (Additional file 6). The estimate of the number of expected replacements in the structure element (e) was obtained under the assumption that fixed replacements were distributed evenly in the protein sequence: e = o (l/L), where l is the number of residues in the secondary structure element, L is the protein length, and o is the total number of observed replacements in protein. The significance of the relationship between the frequency of replacements and their location in the secondary structure element was estimated using the χ2 test with k-1 degrees of freedom, where k is the number of secondary structure elements. The choice of the secondary structures whose number of observed replacements was significantly smaller or greater than expected, was based on a comparison of the r = (e - o)2/e values with the critical χ2 value for probability p ≤ 0.01 at given degrees of freedom (Additional file 6).

Those secondary structure elements of animal A-, D-, B-, E-, and B3- and fungal B- type cyclins, whose number of observed replacements significantly differed from the expected (p ≤ 0.01) in all three ancestral reconstructions (AAML, FASTML and ANCESCON), were included in the final analysis.

Analysis of the evolutionary rates distribution

To analyze the variation in the rates of amino acid replacements within the cyclin families, subtree branch length distributions were examined in terms of their ranks. Comparisons based on the branch rank values allowed us to disregard the possible deviation of the branch length distribution from the Gaussian shape. The procedure for the examination of branch length distribution for the subtrees was multistep. First, the length of branches for every of the four cyclin family tree from root to tips were measured. Then, with the R:Stats program package [74], rank values were assigned to every branch of the cyclin family trees. At the last step, for all the analyzed subtrees of a cyclin family tree, R m , the rank median for subtree rank length and R v , the difference between the maximum and minimum rank lengths of the subtree divided by branch number in the subtree, were calculated. R m described the branch length in the analyzed subtree, relative to the lengths of all the branches in the cyclin family tree, while R v expressed the variation in the length of the subtree branches. Obviously, the greater was the subtree R m , the faster the protein evolved within the cyclin family tree and the smaller was the subtree R v (absolute minimum R v = 1), the closer were the evolution rates within the subtree.

Results and Discussion

General analysis of evolution rates of different families of cyclin proteins

Analysis of animal cyclin evolution

Based on the analysis of the branch length distribution within the phylogenetic trees of cyclins A, B, D and E, multicellular animals were divided into two monophyletic groups. These two were represented by two clades of the Metazoa phylogenetic tree, passing from the basal Coelenterates group. Invertebrates, Insects and Echinodermates (with the exception of the Nematodes) were referred to the first group Inv. Inv was characterized by long branches in A, B and D cyclin family trees (Figure 1; Figure 2; Table 2) and their wide variations (Table 2). This may be taken to mean that the cyclins of this group evolved both rapidly and at uneven rates. Cyclins resulting from duplications in vertebrates with the formation of a paralogous protein group were referred to the second Ver group (Figure 1; Figure 2). Ver was characterized by relatively short branch lengths in A, B and D cyclin family trees and the narrowest variation in their lengths (Table 2). It is of interest that R m and R v for branch lengths in the B3 subtree of the Ver group are close to those in B and A cyclins in the Inv group (Figure 1A; Figure 2A; Table 2). Thus, on the basis of the cyclin evolution rates, clubbing of two robust Ver and Inv groups of animal cyclins may be distinguished (irrespective of the inclusion of Coelenterates into the ingroup). As a result, these two groups of cyclins subdivided the animal species into two simultaneously phyletic and ecocentric groups. Vertebrates (the Ver group) were, as a rule, eurybiotic as adults. During their relatively long life, they could withstand drastic changes in the environment and gave little or no preference to particular food [7577]. This makes it possible for adult vertebrates to easily pass from one habitat to another, and to migrate widely [7577]. The situation was quite different for invertebrates, the Inv group. Their larvae and adults were, as a rule, adapted to specific environmental factors and food, being stenobiotic [76]. Consequently, during evolution, animals either conquered a narrow ecological niche, became specialized (first strategy), or took advantage of a wide range of narrow ecological niches (second strategy), passing from one habitat to another when recourses became inadequate and/or competition heavy [76, 78]. The smaller, simply organized animals with short life cycle gave preference to the first strategy. These animals are members of the Inv group. In contrast, the mobile, relatively long living animals with their complex organization (Ver group) preferred the second strategy [76]. Thus reasoning, it may be explained why cyclins A, B and D evolved slower in vertebrates than in invertebrates.

Figure 1
figure 1

Phylogenetic trees for cyclins A (A) and D (B). Branching significance estimated in calculation of phylogenetic tree was supported by numerical values (the Shimodaira and Hasegava test and the aLRT χ2 test). Bifurcations resolved manually on the basis of literary data are indicated by asterisks (*). Inner branches for which atypical replacements were observed are highlighted in red.

Figure 2
figure 2

Phylogenetic tree for cyclin B (A) and cyclin E (B). The designations are as in Figure 1. The bifurcations resolved manually on the basis of the data in the literature on fungi are indicated by hatches (#).

Table 2 Comparison of the branch length and branch length variance for various cyclin groups

The nematodes were exceptions; their cyclins had the longest branches from the root (Figure 1; Figure 2). The evolution of the nematodes had two interesting features: first, mutations of their proteins became fixed at higher frequencies [44, 79] and, second, their differentiation was progressive in the sense that cell pluripotency progressively decreased during development [8084], with the cell division number remaining strictly limited and, hence, the cell number in the adult. The accelerated rate of replacement accumulation in the nematode cyclins may possibly be explained by these features of nematode biology.

Analysis of fungal cyclin evolution

Duplication of cyclin B genes and formation of paralogs (CLB3-4; CLB1-2 and CLB5-6) was characteristic of some of the fungal taxa (Saccharomycotina); as for the other taxa, their salient feature was the presence in the genome of just one representative homolog of the cyclin B family, cdc13 (Ascomycetes, Pezizomycotina, the sole representative of Taphrinomycotina, Schizosaccharomyces pombe, in the current study, basidiomycetes, a representative of Agaricomycotina, Cryptococcus neoformans, and a representative of Ustilaginomycotina, Malassezia globosa). In this of interest that, in some Saccharomycotina species, the CLB family was represented by six paralogous genes (CLB3,4; CLB1,2 и CLB5,6), whereas other species lost one of the CLB subfamilies. Of the organisms we analyzed, Neosartorya fischeri (Pezizomycotina) had a set of paralogs different from that in the rest of fungi, it also harbored cdc13 and CLB3,4 homologs (Figure 2A). How to account for this diversity of homolog number in the fungal CLB family? One reason of this diversity was perhaps a specific cell cycle regulation that developed in the different taxa under the effect of mycelial organization requiring loss of synchronous nucleus and cell division [8593], another reason was perhaps other specificity of nuclear-cytoplasmic relationships characteristic of fungi (heterokaryon formation, for example) [85, 87, 94, 95].

Punctual and gradual cyclin evolution

Evolution in eukaryotes followed a saltatory and gradual course [96]. In terms of punctual evolution, the gene predominantly accumulated replacements in relatively short timescales. This was accompanied by either gene acquirement of a novel function or by selection pressure relaxation, with evolution nearing neutral. Events of such kind immediately preceded adaptive radiation and, for this reason, they, prevailed in the inner branches [97, 98]. Pagel et al. (2006) have suggested a simple test for revealing the prevailing nature of evolution [99]. It should be noted that punctual evolution was associated with dramatic genetic reorganization in the inner branches and evolutionary stasis in the terminal branches. Therefore, during punctual evolution, the relation between the inner node number and total branch length from the root became almost linear (from the power form n = α + βxδ , where n is number of nodes, x is the phylogenetic path length from root to tip), in this case, there was a statistically significant correlation (β > 0, δ~1 and α→0) [99101]. In contrast, when evolution was gradual and/or there was a node-density artifact, the relation is not linear, i.e. β~0, δ > 1 and α≠0 [99101]. It proved that the evolving invertebrate cyclins A and B (Inv group), vertebrate cyclins D1 and cyclins B3 (meiosis-specific cyclins [84, 102, 103]), also fungal cyclins CLB3-4 tended to conform to the punctual pattern (Table 3; β > 0, δ~1 and α→0).

Table 3 Statistics for the relation between inner node number and branch length from tree root

Comparative embryology provides evidence that the vertebrate embryogenesis was conserved, with similarity more pronounced at early embryogenesis. This is consistent with paleontological evidence: the ecological specialization was narrow in the early vertebrates, represented by relatively monomorphic group of relatively simply organized benthic animals with small body size [96, 104]. The vertebrates started to flourish quite late, after the appearance of jaws (allowing to diversify food), and that of two paired limbs providing high mobility [104]. In this way, these two aromorphoses (consequential adaptive changes) preadapted vertebrates to increase in body size and to their eurybiotic nature. The simply organized ancestors of vertebrates widened their tissue repertoire by making their ancestral differentiation mechanisms more elaborate. It should be noted that the mature differentiated somatic cell functions mainly during the G1 cell cycle phase. It of relevant that cyclin D1 together with their paralogs controls precisely G1 phase; moreover, increase in D1 concentration compared with D2 and D3 concentrations is due to the early differentiation of vertebrate germ cells along the neuroectodermal pathway [105]. It may be inferred that punctual evolution of cyclin D1 (Table 3) conforms to the evolution of ectoderm derivatives (neuroectoderm); the evolution of the derivatives of endo- and mesoderm result from the balance between the concentrations of all the tree paralogs and the more gradual evolutionary pattern of cyclins D2 and D3 [105]. The scenario was quite different for invertebrates. Their characteristic features, even within a taxonomic type (or phylum), were wide diversity of embryogenesis, deep metamorphosis (up to holometabolous development and hypermetamorphosis) and stenobiotic ecology. It would seem that for this reason that punctual evolution in invertebrates concerned cyclins A and, possibly cyclins B (Table 3), those controlling the S→G2→M cell cycle transitions, i.e. the most conserved in terms of duration of the cell cycle phases.

Fungal organisms fell into a separate group. Fungi are reducers (decomposers) and/or facultative parasites. Their thallus is organized as syncytium, i.e. nucleus division and the formation of new cell might have been independent of each other, temporally distinct events regulated by major and auxiliary cyclins and auxiliary kinases and/or modifications of major kinases [106112]. This means that evolution of complexity of fungi should have been nearly neutral with regard to cyclins. This implies the gradual evolution we detected for fungal CDC13. However, events like polyploidization [113115] or megaduplication [116], might have contributed largely to fungal speciation. This evolutionary pattern is punctuational. In fact, functional specialization in paralogous genes changed during evolution [116]. Ultimately, the newly arisen fungal species (including polyploid species) formed an ecosystem with ancestors [113115, 117, 118]. A good example is the punctual evolution of cyclins CLB3-4 in Ascomycota.

It followed that ecological and physiological features of animal and fungal species, which were closely related to the total rate of mutational replacement accumulation, presumably had a considerable impact on cyclin evolution. This conclusion made acute the question of the nature of fixed mutational replacements in different taxa. To examine it closer, we analyzed the rates of amino acid replacements in cyclin families.

Atypical replacements in evolution of different cyclin families

The first step was to define the periods of evolutionary history, when amino acid changes in cyclins was most dramatic. For this purpose, we performed Markov simulation of protein evolution taking into consideration all the known natural features of cyclin evolution (see Materials and Methods section). As a result, we identified atypical amino acid replacements in certain phylogenetic tree branches for A-, B-, D-, E-type cyclins.

Distinguishing features of the evolution of animal A-, B-, D- and E-type cyclins

In vertebrates, accumulation of atypical replacements provided evidence that accelerated evolution in protein function enfolded mainly during cyclin duplication or after it, not in all paralogs, however (Figure 1; Figure 2). Thus, after the split of cyclin A in the common vertebrate ancestor, atypical replacements started to accumulate only in paralog A1. Then, they were found to accumulate faster in both A1 and A2 paralogs, but in different taxa. To illustrate, at the level of the common ancestor of birds and mammals (this corresponded to the emergence of oviparity in the early reptiles), atypical replacements conversely accumulated in A2, while they did not accumulate in A1 (Figure 1A).

The pattern for cyclins D was somewhat more complicated (Figure 1B). A brief increase in the accumulation rates of atypical replacements accompanied diversification of cyclin D1 and the ancestor common to cyclins D2 and D3, then cyclin D2 in particular continued to accumulate atypical replacements; by contrast, cyclin D1 had atypical replacements only at the time of tetrapod rise. Interestingly, the diversification of cyclins D2 and D3 was not associated with accumulation of atypical replacements, but after that replacements of this kind intensely accumulated in paralog D2 in the different vertebrate clades. It is also of interest that against the background of multiple fixations of atypical replacements in paralogs D2 and D1, such events were not observed for D3 evolution.

The evolution of vertebrate cyclins E, at the diversification level of cyclin E paralogs, was obviously similar to that of cyclins D within the D2 and D3 paralog groups (compare Figure 2B and 1B). Diversification of cyclins E1 and E2 was not associated with fixation of atypical replacements, for example. Quite the reverse was observed for the evolutionary scenario within cyclins E1 and E2. They evolved specifically compared with cyclin D paralogs. In this light, of interest were the fixation events of atypical replacements in paralogs E1 and E2 during the rise of mammals, when birds and mammals diverged.

An important pattern became apparent when surveying the evolution of animal B cyclins (Figure 2A). Interestingly, this pattern is peculiar to the evolution of paralog groups of animal cyclins B1 and B2. It is of relevance that B1 is a major and B2 is an auxiliary cyclin [119]. In fact, cyclin B2-null mice develop normally and are fertile whereas cyclin B1-null mice died as embryos [119]. It is also known that cyclin B1 was expressed in interphase in two fractions, free cytoplasmic and membrane-bound, while cyclin B2 was expressed only in membrane-bound form; for this reason, cyclin B1 virtually compensated the absence of cyclin B2, but not vice versa [119]. An important point was that atypical amino acid replacements were distinguishing features at the divergence time of coelenterates and bilaterians. This observation suggested that the animal cyclin B gene possibly acquired multiple functions at this step of evolution.

Analysis of the data in Figure 2A revealed other taxon-specific events of accumulation of atypical replacements. Thus, in invertebrate cyclin B3, acceleration of fixation rate of atypical replacements was detected at the time dipterans and hymenopterans appeared; as for vertebrate cyclin B3, the acceleration became conspicuous when mammals appeared. It is pertinent to note that cyclin B3 is more important for regulation of meiosis than mitosis. In cyclin B3 mutant mice and nematodes, sex cell differentiation was disrupted [84, 102, 103]. Consequently, it may be suggested that during the divergence time of dipterans and hymenopterans, also of mammals, there occurred considerable changes in meiotic control by B3 cyclins. Cyclins B3 stand aloof from the rest, being relatively conserved in remote vertebrate and invertebrate taxa and containing motifs characteristic of both cyclins B and A [84, 102, 103, 120]. Hence, the coupled fixation of atypical amino acid replacements in cyclins B3, A and E at the formation step of hymenopterans appeared of interest.

General features of the evolution of animal A-, B-, D- and E-type cyclins

Thus, the fixation patterns of atypical amino acid replacements in the vertebrate cyclins are specific. However, fragments of these patterns may be similar in inner branches. These particular branches may represent the major aromorphoses (the consequential evolutionary changes), for example: (1) the rise of higher vertebrates, i.e. of the common ancestors of fishes and tetrapods, atypical amino acid replacements were noted for B1- and B2-cyclins (Figure 2A), and A1-cyclins (Figure 1A), D1-cyclins and the common ancestor of D2- and D3-cyclins (Figure 1B); (2) the rise of the tetrapod superclass was marked by accelerated evolution of D2- and D1-cyclins (Figure 1B); (3) the rise of amniotes, the oviparous vertebrates, the common ancestor of birds, reptiles and mammals characterized by atypical replacements in A2-cyclins (Figure 1A); (4) the formation of the common mammalian ancestors observed as accelerated evolution of E2- and D2- cyclins (Figure 1B and 2B); and finally (5) placental viviparity was presumably related to the accelerated evolution of the A1-, E1-, B2- and D2 cyclins (Figure 1; Figure 2). Importantly, two events only were highlighted for the fixation of atypical amino acid replacements in all studied cyclin families: the rise of higher vertebrates (ancestors of fishes and tetrapods) and of placental mammals. The two events possibly needed major developmental changes: in the case of the ancestors of fishes and tetrapods, the formation of jaws and of paired limbs triggered the evolution of all the organismal systems [121, 122]; the rise of placental mammals (complete viviparity) required the reorganization of early embryogenesis [123126].

Also, in vertebrates, atypical amino acid replacements evidenced that accelerated evolution proceeded predominantly during paralog duplication or after it. A model for evolution of gene function through gene duplication was pertinent at this point: it suggested that an ancestral gene had several functions [127]. Clearly, after duplication of a multifunctional gene, the functions of the ancestral gene could evolve independently in the duplicated gene copies, thereby increasing adaptive flexibility [127].

It is of interest that relays were revealed for a number of vertebrate cyclin paralogs. To illustrate, atypical evolution of A1-paralogs was first detected in the lineage of the subtree that corresponed to the fish and tetrapod ancestors, then in the lineage occupied by the common avian, reptilian and mammalian ancestors accelerated evolution in A1-paralogs was substituted by the one for A2-paralogs (Figure 1A). Evolution rate was greater in E2-paralogs of oviparous taxa (birds, reptiles and ancestors of mammals) and in E1-paralogs in placental mammals (Figure 2B).

The reverse was observed for invertebrates. In arthropods, nematodes and echinodermates, the fixation of atypical amino acid replacements was rarely brought into relation with deep evolutionary transitions (aromorphoses). Only the evolution of B-cyclins in the common Metazoa ancestors and ascending to the tip in the specific Coelenterates group (Figure 2A) could be related to aromorphosis (the appearance of true multicellularity). For A- and E-cyclins, atypical amino acid replacements were detected in the lineage of the evolutionary young taxon Hymenoptera; no relations were traced for D-cyclins (Figure 1; Figure 2).

Evolutionary features of fungal cyclins

A somewhat different pattern was peculiar to the evolving fungal cyclins (Figure 2A). Accelerated accumulation of atypical replacements was characteristic of just one of the duplicated cyclins, first of only CLB5-6, then of only CLB1-2 and, ultimately, of a distinct subgroup of cyclins within paralog group CDC13. According to Ohno's duplication model, after duplication of a gene performing a single function, one gene copy became subject to directional selection, this led to a dramatic change in its function [128]. In the great majority of cases, one of the gene copies lost its function in the long run under the effect of degenerative mutations. However, both copies may be retained as an outcome of chance acquirement of an important novel function by one of the copies. Of relevance here is the demonstration that in yeast selection against duplicated gene copies was enhanced, while gene copies retained during evolution fulfilled functions different from ancestral [129]. Our current data allowed us to extend Teichmann's (2004) conclusion to the evolution of fungal B cyclins [129]. This appeared reasonable because the distribution pattern of atypical replacements in the fungal phylogenetic tree was consistent with Ohno's neofunctionalization model [128].

Taken together, our current data suggest that cyclin evolution contributed slightly to increasing complexity of invertebrate and fungal organisms. This is supported by the fact that, in contrast to vertebrates, fixation of atypical replacements did not concern earliest evolution (branches at the roots) in invertebrates (Figure 1; Figure 2). Stating it otherwise, all the cases of accelerated evolution of cyclins in invertebrates are located in the terminal branches of phylogenetic tree, thereby providing evidence that events were relatively recent. To the contrary, in the vertebrate lineage (Figure 1; Figure 2), evolution of the four families of major cyclins (except for B3) turned out to be, in one way or another, related to progressive evolution from the emergence of chordates at the time of the Cambrian explosion to at least the divergence of orders. Moreover, for a number of paralogs, progressive evolution correlated in phylogenetic tree with branches corresponding to sharp increases in the number of cell types [96].

It is now evident that accumulation of atypical amino acid replacements during evolution of animal and fungal cyclins fits well into the model for increasing complexity of multicellular organisms through (i) sharp evolutionary aromorphic changes, particularly during vertebrate evolution, related to change in habitats and (ii) gradual differentiation of new functions and separation of the manifold old ones of duplicated genes during evolution of animal and fungi. With all this in mind, the location of the mutational replacements in the different structural-functional cyclin domains posed questions worth considering. We hoped that answers will become clear when the amino acid replacement rates in each secondary structure under study will be known.

Analysis of replacement rates during the evolution of different cyclin structural elements

Positioning of atypical amino acid replacements in the tertiary structure of cyclins

The obvious question was: How atypical amino acid replacenments, which, are, as a rule, rare evolutionary events, may affect the structure of cyclin molecules? The distribution of solvent accessibility values for atypical amino acid replacement positions in gapless alignments showed skewness in favor of either subsurface or surface positions in globule (Table 4). The deviation from symmetry is clear-cut, with clustering above the median in A, B3 and E cyclin types (Table 4). It was also shown that atypical replacements formed local densities on gapless alignments (Figure 3) and that they were predominantly fixed either on the surface or the subsurface regions of the protein globule or where the surface and the buried regions alternated (surface representations are available in Additional file 7).

Table 4 Comparison of the accessible surface area (ASA) values of atypical amino acid replacement sites with those of all sites in gapless alignments
Figure 3
figure 3

Schematic representation of mutation distribution in gapless alignment of cyclins. Cyclins: A, animal A; B, animal B; C, fungal B; D, animal B3; E, animal D; F, animal E. Atypical replacements are indicated above each rectangle in Figures A through E. The upper rectangle in Figures A through E represents the position of amino acids in protein globule sample by color transitions from black, inner amino acids, to white, those on the surface. The lower rectangle in Figures A through E represents the secondary structure in protein sample: black, helical structure, white, turns. Fringes of secondary structure are colored depending on whether the number of all replacements is significantly greater (red) than expected, or as expected (white).

It is known that amino acid replacements in the protein interior affect integral protein characteristics (globule stability and conformation). Surface and subsurface replacements were related to the evolution of distinct active sites (sites of protein-protein interaction or protein-ligand binding) [130]. The decrease in atypical replacement frequency in the interior and their enhanced fixation in the (sub)surface regions was evidence that the cyclin globule conformation was, on the whole, subject to stabilizing selection, which served as a background for local driving evolution of distinct active sites (seen as densities of atypical replacements in gapless alignments (Figure 3)). As known, cyclins are regulatory subunits, which bind, activate and provide substrate specificity of their catalytic partners CDKs [2, 5, 6]. Fulfilling the function of cell cycle regulators, through control of CDKs substrate specificity, cyclins form short-living complexes with different effector proteins, belonging to transcriptional and co-transcriptional factors, replication factors, signal cascade proteins.

The statistics for positioning of atypical amino acid replacements is in agreement with the comparative data on cyclin tertiary structures. Figure 4 shows the alignment of protein 3D structures of human cyclins D1 and A2 (RMSD of the structural alignment was 1.9 Å, the Z-score = 7). Clearly, most of the perfectly aligned residues consisted of highly conserved α-helices binding to CDKs and formed the internal protein skeleton, whereas the weakly conserved regions comprised the smallest part of the alignment. These regions included the external β-layers, short terminal fragments of the α-helices and loops linking together conserved α-helical regions. The atypical replacements fell, as a rule, in the external α-helical termini and in loops (Figure 3; Figure 5; Additional file 7). The interior regions, usually free of atypical replacements (Figure 3; Figure 5; Additional file 7), corresponded to the central regions of the α-helices.

Figure 4
figure 4

Alignment of tertiary structures of human cyclin D1 and A1 protein sequences. Sequences are colored as follows: cyclin D1, red; cyclin A1, blue. Arrows indicate regions binding to CDK.

Figure 5
figure 5

Three-dimensional structures of cyclins showing the location of conserved and variable secondary structure elements. A, human cyclin A1-CDK2 complex; B, human B1 cyclin; C, yeast B1 cyclin; D human B3 cyclin; E human cyclin D1-CDK4 complex; F, human E1 cyclin. Secondary structures: replacement number significantly greater than expected is marked "+". CDK molecules are in red; cyclin regions within gapless alignment are in blue.

Positioning features of all the amino acid replacements in the tertiary structure of cyclins

The next step was to define the most conserved and the most variable secondary structure elements of the cyclin protein molecule. Using the approach described in the Materials and Methods section, we calculated the total number of amino acid replacements fixed in each element of the protein secondary structure. The secondary structure element, whose number of amino acid replacements was significantly greater than expected, in the case of even distribution of replacements along the sequence, are shown in Figure 3, Figure 5 and Additional file 6.

Analysis of the data in Figure 3 and 5 and Additional file 6 revealed features unique to almost all cyclins (except cyclin D), namely the presence of highly variable regions on the protein surface opposite to the nearly conserved protein region interacting with CDKs. These evolutionary variable sites may conceivably be the acceptors for interaction with other regulator proteins. It is of interest that cyclin D1 has a CDK-independent role as co-activator or co-repressor of tissue-specific transcription factors [e.g. [131133]]. The results of studies on E-type cyclins also suggested their kinase-independent function. The mutant cyclin E is illustrative: although unable to activate its Cdk2 partners, it can provide the G1/S transition [134]. In the cell, cyclins D and E, in particular, are integrators of signals from extracellular growth factors, thereby defining their crucial role in cell differentiation. Inferences can be made from the results Bryja et al [105] obtained with a model for mammalian cell culture: pluripotent cells prevailed in the cyclin A-CDK2-p27 complex, cyclin E affected proliferation rate and prepared cells to differentiation; the cyclin D2/D3-CDK4-p27 complex prevailed among undifferentiated cells, cyclin D3 elevation in cytoplasm made differentiation follow the endodermal pathway, prevalence of the cyclin D1/D2-CDK4-p27 complex drove differentiation along the neuroectodermal pathway. Replacements considerably different in amino acid properties in the external protein regions can cause the appearance, the disappearance of binding sites in cyclins, or alter them through changes both in surface regions themselves and in the corresponding mutual disposition of internal α-helices and the angles between them (see Figure 3). This suggested that evolutionary changes in external regions of cyclin proteins, animal D, E and B3 in particular, led to changes in the number and properties of cyclin-signal protein binding sites. In fact, estimates of the number of protein-protein interactions for cyclins in different taxa disclosed that it tended to increase with organism complexity (Additional file 8) [135137]. Taken together, all these facts allow us to explain why cyclin families evolved mainly in the surface and subsurface regions that are far from the centre of CDK binding.

Thus, analysis of the rates of evolutionary reorganization in different regions of cyclin tertiary structure revealed the mechanism possibly underlying the flexibility of cyclin function. This flexibility is provided by very consequential reorganization of subsurface regions remote from interaction sites with CDK in animal and fungal cyclins, also by functional differentiation of paralogous cyclins (including the binding site for CDK) formed during animal evolution. This is the tentative explanation we offer for the differences in the structural-functional evolution between different cyclin families revealed here.

The results for analysis of phylogenetic trees and amino acid replacements may reflect two important features of cyclin evolution: at the organismal level, the evolutionary modes of these proteins differed among taxonomic groups; at the molecular level, the accumulated amino acid replacements possibly provided changes in the interactions of cyclins with other proteins in the cell.

Conclusions

We studied the molecular evolution of animal and fungal cyclins belonging to the A, B, D and E families. Phylogenetic trees for cyclin protein sequences were constructed, the fixation rates of amino acid replacements were estimated, sets of atypical amino acid replacements, the hallmark features of these cyclins, were identified. We also analyzed the relationship between the molecular evolutionary features of distinct cyclin regions and their structural characteristics in the protein globule. The salient findings were: (1) atypical amino acid replacements were due to aromorphic changes in vertebrates and to gene duplication in animals and fungi during evolution, this was consistent with the model for increasing complexity of multicellular life through changes in habitat, emergence of novel and diversification of old functions in duplicated multifunctional genes; (2) evolutionary flexibility of cyclin functions may be provided by consequential reorganization of surface and subsurface regions remote from CDK interaction sites in animal and fungal cyclins, also by functional differentiation of paralogous cyclins (including CDK binding sites) formed in animal evolution. Cyclin reorganization rates during evolution might have been considerably affected by ecology and physiology of animal and fungi.

Integrating facts and considerations, it may be concluded that the functional roles of cyclins remained interrelated through time at all levels from the molecule-cell-whole organism.