Gene organization and evolutionary history

Gene structure and chromosomal localization

The three human genes encoding members of the retinoblastoma (Rb) family share some features that are similar to other housekeeping genes, including a lack of the canonical TATA or CAAT boxes found in the promoters of most differentially expressed genes, the presence of a GC-rich zone immediately surrounding the main transcription-initiation site, the presence of multiple consensus sequences for binding the Sp1 transcription factor, and the presence of multiple transcription start sites [1,2,3].

The human RB gene (which encodes a protein of 105 kDa and is also called p105) was first identified when both familial and sporadic retinoblastoma, a form of malignant tumor of the retina, were found to be associated with deletions at 13q14 [4,5,6]. The RB transcript is encoded by 27 exons dispersed over about 200 kilobases (kb) of genomic DNA; the exons range from 31 to 1,889 base pairs (bp) in length and the introns range from 80 bp to 60 kb.

The human p107 gene (which encodes a protein of 107 kDa and is also known as RBL1) is located at chromosome 20q11.2, a region of special interest because of its association with some myeloid disorders. The arrangement of the p107 gene is similar to that of the other members of the Rb family. It is composed of 22 exons that vary in length from 50 to 840 bp, spanning approximately 100 kb of genomic DNA.

The human Rb2 gene (which encodes a protein of 130 kDa and is also called RBL2) maps to chromosome 16q12.2. The Rb2 messenger RNA is 4.6 kb in length; the gene consists of 22 exons and spans over 50 kb of genomic DNA, and the 21 introns vary in size from 82 bp to almost 9 kb.
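The gene-structure figures quoted above can be collected in a small data structure as a consistency check. The following is a minimal sketch, not a curated annotation: the class name `GeneStructure`, its field names, and the helper `intron_counts` are hypothetical, and the spans are the approximate values given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneStructure:
    """Approximate gene structure as quoted in the text (hypothetical container)."""
    name: str
    locus: str
    exons: int
    approx_span_kb: int  # approximate genomic span in kilobases

    @property
    def introns(self) -> int:
        # A gene with n exons has n - 1 introns separating them.
        return self.exons - 1


RB_FAMILY = [
    GeneStructure("RB", "13q14", 27, 200),
    GeneStructure("p107", "20q11.2", 22, 100),
    GeneStructure("Rb2", "16q12.2", 22, 50),  # the text says "over 50 kb"
]


def intron_counts(genes):
    """Map each gene name to the intron count implied by its exon count."""
    return {g.name: g.introns for g in genes}


print(intron_counts(RB_FAMILY))
```

The intron count derived for Rb2 agrees with the 21 introns stated in the text.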

The three Rb-family proteins are also known as 'pocket proteins', after their conserved pocket region, which is composed of two conserved domains (A and B) separated by a spacer (Figure 1). The pocket is important for the binding of other proteins (see below). The exons encoding domain A of the RB gene (exons 11-17), domain B (20-23), and the spacer region between domains A and B (18 and 19) are very similar in all members of the family. Interestingly, amino-acid residues that are identical between p107 and pRb2 are also found in the same exonic positions. This feature is not shared with RB, suggesting a closer evolutionary relationship between the p107 and Rb2 genes [7]. Additionally, the spacer regions of Rb2 and p107 show higher similarity to each other than to RB [8].

Figure 1

Comparison of the genomic structure of human retinoblastoma genes and of the functional domains in different Rb-related proteins. Boxes indicate exons of the human RB gene; hatched boxes indicate exons encoding domains A and B. Adapted from [7].

Evolutionary history

The Rb-family proteins are fairly well conserved over a range of species. The arrangement of helices in domain A of pRb strongly resembles the cyclin-box folds found in cyclin A and the transcription factor TFIIB [9]; the Rb-family proteins may therefore have arisen in evolution by a tandem duplication of this fold. A phylogenetic tree of pRb protein sequences is shown in Figure 2.

Figure 2

Phylogenetic tree illustrating the diversity of pRb in eight representatives of different phyla and kingdoms. Numbers are branch lengths, which correspond to the estimated evolutionary distance between protein sequences. The tree was constructed using ClustalW.

Mammals

Homologs of the three human Rb-family proteins have also been found in mice. The mouse pRb2 protein has a 43-amino-acid deletion in the pocket domain compared with the human homolog and other members of the family. This region (starting at residue 211 in human pRb2 [10,11]) is highly conserved in human pRb2 and p107, showing 70% identity over the 43 amino acids, but human pRb2 and pRb are only 50% identical over 21 amino acids of the region [12]. The corresponding region in mouse p107 binds and represses the transcription factor Sp1, but the significance of this deletion in mouse pRb2 remains unclear [13]. In both humans and mice, pRb2 shows a higher identity in amino-acid sequence to p107 than to pRb. Regions conserved between pRb2 and pRb are limited to the A and B domains of the pocket region, but conserved regions between pRb2 and p107 appear throughout the entire length of the protein, especially in the amino-terminal region, suggesting that the amino-terminal region could be very important for their functions [12]. Domains A and B and the carboxy-terminal region are highly conserved between the human and mouse p107 proteins. Domains A and B exhibited 90.6% and 89.4% identity respectively, and the carboxy-terminal region showed 91.5% similarity. With the exception of the 100% identity found in the string of amino acids stretching from position 782 to 889 in the B domain of human and mouse p107, the highest level of homology (94%) was found in the amino-terminal domain [14].

Rat pRb2 is almost 90% identical in amino-acid sequence to human pRb2 [15,16]. The 4.87 kb cDNA contains a 1,135-amino-acid open reading frame with high homology to human and mouse Rb2 and partial homology to Rb. Rat pRb2 and rat pRb are conserved only in the pocket region, where they are only 32% identical [16].

Other vertebrates

Comparison of chicken Rb-family proteins with those of mouse, human and Xenopus reveals a 66% amino-acid identity in the A and B domains of the pocket region but only 33% identity in the spacer between A and B [17]. A 20-amino-acid sequence at the carboxyl terminus is completely conserved in all the aforementioned Rb-family sequences, but its biological function is not yet clear. Although the chicken Rb-family proteins are very similar to the pRb homologs in mouse, human and Xenopus in multiple regions, they also have characteristics that are unique to each species. The region near the amino terminus is the most variable in Rb proteins in these four species. Chicken and Xenopus pRb each have a unique amino terminus that is shorter than that of the mouse and human homologs [17,18]. There are no known homologs of human p107 or pRb2 in Xenopus or chicken.

Invertebrates

The Drosophila RBF protein is intriguing, as it has structural features that resemble all three members of the Rb family, suggesting that the RBF gene may have evolved from a common ancestor of the human Rb-family genes. Paradoxically, the nucleotide sequence of RBF is more similar to the human p107 and Rb2 genes than it is to the RB gene, but the RBF protein sequence has a higher percentage identity with pRb than with p107 or pRb2. The highly conserved spacer domain found in both p107 and pRb2 is absent in RBF, as is a long insertion in the B segment of the pocket domain, which is present in pRb2 and p107 but not in pRb [19].

The nematode Caenorhabditis elegans has a protein called LIN-35 that has significant sequence similarity with the human Rb pocket proteins [20]. LIN-35 shows 20% identity to human pRb2, 19% to p107, 15% to pRb, and 16% to Drosophila RBF [20]. The highest conservation is found in domains A and B, but the spacer region is not as highly conserved; it is short, as in human pRb. Because LIN-35 is not particularly similar in sequence to any one of the human Rb family proteins, LIN-35 may have diverged from an ancestor common to the Rb family proteins.
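Percentage identities such as those quoted above are computed from aligned sequences. The following is a minimal sketch of one common convention (identical residues divided by aligned, gap-free columns); the function name `percent_identity` and the toy sequences are illustrative assumptions, and the cited studies may have used a different denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy aligned fragments (made up for illustration, not real Rb-family sequence).
print(round(percent_identity("MKT-LLA", "MRTALLG"), 1))
```

Because the result depends on how gaps and sequence ends are counted, identity values from different studies are only roughly comparable.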

Plants

Until recently, it was thought that the Rb family proteins were peculiar to vertebrates [21], but in 1998 a homolog was cloned in a plant [22], and Rb homologs have now been found in maize, tobacco, Chenopodium rubrum (red goosefoot) and Arabidopsis. The conservation of Rb and of other components of the Rb pathway in plants suggests that Rb may have an important role in the development of all multicellular organisms, not just animals. The highest level of identity with human Rb-family proteins (20-35%) is found in the pocket region [23].

There is no evidence of an Rb pathway in any unicellular organism, but the mat3 gene of the unicellular green alga Chlamydomonas reinhardtii, which belongs to the land plant lineage, has a domain structure homologous to Rb [22]. It contains a pocket region with two domains separated by a spacer and also has the sequence Leu-X-Cys-X-Glu (LxCxE in the single-letter amino-acid code, where x indicates a non-conserved amino acid), which is characteristic of the Rb-family proteins and is thought to be a peptide-binding site (see Characteristic structural features). Unlike mammalian Rb-family mutants, however, mat3 mutants do not have a shortened G1 phase, do not enter S phase prematurely, and can exit the cell cycle and differentiate normally, indicating that this Chlamydomonas gene has a different role from that of animal Rb-like genes [24].
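The LxCxE pattern described above can be located in a protein sequence with a simple pattern match. The following is a minimal sketch; the function name `find_lxcxe` and the toy peptide are illustrative assumptions, and a sequence match alone does not show that a site is functional.

```python
import re

# Leu, any residue, Cys, any residue, Glu. The lookahead lets overlapping
# occurrences be reported as well.
LXCXE = re.compile(r"(?=(L[A-Z]C[A-Z]E))")


def find_lxcxe(sequence: str):
    """Return (1-based start position, matched residues) for each LxCxE occurrence."""
    sequence = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in LXCXE.finditer(sequence)]


# Toy peptide used only to exercise the function.
print(find_lxcxe("DLYCYEQLN"))
```

Positions are reported 1-based, following the usual convention for numbering amino-acid residues.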

Characteristic structural features

The three pocket proteins consist of an amino-terminal domain, a pocket region composed of two conserved domains (A and B, residues 373-771 in human pRb) separated by a spacer region, and a carboxy-terminal domain. The pocket domain is responsible for interaction of the protein with transcription factors, cyclins, and cyclin-dependent kinases (CDKs), and for its functional activity [1,6,8,14,25,26]. The pRb2 and p107 proteins are thought to be more closely related to each other than they are to pRb. Some amino acids present in the B region of pRb are lacking in p107 and in pRb2. Conversely, p107 and pRb2 share a motif in the spacer region that is absent from the pRb sequence. This motif enables them to form a strong binding site for cyclin A-Cdk2 and cyclin E-Cdk2 [8,14,26,27,28,29]. Additionally, pRb2 and p107 share a sequence near the amino terminus that is missing in pRb. A 20-amino-acid sequence at the carboxyl terminus is completely conserved in most Rb-family homologs, but its biological function has not yet been clarified.

The crystal structure of human pRb domain A shows that it is composed of nine α helices, two of them forming a hydrophobic core and the remaining seven surrounding this core [9]. In domain A, 47 amino acids are completely conserved between the three human Rb-family proteins, of which 21 are polar and 26 are non-polar. The majority of the conserved non-polar residues interact to stabilize the tertiary structure of the proteins. Interactions of the polar conserved residues suggest that they also have a role in stabilizing the tertiary structure. The A and B domains of the pRb pocket region have a cyclin-fold structural motif that is also common to cyclins and the transcription factor TFIIB. The pRb pocket domain also has a β hairpin, an extended tail, and eight additional helices. The cyclin folds of the B domain are more similar to the cyclin folds in cyclin B and TFIIB than they are to the cyclin folds of the A domain [30].

Both domains A and B are required for interactions with viral oncoproteins and cellular transcription factors [9,30]. The pocket region can also bind proteins that lack the LxCxE motif, such as the E2F family of transcription factors [31,32,33,34,35]. In pRb2 and p107 the spacer region can bind cyclin A-Cdk2 and cyclin E-Cdk2 complexes [36,37,38,39]. The surface residues of pRb that are conserved across species and with human p107 and pRb2 proteins cluster in two regions: the LxCxE-binding site in the B domain and the interface between the A and B domains. The conservation of this interface suggests that it may participate in binding to E2F or to proteins that may mediate transcriptional repression by pRb. Conservation of regions within the LxCxE binding site across species indicates its structural and functional importance. The four residues that meet the backbone of the peptide are identical in pRb homologs from human, newt, chicken, fruit fly and maize and in the human p107 and pRb2 [30].

A dozen distinct phosphorylation sites have been found in the spacer region, but the exact number of serine and threonine residues of pRb that can be phosphorylated during the G1 phase remains undefined [40]. Phosphorylation of pRb is important because it can influence its relationship with interacting proteins [40]. Ten of the potential phosphorylation sites are fully conserved between the three members of the Rb family in the rat: four in the amino-terminal region, five in the carboxy-terminal region, and one in the spacer region [16].

Localization and function

Subcellular distribution

Rb-family proteins are found in the nucleus. High-resolution deconvolution microscopy studies have revealed that, during G1 and S phases, the three pocket proteins are found in perinucleolar foci [41]. A recent study reported that some mechanisms of control of the cell cycle correlate, at least in part, with the compartmentalization of Rb proteins within the nucleus [42]. For example, the cell-cycle-dependent binding of pRb2 and p107 to the E2F4 transcription factor changes as a function of their subnuclear localization. Specifically, in the nucleoplasm, pRb2-E2F4 complexes are more numerous during G0 and G1 phases, whereas in the nucleolus they increase in S phase. In contrast, p107-E2F4 complexes in the nucleoplasm are more numerous in S phase than in G0 or G1 phases, and no cell-cycle change is observed in the nucleolus [42].

Additionally, pRb2, p107, E2F4 and the complexes between pRb2 and the histone deacetylase HDAC1 are all associated with the inner nuclear matrix, and they localize to sites different from pRb. The nuclear matrix, which is composed of chromatin and filamentous structures, is an integral part of nuclear structure and undergoes profound reorganization during DNA replication, gene expression and mitosis [43]. Recently, it has been shown that pRb is associated with the nuclear matrix only during G0 and G1 phases [44], whereas pRb2 and p107 associate with the nuclear matrix in a phase-independent manner [42]. According to Mancini et al. [44], pRb is distributed widely throughout the matrix, particularly at the nuclear periphery and in nucleolar remnants, whereas the core filaments of the matrix contain no detectable pRb. Significantly larger amounts of pRb2, p107, E2F4, and their complexes were found in interchromatin than in heterochromatin regions [42]. Because active transcriptional sites are confined to the less-condensed interchromatin regions, it is not surprising that both Rb-related proteins and E2Fs, possibly associated with HDAC1, are more numerous in these regions.

The phosphorylation status of pRb2 and p107 regulates their association with different parts of the nuclear matrix. In extracts from G0/G1-phase cells, pRb2 and p107 are primarily in a hypophosphorylated state; in S-phase extracts, p107 remains hypophosphorylated but pRb2 is hyperphosphorylated, weakly bound to the nuclear matrix and inactivated. This suggests that the repressive control exerted by pRb2 could be more intricate than that of pRb, because the interaction of pRb2 with the nuclear matrix is modulated by phosphorylation as the cell moves from G1 to S phase. Nuclear structure may bring specific sequences together with transcription factors in both normal and transformed cells [45]; the association of p107 and pRb2 with the inner nuclear matrix is therefore a promising new area of research.

Tissue expression patterns

The three Rb-family members vary in their expression patterns in different tissues at various stages of the cell cycle: pRb is abundant during all phases of the cell cycle, showing only slight variations in expression levels but significant differences in its phosphorylation status; pRb2 is detectable at high levels in non-proliferating cells; and p107 expression is lost in cells that have withdrawn from the cell cycle, but is high throughout the proliferative cell cycle [31,46,47,48]. The pRb protein is ubiquitously expressed in normal cells and tissues. All three pocket proteins are highly expressed in some differentiated cells, although the pattern of expression is cell-type-specific. In neurons and in skeletal muscle cells there is a high expression of pRb2, whereas p107 shows higher expression levels in breast and prostate epithelial cells [49].

Functions

The retinoblastoma protein was originally described as a tumor suppressor, as it was found to be mutated in many forms of cancer. The region to which the human p107 gene maps (20q11.2) is not normally mutated in tumor cells, but a fraction of human myelogenous leukemias contain deletions of this region [1]. Mutations or deletions within the region containing Rb2 (16q12.2) have been described several times in human neoplasias, including breast, hepatic, ovarian, and prostatic cancers, suggesting that it is also a tumor suppressor [15]. Even though the pocket proteins are highly similar in many ways, each member of the family has distinct functions and has a non-redundant role [39]. Pocket-protein functions sometimes appear redundant, however, such as when the loss of one family member by mutation is totally or partially compensated for by the activity of another family member [50,51,52,53].

Growth-suppressive properties

The three Rb-family members can inhibit cell growth, acting on the cell cycle between G0 and S phases, primarily through binding and inactivation of transcription factors [54]. The growth-suppressive activity of the Rb-family members is cell-type-specific: for example, the C33A human cervical carcinoma cell line is inhibited by overexpression of p107 [25] and pRb2 [55], but not by pRb, whereas the T98G human glioblastoma cell line is sensitive to the growth-suppressive effects of pRb2 yet is unresponsive to that of pRb and p107 [25,56]. Saos-2 human osteosarcoma cells are growth-arrested in the G0/G1 phase of the cell cycle by all of the Rb-family members [25,56,57]. Together, these findings indicate that there are some fundamental differences in the molecular pathways by which the different Rb-family proteins exert control over the cell cycle.

The Rb family and differentiation

The pRb protein has an integral role in various differentiation processes, such as adipogenesis, myogenesis and hematopoiesis [58,59]. Studies on cellular differentiation have shown an interaction between pRb and several differentiation-specific transcription factors, such as the basic helix-loop-helix transcription factor MyoD, nuclear factor activated by interleukin-6 (NF-IL6) and the HMG-box-containing repressor HBP1 [60,61,62]. For example, members of the MyoD family associate with pRb, and the binding of pRb to MyoD is thought to induce activation of genes that are specific for myogenic development. This is supported by the finding that cell lines lacking a functional RB are unable to convert into myogenic cells [63].

Perhaps the most convincing evidence of the importance of pRb in cellular differentiation and specialization comes from studies of RB knockout mice. Homozygous germline disruptions of the RB gene cause death by day 14 of gestation, associated with gross defects in the development of the hematopoietic and central nervous systems [64,65,66].

The Rb family and apoptosis

In addition to the canonical role of RB as a tumor suppressor gene, it has recently been discovered that pRb also acts as an anti-apoptotic factor. The evidence implies that transforming growth factor β1 (TGF-β1) induces apoptosis by suppressing pRb expression [67], and that the active hypophosphorylated form of pRb inhibits the apoptotic function of interferon γ (IFN-γ) [68]. Experiments performed with RB-/- mice demonstrated that widespread cell death occurs in tissues that normally express high levels of pRb, such as liver, ocular lens, nervous system, and skeletal muscle tissue [64,65,66,69]. The p107 protein could also have an anti-apoptotic effect: RB-/-p107-/- double-knockout mouse embryos have more extensive apoptosis in their central nervous system and liver than single-mutant RB-/- embryos [70]. Further studies are needed, though, to clarify the role of pRb2 in this context.

The Rb family and angiogenesis

Proper vascularization is necessary for the formation of a tumor mass and for invasion of other tissues during metastasis [71]. New blood vessels form a network in the tumor mass that provides the nourishment and substrates necessary for the progression of tumorigenesis. In fact, if a tumor is not nourished by a blood supply, its diameter is limited to 1-2 mm [72]. Vascularization is controlled by the finely balanced activities of angiogenic and anti-angiogenic molecules, which act in opposition to each other [72,73]. Two of the major factors regulating angiogenesis are the vascular endothelial growth factor (VEGF) and the multifunctional protein thrombospondin-1 (TSP-1). Recent evidence shows that pRb2, like the oncoprotein Ras and the tumor suppressor p53, is involved in angiogenesis [74,75,76]. Enhanced expression of pRb2 through virus-mediated gene transfer in tumors grown in nude mutant mice downregulates VEGF expression, contributing to the inhibition of tumor formation [76]. To date, no reports have been published on the role of the other Rb-family members in angiogenesis.

Frontiers

During the past ten years our understanding of cell-cycle events has increased exponentially. Most of the work on the Rb family so far has focused on the development of assays that enhance our understanding of the key cell-cycle players. With the advent of proteomics, the next steps will be to study the interactions among different proteins and to discern the different protein-expression profiles that occur in normal and diseased tissues. These studies will help to find novel diagnostic and prognostic markers, as well as new and more specific targets for future molecular therapies.