Journal of Molecular Evolution, Volume 58, Issue 6, pp 642–652

Linkage of the β-Like ω-Globin Gene to α-Like Globin Genes in an Australian Marsupial Supports the Chromosome Duplication Model for Separation of Globin Gene Clusters

Authors

  • David Wheeler
    • Department of Molecular Biosciences, The University of Adelaide
    • Centre for Evolutionary Biology and Biodiversity, The University of Adelaide
  • R. M. Hope
    • Department of Molecular Biosciences, The University of Adelaide
    • Centre for Evolutionary Biology and Biodiversity, The University of Adelaide
    • Australian and New Zealand Council for the Care of Animals in Teaching and Research, University of Adelaide
  • Steven J. B. Cooper
    • Evolutionary Biology Unit, South Australian Museum
    • Centre for Evolutionary Biology and Biodiversity, The University of Adelaide
  • Andrew A. Gooley
    • Proteome Systems Ltd.
  • Robert A. B. Holland
    • School of Physiology and Pharmacology, University of New South Wales
Article

DOI: 10.1007/s00239-004-2584-0

Cite this article as:
Wheeler, D., Hope, R.M., Cooper, S.J.B. et al. J Mol Evol (2004) 58: 642. doi:10.1007/s00239-004-2584-0

Abstract

The structure, function, and evolutionary history of globin genes have been the subject of extensive investigation over a period of more than 40 years, yet new globin genes with highly specialized functions are still being discovered and much remains uncertain about their evolutionary history. Here we investigate the molecular evolution of the β-globin gene family in a marsupial species, the tammar wallaby, Macropus eugenii. We report the complete DNA sequences of two β-like globin genes and show by phylogenetic analyses that one of these genes is orthologous to embryonically expressed ε-globin genes of marsupials and eutherians and the other is orthologous to adult expressed β-globin genes of marsupials and eutherians. We show that the tammar wallaby contains a third functional β-like globin gene, ω-globin, which forms part of the α-globin gene cluster. The position of ω-globin on the 3′ side of the α-globin cluster and its ancient phylogenetic history fit the criteria, originally proposed by Jeffreys et al. (1980), of a “fossil” β-globin gene and suggest that an ancient chromosome or genome duplication preceded the evolution of unlinked clusters of α- and β-globin genes in mammals and avians. In eutherian mammals, such as humans and mice, ω-globin has been silenced or translocated away from the α-globin locus, while in marsupials ω-globin is coordinately expressed with the adult α-globin gene just prior to birth to produce a functional hemoglobin (α2 ω2).

Keywords

Globin gene evolution, α-globin, β-globin, ε-globin, ω-globin, Tammar wallaby, Macropus eugenii, Marsupial, Molecular evolution

Introduction

All vertebrates studied to date possess a variety of globin genes, including families of α-like and β-like globin genes that together encode the tetrameric oxygen-carrying molecule hemoglobin. In amphibians (e.g., Xenopus laevis) and fish (e.g., Danio rerio) the α- and β-globin gene families, each containing a set of differentially expressed genes, are closely linked, generally with α-globin genes located at the 5′ side of the cluster (Jeffreys et al. 1980; Hosbach et al. 1983; Chan 1997). The linkage of these genes provides evidence that the α-globin and β-globin genes evolved by a process of tandem duplication of a single primordial globin gene (Jeffreys et al. 1980). In birds and mammals the α-like and β-like globin genes are arranged in two distinct unlinked α- and β-globin clusters of differentially expressed genes (Deisseroth et al. 1977, 1978; Hughes et al. 1979; Wainwright and Hope 1985). These findings led to the proposal that the α–β-globin linkage represents an ancestral arrangement and the genes were separated prior to the divergence of the avian and mammalian lineages from a common ancestor. The mechanism of this separation could be through translocation, with the breakpoint between the α- and the β-globin genes, or by a process of chromosome (sometimes referred to as in trans) duplication of the locus (Jeffreys et al. 1980; Hardison 2001a). The latter mechanism would require that each duplicated, unlinked cluster then evolved to the present arrangement, found in birds and mammals, by silencing of the linked α- or β-globin genes. Support for this chromosome-duplication hypothesis would be provided through the identification of remnant or so-called “fossil” α- or β-globin sequences adjacent to the β- or α-globin clusters, respectively, in mammals or avians. Based on the gene orientation in X. laevis, Jeffreys et al. (1980) predicted that such “fossil” α-globin genes would lie to the 5′ side of β-globin clusters and “fossil” β-globin genes would lie to the 3′ side of α-globin clusters.

Eutherian mammals generally contain four or more differentially expressed β-globin genes. For example, the human β-globin gene cluster consists of five functional genes, including one embryonic (ε), two fetal (Aγ and Gγ), and two adult expressed genes (δ and β), arranged in the order of their expression during development, 5′-ε–Aγ–Gγ–δ–β-3′ (Efstratiadis et al. 1980; Fritsch et al. 1980). Extensive sequence comparisons among β-globin genes in a wide range of eutherian mammals led to the proposal that a four- or five-gene ancestral β-globin cluster existed prior to the radiation of eutherian mammals (Goodman et al. 1984; Hardison 1984). The currently accepted model of β-globin gene evolution in mammals is that the first tandem duplication to produce stem-embryonic and stem-adult β-globin genes occurred approximately 150 million years ago (mya), after the divergence of the avian and mammalian lineages. Just prior to the radiation of eutherian mammals the adult gene duplicated once and the embryonic gene two times to produce a five-gene cluster, 5′-ε–γ–η–δ–β-3′. Support for this model of molecular evolution has come from studies of β-globin gene families in two marsupial species, the North American opossum Didelphis virginiana (Koop and Goodman 1988) and the Australian dunnart Sminthopsis crassicaudata (Cooper and Hope 1993; Cooper et al. 1996). Each marsupial species was found to have a single embryonic (ε) and adult (β) expressed gene, with marsupial ε-globin orthologous to eutherian ε-, γ-, and η-globin genes, and marsupial β-globin orthologous to β- and δ-globin genes. In S. crassicaudata the genes were linked in the order 5′-ε–β-3′, matching the 5′–3′ arrangement of embryonic and adult β-globin genes in eutherians. These results indicated that a relatively simple two-gene cluster 5′-ε–β-3′ existed in a common ancestor of all marsupials and eutherians.

This model of the molecular evolution of β-globin genes in mammals was recently revised following the discovery of a functional hemoglobin in the marsupial Macropus eugenii which contains unique beta-type chains termed omega (Holland and Gooley 1997; Holland et al. 1998). This third β-like globin gene in marsupials, called ω-globin, appears to be more closely related to avian β-globin genes than to other mammalian β-globin genes, including ε-globin and β-globin of marsupials (Wheeler et al. 2001). Both protein and RNA sequencing analyses confirmed that ω-globin is transcriptionally active in marsupial neonates. Significantly, it was shown that ω-globin is unlinked to the main β-globin gene cluster in species from two marsupial families, the dasyurid Sminthopsis crassicaudata and the macropod Macropus eugenii (Wheeler et al. 2001). This finding and the monophyletic grouping of marsupial ω-globin genes with avian β-like globin genes suggested that β-globin clusters of avians and mammals are not orthologous (Wheeler et al. 2001; Hardison 2001b). Under this model each cluster evolved independently prior to the divergence of the avian and mammalian lineages, with one gene cluster diverging to generate the cluster found in avians and ω-globin in marsupials, and the second cluster diverging to generate the ε–β cluster found in marsupials and the ε–γ–η–δ–β cluster of eutherian mammals.

Although the tammar ω-globin gene has been fully sequenced and characterized, characterization of the β-globin gene family in this species at the molecular level is currently incomplete. Clues to the types and structure of hemoglobin present in this species have come from a number of biochemical and physiological studies of neonatal and embryonic blood (Holland et al. 1988; Holland and Gooley 1997). It has been shown that the embryonic blood of the tammar wallaby has unusual respiratory properties in that, unlike eutherian embryonic blood, the oxygen equilibrium curve (OEC) is to the right of the OEC for the maternal blood (Baudinette et al. 1988; Holland et al. 1988; Tibben et al. 1991). This arrangement is unfavorable for the transfer of oxygen from the mother to the embryo. In addition, the Hill coefficient of embryonic blood was found to be greater than 4, indicating the presence of aggregate forms of hemoglobin larger than tetramers (Holland et al. 1988; Tibben et al. 1991; Calvert et al. 1993). Initial characterization of the hemoglobins of tammar neonates revealed the presence of four major embryonic hemoglobins, each with a different isoelectric point (Holland et al. 1988). Isolation and partial sequencing of hemoglobin polypeptides from tammar wallaby neonates showed the existence of two different β-like chains (ε-globin and ω-globin) and three different α-like chains (ζ-globin, ζ′-globin, and adult α-globin), which are assembled in the combinations α2–ω2, α2–ε2, ζ2–ε2, and ζ′2–ε2 (Holland and Gooley 1997). The embryonic hemoglobins are gradually replaced by adult hemoglobin (α2–β2) from about 2–3 days after birth. It is, therefore, expected that the β-globin gene family of the tammar wallaby contains at least three functional β-like globin genes.
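The respiratory argument above can be illustrated with the Hill equation, Y = pO2^n / (P50^n + pO2^n), in which a right-shifted OEC corresponds to a higher P50. The following is only a minimal sketch: the P50 values, the oxygen tension, and the Hill coefficients are hypothetical placeholders chosen for illustration, not measurements from the studies cited.

```python
def hill_saturation(po2: float, p50: float, n: float) -> float:
    """Fractional hemoglobin saturation from the Hill equation."""
    return po2 ** n / (p50 ** n + po2 ** n)

# Hypothetical values, chosen only to illustrate a right-shifted curve.
MATERNAL_P50 = 30.0   # mmHg, assumed
EMBRYONIC_P50 = 40.0  # mmHg, assumed (right-shifted: higher P50)
PO2 = 35.0            # mmHg, assumed oxygen tension at the exchange surface

maternal = hill_saturation(PO2, MATERNAL_P50, n=2.8)    # n assumed
embryonic = hill_saturation(PO2, EMBRYONIC_P50, n=4.5)  # n > 4, as reported for embryonic blood
print(f"maternal {maternal:.2f}, embryonic {embryonic:.2f}")
```

With these placeholder numbers the embryonic blood is less saturated than the maternal blood at the same oxygen tension, which is the sense in which a right-shifted curve is unfavorable for oxygen transfer to the embryo.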

The aim of the research presented here is to characterize the β-globin gene family of the tammar wallaby and further investigate the molecular evolution of the ω-globin gene. In particular, we investigate the hypothesis that the β-globin clusters of avians and mammals are not orthologous, by characterizing genes that are syntenic to ω-globin and by carrying out phylogenetic analyses based on new Bayesian approaches. Our results show that the β-globin gene family in the tammar consists of three functional genes, ε-, β-, and ω-globin. Significantly, we show that ω-globin is part of the α-globin gene cluster in the tammar wallaby (henceforth referred to as the “tammar”), providing new insights into the complex evolutionary history and regulatory control of α- and β-globin genes.

Materials and Methods

Isolation and Sequencing of ε-globin

A 430-bp product was PCR-amplified with the degenerate primer, RogF (5′-atggtncattggacngcngaggaraa-3′; located in the terminal 5′ exon 1 region of ω-globin [Wheeler et al. 2001]) and P2 (5′-ccggaagttctcagggtccacatgc-3′; located in exon 2 of Sminthopsis crassicaudata ε-globin [Cooper and Hope 1993]). PCR amplifications were carried out in a final volume of 50 μl PCR buffer (GeneWorks, Adelaide, Australia) which contained 100 ng template DNA, 0.4 μmol each primer, 2.0 mM dNTPs, and 0.2 mM MgCl2. Cycling conditions were as follows: 95°C for 2 min, then 40 cycles of 95°C for 1 min, 45°C for 1 min, 72°C for 30 s. The PCR product was cloned into the plasmid pGEM-T (Promega) using conditions specified by Promega. The insert was sequenced in both directions using the vector-specific sequencing primers (sp6 and T7). DNA sequencing was carried out using an ABI PRISM Dye terminator kit and an ABI 377 DNA sequencer (Applied Biosystems, Melbourne, Australia). The 5′ end of the gene was PCR-amplified using a Genome Walker kit (Clontech). Adaptor-ligated tammar genomic DNA constructed using the Genome Walker kit was subjected to two rounds of amplification, first, with the PCR primers epsi1 (5′-agcaccttcttgccatgggccttgacc-3′) and AP1 (5′-gtaatacgactcactataggg-3′) and, second, with the primers epsi2 (5′-gatagcagatgcagaagataggttgcc-3′) and AP2 (5′-actatagggcacgcgtggt-3′). The resulting PCR products were purified and directly sequenced and found to contain the remaining exon 1 sequence of the putative tammar ε-globin gene. The 3′ end of tammar ε-globin was isolated using an RT-PCR approach. Total RNA, isolated from 1-day-old tammar neonates (Cooper et al. 1996), was reverse transcribed using an oligo(dT) primer (UT17: 5′-gtaaaacgacggccagttttttttttttttttt-3′) and the Superscript II reverse transcriptase kit (Gibco).
The ε-globin-specific primer esF (5′-actcctaagggaatctgagag-3′) was used with UT17 to PCR amplify an ∼530-bp cDNA which was purified using a Bresaclean purification column (GeneWorks, Adelaide, Australia) and directly sequenced. The RT-PCR product contained the remaining region of exon 3 of tammar ε-globin. The full genomic DNA sequence of the gene was determined by PCR amplification of the region between exon 2 and exon 3, using the primer esF and a reverse primer CDR3 (5′-gcttctgccargmagcmtg-3′) located in the middle of exon 3.
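The degenerate primers above use IUPAC ambiguity codes (n, r, m), so each primer stands for a pool of oligonucleotides. As a minimal sketch, the size of that pool can be counted as follows; the helper name `degeneracy` is ours, and only the primer sequences come from the text.

```python
from math import prod

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at", "k": "gt", "m": "ac",
    "b": "cgt", "d": "agt", "h": "act", "v": "acg", "n": "acgt",
}

def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides represented by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.lower())

rogf = "atggtncattggacngcngaggaraa"  # RogF, from the text
cdr3 = "gcttctgccargmagcmtg"         # CDR3, from the text
print(degeneracy(rogf), degeneracy(cdr3))
```

RogF contains three n positions and one r position, and CDR3 contains one r and two m positions, which is what drives the two counts.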

Isolation and Sequencing of Adult β-globin

Coding regions from adult β-globin genes of the marsupials Sminthopsis crassicaudata (GenBank accession number Z69592) and Didelphis virginiana (GenBank accession number J03643) were aligned using CLUSTALW (Higgins et al. 1994) and a degenerate β-globin-specific primer (bgf: 5′-gactggtgstgaggccct-3′) was designed to hybridize to a region of high conservation in the first exon of these genes. The primer bgf was used with P2 to PCR-amplify an ∼390-bp product from the tammar wallaby. The amplified PCR product was purified using a Bresaclean purification column (GeneWorks) and sequenced on both strands using conditions given above. The Genome Walker kit (Clontech) was used to PCR-amplify and sequence the 5′ and 3′ regions of the putative tammar β-globin gene. Depending on the specificity of the Genome Walker PCR, products were either sequenced directly or cloned prior to sequencing. The complete intron II sequence of tammar β-globin could not be determined, due to difficulty in “reading” through a CT repeat within this intron.

Construction and Screening of a Tammar Genomic Library for ω-globin Clones

Tammar gDNA (130 μg) was partially digested with Sau3AI using conditions that favor the production of fragments 15–23 kb in size. Fragments in this size range were purified from a 10–40% (w/v) sucrose gradient and ligated to BamHI arms from λGEM-11. The ligated products were packaged into Packagene helper phage and used to infect E. coli strain KW251 as described by the manufacturer (Promega).

Plaques from the resultant library (λTG) were screened for ω-globin sequences by hybridization to an ω-globin-specific probe (OSP) containing a region of exon 1, intron 1, and exon 2 of tammar ω-globin, PCR-amplified using the primer combination drsF (5′-ggagaaacagatcattttagcc-3′) and P2 (see above). Probe DNA was radiolabeled with [α-32P]dATP using the GIGA-prime labeling kit (GeneWorks) per the manufacturer’s instructions. Two hybridizing clones (λTG3.2 and λTG3.4) were isolated after primary, secondary, and tertiary screens using standard techniques (Sambrook et al. 1989). A long-range PCR approach was used initially to characterize the clones. Combinations of the primers sp1 and T7, which specifically bind to the λ-arms, were used with forward (rseF: 5′-cgggaccctggtcatgagtgc-3′) and reverse (P2) ω-globin-specific primers to PCR-amplify upstream and downstream regions of ω-globin using Elongase polymerase (Life Technologies). This analysis showed that the genomic insert within λTG3.4 included approximately 7.5 kb of DNA upstream of tammar ω-globin, while λTG3.2 included ∼2.5 kb of DNA upstream of ω-globin. In order to characterize the upstream region, insert DNA from λTG3.4 was restriction mapped and subjected to Southern analysis using OSP as a probe and standard techniques (Sambrook et al. 1989). In order to obtain an overlapping clone that extended farther 5′ of ω-globin, a 3-kb XhoI/XbaI fragment, pTG3X, representing the most 5′ region of λTG3.4 cloned into pBluescript (Fig. 2), was used to reprobe the λTG library and a single hybridizing plaque (λTG2.11) was isolated. Restriction mapping and sequencing of restriction fragments which hybridized to the pTG3X probe showed that λTG2.11 had a 2.2-kb DNA overlap with λTG3.4 but included an additional 15 kb of sequence 5′ to ω-globin.

A region of the λTG2.11 clone, representing 7250 bp of DNA upstream from ω-globin, was sequenced and the resulting DNA sequence was subjected to a BLAST search of the NR nucleic acid database (NCBI; http://www.ncbi.nlm.nih.gov/). The highest level of sequence identity was exhibited to exon III of the human α1- and α2-globin genes. This analysis revealed that λTG3.4 contains the third exon of an α-like globin gene, which was provisionally named tammar-α2. The second exon of tammar-α2 was identified by subcloning the 2.5-kb SacI fragment of λTG2.11 (pTGS) that hybridized to the pTG3X probe in Southern analysis. The remaining regions of tammar-α2 were obtained by subcloning the ∼4-kb BamHI fragment of λTG2.11 (pTGB) into the vector pBluescript. DNA sequencing of the pTGB and pTGS inserts allowed the complete coding region of the gene to be identified.

Phylogenetic Analyses

Evolutionary relationships among β-like and α-like globin genes were assessed using the phylogenetic programs PAUP*4b10 (Swofford 2002) and MRBAYES (Version 3 [Huelsenbeck and Ronquist 2001]). The accession numbers of the globin gene sequences used in these analyses are given in the caption to Fig. 3 and phylogenetic analyses were restricted to the coding regions of each gene. Tests of homogeneity of base frequencies among taxa implemented using PAUP* rejected a hypothesis of homogeneity when all sites were included (χ2=126.15, p=0.001). Removal of third codon positions from the data set resulted in the acceptance of a hypothesis of homogeneity of base frequencies among taxa (χ2=24.96, p=1.0). These results imply that the heterogeneity in base frequencies among taxa occurred at third codon positions.
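A test of base-frequency homogeneity of this kind can be viewed as a Pearson chi-square test on a taxa × base table of counts. The sketch below computes the statistic and its degrees of freedom under that assumption; the counts are hypothetical, and the PAUP* implementation may differ in detail.

```python
def base_homogeneity_chi2(counts: dict[str, list[int]]) -> tuple[float, int]:
    """Pearson chi-square statistic and degrees of freedom for a
    taxa x base (A, C, G, T) contingency table of counts."""
    rows = list(counts.values())
    n_cols = len(rows[0])
    col_totals = [sum(row[j] for row in rows) for j in range(n_cols)]
    grand_total = sum(col_totals)
    stat = 0.0
    for row in rows:
        row_total = sum(row)
        for j, observed in enumerate(row):
            # Expected count if every taxon shared the pooled base frequencies.
            expected = row_total * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    dof = (len(rows) - 1) * (n_cols - 1)
    return stat, dof

# Hypothetical base counts (A, C, G, T) for three made-up taxa.
toy = {
    "taxon_1": [110, 120, 115, 99],
    "taxon_2": [105, 125, 118, 96],
    "taxon_3": [80, 150, 140, 74],
}
stat, dof = base_homogeneity_chi2(toy)
print(f"chi2 = {stat:.2f}, df = {dof}")
```

A large statistic relative to its degrees of freedom corresponds to rejecting homogeneity, as reported when all sites were included; dropping the most compositionally variable sites (here, third codon positions) reduces the statistic.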

Maximum parsimony (MP) analyses were carried out using PAUP* with a standard heuristic search option and 100 random stepwise additions. MP bootstrap analyses (Felsenstein 1985) were carried out using 1000 bootstrap pseudoreplicates. Preliminary phylogenetic analyses using MP were carried out using the Fugu rubripes β-globin gene (accession number AY170464) as an outgroup. In these analyses the Xenopus laevis β-globin gene (accession number XELHBBC) was found to group outside all other mammalian and avian β-like globin genes supported by a high bootstrap value of 88% and was thus considered the most appropriate outgroup sequence for further analyses.
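The bootstrap step resamples alignment columns with replacement to build each pseudoreplicate, and a tree search is then run on every pseudoreplicate. The following sketch shows only the resampling, on a hypothetical toy alignment; the function name and sequences are illustrative and are not taken from the analyses described here.

```python
import random

def bootstrap_alignment(alignment: dict[str, str], rng: random.Random) -> dict[str, str]:
    """Return one pseudoreplicate: alignment columns sampled with replacement."""
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        taxon: "".join(seq[i] for i in columns)
        for taxon, seq in alignment.items()
    }

# Hypothetical toy alignment; the real analyses used globin coding sequences.
toy = {
    "taxon_1": "ATGGTGCACC",
    "taxon_2": "ATGGTGCATC",
    "taxon_3": "ATGGTTCACT",
}
rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
print(len(replicates), replicates[0])
```

The bootstrap percentage for a node is then the share of pseudoreplicate trees that contain that grouping.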

To determine the most appropriate model of evolution for the globin sequence data, a series of likelihood ratio tests was carried out to compare different nested models using the program Modeltest (Posada and Crandall 1998) and PAUP*. The analyses indicated that the General Time Reversible model (Rodriguez et al. 1990), with a proportion of invariant sites and unequal rates among sites (Yang 1996) modeled with a gamma distribution (GTR+I+G), was the most appropriate model for the Bayesian analyses. Two analyses using MRBAYES (version 3b) were conducted. First, a single GTR+I+G model was applied (referred to as “linked” analysis) using default uninformative priors, running four chains simultaneously for 2 million generations, sampling trees every 500 generations. The likelihood and parameter values converged to relatively stationary values after about 10,000 generations and a conservative burn-in of 1000 trees was chosen, with a 50% majority rule consensus tree constructed from the remaining 3001 trees using PAUP*. The second analysis involved partitioning the data into first, second, and third codon positions, and MRBAYES was run in the “unlinked” mode applying a separate GTR+I+G model and base frequency estimates to each of the different partitions. The same priors, number of generations, and consensus tree generation were used as for the linked analysis above. The partitioned analysis was conducted to assess the influence of different rates and base frequency estimates at codon positions on the overall tree topology, given that there was evidence for base frequency heterogeneity at third codon positions.
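The sampling figures quoted above can be checked with a few lines of arithmetic. This is a sketch only; counting the starting tree as a sample is our inference from the 3001 retained trees reported, not something stated explicitly.

```python
generations = 2_000_000   # from the text
sample_every = 500        # from the text
burn_in_trees = 1000      # from the text

# Assumption: the starting tree is sampled too, hence the +1.
sampled_trees = generations // sample_every + 1
retained_trees = sampled_trees - burn_in_trees
burn_in_generations = burn_in_trees * sample_every

print(sampled_trees, retained_trees, burn_in_generations)
```

A burn-in of 1000 trees corresponds to 500,000 generations, far beyond the roughly 10,000 generations at which the likelihood values became stationary, which is why the burn-in is described as conservative.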

Given the evidence above that base frequencies at third codon positions may not be homogeneous among taxa, we also explored the use of log determinant (LogDet) analyses to assess whether this heterogeneity may have resulted in misleading topologies based on the MP and Bayesian approaches. LogDet analyses are relatively robust to variation in base composition among taxa (Lockhart et al. 1994). This analysis was conducted using PAUP* and the neighbor-joining algorithm and bootstrap analyses were conducted, with 500 pseudoreplicates, to determine support for nodes in the phylogeny.
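For orientation, one common normalized form of the LogDet (paralinear) distance between two aligned sequences x and y is d = -(1/4) ln[det F / sqrt(det Πx · det Πy)], where F is the 4 × 4 matrix of joint base frequencies and Πx, Πy are diagonal matrices of each sequence's base frequencies. The sketch below implements that form on hypothetical toy sequences; the exact formula used inside PAUP* may differ.

```python
from math import log, prod, sqrt

BASES = "ACGT"

def det(matrix: list[list[float]]) -> float:
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n, result = len(m), 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result

def logdet_distance(x: str, y: str) -> float:
    """Paralinear/LogDet distance for two aligned, gap-free sequences.
    Raises ValueError if the determinant ratio is not positive."""
    f = [[0.0] * 4 for _ in range(4)]
    for a, b in zip(x, y):
        f[BASES.index(a)][BASES.index(b)] += 1.0 / len(x)
    freq_x = [sum(row) for row in f]                             # base frequencies in x
    freq_y = [sum(f[i][j] for i in range(4)) for j in range(4)]  # base frequencies in y
    return -0.25 * log(det(f) / sqrt(prod(freq_x) * prod(freq_y)))

# Hypothetical toy sequences: y differs from x at one site out of 20.
x = "ACGT" * 5
y = "G" + x[1:]
print(round(logdet_distance(x, y), 4))
```

Because the base frequencies of each sequence enter the normalization separately, the distance does not assume that the two sequences share the same composition, which is the property that makes the method useful here.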

Results

Characterization of Tammar ε-globin and β-globin Genes

The tammar ε- and β-globin genes were isolated using degenerate PCR and genome walking approaches (see Materials and Methods). Each gene contains a three-exon/two-intron structure with conserved donor/acceptor splice sites and an open reading frame of 444 nucleotides encoding a polypeptide of 146 amino acids, typical of most other mammalian β-like globins (Fig. 1). The lengths of the first and second introns are, respectively, 121 and 1424 bp for ε-globin and 111 and ∼1409 bp for β-globin. The large size of the second intron is typical of the marsupial ε-globin and β-globin genes that have been characterized to date (Koop and Goodman 1988; Cooper and Hope 1993; Cooper et al. 1996). The promoter region of tammar β-globin has ATA, CAAT, and two CACCC sequences that are found in all previously characterized adult expressed β-like globin genes of mammals. The full sequence of the promoter region of tammar ε-globin was not determined.
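The figures in this paragraph can be cross-checked with simple arithmetic. In the sketch below, reading the two codons beyond the 146 residues as the initiator methionine and the stop codon is our assumption, and the β-globin intron II length is approximate because the (CT)n repeat was not fully sequenced.

```python
ORF_NT = 444            # open reading frame length, from the text
MATURE_RESIDUES = 146   # polypeptide length, from the text

codons = ORF_NT // 3
# Assumption: the two extra codons are the initiator Met and the stop codon.
extra_codons = codons - MATURE_RESIDUES

# Intron lengths in bp, from the text (beta intron 2 is approximate).
introns = {"epsilon": (121, 1424), "beta": (111, 1409)}
spans = {
    gene: ORF_NT + intron1 + intron2  # start codon through stop codon
    for gene, (intron1, intron2) in introns.items()
}
print(codons, extra_codons, spans)
```

On these numbers the second intron accounts for roughly 70% of the distance from start codon to stop codon in both genes.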
Figure 1

DNA sequences of (A) ε-globin and (B) β-globin of the tammar and corresponding predicted amino acid sequences in single-letter code. Noncoding DNA is shown in lowercase and the donor and acceptor splice signals of the introns are underlined. The sequences have been deposited in GenBank (accession numbers AY450927 [ε] and AY450928 [β]). The microsatellite sequence (CT)n in the second intron of tammar β-globin (in boldface) has not been completely sequenced.

Successful amplification of a cDNA identical to the coding region of tammar ε-globin confirmed that this gene was expressed in newborn tammars (data not shown). The protein encoded by the tammar ε-globin gene has a single-amino acid difference (position 52, alanine→serine) from the published partial (60-residue) ε-globin chain found in tammar blood (accession number P81042 [Holland and Gooley 1997]). This single-amino acid difference is unlikely to be due to a sequencing error, as the DNA sequence of tammar ε-globin was obtained from both cDNA and genomic templates. Although the amino acid encoded by the DNA sequence (SER) is also found in the dunnart and opossum ε-globin chains at this position, the difference may represent naturally occurring sequence differences between the animals used in each study. Although no tammar β-chain is available for comparisons, the amino acid sequence encoded by tammar β-globin showed 96% identity to the adult β-globin chain of a related macropodid marsupial (eastern gray kangaroo; accession number HBKG2G).

Phylogenetic Analyses

The orthology of tammar ε-globin and β-globin with embryonic and adult β-like globin genes of marsupials, respectively, was confirmed using phylogenetic analyses. In maximum parsimony (MP), Bayesian, and LogDet analyses tammar ε-globin and β-globin were monophyletic with Didelphis virginiana (opossum) and Sminthopsis crassicaudata (dunnart) ε-globin and β-globin, respectively, supported by high bootstrap (100%) in unweighted MP and LogDet analyses and high posterior probabilities (100%) in Bayesian analyses (Fig. 3, Table 1). MP analyses that excluded third codon positions gave low bootstrap support for monophyly of marsupial ε-globin genes, most likely because of the reduced phylogenetic signal in these analyses. Moderate bootstrap support was also obtained for the grouping of dunnart and tammar β-globin and dunnart and tammar ε-globin, respectively, for MP (including third codon positions) and LogDet analyses (Table 1). Similarly, the grouping of dunnart and tammar ε-globin occurred at a high posterior probability in the Bayesian analyses (≥97%), but the grouping of dunnart and tammar β-globin gave low posterior probabilities: 55% for the linked analysis and just 36% for the unlinked partition analysis. In the latter analysis, tammar β grouped with opossum β to the exclusion of dunnart β with a posterior probability of 61%, an arrangement that is highly unlikely, given the known phylogenetic relationships of American and Australian marsupials. A similar arrangement was found with parsimony analyses with third codon positions excluded, supported by a low bootstrap value (64%).
Table 1

Bootstrap support values (%) and posterior probabilities (%) of nodes in the β-globin gene phylogeny shown in Fig. 3A

Node                                                     MP+3   MP-3   LD     B-L    B-UL
1. Monophyly of marsupial ω-globin and avian β-globin    62     70     78     51     56
2. Monophyly of avian β-globin                           <50    100    100    100    100
3. Monophyly of marsupial ω-globin                       100    100    100    100    100
4. Monophyly of marsupial β-globin                       100    93     100    100    100
5. Monophyly of dunnart and tammar β-globin              75     <50    72     55     36
6. Monophyly of marsupial ε-globin                       100    <50    100    100    100
7. Monophyly of dunnart and tammar ε-globin              77     68     91     99     97
8. Monophyly of mammalian ε-globin                       <50    <50    83     65     45
9. Monophyly of mammalian ε-globin and β-globin          68     54     61     95     91

Note. Bootstrap analyses were based on maximum parsimony (MP) with third codon positions included (MP+3), MP excluding third codon positions (MP-3), and LogDet analyses using neighbor joining (LD). Bayesian posterior probabilities were determined from analyses with linked models for each codon position (B-L) or unlinked models (B-UL).

Tammar ω-globin grouped strongly with dunnart ω-globin, supported by high bootstrap values (100%) and high posterior probabilities (100%) in all analyses. MP trees and Bayesian consensus trees supported the grouping of the ω-globin clade as the sister lineage to a group containing all avian β-like globin genes, but this arrangement only received moderate bootstrap support (62–78%) and low posterior probabilities (51% linked, 56% unlinked). An alternative arrangement showing the ω-globin clade grouping outside a monophyletic clade containing avian and mammalian β-like globin genes gave posterior probabilities of 46 and 41% for the linked and unlinked analyses, respectively. Together these phylogenies imply that a progenitor ω-globin gene existed in the common ancestor of avians and mammals, supported by an overall posterior probability of 97% for both analyses.
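The overall figure of 97% follows from adding the posterior probabilities of the two alternative placements of the ω-globin clade, both of which put the origin of the ω lineage before the avian–mammalian split. A short check, using only the values reported above:

```python
# Posterior probabilities (%) reported in the text for the two placements
# of the ω-globin clade.
sister_to_avian = {"linked": 51, "unlinked": 56}
outside_avian_and_mammalian = {"linked": 46, "unlinked": 41}

totals = {
    analysis: sister_to_avian[analysis] + outside_avian_and_mammalian[analysis]
    for analysis in ("linked", "unlinked")
}
print(totals)
```

The two placements are mutually exclusive topologies, so their posterior probabilities can be summed within each analysis.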

Tammar ω-globin Is Located 3′ to the α-globin Gene Cluster

Tammar ω-globin was originally isolated using a combination of PCR amplification with degenerate primers, PCR gene walking, and inverse-PCR approaches (full details in Wheeler et al. 2001). This approach allowed only a small region of the 5′ and 3′ flanking region to be sequenced. In order to identify genes that may be syntenic to ω-globin, three overlapping lambda clones, λTG2.11, λTG3.2, and λTG3.4, were isolated from a tammar genomic DNA library and shown by restriction mapping and Southern and sequence analyses of clone overlap regions to incorporate approximately 30 kb of DNA surrounding ω-globin (Fig. 2).
Figure 2

Restriction map of λTG2.11 and λTG3.4 showing the region of overlap between the two lambda clones, verified by sequence analysis, and the linkage of tammar α-globin, θ-globin, and ω-globin. Sites mapped are E = EcoRI, S = SacI, and B = BamHI. Thick lines denote the λGEM-11 arms that are not drawn to scale.

Figure 3

(A) One of six most parsimonious trees of length 1424 steps from a PAUP* analysis of marsupial, eutherian, and avian β-like globin gene coding sequences with all characters equally weighted, using Xenopus laevis β-globin as an outgroup. Numbers on branches refer to nodes where bootstrap values are provided in Table 1. (B) A 50% majority rule consensus phylogram from a Bayesian analysis constructed using a single GTR+I+G model of evolution and base frequencies for all sites (linked analysis). Numbers above branches refer to posterior probabilities. GenBank accession numbers for sequences are as follows: dunnart β (SCHBBHEMO), ε (SCHBBEGN), ω (AY014770); tammar β (AY450928), ε (AY450927), ω (AY014769); opossum β (OPOHBBB), ε (OPOHBBE); chick (Gallus gallus) β (bH), ε, ρ (GGHBBRE); duck (Cairina moschata) β (CMBGA2B2), ε (CMEGA2E2); human β, γ, ε (HSHBB); mouse (Mus musculus) β, γ (βh0), ε (εY) (MMBGCXD); goat (Capra hircus) βA (CHHBBAA), εI (CHEBGLI); rabbit (Oryctolagus cuniculus) β, γ, ε (OCBGLO01); echidna (Tachyglossus aculeatus) β (TGLHBB); platypus (Ornithorhynchus anatinus) β (TGLHBB); Xenopus (Xenopus laevis) adult β-globin (XELHBBC).

Partial sequencing of the genomic DNA within λTG2.11 and λTG3.4 revealed the presence of two genes that have high levels of nucleotide sequence identity to marsupial and eutherian α-like globin genes (Fig. 2; sequences submitted to GenBank; accession numbers AY459589–90). The open reading frame from one of these genes encodes a protein that is identical to the partially sequenced adult α-chain from the tammar (accession number P81043 [Holland and Gooley 1997]; see Fig. 4). This gene is henceforth referred to as α-globin. The second open reading frame (tammar-α2) encoded a protein that appears unrelated to any previously sequenced marsupial adult α-globin protein. In order to identify a mammalian orthologue of this gene we carried out phylogenetic analyses using MP, including sequences from a variety of marsupial, eutherian, and avian α-like globin genes. The phylogenetic tree grouped tammar α-globin with the Native cat adult α-globin sequence with a high bootstrap value (91%) (Fig. 5). A second group was identified containing tammar-α2 and eutherian α-like θ-globin genes, supported by 58% of bootstrap pseudoreplicates. This arrangement provides evidence that tammar-α2 is orthologous to eutherian θ-globins; thus this gene was renamed θ-globin (Fig. 5). Overall, the results indicate that ω-globin is found downstream of a cluster of at least two α-like globin genes, in the order 5′-α–θ–ω-3′ (Fig. 2).
Figure 4

The α-chain of the gray kangaroo (1) aligned with the partial tammar α-chain (Holland and Gooley 1997) (2), and the adult-α conceptual peptide sequence (3) in single-letter code. The boldface residue in the gray kangaroo sequence denotes the single-amino acid difference.

Figure 5

A single MP tree of 1509 steps from a heuristic search using PAUP* (Swofford 2002) showing the evolutionary relationships between the two tammar α-like globin genes and eutherian and avian α-like globin genes. Bootstrap percentages of 1000 pseudoreplicates (>50%) are shown above the branches. Sequences used in the analysis included coding regions only of each gene. GenBank accession numbers or locus ID codes for sequences are as follows: tammar θ (AY459590), α (AY459589); horse θ (Y00284), α1 (M17902), ζ (X07051); goat α (J00043); human α1 (V00491), θ (X06482), ζ (NM005332); mouse αA1 (NM008218), ζ (X62302); rabbit α (X04751), ζ1 (AH001223), θ (X04751); rat θ (X56330); Native cat (Dasyurus viverrinus) α (M14567); chicken αA, π, αD (AF098919); duck (Cairina moschata) αD (X01831); pigeon (Columba livia) αD (AB001981).

Human and Mouse Genome Analyses: A Search for ω-globin Orthologues in Eutherians

On the basis of our phylogenetic analyses, ω-globin orthologues are predicted to occur in eutherian species. BLAST searches of the available genome and EST databases for human and mouse, as well as the complete GenBank database, using tammar ω-globin DNA and protein sequences failed to identify any ω-like sequences. Therefore, it is possible that ω-globin has become a pseudogene in some or all eutherian species, a process that could make orthologues difficult to identify on the basis of sequence homology over the ∼120-million-year timespan of eutherian evolution.

The localization of ω-globin to the 3′ end of the tammar α-globin cluster provides a starting point, on the basis of syntenic conservation, to search for remnant sequences from the eutherian orthologues of ω-globin. Unfortunately, no candidate ω-like sequences could be detected on the basis of nucleotide similarity within the 15 kb of DNA found immediately downstream from the human α-globin gene cluster (accession number NT_037887), even over very short tracts of sequence. Similarly, TBLASTN searches of the six-frame conceptual translation of the region downstream from human θ-globin failed to identify significant levels of peptide sequence identity with the tammar ω-chain, even at low confidence values. These analyses were repeated on 30 kb of sequence downstream from the α-globin cluster (accession number NT_039531) in the mouse, and once again, no ω-like sequences could be detected. These analyses indicate that the putative eutherian ω-globin gene has been deleted or translocated from the position it occupied in the common ancestor of eutherians and marsupials.
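
The six-frame conceptual translation that underlies a TBLASTN search can be illustrated with a short Python sketch. It assumes the standard genetic code and uses a hypothetical 24-bp example sequence; the function names are illustrative, and the sketch covers only the translation step, not the similarity search or scoring carried out by TBLASTN.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]

def translate(dna):
    """Translate complete codons; codons with non-ACGT symbols become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))

def six_frame_translation(dna):
    """Return translations of frames +1..+3 and -1..-3, keyed by frame label."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames

# Hypothetical sequence, for illustration only.
example = "ATGGTGCACCTGACTGATGCTGAG"
frames = six_frame_translation(example)
for label, peptide in frames.items():
    print(label, peptide)
```

In a search of this kind, each of the six conceptual peptides from the genomic region would be compared with the query protein (here, the tammar ω-chain), so that a remnant coding sequence could be detected regardless of strand or reading frame.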

Discussion

Our analyses show that the β-globin gene family of the tammar consists of at least three functional genes. Two of the genes, ε- and β-globin, are orthologous to ε- and β-globin, respectively, of the marsupials S. crassicaudata (Cooper and Hope 1993; Cooper et al. 1996) and D. virginiana (Koop and Goodman 1988). The third gene, ω-globin, is shown to be linked to the adult α-globin and θ-globin genes of the tammar, providing the first reported case in a mammal of a β-like globin gene that is linked to and expressed as a part of the α-globin gene cluster and unlinked to the main β-globin cluster. Features of the α-globin genes (α-globin and θ-globin) isolated in this study will be discussed in a separate paper on the evolution of the α-globin gene family in marsupials (Wheeler et al. in preparation).

Phylogenetic analyses provide strong evidence (combined posterior probability = 97%) that the progenitor of ω-globin existed prior to the divergence of the avian and mammalian lineages. This ancient evolutionary history and the presence of ω-globin to the 3′ side of the α-globin gene cluster fit the criteria originally proposed by Jeffreys et al. (1980) of a “fossil” β-like globin gene, the direct descendant of an ancestral β-globin gene that was linked to α-globin at the base of the avian and mammalian lineages. By these criteria the results support the chromosome duplication hypothesis that an ancient chromosome or in trans duplication event ultimately led to the evolution of unlinked clusters of α-globin and β-globin genes in avians and mammals (Jeffreys et al. 1980; Hardison 2001a). However, definitive acceptance of this chromosome duplication hypothesis requires the reciprocal observation in mammals of α-globin genes in synteny with a β-globin gene cluster that is unlinked to the main α-globin gene cluster. To date, such “fossil” α-globin genes have not been reported in mammals, but a recent report on the hemoglobin genes of the pufferfish Fugu rubripes (Gillemans et al. 2003) has shown the existence in this species of two distinct loci, one containing two α-globin genes linked to β-globin and a second containing only α-globin genes. It is possible that the Fugu arrangement of hemoglobin genes resulted from the same duplication event as that giving unlinked clusters in avians and mammals, but in the lineage leading to Fugu the β-globin genes were independently silenced in one of the duplicate αβ clusters (but see further discussion below).

Phylogenetic analyses indicate that ω-globin existed and was most likely linked to the α-globin cluster in the common ancestor of all eutherian mammals and marsupials. In humans and mice, where the α-globin clusters have been extensively mapped and sequenced, there appears to be an absence of any additional globin-like genes to the 3′ side of these clusters, providing evidence that ω-globin has been silenced or translocated away from this cluster. Similar analyses by Flint et al. (2001) also failed to show any additional globin-like genes. Their comparisons of the mouse and human α-globin gene clusters show that, despite extensive gene synteny at the 5′ ends of these clusters, the genes found in the region downstream from θ-globin are not homologous, suggesting that this latter region contains a break point for one or more translocation events. For example, the mouse orthologue of Luc7L, a gene that is located ∼7.8 kb 3′ to θ-globin in humans, is found on a different chromosome than the mouse α-globin gene cluster (Flint et al. 2001; Tufarelli et al. 2001). Translocation of ω-globin from the HS40 enhancer (an enhancer required for erythroid-specific expression of α-like globin genes [Higgs et al. 1990]) would probably lead to a reduced level of expression of ω-globin, which would, in turn, allow the accumulation of mutations that may lead to rapid evolutionary change in sequence. The timing of this translocation within the eutherian lineage will ultimately determine the possibility that ω-globin is a pseudogene in all extant eutherian species.

The chromosome duplication hypothesis helps to explain how α-globin and β-globin genes may have become unlinked while still retaining enhancer signals necessary for erythroid-specific expression. These enhancer signals have been identified in the 5′ region of the human α-globin (HS40 [Higgs et al. 1990]) and β-globin (LCR [Grosveld et al. 1987]) gene clusters and are likely to have evolved from a common ancestral erythroid-specific enhancer located 5′ to a linked α–β cluster (Hardison 1998). A process of translocation between the ancestral α-globin and β-globin genes would have severed the link between the β-globin gene and the enhancer, resulting in reduced and/or nonspecific expression of this gene. Under the chromosome duplication hypothesis the enhancer would be retained allowing continued erythroid-specific expression of the α-globin and β-globin genes in both duplicate clusters.

Previous reports of ω-globin in marsupials and the finding that this gene appeared most closely related to avian β-globin genes and is unlinked to the β-globin gene cluster in two marsupials led to the proposal that avian and mammalian β-globin gene clusters were not orthologous (Hardison 2001b; Wheeler et al. 2001). Phylogenetic analyses using new Bayesian methods (MRBAYES version 3.0b [Huelsenbeck and Ronquist 2001]), allowing different models of evolution to be applied to different codon partitions of the data set, and parsimony analyses also support the monophyly of avian β-like globin genes and marsupial ω-globin, although this arrangement received relatively low posterior probabilities and bootstrap support (56 and 62%, respectively). Under the chromosome duplication hypothesis, monophyly of ω-globin and avian β-like globin genes could result if a different combination of α-globin or β-globin genes were silenced in the duplicate α–β clusters of avians and mammals (Fig. 6). This model requires the duplication of an ancestral α–β cluster to have occurred some time prior to the divergence of the avian and mammalian lineages. Following the divergence of these lineages from a common ancestor, α1 and β2 were silenced in the mammalian lineage (with β2 [the progenitor of ω] retained in marsupials), while α2 and β1 were silenced in the avian lineage (Fig. 6). This model of evolution could be tested by comparative analyses of the genes that are syntenic to α-globin and β-globin gene families from avians and mammals. To date, it has been shown that a region 5′ to the Fugu α-globin cluster contains a number of genes in common with the 5′ region of the human α-globin cluster, but the extent of synteny in the chicken α-globin cluster has not yet been determined (Flint et al. 2001). 
Similar comparisons of the human β-globin and α-globin clusters with the Fugu αβ locus failed to show any conserved synteny of genes for the human β-globin cluster, but the human α-globin cluster shares an open reading frame (C16orf8 gene) in common with the Fugu αβ locus. These results suggest that the locus duplication events leading to unlinked α-globin and β-globin genes in these lineages may have occurred independently. However, it cannot be ruled out that additional chromosomal rearrangements of the β-globin locus in mammals have separated them from their original flanking genes after a single duplication event.
Figure 6

A model for the evolution of unlinked clusters of α-globin and β-globin genes by in trans duplication and gene silencing (X) drawn within the confines of a species tree for the avian and mammalian lineages. The differential silencing of the α-globin and β-globin genes in avians and mammals leads to the orthology of avian β-like globin genes and ω-globin in marsupials that is supported by current phylogenetic analyses (see Fig. 3).

There are limitations to the use of β-globin sequences for reconstruction of a robust gene tree spanning the entire evolutionary history of the gene family. The genes comprise only 444 coding sequence characters, and for the resolution of ancient divergences it is likely that third codon positions may be saturated. In addition, we have shown that base frequencies lack stationarity at third codon positions among the broad range of taxa we used. For our analyses it was not feasible to increase the sequence length of the data set by using noncoding regions because unambiguous alignments of intronic and flanking sequences were not possible. The differences obtained in parsimony analyses where third codon positions were either included or excluded highlight these problems; phylogenetic signal was greatly reduced when they were excluded with low bootstrap support on almost all internal branches, but inclusion of third codon positions also led to noisy signal and lower bootstrap values in a number of areas of the phylogeny, such as the monophyly of avian β-like globin genes (Fig. 3, Table 1). The Bayesian analyses generally gave high posterior probability values for many internal nodes in the phylogeny, but the more parameter-rich model, allowing different Q-matrices, base frequencies, and site rate variation parameters at first, second, and third codon positions (unlinked analysis), led to reduced overall resolution, most probably because of an increase in the variance of parameter estimates with such a small data set. One possibility for improving overall resolution in the β-globin phylogeny is to include sequence data from additional taxonomic groups that help to break up some of the long branches (Rannala et al. 1998). In particular, further assessment of the monophyly of avian β-like globin genes and ω-globin would benefit from the inclusion of reptilian β-globin sequences to help break up the long branch leading to the avian β-globin genes.
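
The codon-position issues described above can be made concrete with a short Python sketch. A 444-character coding alignment provides only 148 sites at each codon position, and one common way to examine base-frequency stationarity is a chi-square test of homogeneity on base counts across taxa. Whether this matches the test applied in the present study is an assumption; the toy sequences and function names are hypothetical.

```python
from collections import Counter

BASES = "ACGT"

def codon_position(seq, pos):
    """Return the bases at codon position pos (1, 2, or 3) of an in-frame sequence."""
    return seq[pos - 1::3]

def base_counts(seq):
    """Return counts of A, C, G, and T (in that order)."""
    counts = Counter(seq.upper())
    return [counts[b] for b in BASES]

def chi_square_homogeneity(table):
    """Pearson chi-square statistic and degrees of freedom for a taxa x bases table.

    The degrees of freedom assume that every base occurs in at least one taxon.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df

# Toy in-frame coding sequences (hypothetical, for illustration only).
toy = {
    "taxonA": "ATGGTGCTGTCTGCC",
    "taxonB": "ATGGTCCTCTCCGCC",
    "taxonC": "ATGGTACTATCAGCA",
}

third = {taxon: codon_position(seq, 3) for taxon, seq in toy.items()}
table = [base_counts(seq) for seq in third.values()]
stat, df = chi_square_homogeneity(table)
print(third, stat, df)
```

A large statistic relative to the chi-square distribution with the stated degrees of freedom would indicate that base composition at the chosen codon position differs among taxa. The test treats sites as independent and ignores phylogenetic structure, so it is a rough diagnostic only.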

The linkage of ω-globin to the α-globin gene cluster allows for the intriguing possibility of coregulation in cis, through the HS40 enhancer, of the ω-/α-globin gene partners that form ω-containing hemoglobin. Analyses of tammar neonatal globin protein types by Holland and Gooley (1997) detected four embryonic-type hemoglobins, with ω-globin only found in combination with adult α-globin. The other hemoglobins comprised one of two distinct σ-globin chains and α-globin in combination with ε-globin (Holland and Gooley 1997). The absence of σ–ω hemoglobins suggests that the two σ-globin genes are switched off in cells coexpressing α-globin and ω-globin, and ω-globin is switched off in cells expressing the σ-globin genes. However, at present, we cannot exclude the possibility that there is a steric incompatibility between ω- and σ-globins.

This study extends to three the number of marsupial species (opossum, dunnart, and tammar) now known to contain early expressed ε-globin and adult β-globin genes. It can now be claimed with more confidence that (i) the identification of ε-globin and β-globin genes in a number of divergent marsupial species suggests that these genes are likely to be present in all marsupials (Cooper et al. 1996) and (ii) an embryonic β-like globin existed before the divergence of marsupial and eutherian lineages and that this gene had already acquired changes that would restrict its expression to the neonatal developmental environment (Koop and Goodman 1988; Cooper and Hope 1993). The identification in the tammar of β- and ε-globin genes provides further support for a model predicting that a two-gene β-globin cluster existed in the common ancestor of eutherians and marsupials (Koop and Goodman 1988; Cooper et al. 1996). No orthologues of the eutherian γ- or η-like or δ-globins were identified in this study, despite the use of PCR primers designed to bind highly conserved regions of β-like globin genes. Although the linkage order of the tammar β- and ε-globin genes was not determined, the previous determination of the linkage arrangement in the dunnart (Cooper et al. 1996) would suggest that the tammar β-like globin cluster is organized 5′-ε–β-3′.

Acknowledgments

We thank Morris Goodman and Ross Hardison for their continuing interest and advice in relation to globin gene evolution. We also thank two anonymous reviewers for their helpful comments on the manuscript. A grant from the Australian Research Council to R.A.B. Holland, A.A. Gooley, and R.M. Hope supported this research.

Copyright information

© Springer-Verlag 2004