Abstract
Glycosylphosphatidylinositol-anchored high-density lipoprotein-binding protein 1 (GPIHBP1) functions as a platform and transport agent for lipoprotein lipase (LPL) which functions in the hydrolysis of chylomicrons, principally in heart, skeletal muscle and adipose tissue capillary endothelial cells. Previous reports of genetic deficiency for this protein have described severe chylomicronemia. Comparative GPIHBP1 amino acid sequences and structures and GPIHBP1 gene locations were examined using data from several mammalian genome projects. Mammalian GPIHBP1 genes usually contain four coding exons on the positive strand. Mammalian GPIHBP1 sequences shared 41–96% identities as compared with 9–32% sequence identities with other LY6-domain-containing human proteins (LY6-like). The human N-glycosylation site was predominantly conserved among other mammalian GPIHBP1 proteins except cow, dog and pig. Sequence alignments, key amino acid residues and conserved predicted secondary structures were also examined, including the N-terminal signal peptide, the acidic amino acid sequence region which binds LPL, the glycosylphosphatidylinositol linkage group, the Ly6 domain and the C-terminal α-helix. Comparative and phylogenetic studies of mammalian GPIHBP1 suggested that it originated in eutherian mammals from a gene duplication event of an ancestral LY6-like gene and subsequent integration of exon 2, which may have been derived from BCL11A (B-cell CLL/lymphoma 11A gene) encoding an extended acidic amino acid sequence.
Introduction
Recent studies (Ioka et al. 2003; Beigneux et al. 2007) have shown that a glycosylphosphatidylinositol-anchored high-density lipoprotein-binding protein 1 (GPIHBP1) of capillary endothelial cells is required for the metabolism of triglyceride-rich lipoproteins in mammalian plasma. This glycoprotein binds lipoprotein lipase (LPL) and apolipoprotein A-V (apoA-V) strongly (Gin et al. 2007, 2011) and may serve as a platform for lipolysis within capillaries, particularly in tissues which show high expression levels for both GPIHBP1 and LPL genes, such as heart, skeletal muscle and adipose tissue (Beigneux et al. 2007; Wion et al. 1987; Havel and Kane 2001; Young et al. 2007). Studies of Gpihbp1−/Gpihbp1− knockout mice have shown that GPIHBP1 deficiency causes severe hypertriglyceridemia with very high plasma triglyceride levels of 2,000–5,000 mg/dl (Beigneux et al. 2007; Young et al. 2007).
Human clinical studies have also examined loss-of-function GPIHBP1 mutations leading to familial chylomicronemia. Wang and Hegele (2007) examined 160 patients with chylomicronemia and reported two siblings with severe chylomicronemia who were homozygous for a GPIHBP1 missense mutation (G56R). Franssen et al. (2010) and Olivecrona et al. (2010) have recently identified mutations of conserved cysteines (C65S, C65Y and C68G) in the Ly6 domain of GPIHBP1 in familial chylomicronemia, while Beigneux et al. (2009a) have reported a mutant GPIHBP1 (Q115P) which lacked the ability to bind LPL and chylomicrons in a patient with chylomicronemia.
Biochemical studies (Beigneux et al. 2007; Gin et al. 2007, 2011) have suggested that GPIHBP1 is localized on the luminal and abluminal capillary endothelial cell surfaces where it is bound by a glycosylphosphatidylinositol anchor and binds strongly to LPL. GPIHBP1 serves as an LPL transporter from the sub-endothelial spaces to the luminal face of capillaries, enabling lipolysis of circulating triglycerides localized within plasma chylomicrons (Davies et al. 2010; Fisher 2010). Molecular modeling of human GPIHBP1 (Beigneux et al. 2007) and biochemical analyses (Gin et al. 2007) have shown that this protein contains at least four major domains with distinct roles: an N-terminal signal peptide which targets the intracellular trafficking of GPIHBP1 to the cell surface via the endoplasmic reticulum; a very acidic amino acid domain within the GPIHBP1 amino-terminal region which may play a role in binding to the positively charged residues of the heparin-binding domain for LPL and apolipoproteins; a cysteine-rich LY6 domain which also contributes to LPL binding, as shown by site-directed mutagenesis and human clinical mutation studies (Franssen et al. 2010; Olivecrona et al. 2010); and a C-terminal region containing a hydrophobic domain which is replaced by a glycosylphosphatidylinositol anchor within the endoplasmic reticulum and which binds GPIHBP1 to the endothelial cell surface (Nosjean et al. 1997; Fisher 2010; Ory 2007). Recently, Gin et al. (2011) have reported several important GPIHBP1-binding properties and have shown specific binding for LPL, whereas other related neutral lipases, hepatic lipase (HL) and endothelial lipase (EL), do not bind. In addition, GPIHBP1 binds APO-A5 strongly whereas another lipid transport protein (APO-A1) does not bind.
Structures of mammalian GPIHBP1 genes have been reported in association with a number of mammalian genome sequencing projects, including human, mouse and rat (Mammalian Genome Project Team 2004; Rat Genome Sequencing Project Consortium 2004), and some mammalian GPIHBP1 cDNA and protein sequences have been described (Ioka et al. 2003; Beigneux et al. 2007; Beigneux et al. 2009a, b). Human, mouse and rat GPIHBP1 genes contain four exons of DNA encoding GPIHBP1 sequences (Thierry-Mieg and Thierry-Mieg 2006).
This paper describes predicted gene structures and amino acid sequences for several mammalian GPIHBP1 genes and proteins, and predicted secondary structures for mammalian GPIHBP1 proteins. In addition, we examine the relatedness of mammalian GPIHBP1 to other lymphocyte antigen-6 (Ly6-like) genes and proteins, and describe a hypothesis for the origin of the GPIHBP1 gene within eutherian mammals from an ancestral mammalian LY6-like gene and subsequent integration of an exon within the mammalian GPIHBP1 gene encoding the acidic amino acid LPL-binding platform previously described for human and mouse GPIHBP1 (Beigneux et al. 2007; Gin et al. 2007, 2011).
Methods
Mammalian GPIHBP1 gene and protein identification
Basic Local Alignment Search Tool (BLAST) studies were undertaken using web tools from the National Center for Biotechnology Information (NCBI) (http://blast.ncbi.nlm.nih.gov/Blast.cgi) (Altschul et al. 1997). Protein BLAST analyses used mammalian GPIHBP1 amino acid sequences previously described (Table 1). Non-redundant protein sequence databases for several mammalian genomes were examined using the blastp algorithm, including human (Homo sapiens) (International Human Genome Consortium 2001); chimpanzee (Pan troglodytes) (Chimpanzee Sequencing and Analysis Consortium 2005); orangutan (Pongo abelii) (http://genome.wustl.edu); rhesus monkey (Macaca mulatta) (Rhesus Macaque Genome Sequencing and Analysis Consortium 2007); cow (Bos taurus) (Bovine Genome Project 2008); horse (Equus caballus) (Horse Genome Project 2008); mouse (Mus musculus) (Mouse Genome Sequencing Consortium 2002); rat (Rattus norvegicus) (Rat Genome Sequencing Project Consortium 2004); opossum (Monodelphis domestica) (Mikkelsen et al. 2007); and platypus (Ornithorhynchus anatinus) (Warren et al. 2008). This procedure produced multiple BLAST ‘hits’ for each of the protein databases, which were individually examined and retained in FASTA format, and a record was kept of the sequences for predicted mRNAs and encoded GPIHBP1-like proteins. These records were derived from annotated genomic sequences using the gene prediction method GNOMON, and predicted sequences with high similarity scores for human GPIHBP1 were retained. Predicted GPIHBP1-like protein sequences were obtained in each case and subjected to analyses of predicted protein and gene structures.
Blast-Like Alignment Tool (BLAT) analyses were subsequently undertaken for each of the predicted GPIHBP1 amino acid sequences using the University of California Santa Cruz (UCSC) Genome Browser [http://genome.ucsc.edu/cgi-bin/hgBlat] (Kent et al. 2003) with the default settings to obtain the predicted locations for each of the mammalian GPIHBP1 genes, including predicted exon boundary locations and gene sizes. BLAT analyses were similarly undertaken for other mammalian LY6-like and vertebrate BCL11A-like (encoding B-cell CLL/lymphoma 11A) genes and proteins using previously reported sequences for LY6D, LY6E, LY6H, LY6K, LYNX1, PSCA, SLURP1, GML, LYPD2 and BCL11A in each case (Tables 1, 2, 3). Structures for human, mouse and rat GPIHBP1 genes and encoded proteins were obtained using the AceView website (Thierry-Mieg and Thierry-Mieg 2006) (http://www.ncbi.nlm.nih.gov/IEB/Research/Acembly/index.html?human).
Predicted structures, properties and alignments of mammalian GPIHBP1 and human LY6-like sequences
Predicted secondary structures for human and other mammalian GPIHBP1 proteins were obtained using the PSIPRED v2.5 website tools [http://bioinf.cs.ucl.ac.uk/psipred/psiform.html] (McGuffin et al. 2000). Other web tools were used to predict the presence and locations of the following for each of the mammalian GPIHBP1 sequences: SignalP 3.0 for signal peptide cleavage sites (http://www.cbs.dtu.dk/services/SignalP/) (Emmanuelsson et al. 2007); NetNGlyc 1.0 for potential N-glycosylation sites (http://www.cbs.dtu.dk/services/NetNGlyc/); and big-PI Predictor for the glycosylphosphatidylinositol linkage group-anchored sites (http://mendel.imp.ac.at/sat/gpi/gpi_server.html) (Eisenhaber et al. 1998). The reported tertiary structure for human CD59 (membrane-bound glycoprotein) (Leath et al. 2007) served as the reference for the predicted human, rat, pig and guinea pig GPIHBP1 tertiary structures, with modeling ranges of residues 62–138, 69–146, 65–141 and 61–139, respectively. Alignments of mammalian GPIHBP1 sequences with human LY6D, LY6E, LY6H, LY6K, LYNX1 and LYPD2 lymphocyte antigen-6-related proteins or with vertebrate B-cell CLL/lymphoma 11A (BCL11A) sequences were assembled using the ClustalW2 multiple sequence alignment program (Larkin et al. 2007) (http://www.ebi.ac.uk/Tools/clustalw2/index.html).
Comparative bioinformatics of mammalian GPIHBP1, vertebrate LY6-like and vertebrate BCL11A genes and proteins
The UCSC Genome Browser (http://genome.ucsc.edu) (Kent et al. 2003) was used to examine comparative structures for mammalian GPIHBP1 (Table 1), vertebrate LY6-like (lymphocyte antigen-6 complex; Tables 1, 2) and vertebrate BCL11A (B-cell CLL/lymphoma 11A) (Table 3) genes and proteins. We also used the UCSC Genome Browser Comparative Genomics track, which shows alignments of up to 28 vertebrate species and evolutionary conservation of GPIHBP1 gene sequences. Species aligned for this study included 4 primates, 6 non-primate eutherian mammals (e.g., mouse, rat), a marsupial (opossum), a monotreme (platypus) and a bird species (chicken). Conservation measures were based on conserved sequences across all of these species in the alignments, which included the 5′-flanking, 5′-untranslated and coding regions of the GPIHBP1 gene.
BLAT analyses were subsequently undertaken using the nucleotide sequence for exon 2 of human GPIHBP1 using the UCSC Genome Browser [http://genome.ucsc.edu/cgi-bin/hgBlat] (Kent et al. 2003) to identify homologs for this exon in the human genome.
Phylogenetic studies and sequence divergence
Alignments of mammalian GPIHBP1 and vertebrate LY6-like protein sequences were assembled using BioEdit v.5.0.1 and the default settings (Hall 1999). Alignment ambiguous regions, including the acidic amino acid region of GPIHBP1, were excluded prior to phylogenetic analysis yielding alignments of 60 residues for comparisons of sequences with the zebrafish (Danio rerio) LY6-like (LYPD6) sequence (Tables 1, 2). Evolutionary distances were calculated using the Kimura option (Kimura 1983) in TREECON (Van De Peer and de Wachter 1994). Phylogenetic trees were constructed from evolutionary distances using the neighbor-joining method (Saitou and Nei 1987) and rooted with the zebrafish LYPD6 sequence. Tree topology was reexamined by the bootstrap method (100 bootstraps were applied) of resampling and only values that were highly significant (≥90) are shown (Felsenstein 1985).
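The distance step described above can be illustrated with a minimal sketch. It assumes that the Kimura option corresponds to the commonly cited Kimura (1983) approximation for protein sequences, d = −ln(1 − p − 0.2p²), where p is the proportion of differing aligned sites. The function name, the gap-handling rule (gapped columns are skipped) and the toy sequences are assumptions of this sketch and are not taken from TREECON.

```python
import math


def kimura_protein_distance(seq_a: str, seq_b: str) -> float:
    """Kimura (1983) corrected distance between two aligned protein sequences.

    Assumption of this sketch: columns with a gap ('-') in either sequence
    are ignored when estimating the proportion of differing sites p.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    p = differences / compared
    inner = 1.0 - p - 0.2 * p * p
    if inner <= 0.0:
        # The approximation breaks down for very divergent sequences.
        raise ValueError("sequences too divergent for the Kimura correction")
    return -math.log(inner)


# Toy aligned fragments (hypothetical, not real GPIHBP1 or LY6 sequences).
print(round(kimura_protein_distance("AAAA", "AACC"), 4))
```

A matrix of such pairwise distances is the kind of input a neighbor-joining procedure would take to build the tree.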
Results and discussion
Alignments of mammalian GPIHBP1 amino acid sequences with human LY6-related antigen sequences
The deduced amino acid sequences for orangutan (Pongo abelii), rhesus monkey (Macaca mulatta), marmoset (Callithrix jacchus), horse (Equus caballus), cow (Bos taurus) and rat (Rattus norvegicus) GPIHBP1 are shown in Fig. 1 together with previously reported sequences for human and mouse GPIHBP1 (Beigneux et al. 2007; Gin et al. 2007). In addition, amino acid sequences for several LY6-related lymphocyte antigen sequences are also aligned with the mammalian GPIHBP1 sequences, including human LY6D (Brakenoff et al. 1995), LY6E (Capone et al. 1996), LYPD2 (Clark et al. 2003), LY6H (Horie et al. 1998), LY6K (Ishikawa et al. 2007) and LYNX1 (Mammalian Genome Project Team 2004) (Table 1). Alignments of human and other mammalian GPIHBP1 sequences examined showed identities between 46 and 96%, suggesting that these are the products of the same gene family, whereas comparisons of sequence identities of mammalian GPIHBP1 proteins with human LY6-like lymphocyte antigen sequences exhibited low levels of sequence identities (9–32%), indicating that these are the members of distinct protein families (Table 4).
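As an illustration of how pairwise identities such as those in Table 4 can be derived from an alignment, the following minimal sketch computes percent identity for two aligned sequences. The function name, the convention of skipping gapped columns and the toy fragments are assumptions of this sketch; the values reported here were obtained from ClustalW2 alignments, which may use a different convention.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity between two aligned sequences of equal length.

    Assumption of this sketch: columns where either sequence has a gap
    ('-') are excluded from both the numerator and the denominator.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not real GPIHBP1 sequences).
toy_a = "MKAL-CYSCQ"
toy_b = "MKVLGCYTCQ"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```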
The amino acid sequences for most of the mammalian GPIHBP1 proteins contained 167–184 residues whereas mouse and rat GPIHBP1 contained 225 and 236 amino acids, respectively, with the latter having extended C-terminal sequences (Fig. 1). Previous biochemical and genetic analyses of human and mouse GPIHBP1 (Beigneux et al. 2007; Gin et al. 2007, 2011) have enabled predictions of key residues for these mammalian GPIHBP1 proteins (sequence numbers refer to human GPIHBP1). These included the N-terminus signal peptide (residues 1–20) which participates in the trafficking of GPIHBP1 via the endoplasmic reticulum; two acidic amino acid clusters (residues 25–32 and 41–50) which may contribute to LPL binding within a basic amino acid LPL heparin-binding site region (Sendak and Bensadoun 1998); a conserved Gly56 with an unknown function (Gin et al. 2007); a predominantly conserved N-glycosylation site (Asn78-Leu79-Thr80) which is critical for the movement of GPIHBP1 onto the cell surface (Beigneux et al. 2008); a urokinase plasminogen activator receptor (UPAR)-lymphocyte antigen-6 (LY6) domain which contains 10 conserved cysteine residues (Cys65, Cys68, Cys77, Cys83, Cys89, Cys110, Cys114, Cys130, Cys131 and Cys136) and forms five disulfide bridges within this domain; Gln115 which plays a role in LPL binding to GPIHBP1 (Franssen et al. 2010); and a hydrophobic C-terminal helix domain (residues 160–178) which is replaced by a glycosylphosphatidylinositol anchor (to Gly159) and is responsible for linking GPIHBP1 to the endothelial cell surface (Nosjean et al. 1997; Davies et al. 2010; Fisher 2010). These residues and predicted properties were conserved for all of the mammalian GPIHBP1 sequences examined (Fig. 1) with the exception of the cow GPIHBP1 sequence, which lacked a predicted N-glycosylation site (Beigneux et al. 2008). 
Predicted N-glycosylation site(s) were also absent in guinea pig, dog and pig GPIHBP1 sequences; whereas human and orangutan GPIHBP1 sequences exhibited two predicted N-glycosylation sites (Asn78-Leu79-Thr80 and Asn82-Cys83-Ser84) (Table 5) although experimental evidence for in vivo N-glycosylation is only available for the first site (Beigneux et al. 2008).
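Candidate sites such as Asn78-Leu79-Thr80 and Asn82-Cys83-Ser84 follow the canonical N-glycosylation sequon Asn-X-Ser/Thr, where X is any residue except proline. The sketch below scans a sequence for this pattern only. It is not a reimplementation of NetNGlyc, which scores sequence context and does not rely on the sequon alone; the function name and the toy fragment are assumptions of this sketch.

```python
import re

# Lookahead lets overlapping sequons be reported independently.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein: str) -> list[tuple[int, str]]:
    """Return (1-based position of Asn, tripeptide) for each N-X-S/T sequon."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(protein.upper())]


# Toy fragment (hypothetical, not the real GPIHBP1 sequence) containing
# two sequons of the same type as those predicted for human GPIHBP1.
for position, tripeptide in find_sequons("AANLTANCSA"):
    print(position, tripeptide)
```

A sequence with no match, as predicted here for cow, guinea pig, dog and pig GPIHBP1, would simply return an empty list.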
The human LY6-like sequences examined shared several of the mammalian GPIHBP1 domain regions, including the N-terminal signal peptide region (sequence numbers refer to human LY6D) (residues 1–20); the UPAR-LY6 domain with 10 conserved cysteine residues (Cys23, Cys26, Cys32, Cys38, Cys45, Cys63, Cys67, Cys86, Cys87 and Cys92) forming five disulfide bonds previously reported for LY6-like proteins (Fry et al. 2003; Leath et al. 2007); and the hydrophobic C-terminal helix domain (residues 104–125) which is replaced by a glycosylphosphatidylinositol anchor (predicted to be bound to Asn98). These LY6-like sequences, however, lacked the N-terminal acidic amino acid domain and contained fewer amino acids in the protein region surrounding the UPAR-Ly6 domain (residues 21–96). These sequences also lacked the predominantly conserved N-glycosylation site observed for mammalian GPIHBP1 proteins but contained amidation sites for attaching the glycosylphosphatidylinositol anchor in each case.
Predicted structures for mammalian GPIHBP1 proteins
Predicted secondary structures for mammalian GPIHBP1 sequences were compared with those predicted for human lymphocyte antigen-6-like proteins (Fig. 1). α-Helix and β-sheet structures for these sequences were similar for several regions with the human LY6-like secondary structures, including the N-terminal signal peptide which contained an extended helical structure; the UPAR-LY6 domain which contained four or five β-sheet structures (designated as β1–β5) within the region for five disulfide bonds; and the C-terminal hydrophobic region, which is removed following GPI-attachment within the endoplasmic reticulum. The distinctive secondary structures observed for mammalian GPIHBP1 sequences were two acidic amino acid α-helical regions which were notably absent in the LY6-like predicted secondary structures.
Tertiary structures for members of the LY6 protein family have been reported previously and are characterized by an amino acid motif containing eight or ten cysteine residues arranged in consistent spacing patterns, forming four or five disulfide bonds, and a three-finger motif which comprises predominantly β-pleated sheets. The predicted secondary structures observed for the human LY6-like proteins (LY6D, LY6E, LYPD2, LY6H, LY6K and LYNX1) and the mammalian GPIHBP1 protein sequences examined are consistent with the presence of this LY6 protein family motif within these proteins (Fig. 1). Figure 2 describes predicted tertiary structures for human, rat, pig (Sus scrofa) and guinea pig (Cavia porcellus) GPIHBP1 protein sequences and shows significant similarities to the UPAR-LY6 domain reported for the human CD59 antigen (membrane-bound glycoprotein) (Leath et al. 2007). Five anti-parallel β-sheets are readily apparent in each case, which is consistent with the predictions observed for the human and rat GPIHBP1 proteins shown in the amino acid sequence alignments in Fig. 1. This suggests that the UPAR-LY6 domain secondary and tertiary structures are shared among all GPIHBP1 proteins examined as well as the human LY6-like proteins examined.
The overall structure for mammalian GPIHBP1 may then comprise the two α-helices of acidic amino acids (which bind LPL to GPIHBP1) and the three-fingered β-sheet motif which is covalently linked to the plasma membrane by a glycosylphosphatidylinositol anchor. Recent studies have shown that both motifs are essential for LPL binding and transport and for GPIHBP1 function (Beigneux et al. 2009a, b; Gin et al. 2011).
Comparative human GPIHBP1 tissue expression
Beigneux et al. (2009b) have previously examined Gpihbp1 tissue expression in mouse tissues and reported high levels of expression in heart and adipose tissue, which corresponds with the major distribution for LPL in the body and supports the key role played by this enzyme in lipid metabolism, especially in heart and adipose tissue (Wion et al. 1987; Havel and Kane 2001). Overall, the human GPIHBP1 gene and the mouse and rat Gpihbp1 genes were moderately expressed in comparison with the other lymphocyte antigen-like genes, at 0.1–0.7 times the average level of gene expression, whereas the human LY6E and LYNX1 genes showed expression levels of 4.3 and 1.8 times the average gene, respectively (Table 1). This may reflect a more restricted GPIHBP1 cellular expression as compared with LY6-like genes and/or a more specialized role for GPIHBP1 in LPL binding in heart and adipose tissue, as compared with the broader and more widely distributed functions of LY6-like proteins as lymphocyte antigens throughout the body.
Gene locations and exonic structures for mammalian GPIHBP1 genes and human LY6-like genes
Table 1 summarizes the predicted locations for mammalian GPIHBP1 genes and human LY6-like genes based on BLAT interrogations of several mammalian genomes using the reported sequences for human and mouse (Beigneux et al. 2007; Gin et al. 2007, 2011) and the predicted sequences for the other mammalian GPIHBP1 proteins and the UCSC Genome Browser (Kent et al. 2003). Table 2 also presents the predicted locations and other features for mouse, cow and opossum LY6-like genes and proteins. The mammalian GPIHBP1 genes were predominantly transcribed on the positive strand, with the exception of the marmoset and pig genes which were transcribed on the negative strand. Figure 1 summarizes the predicted exonic start sites for mammalian GPIHBP1 genes with most having 4 coding exons in identical or similar positions to those predicted for the human GPIHBP1 gene, with the exception of the orangutan GPIHBP1 gene, which contained an additional exon within the encoding region for the C-terminal sequence. In contrast, the human, mouse, cow and opossum LY6-like genes examined contained only 3 coding exons encoded on either the positive or negative strands. These results are indicative of structural similarities between the mammalian GPIHBP1 and LY6-like genes but with the GPIHBP1 genes possessing an additional exon (exon 2) in each case.
Figure 3 summarizes the comparative locations of human, rhesus monkey, mouse, cow and opossum LY6-like genes within respective gene clusters. Nine human and rhesus LY6-like genes and the related GPIHBP1 gene, for example, were localized within 535 or 618 kb gene clusters, respectively, on human and rhesus chromosome 8, whereas 15 mouse Ly6-like genes and the Gpihbp1 gene were co-localized within an 883-kb gene cluster on mouse chromosome 15. Cow and opossum (Monodelphis domestica, a marsupial mammal) LY6-like genes were similarly located within respective gene clusters on chromosomes 14 and 3, respectively, although in each case there were fewer LY6-like genes identified in comparison with the human and rhesus genomes, and particularly the mouse genome. Of special interest to the current study, however, is the absence of an identified opossum GPIHBP1-like gene and the presence of two predicted opossum LY6H-like genes on chromosome 3 of the opossum genome. For each of the mammalian genomes examined (human, rhesus monkey, mouse, cow and opossum), there were similarities in LY6-like gene order: LYPD2-LYNX1-LY6D-LY6E-LY6H-GPIHBP1, but with GPIHBP1 being undetected in the case of the opossum genome.
Figure 4 shows the predicted structures of mRNAs for human, mouse and rat GPIHBP1 transcripts (Thierry-Mieg and Thierry-Mieg 2006), which were 2.3–3.1 kb in length with three introns and four exons present; in each case, an extended 3′-untranslated region (UTR) was observed.
Evolutionary appearance of the GPIHBP1 gene in mammalian genomes
Figure 5 shows a UCSC Genome Browser Comparative Genomics track that shows evolutionary conservation and alignments of the nucleotide sequences for the human GPIHBP1 gene, including the 5′-flanking, 5′-untranslated, intronic, exonic and 3′-untranslated regions of this gene, with the corresponding sequences for 12 mammalian and bird genomes, including 4 primates (e.g., rhesus), 6 non-primate eutherian mammals (e.g., mouse, rat), a marsupial (opossum), a monotreme (platypus) and a bird species (chicken). Extensive conservation was observed among these GPIHBP1 genomic sequences for the eutherian mammalian genomes, particularly for the primate species but also for the exonic and 5′-flanking regions for all eutherian genomes examined. An examination of non-synonymous (ns) single nucleotide polymorphisms (SNPs) within the human genome supported this conclusion of GPIHBP1 conservation with this gene containing only a single ns-SNP within exon 1. In contrast with the eutherian mammalian genomes examined, the opossum (marsupial mammal) genome lacked conserved sequences within the 5′-flanking and exon 1 and 2 regions, but showed some genomic sequence conservation within the exon 3 and exon 4 regions. The platypus (monotreme mammal) exhibited conserved GPIHBP1 gene sequences within the 5′-flanking and exon 3 and 4 regions but showed no conservation of other sections of this gene, and lacked exon 1 and 2 conserved sequences. In addition, the chicken (bird) genomic sequence showed no significant conservation of any region of the GPIHBP1 gene, which is consistent with BLAT analyses undertaken using mammalian GPIHBP1 protein sequences which failed to identify a GPIHBP1 gene in this bird genome. It would appear that GPIHBP1 has only recently evolved during mammalian evolution and that the functional gene is present only in eutherian mammalian genomes.
Phylogeny and divergence of mammalian GPIHBP1 and LY6-like sequences
A phylogenetic tree (Fig. 6) was calculated by the progressive alignment of 11 mammalian GPIHBP1 amino acid sequences with human, mouse, cow and opossum LY6-like sequences which was ‘rooted’ with the zebrafish (Danio rerio) LYPD6 sequence (Tables 1, 2). The phylogram showed clustering of the sequences into groups which were consistent with their evolutionary relatedness as well as distinct groups for mammalian GPIHBP1 and LY6-like sequences, which were distinct from the zebrafish LYPD6 sequence. In addition, the mammalian LY6-like sequences were further subdivided into groups, including PSCA, LYNX1, LY6D, LY6H, SLURP1, LYPD2, LY6E, LY6K, GML and a group of mouse Ly6-like sequences (designated as Ly6a, Ly6c1, Ly6c2, Ly6f and Ly6i). These groups were significantly different from each other (with bootstrap values >90) and have apparently evolved as distinct genes and proteins during mammalian evolution. Moreover, it is apparent that GPIHBP1 is a distinct but related LY6-like gene which has appeared early in eutherian mammalian evolution.
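The bootstrap values quoted for the tree come from resampling the alignment and rebuilding the tree for each replicate (Felsenstein 1985). A minimal sketch of the resampling step alone is given below: alignment columns are drawn with replacement to give a pseudo-alignment of the same size. The function name, the seeded random generator and the toy alignment are assumptions of this sketch; tree building and the counting of group support across replicates are not shown.

```python
import random


def bootstrap_alignment(alignment: list[str], rng: random.Random) -> list[str]:
    """Return one bootstrap replicate by resampling columns with replacement."""
    if not alignment:
        raise ValueError("alignment is empty")
    n_cols = len(alignment[0])
    if any(len(seq) != n_cols for seq in alignment):
        raise ValueError("all aligned sequences must have equal length")
    # The same sampled column indices are applied to every sequence,
    # so each replicate column is an intact column of the original.
    columns = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(seq[c] for c in columns) for seq in alignment]


# Toy alignment (hypothetical sequences); 100 replicates as in the Methods.
toy_alignment = ["ACDE", "ACDF", "GCDE"]
rng = random.Random(0)
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(100)]
print(len(replicates), replicates[0])
```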
Hypothesis: proposed mechanism for the evolutionary appearance of GPIHBP1 in eutherian mammals
A search was undertaken for a potential gene ‘donor’ for the exon encoding the acidic amino acid motif contained within the mammalian GPIHBP1 gene using BLAT to interrogate the human genome with the known nucleotide sequence for exon 2 of the human GPIHBP1 gene (Kent et al. 2003). A region of the human BCL11A gene (encoding acidic residues 484–504 of human B-cell CLL/lymphoma 11A) was identified which encoded an extended sequence of acidic amino acids comparable to amino acid residues 25–50 (corresponding to residues encoded by exon 2 of human GPIHBP1) in the human GPIHBP1 sequence. Supplementary Fig. 1 shows an alignment of this region for representative vertebrate BCL11A acidic amino acid sequences with several mammalian GPIHBP1 exon 2 sequences. Similarities in acidic amino acid sequences are apparent although each protein exhibited a distinctive conservation pattern. It may be noted that the BCL11A gene and protein can be traced back to reptiles and fish in vertebrates (Table 3) whereas GPIHBP1 has been only reported in eutherian mammals (Table 1). Previous studies have shown that the mouse Bcl11a gene encodes a C2H2-type zinc-finger protein which is a common site of retroviral integration in myeloid leukemia and functions as a myeloid and B-cell proto-oncogene (Nakamura et al. 2000) and may serve as a candidate gene for the transfer and integration of the acidic amino acid encoding ‘motif’ into the mammalian GPIHBP1 gene. A hypothesis concerning the evolutionary appearance of the ‘ancestral’ eutherian mammalian GPIHBP1 gene is presented in Fig. 7.
Step 1
An LY6-like gene within a common ancestor of eutherian mammals underwent a tandem duplication event, generating two closely related LY6-like genes. It may be noted that the opossum genome contains similar LY6H genes (designated as LY6H1 and LY6H2) which are closely localized on opossum chromosome 3 (Fig. 3) and form a distinct opossum LY6-like group following CLUSTAL analysis (Fig. 6); and
Step 2
Retroviral integration of the acidic amino acid encoding ‘motif’ of the ancestral BCL11A gene may have occurred in one of the duplicated LY6-like genes (potentially a LY6H-like gene or another LY6-like gene), resulting in the addition of an exon (exon 2) and, during subsequent evolution, generating an ancestral eutherian mammalian GPIHBP1-like gene and protein which has been retained throughout subsequent eutherian mammalian evolution.
Conclusions
The results of the present study indicate that the recently reported mammalian GPIHBP1 gene and encoded protein represent a distinct family of lymphocyte antigen-6 (LY6)-related genes and proteins which shares key conserved sequences and functions with other LY6-like genes and proteins previously studied (Brakenoff et al. 1995; Capone et al. 1996; Clark et al. 2003; Horie et al. 1998; Ishikawa et al. 2007). GPIHBP1 is encoded by a single gene among the mammalian genomes studied, which is localized within a LY6-like gene cluster (~500 kb) on human chromosome 8 and usually contains 4 coding exons. Predicted secondary structures for mammalian GPIHBP1 proteins showed a strong similarity with other LY6-like proteins in a number of domains, including the N-terminal signal peptide region, the UPAR-LY6 domain and a highly hydrophobic C-terminal helical sequence, which is removed in the endoplasmic reticulum during the formation of the glycosylphosphatidylinositol anchor. In contrast, however, all mammalian GPIHBP1 proteins contained two highly acidic amino acid regions, which have been proposed to play a role in binding LPL (Beigneux et al. 2007; Gin et al. 2007, 2011). Predicted secondary and tertiary structures of the UPAR-LY6 mammalian GPIHBP1 domain showed a strong resemblance to the corresponding region for the human CD59 antigen structure (Leath et al. 2007) with five anti-parallel β-sheets. Comparative studies of 12 mammalian GPIHBP1 genomic sequences indicated that this gene has appeared during eutherian mammalian evolution, with conserved genomic sequences observed for all eutherian mammalian genomes examined. In contrast, GPIHBP1 gene sequences were absent from the chicken genome or were seen only in part for the monotreme and marsupial genomes examined.
It is proposed that the GPIHBP1 gene has appeared early in mammalian evolution following a tandem gene duplication event of one of the LY6 genes and the subsequent retroviral integration of exon 2 encoding the acidic amino acid ‘motif’.
References
Altschul F, Vyas V, Cornfield A, Goodin S, Ravikumar TS, Rubin EH, Gupta E (1997) Basic local alignment search tool. J Mol Biol 215:403–410
Beigneux AP, Davies BSJ, Gin P, Weinstein MM, Farber E, Qiao X, Peale F, Bunting S, Walzem RL, Wong JS, Blaner WS, Ding Z-M, Melford K, Wongsiriroj N, Shu X, de Sauvage F, Ryan RO, Fong LG, Bensadoun A, Young SG (2007) Glycosylphosphatidylinositol-binding protein 1 plays a critical role in the lipolytic processing of chylomicrons. Cell Metab 5:279–291
Beigneux AP, Gin P, Davies BSJ, Weinstein MM, Ryan RO, Fong LG, Young SG (2008) Glycosylation of Asn-76 in mouse GPIHBP1 is critical for its appearance on the cell surface and the binding of chylomicrons and lipoprotein lipase. J Lipid Res 49:1312–1321
Beigneux AP, Franssen R, Bensadoun A, Gin P, Melford K, Walzem RL, Weinstein MM, Kuivenhoven JA, Kastelein JJ, Fong LG, Dallinga-Thie GM (2009a) Chylomicronemia with a mutant GPIHBP1 (Q115P) that cannot bind lipoprotein lipase. Arterioscler Thromb Vasc Biol 29:956–962
Beigneux AP, Gin P, Davies BSJ, Weinstein MM, Bensadoun A, Fong LG, Young SG (2009b) Highly conserved cysteines within the Ly6 domain of GPIHBP1 are crucial for the binding of lipoprotein lipase. J Biol Chem 283:16928–16939
Bovine Genome Project (2008) http://hgsc.bcm.tmc.edu/projects/bovine
Brakenoff RH, Gerretsen M, Knippels EMC, van Dijk M, van Essen H, Weghuis DO, Sinke RJ, Snow GB, van Dongen GAMS (1995) The human E48 antigen, highly homologous to the murine Ly-6 antigen ThB, is a GPI-anchored molecule apparently involved in keratinocyte cell–cell adhesion. J Cell Biol 129:1677–1689
Capone MC, Gorman DM, Ching EP, Zlotnik A (1996) Identification through bioinformatics of cDNAs encoding thymic shared Ag-1/stem cell Ag-2: a new member of the human Ly6 family. J Immunol 157:969–973
Chimpanzee Sequencing and Analysis Consortium (2005) Initial sequence of the chimpanzee genome and comparison with the human genome. Nature 437:69–87
Clark HF, Gurney AL, Abaya E, Baker K, Baldwin D, Brush J, Chen J, Chow B, Chui C, Crowley C, Currell B, Deuel B, Dowd P, Eaton D, Foster J, Grimaldi C, Gu Q, Hass PE, Heldens S, Huang A, Kim HS, Klimowski L, Jin Y, Johnson S, Lee J, Lewis L, Liao D, Mark M, Robbie E, Sanchez C, Schoenfeld J, Seshagiri S, Simmons L, Singh J, Smith V, Stinson J, Vagts A, Vandlen R, Watanabe C, Wieand D, Woods K, Xie MH, Yansura D, Yi S, Yu G, Yuan J, Zhang M, Zhang Z, Goddard A, Wood WI, Godowski P, Gray A (2003) The secreted protein discovery initiative (SPDI), a large-scale effort to identify novel human secreted and transmembrane proteins: a bioinformatics assessment. Genome Res 13:2265–2270
Davies BSJ, Beigneux AP, Barnes RH, Yiping T, Gin P, Weinstein MM, Nobumori C, Nyren R, Goldberg I, Olivecrona G, Bensadoun A, Young SG, Fong LG (2010) GPIHBP1 is responsible for the entry of lipoprotein lipase into capillaries. Cell Metab 12:42–52
Eisenhaber B, Bork P, Eisenhaber F (1998) Sequence properties of GPI-anchored proteins near the omega-site: constraints for the polypeptide binding site of the putative transamidase. Protein Eng 11:1155–1161
Emanuelsson O, Brunak S, von Heijne G, Nielsen H (2007) Locating proteins in the cell using TargetP, SignalP and related tools. Nat Protoc 2:953–971
Felsenstein J (1985) Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39:783–791
Fisher EA (2010) GPIHBP1: lipoprotein lipase’s ticket to ride. Cell Metab 12:1–2
Franssen R, Young SG, Peelman F, Hertecant J, Sierts JA, Schimmel AW, Bensadoun A, Kastelein JJ, Fong LG, Dallinga-Thie GM, Beigneux AP (2010) Chylomicronemia with low postheparin lipoprotein lipase in the setting of GPIHBP1 defects. Circ Cardiovasc Genet 3:169–178
Fry BG, Wüster W, Kini RM, Brusic V, Khan A, Venkataraman D, Rooney AP (2003) Molecular evolution and phylogeny of elapid snake venom three-finger toxins. J Mol Evol 57:110–129
Gin P, Beigneux AP, Davies B, Young MF, Ryan RO, Bensadoun A, Fong LG, Young SG (2007) Normal binding of lipoprotein lipase, chylomicrons and apo-AV to GPIHBP1 containing a G56R amino acid substitution. Biochim Biophys Acta 1771:1464–1468
Gin P, Beigneux AP, Voss C, Davies BSJ, Beckstead JA, Ryan RO, Bensadoun A, Fong LG, Young SG (2011) Binding preferences for GPIHBP1, a glycosylphosphatidylinositol-anchored protein of capillary endothelial cells. Arterioscler Thromb Vasc Biol 31:176–182
Hall TA (1999) BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp Ser 41:95–98
Havel RJ, Kane JP (2001) Introduction: structure and metabolism of plasma lipoproteins. In: Scriver CR, Beaudet AL, Sly WS, Valle D, Childs B, Kinzler KW, Vogelstein B (eds) The metabolic and molecular bases of inherited disease. McGraw-Hill, New York, pp 2705–2716
Horie M, Okutomi K, Ohbuchi Y, Suzuki M, Takahashi E (1998) Isolation and characterization of a new member of the Ly6 gene family (LY6H). Genomics 53:365–368
Horse Genome Project (2008) http://www.uky.edu/Ag/Horsemap/
International Human Genome Sequencing Consortium (2001) Initial sequencing and analysis of the human genome. Nature 409:860–921
Ioka RX, Kang M-J, Kamiyama S, Kim D-H, Magoori K, Kamataki A, Ito Y, Takei YA, Sasaki M, Suzuki T, Sasano H, Takahashi S, Sakai J, Fujino T, Yamamoto TT (2003) Expression cloning and characterization of a novel glycosylphosphatidylinositol-anchored high density lipoprotein-binding protein, GPI-HBP1. J Biol Chem 278:7344–7349
Ishikawa N, Takano A, Yasui W, Inai K, Nishimura H, Ito H, Miyagi Y, Nakayama H, Fujita M, Hosokawa M, Tsuchiya E, Kohno N, Nakamura Y, Daigo Y (2007) Cancer-testis antigen lymphocyte 6 complex locus K is a serological biomarker and a therapeutic target for lung and esophageal carcinomas. Cancer Res 67:11601–11611
Kent WJ, Sugnet CW, Furey TS (2003) The human genome browser at UCSC. Genome Res 12:994–1006
Kimura M (1983) The neutral theory of molecular evolution. Cambridge University Press, Cambridge
Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R (2007) Clustal W and Clustal X version 2.0. Bioinformatics 23:2947–2948
Leath KJ, Johnson S, Roversi P, Hughes TR, Smith RAG, Mackenzie L, Morgan BP, Lea SM (2007) High-resolution structures of bacterially expressed soluble human CD59. Acta Cryst F63:648–652
Mammalian Genome Project Team (2004) The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC). Genome Res 14:2121–2127
McGuffin LJ, Bryson K, Jones DT (2000) The PSIPRED protein structure prediction server. Bioinformatics 16:404–405
Mikkelsen TS, Wakefield MJ, Aken B, Amemiya CT, Chang JL, Duke S, Garber M, Gentles AJ, Goodstadt L, Heger A, Jurka J, Kamal M, Mauceli E, Searle SMJ, Sharpe T, Baker ML, Batzer MA, Benos PV, Belov K, Clamp M, Cook A, Cuff J, Das R, Davidow L, Deakin JE, Fazzari MJ, Glass JL, Grabherr M, Greally JM, Gu W, Hore TA, Huttley GA, Kleber M, Jirtle RL, Koina E, Lee JT, Mahony S, Marra MA, Miller RD, Nicholls RD, Oda M, Papenfuss AT, Parra ZE, Pollock DD, Ray DA, Schein JE, Speed TP, Thompson K, VandeBerg JL, Wade CM, Walker JA, Waters PD, Webber C, Weidman JR, Xie X, Zody MC, Broad Institute Genome Sequencing Platform, Broad Institute Whole Genome Assembly Team, Marshall Graves JA, Ponting CP, Breen M, Samollow PB, Lander ES, Lindblad-Toh K (2007) Genome of the marsupial Monodelphis domestica reveals innovation in noncoding sequences. Nature 447:167–175
Mouse Genome Sequencing Consortium (2002) Initial sequencing and comparative analysis of the mouse genome. Nature 420:520–562
Nakamura T, Yamazaki Y, Saiki Y, Moriyaki M, Largaespada DA, Jenkins NA, Copeland NG (2000) Evi9 encodes a novel zinc finger protein that physically interacts with BCL6, a known human B-cell proto-oncogene product. Mol Cell Biol 20:3178–3186
Nosjean O, Briolay A, Roux B (1997) Mammalian GPI proteins: sorting, membrane residence and functions. Biochim Biophys Acta 1331:153–186
Olivecrona G, Ehrenborg E, Semb H, Makoveichuk E, Lindberg A, Hayden MR, Gin P, Davies BS, Weinstein MM, Fong LG, Beigneux AP, Young SG, Hernell O (2010) Mutation of conserved cysteines in the Ly6 domain of GPIHBP1 in familial chylomicronemia. J Lipid Res 51:1535–1545
Ory DS (2007) Chylomicrons and lipoprotein lipase at the endothelial surface: bound and GAG-ged? Cell Metab 5:229–231
Rat Genome Sequencing Project Consortium (2004) Genome sequence of the Brown Norway rat yields insights into mammalian evolution. Nature 428:493–521
Rhesus Macaque Genome Sequencing and Analysis Consortium (2007) Evolutionary and biomedical insights from the Rhesus Macaque genome. Science 316:222–234
Saitou N, Nei M (1987) The neighbour-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol 4:406–425
Sendak RA, Bensadoun A (1998) Identification of a heparin-binding domain in the distal carboxyl-terminal region of lipoprotein lipase by site-directed mutagenesis. J Lipid Res 39:1310–1315
Thierry-Mieg D, Thierry-Mieg J (2006) AceView: A comprehensive cDNA-supported gene and transcripts annotation. Genome Biology 7:S12 http://www.ncbi.nlm.nih.gov/IEB/Research/Acembly/index.html?human
Van de Peer Y, De Wachter R (1994) TREECON for Windows: a software package for the construction and drawing of evolutionary trees for the Microsoft Windows environment. Comput Appl Biosci 10:569–570
Wang J, Hegele RA (2007) Homozygous missense mutation (G56R) in glycosylphosphatidylinositol-anchored high-density lipoprotein-binding protein 1 (GPI-HBP1) in two siblings with fasting chylomicronemia. Lipids Health Dis 6:23
Warren WC, Hillier LW, Marshall Graves JA, Birney E, Ponting CP, Grützner F, Belov K (2008) Genome analysis of the platypus reveals unique signatures of evolution. Nature 453:175–183
Wion KL, Kirchgessner TG, Lusis AJ, Schotz MC, Lawn RM (1987) Human lipoprotein lipase complementary DNA sequence. Science 235:1638–1641
Young SG, Davies BSJ, Fong LG, Gin P, Weinstein MM, Bensadoun A, Beigneux AP (2007) GPIHBP1: an endothelial cell molecule important for the lipolytic processing of chylomicrons. Curr Opin Lipidol 18:389–396
Acknowledgments
This project was supported by NIH Grants P01 HL028972 and P51 RR013986. In addition, this investigation was conducted in facilities constructed with support from Research Facilities Improvement Program Grant Numbers 1 C06 RR13556, 1 C06 RR15456, 1 C06 RR017515.
Electronic supplementary material
Supplementary Fig. 1: Alignments for Acidic Amino Acid Sequence Regions for Vertebrate BCL11A and Mammalian GPIHBP1 Sequences
See Tables 1 and 3 for sources of glycosylphosphatidylinositol-anchored high-density lipoprotein-binding protein 1 (GPIHBP1) and vertebrate BCL11A (B-cell CLL/lymphoma 11A gene) sequences. Symbols: * identical residues; : similar alternate residues; . dissimilar alternate residues. Acidic amino acids are shown in blue, basic amino acids in pink, hydrophobic amino acids in red and hydrophilic amino acids in green
Rights and permissions
Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
About this article
Cite this article
Holmes, R.S., Cox, L.A. Comparative studies of glycosylphosphatidylinositol-anchored high-density lipoprotein-binding protein 1: evidence for a eutherian mammalian origin for the GPIHBP1 gene from an LY6-like gene. 3 Biotech 2, 37–52 (2012). https://doi.org/10.1007/s13205-011-0026-4
DOI: https://doi.org/10.1007/s13205-011-0026-4