Abstract
While there is considerable appeal to the idea of selecting a few SNPs to represent all, or much, of the DNA sequence variability in a local chromosomal region, it is also important to quantify what detail is lost in adopting such an approach. To address this issue, we compared high- and low-resolution depictions of sequence diversity for the same genomic region, the APOA1/C3/A4/A5 gene cluster on chromosome 11. First, extensive re-sequencing identified all nucleotide and sequence haplotype variation of the linked apolipoprotein genes in 72 individuals from three populations: African-Americans from Jackson, Miss., Europeans from North Karelia, Finland, and European-Americans from Rochester, Minn. We identified 124 SNPs in 17.7 kb and significant differences in variation among genes. APOC3 gene diversity was particularly distinctive at high resolution, showing large allele frequency differences (F_ST values >0.250) between Jackson and the other two samples, and divergent population-specific haplotype lineages. Next, we selected haplotype-tagging SNPs (htSNPs) for each gene, at a density of approximately one SNP per kb, using an algorithm suggested by Stram et al. (2003). The 17 htSNPs identified were then used to reconstruct low-resolution haplotypes, from which inferences about the structure of variation were also drawn. This comparison showed that while the htSNPs successfully tagged common haplotype variation, they also left much underlying sequence diversity undetected and failed, in some cases, to co-classify groups of closely related haplotypes. The implications of these findings for other haplotype-based descriptions of human variation are discussed.
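As a rough illustration of what an F_ST value above 0.250 means for a single SNP, the sketch below computes Wright's fixation index from expected heterozygosities for two samples. It is a minimal example under stated assumptions, not the estimator used in this study (the abstract does not specify one): the function names, the equal weighting of the two samples, and the allele frequencies are all hypothetical.

```python
def heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) at a biallelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_two_pops(p1, p2):
    """Wright's F_ST = (H_T - H_S) / H_T for two equally weighted samples.

    p1, p2 are the frequencies of the same allele in each sample
    (hypothetical inputs, not data from the study).
    """
    p_bar = (p1 + p2) / 2.0                                # pooled allele frequency
    h_t = heterozygosity(p_bar)                            # total expected heterozygosity
    h_s = (heterozygosity(p1) + heterozygosity(p2)) / 2.0  # mean within-sample value
    if h_t == 0.0:
        # Site is fixed for the same allele in both samples: no differentiation.
        return 0.0
    return (h_t - h_s) / h_t


# Hypothetical frequencies: 0.8 in one sample, 0.2 in the other.
print(round(fst_two_pops(0.8, 0.2), 3))  # 0.36
```

With these made-up frequencies the pooled heterozygosity is 0.5 and the mean within-sample heterozygosity is 0.32, giving F_ST = 0.36, which would exceed the 0.250 level quoted above. Identical frequencies in both samples give F_ST = 0.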
References
Akey JM, Zhang G, Zhang K, Jin L, Shriver MD (2002) Interrogating a high-density SNP map for signatures of natural selection. Genome Res 12:1805–1814
Antonarakis SE, Oettgen P, Chakravarti A, Halloran SL, Hudson RR, Feisee L, Karathanasis SK (1988) DNA polymorphism haplotypes of the human apolipoprotein APOA1-APOC3-APOA4 gene cluster. Hum Genet 80:265–273
Bafna V, Halldórsson BV, Schwartz R, Clark AG, Istrail S (2003) Haplotypes and informative SNP selection algorithms: don’t block out information. RECOMB’03, Berlin, Germany
Bandelt HJ, Forster P, Sykes BC, Richards MB (1995) Mitochondrial portraits of human populations using median networks. Genetics 141:743–753
Barbujani G, Magagni A, Minch E, Cavalli-Sforza LL (1997) An apportionment of human DNA diversity. Proc Natl Acad Sci USA 94:4516–4519
Benlian P, Boileau C, Loux N, Pastier D, Masliah J, Coulon M, Nigou M, Ragab A, Guimard J, Ruidavets JB, et al (1991) Extended haplotypes and linkage disequilibrium between 11 markers at the APOA1-C3-A4 gene cluster on chromosome 11. Am J Hum Genet 48:903–910
Bisgaier CL, Sachdev OP, Megna L, Glickman RM (1985) Distribution of apolipoprotein A-IV in human plasma. J Lipid Res 26:11–25
Cardon LR, Abecasis GR (2003) Using haplotype blocks to map human complex trait loci. Trends Genet 19:135–140
Carlson CS, Eberle MA, Rieder MJ, Smith JD, Kruglyak L, Nickerson DA (2003) Additional SNPs and linkage-disequilibrium analyses are necessary for whole-genome association studies in humans. Nat Genet 33:518–521
Carlson CS, Eberle MA, Rieder MJ, Yi Q, Kruglyak L, Nickerson DA (2004) Selecting a maximally informative set of single-nucleotide polymorphisms for association analyses using linkage disequilibrium. Am J Hum Genet 74:106–120
Clark AG, Weiss KM, Nickerson DA, Taylor SL, Buchanan A, Stengard J, Salomaa V, Vartiainen E, Perola M, Boerwinkle E, Sing CF (1998) Haplotype structure and population genetic inferences from nucleotide-sequence variation in human lipoprotein lipase. Am J Hum Genet 63:595–612
Clark AG, Nielsen R, Signorovitch J, Matise TC, Glanowski S, Heil J, Winn-Deen ES, Holden AL, Lai E (2003) Linkage disequilibrium and inference of ancestral recombination in 538 single-nucleotide polymorphism clusters across the human genome. Am J Hum Genet 73:285–300
Collins FS, Green ED, Guttmacher AE, Guyer MS (2003) A vision for the future of genomics research. Nature 422:835–847
Dallinga-Thie GM, van Linde-Sibenius Trip M, Rotter JI, Cantor RM, Bu X, Lusis AJ, de Bruin TW (1997) Complex genetic contribution of the Apo AI-CIII-AIV gene cluster to familial combined hyperlipidemia. Identification of different susceptibility haplotypes. J Clin Invest 99:953–961
Dallinga-Thie GM, Groenendijk M, Blom RN, De Bruin TW, De Kant E (2001) Genetic heterogeneity in the apolipoprotein C-III promoter and effects of insulin. J Lipid Res 42:1450–1456
Dallongeville J, Meirhaeghe A, Cottel D, Fruchart JC, Amouyel P, Helbecque N (2001) Polymorphisms in the insulin response element of APOC-III gene promoter influence the correlation between insulin and triglycerides or triglyceride-rich lipoproteins in humans. Int J Obes Relat Metab Disord 25:1012–1017
Daly MJ, Rioux JD, Schaffner SF, Hudson TJ, Lander ES (2001) High-resolution haplotype structure in the human genome. Nat Genet 29:229–232
Dammerman M, Sandkuijl LA, Halaas JL, Chung W, Breslow JL (1993) An apolipoprotein CIII haplotype protective against hypertriglyceridemia is specified by promoter and 3′-untranslated region polymorphisms. Proc Natl Acad Sci USA 90:4562–4566
Devlin B, Risch N (1995) A comparison of linkage disequilibrium measures for fine-scale mapping. Genomics 29:311–322
Ewens WJ (1972) The sampling theory of selectively neutral alleles. Theor Popul Biol 3:87–112
Fay JC, Wu CI (2000) Hitchhiking under positive Darwinian selection. Genetics 155:1405–1413
Ferns GA, Galton DJ (1986) Haplotypes of the human apoprotein AI-CIII-AIV gene cluster in coronary atherosclerosis. Hum Genet 73:245–249
Fu YX, Li WH (1993) Statistical tests of neutrality of mutations. Genetics 133:693–709
Fullerton SM, Clark AG, Weiss KM, Nickerson DA, Taylor SL, Stengard JH, Salomaa V, Vartiainen E, Perola M, Boerwinkle E, Sing CF (2000) Apolipoprotein E variation at the sequence haplotype level: implications for the origin and maintenance of a major human polymorphism. Am J Hum Genet 67:881–900
Fullerton SM, Bartoszewicz A, Ybazeta G, Horikawa Y, Bell GI, Kidd KK, Cox NJ, Hudson RR, Di Rienzo A (2002a) Geographic and haplotype structure of candidate type 2 diabetes susceptibility variants at the calpain-10 locus. Am J Hum Genet 70:1096–1106
Fullerton SM, Clark AG, Weiss KM, Taylor SL, Stengard JH, Salomaa V, Boerwinkle E, Nickerson DA (2002b) Sequence polymorphism at the human apolipoprotein AII gene (APOA2): unexpected deficit of variation in an African-American sample. Hum Genet 111:75–87
Gabriel SB, Schaffner SF, Nguyen H, Moore JM, Roy J, Blumenstiel B, Higgins J, DeFelice M, Lochner A, Faggart M, Liu-Cordero SN, Rotimi C, Adeyemo A, Cooper R, Ward R, Lander ES, Daly MJ, Altshuler D (2002) The structure of haplotype blocks in the human genome. Science 296:2225–2229
Gotto AM Jr, Pownall HJ, Havel RJ (1986) Introduction to the plasma lipoproteins. Methods Enzymol 128:3–41
Groenendijk M, Cantor RM, de Bruin TW, Dallinga-Thie GM (2001) The apoAI-CIII-AIV gene cluster. Atherosclerosis 157:1–11
Haviland MB, Kessling AM, Davignon J, Sing CF (1995) Cladistic analysis of the apolipoprotein AI-CIII-AIV gene cluster using a healthy French Canadian sample. I. Haploid analysis. Ann Hum Genet 59:211–231
Hegele RA, Connelly PW, Hanley AJ, Sun F, Harris SB, Zinman B (1997) Common genomic variation in the APOC3 promoter associated with variation in plasma lipoproteins. Arterioscler Thromb Vasc Biol 17:2753–2758
Hill WG (1974) Estimation of linkage disequilibrium in randomly mating populations. Heredity 33:229–239
Hong SH, Park WH, Lee CC, Song JH, Kim JQ (1997) Association between genetic variations of apo AI-CIII-AIV cluster gene and hypertriglyceridemic subjects. Clin Chem 43:13–17
Johnson GC, Esposito L, Barratt BJ, Smith AN, Heward J, Di Genova G, Ueda H, Cordell HJ, Eaves IA, Dudbridge F, Twells RC, Payne F, Hughes W, Nutland S, Stevens H, Carr P, Tuomilehto-Wolf E, Tuomilehto J, Gough SC, Clayton DG, Todd JA (2001) Haplotype tagging for the identification of common disease genes. Nat Genet 29:233–237
Kamboh MI, Bunker CH, Aston CE, Nestlerode CS, McAllister AE, Ukoli FA (1999) Genetic association of five apolipoprotein polymorphisms with serum lipoprotein-lipid levels in African blacks. Genet Epidemiol 16:205–222
Karathanasis SK (1985) Apolipoprotein multigene family: tandem organization of human apolipoprotein AI, CIII, and AIV genes. Proc Natl Acad Sci USA 82:6374–6378
Ke X, Cardon LR (2003) Efficient selective screening of haplotype tag SNPs. Bioinformatics 19:287–288
Kumar S, Tamura K, Jakobsen IB, Nei M (2001) MEGA2: molecular evolutionary genetics analysis software. Bioinformatics 17:1244–1245
Li WW, Dammerman MM, Smith JD, Metzger S, Breslow JL, Leff T (1995) Common genetic variation in the promoter of the human apo CIII gene abolishes regulation by insulin and may contribute to hypertriglyceridemia. J Clin Invest 96:2601–2605
McConathy WJ, Gesquiere JC, Bass H, Tartar A, Fruchart JC, Wang CS (1992) Inhibition of lipoprotein lipase activity by synthetic peptides of apolipoprotein C-III. J Lipid Res 33:995–1003
Meng Z, Zaykin DV, Xu CF, Wagner M, Ehm MG (2003) Selection of genetic markers for association analyses, using linkage disequilibrium and haplotypes. Am J Hum Genet 73:115–130
Nei M, Kumar S (2000) Molecular evolution and phylogenetics. Oxford University Press, Oxford
Nickerson DA, Tobe VO, Taylor SL (1997) PolyPhred: automating the detection and genotyping of single nucleotide substitutions using fluorescence-based resequencing. Nucleic Acids Res 25:2745–2751
Nickerson DA, Taylor SL, Weiss KM, Clark AG, Hutchinson RG, Stengard J, Salomaa V, Vartiainen E, Boerwinkle E, Sing CF (1998) DNA sequence diversity in a 9.7-kb region of the human lipoprotein lipase gene. Nat Genet 19:233–240
Nickerson DA, Taylor SL, Fullerton SM, Weiss KM, Clark AG, Stengård JH, Salomaa V, Boerwinkle E, Sing CF (2000) Sequence diversity and large-scale typing of SNPs in the human apolipoprotein E gene. Genome Res 10:1532–1545
O’Brien RM, Granner DK (1996) Regulation of gene expression by insulin. Physiol Rev 76:1109–1161
Olivier M, Wang X, Cole R, Gau B, Kim J, Rubin EM, Pennacchio LA (2004) Haplotype analysis of the apolipoprotein gene cluster on human chromosome 11. Genomics (in press)
Ordovas JM, Civeira F, Genest J Jr, Craig S, Robbins AH, Meade T, Pocovi M, Frossard PM, Masharani U, Wilson PW, et al (1991) Restriction fragment length polymorphisms of the apolipoprotein AI, C-III, A-IV gene locus. Relationships with lipids, apolipoproteins, and premature coronary artery disease. Atherosclerosis 87:75–86
Orzack SH, Gusfield D, Olson J, Nesbitt S, Subrahmanyan L, Stanton VP Jr (2003) Analysis and exploration of the use of rule-based algorithms and consensus methods for the inferral of haplotypes. Genetics 165:915–928
Patil N, Berno AJ, Hinds DA, Barrett WA, Doshi JM, Hacker CR, Kautzer CR, Lee DH, Marjoribanks C, McDonough DP, Nguyen BT, Norris MC, Sheehan JB, Shen N, Stern D, Stokowski RP, Thomas DJ, Trulson MO, Vyas KR, Frazer KA, Fodor SP, Cox DR (2001) Blocks of limited haplotype diversity revealed by high-resolution scanning of human chromosome 21. Science 294:1719–1723
Paul H, Galton D, Stocks J (1987) DNA polymorphic patterns and haplotype arrangements of the apo A1, apo C-III, apo A-IV gene cluster in different ethnic groups. Hum Genet 75:264–268
Peacock RE, Hamsten A, Johansson J, Nilsson-Ehle P, Humphries SE (1994) Associations of genotypes at the apolipoprotein AI-CIII-AIV, apolipoprotein B and lipoprotein lipase gene loci with coronary atherosclerosis and high density lipoprotein subclasses. Clin Genet 46:273–282
Pennacchio LA, Olivier M, Hubacek JA, Cohen JC, Cox DR, Fruchart JC, Krauss RM, Rubin EM (2001) An apolipoprotein influencing triglycerides in humans and mice revealed by comparative sequencing. Science 294:169–173
Pennacchio LA, Olivier M, Hubacek JA, Krauss RM, Rubin EM, Cohen JC (2002) Two independent apolipoprotein A5 haplotypes influence human plasma triglyceride levels. Hum Mol Genet 11:3031–3038
Romualdi C, Balding D, Nasidze IS, Risch G, Robichaux M, Sherry ST, Stoneking M, Batzer MA, Barbujani G (2002) Patterns of human diversity, within and among continents, inferred from biallelic DNA polymorphisms. Genome Res 12:602–612
Salomaa V, Rasi V, Pekkanen J, Vahtera E, Jauhiainen M, Vartiainen E, Myllyla G, Ehnholm C (1994) Haemostatic factors and prevalent coronary heart disease; the FINRISK Haemostasis Study. Eur Heart J 15:1293–1299
Sebastiani P, Lazarus R, Weiss ST, Kunkel LM, Kohane IS, Ramoni MF (2003) Minimal haplotype tagging. Proc Natl Acad Sci USA 100:9900–9905
Song J, Park JW, Park H, Kim JQ (1998) Linkage disequilibrium of the Apo AI-CIII-AIV gene cluster and their relationship to plasma triglyceride, apolipoprotein AI and CIII levels in Koreans. Mol Cells 8:12–18
Stead JD, Hurles ME, Jeffreys AJ (2003) Global haplotype diversity in the human insulin gene region. Genome Res 13:2101–2111
Stead JD, Jeffreys AJ (2002) Structural analysis of insulin minisatellite alleles reveals unusually large differences in diversity between Africans and non-Africans. Am J Hum Genet 71:1273–1284
Stephens JC, Schneider JA, Tanguay DA, Choi J, Acharya T, Stanley SE, Jiang R, Messer CJ, Chew A, Han JH, Duan J, Carr JL, Lee MS, Koshy B, Kumar AM, Zhang G, Newell WR, Windemuth A, Xu C, Kalbfleisch TS, Shaner SL, Arnold K, Schulz V, Drysdale CM, Nandabalan K, Judson RS, Ruano G, Vovis GF (2001a) Haplotype variation and linkage disequilibrium in 313 human genes. Science 293:489–493
Stephens M, Smith NJ, Donnelly P (2001b) A new statistical method for haplotype reconstruction from population data. Am J Hum Genet 68:978–989
Stephens M, Smith NJ, Donnelly P (2001c) Reply to Zhang et al. Am J Hum Genet 69:912–914
Stram DO, Haiman CA, Hirschhorn JN, Altshuler D, Kolonel LN, Henderson BE, Pike MC (2003) Choosing haplotype-tagging SNPs based on unphased genotype data using a preliminary sample of unrelated subjects with an example from the Multiethnic Cohort Study. Hum Hered 55:27–36
Surguchov AP, Page GP, Smith L, Patsch W, Boerwinkle E (1996) Polymorphic markers in apolipoprotein C-III gene flanking regions and hypertriglyceridemia. Arterioscler Thromb Vasc Biol 16:941–947
Tahvanainen E, Pajukanta P, Porkka K, Nieminen S, Ikavalko L, Nuotio I, Taskinen MR, Peltonen L, Ehnholm C (1998) Haplotypes of the ApoA-I/C-III/A-IV gene cluster and familial combined hyperlipidemia. Arterioscler Thromb Vasc Biol 18:1810–1817
Tajima F (1983) Evolutionary relationship of DNA sequences in finite populations. Genetics 105:437–460
Tajima F (1989) Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–595
Talmud PJ, Hawe E, Martin S, Olivier M, Miller GJ, Rubin EM, Pennacchio LA, Humphries SE (2002) Relative contribution of variation within the APOC3/A4/A5 gene cluster in determining plasma triglycerides. Hum Mol Genet 11:3039–3046
Templeton AR, Boerwinkle E, Sing CF (1987) A cladistic analysis of phenotypic associations with haplotypes inferred from restriction endonuclease mapping. I. Basic theory and an analysis of alcohol dehydrogenase activity in Drosophila. Genetics 117:343–351
Turner ST, Weidman WH, Michels VV, Reed TJ, Ormson CL, Fuller T, Sing CF (1989) Distribution of sodium-lithium countertransport and blood pressure in Caucasians five to eighty-nine years of age. Hypertension 13:378–391
van der Vliet HN, Sammels MG, Leegwater AC, Levels JH, Reitsma PH, Boers W, Chamuleau RA (2001) Apolipoprotein A-V: a novel apolipoprotein associated with an early phase of liver regeneration. J Biol Chem 276:44512–44520
Vartiainen E, Puska P, Pekkanen J, Tuomilehto J, Jousilahti P (1994) Changes in risk factors explain changes in mortality from ischaemic heart disease in Finland. BMJ 309:23–27
Wang WY, Todd JA (2003) The usefulness of different density SNP maps for disease association studies of common variants. Hum Mol Genet 12:3145–3149
Waterworth DM, Talmud PJ, Humphries SE, Wicks PD, Sagnella GA, Strazzullo P, Alberti KG, Cook DG, Cappuccio FP (2001) Variable effects of the APOC3 -482C>T variant on insulin, glucose and triglyceride concentrations in different ethnic groups. Diabetologia 44:245–248
Watterson GA (1975) On the number of segregating sites in genetical models without recombination. Theor Popul Biol 7:256–276
Wojciechowski AP, Farrall M, Cullen P, Wilson TM, Bayliss JD, Farren B, Griffin BA, Caslake MJ, Packard CJ, Shepherd J, et al (1991) Familial combined hyperlipidaemia linked to the apolipoprotein AI-CII-AIV gene cluster on chromosome 11q23-q24. Nature 349:161–164
Wright S (1931) Evolution in Mendelian populations. Genetics 6:111–178
Xu CF, Talmud P, Schuster H, Houlston R, Miller G, Humphries S (1994) Association between genetic variation at the APO AI-CIII-AIV gene cluster and familial combined hyperlipidaemia. Clin Genet 46:385–397
Zhang K, Deng M, Chen T, Waterman MS, Sun F (2002) A dynamic programming algorithm for haplotype block partitioning. Proc Natl Acad Sci USA 99:7335–7339
Acknowledgements
We thank D. J. Matthews and M. D. Shriver for comments on an earlier version of the manuscript. This research was supported by grants from the National Heart, Lung, and Blood Institute: HL39107, HL58238, HL58239, HL58240, and HL66682.
About this article
Cite this article
Fullerton, S.M., Buchanan, A.V., Sonpar, V.A. et al. The effects of scale: variation in the APOA1/C3/A4/A5 gene cluster. Hum Genet 115, 36–56 (2004). https://doi.org/10.1007/s00439-004-1106-x