Human Genetics, Volume 125, Issue 1, pp 81–93

Identification of common genetic variants that account for transcript isoform variation between human populations

  • Wei Zhang
  • Shiwei Duan
  • Wasim K. Bleibel
  • Steven A. Wisel
  • R. Stephanie Huang
  • Xiaolin Wu
  • Lijun He
  • Tyson A. Clark
  • Tina X. Chen
  • Anthony C. Schweitzer
  • John E. Blume
  • M. Eileen Dolan
  • Nancy J. Cox
Original Investigation


In addition to the differences between populations in transcriptional and translational regulation of genes, alternative pre-mRNA splicing (AS) is also likely to play an important role in regulating gene expression and generating variation in mRNA and protein isoforms. Recently, the genetic contribution to transcript isoform variation has been reported in individuals of recent European descent. We report here results of an investigation of the differences in AS patterns between human populations. AS patterns in 176 HapMap lymphoblastoid cell lines derived from individuals of European and African ancestry were evaluated using the Affymetrix GeneChip® Human Exon 1.0 ST Array. A variety of biological processes such as response to stimulus and transcription were found to be enriched among the differentially spliced genes. The differentially spliced genes also include some involved in human diseases that have different prevalence or susceptibility between populations. The genetic contribution to the population differences in transcript isoform variation was then evaluated by a genome-wide association using the HapMap genotypic data on single nucleotide polymorphisms (SNPs). The results suggest that local and distant genetic variants account for a substantial fraction of the observed transcript isoform variation between human populations. Our findings provide new insights into the complexity of the human genome as well as the health disparities between the two populations.


Keywords: Common Genetic Variant; Exon Array; Transcript Cluster; International HapMap Consortium; Splice Index



This Pharmacogenetics of Anticancer Agents Research (PAAR) Group study was supported by NIH/NIGMS grants U01 GM61393 and U01 GM61374. We are grateful to Dr. Jeong-Ah Kang for maintaining cell lines, Cheryl A. Roe for reviewing the manuscript, and Drs. James Fackenthal and Emily Kistner for helpful discussion. T.A.C., T.X.C., A.C.S., and J.E.B. are employees of Affymetrix, Inc.

Supplementary material

439_2008_601_MOESM1_ESM.xls (746 kb)
Supplemental Table S1. Differentially spliced probesets between the CEU and YRI samples. Supplemental Table S2. SNPs associated with the differentially spliced probesets. Supplemental Table S3. Primers used in the validation of differentially spliced probesets (XLS 746 kb)
439_2008_601_MOESM2_ESM.tif (110 kb)
Supplemental Fig. 1 The workflow chart. Exon-level expression of the CEU and YRI trios was measured using the Affymetrix Human Exon 1.0 ST array. Splicing index values were compared between the two populations to identify differential probesets. Genetic variants associated with the differential probesets and enriched pathways and/or Gene Ontology terms were then identified. 254x190mm (72 x 72 DPI) (TIFF 110 kb)
439_2008_601_MOESM3_ESM.tif (117 kb)
Supplemental Fig. 2 PS3527423 (PARP2) is differentially spliced among 54 unrelated CEU samples. Blue indicates the presence of both spliced and unspliced isoforms (300 bp and 400 bp); red indicates unspliced isoforms (400 bp); black indicates spliced isoforms (300 bp). 254x190mm (72 x 72 DPI) (TIFF 116 kb)
439_2008_601_MOESM4_ESM.tif (65 kb)
Supplemental Fig. 3 Validation of transcript isoform variation between populations using quantitative real-time PCR. The relative expression of the exon of interest (EOI) compared with the neighboring housekeeping exon (HKE) is shown. (A) MPRL43 (probeset: 3303658) has a lower ratio (one-tailed t test p = 0.02) in CEU, consistent with the trend of SI (CEU SI mean = -0.02; YRI SI mean = 0.06). (B) MTMR4 (probeset: 3764493) has a lower ratio (one-tailed t test p = 0.05) in CEU, consistent with the trend of SI (CEU SI mean = 0.004; YRI SI mean = 0.024). 254x190mm (72 x 72 DPI) (TIFF 64 kb)



Copyright information

© Springer-Verlag 2008

Authors and Affiliations

  • Wei Zhang (1)
  • Shiwei Duan (1)
  • Wasim K. Bleibel (1)
  • Steven A. Wisel (1)
  • R. Stephanie Huang (1)
  • Xiaolin Wu (1)
  • Lijun He (1)
  • Tyson A. Clark (2)
  • Tina X. Chen (2)
  • Anthony C. Schweitzer (2)
  • John E. Blume (2)
  • M. Eileen Dolan (1)
  • Nancy J. Cox (1, 3)
  1. Section of Hematology/Oncology, Department of Medicine, The University of Chicago, Chicago, USA
  2. Expression Research Laboratory, Affymetrix Inc., Santa Clara, USA
  3. Department of Human Genetics, The University of Chicago, Chicago, USA
