Human Genetics, Volume 135, Issue 2, pp 223–232

Linking short tandem repeat polymorphisms with cytosine modifications in human lymphoblastoid cell lines

  • Zhou Zhang
  • Yinan Zheng
  • Xu Zhang
  • Cong Liu
  • Brian Thomas Joyce
  • Warren A. Kibbe
  • Lifang Hou
  • Wei Zhang (corresponding author)
Original Investigation


Inter-individual variation in cytosine modifications has been linked to complex traits in humans. Cytosine modification variation is partially controlled by single nucleotide polymorphisms (SNPs), known as modified cytosine quantitative trait loci (mQTL). However, little is known about the role of short tandem repeat polymorphisms (STRPs), a class of structural genetic variants, in regulating cytosine modifications. Utilizing the published data on the International HapMap Project lymphoblastoid cell lines (LCLs), we assessed the relationships between 721 STRPs and the modification levels of 283,540 autosomal CpG sites. Our findings suggest that, in contrast to the predominant cis-acting mode for SNP-based mQTL, STRPs are associated with cytosine modification levels in both cis-acting (local) and trans-acting (distant) modes. In local scans within the ±1 Mb windows of target CpGs, 21, 9, and 21 cis-acting STRP-based mQTL were detected in CEU (Caucasian residents from Utah, USA), YRI (Yoruba people from Ibadan, Nigeria), and the combined samples, respectively. In contrast, 139,420, 76,817, and 121,866 trans-acting STRP-based mQTL were identified in CEU, YRI, and the combined samples, respectively. A substantial proportion of CpG sites detected with local STRP-based mQTL were not associated with SNP-based mQTL, suggesting that STRPs represent an independent class of mQTL. Functionally, genetic variants neighboring CpG-associated STRPs are enriched with genome-wide association study (GWAS) loci for a variety of complex traits and diseases, including cancers, based on the National Human Genome Research Institute (NHGRI) GWAS Catalog. Therefore, elucidating these STRP-based mQTL in addition to SNP-based mQTL can provide novel insights into the genetic architectures of complex traits.
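The local (cis) scan described above can be illustrated with a minimal sketch: for each STRP, every CpG on the same chromosome within ±1 Mb is paired with it, and the STRP length is correlated with the CpG's M-value across individuals. The names `Marker`, `pearson`, `beta_to_m`, and `local_scan` are hypothetical, the toy data are invented, and a plain Pearson correlation stands in for whatever association model and covariate adjustment the authors actually used. The per-individual STRP value (for example, the mean of the two allele lengths) is also an assumption.

```python
import math
from typing import NamedTuple, Sequence

WINDOW_BP = 1_000_000  # +/- 1 Mb local window, as stated in the abstract


class Marker(NamedTuple):
    """A genomic feature with one measurement per individual (hypothetical type)."""
    name: str
    chrom: str
    pos: int
    values: Sequence[float]  # same sample order for every marker


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant vector
    return sxy / math.sqrt(sxx * syy)


def beta_to_m(beta):
    """Convert a methylation beta-value (0 < beta < 1) to an M-value (logit, base 2)."""
    return math.log2(beta / (1.0 - beta))


def local_scan(strps, cpgs, window=WINDOW_BP):
    """Yield (STRP name, CpG name, r) for same-chromosome pairs within the window."""
    for s in strps:
        for c in cpgs:
            if s.chrom == c.chrom and abs(s.pos - c.pos) <= window:
                yield s.name, c.name, pearson(s.values, c.values)


# Invented toy data: four individuals, one STRP, one nearby and one distant CpG.
strp = Marker("STRP_A", "chr1", 5_000_000, [10, 12, 14, 16])
cpg_near = Marker("cg_near", "chr1", 5_400_000,
                  [beta_to_m(b) for b in (0.2, 0.3, 0.4, 0.5)])
cpg_far = Marker("cg_far", "chr1", 9_000_000,
                 [beta_to_m(b) for b in (0.5, 0.4, 0.3, 0.2)])

hits = list(local_scan([strp], [cpg_near, cpg_far]))
for strp_name, cpg_name, r in hits:
    print(strp_name, cpg_name, round(r, 3))
```

Only the nearby CpG is tested here because the distant one lies 4 Mb away. A trans scan would drop the window filter and test all pairs, which is why the number of tests, and the multiple-testing burden, grows so sharply in that setting.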





This work was partially supported by grants from the National Institutes of Health: R21HG006367 (to WZ), R21CA187869 (to WZ and LH), and The Robert H. Lurie Comprehensive Cancer Center-Developmental Funds P30CA060553 (to WZ).

Supplementary material

439_2015_1628_MOESM1_ESM.png (269 kb)
Supplementary material 1 (Fig. 1): Pearson's correlation coefficients (ρ) of STRPs and cytosine modifications in local scans, compared between CEU and YRI samples. Scatter plot of Pearson's correlations (ρ) between STRP length and M-values of local CpGs within ±1 Mb windows of STRPs for CEU and YRI. (PNG 269 kb)
439_2015_1628_MOESM2_ESM.png (204 kb)
Supplementary material 2 (Fig. 2): QQ-plots of the observed p-values for trans-acting STRP-based mQTL. P-values are binned and displayed as hexagons. The grey scale of each hexagon represents the count of p-values in that bin. A total of >200 million observed p-values from the whole-genome scan are shown. (a) CEU; (b) YRI. (PNG 204 kb)
439_2015_1628_MOESM3_ESM.png (98 kb)
Supplementary material 3 (Fig. 3): Enrichment of GWAS loci among cis-acting STRP-based mQTL. The null distributions of the numbers of SNPs overlapping GWAS loci are displayed as histograms. The asterisk marks the true number of SNPs overlapping GWAS loci within different windows: (a) ±100 Kb; (b) ±500 Kb; and (c) ±1 Mb of cis-acting STRP-based mQTL (p-value < 10⁻³). (PNG 97 kb)
439_2015_1628_MOESM4_ESM.xlsx (74 kb)
Supplementary material 4 (XLSX 73 kb)
439_2015_1628_MOESM5_ESM.xlsx (18.7 mb)
Supplementary material 5 (XLSX 19,168 kb)
439_2015_1628_MOESM6_ESM.xlsx (10 kb)
Supplementary material 6 (XLSX 10 kb)
439_2015_1628_MOESM7_ESM.xlsx (14 kb)
Supplementary material 7 (XLSX 14 kb)
439_2015_1628_MOESM8_ESM.xlsx (13 kb)
Supplementary material 8 (XLSX 13 kb)
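
The enrichment test summarized in the caption of Supplementary Fig. 3 (an observed count compared against a null distribution) can be sketched as a simple resampling procedure. This is an illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, positions are treated as coordinates on a single chromosome, the null is built by drawing equally sized random sets from all tested STRPs, and GWAS loci are counted directly rather than via neighboring SNPs.

```python
import random


def count_in_windows(centers, loci, window):
    """Count loci lying within +/- window of at least one center position."""
    return sum(any(abs(locus - c) <= window for c in centers) for locus in loci)


def empirical_enrichment(hit_positions, background_positions, gwas_positions,
                         window, n_perm=1000, seed=0):
    """Return (observed count, null counts, empirical one-sided p-value).

    The null is generated by sampling len(hit_positions) positions at random
    from background_positions (an assumed null model, chosen for illustration).
    """
    rng = random.Random(seed)
    observed = count_in_windows(hit_positions, gwas_positions, window)
    null = [
        count_in_windows(rng.sample(background_positions, len(hit_positions)),
                         gwas_positions, window)
        for _ in range(n_perm)
    ]
    # Add-one correction keeps the empirical p-value strictly above zero.
    p_value = (1 + sum(n >= observed for n in null)) / (n_perm + 1)
    return observed, null, p_value


# Invented toy data: 100 tested STRPs, one of which sits in a GWAS-dense region.
background = list(range(0, 1_000_000, 10_000))
gwas = [100_000, 101_000, 102_000]
hits_strp = [100_000]

observed, null, p_value = empirical_enrichment(hits_strp, background, gwas,
                                               window=5_000)
print(observed, p_value)
```

Plotting `null` as a histogram and marking `observed` on it would reproduce the layout described in the figure caption; repeating the call with different `window` values corresponds to the three panels.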



Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  • Zhou Zhang (1, 2)
  • Yinan Zheng (2, 3)
  • Xu Zhang (4)
  • Cong Liu (5)
  • Brian Thomas Joyce (2, 6)
  • Warren A. Kibbe (7)
  • Lifang Hou (2, 8)
  • Wei Zhang (2, 8, 9), corresponding author

  1. Driskill Graduate Program in Life Sciences, Northwestern University Feinberg School of Medicine, Chicago, USA
  2. Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, USA
  3. Institute for Public Health and Medicine, Feinberg School of Medicine, Northwestern University, Chicago, USA
  4. Section of Hematology/Oncology, Department of Medicine, University of Illinois at Chicago, Chicago, USA
  5. Department of Bioengineering, University of Illinois at Chicago, Chicago, USA
  6. Division of Epidemiology and Biostatistics, School of Public Health, University of Illinois at Chicago, Chicago, USA
  7. Center for Biomedical Informatics and Information Technology, National Cancer Institute, Rockville, USA
  8. The Robert H. Lurie Comprehensive Cancer Center, Northwestern University Feinberg School of Medicine, Chicago, USA
  9. Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, USA
