Journal of Biosciences

Volume 20, Issue 5, pp 613–627

Analysis of CAG/CTG triplet repeats in the human genome: Implication in transcription factor gene regulation

  • Rashna Bhandari
  • Samir K. Brahmachari


Instability and polymorphism at several CAG/CTG trinucleotide repeat loci have been associated with human genetic disorders. In an attempt to identify novel sites that may be possible loci for expansion of CAG/CTG repeats, we searched all human sequences in the EMBL nucleotide sequence database for (CAG)5 and (CTG)5 repeats. We have identified 121 human DNA sequences of known and unknown functions that contain stretches of five or more CAG or CTG repeats. Many repeat stretches were interrupted by variant triplets, a significant number of which differ from the repeat triplet only by a single base, suggesting that these evolved from the parent triplet by point mutations. A large number of human transcription factor genes were found to contain CAG repeats within their coding sequences. Analysis of the EMBL transcription factors database showed that many transcription factor genes of other eukaryotes, including genes involved in Drosophila embryo development, possess these repeats. Interestingly, CAG repeats are absent from prokaryotic transcription factors. Different sequence entries for the human TATA box binding protein showed a polymorphism in the length of the CAG repeat in this gene, suggesting that loci other than those already known to be associated with genetic diseases may be possible sites for repeat-instability-related disorders. On the basis of our findings in this database analysis, we propose a role for CAG repeats as cis-acting regulatory elements involved in fine-tuning gene expression.
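The search criterion described in the abstract, stretches of five or more CAG or CTG repeats, can be illustrated with a minimal sketch. This is not the authors' method or software: the function name `find_triplet_repeats` and the example sequence are hypothetical, and the sketch assumes the sequence is available as a plain string of bases. It reports only uninterrupted runs, so stretches interrupted by variant triplets would appear as separate, shorter runs or be missed.

```python
import re

# Five or more tandem copies of CAG or CTG, mirroring the (CAG)5 / (CTG)5
# criterion stated in the abstract. Illustrative assumption, not the authors' code.
REPEAT_PATTERN = re.compile(r"(?:CAG){5,}|(?:CTG){5,}")


def find_triplet_repeats(sequence):
    """Return (start, triplet, copies) for each uninterrupted run of
    at least five CAG or CTG triplets in the given DNA string."""
    hits = []
    for match in REPEAT_PATTERN.finditer(sequence.upper()):
        run = match.group(0)
        # start is a 0-based offset; copies is the number of triplets in the run
        hits.append((match.start(), run[:3], len(run) // 3))
    return hits


# Hypothetical example sequence: six CAG copies, five CTG copies,
# then two CAG copies that fall below the threshold.
example = "AT" + "CAG" * 6 + "GG" + "CTG" * 5 + "CAGCAG"
hits = find_triplet_repeats(example)
```

Because CTG is the reverse complement of CAG, scanning one strand for both patterns also covers CAG runs on the opposite strand.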


Keywords: Triplet repeats; sequence analysis; transcription factor genes; TBP





Copyright information

© Indian Academy of Sciences 1995

Authors and Affiliations

  1. Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
