Evolutionary Constraint on DNA Shape in the Human Genome

  • Thomas D. Tullius
  • Stephen C. J. Parker
  • Elliott H. Margulies


In the age of genomics, DNA is depicted as a string of letters. While this is a useful device for representing the information in a genome, it obscures the molecular nature of DNA. Proteins cannot actually “read” DNA letters; they discriminate between DNA binding sites via molecular recognition, which is sensitive to DNA structure. Because shape is essential to DNA’s biological function, we hypothesized that natural selection can act to preserve DNA shape without maintaining the exact sequence of nucleotides. To test this hypothesis, we developed a DNA structure database, ORChID, and used it to map structural variation throughout the human genome. We then devised a computational algorithm, Chai, to detect evolutionary constraint on DNA shape. We found that the regions Chai identifies correlate better with experimentally identified functional elements than do genomic regions that are sequence-constrained. Our results support the hypothesis that DNA shape can be a substrate for natural selection.
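
The central idea, that two sequences can differ in their letters yet have similar shapes, can be illustrated with a small sketch. This is not the Chai algorithm and it does not use ORChID data: the per-nucleotide scores in `TOY_SHAPE`, the sliding-window average, and the function names are all assumptions invented for illustration.

```python
# Toy illustration only: the scores below are invented, NOT ORChID values,
# and the comparison is NOT the Chai algorithm.
TOY_SHAPE = {"A": 0.2, "T": 0.3, "G": 0.8, "C": 0.7}  # hypothetical scores


def shape_profile(seq, window=3):
    """Average the toy shape score in a window centred on each position."""
    raw = [TOY_SHAPE[base] for base in seq.upper()]
    half = window // 2
    profile = []
    for i in range(len(raw)):
        lo, hi = max(0, i - half), min(len(raw), i + half + 1)
        profile.append(sum(raw[lo:hi]) / (hi - lo))
    return profile


def sequence_identity(a, b):
    """Fraction of aligned positions with identical letters."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a.upper(), b.upper())) / len(a)


def shape_distance(a, b, window=3):
    """Mean absolute difference between two toy shape profiles."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pa, pb = shape_profile(a, window), shape_profile(b, window)
    return sum(abs(x - y) for x, y in zip(pa, pb)) / len(pa)


# Three made-up aligned sequences (hypothetical, not real genomic data).
reference = "AATTGGCC"
similar_shape = "TTAACCGG"    # every letter differs, toy shape is close
different_shape = "GGCCAATT"  # every letter differs, toy shape is far

if __name__ == "__main__":
    for name, seq in [("similar_shape", similar_shape),
                      ("different_shape", different_shape)]:
        print(name,
              "identity:", sequence_identity(reference, seq),
              "shape distance:", round(shape_distance(reference, seq), 3))
```

Under these toy scores, both comparison sequences share no letters with the reference, yet one has a much smaller shape distance than the other. A letter-by-letter comparison would treat the two cases identically, which is the kind of distinction a shape-based measure of constraint is meant to capture.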


Keywords: Functional Element; Minor Groove; Evolutionary Constraint; Cleavage Pattern; Evolutionary Selection



This work was funded by a grant to T.D.T. from the National Human Genome Research Institute (NHGRI) of the NIH (R01 HG003541). E.H.M. was supported by the Intramural Research Program of the NHGRI, NIH. S.C.J.P. was the recipient of a National Academies Ford Foundation Dissertation Fellowship.



Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Thomas D. Tullius (1, 2)
  • Stephen C. J. Parker (2, 3)
  • Elliott H. Margulies (3)

  1. Department of Chemistry, Boston University, Boston, USA
  2. Program in Bioinformatics, Boston University, Boston, USA
  3. Genome Informatics Section, Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, USA
