Junk or functional DNA? ENCODE and the function controversy

Abstract

In its last round of publications in September 2012, the Encyclopedia Of DNA Elements (ENCODE) assigned a biochemical function to most of the human genome, which was taken up by the media as meaning the end of ‘Junk DNA’. This provoked a heated reaction from evolutionary biologists, who among other things claimed that ENCODE adopted a wrong and much too inclusive notion of function, making its dismissal of junk DNA merely rhetorical. We argue that this criticism rests on misunderstandings concerning the nature of the ENCODE project, the relevant notion of function and the claim that most of our genome is junk. We argue that evolutionary accounts of function presuppose functions as ‘causal roles’, and that selection is but a useful proxy for relevant functions, which might well be unsuitable to biomedical research. Taking a closer look at the discovery process in which ENCODE participates, we argue that ENCODE’s strategy of biochemical signatures successfully identified activities of DNA elements with an eye towards causal roles of interest to biomedical research. We argue that ENCODE’s controversial claim of functionality should be interpreted as saying that 80 % of the genome is engaging in relevant biochemical activities and is very likely to have a causal role in phenomena deemed relevant to biomedical research. Finally, we discuss ambiguities in the meaning of junk DNA and in one of the main arguments raised for its prevalence, and we evaluate the impact of ENCODE’s results on the claim that most of our genome is junk.

Notes

  1.

    Because natural selection tends to remove deleterious mutations from the pool, we can expect to observe fewer mutations in DNA sequences important for the survival and reproduction of the organism. As there are a number of technical hurdles in the detection of such selection, estimates vary (some going up to 15 %). For a discussion, see Ponting and Hardison (2011).

  2.

    See for example http://genomeinformatician.blogspot.ch/2012/09/encode-my-own-thoughts.html and http://www.homolog.us/blogs/blog/2013/04/09/homolog-us-blog-calls-for-sean-eddy-be-fired-for-the-sake-of-good-science/.

  3.

    In fact, it must be noted that most criticisms are aimed not directly at what is written in ENCODE's scientific publications, which are careful in their formulations, but at its interpretation. This applies not only to the mass media, but also to the coverage the results were given in prestigious scientific journals such as Science (Pennisi 2012).

  4.

    On the day the embargo was lifted on the last round of ENCODE's publications (and therefore long before the publication of the criticisms of ENCODE), Ewan Birney, ENCODE's lead analysis coordinator, published a post on his personal blog providing his personal perspective on the project (Birney 2012a, b). In this post, Birney acknowledges that the proportion of the genome that is “functional” depends on how stringent one is, and preempts some of the most important technical criticisms directed at the project.

  5.

    Doolittle for instance writes: “Those of us who speak of excess DNA as informationally junk mean that its presence is not to be explained by past and/or current selection at the level of organisms—that it has no informational function construable historically as an SE [selected effect]. Those who say that almost the whole of the human genome is functional informationally do so on the basis of an operational diagnosis embracing a non-historical CR [causal role] definition of function.” (Doolittle 2013, p. 5299).

  6.

    In a similar way, Weber (2005) has argued that “elucidating the evolutionary history of some system or subsystem is supplementary to analyzing its function; it is not part of it.” (Weber 2005, p. 40) This is easily shown by the fact that exaptations (features which start being used in a way for which they have not been selected) are generally considered functional.

  7.

    Furthermore, the reader should be aware that different lines of research suggest that the modern synthesis is insufficient to understand evolution. See for instance the volume edited by Pigliucci and Müller (2010) on the need to extend the standard evolutionary paradigm. Other directions have been explored by Shapiro (2011), Gissis and Jablonka (eds, 2011), and Kauffman (1993, 1996).

  8.

    Ernst Mayr (1961) proposed a distinction between two research projects within biology, which he labeled functional biology and evolutionary biology. According to Mayr, while functional biology seeks proximate causes and therefore investigates how certain phenomena occur, evolutionary biology is devoted to understanding why they occur, that is, the evolutionary reason for the presence of the very same phenomena. Our point is that given traits can be involved in the explanations of functional biology independently of whether they have been selected for. As Cummins puts it: “Flight is a capacity that cries out for explanation in terms of anatomical functions regardless of its contribution to the capacity to maintain the species.” (Cummins 1975, p. 756). We obviously do not claim that Mayr's two domains of biology are insulated from each other, but rather that they pursue two legitimate aims which share many, though not all, of their means (Laland et al. 2011). Each domain is important in the investigation of the other. However, a reduction of all the functional relevance of the genome to its evolutionary dimension (i.e. function as selected effect or biological advantage) fails to give enough attention to the different research projects of the life sciences.

  9.

    Note that the two-step strategy we propose is not in conflict with Bechtel and Richardson's (2010) strategy of decomposition and localization. According to the latter, scientists decompose a complex phenomenon into less complex subsystems or contributions, and attempt to localize these functions to physical components of the system (e.g. organelles). Our point is that in the step of localization, the physical components are not entirely uncharacterized, and the earlier characterization of their activities provides important hints as to which function localizes where. Our two-step strategy is however to be distinguished from another very common strategy in biology. Mutants identified through reverse genetics, for instance, can establish the relevance of a part in a given phenomenon before identifying any of its activities. Obviously, the strategy we describe is but one of the many general strategies available to biologists.

  10.

    This has to be understood as making a relevant difference, for any change to DNA makes a phenotypic difference at least insofar as the genome is also part of the structure of the organism. Even a transcription factor binding site in the middle of nowhere, not leading to any transcription, has an effect on relevant gene functions, at least insofar as it sequesters the transcription factor and hence reduces the amount of the protein available for important binding sites. In the same way, non-coding RNAs can influence the expression of coding genes because they are bound by miRNAs which would normally regulate the coding genes (Salmena et al. 2011). However, such impacts may be so small as to be imperceptible.

  11.

    This is well illustrated by Brenner's (1998) distinction between “junk” and “garbage”: “Some years ago I noticed that there are two kinds of rubbish in the world and that most languages have different words to distinguish them. There is the rubbish we keep, which is junk, and the rubbish we throw away, which is garbage. The excess DNA in our genomes is junk, and it is there because it is harmless, as well as being useless, and because the molecular processes generating extra DNA outpace those getting rid of it. Were the extra DNA to become disadvantageous, it would become subject to selection, just as junk that takes up too much space, or is beginning to smell, is instantly converted to garbage…” (Brenner S, Refuge of spandrels. Curr. Biol. 8: R669, 1998, quoted in Graur et al. 2013, p. 586).

  12.

    E.g. “Lean Gene Machine”, Scientific American, accessed at http://www.scientificamerican.com/article.cfm?id=lean-gene-machine.

  13.

    If evidence is required to support this claim, the reader may consider as an example the effect of budget and workforce cuts on the Greek National Health System (Kentikelenis et al. 2011).

  14.

    This problematic move is also present in another critique of ENCODE's claims, which contrasts the question of “How much DNA does it take to design a human?” with that of “How much DNA does it take to evolve a human?” (Eddy 2013, p. R260), relating the former to function and the latter to junk (see also the interview with Eddy in Diep 2013). Function, however, does not mean ideal design.

  15.

    Chris Ponting (personal communication) for instance made this claim, but also emphasized the immense difficulty of identifying the remaining 10 % scattered across the genome.

  16.

    Perhaps the most interesting study regarding this question is that of Nobrega et al. (2004), who deleted two megabase-long non-coding regions of the mouse genome and failed to detect any relevant phenotypic difference.

  17.

    According to Eddy (2013), “[t]here are three categories of big science: the big experiment, the map, and the leading wedge. A big experiment is driven by a single question or hypothesis test, but requires a large scale community investment. […] A map is a data resource—comprehensive, complete, closed ended—to be used by multiple groups, over a long time, for multiple purposes. […] A leading wedge is a massed technology development effort, in an area where we need radically better methods.” (Eddy 2013, p. R261) While the success of “big experiments” is generally easy to appraise, Eddy deplores that “[w]e have been too shy to defend maps and leading wedges in biology” (Eddy 2013, p. R261).

References

  1. Agency for Healthcare Research and Quality (2001) Reducing and preventing adverse drug events to decrease hospital costs: research in action, issue 1. Retrieved from http://www.ahrq.gov/research/findings/factsheets/errors-safety/aderia/index.html

  2. Bechtel W, Richardson RC (2010) Discovering complexity—decomposition and localization as strategies in scientific research. The MIT Press, Cambridge

  3. Bigelow J, Pargetter R (1987) Functions. J Philos 84(4):181–196

  4. Birney E (2012a) Lesson for big-data projects. Nature 489:49–51

  5. Birney E (2012b) ENCODE: my own thoughts. Ewan's Blog: Bioinformatician at large. Retrieved September 5, 2012, from http://genomeinformatician.blogspot.ch/2012/09/encode-my-own-thoughts.html

  6. Brenner S (1998) Refuge of spandrels. Curr Biol 8:R669

  7. Brown D, Boytchev H (2012) “Junk DNA” concept debunked by new analysis of human genome. The Washington Post. Retrieved September 5, 2012, from http://www.washingtonpost.com/national/health-science/junk-dna-concept-debunked-by-new-analysis-of-human-genome/2012/09/05/cf296720-f772-11e1-8398-0327ab83ab91_story.html

  8. Bunzl M (1980) Comment on “health as a theoretical concept”. Philos Sci 47:116–118

  9. Chanock SJ (2012) Toward mapping the biology of the genome. Genome Res 22(9):1612–1615. doi:10.1101/gr.144980.112

  10. Comings DE (1972) The structure and function of chromatin. Adv Human Genetics 3:237–431

  11. Connor S (2003) Glaxo chief: our drugs do not work on most patients. The Independent. Retrieved December 8, 2003, from http://www.independent.co.uk/news/science/glaxo-chief-our-drugs-do-not-work-on-most-patients-575942.html

  12. Craver C (2007) Explaining the brain: mechanisms and the mosaic unity of neuroscience. Oxford University Press, New York

  13. Cummins R (1975) Functional analysis. J Philos 72(20):741–765

  14. Darden L (2006) Reasoning in biological discoveries. Cambridge University Press, Cambridge

  15. Diep F (2013) Friction over function: scientists clash on the meaning of ENCODE’s genetic data. Scientific American. Retrieved April 12, 2013, from http://www.scientificamerican.com/article/friction-over-function-encode/

  16. Doolittle WF (2013) Is junk DNA bunk? A critique of ENCODE. Proc Natl Acad Sci USA 110(14):5294–5300. doi:10.1073/pnas.1221376110

  17. Eddy SR (2012) The C-value paradox, junk DNA and ENCODE. Curr Biol 22:R898–R899. doi:10.1016/j.cub.2012.10.002

  18. Eddy SR (2013) The ENCODE project: missteps overshadowing a success. Curr Biol 23:R259–R261. doi:10.1016/j.cub.2013.03.023

  19. Gaudillière JP, Rheinberger H-J (2004) From molecular genetics to genomics, the mapping cultures of twentieth-century genetics. Routledge, London

  20. Gerstein MB, Kundaje A, Hariharan M, Landt SG, Koon-Kiu Y, Chao C et al (2012) Architecture of the human regulatory network derived from ENCODE data. Nature 489:91–100. doi:10.1038/nature11245

  21. Gissis SB, Jablonka E (eds) (2011) Transformations of lamarckism. From subtle fluids to molecular biology. MIT Press, Cambridge

  22. Graur D (2013) The Origin of Junk DNA: A Historical Whodunnit. Judge Starling. Retrieved October 19, 2013, from http://judgestarling.tumblr.com/post/64504735261/the-origin-of-junk-dna-a-historical-whodunnit

  23. Graur D, Zheng Y, Price N, Azevedo RBR, Zufall RA, Elhaik E (2013) On the immortality of television sets: “function” in the human genome according to the evolution-free gospel of ENCODE. Genome Biol Evol 5:578–590. doi:10.1093/gbe/evt028

  24. Gregory TR (2007) The onion test. Genomicron, April 27th 2007, retrieved from http://www.genomicron.evolverzone.com/2007/04/onion-test/

  25. Griffiths PE (1993) Functional analysis and proper functions. Br J Philos Sci 44(3):409–422. doi:10.1093/bjps/44.3.409

  26. Griffiths PE (2001) Genetic information: a metaphor in search of a theory. Philos Sci 68(3):394–412

  27. Griffiths PE (2009) In what sense does “nothing make sense except in the light of evolution”? Acta Biotheor 57:11–32. doi:10.1007/s10441-008-9054-9

  28. Ibarra-Laclette E, Lyons E, Hernández-Guzmán G, Pérez-Torres CA, Carretero-Paulet L, Chang T-H, Herrera-Estrella L (2013) Architecture and evolution of a minute plant genome. Nature 498(7452):94–98. doi:10.1038/nature1213

  29. Kauffman S (1993) The origins of order: self-organization and selection in evolution. Oxford University Press, Oxford

  30. Kauffman S (1996) At home in the universe: the search for the laws of self-organization and complexity. Oxford University Press, Oxford

  31. Kentikelenis A, Karanikolos M, Papanicolas I, Basu S, McKee M, Stuckler D (2011) Health effects of financial crisis: omens of a Greek tragedy. Lancet 378:1457–1458

  32. Kolata G (2012) Bits of mystery DNA, far from “Junk,” play crucial role. The New York Times. p. 5–7. Retrieved September 6, 2012, from http://www.nytimes.com/2012/09/06/science/far-from-junk-dna-dark-matter-proves-crucial-to-health.html

  33. Laland KN, Sterelny K, Odling-Smee J, Hoppitt W, Uller T (2011) Cause and effect in biology revisited: is Mayr’s proximate-ultimate dichotomy still useful? Science 334:1512–1516. doi:10.1126/science.1210879

  34. Lynch VJ, Leclerc RD, May G, Wagner GP (2011) Transposon-mediated rewiring of gene regulatory networks contributed to the evolution of pregnancy in mammals. Nat Genetics 43(11):1154–1159. doi:10.1038/ng.917

  35. Maher B (2012) The human encyclopaedia. Nature 489:46–48

  36. Makalowski W (2003) Not junk after all. Science 300(5623):1246–1247. doi:10.1126/science.1085690

  37. Mayr E (1961) Cause and effect in biology. Science 134(3489):1501–1506. doi:10.1126/science.134.3489.1501

  38. Millikan RG (1989) In defense of proper functions. Philos Sci 56:288–302

  39. Neander K (1991) Functions as selected effects. Philos Sci 58:168–184

  40. NHGRI (National Human Genome Research Institute) (2002) Workshop summary: the comprehensive extraction of biological information from genomic sequence. Retrieved from http://www.genome.gov/10005568

  41. Niu D-K, Jiang L (2013) Can ENCODE tell us how much junk DNA we carry in our genome? Biochem Biophys Res Commun 430:1340–1343. doi:10.1016/j.bbrc.2012.12.074

  42. Nobrega MA, Zhu Y, Plajzer-Frick I, Afzal V, Rubin EM (2004) Megabase deletions of gene deserts result in viable mice. Nature 431:988–993. doi:10.1038/nature02923.1

  43. Ohno S (1970) Evolution by gene duplication. Springer, New York

  44. Ohno S (1972) So much “junk” DNA in our genome. Brookhaven Symp Biol 23:366–370

  45. Ohno S (1973) Evolutionary reason for having so much junk DNA. In: Pfeiffer RA (ed) Modern aspects of cytogenetics: constitutive heterochromatin in man. F.K. Schattauer Verlag, Stuttgart

  46. Pennisi E (2012) ENCODE project writes eulogy for junk DNA. Science 337:1159–1161

  47. Pigliucci M, Müller GB (eds) (2010) Evolution—the extended synthesis. The MIT Press, Cambridge

  48. Ponting CP, Hardison RC (2011) What fraction of the human genome is functional? Genome Res 21:1769–1776. doi:10.1101/gr.116814.110

  49. Pritchard JK, Gilad Y (2012) Evolution and the code. Nature (News & Views) 489:55

  50. Salmena L, Poliseno L, Tay Y, Kats L, Pandolfi PP (2011) A ceRNA hypothesis: the Rosetta Stone of a hidden RNA language? Cell 146(3):353–358. doi:10.1016/j.cell.2011.07.014

  51. Schaub MA, Boyle AP, Kundaje A, Batzoglou S, Snyder M (2012) Linking disease associations with regulatory information in the human genome. Genome Res 22:1748–1759. doi:10.1101/gr.136127.111

  52. Shapiro JA (2011) Evolution: a view from the 21st century. FT Press, New Jersey

  53. Stamatoyannopoulos J (2012) What does our genome encode? Genome Res 22:1602–1611. doi:10.1101/gr.146506.112

  54. Strasser BJ (2008) GenBank—natural history in the 21st century. Science 322(5901):537–538. doi:10.1126/science.1163399

  55. Strasser BJ (2012) Data-driven sciences: from wonder cabinets to electronic databases. Stud Hist Philos Biol Biomed Sci 43:85–87. doi:10.1016/j.shpsc.2011.10.009

  56. The ENCODE Project Consortium (2004) The ENCODE (ENCyclopedia Of DNA Elements) project. Science 306:636–640. doi:10.1126/science.1105136

  57. The ENCODE Project Consortium (2011) A user’s guide to the encyclopedia of DNA elements (ENCODE). PLoS Biol 9(4):e1001046. doi:10.1371/journal.pbio.1001046

  58. The ENCODE Project Consortium (2012) An integrated encyclopedia of DNA elements in the human genome. Nature 489:57–74. doi:10.1038/nature11247

  59. Tinbergen N (1963) On aims and methods in ethology. Zeitschrift für Tierpsychologie 20(4):410–433

  60. Wang L, Lawrence MS, Wan Y, Stojanov P, Sougnez C, Stevenson K et al (2011) SF3B1 and other novel cancer genes in chronic lymphocytic leukemia. N Engl J Med 365:2497–2506. doi:10.1056/NEJMoa1109016

  61. Weber M (2005) Philosophy of experimental biology. Cambridge University Press, Cambridge

  62. Wouters AG (2003) Four notions of biological function. Stud Hist Philos Sci Part C Stud Hist Philos Biol Biomed Sci 34:633–668. doi:10.1016/j.shpsc.2003.09.006

  63. Wright L (1973) Functions. Philos Rev 82(2):139–168

Acknowledgments

We wish to acknowledge Fridolin Groß, who took part in the many discussions at the origin of this paper and commented carefully on several versions of it. In addition, we wish to thank all those who have read drafts of this paper: Michel Morange, Michael Weisberg, Iros Barozzi, Lorenzo Del Savio, Marcel Weber and the lgBIG group in Geneva (in which the paper was discussed), Alkistis Elliot-Graves and Vera Pendino. We are also thankful to our colleagues of the FOLSATEC programme. Finally, we wish to acknowledge the two anonymous reviewers for their help in improving the text.

Author information

Corresponding author

Correspondence to Pierre-Luc Germain.

About this article

Cite this article

Germain, PL., Ratti, E. & Boem, F. Junk or functional DNA? ENCODE and the function controversy. Biol Philos 29, 807–831 (2014). https://doi.org/10.1007/s10539-014-9441-3

Keywords

  • Biological function
  • Causal role
  • Selected effect
  • ENCODE
  • Junk DNA