Abstract
In its last round of publications in September 2012, the Encyclopedia Of DNA Elements (ENCODE) assigned a biochemical function to most of the human genome, which was taken up by the media as meaning the end of ‘Junk DNA’. This provoked a heated reaction from evolutionary biologists, who among other things claimed that ENCODE adopted a wrong and much too inclusive notion of function, making its dismissal of junk DNA merely rhetorical. We argue that this criticism rests on misunderstandings concerning the nature of the ENCODE project, the relevant notion of function and the claim that most of our genome is junk. We argue that evolutionary accounts of function presuppose functions as ‘causal roles’, and that selection is but a useful proxy for relevant functions, which might well be unsuitable to biomedical research. Taking a closer look at the discovery process in which ENCODE participates, we argue that ENCODE’s strategy of biochemical signatures successfully identified activities of DNA elements with an eye towards causal roles of interest to biomedical research. We argue that ENCODE’s controversial claim of functionality should be interpreted as saying that 80 % of the genome is engaging in relevant biochemical activities and is very likely to have a causal role in phenomena deemed relevant to biomedical research. Finally, we discuss ambiguities in the meaning of junk DNA and in one of the main arguments raised for its prevalence, and we evaluate the impact of ENCODE’s results on the claim that most of our genome is junk.
Notes
Because natural selection tends to remove deleterious mutations from the pool, we can expect to observe fewer mutations in DNA sequences important for the survival and reproduction of the organism. As there are a number of technical hurdles in the detection of such selection, estimates vary (some going up to 15 %). For a discussion, see Ponting and Hardison (2011).
In fact, it must be noted that most criticisms are not directly aimed at what is written in ENCODE's scientific publications, which are careful in their formulations, but instead at their interpretation. This applies not only to the mass media, but also to the coverage the results were given in prestigious scientific journals such as Science (Pennisi 2012).
On the day the embargo was lifted on the last round of ENCODE's publications (and therefore long before the publication of the criticisms of ENCODE), Ewan Birney, ENCODE's lead analysis coordinator, published a post on his personal blog providing his personal perspective on the project (Birney 2012a, b). In this post, Birney acknowledges that the proportion of the genome that is “functional” depends on how stringent one is, and preempts some of the most important technical criticisms directed at the project.
Doolittle for instance writes: “Those of us who speak of excess DNA as informationally junk mean that its presence is not to be explained by past and/or current selection at the level of organisms—that it has no informational function construable historically as an SE [selected effect]. Those who say that almost the whole of the human genome is functional informationally do so on the basis of an operational diagnosis embracing a non-historical CR [causal role] definition of function.” (Doolittle 2013, p. 5299).
In a similar way, Weber (2005) has argued that “elucidating the evolutionary history of some system or subsystem is supplementary to analyzing its function; it is not part of it.” (Weber 2005, p. 40) This is easily shown by the fact that exaptations (features which start being used in a way for which they have not been selected) are generally considered functional.
Furthermore, the reader should be aware that different lines of research suggest that the modern synthesis is insufficient to understand evolution. See for instance the volume edited by Pigliucci and Müller (2010) on the need to extend the standard evolutionary paradigm. Other directions and suggestions have been explored by Shapiro (2011), Gissis and Jablonka (eds, 2011), and Kauffman (1993, 1996).
Ernst Mayr (1961) proposed a distinction between two research projects within biology, which he labeled functional biology and evolutionary biology. According to Mayr, while functional biology seeks proximate causes and therefore investigates how certain phenomena occur, evolutionary biology is devoted to understanding why they occur, that is, the evolutionary reason for the presence of the very same phenomena. Our point is that given traits can be involved in the explanations of functional biology independently of whether they have been selected for. As Cummins puts it: “Flight is a capacity that cries out for explanation in terms of anatomical functions regardless of its contribution to the capacity to maintain the species.” (Cummins 1975, p. 756). We obviously do not claim that Mayr's two domains of biology are insulated from each other, but rather that they pursue two legitimate aims which share many, though not all, of their means (Laland et al. 2011). Each domain is important in the investigation of the other. However, a reduction of all the functional relevance of the genome to its evolutionary dimension (i.e. function as selected effect or biological advantage) fails to give enough attention to the different research projects of the life sciences.
Note that the two-step strategy we propose is not in conflict with Bechtel and Richardson's (2010) strategy of decomposition and localization. According to the latter, scientists decompose a complex phenomenon into less complex subsystems or contributions, and attempt to localize these functions to physical components of the system (e.g. organelles). Our point is that in the step of localization, the physical components are not entirely uncharacterized, and the earlier characterization of their activities provides important hints as to which function localizes where. Our two-step strategy is however to be distinguished from another very common strategy in biology. Mutants identified through reverse genetics, for instance, can establish the relevance of a part in a given phenomenon before identifying any of its activities. Obviously, the strategy we describe is but one of the many general strategies available to biologists.
This has to be understood as making a relevant difference, for any change to DNA makes a phenotypic difference at least insofar as the genome is also part of the structure of the organism. Even a transcription factor binding site in the middle of nowhere, not leading to any transcription, has an effect on relevant gene functions, at least insofar as it sequesters the transcription factor and hence reduces the amount of the protein available for important binding sites. In the same way, non-coding RNAs can influence the expression of coding genes because they are bound by miRNAs which would normally regulate the coding genes (Salmena et al. 2011). However, such impacts may be so small as to be imperceptible.
This is well illustrated by Brenner's (1998) distinction between “junk” and “garbage”: “Some years ago I noticed that there are two kinds of rubbish in the world and that most languages have different words to distinguish them. There is the rubbish we keep, which is junk, and the rubbish we throw away, which is garbage. The excess DNA in our genomes is junk, and it is there because it is harmless, as well as being useless, and because the molecular processes generating extra DNA outpace those getting rid of it. Were the extra DNA to become disadvantageous, it would become subject to selection, just as junk that takes up too much space, or is beginning to smell, is instantly converted to garbage…” (Brenner S, Refuge of spandrels. Curr. Biol. 8: R669, 1998, quoted in Graur et al. 2013, p. 586).
E.g. “Lean Gene Machine”, Scientific American, accessed at http://www.scientificamerican.com/article.cfm?id=lean-gene-machine.
If evidence is required to support this claim, the reader may consider as an example the effect of budget and workforce cuts on the Greek National Health System (Kentikelenis et al. 2011).
This problematic move is also present in another critique of ENCODE's claims, which contrasts the question of “How much DNA does it take to design a human?” with that of “How much DNA does it take to evolve a human?” (Eddy 2013, p. R260), relating the former to function and the latter to junk (see also the interview with Eddy in Diep 2013). Function, however, does not mean ideal design.
Chris Ponting (personal communication) for instance made this claim, but also emphasized the immense difficulty of identifying the remaining 10 % scattered across the genome.
Perhaps the most interesting study regarding this question is that of Nobrega et al. (2004), who deleted two megabase-long non-coding regions of the mouse genome and failed to detect any relevant phenotypic difference.
According to Eddy (2013), “[t]here are three categories of big science: the big experiment, the map, and the leading wedge. A big experiment is driven by a single question or hypothesis test, but requires a large scale community investment. […] A map is a data resource—comprehensive, complete, closed ended—to be used by multiple groups, over a long time, for multiple purposes. […] A leading wedge is a massed technology development effort, in an area where we need radically better methods.” (Eddy 2013, p. R261) While the success of “big experiments” is generally easy to appraise, Eddy deplores that “[w]e have been too shy to defend maps and leading wedges in biology” (Eddy 2013, p. R261).
References
Agency for Healthcare Research and Quality (2001) Reducing and preventing adverse drug events to decrease hospital costs: research in action, issue 1. Retrieved from http://www.ahrq.gov/research/findings/factsheets/errors-safety/aderia/index.html
Bechtel W, Richardson RC (2010) Discovering complexity—decomposition and localization as strategies in scientific research. The MIT Press, Cambridge
Bigelow J, Pargetter R (1987) Functions. J Philos 84(4):181–196
Birney E (2012a) Lesson for big-data projects. Nature 489:49–51
Birney E (2012b) ENCODE: my own thoughts. Ewan's Blog: Bioinformatician at large. Retrieved September 5, 2012, from http://genomeinformatician.blogspot.ch/2012/09/encode-my-own-thoughts.html
Brenner S (1998) Refuge of spandrels. Curr Biol 8:R669
Brown D, Boytchev H (2012) “Junk DNA” concept debunked by new analysis of human genome. The Washington Post. Retrieved September 5, 2012, from http://www.washingtonpost.com/national/health-science/junk-dna-concept-debunked-by-new-analysis-of-human-genome/2012/09/05/cf296720-f772-11e1-8398-0327ab83ab91_story.html
Bunzl M (1980) Comment on “health as a theoretical concept”. Philos Sci 47:116–118
Chanock SJ (2012) Toward mapping the biology of the genome. Genome Res 22(9):1612–1615. doi:10.1101/gr.144980.112
Comings DE (1972) The structure and function of chromatin. Adv Human Genetics 3:237–431
Connor S (2003) Glaxo chief: our drugs do not work on most patients. The Independent. Retrieved December 8, 2003, from http://www.independent.co.uk/news/science/glaxo-chief-our-drugs-do-not-work-on-most-patients-575942.html
Craver C (2007) Explaining the brain: mechanisms and the mosaic unity of neuroscience. Oxford University Press, New York
Cummins R (1975) Functional analysis. J Philos 72(20):741–765
Darden L (2006) Reasoning in biological discoveries. Cambridge University Press, Cambridge
Diep F (2013) Friction over function: scientists clash on the meaning of ENCODE’s genetic data. Scientific American. Retrieved April 12, 2013, from http://www.scientificamerican.com/article/friction-over-function-encode/
Doolittle WF (2013) Is junk DNA bunk? A critique of ENCODE. Proc Natl Acad Sci USA 110(14):5294–5300. doi:10.1073/pnas.1221376110
Eddy SR (2012) The C-value paradox, junk DNA and ENCODE. Curr Biol 22:R898–R899. doi:10.1016/j.cub.2012.10.002
Eddy SR (2013) The ENCODE project: missteps overshadowing a success. Curr Biol 23:R259–R261. doi:10.1016/j.cub.2013.03.023
Gaudillière JP, Rheinberger H-J (2004) From molecular genetics to genomics, the mapping cultures of twentieth-century genetics. Routledge, London
Gerstein MB, Kundaje A, Hariharan M, Landt SG, Koon-Kiu Y, Chao C et al (2012) Architecture of the human regulatory network derived from ENCODE data. Nature 489:91–100. doi:10.1038/nature11245
Gissis SB, Jablonka E (eds) (2011) Transformations of lamarckism. From subtle fluids to molecular biology. MIT Press, Cambridge
Graur D (2013) The Origin of Junk DNA: A Historical Whodunnit. Judge Starling. Retrieved October 19, 2013, from http://judgestarling.tumblr.com/post/64504735261/the-origin-of-junk-dna-a-historical-whodunnit
Graur D, Zheng Y, Price N, Azevedo RBR, Zufall RA, Elhaik E (2013) On the immortality of television sets: “function” in the human genome according to the evolution-free gospel of ENCODE. Genome Biol Evol 5:578–590. doi:10.1093/gbe/evt028
Gregory TR (2007) The onion test. Genomicron, April 27th 2007, retrieved from http://www.genomicron.evolverzone.com/2007/04/onion-test/
Griffiths PE (1993) Functional analysis and proper functions. Br J Philos Sci 44(3):409–422. doi:10.1093/bjps/44.3.409
Griffiths PE (2001) Genetic information: a metaphor in search of a theory. Philos Sci 68(3):394–412
Griffiths PE (2009) In what sense does “nothing make sense except in the light of evolution”? Acta Biotheor 57:11–32. doi:10.1007/s10441-008-9054-9
Ibarra-Laclette E, Lyons E, Hernández-Guzmán G, Pérez-Torres CA, Carretero-Paulet L, Chang T-H, Herrera-Estrella L (2013) Architecture and evolution of a minute plant genome. Nature 498(7452):94–98. doi:10.1038/nature1213
Kauffman S (1993) The origins of order: self-organization and selection in evolution. Oxford University Press, Oxford
Kauffman S (1996) At home in the universe: the search for the laws of self-organization and complexity. Oxford University Press, Oxford
Kentikelenis A, Karanikolos M, Papanicolas I, Basu S, McKee M, Stuckler D (2011) Health effects of financial crisis: omens of a Greek tragedy. Lancet 378:1457–1458
Kolata G (2012) Bits of mystery DNA, far from “Junk,” play crucial role. The New York Times. p. 5–7. Retrieved September 6, 2012, from http://www.nytimes.com/2012/09/06/science/far-from-junk-dna-dark-matter-proves-crucial-to-health.html
Laland KN, Sterelny K, Odling-Smee J, Hoppitt W, Uller T (2011) Cause and effect in biology revisited: is Mayr’s proximate-ultimate dichotomy still useful? Science 334:1512–1516. doi:10.1126/science.1210879
Lynch VJ, Leclerc RD, May G, Wagner GP (2011) Transposon-mediated rewiring of gene regulatory networks contributed to the evolution of pregnancy in mammals. Nat Genetics 43(11):1154–1159. doi:10.1038/ng.917
Maher B (2012) The human encyclopaedia. Nature 489:46–48
Makalowski W (2003) Not junk after all. Science 300(5623):1246–1247. doi:10.1126/science.1085690
Mayr E (1961) Cause and effect in biology. Science 134(3489):1501–1506. doi:10.1126/science.134.3489.1501
Millikan RG (1989) In defense of proper functions. Philos Sci 56:288–302
Neander K (1991) Functions as selected effects. Philos Sci 58:168–184
NHGRI (National Human Genome Research Institute) (2002) Workshop summary: the comprehensive extraction of biological information from genomic sequence. Retrieved from http://www.genome.gov/10005568
Niu D-K, Jiang L (2013) Can ENCODE tell us how much junk DNA we carry in our genome? Biochem Biophys Res Commun 430:1340–1343. doi:10.1016/j.bbrc.2012.12.074
Nobrega MA, Zhu Y, Plajzer-Frick I, Afzal V, Rubin EM (2004) Megabase deletions of gene deserts result in viable mice. Nature 431:988–993. doi:10.1038/nature02923.1
Ohno S (1970) Evolution by gene duplication. Springer, New York
Ohno S (1972) So much “junk” DNA in our genome. Brookhaven Symp Biol 23:366–370
Ohno S (1973) Evolutionary reason for having so much junk DNA. In: Pfeiffer RA (ed) Modern aspects of cytogenetics: constitutive heterochromatin in man. F.K. Schattauer Verlag, Stuttgart
Pennisi E (2012) ENCODE project writes eulogy for junk DNA. Science 337:1159–1161
Pigliucci M, Müller GB (eds) (2010) Evolution—the extended synthesis. The MIT Press, Cambridge
Ponting CP, Hardison RC (2011) What fraction of the human genome is functional? Genome Res 21:1769–1776. doi:10.1101/gr.116814.110
Pritchard JK, Gilad Y (2012) Evolution and the code. Nature (News & Views) 489:55
Salmena L, Poliseno L, Tay Y, Kats L, Pandolfi PP (2011) A ceRNA hypothesis: the Rosetta Stone of a hidden RNA language? Cell 146(3):353–358. doi:10.1016/j.cell.2011.07.014
Schaub MA, Boyle AP, Kundaje A, Batzoglou S, Snyder M (2012) Linking disease associations with regulatory information in the human genome. Genome Res 22:1748–1759. doi:10.1101/gr.136127.111
Shapiro JA (2011) Evolution: a view from the 21st century. FT Press, New Jersey
Stamatoyannopoulos J (2012) What does our genome encode? Genome Res 22:1602–1611. doi:10.1101/gr.146506.112
Strasser BJ (2008) GenBank—natural history in the 21st century. Science 322(5901):537–538. doi:10.1126/science.1163399
Strasser BJ (2012) Data-driven sciences: from wonder cabinets to electronic databases. Stud Hist Philos Biol Biomed Sci 43:85–87. doi:10.1016/j.shpsc.2011.10.009
The ENCODE Project Consortium (2004) The ENCODE (ENCyclopedia Of DNA Elements) project. Science 306:636–640. doi:10.1126/science.1105136
The ENCODE Project Consortium (2011) A user’s guide to the encyclopedia of DNA elements (ENCODE). PLoS Biol 9(4):e1001046. doi:10.1371/journal.pbio.1001046
The ENCODE Project Consortium (2012) An integrated encyclopedia of DNA elements in the human genome. Nature 489:57–74. doi:10.1038/nature11247
Tinbergen N (1963) On aims and methods in ethology. Zeitschrift für Tierpsychologie 20(4):410–433
Wang L, Lawrence MS, Wan Y, Stojanov P, Sougnez C, Stevenson K et al (2011) SF3B1 and other novel cancer genes in chronic lymphocytic leukemia. N Engl J Med 365:2497–2506. doi:10.1056/NEJMoa1109016
Weber M (2005) Philosophy of experimental biology. Cambridge University Press, Cambridge
Wouters AG (2003) Four notions of biological function. Stud Hist Philos Sci Part C Stud Hist Philos Biol Biomed Sci 34:633–668. doi:10.1016/j.shpsc.2003.09.006
Wright L (1973) Functions. Philos Rev 82(2):139–168
Acknowledgments
We wish to acknowledge Fridolin Groß, who took part in the many discussions at the origin of this paper and carefully commented on several versions of it. In addition, we wish to thank all those who have read drafts of this paper: Michel Morange, Michael Weisberg, Iros Barozzi, Lorenzo Del Savio, Marcel Weber and the lgBIG group in Geneva (in which the paper was discussed), Alkistis Elliot-Graves and Vera Pendino. We are also thankful to our colleagues of the FOLSATEC programme. Finally, we wish to acknowledge the two anonymous reviewers for their help in improving the text.
Cite this article
Germain, PL., Ratti, E. & Boem, F. Junk or functional DNA? ENCODE and the function controversy. Biol Philos 29, 807–831 (2014). https://doi.org/10.1007/s10539-014-9441-3
DOI: https://doi.org/10.1007/s10539-014-9441-3