Evidence amalgamation, plausibility, and cancer research


Cancer research is experiencing ‘paradigm instability’, since two rival theories of carcinogenesis confront each other, namely the somatic mutation theory and the tissue organization field theory. Despite this theoretical uncertainty, a huge quantity of data is available thanks to improvements in genome sequencing techniques. Some authors think that the development of new statistical tools will overcome the lack of a shared theoretical perspective on cancer by amalgamating as much data as possible. We think instead that a deeper understanding of cancer can be achieved by means of more theoretical work, rather than by merely accumulating more data. To support our thesis, we introduce the analytic view of theory development, which rests on the concept of plausibility, and make clear in what sense plausibility and probability are distinct concepts. Then, the concept of plausibility is used to point out the ineliminable role played by the epistemic subject in the development of statistical tools and in the process of theory assessment. We then address a central issue in cancer research, namely the relevance of computational tools developed by bioinformaticists to detect driver mutations in the debate between the two main rival theories of carcinogenesis. Finally, we briefly extend our considerations on the role that plausibility plays in evidence amalgamation from cancer research to the more general issue of the divergences between frequentists and Bayesians in the philosophy of medicine and statistics. We argue that taking plausibility-based considerations into account can help to clarify some epistemological shortcomings that afflict both these perspectives.


  1.

    For a survey of the challenges that the big data approach has to face, see Fan et al. (2014).

  2.

    On what spurious correlations are, cf., e.g., Dellsén (2016, p. 78): “Suppose we have two variables V1 and V2 that are known on independent grounds to be unrelated, causally and nomologically. Let us further suppose that we learn, i.e. come to know, that there is some specific statistical correlation between V1 and V2—e.g. such that a greater value for V1 is correlated with a greater value for V2.” The latter correlation is an instance of spurious correlation, i.e. a correlation between two variables which is not due to any real relation between them. Such a correlation does not convey any information on the correlated variables, nor on any other relevant aspect of the world, so it is useless, irrelevant, or worse, it may lead us astray if we do not correctly identify it as spurious.

  3.

    In this case (i.e. \(c = 2\)), if we have \(k = 3\), then \({\gamma } = 8\). To see this, consider the following sequence of binary digits of length 8: 01100110. This string contains no arithmetic progression of length 3, because the positions 1, 4, 5, 8 (which are all ‘0’) and 2, 3, 6, 7 (which are all ‘1’) do not contain an arithmetic progression of length 3. However, if we add just one bit more to that string (i.e. if we add either ‘1’ or ‘0’), we obtain the following two strings: 011001100 and 011001101. Both these strings contain a monochromatic arithmetic progression of length 3. Consider 011001100: positions 1, 5, 9 are all ‘0’. Consider 011001101: positions 3, 6, 9 are all ‘1’. More generally, it can be proved that if a string contains more than 8 digits, it will contain a monochromatic arithmetic progression of length 3.

  4.

    It is important to stress that the nature of the correlation function is irrelevant: it can be completely arbitrary, i.e. in no way related to the nature of the data stored in the database.

  5.

    Cf. Calude and Longo (2016b, p. 6): “it is exactly the size of the data that allows our result: the more data, the more arbitrary, meaningless and useless [...] correlations will be found in them.” It may be interesting to note that, in order to derive their result, Calude and Longo define ‘spurious’ in a more restrictive way than is usually done. According to them, “a correlation is spurious if it appears in a ‘randomly’ generated database” (p. 13). Details can be found in Calude and Longo (2016b). In any case, this does not impinge on the considerations that follow.

  6.

    Cf. Cellucci (2017a, p. 142): “Methods can be divided into algorithmic and heuristic. An algorithmic method is a method that guarantees to always produce a correct solution to a problem. Conversely, a heuristic method is a method that does not guarantee to always produce a correct solution to a problem.”

  7.

    For the differences that exist among the analytic method, the analytic-synthetic method, and the axiomatic method, see Cellucci (2013).

  8.

    The analytic method may be traced back to the works of the mathematician Hippocrates of Chios and the physician Hippocrates of Cos, and was first explicitly formulated by Plato in Meno, Phaedo and the Republic. As an example of the analytic method, consider the solution to the problem of the quadrature of certain lunules provided by Hippocrates of Chios:

    Show that, if PQR is a right isosceles triangle and PRQ, PTR are semicircles on PQ, PR, respectively, then the lunule PTRU is equal to the right isosceles triangle PRS.

    To solve this problem, Hippocrates of Chios states the following hypothesis:

    (B) Circles are as the squares on their diameters.

    Hypothesis (B) is a sufficient condition for solving the problem. For, by the Pythagorean theorem, the square on PQ is twice the square on PR. Then, by (B), the semicircle on PQ, that is, PRQ, is twice the semicircle on PR, that is, PTR, and hence the quarter of circle PRS is equal to the semicircle PTR. Subtracting the same circular segment, PUR, from both the quarter of circle PRS and the semicircle PTR, we obtain the lunule PTRU and the triangle PRS, respectively. Therefore, “the lunule” PTRU “is equal to the triangle.” [Simplicius, In Aristotelis Physicorum libros Commentaria, A 2, 61]. This solves the problem. But hypothesis (B) is in its turn a problem that must be solved (Cellucci 2013, p. 61).

  9.

    For a more detailed comparison of the concept of plausibility with some related (but distinct) concepts, such as truth, probability, and warranted assertibility, see Cellucci (2017a, Chapter 9).

  10.

    On the possible interpretations of probabilities, see Gillies (2000). Basically, probabilities may be regarded as ‘objective’ or ‘subjective’. Cf. e.g. Djulbegovic et al. (2011, p. 309): “‘objective probability’ is believed to reflect the characteristics of the real world, i.e., the probability somehow relates to the physical property of the world or a mechanism generating sequences of events [...]. On the other hand, ‘subjective probability’ is believed to represent a state of mind and not a state of objects [...].”

  11.

    This is the problem of unconceived alternatives; see Sect. 2.6 below.

  12.

    See Cellucci (2013, Chapter 20). Examples of plausible hypotheses that have zero probability are all the plausible hypotheses derived by Induction from a Single Case (ISC). On the classical concept of probability as the ratio between favorable and possible cases, a conclusion obtained by (ISC) has zero probability when the number of possible cases is infinite. Examples of implausible hypotheses that have non-zero probability are implausible hypotheses obtained by Induction from Multiple Cases (IMC). Consider the hypothesis that all swans are white. Until the end of the seventeenth century, “all swans observed were white. From this, by (IMC), it was inferred that all swans are white. But in 1697 black swans were discovered in Western Australia.” Since then, the hypothesis that all swans are white has been highly implausible. But this contrasts with the fact that, “on the classical concept of probability, a conclusion obtained by (IMC) has non-zero probability when the number of possible cases is not infinite”, and such “is the case of the hypothesis that all swans are white” (Cellucci 2013, p. 335).

  13.

    For an opposite view, see Musgrave (2011). For a criticism of Musgrave’s view, see Cellucci (2017a).

  14.

    Many replies to Stanford (2006) have been elaborated in the last decade (see, e.g., Magnus 2006; Saatsi et al. 2009; Ruhmkorff 2011); see Saatsi et al. (2009) for Stanford’s rejoinder to some criticisms; see Rowbottom (2016) for an interesting extension of Stanford’s line of reasoning. Here we will focus on Mizrahi’s (2016) attack on Stanford’s view because, as a reviewer suggested, it puts into question the very coherence of Stanford’s position, and so risks undermining all those positions which similarly rely on the problem of unconceived alternatives, such as the one we advocate in this paper. Mizrahi (2016) develops an argument against Stanford’s view according to which, if (1) one accepts Stanford’s argument against scientific realism, and (2) it is possible to adopt Stanford’s own line of reasoning in the field of philosophy, then Stanford’s position is self-debunking. Indeed, according to Mizrahi (2014), it is possible to construct an argument which is analogous to Stanford’s argument against scientific realism, and so is not easily refutable by those who accept Stanford’s argument, according to which we should not believe our current philosophical theories, because the history of philosophy shows that philosophers have routinely failed to conceive of serious objections to their theories. Call Mizrahi’s Stanford-like argument for philosophy MA. Now, according to Mizrahi’s argument against Stanford, if Stanford’s position is a philosophical position, then we should not trust it, precisely because it is a philosophical position, given that according to MA we should not trust philosophical theories. Many criticisms can be raised against Mizrahi’s approach, but we cannot analyze them all here for reasons of space. What can be briefly pointed out is that Mizrahi’s argument against Stanford crucially relies on MA. Now, it is MA which is a self-defeating argument. Indeed, if one maintains MA, one is clearly advocating a philosophical position, i.e. one is committing oneself to a given philosophical theory. But, according to MA itself, we should not trust philosophical theories. So, MA is self-defeating. If we now consider Mizrahi’s argument against Stanford’s position, it is easy to see that this argument also falls victim to MA’s self-defeatingness. Indeed, Mizrahi’s argument against Stanford conveys in its turn a philosophical position, which implies a commitment to a given philosophical theory. But, according to MA, we should not trust philosophical theories. Thus, since Mizrahi’s argument against Stanford crucially rests on MA, if (1) one takes MA to be a cogent argument, then one should not trust Mizrahi’s argument against Stanford, because Mizrahi’s argument against Stanford rests on a philosophical theory, and according to MA we should not trust any philosophical theory; if (2) one takes MA to be a self-defeating argument, i.e. a non-cogent argument, then Mizrahi’s argument against Stanford cannot even get off the ground, since it rests on a self-defeating argument.

  15.

    Cf. Schupbach (2011, p. 119, fn. 2): “Such scenarios correspond to van Fraassen’s best of a bad lot objection as well as what Stanford (2006) calls ‘the problem of unconceived alternatives’.”

  16.

    On this issue see Pollock (1983). Although his approach is distant from the view advocated here, there is nevertheless some similarity between the two. For instance, Pollock points out the impossibility of equating what he calls ‘epistemic probability’ and ‘statistical probability’. His conception of ‘epistemic probability’ is more akin to what we call ‘plausibility’ than to what is usually meant by ‘probability’. According to him, ‘statistical probability’ is that kind of probability “about which we can learn by discovering relative frequencies, counting cases” (Pollock 1983, p. 236). On the contrary, the “epistemic probability of a proposition is the degree to which it is warranted” (Ibidem). In this view, a proposition is deemed warranted by a careful examination of the reasons for and against it: “a person is justified in believing P just in case he has adequate reason to believe P [...], and he does not have any defeaters for it at his immediate disposal” (p. 233). Moreover, Pollock clearly denies that argument evaluation can be represented in probabilistic terms. Indeed, Pollock also explicitly denies the possibility of equating ‘epistemic probability’ and ‘subjective probability’. In his view, even if we follow the Bayesians and consider probability as expressing a person’s ‘degree of belief’ in the truth of a given proposition, we cannot equate probability assignment and argument evaluation, mainly because we cannot reasonably impose the rules of the probability calculus on argument evaluation.

  17.

    Cf. Calude and Longo (2016a, p. 273): “Randomness plays an essential role in probability theory, the mathematical calculus of random events. Kolmogorov axiomatic probability theory assigns probabilities to sets of outcomes and shows how to calculate with such probabilities.”

  18.

    It is important to note that we are not denying that external factors may relevantly affect theoretical choices; we are just stressing that our perspective is able to account for relevant non-arbitrary epistemic factors that may lead to theoretical divergence or theory change in a rational way.

  19.

    For a demonstration that no infinite sequence passes all tests of randomness, so that ‘true randomness’ does not exist, see Calude (2002).

  20.

    It may be objected that the existence of hereditary cancer is strong evidence for SMT. Supporters of TOFT usually specify that “a distinction should be made about the types of cancers that appear in the clinic; there are ‘sporadic’ cancers and hereditary ones. ‘Sporadic’ cancers represent over 95% of the cancers in humans. On the other hand, inherited cancers (less than 5% of total cancers) are a discrete subclass, mediated by germline mutations that have a distinct natural history, mostly appearing in early childhood and/or young adults [...]. While the DNA mutations in this latter type of cancers are present in all cells of the organism, tumors mostly appear in one or a few organs” (Soto and Sonnenschein 2011, p. 333). Since, according to TOFT, cancer is ‘development gone awry’, in this view hereditary cancers are regarded as ‘inherited inborn errors of development’ (Soto and Sonnenschein 2011).

  21.

    It may be objected that SMT and TOFT, despite their diversity, are not genuine rival theories, because they are not really incompatible. This issue is in fact strongly debated (see Bedessem and Ruphy 2015, 2017; Bizzarri and Cucina 2016; Bertolaso 2016; Rosenfeld 2013). However, this issue does not impinge on our argumentation. We observed the disputes that are going on in the field and tried to represent the situation. And, in fact, many of the participants in the SMT vs. TOFT debate do see those theories as really incompatible. For example, Bizzarri and Cucina (2016) state that SMT and TOFT are irreconcilable theories. According to them, “irreconcilability depends on radical divergence existing among basic premises [...]. Copernican theory was irreconcilably different from the Ptolemaic one, given that the central place in the solar system was occupied by the Earth in the latter and the Sun in the former. It is obviously impossible to support at the same time these two opposing hypotheses by constraining them into a ‘unified’ cosmological model. By analogy, SMT and TOFT cannot be merged because the premises on which those frameworks rely are incompatible: the default state of the cell can be considered either quiescence (SMT), or proliferation (according to TOFT). The two default states cannot be operational at the same time” (Bizzarri and Cucina 2016, p. 232). Now, whether SMT and TOFT are really incompatible is not relevant here. Indeed, we do not aim at solving the dispute on which is the best theory between SMT and TOFT, nor do we aim at solving the dispute on whether SMT and TOFT are really incompatible. We just try to highlight how these confrontations rest on theoretical commitments, whose adoption is better accounted for in terms of plausibility rather than probability. We thank an anonymous reviewer for having raised this issue.

  22.

    Cf. Ow and Kuznetsov (2016, p. 1): “Big data analytics is the process of examining large data sets containing heterogeneous patient sub-populations and a wide variety of data types [...]. Big data analytics aims to uncover hidden patterns, unknown correlations, complex trends, [...], as well as other useful features.”

  23.

    Cf. Stevens (2013, pp. 65–66): “Bioinformatics can be understood [...] as a kind of neo-Baconian science in which hypothesis-driven research is giving way to hypothesis-free experiments and data collection.”

  24.

    Cf. Stevens (2013, p. 69): “the computer becomes the crucial tool: efficiency is a product of bioinformatic statistical and data management techniques. It is the computer that must reduce instrument output to comprehensible and meaningful forms. The epistemological shift associated with data-driven biology is linked to a technological shift associated with the widespread use of computers.”

  25.

    On the idea of ‘hypothesis-free’ or ‘data-driven’ science, see Stevens (2013, Chapter 2); see also Chen and Snyder (2013). On exploratory data analysis, cf. Bassett et al. (1999, p. 54): “Knowledge discovery by exploratory data analysis is a ‘bottom up’ approach in which the data are allowed to ‘speak for themselves’ after a statistical [...] procedure is performed.” Cf. also Brown and Botstein (1999, p. 33): “this process is not driven by hypothesis and should be as model-independent as possible.”

  26.

    This hypothesis has been empirically confirmed, see Soto and Sonnenschein (2011), Baker (2015a, b) and Bizzarri and Cucina (2016).

  27.

    Cf. Raphael et al. (2014, p. 7): “Using the background mutation rate (BMR) and the number n of sequenced nucleotides within a gene (g), the probability (\(P_g\)) that a passenger mutation is observed in g is given by \(P_g = 1 - (1 - \hbox{BMR})^{n}\). Since somatic mutations arise independently in each sample, the occurrences of passenger mutations in g are modeled by flipping a biased coin with probability \(P_g\) of heads (mutation). Thus, if somatic mutations have been measured in m samples, the number of patients in which gene g is mutated is described by a binomial random variable \(B(m, P_g)\) with parameters m and \(P_g\). From \(B(m, P_g)\), it is possible to compute the probability that the observed number or more samples contain passenger mutations; this is the P value of the statistical test”.

  28.

    The fact that some data can be regarded as relevant to the estimation of the frequency distribution of passenger and driver mutations only if one assumes SMT can be clearly seen by considering that if one assumes TOFT, then the very same data (i.e. the detected mutations in a given cancer genome) cannot be regarded as instances of ‘driver’ mutations, simply because according to TOFT somatic mutations are not the cause of cancer. If one does not assume SMT, by searching for mutations one can at most display correlations of recurrent mutations in cancer cells, but one cannot prove that those mutations are the cause of carcinogenesis, and so one cannot disconfirm TOFT. Indeed, TOFT does not deny the existence of mutations in the genome of cancer cells. It denies that these mutations are the cause of cancer onset.

  29.

    It may be objected that the so-called de novo approaches, which aim at the identification of driver mutations by statistically analyzing combinations of mutations in networks and pathways (this method belongs to the third kind of approaches to identifying driver mutations listed above), are less prone to this criticism, because they do not incorporate previous knowledge about genes associated with well-studied cancer pathways. But this objection is inadequate. Indeed, even de novo approaches are not independent from crucial assumptions that are not neutral with respect to which hypothesis on carcinogenesis is adopted. In order to identify novel combinations of mutations or mutated genes, “it would be ideal to test all possible combinations for recurrent mutations across a cohort of cancer patients, but such a de novo approach is impractical. For example, there are more than 10\(^{29}\) possible sets of eight genes in the human genome, which is both too many to evaluate computationally and too many hypotheses to test while retaining statistical power” (Raphael et al. 2014, p. 12). In order to overcome this difficulty, de novo approaches try to identify driver mutations by searching for genetic aberrations which are both (1) highly recurrent, and (2) mutually exclusive, i.e. they do not appear in different pathways. So, only if one accepts the “hypotheses that each tumor has relatively few driver mutations [...] and these driver mutations perturb multiple cellular functions in different pathways [...], one can conclude that a tumor rarely possesses more than one driver mutation per pathway”, and so that “when examining data across cancer samples, driver pathways [...] correspond to mutually exclusive sets of genes” (Ibidem). But these assumptions are not independent from prior knowledge. Indeed, they rely on SMT, since they presuppose the existence of driver mutations, and even a precise hypothesis about their frequency, presuppositions which can be based on nothing but some kind of prior knowledge. Thus, de novo approaches do not really differ from other kinds of approaches developed to identify driver mutations, and cannot be said to be independent from previous knowledge.

  30.

    For a detailed illustration of the main views of statistics, see Romeijn (2017).

  31.

    For a survey on EBM, see Bluhm and Borgerson (2011).

  32.

    This is the so-called problem of external validity; see Worrall (2010).

  33.

    Cf. Howson and Urbach (2006, p. 183): “Clinical trials typically involve two groups of subjects, all of whom are currently suffering from a particular medical condition; one of the groups, the test group, is administered the experimental therapy, while the other, the control group, is not [...].”

  34.

    Cf. e.g. Worrall (2007b, p. 472): “there can be no estimate of how closely balanced a particular real trial is with respect to any unknown factor—this is so by definition, since the unknown factor is unknown!”

  35.

    Cf. Worrall (2007b, p. 472): “there is also an epistemological issue about whether any repeated random trial would be comparable to the initial one. If a particular patient in the study receives, say, the ‘active drug’ on the first round, then, since this is expected to have some effect on his or her condition, the second randomization would not be rigorously a true repetition of the first. The second trial population, though consisting of the same individuals, would, in a possibly epistemically significant sense, not be the same population as took part in the initial trial.”

  36.

    Cf. Worrall (2007a, pp. 1000–1001): “randomisation does not free us from having to think about alternative explanations for particular trial outcomes and from assessing the plausibility of these in the light of ‘background knowledge’.”

  37.

    On the different Bayesian perspectives on prior probabilities, cf. Williamson (2010, p. 2): “All Bayesian epistemologists hold that rational degrees of belief are probabilities, consistent with total available evidence, and updated in the light of new evidence by Bayesian conditionalization. Strict subjectivists (e.g. Bruno de Finetti) hold that initial or prior degrees of belief are largely a question of personal choice. Empirically based subjectivists (e.g. Howson and Urbach [...]) hold that prior degrees of belief should not only be consistent with total evidence, but should also be calibrated with physical probabilities to the extent that they are known. Objectivists (e.g. Edwin Jaynes) hold that prior degrees of belief are fully determined by the evidence.”
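
The counting example in note 3 (two symbols, progressions of length 3, \(\gamma = 8\)) can be checked by brute force. The Python sketch below is an illustration added for this purpose and is not taken from Calude and Longo; the function name `has_mono_ap` is hypothetical. It tests a string for three equally spaced positions holding the same symbol, applies the test to the strings discussed in the note, and then runs through all 512 binary strings of length 9.

```python
from itertools import product


def has_mono_ap(s, k=3):
    """Return True if s has k equally spaced positions that all hold the same symbol."""
    n = len(s)
    for start in range(n):
        # Largest step that keeps the last term of the progression inside the string.
        max_step = (n - 1 - start) // (k - 1)
        for step in range(1, max_step + 1):
            if all(s[start + i * step] == s[start] for i in range(1, k)):
                return True
    return False


# The length-8 string from note 3 and its two one-bit extensions.
avoiding_string = "01100110"
extended_with_0 = avoiding_string + "0"
extended_with_1 = avoiding_string + "1"

# Exhaustive check over every binary string of length 9.
every_length_9_string_hit = all(
    has_mono_ap("".join(bits)) for bits in product("01", repeat=9)
)

print(has_mono_ap(avoiding_string))
print(has_mono_ap(extended_with_0), has_mono_ap(extended_with_1))
print(every_length_9_string_hit)
```

The exhaustive loop is feasible only because the example is tiny; it is meant to make the note's claim concrete, not to suggest a practical way of computing such bounds for larger k.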
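
The significance test quoted in note 27 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of any actual driver-detection tool: the function names and all numerical inputs (BMR, gene length, cohort size, observed count) are hypothetical values chosen only to show the calculation.

```python
from math import comb


def passenger_probability(bmr, n):
    """P_g = 1 - (1 - BMR)^n: chance of at least one passenger mutation in n nucleotides."""
    return 1.0 - (1.0 - bmr) ** n


def binomial_p_value(m, p_g, observed):
    """Upper-tail probability P(X >= observed) for X ~ Binomial(m, p_g)."""
    return sum(
        comb(m, j) * p_g**j * (1.0 - p_g) ** (m - j)
        for j in range(observed, m + 1)
    )


# Hypothetical inputs, for illustration only (not taken from the paper).
bmr = 1e-6      # assumed background mutation rate per nucleotide
n = 1500        # assumed number of sequenced nucleotides in gene g
m = 200         # assumed number of tumour samples
observed = 5    # assumed number of samples in which g is mutated

p_g = passenger_probability(bmr, n)
p_value = binomial_p_value(m, p_g, observed)
print(p_g, p_value)
```

Every quantity entering the calculation is defined relative to a passenger-only null model, so the resulting P value is interpretable only once the driver/passenger distinction is already in place.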
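
The order of magnitude quoted in note 29 for the number of possible sets of eight genes can be reproduced with a one-line count. The gene total used below (20,000) is an assumption introduced here for illustration; the source does not state which figure Raphael et al. used.

```python
from math import comb

# Assumed number of genes in the human genome (illustrative round figure).
assumed_gene_count = 20_000
set_size = 8

# Number of unordered sets of eight distinct genes.
n_sets = comb(assumed_gene_count, set_size)

print(f"{n_sets:.2e}")
print(n_sets > 10**29)
```

Under this assumption the count exceeds \(10^{29}\), which is why exhaustive testing of all combinations is ruled out and further restricting hypotheses have to be introduced.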


  1. Allen, J. F. (2001). In silico veritas. Data-mining and automated discovery: The truth is in there. EMBO Reports, 2(7), 542–544.

  2. Anderson, C. (2008). The end of theory: The data deluge makes the scientific method obsolete. Wired Magazine, 16(7), 16.

  3. Baker, S. G. (2014). Recognizing paradigm instability in theories of carcinogenesis. British Journal of Medicine and Medical Research, 4(5), 1149–1163.

  4. Baker, S. G. (2015a). A cancer theory kerfuffle can lead to new lines of research. Journal of the National Cancer Institute, 107(2), 405. doi:10.1093/jnci/dju405.

  5. Baker, S. G. (2015b). Response [to: Kaye, F.J. (2015)]. Journal of the National Cancer Institute, 107(5), djv061. doi:10.1093/jnci/djv061.

  6. Baker, S. G. (2017). The questionable premises underlying the search for cancer driver mutations and cancer susceptibility genes. Organisms, 1(1), 3–4.

  7. Bandyopadhyay, P. S., & Forster, M. R. (2011). Philosophy of statistics: An introduction. In P. S. Bandyopadhyay & M. R. Forster (Eds.), Handbook of the philosophy of science. Volume 7. Philosophy of statistics (pp. 1–50). Amsterdam: Elsevier.

  8. Bassett, D. E., Jr., Eisen, M. B., & Boguski, M. S. (1999). Gene expression informatics—It’s all in your mine. Nature Genetics, 21(1 Suppl.), 51–55.

  9. Bedessem, B., & Ruphy, S. (2015). SMT or TOFT? How the two main theories of carcinogenesis are made (artificially) incompatible. Acta Biotheoretica, 63(3), 257–267.

  10. Bedessem, B., & Ruphy, S. (2017). SMT and TOFT integrable after all: A reply to Bizzarri and Cucina. Acta Biotheoretica, 65(1), 81–85.

  11. Bertolaso, M. (2016). Philosophy of cancer. A dynamic and relational view. Dordrecht: Springer.

  12. Bird, A. (2017). Systematicity, knowledge, and bias. How systematicity made clinical medicine a science. Synthese. doi:10.1007/s11229-017-1342-y.

  13. Bizzarri, M., & Cucina, A. (2016). SMT and TOFT: Why and how they are opposite and incompatible paradigms. Acta Biotheoretica, 64(3), 221–239.

  14. Bluhm, R., & Borgerson, K. (2011). Evidence-based medicine. In F. Gifford (Ed.), Handbook of the philosophy of science. Volume 16. Philosophy of medicine (pp. 204–238). Amsterdam: Elsevier.

  15. Brown, P. O., & Botstein, D. (1999). Exploring the new world of the genome with DNA microarrays. Nature Genetics, 21(1 Suppl.), 33–37.

  16. Calude, C. (2002). Information and randomness. An algorithmic perspective. Berlin: Springer.

  17. Calude, C., & Longo, G. (2016a). Classical, quantum and biological randomness as relative unpredictability. Natural Computing, 15(2), 263–278.

  18. Calude, C. S., & Longo, G. (2016b). The Deluge of spurious correlations in big data. Foundations of Science. doi:10.1007/s10699-016-9489-4.

  19. Cellucci, C. (2013). Rethinking logic. Logic in relation to mathematics, evolution, and method. Dordrecht: Springer.

  20. Cellucci, C. (2016). Models of science and models in science. In E. Ippoliti, F. Sterpetti, & T. Nickles (Eds.), Models and inferences in science (pp. 95–122). Cham: Springer.

  21. Cellucci, C. (2017a). Rethinking knowledge. The heuristic view. Dordrecht: Springer.

  22. Cellucci, C. (in press). Theory building as problem solving. In E. Ippoliti & D. Danks (Eds.), Building theories. Cham: Springer.

  23. Chen, R., & Snyder, M. (2013). Promise of personalized omics to precision medicine. WIREs Systems Biology and Medicine, 5(1), 73–82.

  24. Cirkel, G. A., Gadellaa-van Hooijdonk, C. G., Koudijs, M. J., Willems, S. M., & Voest, E. E. (2014). Tumor heterogeneity and personalized cancer medicine: Are we being outnumbered? Future Oncology, 10(3), 417–428.

  25. Coveney, P. V., Dougherty, E. R., & Highfield, R. R. (2016). Big data need big theory too. Philosophical Transactions of the Royal Society A, 374(2080), 1–11.

  26. Dellsén, F. (2016). Scientific progress: Knowledge versus understanding. Studies in History and Philosophy of Science, 56, 72–83.

  27. Dimitrakopoulos, C. M., & Beerenwinkel, N. (2017). Computational approaches for the identification of cancer genes and pathways. Wiley Interdisciplinary Reviews: Systems Biology and Medicine. doi:10.1002/wsbm.1364.

  28. Djulbegovic, B., Hozo, I., & Greenland, S. (2011). Uncertainty in clinical medicine. In F. Gifford (Ed.), Handbook of the philosophy of science. Volume 16. Philosophy of medicine (pp. 298–356). Amsterdam: Elsevier.

  29. Fan, J., Han, F., & Liu, H. (2014). Challenges of big data analysis. National Science Review, 1(2), 293–314.

  30. Gagneur, J., Friedel, C., Heun, V., Zimmer, R., & Rost, B. (2017). Bioinformatics advances biology and medicine by turning big data troves into knowledge. Informatik Spektrum, 40(2), 153–160.

  31. Gelman, A., & Hennig, C. (2017). Beyond subjective and objective in statistics. Journal of the Royal Statistical Society: Series A, 180(4), 1–31.

  32. Gillies, D. (2000). Philosophical theories of probability. London: Routledge.

  33. Goodman, S. N. (1999). Toward evidence-based medical statistics. 1: The P value fallacy. Annals of Internal Medicine, 130(12), 995–1004.

  34. Goodman, S. N. (2001). Of P-values and Bayes: A modest proposal. Epidemiology, 12(3), 295–297.

  35. Howson, C., & Urbach, P. (2006). Scientific reasoning. The Bayesian approach (3rd ed.). Chicago, La Salle: Open Court.

  36. Katsnelson, A. (2013). Momentum grows to make ‘personalized’ medicine more ‘precise’. Nature Medicine, 19(3), 249.

  37. Kant, I. (1992). Lectures on logic. Cambridge: Cambridge University Press.

  38. Kaye, F. J. (2015). RE: A cancer theory kerfuffle can lead to new lines of research. Journal of the National Cancer Institute, 107(5), djv060. doi:10.1093/jnci/djv060.

  39. Laplace, P. S. (1951). A philosophical essay on probabilities. New York: Dover Publications [1st French ed.: 1814].

  40. Lawrence, M. S., Stojanov, P., Polak, P., Kryukov, G. V., Cibulskis, K., Sivachenko, A., et al. (2013). Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature, 499(7457), 214–218.

  41. Lindley, D. V. (2000). The philosophy of statistics. Journal of the Royal Statistical Society: Series D, 49(3), 293–319.

  42. Longo, G. (2017). The biological consequences of the computational world: Mathematical reflections on cancer biology. arXiv:1701.08085v2.

  43. Longo, G., Montévil, M., Sonnenschein, C., & Soto, A. M. (2015). In search of principles for a theory of organisms. Journal of Biosciences, 40(5), 955–968.

  44. Magnus, P. D. (2006). What’s new about the new induction? Synthese, 148(2), 295–301.

  45. Mazzocchi, F. (2015). Could big data be the end of theory in science? A few remarks on the epistemology of data-driven science. EMBO Reports, 16(10), 1250–1255.

  46. Merid, S. M., Goranskaya, D., & Alexeyenko, A. (2014). Distinguishing between driver and passenger mutations in individual cancer genomes by network enrichment analysis. BMC Bioinformatics, 15, 308. doi:10.1186/1471-2105-15-308.

  47. Mizrahi, M. (2014). The problem of unconceived objections. Argumentation, 28(4), 425–436.

  48. Mizrahi, M. (2016). Historical inductions, unconceived alternatives, and unconceived objections. Journal for General Philosophy of Science, 47(1), 59–68.

  49. Musgrave, A. (2011). Popper and hypothetico-deductivism. In D. Gabbay, S. Hartmann, & J. Woods (Eds.), Handbook of the history of logic. Volume 10. Inductive logic (pp. 205–234). Amsterdam: North-Holland.

  50. Ow, G. S., & Kuznetsov, V. A. (2016). Big genomics and clinical data analytics strategies for precision cancer prognosis. Scientific Reports, 6(36493), 1–13.

  51. Papineau, D. (1994). The virtues of randomization. The British Journal for the Philosophy of Science, 45(2), 437–450.

  52. Pollock, J. L. (1983). Epistemology and probability. Synthese, 55(2), 231–252.

  53. Pólya, G. (1941). Heuristic reasoning and the theory of probability. The American Mathematical Monthly, 48(7), 450–465.

  54. Popper, K. R. (2005). The logic of scientific discovery. London: Routledge.

  55. Putnam, H. (1975). Mathematics, matter and method. Philosophical papers (Vol. 1). Cambridge: Cambridge University Press.

  56. Raphael, B. J., Dobson, J. R., Oesper, L., & Vandin, F. (2014). Identifying driver mutations in sequenced cancer genomes: Computational approaches to enable precision medicine. Genome Medicine, 6(5), 1–17.

  57. Rigo-Lemini, M., & Martínez-Navarro, B. (2017). Epistemic states of convincement. A conceptualization from the practice of mathematicians and neurobiology. In U. E. Xolocotzin (Ed.), Understanding emotions in mathematical thinking and learning (pp. 97–131). London: Academic Press.

  58. Romeijn, J.-W. (2017). Philosophy of statistics. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy. https://plato.stanford.edu/archives/spr2017/entries/statistics/.

  59. Rosenfeld, S. (2013). Are the somatic mutation and tissue organization field theories of carcinogenesis incompatible? Cancer Informatics, 12, 221–229.

  60. Rowbottom, D. P. (2016). Extending the argument from unconceived alternatives: Observations, models, predictions, explanations, methods, instruments, experiments, and values. Synthese. doi:10.1007/s11229-016-1132-y.

  61. Ruhmkorff, S. (2011). Some difficulties for the problem of unconceived alternatives. Philosophy of Science, 78(5), 875–886.

  62. Saatsi, J., Psillos, S., Winther, R. G., & Stanford, K. (2009). Grasping at realist straws. Metascience, 18(3), 370–379.

  63. Salmon, W. C. (1990). The appraisal of theories: Kuhn meets Bayes. Proceedings of the Biennial Meeting of the Philosophy of Science Association, 2, 325–332.

  64. Schickore, J. (2014). Scientific discovery. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy. http://plato.stanford.edu/archives/spr2014/entries/scientific-discovery/.

  65. Schupbach, J. N. (2011). Studies in the logic of explanatory power. Ph.D. thesis, University of Pittsburgh, School of Arts and Sciences, Department of History and Philosophy of Science, Pittsburgh.

  66. Sklar, L. (1981). Do unborn hypotheses have rights? Pacific Philosophical Quarterly, 62(1), 17–29.

  67. Sonnenschein, C., & Soto, A. M. (2011). Response to in defense of the somatic mutation theory of cancer. BioEssays, 33(9), 657–659.

  68. Sonnenschein, C., & Soto, A. M. (2016). Carcinogenesis explained within the context of a theory of organisms. Progress in Biophysics and Molecular Biology, 122(1), 70–76.

  69. Soto, A. M., & Sonnenschein, C. (2011). The tissue organization field theory of cancer: A testable replacement for the somatic mutation theory. BioEssays, 33(5), 332–340.

  70. Stanford, P. K. (2006). Exceeding our grasp: Science, history, and the problem of unconceived alternatives. New York: Oxford University Press.

  71. Stevens, H. (2013). Life out of sequence. A data-driven history of bioinformatics. Chicago: The University of Chicago Press.

  72. Stratton, M. R., Campbell, P. J., & Futreal, P. A. (2009). The cancer genome. Nature, 458(7239), 719–724.

  73. Talukder, A. K. (2015). Genomics 3.0: Big-data in precision medicine. In N. Kumar & V. Bhatnagar (Eds.), Big data analytics (pp. 201–215). Cham: Springer.

  74. Tannock, I. F., & Hickman, J. A. (2016). Limits to personalized cancer medicine. The New England Journal of Medicine, 375(13), 1289–1294.

  75. Teira, D. (2011). Frequentist vs. Bayesian clinical trials. In F. Gifford (Ed.), Handbook of the philosophy of science. Volume 16. Philosophy of medicine (pp. 255–297). Amsterdam: Elsevier.

  76. Tokheim, C. J., Papadopoulos, N., Kinzler, K. W., Vogelstein, B., & Karchin, R. (2016). Evaluating the evaluation of cancer driver genes. Proceedings of the National Academy of Sciences, 113(50), 14330–14335.

  77. van Fraassen, B. C. (1989). Laws and symmetry. Oxford: Oxford University Press.

  78. Vaux, D. L. (2011a). In defense of the somatic mutation theory of cancer. BioEssays, 33(5), 341–343.

  79. Vaux, D. L. (2011b). Response to the tissue organization field theory of cancer: A testable replacement for the somatic mutation theory. BioEssays, 33(9), 660–661.

  80. Weinberg, R. (2014). Coming full circle—from endless complexity to simplicity and back again. Cell, 157(1), 267–271.

  81. Weyl, H. (1949). Philosophy of mathematics and natural science. Princeton: Princeton University Press.

  82. Williamson, J. (2010). In defence of objective Bayesianism. Oxford: Oxford University Press.

  83. Worrall, J. (2007a). Evidence in medicine and evidence-based medicine. Philosophy Compass, 2(6), 981–1022.

  84. Worrall, J. (2007b). Why there’s no cause to randomize. The British Journal for the Philosophy of Science, 58(3), 451–488.

  85. Worrall, J. (2010). Evidence: Philosophy of science meets medicine. Journal of Evaluation in Clinical Practice, 16(2), 356–362.

  86. Zbilut, J. P., & Giuliani, A. (2008). Biological uncertainty. Theory in Biosciences, 127(3), 223–227.

  87. Zhang, J., Liu, J., Sun, J., Chen, C., Foltz, G., & Lin, B. (2014). Identifying driver mutations from sequencing data of heterogeneous tumors in the era of personalized genome sequencing. Briefings in Bioinformatics, 15(2), 244–255.

Author information



Corresponding author

Correspondence to Fabio Sterpetti.

Additional information

The article was jointly developed by both authors, and its thesis is shared by both. Marta Bertolaso was mainly responsible for the writing of Sects. 3, 4, and 5. Fabio Sterpetti was mainly responsible for the writing of Sects. 1 and 2.

About this article

Cite this article

Bertolaso, M., Sterpetti, F. Evidence amalgamation, plausibility, and cancer research. Synthese 196, 3279–3317 (2019). https://doi.org/10.1007/s11229-017-1591-9

  • Cancer research
  • Evidence amalgamation
  • Plausibility
  • Probability
  • Somatic mutation theory
  • Tissue organization field theory