Abstract
Consensus conferences are social techniques that involve bringing together a group of scientific experts, and sometimes also non-experts, in order to increase the public role in science and related policy, to amalgamate diverse and often contradictory evidence for a hypothesis of interest, and to achieve scientific consensus, or at least the appearance of consensus, among scientists. For consensus conferences that set out to amalgamate evidence, I propose three desiderata: Inclusivity (the consideration of all available evidence), Constraint (the achievement of some agreement among intersubjective assessments of the hypothesis of interest), and Evidential Complexity (the evaluation of available evidence based on a plurality of relevant evidential criteria). Two examples suggest that consensus conferences can readily satisfy Inclusivity and Evidential Complexity, but that they do not as easily satisfy Constraint. I end by discussing the relation between social inclusivity and the three desiderata.
Notes
Robustness-style arguments have been frequently appealed to as grounds for objectivity; concordant multimodal evidence has been seen as a way to avoid worries about the fallibility of single modes of evidence and as a way to resist skeptical arguments. See discussions of robustness (or synonyms) in Wimsatt (1981), Cartwright (1983), Salmon (1984), Culp (1994), Chang (2004), Weber (2005), Kosso (2006), Stegenga (2009), Kuorikoski et al. (2010), and Stegenga (2011a).
The process of bringing together experts in an attempt to resolve disagreement and settle on a fact of the matter is probably as old as organized humanity. One of the more infamous examples of a consensus conference is the 1616 meeting of the commission of theologians, or Qualifiers, who came to a formal consensus that the hypothesis of a moving earth is “foolish and absurd in philosophy” (see Westman 2011).
However, some advocates of deliberative approaches to amalgamating evidence have been critical of formal methods of evidence amalgamation. A long-time critic of formal amalgamation methods such as meta-analysis has suggested that personal judgment is necessary to properly amalgamate evidence:
A good review is based on intimate personal knowledge of the field, the participants, the problems that arise, the reputation of different laboratories, the likely trustworthiness of individual scientists, and other partly subjective but extremely relevant considerations. Meta-analysis rules out any such subjective factors. (Eysenck 1994)
For a critical account of meta-analysis, see Stegenga (2011b).
As of 2007 the NIH had produced 118 consensus statements (Solomon 2007).
Not much, I think, should be placed on this distinction. Often the policy implications of an epistemic conclusion are clear to the participants of a consensus conference, it is usually policy makers who organize consensus conferences, and policies themselves involve predictions on some epistemic basis or other. Thus, like the Danish model, the U.S. model is often employed for guidance with policy formulation, albeit perhaps less directly.
This problem does not arise in contexts in which there is an independent indicator of the truth. An anonymous reviewer suggests that in situations in which evidence is amalgamated in order to make predictions, we have such an independent indicator of the truth, since the AM can be tested against the frequency with which its predictions are borne out. However, often in the contexts in which AMs are used, the track record of an AM can only be evaluated by appeal to further evidence relevant to the hypothesis in question, and when such new evidence is itself inconclusive (which is ubiquitous in such contexts) then the above circular argument applies. For instance, suppose our hypothesis (H) is “drug \(x\) alleviates symptoms \(y\)”, and we use an AM to amalgamate the available evidence regarding H, and then come to affirm H as probable. Further suppose that H warrants a prediction that if \(x\) were to be used in clinical practice it would alleviate \(y\). We then use \(x\) in clinical practice with the hope that it alleviates \(y\). But the evidence regarding H that becomes available from the use of \(x\) in clinical practice is only one kind of evidence relevant to H, and indeed such evidence is, in some widely recognized respects, inferior to the initial evidence that was amalgamated by the AM in the first place (because, for example, evidence from clinical practice is not controlled, and is liable to confounding by expectation bias and confirmation bias). So even after the prediction is made based on H and evidence is gathered about the prediction, our epistemic state regarding H is not different in kind from what it was prior to the prediction, and the inferior evidence gathered after the prediction cannot be an arbiter of the veracity of the AM.
Even this, though, is overly optimistic: elsewhere I argue that even when assessing a single mode of evidence, constraint is not necessarily achieved, because there are numerous features of evidence that must be assessed, which can be variably (but rationally) prioritized.
A caveat: much hinges on what evidence is deemed ‘relevant’, and this is often a matter of dispute.
If a method provides information that is no more reliable than a randomizer, then such information should not be considered ‘evidence’. If two methods are both somewhat reliable but their degrees of reliability differ, then evidence from such methods should be weighted accordingly by an AM. Elsewhere I investigate such weighting methodologies for evidence in clinical research.
For examples of the plurality of features of evidence that scientists consider, see, for example, Franklin (2002).
(E) is a kind of epistemic inclusiveness at the level of the plurality of features of evidence, rather than at the level of the plurality of kinds of evidence available (which is accounted for by (I)).
This desideratum, a kind of social inclusiveness, is distinct from my (I) above, meant to be inclusiveness of an epistemic kind only. Nevertheless, as Longino (1990) and others have argued, one way to help achieve the epistemic virtues that I am concerned with might be to guarantee social inclusiveness in the process of consensus formation. I return to the relation between social inclusivity and my three desiderata for AMs in §4.
Some argue that knowledge is what an ideal epistemic community would, in the long run, eventually agree on (for instance, this is one interpretation of Peirce’s notion of convergence to the truth). Others argue that knowledge is just what an actual epistemic community settles on (see, for example, Kusch 2002), and so if intersubjective assessment of hypotheses were tightly constrained, then knowledge would be achieved. Though I will not argue the point here, since many others have done so, the conflation between consensus and knowledge should be rejected. See also Miller (2013).
There is a growing body of literature concerned with the epistemic value of consensus, of which Miller (2013) is a recent valuable addition. Since the primary focus of the present paper is on consensus conferences rather than on consensus per se, I avoid an exposition of this literature, but for a sampling, see also Gilbert (1987), Tuomela (1992), Wray (2001), and Tucker (2003).
However, for a critique of the assumed epistemic value of consilience, see Stegenga (2009). Consilience is often called ‘robustness’ (see also footnote 1).
The consensus achieved by the Intergovernmental Panel on Climate Change could be described as an example of achieved constraint despite discordant evidence.
Though Solomon (2007) notes that consensus conferences have been assessed based on their freedom from bias, by the RAND Corporation in 1983, a group at the University of Michigan in 1987, and the NIH in 1999.
However, there were some experiments performed on prisoner ‘volunteers’ in the 1960s, with mixed results.
This view has been heavily criticized. See footnote 9 for references.
Dissenting outsiders are often non-scientists, and so are not ‘insiders’ to any scientific community. But sometimes such outsiders can be respected scientists in one community and be vocal dissenters to a consensus established by another community. The HIV-AIDS deniers (those who deny that HIV causes AIDS) are a salient example. Some are fully outsiders (Thabo Mbeki, the former president of South Africa, is one of the most prominent examples). But others include Peter Duesberg, a molecular biologist at the University of California, Berkeley, and Kary Mullis, winner of a Nobel Prize in chemistry. This raises the question: among professional scientists, what constitutes membership in this or that community? A cynical Kuhn-inspired answer might be: assent to the hypothesis about which consensus is in question. A less cynical answer might be: performing active research on that hypothesis. Duesberg and Mullis were outsiders to AIDS research on either answer. See van Rijn (2006).
Solomon (2007) gives several examples of such consensus conferences, including a 1994 conference titled “Helicobacter Pylori in Peptic Ulcer Disease” and a 2002 conference titled “Management of Hepatitis C”. Both of these conferences appeared to achieve (C), but in fact the conferences took place some time after the relevant scientific communities had already achieved consensus. For a criticism of Solomon’s argument, see Kosolosky (2012).
I do not mean to suggest that it is simple to avoid the subtle biases that arise in group deliberative processes. Janis (1982) argued that groups are liable to come to incorrect conclusions in certain circumstances; peer pressure and authoritative pressure stifle dissent and quiet the discussion of discordant evidence. In contrast, Tollefsen (2006) argues that scientists can engage in collaborative deliberation without engaging in groupthink or stifling dissent.
References
Beatty, J. (2006). Masking disagreement among scientific experts. Episteme, 3, 52–67.
Cartwright, N. (1983). How the laws of physics lie. Oxford: Clarendon Press.
Cartwright, N. (2006). Well-ordered science: Evidence for use. Philosophy of Science, 73, 981–990.
Cartwright, N. (2007). Are RCTs the gold standard? BioSocieties, 2, 11–20.
Chang, H. (2004). Inventing temperature. Oxford: Oxford University Press.
Collins, H. M., & Evans, R. (2002). The third wave of science studies: Studies of expertise and experience. Social Studies of Science, 32(2), 235–296.
Culp, S. (1994). Defending robustness: The bacterial mesosome as a test case. PSA, 1, 46–57.
Douglas, H. (2005). Inserting the public into science. In S. Maasen, & P. Weingart (Eds.), Democratization of expertise? Exploring novel forms of scientific advice in political decision-making. Netherlands: Springer.
Eysenck, H. (1994). Systematic reviews: Meta-analysis and its problems. British Medical Journal, 309, 789–792.
Franklin, A. (2002). Selectivity and discord: Two problems of experiment. Pittsburgh: University of Pittsburgh Press.
Gilbert, M. (1987). Modeling collective belief. Synthese, 73(1), 185–204.
Hong, L., & Page, S. (2004). Groups of diverse problem solvers can outperform groups of high-ability problem solvers. Proceedings of the National Academy of Sciences, 101(46), 16385–16389.
Janis, I. (1982). Groupthink: Psychological studies of policy decisions and fiascoes. Boston: Houghton Mifflin.
Joss, S., & Durant, J. (Eds.). (1995). Public participation in science: The role of consensus conferences in Europe. UK: Science Museum.
Klein R., & Williams, A. (2000). Setting priorities: what is holding us back—inadequate information or inadequate institutions? In A. Coulter, & C. Ham (Eds.), The global challenge of health care rationing. Buckingham: Open University Press.
Kosolosky, L. (2012). The Intended window of epistemic opportunity: A comment on Miriam Solomon. In B. Van Kerkhove, T. Libert, G. Vanpaemel, & P. Marage, (Eds.), Logic, philosophy and history of science in Belgium II. Koninklijke Vlaamse Academie van België.
Kosso, P. (2006). Detecting extrasolar planets. Studies in History and Philosophy of Science, 37, 224–236.
Kramer, P. (2011). In defense of antidepressants. New York Times July 9.
Kuorikoski, J., Lehtinen, A., & Marchionni, C. (2010). Economic modeling as robustness analysis. The British Journal for the Philosophy of Science, 61, 541–567.
Kusch, M. (2002). Knowledge by agreement: The programme of communitarian epistemology. Oxford: Clarendon Press.
Lomas, J., Fulop, N., Gagnon, D., & Allen, P. (2003). On being a good listener: Setting priorities for applied health services research. Milbank Quarterly, 81(3), 363–388.
Marmot, M. (2004). The status syndrome. New York: Times Books.
Mill, J. S. (1859). On liberty.
Miller, B. (2013). When is consensus knowledge-based? Distinguishing shared knowledge from mere agreement. Synthese, 190, 1293–1316.
Salmon, W. (1984). Scientific explanation and the causal structure of the world. Princeton: Princeton University Press.
Solomon, M. (2006). Groupthink versus the Wisdom of Crowds: The social epistemology of deliberation and dissent. The Southern Journal of Philosophy, 44, 28–42.
Solomon, M. (2007). The social epistemology of NIH consensus conferences. In H. Kincaid, & J. McKitrick (Eds.), Establishing medical reality: Essays in the metaphysics and epistemology of biomedical science. Springer.
Stegenga, J. (2009). Robustness, discordance, and relevance. Philosophy of Science, 76, 650–661.
Stegenga, J. (2011a). An impossibility theorem for amalgamating evidence. Synthese, 190, 2391–2411.
Stegenga, J. (2011b). Is meta-analysis the platinum standard? Studies in History and Philosophy of Biological and Biomedical Sciences, 42(4), 497–507.
Tollefsen, D. P. (2006). Group deliberation, social cohesion, and scientific teamwork: Is there room for dissent? Episteme, 3, 37–51.
Tucker, A. (2003). The epistemic significance of consensus. Inquiry, 46(4), 501–521.
Tuomela, R. (1992). Group beliefs. Synthese, 91(3), 285–318.
van Rijn, K. (2006). The politics of uncertainty: The AIDS debate, Thabo Mbeki and the South African government response. Social History of Medicine, 19(3), 521–538.
Weber, M. (2005). Philosophy of experimental biology. Cambridge: Cambridge University Press.
Westman, R. (2011). The Copernican question: Prognostication, skepticism, and celestial order. Berkeley: University of California Press.
Wimsatt, W. (1981). Robustness, reliability, and overdetermination. In M. B. Brewer, & B. E. Collins (Eds.), Scientific inquiry and the social sciences. Jossey-Bass.
Worrall, J. (2007). Why there’s no cause to randomize. British Journal for the Philosophy of Science, 58, 451–488.
Worrall, J. (2002). What evidence in evidence-based medicine? Philosophy of Science, 69, S316–S330.
Wray, K. B. (2001). Collective belief and acceptance. Synthese, 129(3), 319–333.
Acknowledgments
I am grateful to Nancy Cartwright, Boaz Miller, Alex Broadbent, Laszlo Kosolosky, Miriam Solomon, Anton Froeyman, Jeroen Van Bouwel, Heather Douglas, and two anonymous reviewers for detailed feedback on versions of this paper. Financial support was provided by the Banting Postdoctoral Fellowships Program administered by the Social Sciences and Humanities Research Council of Canada.
Forthcoming in Foundations of Science.
Stegenga, J. Three Criteria for Consensus Conferences. Found Sci 21, 35–49 (2016). https://doi.org/10.1007/s10699-014-9374-y