1 Introduction

During the twentieth century, most philosophers of science subscribed to the notion that science should be free from the influence of non-epistemic, external values, such as moral, social, economic, or political values (Douglas 2009). Proponents of this view recognized that social and economic conditions, such as the availability of research funding and ethical constraints on the conduct of research, can impact scientific practice, but they maintained that the values that create these conditions should not be allowed to influence judgments and decisions which are internal to the epistemology and methodology of science. Science itself should remain as free as is humanly possible from external values (Douglas 2009).Footnote 1

Although some philosophers (e.g., Haack 2004; Betz 2013; Hudson 2016) still accept some version of the value-free ideal for science, others reject the value-free ideal and argue that scientists should incorporate external values into judgments and decisions at various stages of inquiry (Douglas 2009; Steel 2010; Kourany 2010; Elliott 2017; Brown 2020). This shift in philosophical thinking about the relationship between science and external values is due largely to penetrating critiques of the value-free ideal developed by philosophers, historians, and sociologists of science who argued that external values appropriately affect judgments and decisions related to such tasks as hypothesis and concept formation, theory construction, and hypothesis or theory acceptance.Footnote 2

Rejecting the value-free ideal creates a new problem, however, namely, how can one distinguish between legitimate and illegitimate value influences (Holman and Wilholt 2022; Resnik and Elliott 2019)? Resolution of this issue has important implications for science and society, since illegitimate value influences can undermine the integrity, reliability, and trustworthiness of science (Resnik 2007; 2009; Goldenberg 2021). Well-known examples of the corrupting influence of values on science include fraudulent, biased, misleading, or irreproducible studies conducted by pharmaceutical, tobacco, food, and energy companies to promote their economic interests (Resnik 2007; 2009; Michaels 2008; McGarity and Wagner 2008; Oreskes and Conway 2010; Holman and Elliott 2018; Ritchie 2020). Value corruption is not limited to private industry, however, since academic researchers sometimes fabricate, falsify, or deceptively manipulate data to promote their careers or financial interests, and researchers funded by environmental, political, or consumer interest groups sometimes manipulate or distort science for the sake of “worthy” causes, such as protecting public health or the environment (Wagner and Steinzor 2006; McGarity and Wagner 2008; Resnik 2015; Ritchie 2020; Saphier 2021; Shamoo and Resnik 2022).

Holman and Wilholt (2022) have called the issue of how to distinguish between legitimate and illegitimate external value influences on science the “new” demarcation problem. While this is indeed a new problem for the philosophy of science, it is not entirely clear how it relates to the original demarcation problem as initially formulated by Karl Popper. Something both problems have in common is a concern with the practical applications of theories and hypotheses that are regarded as scientific. One reason why Popper thought it was important to distinguish between science and pseudoscience was to prevent theories he regarded as pseudoscientific from having detrimental impacts on society (Popper 1959; 1963; Magee 1985). Philosophers who are working on the new demarcation problem also want to prevent research from having adverse impacts on society, but they are more concerned with the impacts of biased, misleading, or fraudulent science on society than with pseudoscience’s impacts (Douglas 2009; Resnik and Elliott 2019; Koskinen and Rolin 2022).

The new demarcation problem is much more complex and multifaceted than the original problem because distinguishing between legitimate and illegitimate value influences requires one to provide an account of the difference between good and bad science, which is no simple task, since there are many ways that science can go wrong (Ritchie 2020; Boudry 2021). A study might be recognized as scientific even though it has been corrupted by external values in various ways. For example, a study sponsored by a pharmaceutical company which claims that its new hypertension drug is superior to competing medications might fail to produce good evidence for this conclusion due to insufficient sample size; lack of control groups, blinding, or randomization; poor recordkeeping; or biased data analysis or interpretation (Gallin et al. 2018). While the key question for the original demarcation problem was “Is this theory or hypothesis scientific?”, the key question for the new problem is “Has this experiment, clinical trial, survey, longitudinal study, meta-analysis, systematic review, or other research product been inappropriately influenced by external values?”.
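
To make one of these failure modes concrete, consider sample size. A rough power calculation, sketched below, shows how an undersized comparative trial can be incapable of detecting a clinically meaningful difference; the effect size, significance level, and power target used here are conventional but arbitrary choices made purely for illustration, not figures from any actual trial.

```python
# A minimal sketch (normal approximation) of a two-arm sample-size calculation,
# illustrating why "insufficient sample size" can undermine a comparative drug trial.
# All inputs are illustrative assumptions.
from scipy.stats import norm

def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> float:
    """Approximate patients needed per arm to detect a standardized effect size."""
    z_alpha = norm.ppf(1 - alpha / 2)   # two-sided significance threshold
    z_beta = norm.ppf(power)            # power requirement
    return 2 * ((z_alpha + z_beta) / effect_size) ** 2

# Detecting a modest difference of 0.3 standard deviations with 80% power requires
# roughly 174 patients per arm; a trial enrolling 30 per arm would likely be uninformative.
print(round(n_per_group(0.3)))
```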

In this paper, we will defend an approach to the new demarcation problem that addresses complexities and nuances involved in distinguishing between good and bad science. Our approach describes epistemic and ethical norms that can help distinguish between science and non-science and between good and bad science, but it goes beyond this important first step and includes rules, policies, and procedures that serve to specify and implement those norms. We also provide a set of questions that offer practical guidance for deciding, for example, whether a research study should be admitted as evidence in a court of law or used in making decisions concerning the regulation of drugs.

Our argument will proceed as follows. In Sect. 2 of this paper, we review the original demarcation problem and explain how efforts to develop necessary and sufficient conditions for distinguishing between science and non-science ran into significant difficulties because science is a highly diverse and complex activity and because these definitions were not well-suited for distinguishing among gradations of science. In Sect. 3, we review some of the key arguments for and against the value-free ideal and show how rejection of this ideal leads to the problem of how to distinguish between legitimate and illegitimate value influences. In Sect. 4 we review and critique some attempts to solve the new demarcation problem. In Sect. 5 we develop our approach to the new demarcation problem, which draws on lessons learned from failed attempts to solve the original problem. In Sect. 6, we use a case study involving regulatory decisions concerning drugs and chemicals to illustrate our approach.

2 The Original Demarcation Problem

The question of how to distinguish between science and non-science was a key epistemological issue in the philosophy of science in the twentieth century (Mahner 2007; Fasce 2017). The problem has its roots in logical positivism’s core tenet that knowledge must be verifiable.Footnote 3 A belief (or statement) is verifiable if one can determine its truth or falsity by means of observations, tests, or experiments, or by logical or mathematical proofs (Carnap 1928; Reichenbach 1938; Ayer 1946). For example, the statement “Ethanol is flammable” can be verified by performing an experiment to determine whether ethanol ignites in the presence of oxygen and a flame. The statement “x + y = x − (− y)” can be proven mathematically by drawing inferences from the definitions of the plus sign, the minus sign, and the equals sign. A statement like “The soul is immortal,” however, does not count as knowledge because it cannot be verified by observations, tests, experiments, or logical or mathematical proofs.
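
For readers who want the proof spelled out, one minimal sketch (our own, drawing only on the usual definition of subtraction and the behavior of additive inverses) runs as follows:

\begin{align*}
x - (-y) &= x + (-(-y)) && \text{(definition of subtraction: } a - b := a + (-b)\text{)} \\
         &= x + y && \text{(the additive inverse of } -y \text{ is } y\text{)}.
\end{align*}

Hence x + y = x − (− y), as the statement asserts.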

By the late 1950s, verificationist approaches to knowledge and language were falling out of favor among philosophers as a result of criticisms by Willard Quine (1951; 1953; 1955; 1960), Wittgenstein (1973 [1953]), Wilfrid Sellars (1956), Lewis Feuer (1951), and others.Footnote 4 Even so, Popper proposed a highly influential solution to the demarcation problem that grew out of his critiques of verificationist ideas. Popper proposed a simple test for determining whether a theory or hypothesis is scientific: a theory is scientific if it is falsifiable; otherwise, it is not. A theory is falsifiable if an observation, test, or experiment could disprove it (Popper 1963). Pseudoscientific theories, such as theories from astrology, are not falsifiable, but scientific theories, such as theories from astronomy, are falsifiable. Popper also held that one can prove theories or hypotheses to be false, but one can never prove them to be true. Hence, the scientific method consists of proposing hypotheses and attempting to prove them false, or what Popper called conjectures and refutations (Popper 1959; 1963; Magee 1985).

Popper’s motivation for solving the demarcation problem was not just intellectual; it was also practical. Popper realized that solving the problem would have important implications for practical disciplines, such as psychology and politics. Popper (1963) applied his solution to the demarcation problem to Freudian psychoanalysis and Marxism and argued that these were unscientific theories because they are not falsifiable. Since these theories are unscientific, we should be wary of using them in applied contexts, such as psychotherapy or politics, where human wellbeing is at stake (Magee 1985). Since Popper’s time, other writers have stressed the importance of being able to distinguish between science and non-science in numerous applied contexts, such as law, medicine, engineering, education, and public policy (Kitcher 1983; Resnik 2000; Haack 2014). The practical import of the demarcation problem is one reason why it remains an important issue in the philosophy of science (Pigliucci and Boudry 2013; Hansson 2017a).

Popper’s formulation of the demarcation problem and his proposed solution were fairly simple. The problem is how to distinguish between science and pseudoscience, and the solution is to develop a test that establishes necessary and sufficient conditions for making this determination. Subsequent philosophical inquiry has shown, however, that the demarcation problem is much more complex and subtle than it was originally conceived to be and that there is no simple test for distinguishing between science and pseudoscience (Hansson 1996; 2009; 2017b; Mahner 2007; Pigliucci and Boudry 2013; Fasce 2017; Boudry 2021).

A key weakness of Popper’s solution is that it does not provide necessary conditions for regarding a theory or hypothesis as scientific because science includes foundational theories, hypotheses, concepts, and principles which are not falsifiable in any straightforward way (Resnik 2000). Important foundational principles in physics, such as the conservation of mass-energy, the law of entropy, and the uniformity of nature, cannot be falsified by individual experiments. If an experiment appeared to show that a physical process or system violates the conservation of mass-energy, for example, physicists would not reject this principle, but would question the soundness of the experiment or find a way of explaining why the principle is not being violated. Scientists accept or reject many foundational scientific ideas based on the explanatory role they play in a system of beliefs, not on the basis of particular tests or experiments (Duhem 1914; Quine 1955; Thagard 1988; Kitcher 1993).

Popper’s solution also does not provide sufficient conditions for regarding a theory or hypothesis as scientific because some pseudoscientific theories and hypotheses are falsifiable. For example, we can test the astrological claim that “the planet Mars causes violent behavior” by observing rates of violence when Mars is in different positions relative to the Earth. The problem with theories from pseudoscience is not that they are unfalsifiable or that they are false, since history shows that many scientific theories and hypotheses (such as phlogiston and ether theory) turned out to be false. The problem is that proponents of these theories continue to accept them despite ample evidence that they are false (Thagard 1978).
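
To see how such an observational test could be carried out in practice, the sketch below runs a simple contingency-table analysis; the counts and category labels are invented purely for illustration and do not come from any actual study.

```python
# A minimal, hypothetical sketch of how the astrological claim could be put to a test.
# The observed counts are invented for illustration only.
from scipy.stats import chi2_contingency

# Rows: Mars "prominent" vs. "not prominent" (hypothetical positional categories).
# Columns: periods with a recorded violent incident vs. periods without one.
observed = [
    [120, 880],   # Mars prominent
    [118, 882],   # Mars not prominent
]

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi-square = {chi2:.3f}, p = {p_value:.3f}")

# A null result here would count against the claim that Mars's position is associated
# with violence: the claim is falsifiable in Popper's sense, even if its proponents
# decline to treat such results as refutations.
```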

Since Popper’s time, philosophers, historians, and sociologists of science have proposed alternative solutions to the demarcation problem. Most of these solutions focus on units of analysis larger than hypotheses or theories, such as research communities, programs, disciplines, or paradigms, and some address historical, psychological, or sociological features of science (see e.g., Lakatos 1970; Merton 1973; Thagard 1978; Bunge 1982; Ruse 1996; Hansson 1996; 2009; 2017b; Mahner 2007; Pigliucci and Boudry 2013; Boudry 2021).

Paul Thagard (1978), for example, asked not whether astrological theories are pseudoscientific but whether astrology, as a research discipline, is pseudoscientific. Thagard argued that astrology is a pseudoscience because astrologers cling to discredited theories and ignore disconfirming evidence (or anomalies). However, as Thomas Kuhn (1962) documented in his groundbreaking book The Structure of Scientific Revolutions, highly respected scientists, such as physicists in the early twentieth century who refused to accept quantum mechanics, also sometimes hold onto discredited theories and ignore anomalies. Imre Lakatos (1970) also focused on units of analysis larger than hypotheses or theories. Lakatos argued that we can use the notion of progressiveness to distinguish between scientific and non-scientific research programs: scientific research programs make progress over time, while unscientific programs stagnate. One problem with this view is that it is difficult to define or measure scientific progress (Laudan 1977).

We will not catalog or critically examine the various approaches to the demarcation problem that have been proposed since Popper’s time.Footnote 5 Instead, we will argue that attempts to solve the demarcation problem by articulating necessary and sufficient conditions for distinguishing between science and pseudoscience encounter two major and perhaps insurmountable difficulties: (1) adequately accounting for the diversity and complexity of scientific practice, and (2) distinguishing among gradations of science, from good to bad (Dupré 1993; Resnik 2000; Boudry 2021).

Science is a highly diverse and complex activity with a wide variety of methodologies, procedures, datasets, inference patterns, hypotheses, models, theories, instruments, and traditions (Kuhn 1962; 1977; Hull 1990; Kitcher 1993; Ziman 2000; Boudry 2021). Science includes disciplines that are highly theoretical and mathematical, such as particle physics, cosmology, computer science, and bioinformatics, as well as disciplines that are applied and experimental, such as medicine, agronomy, pharmacology, and biochemistry. Some scientists use complex instruments, such as particle accelerators, gas chromatographs, or radio telescopes to gather data, while others collect data by interviewing people or observing human or animal behavior. Some scientists formulate specific hypotheses and test them prior to gathering data, while others use statistical algorithms to analyze large sets of preexisting data. Scientists work in every country in the world and in many different settings including academia, private industry, government, health care, law enforcement, and the military (Ziman 2000). Adequately accounting for all these different aspects of science is a monumental task for any definition of science to undertake, and this, we believe, is the main reason why definitions that use necessary and sufficient conditions to characterize science are likely to fail (Bunge 1982; Dupré 1993).

Definitions that use necessary and sufficient conditions to define science also have difficulties with distinguishing among gradations of science. Although Popper originally conceived of the demarcation problem as the task of formulating a definition that would yield a “yes” or a “no” answer to the question, “is this hypothesis or theory scientific?”, philosophers now recognize that it is also essential to distinguish among gradations of science, because in practical contexts the important question is often “is this experimental finding, analysis, model, or study good enough science?” In a court of law, for example (discussed below in more detail), a judge must decide whether a research study performed by a professional scientist is good enough (e.g., unbiased, supported by data, reproducible) to admit into evidence. In drug regulation (also discussed below), a government advisory committee must decide whether a study published in a scientific journal is good enough to use in deciding whether to approve a drug for marketing. Gradation poses problems for using necessary and sufficient conditions to define science, because questions about gradation call for nuanced answers based on degrees of conformity to certain standards. For example, sanitation ratings are a type of gradation based on conformity to sanitation standards. A rating of “A” represents compliance with all the standards, “B” fewer of them, and so on.
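
The sanitation-rating analogy can be made concrete with a small sketch in which a grade reflects how many of a list of standards an inspection satisfies; the standards and cutoffs below are invented for illustration and do not reflect any actual sanitation code.

```python
# Illustrative only: grading by degree of conformity to standards, by analogy with the
# sanitation ratings mentioned in the text. Standards and cutoffs are hypothetical.
SANITATION_STANDARDS = [
    "food stored at safe temperatures",
    "surfaces sanitized on schedule",
    "employees wash hands",
    "no pest activity observed",
    "hazardous chemicals stored separately",
]

def letter_grade(standards_met: int, total: int = len(SANITATION_STANDARDS)) -> str:
    """Assign a grade based on the fraction of standards satisfied (hypothetical cutoffs)."""
    fraction = standards_met / total
    if fraction == 1.0:
        return "A"   # complies with all the standards
    if fraction >= 0.8:
        return "B"   # complies with fewer of them
    if fraction >= 0.6:
        return "C"
    return "Fail"

print(letter_grade(5), letter_grade(4), letter_grade(3))  # A B C
```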

As a result of these and other problems with the original approach to solving the demarcation problem, many philosophers who are working on this topic no longer attempt to define science in terms of necessary and sufficient conditions but propose definitions that characterize science according to a list of epistemic criteria (Bunge 1982; Kitcher 1983; Dupré 1993; Ruse 1996; Resnik 2000; Hansson 2017a). While it is conceivable that definitions of science that use necessary and sufficient conditions could account for the complexity and diversity of scientific practice and its gradations, they are not as well-suited to this task as are definitions that characterize science by means of a list of criteria. Those who characterize science based on a list of criteria view the term ‘science’ as a Wittgensteinian family-resemblance word like art, politics, sport, and other terms that refer to complex human activities (Wittgenstein 1973; Bunge 1982; Dupré 1993).Footnote 6 Using this approach, one cannot immediately dismiss a hypothesis (theory, or field of inquiry) as unscientific because it fails to conform to a particular norm; one must engage in a broader, more holistic assessment of the hypothesis. Also, under this approach, a hypothesis may be more or less scientific, depending on how many of the criteria it satisfies.Footnote 7

3 The New Demarcation Problem

In the previous section we argued that the original demarcation problem was framed as the question of how to develop a set of necessary and sufficient conditions for distinguishing between science and pseudoscience. However, what was thought to be a fairly simple and straightforward question—“is this theory or hypothesis scientific?”—turned out to be much more complex and nuanced than philosophers had realized. In Sect. 5 of our paper, we will apply insights from efforts to solve the original demarcation problem to the new one. But first, we will briefly describe the new problem and how some philosophers have attempted to solve it.Footnote 8

The new demarcation problem, as mentioned earlier, emerges from rejecting the value-free ideal for science, which was part of logical positivism’s agenda for setting human knowledge on a solid foundation. Philosophers working within the positivist tradition, such as Carnap (1928), Reichenbach (1938), Nagel (1961), and Hempel (1965), developed theories of confirmation, explanation, and inductive reasoning that aimed to show how scientific knowledge is based on inferences from observations, tests, or experiments. According to this view, the decision of whether to accept or reject a hypothesis should be based on empirical evidence, not on external values. As new evidence emerges, hypotheses that were previously accepted may be rejected or revised. Science can make progress toward a more accurate and factual description of reality because its hypotheses and theories will reflect the empirical evidence that has been obtained and not personal, social, or other biases (Popper 1959). The positivists acknowledged that actual science often deviates from this ideal picture, due to human failings, such as biases and errors. Nevertheless, the positivists argued that scientists should aspire to overcome these problems so that science can be objective and truthful. For this reason, Douglas (2009) refers to this view as the value-free ideal.

Philosophers, historians, and sociologists of science have developed several compelling arguments for rejecting the value-free ideal (for an overview, see Elliott 2022).Footnote 9

Richard Rudner presented one of the most influential philosophical arguments against the value-free ideal in 1953, long before it became fashionable to claim that external values should influence scientific judgment and decision-making. Rudner argued that external values are essentially involved in the practice of accepting hypotheses because scientists must consider the consequences of mistakenly accepting or rejecting a hypothesis.Footnote 10 For example, one should require more evidence to accept a hypothesis concerning the safety of a new drug than a hypothesis concerning the composition of Jupiter’s atmosphere because important values (e.g., human health and life) are at stake in the drug safety hypothesis, while less important values are at stake in the Jupiter hypothesis. Though Rudner’s arguments focused on the standards of evidence scientists use to accept hypotheses, Douglas (2000; 2009) extended them, arguing that they apply to other contexts, such as data analysis and interpretation, and she also gave Rudner’s arguments a stronger ethical foundation (see also Elliott and Richards 2017).
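
One simple way to formalize Rudner’s point (a sketch of our own, not Rudner’s or Douglas’s formulation) is decision-theoretic. Let p = P(H | E) be the probability of hypothesis H on the evidence, let L_a be the loss incurred by accepting H when it is false, and let L_r be the loss incurred by rejecting H when it is true. Minimizing expected loss gives:

\[
\text{accept } H \iff (1 - p)\,L_a < p\,L_r \iff p > \frac{L_a}{L_a + L_r}.
\]

The evidential threshold thus rises with L_a: wrongly accepting “this drug is safe” could cost lives, so L_a is large and a great deal of evidence is required, whereas the costs of error for a hypothesis about Jupiter’s atmosphere are comparatively small.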

A second influential philosophical argument against the value-free ideal is based on the underdetermination of theory by evidence. This argument has its roots in the writings of Quine (1953; 1955; 1960) and Kuhn (1962; 1977) and has been developed at length by Helen Longino (1990; see also Anderson 2004; Biddle 2013; Brown 2020). According to this argument, empirical evidence often does not uniquely determine which theory (or model or hypothesis) one should accept, because inferences from evidence always depend on background assumptions.Footnote 11 Thus, it is often the case in science that there are multiple theories that are empirically equivalent, and the decision concerning which one to accept cannot be settled by purely evidentiary considerations (Laudan and Leplin 1991). Many philosophers, including those who defend the value-free ideal, respond to the underdetermination problem by claiming that scientists can use epistemic values, such as simplicity, explanatory power, and the like, to choose between competing hypotheses that fit the evidence equally well (Quine and Ullian 1974; Laudan 1984; Thagard 1988). Critics of the value-free ideal take the appeal to values one step further and argue that scientists are sometimes justified in using external values to choose between empirically equivalent hypotheses because epistemic values may fail to settle which hypothesis is better and there are often moral or social reasons not to remain neutral in such situations (Biddle 2013; Elliott 2011; Frisch 2020).

For example, the Intergovernmental Panel on Climate Change (IPCC) (2014) has been evaluating six models of global climate dynamics for the last decade or so. Critics of the value-free ideal could argue that if all of these models fit the data equally well, the IPCC could recommend that policymakers use the model that best promotes public and environmental health, provided that the IPCC clarifies the factors that shape their reasoning and the limitations of their conclusions and findings (Elliott and Resnik 2014; Intemann 2015; Frisch 2020). The IPCC could recommend, for example, that the models that provide the lowest estimates of global warming by 2100 should not be used in policymaking because these models will not convince the public that the problem of climate change is serious, but also that the models that give the highest estimates should not be used because these might be perceived as alarmist. One could argue, based on these value considerations, that the best models for communicating the risks of climate change to the public would fall somewhere between these two extremes.Footnote 12
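
The structure of this line of reasoning can be sketched schematically: apply an epistemic filter first (retain only models that fit the data about equally well), and only then let external values break the tie among what remains. The model names, fit scores, and warming projections below are invented placeholders, not actual IPCC figures.

```python
# Schematic, hypothetical illustration of "epistemic filter first, values as tiebreaker."
# All model names and numbers are invented.
models = [
    # (name, goodness-of-fit score, projected warming by 2100 in degrees C)
    ("model_A", 0.91, 1.8),
    ("model_B", 0.90, 2.7),
    ("model_C", 0.91, 3.1),
    ("model_D", 0.90, 4.6),
]

# Step 1 (epistemic): keep only models whose fit is within a small tolerance of the best.
best_fit = max(fit for _, fit, _ in models)
empirically_adequate = [m for m in models if best_fit - m[1] <= 0.02]

# Step 2 (external values): among the empirically equivalent models, set aside the
# lowest and highest projections, following the hypothetical recommendation in the text.
empirically_adequate.sort(key=lambda m: m[2])
preferred = empirically_adequate[1:-1] or empirically_adequate

print([name for name, _, _ in preferred])   # ['model_B', 'model_C']
```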

A third philosophical argument is based on the observation that many scientific terms, especially those used in the social and biomedical sciences, are not purely descriptive and have moral, social, or cultural connotations and implications (Alexandrova 2018; Dupré 2007). Some examples of value-laden scientific terms include marriage, rape, adultery, aggression, crime, alienation, race, ethnicity, gender, health, disease, disability, intelligence, depression, anxiety, sexual orientation, environment, ecosystem, invasive species, and pest (Dupré 2007; Ereshefsky 2009; Larson 2011; Rosenberg 2015). Although some philosophers (e.g., Boorse 1977; Schaffner 1993) have tried to reinterpret scientific words with normative connotations in descriptive terms, Kevin Elliott (2017) challenges this idea. Elliott argues that the terms scientists use, as well as the ways they frame information and the manner in which they categorize phenomena, all have the potential to privilege some values over others (see also Larson 2011). According to Elliott, since there are often no value-neutral ways of handling these value-laden aspects of scientific language, it is better for scientists to recognize the normative dimensions of their language and to incorporate external values in their choices about language rather than to ignore this aspect of their work.

A fourth philosophical argument against the value-free ideal is based on insights from studies of how private companies and government funding agencies use money and power to shape the research agenda and, ultimately, the content of science (Holman and Bruner 2017; Resnik 2007; 2009; Michaels 2008). Private companies can shape the research agenda by funding and publishing studies that promote their economic interests and not publishing studies that undermine their interests. While the company’s interests may not affect the outcome of any particular study, they can still have a cumulative effect on the entire research field and therefore the evidence that is available for other researchers to use (Elliott and McKaughan 2009; Holman and Bruner 2017). For example, if a drug company funds ten clinical trials comparing its medication to competing products and publishes only the five studies that show that its product is superior, the research record will reflect this bias (Krimsky 2003; Resnik 2007; Michaels 2008). Although government science agencies usually require that funded investigators publish their research, they can still use their money and power to significantly influence the content of entire scientific fields (Resnik 2009). For example, research on environmental health disparities has been supported, in large part, by government funding (Schlosberg 2009). Since modern science is difficult to do without significant financial support, external values that impact funding decisions inevitably affect the content of science. As in cases where values influence scientific language, it is better to consider these influences in an intentional fashion rather than letting them play out without attending to them.
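
A small simulation can illustrate how this kind of selective publication distorts the evidence base even when every individual trial is conducted honestly. The parameters below (a true effect of zero, ten trials, fifty patients per arm, five trials published) are arbitrary assumptions chosen for illustration.

```python
# Hypothetical illustration of publication bias: a sponsor runs ten trials of a drug
# with no real advantage, but only the five most favorable results are published.
import random
import statistics

random.seed(0)
TRUE_DIFFERENCE = 0.0      # the drug is actually no better than the comparator
N_TRIALS = 10
PATIENTS_PER_ARM = 50

def run_trial() -> float:
    """Return the observed drug-minus-comparator difference in one simulated trial."""
    drug = [random.gauss(TRUE_DIFFERENCE, 1.0) for _ in range(PATIENTS_PER_ARM)]
    comparator = [random.gauss(0.0, 1.0) for _ in range(PATIENTS_PER_ARM)]
    return statistics.mean(drug) - statistics.mean(comparator)

all_results = [run_trial() for _ in range(N_TRIALS)]
published = sorted(all_results, reverse=True)[:5]   # only the most favorable trials appear

print(f"mean effect, all ten trials:        {statistics.mean(all_results):+.3f}")
print(f"mean effect, published trials only: {statistics.mean(published):+.3f}")
# The published record suggests an advantage that the full set of trials does not support.
```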

We would be remiss if we did not mention that philosophers have responded to these critiques with defenses and reinterpretations of the value-free ideal (see Jeffrey 1956; Betz 2013; John 2015; Hudson 2016; Lacey 2017), and the controversy about the relationship between science and values is not entirely settled (see Douglas 2016; Elliott 2022). It is not our aim in this paper to adjudicate the dispute between proponents and critics of the value-free ideal. Rather, our purpose is to reflect on a key problem that arises from rejecting the value-free ideal and consider how best to address it.

The key problem for those who reject the value-free ideal is how to consistently and coherently distinguish between legitimate and illegitimate value influences (Douglas 2009; Hicks 2014; Resnik and Elliott 2019). This is an important issue not only for those who reject the value-free ideal but also for society, because, as we have seen, external values can have a powerful and corrupting influence that threatens the integrity, reliability, and trustworthiness of science.

Science derives its trustworthiness from being regarded by the public as a reliable and impartial provider of knowledge and expertise in a pluralistic society in which people disagree about fundamental values (Jasanoff 1998; Ziman 2000; Pielke 2007; Resnik 2009; Bright 2018). When the public views science as driven by the economic, social, or political values or interests of particular organizations or groups, it may lose trust in science, which can have devastating impacts on science and society. When the public loses trust in science, policy discussions can degrade into political battles, as has occurred in the debate about climate change, and laypeople may ignore evidence-based health advice from physicians and government health agencies, as has occurred during the COVID-19 pandemic (Collins and Evans 2017; Mann 2021; Saphier 2021). Also, erosion of public trust can make people less willing to fund science, participate in clinical trials, or otherwise support the research enterprise, all of which can have negative impacts on science and society (Resnik 2011).

Given the serious epistemic, social, and political risks associated with allowing external values to influence science, it is incumbent on critics of the value-free ideal to show how one can prevent illegitimate value influences while permitting legitimate ones (Resnik and Elliott 2019; Koskinen and Rolin 2022). This is not an easy problem to solve because it seems hypocritical to say that some value influences are acceptable while others are not. How can one consistently argue that it is acceptable for an academic scientist to allow a concern for public health to affect their interpretation of toxicology data while at the same time maintaining that it is not acceptable for an industry scientist to design a study in a way that promotes the company’s profits? To deal with such questions, one must develop an approach that clearly and consistently shows why some value influences are problematic while others are not.

The new demarcation problem is therefore similar to the original one in the sense that it is concerned with the practical applications of science, i.e., how science is used in law, medicine, education, public policy, and so on. If we use theories (or hypotheses, principles, models, or concepts) to make decisions with important implications for human health, the environment or other things we value, then we expect those theories to be reliable, impartial, and trustworthy (Resnik 2000; 2009). This practical concern remains one of the key motivations for drawing some boundaries between science and pseudoscience, as well as between good science and bad (i.e., biased, fraudulent, erroneous) science. However, the new problem is much more complex and multifaceted than the original one because it is explicitly concerned with answering questions about gradations of science, from good to bad, and because there are so many ways in which values can potentially influence science. Before turning to our own preferred approach to this problem, the next section considers some of the other available strategies for addressing it.

4 Protecting the Integrity of Science

Philosophers who reject the value-free ideal have explicitly or implicitly proposed various ways of distinguishing between legitimate and illegitimate value influences on science. Unfortunately, all of these efforts, as we shall argue below, have weaknesses and shortcomings. This is understandable, because the existing scholarship developed primarily in an effort to challenge the value-free ideal and has only secondarily turned to the issue of distinguishing between appropriate and inappropriate value influences (Holman and Wilholt 2022).

We will not review every approach to distinguishing between legitimate and illegitimate value influences on science in this paper but will focus on several influential, representative accounts. Current approaches to the new demarcation problem can be categorized as axiological, functionalist, consequentialist, coordinative, or systemic (Holman and Wilholt 2022; for a related but slightly different categorization, see Elliott 2022). Axiological approaches claim that external value influences are appropriate when they promote the “right” values; functionalist approaches claim that external value influences are legitimate only when they play the appropriate role in scientific inquiry; consequentialist approaches claim that external value influences are appropriate when they achieve particular effects or accomplish particular aims; coordinative approaches claim that external value influences are legitimate when they involve appropriate coordination between researchers, the scientific community, and the larger public that facilitates assessment, discussion, and management of values; and systemic approaches claim that external value influences are legitimate when the community of inquirers is structured in such a way as to provide adequate critical scrutiny of those influences.

Following Holman and Wilholt (2022), we begin with axiological approaches, since they demonstrate, in stark terms, the opportunities and dangers associated with the incursion of external values into science. Axiological approaches hold that moral, social, or political values should play a decisive role in guiding, structuring, and governing scientific inquiry. Several philosophers (e.g., Kitcher 2001; 2011; Kourany 2010; Brown 2020) have defended this type of view. Philip Kitcher (2001; 2011) argues that science should be guided by the values that well-informed deliberators who are considering rules for the structure of society would adopt. Kourany (2010) contends that science should reflect values arrived at through ethical reasoning. For example, she claims that scientists should not pursue avenues of inquiry that could have racist implications, such as the relationship between race, genetics, and intelligence (Kourany 2010). Alex John London (2022) develops a similar approach in the realm of clinical research. London argues that clinical research is a cooperative social activity carried out to promote and protect human rights and egalitarian ideals of social justice.

Axiological approaches tend to appeal to the Baconian idea that science should improve the human condition (Bacon 2000; Kitcher 2001). For some proponents of the axiological approach, scientific research represents an opportunity to advance moral or political goals, such as democracy, social justice, or protection of human rights or the environment (Schroeder 2017; Bright 2018). However, because axiological approaches do not place systematic constraints on the influence of external values on science, they threaten the trustworthiness of science by appealing to values that are likely to be controversial. We live in a highly polarized society in which people fundamentally disagree about moral and social issues, such as abortion, capital punishment, gun control, genetically modified crops, climate policy, immigration, and affirmative action, and about the fundamental values that give rise to these disagreements (Gutmann and Thompson 1998; Resnik 2009). Given that science is typically regarded as a relatively neutral source of information, it seems problematic at best, and potentially disastrous at worst, to incorporate potentially controversial moral, social, and political assumptions into scientific reasoning. Some scholars embrace egalitarian conceptions of justice, for example, but many people accept different accounts of justice, such as libertarianism and utilitarianism. Some people might think that science should promote economic development, national security, or even religious doctrines. By wedding scientific inquiry to the advancement of particular moral, social, or political values, axiological approaches risk politicizing and polarizing science.

Some axiological approaches could also threaten the integrity of science if they do not require commitment to widely accepted epistemic and ethical norms that govern scientific inquiry, such as honesty, rigor, reproducibility, openness, transparency, and freedom of inquiry.Footnote 13 If one holds that science should serve the “right” social or political values, then one could potentially justify epistemically problematic actions to promote those values, such as fabrication or falsification of data or suppression of ideas, hypotheses, and theories. History provides a stark reminder of what can happen when science is subservient to politics or religion. Galileo Galilei, Giordano Bruno and Mendelian geneticists in the former Soviet Union faced significant repercussions, including imprisonment or death, for defending scientific theories that contravened political or religious values (Resnik 2009). In more recent times, politically motivated repression of scientific research and debate continues to be a major concern, as illustrated by efforts by the George W. Bush and Donald J. Trump administrations to censor climate change research conducted by US federal government scientists (Mann 2021).

Functionalist approaches address these concerns by limiting the roles that external values play in science. The most influential of these approaches has been defended by Heather Douglas (2009). In her book, Science, Policy, and the Value-Free Ideal, Douglas (2009) argues that values (epistemic and external) should play only an indirect role in science.Footnote 14 According to Douglas, values function in a direct role when they operate the way evidence does, by providing reasons in support of scientific statements or beliefs. Values function in an indirect role when they do not function as evidence but rather guide decisions about how much evidence is sufficient to accept a theory or hypothesis. Douglas argues that the key to maintaining appropriate roles for values in science is to ensure that they do not play a direct role when scientists assess hypotheses or theories. The fact that a theory or hypothesis promotes a particular external value, such as public health, should never be a reason for accepting that theory or hypothesis, though it might be used as a reason for setting a standard of evidence used to evaluate this hypothesis.

While Douglas’ distinction between direct and indirect roles for values provides some insights into how values should function in scientific research, it does not adequately protect the integrity and trustworthiness of science because it does not clearly distinguish between illegitimate and legitimate value influences in an indirect role.Footnote 15 There is ample evidence that values can significantly bias research even when they only operate in an indirect role (Michaels 2008). For example, private companies and the researchers who work for them have manipulated experimental designs and statistical models to favor their financial interests and have suppressed data and results that could undermine those interests (Resnik 2007; McGarity and Wagner 2008). Most people would regard this type of value influence as illegitimate, even though values do not appear to be functioning as evidence, since values are influencing the type of evidence that investigators obtain and how they share it with the scientific community (Steel and Whyte 2012).

Another problem with Douglas’ approach is that one might argue that there are some situations in which external values legitimately play a direct role in scientific judgment and decision-making (Biddle 2013; Elliott and McKaughan 2014). In the climate change case discussed above, for example, one could argue that climatologists may use external values, such as promoting public health or protecting the environment, when choosing among competing models of global climate that fit the data equally well (Intemann 2015; Frisch 2020).

Steel (2015) also develops a functionalist approach. Steel argues that epistemic and external values can both play an important role in scientific inquiry, but that external values should never override epistemic ones in the design or interpretation of research that is feasible and ethical (Steel and Whyte 2012). External values can play a role in these sorts of judgments or decisions only when competing choices are equally compatible with empirical evidence and epistemic values. For example, suppose that the data from a toxicology experiment provide equal support for two different interpretations of the safety of a chemical. A scientist could decide in this situation to opt for the interpretation that promotes external values, such as public health or corporate profits. However, if the data clearly supported a different interpretation, the scientists should choose that one. If two different research designs equally satisfy epistemic criteria, such as rigor, testability, and consistency, a scientist could choose the design with an eye toward promoting external values. However, a scientist should not allow these values to affect research design decisions when epistemic values favor one type of design over the other.

Although Steel’s approach also offers some useful insights into how values function in scientific research, it also does not provide an entirely satisfactory solution to the new demarcation problem. First, it is not clear that Steel’s view is comprehensive enough, given that it focuses primarily on the role of external values in research design and data interpretation. However, as we have seen, external values could corrupt science in other stages of research, such as problem selection, data analysis, peer review, and publication (Resnik and Elliott 2019). Steel does not indicate whether external value influences should be limited in these other contexts, but one might argue that they should be to protect the integrity of scientific research.

Second, external values can arguably have problematic effects on research even when they do not clearly override epistemic ones. In chemical toxicology studies, for example, scientists must make choices pertaining to many different aspects of experimental design, including the variables to be measured, dosing levels and schedules, the animals to be used, control groups, the length of time of the study, and so on. External values often guide these choices long before a potential conflict with epistemic values arises. For example, an academic scientist may decide to measure the impact of a chemical on the endocrine system because of possible implications for human health. Conversely, a scientist working for a company that manufactures this chemical might decide not to measure its impact on the endocrine system to avoid collecting data that could show that the chemical poses a risk to human health. In some cases, these influences of external values could be problematic even if they did not clearly conflict with epistemic values.

Consequentialist approaches to the new demarcation problem focus instead on whether value influences enable science to achieve the right effects or aims. For example, Intemann (2015) argues that scientists should incorporate values into their work in a manner that enables them to achieve aims that are democratically endorsed. One could also use ethical analysis to identify particular effects or aims that science should achieve. Although Holman and Wilholt (2022) distinguish consequentialist approaches from axiological approaches, they are very similar and have the potential to run into the same sorts of problems. For example, Steel (2017) worries that scientists could violate important epistemic constraints in the course of trying to achieve particular aims. Thus, consequentialist approaches, like axiological and functionalist strategies, run the risk of failing to protect the integrity, reliability, and trustworthiness of science against potentially problematic value influences throughout the entire process of scientific inquiry, from problem selection and research design to data analysis and hypothesis acceptance.

Coordinative approaches to the new demarcation problem focus on aligning the practices of science with the expectations of the audiences who receive scientific information. This coordination could take a variety of forms. One approach is to adopt conventional standards that establish particular ways of handling value judgments so that everyone knows what to expect (Wilholt 2009; John 2015). Another approach is to promote transparency about value influences so that those receiving scientific information can determine whether they agree with those underlying value influences (Elliott and Resnik 2014; Elliott and McKaughan 2014). Yet another approach is to foster engagement between scientists and those interested in or affected by the research so that it can be performed in a way that meets their expectations (Douglas 2005; Intemann 2015; Parker and Lusk 2019).

Unfortunately, even though efforts to align the practices of science with its users make a great deal of sense, coordinative approaches also have weaknesses that prevent them from serving as comprehensive strategies for addressing the new demarcation problem. In general, these approaches tend to face a dilemma. On one hand, if they adopt fixed standards that are applicable under all circumstances (e.g., John 2015), then they are fairly unlikely to serve the interests of all audiences. On the other hand, if they allow scientific practices to vary based on the specific needs or concerns of particular users (e.g., Elliott and McKaughan 2014; Parker and Lusk 2019), then they create the potential for confusion because it is difficult to be entirely transparent about all the ways in which values influence scientific work (Elliott 2021). Moreover, coordinative approaches tend to share the epistemic weaknesses of the axiological, functionalist, and consequentialist approaches, namely, that epistemic standards could be violated in the name of achieving the goals of particular audiences (Steel 2017).

Finally, systemic approaches shift the focus from the individual level to the social level and argue that value influences in science are appropriate as long as the structure of the scientific community has the appropriate characteristics to maintain the integrity of science. For example, Longino (1990) argues that the key to achieving scientific objectivity is not to banish external values from science but to structure scientific inquiry in such a way that judgments and decisions shaped by external values receive critical scrutiny in the marketplace of ideas. She defends four criteria that she claims are necessary for generating critical scrutiny: publicly recognized venues for criticism, uptake of criticism, shared standards, and tempered equality of intellectual authority. Objectivity, according to Longino, emerges from a process of “checks and balances” on values. For example, if a chemical company publishes a study showing that its product is safe, a government-funded academic researcher may publish a study showing that it is not, and objective knowledge can emerge as the byproduct of this interplay of competing values, i.e., profit vs. public health.

Despite the benefits of drawing attention to the role of the scientific community in addressing the influences of external values, systemic approaches still have weaknesses. For example, Longino’s approach raises concerns because there is no guarantee that objectivity will emerge from Longino’s procedures, especially when there are substantial differences in resources and power between opposing sides of a scientific dispute. For example, if a chemical company has enough money to sponsor ten studies showing that its product is safe, but the government only has enough money to sponsor one study showing that it is not, the chemical company may emerge victorious, because it has more money than the government. Likewise, a political group that attempts to impose a racist, sexist, or homophobic agenda on a scientific discipline will emerge victorious if it has more power than opposing groups. Moreover, it is not clear that highly problematic values should be allowed to influence science, even if steps are taken at the level of the scientific community to counteract them (Intemann 2017). Thus, even though one can attempt to mitigate these sorts of problems by adding additional restrictions on the structure of the scientific community, it seems questionable to try to manage the influences of external values on science solely by influencing the structure of the scientific community.

5 Toward a Solution to the New Demarcation Problem

Our overviews of attempted solutions to the old demarcation problem in Sect. 2 and the new demarcation problem in Sect. 4 converge on the same conclusion. In both cases, efforts to develop necessary and sufficient conditions for distinguishing science from non-science or good science from bad science end up running into difficulties (see also Koskinen and Rolin 2022). Although the philosophers we discussed in Sect. 4 did not conceive of themselves, as far as we know, as providing necessary and sufficient conditions for distinguishing between legitimate and illegitimate value influences on science, it is clear that their approaches to the new problem tend to offer fairly simple criteria for determining whether external value influences are appropriate and do not adequately account for the diversity and complexity of the relationship between scientific practice and external values. Our critiques of these approaches to the new problem emphasized the variety of ways that values can impact science. Values can impact science at many different stages of inquiry, and it is doubtful that a single criterion for distinguishing between appropriate and inappropriate influences will adequately account for the diversity and complexity of all these influences.

Given the complexity and diversity of value influences on science, we believe that efforts to distinguish between legitimate and illegitimate influences should focus on whether researchers are complying with epistemic and ethical norms that are constitutive of good science, rather than on some particular criteria of legitimacy. A set of norms is listed in Table 1. These norms can be used to classify science from “good” to “bad,” depending on how well it complies with the norms. On our view, a study could qualify as scientific but not be good enough to use in an applied context (such as making a legal or regulatory decision) because external values have inappropriately influenced it in some way. We do not claim originality for these norms; the list in Table 1 has much in common with norms developed by Merton (1973), Kuhn (1977), Thagard (1988), Kitcher (1993), Resnik (1998), Resnik and Elliott (2019), Elliott (2022), Koskinen and Rolin (2022), Shamoo and Resnik (2022), and others. The norms apply to all stages of the research process, from problem selection to data collection to publication and data sharing, and to a wide range of disciplines.

Table 1 Scientific norms (based on Resnik and Elliott 2019)

The norms can be understood within the framework of social epistemology (Longino 1990; Resnik 1996). Social epistemology characterizes knowledge production as a social activity governed by goals and norms. Knowledge production requires cooperation, collaboration, and trust among knowledge producers (i.e., members of scientific laboratories, research groups, or communities), as well as cooperation, collaboration, and trust between knowledge producers and the larger society (i.e., the public; Rolin 2015). Thus, science has norms because science itself is a society, and science is a socially sanctioned activity that exists within a larger society (Resnik 1996). Science is a socially sanctioned activity (as opposed to an illicit activity) because it produces something that the public regards as valuable: reliable, trustworthy knowledge that serves the common good.

The norms of science, therefore, are based on three foundations: (1) science’s general epistemic aims (i.e., production of reliable, impartial knowledge); (2) characteristics of the research environment necessary to achieve those epistemic aims; and (3) public accountability (i.e., science’s practical aim of providing trustworthy knowledge in a manner that serves the common good).Footnote 16 For example, some of science’s norms, such as honesty, rigor, objectivity, reproducibility, and carefulness, directly promote science’s epistemic aims; while others, such as openness, fair sharing of credit, respect, and safety, help to foster a research environment in which people can work together collaboratively and cooperatively to achieve common goals; and still others, such as protection of human and animal research subjects, social responsibility, and engagement, help to ensure that scientists are accountable to the public (Resnik 1996; 1998; Elliott 2011; 2017). See Fig. 1.

Fig. 1 Relationship between Scientific Aims, Norms, Rules, Traditions, Policies, Procedures, and Practice (based on Resnik and Elliott 2019)

Since the norms of science promote general epistemological and practical aims, our approach could be construed as axiological. However, we do not believe that our approach is susceptible to the main objection to other axiological approaches, i.e., that they run the risk of politicizing and polarizing science, because the aims we endorse, and the norms they justify, are highly generic and are therefore compatible with many different social and political values. Although the influences of specific moral, social or political values, such as protection of the environment or promotion of social justice, might ultimately be allowed to influence scientific reasoning via norms like social responsibility or engagement, the norms in Table 1 operate at a more general and less controversial level. The general aims of science, and the norms that flow from them, provide a basic framework within which deliberation about more specific research aims and values can take place (see e.g., Elliott and Resnik 2014; Hicks 2014; Intemann 2015).

One of the main reasons for adopting our normative approach is that it accords well with the insight that the characteristics of good scientific research can vary depending on the context. Because these norms rest on the aims of science, the relative weight given to different norms can vary somewhat depending on the aims associated with specific research contexts (Lusk and Elliott 2022). When researchers are doing applied work for regulatory purposes, for example, norms like social responsibility and engagement take on special significance. These norms would not be absent in other contexts, but their implementation would likely take different forms.

Building on this point that the norms can be implemented in different ways, it is crucial to recognize that the norms are mere platitudes with little practical value unless they are enforced, implemented, and supported by rules, conventions, policies, and procedures (Mantzavinos 2020). With this in mind, another strength of our norm-based approach to addressing the new demarcation problem is that we can draw on the array of rules, conventions, policies, and procedures that have already been developed and implemented by research institutions, funding agencies, private sponsors, professional societies, and scientific journals (Table 2). It is important to note that these rules, conventions, policies, and procedures are not always spelled out explicitly in professional codes; instead, professional communities frequently develop conventional standards or expectations regarding study design, data analysis, and interpretation. Wilholt (2009) has previously argued that these conventional standards help guide scientists in their responses to value judgments.

Table 2 Rules, conventions, policies, and procedures for the conduct of science (based on Resnik and Elliott 2019)

Thus, according to our view, distinguishing between legitimate and illegitimate value influences on scientific research is a complex, context-dependent, and holistic determination. One must examine the extent to which a variety of norms have been met and how they have been met. To do so, one must typically determine whether scientists are following the rules, conventions, policies, and procedures created by scientific societies, research institutions, funding agencies, regulatory bodies, and other stakeholders. In addition, there may be conflicts between different norms as well as questions about how best to implement them (Elliott 2022). Addressing these conflicts and questions typically requires complex, holistic, context-dependent judgments. Also, there may be gradations in the extent to which the norms are met, ranging from exemplary science, to good science, to bad science, to not science at all (e.g., pseudoscience). Thus, in some cases there may be significant philosophical work involved in assessing the extent to which values have influenced a research project, and whether this influence has been appropriate or inappropriate. To begin this analysis, one can ask some standard questions that we have listed in Table 3. Although this might seem like a somewhat long and overwhelming list, we provide a case study in Sect. 6 that illustrates how specific questions become particularly salient given the issues at stake in specific cases.

Table 3 Questions a scientist or non-scientist could ask when determining whether a study complies with scientific norms

A final strength of our proposed approach is that it accords well with steps that are already being taken to address the value-ladenness of science in applied decision-making contexts. To take one example, judges often face difficult choices about whether to admit expert testimony into the courtroom, since attorneys may call witnesses who offer testimony based on novel or controversial theories, hypotheses, concepts, or methods. For instance, in the 1980s, DNA fingerprinting evidence was controversial because it was new and had not been well validated or tested, but today it is routinely admitted into the courtroom (Roewer 2013). In a landmark case, Daubert v. Merrell Dow Pharmaceuticals, Inc. (1993), the U.S. Supreme Court ruled that judges can decide whether to admit expert scientific testimony by assessing whether the expert's knowledge is reliable and grounded in the scientific method.Footnote 17 The Court listed several factors that judges can consider when deciding whether to admit testimony based on theories, hypotheses, or concepts, including falsifiability, peer review and publication, error rate, and acceptance by the scientific community (Daubert v. Merrell Dow Pharmaceuticals, Inc. 1993). However, the Court noted that there could be other relevant factors it did not describe and that none of the factors constitutes a necessary condition for admitting scientific testimony. The Court offered the factors as guidelines rather than as exhaustive and dispositive rules (Rothstein et al. 2011). Thus, judges are already using an approach that accords with our preferred solution to the new demarcation problem. To further implement our approach in the courtroom, we would recommend that judges use a list of questions like those described in Table 3 to determine whether an expert's testimony complies with scientific norms. If the testimony significantly deviates from those norms, the judge could decide not to admit it.

6 A Case Study: The Food and Drug Administration

To provide a further illustration of how our approach would work in practice, we offer a brief case study involving decision-making at the U.S. Food and Drug Administration (FDA). The FDA regulates foods, drugs, biologics, medical devices, cosmetics, and veterinary and tobacco products marketed in the U.S. It has the authority to approve new products or new uses of approved products; to oversee the marketing, labelling, and manufacturing of products; to issue warnings about products; and to require that products be removed from the market. The mission of the FDA is to promote public health and safety (Food and Drug Administration 2018a). To obtain approval for a new drug, a manufacturer must submit an application to the FDA and conduct animal experiments and human clinical trials to gather evidence concerning the drug's safety and efficacy (Food and Drug Administration 2018b). The FDA reviews the evidence submitted by the manufacturer to determine whether the drug is safe and effective enough to be approved for marketing. The FDA convenes panels of experts from the relevant scientific and medical disciplines to make recommendations concerning drug approvals. These expert panels review evidence pertaining to safety and efficacy submitted by manufacturers and consider evidence from other sources, such as independent studies published in peer-reviewed journals (Food and Drug Administration 2018b). A key question these panels face is whether a given study should be included in the review of evidence. If a study does not meet appropriate scientific and ethical standards, the panel may decide not to include it. While the FDA has issued some guidance to manufacturers on how to design and conduct clinical trials (Food and Drug Administration 1998), this guidance is often subject to interpretation, and panel members must often decide how to evaluate the studies when they review the evidence.

Deciding which studies to include in the review of evidence can be a difficult task, because research has shown that clinical trials funded by drug companies often reflect their sponsors' biases (for a review of the evidence, see Krimsky 2003; Sismondo 2008; McGarity and Wagner 2008). As discussed above, companies can achieve outcomes favorable to their products by various means, such as manipulating study designs or data analyses or selectively publishing data and results (Holman and Elliott 2018; Resnik 2007; Lexchin 2012). To apply our approach to the new demarcation problem to the question of which studies should be included in the review of evidence for drug approval, we would recommend that FDA expert panels use a list of questions like those described in Table 3 to determine whether the studies comply with scientific norms. If a study significantly deviates from scientific norms, the panel members could decide not to include it in their review of evidence. Medical researchers, epidemiologists, and biostatisticians have been moving in this direction since the 1990s by developing guidelines for assessing clinical evidence (see e.g., Furberg and Furberg 2008; Straus et al. 2019). These guidelines address issues related to research design, data analysis, data reporting, and other scientific matters like those discussed in this paper and described in Tables 1, 2, and 3. The norms, rules, and questions described in those tables could strengthen and supplement the guidelines that have already been developed.
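To make this screening procedure concrete, the following sketch shows one way a panel's norm-based checklist could be represented. It is only a minimal illustration under stated assumptions: the questions listed are hypothetical stand-ins rather than the actual contents of Table 3, and the numeric threshold is an artificial simplification of what we argue is ultimately a holistic, judgment-laden determination.

```python
# Hypothetical sketch: screening a study against norm-based questions.
# The questions and the threshold are illustrative only; they are not
# drawn from Table 3 or from any FDA policy.

SCREENING_QUESTIONS = [
    "Were the underlying data made available?",
    "Were the methods described in enough detail to assess them?",
    "Were conflicts of interest disclosed?",
    "Was the study peer reviewed?",
    "Were key value judgments made transparently?",
]

def screen_study(answers, max_deviations=0):
    """Flag a study for possible exclusion if it deviates from too many norms.

    `answers` maps each question to True (norm satisfied), False (norm
    violated), or None (cannot be determined from the available record).
    With max_deviations=0, any clear norm violation flags the study.
    """
    deviations = [q for q, ok in answers.items() if ok is False]
    unresolved = [q for q, ok in answers.items() if ok is None]
    return {
        "include": len(deviations) <= max_deviations,
        "deviations": deviations,
        "unresolved": unresolved,
    }

# Example: a sponsor-funded trial with undisclosed conflicts of interest.
answers = {q: True for q in SCREENING_QUESTIONS}
answers["Were conflicts of interest disclosed?"] = False
answers["Were key value judgments made transparently?"] = None
print(screen_study(answers))
```

At most, a tool of this kind could structure a panel's deliberation by making deviations and open questions explicit; on our view, the final judgment about whether a deviation is significant remains context-dependent and cannot be reduced to a fixed threshold.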

Admittedly, in actual cases these deliberations about which studies to accept can become very complex. Thus, even though our approach provides a framework for distinguishing appropriate influences of values from those that are problematic, it does not provide a simple recipe for arriving at conclusions that everyone will agree on. Consider, for example, a prominent debate that has engulfed the FDA's decision-making around the chemical bisphenol A (BPA), which is used in many products, including the lining of food cans (Resnik and Elliott 2015). Many scientists worry that BPA has endocrine-disrupting effects that can contribute to developmental disorders, obesity, diabetes, and numerous cancers (vom Saal and Frederick 2019). In 2008, the FDA concluded that BPA was safe at current levels of human exposure (Food and Drug Administration 2018c). The agency based this conclusion largely on two studies that were performed according to standardized protocols typically used to inform regulatory decision-making (Tyl 2009). This approach was very controversial, however, because it placed relatively little weight on a number of academic studies that provided evidence of potential harm from BPA at human exposure levels (Myers et al. 2009). This case illustrates the challenges of navigating the role of values in science because investigators on opposing sides of this bitter conflict worry that the other side is inappropriately influenced by values; for example, those who defend BPA could be influenced by financial values and the entrenched paradigms of the field of toxicology, while those who challenge its safety could be influenced by public-health-oriented and environmental values (see e.g., Elliott 2016; Vandenberg and Prins 2016).

To analyze this case using our norm-based approach, one would want to examine the quality of the studies in light of the norms, rules, and questions described in the tables above. For example, to what extent were the underlying data from the studies under consideration made available? How well were the methods described, and how relevant were those methods for addressing the regulatory question at issue? Were the results reported in a manner that gives regulators the information they need to assess the studies' relevance and reliability (Moermond et al. 2016)? Did the investigators report important conflicts of interest? Were the studies peer reviewed? How do the study results compare with the findings of other studies? Moreover, to the extent that the investigators had to make important value judgments when interpreting the studies, were they transparent about the nature of those judgments and the reasons for making them in particular ways (Elliott and Resnik 2014)? Were the judgments made in ways that meet the conventional standards of the discipline and justifiable ethical and political principles (Wilholt 2009)?

Importantly, working through these questions can involve a great deal of scientific and philosophical analysis. The crucial sticking point in the BPA case is a clash between opposing considerations. On one hand, the studies prioritized by the FDA were performed according to Good Laboratory Practice (GLP) guidelines, which establish strict requirements for record-keeping, and followed standardized methodologies established by the Organization for Economic Cooperation and Development (OECD) for the purposes of regulatory decision-making (see Elliott 2016). On the other hand, the academic studies emphasized by the FDA's critics were published in peer-reviewed journals and incorporated cutting-edge methodologies that many regarded as better suited to uncovering the endocrine-disrupting effects of BPA (Myers et al. 2009). Critics of the FDA's approach also worried that the investigators who performed the FDA's preferred studies had conflicts of interest because of their connections with the chemical industry, while the FDA's defenders claimed that academic studies often suffered from poor quality control and unvalidated methodologies. Moreover, in the background of all these debates were value judgments about how to interpret studies that appeared to show harmful effects from BPA at low doses of exposure but not at higher doses (Vandenberg et al. 2019; vom Saal and Frederick 2019); many endocrinologists regarded these studies as providing convincing evidence of harm, while many toxicologists remained unconvinced (Gore 2013). Although this might appear to be merely a methodological dispute, it is more complicated than that: researchers and policymakers working on this case need to evaluate both the quality of the evidence and the most responsible conclusions to draw from ambiguous evidence.

Ultimately, the FDA partnered with the U.S. National Toxicology Program (NTP), the U.S. National Institute of Environmental Health Sciences (NIEHS), and a number of academic investigators in a project called CLARITY-BPA (Consortium Linking Academic and Regulatory Insights on BPA Toxicity), which was designed to help settle some of these difficult judgments (Schug et al. 2013). The consortium performed a large, collaborative study that followed the regulatory guidelines preferred by the FDA but also incorporated the new methodologies preferred by a number of academic researchers. Unfortunately, even this massive effort failed to produce results that conclusively settled the issue (Vandenberg et al. 2019; vom Saal and Frederick 2019). This ongoing conflict thus illustrates how difficult it can be in some cases to distinguish science that is done well from science that is not. While this might seem to suggest that our norm-based approach is unhelpful, we would argue that the case actually supports it. The difficulties involved in distinguishing good science from bad science in a case like this one fit well with our contention that multiple norms need to be considered and that judgment is needed when deciding how to interpret and implement those norms. This case also illustrates how the norms can be specified, debated, and gradually improved. For example, scholars have been working to develop better rules and procedures for evaluating studies in the regulatory context so that the conflicts encountered in the BPA case can be alleviated in the future and regulatory agencies can more easily distinguish studies they can properly rely on from those they cannot (Moermond et al. 2016). Finally, cases like this one illustrate the importance of ongoing philosophical reflection to evaluate the most appropriate aims for science in specific contexts and to determine how best to prioritize norms in ways that achieve those aims (Elliott 2022).

7 Conclusion

We have argued that philosophers who are working on what has been called the new demarcation problem, i.e., how to distinguish between legitimate and illegitimate value influences on science, can gain useful insights from attempts to solve the original demarcation problem, i.e., how to distinguish between science and pseudoscience. An important lesson philosophers have learned from attempts to solve the original demarcation problem is that it is very difficult to develop necessary and sufficient conditions for drawing a line between science and pseudoscience, because science is such a complex and multi-faceted endeavor. Instead, it is more fruitful to characterize science as a family of activities that tend to be guided by a set of shared norms. Given the wide range of contexts in which science is used, and given the challenges philosophers have faced in arriving at necessary and sufficient conditions for describing the proper role of external values in science, it seems advisable to adopt an approach to the new demarcation problem that is similar to the response that many philosophers have adopted to the old one.

To develop an approach of this sort, we have described a set of scientific norms, based on the work of numerous philosophers, historians, sociologists, and scientists, and supplemented them with rules, conventions, policies, and procedures that implement the norms in scientific practice. Admittedly, it will still be challenging in many cases to adjudicate conflicts between the norms, but this reflects scientific practice accurately: it is genuinely difficult in many cases to decide which value influences are legitimate and which are not. To assist in applying the norms to practical contexts, we have provided a list of questions, and we have shown how our approach can be used to address controversial issues in regulatory science. Exploring how to interpret and implement the norms in specific contexts, and how to handle conflicts between them, is an important topic for future work in the philosophy of science.