The success of a philosophical argument depends on the strength of the link between the premises and the conclusion. As we stated above, in a model-based argument the premises will cite at least one model. We do not specify how the model is used to support an argument, or how it is used to formulate the premises, because this depends on how the philosopher chooses to use it. In the case of EL models, the model used is a computational one: a computer program that implements a conceptual model, which, in turn, is used to support an argument. To understand how EL models function as argumentative devices it is useful to clearly distinguish these three layers of the process. As we will show, both modelling layers introduce potential “soft spots” in the argumentation that might be difficult to identify. We will also show where such soft spots reside in EL models, and how they should be taken into account when the strength of the overall argument is assessed.
In this section we first introduce the different layers of argumentation, then we show how they can be used in appraisals of the argument, and in conclusion we discuss an important function of models in philosophical argumentation: establishing difference makers.
The structure of Weisberg and Muldoon’s argument
Philosophers employ several kinds of devices in support of their arguments, such as formal derivations, thought experiments, case studies and even common-sense reflection. Models could be considered one of the elements in their toolbox. Both mathematical models (e.g., Kitcher 1990, 1993; Strevens 2003) and computer simulations (e.g., Weisberg and Muldoon 2009; Pöyhönen 2017; Zollman 2007) have been used as such argumentative devices.
As we noted above, in the case of computer simulations, two distinct layers of modelling should be distinguished: the conceptual model and the computational model. A conceptual model is typically presented in a research paper, although some of its details might remain implicit. Weisberg and Muldoon’s EL model suggests how the central notions in a general argument, such as cognitive diversity and the division of labour, can be made precise. The constructs of the research approach and the different learning strategies of agents, as well as the measures of collective epistemic performance (e.g., epistemic progress) are examples of conceptual-model operationalizations of the general notions of interest.
In addition, the conceptual model serves as a blueprint for the computational model, meaning the implementation of the model in a computer program. In principle, the computational model could be considered a deductive device that takes the modelling assumptions as input and produces the modelling results as output (Beisbart 2012). It should be borne in mind, however, that even if the computational model is conceived of or reconstructed as a deductively valid argument, this does not imply that the philosophical argument in question is also deductively valid. The model typically supports only part of the reasoning involved in the philosophical argument.
Now let us consider Weisberg and Muldoon’s philosophical argument for the following tentative conclusion: the optimal division of labour could be achieved with a polymorphic population of research strategies. How do they support this conclusion? They do so by means of results obtained from the computational implementation of a conceptual model, and two informal assumptions (i.e., assumptions that were not explicitly modelled). Let us now trace how Weisberg and Muldoon arrive at this conclusion. In order to simplify the picture, we omit many details in our skeletal representation of their argument.
In setting up their conceptual model (arrow A in Fig. 1), Weisberg and Muldoon appear to make the following assumptions, among others.Footnote 5 For the sake of brevity, we first present the assumption of the conceptual model and then give its computational-model analogue in parentheses.
- P1 A scientific research topic can be represented as a set of research approaches (the set of patches comprising the landscape)
- P2 The epistemic significance of research depends on the research approach adopted by the scientist (elevation of the patch)
- P3 Similar research approaches have comparable epistemic significance (smoothness of the landscape)
- P4 Scientists care about epistemic significance, i.e., about the “significance of the truth that is uncovered by employing a given approach” (p. 229) (Agents try to find high-elevation patches on the landscape)
- P5 Scientists can only gradually change their research approach (Agents cannot jump over the landscape: they move one patch at a time)
- P6 Different strategies for changing one’s research approach constitute a relevant form of cognitive diversity in a community of scientists (Three types of agents—controls, followers and mavericks)
- …
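The mapping from conceptual assumptions to computational constructs can be illustrated with a minimal sketch. All names, the grid size and the two-hill landscape below are ours, not Weisberg and Muldoon’s, and the move rule is a generic hill climb rather than any of their three strategies:

```python
# P1: a research topic represented as a grid of patches (approaches).
# P2: each patch has an elevation (its epistemic significance).
SIZE = 11  # hypothetical; the original model uses a 101 x 101 grid

def significance(x, y):
    # Two hills of significance; smoothness (P3) comes from the gradual
    # decline of elevation around each hilltop. Hypothetical landscape.
    hill1 = max(0, 5 - abs(x - 2) - abs(y - 2))
    hill2 = max(0, 5 - abs(x - 8) - abs(y - 8))
    return max(hill1, hill2)

def neighbours(x, y):
    # P5: research approaches change gradually, one patch at a time.
    return [((x + dx) % SIZE, (y + dy) % SIZE)
            for dx in (-1, 0, 1) for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)]

def hill_climb_step(pos):
    # P4: agents seek high-significance patches. This generic move stands
    # in for the controls/followers/mavericks distinction of P6.
    best = max(neighbours(*pos), key=lambda p: significance(*p))
    return best if significance(*best) > significance(*pos) else pos
```

The point of the sketch is only to show how much operationalisation separates a conceptual assumption such as P4 from its computational counterpart: “caring about significance” becomes a concrete comparison of neighbouring elevations.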
There is obviously continuity between the conceptual assumptions and the assumptions implemented in the computational model. Nevertheless, the differences are significant. For example, the assumption that scientists care about epistemic significance (conceptual model) differs significantly from the assumption that agents try to find high-elevation patches on a landscape (computational model; see P4, above). Thus, whether the result of the computational model supports the conclusions at the level of the conceptual model turns out to be an important question for the appraisal of the philosophical argument.
Although the assumptions of the conceptual model lay down the basic framework of the analysis, the conceptual model is still too vague for precise results to be derived from it. The implementation of the conceptual model in a computer program (arrow B in Fig. 1) makes the derivation possible, but with a slightly different set of assumptions that are taken to be an appropriate implementation of those of the conceptual model. Moreover, the conceptual model is incomplete in several ways, and it must be accompanied by various implementation assumptions (IA) to make the derivation of quantitative modelling results possible. The original EL model includes such assumptions concerning:
- IA1 The size of the landscape (101 × 101 grid)
- IA2 Scheduling (whether agents move simultaneously or sequentially)
- IA3 Time scale, or how long a simulation is allowed to run (max 50,000 cycles)
- …
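In the program itself, implementation assumptions of this kind typically surface as explicit constants and control-flow choices. A hypothetical sketch, with placeholder agent logic:

```python
import random

GRID_SIZE = 101      # IA1: the 101 x 101 landscape
MAX_CYCLES = 50_000  # IA3: upper bound on simulation cycles

def run(agents, step_fn, max_cycles=MAX_CYCLES, seed=0):
    """IA2: one possible scheduling choice, sequential moves in shuffled order.

    `agents` is a list of agent states and `step_fn(state)` returns the next
    state; both are placeholders for the model's actual agent logic.
    """
    rng = random.Random(seed)
    for _ in range(max_cycles):
        order = list(range(len(agents)))
        rng.shuffle(order)            # randomised sequential scheduling
        moved = False
        for i in order:               # agents act one at a time
            new_state = step_fn(agents[i])
            moved = moved or new_state != agents[i]
            agents[i] = new_state
        if not moved:                 # stop early once nobody moves
            break
    return agents
```

Each of these choices (grid size, scheduling discipline, cycle cap) must be made for the program to run at all, yet none of them is fixed by the conceptual model.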
The addition of these implementation assumptions allows the research approaches, the significance of the research and the behaviour of scientists to be represented in a particular way in the computational model in comparison to the conceptual model presented by Weisberg and Muldoon. It is on these premises and assumptions as well as the corresponding analysis of the computational model that Weisberg and Muldoon (among others) base the following interim conclusions (IC, arrow C in Fig. 1):
- IC1 Maverick agents stimulate the problem-solving ability of followers (mavericks help followers to hill-climb)
- IC2 Adding maverick agents to a population of controls and followers increases the population’s capacity to make epistemic progress (epistemic progress =df proportion of non-zero patches discovered by the population at a particular time)
- IC3 The increased epistemic efficiency of the mixed population of agents (IC2) relies on mavericks stimulating the followers to make considerable epistemic and total progress (IC1)
Note that these interim conclusions still do not imply the tentative conclusion in Weisberg and Muldoon’s argument (i.e., that an optimal division of labour could be achieved with a polymorphic population of research strategies). In an attempt to reach this more general conclusion they make an additional informal assumption (IN1), which was neither implemented in the computational model nor presented with the conceptual model.
On the basis of their interim conclusions and this informal assumption, they conclude (arrow D in Fig. 1):
Furthermore, although not stated in these precise terms, the interim conclusions are meant to support the more general philosophical conclusion:
As this schematic representation of the argumentative structure indicates, the derivation of philosophical conclusions relies on several elements including the specification of models (arrows A–B in Fig. 1) and interpretations of results (arrows C–D in Fig. 1). Each step typically introduces new assumptions into the process. Hence, the strength of the philosophical argument, as well as the truth and relevance of its conclusions, could be contested by challenging the assumptions and the argumentative links, as depicted in our schematic presentation.
Reappraisal of the argument
To see if the chain of argumentation works, one could start by asking whether the models are valid in the sense that their results follow from their assumptions. A minimal validity condition for a computational model is that it must be free from programming and implementation errors. The process of ascertaining this is referred to as verification in the literature on simulation (see Gräbner 2018 and the citations therein).
It is also worth pointing out that in the case of EL models, what are typically called modelling results are not the same as output data: they are rather summaries of some distribution in the output dataset produced by numerous runs of the simulation model. Each single run of a computational model is an algorithmic process, the outputs of which could be considered deductive consequences of its inputs. However, because the modelling results tend to be based on a statistical analysis of numerous runs, they are not simply logical consequences of the modelling assumptions alone. In other words, the appropriate analysis of the output data, the statistical assumptions and the model used become an important part of the argument. Needless to say, if the statistical modelling of the output dataset is done incorrectly, what are reported as modelling results might not follow from the model’s assumptions. As an example, let us consider Weisberg and Muldoon’s claim that the epistemic efficiency of the mixed population of agents in their model is due to mavericks stimulating the followers to make considerable epistemic and total progress (IC3, above). A careful analysis of the model (see Alexander et al. 2015, Figure 8) reveals that no such stimulation occurs. Instead, further progress on the population level is generated solely by mavericks. The problem is that Weisberg and Muldoon jump to the wrong conclusion due to an insufficiently detailed analysis of the output data.
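The gap between output data and reported results can be made concrete. In the following sketch (hypothetical names and dynamics of our own devising), the “modelling result” is a statistical summary over many stochastic runs, so it depends on the aggregation choices as well as on any single run:

```python
import random
import statistics

def single_run(seed):
    # Placeholder for one stochastic simulation run, returning an outcome
    # variable (say, epistemic progress at the final cycle). The Gaussian
    # draw here is purely illustrative.
    rng = random.Random(seed)
    return rng.gauss(0.6, 0.05)

def modelling_result(n_runs=100):
    # What gets reported as "the result" is a summary of the output
    # distribution over many runs, not the deductive output of one run.
    outcomes = [single_run(seed) for seed in range(n_runs)]
    return {"mean": statistics.mean(outcomes),
            "stdev": statistics.stdev(outcomes),
            "n": n_runs}
```

An error in the final summarising step, such as the insufficiently detailed output analysis behind IC3, can thus break the argument even when every individual run is algorithmically impeccable.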
Another question concerning the strength of the philosophical argument is whether the computational model aligns well with the conceptual model (arrows B and C in Fig. 1). Does the implementation of the simulation line up with the conceptual model? Are the model results robust to changes in the implementation assumptions, parameters and software-engineering decisions, for example? In the Weisberg–Muldoon case, the computational model plays a key role in establishing the mapping from the proportion of mavericks to the dynamics of epistemic progress. Computational implementations of the components of the conceptual model (the different kinds of agents, landscapes, epistemically significant outcomes and so on) are generated in the computer program. Next, the computational model is run with a model structure and parameter values corresponding to different scenarios formulated in terms of the conceptual model. The values of the outcome variables (e.g., epistemic progress) are observed across these runs. Conclusions from the modelling stated on the level of the conceptual model rely on the input–output mapping of the computational model: it is with the help of a computer program that the modeler establishes which conceptual-model conclusions follow from which assumptions. Hence, claims concerning the implications of the conceptual model depend on the computer program and its ability to serve the inferential role attributed to it. In the case of the Weisberg–Muldoon model there is, in fact, a discrepancy between the description of the follower rule in the conceptual model and its computational implementation, which compromises conclusions about the relative epistemic performance of different kinds of agents (Alexander et al. 2015; Pöyhönen 2017). The argument from the premises of the conceptual model to its conclusions is not valid because a sequence of inferential steps implemented in the computational model fails.Footnote 6
Let us now look at the next argumentative step (arrow D in Fig. 1), moving from the conceptual model to philosophical conclusions of interest. To establish that the modelling results alone are not sufficient for making the inferential transition, let us consider, for example, the different conceptual-model constructs used by the authors to capture the idea of epistemic efficiency: they use three distinct outcome measures to track the epistemic efficiency of a population of scientists. First, they keep track of how long it takes the population to find the two peaks of the landscape. Second, they define a measure they call epistemic progress as the proportion of non-zero patches visited at a particular time. Finally, they also keep track of the proportion of all patches visited by some member of the population, which they label total progress. By providing counterexamples, Pöyhönen (2017) argues that none of these measures can function as an appropriate measure of epistemic efficiency, which is the more general philosophical idea implemented by the three variables. On the one hand, peaks-reached can be maximised by placing all agents in the peak patches, which corresponds to extremely poor epistemic coordination. Epistemic progress, on the other hand, is not sensitive to the elevation of a patch, and hence, especially on rugged landscapes, may correlate poorly with whether the two hills of epistemic significance have been discovered by the population. Finally, what Weisberg and Muldoon call “total progress” appears largely irrelevant to epistemic efficiency because it fails to distinguish the discovery of significant patches from the examination of zero-value approaches.
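Under our own (hypothetical) variable names, the three outcome measures can be written down directly, which makes the divergence between them easy to see:

```python
def peaks_reached(visit_times, peaks):
    # Time at which both peaks have been found, or None if one was missed.
    times = [visit_times.get(p) for p in peaks]
    return None if None in times else max(times)

def epistemic_progress(visited, landscape):
    # Proportion of non-zero patches visited. A patch of elevation 1
    # counts as much as a peak: the measure ignores elevation.
    nonzero = {p for p, value in landscape.items() if value > 0}
    return len(visited & nonzero) / len(nonzero)

def total_progress(visited, landscape):
    # Proportion of all patches visited: zero-value approaches count too,
    # which is why the measure tracks epistemic efficiency so poorly.
    return len(visited & set(landscape)) / len(landscape)
```

On a toy landscape with one zero patch and one significant patch visited, total progress credits both visits equally while epistemic progress credits only the latter, illustrating the wedge the counterexamples exploit.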
Note how this problem differs from the ones discussed above. It is an issue that concerns the relationship between the conceptual model and the philosophical argument (arrows A and D in Fig. 1). Even if the computational implementation of the constructs (e.g., epistemic progress) is adequate, the constructs themselves cannot serve the argumentative role assigned to them in the argumentation (i.e., to support conclusions C1 and C2).
Another factor that could weaken the philosophical argument is the lack of robustness of modelling results. If the simulation is highly sensitive to changes in a parameter value or to seemingly insignificant choices in the computational implementation, this challenges the reliability of the general conclusions that could be drawn from it. In this sense concerns about the robustness of philosophical models (see Sect. 2) are justified. However, it should also be recognised that various aspects of a model’s robustness emerge over time as other philosopher modellers take on the task of analysing and using the original model. Moreover, the discovery of non-robust results might indicate the conditions under which a philosophical argument is tenable and help in discovering critical assumptions as well as difference makers. Thus, non-robust results could be informative in showing the range of conclusions that could be supported by the computational and conceptual model. As we argue later on (Sects. 3.2 and 5), making sense of the positive and negative implications of non-robustness requires taking the broader argumentative context into account.
Finally, the applicability of the epistemic-landscape conceptual model could also be challenged more generally. The extent to which the philosophical argument is convincing depends on the credibility of the models involved. Appraisal of the credibility of a model requires more than a comparison of the model with its intended target. For example, Sugden (2000) suggests that credibility depends on the logical coherence (i.e., validity) of the model as well as on how well the model fits into our general understanding of the causal structure of the world. Credible models, according to Sugden, describe “how the world could be” rather than being accurate descriptions of particular or generalized real-world target systems.Footnote 7 Concerning credibility, one could ask the following questions. Are the assumptions of the model plausible given the modelling goals and the argument of which it is a part? Are the construct (model world) and its dynamics (what happens in the model world), as well as the model results (findings), broadly consistent with our general knowledge of the world? Obviously, these questions are similar to those concerning representational adequacy discussed above. We are not suggesting that representational adequacy has no role to play in assessments concerning the use of models in philosophy. As should be obvious by now, we are arguing that representational adequacy must be evaluated in light of the argumentative goals and the broader argumentative context. Moreover, one can establish or question the credibility of a model without engaging in pairwise model-target comparisons, by way of making piecemeal assessments of the plausibility of individual assumptions relative to the argumentative goals, and considering how the model and the suggested conclusions fit what we already know about the world.
As we will show, in the case of EL models, attempts to improve plausibility and credibility turn out to be another way of exploring what could be called the argumentative landscape relevant to the philosophical topic in question, i.e. the constellation of possible premises, conclusions and paths of reasoning connecting them. Such exploration results in a better understanding of the dependency between proposed variables, and of the conclusions that a given set of premises can support. However, to show this we should first discuss an important function of modelling: establishing difference makers.
Difference making and robustness
We suggested above that understanding how models function as argumentative devices requires making explicit the assumptions that contribute to model-supported arguments and keeping score of how strongly they support the conclusion of interest. Grasping the different steps of the argumentation and the credibility of the models is only part of the story, however. Bringing the value-added of simulation models into focus requires closer attention to the kind of conclusion being argued for. Simulation-based philosophical arguments typically use models to support claims about difference-making. For example, Weisberg and Muldoon aimed to show in their modelling efforts that changing the proportion of mavericks in the population of scientists makes a difference to its epistemic performance. Later models in the EL tradition qualified this difference-making result in various ways. Thoma (2015), for example, shows that such dependency holds only when scientists are not too inflexible in their choice of a new research topic and not too ignorant of others’ work. Pöyhönen (2017) suggests that the original difference-making relation between cognitive diversity and epistemic performance does not hold on the kind of smooth landscapes studied by Weisberg and Muldoon; it is only on rugged landscapes that a heterogeneous population of agents outperforms a homogeneous population. Although pointing in somewhat different directions, all such results concern a difference-making relationship between (changes in) a modelling assumption, and (changes in) an outcome of interest.
Establishing difference making goes beyond the task of establishing validity. Showing that a modelling assumption makes a difference to an outcome relies on the various ways of introducing variation into the assumptions. Analogously with Mill’s method of difference (Mill 1974, Book III, §2), to show that something makes a difference to something, one must find a pair of scenarios that are similar in all aspects except one, such that the difference in the input variable leads to a change in the output. EL modelling includes exactly this kind of reasoning. A model is run with different values of the independent variables of interest, and the effect of this variation on the dependent variable of interest is observed. A minimal condition for establishing such a difference-making claim is that holding other modelling assumptions constant, changes in the value of the independent variable should lead to changes in the dependent variable.Footnote 8 A simple example of this is mentioned above: holding other things constant, Weisberg and Muldoon ran their model with different proportions of mavericks, and observed the resulting changes in epistemic progress (Weisberg and Muldoon 2009).
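This Mill-style experimental logic can be sketched as follows. The simulation body is a stylised stand-in of our own devising, not Weisberg and Muldoon’s model; only the contrastive structure matters:

```python
import random

def simulate(maverick_share, n_agents=20, seed=0):
    # Placeholder simulation returning an "epistemic progress" score.
    # The stylised dynamics (a linear maverick effect plus small noise)
    # are ours, purely for illustration.
    n_mavericks = round(maverick_share * n_agents)
    rng = random.Random(seed * 1000 + n_mavericks)
    return min(1.0, 0.4 + 0.02 * n_mavericks + rng.gauss(0, 0.01))

def vary_one_input(values, **held_fixed):
    # Mill-style contrast: change one input across runs while every other
    # modelling assumption is held constant.
    return {v: simulate(v, **held_fixed) for v in values}
```

A difference in the outputs across `vary_one_input([0.0, 0.5])`, with all other assumptions fixed, is exactly the minimal difference-making evidence described above.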
Nevertheless, such minimal difference-making tends to be of little argumentative value because it can only support the claim of dependence under a very specific set of modelling assumptions. For most purposes, the relationship between two variables should have some generality across the variation in the variables, and across different values of other parameters of the model. As critics have pointed out, it was difficult to determine the relevance of the results achieved by Weisberg and Muldoon because the analysis of the original EL model did not include any sensitivity or robustness analysis. In other words, even if the proportion of mavericks makes a difference to epistemic progress under the set of parameter values employed by Weisberg and Muldoon, it may be that such an effect only occurs in this particular part of the parameter space of the model, and hence the difference-making relation might be highly local, even pointlike. Such locality (non-generality) makes it questionable whether a model can be used to support more general conclusions.Footnote 9 Establishing the generality and derivational robustness of a difference-making result requires showing that some parts of the model are not difference makers as far as the result is concerned. For example, and intuitively, results from EL models should not be sensitive to the size of the landscape, the exact position, shape and number of the hills of epistemic significance, or the order in which agents move during each time step.
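Checking that such a factor is not a difference maker amounts to sweeping it while holding the focal contrast fixed, and verifying that the effect survives. A minimal sketch under assumed, stylised dynamics (in this toy version the maverick effect is built in as independent of landscape size, so the check passes by construction):

```python
import random

def run_once(maverick_share, landscape_size):
    # Placeholder run; hypothetical dynamics in which the maverick effect
    # does not depend on landscape size.
    rng = random.Random(landscape_size + round(100 * maverick_share))
    return 0.4 + 0.4 * maverick_share + rng.gauss(0, 0.01)

def effect_survives(nuisance_values, threshold=0.05):
    # A nuisance parameter (here landscape size) is not a difference maker
    # for the focal claim (more mavericks, more progress) if the effect
    # stays above the threshold at every swept value.
    effects = [run_once(0.5, n) - run_once(0.0, n) for n in nuisance_values]
    return all(e > threshold for e in effects)
```

A real robustness analysis would sweep several such nuisance parameters (hill positions, scheduling order, landscape size) rather than one, but the logic is the same: the focal difference-making claim gains generality as more candidate difference makers are shown to be irrelevant.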
Establishing the irrelevance of such factors works differently in analytical and simulation models. In the case of analytical modelling, similar results are often derived under less stringent assumptions during the evolution of a modelling framework, leading to more general results. In agent-based modelling, however, it is typically not possible to eliminate modelling assumptions so as to make the results more general. All the details of the model must be implemented in some form in order to make the computational model run, thus the generality of results must be established by some other means. In the following section we suggest that sensitivity analysis, robustness analysis and the construction of model families could be considered argumentative modelling moves to that end. More generally, understanding the argumentative role of agent-based models in philosophy requires not only an understanding of model structure but also practices of model construction and use. The epistemic benefit to be derived from modelling is commonly realised as a consequence of several modelling attempts—rather than from the development of a single model (more on this in Sect. 5).