None the less, he said, he meant to peg away until every peasant on the estate should, as he walked behind the plough, indulge in a regular course of reading Franklin’s Notes on Electricity, Virgil’s Georgics, or some work on the chemical properties of soil. (Nikolai Gogol: Dead Souls, Part II, Chapter III)

Introduction

This chapter features in a book that discusses the issues at stake and the matters of concern in precision oncology. Our contribution discusses how precision oncology, or rather a particular interpretation of it, is imagined to become an exact science. We furthermore address a matter of concern in that regard, namely the implications such a realization would have for the further development of life science and, ultimately, medical practice. In a somewhat dramatic fashion, we liken that matter of concern to the crossing of the Styx, which in Greek mythology was one of the rivers that separated our world from Hades. Over the following pages we shall develop our argument for this view. Essentially, exact science demands tractable scientific problems. In order for problems to be tractable, they have to concern explicitly and rigorously defined systems. Currently, medical attempts at representing, understanding and intervening direct themselves at wholes and parts of organisms that do not satisfy the demands of exact science. So, in order for medicine to become an exact science, its subject matter has to change.

We emphasize that we address only one particular interpretation of “precision”. Since the term “precision medicine” was launched in 2011 as a policy initiative (NRC 2011), it has been used ambiguously to refer, on the one hand, to how individual genetic and molecular information is already included in research and clinical practice, and on the other, to a more or less Utopian sociotechnical imaginary of perfect and precise personalisation of medicine (Blasimme 2017; see also Engen, same volume, Chap. “Introduction to the Imaginary of Precision Oncology”). In a similar vein, the Springer/Nature journal npj Precision Oncology defines:

… precision oncology as cancer diagnosis, prognosis, prevention and/or treatment tailored specifically to the individual patient based on the genetic and/or molecular profile of the patient. High-impact articles that entail relevant studies using panomics, molecular, cellular and/or targeted approaches in the cancer research field are considered for publication.

It could be argued that cancers have been diagnosed and treated in this way for decades, taking into account the molecular characteristics of the patients, or using “cellular approaches”. The “precision” in this particular definition is, in other words, implied, either by assuming that the use of certain contemporary and timely laboratory methods by itself qualifies as precise, or by alluding to an ideal of precision that is assumed or realised by the type of science to which one aspires, namely exact science. In this chapter, we shall explore the latter alternative: that what is above called “genetic and/or molecular” is conceived as part of a broader development within life science towards what is sometimes called systems biology, characterised by bioinformatic methods, large quantitative data sets, numerical precision and at least the ambition of mathematical rigour. The vision may also include the use of machine learning and artificial intelligence; however, that will not be our main focus.

Mathematical rigour will have to involve models and formal reasoning. Precise and accurate measurement by itself does not make for exact science; Lord Rutherford would have to admit that it would remain a particular form of stamp collecting. The component that is missing in the npj Precision Oncology definition above is that of computational modelling. The systems biology imaginary centres around the potential of computational modelling to fundamentally change biology into an exact science, on a par with physics and chemistry. In this way, the argument goes, biology may finally also provide quantitative knowledge that will enable it to predict, control and engineer life.

There is of course disagreement about the plausibility of the vision of an exact biology, on both principled and practical grounds. Philosophically, anti-reductionist arguments against the plausibility have had the upper hand, while reductionist imaginaries have prevailed in research policy (for a discussion of that apparent paradox, see Strand 2022). In this chapter, we shall pursue a different question: If computational models indeed come to prevail and become the norm for good life science research, what would be the consequences? This question appears to have received little attention so far: Anti-reductionist critics have found it irrelevant since they do not believe in the vision anyway, while proponents appear to be convinced that the consequences will be uniformly beneficial.

The authors of this chapter have followed the development of computational methods into life science for more than two decades. In our notes, we wrote 10 years ago:

There is a sense now that mathematical rigour will eventually be an important part of reasoning in the biosciences. At the moment this is not yet the case. Most articles published in biology journals rely on experimental results and verbal reasoning based on these results. Mechanisms are described using diagrams and descriptions in plain English.

At the time of writing, 2021, this still seems largely to hold true, even for journals such as the aforementioned npj Precision Oncology. In this respect the biosciences are different from sciences such as physics and chemistry, where valid reasoning includes formal reasoning methods such as mathematical proofs, computational simulations or calculations. In physics, knowledge is typically encoded in systems of equations (differential equations) or other types of formal models.

In what follows we explore two questions. Firstly, we want to analyse the nature of such a transition from verbal to formal models. In a possible future where every biological discovery has to be supported by a corresponding formal model, is it conceivable that formal models would replace more traditional types of biological knowledge, consisting of verbal models supported by experiments? Or would formal and verbal models co-exist? Based on a typology of currently used formal models in the biosciences, what can be said about the scope and purpose of such models? The second question we wish to investigate relates to the implications of the use of new methods. Assuming the use of computational methods in biology, life science and medicine continues, how might that transform the questions that are asked within the science, and equally important, how might it change the perception and use of the resultant knowledge and its accompanying power?

Types of Models

There is a variety of formal models in biology. For what follows it is useful to categorise these. The categories are reasonable in the view of the authors and broadly reflect the types of models that can be found in the wider biological literature. However, the authors admit that they use a broad brush, glossing over many details, and that different categorisations could be equally reasonable. That said, our categorisation is not essential for the main point of this contribution, but merely an orientation to help the reader think about modelling in biology.

The first and oldest class of models are small-scale mathematical models of biological systems. These are typically sets of differential equations or other explicit mathematical relations. Mathematical modelling in this sense is not a new phenomenon at all in biology. Theoretical biology is an established field, but one with, apart from a few exceptions, limited impact on mainstream biology. Within the theory of evolution, mathematical modelling has gained some traction with real biologists. Apart from that, mathematical modelling in biology often serves to clarify mechanisms. Examples include modelling separate pathways or parts of pathways, uptake dynamics in bacteria, small-scale reaction-diffusion systems and the like. The models can be very simple and may relate well-known physical effects to biological systems. One of the simplest examples is the Cherry-Adler model (Cherry and Adler 2000), which shows under which conditions two genes that repress one another behave as a bistable switch. More complex models may investigate the details of regulatory pathways, establish the robustness of signalling pathways, or establish fundamental limits on biological sensing (McGratch et al. 2017; Govern and Rein ten Wolde 2014; Halasz et al. 2016; Eduati et al. 2020; Adlung et al. 2017).
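
To give a flavour of what such a small model looks like in practice, the following sketch writes the two-gene mutual-repression motif as a pair of ordinary differential equations and integrates them numerically. It is a minimal illustration only: the Hill-type repression terms and all parameter values are our own assumptions and are not taken from Cherry and Adler (2000).

```python
from scipy.integrate import solve_ivp

def mutual_repression(t, y, alpha=10.0, n=2.0, delta=1.0):
    """Two genes whose products repress each other (Hill-type repression, linear decay)."""
    x, z = y
    dx = alpha / (1.0 + z**n) - delta * x  # production of x is repressed by z
    dz = alpha / (1.0 + x**n) - delta * z  # production of z is repressed by x
    return [dx, dz]

# Two different initial conditions settle into two different steady states:
# the hallmark of a bistable switch.
for y0 in ([5.0, 0.1], [0.1, 5.0]):
    sol = solve_ivp(mutual_repression, (0.0, 50.0), y0)
    print(y0, "->", sol.y[:, -1].round(2))
```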

While small-scale models are typically formulated with mathematical rigour and afford analytic solutions, they nearly always depend on radical idealisation and skim over much biological detail. This is perhaps also the reason why they tend to have limited impact within mainstream bioscience. Computer simulations are an alternative method of biological modelling. Computational simulations may be based on systems of equations, but could also implement biochemical reaction systems or even entire ecological systems. The key difference between computational and purely mathematical approaches is that the former are numerical in nature, rather than producing analytical solutions. This allows much more complicated models and, by the same token, reduces the need for idealisation. As a result, these computational models are much closer to the experiment and can even make numerical predictions about specific systems, provided the data used to build and parametrize the model are of sufficient quality. Examples are simulations of the entire translation system in yeast (Chu et al. 2012, 2014), or simulations of entire brains (Markram et al. 2015).
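
As a contrast to the analytical style of small mathematical models, the following sketch simulates a toy biochemical reaction system, constant synthesis and first-order degradation of a single molecular species, with Gillespie's stochastic simulation algorithm. The reaction scheme and rate constants are invented for illustration; published simulation studies such as those cited above are of course vastly larger.

```python
import random

def gillespie_birth_death(k_syn=2.0, k_deg=0.1, t_end=100.0, seed=1):
    """Gillespie's direct method for:  0 -> X at rate k_syn,  X -> 0 at rate k_deg * n."""
    random.seed(seed)
    t, n = 0.0, 0
    trajectory = [(t, n)]
    while t < t_end:
        a_syn, a_deg = k_syn, k_deg * n      # reaction propensities
        a_total = a_syn + a_deg
        t += random.expovariate(a_total)     # waiting time to the next reaction
        n += 1 if random.random() < a_syn / a_total else -1
        trajectory.append((t, n))
    return trajectory

traj = gillespie_birth_death()
print("final molecule count:", traj[-1][1], "| long-run mean k_syn/k_deg =", 2.0 / 0.1)
```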

With modern computers and simulation technologies, fairly large systems can be simulated within a reasonable time. For systems consisting of a very large number of particles, computational cost can still limit a modelling project. It is conceivable that many of those problems will be overcome in the future when hardware becomes even cheaper and modelling technologies advance beyond their current state. There is yet another problem with large-scale computational models: parameter values. Once one knows (or believes one knows) the structure of a biological system, i.e., which protein interacts with which protein, which genes control one another, what pathways look like and the like, it is quite straightforward to encode it. Yet, determining the quantitative details of interactions is chronically difficult. The relevant empirical data are often hard and expensive to come by. Even where they are available, they are inherently uncertain and often species- and context-specific. Missing parameters are a major challenge for everyone who tries to model biological systems. Again, it is conceivable that in the future it will become easy to measure these parameters and this problem will go away. Yet, at the moment we are not at that stage, and the lack of quantitative information is a major impediment to the development of system-wide models.
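
The following sketch illustrates, on synthetic data, what the estimation of a missing parameter can look like in the simplest possible case: a single decay constant is recovered by fitting an assumed exponential model to noisy measurements. Both the model and the data are invented for illustration; in real modelling projects the parameters are far more numerous and far less well constrained.

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)
true_k = 0.3
t = np.linspace(0.0, 10.0, 20)
data = np.exp(-true_k * t) + rng.normal(0.0, 0.05, t.size)   # synthetic noisy "measurements"

def decay(t, k):
    """Assumed model: simple exponential decay with unknown rate constant k."""
    return np.exp(-k * t)

k_est, k_cov = curve_fit(decay, t, data, p0=[0.1])
print(f"estimated k = {k_est[0]:.3f} +/- {np.sqrt(k_cov[0, 0]):.3f} (true value {true_k})")
```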

While such computational models may be quite large-scale in the sense that they represent a large number of interactions, they are focussed on a particular research question. Typically, they will be published together with a particular set of experiments. As such, these computational models are not unlike the mathematical models in that their development requires the ingenuity of the modeller to select relevant features of the system to be modelled and exclude irrelevant ones. This process of selection should not be misunderstood as a necessary evil of modelling. Instead, understanding what is and what is not relevant for a particular purpose is an essential part of specifying the model and the system, and the one that connects scientific values and practice to social values and practice. We shall return to this important point below.

The final type of model we wish to describe is what we call system-wide models (SWMs). “System-wide” here usually means that everything known about a particular type of interaction is included in the model. This could mean that the model includes all metabolic reactions, or all genes and how they regulate one another. In this sense, system-wide models are repositories of all available knowledge of a particular domain (e.g., the proteome) for a particular organism. They are very different from the computational models that clearly focus on a particular research question, in that making these models does not require the ingenuity of the modeller but could be, and regularly is, automated. Human input is usually still required for sanity checks and overall quality control, yet the main work of model construction is performed by computer programs that query databases or perform automated literature searches and text mining to establish models.

There are many different approaches to SWMs. One important methodology is Flux-Balance Analysis (FBA). These are models that concentrate on an assumed steady state of a system. Their all-encompassing scope requires substantial idealisations, including assuming a steady state and mass action kinetics only. Yet, with those assumptions in place, all available information about the metabolism of a cell can be encoded in a (large) model and examined using appropriate software tools.
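
As an illustration of the formal core of FBA, the following sketch encodes a toy network of three reactions and two metabolites as a linear programme: one flux is maximised subject to the steady-state constraint S v = 0 and assumed capacity bounds. Genome-scale FBA models work on the same principle, only with thousands of reactions and dedicated software tools.

```python
import numpy as np
from scipy.optimize import linprog

# Toy stoichiometric matrix: rows are metabolites A and B; columns are reactions
# v1: uptake -> A,  v2: A -> B,  v3: B -> "biomass" (the objective flux).
S = np.array([[1, -1,  0],
              [0,  1, -1]])
bounds = [(0, 10), (0, 5), (0, None)]   # assumed capacity bounds on each flux
c = [0, 0, -1]                          # linprog minimises, so maximise v3 by negating it

res = linprog(c, A_eq=S, b_eq=np.zeros(2), bounds=bounds)
print("optimal flux distribution:", res.x)   # bottlenecked at 5 by the A -> B reaction
```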

There are a number of other tools and models within the wider field of computational biology that can reasonably be considered as computational models. Databases, network representations (“omics”) and the like are also SWMs, in the sense that they are representations of organisms. Yet, they are not runnable.

A final type of model used in biology is the informal or verbal model. Unlike any of the above types, verbal models are to some extent personal to individual scientists and consist of the particular understanding that researchers have of a process. So, for example, a biologist may understand how translation works. She will then also be able, to some extent, to communicate this understanding to her students using everyday language or diagrams and cartoon models.

In some sense, this informal understanding is the real understanding of a system. Experts have it. On the other hand, though, verbal reasoning and understanding are inherently imprecise and susceptible to logical, quantitative and other errors. For example, it is very difficult, even for an expert, to understand the behaviour of even moderately large gene networks by informal reasoning alone.

Kung et al. (2012) call their Figure 4 a “cartoon model”. It is reproduced below (Fig. 1).

Fig. 1
The figure shows the states of the cartoon model: resting CFeSP, the Co-protected state, the folate-free complex, the CH3-H4folate-bound complex, the folate-on conformation, the H4folate-bound complex, and the product-bound state.

An example of a cartoon model (Kung et al. 2012). Reprinted by permission from Springer Nature: Nature, © 2012

This drawing displays several key features: There are schematic, drawn elements that are intended to correspond to material constituents (typically biomolecules or complexes). Next, there are arrows that can be thought of as chemical reaction arrows and that enable representation of change in the dynamical system. In the case of verbal models, drawings are replaced by names and descriptions, and arrows are replaced by sentences that describe possible events. We will therefore treat verbal models as equivalent to cartoon models.

An informal model such as the one above does indeed have a type of entailment structure (denoted, in this particular example, by arrows between drawings of different system states). However, the informal model cannot be “run” in the sense of being calculated or implemented in a computer. If one looks at Kung’s model above, one can deduce, by one’s own inferential powers, that there are constraints on the state space and on the passages that can take place within it. In this sense it provides predictions, albeit qualitative and often quite imprecise and probabilistic ones. Such predictions are applied, for example, in medicine, when a candidate drug is chosen for a particular therapy because it is known that the patient could benefit from its effects if they occur. Predictions are also useful in the research process itself because they enter the cycle of speculating and hypothesizing about unknown biological mechanisms.


Understanding the Purpose of Formal Models

Unification

Traditionally, models in science are thought of as catering to two main needs: explanation and prediction. Precisely what explanation means, and under which conditions one can be satisfied that a model is predictive, are difficult topics in their own right. A thorough treatment of those issues goes well beyond the scope of the present contribution. Hence, instead of giving detailed accounts, we confine ourselves here to highlighting essential aspects of models.

When a scientific field develops towards increased quantification, physics is often referred to as an exemplary science. Notably, in physics, explanation and prediction nearly always go hand in hand. A model in this case may be a relation that has been derived from some known physical equations. When adjusted to specific circumstances, it can then predict the result of an experiment. Whether or not this prediction is correct can be, and usually is, tested by designing an experiment that implements the basic assumptions of the model. This is a well-known modus operandi of physics.

Models in physics can be explanatory in that they relate apparently different phenomena to one another. Typically, this happens by reducing them to a common underlying theory. So, for example, the theories of electricity and magnetism can be understood by reducing them to the underlying theory of electromagnetism, and in essence to Maxwell’s equations. Unification is a powerful and intellectually very satisfying mode of explanation.

Can models in biology fulfil a similar role? As far as prediction is concerned, some computational biologists will certainly recognise the cycle of prediction and experimental corroboration. There is nothing more satisfying for a modeller than to see an experiment reproducing what she predicted would be the case.

The similarity with physics, while apparent at first, is nevertheless superficial. The nature of prediction is of a very different kind in biology than in physics. Experimental physicists tend to design their experiments in relation to an existing theoretical prediction. So, for example, if somebody predicted (credibly) that Higgs bosons exist, then experimentalists will try to find ways to confirm or reject this prediction in practice. Note that the prediction of the existence of the Higgs boson is itself an application of physical insights tightly coupled with mathematical reasoning. In essence, the prediction of the Higgs boson was stipulated as a consequence of existing physical theories. In this sense, the experiment is subservient to the theory.

No such subservience exists in biology. Computational biologists do not normally derive their models from existing biological law-like theories. Instead, they encode existing biological understanding of particular structures and processes into a computer program which they then run. In this sense, computational models are more akin to special purpose reasoning tools, rather than case specific consequences of a general theory.

For example, if a modeller wishes to predict the speed of ribosome motion along the mRNA, then she first needs to read through the experimental literature to find out how ribosomes move along transcripts. This will involve an understanding of the degeneracy of the genetic code, of tRNAs and of how they bind to the ribosome. Nearly everything that goes into the model will ultimately have been discovered by somebody through a large number of painstaking experiments. In this sense, in computational biology the theory is subservient to the experiment.

Consequently, the role of models and their relation to experiments are different in biology than in physics. The experiment confirms the usefulness of a model in a specific context rather than corroborating a law-like theory about the world in general. Models can have an explanatory function in biology, but explanations do not derive from unification. They are of a different sort. Accordingly, unification is not a likely scope and purpose of the future biosciences even if formal models come to prevail.

Prediction

Rather than unification, the more frequent vision of a future quantitative biology, in particular in research policy, is that formal models will provide precise predictions of the behaviour of biological systems, including patients who then could benefit from truly precise medicine.

It seems indeed reasonable to expect improved predictive abilities in the biosciences. We shall explain why by highlighting important features of verbal and cartoon models. However, we shall also argue why the ambition of an all-encompassing exact biology might be unfeasible, in short, because there is an inevitable trade-off between precision and the complexity of the biological phenomena to be modelled. A more likely knowledge base for high-precision biological engineering is the type of synthetic biology that is based on relatively simple systems composed of artificial molecular species (e.g., biobricks) designed to have few interactions with native biological compounds. However, biological mechanisms are rarely simple in the sense that there are only a few relevant components. Instead, biological processes are regulated by a large number of elements. Consider the following example, found in Strath et al. (2009) (Fig. 2).

Fig. 2
The diagram shows a complicated cartoon model that includes nutrition, growth factors, translation, transcription, DNA replication, and many more elements.

An example of a complicated cartoon model (Strath et al. 2009). Reprinted under Creative Commons license CC-BY 2.0

Here, the model elements include biomolecules, biological processes and cellular compartments, while arrows denote chemical reactions, physical movements and regulatory signals. We are not presented with an array of instances from state space as in Fig. 1. Instead, the figure invites questions such as “What would happen if the extracellular level of TGFβ increases?”, and one could in principle try to “run” the model by human thinking, following the arrows in one’s mind. In practice, however, this procedure would not yield robust and precise results. The number and nonlinearity of interactions, and the lack of quantitative information about their magnitudes and kinetics, render the answer indeterminate. This model provides very little precise information about the dynamics of the system that it intends to model; rather, it is an inventory of material and functional constituents and their interactions.

Where cartoon models meet their limits, mathematical or computer models can be used as a virtual laboratory to explore and understand the mechanisms. Not only will the computation be able to track a much larger number of interactions than a human can by informal deductive reasoning alone. More importantly, in order to formalise the model into a form that can be run by an algorithm, the modeller will request a lot of information that is not included in the informal model: the kinetics of interactions, allowed concentration ranges for chemical components, and so on. The choice to construct a formal model creates a need for more precise information which, together with computational precision, paves the way for improved prediction.

At the same time, it is the consideration of precision of information and computation that also enables us to point out the predictive limitations of formal models of biological phenomena. Let us consider the vision of systems biology as an exact science that allows precise prediction of dynamic biological systems (Kitano 2002). First, “dynamic” implies that the model should be able to represent the development of the real system in time, that is, allow for prediction. Predictions should in principle be exact, at least within a time frame and within a tolerance level that is acceptable for scientific and engineering purposes, depending upon the practical possibilities of controlling the interaction between system and environment.

Next, exactitude should not only be provided but also ensured. For instance, artificial neural networks may be trained to deliver highly precise predictions of almost any dynamical system; however, unless the system itself is thought to have a structure reminiscent of a neural network, this type of precise modelling is not what is normally meant by exact science. While one would trust predictions that remain within well-tested parts of the state space, there is no a priori reason to trust predictions outside the empirically tested parts of parameter space. What is lacking is a justified claim that model and system are structurally similar, and that the model is “realistic” in some important sense. In the logical empiricist philosophy of science of the 1960s this requirement was stated in a very strict way by demanding “bridge principles” that provide one-to-one correspondence between observable elements of the real system and their counterparts in the model. For instance, a controversial issue in the interpretation of quantum mechanics was how to understand the wave function, exactly because it remained unclear whether the wave function corresponded to a property of objects in the real physical universe.

Small mathematical models and computational models in systems biology do indeed consist of elements that are supposed to correspond to molecules, molecular complexes or other small material components. The exception is the relatively marginal research tradition that the theoretical biologist Robert Rosen called “relational biology”, which tries to construct formal models in which the components can be purely functional rather than structural or material. Also, the inferential entailment structure, in particular in the small models, typically consists of deterministic differential equations intended to correspond to physical and chemical processes involving and governing the state functions of the material constituents. Equations may represent chemical reactions, transport, diffusion, enzymatic catalytic activity, etc. Such models may in principle be exact and provide precise predictions.

In practice, however, the precision is challenged on three fronts. First, the model will typically include equations with parameters, the values of which must be estimated either by experiment or, if this is not possible, in part by reverse-engineering approaches. Secondly, it is impossible to model a biological phenomenon without simplifying it. Already a single cell includes too many chemical and physical interactions and too much spatial detail for a model, and only some of these can be included. A higher organism includes a high number of tissues and a very high number of cells, none exactly identical; it interacts with a changing environment; and through reproduction it takes part in the evolutionary process. Very little of this, if anything, can be included in a computational model that aims to faithfully represent chemical and physical interactions, and even less in an analytical model. In practice, there is a trade-off between biological relevance and what can be achieved by a reasonable modelling effort. Thirdly, while it is certainly likely that computer power will continue to increase and that future computational models may be larger, this does not by itself necessarily increase precision. Large models introduce computational complexity. A model may be expanded so that it demands twice the computational expense to run; however, a proper sensitivity analysis, or at least an exploration of the model’s behaviour across its parameter space, may demand much more than a doubling of computational expense. We believe that many modelling practitioners can confirm the practical difficulties of tuning models towards biologically relevant behaviour: even with the “correct equations”, the everyday experience of the modeller is that most of the time the model runs produce noise and useless results. In practice, a larger model does not necessarily mean less noise; the opposite may be the case.
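
A back-of-the-envelope illustration of the third point: even a coarse grid scan over a model’s parameter space grows exponentially with the number of parameters, so enlarging a model multiplies the cost of exploring its behaviour. The numbers below are assumptions chosen only to make the scaling visible.

```python
def grid_runs(n_params, points_per_param=5):
    """Number of model evaluations needed for a full grid scan of the parameter space."""
    return points_per_param ** n_params

for n in (3, 6, 12):
    print(f"{n} parameters, 5 grid points each: {grid_runs(n):,} model runs")
```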

Robust predictions are therefore only to be expected in cases where there is a good strategy for how to idealize and simplify the system into a model: either by drawing clear boundaries around the physical system, or by including only specific phenomena inside it, or, as in the case of flux balance analysis, by stating clear assumptions about the processes to be studied (in that case, modelling metabolic processes in terms of steady-state kinetics of perfectly available chemicals in solution). Predictive success will consequently depend on how reasonable these assumptions are in the specific case, that is, on the knowledge that the system already is approximately simple in a way that corresponds to the simplified model. Another way of saying this is that the model may yield robust predictions if we have a robust understanding of the biological system and know what questions we reasonably can ask about it. This is not in any way unique to biology. For instance, in the aftermath of the financial crisis, economists have had to explain that their models could not predict the system breakdowns that actually occurred because the possibility was not included in the design.

System Wide Models: Descriptive Models and Repositories of Information

It may be objected that contemporary systems biology already counts a number of what we above called system-wide models (SWMs), and that, unlike the models discussed in the previous section, they do not need the ingenuity of the modeller to the same extent as simulation models do. While SWMs normally are computerized, they often are not executable. Instead, these models are mostly collections of data gathered in some way, either by systematic search of the literature, by a large number of individual submissions by experimentalists, or by specific large-scale experiments. Such SWMs are often collections of components, functional annotations of components (e.g., “this gene is involved in metabolism”), interaction tables of molecules, expression data, quantitative data (e.g., reaction rates) and the like. There are now numerous databases, most of which are accessible to the wider community, which store SWMs; one well-known example is KEGG, the Kyoto Encyclopedia of Genes and Genomes, a database of pathways.

SWMs are frequently created automatically, although sometimes curated as well. Compared to nearly any other type of model they are less coupled to a specific context or purpose. Still, in order to be of use in practical science, the information contained within SWMs must somehow be related to a question. There are many conceivable ways in which this could happen. One way is that the data are processed, for example by statistical analysis. A next step is to use the raw information in SWMs to generate new knowledge with the help of statistical or machine learning/AI methods. For example, raw sequencing data could be used to find promoter sites or introns. The result of those exercises is most often a derivative SWM.

SWMs lack many attributes that are usually associated with models. They are not built with respect to a theoretical framework, they are not unifying, and they are not predictive. As such, they are more akin to a stamp collection than to a theory. Nevertheless, SWMs allow scientists to reason about systems, for example by systematically comparing the differences between diseased tissue and healthy tissue or by extracting commonalities between taxonomies of species. SWMs are accordingly models proper in the sense of being tools that aid reasoning.

The existence of such models has led to a new type of “hypothesis-free” biological research which is mostly concerned with finding new ways to administer the new wealth of data and more efficient ways to cross-link available information in useful ways (Kuperstein et al. 2015).

A type of SWM that has gained particular importance in systems biology is the Flux-Balance Analysis (FBA) model. FBA models, which may be an intermediate case between computational models and SWMs, can be “run”, in the sense that one can set up optimisation algorithms to find particular solutions that are consistent with given constraints. Yet, the model itself is not dynamic, but consists of a description of the topology of a metabolic network plus constraints. Together with an assumption of optimality, FBA models can then be used to compute fluxes through the network. This can then be used to infer missing components or to analyse differences in the biochemistry of various cell types.

FBA models are SWMs in that they can be constructed semi-automatically from information stored in databases and can map entire metabolic networks of organisms. None of these usages of SWMs generates direct understanding of particular systems, but rather more information, sometimes only aggregate views of detailed data that were already present. The important conclusion to draw from this is that SWMs may play an important role in future biosciences, but that they are not end products in themselves. They have to be used by other modelling practices, be they verbal or formal mathematical or computational models, to produce prediction and understanding.

Explanation

Arguably, as of now the most important function of biological models is that of explanation. Explanation in biology frequently means uncovering “mechanisms.” So, for example, transcription factors and their binding dynamics to the operator site, together with an understanding of the action of RNA polymerase, can explain how the activity of one gene can be regulated by that of another gene.

It is acceptable within biological science to describe such mechanisms qualitatively. That is, a number of experimental results together with a coherent verbal story is sufficient to satisfy editors in prominent journals that the relevant mechanism is interesting and can explain some biological phenomenon. Hence, verbal models are explanatory as long as they are supplemented by experimental evidence. So, for example, one could show that Gene A ceases to be regulated when Gene B is deleted. Furthermore, one could provide specific assays that demonstrate that mutations within a certain area upstream of Gene A can cancel the regulatory action of Gene B. Finally, one could directly demonstrate that the product of Gene B binds with high affinity to the sequence motif found upstream of Gene A.

This sort of evidence is mostly qualitative, elucidating the basic structure of the biological system. It provides little information about how strong the regulation is beyond very general qualifiers (e.g., the regulation is “very strong” or “weak”). For small systems such qualitative and semi-quantitative descriptions can be sufficient to create a good understanding of the system. However, they can be insufficient if the system is of moderate complexity. Non-linearities, for example, make it chronically difficult to understand how a system behaves, even if it consists of only a few components. Therefore, as long as they are used by themselves, verbal models are limited in their explanatory powers in biology.

Combining verbal models with computational reasoning can enhance explanations, in that it can add quantitative detail to mechanisms. Rather than merely saying that Gene A is regulated by Gene B, a quantitative model could add some understanding of how fast the regulation is and how it achieves its overall function. An example is the role of methylation in the regulation of the fim switch in E. coli, as described by Chu et al. (2008). In this article the authors describe how the metabolism of sialic acid is turned on upon uptake of this nutrient. However, the known mechanistic model of the pathway activation conflicted with separate known information about the toxicity of sialic acid. Upon closer inspection of available experimental data, there appeared to be a contradiction. However, using a dynamical computational model, the authors could show that the apparent contradiction could be resolved if the different timescales of activation of the pathways were taken into account. This reasoning depends crucially on the separation of timescales in the regulatory dynamics. This sort of effect is impossible to describe by verbal reasoning alone. However, it should be noted that once the case is made formally by simulation, the formal argument can then be reincorporated into a verbal model and described using plain language. It is not necessary to re-run the formal model each time one talks about the system.
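
The following sketch is a generic illustration of such a separation of timescales, not a reconstruction of the model in Chu et al. (2008): two hypothetical pathways respond to the same signal, one equilibrating quickly and one lagging behind by roughly two orders of magnitude. The transient window in which only the fast pathway is active is easy to read off a simulation but hard to reason about verbally.

```python
from scipy.integrate import solve_ivp

def two_timescales(t, y, k_fast=5.0, k_slow=0.05, signal=1.0):
    """Two pathways relaxing towards the same signal on very different timescales."""
    fast, slow = y
    dfast = k_fast * (signal - fast)   # equilibrates within about 1/k_fast time units
    dslow = k_slow * (signal - slow)   # lags behind by roughly two orders of magnitude
    return [dfast, dslow]

sol = solve_ivp(two_timescales, (0.0, 100.0), [0.0, 0.0], t_eval=[1.0, 10.0, 100.0])
for t, f, s in zip(sol.t, sol.y[0], sol.y[1]):
    print(f"t = {t:5.1f}   fast pathway: {f:.2f}   slow pathway: {s:.2f}")
```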

The quantitative understanding of the system can often be transformed into a suitable verbal model plus a reference to some formal model that demonstrates the claimed effect. In fact, if the formal model is to be explanatory at all, then it has to be translated into a verbal model in order to be useful. Computations by themselves only yield numbers. These numbers can be used for the purpose of prediction but do not convey understanding. Only when meaning is attached to them by interpretation, and they are related to a network of knowledge, can they lead to understanding.

A Possible Tension Between the Requirements of Formal Models and the Need for Conceptual Flexibility and Ambiguity in Discovery

In the logical empiricist philosophy of science of the mid-twentieth century, the distinction between the context of discovery and the context of justification was devised in order to clarify the epistemic status of philosophy of science itself. Processes of discovery were considered to be informal, creative and a proper object of psychological research. The task of justification of scientific knowledge, on the other hand, was seen to be one of rational reconstruction, that is, the application of logic to demonstrate the valid relationship between scientific knowledge and its objects. Later developments in the philosophy of science have shown that the distinction between the context of discovery and the context of justification in itself is a simplification. If we accept the distinction as a first approximation, however, we can note how modelling in science serves two functions that at first sight appear quite different.

Often, the so-called modelling relation (Rosen 1985) is taken to describe the relationship between model and system (Fig. 3).

Fig. 3
The diagram shows Rosen's modelling relation between the real (natural) system, with its causal entailment, and the formal system, with its inferential entailment.

Rosen’s modelling relation

This can be understood as a schema for rational reconstruction, that is, for pursuing the question of the validity of the scientific model. For instance, one could argue that the model is valid if F and N are isomorphic and the encoding and decoding mappings are isomorphisms. This would indeed amount to the strictest possible logical formulation of the vision of exact science. On the other hand, in science as practice – in the context of discovery, as it were – there is usually no external position from which one can directly inspect the properties of F and N and compare their structure. Rather, the objective of research is to describe and understand a partially unknown natural phenomenon, and the model F is a tool in the pursuit of that objective. With the philosopher Immanuel Kant we can say that there is no way of directly knowing the thing in itself. Our cognition takes place with the help of and through our cognitive apparatus, in which concepts and models play a main role. What we know about the real natural phenomenon is what is hypothesised by our best (formal or informal) model F. At an early stage of discovery in the biosciences as currently practiced, the best model will almost invariably be an informal, verbal one.

The argument has been made that in the biosciences, it can be helpful for discovery if one’s best model – that is, one’s preconceived ideas of the system – is indeed tentative, flexible and ambiguous. This is due to the nature of living systems. While many scientific experiments have the simple objective of measuring properties of known entities, molecular biologists, biochemists and microbiologists routinely do experiments with more unknowns. For instance, one may suspect the existence of a certain biological activity, signal or pathway, and try out a number of experimental systems in order to see if the suspected phenomenon can be observed in a stable and reproducible manner. Furthermore, one would typically like to ascribe the function to a specific biomolecule or complex, but the phenomenon might not exist in a purified solution and might only be observed in an intact or close-to-intact biological structure such as a living cell (Strand et al. 1996).

Rheinberger (1997) described the process of discovery in the life sciences as a dynamic interplay between modifying the experimental system and modifying one’s description of it. In the critical phases of discovery, the interpretations of the experiments might change on a daily basis. Concepts and even research questions may be changed, refined and rejected on the path towards a stable and interesting signal from the experimental system. If the process is successful, gradually an “epistemic thing” emerges, which is neither a consolidated phenomenon nor a clear concept yet. As the experiments become more reproducible and stable, the phenomenon is at some point said to exist, and its identity is given by its epistemic counterpart, that is, the model as of the time of consolidation. During such processes, conceptual flexibility and ambiguity seem to be an advantage; one could speculate whether they are a prerequisite. If so, that would make a case for the usefulness of informal models as tools for discovery.

What we do know from the history of science is that verbal models and cartoon models have played an important role in biological discovery. This does not preclude the possibility that computational models could serve as tools for discovery. We would like to speculate, however, that they would be different tools. For instance, the use of computational models in discovery would easily encourage questions about specific interactions, the values of kinetic parameters, etc. In this way, they might influence the life sciences to adopt a style of experimentation that is more similar to physics, chemistry and macroscopic biology. Whether this is a good approach depends on the body of available knowledge of a given biological system. If one is convinced that the inventory of constituents is largely known, it may be a sound approach to focus on quantification of their properties. If not, the general inclusion of computational modelling may actually slow down discovery, in particular if there is a division of labour in which the experimentalists struggle to understand the content of the model and the modellers have little practical knowledge of biology. Cartoon models and verbal models facilitate thinking of the type “What is the function of this biomolecule/signal/pathway?” Sometimes this approach is too simplistic; however, it remains unclear what kind of biology one would get if such questions were dismissed altogether as too fuzzy and informal.

The Modelling Process: The Challenges of Radical Openness and Contextuality

Above we described the challenges to predictive precision that are encountered in modelling practice. We shall now turn to the possible implications of the responses that these challenges foster in terms of how the biosciences may develop.

Let us return to Fig. 3. The choice of a model (or a model design) implies a positive statement about the natural phenomenon under inquiry, a statement that ideally can be verified, corrected or refuted by experiment and observation. However, insofar as the natural phenomenon has not been completely identified (and accordingly is still under inquiry), the choice of model (or model design) also acts to frame the phenomenon, that is, to delimit the research object. Certain elements and aspects of the real world will fit into the frame provided by the model, while others remain invisible, not measured or otherwise outside the scope of the model. While this fact is generally appreciated with respect to the particular choice of elements, we wish to draw attention to the most general implication of modelling, which is that the natural phenomenon will be framed in terms of a natural system (Chu et al. 2003). Chu (2011) has described how this constraint translates into two practical challenges that may occur in the modelling process: radical openness and contextuality.

Radical openness is a feature of the phenomenon under inquiry that manifests itself in the inability to successfully delimit the model/system. In order to improve the predictive power of the model, the model may be expanded by including elements from the environment that strongly interact with the system; but this new delimitation leads to the identification of other strong interactions with the environment, and so on.

Contextuality is similar to radical openness, except that the problem of delimiting the model/system resides inside it, in the indefinite richness of the number and nature of the properties of model elements and of the interactions between them. The model may seem predictive and exact but suddenly fail because it did not take into account properties and interactions that had not been considered or measured before.

Both challenges result from a richness in properties and interactions in the natural world. Exact science has three strategies to meet these challenges:

  1. Exact science tries to avoid or minimise radical openness by searching for parts of the natural world that either appear relatively isolated from the environment (“looking for Nature’s seams”) or that can be isolated in the laboratory (or produced by technology).

  2. Physics and chemistry try to avoid contextuality by constructing complete physicalist models of all properties of the elementary material constituents and developing a unified theory of all forces that act upon them. This amounts to a reductionist programme and it explains the importance of the assumption of the perfect identity of elementary particles in the same quantum state, and the unacceptability of “hidden variables”.

  3. Finally, economics as a non-physicalist exact science tries to identify independent layers of law-like behaviour (of rational agents and market transactions) that are robust against contextuality.

Biology in its full scope studies a wide range of different phenomena: structures such as biomolecules, cells, organisms, species and ecosystems, and processes such as metabolism, reproduction, animal behaviour, and morphogenetic and phylogenetic development, to mention but a few. If emphasis is put on prediction and precision, the biosciences will have to employ all three of the strategies mentioned above. Already, contemporary life science on the molecular and cellular/sub-cellular level is oriented towards (informal) models, and it focuses on “systems-like” biological phenomena: structures such as organisms, their spatial compartments and material constituents, and processes that involve material constituents. Framing phenomena of life as systems produces a bias towards constancy rather than change, and similarity rather than variation. We expect this bias to become stronger if computational models become the norm, also because prediction will become a more central value and radical openness can hence be less tolerated. We would expect computational systems biology to consolidate the emphasis on material structure and single organisms in contemporary life science. Paradoxically, the strengthening of computational models and systems biology, which often is presented as a non-reductionist programme, may in this sense lead to less appreciation of biological complexity.

A Drive Towards Computational Models

The emergence of journals such as PLoS Computational Biology and research initiatives such as the Centre for Digital Life Norway reflects a shift in the biological paradigm. We suspect that an important trigger for this shift is the development of high-throughput experimental methods. Storing the results of modern experimental techniques (e.g., microarrays, ChIP-seq) requires sophisticated computational solutions. Even more so, intricate algorithms are required to understand the meaning of those results. These new experimental techniques have made the use of computational tools indispensable, and in the slipstream of this a new interest in modelling has evolved.

We believe that computational modelling is not merely riding a bandwagon of computational methods establishing themselves in unrelated areas. Instead, computational modelling becomes necessary also in the traditional biosciences in order to manipulate, interpret and understand data and results. The results of this research usually lead to descriptive models, but not by themselves to detailed understanding of mechanisms. For example, a sequenced genome only describes the DNA, but does not allow the user to understand the functional significance of the sequence. One uses further computational processing to predict, for example, transcription factor binding sites and promoter sites. Yet, even with the best algorithms this only leads to a list of candidate locations with a given functional relevance. This is interesting information that is routinely produced and deposited in databases. Yet, by itself it does not amount to actual biological knowledge. Apart from the uncertainty that is attached to these data, a list of binding sites says very little about what is actually going on in the organism. The entries in the databases only become truly useful when they feed into the work of the biological investigator who confirms predicted knowledge experimentally and weaves the facts into a coherent mechanistic understanding of a concrete system.
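
As a concrete illustration of this kind of derived, still-uncertain information, the following sketch scans a short DNA sequence against a position weight matrix and returns a ranked list of candidate binding sites. The motif and the sequence are invented for illustration; real motif-prediction pipelines are considerably more elaborate.

```python
import math

# Assumed 4-position motif: probability of each base at each position (made up).
pwm = [{"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
       {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
       {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
       {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1}]
background = 0.25

def score(window):
    """Log-likelihood ratio of a 4-base window under the motif versus background."""
    return sum(math.log2(pwm[i][base] / background) for i, base in enumerate(window))

sequence = "TTAGCAGGCAAGCATT"                      # an invented stretch of DNA
hits = sorted(((score(sequence[i:i + 4]), i) for i in range(len(sequence) - 3)),
              reverse=True)
print("top candidate sites (score, position):", hits[:3])
```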

High-throughput biology is just one symptom of a rapid methodological development that has led to an astonishing ability to manipulate and measure biosystems. These methods by themselves force the biomedical researcher into more complex verbal models that combine quantitative and qualitative information, often including non-linearities. Very quickly, verbal reasoning is no longer powerful enough to integrate this information. This is only exacerbated by the large amount of information available from SWMs, resulting in a deluge of information that needs to be integrated. In order to have an understanding of biological systems at the level of detail that is implied by the available data, it is necessary to use computational reasoning as an enhancement to verbal argument.

We suspect that this trend will intensify, and with it, the need for computational processing. At the same time, it must be remembered that computational models do not per se provide understanding, but are reasoning tools that aid the intuitive understanding of biological systems. They need to be related to verbal models and transformed into verbal models, at least by our present concept of what it means to understand something.

In summary: As long as biomedical research concentrated on a few genes at a time and their local effects, there was no need to outsource reasoning to a machine. Progress in biotechnology led to a refinement of measurement techniques. This in turn allowed high-throughput technologies, which necessitate the use of computers to administer and analyse the data deluge.

Crossing the Styx

Science, technology and society are co-produced in entangled processes that include sociotechnical imagination. In our case, the development of high-throughput laboratory methods, the increased focus on computational modelling and the emergence of the sociotechnical imaginary of precision medicine are three processes that are causally entangled and that of course also form part of a larger causal complex that includes the political economy of medical practice and research.

We have now reviewed certain features of types of models and modelling practices. If precision medicine were to become an exact science, it would imply that computational modelling would take a prominent if not wholly dominant place. It remains to discuss what implications such a development might have for medical technology and practice.

We find it useful to reflect on how technology normally works. The usual engineering solution to the problem of radical openness and contextuality referred to above is to construct simple and easy-to-predict systems rather than applying models to highly complicated systems. While many natural systems display nonlinearities and, in general, behaviour that is difficult to model, mechanical systems may have linear behaviour that can be modelled with precision exactly because they are designed to. For instance, railways are designed to have small and predictable friction between rails and train wheels. Well-designed mechanical systems can be predicted and controlled with extreme precision not because the universe is governed by simple and linear laws, but because nonlinear behaviour is deliberately excluded and prevented by skilful design of the system. This is how it is possible to successfully send spacecraft to other planets or to develop and distribute vaccines (Latour).

Medical science and technology have ample examples of highly simplifying strategies, ranging from the “cut, burn and poison” of cancer medicine to lobotomy, electroshock therapy and various psychopharmaceuticals that aim to reduce suffering by reducing brain complexity. However, because the underlying body of knowledge is not exact, these technologies are more likely to fail, especially if one’s purpose is to restore and protect biological complexity. If one is satisfied with killing the patient, these technologies can all be applied without failure.

The imaginary of personalized and precision medicine contains the purpose of maintaining or restoring the subtle and delicate homeostasis of human health in the presence of multicausal networks that drive the individual towards illness and possibly death. It wants to achieve this by tailoring the treatment, that is, finding the right drug and dose for the right patient at the right time. This ideal is not new; it is the heritage of patient-centred clinical practice, which by means of consultation and communication with the patient sought to tailor the doctor’s intervention. However, patient-centred practice, in all its imprecision, builds on hermeneutic knowledge of the single individual; what used to be called idiographic rather than nomothetic knowledge in philosophy of science, or simply experiential as opposed to evidence-based knowledge. The design problem in precision medicine is that it wants to achieve tailoring (which is sometimes possible, but fallibly so, by means of inexact science) by relying on exact science, which does not translate into technologies that tailor solutions around natural complexity. Exact science translates into technology that changes the system so that it keeps within the boundaries and parameter space of the model. The ambiguities in the use of the concept of precision medicine, and notably precision oncology, bear witness to this design problem. Certain diagnostic and therapeutic practices are already called precision oncology; it is just that the therapy outcomes are not precise.

Science is one of the most powerful institutions in modern society. It offers not only the knowledge base for the development of technology but also a large part of the knowledge with which modern human beings understand the world and themselves. It provides not only facts and explanations, but also indirectly guides us in the choice of questions and perspectives.

A transition from cartoon- and verbal-model-based life science to a precision medicine based on systems biology and mathematical and computational models will undoubtedly lead to new and improved knowledge of innumerable biological phenomena. At the same time, we have argued that it may direct the research focus even more towards controllable and predictable phenomena in relatively closed systems with regular behaviour. This is because such systems provide tractable problems for computational models. The resulting body of biological knowledge may reinforce modern human beings’ understanding of life as essentially predictable, understandable and controllable, and therefore as providing a suitable substrate for industrial and economic exploitation. In this sense, precision medicine and systems biology present themselves as a business case for the bioeconomy. Still, as of today, precision medicine is inexact and largely retains a concept of and an interest in biological understanding close to its inexact past. A next logical step, however, could be the gradual dominance of computational modelling, which would imply an even stronger shift towards instrumentality, reductionism and the view that life is predictable and controllable.

We have called this paper “Crossing the Styx”. The shift to a biology dominated by computational models might be likened to crossing a river from which there is little possibility of return. In Greek mythology, Styx was one of the rivers that separated the World of the Living from Hades, the World of the Dead. Still, curiously, the souls continued a kind of life in Hades; but it was a different life. Often it was assumed to be an inferior life – quite the opposite of the optimistic visions and not least all the sales talk that surrounds precision medicine, for which practising scientists are also responsible. We leave the evaluative aspect with the reader, well aware that the metaphor in itself may seem provocative. Still, we see two senses in which one could pursue a quite different and broader debate than the one followed in this paper.

The first is to what extent precision medicine could become an exact science by redefining its purpose and thereby solving the problems of radical openness and contextuality. We already noted how this problem was solved by military research that focuses on destroying life rather than maintaining health. Killing people was successfully translated into true engineering problems. A less radical alternative is to make the criteria of medical success as simple as possible, for instance in terms of standardized clinical outcomes such as survival or progression-free survival, perhaps under hospitalized and strongly medicated conditions. For instance, a precision oncology that merely focuses on short-term delay of death due to cancer has a better chance of successful translation into precise clinical practice than one in which quality of life is considered part of the problem and accordingly part of the system. A step in the direction of this rather extreme scenario would be the scientific dominance of computational modelling and SWMs, and the disappearance of classic informal and verbal models and hence of human understanding as a scientific product. It would all be data, models and clinical outcomes.

Some would argue, from a cultural, perhaps humanistic point of view, that such a development of medical science and practice is undesirable and should be avoided, hence implying that it can be avoided. Still, it would have to be admitted that it could be seen as just a next step of what the philosopher Jürgen Habermas called the colonization of the lifeworld by technology, and of what other scholars have called medicalization processes throughout the twentieth century and into the twenty-first. If one takes up the metaphor of the Styx, one might be tempted to ask if human civilisation in this respect is itself becoming senescent, replacing its human faculties with formal reasoning and machines. True to the theoretical concepts of co-production and sociotechnical imagination, however, one could argue that there is no necessity in this development. Future science can become different. For instance, cancer medicine can become more tailored by resisting the Utopian ideal of exact science and instead combining high-throughput methods and other biomedical developments with the patient-centred focus of the art of medicine, staying with living, as it were.