At the Biological Modeling and Simulation Frontier
We provide a rationale for and describe examples of synthetic modeling and simulation (M&S) of biological systems. We explain how synthetic methods are distinct from familiar inductive methods. Synthetic M&S is a means to better understand the mechanisms that generate normal and disease-related phenomena observed in research, and how compounds of interest interact with them to alter phenomena. An objective is to build better, working hypotheses of plausible mechanisms. A synthetic model is an extant hypothesis: execution produces an observable mechanism and phenomena. Mobile objects representing compounds carry information enabling components to distinguish between them and react accordingly when different compounds are studied simultaneously. We argue that the familiar inductive approaches contribute to the general inefficiencies being experienced by pharmaceutical R&D, and that use of synthetic approaches accelerates and improves R&D decision-making and thus the drug development process. A reason is that synthetic models encourage and facilitate abductive scientific reasoning, a primary means of knowledge creation and creative cognition. When synthetic models are executed, we observe different aspects of knowledge in action from different perspectives. These models can be tuned to reflect differences in experimental conditions and individuals, making translational research more concrete while moving us closer to personalized medicine.
KEY WORDS: agent-based; mechanistic; modeling; predict; simulation
ADMET: absorption, distribution, metabolism, elimination, and toxicity
AT II: alveolar type II
CPM: cellular Potts models
human adipose-derived stromal cell
ISL: In Silico Liver
MDCK: Madin-Darby canine kidney
M&S: modeling and simulation
multiple organ failure
portal vein tracts
QSAR: quantitative structure-activity relationship
R&D: research and development
systemic inflammatory response syndrome
tumor necrosis factor
The declining pace of new drug approvals, general inefficiencies, and significant rates of failures late in development are among the factors contributing to calls for reexamination of the methodological robustness of pharmaceutical and biotechnology research and development (R&D) (1, 2, 3, 4). Among the inefficiencies is the fact that scientists and decision-makers translate results and conclusions from wet-lab experiments to patients using conceptual mappings where some assumptions about the wet-lab-to-patient mappings are intuitive and unknown, and thus unchallengeable. Those conceptual models exist in and rely on the minds of a changing cadre of domain experts. It has been argued that mathematical, systems biology models will help to address this issue, but we argue that primary reliance on the familiar inductive and deductive computational modeling methods will be inadequate: success requires that they be augmented with new classes of models and methods. Within the past decade, an expanding number of groups, working within different domains, have contributed to the development of a class of computational models that we argue can dramatically increase the efficiency, efficacy, reliability, and variety of plausible translational mappings, and thus facilitate and accelerate better decision-making.
To distinguish this class of models and its methods from the more familiar inductive mathematical modeling methods, we identify it as the synthetic (combining elements to form a whole) method of modeling and simulation (M&S). Our objective is to draw on recent examples to help make a case for the use of synthetic M&S within the critical decision-making stages of drug discovery and development. We explain 1) how synthetic M&S is made scientific; 2) how synthetic M&S can augment mental models and system thinking with concrete virtual tissues, organs, and ultimately virtual patients; and 3) that synthetic M&S enables explorable, in silico wet-lab-to-referent mappings that are accessible to all members of an R&D organization as operating models of current knowledge and beliefs. More important is that synthetic M&S encourages and facilitates abductive reasoning (see Appendix for explanation): the primary means of knowledge creation and the primary source of creative leaps. The more mature descendants of these models may even begin capturing the gestalt of successful pharmaceutical R&D.
While it is true that no computational model can fully represent the complexity of biological systems, new model types are essential to achieving deeper insight into the causal, mechanistic networks responsible for disease and desired pharmacological phenotypes. We show how synthetic M&S can be used to discover, clarify, and challenge plausible linkages between biological mechanisms and phenotypes. Skeptics may also declare that models cannot mimic the complexity inherent within biology and thus cannot be correct. However, the advance of science depends on discovering and using better models. It will become clear how validation-supported synthetic models (defined in the Glossary) can expedite and improve R&D decision-making.
The text that follows is divided into five sections. In “Rationale for New Model Classes,” we provide our rationale: we make the case for needing new classes of models and discuss how they are created. We follow that in “Analogues: From In Vitro Tissues to Interacting Organs” with descriptions of four examples of system-oriented, synthetic, biomimetic models that have provided new mechanistic insight into phenomena observed in vitro and in vivo. Those motivating examples provide context for “Reasoning Types and Their Different Roles in M&S,” which discusses how the three types of reasoning—induction, deduction, and abduction (in Glossary)—are used in science and M&S. We describe how the two classes of models, inductive and synthetic, draw differently on inductive, deductive, and abductive reasoning to achieve their different objectives. Coupling the capabilities of well-established mathematical modeling methods with those of synthetic M&S will, for the first time, make the full power of the scientific method available to the M&S component of R&D. That discussion leads directly to “M&S and the Scientific Method,” in which we develop the idea of scientific M&S (in Glossary). In the penultimate section, “Impact of M&S on Scientific Theory,” we explain what it can accomplish. We argue that in order to achieve the above vision, we must expand computational M&S into scientific M&S. We provide a list of eight capabilities that synthetic models will need in order to achieve our vision. We then summarize in the Conclusions. A glossary of less familiar terms is included in the Appendix along with essential supporting information, including brief descriptions of inductive, deductive, and abductive reasoning. 
For convenience, selected key points made in “Rationale for New Model Classes,” “Reasoning Types and Their Different Roles in M&S,” “M&S and the Scientific Method,” and “Impact of M&S on Scientific Theory” are provided as bulleted statements at the start of each subsection. A relatively comprehensive bibliography of primarily discrete event (in Glossary) biomedical models that combine synthetic and inductive methods is provided as Supplemental Material. See (5, 6, 7, 8, 9, 10, 11) for reviews of advances in and relevant biomedical applications of inductive mathematical M&S.
RATIONALE FOR NEW MODEL CLASSES
Envisioned New Model Classes
Building an experimental apparatus is fundamentally different from “modeling the data.”
An objective is to build better working hypotheses about mechanisms.
What spatiotemporal mechanisms play roles in the emergence (in Glossary) of a pharmacological response? During drug discovery and development, current knowledge is often inadequate to answer that question. A research objective is to develop better working hypotheses about those mechanisms. Synthetic models can expedite that process. A dictum of the physicist Richard Feynman was “what I cannot create, I do not understand.” It follows that to understand biological responses and their plausible generative mechanisms when uncertainty is large and data are chronically limited, we need to build extant (actually existing, observable), working mechanisms that exhibit some of those same phenomena. Building extant, plausible, analogue mechanisms is fundamentally different from the traditional approach of “modeling the data.” In the latter case, the mechanisms are all conceptual. We cannot yet build hierarchical, modular, extant mechanisms out of biochemicals. However, as described below, we can build extant biomimetic mechanisms using object-oriented software tools.
Consider the following: A software engineer, given complete freedom, creates code that, when executed, produces mechanisms which give rise to multi-attribute phenomena that are strikingly similar to specified pharmacological phenomena. When the software engineer has limited biological knowledge, there may be no logical mapping from event execution in the simulation (in Glossary) to the biology during observation. However, biologically inspired requirements can be imposed to shrink and constrain the space of software mechanism and implementation options that successfully exhibit those same phenomena. A continuation of that process can lead to extant software mechanisms (and phenomena) that are increasingly analogous to their biological counterparts. In so doing, we are not building a model based exclusively on known biological facts and assumptions, because the facts are often insufficient to do so. Furthermore, keeping track of all the assumptions and assessing their compatibility can become an unwieldy, time-intensive task. Rather, we are exploring the space of reasonably realistic, biomimetic mechanisms that can cause the emergence of prespecified pharmacological phenomena. The focus is on inventing, building, exploring, challenging, and revising plausible biomimetic mechanisms. To distinguish the two modeling methods and help ensure a disciplined focus on methodology, we refer to models arrived at through the latter process as (biomimetic) analogues (in Glossary). To emphasize aspects of construction and method, specifically combining often varied and diverse elements, so as to form a coherent whole, we say synthetic analogues.
Bridging the Gap Between Wet-Lab and Traditional, Computational Models
Gap-bridging computational models will be objects of experimentation, similar to wet-lab models.
In order to demonstrate that we understand how molecular level details interface with and exert influence at higher levels and emerge as features of a favorably altered patient phenotype, we need models and methods that can bridge the gap. We need models that are increasingly more like their referents—models that have extant mechanisms that generate emergent properties analogous to how phenomena emerge during wet-lab experiments. Those models will be synthetic, as are wet-lab models. An important use for such models will be testing hypotheses about mechanisms (rather than about patterns in data).
However, it is important to note that network, PK, PD, and other mathematical modeling methods do not need replacing—they do what they are intended to do very well, even though they will benefit from improvements. Nevertheless, to span the gap, we will need new model classes having new uses and capabilities (see Text Box below).
Instantiating a Mechanistic Hypothesis and Achieving Measurable Phenotype Overlap
A synthetic analogue is an extant hypothesis: execution produces an observable mechanism.
Analogues that are scientifically useful will have few 1:1 model-to-referent mappings.
Although there are many similarities in measurable phenomena between systems, there are few precise, one-to-one behavioral mappings between structures in MDCK cultures and epithelial cell structures within mammalian tissues. True, there is a 1:1 correspondence between cells. However, because cell environments and genetics are different (cannot be precisely duplicated), mappings between in vitro and in vivo, which can be aspect- and perspective-dependent, may be nonlinear and in some cases complex (in Glossary). That observation is instructive. It suggests that an in silico analogue can become a scientifically useful representation of MDCK cultures (and eventually epithelial cells in tissues) without enforcing 1:1 mappings between its attributes and mechanisms and measures of cultured MDCK.
We can characterize wet-lab models that have extended lifetimes and are used in different experimental contexts with a variety of designs as being robust to context, even though their referents are specific. We need synthetic analogues that can be characterized similarly. The mappings from wet-lab model to referent are often somewhat different for each context because the aspect of interest will have changed. We can surmise that an analogue built initially to have many 1:1 model-to-referent mappings may be solidly anchored to one referent aspect and attribute, and thus may have limited additional uses (without undergoing considerable reengineering).
Absent precise 1:1 mappings, scaling methods will be needed; their development can be separated from that of the analogue. The mechanisms responsible for generation of an MDCK culture phenomenon (e.g., stable cyst formation) are not grounded to any external measurement methods. Nor are they grounded directly to a tissue referent. The units, dimensions, and/or objects to which a variable or model constituent refers establish groundings (in Glossary and discussed further in “How a Model is Grounded Impacts How It Can and Should Be Used”). The components of parameterized PK models are typically grounded to metric space. The components of MDCK mechanisms are grounded to each other. The grounding of cells to each other and their environment is independent of any measures. From that fact, we can infer that analogues that bridge the gap will exhibit similar grounding. We measure wet-lab phenomena using metric devices. We cannot use those same devices to measure events during simulations.
Analogues That Bridge the Gap Will Be Executable Knowledge Embodiments Suitable for Experimentation
Synthetic models present different aspects of knowledge in action, and do so from different perspectives.
Separate, tuned copies of successful analogues can reflect differences in individual-specific attributes.
In synthetic analogues such as the In Silico Livers (ISLs) discussed below, components and their interactions represent micro-mechanistic features, including anatomical, physiological, and molecular details at different levels during execution. Because of such multi-level similarities, following several rounds of improvement, testing, and validation, descendant analogues of this class have the potential to evolve into executable representations of what we know (or think we know) about biological systems: executable biological knowledge embodiments. We expand on that idea below; see (18) for further discussion. Such embodiments are needed but are beyond the scope of current PK, PD, and related modeling methods. Knowledge embodiment is made feasible because synthetic analogues provide concrete instances of that knowledge rather than computational descriptions of conceptual representations. When an analogue is executed, it demonstrates when, how, and where our knowledge matches or fails to match details of the referent system. For that reason, Fisher and Henzinger (19) suggested referring to such simulation models as executable biology.
The envisioned synthetic analogues can facilitate the merger of knowledge and expertise contributed across organizational domains into executable and, therefore, observable and falsifiable systems of plausible mechanisms and hypotheses (20). Together, they will represent the current best theory for aspects of system function. It will be possible to observe different aspects of knowledge in action and do so from different perspectives, as we do with wet-lab systems. Adjusting (tuning) an ISL to represent (for example) a normal rat liver in one in silico experiment, a diseased rat liver in another (as in (21); see “In Silico Livers”), and a mouse or human liver in another will be relatively straightforward because uncertainty can be preserved and cross-validation of component functions can specify which features to tune and by how much. It will be feasible to take copies of the same analogue and tune each separately to reflect differences in measured, patient-specific attributes. The collective knowledge coupled with collective uncertainty can be made specific for groups of patients and even for individual patients.
Achieving the Vision Motivating Physiologically Based Pharmacokinetic (PBPK) Modeling and Simulation
To discover and test plausible mechanistic details, we must experiment on (different) synthetic analogues.
A vision motivating research on synthetic analogues is identical to one that has motivated development of traditional PBPK models: by “accounting for the causal basis of the observed data, ... the possibility exists for efficient use of limited drug-specific data in order to make reasonably accurate predictions as to the pharmacokinetics of specific compounds, both within and between species, as well as under a variety of conditions” (22). However, PBPK model parameters necessarily conflate features and properties of the biology (aspects of histology, etc.) with drug physicochemical properties (PCPs) (23). In doing so, “the causal basis” becomes obscured due to the conflated biological features that were especially influential in causing some pattern in the data.
A purpose of conducting PK and other experiments that provide time course data is often to shed light on prevailing mechanistic hypotheses about drug dynamics, specifically to gain new knowledge regarding mechanistic details of disposition and metabolism. Most often, hypotheses about those details are induced from the data. Fitting inductive mathematical models to data is often used as evidence in support of particular hypotheses. To date, designing and conducting new wet-lab experiments has been the only practicable means to experimentally falsify those hypothesized, conceptual mechanisms. Experimenting on synthetic analogues provides a powerful new means of discovering and testing the plausibility of mechanistic details. A traditional, inductive, PK model hypothesizes an explanation of patterns in PK data (24). The mathematics of PBPK models describe data features predicted to arise from conceptualized mechanisms, which are typically described in sketches and prose. As illustrated on the left side of Fig. 2, there are unverifiable, conceptual mappings between equations and envisioned mechanisms. The methods used by synthetic analogues, as exemplified by the four cases described in “Analogues: From In Vitro Tissues to Interacting Organs,” are different. They provide an independent, scientific means to challenge, explore, better understand, and improve any inductive mechanism and, importantly, the assumptions on which it rests.
Creating Synthetic Analogues and Defining Their Use
Models that begin spanning the gap in Fig. 1 are generalized object- and agent-oriented constructions.
Biomimetic agent-based analogues facilitate discovery and understanding of phenomena produced by systems of interacting components.
Important use: better understand disease mechanisms and their interactions with interventions.
By nesting agents and objects hierarchically, one can discover plausible upward and downward mechanistic linkages.
The biological mechanisms that generate system level phenomena are consequences of components at multiple levels interacting in parallel, primarily discretely, with other components in their local environment. Simulation of such behavior can be and has been achieved by adopting discrete event M&S methods. Any interaction can be stochastic. Making interactions stochastic simulates uncertainties and is a means to preserve ignorance. See (25, 26) for a generalized discussion of the advantages of using discrete event methods to model and simulate complex adaptive systems. Fisher and Henzinger (19) discuss how several formal, discrete event methods (Boolean networks, Petri nets (27), pi-calculi, interacting state machines, etc.) have been leveraged to gain mechanistic insight into biological phenomena. Advances in simulating complex biological phenomena have been accomplished using formal cellular automata (28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42) and cellular Potts models (13, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52).
Most biological components are spatially organized, semi-modular, and quasi-autonomous: they include organs, tissue functional units, cells, subcellular systems, and macromolecular complexes. Synthetic analogues must be capable of exhibiting those same attributes. Greater component autonomy, coupled with realistic yet abstract, spatially organized, biomimetic mechanisms, has been achieved using agent-based and agent-oriented methods (53, 54, 55). Because all of the preceding methods are based ultimately on object-oriented programming (in Glossary) methods, we suggest that analogues that begin spanning the gap in Fig. 1 will be considered generalized constructions in the object-oriented domain that uses agents (in Glossary). Consequently, we focus the following discussion on those methods. The four examples described in the sections that follow all use agents. In agent-based (in Glossary) modeling, quasi-autonomous, decision-making entities called agents are key components; see Appendix for available multi-agent M&S platforms. Other components, such as those representing specific compounds (biochemicals or xenobiotics), can be simple, reactive objects or properties of spaces. Reactive objects and agents follow sets of rules that govern their actions and interactions with other system components. In this context, an agent is a biomimetic object that can be quasi-autonomous; it has its own agenda and can schedule its own actions, much like we envision a cell or mitochondrion doing. When needed, an agent can change its operating logic. Agent-based modeling facilitates the production of systemic behaviors and attributes that arise from the purposeful interactions of changeable components. The resulting biomimetic analogues have advantages when attempting to understand and simulate phenomena produced by systems of interacting components, and that makes them prime candidates for bridging the gap in Fig. 1.
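To make the idea of an agent that schedules its own actions and can change its operating logic more concrete, the following is a minimal Python sketch. It is an illustration under stated assumptions, not the implementation used in any of the cited analogues: the names `Scheduler` and `CellAgent`, the switching probability of 0.3, and the delay range are all hypothetical.

```python
import heapq
import itertools
import random


class Scheduler:
    """Minimal discrete event queue: events are executed in time order."""

    def __init__(self):
        self._queue = []
        self._tie = itertools.count()  # tie-breaker for events at equal times
        self.now = 0.0

    def schedule(self, delay, action):
        heapq.heappush(self._queue, (self.now + delay, next(self._tie), action))

    def run(self, until):
        while self._queue and self._queue[0][0] <= until:
            self.now, _, action = heapq.heappop(self._queue)
            action()


class CellAgent:
    """Hypothetical quasi-autonomous agent that schedules its own actions."""

    def __init__(self, name, scheduler, rng):
        self.name = name
        self.scheduler = scheduler
        self.rng = rng
        self.log = []
        self.step = self.resting_logic  # current operating logic
        self.scheduler.schedule(1.0, self.act)

    def resting_logic(self):
        self.log.append((self.scheduler.now, "rest"))
        # Stochastic rule (assumed value): occasionally switch operating logic.
        if self.rng.random() < 0.3:
            self.step = self.active_logic

    def active_logic(self):
        self.log.append((self.scheduler.now, "act"))

    def act(self):
        self.step()
        # The agent itself, not a global loop, decides when it acts next.
        self.scheduler.schedule(self.rng.uniform(0.5, 1.5), self.act)


rng = random.Random(0)
sched = Scheduler()
cells = [CellAgent(f"cell{i}", sched, rng) for i in range(3)]
sched.run(until=10.0)
```

Each agent's log records when it acted and under which logic. Because every agent places its own next event on the queue, the agents act in parallel in simulated time without sharing a fixed time step.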
If we had analogues of the type just described, how would we use them in the context of drug discovery and development research? An important use would be to understand the mechanisms that generate disease-related phenomena and how compounds or formulations that interact with the mechanisms can alter those phenomena. Improved mechanistic knowledge will enable improved predictions, while helping to reduce requirements for new wet-lab experiments.
A feature of object-oriented analogues is that objects and agents can be either atomic or composite. Atomic components define the system’s level of resolution—its granularity (in Glossary). Granularity is the extent to which a system is subdivided, with the smallest components being atomic. An atomic object has no internal structure and so cannot be subdivided—it simply uses its assigned logic. Granularity is also the level of specificity or detail with which system content is described: the more fine-grained, the more specific. Objects, both atomic and complex, are pluggable and can be replaced (as distinct from being subdivided) with more fine-grained, composite components that exhibit the same behaviors within the analogue under the same conditions. That replacement can even take place during a simulation. These components can exhibit hierarchical nesting, which makes it feasible to use analogues of this class to begin discovering plausible upward and downward linkages that are needed to enable instantiating (in Glossary) details of genotype-phenotype linkage. When the nested components are relationally grounded, one can avoid many of the multi-scale problems that plague metrically grounded, equation-based, inductive models. The phenomena emerging from mechanisms at one level can be used as input at another level. Greater nesting means more components, and that means more interactions and more simulation time to process, document, and record those interactions. In order to maintain parsimony, analogues should be designed with components that are just fine-grained enough to produce targeted phenomena and achieve the analogue’s specified uses.
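The pluggability described above can be sketched as a composite pattern in which an atomic component and a finer-grained composite expose the same behavior, so that one can replace the other even during a run. This is a minimal sketch; the class names (`AtomicUnit`, `CompositeUnit`, `Analogue`) and the clearance fractions are hypothetical and chosen only for illustration.

```python
class AtomicUnit:
    """Atomic component: no internal structure, only its assigned logic."""

    def __init__(self, fraction):
        self.fraction = fraction  # fraction of the incoming amount removed

    def clear(self, amount):
        return amount * self.fraction


class CompositeUnit:
    """Composite component: nests finer-grained units applied in sequence."""

    def __init__(self, parts):
        self.parts = parts

    def clear(self, amount):
        remaining = amount
        for part in self.parts:
            remaining -= part.clear(remaining)
        return amount - remaining


class Analogue:
    """Hypothetical container that depends only on the clear() behavior."""

    def __init__(self, unit):
        self.unit = unit

    def outflow(self, inflow):
        return inflow - self.unit.clear(inflow)


coarse = AtomicUnit(0.75)
fine = CompositeUnit([AtomicUnit(0.5), AtomicUnit(0.5)])

analogue = Analogue(coarse)
before = analogue.outflow(8.0)

# Plug in the finer-grained component; the rest of the analogue is unchanged.
analogue.unit = fine
after = analogue.outflow(8.0)
```

Here the two nested units each remove half of what reaches them, which reproduces the 75% removal of the coarse unit, so `before` and `after` agree. In practice the match between coarse and fine components would be established by validation rather than by construction.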
Representing Chemical Entity Attributes and Dynamics Within Biomimetic Analogues
Components base their actions on information presented by the mobile objects (compounds) they encounter.
Each type of component–compound interaction is a simple micro-mechanism.
Components use simple logic to tailor their micro-mechanism to a subset of a compound’s properties.
Recent reviews of in silico prediction of absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties and of quantitative structure-activity relationship (QSAR) methods chronicle advances in technical sophistication, plus increased use of consensus modeling. They also point out pitfalls of complicated correlation methods. Satisfaction with the quality of PCP predictions is counterbalanced by disappointment in predictions of organism-based properties for new compounds (56, 57, 58, 59, 60, 61, 62). Johnson reminds us (61) that many, nearly equivalent, correlational models are possible, making it easy to choose a wrong model for the next new compound. Furthermore, many of the models have no grounding in biological reality. A take-home message is that useful, reliable ADMET predictions of organism-based attributes of future compounds will need to be based on mechanistic insight. The representations of compounds within synthetic analogues, along with their interactions with system components, provide a new, potentially powerful approach to thinking about organism-based, structure-activity relationships.
Precise knowledge of the stoichiometry of biological component–compound interactions is rarely if ever available. Uncertainties at all levels are common. An advantage of discrete event methods is that both knowledge and ignorance (uncertainties) can be represented concurrently. Precise yet poorly informed assumptions can be avoided. The effective stoichiometry of influential, low level (fine-grained) interactions involving compounds can be represented at almost any convenient granularity level below that of the targeted phenomena, but the mappings from objects representing compounds to their referent molecules are not 1:1. Early in the analogue development process a scientifically justified level (moving down from targeted phenotypic attributes) at which to initially represent compounds is the first at which biological functional units are encountered, and that is typically a coarse-grained representation.
The presence of a compound can be represented as a property of a space or as mobile objects. We focus on the latter. Mobile objects representing chemical entities can map to an arbitrary number of molecules (see Appendix, Representing Compounds). During simulations such objects are typically passive. The agency (interaction-specific programming logic) to determine the outcome of a component–compound interaction typically resides with the biomimetic component or process; in some cases more than one component or process can be involved. An important feature of the synthetic approach from a biopharmaceutical sciences perspective is that each mobile object carries with it all the information that the empowered (active) component, a membrane transporter object for example, needs in order to adjust and reparameterize its interaction logic. That information can include (or not) selected PCPs that domain experts understand intuitively (molecular weight, logP, measures of ionization state, etc.) along with bioactivity attributes (the chemical entity is a CYP 2C9 substrate, etc.). In that way, during a simulation an analogue system can accommodate objects representing any number of compounds, which of course is ideal for studying and exploring drug–drug interactions (63, 64, 65). A component empowered to interact can use compound identification information to adjust its engagement logic.
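The division of labor described above, passive mobile objects that carry information and active components that hold the interaction logic, can be sketched as follows. This is an illustrative sketch only: the `Compound` fields, the `Transporter` class, and the rule relating logP to transport probability are hypothetical assumptions, not values or logic taken from the cited analogues.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    """Passive mobile object; one instance may map to many molecules."""
    name: str
    log_p: float          # a physicochemical property carried by the object
    is_substrate: bool    # hypothetical bioactivity attribute


class Transporter:
    """Active component: the agency for the interaction resides here."""

    def __init__(self, rng):
        self.rng = rng

    def transport_probability(self, compound):
        # Hypothetical rule: non-substrates are never transported; for
        # substrates the probability rises with logP, clipped to [0.1, 0.9].
        if not compound.is_substrate:
            return 0.0
        return min(0.9, max(0.1, 0.5 + 0.1 * compound.log_p))

    def encounter(self, compound):
        """Return True if this encounter results in transport."""
        return self.rng.random() < self.transport_probability(compound)


rng = random.Random(1)
transporter = Transporter(rng)
drug_a = Compound("drug_a", log_p=2.0, is_substrate=True)
drug_b = Compound("drug_b", log_p=2.0, is_substrate=False)

# Both compounds are present in the same simulation; the transporter
# distinguishes them using the information each object carries.
counts = {"drug_a": 0, "drug_b": 0}
for _ in range(1000):
    for compound in (drug_a, drug_b):
        if transporter.encounter(compound):
            counts[compound.name] += 1
```

Because the compound objects are passive, adding a third compound requires no change to the transporter: it reparameterizes its logic from whatever properties the new object presents.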
Early in an analogue’s development, micro-mechanistic knowledge is insufficient to parameterize component–compound interactions a priori using compound-specific information. Micro-mechanism logic must be tuned individually for each of the first few compounds for which referent data are available. As the set of compounds enlarges, inductive modeling methods within the analogue’s larger framework can be used, as in (66,67), to establish quantitative mappings from patterns in chemical entity information to patterns in parameter values of tuned component–compound interaction. Such a mapping will be the analogue’s counterpart to a structure-activity relationship. In subsequent rounds of analogue use and refinement, the new knowledge contained in that relationship can be used, in some cases automatically, to provide an initial analogue parameterization for the next chemical entity to be studied. Simulations using those parameterizations will stand as crude predictions of the new compound’s targeted attributes. The limited (artificial) intelligence available to analogue components at that stage can be improved systematically as the analogue is iteratively validated against wet-lab data for additional compounds. That primitive intelligence can be shared between different analogues and models within an organization’s larger M&S framework.
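The workflow above, tuning micro-mechanism parameters compound by compound and then inducing a mapping from compound information to those tuned values, can be sketched with a simple least-squares fit. Everything specific here is an assumption for illustration: the compound names, the logP values, and the tuned probabilities are made-up numbers, not data from (66, 67), and a straight line stands in for whatever inductive method an actual analogue would use.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical, illustrative values only: (logP, tuned interaction probability)
# for the first few compounds whose micro-mechanisms were tuned individually.
tuned = {
    "cmpd_1": (0.0, 0.2),
    "cmpd_2": (1.0, 0.3),
    "cmpd_3": (2.0, 0.4),
    "cmpd_4": (3.0, 0.5),
}

log_ps = [log_p for log_p, _ in tuned.values()]
params = [param for _, param in tuned.values()]
slope, intercept = fit_line(log_ps, params)


def initial_parameter(log_p):
    """Crude starting value for a new compound, clipped to a valid probability."""
    return min(1.0, max(0.0, slope * log_p + intercept))


# Initial parameterization for the next (hypothetical) compound, logP = 1.5.
initial = initial_parameter(1.5)
```

The value produced is only a starting point: as the text notes, simulations run with it stand as crude predictions, and the mapping is refined as the analogue is validated against data for additional compounds.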
Each type of component–compound interaction is a micro-mechanism. The micro-phenomenon that results is typically simple: transport occurred (or not), metabolism occurred (or not), spatial relocation occurred (or not), etc. When such phenomena are studied in vitro in simple systems, one often observes that just a few molecular descriptors account for the majority of the data variance for the compounds studied. Even when the more complex ADMET properties are analyzed collectively, simple, interpretable rules of thumb emerge (68). Given the simple and stochastic nature of most synthetic micro-mechanisms, a small change in the PCP space can correspond to a negligible or modest change in the micro-phenomenon, as well as the parameterization of its logic. Computationally simple methods are expected to suffice in predicting acceptable, micro-mechanism parameter values. The precision of estimated micro-mechanism parameterizations can be expected to vary randomly across the analogue. Nevertheless, the predicted, systemic target phenomena can still be accurate enough for R&D decision-making; see (67) for an example.
ANALOGUES: FROM IN VITRO TISSUES TO INTERACTING ORGANS
The evolution of object- and agent-oriented biological system models over the past decade is interesting but outside the scope of this article. Currently, there are no analogues that bridge the gap in Fig. 1. Analogues capable of beginning to bridge the gap began appearing only recently. Because model use has been different in each case, a straightforward comparison of what those different models do (or do not do) and how they do it would be misleading. Rather, the objective here is to provide examples of analogues that can evolve to become gap-spanning, scientifically useful in silico systems. We sought examples that would enable readers to envision analogues that could be useful within their own R&D domains. With that in mind, we limited examples to those that included drugs (one case) or to which objects representing drugs could be added (three cases) without requiring system reengineering. Because this field is relatively new, all four examples are early stage and still somewhat abstract. However, it is easy to imagine variants of each, in parallel, becoming incrementally more sophisticated and realistic.
Filling the Need for an Epithelial Cell Culture Analogue That Has Its Own Phenotype and Plausible Operating Principles
Even though their phenotypes are complex, in vitro cell cultures are among the simplest biological systems. Early examples of in silico explorations into mechanisms in vitro using CA and CPMs include (14,69,70). Thereafter, experimentation with agent-oriented methods increased. Within the past three years, considerable progress has been made improving in vitro mechanistic insight using primarily more sophisticated, cell-centered (71), and agent-oriented methods (53,64,72, 73, 74, 75, 76, 77), including exploration of events in a crowded virtual cytoplasm (78). See Supplemental Material for a thorough listing of research progress that used variations of the synthetic method during the intervening years.
There are several agent-based, cell-centered, synthetic analogues to which drug objects could be added. Walker et al. (79) provide an early example representing cell–cell mechanisms of interaction using a synthetic, agent-based approach. Bindschadler and McGrath (72) achieved new insight into mechanisms of wound healing using an agent-based approach in which components were grounded purposefully to metric spaces, enabling direct comparison of simulated and wet-lab measurements. Zhang et al. (80) developed a sophisticated, 3D, multi-scale, agent-oriented analogue of solid tumor growth in which agent logic was controlled in part by conceptualizations of molecular details and gene-protein interaction profiles. Key features were grounded to metric spaces. The example that follows, even though drawn from our own work, was selected because the system uses relational grounding (discussed in “How a Model is Grounded Impacts How It Can and Should be Used”).
The analogue is a simple synthetic analogue of human alveolar type II (AT II) epithelial cell cultures (81) similar to those used in drug discovery and development research (82, 83, 84), where selected system attributes are measured in the presence and absence of compounds of interest. Below, the discussion focuses on attributes in the absence of compounds. The targeted attributes are aspects of AT II cystogenesis in 3D matrix, which recapitulates several basic features of mammalian epithelial morphogenesis (85). To gain insight into the process, Kim et al. (81) created a concrete, standalone, in silico “cultured cell” system that had its own unique phenotype. The system was then refined iteratively so that its phenotype had the relationship to in vitro cell phenotypes that is illustrated in Fig. 3. A case was made that when mappings from in silico components and interactions to their biological counterparts are intuitive and clear, even if abstract, then one can hypothesize that the causal events in silico have in vitro counterparts. Accepting the reasonableness of these mappings enabled making an important, additional claim: a mapping will also exist between in silico operating principles and cellular operating principles.
Representing an Epithelial Cell Culture as a Dynamic System of Coarse-Grained, Interacting Components
Detailed descriptions and methods are available in (81,86). Building blocks and their functions, along with assembly methods were proposed so that components and the assembled analogue mapped logically to wet-lab counterparts. Data accumulated during executions were compared against referent wet-lab data. When the analogue failed validation, it was revised and tested iteratively until pre-specified behaviors were achieved. The assembled components and their operating methods stood as a hypothesis: these mechanisms will produce targeted characteristics. Execution and analysis of results tested that hypothesis. Hereafter, to clearly distinguish in silico components and processes from corresponding wet-lab structures and processes, we use small caps when referring to the in silico counterparts.
To produce the epithelial cell analogue, a cell culture was conceptually abstracted into four components. Cells, matrix (media containing matrix), free space (matrix-free media), and a space to contain them had in silico counterparts: cell, matrix, free space, and culture. Matrix and free space were passive objects. Matrix mapped to a cell-sized volume of extracellular matrix. A free space mapped to a similarly sized volume that was essentially free of cells and matrix elements. Cells were quasi-autonomous agents, which mimicked specified behaviors of AT II cells in cultures. Each used a set of rules or decision logic to interact with their local environment. When two or more cells attached, they acted quasi-autonomously, independent of individual cell activities. The culture used a standard 2D hexagonal grid to provide the space in which its objects resided and moved about. The simulation was executed in discrete time steps, during which each cell, in pseudo-random order, took actions based on its internal state and external environment. Having objects update pseudo-randomly simulated the parallel operation of cells in culture and the nondeterminism fundamental to living systems, while building in a controllable degree of uncertainty.
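The discrete-time, pseudo-random update scheme described above can be illustrated with a minimal sketch. It is not the implementation of Kim et al. (81): the function names, the toroidal grid boundary, the omission of matrix objects, and the single movement rule are all assumptions made only for illustration.

```python
import random

# Offsets to the six neighbors of a site on a hexagonal grid (axial coordinates).
HEX_NEIGHBORS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]


def make_culture(size, n_cells, rng):
    """Place n_cells cell agents at distinct random sites of a size x size grid.

    Sites missing from the returned dict are treated as free space.
    Matrix objects are omitted from this sketch.
    """
    sites = [(q, r) for q in range(size) for r in range(size)]
    return {site: "cell" for site in rng.sample(sites, n_cells)}


def step(grid, size, rng):
    """Advance the culture one time step, updating cells in pseudo-random order."""
    cells = [site for site, kind in grid.items() if kind == "cell"]
    rng.shuffle(cells)  # a new update order on every time step
    for q, r in cells:
        # Assumed wrap-around (toroidal) boundary.
        neighbors = [((q + dq) % size, (r + dr) % size) for dq, dr in HEX_NEIGHBORS]
        # Hypothetical rule: a cell touching another cell stays attached.
        if any(grid.get(n) == "cell" for n in neighbors):
            continue
        # Otherwise the cell migrates to a randomly chosen free neighbor site.
        free = [n for n in neighbors if n not in grid]
        if free:
            del grid[(q, r)]
            grid[rng.choice(free)] = "cell"
```

Passing an explicit, seeded random.Random instance makes the degree of uncertainty controllable: the same seed reproduces the same run, while different seeds give different but statistically similar runs.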
Hypothesizing and Testing Plausible Mechanisms of AT II Cystogenesis
In vitro, the average ALC size increased monotonically with initial cell density. Similar patterns were observed during simulations: mean values and their standard deviations are graphed in Fig. 5A. In sparse cultures with < 1,000 cells, which mapped to ∼1 × 10⁴ cells/cm² in vitro, cells formed small ALCs with diameters that were essentially the same as the referent mean diameter. In denser cultures, larger ALC diameters were observed. Changes in the number of clusters, as a function of initial cell density, were the same in vitro and in simulations (Fig. 5B). AT II cell migration speed was an important determinant of aggregation and ALC formation in 3D cultures. Intuitively, one would expect elevating cell speed to produce larger ALCs: doing so would increase the cell collision rate and thus accelerate aggregation. Conducting such experiments in vitro is infeasible because a minimum level of extracellular matrix is required to sustain normal AT II cell behaviors. Testing that hypothesis for AT II analogues was straightforward and was achieved by parametrically changing cell migration speed. Slowing migration speed was expected to correlate with increased extracellular matrix densities. As shown in Fig. 5E, the results predicted a dramatic reduction in ALC formation when the extracellular matrix is stiffened. The prediction was tested and confirmed in vitro at a high initial cell density (85).
Simulating Tissue Responses In Vivo
Achieving Micro-mechanistic Insight to Ischemic Microvascular Injury
The work by Bailey et al. (87) demonstrates the potential of using synthetic models for hypothesis generation and knowledge discovery, and is also an excellent example of multiple iterative cycles of in silico experimentation coupled with wet-lab experimentation. They constructed a multi-cell, tissue-level, agent-oriented model of human adipose-derived stromal cell (hASC) trafficking through the microvasculature of skeletal muscle tissue after acute ischemia.
After ischemic injury to microvasculature, blood is re-routed to adjacent microvascular networks causing swelling and increases in wall shear stress and hydrostatic pressures at the adjacent site. The changes in wall shear stress and circumferential stress activate endothelial cells and perivascular cells, initiating the recruitment of circulating cells into the site of injury. Activated endothelial cells increase their surface expression of important cellular adhesion molecules that enable the circulating cells to home to the site of ischemic injury, adhere to the endothelium, extravasate, and incorporate into the injured tissue. In addition, activated endothelial cells and perivascular cells secrete a number of inflammatory chemokines, cytokines, and growth factors. These further activate endothelial and perivascular cells, as well as activating circulating cells in order to promote their recruitment into the site of ischemic injury. Intravenous delivery of hASC has been shown to help repair and regenerate injured tissue from ischemia. An analogue was used to gain mechanistic insight into that process.
Bailey et al. constructed an agent-based model to identify potential bottlenecks that may limit the efficiency of these therapeutic cells being recruited into the site of ischemic injury after intravenous injection. It was their hope that better clinical outcomes could be achieved through increasing the number of incorporated hASCs.
Individual in silico cell behaviors were determined from a set of over 150 rules that were derived from independent literature. Each endothelial cell, monocyte, and hasc could, in a binary manner, either be positive or negative in their expression of each cellular adhesion molecule. In a similar binary manner, each endothelial cell, monocyte, and tissue resident macrophage could either be in a positive or negative state of secretion for each of the chemokines and cytokines. Thus, each cell could be unique. A cell’s state of cellular adhesion molecule expression and state of chemokine and cytokine secretion were also dynamic, and depended upon the chemokine and cytokine secretion states of neighboring cells (Fig. 6D).
Whether a circulating monocyte or hasc rolled or adhered was dependent on a combination of factors. It had to experience the correct combination of cellular adhesion molecule expression states and chemokine secretion states from a nearby endothelial cell, and it also had to experience a wall shear stress below a certain threshold level. If the cell adhered for more than a specified number of time steps, it could then transmigrate into the tissue space.
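The adhesion and transmigration logic just described can be sketched as a small state update. This is an illustrative reading of the text, not the rule set of Bailey et al. (87): the class and parameter names are hypothetical, the many adhesion molecule and chemokine states are collapsed into a single boolean, and resetting the dwell counter on detachment is an assumption.

```python
from dataclasses import dataclass


@dataclass
class CirculatingCell:
    """A monocyte or hasc agent (field names are hypothetical)."""
    state: str = "flowing"  # "flowing", "adhered", or "transmigrated"
    adhered_steps: int = 0


def update(cell, signals_ok, wall_shear_stress, shear_threshold, dwell_steps):
    """Apply one time step of the adhesion rule to a circulating cell.

    signals_ok stands in for "the nearby endothelial cell presents the correct
    combination of adhesion molecule and chemokine states".
    """
    if cell.state == "transmigrated":
        return cell  # the cell has already entered the tissue space
    if signals_ok and wall_shear_stress < shear_threshold:
        cell.state = "adhered"
        cell.adhered_steps += 1
        # Adhering for more than dwell_steps time steps permits transmigration.
        if cell.adhered_steps > dwell_steps:
            cell.state = "transmigrated"
    else:
        # Assumption: losing either condition detaches the cell and resets the count.
        cell.state = "flowing"
        cell.adhered_steps = 0
    return cell
```

Because both conditions must hold on every step, raising wall shear stress above the threshold at any point interrupts adhesion, which is one way a simulated increase in feeding pressure could reduce extravasation in such a sketch.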
Bailey et al. simulated a microvascular network under normal conditions and after ischemic injury. The microvascular network was representative of a microvascular network adjacent to the site of ischemic injury, where blood flow is increased due to its redistribution. When simulating ischemic injury, the pressure at the feeding arteriole was increased by 25% and the resultant hemodynamic properties were re-calculated.
Simulation Experiments Implicated an Additional Cellular Adhesion Molecule
The analogue was verified without hasc by performing a series of in silico knockout experiments, comparing simulated and wet-lab phenomena where data were available, and by showing that the analogue could mimic three aspects of ischemic injury: (1) increase in wall shear stress and network rates; (2) up-regulation of specific cellular adhesion molecule expression by the endothelium; and (3) increased secretion levels of chemokines and cytokines. Additionally, two key monocyte properties had in silico counterparts. Following verification, they simulated hASC trafficking after intravenous injection before and after ischemic injury. Lower than expected levels of hasc extravasation were observed, and that led to a re-evaluation of their rule-set. hASCs do not express PSGL-1; however, they hypothesized that there may exist an additional cellular adhesion molecule used for rolling, similar to PSGL-1, for which there was no counterpart in their analogue.
To explore that hypothesis, they included an additional cellular adhesion molecule, termed SBM-X, with properties similar to PSGL-1 and determined whether so doing allowed the analogue to more closely mimic in vivo experimental results. Simulations showed that the inclusion of the new molecule SBM-X was necessary to achieve targeted levels of hasc extravasation. They subsequently tested their hypothesis in vitro and showed that small fractions of hASCs are able to roll on P-selectin even though they do not express PSGL-1. They proposed that the cellular adhesion molecule CD24 is a likely SBM-X candidate.
In Silico Livers
An In Silico Liver is Constructed by Assembling Simple Components into a Larger, Multi-Level, Biomimetic Structure
In Silico Livers (ISLs) are advanced examples of biomimetic analogues designed specifically to help bridge the gap in Fig. 1. An ISL is not intended to be a model having a temporally stable structure. It is designed to be altered easily in order to explore many equally plausible mechanistic explanations for disposition-related observations. The ISL is an assembly of componentized mechanisms: purposefully separated and abstracted aspects of hepatic form, space, and organization interacting with compounds. Each component mechanism has been unraveled from the complex whole of the hepatic-drug phenotype; it has its own unique phenotype, but that phenotype is much simpler than that of the entire lobule mechanism.
ISL-targeted attributes were divided into three classes (88): 1) compound-specific time course data, 2) microanatomical details (heterogeneous sinusoid structures, perivenous cross-connections between sinusoids, etc.), and 3) experiment details (perfusion is single-pass or not, administered compounds pass through catheters and large vessels before entering lobular portal vein tracts, etc.). A primary use has been to use and reuse similar ISL structures to provide plausible micro-mechanistic explanations of hepatic disposition data for many drugs.
The SS structure shown in Fig. 7C maps to a unit of sinusoid function that includes spatial features. An SS is a discretized, tube-like structure composed of a blood “core” surrounded by three identically sized 2D grids, which together simulate a 3D structure. It can be replaced by a more realistic 3D grid when doing so is required to achieve some targeted attribute. Two SS classes, S1 and S2, are specified to provide sufficient variety of compound travel paths. Compared to an S2, an S1 on average has a shorter internal path length and a smaller surface-to-volume ratio.
ISL parameters are grouped into three categories: 1) those that control the lobule graph, 2) those that control SS structures, and 3) those that control lobular component interactions with compounds. Compounds are represented using objects that move through the lobule and interact with encountered SS features. A typical compound maps to many drug molecules (see Representing Compounds in the Appendix). A compound’s behavior is determined by the PCPs of its referent compound, along with the lobule and SS features encountered during its unique trek from PV to CV. During a simulation cycle, an encountered component “reads” the information carried by a compound and then uses it to customize its response, in compliance with its parameter values, following some pre-specified logic. That feature enables multiple, different compounds to be percolating through SS features during the same experiment.
Objects called cells (Fig. 7D) map to an unspecified number of cells. They function as containers for other objects. A grid location and its container are the current limit of spatial resolution. Cells contain a stochastic parameter-controlled number of binders in a well-stirred space. Binders map to transporters, enzymes, lysosomes, and other cellular material that binds or sequesters drug molecules. In the cited work, a binder within an endothelial cell only bound and later released a compound. A binder within a hepatocyte is called an enzyme because it can bind a substrate compound and either release or metabolize it. Additional objects can be added as needed, as in (65), to represent uptake and efflux transporters, specialized enzymes, and pharmacological targets, without compromising function of objects already present. Because of the stochastic nature of ISL simulations, each in silico experiment generates a slightly different outflow profile.
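A minimal sketch of how a binder might “read” the information carried by a compound object follows. It is not ISL code: the class names and the two per-compound probabilities (bind_prob, metabolize_prob) are hypothetical stand-ins for PCP-sensitive parameter values.

```python
import random
from dataclasses import dataclass


@dataclass
class Compound:
    """A mobile object that maps to many drug molecules."""
    name: str
    bind_prob: float        # hypothetical PCP-derived binding probability
    metabolize_prob: float  # hypothetical PCP-derived metabolism probability


class Binder:
    """Binds and later releases a compound; an enzyme may metabolize it instead."""

    def __init__(self, can_metabolize):
        self.can_metabolize = can_metabolize  # True for an enzyme in a hepatocyte
        self.bound = None

    def encounter(self, compound, rng):
        """Read the compound's information and decide stochastically whether to bind."""
        if self.bound is None and rng.random() < compound.bind_prob:
            self.bound = compound
            return True
        return False

    def resolve(self, rng):
        """Release or metabolize the bound compound; return (outcome, compound) or None."""
        if self.bound is None:
            return None
        compound, self.bound = self.bound, None
        if self.can_metabolize and rng.random() < compound.metabolize_prob:
            return ("metabolized", compound)
        return ("released", compound)
```

Because each compound carries its own values, the same binder can respond differently to different compounds within one experiment, and the random draws are the source of run-to-run variation in outflow profiles.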
Experimenting on ISLs to Better Understand Micro-mechanisms and Predict Hepatic Drug Disposition
In (90), a single parameterized ISL structure was arrived at iteratively, and held constant for antipyrine, atenolol, labetalol, diltiazem, and sucrose. Parameters sensitive to compound-specific PCPs were tuned so that ISL outflow profiles were validated separately and together against rat perfused liver outflow profiles. Each ISL component interacted uniquely with each of the five compounds. The consequences of ISL parameter changes on outflow profiles were explored. Selected changes altered outflow profiles in ways consistent with knowledge of hepatic anatomy and physiology and compound PCPs. That level of validation enabled the authors to posit that static and dynamic ISL micro-mechanistic details, although abstract, mapped realistically to hepatic mechanistic details. Because ISL mechanisms are built from finer-grained components, there is precise control over conflation. The causal basis is present in the component–compound interaction logic (axioms, rules). An expectation has been that at some level of granularity, the complexity will be sufficiently unraveled so that the logic for a given micro-mechanism (its “causal basis”) will rely heavily on only a few easily specified compound and biological attributes.
Subsequently, in (91), quantitative mappings were established between drug PCPs and ISL parameter values for the above four sets of drug PCPs and the corresponding sets of PCP-sensitive, ISL parameter values. Those relationships were then used to predict PCP-sensitive, ISL parameter values for prazosin and propranolol given only their PCPs. Relationships were established using three different methods: 1) a simple linear correlation method, 2) the Fuzzy c-Means algorithm, and 3) a simple artificial neural network. Each relationship was used separately to predict ISL parameter values for prazosin and propranolol given their PCPs. Those values were then used to predict disposition details for the two drugs. All predicted disposition profiles were judged reasonable (well within a factor of two of referent profile data). The parameter values predicted using the artificial neural network gave the most precise results. More noteworthy, however, was that the simple linear correlation method did surprisingly well. That is because the ISL is an assembly of micro-mechanisms where each is influenced most by a small subset of PCPs. The results suggest that when using the synthetic method of assembling separated micro-mechanisms, a parameter estimation method that reasonably quantifies the relative differences between compound-specific behaviors at the level of detail represented by those micro-mechanisms will provide useful, ballpark estimates of hepatic disposition. That bodes well for using synthetic analogues for predicting PK properties, given only molecular structure information.
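The simplest of the three approaches, a linear relationship between one PCP and one PCP-sensitive parameter, can be sketched as an ordinary least-squares fit. This illustrates the general idea only; it is not the procedure of (91). The descriptor, the parameter, and all numeric values below are invented for illustration and are not data from the cited studies.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def within_factor_of_two(predicted, referent):
    """Check the 'within a factor of two' acceptance criterion for positive values."""
    return referent / 2 <= predicted <= referent * 2


# Hypothetical training set: one descriptor value and one tuned parameter value
# for each of four compounds (invented numbers, for illustration only).
descriptor = [0.2, 0.9, 1.8, 2.7]
tuned_param = [0.10, 0.22, 0.41, 0.58]

slope, intercept = fit_line(descriptor, tuned_param)

# Predict the parameter for a new compound from its descriptor alone.
new_descriptor = 1.3
predicted_param = slope * new_descriptor + intercept
```

The predicted value would then parameterize the corresponding micro-mechanism, and the resulting simulated outflow profile, rather than the parameter itself, would be compared against referent data.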
Because the causal, mechanistic differences occur at the micro-mechanism level, it is easy to morph—transform—a normal ISL into a diseased ISL (Fig. 8B). The morphing stands as a hypothesis for how and where disease may have altered hepatic micro-architectural features and processes. The transformation methods are generalizable. For example, a validated analogue of one in vitro cell culture system can be morphed into a different analogue representing a second in vitro cell culture system (and vice versa). The process will present a dynamic hypothesis of where and how compound interaction properties differ between analogues.
It is impractical to obtain liver perfusion data for large numbers of compounds. However, in vitro disposition properties can be measured using cultured hepatocytes and other cell types. An advantage of the componentized analogue approach is that an ISL validated for several compounds can be re-used to obtain ballpark estimates of hepatic disposition properties of other compounds by using in vitro data and taking advantage of ISL component replacement capabilities. Sheikh-Bahaei et al. (66) validated hepatocyte monolayers for four different compounds. The referent system was hepatocytes in a sandwich-culture system that enabled estimating biliary excretion. Their hepatocytes were based on the same container object concept used to create ISL hepatocytes. That similarity opens the door to unplugging the hepatocytes from an ISL that has been validated for several compounds and replacing them with hepatocytes that have been tuned and validated in vitro for one or more other compounds for which no liver perfusion data is (or will be) available. Following adjustments based on cross-model validation studies, the outflow profile from such an ISL, given the new compound’s PCPs, will stand as a ballpark prediction of that drug’s hepatic disposition.
An Analogue of Interacting Organs
Unraveling the Mechanisms of Systemic Inflammatory Response Syndrome and Multiple Organ Failure
Gary An (18) engineered a multi-level, two-organ analogue (gut and lung) to explore plausible causal mechanisms responsible for the clinical manifestation of multi-scale disordered acute inflammation, termed systemic inflammatory response syndrome (SIRS) and multiple organ failure (MOF), and how they may respond to therapeutic interventions (92). Following iterative refinement and parameter tuning, An discovered coarse-grained, multi-level analogue mechanisms that achieved several targeted attributes related to SIRS and MOF. Simulations used abstract and discrete analogues of gut and lung, each comprising fixed cell-mimetic agents forming endothelial and epithelial tissues, mobile cell-mimetic agents corresponding to inflammatory cells, and mobile objects that mapped to pro-inflammatory mediators. Although exploration of the consequences of drug interventions was not part of the latest study, system design enables adding drug-mimetic objects when the need arises.
Component interactions have biological counterparts that extend from intracellular mechanisms to clinically observed phenomena in the intensive care setting. Different types of cell agents encapsulate specific mechanistic knowledge extracted from in vitro experiments. The model was used to explore the likelihood of the two prevailing hypotheses about the nature of disordered systemic inflammation: that it is a disease of the endothelium or that it is a disease of epithelial barrier function. The former paradigm points to the endothelial surface as the primary communication and interaction surface between the body’s tissues and the blood, which carries inflammatory cells and mediators. However, there is also compelling evidence that organ dysfunction related to inflammation is primarily manifest in a failure of epithelial barrier function. An’s multi-level analogue of interacting gut and lung enabled exploration of plausible mechanisms that unify those two hypotheses. Simulations produced qualitative phenomena that mimicked attributes of multi-organ failure: severe inflammatory insult to one organ led to both organs failing together.
Abstract, Coarse-Grained Representation of Epithelial and Endothelial Tissues Responding to Inflammation
In deference to the parsimony guideline, each organ was composed of an epithelial surface, which determined organ integrity, and an endothelial/blood interface, which provided for initiation and propagation of inflammation. The epithelial cell layer was validated separately against data from in vitro cell monolayer models used to study epithelial barrier permeability. The epithelial cell layer was concatenated with the endothelial/inflammatory cell layers to produce an abstract, coarse-grained gut analogue. It was separately validated against observations made on in vivo wet-lab models of the inflammatory response of the gut to ischemia. Finally, the gut organ and a similarly constructed pulmonary organ were combined to create a gut-pulmonary axis analogue, the behavior of which was expected to map to in vivo and clinical observations on the crosstalk between these two organ systems.
The interaction of a layer of endothelial cells with a population of different inflammatory cells was described in (92,93). The latter included neutrophils, monocytes, T-cells, etc. All cells were agents. The analogue used objects that mapped to specific mediators, including endotoxin, tumor necrosis factor (TNF), and IL-1. Other objects mapped to receptors, including L-selectin, ICAM, TNF receptors, and IL-1 receptors. The interaction of those objects mapped to signals being transferred between inflammatory cells and endothelial cells. The rules and operating principles used by cells were abstract yet strove to reflect current knowledge: positive and negative feedback relationships were implemented using simple arithmetic relationships. Receptor status was expressed as either on or off. The system enabled simulating the dynamics of the innate immune response and exploring plausible mechanisms of systemic inflammatory response syndrome due to a disease of endothelial cells.
Epithelial cell agents are the smallest functional unit of the gut–lung analogue. Each agent maps to a single epithelial cell in the context of its response to inflammatory mediators, including nitric oxide, and pro-inflammatory cytokines, including TNF and IL-1. Any epithelial cell can form a tight junction with any of its eight epithelial cell neighbors. The process required functional and localized tight junction proteins and could be impaired by inflammation. That analogue mechanism mapped to the production and localization of tight junction proteins being impaired in a pro-inflammatory cytokine milieu. The epithelial component was first validated against in vitro data. That validated component was used directly as its in vivo counterpart. Variables within each cell controlled levels of tight junction components along with intracellular, pro-inflammatory signals. The phenomena of pro-inflammatory signals impairing tight junction function, and tight junction dysfunction leading to epithelial barrier failure were targeted attributes. Simple rules specified factor creation and how those factors interacted. Component levels and rules were tuned to achieve a satisfactory degree of similarity between analogue phenomena during simulated treatments and corresponding reported observations. When tight junction formation was impaired (or its components inhibited), a factor permeated the epithelial cell layer in a process that mapped to epithelial barrier failure. The same analogue mechanism was used with lung epithelial cells in a process that mapped to pulmonary edema, impaired oxygenation, and further injury.
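The tight junction mechanism can be sketched with a grid of epithelial cell agents, each holding one tight junction protein level. This is a simplified reading of the description above, not An's implementation: the function names, the single protein variable per cell, the threshold test, and the arithmetic update are assumptions.

```python
# Offsets to the eight neighbors of a cell on a square grid.
MOORE = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def update_tj(level, cytokine, synthesis, damage):
    """Simple arithmetic rule: synthesis raises the tight junction protein level,
    pro-inflammatory cytokine lowers it; the result is clamped to [0, 1]."""
    return min(1.0, max(0.0, level + synthesis - damage * cytokine))


def junction_intact(level_a, level_b, threshold):
    """Assumption: a junction forms only if both neighbors have enough protein."""
    return level_a >= threshold and level_b >= threshold


def failed_fraction(tj, threshold):
    """Fraction of neighboring cell pairs that lack an intact tight junction.

    tj is a 2D list of tight junction protein levels, one per epithelial cell.
    Each pair is visited from both sides, which leaves the fraction unchanged.
    """
    rows, cols = len(tj), len(tj[0])
    total = failed = 0
    for r in range(rows):
        for c in range(cols):
            for dr, dc in MOORE:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    total += 1
                    if not junction_intact(tj[r][c], tj[nr][nc], threshold):
                        failed += 1
    return failed / total if total else 0.0
```

In such a sketch, a rising failed fraction would be the quantity that maps to epithelial barrier failure, with the permeating factor passing wherever junctions are missing.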
Linking gut and lung Analogues into a Higher Level, Interacting System of Organs
An inherent property of synthetic system models is composability (the linkage or establishment of component inter-relationships) downward, by nesting components within components, and laterally, by linking components at a similar level (linkage of the SS in the ISL is an example). By linking gut and lung, An demonstrates the important point that upward composability can yield significant benefits, in the form of improved insight, and was necessary for achieving the research objective.
For the gut organ, two coupled phenomena were targeted: 1) the gut can fail in the presence of severe ischemia, but 2) it can recover from less severe ischemia. In the gut, ischemia interfered with formation of tight junction components and thus epithelial barrier function. Each gut endothelial cell had an ischemic injury parameter that enabled the control of the proportion of endothelial cells having ischemic injury. The latter led to production of a pro-inflammatory signal (called cell-damage-byproduct). The endothelial component for both gut and lung also needed to recover from perturbations simulating both infectious and non-infectious insult, where the infectious insult replicated and actively damaged the system. It needed to do so while mimicking recognized component mechanisms. Gut epithelial cells responded to that byproduct. The process simulated epithelial barrier dysfunction. Simulation results are consistent with the hypothesis that the cell-damage-byproduct was responsible for activation of circulating inflammatory cells and that, in turn, led to organ injury. If we accept the analogue-to-referent mappings as being reasonable, then there may be an in vivo counterpart to the cell-damage-byproduct. The lung employed essentially the same mechanisms. For the lung organ, two coupled phenomena were targeted: epithelial barrier dysfunction results in lung edema and impaired systemic oxygenation, and supplemental oxygen can increase the sub-lethal threshold of hypoxia. Pulmonary epithelial barrier dysfunction led to impaired oxygenation, which in turn affected systemic endothelial cell oxygenation status.
For the two coupled organs, two targeted attributes were that a severe inflammatory pulmonary insult (such as pneumonia) can lead to gut failure (Fig. 10B), and gut ischemia can lead to pulmonary failure (Fig. 10C). Given those, three modes of organ crosstalk were specified: 1) inflammatory cells moved between Z = 0 and Z = 3 carrying inflammatory signals; 2) the cell-damage-byproduct produced by ischemic endothelial cells in gut moved to lung, where it could activate lung inflammatory and endothelial cells, a process that had a negative impact on endothelial function and epithelial barrier function, which, in turn, impacted systemic oxygenation; and 3) all endothelial cells were dependent on a baseline oxygenation level, which decreased as lung dysfunction increased. Simulations with the two organs coupled together showed that a severe insult to one organ (which, for example, may map to pneumonia) led to MOF: lung inflammation led to impaired systemic oxygenation and gut ischemia, which, in turn, fed back to the lung, potentiating pulmonary dysfunction and lowering the analogue’s sublethal ischemic threshold.
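The feedback loop between the coupled organs can be caricatured with two scalar state variables. This toy sketch is far coarser than An's agent-based analogue and is offered only to make the loop concrete: lung dysfunction lowers systemic oxygenation, low oxygenation drives gut ischemia, and the gut byproduct feeds back on the lung. The function, its parameters, and their default values are hypothetical.

```python
def simulate(lung_insult, steps=200, recovery=0.05, coupling=0.2, hypoxia_threshold=0.7):
    """Return (lung_dysfunction, gut_ischemia), each in [0, 1], after `steps` steps."""
    def clamp(x):
        return min(1.0, max(0.0, x))

    lung = clamp(lung_insult)  # initial pulmonary insult
    gut = 0.0
    for _ in range(steps):
        # Crosstalk mode 3: systemic oxygenation falls as lung dysfunction rises.
        oxygenation = 1.0 - lung
        deficit = max(0.0, hypoxia_threshold - oxygenation)
        # Crosstalk mode 2: byproduct from the ischemic gut reaches the lung.
        byproduct = gut
        gut = clamp(gut + coupling * deficit - recovery * gut)
        lung = clamp(lung + coupling * byproduct - recovery * lung)
    return lung, gut


mild = simulate(0.2)    # oxygenation stays above threshold; the lung recovers
severe = simulate(0.8)  # oxygenation deficit injures the gut, which feeds back
```

With these assumed values, a mild insult decays toward zero without ever involving the gut, whereas a severe insult drives both variables to their maximum, a qualitative analogue of one failing organ pulling the other down with it.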
REASONING TYPES AND THEIR DIFFERENT ROLES IN M&S
Deduction, induction, and abduction play different, essential roles in M&S.
Methodical use and documentation of all three reasoning types is required.
A model is a physical, mathematical, or logical representation of a referent system. The word model shares its etymological ancestors with the word measure. A model is, fundamentally, a measurement device or method for some referent. We use analogue to refer to models that are physical, such as an in vitro tissue model, and to distinguish products of the synthetic method from inductive models. An analogue’s existence, operation, mechanism, etc. are entirely independent of the modeler and the referent. In those cases where the analogue consists of a computer running a program, we call it a simulation. Note that a simulation must be executing—prior to and after the computer executes the program it is merely a set of instructions (program) and an instruction interpreter/executor (computer). Computation is a form of inference or reasoning: it is deduction. It is noteworthy that biological processes are also largely deductive, but the challenge is that biology is a young science; we do not know much about the language of biological processes or axioms.
In order to understand how computational models are and should be used for mechanism and knowledge discovery, one needs to understand how induction, deduction, and abduction relate to computational M&S. Those relationships are at the core of both synthetic and scientific M&S.
Synthetic Analogues Encourage Abductive, Scientific M&S
Abductive inference dominates upstream discovery and development.
Experimenting on synthetic analogues encourages abductive inference in exactly the same way as wet-lab experimentation.
Abduction, induction, and deduction are necessary for discovery and development decision-making.
Scientific M&S requires designing and conducting experiments on analogues designed to qualify as objects of experimentation.
Multiple competing hypothetical mechanisms (models) are required.
Models are used throughout the drug development pipeline from discovery to post-marketing surveillance and from laboratory production to manufacturing. However, model purpose and usage vary within that pipeline. Downstream models focus on achieving some specific objective, like documenting that disease progression can be halted or improving a drug’s supply chain. Upstream, scientific models focus on adding new domain knowledge and reducing uncertainties. We argue that abductive inference is most important for upstream M&S, and we will show that experimenting on synthetic analogues encourages abductive inference in exactly the same way as wet-lab experimentation.
None of the three methods of reasoning adds new knowledge on its own. In particular, deduction, being purely syntactic, is incapable of adding new knowledge. Because most current computational analogues are, independent of some larger, descriptive context, deductive devices, it is justifiable to doubt the extent to which such an analogue can be scientific. Such an analogue is a statement about what is currently known or believed. The means for making computational analogues scientific lies in model usage and how that usage fits into the larger research enterprise. The same is true of a wet-lab model.
New knowledge comes about by seeking and confronting contrast, anomaly, and surprising or unexpected observations. Our models evolve fastest when they fail to capture the world around us. When that occurs, we respond by constructing explanatory hypotheses, often by abduction, which are usually manifold and typically wrong at first. The collection of initial hypotheses is refined iteratively through rational analysis, including experimentation and deduction, both in the minds of the researchers and in computer simulations. Those that survive are further refined in the face of these newly induced models. At the end of this iterative process, the most robust explanatory and predictive hypotheses can be integrated into larger bodies of theory.
Granted, the above is a caricature of the actual process, which is extremely complex and social. But from the perspective presented, it is clear that abduction is very important to the scientific method. Scientific models—in silico or in the mind of a domain expert—are primarily abductive. This should be common sense, since science is about capturing, refining, and ultimately reducing our ignorance of a given system. As scientists, we deal more with what we do not know or do not understand than with what we do know or understand.
Computational analogues differ from other model types (e.g. in vitro, in situ, in vivo) because they rely entirely on machinery that has been explicitly and aggressively designed so that variation, anomaly, and surprise are minimized to the point where they are vanishingly small. For the most part, computational analogues are deterministic, well-controlled, and predictable devices. Because of this special status, the overwhelmingly popular uses of computational analogues do not involve experimentation. By contrast, consider an in vitro model. It is also explicitly and aggressively designed to minimize variation. However, there are always system component aspects about which the experimenter is fundamentally ignorant. Obviously, that is the component of interest (cells, tissue explant, etc.). We still treat artificial machines as experimental systems, even though well-understood theories are known to govern their behavior. In contrast, we often believe we fully understand and can validate computational programs. Most researchers do not treat a computational analogue as an experimental apparatus.
Instead, we inscribe into them what we expect to conclude from them. As stated above, that is deduction. The conclusions are the same as the premises, just transformed by a formal system grammar. Hence, if we maintain that computational analogues are completely verified (we know precisely what they do) and they are purely deductive (truth-preserving), then we cannot rely on them as rhetorical devices in and of themselves without committing the fallacy of petitio principii (assuming the conclusion). This situation makes it clear that in order to avoid fallacy, any scientific rhetoric of which a computational model is a part must include the other two types of inference (abduction and induction), and to do that must draw on additional models, especially those in the mind of the researcher. That realization implies that scientific research involving computational analogues (scientific M&S) is characterized by testing multiple similarly plausible models, just as abduction requires testing multiple hypotheses and induction requires multiple observations. Note that abduction and induction occur at a level above the computational analogues.
The described framework for the scientific use of computational models requires designing and conducting experiments on the analogue and constructing analogues that merit being objects of experimentation. Each of the four biomimetic systems in “Analogues: From In Vitro Tissues to Interacting Organs” had multiple predecessors (which, at the time, were considered plausible models of some aspect of phenotype) that were challenged experimentally and found in some way wanting.
M&S AND THE SCIENTIFIC METHOD
Two Major Categories of Robustness: You Cannot Have Both
Inductive models tend to be robust to changes in referent; synthetic models tend to be robust to changes in context.
Synthetic models are more useful early on, while inductive models are more useful later.
As they mature, synthetic models will also be useful later.
For the computational models in Fig. 1, including those that will bridge the gap, there are two main categories (types I and II) of robustness, each with a subcategory. Short- and long-term uses determine which category should be selected to achieve a given M&S objective, and that in turn impacts which model type, inductive or synthetic, may be best. A model cannot be robust in all ways. Analogues in the first category (type I) are robust to context changes, yet fragile to changes in referent. Some of these analogues can still be abstract enough to work for families of referents. Analogues in the second category (type II) are robust to referent changes, yet fragile to changes in context. Some analogues within this category can still be abstract enough to work for families of context. Wet-lab models can be similarly classified. For example, MDCK cells can be cultured under many different conditions with a variety of additives; the cells are robust to context changes. However, as epithelial cells, they cannot mimic cardiac myocytes and in that way they are fragile to changes in referent. Embryonic stem cells can be prodded to transform into representations of many different cell types, but to be maintained as stem cells their environment must be tightly controlled; they are robust to referent changes, yet fragile to changes in context. Synthetic and inductive models often identify more strongly with one of these categories, as illustrated in Fig. 11B. A fully synthetic and concrete analogue, such as the ISL, is robust to changes in context, yet fragile to changes in referent. It can be used to represent a liver in almost any context, but it cannot be used to represent a lung (however, some of its parts could be reused in a lung analogue). The generic two-layer subsystems used by An in the fourth example are robust to changes in context, and they have contextual patterns suitable for multiple referents. 
The organ subsystems are synthetic, yet abstract enough to represent other organs or tissues. A fully inductive, detailed, PBPK model is expected to be robust to changes in referent, yet it is fragile to changes in context. The same model can be used to represent any number of individuals and even different mammals, but only under similar conditions. Traditional PK and PD models are sufficiently abstract and general to be robust to changes in referent within referent patterns across contexts. Consequently, such models can be used to characterize data from many different referents. The patterns in the data can arise during different experimental contexts.
Models grounded to a metric space or hyperspace (discussed in “How a Model is Grounded Impacts How It Can and Should be Used”) are robust to changes in referent while being fragile to changes in usage or experimental protocol. Inductive models tend to be grounded absolutely because they model relations between variables or quantities, not qualities or mechanisms. However, a generalized inductive model (e.g., exponential decay, saturation, or a sigmoid) can easily show qualitative relationships between quantities. Such models can be robust to variations in the ratios between the quantities even though they depend fundamentally on the quantities they relate.
Relational models are robust to changes in use or experimental protocol, yet they are fragile to changes in referent. It is natural for synthetic models to be internally grounded because they model relations between identified and hypothesized components. A more generalized and abstract synthetic model, such as a cellular automaton, can show organizational patterns between constituents. Consequently, it can be robust to changes in referent when the various referents have similar organization.
In general, because inductive models tend to be robust to changes in referent, and synthetic models tend to be robust to changes in context, we recommend synthetic analogues for explanation and hypothesis generation earlier in R&D and inductive models for prediction and late-stage hypothesis falsification. When synthetic analogues are relational, they are best for knowledge embodiment.
The vision presented in the Introduction requires analogues with long lifecycles. To have long lifecycles, analogues must be capable of adjusting easily to incorporate new knowledge. This process requires using all three forms of inference (94). It is noteworthy that many inductive models have short lifecycles; some are never used again following their initial application. Synthetic and inductive models have different adjustment capabilities depending on the type and source of the data. When new knowledge comes from changes to context, a synthetic model will be most appropriate. An example of a change in context would be many different types of wet-lab experiments using the same cell line, as is often the case for wet-lab models used early in support of R&D. When the new knowledge comes from well-studied changes to the referent, an inductive model is most appropriate. An example of a change in referent would be an expanded, phase IV clinical trial.
How a Model is Grounded Impacts How It Can and Should be Used
Knowledge embodiment requires synthetic analogues that are relational.
Inductive models are typically grounded to metric spaces.
Metric grounding complicates combining models to form larger ones.
Relational grounding enables flexible, adaptable analogues, but requires a separate analogue-to-referent mapping model.
Biomimetic analogues designed to support drug discovery and development must have long lifecycles.
As stated earlier, the units, dimensions, and/or objects to which a variable or model constituent refers establish groundings. Inductive models are typically grounded to metric spaces. So doing provides simple, interpretive mappings between output and parameter values and referent data. Because phenomena and generators (in Glossary) are tightly coupled in such models, the distinction between phenomenon and generator is often small. Metric grounding creates issues that must be addressed each time one needs to expand the model to include additional phenomena and when combining models to form a larger system. Adding a term to an equation, for example, requires defining its variables and premises to be quantitatively commensurate with everything else in the model. Such expansions can be challenging and even infeasible when knowledge is limited and uncertainty is high, as on the left side of Fig. 11A. A model synthesized from components all grounded to the same metric spaces—a PBPK model for example—is itself grounded to the Cartesian composite of all those metric spaces. The reusability of such a model is limited under different experimental conditions or when an assumption made is brought into question.
Grounding to hyperspaces increases flexibility. A hyperspace is a composite of multiple metric spaces (and possibly non-spatial sets). Grounding to a hyperspace provides an intuitive and somewhat simple interpretive map (see (95)). Phenomena and generators are more distinct, because derived measures will often have hyperspace domains and co-domains, making them more complex as interpretive functions. Hyperspaces are often intuitively discrete, so they do not require discretization. They thus handle heterogeneity better than does a model grounded to a metric space. The High Level Architecture (IEEE standard 1516-2000) and federated systems for distributed computer simulation systems are examples of hyperspace grounding. Their focus is to define interfaces (boundary conditions) explicitly so that components adhere to a standard for such interfaces.
Dimensionless, relational grounding is another option. In equation-based models, dimensionless grounding is achieved by replacing a dimensioned variable with itself multiplied by a constant having the reciprocal of that dimension. That transformation creates a new variable that is purely relational. It relies on the constant part of a particular context. The components and processes in synthetic models need not have assigned units; see (65,96,97) for examples. The first, third, and fourth of the above examples use relational grounding: each constituent is grounded to a proper subset of other constituents. Relational grounding enables synthesizing flexible, easily adaptable analogues. However, a separate mapping model is needed to relate analogue to referent phenotypic attributes.
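The equation-based transformation described above can be illustrated with a minimal sketch. It assumes a hypothetical one-compartment, first-order elimination model; the function names, the context constants (`c0_mg_per_l`, `k_per_h`), and their values are illustrative assumptions and are not taken from any of the cited analogues. Each dimensioned variable is multiplied by a context constant having the reciprocal dimension, so two different contexts collapse onto the same purely relational curve.

```python
import math

def to_relational(value, reference):
    """Multiply a dimensioned value by a constant with the reciprocal dimension."""
    return value * (1.0 / reference)

def concentration_mg_per_l(t_h, c0_mg_per_l, k_per_h):
    """Metrically grounded model: first-order decline, C(t) = C0 * exp(-k * t)."""
    return c0_mg_per_l * math.exp(-k_per_h * t_h)

def relational_curve(tau):
    """Dimensionless counterpart: c(tau) = exp(-tau), with no units attached."""
    return math.exp(-tau)

# Two hypothetical experimental contexts (placeholder values, not measured data).
contexts = [
    {"c0_mg_per_l": 40.0, "k_per_h": 0.25},
    {"c0_mg_per_l": 5.0, "k_per_h": 1.50},
]

tau = 2.0  # dimensionless time at which both contexts are compared
relational_values = []
for ctx in contexts:
    t_h = tau / ctx["k_per_h"]  # clock time corresponding to tau in this context
    c = concentration_mg_per_l(t_h, ctx["c0_mg_per_l"], ctx["k_per_h"])
    relational_values.append(to_relational(c, ctx["c0_mg_per_l"]))

# Both contexts map to the same relational value, exp(-tau).
for value in relational_values:
    assert math.isclose(value, relational_curve(tau), rel_tol=1e-12)
```

In this sketch the context constants stay outside the relational model, which is one way to see why a separate mapping model is needed to return to measured units.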
Hybrids of the above grounding methods are also possible. Some models can be synthesized by plugging together components that are simpler models. For example, in (80) the outputs of metrically grounded, equation-based models of subcellular molecular and cell cycle details contribute to rules used by cell-level agents. Such coupling makes them somewhat relational because not every component must be connected to every other component (or adhere to a standard adhered to by all other components, as with the High Level Architecture). However, their synthesis will depend in a fundamental way on their grounding, sometimes to a metric space, as in (98,99). The High Level Architecture (and similar) standards can be considered as hybrids, because they provide openness and extensibility that allow some sub-systems to integrate based on one standard and others to integrate based on another standard.
Biomimetic analogues designed to support drug discovery and development research are expected to evolve and become more realistic and useful. Consequently, those that do will have long lifecycles. Some will mature to become virtual tissues and organs: components in virtual patients. For these analogues, we suggest they begin as relational analogues and remain so to the degree feasible, and that separate mapping models be developed in parallel.
Synthetic analogue development to date by different groups has managed the grounding issue differently. Twelve examples are discussed briefly in the Appendix.
Experimenting on Synthetic and Inductive Analogues
Synthetic models are relatively specific, particular, and concrete; inductive models are relatively general, representative, and more abstract.
Experiments on inductive models will discover characteristics of the data from which the model was induced; experiments on synthetic models will discover characteristics of the composition, the mechanisms, and the model’s systemic phenotype.
When trying to explain a system about which we are ignorant or uncertain, use abduction: it is ignorance-preserving.
Synthesis facilitates knowledge discovery by helping to specifically falsify hypotheses.
Inductive models preserve the truth about patterns in data. Synthetic models exercise abduction while representing knowledge, uncertainty, and ignorance.
We argue for conducting experiments on computational analogues as if they were naturally occurring organisms or tissues. That is precisely what is done with very complex software and hardware systems developed purely for engineering purposes (e.g. flight code for an automatic pilot). However, in these engineering contexts, the purpose behind such testing is to clamp down on the exhibited variation and ensure that it stays within specified, controlled tolerances. When designing and planning wet-lab experiments, we include engineering tasks such as clamping down on the variation of those parts that are not objects of the experiment, such as temperature, pH, pO2, etc. The difference is that those wet-lab experiments have a different objective: to explore the living component of the system. The tightly controlled, well-understood, predictable parts of the supporting laboratory equipment are ancillary to the primary purpose, which is to refine and increase our understanding of the biological material being studied.
If we replace the biological material with a computational analogue in a supporting framework like the one described in Fig. 4, does it still make sense to use the whole apparatus to refine and increase our understanding of a smaller component of the system, such as the simulated cells in Fig. 4? The answer depends on the nature of that model and the model’s current location in model space. If it is a straightforward implementation of, for example, a simple mathematical equation that is well understood, then the answer is “no.” Of the models in the references cited in Supplemental Material, the vast majority developed within the past 15 years required considerable experimentation. That is because the phenotype (as in Fig. 3) resulting from the initially conceived mechanism was too far removed from targeted phenotypic attributes. Experimentation (several cycles of a protocol such as the one in the next section) was needed to locate a region of model space (mechanism space) for which the phenotype was more biomimetic.
Both inductive and synthetic methods depend differently on the means and measures used to gather the data and related information that becomes the focus of the model engineering effort. Inductive models contain an inherent commensurability amongst the measures, because induction finds and reproduces connected patterns in whole data sets. Synthesis, however, combines heterogeneous data with information from disparate sources and discovers ways to compose them; some information sources are ad hoc, whereas others are highly methodical. Even though the methods are fundamentally different, both inductive and synthetic models of biopharmaceutical interest will often be appropriate for experimentation.
Experiments on any model—mental, wet-lab, inductive, or synthetic—can help the scientist think about and discover plausible characteristics of a referent system’s mechanisms. Experiments on inductive models do so by exploring characteristics of the data from which the model was induced. That is because the mappings are among data, conceptualized mechanisms and referent, as illustrated on the left side of Fig. 2. Experiments on synthetic models do this by exploring model organization (how components co-operate/interact), which is hypothesized to map to referent organization, as illustrated by mapping C on the right side of Fig. 2.
Both modeling types have their strengths and weaknesses. Inductive models, because they rely directly on the measures used to take the data, are susceptible to the fallacy of inscription error (the logical fallacy of assuming the conclusion and programming in aspects of the result you expect to see). This weakness is a natural result of the combination of the extrapolative properties of induction and the truth-preserving properties of deduction. By contrast, as discussed in (100), synthetic models, like the examples described in “Analogues: From In Vitro Tissues to Interacting Organs,” can contain abiotic and arbitrary artifacts, assumptions, and simplifications made for the convenience of the builders. Note the partial overlap of phenotypes in Fig. 3. These properties provide direction for when one style should be preferred over the other. When trying to clearly specify the parts of a referent about which we are ignorant or uncertain, an ignorance-preserving technique like abduction combined with synthetic M&S should be the center of attention. When trying to specify the parts of a referent about which we have deep knowledge, a truth-preserving technique like deduction should be the focus. Where we possess enough reliable knowledge to warrant extrapolation and precise prediction, induction should be the focus. Hence, synthesis can be most useful as an upstream modeling method focused on discovering and falsifying hypotheses during knowledge discovery and synthesis, and while honing down and selecting hypotheses that are most believable. When we become confident of the generative mechanisms, inductive counterparts of synthetic analogues can kick in, allowing us to approach engineering or clinical degrees of understanding, intervention, and prediction. Inductive models are best for preserving the truth about patterns in data. 
Synthetic models are best for exercising abduction and for representing current knowledge and beliefs clearly, along with areas of ignorance and uncertainty.
When R&D goals require the capabilities of both modeling methods, both model types can be implemented for parallel simulations within a common framework (101). In the sections that follow, we identify several important issues and discuss them in context of both synthetic and inductive M&S. Background information on computational models in support of scientific discovery is provided in the Appendix.
The Scientific Method in Iterative Analogue Refinement
Conceptual mechanisms can be flawed in ways that only become obvious after they are implemented synthetically.
Following a rigorous protocol facilitates generating multiple mechanistic hypotheses and then eliminating the least plausible through experimentation.
An iterative model refinement protocol is the heart of abductive, mechanism-focused, exploratory modeling.
The scientific method provides a procedure for investigation, the objective of which is knowledge discovery (or questioning and integrating prior knowledge). The method begins with phenomena in need of explanation or investigation. We pose hypotheses and then strive to falsify their predictions through experimentation. The traditional inductive modeling approach illustrated on the left side of Fig. 2 is often part of a larger scientific method that includes wet-lab experiments. On its own, however, inductive modeling is not scientific because new knowledge (about the referent) is not generated.
The stages in scientific M&S are illustrated on the right side of Fig. 2. The assembly of micro-mechanisms in each of the four examples in “Analogues: From In Vitro Tissues to Interacting Organs” was a hypothesis. Each execution was an in silico experiment. Measures of phenomena during execution provided data. When those data failed to achieve a pre-specified measure of similarity with referent wet-lab data, the mechanism was rejected as a plausible representation of its wet-lab counterpart (for a detailed example, see (65)). In all four examples discussed in “Analogues: From In Vitro Tissues to Interacting Organs,” many mechanisms were tested and rejected en route to the mechanisms discussed in the cited papers. Multiple rounds of iterative refinement followed by mechanistic failure illustrate the fact that complex conceptual mechanisms can be flawed in ways that are not readily apparent to the researcher. The flaws only become obvious after we actually invest in the effort to implement and test the mechanism synthetically.
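One cycle of the accept-or-reject step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the protocol used in the cited studies: the three candidate mechanisms, the referent values (placeholders, not measured data), the similarity measure (the fraction of simulated values falling within a relative tolerance of the referent values), and the threshold are all hypothetical.

```python
import math
from typing import Callable, Dict, List

# A hypothesized mechanism maps sampling times to simulated measurements.
Mechanism = Callable[[List[float]], List[float]]

def similarity(simulated: List[float], referent: List[float],
               tolerance: float = 0.15) -> float:
    """Fraction of simulated values within a relative tolerance of the referent."""
    hits = sum(1 for s, r in zip(simulated, referent)
               if abs(s - r) <= tolerance * abs(r))
    return hits / len(referent)

def falsify(mechanisms: Dict[str, Mechanism], times: List[float],
            referent: List[float], threshold: float = 0.8) -> Dict[str, float]:
    """Execute each mechanism; keep only those meeting the pre-specified threshold."""
    survivors = {}
    for name, run in mechanisms.items():
        score = similarity(run(times), referent)
        if score >= threshold:
            survivors[name] = score
    return survivors

times = [0.0, 1.0, 2.0, 4.0, 8.0]
referent = [1.00, 0.80, 0.69, 0.46, 0.19]  # placeholder values for illustration

# Three competing, hypothetical mechanisms for the same phenomenon.
mechanisms = {
    "linear_loss": lambda ts: [max(0.0, 1.0 - 0.1 * t) for t in ts],
    "fast_exponential": lambda ts: [math.exp(-0.5 * t) for t in ts],
    "slow_exponential": lambda ts: [math.exp(-0.2 * t) for t in ts],
}

survivors = falsify(mechanisms, times, referent)
print(sorted(survivors))  # prints ['slow_exponential']
```

A surviving mechanism is not thereby validated; in this sketch it simply remains a plausible hypothesis to be challenged again in the next cycle, ideally against additional phenotypic attributes.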
The iterative model refinement protocol is the heart of abductive, mechanism-focused, exploratory modeling. When faced with the task of building a scientifically relevant, multi-attribute analogue in the face of significant gaps in the body of knowledge used to guide the process, parameterizations and model components must strike a flexible balance between too many and too few. Doing so can be complicated by the fact that a validated, parsimonious, multi-attribute analogue will be over-mechanized (“over-parameterized”) for any one attribute. Too many components and parameters can imply redundancy or a lack of generality; too few can make the model useless for researching multi-attribute phenomena.
Scientific progress can be measured in scientific M&S protocol cycles completed. It follows that we want to make the process as easy as possible. Decisions made at the beginning and during the first protocol cycle can dramatically impact the level of effort required to complete subsequent cycles. Strategies that can work well for inductive M&S may not be appropriate for synthetic M&S.
Making Predictions Using Synthetic Analogues
Synthetic models make predictions about component relations; inductive models make predictions about variable relations and patterns in data.
Synthetic analogues are ideal for discovering plausible mechanisms, relations between components, and mechanism-phenotype relationships. They are best for exercising abduction and representing current knowledge. They are good at explanation. Because of the uncertainties reflected in stochastic parameters and mappings to the referent, they are not as good as inductive, equation-based models at precise, quantitative prediction; their predictions will be “soft.” However, they can make effective relational predictions. Synthetic models make predictions about component relations (quantitative or qualitative), and inductive models make predictions about variable relations (quantitative or qualitative) and patterns in data. An ISL, for example, can be used to make predictions about where and how within the system two drugs administered together may effectively interact. However, when such predictions are absolutely grounded (via the grounding map), then an important distinction between synthetic and inductive model predictions lies in the error estimate for the prediction.
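The contrast between a "soft" prediction and a point prediction can be sketched with a toy example. Everything here is an assumption made for illustration: a hypothetical analogue in which each mobile object is removed with a fixed probability per step, the parameter values, and the use of a 5th to 95th percentile range across replicate executions as the measure of spread. It is not a model of the ISL or of any other cited analogue.

```python
import random

def run_analogue(n_objects, p_remove, n_steps, rng):
    """One execution: each remaining object is removed with probability p_remove per step."""
    remaining = n_objects
    for _step in range(n_steps):
        remaining = sum(1 for _obj in range(remaining) if rng.random() >= p_remove)
    return remaining

def soft_prediction(n_objects, p_remove, n_steps, n_replicates=200, seed=1):
    """Mean and approximate 5th-95th percentile range over replicate executions."""
    rng = random.Random(seed)
    results = sorted(run_analogue(n_objects, p_remove, n_steps, rng)
                     for _ in range(n_replicates))
    mean = sum(results) / n_replicates
    low = results[int(0.05 * n_replicates)]
    high = results[int(0.95 * n_replicates) - 1]
    return mean, low, high

def point_prediction(n_objects, p_remove, n_steps):
    """Equation-based counterpart: expected number remaining, N * (1 - p)^steps."""
    return n_objects * (1.0 - p_remove) ** n_steps

mean, low, high = soft_prediction(200, 0.1, 10)
expected = point_prediction(200, 0.1, 10)
print(f"stochastic analogue: mean {mean:.1f}, range {low} to {high}")
print(f"equation-based point prediction: {expected:.1f}")
```

Under these assumptions the equation yields a single number, whereas the analogue yields a distribution whose spread is part of the prediction.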
IMPACT OF M&S ON SCIENTIFIC THEORY
Instantiating and Exploring Theories of Translation
Theories of translation will arise from contrasting analogues.
A precondition for understanding if and when observations on wet-lab research models can translate to patients (and vice versa) is to have a method to anticipate how each system will respond to the same or similar new intervention at the mechanism level. The ISLs (21) and the other analogues described above enable developing that method. Building an analogue of each system within a common framework allows exploration of how one analogue might undergo (automated) metamorphosis to become the other. When successful, a concrete mapping is achieved. Such a mapping is a hypothesis and an analogue of a corresponding mapping between the two referent systems, as in Fig. 8B. The analogue mapping can help establish how targeted aspects of the two referent systems are similar and different both at the mechanistic level and, importantly, at the systemic, emergent property level. The vision is that the analogues along with the metamorphosis method can be improved iteratively as part of a rational approach to translational research.
Abductive Reasoning and Synthetic M&S Can Help Manage the Information and Data Glut
Synthetic analogues help alleviate the information and data glut.
Combining synthetic with inductive models better preserves and progressively enhances knowledge.
Exclusive reliance on inductive and deductive methods starves R&D of abductive opportunities.
We concur with An’s observations regarding dynamic knowledge representation and ontology instantiation (18,102), and argue that the data and information glut impacting pharmaceutical R&D is caused by our knowledge schemes and knowledge bases (into which that data and information should fit) being incomplete or unnecessarily abstract. The available schemes are largely represented in the formalized, prosaic Methods sections of published scientific papers, and to some extent in Discussion sections; they are also represented within similar documents within organizations. The majority of that information consists of relationships between quantities, yet the schemata available for cataloging relationships between quantities are ambiguous. Examples of elements of schemata are logP, clearance, level of gene expression, media composition, response, etc. The schemata are designed purposefully to lose, forget, abstract away, and/or ignore some concrete details of experiments and cases. On one hand, such abstraction is good because it facilitates the extraction of fundamentals and major trends: the take-home messages from specific experiments. On the other hand, there are many concrete details which, were they captured by the schemata, would permit a more complete cataloging of experiments and observations. In some cases, such improvements could enable semi-automatic extraction, hypothesis generation, evaluation, and hypothesis selection. We posit that complementing the current schemata with synthetic analogues, advanced progeny of the four examples in “Analogues: From In Vitro Tissues to Interacting Organs,” would be a significant step toward more satisfactory schemata. The process would begin alleviating the information and data glut, and allow semi-automated hypothesis generation/testing and theory development.
A synthetic analogue is a schema for biomimetic constituents. An inductive model is a schema for quantities. Either, alone, is inadequate. Together, knowledge generated can be preserved and its value progressively enhanced as the process advances.
A natural consequence of the rapid advances in -omic technologies, which are quantitative, coincident with the rise of molecular biology, coupled with advances in computational methods, has been a heavy focus on forward mappings and inductive models (discussed further under Generator–phenomenon relationships in the Appendix). Methods to support abduction and synthesis (scientific M&S) have not advanced as quickly. A consequence is the current information and data glut. Methods (knowledge schemata) for rapid hypothesis generation and refinement are sorely needed (102).
We suggest that a contributor to the inefficiencies responsible for significant rates of failures late in development is the fact that scientists and decision-makers currently translate results and conclusions from wet-lab experiments to patients using conceptual mechanisms and mappings where some assumptions are intuitive and unknown, and thus unchallengeable. It has been argued that mathematical, systems biology models will help address this issue, but we believe the situation is actually made worse by over-reliance on inductive and deductive computational modeling methods that reduce opportunities for and ignore the importance of abductive reasoning. Scientific M&S provides the means to concretely challenge those concepts and preserve those that survive.
Discovery and Development Research Needs Explanatory Models to Complement Inductive Models
Synthetic analogues are best at explanation; inductive models are best at precise prediction.
Strong theory and good science depend on having both heuristic value and predictive value.
When in pursuit of new mechanistic insight, the emphasis should be on generation and exploration of multiple hypotheses.
We risk being stuck on the current scientific plateau until we implement complementary methods to generate and select from competing, explanatory hypotheses.
The explanatory power and heuristic value of a hypothesis come from its ability to make specific statements about the network of micro-mechanisms (lower level generator–phenomenon relationships; see Appendix for further discussion) that produce a rich phenotype. Synthetic analogues are best at explanation. On the other hand, the predictive power of a hypothesis comes from its ability to make specific statements about the end conditions (the context, state, situation, etc. it will obtain) given some initial conditions. Inductive models are best at prediction. Strong theory and good science depend on having both heuristic value and predictive value. Recall Richard Feynman’s dictum: “what I cannot create, I do not understand.” Correlations are devoid of heuristic value, yet they can provide predictive value.
An attractive property of inductive models is that, because they relate quantities, they are relatively easy to validate. However, because they do not relate specific components, the mechanisms and the generator–phenomenon map they are intended to represent remain conceptual. All the hypothesized generators for the phenomena modeled are embedded in the prose and pictorial descriptions of the models, not in the mathematics. The descriptions are not actually part of the induced model. There is only the conceptual linkage illustrated on the left side of Fig. 2. That situation makes generator falsification very difficult because many generator configurations and different sets of generators can produce the same phenomena, which, when measured, contain the patterns specified by the validated mathematical, inductive model. Within the biomedical, pharmaceutical, and biotechnology domains, that difficulty has resulted in too little focus on falsification and an overzealous focus on data validation (of patterns), as distinct from the more heuristic forms of validation (103). When in pursuit of new mechanistic insight, the emphasis should be on generation and exploration of multiple hypotheses. The validation process should involve repeated attempts to falsify a population of hypotheses and select the survivors. As demonstrated herein, the technology and methods are available to complement any mechanistically focused inductive model with synthetic explanatory modeling methods that offer concrete analogues of the conceptualized mechanisms. Without complementary methods to generate and select from competing, explanatory hypotheses, we risk remaining stuck on the current scientific plateau. Information and data will continue to grow, further overwhelming the individual scientist’s ability to reason scientifically and to decide which therapeutic candidates to select and which to eliminate.
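The point that different generators can produce the same measured pattern can be made concrete with a toy example. The sketch below is only an illustration under stated assumptions: the two generator functions, the rate constants, and the sampling times are hypothetical and are not taken from any model discussed in this article. One generator removes a compound through a single first-order pathway; the other removes it through two parallel pathways. A simple inductive model (a log-linear least-squares fit) validates equally well against both, so the fit alone cannot falsify either mechanism.

```python
import math

def generator_single(c0, k, times):
    # Hypothetical mechanism A: one first-order elimination pathway.
    return [c0 * math.exp(-k * t) for t in times]

def generator_parallel(c0, k_a, k_b, times):
    # Hypothetical mechanism B: two independent first-order pathways in parallel.
    return [c0 * math.exp(-k_a * t) * math.exp(-k_b * t) for t in times]

def fit_log_linear(times, conc):
    # Inductive model: least-squares line through (t, ln c).
    # Returns (estimated initial concentration, estimated overall rate constant).
    n = len(times)
    ys = [math.log(c) for c in conc]
    t_mean = sum(times) / n
    y_mean = sum(ys) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, ys))
    sxx = sum((t - t_mean) ** 2 for t in times)
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return math.exp(intercept), -slope

# Illustrative values only (arbitrary units).
times = [0.5 * i for i in range(1, 13)]
profile_a = generator_single(100.0, 0.30, times)
profile_b = generator_parallel(100.0, 0.10, 0.20, times)

fit_a = fit_log_linear(times, profile_a)
fit_b = fit_log_linear(times, profile_b)
print("fit to mechanism A:", fit_a)
print("fit to mechanism B:", fit_b)
```

Both fits recover essentially the same pair of parameters, which is the sense in which the generators remain conceptual: distinguishing the two mechanisms requires targeting an additional phenomenon (for example, a measure specific to one pathway), not a better fit to the same data.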
Modeling a wet-lab biological system is multifaceted, requiring all three reasoning methods along with multiple models and model types.
The phases of multi-modeling are construction, evaluation and selection, and refinement.
Scientific modeling, like wet-lab science, requires use of all three modes of inference. It also requires development of inductive and synthetic models to provide heuristic as well as predictive value. Use of the three inference modes requires developing and using multiple models. Multiple models are necessary because discovering plausible forward maps requires targeting multiple phenomena (and multiple measures). Discovering plausible inverse maps requires exploring multiple generators (and multiple measures).
Any satisfactory synthetic analogue can be falsified by placing additional demands on its phenotype, by selectively expanding the set of targeted attributes to which it is expected to be similar (Step 8a in Fig. 12). A beauty of synthetic analogues of the types described in “Analogues: From In Vitro Tissues to Interacting Organs” is that we can observe the networked micro-mechanisms and “see” what mechanistic features were most likely responsible for the analogue’s inability to survive falsification. The data in Fig. 9 are an example. This knowledge improves insight into the referent mechanism and is an example of the analogue’s heuristic value. If all existing analogues have been falsified, then one must step back and invent new analogues containing new micro-mechanistic features that may survive falsification. Following the preferred (but more resource-intensive) approach, two or more somewhat different, yet equally plausible analogues are created, and one or more survives falsification. The process of analogue falsification and survival provides valuable new knowledge about the analogue and about the referent micro-mechanisms.
Falsification of a synthetic analogue requires a precise criterion, one that requires the use of inductive models. The quantitative comparison of comparable analogue and wet-lab phenomena typically focuses on data features, a task for which inductive models are ideally suited. Statistical models are also useful. Because multiple attributes are always being targeted, the falsification decision will be based on multi-attribute comparisons. Following the repeated cycles of refinement in Fig. 12, analogues become more resistant to incremental falsification, and begin earning trust. Trustable synthetic analogues must be robust to both context variance and constituent variance. To build trust, the analogue must be groundable using absolute units (clock time, ml, moles, etc.), which requires concurrent development of mapping models. It becomes clear that modeling a wet-lab biological system to expedite research progress is a multifaceted undertaking that requires exercising all three reasoning methods and developing and using multiple models and multiple model types. These tasks will be made easier by insisting on analogues that strive to exhibit the capabilities in the Text Box. Further, multi-modeling has three different phases: construction, followed by evaluation and selection (for or against), and refinement.
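The falsify-and-select cycle described above can be sketched in a few lines. This is a minimal illustration, not the Fig. 12 protocol itself: the `Analogue` class, the attribute names, the relative-tolerance similarity criterion, and all numerical values are assumptions introduced here for the example. Each analogue reports a phenotype as a set of measured attributes; an analogue survives only if every targeted attribute is similar to the wet-lab value; expanding the targeted set places additional demands on the phenotype and can falsify a previously satisfactory analogue.

```python
from dataclasses import dataclass

@dataclass
class Analogue:
    name: str
    phenotype: dict  # attribute name -> simulated measure (hypothetical values)

def is_similar(simulated, observed, tolerance):
    # Assumed similarity criterion: relative difference within a tolerance.
    return abs(simulated - observed) <= tolerance * abs(observed)

def survives(analogue, targets, tolerance=0.15):
    # Multi-attribute comparison: every targeted attribute must be similar.
    return all(
        attr in analogue.phenotype
        and is_similar(analogue.phenotype[attr], observed, tolerance)
        for attr, observed in targets.items()
    )

def select_survivors(analogues, targets, tolerance=0.15):
    return [a for a in analogues if survives(a, targets, tolerance)]

# A small population of competing analogues (illustrative numbers only).
population = [
    Analogue("A", {"clearance": 1.05, "peak_time": 2.1, "tail_fraction": 0.24}),
    Analogue("B", {"clearance": 0.95, "peak_time": 1.9, "tail_fraction": 0.40}),
    Analogue("C", {"clearance": 1.40, "peak_time": 2.0, "tail_fraction": 0.25}),
]

# Initially targeted wet-lab attributes.
targets = {"clearance": 1.0, "peak_time": 2.0}
first_round = select_survivors(population, targets)

# Expand the targeted attribute set to place additional demands on the phenotype.
expanded_targets = dict(targets, tail_fraction=0.25)
second_round = select_survivors(first_round, expanded_targets)

print([a.name for a in first_round])
print([a.name for a in second_round])
```

In practice the similarity criterion would be supplied by an inductive or statistical model of the data features, and a falsified analogue would be inspected to identify which micro-mechanistic features were responsible before a refined analogue is constructed.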
Complement, Not Replace; Evolution, Not Revolution
Synthetic methods complement inductive methods.
Synthetic methods can help ferret out bad information.
Inclusion of synthetic M&S early in R&D will accelerate the process.
One might surmise that, because synthetic M&S methods draw extensively on recent computer science advances and thus are relatively new, they are intended to replace the “old” inductive methods. That is not the case: M&S efforts complement wet-lab efforts. As illustrated in Fig. 11, synthetic methods complement the well-established inductive methods. Synthetic M&S simply provides a dramatic expansion in options for using M&S to advance science and facilitate discovery, especially where mechanistic insight is needed, toward the left side of Fig. 11A. Synthetic M&S is a product of the evolutionary process that is driving computational M&S methods to become more biomimetic, to bridge the gap in Fig. 1.
Integration of various published results into a synthetic analogue accomplishes two things: 1) it helps build schemata for knowledge (and ignorance) representation, and 2) it provides a mechanism for the curation and maintenance of the embedded knowledge. Both are possible because components and mechanisms in synthetic models are concrete, whereas those in inductive models are conceptual. Because synthetic models are heuristic, falsification during the evaluation step of the Fig. 12 protocol will spawn abductive explanations for the hypothesis’s failure. The weakest part of the failed analogue will most often be some new feature of the hypothesized mechanisms. However, in some cases, the weakest part may be some previously added component or logic drawn from badly designed experiments or from invalid or false conclusions. The heuristic nature of synthetic models can highlight those previously accepted components and facilitate their correction or revisitation; see (65) for an example. Synthetic M&S, in conjunction with wet-lab experimentation and inductive M&S methods, not only curates and maintains the integrity of current knowledge but also acknowledges our current state of ignorance.
One can argue that progress in a pharmaceutical R&D effort is tied to opportunities for mission-centered abductive reasoning. Circumstances requiring abductive reasoning associated with results of wet-lab experiments are common during discovery and early development. Opportunities to exercise abductive reasoning have traditionally been tied to the number of experiments and their duration. A complementary synthetic M&S effort can be added incrementally. With that effort comes a dramatic increase in opportunities for goal-directed abductive reasoning. If progress and better decision-making are positively correlated with knowledge and insight gained through goal-directed abductive reasoning, then we can anticipate that the early inclusion of scientific synthetic M&S will accelerate the process.
Levent Yilmaz provided useful constructive criticism and fresh ideas. Tim Otter, Paul Davis, and Marty Katz provided insightful commentary. Support was provided in part from the CDH Research Foundation and the International Foundation for Ethical Research.
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
- 11. Demin I, Goryanin. Kinetic modelling in systems biology. Boca Raton: Chapman & Hall; 2008.
- 15. Cardelli L. Brane calculi. Lecture notes in computer science (Computational Methods in Systems Biology) v3082. Springer; 2005. p. 257–78.
- 16. Priami C, Quaglia P. Beta binders for biological interactions. Lecture notes in computer science (Computational Methods in Systems Biology) v3082. 2005; p. 20–33.
- 17. Phillips A, Cardelli L, Castagna G. A graphical representation for biological processes in the stochastic pi-calculus. Transactions on Computational Systems Biology VII, Lect Notes Comput Sci. 2006;4230:123–52.
- 23. Rescigno A. Foundations of pharmacokinetics. New York: Kluwer; 2003. p. 17–21.
- 25. Zeigler BP. Multifacetted modelling and discrete event simulation. San Diego: Academic; 1984.
- 26. Zeigler BP, Praehofer H, Kim TG. Theory of modeling and simulation: integrating discrete event and continuous complex dynamic systems. San Diego: Academic; 2000.
- 28. Von Neumann J, Burks AW. Theory of self-reproducing automata. Urbana: University of Illinois Press; 1966.
- 39. Dormann S, Deutsch A. Modelling of self-organized avascular tumour growth with a hybrid cellular automaton. In Silico Biol. 2002;2:35.
- 40. Alber MS, Kiskowski MA, Glazier JA, Jiang Y. On cellular automaton approaches to modeling biological cells. IMA Vol Math Appl. 2003;134:1–39.
- 41. Deutsch A, Dormann S. Cellular automaton modeling of biological pattern formation: characterization, applications, and analysis. Boston: Birkhäuser; 2005. p. 334.
- 50. Anderson ARA, Chaplain MAJ, Rejniak KA, editors. Single-cell-based models in biology and medicine. Basel: Birkhäuser; 2007. p. 349.
- 66. Sheikh-Bahaei S, Hunt CA. Prediction of in vitro hepatic biliary excretion using stochastic agent-based modeling and fuzzy clustering. In: Perrone LF et al., editors. Proceedings of the 37th Conference on Winter Simulation, Monterey, CA, Dec 03–6. 2006. p. 1617–24.
- 70. Stevens A. Simulations of the aggregation and gliding behavior of myxobacteria. In: Biological motion, lecture notes in biomathematics, vol 89. New York: Springer; 1990. p. 548–55.
- 89. Hung DY, Chang P, Weiss M, Roberts MS. Structure-hepatic disposition relationships for cationic drugs in isolated physiological models. JPET. 2001;297:780–9.
- 91. Yan L, Park S, Sheikh-Bahaei S, Ropella GEP, Hunt CA. Predicting hepatic disposition properties of cationic drugs using a physiologically based, agent-oriented in silico liver. In: Rajaei H, Wainer GA, Chinni MJ, editors. Proceedings of the 2008 Spring Simulation Multiconference, SpringSim 2008, Ottawa, Canada, April 14–17, 2008. SCS/ACM 2008. 2008a; p. 162–6.
- 95. Fages F. From syntax to semantics in systems biology towards automated reasoning tools. In: Priami C et al., editors. Trans Comput Syst Biol IV, LNBI. 2006;3939:68–70.
- 98. Gennari JH, Neal ML, Carlson BE, Cook DL. Integration of multi-scale biosimulation models via lightweight semantics. Pac Sym Biocomput. 2008;2008:414–25.
- 99. Sun Z, Finkelstein A, Ashmore J. Using ontology with semantic web services to support modeling in systems biology. In: Weske M, Hacid M-S, Godart C, editors. WISE 2007 Workshops, LNCS. 2007;4832:41–51.
- 101. FURM: A Functional Unit Representation Method, http://furm.org/ (accessed 6/6/09).
- 102.An G. Dynamic knowledge representation using agent-based modeling: ontology instantiation and verification of conceptual models Ch. 15. In: Maly IV, editor. Methods in molecular biology: systems biology 500 Humana Press (Springer Science); 2009. doi:10.1007/978-1-59745-525-1_15.
- 103. Xiang X, Kennedy R, Madey G, Cabaniss S. Verification and validation of agent-based scientific simulation models. In: Yilmaz L, editor. Proceedings of the 2005 Agent-Directed Simulation Symposium, April 2005. The Society for Modeling and Simulation International 2005;37:47–55.
- 105. Peirce CS. How to make our ideas clear. Pop Sci Monthly. 1878;12:286–302.
- 106. Peirce CS. Deduction, induction, and hypothesis. Pop Sci Monthly. 1878;13:470–82.
- 107.Yu CH. Abduction? Deduction? Induction? Is there a logic of exploratory data analysis?. 1994. http://www.creative-wisdom.com/pub/Peirce/Logic_of_EDA.html (accessed 5/24/09).
- 108. Magnani L. Abduction, reason and science: processes of discovery and explanation. New York: Kluwer; 2000.
- 109. Gabbay DM, Woods J. A practical logic of cognitive systems, volume 2: the reach of abduction: insight and trial. Elsevier; 2005.
- 110. Minar N, Burkhart R, Langton C, Askenazi M. The Swarm simulation system: a toolkit for building multi-agent simulations. Working Paper 96-06-042, Santa Fe Institute, Santa Fe, NM. 1996.
- 115. Gilbert N, Troitzsch K. Simulation for the social scientist. Maidenhead: Open University Press; 2005.