Prospects for Declarative Mathematical Modeling of Complex Biological Systems

Declarative modeling uses symbolic expressions to represent models. With such expressions, one can formalize high-level mathematical computations on models that would be difficult or impossible to perform directly on a lower-level simulation program, in a general-purpose programming language. Examples of such computations on models include model analysis, relatively general-purpose model reduction maps, and the initial phases of model implementation, all of which should preserve or approximate the mathematical semantics of a complex biological model. The potential advantages are particularly relevant in the case of developmental modeling, wherein complex spatial structures exhibit dynamics at molecular, cellular, and organogenic levels to relate genotype to multicellular phenotype. Multiscale modeling can benefit from both the expressive power of declarative modeling languages and the application of model reduction methods to link models across scale. Based on previous work, here we define declarative modeling of complex biological systems by defining the operator algebra semantics of an increasingly powerful series of declarative modeling languages including reaction-like dynamics of parameterized and extended objects; we define semantics-preserving implementation and semantics-approximating model reduction transformations; and we outline a “meta-hierarchy” for organizing declarative models and the mathematical methods that can fruitfully manipulate them. Electronic supplementary material The online version of this article (10.1007/s11538-019-00628-7) contains supplementary material, which is available to authorized users.


Introduction
We propose and discuss an informal definition of declarative modeling in general, and provide as examples a collection of specific mathematical constructions of processes and extended objects for use in declarative models of complex biological systems and their processing by computer.Such processing can be symbolic and/or numerical, including for example model reduction by coupled symbolic and machine learning methods.The resulting apparatus is intended for semi-automatic synthesis and analysis of biological models, a computational domain which must typically deal with substantial complexity in the subject modeled.
We will define declarative models in Section 2 (informally in general but formally in particular cases); generalize the approach to extended objects in Section 3 by way of discrete graphs; and discuss progress in model reduction (usually across scales) by machine learning in Section 4. In Section 5 we will propose the elements of a larger-scale mathematically based "meta-hierarchy" for organizing many biological models and modeling methods, enabled by the declarative approach to modeling.
Much of this paper reviews previous work, extending it somewhat and setting it in a broader context.The aim of this paper is not to provide a balanced summary of work in the field; instead it is aimed mainly at outlining mathematically the possibilities of particular directions for future development.

Notation
We will use square brackets to build sets (tuples or lists) ordered by indices, so in a suitable context f prxsq " f prx j |j P t1, . . .nusq " f px 1 , . . .x n q.Multisets are denoted tpmultiplicityqpelementq, . ..u ˚.
Indices may have primes or subsubscripts and are usually deployed as follows: r indexes rewrite rules, i indexes individual domain objects, α, β index domain object types, p, q index arguments in one or more argument lists in a rule, c indexes variables in a rule, A indexes a list of defined measure spaces, and j, k index internal components of a vector parameter.Object typing will be denoted by ":", so for example n : N means that n P N and the usual set of integer arithmetic operations pertain to n. Subtyping will be denoted by "::".

Declarative modeling
A useful distinction in programming languages is that between declarative and imperative languages.Roughly, declarative languages aim to make it easy to specify the goals or criteria of a successful computation, leaving many of the procedural decisions about how exactly to pursue those goals up to other software.Imperative languages on the other hand support the process of telling the computer exactly what to do -which is eventually necessary anyway.For example the Fibonacci series is a sequence of integers defined by a recursion equation (declarative), but the recursion equation can be implemented in software expensively using simple recursion.Alternatively it can be implemented more efficiently using either conventional looping or tail recursion (Abelson et al. 1996).These are two imperative options, the latter closer to declarative style.A mixed declarative/imperative route would be to automatically recognize and solve the recurrence relation by computer algebra, and then generate imperative code from the analytic solution.In mathematical modeling languages as distinct from programming languages, the opportunities for the declarative style are much greater owing to the practical power of symbolic reasoning, such as exact or approximate algebraic calculations and proofs, either before resorting to numerical simulation or mixed together with it.
An example of a formalizable language for modeling biology is a collection of partial differential equations in which the variables represent local concentrations of molecular species and the spatial differential operators are all diffusion operators ∇ x ¨pD α pxq∇ x ).Such a deterministic reaction/diffusion model can be represented in a computer by one or more abstract syntax trees (ASTs) including nodes for variable names, arithmetic operations, spatial differential operators such as the Laplacian, the first-order time derivative operator, equality constraints including boundary conditions, and possibly function definitions.Such an AST can be used to denote a reactiondiffusion model as a data object that can be manipulated by computer algebra.It can also under some conditions be transformed symbolically, for example by separation of time scales replacing a subset of ordinary differential equations (without spatial derivatives) by function definitions of algebraic rate laws to be invoked in the differential equations for the remaining variables (e.g.Ermentrout (2004)).We generalize from this example.A formal language should have a "semantics" map Ψ, giving a mathematical meaning to some defined set of valid expressions in the language.For a modeling language, each valid model declaration M should correspond to an instance ΨpMq in some space S of "dynamical systems" interpreted broadly, so that such systems may be stochastic and/or infinite-dimensional.If some of the semantically meaningful model expressions are composed of meaningful sub-expressions, and their semantics can be combined in a corresponding way, then the semantics is "compositional"; composition commutes with the semantic map.Likewise for valid transformations of model declarations, one would like the semantics before and after transformation to yield either the same mathematical meaning (equivalent dynamics), or two meanings that are related somehow eg. by approximate equality under some conditions on parameters that may be partially known by proof and/or numerical experiment.
So in the context of modeling languages in general and biological modeling languages in particular, we think the key advantages of the declarative language style are captured by the following informal (but perhaps formalizable) definition: A declarative modeling language L is a formal language together with (a) compositional semantics Ψ : L Ñ S that maps all syntactically valid models M in L into some space S of dynamical systems, and (b) conditionally valid or conditionally approximately valid families of transformations on model-denoting AST expressions in the modeling language L. These AST transformations can be expressed formally in some computable meta-language, though the meta-language need not itself be a declarative modeling language.
Under this informal definition the utility of a declarative modeling language would depend on its expressive power, addressed in Sections 2.1 and 3 below, and on the range, value, and reliability of the model-to-model transformations that can be constructed for it, to be discussed in Sections 3 (implementation), 4 (model reduction), and 5 (wider prospects).Multiscale modeling benefits from both the expressive power (e.g.representing cellular and molecular processes in the same model) and model reduction (finding key coarse-scale variables and dynamics to approximate fine-scale ones) elements of this agenda.
Although it is plausible that "one can't proceed from the informal to the formal by formal means" (Perlis 1982), so that the task of formalizing a complex biological system to create models must begin informally, we will nevertheless try to be systematic about this task by building the semantic map Ψ up out of "processes" and "objects" of increasing generality and structure.These processes and objects are represented by "expressions" in a language, and mapped to mathematical objects that together define a model.So the map Ψ will be concerned with expressions, processes, and objects.

Semantic Maps
We show how to define a series of simple quantitative modeling languages of increasing expressive power, accompanied by semantic maps ΨpMq to operator algebras.To do this we need some idea concerning how to represent elementary biological processes with syntactic expressions that can include numerical quantities.One such idea begins with chemical reaction notation.

Pure Reaction Rules
In biology, generalized versions of rewrite rules naturally specify biochemical processes in a declarative modeling language, as well as model transformations in a metalanguage.The most straightforward example is chemical reaction notation.As in (Mjolsness 2010) we could use chemical addition notation: where m prq α and n prq β are nonnegative integer-valued stoichiometries for molecular species A α indexed by α in reaction r with nonnegative reaction rate k prq , and we may omit summands with m prq α " 0 or n prq β " 0. But really the left and right hand sides of the reaction arrow are not sums but multisets with nonnegative integer multiplicities of all possible reactants, defaulting to zero if a reactant is not mentioned: (2) Either syntax expresses the transformation of one multiset of symbols into another, with a numerical or symbolic quantitative reaction rate k prq .The syntax is easily encoded in an abstract syntax tree (AST).An example AST transformation might be a meta-rule that reverses an arrow (Yosiphon 2009) and changes the name of the reaction rate to (for example) k pr 1 q , for a new reaction number r 1 , allowing for the possibility of detailed balance to be satisfied in a collection of reactions.Many other reaction arrow types (e.g.substrate-enzyme-product) can then be defined by computable transformation to combinations of these elementary mass action reactions (e.g.(Shapiro et al. 2003;Yang et al. 2005;Shapiro et al. 2015) for examples in a declarative modeling context), using either commercial (Wolfram Research 2017) or open-source (Joyner et al. 2012) computer algebra system software.How can we define a compositional semantics for this reaction notation?Fortunately the operator algebra formalism of quantum field theory can be adapted to model the case of ordinary (non-quantum) probabilities governed by the law of mass action in a Master Equation (Doi 1976a,b;Peliti 1985;Mattis and Glasser 1998;Mjolsness 2010) (and Morrison and Kinney (2016) for the equilibrium case).The corresponding semantics ΨpMq is the creation/annihilation operator monomial which acts on probability distributions over states given by vectors of nonnegative integers n α (distinguished from n prq β by its lack of a superscript), one for each molecular species.The mass action law is encoded in the annihilation operators a α .Annihilation operator subscript α indexes the species present in the multiset on the left hand side (LHS) of Equation ( 2); the corresponding operator is raised to the power of its multiplicity or ingoing stoichiometry m prq α in reaction r, destroying that many particles of species α if they exist (and contributing zero probability if they don't).Likewise creation operator âβ has subscript β indexing the species present in the multiset on the right hand side (RHS) of Equation ( 2), and the operator is raised to the power of its outgoing stoichiometry n prq β which indicates how many particles of species β are to be created.If the number of particles of a given species is any nonnegative integer, as we assume for molecular species in solution, then annihilation and creation operators have infinite dimensional representations and satisfy the commutator of the Heisenberg algebra ra α , âβ s " a α âβ ´â β a α " δ αβ I α , where I is the identity operator and δ is the Kronecker delta.This commutator can be used to enforce a "normal form" on polynomials in which annihilation operators precede creation operators in each monomial, as in Equation (3).We can define the diagonal number operator N α " âα a α .
Another case is that in which the number of "particles" of any given species must be in t0, 1u, as for example if the species are individual binding sites occupied by a particular molecular species; then the operators are 2 ˆ2 matrices a " `0 1 0 0 ˘and â " `0 0 1 0 ˘obeying ra α , âβ s " a α âβ ´â β a α " δ αβ pI α ´2 âα a α q.In this case we can define the diagonal number operator N α " âα a α " `0 0 0 1 ˘, the zero-checking operator Z α " I α ´Nα " `1 0 0 0 ˘, and the "erasure" projection operator E α " pZ α `aα q " `1 1 0 0 ˘which takes either state to the zero-particle state.In this case also a 2 " â2 " 0. The semantics given by Equation ( 3) is compositional or structure-respecting because: (a) The operators for multiple rules indexed by r in a ruleset map to operator sums: , where W r " Ŵr ´diagp1 ¨Ŵ r q and 9 p " W ¨p.
In Equation ( 4) the first statement is ruleset compositionality, the second ensures conservation of probability, and the third is the resulting Chemical Master Equation (CME) stochastic dynamical system for the evolving state probability ppnq.Under Equation (3) one may calculate that diagp1 ¨Ŵ r q equals the diagonal operator ś iPlhsprq pN α q m prq α .One may regard this linearity as a linear mapping of vector spaces (hence as a morphism or category arrow): The source vector space is spanned by basis vectors that correspond to ordered pairs of multisets of species symbols, weighted by scalar reaction rates k r , and the target vector space is a vastly larger space of possible probability-conserving operators.Also, (b) The multisets on the left hand side and right hand side of a rule each map to an operator product in normal form (incuding powers for repeated multiset elements) in Equation (3).Each product consists of commuting operators so their order is not important.
There are various possible formalizations of the "structure respecting" nature of mapping Ψ from rulesets to operator algebra, in addition to the vector space mapping just mentioned.One could consider just a vector space orthant with nonnegative reaction rate coefficients.One could take a page from Homotopy Type Theory (discussed in Section 3) and upgrade Ψ to a functor between categories, by making each vector space a category with morphisms such as piecewise linear paths from one vector to another, among other options.We do not yet have a uniformly preferred candidate for formalizing the idea of "respecting structure" in mathematical modeling languages.
The integer-valued index r we used to name the reactions is part of a meta-language for the present theorizing, and not part of the modeling language.One slightly confusing point is that these unordered collections of chemical reactions are expressions in the language, but they also have a form reminiscent of a grammar for another language -albeit a language of multisets representing the system state, rather than of strings or trees, and a language that may be entirely devoid of terminal symbols (e.g.inert products such as waste).The CME as semantics was suggested by (Mjolsness and Yosiphon 2006) and by (Cardelli 2008), though it can be regarded as implicit in the Doi-Peliti formalism.There is also a projection from continuous-time (CME) semantics to discrete-time probabilistic semantics.
From this operator algebra semantics and the time ordered product expansion one can derive valid exact stochastic simulation algorithms including the Gillespie algorithm and various generalizations, as detailed in (Mjolsness 2013).Such algorithms can also be accelerated exactly (Mjolsness et al. 2009), and accelerated further by working hierarchically at multiple scales and/or using parallel computing (Orendorff and Mjolsness 2012).The same theory can be used to derive algorithms for the inference of reaction rate parameters from sufficient data (Wang et al. 2010) (cf. Golightly andWilkinson (2011)), although sufficient data may be hard to obtain.
Thus, unordered collections of pure chemical reactions provide a simple example of a modeling language with a compositional semantics.But for most biological modeling we need much more expressive power than this.

Parameterized Reaction Rules
The first modeling language escalation beyond pure chemical reaction notation is to particle-like objects or "agents" that bear numerical and/or discrete parameters which affect their reaction reaction rates.For example, the size of a cell may affect its chances of undergoing cell division.This kind of multiset rewrite rule can be generalized to (Mjolsness 2010) (5) The parameters x p , y q (indexed by positions p, q in their respective argument lists, and which may themselves be vectors x p ) introduce a new aspect of the language, analogous to the difference between predicate calculus and first order logic: Each parameter can appear as a constant or as a variable, and the same variable X c can be repeated in several components of several parameter lists in a single rule.Thus it is impossible in general to say whether two parameters in a rule are equal or not, and thus whether two terms τ αppq rx p s in a rule are the same or not, just from looking at the rule -that fact may depend on the values of the variables, known only at simulation time.
The p, q subindex notation is as in (Mjolsness 2013).Here we have also switched from moleculelike term names A α to more generic logic-like term names τ αppq .The reaction rate ρ r p " x p ‰ , " y q ‰ q now depends on parameters on one or both sides of the rewrite rule which can be factored (automatically in a declarative environment, as in (Yosiphon 2009)) into a rate depending on the LHS parameters only and a conditional distribution of RHS parameters given LHS parameters.
For example a stem cell of volume V might divide asymmetrically yielding a stem cell and a transit-amplifying cell: stemcellrx, V, . ..sÝÑ TAcellrx `∆x, V{2, . ..s, stemcellrx ´∆x, V{2, . ..s with ρpVqN p∆x; V 1{d q, where ρpVq is a probability per unit time or "propensity" for cell division depending on cell volume, and N p∆x; cV 1{d q is a Gaussian or normal probability density function with diagonal covariance proportional to a lengthscale set by cell volume.This kind of example (Yosiphon 2009, Mironova et al. 2012, Mjolsness 2013) can be implemented in a computer algebra system (Yosiphon 2009, Shapiro et al. 2013), and shows that with parameters, reaction-like rewrite rules can represent (for example) both cellular and molecular processes in the same model, and of course their interactions, which is a key expressiveness capability for multiscale modeling.
Even for multilevel modeling of a single-scale reaction network, dynamic parameters allow for the possibility of aggregate objects that keep track of many individual particles including their number.Such aggregate objects would have their own rules as discussed in (Yosiphon 2009 Section 6.4), possibly obtained from fine-scale reaction rules by meta-rule transformation.
To specify the compositional semantics for parameterized rewrite rules it is necessary to specify the probability space in which distributions are defined.In (Mjolsness 2010) a fairly general space S of dynamical systems available as targets for the semantics map ΨpMq P S is specified in terms of master equations ( 9p " W ¨p) governing the evolution of (classical, non-quantum) probability distributions p in Fock spaces constructed out of elementary measure spaces for the parameters.
In order to define the semantics of variables we will need to integrate over their possible values.Each variable X c will have a type required by its position(s) in the argument list rx p s of one or more terms τ αppq , and a corresponding measure space D Apcq and measure µ Apcq .Here A indexes some list of available measure spaces including discrete measure on N and Lebesgue measure on R d:N .Let V α " Â p D Apα,pq be the resulting measure space for parameter lists of terms of type α.To summarize briefly the "symmetric Fock space" construction outlined in (Mjolsness and Yosiphon 2006;Mjolsness 2010), for each nonnegative integer n a we define a measure space of states that have a total of n α "copies" of parameterized term τ α px α q: where Spnq is the permutation group on n items -in this case the elements of the n α measure space factors.Next, any number n α : N of terms is accommodated in a disjoint union of measure spaces f α pn α q, resulting in a measure space for all terms of type α, and a cross product is taken over all species α: This is the measurable system state space.In it, objects of the same type α are indistinguishable except by their parameters x p : V α .The operator semantics now becomes an integral over all the variables X c : Ŵr " ż . . .
This expression is again in normal form, with annihilation operators preceding (to the right of) creation operators.When other operator expressions need to be converted to normal form one uses the same kinds of commutation relations as before except that the δ αβ Kronecker delta functions are now augmented by Dirac and product delta functions as needed to cancel out the corresponding measure space integrals: The integrals over measure spaces act a bit like quantifiers in first order logic, binding their respective variables.We speculate that the product of two such operator expressions could be computed in part by using the logical unification algorithms of computational symbolic logic, since the problem of finding the most general unifier arises naturally when integrating over several sets of variables during restoration of canonical form (cf. the proof of Proposition 1 in Section 3.3 below) using the commutation relations of Equation ( 9) and their delta functions.Equations ( 5) and ( 8) comprise the syntax and semantics of the basic ruleset portion of Stochastic Parameterized Grammars (SPG) language of (Mjolsness and Yosiphon 2006;Yosiphon 2009).A simulation algorithm is derived in (Mjolsness 2013).Other features such as submodels, object type polymorphism and graph grammars are also included.
This notation is fundamentally more powerful than pure chemical reactions, because now reaction/process rates ρ r p " x p ‰ , " y q ‰ q can be functions of all the parameters involved in a rule, and a rule firing event can change those parameters.It becomes possible to express sorceror's apprentice models which purport to accomplish an infinite amount of computing in a finite simulated time, though this situation can also be avoided with extra constraints on the rate functions in the language.
A related fully declarative modeling language family rides under the banner of "L-systems", named after Lindenmeyer and championed by Prusinkiewicz (e.g.(Prusinkiewicz and Lindenmeyer 1990)).L-systems and their generalizations have been effective in the modeling of a great variety of developmental phenomena, particularly in plant development, because of their declarative expressive power.But the usual semantics of L-systems falls outside the family of languages considered in this section and in Section 3.3.2below, for a theoretically interesting reason.The applicable rules of L-systems are usually defined to fire in parallel, and synchronously in discretzed time so that one tick of a global clock may see many discrete state updates performed.This semantics seems to be incompatible with the addition of local, continuous-time operators defined in Equation (4), because "atomic" uninterruptible combinations of events are represented in operator algebra by multiplication rather than by addition of operators (and even then they are serialized), and on the other hand continuous-time parallel processes are represented by addition rather than multiplication of operators, in the master equation that describes the operation of processes in continuous time.Despite this difference, one could seek structure-respecting mappings between operator algebra and L-system semantics at least at the level of individual rule-firings.
Another relevant point of comparison for the operator algebra semantics of parameterized reaction network languages is the BioNetGen modeling language (Blinov et al. 2004).This language has been applied to many problems in signal transduction pathway modeling with discretely parameterized terms representing multistate molecular complexes.It also represents labelled graph structures that arise in such molecular complexes, placing it also in the class of graph rewrite rule dynamics languages discussed in detail in Section 3.3.
These (parameterized rewrite rule) models would seem to be non-spatial, but particle movement through space can already be encoded using discrete or continuous parameters that denote spatial location.Such motion would however have to occur in discrete steps.The solution to that limitation is another language escalation.

Differential Equation Rules
Another form for parameterized rewrite rules licenses locally attached ordinary differential equations (ODEs) for continuous parameters, as in (Prusinkiewicz et al. 1993;Mjolsness 2013;Mjolsness et al. 1991), as part of the language L: where the second form uses component rather than vector notation for all the ODEs.The first form is more readily generalizable to stochastic differential equations (SDEs) tdx p " v p prx k sqdt ὴp ?dt, η p " π η prx k squ where π η is a probability density function conditional on rx k |ks.The ODE semantics is given by the corresponding differential operators: as shown for example in (Mjolsness 2013).The SDE case is discussed in (Mjolsness and Yosiphon 2006), section 5.3.Equations ( 5), ( 10) and ( 8), ( 11) comprise the syntax and semantics respectively of the basic ruleset portion of Dynamical Grammars (DG) language of (Mjolsness and Yosiphon 2006;Yosiphon 2009), by addition of differential equations to SPGs.A simulation algorithm is derived in (Mjolsness 2013).The generalization to operator algebra semantics for Partial Differential Equations (PDEs) and Stochastic Partial Differential Equations (SPDEs), as a limit of spatially discretized ODE and SDE systems, is outlined (although not rigorously) in (Mjolsness 2010).
These differential equation bearing rules can be used to describe processes of growth and movement of individual particle-like objects, as in "agent-based" modeling.For example, a cell may grow according to a differential equation and divide with a probability rate that depends on its size, as in the plant root growth model of (Mironova et al. 2012).In general rewrite rules can be used to describe individual processes within a model, if they are augmented (as above) with a symbolically expressed quantitative component such as a probability distribution or a differential equation.Each such process has a semantic map Ψ to a ring of operators, and processes operating in parallel on a common pool of objects compose by operator addition (Equation ( 4)).Declarative computer languages based on this and other chemical reaction arrows, transformed to ordinary differential equation deterministic concentration models, include (Shapiro et al. 2015(Shapiro et al. , 2003 ; Mjolsness 2013; Yosiphon 2009) among others.Given a collection of rules with differential equations attached, is there a way to get rid of the rules and just keep the differential equations tdx j {dt " v j prx k squ?That would only be the case if the left hand side and right hand side of the rules all take the same form as in Equation ( 10), so that only the parameters of modeled objects are dynamic, not the number or structural relationships of such objects.There is for example the deterministic approximation of stochastic chemical kinetics in which that assumption is valid: chemical reactions are assumed to be well-mixed in a fixed compartment such as the cytoplasm of a cell, and the real-valued concentration of each molecular species becomes a parameter of the enclosing compartment.The reaction-like rules of the chemical reaction network are all dropped in favor of mass action differential equations attached to the compartment, yielding a multicompartmental modeling language that translates to fixed differential equations (similar to (Shapiro et al. 2003) among many others).But since cells actually divide and change their interconnections, it is still valuable to add in cell division and reconnection rules and translate instead to dynamically changing sets of differential equations whose structure gets discretely altered every so often (Shapiro et al. 2013), as permitted by Equations ( 5) and ( 8).
Complex biological objects often have substructure whose dynamics is not easily captured by a fixed list of parameters and a rate function or differential equation that depends on those parameters.Extended objects such as molecular complexes, cytoskeletal networks, membranes, and tissues comprising many cells linked by extra-cellular matrix are all cases in point.We now wish to extend the syntax and semantics of the foregoing class of declarative languages to handle extended objects systematically, by creating a compositional language and semantics for biological objects as well as processes, and then extending the semantics for processes accordingly.In the case of discrete substructure this can be done with labelled graph structures (discussed in Section 3.2 below).In some cases the sub-objects (such as lipid molecules in a membrane, or long polymers in cytoskeleton) are so numerous that an approximate spatial continuum model is justified and simpler than a large spatially discrete model.Spatial continuum models with geometric objects can be built out of manifolds and their embeddings, along with biophysical fields represented as functions defined on these geometries, in various ways we now discuss.

Extended objects
Can we declaratively model non-pointlike, extended biological objects such as polymer networks, membranes, or entire tissues in biological development?To achieve constructive generality in treating such extended objects we introduce, in Section 3.2, ideas based on discrete graphs and their possible continuum limits.These ideas include graded graphs, abstract cell complexes, stratified graphs, and combinations of these ideas.Dynamics by rewrite rules, beginning with graph rewrite rules, are developed in Section 3.3.But first we discuss the more general nonconstructive types of extended objects that we may wish to approximate constructively.

Nonconstructive extended objects
Classical mathematics is not constrained by the requirement of being computationally constructive, though it can be.This lack of constraint makes easier to establish useful mappings and equivalences (compared for example to Intuitionist mathematics), so it is easier to reuse what is already known in proving new theorems.We would like to retain these advantages in a theoretical phase of work before mapping to constructive computer simulations.
Classical mathematical categories such as topological spaces, measure spaces, manifolds, CW cell complexes, and stratified spaces (the latter two composed of manifolds of heterogeneous dimension) provide models of extended continuum objects such as biological cell membranes, cytoskeleton, and tissues made out of many adjacent cells.Objects in such categories may (depending on the category) have rich and useful collections of k-ary operators (unary, binary, countable associative, etc. object-valued operators, not to be confused with the probability-shifting dynamical creation/annihilation operators of Section 2.1) such as category sums and products, and function arrows for Cartesian Closed Categories such as compactly generated topological spaces (Steenrod 1967;Booth and Tillotson 1980).Such ' C , b C , Ñ C operators for category C can be targeted by compositional semantics from context-free grammar rules that generate expressions including these operators in an AST in a modeling language; these are essentially "type constructor" type inference rules in standard programming language semantics (Pierce 2002).
Further object-generating operators may require mathematical objects in heterogeneous but related categories, such as (a) defining a quotient topological space from an equivalence relation (whose morphisms are coarsenings) on points of the original space, or (b) defining new submanifolds by level sets of continuous functions using the regular value theorem (related to the implicit function theorem), or as the image of a continuously differentiable embedding ([Hirsch 1976] Chapter 1, Theorems 3.1 and 3.2.).Such level set functions could be biophysical fields such as concentration of morphogen for tissue domain boundary, or cortical microtubules in the preprophase band for plant cell division, or they could be purely mathematical phase fields that rapidly interpolate between discrete values for different compartments.Also objects in algebraic categories such as categories of groups, rings, fields, and vector spaces may bring their "internal" algebraic operations into the purview of a semantic map that is further constrained to represent these operations.Thus Ψ should be a highly constrained structure-respecting function between the object collections of the corresponding categories of model object expressions and mathematical objects.
With extended objects we encounter the possibility that the type of one object is itself another typed object.For example a point x : S may be constrained to lie on the surface of a sphere S : Manifoldpd " 2q :: Manifold which is itself of type 2-dimensional manifold or more generally a Manifold.Indeed this possibility may be taken to define an "extended" object like S: It is, or it at least informs, the type of its sub-objects.But for consistency one might like to map domain objects to "mathematical objects" that find their formal packaging in category theory by the same definition, whether they are constituent objects like point x or extended objects like 2D manifold S.There are at least two natural ways to do this: (1) Define a mathematical object x as a point in a "space" X which is a category-object in some (preferably topological) category, and also define a category of categories containing Top, such as the small monoidal categories (though this choice might be too inclusive), to legitimize space X as also being mathematical object by this definition; or (2) Define a mathematical object x directly as an object in a category, and define a category whose category-objects are the points of a topological space such as S (itself a category-object in Top or Manifold).The latter option is taken in Homotopy Type Theory (HTT 2013), with continuous paths in a space as morphisms between its points.An enriched category results.In our case this would lead to homotopy on manifolds, CW complexes, and stratified spaces as well as graph path homotopy considerations.
By these means we could nonconstructively define the mathematical semantics map Ψ of an extended object model in terms of functors (or at least structure-representing functions) to classical mathematical categories.However the mathematical objects in these spaces are not generally defined in a computable way, so we can't get all the way to computable implementations by this route.
Our strategy to achieve computability will be to define another fundamental kind of map in addition to semantics maps Ψ: namely, Implementation maps I from restricted versions of potentially nonconstructive but standard mathematical categories, to labelled graphs.The restricted version could for example be something like piecewise linear stratified spaces (Weinberger 1994), or low-order polynomial splines to better preserve the differential topology of manifolds and stratified spaces ((Hirsch 1976) Chapter 3; (Weinberger 1994) Part II).By abuse of terminology such implementation maps may alternatively run from language (with classical mathematics semantics) to language (with graph semantics), provided the corresponding diagram commutes.In other words, implementation should commute with the semantic map, both respecting composition.This constraint applies to I both as it acts on extended objects, and on the process models (e.g.differential equations or Markov chains) under which they evolve.Diagram 5 in Section 3.3.3will provide examples.
Finite extended objects can be modeled by labelled graphs defined in the next section.These are eminently computable.Infinite graphs comprise a borderline case.One can use sequences of finite, computable labelled graphs to approximate continua.In this way we can have access to continuum extended objects as declarative modeling "semantics" for very fine-grained biological objects, while "implementing" these infinite mathematical objects in terms of explicitly finite and computable ones.Of course one's computational resources may or may not be adequate to making the approximation needed.
A major goal for declarative modeling of continuum objects is declarative modeling of partial differential equations (PDEs), before they have been spatially discretized into e.g.ordinary differential equation systems.Software that already supports some PDE models declaratively, though not in conjunction with all the other process types of Section 2 or 3.3, includes the weak form specification language of Fenics (Logg et al. 2012) for finite element method (FEM) numerical solution, and at least one commercial computer algebra system (Wolfram Research 2017).Of course the choice of numerical solution methods for PDEs is a vast area of research; here we just observe that methods for which the mapping from continuum to discrete descriptions (e.g.meshing) is dynamic and opportunistic, such as Discontinuous Galerkin methods generalizing FEMs, or Lagrangian methods including particle methods for fluid flow, would place extra demands on the flexibility of the graph formalism for extended objects introduced in the next section.

Constructive extended objects via graphs
Discrete graphs, especially when augmented with labels, are mathematical objects that can represent computable objects and expressions at a high level of abstraction.

Standard definitions
An (undirected/directed) graph is a collection V of nodes and a collection E of edges each of which is an (unordered/ordered) pair of vertices.A graph homomorphism is a map from vertices (also called nodes) of source graph to vertices of target graph that takes edges (or links) to edges but not to non-edges.Graphs (undirected or directed) and their homomorphisms form a category of graphs and graph homomorphisms.The category of undirected graphs can be modeled within the category of directed graphs by a functor representing each undirected edge by a pair of oppositely directed edges between the same two vertices.
Either category of graphs and graph homomorphisms contains resources with which to formalize labelled graphs (specifically vertex-labelled graphs), bipartite graphs, and graph colorings all as graph homomorphisms (to a fully connected and self-connected graph KΛ of L " |Λ| distinct labels Λ " tλ iPt1...Lu u; a 2-node graph without self-connections; and more generally a clique of L distinct nodes fully interconnected but without self-connections, respectively).Edge-labelled graphs (as opposed to the default vertex-labelled graphs) can be had by way of a functor from graphs to bipartite graphs which maps vertexes to red vertices and edges to blue vertices; the resulting bipartite graph can be further labelled as needed.
As usual, a tree is an undirected graph without cycles and a Directed Acyclic Graph (DAG) is a directed graph without directed cycles.A directed rooted tree is a DAG whose undirected counterpart is a tree, and whose (directed) edges are consistently directed away from (or consistently towards) one root node.Then Abstract Syntax Trees (previously discussed as the substrate for declarative model transformations) can be represented as node-and edge-labelled directed rooted trees with symbolic labels for which the finite number of children of any node are totally ordered (e.g. by extra numerical edge labels), so there is a unique natural "depth-first" mapping of ASTs to parenthesized strings in a language.Alternatively it is also possible to totally order the nodes in a tree or DAG in "breadth-first" ways that respect the directed-edge partial order, as in task scheduling.
The automorphism group of a labelled graph is in general reduced in size from that of the corresponding unlabelled graph, making it easier to detect particular subgraphs.Some binary operators of type pgraph, graph) Ñ graph, namely categorical disjoint sum "' Graph " and the graph cross product "ˆ" = categorical product "b Graph ", can be defined as universal diagrams in the Graph category.The graph box product "l", essential for meshing, can also be formulated as a tensor product universal diagram but not strictly within the category ((Knauer 2011) Theorem 4.3.5).Graph functions by binary operator "Ñ Graph " that produces a graph are also possible but require an altered definition of the category (Brown et al. 2008).

Non-standard definitions
A special case of a labelled graph is a numbered graph with integer labels λ 1 i " i, |Λ 1 | ě |V|, and the labelled graph homomorphism to KΛ 1 is one-to-one (but not necessarily onto), so each vertex receives a unique number in Λ 1 " t1 . . .k ě |V|u.Then one way to express a labelled graph G Ñ Graph KΛ is as the composition G Ñ Graph KΛ 1 Ñ Graph KΛ of a graph numbering G Ñ Graph KΛ 1 followed by a relabelling determined by a mapping of sets Λ 1 Ñ Set Λ Y t∅u, with ∅ the value taken on ununused number labels i.The possibility of unused numbers (|Λ 1 | ě |V|) will be needed when consistently labelling two different graphs.
We consider graph homomorphisms to five particular integer-labelled graphs: N "pN, Successorq " nonnegative integers t0, 1, . ..u as vertices, with (possibly directed) edges from each integer i to its immediate successor i `1 and to itself; (In this notation the hats Ĝ refer to the addition of self-edges.)Graph homomorphisms to these graphs will provide the core of the next few definitions.Following ([Ehrig et al. 2006] Chapter 2), such homomorphisms themselves form useful graph-related categories.These definitions can all be used in either the undirected or directed graph contexts.The two contexts are related by the standard functor in which an undirected edge corresponds to two oppositely directed edges, defining a mapping from undirected graphs to directed graphs.
A graded graph is a homomorphism from a graph G to N. Additional assumptions may allow an undirected graded graph to "approach" a continuum topology such as a manifold or other metric space, in the limit of large level number.The metric can be derived via a limit of a diffusion process on graphs.For example, D-dimensional rectangular meshes approach R D this way (e.g.Mjolsness and Cunha (2012)).A conflicting definition for the term "graded graph", disallowing connections within a level, was given in (Fomin 1994) but we need both ∆l " 0 and ∆l " `1 edges.In the category of undirected graphs, ∆l " ˘1 are indistinguishable except by consulting the vertex labels l.Of course, p N, id Nq is itself a graded graph.
A stratified graph of dimension D is a homomorphism from a graph G to ĴD .In the undirected graph case, such a homomorphism is equivalent to a graph whose vertices are labelled by dimension number, since it imposes no constraints on edges.The reason for reversing the edges in the directed graph case, compared to the graded graph definition, is that the level-changing edges then correspond to standard boundary relationships from higher to lower dimensional strata.
Abstract cell complex: A special case of a stratified graph of dimension D is a graph homomorphism from G to Nop D , the directed graph of integers t0, . . .Du with self-connections and immediate predecessor connections i Ñ i ´1 within the set.Since Nop D maps to ĴD by graph homomorphism, this is (equivalent to) a special case of stratified graphs.
A graded stratified graph of dimension D is a graph homomorphism from G to C D " Nl ĴD .Two nodes are connected iff either their level numbers are equal (and, in the directed case, the source node dimension is ě the target node dimension), or if their level numbers differ by one and their dimensions are equal.Because G projects to both N and ĴD , a graded stratified graph maps straightforwardly (by functors) to both a graded graph and to a stratified graph.However, it enforces an additional consistency constraint on level number and dimension: they cannot both change along one edge (and, in the directed case, the source node dimension must be ě the target node dimension).
A graded abstract cell complex of dimension D is likewise a graph homomorphism from G to CD " Nl Nop D .Our aim with these definitions is to discretely and computably model continuum stratified spaces (including cell complexes) in the limit of sufficiently large level numbers.Stratified spaces are topological spaces decomposable into manifolds in more general ways than CW cell complexes are.In addition, we would like each manifold to be a differentiable manifold with a metric related to a Laplacian, since that will be the case in modeling spatially extended biological or physical objects.
Thus we arrive at the constructive slice categories SpHq : where H " N, ĴD , Nop D , C D , or CD .Here it is essential that each possible target graph H includes all self-connections.In particular a homomorphism of graded graphs is a graph homomorphism with H " N, so graded graphs together with level-preserving graph homomorphisms form a slice category.
For a discrete approximation to a stratified space by a stratified graph, we can identify the "graph strata", and their interconnections by boundary relationships, as connected components of constant dimension.So, by eliminating links between nodes of different dimension, and then finding the connected components that remain, we identify the strata in a stratified graph (or in a stratified labelled graph).In this commutative diagram: qpdq must be a fully disconnected graph, with each vertex then corresponding (by h ´1q to a d-dimensional stratum in G.The directed graph associated to G S (directed by dimension number in the case of undirected graphs) is a DAG, due to disconnection within each dimension.The graph homomorphism h becomes a homomorphism of stratified graphs.
Observation 1 The resulting G S in Diagram 2 is the graph of strata, a minimal structure for modeling complex geometries.It is therefore a natural graph on which to specify rewriting rules for major structure-changing processes such as biological cell division, mitochondrial fusing, neurite or cytoskeletal branching and other essential processes of biological development.
The geometry of cytoskeleton, in particular, is better captured by stratified graphs and stratified spaces than by cell complexes, because 1D and 0D cytoskeleton is often embedded directly into 3D cytosol rather than into 2D membrane structures.
Observation 2 In a graded stratified graph a useful special case occurs if G S restricted by level number stabilizes beyond a constant number of levels, so the description in terms of strata has a continuum limit G S. In that case, the limiting G S is also the natural locus for verification of compatible boundary conditions between PDEs of different dimensionality governing the evolution of biophysical fields defined on continuum-limit strata.
The edges remaining in G S , connecting strata of unequal dimensionality, model in-contact relations such as "boundary" and/or "inside".Further constraints are needed to disentangle these alternatives.Similar ideas to G S may be involved in the "persistent homology" approach to unsupervised learning of data structure (Bendich et al. 2007).
As observed above, the inverse image of a node in G S identifies the corresponding stratum in G.It is natural to use the graph Laplacian on each such stratum to define: (a) geometric distances, using the Green's function of the Laplacian operator; (b) regularizers for regression of functions from sparsely provided data (cf.Poggio and Girosi (1990)); (c) a Sobolev space H 2 of functions in the infinite-graph limit, for biophysical fields; (d) the definition of functional integrals in the infinite-graph limit, using a kernel k m,λ px, yq " pm ´λ∇ 2 qδ Dirac px ´yq to form a statistical mechanics partition function such as the Gaussian functional integral (e.g.Mandl and Shaw (2010)): and/or (e) Graph Convolutional Networks, a generalization of deep convolutional neural networks for machine learning from rectangular grids to general graphs (Hammond et al. 2011, Kipf andWelling 2016).But in order to recover the expected continuum properties from the graded graph of meshes associated with a stratum, it may be necessary to reweight the edges of the graph (and hence its Laplacian) according to the local scale of its embedding into for example the three dimensions of flat biological space.Laplacians and their heat kernels on Riemannian manifolds are sufficient to recover the local geometric structure by way of triangulated coordinate systems (Jones et al. 2008).Graph heat kernels converge to the manifold ones in cases where a sequence of graphs "approximate" the manifold (Coifman and Lafon 2006).
Another useful special case of a stratified graph occurs if G S obeys the constraint that only adjacent dimensions connect.In that case one has the discrete graph structure which we have defined as an abstract cell complex (ACC), although there are several related claimants for that phrase and we make no claim for the superiority of this one.This condition results from using the Hasse diagram for the dimension-labelled strata, with other possible boundary relationships among cells being obtained by multiple steps along ∆d " ´1 ACC edges.In most geometric applications one may in addition observe that (b) two strata of dimensionality labels d ´1 and d `1 are always mutually adjacent to either zero or two strata of dimension d, though in common with the "abstract complexes" of MGS (Giavitto and Michel 2001) we do not make this part of the definition.ACCs have been used for declarative developmental modeling at least of plant development (Lane 2015) and neural tube development (Spicher 2007), the latter using MGS.
PDEs: If some of the strata in G S are host to partial differential equations (respecting possibly dynamical boundary conditions at adjacent lower and higher dimensional strata) then G S will be too minimal for solution algorithms like finite elements or finite volumes, and those strata may have to be meshed.In that case the strata of G S may need to be subdivided sufficiently finely into patches that can each host a local coordinate system, compatible with its neighboring patches of different dimension and (under one possible strategy) separated by extra artificial boundary strata patches from its neighbors of the same dimension.Compatible finer meshes (computed eg. by programs such as Tetgen (Si and Gartner 2005; Mjolsness and Cunha 2012) could be aligned with the local coordinate systems.As a simpler example, a cell complex graph whose cells embed into 3D as polyhedra can be subdivided into cuboids whose main diagonals each stretch from a vertex of the starting polyhedron to its centroid; but this decomposition may or may not have good numerical properties such as condition number of a meshed mechanical stiffness matrix.A problem for developmental biology, as for fluid simulation, is to maintain mesh quality while the system simulated undergoes large deformations.Meshing for PDEs is a vast field of applied mathematical research to which we cannot do justice; the point here is that stratified graded graphs (and perhaps other graph slice categories) provide a way to formalize many of the problems and capabilities that have to be represented explicitly for declarative modeling to be applied.
Abstract cell complex graphs are also the key structure for Finite Element Methods (Hughes 2000) and for Discrete Exterior Calculus (DEC) discretizations of PDEs (Desbrun et al. 2005, Arnold et al. 2010).DEC allows for the separate discrete representation of k-forms and the full exploitation of the generalized Stokes' theorem, including symplectic PDE integrators and the Helmholtz/Hodge decomposition of function spaces for possible PDE solutions.DEC can be combined with subdivision (de Goes et al. 2016) in the manner of a graded stratified graph.On the other hand, analysis on more general stratified spaces is also possible including differential operators for corners with singularities allowed on the approach to lower dimensional strata (Schulze and Tarkhanov 2003).
By these various means one would like to generalize from ODE-bearing rules to PDE-bearing rules which would be of two basic types: (1) PDEs for the evolution of biophysical fields such as diffusion within a manifold (possibly including a hyperbolic term for finite speed information propagation, as in the Telegrapher's Equation derivable for stochastic diffusion (Kac 1974) on the way to the Heat Equation), and (2) PDEs for the evolution of the embeddings of strata into higher strata, such as cytoskeleton mechanics (1D into 2D or 3D) or the biomechanics of 2D membranes embedded into ordinary three dimensional Euclidean space.This could be accomplished by way of level sets for example.

Dynamics on Graphs
There are at least two ways to mathematically define the semantics of graph rewrite rules for use in biological modeling of extended objects.The operator algebra approach, extending the form of rewrite rule semantics given in Section 2, has been used as the theory behind some molecular complex modeling and tissue-level developmental modeling methods.It takes advantage of the ring of operators to blend with continuous-time process models by scalar multiplication of rates and operator addition of parallel processes, thereby also gaining compatibility with quantitative methods of statistical mechanics and field theory in physics, and with machine learning by con-tinuous optimization.A second approach, the graph homomorphism pushout diagram approach championed in (Ehrig et al. 2006), has been used in molecular complex modeling (Danos et al. 2012).It takes advantage of the category theory formulation of graphs, discussed in Section 3. We will first briefly discuss pushout semantics, since it leverages graph categories but we have little to add to this approach, and then we will discuss operator algebra semantics for graph grammars in continuous time.

Pushout semantics
A pure (unlabelled) graph grammar rule G Ñ K G 1 can replace graph G with graph G 1 , holding common subgraph K constant, anywhere that G (and its subgraph K) occurs as a subgraph inside of some "host" graph C within the current pool of one or more graphs (indeed C can be taken to be the entire pool of graphs in the current state since that's just a big disconnected graph).The result will be an altered version C 1 of the host or pool graph.Using graph homomorphism arrows, the double pushout diagram for the firing of a graph grammar rule relates all these graphs as follows: The diagram is to be universal at C and C 1 (so any rival occupant for either position is essentially just a homomorphic image of the universal occupant), hence is double pushout.If such a C 1 exists, then it is a candidate outcome for the effect of a rule firing of the given rule on the given pool.
What is nice about this diagram is that, as in Section 3.2 above, its objects and arrows can be reinterpreted in other graph-related categories including slice categories such as labelled graphs and other typed attributed graphs.We also imagine that it should be possible to implement a slice graph category rewrite rule under the double pushout semantics in terms of ordinary (double pushout) labelled graph rewrite rule firings, by use of suitable labels.A possible sticking point is verifying implementation of the universalities of the pushout construction in the slice category.In this paper we will only study the analogous question for operator algebra semantics (Section 3.

below).
There is also a closely related single pushout diagram version of the semantics, and a collection of "independence" conditions for two successive rule firings to have an order-independent result (Ehrig et al. 2006).

Graph rewrite rule operators
By generating unique integer "Object ID" parameters for each new object created in a parameterized grammar rule, it is possible to implement graph grammar rules by parameterized grammar rules alone, just using repeated OID values to represent graph links.This route was detailed in (Mjolsness and Yosiphon 2006) and implemented declaratively in (Yosiphon 2009).But it is also possible to define graph grammar rewrite rule semantics directly as a continuous-time dynamical system, using operator algebra as in the previous semantics defintions of Section 2.

Suppose we have two labelled graphs G
Ñ Graph KΛ 2 .We decompose them each into a numbered graph G num i : G pure i Ñ Graph Kt1,...k i ě|V i |u and a relabelling Kt1,...k i ě|V i |u Ñ Graph KΛ i determined by a mapping of sets t1, ...k i u Ñ Set Λ i , determined in turn by an ordered listing of labels λ i , possibly augmented by the nullset symbol ∅.The whole decomposition can be denoted G num i xxλ i yy.We are interested in graph rewrite rules G num xxλyy Ñ G 1 num xxλ 1 yy that respect a single consistent numbering of vertices of the two numbered graphs before their relabellings.In that case vertices in G 1 and G 2 that share a vertex number are regarded as the same vertex v, before and after rewriting (similar to the shared graph K in double pushout), so that any graph edges contacting v and not mentioned in the rewrite rule are preserved.
Here (Equation (14 below) is an example pertaining to refinement of triangular meshes in 2D.This is one of four rewrite rules that suffice to implement a standard mesh refinement scheme.Three of those rules including this one deal with partially refined triangle edges, an intermediate state produced by the previous refinement of nearby triangles.It can also be interpreted as an (unlabelled) graded graph rewrite rule since it preserves the graded graph constraints on level numbers l, if they are satisfied initially.(The other rewrite rules are similar but deal with the cases of zero, two, or three partially refined triangle sides.)The labelled graph rewrite rule is: Note that in this example there is a shared numbering of nodes of the two graphs, and node numbers 1-4 occur in both graphs.This is equivalent to identifying the shared subgraph K in the double pushout semantics of Diagram 3. Any extra edges that contact nodes 1-4 in a subgraph of the pool graph, e.g.parts of other nearby triangles, will remain after this rule fires.Of the other three grammar rules, two are analogs of ( 14) for two or three hanging side nodes like node 4 above; only one rule increases the maximal level number: In the resulting four-rule grammar, the order in which triangles are refined does not interfere with the potential refinement of other triangles.If for example a global maximum-level constraint l ď L were imposed via the firing rate of each rule, then the refinement process would converge by many alternative paths to the same fully refined terminal triangular mesh.
A more complicated rewrite rule can maintain not only l " level number but also d " dimension in a cell complex representation, during this kind of triangle refinement.
To implement a 2D version of the the polyhedron Ñ cuboid mesh refinement scheme mentioned in Section 3.2.2,one could alternatively start with four triangle Ñ quadrilateral refinement rules; for example instead of rule ( 14) one would have: However, one is then committed to designing a larger grammar that can refine both triangles and quadrilaterals with varying numbers of hanging side nodes like node number 4 above, possibly requiring six more quadrilateral refinement rules since there are two ways to place two such side nodes.Abstraction via sub-grammars may then become important.Further notational innovations could also reduce the number of rules needed.This completes a discussion of various particular examples of labelled graph rewrite rules.
In general suppose G and G 1 are numbered graphs sharing a common numbering of their vertices, with index-ordered adjacency matrices rg pq |p, qs and rg 1 p 1 q 1 |p 1 , q 1 s whose elements take values in t0, 1u.Then an unambiguous graph rewrite rule can be expressed as: where the double angle brackets denote label substitutions: . .where σ : N Ñ N and σ 1 : N Ñ N are strictly monotonically increasing mappings of initial segments of integers into the shared index space.
We give graph grammar rule semantics for the case of directed graphs in terms of binary state vectors for node and edge existence; creation and annihilation operators all have dimension 2 ˆ2.Assuming there is some label in the label space, and provided the global state is initialized to have zero probability of active edges for unused nodes, then the off-diagonal portion of the operator algebra semantics is: For an unlabelled graph, add a single formal label, so Λ " t∅u.The leading (rightmost) operators Z i p λ p in Equation ( 18) find unused nodes to build new graph out of, if needed.Their indices i p should be each averaged rather than summed over, since it doesn't matter which unused node is brought into service; hence the proportionality rather than equality in ( 18).If only a finite range of index values i p P t0, . . .I overflow ´1u are possible and they are all used up, then only rewrite rules that don't create new nodes will be able to fire; one is naturally interested in the infinite storage space limit I overflow Ñ `8 for which this problem doesn't occur.The trailing (leftmost) "garbage collection" erasure operators ś i E i p i E i i p erase any edges pi p , iq or pi, i p q dangling from nodes i p that have just been deleted, making those nodes suitable for reuse in future rule firings.These operators maintain the invariance of the statement that inactive nodes have all their edges inactive as well, which we assumed is true of the initial condition.
As is the case for Equation (3), Equation ( 18) can be taken as a normal form for rewrite rule dynamics, but now applied to graphs.We will show next that (somewhat akin to the "Concurrency theorems" of the double pushout approach, but different in detail) the product of two such forms can be rewritten as a (possibly large) integer-weighted sum of expressions having the same form, or a form of equivalent meaning with extra factors of Z and E that don't affect the active node set; however, some of the weights may be negative.
As a hasty preview of the proof below, this simplification of the product of two graph rewrite rule forms happens because the Z and E factors can be grouped into members of the compound rule's rhs z lhs and lhs z rhs, which commute past everything else, and the remaining factors which are expressed in terms of a and â (including negative coefficients from Z) and join in the interior graph normal form factors using commutators.The extra factors are indexed by a particular set of indices computed below.In more detail, we have: Proposition 1 The product of two operators taking the form of Equation ( 18) can be rewritten as an integer-weighted sum of expressions taking the same form, except possibly with extra factors of Z and E that don't affect the active node set.
Proof : The product of two expressions of the form of Equation ( 18) initially takes the general form Ŵr 2 Ŵr 1 9 `ρr This expression should equal a sum of expressions that take the same form as Equation ( 18), although possibly with negative signs and irrelevant extra factors of E and Z as stated, for various choices of graph rule matrices g 1;2v and g 1 1;2v (v indexing the summands that result).The leading factors of ρ multiply properly to give a new leading ρ for each summand, possibly to be multiplied by integers arising from commutation relations.The ) will become a new ) after some nonnegative number of index collisions involving commutators proportional to δ i p 1 i p 2 and δ j q 1 j q 2 for various p and q subindices reduce the number of indices summed over by demanding index equality.
The proof work occurs in two steps: (1) commuting factors of E in line 5 to join those in line 2 and factors of Z in line 5 to join those in line 7, and (2) commuting factors of â from line 6 past (to the left of) factors of a in line 4 to restore normal form, more specifically for (2a) the edges and (2b) the node labels.We will discuss these two steps in reverse order, since Step 2 is simpler.
Recall that all a, â commutators are either zero, when operator types or indices don't match, or they are diagonal and a linear combination of the identity and a normal form N " âa matrix, multiplied by a delta function that eliminates one or more indices from the sum.
Step 2. Since a 2 U " â2 U " 0 for all indices U, and since E and Z are multilinear in the basis tI, a U , âU |all indices Uu, as is the interior of Equation ( 18) in between its E and Z factors, the interior of the resulting fully commuted forms will also all be multilinear in the basis tI, a U , âU |all indices Uu.From that fact one can pick out appropriate matrices g 1;2 and g 1 1;2 on the edge labels.Their entries will all be either 0 or 1 since higher powers a lě2 U " 0 " âlě2 U contribute nothing to the full operator.Likewise with labels.
Step 1.We now translate the calculation of Step (1) to set notation as follows.For each summand in ř tiu ř tju . .., the variables ri p |ps and rj q |qs take values in some sufficiently large range of integers t0, ...I overflow ´1u.On the other hand each subindex p, p 1 , . . .(there are several different dummy bound indices with each of these names in Equation ( 19)) takes values in t1, . . .k 1 ď k 1L `k1R u, and likewise for q, q 1 , . . . in t1, . . .k 2 ď k 2L `k2R u.Given fixed mappings ri p |ps and rj q |qs, these indices now effectively run over the following sets of pool nodes: Fortunately it is only the singly-pp, qq-indexed Z i p ˚and E ˚jp ˚operator factors that need to be commuted past other factors to get to their target locations to the rightmost (respectively leftmost) factor collections in the compound rule analog of Equation ( 18).It should be clear that some of the E factors commute leftwards unobstructed, due to the delta functions in the commutation relations that give zero commutator whenever subscripts don't match.This happens for E terms indexed by members of the set L 1 zR 1 (arising from line 5 in Equation ( 19)), provided they are not in any of the index sets for graph edges that occur on lines 3 and 4: namely, L 2 or R 2 .This set then is pL 1 zR 1 qzL 2 zR 2 .These unobstructed E factors will always join those already on line 2 of Equation ( 19)) indexed by pL 2 zR 2 q, to form the union set E 1;2 defined in Equation ( 21) below.
The remaining obstructed terms can be fully absorbed into the normal form terms that obstructed them, feeding into Step 2.
In the same manner we calculate the set Z 1;2 of factors that always end up in the rightmost collection of Z factors in line 7 of Equation ( 19)) after commutation; Z 1;2 is given in Equation ( 21) below.It turns out to be related to E 1;2 by the time-reversal duality L 1 Ø R 2 , L 2 Ø R 1 , which also pertains to the rest of this calculation.
On the other hand, by Equation ( 18) what we need in these positions in the compound rule is L 1;2 zR 1;2 for the E factors and R 1;2 zL 1;2 for the Z factors, where L 1;2 corresponds to the LHS of the compound rule r effective " "1;2" (our made-up index to denote the new compound rule).So what are L 1;2 and R 1;2 ?The left hand (input) side set of the compound rule is the left hand side L 1 of the first-acting rule, namely rule 1, together with any inputs to rule 2 that are not supplied by outputs of rule 1 (L 2 zR 1 ) but instead must be input to the compound rule.Thus L 1;2 " L 1 Y pL 2 zR 1 q.By similar argument, or by time-reversal duality, R 1;2 " R 2 Y pR 1 zL 2 q.
So we have these compound rule index set definitions: Together they imply by Boolean logic the following (dual) disjoint set unions: where 9 Y denotes disjoint set union.The Boolean logic proving Equation ( 22) is carried out in full, in the 16-row truth table below (Table 1).
Thus, the E 1;2 and Z 1;2 sets have all the entries they need to assume the form of Equation ( 18) for the compound rule -plus a few more, namely those indexed by ∆.What is the meaning of ∆? From (Table 1), it is easy to verify that another expression for ∆ is ∆ " L 2 X R 1 X L 1;2 X R 1;2 .These are the indices of vertices that are neither ingoing to nor outgoing from the composite rule, yet are both outgoing from rule 1 and ingoing to rule 2. They correspond exactly to the "churning" (allocation and immediate deallocation) of extra unused node and edge memory by rule 1 followed by rule 2, that is not needed in the compound rule.(These operators also maintain the invariance of the statement that inactive nodes have all their edges inactive as well.)So the extra factors of E and Z just correspond to an equivalence class of operators that differ only by "churning" extra unused memory.
Thus, after full use of commutation relations, Equation ( 19) becomes a (weighted, signed) sum of operators each equivalent to a compound rule's operator in the form of Equation ( 18).Furthermore the equivalence relation is given by extra factors of E and Z arising from set ∆, as stated in the proposition.QED.
Corollary 1 The commutator r Ŵr 1 , Ŵr 2 s of two operators taking the form of Equation ( 18) can also be rewritten as an integer-weighted sum of expressions taking the same form, except possibly with extra factors of Z and E that don't affect the active node set.Compared to the product Ŵr 1 Ŵr 2 , however, many summands may cancel.
Observation 3 In this sense (Proposition 1 and Corollary 1), there is a higher-level algebra of graph rewrite rule operators that can be "implemented" by (mapped compositionally by ring homomorphism to) a sufficiently large indexed collection of binary state variables with their own lower-level state-changing operator algebra.
Observation 4 If instead of the creation and annihilation operators for Boolean state variables we were to use creation and annihilation operators for N-valued numbers of identical particles in the definition (18) of Ŵr , then in Proposition 1 all the signs from commuting the inner parts of the rule operators would come out ě 0, which is simpler.But (1) the handling of memory allocation and deallocation might have to be revised, and (2) graph grammar rules could have unintended semantics in terms of multigraphs: graphs with nonnegative integer edge weights.
Conjecture 1 Since the operator algebra and pushout diagram semantics are alternative ways of defining the "same" operation, they must be related.So we conjecture there is a map Ψ 2 making Diagram 4 commute: The conjectured mapping Ψ 2 can be defined as tranferring probability to the subspace of states of the Fock space corresponding to a C 1 post-firing pool graph, from the subspace of states of the Fock space corresponding to a C pre-firing pool graph.
Semantics alternatives.The semantics of Equation ( 18) permit several rule nodes (indexed by p) to map to one pool node (indexed by i p ) if that node is self-connected, as does the pushout rule firing semantics described in Section 3.3.1 (since graph homomorphisms can be many-to-one).22).False entries are shown as blank for ease of visualization, and every entry is either true or false.Note that the last (16th) line is entirely blank i.e. false.The entries augmenting E 1;2 and Z 1;2 by ∆, when L 1;2 zR 1;2 and R 1;2 zL 1;2 are both false, are boxed.To prohibit many-to-one node matching in general one could constrain the summed pool indices to all be unequal.To prohibit many-to-one node matching in particular cases one can (a) add differentiating node labels and use the semantics of (18), or (b) alter and generalize the semantics to include factors of ś LHS pZ i p i q q ḡpq where ḡ is a 0/1-valued matrix representing a second type of graph edge that identifies prohibited connections on the LHS, and likewise ś RHS pZ i 1 p i 1 q q ḡ1 p 1 q 1 for the RHS.The normal form could put these new Z product factors just to the right of (acting just before) the corresponding factors for g and g 1 in Equation ( 18).If corresponding entries of g and g 1 both take the value 1, that inconsistency has no effect since their product has a factor aZ " apI ´âaq " a ´pa âqa " a ´pI ´âaqa " a ´a `âa 2 " 0.

Slice Rewrite Rule Operators
The slice categories of Diagram 1, with H " N, ĴD , Nop D , C D , or CD for graded graph, stratified graph, abstract cell complex, graded stratified graph, and graded abstract cell complex respectively, are all variants of the category of labelled graphs ϕ : G Ñ H whose labels κ are nodes of H, with extra constraints added on the integer-valued labels.We can encode these constraints in each case with a predicate P H pϕq, and enforce them with a real-valued "gating" indicator function ΘpP H pϕqq which takes the value 1 if the predicate is satisfied and zero otherwise.If an ordering on the nodes of G is established, as we have assumed, then these objects become P H pκq and an indicator function ΘpP H pκqq.In the foregoing graph rewrite rule semantics, such an ordering is established by the arbitrary indexing scheme of p and q.So we may generalize the graph rewrite rule to cover these slice categories as well by mapping ρ slice r pκ, κ 1 q to a corresponding ρ graph r ppκ, λq, pκ 1 , λ 1 qq: ρ graph r ppκ, λq, pκ 1 , λ 1 qq " ΘpP H pκqq ˆΘpP H pκ 1 qq ˆρslice H, r pλ, λ 1 q. (23) The first indicator function in Equation ( 23) could be omitted if the initial condition gives nonzero probability only to valid H-slice graphs and all rules in the grammar are valid H-rewrite rules, maintaining the validity conditions P H using the second indicator function in in Equation ( 23).
The remaining operator products in Equation ( 18) can remain the same, yielding the operator algebra semantics of these (and potentially other) slice category rewrite rules, complete with provision for extra domain-model specific labels λ.
In this way we could implement special graph grammar syntax for slice graph rewrite rules.For example in the case of undirected graded graphs, we could let directed edges represent ∆l " `1 edges and undirected edges represent ∆l " 0. The triangle refinement example would then become: Directed graded graphs could be represented too, with just one bit of edge label information to record whether there is a change of level number along a directed edge or not.In both cases the integer-valued level number edge labels are removed, to be restored automatically by an implementation map I from slice graph grammar rule syntax to ordinary graph grammar rule syntax.
(This implementation map could even be expressed in the form of a meta-grammar rule, mapping rules like (24) to a slight variant of rules like ( 14) with an AST for the labels.)Similarly, for stratified graphs one could label edges with ∆d and allow the interpretation process to restore d.For abstract cell complexes one instead needs only to record one extra bit of edge information regarding d: ∆d P t0, ´1u.As in the case of rule ( 14), with these extra edge labels rule (24) could be made more elaborate by retaining all relevant strata and their connections at smaller level numbers rather than just the graded graph "frontier" comprising the deepest substrata in each stratum.
Observation 5 In all these slice graph category cases the implementation mapping on rewrite rules should match the implementation mapping on their semantics as proposed above, so that implementation commutes with semantics: Slice Rules Ψ ✲ Gated Op.Alg.

Efficient implementations
We just saw that slice graph grammar rules can be implemented (efficiently) in terms of ordinary labelled graph grammar rules.
The efficient implementation of graph grammars rules themselves can also be considered.We have mentioned that they can be implemented in terms of parameterized grammars with parameters devoted to recording integer-valued ObjectIDs.That implies that worst-case performance for parameterized grammars can be as bad as finding small unlabelled subgraphs in small unlabelled graphs, though finding subgraphs in practice is a lot easier than finding them in worst case, and labels help substantially.So one option is just to deploy algorithms that match small symbolic expressions, or use computer algebra systems that have done the same.But another option is available specifically for declarative modeling: to find the rules with the most commonly occurring rule firings in a model, and to use meta-grammars or a meta-language (discussed in Section 2) to transform those rules into submodels comprising rules taking only special forms that can be compiled into special-case efficient simulation code.Examples of special-form rules amenable to special-case simulation code include parameterless rules, terms with parameter sets that take only a few values, rules that consist only of differential equations, and many other possibilities.Then, use strategies like operator splitting to simulate quickly most of the time, slowing down only for occasional higher-cost operations like cell division in a tissue model.

Meta-Hierarchy via Graphs
If we seek models in discrete mathematics for the idea of a "hierarchy" such as a hierarchy of biological systems and subsystems, or a hierarchy of modeling methods, the simplest possibility is a tree: a directed graph whose undirected counterpart has no cycles.This graph could be labelled with the names of the subsystems, methods, or other concepts in the hierarchy.Such a restrictive definition could be appropriate for a compositional hierarchy, or for a strict classification aimed at reconstructing clades, but not in general for a hierarchy of specializations or subsets in which a node may have several parent nodes.For that case a less restrictive possibility is to model a hierarchy as a labelled DAG (directed acyclic graph), which has no cycles as a directed graph.Thus a labelled DAG is a natural model for the idea of a hierarchy that is more general than a tree.
However, as the foregoing examples show, a hierarchy may be composed of items related in several different ways (composition, specialization, mutually exclusive specialization, and so on.)This fact suggests a further generalization.If the edge labels in a labelled DAG take values in a further DAG of possible relationships, themselves related by specialization, the resulting compound structure can be called a meta-hierarchy since it encodes a hierarchy of interrelated hierarchies.This kind of structure has precedent in eg. the more general "typed attributed graphs" of (Ehrig et al. 2006).
In Section 5 we will consider a meta-hierarchy whose vertices index into (are labelled by) formal languages for modeling aspects of biology.

Model Reduction
Model reduction can be a reasonable strategy to deal with biological complexity.Instead of picking out the most important variables and processes to include in a model a priori, one can include some reasonable representation of many variables and processes (the model still won't be complete) with reasonable parameter values (easier said than done) in a fine-scale model, and then computationally seek a smaller, coarser scale model that behaves in approximately the same way, on some some set of "observables" or "quantities of interest", in some relevant region of parameter space.
In addition to eliminating conditionally unnecessary state variables for simplicity's sake, model reduction has the potential to: (a) enable scaling up to very large models through increased computational efficiency in simulation; (b) mathematically connect predictive models across scales of description for causal authenticity and greater accuracy at each scale; (c) enable the study of the great diversity of possible emergent phenomena, as a function of the parameters, structures, and initial conditions of fine-scale models.Repeated model reduction can result in a hierarchical stack of inter-related models, with which to systematically maximize these advantages.
How can we use machine learning to perform the computational search for reduced models?Given enough data from a pure (parameterless) stochastic chemical reaction network, and the correct structure of the network, it is possible to learn the reaction rates (Wang et al. 2010;Golightly and Wilkinson 2011).In Section 4.1 below we summarize how to learn not just reaction rates but an effective reduced-state space model in the form of a time-varying Boltzmann distribution in at least some examples by following very general principles, for the parameterless case (Johnson et al. 2015) and for the case in which parameters include spatial positions (Ernst et al. 2018).
The general theme of using machine learning to create computationally efficient reduced models is rapidly advancing.In computational chemistry for example (Smith et al. 2017) develop a neural network for learning from, interpolating, and much more efficiently applying the energy and therefore force information computed in Density Functional Theory fine-scale calculations.Likewise, other work (e.g.Burkardt et al. (2006)) addresses difficult problems in fluid flow.
As for the general semantics (Ψ) and implementation (I) families of structure-respecting mappings, we will denote model reduction mappings by a family symbol R.

Learning Boltzmann Distribution Dynamics
In the parameterless rewrite rule (e.g. a chemical reaction as in Equation ( 2)) case we learn a coarse-scale approximation p of p as a time-varying version of a Boltzmann distribution or Markov Random Field (MRF): Here each potential function V α is a function of a subset or clique C α of the components of s, creating a bipartite "factor graph" of variable-nodes indexed by i and factor-nodes indexed by α (Lauritzen 1995;Frey 2003).If the factor graph is not connected then its connected components all factorize into independent probability distributions whose product is p.
To get the dynamics of µ one can minimize the KL divergence between p and p, where p evolves under a differential equation whose right hand sides rF α |αs can be taken to be a learned combination of a large number of hand-designed basis functions (Johnson et al. 2015;Johnson 2012), optimizing a KL divergence between distributions p and p.This is the "Graph-Constrained Correlation Dynamics" (GCCD) model reduction method.It was used to achieve a substantial reduction in number of variables for modeling a molecular complex in synapses.
The goal of training for model reduction may be summarized as minimal degradation over time of the approximation on a set of observables: Here R is a restriction operator mapping fine-scale to coarse-scale states, and ∆t indicates the passage of time under model dynamics.This diagram can be stacked horizontally, for teacherforcing model training, or vertically, for application across more than two scales.
This model reduction method can be extended to the case of continuous spatial parameters (Equations ( 5) and ( 10)) as follows: Instead of a discrete state vector s for all the numbers of all the possible (usually molecular) species, we have a representation comprising the total number n : N of objects (e.g.molecules) present, indexed by i P t1, . . .nu, together with an n-dimensional vector α of discrete species types α i and an n-dimensional vector x of continuous spatial parameter vectors x i such as d " 3 dimensions of space (though orientation could contribute `d 2 ˘" 3 more components).Then a pure state vector (probability 1 concentrated on one state) can be denoted by the "ket" basis vector |n, α, x, ty, analogous to a single vector n in GCCD above, and the full state of the system is a probability mixture of the basis states.Similar to GCCD, construct a coarse-scale mixed state based on an energy function and a Boltzmann distribution that sums over 1-particle contributions ν 1 pα i 1 , x i 1 , tq to the energy, summed over one particle index i 1 , plus 2-particle contributions ν 2 ppα i 1 , α i 2 q, px i 1 , x i 2 q, tq that obey permutation invariance, summed over two indices i 1 ă i 2 , and so on up to k-particle contributions of order k " K: |rν k |k P t1, . . .Kus, ty " where xiy n k " ti 1 ă i 2 ă ¨¨¨ă i k : i P r1, nsu denotes ordered subsets of k indexes each in t1, . . ., nu.Here the partition functional Z rrν k |kss has two nested square brackets, the outer brackets indicating that Zr. ..s depends on its arguments as a functional depends on functions rather than as a function depends on numbers, and the inner brackets indicating that the ν k functions indexed by k should all be included in the argument list.
The goal as in Diagram 6 is to find p that approximates the solution to the master equation 9 ppn, α, x, tq " W ¨ppn, α, x, tq, where W sums over all processes that affect the state such as chemical reactions and diffusion (studied in Ernst et al. ( 2018)) but eventually also active transport, crowding effects and so on.
To define the time evolution of the reduced model pptq, introduce a set of functionals rF k |k P t1 . . .Kus (ODEs, PDEs, or even other forms for F could be tried) to create a governing dynamical system for the interaction functions rν k |ks: The criterion for choosing the functions F is to minimize an action that integrates a dissimilarity measure (KL-divergence) between p and p: Proposition 2 Given a reaction network and a fixed collection of K interaction functions tν k u K k"1 , the linearity of the CME in reaction operators 9 p " ř r W prq p extends to the functionals F k " ř r F prq k .In this very limited technical sense, there is a vector space homomorphism (which is a particular kind of structure-respecting map R) from the spatial operator algebra semantics to the Dynamic Boltzmann reduced model semantics.

Speculation: Expression Dynamics for Model Search
One more significant connection between modeling languages and graph grammars is the possible use of graph grammars for structural model learning.Reaction and rewrite rule model classes share with artificial neural networks, Markov Random Fields, and many other machine-learnable model classes the property that model architecture is determined by a weighted graph (e.g. the bipartite graph of reactions and reactants) whose structure can be sculpted by setting nonzero weights (e.g.reaction rates) to zero or vice versa, with some expectation of continuity in the neighborhood of zero weight in an optimization formulation of training.This operation would correspond to the deletion or insertion of weak rules.However, other "structural" moves in graph space are bolder and may have higher potential optimization gain.For example if a rule is duplicated while maintaining the sum of the rates, then the model behavior and objective function are unchanged but at least one of the daughter rules may be freed up to drift and adopt new functions.Analogous "duplicate and drift" mechanisms are believed to enable neofunctionalization resulting in selective advantage in genetic evolution (Moore and Purugganan 2003;Thompson et al. 2016).Such an operation is easy to encode with a graph grammar meta-rule.
Duplication with arrow-reversal of one daughter rule, on the other hand, would require for continuity that the unreversed rule retain nearly its full strength and the reversed one enter the ruleset at very small strength.But as discussed in Section 2.1 such a reversed rule could subsequantly evolve in strength to establish detailed balance.Other meta-rules could implement the mutation, crossover and three-parent rule-generation operations of genetic algorithms and differential evolution.Semantic word embeddings as used in current machine learning methods for natural language processing may provide a way to learn the inter-substitutability of individual symbols in rewrite rules based on similarity of vectors evolved under previous successful substitutions.This kind of mechanism also promotes the evolution of evolvability.So in addition to the generic metarules we have listed, modeling-language specific or domain-specific rules could be optimized.All of these mechanisms for evolutionary structural optimization of models are local in the model AST and so can be encoded and studied using graph grammars.

A Meta-Hierarchy for Declarative Modeling
We have developed the ideas of declarative modeling including various formal modeling languages, together with structure-respecting maps (some of them category morphisms or functors) between formal languages and related mathematical objects for semantics, implementation, and model reduction.We have also defined a "meta-hierarchy" as a DAG whose edges are labelled by a DAG of types of relationships, the relationship types forming a specialization hierarchy.
We propose here a particular extensible meta-hierarchy (as defined in Section 3.4) of formal languages aimed at declarative modeling (as defined in Section 2) of complex scientific domains such as developmental biology.We specify the top levels of the meta-hierarachy but leave lower levels free to evolve with usage.Every node in the meta-hierarchy is either a symbolic placeholder (typical for top level nodes) or represents a formal language or sub-language to be used in a way that satisfies the definition of declarative modeling.The top level nodes in the meta-hierarchy are symbolic placeholders for a classification of deeper-level nodes along two independent labellings: (a) "ontolexical", in which the labels are "process", "object", and "expression" as used in Section 2, and (b) knowledge domain, in which the labels are "science", "mathematics", and "computing".In addition there is a "declarative dynamical model informatics" or simply "models" knowledge domain node, aimed at mediating between the other three.Of course many deep sub-classifications are possible especially for the knowledge domains, beginning with specializa-tion links to physics, chemistry, and biology, together with reduction links among their further specializations.The resulting meta-hierarchy is named "Tchicoma" after a volcanic formation in the southern Rocky Mountains.
As discussed in Section 3.4, the edges of the meta-hierarchy are labelled by relationship types (e.g.composition vs. specialization, the latter specialized e.g.into mutually exclusive and/or exhaustive vs. overlapping specializations; also proven vs. machine-verifiably proven vs. unproven relationships; and so on) that themselves stand in a specialization hierarchy.Such relationship links could be used in the construction of maps Ψ, I, and R (for semantics, implementation and model reduction respectively) in declarative modeling, e.g. to retrieve similar known maps from previous work.Specialization links can be used to insert conditions that enable theorems and alto work, and to evolve those conditions as knowledge accumulates.Automatic curation of these link types would also provide an opportunity to keep usage-based statistics on the edges of each type at each node in the meta-hierarchy from prior successful model-building activities, for human visualization, for automatic heuristic search, and for targeting the invention of new nodes in the meta-hierarchy to regions with high previous application.New nodes could be specializations or generalizations of single nodes or of several nodes jointly, resulting in a cumulative resculpting of the metahierarchy and its relationship type DAG under pressure of maximal utility.
The languages at the nodes of this meta-hierarchy can be specified by formal grammars, or they can be generated by unary and binary operators defined for all objects in some mathematical category C, including but not limited to category-level binary operators such as universal sum, product, or function arrow that can be defined purely in category-theoretic terms.Such a generated language has the advantage that there is a built-in mathematical semantics taking values in the category C objects denoted by the operator expression trees.In full generality the detection of semantic equivalence between expressions in such languages is intractable, though it can often be specialized to a solvable problem.
The near-top nodes of the Tchicoma meta-hierarchy comprise a Cartesian product of three ontolexical nodes and four knowledge domain nodes.With the resulting twelve-element cross product classification one can identify the following potential kinds of structure-respecting interlanguage mappings, differentiated by top-level source and target positions in the Tchicoma metahierarchy: 1. Mappings discussed in this paper: (a) Ψ: Semantics: Mapping from model object and/or process expressions in a modeling language L, to mathematical objects.This mapping is an essential part of declarative modeling as defined in Section 2, and was detailed in Sections 2 and 3.
(b) I: Implementation: Mapping from mathematical objects that are the semantics of model objects and/or model processes, to computational objects and processes respectively; or mapping from model expressions to computational expressions that denote these objects and processes; as introduced in Section (d) X : Transformation: From expressions to expressions, preserving or approximating semantics (or projections thereof); an essential part of declarative modeling as defined in Section 2. For example such transformations could include model reduction R, computational implementation I, and/or model translation T (below).
2. Mappings not detailed in this paper: (a) C: Model creation: From domain (object, process) expressions in a domain language to (object, process) expressions respectively in a modeling language.Domain languages may incorporate ontology formalizations such as the Web Ontology Language ("OWL") or more general Description Logics; examples include the BioPAX biological pathway description language and ontoloty (Demir et al. 2010), and the Gene Ontology (GO) three-part hierarchy of biological objects and processes that has an object hierarchy and two process hierarchies at two scales, with "Is-A" specialization links within each hierarchy.
(b) T : Translation: From expressions to expressions, preserving or approximating semantics (or projections thereof to observables).E.g. translation could formalize the process of translating between the inputs to different modeling software systems.By defining such translations first at the level of classical (and/or intuitionistic) mathematics rather than software, it is much easier to establish equivalences where they exist.Then one can explore whether and how deeply implementing or updating such translations is worth the effort.
(c) A: Model analysis (phase diagrams, bifurcation diagrams, and the like): From models to mathematical analysis products such as reduced-parameter spaces, discontinuity or phase-change strata therein, and long-time asymptotics of observables.Model analysis enables further reduction of submodels, for example by algebraic solution of fast subnetworks.
For most of these kinds of maps other than Ψ, several maps of the same kind could have the same source and target nodes; in that case subscripting the mapping symbols by mapping subkinds could become necessary -although one would prefer to elaborate the meta-hierarchy nodes instead, if possible.
The curation of this meta-hierarchy may provide a fruitful application area for automatic theorem verification software, since (a) many of these map types require the assertion of mathematical equivalences and approximations that could be proven, possibly with computer help, in an automatically verifiable form; (b) the applicability of a particular map to a particular modelling problem could be subject to logical inference on applicability conditions; and (c) the retrieval or synthesis of valid map compositions that achieve some formalizable goal could be achieved by forward-and/or reverse-chaining style theorem-proving algorithm.Some of these applications of computer logic could be systematized using type theory and the Curry-Howard-Lambek correspondence which in some circumstances at least maps logical inference to type theory and to category theory [Lambek and Scott 1986].For "computing" nodes in the meta-hierarchy, and implementation maps that target them, predictive declarative models of computational resource use as a function of problem statement could also be collected and trained on past data.

Conclusion
We aim to formalize aspects of mathematical biological modeling so that they become amenable both to computer assistance and to cooperative human development of complex biological models.A conceptual framework, based on declarative modeling, in support of that goal is presented.The elements of the conceptual framework include an informal definition of declarative biological modeling with formalized examples; physics-derived semantics for reaction/rewrite rule process expressions including parameters, variables, graph structure, and differential equations in terms of operator algebras built from creation and annihilation operators; a constructive labelledgraph approach to the semantics of extended objects including graphs labelled by approximation level number and/or stratum dimension and identity, which can approximate and computably implement classical nonconstructive geometries up to stratified spaces; labelled, graded, and stratified graph rewrite rule operator semantics for these constructive extended objects; a potentially generic model reduction framework based on dynamically evolving Boltzmann distributions; and a proposed overarching meta-hierarchy of modeling sub-languages within which the structure-respecting maps required for declarative biological modeling could be defined, curated, and evolved through experience for maximal utility.
This framework could support improved multiscale modeling through more expressive modeling languages and widely applicable model reduction.The framework may provide opportunities for mathematical biologists to contribute to systematic mappings for complex biological model creation, definition, reduction, implementation and analysis in ways that could be greatly amplified by automation and artificially intelligent computational improvement.

(
α, x, tq |n, α, x, ty , with mixture probabilities ppn, α, x, tq : p " xn, α, x, t|rν k |ks, ty " 1 Z rrν k |kss exp pp|| pq, where D KL pp|| pq " α, x, tq ln ppn, α, x, tq ppn, α, x, tq .(29) Assuming this objective has been minimized, (Ernst et al. 2018) use the chain rule of calculus to show: 3 and illustrated in Diagram 5. Computer Science seeks common target objects for many efficient implementation maps; current examples include Trilinos and PetSc for large-scale scientific computing.(c) R: Model reduction from model expressions to model expressions with approximately the same semantics under projection to a set of observables; as exemplified in Section 4 and Diagram 6.