1 Introduction

Since its inception about sixty years ago (Eilenberg and Mac Lane 1945), category theory has been increasingly recognized as having a foundational role in mathematics. Category theory provides the conceptual lens to pick out, focus on, and characterize the mathematical structures that have importance and universality (Awodey 1996). The principal lens is provided by universal mapping properties (UMPs) and particularly the arrangement of two UMPs given by a pair of adjoint functors, an adjunction.

The notion of adjoint functor applies everything that we’ve learned up to now to unify and subsume all the different universal mapping properties that we have encountered, from free groups to limits to exponentials. But more importantly, it also captures an important mathematical phenomenon that is invisible without the lens of category theory. Indeed, I will make the admittedly provocative claim that adjointness is a concept of fundamental logical and mathematical importance that is not captured elsewhere in mathematics. (Awodey 2006)

The isolation and explication of the notion of adjointness is perhaps the most profound contribution that category theory has made to the history of general mathematical ideas. (Goldblatt 2006, p. 438)

Nowadays, every user of category theory agrees that [adjunction] is the concept which justifies the fundamental position of the subject in mathematics. (Taylor 1999 p. 367)

Thus there is a segment of opinion that adjoint functors are of fundamental importance in mathematics, so one might well expect to find significant applications in the empirical sciences.

This paper has two purposes. The first purpose is to give an informal presentation of a new “heteromorphic” theory about adjoint functors (Ellerman 2006) by focusing on a pair of simple examples. The second purpose is to show that this new treatment of adjoints uncovers a set of ideas and analogies, not evident in the conventional treatment of adjoints, that may lead to important applications to some very old problems. Given a domain of phenomena obeying certain laws, how can some qualitatively new and relatively autonomous behavior emerge? The claim is that adjoint functors provide a conceptual model, albeit in an abstract timeless setting, of how a relatively autonomous activity can emerge out of a lower lawful domain. The examples will be drawn primarily from the biological and human sciences.

2 Determination through universals

2.1 A mathematical example: the Cartesian product

The most basic conceptual structure of category theory is the “morphism” that abstracts from the idea of a function. A function describes “how one thing determines another.” The “things” are the objects in categories and the determination is abstractly represented by the mappings or “morphisms” between objects. If the objects were of the same type such as groups or rings, then the morphisms would be group or ring homomorphisms. Footnote 1 The objects of the same type and the appropriate homomorphisms between them constitute a category.

The simplest case is the category of sets where the objects are unstructured sets and the morphisms are just ordinary set functions or maps from one set to another. Given a set W and a set X, a mapping f: W → X assigns to each element w ∈ W an element f(w) ∈ X. Then W is called the domain of the map and X is the codomain. In terms of one intuitive picture of determination, the elements of W are the determiners or “causes” and the elements of X are the potential determinees or “effects.” In a specific determination f: W → X, a determiner or cause w ∈ W “determines” the determinee or effect f(w) ∈ X.

Our topic is a special type of determination which is through a universal. Perhaps the simplest non-trivial example of determination through a universal is the Cartesian product. The setting is two codomain sets X and Y and a pair of maps from any common domain set W to X and Y, e.g., f: W → X and g: W → Y. For each element w ∈ W, the pair of maps would just pick out an element f(w) = x ∈ X and an element g(w) = y ∈ Y, which are the “determinees” or effects of the determination. The pair of maps (f, g) is often called a cone of maps (Fig. 1). Footnote 2

Fig. 1 Pair of maps (solid arrows) with same domain is a cone (dashed arrow)

That is a direct determination from an object in one category, a set, to a pair of sets, a set-pair, which is an object in another category. What is a “determination through a universal”? There are two sides to a determination, the sending side and the receiving side. Let’s first illustrate sending through a universal. For an intuition, think of the sender as an organism behaving to affect the environment (the receiving side). The cone \({\left(f,g\right) :W\rightarrow\left(X,Y\right) }\) is a certain “behavior” by which the “organism” affects the “environment.” To change this into a determination through a universal, the “organism” needs to internally construct a representation of the possible behaviors or external determinations.

As noted above, a specific determiner or cause w ∈ W would determine as its effect a pair of elements x ∈ X and y ∈ Y which we can represent as the ordered pair (x, y). In this case, all the possible determinations such as cones to (X,Y) could be internally represented within the category of sets simply by the set of all possible effects (x, y), namely the Cartesian product X × Y = {(x, y) | x ∈ X, y ∈ Y}. The canonical cone between the representation X × Y of all the possible effects and the individual determinees or effects in X and in Y is given by the two projection maps p_X: X × Y → X and p_Y: X × Y → Y from that common domain X × Y to X and Y. The projection map to X is defined by p_X((x, y)) = x and the projection map to Y is defined similarly, p_Y((x, y)) = y. Thus (p_X, p_Y) is a cone of maps from X × Y to (X,Y). For example, if X and Y are each the real numbers \({\mathbb{R}}\), then the Cartesian product \({\mathbb{R} \times \mathbb{R}}\) is the set of Cartesian coordinates for the plane, and the two projections give the x coordinate and the y coordinate of a point (x, y).

Any determination given by a cone \({\left(f,g\right) :W\rightarrow \left(X,Y\right) }\) is between two different types of things. In the more technical terms of category theory, it is a heteromorphism between objects in different categories in contrast to the homomorphisms that are internal to a category and are between the objects of the same category (in the diagrams, the solid arrows are homomorphisms internal to a category and the dashed arrows are the external heteromorphisms between objects in different categories). In this example, W is an object in the category of sets while (X,Y) is an object in a category of pairs of sets (a type of functor category) whose homomorphisms are pairs of set maps.

Some picturesque intuition was suggested where W was part of an organism behaving so as to affect something different, namely the relevant features of the organism’s environment represented by the pair of sets (X,Y). Then we suggested that there is a quite different way that an organism might act on or affect its environment. The first step is to build an internal representation of all the possible effects that a behavior might have. This internal universal model is successful if each external behavior \({\left(f,g\right) :W\rightarrow\left(X,Y\right) }\) can be represented by an internal map \({\left\langle f,g\right\rangle :W\rightarrow X\times Y}\) between two like things (i.e., between two sets W and X × Y), such that if the internal action 〈f, g〉:W → X × Y is followed by the canonical projection maps, then the result has the same effects as the original behavior \({\left(f,g\right) :W\rightarrow\left(X,Y\right) }\).

In mathematical terms, this is the universal mapping property of the Cartesian product and its projections. Given any determination (cone) \({\left(f,g\right) :W\rightarrow\left( X,Y\right) }\) from a common domain set W to set-pair (X,Y), there is a unique factor map 〈f, g〉: W → X × Y defined by 〈f, g〉(w) = (f(w), g(w)) such that the composite map \({W\overset{\langle f,g\rangle }{\longrightarrow}X\times Y\overset{p_{X}}{\longrightarrow}X=W\overset {f}{\rightarrow}X}\) of the factor map followed by the projection to X is the map f, and similarly for g, i.e., \({W\overset{\langle f,g\rangle }{\longrightarrow}X\times Y\overset{p_{Y}}{\longrightarrow}Y=W\overset {g}{\rightarrow}Y}\). The set W and the cone \({\left(f,g\right) :W\rightarrow\left( X,Y\right) }\) can change, but the target set-pair (X,Y), the Cartesian product X × Y, and its projections (p_X, p_Y) are fixed for this example. The product X × Y is the sending universal object and the projections are the sending universal morphism; they should be thought of together as the “sending universal.” This is illustrated in the upper triangle of the following diagram (the lower triangle is considered later) (Fig. 2).

Fig. 2 Illustrative picture of behavior (f, g) internalized as action 〈f, g〉 factored through sending universal

Thus the projections are a universal for the property of being a cone from any common domain set to X and Y. Footnote 3 This is our simplest example of determination through a universal. The given determination (f, g) is external in the sense of being between different sorts of entities (objects in different categories). But this specific external determination can be factored through the universal by the specific internal determination 〈f, g〉: W → X × Y (which is between entities of the same sort) followed by the canonical external connection (the sending universal cone). Footnote 4
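The factorization can be checked concretely on small finite sets. The following is only a minimal sketch: the particular sets W, X, Y, the maps f and g, and the helper names `pairing` and `factorizations` are illustrative assumptions, not part of the theory. It builds the factor map 〈f, g〉, confirms that following it with each projection recovers the original cone, and searches by brute force for every map W → X × Y with that property.

```python
from itertools import product

# Hypothetical finite sets standing in for W, X and Y.
W = {"w1", "w2", "w3"}
X = {0, 1}
Y = {"a", "b"}

# An arbitrary cone (f, g): W -> (X, Y), stored as dictionaries.
f = {"w1": 0, "w2": 1, "w3": 1}
g = {"w1": "a", "w2": "a", "w3": "b"}

# The sending universal: the product X × Y together with its projections.
X_times_Y = set(product(X, Y))

def p_X(pair):
    """Projection X × Y -> X."""
    return pair[0]

def p_Y(pair):
    """Projection X × Y -> Y."""
    return pair[1]

def pairing(f, g):
    """Factor map <f, g>: W -> X × Y, defined by w |-> (f(w), g(w))."""
    return {w: (f[w], g[w]) for w in f}

def factorizations(W, X_times_Y, f, g):
    """Brute-force list of all maps h: W -> X × Y with p_X∘h = f and p_Y∘h = g."""
    ws = list(W)
    points = list(X_times_Y)
    found = []
    for images in product(points, repeat=len(ws)):
        h = dict(zip(ws, images))
        if all(p_X(h[w]) == f[w] and p_Y(h[w]) == g[w] for w in ws):
            found.append(h)
    return found

factor = pairing(f, g)

# Composing the factor map with each projection recovers the original cone.
recovered_f = {w: p_X(factor[w]) for w in W}
recovered_g = {w: p_Y(factor[w]) for w in W}

# Exactly one map W -> X × Y factors the cone, namely <f, g>.
unique_factor_maps = factorizations(W, X_times_Y, f, g)
```

In this toy case `unique_factor_maps` contains the single map 〈f, g〉, which is the uniqueness half of the universal mapping property; a finite check of this kind illustrates the property but of course does not prove it.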

Mathematically, the external determination and the internal factorization through the universal give the same results. But in an empirical application, the whole question might be whether some determinative process is of the external type or of the internal through-a-universal type.

External determination corresponds to the ordinary intuitive notion of a determinative process. Internal determination through a universal is a fundamentally different type of process and that difference is our focus in the applications below. Determination through universals suggests an approach to a basic conundrum in philosophy—how can a qualitatively new type of more autonomous activity emerge? The suggestion is that the question can be approached by considering the shift from a direct external determination to an internal determination through a universal. As will be seen, the universality plus the internality add up to a type of autonomy. But first we need to extend the mathematical examples.

2.2 Extending the example: The receiving side

A morphism has both a source or sending end and a target or receiving end. The Cartesian product and its projections were a sending universal. The receiving set-pair (X,Y) was fixed and any determination (“cone”) to (X,Y) could be factored through that sending universal. There is also a symmetrical “dual” concept of the receiving universal, and the pair of sending and receiving universals is what is given by a pair of adjoint functors. Often in an adjunction, one of the universals is the one of interest and the other seems to be more of a rather trivial bit of conceptual bookkeeping so that the two will make an adjunction. That is the case with the product adjunction. Hence we will first just mathematically fill out the Cartesian product example without focusing on any interpretation. Then we will give another adjunction dual to the Cartesian product which will be our non-trivial example of a receiving universal.

For the dual concept of the receiving universal, we reverse what is fixed and what is variable. Now we take the domain set W as fixed and we want to consider cones of maps from W to any set-pair which we might as well represent as (X,Y). We might take a single element w ∈ W as the determiner and then the two functions would give us two determinees f(w) and g(w) in the two codomain sets. How could this sort of determination from W be represented in a universal manner? We build a model on the receiving side (i.e., among the receiving objects which are set-pairs) of all the possible determiners or causes w ∈ W. The set-pair that would model all the elements of W is just a pair of copies of W denoted ΔW = (W, W). The universal cone relating each w ∈ W to the two copies of itself as “effects” consists of the cone (1_W, 1_W) of two identity maps \({W\overset{1_{W}}{\rightarrow}W}\). Is the cone universal? Given any other cone of maps (f, g) from W to any set-pair (X,Y), does there exist a unique pair of factor maps that will factor the cone (f, g) through the universal? Yes, the pair (f, g): (W, W) → (X,Y) will trivially do the job. Footnote 5
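The trivial factorization through the receiving universal can also be written out. This is a sketch under stated assumptions: the set W, the maps f and g, and the helper names `universal_cone` and `map_pair` are made up for illustration, and a cone is modeled simply as the function sending w to the pair of its two effects.

```python
# Hypothetical fixed domain set W and an arbitrary cone (f, g) out of it.
W = ["w1", "w2"]
f = {"w1": 0, "w2": 1}        # f: W -> X
g = {"w1": "b", "w2": "a"}    # g: W -> Y

def universal_cone(w):
    """The cone (1_W, 1_W): W -> ΔW = (W, W), sending w to two copies of itself."""
    return (w, w)

def map_pair(f, g):
    """A homomorphism of set-pairs (W, W) -> (X, Y), acting componentwise."""
    def apply(pair):
        first, second = pair
        return (f[first], g[second])
    return apply

# Direct external determination: w |-> (f(w), g(w)).
direct = {w: (f[w], g[w]) for w in W}

# The same determination factored through the receiving universal ΔW.
factored = {w: map_pair(f, g)(universal_cone(w)) for w in W}
```

Both dictionaries agree, which is all that “trivially do the job” amounts to in this case: the universal cone merely duplicates w, and the map-pair (f, g) then acts on the two copies separately.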

2.3 Combining the examples: an adjunction

We are trying to give an elementary introduction to the heteromorphic theory of adjoint functors. Homomorphisms are morphisms between objects within the same category, while heteromorphisms are morphisms between objects of different categories. So far the only heteromorphisms we have used are the cones which are morphisms from a set to a pair of sets (or, in a more advanced treatment, from a set to a functor). The following diagram is the central diagram of the heteromorphic theory; it combines the two factorizations of a cone (f, g) through the receiving and sending universals into one commutative square.

The sending and receiving universals and the factor maps can be arranged in an adjunctive square diagram where the cone \({\left(f,g\right) :W\rightarrow\left(X,Y\right) }\) could be taken as the main diagonal. The objects and solid arrow on top are in the sending category (the category of sets in this case) while the objects and solid arrow on the bottom are in the receiving category (the category of set-pairs in this case). The dashed arrows are heteromorphisms from objects in one category to objects in the other category (Fig. 3).

Fig. 3 Adjunctive square diagram

The adjunctive square commutes in the sense that if we compose the maps going clockwise from W to (X,Y) and then going counter-clockwise from W to (X,Y), we will get in both cases the same two maps f: W → X and g: W → Y which thus can be considered as the main (NW to SE) diagonal of the square. The product adjunction is essentially this situation: given a cone of maps \({\left(f,g\right) :W\rightarrow\left(X,Y\right) }\), there is a unique factor map \({\langle f, g\rangle}\) factoring the given cone through the sending universal of the projections (p_X, p_Y), and there is a unique map-pair, also denoted (f, g), that trivially factors the given cone through the receiving universal (1_W, 1_W).

Thus an adjunction arises from a situation: (1) where every heteromorphism [e.g., \({\left(f,g\right) :W\rightarrow\left( X,Y\right) }\)] to a given object in the receiving category [e.g., (X,Y)] can be universally represented [e.g., by X × Y] within the sending category, and (2) where every heteromorphism from a given object in the sending category [e.g., W] can be universally represented [e.g., by ΔW] within the receiving category. An adjunction is given by a pair of adjoint functors which take the given object, e.g., (X,Y) or W, to its corresponding universal object in the other category, e.g., the product functor × which takes (X,Y) to X × Y and the diagonal functor Δ which takes W to ΔW = (W, W). The product functor that takes (X,Y) to X × Y, the two objects on the right-hand side of the adjunctive square, is called the right adjoint, and the diagonal functor that takes W to (W, W), the two objects on the left-hand side of the adjunctive square, is called the left adjoint. In the adjunctive square diagram, there is a one-one correspondence or isomorphism between the maps of the form ΔW → (X,Y) in the category of set-pairs (bottom horizontal morphism) and the maps of the form W → X × Y in the category of sets (top horizontal morphism). This yields the usual definition of an adjunction (e.g., Mac Lane 1971, p. 78) as a natural isomorphism between the two sets of homomorphisms or “hom-sets”:

$$ Hom(\Delta W,(X,Y))\cong Hom(W,X\times Y). $$

Note that the left adjoint diagonal functor Δ occurs on the left in the hom-set, and similarly the right adjoint product functor ×  is on the right in the hom-set. This standard definition of an adjunction makes no mention of the heteromorphisms. The maps in each of the hom-sets are between objects in the same category, e.g., between pairs of sets or between single sets, as indicated in the name “homomorphism.” Since the objects and maps at the top of an adjunctive square diagram are in one category while the objects and maps on the bottom are in another category, all the morphisms from top to bottom are between the objects of different categories. They are the heteromorphisms—which could also be called chimera morphisms (since their tail is in one category and their head in another). The originally given cone of maps from a single set W to the set-pair (X,Y) is an example of a chimera morphism or heteromorphism. The conventional natural isomorphism of the adjunction between the two types of homomorphisms can be extended since each type of homomorphism is uniquely paired with the heteromorphism that is the main diagonal in the adjunctive square diagram. Taking Het(W, (X,Y)) as the set of heteromorphisms \({W\rightarrow(X,Y)}\) , the above natural isomorphism can be extended to the form specific to the heteromorphic treatment: Footnote 6

$$ Hom(\Delta W,(X,Y))\cong Het(W,(X,Y))\cong Hom(W,X\times Y) \qquad \hbox{(Adjunctive Isomorphisms)} $$
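For finite sets the outer two hom-sets can be enumerated and the correspondence between them checked directly. The sketch below assumes small illustrative sets and hypothetical helper names (`all_maps`, `to_product`, `from_product`); it verifies only that the two assignments are mutually inverse bijections for one fixed choice of W, X, and Y, and says nothing about naturality.

```python
from itertools import product

def all_maps(domain, codomain):
    """Every function domain -> codomain, each represented as a dictionary."""
    domain = list(domain)
    return [dict(zip(domain, values))
            for values in product(list(codomain), repeat=len(domain))]

# Hypothetical finite sets.
W = ["w1", "w2"]
X = [0, 1, 2]
Y = ["a", "b"]
X_times_Y = list(product(X, Y))

# Hom(ΔW, (X,Y)): pairs of maps (f, g) with f: W -> X and g: W -> Y.
hom_pairs = [(f, g) for f in all_maps(W, X) for g in all_maps(W, Y)]

# Hom(W, X × Y): single maps h: W -> X × Y.
hom_product = all_maps(W, X_times_Y)

def to_product(f, g):
    """(f, g) |-> <f, g>, the factor map through the product."""
    return {w: (f[w], g[w]) for w in f}

def from_product(h):
    """h |-> (p_X∘h, p_Y∘h), the map-pair obtained by projecting."""
    return ({w: h[w][0] for w in h}, {w: h[w][1] for w in h})

# The two assignments undo each other in both directions.
round_trip_ok = (
    all(from_product(to_product(f, g)) == (f, g) for f, g in hom_pairs)
    and all(to_product(*from_product(h)) == h for h in hom_product)
)
```

With |W| = 2, |X| = 3, and |Y| = 2, both lists have 36 elements (3² · 2² on one side, 6² on the other). Each such pair of maps or single map also corresponds to exactly one cone W → (X,Y), which is the middle term Het(W, (X,Y)).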

2.4 The dual example of the coproduct

In the product example, the universal of interest was the sending one while the receiving one was rather trivial. In the dual construction of the “coproduct” (disjoint union of sets), those roles are reversed. For this example, the category of set-pairs plays the role of the sending category (on top in the adjunctive square diagram), while the category of sets is the receiving category (on the bottom in the diagram). For the more picturesque intuition, we could take the sender as the environment and the receiver as the organism so the example could be interpreted as perception or recognition (Edelman 2004) through a universal. A pair of sets (X,Y) is the fixed sender and we might consider a determination \({\left(f,g\right) :\left(X,Y\right) \rightarrow W}\) to any set W, i.e., a pair of maps f: X → W and g: Y → W which is called a cocone. In the intuitive picture, that would represent a signal from the environment to the organism.

It is a different matter if the organism can construct a universal internal model of all the relevant “signals” or “messages” from that environment so that the “signal” could be factored through the universal via an internal “perception” (or “recognition”). All the possible “messages,” “causes,” “stimuli,” or determiners in X and Y are just all their elements so the set containing all those elements is the disjoint union or coproduct X + Y.

Even if there were elements common to the two sets (i.e., X and Y had a non-empty intersection), we would still need to consider the disjoint union since maps f and g might take the same common element to different elements of W so there would need to be two distinct copies of that element in the coproduct X + Y for all messages to factor through that universal. The two injection maps i_X: X → X + Y and i_Y: Y → X + Y would map each determiner to its internal representative in the disjoint union X + Y. Then given any cocone \({\left(f,g\right) :\left(X,Y\right) \rightarrow W}\), there is a unique factor map (“internal perception”) {f, g}: X + Y → W such that \({\left(X,Y\right) \overset{(i_{X},i_{Y})}{\rightarrow} X+Y\overset{\left\{ f,g\right\} }{\longrightarrow}W=\left( X,Y\right) \overset{(f,g)}{\rightarrow}W}\) (i.e., so that the internal perception of the message through the receiving universal is the same as the original external signal). In more philosophical terms, the internalized determination through the receiving universal (with the only external connection being the canonical receiving universal connection) gives the receiving “organism” a certain measure of independence or autonomy from the direct stimulus control represented by the specific external determinations. By having the internalized perception based on the fixed universal connection to the environment, the receiving organism has, in a sense, built itself a separate internal “world” that gives it a measure of separateness or autonomy from its environment.
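The need for the disjoint union, rather than the ordinary union, shows up as soon as X and Y overlap. In the following sketch the sets, the maps f and g, the tagging scheme, and the helper name `copairing` are all assumptions chosen for illustration: the element 2 lies in both X and Y, f and g disagree on it, and tagging each element with its set of origin keeps the two copies apart.

```python
# Hypothetical overlapping sets: the element 2 belongs to both X and Y.
X = {1, 2}
Y = {2, 3}

# A cocone (f, g): (X, Y) -> W whose maps disagree on the common element 2.
f = {1: "p", 2: "q"}    # f: X -> W
g = {2: "r", 3: "p"}    # g: Y -> W

def i_X(x):
    """Injection X -> X + Y, tagging the element with its origin."""
    return ("X", x)

def i_Y(y):
    """Injection Y -> X + Y, tagging the element with its origin."""
    return ("Y", y)

# The coproduct as a tagged (disjoint) union; it keeps two copies of 2.
X_plus_Y = {i_X(x) for x in X} | {i_Y(y) for y in Y}

def copairing(f, g):
    """Factor map {f, g}: X + Y -> W, dispatching on the tag."""
    def h(tagged):
        tag, value = tagged
        return f[value] if tag == "X" else g[value]
    return h

h = copairing(f, g)

# The internal perception agrees with the external signal on every determiner.
agrees_on_X = all(h(i_X(x)) == f[x] for x in X)
agrees_on_Y = all(h(i_Y(y)) == g[y] for y in Y)
```

Here X + Y has four elements while the ordinary union X ∪ Y has only three, and no single function on X ∪ Y could send 2 to both "q" and "r", so the cocone could not be factored through the ordinary union.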

The following adjunctive square has the original signal, the cocone \({\left(f,g\right) :\left(X,Y\right) \rightarrow W}\), as its main diagonal with the receiving universal of the cocone of injection maps on the left and the internalized determination on the bottom (Fig. 4).

Fig. 4 Signal (f, g) internalized as perception or recognition {f, g} factored through receiving universal

The main focus of this example is the lower triangle which shows the factorization through the receiving universal. But, as before, there is also the rather trivial other half of the adjunctive square diagram (where the cocone (1_W, 1_W) has the role of a sending universal morphism). In this case, the adjunction isomorphisms are:

$$ Hom\left(X+Y,W\right) \cong Het\left(\left(X,Y\right),W\right)\cong Hom\left(\left(X,Y\right),\Delta W\right) $$

where the coproduct functor assigning X + Y to (X,Y) gives the left adjoint and the same diagonal functor assigning ΔW to W gives the right adjoint.

2.5 The main features of determination through universals

What are the main features to abstract from the examples of determination through a universal? Given a pair of adjoint functors, i.e., an adjunction, there is always a sending universal and a receiving universal although one of them may be rather trivial. Both universals are present in the adjunctive square diagram (the NE and SW corners and the vertical arrows on the left and right) but they result from different assumptions about what is fixed and what is variable. In the general conceptual scheme, the source or senders on the NW corner could be thought of as “determiners” or “causes.” The target or receivers on the SE corner of the diagram are the “determinees” or “effects” (Fig. 5).

Fig. 5 Adjunctive square as general scheme for determination through universals

If the determiners are taken as the fixed or given part, then the corresponding universal object in the SW corner of the diagram would be a universal model for determinations from those possible determiners or causes. Any specific determination (diagonal map in the diagram) could be uniquely factored through that universal object on the receiving side (bottom of diagram) via the receiving universal map and the internalized version of the specific determination.

If the determinees were taken as the fixed or given part, then the corresponding universal in the NE corner of the diagram would be a universal model for determinations to those possible determinees or effects. Any specific determination (diagonal map) could be uniquely factored through that universal object on the sending side (top of diagram) via the internalized version of the specific determination and the sending universal map.

Sticking to the disjoint union as the illustrative example, a particular determination from the given set-pair (X,Y) to any other set such as W would be given by a cocone or pair of maps \({\left(f,g\right) :\left(X,Y\right) \rightarrow W}\). The universal construction of the coproduct X + Y constructs the set of all determiners (or causes) so that the given instance of an external determination factors through the universal by the internal map {f, g}: X + Y → W. That internal map “recognizes” the causes and sends the same message to W as the original transmission from (X, Y) to W (Fig. 6).

Fig. 6 Signal (f, g) internalized through receiving universal as perception {f, g}

The main features might be singled out for this receiving case (the dual sending case is considered next).

Universality

While an external determination involves a given set of possible determiners, the determination through a universal constructs a universal object internal to the receiving side together with a universal receiving map so that all possible determinations from those determiners can be factored through that receiving universal.

Internalization

The factorization through the universal internalizes the particular determination (e.g., (f, g) is replaced by {f, g}) so that the only external–internal connection is the indirect fixed canonical one connecting the external determiners to their internal representations (e.g., the canonical injections (i_X, i_Y) as the receiving universal map).

Autonomy = Universality + Internalization

The net effect is that the receiver (“organism”) is “disconnected” from direct external stimulus control by the sender (the perception takes place, as it were, in the internalized “environment” or “world”) and becomes in that sense autonomous.

The product example of a non-trivial sending universal took the given pair (X, Y) as the effects or determinees. A specific determination would be given by a pair of maps (cone) \({\left(f,g\right) :W\rightarrow \left(X,Y\right) }\) from any single set W to X and Y. The universal construction of the product X × Y constructs the set of all determinees (or effects) so that the given instance of an external determination factors through the universal by the internal map 〈f,g〉: W → X × Y. That internal map “chooses” the effects and transmits the same results to (X,Y) as the original transmission from W to (X, Y) (Fig. 7).

Fig. 7 Behavior (f, g) internalized as action 〈f, g〉 through the sending universal

The main features might be singled out for this sending case.

Universality

While an external determination involves a given set of possible determinees or effects, the determination through a universal constructs a universal object internal to the sending side together with a universal sending map so that all possible determinations to those determinees can be factored through that sending universal.

Internalization

The factorization through the universal internalizes the particular determination (e.g., (f, g) is replaced by 〈f, g〉) so that the only external–internal connection is the indirect fixed canonical one connecting the internal representations to the external effects (e.g., the canonical projections (p_X, p_Y) as the sending universal map).

Autonomy = Universality + Internalization

The net effect is that the sender (“organism”) is “disconnected” from direct “causal” interaction with the effects (the action takes place, as it were, in the internalized “world”) and becomes in that sense autonomous.

This completes the first part of the paper, illustrating a theory of adjoint functors using the simple examples of the product and coproduct adjunctions for sets. These examples, like any pair of adjoint functors, have the conceptual structure of determination through universals. We have tried to describe this conceptual structure using the concepts of universality, internalization, and autonomy each of which has a precise meaning in the mathematical context.

In the second part of the paper, the purpose is to point out a set of analogies or, more ambitiously, applications with similar structures in the empirical sciences. A few caveats are required for the transition from an abstract mathematical structure to an empirical application.

In an adjunction, there “is” both the external determination and the factorizations through a universal. In an empirical context, the question would be whether a determinative process is of the direct external type or is an example of determination through a universal. In an empirical example of determination through a universal, there might be no external counterpart except as a conceptual possibility.

Also an adjunction is an atemporal mathematical model whereas an empirical example would be a temporal process. The point is that we are using the atemporal mathematical model to illustrate the abstract structure of a temporal model, not to describe a time-path of some system.

3 Applications in the life and human sciences

3.1 Selectionist versus instructionist evolution

The contrast between Darwinian selectionist evolutionary theory and Lamarckian instructionist evolutionary theory is our first major example. The environment is the given set of determiners and the question is how it acts on organisms so that they become more adapted. The Lamarckian instructive process would be mathematically modeled as a direct external determination. The environment (somehow) directly instructs the organism about what features have adaptive value and then that adaptation is transmitted to the offspring. In the selectionist account, the species population through mutation and sexual reproduction generates a wide variety of possibilities in a manner autonomous of direct environmental influence. Then from among these generated possibilities, the environment selects which ones to “implement” in the sense of differentially amplifying or reproducing those organisms.

The main conceptual features of determination through universals are present in the selectionist account of biological evolution (where the Lamarckian account is only pictured as a conceptual possibility of how the environment might somehow directly induce adaptations in organisms) (Fig. 8).

Fig. 8 Selection as determination through universals

Universality

The selectionist theory is an example of population thinking because it is the population, not the individual organism, that explores the universe of possibilities by variation through mutation and sexual reproduction.

Internalization

The environment acts on the generated variety by selection and then, internal to the species, the fittest differentially reproduce so the net effect is “as if” the environment had directly instructed organisms with the fittest adaptations.

Autonomy

In Darwinian theory, this is the basic non-Lamarckian point that there is no direct information flow from the environment to the organisms to somehow adapt certain characteristics. The actual process is the indirect one of generating a “universal” variety, and the environment selecting the fitter ones which then differentially reproduce.

3.2 The DNA mechanism as a universal constructor

Although most of our examples focus on determination through a receiving universal, it might be useful to briefly consider the dual case of determination through a sending universal. One instance would arise in the contrast between a special-purpose machine or computer program that directly produces certain results, and a universal constructor or computer language that can be programmed to produce any result (of course, within some universe of options). In computer science, there is the contrast between a special-purpose Turing machine (a simple type of theoretical computer) that performs only a specific calculation and the “factorization” of the inputs + instructions through a universal Turing machine that will produce the same end results (Fig. 9).

Fig. 9 Special-purpose calculator factored through universal computer
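The contrast can be made concrete with a toy example. Everything in the sketch below is an assumption made for illustration: the “universal” machine is only a tiny interpreter over a made-up two-instruction set, not a universal Turing machine, and the names `special_purpose`, `universal`, and `DOUBLE_PLUS_ONE` are hypothetical. It shows only the shape of the factorization, in which the specific calculation is moved out of the machine and into instructions supplied to a general-purpose mechanism.

```python
def special_purpose(n):
    """A special-purpose 'machine': it can only compute 2*n + 1."""
    return 2 * n + 1

def universal(program, n):
    """A toy general-purpose machine that runs any program written in a
    small, made-up instruction set on the input n."""
    acc = n
    for op, arg in program:
        if op == "add":
            acc += arg
        elif op == "mul":
            acc *= arg
        else:
            raise ValueError(f"unknown instruction: {op}")
    return acc

# The special-purpose behavior re-expressed as instructions for the
# general-purpose machine.
DOUBLE_PLUS_ONE = [("mul", 2), ("add", 1)]

# Inputs + instructions through the general machine give the same end results.
same_results = all(
    universal(DOUBLE_PLUS_ONE, n) == special_purpose(n) for n in range(10)
)
```

The general-purpose machine is fixed once and for all, while the program plays the role of the internalized specific determination; a different list of instructions yields a different special-purpose behavior without any change to the machine itself.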

For a biological example, one could replace the “inputs” by some specification of the “blueprint” (required amino acids, proteins, etc.) and one might also imagine a special-purpose mechanism that would produce those outputs from those inputs. But the actual mechanism used in Nature uses a universal constructor present in all the various types of life. The instructions are encoded into genes using the genetic code for that universal DNA mechanism which then implements the instructions to produce or develop the specific molecules (Fig. 10). Footnote 7

Fig. 10 DNA mechanism as universal constructor

Universality

As a sending universal, the DNA mechanism is structured to recognize and implement instructions for a given “universe” of relevant possible outcomes (amino acids, proteins, etc.).

Internalization

The genes plus the DNA mechanism combine to internalize one overall mechanism for the construction of the molecules.

Autonomy

The net result of having the blueprint, specific construction instructions, and universal construction mechanism all internalized in a living organism gives a type of autonomy characteristic of living things.

3.3 Selectionist versus instructionist theories of the immune system

There are a number of examples in the life sciences of determinative processes that were originally assumed to be instructionist but were later found to operate by a selectionist mechanism. One of the most telling cases was the immune system. Originally it was assumed that the antigen would somehow instruct the immune system as to how an antibody could be constructed to neutralize the antigen. During the 1950s, a number of difficulties in the instructionist account fostered the development of a selectionist approach. While many researchers contributed to this approach, one of the earliest was Niels Jerne (Jerne 1955) who has also been most attentive to analogies with other fields.

In the selectionist theory, the immune system takes on the active role of generating a huge well-nigh “universal” variety of antibodies but in low concentrations. This initial generation of candidate antibodies is not being directed or instructed by the past disease history of the organism. An externally introduced antigen has the indirect role of simply selecting which antibody fits it like a key in a lock. Every antibody has the possibility of self-reproducing or cloning itself but it is the ones whose key has fit into a lock that have this potentiality triggered. Then that antibody is differentially amplified in the sense of being cloned into many copies to lock up the other instances of the antigen. Thus the selectionist account of the immune system has the main features of determination through universals (Fig. 11).

Fig. 11 Selectionist account of immune system as determination through a universal
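
The generate, select, and amplify steps of the selectionist account can be made concrete in a toy simulation. This is a sketch under strong simplifying assumptions: antibodies and antigens are modeled as short bit-strings, "fitting like a key in a lock" is modeled as exact equality, and the names `generate_repertoire`, `select`, and `amplify` are illustrative rather than taken from immunology.

```python
from itertools import product
from typing import Dict, List

Shape = str  # a bit-string such as "0110" standing in for a binding site


def generate_repertoire(length: int) -> Dict[Shape, int]:
    """Generate every shape of the given length at a low concentration (1 copy).

    The antigen plays no part here: the variety is produced in advance.
    """
    return {"".join(bits): 1 for bits in product("01", repeat=length)}


def select(repertoire: Dict[Shape, int], antigen: Shape) -> List[Shape]:
    """The antigen only picks out which existing antibodies fit it."""
    return [shape for shape in repertoire if shape == antigen]


def amplify(repertoire: Dict[Shape, int], selected: List[Shape], factor: int) -> Dict[Shape, int]:
    """Clone the selected antibodies; the rest stay at low concentration."""
    return {
        shape: count * factor if shape in selected else count
        for shape, count in repertoire.items()
    }


if __name__ == "__main__":
    repertoire = generate_repertoire(4)   # 16 candidate shapes, one copy each
    hits = select(repertoire, "0110")     # the antigen selects; it does not instruct
    response = amplify(repertoire, hits, factor=1000)
    print(response["0110"], response["1111"])  # 1000 1
```

Note that `generate_repertoire` never receives the antigen as an argument, which is the point of the contrast with an instructionist account: the antigen enters only at the selection step.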

A similar example is that of bacteria “learning” to tolerate antibiotics or to consume a new substance. The original account was instructionist, but these processes are now recognized as being selectionist. A wide variety of bacterial mutations are constantly being generated and those that can tolerate antibiotics or digest a new substrate will differentially thrive in such an environment.

3.4 Edelman’s selectionist theory of the brain

After Gerald Edelman received the Nobel prize for his work on the selectionist approach to the immune system, he switched to neurophysiology and developed the theory of neuronal group selection or neural Darwinism.

[T]he theoretical principle I shall elaborate here is that the origin of categories in higher brain function is somatic selection among huge numbers of variants of neural circuits contained in networks created epigenetically in each individual during its development; this selection results in differential amplification of populations of synapses in the selected variants. In other words, I shall take the view that the brain is a selective system more akin in its workings to evolution than to computation or information processing. (Edelman 1987, p. 25)

There are several different phases in this selectionist theory. In the developmental phase of the brain, a huge variety of loose connections are made. Those that find some resonance with the individual’s experience are strengthened while those that are unused will atrophy. The slogan is that “the neurons that fire together, wire together.” Later there is an experiential selection that strengthens some connections and weakens others. Finally, “reentrant” signals within the brain deepen the process of self-organization through strengthening some connections and weakening others.

One of the tell-tale signs of a process of determination through universals is the indirectness of the factorization through a universal. Here again, an instructionist account might be first given for a process that is later recognized as being selectionist. The interplay between these two accounts dates back at least to the Platonic–Socratic account of learning not as the result of external instruction but as a process of catalyzing internal recollection. One of the striking epigrams of neo-Platonism is the thesis that “no man ever does or can teach another anything” (Burnyeat 1987, p. 1). In the early fifth century, Augustine in De Magistro (The Teacher) made the point contrasting “outward” instruction with learning “within.”

But men are mistaken, so that they call those teachers who are not, merely because for the most part there is no delay between the time of speaking and the time of cognition. And since after the speaker has reminded them, the pupils quickly learn within, they think that they have been taught outwardly by him who prompts them. (Chapter XIV)

In the nineteenth century, Wilhelm von Humboldt made the same point, even recognizing the symmetry between speaker and listener.

Nothing can be present in the mind (Seele) that has not originated from one’s own activity. Moreover understanding and speaking are but different effects of the selfsame power of speech. Speaking is never comparable to the transmission of mere matter (Stoff). In the person comprehending as well as in the speaker, the subject matter must be developed by the individual’s own innate power. What the listener receives is merely the harmonious vocal stimulus. (Humboldt 1997, p. 102)

A similar theme has been a mainstay in active learning theories of education. As John Dewey put it:

It is that no thought, no idea, can possibly be conveyed as an idea from one person to another. When it is told, it is, to the one to whom it is told, another given fact, not an idea. The communication may stimulate the other person to realize the question for himself and to think out a like idea, or it may smother his intellectual interest and suppress his dawning effort at thought. (Dewey 1916, p. 159)

Remarkably, the immunologist Niels Jerne tied these themes together.

Several philosophers, of course, have already addressed themselves to this point. John Locke held that the brain was to be likened to white paper, void of all characters, on which experience paints with almost endless variety. This represents an instructive theory of learning, equivalent to considering the cells of the immune system void of all characters, upon which antigens paint with almost endless variety.

Contrary to this, the Greek Sophists, including Socrates, held a selective theory of learning. Learning, they said, is clearly impossible. For either a certain idea is already present in the brain, and then we have no need of learning it, or the idea is not already present in the brain, and then we cannot learn it either, for even if it should happen to enter from outside, we could not recognize it. This argument is clearly analogous to the argument for a selective mechanism for antibody formation, in that the immune system could not recognize the antigen if the antibody were not already present. Socrates concluded that all learning consists of being reminded of what is pre-existing in the brain. (Jerne 1967, pp. 204–205)

This theme distinguishing direct determination from the composite effect of the indirect influence differentially triggering internal processes comes out in Edelman’s theory of the brain.

According to this analysis, extrinsic signals convey information not so much in themselves, but by virtue of how they modulate the intrinsic signals exchanged within a previously experienced neural system. In other words, a stimulus acts not so much by adding large amounts of extrinsic information that need to be processed as it does by amplifying the intrinsic information resulting from neural interactions selected and stabilized by memory through previous encounters with the environment. (Edelman and Tononi 2000, p. 137)

Thus, for example, the old neo-Platonic theme of learning through recollection emerges in Edelman’s account of perception as the “remembered present” (e.g., chapter nine in Edelman and Tononi 2000) (Fig. 12).

Fig. 12 Selectionist account of perception or recognition as determination through a receiving universal

In broad-brush terms, one might intuitively think of the universal model as a large set of brain circuits representing a wide (“universal”) range of sensory images and vibrating at a low level of amplitude beneath the level of consciousness (analogous to the “universal” repertoire of antibodies present in the immune system in low concentrations). When a specific signal is received from the environment, then it might resonate with a particular circuit-image which would greatly increase the amplitude of those vibrations and would thus constitute the perception. This sort of model has a built-in type of intentionality (i.e., seeing is always seeing-as) since the perception would always be “perception-as” depending on which image was resonated.

In view of his earlier work on the immune system, Edelman is well-placed to try to draw out the underlying principles of the selectionist account of recognition (i.e., determination through a receiving universal)—which agree with the main features described above.

The long trail from antibodies to conscious brain events has reinforced my conviction that evolution, immunology, embryology, and neurobiology are all sciences of recognition whose mechanics follow selectional principles. ... All selectional systems follow three principles. There must be a generator of diversity, a polling process across the diverse repertoires that ensue, and a means of differential amplification of the selected variants. (Edelman 2004, p. 7367); (also Edelman 2004, pp. 41–42)

These three principles are functionally represented by the three components in a determination through a receiving universal pictured above. The “generator of diversity” is the receiving universal object, the “polling process across the diverse repertoires” is represented by the receiving universal morphism (labelled “selection” in the above diagram) that is the canonical external–internal interface between external environment and the receiving universal object, and finally the “differential amplification” is represented by the factor morphism (with that label in the above diagram).

3.5 Pseudo-selectionist theories

Considerable scientific prestige is now attached to “Darwin’s dangerous idea” (Dennett 1995), the selectionist account of biological evolution. When the selectionist ideas turned out to be successful in other areas (e.g., the immune system or bacterial “learning”), it became something of a scientific fad to cast all sorts of theories into a seemingly selectionist mold. The advent of “universal selection theory” was the “second Darwinian revolution” (e.g., Cziko 1995; Hull 2001; Heyes and Hull 2001). Since Lamarckian or instructionist theories of learning and adaptation were alternatives to Darwin’s selectionist account, they were treated as being almost pseudo-scientific, in the same league as creationism. Even some of the greatest of modern philosophers such as Karl Popper were drawn into the fad.

The theory of knowledge which I wish to propose is a largely Darwinian theory of the growth of knowledge. From the amoeba to Einstein, the growth of knowledge is always the same: we try to solve our problems, and to obtain, by a process of elimination, something approaching adequacy in our tentative solutions. (Popper 1979, p. 261)

Yet many of the so-called “selectionist” theories were so general that almost any type of learning or adaptation, from the operant conditioning of rats running mazes and pigeons pecking levers to the growth of scientific knowledge, could be verbally described in such a way as to appear “selectionist.”

Instead of recapitulating that debate, a different approach is taken here. We start with a notion of determination through universals that can be stated precisely—albeit at a high level of abstraction—and that is already known to be of fundamental importance in mathematics itself. When the main features of this type of determination are described, then the Darwinian selectionist account of biological evolution, the selectionist theory of the immune system, and a selectionist approach to the brain seem to have a good fit (modulo the differences between an atemporal conceptual model and a temporal process). However, other “selectionist” theories such as operant conditioning or the growth of knowledge “from the amoeba to Einstein” seem to be rather contrived and selective accounts.

The first major message of the mathematical model is that there is nothing pseudo-scientific in the notion of direct or instructive determination. If anything, that is the standard type of determination. An adjunctive situation is very special. In the context of an adjunction, there is always an equality in the overall end results of a direct determination and the indirect factorization through the universal. In an empirical context, we might find a process corresponding to one or the other type of determination or perhaps both at the same time.

Secondly, the aspects of universality, internalization, and autonomy impose important restrictions on what might be interpreted as a determination through universals. An animal searching for food by taking in a host of instructive clues from the environment hardly satisfies these restrictions. If a rat running a maze is perchance “clueless” at a junction and has to make a “blind variation” to avoid the fate of Buridan’s ass, then that waste-case of resolving a tie in an overall instructive process does not transform the process into one of representing a universe of possibilities combined with indirect selection of particular possibilities.

In Popper’s account of the growth of scientific knowledge, the hypothetico-deductive method is the selectionist theory while Baconian induction plays the role of the instructionist theory (see chapter five in Popper 1985). While Popper’s account of the power of the hypothetico-deductive method is impressive, there is simply no reason to think that anything as complex as the growth of scientific knowledge should be purely or even primarily one way or the other. The attempt to fit an animal’s behavioral learning in an environment or the development of scientific knowledge to this Procrustean bed seems rather overdrawn. Rats are surely as omnivorous in their consumption of instructive clues as they are in their consumption of the food they seek. And even in mathematics, not to mention the empirical sciences, induction or generalization from examples is a well-known process for developing ideas and hypotheses. As Paul (“anything goes”) Feyerabend would have emphasized, scientists are also omnivorous in their consumption of clues to generate new ideas.

3.6 Chomsky’s theory of generative grammar

Language learning by a child is another example of a process that was originally thought to be instructive. But Noam Chomsky’s theory of generative grammar postulated an innate language faculty or universal grammar that would unfold according to the linguistic experience of the child. The child did not “learn” the rules of grammar; the linguistic experience of the child would select how the universal mechanism would develop or unfold to differentially implement one rule rather than another. Again Niels Jerne saw the connection; his Nobel Lecture was entitled The Generative Grammar of the Immune System (Fig. 13).

Fig. 13 Generative grammar account of language learning as determination through a universal

An everyday example of indirect determination is a person’s understanding of spoken language. The naive viewpoint is that somehow the meaning of the spoken sentences is transmitted from the speaker to the listener. But, in fact, it is only the physical sounds that are transmitted. The syntactic analysis and the semantic component have to be generated internally by the listener so the heard sounds only have the role of selecting which generative processes will be triggered. This neo-Platonic point was already emphasized in the last section.

Chomsky has emphasized the universality of the internal mechanism to both generate and understand a potential infinity of sentences which have never been spoken or heard before. Footnote 8 Descartes emphasized this universality of language and reason: “reason is a universal instrument which can serve for all contingencies” (Descartes 1975, p. 116) so Chomsky has referred to the generative grammar approach as “Cartesian linguistics.”

In summary, one fundamental contribution of what we have been calling "Cartesian linguistics" is the observation that human language, in its normal use, is free from the control of independently identifiable external stimuli or internal states and is not restricted to any practical communicative function, in contrast, for example, to the pseudo language of animals. It is thus free to serve as an instrument of free thought and self-expression. The limitless possibilities of thought and imagination are reflected in the creative aspect of language use. The language provides finite means but infinite possibilities of expression constrained only by rules of concept formation and sentence formation, these being in part particular and idiosyncratic but in part universal, a common human endowment. (Chomsky 1966, p. 29)

The general features of universality, internalization, and autonomy (independence from external stimulus control) are clear. Footnote 9

4 Conclusion

In recent decades, the notion of an adjunction has emerged as a principal lens to pick out and characterize what is important in mathematics. If adjoint functors characterize much of what is important in mathematics itself, then it is reasonable to expect that the conceptual structure might also have applications in the empirical sciences that are of special importance.

The applications given here were not clear in the “classical” treatment of adjoint functors since that treatment did not involve heteromorphisms at all. It is only with the heteromorphic theory that adjoints are seen as arising from internal universal representations of the external determinations (heteromorphisms) between two different domains (the sending and receiving categories). This is the mathematical content of the adjunction natural isomorphisms (using the coproduct example):

$$ Hom(X+Y, W) \cong Het((X,Y), W) \cong Hom((X,Y), \Delta W) $$

which show that the heteromorphisms in the center are internally represented by the right adjoint in the sending category on the right and also by the left adjoint in the receiving category on the left. The classical treatment of adjoints “left out” the heteromorphisms in the middle and focused on the natural isomorphism between the internal factor morphisms on each side. Thus it missed the whole interplay between the external determinations (heteromorphisms) being internally represented on the sending and receiving sides (by homomorphisms) which was key to the applications.
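
The coproduct isomorphisms can be checked concretely on ordinary functions. The sketch below is an illustration under stated assumptions, not part of the formal theory: the coproduct X + Y is modeled as tagged pairs `("L", x)` and `("R", y)`, a heteromorphism from (X, Y) to W is modeled as a pair of functions, and the helper names `copair` and `uncopair` are hypothetical.

```python
from typing import Any, Callable, Tuple

# Element of X + Y: ("L", x) or ("R", y).
Tagged = Tuple[str, Any]
# A pair (f: X -> W, g: Y -> W), standing in for a heteromorphism (X, Y) => W.
Cocone = Tuple[Callable[[Any], Any], Callable[[Any], Any]]


def inject_left(x: Any) -> Tagged:
    """Canonical insertion X -> X + Y."""
    return ("L", x)


def inject_right(y: Any) -> Tagged:
    """Canonical insertion Y -> X + Y."""
    return ("R", y)


def copair(cocone: Cocone) -> Callable[[Tagged], Any]:
    """Het((X, Y), W) -> Hom(X + Y, W): the factor map through the coproduct."""
    f, g = cocone

    def factor(tagged: Tagged) -> Any:
        tag, value = tagged
        return f(value) if tag == "L" else g(value)

    return factor


def uncopair(h: Callable[[Tagged], Any]) -> Cocone:
    """Hom(X + Y, W) -> Het((X, Y), W): precompose with the insertions."""
    return (lambda x: h(inject_left(x)), lambda y: h(inject_right(y)))


if __name__ == "__main__":
    f = len   # f: X -> W with X = strings, W = integers
    g = abs   # g: Y -> W with Y = integers
    h = copair((f, g))
    f2, g2 = uncopair(h)
    # Direct determination and factorization through X + Y agree.
    assert all(f2(x) == f(x) for x in ["", "ab", "abc"])
    assert all(g2(y) == g(y) for y in [-2, 0, 5])
    print(h(("L", "abc")), h(("R", -2)))  # 3 2
```

In this model the same pair `(f, g)` can also be read as a single morphism from (X, Y) to ΔW = (W, W) in the product category, which is the right-hand term of the isomorphism; `copair` and `uncopair` are mutually inverse on the sample inputs, which is the left-hand bijection.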

This restructuring of an external determination as a determination through an internal universal structure provides a model of how a new type of internal organization might display qualitatively different types of behavior and recognition. The internalization through the universal structure builds a “separate” internalized “space” or “world” and thus supports the emergence of a qualitatively new level of relatively autonomous activity that would not otherwise be present if there were only the direct determinative connections.

In the life sciences, selectionist theories of biological evolution, the immune system, and the brain seem to fit well into this model as well as the DNA mechanism. Noam Chomsky’s theory of generative grammar seems to be a good fit within the human sciences. All these structures are certainly of special importance, and their common features are described by the conceptual structure of determination through a universal.

Overall, these examples suggest that determination through universals—mathematically expressed in a pair of adjoint functors—offers a set of ideas to approach the old conundrum of how levels of organization exhibiting some measure of autonomy could exist in a world otherwise characterized by direct external determination.