Adjunctions and monads can be derived from one another. Every adjunction gives rise to a monad, but only adjunctions of a specific kind arise from monads. In this section, we show how to turn adjunctions into monads and how to obtain Kleisli adjunctions from monads. We then give a proof of Mac Lane’s comparison theorem.
Definition 2.1
Let \(\mathscr {C}\) and \(\mathscr {D}\) be two categories. The functors \(F:\mathscr {C}\rightarrow \mathscr {D}\) and \(G:\mathscr {D}\rightarrow \mathscr {C}\) form an adjunction \(F\dashv G:\mathscr {D}\rightarrow \mathscr {C}\) iff there exist natural transformations \(\eta :Id_{\mathscr {C}} \Rightarrow GF\) and \(\varepsilon :FG\Rightarrow Id_{\mathscr {D}}\) with the following coherence conditions satisfied:
$$\begin{aligned} \varepsilon _{FX} \circ F\eta _X&= id_{FX} \text { for each } X \text { in } \mathscr {C}\end{aligned}$$
(2.1)
$$\begin{aligned} G\varepsilon _A \circ \eta _{GA}&= id_{GA} \text { for each } A \text { in } \mathscr {D}\end{aligned}$$
(2.2)
Informally speaking, an adjunction is a way of relating functors. It belongs to the same family of concepts as equality, equivalence and isomorphism, and is the weakest notion among them: it can be understood as a further weakening of an isomorphism. If we look for an isomorphism between two functors, say F and G, we first require them to be defined from the same source category to the same target category. Only then do we search for two natural transformations between F and G, defined in opposite directions, such that their composition gives the identity natural transformation over the functor F or G depending on the order of composition.
An adjunction is weaker. It can relate two functors defined between the same categories but in opposite directions, i.e., \(F:\mathscr {C}\rightarrow \mathscr {D}\) and \(G:\mathscr {D}\rightarrow \mathscr {C}\). Clearly, we cannot look for an equality, equivalence or isomorphism between F and G since, from a type-theoretic point of view, they do not have the same type and so cannot be directly compared. This is the point where the notion of adjointness comes into play, by providing a way to compare them via their compositions. Therefore, for the functors \(F:\mathscr {C}\rightarrow \mathscr {D}\) and \(G:\mathscr {D}\rightarrow \mathscr {C}\) to qualify as adjoints, there need to be two natural transformations \(\eta : Id _\mathscr {C}\Rightarrow GF\) and \(\varepsilon :FG\Rightarrow Id _\mathscr {D}\) satisfying the coherence conditions stated in Eqs. (2.1) and (2.2). The former can be depicted and explained as follows:
If F and G are adjoint functors, then this condition intuitively tells us that the following case holds for each map \(f:X\rightarrow Y\) in the category \(\mathscr {C}\): we have maps \(Idf:X\rightarrow Y\) and \(GFf:GFX\rightarrow GFY\) also in \(\mathscr {C}\). Using the natural transformation \(\eta \), we can deform the map Idf into GFf. We then transport this deformation through the functor F from the category \(\mathscr {C}\) into the category \(\mathscr {D}\). There, using the natural transformation \(\varepsilon \), we can do an inverse deformation (or better a formation) and form the map f back in the form of Ff since we are in \(\mathscr {D}\). Note that we independently have the other very similar condition at Eq. (2.2) satisfied.
Example 1
In the Calculus of Inductive Constructions (CIC), the conjunction and implication over propositions are adjoint operations. To show this, we first take the Prop universe as the category of propositional formulas and entailments, and name it CatP. It is then possible to form two endo-functors \(F, G:\forall p \in obj(CatP), CatP\rightarrow CatP\) as
$$\begin{aligned} F(p)&= \lambda q. p \wedge q \\ F(p)f&= \lambda (H: p \wedge a). \text {match } H \text { with conj } x\,y \Rightarrow \text {conj } x\,(f\,y), \forall f:a\rightarrow b\\ G(p)&= \lambda q. p \implies q \\ G(p)f&= \lambda (H: p\implies a) (x: p). f\,(H\,x),\, \forall f:a\rightarrow b \end{aligned}$$
We now define two natural transformations
\(\eta :\forall p \in obj(CatP), Id_{CatP}\Rightarrow G(p)\circ F(p)\) and \(\varepsilon :\forall p \in obj(CatP), F(p)\circ G(p) \Rightarrow Id_{CatP}\) as
$$\begin{aligned} \eta (p)&= \lambda (y: id\,q) (x: p). \text {conj } x\,\,y,\, \forall q \in obj(CatP)\\ \varepsilon (p)&= \lambda (H: p \wedge (p \implies q)).\text {match } H \text { with conj } x\,y \Rightarrow y\,x, \forall q \in obj(CatP) \end{aligned}$$
where \(\mathtt {conj}\,x\,y\) is a proof of \(x \wedge y\).
It is straightforward to check that the functors F and G form an adjunction through \(\eta \) and \(\varepsilon \) by simply showing that Eqs. (2.1) and (2.2) are satisfied. It is interesting to note that \(\varepsilon \) is modus ponens.
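Under the Curry-Howard reading, a proof of a conjunction is a pair and a proof of an implication is a function, so the two coherence conditions of Example 1 can be spot-checked on ordinary values. The Python sketch below is only an illustration under that assumption: the names `F_map`, `G_map`, `eta` and `eps` are hypothetical, and sample values stand in for proofs.

```python
# Illustrative sketch only: pairs stand in for proofs of a conjunction,
# functions for proofs of an implication. All names are hypothetical.

def F_map(f):
    # Action of F(p) on f : a -> b, mapping (p and a) to (p and b).
    return lambda pair: (pair[0], f(pair[1]))

def G_map(f):
    # Action of G(p) on f : a -> b, mapping (p implies a) to (p implies b).
    return lambda h: (lambda x: f(h(x)))

def eta(y):
    # Unit: from q, build a function taking p to the pair (p, q).
    return lambda x: (x, y)

def eps(pair):
    # Counit (modus ponens): from p and (p implies q), obtain q.
    x, h = pair
    return h(x)

# Eq. (2.1): eps_{FX} after F(eta_X) is the identity, on a sample pair.
sample = (3, "q")
lhs_21 = eps(F_map(eta)(sample))

# Eq. (2.2): G(eps_A) after eta_{GA} is the identity, checked pointwise.
h = lambda x: x * 2
lhs_22 = G_map(eps)(eta(h))
points = [0, 1, 5]
agree_22 = all(lhs_22(x) == h(x) for x in points)
```

Functions can only be compared pointwise here, so the check of Eq. (2.2) is a sanity test on a few inputs rather than a proof.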
There are other equivalent ways to formulate the notion of adjunction. For instance, the following Proposition 2.2 makes use of an isomorphism of hom-functors and may better highlight the connection between an isomorphism and an adjunction. Also, an adjunction can be seen as a generalization of the “equivalence” between categories.
Proposition 2.2
An adjunction \(F\dashv G:\mathscr {D}\rightarrow \mathscr {C}\) determines a natural isomorphism between hom-functors, whose components are the bijections
$$\begin{aligned} \varphi _{X,\,A}:Hom_{\mathscr {D}}(FX,\,A) \xrightarrow {\cong } Hom_{\mathscr {C}}(X,\,GA) \end{aligned}$$
(2.3)
for each \(X \in \mathscr {C}\) and \(A\in \mathscr {D}\) as follows:
$$\begin{aligned} \varphi _{X,\,A}f = Gf\circ \eta _X:X\rightarrow GA \text { for each } f:FX\rightarrow A \end{aligned}$$
(2.4)
$$\begin{aligned} \varphi ^{-1}_{X,\,A}g = \varepsilon _A \circ Fg:FX\rightarrow A \text { for each } g:X\rightarrow GA. \end{aligned}$$
(2.5)
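As a worked check, the two maps in Eqs. (2.4) and (2.5) are mutually inverse. The derivation below is a sketch that uses only the functoriality of F and G, the naturality of \(\varepsilon \) and \(\eta \), and the coherence conditions (2.1) and (2.2), for \(f:FX\rightarrow A\) and \(g:X\rightarrow GA\):

$$\begin{aligned} \varphi ^{-1}_{X,\,A}(\varphi _{X,\,A}f)&= \varepsilon _A \circ F(Gf\circ \eta _X) = \varepsilon _A \circ FGf \circ F\eta _X = f \circ \varepsilon _{FX} \circ F\eta _X = f \\ \varphi _{X,\,A}(\varphi ^{-1}_{X,\,A}g)&= G(\varepsilon _A \circ Fg)\circ \eta _X = G\varepsilon _A \circ GFg \circ \eta _X = G\varepsilon _A \circ \eta _{GA} \circ g = g \end{aligned}$$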
Definition 2.3
A monad \(T = (T, \eta , \mu )\) in a category \(\mathscr {C}\) consists of an endo-functor \(T:\mathscr {C}\rightarrow \mathscr {C}\) equipped with two natural transformations
$$\begin{aligned} \eta :Id_{\mathscr {C}}\Rightarrow T \qquad \mu :T^2\Rightarrow T \end{aligned}$$
(2.6)
such that the following diagrams commute:
$$\begin{aligned} \mu \circ T\mu = \mu \circ \mu T \quad \quad \quad \quad \quad \quad \quad \quad \quad&\mu \circ T\eta = \mu \circ \eta T \end{aligned}$$
(2.7)
$$\begin{aligned}&\mu \circ \eta T = id_T \end{aligned}$$
(2.8)
$$\begin{aligned}&\mu \circ T\eta = id_T \end{aligned}$$
(2.9)
The natural transformations \(\mu \) and \(\eta \) can be respectively seen as the binary multiplication and the identity operations of the monad. The coherence condition given by the above diagram on the left (aka the associativity square) ensures that the multiplication is an associative operation, while the one on the right (aka the unit triangles) ensures the neutrality of the identity with respect to the multiplication.
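To make the associativity square and the unit triangles concrete, the following Python sketch checks them on sample data for the familiar list monad. The choice of the list monad and the names `T_map`, `unit` and `join` are assumptions made for illustration; they do not come from the formalization.

```python
# Illustrative sketch: the list monad, where T X is the type of lists over X.

def T_map(f):
    # Action of the endo-functor T on a map f.
    return lambda xs: [f(x) for x in xs]

def unit(x):
    # eta_X : X -> T X
    return [x]

def join(xss):
    # mu_X : T^2 X -> T X (flatten one level of nesting)
    return [x for xs in xss for x in xs]

xsss = [[[1, 2], [3]], [[4]]]   # a sample element of T^3 X
xs = [1, 2, 3]                  # a sample element of T X

assoc_left = join(T_map(join)(xsss))   # mu after T mu
assoc_right = join(join(xsss))         # mu after mu T
unit_left = join(unit(xs))             # mu after eta T, Eq. (2.8)
unit_right = join(T_map(unit)(xs))     # mu after T eta, Eq. (2.9)
```

Both associativity paths flatten the sample to the same list, and both unit paths return the original list unchanged.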
Example 2
A monad is in fact a monoid in the category of endo-functors, with its identity being the unit \(\eta :Id_\mathscr {C}\Rightarrow T\) of the monad and its binary operation being the multiplication \(\mu :T^2\Rightarrow T\). The identity laws of the monoid correspond to the coherence conditions of the unit triangles, and the associativity of the monoid’s binary operation corresponds to the associativity square.
Proposition 2.4
An adjunction \(F \dashv G:\mathscr {D}\rightarrow \mathscr {C}\) determines a monad on \(\mathscr {C}\) and a comonad on \(\mathscr {D}\) as follows:
-
The monad \((T, \eta , \mu )\) on \(\mathscr {C}\) has endo-functor \(T = GF:\mathscr {C}\rightarrow \mathscr {C}\), unit \(\eta :Id_{\mathscr {C}}\Rightarrow T\) where \(\eta _X = \varphi _{X,\,FX}(id_{FX})\) and multiplication \(\mu :T^2\Rightarrow T\) such that \(\mu _X = G(\varepsilon _{FX})\).
-
The comonad \((D, \varepsilon , \delta )\) on \(\mathscr {D}\) has endo-functor \(D = FG:\mathscr {D}\rightarrow \mathscr {D}\), counit \(\varepsilon :D \Rightarrow Id_{\mathscr {D}}\) where \(\varepsilon _A = \varphi _{GA,\,A}^{-1}(id_{GA})\) and co-multiplication \(\delta :D\Rightarrow D^2\) such that \(\delta _A = F(\eta _{GA})\).
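As a sketch of why the data in the first item satisfy the monad axioms of Definition 2.3, the unit laws follow directly from the coherence conditions (2.2) and (2.1), and associativity follows from the naturality of \(\varepsilon \) applied to the morphism \(\varepsilon _{FX}:FGFX\rightarrow FX\), using \(T = GF\) and \(\mu _X = G\varepsilon _{FX}\):

$$\begin{aligned} \mu _X \circ \eta _{TX}&= G\varepsilon _{FX} \circ \eta _{GFX} = id_{GFX} = id_{TX} \\ \mu _X \circ T\eta _X&= G\varepsilon _{FX} \circ GF\eta _X = G(\varepsilon _{FX} \circ F\eta _X) = G(id_{FX}) = id_{TX} \\ \mu _X \circ T\mu _X&= G(\varepsilon _{FX} \circ FG\varepsilon _{FX}) = G(\varepsilon _{FX} \circ \varepsilon _{FGFX}) = \mu _X \circ \mu _{TX} \end{aligned}$$

The comonad axioms for \((D, \varepsilon , \delta )\) are obtained dually.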
Proposition 2.5
Each monad \((T, \eta , \mu )\) on a category \(\mathscr {C}\) determines a Kleisli category \(\mathscr {C}_T\) and an associated adjunction \(F_T\dashv G_T:\mathscr {C}_T\rightarrow \mathscr {C}\) as follows:
-
The categories \(\mathscr {C}\) and \(\mathscr {C}_T\) have the same objects and there is a morphism \(f^\flat :X\rightarrow Y\) in \(\mathscr {C}_T\) for each morphism \(f:X\rightarrow TY\) in \(\mathscr {C}\).
-
For each object X in \(\mathscr {C}_T\), the identity arrow is \(id_X = h^\flat :X\rightarrow X\) in \(\mathscr {C}_T\) where \(h = \eta _X:X\rightarrow TX \text { in } \mathscr {C}. \)
-
The composition of a pair of morphisms \(f^\flat :X\rightarrow Y\) and \(g^\flat :Y\rightarrow Z\) in \(\mathscr {C}_T\) is given by the Kleisli composition: \(g^\flat \circ f^\flat = h^\flat :X\rightarrow Z \text { where } h = \mu _Z\circ Tg\circ f:X\rightarrow TZ \text { in } \mathscr {C}.\)
-
The functor \(F_T:\mathscr {C}\rightarrow \mathscr {C}_T\) is the identity on objects. On morphisms,
$$\begin{aligned} F_Tf = (\eta _Y\circ f)^\flat , \text { for each } f:X\rightarrow Y \text { in } \mathscr {C}. \end{aligned}$$
(2.10)
-
The functor \(G_T:\mathscr {C}_T\rightarrow \mathscr {C}\) maps each object X in \(\mathscr {C}_T\) to TX in \(\mathscr {C}\). On morphisms,
$$\begin{aligned} G_T(g^\flat ) = \mu _Y \circ Tg, \text { for each } g^\flat :X\rightarrow Y \text { in } \mathscr {C}_T. \end{aligned}$$
(2.11)
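The construction in Proposition 2.5 can be sketched in Python for the list monad, representing a Kleisli morphism \(f^\flat :X\rightarrow Y\) by the underlying function \(f:X\rightarrow TY\). The list monad and all function names below are assumptions chosen for illustration only.

```python
# Illustrative sketch of the Kleisli category of the list monad.

def T_map(f):
    return lambda xs: [f(x) for x in xs]

def unit(x):                 # eta_X
    return [x]

def join(xss):               # mu_X
    return [x for xs in xss for x in xs]

def kleisli_id(x):
    # Identity arrow: id_X = (eta_X)^flat.
    return unit(x)

def kleisli_compose(g, f):
    # Kleisli composition: h = mu_Z . Tg . f.
    return lambda x: join(T_map(g)(f(x)))

def F_T(f):
    # Eq. (2.10): F_T f = (eta_Y . f)^flat.
    return lambda x: unit(f(x))

def G_T(g):
    # Eq. (2.11): G_T(g^flat) = mu_Y . Tg.
    return lambda xs: join(T_map(g)(xs))

f = lambda n: [n, n + 1]        # f : int -> T int
g = lambda n: [str(n)] * 2      # g : int -> T str
times10 = lambda n: n * 10      # an ordinary map in the base category

composed = kleisli_compose(g, f)(1)
left_id = kleisli_compose(kleisli_id, f)(1)
right_id = kleisli_compose(f, kleisli_id)(1)
round_trip = G_T(F_T(times10))([1, 2])   # should agree with T_map(times10)
```

On these samples the identity laws hold and \(G_TF_T\) acts as T on morphisms, as expected of an adjunction whose induced monad is T.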
The lemma below is used in Theorem 2.7 to prove the uniqueness of the comparison functor.
Lemma 2.6
Let \(F \dashv G:\mathscr {D}\rightarrow \mathscr {C}\) be an adjunction. For each \(f:X\rightarrow GY\) in \(\mathscr {C}\), and \(g,\,\,h:FX \rightarrow Y\) in \(\mathscr {D}\), if \(f = Gg \circ \eta _X\) and \(f = Gh \circ \eta _X\) then \(g = h\).
Proof
By assumption, we have \(Gg \circ \eta _X = Gh \circ \eta _X\) thus \(\varepsilon _Y \circ F(Gg \circ \eta _X) = \varepsilon _Y \circ F(Gh \circ \eta _X)\) which is \(\varepsilon _Y \circ FGg \circ F\eta _X = \varepsilon _Y \circ FGh \circ F\eta _X\). The naturality of \(\varepsilon \) gives \(g \circ \varepsilon _{FX} \circ F\eta _X = h \circ \varepsilon _{FX} \circ F\eta _X\). Finally, since \(\varepsilon _{FX} \circ F\eta _X = id_{FX}\), we conclude that \(g = h\). \(\square \)
Theorem 2.7
(The comparison theorem for the Kleisli construction [14, Ch. VI, §5, Theorem 2]) Let \(F \dashv G:\mathscr {D}\rightarrow \mathscr {C}\) be an adjunction and let \((T, \eta , \mu )\) be the associated monad on \(\mathscr {C}\). Then, there is a unique comparison functor \(L:\mathscr {C}_T\rightarrow \mathscr {D}\) such that \(GL = G_T\) and \(LF_T = F\), where \(\mathscr {C}_T\) is the Kleisli category of \((T, \eta , \mu )\), with the associated adjunction \(F_T\dashv G_T:\mathscr {C}_T\rightarrow \mathscr {C}\).
Intuitively, one can start with an arbitrary adjunction \(F\dashv G:\mathscr {D}\rightarrow \mathscr {C}\). This determines a monad T on the base category \(\mathscr {C}\) and dually a comonad on the category \(\mathscr {D}\) as stated in Proposition 2.4. In turn, the monad T determines a Kleisli category \(\mathscr {C}_T\) together with a Kleisli adjunction \(F_T\dashv G_T:\mathscr {C}_T\rightarrow \mathscr {C}\) as in Proposition 2.5. The theorem states that one can compare the arbitrary adjunction with the Kleisli adjunction using a functor \(L:\mathscr {C}_T\rightarrow \mathscr {D}\) (aka the comparison functor), which is unique and makes the above diagram commute in both directions.
Proof
Let us first assume that \(L:\mathscr {C}_T\rightarrow \mathscr {D}\) is a functor satisfying \(GL = G_T\) and \(LF_T = F\), so that the diagram given below commutes.
Let \(\theta _{X,\,Y}:Hom_{\mathscr {C}_T}(F_TX, Y) \xrightarrow {\cong } Hom_{\mathscr {C}}(X, G_TY)\) be a bijection associated to the adjunction \(F_T \dashv G_T\) provided by Proposition 2.2. Similarly, let \(\psi _{X,\,Y}:Hom_{\mathscr {D}}(FX, Y) \xrightarrow {\cong } Hom_{\mathscr {C}}(X, GY)\) be a bijection associated to the adjunction \(F \dashv G\). Since units of adjunctions \(F_T \dashv G_T\) and \(F \dashv G\) are the unit \(\eta \) of the monad \((T,\eta ,\mu )\) by [14, Ch. IV, §7, Proposition 1], we obtain the commutative diagram below:
Therefore, \(L_{F_TX, Y} =\psi _{X, LY}^{-1} \circ \theta _{X, Y}\). Using Eq. (2.4) in Proposition 2.2, we have: \( \theta _{X, Y}f^\flat = G_Tf^\flat \circ \eta _X:X\rightarrow G_TY, \text { for each } f^\flat :F_TX = X \rightarrow Y \text { in } \mathscr {C}_T. \) Since \(G_Tf^\flat = \mu _Y\circ Tf\) in \(\mathscr {C}\), for each \(f^\flat :X\rightarrow Y\) in \(\mathscr {C}_T\), by Eq. (2.11), we have \(\theta _{X, Y}f^\flat = \mu _Y\circ Tf\circ \eta _X:X\rightarrow G_TF_TY = G_TY.\) Thanks to the naturality of \(\eta \), we get \(\theta _{X, Y}f^\flat = \mu _Y\circ \eta _{TY} \circ f\). The monadic axiom \(\mu _Y\circ \eta _{TY} = id_{TY}\) yields \(\theta _{X, Y}f^\flat = f:X\rightarrow G_TY\). Since \(G_T = GL\) and \(F_T\) is the identity on objects, we have \(\theta _{X, Y}f^\flat = f:X\rightarrow GLY \text { and } LF_TY = LY = FY.\) Now, by Eq. (2.5) in Proposition 2.2, we obtain \( \psi _{X, LY}^{-1}f = \varepsilon _{LY} \circ Ff = \varepsilon _{FY} \circ Ff = \psi _{X, FY}^{-1}f \text { for each } f:X\rightarrow GFY \text { in } \mathscr {C}. \) Hence \( \psi _{X, LY}^{-1}(\theta _{X, Y}f^\flat ) = \psi _{X, FY}^{-1}f = \varepsilon _{FY} \circ Ff. \) In other words, any functor L satisfying \(GL = G_T\) and \(LF_T = F\) must be such that \(LX = FX\) for each object X in \(\mathscr {C}_T\) and \(Lf^\flat = \varepsilon _{FY} \circ Ff \text { in } \mathscr {D}\text { for each } f^\flat :X\rightarrow Y \text { in } \mathscr {C}_T.\)
We now prove that the map \(L:\mathscr {C}_T\rightarrow \mathscr {D}\), characterized by \(LX = FX\) and \(Lf^\flat = \varepsilon _{FY} \circ Ff\), is actually a functor satisfying \(GL = G_T\) and \(LF_T = F\):
-
1.
For each X in \(\mathscr {C}_T\), since \(id_X = (\eta _X)^\flat \) in \(\mathscr {C}_T\), we have \(L(id_X) = L((\eta _X)^\flat ) = \varepsilon _{FX} \circ F\eta _X\). By [14, Ch. IV, §1, Theorem 1], we get \( \varepsilon _{FX} \circ F\eta _X = id_{FX} = id_{LX}. \) For each pair of morphisms \(f^\flat :X\rightarrow Y\) and \(g^\flat :Y\rightarrow Z\) in \(\mathscr {C}_T\), by Kleisli composition with \(\mu _Z = G\varepsilon _{FZ}\) and \(T = GF\), we obtain \(L(g^\flat \circ f^\flat ) = \varepsilon _{FZ} \circ FG\varepsilon _{FZ} \circ FGFg \circ Ff.\) Since \(\varepsilon \) is natural, the right-hand side equals \(\varepsilon _{FZ} \circ Fg \circ \varepsilon _{FY} \circ Ff\), which is \(L(g^\flat ) \circ L(f^\flat )\) in \(\mathscr {D}\). Hence \(L:\mathscr {C}_T\rightarrow \mathscr {D}\) is a functor.
-
2.
For each object X in \(\mathscr {C}_T\), \(LX = FX\) in \(\mathscr {D}\) and \(GLX = GFX = TX = G_TX\) in \(\mathscr {C}\). For each morphism \(f^\flat :X\rightarrow Y\) in \(\mathscr {C}_T\), \(Lf^\flat = \varepsilon _{FY} \circ Ff\) in \(\mathscr {D}\) by definition. Hence, \(GLf^\flat = G\varepsilon _{FY} \circ GFf.\) Similarly, since \(\mu _Y = G\varepsilon _{FY}\) and \(T = GF\), Eq. (2.11) gives \(G_Tf^\flat = G\varepsilon _{FY} \circ GFf.\) We get \(GLf^\flat = G_Tf^\flat \) for each morphism \(f^\flat \). Thus \(GL = G_T.\)
-
3.
\(F_T\) is the identity on objects, thus \(LF_TX = LX = FX\). For each morphism \(f:X\rightarrow Y\) in \(\mathscr {C}\), we have \(F_Tf = (\eta _Y \circ f)^\flat \) in \(\mathscr {C}_T\), by definition, so that \(LF_Tf = L(\eta _Y \circ f)^\flat = \varepsilon _{FY} \circ F\eta _Y \circ Ff.\) By the coherence condition in Eq. (2.1), we have \(\varepsilon _{FY} \circ F\eta _Y = id_{FY}\), yielding \(LF_Tf = Ff\) for each morphism f. Therefore \(LF_T= F\).
We additionally need to show that the functor \(L:\mathscr {C}_T\rightarrow \mathscr {D}\), as characterized before, satisfying the equations \(GL = G_T\) and \(LF_T = F\) is unique. Otherwise put, for each functor \(R:\mathscr {C}_T\rightarrow \mathscr {D}\) satisfying \(GR = G_T\) and \(RF_T = F\), we need to obtain \(R = L\). Let us show in the following items that the functors L and R map objects and morphisms in the same way:
-
For each object X in \(\mathscr {C}_T\), we have \(LX = FX = RF_TX\) by definition of L and the assumption \(RF_T = F\). Since \(F_T\) is the identity on objects (see the fourth item in Proposition 2.5), we get \(LX = RX\).
-
For each morphism \(f^\flat :X\rightarrow Y\) in \(\mathscr {C}_T\) (correspondingly \(f:X\rightarrow TY\) in \(\mathscr {C}\)), Lemma 2.6 gives \(Lf^\flat = Rf^\flat \) as soon as we demonstrate that \(f = G(Lf^\flat ) \circ \eta _X\) and \(f = G(Rf^\flat ) \circ \eta _X\) hold in \(\mathscr {C}\). Using the assumed equations \(GL = G_T\) and \(GR = G_T\), both right-hand sides equal \(G_Tf^\flat \circ \eta _X\), so it suffices to show that \(f = G_Tf^\flat \circ \eta _X\). We have \(G_Tf^\flat \circ \eta _X = \mu _Y \circ GFf \circ \eta _X\) by definition of \(G_T\). That amounts to \(G_Tf^\flat \circ \eta _X = \mu _Y \circ \eta _{GFY} \circ f\) due to the naturality of \(\eta \). The monadic axiom \(\mu _Y\circ \eta _{GFY} = id_{GFY}\) yields \(G_Tf^\flat \circ \eta _X = f\). Therefore \(Lf^\flat = Rf^\flat \).
We have \(\forall R:\mathscr {C}_T\rightarrow \mathscr {D}, GR = G_T \wedge RF_T = F \implies R = L\) thus the functor L is unique. \(\square \)
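The formula \(Lf^\flat = \varepsilon _{FY} \circ Ff\) can be tried out on a concrete adjunction. The Python sketch below assumes the adjunction between pairing with a fixed state type S and functions out of S (the programming counterpart of Example 1), whose induced monad is the state monad. All names are hypothetical, and the equations \(LF_T = F\) and \(GL = G_T\) are only spot-checked on sample inputs.

```python
# Illustrative sketch: F X = (S, X), G A = functions from S to A, T = G F.

def F_map(f):
    return lambda pair: (pair[0], f(pair[1]))

def G_map(f):
    return lambda h: (lambda s: f(h(s)))

def eta(x):                      # unit: X -> (S -> (S, X))
    return lambda s: (s, x)

def eps(pair):                   # counit: (S, S -> A) -> A
    s, h = pair
    return h(s)

def mu(m):                       # mu_X = G(eps_{FX})
    return G_map(eps)(m)

def F_T(h):                      # Eq. (2.10)
    return lambda x: eta(h(x))

def G_T(f):                      # Eq. (2.11), with T f = G F f
    return lambda m: mu(G_map(F_map(f))(m))

def L(f):                        # comparison functor: eps_{FY} . F f
    return lambda pair: eps(F_map(f)(pair))

# A sample Kleisli arrow X -> T Y: multiply x by the state, then bump the state.
tick = lambda x: (lambda s: (s + 1, x * s))
double = lambda x: 2 * x         # an ordinary map in the base category
m = lambda s: (s + 10, s)        # a sample element of T int

lf_t = L(F_T(double))((5, 7))    # L F_T on a sample pair
f_direct = F_map(double)((5, 7)) # F on the same pair
gl = G_map(L(tick))(m)(3)        # G L on a sample input
g_t = G_T(tick)(m)(3)            # G_T on the same input
```

In this instance L is essentially uncurrying: a Kleisli arrow from X to state transformers over Y becomes a plain function from (S, X) to (S, Y).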
Example 3
To demonstrate a use case of the comparison theorem, we start with a comonad D on an arbitrary category \(\mathscr {C}\). Thanks to the dual of Proposition 2.5, we get the coKleisli category \(\mathscr {C}_D\) with the associated coKleisli adjunction \(F_D\dashv G_D:\mathscr {C}\rightarrow \mathscr {C}_D\). We further know from Proposition 2.4 that the coKleisli adjunction gives us a comonad (which is indeed D itself) on the base category \(\mathscr {C}\) and a monad T on the codomain category \(\mathscr {C}_D\). By Proposition 2.5 itself, we can obtain the Kleisli category \(\mathscr {C}_{D,\,T}\) of the monad T with the Kleisli adjunction \(F_{D,\,T}\dashv G_{D,\,T}:\mathscr {C}_{D,\,T}\rightarrow \mathscr {C}_D\). The category \(\mathscr {C}_{D,\,T}\) is the full-image category of the endo-functor of the comonad D which we started with: it is made of the objects of \(\mathscr {C}\), and for each arrow \(f^{\flat \sharp }:X\rightarrow Y\) in \(\mathscr {C}_{D,\,T}\), there is an arrow \(f:DX \rightarrow DY\) in \(\mathscr {C}\). Now, Theorem 2.7 provides the unique comparison functor \(L:\mathscr {C}_{D,\,T}\rightarrow \mathscr {C}\) with the equations \(L\circ F_{D,\,T}= F_D\) and \(G_D\circ L = G_{D,\,T}\) satisfied. Furthermore, it is possible to prove that the functors L and \(F_{D,\,T}\circ G_D\) form the full-image decomposition of the endo-functor of the comonad D. Otherwise put, this endo-functor is indeed \(L \circ (F_{D,\,T}\circ G_D):\mathscr {C}\rightarrow \mathscr {C}\).
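The last claim can be unfolded in one line, as a sketch: the comonad induced by the coKleisli adjunction has endo-functor \(D = F_D\circ G_D\) by Proposition 2.4, and the comparison functor satisfies \(L\circ F_{D,\,T} = F_D\) by Theorem 2.7, so

$$\begin{aligned} L \circ (F_{D,\,T}\circ G_D) = (L \circ F_{D,\,T})\circ G_D = F_D\circ G_D = D. \end{aligned}$$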
Notice that it is possible to keep building the Kleisli categories over coKleisli categories by subsequent applications of Propositions 2.5 and 2.4. Such constructions obviously follow a pattern.
For instance the Kleisli category \(\mathscr {C}_{D,T,D,T}\) built over the coKleisli, Kleisli and coKleisli categories provided by the comonad D on \(\mathscr {C}\) is the full-image category of the composition of the endo-functor of D with itself: it is made of the objects of \(\mathscr {C}\) and for each arrow \(f^{\flat \sharp \flat \sharp }:X\rightarrow Y\) in \(\mathscr {C}_{D,T,D,T}\), there is a corresponding arrow \(f:D^2X \rightarrow D^2Y\) in \(\mathscr {C}\). And, the comparison theorem gives us a unique functor \(K:\mathscr {C}_{D,T,D,T}\rightarrow \mathscr {C}\) which decomposes the endo-functor of D composed with itself into \(K\circ (F_{D,T,D,T}\circ G_{D,T,D} \circ F_{D,\,T}\circ G_D):\mathscr {C}\rightarrow \mathscr {C}\). This means that \(K\circ (F_{D,T,D,T}\circ G_{D,T,D} \circ F_{D,\,T}\circ G_D)f = D^2f\) for each f in \(\mathscr {C}\).
In general, when there are subsequent dual adjunctions, a coKleisli over a Kleisli or vice versa, arising from the same monad or comonad, the comparison functor provided by Theorem 2.7 can be used to annihilate these adjunctions: one returns to the initial category, and the number of annihilations is the number of times the endo-functor of the initial monad or comonad is composed with itself.
We have formalized all of the categorical content mentioned so far in Coq, and briefly explain the formalization in the following Sect. 3.