Keywords

1 Introduction

In this paper we deal with algebraic categories and deterministic weighted automata functors on them. Such categories are the target of generalized determinization [10, 22, 23] and enable coalgebraic modelling beyond sets. For example, non-deterministic automata, weighted, or probabilistic ones are coalgebraically modelled over the categories of join-semilattices, semimodules for a semiring, and convex sets, respectively. Moreover, expressions for axiomatizing behavior semantics often live in algebraic categories.

In order to prove completeness of such axiomatizations, the common approach [4, 21, 23] is to prove finality of a certain object in a category of coalgebras over an algebraic category. Proofs are significantly simplified if it suffices to verify finality only w.r.t. coalgebras carried by free finitely generated algebras, as those are the coalgebras that result from generalized determinization.

In recent work, Milius [16] proposed the notion of a proper functor on an algebraic category that provides a sufficient condition for this purpose. This notion is an extension of the notion of a proper semiring introduced by Esik and Maletti [8]: A semiring is proper if and only if its “cubic” functor is proper. A cubic functor is a functor \(\mathbb S \times (-)^A\) where A is a finite alphabet and \(\mathbb S\) is a free algebra with a single generator in the algebraic category. Cubic functors model deterministic weighted automata which are models of determinizations of non-deterministic and probabilistic transition systems.

Properness is the property that for any two states that are behaviourally equivalent in coalgebras with free finitely generated carriers, there is a zig-zag of homomorphisms (called a chain of simulations in the original works on weighted automata and proper semirings) that identifies the two states and whose nodes are all carried by free finitely generated algebras.

Even though the notion of properness is relatively new for a semiring and very new for a functor, results on properness of semirings can be found in more distant literature as well. Here is a brief history, to the best of our knowledge:

  • The Boolean semiring was proven to be proper in [3].

  • Finite commutative ordered semirings were proven to be proper in  [7, Theorem 5.1]. Interestingly, the proof provides a zig-zag with at most seven intermediate nodes.

  • Any euclidean domain and any skew field were proven proper in [1, Theorem 3]. In each case the zig-zag has two intermediate nodes.

  • The semiring of natural numbers \(\mathbb N\), the Boolean semiring \(\mathbb B\), the ring of integers \(\mathbb Z\) and any skew field were proven proper in [2, Theorem 1]. All zig-zags were spans, i.e., had a single intermediate node with outgoing arrows.

  • Noetherian semirings were proven proper in [8, Theorem 4.2], commutative rings also in [8, Corollary 4.4], and finite semirings as well in [8, Corollary 4.5], all with a zig-zag being a span. Moreover, the tropical semiring is not proper, as proven in [8, Theorem 5.4].

Having properness of a semiring, together with the property of the semiring being finitely and effectively presentable, yields decidability of the equivalence problem (decidability of trace equivalence) for weighted automata.

In this paper, motivated by the wish to prove properness of a certain functor \(\widehat{F}\) on convex algebras used for axiomatizing trace semantics of probabilistic systems in [23], as well as by the open questions stated in [16, Example 3.19], we provide a framework for proving properness. We instantiate this framework on known cases like Noetherian semirings and \(\mathbb N\) (with a zig-zag that is a span), and further prove new results of properness:

  • The semirings \(\mathbb Q_+\) and \(\mathbb R_+\) of non-negative rationals and reals, respectively, are proper. The shape of the zig-zag is a span as well.

  • The functor \([0,1] \times (-)^A\) on PCA is proper, again the zig-zag being a span.

  • The functor \(\widehat{F}\) on \({\texttt {PCA}}\) is proper. This proof is the most involved, and interestingly, provides the only case where the zig-zag is not a span: it contains three intermediate nodes of which the middle one forms a span.

Our framework requires a proof of so-called extension and reduction lemmas in each case. While the extension lemma is a generic result that covers all cubic functors of interest, the reduction lemma is in all cases a nontrivial property intrinsic to the algebras under consideration. For the semiring of natural numbers it is a consequence of a result that we trace back to Hilbert; for the case of convex algebra [0, 1] the result is due to Minkowski. In the case of \(\widehat{F}\), we use Kakutani’s set-valued fixpoint theorem.

It is an interesting question for future work whether these new properness results may lead to new complete axiomatizations of expressions for certain weighted automata.

The organization of the rest of the paper is as follows. In Sect. 2 we give some basic definitions and introduce the semirings, the categories, and the functors of interest. Section 3 provides the general framework as well as proofs of properness of the cubic functors. Sections 4, 5 and 6 lead us to properness of \(\widehat{F}\) on PCA. For space reasons, we present the ideas of proofs and constructions in the main paper and defer all detailed proofs to the arXiv-version [24].

2 Proper Functors

We start with a brief introduction of the basic notions from algebra and coalgebra needed in the rest of the paper, as well as the important definition of proper functors [16]. We refer the interested reader to [9, 11, 20] for more details. We assume basic knowledge of category theory, see e.g. [14] or [24, Appendix A].

Let \({\texttt {C}}\) be a category and F a \({\texttt {C}}\)-endofunctor. The category \(\mathsf{Coalg}({F})\) of F-coalgebras is the category having as objects pairs (Xc) where X is an object of \({\texttt {C}}\) and c is a \({\texttt {C}}\)-morphism from X to FX, and as morphisms \(f:(X,c)\rightarrow (Y,d)\) those \({\texttt {C}}\)-morphisms from X to Y that make the diagram on the right commute.

figure a

All base categories \({\texttt {C}}\) in this paper will be algebraic categories, i.e., categories \({\texttt {Set}}^{T}\) of Eilenberg-Moore algebras of a finitary monad Footnote 1 in \({\texttt {Set}}\). Hence, all base categories are concrete with forgetful functor that is identity on morphisms.

In such categories behavioural equivalence [13, 25, 26] can be defined as follows. Let (Xc) and (Yd) be F-coalgebras and let \(x\in X\) and \(y\in Y\). Then x and y are behaviourally equivalent, and we write \(x\sim y\), if there exists an F-coalgebra (Ze) and \(\mathsf{Coalg}({F})\)-morphisms \(f:(X,c)\rightarrow (Z,e)\), \(g:(Y,d)\rightarrow (Z,e)\), with \(f(x)=g(y)\).

If there exists a final coalgebra in \(\mathsf{Coalg}({F})\), and all functors considered in this paper will have this property, then two elements are behaviourally equivalent if and only if they have the same image in the final coalgebra. If we have a zig-zag diagram in \(\mathsf{Coalg}({F})\)

(1)

which relates x with y in the sense that there exist elements \(z_{2k}\in Z_{2k}\), \(k=1,\ldots ,n-1\), with (setting \(z_0=x\) and \(z_{2n}=y\))

$$\begin{aligned} f_{2k}(z_{2k})=f_{2k-1}(z_{2k-2}),\quad k=1,\ldots , n, \end{aligned}$$

then \(x\sim y\).

We now recall the notion of a proper functor, introduced by Milius [16] which is central to this paper. It is very helpful for establishing completeness of regular expressions calculi, cf. [16, Corollary 3.17].

Definition 2.1

Let \(T:{\texttt {Set}}\rightarrow {\texttt {Set}}\) be a finitary monad with unit \(\eta \) and multiplication \(\mu \). A \({\texttt {Set}}^{T}\)-endofunctor F is proper, if the following statement holds.

For each pair \((TB_1,c_1)\) and \((TB_2,c_2)\) of F-coalgebras with \(B_1\) and \(B_2\) finite sets, and each two elements \(b_1\in B_1\) and \(b_2\in B_2\) with \(\eta _{B_1}(b_1)\sim \eta _{B_2}(b_2)\), there exists a zig-zag (1) in \(\mathsf{Coalg}({F})\) which relates \(\eta _{B_1}(b_1)\) with \(\eta _{B_2}(b_2)\), and whose nodes \((Z_j,e_j)\) all have free and finitely generated carrier.

This notion generalizes the notion of a proper semiring introduced by Esik and Maletti in [8, Definition 3.2], cf. [16, Remark 3.10].

Remark 2.2

In the definition of properness the condition that intermediate nodes have free and finitely generated carrier is necessary for nodes with incoming arrows (the nodes \(Z_{2k-1}\) in (1)). For the intermediate nodes with outgoing arrows (\(Z_{2k}\) in (1)), it is enough to require that their carrier is finitely generated. This follows since every F-coalgebra with finitely generated carrier is the image under an F-coalgebra morphism of an F-coalgebra with free and finitely generated carrier.

Moreover, note that zig-zags which start (or end) with incoming arrows instead of outgoing ones, can also be allowed since a zig-zag of this form can be turned into one of the form (1) by appending identity maps.

Some Concrete Monads and Functors

We deal with the following base categories.

  • The category \({\mathbb S\text {-}{\texttt {SMOD}}}\) of semimodules over a semiring \(\mathbb S\) induced by the monad \(T_{\mathbb S}\) of finitely supported maps into \(\mathbb S\), see, e.g., [15, Example 4.2.5].

  • The category PCA of positively convex algebras induced by the monad of finitely supported subprobability distributions, see, e.g., [5, 6] and [17].

For \(n\in \mathbb N\), the free algebra with n generators in \({\mathbb S\text {-}{\texttt {SMOD}}}\) is the direct product \(\mathbb S^n\), and in PCA it is the n-simplex \(\varDelta ^n=\{(\xi _1,\ldots ,\xi _n)\mid \xi _j\ge 0,\sum _{j=1}^n\xi _j\le 1\}\).

Concerning semimodule-categories, we mainly deal with the semirings \(\mathbb N\), \(\mathbb Q_+\), and \(\mathbb R_+\), and their ring completions \(\mathbb Z\), \(\mathbb Q\), and \(\mathbb R\). For these semirings the categories of \(\mathbb S\)-semimodules are

  • \({\texttt {CMON}}\) of commutative monoids for \(\mathbb N\),

  • \({\texttt {AB}}\) of abelian groups for \(\mathbb Z\),

  • \({\texttt {CONE}}\) of convex cones for \(\mathbb R_+\),

  • \({\mathbb Q\text {-}{\texttt {VEC}}}\) and \({\mathbb R\text {-}{\texttt {VEC}}}\) of vector spaces over the field of rational and real numbers, respectively, for \(\mathbb Q\) and \(\mathbb R\).

We consider the following functors, where A is a fixed finite alphabet. Recall that we use the term cubic functor for the functor \(T1\times (-)^A\) where T is a monad on \({\texttt {Set}}\). We chose the name since \(T1\times (-)^A\) assigns to objects X a full direct product, i.e., a full cube.

  • The cubic functor \(F_{\,{\mathbb S}}\) on \({\mathbb S\text {-}{\texttt {SMOD}}}\), i.e., the functor acting as

    $$\begin{aligned}&F_{\,{\mathbb S}}X=\mathbb S\times X^A\text { for }X\text { object of }{\mathbb S\text {-}{\texttt {SMOD}}},\\&F_{\,{\mathbb S}}f={{\mathrm{id}}}_{\mathbb S}\times (f\circ -) \text { for }f:X\rightarrow Y\text { morphism of }{\mathbb S\text {-}{\texttt {SMOD}}}. \end{aligned}$$

    The underlying \({\texttt {Set}}\) functors of cubic functors are also sometimes called deterministic-automata functors, see e.g. [10], as their coalgebras are deterministic weighted automata with output in the semiring.

  • The cubic functor \(F_{\,{[0,1]}}\) on \({\texttt {PCA}}\), i.e., the functor \(F_{\,{[0,1]}}X=[0,1]\times X^A\) and \(F_{\,{[0,1]}}f={{\mathrm{id}}}_{[0,1]}\times (f\circ -)\).

  • A subcubic convex functor \(\widehat{F}\) on PCA whose action will be introduced in Definition 4.1.Footnote 2 The name originates from the fact that \(\widehat{F}X\) is a certain convex subset of \(F_{\,{[0,1]}}X\) and that \(\widehat{F}f=(F_{\,{[0,1]}}f)|_{\widehat{F}X}\) for \(f:X\rightarrow Y\).

Cubic functors are liftings of \({\texttt {Set}}\)-endofunctors, in particular, they preserve surjective algebra homomorphisms. It is easy to see that also the functor \(\widehat{F}\) preserves surjectivity, cf. [24, Lemma D.1]. This property is needed to apply the work of Milius, cf. [16, Assumptions 3.1].

Remark 2.3

We can now formulate precisely the connection between proper semirings and proper functors mentioned after Definition 2.1. A semiring \(\mathbb S\) is proper in the sense of [8], if and only if for every finite input alphabet A the cubic functor \(F_{\,{\mathbb S}}\) on \({\mathbb S\text {-}{\texttt {SMOD}}}\) is proper.

We shall interchangeably think of direct products as sets of functions or as sets of tuples. Taking the viewpoint of tuples, the definition of \(F_{\,{\mathbb S}}f\) reads as

$$\begin{aligned} (F_{\,{\mathbb S}}f)\big ((o,(x_a)_{a\in A})\big )=\big (o,(f(x_a))_{a\in A}\big ),\quad o\in \mathbb S,\ x_a\in X\text { for }a\in A . \end{aligned}$$

A coalgebra structure \(c:X\rightarrow F_{\,{\mathbb S}}X\) writes as

$$\begin{aligned} c(x)=\big ({c}_{\text {o}}(x),({c}_{a}(x))_{a\in A}\big ),\quad x\in X , \end{aligned}$$

and we use \({c}_{\text {o}}:X\rightarrow \mathbb S\) and \({c}_{a}:X\rightarrow X\) as generic notation for the components of the map c. More generally, we define \(c_w:X\rightarrow X\) for any word \(w\in A^*\) inductively as \({c}_{\varepsilon }={{\mathrm{id}}}_X\) and \({c}_{wa}={c}_{a}\circ {c}_{w},\ w\in A^*,a\in A\).

The map from a coalgebra (Xc) into the final \(F_{\,{\mathbb S}}\)-coalgebra, the trace map, is then given as \({{\mathrm{tr}}}_c(x)=\big (({c}_{\text {o}}\circ {c}_{w})(x)\big )_{w\in A^*}\) for \( x\in X\). Behavioural equivalence for cubic functors is the kernel of the trace map.

3 Properness of Cubic Functors

Our proofs of properness in this section and in Sect. 6 below start from the following idea. Let \(\mathbb S\) be a semiring, and assume we are given two \(F_{\,{\mathbb S}}\)-coalgebras which have free finitely generated carrier, say \((\mathbb S^{n_1},c_1)\) and \((\mathbb S^{n_2},c_2)\). Moreover, assume \(x_1\in \mathbb S^{n_1}\) and \(x_2\in \mathbb S^{n_2}\) are two elements having the same trace. For \(j=1,2\), let \(d_j:\mathbb S^{n_1}\times \mathbb S^{n_2} \rightarrow F_{\,{\mathbb S}}(\mathbb S^{n_1}\times \mathbb S^{n_2})\) be given by

$$\begin{aligned} d_j(y_1, y_2) = \Big ({c_j}_{\text {o}}(y_j),(({c_1}_{a}(y_1),{c_2}_{a}(y_2)))_{a\in A}\Big ). \end{aligned}$$

Denoting by \(\pi _j:\mathbb S^{n_1}\times \mathbb S^{n_2}\rightarrow \mathbb S^{n_j}\) the canonical projections, both sides of the following diagram separately commute.

However, in general the maps \(d_1\) and \(d_2\) do not coincide.

The next lemma contains a simple observation: there exists a subsemimodule Z of \(\mathbb S^{n_1}\times \mathbb S^{n_2}\), such that the restrictions of \(d_1\) and \(d_2\) to Z coincide and turn Z into an \(F_{\,{\mathbb S}}\)-coalgebra.

Lemma 3.1

Let Z be the subsemimodule of \(\mathbb S^{n_1}\times \mathbb S^{n_2}\) generated by the pairs \(({c_1}_{w}(x_1),{c_2}_{w}(x_2))\) for \(w \in A^*\). Then \(d_1|_Z=d_2|_Z\) and \(d_j(Z)\subseteq F_{\,{\mathbb S}}(Z)\).

The significance of Lemma 3.1 in the present context is that it leads to the diagram (we denote \(d=d_j|_Z\))

In other words, it leads to the zig-zag in Coalg(\(F_{\,{\mathbb S}}\))

(2)

This zig-zag relates \(x_1\) with \(x_2\) since \((x_1,x_2)\in Z\). If it can be shown that Z is always finitely generated, it will follow that \(F_{\,{\mathbb S}}\) is proper.

Let \(\mathbb S\) be a Noetherian semiring, i.e., such that every \(\mathbb S\)-subsemimodule of some finitely generated \(\mathbb S\)-semimodule is itself finitely generated. Then Z is, as an \(\mathbb S\)-subsemimodule of \(\mathbb S^{n_1}\times \mathbb S^{n_2}\), finitely generated. We reobtain [8, Theorem 4.2].

Corollary 3.2

(Esik–Maletti 2010). Every Noetherian semiring is proper.

Our first main result is Theorem 3.3 below, where we show properness of the cubic functors \(F_{\,{\mathbb S}}\) on \(\mathbb S\)-SMOD, for \(\mathbb S\) being one of the semirings \(\mathbb N\), \(\mathbb Q_+\), \(\mathbb R_+\), and of the cubic functor \(F_{\,{[0,1]}}\) on PCA. The case of \(F_{\,{\mathbb N}}\) is known from [2, Theorem 4]Footnote 3, the case of \(F_{\,{[0,1]}}\) is stated as an open problem in [16, Example 3.19].

Theorem 3.3

The cubic functors \(F_{\,{\mathbb N}}\), \(F_{\,{\mathbb Q_+}}\), \(F_{\,{\mathbb R_+}}\), and \(F_{\,{[0,1]}}\) are proper.

In fact, for any two coalgebras with free finitely generated carrier and any two elements having the same trace, a zig-zag with free and finitely generated nodes relating those elements can be found, which is a span (has a single intermediate node with outgoing arrows).

The proof proceeds via relating to the Noetherian case. It always follows the same scheme, which we now outline. Observe that the ring completion of each of \(\mathbb N\), \(\mathbb Q_+\), \(\mathbb R_+\), is Noetherian (for the last two it actually is a field), and that [0, 1] is the positive part of the unit ball in \(\mathbb R\).

Step 1. The extension lemma: We use an extension of scalars process to pass from the given category C to an associated category \(\mathbb E\)-MOD with a Noetherian ring \(\mathbb E\). This is a general categorical argument.

To unify notation, we agree that \(\mathbb S\) may also take the value [0, 1], and that \(T_{[0,1]}\) is the monad of finitely supported subprobability distributions giving rise to the category PCA.

\(\mathbb S\)

\(\mathbb N\)

\(\mathbb Q_+\)

\(\mathbb R_+\)

[0, 1]

C

\(\mathbb N\)-SMOD (CMON)

\(\mathbb Q_+\)-SMOD

\(\mathbb R_+\)-SMOD (CONE)

PCA

\(\mathbb E\)-MOD

\(\mathbb Z\)-MOD (AB)

\(\mathbb Q\)-MOD (\(\mathbb Q\)-VEC)

\(\mathbb R\)-MOD (\(\mathbb R\)-VEC)

\(\mathbb R\)-MOD (\(\mathbb R\)-VEC)

For the formulation of the extension lemma, recall that the starting category C is the Eilenberg-Moore category of the monad \(T_{\mathbb {S}}\) and the target category \(\mathbb E\)-MOD is the Eilenberg-Moore category of \(T_{\mathbb {E}}\). We write \(\eta _{\mathbb S}\) and \(\mu _{\mathbb S}\) for the unit and multiplication of \(T_{\mathbb {S}}\) and analogously for \(T_{\mathbb {E}}\). We have \(T_{\mathbb {S}}\le T_{\mathbb {E}}\), via the inclusion monad morphism \(\iota :T_{\mathbb {S}}\Rightarrow T_{\mathbb {E}}\) given by \(\iota _X(u) = u\), as \(\eta _{\mathbb E} = \iota \mathrel {\circ }\eta _{\mathbb S}\) and \(\mu _{\mathbb E} \mathrel {\circ }\iota \iota =\iota \mathrel {\circ }\mu _{\mathbb S}\) where \(\iota \iota {\mathop {=}\limits ^{\text {def}}}T_{\mathbb {E}}\iota \mathrel {\circ }\iota {\mathop {=}\limits ^{\text {nat.}}} \iota \mathrel {\circ }T_{\mathbb {S}}\iota \). Recall that a monad morphism \(\iota :T_\mathbb S \rightarrow T_\mathbb E\) defines a functor \(M_\iota :{\texttt {Set}}^{T_\mathbb E} \rightarrow {\texttt {Set}}^{T_\mathbb S}\) which maps a \(T_\mathbb E\)-algebra \((X, \alpha _X)\) to \((X, \iota _X\mathrel {\circ }\alpha _X)\) and is identity on morphisms. Obviously, \(M_\iota \) commutes with the forgetful functors \(U_\mathbb S: {\texttt {Set}}^{T_\mathbb S} \rightarrow {\texttt {Set}}\) and \(U_\mathbb E: {\texttt {Set}}^{T_\mathbb E} \rightarrow {\texttt {Set}}\), i.e., \(U_\mathbb S \mathrel {\circ }M_\iota = U_\mathbb E\).

Definition 3.4

Let \((X, \alpha _X) \in {\texttt {Set}}^{T_{\mathbb {S}}}\) and \((Y, \alpha _Y) \in {\texttt {Set}}^{T_{\mathbb {E}}}\) where \(T_{\mathbb {S}}\) and \(T_{\mathbb {E}}\) are monads with \(T_{\mathbb {S}}\le T_{\mathbb {E}}\) via \(\iota :T_{\mathbb {S}}\Rightarrow T_{\mathbb {E}}\). A \({\texttt {Set}}\)-arrow \(h:X \rightarrow Y\) is a \(T_{\mathbb {S}}\le T_{\mathbb {E}}\)-homomorphism from \((X,\alpha _X)\) to \((Y,\alpha _Y)\) if and only if the following diagram commutes (in \({\texttt {Set}}\))

where \(\iota h\) denotes the map \(\iota h {\mathop {=}\limits ^{\text {def}}} T_{\mathbb {E}}h\mathrel {\circ }\iota _X {\mathop {=}\limits ^{\text {nat.}}} \iota _Y \mathrel {\circ }T_{\mathbb {S}}h\). In other words, a \(T_{\mathbb {S}}\le T_{\mathbb {E}}\)-homomorphism from \((X,\alpha _X)\) to \((Y,\alpha _Y)\) is a morphism in \({\texttt {Set}}^{T_\mathbb S}\) from \((X,\alpha _X)\) to \(M(Y,\alpha _Y)\).

Now we can formulate the extension lemma.

Proposition 3.5

(Extension Lemma). For every \(F_{\,{\mathbb {S}}}\)-coalgebra \(T_{\mathbb {S}}B {\mathop {\rightarrow }\limits ^{c}} F_{\,{\mathbb {S}}}(T_{\mathbb {S}}B)\) with free finitely generated carrier \(T_{\mathbb {S}}B\) for a finite set B, there exists an \(F_{\,{\mathbb {E}}}\)-coalgebra \(T_{\mathbb {E}}B {\mathop {\rightarrow }\limits ^{\tilde{c}}} F_{\,{\mathbb {E}}}(T_{\mathbb {E}}B)\) with free finitely generated carrier \(T_{\mathbb {E}}B\) such that

where the horizontal arrows (\(\iota _B\) and \(\iota _1 \times \iota _B^A\)) are \(T_{\mathbb {S}}\le T_{\mathbb {E}}\)-homomorphisms, and moreover they both amount to inclusion.

Step 2. The basic diagram: Let \(n_1,n_2\in \mathbb N\), let \(B_j\) be the \(n_j\)-element set consisting of the canonical basis vectors of \(\mathbb E^{n_j}\), and set \(X_j=T_{\mathbb S}B_j\). Assume we are given \(F_{\,{\mathbb S}}\)-coalgebras \((X_1,c_1)\) and \((X_2,c_2)\), and elements \(x_j\in X_j\) with \({{\mathrm{tr}}}_{c_1}x_1={{\mathrm{tr}}}_{c_2}x_2\).

The extension lemma provides \(F_{\,{\mathbb E}}\)-coalgebras \((\mathbb E^{n_j},\tilde{c}_j)\) with \(\tilde{c}_j|_{X_j}=c_j\). Clearly, \({{\mathrm{tr}}}_{\tilde{c}_1}x_1={{\mathrm{tr}}}_{\tilde{c}_2}x_2\). Using the zig-zag diagram (2) in Coalg(\(F_{\mathbb E}\)) and appending inclusion maps, we obtain what we call the basic diagram. In this diagram all solid arrows are arrows in \(\mathbb E\)-MOD, and all dotted arrows are arrows in C. The horizontal dotted arrows denote the inclusion maps, and \(\pi _j\) are the restrictions to Z of the canonical projections.

Commutativity of this diagram yields \(d\big (\pi _j^{-1}(X_j)\big )\subseteq (F_{\,{\mathbb E}}\pi _j)^{-1}\big (F_{\,{\mathbb S}}X_j)\) for \(j=1,2\). Now we observe the following properties of cubic functors.

Lemma 3.6

We have \(F_{\,{\mathbb E}}X\cap F_{\,{\mathbb S}}Y=F_{\,{\mathbb S}}(X\cap Y)\). Moreover, if \(Y_j\subseteq X_j\), then \((F_{\,{\mathbb E}}\pi _1)^{-1}(F_{\,{\mathbb S}}Y_1)\cap (F_{\,{\mathbb E}}\pi _2)^{-1}(F_{\,{\mathbb S}}Y_2) =F_{\,{\mathbb S}}(Y_1\times Y_2)\).

Using this, yields

$$\begin{aligned} d\big (Z\cap (X_1\times X_2)\big )\subseteq&\, F_{\,{\mathbb E}}Z\cap (F_{\,{\mathbb E}}\pi _1)^{-1}\big (F_{\,{\mathbb S}}X_1) \cap (F_{\,{\mathbb E}}\pi _2)^{-1}\big (F_{\,{\mathbb S}}X_2)\\ =&\, F_{\,{\mathbb E}}Z\cap F_{\,{\mathbb S}}(X_1\times X_2)=F_{\,{\mathbb S}}\big (Z\cap (X_1\times X_2)\big ). \end{aligned}$$

This shows that \(Z\cap (X_1\times X_2)\) becomes an \(F_{\,{\mathbb S}}\)-coalgebra with the restriction \(d|_{Z\cap (X_1\times X_2)}\). Again referring to the basic diagram, we have the following zig-zag in Coalg(\(F_{\mathbb S}\)) (to shorten notation, denote the restrictions of \(d,\pi _1,\pi _2\) to \(Z\cap (X_1\times X_2)\) again as \(d,\pi _1,\pi _2\)):

(3)

This zig-zag relates \(x_1\) with \(x_2\) since \((x_1,x_2)\in Z\cap (X_1\times X_2)\).

Step 3. The reduction lemma: In view of the zig-zag (3), the proof of Theorem 3.3 can be completed by showing that \(Z\cap (X_1\times X_2)\) is finitely generated as an algebra in C. Since Z is a submodule of the finitely generated module \(\mathbb E^{n_1}\times \mathbb E^{n_2}\) over the Noetherian ring \(\mathbb E\), it is finitely generated as an \(\mathbb E\)-module. The task thus is to show that being finitely generated is preserved when reducing scalars.

This is done by what we call the reduction lemma. Contrasting the extension lemma, the reduction lemma is not a general categorical fact, and requires specific proof in each situation.

Proposition 3.7

(Reduction Lemma). Let \(n_1,n_2\in \mathbb N\), let \(B_j\) be the set consisting of the \(n_j\) canonical basis vectors of \(\mathbb E^{n_j}\), and set \(X_j=T_{\mathbb S}B_j\). Moreover, let Z be an \(\mathbb E\)-submodule of \(\mathbb E^{n_1}\times \mathbb E^{n_2}\). Then \(Z\cap (X_1\times X_2)\) is finitely generated as an algebra in \(\mathtt {C}\).

4 A Subcubic Convex Functor

Recall the following definition from [23, p. 309].

Definition 4.1

We introduce a functor \(\widehat{F}:{\texttt {PCA}}\rightarrow {\texttt {PCA}}\).

  1. 1.

    Let X be a PCA. Then

    $$\begin{aligned} \widehat{F}X=\Big \{ (o,\phi )\in [0,1]&\times X^A\mid \\&\exists \,n_a\in \mathbb N{.}\exists \,p_{a,j}\in [0,1],x_{a,j}\in X\text { for }j=1,\ldots ,n_a,a\in A{.}\\&o+\sum _{a\in A}\sum _{j=1}^{n_a}p_{a,j}\le 1,\ \phi (a)=\sum _{j=1}^{n_a}p_{a,j}x_{a,j} \Big \}. \end{aligned}$$
  2. 2.

    Let XY be PCAs, and \(f:X\rightarrow Y\) a convex map. Then \(\widehat{F}f:\widehat{F}X\rightarrow \widehat{F}Y\) is the map \(\widehat{F}f={{\mathrm{id}}}_{[0,1]}\times (f\circ -)\).

For every X we have \(\widehat{F}X\subseteq F_{\,{[0,1]}}X\), and for every \(f:X\rightarrow Y\) we have \(\widehat{F}f=(F_{\,{[0,1]}}f)|_{\widehat{F}X}\). For this reason, we think of \(\widehat{F}\) as a subcubic functor.

The definition of \(\widehat{F}\) can be simplified.

Lemma 4.2

Let X be a \(\mathtt {PCA}\), then

$$\begin{aligned} \widehat{F}X=\Big \{ (o,f)\in [0,1]\times X^A\mid&\exists \,p_a\in [0,1],x_a\in X\text { for }a\in A{.}\\ \quad&o+\sum _{a\in A}p_a\le 1,\ f(a)=p_ax_a \Big \}. \end{aligned}$$

From this representation it is obvious that \(\widehat{F}\) is monotone in the sense that

  • If \(X_1\subseteq X_2\), then \(\widehat{F}X_1\subseteq \widehat{F}X_2\).

  • If \(f_1:X_1\rightarrow Y_1,f_2:X_2\rightarrow Y_2\) with \(X_1\subseteq X_2,Y_1\subseteq Y_2\) and \(f_2|_{X_1}=f_1\), then \(\widehat{F}f_2|_{\widehat{F}X_1}=\widehat{F}f_1\).

Note that \(\widehat{F}\) does not preserve direct products.

For a PCA X whose carrier is a compact subset of a euclidean space, \(\widehat{F}X\) can be described with help of a geometric notion, namely using the Minkowksi functional of X. Before we can state this fact, we have to make a brief digression to explain this notion and its properties.

Definition 4.3

Let \(X\subseteq \mathbb R^n\) be a PCA. The Minkowski functional of X is the map \(\mu _{X}:\mathbb R^n\rightarrow [0,\infty ]\) defined as \(\mu _{X}(x)=\inf \{t>0\mid x\in tX\}\), where the infimum of the empty set is understood as \(\infty \).

Minkowski functionals, sometimes also called gauge, are a central and exhaustively studied notion in convex geometry, see, e.g., [19, p. 34] or [18, p. 28].

We list some basic properties whose proof can be found in the mentioned textbooks.

  1. 1.

    \(\mu _{X}(px)=p\mu _{X}(x)\) for \(x\in \mathbb R^n,p\ge 0\),

  2. 2.

    \(\mu _{X}(x+y)\le \mu _{X}(x)+\mu _{X}(y)\) for \(x,y\in \mathbb R^n\),

  3. 3.

    \(\mu _{X\cap Y}(x)=\max \{\mu _{X}(x),\mu _{Y}(x)\}\) for \(x\in \mathbb R^n\).

  4. 4.

    If X is bounded, then \(\mu _{X}(x)=0\) if and only if \(x=0\).

The set X can almost be recovered from \(\mu _{X}\).

  1. 5.

    \({\displaystyle \{x\in \mathbb R^n\mid \mu _{X}(x)<1\}\subseteq X\subseteq \{x\in \mathbb R^n\mid \mu _{X}(x)\le 1\}}\).

  2. 6.

    If X is closed, equality holds in the second inclusion of 5.

  3. 7.

    Let XY be closed. Then \(X\subseteq Y\) if and only if \(\mu _{X}\ge \mu _{Y}\).

Example 4.4

As two simple examples, consider the n-simplex \(\varDelta ^n\subseteq \mathbb R^n\) and a convex cone \(C\subseteq \mathbb R^n\). Then (here \(\ge \) denotes the product order on \(\mathbb R^n\))

$$\begin{aligned} \mu _{\varDelta ^n}(x)= {\left\{ \begin{array}{ll} \sum _{j=1}^n\xi _j &{},\quad x=(\xi _1,\ldots ,\xi _n)\ge 0, \\ \infty &{},\quad \text {otherwise}. \end{array}\right. } \qquad \mu _{C}(x)= {\left\{ \begin{array}{ll} 0 &{},\quad x\in C, \\ \infty &{},\quad \text {otherwise}. \end{array}\right. } \end{aligned}$$

Observe that \(\varDelta ^n=\{x\in \mathbb R^n\mid \mu _{\varDelta ^n}(x)\le 1\}\).

Another illustrative example is given by general pyramids in a euclidean space. This example will play an important role later on.

Example 4.5

For \(u\in \mathbb R^n\) consider the set

$$\begin{aligned} X=\big \{x\in \mathbb R^n\mid x\ge 0\text { and }(x,u)\le 1\big \} , \end{aligned}$$

where \((\cdot ,\cdot )\) denotes the euclidean scalar product on \(\mathbb R^n\). The set X is intersection of the cone \(\mathbb R_+^n\) with the half-space given by the inequality \((x,u)\le 1\), hence it is convex and contains 0. Thus X is a PCA.

Let us first assume that u is strictly positive, i.e., \(u\ge 0\) and no component of u equals zero. Then X is a pyramid (in 2-dimensional space, a triangle).

figure b

The n-simplex \(\varDelta ^n\) is the pyramid obtained using \(u=(1,\ldots ,1)\).

The Minkowski functional of the pyramid X associated with u is

$$\begin{aligned} \mu _{X}(x)= (x,u) \text { if } x\ge 0,\quad \mu _{X}(x)=\infty \text { otherwise} . \end{aligned}$$

Write \(u=\sum _{j=1}^n\alpha _je_j\), where \(e_j\) is the j-th canonical basis vector, and set \(y_j=\frac{1}{\alpha _j}e_j\). Clearly, \(\{y_1,\ldots ,y_n\}\) is linearly independent. Each vector \(x=\sum _{j=1}^n\xi _je_j\) can be written as \(x=\sum _{j=1}^n(\xi _j\alpha _j)y_j\), and this is a subconvex combination if and only if \(\xi _j\ge 0\) and \(\sum _{j=1}^n\xi _j\alpha _j\le 1\), i.e., if and only if \(x\in X\). Thus X is generated by \(\{y_1,\ldots ,y_n\}\) as a PCA.

The linear map given by the diagonal matrix made up of the \(\alpha _j\)’s induces a bijection of X onto \(\varDelta ^n\), and maps the \(y_j\)’s to the corner points of \(\varDelta ^n\). Hence, X is free with basis \(\{y_1,\ldots ,y_n\}\).

If u is not strictly positive, the situation changes drastically. Then X is not finitely generated as a PCA, because it is unbounded whereas the subconvex hull of a finite set is certainly bounded.

figure c

Now we return to the functor \(\widehat{F}\).

Lemma 4.6

Let \(X\subseteq \mathbb R^n\) be a \(\mathtt {PCA}\), and assume that X is compact. Then

$$\begin{aligned} \widehat{F}X=\Big \{(o,\phi )\in \mathbb R\times (\mathbb R^n)^A\mid \ o\ge 0,\ o+\sum _{a\in A}\mu _{X}(\phi (a))\le 1 \Big \} . \end{aligned}$$

In the following we use the elementary fact that every convex map has a linear extension.

Lemma 4.7

Let \(V_1,V_2\) be vector spaces, let \(X\subseteq V_1\) be a \(\mathtt {PCA}\), and let \(c:X\rightarrow V_2\) be a convex map. Then c has a linear extension \(\tilde{c}:V_1\rightarrow V_2\). If \({{\mathrm{span}}}\, X=V_1\), this extension is unique.

Rescaling in this representation of \(\widehat{F}X\) leads to a characterisation of \(\widehat{F}\)-coalgebra maps. We give a slightly more general statement.

Corollary 4.8

Let \(X,Y\subseteq \mathbb R^n\) be \(\mathtt {PCA}\) s, and assume that X and Y are compact. Further, let \(c:X\rightarrow \mathbb R_+\times (\mathbb R^n)^A\) be a convex map, and let \(\tilde{c}:\mathbb R^n\rightarrow \mathbb R\times (\mathbb R^n)^A\) be a linear extension of c. Then \(c(X)\subseteq \widehat{F}Y\), if and only if

$$\begin{aligned} {\tilde{c}}_{\text {o}}(x)+\sum _{a\in A}\mu _{Y}({\tilde{c}}_{a}(x))\le \mu _{X}(x),\quad x\in \mathbb R^n . \end{aligned}$$
(4)

5 An Extension Theorem for \(\widehat{F}\)-coalgebras

In this section we establish an extension theorem for \(\widehat{F}\)-coalgebras. It states that an \(\widehat{F}\)-coalgebra, whose carrier has a particular geometric form, can, under a mild additional condition, be embedded into an \(\widehat{F}\)-coalgebra whose carrier is free and finitely generated.

Theorem 5.1

Let (Xc) be an \(\widehat{F}\)-coalgebra whose carrier X is a compact subset of a euclidean space \(\mathbb R^n\) with \(\varDelta ^n\subseteq X\subseteq \mathbb R_+^n\). Assume that the output map \({c}_{\text {o}}\) does not vanish on invariant coordinate hyperplanes in the sense that (\(e_j\) denotes again the j-th canonical basis vector in \(\mathbb R^n\))

$$\begin{aligned} \begin{aligned}&\not \exists \,I\subseteq \{1,\ldots ,n\}{.}\\&I\ne \emptyset ,\quad {c}_{\text {o}}(e_j)=0,j\in I, \quad {c}_{a}(e_j)\subseteq {{\mathrm{span}}}\{e_i\mid i\in I\},a\in A,j\in I. \end{aligned} \end{aligned}$$
(5)

Then there exists an \(\widehat{F}\)-coalgebra (Yd), such that \(X\subseteq Y\subseteq \mathbb R_+^n\), the inclusion map \(\iota :X\rightarrow Y\) is a \(\mathsf{Coalg}({\widehat{F}})\)-morphism, and Y is the subconvex hull of n linearly independent vectors (in particular, Y is free with n generators).

The idea of the proof can be explained by geometric intuition. Say, we have an \(\widehat{F}\)-coalgebra (Xc) of the stated form, and let \(\tilde{c}:\mathbb R^n\rightarrow \mathbb R\times (\mathbb R^n)^A\) be the linear extension of c to all of \(\mathbb R^n\), cf. Lemma 4.7.

figure d

Remembering that pyramids are free and finitely generated, we will be done if we find a pyramid \(Y\supseteq X\) which is mapped into \(\widehat{F}Y\) by \(\tilde{c}\):

figure e

This task can be reformulated as follows: For each pyramid \(Y_1\) containing X let \(P(Y_1)\) be the set of all pyramids \(Y_2\) containing X, such that \(\tilde{c}(Y_2)\subseteq \widehat{F}Y_1\). If we find Y with \(Y\in P(Y)\), we are done.

Existence of Y can be established by applying a fixed point principle for set-valued maps. The result sufficient for our present level of generality is Kakutani’s generalisation [12, Corollary] of Brouwers fixed point theorem.

6 Properness of \(\widehat{F}\)

In this section we give the second main result of the paper.

Theorem 6.1

The functor \(\widehat{F}\) is proper.

In fact, for each two given coalgebras with free finitely generated carrier and each two elements having the same trace, a zig-zag with free and finitely generated nodes relating those elements can be found, which has three intermediate nodes with the middle one forming a span.

We try to follow the proof scheme familiar from the cubic case. Assume we are given two \(\widehat{F}\)-coalgebras with free finitely generated carrier, say \((\varDelta ^{n_1},c_1)\) and \((\varDelta ^{n_2},c_2)\), and elements \(x_1\in \varDelta ^{n_1}\) and \(x_2\in \varDelta ^{n_2}\) having the same trace. Since \(\widehat{F}\varDelta ^{n_j}\subseteq \mathbb R\times (\mathbb R^{n_j})^A\) we can apply Lemma 4.7 and obtain \(F_{\,{\mathbb R}}\)-coalgebras \((\mathbb R^{n_j},\tilde{c}_j)\) with \(\tilde{c}_j|_{\varDelta ^{n_j}}=c_j\). This leads to the basic diagram:

At this point the line of argument known from the cubic case breaks: it is not granted that \(Z\cap (\varDelta ^{n_1}\times \varDelta ^{n_2})\) becomes an \(\widehat{F}\)-coalgebra with the restriction of d.

The substitute for \(Z\cap (\varDelta ^{n_1}\times \varDelta ^{n_2})\) suitable for proceeding one step further is given by the following lemma, where we tacitly identify \(\mathbb R^{n_1}\times \mathbb R^{n_2}\) with \(\mathbb R^{n_1+n_2}\).

Lemma 6.2

We have \(d(Z\cap 2\varDelta ^{n_1+n_2})\subseteq \widehat{F}(Z\cap 2\varDelta ^{n_1+n_2})\).

This shows that \(Z\cap 2\varDelta ^{n_1+n_2}\) becomes an \(\widehat{F}\)-coalgebra with the restriction of d. Still, we cannot return to the usual line of argument: it is not granted that \(\pi _j(Z\cap 2\varDelta ^{n_1+n_2})\subseteq \varDelta ^{n_j}\). This forces us to introduce additional nodes to produce a zig-zag in \(\mathsf{Coalg}({\widehat{F}})\). These additional nodes are given by the following lemma. There \({{\mathrm{co}}}( -)\) denotes the convex hull.

Lemma 6.3

Set \(Y_j={{\mathrm{co}}}(\varDelta ^{n_j}\cup \pi _j(Z\cap 2\varDelta ^{n_1+n_2}))\). Then \(\tilde{c}_j(Y_j)\subseteq \widehat{F}Y_j\).

This shows that \(Y_j\) becomes an \(\widehat{F}\)-coalgebra with the restriction of \(\tilde{c}_j\). We are led to a zig-zag in \(\mathsf{Coalg}({\widehat{F}})\):

This zig-zag relates \(x_1\) and \(x_2\) since \((x_1,x_2)\in Z\cap 2\varDelta ^{n_1+n_2}\).

Using Minkowski’s Theorem and the argument from [24, Lemma B.8] shows that the middle node has finitely generated carrier. The two nodes with incoming arrows are, as convex hulls of two finitely generated \(\mathtt {PCA}\)s, of course also finitely generated. But in general they will not be free (and this is essential, remember Remark 2.2). Now Theorem 5.1 comes into play.

Lemma 6.4

Assume that each of \((\varDelta ^{n_1},c_1)\) and \((\varDelta ^{n_2},c_2)\) satisfies the following condition:

$$\begin{aligned} \begin{aligned}&\not \exists \,I\subseteq \{1,\ldots ,n\}{.}\\&I\ne \emptyset ,\ {c_j}_{\text {o}}(e_k)=0,k\in I, \ {c_j}_{a}(e_k)\subseteq {{\mathrm{co}}}(\{e_i\mid i\in I\}\cup \{0\}),a\in A,k\in I. \end{aligned} \end{aligned}$$
(6)

Then there exist free finitely generated \(\mathtt {PCA}\)s \(U_j\) with \(Y_j\subseteq U_j\subseteq \mathbb R_+^{n_j}\) which satisfy \(\tilde{c}_j(U_j)\subseteq \widehat{F}U_j\).

This shows that \(U_j\), under the additional assumption (6) on \((\varDelta ^{n_j},c_j)\), becomes an \(\widehat{F}\)-coalgebra with the restriction of \(\tilde{c}_j\). Thus we have a zig-zag in \(\mathsf{Coalg}({\widehat{F}})\) relating \(x_1\) and \(x_2\) whose nodes with incoming arrows are free and finitely generated, and whose node with outgoing arrows is finitely generated:

Removing the additional assumption on \((\varDelta ^{n_j},c_j)\) is an easy exercise.

Lemma 6.5

Let \((\varDelta ^n,c)\) be an \(\widehat{F}\)-coalgebra. Assume that I is a nonempty subset of \(\{1,\ldots ,n\}\) with

$$\begin{aligned} {c}_{\text {o}}(e_k)=0,\ k\in I\quad \text {and}\quad {c}_{a}(e_k)\in {{\mathrm{co}}}\big (\{e_i\mid i\in I\}\cup \{0\}\big ),\ a\in A,k\in I. \end{aligned}$$
(7)

Let X be the free \(\mathtt {PCA}\) with basis \(\{e_k\mid k\in \{1,\ldots ,n\}\setminus I\}\), and let \(f:\varDelta ^n\rightarrow X\) be the \(\mathtt {PCA}\)-morphism with \(f(e_k)=0\) if \(k\in I\) and \(f(e_k)=e_k\) if \(k\not \in I\). Further, let \(g:X\rightarrow [0,1]\times X^A\) be the \(\mathtt {PCA}\)-morphism with

$$\begin{aligned} g(e_k)=\Big ({c}_{\text {o}}(e_k),\big (f({c}_{a}(e_k))\big )_{a\in A}\Big ),\quad k\in \{1,\ldots ,n\}\setminus I . \end{aligned}$$

Then (Xg) is an \(\widehat{F}\)-coalgebra, and f is an \(\widehat{F}\)-coalgebra morphism of \((\varDelta ^n,c)\) onto (Xg).

Corollary 6.6

Let \((\varDelta ^n,c)\) be an \(\widehat{F}\)-coalgebra. Then there exists \(k\le n\), an \(\widehat{F}\)-coalgebra \((\varDelta ^k,g)\), such that \((\varDelta ^k,g)\) satisfies the assumption in Lemma 6.4 and such that there exists an \(\widehat{F}\)-coalgebra map f of \((\varDelta ^n,c)\) onto \((\varDelta ^k,g)\).

The proof of Theorem 6.1 is now finished by putting together what we showed so far. Starting with \(\widehat{F}\)-coalgebras \((\varDelta ^{n_j},c_j)\) without any additional assumptions, and elements \(x_j\in \varDelta ^{n_j}\) having the same trace, we first reduce by means of Corollary 6.6 and then apply Lemma 6.4. This gives a zig-zag as required:

and completes the proof of properness of \(\widehat{F}\).