1 Introduction

In 2013 Curto et al. [6] initiated the study of convex neural codes, which are the combinatorial codes \(\mathcal {C}\subseteq 2^{[n]}\) that record the intersection and covering relations among n convex open sets in Euclidean space (see Definition 1.2 below). Their motivation arose from neuroscience, namely the study of place cells, which are hippocampal neurons that fire when an animal is in a particular region of its environment. Place cells can be thought of as encoding a cognitive map of the animal’s environment (see [19]), and the study of convex neural codes seeks to understand how well this cognitive map can capture the geometry and topology of the environment.

Mathematical research on convex codes has blossomed since 2013. An efficient characterization of convex codes is unfortunately out of the question—recent work in [17] shows that recognizing convex codes is \(\exists \mathbb {R}\)-hard. Nevertheless, researchers have used techniques from algebra [5, 8, 10], discrete geometry [1, 3, 11, 16, 18], and topology [2, 4] to analyze many interesting families of codes and develop frameworks in which to test whether or not a code is convex. Some works also study codes that arise from “good covers” [2, 4], collections of closed convex sets [3, 9], and “non-degenerate” collections of convex sets [1, 3].

Let us begin by recalling some fundamental definitions. Our combinatorial objects of study are codes, which are just subsets of the power set \(2^{[n]}\), where \([n]:=\{1,2,\ldots ,n\}\).

Definition 1.1

A collection \(\mathcal {C}\subseteq 2^{[n]}\) is called a code. The elements of \(\mathcal {C}\) are called codewords. If \(\mathcal {C}\subseteq 2^\sigma \) for some \(\sigma \subseteq [n]\), we say that \(\sigma \) is a base set for \(\mathcal {C}\).

We will adopt the convention that \(\emptyset \) is a codeword in every code, which is typical in the study of convex neural codes. Codes can be used to record information about how sets in a collection intersect and cover one another, as follows.

Definition 1.2

Let \(\,\mathcal {U}=\{U_1,U_2,\ldots ,U_n\}\) be a collection of sets in \(\mathbb {R}^d\). The code of \(\,\mathcal {U}\) is

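$$\begin{aligned} {{\,\textrm{code}\,}}({\mathcal {U}}):=\biggl \{\sigma \subseteq [n]\;\Big |\;\bigcap _{i\in \sigma }U_i\setminus \bigcup _{j\in [n]\setminus \sigma }U_j\ne \emptyset \biggr \}, \end{aligned}$$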

where the empty-indexed intersection is equal to \(\mathbb {R}^d\) by convention. We say that \(\,\mathcal {U}\) is a realization of \({{\,\textrm{code}\,}}({\mathcal {U}})\).

In words, we obtain \({{\,\textrm{code}\,}}({\mathcal {U}})\) by labeling every point \(p\in \mathbb {R}^d\) by the indices \(i\in [n]\) for which \(p\in U_i\), and then collecting all the labels obtained in this way. If \(\,\mathcal {U}\) consists of convex open sets we say that \(\,\mathcal {U}\) is a convex open realization of \({{\,\textrm{code}\,}}({\mathcal {U}})\). We may similarly define convex closed realizations, and we typically use the notation \(\mathcal {X}=\{X_1,\ldots ,X_n\}\) for collections of closed convex sets. Every realization in this paper will consist of convex sets, so we will usually drop the adjective “convex.” Note that our convention that \(\emptyset \) is always a codeword amounts to the requirement that a realization \(\,\mathcal {U}\) does not cover \(\mathbb {R}^d\)—in particular, by intersecting with a sufficiently large closed or open ball we may assume all of our realizations are bounded.

Example 1.3

Consider the code

$$\begin{aligned} \mathcal {C}=\{123,12,13,24,34,1,2,3,4,\emptyset \}. \end{aligned}$$

Figure 1 shows an open realization \(\,\mathcal {U}=\{U_1,U_2,U_3,U_4\}\) of \(\mathcal {C}\) in \(\mathbb {R}^2\). One could also regard this figure as an illustration of a closed realization: replacing each set by its closure does not change the realized code.

Fig. 1: An (open or closed) realization of \(\mathcal {C}\) in \(\mathbb {R}^2\), with an arrow pointing to the region where the codeword 12 arises

Remark 1.4

We will always illustrate open convex sets with a solid border. To avoid confusing these illustrations with closed convex sets, our captions will always specify whether we are regarding the illustrated sets as closed or open.

In Example 1.3, it was convenient that we could regard our realization as either closed or open without changing the realized code. This is not always the case—for example, in an open realization we may have disjoint sets which share boundary points, so that replacing them by their closures changes the code that they realize. In fact, this may be the case in every open realization of a code: [3] gives an example of a code in which every open realization is forced to include disjoint sets that share boundary points. Motivated by this difficulty, [3] introduced a notion of non-degeneracy for realizations, which places technical geometric and topological criteria on a realization in such a way that replacing sets by their interiors or closures preserves the realized code. Recently, [1] proved that non-degenerate realizations are exactly those for which replacing sets by interiors or closures does not affect the realized code—we will take this as the definition of non-degeneracy.

Definition 1.5

A collection \(\,\mathcal {U}=\{U_1,\ldots ,U_n\}\) of convex open sets is called non-degenerate if the collection \(\mathcal {X}=\{X_1,\ldots ,X_n\}\) with \(X_i:={{\,\textrm{cl}\,}}(U_i)\) has the property that \({{\,\textrm{code}\,}}(\mathcal {X})={{\,\textrm{code}\,}}({\mathcal {U}})\). Symmetrically, a collection of closed convex sets \(\mathcal {X}=\{X_1,\ldots ,X_n\}\) is called non-degenerate if the collection \(\,\mathcal {U}=\{U_1,\ldots ,U_n\}\) with \(U_i:={{\,\textrm{int}\,}}(X_i)\) has the property that \({{\,\textrm{code}\,}}({\mathcal {U}})={{\,\textrm{code}\,}}(\mathcal {X})\).
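As a minimal one-dimensional illustration of Definitions 1.2 and 1.5, the following Python sketch computes the code of a finite family of intervals and tests whether passing to closures changes it. The helper names `interval_code` and `is_nondegenerate` are hypothetical and introduced only for this illustration. The sketch assumes every interval is given as a pair (a, b) with a < b, and it relies on the fact that the codeword is constant on each endpoint and on each gap between consecutive endpoints.

```python
from fractions import Fraction


def interval_code(intervals, closed):
    """Code of a family of intervals (a, b), all read as open or all as closed."""
    endpoints = sorted({Fraction(e) for ab in intervals for e in ab})
    # One sample per cell of the arrangement: a point left of everything,
    # every endpoint, and the midpoint of every gap between endpoints.
    samples = [endpoints[0] - 1] + endpoints
    samples += [(x + y) / 2 for x, y in zip(endpoints, endpoints[1:])]

    def contains(ab, p):
        a, b = ab
        return a <= p <= b if closed else a < p < b

    return {
        frozenset(i for i, ab in enumerate(intervals, start=1) if contains(ab, p))
        for p in samples
    }


def is_nondegenerate(intervals):
    """True if the open intervals and their closures realize the same code."""
    return interval_code(intervals, closed=False) == interval_code(intervals, closed=True)


# Disjoint open intervals sharing a boundary point: closures create the codeword 12.
touching = [(0, 1), (1, 2)]
# Overlapping intervals: closures do not change the code.
overlapping = [(0, 2), (1, 3)]
print(is_nondegenerate(touching), is_nondegenerate(overlapping))
```

The first family is the one-dimensional version of the phenomenon described before Definition 1.5: the open sets are disjoint, but their closures meet.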

It is of particular interest to determine the smallest dimension in which a code has an open, closed, or non-degenerate realization. These minimum dimensions are referred to as embedding dimensions of a code.

Definition 1.6

Let \(\mathcal {C}\) be a code. The open, closed, and non-degenerate embedding dimensions of \(\mathcal {C}\) are the following quantities, respectively:

$$\begin{aligned} {{\,\textrm{odim}\,}}(\mathcal {C})&:=\min {\{d\,|\,\mathcal {C}\,\text { has an open convex realization in}\, \mathbb {R}^d\}},\\ {{\,\textrm{cdim}\,}}(\mathcal {C})&:=\min {\{d\,|\,\mathcal {C}\,\text { has a closed convex realization in}\, \mathbb {R}^d\}},\quad \text {and}\\ {{\,\textrm{nddim}\,}}(\mathcal {C})&:=\min {\{d\,|\,\mathcal {C}\,\text { has a non-degenerate (open or closed) convex realization in}~\mathbb {R}^d\}}. \end{aligned}$$

Above, the minimum over the empty set is equal to \(\infty \) by convention. The embedding dimension vector of \(\mathcal {C}\) is the 3-tuple

$$\begin{aligned} ({{\,\textrm{odim}\,}}(\mathcal {C}),{{\,\textrm{cdim}\,}}(\mathcal {C}),{{\,\textrm{nddim}\,}}(\mathcal {C})).\end{aligned}$$

For a fixed code \(\mathcal {C}\subseteq 2^{[n]}\), what can we say about the embedding dimensions of \(\mathcal {C}\) and their relationships to one another? Determining these dimensions exactly is often an infeasible task, but it is possible in some specific cases, and sometimes one may obtain general bounds that are interesting even if they are not exact. For example, when \(\mathcal {C}\) is intersection complete (i.e., the intersection of any two codewords is again a codeword), [3] showed that \({{\,\textrm{nddim}\,}}(\mathcal {C})\le \max {\{2,m\}}\) where \(m+1\) is the number of inclusion-maximal codewords in \(\mathcal {C}\), and [14] showed that \({{\,\textrm{cdim}\,}}(\mathcal {C})\le 2d+1\) if every codeword has size \(d+1\) or less.

As a very basic start, every non-degenerate realization can be regarded as a closed or open realization, so we have the following:

Proposition 1.7

If \(\mathcal {C}\) has embedding dimension vector (a, b, c), then \(\max {\{a,b\}}\le c\).

It is natural to ask whether we can guarantee stricter relationships between the various embedding dimensions of a code. As we will see in Theorem 1.9, the answer in general is “no.” However, in some special cases, the answer is yes. For example, if \(\mathcal {C}\) is a simplicial complex then \({{\,\textrm{cdim}\,}}(\mathcal {C})={{\,\textrm{odim}\,}}(\mathcal {C})={{\,\textrm{nddim}\,}}(\mathcal {C})\) (see [14, Thm. 1.4]). If \(\mathcal {C}\) is intersection complete, then \({{\,\textrm{cdim}\,}}(\mathcal {C})\le {{\,\textrm{odim}\,}}(\mathcal {C})={{\,\textrm{nddim}\,}}(\mathcal {C})\) (see [13, Lem. 2.2.4 and Theorem 2.2.7]).

One final special case is when any of the embedding dimensions is equal to 1. In this case, all embedding dimensions must be equal to 1. This fact was first posed as a conjecture in [1, Conj. 3.4], and we prove it below using ideas based on discussions with the authors.

Theorem 1.8

Let \(\mathcal {C}\) be a code. Then the following are equivalent:

  • \({{\,\textrm{odim}\,}}(\mathcal {C})=1\),

  • \({{\,\textrm{cdim}\,}}(\mathcal {C})=1\), and

  • \({{\,\textrm{nddim}\,}}(\mathcal {C})=1\).

Proof

It will suffice to show that any open or closed realization of \(\mathcal {C}\) by intervals in \(\mathbb {R}^1\) can be made non-degenerate. In any realization by (open or closed) intervals, we may assume without loss of generality that no point of \(\mathbb {R}^1\) is simultaneously a left endpoint of some interval in our realization and a right endpoint of another. To guarantee this, simply insert a closed unit interval [a, b] at any point \(p\in \mathbb {R}^1\) that is simultaneously a left and right endpoint. If our realization is open, we modify it so that a is a right endpoint of all intervals that p was a right endpoint of, and b is a left endpoint of all intervals that p was a left endpoint of. If our realization is closed, we do the opposite: intervals whose left endpoints were equal to p now have left endpoint a, while those with right endpoint p now have right endpoint b.

We claim that such a realization is necessarily non-degenerate. Observe that if some codeword c arises at a point p, then the same codeword arises at every point in a small closed interval with one of its endpoints equal to p. Thus replacing our intervals by their interiors or closures does not change the realized code, and the realization is non-degenerate. \(\square \)

Beyond the 1-dimensional case, the only relationship that we can guarantee between embedding dimensions is that the open and closed embedding dimensions are no larger than the non-degenerate embedding dimension. The following theorem captures this fact formally.

Theorem 1.9

Let \(2\le a,b,c\le \infty \) and suppose that \(\max {\{a,b\}}\le c\). Then there exists a code \(\mathcal {C}_{(a,b,c)}\) with embedding dimension vector (a, b, c).

Rather than construct all \(\mathcal {C}_{(a,b,c)}\) directly, we will reduce to three cases, from which one can build any \(\mathcal {C}_{(a,b,c)}\). For this reduction, we observe that any two codes \(\mathcal {C}\) and \(\mathcal {D}\) may be relabeled so that they have disjoint base sets, and then combined to yield a code whose embedding dimension vector is the component-wise maximum of the original embedding dimension vectors.

Proposition 1.10

Let \(\mathcal {C}\) and \(\mathcal {D}\) be codes on disjoint base sets, with respective embedding dimension vectors \((a_1, b_1, c_1)\) and \((a_2, b_2, c_2)\). Then the embedding dimension vector of \(\mathcal {C}\cup \mathcal {D}\) is

$$\begin{aligned} (\max {\{a_1,a_2\}},\max {\{b_1,b_2\}},\max {\{c_1,c_2\}}).\end{aligned}$$

Proof

Any realization of \(\mathcal {C}\cup \mathcal {D}\) yields a realization of \(\mathcal {C}\) by deleting the sets indexed by the base set of \(\mathcal {D}\), and vice versa. Deleting sets preserves openness, closedness, or non-degeneracy of a realization, so the various embedding dimensions of \(\mathcal {C}\cup \mathcal {D}\) provide an upper bound on the corresponding embedding dimensions of \(\mathcal {C}\) and \(\mathcal {D}\). Conversely, any pair of (open, closed, or non-degenerate) realizations of \(\mathcal {C}\) and \(\mathcal {D}\) in the same dimension yields a corresponding realization of \(\mathcal {C}\cup \mathcal {D}\) by placing the two realizations sufficiently far apart. This proves the result. \(\square \)
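The bookkeeping behind Proposition 1.10 can be sketched in a few lines of Python. This is only an illustration under our own conventions: codewords are stored as frozensets of positive integers, the second code is relabeled by shifting its indices past the first base set, and the function names `disjoint_union` and `combine_vectors` are hypothetical. The sketch does not compute embedding dimensions; it only combines vectors that are already known.

```python
import math


def disjoint_union(code_c, n_c, code_d):
    """Union of two codes after shifting the labels of code_d by n_c.

    code_c is assumed to live on the base set {1, ..., n_c}.
    """
    shifted = {frozenset(j + n_c for j in word) for word in code_d}
    return set(code_c) | shifted


def combine_vectors(v1, v2):
    """Component-wise maximum of two embedding dimension vectors."""
    return tuple(max(x, y) for x, y in zip(v1, v2))


# Vectors of the shapes used to prove Theorem 1.9, here with d = 3 and d = 4.
print(combine_vectors((3, 2, 3), (2, 4, 4)))
# Infinite entries are handled with math.inf.
print(combine_vectors((2, 2, math.inf), (3, 2, 3)))
```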

By Proposition 1.10, to prove Theorem 1.9 it will suffice to exhibit codes with embedding dimension vectors (d, 2, d), (2, d, d), and (2, 2, d) for all choices of \(2\le d\le \infty \). We treat the respective cases for finite d in Sects. 2, 3, and 4. The cases with \(d=\infty \) are all treated in Sect. 5.

Our most technical result is the construction and analysis of the code \(\mathcal {C}_{(2,d,d)}\)—in particular, proving that \(\mathcal {C}_{(2,d,d)}\) has a non-degenerate realization in \(\mathbb {R}^d\) requires several pages of careful work (see Proposition 3.4).

Our constructions primarily make use of two existing tools. First, in [11, 14] we studied “sunflowers” of convex open sets, obtaining examples of codes with large open embedding dimension and small closed embedding dimension. Second, Chan et al. introduced “rigid structures” in [1]. Their results guarantee that sets in a closed realization must have a union which is convex under certain conditions—and importantly, their results do not hold for open realizations.

Informally, sunflowers guarantee structure in open (but not closed) realizations, while rigid structures provide the opposite. By combining these tools in various ways we are able to obtain all the desired codes \(\mathcal {C}_{(a,b,c)}\). In the interest of concision, we do not explain sunflowers or rigid structures in full generality. Instead, we state versions of these results that suffice in our context, and provide citations for a more general presentation.

2 Constructing the Codes \(\mathcal {C}_{(d,2,d)}\) with \(d<\infty \)

We begin by constructing the codes \(\mathcal {C}_{(d,2,d)}\) for all finite d. In fact, there is an existing family in the convex codes literature that suffices: sunflower codes. We first introduced and studied these codes in [14, Defn. 5.3], where we were primarily concerned with their open embedding dimensions. Below we review the definition of these codes, illustrate a few small examples, and provide citations for the results implying that they have the appropriate embedding dimension vector.

Definition 2.1

(see also [13, 14])  For \(2\le d<\infty \), define \({\mathcal {S}}_d\subseteq 2^{[d+1]}\) to be the code consisting of the following codewords: [d], all singleton sets, all pairs \(\{i,d+1\}\) for \(i\in [d]\), and the empty set.

The code \({\mathcal {S}}_d\) has two salient geometric features. First, in any realization the first d sets must form a “sunflower” in the sense that their various pairwise intersections must be the same, and must be nonempty. Second, the \((d+1)\)-st set intersects all the other sets, but not their common intersection. It turns out that such an arrangement is only possible to achieve with convex open sets in dimension at least d. For a full discussion of this fact, see [14, Sect. 5]. Below we illustrate a few of these codes to provide intuition.

Example 2.2

The sunflower codes \({\mathcal {S}}_d\) for \(d=2,3,4\) are listed below:

$$\begin{aligned} {\mathcal {S}}_2= & {} \{12,13,23,1,2,3,\emptyset \},\qquad {\mathcal {S}}_3=\{123,14,24,34,1,2,3,4,\emptyset \}, \qquad \text {and}\\ {\mathcal {S}}_4= & {} \{1234,15,25,35,45,1,2,3,4,5,\emptyset \}. \end{aligned}$$

Figure 2 shows open realizations for \({\mathcal {S}}_2\) and \({\mathcal {S}}_3\) in \(\mathbb {R}^2\) and \(\mathbb {R}^3\), respectively. The code \({\mathcal {S}}_4\) does not have an open realization in \(\mathbb {R}^3\), so we illustrate a closed realization in \(\mathbb {R}^2\).

Fig. 2: Open realizations of \({\mathcal {S}}_2\) and \({\mathcal {S}}_3\), and a closed realization of \({\mathcal {S}}_4\)
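The codes in Example 2.2 can be generated mechanically from Definition 2.1. The Python sketch below does so and also checks that each \({\mathcal {S}}_d\) is intersection complete, in the sense that the intersection of any two codewords is again a codeword. The function names are hypothetical, and codewords are represented as frozensets of integers purely for convenience.

```python
from itertools import combinations


def sunflower_code(d):
    """Codewords of S_d: [d], all singletons, all pairs {i, d+1}, and the empty set."""
    words = {frozenset(), frozenset(range(1, d + 1))}
    words |= {frozenset({i}) for i in range(1, d + 2)}
    words |= {frozenset({i, d + 1}) for i in range(1, d + 1)}
    return words


def is_intersection_complete(code):
    """True if the intersection of any two codewords is again a codeword."""
    return all((a & b) in code for a, b in combinations(code, 2))


for d in (2, 3, 4):
    code = sunflower_code(d)
    # Print codewords in the compact notation of the text, e.g. 123 for {1, 2, 3}.
    listing = sorted("".join(map(str, sorted(w))) for w in code)
    print(d, listing, is_intersection_complete(code))
```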

We conclude this section by formally observing that \({\mathcal {S}}_d\) has the desired embedding dimension vector (d, 2, d). We provide several citations rather than a detailed proof since the results characterizing the embedding dimensions of \({\mathcal {S}}_d\) are already established.

Proposition 2.3

Let \(2\le d<\infty \). Then the code \({\mathcal {S}}_d\subseteq 2^{[d+1]}\) has embedding dimension vector equal to (d, 2, d).

Proof

The results [13, Thm. 5.2.2 and Proposition 5.2.3] tell us that \({{\,\textrm{odim}\,}}({\mathcal {S}}_d)=d\) and \({{\,\textrm{cdim}\,}}({\mathcal {S}}_d)=2\). The code \({\mathcal {S}}_d\) is intersection complete, and [13, Lem. 2.2.4] guarantees that open and non-degenerate embedding dimension are equal for intersection complete codes. Thus \({{\,\textrm{nddim}\,}}({\mathcal {S}}_d)={{\,\textrm{odim}\,}}({\mathcal {S}}_d)=d\), proving the result. \(\square \)

3 Constructing the Codes \(\mathcal {C}_{(2,d,d)}\) with \(d< \infty \)

The code \(\mathcal {C}_{(2,d,d)}\) will have a base set of size 4d. Rather than simply use the integers \(\{1,2,\ldots ,4d\}\), we use four types of labeled symbol, with d many of each. Below, we let

$$\begin{aligned} \alpha _{[d]}:=\{\alpha _1,\alpha _2,\ldots ,\alpha _d\}, \end{aligned}$$

where each \(\alpha _i\) is a formal symbol in our base set. For any \(\sigma \subseteq [d]\), we let \(\alpha _\sigma =\{\alpha _i\,|\,i\in \sigma \}\), and in specific examples we will sometimes omit braces on \(\sigma \)—so for example, \(\alpha _{23}=\{\alpha _2,\alpha _3\}\). The sets \(\beta _{[d]}\), \(\gamma _{[d]}\), and \(\delta _{[d]}\) are defined analogously.

This notation has two distinct advantages. First, it streamlines the indexing in the results below, so that we need not deal with cumbersome offset factors in our base set indices. Second, it highlights that each type of base set element plays a different role in the code. Before commenting on these various roles, we provide a formal definition.

Definition 3.1

Let \(2\le d<\infty \), and define \(\mathcal {C}_{(2,d,d)}\) to be the code on the base set \(\alpha _{[d]}\cup \beta _{[d]}\cup \gamma _{[d]}\cup \delta _{[d]}\) which has the following nonempty codewords:

  (i) \(\{\beta _i,\gamma _i\}\) for \(i\in [d]\),

  (ii) \(\{\beta _i,\beta _{i+1},\gamma _i\}\) for \(i\in [d-1]\),

  (iii) \(\{\beta _i,\beta _{i+1}\}\) for \(i\in [d-1]\),

  (iv) \(\{\beta _i,\beta _{i+1},\gamma _{i+1}\}\) for \(i\in [d-1]\),

  (v) \(\{\alpha _i,\beta _i,\gamma _i,\delta _i\}\) for \(i\in [d]\),

  (vi) \(\{\alpha _i,\delta _i\}\) for \(i\in [d]\),

  (vii) \(\alpha _{[d]}\cup \delta _{[d]}\),

  (viii) \(\delta _{[d]}\).
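To make Definition 3.1 concrete, the following Python sketch enumerates the codewords of \(\mathcal {C}_{(2,d,d)}\) for a given d. The representation is our own assumption: each base set element is a pair such as `("alpha", 2)` standing for \(\alpha _2\), and the function name `code_2dd` is hypothetical. Counting types (i)–(viii) together with the empty codeword gives 6d codewords in total.

```python
def code_2dd(d):
    """Codewords of C_(2,d,d), listed by the eight types in the definition."""
    def a(i): return ("alpha", i)
    def b(i): return ("beta", i)
    def g(i): return ("gamma", i)
    def dl(i): return ("delta", i)

    words = {frozenset()}  # the empty codeword, present by convention
    for i in range(1, d + 1):
        words.add(frozenset({b(i), g(i)}))               # type (i)
        words.add(frozenset({a(i), b(i), g(i), dl(i)}))  # type (v)
        words.add(frozenset({a(i), dl(i)}))              # type (vi)
    for i in range(1, d):
        words.add(frozenset({b(i), b(i + 1), g(i)}))      # type (ii)
        words.add(frozenset({b(i), b(i + 1)}))            # type (iii)
        words.add(frozenset({b(i), b(i + 1), g(i + 1)}))  # type (iv)
    all_alpha = {a(i) for i in range(1, d + 1)}
    all_delta = {dl(i) for i in range(1, d + 1)}
    words.add(frozenset(all_alpha | all_delta))  # type (vii)
    words.add(frozenset(all_delta))              # type (viii)
    return words


for d in (2, 3, 4):
    print(d, len(code_2dd(d)))
```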

Informally, the base set elements and codewords above each play the following roles. The various \(\alpha _i\) are defined so that the various \(U_{\alpha _i}\) in any realization of \(\mathcal {C}_{(2,d,d)}\) form a sunflower (recall the commentary following Definition 2.1), thanks to the codewords of type (vi) and (vii). The codewords of types (i)–(iv) guarantee that the various \(\beta _i\) and \(\gamma _i\) form a “rigid structure” as defined in [1]; this means that the union of all \(X_{\beta _i}\) and \(X_{\gamma _i}\) in a closed realization must be convex (see Lemma 3.6 below). Moreover, the codewords of type (v) force this rigid structure to intersect the various sunflower petals \(U_{\alpha _i}\). Finally, the various \(\delta _i\) have essentially the same behavior as the \(\alpha _i\), with the exception of the codeword (viii), which ties the structure of \(\mathcal {C}_{(2,d,d)}\) to the structure of the code \(\mathcal {A}_d\) from [7, Thm. 3.7], and is key to forcing the closed embedding dimension of \(\mathcal {C}_{(2,d,d)}\) to be large. Let us start our analysis of \(\mathcal {C}_{(2,d,d)}\) concretely, by forming an open realization in \(\mathbb {R}^2\).

Proposition 3.2

The code \(\mathcal {C}_{(2,d,d)}\) has an open realization in \(\mathbb {R}^2\).

Proof

We begin by describing the sets \(U_{\alpha _i}\) and \(U_{\delta _i}\) for \(i\in [d]\). Let P be a regular \((d+1)\)-gon in \(\mathbb {R}^2\) with center at the origin, inscribed in a second regular \((d+1)\)-gon \(P'\), which is rotated by an angle of \(\pi /(d+1)\) so that the vertices of P meet the midpoints of the edges of \(P'\). Observe that \({{\,\textrm{int}\,}}(P'\setminus P)\) consists of \(d+1\) disjoint open “flaps” arranged sequentially along the edges of P. Label these flaps as \(F_1, F_2, \ldots , F_{d+1}\). For each \(i\in [d]\) define

$$\begin{aligned} U_{\alpha _i}={{\,\textrm{int}\,}}(P\cup F_i)\quad \ \text {and}\quad \ U_{\delta _i}={{\,\textrm{int}\,}}{(P\cup F_i \cup F_{d+1})}.\end{aligned}$$

In words, \(U_{\alpha _i}\) is P plus the i-th flap, while \(U_{\delta _i}\) is P plus the i-th flap and the \((d+1)\)-st flap. Observe that the nonempty codewords arising in this arrangement are exactly those of types (vi)–(viii) in Definition 3.1. Indeed, the codeword \(\{\alpha _i,\delta _i\}\) arises in \(F_i\), the codeword \(\alpha _{[d]}\cup \delta _{[d]}\) arises in the interior of P, and the codeword \(\delta _{[d]}\) arises inside \(F_{d+1}\).

We can now define the sets \(U_{\beta _i}\) and \(U_{\gamma _i}\) in our realization. For each \(i\in [d+1]\), let \(L_i\) be a line that is parallel to the i-th edge of P, moved a small distance away from P but still intersecting the flap \(F_i\). Let \(L_i'\) be a second copy of \(L_i\) moved twice as far from P as \(L_i\), and note that \(L_i'\) does not necessarily pass through \(F_i\). For \(2\le i\le d\), label points in the intersections of the various \(L_i\) and \(L_i'\) as follows:

$$\begin{aligned} p_i&=L_{i-1}\cap L_i,\!\qquad q_i=L_{i-1}'\cap L_i',\\ r_i&=L_{i-1}\cap L_i',\qquad s_i=L_{i-1}'\cap L_i,\\ t_i&=\frac{q_i+r_i}{2},\qquad \quad \, u_i=\frac{q_i+s_i}{2}. \end{aligned}$$

These points are shown in Fig. 3. Moreover, we define the following edge cases:

$$\begin{aligned} p_1&=s_1=L_1\cap L_{d+1}',\qquad q_1=t_1=L_1'\cap L_{d+1}',\\ r_{d+1}&=p_{d+1}=L_d\cap L_{d+1}',\qquad q_{d+1}=u_{d+1}=L_d'\cap L_{d+1}'. \end{aligned}$$

Fig. 3: Some of the points used to construct \(U_{\beta _i}\) and \(U_{\gamma _i}\). The polygons P and \(P'\) are not pictured

Now, with these points labeled, for \(i\in [d]\) we define

$$\begin{aligned} U_{\beta _i}={{\,\textrm{int}\,}}{({{\,\textrm{conv}\,}}\{s_i,r_{i+1},q_{i+1},q_i\})},\quad \text {and}\quad U_{\gamma _i}={{\,\textrm{int}\,}}{({{\,\textrm{conv}\,}}\{p_i,p_{i+1},u_{i+1},t_i\})}. \end{aligned}$$

We claim that this completes our open realization of \(\mathcal {C}_{(2,d,d)}\). The \(U_{\beta _i}\) and \(U_{\gamma _i}\) do not fully cover any of the regions giving rise to the codewords of types (vi)–(viii) that we described previously, so it suffices to show that the codewords arising inside the various \(U_{\beta _i}\) and \(U_{\gamma _i}\) are exactly those of types (i)–(v) in Definition 3.1.

Note that the union of the various \(U_{\beta _i}\) and \(U_{\gamma _i}\) is a bent, thickened line segment which wraps around the first d edges of P. For \(i\in [d-1]\), the codewords arising near the i-th joint in this bent region are exactly \(\{\beta _i,\gamma _i\}\), \(\{\beta _i,\beta _{i+1},\gamma _i\}\), \(\{\beta _i,\beta _{i+1}\}\), \(\{\beta _i, \beta _{i+1}, \gamma _{i+1}\}\), \(\{\beta _{i+1}, \gamma _{i+1}\}\) in sequence, as illustrated in Fig. 4.

Fig. 4: The codewords arising at the i-th “joint” where \(L_i\) and \(L_{i+1}\) meet. Note that this open realization is degenerate: the disjoint sets \(U_{\gamma _i}\) and \(U_{\gamma _{i+1}}\) share the boundary point \(p_{i+1}=L_i\cap L_{i+1}\)

These are all codewords of type (i)–(iv), and we see that all such codewords arise for various choices of i. Away from the joints, the only codewords that arise are \(\{\beta _i,\gamma _i\}\) and \(\{\alpha _i,\beta _i,\gamma _i,\delta _i\}\), the latter arising near the midpoint of \(L_i\). This accounts for the codewords of type (v), and so we have indeed realized the code \(\mathcal {C}_{(2,d,d)}\) as desired. \(\square \)

Example 3.3

In Fig. 5 we illustrate the open realization of \(\mathcal {C}_{(2,d,d)}\) constructed in Proposition 3.2 in the case \(d=4\). Explicitly, the code we realize is

$$\begin{aligned} \mathcal {C}_{(2,4,4)}&=\{\beta _1\gamma _1,\alpha _1\beta _1\gamma _1\delta _1,\beta _{12}\gamma _1,\beta _{12},\beta _{12}\gamma _2,\\&\qquad \beta _2\gamma _2,\alpha _2\beta _2\gamma _2\delta _2,\beta _{23}\gamma _2,\beta _{23},\beta _{23}\gamma _3,\\&\qquad \beta _3\gamma _3,\alpha _3\beta _3\gamma _3\delta _3,\beta _{34}\gamma _3,\beta _{34},\beta _{34}\gamma _4,\\&\qquad \beta _4\gamma _4,\alpha _4\beta _4\gamma _4\delta _4,\\&\qquad \alpha _1\delta _1,\alpha _2\delta _2,\alpha _3\delta _3,\alpha _4\delta _4,\delta _{1234},\alpha _{1234}\delta _{1234},\emptyset \}. \end{aligned}$$

In the first four lines above we have written the codewords that appear in the bent region around the outside of the central pentagon in the order that they appear. In the final line we have written the codewords that appear in the five “flaps” around the pentagon and the codeword \(\alpha _{1234}\delta _{1234}\) that appears in the pentagon itself.

Fig. 5: An open realization of \(\mathcal {C}_{(2,4,4)}\) in \(\mathbb {R}^2\) as constructed in the proof of Proposition 3.2

With this construction achieved, we can proceed to construct a non-degenerate open (and hence also closed) realization of \(\mathcal {C}_{(2,d,d)}\) in \(\mathbb {R}^d\). This construction is the most technical result in the paper, and requires us to carefully manipulate a variety of inequalities that define the sets in our realization. However, the broad intuition for this construction is not too complex: we thicken the coordinate axes into open cubical prisms to form the various \(U_{\alpha _i}\) and \(U_{\delta _i}\), and we form the various \(U_{\beta _i}\) and \(U_{\gamma _i}\) by sequentially slicing through a thickened simplex in the positive orthant which lies far from the origin. This construction is illustrated for the case \(d=3\) in Example 3.5 below.

Proposition 3.4

The code \(\mathcal {C}_{(2,d,d)}\) has a non-degenerate realization in \(\mathbb {R}^d\).

Proof

For \(i\in [d]\), we start by defining

$$\begin{aligned} U_{\alpha _i}&=\left\{ {\textbf{x}}\in \mathbb {R}^d\;\Big |\;0<x_j<1\ \text {for}\, j\ne i\,\text { and}\ \sum _{j\in [d]}x_j>1\right\} \quad \text {and}\\ U_{\delta _i}&=\{{\textbf{x}}\in \mathbb {R}^d\mid 0<x_j<1\ \text {for}\, j\ne i\,\text { and}\ x_i>0\}. \end{aligned}$$

In words, \(U_{\delta _i}\) is obtained from an open unit hypercube in the positive orthant by extending it infinitely in the i-th coordinate direction. The set \(U_{\alpha _i}\) is the subset of \(U_{\delta _i}\) in which the sum of all coordinates is larger than one—that is, \(U_{\alpha _i}\) is obtained from \(U_{\delta _i}\) by slicing off the simplex in which the sum of coordinates is one or less, which lies in the corner of the positive orthant.

We claim that the nonempty codewords that arise among these sets are exactly those of types (vi)–(viii) in Definition 3.1. Note that outside of the open unit hypercube in which each coordinate is between zero and one, the only nonempty codewords appearing are \(\{\alpha _i,\delta _i\}\) for \(i\in [d]\). This follows from the fact that \(U_{\alpha _i}\) and \(U_{\delta _i}\) are the same outside the hypercube, and \(U_{\alpha _i}\) does not meet \(U_{\alpha _j}\) outside the hypercube when \(j\ne i\). Inside the hypercube there are two regions. Where the sum of coordinates is larger than one, all \(U_{\alpha _i}\) and \(U_{\delta _i}\) appear, giving rise to the codeword \(\alpha _{[d]}\cup \delta _{[d]}\). Where the sum of coordinates is one or less, only the various \(U_{\delta _i}\) appear, giving rise to the codeword \(\delta _{[d]}\). Thus all codewords of types (vi)–(viii) appear, and no others.

Now let us define the \(U_{\beta _i}\) and \(U_{\gamma _i}\). Let \(\ell \) denote the linear functional given by \(\ell ({\textbf{x}})=\sum _{i\in [d]}ix_i\). Let C be the open convex region in the positive orthant consisting of the points \({\textbf{x}}\) so that the sum of the coordinates of \({\textbf{x}}\) is between \(5d^2\) and \(5d^2+1\). Now, for \(i\in [d]\) define

$$\begin{aligned} U_{\beta _i}&=\{{\textbf{x}}\in C\mid 5id^2-4d^2<\ell ({\textbf{x}})<5id^2+4d^2\}\quad \text {and}\\ U_{\gamma _i}&=\{{\textbf{x}}\in C\mid 5id^2-2d^2<\ell ({\textbf{x}})<5id^2+2d^2\}. \end{aligned}$$

We aim to show the addition of these sets to our realization gives rise to exactly the codewords of type (i)–(v) from Definition 3.1. First, let us determine the codewords that arise from these sets independent of the \(U_{\alpha _i}\) and \(U_{\delta _i}\). If \({\textbf{x}}\in C\), then observe that the value of \(\ell ({\textbf{x}})\) completely determines which codeword arises at \({\textbf{x}}\):

  • \(\{\beta _i,\gamma _i\}\) for \(i\in [d]\) arises when \(5id^2-d^2\le \ell ({\textbf{x}})\le 5id^2+d^2\),

  • \(\{\beta _i,\beta _{i+1},\gamma _i\}\) for \(i\in [d-1]\) arises when \(5id^2+d^2<\ell ({\textbf{x}})<5id^2+2d^2\),

  • \(\{\beta _i,\beta _{i+1}\}\) for \(i\in [d-1]\) arises when \(5id^2+2d^2\le \ell ({\textbf{x}})\le 5id^2+3d^2\), and

  • \(\{\beta _i,\beta _{i+1},\gamma _{i+1}\}\) for \(i\in [d-1]\) arises when \(5id^2+3d^2<\ell ({\textbf{x}})<5id^2+4d^2\).

By construction of C we have \(4d^2<\ell ({\textbf{x}})<5d^3+d^2\) for all \({\textbf{x}}\in C\), and so these cases cover all points in C. To show that each case actually occurs, we will construct a line segment L along which \(\ell \) takes values covering all cases above.
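The case analysis above can be checked mechanically for small d. The Python sketch below compares two descriptions of the codeword at a point of C with a given value of \(\ell \): one read directly from the inequalities defining \(U_{\beta _i}\) and \(U_{\gamma _i}\), and one read from the four-case list. It samples \(\ell \) at quarter-integer steps strictly between \(4d^2\) and \(5d^3+d^2\), using exact rational arithmetic. The function names and the sampling step are our own choices, and a finite sample of this kind is a sanity check rather than a proof.

```python
from fractions import Fraction


def codeword_from_sets(ell, d):
    """Labels of the U_beta_i and U_gamma_i whose defining inequalities ell satisfies."""
    d2 = d * d
    word = set()
    for i in range(1, d + 1):
        offset = ell - 5 * i * d2
        if -4 * d2 < offset < 4 * d2:
            word.add(("beta", i))
        if -2 * d2 < offset < 2 * d2:
            word.add(("gamma", i))
    return frozenset(word)


def codeword_from_cases(ell, d):
    """Codeword predicted by the four-case list, or None if no case applies."""
    d2 = d * d
    for i in range(1, d + 1):
        offset = ell - 5 * i * d2
        if -d2 <= offset <= d2:
            return frozenset({("beta", i), ("gamma", i)})
        if i <= d - 1 and d2 < offset < 2 * d2:
            return frozenset({("beta", i), ("beta", i + 1), ("gamma", i)})
        if i <= d - 1 and 2 * d2 <= offset <= 3 * d2:
            return frozenset({("beta", i), ("beta", i + 1)})
        if i <= d - 1 and 3 * d2 < offset < 4 * d2:
            return frozenset({("beta", i), ("beta", i + 1), ("gamma", i + 1)})
    return None


def observed_codewords(d, steps=4):
    """Codewords seen while sweeping ell over (4d^2, 5d^3 + d^2) in steps of 1/steps."""
    d2 = d * d
    lo, hi = 4 * d2, 5 * d**3 + d2
    seen = set()
    for k in range(lo * steps + 1, hi * steps):
        ell = Fraction(k, steps)
        word = codeword_from_sets(ell, d)
        assert word == codeword_from_cases(ell, d), (d, ell)
        seen.add(word)
    return seen


for d in (2, 3, 4):
    # Types (i)-(iv) contain d + 3(d - 1) = 4d - 3 codewords in total.
    print(d, len(observed_codewords(d)), 4 * d - 3)
```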

Let p be the point whose first coordinate is \(5d^2-d +1/2\), and whose remaining coordinates are all equal to \(1+1/(d-1)\). Observe that the sum of the coordinates of p is exactly \(5d^2+1/2\). Moreover, we have

$$\begin{aligned} \ell (p)&=5d^2-d+\frac{1}{2}+\biggl (1+\frac{1}{d-1}\biggr )\sum _{i=2}^di\\&=5d^2-d+\frac{1}{2}+\biggl (1+\frac{1}{d-1}\biggr )\left( d-1+\sum _{i=1}^{d-1}i\right) \\&\le 5d^2-d+\frac{1}{2}+\biggl (1+\frac{1}{d-1}\biggr )(d-1+(d-1)^2)\\&=5d^2-d+\frac{1}{2}+d^2<5d^2+d^2. \end{aligned}$$

Symmetrically, let q be a point whose last coordinate is \(5d^2-d+1/2\), and whose remaining coordinates are equal to \(1+1/(d-1)\). As with p, we see that the sum of coordinates of q is exactly \(5d^2+1/2\). Furthermore, we may compute

$$\begin{aligned} \ell (q)&=d\biggl (5d^2-d+\frac{1}{2}\biggr )+\sum _{i=1}^{d-1}i\biggl (1+\frac{1}{d-1}\biggr )\\&=5d^3-d^2+\frac{d}{2}+\biggl (1+\frac{1}{d-1}\biggr )\sum _{i=1}^{d-1}i\ge 5d^3-d^2+\frac{d}{2}+d>5d^3-d^2. \end{aligned}$$

Let L be the line segment from p to q. Since \(\ell \) is linear, we conclude that \(\ell \) takes all real values between \(\ell (p)<5d^2+d^2\) and \(\ell (q)>5d^3-d^2\) on L. Moreover, every point on L has positive coordinates and sum of coordinates equal to \(5d^2+1/2\), so L is contained in C. In particular, there are points in \(L\cap C\) for which \(\ell \) attains a value covering each case in the bulleted list above. Thus all codewords of types (i)–(iv) arise along L, and no others arise from the various \(U_{\beta _i}\) and \(U_{\gamma _i}\).
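The arithmetic for the two endpoints p and q can be confirmed with exact rational arithmetic. The Python sketch below rebuilds both points for a given d and checks the three facts used in the argument: both coordinate sums equal \(5d^2+1/2\), \(\ell (p)<5d^2+d^2\), and \(\ell (q)>5d^3-d^2\). It also checks that every coordinate exceeds one. The function names are hypothetical.

```python
from fractions import Fraction


def ell(x):
    """The linear functional ell(x) = sum of i * x_i, with coordinates indexed from 1."""
    return sum(j * xj for j, xj in enumerate(x, start=1))


def segment_endpoints(d):
    """The points p and q: one large coordinate (first or last), the rest 1 + 1/(d-1)."""
    big = 5 * d * d - d + Fraction(1, 2)
    small = 1 + Fraction(1, d - 1)
    p = [big] + [small] * (d - 1)
    q = [small] * (d - 1) + [big]
    return p, q


def check_segment(d):
    """True if p and q satisfy the sum, ell, and coordinate bounds used in the proof."""
    p, q = segment_endpoints(d)
    target = 5 * d * d + Fraction(1, 2)
    return (
        sum(p) == target
        and sum(q) == target
        and ell(p) < 5 * d * d + d * d
        and ell(q) > 5 * d**3 - d * d
        and min(p + q) > 1
    )


print(all(check_segment(d) for d in range(2, 10)))
```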

We have determined that the various \(U_{\alpha _i}\) and \(U_{\delta _i}\) give rise to exactly the codewords of types (vi)–(viii) in isolation, while the various \(U_{\beta _i}\) and \(U_{\gamma _i}\) give rise to exactly those of types (i)–(iv) in isolation. We must now argue that when considered together, all these codewords remain, and the only new codewords that appear are exactly those of type (v) in Definition 3.1.

Considering the sets together, we do not lose any codewords. Those of types (vi)–(viii) arise outside of C, which contains the various \(U_{\beta _i}\) and \(U_{\gamma _i}\). Those of types (i)–(iv) arise along L, and every point in L has all coordinates larger than one, so L does not meet any \(U_{\alpha _i}\) or \(U_{\delta _i}\). To see that only codewords of type (v) arise when considering the sets together, it will suffice to show that \(U_{\alpha _i}\cap C\) is nonempty, and is contained in the region where the codeword \(\{\beta _i,\gamma _i\}\) arises (recall that \(U_{\alpha _i}\) and \(U_{\delta _i}\) are identical outside of the unit hypercube, which does not meet C). That is, it will suffice to show that \(5id^2-d^2\le \ell ({\textbf{x}})\le 5id^2+d^2\) for all \({\textbf{x}} \in U_{\alpha _i}\cap C\), and to find an example of one such \({\textbf{x}}\).

The points \({\textbf{x}}\) in \(U_{\alpha _i}\cap C\) are exactly those which satisfy the following \(d+1\) conditions:

$$\begin{aligned} 5d^2<\sum _{k\in [d]}x_k<5d^2+1,\qquad 0<x_j< 1\ \text {for}\, j\ne i\quad \text {and}\quad x_i>0.\end{aligned}$$

Note that \(\ell ({\textbf{x}})\) is smallest when the early coordinates of \({\textbf{x}}\) are larger and the overall sum of the coordinates is smallest. Thus for \({\textbf{x}}\in U_{\alpha _i}\cap C\), the value of \(\ell ({\textbf{x}})\) is bounded below by

$$\begin{aligned} \sum _{j=1}^ij+i(5d^2-i)=5id^2-i^2+\sum _{j=1}^ij\ge 5id^2-d^2.\end{aligned}$$

On the other hand, \(\ell ({\textbf{x}})\) is largest when the later coordinates of \({\textbf{x}}\) are larger, and the overall sum of the coordinates is largest. Thus the value of \(\ell ({\textbf{x}})\) on the region \(U_{\alpha _i}\cap C\) is bounded above by

$$\begin{aligned} i(5d^2+1-i)+\sum _{j=i}^dj=5id^2+i-i^2+\sum _{j=i}^dj<5id^2+d^2.\end{aligned}$$

This shows that \(U_{\alpha _i}\cap C\) is contained in the region where the codeword \(\{\beta _i,\gamma _i\}\) arises, so the only codeword that could arise in our overall realization involving both \(\alpha _i\) and some \(\beta _j\) is exactly \(\{\alpha _i,\beta _i,\gamma _i,\delta _i\}\). To see that this codeword actually does arise, consider the point whose i-th coordinate is \(5d^2+1/2-\varepsilon \), and all of whose other coordinates are \(\varepsilon /(d-1)\). Its coordinates sum to \(5d^2+1/2\), so for a sufficiently small \(\varepsilon >0\) this point satisfies the conditions above. It therefore lies in \(C\cap U_{\alpha _i}\), and gives rise to the codeword \(\{\alpha _i,\beta _i,\gamma _i,\delta _i\}\).
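The two closed-form bounds displayed above can be sanity-checked numerically. The following is a minimal sketch, not a substitute for the proof: the function names are illustrative, and the range \(d\ge 2\) is an assumption, since the standing hypothesis on d is not restated in this part of the argument.

```python
def lower_bound(i, d):
    # sum_{j=1}^{i} j + i(5d^2 - i): the displayed lower bound on l(x)
    return sum(range(1, i + 1)) + i * (5 * d * d - i)


def upper_bound(i, d):
    # i(5d^2 + 1 - i) + sum_{j=i}^{d} j: the displayed upper bound on l(x)
    return i * (5 * d * d + 1 - i) + sum(range(i, d + 1))


def bounds_hold(max_d):
    """Check both inequalities for every 2 <= d <= max_d and every i in [d]."""
    for d in range(2, max_d + 1):
        for i in range(1, d + 1):
            if lower_bound(i, d) < 5 * i * d * d - d * d:
                return False
            if upper_bound(i, d) >= 5 * i * d * d + d * d:
                return False
    return True
```

Calling `bounds_hold(30)` checks every pair \((i,d)\) with \(2\le d\le 30\). The strict upper inequality is tightest at \(i=1\), where it reduces to \(d(d+1)/2<d^2\).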

So far we have shown that our collection is an open convex realization of \(\mathcal {C}_{(2,d,d)}\). Let us finally argue that our realization is non-degenerate. It suffices to observe that replacing the sets in our realization with their closures does not change the realized code. The arguments above can be applied verbatim, provided that we swap any strict inequalities for non-strict inequalities, and vice versa. \(\square \)

Example 3.5

Let us consider the code \(\mathcal {C}_{(2,d,d)}\) in the case \(d=3\). We have

$$\begin{aligned} \mathcal {C}_{(2,3,3)}&=\{\beta _1\gamma _1,\alpha _1\beta _1\gamma _1\delta _1,\beta _{12}\gamma _1,\beta _{12},\beta _{12}\gamma _2,\\&\qquad \beta _2\gamma _2,\alpha _2\beta _2\gamma _2\delta _2,\beta _{23}\gamma _2,\beta _{23},\beta _{23}\gamma _3,\\&\qquad \beta _3\gamma _3,\alpha _3\beta _3\gamma _3\delta _3,\\&\qquad \alpha _1\delta _1,\alpha _2\delta _2,\alpha _3\delta _3,\delta _{123},\alpha _{123}\delta _{123},\emptyset \}. \end{aligned}$$

Figure 6 illustrates the construction used in Proposition 3.4 to obtain a non-degenerate realization of \(\mathcal {C}_{(2,3,3)}\) in \(\mathbb {R}^3\). Note that Fig. 6 is only a sketch of our construction—we do not precisely illustrate the inequalities that define the set C, and the various \(U_{\alpha _i}\) and \(U_{\delta _i}\) would be thinner relative to C in an exact illustration.

Fig. 6. A non-degenerate open realization of \(\mathcal {C}_{(2,3,3)}\) in \(\mathbb {R}^3\), with the regions that give rise to each codeword labeled

So far, we have established appropriate upper bounds on the embedding dimensions of \(\mathcal {C}_{(2,d,d)}\). We now move on to establish matching lower bounds. It will suffice to show that \({{\,\textrm{cdim}\,}}(\mathcal {C}_{(2,d,d)})=d\), which we do in Proposition 3.7. Our proof requires two existing tools. The first tool is a notion of “rigid structures” defined by [1], which guarantees that a union of certain sets in a closed realization is convex—we do not state the definition of a rigid structure in full generality, but instead give a sufficient version of this result as a lemma below. The second tool we require is a code \(\mathcal {A}_d\) from [7]. The relevant feature of this code is that if we add a certain codeword to it, the resulting code has closed embedding dimension equal to d—the codeword of type (viii) from Definition 3.1 will be exactly the codeword that we need, up to a relabeling of the base set.

Lemma 3.6

(version of Lemma 4.21 from [1])   Let \(\mathcal {X}=\{X_1,X_2,\ldots ,X_n\}\) be a closed convex realization of a code \(\mathcal {C}\), and suppose that the nonempty codewords in \(\mathcal {C}\) can be labeled \(c_1,c_2,\ldots ,c_k\) so that (i) \(c_1\subset c_2\supset c_3\subset c_4\supset \ldots \subset c_{k-1}\supset c_k\), (ii) no other containments occur between nonempty codewords, and (iii) \(c_i\cap c_{i+1}\cap c_{i+2}\) is nonempty for all \(i\in [k-2]\). Then the union \(\bigcup _{i\in [n]}X_i\) is convex.

Proposition 3.7

The code \(\mathcal {C}_{(2,d,d)}\) has closed embedding dimension equal to d.

Proof

In Proposition 3.4 we showed that \(\mathcal {C}_{(2,d,d)}\) has a non-degenerate realization in \(\mathbb {R}^d\). Thus it will suffice to show that \(\mathcal {C}_{(2,d,d)}\) does not have a closed realization in any dimension \(d'<d\). Suppose for contradiction that we have a closed convex realization

$$\begin{aligned} \mathcal {X}=\{X_{\alpha _1},\ldots ,X_{\alpha _d},X_{\beta _1},\ldots ,X_{\beta _d},X_{\gamma _1},\ldots ,X_{\gamma _d},X_{\delta _1},\ldots ,X_{\delta _d}\}\end{aligned}$$

of \(\mathcal {C}_{(2,d,d)}\) in \(\mathbb {R}^{d'}\) where \(d'< d\).

Consider the code that arises only from the various \(X_{\beta _i}\) and \(X_{\gamma _i}\). The nonempty codewords in this code will be exactly the codewords of types (i)–(iv) in Definition 3.1. Observe that we may order these codewords sequentially so that we have the containments

$$\begin{aligned}&\beta _1\gamma _1\subset \beta _{12}\gamma _1\supset \beta _{12}\subset \beta _{12}\gamma _2\supset \beta _2\gamma _2\subset \beta _{23}\gamma _2\supset \ldots \\&\quad \ldots \subset \beta _{d-1}\beta _d\gamma _{d-1}\supset \beta _{d-1}\beta _d\subset \beta _{d-1}\beta _d\gamma _d\supset \beta _d\gamma _d. \end{aligned}$$

Moreover, no other containment relations exist between these codewords, and the intersection of any three consecutive codewords is nonempty (in particular, the intersection will contain some \(\beta _i\)). Thus by Lemma 3.6, the union of all \(X_{\beta _i}\) and \(X_{\gamma _i}\) is a closed convex set. Let us call this union \(X_{d+1}\).
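The three combinatorial hypotheses of Lemma 3.6 can be checked mechanically for the chain displayed above. The sketch below is illustrative only: it encodes \(\beta _i\) and \(\gamma _i\) as the hypothetical labels `"b<i>"` and `"g<i>"`, builds the codewords of types (i)–(iv) in the displayed order, and tests conditions (i)–(iii) of the lemma.

```python
def beta_gamma_chain(d):
    """Codewords of types (i)-(iv), listed in the zigzag order of the display."""
    chain = []
    for i in range(1, d):
        chain += [
            frozenset({f"b{i}", f"g{i}"}),
            frozenset({f"b{i}", f"b{i+1}", f"g{i}"}),
            frozenset({f"b{i}", f"b{i+1}"}),
            frozenset({f"b{i}", f"b{i+1}", f"g{i+1}"}),
        ]
    chain.append(frozenset({f"b{d}", f"g{d}"}))
    return chain


def satisfies_lemma_hypotheses(chain):
    k = len(chain)
    if k % 2 == 0:
        return False  # the zigzag c1 < c2 > ... > ck needs an odd length
    # (i) strict containments alternate: up, down, up, down, ...
    for idx in range(k - 1):
        lo, hi = chain[idx], chain[idx + 1]
        if idx % 2 == 1:
            lo, hi = hi, lo
        if not lo < hi:
            return False
    # (ii) no containments between non-adjacent codewords
    for a in range(k):
        for b in range(k):
            if abs(a - b) > 1 and chain[a] <= chain[b]:
                return False
    # (iii) any three consecutive codewords share an element
    return all(chain[t] & chain[t + 1] & chain[t + 2] for t in range(k - 2))
```

For example, `satisfies_lemma_hypotheses(beta_gamma_chain(3))` evaluates the nine codewords that appear for \(d=3\). The check covers only the combinatorial conditions; the convexity conclusion itself comes from the lemma.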

Now for \(i\in [d]\), define \(X_i=X_{\alpha _i}\) and \(X_{{\overline{i}}}=X_{\delta _i}\). The code realized by the collection \(\mathcal {X}'=\{X_1,X_2,\ldots ,X_{d+1},X_{\overline{1}},X_{\overline{2}},\ldots ,X_{{\overline{d}}}\}\) will be exactly \(\mathcal {A}_d\cup \{\{\overline{1},\overline{2},\ldots ,{\overline{d}}\}\}\), where \(\mathcal {A}_d\) is the code of [7, Defn. 3.6]. However, [7, Thm. 3.7] states that the closed embedding dimension of \(\mathcal {A}_d\cup \{\{\overline{1},\overline{2},\ldots ,{\overline{d}}\}\}\) is exactly d. Since the realization \(\mathcal {X}'\) lies in \(\mathbb {R}^{d'}\) where \(d'<d\), we have reached a contradiction. Thus \(\mathcal {C}_{(2,d,d)}\) has closed embedding dimension equal to d. \(\square \)

We have established all the necessary constructions and results to exactly characterize the embedding dimensions of \(\mathcal {C}_{(2,d,d)}\). We compile and summarize these results in the theorem below.

Theorem 3.8

The code \(\mathcal {C}_{(2,d,d)}\) of Definition 3.1 has embedding dimension vector equal to (2, d, d).

Proof

In Proposition 3.2 we established that \({{\,\textrm{odim}\,}}(\mathcal {C}_{(2,d,d)})\le 2\), and in Proposition 3.4 we showed that \({{\,\textrm{nddim}\,}}(\mathcal {C}_{(2,d,d)})\le d\). Proposition 3.7 showed that \({{\,\textrm{cdim}\,}}(\mathcal {C}_{(2,d,d)})=d\), which implies that the non-degenerate embedding dimension is also equal to d. We cannot have \({{\,\textrm{odim}\,}}(\mathcal {C}_{(2,d,d)})<2\) since Theorem 1.8 would imply that the closed embedding dimension is less than d. Thus the open embedding dimension is exactly 2, and the result follows. \(\square \)

4 Constructing the Codes \(\mathcal {C}_{(2,2,d)}\) with \(d<\infty \)

The code \(\mathcal {C}_{(2,2,d)}\) is closely related to the code \(\mathcal {C}_{(2,d,d)}\) which we defined and analyzed in the previous section. In fact, \(\mathcal {C}_{(2,2,d)}\) is simply the result of deleting the base set elements \(\delta _{[d]}\) from \(\mathcal {C}_{(2,d,d)}\). It turns out this is enough to lower the closed embedding dimension from d to 2, without changing the other embedding dimensions.

Definition 4.1

Let \(2\le d<\infty \), and define \(\mathcal {C}_{(2,2,d)}\) to be the code on the base set \(\alpha _{[d]}\cup \beta _{[d]}\cup \gamma _{[d]}\) which has the following nonempty codewords:

  (i) \(\{\beta _i,\gamma _i\}\) for \(i\in [d]\),

  (ii) \(\{\beta _i,\beta _{i+1},\gamma _i\}\) for \(i\in [d-1]\),

  (iii) \(\{\beta _i,\beta _{i+1}\}\) for \(i\in [d-1]\),

  (iv) \(\{\beta _i,\beta _{i+1},\gamma _{i+1}\}\) for \(i\in [d-1]\),

  (v) \(\{\alpha _i,\beta _i,\gamma _i\}\) for \(i\in [d]\),

  (vi) \(\{\alpha _i\}\) for \(i\in [d]\),

  (vii) \(\alpha _{[d]}\).
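The definition above translates directly into a short generator. This is a minimal sketch under an assumed encoding: \(\alpha _i\), \(\beta _i\), \(\gamma _i\) are written as the hypothetical labels `"a<i>"`, `"b<i>"`, `"g<i>"`, and the function name `code_22d` is illustrative.

```python
def code_22d(d):
    """Nonempty codewords of C_(2,2,d), following types (i)-(vii) of Definition 4.1."""
    words = set()
    for i in range(1, d + 1):
        words.add(frozenset({f"b{i}", f"g{i}"}))                # type (i)
        words.add(frozenset({f"a{i}", f"b{i}", f"g{i}"}))       # type (v)
        words.add(frozenset({f"a{i}"}))                         # type (vi)
    for i in range(1, d):
        words.add(frozenset({f"b{i}", f"b{i+1}", f"g{i}"}))     # type (ii)
        words.add(frozenset({f"b{i}", f"b{i+1}"}))              # type (iii)
        words.add(frozenset({f"b{i}", f"b{i+1}", f"g{i+1}"}))   # type (iv)
    words.add(frozenset(f"a{i}" for i in range(1, d + 1)))      # type (vii)
    return words
```

Counting by type gives \(3d+3(d-1)+1=6d-2\) nonempty codewords for \(d\ge 2\), so `len(code_22d(4))` is 22.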

The close relationship between \(\mathcal {C}_{(2,2,d)}\) and \(\mathcal {C}_{(2,d,d)}\) greatly simplifies our analysis of \(\mathcal {C}_{(2,2,d)}\). As a start, we have the following.

Proposition 4.2

The code \(\mathcal {C}_{(2,2,d)}\) has an open realization in \(\mathbb {R}^2\) and a non-degenerate realization in \(\mathbb {R}^d\).

Proof

Since \(\mathcal {C}_{(2,2,d)}\) is the result of deleting the base set elements \(\{\delta _1,\ldots ,\delta _d\}\) from \(\mathcal {C}_{(2,d,d)}\), any realization of \(\mathcal {C}_{(2,d,d)}\) yields a realization of \(\mathcal {C}_{(2,2,d)}\) by deleting the various \(U_{\delta _i}\). In Proposition 3.2 we constructed an open realization of \(\mathcal {C}_{(2,d,d)}\) in \(\mathbb {R}^2\), and in Proposition 3.4 we constructed a non-degenerate realization of \(\mathcal {C}_{(2,d,d)}\) in \(\mathbb {R}^d\). Since deleting sets in a realization preserves openness and non-degeneracy of the realization, these constructions give us an open realization of \(\mathcal {C}_{(2,2,d)}\) in \(\mathbb {R}^2\) and a non-degenerate realization of \(\mathcal {C}_{(2,2,d)}\) in \(\mathbb {R}^d\) as desired. \(\square \)
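The deletion step used in this proof can be illustrated for \(d=3\). The sketch below transcribes the nonempty codewords of \(\mathcal {C}_{(2,3,3)}\) from Example 3.5, deletes the elements \(\delta _1,\delta _2,\delta _3\), and compares the result with the codewords listed in Definition 4.1. The labels `a`, `b`, `g`, `d` for \(\alpha ,\beta ,\gamma ,\delta \) and the helper names are assumptions made for illustration.

```python
def word(s):
    """Parse a space-separated label string such as "a1 b1 g1" into a codeword."""
    return frozenset(s.split())


def delete_elements(code, removed):
    """Delete base set elements by removing them from every codeword."""
    return {frozenset(w - removed) for w in code}


# Nonempty codewords of C_(2,3,3), transcribed from Example 3.5.
C_233 = {word(s) for s in [
    "b1 g1", "a1 b1 g1 d1", "b1 b2 g1", "b1 b2", "b1 b2 g2",
    "b2 g2", "a2 b2 g2 d2", "b2 b3 g2", "b2 b3", "b2 b3 g3",
    "b3 g3", "a3 b3 g3 d3",
    "a1 d1", "a2 d2", "a3 d3", "d1 d2 d3", "a1 a2 a3 d1 d2 d3",
]}

# Nonempty codewords of C_(2,2,3), written out from Definition 4.1 with d = 3.
C_223 = {word(s) for s in [
    "b1 g1", "b2 g2", "b3 g3",
    "b1 b2 g1", "b2 b3 g2",
    "b1 b2", "b2 b3",
    "b1 b2 g2", "b2 b3 g3",
    "a1 b1 g1", "a2 b2 g2", "a3 b3 g3",
    "a1", "a2", "a3",
    "a1 a2 a3",
]}

after_deletion = delete_elements(C_233, {"d1", "d2", "d3"})
# The codeword delta_[3] collapses to the empty codeword, which every code contains.
nonempty_after_deletion = after_deletion - {frozenset()}
```

Here `nonempty_after_deletion` coincides with `C_223`, matching the claim that \(\mathcal {C}_{(2,2,d)}\) is obtained from \(\mathcal {C}_{(2,d,d)}\) by deleting \(\delta _{[d]}\).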

It remains to argue that \(\mathcal {C}_{(2,2,d)}\) has a closed realization in \(\mathbb {R}^2\), but no non-degenerate realization in a dimension less than d. We start by constructing a closed realization. Example 4.4 illustrates this construction in the case \(d=4\).

Proposition 4.3

The code \(\mathcal {C}_{(2,2,d)}\) has a closed realization in \(\mathbb {R}^2\).

Proof

Informally, we may form a closed realization by arranging the various \(X_{\beta _i}\) and \(X_{\gamma _i}\) sequentially along the x-axis, and then letting the various \(X_{\alpha _i}\) be triangles which meet at a common point above the x-axis and intersect the x-axis sequentially. Formally, let C be the strip \(\{(x,y)\in \mathbb {R}^2\mid -1\le y\le 0\text { and }0\le x\le 4d-3\}\). Then for \(i\in [d]\) we define

$$\begin{aligned} X_{\beta _i}&=\{(x,y)\in C\mid 4i-7\le x\le 4i\}\quad \text {and}\\ X_{\gamma _i}&=\{(x,y)\in C\mid 4i-5\le x\le 4i-2\}.\end{aligned}$$

Let \(p=(0,d)\), and for \(i\in [d]\) let \(q_i=(4i-3.75,0)\) and \(r_i=(4i-3.25,0)\). Then define

$$\begin{aligned} X_{\alpha _i}={{\,\textrm{conv}\,}}\{p,q_i,r_i\}. \end{aligned}$$

We claim that this yields a closed realization of \(\mathcal {C}_{(2,2,d)}\). Observe that the \(X_{\alpha _i}\) are triangles which only meet at p, so the codewords they give rise to in isolation are simply \(\{\alpha _i\}\) for \(i\in [d]\) and \(\alpha _{[d]}\), the latter arising only at p. These are exactly the codewords of types (vi) and (vii) in Definition 4.1.

The codewords that arise from the various \(X_{\beta _i}\) and \(X_{\gamma _i}\) in isolation are completely determined by the x-coordinates of points in C. Indeed, if \((x,y)\) is a point in C then the codeword arising at this point is

  • \(\{\beta _i,\gamma _i\}\) if and only if \(4i-4<x<4i-3\),

  • \(\{\beta _i, \beta _{i+1},\gamma _i\}\) if and only if \(4i-3\le x\le 4i-2\),

  • \(\{\beta _i, \beta _{i+1}\}\) if and only if \(4i-2<x<4i-1\), and

  • \(\{\beta _i,\beta _{i+1},\gamma _{i+1}\}\) if and only if \(4i-1\le x\le 4i\).

These cases cover all points in C apart from those on its two vertical edges \(x=0\) and \(x=4d-3\), where the codewords \(\{\beta _1,\gamma _1\}\) and \(\{\beta _d,\gamma _d\}\) arise, respectively, and all such cases occur by construction of C. Moreover, these are exactly the codewords of types (i)–(iv) in Definition 4.1. Finally, note that by choice of the points \(q_i\) and \(r_i\), the triangle \(X_{\alpha _i}\) only meets C where the codeword \(\{\beta _i,\gamma _i\}\) arises. This yields exactly the codewords of type (v), and so we have indeed constructed a closed realization of \(\mathcal {C}_{(2,2,d)}\) as desired. \(\square \)
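The construction in this proof is explicit enough to test by sampling. The sketch below evaluates the codeword at a finite set of sample points (a quarter-integer grid on the strip C, the apex p, and the centroid of each triangle) and compares the codewords found with Definition 4.1. It uses exact rational arithmetic so that boundary points are classified correctly. The sample set, the labels `a`, `b`, `g`, and all function names are assumptions made for illustration; a finite sample can support the claim but does not prove it.

```python
from fractions import Fraction as Fr


def cross(o, a, b):
    # z-component of (a - o) x (b - o)
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def in_closed_triangle(pt, a, b, c):
    signs = [cross(a, b, pt), cross(b, c, pt), cross(c, a, pt)]
    # inside or on the boundary iff the three orientations never disagree
    return not (min(signs) < 0 < max(signs))


def triangle(i, d):
    # vertices p, q_i, r_i of X_{alpha_i}
    return (Fr(0), Fr(d)), (4 * i - Fr(15, 4), Fr(0)), (4 * i - Fr(13, 4), Fr(0))


def codeword_at(pt, d):
    x, y = pt
    in_strip = -1 <= y <= 0 and 0 <= x <= 4 * d - 3
    found = set()
    for i in range(1, d + 1):
        if in_strip and 4 * i - 7 <= x <= 4 * i:
            found.add(f"b{i}")
        if in_strip and 4 * i - 5 <= x <= 4 * i - 2:
            found.add(f"g{i}")
        if in_closed_triangle(pt, *triangle(i, d)):
            found.add(f"a{i}")
    return frozenset(found)


def sampled_code(d):
    # grid on the strip at three heights, plus the apex and the centroids
    pts = [(Fr(k, 4), y) for k in range(4 * (4 * d - 3) + 1)
           for y in (Fr(-1), Fr(-1, 2), Fr(0))]
    pts.append((Fr(0), Fr(d)))
    for i in range(1, d + 1):
        p, q, r = triangle(i, d)
        pts.append(((p[0] + q[0] + r[0]) / 3, (p[1] + q[1] + r[1]) / 3))
    return {codeword_at(pt, d) for pt in pts} - {frozenset()}


def definition_4_1(d):
    n, m = range(1, d + 1), range(1, d)
    words = [{f"b{i}", f"g{i}"} for i in n]
    words += [{f"b{i}", f"b{i+1}", f"g{i}"} for i in m]
    words += [{f"b{i}", f"b{i+1}"} for i in m]
    words += [{f"b{i}", f"b{i+1}", f"g{i+1}"} for i in m]
    words += [{f"a{i}", f"b{i}", f"g{i}"} for i in n]
    words += [{f"a{i}"} for i in n]
    words.append({f"a{i}" for i in n})
    return {frozenset(w) for w in words}
```

For \(d=2,3,4\) the sampled codewords agree with `definition_4_1(d)`. The grid includes the ends \(x=0\) and \(x=4d-3\) of the strip, where the samples give \(\{\beta _1,\gamma _1\}\) and \(\{\beta _d,\gamma _d\}\), so no additional codewords appear there.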

Example 4.4

Figure 7 shows the construction used in Proposition 4.3 to obtain a closed realization of the code \(\mathcal {C}_{(2,2,d)}\) in \(\mathbb {R}^2\) in the case \(d=4\). In this case, we have

$$\begin{aligned} \mathcal {C}_{(2,2,4)}&=\{\beta _1\gamma _1,\alpha _1\beta _1\gamma _1,\beta _{12}\gamma _1,\beta _{12},\beta _{12}\gamma _2,\\&\qquad \beta _2\gamma _2,\alpha _2\beta _2\gamma _2,\beta _{23}\gamma _2,\beta _{23},\beta _{23}\gamma _3,\\&\qquad \beta _3\gamma _3,\alpha _3\beta _3\gamma _3,\beta _{34}\gamma _3,\beta _{34},\beta _{34}\gamma _4,\\&\qquad \beta _4\gamma _4,\alpha _4\beta _4\gamma _4,\\&\qquad \alpha _1,\alpha _2,\alpha _3,\alpha _4,\alpha _{1234},\emptyset \}. \end{aligned}$$
Fig. 7. A closed realization of \(\mathcal {C}_{(2,2,4)}\) in \(\mathbb {R}^2\)

We are now ready to prove that \({{\,\textrm{nddim}\,}}(\mathcal {C}_{(2,2,d)})=d\). Our proof proceeds similarly to the proof of Proposition 3.7—namely, it relies on the rigid structure result in Lemma 3.6, and on the characterization of the embedding dimensions of an existing family of codes.

Proposition 4.5

The code \(\mathcal {C}_{(2,2,d)}\) has non-degenerate embedding dimension equal to d.

Proof

In Proposition 4.2 we argued that \({{\,\textrm{nddim}\,}}(\mathcal {C}_{(2,2,d)})\le d\), so it will suffice to argue that there is no non-degenerate (open or closed) realization of \(\mathcal {C}_{(2,2,d)}\) in \(\mathbb {R}^{d'}\) with \(d'<d\). Suppose for contradiction that we have a closed non-degenerate realization

$$\begin{aligned} \mathcal {X}=\{X_{\alpha _1},\ldots ,X_{\alpha _d},X_{\beta _1},\ldots ,X_{\beta _d},X_{\gamma _1},\ldots ,X_{\gamma _d}\} \end{aligned}$$

of \(\mathcal {C}_{(2,2,d)}\) in a dimension \(d'<d\). As in the proof of Proposition 3.7, the codewords of types (i)–(iv) satisfy the conditions of Lemma 3.6, and so the union of all \(X_{\beta _i}\) and \(X_{\gamma _i}\) is a closed convex set. Let us call this set \(X_{d+1}\), and for \(i\in [d]\) define \(X_i=X_{\alpha _i}\).

Now, we claim that non-degeneracy of the realization \(\mathcal {X}\) guarantees non-degeneracy of the realization \(\mathcal {X}'=\{X_1,X_2,\ldots ,X_{d+1}\}\). First observe that the sets \(\{X_1,\ldots ,X_d\}\) are non-degenerate in isolation, and realize the code \(\{[d],\{1\},\{2\},\ldots ,\{d\},\emptyset \}\). The set \(X_{d+1}\) is full-dimensional, and since the codewords of type (v) in Definition 4.1 are the only ones that simultaneously contain some \(\alpha _i\) and \(\beta _j\) or \(\gamma _j\), we conclude that \(X_{d+1}\) intersects each other \(X_i\) at a point common to both their interiors, while avoiding the common intersection of all \(X_i\) for \(i\in [d]\).

However, the analysis above tells us that \({{\,\textrm{code}\,}}(\mathcal {X}')={\mathcal {S}}_d\) (recall Definition 2.1). The code \({\mathcal {S}}_d\) has non-degenerate embedding dimension exactly d, and since \(\mathcal {X}'\) is a realization in \(\mathbb {R}^{d'}\) with \(d'<d\), we have reached a contradiction. This proves the result. \(\square \)
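The passage from \(\mathcal {X}\) to \(\mathcal {X}'\) has a simple combinatorial shadow: each \(\alpha _i\) is relabeled i, and every \(\beta _j\) or \(\gamma _j\) is sent to the single index \(d+1\). The sketch below applies this relabeling to the codewords of \(\mathcal {C}_{(2,2,4)}\) transcribed from Example 4.4. Definition 2.1 is not restated in this section, so the explicit codeword list used for \({\mathcal {S}}_d\) in `assumed_S` is an assumption, as are the labels and function names.

```python
def word(s):
    return frozenset(s.split())


# Nonempty codewords of C_(2,2,4), transcribed from Example 4.4.
C_224 = {word(s) for s in [
    "b1 g1", "a1 b1 g1", "b1 b2 g1", "b1 b2", "b1 b2 g2",
    "b2 g2", "a2 b2 g2", "b2 b3 g2", "b2 b3", "b2 b3 g3",
    "b3 g3", "a3 b3 g3", "b3 b4 g3", "b3 b4", "b3 b4 g4",
    "b4 g4", "a4 b4 g4",
    "a1", "a2", "a3", "a4", "a1 a2 a3 a4",
]}


def merge_beta_gamma(code, d):
    """Relabel alpha_i as i and collapse every beta_j, gamma_j to the index d + 1."""
    merged = set()
    for w in code:
        image = {int(s[1:]) if s.startswith("a") else d + 1 for s in w}
        merged.add(frozenset(image))
    return merged


def assumed_S(d):
    # ASSUMPTION: nonempty codewords of S_d are [d], {d+1}, {i} and {i, d+1} for i in [d].
    words = [set(range(1, d + 1)), {d + 1}]
    words += [{i} for i in range(1, d + 1)]
    words += [{i, d + 1} for i in range(1, d + 1)]
    return {frozenset(w) for w in words}
```

Under that assumption, `merge_beta_gamma(C_224, 4)` equals `assumed_S(4)`: types (i)–(iv) collapse to \(\{d+1\}\), type (v) becomes \(\{i,d+1\}\), and types (vi) and (vii) become \(\{i\}\) and \([d]\).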

We have now characterized the embedding dimension vector of \(\mathcal {C}_{(2,2,d)}\), as summarized in the theorem below.

Theorem 4.6

The code \(\mathcal {C}=\mathcal {C}_{(2,2,d)}\) of Definition 4.1 has embedding dimension vector equal to (2, 2, d).

Proof

In Proposition 4.2 we showed that the open and non-degenerate embedding dimensions of \(\mathcal {C}_{(2,2,d)}\) were no larger than two and d, respectively. Proposition 4.3 showed that the closed embedding dimension was no more than two. In Proposition 4.5 we argued that the non-degenerate embedding dimension was exactly d, which then implies that the closed and open embedding dimensions are both exactly two—if they were smaller, so would be the non-degenerate embedding dimension. \(\square \)

5 Constructing the Codes \(\mathcal {C}_{(\infty ,2,\infty )}\), \(\mathcal {C}_{(2,\infty ,\infty )}\), and \(\mathcal {C}_{(2,2,\infty )}\)

We now treat the three remaining cases, in which some embedding dimensions may be infinite. As we did in Sect. 2, we draw on existing examples in the literature. In fact, the codes \(\mathcal {C}_{(\infty ,2,\infty )}\) and \(\mathcal {C}_{(2,\infty ,\infty )}\) have already been defined and analyzed in [18] and [3], respectively. Our main contribution is the construction of the code \(\mathcal {C}_{(2,2,\infty )}\) (see Theorem 5.3), which adds a rigid structure to the minimally non-convex code from [12, Thm. 5.10].

Fig. 8. Left: a closed realization of \(\mathcal {C}_{(\infty ,2,\infty )}\). Right: an open realization of \(\mathcal {C}_{(2,\infty ,\infty )}\). In the open realization we have labeled the regions that give rise to each codeword; the various \(U_i\) are open halves of the hexagon, rotated consecutively by 60 degrees

Proposition 5.1

The code

$$\begin{aligned} \mathcal {C}_{(\infty ,2,\infty )}:=\{2345,123,134,145,13,14,23,34,45,3,4,\emptyset \}, \end{aligned}$$

which appears in [18, Thm. 3.1], has embedding dimension vector \((\infty ,2,\infty )\).

Proof

[18, Thm. 3.1] states that this code does not have an open convex realization in any dimension, and so its open and non-degenerate embedding dimensions are both \(\infty \). On the other hand, [3, Fig. 2.2(a)] provides a closed realization of this code in \(\mathbb {R}^2\). \(\square \)

Proposition 5.2

The code

$$\begin{aligned} \mathcal {C}_{(2,\infty ,\infty )}:=\{123,126,156,234,345,456,12,16,23,34,45,56,\emptyset \} \end{aligned}$$

which appears in [3, Sect. 2.3] has embedding dimension vector \((2,\infty ,\infty )\).

Proof

[3, Lem. 2.9] states that this code does not have a closed convex realization in any dimension, and so its closed and non-degenerate embedding dimensions are both \(\infty \). However, [3, Fig. 2.1(a)] provides an open realization of this code in \(\mathbb {R}^2\). \(\square \)

Figure 8 duplicates [13, Fig. 1.7], illustrating a closed realization of \(\mathcal {C}_{(\infty ,2,\infty )}\) and an open realization of \(\mathcal {C}_{(2,\infty ,\infty )}\) in \(\mathbb {R}^2\). Note that both realizations are degenerate. For example, on the left \(X_2\) and \(X_3\) only intersect in a 1-dimensional segment, and on the right \(U_1\) and \(U_4\) are disjoint but share boundary points. In Theorem 5.3, we conclude by constructing and analyzing the code \(\mathcal {C}_{(2,2,\infty )}\).

Theorem 5.3

The code

$$\begin{aligned} \mathcal {C}_{(2,2,\infty )}:=\{123,145,2456,2467,389,678,689,246,45,67,68,89,1,2,3,\emptyset \} \end{aligned}$$

has embedding dimension vector \((2,2,\infty )\).

Fig. 9. A closed realization of \(\mathcal {C}_{(2,2,\infty )}\) in \(\mathbb {R}^2\)

Fig. 10. An open realization of \(\mathcal {C}_{(2,2,\infty )}\) in \(\mathbb {R}^2\)

Proof

Figures 9 and 10 show closed and open realizations of \(\mathcal {C}_{(2,2,\infty )}\) in \(\mathbb {R}^2\). It remains to show that no non-degenerate realization of this code exists. Suppose for contradiction that we have a closed non-degenerate realization \(\mathcal {X}=\{X_1,X_2,\ldots ,X_9\}\) in \(\mathbb {R}^d\). We may assume without loss of generality that the various \(X_i\) are compact.

Let us first examine the sets \(\{X_4,X_5,\ldots ,X_9\}\) in isolation. The codewords that arise from this collection will be \(\{45,456,46,467,67,678,68,689,89,\emptyset \}\). These codewords satisfy the conditions of Lemma 3.6—namely, we have the containments

$$\begin{aligned} 45\subset 456\supset 46\subset 467\supset 67\subset 678\supset 68\subset 689\supset 89, \end{aligned}$$

no other containments occur, and the intersection of any three consecutive codewords is nonempty. Thus the union \(X_4\cup X_5\cup \ldots \cup X_9\) is convex. Since our realization is non-degenerate, the interior of this union is a nonempty convex open set—let us call this interior U.

Now, let p be a point in the interior of \(X_{123}\), let q be a point in the interior of \(X_{145}\), and let r be a point in the interior of \(X_{389}\). Observe that q and r both lie in U, so the line segment \(\overline{qr}\) is contained in U. Thus the consecutive codewords that appear along \(\overline{qr}\) must all contain some index between 4 and 9. In fact, consecutive codewords that appear along this line segment must contain one another (see [15, Lem. 2.1]). The only possible sequence of codewords that can arise along \(\overline{qr}\) is therefore

$$\begin{aligned} 145\quad 45\quad 2456\quad 246\quad 2467\quad 67\quad 678\quad 68\quad 689\quad 89\quad 389. \end{aligned}$$

In particular, \(\overline{qr}\) passes through \(X_2\) in addition to the interiors of \(X_1\) and \(X_3\). By possibly perturbing q and r by a small distance, we may assume that \(\overline{qr}\) passes through the interior of \(X_2\).
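The properties claimed for the displayed sequence of codewords can be verified directly. The sketch below is illustrative only: it checks that every term is a codeword of \(\mathcal {C}_{(2,2,\infty )}\), that consecutive terms are nested, that every term contains an index between 4 and 9, and that restricting to those indices recovers the chain used with Lemma 3.6. It does not establish that the sequence is the only possible one; that part relies on the argument in the text. The names `CODE`, `SEQ`, `is_valid_walk`, and `project_to_U` are hypothetical.

```python
# Codewords are encoded as sets of digit characters, e.g. "2456" -> {"2","4","5","6"}.
CODE = {frozenset(w) for w in [
    "123", "145", "2456", "2467", "389", "678", "689", "246",
    "45", "67", "68", "89", "1", "2", "3",
]}

SEQ = [frozenset(w) for w in [
    "145", "45", "2456", "246", "2467", "67", "678", "68", "689", "89", "389",
]]

U_INDICES = set("456789")


def is_valid_walk(seq, code):
    in_code = all(w in code for w in seq)
    # consecutive codewords must be strictly nested, in either direction
    nested = all(a < b or b < a for a, b in zip(seq, seq[1:]))
    meets_U = all(w & U_INDICES for w in seq)
    return in_code and nested and meets_U


def project_to_U(seq):
    """Restrict each codeword to the indices 4..9 and drop consecutive repeats."""
    out = []
    for w in seq:
        restricted = frozenset(w & U_INDICES)
        if not out or out[-1] != restricted:
            out.append(restricted)
    return out
```

The sequence passes `is_valid_walk`, contains the index 2 in three of its terms, and projects onto the chain \(45,456,46,467,67,678,68,689,89\).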

Let A be the affine span of p, q, and r. We may assume that these points are in general position so that A has dimension exactly two. Define \(U_i={{\,\textrm{int}\,}}(X_i)\cap A\) for \(i\in [3]\), and \(U_4=U\cap A\). The sets \(U_1\), \(U_2\), \(U_3\), and \(U_4\) may then be regarded as convex open sets in \(A\cong \mathbb {R}^2\).

We claim that the code realized by \(\,\mathcal {U}=\{U_1,U_2,U_3,U_4\}\) is exactly \({\mathcal {S}}_3\) (recall Definition 2.1). Our choice of p guarantees that \(U_1\cap U_2\cap U_3\) is nonempty, while the line segment \(\overline{qr}\subseteq U_4\) guarantees that \(U_4\) intersects each of \(U_1\), \(U_2\), and \(U_3\). The set \(U_4\) does not meet \(U_{123}\) since \(U_4\) is contained in U which does not meet \(X_{123}\), which in turn contains \(U_{123}\). Finally, codewords containing any of 12, 13, or 23 do not appear in this realization—if they did, then non-degeneracy of \(\mathcal {X}\) would imply that there was some codeword appearing in the original realization which contained one of these. No such codeword exists in the original realization, so \(\,\mathcal {U}\) is an open convex realization of \({\mathcal {S}}_3\) in \(\mathbb {R}^2\). This contradicts the fact that \({{\,\textrm{odim}\,}}({\mathcal {S}}_3)=3\) (recall Proposition 2.3). Thus \(\mathcal {C}_{(2,2,\infty )}\) does not have a non-degenerate realization in any dimension. \(\square \)

6 Conclusion

We have constructed several families of codes and characterized their embedding dimension vectors. In combination, these families guarantee that every vector (a, b, c) with \(2\le a,b,c\le \infty \) and \(\max {\{a,b\}}\le c\) arises as the embedding dimension vector of some code (Theorem 1.9). Moreover, such vectors are exactly those that can arise as embedding dimension vectors, with the exception of the vectors (0, 0, 0) and (1, 1, 1). Although our results required careful and sometimes lengthy proofs, our arguments were primarily based on existing tools in the convex neural code literature: sunflowers of convex open sets (recall Sect. 2, which restates results of [14]), and rigid structures of closed convex sets (recall Lemma 3.6, which restates results of [1]). Our contribution was to find combinations of these tools which yielded the correct embedding dimension vectors, and carry out the necessary analysis to characterize these vectors.

Although we have settled the question of which vectors can arise as embedding dimension vectors, it is still very difficult to bound the embedding dimensions of an arbitrary code \(\mathcal {C}\subseteq 2^{[n]}\). In fact, it is even an open question whether or not there exists an algorithm which can decide the open, closed, or non-degenerate embedding dimension of a code.

A further area of study which we did not explore in this work was the relationship between the size of the base set of a code and its embedding dimensions. Such a line of investigation would help characterize how “efficiently” codes can capture the dimension of a space in which they are realized. We thus ask the following:

Question 6.1

Among all codes \(\mathcal {C}\) with base set [n], what is the maximum finite open (respectively, closed or non-degenerate) embedding dimension that arises? Which codes achieve this maximum?

As a start, we conjecture that each additional base set element yields a strict increase in the maximum embedding dimension.

Conjecture 6.2

The maximum described in Question 6.1 is a strictly increasing function of n. That is, if \(\mathcal {C}\subseteq 2^{[n]}\) has maximum open embedding dimension among all codes on [n], while \(\mathcal {D}\subseteq 2^{[n+1]}\) has maximum open embedding dimension among all codes on \([n+1]\), then \({{\,\textrm{odim}\,}}(\mathcal {D})>{{\,\textrm{odim}\,}}(\mathcal {C})\). Moreover, the analogous result should hold for closed and non-degenerate embedding dimensions.

Rather than stratifying codes by the size of their base sets, and then asking for the maximum embedding dimension in each stratum, one could take the reverse perspective: stratify by embedding dimension, and then ask for the smallest base set size. We formalize this point of view below. Note that this is not simply a reformulation of Question 6.1, though these two questions do provide bounds for one another.

Question 6.3

Among all codes \(\mathcal {C}\) with open (respectively, closed or non-degenerate) embedding dimension equal to d, what is the minimum base set size that arises? Which codes achieve this minimum?

Results of [14] imply that the largest finite open embedding dimension among codes \({\mathcal {C}\subseteq 2^{[n]}}\) can be as large as

$$\binom{n-1}{\lfloor (n-1)/2\rfloor };$$

in particular, it can be larger than n. This implies that the minimum in Question 6.3 is not a strictly increasing function of d, so we cannot make an analogous conjecture to Conjecture 6.2 in the case of open embedding dimension. However, there are not yet known examples where \({{\,\textrm{cdim}\,}}(\mathcal {C})>n\) and \(\mathcal {C}\subseteq 2^{[n]}\). Nevertheless, we conjecture that such codes exist, and finding such examples would be a good starting point for work on Question 6.3.
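To see how quickly the binomial expression quoted from [14] outgrows n, it can be tabulated for small n. This is a minimal sketch; the function names are illustrative.

```python
from math import comb


def binomial_bound(n):
    """The central binomial-type quantity C(n-1, floor((n-1)/2))."""
    return comb(n - 1, (n - 1) // 2)


def first_n_exceeding():
    """Smallest n >= 1 for which binomial_bound(n) is strictly larger than n."""
    n = 1
    while binomial_bound(n) <= n:
        n += 1
    return n
```

The values for \(n=1,\ldots ,6\) are 1, 1, 2, 3, 6, 10, so the expression first exceeds n at \(n=5\).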

Conjecture 6.4

There exists a code \(\mathcal {C}\subseteq 2^{[n]}\) such that \(n<{{\,\textrm{cdim}\,}}(\mathcal {C})<\infty \).