1 Introduction

When decision-making criteria are ordered lexicographically, the first criterion that orders a pair of alternatives determines a choice or preference for that pair. This paper uses minimal sets of criteria to measure how concisely preferences can be represented and how efficiently an agent can make decisions. I will argue that criteria provide a better measure of concision than the economics test of checking whether a preference has a utility representation. The paper will also reconcile a disagreement between psychology and utility theory regarding the permissibility of indifference in preference analysis, generalize Debreu’s characterization of utility-representable preferences, and offer simple proofs of extension theorems for transitive orders. Criteria can in principle be arbitrary complete and transitive binary relations on the domain of alternatives, but it turns out that the simplest variety, the binary criteria that partition alternatives into two equivalence classes, is best suited to this paper’s agenda.

Any preference \(\succsim \) can be represented as the lexicographic ordering of some well-ordered set of binary criteria, and there is consequently a minimal set that represents \(\succsim \). The ordinal number of criteria in this minimal set, which I will call the intrinsic length of \( \succsim \), gauges how difficult \(\succsim \) is to represent and the decision-making burden of using binary criteria to make \(\succsim \)-optimal choices. When the intrinsic length of \(\succsim \) is less than the number of indifference classes in \(\succsim \), a length-cardinality gap is present and \(\succsim \) can then be represented concisely. Some preferences—for example, preferences with a finite number of indifference classes and utility-representable preferences with a continuum of indifference classes—are declared concise by this test, while others are hard to represent. Length-cardinality gaps offer an alternative to the long tradition in economics of regarding the existence of a utility representation as the mark of concision. The two approaches sometimes generate the same conclusions, but where they differ the length-cardinality gap discriminates more precisely. For example, an existence-of-a-utility test has to judge all preferences with countably many indifference classes as equally concise, while a length-cardinality test does not.

When an agent uses a sequence of binary criteria to make decisions, a length-cardinality gap means that the agent can make decisions rapidly and that the number of criterion orderings the agent has to make will be small relative to the number of preference orderings that the sequence generates. For example, to generate a finite number of indifference classes n and thus \(\left( {\begin{array}{c}n\\ 2\end{array}}\right) =\frac{n(n-1)}{2}\) orderings of indifference classes, only \(\left\lceil \log _{2}n\right\rceil \) criteria and criterion judgments are needed.\(^{1}\)

Intrinsic length sheds light on Chipman’s (1960) lexicographic utility theory, where a preference is represented by an ordered set of utility functions and the first utility to discriminate between two alternatives determines which is preferred.\(^{2}\) Chipman was prompted by Debreu’s demonstration in 1954 that so-called lexicographic preferences, where an agent chooses between bundles x and y according to the first coordinate where x and y differ, do not have utility representations. Chipman showed that these preferences and indeed all preferences have a lexicographic utility representation. In Chipman’s theory, Debreuvian lexicographic preferences turn out to have a concise representation since they require only finitely many utility functions. But this conclusion is threatened by Chipman’s use of utility functions as a measuring stick: is concision possible only because utility functions can represent uncountably many indifference classes? Well-ordered sets of binary criteria eliminate this difficulty. They also vindicate Chipman’s position: Debreuvian lexicographic preferences show a sizable length-cardinality gap and qualify as concise.

An agent may seek criteria that pick out a single choice from any finite set of options rather than a set of acceptable alternatives. In psychological models of sequences of binary criteria, this absence of indifference is upheld as an advantage over classical utility maximization (Tversky 1972).\(^{3}\) To formalize the decisiveness of binary criteria, I show that they can lead to preferences that strictly order every pair of bundles in \( {\mathbb {R}} ^{n}\): the indifferences criticized by behavioral psychology are absent. An agent moreover needs to proceed through only finitely many criteria to order a pair of bundles. Despite the absence of indifference, the preferences to which these sets of binary criteria lead have utility representations. Thus there is no inherent conflict between the psychological and utility points of view: decisiveness is consistent with utility maximization and economic rationality. Since the utilities that arise discriminate between every pair of bundles, they define one-to-one mappings (injections) of \( {\mathbb {R}} ^{n}\) into \( {\mathbb {R}} \) and in fact generalize the injections defined by Cantor, the discoverer of the first such mappings. This class of utility functions can represent ‘fractal’ preferences where the pattern of indifference classes when they are grouped together coarsely matches their pattern when grouped together finely (Mandler 2020c).

Well-ordered criteria also advance the mathematics of decision theory. By viewing each binary criterion as a binary digit and allowing the number of digits to be more than countable, we can generalize the classical representation result, due to Birkhoff (1948) and Debreu (1954), that a utility representation exists for a preference \(\succsim \) if and only if \(\succsim \) has a countable order-dense subset. Finally, criteria can provide very simple proofs that transitive orders can be extended to linear orders.

Evren and Ok (2011) and Kochov (2007), and earlier Ok (2002) and Mandler (2006), provide a non-lexicographic theory of representation in which unordered families of utility functions can represent incomplete preferences. The present theory lacks that advantage. Instead the goal will be to find minimal well-ordered families of binary relations or criteria that represent preferences when the criteria are ordered lexicographically. This approach delivers a gain in concision since lexicographic representations can use fewer criteria (of a given discriminatory capacity) than unordered representations require. The well-orderings play two roles: they ensure that when some criterion in a set discriminates between a pair of alternatives there is a first criterion that does so, and they make comparisons of concision possible by determining which of two sets of criteria is shorter. Ordinal numbers—the canonical well-ordered sets—are therefore essential. To maintain accessibility, I segregate the abstract applications of ordinals to the last two sections of the paper.

2 The lexicographic method

The notation, most of which is routine, is gathered into “Appendix 3” along with some mathematical background. Two conventions need advance warning. First, among the labels we use for binary relations are \( R,R_{k},R^{k},\succsim ,\ge _{L},\) and \(\le \), with \(I,I_{k},I^{k},\sim \), \(=_{L},\) and \(=\) denoting their symmetric parts and \(P,P_{k},P^{k},\succ ,>_{L},\) and \(<\) their asymmetric parts. Second, a binary relation is rational if it is complete and transitive and it is then also called a preference.

Let X be a domain of alternatives.

Definition 1

An S-sequence of criteria \(\left\langle R_{i}\right\rangle _{i\in S}\) is a set of binary relations on X with a well-ordered set of indices S such that, for each \(i\in S\), \(R_{i}\) is rational. Each \(R_{i}\) is a criterion.

The well-ordering assumption on S means that any nonempty set of indices contains a minimal index. The assumption is indispensable for a lexicographic ordering: it ensures that the first criterion to strictly order a pair of alternatives is well defined. Without loss of generality, we will take S to be an ordinal number, part of the definition of which is an ordering \(\le \) on S. The well-ordering in Definition 1 does not invoke the well-ordering theorem, whose proof is nonconstructive and which we use sparingly.\(^{4}\)

Until Sect. 5, there is little loss in generality in taking S to be a set of consecutive integers, beginning with 1 rather than 0 to suit our applications. The ordering that accompanies S is then the standard ordering of integers. We indicate the entire set of positive integers (with its standard ordering) by the ordinal notation \(\omega \). An \(\omega \)-sequence is therefore a traditional sequence.

The lexicographic ordering of \(\left\langle R_{i}\right\rangle _{i\in S}\) is the complete binary relation \(\succsim \) on X such that \( x\succ y\) if and only if the first criterion \(R_{k}\) to strictly order x and y has \(xP_{k}y\). Formally, \(\succsim \) is defined by

$$\begin{aligned} x\succsim y\Leftrightarrow \left[ \left( xR_{i}y{\text { for all }}i\in S\right) {\text { or }}\left( \exists k\in S{\text { such that }}xP_{k}y{\text { and }}xI_{i}y{\text { for all }}i<k\right) \right] \end{aligned}$$

for all \(x,y\in X\) and we then say that \(\left\langle R_{i}\right\rangle _{i\in S}\) represents \(\succsim \). Rearranging a sequence of criteria sometimes will and sometimes will not affect the preference represented. Generally speaking, the proofs of more abstract results, for instance, Examples 2 and 3 and Theorems 6, 7, and 8, will not depend on how criteria are arranged, while the concise S-sequences of criteria for particular preferences will, for instance, in Examples 1 and 4.\(^{5}\)
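The definition of the lexicographic ordering can be sketched in code for the special case of a finite S. The Python fragment below is only an illustration: the encoding of a criterion as a function `R(x, y)` and the names `strictly` and `lex_weakly_prefers` are assumptions introduced here, and each criterion is assumed to be rational.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")
# Hypothetical encoding: a criterion is a function with R(x, y) True when x R y.
Criterion = Callable[[T, T], bool]


def strictly(R: Criterion, x: T, y: T) -> bool:
    """Asymmetric part P of R: x R y holds but y R x does not."""
    return R(x, y) and not R(y, x)


def lex_weakly_prefers(criteria: Sequence[Criterion], x: T, y: T) -> bool:
    """True when x is weakly preferred to y under the lexicographic ordering."""
    for R in criteria:            # criteria are visited in index order
        if strictly(R, x, y):     # the first strict ranking favors x
            return True
        if strictly(R, y, x):     # the first strict ranking favors y
            return False
    return True                   # no criterion strictly orders the pair


# Hypothetical alternatives: (price tier, quality tier) pairs.
cheaper = lambda a, b: a[0] <= b[0]   # a lower price is weakly better
better = lambda a, b: a[1] >= b[1]    # a higher quality is weakly better
x, y = (1, 0), (2, 3)
price_first = lex_weakly_prefers([cheaper, better], x, y)
quality_first = lex_weakly_prefers([better, cheaper], x, y)
```

In this toy case the two arrangements rank x and y in opposite ways, which illustrates the remark that rearranging a sequence of criteria can change the preference represented.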

An equivalence class of a criterion \(R_{i}\) is a maximal subset of the domain X that contains only elements that are not strictly ordered by \( R_{i}\) (see “Appendix 3”). A binary criterion \(R_{i}\) partitions X into no more than two equivalence classes, labeled \(E_{1}^{i}\) and \(E_{2}^{i}\) when there are exactly two, where \(xP_{i}y\) if and only if \( x\in E_{1}^{i}\) and \(y\in E_{2}^{i}\). The name ‘criterion’ stems from the binary case. Given an S-sequence \(\left\langle R_{i}\right\rangle _{i\in S}\), the more preferred of two alternatives can be found by proceeding through the criteria until coming to the first \(R_{i}\) such that \(E_{1}^{i}\) contains just one of the alternatives.

One of the purposes of lexicographic orderings of criteria is to represent preferences concisely. As we will see, a small number of criteria can represent a large number of preference distinctions even when criteria are binary (Sects. 3 and 4). An S-sequence of criteria can also be viewed as a decision procedure in which an agent proceeds through criteria sequentially and the first criterion that orders a pair of alternatives determines the agent’s choice. A concise representation then indicates decision-making efficiency: choices can be made quickly and relatively few criterion orderings are needed. For this interpretation, the number of criteria an agent needs to examine to discriminate between two alternatives should be finite and therefore S should be no greater than \(\omega \). When S equals \(\omega \), the number of criteria an agent needs to examine remains finite, but generally there will not be a finite bound that will serve for all pairs. If \(S>\omega \), on the other hand, the agent would face the infeasible task of having to examine infinitely many criteria.

In a procedural interpretation, the S-sequence forms the primitive description of an agent. Agents do not begin with preferences and then derive criteria; they are endowed with criteria in the same way agents are endowed with preferences in consumer theory. An \(\omega \)-sequence that represents a classical consumer preference \(\succsim \) in fact presupposes less of an agent than does \(\succsim \) itself. As we will see in Example 2, an \(\omega \)-sequence of binary criteria, which posits only countably many criterion orderings, can lead to the textbook cases of consumer preferences that make uncountably many choice distinctions. Since orderings are presumably costly, it is less demanding to assume that agents are endowed with such \(\omega \)-sequences than with the preference relations the criteria lead to.

One representation fact is obvious: any preference \(\succsim \) is the lexicographic ordering of some set of criteria, namely the set that consists of just \(\succsim \). The purpose of S-sequences with \(S>1\) is to lessen the number of criterion orderings presupposed in a representation and to define a sharper measure of concision. A binary criterion requires only one criterion ordering, a ranking of the criterion’s two equivalence classes, and we will see that binary criteria both reduce the number of criterion orderings an agent needs to make and provide a uniform yardstick for concision comparisons.

The fundamental rationality property of S-sequences of criteria is that they lead to rational binary relations, i.e., to preferences. A similar conclusion for lexicographic products of linear criteria is a well-known fact in set theory (see Ciesielski 1997). Theorem 1 applies regardless of which ordinal forms the set of indices S.

Theorem 1

The lexicographic ordering of any S-sequence of criteria is rational.

Proof

Let \(\succsim \) be the lexicographic ordering of \(\left\langle R_{i}\right\rangle _{i\in S}\). For any pair \(x,y\in X\), either \(xR_{i}y\) for all \(i\in S\) or, since S is well-ordered, there is a first \(k\in S\) such that \(yP_{k}x\). In the first case \(x\succsim y\) and in the second case either \(x\succ y\) (when \(\exists i<k\) with \(xP_{i}y\) and hence there is a first i with \(xP_{i}y\)) or \(y\succ x\) (when \(\not \exists i<k\) with \( xP_{i}y \)). So \(\succsim \) is complete. For transitivity, suppose \( x\succsim y\succsim z\). If \(xI_{i}y\) and \(yI_{i}z\) for all \(i\in S\), then by transitivity \(xR_{i}z\) for all \(i\in S\) and hence \(x\succsim z\). Alternatively there is a first \(i\in S\) such that at least one of the following hold: \(xP_{i}y\), \(yP_{i}x\), \(yP_{i}z\), \(zP_{i}y\). Since \( x\succsim y\succsim z\) and \(xI_{j}yI_{j}z\) for \(j<i\), we have \(xR_{i}yR_{i}z\). So either \(xP_{i}y\) or \(yP_{i}z\) or both and hence \(xP_{i}z\). Since \( xI_{j}yI_{j}z\) and hence \(xI_{j}z\) for \(j<i\), we have \(x\succsim z\). \(\square \)
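As a sanity check on Theorem 1, and in no way a substitute for the proof, the following Python sketch tests completeness and transitivity by brute force on a small finite domain. It covers only finite S. The encoding of each criterion as a rank dictionary (which guarantees that the criterion is rational) and the names `lex_weak`, `is_rational`, and `check_theorem_1` are assumptions made for this illustration.

```python
import itertools
import random


def lex_weak(criteria, x, y):
    """x weakly preferred to y; each criterion maps an alternative to a rank (higher is better)."""
    for rank in criteria:
        if rank[x] != rank[y]:        # first criterion that strictly orders the pair
            return rank[x] > rank[y]
    return True                        # tied on every criterion


def is_rational(domain, criteria):
    """Check completeness and transitivity of the lexicographic ordering on a finite domain."""
    complete = all(
        lex_weak(criteria, x, y) or lex_weak(criteria, y, x)
        for x, y in itertools.product(domain, repeat=2)
    )
    transitive = all(
        lex_weak(criteria, x, z)
        for x, y, z in itertools.product(domain, repeat=3)
        if lex_weak(criteria, x, y) and lex_weak(criteria, y, z)
    )
    return complete and transitive


def check_theorem_1(trials=200, seed=0):
    """Draw random sequences of three criteria on six alternatives and test each one."""
    rng = random.Random(seed)
    domain = list(range(6))
    for _ in range(trials):
        criteria = [{x: rng.randint(0, 2) for x in domain} for _ in range(3)]
        if not is_rational(domain, criteria):
            return False
    return True
```

A call such as `check_theorem_1()` should return `True` for every seed, since any failure would contradict the theorem.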

Three examples will illustrate how to use criteria to represent preferences. A utility representation of a binary relation \(\succsim \) on X is a function \(u:X\rightarrow {\mathbb {R}} \) such that \(\left. x\succsim y\right. \Leftrightarrow u(x)\ge u(y)\). An S-sequence of criteria \(\left\langle R_{i}\right\rangle _{i\in S}\) is a Chipman representation of its lexicographic ordering if each \( R_{i} \) has a utility representation.

Example 1 shows that even though a preference relation may lack a utility representation, it can have a concise Chipman representation. The two following examples use binary criteria to represent a preference \( \succsim \): each criterion partitions X into two \(\succsim \)-ordered equivalence classes and the ‘dividing points’ of the criteria form a set that is dense in \(\succsim \). Example 2 shows that any utility-representable preference can be represented by an \(\omega \)-sequence of binary criteria. In this canonical case of concise representation, uncountably many indifference classes are represented by countably many binary criteria. Example 3 shows for any preference \( \succsim \) how to build a binary S-sequence that represents \(\succsim \).

Example 1

(Debreuvian lexicographic preferences) Define \(\succsim \) on the domain \(X= {\mathbb {R}} _{+}^{n}\) by

$$\begin{aligned} x\succsim y\Leftrightarrow (x=y{\text { or }}x_{i}>y_{i}{\text { for the least integer }}i{\text { such that }}x_{i}\ne y_{i}). \end{aligned}$$

To distinguish \(\succsim \) from the other cases of lexicography in this paper, I have added the modifier ‘Debreuvian’ to what are commonly known as ‘lexicographic preferences’. Debreu (1954) famously showed that \( \succsim \) has no utility representation. But there is a set of just n criteria that represents \(\succsim \) where each criterion does have a utility representation. Let \(R_{i}\) be the binary relation on X that orders n-vectors by their ith coordinate, \(xR_{i}y\Leftrightarrow x_{i}\ge y_{i}\), which has the utility representation \(u_{i}(x)=x_{i}\). So \(\left\langle R_{i}\right\rangle _{i\in \{1,\ldots ,n\}}\) is a Chipman representation of \(\succsim \). \(\square \)
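The Chipman representation in Example 1 can be mimicked in a few lines of Python. This is a minimal sketch under stated assumptions: bundles are encoded as tuples of equal length, the helper names `coordinate_utilities` and `chipman_weakly_prefers` are hypothetical, and the sample grid of bundles is arbitrary.

```python
from itertools import product


def coordinate_utilities(n):
    """Utilities u_i(x) = x_i for i = 1, ..., n (indexed from 0 inside Python)."""
    return [lambda x, i=i: x[i] for i in range(n)]


def chipman_weakly_prefers(utilities, x, y):
    """x is weakly preferred to y when the first utility that separates them favors x."""
    for u in utilities:
        if u(x) != u(y):
            return u(x) > u(y)
    return True  # every utility ties, which here means x == y


utilities = coordinate_utilities(2)
grid = list(product([0.0, 0.5, 1.0], repeat=2))  # a small, arbitrary sample of bundles
# On this sample the relation coincides with Python's built-in tuple comparison,
# which is itself lexicographic.
agrees = all(
    chipman_weakly_prefers(utilities, x, y) == (x >= y) for x in grid for y in grid
)
```

Only n utility functions are consulted, one per coordinate, which is the sense in which the representation is concise even though no single utility function represents the preference.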

Example 2

(Preferences with utilities) Suppose that \(\succsim \) on an arbitrary domain X has a utility representation u, thus allowing \(\succsim \) to have an uncountable number of indifference classes. Then \(\succsim \) has a Chipman representation consisting of the single criterion \(\succsim \). For an alternative that uses binary criteria, define for each rational number r, the criterion equivalence classes \(E_{1}^{r}=\{z\in X:u(z)\ge r\}\) and \( E_{2}^{r}=\{z\in X:u(z)<r\}\) with criterion \(R^{r}\) defined by \(xR^{r}y\) if and only if \((x\in E_{k}^{r}\), \(y\in E_{l}^{r}\), and \(k\le l)\). Assign the indices in \(\omega \) to the rationals via a bijection \(f:\omega \rightarrow \mathbb {Q} \) and let \(\ge _{L}\) be the lexicographic ordering of \(\left\langle R^{f(i)}\right\rangle _{i\in \omega }\). To see that \(\ge _{L}\,=\,\succsim \), suppose that \(x\succsim y\). The fact that u represents \(\succsim \) then implies that, for any \(r\in \mathbb {Q} \), \((y\in E_{1}^{r})\Rightarrow (x\in E_{1}^{r})\). It follows that \( xR^{r}y \) for all \(r\in \mathbb {Q} \) and therefore \(x\ge _{L}y\). Conversely, suppose \(x\ge _{L}y\). If \( y\succ x\) then \(yR^{r}x\) for all \(r\in \mathbb {Q} \) and, letting \(r^{\prime }\) be a rational in (u(x), u(y)], we have \(y\in E_{1}^{r^{\prime }}\) and \(x\in E_{2}^{r^{\prime }}\) and therefore \( yP^{r^{\prime }}x\). Thus \(y>_{L}x\), a contradiction. See Mandler et al. (2012).

The arrangement of the \(R^{r}\), as determined by f, does not affect the conclusion that \(\ge _{L}\,=\,\succsim \). Since the \(E_{1}^{r}\) are upper contours of \(\succsim \), if x and y are strictly ordered by \(\succ \) then each \(R^{r}\) that strictly orders x and y imposes the same strict ordering. \(\square \)
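The construction in Example 2 can be illustrated procedurally. The sketch below is an assumption-laden illustration: it fixes one particular enumeration of the rationals in the role of the bijection f (any other enumeration would do), identifies each alternative with its utility value, and uses the hypothetical names `rationals` and `compare`. It reports how many criteria are examined before a pair is strictly ordered.

```python
from fractions import Fraction
from itertools import count


def rationals():
    """Enumerate every rational exactly once (one hypothetical choice of f)."""
    yield Fraction(0)
    for total in count(2):                       # total = p + q with p, q >= 1
        for q in range(1, total):
            p = total - q
            if Fraction(p, q).denominator == q:  # skip fractions not in lowest terms
                yield Fraction(p, q)
                yield Fraction(-p, q)


def compare(ux, uy):
    """Return (ranking, number of criteria examined) given utilities ux = u(x), uy = u(y)."""
    if ux == uy:
        return "indifferent", None               # no criterion ever separates the pair
    for k, r in enumerate(rationals(), start=1):
        x_in_upper, y_in_upper = ux >= r, uy >= r    # membership in E_1^r
        if x_in_upper != y_in_upper:
            return ("x" if x_in_upper else "y"), k


ranking, examined = compare(Fraction(3, 10), Fraction(1, 2))
```

Whenever the two utilities differ, the loop terminates after finitely many criteria because some rational lies between them, although the number examined has no uniform bound across pairs.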

Example 3

(Arbitrary preferences) Let \(\succsim \) be a preference on an arbitrary X. For each \(z\in X\), define the equivalence classes \(E_{1}^{z}=\{w\in X:w\succsim z\}\) and \(E_{2}^{z}=\{w\in X:z\succ w\}\), and set \(xR^{z}y\) if and only if \( (x\in E_{k}^{z}\), \(y\in E_{l}^{z}\), and \(k\le l)\). By the well-ordering theorem there is an ordinal S that maps bijectively to X, let us say by \( f:S\rightarrow X\). So \(\left\langle R^{f(i)}\right\rangle _{i\in S}\) is an S-sequence of binary criteria. The minimum of the ordinals S that map bijectively to X is the cardinal \(\left| X\right| \) and \( \left\langle R^{f(i)}\right\rangle _{i\in S}\) is then a \(\left| X\right| \)-sequence.\(^{6}\) The proof that \( \ge _{L}\), the lexicographic ordering of \(\left\langle R^{f(i)}\right\rangle _{i\in S}\), equals \(\succsim \) is similar to the argument in Example 2. If \(x\succsim y\) then the transitivity of \(\succsim \) implies that, for any \(z\in X\), \((y\in E_{1}^{z})\Rightarrow (x\in E_{1}^{z})\). Hence \(xR^{z}y\) for all \(z\in X\) and \(x\ge _{L}y\). If \(x\ge _{L}y\) and \(y\succ x\) then \(yR^{z}x\) for all z and \(yP^{y}x\). Since S is well-ordered, there must be a first index \( i\in S\) such that \(yP^{f(i)}x\) and therefore \(y>_{L}x\), a contradiction. Hence \(\left\langle R^{f(i)}\right\rangle _{i\in S}\) represents \(\succsim \). See Chipman (1960) and Martínez Legaz (1998). As in Example 2, the \(E_{1}^{z}\) are upper contours of \( \succsim \) and therefore the arrangement of the criteria by f is irrelevant for the equality \(\ge _{L}\,=\,\succsim \). \(\square \)

Example 3 provides an alternative S-sequence of criteria to represent the Debreuvian lexicographic preference \(\succsim \) of Example 1. But the S-sequences in Example 3 come at a cost: they use as many criteria as there are elements in X and can therefore lead to lengthy representations. From the procedural point of view as well, these sequences can be unattractive. A sequence of criteria should issue decisions quickly and minimize the number of criterion orderings that an agent must form.

One improvement on the criteria in Example 3 is readily available: to generate a preference \(\succsim \), an agent can use just one criterion for each indifference class of \(\succsim \). For each indifference class E, pick some \(z\in E\) and define the criterion \(R^{z}\) as in Example 3 where \(xP^{z}y\) holds iff \(x\succsim z\succ y\). If n is the cardinality of the set of indifference classes of \(\succsim \), the n-sequence defined by these criteria, arranged in any order, will represent \(\succsim \).
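The claim that these upper-contour criteria can be arranged in any order can be checked by brute force in a small finite case. In the sketch below, which is illustrative only, each indifference class is identified with a utility level, and the names `threshold_criterion`, `lex_weak`, and `all_arrangements_represent` are hypothetical.

```python
from itertools import permutations, product


def threshold_criterion(level):
    """Binary criterion R^z: tests membership in the upper contour {w : u(w) >= level}."""
    return lambda u_w: u_w >= level


def lex_weak(criteria, ux, uy):
    """Weak preference under the lexicographic ordering of upper-contour criteria."""
    for in_upper in criteria:
        if in_upper(ux) != in_upper(uy):
            return in_upper(ux)       # the alternative inside the upper contour wins
    return True


def all_arrangements_represent(levels):
    """Check that every ordering of the criteria represents the same preference."""
    for order in permutations(levels):
        criteria = [threshold_criterion(level) for level in order]
        for a, b in product(levels, repeat=2):
            if lex_weak(criteria, a, b) != (a >= b):
                return False
    return True


result = all_arrangements_represent([0, 1, 2, 3])  # four indifference classes
```

The check succeeds because any upper-contour criterion that strictly orders a pair does so in the direction of the preference, so the position of the first such criterion in the sequence is immaterial.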

When measured by the number of orderings required, the representation of a preference \(\succsim \) with n indifference classes by a set of n binary criteria achieves considerable progress over the preference itself. The number of criterion orderings will equal the number of binary criteria, and if this number n is finite and greater than 3 then n will be less than the number of orderings of the indifference classes of \(\succsim \) which equals \(\left( {\begin{array}{c}n\\ 2\end{array}}\right) =\frac{n(n-1)}{2}\). Whether n is finite or not, the benefits of binary criteria will often be greater still: we will see that the number of binary criteria needed to represent \(\succsim \) can be strictly less than n.

Examples 2 and 3 show that the discreteness of binary criteria need not lead to discontinuities, in the sense of Debreu (1954), in the preferences that result. No restrictions are placed on the utilities in Example 2 or on the preferences in Example 3: the utilities and preferences can be either continuous or discontinuous.

It is illuminating to compare S-sequences of criteria with the families of utility functions that serve as representations in Evren and Ok (2011) and Kochov (2007) and earlier Ok (2002) and Mandler (2006). In these theories, \(x\succsim y\) is inferred if and only if \(u(x)\ge u(y)\) for every u in the family, with the benefit that any reflexive and transitive \(\succsim \) can be represented by some family. A lexicographically ordered set of criteria in contrast cannot represent an incomplete \(\succsim \). The advantages of lexicography are twofold. First, since criteria do not have to agree on the ordering between two alternatives for a preference to be represented, a lexicographic ordering can make do with fewer criteria (of a given discriminatory capacity). Second, due to the well-ordering the number of criteria in different representations can be compared without ambiguity. The concision of representations can then also be compared.

3 The intrinsic length of a preference

When an S-sequence of criteria \(\left\langle R_{i}\right\rangle _{i\in S}\) represents a preference, the ordinal number S measures how brief or concise the representation is. But for S to be a reasonable index of brevity, we should compare like with like: if \(\succsim \) is represented by a lengthy S-sequence of binary criteria while \(\succsim ^{\prime }\) is represented by a single criterion (\(\succsim ^{\prime }\) itself) it would hardly be reasonable to conclude that \(\succsim \) is more difficult to represent. Since any preference \(\succsim \) can be represented by some S-sequence of binary criteria (Example 3), we can define an intrinsic measure of concision by requiring criteria to be binary, thus keeping the playing field level. Since binary criteria have the minimum nontrivial number of equivalence classes, this measure of length will discriminate as finely as possible. If instead each criterion were allowed to have \(e>2\) equivalence classes, then all preferences with e or fewer indifference classes would be judged equally concise.

Given a preference \(\succsim \) on an arbitrary domain X, Example 3 shows that there is a \(\left| X\right| \)-sequence of binary criteria that represents \(\succsim \). Since the set of ordinal numbers

$$\begin{aligned} \{S:\exists {\text { a binary }}S{\text {-sequence }}\left\langle R_{i}\right\rangle _{i\in S}{\text { that represents }}\succsim {\text { such that }}S\le \left| X\right| \} \end{aligned}$$

is therefore nonempty, it has a minimal element: there is a quickest binary S-sequence that represents \(\succsim \).

Definition 2

The intrinsic length of a preference \(\succsim \) is the ordinal number S such that (1) there exists an S-sequence of binary criteria that represents \(\succsim \) and (2) if some \(S^{\prime }\)-sequence of binary criteria represents \(\succsim\) then \(S\le S^{\prime }\).

While determining the intrinsic length of a preference can be a nontrivial task, we saw in Sect. 2 that a preference with a set of indifference classes of cardinality n can be represented by an n-sequence of criteria. We will see in a moment that n provides a tight bound for intrinsic length.

From the procedural perspective, the number of criteria S in a sequence of binary criteria measures decision-making speed and the number of criterion orderings that an agent using the S-sequence has to make. The intrinsic length of a preference \(\succsim \) therefore indicates the efficiency that is achievable when an agent uses binary criteria to make \(\succsim \)-optimal decisions.

Both the decision-making efficiency and ease of representation of a preference \(\succsim \) should be judged relative to the cardinality n of its set of indifference classes. Concision and efficiency will appear as a ‘significant gap’ between the intrinsic length of \(\succsim \) and the bound n on intrinsic length. While there is no natural threshold that makes a gap significant, the presence of some nonzero gap can serve as a dividing line.

Definition 3

A preference \(\succsim \) displays a length-cardinality gap if the intrinsic length of \(\succsim \) is strictly less than the cardinality of the set of indifference classes of \(\succsim \).\(^{7}\)

We have already seen a length-cardinality gap—a significant gap—in Example 2: any preference with a utility representation has an intrinsic length no greater than \(\omega \) even if it has a continuum of indifference classes. Given that a utility-representable \(\succsim \) with infinitely many indifference classes cannot have a finite intrinsic length (see Example 5), the intrinsic length of such a \( \succsim \) must equal \(\omega \) exactly. Since virtually any preference used in consumer theory has both a utility representation and a continuum of indifference classes, preferences of this type offer a rich supply of length-cardinality gaps. That traditional consumer preferences turn out to be concise in our framework is reassuring: when the standard view that a utility function provides a brief summary of a preference makes sense, we come to the same conclusion.

Finite preferences offer an even simpler example of a length-cardinality gap.

Example 4

(Finitely many indifference classes) Suppose \(\succsim \) has a finite number of indifference classes n and let \(u:X\rightarrow \{0,\ldots ,n-1\}\) be a utility function that represents \(\succsim \). Identify each u(x) with its binary representation (a sequence of \(\left\lceil \log _{2}n\right\rceil \) 0’s and 1’s) and define the binary criterion \(R_{i}\) by setting \(xP_{i}y\) if and only if the ith digit (from the left) of u(x) is 1 and the ith digit of u(y) is 0 and setting \(xI_{i}y\) otherwise. The sequence \( \left\langle R_{i}\right\rangle _{i\in \left\{ 1,\ldots ,\left\lceil \log _{2}n\right\rceil \right\} }\) represents \(\succsim \) since \(x\succ y\) if and only if there is a first digit where the binary representations of u(x) and u(y) differ and x has a 1 in this digit while y has a 0.\(^{8}\) Given that \(\left\lceil \log _{2}n\right\rceil <n\) for every positive integer n, finite preferences always display a length-cardinality gap, indeed a gap that increases rapidly in n. Since the \(R_{i}\) are not defined by upper contours, as in Examples 2 and 3, their sequencing does affect the preference that is represented. \(\square \)
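The binary-digit construction of Example 4 translates directly into code. The following Python sketch is an illustration under the assumption that each alternative is identified with its utility level in \(\{0,\ldots ,n-1\}\); the function names `digit_criteria`, `lex_weak`, and `represents` are hypothetical.

```python
from itertools import product


def digit_criteria(n):
    """Digit readers for n classes; criterion i reads the i-th binary digit from the left."""
    k = (n - 1).bit_length()          # equals ceil(log2(n)) for every integer n >= 1
    return [lambda u, shift=k - i: (u >> shift) & 1 for i in range(1, k + 1)]


def lex_weak(criteria, ux, uy):
    """Weak preference: the first digit that differs decides, and a 1 beats a 0."""
    for digit in criteria:
        if digit(ux) != digit(uy):
            return digit(ux) > digit(uy)
    return True


def represents(criteria, n):
    """Check that the criteria, in the given order, reproduce the ranking by utility level."""
    return all(
        lex_weak(criteria, a, b) == (a >= b) for a, b in product(range(n), repeat=2)
    )


criteria = digit_criteria(5)                        # five classes need three binary criteria
in_order = represents(criteria, 5)                  # leading digit first
reversed_order = represents(criteria[::-1], 5)      # trailing digit first
```

With the leading digit first the sequence represents the preference, whereas reversing the sequence does not, in line with the remark that the sequencing of these criteria matters.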

If \(\left\langle R_{i}\right\rangle _{i\in S}\) represents \(\succsim \) then \( x,y\in X\) are in the same \(\succsim \)-indifference class if and only if x and y are in the same \(R_{i}\)-equivalence class for all \(i\in S\). Consequently if each \( R_{i}\) is binary, the cardinality n of the set of \(\succsim \)-indifference classes must satisfy \(n\le \left| 2^{S}\right| \). So, when n and S are both finite, \(\log _{2}n\le S\) and, since S is an integer, \( \left\lceil \log _{2}n\right\rceil \le S\). Example 4 thus pins down the intrinsic length of a \(\succsim \) with a finite number of indifference classes n: it is exactly \(\left\lceil \log _{2}n\right\rceil \).

This conclusion for the finite case illustrates the concision and decision-making efficiency of criteria. Though binary criteria are the crudest available, the number of binary criteria (and hence the number of criterion orderings) needed to generate a preference with n indifference classes increases on the order of \(\log n\), a slowly growing function of n. The n vs. \(\left\lceil \log _{2}n\right\rceil \) length-cardinality gap therefore widens as n increases. The same conclusions hold when an agent’s ‘true’ preference has infinitely many indifference classes but only finitely many criteria can be used to summarize the preference: n disjoint blocks of indifference classes can be distinguished by \(\left\lceil \log _{2}n\right\rceil \) criteria.
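As a worked illustration of the finite case, with the values of n chosen arbitrarily, compare the number of orderings of indifference classes with the intrinsic length:

$$\begin{aligned} n=8:\quad \left( {\begin{array}{c}8\\ 2\end{array}}\right) =28,\quad \left\lceil \log _{2}8\right\rceil =3;\qquad n=1024:\quad \left( {\begin{array}{c}1024\\ 2\end{array}}\right) =523776,\quad \left\lceil \log _{2}1024\right\rceil =10. \end{aligned}$$

Multiplying the number of indifference classes by 128 thus adds only seven binary criteria.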

Not every preference displays a length-cardinality gap.

Example 5

(Countably many indifference classes) Let \(\succsim \) have a countable infinity of indifference classes. Since the intrinsic length of \(\succsim \) is bounded above by the cardinality of its indifference classes \(\omega \) and since the lexicographic ordering of a finite number k of binary criteria can have at most \(2^{k}\) indifference classes, the intrinsic length of \(\succsim \) must be \(\omega \). If the binary S-sequence \(\left\langle R_{i}\right\rangle _{i\in S}\) represents \(\succsim\) then at least one indifference class of the preference represented by any finite initial segment of \(\left\langle R_{i}\right\rangle _{i\in S}\) must contain a countable infinity of \(\succsim \)-indifference classes. In this sense, a binary S-sequence that represents \(\succsim \) makes no real progress following finitely many of its criteria. \(\square \)

If it seems counterintuitive that there are preferences with utility representations, such as \(\succsim \) in Example 5, that we classify as hard to represent, keep in mind that length-cardinality gaps are defined relative to the number of indifference classes. What is notable about the \(\succsim \)’s in Example 5 is that their intrinsic length \(\omega \) is no smaller than the intrinsic length of classical utility-representable consumer preferences with a continuum of indifference classes (Example 2). One might have hoped for a more concise summary. An agent with one of the \(\succsim \)’s in Example 5 therefore will not enjoy any ‘savings’ on criterion orderings: to get a countable infinity of preference judgments, a countable infinity of criterion orderings is required.

I provide further cases where preferences show no length-cardinality gap in “Appendix 1”.

Examples 4 and 5 clarify what the length-cardinality gap accomplishes compared to the economics tradition of taking utility to be the natural representation tool. An existence-of-a-utility test of concision uses the same measuring rod for all preferences, and preferences with countably many indifference classes readily pass the test. The length-cardinality gap adapts the required intrinsic length to the number of indifference classes and therefore classifies preferences more precisely.

As we have seen, a preference with a utility representation can also be represented by a \(\omega \)-sequence of binary criteria. The argument used in Example 2 for this conclusion leads to a broader principle that shows when arbitrary criteria that represent a preference can be transformed into a relatively small set of binary criteria that represents the same preference. For a preference \(\succsim \) with a utility representation, it is the presence of a countable order-dense subset that ensures that a \(\omega \)-sequence of binary criteria can also represent \( \succsim \). The general result, Theorem 2, shows that if a criterion \(R_{i}\) has an order-dense subset D, then we can replace \( R_{i}\) with binary criteria, two for each element of D (footnote 9). This substitution can pin down the intrinsic length of a variety of preferences and will lead to some lessons regarding Chipman’s utility theory.

If R is a binary relation on X, \(D\subset X\) is R-order-dense if and only if for all \(x,y\in X\) with xPy there exists \(d\in D\) such that xRdRy. Debreu (1954) credits this convenient adjustment of prior definitions of order density to Savage.

Theorem 2

If the S-sequence \(\left\langle R_{i}\right\rangle _{i\in S}\) represents \(\succsim \) and there is a cardinality C such that each \( R_{i}\) has a \(R_{i}\)-order-dense subset of cardinality no greater than C, then the intrinsic length of \(\succsim \) is no greater than 2CS (footnote 10).

When C and S are finite, the product 2CS coincides with the standard definition of a product. More generally, 2CS is a multiplication of ordinals where the cardinal C is identified with the least ordinal that is equinumerous with C. If A and B are two ordinals endowed with \(\le _{A}\) and \(\le _{B},\) respectively, then the ordinal AB can be represented by \(B\times A\) ordered lexicographically, that is, with the ordering between \((b,a)\) and \((b^{\prime },a^{\prime })\) in \(B\times A\) determined by \(\le _{B}\) when \(b\ne b^{\prime }\) and determined by \(\le _{A}\) otherwise (see “Appendix 3”).

Theorem 2’s conversion of a S-sequence that represents some \( \succsim \) into a sequence of binary criteria representing the same \( \succsim \) will often make it easy to determine the intrinsic length of \( \succsim \). In the cases of interest, including the Debreuvian lexicographic preferences discussed in Example 1 and below, the sequence of criteria identified by Theorem 2 uses the minimum possible number of binary criteria.

Theorem 2 clarifies Chipman’s utility theory (Chipman 1960), which uses well-ordered families of utility functions to represent preferences. Chipman’s goal was to provide a representation tool for preferences, such as Debreuvian lexicographic preferences, that have no utility representation. One conclusion of Chipman’s work is that the Debreuvian lexicographic preference \(\succsim \) on \( {\mathbb {R}} _{+}^{n}\) is not difficult to represent. As pointed out in Example 1, this preference can be represented by n utility functions.

To pin down the intrinsic length of the Debreuvian lexicographic preference \( \succsim \), apply Theorem 2. Following Example 1, if we let \(R_{i}\) order points in \( {\mathbb {R}} _{+}^{n}\) by their ith coordinate, \(xR_{i}y\Leftrightarrow x_{i}\ge y_{i}\), then \(\left\langle R_{i}\right\rangle _{i\in \{1,\ldots ,n\}}\) is a ‘Chipman representation’ of \(\succsim \) (that is, each \(R_{i}\) has a utility representation). For each \(R_{i}\), there is a \(R_{i}\)-order-dense subset of \( {\mathbb {R}} _{+}^{n}\) of cardinality \(\omega \), for example the points z such that \( z_{i}\) is a positive rational and \(z_{k}=0\) for \(k\ne i\). Theorem 2 therefore implies that the intrinsic length of \(\succsim \) is no greater than \(2\omega n\). We can simplify this bound further since \( 2\omega =\omega \): the ordering \(\preceq \) of \(\omega \) consecutive copies of the pair \(\{1,2\}\), \(1^{1}\preceq 2^{1}\preceq 1^{2}\preceq 2^{2}\preceq 1^{3}\preceq 2^{3}\preceq \ldots \), coincides up to a labeling with \(\omega \), the standard ordering \(\le \) of the natural numbers \(1\le 2\le 3\le \ldots \). The intrinsic length of \(\succsim \) is therefore bounded by \(\omega n\) (footnote 11).
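The relabeling behind \(2\omega =\omega \) can be made concrete on a finite prefix. The sketch below is only an illustration: the function name `relabel` and the prefix length `N` are assumptions introduced here. It sends the element a of the bth copy of \(\{1,2\}\) to the natural number \(2(b-1)+a\) and checks that the lexicographic order on the copies matches the usual order on the labels.

```python
def relabel(copy, element):
    """Map the element (1 or 2) of the copy-th pair to a natural number."""
    return 2 * (copy - 1) + element


# Finite prefix of omega consecutive copies of {1, 2}, ordered first by
# copy index and then by element. Python compares tuples in this way.
N = 50  # hypothetical prefix length, chosen only for the check
pairs = sorted((copy, element) for copy in range(1, N + 1) for element in (1, 2))
labels = [relabel(copy, element) for copy, element in pairs]

# The relabeled sequence is exactly 1, 2, 3, ..., 2N in increasing order.
assert labels == list(range(1, 2 * N + 1))
```

Since the check goes through for every prefix length, the map is an order isomorphism between the \(\omega \) copies of the pair and the natural numbers.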

Since for the Debreuvian lexicographic preference \(\succsim \) on \( {\mathbb {R}} _{+}^{n}\), the set of indifference classes is \( {\mathbb {R}} _{+}^{n}\) itself, which has the cardinality of the continuum, \(\succsim \) displays a significant length-cardinality gap. So, despite its reputation, \(\succsim \) has a reasonably concise representation. Our theoretical language thus sharpens Chipman’s message: if we had followed Chipman’s lead and represented \(\succsim \) with utility-representable criteria, then the possibility would remain open that the concision conclusion was an artifact of letting criteria contain so many (uncountably many) equivalence classes. Indeed whenever criteria are more than binary, they risk masking the complexity or simplicity of the preferences they represent.

The above calculation of intrinsic length applies to any Chipman representation \(\left\langle R_{i}\right\rangle _{i\in S}\), not just to Debreuvian lexicography: S can be an arbitrary ordinal, and the 2 in the Theorem 2 bound on intrinsic length will again fall away.

Proposition 1

If \(\left\langle R_{i}\right\rangle _{i\in S}\) is a Chipman representation of \(\succsim, \) then the intrinsic length of \(\succsim \) is no greater than \(\omega S\).

Proof

Since each \(R_{i}\) has a utility representation, the domain X has a countable \(R_{i}\)-order-dense subset (Debreu 1954) and so by Theorem 2 the intrinsic length of \(\succsim \) is no greater than \(2\omega S\). As before, \(2\omega =\omega \). \(\square \)

While the representation of a preference can require countably many binary criteria when a Chipman representation needs to employ only finitely many utility-representable criteria, replacing a utility-representable criterion with a \(\omega \)-sequence of binary criteria amounts to a change of notation (see Sect. 5). Moreover, replacing utilities with binary criteria also will not by itself break the countability barrier: if \( \left\langle R_{i}\right\rangle _{i\in S}\) represents \(\succsim \), S is countable, and each \(R_{i}\) has a utility representation, then Proposition 1 implies \(\succsim \) can be represented by a countable sequence of binary criteria (since if S is countable then so is \(\omega S\)).

4 Decisive utility functions

Psychological theories of choice rest on decision rules and procedures rather than axioms of rationality. One object of suspicion among psychologists has been the concept of indifference, an import from utility theory that psychologists have doubted can be important for choice decisions or even observable. Tversky (1972) objects to utility theory on these grounds and offers an alternative model where agents proceed lexicographically through binary criteria. This section examines Tversky’s conclusions by considering, on the domain of bundles of goods, \(\omega \) -sequences of binary criteria that represent preferences that never deem distinct alternatives to be indifferent. Despite the absence of indifference, these preferences have utility representations. So although indifference has divided utility and behavioral theory, there is no need to choose sides: decisiveness is compatible with utility maximization.

Viewed procedurally, an agent who uses a \(\omega \)-sequence of binary criteria to choose from a finite set of possibilities will proceed through the criteria, letting each criterion \(R_{i}\) eliminate any bundle x that has survived so far if \(R_{i}\) specifies that some other survivor is superior to x, similarly to Tversky (1972) or Gigerenzer and Todd (1999). As long as each pair of possibilities is strictly ordered by some \(R_{i}\), eventually only one bundle will remain. The restriction to \(\omega \)-sequences ensures that the underlying choice protocol is finite: one bundle is selected after only finitely many criteria are examined. With S-sequences such that \(S>\omega \), any procedural interpretation would be suspect.

A utility function u on X is decisive if \(u(x)\ne u(y)\) for all \(x,y\in X\) with \(x\ne y\). On a domain of bundles of \(n\ge 2\) goods—any nontrivial rectangle in \( {\mathbb {R}} ^{n}\)—decisive utilities will be one-to-one mappings (injections) into \( {\mathbb {R}} \) and must therefore be discontinuous. Since all of the utility representations of the preference that underlies a decisive utility will be discontinuous, the preference itself will be discontinuous in the sense of Debreu (1954). These conclusions go hand-in-hand with the Tversky agenda: decisiveness entails discontinuity.

Injections from \( {\mathbb {R}} ^{n}\) into \( {\mathbb {R}} \) sparked controversy when they were discovered by Cantor in the 19th century: they imply that the cardinality of \( {\mathbb {R}} ^{n}\) is no greater than that of \( {\mathbb {R}} \) and their existence was a surprise even to Cantor. Yet agents that adopt simple decision rules can end up maximizing utilities that qualify as one of these elaborate constructions. The decisive utility functions in this section and Cantor’s injections are in fact closely related (Mandler 2020c).

Apart from Theorem 3, the domain in this section will be a rectangle in \( {\mathbb {R}} ^{n}\): we set \(X=\prod \nolimits _{i=1}^{n}X_{i}\), where each \(X_{i}\) is an interval in \( {\mathbb {R}} \). Since the \(X_{i}\) can be unbounded, \(X= {\mathbb {R}} ^{n}\) and \(X= {\mathbb {R}} _{+}^{n}\) are permitted. We start with a characteristic case of the decision criteria that psychologists model and that lead in the end to decisive utilities.

Example 6

(Threshold criteria) Each criterion will specify a threshold amount of one of the goods: if the threshold is quantity \(r\in X_{j}\) of good j, then the criterion classifies x as strictly superior to y if and only if \(x_{j}\ge r>y_{j}\) (footnote 12). Each threshold criterion is therefore binary and requires an agent only to choose a good j and a consumption level r that will serve as a suitable cutoff. Formally, \(R^{(j,r)}\) is defined by \(xR^{(j,r)}y\) if and only if any of the following possibilities obtains: (i) \(x_{j}\ge r>y_{j}\), (ii) \((x_{j}\ge r\) and \(y_{j}\ge r)\), or (iii) \((r>x_{j}\) and \(r>y_{j})\). Although threshold criteria are motivated as procedures, the lexicographic ordering of any S-sequence of threshold criteria must be rational: since each criterion is rational, Theorem 1 applies. \(\square \)
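The definition of \(R^{(j,r)}\) translates directly into code. The following is a minimal sketch under stated assumptions: bundles are Python tuples, goods are indexed from 0 rather than 1, and the function names and the finite list of thresholds are hypothetical choices made for illustration.

```python
def weakly_prefers(x, y, j, r):
    """x R^(j,r) y, spelled out as cases (i), (ii) and (iii) of the definition."""
    case_i = x[j] >= r > y[j]
    case_ii = x[j] >= r and y[j] >= r
    case_iii = r > x[j] and r > y[j]
    return case_i or case_ii or case_iii


def strictly_prefers(x, y, j, r):
    """x P^(j,r) y: x meets the threshold for good j and y does not."""
    return x[j] >= r > y[j]


def lexicographic_compare(x, y, thresholds):
    """Compare x and y using a finite list of (good, cutoff) pairs, in order.

    Returns 1 if the first criterion that separates the pair favors x,
    -1 if it favors y, and 0 if no listed criterion separates them.
    """
    for j, r in thresholds:
        if strictly_prefers(x, y, j, r):
            return 1
        if strictly_prefers(y, x, j, r):
            return -1
    return 0


# Hypothetical thresholds: first "at least 2 of good 0", then "at least 1 of good 1".
thresholds = [(0, 2.0), (1, 1.0)]
result = lexicographic_compare((3.0, 0.5), (2.5, 1.5), thresholds)
```

In the usage example both bundles meet the first threshold, so the second criterion decides the comparison in favor of the bundle with more of good 1.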

Definition 4

A S-sequence of criteria \(\left\langle R_{i}\right\rangle _{i\in S}\) is binary-dense if: (1) the number of criteria S is \(\omega \), (2) each \(R_{i}\) is binary, and (3) for any \( x,y\in X\) with \(x\ne y\) there is a \(i\in S\) such that \(xP_{i}y\) or \(yP_{i}x\).

To see that sequences of threshold criteria can be binary-dense, let \( Q_{i}\subset X_{i}\) be countable and dense in \(X_{i}\), e.g., the rational numbers in \(X_{i}\), and define \(\mathbf {Q}=\{(j,r)\in \{1,\ldots ,n\}\times {\mathbb {R}} :r\in Q_{j}\}\). Since \(\mathbf {Q}\) is countable, we can enumerate \(\mathbf {Q }\) via a bijection \(f:\omega \rightarrow \mathbf {Q}\), which defines \( \left\langle R^{f(i)}\right\rangle _{i\in \omega }\). Conditions (1) and (2) of Definition 4 are plainly satisfied. As for (3), for \(x\ne y\) there will be a j with \(x_{j}\ne y_{j}\) and hence a \( (j,r)\in \mathbf {Q}\) such that r lies in the interval \((\min [x_{j},y_{j}],\max [x_{j},y_{j}])\). So \(R^{(j,r)}\) will strictly order x and y.

The lexicographic ordering of a binary-dense sequence of criteria \( \left\langle R_{i}\right\rangle _{i\in \omega }\) will discriminate between every distinct pair of bundles in \( {\mathbb {R}} ^{n}\). An agent using such a \(\left\langle R_{i}\right\rangle _{i\in \omega }\) to choose from a finite set will therefore eventually eliminate all but one of the available alternatives (footnote 13). Binary-dense sequences thus offer another illustration of the capacity of binary criteria to make preference distinctions efficiently: only \(\omega \) binary criteria are needed to distinguish among all of the bundles in \( {\mathbb {R}}^{n}\), akin to the observation in Sect. 3 that only \( \left\lceil \log _{2}n\right\rceil \) binary criteria are needed to distinguish among all of the options in a finite set of n alternatives. Notwithstanding this discriminatory power, an agent that uses a binary-dense sequence of criteria needs to proceed through only finitely many criteria to choose between two bundles.
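The elimination procedure can be sketched for the domain \( {\mathbb {R}} _{+}^{n}\) with rational thresholds. This is an illustrative sketch rather than a construction taken from the paper: the diagonal enumeration of the positive rationals, the order in which goods are interleaved, the 0-based indexing of goods, and the function names are all assumptions made here.

```python
from fractions import Fraction
from itertools import count
from math import gcd


def positive_rationals():
    """Enumerate every positive rational exactly once, by increasing p + q."""
    for s in count(2):
        for p in range(1, s):
            q = s - p
            if gcd(p, q) == 1:  # skip non-reduced fractions to avoid repeats
                yield Fraction(p, q)


def threshold_sequence(n):
    """Enumerate all pairs (good j, rational cutoff r) as one omega-sequence."""
    for r in positive_rationals():
        for j in range(n):
            yield j, r


def choose(bundles):
    """Eliminate bundles criterion by criterion until one survives.

    Assumes a nonempty finite collection of bundles in R_+^n of equal length.
    Returns the surviving bundle and the number of criteria examined.
    """
    survivors = list(set(bundles))
    n = len(survivors[0])
    steps = 0
    for j, r in threshold_sequence(n):
        if len(survivors) == 1:
            return survivors[0], steps
        steps += 1
        passing = [x for x in survivors if x[j] >= r]
        if passing:
            # A bundle below the cutoff is eliminated only when some
            # other survivor meets the cutoff.
            survivors = passing


chosen, steps = choose([(1, 3), (2, 1), (2, 2)])
```

Because any two distinct bundles differ in some coordinate and a rational cutoff separates the two values, the loop stops after finitely many criteria for any finite set of distinct bundles.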

To arrive at a decisive utility function on \(X= {\mathbb {R}} ^{n}\), we need to show that lexicographic orderings of binary-dense sequences of criteria have utility representations. This fact follows from a broader principle: if each criterion in a \(\omega \)-sequence has finitely many equivalence classes, then the lexicographic ordering that results will have a utility representation. No upper bound on the number of criterion equivalence classes is needed.

Theorem 3

Let X be an arbitrary domain and S an ordinal no greater than \(\omega \). If each \(R_{i}\), \(i\in S\), has a finite number of equivalence classes, then the lexicographic ordering of \(\left\langle R_{i}\right\rangle _{i\in S}\) has a utility representation.
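For binary criteria with \(S=\omega \), one way to see why such a representation exists is the following sketch. It is an illustration under stated assumptions rather than the paper's proof: the indicators \(c_{i}\) and the base-3 weights are introduced here for convenience. Index the criteria by \(i=1,2,\ldots \), let \(c_{i}(x)=1\) if x lies in the upper equivalence class of \(R_{i}\) and \(c_{i}(x)=0\) otherwise (so \(c_{i}\) is identically 0 when \(R_{i}\) has a single class), and set

$$\begin{aligned} u(x)=\sum _{i=1}^{\infty }\frac{2c_{i}(x)}{3^{i}}. \end{aligned}$$

If k is the first index at which \(c_{k}(x)\ne c_{k}(y)\) and \(c_{k}(x)=1\), then

$$\begin{aligned} u(x)-u(y)\ge \frac{2}{3^{k}}-\sum _{i>k}\frac{2}{3^{i}}=\frac{2}{3^{k}}-\frac{1}{3^{k}}=\frac{1}{3^{k}}>0, \end{aligned}$$

so a strict lexicographic ranking of x over y implies \(u(x)>u(y)\), while alternatives that no criterion separates receive the same number. Base 3 is used instead of base 2 so that distinct digit sequences never sum to the same real number.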

The criteria in a binary-dense sequence can be monotone, that is, they can classify bundles with more of each good to be weakly superior to bundles with less of each good. A binary relation R on X is monotone if \(x\ge y\) implies xRy and strongly monotone if \(x\ge y\) and \( x\ne y\) imply xPy. Threshold criteria for example are monotone. A utility representation u of a strongly monotone preference satisfies the property that \(x\ge y\) and \(x\ne y\) imply \(u(x)>u(y)\), which we also call strong monotonicity.

Suppose \(\left\langle R_{i}\right\rangle _{i\in \omega }\) is a binary-dense sequence of monotone criteria. Since threshold criteria are monotone, such sequences exist. The lexicographic ordering \(\succsim \) of \(\left\langle R_{i}\right\rangle _{i\in \omega }\) is then strongly monotone. For if \( x\ge y\) and \(x\ne y, \) then \(xR_{i}y\) for every index i and by condition 3 of Definition 4 there is a first index k such that either \(xP_{k}y\) or \(yP_{k}x\). Since \(xR_{k}y, \) it must be that \(xP_{k}y\) and therefore \(x\succ y\). Theorem 3 thus leads to the following result.

Theorem 4

There are strongly monotone preferences that can be represented by binary-dense \(\omega \)-sequences of criteria. Any such preference has a decisive and strongly monotone utility representation which is therefore a one-to-one mapping of \( {\mathbb {R}} ^{n}\) (or one of its rectangular subsets) into \( {\mathbb {R}} \).

It’s worth reiterating that since the utilities in Theorem 4 are decisive, they cannot be continuous; indeed, they fail to be continuous on any nonempty open subset of X. The utilities are, however, continuous almost everywhere. In fact since even weakly monotone functions on \( {\mathbb {R}} ^{n}\) are almost-everywhere (Fréchet) differentiable (Chabrillac and Crouzeix 1987), the utilities in Theorem 4 are almost-everywhere differentiable as well.

The utilities in Theorem 4 illustrate that on the domain \( {\mathbb {R}} ^{n}\) the existence of preferences that do not have utility representations cannot be explained by there being ‘too many’ preference distinctions: any two distinct points in \( {\mathbb {R}} ^{n}\) are strictly ordered by the preferences in Theorem 4, and yet each point can be assigned a distinct utility number. Although this fact is in principle well known, textbook treatments of the failure of Debreuvian lexicographic preferences on \( {\mathbb {R}} _{+}^{n}\) (Example 1) to have a utility representation often attribute that failure to the fact that a multidimensional set of bundles would have to be mapped one-to-one into the real line (see, e.g., Mas-Colell et al. 1995, p. 46). Theorem 4 shows that this is not the source of trouble.

Cantor provided the first examples of injections from \( {\mathbb {R}} ^{n}\) into \( {\mathbb {R}} \) and at first glance his ingenious constructions would seem to be very different from the injections considered above. I show in Mandler (2020c) that the preferences that underlie his injections arise from a binary-dense sequence of threshold criteria and are fractal (‘self-similar’) as well.

5 A generalization of the Birkhoff–Debreu theorem

The most important result in utility theory, with origins that reach to Cantor (1895) and laid out explicitly in Birkhoff (1948) and Debreu (1954), states that a preference has a numerical representation—a utility representation in the language of economics—if and only if its domain has a countable order-dense subset. The usefulness of this theorem rests on a basic feature of real numbers: although a utility-representable preference can have a continuum of indifference classes, one for each real number, we can encode these preferences using a vastly smaller number—a countable number—of digits. The theorem is a classical result in lattice theory, but it stands as a specific, isolated fact: in a subject that normally deals with order-theoretic assumptions and conclusions, it establishes the existence of a function that takes real numbers as its range. This section provides an order-theoretic generalization of the Birkhoff–Debreu theorem by letting the order-dense subsets have cardinalities that need not be countable. To do so, we will first rephrase the definitions of a numerical representation and a \(\omega \)-sequence of binary criteria: they can both be seen as \(\omega \) -sequences of binary digits.

Let \(\succsim \) be a preference on X and \(u:X\rightarrow [0,1]\) a utility representation of \(\succsim \). Instead of viewing u as a map into \( {\mathbb {R}} \), we can view u as a map \(V:X\rightarrow \{0,1\}^{\omega }\) that assigns to each \(x\in X\) a \(\omega \)-sequence of 0’s and 1’s that, when read as a sequence of fractional digits, equals u(x) (footnote 14). With this interpretation of a utility function in mind, we can use the first coordinate where V(x) and V(y) differ to order X. For \(x,y\in X\), define \(V(x)\ge V(y)\) by

$$\begin{aligned}&\left( V(x)(i)\ge V(y)(i)\,\,\forall i\in \omega \right) {\text { or }}\\&\quad \left( \exists k\in \omega {\text { s.t. }}V(x)(k)>V(y)(k){\text { and }} V(x)(i)=V(y)(i)\,\,\forall i<k\right) , \end{aligned}\tag{N}$$

where V(x)(i) indicates the ith coordinate of the \(\omega \)-sequence V(x). Evidently V orders the elements of X as u or \(\succsim \) does. We say that \(\succsim \) has a binary-number representation if there exists a \( V:X\rightarrow \{0,1\}^{\omega }\) such that, for all \(x,y\in X\), \(x\succsim y\Leftrightarrow V(x)\ge V(y)\). For the goal of representing a preference, the difference between assigning to each \(x\in X\) a utility number in \( {\mathbb {R}} \) rather than a \(\omega \)-sequence of binary digits is essentially notational. Each is interchangeable with a \(\omega \)-sequence of binary criteria.
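The passage from utility numbers to digit sequences can be illustrated with exact arithmetic. The sketch below is hedged in two ways: it truncates the \(\omega \)-sequence to n digits, and it assumes utilities in \([0,1)\) that are dyadic rationals with at most n digits so that the truncation loses nothing. The function names are hypothetical.

```python
from fractions import Fraction


def binary_digits(u, n):
    """First n fractional binary digits of u, assuming 0 <= u < 1."""
    digits = []
    for _ in range(n):
        u *= 2
        bit = int(u >= 1)
        digits.append(bit)
        u -= bit
    return tuple(digits)


def first_difference_geq(vx, vy):
    """V(x) >= V(y) in the sense of (N), for digit tuples of equal length.

    Equal tuples count as >=; otherwise the first coordinate where the
    tuples differ decides.
    """
    for a, b in zip(vx, vy):
        if a != b:
            return a > b
    return True


# Hypothetical utilities: the sixteen multiples of 1/16 in [0, 1).
n = 4
V = {a: binary_digits(Fraction(a, 16), n) for a in range(16)}
agrees = all(
    first_difference_geq(V[a], V[b]) == (a >= b)
    for a in range(16)
    for b in range(16)
)
```

On this example the ordering by first differing digit coincides with the numerical ordering of the utilities, which is the sense in which the two representations differ only in notation.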

The following Debreu version of the Birkhoff result is best suited to our setting.

Theorem 5

(Birkhoff–Debreu) Let \(\succsim \) be a preference on X. Then there exists a utility representation or equivalently a binary-number representation of \(\succsim \) if and only if X contains a \(\succsim \) -order-dense subset that has cardinality no greater than \(\omega \).

To extend this result, we expand the definition of representation to allow the number of coordinates to reach beyond \(\omega \). Let S be an ordinal number. Call a function \(V:X\rightarrow \{0,1\}^{S}\) a S-numbering of X: for each \(x\in X\), V assigns a 0 or 1 to each of the ‘coordinates’ in S. A preference \(\succsim \) has a binary S-representation if and only if there is a S-numbering of X such that \(x\succsim y\Leftrightarrow V(x)\ge V(y)\) for all \(x,y\in X\), where \(V(x)\ge V(y)\) retains the definition given by (N) but with S replacing \(\omega \).

A binary S-representation generalizes the advantages of a utility representation. Instead of the \(\omega \) digits of a utility representation that can encode \(\left| 2^{\omega }\right| \) (a continuum) of indifference classes, the S digits of a S-representation can encode \(\left| 2^{S}\right| \) indifference classes.

If in the definitions above we replace \(\{0,1\}^{S}\) as the range of V with \(\{0,1,2\}^{S}, \) then we call the representation a ternary S-representation. We could equivalently define \(\succsim \) to have a binary (resp. ternary) S-representation if there is a S-sequence of criteria that represents \( \succsim \) such that each criterion has no more than two (resp. three) equivalence classes: we have switched to numbers to underscore the link to utility functions.

A cardinal number n is a strong limit if \(2^{k}<n\) for any cardinal \(k<n\). The simplest example is given by \(\omega \).

Theorem 6

Let \(\succsim \) be a preference relation on X and let S be a strong limit cardinal. Then there exists a binary S-representation of \( \succsim \) if and only if X contains a \(\succsim \)-order-dense subset that has cardinality no greater than S.

Since \(\omega \) is a strong limit cardinal and the existence of a binary \( \omega \)-representation is equivalent to the existence of a utility representation, Theorem 6 generalizes Theorem 5.

The ‘if’ half of Theorem 5 has historically been more important in economics. For this direction, we can go further.

Theorem 7

Let \(\succsim \) be a preference relation on X. If X contains a \(\succsim \)-order-dense subset that has cardinality no greater than the ordinal number S, then \(\succsim \) has a ternary S -representation.

The proof parallels Examples 2 and 3 but replaces the criteria that have ‘dividing’ alternatives that are dense in \(\succsim \) with 3-valued functions. The order-density assumption ensures that any two points that are \(\succ \)-ordered are assigned a different number by one of these functions.

Proof

Let D be the \(\succsim \)-order-dense subset and define, for each \(d\in D\) and \(x\in X\),

$$\begin{aligned} V^{d}(x)=\left\{ \begin{array}{ll} 2&{}\quad {\text {if }}x\succ d \\ 1&{}\quad {\text {if }}x\sim d \\ 0&{}\quad {\text {if }}d\succ x \end{array}.\right. \end{aligned}$$

Since the cardinality of D is no greater than S, there is an onto function \(f:S\rightarrow D\). Define a ternary S-numbering by \( V(x)(i)=V^{f(i)}(x) \) for all \(i\in S\) and \(x\in X\). To show that \( V(x)\ge V(y)\Leftrightarrow x\succsim y\), suppose first that \(V(x)\ge V(y)\). If \(V(x)(i)\ne V(y)(i)\) for some \(i\in S, \) then, since S is well-ordered, there is a first such i and for this i we must have \( V(x)(i)>V(y)(i)\). Hence \(x\succsim f(i)\succsim y\) with at least one strict preference, and therefore \(x\succ y\). To conclude that if \( V(x)(i)=V(y)(i)\) for all i then \(x\succsim y\), suppose to the contrary that \(y\succ x\). Then there is a \(d\in D\) such that \(y\succsim d\succsim x\) and since \(y\succ x\) one preference must be strict. Hence \( V^{d}(y)>V^{d}(x)\), a contradiction. Conversely, suppose \(x\succsim y\). Then for any \(d\in D\) we cannot have \(V^{d}(y)>V^{d}(x)\) since that would entail \(y\succ x\). Hence \(V^{d}(x)\ge V^{d}(y)\) for all \(d\in D\) and so \( V(x)\ge V(y)\). \(\square \)
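The construction in the proof can be run on a small finite domain. This is a minimal sketch under stated assumptions: the preference is specified through a hypothetical function `u` purely to generate an example, the list `dense` plays the role of D enumerated by f, and Python's built-in tuple comparison stands in for the ordering (N), since it also compares by the first differing coordinate.

```python
def ternary_numbering(x, dense, u):
    """V(x): one coordinate V^d(x) in {0, 1, 2} for each d in `dense`."""
    return tuple(
        2 if u(x) > u(d) else 1 if u(x) == u(d) else 0
        for d in dense
    )


# Hypothetical example: alternatives 0..9, with x weakly preferred to y
# exactly when x // 2 >= y // 2, so there are five indifference classes.
X = range(10)
u = lambda x: x // 2

# One alternative from each indifference class, listed in arbitrary order.
# This set is order-dense: if x is strictly preferred to y, the listed
# member of x's class lies weakly between them.
dense = [8, 0, 4, 2, 6]

V = {x: ternary_numbering(x, dense, u) for x in X}
represents = all((V[x] >= V[y]) == (u(x) >= u(y)) for x in X for y in X)
```

The order in which the elements of D are listed does not matter for the conclusion, which mirrors the proof's use of an arbitrary onto function f.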

Theorem 7 can be recast to fit the language of Theorem 5: under the assumptions of Theorem 7, there will be a binary 2S-representation of \(\succsim \), where 2S is a multiplication of ordinals. The proof is simply to use two 2-valued functions for each \(V^{d}\) employed above, one that assigns a higher number only to \(x\succ d\) and one that assigns a higher number only to \(x\succsim d \). Since \(2\omega =\omega \), this formulation strictly generalizes the ‘if’ half of Theorem 5.
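The replacement of each 3-valued function by two 2-valued functions can be sketched in the same finite setting. As before, the function `u`, the list `dense`, and the function name are illustrative assumptions, and tuple comparison in Python is used as a stand-in for the first-difference ordering.

```python
def binary_numbering(x, dense, u):
    """Two binary coordinates for each d in `dense`."""
    bits = []
    for d in dense:
        bits.append(int(u(x) > u(d)))   # higher number only when x is strictly better than d
        bits.append(int(u(x) >= u(d)))  # higher number only when x is weakly better than d
    return tuple(bits)


# Hypothetical example: alternatives 0..9 ranked by x // 2, with one
# representative of each indifference class forming the order-dense set.
X = range(10)
u = lambda x: x // 2
dense = [8, 0, 4, 2, 6]

B = {x: binary_numbering(x, dense, u) for x in X}
represents = all((B[x] >= B[y]) == (u(x) >= u(y)) for x in X for y in X)
```

Each ternary value 0, 1, 2 becomes the pair (0, 0), (0, 1), (1, 1), and these pairs are ordered in the same way as the values they replace, so the doubled sequence ranks alternatives exactly as the ternary one does.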

6 Lexicographic proofs of extension theorems

Lexicographic orderings of criteria offer simple ways to extend transitive but incomplete binary relations into rational binary relations. The first fruit of these methods is an easy proof of the classical result that transitive binary relations can be extended to rational antisymmetric orders (a binary relation R is antisymmetric if xRy and yRx imply \(x=y\)).

A binary relation \(\succsim _{e}\) on X is an extension of a binary relation \(\succsim \) on X if, for all \(x,y\in X\), \(x\succ y\Rightarrow x\succ _{e}y\), where \(\succ \) and \(\succ _{e}\) are the asymmetric parts of \(\succsim \) and \(\succsim _{e},\) respectively.

The following result generalizes Szpilrajn’s extension theorem (Szpilrajn 1930); it drops the assumptions that the assumed binary relation \( \succsim \) is reflexive or antisymmetric. But if \(\succsim \) happens to be antisymmetric, then a rational antisymmetric extension \(\succsim _{e}\) satisfies the property, \(x\succsim y\Rightarrow x\succsim _{e}y\), which is the more traditional definition of an extension.

Theorem 8

Any transitive binary relation has a rational antisymmetric extension.

Proof

Given the transitive binary relation \(\succsim \) on X, define for each \( x\in X\) the equivalence classes \(E_{1}^{x}=\{y\in X:y\succ x\}\), \( E_{2}^{x}=\{x\}\), and \(E_{3}^{x}=X\backslash (E_{1}^{x}\cup E_{2}^{x})\) and the criterion \(R^{x}\) by \(aR^{x}b\) if and only if (\(a\in E_{k}^{x}\), \(b\in E_{l}^{x}\), and \(k\le l\)). Let \(P^{x}\) denote the asymmetric part of \( R^{x}\). By the well-ordering theorem, there is a well-ordering \(\le \) of X. Let S be the ordinal number to which \((X,\le )\) is order isomorphic under some bijection \(f:X\rightarrow S\). We can then index the criteria by setting \(R_{f(x)}=R^{x}\) for \(x\in X\). Let \(\succsim _{e}\) be the lexicographic ordering of \(\left\langle R_{i}\right\rangle _{i\in S}\). By Theorem 1, \(\succsim _{e}\) is rational. For antisymmetry, notice that for any \(x,y\in X\) with \(x\ne y\) we have either \(xP^{x}y\) or \( yP^{x}x\). Hence there must be a first i such that \(xP_{i}y\) or \(yP_{i}x\). Hence \(x\ne y\) implies either \(x\succ _{e}y\) or \(y\succ _{e}x\).

For the extension, suppose \(a,b\in X\) satisfy \(a\succ b\) and hence \(a\ne b\). Fix some \(c\in X\). If \(b\in E_{2}^{c}\) (i.e., \(c=b\)), then \(a\in E_{1}^{c}\) and hence \(aP^{c}b\). If \(b\in E_{3}^{c},\) then not \(bP^{c}a\). If \(b\in E_{1}^{c},\) then \(b\succ c\) and, by the transitivity of \(\succ \), \( a\succ c\). So \(a\in E_{1}^{c}\) and again \(bP^{c}a\) does not obtain. Since \(aP^{b}b\) and not \(bP^{c}a\) for all \(c\in X\), \(a\succ _{e}b\). \(\square \)
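On a finite domain the proof becomes a short sorting routine. The following is a sketch under stated assumptions: X is a Python list whose order plays the role of the well-ordering, the relation is given as a set of pairs that is assumed to be transitive, and the function name and example data are hypothetical.

```python
def rational_antisymmetric_extension(X, weak):
    """Rank the alternatives in X from best to worst.

    X:    list of alternatives; its order serves as the well-ordering.
    weak: set of pairs (a, b) meaning a is weakly preferred to b,
          assumed to be transitive.
    """
    # Asymmetric part of the given relation.
    strict = {(a, b) for (a, b) in weak if (b, a) not in weak}

    def cls(a, x):
        """Index k of the class E_k^x that contains a."""
        if (a, x) in strict:
            return 1  # a is strictly better than x
        if a == x:
            return 2
        return 3

    def key(a):
        # One coordinate per criterion R^x, in the order given by X.
        # A lower class index is better, so sorting in ascending order
        # of keys lists the alternatives from best to worst.
        return tuple(cls(a, x) for x in X)

    return sorted(X, key=key)


# Hypothetical example: a > b > c (with a > c by transitivity), and d
# unrelated to the others. The well-ordering lists d first.
X = ["d", "c", "b", "a"]
weak = {("a", "b"), ("b", "c"), ("a", "c")}
ranking = rational_antisymmetric_extension(X, weak)
```

In the usage example the unrelated alternative d is ranked first only because it comes first in the chosen well-ordering; a different listing of X would in general yield a different extension.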

If following Szpilrajn we assume a \(\succsim \) that is reflexive and antisymmetric as well as transitive, the proof would be shorter still. The \(R^{x}\) could then be binary criteria with the equivalence classes \( E_{1}^{x}=\{y\in X:y\succsim x\}\) and \(E_{2}^{x}=X\backslash E_{1}^{x}\). The proof above also relies on a somewhat idiosyncratic lexicographic ordering (Theorem 1). See Mandler (2020a) for a proof of Szpilrajn’s original theorem that uses the standard lexicographic product.

Outside of Example 3, the proof above is the only place in the paper that uses the well-ordering theorem, and its presence explains why this proof is so short relative to standard approaches. Petri (2018) has adapted the well-ordering shortcut of this paper to provide a short proof of Hansson’s theorem.

The advantages of a lexicographic approach to extensions go beyond Szpilrajn’s theorem. In the proof of Theorem 8, we picked the most convenient \(R_{i}\)-indifference classes that would lead to a \( \succsim _{e}\) that strictly orders every distinct pair in X. But suppose we do not seek an extension that is antisymmetric, say because we seek indifference classes that can be given a traditional economic interpretation. A glance at the above proof shows that all that is needed for the lexicographic ordering of some \(\left\langle R_{i}\right\rangle _{i\in S}\) to extend \(\succsim \) is that \(\left\langle R_{i}\right\rangle _{i\in S}\) satisfies two conditions whenever \(a\succ b\): (i) there is a \( k\in S\) with \(aP_{k}b\) and (ii) not \(bP_{i}a\) for all \(i\in S\). To use just these two properties to build more economically natural extensions, suppose we wish to label alternatives x and y as indifferent if they have the same better-than and worse-than sets, since then they are behaviorally indistinguishable. So write \(x\approx y\) if

$$\begin{aligned} \{z\in X:z\succ x\}= & {} \{z\in X:z\succ y\}\\ \{z\in X:x\succ z\}= & {} \{z\in X:y\succ z\}. \end{aligned}$$

See Fishburn (1970) and Mandler (2009). To build an extension of \(\succsim \) that preserves this definition of indifference, set, for each \(x\in X\), \(E_{1}^{x}=\{z\in X:z\succ x\}\), \( E_{2}^{x}=\{z\in X:z\approx x\}\), and \(E_{3}^{x}=X\backslash (E_{1}^{x}\cup E_{2}^{x})\). The previous proof that the \(\succsim _{e}\) that results extends \(\succsim \) remains virtually unchanged, and \(a\approx b\) implies that for each x both a and b are always elements of the same \( E_{i}^{x} \) and hence \(a\sim _{e}b\).

Proposition 2

Any transitive binary relation \(\succsim \) has a rational extension \( \succsim _{e}\) such that, for all \(a,b\in X\), \(a\sim _{e}b\) if and only if \( a\approx b\).