Introduction: Roberts’ Claim

Consider a society with a non-empty finite set N of individuals i.Footnote 1 Let X be a domain of at least three social states, and \(\mathscr {R}(X)\) the set of all logically possible (complete and transitive) social weak preference orderings R on X. For each \(R \in \mathscr {R}(X)\), let P and I denote the corresponding strict preference and indifference relations.

Let \(\mathbb {R}^N\) denote the Euclidean space that consists of the Cartesian product of \(\# N\) copies of the real line \(\mathbb {R}\). A utility function profile \(\mathbf {u}^N\) is a mapping \(X \ni x \mapsto \mathbf {u}^N (x) \in \mathbb {R}^N\). Let \(\mathscr {U}^N\) denote the set of all utility function profiles on X. Following Sen (1970; 1977), a social welfare functional (or SWFL) f on a domain \(\mathscr {D} \subseteq \mathscr {U}^N\) is a mapping \(\mathscr {D} \ni \mathbf {u}^N \mapsto f( \mathbf {u}^N ) \in \mathscr {R}(X)\) that determines a (complete and transitive) social preference ordering \(R = f( \mathbf {u}^N )\) for each utility function profile in \(\mathscr {D}\). With some slight abuse of notation, we let \(R( \mathbf {u}^N )\), \(P( \mathbf {u}^N )\) and \(I( \mathbf {u}^N )\) denote respectively the weak preference, strict preference, and indifference relations associated with \(f( \mathbf {u}^N )\).

Given any non-empty subset \(A \subset X\), say that:

  1. two utility function profiles \(\mathbf {u}^N, { \tilde{\mathbf{u}}}^N \in \mathscr {U}^N\) are equal on A just in case one has \(\mathbf {u}^N (x) = \tilde{\mathbf {u}}^N (x)\) for all \(x \in A\);

  2. two social preference orderings R and \(\tilde{R}\) on X are equal on A just in case, for all \(y, z \in A\), one has \(y \ R \ z \Longleftrightarrow y \ \tilde{R} \ z\).

The main part of this paper considers social welfare functionals f which satisfy at least the first two of the following three axioms:

Unrestricted domain (U):

: The domain \(\mathscr {D}\) of f is the whole of \(\mathscr {U}^N\).

Independence (I):

: Given any non-empty subset \(A \subset X\), if the two utility function profiles \(\mathbf {u}^N, { \tilde{\mathbf{u}}}^N \in \mathscr {D}\) are equal on A, then the two associated social orderings \(f( \mathbf {u}^N ), f( { \tilde{\mathbf{u}}}^N ) \in \mathscr {R}(X)\) are also equal on A.

Pareto indifference (P\(^0\)):

: In case \(y, z \in X\) and \(\mathbf {u}^N \in \mathscr {D}\) satisfy \(\mathbf {u}^N (y) = \mathbf {u}^N (z)\), the associated social indifference relation satisfies \(y \ I( \mathbf {u}^N ) \ z\).

An important result in social choice theory with interpersonal comparisons is the “strong neutrality” or “welfarism” result due to D’Aspremont and Gevers (1977) and Sen (1977, p. 1553). This states that, when f satisfies all three conditions (U), (I), and (P\(^0\)), there exists a (complete and transitive) social welfare ordering \(R^*\) on \(\mathbb {R}^N\) with the property that, for all \(\mathbf {u}^N \in \mathscr {D}\) and all \(y, z \in X\), one has \(y \ R( \mathbf {u}^N ) \ z \iff \mathbf {u}^N (y) \ R^* \ \mathbf {u}^N (z)\). This result plays a prominent role among those appearing in the surveys by Sen (1984), Blackorby et al. (1984), D’Aspremont (1985), Mongin and d’Aspremont (1998), and Bossert and Weymark (2004). Both Sen (1977) and D’Aspremont (1985, p. 34) provide complete proofs.Footnote 2
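As a toy illustration of the welfarist form, the following Python sketch induces a social ordering on three states from a welfare function on utility vectors. The choice of the sum as welfare function, the state labels, and the helper names `weakly_prefers` and `ordering_on` are all hypothetical and are used only to show what conditions (I) and welfarism require in a finite example.

```python
from itertools import product

def weakly_prefers(profile, W, y, z):
    """y R z iff W(u(y)) >= W(u(z)); profile maps each state to a utility vector."""
    return W(profile[y]) >= W(profile[z])

def ordering_on(profile, W, A):
    """Restriction to the subset A of the social ordering induced by W."""
    return {(y, z): weakly_prefers(profile, W, y, z) for y, z in product(A, repeat=2)}

W_sum = sum  # assumed welfare function: the sum of the two utilities

# Two hypothetical profiles that are equal on A = {x, y} but differ at z.
u = {"x": (1.0, 2.0), "y": (2.0, 1.0), "z": (0.0, 0.0)}
u_tilde = {"x": (1.0, 2.0), "y": (2.0, 1.0), "z": (5.0, 5.0)}
A = ["x", "y"]

# Independence (I): the two induced orderings agree on A.
print(ordering_on(u, W_sum, A) == ordering_on(u_tilde, W_sum, A))
# The ranking of z against x depends only on the utility vectors at z and x.
print(weakly_prefers(u, W_sum, "z", "x"), weakly_prefers(u_tilde, W_sum, "z", "x"))
```

Any ordering built in this way from a single function of utility vectors satisfies (I) and (P\(^0\)) by construction; the substance of the welfarism result is the converse direction.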

While the Pareto indifference axiom (P\(^0\)) is appealing, the impossibility theorem in Arrow (1963) is the most prominent of many results that replace it with the following alternative:

Pareto (P):

: In case the two social states \(y, z \in X\) and the utility function profile \(\mathbf {u}^N \in \mathscr {D}\) satisfy \(\mathbf {u}^N (y) \gg \mathbf {u}^N (z)\), the associated strict preference relation \(P( \mathbf {u}^N )\) on X satisfies \(y \ P( \mathbf {u}^N ) \ z\).Footnote 3

Specifically, under the assumption that individuals’ utility functions are ordinally non-comparable, Arrow’s impossibility theorem states that (U), (I) and (P) together imply a dictatorship. To develop a theory general enough to cover this important case, Roberts (1980, p. 427) specifies an additional condition that can be restated as follows:Footnote 4

Weak continuity (WC):

: For all utility function profiles \(\mathbf {u}^N \in \mathscr {D}\) and all vectors \(\varvec{\epsilon }\in \mathbb {R}^N\) with \(\varvec{\epsilon }\gg \mathbf {0}\), there exists a utility function profile \(\mathbf {u}^N _{ \varvec{\epsilon }} \in \mathscr {D}\) satisfying \(\varvec{\epsilon }\gg \mathbf {u}^N (x) - \mathbf {u}^N _{ \varvec{\epsilon }} (x) \gg \mathbf {0}\) for all social states \(x \in X\) with the property that \(f( \mathbf {u}^N ) = f( \mathbf {u}^N _{ \varvec{\epsilon }} )\).

Then Roberts (1980, p. 428) claims the following:

Claim Suppose that the SWFL \(\mathscr {D} \ni \mathbf {u}^N \mapsto f( \mathbf {u}^N ) \in \mathscr {R}(X)\) satisfies (U), (I), (P), and (WC). Then there exists a continuous function \(\mathbb {R}^N \ni \mathbf{w} \mapsto W( \mathbf{w}) \in \mathbb {R}\), strictly increasing with an increase in all its arguments, with the property that for all utility function profiles \(\mathbf {u}^N \in \mathscr {D}\) and all social states \(y, z \in X\) one has

$$\begin{aligned} W( \mathbf {u}^N (y) ) > W( \mathbf {u}^N (z) ) \Longrightarrow y \ P( \mathbf {u}^N ) \ z \end{aligned}$$

This claim has come to be known as Roberts’ “weak neutrality” or “weak welfarism” theorem.Footnote 5 In many of the surveys mentioned above, it was cited as an alternative to the strong neutrality result of D’Aspremont and Gevers (1977) and Sen (1977, p. 1553). The unpublished results by Le Breton (1987) and by Bordes and Le Breton (1987) investigating Roberts’ theorem for restricted economic domains have since been amalgamated with related results that appear in Bordes et al. (2005).

Condition (WC), however, is too weak for the claim to hold. To show this, Section 2 provides a counter example which even satisfies the following familiar condition:

Strict Pareto (P\(^*\)):

: In case the two social states \(y, z \in X\) and the utility function profile \(\mathbf {u}^N \in \mathscr {D}\) satisfy \(\mathbf {u}^N (y) \geqq \mathbf {u}^N (z)\), the associated social preferences satisfy \(y \ R( \mathbf {u}^N ) \ z\), with \(y \ P( \mathbf {u}^N ) \ z\) unless \(\mathbf {u}^N (y) = \mathbf {u}^N (z)\).

The same example shows the error in Roberts’ attempt to prove his intermediate Lemma 6. Then Section 3 uses a modified form of the alternative “shift invariance” condition due to Roberts (1983, p. 74) himself in order to prove the crucial Lemma 6 in Roberts (1980). This establishes that a slight alteration to Claim 1 makes it valid.

Weak Continuity: A Counter Example

Definition of a Discontinuous Social Welfare Ordering

The following is an example of a society with two individuals and a strictly increasing and symmetric utilitarian welfare function

$$\begin{aligned} \mathbb {R}^2 \ni ( u_1, u_2 ) = \mathbf{u} \mapsto W( u_1, u_2 ) = W( \mathbf{u}) \in \mathbb {R}\end{aligned}$$
(1)

such that the induced SWFL defined on X by

$$\begin{aligned} a \ R \ b \Longleftrightarrow W( u_1 (a), u_2 (a) ) \ge W( u_1 (b), u_2 (b) ) \end{aligned}$$
(2)

satisfies conditions (U), (I), (P\(^*\)) and (WC). Yet the function W that we will define has a discontinuity at the origin \(\mathbf {0} = (0, 0)\) which implies a discontinuity in the preference ordering R on \(\mathbb {R}^2\) defined by (2). This implies that no continuous function W can satisfy Claim 1 in this example.

Indeed, first define the function \(\mathbb {R}^2 \ni ( v_1, v_2 ) = \mathbf {v} \mapsto w( \mathbf {v}) \in \mathbb {R}\) by

$$\begin{aligned} w( \mathbf {v}) := \min \{ v_1 + 2 v_2, 2 v_1 + v_2 \} \end{aligned}$$
(3)

Next, partition \(\mathbb {R}^2\) into the three subdomains \(S_1, S_2, S_3\) and then define the symmetric function \(S_i \ni \mathbf {v} \mapsto W( \mathbf {v}) \in \mathbb {R}\) for \(i = 1, 2, 3\) so that

$$\begin{aligned} W( \mathbf {v})&:= v_1 + v_2 \quad \text {on} \quad S_1 := \{ \mathbf {v} \in \mathbb {R}^2 \mid v_1 + v_2 \le 0 \} \end{aligned}$$
(4)
$$\begin{aligned} W( \mathbf {v})&:= 1 + w( \mathbf {v}) \quad \text {on} \quad S_2 := \{ \mathbf {v} \in \mathbb {R}^2 \mid w( \mathbf {v}) > 0 \} \end{aligned}$$
(5)
$$\begin{aligned} W( \mathbf {v})&:= \exp \dfrac{ ( v_1 + 2v_2 )( 2v_1 + v_2 ) }{ 3( v_1 + v_2 )} \quad \text {on} \quad S_3 := \mathbb {R}^2 {\setminus } ( S_1 \cup S_2 ) \end{aligned}$$
(6)

The indifference map corresponding to this function is illustrated in Fig. 1, which shows how:

  (i) the closed set \(S_1\) is separated from \(S_3\) by the common boundary line \(\{ \mathbf {v} \in \mathbb {R}^2 \mid v_1 + v_2 = 0 \}\), which constitutes the closed indifference curve \(W( \mathbf {v}) = 0\);

  (ii) the open set \(S_2\) is separated from \(S_3\) by the common boundary set \(\{ \mathbf {v} \in \mathbb {R}^2 \mid w( \mathbf {v}) = 0 \}\);

  (iii) (0, 0) belongs to \(S_1\), but is a common boundary point of all three subdomains.

Fig. 1 Eight level curves of the function \(( v_1, v_2 ) \mapsto W( v_1, v_2 )\)
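The piecewise definition in (3) to (6) can be transcribed directly. The following Python sketch is only an illustration: the function names `w`, `region` and `W` are labels chosen here, and the sample points are arbitrary.

```python
import math

def w(v1, v2):
    """Equation (3): the smaller of the two weighted sums."""
    return min(v1 + 2 * v2, 2 * v1 + v2)

def region(v1, v2):
    """Return 1, 2 or 3 according to which subdomain S_i contains (v1, v2)."""
    if v1 + v2 <= 0:
        return 1
    if w(v1, v2) > 0:
        return 2
    return 3

def W(v1, v2):
    """Equations (4) to (6)."""
    k = region(v1, v2)
    if k == 1:
        return v1 + v2
    if k == 2:
        return 1 + w(v1, v2)
    return math.exp((v1 + 2 * v2) * (2 * v1 + v2) / (3 * (v1 + v2)))

# One point in S_1, the origin, two points in S_3, and one point in S_2.
for v in [(-1.0, -1.0), (0.0, 0.0), (3.0, -2.0), (2.0, -1.0), (1.0, 1.0)]:
    print(v, region(*v), W(*v))
```

The three tests in `region` are mutually exclusive because \(( v_1 + 2 v_2 ) + ( 2 v_1 + v_2 ) = 3 ( v_1 + v_2 )\), so \(w( \mathbf {v}) > 0\) is impossible when \(v_1 + v_2 \le 0\).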

Consider first the intermediate subdomain \(S_3\), which consists of two wedges where \(v_1 + v_2 > 0\) and \(w( \mathbf {v}) \le 0\). On \(S_3\), it follows from (6) that

$$\begin{aligned} \ln W( \mathbf {v}) = \frac{ 2 ( v_1 )^2 + 5v_1v_2 + 2( v_2 )^2 }{ 3( v_1 + v_2 )} = \frac{ 2( v_1 + v_2 ) }{3} + \frac{ v_1 v_2 }{ 3( v_1 + v_2 )} \end{aligned}$$
(7)

In \(S_3\), because \(v_1 + v_2 > 0\), this is a differentiable function of \(\mathbf {v}\). Its partial derivative with respect to \(v_1\) is

$$\begin{aligned} \frac{ \partial }{ \partial v_1 } \ln W = \frac{2}{3} + \frac{ v_2 ( v_1 + v_2 ) - v_1 v_2 }{3( v_1 + v_2 )^2 } = \frac{2}{3} + \frac{ ( v_2 )^2 }{3( v_1 + v_2 )^2 } > 0 \end{aligned}$$

By symmetry, its partial derivative with respect to \(v_2\) is also positive. This shows that \(S_3 \ni \mathbf {v} \mapsto W( \mathbf {v}) \in \mathbb {R}\) is strictly increasing.

Furthermore, consider any sequence \(\mathbb {N}\ni n \mapsto \mathbf {v}^n = ( v^n_1, v^n_2 ) \in S_3\) which converges to a point \(\bar{ \mathbf {v}} = ( \bar{v}_1, \bar{v}_2 ) \ne (0, 0)\) that lies on the lower boundary of \(S_3\), where \(\bar{v}_1 + \bar{v}_2 = 0\). As \(n \rightarrow \infty\), one has \(v^n_1 v^n_2 \rightarrow - ( \bar{v}_1 )^2 < 0\) while \(v^n_1 + v^n_2 \rightarrow 0\) from above, so it follows from (7) that \(\ln W( v^n_1, v^n_2 ) \rightarrow - \infty\), implying that \(W( v^n_1, v^n_2 ) \rightarrow 0\). Also, it follows from (3) and (6) that \(W( \mathbf {v}) = 1\) when \(\mathbf {v}\) lies on the upper boundary of \(S_3\), where \(w( \mathbf {v}) = 0\). So the range set \(W( S_3 )\) must be the whole interval (0, 1].

Putting these results for the function \(\mathbb {R}^2 \ni \mathbf {v} \mapsto W( \mathbf {v})\) on the subdomain \(S_3\), together with obvious properties on the subdomains \(S_1\) and \(S_2\) where W is defined by (4) and (5), we see that the respective range sets are the three pairwise disjoint line intervals

$$\begin{aligned} W( S_1 ) = ( -\infty , 0], \quad W( S_2 ) = (1, \infty ), \quad W( S_3 ) = (0, 1] \end{aligned}$$

From this it is evident that the function \(\mathbf {v} \mapsto W( \mathbf {v})\) is strictly increasing throughout \(\mathbb {R}^2\), and continuous except at (0, 0), where \(W(0, 0) = 0\).

The three-dimensional graph of \(W( v_1, v_2 )\) has a boundary that includes the vertical “cliff” \(\{ (0, 0) \} \times [0, 1]\) in \(\mathbb {R}^3\) of height 1. The base of this cliff is at the origin (0, 0, 0), which is on the graph of W because \(W(0, 0) = 0\). Thus, the mapping \(\mathbb {R}^2 \ni \mathbf {v} \mapsto W( \mathbf {v})\) is discontinuous at \(\mathbf {v} = (0, 0)\). Not only is the function W discontinuous; so is the preference relation it induces. Indeed, for each fixed \(\mathbf {\bar{v}} \in S_3\) where \(W( \mathbf {\bar{v}}) \in (0, 1]\), the upper contour set \(\{ ( v_1, v_2 ) \in \mathbb {R}^2 \mid W( v_1, v_2 ) \ge W( \mathbf {\bar{v}}) \}\) is not closed because it excludes the point (0, 0) which is in its closure.
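The “cliff” at the origin can be seen numerically. The sketch below evaluates W along a family of curved paths inside \(S_3\) that approach (0, 0). The parametrization in `path_point` is an assumption introduced here for illustration: it sets \(v_1 + v_2 = s\) and \(v_1 = \sqrt{3 c s}\), so that by (7) one has \(\ln W = - c + \tfrac{1}{3} \sqrt{3 c s} + \tfrac{2}{3} s\), which tends to \(- c\) as \(s \downarrow 0\).

```python
import math

def W(v1, v2):
    """The welfare function defined by (3) to (6)."""
    s = v1 + v2
    a, b = v1 + 2 * v2, 2 * v1 + v2
    if s <= 0:
        return s
    if min(a, b) > 0:
        return 1 + min(a, b)
    return math.exp(a * b / (3 * s))

def path_point(s, c):
    """Hypothetical path in S_3: v1 + v2 = s and v1 = sqrt(3 c s), for small s > 0."""
    r = math.sqrt(3 * c * s)
    return (r, s - r)

def limit_along_path(c, s=1e-10):
    """Value of W very close to the origin along the path indexed by c > 0."""
    return W(*path_point(s, c))

# Different paths into the origin give different limiting values exp(-c) in (0, 1),
# whereas the value at the origin itself is 0.
for c in (0.1, math.log(2), 3.0):
    print(c, limit_along_path(c), math.exp(-c))
print(W(0.0, 0.0))
```

Since every value \(e^{-c}\) with \(c > 0\) arises as such a limit, and \(W = 1\) along the upper boundary of \(S_3\), the whole segment \(\{ (0, 0) \} \times [0, 1]\) belongs to the closure of the graph.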

Verifying Weak Continuity

Consider now the SWFL \(\mathscr {U}^2 \ni ( u_1, u_2 ) \mapsto f( u_1, u_2 )\) defined as in Section 2.1. Obviously, this induced SWFL satisfies conditions (U), (I) and (P\(^*\)). To verify condition (WC) it is enough to construct, for each fixed \(\varvec{\epsilon }= ( \epsilon _1, \epsilon _2 ) \in \mathbb {R}^2_{++}\), a transformation

$$\begin{aligned} \mathbb {R}^2 \ni \mathbf {v} \mapsto \varvec{\phi } ^{ \varvec{\epsilon }} ( \mathbf {v}) = ( \phi ^{ \varvec{\epsilon }}_1 ( \mathbf {v}), \phi ^{ \varvec{\epsilon }}_2 ( \mathbf {v}) ) \in \mathbb {R}^2 \end{aligned}$$

satisfying

$$\begin{aligned} (0, 0) \ll ( v_1, v_2 ) - \varvec{\phi } ^{ \varvec{\epsilon }} (\mathbf {v}) \ll \varvec{\epsilon }\end{aligned}$$

together with the requirement that \(\mathbf {v} \mapsto W( \varvec{\phi } ^{ \varvec{\epsilon }} ( \mathbf {v}))\) and \(\mathbf {v} \mapsto W( \mathbf {v})\) are ordinally equivalent welfare functions in the sense that there exists a strictly increasing transformation \(\mathbb {R}\ni W \mapsto \psi ^{ \varvec{\epsilon }} (W) \in \mathbb {R}\) for which \(W( \varvec{\phi } ^{ \varvec{\epsilon }} ( \mathbf {v})) \equiv \psi ^{ \varvec{\epsilon }} ( W( \mathbf {v}) )\).

In the following constructions, given any fixed \(\varvec{\epsilon }= ( \epsilon _1, \epsilon _2 ) \in \mathbb {R}^2_{++}\), let

$$\begin{aligned} \epsilon _* := \min \{ \epsilon _1, \epsilon _2 \} \in \mathbb {R}\ \text{ and } \ \mathbf{e} := (1, 1) \in \mathbb {R}^2 \end{aligned}$$
(8)

Then \(\epsilon _* > 0\), of course. The transformation will take the form

$$\begin{aligned} \mathbb {R}^2 \ni \mathbf {v} \mapsto \varvec{\phi } ^{ \varvec{\epsilon }}( \mathbf {v}) := \mathbf {v} - \lambda ^{ \varvec{\epsilon }} ( \mathbf {v}) \, \mathbf{e} \in \mathbb {R}^2 \end{aligned}$$
(9)

for a suitably constructed scalar function \(\mathbb {R}^2 \ni \mathbf {v} \mapsto \lambda ^{ \varvec{\epsilon }} ( \mathbf {v}) \in \mathbb {R}\) taking values in the open interval \((0, \epsilon _* )\).

Case 1: The simplest case is when

$$\begin{aligned} v_1 + v_2 \le 0 \ \text{ and } \text{ so } \ W( \mathbf {v}) = v_1 + v_2 \le 0 \end{aligned}$$
(10)

In this case, define \(\lambda ^{ \varvec{\epsilon }} ( \mathbf {v}) := \frac{1}{2} \, \epsilon _*\) for \(\epsilon _*\) given by (8). Then (9) implies that

$$\begin{aligned} \varvec{\phi } ^{ \varvec{\epsilon }}( \mathbf {v}) = ( v_1 - \tfrac{1}{2} \epsilon _*, v_2 - \tfrac{1}{2} \epsilon _* ) \end{aligned}$$

So (9) and (10) imply that \(\phi ^{ \varvec{\epsilon }}_1 ( \mathbf {v}) + \phi ^{ \varvec{\epsilon }}_2 ( \mathbf {v}) = v_1 + v_2 - \epsilon _* \le - \epsilon _* < 0\). Now, whenever \(v_1 + v_2 \le 0\), it follows that

$$\begin{aligned} W( \varvec{\phi } ^{ \varvec{\epsilon }} ( \mathbf {v})) = \phi ^{ \varvec{\epsilon }}_1 ( \mathbf {v}) + \phi ^{ \varvec{\epsilon }}_2 ( \mathbf {v}) = \psi ^{ \varvec{\epsilon }} ( W( \mathbf {v}) ) \end{aligned}$$

provided we define

$$\begin{aligned} \psi ^{ \varvec{\epsilon }} (W) := W - \epsilon _* \ \text{ for } \text{ all } \ W \le 0 \end{aligned}$$
(11)

Case 2: This case occurs when

$$\begin{aligned} w( \mathbf {v})> 0 \ \text{ and } \text{ so } \ W( \mathbf {v}) = 1 + w( \mathbf {v}) > 1 \end{aligned}$$
(12)

In this case, define

$$\begin{aligned} \lambda ^{ \varvec{\epsilon }} ( \mathbf {v}) := \tfrac{1}{6} \min \{ \epsilon _*, w( \mathbf {v}) \} \end{aligned}$$
(13)

Clearly, this definition implies that \(\lambda ^{ \varvec{\epsilon }} ( \mathbf {v}) \in (0, \epsilon _* )\). Also

$$\begin{aligned} \phi ^{ \varvec{\epsilon }}_1 ( \mathbf {v}) + 2 \phi ^{ \varvec{\epsilon }}_2 ( \mathbf {v})&= v_1 + 2 v_2 - 3 \lambda ^{ \varvec{\epsilon }} ( \mathbf {v}) \end{aligned}$$
(14)
$$\begin{aligned} \text{ and } \quad 2 \phi ^{ \varvec{\epsilon }}_1 ( \mathbf {v}) + \phi ^{ \varvec{\epsilon }}_2 ( \mathbf {v})&= 2 v_1 + v_2 - 3 \lambda ^{ \varvec{\epsilon }} ( \mathbf {v}) \end{aligned}$$
(15)

But \(w( \mathbf {v}) := \min \{ v_1 + 2 v_2, 2 v_1 + v_2 \}\) and (13) implies that \(\lambda ^{ \varvec{\epsilon }} ( \mathbf {v}) \le \frac{1}{6} w( \mathbf {v})\). So it follows from (14) and (15) that

$$\begin{aligned} \min \{ \phi ^{ \varvec{\epsilon }}_1 ( \mathbf {v}) + 2 \phi ^{ \varvec{\epsilon }}_2 ( \mathbf {v}), \, 2 \phi ^{ \varvec{\epsilon }}_1 ( \mathbf {v}) + \phi ^{ \varvec{\epsilon }}_2 ( \mathbf {v}) \} = w( \mathbf {v}) - 3 \lambda ^{ \varvec{\epsilon }} ( \mathbf {v}) \ge \tfrac{1}{2} \, w( \mathbf {v}) > 0 \end{aligned}$$
(16)

So the definition of \(\mathbb {R}^2 \ni \mathbf {v} \mapsto W( \mathbf {v}) \in \mathbb {R}\) in (5) and of \(\mathbb {R}^2 \ni \mathbf {v} \mapsto \lambda ^{ \varvec{\epsilon }} ( \mathbf {v}) \in \mathbb {R}\) in (13) imply, together with (16), that

$$\begin{aligned} 1 < W( \varvec{\phi } ^{ \varvec{\epsilon }} ( \mathbf {v}))&= 1 + \min \{ \phi ^{ \varvec{\epsilon }}_1 ( \mathbf {v}) + 2 \phi ^{ \varvec{\epsilon }}_2 ( \mathbf {v}), \, 2 \phi ^{ \varvec{\epsilon }}_1 ( \mathbf {v}) + \phi ^{ \varvec{\epsilon }}_2 ( \mathbf {v}) \} \\&= 1 + w( \mathbf {v}) - 3 \lambda ^{ \varvec{\epsilon }} ( \mathbf {v}) \\&= 1 + w( \mathbf {v}) - \tfrac{1}{2} \min \{ \epsilon _*, w( \mathbf {v}) \} \\&= \max \{ 1 + w( \mathbf {v}) - \tfrac{1}{2} \epsilon _*, 1 + \tfrac{1}{2} w( \mathbf {v}) \} \\&= \max \{ W( \mathbf {v}) - \tfrac{1}{2} \, \epsilon _*, \tfrac{1}{2} \, [ W( \mathbf {v}) + 1] \} \end{aligned}$$

It follows that \(W( \varvec{\phi } ^{ \varvec{\epsilon }} ( \mathbf {v})) = \psi ^{ \varvec{\epsilon }} ( W( \mathbf {v}) )\) provided that we define

$$\begin{aligned} \psi ^{ \varvec{\epsilon }} (W) := \max \{ W - \tfrac{1}{2} \, \epsilon _*, \tfrac{1}{2} \, (W + 1) \} \ \text{ for } \text{ all } \ W > 1 \end{aligned}$$
(17)

Case 3: This leaves the hardest third case, when

$$\begin{aligned} w( \mathbf {v}) \le 0 \ \text{ and } \text{ also } \ v_1 + v_2 > 0 \end{aligned}$$
(18)

In this case, the definition in (6) implies that \(0 < W( \mathbf {v}) \le 1\).

Fix any \(\mathbf {v} = ( v_1, v_2 ) \in \mathbb {R}^2\) satisfying (18). Then, given any \(\varvec{\epsilon }\in \mathbb {R}^2\) satisfying \(\varvec{\epsilon }\gg 0\), consider the non-empty open interval of \(\mathbb {R}\) defined by

$$\begin{aligned} I^{ \varvec{\epsilon }} ( \mathbf {v}) := (0, \min \{ \epsilon _*, \tfrac{1}{2} ( v_1 + v_2 ) \} ) = (0, \epsilon _*) \cap (0, \tfrac{1}{2} ( v_1 + v_2 )) \end{aligned}$$
(19)

Now consider the function g defined on the open interval in (19) by

$$\begin{aligned} I^{ \varvec{\epsilon }} ( \mathbf {v}) \ni \lambda \mapsto g( \lambda ) := W( \mathbf {v} - \lambda \, \mathbf{e}) \in \mathbb {R}\end{aligned}$$
(20)

In Section 2.1 we saw that W is strictly increasing as a function of two variables, so the function g is strictly decreasing. Also, when \(\lambda > 0\), it is evident that

$$\begin{aligned} w( \mathbf {v} - \lambda \, \mathbf{e}) < w( \mathbf {v}) \le 0 \end{aligned}$$
(21)

On the other hand, when \(\lambda < \tfrac{1}{2} ( v_1 + v_2 )\), because \(e_1 = e_2 = 1\), one has

$$\begin{aligned} ( v_1 - \lambda e_1 ) + ( v_2 - \lambda e_2 ) = v_1 + v_2 - 2 \lambda > 0 \end{aligned}$$
(22)

So for all \(\lambda \in I^{ \varvec{\epsilon }} ( \mathbf {v})\) the inequalities (21) and (22) imply that the 2-vector \(\mathbf {v} - \lambda \, \mathbf{e}\) satisfies (18). It follows that \(W( \mathbf {v} - \lambda \, \mathbf{e})\) is also defined by (6), implying that

$$\begin{aligned} g( \lambda ) = W( \mathbf {v} - \lambda \, \mathbf{e}) = \exp \dfrac{( v_1 + 2 v_2 - 3 \lambda ) \, (2 v_1 + v_2 - 3 \lambda )}{ 3( v_1 + v_2 - 2 \lambda )} \end{aligned}$$
(23)

Now let \(\mu\) denote a suitably chosen positive scalar constant which is independent of both \(\mathbf {v}\) and \(\varvec{\epsilon }\), and whose possible range will be specified later. For each \(\mathbf {v} \in \mathbb {R}^2\) that satisfies the inequalities (18), because g is strictly decreasing and positive, we can define \(\lambda ^{ \varvec{\epsilon }} ( \mathbf {v})\) implicitly as the unique value of \(\lambda\) that solves the equation

$$\begin{aligned} g( \lambda ) = W( \mathbf {v} - \lambda \, \mathbf{e}) = W( \mathbf {v}) \exp (- \mu \, \epsilon _* ) \end{aligned}$$
(24)

Then \(\lambda ^{ \varvec{\epsilon }} ( \mathbf {v})\) will be well defined and positive, with

$$\begin{aligned} W( \varvec{\phi } ^{ \varvec{\epsilon }} ( \mathbf {v}) ) = W( \mathbf {v}) \exp (- \mu \, \epsilon _* ) = \psi ^{ \varvec{\epsilon }} ( W( \mathbf {v}) ) < 1 \end{aligned}$$

where \(\psi ^{ \varvec{\epsilon }} (W) := W \exp (- \mu \, \epsilon _* ) \in (0, 1)\) whenever \(0 < W \le 1\).

It remains only to choose \(\mu > 0\) so that the corresponding solution to equation (24) exists in the open interval \(I^{ \varvec{\epsilon }} ( \mathbf {v})\) defined by (19) and so satisfies \(\lambda ^{ \varvec{\epsilon }} ( \mathbf {v}) < \epsilon _*\). Because we are assuming that the inequalities (18) hold, definition (6) implies that any \(\lambda ^{ \varvec{\epsilon }} ( \mathbf {v})\) satisfying (24) and (19) must be a value of \(\lambda\) which solves the equation

$$\begin{aligned} \frac{( v_1 + 2 v_2 - 3 \lambda ) \, (2 v_1 + v_2 - 3 \lambda )}{3 ( v_1 + v_2 - 2 \lambda )} = \frac{( v_1 + 2 v_2 ) \, (2 v_1 + v_2 )}{3 ( v_1 + v_2 )} - \mu \, \epsilon _* \end{aligned}$$

But \(v_1 + v_2> 2 \lambda > 0\) in the relevant interval of values of \(\lambda\), so we can clear fractions to obtain the quadratic equation \(q( \lambda ) = 0\), where

$$\begin{aligned} q( \lambda ):= & {} ( v_1 + v_2 ) \, ( v_1 + 2 v_2 - 3 \lambda ) \, (2 v_1 + v_2 - 3 \lambda ) \nonumber \\&- ( v_1 + v_2 - 2 \lambda ) \, ( v_1 + 2 v_2 ) \, (2 v_1 + v_2 ) \nonumber \\&+ 3 \mu \, \epsilon _* \, ( v_1 + v_2 ) \, ( v_1 + v_2 - 2 \lambda ) \end{aligned}$$
(25)

Now, note that when \(\lambda = 0\) the first two terms on the right-hand side of (25) cancel. Because \(v_1 + v_2 > 0\), it follows that

$$\begin{aligned} q(0) = 3 \mu \, \epsilon _* \, ( v_1 + v_2 ) ^2 > 0 \end{aligned}$$
(26)

In addition, simple calculation shows that

$$\begin{aligned} q \left( \tfrac{1}{2} ( v_1 + v_2 ) \right) = - \tfrac{1}{4} ( v_1 + v_2 ) \, ( v_1 - v_2 ) ^2 \end{aligned}$$
(27)

Finally, some much more tedious but still routine algebraic manipulation shows that

$$\begin{aligned} q( \epsilon _* ) = ( v_1 + v_2 ) \, \epsilon _* \, [(9 - 6 \mu ) \, \epsilon _* + (3 \mu - 5) \, ( v_1 + v_2 )] + 2 v_1 v_2 \, \epsilon _* \end{aligned}$$
(28)

Because \(v_1 + v_2 > 0\) but \(w( \mathbf {v}) \le 0\), it follows from (3) that \(v_1\) and \(v_2\) have opposite signs. In particular \(v_1 \ne v_2\) and \(v_1 v_2 < 0\). Then (27) implies that \(q( \frac{1}{2} ( v_1 + v_2 )) < 0\) whereas (28) implies that for any \(\mu\) satisfying \(9< 6 \mu < 10\) one has \(q( \epsilon _* ) < 0\). So choosing any fixed \(\mu \in \left( \frac{3}{2}, \frac{5}{3} \right)\) guarantees that, by the intermediate value theorem, the quadratic equation \(q( \lambda ) = 0\) has a unique root \(\lambda ^{ \varvec{\epsilon }} ( \mathbf {v})\) in the open interval \(I^{ \varvec{\epsilon }} ( \mathbf {v}) = (0, \min \{\, \epsilon _*, \frac{1}{2} ( v_1 + v_2 ) \,\})\) defined by (19).Footnote 6 In particular, for each \(\mathbf {v}\) satisfying (18), the root \(\lambda ^{ \varvec{\epsilon }} ( \mathbf {v})\) of \(q( \lambda ) = 0\) that we have found lies in \((0, \epsilon _* )\), as required.
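The three evaluations (26), (27) and (28) are polynomial identities in \(v_1, v_2, \mu , \epsilon _*\), so they can be spot-checked in exact rational arithmetic. The following Python sketch does this at randomly drawn rational points. It is a sanity check rather than a proof, and the helper names `q` and `identities_hold`, the sampling ranges, and the random seed are all choices made here for illustration.

```python
from fractions import Fraction
import random

def q(lam, v1, v2, mu, eps):
    """The quadratic defined in (25), with eps standing for epsilon_*."""
    s, a, b = v1 + v2, v1 + 2 * v2, 2 * v1 + v2
    return (s * (a - 3 * lam) * (b - 3 * lam)
            - (s - 2 * lam) * a * b
            + 3 * mu * eps * s * (s - 2 * lam))

def identities_hold(trials=500, seed=0):
    """Compare q with the right-hand sides of (26), (27), (28) at random rationals."""
    rng = random.Random(seed)

    def rand():
        return Fraction(rng.randint(-40, 40), rng.randint(1, 9))

    for _ in range(trials):
        v1, v2, mu, eps = rand(), rand(), rand(), rand()
        s = v1 + v2
        ok26 = q(0, v1, v2, mu, eps) == 3 * mu * eps * s ** 2
        ok27 = q(s / 2, v1, v2, mu, eps) == -Fraction(1, 4) * s * (v1 - v2) ** 2
        ok28 = q(eps, v1, v2, mu, eps) == (
            s * eps * ((9 - 6 * mu) * eps + (3 * mu - 5) * s) + 2 * v1 * v2 * eps)
        if not (ok26 and ok27 and ok28):
            return False
    return True

print(identities_hold())
# A sample point of S_3 with mu = 8/5 and epsilon_* = 3/10: q is negative at epsilon_*.
print(q(Fraction(3, 10), 2, -1, Fraction(8, 5), Fraction(3, 10)))
```

Because the identities hold for all real arguments, the random points need not lie in \(S_3\); only the sign conclusions drawn from (27) and (28) use the restrictions \(v_1 + v_2 > 0\), \(v_1 v_2 < 0\) and \(\frac{3}{2}< \mu < \frac{5}{3}\).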

Finally, putting all the three different cases together gives \(W( \varvec{\phi }^{ \varvec{\epsilon }}( \mathbf {v})) \equiv \psi ^{ \varvec{\epsilon }} ( W( \mathbf {v}))\), where \(\varvec{\phi }^{ \varvec{\epsilon }}( \mathbf {v}) = \mathbf {v} - \lambda ^{ \varvec{\epsilon }} ( \mathbf {v}) \, \mathbf{e}\), and then

$$\begin{aligned} \psi ^{ \varvec{\epsilon }} (W) := {\left\{ \begin{array}{ll} W - \epsilon _* < 0 &{} \text {if} \ W \le 0; \\ W \, \exp (- \mu \, \epsilon _* ) \in (0, 1) &{} \text {if} \ 0 < W \le 1; \\ \max \{\, W - \frac{1}{2} \, \epsilon _*, \frac{1}{2} \, (W + 1) \,\} > 1 &{} \text {if} \ W > 1. \end{array}\right. } \end{aligned}$$

In particular, \(\psi ^{ \varvec{\epsilon }}\) is strictly increasing in W for each fixed \(\varvec{\epsilon }\gg 0\). \(\square\)
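The three-case construction can also be checked numerically. The sketch below computes \(\lambda ^{ \varvec{\epsilon }} ( \mathbf {v})\) as in Cases 1 to 3, finding the Case 3 root of the quadratic (25) by bisection, and compares \(W( \mathbf {v} - \lambda ^{ \varvec{\epsilon }} ( \mathbf {v}) \, \mathbf{e})\) with \(\psi ^{ \varvec{\epsilon }} ( W( \mathbf {v}))\), taking \(\psi ^{ \varvec{\epsilon }}\) from (11), (17) and the Case 3 definition. The value `MU = 1.6`, the use of bisection, the function names, and the test points are assumptions made here for illustration.

```python
import math

MU = 1.6  # assumed fixed value in the admissible range (3/2, 5/3)

def w(v1, v2):
    return min(v1 + 2 * v2, 2 * v1 + v2)

def W(v1, v2):
    """The welfare function defined by (3) to (6)."""
    s = v1 + v2
    if s <= 0:
        return s
    if w(v1, v2) > 0:
        return 1 + w(v1, v2)
    return math.exp((v1 + 2 * v2) * (2 * v1 + v2) / (3 * s))

def q(lam, v1, v2, eps_star):
    """The quadratic defined in (25)."""
    s, a, b = v1 + v2, v1 + 2 * v2, 2 * v1 + v2
    return (s * (a - 3 * lam) * (b - 3 * lam) - (s - 2 * lam) * a * b
            + 3 * MU * eps_star * s * (s - 2 * lam))

def lam_eps(v1, v2, eps):
    """The scalar shift lambda^eps(v) in each of the three cases."""
    eps_star = min(eps)
    s = v1 + v2
    if s <= 0:                                   # Case 1
        return 0.5 * eps_star
    if w(v1, v2) > 0:                            # Case 2, equation (13)
        return min(eps_star, w(v1, v2)) / 6
    lo, hi = 0.0, min(eps_star, 0.5 * s)         # Case 3: q(lo) > 0 > q(hi)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if q(mid, v1, v2, eps_star) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def psi(Wval, eps):
    """The transformation psi^eps from (11), Case 3, and (17)."""
    eps_star = min(eps)
    if Wval <= 0:
        return Wval - eps_star
    if Wval <= 1:
        return Wval * math.exp(-MU * eps_star)
    return max(Wval - 0.5 * eps_star, 0.5 * (Wval + 1))

eps = (0.3, 0.5)
for v in [(-1.0, 0.5), (0.0, 0.0), (1.0, 1.0), (2.0, -1.0), (1.0, -0.9)]:
    lam = lam_eps(*v, eps)
    print(v, lam, W(v[0] - lam, v[1] - lam), psi(W(*v), eps))
```

Bisecting on q rather than on g avoids evaluating the exponent in (23) near \(\lambda = \frac{1}{2} ( v_1 + v_2 )\), where its denominator vanishes.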

Diagnosis

Next, to see where Roberts’ proof erred, we introduce a definition that incorporates some more notation from Roberts (1980, pp. 425–426).

Definition 1

Given the SWFL f, the strict preference relation \(\succ\) on utility vectors in \(\mathbb {R}^N\) is defined so that \(\mathbf{a} \succ \mathbf{b}\) just in case there exist a utility function profile \(\mathbf {u}^N \in \mathscr {D}\) and two social states \(x, y \in X\) with \(x \ P( \mathbf {u}^N ) \ y\) such that \(\mathbf{a} \gg \mathbf {u}^N (x)\) and \(\mathbf {u}^N (y) \gg \mathbf{b}\). Then, for each \(\mathbf {v}^* \in \mathbb {R}^N\), the three sets \(L( \mathbf {v}^* )\), \(M( \mathbf {v}^* )\) and \(N( \mathbf {v}^* )\) are defined respectively by

$$\begin{aligned} L( \mathbf {v}^* )&:= \{ \mathbf {v} \in \mathbb {R}^N \mid \mathbf {v}^* \succ \mathbf {v} \} \end{aligned}$$
(29)
$$\begin{aligned} M( \mathbf {v}^* )&:= \{ \mathbf {v} \in \mathbb {R}^N \mid \mathbf {v} \succ \mathbf {v}^* \} \end{aligned}$$
(30)
$$\begin{aligned} N( \mathbf {v}^* )&:= \mathbb {R}^N {\setminus } [ L( \mathbf {v}^* ) \cup M( \mathbf {v}^* ) ] \end{aligned}$$
(31)

Now, in the above example the set \(N( \mathbf {0})\) is equal to the middle region where \(W( \mathbf {v}) \in (0, 1]\). Note too that, although \(\mathbf {v} \in N( \mathbf {0})\) whenever \(W( \mathbf {v}) \in (0, 1]\), one will have \(\mathbf {v} - \varvec{\eta } \succ - \varvec{\eta }'\) whenever \(\varvec{\eta }, \varvec{\eta }' \gg \mathbf {0}\) with \(\varvec{\eta }\) small enough that \(\eta _1 + \eta _2 \le v_1 + v_2\), which ensures that \(W( \mathbf {v} - \varvec{\eta }) \ge 0\). This contradicts Roberts’ claim, in the course of trying to prove Lemma 6, that: “...as \(\mathbf {v} + \varvec{\gamma } \in N( \mathbf {v}^* )\), [condition] (WC) ensures that \(\mathbf {v} + \varvec{\gamma } - \varvec{\eta } _3 \in N( \mathbf {v}^* - \varvec{\eta } _4 )\) for some \(\varvec{\epsilon }\gg \varvec{\eta } _3, \varvec{\eta } _4 \gg \mathbf {0}\) where \(\varvec{\epsilon }\) is subject to choice.”

A new sufficient condition

Statement of pairwise continuity

Roberts (1983, p. 74) later introduced a shift invariance condition which can be slightly restated as follows:

Condition (SI):

: Given any \(\varvec{\epsilon }\in \mathbb {R}^N_{++}\), for all profiles \(\mathbf {u}^N \in \mathscr {D}\), there exists an \(\varvec{\epsilon }' \in \mathbb {R}^N_{++}\) and a profile \({ \tilde{\mathbf{u}}}^N \in \mathscr {D}\) such that \(f( \mathbf {u}^N ) = f({ \tilde{\mathbf{u}}}^N )\) and, for all \(x \in X\), one has \(\varvec{\epsilon }\gg \mathbf {u}^N (x) - { \tilde{\mathbf{u}}}^N (x) \gg \varvec{\epsilon }'\).

As he states in a footnote: “Shift invariance is slightly stronger than ...(WC). ...The strengthening allows one to deal with problems that are akin to the existence of poles in a consumer’s indifference map ....”Footnote 7 However, when proving his Lemma A.5, it seems that Roberts (1983, p. 90) in the end reverses the order of the quantified statements “for all profiles \(\mathbf {u}^N \in \mathscr {U}^N\)” and “there exists an \(\varvec{\epsilon }' \in \mathbb {R}^N_{++}\)” and actually uses the following uniform shift invariance assumption:

Condition (USI):

: Given any \(\varvec{\epsilon }\in \mathbb {R}^N_{++}\), there exists an \(\varvec{\epsilon }' \in \mathbb {R}^N_{++}\) for which, for all profiles \(\mathbf {u}^N \in \mathscr {D}\), there exists a profile \({ \tilde{\mathbf{u}}}^N \in \mathscr {D}\) such that \(f( \mathbf {u}^N ) = f({ \tilde{\mathbf{u}}}^N )\) and, for all \(x \in X\), one has \(\varvec{\epsilon }\gg \mathbf {u}^N (x) - { \tilde{\mathbf{u}}}^N (x) \gg \varvec{\epsilon }'\).

Instead of (WC) or (SI), I shall use the following pairwise continuity assumption which weakens (USI):

Condition (PC):

: Given any \(\varvec{\epsilon }\in \mathbb {R}^N_{++}\), there exists an \(\varvec{\epsilon }' \in \mathbb {R}^N_{++}\) for which, for all profiles \(\mathbf {u}^N \in \mathscr {D}\), given any pair \(x, y \in X\) satisfying \(x \ P( \mathbf {u}^N ) \ y\), there exists a profile \({ \tilde{\mathbf{u}}}^N \in \mathscr {D}\) with \({ \tilde{\mathbf{u}}}^N (x) \ll \mathbf {u}^N (x) - \varvec{\epsilon }'\) and \({ \tilde{\mathbf{u}}}^N (y) \gg \mathbf {u}^N (y) - \varvec{\epsilon }\) such that \(x \ P( { \tilde{\mathbf{u}}}^N ) \ y\).

Like shift invariance, condition (PC) strengthens weak continuity because the same strictly positive vector \(\varvec{\epsilon }'\) must work simultaneously for all \(x, y \in X\). Like uniform shift invariance, it also strengthens shift invariance because the same strictly positive vector \(\varvec{\epsilon }'\) must also work for all profiles \(\mathbf {u}^N \in \mathscr {D}\). On the other hand, condition (PC) weakens even condition (WC), as well as condition (USI), to the extent that the profile \({ \tilde{\mathbf{u}}}^N\) can depend on the pair \(x, y \in X\), and also because it requires only one-way strict inequalities which, moreover, are different for x and y.

Sufficiency of pairwise continuity

With condition (PC) replacing (WC), Lemma 6 of Roberts (1980) will be proved via the following two separate lemmas that involve definitions (30) and (31):

Lemma 1

If f satisfies (U), (I) and (P), then for all \(\mathbf {v}, \mathbf {v}', \varvec{\eta }, \varvec{\eta } ' \in \mathbb {R}^N\) with \(\varvec{\eta }, \varvec{\eta } ' \gg \mathbf {0}\), one has \(\mathbf {v} \in N( \mathbf {v}') \Longrightarrow \mathbf {v} + \varvec{\eta } \in M( \mathbf {v}' - \varvec{\eta } ')\).Footnote 8

Proof

Suppose that x, y, z are three distinct elements of X. By condition (U), there exists a profile \(\mathbf {u}^N \in \mathscr {D}\) such that

$$\begin{aligned} \mathbf {v} + \varvec{\eta } \gg \mathbf {u}^N (x) \gg \mathbf {u}^N (y) \gg \mathbf {v} \ \text {and} \ \mathbf {v}' \gg \mathbf {u}^N (z) \gg \mathbf {v}' - \varvec{\eta } ' \end{aligned}$$
(32)

By Definition 1, if \(z \ P( \mathbf {u}^N ) \ y\) were true, it would imply that \(\mathbf {v}' \succ \mathbf {v}\). So \(\mathbf {v} \in N(\mathbf {v}') \Longrightarrow y \ R( \mathbf {u}^N ) \ z\). Then the Pareto condition (P) implies that \(x \ P( \mathbf {u}^N ) \ y\), and so \(\mathbf {v} \in N( \mathbf {v}') \Longrightarrow x \ P( \mathbf {u}^N ) \ z\) because \(R( \mathbf {u}^N )\) is transitive. From (32) it follows that \(\mathbf {v} \in N( \mathbf {v}') \Longrightarrow \mathbf {v} + \varvec{\eta } \succ \mathbf {v}' - \varvec{\eta } '\).

Part (b) of the following Lemma is a minor restatement of the conclusion of Lemma 6 in Roberts (1980):

Lemma 2

If f satisfies (U), (I), (P) and (PC), then:

  (a) if \(\varvec{\epsilon }, \mathbf {v}, \mathbf {v}'\) in \(\mathbb {R}^N\) satisfy \(\varvec{\epsilon }\gg \mathbf {0}\) as well as \(\mathbf {v} + \varvec{\eta } \in M( \mathbf {v}' + \varvec{\epsilon })\) for all \(\varvec{\eta } \gg \mathbf {0}\), then \(\mathbf {v} \in M( \mathbf {v}')\);

  (b) for all \(\mathbf {v}, \mathbf {v}^*\) in \(\mathbb {R}^N\) that satisfy \(\mathbf {v} \in N( \mathbf {v}^* )\), there is no \(\varvec{\gamma }\in \mathbb {R}^N_{++}\) such that \(\mathbf {v} + \varvec{\gamma }\in N( \mathbf {v}^* )\).

Proof

(a) Given \(\varvec{\epsilon }\gg \mathbf {0}\), let \(\varvec{\epsilon }' \gg \mathbf {0}\) be specified as in the statement of condition (PC). Choose \(\varvec{\eta } \gg \mathbf {0}\) so that \(\varvec{\eta } \ll \varvec{\epsilon }'\). Because \(\mathbf {v} + \varvec{\eta } \in M( \mathbf {v}' + \varvec{\epsilon })\), condition (U) and Definition 1 imply that there exist \(\mathbf {u}^N \in \mathscr {D}\) and \(x, y \in X\) such that \(x \ P( \mathbf {u}^N ) \ y\) while

$$\begin{aligned} \mathbf {v} + \varvec{\eta } \gg \mathbf {u}^N (x) \ \text {and} \ \mathbf {u}^N (y) \gg \mathbf {v}' + \varvec{\epsilon }\end{aligned}$$
(33)

By condition (PC), there exists \({ \tilde{\mathbf{u}}}^N \in \mathscr {D}\) such that \(x \ P( { \tilde{\mathbf{u}}}^N ) \ y\) while

$$\begin{aligned} {\tilde{\mathbf{u}}}^N (x) \ll \mathbf {u}^N (x) - \varvec{\epsilon }' \ \text {and} \ { \tilde{\mathbf{u}}}^N (y) \gg \mathbf {u}^N (y) - \varvec{\epsilon }\end{aligned}$$
(34)

Then (33) and (34) together imply that \({ \tilde{\mathbf{u}}}^N (x) \ll \mathbf {v} + \varvec{\eta } - \varvec{\epsilon }' \ll \mathbf {v}\), because of the choice of \(\varvec{\eta }\); they also imply that \({ \tilde{\mathbf{u}}}^N (y) \gg \mathbf {u}^N (y) - \varvec{\epsilon } \gg \mathbf {v}'\). Because \(x \ P( { \tilde{\mathbf{u}}}^N ) \ y\), it follows from Definition 1 that \(\mathbf {v} \succ \mathbf {v}'\), that is, \(\mathbf {v} \in M( \mathbf {v}')\).

(b) Suppose, on the contrary, that \(\mathbf {v} \in N( \mathbf {v}^* )\) and that \(\mathbf {v} + \varvec{\gamma }\in N( \mathbf {v}^* )\) for some \(\varvec{\gamma }\in \mathbb {R}^N_{++}\). Definition (31) implies that \(\mathbf {v}^* \in N( \mathbf {v} + \varvec{\gamma })\). Choose any \(\varvec{\gamma }' \gg \mathbf {0}\) satisfying \(\varvec{\gamma }' \ll \varvec{\gamma }\). Now Lemma 1 implies that \(\mathbf {v}^* + \varvec{\eta } \in M( \mathbf {v} + \varvec{\gamma }')\) for all \(\varvec{\eta } \gg \mathbf {0}\). So part (a) implies that \(\mathbf {v}^* \in M( \mathbf {v})\). In particular, \(\mathbf {v} \not \in N( \mathbf {v}^* )\), which contradicts the hypothesis that \(\mathbf {v} \in N( \mathbf {v}^* )\).
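
The appeal to Lemma 1 in part (b) can be spelled out as follows. This is only a sketch, which assumes that Lemma 1 states that \(\mathbf {v} \in N( \mathbf {v}')\) implies \(\mathbf {v} + \varvec{\eta } \succ \mathbf {v}' - \varvec{\eta } '\) for all \(\varvec{\eta }, \varvec{\eta } ' \gg \mathbf {0}\), as the last line of its proof suggests. The particular choice \(\varvec{\eta } ' := \varvec{\gamma }- \varvec{\gamma }'\) is introduced here purely for illustration. Because \(\varvec{\gamma }' \ll \varvec{\gamma }\), one has \(\varvec{\eta } ' \gg \mathbf {0}\), and so for every \(\varvec{\eta } \gg \mathbf {0}\):

$$\begin{aligned} \mathbf {v}^* \in N( \mathbf {v} + \varvec{\gamma }) \ \Longrightarrow \ \mathbf {v}^* + \varvec{\eta } \succ ( \mathbf {v} + \varvec{\gamma }) - ( \varvec{\gamma }- \varvec{\gamma }' ) = \mathbf {v} + \varvec{\gamma }' \end{aligned}$$

Part (a) then applies with \(\mathbf {v}^*\) in the role of \(\mathbf {v}\), with \(\mathbf {v}\) in the role of \(\mathbf {v}'\), and with \(\varvec{\epsilon }= \varvec{\gamma }'\).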

Invariance conditions and pairwise continuity

Just as with Roberts’ (WC) and (SI) conditions, condition (USI), and hence (PC), is certainly satisfied if f is invariant under a set of transformations of individual utility functions large enough to include all shifts of the form \(\tilde{u}_i (x) \equiv \alpha + u_i (x)\) (for all \(i \in N\) and \(x \in X\)) with \(\alpha \in \mathbb {R}\) independent of i. Of the six invariance classes \(\Phi\) presented on p. 423 of Roberts (1980), this is true of the first five, namely (ONC), (CNC), (OLC), (CUC), and (CFC).

It remains to discuss the sixth cardinal ratio-scale or (CRS) class, as well as the broader non-comparable ratio-scale or (NRS) invariance class that includes all transformations of the form \(\tilde{u}_i (x) \equiv \beta _i \, u_i (x)\) (for all \(i \in N\) and \(x \in X\)) with \(\beta _i > 0\) for each individual \(i \in N\). Indeed, since the unrestricted domain condition (U) allows each utility function to have both positive and negative values, there are difficulties with the key condition (PC) in this case.Footnote 9

When Roberts (1980, p. 423) discusses the (CRS) class, he states that “For simplicity, it will be assumed that welfares are always strictly positive.” This, of course, violates condition (U), but almost all the relevant results are easily restated for any restricted domain \(\mathscr {D} \subset \mathscr {U}^N\) large enough to include all profiles of utility functions whose values are strictly positive. Nevertheless, the key pairwise continuity condition (PC) is violated because, no matter how small \(\varvec{\epsilon }' \in \mathbb {R}^N_{++}\) may be, for any \(x \in X\) there will always be a strictly positive-valued profile \(\mathbf {u}^N \in \mathscr {D}\) such that \(\mathbf {u}^N (x) \ll \varvec{\epsilon }'\); this makes it impossible to find any profile \({ \tilde{\mathbf{u}}}^N\) of strictly positive-valued utility functions that satisfies \({ \tilde{\mathbf{u}}}^N (x) \ll \mathbf {u}^N (x) - \varvec{\epsilon }'\).
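
A concrete instance may make this failure easier to see. It is offered only as an illustrative sketch, and the factor \(\tfrac{1}{2}\) is an arbitrary choice made here. Fix any \(\varvec{\epsilon }' \in \mathbb {R}^N_{++}\) and any \(x \in X\), and take a strictly positive-valued profile \(\mathbf {u}^N \in \mathscr {D}\) with \(\mathbf {u}^N (x) = \tfrac{1}{2} \varvec{\epsilon }'\). Any profile \({ \tilde{\mathbf{u}}}^N\) meeting the first inequality in (34) would then have to satisfy

$$\begin{aligned} { \tilde{\mathbf{u}}}^N (x) \ll \mathbf {u}^N (x) - \varvec{\epsilon }' = - \tfrac{1}{2} \varvec{\epsilon }' \ll \mathbf {0} \end{aligned}$$

so every component of \({ \tilde{\mathbf{u}}}^N (x)\) would be strictly negative, and \({ \tilde{\mathbf{u}}}^N\) could not be a profile of strictly positive-valued utility functions.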

When utilities are restricted to have strictly positive values throughout X, however, one can work instead with \(\ln u_i (x)\) as a transformed utility function. Then ratio-scale invariance for utilities is equivalent to invariance under additive shifts of their logarithms. Moreover, condition (PC) can be restated for these transformed utility functions. A similar trick works when utility functions are restricted to have strictly negative values; one works instead with \(- \ln [- u_i (x) ]\) as a transformed utility function. The (CRS) and (NRS) invariance classes present problems only when utilities are allowed to change signs, or to become zero.
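
The equivalence between ratio-scale invariance and additive shifts of logarithms can be checked directly. In the following sketch, the symbol \(\alpha _i := \ln \beta _i\) is notation introduced here for illustration only. For any \(\beta _i > 0\), the first line applies when utilities are strictly positive, and the second when they are strictly negative:

$$\begin{aligned} \tilde{u}_i (x) \equiv \beta _i \, u_i (x) \ &\Longleftrightarrow \ \ln \tilde{u}_i (x) \equiv \alpha _i + \ln u_i (x) \\ \tilde{u}_i (x) \equiv \beta _i \, u_i (x) \ &\Longleftrightarrow \ - \ln [ - \tilde{u}_i (x) ] \equiv - \alpha _i + \left( - \ln [ - u_i (x) ] \right) \end{aligned}$$

As \(\beta _i\) ranges over all positive reals, \(\alpha _i\) ranges over the whole of \(\mathbb {R}\). Assuming that the (CRS) class uses a scaling factor \(\beta _i = \beta\) common to all individuals, the resulting shift \(\alpha = \ln \beta\) is independent of i, which matches the form of the shifts \(\alpha + u_i (x)\) discussed earlier in this section.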

Conclusion

The weak neutrality or welfarism theorem due to Roberts (1980) is indeed “both important and useful” (p. 428). The minor errors in its statement and in the proof of the key Lemma 6 are not very difficult to correct by replacing the weak continuity condition (WC) with the new pairwise continuity condition (PC) stated here in Section 3.1.

An open question is whether the closely related Theorem 1 of Roberts (1983) holds under shift invariance (SI) instead of uniform shift invariance (USI), which is stronger than (PC). However, even (USI) is weak enough that having to impose it instead of (WC) or (SI) would do little to detract from the significance or wide applicability of Roberts’ theorem.

Only in the case of ratio-scale measurability of utilities that can change sign does Roberts’ theorem seem inapplicable. With zero as the utility of the infinite set of potential people who, in each possible social state, never come into existence, this happens to be exactly the setting that we consider in Chichilnisky et al. (2020). But in that paper we extend the kind of original position due to Vickrey (1945) and Harsanyi (1953; 1955). Then we use an “impartial benefactor” argument to derive a utilitarian SWFL more directly.