Economists generally speak in terms of preference or utility. [In modern economics, utility is most commonly used as a representation of preference, in the following sense: \( U^{i}(x) > U^{i}(y) \) means ‘Individual i prefers (alternative/situation/bundle of goods) x to y’.] Only in the last three decades or so have more economists explicitly studied happiness. However, even if we confine ourselves to the older days, utility and happiness did not differ by much. In our definition, they differ only by imperfect information, possible concern for the welfare of others, and irrationality. For most issues concerning individual preference/happiness, especially on traditional economic issues of consumer choice, we typically abstract from imperfect information, irrationality, and pure or non-affective altruism (ruling out a true concern for the welfare of others over and above effects like warm glow). The issues of measurability and interpersonal comparability of either happiness or utility are then essentially similar. In this chapter, we treat them similarly. Section 6.1 argues that happiness (and utility that represents preference) is cardinally measurable and interpersonally comparable, at least in principle. Section 6.2 discusses some ways to improve the accuracy and comparability of happiness in practice. Appendix D also addresses a practical problem in happiness measurement.

6.1 Happiness is Cardinally Measurable and Interpersonally Comparable

I believe that many students of economics, like myself, have at some stage been baffled by the controversies over whether happiness or utility is measurable at all, and whether it is cardinally or just ordinally measurable. Ordinal measurability involves only the ability to rank. With just ordinal measurability, one can say that utility at x is higher than that at y, but cannot say how many times higher, nor compare differences in utility.
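The distinction can be illustrated with a minimal sketch. The utility numbers and the transformation below are hypothetical, chosen only for illustration. A strictly increasing transformation leaves the ranking of alternatives intact but can reverse a comparison of utility differences, which is why such comparisons are meaningful only if utility is more than ordinally measurable.

```python
# Hypothetical utility numbers for three alternatives (illustrative only).
u = {"x": 1.0, "y": 2.0, "z": 4.0}


def increasing_transform(value):
    """A strictly increasing function on positive numbers (an arbitrary choice)."""
    return -1.0 / value


# The same ordering of alternatives, expressed with different utility numbers.
v = {name: increasing_transform(level) for name, level in u.items()}


def ranking(utility):
    """Alternatives ordered from highest to lowest utility."""
    return sorted(utility, key=utility.get, reverse=True)


def top_gap_is_larger(utility):
    """Is the z-over-y difference larger than the y-over-x difference?"""
    return (utility["z"] - utility["y"]) > (utility["y"] - utility["x"])


# Ordinal information survives the transformation ...
print(ranking(u) == ranking(v))
# ... but the comparison of utility differences does not.
print(top_gap_is_larger(u), top_gap_is_larger(v))
```

Under purely ordinal measurability, u and v are equally valid representations, so the question of which difference is larger has no answer.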

The confusion with respect to utility measurability is partly due to the use of the same term ‘utility’ both as a measure of subjective satisfaction and as an indicator of objective choice or preference. Another source of confusion is the insufficient distinction between measurability in principle and measurability in practice. For utility as a measure of the subjective satisfaction or happiness of an individual, it seems clear that it is cardinally measurable in principle, though the practical difficulties of such measurements may be very real. These difficulties include inaccuracies and possible insincerity in preference revelation. Moreover, even the individual himself may have difficulties in giving a precise measure. For example, I prefer an apple to an orange and prefer an orange to a pear. If you ask me, ‘Do you prefer an apple to an orange more strongly than an orange to a pear?’ (Question A), then I will say, ‘It depends on what kind of fruits I had in the immediate past, what sort of meal I am having’. If all these are known, then I will be able to give a definite answer. Thus, subject to practical difficulties, my subjective utility is cardinally measurable. If it were just ordinally measurable, I would not merely have some difficulties in answering Question A; I would dismiss it as meaningless. It seems clear that any individual will be able to compare the difference in subjective utility between having an apple and an orange with that between an orange and a house, and to compare the difference in subjective disutility between a bite of an ant and a sting of a bee with that between a sting of a bee and having his right arm cut off.

It also seems meaningful to say that I was at least twice as happy in 2020 as in 1970. If I have a perfect memory, I may even be able to pin down the ratio of happiness at, say, around 2.8. It also seems sensible for someone to say, ‘Had I known the sufferings I had to undergo, I would have committed suicide long ago’, or ‘If I had to lead such a miserable life, I would wish not to have been born at all!’ Hence, it makes sense to speak of negative or positive happiness/utility. Thus, somewhere in the middle, there is something corresponding to zero utility. ‘There can be little doubt that an individual, apart from his attitude of preference or indifference to a pair of alternatives, may also desire an alternative not in the sense of preferring it to some other alternative, or may have an aversion towards it not in the sense of contra–preferring it to some other alternative. There seem to be pleasant situations that are intrinsically desirable and painful situations that are intrinsically repugnant. It does not seem unreasonable to postulate that welfare is +ve in the former case and –ve in the latter’ (Armstrong 1951, p. 269). This is also effectively the conclusion of Kahneman et al. (1997). Hence it seems clear that utility or welfare as a subjective feeling is in principle measurable in a full cardinal sense.

Economists’ Bias Against Utility/Happiness Cardinal Measurability and Interpersonal Comparability

Instead of the subjective sense discussed above, we may use ‘utility’ purely as an objective indicator of an individual’s preference ordering and we may not be interested in anything in addition to this ordinal aspect of ‘utility’. For example, for certain economic problems like the derivation of demand curves/functions, we only have to assume that a consumer/individual can compare the desirability of different bundles of goods ordinally, i.e. the ability to rank different bundles is sufficient. The same demand function can be derived from the same set of indifference curves with different sets of cardinal utility numbers, provided that the same ranking is preserved. Thus, in this sense, cardinal utility can be assumed away on the ground of Occam’s razor for such problems. At least partly due to this, many economists are hostile to cardinal measurability. However, to insist on ordinal utility only (denying the use of cardinal utility) even for other problems where cardinal utilities are needed, such as happiness studies, social choice [Footnote 1], optimal population, and choices affecting the probabilities of survival (see e.g. Ng 2011, 2016 on the latter issue), is to commit the fallacy of misplaced abstraction. This mistake is similar to insisting that a person must shave off his mustache since it is unnecessary for eating, not allowing for the possibility that he may want to keep his mustache to increase his sex appeal!
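The sufficiency of ordinal utility for demand analysis can be checked with a small sketch. The Cobb-Douglas utility function, the prices, the income, and the grid search below are all hypothetical choices made for illustration; nothing here is taken from the studies cited. Two utility functions that rank all bundles identically lead to the same chosen bundle.

```python
def cobb_douglas(x1, x2):
    """A hypothetical utility function used only for illustration."""
    return (x1 * x2) ** 0.5


def cubed(x1, x2):
    """A strictly increasing transformation of the same utility function."""
    return cobb_douglas(x1, x2) ** 3


def demand(utility, price1, price2, income, steps=100):
    """Best affordable bundle on the budget line, found by a simple grid search."""
    bundles = []
    for i in range(steps + 1):
        x1 = (income / price1) * i / steps      # quantity of good 1 on the grid
        x2 = (income - price1 * x1) / price2    # the rest of income buys good 2
        bundles.append((x1, x2))
    return max(bundles, key=lambda bundle: utility(*bundle))


choice_original = demand(cobb_douglas, price1=1.0, price2=2.0, income=10.0)
choice_transformed = demand(cubed, price1=1.0, price2=2.0, income=10.0)

# Both representations of the same ranking yield the same demanded bundle.
print(choice_original, choice_transformed)
```

The cardinal numbers attached to the indifference curves differ between the two functions, yet the demand is unchanged, which is the sense in which cardinality is dispensable for this particular problem.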

The following is representative of the modern textbook hostility against the cardinal measurability and interpersonal comparability of utility. ‘There is no way that you or I can measure the amount of utility that a consumer might be able to obtain from a particular good… there can be no accurate scientific assessment of the utility that someone might receive by consuming a frozen dinner or a movie relative to the utility that another person might receive from that same good … Today no one really believes that we can actually measure utils’ (Miller 2011, pp. 436–7). There is at least one counter-example to this confident assertion—the present writer.

Probably the most widely used textbook in basic economics (‘sold millions of copies in more than 40 languages’ by 1997), which went through 19 editions over 1948–2010 and was written by a Nobel laureate, puts it bluntly: ‘Economists today generally reject the notion of a cardinal, measurable utility’ (Samuelson and Nordhaus 2010, p. 89). Note that a cardinal, measurable utility is not just abstracted away as unnecessary, but rejected outright.

Another example of the hostility against cardinal utility comes from a widely used intermediate microeconomics textbook: ‘But how do we tell if a person likes one bundle twice as much as another? How could you even tell if you like one bundle twice as much as another? One could propose various definitions for this kind of assignment: I like one bundle twice as much as another if I am willing to run twice as far to get it, or to wait twice as long, or to gamble for it at twice the odds… Although each of them is a possible interpretation of what it means to want one thing twice as much as another, none of them appears to be an especially compelling interpretation’ (Varian 2010, pp. 57–8).

Indeed, there is an especially compelling interpretation. Since our ultimate objective is happiness (on which see Chap. 5 above), using the amount of happiness of the individual involved provides a perfect answer to Varian’s question, if we ignore the effects on others, which is another issue (slightly touched on above). In addition, the actual amount of happiness enjoyed, but not the amount of utility as representing preference orderings only, could be used to determine well-being even in the presence of preference changes. Thus, using happiness/welfare instead of preference/utility, we may analyze the normative aspects of preference changes.

It is true that the strong and explicit beliefs in the non-cardinal measurability and non-interpersonal comparability of happiness and/or preference (see Chap. 2 on the differences between the two concepts) are held mainly by economists (due to the non-necessity for demand analysis as discussed above). However, even among sociologists and psychologists who study happiness, such beliefs are also very common. For example, sociologist and veteran happiness researcher Veenhoven ‘argued that happiness is measured at the ordinal level’ (Kalmijn and Veenhoven 2005; Veenhoven 2010, p. 612n). The common belief in the non-cardinality of happiness/utility spans the whole multi-disciplinary happiness studies. Thus, after a cross-disciplinary survey of the issue of cardinality, Kristoffersen (2017, p. 612n) concluded that ‘Many scholars (economists and others) are of the opinion that wellbeing data are strictly ordinal in nature, and tend to criticise the common tendency to treat them as cardinal measures’. (See also Kristoffersen 2017.) Levinson (2012, p.873) also mentions that ‘economists normally assume utility is ordinal rather than cardinal, and that interpersonal comparisons based on stated happiness are impossible’.

The Compellingness of Cardinal Measurability and Interpersonal Comparability

In fact, the compellingness of the cardinal measurability and interpersonal comparability of happiness/utility is obvious. Consider the following three simple alternatives faced by a person:

A: Her current situation.

B: Her current situation plus being bitten by an ant (non-poisonous one) once.

C: Her current situation plus being thrown bodily into a pool of boiling water.

Obviously, she prefers A to B and B to C. If preference/utility is purely ordinal, this is all she can say. However, even you, not being her, know that the intensity of her preference of B over C is at least many thousand times larger than that of A over B. Moreover, you may also be confident that the intensity of her preference of B over C is at least many thousand times larger than that of your preference of A over B (interpreting A and B as applied to you).

True, this is an interpersonal comparison of utility, which Robbins (1932, 1938) regarded as unscientific. In fact, this comparison is solidly based on evolutionary biology, as touched on earlier and also discussed in Appendix C. An ant bite reduces her (and most individuals’) fitness by a very small amount and hence induces only a small amount of pain. Being thrown bodily into boiling water threatens one’s survival and must cause great pain and an intense attempt to avoid it. Though there may be some degree of interpersonal differences, these are almost certainly less significant than the huge survival difference between an ant bite and being thrown into boiling water. Thus, our degree of confidence in the truth of the comparison above is no less than 99.99%, a degree of certainty envied by all empirical scientists, economists included.

Most people now know that our brain consists of two hemispheres, with the left brain controlling the right side of the body and vice versa. We do not feel this duality as our two brain hemispheres are connected by the corpus callosum, making our subjective consciousness unified. However, some patients with serious epilepsy have their two brain hemispheres separated by cutting the connection (to reduce brain interaction). They then behave as if having two centres of consciousness or mind, with their left brain (normally controlling speech) not knowing what their right brain has seen with the left eye, if a blinder is also placed between their two eyes (Gazzaniga 1970). [Footnote 2] Thus, two separate brain hemispheres, each with independent consciousness, may be unified by connection through the corpus callosum. Similarly, if our technology were advanced enough to imitate the connection through the corpus callosum, we could so connect her brain with yours. Then, she could feel your taste of ice cream and you could feel her taste of blueberries. Interpersonal comparison would become almost perfect!

While happiness is cardinally measurable and interpersonally comparable in principle, it is true that the commonly used methods of happiness measurement yield results that are not very cardinal and are very difficult to compare interpersonally, as mentioned above. The lack of comparability in existing happiness measures makes happiness studies vulnerable to the criticism of doubters of happiness results such as Johns et al. (2007) and Ott (2010). If happiness measures could be based on more comparable methods of measurement, as discussed below, the critics may have less gunpowder to use.

6.2 How Could the Measurement of Happiness be Improved?

One reason most economists are skeptical of happiness measurement is that professionally they trust what people do rather than what they say (‘cheap talk’). If an individual is willing to pay from her own pocket to actually buy a certain item, economists are willing to accept that she values that item at least at the price paid. If she just says that she values a certain thing at a certain value, economists are generally skeptical. Since most if not all existing measurements of happiness are based on how happy people say they are in questionnaire surveys, economists are thus skeptical of their reliability. This skepticism has some validity.

However, there are persuasive arguments that existing measures, though imperfect, are rather reliable. For example, different measures of happiness correlate well with one another (Fordyce 1988), with recalls of positive versus negative life events (Seidlitz et al. 1997), with reports of friends and family members (Costa and McCrae 1988; Diener 1984; Sandvik et al. 1993, 2009), with physical measures like heart rate and blood pressure (Shedler et al. 1993), with EEG measures of prefrontal brain activity (Sutton and Davidson 1997), and with more objective measures of well-being like incidence of depression, poor appetite and sleep (Luttmer 2005). Pavot et al. (1991) find that respondents reporting that they are very happy tend to smile more. MacCulloch and Di Tella (2000) note that psychologists who study and give advice on happiness for a living use happiness data. ‘Presumably, if markets work and there was a better way to study well-being, people who insist on using bad data would be driven out of the market’ (pp. 7–8). Moreover, correlations of happiness show remarkable consistency across countries, including developing and transitional ones (Graham and Pettinato 2001, 2002; Namazie and Sanfey 2001). Dominitz and Manski (1999) examined the scientific basis underlying economists’ hostility against subjective data and found it to be ‘meager’ and ‘unfounded’. Rather, ‘survey respondents do provide coherent, useful information when queried systematically’ (see Manski 2000, p. 132). Despite remaining problems of happiness measurement (see, e.g. Schwarz and Strack 1999; Bertrand and Mullainathan 2001), reported happiness indices may be used as good approximations (Frey and Stutzer 2002b; Oishi 2019) and ‘happiness surveys are capturing something meaningful about true utility’ (Di Tella and MacCulloch 2006, p. 28).

For those economists who are still skeptical of, or even look down upon and deride, the happiness measures (which has actually happened), I call upon them to look at their own backyard. Consider the most important economic variable, GDP or GNP. Its measurement is subject to all sorts of inaccuracies, as is well known to all economists. We used the imperfect measure for many decades. Then came the PPP (purchasing power parity) adjustment, which overnight increased the Chinese GDP by 4 times and the Indian GDP by 6 times from this single adjustment alone! Most happiness measures may not be very accurate, but I doubt that a 4-times adjustment will ever be necessary for the average figure of any nation.

There are a number of methods to improve the measurement of happiness to increase its accuracy and comparability, including interpersonal and intertemporal comparability. Some of these methods are easier to implement than others. Let us start from the easier ones first.

First, as discussed in the previous chapters, as a rule, using the concept of happiness instead of other concepts like life satisfaction is likely to yield a better result. This is easily implemented.

Secondly, asking subjects to tick from: very happy, pretty happy, not too happy, and unhappy gives very vague results. This is so because a phrase like ‘not too happy’ is vague as to the amount of happiness it represents. It may represent either a positive amount of (net) happiness or a negative amount. Until we use some interpersonally comparable units of happiness measurement (which is more complex, as discussed below), it is difficult to get happiness results that are valid across persons. Different persons may use the same phrase such as ‘very happy’ to describe different amounts of happiness, and use different phrases to describe the same amount of happiness, making interpersonal comparison difficult. However, there is a well-defined level of happiness that has interpersonal significance. This is the level of zero (net) happiness, where the amount of positive happiness is just offset by the amount of negative happiness or unhappiness (pain and sufferings). In terms of Figure 1 above (Chap. 1), it is the case where the area above the line of neutrality equals that below this line. If the net amount of happiness is zero, the value of life to that person herself (i.e. ignoring any effects on others) is neutral. This has an interpersonal significance. A person may have a large amount of positive happiness and also a large amount of unhappiness. Another person may have a small amount of positive happiness and also a small amount of unhappiness. It may be difficult to compare the amount of happiness (or unhappiness) of the first person with that of the second. However, if the amount of positive happiness of each of these two persons just offsets the amount of unhappiness, the net amounts of happiness of both persons are the same, being both equal to zero. Thus, happiness studies should aim to discover, among other things, information regarding the proportions of people with happiness levels above, at and below this level of neutrality. This is an interpersonally, intertemporally, internationally, and interculturally comparable and useful piece of information.

When a subject is asked to rate her own happiness on a scale of, say, 0–10, it is true that most people may use the midpoint of 5 to stand for the point of neutrality. However, this is by no means certain or universal. This is particularly so as many Western countries use 50 (out of 100) as the passing mark in exam grading, while the corresponding passing mark is 60 in China. Thus, a brief instruction asking the subject to use 5 to stand for neutrality will increase the informational content of the survey results, especially with respect to the comparability of the proportion of people above the neutrality point.
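Once respondents have been instructed to use 5 as the neutrality point, the proportions above, at, and below neutrality can be tabulated directly. The following is a minimal sketch; the function name and the ten ratings are hypothetical and serve only to show the calculation.

```python
def neutrality_shares(ratings, neutral_point=5):
    """Shares of respondents above, at, and below the stated neutrality point."""
    total = len(ratings)
    above = sum(1 for rating in ratings if rating > neutral_point)
    at = sum(1 for rating in ratings if rating == neutral_point)
    below = total - above - at
    return {"above": above / total, "at": at / total, "below": below / total}


# Hypothetical answers on a 0-10 scale, given after the instruction that
# 5 stands for zero net happiness.
ratings = [7, 5, 3, 8, 5, 6, 2, 9, 5, 10]
shares = neutrality_shares(ratings)
print(shares)
```

Only the position of each rating relative to 5 is used, so the shares do not depend on how differently respondents space the remaining points of the scale.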

The above improvement can be easily implemented. However, while it achieves a significant improvement easily, it does not solve most of the problems of comparability. Society A may have 90% of people above the line of neutrality while society B has only 85%. However, society B may still be the happier one if many of those above neutrality in society B have much higher happiness than those in society A, and most of those below neutrality in society B are only marginally below while most of those below neutrality in society A are significantly below.
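A small numerical sketch makes the point concrete. The happiness levels below are entirely hypothetical and presuppose what the shares alone cannot deliver: net happiness measured in cardinal, interpersonally comparable units, with zero as the neutrality point.

```python
# Hypothetical net happiness levels (zero = neutrality) for 100 people each.
society_a = [1] * 90 + [-20] * 10   # 90% slightly above, 10% far below
society_b = [5] * 85 + [-1] * 15    # 85% well above, 15% marginally below


def share_above_neutrality(levels):
    """Proportion of people with positive net happiness."""
    return sum(1 for level in levels if level > 0) / len(levels)


def mean_happiness(levels):
    """Average net happiness, meaningful only with comparable cardinal units."""
    return sum(levels) / len(levels)


print(share_above_neutrality(society_a), share_above_neutrality(society_b))
print(mean_happiness(society_a), mean_happiness(society_b))
```

Society A has the larger share above neutrality, yet society B has the higher average once the magnitudes are taken into account.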

To overcome this difficulty of incomparability, I developed (Ng 1996) a method that yields happiness measures that are comparable interpersonally, intertemporally, and interculturally. It is based on Edgeworth’s concept of a just perceptible increment of happiness, but developed to be operational and actually used to conduct a survey/measurement. [Footnote 3] For example, if you prefer two spoons of sugar in a given cup of coffee to 1.5 spoons, you may not be able to tell the difference between 2 and 1.99 spoons. There exists a difference that makes one just perceivably taste better than the other. [Footnote 4] Edgeworth took it as axiomatic, or, in his words, ‘a first principle incapable of proof’, that the ‘minimum sensible’ or the just perceivable increments of pleasure for all persons are equatable (Edgeworth 1881, pp. 7ff., pp. 60 ff.). I (Ng 1975) derived this result, as well as the utilitarian social welfare function (SWF), under which social welfare is the unweighted sum of individual utilities/welfares, from more basic axioms.

The main axiom is the Weak Majority Preference Criterion (WMP): For any two alternatives x and y, if no individual prefers y to x, and (1) if I, the number of individuals, is even, at least I/2 individuals prefer x to y; (2) if I is odd, at least (I–1)/2 individuals prefer x to y and at least another individual’s utility level is not lower in x than in y, then social welfare is higher in x than in y.
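For readers who prefer symbols, WMP can be restated as follows. This is only a transcription of the verbal statement above; the notation is an assumption of this sketch: \(P^{i}\) denotes individual i's strict preference, \(U^{i}\) is i's utility as in the opening of the chapter, \(W\) is social welfare, and \(S\) is a set of individuals who prefer x to y.

```latex
% Notation assumed for this sketch: P^i = strict preference of individual i,
% U^i = utility of individual i, W = social welfare, I = number of individuals.
\[
\text{Suppose there is no individual } i \text{ with } y \, P^{i} \, x,
\text{ and let } S \subseteq \{\, i : x \, P^{i} \, y \,\}.
\]
\[
\text{If either}\quad
\text{(1) } I \text{ is even and } |S| \ge \frac{I}{2},
\quad\text{or}\quad
\text{(2) } I \text{ is odd, } |S| \ge \frac{I-1}{2},
\text{ and } U^{j}(x) \ge U^{j}(y) \text{ for some } j \notin S,
\]
\[
\text{then}\quad W(x) > W(y).
\]
```

For instance, with I = 4, the criterion applies when two individuals prefer x to y and the other two are indifferent. With I = 5, it applies when two individuals prefer x to y, nobody prefers y to x, and at least one of the remaining three has a utility level that is not lower in x than in y.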

The reason why WMP leads us to the utilitarian SWF is not difficult to see. The criterion WMP requires that individual utility/welfare differences sufficient to give rise to preferences of half of the population must be regarded as socially more significant than utility differences not sufficient to give rise to preferences (or dis-preferences) of the other half. Since any group of individuals comprising 50 per cent of the population is an acceptable half, this effectively makes a just-perceivable increment of utility/welfare of any individual an interpersonally comparable unit. [Footnote 5] (Ignoring the difference between individual preferences and welfare, utility and welfare may be used interchangeably. Where they differ, welfare or happiness should be used, as argued above; WMP should then be revised to refer to happiness.) The compellingness of this argument is further expounded in Ng & Singer (1981).

Thus, measures of happiness based on the concept of a just perceivable increment of happiness are not only cardinal but also interpersonally comparable. If we use the same number, say one, to measure the happiness difference of a just perceivable increment [Footnote 6] for all individuals, the happiness indices so constructed are interpersonally comparable, since each just perceivable increment of happiness is equatable across individuals. Though such measures are more difficult to obtain [Footnote 7], some such measures may be obtained for some small but representative samples and the results compared with the existing measures taken on larger samples. If some reliable correspondences between the two sets of measures could be established, we may not have to use the more complicated method for the majority of subjects surveyed. The combined use of these two methods may be a good way to tackle the problems of reliability and comparability.
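One way the combined use of the two methods might work is sketched below. Everything in it is an assumption made for illustration: the data are invented, and a straight-line correspondence fitted by least squares is merely the simplest candidate for a ‘reliable correspondence’; the text does not prescribe any particular functional form.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical small sample measured both ways: a conventional 0-10 rating and
# net happiness counted in just perceivable increments (one unit per increment).
small_sample_ratings = [2, 4, 6, 8]
small_sample_increments = [-28, -12, 9, 31]

slope, intercept = fit_line(small_sample_ratings, small_sample_increments)

# Hypothetical large sample for which only the conventional rating is available.
large_sample_ratings = [3, 5, 7, 7, 9]
estimated_increments = [slope * r + intercept for r in large_sample_ratings]

# With a common unit, the utilitarian SWF is the unweighted sum across persons.
social_welfare = sum(estimated_increments)
print(slope, intercept, social_welfare)
```

Whether a linear fit is adequate is an empirical matter; if the small-sample comparison showed a curved or group-specific relationship, the conversion step would have to be changed accordingly.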

The study of happiness is still a very new science. Thus, it has much scope to be improved to increase the accuracy and comparability of happiness measures not only by taking account of the above but also many other issues. Given time and more studies, significant improvements may be expected.

Concluding Summary

Simple ways to improve the accuracy and interpersonal and intertemporal comparability of happiness measurement include using happiness instead of life satisfaction (or other concepts), pinning down the dividing line of the zero amount of net happiness, using an interpersonally valid unit based on the just perceivable increment of happiness, and the complementary use of this method for small samples and the traditional methods for large samples.