
The Metrics of Subjective Wellbeing Data: An Empirical Evaluation of the Ordinal and Cardinal Comparability of Life Satisfaction Scores



This paper is motivated by the lack of consensus on the metrics of subjective wellbeing measurement scales. Subjective wellbeing data are frequently treated as though they are cardinally comparable both across and within individuals, though very little evidence exists to support these assumptions. Because wellbeing cannot be observed directly, cardinality must remain an assumption, which is usually imposed based on statistical convenience rather than on reason. The premise of this paper is that it is both possible and useful to make this assumption more informed. The analysis applies the principle of simultaneous conjoint measurement to improve our understanding of what information is contained within subjective wellbeing scores. Specifically, the metrics of the eleven-point numeric life satisfaction scale are evaluated using the MH5 mental health survey instrument. Under the assumption that the response function for MH5 is identifiable by the Rasch model, the shape of the response function for life satisfaction is potentially observable indirectly via the association between life satisfaction and MH5. The results presented here suggest life satisfaction scores are ordinally distinct, in terms of these mental health data, which supports the assumption of ordinal comparability. Under the aforementioned assumption, these scores are also approximately equidistant, which supports cardinal comparability. This pattern is found both across individuals and within individuals across time.




  1. 1.

    Accounting for individual fixed effects is important to the extent that we cannot yet capture the necessary missing information by other means, and cardinal comparability is important to the extent that it is currently a requirement for fixed-effects panel model estimation. Individual fixed effects have been shown to be accounted for, to a large extent, by personality traits, which are now captured in some survey data (Boyce 2010). Our improved ability to capture this information may therefore render fixed-effects panel model estimation less important.

  2. 2.

    See for example Blanchflower and Oswald (2004, 2005), Gardner and Oswald (2001), Headey and Wooden (2004), Van Praag and Ferrer-i-Carbonell (2004), and Ferrer-i-Carbonell and Frijters (2004).

  3. 3.

    Subjective wellbeing is defined here as self-reported responses to survey questions probing any aspect of wellbeing, though most commonly happiness and satisfaction. The term wellbeing is not necessarily synonymous with utility, which might be considered more specific and context-dependent in some applications. As discussed further on, available evidence suggests subjective wellbeing is likely to be a good proxy for experience utility, though not for other types of utility like decision utility. Consequently, in the discussions provided here utility refers specifically to experience utility. Likewise, happiness and satisfaction, which are two common types of subjective wellbeing measures, are not necessarily synonymous. Crooker and Near (1998) provide a discussion of the respective meanings of happiness and satisfaction, but this is not considered in further detail here.

  4. 4.

    In other words, such measurement scales exhibit interval-level quality. A possible further assumption implies ratio-level quality, which requires a unique and non-arbitrary zero-point (or point of neutrality). This level of quality is not discussed directly here, as it is considered quite contentious and to have limited consequences for common uses of subjective wellbeing data. The case for ratio quality is discussed in some detail in Kristoffersen (2010).

  5. 5.

    Thus, the condition of uniqueness would be unnecessary in fixed- and random-effects panel models, because individual fixed effects are accounted for. Hirschauer et al. (2014) elaborate more specifically on the conditions for comparison across individuals, within individuals over time, and within individuals across categories.

  6. 6.

    See for example Larsen and Fredrickson (1999), Van Praag (1991), Sandvik et al. (1993), and Diener and Lucas (1999). Brief surveys of the collective evidence are provided by Clark et al. (2008) and by Layard et al. (2008).

  7. 7.

    First, implicit trade-offs, as measured in empirical models of subjective wellbeing, generally correspond well with what we know about choice behaviour: for example, the observed positive effects of marriage and employment on subjective wellbeing correspond well with the amount of effort people tend to put into obtaining these outcomes. Second, observed behaviour is consistent with what we expect from wellbeing-maximising individuals: for example, low satisfaction scores in the spheres of work and marriage tend to be good predictors of job change and divorce. Finally, the evidence which emerges from the analysis of survey data on subjective wellbeing corresponds well with that which emerges from experimental economics, particularly with respect to positional concerns (Clark et al. 2008).

  8. 8.

    Ng (1996) uses a small but complex survey of 41 Chinese graduate students to show that individuals are capable of making judgements on how much time they spend in unhappy and happy states, and that they agree with the notion of a scale of net happiness with a unique zero-point. While this evidence is of value, Ng himself admits that the sample is small and selective, and that the questions are slightly leading in nature.

  9. 9.

    Similarly, utility is assumed to be bounded in von Neumann–Morgenstern utility models (Savage 1954).

  10. 10.

    Some evidence of such an effect in subjective wellbeing data is provided by Lau (2007).

  11. 11.

    The specific idea of a logarithmic utility function appears first to have been proposed by Bernoulli (1738 [1954]). Edgeworth (1881 [1961]), who proposed the idea of a ‘hedonimeter’ which would produce an objective and fully cardinal measure of utility, was similarly interested in the idea of just perceivable increments.

  12. 12.

    Ng (1996, 2008) suggests increasing increments of true happiness (which is considered a stimulus in that context) are required to produce perceivable increments in measured happiness (the response). Hence, Ng’s exposition of a logistic response function incorporates elements of the stimulus–response relationship as well as the possibility of bias implied by the boundedness of measurement scales.

  13. 13.

    In this figure, a zero-point or point of neutrality of utility or wellbeing might reasonably be inferred by the inflection points of panels (a) and (c), which may or may not correspond with the mid-point of the measurement scale. This would incorporate the idea of negative utility or wellbeing, which is reasonable where utility or wellbeing is perceived as the net of positive and negative feelings or experiences, as suggested by Ng (1996).

  14. 14.

    Specifically, Rasch models apply additive conjoint measurement (Luce and Tukey 1964) to produce a measure where conjoint transitivity implies that items and persons are measured on an interval scale with a common unit (Brogden 1977; Wright 1997). Andrich (1978) later developed the Polytomous Rasch model for multiple ordered (rather than dichotomous) responses. See Wright (1997) for a brief description of the history and development of measurement in social sciences.

  15. 15.

    Thus, the solution proposed by Blanton and Jaccard (2006) applies the same principles of simultaneous conjoint measurement upon which Rasch models are founded. As explained by Luce and Tukey (1964): “the essential character of simultaneous conjoint measurement is described by an axiomatization of the comparison of effects of (or responses to) pairs formed from two specific kinds of ‘quantities’.” They explain that these can potentially produce a cardinal (interval-quality) measure: “The axioms apply when, for example, the effect of a pair consisting of one mass and one difference in gravitational potential on a device that responds to momentum is compared with the effect of another such pair. Measurement on interval scales which have a common unit follows from these axioms”.

  16. 16.

    If function k is found to be irregular we could either conclude that no recognisable pattern exists, and that linearity is not a reasonable assumption, or the range of possible functional forms could potentially be expanded in search of a recognisable pattern and a functional form that enables transformation of subjective wellbeing data onto a linear scale.

  17. 17.

    Specifically, the raw MH5 index score intervals 0–10, 10–20, etc., up to 90–100 have logit intervals of 2.23, 1.22, 1.00, 0.90, 0.85, 0.83, 0.90, 1.09, 1.54, and 3.21 (Perneger and Bovier 2001). Accordingly, the following transformation function will linearise these intervals: \(MH5^{T} = \ln \left( \frac{0.00932\,MH5 + 0.034}{1 - (0.00932\,MH5 + 0.034)} \right)\). This produces a scale with lower and upper bounds of −3.35 and +3.35, with a mid-point of zero. For convenience, this is scaled to produce a 0–100 index in the analysis to follow.

  18. 18.

    Measurement theory posits that such raw scores are not to be used, as they are not true measures (that is, they are not linear) (Wright 1997). However, the results presented here are generated using both raw and adjusted scores in the interest of completeness, as this provides better information about the implications of the adjustment.

  19. 19.

    Calculating mean mental health scores for each life satisfaction group will produce the same information, but by estimating differences in means in this way we are also able to perform the necessary hypothesis tests of model parameters.

  20. 20.

    When these life satisfaction data are regressed on gender, age (squared and cubed terms included), marital status, the presence of children, labour force participation, physical health, education, income (log of equivalised household income) and personal characteristics, these variables are found to explain just over 18% of the variation in life satisfaction observed across individuals. This is comparable to other similar models presented in the literature, such as Boyce and Wood (2011), who estimate a similar model with an adjusted R-squared of 0.20. When mental health information is added to such a model the explanatory power increases from 18% to over 27%. Consequently, mental health data contain a lot of information about life satisfaction, much of which is unique and cannot be attributed to other variables. Because this type of life satisfaction model is not of key interest here, these estimates are not presented in the paper.

  21. 21.

    Gardner and Oswald (2006) present evidence on marital satisfaction and subsequent likelihood of divorce. Clark et al. (2008) provide a list of studies comparing job satisfaction and job quits. Clark (2003) finds that those who are more negatively affected by becoming unemployed are also faster at finding new work.
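
The transformation described in note 17 can be illustrated with a minimal Python sketch. The coefficients 0.00932 and 0.034 are taken from that note; the function names are hypothetical, and the linear min-max rescaling to a 0–100 index is an assumption, since the note says only that the transformed scale is "scaled to produce a 0–100 index" without stating the method.

```python
import math

# Coefficients quoted in note 17 (attributed there to Perneger and Bovier 2001).
SLOPE = 0.00932
INTERCEPT = 0.034


def mh5_logit(mh5: float) -> float:
    """Map a raw MH5 score in [0, 100] onto the logit scale of note 17."""
    if not 0.0 <= mh5 <= 100.0:
        raise ValueError("raw MH5 scores are expected to lie in [0, 100]")
    p = SLOPE * mh5 + INTERCEPT
    return math.log(p / (1.0 - p))


def mh5_rescaled(mh5: float) -> float:
    """Rescale the logit score to a 0-100 index.

    Assumption: a linear min-max rescaling between the logit values at
    raw scores 0 and 100; the note does not specify the method used.
    """
    lo, hi = mh5_logit(0.0), mh5_logit(100.0)
    return 100.0 * (mh5_logit(mh5) - lo) / (hi - lo)


if __name__ == "__main__":
    # Print raw score, logit score and rescaled index at ten-point steps.
    for raw in range(0, 101, 10):
        print(raw, round(mh5_logit(raw), 2), round(mh5_rescaled(raw), 1))
```

Under these assumptions the end-points of the raw scale map to roughly −3.35 and +3.35 on the logit scale and a raw score of 50 maps to zero, which agrees with the bounds and mid-point stated in note 17. Equal raw steps near the extremes of the scale correspond to larger logit steps than equal raw steps near the middle.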


  1. Andrich, D. (1978). A rating formulation for ordered response categories. Psychometrika, 43, 561–573.

  2. Bernoulli, D. (1738 [1954]). Exposition of a new theory on the measurement of risk. Econometrica, 22, 23–36.

  3. Blanchflower, D. G., & Oswald, A. J. (2004). Well-being over time in Britain and the USA. Journal of Public Economics, 88, 1359–1386.

  4. Blanchflower, D. G., & Oswald, A. J. (2005). Happiness and the human development index: The paradox of Australia. NBER Working Paper Series (No. 11416).

  5. Blanton, H., & Jaccard, J. (2006). Arbitrary metrics in psychology. American Psychologist, 61(1), 27–41.

  6. Boyce, C. J. (2010). Understanding fixed effects in human well-being. Journal of Economic Psychology, 31, 1–16.

  7. Boyce, C. J., & Wood, A. J. (2011). Personality and the marginal utility of income: Personality interacts with increases in household income to determine life satisfaction. Journal of Economic Behavior & Organization, 78, 183–191.

  8. Brogden, H. E. (1977). The Rasch model, the law of comparative judgement and additive conjoint measurement. Psychometrika, 42, 631–634.

  9. Clark, A. E. (2003). Unemployment as a social norm: Psychological evidence from panel data. Journal of Labor Economics, 21(2), 323–351.

  10. Clark, A. E., Frijters, P., & Shields, M. A. (2008). Relative income, happiness, and utility: An explanation for the Easterlin Paradox and other puzzles. Journal of Economic Literature, 46(1), 95–144.

  11. Crooker, K. J., & Near, J. P. (1998). Happiness and satisfaction: Measures of affect and cognition? Social Indicators Research, 44, 195–224.

  12. Diener, E., & Lucas, R. E. (1999). Personality and subjective well-being. In D. Kahneman, E. Diener, & N. Schwarz (Eds.), Well-being: The foundations of hedonic psychology (pp. 213–229). New York: Sage.

  13. Edgeworth, F. Y. (1881 [1961]). Mathematical psychics: An essay on the application of mathematics to the moral sciences. New York: Augustus M. Kelley.

  14. Ferrer-i-Carbonell, A., & Frijters, P. (2004). How important is methodology for the estimates of the determinants of happiness? The Economic Journal, 114(July), 641–659.

  15. Gardner, J., & Oswald, A. J. (2001). Does money buy happiness? A longitudinal study using data on windfalls. Warwick: Warwick University.

  16. Gardner, J., & Oswald, A. (2006). Do divorcing couples become happier by breaking up? Journal of the Royal Statistical Society: Series A (Statistics in Society), 169(2), 319–336.

  17. Guttman, L. (1977). What is not what in statistics. The Statistician, 26, 81–107.

  18. Headey, B., & Wooden, M. (2004). The effects of wealth and income on subjective well-being and ill-being. Economic Record, 80(Special Issue), S24–S33.

  19. Hirschauer, N., Lehberger, M., & Musshoff, O. (2014). Happiness and utility in economic thought—or: What can we learn from happiness research for public policy analysis and public policy making? Social Indicators Research, 121, 647–674.

  20. Katzner, D. W. (1998). The misuse of measurement in economics. Metroeconomica, 49(1), 1–22.

  21. Kristoffersen, I. (2010). The metrics of subjective wellbeing: Cardinality, neutrality and additivity. The Economic Record, 86(272), 98–123.

  22. Larsen, R. J., & Fredrickson, B. L. (1999). Measurement issues in emotional research. In D. Kahneman, E. Diener, & N. Schwarz (Eds.), Well-being: The foundations of hedonic psychology. New York: Sage.

  23. Lau, A. L. D. (2007). Measurement of subjective wellbeing: Cultural issues. In 9th Quality of Life Conference. Deakin University, Melbourne.

  24. Layard, R., Mayraz, G., & Nickell, S. (2008). The marginal utility of income. Journal of Public Economics, 92, 1846–1857.

  25. Luce, R. D., & Tukey, J. W. (1964). Simultaneous conjoint measurement: A new type of fundamental measurement. Journal of Mathematical Psychology, 1, 1–27.

  26. Masin, S. C., Zudini, V., & Antonelli, M. (2009). Early alternative derivations of Fechner’s law. Journal of the History of the Behavioral Sciences, 45(1), 56–65.

  27. Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. The Psychological Review, 63(2), 81–97.

  28. Ng, Y.-K. (1996). Happiness surveys: Some comparability issues and an exploratory survey based on just perceivable increments. Social Indicators Research, 38, 1–27.

  29. Ng, Y.-K. (2008). Happiness studies: Ways to improve comparability and some public policy implications. The Economic Record, 84(265), 253–266.

  30. Oswald, A. (2008). On the curvature of the reporting function from objective reality to subjective feelings. Economics Letters, 100(3), 369–372.

  31. Parducci, A. (1995). Happiness, pleasure, and judgment: The contextual theory and its applications. Hillsdale, NJ: Erlbaum.

  32. Perneger, T. V., & Bovier, P. A. (2001). Application of the Rasch model to the SF36 mental health 5 item scale (MH5). ISPOR Sixth Annual International Meeting, Value In Health.

  33. Raczek, A. E., Ware, J. E., Bjorner, J. B., Gandek, B., Haley, S. M., Aaronson, N. K., et al. (1998). Comparisons of Rasch and summated rating scales constructed from SF-36 physical functioning items in seven countries: Results from the IQOLA Project. International Quality of Life Assessment. Journal of Clinical Epidemiology, 51(11), 1203–1214.

  34. Rasch, G. (1961). On general laws and the meaning of measurement in psychology. In Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability. Berkeley, CA: University of California Press.

  35. Sandvik, E., Diener, E., & Seidlitz, L. (1993). Subjective well-being: The convergence and stability of self-report and non-self-report measures. Journal of Personality, 61, 317–342.

  36. Savage, L. J. (1954). The foundations of statistics. New York: Wiley.

  37. Schwarz, N. (1995). What respondents learn from questionnaires: The survey interview and the logic of conversation. International Statistical Review, 63, 153–177.

  38. Van Praag, B. M. S. (1991). Ordinal and cardinal utility: An integration of the two dimensions of the welfare concept. Journal of Econometrics, 50, 69–89.

  39. Van Praag, B. M. S., & Ferrer-i-Carbonell, A. (2004). Happiness quantified. New York: Oxford University Press.

  40. Ware, J. E., Snow, K. K., Kosinski, M., & Gandek, B. (2000). SF-36 health survey: Manual and interpretation guide. Lincoln, RI: QualityMetric Inc.

  41. Wright, B. D. (1997). Measurement for social science and education: History of social science measurement. http://www.rasch.org/memo62.htm.



This paper is dedicated to the memory of Paul W. Miller, who encouraged me to pursue this work on the metrics of subjective wellbeing. Several other scholars have provided valuable comments and guidance in the various developmental stages of this paper, including Robert Cummings, Stephen Pudney, Juerg Weber, Peter Robertson, David Butler, Paul Gerrans and David Andrich. I also wish to thank three anonymous referees for considered and valuable comments and suggestions. The study uses unit record data from the Household, Income and Labour Dynamics in Australia (HILDA) survey. The HILDA project was initiated and funded by the Australian Government Department of Families, Housing, Community Services and Indigenous Affairs (FaHCSIA) and is managed by the Melbourne Institute of Applied Economic and Social Research (MIAESR). The findings and views reported in this paper, as well as any mistakes or errors, are those of the author, and should not be attributed to any of the scholars listed in these acknowledgements or to FaHCSIA or MIAESR.

Author information

Correspondence to Ingebjørg Kristoffersen.


Cite this article

Kristoffersen, I. The Metrics of Subjective Wellbeing Data: An Empirical Evaluation of the Ordinal and Cardinal Comparability of Life Satisfaction Scores. Soc Indic Res 130, 845–865 (2017). https://doi.org/10.1007/s11205-015-1200-6



  • Life satisfaction
  • Subjective wellbeing
  • Mental health
  • Cardinality
  • Ordinality
  • Response functions
  • Methodology