1 Introduction

Within this paper, happiness is defined as the subjective appreciation of one’s life-as-a-whole (Veenhoven 1984). Psychologists typically investigate this subject at the level of individuals, trying to explain why one person enjoys life more than someone else (e.g., Diener et al. 1999). Sociologists focus rather on happiness in collectivities, such as nations (e.g., Veenhoven 1999), and so does this paper. The most common question in studies on happiness in nations is how happy citizens typically are and why people enjoy life more in one nation than in another. A less common question is to what extent happiness differs between citizens in a country and how inequalities in happiness can be reduced. This calls for empirical research on inequality of happiness in nations and for that purpose we need measures of inequality. Many measures of inequality are mentioned in the literature on statistics but it is not immediately clear which of these are most appropriate for the issue at stake here.

1.1 Aim

Selection of an appropriate measure of inequality requires a clear conception of inequality in the first place and also a method for quantification of that notion. This is what this paper is about. It proposes a notion of inequality with a clear minimum and maximum value and develops methods for quantification of that kind of inequality. These methods can be used for gauging the descriptive statistics that are available in standard statistical programs or can be used as alternatives, although they are not recommended for daily application in happiness measurement.

1.2 Approach

Since one of the aims is to select appropriate measures of inequality of happiness in nations, we should acknowledge how happiness in nations is measured. This is typically done in ‘survey studies’ and relevant aspects of this technique are that happiness is assessed using self-reports and in samples of the adult population. This method is considered in more detail in Sect. 2.

This practice has consequences for the level of measurement. Happiness is measured as a discrete variable, which is self-reported as one of a small number of response categories. Categories of discrete variables are either unordered or ordered. If they are not ordered, the level of measurement is also referred to as “nominal”; in Sect. 3, we will consider this level of measurement.

At the ordinal level of measurement, the categories are ordered by definition. In some cases the additional assumption may be justified that the categories have either equal or unequal but known mutual distances on some underlying metric scale; such cases are sometimes referred to as ‘pseudo-metric’. The terms “nominal” and “ordinal level of measurement” and the underlying principles stem from Stevens (1946) and are fundamental in our considerations.

Schematically:

figure a

Happiness is measured essentially at the ordinal level of measurement. In some cases, the distances between the categories have been measured externally, i.e., in a separate study. Other studies estimate these distances ‘internally’, i.e., within the context of the correlational studies on the basis of, e.g., ordered probits or related statistical techniques. The most conventional method, however, is still to postulate equidistance, which seems to be acceptable for scales with seven or more categories, but is also applied to 3- and 4-point scales without any methodological scruples, as is done in, e.g., the US General Social Survey (GSS). In Sect. 4, we will consider inequality in situations where happiness is measured at this ordinal level of measurement and focus especially on the pseudo-metric variant. Two applications will be given at the end of that section.

Happiness is measured always as a discrete variable, but in the conversion of the sample findings to happiness information about the population represented by that sample, sometimes a latent variable is postulated, which is mapped onto the discrete scale of measurement. If this latent variable is continuous, the corresponding level of measurement is necessarily metric. The continuous nature of such happiness variables requires a special method for quantification of its inequality, which is developed in Sect. 5. The general result is applied to two specific models of continuous distributions.

The conclusions are collected in Sect. 6.

Although the focus of this paper is on inequality of happiness in nations, most conclusions may apply mutatis mutandis to other phenomena that are measured in a similar way, such as work satisfaction or disparity in self-esteem among pupils in schools.

2 The Measurement of Happiness

In cross-national studies, happiness is usually measured by self-report to single questions. A typical and frequently used example of such questions is: “Taking all things together, how would you say things are these days - would you say you are…?” The respondent is requested to make a choice out of e.g., four possible ratings:

  • ‘unhappy’ (R1)

  • ‘not too happy’ (R2)

  • ‘pretty happy’ (R3) and

  • ‘very happy’ (R4)

We shall use the symbol \( R_{j} \) to denote the jth response, being a member of a set of k possible alternatives, written as \( \{R_{j} \mid j = 1(1)k\} \). In the above example with k = 4, happiness is reported by the respondent on a 4-step rating scale. In this context, the possible ratings are referred to as ‘categories’. This term arises from the name “the method of successive categories”, as is in use for the above method of measurement among psychometricians, see e.g., Guildford (1954, Ch. 10). In the World Database of Happiness (further abbreviated WDH), a set of one question and all possible responses to that question is referred to as an “item”.

The basic results in this type of investigation are the counted absolute frequencies \( \{n_{j}\} \) at which members of that sample with size N select one out of the k alternatives \( \{R_{j} \mid j = 1(1)k\} \). Respondents who report “Don’t know” or who do not make any choice at all are ignored in this context.

Questions of the above type are presented to members of a sample from a population, e.g., some nation, to obtain information about the happiness situation in that population. The happiness distribution of such a community is defined as the probability distribution of the individual happiness values in that population. The parameters of this distribution are unknown, but have to be estimated from the frequency distribution of the individual happiness values in the sample that represents that population. The average value and the standard deviation can be estimated from the corresponding frequency distribution parameters of the k responses \( \{R_{j}\} \) in the sample that represents the society of the study.

3 Inequality at the Nominal Level of Measurement

Although happiness is seldom measured at the nominal level, we will start in this section with some views on inequality at that level of measurement. This is done mainly to introduce some concepts considered fundamental in our approach. We will describe happiness ratings in a sample in terms of sets and inequality in terms of relations between the elements of such a set, and we will introduce this approach without the complication of ordering. In addition to this didactical reason, this approach is adopted to demonstrate that in this respect there is an essential difference between nominal and ordinal situations.

At the nominal level of measurement of some specified variable, e.g., happiness, two respondents are either equal or unequal with respect to that variable, according to whether they select the same or different ratings when the same [1, k] happiness scale is presented to them. An obvious measure for the inequality can be obtained by describing the happiness ratings of all N sample members in mathematical terms as elements of a set and by considering inequality as a binary relation on that set, i.e., between any pair of these elements. This relation “is unequal to” is symmetric, but neither reflexive nor transitive. In this approach and in the case of a nominal level of measurement, for each of these \( N^{2} \) pairs, the inequality relation is either true or false, depending on whether the two selected happiness ratings are different or identical. This inequality relation of any pair can be represented as an indicator variable 0/1, where “FALSE” = 0 and “TRUE” = 1; the outcome is referred to as the inequality value of that pair.

Objections may be raised against the way pairs have been counted, resulting in \( N^{2} \) pairs rather than in the expected ½N(N − 1) actually different pairs. Therefore, we have to define explicitly what is considered a pair in this context. In our set-theoretical approach, we consider as an example the set {A, B} with N = 2 elements only; now \( N^{2} = 4 \) binary relations can be identified, not only A − B, but also B − A and even A − A and B − B. The third and the fourth pair may be labeled “improper pairs”, since the inequality relation is antireflexive, and they could be ignored. The second relation (B − A) can also be ignored, albeit for a different reason: since the relation is symmetric, the relation (B − A) gives the same contribution to the total inequality as A − B does already. Nevertheless, we prefer to count all four pairs as pairs, since (a) this makes the mathematics more convenient for larger values of k, (b) the improper pairs will never be counted as unequal ones and therefore they will not contribute to inequality and (c) our choice merely doubles the value of the total inequality. As we shall compare the total inequality to its maximum value, the choice will not affect their ratio, which will be used as an inequality indicator. Hence, in this section and in the next one, the total number of pairs is adopted to be \( N^{2} \) and not the binomial coefficient ½N(N − 1).

We will illustrate our approach by an example with N = 8 and k = 4:

figure b

All shaded cells on the main diagonal correspond to the N improper pairs. Each cell above this diagonal corresponds to one and only one cell below that diagonal with the same content, so all cells together above the main diagonal contribute to the total inequality with the same amount as all cells together below this diagonal. Each of the k blocks, containing \( n_{j}^{2} \) cells and including the \( n_{j} \) shaded cells on the main diagonal, has a zero contribution to the total inequality, irrespective of the way the pairs are counted.

As a measure of the total ‘amount of inequality’ in the set, we count the number of ‘unequal pairs’, i.e., pairs with inequality value = 1. This statistic will be denoted S and equals the sum of the inequality values of all \( N^{2} \) pairs. In our above example S = 46.

For each individual member in the jth category, i.e., the group with size \( n_{j} \), consisting of all subjects that respond in favor of the same response \( R_{j} \), its contribution equals \( \sum\nolimits_{i \ne j} {n_{i}} = N - n_{j} , \) where

$$ N: = \sum\limits_{j = 1}^{k} {n_{j} } $$
(1)

The value of S is obtained as the difference between the total number of pairs and the total number of ‘equal pairs’, resulting in

$$ S: = N^{2} - \sum\limits_{j = 1}^{k} {n_{j}^{2} } $$
(2)

This result is also made clear by considering the above scheme. In our example \( S = 8^{2} - (2^{2} + 1^{2} + 3^{2} + 2^{2}) = 64 - 18 = 46 \), the value that has been found already.
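
The equivalence between counting unequal pairs directly and applying Eq. 2 can be checked with a minimal Python sketch. The list `ratings` below is a hypothetical sample of N = 8 ratings on a 4-point scale, chosen only so that its frequencies are 2, 1, 3 and 2 as in the example; the function names are illustrative and not part of the original study.

```python
from itertools import product


def unequal_pairs_brute_force(ratings):
    """Count all ordered pairs (improper pairs included) with different ratings."""
    return sum(1 for a, b in product(ratings, repeat=2) if a != b)


def unequal_pairs_formula(frequencies):
    """Eq. 2: S = N^2 - sum_j n_j^2."""
    n_total = sum(frequencies)
    return n_total ** 2 - sum(n * n for n in frequencies)


# Hypothetical sample with frequencies (2, 1, 3, 2) over the ratings 1..4.
ratings = [1, 1, 2, 3, 3, 3, 4, 4]
frequencies = [2, 1, 3, 2]

print(unequal_pairs_brute_force(ratings))  # 46
print(unequal_pairs_formula(frequencies))  # 46
```

The improper pairs never satisfy `a != b`, so including them in the product changes nothing in the count.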

In view of the constraint (Eq. 1), the maximum value of S is to be found by application of Lagrange’s method of undetermined multipliers, i.e., by putting the partial derivatives of

$$ F: = S - 2\lambda \left[ {N - \sum\limits_{j = 1}^{k} {n_{j} } } \right] $$
(3)

with respect to all \( \{n_{j} \mid j = 1(1)k\} \) equal to zero. For convenience, we write the multiplier this time as (−2λ). The result is

$$ {\frac{\partial F}{{\partial n_{j} }}} = - 2n_{j} + 2\lambda = 0 \Rightarrow n_{j} = \lambda \Rightarrow n_{j} = N/k \quad \forall j = 1(1)k $$
(4)

The extreme value of F, and therefore also that of S, is a maximum since

$$ {\frac{{\partial^{2} F}}{{\partial n_{j}^{2} }}} = - 2 < 0 \Rightarrow S_{\max } = N^{2} - {\frac{{N^{2} }}{k}} = {\frac{k - 1}{k}}N^{2} $$
(5)
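
As a sanity check on the maximum derived above, one can enumerate every possible frequency distribution for a small sample and pick the one with the largest S. The sketch below does so for N = 8 and k = 4; these values and the helper names are assumptions made for illustration only.

```python
def compositions(total, parts):
    """Yield every tuple of `parts` non-negative integers that sum to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def nominal_s(frequencies):
    """Eq. 2: number of unequal ordered pairs."""
    n_total = sum(frequencies)
    return n_total ** 2 - sum(n * n for n in frequencies)


N, k = 8, 4
best = max(compositions(N, k), key=nominal_s)

print(best, nominal_s(best))   # (2, 2, 2, 2) 48
print((k - 1) * N ** 2 // k)   # 48, i.e. (k - 1) / k * N^2
```

The exhaustive search is only feasible for small N and k, but it agrees with the analytical result that the uniform distribution maximizes S at the nominal level.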

This result enables us to define an index number. We will call it the “Nominal Inequality Index” and denote it as NII, defining it as a number rounded to integer values

$$ \text{NII}: = {\frac{S}{{S_{\max } }}} \times 100 $$
(6)

so 0 ≤ NII ≤ 100. Combination of Eqs. 2, 5 and 6 results in

$$ \text{NII} = {\frac{{N^{2} - \sum\nolimits_{1}^{k} {n_{j}^{2} } }}{{N^{2} }}} \times {\frac{k}{k - 1}} \times 100 $$
(7)
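
Eq. 7 translates directly into a few lines of Python. The sketch below returns the unrounded value and leaves rounding to the caller; the function name is an assumption for illustration.

```python
def nominal_inequality_index(frequencies):
    """NII according to Eq. 7, before rounding to an integer."""
    k = len(frequencies)
    n_total = sum(frequencies)
    s = n_total ** 2 - sum(n * n for n in frequencies)   # Eq. 2
    s_max = (k - 1) * n_total ** 2 / k                   # Eq. 5
    return 100 * s / s_max


print(round(nominal_inequality_index([2, 1, 3, 2])))  # 96
print(round(nominal_inequality_index([2, 2, 2, 2])))  # 100
print(round(nominal_inequality_index([8, 0, 0, 0])))  # 0
```

For the example with frequencies (2, 1, 3, 2), S = 46 and the maximum is 48, so the index is close to its upper bound.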

To people who do not consider equality as a zero-inequality, but as a complementary concept to inequality, the value of 100 − NII might be an option to serve as an indicator for the ‘degree of equality’, but in our view this is not a recommended practice.

4 Inequality at the Ordinal and at the Discrete Metric Level of Measurement

Contrary to measurements at the nominal level, inequality relations in the ordinal case can be distinguished as either “<” or “>”. This is the situation as it occurs in Sect. 2 with four ordered categories. The order in such situations is always assumed to be unambiguous.

4.1 Assumed Equidistance

First we consider the case in which the various ratings are assumed to be equidistant. This means that, e.g., the difference between “very happy” and “not too happy” is equal to that between “pretty happy” and “unhappy”, whereas both these differences are twice that between “pretty happy” and “not too happy”. Under these assumptions, the ordinal numbers of the ratings {1, 2, 3, 4} can be treated as if they were cardinal. This approach will be referred to as the “pseudo-metric” one. In this case, the mathematical operations which are required for the calculation of average values, standard deviations and that of various other statistics, are admissible.

An obvious way to quantify the total inequality is to apply the procedure that was adopted in Sect. 3, but to give the inequality value of each pair a weight proportional to the absolute value of the distance of the ratings of both members on the happiness scale. A suitable value for this distance is the absolute value of the difference of the ratings. In the above example, a pair consisting of the ratings of an unhappy and a pretty happy person contributes to the total amount of inequality with a weight |1–3| = 2. Along this line, the joint contribution of all individuals with the same rating j to the total amount S of inequality can be written as

$$ S(j) = n_{j} \sum\limits_{i = 1}^{k} {\left| {j - i} \right|n_{i} } $$
(8)

and the total amount of inequality is

$$ S: = \sum\limits_{j = 1}^{k} {\sum\limits_{i = 1}^{k} {\left| {j - i} \right|n_{j} n_{i} } } $$
(9)

The maximum value of S can be found by putting the partial derivatives of

$$ F: = S + 2\lambda \left[ {N - \sum\limits_{j = 1}^{k} {n_{j} } } \right] $$
(10)

with respect to each \( n_{j} \) separately equal to zero, adopting 2λ for the multiplier this time. After division by 2:

$$ {\frac{1}{2}}{\frac{\partial F}{{\partial n_{j} }}} = \sum\limits_{i = 1}^{k} {\left| {j - i} \right|n_{i} } - \lambda = 0 \quad \forall j = 1(1)k $$
(11)

which can also be written as

$$ \sum\limits_{i = 1}^{j - 1} {(j - i)n_{i} } + \sum\limits_{i = j + 1}^{k} {(i - j)n_{i} } = \lambda \quad \forall j = 1(1)k $$
(12)

In Table 1, the jth row corresponds to the respondents of category j. The sum of all cells in that row is the contribution of a single individual in that category. Multiplication by the frequency, denoted before the left-hand column results in the total contribution S(j) (Eq. 8) of the jth category. After that multiplication, the total amount of inequality is obtained as the sum S of all k × k cells within the rectangle.

Table 1 Calculation of the total inequality: Respondents can score their happiness from 1 to k, the number of people with score j is n(j) and the individual inequality between scores i and j is |i − j|

For the differentiation of F with respect to \( n_{j} \), one has to be aware that, after the multiplication with the \( n_{j} \), (a) terms with \( n_{j} \) occur in the shaded jth row and the jth column only, (b) the sums of the cells in that column and that row are equal, so their joint contribution to S can be replaced with twice that of the jth row, (c) the result of the partial differentiation can be found in the shaded jth row within the rectangle but for the value of λ, so that the row sum of each row within the rectangle equals λ, and (d) after multiplication of each row sum by \( n_{j} \), the row sums add up to the total amount of inequality: \( \lambda \sum {n_{j} } = \lambda N = S_{\max } \Rightarrow \lambda = S_{\max } /N \).

As the reader can verify by substitution, the solution of the k equations

$$ \sum\limits_{i = 1}^{k} {\left| {j - i} \right|n_{i} } = \lambda \quad \forall j = 1(1)k $$
(13)

is simply \( n_{1} = n_{k} = \tfrac{1}{2}N \) and \( n_{j} = 0 \) for j = 2(1)k − 1. Apparently, the inequality is maximal if the sample members are distributed equally over both terminal categories, leaving empty the other k − 2 categories. From Eq. 9, the corresponding maximum value of S is found to be

$$ S_{\max } = {\frac{1}{2}}(k - 1)N^{2} . $$
(14)
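
The claim that the distance-weighted S of Eq. 9 peaks when the sample is split evenly over the two terminal categories can be checked by exhaustive enumeration for a small case. The sketch below assumes N = 8 and k = 4; the helper names are illustrative.

```python
def compositions(total, parts):
    """Yield every tuple of `parts` non-negative integers that sum to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def ordinal_s(frequencies):
    """Eq. 9: sum over all ordered pairs of categories of |j - i| n_i n_j."""
    k = len(frequencies)
    return sum(abs(j - i) * frequencies[i] * frequencies[j]
               for i in range(k) for j in range(k))


N, k = 8, 4
best = max(compositions(N, k), key=ordinal_s)

print(best, ordinal_s(best))   # (4, 0, 0, 4) 96
print((k - 1) * N ** 2 // 2)   # 96, i.e. (k - 1) * N^2 / 2
```

Evaluating Eq. 9 at the maximizing distribution gives ½(k − 1)N², which is the denominator used for the index in Eq. 15.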

For this situation a discrete inequality index (DII) can be defined in a way analogous to the NII in Eq. 6 for the nominal case, by substitution of the results of Eqs. 9 and 14:

$$ {\text{DII}}: = {\frac{S}{{S_{\max } }}} \times 100 = {\frac{{2\sum\nolimits_{j = 1}^{k} {\sum\nolimits_{i = 1}^{k} {\left| {j - i} \right|n_{i} n_{j} } } }}{{(k - 1)N^{2} }}} \times 100 $$
(15)
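
A minimal Python sketch of Eq. 15 follows. It assumes equidistant categories numbered 1 to k and takes the list of category frequencies as input; the function name is hypothetical.

```python
def discrete_inequality_index(frequencies):
    """DII according to Eq. 15 for equidistant categories 1..k (unrounded)."""
    k = len(frequencies)
    n_total = sum(frequencies)
    # Eq. 9: every ordered pair of categories, weighted by its distance.
    s = sum(abs(j - i) * frequencies[i] * frequencies[j]
            for i in range(k) for j in range(k))
    s_max = 0.5 * (k - 1) * n_total ** 2
    return 100 * s / s_max


print(discrete_inequality_index([2, 1, 3, 2]))  # 81.25
print(discrete_inequality_index([4, 0, 0, 4]))  # 100.0
print(discrete_inequality_index([0, 8, 0, 0]))  # 0.0
```

For the frequencies (2, 1, 3, 2) used in Sect. 3, S = 78 and the maximum is 96, so the same sample that was almost maximally unequal at the nominal level is less so once distances are taken into account.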

In Eq. 9, one may consider raising the difference |j − i| to some power >1 if more weight is to be assigned to the distance, or <1 in case of less weight. As long as there is no evidence for such a choice, we maintain the unity exponent value. There is, however, a quite different reason to consider an exponent of 2, since \( |j - i|^{2} = (j - i)^{2} \), and in this way one gets rid of the absolute values. In that case Eq. 9 is to be replaced with a similar statistic, denoted \( S^{(2)} \):

$$ S^{(2)} : = \sum\limits_{j = 1}^{k} {\sum\limits_{i = 1}^{k} {(j - i)^{2} n_{i} n_{j} } } $$
(16)

$$ \begin{aligned} S^{(2)} & = \sum\limits_{i} {\sum\limits_{j} {j^{2} n_{i} n_{j} } } + \sum\limits_{i} {\sum\limits_{j} {i^{2} n_{i} n_{j} } } - 2\sum\limits_{i} {\sum\limits_{j} {ijn_{i} n_{j} } } \\ & = \left[ {\sum\limits_{j} {n_{j} j^{2} } } \right]\left[ {\sum\limits_{i} {n_{i} } } \right] + \left[ {\sum\limits_{i} {n_{i} i^{2} } } \right]\left[ {\sum\limits_{j} {n_{j} } } \right] - 2\left[ {\sum\limits_{j} {n_{j} j} } \right]\left[ {\sum\limits_{i} {n_{i} i} } \right] \\ & = 2N\sum\limits_{j} {n_{j} j^{2} } - 2\left[ {\sum\limits_{j} {n_{j} j} } \right]^{2} = 2N(N - 1)S^{2} \\ \end{aligned} $$
(17)

where \( S^{2} \) is the sample variance. The maximum value of the sample standard deviation is S = ½(k − 1), as is demonstrated by Kalmijn and Veenhoven (2005, p. 392), so

$$ S_{\max }^{(2)} = {\frac{1}{2}}(k - 1)^{2} N(N - 1) $$
(18)

and this value is also realized when ½N respondents select the response “1” and all other ½N select the response “k”.
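
The identity in Eq. 17 between the squared-distance statistic and the sample variance can be verified numerically. The sketch below uses the frequencies (2, 1, 3, 2) from the earlier example; `statistics.variance` computes the sample variance with divisor N − 1.

```python
import math
from statistics import variance


def s2_pairwise(frequencies):
    """Eq. 16: sum over all ordered pairs of (j - i)^2 n_i n_j."""
    k = len(frequencies)
    return sum((j - i) ** 2 * frequencies[i] * frequencies[j]
               for i in range(k) for j in range(k))


frequencies = [2, 1, 3, 2]
# Expand the frequency table into the individual ratings 1..k.
ratings = [j + 1 for j, n in enumerate(frequencies) for _ in range(n)]
N = len(ratings)

lhs = s2_pairwise(frequencies)
rhs = 2 * N * (N - 1) * variance(ratings)

print(lhs)                     # 158
print(math.isclose(lhs, rhs))  # True
```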

This result is not surprising, but there may be good reasons to prove conjectures like this, even if they are very plausible. Such a reason could be the different findings in the nominal and the ordinal case.

4.2 Estimated Distance Between Response Options

Recently, Veenhoven (2009b) has introduced a method in which the boundaries between the categories are determined empirically, dropping the equidistance assumption in this way. From these boundary values, the mid-interval values (MIV) \( \{m_{j} \mid j = 1(1)k\} \) of the k categories are obtained. Since the positions of the k − 2 intermediate intervals do not occur in the formula for \( S_{\max} \), the obvious conclusion is that in case of the MIV-approach (a) the maximum inequality is obtained again by the equipartition of the sample members over both terminal categories and (b) the formulae Eqs. 9, 14 and 15 are still applicable by simply replacing the ordinal numbers of the categories with the corresponding MIV.
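
One possible reading of this substitution is sketched below in Python. It is an interpretation, not a formula stated in the text: the distance |j − i| is replaced with the distance between the two mid-interval values, and k − 1 in the maximum is replaced with the distance between the two terminal mid-interval values. The MIV list in the example is invented for illustration and does not come from any survey.

```python
def inequality_index_with_positions(frequencies, positions):
    """Index in the spirit of Eq. 15, with category positions instead of ranks.

    `positions` holds one value per category in increasing order (e.g. MIVs).
    Assumption: the range positions[-1] - positions[0] takes the role of k - 1.
    """
    k = len(frequencies)
    n_total = sum(frequencies)
    s = sum(abs(positions[j] - positions[i]) * frequencies[i] * frequencies[j]
            for i in range(k) for j in range(k))
    s_max = 0.5 * (positions[-1] - positions[0]) * n_total ** 2
    return 100 * s / s_max


# With positions 1..k the equidistant index is recovered.
print(inequality_index_with_positions([2, 1, 3, 2], [1, 2, 3, 4]))  # 81.25

# Hypothetical mid-interval values on a 0-10 scale (illustrative only).
hypothetical_miv = [1.2, 4.1, 6.8, 9.3]
print(inequality_index_with_positions([2, 1, 3, 2], hypothetical_miv))
```

Under this reading, the index still reaches 100 when the sample is split evenly over the two terminal categories, whatever the intermediate positions are.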

4.3 Ordinal Level of Measurement

The former of these two conclusions even applies to the truly ordinal situation, but the second does not, since nothing is known about the magnitude of the distances between the positions of the categories; all we know is their algebraic signs. No suitable statistic, such as an ordinal inequality index (OII), has been proposed yet as an indicator for the amount of inequality at this level of measurement. To some readers, it might be alluring to solve this problem by taking into account the number of intermediate categories for each pair and to augment this number by unity; however, the results of this approach will turn out to be identical to those in the pseudo-metric case.

4.4 Relationship with Mean Pair Distance

From Eq. 9, it will be clear that in the metric discrete case, the total amount of inequality S is proportional to the mean pair distance, also known as the mean absolute distance. This statistic is obtained by dividing S by N(N − 1), i.e., ignoring the improper pairs.
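
The relationship can be illustrated with a short Python sketch that computes the mean distance over all proper pairs directly and compares it with S/(N(N − 1)). The sample is hypothetical, with frequencies (2, 1, 3, 2) on a 4-point scale.

```python
import math
from itertools import combinations

# Hypothetical sample of N = 8 ratings with frequencies (2, 1, 3, 2).
ratings = [1, 1, 2, 3, 3, 3, 4, 4]
N = len(ratings)

# Mean pair distance over the N(N - 1)/2 unordered proper pairs.
proper_pairs = list(combinations(ratings, 2))
mean_pair_distance = sum(abs(a - b) for a, b in proper_pairs) / len(proper_pairs)

# Total inequality S of Eq. 9, counted over all N^2 ordered pairs.
s = sum(abs(a - b) for a in ratings for b in ratings)

print(s)                                                 # 78
print(math.isclose(mean_pair_distance, s / (N * (N - 1))))  # True
```

Counting ordered rather than unordered pairs doubles both the numerator and the number of proper pairs, so the mean is unchanged.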

4.5 First Application: Performance Judgment of Current Descriptive Statistics

In a previous study (Kalmijn and Veenhoven 2005), a number of descriptive statistics were judged for their aptness as a dispersion measure of happiness frequency distributions. For a set of distributions, the ranking according to increasing values of a dispersion statistic should be in good agreement with that according to their inequality. The conclusions of that study are reinforced when the hypothetical distributions are ranked according to their DII-values, since two of the four statistics that were considered acceptable on the basis of the 2005 study are now found to have a perfect rank correlation with the DII-values; these statistics are the standard deviation and the mean pair distance. For the latter, this result is not surprising, since this statistic is simply S/(N(N − 1)), where S is defined according to Eq. 9 and N is the sample size.

As long as the only reason to square the absolute distance between the ratings is a mathematical one, we do not propose to make a choice in favor of this option. Note, however, that a choice of \( S^{(2)} \), which is proportional to the sample variance, would have resulted in a perfect rank correlation between the standard deviation and the sample inequality. We consider this finding as an additional support for the recommendation to use the standard deviation as the most appropriate statistic to quantify the sample inequality.

4.6 Second Application: Inaptness of the Gini Coefficient for Happiness Inequality

The above-mentioned study (Kalmijn and Veenhoven 2005) demonstrated the poor performance of the Gini coefficient for this application. The availability of the DII statistic provides additional evidence. Suppose we have a sample with size N = 100 and a [1, 4] scale of measurement. The frequency distribution can be written as \( \{n_{1}, n_{2}, n_{3}, n_{4}\} \). The Happiness Gini index is described as being maximal if the happiness of one person is maximal (= 4) and that of the other 99 is minimal (= 1). For this {99, 0, 0, 1} distribution, one can compute that DII = 4 against DII = 100 in the case of {50, 0, 0, 50}. Although the Gini index claims to cover the range (0, 100), in this situation its maximum attainable value amounts to 33 and is reached for {50, 0, 0, 50}, whereas for {99, 0, 0, 1} the index = 3 only. The DII is a good statistic to demonstrate that for this application, the Gini index is not.
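
The two DII values quoted above can be reproduced with a few lines of Python. The sketch covers the DII only; the Gini values are not recomputed here, since the exact Gini variant used in the cited study is not specified in this section.

```python
def dii(frequencies):
    """Discrete inequality index (Eq. 15) for equidistant categories 1..k."""
    k = len(frequencies)
    n_total = sum(frequencies)
    s = sum(abs(j - i) * frequencies[i] * frequencies[j]
            for i in range(k) for j in range(k))
    return 100 * s / (0.5 * (k - 1) * n_total ** 2)


for distribution in ([99, 0, 0, 1], [50, 0, 0, 50]):
    print(distribution, round(dii(distribution), 2))
# [99, 0, 0, 1] 3.96
# [50, 0, 0, 50] 100.0
```

Rounded to an integer, the first value is the DII = 4 mentioned in the text.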

5 Inequality in Case of Continuous Distributions

Although happiness is always measured as a discrete observable variable, metric or not, there is a good reason to pay attention to the continuous case, more specifically to the beta distribution. This distribution is proposed (Kalmijn and Arends 2010) as the model for the probability distribution of a latent continuous happiness variable, which is mapped onto a discrete ordinal scale of happiness measurement.

The above-mentioned happiness-related variables are latent and unobservable. They are assumed to be random variables with a continuous probability density function (p.d.f.). For the random variable x, the p.d.f. will be denoted g(x). The domain of g(x) may be either finite or infinite, but in view of the fact that in case of happiness-related variables the domain is always finite, only that class of distributions will be dealt with. Without loss of generality, we confine ourselves more specifically to distributions on the domain [0, 1], since a distribution on the [0, 1] domain can always be obtained by linear transformation of a variable on any other finite domain.

The p.d.f. g(x|θ) has p parameters (p ≥ 0). If p > 1, θ is a parameter vector with dimension p. The value θ is estimated on the basis of the observations, given the structure of g(x|θ), which has been chosen by the research worker.

By definition in our situation

$$ \int\limits_{0}^{1} {g\left( {x\left| \theta \right.} \right){\text{d}}x} = 1 $$
(19)

which is in a way the continuous equivalent of Eq. 1 in the discrete case.

The amount of inequality S in the continuous situation is defined in a way that is very similar to the approach in the discrete case. The total distribution is partitioned in differentials, each of which acts as the equivalent of an individual rating in the discrete situation (Fig. 1).

Fig. 1
figure 1

Inequality contribution of the pair g(x)dx and g(x + y)dy in case of a continuous distribution with density g(x)

Consider the part of the distribution between the values x and x + dx with area g(x)dx and a second part of the distribution, at a distance y from x, so between x + y and x + y + dy with area g(x + y)dy. For a given value of x (0 ≤ x ≤ 1), −x ≤ y ≤ 1 − x. It should be noted that g(x + y) is the value of the p.d.f. g(·) of the random variable X at X = x + y.

The contribution of this ‘pair’ to the total amount of inequality can be defined as the product of the two shaded areas and their absolute distance; i.e.,

$$ {\text{d}}S = [g(x){\text{d}}x]\times[g(x + y){\text{d}}y]\times\left| y \right|. $$
(20)

Generally speaking, this total amount of inequality can be written as the double integral

$$ S = \int\limits_{x = 0}^{1} {\int\limits_{y = - x}^{1 - x} {g(x)g(x + y)\left| y \right|{\text{d}}y{\text{d}}x} } = \int\limits_{0}^{1} {\left[ {\int\limits_{0}^{1 - x} {g(x + y)y{\text{d}}y} - \int\limits_{ - x}^{0} {g(x + y)y{\text{d}}y} } \right]g(x){\text{d}}x} $$
(21)

This total amount of inequality S will be referred to as the continuous inequality value (CIV). Contrary to the DII in Sect. 4, this CIV is not an index number. If, however, this CIV is divided by its maximum attainable value and is multiplied by 100, a statistic is obtained, which is referred to as the continuous inequality index (CII), an index in complete analogy to DII for the discrete distributions. Just as DII, both CIV and CII have been developed for this study only and not to extend the standard list of current dispersion descriptive statistics.
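
The double integral of Eq. 21 can be approximated numerically for any density on [0, 1]. The Python sketch below uses a simple midpoint rule, which is a choice made here for illustration and not the integration method of the original study. It writes x′ = x + y, so that the integrand becomes g(x)g(x′)|x′ − x| on the unit square.

```python
def continuous_inequality_value(pdf, grid_size=400):
    """Midpoint-rule approximation of Eq. 21 for a density on [0, 1]."""
    h = 1.0 / grid_size
    xs = [(i + 0.5) * h for i in range(grid_size)]
    gs = [pdf(x) for x in xs]
    total = 0.0
    # The integrand is symmetric in x and x', so sum over x < x' and double.
    for i in range(grid_size):
        for j in range(i + 1, grid_size):
            total += gs[i] * gs[j] * (xs[j] - xs[i])
    return 2.0 * total * h * h


# Uniform density on [0, 1]: the exact value of the integral is 1/3.
print(round(continuous_inequality_value(lambda x: 1.0), 4))  # 0.3333
```

The grid size trades accuracy for run time; densities that are unbounded near 0 or 1 would need a more careful quadrature than this sketch provides.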

For a specified type of the p.d.f. the value of S depends on the value of the parameter θ.

Contrary to the discrete cases in the previous section, the maximum inequality is obtained for the value of the vector θ that maximizes F(θ) = S(θ), but now without a term \( \lambda \left[ {1 - \int_{0}^{1} {g(x\left| \theta \right.){\text{d}}x} } \right], \) since the value of the integral in this term is independent of θ according to Eq. 19.

This value of θ is obtained by putting the derivative of S with respect to θ equal to zero and solving that equation for θ. If p ≥ 2, θ is a vector and S is to be differentiated partially with respect to each element of θ separately; θ is to be solved then from these p equations.

As has been pointed out in the introduction to this section, the above result will be applied to the standard beta distribution specifically. Unfortunately, the standard beta distribution looks too complicated for full analytical elaboration. Numerical integration, however, gives an informative picture of the inequality of a random variable with a beta distribution as a function of its shape parameters. Nevertheless, in order to make clear how the approach works, we additionally needed a quite simple model as an alternative. Our choice was in favor of a model with a linear probability density function and, more specifically, one that was conjectured to have the same limit distribution as the beta distribution in case all parameters approach zero values. So we shall demonstrate this approach in practice for two cases, both on the interval [0, 1]: the symmetric split triangular distribution with p = 1 (CASE I) and the standard beta distribution with p = 2 (CASE II).

Generally speaking, the method can also be applied to distributions of other continuous variables, but these are beyond the scope of this paper as long as they are not relevant to measuring happiness.

CASE I

The symmetric split triangular distribution on [0, 1] (Fig. 2):

$$ g(x): = \left\{ {\begin{array}{*{20}c} {\theta^{ - 2} (\theta - x)} \hfill & {x \in [0,\theta ] \subseteq \left[ {0,{\frac{1}{2}}} \right] \subseteq \mathbb{R}} \hfill \\ {\theta^{ - 2} (x - 1 + \theta )} \hfill & {x \in [1 - \theta ,1] \subseteq \left[ {{\frac{1}{2}},1} \right] \subseteq \mathbb{R}} \hfill \\ 0 \hfill & {\text{otherwise}} \hfill \\ \end{array} } \right. $$
(22)
Fig. 2
figure 2

Split triangular distribution

Check for Eq. 19:

$$ {\text{Since}}\int\limits_{0}^{1} {g(x\left| \theta \right.){\text{d}}x} = \int\limits_{0}^{\theta } {\theta^{ - 2} (\theta - x){\text{d}}x} + \int\limits_{1 - \theta }^{1} {\theta^{ - 2} (x - 1 + \theta ){\text{d}}x} = {\frac{1}{2}} + {\frac{1}{2}} = 1, $$

g(x) is a p.d.f. Because in this case g(x) is not given by a single expression over the complete interval (0, 1), we have to calculate S as the sum of two nontrivial components, one for each of the two subdomains separately. These two components are:

$$ \begin{aligned} S_{1} = & \theta^{ - 4} \int\limits_{0}^{\theta } {\left[ {\int\limits_{0}^{\theta - x} {(\theta - x - y)y{\text{d}}y} + \int\limits_{1 - \theta - x}^{1 - x} {(x + y - 1 + \theta )y{\text{d}}y} - \int\limits_{ - x}^{0} {(\theta - x - y)y{\text{d}}y} } \right]} (\theta - x){\text{d}}x \\ S_{2} = & \theta^{ - 4} \int\limits_{1 - \theta }^{1} {\left[ {\int\limits_{0}^{1 - x} {(x + y - 1 + \theta )y{\text{d}}y} - \int\limits_{ - x}^{\theta - x} {(\theta - x - y)y{\text{d}}y} - \int\limits_{1 - \theta - x}^{0} {(x + y - 1 + \theta )y{\text{d}}y} } \right]} (x - 1 + \theta ){\text{d}}x \\ \end{aligned} $$
(23)

Since p = 1, the maximum inequality is obtained by finding the maximum value of

$$ S(\theta ): = S_{1} + S_{2} = \left( {{\frac{1}{4}} - {\frac{1}{10}}\,\theta } \right) + \left( {{\frac{1}{4}} - {\frac{1}{10}}\,\theta } \right) = {\frac{1}{2}} - {\frac{1}{5}}\,\theta $$
(24)

This maximum value equals ½ and is approached as θ → 0. A larger value of θ reduces the inequality of the distribution, with 2/5 as its minimum value at θ = ½.
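
The closed form S(θ) = ½ − θ/5 can be checked against a direct numerical evaluation of the double integral for the density of Eq. 22. The Python sketch below uses a midpoint rule on a regular grid; the grid size and the three values of θ are arbitrary choices for illustration.

```python
def split_triangular_pdf(x, theta):
    """Density of Eq. 22 on [0, 1], with 0 < theta <= 1/2."""
    if 0.0 <= x <= theta:
        return (theta - x) / theta ** 2
    if 1.0 - theta <= x <= 1.0:
        return (x - 1.0 + theta) / theta ** 2
    return 0.0


def civ_split_triangular(theta, grid_size=400):
    """Midpoint-rule estimate of the double integral of g(x) g(x') |x' - x|."""
    h = 1.0 / grid_size
    xs = [(i + 0.5) * h for i in range(grid_size)]
    gs = [split_triangular_pdf(x, theta) for x in xs]
    total = 0.0
    for i in range(grid_size):
        if gs[i] == 0.0:
            continue  # cells outside both triangles contribute nothing
        for j in range(i + 1, grid_size):
            total += gs[i] * gs[j] * (xs[j] - xs[i])
    return 2.0 * total * h * h


for theta in (0.1, 0.25, 0.5):
    numerical = civ_split_triangular(theta)
    closed_form = 0.5 - theta / 5.0
    print(theta, round(numerical, 3), round(closed_form, 3))
```

At θ = ½ both columns give 0.4, the minimum value mentioned above.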

CASE II

The standard beta distribution

In this case, the random variable x has a p.d.f. with two shape parameters α and β

$$ g(x): = \left\{ {\begin{array}{*{20}c} {[B(\alpha ,\beta )]^{ - 1} x^{\alpha - 1} (1 - x)^{\beta - 1} } \hfill & {x \in [0,1] \subseteq \mathbb{R}\quad \alpha, \beta \in \mathbb{R}^{ + } } \hfill \\ 0 \hfill & {\text{otherwise}} \hfill \\ \end{array} } \right. $$
(25)

where

$$ B(\alpha ,\beta ): = \int\limits_{0}^{1} {t^{\alpha - 1} (1 - t)^{\beta - 1} {\text{d}}t} \quad t \in [0,1] \subseteq \mathbb{R}\quad \alpha, \beta \in \mathbb{R}^{ + } $$
(26)

is the complete beta function with parameters α and β.

The total amount of inequality in this case can be written as

$$ S(\alpha ,\beta ): = [B(\alpha ,\beta )]^{ - 2} (I_{1} - I_{2} ), $$
(27)

where

$$ I_{1} : = \int\limits_{x = 0}^{1} {x^{\alpha - 1} (1 - x)^{\beta - 1} \left[ {\int\limits_{y = 0}^{1 - x} {y(x + y)^{\alpha - 1} (1 - x - y)^{\beta - 1} {\text{d}}y} } \right]{\text{d}}x} $$
(28)

and

$$ I_{2} : = \int\limits_{x = 0}^{1} {x^{\alpha - 1} (1 - x)^{\beta - 1} \left[ {\int\limits_{y = - x}^{0} {y(x + y)^{\alpha - 1} (1 - x - y)^{\beta - 1} {\text{d}}y} } \right]{\text{d}}x} $$
(29)

As long as no simple analytical expression for S(α, β) is available, its value has to be obtained by numerical integration. The result is given in Fig. 3 as a contour plot.
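One possible way to carry out such a numerical integration is sketched below in Python. It estimates S(α, β) as the mean absolute difference between two independent draws from the beta distribution, which is what the double integrals in Eqs. 28 and 29 express once the p.d.f. is normalised. The midpoint rule, the grid size n and the function names are assumptions made for illustration, and the sketch is only meant for α, β ≥ 1, where the p.d.f. stays bounded. It is not the procedure used to prepare the contour plot.

```python
import math


def beta_pdf(x, a, b):
    """Standard beta p.d.f. on (0, 1), evaluated via log-gamma for stability."""
    log_B = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return math.exp((a - 1) * math.log(x) + (b - 1) * math.log(1 - x) - log_B)


def inequality_beta(a, b, n=400):
    """Midpoint-rule estimate of E|X - Y| for X, Y independent Beta(a, b).

    Assumes a >= 1 and b >= 1 so that the density has no endpoint singularity.
    """
    h = 1.0 / n
    xs = [(i + 0.5) * h for i in range(n)]
    w = [beta_pdf(x, a, b) * h for x in xs]  # probability mass per cell
    mass = sum(w)                            # close to 1; used to renormalise
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += w[i] * w[j] * (xs[j] - xs[i])
    # Factor 2 accounts for both orderings of each pair of cells.
    return 2.0 * total / (mass * mass)


for a, b in [(1, 1), (2, 2), (5, 5), (2, 5)]:
    print(a, b, round(inequality_beta(a, b), 4))
```

For α = β = 1 the beta distribution is uniform, so the estimate can be compared with the known mean absolute difference of 1/3 as a sanity check.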

Fig. 3

Contour plot of the continuous inequality value S(α, β) = 0.10(0.05)0.45 for a standard beta distribution with parameters 0 ≤ α, β ≤ 10. (Prepared by Dr. R. J. Stroeker)

In a way, continuous distributions can be considered as limit cases of discrete ones for both k → ∞ and N → ∞. The findings and considerations of CASE I together gave rise to the expectation that the inequality of a beta distribution (a) satisfies CIV < ½ and (b) increases as the sum of α and β decreases. The above contour plot confirms these expectations reasonably well and visualizes the extent to which the inequality depends on the values of the shape parameters. It appears, however, that for very small values of one of the parameters, the inequality shows an unexpectedly steep descent. As long as no parameter values are expected in these regions, there seems to be no reason to worry about that.

5.1 Application to the Measurement of Happiness

In this section, we assumed the domain of the probability distribution to be [0, 1] ⊆ ℝ.

For the distribution of happiness in, e.g., nations, it is usual to adopt [0, 10] ⊆ ℝ as the domain. In that case, Eq. 20 is to be replaced with

$$ S(\theta ) = \int\limits_{x = 0}^{10} {\left[ {\int\limits_{0}^{10 - x} {g^{*} (x + y)y{\text{d}}y} - \int\limits_{ - x}^{0} {g^{*} (x + y)y{\text{d}}y} } \right]g^{*} (x){\text{d}}x,} $$
(30)

where g(x) has been replaced with g*(x) : = (1/10) × g(x), the Jacobian determinant (1/10) being required in order to replace Eq. 15 with

$$ \int\limits_{0}^{10} {g^{*} (x\left| \theta \right.){\text{d}}x = 1} $$
(31)

and x and y have been adjusted accordingly.

As a consequence, this linear transformation of the random variable will not affect the numerical value of the inequality measure S(θ) of its probability distribution, in other words: S(θ) is invariant under linear transformation of the random variable.

6 Conclusions

A set-theoretic approach to inequality as a relation on the set of the responses of all members of a sample from a population produces a number of additional inequality statistics. These statistics can be used for computing the maximum possible degrees of inequality and for ranking different happiness distributions according to increasing inequality. This applies to both discrete and continuous happiness variables separately.

In the discrete situation, happiness is measured on a scale with k ordered categories. In this situation, the inequality of the distribution can adopt a minimum (zero) value, but also a maximum. The latter situation occurs if ½N sample members select the lowest possible rating and the other ½N the highest possible one. This finding even applies to the truly ordinal case, i.e., if the distances between the ratings are unknown.

For the nominal distribution, we defined a measure of happiness inequality, referred to as the NII, and for the discrete metric case the DII. The latter statistic is equal to the mean pair distance, except for a factor that depends on the sample size only. Our intention is not at all to add these indices to the list of current dispersion measures for daily use, but just to use them for selecting the most appropriate measures from that list. In this context, two applications can be mentioned. One is the assessment of various descriptive statistics for the quantification of happiness inequality within nations. This study supports the recommendation to proceed with the standard deviation as a suitable indicator for this situation. The other is that this study delivers additional evidence against the aptness of the Gini coefficient for the same purpose.

In case of a nominal scale, no terminal categories can be identified. Each of the k categories has ‘equal rights’ to be considered as such and, as proven in Sect. 3, the total inequality is maximal if the frequencies in all categories are equal or almost equal. This example demonstrates that problems with variables at the ordinal level of measurement cannot always be solved by treating the variable as nominal.

In case of a continuous distribution, it is possible to define a statistic, called CIV, given the p.d.f. of the probability distribution. For the standard beta distribution with parameters α and β, we found that CIV < 0.5 and that CIV decreases as α and/or β increase. In a contour plot, a more quantitative picture is given for CIV(α, β).