Multidimensional polarization for ordinal data

The dominant approach to evaluating distributional features of ordinal variables (e.g. self-reported health status) has been the Allison-Foster bipolarization ordering (henceforth AF). It has not yet been extended to a multidimensional setting. Here we fill this gap. A multidimensional extension of the AF relation is characterized by a sequence of median-preserving spreads on each dimension and association-changing switches. This extension does not pay attention to the dimensions’ association. We then offer one that does and characterize it in terms of classes of polarization measures and welfare functions. Based on these two orderings we construct polarization indices and develop statistical inference for them. We measure bidimensional polarization in educational attainment and life satisfaction across OECD members. Dependence does not affect whether or not countries dominate each other bidimensionally.


Introduction
Ordinal variables (i.e. variables for which only the ordering of the categories is known) abound in the social sciences, e.g. self-reported health status and educational attainment. In recent years they have been widely used to evaluate prosperity, e.g. in the 2011 resolution of the United Nations General Assembly (No. 65/309) and the OECD Better Life Index. The problem with such data is that they have no natural scale. Standard summary statistics such as the mean, the variance and inequality measures point to different conclusions depending on the scale used, which is arbitrary. To account for this, an inequality measurement theory for ordinal data has been developed (Allison and Foster 2004; Kobus and Miłoś 2012; Apouey and Silber 2013; Abul Naga and Stapenhurst 2015; Lv et al. 2015; Cowell and Flachaire 2017). So far a notable approach is the Allison and Foster (2004) ordering, for which the most unequal distribution is the most bipolarized one; we therefore use the word polarization here too. Although this approach has been further studied (Apouey 2007; Abul Naga and Yalcin 2008; Lazar and Silber 2013; Kobus 2015) and is increasingly used in empirical research (Madden 2010; Jones et al. 2011; Dutta and Foster 2013; Arrighi et al. 2015), it has yet to be developed in some natural directions such as multidimensionality. The very few contributions in this direction (Makdisi and Yazbeck 2014; Sonne Schmidt et al. 2016) have limited applicability because they only deal with binary indicators. We construct two multidimensional polarization orderings that are closely related to univariate AF. We characterize them in terms of elementary transformations of probability mass, a class of welfare functions and a class of measures consistent with these orderings. In a companion paper (Kobus and Kurek 2018) we develop statistical inference theory for these new measures. We use these results here to rank OECD countries in terms of educational and life satisfaction polarization, which complements the study by Balestra and Ruiz (2015).
The first ordering we propose, called mAF1, requires that AF holds on each dimension. Such a straightforward generalization does not take into account the association between dimensions; it is therefore an approximation, applicable to cases where the interdependencies between variables are not very strong. As pointed out by Fattore and Maggino (2014) in their review of different approaches to multidimensionality in social sciences: "As a matter of fact, evaluation dimensions are often weakly interdependent (...). It is intrinsic to the true multidimensionality of the concepts related to quality-of-life" (p. 201). We provide a Hardy-Littlewood-Pólya type result (Hardy et al. 1934; henceforth HLP) (Theorem 3) in which we obtain the equivalence between three notions: (i) the unanimity of the class of welfare functions that are increasing below the median, decreasing above the median and have all cross-differences equal to zero, (ii) an implementable criterion that compares distributions consistently with (i), namely, the mAF1 relation, (iii) elementary transformations that reflect the notion of a polarization increase, namely median-preserving spreads on each dimension and association-changing switches, both increasing and decreasing.
The second ordering we propose, called mAF2, increases according to first-order stochastic dominance for points below the multidimensional median (i.e. a vector of medians) and according to survival dominance for points above the median. We show that this relation is equivalent to a class of welfare functions that, for two dimensions, are decreasing below the median, increasing above the median and supermodular. In the general case, the relevant differences are positive or negative. mAF2 is also equivalent to a class of polarization measures that are continuous, decomposable by population subgroups and increasing with respect to mAF2.
Based on the two polarization orderings, we construct polarization indices and show their properties (Lemmas 1-3). These are multidimensional analogues of the α, β family of indices by Abul Naga and Yalcin (2008) (denoted $P_{\alpha,\beta,\gamma}$) and the a, b family of indices by Kobus and Miłoś (2012) (denoted $P_{a,b,c}$). We also study a multidimensional index consistent with mAF2 (denoted $P_{mAF2}$). In a companion working paper (Kobus and Kurek 2018), we obtain explicit formulae for the standard errors of $P_{\alpha,\beta,\gamma}$, $P_{a,b,c}$ and $P_{mAF2}$. Finally, we apply these three measures to study educational and life satisfaction polarization in OECD countries. In empirical applications we use a generalization of the AF relation from Mendelson (1987), largely overlooked by the literature, in which the mass is concentrated around any quantile, not necessarily the median. For the most equal distribution, for which all mass is in one category, there is obviously no differentiation between various quantiles. For the most unequal distribution, one still gets two distinct groups, but they are uneven, e.g. for the first quartile the group sizes are 25% vs. 75%. This generalization considerably extends the applicability of the polarization ordering.
As mentioned, empirical application complements the study by Balestra and Ruiz (2015) (henceforth BR), who use the univariate AF relation. They acknowledge the need for the multidimensional extension of AF but their study lacks statistical analysis. We provide both extensions. We find that while univariate dominances are fairly common, bidimensional dominance is rare. There are only 20 cases of bidimensional polarization dominance and 14 cases of bidimensional welfare dominance -i.e. when welfare is understood as two univariate first-order stochastic dominances occurring together. As a region Northern Europe performs best in terms of low bidimensional polarization. The results are highly significant. Interestingly, the results are the same no matter which polarization measure we use. Since some measures neglect dependence between dimensions and others do not, this suggests that dependence does not play an important role when comparing joint distributions of education and happiness in OECD countries.
Although the AF bipolarization ordering is most often used to measure inequality in ordinal variables, it is important to note that in a cardinal setting, polarization and inequality are different concepts (Esteban and Ray 1994). In particular, polarization may increase after an inequality-decreasing transfer that improves group homogeneity. In an ordinal setting, it is difficult to conceptualize inequality as a deviation from a perfectly equal distribution, because there are as many of them as there are categories. With recent contributions, however, it seems that the differentiation between inequality and polarization for ordinal data is possible. In particular, Gravel et al. (2015) propose the Hammond transfer (Hammond 1976) as a defining concept for inequality for ordinal variables. A Hammond transfer moves two individuals closer in the distribution of an ordinal indicator irrespective of whether the gain equals the loss.  use similar transfers. Cowell and Flachaire (2017) consider various reference points, not only the median (e.g. the highest category). This paper is organized as follows. In Section 2, we cover the basic definitions and notation. In Section 3, we introduce the mAF1 relation and in Section 4 the mAF2 relation. We define axioms and formulate main results concerning these relations (Theorems 1-6). In Section 5 we study measures based on two polarization orderings and their properties (Lemma 1-3). In Section 6 we measure bidimensional polarization in educational attainment and life satisfaction in OECD countries. Finally, in the concluding remarks we give examples of other extensions of AF. This will be the subject of further research. Empirical results are collected in Appendix A and proofs in Appendix B.

Basic definitions and notation
A relation $\preceq$ on a set of probability distributions is a partial ordering if and only if it satisfies
(REFLEX) $p \preceq p$,
(ANTISYM) $p_1 \preceq p_2$ and $p_2 \preceq p_1$ implies $p_1 = p_2$,
(TRANSI) $p_1 \preceq p_2$ and $p_2 \preceq p_3$ implies $p_1 \preceq p_3$.
In what follows we will also use relations that fulfill only (REFLEX) and (TRANSI), which are quasi-orderings. An ordering is a partial ordering in which all elements are comparable (a chain). Each such ordering has an associated indifference relation (being the equivalence relation) defined as $p_1 \sim p_2$ if and only if $p_1 \preceq p_2$ and $p_2 \preceq p_1$. We call an element $p$ maximal (resp. minimal) in $\preceq$ if there exists no element $\tilde{p}$ such that $p \preceq \tilde{p}$ (resp. $\tilde{p} \preceq p$) and $\tilde{p} \neq p$.
Let us take a third dimension and the following probability distribution on it: $p^3(1) = 0.15$, $p^3(2) = 0.55$, $p^3(3) = 0.30$. We notice that $p^j$ is a unidimensional distribution, for which we define the cumulative distribution function $P^j(i) = \sum_{l \le i} p^j(l)$. For example, we get the value $P^3(2) = 0.15 + 0.55 = 0.70$. Let $\bar{P}^j$ denote the survival function for the $j$-th dimension.
In a similar manner we define a multidimensional cumulative distribution function by $P(i) = \sum_{l \le i} p(l)$, where $l \le i$ is understood coordinate-wise, and a multidimensional survival function $\bar{P}$. Continuing our example we have, for example, $P(2, 4, 1) = 0.05$. Let λ, denote, respectively, the set of all probability distributions and cumulative distribution functions.
For each dimension $j$ we define a median $m_j$, which is the number of the category for which $P^j(m_j - 1) < 1/2$ and $P^j(m_j) \ge 1/2$. Let $m = (m_1, \dots, m_k)$ denote the vector of unidimensional medians. We often call the multidimensional median defined in this way simply the median. We assume that it is unique; this assumption can be relaxed, but it is mostly technical. Further, for each dimension $j$, $\tau_j$ is the number of the category for which $P^j(\tau_j - 1) < \pi$ and $P^j(\tau_j) \ge \pi$, where $\pi \in [0, 1]$. So $\tau_j$ is a quantile of the distribution $P^j$, and $\tau_j = m_j$ for $\pi = 0.5$. Let $\tau = (\tau_1, \dots, \tau_k)$ denote a multidimensional quantile. When there is no confusion that we are considering unidimensional objects, we omit the superscript $j$.
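The median and quantile definitions above translate directly into a short computation. The following Python sketch is illustrative only: the function names `cdf` and `quantile` and the numerical tolerance `tol` are assumptions made here, and the input is the marginal $p^3$ of the running example.

```python
from itertools import accumulate

def cdf(p):
    """Cumulative distribution P(1), ..., P(n) of a marginal p given as a list."""
    return list(accumulate(p))

def quantile(p, pi=0.5, tol=1e-12):
    """Smallest category t (1-based) with P(t) >= pi, so that P(t - 1) < pi."""
    for category, value in enumerate(cdf(p), start=1):
        if value >= pi - tol:  # tol guards against floating-point rounding
            return category
    raise ValueError("probabilities sum to less than pi")

p3 = [0.15, 0.55, 0.30]       # marginal from the running example
print(cdf(p3)[1])             # P^3(2), approximately 0.70
print(quantile(p3))           # median category m_3
print(quantile(p3, pi=0.75))  # category of the third quartile
```

Returning the first category at which the cumulative distribution reaches π is equivalent to the two-sided condition in the text, because all earlier categories have a cumulative value below π.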
Finally, let the polarization index be denoted by $P$, a real-valued function whose properties are defined in Section 5. We also consider a class of social welfare functions that are additively separable and symmetric with respect to individuals, as is typically assumed in the multidimensional dominance literature (Atkinson and Bourguignon 1982, p. 190). Let $U$ denote the set of utility functions $u : I \to \mathbb{R}$, which can be interpreted both as a general evaluation function (which includes a utility function) and as a cardinal scale itself (see Yalonetzky (2013) or Cowell and Flachaire (2017) for the same interpretation). A function $u$ measures the contribution of an individual (or other unit) to total welfare and is evaluated at ordinal categories. Formally, let $W : \lambda \times U \to \mathbb{R}$ be a social welfare function such that $W(p, u) = \sum_{i \in I} u(i) p(i)$.
Allison and Foster (2004) postulate that inequality in ordinal data increases when probability mass is moved away from the median, that is, when a so-called median-preserving spread occurs. They introduce a particular relation on the space of distributions (Definition 1) that embodies the notion of one distribution being more equal than the other.

Definition 1 Unidimensional AF (AF)
Let $p_1, p_2$ be two distributions and let $m$ denote the median. We write $p_1 \preceq_{AF} p_2$ if and only if the following conditions hold: (AF1) $p_1, p_2$ have a unique and common median $m$, (AF2) $P_1(i) \le P_2(i)$ for all $i < m$, (AF3) $P_1(i) \ge P_2(i)$ for all $i \ge m$.
The interpretation of the AF ordering is intuitive. In particular, we have that $p_1 \preceq_{AF} p_2$ when $p_1$ is more concentrated (i.e. when there is more probability mass) around the median than $p_2$. The most bipolarized distribution, that is, the one that has half of the mass in the lowest category and half of the mass in the highest category, is the most unequal according to this relation.
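A minimal sketch of how the AF comparison can be checked numerically is given below. It assumes the reading of Definition 1 in which the cumulative distribution of the less polarized distribution lies below the other one for categories under the median and above it from the median onwards. The function names and the tolerance are assumptions of this sketch, and uniqueness of the median is not verified.

```python
from itertools import accumulate

def median(p, tol=1e-12):
    """Category m (1-based) with P(m - 1) < 1/2 and P(m) >= 1/2."""
    for category, value in enumerate(accumulate(p), start=1):
        if value >= 0.5 - tol:
            return category
    raise ValueError("probabilities sum to less than 1/2")

def af_dominates(p1, p2, tol=1e-12):
    """True if p1 is AF-less polarized than p2 (common median required)."""
    m = median(p1)
    if len(p1) != len(p2) or median(p2) != m:
        return False
    P1, P2 = list(accumulate(p1)), list(accumulate(p2))
    # list index i holds P(i + 1): categories below m are indices 0..m-2
    below = all(P1[i] <= P2[i] + tol for i in range(m - 1))
    above = all(P1[i] >= P2[i] - tol for i in range(m - 1, len(p1)))
    return below and above

concentrated = [0.1, 0.8, 0.1]  # hypothetical three-category distributions
spread_out = [0.3, 0.4, 0.3]
print(af_dominates(concentrated, spread_out))
print(af_dominates(spread_out, concentrated))
```

With these hypothetical inputs the first call compares a distribution concentrated at the median category with a more dispersed one sharing the same median, and the second call reverses the roles.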

Dependence neutral multidimensional polarization ordering: a characterization theorem
In a cardinal setting, Atkinson's theorem (Atkinson 1970) states that Lorenz ordering is the largest ordering compatible with each symmetric index fulfilling the Pigou-Dalton transfer axiom. We ask essentially the same question in the multidimensional ordinal framework: for which transfers is the multidimensional AF the largest equivalent ordering? And also, what is the class of indices in which all indices are consistent with the multidimensional AF relation? Finally, what are the comparisons of the distributions in terms of polarization over which all social welfare functions in a given class agree? By answering these questions we get a Hardy-Littlewood-Pólya type result that combines transfers with a dominance ordering and a class of measures and welfare functions. Such results are typically difficult to obtain in a multidimensional framework (Gravel et al. 2015).
We take the natural extension of the unidimensional AF relation, which we call mAF1. Then we define two types of transfers: multidimensional spread and association-changing switch. They are related to mAF1 (Theorem 1). Then we define a class of polarization indices and a class of social welfare functions and show that they too are related to mAF1 (Theorem 2). Together these results imply our main result (Theorem 3). We keep the notation of Allison and Foster (2004), so the dominating distribution is worse in the sense of the polarization relation.

Definition 2 Multidimensional AF (mAF1)
Let $p_1, p_2$ be two probability distributions with a unique and common median $m$. We say that $p_1 \preceq_{mAF1} p_2$ if and only if $p_1^j \preceq_{AF} p_2^j$ for all $j \in \{1, 2, \dots, k\}$.
In the definition above, $p_1^j, p_2^j$ are the marginals of $p_1, p_2$ given by Eq. 1. Note that $m$ denotes the vector of medians $(m_1, m_2)$. If the probability mass on each marginal is concentrated in one category, then the joint distribution is concentrated in one category too. The converse is true as well: if the joint probability mass is concentrated in one (multidimensional) category (which is then also the median $m$), then so is the probability mass on each marginal. This is the least polarized distribution according to mAF1.
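Because mAF1 only looks at marginals, it can be checked by applying the unidimensional comparison dimension by dimension. The sketch below is an illustration under assumptions: joint distributions are stored as dictionaries from 1-based category tuples to mass, the AF check uses the cumulative-distribution conditions around a common median, and all names and example distributions are hypothetical.

```python
from itertools import accumulate

def marginal(p, j, n_j):
    """Marginal of a joint distribution p = {cell: mass} on dimension j (0-based)."""
    out = [0.0] * n_j
    for cell, mass in p.items():
        out[cell[j] - 1] += mass
    return out

def median(p, tol=1e-12):
    for category, value in enumerate(accumulate(p), start=1):
        if value >= 0.5 - tol:
            return category
    raise ValueError("probabilities sum to less than 1/2")

def af(p1, p2, tol=1e-12):
    """Unidimensional AF check: p1 is more concentrated around the common median."""
    m = median(p1)
    if median(p2) != m:
        return False
    P1, P2 = list(accumulate(p1)), list(accumulate(p2))
    return (all(P1[i] <= P2[i] + tol for i in range(m - 1)) and
            all(P1[i] >= P2[i] - tol for i in range(m - 1, len(p1))))

def maf1(p1, p2, sizes):
    """mAF1 check: AF must hold on every marginal."""
    return all(af(marginal(p1, j, n), marginal(p2, j, n))
               for j, n in enumerate(sizes))

sizes = (3, 3)
point = {(2, 2): 1.0}                                   # all mass at the median
diagonal = {(1, 1): 0.25, (2, 2): 0.5, (3, 3): 0.25}    # positive association
counter = {(1, 3): 0.25, (2, 2): 0.5, (3, 1): 0.25}     # negative association
print(maf1(point, diagonal, sizes))
print(maf1(diagonal, counter, sizes), maf1(counter, diagonal, sizes))
```

The last line compares two joint distributions with identical marginals and opposite association; since the check never inspects the joint cells directly, it cannot separate them, which is the dependence neutrality discussed in the text.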
Definition of the multidimensional spread is the following: ceteris paribus probability mass is moved away from the median on one dimension.

Definition 3 Multidimensional Spread
Let $p_1, p_2$ be probability distributions with a unique and common median $m$. We say that $p_2$ was obtained from $p_1$ via a multidimensional spread if and only if for some $j \le k$ and $\varepsilon > 0$ one of the following two conditions holds:
1. There exist $i_1 < i_2 \le m_j$ and $i \in I$ such that $p_2(\tilde{i}) = p_1(\tilde{i}) + \varepsilon$, $p_2(\hat{i}) = p_1(\hat{i}) - \varepsilon$, where $\tilde{i}$ is $i$ except for the $j$-th coordinate, where we put $i_1$, and analogously for $\hat{i}$, where we put $i_2$; $p_1, p_2$ coincide on other coordinates.
2. There exist $i_1 > i_2 \ge m_j$ and $i \in I$ such that $p_2(\tilde{i}) = p_1(\tilde{i}) + \varepsilon$, $p_2(\hat{i}) = p_1(\hat{i}) - \varepsilon$, where $\tilde{i}, \hat{i}$ are the same as in condition 1 and $p_1, p_2$ coincide on other coordinates.
Figure 1 gives examples of the transfer described in Definition 3. The first picture refers to point 1 in the definition and the second picture refers to point 2. The solid lines show medians on both dimensions. Another type of transfer that is a necessary constraint on distributions consistent with mAF1 is the following.
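The spread of Definition 3 can be sketched as an operation on a joint distribution stored as a dictionary. This is a hedged illustration: the function name, the argument names `i_near` and `i_far` (the category losing mass and the category gaining it, both on the same side of the median on dimension `j`) and the example distribution are assumptions, and the sketch does not verify that the median stays unchanged after the transfer.

```python
def multidimensional_spread(p, cell, j, i_near, i_far, eps, m):
    """Move eps of mass away from the median on dimension j (0-based).

    p      : dict mapping 1-based category tuples to probability mass
    cell   : tuple fixing the coordinates other than j
    i_near : category on dimension j that loses mass (closer to the median)
    i_far  : category on dimension j that gains mass (farther from the median)
    m      : tuple of medians
    """
    m_j = m[j]
    below = i_far < i_near <= m_j  # transfer below the median
    above = i_far > i_near >= m_j  # transfer above the median
    if eps <= 0 or not (below or above):
        raise ValueError("not a spread away from the median")
    source = cell[:j] + (i_near,) + cell[j + 1:]
    target = cell[:j] + (i_far,) + cell[j + 1:]
    if p.get(source, 0.0) < eps:
        raise ValueError("not enough mass in the source cell")
    q = dict(p)
    q[source] = q[source] - eps
    q[target] = q.get(target, 0.0) + eps
    return q

m = (2, 2)
p = {(1, 2): 0.2, (2, 2): 0.6, (3, 2): 0.2}  # hypothetical joint distribution
q = multidimensional_spread(p, cell=(2, 2), j=0, i_near=2, i_far=3, eps=0.05, m=m)
print(q)
```

In the usage example, mass 0.05 moves from cell (2, 2) to cell (3, 2), that is, away from the median on the first dimension while the second coordinate is held fixed.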

Definition 4 Association-changing Switch
Let $p_1, p_2$ be probability distributions on $I$. We say that $p_2$ was obtained from $p_1$ via an association-changing switch if and only if for some $i \in I$ and $j_1, j_2 \le k$ there exist $i_1, i_2$ on dimension $j_1$, $h_1, h_2$ on dimension $j_2$ and $\varepsilon \in \mathbb{R}$ such that $p_2(i^1) = p_1(i^1) + \varepsilon$, $p_2(i^2) = p_1(i^2) - \varepsilon$, $p_2(i^3) = p_1(i^3) - \varepsilon$, $p_2(i^4) = p_1(i^4) + \varepsilon$, where $i^1$ is $i$ except that on the $j_1$-th coordinate we put $i_1$ and on the $j_2$-th coordinate we put $h_1$. Analogously, in $i^2$ we put $i_2, h_1$, in $i^3$ we put $i_1, h_2$ and in $i^4$ we put $i_2, h_2$.
Transfers of the type described in Definition 4 either increase or decrease association by putting more mass on the diagonal or counter-diagonal, respectively. Note that no relation is assumed between $i_1$ and $i_2$ or between $h_1$ and $h_2$. Association-increasing switches are described in Tchen (1980), Epstein and Tanny (1980) and Tsui (1999). Figure 1 (the third picture) illustrates this concept. Here, there are only two dimensions, so $j_1 = 1$ and $j_2 = 2$. We choose $i_1 = 2$, $i_2 = 4$, $h_1 = 2$, $h_2 = 5$ and thus the shift takes place between points p(2, 2), p(2, 5), p(4, 2), p(4, 5). We transfer the probability mass from p(2, 5) to p(2, 2), which is then compensated by a transfer from p(4, 2) to p(4, 5). Alternatively, one could take $\varepsilon < 0$ in Definition 4, in which case a transfer would be from p(4, 2) to p(2, 2) and from p(2, 5) to p(4, 5). In our example both transfers cross the median, but in general this does not have to be the case. In a space of three or more dimensions (Fig. 2), an association-changing switch takes place between two dimensions with the remaining dimensions fixed.
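The bidimensional example can be reproduced numerically to see that such a switch leaves both marginals untouched. The sketch below is illustrative: it assumes that mass ε is added at (i1, h1) and (i2, h2) and removed at (i1, h2) and (i2, h1), as in the first transfer described in the example, and it uses a hypothetical uniform distribution on a 5 × 5 grid. Function names are assumptions.

```python
def association_switch(p, base, j1, j2, i1, i2, h1, h2, eps):
    """Apply an association-changing switch on dimensions j1, j2 (0-based)."""
    def cell(a, b):
        c = list(base)       # coordinates other than j1, j2 stay as in base
        c[j1], c[j2] = a, b
        return tuple(c)
    q = dict(p)
    changes = ((cell(i1, h1), +1), (cell(i2, h1), -1),
               (cell(i1, h2), -1), (cell(i2, h2), +1))
    for c, sign in changes:
        q[c] = q.get(c, 0.0) + sign * eps
        if q[c] < 0:
            raise ValueError("switch would create negative mass")
    return q

def marginal(p, j, n_j):
    out = [0.0] * n_j
    for c, mass in p.items():
        out[c[j] - 1] += mass
    return out

# hypothetical uniform joint distribution on a 5 x 5 grid
p = {(a, b): 0.04 for a in range(1, 6) for b in range(1, 6)}
q = association_switch(p, base=(1, 1), j1=0, j2=1,
                       i1=2, i2=4, h1=2, h2=5, eps=0.01)
print(q[(2, 2)], q[(2, 5)], q[(4, 2)], q[(4, 5)])
print(marginal(p, 0, 5), marginal(q, 0, 5))
```

Each row and each column touched by the switch gains ε in one cell and loses ε in another, which is why the marginal distributions printed on the last line agree up to rounding.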
The following theorem shows that the mAF1 relation and the transfers of Definitions 3 and 4 are related.
Theorem 1 Let $p_1, p_n$ be probability distributions with a unique and common median $m$. The following statements are equivalent: (i) $p_1 \preceq_{mAF1} p_n$. (ii) There exists a sequence $p_i$, $i = 1, \dots, n$, such that $p_i$ differs from $p_{i+1}$ by either a multidimensional spread or an association-changing switch.
A class of indices consistent with Definition 3 increases following a multidimensional spread.
Fig. 2 The association-changing switch in three dimensions
Definition 5 An index $P$ is increasing with respect to a multidimensional spread if $P(p_1) \ge P(p_2)$, where $p_1$ was obtained from $p_2$ via a multidimensional spread.
A class of indices consistent with Definition 4 does not change following an association-decreasing or association-increasing switch.
Definition 6 An index $P$ is neutral with respect to an association-changing switch if $P(p_1) = P(p_2)$, where $p_1$ was obtained from $p_2$ via an association-changing switch.
Similarly to polarization indices, a class of welfare functions that is consistent with Definitions 3 and 4 increases with respect to the unidimensional AF relation on each marginal and treats the dependence structure as irrelevant. Let us define the discrete analogue of a differential operator, $D_l u(i_1, i_2, \dots, i_k) = u(i_1, i_2, \dots, i_l + 1, \dots, i_k) - u(i_1, i_2, \dots, i_l, \dots, i_k)$. We consider the class $U_1$ of utility functions that are increasing for points below the median, decreasing for points above the median and have all cross-differences equal to zero. This class of welfare functions is reminiscent of ALEP neutrality (Kannai 1980). These are functions that increase with the ordering on each marginal distribution and do not pay attention to the dependence.
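The difference operator and the zero cross-difference condition can be illustrated with a small sketch. The helper name `D` and both utility functions below are hypothetical examples chosen for this illustration: `u_sep` is additively separable and peaks at an assumed median (2, 3), while `u_prod` is a product that is not separable.

```python
def D(u, l):
    """Discrete difference of u in coordinate l (0-based): u(.., i_l + 1, ..) - u(i)."""
    def du(i):
        shifted = i[:l] + (i[l] + 1,) + i[l + 1:]
        return u(shifted) - u(i)
    return du

m = (2, 3)  # assumed median for the example

def u_sep(i):
    """Additively separable utility, highest at the median."""
    return -abs(i[0] - m[0]) - abs(i[1] - m[1])

def u_prod(i):
    """Non-separable utility used as a contrast."""
    return i[0] * i[1]

print(D(u_sep, 0)((1, 1)))         # first difference below the median
print(D(u_sep, 0)((2, 1)))         # first difference from the median upwards
print(D(D(u_sep, 0), 1)((1, 1)))   # cross-difference of the separable utility
print(D(D(u_prod, 0), 1)((1, 1)))  # cross-difference of the product utility
```

Applying `D` twice in different coordinates gives the cross-difference; for an additively separable utility the terms cancel, which is the property that makes welfare insensitive to the dependence between dimensions.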
Theorem 2 Let $p_1, p_n$ be probability distributions with a unique and common median $m$. The following statements are equivalent: (i) $p_1 \preceq_{mAF1} p_n$. (ii) $P(p_1) \le P(p_n)$ for all $P$ satisfying Definitions 5 and 6.
Combining Theorems 1 and 2 gives us the main result that characterizes the multidimensional polarization relation in the HLP spirit.
Theorem 3 Let $p_1, p_n$ be probability distributions with a unique and common median $m$. The following statements are equivalent: (i) $p_1 \preceq_{mAF1} p_n$. (ii) There exists a sequence $p_i$ such that $P(p_i) \le P(p_{i+1})$ and $p_i$ differs from $p_{i+1}$ by a multidimensional spread or an association-changing switch. (iii) $P(p_1) \le P(p_n)$ for all $P$ satisfying Definitions 5 and 6. (iv) $W(p_1, u) \ge W(p_n, u)$ for all $u \in U_1$.

Dependence increasing multidimensional polarization ordering: a characterization theorem
Ignoring dependence is a significant limitation of mAF1, although as mentioned in the Introduction, ordinal indicators used in social sciences are often weakly interdependent. Here we propose a relation which is a different multidimensional generalization of AF and increases with dependence. The more interdependent attributes are, the more polarized the joint distribution is.

Definition 7 Multidimensional AF (mAF2)
Let $p, q$ be two probability distributions with a unique and common median $m$. We say that $p \preceq_{mAF2} q$ if and only if the following two conditions hold: (1) $P(i) \le Q(i)$ for $i \prec m$, (2) $\bar{P}(i) \le \bar{Q}(i)$ for $i \succeq m$, where $\bar{P}, \bar{Q}$ denote the survival functions of $p$ and $q$, respectively.
In other words, for $i \prec m$ the relation mAF2 increases according to first-order stochastic dominance, and for $i \succeq m$ it increases according to survival dominance. Figure 3 helps to explain the intuition behind mAF2. Consider the types of bidimensional transfers of probability mass that are consistent with mAF1. These transfers increase bipolarization on each dimension. That is, for $i \prec m$ probability mass is moved towards lower categories and for $i \succeq m$ probability mass is moved towards higher categories. However, when $i_1 \le m_1$ and $i_2 \ge m_2$, or when $i_1 \ge m_1$ and $i_2 \le m_2$, mass is moved in opposite directions on the two dimensions. Transfers that move mass in the same direction are consistent with either first-order stochastic dominance or survival dominance, which are both well-known partial orderings on distributions. However, transfers that move mass in opposite directions are not characterized by any known relation on distributions. Therefore, we do not impose anything on how mAF2 behaves in such cases.
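The two conditions of Definition 7 can be sketched as a check over the lower-left and upper-right regions around the median. The conventions below are assumptions of this sketch rather than statements from the text: a point is treated as below the median when every coordinate is strictly smaller, as above it when every coordinate is at least as large, and the survival function counts mass in cells strictly greater in every coordinate. These conventions are chosen so that with a single dimension the check reduces to the unidimensional AF conditions. All names and example distributions are hypothetical.

```python
from itertools import product

def joint_cdf(p, i):
    """Mass in cells that are <= i in every coordinate."""
    return sum(mass for cell, mass in p.items()
               if all(c <= x for c, x in zip(cell, i)))

def strict_survival(p, i):
    """Mass in cells that are > i in every coordinate (assumed convention)."""
    return sum(mass for cell, mass in p.items()
               if all(c > x for c, x in zip(cell, i)))

def maf2(p, q, sizes, m, tol=1e-12):
    """True if p is mAF2-less polarized than q around the common median m."""
    for i in product(*(range(1, n + 1) for n in sizes)):
        if all(x < mj for x, mj in zip(i, m)):
            if joint_cdf(p, i) > joint_cdf(q, i) + tol:
                return False
        elif all(x >= mj for x, mj in zip(i, m)):
            if strict_survival(p, i) > strict_survival(q, i) + tol:
                return False
        # mixed regions are left unrestricted, as in the text
    return True

sizes, m = (3, 3), (2, 2)
point = {(2, 2): 1.0}
diagonal = {(1, 1): 0.25, (2, 2): 0.5, (3, 3): 0.25}   # positive association
counter = {(1, 3): 0.25, (2, 2): 0.5, (3, 1): 0.25}    # negative association
print(maf2(point, diagonal, sizes, m))
print(maf2(counter, diagonal, sizes, m), maf2(diagonal, counter, sizes, m))
```

The two example distributions `diagonal` and `counter` have the same marginals, so a marginal-based comparison cannot separate them, whereas this check ranks the positively associated one as more polarized, in line with the idea that the relation increases with dependence.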
Let us consider the following class $U_2$ of utility functions. When $j$ is an odd number, differences of the $j$-th order are lower than zero for categories below the median and greater than zero for categories above the median. When $j$ is an even number, they are all greater than zero. For two dimensions, such functions are decreasing (increasing) below (above) the median and supermodular, i.e. $u(i_2, j_2) - u(i_1, j_2) - u(i_2, j_1) + u(i_1, j_1) \ge 0$ for $i_2 > i_1$ and $j_2 > j_1$. The mAF2 relation is the largest (in the sense of inclusion) relation on distributions that is equivalent to the class of welfare functions for which utility functions belong to $U_2$.
Theorem 4 Let $p_1, p_2$ be probability distributions with a unique and common median $m$. The following statements are equivalent:
The mAF2 relation is also consistent with a class of polarization measures. We state this result here because it gives another characterization of the mAF2 relation; however, the axioms used in Theorem 5 will be defined in the next section. This result will also be useful in studying properties of polarization measures related to the mAF2 relation.
Theorem 5 Let $p_1, p_2$ be two probability distributions with a unique and common median $m$. The following statements are equivalent: (i) $p_1 \preceq_{mAF2} p_2$. (ii) $P(p_1) \le P(p_2)$ for all $P$ satisfying CON, NORM, DECOMP and EQUAL2.
Combining Theorems 4 and 5 provides a characterization result for the mAF2 relation.
Theorem 6 Let p 1 , p 2 be two probability distributions. The following statements are equivalent: (iii) P(p 1 ) ≤ P(p 2 ) for all P satisfying CON, NORM, DECOMP and EQUAL2.

Multidimensional polarization measures
In Sections 3 and 4 we characterized polarization relations mAF1 and mAF2. These are partial orderings, so we will now consider polarization measures related to them. Measures ensure linear ordering on the set of distributions, hence they are always conclusive. The following axioms will be imposed on polarization indices.
CON is a natural technical assumption. The index fulfills NORM when it attaches the value 0 to the least polarized distribution (i.e. all mass is in one multidimensional category) and the value 1 to the most polarized distribution (i.e. the probability mass is equally divided between the points $(1, \dots, 1)$ and $(n_1, \dots, n_k)$).
DECOMP means that a given measure is decomposable by population subgroups. The original notion comes from Shorrocks (1984), and in the ordinal setting it was first defined by Kobus and Miłoś (2012). It works in the following way. If $p_1, p_2$ are the distributions of, respectively, men and women, and α is the population share of the men's group, then $\alpha p_1 + (1 - \alpha) p_2$ is the whole distribution, and a decomposable polarization measure represents the polarization in the whole distribution as some function of the polarization values in the men's and women's distributions.
ATTRDECOMP requires that a multidimensional polarization index can be represented as a function of unidimensional polarization indices. It is a desired property of inequality, polarization or poverty measures, as it allows the evaluation of the contribution of each dimension to overall inequality/polarization. This definition was introduced by Abul Naga and Yalcin (2008). It can also be treated as a way of obtaining a multidimensional measure that is an alternative to, for example, first computing individual welfare levels and then computing the overall index. Zhong (2009) applies the Abul Naga-Geoffard decomposition in a health-income context and Crocci Angelini and Michelangeli (2012) use it to study the evolution of well-being inequality in some EU countries. Kobus and Miłoś (2012) characterize attribute decomposability in the strongest form, namely when association is not taken into account, and provide a simple proof to check the decomposability of specific indices. A strong form of ATTRDECOMP is the one used here.
ADDSEP makes the index additive.
EQUAL, EQUAL1 and EQUAL2 ensure consistency with, respectively, the AF, mAF1 and mAF2 relation. Before we study measures related to our polarization orderings, we note the following relationship between axioms.
Remark 2 If P fulfills ATTRDECOMP and EQUAL, then P fulfills EQUAL1.
We will now offer three families of multidimensional polarization measures. In Lemmas 1-3 we show what properties they satisfy. The following index is consistent with the mAF2 relation, i.e. it fulfills EQUAL2. This is a simple index that adds probability mass for points below the multidimensional median and subtracts probability mass for points above the multidimensional median. The denominator is for normalization.
The following index is consistent with the mAF1 relation.
Here α, β are vectors, therefore we put them in bold. When α j → 1, then the index becomes more sensitive to inequality below the median and it abstracts from it when α j → ∞. Similarly for β. P increases in γ . When γ → −∞ the index places more weight on the dimension with the smallest polarization; on the other hand, when γ → ∞ the index takes into account the dimensions with the highest polarization.
Here is another index consistent with mAF1.

Lemma 3
The index $P_{a,b,c}$, where $P(p)$ is the Kobus and Miłoś (2012) index, fulfills CON, NORM, DECOMP, ATTRDECOMP, EQUAL1 and ADDSEP.
Here a, b, c are vectors. If a j = 1 and b j = 1 for all j , then we get P 1,1 which is the multidimensional version of the absolute value index introduced by Abul Naga and Yalcin (2008). When a j > b j the index is more sensitive to polarization below the median on the j -th dimension, whereas the opposite is true if a j < b j and more weight is attached to polarization above the median.

Educational and life satisfaction polarization among OECD countries
In the empirical analysis below we use the Mendelson (1987) definition which allows for crossing points τ other than the median m. In our dataset we mostly find dominances for which τ = m. We recall the original definition of Mendelson.
Distributions can cross at several quantiles at the same time, but then the values of cdfs are the same for all these quantiles (Lemma 4). Therefore we take AF τ with the lowest α.
Proof From p AF q (τ 1 ) we get that P (i) ≥ Q(i) for i ≥ τ 1 and from p AF q (τ 2 ) we get that P (i) ≤ Q(i) for i < τ 2 , so by combining both we obtain P (i) = Q(i) for τ 1 ≤ i < τ 2 .
We generalize AF τ in the same way as we constructed mAF1.
Definition 9 Multidimensional AF τ (mAF τ ) Let $p_1, p_2$ be two probability distributions with a unique and common multidimensional quantile $\tau = (\tau_j)_{j=1}^{k}$. We say that $p_1 \preceq_{mAF\tau} p_2$ if and only if $p_1^j \preceq_{AF\tau} p_2^j$ for all $j \in \{1, \dots, k\}$.
In words, $p_1 \preceq_{mAF\tau} p_2$ when $p_2$ is less concentrated around the τ-quantile than $p_1$ and is therefore considered more polarized. It should be noted that τ may include different quantiles for the marginal distributions, that is, it might be that $\tau_j \neq \tau_l$ for $j \neq l$. With this definition, after replacing the median by τ, Theorem 3 is still valid. Coming back to measures, when AF τ replaces AF, then the Abul Naga and Yalcin index becomes where now m denotes the τ -th quantile. That is, although Lemma 4 holds, the values of the index may change when the parameter changes from the median to any quantile. The Kobus and Miłoś index changes to where again m denotes the τ -th quantile.
We are now ready to state the results of our empirical analysis. It complements the recent study by Balestra and Ruiz (2015), who use the AF approach to compare OECD countries in terms of polarization in education and life satisfaction. However, they themselves mention that a shortcoming of their study is that they treat the two dimensions separately. We fill this gap. We use the World Values Survey Wave 5 (collected between 2005 and 2009) and Wave 6 (2010-2014), which cover 37 OECD countries. The sample sizes range from 784 (New Zealand) to 2809 (South Africa) respondents. Educational attainment is an ordinal indicator with 9 categories, which are answers to the question "What is the highest education level that you have attained?", and which we encoded as a 4-category variable. Life satisfaction is a 10-category variable answering the question "All things considered, how satisfied are you with your life as a whole these days?". We encoded two adjacent categories as one and obtained a 5-category indicator. Table 1 provides the detailed description. The number of possible pairwise dominance relationships we consider is 666, that is, we consider all situations where there is a dominance of $p_1$ over $p_2$ but there is no dominance of $p_2$ over $p_1$.
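The recoding of life satisfaction and the count of country pairs can be illustrated with a short sketch. The 10-category distribution below is invented purely for illustration and is not taken from the World Values Survey; the function name `merge_adjacent` is likewise an assumption.

```python
from math import comb

def merge_adjacent(p, width=2):
    """Collapse each run of `width` adjacent categories into one category."""
    return [sum(p[k:k + width]) for k in range(0, len(p), width)]

# hypothetical 10-category life satisfaction distribution (not survey data)
p10 = [0.02, 0.03, 0.05, 0.05, 0.10, 0.10, 0.20, 0.20, 0.15, 0.10]
p5 = merge_adjacent(p10)
print(len(p5), p5)

# number of unordered country pairs among 37 countries
print(comb(37, 2))
```

Merging adjacent categories preserves the ordering of the remaining categories, so the recoded variable is still ordinal, and the pair count matches the 666 comparisons mentioned in the text.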
As for unidimensional comparisons, the extension of AF dominance to AF τ dominance reduces the incompleteness of inequality rankings sixfold for each dimension (Table 2). Concerning education, Austria, New Zealand, Norway, Slovakia and USA are dominated by most countries, that is, they are the least educationally polarized. The most polarized countries are Slovenia, Turkey, Mexico and Luxembourg. For life satisfaction the best performing countries are generally Northern European countries and the worst are Eastern European countries and France ("the French unhappiness puzzle" -Senik (2014)). Except for the UK, as a region Northern Europe performs well in both dimensions.
Bidimensional polarization is rare. We find only 20 cases of mAF τ dominance for $\tau = \frac{1}{4}, \frac{1}{3}, \frac{1}{2}, \frac{2}{3}, \frac{3}{4}$. We also find 14 cases when mAF τ dominance practically comes down to two unidimensional FSD dominances, because the crossing point is the last category. These 14 cases thus represent examples of welfare dominance, when welfare is understood as the joint occurrence of two univariate FSD dominances. In those cases, the value of $P_{a,b,c} = \frac{\sum_{i<m} P(i)}{(m-1)\tau}$ does not change with the parameters because they cancel out, whereas the value of $P_{\alpha,\beta,\gamma} = \frac{\sum_{i<m} P(i)^{\alpha} + 1}{(m-1)\tau^{\alpha}}$ does change with α due to non-linearity. Furthermore, many mAF τ dominances repeat, namely, we find dominance for both $\tau = \frac{1}{4}$ and $\tau = \frac{1}{2}$. In those cases the same category is both the first quartile and the median, but it is neither the first nor the last category. Index $P_{mAF2}$ remains unchanged. Tables 3, 4 and 5 report the values of indices together with standard errors. Polarization scores differ significantly between countries, e.g. between 0.28 and 0.47 for the $P_{1,1,2}$ index (Table 3, second column). Table 6 contains the values of a test statistic for differences in polarization measures. Generally, Northern Europe performs best as a region. This is true for welfare dominance as well, so this region enjoys both relatively high aggregate levels of achievement in both dimensions and a low level of polarization whatever scale is considered. The results often resolve ambiguities in the BR study. For example, New Zealand is less bidimensionally polarized than Australia, but the comparison of dimensions separately gives ambiguous conclusions.
The results are typically highly significant. In the case of $P_{\alpha,\beta,\gamma}$, what makes the results less significant is putting less emphasis on polarization in the lower end of the distribution (fourth vs. second column of Table 6). For $P_{a,b,c}$ this happens too, but to a lesser extent and when more emphasis is put on polarization at the higher end of the distribution. The results obtained with $P_{mAF2}$ are practically the same as the results from the two indices based on $mAF1$. As already discussed, unlike $P_{\alpha,\beta,\gamma}$ and $P_{a,b,c}$, $P_{mAF2}$ takes into account the association between dimensions of well-being. This indicates that, in the evaluation of OECD countries in terms of bidimensional polarization, association has no impact on whether a given country dominates another, given that it already dominates it in each dimension.
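As a rough illustration of how index values and standard errors of the kind reported in Tables 3 to 5 can be turned into a significance statement, the sketch below computes a two-sample z statistic. It assumes independent country samples and asymptotic normality of the index estimators; the exact statistic reported in Table 6 comes from the companion paper and is not reproduced here. The index values 0.47 and 0.28 are the endpoints of the range quoted above, while the standard errors of 0.02 are purely hypothetical.

```python
import math


def difference_z(p1, se1, p2, se2):
    """z statistic for H0: equal index values, assuming independent samples."""
    return (p1 - p2) / math.sqrt(se1 ** 2 + se2 ** 2)


def two_sided_p(z):
    """Two-sided p-value under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# 0.47 and 0.28: range endpoints quoted in the text.
# 0.02: hypothetical standard errors, chosen only for illustration.
z = difference_z(0.47, 0.02, 0.28, 0.02)
print(round(z, 2), two_sided_p(z))
```

With standard errors of this size the difference would be far beyond conventional critical values. Larger standard errors or closer index values would weaken the conclusion accordingly.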

Concluding remarks
The Allison-Foster polarization ordering is the main approach to measuring inequality for ordinal data. In this paper, we studied its multidimensional counterparts and proved a set of standard results regarding the dominance relations and the measures based on them. Kobus (2015) relaxes the assumption of a unique median, and even more conclusiveness can be reached with Mendelson's generalization. Still, the reliance on a common crossing is a limiting feature of the AF approach: distributions that do not cross cannot be compared. Abul Naga and Yalcin (2010) and Sarkar and Santra (2016) shed some light on median-independent orderings. This is definitely an important research direction for multidimensional polarization for ordinal data.
Other generalizations of the AF approach are also worth considering. For example, one could take different definitions of the multidimensional median; e.g. in the case of the geometric median, which minimizes the sum of Euclidean distances to the sample points, one considers transformations of the sample points that do not change the minimizer. Another extension is to consider the convex set spanned by the points $(n_1, \ldots, m_j, \ldots, n_k)$, $j = 1, \ldots, k$, and $(m_1, \ldots, m_k)$. Then, multidimensional AF can be defined as first-order dominance below this set and either survival or reverse first-order dominance above this set.
Furthermore, Theorem 1 shows explicitly the types of transfers that $mAF1$ is sensitive to and those it is silent about. Given this theorem, it seems natural to combine this ordering with a relation that treats association separately. Such separation is consistent with the view of a multidimensional distribution as a collection of the marginals and a dependence structure (e.g. Nelsen (1999)). This is an interesting direction because it almost automatically gives an attribute decomposability of polarization measures into univariate polarizations and a measure of dependence. One should bear in mind, however, technical problems related to dependence for discrete distributions (Carley 2002). This is because the concept of dependence is not clear for ordinal variables and, more generally, for discrete variables (Genest and Neslehova 2007). Correlation is certainly not a good measure, as it is not scale-free and has many other drawbacks (e.g. it only picks up linear dependence). It is therefore through comparisons such as those in Table 6, among others, i.e. comparisons of scale-free measures that treat association differently, that the impact of association can be detected for ordinal indicators.
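The point that correlation is not scale-free can be seen in a small numerical sketch. The paired observations and the two category codings below are hypothetical and serve only to illustrate the argument. Both codings respect the same ordering of categories, yet the Pearson correlation changes, while a purely ordinal quantity (here, the number of concordant minus discordant pairs) does not.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def concordance_balance(x, y):
    """Concordant minus discordant pairs; depends only on the orderings."""
    total = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            s = (x[i] - x[j]) * (y[i] - y[j])
            total += (s > 0) - (s < 0)
    return total


# Hypothetical paired ordinal observations coded 1..4 (not data from the paper).
education = [1, 1, 2, 2, 3, 3, 4, 4]
satisfaction = [1, 2, 1, 3, 2, 4, 3, 4]

# An alternative, equally admissible coding: same order, different spacing.
rescale = {1: 1, 2: 2, 3: 3, 4: 10}
education_b = [rescale[c] for c in education]
satisfaction_b = [rescale[c] for c in satisfaction]

r_a = pearson(education, satisfaction)
r_b = pearson(education_b, satisfaction_b)
print(r_a, r_b)  # the two codings give different correlations
print(concordance_balance(education, satisfaction),
      concordance_balance(education_b, satisfaction_b))  # identical
```

In this example the correlation falls from about 0.70 to about 0.49 merely because the top category is relabelled, which is why the comparison of association has to rely on measures that are invariant to order-preserving rescaling.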
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.