1 Introduction

The happiness literature suggests that human beings set their preferences over a wide range of goods, social and moral values and institutions. In this context, worldwide happiness surveys are widely used both in academic research and in the construction of worldwide happiness indexes that are commonly employed in cross-country comparisons of happiness levels (Kahneman and Krueger 2006; Easterlin 2001; Mentzakis and Moro 2009; Pedersen and Schmidt 2011; MacKerron 2012).

However, the fact that most of these indicators are based on answers to questionnaires subjects their results to at least two main concerns. First, they can be affected by a number of potential errors that stem from language ambiguities, scale comparability and ambiguities related to the time period on which respondents based their answers (Bertrand and Mullainathan 2001). Similarly, Kristensen and Johansson (2008) present a cross-country comparison on job satisfaction for a number of EU countries and highlight that individuals belonging to different cultures also perceive questionnaires differently, which could make any comparison misleading. A second concern is that since country-level happiness indicators can be seen as the outcome results of economic and social policies and institutions, it is plausible to think that they are potentially subject to manipulation; see for example Frey (2011).

In this paper we propose an alternative methodology based on the preferences over different happiness determinants that many millions of people reveal through their decision to migrate to some countries rather than to other potential destinations, measured over a number of years. In a different context, Tiebout (1956) had already suggested that people “vote with their feet” to find the community that provides their optimal bundle of taxes and public goods, and the issue has been analysed in, for example, Banzhaf and Walsh (2008) and Cameron and McConnaha (2006). Here we explore how the size and direction of migration flows are affected by a number of happiness indicators. As will be discussed in the next section and reported in the “Appendix”, these indicators are publicly observed.

Consistent with this insight, our estimation of a gravity model for net migration flows using data from the OECD migration database during the period 1995–2011 reveals that migration flows respond to typical bilateral gravity variables such as income, language, common borders and migration policies, as well as to variables that the happiness literature has proposed as both economic and non-economic determinants of happiness. Dolan et al. (2008) classify those factors into: absolute income, relative income, demographic and social characteristics, social development, time use, relationship with others and characteristics of the place where we live. We control for these variables and incorporate fixed effects to control for non-observable components that are not related to wellbeing, such as the different size of migration across different pairs of countries and other potential idiosyncratic components, as well as time effects to allow for comparison across different years. Once all these factors are taken into account, a desirability index for cross-country comparison is proposed using the estimated coefficients.Footnote 1 We interpret it as a happiness index given that it is based on revealed preferences about happiness indicators. However, regardless of its name, the importance of this index is that it could serve as a relevant instrument for policy makers to weight, according to revealed preferences, the relative importance of a set of economic and social variables and institutions.Footnote 2

The remainder of the paper is organized as follows: Sect. 2 describes the study design, starting with the determinants of happiness usually proposed by the happiness literature and then presenting the details on the empirical strategy, which consists of estimating a gravity model of migration to reveal preferences, using the FEVD panel estimation methodology; Sect. 3 presents and discusses the panel estimation results; Sect. 4 proposes a happiness index based on preferences revealed through migration. Finally, Sect. 5 concludes.

2 Study design

2.1 Dependent variable

Our dependent variable is net migration flows from all over the world (see “Appendix A”) into OECD countries (plus Russia), using data from the OECD migration database during the period 1995–2011.

Due to the problem of missing values for the dependent variable we extracted two different samples: (i) Sample 1 includes countries with the smallest number of missing values; (ii) Sample 2 includes the larger countries as measured by GDP. Apart from missing values, there are also cases with zero migration flows. While country pairs with missing values for the dependent variable are automatically excluded from the regressions, those that have a zero value are not. We cannot be sure whether a zero value is a true zero flow or a missing value that was recorded as a zero; however, for our purposes the relevant issue is whether their existence is non-random. We tested whether both missing and zero values in the dependent variable could result from a self-selection bias in each of the two samples and adjusted the estimation accordingly.

2.2 Explanatory variables

We introduce explanatory variables proposed by the happiness literature as well as control variables based on the gravity model literature. Our list of explanatory variables and the data sources are reported in “Appendix B”.

2.2.1 Happiness variables

Dolan et al. (2008) provide a very complete review of the economic literature on happiness, proposing a classification into six broad groups: (1) absolute and relative income; (2) personal characteristics such as age, gender, ethnicity, household size, number of children, education and marital status; (3) social development characteristics such as education, health (or life expectancy), sector of work (agriculture, manufacturing, services), and unemployment; (4) how we spend our time, described by variables such as hours worked, commuting, care for others, community involvement and volunteering, and religious activities; (5) attitudes and beliefs toward self/others/life, which describe the characteristics of relationships with others with respect to marriage and intimate relationships, family and friends; (6) the wider economic, social and political environment, including a country’s institutions, which is represented by a variety of country characteristics such as inflation, welfare system and public insurance, economic freedom, climate, natural environment, safety, political freedom and nature of policies. These variables have been used in various studies of happiness, such as Easterlin (1995, 2001), Ferrer-i-Carbonell (2005), Mentzakis and Moro (2009), Blanchflower and Oswald (2004, 2008), Pedersen and Schmidt (2011), Peiró (2006), Roysamb et al. (2002), Realo and Dobewall (2011), Abadie (2006), Abadie and Gardeazabal (2008).

We also include as an explanatory variable a traditional happiness indicator taken from survey data, in this case from the World Values Survey (http://www.worldvaluessurvey.org/). This allows us to identify the relationship between the traditional survey variables and our revealed preference measure (migration) and to show the impact of the additional explanatory variables. The significance of this impact suggests that migration decisions may be correlated with a variety of happiness determinants that are not captured by the existing survey-based happiness indicators.

2.2.2 Gravity variables

The migration literature has traditionally used gravity models to account for the determinants of migration flows (see, for example, the recent work by Felbermayr and Toubal 2012; or Hanson and McIntosh 2012). Gravity models relate bilateral flows of trade, investment, or in our case, migration, to the size of the partner countries and the inverse of the distance between them. More generally, the gravity literature includes a number of variables capturing factors that facilitate or hinder migration. In particular, we include pairwise variables such as the distance between each pair of countries, and two dummy variables that take value 1 when the pair of countries shares a common language and a common border respectively and zero otherwise. We include origin-specific and destination-specific variables such as country GDP plus migration policies.Footnote 3

2.3 Empirical model

Beine et al. (2011) provide a theoretical justification for deriving a gravity-type equation from the maximization of the utilities obtained by a representative agent for remaining in the country of origin or migrating to a number of alternative destinations.Footnote 4 These utilities are linear functions of attributes that are specific to either origin or destination, or defined bilaterally for each origin–destination pair. Here, we consider that the determinants of happiness are part of these attributes and, in line with the gravity model literature, we estimate the following specification:

$$\begin{aligned} F_{ijt} =\alpha _0 + \sum \limits _{i=1}^{p_1 } \beta _i {}^{\prime }s_{it} +\sum \limits _{j=1}^{p_2 } \gamma _j {}^{\prime }d_{jt} +\sum \limits _{r=1}^{p_3 } \delta _r {}^{\prime }x_{rt} +\eta _t +u_{ijt} \end{aligned}$$
(1)

where \(F_{ijt}\) is the net flow of people moving from country i to j at time t (migration); \(s_{it}\) is a vector of country-specific variables for the country of origin, \(d_{jt}\) is a vector of country-specific variables for the country of destination; \(x_{rt}\) is a vector of pairwise variables between the origin and destination country; \(\eta _t\) is a year fixed effect; \(\alpha _0\), \(\beta _i\), \(\gamma _j\) and \(\delta _r\) are parameters of the model; and \(u_{ijt}\) is an iid error with zero mean and \(\sigma ^{2}\) variance for countries i and j at time t.

Note that model (1) includes, among other variables, happiness characteristics of the different countries and the associated parameters can be interpreted as individual preferences for these characteristics.
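To make the structure of model (1) concrete, the following minimal sketch evaluates its systematic part for a single origin–destination pair in one year. The function name, argument names and numerical values are hypothetical and chosen only for illustration; they are not estimates from this paper.

```python
def linear_predictor(alpha0, beta, s_it, gamma, d_jt, delta, x_ij, eta_t):
    """Systematic part of model (1) for one origin i, destination j and year t.

    beta, gamma, delta are coefficient lists; s_it, d_jt, x_ij are the matching
    lists of origin, destination and pairwise variables (hypothetical inputs).
    """
    def dot(coefs, values):
        if len(coefs) != len(values):
            raise ValueError("coefficient and variable vectors must have equal length")
        return sum(c * v for c, v in zip(coefs, values))

    return alpha0 + dot(beta, s_it) + dot(gamma, d_jt) + dot(delta, x_ij) + eta_t


# Illustrative call with made-up numbers: one variable in each block.
example = linear_predictor(
    alpha0=1.0,
    beta=[2.0], s_it=[3.0],      # origin block
    gamma=[0.5], d_jt=[4.0],     # destination block
    delta=[-1.0], x_ij=[2.0],    # pairwise block (e.g. a distance measure)
    eta_t=0.5,                   # year effect
)
```

Under the revealed-preference reading, a positive destination coefficient on a happiness determinant means that, other things equal, more of that characteristic at the destination is associated with a larger net inflow.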

2.4 Estimation strategy

The estimation results are obtained using Fixed Effects Vector Decomposition (FEVD) with a first stage Heckman correction. The use of fixed effects is justified by the standard Hausman (1978) test. The use of the FEVD method (Plümper and Troeger 2007) circumvents the elimination of time-invariant variables that occurs in the traditional fixed effect model, whereas the two-stage Heckman estimation addresses the potential presence of self-selection bias (probability of having observable net flows strictly different from zero). A similar approach has been used, for example, by Helpman et al. (2008) in trade or by Beine et al. (2011) to model migration.Footnote 5

Identification of the model is achieved by including in the first-stage Probit specification several variables that should have an impact on the fixed costs of migration, such as: (i) for the origin country, a dummy for being an oil producer, a dummy for authoritarian country, the country’s average fertility rate, lagged emigration policies, and an island indicator; (ii) for the destination country, the lagged introduction of restrictive migration policies, conservative policies, and liberal migration policies; (iii) finally, the existence of a common currency, common religion and free trade area.Footnote 6

Our estimation results suggest the presence of selection bias indicated by a significant inverse Mills ratio. To control for the potential correlation of the error term in the primary and the selection equation we also considered the Mundlak–Chamberlain approach, with no qualitative change in the estimated results or in the subsequent ranking of countries presented in the following section. For the sake of brevity we show in this paper our baseline specification that is based on a unique estimated inverse Mills ratio for the whole sample period.
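The selection correction can be illustrated with a short sketch of the quantity that links the two stages. It assumes the textbook Heckman construction, in which the inverse Mills ratio for a selected observation is the standard normal density divided by the standard normal distribution function, both evaluated at the fitted first-stage Probit index. The function names and the use of `None` for missing flows are assumptions of this sketch, not details taken from the paper.

```python
import math


def is_selected(net_flow):
    """Selection indicator: 1 if the net flow is observed and non-zero, else 0.

    Missing flows are represented by None (an assumption of this sketch).
    """
    return 1 if net_flow is not None and net_flow != 0 else 0


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal distribution function, written with erfc for tail accuracy."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def inverse_mills_ratio(probit_index):
    """pdf(z) / cdf(z), evaluated at the fitted Probit index z of a selected observation."""
    cdf = normal_cdf(probit_index)
    if cdf == 0.0:
        raise ValueError("probit index too negative for double precision")
    return normal_pdf(probit_index) / cdf
```

The ratio falls as the Probit index rises, so country pairs that are very likely to have an observable non-zero flow receive a small correction term, while pairs that are unlikely to be observed receive a large one. A significant coefficient on this term in the second stage is the usual indication of selection bias.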

3 Results and discussion

The benchmark estimation results are presented in Table 1. The table specifies clearly which variables have been used only in the first stage (selection variables) and which have been used in both stages (variables of interest). Within the group of variables of interest, it also distinguishes the bilateral variables (most of them gravity controls), the country-level characteristics considered at origin and at destination, and the individual-level characteristics of the migrants measured at their origin country. These characteristics cover physical (age, gender, life expectancy), social (marital status, number of children) and psychological (importance given to family, friends, work, nationality and politics) dimensions of the individual that may influence the decision to migrate. For completeness we insert into the empirical specification the same happiness determinants for both the origin and the destination countries, except for a few variables that did not present enough variance at the destination (OECD countries) and would become collinear with the constant term.Footnote 7 In those cases, those variables are included only for origin countries (worldwide sample).

Table 1 Regression results

The signs of the coefficients are robust across the two samples for the majority of variables. The significance of the lagged dependent variable reveals the persistence of the geography of migration flows over time, which is a common result in the migration literature. The long-run results do not differ qualitatively from those of the short-run, although the long-run impact amplifies that of the short-run due to the positive sign of the lagged dependent variable coefficient. The cumulative nature of this result confirms the high persistence and increasing impact of migration determinants over time.
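The amplification from the short run to the long run can be made explicit under the standard partial-adjustment reading of a lagged dependent variable. The formula is not printed in the text, so the expression below is an assumption about how the long-run coefficients are obtained; \(\hat{\rho }\) (the coefficient on the lagged flow) and \(\hat{\theta }_k\) (the short-run coefficient of variable k) are labels introduced here for illustration.

$$\begin{aligned} \hat{\phi }_k =\frac{\hat{\theta }_k }{1-\hat{\rho }},\qquad 0<\hat{\rho }<1 \end{aligned}$$

With \(0<\hat{\rho }<1\) the denominator lies between zero and one, so the long-run coefficient \(\hat{\phi }_k\) keeps the sign of the short-run coefficient and is larger in absolute value. For instance, hypothetical values \(\hat{\theta }_k =0.2\) and \(\hat{\rho }=0.5\) would give \(\hat{\phi }_k =0.4\).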

Note that the inclusion of the World Values Survey happiness index, measured as the difference between the values taken at the origin and at the destination countries, does not affect the estimation. This index is negatively correlated with net migration flows. Furthermore, the correlation of migration flows with lagged and leading values of the survey-based happiness indexes is negligible. These values do not change much after accounting for all the other factors that affect migration in the Table 1 regressions. This result reveals that information based on standard indexes is a weak representation of observed actions in terms of country preferences revealed through migration. Besides, migration is explained by factors that are not captured by the happiness index: traditional gravity variables, migration policy variables, and various other variables that influence happiness, grouped as described in Sect. 2.

In particular, all the traditional gravity model variables are significant at 1% and have the expected signs: migration depends negatively on distance but positively on common border and language. Moreover, being a landlocked country decreases migration at origin and at destination. These are country-level factors that are not considered in the two survey-based happiness indexes.

Also significant is a large number of country characteristics which are not taken into account either by the survey-based happiness indexes or by the traditional gravity variables. The happiness literature has highlighted the importance of absolute and relative income and so has the migration literature. Indeed we find that migrants flow out of poorer countries and from more unequal to less unequal countries. Presumably, this is because both absolute and relative income influence preferences as has been reported by the happiness literature.

We also control for a number of personal characteristics which are aggregated at the country level either by taking means or by calculating the percentage of the population that bears a given characteristic in the country. The results show that there is more emigration from origin countries with a higher standard deviation of age, a higher percentage of married and of single people, and a higher percentage of men in the population. The contribution of education to migration is positive, both at origin and at destination. Generally, countries with higher educational levels may offer broader employment opportunities and educated people are more sought after in the labour market. This result underscores the importance of years of education in the domestic and foreign labour markets.

Next we take into account social development characteristics such as unemployment and life expectancy. It would be expected that migration would increase (decrease) with unemployment at the origin (destination). In general, these expectations are confirmed by the results. Life expectancy is a more complex variable because countries where people live longer supply more migrants over time but on the other hand provide fewer labour market vacancies. To account for non-linearity, the square of this variable was included as an additional explanatory variable. After carrying out these modifications, life expectancy is found to decrease migration at the origin. These results are consistent with the hypothesis that life expectancy proxies for general well-being in a country rather than representing labour market considerations.
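With the squared term included, the effect of life expectancy on migration is no longer a single coefficient. As a sketch, writing \(LE\) for life expectancy at the origin and \(\theta _1\), \(\theta _2\) for the coefficients on its level and its square (symbols introduced here for illustration, not taken from Table 1), the marginal effect is

$$\begin{aligned} \frac{\partial F_{ijt}}{\partial LE} =\theta _1 +2\theta _2 LE \end{aligned}$$

so its sign and size depend on the level of life expectancy, and it changes sign at \(LE=-\theta _1 /(2\theta _2 )\) whenever \(\theta _2 \ne 0\).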

Another group of factors influencing country preferences would be the migrant’s attitudes and beliefs. For example, there is less emigration out of countries where more people attribute more importance to work and politics. Perhaps this result is due to migration being less likely the more the migrants are involved in work and political networks in their country. On the contrary, there is more emigration out of countries where higher average importance is given to nationality. The result that migration increases (diminishes) with the level of priority given to men in the origin (destination) country seems to point towards the existence of discrimination motivations to migrate.

The next group of variables concerns several general country characteristics that make countries more or less attractive. The results indicate that there is more emigration out of countries with higher population density, more pollution, and higher altitude. These are undesirable characteristics for most people. On the contrary, emigration is lower out of more peaceful countries but also out of more corrupt ones, as there may be more vested interests in staying within informal networks. On the other hand, immigration is higher into countries with higher population density, higher pollution, lower rainfall, lower altitude, more civil liberties, more peace, and a freer economy. Higher population density and higher pollution can be seen as proxies for a high level of economic activity and social interaction, and therefore for better employment opportunities. For these reasons they may proxy for a location’s attractiveness, even though they may also proxy for congestion diseconomies beyond certain levels. However, it is also relevant to note that, due to its small magnitude, the estimated coefficient associated with pollution only has a marginal influence on the index.

A relevant issue to notice regarding the estimated model is that country size is already taken into account by including pairwise country fixed effects. Other approaches such as the one proposed by Beine et al. (2011) for trade have the advantage that they are sensitive to fluctuations of country size over time. However, the empirical implementation of this framework in our particular context runs into the important problem that the total potential migrant population is not directly observed for any country, as it depends not only on age but also on a myriad of personal, social and economic factors. Therefore, changes in population in a given country may in many cases not correspond to changes in the size of the potential migrant population. This measurement error would be especially important for developing countries with big changes in population.

In a robustness exercise, the model was re-estimated by following a similar approach to Beine et al. (2011) using total population as a measure of population size. Estimation results are not reported for the sake of brevity but they are qualitatively similar in most cases to those in Table 1. However, there are a few but very relevant differences in the proposed happiness ranking, as some countries with big fluctuations in their population such as Bangladesh, India, Nigeria or Tanzania are among the happiest countries in the ranking.

4 A proposal for a happiness index based on revealed preferences

The previous results have shown that there are many variables that establish a relationship between happiness and migration flows. We take their values in the last available year of the sample, 2011, to construct a happiness index where the estimated long-term coefficients are used as the respective weights of the happiness determinants discussed in the previous section. Although the approach in this paper is empirical, the estimated parameters in the model could be interpreted as the value that individuals give to different happiness indicators in their utility function based on their decision to migrate to one country or another. Those coefficients are averaged in two circumstances: (i) when a specific determinant is estimated both at the origin and at the destination;Footnote 8 (ii) when one country is included in both of the samples used. Furthermore, in order to deal with missing values in some variables, we use the deviation from the mean among all countries, which allows us to assume that the missing values are at the sample mean, i.e. non-informative, and thus minimize the noise caused by these cases.

To explain the construction of the index, we start by defining the contribution to the total index of a happiness variable \(y_{it}\) that is defined both for the country of origin and destination. Assume also that \(\hat{\phi }^{o}\) and \(\hat{\phi }^{d}\) are the estimated long-run coefficients associated to origin and destination for that variable, respectively. The contribution of the variable to the happiness index for country i is obtained as

$$\begin{aligned} CV_i =\left( {\frac{-\hat{\phi }^{o}+\hat{\phi }^{d}}{2}} \right) \left( {y_{i,2011} -\bar{y}_{2011}} \right) \end{aligned}$$
(2)

where \(y_{i,2011}\) is the value of the determinant for country i in 2011 and \(\bar{y}_{2011}\) is the average of the determinant among all countries in the sample for data in 2011. If the variable is not bilateral but only defined for the origin country then only \(\hat{\phi }^{o}\) is considered in expression (2). In this computation, coefficients and values taken by explanatory variables have been averaged across the two samples and across origin and destination countries. Note that variables are measured in deviations with respect to the mean, as the relevant information for the ranking is how a country performs in each specific indicator compared to the average.
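Expression (2) and the treatment of missing values can be sketched in a few lines. The function and argument names are hypothetical, and two points are assumptions of this sketch rather than statements from the text: the cross-country mean is computed over the countries with an observed value, and for an origin-only variable the weight is taken to be \(-\hat{\phi }^{o}\) without halving.

```python
def contribution(values_2011, phi_origin, phi_destination=None):
    """Contribution CV_i of one happiness determinant, following expression (2).

    values_2011: dict mapping country -> value in 2011, or None if missing.
    phi_origin, phi_destination: long-run coefficients at origin and destination.
    Returns a dict mapping country -> CV_i.
    """
    observed = [v for v in values_2011.values() if v is not None]
    if not observed:
        raise ValueError("no observed values for this determinant")
    mean = sum(observed) / len(observed)

    if phi_destination is None:
        # Origin-only variable: assumed weight is minus the origin coefficient.
        weight = -phi_origin
    else:
        weight = (-phi_origin + phi_destination) / 2.0

    # A missing value is set to the mean, so its deviation (and contribution) is zero.
    return {
        country: weight * ((mean if value is None else value) - mean)
        for country, value in values_2011.items()
    }


# Made-up example: three countries, one of them with a missing value.
cv = contribution({"A": 1.0, "B": 3.0, "C": None}, phi_origin=-0.2, phi_destination=0.4)
```

The origin coefficient enters with a negative sign because a determinant that pushes people out of a country counts against that country, whereas one that attracts people at the destination counts in its favour.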

The happiness index is then constructed by adding up the contributions of all the variables belonging to the five groups of happiness determinants: absolute and relative income, personal characteristics, attitudes and beliefs toward self/others/life and economic, social and political environment.Footnote 9 The happiness index constructed in this way is presented in Table 2. The final column of Table 2 provides the WVS survey-based happiness indexes for comparison.Footnote 10 The correlations of the happiness index proposed in the paper with the Human Development Index and with GDP (purchasing power parity) are 0.76 and 0.78, respectively.
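The aggregation step is a plain sum of per-variable contributions followed by a sort. The sketch below uses hypothetical variable names and made-up numbers, and it assumes that a country absent from one variable's contributions simply adds zero for that variable.

```python
def happiness_index(contributions_by_variable):
    """Sum per-variable contributions into one index value per country.

    contributions_by_variable: dict mapping variable name -> {country: contribution}.
    """
    totals = {}
    for contributions in contributions_by_variable.values():
        for country, value in contributions.items():
            totals[country] = totals.get(country, 0.0) + value
    return totals


def ranking(totals):
    """Countries ordered from the highest to the lowest index value."""
    return sorted(totals, key=totals.get, reverse=True)


# Made-up contributions for two determinants and three countries.
totals = happiness_index({
    "income": {"A": 0.3, "B": -0.3},
    "pollution": {"A": -0.1, "B": 0.1, "C": 0.05},
})
order = ranking(totals)
```

Because every contribution is a deviation from the cross-country mean, an index value of zero corresponds to a country that is average on every determinant.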

Table 2 Happiness ranking

For most countries, a positive value of the survey-based index is matched by positive net migration flows. However, for a few cases, average self-assessed happiness and average observed net desirability are clearly at odds due to the influence of factors that are not captured by existing happiness indexes. Here we distinguish two main types of countries: those self-proclaimed happy but regarded as undesirable (14 mostly middle-income and emerging economies), and those self-proclaimed unhappy but regarded as desirable (14 mostly high-income countries, many of them transition economies). Close inspection of the five groups of determinants of happiness reveals that, in both cases, the explanation for this mismatch seems to reside in the personal characteristics of those countries’ nationals, followed by the country’s social development characteristics and also, to some extent, the nationals’ attitudes and beliefs.

5 Conclusions

In this paper we propose a happiness index based on migration flows, where migration is taken as a mechanism for revealing preferences. We estimate the impact of a large and diverse number of variables on migration flows, in addition to a survey-based index widely used to rank country happiness. Using these estimated coefficients as weights, we build an alternative ranking based on revealed preferences.

The estimation results reveal that the survey-based index is weakly correlated with migration flows. In fact, 14 middle-income and emerging countries are net migration senders even though they are self-proclaimed happy in surveys, whereas another 14 high-income countries, among them several transition economies, are net migration recipients even though in surveys they are self-proclaimed unhappy. Inspection of the role played by the five groups of determinants of happiness included in the regressions reveals that the explanation seems to reside in the personal characteristics of those countries’ nationals, followed by the country’s social development characteristics and also, to some extent, the nationals’ attitudes and beliefs.

Our index is based on the assumption that, on average, individuals have access to information about potential destination countries and make rational decisions based on this information. Although this is a plausible assumption, our analysis could be extended by lengthening the time period covered by the data and by studying the different motivations to migrate across different clusters of individuals. Moreover, the proposed index could also be improved by increasing the quantity and quality of the variables in the econometric specification. Despite these limitations, we think that any ranking of this type should be based as much as possible on revealed preferences instead of the researchers’ ad hoc postulates. Along these lines, the ranking we propose is not affected by the types of ambiguities in the existing survey-based indexes that potentially make results in different countries not comparable, and it is thus, we believe, a useful alternative measure to be considered for international comparisons.