Our main purpose is to investigate the effect of employment category on subjective well-being. We run regression models of the following type:
$$H_{i} = S_{i}^{\prime } \beta + X_{i}^{\prime } \gamma + p_{i}^{\prime } \lambda + \varepsilon_{i} \tag{1}$$
where \(H_{i}\) is the answer of respondent \(i\) to the subjective well-being question described above; \(S_{i}\) is a vector of indicators for employment category; \(X_{i}\) is a vector of other variables that may affect subjective well-being; \(p_{i}\) is a vector of dummies for province of residence; and \(\varepsilon_{i}\) is the error term. Errors are allowed to be correlated within communes (the primary sampling unit), but not across communes. β, γ, and λ are vectors of coefficients to be estimated.
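As an illustration of what clustering at the commune level involves, the following is a minimal sketch that assumes Eq. (1) is estimated by ordinary least squares. It computes the coefficients and a cluster-robust (sandwich) covariance matrix using only the Python standard library. The function names and the layout of the inputs are hypothetical, no small-sample correction is applied, and in practice a statistics package would be used instead.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("regressors are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def ols_cluster(y, X, clusters):
    """Return OLS coefficients and cluster-robust standard errors.

    y: list of outcomes; X: list of regressor rows (include a constant);
    clusters: list of cluster labels (here: communes), one per observation.
    """
    k = len(X[0])
    bread = inverse(matmul(transpose(X), X))          # (X'X)^{-1}
    xty = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(k)]
    beta = [sum(bread[j][m] * xty[m] for m in range(k)) for j in range(k)]
    resid = [yi - sum(b * x for b, x in zip(beta, row)) for row, yi in zip(X, y)]
    # Sum the scores x_i * e_i within each cluster, so that errors may be
    # correlated within a cluster but are treated as independent across clusters.
    scores = {}
    for row, e, c in zip(X, resid, clusters):
        s = scores.setdefault(c, [0.0] * k)
        for j in range(k):
            s[j] += row[j] * e
    meat = [[sum(s[a] * s[b] for s in scores.values()) for b in range(k)]
            for a in range(k)]
    cov = matmul(matmul(bread, meat), bread)          # sandwich estimator
    se = [max(cov[j][j], 0.0) ** 0.5 for j in range(k)]
    return beta, se
```

A call such as `ols_cluster(H, rows, commune_ids)` would return the coefficient vector in the column order of `rows`, together with standard errors that allow for arbitrary correlation within communes.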
In terms of employment category, we distinguish between self-employment, wage work, and no employment. The vast majority in the last category are either too old or too sick to work. Only very few are ‘unemployed’ in the sense of desiring a job without being able to find one. We distinguish between self-employment in farming, in non-farm enterprises, and in collection of common property resources (CPRs).Footnote 8 Classification is based on respondents’ ‘main’ source of employment, defined as the occupation they spend most time in. Because the main aim of the analysis is to identify the difference in happiness between self-employed and wage workers, respondents with no employment are excluded from the estimation sample.
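The classification rule can be sketched in a few lines. The example below is illustrative only: the category labels, the `hours` field, and the function names are hypothetical and do not reflect the actual survey coding. It assigns each respondent to the occupation with the most time spent and drops respondents with no employment.

```python
from typing import Optional

# Hypothetical labels for the three self-employment categories in the text.
SELF_EMPLOYED = {"farming", "non_farm_enterprise", "cpr_collection"}

def main_employment(hours: dict) -> Optional[str]:
    """Return the occupation with the most time spent, or None if not employed."""
    worked = {job: h for job, h in hours.items() if h > 0}
    if not worked:
        return None
    # Ties are resolved in favour of the occupation listed first.
    return max(worked, key=worked.get)

def estimation_sample(respondents: list) -> list:
    """Classify respondents and exclude those with no employment."""
    sample = []
    for r in respondents:
        category = main_employment(r["hours"])
        if category is None:
            continue  # no employment: excluded from the estimation sample
        sample.append({
            "id": r["id"],
            "category": category,
            "self_employed": category in SELF_EMPLOYED,
        })
    return sample
```

A respondent who spends 30 hours in farming and 10 hours in wage work would thus be classified as self-employed in farming.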
Figure 1 presents the analytical model used to select the elements of \(X_{i}\), the control variables in the regressions. We distinguish between two types of variables. ‘Background variables’ are exogenous to employment. These variables may affect employment and could also directly drive happiness. ‘Intermediate variables’, on the other hand, are potentially affected by employment and therefore may mediate a causal effect from employment to happiness. Controlling for background variables allows us to identify the total, causal effect of employment category on happiness, including indirect effects that operate through income, risk exposure, social networks, and so on. However, the hypotheses derived in Sect. 2 concern the direct effect of employment category on autonomy, competence and relatedness and therefore on subjective well-being. Testing these hypotheses requires that intermediate variables such as income are controlled for.
Our main purpose is therefore to estimate models in which both background and intermediate variables are controlled for, but estimates of the total (direct plus indirect) effect are also of interest.
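The two sets of estimates can be written out explicitly. As a sketch, using notation introduced here only for illustration, partition \(X_{i}\) into background variables \(B_{i}\) and intermediate variables \(M_{i}\). The total effect would then be estimated from a specification that omits \(M_{i}\):
$$H_{i} = S_{i}^{\prime } \beta_{tot} + B_{i}^{\prime } \gamma_{B,tot} + p_{i}^{\prime } \lambda_{tot} + \varepsilon_{tot,i}$$
whereas the direct effect would be estimated from Eq. (1) with both sets of controls included:
$$H_{i} = S_{i}^{\prime } \beta_{dir} + B_{i}^{\prime } \gamma_{B} + M_{i}^{\prime } \gamma_{M} + p_{i}^{\prime } \lambda + \varepsilon_{i}$$
Under this illustrative notation, a difference between \(\beta_{tot}\) and \(\beta_{dir}\) can be read as the part of the employment effect that operates through the intermediate variables.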
Background variables include age, gender, ethnicity, place of birth (commune of current residence or elsewhere), and years of schooling. All other control variables are viewed as ‘intermediate’. Among these variables, we first include income, measured at the household level. While a positive correlation between income and happiness is a standard finding in individual-level analyses of happiness, it has been hotly debated whether this correlation is driven by absolute or relative income (e.g. Easterlin 1974; Cummins 2000; Berry 2009).
We also control for landlessness. In agriculture-based societies, such as rural Vietnam, land is a key source of income, risk coping, prestige, and identity. We include two measures of health: (a) the number of days in the last year the respondent was unable to perform normal activities due to illness, and (b) an indicator for the respondent’s household being hit by any health shock that led to income losses in the past 2 years. We also control for the number of children in the household below 15 years of age and for marital status.
We include two measures of migration, in addition to the place of birth-indicator mentioned above, namely (a) an indicator for a member of the household having migrated temporarily (and currently being away), and (b) an indicator for former household members having permanently migrated to another commune, district or province (see De Jong et al. 2002; Knight and Gunatilaka 2010b).
Measures of social networks are also used (see Powdthavee 2008; Berry and Hansen 1996). We distinguish between membership of the Communist Party, ‘mass organizations’, and other formal groups. Mass organizations are the most important type of formal group in Vietnam and include the Women’s, Farmers’, Youth, and Veterans’ Unions. To proxy the strength of respondents’ informal social networks, we use the number of weddings the respondent’s household has attended in the past year.
We include dummies indicating whether the respondent’s household experienced any of five different types of shocks in the past 2 years. The five types of shocks are (a) health shocks (already discussed above), (b) natural disasters, (c) pest infections and crop disease, (d) ‘economic shocks’ (price changes, unemployment, failure of an investment, and land loss), and (e) a residual category. Finally, we include an indicator for being the household head, because household heads are over-represented in the sample.
Endogeneity
The purpose of these analyses is to investigate the effects of employment category on subjective well-being. However, causality may also run in the other direction (the grey arrow from happiness to employment in Fig. 1). For example, people with a positive outlook may be more likely than others to start their own business. Blanchflower and Oswald (1998) investigate the influence of a range of exogenous, psychological characteristics (measured in childhood) on the probability of becoming self-employed and find only weak effects. This suggests that the effect of happiness on self-employment may also be weak. Nevertheless, to take account of the possibility of a reverse link from happiness to self-employment, we implement an instrumental variables analysis, in which self-employment is instrumented by commune-level characteristics that are exogenous to the psychological characteristics of respondents.
The set of instruments, measured in a questionnaire administered to commune officials, includes (a) the commune level share of self-employment in total employment, (b) wage rates for two common wage jobs (harvesting and construction), separately for males and females, and (c) a dummy for the presence of ‘craft villages’ (villages with a tradition for a particular craft, e.g. basket weaving or pottery). The ideas behind this strategy are that (1) the probability of being self-employed depends on the overall prevalence of self-employment in one’s area of residence, (2) higher wages provide an incentive to take up wage work, and (3) craft villages provide additional opportunities for self-employment.
In the instrumental variables (two-stage least squares) analysis the first-stage regression is a linear probability model for self-employment:
$$s_{i} = Z_{c}^{\prime } \delta + X_{i}^{\prime } \theta + p^{\prime}_{i} \eta + \mu_{i} \tag{2}$$
where \(s_{i}\) is a dummy for being self-employed, \(Z_{c}\) is the vector of commune-level instruments described above (for the commune \(c\) in which respondent \(i\) resides), and \(X_{i}\) and \(p_{i}\) are defined as in Eq. (1). δ, θ and η are parameters to be estimated, and \(\mu_{i}\) is the first-stage error term. The second-stage regression is:
$$H_{i} = \beta_{IV} \hat{s}_{i} + X_{i}^{\prime } \gamma_{IV} + p^{\prime}_{i} \lambda_{IV} + \varepsilon_{IVi} \tag{3}$$
where \(\hat{s}_{i}\) is the predicted value of \(s_{i}\) from the first-stage regression and other variables are defined as in Eq. (1). The subscript “IV” indicates that estimates are from the IV analysis. Table 2 presents the means and medians of the instrumental variables.Footnote 9
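The mechanics of Eqs. (2) and (3) can be illustrated with a minimal sketch that runs the two stages by hand, using only the Python standard library. The function names and the input layout are hypothetical: `Z` holds the commune-level instruments for each respondent, and `W` stacks the exogenous regressors (the elements of \(X_{i}\), the province dummies, and a constant). The sketch returns point estimates only. Standard errors taken naively from a manually run second stage are not valid, so a dedicated IV routine would be needed for inference.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [list(row) + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("regressors are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def ols(y, X):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(r[a] * r[b] for r in X) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(X, y)) for a in range(k)]
    return solve(xtx, xty)

def two_stage_least_squares(H, s, Z, W):
    """Return (beta_iv, coefficients on W) from a manual two-stage procedure."""
    # First stage, Eq. (2): linear probability model of s on instruments and controls.
    first_X = [list(z) + list(w) for z, w in zip(Z, W)]
    first_coef = ols(s, first_X)
    s_hat = [sum(c * x for c, x in zip(first_coef, row)) for row in first_X]
    # Second stage, Eq. (3): replace s with its first-stage prediction.
    second_X = [[sh] + list(w) for sh, w in zip(s_hat, W)]
    coef = ols(H, second_X)
    return coef[0], coef[1:]
```

With a single binary instrument and only a constant in `W`, the estimate of \(\beta_{IV}\) from this procedure reduces to the ratio of the difference in mean \(H_{i}\) to the difference in mean \(s_{i}\) between the two instrument groups.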