1. Introduction

The deepening economic crisis in many western countries has resulted in a general trend of increasingly restrictive policies toward immigration (OECD [2010]). As governments around the world are struggling with rising unemployment rates, there is growing political pressure to increase restrictions on international migration. This political pressure is often based on the popular perception that the presence of migrants reduces employment opportunities for native workers. Increasingly restrictive immigrant policies can, however, be misguided as they ignore the potential positive effects that migrants can have on host economies.

Self-employment and entrepreneurship are generally acknowledged as being crucial for economic growth. Small enterprises play an important role in both developed and developing countries and are often credited with providing specialist goods and services, intensifying competition and increasing economic efficiency (Parker [2004]). High rates of self-employment and entrepreneurship among migrants can have many positive effects such as bringing new skills to the labour market (Hunt [2011]; Ottaviano and Peri [2012]), increasing domestic demand and creating jobs with positive consequences on both employment rates and social security systems (Lacomba and Lagos [2010]). This set of benefits is likely to be particularly relevant in times of recession, when standard employment opportunities fall and unemployment rates increase. This point is somewhat qualified by Constant and Zimmermann ([2014]), who find that, although self-employment was an important way out of unemployment during recessions in Germany between 1983 and 2003, migrants were actually less likely than German natives to use self-employment as a mechanism to avoid unemployment in economic downturns. More generally, the importance of self-employment and entrepreneurship for an economy and its growth is such that several countries (such as Germany, Portugal or the USA) provide visa benefits to arriving immigrants who pledge to invest substantial amounts and/or create new jobs in the host country.

While the majority of studies looking at the link between risk aversion and entrepreneurship (for non-migrant populations) find a significant negative relationship (Stewart and Roth [2001]), this finding is not unanimous and variation exists in the significance and strength of the effects found. Indeed, while Van Praag and Cramer ([2001]), Cramer et al. ([2002]), and Ekelund et al. ([2005]) find a statistically significant negative relationship between risk preferences and the probability of being self-employed, Blanchflower and Oswald ([1998]) find risk preferences not to be linked to the probability of being self-employed. In addition, Caliendo et al. ([2009]) used the German Socio-Economic Panel (SOEP) data to find that individuals with lower risk aversion are more likely to become self-employed, but that this effect is only significant for individuals in transition from regular employment, not for those coming out of unemployment or inactivity. Finally, one more study highlighting the subtleties in the relation between risk aversion and entrepreneurship is that of Dohmen et al. ([2011]), who find a statistically significant negative link between risk aversion and self-employment for domain-specific self-evaluation measures, but not when using a hypothetical lottery question. Taking a different perspective, Hormiga and Bolívar-Cruz ([2014]) look more specifically at the relationship between risk perceptions and entrepreneurship amongst migrants and find that being an immigrant in Spain seems to be associated with lower perceived business risks, which correlate positively with higher entrepreneurship rates, a finding consistent with the negative correlation between risk aversion and entrepreneurship found in other studies. A limitation of this study is that the indicator used to capture risk aversion is a question regarding `fear of starting a new business'. While fear of starting a business and risk aversion might be related, fear is not a direct measure of risk aversion.

In this setting, the study of risk preferences and migration seems of special interest. Jaeger et al. ([2010]) is the only work directly examining the relationship between risk preferences and migration. It finds that, for the case of internal migration of Germans in Germany, individuals who are more willing to take risks are also more willing to migrate between regions within the country. Bonin et al. ([2009]), however, find that first generation immigrants have lower risk preferences than natives, which only equalize in the second generation. Related recent research (such as Umblijs [2012]) has shown that new immigrants without significant networks (be it family, friends or fellow countrymen) at the destination country tend to be more risk loving than those new immigrants who have these networks available at the time of arrival1.

This paper investigates the motives behind migrant entrepreneurship, focusing specifically on the role that risk preferences play in the probability of becoming self-employed2. We look at the difference in risk attitudes within migrant communities, and propose a novel methodology to improve comparability of risk preferences between individuals from different cultural backgrounds. Our risk variable is based on a self-evaluation measure of willingness to take risks in the domain of employment that combines several self-evaluation risk questions with anchoring vignettes. The vignettes allow us to measure risk preferences in a more accurate way, by reducing the bias caused by Differential Item Functioning (DIF), in which individuals interpret the response scale in a non-uniform way. This bias is especially pronounced when the characteristic being measured is subjective and related to earlier experiences of the individual, as is likely to be the case for risk preferences. This bias is further compounded when the population being studied is culturally heterogeneous, since the use of scales has been shown to vary between individuals from different origin countries3. This context suggests that our vignette-adjusted measure should be especially important in the measurement of risk preferences in immigrant populations.

Our vignette-adjusted measure of risk aversion is tested using a unique tailor-made representative survey of the migrant population in Greater Dublin, Ireland, conducted by the authors. Respondents were asked to rate three hypothetical individuals on their willingness to take risks in their work life, and were then asked to rate their own willingness to take risks on the same scale. The information from the hypothetical vignettes is used to perform an econometric adjustment of the self-evaluation responses, eliminating the bias caused by DIF.

The results confirm the existence of a negative relationship between risk aversion and entrepreneurship when using the DIF-adjusted measures, while the correlation of the unadjusted measure with entrepreneurship was not statistically significant. Given the importance of vignette-adjustment to our results, we use a Compound Hierarchical Ordered Probit (CHOPIT) specification to look at the heterogeneous effects of individual vignette choice on the self-evaluation risk measure. We find that entrepreneurs inflate the most risk-averse values and undervalue the most risk loving value of the self-evaluation scale, relative to non-entrepreneurs. The results also suggest the existence of a routine bias in the use of scales between individuals from different countries of birth, as well as between male and female respondents.

Our paper is unique in that it uses a new tailor-made survey instrument that combines self-evaluation risk questions with anchoring vignettes that correct for measurement error caused by DIF. In this way, it provides an improved measure to test the relationship between risk preferences and entrepreneurship in heterogeneous populations, such as the sample of immigrants used in this study – this is also an original contribution to the existing literature on risk aversion and entrepreneurship. Our results suggest that the use of uncorrected DIF measures could be a possible explanation for the variability in the results on the correlation between risk preferences and entrepreneurship reported in previous studies. These are relevant results in light of the economic importance of entrepreneurship and self-employment in particular.

The rest of the article is organized as follows: Section 2 outlines the methodology used; Section 3 provides the econometric framework; Section 4 introduces the data; Section 5 presents the results; and Section 6 concludes.

2. Methodology for measuring risk preferences

We use a vignette approach to counter scale bias in our risk measures in the domain of work. Individuals are asked to rank three hypothetical individuals (vignettes) in terms of their risk preferences before ranking themselves on the same 7-point Likert scale. The comparison between the ranking of hypothetical individuals and the respondent's self-evaluation is used to counter scale interpretation bias. We use non-parametric and semi-parametric scale readjustment methods as well as a more sophisticated Compound Hierarchical Ordered Probit (CHOPIT) model in order to compare these results against the ones obtained using the non-adjusted measure. Comparing these results will show the effect that controlling for Differential Item Functioning (DIF) can have on the general conclusion regarding the link between risk aversion and entrepreneurship in our migrant sample.

1. Rescaling responses using vignettes: non-parametric approach

The simplest way to use vignettes is to rescale individual self-evaluation responses mechanically. This rescaling involves moving from the actual scale presented in the survey to a relative scale, where the adjusted value is the position of the self-evaluation response relative to the values given for the vignettes. In our survey each individual was asked to score three hypothetical individuals, so the responses can be recoded on a 7-point scale. If $y_i$ is the categorical self-assessment for individual $i$, and $z_{ij}$ is the categorical survey response of respondent $i$ on vignette $j$ ($j = 1, 2, 3$), the self-evaluation response can be rescaled relative to the vignettes in the following way:

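$$
C_i =
\begin{cases}
1 & \text{if } y_i < z_{i1} \\
2 & \text{if } y_i = z_{i1} \\
3 & \text{if } z_{i1} < y_i < z_{i2} \\
4 & \text{if } y_i = z_{i2} \\
5 & \text{if } z_{i2} < y_i < z_{i3} \\
6 & \text{if } y_i = z_{i3} \\
7 & \text{if } y_i > z_{i3}
\end{cases}
\tag{1}
$$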

where $C_i$ represents the recoded value based on vignette responses. Equation (1) shows how a survey question accompanied by three vignettes results in an adjusted 7-point scale. The non-parametric approach provides a straightforward way to adjust responses for DIF without using statistical modelling techniques. However, the main limitation of this approach is that recoding is only possible when vignettes are not tied and are consistently ranked. For example, if a respondent gives all three vignettes the same rank and places themselves at that same rank, the adjusted response $C_i$ will not take a single value, but will take the vector {2, 4, 6}. The non-parametric solution to the problem is to delete the responses that contain a vector value of $C_i$. This approach is not the most efficient, as other information could be used to predict actual unobserved values in the case of tied or misordered vignette responses.
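The recoding in Equation (1) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `recode_nonparametric` is hypothetical, and the treatment of ties follows the description above, returning every category whose condition in Equation (1) holds. Other conventions, such as reporting the full range between the smallest and largest satisfied category, are also possible.

```python
def recode_nonparametric(y, z1, z2, z3):
    """Recode a self-assessment y relative to three vignette ratings.

    Returns the sorted list of categories (1..7) whose condition in
    Equation (1) is satisfied. With untied, consistently ordered
    vignettes (z1 < z2 < z3) the list has exactly one element; with
    tied or misordered vignettes it can have several.
    """
    conditions = {
        1: y < z1,
        2: y == z1,
        3: z1 < y < z2,
        4: y == z2,
        5: z2 < y < z3,
        6: y == z3,
        7: y > z3,
    }
    return sorted(c for c, holds in conditions.items() if holds)


# Consistently ordered vignettes give a scalar recoded value.
scalar_case = recode_nonparametric(y=3, z1=2, z2=4, z3=6)

# All three vignettes tied at the respondent's own rating give a vector.
tied_case = recode_nonparametric(y=3, z1=3, z2=3, z3=3)
```

In the non-parametric approach, observations such as `tied_case` would be dropped because the recoded value is not a single category.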

2. Rescaling responses using vignettes: a semi-parametric approach

An improvement over the non-parametric approach of deleting vector values of $C_i$ is to assign the value from the vector that has the highest conditional probability of being true based on other available data. As above, we assume that $C_i$ can be either a scalar or a vector. We assume that, for each individual, there is a single unobserved continuous true value that represents his or her risk preference, denoted by $C_i^*$. We also assume that in cases in which $C_i$ is a vector, we can estimate which value has the highest probability of corresponding to $C_i^*$ conditional on explanatory variables $x_i$. We call the upper and lower bounds of the vignette responses thresholds and denote them as $\tau_c$. Therefore, Equation (1) for $C_i$ can be rewritten in the general form:

$$
C_i = c \quad \text{if } \tau_{c-1} \le C_i^* < \tau_c
\tag{2}
$$

Incorporating the possibility that C i is a vector variable yields the following equation:

$$
C_i = \{m, \dots, n\} \quad \text{if } \tau_{m-1} \le C_i^* < \tau_n
\tag{3}
$$

In order to estimate the underlying value for $C_i^*$, we use a modified version of the ordered probit model to break ties when $C_i$ is a vector value. We call this the semi-parametric approach. This can be done by using explanatory variables $x_i$ to find the value in the vector that is most likely to be the true value of $C_i$, given the information available in $x_i$:

$$
\Pr\left(C_i \in \{m, \dots, n\} \mid x_i\right) = \int_{\tau_{m-1}}^{\tau_n} N\left(C^* \mid x_i \beta\right) \, dC^*
\tag{4}
$$

In the case of scalar values, $C_i$ is selected in the same way as in the non-parametric approach. In the case of a vector value, expression (4) provides a probability for each of the values in the vector, which together sum to one. The vector value with the highest probability, conditional on characteristics $x_i$, is selected as the adjusted risk measure for that individual.
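The tie-breaking step can be illustrated with a short Python sketch. It assumes that the cutpoints $\tau_1, \dots, \tau_6$ and the linear index $x_i \beta$ have already been estimated, and that the latent variable has a unit-variance normal distribution as in a standard ordered probit. The function names and all numeric values below are hypothetical and are not taken from the paper's estimates.

```python
import math


def norm_cdf(v):
    """Standard normal CDF, written with math.erf to avoid dependencies."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))


def break_tie(candidates, xb, taus):
    """Pick the most likely category among the candidate values.

    candidates: categories (1..7) contained in the vector-valued C_i
    xb:         linear index x_i * beta for the respondent (assumed known)
    taus:       the six interior cutpoints tau_1..tau_6, in increasing order
    Returns (best_category, normalized_probabilities).
    """
    bounds = [-math.inf] + list(taus) + [math.inf]  # tau_0 .. tau_7
    probs = {
        c: norm_cdf(bounds[c] - xb) - norm_cdf(bounds[c - 1] - xb)
        for c in candidates
    }
    total = sum(probs.values())
    probs = {c: p / total for c, p in probs.items()}  # sums to one
    best = max(probs, key=probs.get)
    return best, probs


# Hypothetical cutpoints and index for a respondent whose C_i is {2, 4, 6}.
example_taus = [-1.5, -1.0, -0.5, 0.5, 1.0, 1.5]
best, probs = break_tie([2, 4, 6], xb=0.0, taus=example_taus)
```

With these illustrative numbers the middle category has the widest interval around the respondent's index, so it is selected.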

The semi-parametric approach described above involves splitting ties between non-integer values of the readjusted scale, which are caused by misordered vignettes. While the general approach to deal with missing values is multiple imputation, this option is problematic in our case given that each individual has different bounds for the imputed value. As the distribution of the random draws that are required for the multiple imputation approach would be unique to each respondent, the multiple imputation approach is not applicable in our case. However, it is important to note that for the vast majority of cases (85%) the vector of possible responses contains only 4 values out of a scale of 7, and 27% of these cases have vectors that include only two possible values. This means that the error in the standard errors resulting from our estimation strategy is likely to be much lower relative to a situation where the values were missing and bounded only by the maximum and minimum of the scale, i.e. 1 to 7.

1. Selecting predictor variables

In order to break ties in vector responses, the predictor variables $x_i$ should be correlated with the way that respondents use self-evaluation scales but not with their actual risk preferences. For our predictor variables we include: gender of interviewer; nationality of interviewer; time and number of attempts taken to complete the interview; and the range of responses for other vignette questions. The gender and nationality of interviewers have been shown to influence the way respondents answer survey questions4. The time taken to answer the survey and the number of attempts needed to complete it are likely to reflect how carefully each question was considered, and the influence that previous sections of the survey had on the vignette questions, which were towards the end of the questionnaire. In addition to the vignette questions for the risk measure in the domain of work, the survey included six other vignettes related to two other self-evaluation questions. The range of responses, between the lowest and highest response for the two questions, gives an indication of the extent to which the respondent uses the extremes of the scale5.

We selected predictive variables $x_i$ that are related to the response `style' of individuals and heterogeneity in the characteristics of the interviewer, which could influence how questions are answered but are not related to the risk preferences of the individual. This additional information is used to break ties in cases where vignettes are tied or inconsistently ranked.

3. Econometric framework

We use two econometric specifications. The first specification has entrepreneurship and the second has risk aversion as the dependent variable. The first specification is more closely related to the existing literature on entrepreneurship and risk preferences while the second allows us to investigate the heterogeneous effects of vignettes on different groups of migrants.

1. Estimating the relationship between risk aversion and entrepreneurship using the adjusted measure

In order to investigate the link between risk aversion and entrepreneurship we propose the following econometric specification:

$$
y_i^* = \beta_1 \, \mathit{risk}_i + \beta_2 X_i + \varepsilon_i
\tag{5}
$$

where the dependent variable $y_i^*$ denotes whether an individual is self-employed or not at the time of the survey; $\mathit{risk}_i$ represents the risk aversion measure (adjusted or unadjusted in different specifications); and $X_i$ is a vector including demographic characteristics (age, education, and marital status), controls related to migration (years living in Ireland, size of the population of individuals from one's country of origin living in Dublin), previous entrepreneurial experience before migration, industry of employment, and region of birth controls.

In order to capture non-linearities in the link between risk aversion and entrepreneurship, $\mathit{risk}_i$ is divided into three categories: lowrisk relates to individuals with a value of 1 or 2 on the scale, mediumrisk to individuals with values of 3, 4, or 5, and highrisk to individuals with values of 6 or 7. We include mediumrisk and highrisk as dummy control variables, using lowrisk as the reference category.
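As a minimal sketch of this coding, the following Python function maps a value on the 7-point scale to the two dummies that enter the regression. The function name `risk_dummies` is hypothetical; the cut-offs are those stated above.

```python
def risk_dummies(risk):
    """Map a 7-point risk score to the mediumrisk and highrisk dummies.

    lowrisk (scores 1-2) is the reference category, so it is represented
    by both dummies being zero.
    """
    if risk not in range(1, 8):
        raise ValueError("risk must be an integer between 1 and 7")
    return {
        "mediumrisk": int(3 <= risk <= 5),
        "highrisk": int(risk >= 6),
    }


# One row of dummies per possible scale value.
coding = {score: risk_dummies(score) for score in range(1, 8)}
```

Coding the measure this way lets the coefficients on mediumrisk and highrisk differ freely, rather than imposing a linear effect of the scale value.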

2. Estimating heterogeneity in the effect of vignettes on the risk measure: CHOPIT model

In addition to using the adjusted measure of risk preferences as an independent variable, as shown in the econometric specification above, we are also interested in the heterogeneous effects of vignettes on the risk measure itself. In this case, the risk measure is the dependent variable and individual vignette responses enter the right hand side of the equation along with other control variables. While the semi-parametric approach outlined above is comparable with the results reported in the literature, the specification outlined below can provide additional insights into how various groups of migrants interpret the self-evaluation scale differently.

For the parametric specification of the vignette adjustment procedure we use the Compound Hierarchical Ordered Probit (CHOPIT) model which was first applied to vignettes by King et al. ([2004]), and is an extension of the ordered probit model that corrects for DIF. The model explains the self-assessment values using an ordered response equation with thresholds that depend on individual characteristics.

We denote the self-assessment response of individual $i$ by $CS_i$, which is a value on the initial 7-point scale that individuals ranked themselves on. In addition, we assume that the self-assessment value is driven by an underlying, unobservable actual level of risk aversion $CS_i^*$ given by:

$$
CS_i^* = X_i \beta + \xi_i
\tag{6}
$$

where $X_i$ is a set of individual characteristics including age, gender and a dummy variable for being an entrepreneur; $\xi_i$ is the residual term and is comprised of unobserved heterogeneity in risk preferences and an idiosyncratic noise term affecting subjective self-reporting. We assume that $\xi_i$ is normally distributed and is independent of $X_i$, with mean 0. We observe values that correspond to thresholds between vignettes along the latent index:

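$$
CS_i = j \quad \text{if } \tau_{si}^{j-1} < CS_i^* \le \tau_{si}^{j}, \quad j = 1, \dots, 7.
\tag{7}
$$

where the thresholds $\tau_{si}^{j}$ are given by:

$$
\tau_{si}^{0} = -\infty, \quad \tau_{si}^{7} = \infty, \quad \tau_{si}^{1} = X_i \gamma_s^{1} + \upsilon_i, \quad \tau_{si}^{j} = \tau_{si}^{j-1} + \exp\left(X_i \gamma_s^{j}\right), \quad j = 2, 3, 4, 5, 6.
\tag{8}
$$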

In the above equation, $\upsilon_i$ follows an $N(0, \sigma^2)$ distribution and is distributed independently of $X_i$. For the non-adjusted self-evaluation risk questions, $\beta$ and $\gamma_s^{j}$ are not separately identified. In other words, Equation (6) cannot be identified if the use of the scale differs between different groups. However, if an equation specifying vignette selection were defined, the scale could be adjusted to account for the difference in scale interpretation. This is exactly what is done next. Indeed, the vignettes use the same scale as the self-evaluation questions and can be modelled in a similar way to the response equations:

$$
CL_i^* = Z_i \pi + \varepsilon_i,
\tag{9}
$$

$$
CL_i = j \quad \text{if } \tau_{li}^{j-1} < CL_i^* \le \tau_{li}^{j}, \quad j = 1, \dots, 7.
\tag{10}
$$

where $CL_i^*$ represents the true unobserved value of vignette $L$ ($L = 1, 2, 3$) and $Z_i$ represents variables that influence the interpretation of a given vignette. The error term $\varepsilon_i$ in Equation (9) is normally distributed and independent of $Z_i$.

The thresholds in Equation (10) are modelled in the same way as those of the self-response equation, with $\tau_{li}^{j}$ in place of $\tau_{si}^{j}$ and with their own parameters, as shown below.

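$$
\tau_{li}^{0} = -\infty, \quad \tau_{li}^{7} = \infty, \quad \tau_{li}^{1} = X_i \gamma_l^{1} + \upsilon_i, \quad \tau_{li}^{j} = \tau_{li}^{j-1} + \exp\left(X_i \gamma_l^{j}\right), \quad j = 2, 3, 4, 5, 6.
\tag{11}
$$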

The key assumption of the CHOPIT model is that there is response consistency between the ranking of vignettes and the ranking of the self-evaluation questions. This assumption means that individuals use the scale in the same way for the vignettes and the self-response questions and that the threshold parameters in Equations (8) and (11) are equivalent:

$$
\gamma_s^{j} = \gamma_l^{j}, \quad j = 1, \dots, 6.
\tag{12}
$$

As $\gamma_l^{j}$ can be identified separately from the vignette equation and can be matched to $\gamma_s^{j}$ based on the assumption of response consistency, $\beta$ in Equation (6) can be identified. Given that the way the thresholds vary amongst respondents is controlled for by $\gamma_s$, the estimates of $\beta$ in Equation (6) control for differential item functioning. As mentioned above, while this approach does not result in an adjusted risk measure that can be used as an independent variable, it does provide more detailed insights into the characteristics that affect the use of the response scale beyond what is possible using non-parametric and semi-parametric approaches.
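The threshold recursion in Equations (8) and (11), and the mapping from the latent index to a reported category in Equations (7) and (10), can be sketched in Python as follows. This is an illustration of the mechanics only, not an estimation routine: the function names, the single-covariate setup and all parameter values are hypothetical.

```python
import math


def thresholds(x, gammas, upsilon=0.0):
    """Build the six interior thresholds tau^1..tau^6 for one respondent.

    x:       list of covariate values X_i
    gammas:  list of six coefficient lists, gamma^1..gamma^6
    upsilon: individual shift entering the first threshold
    The exp() term makes every increment positive, so the thresholds
    are strictly increasing by construction.
    """
    def dot(a, b):
        return sum(ai * bi for ai, bi in zip(a, b))

    taus = [dot(x, gammas[0]) + upsilon]
    for g in gammas[1:]:
        taus.append(taus[-1] + math.exp(dot(x, g)))
    return taus


def reported_category(latent, taus):
    """Return j such that tau^{j-1} < latent <= tau^j (tau^0=-inf, tau^7=+inf)."""
    return 1 + sum(1 for t in taus if latent > t)


# Hypothetical parameters: one covariate (a constant), unit spacing.
taus_a = thresholds([1.0], [[-2.0]] + [[0.0]] * 5)
# A second respondent whose whole scale is shifted upwards by upsilon.
taus_b = thresholds([1.0], [[-2.0]] + [[0.0]] * 5, upsilon=1.0)

# The same latent risk attitude is reported differently by the two.
report_a = reported_category(0.5, taus_a)
report_b = reported_category(0.5, taus_b)
```

The two respondents share the same latent value but report different categories, which is the scale-use difference that the vignette equations are meant to identify under response consistency.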

4. Data description

1. Survey background

The empirical analysis in this paper uses a representative dataset of immigrants in the Greater Dublin Area, Ireland. The immigrant survey data were collected as part of an EU NORFACE project, and are a representative sample of the immigrant population residing in the Greater Dublin Area. In addition to detailed information on the migrants, the survey also includes tailor-made questions designed to capture individual risk preferences.

The household survey was conducted amongst 1500 immigrants aged 18 years or older, residing in the Greater Dublin Area, who arrived in Ireland between 2000 and six months prior to the interview date and who were not Irish or British citizens6. The survey was conducted between January 2010 and October 2011 by Amarach Research, a reputable survey company with prior experience conducting research surveys in Ireland, under the close supervision of our research team. This time period coincides with the beginning of the recovery from the financial crisis in Ireland. While the crisis is a relevant context in which to study entrepreneurship, the nature of our data does not allow us to compare the risk preferences of migrants who left Ireland before 2010 and the individuals in our dataset. More generally the cross-sectional nature of our data means that we do not know about selection in return migration: in other words, we do not know how migrants that have left Ireland before we conducted our survey compare to the individuals in our dataset.

The sampling framework for the survey was the 2006 Census of Ireland, and the Enumeration Areas (EA) were randomly selected according to probability proportional to size sampling, where size is defined as the total number of non-Irish and non-British individuals.
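Probability proportional to size selection can be illustrated with a short Python sketch. It is a simplification under stated assumptions: areas are drawn with replacement using the standard library, the size figures are invented for illustration, and the function name `select_areas_pps` is hypothetical. The actual selection procedure used for the survey may have differed, for example by drawing without replacement.

```python
import random


def select_areas_pps(sizes, n_draws, seed=None):
    """Draw area indices with probability proportional to size.

    sizes:   number of eligible (non-Irish, non-British) residents per
             enumeration area; illustrative values only
    n_draws: number of areas to draw (with replacement in this sketch)
    """
    rng = random.Random(seed)
    return rng.choices(range(len(sizes)), weights=sizes, k=n_draws)


# Hypothetical sizes for five enumeration areas.
example_sizes = [120, 30, 0, 450, 75]
drawn = select_areas_pps(example_sizes, n_draws=3, seed=42)
```

An area with no eligible residents has zero weight and is never drawn, while larger areas are drawn more often in proportion to their size.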

Fifteen households were selected within each EA using a random route approach with clearly stated rules for selecting households to be interviewed. Within an EA, interviewers visited every fifth house, turning right after each attempt. Instructions on which house to select in specific situations, such as in tower blocks and cul-de-sacs, were given to interviewers. All addresses visited, even when not resulting in an interview, were recorded to ensure that the survey rules were followed correctly. Non-responses, due to no one being at home at the time of the visit, were minimized by interviewers going back to an address up to five times on different days and at different times. While this five time `call back' rule was time consuming, it minimized non-response and ensured that a representative sample of migrants was selected, including single dwelling households, which would otherwise be under represented. When respondents declined to be interviewed, their characteristics (namely gender, approximate age, nationality, type of dwelling) were recorded to allow for the adjustment of sampling weights.

In the presence of more than one migrant living in a household, the survey respondent was selected using a randomization rule. If the randomly selected respondent within the household was not present, an interview with that individual was arranged at a time convenient for the respondent.

The design of the survey questions and data collection strategy were carefully developed in order to ensure that our sample is representative of all migrants, including illegal and non-registered migrants. The randomized procedure for selecting addresses within an EA was useful in capturing a representative selection of migrants, including those that were not registered in official data. The legal status of respondents was not asked for and this was made clear to the respondents before the survey was administered. In addition, it was made clear to respondents that the data would be rendered anonymous and not used for any purpose other than academic research. In order to maximize trust, interviewers were chosen from a broad range of backgrounds and received detailed classroom and in-the-field training, followed up by randomized quality checks.

While the differences in risk attitudes between natives and migrants would make an interesting comparison, surveying the native Irish population was outside the remit of the project. We also had no feasible way to compare the risk preferences of migrants with those of non-migrants in the migrant origin countries. For this reason, we cannot establish whether, or in which ways, the risk preferences of the migrants captured in our data differ from those of non-migrants. We therefore look specifically at the difference in risk attitudes within migrant communities.

The self-evaluation risk measure was administered in order to ensure consistency in the ordering of the vignettes and in the way that questions were asked. The questions were piloted at an early phase of development of the survey to ensure that the vignettes were understood in the same way by all individuals. In addition to asking the questions orally, respondents were given cards with the hypothetical scenario for the questions they were answering so that they could better follow and process all of the information. Great care was taken to ensure that all interviewers asked the questions in a uniform way and did not influence respondents' answers. The objective was to minimize the ways that the survey questions could be interpreted, while allowing respondents to express their true answers.

The order of the vignette questions was randomized. These questions were immediately followed by the self-evaluation question so that the same scale and context would be transferred from the hypothetical vignettes to the self-evaluation question. The vignette questions on risk perceptions along the work dimension are presented in Additional file 1: Figure S1.

Our survey includes a number of variables that are relevant to our empirical analysis. The variable `Years of Schooling' corresponds to the number of years of schooling completed by the respondent both in the home and receiving countries. `Married' is a binary variable taking the value of 1 if the respondents replied that their current marital status was "married or couple living together". The omitted category includes single respondents not living in partnership, separated, divorced or widowed individuals. `Years in Ireland' is the number of years the respondent has lived in Ireland at the time of the survey, which ranges from 1 to 10 years given our eligibility restriction for migrants to be included in our sample. We also include the square of this term to reflect the possible non-linear relationship between time spent in a country and the probability of being self-employed. `Proficient English' is a binary variable taking the value of 1 if respondents rated themselves as having fluent English. English proficiency is likely to be correlated with the respondents' level of integration in the Irish economy and may condition their capacity to become self-employed independently of their level of risk aversion. `Pre-Migration Entrepreneurial Experience' is a binary variable taking the value of 1 if the respondent was ever self-employed before they arrived in Ireland, which may affect the probability of being an entrepreneur in Ireland. `Migrant Enclave' is the share (expressed as a decimal) of the people in our survey that were born in the same country as the respondent. This is intended as a proxy for the size of the expatriate community from the respondents' origin country in Ireland. Industry dummies include transportation, construction, IT, finance, commerce, education, student and health. The omitted category is `other services'. Region dummies include Africa, Asia, Europe New Member States, Rest of EU and South America. The omitted category is North America.
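As an illustration of how the `Migrant Enclave' variable described above could be constructed, the Python sketch below computes, for each respondent, the share of the sample born in the same country. The function name `enclave_share` and the country codes are hypothetical, and sampling weights are ignored for simplicity.

```python
from collections import Counter


def enclave_share(countries_of_birth):
    """Return, per respondent, the sample share born in the same country.

    countries_of_birth: one country label per survey respondent
    """
    counts = Counter(countries_of_birth)
    n = len(countries_of_birth)
    return [counts[country] / n for country in countries_of_birth]


# Hypothetical mini-sample of four respondents.
shares = enclave_share(["PL", "PL", "BR", "NG"])
```

Each respondent from the same origin country receives the same value, so the variable varies only across origin groups.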

2. Stability of risk preferences over time

An important issue in the measurement of risk preferences concerns the stability of risk preferences over time. There has been some debate in the economics and psychology literature regarding the stability of personality traits. While Harrison et al. ([2007]) find that, in a representative sample of the Danish population, individuals on average become less risk averse after the age of 40, Barsky et al. ([1997]) and McCrae ([1993]) find that risk preferences are a stable character trait in adults. McCrae ([1993]) suggests that changes in individual risk measures over time found in other studies are due to measurement error.

Andersen et al. ([2008]) used field experiments to examine the temporal stability of risk preferences over a 17-month period among the Danish population and find that, while there is some variation in risk attitudes over time, there is no general tendency for risk attitudes to increase or decrease over a 17-month period. The results of this work highlight a general tendency for temporal stability of risk preferences in a longitudinal sense. In terms of migrants, Bonin et al. ([2012]) find, however, that adaptation to the host country (Germany in their case) closes the gap in risk proclivity by reducing immigrants' risk aversion. This finding suggests that risk characteristics of migrants are not necessarily stable over time and can be affected by their level of integration in the host country.

Given the cross-sectional nature of our dataset we cannot disentangle differences in risk preferences due to age and cohort effects. However, given that in this article we address the issue of measurement error in capturing risk preferences, we can look more closely at the relationship between age and risk preferences across individuals by comparing our unadjusted with our adjusted risk measures. The left-hand diagram in Figure 1 shows the relationship between age and willingness to take risks for our unadjusted measure. The polynomial smoothed plot shows that risk preferences remain relatively stable until the age of around 65, after which the average willingness to take risks decreases substantially. The right-hand diagram in Figure 1 shows the relationship between age and willingness to take risks using the vignette-adjusted measure. In contrast to the unadjusted measure, the relationship between age and willingness to take risks shows a general increase in the willingness to take risks from around age 30 and shows far less volatility after age 60, relative to the unadjusted measure. The relatively more stable relationship between age and risk preferences for the vignette-adjusted measure supports the suggestion that changes in risk preferences over time may be partly due to measurement error, as suggested by McCrae ([1993]). More specifically, the graphs in Figure 1 show that in terms of self-evaluation questions, scale perception is sensitive to age and that older individuals are not substantially more risk averse in terms of employment than younger individuals, within our sample of migrants.

Figure 1

Age and willingness to take risks in the domain of work: non-adjusted vs. adjusted comparison. Note: The Figure shows the relationship between the self-evaluation measure of willingness to take risks in the domain of work, using the unadjusted measure (left-hand side) and the vignette adjusted measure (right-hand side). A Least Squares Polynomial Smoothing filter was applied, and a 95% confidence interval is shown by the grey shaded area.

3. Descriptive statistics

Tables 1 and 2 provide summary statistics regarding entrepreneurs in our sample. Our sample contains 111 (8% of the total sample) entrepreneurs. Table 1 describes the sectors of employment for self-employed individuals in our sample, showing that the highest proportions of entrepreneurs are in the transportation, construction, and IT sectors.

Table 1 Distribution of migrant entrepreneurs by industry
Table 2 Summary statistics of key variables, by employment type

Table 2 shows the differences in means between entrepreneurs and non-entrepreneurs regarding the most common explanatory variables for entrepreneurial activity found in the literature, namely income, age, years of schooling, and gender. The table shows that while the non-adjusted self-evaluation risk measure suggests no statistically significant differences between entrepreneurs and the rest of the population, the adjusted measure reveals that entrepreneurs are more risk loving at a 6% statistical significance level.

The summary statistics also show that there are statistically significant differences between entrepreneurs and non-entrepreneurs in terms of income, age, and gender variables. Table 2 shows that the average entrepreneur has a higher monthly income (by EUR 335), is three years older, has a similar amount of education, and is more likely to be male than the average non-entrepreneur.

Figures 2 and 3 show the distribution of responses of entrepreneurs and non-entrepreneurs for the non-adjusted and adjusted risk measures. The difference between entrepreneurs and non-entrepreneurs is less pronounced in the unadjusted (Figure 2) than the adjusted (Figure 3) case, suggesting that entrepreneurs routinely rate the hypothetical vignettes in a way different from the rest of the population. The adjusted measure in Figure 3 suggests that entrepreneurs are more likely to be medium-to-high risk loving (4-6 on the scale) and less likely to be risk averse (values 1-3) or extremely risk loving (7 on the scale), relative to the rest of the population.

Figure 2

Non-adjusted risk measure, entrepreneurs and non-entrepreneurs.

Figure 3

Vignette adjusted, entrepreneur and non-entrepreneur comparison.

The summary statistics show that vignette adjustment has a significant effect on the distribution of responses and that (in our sample) more risk loving individuals are more likely to be self-employed when the adjusted measure is used. The next section looks more closely at how the self-evaluation responses were adjusted using the anchoring vignettes.

4. Vignette responses and relative rank analysis

Table 3 provides a breakdown of the adjusted values or vectors after the self-evaluation measure is rescaled using the vignette responses. The first column in Table 3 corresponds to Ci as described in Section 2; the value is the non-parametrically adjusted (or rescaled) self-evaluation measure in the domain of work. In our scale, higher values correspond to a greater willingness to take risks, with the adjusted measure having a minimum value of 1 and a maximum value of 7. When individuals ranked the vignettes consistently (see endnote 7) and without ties, Ci takes a single value. If respondents ranked vignettes inconsistently or ranked at least two vignettes in the same way, a single recoded value cannot be obtained, and Ci is a vector. Therefore, even in the presence of inconsistent ranking we can give a range within which the true value lies (see endnote 8).
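The scalar case of this recoding can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard relative-rank rule for J vignettes on a 1 to 2J+1 scale (here J = 3, giving 1 to 7); the function name and the raw ratings are hypothetical, and the vector-valued case for tied or inconsistently ordered vignettes is deliberately left out, since the exact treatment used in the paper is not reproduced here.

```python
from typing import Optional, Sequence


def scalar_rank(self_score: int, vignettes: Sequence[int]) -> Optional[int]:
    """Recode a self-evaluation relative to J vignette ratings.

    `vignettes` holds the respondent's ratings of the vignettes in their
    intended order (most risk averse first). Returns a value on the
    1..2J+1 scale, or None when the ratings are tied or out of order
    (the vector-valued case, which this sketch does not handle).
    """
    strictly_ordered = all(a < b for a, b in zip(vignettes, vignettes[1:]))
    if not strictly_ordered:
        return None
    below = sum(1 for z in vignettes if z < self_score)   # vignettes rated lower
    equal = sum(1 for z in vignettes if z == self_score)  # vignette rated the same
    return 1 + 2 * below + equal


# Hypothetical raw ratings: self-score 4, vignettes rated 2, 5 and 8.
print(scalar_rank(4, [2, 5, 8]))  # 3: above vignette 1, below vignette 2
print(scalar_rank(4, [3, 3, 8]))  # None: vignettes 1 and 2 are tied
```

Odd values place the respondent strictly between two vignettes (or beyond the extremes), and even values indicate a self-rating equal to one of the vignettes.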

Table 3 Summary of relative rank analysis

The rank analysis in Table 3 suggests that after adjusting the self-evaluation risk measures using the vignettes, 63% of the responses were scalar. This corresponds to a reasonable proportion of correctly ordered vignette responses compared to reports in the literature (see endnote 9). In addition, while there were some inconsistencies or ties in 37% of the cases, in the majority of these situations only two vignettes were tied or mis-ordered. Tied vignettes could reflect a situation in which the hypothetical scenarios are so far from the respondent's own preferences that distinguishing between the vignettes becomes difficult, and are not necessarily the result of a misunderstanding. In total only 9 individuals (0.6%) in the sample mis-ordered all three of the vignettes, as shown by the {1 to 7} category in Table 3. The high proportion of consistently and nearly consistently ranked vignettes is reassuring as it suggests that the vignettes were correctly understood by the majority of respondents.

5. Empirical results

This section presents the results of the empirical analysis using the non-adjusted, semi-parametric, and parametric models. We also discuss the results of the parametric CHOPIT model, which allows us to see how various groups within our sample interpreted the self-evaluation scale.

1. Probit estimation results: unadjusted risk measure

As described above, the vignette-adjusted variable can be created using either non-parametric or semi-parametric approaches. As a benchmark, we start by analyzing the non-adjusted self-evaluation measure of risk aversion, as shown in Table 4. This measure corresponds to the value that the respondents reported on their self-evaluation without vignette adjustment. Table 4 presents marginal effects of the probit specification, and shows that there is no significant relationship between the unadjusted risk aversion measure and the probability of being an entrepreneur. The simple probit regression (Column 1) shows that the relationship between entrepreneurship and the unadjusted categories medium risk-loving and high risk-loving (with low risk-loving as the omitted category) is not statistically significant. The risk measure variable remains statistically insignificant even after individual characteristics (Column 2) and other potential explanatory factors (Column 3) are accounted for.
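For readers less familiar with probit marginal effects, the following Python sketch shows how the effect of a dummy variable (such as a medium or high risk category) translates into a change in probability. It assumes a discrete-change calculation at a fixed baseline index; the paper does not state here whether effects are evaluated at the mean or averaged over the sample, and the numerical values below are hypothetical, not estimates from Table 4.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def dummy_marginal_effect(baseline_index: float, coefficient: float) -> float:
    """Change in the predicted probability when a dummy switches from 0 to 1.

    `baseline_index` is the probit index x'b with the dummy set to 0 and
    `coefficient` is the probit coefficient on that dummy.
    """
    return normal_cdf(baseline_index + coefficient) - normal_cdf(baseline_index)


# Hypothetical values chosen for illustration only.
baseline_index = -1.4
coefficient = 0.4
effect = dummy_marginal_effect(baseline_index, coefficient)
print(f"{100 * effect:.1f} percentage points")
```

Because the normal density is not constant, the same coefficient implies a different change in probability at a different baseline index, which is why marginal effects rather than raw coefficients are reported.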

Table 4 Probit results: unadjusted risk measure

Column (3) in Table 4 also shows that being female, having pre-migration experience as an entrepreneur, and 'migrant enclave' are statistically significant explanatory variables of the probability of becoming an entrepreneur. The significant positive effect of pre-migration entrepreneurial experience is as expected, since individuals who have been self-employed previously probably have more relevant skills for self-employment than those with no previous experience. The variable could also reflect individuals who moved to Ireland with the express intent of starting a business. The 'migrant enclave' variable is also positive and statistically significant, indicating that having a larger population of individuals from the same country living in Dublin increases the probability of a migrant becoming an entrepreneur. This could be due either to better networking possibilities within Dublin or to a larger market for culturally specific goods and services, an example being shops selling specialty goods from the home region. Perhaps unexpectedly, the date of arrival in Ireland does not significantly affect the probability of a migrant becoming self-employed, which can perhaps be taken as evidence that self-employment is not strongly used as a way to escape unemployment on arrival in the hope of obtaining formal paid employment afterwards.

2. Probit estimation results: non-parametric adjusted measure

Table 5 shows the marginal effects of the non-parametrically adjusted measure of risk aversion, again from a probit specification. This non-parametric approach excludes all inconsistently ranked vignettes, as can be seen from the lower number of observations in Table 5. Column (1) in Table 5 shows that, using this vignette adjustment specification, both the mediumrisk and highrisk variables (relative to the low risk loving omitted category) have a positive and significant (at the 1% level) effect on the probability of being self-employed. Having a medium level of willingness to take risks increases the probability of an individual being an entrepreneur by 9 percentage points, relative to the omitted category, and having a high level of willingness to take risks increases the probability of being an entrepreneur by 10 percentage points. The magnitude of the marginal effects drops slightly to 8 percentage points for both medium and high risk after controls are added, and they remain statistically significant in all of the specifications. It is also interesting to note that women are less likely to be entrepreneurs by 6 percentage points. Having previous entrepreneurial experience in the country of origin is correlated with an increase in the probability of being self-employed in Ireland of around 10 percentage points. These results are all statistically significant at the 1% level.

Table 5 Probit results: non-parametrically adjusted risk measure

Comparing the results in Table 5 to those displayed in Table 4 shows that the risk aversion measures become significant following the non-parametric adjustment of the self-evaluation risk measure. This finding suggests a significant positive relationship between being either medium or high risk loving (relative to the low risk loving category) and being self-employed, i.e. a positive relationship between being risk loving (as opposed to risk averse) and the probability of being an entrepreneur. It also suggests that the vignette-adjusted measure reflects the actual risk preferences of the respondents more closely by counteracting scale perception bias.

3. Estimation results: semi-parametrically adjusted measure

Table 6 shows probit results for the probability of being self-employed using the semi-parametrically adjusted risk measure in the domain of work. For this measure, inconsistently ordered vignettes are allocated to the value with the highest probability of being true (amongst the vector values) based on the choices made by other individuals with similar characteristics, as described in Section 3. Column (1) of Table 6 shows that the marginal effects of the risk measure on the probability of being self-employed are statistically significant for both the mediumrisk and highrisk variables (with low risk loving as the omitted category). The estimates suggest that having a medium level of willingness to take risks increases the probability of being self-employed by 8 percentage points, and having a high willingness to take risks increases the probability of being self-employed by 10 percentage points, relative to the omitted category. Column (2) in Table 6 includes controls for basic characteristics used in the literature and the migration-specific variables. The results suggest that there is a significant relationship between risk preferences and entrepreneurship even after controlling for all of the variables included in our specification.
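The allocation step can be illustrated with a short Python sketch. It assumes that a set of predicted probabilities over the seven adjusted categories is already available for a respondent (obtained from a model fitted on individuals with similar characteristics, which is not shown); the function name and all numbers are hypothetical.

```python
from typing import Dict, Sequence


def allocate(candidates: Sequence[int], probabilities: Dict[int, float]) -> int:
    """Pick the candidate category with the highest predicted probability.

    `candidates` is the vector of admissible values for a respondent whose
    vignettes were tied or inconsistently ordered; `probabilities` maps each
    category on the 1..7 scale to its predicted probability.
    """
    if not candidates:
        raise ValueError("candidates must not be empty")
    # On equal probabilities, max() keeps the first candidate listed.
    return max(candidates, key=lambda c: probabilities.get(c, 0.0))


# Hypothetical predicted probabilities for one respondent.
predicted = {1: 0.05, 2: 0.10, 3: 0.20, 4: 0.30, 5: 0.20, 6: 0.10, 7: 0.05}
print(allocate([2, 3], predicted))     # 3
print(allocate([5, 6, 7], predicted))  # 5
```

The choice is restricted to the admissible vector, so a respondent is never assigned a value that their own vignette ratings rule out.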

Table 6 Probit results: semi-parametrically adjusted risk measure

With the inclusion of all controls, Column (3) in Table 6 suggests that having a medium level of risk increases the probability of being self-employed by 6 percentage points, and having a high level of risk increases the probability of being self-employed by 7 percentage points. In this specification, the 'migrant enclave' variable becomes statistically significant. The enclave variable is a measure of the concentration of individuals with the same nationality living in Dublin, and a positive coefficient could therefore reflect a larger market for culturally specific goods and services or the benefits of larger networks when setting up a business as an entrepreneur.

4. How vignettes affect the risk measure across different variables: results of the CHOPIT model

Table 7 shows the results of the CHOPIT model, in which the risk measure is the dependent variable. For comparison, Column (1) of Table 7 also presents the results of the estimation using the ordered probit model.

Table 7 Ordered probit and compound hierarchical probit (CHOPIT) model

Table 7 shows that while the non-adjusted ordered probit model in Column (1) suggests no significant relationship between risk aversion and entrepreneurship, the vignette-adjusted regression in Column (2) shows a positive and statistically significant relationship. The Table suggests a positive relationship between willingness to take risks and entrepreneurship in our sample of migrants. In other words, while the self-reported level of risk of entrepreneurs is not statistically different from the rest of the population, their actual level of risk aversion is significantly lower because they interpret the scale in a different way.

The difference in statistical significance for the entrepreneur variable between Columns (1) and (2) in Table 7 is due to variation in scale interpretation. The vignette threshold values τ provide more information regarding how entrepreneurs perceive the self-evaluation scale. The results in Column (2) of Table 7 show that entrepreneurs regard the most risk averse values of the scale as being more risk loving than do non-entrepreneurs (positive sign on τ1), while considering the more risk loving values as not being as risk loving as the rest of the population (negative sign on τ2, τ3, τ5 and τ6). The inflation of low values on the scale and undervaluing of higher values by entrepreneurs has essentially compressed the actual unobserved scale for this subgroup. In other words, the entrepreneurs' valuation of the vignettes results in a narrower range of vignette-adjusted values than the non-adjusted self-evaluation measure would suggest. An explanation for this scale compression could be that self-employed individuals undervalue risky employment decisions due to their own willingness to take such risks, while at the same time recognizing that the risk element in seemingly risk-free employment decisions has to be considered, a point that could be missed by non-entrepreneurs.
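The role of group-specific thresholds can be made concrete with a small Python sketch of the general ordered-response mechanism: a respondent reports the category whose cut-offs bracket their latent value, so two groups with the same latent willingness to take risks can report different numbers. The cut-off values below are hypothetical and are not taken from Table 7; the two groups are generic and do not reproduce the signs estimated for entrepreneurs.

```python
from bisect import bisect_right
from typing import Sequence


def reported_category(latent: float, thresholds: Sequence[float]) -> int:
    """Map a latent value to a 1-based category given ascending cut-offs.

    With K cut-offs there are K + 1 categories; a latent value exactly
    equal to a cut-off is assigned to the higher category.
    """
    return bisect_right(thresholds, latent) + 1


# Hypothetical cut-offs for two groups that use the 7-point scale differently.
group_a = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0]
group_b = [-1.2, -1.0, -0.6, 0.0, 0.3, 0.7]

latent = 0.8  # identical underlying willingness to take risks
print(reported_category(latent, group_a))  # 6
print(reported_category(latent, group_b))  # 7
```

A model that ignores the difference in cut-offs would read the gap between the two reported values as a difference in preferences, which is the bias the vignette adjustment is meant to remove.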

Another noteworthy result of the CHOPIT model in Table 7 is related to the four variables that are statistically significant for the Ordered Probit (Column 1) but not for the vignette-adjusted CHOPIT model (Column 2). The dummy variables for born in Africa, born in Australia, and gender are all statistically significant when the unadjusted measure is used, but lose their statistical significance after vignette adjustment. This result suggests that while the scale perception of these groups is statistically different from the rest of the population, their actual risk preferences in terms of employment are not. While the unadjusted measure suggests that being female is associated with being more risk averse (Table 7, Column 1), the 'actual' vignette-adjusted measure (Table 7, Column 2) points to there being no statistically significant relationship between being female and actual risk preferences with respect to employment. This result suggests that while women are more likely to rate themselves as being more risk averse in terms of employment, they are also more likely than men to consider the hypothetical risk loving individuals to be more risk averse. Therefore, while a difference in perception of risk exists between the genders, actual risk preferences in terms of employment do not appear significantly different between men and women. Furthermore, while the unadjusted measure suggests that individuals born in Africa and Australia are more risk loving, the adjusted results suggest that there is no statistically significant difference in the risk preferences of individuals from these regions.

More detailed information on the cut-off values is provided in Table 8. The table shows the first, third and fifth cut-offs and gives an indication of how the scale is interpreted by individuals from different regions of birth and along different variables. Looking at cut-off values τ3 and τ5 in Columns (2) and (3) in Table 8, one can see that the values are positive for Africa and Australia and negative for South America. This suggests that migrants from Africa and Australia think of these values as being more risk loving than the rest of the population, while individuals from South America see the higher values as being less risk loving than the rest of the population. The female variable in Columns (2) and (3) in Table 8 is also negative, suggesting that female respondents undervalue the more risk loving vignettes. This undervaluing of the more risk loving individuals suggests that while female respondents tend to rate themselves lower on the self-evaluation scale, they also rate the most risk-loving vignettes as less risk-loving than male respondents do.

Table 8 CHOPIT Model: cut off values

The results of the CHOPIT model suggest that for certain groups the perceived difference in risk preferences is actually due to differences in scale interpretation rather than to actual differences in risk preferences. Conversely, while the unadjusted measure suggested that entrepreneurs do not differ in their risk preferences from the rest of the population, the 'actual' vignette-adjusted level suggests that entrepreneurs are, in fact, more risk loving than the rest of the population.

6. Summary and conclusion

This paper investigates the relationship between risk aversion and entrepreneurship, looking specifically at a migrant population. The main challenge in investigating the relationship between risk aversion and entrepreneurship amongst migrants is to ensure that measures of risk preferences are comparable across individuals. This paper develops a novel vignette-adjusted self-evaluation risk measure in order to counter the problem of the different interpretation of scales amongst individuals in our sample, and tests its validity using a tailor-made survey of immigrants in the Greater Dublin Area, Ireland.

The relationship between risk aversion and entrepreneurship is tested and the results suggest a positive relationship between the willingness to take risks and being an entrepreneur, but only after the risk aversion measure is adjusted for Differential Item Functioning using a series of vignettes. Using the unadjusted measure of risk aversion there is no statistically significant relationship between risk aversion and entrepreneurship. Using adjusted measures, our results suggest that having a medium preference for risk relative to a low preference for risk increases the probability of a migrant becoming an entrepreneur by between 5.6 and 7.9 percentage points, while being a high risk individual (relative to a low risk) increases the probability of becoming an entrepreneur by between 7.0 and 7.3 percentage points (both results being statistically significant). These results confirm our prediction that in heterogeneous populations self-evaluation measures can suffer from differential item functioning and that a vignette adjusted measure can counter bias caused by heterogeneous interpretation of the self-evaluation scale.

The difference in results between the vignette-adjusted and non-adjusted measures suggests that while entrepreneurs' stated willingness to take risks was similar to the rest of the population, their actual level of risk aversion was lower. In addition to different scale interpretation for entrepreneurs, we also find statistically significant differences between individuals from different regions of the world, and between different genders.

In this case the vignettes were crucial in obtaining a measure that reflects actual preferences more closely. Given the difference between the adjusted and unadjusted results, some of the variation in results reported in the broader empirical literature on risk and entrepreneurship could be related to measurement error. This is likely to be the case when the population under examination is highly heterogeneous.

The novel addition of vignette-adjustment to the self-evaluation measure improves the accuracy and reliability of results considerably, with a relatively small additional cost to the survey designer. The addition of vignettes is especially valuable when the sample is made up of individuals from a variety of cultures, as uses of the self-evaluation scale are likely to differ substantially, and biases arising from differential item functioning will be magnified.

In summary, this paper suggests that a preference for risk is significantly positively correlated with entrepreneurship amongst migrants, and that there is heterogeneity in migrant groups regarding unobservable characteristics. Predicting which migrants are likely to start a new business in the host economy therefore requires one to consider unobservable characteristics, in addition to observable variables. While unobservable characteristics are by definition difficult to quantify, our research provides an improved methodology for measuring domain specific individual risk preferences in heterogeneous populations.

Endnotes

1. In related research, Batista and Umblijs ([2014]) show that more risk averse immigrants tend to send more remittances abroad, and Batista and Narciso ([2013]) conduct a randomized field experiment to find that remittances also increase with communication flows between migrants and their network abroad.

2. We define entrepreneurs as individuals who have been self-employed at any time during their residence in Ireland.

3. A number of articles have highlighted how differences in the interpretation of scales across countries can introduce bias in international studies. See for example Le ([2009]); Choi et al. ([2009]) and Culpepper and Zimmerman ([2006]).

4. For research on gender impacts, see Catania et al. ([1996]) and for nationality effects see Webster ([1996]).

5. There is evidence of heterogeneity in the range used in scales that is independent of the question being asked. For example, see Le ([2009]); Culpepper and Zimmerman ([2006]).

6. Eligibility requirements were set to maximize the probability that migrants still retained contacts outside of Ireland (hence the 2000 arrival threshold) but were already minimally established in Ireland (for six months at least) so that contacts with their networks abroad could provide useful information. British citizens were excluded, given the close historical ties between Ireland and the UK.

7. By consistent we mean that individuals ordered the vignettes as they were designed, with the most risk averse hypothetical individual being given the lowest score, etc. The most common ranking was 1, 2, 3, which reflects the order that was intended.

8. For example, if an individual ties vignettes 1 and 2, and considers himself less risk loving than vignette 3 but more risk loving than the tied vignettes 1 and 2, the adjusted value will lie between the values of 2 and 6. This is because we know that the value cannot be 1, as he has ranked himself above vignettes 1 and 2; at the same time he cannot be more risk loving than 6 because he is more risk averse than vignette 3. Therefore in this example the individual will have vector {2, 3, 4, 5, 6} for Ci.

9. The percentage of correctly ranked vignettes varies between studies. For example, Hopkins and King ([2010]) rank 74% of vignettes correctly when looking at self-reported vignette-adjusted differences in political efficacy between China and Mexico, whereas Bratton ([2010]) has only 37% of consistent and non-tied responses when investigating perceptions of democracy in Africa.

Additional file