Abstract
Understanding the worldwide drivers of qualified entrepreneurship is a key issue in economic policy design. To help policy decisions exert their intended impact, we aim to cluster a wide range of countries on the basis of their levels and trends in self-employment productivity using a finite mixture model applied to a new large dataset of 121 countries covering the period of 1991–2019. Our results point to three groups of high-, medium-, and low-productivity means and tendencies, the geographical distribution of which suggests that they can be reinterpreted using the three stages of economic development, namely, innovation-, efficiency-, and factor-driven economies. Notably, we find that widespread digitalization and low unemployment enhance the probability of transitioning into a highly productive cluster. However, we failed to find that industry weight or employment protection legislation strictness serve as determinants in the transition between groups. Suggestive rationales for these results and implications for the entrepreneurship policy agenda are also provided.
1 Introduction
Policymakers tend to encourage entrepreneurial activity because it is viewed as a key driver of economic growth, job creation, and innovation. Consequently, they implement portfolios of policies to promote entrepreneurship/self-employment and to support small and medium firms as a solution to weak economic performance and deficient job creation. However, as the seminal work of Blanchflower (2004) pointed out, the level of self-employment itself does not guarantee economic growth. In fact, as Poschke (2013) noted, both developed and developing countries sometimes show the same self-employment rates despite having different growth patterns.
Among others, Shane (2009) and Congregado et al. (2010) warn that encouraging more people to become entrepreneurs does not necessarily lead to economic development. The strong negative cross-country association between self-employment and the level of income per capita in both less-developed and developing countries and the mixed evidence regarding the impact of entrepreneurship on growth at the macro level indicate that something is wrong in the usual linkage between the size of the aggregate self-employment sector and economic growth, as the works of Pietrobelli et al. (2004), Wennekers et al. (2010), Arin et al. (2015), and Rodríguez-Santiago (2022) have found.
Without attempting to be exhaustive, we note that Maloney (2004), Acs (2006), Poschke (2018, 2019), and Allub and Erosa (2019) pointed out that self-employment exhibits substantial heterogeneity and that cross-national differences could be behind this apparent puzzle; they suggested examining the relationship between qualified self-employment (rather than the aggregate rate) and economic development. In this context, Stam (2015) and Stam and Van de Ven (2021) investigated the determinants of optimally productive entrepreneurship and the pillars of the entrepreneurial ecosystem, which are particularly important for devising an effective national competitiveness strategy.
This body of literature encompasses four main categories of research focusing on the determinants of self-employment rates at the macroeconomic level. Firstly, studies such as those by Acs et al. (1994) delve into the influence of macroeconomic factors like capital per worker and industry composition. Secondly, research by Pietrobelli et al. (2004), Arin et al. (2015), and Rodríguez-Santiago (2022) examines the impact of income per worker on self-employment rates, highlighting the adverse effects of macroeconomic instability on entrepreneurial activities. Thirdly, investigations by Blanchflower (2000), Centeno (2000), Robson (2003), and Torrini (2005) scrutinize labor market dynamics and regulations, including employment protection legislation, while others such as Fölster (2002), Anokhin and Schulze (2009), Djankov et al. (2010), Estrin et al. (2012), Belitski et al. (2016), and Dutta and Sobel (2016) focus on the role of corruption and taxation. Lastly, various articles, including those by Sobel (2008), Acs et al. (2008), Estrin et al. (2012), Bjørnskov and Foss (2016), and Urbano et al. (2020), explore the influence of institutions and institutional quality on entrepreneurship.
For these reasons, our paper follows this literature by focusing on the analysis of self-employment productivity (output per self-employed worker) with a twofold purpose. Our first purpose is to cluster countries worldwide to identify groups with some degree of similarity regarding their level and trend in self-employment productivity. With this classification, policymakers could examine cluster membership to determine whether their countries have performed on par with other countries in similar economic circumstances and to provide warning of unfulfilled expectations.
Our second purpose is to determine what drives country memberships and, as a result, what characteristics are shared by countries that make them similar to their cluster and different from other clusters in terms of self-employment productivity. This analysis may inform country policymakers about which policy variables have the greatest influence on country-level self-employment performance and about which strategies would be appropriate to promote movements towards more productive clusters.
In our study, we use finite mixture models to analyze the varied landscape of self-employment productivity across different countries. These models offer numerous advantages over alternative approaches, particularly in their capacity to provide inference on individual classifications and overall clustering. The following paragraphs lay out a robust foundation for comprehending the chosen methodology.
To capture the worldwide heterogeneity in self-employment productivity, we rely on finite mixture models. These model-based classification methods exhibit several advantages over their alternatives, which classify according to similarities. First, the clustering in finite mixture models works on a statistical basis, which facilitates inference on the estimates for individual classifications and for the clustering as a whole. Second, in the context of finite mixture models, a number of statistical criteria have been developed to objectively assess the optimal number of clusters. Third, exogenous explanatory variables explaining cluster formation can be easily and explicitly incorporated in the clustering procedure. Thus, model-based classification facilitates designing guidelines for policy recommendations based on an analysis of outcomes.
In finite mixture models, each cluster is assumed to have its own density. In our approach, we assume that this density is determined by both the group-dependent level of self-employment productivity and its group-dependent trend, or long-term direction, over the sample period. Thus, the model characterizes the homogeneity within each cluster not only by level of self-employment, which could be viewed as either high, moderate, or low, but also by the intensity in the overall evolution of the data path.
In addition, the model defines the probability that a country belongs to a given group. In this case, we consider a logit-type structure that depends on the influence of unit-specific exogenous variables on cluster membership. According to the literature, we postulate that these exogenous variables characterizing the ex-ante likelihood of membership in a given cluster are of four types. First, following Acs (2006), who considers that improvements in information technologies such as telecommunications may increase the returns to entrepreneurship, we use the World Bank's Digital Adoption Index (DAI).
Second, following Fairlie and Fossen (2020) and Cowling and Wooden (2021), we also consider the country-specific labor market situation as measured by the unemployment rate to be an influencing factor on whether a country is affiliated with a certain cluster of self-employment productivity. Third, in line with Blanchflower and Shadforth (2007), who examine whether self-employment was stimulated in the United Kingdom through changes in the industrial sector, we include the relative weight of the industrial sector. Finally, we follow Centeno (2000) and Robson (2003), who examined the interaction of self-employment and labor market rigidity, to propose the Labor Market Rigidity Index (LAMRIG) as an additional exogenous determinant of cluster membership.
In the empirical analysis, we compile a new large internationally comparable database of 121 countries covering the period from 1991 to 2019 and use the finite mixture model to obtain the following results. First, our data-driven approach points to three distinct groups of countries. The first group characterizes the countries with the highest productivity level and the steepest productivity trend. The second group comprises countries with a medium level of productivity and a flatter trend than that in the first group. The third group characterizes the countries with the lowest productivity level and the least pronounced trend in self-employment productivity.
The main results and contributions of the paper can be summarized in the following statements: firstly, referring to the geographical distribution of groups; secondly, regarding the disparities in self-employment productivity across these groups; and thirdly, concerning the pivotal elements influencing transitions between groups.
In accordance with the resulting geographical distribution of groups, we followed the categories of countries suggested by Porter et al. (2002) to categorize the groups according to their levels of competitiveness across the stages of economic development. The first cluster tends to include most innovation-driven economies with higher wages, level of innovation and associated standards of living. The second cluster includes countries in the efficiency-driven stage, which require the development of more efficient production processes and the ability to harness the benefits of existing technologies. Countries in the factor-driven stage predominate in the third cluster, which is composed of the least developed countries, where subsistence agriculture, extraction businesses, and unskilled labor are prevalent.
Second, despite the significant differences in the levels of self-employment productivity across these three groups, our results do not suggest that they will eventually converge. The reason is that the productivity of countries in the lower productive groups tends to grow over the sample period at a slower rate than that in the higher productive groups. Therefore, the trajectory of the least productive countries will not tend to catch up unless policymakers take measures to close the productivity gap. Thus, doing nothing is the best guarantee of failure in promoting the convergence process.
Third, our research identifies two key elements in the national entrepreneurial ecosystem that can enable less productive countries to reverse this tendency by determining the key factors that influence group membership. In line with the intersections of digital technologies and entrepreneurship that have been documented by Jafari-Sadeghi et al. (2021), our results show that designing a nuanced digital strategy with policies tailored to promote adoption and diffusion of digital technologies is especially important in facilitating the transition to the innovation-driven group.
In addition, we find that unemployment is a barrier to moving countries from the efficiency-driven to the innovation-driven group. In line with Thurik et al. (2008), who detected a dynamic interrelationship between self-employment and unemployment rates, we find that the labor market dynamic is also related to self-employment productivity. We postulate that structural unemployment tends to favor the entry into self-employment of marginal entrepreneurs, who erode average productivity. Thus, active labor market policies oriented toward stimulating the search for salaried job offers discourage entry into self-employment among the less productive unemployed and thus appear advisable as a catching-up strategy.
Interestingly, we failed to find that industrialization intensity or the rigidity of employment protection legislation were key elements in transitioning to more productive groups. The former is in line with the findings of Acs and Naudé (2013), who do not see industrial policies as merely functional policies without consideration of firm or entrepreneurial specifics. The latter agrees with Robson (2003), Torrini (2005), and Kanniainen and Vesala (2005), who found that employment protection legislation restrictiveness had little impact on aggregate self-employment.
The outline of the rest of the paper is as follows. Section 2 presents the model and provides an overview of the key statistical elements used to understand the empirical results. Section 3 describes the database and presents and discusses the results. Section 4 concludes and provides policy implications and further avenues of future research.
2 Data and methodology
2.1 Data description
In this paper, we focus on the productivity of self-employment as a proxy for the quality of entrepreneurship. In particular, we consider GDP per self-employed person, i.e., the output per self-employed worker, as our measure of productivity. To compare productivity levels across countries, GDP is converted to international dollars using purchasing power parity rates, which account for the differences in relative prices among countries.^{Footnote 1}
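As a minimal illustration of the measure just defined, the sketch below divides PPP-converted GDP by the number of self-employed workers. The function name and the figures are hypothetical and are not taken from the dataset.

```python
def self_employment_productivity(gdp_ppp, self_employed):
    """Output per self-employed worker: PPP-converted GDP divided by headcount."""
    if self_employed <= 0:
        raise ValueError("self_employed must be positive")
    return gdp_ppp / self_employed


# Hypothetical figures, for illustration only (not taken from the dataset):
# GDP of 500 billion international dollars and 2.5 million self-employed workers.
example = self_employment_productivity(gdp_ppp=500e9, self_employed=2.5e6)
```

In the paper's setting, this ratio would be computed for every country and year to build the panel of productivity series.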
Self-employed workers are those workers who, working on their own account, with one or a few partners or in a cooperative, hold the type of jobs defined as self-employment jobs. These data are taken from the International Labor Office (ILOSTAT) database. Self-employed workers include the following four subcategories: employers, own-account workers, members of producers' cooperatives, and contributing family workers.
The dataset of covariates that are used in the logistic prior to classify each country into a specific group is created with four variables meant to capture group-specific differences. The first structural variable captures the labor market situation by using the average unemployment rate provided by the ILOSTAT database. This is measured as the percentage of the total labor force that is without work but has been seeking work in a recent past period and is currently available to work.
The second structural variable, which reflects the level of industrialization, is the average of industry value added as a percentage of GDP, including the ISIC divisions 05–43. These data are taken from World Bank national accounts and OECD National Accounts Statistics. The third structural variable is meant to capture the rigidity of employment protection legislation. For this purpose, we use the average of the labor market legislation rigidity index, as detailed by Campos and Nugent (2018).
Finally, we use a fourth structural variable to measure the level of digitalization. In particular, we use the Digital Adoption Index provided by the World Bank, which is a composite index measuring the spread of digital technologies in a country across three dimensions of the economy, namely, those of people, government, and business. To facilitate interpretation, the data have been normalized so that countries with values over 0 will be above the sample average, and vice versa.
Estimation in finite mixture models requires handling balanced panels. Therefore, our effective dataset is composed of annual self-employment and the four country-level covariates for a large set of 121 countries, spanning from 1991 to 2019. The list of countries, their codes, average GDP per self-employed worker for the period, and the values of covariates can be found in Appendix Table 4.
2.2 Model specification
In this paper, we investigate pooling within a time series panel using a finite mixture of an unspecified number of separate distributions.^{Footnote 2} For this purpose, let \(y=\{{y}_{it}\}\), \(i=1,...,N; t=1,...,T\) be a panel, where \({y}_{it}\) is self-employment productivity and \(i\) and \(t\) index country and year, respectively. In addition, we assume that the time series arise from \(K\) hidden groups in such a way that all the time series within a certain group are characterized by the same econometric model and depend on the same set of parameters, which are heterogeneous across groups.
The approach used is based on formulating a time series model for each univariate time series \({y}_{i} =\left\{{y}_{i1},\dots ,{y}_{iT}\right\}\) in terms of the group-specific sampling density. For group \(k\), the density is \(p\left({y}_{i}|{\vartheta }_{k}\right)\), and the unknown group-specific parameters \(\vartheta =\left\{{\vartheta }_{1},\dots ,{\vartheta }_{K}\right\}\) take values in a parameter space \(\theta\). In this case, the same model is valid for all the time series within a given group, although with different parameters across groups. Furthermore, we also assume that the time series are independent within each cluster.
In this context, it is convenient to introduce a latent group indicator \({S}_{i}\), which takes a value out of the discrete set \(\{1,...,K\}\), indicating to which group the time series belongs; that is, \({S}_{i}=k\) indicates that \({y}_{i}\) belongs to group \(k\). We assume that \(S=({S}_{1},\dots ,{S}_{N})\) are a priori independent. Thus, knowing \({S}_{i}\) is equivalent to knowing the group-specific parameters and the density \(p\left({y}_{i}|{\vartheta }_{{S}_{i}}\right)\).
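The joint sampling distribution reads as
\[p\left(y|S,{\vartheta }_{1},\dots ,{\vartheta }_{K}\right)=\prod_{i=1}^{N}p\left({y}_{i}|{\vartheta }_{{S}_{i}}\right)=\prod_{k=1}^{K}\prod_{i:{S}_{i}=k}p\left({y}_{i}|{\vartheta }_{k}\right). \tag{1}\]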
However, an important issue in this specification is that neither the number of clusters nor the group membership is known a priori. Instead, we use model-based clustering techniques based on Bayesian classification rules to determine \(K\) and to estimate the group indicator \({S}_{i}\) along with the group-specific parameters \({\vartheta }_{1},...,{\vartheta }_{K}\) from the data.
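To overcome the issue that group membership is unknown in practice, we assume that each time series of self-employment productivity is taken to be a realization of the mixture probability density function of \(K\) separate distributions
\[p\left({y}_{i}|{Z}_{i},\vartheta ,\gamma \right)=\sum_{k=1}^{K}Pr\left({S}_{i}=k|{Z}_{i},\gamma \right)p\left({y}_{i}|{\vartheta }_{k}\right), \tag{2}\]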
where the mixing proportion \(Pr\left({S}_{i}=k |{Z}_{i},\gamma \right)\) is the probability that \({y}_{i}\) belongs to group \(k\). Thus, the probabilities of group membership are posited to rely on vectors of country-specific variables, \({Z}_{i}\), each comprising \(g\) variables, and on the parameter set \(\gamma =({\gamma }_{1},\dots ,{\gamma }_{K})\), where each \({\gamma }_{k}\) represents a vector of \(g\) elements, with \(k=1,\dots ,K\). For clustering purposes, each component in mixture Model (2) corresponds to a cluster.
To complete the model specification, given the time series of self-employment productivity of country \(i\), \({y}_{i}\), that belongs to a certain group \(k\), \({S}_{i}=k\), we consider that the expected value of each time series of this group is fully characterized by a group-dependent mean and a group-dependent trend. Thus, we model the time-series dynamics of self-employment productivity as
\[{y}_{it}={\mu }^{{S}_{i}}+{\alpha }^{{S}_{i}}t+{\varepsilon }_{it}, \tag{3}\]
where the error term is conditionally heteroscedastic, \({\varepsilon }_{it} \sim N\left(0,\frac{{\sigma }^{2}}{{\lambda }_{i}}\right),\) and \({S}_{i}=1,...,K\).^{Footnote 3} For each group, we consider \({\vartheta }_{k}=({\mu }^{k}, {\alpha }^{k},{\sigma }^{2})\), and for each country, we consider a series-specific variance weight, \({\lambda }_{i}\), where \(\lambda =({\lambda }_{1},\dots ,{\lambda }_{N})\) collects all of the weights. For a given country \(i\) belonging to cluster \({S}_{i}=k\), parameter \({\mu }^{{S}_{i}}\) represents the base value of self-employment productivity that characterizes the cluster, while parameter \({\alpha }^{{S}_{i}}\) provides the intensity of the increasing or decreasing behavior over time of the series belonging to the cluster.
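To make the group-specific density concrete, the following sketch simulates one series from a linear trend with Gaussian errors of variance \({\sigma }^{2}/{\lambda }_{i}\) and evaluates its log-density under two candidate groups. It is only an illustration under stated assumptions: the parameter values are hypothetical, the time index is taken to run from 1 to \(T\), and the function names are ours.

```python
import math
import random


def simulate_series(mu, alpha, sigma2, lam, T, rng):
    """Draw y_t = mu + alpha * t + eps_t with eps_t ~ N(0, sigma2 / lam), t = 1..T."""
    sd = math.sqrt(sigma2 / lam)
    return [mu + alpha * t + rng.gauss(0.0, sd) for t in range(1, T + 1)]


def log_density(y, mu, alpha, sigma2, lam):
    """Gaussian log-density of one series under group parameters (mu, alpha, sigma2)."""
    var = sigma2 / lam
    total = 0.0
    for t, y_t in enumerate(y, start=1):
        resid = y_t - mu - alpha * t
        total += -0.5 * (math.log(2.0 * math.pi * var) + resid ** 2 / var)
    return total


rng = random.Random(0)
# T = 29 matches the annual span 1991-2019; all parameter values are hypothetical.
y = simulate_series(mu=10.0, alpha=0.5, sigma2=0.04, lam=1.0, T=29, rng=rng)
ll_true = log_density(y, 10.0, 0.5, 0.04, 1.0)   # generating group
ll_other = log_density(y, 2.0, 0.0, 0.04, 1.0)   # flat, low-level alternative
```

A series generated by a high-level, steep-trend group receives a much higher log-density under that group than under a flat, low-level one, which is the information the classification step exploits.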
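In addition, we follow Frühwirth-Schnatter and Kaufmann (2008), Kaufmann (2010), and Hamilton and Owyang (2012), and consider a multinomial logit model to include prior information on a particular series in the estimation of the group probability:
\[Pr\left({S}_{i}=k|{Z}_{i},\gamma \right)=\frac{\mathrm{exp}\left({Z{'}}_{i}{\gamma }_{k}\right)}{\sum_{l=1}^{K}\mathrm{exp}\left({Z{'}}_{i}{\gamma }_{l}\right)},\quad k=1,\dots ,K, \tag{4}\]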
where the first group is the baseline group, and we set \({\gamma }_{1}=0\). We assume that \(\gamma =(0,{\gamma }_{2},\dots ,{\gamma }_{K})\) are independent of the other parameters of the model.
The vector \({Z}_{i}\), for \(i=1,\dots ,N\), includes the \(g\) country-specific average features of the labor market structure that determine the classification of the self-employment productivity of country \(i\) into a specific group, with \(Z{'}=({Z{'}}_{1},\dots ,{Z{'}}_{N})\). The parameters \(({\gamma }_{2},\dots ,{\gamma }_{K})\) are unknown group-specific values, and they allow us to estimate the prior classification probabilities of country \(i\) belonging to a group depending on the structural variables \({Z}_{i}\).
These parameters have a convenient interpretation because they determine the influence of each structural variable on classifying a country into a certain group. If the jth component of \({\gamma }_{k}\) is positive, then a higher value of the jth structural variable of country \(i\) makes this country more likely to belong to group \(k\) than to the baseline group. In contrast, if the component is negative, increasing the jth structural variable of country \(i\) increases the probability of this country being classified in the baseline group relative to group \(k\).
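The sign interpretation above can be checked numerically with a small sketch of multinomial-logit prior probabilities in which the first group is the baseline (\({\gamma }_{1}=0\)). The coefficient values are hypothetical and are not the paper's estimates; the function name is ours.

```python
import math


def prior_group_probabilities(z, gammas):
    """Multinomial-logit probabilities Pr(S_i = k | Z_i, gamma) for k = 1..K.

    z      : list of g covariates for one country
    gammas : list of K coefficient vectors; gammas[0] is all zeros (baseline group)
    """
    scores = [sum(zj * gj for zj, gj in zip(z, gamma_k)) for gamma_k in gammas]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical coefficients for K = 3 groups and g = 2 covariates
# (say, unemployment and a normalized digital adoption index).
gammas = [[0.0, 0.0], [0.4, -1.0], [0.2, -2.0]]
probs_high_digital = prior_group_probabilities([0.0, 1.5], gammas)
probs_low_digital = prior_group_probabilities([0.0, -1.5], gammas)
probs_average = prior_group_probabilities([0.0, 0.0], gammas)
```

With negative coefficients on the second covariate for Groups 2 and 3, raising that covariate shifts probability toward the baseline group, and a country with all covariates equal to zero is assigned equal prior probabilities in this intercept-free sketch.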
2.3 Model estimation
The model estimation is carried out within a Bayesian framework with the aid of Markov Chain Monte Carlo (MCMC) simulation and data augmentation methods for finite mixture models. Thus, using the information given in the data, the key issue is obtaining a posterior inference on the group indicator, \(S\), the model parameters, \(\vartheta\), the series-specific variance weights, \(\lambda\), and the intensity of the structural variables, \(\gamma\).
Let us start by assuming that the number of clusters \(K\) is known, although we will set a procedure for determining the number of clusters below.
Priors
The parameter vector is further broken down into parameter blocks, for all of which we assume standard prior distributions as follows: The prior distribution of the group-specific parameters is \(\left({\mu }^{k},{\alpha }^{k}\right)\sim N({m}_{0},{M}_{0})\), for \(k=1,\dots ,K\); the variance of the error terms and the series-specific variance weights follow inverse gamma and gamma distributions, respectively: \({\sigma }^{2}\sim IG\left({g}_{0},{G}_{0}\right)\) and \({\lambda }_{i}\sim G\left(\frac{v}{2},\frac{v}{2}\right)\) for \(i=1,\dots ,N\); and the parameters governing the prior group probabilities under the logit structure follow a normal distribution, \({\gamma }_{k}\sim N(0,\tau {I}_{g})\), for \(k=2,\dots ,K\) (recall that \({\gamma }_{1}=0\) for the baseline group), where \(g\) is the dimension of vectors \({Z}_{i}\).
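As a rough illustration of these prior blocks, the sketch below produces one joint draw from them. Several readings are assumptions of the sketch rather than statements of the paper: \({M}_{0}\) and \(\tau\) are treated as variances, \({\mu }^{k}\) and \({\alpha }^{k}\) are drawn independently, the gamma-type distributions are taken in shape-rate form, and the baseline restriction \({\gamma }_{1}=0\) from the model specification is imposed. The hyperparameter values mirror those reported in the empirical section, with \(G(4,4)\) read as \(v=8\).

```python
import math
import random


def draw_from_priors(K, N, g, m0, M0, g0, G0, v, tau, rng):
    """One joint draw from the priors (M0 and tau read as variances; shape-rate gammas)."""
    sd0 = math.sqrt(M0)
    mu_alpha = [(rng.gauss(m0, sd0), rng.gauss(m0, sd0)) for _ in range(K)]
    # If X ~ Gamma(shape=g0, rate=G0), then 1/X ~ IG(g0, G0).
    # random.gammavariate takes (shape, scale), so the rate G0 becomes scale 1/G0.
    sigma2 = 1.0 / rng.gammavariate(g0, 1.0 / G0)
    # G(v/2, v/2) in shape-rate form has mean 1; rate v/2 becomes scale 2/v.
    lam = [rng.gammavariate(v / 2.0, 2.0 / v) for _ in range(N)]
    # Baseline group keeps gamma_1 = 0; the remaining groups are N(0, tau * I_g).
    sd_tau = math.sqrt(tau)
    gammas = [[0.0] * g]
    gammas += [[rng.gauss(0.0, sd_tau) for _ in range(g)] for _ in range(K - 1)]
    return mu_alpha, sigma2, lam, gammas


rng = random.Random(1)
mu_alpha, sigma2, lam, gammas = draw_from_priors(
    K=3, N=121, g=4, m0=0.0, M0=1000.0, g0=1.0, G0=1.0, v=8.0, tau=20.0, rng=rng)
```

Such draws are useful mainly as a sanity check that the priors are proper and weakly informative before running the sampler.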
Estimation
The sampling scheme to draw from the posterior follows Frühwirth-Schnatter and Kaufmann (2008) and involves the iteration between the following three steps:

(i)
Classification for fixed parameters. Each time series \({y}_{i}\), with \(i =1,\dots , N\), is classified into one of the \(K\) groups by sampling the group indicator \({S}_{i}\) from the posterior distribution \(Pr\left({S}_{i} = k|{y}_{i},{Z}_{i},\vartheta ,\lambda ,\gamma \right),\) using the sampling density as well as the prior classification probabilities,
\[Pr\left({S}_{i}=k|{y}_{i},{Z}_{i},\vartheta ,\lambda ,\gamma \right)\propto p\left({y}_{i}|{\vartheta }_{k},{\lambda }_{i}\right)Pr\left({S}_{i}=k|{Z}_{i},\gamma \right), \tag{5}\]
for \(k=1,...,K\).

(ii)
Estimation for a fixed classification and \(\lambda\). Conditional on knowing the values of \(S\) and \(\lambda\), sampling \({\vartheta }_{1},...,{\vartheta }_{K}\) is carried out by sampling the group-specific parameters from the posterior \(p\left({\vartheta }_{1},...,{\vartheta }_{K}|S,y,\lambda \right)\), where each group parameter \({\vartheta }_{k}\) is estimated by pooling each time series that currently belongs to group \(k\). To sample \(\gamma\), we follow Scott (2011) and use a Metropolis-Hastings algorithm.

(iii)
Estimation of \(\lambda\) for a fixed \(S\), \(\vartheta\) and \(\gamma\). For each \(i =1,..., N\), the scale factors \(\lambda =({\lambda }_{1},\dots ,{\lambda }_{N})\) are sampled independently from the gamma distributions.
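Of the three steps above, the classification step (i) is the easiest to sketch in isolation. The code below combines the Gaussian trend density of a series with its prior group probabilities, normalizes on the log scale, and samples the group indicator. It is a simplified illustration and not the authors' implementation: steps (ii) and (iii) are omitted, the group parameters are hypothetical, and the prior probabilities are assumed to be strictly positive.

```python
import math
import random


def classify_series(y, group_params, sigma2, lam_i, prior_probs, rng):
    """Sample S_i with Pr(S_i = k | ...) proportional to density times prior."""
    var = sigma2 / lam_i
    log_post = []
    for (mu, alpha), prior in zip(group_params, prior_probs):
        ll = sum(-0.5 * (math.log(2.0 * math.pi * var)
                         + (y_t - mu - alpha * t) ** 2 / var)
                 for t, y_t in enumerate(y, start=1))
        log_post.append(ll + math.log(prior))
    top = max(log_post)  # normalize on the log scale to avoid underflow
    weights = [math.exp(lp - top) for lp in log_post]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Groups are labelled 1..K, as in the text.
    draw = rng.choices(range(1, len(probs) + 1), weights=probs, k=1)[0]
    return draw, probs


rng = random.Random(0)
group_params = [(10.0, 0.5), (6.0, 0.2), (2.0, 0.05)]  # hypothetical (mu, alpha) per group
y = [6.0 + 0.2 * t + rng.gauss(0.0, 0.3) for t in range(1, 30)]  # generated from group 2
s_i, probs = classify_series(y, group_params, sigma2=0.09, lam_i=1.0,
                             prior_probs=[1 / 3, 1 / 3, 1 / 3], rng=rng)
```

When the groups are well separated, as in this example, the posterior probability concentrates almost entirely on the generating group, so the sampled indicator rarely changes across iterations.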
The MCMC estimation procedure described above is repeated \(M\) times, and the stacked draws from all iterations can be used to perform inference. However, the sampler could present label-switching problems, and the finite mixture model must be identified through some inequality constraint on the group-specific parameters. To handle label switching in mixture models, we use the identifiability constraint \({\mu }^{1}>{\mu }^{2}>\dots >{\mu }^{K}\). In other words, the constraint implies that the groups are identified by their level of self-employment productivity.
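Once the model has been identified, we can perform inference regarding which time series belong to which group by using the posterior classification probability. In particular, we can estimate the posterior probability that a time series \({y}_{i}\) belongs to group \(k\) from the MCMC draws by averaging over the \(M\) iterations,
\[\widehat{Pr}\left({S}_{i}=k|{y}_{i},{Z}_{i}\right)=\frac{1}{M}\sum_{m=1}^{M}Pr\left({S}_{i}=k|{y}_{i},{Z}_{i},{\vartheta }^{(m)},{\lambda }^{(m)},{\gamma }^{(m)}\right), \tag{6}\]
where the superscript \((m)\) denotes the \(m\)th draw.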
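The averaging just described can be sketched as follows. The sketch assumes that what is averaged is the vector of per-iteration classification probabilities; averaging indicator draws of \({S}_{i}\) would be coded the same way with 0/1 entries. The draws shown are hypothetical.

```python
def posterior_classification(prob_draws):
    """Average per-iteration classification probabilities over M MCMC draws.

    prob_draws : list of M lists, each holding K probabilities for one series
    """
    M = len(prob_draws)
    K = len(prob_draws[0])
    avg = [sum(draw[k] for draw in prob_draws) / M for k in range(K)]
    # Assign the series to the group with the highest averaged probability (labels 1..K).
    best_group = max(range(K), key=lambda k: avg[k]) + 1
    return avg, best_group


# Three hypothetical draws for one country and K = 3 groups.
draws = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.8, 0.1, 0.1]]
avg, best_group = posterior_classification(draws)
```

Assigning each country to the group with the highest averaged probability is the rule used later to draw the geographical distribution of groups.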
2.4 The number of clusters
For exposition purposes, the number of components of the mixture, \(K\), was treated as known. In practice, however, the number of groups will be unknown. A straightforward way to choose the number of groups would be to select the number of components that maximizes the marginal likelihood from the set \(\{1,\dots ,{K}^{*}\}\), where \({K}^{*}\) is an upper bound. However, this method tends to result in a model with an arbitrarily large number of groups.
For this reason, we consider selecting the model with the number of groups necessary to maximize the quality of the classification by introducing the entropy of the model. If we call \({EN}_{j}\) the entropy of a model with a fixed number of \(j\) groups, the method entails selecting \(K\) as the value of \(j\) that minimizes \({EN}_{j}\) over \(j=1,\dots ,{K}^{*}\). Larger entropy values indicate worse clustering solutions in terms of classification quality, and the value would be 0 for a perfect classification.
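To illustrate why entropy measures classification quality, the sketch below computes a standard classification entropy, \(-\sum_{i}\sum_{k}{p}_{ik}\mathrm{log}\,{p}_{ik}\), from a matrix of posterior classification probabilities. This particular formula is an assumption of the sketch; the exact entropy expression and any normalization used in the paper are not reproduced here. The probability matrices are hypothetical.

```python
import math


def classification_entropy(post_probs):
    """Entropy -sum_i sum_k p_ik * log(p_ik), with 0 * log(0) taken as 0."""
    total = 0.0
    for row in post_probs:
        for p in row:
            if p > 0.0:
                total -= p * math.log(p)
    return total


sharp = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # every series classified with certainty
fuzzy = [[0.5, 0.5, 0.0], [0.4, 0.3, 0.3]]  # ambiguous memberships
entropy_sharp = classification_entropy(sharp)
entropy_fuzzy = classification_entropy(fuzzy)
```

A classification in which every series is assigned with certainty has zero entropy, while ambiguous memberships push the value up, which is the property the selection rule relies on.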
3 Empirical results
3.1 Model estimation
Estimation is based on the following priors. For group-specific parameters, we use \(\left({\mu }^{k}, {\alpha }^{k}\right)\sim N\left(0,1000\right)\). The priors of the variances are \({\sigma }^{2}\sim IG(1,1)\), and they are \({\lambda }_{i}\sim G(4,4)\) for the scale parameters. For the parameters of the logistic model, we use the prior \(\gamma \sim N(0,\tau {I}_{g})\), with \(\tau =20\) and \(g=4\). For each run of the MCMC sampler, after conducting a burn-in phase of 2000 iterations to remove dependence on the starting condition, 8000 draws are kept to evaluate the estimation.
To select the number of groups, we set \({K}^{*}=6\). Table 1 presents the results of the marginal likelihood and entropy for models with up to six groups. As expected, the likelihood increases with the number of groups. However, the model specification that divides the data into three separate groups is preferred because this model reaches the lowest entropy value (0.38) among model specifications.
Table 2 gives the posterior means of the mean and slope coefficients associated with each of the three groups, with their standard deviations in parentheses; all coefficients are significant at the 5% level. The table shows a division of countries into three distinct groups according to the average and trend of their respective self-employment productivity. The groups are ordered in decreasing order of entrepreneurship productivity levels and trends, with Group 1 being associated with countries with the highest levels of productivity and the steepest tendencies. In Group 2, we find countries with medium levels of productivity and mid-level tendencies. Finally, Group 3 contains the countries with the lowest productivity levels and the flattest tendencies.
To complete the description of the groups, Table 3 illustrates the individual characteristics that drive group formation. For this purpose, the table shows the posterior means of the estimated logistic coefficients that influence the group probabilities and their standard deviations (in parentheses), with bold indicating that the coefficients are significant at the 95% confidence level. The structural variables, which appear in columns, represent the averages of the unemployment rate, the value added by industry, the Labor Market Rigidity Index, and the Digital Adoption Index.
In accordance with the parameter estimates, we find that the ability of individuals in a country to access and use new information and communication technologies (ICTs) plays an important role in making the country less likely to be part of Groups 2 and 3. Thus, we identify Group 1 as the group of countries with the highest levels of adoption of digital technologies, which lead to high levels and an upward tendency of productive self-employment. To examine this finding more deeply, Fig. 1a shows the prior probability that country \(i\) belongs to Group 1, which is conditional on the structural variables, \(Pr({S}_{i}=1|{Z}_{i},\gamma )\), as a function of the Digital Adoption Index. The figure reveals the positive relationship between the adoption of digital technologies and the probability of being classified in the group with the highest level and steepest tendency of self-employment productivity.
In addition, Table 3 shows that countries with high unemployment rates have lower odds of being classified in the first group than of belonging to the second group. To illustrate this finding, Fig. 1b shows that the prior probability that country \(i\) belongs to Group 2, which is conditional on the structural variables, \(Pr({S}_{i}=2|{Z}_{i},\gamma )\), is positively correlated with unemployment.
The figures in Table 3 also point to a very interesting finding. The parameters governing the membership probabilities that relate to industry size and the rigidity of employment protection legislation are not statistically significant. Thus, neither the value added by industry nor labor market rigidity seems to play a statistically significant role in group formation.
To interpret the dynamics of group membership, Fig. 2 sketches the geographical distribution of the three groups based on each country's highest posterior probability. In particular, country \(i\) is classified in Group \(k\) if \(Pr\left({S}_{i} =k|{y}_{i},{Z}_{i},\vartheta ,\lambda ,\gamma \right)>Pr\left({S}_{i} =j|{y}_{i},{Z}_{i},\vartheta ,\lambda ,\gamma \right)\) for all \(j\ne k\). A visual examination of the map allows us to identify the highly productive countries of Group 1 as most of the European countries, the United States, Canada, Saudi Arabia, Oman, Japan and Australia.^{Footnote 4} These countries show the highest wages and associated standard of living, which can be sustained through businesses that compete with new and unique products and companies that compete through innovation and the production of new and different goods using the most sophisticated production processes.
The interaction between the medium levels and trends in self-employment productivity and high rates of unemployment plays a significant role in forming Group 2. In this group, we find the remaining European countries and some developing and emerging economies, such as southern Africa, Mexico, Brazil, Argentina, Chile, Uruguay, Algeria, Egypt, Turkey, Iran, Kazakhstan, South Korea and Malaysia. These countries produce standard products and services and are susceptible to external, sector-specific demand shocks.
Finally, the countries with the lowest entrepreneurship productivity and the flattest productivity trend appear in Group 3. In this group, we find Central and Middle African countries, such as Angola, Cameroon, the Central African Republic, Chad, and Nigeria; some Asian countries, such as India, Pakistan, Indonesia and Mongolia; and some South American countries, such as Paraguay, Bolivia, Peru, Ecuador, and Colombia. These are less developed economies, showing limitations in the accessibility of digital technologies, in the level of wages and in competitive advantages accompanied by a heavy reliance on unskilled labor and natural resources.^{Footnote 5}
3.2 Connection with the literature
Our empirical findings can be placed in the context of the most closely related literature on international self-employment development. First, our model-based clustering procedure split the dataset of countries into three distinct groups according to their levels and trends in self-employment productivity. According to the classification of competitiveness across stages of economic development advocated by Porter et al. (2002), the high-productivity group aligns with economies in the innovation-driven stage, the medium-productivity group relates to efficiency-driven economies, and the low-productivity group contains factor-driven economies.
Second, our results indicate that digitalization is a key factor favoring the transition from factor- and efficiency-driven economies to innovation-driven economies. This is likely because digitalization is a key competitive factor, both for a managed economy in which competitiveness is based on efficiency and for capturing the best profit opportunities that favor technological and economic leadership. This agrees with recent findings on the intersection of digital technologies and entrepreneurship and its impact on the pursuit of sustainable development; Nambisan (2017) and Jafari-Sadeghi et al. (2021) are two significant examples.
Third, our results suggest that there is a relationship between labor market dynamics (the reduction of unemployment) and the likelihood of a country moving from a medium- to a high-productivity self-employment group. At this point, we could argue that a well-functioning labor market generates sufficient wage employment opportunities to substantially reduce the relative weight of "necessity entrepreneurs," usually marginally attached to self-employment, with respect to "opportunity entrepreneurs." This results in an increase in the quality of entrepreneurship, which becomes a key element in capturing more and better profit opportunities and transforms an economy into an entrepreneurial-driven economy.
Thus, it follows that labor market reforms that promote employability and the provisioning of employment opportunities in the context of a labor market with adequate dynamism are elements that favor the transition to the innovation-driven group. These arguments are in line with the contributions of Acs (2006), Baptista and Thurik (2007), Baumol and Strom (2007), Acs et al. (2008) and Van der Zwan et al. (2016).
Fourth, our statistical evidence does not seem to support the idea that the industrial sector plays a significant role in the probability of a country belonging to the high-productivity self-employment group. This result contrasts with the expectation, in line with Lucas (1988), that industrialization processes should be accompanied by an increase in self-employment productivity as the low-skilled self-employed move to paid employment, attracted by higher wages in routine industrial jobs.
However, our result agrees with the literature on employer-size wage differentials. Davis and Haltiwanger (1996) pointed out that larger employers do not necessarily pay substantially higher wages because the dispersion of wages exhibits a pronounced relationship to employer size. In this context, Poschke (2018) found that not only the average size of firms but also their dispersion is significantly higher in developed countries, and Shi et al. (2020) recently suggested that innovation has a wage-boosting effect on firm wages that does not necessarily depend on firm size. In addition, Acs and Naudé (2013) recognized the complexity of the role of entrepreneurs in industrialization, as this role can be inhibited by, for example, market failures. For this reason, these authors do not see industrial policies as merely functional policies without consideration for firm or entrepreneurial specifics.
Finally, our results do not support the view that strict employment protection legislation promotes self-employment productivity, because it does not influence the transition toward the group of countries with the highest self-employment productivity. This is closely related to the work of Robson (2003), who found very limited evidence for a positive relationship between self-employment and the strictness of employment protection legislation, as this largely depends on the introduction of suitable control variables. Torrini (2005) also failed to find any robust relationship between the self-employment rate and employment protection legislation in a multivariate context.
3.3 Policy recommendations
This study provides guidelines that policymakers can use when drawing up effective national strategies for self-employment productivity and when combatting traditional stereotypes that appear to be less effective for transitioning into the group of highly productive countries.
Our analysis identifies three clusters of countries with some degree of similarity regarding their level and trend in self-employment productivity. This classification can help national policymakers to verify which group their own country belongs to and determine whether their country has performed on par with other countries in similar economic circumstances.
Unfortunately, our results provide evidence that the catch-up effect, which predicts that all economies will eventually converge in terms of self-employment productivity, does not apply. In this context, we consider policy inaction to be unjustified and recognize that there is pressure on governments to provide resources to assist in promoting changes toward more productive clusters.
According to our results, the first challenge of policy interventions implies improving incentive structures for entrepreneurs associated with digitalization and promoting the introduction of a new culture of digital entrepreneurship. This implies supporting the development of digital and entrepreneurship skills by addressing some key barriers with a range of policy actions. Examples include embedding digital entrepreneurship modules in entrepreneurship education, offering tailored digital entrepreneurship training programs and improving access to finance for digital entrepreneurship for underrepresented and disadvantaged groups.
The second challenge for policy interventions aimed at raising self-employment productivity and facilitating the transition into the innovation-driven group is to take decisive measures to reduce national unemployment. To name a few, we suggest increasing the attractiveness to private capital, removing the obstacles to labor mobility, improving the quality of formal job allocation mechanisms and reducing rigidities in the housing market. The effectiveness of national policies must also be enhanced by ensuring that funds devoted to reducing unemployment are well managed and that the monitoring and evaluation procedures are improved to guarantee a consistent long-term strategy for human capital.
Likewise, our results point to avoiding policy measures that are likely to be inefficient in promoting self-employment productivity. Our results do not support implementing industrial policies as merely functional policies without consideration of firm or entrepreneurial specifics. In fact, the patterns we find suggest that a larger industrial sector is not necessarily better for self-employment productivity.
In addition, our results suggest that changes in the rigidity of labor legislation per se do not serve to stimulate self-employment productivity. Despite the explicit set of rules that govern national employment protection legislation, different degrees of regulatory compliance and the possibility of evasion opportunities could explain this finding. In any case, we consider that the design of incentive schemes should not result in distortions to the allocation of talent between salaried employment and self-employment.
4 Conclusions
This paper re-examines the diversity in the level and dynamics of entrepreneurship across countries in terms of self-employment productivity. To this end, we applied a Bayesian finite mixture model for clustering time series to a large dataset consisting of internationally comparable indicators covering a large set of 121 countries over the last three decades.
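To make the idea of mixture-based clustering of time series concrete, the following Python sketch computes posterior group-membership probabilities for a single series by Bayes' rule. It is a deliberately simplified illustration, not the authors' estimation procedure: it assumes that each group is described by a fixed level, a fixed linear trend, and Gaussian noise, and that the group parameters and prior weights are known. In a full Bayesian treatment these quantities would be estimated jointly, typically by simulation. All names and numbers are hypothetical.

```python
import math


def log_gaussian(x, mean, sd):
    """Log density of a normal distribution with the given mean and sd."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def posterior_membership(series, groups, weights):
    """Posterior probability of each group for one time series.

    series  : observations y_0, ..., y_{T-1}
    groups  : list of (level, trend, sd) tuples, one per group (assumed known)
    weights : prior group probabilities, summing to one
    """
    log_post = []
    for (level, trend, sd), w in zip(groups, weights):
        # Log-likelihood of the series around the group's trend line.
        ll = sum(log_gaussian(y, level + trend * t, sd) for t, y in enumerate(series))
        log_post.append(math.log(w) + ll)
    top = max(log_post)  # subtract the maximum to avoid numerical underflow
    unnorm = [math.exp(lp - top) for lp in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Hypothetical groups: high, medium, and low level and trend.
GROUPS = [(10.0, 0.5, 1.0), (5.0, 0.2, 1.0), (2.0, 0.0, 1.0)]
WEIGHTS = [1 / 3, 1 / 3, 1 / 3]

series = [10.0 + 0.5 * t for t in range(5)]  # lies exactly on the first trend line
post = posterior_membership(series, GROUPS, WEIGHTS)
print([round(p, 3) for p in post])
```

Working in logs and subtracting the largest value before exponentiating keeps the computation stable when the series is long and the likelihoods become very small.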
Our empirical findings point to the existence of three homogeneous groups stratified by the following levels of entrepreneurship productivity: high-, medium-, and low-productivity countries. These clusters are roughly aligned with the three major groups of countries usually considered in the entrepreneurship literature, namely, factor-, efficiency-, and innovation-knowledge-driven countries, and with the literature on managed vs. entrepreneurial societies. In addition, these clusters parallel the three different stages of economic development, namely, developing, transitioning and developed countries.
In contrast to simpler clustering methods, our clustering approach allows us to examine the key structural variables that allocate each country to a particular cluster and regulate the transition to more productive groups. In other words, our results not only provide homogeneous country groups regarding the quality of entrepreneurship but also point to the institutions or elements in the national entrepreneurial ecosystem that enable their high self-employment productivity and determine their transition to higher-productivity groups. The identification of these factors might be particularly useful for policymakers interested in promoting self-employment productivity and for academics who strive to test previous theories and hypotheses.
Among the factors guiding the transition between groups, we consider some structural variables that strengthen or weaken the probability of a nation becoming an entrepreneurial economy. These variables are the unemployment rate, which refers to the labor market situation; the average of industry added value as a percentage of GDP, which measures the level of industrialization; the Digital Adoption Index, which measures the diffusion and adoption of digital technologies; and the Labor Market Rigidity Index, which is a measure of labor market rigidities. These variables are usually viewed as common drivers in entrepreneurship (e.g., Audretsch and Thurik 2004) and are included in the literature on regional innovation systems and entrepreneurial ecosystems (see Cao and Shi 2021 and Qian and Acs 2023 for recent surveys).
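One common way to let structural variables shift the probability of group membership is a multinomial logit link, of the kind analyzed in Scott (2011). The Python sketch below assumes that functional form purely for illustration; this section does not spell out the specification, and the coefficient values, the choice of two covariates, and the function name are hypothetical. Only the signs are chosen to mimic the qualitative findings that more digital adoption and lower unemployment favor the high-productivity group.

```python
import math


def membership_probabilities(z, gammas):
    """Multinomial-logit probabilities Pr(S_i = k | Z_i, gamma), k = 1, ..., K.

    z      : covariates for one country, with a leading 1.0 for the intercept
    gammas : one coefficient vector per group; the baseline group is all zeros
    """
    scores = [sum(g * x for g, x in zip(gamma_k, z)) for gamma_k in gammas]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical coefficients on [intercept, digital adoption, unemployment rate].
# The magnitudes are made up; they are not estimates from the paper.
GAMMAS = [
    [-1.0, 4.0, -0.10],  # Group 1 (high productivity)
    [0.0, 1.5, -0.02],   # Group 2 (medium productivity)
    [0.0, 0.0, 0.0],     # Group 3 (low productivity, baseline)
]

low_digital = membership_probabilities([1.0, 0.3, 8.0], GAMMAS)
high_digital = membership_probabilities([1.0, 0.8, 8.0], GAMMAS)
print(round(low_digital[0], 3), round(high_digital[0], 3))
```

Under these made-up coefficients, raising the digital adoption score while holding unemployment fixed increases the probability of belonging to Group 1, which is the direction of the effect reported in the text.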
Our results suggest that policy measures oriented toward (i) creating and enabling environments that foster the accessibility of digital technologies and (ii) promoting initiatives for reducing unemployment are key elements for those countries that are generally moving toward becoming highly productive economies. However, the results fail to find the share of industrial added value as a determinant of such transitions. In addition, we find that deregulation policies meant to reduce rigidities in the labor market are also not important keys for transitioning between clusters.
According to our results, the proposed framework is a very promising tool for analyzing the determinants of self-employment productivity. In fact, we look forward to future work addressing the following issues. First, although we focused on aggregate self-employment productivity, we see a natural extension to be the exploration of disaggregated measures, mixed incomes, and non-agricultural self-employment. Second, we could extend the number of additional factors driving the transition between groups. Third, the method is suitable for exploring the determinants of self-employment productivity at the regional level. These extensions were not pursued in this paper because they would come at the cost of reducing the number of observational units, and they are explicitly left for further research.
Data availability
The data and MATLAB code that support the findings of this study are available from the corresponding author upon request.
Notes
In particular, we converted GDP to international dollars using purchasing power parity rates, in constant 2017 international dollars, from the World Development Indicators database, World Bank and Eurostat-OECD PPP Programme. For Canada, we use real GDP at constant national prices in millions of 2017 US dollars, which is taken from Penn World Table 10.0.
For further details on finite mixture models, see Frühwirth-Schnatter (2006).
The advantages of Gaussian mixture modeling are that the estimation methods are well established, and the component distributions are thoroughly understood and thus interpretation of the results is facilitated.
Noticeable exceptions are Portugal, Poland, Ukraine, and Greece.
One significant exception is China.
References
Acs ZJ (2006) How is entrepreneurship good for economic growth? Innovations: Techno Govern Glob 1(1):97–107
Acs ZJ, Naudé W (2013) Entrepreneurship, stages of development, and industrialization. In: Szirmai A, Naudé W, Alcorta L (eds) Pathways to industrialization in the twenty-first century: New challenges and emerging paradigms. Oxford Academic, Oxford
Acs ZJ, Desai S, Hessels J (2008) Entrepreneurship, economic development and institutions. Small Bus Econ 31(3):219–234
Acs ZJ, Audretsch DB, Evans DS (1994) Why does the self-employment rate vary across countries and over time? Discussion Paper No. 871, Center for Economic Policy Research, January 1994. https://econpapers.repec.org/paper/cprceprdp/871.htm
Allub L, Erosa A (2019) Financial frictions, occupational choice and economic inequality. J Monet Econ 107:63–76
Anokhin S, Schulze WS (2009) Entrepreneurship, innovation, and corruption. J Bus Ventur 24(5):465–476
Arin KP, Huang VZ, Minniti M, Nandialath AM, Reich OFM (2015) Revisiting the determinants of entrepreneurship: a Bayesian approach. J Manag 41(2):607–631
Audretsch DB, Thurik AR (2004) A model of the entrepreneurial economy. Int J Entrepre Educ 2(2):143–166
Baptista R, Thurik AR (2007) The relationship between entrepreneurship and unemployment: is Portugal an outlier? Technol Forecast Soc Change 74(1):75–89
Baumol WJ, Strom RJ (2007) Entrepreneurship and economic growth. Strateg Entrepre J 1(3–4):233–237
Belitski M, Chowdhury F, Desai S (2016) Taxes, corruption, and entry. Small Bus Econ 47(1):201–216
Bjørnskov C, Foss NJ (2016) Institutions, entrepreneurship, and economic growth: what do we know and what do we still need to know? Acad Manag Perspect 30(3):292–315
Blanchflower DG (2000) Self-employment in OECD countries. Labour Econ 7(5):471–505
Blanchflower DG (2004) Self-employment: More may not be better. Swed Econ Policy Rev 11(2):15–73
Blanchflower DG, Shadforth C (2007) Entrepreneurship in the UK. Found Trends Entrepre 3(4):257–364
Campos NF, Nugent JB (2018) The dynamics of the regulation of labour in developing and developed countries since 1960. In: Campos NF, Grauwe P, Ji Y (eds) The political economy of structural reforms in Europe. Oxford University Press, pp 75–88
Cao Z, Shi X (2021) A systematic literature review of entrepreneurial ecosystems in advanced and emerging economies. Small Bus Econ 57(1):75–110
Centeno M (2000) Is self-employment a response to labour market rigidity? Econ Bull 37–44
Congregado E, Golpe AA, Carmona M (2010) Is it a good policy to promote self-employment for job creation? Evidence from Spain. J Policy Model 32(6):828–842
Cowling ML, Wooden M (2021) Does solo self-employment serve as a 'stepping stone' to employership? Labour Econ 68:101942
Davis SJ, Haltiwanger J (1996) Employer size and the wage structure in US manufacturing. Ann Econ Stat 41–42:323–367
Djankov S, Ganser T, McLiesh C, Ramalho R, Shleifer A (2010) The effect of corporate taxes on investment and entrepreneurship. Am Econ J: Macroeconomics 2(3):31–64
Dutta N, Sobel R (2016) Does corruption ever help entrepreneurship? Small Bus Econ 47:179–199
Estrin S, Korosteleva J, Mickiewicz T (2012) Which institutions encourage entrepreneurial growth aspirations? J Bus Ventur 28(4):564–580
Fairlie RW, Fossen FM (2020) Defining opportunity versus necessity entrepreneurship: two components of business creation. In: Polachek SW, Tatsiramos K (eds) Change at home, in the labor market, and on the job, vol 48. Emerald Publishing Limited, pp 253–289. https://doi.org/10.1108/S0147-912120200000048008
Fölster S (2002) Do lower taxes stimulate self-employment? Small Bus Econ 19:135–145
Frühwirth-Schnatter S (2006) Finite mixture and Markov switching models. Springer, New York
Frühwirth-Schnatter S, Kaufmann S (2008) Model-based clustering of multiple time series. J Bus Econ Stat 26(1):78–89
Hamilton JD, Owyang MT (2012) The propagation of regional recessions. Rev Econ Stat 94(4):935–947
Jafari-Sadeghi V, Garcia-Perez A, Candelo E, Couturier J (2021) Exploring the impact of digital transformation on technology entrepreneurship and technological market expansion: the role of technology readiness, exploration and exploitation. J Bus Res 124:100–111
Kanniainen V, Vesala T (2005) Entrepreneurship and labor market institutions. Econ Modell 22(5):828–847
Kaufmann S (2010) Dating and forecasting turning points by Bayesian clustering with dynamic structure: a suggestion with an application to Austrian data. J Appl Economet 25(2):309–344
Lucas RE Jr (1988) On the mechanics of economic development. J Monet Econ 22(1):3–42
Maloney WF (2004) Informality revisited. World Dev 32(7):1159–1178
Nambisan S (2017) Digital entrepreneurship: Toward a digital technology perspective of entrepreneurship. Entrepre Theory Pract 41(6):1029–1055
Pietrobelli C, Rabellotti R, Aquilina M (2004) An empirical study of the determinants of self-employment in developing countries. J Int Dev 16(6):803–820
Porter M, Sachs J, McArthur J (2002) Executive summary: Competitiveness and stages of economic development. In: Porter M, Sachs J, Cornelius PK, McArthur J, Schwab K (eds) The global competitiveness report. Oxford University Press, New York, pp 2001–2002
Poschke M (2013) Who becomes an entrepreneur? Labor market prospects and occupational choice. J Econ Dyn Control 37(3):693–710
Poschke M (2018) The firm size distribution across countries and skill-biased change in entrepreneurial technology. Am Econ J: Macroeconomics 10(3):1–41
Poschke M (2019) Wage employment, unemployment and self-employment across countries. IZA Discussion Paper No. 16271. https://doi.org/10.2139/ssrn.3401135
Qian H, Acs ZJ (2023) Entrepreneurial ecosystems and economic development policy. Econ Dev Q 37(1):96–102
Robson MT (2003) Does stricter employment protection legislation promote self-employment? Small Bus Econ 21(3):309–319
Rodriguez-Santiago A (2022) Reevaluating the relationship between economic development and self-employment, at the macro-level: a Bayesian model averaging approach. Int J Interact Multimedia Artif Intell 7(3):20–25
Scott SL (2011) Data augmentation, frequentist estimation, and the Bayesian analysis of multinomial logit models. Stat Pap 52(1):87–109
Shane S (2009) Why encouraging more people to become entrepreneurs is bad public policy. Small Bus Econ 33:141–149
Shi L, Li S, Fu X (2020) The fourth industrial revolution, technological innovation and firm wages: Firm-level evidence from OECD economies. Revue d'Économie Industrielle 1(169):89–125
Sobel RS (2008) Testing Baumol: Institutional quality and the productivity of entrepreneurship. J Bus Ventur 23(6):641–655
Stam E (2015) Entrepreneurial ecosystems and regional policy: A sympathetic critique. Eur Plan Stud 23(9):1759–1769
Stam E, Van de Ven A (2021) Entrepreneurial ecosystem elements. Small Bus Econ 56(2):809–832
Thurik AR, Carree MA, Van Stel A, Audretsch DB (2008) Does self-employment reduce unemployment? J Bus Ventur 23(6):673–686
Torrini R (2005) Cross-country differences in self-employment rates: The role of institutions. Labour Econ 12(5):661–683
Urbano D, Audretsch D, Aparicio S, Noguera M (2020) Does entrepreneurial activity matter for economic growth in developing countries? The role of the institutional environment. Int Entrepre Manag J 16(3):1065–1099
Van der Zwan P, Thurik AR, Verheul I, Hessels J (2016) Factors influencing the entrepreneurial engagement of opportunity and necessity entrepreneurs. Eurasian Bus Rev 6(3):273–295
Wennekers S, Van Stel A, Carree M, Thurik AR (2010) The relationship between entrepreneurship and economic development: Is it U-shaped? Found Trends Entrepre 6(3):167–237
Acknowledgements
The authors would like to thank Jesús Crespo-Cuaresma, Begoña Cueto, Frank Fossen, Javier J. Pérez, Concepción Román, and Andre Van Stel for their insightful comments that contributed substantially to the development of this paper. All remaining errors are our responsibility.
Funding
Funding for open access publishing: Universidad de Huelva/CBUA. The authors acknowledge funding from the Spanish Ministry of Science, Innovation and Universities through project PID2020115183RBC22; from Junta de Andalucía through grant P2000733 and Research Group SEJ487 (Spanish Entrepreneurship Research Group – SERG); and from Research and Transfer Policy Strategy of the University of Huelva 2021. M. Camacho is grateful for the support of grant PID2022136547NBI00 funded by MICIU/AEI/https://doi.org/10.13039/501100011033 and by FEDER, UE.
Author information
Contributions
All authors contributed in the same way to the writing and preparation of this research article.
Ethics declarations
Competing interests
The authors have no competing interests to declare.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Camacho, M., Congregado, E. & Rodriguez-Santiago, A. An inquiry into the drivers of an entrepreneurial economy: A Bayesian clustering approach. J Evol Econ (2024). https://doi.org/10.1007/s00191-024-00863-9
DOI: https://doi.org/10.1007/s00191-024-00863-9
Keywords
 Entrepreneurship
 Productive self-employment
 Model-based clustering
 Finite mixture models
 Cross-country analysis
 Transition probabilities