1 Introduction

Before the sudden halt brought about by the outbreak of the COVID-19 pandemic in the early 2020, migration flows towards OECD countries and across the world witnessed a persistent upsurge (OECD 2020b, https://www.migrationdataportal.org). The economic effects of migration have gained centrality in the academic debate, with major contributions addressing effects on employment, productivity, trade, and innovation.Footnote 1 In particular, since the seminal works by Gould (1994) and Head and Ries (1998), a solid strand of literature has developed and has shown that migration contributes to trade in a significant way. Besides demanding home-country products, migrants attenuate the information and enforcement costs that, even in the ICT era, affect international trade (Rauch 2001; Anderson and van Wincoop 2004).Footnote 2 Despite the rich empirical support for these effects, recent studies have started to raise concerns of a possibly overstated nexus (Parsons 2012; Burchardi et al. 2018). Moreover, results appear sensitive to the adopted empirical approaches—e.g., countries vs. subnational units of analysis, immigrants vs. emigrants, imports vs. exports, standard vs. differentiated commodities, similar vs. dissimilar countries)—uncovering the nuances of a still-open research issue. This is particularly so when we address the persistence of migration effects on trade during downturns in the business cycle, which has received little attention so far. Yet, addressing this issue appears very timely in light of the recent economic crisis caused by COVID-19 (Frankel and Romer 1999; Alcalá and Ciccone 2004; OECD 2020a; UNCTAD 2020).

Most studies analyzing the link between migration and trade (e.g., Gould 1994; Dunlevy and Hutchinson 1999; Rauch and Trinidade 2002) employ gravity models (Anderson and van Wincoop 2003; Chaney 2008; Head and Mayer 2014). Yet, the application of the gravity model to the analysis of the migration–trade link still reveals important gaps from both a theoretical and econometric perspective. We seek to fill these gaps. Using an integrated approach that draws on theoretical and methodological contributions from the frontier of the debate (Head and Mayer 2014; Correia et al. 2019, 2020; Weidner and Zylkin 2020), we investigate the case for an actual pro-export effect of migrants in a country characterized by large diasporas, like Spain, over a decade that has witnessed the interlinking of two economic crises (2007–2016). In doing so, we jointly consider three issues.

First, unlike the majority of previous studies, we simultaneously address the pro-export effects of both emigrants and immigrants. Indeed, when studying only one direction of migration the effects may be confounded and ascribed to the wrong underlying mechanism, especially in countries with large diasporas, like the one on which we focus.

Second, we refer to very disaggregated geographical units of analysis, i.e., Spanish NUTS3 regions (“provincias”). Due to the localized nature of knowledge spillovers (Audretsch and Feldman 1996; Rauch 2001) and the spatial heterogeneity in the distributions of both migration and trade (e.g., Peri and Requena-Silvente 2010; Bratti et al. 2014), the migration–trade link is arguably a localized phenomenon. Accordingly, the extant literature at the subnational level is quite developed.Footnote 3 However, to the best of our knowledge, this is the first study to investigate the export effects of emigration (along with immigration) at the subnational level by allowing for subnationally heterogeneous export capacities. More precisely, and in line with recent contributions (Briant et al. 2014; Burchardi et al. 2018; Bratti et al. 2020), we allow for subnational heterogeneity in the “multilateral resistance” factors that inhibit the trade of provinces to any partner (i.e., in the so-called “multilateral resistance term” (MRT); Anderson and van Wincoop 2003).

Third, we contribute to the discussion on the delicate methodological choice of how to estimate migration-augmented gravity models of trade with panel data: an issue at the frontier of econometric “best practices” to consistently identify the determinants of international trade (Larch et al. 2019, p. 487). Indeed, the literature has long indicated an ideal candidate estimator for gravity models. This appears to be the Poisson Pseudo Maximum Likelihood (PPML) estimator with time-varying exporter and importer fixed effects and with time-invariant region-country fixed effects. Such an approach would address the inconsistency of heteroskedastic log-linear models estimated by OLS (Santos-Silva and Tenreyro 2006) and allow for heterogeneous MRTs while controlling for bilateral heterogeneity in unobservable trade barriers (Feenstra 2004; Baldwin and Taglioni 2007; Baier and Bergstrand 2007). Until recently, however, the literature had not provided a solution to the issue of separation in PPML models, which is particularly severe in the context of high-dimensional fixed effects, and for which maximum likelihood estimates may not exist or may be incorrect (Santos-Silva and Tenreyro 2010, 2011; Larch et al. 2019; Correia et al. 2019). Moreover, the asymptotic properties of PPML estimates with more than two-way fixed effects were still unclear (Weidner and Zylkin 2020). In this paper, we take advantage of recent econometric and computational advances (Correia et al. 2019, 2020; Weidner and Zylkin 2020) to improve the implementation of this approach. Moreover, taking stock of the simulation results by Head and Mayer (2014), who cast doubts on the use of PPML as a “workhorse” estimator for gravity models, we select the most suitable estimator through diagnostic tests that address the underlying distribution of the errors and the sources of potential misspecification (Head and Mayer 2014; Manning and Mullahy 2001; Santos-Silva and Tenreyro 2006).

We carry out our integrated analysis by focusing on exports. We study 5450 trading pairs, constituted by 50 Spanish provincesFootnote 4 and 109 countries over the period of 2007–2016. We first work out the elasticity of exports to both immigration and emigration. Then, we revisit some stylized facts of the migration–trade link, like the role of institutional and language similarity between trade partners and the distinction between local and non-local effects of migration. Finally, we provide a first exploration of the extent to which the effects of migrants on exports has changed over time. Indeed, focusing on the 2007–2016 timeline, we provide a first endurance test of the trade effects of migration during the subprime mortgage crisis, the unfolding of the sovereign debt default crisis, and the subsequent recession.

Our results only partially confirm the available knowledge on the issue and offer some novel insights. When both directions of migration are retained and their endogeneity is addressed through an instrumental variable (IV) strategy, we do not find robust evidence of an immigration effect on exports. Exports from Spanish provinces appear mainly driven by emigrants, and the ability of immigrants to promote trade appears to deteriorate with the global financial downturn.

The remainder of this paper is organized as follows. In Sect. 2, we position our research in the extant literature. In Sect. 3, we present the data used for our empirical analysis, and in Sect. 3.1 we illustrate its methodological novelties. Section 4 presents our results, and Sect. 5 offers some concluding remarks. Appendix A details our data sources and variables, and Appendices B and C contain a set of robustness checks and heterogeneity analyses.

2 Migrants and exports at the subnational level: the case for heterogeneous export capacities

The analysis of the trade impacts of migration has so far mainly concentrated on the effects of immigrants on trade in their recipient countries. Furthermore, when implemented at the subnational level, analyses have generally assumed that the trade capacities of the recipient units of investigation (i.e., regions) are homogeneous. Both choices entail problematic implications that we address by integrating the role of emigrants along with that of immigrants and by allowing for export capacities to be heterogeneous across subnational units of analysis.

2.1 The pro-export effects of both emigrants and immigrants

The mechanisms through which migration can affect trade have been extensively studied over the last decades.Footnote 5 Migrants typically move to a new location and preserve a relationship with their origin countries by creating and maintaining social and business networks that span across countries (Rauch 2001; Rauch and Casella 2001). Their embeddedness in these transnational networks is at the core of migrants’ capacities to reduce the so-called informal barriers to trade, that is, of the “network effects” they exert on trade. First, given their knowledge of customs, laws, markets, language, and business practices on the two sides of their migration route, migrants can help fill the information gaps between sellers and buyers, and in doing so they facilitate the realization of new business opportunities (information effect). Second, within their transnational networks, migrants can put in place implicit enforcing mechanisms (e.g., punishment, sanctions, and exclusions) for international contractual relationships and compensate for the weakness of institutional protection mechanisms (enforcement effect). A different kind of effect (preference effect) refers to migrants’ preferences for products from their homeland, which increases trade unidirectionally, with emigrants adding to the foreign demand for exports and immigrants increasing the domestic demand for imports (Hatzigeorgiou 2010; Parsons 2012).

While the literature has mainly focused on immigrants, the above mechanisms can be argued to apply to emigrants as well (see, e.g., Murat and Pistoresi 2009; Parsons 2012). In countries with large diasporas, there is no ex-ante reason to expect that information and enforcement effects are only due to inward rather than outward migration; and emigrant preference effects may be substantial drivers of exports that may confound the results if neglected. Hence, omitting the emigration side from the analysis may not only overstate the immigrant effects, but more importantly, it could also lead to wrongly attributing to information and enforcement what is in fact a preference effect.

In spite of these problematic implications, the emigration side of the migration–trade nexus has generally been neglected so far,Footnote 6 mainly due to the lack of data, and especially in studies with a subnational focus. This is doubly unfortunate. From a subnational perspective, the effects of emigration could, in fact, operate differently from those of immigration. For example, Spanish emigrants may access knowledge exchanged within networks of co-nationals from provinces other than their own, with whom they share a language and social capital. Accordingly, they would promote the realization of trade opportunities not only with their province of origin but with Spain as a whole.

Another reason for their joint analysis is that emigrants and immigrants are likely complementary and may perform their bridging role in different contexts. Emigrant destination countries may substantially differ from immigrant countries of origin, for example, in terms of resource endowments, cultural habits, and institutional setups (Girma and Yu 2002; Dunlevy 2006), and they could follow distinct historical routes (Gould 1994; Rauch 2001). Accordingly, the trade contribution of emigrants could be higher or lower than that of immigrants, depending on the intensity of the existing barriers to trade (Rauch 2001). Furthermore, differences in tastes and human capital could translate into different effects on the trade of specific commodities and services (Rauch and Trinidade 2002; Peri and Requena-Silvente 2010; Briant et al. 2014).

In conclusion, in addressing the effect migrants can have on trade and, like in our empirical application, on exports, the joint analysis of both emigrants and immigrants is crucial to obtaining accurate results and drawing valid conclusions.

2.2 The gravity model with subnationally heterogeneous export capacities

The information flows that account for a large portion of migrants’ pro-trade effects strongly rely on the business and social networks that migrants create. These are networks that operate mainly through direct interpersonal contacts and proximity (Rauch 2001). Given the tendency of new incoming immigrants to settle close to places where other immigrants have already settled (Altonji and Card 1991; Card 2001) and given the subnational heterogeneity in the economic structure of countries (Bratti et al. 2014), network effects can be expected to be heterogeneous across subnational units of analysis. In Spain, for example, at the beginning of our period of analysis (2007) seven provinces (Madrid, Barcelona, Alicante, Valencia, Malaga, Murcia, and the Balearic Islands) contained about 62% of the country-level immigrants, and eight provinces (Barcelona, Madrid, Valencia, Pontevedra, Zaragoza, Bizkaia, Gipuzkoa, A Coruña) accounted for 59% of exports. Emigrants were only slightly less concentrated, with about 60% originating from 9 provinces (Madrid, A Coruã, Pontevedra, Barcelona, Ourense, Asturias, Santa Cruz de Tenerife, Lugo, Valencia). These facts clearly indicate the polarizing role of the provinces of Madrid, Barcelona, and Valencia, but also the subnational heterogeneity in the distribution of immigrants, emigrants, and exports. On the basis of this evidence, subnational heterogeneity in the pro-trade effects of migrants is to be expected.

A subnational analysis of the pro-trade effects of migrants is indeed highly desirable. Investigating such a localized phenomenon as migration at the country level could, in fact, suffer from the Modifiable Areal Unit Problem (MAUP, Openshaw 1983), and the choice of the subnational level of analysis appears preferable. Furthermore, the reference to subnational observations increases data variability and mitigates concerns of spurious correlations affecting the relationship between trade and migration (Wagner et al. 2002; Bratti et al. 2014). For these reasons, the literature has progressively moved towards a finer geographic disaggregation in terms of units of analysis (for a review of studies at the national vs. subnational level, see Peri and Requena-Silvente 2010 and Felbermayr et al. 2015). Despite this wealth of studies, however, the trade implications of migrants, from a subnational perspective, have not yet been fully exploited.

As in the case of national units of analysis, analysis at the subnational level has developed through the advances in the gravity model of international trade by Anderson and van Wincoop (2003). Their “Multinational Resistance Term Revolution” (Head and Mayer 2014) led to an important extension of its standard “naive” formulation, mainly drawn on the analogy with Newtonian law in physics (Tinbergen 1962; Bergstrand 1985). Since then, country i’s exports to country j, \(X_{ij}\), are not only assumed to be a positive function of their economic masses \(Y_{i}\) and \(Y_{j}\) and a negative function of their distance and of the relative transaction costs, \(\phi _{ij}\);Footnote 7 in addition, the “monadic” terms are adjusted by the average openness to trade of each trading partner, briefly, by their “Multilateral Resistance Terms” (MRTs). Denoting with \(\Omega _i\) the average market size accessible to the exporting country and with \(\Phi _j\) the average degree of competition of the importing one,Footnote 8 the “structural” form of the gravity equation (Head and Mayer 2014) in a cross-sectional context is the following:Footnote 9

$$\begin{aligned} X_{ij}=\frac{Y_i}{\Omega _i}\frac{Y_j}{\Phi _j}\phi _{ij} \end{aligned}$$

Following the previous equation, any change in bilateral trade barriers, encapsulated in the “dyadic” term \(\phi _{ij}\), like their reduction entailed by migration, should be evaluated relative to the MRT, rather than in absolute terms (Anderson and van Wincoop 2003). Following Baier and Bergstrand (2007) and Baldwin and Taglioni (2007), the application of Eq. 1 to a panel context requires recognizing that most variables of interest, including the MRT, are time-varying.

With subnational units, and in the absence of subnationally disaggregated data on the destination of exports, as in our case, the gravity model becomes asymmetrical. In our case, the exporters are the NUTS3 Spanish provinces, while the importers are the destination countries. However, the interpretation of the terms in Eq. 1 remains remarkably similar. Indeed, as recently formalized by Bratti et al. (2020), the heterogeneous productivity of firms in different regions implies subnationally heterogeneous exporting capacities.Footnote 10 In turn, subnationally heterogeneous productivity suits well the case of countries marked by a geographically fragmented production structure, such as Spain, and bears implications for the study of the migration–trade link. The average productivity of firms located in a given province is in fact not unrelated to bilateral migration stocks. Provinces with more productive firms may have a more dynamic structure of opportunities, attract more migrants from any origin country, and have lower emigration rates. The overall supply of immigrant labor, in turn, may affect productivity, wages, and the offshoring decisions of firms (e.g. Ottaviano and Peri 2006; Ottaviano et al. 2013) and can ultimately affect the accessibility-weighted exporting capacities of the exporter.

In light of the previous arguments, and in order not to commit the “gold medal mistake” of the gravity literature (Baldwin and Taglioni 2007), subnational and country-level studies alike need to account for the heterogeneity of the exporter-side MRT (Baier and Bergstrand 2007). In a subnational context, the MRT represents the province’s (weighted) capacity of export to any country in the world.Footnote 11

In spite of this rich theoretical background, the estimate of subnational gravity models with heterogeneous exporter MRTs was only recently incorporated into panel data analyses (see Bratti et al. 2020). Briant et al. (2014) and Burchardi et al. (2018) include exporter effects, but in a cross-sectional framework. Bandyopadhyay et al. (2008) and Peri and Requena-Silvente (2010) use panel data but assume the term to be constant across regions in the same country, while Bratti et al. (2014) assume it to be invariant across the provinces (NUTS3) of the same more aggregated regional (NUTS2) level of analysis.

In an attempt to fill this gap, in the empirical application that we propose with panel data, we allow for subnationally heterogeneous export capacities at the NUTS3 level (provincias, referred to as “provinces”) rather than the NUTS2 level (Comunidades Autonomas, referred to as “regions”).

3 Empirical application

We investigate the role of migration in driving the export performance of 50 Spanish provinces (NUTS3) towards 109 destination countries over the 10 years of 2007–2016. Compared to the previous study of the migration–trade link in Spanish provinces by Peri and Requena-Silvente (2010), we include a wider set of countries. We do this by drawing on the publicly available province-level dataset supplied by the Ministry of Economics and Competitiveness and by avoiding the elimination of dyads for which there are zero trade flows. Unlike Peri and Requena-Silvente (2010), who focused on the pre-crisis period (1998–2007), when immigration was booming, our analysis concentrates on a mainly negative phase of the business cycle, marked by the burst of the subprime mortgage crisis, the unfolding of the sovereign debt default crisis, and the subsequent recession, which heavily impacted the Spanish economy (e.g., Bentolila et al. 2012). Over this period, Spanish exports grew at an average rate of 4.1%, while emigration and immigration increased at an average rate of 4.7% and of 0.7%, respectively. The underlying patterns have, however, been very different, as illustrated in Fig. 1 for the countries in our sample. Export growth rates (not shown) faced a single substantial drop in 2009 and rapidly recovered. The growth of immigrant stocks slowed down over the entire period, taking negative values from 2011 onwards. Emigrant stocks increased at a stable pace over the considered period, but more strongly during the crisis years.

Fig. 1
figure 1

Source Own elaborations of Spanish National Statistical Institute (Instituto Nacional de Estadística, INE) data

Growth rates of immigrant and emigrant stocks in Spain, 2007–2017. The total immigrant and emigrant stocks on which the growth rates are computed are obtained by aggregating our bilateral immigration and emigration stocks data by year.

Given the particular trends that migration and exports revealed over this crisis period, studying their relationship represents an interesting exercise to evaluate the endurance of migration effects along the business cycle. To the best of our knowledge, this is the first study to perform such an analysis.

The dataset used for the empirical analysis is a balanced panel. Export data are retrieved from the official statistics of the Ministerio de Economia, Industria y Competitividad (MEIC) in Spain. For the sake of illustration, Fig. 2 represents the relationship between exports of the province of Madrid and the distance-weighted GDP of EU partner countries in 2008. The resulting picture is reassuring regarding the choice of the gravity model as an interpretative framework for the exports of Spanish provinces. The slope of the fitted line is 0.94, very close to one, in line with the stylized facts highlighted in the gravity literature and with the remarkable “law-like” behavior of international trade (Head and Mayer 2014). More details about our data sources, variables of interest, and summary statistics for these are provided in Appendix A.

Fig. 2
figure 2

Source Own elaborations of MEIC and INE data

Gravity model and the exports of Madrid to EU countries, 2008. The figure plots the log exports of the province of Madrid to EU countries vs. the log of the ratio between each country’s GDP and its distance to Madrid.

3.1 Econometric strategy

The specification that we employ to estimate the gravity model in Eq. 1 follows Baier and Bergstrand (2007) and Baldwin and Taglioni (2007) and includes a vector of country-year effects, \(\theta _{\mathbf {jt}}\), and a vector of province-year effects, \(\omega _{\mathbf {it}}\). Even by including these fixed effects, preferential ties linking specific dyads could still confound the estimation of the migrant effects. Historical reasons, including colonial history, past migration, and geography and transport infrastructure, may be responsible for tighter trade relationships between specific pairs, but also for larger bilateral migration stocks. In this case, the estimated effects of migration would also capture the role of history and geography (Briant et al. 2014; Burchardi et al. 2018). This limitation affects, for instance, the recent specification by Bratti et al. (2020). In order to address this issue, we thus include a further set of region-country fixed effects, \(\eta _{\mathbf {rj}}\), which capture most of the time-invariant heterogeneity across dyads (Baier and Bergstrand 2007; Baldwin and Taglioni 2007). We cannot include dyadic (i.e., province-country) fixed effects, as the arguments by Baier and Bergstrand (2007) imply, due to the low residual variation in the data. Indeed, province-year, country-year, and province-country fixed effects explain between 90% and 98% of the variation in our dependent variable, depending on the estimator. In order to capture part of the residual pair-level heterogeneity, we further include bilateral distance, \(Dist_{ij}\), between province–country dyads.Footnote 12

On the basis of the previous choices, our identification strategy ultimately draws on two main sources of variation: the cross-sectional variation between provinces in the same region–country pair and the time variation within the region–country pair, which is not explained by country- and province-specific shocks. We are confident that this approach allows us to control for most confounding factors that would pose a threat to the internal validity: most importantly, that the more economically dynamic provinces within a given region are simultaneously the strongest exporters and the strongest attractors of migrants. At the same time, our identification strategy allows us to exploit the time and cross-sectional variation in the data, which is relevant for understanding the phenomenon.

We augment the resulting three-way fixed-effects gravity model by adding our variables of interest: the stock of immigrants from country j living in province i at time t \(({\text {Immi}}_{ijt}\)) and the stock of emigrants from province i living in country j at time t (\({\text {Emi}}_{ijt}\)). We log-transform both variables and add one unit to each of them to address the indeterminacy of the log of zeros. Furthermore, in order to control for possible non-linearities associated with this transformation, we add two dummy variables: “No Immigrant” (\(NI_{ijt}\)) and “No Emigrant” (\(NE_{ijt}\)). Each of the two dummies is equal to one if the immigrant (emigrant) stock from (to) country j to (from) province i in year t is equal to zero, and zero otherwise. Thus, our benchmark econometric model is the following:

$$\begin{aligned} X_{ijt} & = ({{\text {Y}}_{it-1}\times {\text {Y}}_{jt-1}})^{b_1}{\text {Dist}}^{b_2}_{ij}({\text {Immi}}_{ijt-1}+1)^{b_3}({\text {Emi}}_{ijt-1}+1)^{b_4}\nonumber \\&\times {\text {NI}}_{ijt-1}^{b_5}{\text {NE}}_{ijt-1}^{b_6}e^{(\omega _{\mathbf {it}}+\theta _{\mathbf {jt}}+\eta _{\mathbf {rj}}+\varepsilon _{ijt})}, \end{aligned}$$

where, besides the variables that we have already defined, \(\varepsilon _{ijt}\) is a random error term with standard properties.

As the extant literature has highlighted, the choice of the most suitable estimator for Eq. 2 is not trivial. An intertwining set of econometric issues arise, which we address in the following subsections.

3.1.1 Zero trade flows and heteroskedasticity

A non-negligible share of the bilateral export values in our sample—about 7.5%—are zeros. The estimates of a standard OLS log-linear specification would only be based on positive trade values, posing a problem of selection bias. Previous studies have addressed the issue by opting for a Tobit model, with an arbitrary zero or an estimated threshold (Wagner et al. 2002; Herander and Saavedra 2005).

More recently, the Poisson Pseudo-Maximum Likelihood (PPML) estimator, which naturally accommodates zero trade flows, was recommended by Santos-Silva and Tenreyro (2006) as a “workhorse” for gravity models. The PPML estimator is consistent even with over- or under- dispersion (Wooldridge 2002) and when the share of zeros is substantial Santos-Silva and Tenreyro (2011).

PPML, and other estimators with the dependent variable in levels, is also recommended when the error term is heteroskedastic (Santos-Silva and Tenreyro 2006). In this case, log-linearizing the gravity equation to estimate it by OLS introduces a bias (see also Manning and Mullahy 2001; Blackburn 2007). A violation of the homoskedasticity assumption will, in general, lead to the expected value of the log-linearized error term in the log-linear transformation of the gravity model being dependent on the covariates. In other words, the conditional mean of the log of the errors will depend on both their mean and on the higher-order moments of their distribution. With heteroskedasticity, this will be correlated with the covariates, leading to inconsistent OLS estimates.

In partial contrast with these arguments, Head and Mayer (2014) show that relatively common misspecifications of the conditional mean—that is, taking as linear an effect that is actually non-linear—can lead to a severe bias in the PPML estimates due to the higher weight that this estimator places on larger observations. In this case, the more flexible distributional assumptions of the Gamma PML (GPML) are more suitable.

According to Head and Mayer (2014), the choice between a Poisson and a Gamma PML estimator should draw on an analysis of the underlying distribution of the errors. Following this claim, in our application we select a suitable estimator by applying the procedure that Head and Mayer suggest. To the best of our knowledge, this is the first application of this test to the analysis of the migration–trade link. As a first step, we estimate the gravity model by PPML, OLS (applied to the log-linear model), and GPML. In particular, for the OLS estimates, the dependent variable is the natural log of the strictly positive values of exports, \({\text {In}}(X_{ijt})\); for those by PPML, we employ the non-negative export values in levels \(X_{ijt}\); and for those by GPML, we use their strictly positive values \(e^{({\text {In}}(X_{ijt}))}\).

As a second step, we then perform a Park test for heteroskedasticity on the OLS estimates. With heteroskedasticity, we select the estimator via the “MaMu test”, discussed by Head and Mayer (2014), drawing on Santos-Silva and Tenreyro (2006) and Manning and Mullahy (2001). The test focuses on the relationship between the variance and the conditional mean of the residuals obtained by each estimator: \({\text {var}}[X_{ij}|\mathbf {z}_{ij}]= h{\text {E}}[X_{ij}|\mathbf {z}_{ij}]^\lambda\), where \(\mathbf {z}_{ij}\) is the vector of covariates. The test estimates the empirical value of \(\lambda\) by regressing the squared residuals on the fitted values of each model. The distribution of the errors is estimated by OLS when applied to OLS residuals, by PPML when applied to the PPML residuals, and by GPML when applied to the GPML residuals (Manning and Mullahy 2001; Santos-Silva and Tenreyro 2006).Footnote 13

The outcome of the test gets read as in the following. Values of \(\lambda\) close to 2 reflect a constant coefficient of variation, which is compatible with the Gamma distributional assumptions and with a log-normal distribution. The most efficient estimators, in this case, are the homoskedastic OLS on logs—which is the MLE if the homoskedasticity assumption is reasonable—and the Gamma PML.Footnote 14 In this case, according to Weidner and Zylkin (2020), the Gamma PML with three-way fixed effects will also be consistent. If \(\lambda\) is instead closer to 1, generalizing the Poisson distributional assumptions (Manning and Mullahy 2001), the Poisson PML is to be preferred as OLS will be inconsistent due to heteroskedasticity and Gamma PML will suffer from an incidental parameters problem.

As a third and final step, in order to corroborate the choice of the estimator based on the previous test, we run the Ramsey (1969) RESET tests on each estimation method, aiming to detect possible misspecifications in the conditional means. This could, for instance, arise from non-constancy in the covariates (Head and Mayer 2014).

3.1.2 Separation and incidental parameter problems

In our setting, the choice of the estimator that we discussed in the previous subsection is further complicated by the inclusion of three-way fixed effects. Maximum likelihood estimates in count data models—as well as more generally in non-linear models—may actually not exist if there is a problem of “separation” (Santos-Silva and Tenreyro 2010; Correia et al. 2019). In this case, the log likelihood increases monotonically as one or more coefficients tend to infinity. As a result of this, the log-likelihood cannot be maximized for any finite coefficient estimate and estimation algorithms fail to converge or yield incorrect estimates. This happens, for instance, when two regressors are collinear for the subsample of positive values of the dependent variable or, more generally, when the conditional mean is specified in such a way that its image does not include all the points in the support of the dependent variable. The problem is exacerbated by the inclusion of high-dimensional fixed effects. Indeed, the same problem has effectively hindered, until recently, the estimation of three-way gravity models by PPML (Larch et al. 2019). Studying the conditions governing the existence of a variety of generalized linear models, Correia et al. (2019) have recently shown how in the case of Poisson regression and even with high-dimensional fixed effects, the parameters of interest can usually be consistently estimated and problematic observations can be identified and dropped from the sample without affecting the validity of the estimates. Their method can be implemented in Stata via the ppmlhdfe routine (Correia et al. 2020).

A further estimation issue relates to the potential incidental parameters problem affecting three-way PPML and GPML estimates. This issue was recently addressed by (Weidner and Zylkin 2020). As for PPML, the absence of an incidental parameters problem in Poisson regressions with one-way fixed effects is a well-known result (Cameron and Trivedi 2015) that was shown to carry over to two-way models (Fernández-Val and Weidner 2016). However, while the three-way PPML estimator is also generally consistent, it is not unbiased as it suffers from an asymptotic incidental parameter bias of order 1/N. This bias affects the asymptotic confidence intervals in fixed-T panels, causing them to not be correctly centered at the true point estimates; the cluster-robust variance estimates are also downward biased. Weidner and Zylkin (2020) propose an algorithm to correct for this bias, ppml_fe_bias, which we implement here.

Weidner and Zylkin (2020) also study bias and consistency in Gamma PML models with high-dimensional fixed effects. The GPML with one-way fixed effects is free from the incidental parameters problem (Greene 2004), but this result carries over to the three-way GPML only if the conditional variance is correctly specified, that is, if it is proportional to the square of the conditional mean. Otherwise, the Gamma PML suffers from an incidental parameters problem, unlike the Poisson PML, that remains consistent under more general conditions. Weidner and Zylkin (2020) also provide a modified version of the algorithm in Correia et al. (2019) to estimate GPML models with high-dimensional fixed effects. This is contained in the Stata routine gpmlhdfe.

In what follows, we draw on these contributions to implement the selection of the estimator recommended by Head and Mayer (2014).

3.1.3 Endogeneity

Another issue that may affect our estimates is, of course, endogeneity. Even if we include large sets of fixed effects, our estimates may still suffer from the omission of bilateral time-varying variables or reverse causality. It has been argued that the direction of causality runs from migration to trade, as migration is generally driven by factors like family reunifications, wage differentials, and pre-existing co-ethnic communities (Gould 1994; Munshi 2003; Mayer 2004; Jayet and Ukrayinchuk 2007); yet, we cannot rule out the problem a priori.

We address this issue with an instrumental variables approach. To instrument immigration stocks, we resort to a slight modification of the standard shift-share instrument drawn from the labor economics and economic geography literature (see, e.g., Altonji and Card 1991; Card 2001; Ottaviano and Peri 2006). We move from the fact that new immigrants tend to co-locate where their co-nationals have previously settled, as the presence of co-ethnic networks decreases settlement costs and facilitates access to jobs and services. The remarkable path dependency in immigrant settlement ensures that immigrant stocks can typically be very accurately predicted by this instrument. On the other hand, pre-determined shares are arguably unrelated to current unobserved shocks affecting the outcomes of interest. This motivates the popularity of the “immigrant enclave” or “past settlement” instrument in a variety of settings (Jaeger et al. 2018), including in the migration–trade link literature. In this framework, exogenous shocks to the supply of immigrants from country j in year t (the “push” factor of immigration) affect the trade of provinces differently depending on the initial share of immigrants from country j.

We first build up weights for each country–province pair ij by using the share of immigrants from country j residing in province i over the total immigration from country j in a base year. These weights are then multiplied by the overall immigration stocks from country j to Spain in year t to obtain the imputed stocks. The main difference between our instrument and the standard shift-share approach is that by imputing bilateral stocks, we do not aggregate the imputed stocks by province. This is the procedure followed by other studies employing gravity models, such as Peri and Requena-Silvente (2010) and Bratti et al. (2014).

To minimize the risk that the initial shares are correlated to current unobservable shocks affecting trade flows, we construct the shares using the most remote year for which immigrant stocks data are available, i.e., 1991. As commonly argued in the literature (e.g., Hunt and Gauthier-Loiselle 2010; Autor et al. 2013), if these shocks are serially correlated, the exclusion restriction is more likely to hold with more remote shares. Moreover, the 1991 data are census-based and provide better coverage of immigration stocks than more recent intercensal estimates. The downside of using long-lagged shares is that a relatively large number of pairs turn out to have zero weight—whenever no immigrants from country j settled in province i in 1991. This is indeed the case for about 15% of the pairs in our sample.

The availability of data on residential variation (Estadística de Variaciones Residenciales) from 1988Footnote 15 allows us to construct an additional, flow-based, instrument for immigration. In this case, we compute the shares based on the average inflows of foreign-born and foreign nationals by province and country over the 1988–1998 period, i.e., before the immigration boom of the early 2000s took place. The longer time coverage of this additional instrument should better address the issue of zero shares. With this additional instrument, we can flank our baseline just-identified 2SLS with an overidentified model and test for overidentification restrictions in a similar spirit to Briant et al. (2014).

Turning to emigration, we cannot construct a similar instrument due to the lack of historical data on Spanish emigration. Moreover, until 2006, the level of detail in residential variation data is also relatively poor for what concerns transfers to foreign countries. Hence, we propose an original procedure that reverses the logic of the recent works by Basile et al. (2018) and Beine and Coulombe (2018),Footnote 16 and impute bilateral emigration based on aggregate residential inscription cancellations. In particular, we approximate the “push” factors driving emigrants to leave their province of origin with the overall cancellations in province i that involve a move to any foreign country. We proxy the “pull” factors leading emigrants to target a specific destination country with the overall cancellations that involve a move from any Spanish province to country j. In both cases, we assume that bilateral flows have negligible weight over aggregate outflows.

We denote with \(w_{it}\) the share of residential cancellations from province i over the total cancellations at time t and with \(w_{jt}\) the share of residential cancellations directed to country j over the total cancellations at time t. We use the product of \(w_{it}\) and \(w_{jt}\) to reweigh the total stocks of emigrants at time t, \(Emi_t\). From this, we subtract bilateral stocks of emigrants \(Emi_{ijt}\) to avoid perfect multicollinearity with the time-varying fixed effects and to mitigate concerns that the computation of the overall stocks was based on possibly endogenous bilateral stocks. The resulting instrument is \({\text {Emi}}^{\mathrm{imputed}}_{ijt}=w_{it}w_{jt}({\text {Emi}}_{t} - {\text {Emi}}_{ijt})\).

This way of accounting for “push” and “pull” factors is similar to Burchardi et al. (2018). However, data constraints prevent us from integrating any of the “recursive” factors that they use to account for persistent emigrant settlements from a specific province to a specific country over a long period of time. Nonetheless, drawing on the panel structure of our data, our 2SLS estimates include the three sets of province-year, country-year, and region-country effects (along with distance), as in all of our main specifications. Hence, our instrument will account for time-invariant ties between specified region–country dyads.

The popularity of the shift-share instruments has triggered many recent contributions to highlight their shortcomings.Footnote 17 Most relevantly for our application, Jaeger et al. (2018) have noted that when the country-of-origin mix of immigrants is stable over time, the shifters are serially correlated. Hence, the instrument will correlate with its lags and the resulting estimates will conflate the short- and long-run effects of migration. Similar issues affect the other instrument that has been employed in the migration–trade link literature, i.e., the one based on the gravity model of migration employed by Bratti et al. (2020). These limitations also affect our proposed instrument for emigration. To disentangle the short-run from the longer-run adjustment to immigration, Bratti et al. (2020) propose instrumenting current and past migration jointly with the shift-share instrument and its lag. To yield valid estimates the two instruments must differ, hence there must be some innovation over time in the countries of origin.

This is indeed a limitation of our application. In our sample, the composition of immigrants does not provide sufficient variation to allow us to distinguish the consequences of current and past immigration, similarly to what the authors find for recent decades in the US. Despite the drop in immigration rates associated with the crisis, they remain highly serially correlated (correlation coefficients for all lag lengths \(>0.88\)). Indeed, the drop in the overall immigration stocks is mainly due to a decrease in immigrants from Ecuador, Colombia, and Argentina, who nonetheless remain—by far—the most represented among immigrants over the entire period. This implies that our estimates will conflate the short- and long-run adjustment of trade to immigration: that is, immigrant information and enforcement effects, as well as the trade effects of the longer-run adjustment of the local system to migration. As suggested by Bratti et al. (2020), migration may indeed affect export competitiveness via wage and productivity effects.

4 Results

4.1 The pro-export effect of migration

Based on our econometric strategy, Table 1 reports the results of the three estimators of the gravity model, including both immigrants and emigrants and allowing for heterogeneous MRTs at the province level.Footnote 18 Starting with the building blocks of the gravity model, the product of per-capita GDP is collinear with the province-year and country-year fixed effects and is thus omitted.Footnote 19 As for the distance variable, it has the expected negative effects in the OLS and GPML estimates, while its effect is insignificant, conditional on the bilateral region–country effects, according to the Poisson estimates. As previously mentioned, because the region-country effects absorb most of the pair-level variation, including distance or excluding it leaves the results virtually unaffected (see Appendix Table 9).

Coming to our focal migration variables, their point estimates are similar across the different estimators and specifications.

When included separately, both immigrants and emigrants positively and significantly affect provincial exports. On the other hand, the results change when they are included jointly, as expected. With the sole exception of the OLS estimates, a significant and positive effect on provincial Spanish exports is revealed only by emigrants, while the effect of immigrants turns out to be non-significant. Importantly, the positive yet comparatively small correlation between immigration and emigration stocks implies that the coefficient of each of these variables is somewhat overestimated when they are included separately. The non-significant or only mildly significant effects of \({\text {NI}}_{ijt-1}\) and \({\text {NE}}_{ijt-1}\) also reassure us that having added one unit to the migration variables does not alter the results. Across the considered estimators, a 10% increase in the emigration stock towards a certain country is found to increase exports by about 0.9% on average.

This is a first interesting result of our application. The result is original with respect to previous studies at the country level, which have found that the “impact of emigrant networks on exports coexists with a positive and significant impact of immigration on exports” (Hiller 2014, : 698).

As we stressed in Sect. 2.2, allowing for heterogeneous export capacity at the province level is a crucial choice to obtain accurate estimates in the gravity model at stake. In support of this argument, in Appendix B.2 we study whether the assumption of heterogeneous export capacity at the province level yields statistically different implications from less-demanding approaches. Results clearly indicate that ignoring this heterogeneity dramatically overstates both the immigrant and emigrant effects, supporting our theory-based approach.

Table 1 Pro-export effect of immigrants and emigrants: subregionally heterogenous MRTs

4.2 Picking the right estimator

According to Head and Mayer (2014), a scenario like the one presented in Table 1, where the three estimators yield largely similar results, is reassuring in terms of there being no signals of major misspecification. In particular, including or excluding zero trade flows leaves our results virtually unaffected.

Still, as we have mentioned above, the log-linear OLS estimates are inconsistent when there is heteroskedasticity. A standard Park regression of the squared OLS residuals on the covariates (Table 15) confirms this suspicion: the variance of the residuals actually increases with the increase in both the immigrant and emigrant stocks and it decreases with the distance variable; furthermore, the variance is also, on average, smaller for provinces with no immigrants. Hence, with our data, Poisson and Gamma PML estimators should be preferred over OLS. This is confirmed by the “MaMu test” reported in Table 2.

The estimated value of \(\lambda\) in Equation 13 is about 1.7 for OLS, about 1.3 for PPML, and 1.9 for GPML. In the latter case, the confidence intervals for the \(\hat{\lambda }\) include 2. This implies that, provided that the conditional mean is well specified, the high-dimensional fixed-effects Gamma model that we implemented does not suffer from an incidental parameters problem, similarly to the fixed-effects Poisson model Weidner and Zylkin (2020). Moreover, while the \(\hat{\lambda }\) for OLS and PPML are neither precisely 2 nor precisely 1, they seem to satisfy the distributional assumptions of their underlying estimators reasonably well. According to these estimates, the GPML would seem to be the most efficient estimator for our data. However, because the \(\hat{\lambda }\) estimated for the OLS and PML residuals are both significantly below 2, the PPML estimator should be preferred over the OLS (see Head and Mayer 2014). In brief, the results of the MaMu test support the implementation of either the PPML or the GPML.

Table 2 MaMu and RESET tests

In the bottom panel of Table 2, we report the coefficients and p-values associated with a set of Ramsey (1969) RESET tests on each estimation method, again following Santos-Silva and Tenreyro (2006). The null hypothesis of the correct specification of the conditional mean cannot be rejected in the case of the Poisson PML, while it is rejected in the case of the OLS estimates and the Gamma PML.Footnote 20

In conclusion, on the basis of the previous diagnostic test, the Poisson PML estimator emerges as the most suitable one for addressing our focal issue. On this basis, we infer that conditional on emigrant stocks, immigrants into Spanish provinces do not significantly increase the exports of these provinces towards the immigrants’ home countries. On the other hand, we detect a positive and significant effect of emigrants on Spanish provinces’ exports. Conditional on immigration stocks, a 10% increase in the emigration stocks of Spanish provinces increases exports to their countries of expatriation by almost 1% (Table 1). The emigrant effects on exports incorporate both a network effect (information and enforcement) and a preference effect, and their magnitude is comparable to previous estimates of the immigrant effect on imports. These previous estimates are generally larger than those for exports and lead to a corresponding increase of about 1.5% (see the meta-analysis by Genc et al. 2012). With respect to previous studies, our estimates are thus comparable but relatively smaller. Let us remember that this relatively conservative result is found by allowing for differential exporting capacities of provinces. Erroneously ruling out this heterogeneity would lead to much larger estimates of both the immigrant and emigrant effects (Table 10).

The result of an exclusive pro-export effect of emigrants is original and amenable to different interpretations. First of all, along a negative phase of the business cycle like the one we are considering, the opposing dynamics of immigration and emigration could have affected the composition of the respective stocks as well as the relative importance of the network and preference effects. While more refined data would be needed to ascertain the validity of this hypothesis with more accuracy, we argue that in the same period, the demand for home-country products (preference effect) expressed by emigrants could reasonably be the greatest, if not the only, relevant channel through which migrants can affect trade. Moreover, due to the wider set of mechanisms underlying their pro-export impact, the effect of emigrants may be easier to detect statistically. On more structural grounds, the same results can also be related to the level of productivity of the firms based in Spanish provinces. Previous evidence reveals that emigrants are more effective in promoting the trade of low-productivity firms, which have a lower capacity to enter into foreign markets (following Melitz’s selection argument) but a greater chance of overcoming trade barriers, with the help of emigrants, once they have entered (Hiller 2014).

Before taking these results as conclusive, three further steps are required to ensure that our estimates are reliable: i) ensuring that the estimates are robust to the bias-corrected method proposed by Weidner and Zylkin (2020); ii) addressing possible remaining sources of endogeneity; iii) and studying whether there is significant heterogeneity in the migrant effects, which could challenge the underlying assumption of constant elasticity in the Poisson PML model (Head and Mayer 2014). We will address these issues in turn in each of the following subsections.

4.3 Bias-corrected PPML estimates

Weidner and Zylkin (2020) showed that the PPML with three-way fixed effects suffers from a peculiar type of incidental parameters problem: an asymptotic bias affecting confidence intervals and cluster-robust variance estimates. Based on their discussion, the size of the bias may be expected to be relatively small. Still, it is important to empirically appreciate whether it would lead to differences in our results. For this reason, we report the results of the bias-corrected three-way fixed effects PPML estimates in Table 3. The bias-corrected estimates are fully in line with the main results. This is so even with a bias of about 0.006 affecting the estimates of both our variables of interest \({\text {In}}({\text {Immi}}_{ijt-1}+1)\) and \({\text {In}}({\text {Emi}}_{ijt-1}+1)\), inflating the estimated coefficients by 10% and 6% and deflating the estimated standard errors by 8% and 9%, respectively. The insights derived from this approach still indicate a positive and significant effect of emigrants but not of immigrants on exports.Footnote 21 Given the robustness of our results to the bias correction by Weidner and Zylkin (2020), we proceed with the standard uncorrected estimates in what follows.

Table 3 Bias-corrected PPML estimates

4.4 Instrumenting migration stocks

Table 4 reports our 2SLS estimates drawing on the instrumental variables discussed in Sect. 3.1.3.Footnote 22

Columns (1) and (2) report the first-stage regressions. The F-statistics of both first-stage regressions, comfortably above the conventional value of 5, allow us to dismiss concerns about the potential weakness of our baseline instruments. As previously mentioned, this is a common result for instrumental variables such as our Altonij–Card-like instrument for immigration (column 1). Importantly, the F-statistics reported in column 2 are also reassuring in terms of the strength of our less-standard instrument for emigration.

Column (3) reports the second-stage estimates of the just-identified model, where the orthogonality of the instruments must be assumed. Results confirm the results of the Poisson PML and highlight an even larger role for emigrants than the one we identified in the baseline estimates, similarly to Bratti et al. (2014, 2020).Footnote 23 The increase in the coefficient of emigration could be attributed to a measurement error: as discussed in Appendix A, \({\text {Emi}}_{ijt}\) likely underestimates the actual stocks as not all Spanish expats appear in the electoral registries of their host countries. By contrast, the data on residential cancellations may more rapidly capture the movements of Spanish nationals abroad.

In column (4), we report the results of the overidentified model that includes the additional instrument based on the 1988–1998 flows. Again, the F-tests on the first stage strongly reject the null hypotheses of weak instruments. The estimated 2SLS coefficients of immigration and emigration are remarkably similar to the ones in column (4). As for the overidentification test, the Hansen J statistic \(\chi ^2(1)\) is 0.069 (p-value = 0.7927) and prevents us from rejecting the null hypothesis of instrument orthogonality.Footnote 24 These results support the validity of our instruments and suggest that the positive effect of emigrants on trade detected in the previous section is robust to endogeneity concerns.

Table 4 2SLS estimates

4.5 Sources of heterogeneity in the pro-export effect of migration

As the extant literature has largely shown (e.g., Girma and Yu 2002; Wagner et al. 2002), the trade effect of migrants usually interacts with standard trade determinants: first and above all, institutional and language commonality. Because PPML estimates place more weight on larger trade flows, it is important to study how sensitive our results are to heterogeneity along different dimensions. In order to check this, we carry out an additional set of analyses and distinguish trade partners by institutional and language commonality with Spain.Footnote 25

The relative results, reported in Appendix C because of scope constraints, show that consistent with previous literature, the pro-export effect of emigrants is unambiguously driven by emigrants towards extra-EU countries. This actually confirms previous findings supporting the argument that differences in institutional settings increase transaction costs and make the role of migrants more salient in reducing them. On the other hand, we find stronger pro-trade effects of emigrants targeting Spanish- or English-speaking countries. This suggests that in the case of Spain, language commonality rather than similarity promotes trade. Quite interestingly, in our application institutions and language do not represent two sides of the same coin in transnational networking: unlike common institutions, language commonality adds to an emigrant’s ability to promote export. Drawing on the random encounter model proposed by Wagner et al. (2002),Footnote 26 we could think that while trade barriers could be per se lower between countries speaking the same language, language commonality further eases migrants’ access to information about home- or host-country opportunities. In short, language commonality increases the probability that an emigrant has the capacity to facilitate a transaction.

We also study the role of geography and subnational concentration on migrants’ pro-trade effects (Herander and Saavedra 2005; Peri and Requena-Silvente 2010). In order to study the effects of geographic proximity, similarly to Herander and Saavedra (2005) we contrast the effects of immigrant and emigrant networks pertaining to a specific province with those of country-wide networks. Results in Appendix C show that a weakly significant pro-export effect of immigrants emerges for province-level networks only and that emigrant effects appear instead to be driven by both a localized and a country-wide component. This suggests that the exports of a given province i to a country j rely not only on the pro-trade effects of emigrants from those provinces but also on emigrants from provinces other than i who live in j. The networks through which immigrants and emigrants exert their pro-export effect in the case of Spanish provinces appears to be on a different scale: local and non-local, respectively. This represents, to the best of our knowledge, an additional novel result of this study.

As an additional implication of recognizing the role of geography in affecting migrants’ pro-export effects, we allow for these effects to be heterogeneous across provinces with different levels of migration concentration, as suggested by Herander and Saavedra (2005) and Peri and Requena-Silvente (2010).Footnote 27 Consistent with Herander and Saavedra (2005), our results in Appendix C.3 indicate the existence of positive co-ethnic and inter-ethnic spillover effects from immigrants residing in high-concentration provinces, while those in low-concentration provinces do not reveal spillovers of this kind. On the other hand, no significant spillover effects emerge for emigrants, irrespective of the level of concentration. The results obtained by disaggregating provinces with high, medium, and low migration rates (Peri and Requena-Silvente 2010) are compatible with the arguments about the need for a minimum scale of immigration and emigration stocks for pro-trade effects to emerge. The positive effects of concentration are also confirmed when provinces are classified according to the values of a standard location quotient for immigration and emigration.

Finally, in the same Appendix where we develop the previous analyses of this subsection, we propose a methodology that, using an approximation with the data at our disposal, suggests that immigrant and emigrant effects on exports are stronger for provinces that are less specialized in the production of homogeneous intermediate goods. This is consistent with the original arguments by Rauch and Trinidade (2002). Migrant information effects would be more salient in the intermediation of trade transactions that concern differentiated goods and less salient for homogeneous goods, for which prices convey most of the relevant information.

All in all, while some heterogeneity in the effects emerges, the results of our estimators are similar to each other and do not raise substantial concerns that the greater weight given to larger observations by the PPML is biasing the results.

4.6 Migrant effects over time

The dataset we use for our empirical analysis starts close to the eruption of the global financial downturn, usually identified as occurring in late 2008, and extends over the following recession period. The crisis heavily affected Spanish exports, which faced a substantial drop in 2009, and was accompanied by an increase in unemployment rates throughout Spain. It is reasonable to expect that these peculiar dynamics have affected migrant composition, decreasing their incentives to stay in Spain and increasing those to expatriate, leading to a negative selection of the “stayers” and deteriorating their ability to facilitate trade. In Fig. 3, we provide an original way to address this issue. Specifically, we enrich the three-way fixed effects PPML regression with a full set of interaction effects between each of our migration variables and each year in our sample. Appendix Table 16 reports the estimated coefficients.

Fig. 3
figure 3

Time patterns of the migration effects

The coefficients display a clear pattern. In the case of immigrants, the pro-export effect is found to decrease and become insignificantly different from zero from 2010 onward. The pro-trade effect of emigrants is instead increasingly positive over time, and more markedly so since 2012. These results are important in at least two respects. On the one hand, they help reconcile our findings with those of previous studies and with the results by Peri and Requena-Silvente (2010) in particular. On the other hand, and relatedly, they suggest that a change in the composition of Spanish migrants may have occurred over time. This could have followed the stronger reaction to the crisis of more-qualified immigrants and emigrants, who are presumably better able to facilitate the creation of a trade tie.

Clearly, the fact that our panel starts in 2006 limits our ability to analyze the role of the Global Financial Crisis in the migrant pro-export effect. The main hindrance to this end is the lack of emigration data from before 2006. In Appendix B.3, we impute emigration data based on data on residential cancellations and study how migrant effects change across the pre- and post-crisis period. The results confirm that the global financial crisis reduced the immigrant effects and increased the emigrant effects.

In the same Appendix, we also study whether these dynamics could be attributed to asymmetric effects of immigration or return migration. We do not identify positive effects of return migration, but including this variable presumably reduces the noise in the estimates and makes the positive immigration effects gain significance at the 10% level.

5 Conclusions

Although it represents a research issue with a long-standing tradition, the analysis of the pro-trade effects of migrants at the subnational level is still open to amelioration. In particular, recent advances in the gravity model literature offer a battery of new analytical tools with which previous knowledge about the quantity and quality of these effects can be proven and possibly enriched. This is primarily the case for the heterogeneity of regional multilateral resistance terms, in addition to region–country dyadic time-invariant fixed effects and time-varying country-level effects. While the inclusion of time-varying effects for the traders is an obvious implication of the gravity model, it has often been neglected in empirical studies on the migration–trade link that adopt subnational units. As we have argued, multiple motivations urge for the inclusion of these controls. As we have shown, their neglect entails serious distortions in the results. As we have also shown, additional important refinements can be obtained by taking stock of computational advances at the frontier of gravity model estimation via Pseudo Maximum Likelihood (PML) estimators with multi-way fixed effects. Through our diagnostic tests, implemented for the first time to the analysis of the migration–trade link and with panel data, we compared OLS, GPML, and PPML estimators. The Poisson PML (PPML) estimator emerged as the most suitable in the analysis of the pro-export effect of migrants for Spanish provinces. The magnitudes of the PPML, the Gamma PML (GPML), and the OLS estimates are comparable to each other, suggesting that the model is well specified and not substantially affected by the zeroes in the dependent variable. On the other hand, the OLS estimator was discarded on the grounds of heteroskedasticity, with important implications for the obtained results.

By exclusively relying on the OLS estimates, we would have concluded that immigrants exert a significant pro-trade effect along with emigrants, consistent with previous studies. Instead, neither of our PML models supports such an inference in the considered timeframe. Rather, our estimates indicate and robustly confirm a positive effect of emigrants only on the exports of Spanish provinces. This result is robust to an instrumental variables approach and to the bias correction proposed by Weidner and Zylkin (2020).

The magnitude of the effect identified implies that a 10% increase in the emigrant population from a given province to a certain country would increase its exports towards it by almost 1%. Disregarding the heterogeneous exporting capacity of provinces would have led to an undue overestimation of both immigrant and emigrant effects. When compared to the few previous studies that include both immigrants and emigrants in export effects, the stronger and more robust effect of emigrants represents an interesting result. These results could be interpreted in light of the peculiar phase of the business cycle and of the impact of the global financial downturn. These have apparently eroded immigrants’ abilities to promote trade. Omitting the outward side of migration would have made this distinction impossible and may have wrongly attributed to immigrants a role that is actually played by emigrants.

Importantly, the insignificant immigrant effects bear implications about the mechanism underlying the pro-trade effects. Indeed, the emigrant effect could be either an information, an enforcement, or a preference effect. Considering that no robust evidence could be found in support of immigrant effects, which are exclusively driven by the first two mechanisms, one may argue that the effects that we identify are due to a preference effect only. While this may be the primary mechanism driving emigrant effects, the results of our heterogeneity analysis still support immigrant effects compatible with the information effect: that is, immigration effects were found in high-concentration provinces and in provinces less specialized in intermediate goods. Accordingly, we are inclined to attribute the declining role of immigration to a deterioration in the trade opportunities that are amenable to the mediation of immigrants, as well as to a change in immigrant composition. Further research may seek to confirm this interpretation during more positive phases of the cycle and, more generally, may further investigate the relationship between the business cycle and the migration–trade link.

Original results also emerge when the new methodological setting that we have proposed is applied to investigate the nuances of the trade–migration link. Consistent with previous literature, the effect of emigration is stronger in the trade linkages that provinces establish with more institutionally distant countries, i.e., with non-EU countries in our setting. Institutional distance actually represents a transaction cost, which the business networks created by emigrants can contribute to alleviating. An opposite result emerges with respect to language commonality, as Spanish emigrants to Spanish- and English-speaking countries have a magnifying, rather than reduced, pro-export effect. This result suggests that language skills also directly affect the individual ability of migrants to promote exports. Hence, language commonality could be a leverage for better detecting and exploiting trade opportunities. In other words, among the potential trade opportunities that a migrant can facilitate, some could be lost due to language differences. This would have the effect of reducing the emigrants’ capacity to promote trade (cfr. the random encounter model in Wagner et al. 2002).

Additional insights from our application concern the novel contribution regarding the effect of networks of expatriates on exports via local and nation-wide networks. Results show that both matter. However, the existence of a network of Spanish expatriates in the same country is more important than their provinces of origin, with a very large elasticity. This likely indicates that an increase in emigrant stocks triggers a demand effect that not only promotes trade from the province of origin but also from Spain as a whole. The intertwining of local and non-local effects of emigration on trade also represents a newly emergent piece of evidence on which future research should concentrate.