(Optimal) spatial aggregation in the determinants of industrial location


Empirical studies on the determinants of industrial location typically use variables measured at the available administrative level (municipalities, counties etc.). However, this amounts to assuming that the effects that these determinants may have on the location process do not extend beyond the geographical limits of the selected site. We address the validity of this assumption by comparing results from standard count data models with those obtained by calculating the geographical scope of the spatially varying explanatory variables using a wide range of distances and alternative spatial autocorrelation measures. Our results reject the usual practice of using administrative records as covariates without making some kind of spatial correction.
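
The idea of letting a covariate's geographical scope extend beyond the municipality can be illustrated with a minimal sketch. It assumes planar coordinates in kilometres and an unweighted average over all sites within a given radius (own site included); the function name `spatial_lag`, the coordinates and the `jobs` values are hypothetical and are not taken from the paper, whose actual weighting scheme may differ.

```python
import math

def spatial_lag(coords, values, radius_km):
    """Average each site's covariate over all sites within radius_km.

    coords: list of (x, y) positions in km (planar, hypothetical).
    values: covariate measured at each site.
    The site itself is always included, so radius 0 returns the
    original administrative values.
    """
    lagged = []
    for xi, yi in coords:
        inside = [
            v for (xj, yj), v in zip(coords, values)
            if math.hypot(xi - xj, yi - yj) <= radius_km
        ]
        lagged.append(sum(inside) / len(inside))
    return lagged

# Three hypothetical municipalities on a line, 10 km apart.
coords = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)]
jobs = [100.0, 40.0, 10.0]

# Recompute the covariate for a range of candidate distances.
for radius in (0, 10, 20):
    print(radius, spatial_lag(coords, jobs, radius))
```

Re-estimating the count data model for each radius and comparing the fit is one way to choose among distances; the unweighted average used here is only one of several possible neighbourhood criteria.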



  1.

    This can be seen as a particular case of the so-called modifiable areal unit problem (MAUP) originally described by Openshaw and Taylor (1979).

  2.

    Catalonia is an autonomous region of Spain that has about 7 million inhabitants (15% of the Spanish population), covers an area of 31,895 km², and contributes 19% of Spanish GDP. The capital of Catalonia is the city of Barcelona. Counties in Catalonia are known as comarques.

  3.

    In addition to the different strands of empirical industrial location literature, some related studies have investigated the MAUP (Openshaw and Taylor 1979). However, these studies were generally not concerned with the determinants of industrial location (a recent exception is Pablo-Martí and Muñoz-Yebra 2009) but with issues such as the spatial distribution of new concerns (Duranton and Overman 2005, 2008) and the estimation of wage and gravity equations (Briant et al. 2010).

  4.

    See also Jofre-Monseny (2009) for a recent application to the same Spanish region that is investigated here.

  5.

    As is common in the industrial location literature, our empirical strategy implicitly assumes that the administrative unit to which the variables refer is indeed the spatial unit that agents use when making location decisions. Since we use municipality data, we believe that this is a plausible assumption. One may still argue that it does not hold for large municipalities and metropolitan areas, so we performed two robustness tests: dropping from our data set municipalities with more than 250,000 people (in our case, the city of Barcelona), and dropping those that are part of a metropolitan area (around the cities of Barcelona, Girona, Lleida, Manresa and Tarragona). Results barely changed in the first case. In the second, dropping the metropolitan areas yielded results that differ from those reported below in terms of preferred specification and neighbourhood criterion (though not much in terms of the value and significance of the marginal effects). This may be interpreted as evidence that the location processes in metropolitan and non-metropolitan areas are different. However, for the sake of simplicity we do not explore this possibility here but leave it for future research.

  6.

    We did not consider specification 3.B, i.e. one in which we would add (rather than replace the original variables by) the spatially lagged variables calculated as in specification 3.A, because the high correlation between the original variables and these spatially lagged variables (around 0.95 for 6 of the 18 variables) resulted in severe multicollinearity.

  7.

    We use residential population as the only explanatory variable in the inflated part of the zero-inflated Poisson model (ZIPM) and the zero-inflated negative binomial model (ZINBM). The coefficient associated with this variable was negative and statistically significant in all our specifications.

  8.

    Note that, although we have experimented with alternative sets of explanatory variables (e.g. we have dropped some of the variables related to the agglomeration economies, knowledge and commuting) and computed the GoF tests using different numbers of cells (see Manjón-Antolín 2009 for details on the computation of this test), these general trends remain largely unaffected.

  9.

    Although some variables were not statistically significant individually, the Wald test for their joint significance was generally well above standard critical values (results available on request). See Table 3 for an illustrative example of this general trend.
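
The role of the inflated part described in note 7 can be illustrated with a minimal sketch of a zero-inflated Poisson probability function. It assumes a logit link for the probability of a structural zero and measures population in thousands; the function names, the scaling and the coefficient values (`gamma0`, `gamma1`) are hypothetical and chosen only so that the population coefficient is negative, as reported in the note. The negative binomial variant is not covered.

```python
import math

def zip_pmf(y, lam, pi_zero):
    """P(Y = y) for a zero-inflated Poisson with count-part mean lam."""
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    if y == 0:
        # Zeros arise either structurally or from the Poisson part.
        return pi_zero + (1.0 - pi_zero) * poisson
    return (1.0 - pi_zero) * poisson

def inflation_prob(pop_thousands, gamma0, gamma1):
    """Logit probability of a structural zero (assumed link function)."""
    z = gamma0 + gamma1 * pop_thousands
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients; gamma1 < 0 mirrors the sign reported in note 7.
gamma0, gamma1 = 1.0, -0.5
for pop in (0.5, 5.0, 50.0):
    pi0 = inflation_prob(pop, gamma0, gamma1)
    print(pop, round(pi0, 3), round(zip_pmf(0, 2.0, pi0), 3))
```

With a negative coefficient, larger municipalities are less likely to be structural zeros, so the excess probability of observing no new establishments shrinks as population grows.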


Additional information

This research was partially funded by SEJ2007-64605/ECON, SEJ2007-65086/ECON, the “Xarxa de Referència d’R + D + I en Economia i Polítiques Públiques” of the Catalan Government and the PGIR program N-2008PGIR/05 of the Rovira i Virgili University (funded by both the Catalan and Spanish Governments). This paper has benefited from discussions with Á. Alañón, D. Liviano, F. Pablo and E. Viladecans. We would also like to acknowledge the helpful and supportive comments from seminar participants at the EEFS 2009 Conference (University of Warsaw), the Workshop on “Entrepreneurial Activity and Regional Competitiveness” (Max Planck Institute of Economics & ORKESTRA-Basque Institute of Competitiveness), the 3rd Central European Conference in Regional Science (Technical University of Košice) and the RSAI British & Irish Section 2009 Annual Conference (Limerick). Any errors are, of course, our own.

Keywords

  • Industrial location
  • Count data models
  • Spatial statistics

JEL Classifications

  • C25
  • C52
  • R11
  • R30