“Explaining the growth and change of regions and cities is one of the great challenges for social science. Cities or regions, like any other geographical scale of the economic system, have complex economic development processes that are shaped by an almost infinite range of forces. There is a thorny question as to what social science should aim to do in the face of such complexity”. (Storper 2011, p. 333)

1 Introduction

In this paper, we discuss a fundamental challenge of one particular form of explaining growth and change of regions and cities: the explanation of regional growth by structural factors in regional growth models. Regional growth models test whether and to what extent selected variables predict regional growth on average. This paper shifts the analysis from smoothing regional growth around means towards what is usually treated as “noise” and “random disturbance”, i.e. the residuals that remain unexplained in regional growth regressions. This appears important as these residuals are “[s]tubbornly high—and often growing” (Rodríguez-Pose 2013, p. 1036), meaning the predictive power of regional growth models is decreasing.

We address a fundamental issue of regional growth that surfaces in the introductory quote from Storper (2011). Regions may develop systematic deviations from average growth as a result of the interplay between “an almost infinite range of forces”. Knowledge bases, networks, institutions, industries, and infrastructure co-evolve in regions in a path-dependent manner. The interplay between these many factors leads to emerging qualities where the outcomes cannot be predicted, but are still persistent over time. Hence, these region-specific growth deviations are to be theoretically expected. This brings about an important task of distinguishing between the principal regularities in urban and regional growth and the events and processes that are not temporally or geographically regular but that affect pathways of development in durable ways (Storper 2011).

This resonates with evolutionary and institutional economic geography, where path-dependent processes may lead to a wide variety of regional trajectories (Boschma 2004). As such, it deviates from the economic growth literature’s typical theoretical starting point of general equilibrium. For a review of the latter literature, see Breinlich et al. (2014a, b), who also provide an excellent account of methods to improve the causal interpretation of specific structural factors on growth.

In contrast to much of the economic growth literature, this paper’s main concern is not the causal interpretation of specific factors, but the overall growth patterns of regions. General structural factors, such as industry mix, human capital, or population size, partly explain regional growth (Sect. 2.1). Yet, region-specific growth is important besides and beyond such general structural factors. We elaborate why regional and extra-regional conditions may explain region-specific growth (Sect. 2.2). The main purpose is to develop a methodology for detecting systematic regional growth deviations after considering structural factors and regional preconditions (Sect. 3). We do this by closely investigating the patterns of residuals in regional growth regressions. If region-specific growth indeed plays an important role, residuals should not be randomly distributed but show systematic deviations. Finally, we provide an empirical illustration (Sect. 4). We assess unexplained regional growth deviations using data on employment growth across Swedish local labour markets in 1990–2016. We find that the residuals are large and often systematic, which challenges current thinking as it calls for (i) including additional important variables, such as institutions (e.g. Rodríguez-Pose 2020), (ii) improving econometric models, or (iii) acknowledging the possibility of region-specific growth paths caused by the interplay of multiple regional and extra-regional factors.

2 Regional growth models: what they explain and what remains unexplained

This section starts with a short review of regional growth models. We discuss what such models explain and what remains unexplained. We identify factors that are often not included in regional growth models and discuss the complex interplay of many regional and extra-regional forces that may cause region-specific growth trajectories.

2.1 A short review of regional growth models

Traditional models of regional growth started from a Solow–Swan framework (Solow 1956), seeing growth mainly as a function of the accumulation of capital and the increasing productivity of labour. These models were later extended with a broader conception of capital to include human, social, and other types of capital. The introduction of endogenous growth models (Romer 1986) in the 1980s represented an important breakthrough, explicitly incorporating the role of R&D and innovation as key drivers of growth. However, growth models have continued to leave a lot unexplained, and contemporary literature aims to identify more intangible social factors that can fill this gap. The most popular sets of explanations in the regional literature revolve around the role of institutions and social structures, and factors such as the evolution of regional industry structures and the opportunities and barriers it creates for knowledge spillovers and new industry creation.

Accordingly, recent literature on regional growth models exhibits a high degree of variation in the dependent and independent variables used, as well as in its modelling approaches (Table 4 in Appendix provides various examples from recent papers in this field). This is a large area of research, and hence, this short review is by no means exhaustive. Rather, it provides an overview against which we develop the remainder of the paper.

To start with, the concept of growth in regional growth models typically refers to economic growth, measured predominantly by the gross regional product, employment, or productivity. These measures emphasise different aspects of regional growth, where the gross regional product results from changes in employment and productivity. A stable, or even increasing, gross regional product may result from a decline in employment coupled with an increase in productivity (the so-called jobless growth). Such interplay between employment and productivity occurs, for instance, when manufacturing firms automate the production process, which tends to increase employment initially but leads to a reduction in employment at a later stage (Bessen 2020).
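As a worked illustration of this relationship, assume that productivity is measured as gross regional product per employed person (a simplifying assumption made here for illustration; \({\text{GRP}}_{rt}\) and \({\text{prod}}_{rt}\) are notation introduced only for this example). The level identity then translates into additive log growth rates:

$${\text{GRP}}_{rt} = {\text{emp}}_{rt} \times {\text{prod}}_{rt} \quad \Rightarrow \quad \Delta \ln {\text{GRP}}_{rt} = \Delta \ln {\text{emp}}_{rt} + \Delta \ln {\text{prod}}_{rt}$$

For example, if employment falls by 2 log points while productivity rises by 3 log points, gross regional product still grows by about 1 log point, which is the jobless growth pattern described above.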

Secondly, models differ considerably in the explanatory and control variables included. Obviously, the choice of explanatory and control variables relates to the empirical context as well as the specific research question. The traditional factors in growth models are capital and physical infrastructure (e.g. road infrastructure or broadband access), labour endowments, and R&D, reflecting the classical Solow–Swan and endogenous growth models. Recent developments have added institutional factors, as well as evolutionary processes.

Consequently, contemporary research foregrounds two sets of structural preconditions, which shape knowledge spillovers and the evolution of industries, and hence drive regional growth in developed countries: firstly, the clustering of economic activities and the underlying regional industrial mixes and agglomeration effects, which are important both as measures of capital in the classical models and as indicators of the potential for knowledge spillovers and industry branching in the evolutionary tradition. Secondly, regional competitiveness factors reflecting the spatial patterns of innovation activities (Cheshire and Malecki 2004; Crescenzi et al. 2016; Giannakis and Bruggeman 2017; Harris 2011; Iammarino et al. 2018; Storper 2011), following from the endogenous growth model tradition’s emphasis on R&D and human capital.

As regards the first set of factors, the literature highlights the role of knowledge spillovers between firms and industries as a key mechanism through which the clustering of economic activities affects growth. There is a consensus that the dynamism of large cities makes them motors of economic growth (Duranton and Puga 2001; Fujita et al. 1999). Urban agglomeration is also considered to lead to greater innovation (Iammarino 2005) and to lower barriers and costs of knowledge sharing and transmission across individual and firm networks (Storper and Venables 2004). With respect to regional industrial mixes, Glaeser et al. (1992) gave rise to a lively debate—referred to as ‘MAR vs. Jacobs’—on the impact of specialisation and diversification on economic growth. MAR refers to theories of Marshall, Arrow, and Romer, who suggested that knowledge spillovers take place predominantly between similar economic activities, giving rise to localisation economies. In contrast, Jacobs (1969) claimed that industrial diversity enhances the cross-fertilisation of ideas from different sectors. A more recent position is that diversity in cognitively similar industries (related variety) is the strongest stimulant of regional growth as such diversity provides the most fertile soil for inter-industry knowledge spillovers (Frenken et al. 2007). Contemporary models of regional growth thus often include three variables that account for the regional industry mix: specialisation, diversification, and related variety. In addition, population size and density are often included to account for general agglomeration and urbanisation effects.

The second subset of structural factors relates to the determinants of regional competitiveness, primarily human capital, research and development (R&D), and innovation efforts. The accumulation of human capital and the allocation of resources to R&D are long-term structural characteristics of the regional economy, which adjust slowly over time and shape local growth trajectories (Blažek and Kadlec 2019). Both factors shape the capability of the local economy to generate new knowledge and to receive and exploit knowledge from the outside world (Crescenzi and Rodríguez-Pose 2011; Faggian and McCann 2009; Gennaioli et al. 2013). The absorption and generation of new knowledge and its translation into new products and processes are key drivers of regional economic performance. The innovativeness and human capital intensity of the regional economy also facilitate regional connectivity with the national and global economy. Regions investing more in innovation and human capital attract the most sophisticated functions of multinational firms, enabling the regional economy to enter the most advanced stages of global value chains (Crescenzi et al. 2014). R&D intensity and the share of the population with higher education degrees are the most commonly used indicators to account for this.

Thirdly, it is important to clarify the concept of a region. The empirical implementation of regional growth models is in most cases limited to administrative borders due to data availability. In the European context, the statistical NUTS units are often used. These relate to administrative borders but often combine several municipalities and sometimes counties to create units with similar population size. Studies including regions from numerous countries often use more aggregate (e.g. NUTS1 or NUTS2) territories due to data availability and comparability (see Table 4). One problem is that administrative (or NUTS) regions often do not correspond to functional regions, the level at which the mechanisms that drive regional growth play out. Few countries provide the required data for functional regions such as labour markets, but where such data are available, functional regions are the preferred delineation for empirical regional growth models (Boschma 2004).

Finally, to conclude this short review, Table 4 also illustrates the variety of modelling approaches used in regional growth models. Breinlich et al. (2014a, b) provide an excellent review of different modelling approaches, shortcomings, and methods for improving causal interpretation.

For the purposes of this paper, it is important to note that regional growth models typically estimate growth trajectories for an average region, thereby identifying how certain factors are associated with regional growth on average. Deviations from such average trajectories are considered as noise or random shocks (particularly in the spatial economics tradition), and therefore uninteresting for explaining regional growth. From our perspective, these deviations are of interest and point to unexplained mechanisms, the black box of regional growth. Some regions do grow above or below average in certain time periods not because of chance but because of their unique combinations of conditions and relations both at the regional and extra-regional scale (compare Capello 2009; Capello and Nijkamp 2011; Storper 2011). Such unique combinations might be sources of region-specific growth trajectories that the standard growth modelling approaches are unable to capture.

2.2 Region-specific growth

The idea of region-specific growth suggests that there may be systematic deviations from the estimations of regional growth models. These deviations may partly be explained by omitted variables, which are often difficult to measure (cf Rodríguez-Pose 2013). This is, for instance, the case for institutions and social capital, which can only be observed through imprecise proxies. Undoubtedly, such factors account for part of the unexplained deviations. Yet, it may also be the case that the unique combination of conditions in specific regions at specific times enables growth trajectories that systematically deviate from the average (Sayer 2000; Storper 2011). In addition, recent contributions attribute to human agency the potential to create new development paths, which deviate from the expected prolongation of the past (Garud et al. 2010; Grillitsch and Sotarauta 2019; Isaksen et al. 2019).

In this section, we review conditions, which are often inadequately captured in regional growth models, and discuss how the combination and activation of these conditions may lead to region-specific growth paths. Boschma (2004, p. 1008) argues that “a region moves along a specific development trajectory that affects (as an incentive and selection structure) the kind of competences that are most developed and reproduced, and how the institutional set-up co-evolves, and influences the way production, learning and innovation take place. Consequently, there exists a wide diversity of regional trajectories”. Resonating with this statement, this review discusses regional knowledge bases, institutional architectures, extra-regional relations, and human agency as potential causes of region-specific growth paths.

First, knowledge bases vary significantly between places due to industrial, educational, and research specialisations, which combine sticky local knowledge with global knowledge in a unique manner (Asheim and Isaksen 2002). While it is obvious that skills and competences are developed in and drawn to regions in response to existing specialisations, it took Polanyi (1958) to clearly express why this knowledge remains sticky. Important parts of knowledge are embodied, impossible to codify, and therefore hard to transfer over distance. This type of knowledge is acquired through interaction and practice, leading to the localised learning thesis (Maskell and Malmberg 1999), according to which interactive learning is powered through social networks at the local scale (Breschi and Lissoni 2009; Kemeny et al. 2016) as well as shared institutions (Gertler 1995). This has the potential to create unique regional knowledge bases.

Second, institutions, which shape interactions between individuals and organisations within and across regions, are difficult to operationalise and measure (Rodríguez-Pose 2013, 2020). Institutions are relevant for national competitiveness and innovativeness (Hall and Soskice 2001; Nelson 1993; Vitols 2001) and frame the emergence of regional innovation systems (Asheim and Coenen 2005; Asheim and Gertler 2005). Regional policy mixes and rationales, regional investments in systems of vocational training, R&D, and innovation and technology transfer can explain the competitiveness and innovativeness of regions (Blažek and Kadlec 2019; Cooke and Morgan 1994; Morgan 2016). Furthermore, regional interactions and social networks facilitate the emergence of informal institutions and conventions (Malecki 2011; Saxenian 1994; Storper 1995) that may underpin region-specific innovative milieus (Camagni 1991; Crevoisier 2004; Maillat 1998).

Moreover, the multi-scalar architecture of institutions implies that a large number of institutions intersect in specific territories (Gertler 2010; Grillitsch 2015; Hassink 2010). In combination, local cultures, national laws, international regulations, conventions of specific industries and professions, among others, shape the institutional architecture of regions. Considering further the practically countless combinations of these institutions, and that their effect rests on complementarities and contradictions between institutions (Amable et al. 2005; Höpner 2005), it becomes clear why institutional architectures can cause region-specific growth trajectories.

Third, regions are fundamentally open systems subject to inflows and outflows of people and firms. They rely on connections to other regions to bring new knowledge into the system (Bathelt et al. 2004; Fitjar and Rodríguez-Pose 2011; Trippl et al. 2018). Migrants bring human capital, as well as different perspectives and international personal and professional networks, which allow regions to access diverse knowledge (Faggian and McCann 2009; Kemeny 2017; Meili and Shearmur 2019; Saxenian 2007; Solheim and Fitjar 2018; Williams et al. 2004). Multinational enterprises (MNEs) bring investments and competence, and their location decisions can have fundamental implications for regional development (Cantwell and Iammarino 2003; Dunning 1998; Phelps and Fuller 2000). Yet, there have been warnings that reliance on MNEs may turn regions into branch plant economies that struggle to create sustainable competitive advantage (Cumbers 2000).

Perspectives on global cities and the world city network (Beaverstock et al. 2000; Taylor 2001) note that a region’s position within global networks is a key determinant of regional growth. Hence, the accessibility to and/or the number of external connections of the region may not fully account for its potential to access knowledge from outside. It also matters which other regions it can connect to, and how they in turn are connected to other regions, as well as which position it has in the urban hierarchy (Shearmur and Doloreux 2015). As each region has a unique position in this network, it is in effect an idiosyncratic factor.

Furthermore, production is increasingly characterised by an international division of labour. Companies in different regions and countries perform separate functions, creating global value chains. These are governed in various ways, with implications for co-ordination across companies and the distribution of power (Gereffi et al. 2005; Humphrey and Schmitz 2002). Within these value chains, multinational enterprises have established global production networks, with subsidiaries and independent local suppliers performing different functions in the production process. These hierarchical networks distribute knowledge and power between headquarters and local suppliers and are to a varying extent territorially embedded (Ernst and Kim 2002; Henderson et al. 2002).

Regional growth often involves regional industries upgrading their positions within these global value chains, i.e. moving from lower to higher-value activities within the value chains (Gereffi 2014; Giuliani et al. 2005). The opportunities for upgrading are shaped by regions’ current positions in the value chains (MacKinnon 2012; Pietrobelli and Rabellotti 2011), the variegated modes of governance in global production networks (Blažek 2015), and the extent to which local firms can benefit from knowledge spillovers (Crescenzi et al. 2015). The way regions connect to other regions in specific periods, and which conditions these connections provide for regional development, is thus rooted in industrial histories of specific territories, thereby being a potential source for region-specific growth paths.

Yet, such regional trajectories also depend on life cycles of industries (Audretsch and Feldman 1996; Klepper 1997). When a new industry emerges, the windows of locational opportunity are relatively open, as new institutional structures are needed. Regions that succeed in attracting these industries can shift their positions radically (Boschma 1997; Storper and Walker 1989). Over time, the industry consolidates, making it much more difficult for new regions to develop competitive advantage. In the more mature phase, the potential for innovation declines, competition becomes more cost-based and production becomes more dispersed (Audretsch and Feldman 1996).

Finally, there has been a growing literature on human agency in regional development. This literature brings forward a complementary argument: regions may not only develop differently because of unique combinations of various conditions and relations at different scales (as we argued above), but also because of the emerging character of regional development shaped by the intended and unintended consequences of decisions, strategies, and actions of various actors and actor groups (Dawley 2014; Garud and Karnøe 2003; Simmie 2012). More specifically, this literature foregrounds human agency as a fundamental mechanism for change. Human change agency can take different forms, involving innovative entrepreneurship, institutional entrepreneurship, and place-based leadership (Grillitsch and Sotarauta 2019; MacKinnon et al. 2019) and be performed by firms as well as actors from the support system for innovation and entrepreneurship (Isaksen et al. 2019; Isaksen et al. 2018).

In sum, region-specific growth can arise due to the particular combination of regional and extra-regional conditions and relations in concrete territories as well as due to the emerging nature of regional development in which human agency plays a role.

3 Methodology: how to detect systematic regional growth deviations

We propose a systematic framework for identifying relevant cases for in-depth investigation of region-specific growth mechanisms. We do so by rejecting the notion of spatial economics about the ‘noisy’ nature of regional deviations from average growth paths and embracing an economic geography perspective that these deviations indicate temporally and geographically bounded processes that set regions on idiosyncratic paths of development. Large deviations are interesting as extreme cases for qualitative research (Eisenhardt and Graebner 2007) or for identifying potential improvements in the empirical model (Lieberman 2005). That is, we propose that there is a ‘signal’ in the ‘noise’.

In practice, this implies that more attention should be paid to deviations from the mean—that is, the residuals in growth regressions. In this section, we outline (in general terms) a methodology that employs such residuals to detect regions that over certain periods deviate systematically from the trajectories predicted by growth regressions.

3.1 From structural preconditions to growth regression

In Sect. 2.1, we identified two groups of structural factors that explain the expansion and development of regional economies: (a) the clustering of economic activities and the underlying regional industrial mixes and agglomeration effects and (b) regional competitiveness factors underpinned by spatial patterns of innovation activities. The next step is to develop an empirical model that estimates regional growth with these structural factors as predictors. The approach described below is not specific to this set of independent variables but can be adjusted to any model that captures the effects of observable structural factors on regional growth. We do not include the factors discussed in Sect. 2.2 in this model, as we see these as part of the unique conditions that shape region-specific growth, whose effects will differ in each individual case.

Let us observe a set of n regions \({\text{REG}}^{n} = \left\{ {r_{1} ,r_{2} , \ldots ,r_{n} } \right\}\) over a period of m years \({\text{T}}^{m} = \left\{ {t_{1} ,t_{2} , \ldots ,t_{m} } \right\}\). Since our primary goal is not the causal analysis of structural factors, but rather arriving at the best possible prediction of regional growth, we can specify the following fixed effects panel growth model:

$$Y_{rt}^{t + k} = \beta_{0} + {\text{AGGL}}_{rt} \beta_{1} + {\text{COMPET}}_{rt} \beta_{2} + \theta_{t} + \delta_{r} + \varepsilon_{rt}$$
(1)

where \(Y_{rt}^{t + k}\) represents growth in region \(r\) \(\left( {r \in {\text{REG}}^{n} } \right)\) over k years between t and \(t + k\) \(\left( {t \in {\text{T}}^{{m} - {k}} } \right)\) (Footnote 1). \({\text{AGGL}}_{rt}\) and \({\text{COMPET}}_{rt}\) are matrices containing variables describing agglomeration and competitiveness factors, respectively (Footnote 2). \(\theta_{t}\) represents unobserved time-specific shocks that are uniform across all regions, such as national or global shocks.

The part of regional growth that cannot be explained by structural variables or time effects is represented by (a) regional fixed effects \(\left( {\delta_{r} } \right)\) capturing time-invariant unobservable regional characteristics that remain constant over the period \({\text{T}}^{{m} - {k}}\) and (b) the error term \(\left( {\varepsilon_{rt} } \right)\). The error term represents a time-specific unexplained growth component, our “black box”, which captures the variance in regional growth that cannot be predicted with the included (and accessible) variables.

A k-year period panel model is preferred over a model capturing year-to-year variation in the data for two reasons. First, regional structural preconditions change rather slowly, implying a relatively low year-by-year variation within regions (Firgo and Mayerhofer 2017). Second, year-to-year models only identify short-run associations between structural factors and regional growth, leaving out long-run effects. As changes in structural conditions often take time to translate into growth, it makes more sense to employ an interval model rather than a year-to-year model.

3.2 From growth regression to systematic growth deviations

One way to identify systematic growth deviations is to look at the fixed effects \(\widehat{{\delta_{r} }}\) for each \(r \in {\text{REG}}^{n}\) estimated with the model specified in Eq. (1). However, there are two issues with such an identification strategy. The first one is purely statistical: most often researchers operate with short panels, where the number of cross-sectional units (regions) is larger than the number of time periods (years). In such empirical situations, estimates of fixed effects are inconsistent and highly sensitive to the inclusion of time-varying explanatory variables (Wooldridge 2002). The second issue stems from the fact that regional fixed effects are, by definition, time-invariant. Thus, even when estimated consistently, they do not allow us to identify temporally bounded deviations of regions from average growth paths. As we are directly interested in the latter, estimated fixed effects are not the best tool for identifying outlier regions.

Instead, we identify systematic regional growth deviations using the error term. Estimating the model specified in Eq. (1), we obtain parameter values \(\hat{\beta }_{i} ,\;\hat{\delta }_{r} ,\;\) and \(\hat{\theta }_{t}\). We use these to derive point estimates for regional growth \(\hat{Y}_{rt}^{t + k}\) in each region and year and, subsequently, error terms \(\hat{\varepsilon }_{rt} = Y_{rt}^{t + k} - \hat{Y}_{rt}^{t + k}\). As a result, we obtain an \(n \times \left( {m - k} \right)\) matrix of error terms:

$$E_{n}^{m - k} = \left( {\begin{array}{*{20}l} {\hat{\varepsilon }_{11} } \hfill & {\hat{\varepsilon }_{12} } \hfill & \cdots \hfill & {\hat{\varepsilon }_{1m - k} } \hfill \\ {\hat{\varepsilon }_{21} } \hfill & {\hat{\varepsilon }_{22} } \hfill & \cdots \hfill & {\hat{\varepsilon }_{2m - k} } \hfill \\ {} \hfill & {} \hfill & \ddots \hfill & {} \hfill \\ {\hat{\varepsilon }_{n1} } \hfill & {\hat{\varepsilon }_{n2} } \hfill & \cdots \hfill & {\hat{\varepsilon }_{nm - k} } \hfill \\ \end{array} } \right)$$
(2)

For each of n regions and m − k periods, the elements of \(E_{n}^{m - k}\) represent the unexplained growth component after accounting for structural preconditions, as well as region and time fixed effects. Values of \(\hat{\varepsilon }_{rt}\) above zero indicate that the model underestimates regional growth. In other words, a region performs better than its structural preconditions would suggest. Conversely, values of \(\hat{\varepsilon }_{rt}\) below zero indicate that a region performs worse than predicted by its structural preconditions, i.e. the model overestimates regional growth.
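The estimation and residual step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the paper: it assumes a balanced panel, stacks the agglomeration and competitiveness variables into a single covariate array, and removes region and time fixed effects with the within (two-way demeaning) transformation. The function name `two_way_fe_residuals`, the array layout, and the synthetic data are hypothetical.

```python
import numpy as np


def two_way_fe_residuals(y, X):
    """Within estimator with region and time fixed effects (balanced panel).

    y : (n, p) array of growth rates, regions in rows, periods in columns.
    X : (n, p, q) array of q structural covariates.
    Returns the slope estimates (q,) and the (n, p) residual matrix.
    """
    def demean(a):
        # Subtract region means and period means, add back the grand mean.
        return (a
                - a.mean(axis=1, keepdims=True)
                - a.mean(axis=0, keepdims=True)
                + a.mean(axis=(0, 1), keepdims=True))

    y_w = demean(y)
    X_w = demean(X)
    n, p, q = X.shape
    # Ordinary least squares on the demeaned data.
    beta, *_ = np.linalg.lstsq(X_w.reshape(n * p, q),
                               y_w.reshape(n * p), rcond=None)
    resid = y_w - X_w @ beta
    return beta, resid


# Synthetic data for illustration only (all values are made up).
rng = np.random.default_rng(0)
n, p, q = 6, 5, 2                        # regions, periods, covariates
X = rng.normal(size=(n, p, q))
beta_true = np.array([0.5, -0.3])
region_fx = rng.normal(size=(n, 1))      # time-invariant regional effects
time_fx = rng.normal(size=(1, p))        # shocks common to all regions
noise = 0.1 * rng.normal(size=(n, p))
y = X @ beta_true + region_fx + time_fx + noise

beta_hat, E = two_way_fe_residuals(y, X)  # E is the n x p residual matrix
```

In this sketch the residuals average to zero within each region and within each period by construction, which is why only their pattern over time, not their mean, can signal a region-specific deviation.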

By comparing the unexplained growth component, \(\hat{\varepsilon }_{rt}\), across regions, it is possible to identify outliers, i.e. regions that in certain periods perform better or worse than their structural preconditions are able to account for. We standardise the elements of the matrix \(E_{n}^{m - k}\) to make them comparable (Footnote 3). The idea behind standardisation is that over the entire period of observation, the average value of the residuals \(e_{rt}\) (that is, \(\hat{\varepsilon }_{rt}\)) in each region r is by definition zero. However, the question of interest is whether the residuals deviate from this mean at random, as regional growth models habitually assume, or whether they show systematic patterns of substantial deviations.

We calculate the standard deviation of the error terms in each period t, \(\sigma_{t}\), and divide the error terms by their standard deviation in the respective period:

$$z_{rt} = \frac{{e_{rt} - \bar{e}_{t} }}{{\sigma_{t} }} = \frac{{e_{rt} }}{{\sigma_{t} }}$$
(3)

and we, thus, obtain a matrix of standardised error terms:

$$Z_{n}^{m - k} = \left( {\begin{array}{*{20}l} {z_{11} } \hfill & {z_{12} } \hfill & \cdots \hfill & {z_{1m - k} } \hfill \\ {z_{21} } \hfill & {z_{22} } \hfill & \cdots \hfill & {z_{2m - k} } \hfill \\ {} \hfill & {} \hfill & \ddots \hfill & {} \hfill \\ {z_{n1} } \hfill & {z_{n2} } \hfill & \cdots \hfill & {z_{nm - k} } \hfill \\ \end{array} } \right)$$
(4)

Values \(z_{rt}\) measure the distance of each error term from zero, expressed in standard deviations, for each observation period (Footnote 4). Standardisation does not change the sign of the errors, so that values above zero indicate that regional growth deviates positively from its predicted performance and vice versa.
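The standardisation in Eq. (3) can be sketched as follows. This is an illustrative assumption rather than the authors' code: the function name `standardise_by_period` is hypothetical, the residual matrix is made up, and the population standard deviation (`ddof=0`) is an arbitrary choice, since the text does not specify which variant is used.

```python
import numpy as np


def standardise_by_period(E):
    """Standardise residuals column by column (one column per period).

    E : (n, p) residual matrix, regions in rows, periods in columns.
    Returns the (n, p) matrix of standardised residuals.
    """
    mean_t = E.mean(axis=0)           # zero when time effects are included
    sigma_t = E.std(axis=0, ddof=0)   # spread across regions in period t
    return (E - mean_t) / sigma_t


# Made-up residuals for three regions and two periods.
E_example = np.array([[1.0, -2.0],
                      [-1.0, 2.0],
                      [0.0, 0.0]])
Z_example = standardise_by_period(E_example)
```

Because each column is divided by a positive number, the sign of every residual is preserved whenever the period mean is zero.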

Examining the matrix \(Z_{n}^{m - k}\) row by row, i.e. looking at individual regions, we define persistent regional deviations from average growth paths as at least k + 1 consecutive periods (Footnote 5) in which the standardised error terms for a given region are above one (or below minus one). This allows us to identify where and when regional structural preconditions are ill-equipped to predict regional growth.
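The row-by-row search for persistent deviations can be sketched as a scan for runs of consecutive periods beyond the threshold. This is a minimal illustration that reads the rule as "above +1 or below −1 for at least k + 1 consecutive periods"; the function name `persistent_deviations`, the output format, and the example values are assumptions made here.

```python
import numpy as np


def persistent_deviations(Z, k, threshold=1.0):
    """Find spells of at least k + 1 consecutive periods beyond the threshold.

    Z : (n, p) matrix of standardised residuals.
    Returns a list of (region, first_period, last_period, sign) tuples,
    with sign = 1 for positive and -1 for negative deviations.
    """
    min_len = k + 1
    spells = []
    for r, row in enumerate(Z):
        for sign in (1, -1):
            mask = sign * row > threshold
            start = None
            # Append False so that a run reaching the last period is closed.
            for t, flag in enumerate(np.append(mask, False)):
                if flag and start is None:
                    start = t
                elif not flag and start is not None:
                    if t - start >= min_len:
                        spells.append((r, start, t - 1, sign))
                    start = None
    return spells


# Made-up standardised residuals for three regions and six periods.
Z_demo = np.array([
    [1.2, 1.5, 1.1, 1.3, 0.2, -0.4],     # four periods above +1
    [-1.1, -1.3, 0.5, -1.2, -1.4, -1.6],  # runs of two and three below -1
    [0.1, 1.2, -1.3, 1.1, 0.0, -0.2],     # no sustained deviation
])
spells_k3 = persistent_deviations(Z_demo, k=3)
spells_k2 = persistent_deviations(Z_demo, k=2)
```

With `k=3` only the first region qualifies, whereas with `k=2` the three-period negative run of the second region is also flagged, which shows how the choice of k governs how strict the persistence criterion is.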

One may claim that such persistent deviations in error terms signify the presence of serial autocorrelation and, thus, a misspecification of the model (e.g. through omitted variables). This perspective stems from conventional modelling approaches, in which any deviation of regional growth from average trajectories is attributed to random shocks or noise and is therefore uninteresting, as regional growth trajectories are ultimately expected to revert to the mean. From that perspective, one should search for additional variables to include in the model. In our perspective, however, it is exactly these persistent deviations in error terms that we should use to identify regions in which local, temporally bounded factors potentially create a region-specific growth path that is not explained by structural factors.

4 Empirical illustration: do systematic regional growth deviations exist in Sweden?

To demonstrate the methodology empirically, we apply the procedure outlined in Sect. 3 to data on regional employment growth in Sweden in the period between 1990 and 2016 (Footnote 6).

4.1 Data and variables: non-technical summary

The data employed in the analysis are from the Longitudinal Integration Database for Health Insurance and Labour Market Studies (LISA), which is a total-count population data set. LISA integrates annual data from several registers, including education, income, employment, health insurance, and population registers. The data set contains detailed information on individuals across various variables, such as age, education, annual earnings, municipality of residence and employment, and industry of employment.

We operationalise regional performance as employment growth over four-year periods (1990–1993; 1991–1994; etc.):

$$\Delta {\text{emp}}_{rt}^{t + 3} = \frac{{\ln \left( {{\text{emp}}_{rt + 3} } \right) - \ln \left( {{\text{emp}}_{rt} } \right)}}{3},$$

where \(\text{emp}_{rt}\) is the employment in region \(r\) in year \(t\). While we focus on employment growth in this illustration, the methodology outlined in Sect. 3 may be applied to regional GDP growth, regional productivity changes, or any other regional performance variable, depending on the aims of the analysis.
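As an illustration of the growth definition above, the following sketch computes the annualised log growth rate for all overlapping periods at once. It assumes a hypothetical array `emp` with one row per region and one column per year; the function name is likewise only illustrative.

```python
import numpy as np

def annualised_log_growth(emp, horizon=3):
    """Annualised log growth between year t and year t + horizon.

    emp: array of shape (regions, years) with strictly positive employment.
    Returns an array of shape (regions, years - horizon).
    """
    emp = np.asarray(emp, dtype=float)
    # Column j of the result compares year j + horizon with year j.
    return (np.log(emp[:, horizon:]) - np.log(emp[:, :-horizon])) / horizon
```

For 27 annual observations (1990 to 2016) and a horizon of three years, the result has 24 columns, one for each overlapping four-year period.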

The selection of the dependent variable has implications for the definition of spatial units for subsequent analysis, because they should ideally capture functional units (see Sect. 2.1). The data in LISA are originally provided at a municipality level. Yet, as labour market processes often transcend municipal borders, we merge the 290 Swedish municipalities into 90 local labour markets (LLMs). The latter constitute integrated geographical entities within which most interactions between workers and employers occur. In that respect, LLMs are appropriate functional units for linking the supply and demand sides of regional labour markets and explaining their performance. In practice, the boundaries of LLMs are defined by commuting patterns between municipalities through maximising the self-containment of commuting flows (SCB 2010).

The independent variables included in this illustration correspond to the two groups of structural factors identified in the theoretical section of the paper (Sect. 2.1). The first group of variables concerns regional industry mix and agglomeration effects. To position a region on a spectrum of specialisation versus diversification in the regional employment mix (Footnote 7), we use three variables often employed in the literature: related variety (diversity across cognitively similar industries), absolute diversity in the regional employment mix (measured by the inverted Hirschman-Herfindahl index), and relative regional specialisation (derived from location quotients of industries weighted by their employment shares within a region). These three variables provide an operationalisation of MAR (specialisation) and Jacobs (diversity) agglomeration externalities. We also include population density and the size of the labour market as variables capturing the third type of agglomeration economies, namely urbanisation externalities.
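To make two of these measures concrete, the sketch below computes an inverse Hirschman-Herfindahl index and an employment-weighted location quotient from a region-by-industry employment matrix. Both formulas are assumptions about one common way to operationalise the verbal definitions above (for instance, "inverted" is read here as the reciprocal of the index), and the function name is hypothetical. Related variety is omitted because it requires an industry hierarchy.

```python
import numpy as np

def diversity_measures(emp):
    """emp: array of shape (regions, industries) with employment counts.

    Assumes every region and every industry has positive total employment.
    Returns (inverse_hhi, weighted_lq), one value per region.
    """
    emp = np.asarray(emp, dtype=float)
    shares = emp / emp.sum(axis=1, keepdims=True)  # industry shares within region
    national = emp.sum(axis=0) / emp.sum()         # industry shares in the country
    inverse_hhi = 1.0 / (shares ** 2).sum(axis=1)  # assumption: 1 / HHI
    lq = shares / national                         # location quotients
    weighted_lq = (shares * lq).sum(axis=1)        # assumption: share-weighted mean LQ
    return inverse_hhi, weighted_lq
```

Under this reading, the inverse index equals the number of industries when employment is spread evenly and falls to one when a region has a single industry.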

The second group of variables focuses on regional competitiveness stemming from local innovation activity. To capture R&D intensity of regional economies, we measure shares of regional employment in high-tech manufacturing and knowledge-intensive services (Eriksson and Hansen 2013). As a measure of regional R&D potential, we also include human capital (measured as the share of regional population with higher education among individuals aged 25+) (Ó hUallacháin 2007).

As control variables, we include some general structural characteristics of local labour markets: we account for the share of employment in manufacturing and the share of public employment to control for the sensitivity of regional labour markets to macroeconomic conditions (Martin 2012). To account for convergence effects, we include the median regional wage. Finally, we account for regional competition for workers by including the number of establishments per worker.

All independent variables (except for those which are shares) are log-transformed. To mitigate endogeneity concerns, all variables are measured at the beginning of each sub-period. Table 1 provides descriptive statistics and correlations for the variables.

Table 1 Descriptive statistics and correlations

4.2 Regression results

The intention of this paper is not to evaluate the impact of structural characteristics on regional employment growth per se, but rather to quantify the remaining unexplained variance after accounting for the structural factors. Nonetheless, we provide a brief reflection on the relationship between structural characteristics and employment growth, as a background to the discussion of outliers. Table 2 presents the results of estimating the regression specified in Eq. (1) using the variables summarised in Sect. 4.1.

Table 2 Employment growth and structural preconditions at the regional level in Sweden, 1990–2016

When it comes to the degree of specialisation versus diversification in the regional employment mix, we observe a positive significant effect of related variety, a negative significant effect of specialisation (measured by the Theil index), and an insignificant effect of absolute diversity. This implies that over the observed time period, it was the regions with sufficiently (but not excessively) diversified employment mixes that were most able to generate employment in Sweden. Employment growth is also positively affected by the degree of urbanisation (measured as population density).

With respect to regional innovativeness and competitiveness, there is (somewhat surprisingly) no significant relationship between employment growth and the share of knowledge-intensive activities (both manufacturing and services) in the region. Nor does the human capital variable tend to exhibit any significant impact.

Finally, with respect to the group of ‘structural controls’, we observe a significant convergence effect (negative sign for the regional employment variable), a positive (but weakly significant) effect from the share of manufacturing, and a negative effect of the public employment share.

4.3 Systematic regional growth deviations

This section presents the analysis of systematic deviations in regional growth. First, we obtain the matrix of prediction errors \(E_{90}^{24}\) as in Eq. (2) and transform it into the matrix of standardised prediction errors \(Z_{90}^{24}\) according to Eq. (3). As outlined above, we identify region-specific growth if the standardised prediction error is above one (or below minus one) for at least four consecutive years. Table 3 presents all regions that exhibit such systematic deviations from the average growth prediction according to this definition, while Table 6 in the Appendix presents information about all 90 regions in Sweden.
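As a consistency check on these dimensions, and assuming that \(m\) denotes the number of annual observations and \(k\) the growth horizon in years (a reading of the notation in Sect. 3 that is not spelled out here), the 27 years from 1990 to 2016 and the three-year horizon of the growth definition in Sect. 4.1 give

$$n = 90,\qquad m - k = 27 - 3 = 24,\qquad k + 1 = 4,$$

so that both matrices have 90 rows (local labour markets) and 24 columns (overlapping periods starting in 1990 through 2013), and the persistence criterion amounts to four consecutive periods.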

Table 3 Regions exhibiting deviating growth periods, 1990–2016

Following this approach, we identify 21 regions that at some point between 1990 and 2016 exhibited a systematic deviation for at least four years in a row. Of these:

  • Seven regions (Arvidsjaur, Gällivare, Kiruna, Laxå, Pajala, Säffle, and Vansbro) had periods during which they grew both above and below what would be predicted by their structural preconditions;

  • Six regions (Bengtsfors, Emmaboda, Gislaved, Hofors, Sorsele, and Stockholm) exhibited only the positive outlier features; and

  • Eight regions (Eskilstuna, Haparanda, Hultsfred, Jokkmokk, Olofström, Strömstad, Söderhamn, and Ånge) had periods of growth below the prediction by the structural factors.

Looking at temporal and regional patterns, we also derive a series of further stylised facts:

  1. Regions with systematic deviations represent a broad range of size groups, from the metropolitan local labour market of Stockholm on the right side of the distribution (with a population of 2.8 million inhabitants in 2018) to the local labour market of Sorsele on the left side (with a population of 2522 inhabitants in 2018). In that respect, the methodology is not biased towards any particular group of regions with respect to their size;

  2. There is no clear temporal correlation in the outlier patterns. That is, we observe both negative and positive outlier tendencies throughout the whole observation period. This suggests that the proposed methodology separates region-specific growth from the national growth pattern reasonably well.

Another way to look at the residuals is to compare them with the observed employment growth (see Fig. 1). One would expect positive outliers to be regions with exceptionally fast growth, while negative outliers would be regions with exceptionally slow growth in the respective period. The latter is, in general, true: there appears to be a strong correlation between the value of the standardised residual and the actual growth (lower left quadrant in Fig. 1).

Fig. 1 Standardised residuals versus employment growth for outlier regions

When it comes to positive outliers, however, the situation is more nuanced. Certain regions had positive employment growth while being positive outliers (upper right quadrant in Fig. 1), but their growth tempo is not correlated with the size of the standardised residual. In addition, there are several regions which are lucky losers (lower right quadrant in Fig. 1). These demonstrate low growth performance and yet are still positive outliers, implying that they shrank more slowly than their structural preconditions would suggest.

The above illustrates clear patterns where some regions over a period of at least 4 years consistently perform better or worse than could be expected considering their structural preconditions. The residuals are also remarkably robust to which explanatory factors are included in the model. This can be investigated by comparing the residuals of the fully specified model with the residuals of a model that only includes year and regional fixed effects. Figure 2 illustrates the residuals for Stockholm, Gällivare, and Strömstad, which are the regions with the lowest, median, and highest average absolute differences between residuals of the fully specified and the fixed effects only model.

Fig. 2 Illustrations of region-specific growth paths

For Stockholm, as well as for Gällivare, the two lines representing the residuals in each year of observation for the two models are very close to each other. For Strömstad, the figure unveils a larger gap between the residuals of the two models even though their trend is similar. Out of the 21 regions with systematic regional growth deviations, 19 regions show small gaps between the residuals of the two models, i.e. resembling the figures for Stockholm and Gällivare. In most cases, therefore, the inclusion or exclusion of structural variables does not alter the residuals in a substantial manner.
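The comparison behind Fig. 2 can be sketched as follows. The snippet assumes two hypothetical residual matrices of identical shape (regions in rows, periods in columns), one from the fully specified model and one from the model with fixed effects only; the function names are illustrative and not taken from the analysis.

```python
import numpy as np

def mean_abs_gap(resid_full, resid_fe):
    """Average absolute difference between the two residual series, per region."""
    a = np.asarray(resid_full, dtype=float)
    b = np.asarray(resid_fe, dtype=float)
    return np.abs(a - b).mean(axis=1)

def lowest_median_highest(gaps):
    """Row indices of the regions with the lowest, median and highest gap.

    For an even number of regions the upper of the two middle regions is
    returned (an arbitrary choice made for this sketch).
    """
    order = np.argsort(gaps)
    return int(order[0]), int(order[len(order) // 2]), int(order[-1])
```

Applied to the 21 regions with systematic deviations, the three indices would correspond to the three regions shown in Fig. 2.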

A final remark relates to the importance of systematic regional growth deviations, which is an unexpected but still important finding of the empirical illustration. Overall, the potential of the structural variables to explain regional growth variation is low, a fact that is largely concealed by the relatively high \(R^{2}\) values that are supposed to measure the fit of the model. This is already apparent from the fact that, in most cases, the residuals from the full model do not deviate substantially from the residuals of a model that includes only the time and regional fixed effects. Another way of illustrating this is to compare the explained variance of different models.

We use as a starting point a model that only includes year fixed effects and calculate the sum of the squared residuals. Then, we include a regional fixed effect (i.e. a dummy variable for each region) and sum again the squared residuals. This reduces the sum of squared residuals—or in other words regional variation—by 32%. Then, we add all other explanatory factors as shown in Table 2 and calculate the sum of the squared residuals. All structural variables and the regional fixed effects account for 42% of regional variation as compared to the model with only year fixed effects. This means that in total 58% of regional variation remains unexplained. While this includes stochastic disturbances as well as more systematic deviations in regional growth, it suggests that we should seriously consider why regions show idiosyncratic growth paths that cannot be explained by general structural factors.
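The nested comparison described above can be reproduced along the following lines. This is a minimal sketch using ordinary least squares on dummy variables; the function names and the input layout (one observation per region and period) are assumptions, and the actual estimation may differ, for instance in how standard errors or fixed effects are handled.

```python
import numpy as np

def ssr(y, X):
    """Sum of squared residuals from a least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # handles collinear dummies
    resid = y - X @ beta
    return float(resid @ resid)

def dummies(codes):
    """One 0/1 column per distinct value in `codes`."""
    codes = np.asarray(codes)
    levels = np.unique(codes)
    return (codes[:, None] == levels[None, :]).astype(float)

def explained_shares(y, region, year, X_struct):
    """Reduction in SSR relative to the model with year fixed effects only.

    Returns (share with region fixed effects added,
             share with region fixed effects and structural variables added).
    """
    y = np.asarray(y, dtype=float)
    X_struct = np.asarray(X_struct, dtype=float)  # shape (observations, variables)
    D_year, D_region = dummies(year), dummies(region)
    ssr_year = ssr(y, D_year)
    ssr_region = ssr(y, np.hstack([D_year, D_region]))
    ssr_full = ssr(y, np.hstack([D_year, D_region, X_struct]))
    return 1 - ssr_region / ssr_year, 1 - ssr_full / ssr_year
```

In the terms used above, the two returned shares correspond to the 32% and 42% figures, and one minus the second share to the 58% of regional variation that remains unexplained.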

5 Conclusions

While regional development research has traditionally mainly been preoccupied with identifying regularities explaining growth across regions, this paper turns the attention to the outliers in regional growth regressions. From a theoretical perspective, there are many reasons to expect regions to exhibit idiosyncratic growth patterns. Regional development is a function of a complex web of intra- and extra-regional endowments of knowledge, resources and networks, characterised by mutual dependencies and interactions across many factors. Hence, some regions can be expected to outperform their peers over shorter or longer periods, while others lag behind.

Regional growth patterns therefore need to be investigated more closely. We propose an approach to detect systematic regional growth deviations. Residuals can be used to identify short-term and medium-term trends that are not explained by the included structural variables. When the residuals are sufficiently large and maintain the same direction over a period of several years, they are an indication of a region-specific growth component that deviates systematically from the expected average growth performance.

Swedish register data from 1990 to 2016 are used to illustrate the approach and detect regions that exhibit growth deviations in the short and medium term. These come in all shapes and sizes, from the capital to tiny peripheral regions. They encompass positive outliers, negative outliers, and regions that are both during the period of observation. Outliers are not limited to a certain phase of economic transition, but appear throughout the whole study period. Furthermore, the analysis shows that the residuals are not heavily affected by the inclusion or exclusion of other variables in the model. This suggests that residuals do indeed hold valuable information for detecting outlier regions.

Moreover, it is noteworthy how large a share of regional growth variation remains unexplained in standard growth regressions. Considering the total regional variation, 58% remains unexplained after considering all structural factors that have received primary attention in the recent literature. Of the 42% of regional variation explained by the structural variables and the regional fixed effects together, the largest part (32 percentage points) can be attributed to the regional fixed effects. Yet, it needs to be kept in mind that the empirical illustration is limited to a rather small and highly developed country. On the one hand, good data availability allowed us to include a comprehensive set of up-to-date indicators. On the other hand, it cannot be guaranteed that the findings hold in the same way in other contexts until similar studies have been conducted there. However, the proposed methodology can in any case be useful to investigate whether, when, and where systematic regional growth deviations exist.

This has implications for quantitative and qualitative research in economic geography. First, the method serves as a tool to identify when and where general structural factors are ill-equipped to explain regional growth. It unveils extreme cases of unexpected growth (or decline), from which substantial new knowledge can be gained through in-depth case studies (Eisenhardt and Graebner 2007). Further research about these extreme cases may lead to the discovery of so far unobserved structural factors, or a further validation of the importance of region-specific growth trajectories due to human agency and the unique combination of conditions in a region at a given time.

Second, the findings pose a challenge for quantitative studies beyond what has been identified in the economics literature (Breinlich et al. 2014a, b). If the combination of intra- and extra-regional factors constitutes opportunities and constraints for growth that are region- and time-specific, and if actors perceive and act upon those in a variegated manner (Grillitsch and Sotarauta 2019), regional pathways are expected to emerge that have little to do with modelled averages in regional growth regressions. Such estimations do not fit well with the “wide diversity of regional trajectories” argued for in evolutionary economic geography (Boschma 2004, p. 1008). New methods are needed that are both closer to this theoretical understanding and better equipped for reducing the so far unexplained regional growth variations.