The black box of regional growth

Regional growth models leave a large share of variation unexplained. While we should continuously aim to improve these models, the unique combination of conditions and human agency in each region will also invariably lead to region-specific growth trajectories. Theoretically, we should thus expect systematic deviations from growth predictions. We propose an approach to explore these unexplained deviations and to detect regions that perform unexpectedly well or badly in certain periods. We illustrate the approach using data for Sweden from 1990 to 2016. We find systematic patterns of unexplained periodic regional growth deviations outweighing the effect of generic structural factors.


Introduction
In this paper, we discuss a fundamental challenge of one particular form of explaining growth and change of regions and cities: the explanation of regional growth by structural factors in regional growth models. Regional growth models test if and to what extent selected variables predict regional growth on average. This paper shifts the analysis from smoothing regional growth around means towards what is usually treated as "noise" and "random disturbance", i.e. the residuals that remain unexplained in regional growth regressions. This appears important as these residuals are "[s]tubbornly high - and often growing" (Rodríguez-Pose 2013, p. 1036), meaning the predictive power of regional growth models is decreasing.
We address a fundamental issue of regional growth that surfaces in the introductory quote from Storper (2011). Regions may develop systematic deviations from average growth as a result of the interplay between "an almost infinite range of forces". Knowledge bases, networks, institutions, industries, and infrastructure coevolve in regions in a path-dependent manner. The interplay between these many factors leads to emerging qualities where the outcomes cannot be predicted, but are still persistent over time. Hence, these region-specific growth deviations are to be theoretically expected. This brings about an important task of distinguishing between the principal regularities in urban and regional growth and the events and processes that are not temporally or geographically regular but that affect pathways of development in durable ways (Storper 2011).
This resonates with evolutionary and institutional economic geography, where path-dependent processes may lead to a wide variety of regional trajectories (Boschma 2004). As such, it deviates from the economic growth literature's typical theoretical starting point of general equilibrium. For a review of the latter literature, see Breinlich et al. (2014a, b), who also provide an excellent account of methods to improve the causal interpretation of specific structural factors on growth.
In contrast to much of the economic growth literature, this paper's main concern is not the causal interpretation of specific factors, but the overall growth patterns of regions. General structural factors, such as industry mix, human capital, or population size, partly explain regional growth (Sect. 2.1). Yet, region-specific growth is important besides and beyond such general structural factors. We elaborate why regional and extra-regional conditions may explain region-specific growth (Sect. 2.2). The main purpose is to develop a methodology for detecting systematic regional growth deviations after considering structural factors and regional preconditions (Sect. 3). We do this by closely investigating the patterns of residuals in regional growth regressions. If region-specific growth indeed plays an important role, residuals should not be randomly distributed but show systematic deviations. Finally, we provide an empirical illustration (Sect. 4). We assess unexplained regional growth deviations using data on employment growth across Swedish local labour markets in 1990-2016. We find that the residuals are large and often systematic, which challenges current thinking as it calls for (i) including additional important variables, such as institutions (e.g. Rodríguez-Pose 2020), (ii) improving econometric models, or (iii) acknowledging the possibility of region-specific growth paths caused by the interplay of multiple regional and extra-regional factors.

Regional growth models: what they explain and what remains unexplained
This section starts from a short review of regional growth models. We discuss what such models explain and what remains unexplained. We identify factors that are often not included in regional growth models and discuss the complex interplay of many regional and extra-regional forces that may cause region-specific growth trajectories.

A short review of regional growth models
Traditional models of regional growth departed from a Solow-Swan framework (Solow 1956), seeing growth mainly as a function of the accumulation of capital and the increasing productivity of labour. These models were later extended with a broader conception of capital to include human, social, and other types of capital. The introduction of endogenous growth models (Romer 1986) in the 1980s represented an important breakthrough, explicitly incorporating the role of R&D and innovation as key drivers of growth. However, growth models have continued to leave a lot unexplained, and contemporary literature aims to identify more intangible social factors that can fill this gap. The most popular sets of explanations in the regional literature revolve around the role of institutions and social structures, and factors such as the evolution of regional industry structures and the opportunities and barriers it creates for knowledge spillovers and new industry creation. Accordingly, recent literature on regional growth models exhibits a high degree of variation in the dependent and independent variables used, as well as in its modelling approaches (Table 4 in the Appendix provides various examples from recent papers in this field). This is a large area of research, and hence, this short review is by no means exhaustive. Rather, it provides an overview against which we develop the remainder of the paper.
To start with, the concept of growth in regional growth models typically refers to economic growth, measured predominantly by the gross regional product, employment, or productivity. These measures emphasise different aspects of regional growth, where the gross regional product results from changes in employment and productivity. A stable, or even increasing, gross regional product may result from a decline in employment coupled with an increase in productivity (the so-called jobless growth). Such interplay between employment and productivity occurs, for instance, when manufacturing firms automate the production process, which tends to increase employment initially but leads to a reduction in employment at a later stage (Bessen 2020).
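The accounting identity behind this interplay can be written out explicitly. The following is a minimal sketch, assuming productivity is measured as gross regional product per employed person; the symbols GRP, E, and P are introduced here for illustration only and are not the paper's notation.

```latex
\begin{align*}
GRP_{rt} &= E_{rt} \cdot P_{rt}, \qquad P_{rt} = \frac{GRP_{rt}}{E_{rt}} \\
\Delta \ln GRP_{rt} &= \Delta \ln E_{rt} + \Delta \ln P_{rt}
\end{align*}
```

In this notation, jobless growth corresponds to $\Delta \ln E_{rt} < 0$ combined with $\Delta \ln P_{rt} \geq -\Delta \ln E_{rt}$, so that the gross regional product does not fall even though employment does.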
Secondly, models differ considerably in the explanatory and control variables included. Obviously, the choice of explanatory and control variables relates to the empirical context as well as the specific research question. The traditional factors in growth models are capital and physical infrastructure (e.g. road infrastructure or broadband access), labour endowments, and R&D, reflecting the classical Solow-Swan and endogenous growth models. Recent developments have added institutional factors, as well as evolutionary processes.
Consequently, contemporary research foregrounds two sets of structural preconditions, which shape knowledge spillovers and the evolution of industries, and hence drive regional growth in developed countries. The first is the clustering of economic activities and the underlying regional industrial mixes and agglomeration effects, which are important both as measures of capital in the classical models and as indicators of the potential for knowledge spillovers and industry branching in the evolutionary tradition. The second is regional competitiveness factors reflecting the spatial patterns of innovation activities (Cheshire and Malecki 2004; Crescenzi et al. 2016; Giannakis and Bruggeman 2017; Harris 2011; Iammarino et al. 2018; Storper 2011), following from the endogenous growth model tradition's emphasis on R&D and human capital.
As regards the first set of factors, the literature highlights the role of knowledge spillovers between firms and industries as a key mechanism through which the clustering of economic activities affects growth. There is a consensus that the dynamism of large cities makes them motors of economic growth (Duranton and Puga 2001; Fujita et al. 1999). Urban agglomeration is also considered to lead to greater innovation (Iammarino 2005) and to lower barriers and costs of knowledge sharing and transmission across individual and firm networks (Storper and Venables 2004). With respect to regional industrial mixes, Glaeser et al. (1992) gave rise to a lively debate, referred to as 'MAR vs. Jacobs', on the impact of specialisation and diversification on economic growth. MAR refers to theories of Marshall, Arrow, and Romer, who suggested that knowledge spillovers take place predominantly between similar economic activities, giving rise to localisation economies. In contrast, Jacobs (1969) claimed that industrial diversity enhances the cross-fertilisation of ideas from different sectors. A more recent position is that diversity in cognitively similar industries (related variety) is the strongest stimulant of regional growth as such diversity provides the most fertile soil for inter-industry knowledge spillovers (Frenken et al. 2007). Contemporary models of regional growth thus often include three variables that account for the regional industry mix: specialisation, diversification, and related variety. In addition, population size and density are often included to account for general agglomeration and urbanisation effects.
The second subset of structural factors relates to the determinants of regional competitiveness, primarily human capital, research and development (R&D), and innovation efforts. The accumulation of human capital and the allocation of resources to R&D are long-term structural characteristics of the regional economy, which adjust slowly over time and shape local growth trajectories (Blažek and Kadlec 2019). Both factors shape the capability of the local economy to generate new knowledge and to receive and exploit knowledge from the outside world (Crescenzi and Rodríguez-Pose 2011; Faggian and McCann 2009; Gennaioli et al. 2013). The absorption and generation of new knowledge and its translation into new products and processes are key drivers of regional economic performance. The innovativeness and human capital intensity of the regional economy also facilitate regional connectivity with the national and global economy. Regions investing more in innovation and human capital attract the most sophisticated functions of multinational firms, enabling the regional economy to enter the most advanced stages of global value chains (Crescenzi et al. 2014). R&D intensity and the share of the population with higher education degrees are the most commonly used indicators to account for this.
Thirdly, it is important to clarify the concept of a region. The empirical implementation of regional growth models is in most cases limited to administrative borders due to data availability. In the European context, the statistical NUTS units are often used. These relate to administrative borders but often combine several municipalities and sometimes counties to create units with similar population size. Studies including regions from numerous countries often use more aggregate (e.g. NUTS1 or NUTS2) territories due to data availability and comparability (see Table 4). One problem is that administrative (or NUTS) regions often do not correspond to functional regions, which is the level at which the mechanisms that drive regional growth play out. Few countries provide the required data for functional regions such as labour markets, but if available, it is the preferred delineation of empirical regional growth models (Boschma 2004).
Finally, to conclude this short review, Table 4 also illustrates the variety of modelling approaches used in regional growth models. Breinlich et al. (2014a, b) provide an excellent review of different modelling approaches, shortcomings, and methods for improving causal interpretation.
For the purposes of this paper, it is important to note that regional growth models typically estimate growth trajectories for an average region, thereby identifying how certain factors are associated with regional growth on average. Deviations from such average trajectories are treated as noise or random shocks (particularly in the spatial economics tradition), and therefore as uninteresting for explaining regional growth. From our perspective, these deviations are of interest and point to unexplained mechanisms, the black box of regional growth. Some regions do grow above or below average in certain time periods not because of chance but because of their unique combinations of conditions and relations at both the regional and the extra-regional scale (compare Capello 2009; Capello and Nijkamp 2011; Storper 2011). Such unique combinations might be sources of region-specific growth trajectories that the standard growth modelling approaches are unable to capture.

Region-specific growth
The idea of region-specific growth suggests that there may be systematic deviations from the estimations of regional growth models. These deviations may partly be explained by omitted variables, which are often difficult to measure (cf. Rodríguez-Pose 2013). This is, for instance, the case for institutions and social capital, which can only be observed through imprecise proxies. Undoubtedly, such factors account for part of the unexplained deviations. Yet, it may also be the case that the unique combination of conditions in specific regions at specific times enables growth trajectories that systematically deviate from the average (Sayer 2000; Storper 2011). In addition, recent contributions attribute to human agency the potential to create new development paths, which deviate from the expected prolongation of the past (Garud et al. 2010; Grillitsch and Sotarauta 2019; Isaksen et al. 2019).
In this section, we review conditions that are often inadequately captured in regional growth models and discuss how the combination and activation of these conditions may lead to region-specific growth paths. Boschma (2004, p. 1008) argues that "a region moves along a specific development trajectory that affects (as an incentive and selection structure) the kind of competences that are most developed and reproduced, and how the institutional set-up co-evolves, and influences the way production, learning and innovation take place. Consequently, there exists a wide diversity of regional trajectories". Resonating with this statement, this review discusses regional knowledge bases, institutional architectures, extra-regional relations, and human agency as potential causes of region-specific growth paths.
First, knowledge bases vary significantly between places due to industrial, educational, and research specialisations, which combine sticky local knowledge with global knowledge in a unique manner (Asheim and Isaksen 2002). While it is obvious that skills and competences are developed in and drawn to regions in response to existing specialisations, it took Polanyi (1958) to clearly express why this knowledge remains sticky. Important parts of knowledge are embodied, impossible to codify, and therefore hard to transfer over distance. This type of knowledge is acquired through interaction and practice, leading to the localised learning thesis (Maskell and Malmberg 1999), according to which interactive learning is powered through social networks at the local scale (Breschi and Lissoni 2009;Kemeny et al. 2016) as well as shared institutions (Gertler 1995). This has the potential to create unique regional knowledge bases.
Second, institutions, which shape interactions between individuals and organisations within and across regions, are difficult to operationalise and measure (Rodríguez-Pose 2013, 2020). Institutions are relevant for national competitiveness and innovativeness (Hall and Soskice 2001; Nelson 1993; Vitols 2001) and frame the emergence of regional innovation systems (Asheim and Coenen 2005; Asheim and Gertler 2005). Regional policy mixes and rationales, regional investments in systems of vocational training, R&D, and innovation and technology transfer can explain the competitiveness and innovativeness of regions (Blažek and Kadlec 2019; Cooke and Morgan 1994; Morgan 2016). Furthermore, regional interactions and social networks facilitate the emergence of informal institutions and conventions (Malecki 2011; Saxenian 1994; Storper 1995) that may underpin region-specific innovative milieus (Camagni 1991; Crevoisier 2004; Maillat 1998).
Moreover, the multi-scalar architecture of institutions implies that a large number of institutions intersect in specific territories (Gertler 2010; Grillitsch 2015; Hassink 2010). In combination, local cultures, national laws, international regulations, conventions of specific industries and professions, among others, shape the institutional architecture of regions. Considering further the practically countless combinations of these institutions, and that their effect rests on complementarities and contradictions between institutions (Amable et al. 2005; Höpner 2005), it becomes clear why institutional architectures can cause region-specific growth trajectories.
Third, regions are fundamentally open systems subject to inflows and outflows of people and firms. They rely on connections to other regions to bring new knowledge into the system (Bathelt et al. 2004;Fitjar and Rodríguez-Pose 2011;Trippl et al. 2018). Migrants bring human capital, as well as different perspectives and international personal and professional networks, which allow regions to access diverse knowledge (Faggian and McCann 2009;Kemeny 2017;Meili and Shearmur 2019;Saxenian 2007;Solheim and Fitjar 2018;Williams et al. 2004). Multinational enterprises (MNEs) bring investments and competence, and their location decisions can have fundamental implications for regional development (Cantwell and Iammarino 2003;Dunning 1998;Phelps and Fuller 2000). Yet, there have been warnings that reliance on MNEs may turn regions into branch plant economies that struggle to create sustainable competitive advantage (Cumbers 2000).
Perspectives on global cities and the world city network (Beaverstock et al. 2000;Taylor 2001) note that a region's position within global networks is a key determinant of regional growth. Hence, the accessibility to and/or the number of external connections of the region may not fully account for its potential to access knowledge from outside. It also matters which other regions it can connect to, and how they in turn are connected to other regions, as well as which position it has in the urban hierarchy (Shearmur and Doloreux 2015). As each region has a unique position in this network, it is in effect an idiosyncratic factor. Furthermore, production is increasingly characterised by an international division of labour. Companies in different regions and countries perform separate functions, creating global value chains. These are governed in various ways, with implications for coordination across companies and the distribution of power (Gereffi et al. 2005;Humphrey and Schmitz 2002). Within these value chains, multinational enterprises have established global production networks, with subsidiaries and independent local suppliers performing different functions in the production process. These hierarchical networks distribute knowledge and power between headquarters and local suppliers and are to a varying extent territorially embedded (Ernst and Kim 2002;Henderson et al. 2002).
Regional growth often involves regional industries upgrading their positions within these global value chains, i.e. moving from lower to higher-value activities within the value chains (Gereffi 2014; Giuliani et al. 2005). The opportunities for upgrading are shaped by regions' current positions in the value chains (MacKinnon 2012; Pietrobelli and Rabellotti 2011), the variegated modes of governance in global production networks (Blažek 2015), and the extent to which local firms can benefit from knowledge spillovers (Crescenzi et al. 2015). The way regions connect to other regions in specific periods, and the conditions these connections provide for regional development, are thus rooted in the industrial histories of specific territories, making them a potential source of region-specific growth paths.
Yet, such regional trajectories also depend on life cycles of industries (Audretsch and Feldman 1996;Klepper 1997). When a new industry emerges, the windows of locational opportunity are relatively open, as new institutional structures are needed. Regions that succeed in attracting these industries can shift their positions radically (Boschma 1997;Storper and Walker 1989). Over time, the industry consolidates, making it much more difficult for new regions to develop competitive advantage. In the more mature phase, the potential for innovation declines, competition becomes more cost-based and production becomes more dispersed (Audretsch and Feldman 1996).
Finally, there has been a growing literature on human agency in regional development. This literature brings forward a complementary argument: regions may not only develop differently because of unique combinations of various conditions and relations at different scales (as we argued above), but also because of the emerging character of regional development shaped by the intended and unintended consequences of decisions, strategies, and actions of various actors and actor groups (Dawley 2014;Garud and Karnøe 2003;Simmie 2012). More specifically, this literature foregrounds human agency as a fundamental mechanism for change. Human change agency can take different forms, involving innovative entrepreneurship, institutional entrepreneurship, and place-based leadership (Grillitsch and Sotarauta 2019;MacKinnon et al. 2019) and be performed by firms as well as actors from the support system for innovation and entrepreneurship (Isaksen et al. 2019;Isaksen et al. 2018).
In sum, region-specific growth can arise due to the particular combination of regional and extra-regional conditions and relations in concrete territories as well as due to the emerging nature of regional development in which human agency plays a role.

Methodology: how to detect systematic regional growth deviations
We propose a systematic framework for identifying relevant cases for in-depth investigation of region-specific growth mechanisms. We do so by rejecting the spatial economics notion that regional deviations from average growth paths are merely 'noise' and embracing an economic geography perspective in which these deviations indicate temporally and geographically bounded processes that set regions on idiosyncratic paths of development. Large deviations are interesting as extreme cases for qualitative research (Eisenhardt and Graebner 2007) or for identifying potential improvements in the empirical model (Lieberman 2005). That is, we propose that there is a 'signal' in the 'noise'. In practice, this implies that more attention should be paid to deviations from the mean, that is, the residuals in growth regressions. In this section, we outline (in general terms) a methodology that employs such residuals to detect regions that over certain periods deviate systematically from the trajectories predicted by growth regressions.

From structural preconditions to growth regression
In Sect. 2.1, we identified two groups of structural factors that explain the expansion and development of regional economies: (a) the clustering of economic activities and the underlying regional industrial mixes and agglomeration effects and (b) regional competitiveness factors underpinned by spatial patterns of innovation activities. The next step is to develop an empirical model that estimates regional growth with these structural factors as predictors. The approach described below is not specific to this set of independent variables but can be adjusted to any model that captures the effects of observable structural factors on regional growth. We do not include the factors discussed in Sect. 2.2 in this model, as we see these as part of the unique conditions that shape region-specific growth, whose effects will differ in each individual case.
Let us observe a set of n regions $REG_n = \{r_1, r_2, \ldots, r_n\}$ over a period of m years $T_m = \{t_1, t_2, \ldots, t_m\}$. Since our primary goal is not the causal analysis of structural factors, but rather arriving at the best possible prediction of regional growth, we can specify the following fixed effects panel growth model:

$$Y_{rt}^{t+k} = \beta_1 AGGL_{rt} + \beta_2 COMPET_{rt} + \theta_t + \alpha_r + \varepsilon_{rt} \qquad (1)$$

where $Y_{rt}^{t+k}$ represents growth in region r ($r \in REG_n$) over k years between t and t + k ($t \in T_{m-k}$). $AGGL_{rt}$ and $COMPET_{rt}$ are matrices containing variables describing agglomeration and competitiveness factors, respectively. $\theta_t$ represents unobserved time-specific shocks that are uniform across all regions, such as national or global shocks.
The part of regional growth that cannot be explained by structural variables or time effects is represented by (a) regional fixed effects $\alpha_r$ capturing time-invariant unobservable regional characteristics that remain constant over the period $T_{m-k}$ and (b) the standard error term $\varepsilon_{rt}$. The standard error term represents a time-specific unexplained growth component, our "black box", which captures the variance in regional growth that cannot be predicted with the included (and accessible) variables.
A k-year period panel model is preferred over a model capturing year-to-year variation in the data for two reasons: first, regional structural preconditions change rather slowly, implying a relatively low year-by-year variation within regions (Firgo and Mayerhofer 2017). Second, year-to-year models only identify short-run associations between structural factors and regional growth, leaving out long-run effects. Yet, as changes in structural conditions often take time to translate into growth, it makes more sense to employ an interval model rather than a year-to-year model.
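To make the estimation step concrete, the following is a minimal sketch of a two-way fixed effects fit, not the estimation routine used in the paper. It assumes a balanced panel and, for brevity, a single structural regressor standing in for the AGGL and COMPET variables; the function names `two_way_within` and `fit_growth_model` are hypothetical. Region and period effects are removed by the within transformation, and the slope is then estimated by least squares on the transformed data.

```python
from statistics import fmean


def two_way_within(panel):
    """Demean a balanced region-by-period table by region and period means.

    panel[r][t] holds the value for region r in period t. Subtracting the
    row mean and column mean and adding back the grand mean removes any
    additive region effect (alpha_r) and period effect (theta_t).
    """
    n, m = len(panel), len(panel[0])
    row_means = [fmean(row) for row in panel]
    col_means = [fmean([panel[r][t] for r in range(n)]) for t in range(m)]
    grand = fmean(row_means)
    return [
        [panel[r][t] - row_means[r] - col_means[t] + grand for t in range(m)]
        for r in range(n)
    ]


def fit_growth_model(y, x):
    """Fit y_rt = beta * x_rt + alpha_r + theta_t + eps_rt (one regressor only).

    Returns the slope estimate and the n-by-periods table of residuals.
    """
    y_w, x_w = two_way_within(y), two_way_within(x)
    sxy = sum(a * b for ry, rx in zip(y_w, x_w) for a, b in zip(ry, rx))
    sxx = sum(b * b for rx in x_w for b in rx)
    beta = sxy / sxx
    resid = [[a - beta * b for a, b in zip(ry, rx)] for ry, rx in zip(y_w, x_w)]
    return beta, resid
```

With several regressors the same transformation applies, but the scalar ratio would have to be replaced by a multivariate least-squares solve; an unbalanced panel would also need a different treatment.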

From growth regression to systematic growth deviations
One way to identify systematic growth deviations is to look at the fixed effects $\hat{\alpha}_r$ for each $r \in REG_n$ estimated with the model specified in Eq. (1). However, there are two issues with such an identification strategy. The first one is purely statistical: most often researchers operate with short panels, where the number of cross-sectional units (regions) is larger than the number of time periods (years). In such empirical situations, estimates of fixed effects are inconsistent and highly sensitive to the inclusion of time-varying explanatory variables (Wooldridge 2002). The second issue stems from the fact that regional fixed effects are, by definition, time-invariant. Thus, even when estimated consistently, they do not allow us to identify temporally bounded deviations of regions from average growth paths. As we are directly interested in the latter, estimated fixed effects are not the best tool for identifying outlier regions. Instead, we identify systematic regional growth deviations using the standard error term. Estimating the model specified in Eq. (1), we obtain parameter values $\hat{\beta}_i$, $\hat{\alpha}_r$, and $\hat{\theta}_t$. We use these to derive point estimates for regional growth $\hat{Y}_{rt}^{t+k}$ in each region and year and, subsequently, error terms $\hat{\varepsilon}_{rt} = Y_{rt}^{t+k} - \hat{Y}_{rt}^{t+k}$. As a result, we obtain an n × (m − k) matrix of error terms:

$$E_{n \times (m-k)} = \begin{pmatrix} \hat{\varepsilon}_{11} & \cdots & \hat{\varepsilon}_{1,m-k} \\ \vdots & \ddots & \vdots \\ \hat{\varepsilon}_{n1} & \cdots & \hat{\varepsilon}_{n,m-k} \end{pmatrix}$$

For each of n regions and m − k periods, the elements of $E_{n \times (m-k)}$ represent the unexplained growth component after accounting for structural preconditions, as well as region and time fixed effects. Values of $\hat{\varepsilon}_{rt}$ above zero indicate that the model underestimates regional growth. In other words, a region performs better than its structural preconditions would suggest. Conversely, values of $\hat{\varepsilon}_{rt}$ below zero indicate that a region performs worse than predicted by its structural preconditions, i.e. the model overestimates regional growth.
By comparing the unexplained growth component $\hat{\varepsilon}_{rt}$ across regions, it is possible to identify outliers, i.e. regions that in certain periods perform better or worse than their structural preconditions are able to account for. We standardise the elements of the matrix $E_{n \times (m-k)}$ to make them comparable. The idea behind standardisation is that over the entire period of observation, the average value of $\hat{\varepsilon}_{rt}$ in each region r is by definition zero. However, the question of interest is whether the residuals deviate from this mean at random, as regional growth models habitually assume, or whether they show systematic patterns of substantial deviations.
Specifically, we column-standardise the elements of the matrix $E_{n \times (m-k)}$: we calculate the standard deviation of the error terms in each period t ($\sigma_t$) and divide the error terms by the standard deviation of the respective period:

$$z_{rt} = \frac{\hat{\varepsilon}_{rt}}{\sigma_t}$$

We thus obtain a matrix of standardised error terms:

$$Z_{n \times (m-k)} = \begin{pmatrix} z_{11} & \cdots & z_{1,m-k} \\ \vdots & \ddots & \vdots \\ z_{n1} & \cdots & z_{n,m-k} \end{pmatrix}$$

Values $z_{rt}$ measure the distance of each error term from zero, expressed in standard deviations for each observation period; a value of 1, for example, indicates that the error term is exactly one standard deviation from zero. Standardisation does not change the sign of errors, so that values above zero indicate that regional growth deviates positively from its predicted performance and vice versa.
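The column-wise standardisation described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the paper's code: the function name `standardise_by_period` is hypothetical, and the population standard deviation (`statistics.pstdev`) is assumed, since the text does not say whether the population or the sample formula is used.

```python
from statistics import pstdev


def standardise_by_period(resid):
    """Divide each residual by the standard deviation of its period (column).

    resid[r][t] is the residual for region r in period t. The sign of every
    entry is preserved because each sigma_t is positive.
    Assumption: population standard deviation; a period in which all
    residuals are identical (sigma_t == 0) is not handled here.
    """
    n, m = len(resid), len(resid[0])
    sigmas = [pstdev([resid[r][t] for r in range(n)]) for t in range(m)]
    return [[resid[r][t] / sigmas[t] for t in range(m)] for r in range(n)]
```

Because each period is scaled separately, a standardised value of 1 has the same meaning in a volatile period as in a calm one, which is what makes the rows comparable over time.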
Examining the matrix $Z_{n \times (m-k)}$ row by row, i.e. looking at individual regions, we define persistent regional deviations from average growth paths as at least k + 1 consecutive periods (where k is the length of a period in the model specified in Eq. (1)) in which the standardised error terms of a region are above one (or below minus one). This allows us to identify where and when regional structural preconditions are ill-equipped to predict regional growth.
One may claim that such persistent deviations in error terms signify the presence of serial autocorrelation and, thus, a misspecification of the model (e.g. by omitting variables). This perspective stems from conventional modelling approaches in which any deviation of regional growth from average trajectories is attributed to random shocks or noise and is, for that reason, uninteresting, as regional growth trajectories are ultimately expected to revert to the mean. From that perspective, one should search for additional variables to be included in the model to deal with the issue. In our perspective, however, it is exactly these persistent deviations in error terms that we should use to identify regions in which local temporally bounded factors potentially create a region-specific growth path that is not explained by structural factors.
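The persistence criterion, a run of at least k + 1 consecutive periods beyond one standard deviation on the same side of zero, can be sketched as a simple scan over one region's row of standardised error terms. The function name `persistent_deviations` and the returned tuple layout are assumptions made for illustration; the threshold of one follows the definition above.

```python
def persistent_deviations(z_row, k, threshold=1.0):
    """Find runs of at least k + 1 consecutive periods beyond the threshold.

    z_row is one region's sequence of standardised error terms. Returns a
    list of (first_index, last_index, sign) tuples, where sign is +1 for
    runs above the threshold and -1 for runs below minus the threshold.
    """
    runs = []
    start, sign = None, 0
    # A trailing 0.0 acts as a sentinel so that a run reaching the final
    # period is still closed and evaluated.
    for t, z in enumerate(list(z_row) + [0.0]):
        s = 1 if z > threshold else -1 if z < -threshold else 0
        if s != sign:
            if sign != 0 and t - start >= k + 1:
                runs.append((start, t - 1, sign))
            start, sign = t, s
    return runs
```

For example, with k = 3, a region whose first four standardised error terms all exceed one would be flagged for periods 0 to 3, whereas two consecutive values below minus one would not qualify.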

Empirical illustration: do systematic regional growth deviations exist in Sweden?
To demonstrate the methodology empirically, we apply the procedure outlined in Sect. 3 to data on regional employment growth in Sweden in the period between 1990 and 2016.

Data and variables: non-technical summary
The data employed in the analysis are from the Longitudinal Integration Database for Health Insurance and Labour Market Studies (LISA), which is a total-count population data set. LISA integrates annual data from several registers, including education, income, employment, health insurance, and population registers. The data set contains detailed information on individuals across various variables, such as age, education, annual earnings, municipality of residence and employment, and industry of employment. For more technical details, consult the "Appendix". We operationalise regional performance as employment growth over four-year periods (1990-1993; 1991-1994; etc.), computed from $emp_{rt}$, the employment in region r in year t. While we are focusing on employment growth in this illustration exercise, the methodology outlined in Sect. 3 may be applied to regional GDP growth, regional productivity changes, or any other regional performance variable depending on the aims of the analysis.
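As an illustration of how overlapping growth windows can be built from annual employment counts, the sketch below assumes that growth is measured as the relative change between year t and year t + k, with k = 3 for the windows 1990-1993, 1991-1994, and so on. The exact growth formula is not reproduced in this text, so this functional form, like the function name `employment_growth`, is an assumption; a log difference would be an equally plausible choice.

```python
def employment_growth(emp_by_year, k=3):
    """Relative employment change over overlapping windows of k years.

    emp_by_year maps a year to the employment count of one region.
    Returns {start_year: (emp[t + k] - emp[t]) / emp[t]} for every start
    year t whose end year t + k is also observed.
    Assumption: relative change is used as the growth measure.
    """
    return {
        t: (emp_by_year[t + k] - e) / e
        for t, e in sorted(emp_by_year.items())
        if t + k in emp_by_year
    }
```

Applied to a series covering 1990 to 2016, this would yield one growth value per start year from 1990 to 2013.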
The selection of the dependent variable has implications for the definition of spatial units for subsequent analysis, because they should ideally capture functional units (see Sect. 2.1). The data in LISA are originally provided at a municipality level. Yet, as labour market processes often transcend municipal borders, we merge the 290 Swedish municipalities into 90 local labour markets (LLMs). The latter constitute integrated geographical entities within which most interactions between workers and employers occur. In that respect, LLMs are appropriate functional units for linking the supply and demand sides of regional labour markets and explaining their performance. In practice, the boundaries of LLMs are defined by commuting patterns between municipalities through maximising the self-containment of commuting flows (SCB 2010).
The independent variables included in this illustration correspond to the two groups of structural factors identified in the theoretical section of the paper (Sect. 2.1). The first group of variables concerns regional industry mix and agglomeration effects. To position a region on a spectrum of specialisation versus diversification in the regional employment mix, we use three variables often employed in the literature: related variety (diversity in cognitively similar industries), absolute diversity in the regional employment mix (measured by the inverted Hirschman-Herfindahl index), and relative regional specialisation (derived from location quotients of industries weighted by their employment shares within a region). These three variables provide an operationalisation of MAR (specialisation) and Jacobs (diversity) agglomeration externalities. We also include population density and the size of the labour market as variables capturing the third type of agglomeration economies: urbanisation externalities.
The second group of variables focuses on regional competitiveness stemming from local innovation activity. To capture R&D intensity of regional economies, we measure shares of regional employment in high-tech manufacturing and knowledge-intensive services (Eriksson and Hansen 2013). As a measure of regional R&D potential, we also include human capital (measured as the share of regional population with higher education among individuals aged 25+) (Ó hUallacháin 2007).

As control variables, we include some general structural characteristics of local labour markets: we account for the share of employment in manufacturing and the share of public employment to control for the sensitivity of regional labour markets to macroeconomic conditions (Martin 2012). To account for convergence effects, we include the median regional wage. Finally, we account for regional competition for workers by including the number of establishments per worker.
All independent variables (except for those which are shares) are log-transformed. To mitigate endogeneity concerns, all variables are measured at the beginning of each sub-period. Table 1 provides descriptive statistics and correlations for the variables.

Regression results
The intention of this paper is not to evaluate the impact of structural characteristics on regional employment growth per se, but rather to quantify the remaining unexplained variance after accounting for the structural factors. Nonetheless, we provide a brief reflection on the relationship between structural characteristics and employment growth, as a background to the discussion of outliers. Table 2 presents the results of estimating the regression specified in Eq. (1) using the variables summarised in Sect. 4.1.
When it comes to the degree of specialisation versus diversification in the regional employment mix, we observe a positive significant effect of related variety, a negative significant effect of specialisation (measured by the Theil index), and an insignificant effect of absolute diversity. This implies that over the observed time period, it was the regions with sufficiently (but not excessively) diversified employment mixes that were most able to generate employment in Sweden. Employment growth is also positively affected by the degree of urbanisation (measured as population density).
With respect to regional innovativeness and competitiveness, there is (somewhat surprisingly) no significant relationship between employment growth and the share of knowledge-intensive activities (both manufacturing and services) in the region. Nor does the human capital variable tend to exhibit any significant impact.
Finally, with respect to the group of 'structural controls', we observe a significant convergence effect (negative sign for the regional employment variable), a positive (but weakly significant) effect from the share of manufacturing, and a negative effect of the public employment share.

Systematic regional growth deviations
This section presents the analysis of systematic deviations in regional growth. First, we obtain the matrix of prediction errors $E_{90 \times 24}$ as in Eq. (2) and transform it into the matrix of standardised prediction errors $Z_{90 \times 24}$, according to Eq. (3). As outlined above, we identify region-specific growth if the standardised prediction error is above one (or below minus one), i.e. more than one standard deviation from zero, for at least four consecutive years. Table 3 presents all regions that exhibit such systematic deviations from the average growth prediction according to this definition, while Table 6 in the Appendix presents information about all 90 regions in Sweden.
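The identification rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the standardised errors for one region are already available as a one-dimensional array (one value per period), and the names `find_runs`, `threshold`, and `min_len` are hypothetical.

```python
import numpy as np

def find_runs(z_row, threshold=1.0, min_len=4):
    """Return (start, end, sign) for every run of at least `min_len`
    consecutive periods in which the standardised error stays above
    +threshold (sign = +1) or below -threshold (sign = -1)."""
    z_row = np.asarray(z_row, dtype=float)
    # Code each period as +1 (positive outlier), -1 (negative outlier) or 0.
    signs = np.where(z_row > threshold, 1, np.where(z_row < -threshold, -1, 0))
    runs = []
    start = 0
    n = len(signs)
    for i in range(1, n + 1):
        # A run ends at the end of the series or when the code changes.
        if i == n or signs[i] != signs[start]:
            if signs[start] != 0 and i - start >= min_len:
                runs.append((start, i - 1, int(signs[start])))
            start = i
    return runs

# Hypothetical standardised errors for a single region over 14 periods.
z_example = np.array([0.2, 1.3, 1.5, 1.1, 2.0, 0.4, -1.2,
                      -1.4, -1.1, -0.5, -1.3, -1.2, -1.6, -1.1])
example_runs = find_runs(z_example)
```

In this made-up series, the three-period negative spell in the middle is ignored because it is shorter than four periods, while the four-period positive and negative spells are both reported. Applying the function to each row of the standardised error matrix would flag the regions with at least one qualifying run.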
Following this approach, we identify 21 regions that at some point between 1990 and 2016 exhibited a systematic deviation for at least four years in a row. Of these:
• Seven regions (Arvidsjaur, Gällivare, Kiruna, Laxå, Pajala, Säffle, and Vansbro) had periods during which they grew both above and below what would be predicted by their structural preconditions;
• Six regions (Bengtsfors, Emmaboda, Gislaved, Hofors, Sorsele, and Stockholm) exhibited only positive outlier features; and
• Eight regions (Eskilstuna, Haparanda, Hultsfred, Jokkmokk, Olofström, Strömstad, Söderhamn, and Ånge) exhibited only negative outlier features, i.e. periods of growth below the prediction by the structural factors.
Table 2 Employment growth and structural preconditions at the regional level in Sweden, 1990-2016. Robust standard errors clustered at the regional level are reported in brackets. *** (**, *) indicate a significant difference from zero at the 1% (5%, 10%) level.
Table 3 Columns are four-year periods (1990-1993, 1991-1994, 1992-1995, …). − Negative outlier: implies that a region performed substantially worse in terms of employment growth than would be predicted by its structural preconditions. + Positive outlier: implies that a region performed substantially better in terms of employment growth than would be predicted by its structural preconditions.
Looking at temporal and regional patterns, we also derive two further stylised facts:
1. Regions with systematic deviations represent a broad range of size groups, from the metropolitan local labour market of Stockholm on the right side of the distribution (with a population of 2.8 million inhabitants in 2018) to the local labour market of Sorsele on the left side (with a population of 2522 inhabitants in 2018). In that respect, the methodology is not biased towards any particular group of regions with respect to their size.
2. There is no clear temporal pattern in the outliers: we observe both negative and positive outlier tendencies throughout the whole observation period. This suggests that the proposed methodology does a good job of distinguishing region-specific growth from the national growth pattern.
Another way to look at the residuals is to compare them with the observed employment growth (see Fig. 1). One would expect positive outliers to be regions with exceptionally fast growth and negative outliers to be regions with exceptionally slow growth in the respective period. For negative outliers this is, in general, true: there appears to be a strong correlation between the value of the standardised residual and the actual growth (lower left quadrant in Fig. 1). For positive outliers, the situation is more nuanced. Certain regions combined positive employment growth with positive outlier status (upper right quadrant in Fig. 1), but their growth rate is not correlated with the size of the standardised residual. In addition, several regions are lucky losers (lower right quadrant in Fig. 1): they demonstrate low growth performance and yet are positive outliers, implying that they shrank more slowly than their structural preconditions would suggest.
The above illustrates clear patterns where some regions over a period of at least 4 years consistently perform better or worse than could be expected considering their structural preconditions. The residuals are also remarkably robust to which explanatory factors are included in the model. This can be investigated by comparing the residuals of the fully specified model with the residuals of a model that only includes year and regional fixed effects. Figure 2 illustrates the residuals for Stockholm, Gällivare, and Strömstad, which are the regions with the lowest, median, and highest average absolute differences between residuals of the fully specified and the fixed effects only model.
For Stockholm, as well as for Gällivare, the two lines representing the residuals in each year of observation for the two models are very close to each other. For Strömstad, the figure unveils a larger gap between the residuals of the two models even though their trend is similar. Out of the 21 regions with systematic regional growth deviations, 19 regions show small gaps between the residuals of the two models, i.e. resembling the figures for Stockholm and Gällivare. In most cases, therefore, the inclusion or exclusion of structural variables does not alter the residuals in a substantial manner.
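The robustness comparison above can be illustrated with a short Python sketch. It assumes two residual matrices of equal shape (regions in rows, periods in columns), one from the fully specified model and one from the fixed-effects-only model; the function names and the toy numbers are hypothetical and do not reproduce the paper's data.

```python
import numpy as np

def mean_abs_gap(resid_full, resid_fe):
    """Average absolute difference between the two residual series, per region."""
    resid_full = np.asarray(resid_full, dtype=float)
    resid_fe = np.asarray(resid_fe, dtype=float)
    return np.mean(np.abs(resid_full - resid_fe), axis=1)

def lowest_median_highest(gaps):
    """Row indices of the regions with the lowest, median and highest gap.
    With an even number of regions, the upper of the two middle regions
    is returned as the median."""
    order = np.argsort(gaps)
    return int(order[0]), int(order[len(order) // 2]), int(order[-1])

# Toy residuals for three regions over two periods (made-up numbers).
toy_full = np.array([[0.1, 0.2], [1.0, -1.0], [0.5, 0.5]])
toy_fe = np.array([[0.1, 0.3], [0.0, 0.0], [0.3, 0.1]])
toy_gaps = mean_abs_gap(toy_full, toy_fe)
toy_selection = lowest_median_highest(toy_gaps)
```

A small gap for a region means that adding the structural variables barely changes its residuals, which is the pattern the text reports for 19 of the 21 regions.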
A final remark relates to the importance of systematic regional growth deviations, which is an unexpected but still important finding of the empirical illustration. Overall, the potential of the structural variables to explain regional growth variation is low, a fact largely concealed by the relatively high $R^2$ values that are supposed to measure the fit of the model. This has already become apparent from the fact that, in most cases, the residuals of the full model do not deviate substantially from the residuals of a model that only includes the time and regional fixed effects. Another way of illustrating this is by comparing the explained variance from different models.
We use as a starting point a model that only includes year fixed effects and calculate the sum of the squared residuals. Then, we include regional fixed effects (i.e. a dummy variable for each region) and again sum the squared residuals. This reduces the sum of squared residuals (in other words, the regional variation) by 32%. Then, we add all other explanatory factors shown in Table 2 and again calculate the sum of the squared residuals. All structural variables and the regional fixed effects together account for 42% of regional variation as compared to the model with only year fixed effects. This means that in total 58% of regional variation remains unexplained. While this includes stochastic disturbances as well as more systematic deviations in regional growth, it suggests that we should seriously consider why regions show idiosyncratic growth paths that cannot be explained by general structural factors.
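The nested comparison of squared residuals can be sketched as follows. This is an illustrative Python example on synthetic data, not the estimation behind Table 2: the panel dimensions, the single covariate `x`, and all helper names are assumptions made for the sketch.

```python
import numpy as np

def ssr(y, X):
    """Sum of squared residuals from an ordinary least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def dummies(codes):
    """One dummy column per distinct value in `codes`."""
    codes = np.asarray(codes)
    levels = np.unique(codes)
    return (codes[:, None] == levels[None, :]).astype(float)

# Synthetic panel: 6 regions observed over 8 periods (made-up data).
rng = np.random.default_rng(0)
n_regions, n_periods = 6, 8
region = np.repeat(np.arange(n_regions), n_periods)
period = np.tile(np.arange(n_periods), n_regions)
x = rng.normal(size=region.size)  # stand-in for the structural variables
y = (0.5 * rng.normal(size=n_regions)[region]
     + 0.3 * rng.normal(size=n_periods)[period]
     + 0.2 * x
     + rng.normal(size=region.size))

X_year = dummies(period)                                # year fixed effects only
X_year_region = np.hstack([X_year, dummies(region)])    # plus regional fixed effects
X_full = np.hstack([X_year_region, x[:, None]])         # plus structural variable

ssr_year = ssr(y, X_year)
share_region = 1 - ssr(y, X_year_region) / ssr_year  # reduction from regional dummies
share_full = 1 - ssr(y, X_full) / ssr_year           # reduction from the full model
share_unexplained = 1 - share_full
```

In the paper's illustration the corresponding quantities are 32% for the regional fixed effects, 42% for the full model, and hence 58% unexplained. The least-squares routine used here tolerates the collinearity between the two sets of dummies, so no reference category needs to be dropped for the residuals to be correct.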


Conclusions
While regional development research has traditionally mainly been preoccupied with identifying regularities explaining growth across regions, this paper turns the attention to the outliers in regional growth regressions. From a theoretical perspective, there are many reasons to expect regions to exhibit idiosyncratic growth patterns. Regional development is a function of a complex web of intra- and extra-regional endowments of knowledge, resources and networks, characterised by mutual dependencies and interactions across many factors. Hence, some regions can be expected to outperform their peers over shorter or longer periods, while others lag behind.
Regional growth patterns therefore need to be investigated more closely. We propose an approach to detect systematic regional growth deviations. Residuals can be used to identify short-term and medium-term trends that are not explained by the included structural variables. When the residuals are sufficiently large and maintain the same direction over a period of several years, they are an indication of a region-specific growth component that deviates systematically from the expected average growth performance.
Swedish register data from 1990 to 2016 are used to illustrate the approach and detect regions that exhibit growth deviations in the short and medium term. These come in all shapes and sizes, from the capital to tiny peripheral regions. They encompass positive outliers, negative outliers, and regions that are both during the period of observation. Outliers are not limited to a certain phase of economic transition, but appear throughout the whole study period. Furthermore, the analysis shows that the residuals are not heavily affected by the inclusion or exclusion of other variables in the model. This suggests that residuals do indeed hold valuable information for detecting outlier regions.
Moreover, it is noteworthy how large a share of regional growth variation remains unexplained in standard growth regressions. Considering the total regional variation, 58% remains unexplained after considering all structural factors that have received primary attention in the recent literature. Of the 42% of regional variation explained by the structural variables and regional fixed effects together, the largest part (32 percentage points) can be attributed to regional fixed effects. Yet, it needs to be kept in mind that the empirical illustration is limited to a rather small and highly developed country. On the one hand, good data availability allowed us to include a comprehensive set of up-to-date indicators. On the other hand, it cannot be guaranteed that the findings hold in the same way in different contexts until similar studies have been conducted there. However, the proposed methodology can in any case be useful to investigate whether, when, and where systematic regional growth deviations exist.
This has implications for quantitative and qualitative research in economic geography. First, the method serves as a tool to identify when and where general structural factors are ill-equipped to explain regional growth. It unveils extreme cases of unexpected growth (or decline), from which substantial new knowledge can be gained through in-depth case studies (Eisenhardt and Graebner 2007). Further research about these extreme cases may lead to the discovery of so far unobserved structural factors, or a further validation of the importance of region-specific growth trajectories due to human agency and the unique combination of conditions in a region at a given time.
Second, the findings pose a challenge for quantitative studies besides what has been identified in the economics literature (Breinlich et al. 2014a, b). If the combination of intra- and extra-regional factors constitutes opportunities and constraints for growth that are region- and time-specific, and if actors perceive and act upon those in a variegated manner (Grillitsch and Sotarauta 2019), regional pathways are expected to emerge that have little to do with modelled averages in regional growth regressions. Such estimations do not fit well with the "wide diversity of regional trajectories" argued for in evolutionary economic geography (Boschma 2004, p. 1008). New methods are needed that are both closer to this theoretical understanding and better equipped for reducing the so far unexplained regional growth variations.
Acknowledgements The methodological approach we describe in this article was developed, tested, and refined during the Regional Growth Against All Odds (ReGrow) project and we thank our collaborators in Norway, Finland, and Sweden for valuable comments: Jari Kolehmainen, Heli Kurikka, Karl-Johan Lundquist, Hjalti Nielsen, Magnus Nilsson, Josephine Rekers, Sami Sopanen, Markku Sotarauta, and Linda Stihl.
Funding Open Access funding provided by Lund University. This research was supported by a grant from Länsförsäkringar Alliance Research Foundation, Sweden (Grant Number: 2017/01/011).
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Literature review
See Table 4.

Data and definitions
The data employed in this paper come from the Longitudinal Integration Database for Health Insurance and Labour Market Studies (LISA), an anonymised linked employer-employee database that aims to complement traditional labour market statistics and to provide a better description of the labour market and people's relationship to the world of work (SCB 2016). It is a total-count individual register: all individuals registered in Sweden on December 31 each year are included in the population for the reference year. LISA is a longitudinal database, meaning that the data for the same person can be linked for all years in which she is included in the population.

LISA integrates the annual data from several registers, including education, income, employment, health insurance, and population registers. The connection of an employee to an employer is denoted by the identity number of the firm and the establishment where she has her main employment. The data also contain detailed information on various individual variables, such as age, education, annual earnings, municipality of residence and employment, and industry of employment. Annual data cover the period between 1990 and 2016. Table 5 as follows.
Having three industry classifications required ensuring data consistency over time. As SNI2002 was the result of a minor revision of SNI1992, this was solved by manually merging the activity classifications at the five-digit level and further aggregating them into 505 four-digit industries. When it comes to ensuring the compatibility of NACE Rev. 1.1 and NACE Rev. 2.0, direct conversion between the two classification systems does not work very well. As there are considerably more industries in NACE Rev. 2.0, switching between classification systems may lead to breaks in the values of some independent variables, which may reduce the predictive power of the empirical model. We discuss how we deal with this issue when presenting individual independent variables.

Dependent variable
We operationalise regional performance as employment growth over four-year periods (1990-1993; 1991-1994; etc.), computed from $emp_{rt}$, the employment in region $r$ in year $t$.

Related variety
The conventional approach derives measures of related variety from the hierarchical structure of official industry classifications. That is, industries are considered more related when they share more digits in the industry classification. Its fundamental weakness is that by assuming cognitive similarity to exist only between industries sharing some digits in industry classifications, it underestimates the span of knowledge spillover channels between industries (Firgo and Mayerhofer 2017). Furthermore, such measures of related variety disregard potential intertemporal dynamics in the relatedness linkages between various industries in the process of technological development (Martynovich 2016). Therefore, in long-term studies, it is more reasonable to employ measures of related variety based on revealed relatedness (Kuusk and Martynovich 2020). Following Neffke and Henning (2013), we claim that excessive exchange of labour between two industries signals overlapping skill requirements between them and indicates that these industries are related. Let $F_{ijt}$ be the observed (actual) flow of labour between industries $i$ and $j$ at time $t$ and $\hat{F}_{ijt}$ the expected flow of labour between them derived from industry sizes, growth, and average wages. Then, values of the ratio of observed to predicted flows, $F_{ijt} / \hat{F}_{ijt}$, that are significantly larger than 1 indicate that industries $i$ and $j$ are skill-related. As we expect the network of related industries to evolve over time, we iterate this procedure for 24 periods: 1990-1993, 1991-1994, 1992-1995, …, 2013-2016.
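The ratio of observed to expected labour flows can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the expected flows are taken as given rather than estimated from industry sizes, growth, and wages, no significance test is performed, and the function name and toy matrices are hypothetical.

```python
import numpy as np

def relatedness_ratio(observed, expected):
    """Element-wise ratio of observed to expected labour flows between
    industries. Pairs with a non-positive expected flow, and the diagonal
    (an industry with itself), are set to NaN."""
    observed = np.asarray(observed, dtype=float)
    expected = np.asarray(expected, dtype=float)
    ratio = np.full(observed.shape, np.nan)
    np.divide(observed, expected, out=ratio, where=expected > 0)
    np.fill_diagonal(ratio, np.nan)
    return ratio

# Toy flows between three industries in one period (made-up numbers).
toy_observed = np.array([[0, 30, 5], [30, 0, 10], [5, 10, 0]])
toy_expected = np.array([[0, 10, 10], [10, 0, 10], [10, 10, 0]])
toy_ratio = relatedness_ratio(toy_observed, toy_expected)

# Crude flag: ratio above 1. The text requires the ratio to be
# significantly larger than 1, which this sketch does not test.
toy_related = np.greater(toy_ratio, 1.0)
```

In the toy example only the first pair of industries exchanges more labour than expected, so it is the only pair flagged. Repeating the calculation for each of the 24 periods would give a relatedness network that is allowed to change over time.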
From the constructed linkage metrics, we derive the regional measure of related variety by weighting the metrics according to Fitjar and Timmermans (2017), using $s_{ijrt}$, a measure of inter-industry relatedness between industry $i$ and other industries $j$ ($i \neq j$) present in region $r$ in time period $t$; $q_{irt}$, the share of industry $i$ in regional employment in time period $t$; and $N_{rt}$, the number of industries present in the region in time period $t$. Using the information on the presence of related ties between industries within the region, this indicator allows measuring the overall degree of related variety in the regional economy. In broad terms, it represents the (weighted) average number of related industries per industry.
This indicator is calculated for four-digit industries. We can therefore expect that the switch between NACE Rev. 1.1 and NACE Rev. 2.0 will substantially affect the value of the variable. We therefore propose to correct the RV value according to the following procedure:

Other industry mix and agglomeration variables
To measure the absolute diversity in the regional employment mix, we calculate the reverse Hirschman-Herfindahl index from $Q_{irt}$, the employment share of two-digit industry $i$ in region $r$ in time period $t$.
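As an illustration of a diversity measure of this kind, the following Python sketch computes the Hirschman-Herfindahl index from employment shares and then takes its reciprocal. The exact form of the reverse index used in the paper is not shown here, so the reciprocal is an assumption (another common convention is one minus the index), and the function names are hypothetical.

```python
import numpy as np

def employment_shares(employment):
    """Convert industry employment counts in one region into shares."""
    employment = np.asarray(employment, dtype=float)
    return employment / employment.sum()

def hhi(shares):
    """Hirschman-Herfindahl index: the sum of squared employment shares."""
    return float(np.sum(np.asarray(shares, dtype=float) ** 2))

def inverse_hhi(shares):
    """Reciprocal of the index (assumed convention). Higher values mean a
    more diverse employment mix; with k equally sized industries it equals k."""
    return 1.0 / hhi(shares)

# Toy regions: four equally sized industries versus a single industry.
diverse_region = inverse_hhi(employment_shares([25, 25, 25, 25]))
single_industry_region = inverse_hhi(employment_shares([10]))
```

Under this convention, the measure can be read as the equivalent number of equally sized industries in the region.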
Following van Oort et al. (2015), we include the Theil index (sum of location quotients of the SNI two-digit industries weighted by their employment shares within a region) as a measure of relative regional specialisation. It is calculated from $Q_{grt}$, the employment share of two-digit industry $g$ in region $r$ in sub-period $t$, and $Q_{gt}$, the employment share of two-digit industry $g$ in national employment in time period $t$. While this index has the drawback of not accounting for the absolute size of particular sectors in the region, it has been proven to be a robust estimator of localisation economies.
The difference between the two latter measures is that while the absolute diversity measure reflects the concentration of employment within a region, the Theil index transforms the individual sectoral concentration measures into a generalised between-region specialisation measure. As both the reverse Hirschman-Herfindahl and Theil indices are calculated at the two-digit level, we do not expect much disruption in the values of the variables when the industry classification scheme is switched (as the number of industries at the two-digit level is comparable).
Urbanisation externalities are captured by population density of regions in each respective year. These data come from the official public database of Statistics Sweden.
Human capital effects on regional employment dynamics are captured by the share of regional population with higher education (within the group of workers aged 25+):
$$human\_cap_{rt} = \frac{HE\_pop\_25{+}_{rt}}{pop\_25{+}_{rt}}$$
The shares of regional employment in high-tech manufacturing and knowledge-intensive services are defined analogously:
$$Htechmanu\_share_{rt} = \frac{htechmanu\_emp_{rt}}{emp_{rt}}$$
$$KIserv\_share_{rt} = \frac{KIserv\_emp_{rt}}{emp_{rt}}$$

Control variables
It has been claimed that 'manufacturing and construction industries have been viewed as being more cyclically sensitive than private service industries, and the latter more sensitive than public sector services' (Martin 2012). Moreover, public employment protection mechanisms may prevent a contraction in output from translating into a proportional decline in employment in the regions where a larger share of employment is concentrated in the public sector. More stringent employment protection regulations and less flexible labour markets may shelter the regional economy from temporary shocks (Groot et al. 2011). We therefore account for the share of employment in manufacturing and the share of public employment to control for the sensitivity of regional labour markets to the macroeconomic conditions. These variables are defined in the following way:
$$Manu\_share_{rt} = \frac{manu\_emp_{rt}}{emp_{rt}}$$
$$Pub\_share_{rt} = \frac{public\_emp_{rt}}{emp_{rt}}$$
We also control for economic convergence by including measures of the median regional wage level and regional absolute employment. The expectation is that employment will, ceteris paribus, grow more rapidly (in per cent):
• In regions with lower economic development (and, thus, lower median wage) levels;
• In regions with lower absolute employment.
Finally, we account for the level of regional competition for workers, which is defined as the number of establishments per worker:
$$Comp_{rt} = \frac{N\_est_{rt}}{emp_{rt}}$$
where $emp_{rt}$ is employment in region $r$ in sub-period $t$ and $N\_est_{rt}$ is the number of establishments (plants) in region $r$ in sub-period $t$.

Outlier regions
See Table 6.
Table 6 Full list of outlier regions
− Negative outlier: implies that a region performed substantially worse in terms of employment growth than would be predicted by its structural preconditions
+ Positive outlier: implies that a region performed substantially better in terms of employment growth than would be predicted by its structural preconditions