Industrial Growth and Productivity Change in German Cities: A Multilevel Investigation

Chapter in: The Evolution of Economic and Innovation Systems

Part of the book series: Economic Complexity and Evolution (ECAE)

Abstract

The roles of productivity change and city-specific characteristics in economic growth are analyzed for German cities. Productivity change is measured by the Malmquist index and its components, which are estimated by non-parametric data envelopment analysis. The nested structure, as well as the interaction between industries within cities and over time, is accounted for by estimating multilevel models. Industrial growth is shown to differ across cities and years, which makes the use of multilevel models necessary. Schumpeter’s creative destruction is found to hold for the effect of efficiency change on industrial growth. Efficiency change measures the catching-up to the best practice production function and reduces both value added growth and employment growth. Technological progress shifts the best practice production function and raises only value added growth, not employment growth. The estimations indicate that growth of urban industrial value added converges while employment growth diverges.

Notes

  1. A list of the included cities is given in the Appendix.

  2. The database is available at https://www.regionalstatistik.de/genesis.

  3. The database is available on CD-ROM upon request to the Federal Agency of Building and Urban Development at http://www.bbsr.bund.de.

  4. The database is available on the Internet at http://stats.oecd.org.

References

  • Aghion P, Howitt PW (2009) The economics of growth. MIT, Cambridge

  • Badunenko O (2010) Downsizing in the German chemical manufacturing industry during the 1990s. Why is small beautiful? Small Bus Econ 34:413–431

  • Batabyal AA, Nijkamp P (2012) Retraction of “a Schumpeterian model of entrepreneurship, innovation, and regional economic growth”. Int Reg Sci Rev 35:464–486

  • Batabyal AA, Nijkamp P (2013) A multi-region model of economic growth with human capital and negative externalities in innovation. J Evol Econ 23:909–924

  • Boschma RA, Lambooy JG (1999) Evolutionary economics and economic geography. J Evol Econ 9:411–429

  • Boschma RA, Frenken K (2011) The emerging empirics of evolutionary economic geography. J Econ Geogr 11:295–307

  • Bryk AS, Raudenbush SW (1988) Toward a more appropriate conceptualization of research on school effects: a three-level hierarchical linear model. Am J Edu 97:65–108

  • Charnes A, Cooper WW, Rhodes E (1978) Measuring the efficiency of decision making units. Eur J Oper Res 2:429–444

  • Coelli TJ, Rao DP, O’Donnell CJ, Battese GE (2005) An introduction to efficiency and productivity analysis, 2nd edn. Springer, New York

  • Cullmann A, von Hirschhausen C (2008) Efficiency analysis of east European electricity distribution in transition: legacy of the past? J Prod Anal 29:155–167

  • Dietrich A (2009) Does growth cause structural change, or is it the other way round? A dynamic panel data analyses for seven OECD countries. Jena Economic Research Papers 2009-034

  • Dopfer K, Foster J, Potts J (2004) Micro-meso-macro. J Evol Econ 14:263–279

  • Farrell MJ (1957) The measurement of productive efficiency. J R Stat Soc Ser A 120:253–281

  • Fratesi U (2010) Regional innovation and competitiveness in a dynamic representation. J Evol Econ 20:515–552

  • Färe R, Grosskopf S, Lindgren B, Roos P (1992) Productivity changes in Swedish pharmacies 1980–1989: a non-parametric Malmquist approach. J Prod Anal 3:85–101

  • Färe R, Grosskopf S, Norris M (1997) Productivity growth, technical progress, and efficiency change in industrialized countries: reply. Am Econ Rev 87:1040–1043

  • Frenken K, Boschma RA (2007) A theoretical framework for evolutionary economic geography: industrial dynamics and urban growth as a branching process. J Econ Geogr 7:635–649

  • Gaffard J-L (2008) Innovation, competition, and growth: Schumpeterian ideas within a Hicksian framework. J Evol Econ 18:295–311

  • Giovannetti G, Ricchiuti G, Velucchi M (2009) Location, internationalization and performance of firms in Italy: a multilevel approach. Universita’ degli Studi di Firenze, Dipartimento di Scienze Economiche, Working Papers Series N. 09/2009

  • Glaeser EL, Kallal HD, Scheinkman JA, Shleifer A (1992) Growth in cities. J Polit Econ 100:1126–1152

  • Goedhuys M, Srholec M (2010) Understanding multilevel interactions in economic development. TIK Working Papers on Innovation Studies No. 20100208

  • Harville DA (1977) Maximum likelihood approaches to variance component estimation and to related problems. J Am Stat Assoc 72:320–338

  • Henderson JV (1997) Externalities and industrial development. J Urban Econ 42:449–470

  • Henderson JV, Kuncoro A, Turner M (1995) Industrial development in cities. J Polit Econ 103:1067–1090

  • Hox JJ (1998) Multilevel modeling: when and why. In: Balderjahn I, Mathar R, Schader M (eds) Classification, data analysis, and data highways. Springer, New York, pp 147–154

  • Hox JJ (2002) Multilevel analysis: techniques and applications. Erlbaum, Mahwah, NJ

  • Ieno EN, Luque PL, Pierce GJ, Zuur AF, Santos MB, Walker NJ, Saveliev AA, Smith G (2009) Three-way nested data for age determination techniques applied to cetaceans. In: Zuur AF, Ieno EN, Walker NJ, Saveliev AA, Smith GM (eds) Mixed effects models and extensions in ecology with R. Springer, New York, Chapter 20, pp 459–492

  • Illy A, Schwartz M, Hornych C, Rosenfeld MTW (2011) Local economic structure and sectoral employment growth in German cities. Tijdschrift voor Economische en Sociale Geografie 102:582–593

  • Laird N, Lange N, Stram D (1987) Maximum likelihood computations with repeated measures: application of the EM algorithm. J Am Stat Assoc 82:97–105

  • Laird NM, Ware JH (1982) Random-effects models for longitudinal data. Biometrics 38:963–974

  • Lindstrom MJ, Bates DM (1988) Newton–Raphson and EM algorithms for linear mixed-effects models for repeated-measures data. J Am Stat Assoc 83:1014–1022

  • Lindstrom MJ, Bates DM (1990) Nonlinear mixed effects models for repeated measures data. Biometrics 46:673–687

  • Maas CJM, Hox JJ (2005) Sufficient sample sizes for multilevel modeling. Methodology 1:86–92

  • MacKinnon JG, White H (1985) Some heteroskedasticity-consistent covariance matrix estimators with improved finite sample properties. J Econ 29:305–325

  • Malmquist S (1953) Index numbers and indifference surfaces. Trabajos de Estadística y de Investigación Operativa 4:209–242

  • Martin R, Sunley P (2006) Path dependence and regional economic evolution. J Econ Geogr 6:395–437

  • McFadden D (1973) Conditional logit analysis of qualitative choice behavior. In: Zarembka P (ed) Frontiers in econometrics. Academic Press, New York, pp 105–142

  • Moerbeek M, Breukelen GJP, Berger MPF (2000) Design issues for experiments in multilevel populations. J Educ Behav Stat 25:271–284

  • Moomaw RL (1981) Productivity and city size: a critique of evidence. Q J Econ 96:675–688

  • Noseleit F (2013) Entrepreneurship, structural change, and economic growth. J Evol Econ 23:735–766

  • Park WG (1995) International R&D spillovers and OECD economic growth. Econ Inq 33:571–591

  • Pinheiro JC, Bates D (2000) Mixed-effects models in S and S-plus. Springer, New York

  • Pinheiro JC, Bates D, DebRoy S, Sarkar D, R Development Core Team (2013) nlme: linear and nonlinear mixed effects models. R package version 3.1-111

  • Raudenbush SW, Bryk AS (2002) Hierarchical linear models, application and data analysis methods, Advanced Quantitative Techniques in the Social Science Series, 2nd edn. Sage, Thousand Oaks

  • Ray SC, Desli E (1997) Productivity growth, technical progress, and efficiency change in industrialized countries: comment. Am Econ Rev 87:1033–1039

  • Roy A, Bhaumik DK, Aryal S, Gibbons RD (2007) Sample size determination for hierarchical longitudinal designs with different attrition rates. Biometrics 63:699–707

  • Rozenblat C (2012) Opening the black box of agglomeration economies for measuring cities’ competitiveness through international firm networks. Urban Stud 47:2841–2865

  • Saviotti PP, Pyka A (2004) Economic development by the creation of new sectors. J Evol Econ 14:1–35

  • Schumpeter JA (1934) The theory of economic development. Harvard University Press, Cambridge, MA

  • Schumpeter JA (1939) Business cycles. McGraw Hill, New York

  • Simar L, Wilson PW (1998) Sensitivity analysis of efficiency scores: how to bootstrap in nonparametric frontier models. Manag Sci 44:49–61

  • Simar L, Wilson PW (1999) Estimating and bootstrapping Malmquist indices. Eur J Oper Res 115:459–471

  • Simar L, Wilson PW (2002) Non-parametric test of returns to scale. Eur J Oper Res 139:115–132

  • Simar L, Wilson PW (2007) Estimations and inference in two-stage, semi-parametric models of production processes. J Econ 136:31–64

  • Srholec M (2010) A multilevel approach to geography of innovation. Reg Stud 44:1208–1220

  • Srholec M (2011) A multilevel analysis of innovation in developing countries. Ind Corp Chang 22:1539–1569

  • Stamer M (1999) Strukturwandel und wirtschaftliche Entwicklung in Deutschland, den USA und Japan. Shaker, Aachen

  • Statistisches Bundesamt (2003) German classification of economic activities, Edition 2003 (WZ 2003). Statistisches Bundesamt, Wiesbaden

  • Sveikauskas LA (1975) The productivity of cities. Q J Econ 89:392–413

  • Tabachnick BG, Fidell LS (2007) Using multivariate statistics, 5th edn. Pearson International Edition, Boston

  • Thanassoulis E, Portela MCS, Despic O (2008) Data envelopment analysis: the mathematical programming approach to efficiency analysis. In: Fried HO, Lovell CAK, Schmidt SS (eds) The measurement of productive efficiency and productivity growth. Oxford, New York, Chapter 3, pp 251–420

  • Ware JH (1985) Linear models for the analysis of longitudinal studies. Am Stat 39:95–101

  • West BT, Welch KB, Galecki AT (2007) Linear mixed models: a practical guide using statistical software. Chapman & Hall/CRC Taylor & Francis Group, Boca Raton, FL

  • Wheelock DC, Wilson PW (1999) Technical progress, inefficiency, and productivity change in U.S. banking, 1984–1993. J Money Credit Bank 31:212–234

  • Wilson PW (2008) FEAR 1.0: a software package for frontier efficiency analysis with R. Socio Econ Plan Sci 42:247–254

  • Zuur AF, Gende LB, Ieno EN, Fernández NJ, Eguaras MJ, Fritz R, Walker NJ, Saveliev AA, Smith GM (2009) Mixed effects modelling applied on American foulbrood affecting honey bees larvae. In: Mixed effects models and extensions in ecology with R, Chapter 19, pp 447–458

  • Zuur AF, Ieno EN, Walker NJ, Saveliev AA, Smith GM (2009) Mixed effects models and extensions in ecology with R. Springer, New York

Acknowledgements

I would like to thank the participants at the 9th ACDD in Strasbourg, France, the 14th International Joseph A. Schumpeter Society Conference in Brisbane, Australia, the 9th ISNE conference in Cork, Ireland, and the 2nd ifo workshop on regional economics in Dresden, Germany, for their fruitful discussion on earlier drafts of this paper. Additionally, I would like to thank an anonymous referee and Jens J. Krüger for reading a previous version and for their helpful comments. The usual disclaimer applies.

Author information

Corresponding author

Correspondence to Stephan Hitzschke.

Appendices

Appendix 1: List of Cities Included (Table 12)

Table 12 Cities included with average population

Appendix 2: Multilevel Model Estimation

In general, and in the formulation of Pinheiro and Bates (2000), a three-level model with two levels of random effects is written as

$$\displaystyle{ y_{\mathit{ijt}} =\boldsymbol{ X}_{\mathit{ijt}}\boldsymbol{\beta }_{\mathit{ijt}} +\boldsymbol{ Z}_{\mathit{ij},t}\boldsymbol{b}_{\mathit{ij}} +\boldsymbol{ Z}_{\mathit{ijt}}\boldsymbol{b}_{\mathit{ijt}} + e_{\mathit{ijt}}, }$$
(37)

with \(i = 1,\ldots,N\), \(j = 1,\ldots,n\), and \(t = 2,\ldots,T\), and \(\boldsymbol{b}_{\mathit{ij}} \sim N\left (\mathbf{0},\boldsymbol{\Sigma }_{1}\right )\), \(\boldsymbol{b}_{\mathit{ijt}} \sim N\left (\mathbf{0},\boldsymbol{\Sigma }_{2}\right )\), \(\boldsymbol{e}_{\mathit{ijt}} \sim N\left (0,\sigma ^{2}\boldsymbol{I}\right )\). For simplification, the number of observations is the same for every level and group, so that no observation is missing and the group size does not vary across lower-level groups. In the mixed or random effects literature, Eq. (37) is written in vector notation for all i as

$$\displaystyle{ \boldsymbol{y}_{\mathit{jt}} =\boldsymbol{ X}_{\mathit{jt}}\boldsymbol{\beta }_{\mathit{jt}} +\boldsymbol{ Z}_{j,t}\boldsymbol{b}_{j} +\boldsymbol{ Z}_{\mathit{jt}}\boldsymbol{b}_{\mathit{jt}} +\boldsymbol{ e}_{\mathit{jt}}. }$$
(38)

Equation (37) and accordingly Eq. (38) incorporate \(\boldsymbol{X}_{\mathit{jt}}\), the regressor matrix for the vector of the p fixed effects \(\boldsymbol{\beta }_{\mathit{jt}}\); \(\boldsymbol{Z}_{j,t}\), the regressor matrix for the random effects \(\boldsymbol{b}_{j}\) of the second level; and \(\boldsymbol{Z}_{\mathit{jt}}\), the regressor matrix for the random effects \(\boldsymbol{b}_{\mathit{jt}}\) of the third level. The variance-covariance matrices \(\boldsymbol{\Sigma }_{l}\), l = 1, 2, one for each of the two levels of random effects, have to be symmetric and positive definite and can be expressed as \(\sigma ^{2}\boldsymbol{D}_{l}\), with \(\sigma ^{2}\) the variance of the error term and \(\boldsymbol{D}_{l}\) a scaled variance-covariance matrix for the random effects of level l.

The estimation procedure is developed from the simple model with one level of random effects to two levels of random effects and can be extended by further levels of random effects.

For one level of random effects with l = 1 the calculation is performed as follows. The general model equation without the third level denoted with t or the second level of random effects is in vector notation

$$\displaystyle{ y_{\mathit{ij}} =\boldsymbol{ X}_{\mathit{ij}}\boldsymbol{\beta }_{\mathit{ij}} +\boldsymbol{ Z}_{\mathit{ij}}\boldsymbol{b}_{\mathit{ij}} + e_{\mathit{ij}}, }$$
(39)

for \(i = 1,\ldots,N\), \(j = 1,\ldots,n\), where \(\boldsymbol{X}_{\mathit{ij}}\) is the \(\left (N \cdot n \times p\right )\) regressor matrix for the \(\left (p \times 1\right )\) vector of fixed effects \(\boldsymbol{\beta }_{\mathit{ij}}\) and \(\boldsymbol{Z}_{\mathit{ij}}\) is the \(\left (N \cdot n \times q\right )\) regressor matrix for the q random effects \(\boldsymbol{b}_{\mathit{ij}}\). Stacking all i into a vector gives

$$\displaystyle{ \boldsymbol{y}_{j} =\boldsymbol{ X}_{j}\boldsymbol{\beta }_{j} +\boldsymbol{ Z}_{j}\boldsymbol{b}_{j} +\boldsymbol{ e}_{j}, }$$
(40)

for \(j = 1,\ldots,n\). As Lindstrom and Bates (1988) show in general, without restricting the error term structure, \(\boldsymbol{e}_{j} \sim N\left (\mathbf{0},\sigma ^{2}\boldsymbol{\Lambda }_{j}\right )\), where \(\boldsymbol{\Lambda }_{j}\) is of size N × N and does not have to be the identity matrix \(\boldsymbol{I}\), so that

$$\displaystyle{y_{j}\vert \boldsymbol{b}_{j} \sim N\left (\boldsymbol{X}_{j}\boldsymbol{\beta }_{j} +\boldsymbol{ Z}_{j}\boldsymbol{b}_{j},\sigma ^{2}\boldsymbol{\Lambda }_{ j}\right ),\,j = 1,\,\ldots,\,n.}$$

For all j, it becomes in vector notation

$$\displaystyle{\boldsymbol{y}\vert \boldsymbol{b} \sim N\left (\boldsymbol{X}\boldsymbol{\beta } +\boldsymbol{ Z}\boldsymbol{b},\sigma ^{2}\boldsymbol{\Lambda }\right )}$$

with \(\boldsymbol{Z} =\mathrm{ diag}\left (\boldsymbol{Z}_{1},\boldsymbol{Z}_{2},\ldots,\boldsymbol{Z}_{n}\right )\), \(\boldsymbol{\Lambda } =\mathrm{ diag}\left (\boldsymbol{\Lambda }_{1},\boldsymbol{\Lambda }_{2},\ldots,\boldsymbol{\Lambda }_{n}\right )\), and \(\boldsymbol{b} \sim N\left (0,\sigma ^{2}\boldsymbol{\Sigma }\right )\), so that marginally

$$\displaystyle{ \boldsymbol{y} \sim N\left (\boldsymbol{X}\boldsymbol{\beta },\boldsymbol{D}\right ),\boldsymbol{D} =\sigma ^{2}\left (\boldsymbol{Z}\boldsymbol{\Sigma }\boldsymbol{Z}^{{\prime}} +\boldsymbol{ \Lambda }\right ) }$$
(41)

The likelihood function is

$$\displaystyle{ L\left (\boldsymbol{\beta },\boldsymbol{\theta },\sigma ^{2}\vert \boldsymbol{y}\right ) =\prod _{ j=1}^{n}p\left (y_{ j}\vert \boldsymbol{\beta },\boldsymbol{\theta },\sigma ^{2}\right ). }$$
(42)

In Eq. (42), \(\boldsymbol{\theta }\) contains the unique elements of \(\boldsymbol{\Sigma }\) and the parameters in \(\boldsymbol{\Lambda }\), which are the variance components without exact specification (Harville 1977; Lindstrom and Bates 1990). Because \(\boldsymbol{b}_{j}\) and \(\boldsymbol{e}_{j}\) are independent, as Eq. (41) indicates, Eq. (42) results in

$$\displaystyle\begin{array}{rcl} L\left (\boldsymbol{\beta },\boldsymbol{\theta },\sigma ^{2}\vert \boldsymbol{y}\right )& =& \prod _{j=1}^{n}\int p\left (\boldsymbol{y}_{j}\vert \boldsymbol{b}_{j},\boldsymbol{\beta },\sigma ^{2}\right )p\left (\boldsymbol{b}_{j}\vert \boldsymbol{\theta },\sigma ^{2}\right )d\boldsymbol{b}_{j} \\ & =& \prod _{j=1}^{n}\int \frac{\exp \left (-\left \Vert \boldsymbol{y}_{j} -\boldsymbol{X}_{j}\boldsymbol{\beta } -\boldsymbol{Z}_{j}\boldsymbol{b}_{j}\right \Vert ^{2}/2\sigma ^{2}\right )} {\left (2\pi \sigma ^{2}\right )^{N/2}} \\ & & \times \frac{\exp \left (-\boldsymbol{b}_{j}^{{\prime}}\boldsymbol{D}^{-1}\boldsymbol{b}_{j}/2\sigma ^{2}\right )} {\left (2\pi \sigma ^{2}\right )^{q/2}\sqrt{\left \vert \boldsymbol{D}\right \vert }} d\boldsymbol{b}_{j} \\ & =& \prod _{j=1}^{n} \frac{1} {\left (2\pi \sigma ^{2}\right )^{N/2}} \\ & & \times \int \frac{\exp \left [\frac{-1} {2\sigma ^{2}} \left (\left \Vert \boldsymbol{y}_{j} -\boldsymbol{X}_{j}\boldsymbol{\beta } -\boldsymbol{Z}_{j}\boldsymbol{b}_{j}\right \Vert ^{2} +\boldsymbol{b}_{j}^{{\prime}}\boldsymbol{D}^{-1}\boldsymbol{b}_{j}\right )\right ]} {\left (2\pi \sigma ^{2}\right )^{q/2}\sqrt{\left \vert \boldsymbol{D}\right \vert }} d\boldsymbol{b}_{j} \\ & =& \prod _{j=1}^{n} \frac{1} {\left (2\pi \sigma ^{2}\right )^{N/2}} \\ & & \times \int \frac{\exp \left [\frac{-1} {2\sigma ^{2}} \left (\left \Vert \boldsymbol{y}_{j} -\boldsymbol{X}_{j}\boldsymbol{\beta } -\boldsymbol{Z}_{j}\boldsymbol{b}_{j}\right \Vert ^{2} + \left \Vert \boldsymbol{\Delta }\boldsymbol{b}_{j}\right \Vert ^{2}\right )\right ]} {\left (2\pi \sigma ^{2}\right )^{q/2}\mathrm{abs}\left \vert \boldsymbol{\Delta }\right \vert ^{-1}} d\boldsymbol{b}_{j} \\ & =& \prod _{j=1}^{n} \frac{\mathrm{abs}\left \vert \boldsymbol{\Delta }\right \vert } {\left (2\pi \sigma ^{2}\right )^{N/2}} \\ & & \times \int \frac{\exp \left [\frac{-1} {2\sigma ^{2}} \left \Vert \tilde{\boldsymbol{y}}_{j} -\tilde{\boldsymbol{X}}_{j}\boldsymbol{\beta } -\tilde{\boldsymbol{Z}}_{j}\boldsymbol{b}_{j}\right \Vert ^{2}\right ]} {\left (2\pi \sigma ^{2}\right )^{q/2}} d\boldsymbol{b}_{j}, {}\end{array}$$
(43)

with \(\tilde{\boldsymbol{y}}_{j} = \left [\begin{array}{c} \boldsymbol{y}_{j} \\ \mathbf{0} \end{array} \right ],\tilde{\boldsymbol{X}}_{j} = \left [\begin{array}{c} \boldsymbol{X}_{j} \\ \mathbf{0}\end{array} \right ],\tilde{\boldsymbol{Z}}_{j} = \left [\begin{array}{c} \boldsymbol{Z}_{j} \\ \boldsymbol{\Delta }\end{array} \right ]\) as pseudo data, where \(\boldsymbol{\Delta }\) is a relative precision factor, namely a Cholesky factor of \(\boldsymbol{D}^{-1}\) with \(\boldsymbol{D}^{-1} =\boldsymbol{\Delta }^{{\prime}}\boldsymbol{\Delta }\), so that \(\boldsymbol{b}_{j}^{{\prime}}\boldsymbol{D}^{-1}\boldsymbol{b}_{j} = \left \Vert \boldsymbol{\Delta }\boldsymbol{b}_{j}\right \Vert ^{2} = \left \Vert \mathbf{0} -\mathbf{0}\boldsymbol{\beta } -\boldsymbol{\Delta }\boldsymbol{b}_{j}\right \Vert ^{2}\) (Lindstrom and Bates 1990).
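The role of the precision factor can be checked numerically. The following is a minimal sketch in Python with NumPy, not part of the original estimation (which relies on the nlme package in R); the matrix `D` and the vector `b` are hypothetical illustration values. It confirms that the quadratic form in the random effects equals the squared norm of the pseudo-data term and that \(1/\sqrt{\vert \boldsymbol{D}\vert }\) equals \(\mathrm{abs}\vert \boldsymbol{\Delta }\vert \).

```python
import numpy as np

# Hypothetical scaled covariance matrix D of q = 2 random effects (illustrative values).
D = np.array([[2.0, 0.3],
              [0.3, 0.5]])
D_inv = np.linalg.inv(D)

# np.linalg.cholesky returns a lower-triangular L with D^{-1} = L L'.
# Setting Delta = L' gives D^{-1} = Delta' Delta, the form used in the text.
L = np.linalg.cholesky(D_inv)
Delta = L.T

b = np.array([0.7, -1.2])                  # an arbitrary random-effects vector

quad_form = b @ D_inv @ b                  # b' D^{-1} b
pseudo_norm = np.sum((Delta @ b) ** 2)     # ||Delta b||^2

inv_sqrt_det_D = 1.0 / np.sqrt(np.linalg.det(D))   # 1 / sqrt(|D|)
abs_det_Delta = abs(np.linalg.det(Delta))          # abs|Delta|

print(quad_form, pseudo_norm)
print(inv_sqrt_det_D, abs_det_Delta)
```

Each printed pair agrees up to floating-point error, which is why the prior density of the random effects can be absorbed into the least squares problem as q additional pseudo observations.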

So the exponent is a sum of squared residuals, where \(\left \Vert a\right \Vert = \sqrt{a^{{\prime} } a}\) denotes the Euclidean norm of a vector a. Equation (43) shows that maximizing the log-likelihood requires minimizing the squared norm in the exponential function inside the integral. This squared norm contains the squared error terms and is therefore similar to other least squares problems, except that the mean of the random effects has to be zero. To solve this least squares problem numerically, the orthogonal-triangular decomposition of rectangular matrices is preferred, since it gives numerically stable and efficient results by working directly on \(\boldsymbol{X}_{j}\) and \(\boldsymbol{Z}_{j}\) rather than on their cross-products. The orthogonal-triangular decomposition used is the QR decomposition, with \(\tilde{\boldsymbol{Z}}_{j} =\boldsymbol{ Q}_{(j)}\left [\begin{array}{c} \boldsymbol{R}_{11(j)} \\ \mathbf{0} \end{array} \right ]\), where \(\boldsymbol{Q}_{(j)}\) is a \(\left (N + q\right ) \times \left (N + q\right )\) orthogonal matrix \(\left (\boldsymbol{Q}_{(j)}^{{\prime}} =\boldsymbol{ Q}_{(j)}^{-1}\right )\) and \(\boldsymbol{R}_{11(j)}\) is an upper-triangular \(\left (q \times q\right )\) matrix. This decomposition can be performed for every real matrix, but here \(\boldsymbol{R}_{11(j)}\) has to be invertible, so \(\tilde{\boldsymbol{Z}}_{j}\) has to have full column rank; as in OLS regression, there must not be any linear dependency among the regressors of the random effects. Also \(\tilde{\boldsymbol{X}}_{j} =\boldsymbol{ Q}_{(j)}\left [\begin{array}{c} \boldsymbol{R}_{10(j)} \\ \boldsymbol{R}_{00(j)} \end{array} \right ]\) and \(\tilde{\boldsymbol{y}}_{j} =\boldsymbol{ Q}_{(j)}\left [\begin{array}{c} \boldsymbol{c}_{1(j)} \\ \boldsymbol{c}_{0(j)} \end{array} \right ]\). Therefore, it is also possible to perform a single orthogonal-triangular (QR) decomposition of the augmented matrix

$$\displaystyle{\left [\begin{array}{ccc} \boldsymbol{Z}_{j}&\boldsymbol{X}_{j}&\boldsymbol{y}_{j} \\ \boldsymbol{\Delta } & \mathbf{0} & \mathbf{0} \end{array} \right ] = \left (\begin{array}{ccc} \tilde{\boldsymbol{Z}}_{j}&\tilde{\boldsymbol{X}}_{j}&\tilde{\boldsymbol{y}}_{j} \end{array} \right ) =\boldsymbol{ Q}_{(j)}\left [\begin{array}{ccc} \boldsymbol{R}_{11(j)} & \boldsymbol{R}_{10(j)} & \boldsymbol{c}_{1(j)} \\ \mathbf{0} &\boldsymbol{R}_{00(j)} & \boldsymbol{c}_{0(j)} \end{array} \right ]}$$

or

$$\displaystyle{\boldsymbol{Q}_{(j)}^{-1}\left (\begin{array}{ccc} \tilde{\boldsymbol{Z}}_{j}&\tilde{\boldsymbol{X}}_{j}&\tilde{\boldsymbol{y}}_{j} \end{array} \right ) = \left [\begin{array}{ccc} \boldsymbol{R}_{11(j)} & \boldsymbol{R}_{10(j)} & \boldsymbol{c}_{1(j)} \\ \mathbf{0} &\boldsymbol{R}_{00(j)} & \boldsymbol{c}_{0(j)} \end{array} \right ].}$$

The exponent in Eq. (43) becomes

$$\displaystyle\begin{array}{rcl} \left \Vert \tilde{\boldsymbol{y}}_{j} -\tilde{\boldsymbol{ X}}_{j}\boldsymbol{\beta } -\tilde{\boldsymbol{ Z}}_{j}\boldsymbol{b}_{j}\right \Vert ^{2}& =& \left \Vert \boldsymbol{Q}_{ (j)}^{{\prime}}\left (\tilde{\boldsymbol{y}}_{ j} -\tilde{\boldsymbol{ X}}_{j}\boldsymbol{\beta } -\tilde{\boldsymbol{ Z}}_{j}\boldsymbol{b}_{j}\right )\right \Vert ^{2} {}\\ & =& \left \Vert \boldsymbol{c}_{1(j)} -\boldsymbol{ R}_{10(j)}\boldsymbol{\beta } -\boldsymbol{ R}_{11(j)}\boldsymbol{b}_{j}\right \Vert ^{2} + \left \Vert \boldsymbol{c}_{ 0(j)} -\boldsymbol{ R}_{00(j)}\boldsymbol{\beta }\right \Vert ^{2}. {}\\ \end{array}$$
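The split of the squared norm can be illustrated numerically. The sketch below is written in Python with NumPy under stated assumptions: the sizes, the simulated data, the precision factor `Delta`, and the trial values `beta` and `b` are all hypothetical, and the reduced form of the QR decomposition is used, so the block playing the role of \(\boldsymbol{R}_{00(j)}\) has p + 1 rows instead of N (the remaining rows would be zero).

```python
import numpy as np

rng = np.random.default_rng(0)
N, p, q = 6, 2, 1                     # hypothetical sizes for one group j

X = np.column_stack([np.ones(N), rng.normal(size=N)])  # fixed-effects regressors X_j
Z = np.ones((N, q))                                     # random-intercept regressor Z_j
y = rng.normal(size=N)                                  # response y_j
Delta = np.array([[1.5]])                               # assumed precision factor

# Augmented matrix [[Z, X, y], [Delta, 0, 0]] of shape (N + q, q + p + 1)
A = np.vstack([np.column_stack([Z, X, y]),
               np.column_stack([Delta, np.zeros((q, p + 1))])])

# Reduced QR: Q has orthonormal columns, R is upper-triangular
Q, R = np.linalg.qr(A)

# Blocks of R as named in the text
R11, R10, c1 = R[:q, :q], R[:q, q:q + p], R[:q, -1]
R00, c0 = R[q:, q:q + p], R[q:, -1]

# Arbitrary trial values for the fixed and random effects
beta = np.array([0.4, -0.3])
b = np.array([0.8])

# Left-hand side: squared norm of the pseudo-data residual
y_t = np.concatenate([y, np.zeros(q)])
X_t = np.vstack([X, np.zeros((q, p))])
Z_t = np.vstack([Z, Delta])
lhs = np.sum((y_t - X_t @ beta - Z_t @ b) ** 2)

# Right-hand side: the two terms after rotating by Q'
rhs = np.sum((c1 - R10 @ beta - R11 @ b) ** 2) + np.sum((c0 - R00 @ beta) ** 2)

print(lhs, rhs)
```

Only the first term on the right-hand side depends on \(\boldsymbol{b}_{j}\), which is what allows the integral over the random effects to be evaluated in closed form.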

Thus the integral in Eq. (43) can be expressed as

$$\displaystyle{ \exp \left [\frac{\left \Vert \boldsymbol{c}_{0(j)} -\boldsymbol{ R}_{00(j)}\boldsymbol{\beta }\right \Vert ^{2}} {-2\sigma ^{2}} \right ]\int \frac{\exp \left [\frac{-1} {2\sigma ^{2}} \left (\left \Vert \boldsymbol{c}_{1(j)} -\boldsymbol{ R}_{10(j)}\boldsymbol{\beta } -\boldsymbol{ R}_{11(j)}\boldsymbol{b}_{j}\right \Vert ^{2}\right )\right ]} {\left (2\pi \sigma ^{2}\right )^{q/2}} d\boldsymbol{b}_{j}. }$$
(44)

Because \(\boldsymbol{R}_{11(j)}\) is non-singular, Bates and Pinheiro introduce the change of variable

\(\boldsymbol{\phi }_{j} = \left (\boldsymbol{c}_{1(j)} -\boldsymbol{ R}_{10(j)}\boldsymbol{\beta } -\boldsymbol{ R}_{11(j)}\boldsymbol{b}_{j}\right )/\sigma\) with \(d\boldsymbol{\phi }_{j} =\sigma ^{-q}\mathrm{abs}\vert \boldsymbol{R}_{11(j)}\vert d\boldsymbol{b}_{j}\) to eliminate the integral. The integral in Eq. (44) is

$$\displaystyle\begin{array}{rcl} & & \int \frac{\exp \left [\frac{-1} {2\sigma ^{2}} \left (\left \Vert \boldsymbol{c}_{1(j)} -\boldsymbol{ R}_{10(j)}\boldsymbol{\beta } -\boldsymbol{ R}_{11(j)}\boldsymbol{b}_{j}\right \Vert ^{2}\right )\right ]} {\left (2\pi \sigma ^{2}\right )^{q/2}} d\boldsymbol{b}_{j} {}\\ & & \quad = \frac{1} {\mathrm{abs}\vert \boldsymbol{R}_{11(j)}\vert }\int \frac{\exp \left (-\left \Vert \boldsymbol{\phi }_{j}\right \Vert ^{2}/2\right )} {\left (2\pi \right )^{q/2}} d\boldsymbol{\phi }_{j} {}\\ & & \quad =\mathrm{ abs}\vert \boldsymbol{R}_{11(j)}\vert ^{-1} {}\\ \end{array}$$

because the integrand is the density of a q-variate standard normal distribution, which integrates to one over the whole range.

The determinant of \(\boldsymbol{R}_{11(j)}\) is the product of its diagonal elements, since \(\boldsymbol{R}_{11(j)}\) is an upper-triangular matrix by construction of the QR decomposition. Altogether, the likelihood function becomes

$$\displaystyle{L\left (\boldsymbol{\beta },\boldsymbol{\theta },\sigma ^{2}\vert \boldsymbol{y}\right ) =\prod _{ j=1}^{n}\frac{\exp \left [\frac{\left \Vert \boldsymbol{c}_{0(j)}-\boldsymbol{R}_{00(j)}\boldsymbol{\beta }\right \Vert ^{2}} {-2\sigma ^{2}} \right ]} {\sqrt{\left (2\pi \sigma ^{2 } \right ) ^{N } \vert \boldsymbol{D}\vert }}\mathrm{abs}\vert \boldsymbol{R}_{11(j)}\vert ^{-1}.}$$

A further QR decomposition,

$$\displaystyle{\left [\begin{array}{cc} \boldsymbol{R}_{00(1)} & \boldsymbol{c}_{0(1)}\\ \vdots & \vdots \\ \boldsymbol{R}_{00(n)} & \boldsymbol{c}_{0(n)} \end{array} \right ] =\boldsymbol{ Q}_{0}\left [\begin{array}{cc} \boldsymbol{R}_{00} & \boldsymbol{c}_{0} \\ \mathbf{0} &\boldsymbol{c}_{-1} \end{array} \right ],}$$

leads to

$$\displaystyle\begin{array}{rcl} L\left (\boldsymbol{\beta },\boldsymbol{\theta },\sigma ^{2}\vert \boldsymbol{y}\right )& =& \left (2\pi \sigma ^{2}\right )^{-N_{n}/2}\exp \left (\frac{\left \Vert \boldsymbol{c}_{-1}\right \Vert ^{2} + \left \Vert \boldsymbol{c}_{ 0} -\boldsymbol{ R}_{00}\boldsymbol{\beta }\right \Vert ^{2}} {-2\sigma ^{2}} \right ) {}\\ & & \times \prod _{j=1}^{n}\mathrm{abs}\left ( \frac{\vert \boldsymbol{\Delta }\vert } {\vert \boldsymbol{R}_{11(j)}\vert }\right ) {}\\ \end{array}$$

with \(N_{n} =\sum _{ j=1}^{n}N = n \cdot N\) and \(1/\sqrt{\vert \boldsymbol{D}\vert } =\mathrm{ abs}\vert \boldsymbol{\Delta }\vert \). The estimate of the fixed effects \(\boldsymbol{\beta }\) follows from minimizing \(\left \Vert \boldsymbol{c}_{0} -\boldsymbol{ R}_{00}\boldsymbol{\beta }\right \Vert ^{2}\) and is

$$\displaystyle{\hat{\boldsymbol{\beta }}=\boldsymbol{ R}_{00}^{-1}\boldsymbol{c}_{ 0}}$$

and the estimate of the error variance is

$$\displaystyle{\hat{\sigma }^{2} = \left \Vert \boldsymbol{c}_{ -1}\right \Vert ^{2}/N_{ n}.}$$

Maximum likelihood estimation then proceeds for a given estimate of \(\boldsymbol{\theta }\). The random effects are evaluated by

$$\displaystyle{\hat{\boldsymbol{b}}_{j}\left (\boldsymbol{\theta }\right ) =\boldsymbol{ R}_{11(j)}^{-1}\left (\boldsymbol{c}_{ 1(j)} -\boldsymbol{ R}_{10(j)}\hat{\boldsymbol{\beta }}\left (\boldsymbol{\theta }\right )\right ).}$$

This is the best linear unbiased predictor for the random effects when it is evaluated at the maximum likelihood estimate \(\boldsymbol{\theta }=\hat{\boldsymbol{\theta }}\).
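The two QR steps and the resulting estimates can be sketched as follows. This is an illustrative Python/NumPy sketch, not the nlme implementation used for the estimations: the function names `group_qr` and `profile_estimates`, the simulated data, and the value of `D` are hypothetical, and \(\boldsymbol{\Delta }\) is treated as known rather than optimized over \(\boldsymbol{\theta }\).

```python
import numpy as np

def group_qr(Z, X, y, Delta):
    """QR of the augmented matrix [[Z, X, y], [Delta, 0, 0]] for one group j."""
    q, p = Z.shape[1], X.shape[1]
    A = np.vstack([np.column_stack([Z, X, y]),
                   np.column_stack([Delta, np.zeros((q, p + 1))])])
    R = np.linalg.qr(A, mode="r")          # only the triangular factor is needed
    # Blocks in the order R11(j), R10(j), c1(j), R00(j), c0(j)
    return R[:q, :q], R[:q, q:q + p], R[:q, -1], R[q:, q:q + p], R[q:, -1]

def profile_estimates(groups, Delta):
    """ML estimates of beta and sigma^2 and predicted random effects for a fixed Delta."""
    parts = [group_qr(Z, X, y, Delta) for Z, X, y in groups]
    p = groups[0][1].shape[1]
    # Second QR on the stacked blocks [R00(j), c0(j)]
    stacked = np.vstack([np.column_stack([R00, c0]) for _, _, _, R00, c0 in parts])
    R = np.linalg.qr(stacked, mode="r")
    R00, c0, c_m1 = R[:p, :p], R[:p, -1], R[p:, -1]
    beta_hat = np.linalg.solve(R00, c0)                 # solves R00 beta = c0
    N_n = sum(len(y) for _, _, y in groups)
    sigma2_hat = np.sum(c_m1 ** 2) / N_n                # ||c_{-1}||^2 / N_n
    b_hat = [np.linalg.solve(R11, c1 - R10 @ beta_hat)  # predicted random effects
             for R11, R10, c1, _, _ in parts]
    return beta_hat, sigma2_hat, b_hat

# Simulated example: n = 5 groups, N = 8 observations each, one random intercept
rng = np.random.default_rng(1)
n, N = 5, 8
D = np.array([[0.6]])                                   # assumed scaled variance
Delta = np.linalg.cholesky(np.linalg.inv(D)).T          # D^{-1} = Delta' Delta
groups = []
for _ in range(n):
    X = np.column_stack([np.ones(N), rng.normal(size=N)])
    Z = np.ones((N, 1))
    b = rng.normal(scale=np.sqrt(0.6))
    y = X @ np.array([1.0, 2.0]) + b + rng.normal(size=N)
    groups.append((Z, X, y))

beta_hat, sigma2_hat, b_hat = profile_estimates(groups, Delta)
print(beta_hat, sigma2_hat)
```

For a fixed \(\boldsymbol{\Delta }\), these estimates coincide with the generalized least squares estimate of \(\boldsymbol{\beta }\) under the marginal covariance \(\boldsymbol{I} +\boldsymbol{ Z}_{j}\boldsymbol{D}\boldsymbol{Z}_{j}^{{\prime}}\), but they are obtained without inverting any N × N matrix.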

Lindstrom and Bates (1988, 1990) show the computation for full maximum likelihood and restricted maximum likelihood estimation. Since maximum likelihood estimation does not account for the loss in degrees of freedom \(\left (N_{n} - p\right )\), its estimators of the variance components are generally biased downward. For example, if the expected value of the estimator of a variance component \(\theta _{i}\) is \(\theta _{i}\left (N_{n} - p\right )/N_{n}\), its downward bias is \(\theta _{i}p/N_{n}\) (Harville 1977). The estimation is therefore performed by restricted maximum likelihood (REML), sometimes also called residual maximum likelihood, which accounts for the degrees of freedom but yields results that are not comparable if the number of parameters differs. The restricted likelihood, following Laird and Ware (1982) and Ware (1985), is

$$\displaystyle{ L_{R}\left (\boldsymbol{\theta },\sigma ^{2}\vert \boldsymbol{y}\right ) =\int L\left (\boldsymbol{\beta },\boldsymbol{\theta },\sigma ^{2}\vert \boldsymbol{y}\right )d\boldsymbol{\beta } }$$
(45)

with logarithm

$$\displaystyle\begin{array}{rcl} l_{R}\left (\boldsymbol{\theta },\sigma ^{2}\vert \boldsymbol{y}\right )& =& \log L_{ R}\left (\boldsymbol{\theta },\sigma ^{2}\vert \boldsymbol{y}\right ) {}\\ & =& -\frac{N_{n} - p} {2} \log \left (2\pi \sigma ^{2}\right ) -\frac{\left \Vert \boldsymbol{c}_{-1}\right \Vert ^{2}} {2\sigma ^{2}} -\log \mathrm{ abs}\vert \boldsymbol{R}_{00}\vert {}\\ & & +\sum _{j=1}^{n}\log \mathrm{abs}\left ( \frac{\vert \boldsymbol{\Delta }\vert } {\vert \boldsymbol{R}_{11(j)}\vert }\right ). {}\\ \end{array}$$

As the result, the conditional estimate for \(\boldsymbol{\beta }\) is

$$\displaystyle{\hat{\boldsymbol{\beta }}=\boldsymbol{ R}_{00}^{-1}\boldsymbol{c}_{ 0}}$$

as the same as in the unconditional case but with \(\boldsymbol{R}_{00}^{-1}\) different due to different \(\boldsymbol{\Delta }\) and σ 2

$$\displaystyle{\hat{\sigma }_{R}^{2}\left (\boldsymbol{\theta }\right ) = \left \Vert c_{ -1}\right \Vert ^{2}/\left (N_{ t} - p\right ).}$$

So the restricted log-likelihood is

$$\displaystyle\begin{array}{rcl} l_{R}\left (\boldsymbol{\theta }\vert \boldsymbol{y}\right )& =& l_{R}\left (\boldsymbol{\theta },\hat{\sigma }_{R}^{2}\left (\boldsymbol{\theta }\right )\vert \boldsymbol{y}\right ) {}\\ & =& \mathrm{const} -\left (N_{n} - p\right )\log \left \Vert \boldsymbol{c}_{-1}\right \Vert -\log \mathrm{ abs}\vert \boldsymbol{R}_{00}\vert +\sum _{ j=1}^{n}\log \mathrm{abs}\left ( \frac{\vert \boldsymbol{\Delta }\vert } {\vert \boldsymbol{R}_{11(j)}\vert }\right ). {}\\ \end{array}$$

In both cases the variance of the fixed effect coefficients is

$$\displaystyle{\mathrm{Var}\left (\hat{\boldsymbol{\beta }}\right ) =\hat{\sigma } ^{2}\boldsymbol{R}_{ 00}^{-1}\left (\boldsymbol{R}_{ 00}^{-1}\right )^{{\prime}}.}$$
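The quantities above can be illustrated numerically. The following is a minimal sketch, assuming a random-intercept model with a scalar \(\boldsymbol{\Delta }\) held fixed at an arbitrary value and simulated data; all variable names, the sample sizes, and the value of `delta` are illustrative assumptions and are not taken from the chapter's data or software. It uses NumPy's QR routine to obtain \(\boldsymbol{R}_{00}\), \(\boldsymbol{c}_{0}\), and \(\boldsymbol{c}_{-1}\), and then evaluates \(\hat{\boldsymbol{\beta }}\), the two variance estimates, \(\mathrm{Var}(\hat{\boldsymbol{\beta }})\), and the profiled restricted log-likelihood up to a constant.

```python
import numpy as np

# Simulated data (assumption): n groups, m observations each, p fixed effects.
rng = np.random.default_rng(1)
n, m, p = 30, 8, 2
N = n * m                                   # total number of observations
groups = np.repeat(np.arange(n), m)
X = np.column_stack([np.ones(N), rng.normal(size=N)])
y = X @ np.array([1.0, 0.5]) + rng.normal(size=n)[groups] + rng.normal(size=N)

delta = 1.0        # scalar Delta, held fixed here purely for illustration

blocks, log_det_term = [], 0.0
for j in range(n):
    rows = groups == j
    Z_j = np.ones((m, 1))                   # random-intercept design matrix
    top = np.column_stack([Z_j, X[rows], y[rows]])
    bottom = np.concatenate([[delta], np.zeros(p + 1)])[None, :]
    # Triangular factor of the augmented matrix [Z_j X_j y_j; Delta 0 0]
    R_j = np.linalg.qr(np.vstack([top, bottom]), mode="r")
    log_det_term += np.log(abs(delta / R_j[0, 0]))   # log abs(|Delta| / |R_11(j)|)
    blocks.append(R_j[1:, 1:])              # triangular block carrying [R_00(j) c_0(j)]

# Second decomposition of the stacked blocks gives R_00, c_0 and c_{-1}
R = np.linalg.qr(np.vstack(blocks), mode="r")
R00, c0, c_m1 = R[:p, :p], R[:p, p], R[p, p]

beta_hat = np.linalg.solve(R00, c0)         # beta_hat = R_00^{-1} c_0
sigma2_ml = c_m1**2 / N                     # maximum likelihood estimate
sigma2_reml = c_m1**2 / (N - p)             # restricted estimate, corrects for p
R00_inv = np.linalg.inv(R00)
var_beta = sigma2_reml * R00_inv @ R00_inv.T

# Profiled restricted log-likelihood, up to an additive constant
l_R = (-(N - p) * np.log(abs(c_m1))
       - np.log(abs(np.linalg.det(R00)))
       + log_det_term)
```

In an actual estimation, `delta` would be varied by a numerical optimizer to maximize `l_R`; the sketch only evaluates the criterion at one value. The restricted variance estimate is always larger than the maximum likelihood one by the factor \(N/(N - p)\).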

The integral, or respectively the sum, becomes clear once the likelihood function for one level of random effects in Eq. (42) is rewritten for two levels of random effects, in this example the city level \(j = 1,\ldots,n\), which is nested within the time level \(t = 1,\ldots,T\). It becomes

$$\displaystyle\begin{array}{rcl} L\left (\boldsymbol{\beta },\boldsymbol{\theta }_{1},\boldsymbol{\theta }_{2},\sigma ^{2}\vert \boldsymbol{y}\right )& =& \prod _{ t=1}^{T}\int \prod _{ j=1}^{n}\left [\int p\left (\boldsymbol{y}_{\mathit{ jt}}\vert \boldsymbol{b}_{\mathit{jt}},\boldsymbol{b}_{t},\boldsymbol{\beta },\sigma ^{2}\right )p\left (\boldsymbol{b}_{\mathit{ jt}}\vert \boldsymbol{\theta }_{2},\sigma ^{2}\right )d\boldsymbol{b}_{\mathit{ jt}}\right ] \\ & & \times p\left (\boldsymbol{b}_{t}\vert \boldsymbol{\theta }_{1},\sigma ^{2}\right )d\boldsymbol{b}_{ t}. {}\end{array}$$
(46)

The decomposition is constructed similarly to the case with one level of random effects,

$$\displaystyle\begin{array}{rcl} \left [\begin{array}{cccc} \boldsymbol{Z}_{\mathit{jt}} & \boldsymbol{Z}_{j,t}&\boldsymbol{X}_{\mathit{jt}} & \boldsymbol{y}_{\mathit{jt}} \\ \boldsymbol{\Delta }_{2} & \mathbf{0} & \mathbf{0} & \mathbf{0} \end{array} \right ]& =& \boldsymbol{Q}_{\mathit{jt}}\left [\begin{array}{cccc} \boldsymbol{R}_{22(\mathit{jt})} & \boldsymbol{R}_{21(\mathit{jt})} & \boldsymbol{R}_{20(\mathit{jt})} & \boldsymbol{c}_{2(\mathit{jt})} \\ \mathbf{0} &\boldsymbol{R}_{11(\mathit{jt})} & \boldsymbol{R}_{10(\mathit{jt})} & \boldsymbol{c}_{1(\mathit{jt})} \end{array} \right ],\, {}\\ j& =& 1,\,\ldots,\,n,\,t = 1,\,\ldots,\,T {}\\ \end{array}$$

followed, for each \(t\), by the decomposition

$$\displaystyle{\left [\begin{array}{ccc} \boldsymbol{R}_{11(1t)} & \boldsymbol{R}_{10(1t)} & \boldsymbol{c}_{1(1t)}\\ \vdots & \vdots & \vdots \\ \boldsymbol{R}_{11(\mathit{nt})} & \boldsymbol{R}_{10(\mathit{nt})} & \boldsymbol{c}_{1(\mathit{nt})} \\ \boldsymbol{\Delta }_{1} & \mathbf{0} & \mathbf{0} \end{array} \right ] =\boldsymbol{ Q}_{(t)}\left [\begin{array}{ccc} \boldsymbol{R}_{11(t)} & \boldsymbol{R}_{10(t)} & \boldsymbol{c}_{1(t)} \\ \mathbf{0} &\boldsymbol{R}_{00(t)} & \boldsymbol{c}_{0(t)} \end{array} \right ].}$$

The profiled log-likelihood then becomes

$$\displaystyle\begin{array}{rcl} l_{R}\left (\boldsymbol{\theta }_{1},\boldsymbol{\theta }_{2}\vert \boldsymbol{y}\right )& =& \log L_{R}\left (\hat{\boldsymbol{\beta }}_{R}\left (\boldsymbol{\theta }_{1},\boldsymbol{\theta }_{2}\right ),\boldsymbol{\theta }_{1},\boldsymbol{\theta }_{2},\hat{\sigma }_{R}^{2}\left (\boldsymbol{\theta }_{ 1},\boldsymbol{\theta }_{2}\right )\vert \boldsymbol{y}\right ) {}\\ & =& \mathrm{const} -\left (N_{T} - np\right )\log \left \Vert \boldsymbol{c}_{-1}\right \Vert -\log \mathrm{ abs}\vert \boldsymbol{R}_{00}\vert {}\\ & & +\sum _{t=1}^{T}\log \mathrm{abs}\left ( \frac{\vert \boldsymbol{\Delta }_{1}\vert } {\vert \boldsymbol{R}_{11(t)}\vert }\right ) +\sum _{ t=1}^{T}\sum _{ j=1}^{n}\log \mathrm{abs}\left ( \frac{\vert \boldsymbol{\Delta }_{2}\vert } {\vert \boldsymbol{R}_{22(\mathit{jt})}\vert }\right ), {}\\ \end{array}$$

with \(N_{T} = N \cdot n \cdot T\) the total number of observations. Compared with the two-level model, the three-level model just adds the last term for the nested higher level.

The solution then follows straightforwardly from the one-level estimation.

Multilevel models are solved by the EM algorithm, which iterates between two steps, the expectation step and the maximization step (Laird et al. 1987). The data are fitted to the model within the expectation step by estimating the fixed effects, random effects, and the pseudo data (\(\tilde{\boldsymbol{y}}_{j}\), \(\tilde{\boldsymbol{X}}_{j}\), and \(\tilde{\boldsymbol{Z}}_{j}\)) at the current values of the variance components \(\hat{\boldsymbol{\theta }}\). The maximization step fits the parameter \(\boldsymbol{\theta }\) of the model to the data by maximizing the likelihood, which yields new variance component parameters \(\hat{\boldsymbol{\theta }}\) for the next expectation step (Laird and Ware 1982; Lindstrom and Bates 1988).

As described in Laird and Ware (1982) and Lindstrom and Bates (1990), the algorithm starts by setting an initial value for \(\boldsymbol{\theta }\) within the maximization step. The error term depends on the variance components in \(\hat{\boldsymbol{\theta }}\) and follows directly as \(\boldsymbol{e}_{j} =\boldsymbol{ y}_{j} -\boldsymbol{ X}_{j}\boldsymbol{\beta }_{j}\left (\hat{\boldsymbol{\theta }}\right ) -\boldsymbol{ Z}_{j}\boldsymbol{b}_{j}\left (\hat{\boldsymbol{\theta }}\right )\). The expectation step consists of estimating the variance components for the error terms and the random effects; they are presented essentially as in Laird and Ware (1982),

$$\displaystyle{ \mathsf{E}\left (\sum _{j=1}^{n}\boldsymbol{e}_{ j}^{T}\boldsymbol{e}_{ j}\mid \boldsymbol{y}_{j},\hat{\boldsymbol{\theta }}\right ) =\sum _{ j=1}^{n}\left \{\boldsymbol{e}_{ j}^{T}\left (\hat{\boldsymbol{\theta }}\right )\boldsymbol{e}_{ j}\left (\hat{\boldsymbol{\theta }}\right ) + \mathsf{tr}\mathsf{\,var}\left (\boldsymbol{e}_{j}\mid \boldsymbol{y}_{j},\hat{\boldsymbol{\theta }}\right )\right \} }$$
(47)

and

$$\displaystyle{ \mathsf{E}\left (\sum _{j=1}^{n}\boldsymbol{b}_{ j}\boldsymbol{b}_{j}^{T}\mid \boldsymbol{y}_{ j},\hat{\boldsymbol{\theta }}\right ) =\sum _{ j=1}^{n}\left \{\boldsymbol{b}_{ j}\left (\hat{\boldsymbol{\theta }}\right )\boldsymbol{b}_{j}^{T}\left (\hat{\boldsymbol{\theta }}\right ) + \mathsf{var}\left (\boldsymbol{b}_{ j}\mid \boldsymbol{y}_{j},\hat{\boldsymbol{\theta }}\right )\right \}. }$$
(48)

The maximization step then uses the log-likelihood function for either maximum likelihood or restricted maximum likelihood estimation, as presented above and, for both estimation approaches in general, in Lindstrom and Bates (1990). The computational improvements of Laird et al. (1987), as implemented in current software, achieve faster convergence.
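The iteration can be sketched for the simplest case. The code below is a minimal illustration, assuming a single level of random intercepts, full maximum likelihood (not REML), simulated data, and a fixed number of iterations instead of a convergence check; the function name `em_random_intercept` and all parameter values are hypothetical and do not reproduce the software used for the estimations in this chapter. The conditional expectations correspond to Eqs. (47) and (48), where \(\mathsf{var}(\boldsymbol{b}_{j}\mid \boldsymbol{y}_{j},\hat{\boldsymbol{\theta }})\) and \(\mathsf{var}(\boldsymbol{e}_{j}\mid \boldsymbol{y}_{j},\hat{\boldsymbol{\theta }})\) are computed from the marginal covariance \(\boldsymbol{V}_{j} =\sigma ^{2}\boldsymbol{I} +\tau ^{2}\boldsymbol{1}\boldsymbol{1}^{T}\), with \(\tau ^{2}\) denoting the random-intercept variance.

```python
import numpy as np

def em_random_intercept(y, X, groups, n_iter=200):
    """EM sketch (ML) for y_j = X_j beta + 1 b_j + e_j with scalar variances."""
    idx = [np.flatnonzero(groups == g) for g in np.unique(groups)]
    N, n = len(y), len(idx)
    sigma2, tau2 = 1.0, 1.0                  # initial values for theta
    for _ in range(n_iter):
        # Inverse marginal covariance V_j^{-1} at the current theta
        V_inv = [np.linalg.inv(sigma2 * np.eye(len(i))
                               + tau2 * np.ones((len(i), len(i)))) for i in idx]
        # Fixed effects by generalized least squares given theta
        A = sum(X[i].T @ W @ X[i] for i, W in zip(idx, V_inv))
        c = sum(X[i].T @ W @ y[i] for i, W in zip(idx, V_inv))
        beta = np.linalg.solve(A, c)
        ss_e, ss_b = 0.0, 0.0
        for i, W in zip(idx, V_inv):
            r = y[i] - X[i] @ beta
            b = tau2 * np.sum(W @ r)         # predicted random intercept
            e = r - b                        # residual e_j(theta_hat)
            # Eq. (47): e'e + tr var(e_j | y_j, theta_hat)
            ss_e += e @ e + sigma2 * len(i) - sigma2**2 * np.trace(W)
            # Eq. (48): b b' + var(b_j | y_j, theta_hat)
            ss_b += b**2 + tau2 - tau2**2 * np.sum(W)
        # Update the variance components from the expected sufficient statistics
        sigma2, tau2 = ss_e / N, ss_b / n
    return beta, sigma2, tau2

# Simulated example (assumption): 100 groups of 8 observations,
# true beta = (1.0, 0.5), sigma^2 = 1, tau^2 = 2.
rng = np.random.default_rng(0)
n_groups, m = 100, 8
groups = np.repeat(np.arange(n_groups), m)
X = np.column_stack([np.ones(n_groups * m), rng.normal(size=n_groups * m)])
b_true = rng.normal(scale=np.sqrt(2.0), size=n_groups)
y = X @ np.array([1.0, 0.5]) + b_true[groups] + rng.normal(size=n_groups * m)

beta_hat, sigma2_hat, tau2_hat = em_random_intercept(y, X, groups)
```

With this design the estimates should lie close to the simulated values, although the variance of the random intercepts is estimated from only 100 groups and is therefore the least precise of the three. A nested second level would add a further pair of expected sufficient statistics for the higher-level random effects.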

Appendix 3: Residual Plots for Employment Growth at City and Time Level (Figs. 4 and 5)

Fig. 4 Residual plot for employment growth at city level

Fig. 5 Residual plot for employment growth at time level

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Hitzschke, S. (2015). Industrial Growth and Productivity Change in German Cities: A Multilevel Investigation. In: Pyka, A., Foster, J. (eds) The Evolution of Economic and Innovation Systems. Economic Complexity and Evolution. Springer, Cham. https://doi.org/10.1007/978-3-319-13299-0_20
