Fiscal fragmentation and crime control: Is there an efficiency-equity tradeoff?

International Tax and Public Finance

Abstract

This article investigates the effects of fiscal fragmentation on aggregate crime rates and the spatial disparities in crime rates among counties in a metropolitan area. We begin by developing a model of local government provision of public safety. The model predicts that fiscal fragmentation creates an efficiency-equity tradeoff. To investigate this tradeoff, we estimate a variety of empirical models using county-level panel data drawn from a sample of metropolitan areas in the USA for census years 1990, 2000, and 2010. Our findings suggest that fiscal fragmentation increases efficiency in the provision of public safety; that is, fiscal fragmentation has a negative effect on aggregate crime rates in metropolitan areas. We also find that fiscal fragmentation increases disparities in crime rates among counties in a metropolitan area. In other words, fiscal fragmentation has a negative effect on interpersonal equity in the provision of public safety. We further explore the underlying mechanisms of the efficiency-equity tradeoff in a Spatial-Autoregressive Durbin model with multiplicative spatial interaction terms. Since conventional estimation methods are not suitable for the task at hand, we derive an innovative Maximum Likelihood Estimator for our empirical model. As predicted, we find evidence of both interjurisdictional spillover effects and Tiebout-sorting effects due to fiscal fragmentation.

Notes

  1. A pure public good is nonrival in consumption and nonexcludable. In contrast, the benefits of a local public good extend over a limited geographic area and/or are eventually subject to congestion costs as the size of the group sharing the good exceeds some threshold.

  2. Merit goods are those goods and services that the government feels that people will under-consume, and which ought to be subsidized or provided free at the point of use so that consumption does not depend primarily on the ability to pay for the good or service. Examples of merit goods include education, health care, public safety, and refuse collection.

  3. See, for example, Hoxby (2000), Urquiola (2005), Rothstein (2007), and Baum-Snow and Lutz (2011).

  4. According to the Tenth Amendment to the US Constitution, the states, subject to certain restrictions, have the right to exercise a general police power which is the capacity of the states to regulate behavior and enforce order within their territory for the betterment of the health, safety, morals, and general welfare of their inhabitants. The states often further delegate law enforcement to local governments.

  5. These are well-known mechanisms in the literature on local public finance. For example, Mehay (1977) and Hakim et al. (1979) examine the effect of interjurisdictional spillovers on the provision of public safety, and Banzhaf and Walsh (2012) find that environmental amenities and disamenities result in Tiebout-sorting.

  6. As a robustness test, as discussed in greater detail below, we also estimate these empirical models using panel data for counties in the lower 48 states rather than metropolitan areas. We obtain similar results using this sample, as well.

  7. For the reader’s convenience, the properties of the SAR-Durbin estimator are derived in Appendix 3 to this article.

  8. In this model, we attempt to identify the causal mechanisms of FF on the decentralized provision of public safety in a general setting. In Appendix 1 to this article, we describe the household’s maximization problem that results in (1).

  9. In actual practice in the United States, jurisdictions finance services, like public safety, using final retail sales taxes and property taxes. Since these taxes are increasing in income and wealth, they give rise to fiscal rent seeking in which low-income households attempt to move into jurisdictions with high-income households to enjoy the implicit subsidy in the provision of public safety. Therefore, high-income jurisdictions must use exclusionary zoning to prevent low-income households from moving into their jurisdiction. By assuming that public safety is financed with a head tax, we do not need to concern ourselves here with the choice of tax instrument, fiscal rent seeking, and the need for exclusionary zoning.

  10. Gyimah-Brempong (1987), Wolch et al. (2004), and Southwick (2005) show that FF creates economies and diseconomies of population scale in the production of public safety. Scotchmer (2002) and Epple and Nechyba (2004) show that, by introducing governmental competition, FF increases the efficiency of providing local public goods by forcing local governments to operate closer to their minimum cost frontier.

  11. This sorting process allows households to reveal their demand for public safety which, according to the Tiebout hypothesis, results in an efficient allocation of local public goods. Eberts and Gronberg (1981) report evidence that FF results in sorting of households among jurisdictions by income.

  12. In the USA, law enforcement is typically carried out by town, municipal, and county police. We recognize the important role that subcounty levels of government play in providing policing services. In this article, however, we take county areas as the basic unit of analysis for two reasons. First, the size of municipal police forces ranges from more than 30,000 law enforcement officers (for example, New York City) to fewer than five officers (for example, Amherst, Virginia, or Hot Springs, North Carolina), and there is considerable variation in function. County police forces, by contrast, are more comparable in size and function. Second, county sheriffs usually take an active role in facilitating the coordination of city or town police departments within the county area. We therefore treat the county area as a consolidated unit of law enforcement.

  13. Following the usual practice in the literature, we exclude 7 counties in the District of Columbia, Alaska, and Hawaii. We also exclude Connecticut and Rhode Island because these two states do not have county-level governments. Finally, we exclude a few counties in this sample due to missing data on crime rates and/or missing data on certain explanatory variables.

  14. According to Levitt (1998), UCR data are subject to several criticisms, including the misreporting of crimes by victims and different recording practices by police departments over time.

  15. We use the 2013 National Center for Health Statistics (NCHS) Urban–Rural Classification Scheme for counties to construct our sample. Metropolitan statistical areas (MSAs) are defined by resident-population size, which has been derived from the 2012 post-censal estimates of July 1, 2012. This scheme classifies the counties of all metropolitan and non-metropolitan areas into six categories: large central metropolitan areas (MSA populations of 1 million or more); large fringe metropolitan areas (populations of 1 million or more); medium metropolitan areas (MSA populations of 250000–999999); small metropolitan areas (MSA populations of 50000–250000); micropolitan areas; and noncore areas. The last two categories are defined as non-metropolitan areas (populations less than 50000).

  16. In the USA, habitual offender laws (commonly referred to as three-strikes legislation) were first implemented on March 7, 1994, and are part of the United States Justice Department's Anti-Violence Strategy. These laws require a person who is convicted of a severe violent felony and has two other previous convictions to serve a mandatory life sentence in prison. The purpose of the laws is to drastically increase the punishment of those convicted of more than two serious crimes.

  17. In some cases, a metropolitan area consists of the counties from more than one state, and we introduce the fixed effect of the state in which the most populous county of the metropolitan area is located.

  18. Due to the lack of a natural experiment for fiscal fragmentation, we adopt this 2SLS method as a second-best approach to identify the impact of FF on public safety.

  19. We adopt the definition of streams in Rothstein (2007) when estimating our 2SLS regressions. We also estimate the 2SLS regressions using the definition of streams in Hoxby (2000) and an alternative definition of streams in Rothstein (2007). The 2SLS results are robust to these alternative instruments. These results are provided in Appendix 2, Tables 12 and 13, to this article.

  20. The estimation results of robustness checks are available upon request.

  21. The estimation results of robustness checks are available from the authors upon request.

  22. Henceforth, we do not include a subscript for the type of crime, although we continue to estimate this model using the three major crime categories.

  23. Since FF is nearly invariant over the sample period of this study, we cannot estimate the effect of the efficiency channel (λ1) while also controlling for county fixed effects (βj). However, we have reported evidence regarding the efficiency channel of FF in (5); therefore, we focus on estimating the effect of the spillover channel and sorting channel of FF in this section.

  24. There is data attrition in Tables 5 and 6 for two reasons. First, we drop metropolitan areas that only contain one county. Second, the estimation of (8) requires that the data cover every county in a metropolitan area; otherwise, all the counties in the metropolitan area must be dropped.

  25. Note that the variable U is also included in X which means that the vector \(U_{jn} = {\text{diag}}\left( {u_{j,1} , \ldots ,u_{j,n} } \right)^{{\prime }}\) is included in \(X_{nt}\). For simplicity, we do not write it out explicitly.

  26. In dynamic panel models, the first difference and the Helmert transformation have often been used to eliminate the fixed effects. A special selection of \(F_{T,T - 1}\) gives rise to the Helmert transformation, where \(\varepsilon_{nt}\) is transformed to \(\left( {\frac{T - t}{{T - t + 1}}} \right)^{1/2} \left[ {\varepsilon_{nt} - \frac{1}{T - t}\left( {\varepsilon_{n,t + 1} + \cdots + \varepsilon_{nT} } \right)} \right]\), which is of particular interest for dynamic panel models.

References

  • Banzhaf, H. S., & Walsh, R. P. (2012). Do people vote with their feet? An empirical test of Tiebout’s mechanism. The American Economic Review, 98(3), 843–863.

  • Baum-Snow, N., & Lutz, B. F. (2011). School desegregation, school choice, and changes in residential location patterns by race. The American Economic Review, 101(7), 3019–3046.

  • Becker, G. S. (1968). Crime and punishment: An economic approach. Journal of Political Economy, 76(2), 169–217.

  • Brennan, G., & Buchanan, J. M. (1980). The power to tax: Analytical foundations of a fiscal constitution. Cambridge University Press.

  • Cowell, F. A., & Victoria-Feser, M. (1996). Robustness properties of inequality measures. Econometrica, 64(1), 77–101.

  • Cullen, J., & Levitt, S. D. (1999). Crime, urban flight, and the consequences for cities. The Review of Economics and Statistics, 81(2), 159–169.

  • Eberts, R. W., & Gronberg, T. J. (1981). Jurisdictional homogeneity and the Tiebout hypothesis. Journal of Urban Economics, 10(2), 227–239.

  • Ehrlich, I. (1996). Crime, punishment, and the market for offenses. The Journal of Economic Perspectives, 10(1), 43–67.

  • Elhorst, J. P. (2010). Dynamic panels with endogenous interaction effects when T is small. Regional Science and Urban Economics, 40(5), 272–282.

  • Epple, D., & Nechyba, T. (2004). Fiscal decentralization. In J. V. Henderson & J.-F. Thisse (Eds.), Handbook of regional and urban economics (Vol. 4, pp. 2423–2480). Elsevier.

  • Forbes, K., & Zampelli, E. (1989). Is Leviathan a mythical beast? The American Economic Review, 79(3), 568–577.

  • Found, A. (2012). Economies of scale in fire and police services in Ontario. IMFG Papers on Municipal Finance and Governance, 12, 1–29.

  • Gyimah-Brempong, K. (1987). Economies of scale in municipal police departments: The case of Florida. The Review of Economics and Statistics, 69(2), 352–356.

  • Hakim, S., Ovadia, A., Sagi, E., & Weinblatt, J. (1979). Interjurisdictional spillover of crime and police expenditure. Land Economics, 55(2), 200–212.

  • Hoxby, C. M. (2000). Does competition among public schools benefit students and taxpayers? The American Economic Review, 90(5), 1209–1238.

  • Kelejian, H., & Prucha, I. (1999). A generalized moments estimator for the autoregressive parameter in a spatial model. International Economic Review, 40(2), 509–533.

  • Levitt, S. D. (1998). The relationship between crime reporting and police: Implications for the use of uniform crime reports. Journal of Quantitative Criminology, 14, 319–352.

  • Levitt, S. D. (2002). Using electoral cycles in police hiring to estimate the effects of police on crime: Reply. The American Economic Review, 92(4), 1244–1250.

  • Liu, Y. (2014). Does competition for capital discipline governments? The role of fiscal equalization. International Tax and Public Finance, 21(3), 345–374.

  • Liu, Y. (2016). Do government preferences matter for tax competition? International Tax and Public Finance, 23(2), 343–367.

  • Lee, L., & Liu, X. (2010). Efficient GMM estimation of high order spatial autoregressive models with autoregressive disturbances. Econometric Theory, 26(1), 187–230.

  • Lee, L.-F., & Yu, J. (2010). Estimation of spatial autoregressive panel data models with fixed effects. Journal of Econometrics, 154(2), 165–185.

  • Machin, S., & Marie, O. (2011). Crime and police resources: The street crime initiative. Journal of the European Economic Association, 9, 678–701.

  • Mehay, S. L. (1977). Interjurisdictional spillovers of urban police services. Southern Economic Journal, 43(3), 1352–1359.

  • Moody, C., & Marvell, T. (2010). On the choice of control variables in the crime equation. Oxford Bulletin of Economics and Statistics, 72(5), 696–715.

  • Neyman, J., & Scott, E. L. (1948). Consistent estimates based on partially consistent observations. Econometrica, 16(1), 1–32.

  • Oates, W. E. (1972). Fiscal federalism. Harcourt.

  • Overton, M. (2017). Sorting through the determinants of local government competition. The American Review of Public Administration, 47(8), 914–928.

  • Rothstein, J. (2007). Does competition among public schools benefit students and taxpayers? Comment. The American Economic Review, 97(5), 2026–2037.

  • Scotchmer, S. (2002). Local public goods and clubs. In A. J. Auerbach & M. Feldstein (Eds.), Handbook of public economics (Vol. 4, pp. 1997–2042). Elsevier.

  • Smith, L. C. (2020). Rivers of power. Little, Brown Spark.

  • Southwick, L. (2005). Economies of scale and market power in policing. Managerial and Decision Economics, 26, 461–473.

  • Tiebout, C. M. (1956). A pure theory of local expenditures. Journal of Political Economy, 64, 416–424.

  • Urquiola, M. (2005). Does school choice lead to sorting? Evidence from Tiebout variation. The American Economic Review, 95(4), 1310–1326.

  • Wheaton, W. C. (2006). Metropolitan fragmentation, law enforcement effort and urban crime. Journal of Urban Economics, 60(1), 1–14.

  • Wolch, J. R., Pastor, M., & Dreier, P. (2004). Up against the sprawl: Public policy and the making of southern California. University of Minnesota Press.

Acknowledgements

Prof. Jenny Ligthart passed away on November 21, 2012. We remember her as an excellent researcher and a very good friend; Ruixin Wang and Jinghua Lei are deeply grateful for her supervision and help. We thank the editor and two anonymous reviewers for their insightful comments. We also thank Julie Cullen, Yaohui Dong, Bas van Groezen and Ben Vollaard for their helpful comments. Jinghua Lei gratefully acknowledges the financial support of the National Natural Science Foundation of China (NSFC) (Grant No. 71803186). All errors are our own.

Funding

Funding was provided by the National Natural Science Foundation of China (Grant No. 71803186).

Author information

Corresponding author

Correspondence to Ruixin Wang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Jenny Ligthart: Deceased.

Appendices

Appendix 1

In this section, we model the households’ choice of the level of public safety, using a median-voter framework. Assume households in jurisdiction \(j\) have identical Cobb–Douglas utility functions and differ in household income. Household \(i\) in jurisdiction \(j\) seeks to maximize utility by choosing the level of public safety, denoted by \(PS_{j}\), and the amount of a private good \(x_{i}\). We assume that public safety is a local public good. The household’s maximization problem is given by the following expressions:

$$\mathop {\max }\limits_{x,PS} u\left( {x_{i} ,PS_{j} } \right) = \alpha \log \left( {x_{i} } \right) + \beta \log \left( {PS_{j} } \right)$$
(8)
$${\rm s.t.}\quad P_{0} x_{i} + P_{j} PS_{j} = I_{i}$$
(9)

where \(P_{j}\) is the tax-price of public safety in jurisdiction \(j\); \(P_{0}\) is the price of a homogeneous private good; and \(I_{i}\) is the income of household \(i\). The necessary first-order conditions for a local maximum are given as follows:

$$u_{{x_{i} }} = P_{0} \lambda_{i}$$
(10)
$$u_{{PS_{j} }} = P_{j} \lambda_{i}$$
(11)

where \(\lambda_{i}\) is the Lagrange multiplier associated with the budget constraint of household \(i\) in jurisdiction \(j\). Solving for the demand for the private good and public safety, we obtain the following expressions:

$$x_{i}^{D} = \frac{{\alpha I_{i} }}{{P_{0} \left( {\alpha + \beta } \right)}}$$
(12)
$$PS_{ij}^{D} = \frac{{\beta I_{i} }}{{P_{j} \left( {\alpha + \beta } \right)}}$$
(13)
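
The closed-form demands (12) and (13) can be checked numerically. The following is a minimal sketch, not part of the article: the parameter values and the helper names `demands` and `utility` are hypothetical and chosen only for illustration. It confirms that the bundle exhausts the budget constraint (9) and that no other bundle on a grid along the budget line yields higher utility.

```python
import math

def demands(alpha, beta, income, p0, pj):
    """Closed-form demands (12) and (13) for the log Cobb-Douglas utility (8)."""
    share = alpha + beta
    x = alpha * income / (p0 * share)
    ps = beta * income / (pj * share)
    return x, ps

def utility(alpha, beta, x, ps):
    """Utility function (8)."""
    return alpha * math.log(x) + beta * math.log(ps)

# Hypothetical parameter values, for illustration only.
alpha, beta, income, p0, pj = 0.7, 0.3, 50_000.0, 1.0, 25.0
x_star, ps_star = demands(alpha, beta, income, p0, pj)

# The optimal bundle spends all income, as required by (9).
spent = p0 * x_star + pj * ps_star

# Brute-force comparison along the budget line: vary PS on a grid,
# back out x from the budget constraint, and record the best utility found.
best_u = utility(alpha, beta, x_star, ps_star)
grid_max = max(
    utility(alpha, beta, (income - pj * ps) / p0, ps)
    for ps in (ps_star * k / 1000 for k in range(1, 3000))
    if income - pj * ps > 0
)
```

With these values, a household spends the share \(\beta /(\alpha + \beta)\) of its income on public safety, which is the constant-expenditure-share property of Cobb–Douglas preferences.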

The preference for public safety is increasing with respect to \(I_{i}\) and is single-peaked, which assures us of an equilibrium. Following the median-voter theorem, the demand for public safety in jurisdiction \(j\), \(PS_{j}^{D}\), is decided by the median-income household in jurisdiction \(j\). Accordingly, the median-income household’s demand for public safety is given by the following expression:

$$PS_{j}^{D} = D\left( {P_{j} ,P_{0} ,IM_{j} } \right)$$
(14)

where again, \(P_{j}\) is the tax-price of public safety in jurisdiction \(j\); \(P_{0}\) is the price of a homogeneous private good; and \(IM_{j}\) is the income of the median-income household in jurisdiction \(j\). Every household in jurisdiction \(j\) consumes the same amount of public safety. Moreover, public safety obeys the law of downward-sloping demand and is a normal good.
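
The median-voter argument can be illustrated with a small simulation. This is a sketch under stated assumptions: the five incomes, the parameter values, and the function names are hypothetical, and each household is assumed to evaluate a proposed level of public safety with the utility function (8) along its own budget line (9). The level preferred by the median-income household should then win a pairwise majority vote against any alternative.

```python
import math
from statistics import median

# Hypothetical preference and price parameters, for illustration only.
ALPHA, BETA, P0, PJ = 0.7, 0.3, 1.0, 25.0

def ideal_level(income):
    """Household's preferred level of public safety, as in (13)."""
    return BETA * income / (PJ * (ALPHA + BETA))

def utility(income, ps):
    """Utility (8) along the budget line (9); infeasible levels get -inf."""
    x = (income - PJ * ps) / P0
    if x <= 0 or ps <= 0:
        return -math.inf
    return ALPHA * math.log(x) + BETA * math.log(ps)

# Hypothetical household incomes in one jurisdiction.
incomes = [20_000, 35_000, 50_000, 80_000, 200_000]

# Level chosen under the median-voter theorem, as in (14).
chosen = ideal_level(median(incomes))

def votes_for_chosen(alternative):
    """Number of households that strictly prefer the chosen level."""
    return sum(utility(i, chosen) > utility(i, alternative) for i in incomes)

alternatives = [240.0, 420.0, 700.0, 960.0]
majorities = [votes_for_chosen(a) for a in alternatives]
```

Because each household's utility is single-peaked in the level of public safety, the chosen level collects at least three of the five votes against every alternative in this example.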

Appendix 2

See Tables 7, 8, 9, 10, 11, 12 and 13.

Table 7 First-stage estimates of the 2SLS regressions
Table 8 OLS estimates of the effect of fiscal fragmentation on crime rates by type of crime (metropolitan areas)
Table 9 2SLS estimates of the effect of fiscal fragmentation on crime rates by type of crime (metropolitan areas)
Table 10 OLS estimates of the effect of fiscal fragmentation on intraregional disparities in crime rates by type of crime (metropolitan areas)
Table 11 2SLS estimates of the effect of fiscal fragmentation on intraregional disparities in crime rates by type of crime (metropolitan areas)
Table 12 2SLS estimates of the effect of fiscal fragmentation on crime rates by type of crime (metropolitan areas, alternative IVs)
Table 13 2SLS estimates of the effect of fiscal fragmentation on intraregional disparities in crime rates by type of crime (metropolitan areas, alternative IVs)

Appendix 3

Without loss of generality, specification (7) can be written in the following matrix form:

$$\begin{aligned} & Y_{nt} = \sum\limits_{j = 1}^{p} {\lambda_{j,0} U_{jn} W_{n} Y_{nt} + X_{nt} \beta_{0} + c_{n} } + \alpha_{t} \iota_{n} + v_{nt} , \\ & v_{nt} = \rho_{0} M_{n} v_{nt} + \varepsilon_{nt} ,\quad t = 1, \ldots T, \\ \end{aligned}$$
(15)

where \(Y_{nt}\) is an \(n \times 1\) vector of the dependent variable; \(U_{jn} = {\text{diag}}\left\{ {u_{j,1} , \ldots ,u_{j,n} } \right\}\), \(j = 1, \ldots ,p\), is a series of \(n \times n\) diagonal matrices of non-stochastic exogenous variables, with \(U_{1n} = I_{n}\) the identity matrix; \(X_{nt}\) is an \(n \times k\) matrix of non-stochastic exogenous variables (see Note 25); \(c_{n} = \left( {c_{1,n} , \ldots ,c_{n,n} } \right)^{\prime }\) is an \(n \times 1\) vector of individual fixed effects; \(\alpha_{t}\) is the time fixed effect; \(\iota_{n}\) is an \(n \times 1\) vector of ones; \(W_{n}\) and \(M_{n}\) are \(n \times n\) spatial weights matrices of known constants; and \(\varepsilon_{nt}\) is an \(n \times 1\) vector of disturbances whose elements \(\varepsilon_{it}\) are i.i.d. across \(i\) and \(t\) with zero mean and variance \(\sigma_{0}^{2}\). Thus, model (15) also allows for spatially correlated errors. For instance, in specification (7), \(Y_{nt} = \left( {CR_{11t} , \ldots ,CR_{{1J_{1} t}} , \ldots ,CR_{I1t} , \ldots ,CR_{{IJ_{I} t}} } \right)^{{\prime }}\), \(U_{2n} = {\text{diag}}\left\{ {FF_{1} , \ldots ,FF_{1} , \ldots ,FF_{I} , \ldots ,FF_{I} } \right\}\), and \(X_{nt} = \left( {X_{11t} , \ldots ,X_{{1J_{1} t}} , \ldots ,X_{I1t} , \ldots ,X_{{IJ_{I} t}} } \right)^{{\prime }}\). Because our empirical study has only one multiplicative interaction term in each specification, \(p = 2\). In general, model (15) allows \(p\) to be any finite number that satisfies the identification condition, which means that model (15) may contain several interaction terms.

Model (15) is related to the spatial autoregressive (SAR) model with high-order spatial lags, which can characterize spatial interdependence based on different types of relations (e.g., geographic distance, social relations) among cross-sectional units. Such different types of spatial interdependence are captured by different spatial weights matrices. If we denote \(W_{jn} = U_{jn} W_{n}\), then model (15) can be viewed as a \(p\)-order SAR model with first-order SAR disturbances (for short, \(SARAR\left( {p,1} \right)\); see Lee and Liu (2010)). In essence, however, model (15) captures spatial effects that vary with a (possibly exogenous) variable \(U_{jn}\), while the type of spatial interdependence is assumed to be fixed and exogenously given by the spatial weights matrix. The varying spatial effects are modeled as linear in \(U_{jn}\). One can extend model (15) to a functional-coefficient spatial model, in which the parameter \(\lambda\) that captures the spatial effects is an unspecified smooth function of \(U_{jn}\), to allow for possibly nonlinear varying spatial effects.
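
To make the varying spatial effect concrete, the following minimal sketch builds the matrix \(I_{n} - \sum_{j} \lambda_{j} U_{jn} W_{n}\) for a toy example with \(p = 2\). Everything here is a hypothetical illustration rather than the article's data or code: the four counties, the weights matrix, the fragmentation values, and the coefficients are made up, and `spatial_filter` is a name introduced for this example.

```python
def spatial_filter(lams, us, W):
    """Return I_n - sum_j lam_j * diag(u_j) * W as a nested list."""
    n = len(W)
    S = [[1.0 if i == k else 0.0 for k in range(n)] for i in range(n)]
    for lam, u in zip(lams, us):
        for i in range(n):
            for k in range(n):
                # diag(u) scales row i of W by u[i]
                S[i][k] -= lam * u[i] * W[i][k]
    return S

# Hypothetical setting: counties 0-1 form one metropolitan area,
# counties 2-3 form another; W is row-normalized with a zero diagonal.
W = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
ones = [1.0, 1.0, 1.0, 1.0]   # diagonal of U_1 = I_n
ff = [0.2, 0.2, 0.8, 0.8]     # diagonal of U_2: FF is constant within an area
lams = [0.3, 0.5]             # hypothetical (lambda_1, lambda_2)

S = spatial_filter(lams, [ones, ff], W)

# County i's effective spatial coefficient is lambda_1 + lambda_2 * FF_i.
effective = [lams[0] + lams[1] * f for f in ff]
```

In this example the spatial coefficient is 0.4 in the less fragmented area and 0.7 in the more fragmented one, which is what "spatial effects that are linear in \(U_{jn}\)" means in practice.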

A central problem in estimating panel models with fixed effects is how to eliminate them, both the individual fixed effects and the time fixed effects. However, from a methodological point of view, the asymptotics are of interest only when both \(n\) and \(T\) tend to infinity. When \(T\) tends to infinity, the time fixed effects may cause the incidental parameter problem (Neyman and Scott, 1948), in addition to the individual fixed effects, if we apply the direct maximum likelihood approach to jointly estimate the common parameters and the fixed effects. When \(T\) is finite, the time fixed effects can be regarded as a finite number of additional parameters, similar in role to \(\beta\), so they are of no concern. Because \(T\) is finite in our model, we can simply omit the time fixed effects to avoid the complication of eliminating them. If one is interested in the case in which \(T\) is large, the transformation approach in Lee and Yu (2010) can also be applied to eliminate the time fixed effects.

With a slight abuse of notation, we denote the time-demeaning operator by \(J_{T} = \left( {I_{T} - \frac{1}{T}\iota_{T} \iota_{T}^{{\prime }} } \right)\). After the fixed individual effects are eliminated, the transformed model (15) consists of

$$\begin{aligned} & Y_{nt} = \sum\limits_{j = 1}^{p} {\lambda_{j,0} U_{jn} W_{n} Y_{nt} + X_{nt} \beta_{0} + } \tilde{v}_{nt} , \\ & \tilde{v}_{nt} = \rho_{0} M_{n} \tilde{v}_{nt} + \varepsilon_{nt} . \\ \end{aligned}$$
(16)

Note that the resulting disturbances would be linearly dependent over the time dimension. To avoid creating linear dependence in the resulting disturbances, the transformation can instead be based on the orthonormal eigenvector matrix of \(J_{T}\), as in Lee and Yu (2010). The orthogonal transformation in Lee and Yu (2010), which eliminates the fixed individual effects, includes the Helmert transformation as a special case. Let \(\left[ {F_{T,T - 1} ,\frac{1}{\sqrt T }\iota_{T} } \right]\) be the orthonormal eigenvector matrix of \(J_{T}\), where \(F_{T,T - 1}\) is the \(T \times \left( {T - 1} \right)\) submatrix corresponding to the eigenvalues of one (see Note 26). For any \(n \times T\) matrix \(\left[ {A_{n1} , \ldots ,A_{nT} } \right]\), define the transformed \(n \times \left( {T - 1} \right)\) matrix \(\left[ {A_{n1}^{ * } , \ldots ,A_{{n\left( {T - 1} \right)}}^{*} } \right] = \left[ {A_{n1} , \ldots ,A_{nT} } \right]F_{T,T - 1}\). Similarly, \(X_{nt}^{*} = \left[ {X_{nt,1}^{ * } , \ldots ,X_{nt,k}^{ * } } \right]\). Note that \(U_{jn}\), which is included in \(X_{nt}\) and does not vary over time, is also eliminated by this transformation. Therefore, model (15) can be rewritten as

$$\begin{aligned} & Y_{nt}^{*} = \sum\limits_{j = 1}^{p} {\lambda_{j,0} U_{jn} W_{n} Y_{nt}^{*} + X_{nt}^{*} \beta_{0} } + v_{nt}^{*} , \\ & v_{nt}^{*} = \rho_{0} M_{n} v_{nt}^{*} + \varepsilon_{nt}^{ * } ,\quad t = 1, \ldots ,T - 1, \\ \end{aligned}$$
(17)

Because \(\left( {\varepsilon_{n1}^{*\prime} , \ldots ,\varepsilon_{{n\left( {T - 1} \right)}}^{*\prime} } \right)^{\prime } = \left( {F_{T,T - 1}^{{\prime }} \otimes I_{n} } \right)\left( {\varepsilon_{n1}^{{\prime }} , \ldots ,\varepsilon_{nT}^{{\prime }} } \right)^{{\prime }}\) and \(\varepsilon_{nt}\) are i.i.d.,

$$E\left( {\varepsilon_{n1}^{*\prime} , \ldots ,\varepsilon_{{n\left( {T - 1} \right)}}^{*\prime} } \right)^{{\prime }} \left( {\varepsilon_{n1}^{*\prime} , \ldots ,\varepsilon_{{n\left( {T - 1} \right)}}^{*\prime} } \right) = \sigma_{0}^{2} \left( {F_{T,T - 1}^{{\prime }} \otimes I_{n} } \right)\left( {F_{T,T - 1} \otimes I_{n} } \right) = \sigma_{0}^{2} I_{{n\left( {T - 1} \right)}} .$$

Hence, \(\varepsilon_{it}^{ * }\) is uncorrelated for all \(i\) and \(t\) (and independent under normality).
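
The properties of \(F_{T,T - 1}\) used above can be verified numerically for the Helmert case described in Note 26. The sketch below is illustrative only: it takes \(T = 3\) (matching three census years), and the function names, the fixed effect of 4.0, and the shock values are hypothetical. It checks that \(F_{T,T - 1}^{\prime} F_{T,T - 1} = I_{T - 1}\), that \(F_{T,T - 1} F_{T,T - 1}^{\prime} = J_{T}\), and that a time-invariant fixed effect drops out.

```python
import math

def helmert_matrix(T):
    """T x (T-1) matrix whose t-th column applies the Helmert
    (forward orthogonal deviation) weights from Note 26."""
    F = [[0.0] * (T - 1) for _ in range(T)]
    for t in range(1, T):                      # t is 1-based, as in Note 26
        remaining = T - t
        c = math.sqrt(remaining / (remaining + 1))
        F[t - 1][t - 1] = c                    # weight on period t
        for s in range(t, T):                  # periods t+1, ..., T
            F[s][t - 1] = -c / remaining
    return F

def matmul(A, B):
    """Matrix product of two nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(r) for r in zip(*A)]

T = 3
F = helmert_matrix(T)
FtF = matmul(transpose(F), F)   # should equal I_{T-1}
FFt = matmul(F, transpose(F))   # should equal J_T = I_T - (1/T) * ones

# One county: outcome = fixed effect + period shock (hypothetical numbers).
shocks = [0.5, -0.2, 0.1]
y = [4.0 + e for e in shocks]
y_star = matmul([y], F)[0]           # transformed outcome, length T-1
shock_star = matmul([shocks], F)[0]  # transformed shocks, length T-1
```

The transformed outcome equals the transformed shock series, so the fixed effect has been removed while the \(T - 1\) transformed observations remain uncorrelated when the original shocks are i.i.d.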

If the disturbances are normally distributed, the log-likelihood function of (17) is

$$\begin{aligned} \ln L_{n,T} \left( \theta \right) & = - \frac{{n\left( {T - 1} \right)}}{2}\ln \left( {2\pi \sigma^{2} } \right) + \left( {T - 1} \right)\left[ {\ln \left| {S_{n} \left( \lambda \right)} \right| + \ln \left| {R_{n} \left( \rho \right)} \right|} \right] \\ & \quad - \frac{1}{{2\sigma^{2} }}\sum\limits_{t = 1}^{T - 1} {\varepsilon_{nt}^{ * } \left( \varsigma \right)^{{\prime }} \varepsilon_{nt}^{ * } \left( \varsigma \right)} , \\ \end{aligned}$$
(18)

where \(\theta = \left( {\beta^{{\prime }} ,\lambda^{{\prime }} ,\rho ,\sigma^{2} } \right)^{{\prime }}\), \(\varsigma = \left( {\beta^{{\prime }} ,\lambda^{{\prime }} ,\rho } \right)^{{\prime }}\), \(\lambda = \left( {\lambda_{1} , \ldots ,\lambda_{p} } \right)^{\prime }\), \(\varepsilon_{nt}^{ * } \left( \varsigma \right) = R_{n} \left( \rho \right)\left[ {S_{n} \left( \lambda \right)Y_{nt}^{ * } - X_{nt}^{ * } \beta } \right]\), \(S_{n} \left( \lambda \right) = I_{n} - \sum\limits_{j = 1}^{p} {\lambda_{j} U_{jn} W_{n} }\), and \(R_{n} \left( \rho \right) = I_{n} - \rho M_{n}\). To guarantee that the log-likelihood function is well defined, we only consider the parameter space of \(\lambda\) and \(\rho\) such that \(\left| {S_{n} \left( \lambda \right)} \right| > 0\) and \(\left| {R_{n} \left( \rho \right)} \right| > 0\). Let \(\left\| \cdot \right\|\) be any matrix norm. Then \(\left\| {\sum\limits_{j = 1}^{p} {\lambda_{j} U_{jn} W_{n} } } \right\| \le \left( {\sum\limits_{j = 1}^{p} {\left| {\lambda_{j} } \right|} } \right) \cdot \max_{j = 1, \ldots ,p} \left\| {U_{jn} W_{n} } \right\|\), and the parameter space may be taken to be \(\sum\limits_{j = 1}^{p} {\left| {\lambda_{j} } \right|} < 1/\max_{j = 1, \ldots ,p} \left\| {U_{jn} W_{n} } \right\|\). This limits the spatial dependence among the units to a tractable degree (Kelejian and Prucha, 1999) and rules out the unit root case (in time series as a special case). It is also a sufficient condition for \(S_{n} \left( \lambda \right)^{ - 1}\) to be uniformly bounded in row and column sums in absolute value, because \(S_{n} \left( \lambda \right)^{ - 1} = I_{n} + \sum\limits_{j = 1}^{p} {\lambda_{j} U_{jn} W_{n} } + \left( {\sum\limits_{j = 1}^{p} {\lambda_{j} U_{jn} W_{n} } } \right)^{2} + \cdots\). When \(M_{n}\) is row-normalized, a possible parameter space for \(\rho\) is one satisfying \(\left| \rho \right| < 1\).
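
The parameter-space condition and the series expansion of \(S_{n} \left( \lambda \right)^{ - 1}\) can be checked on a toy example. This is a sketch under stated assumptions: the four-county weights matrix, the diagonal entries of \(U_{1n}\) and \(U_{2n}\), and the coefficient values are hypothetical, the maximum absolute row sum is used as the matrix norm, and the helper names are introduced here for illustration.

```python
def matmul(A, B):
    """Matrix product of two nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def row_sum_norm(A):
    """Maximum absolute row sum, one admissible matrix norm."""
    return max(sum(abs(v) for v in row) for row in A)

def identity(n):
    return [[1.0 if i == k else 0.0 for k in range(n)] for i in range(n)]

# Hypothetical inputs: two metropolitan areas with two counties each.
W = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
n = len(W)
us = [[1.0, 1.0, 1.0, 1.0], [0.2, 0.2, 0.8, 0.8]]  # diagonals of U_1, U_2
lams = [0.3, 0.5]

# U_j W scales row i of W by the i-th diagonal entry of U_j.
UW = [[[u[i] * W[i][k] for k in range(n)] for i in range(n)] for u in us]

# Parameter-space check: sum_j |lam_j| < 1 / max_j ||U_j W||.
bound = 1.0 / max(row_sum_norm(M) for M in UW)
in_parameter_space = sum(abs(l) for l in lams) < bound

# A = sum_j lam_j U_j W and S = I - A.
A = [[sum(l * M[i][k] for l, M in zip(lams, UW)) for k in range(n)] for i in range(n)]
S = [[identity(n)[i][k] - A[i][k] for k in range(n)] for i in range(n)]

# Truncated series I + A + A^2 + ... as an approximation to the inverse of S.
inv, term = identity(n), identity(n)
for _ in range(200):
    term = matmul(term, A)
    inv = [[inv[i][k] + term[i][k] for k in range(n)] for i in range(n)]

product = matmul(S, inv)
residual = max(abs(product[i][k] - identity(n)[i][k]) for i in range(n) for k in range(n))
```

Because the norm of \(\sum_{j} \lambda_{j} U_{jn} W_{n}\) is below one in this example, the truncated series converges and reproduces the inverse to numerical precision.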

For any \(n\)-dimensional column vectors \(p_{nt}\) and \(q_{nt}\), \(t = 1, \ldots ,T\), let \(\tilde{p}_{nt} = p_{nt} - \frac{1}{T}\sum\limits_{s = 1}^{T} {p_{ns} }\) denote the deviation from the time mean, so that \(\left( {\tilde{p}_{n1} , \ldots ,\tilde{p}_{nT} } \right) = \left( {p_{n1} , \ldots ,p_{nT} } \right)J_{T}\), and define \(\tilde{q}_{nt}\) analogously. Because \(F_{T,T - 1} F_{T,T - 1}^{{\prime }} = J_{T}\) and \(J_{T}\) is symmetric and idempotent,

$$\begin{aligned} \sum\limits_{t = 1}^{T - 1} {p_{nt}^{*\prime} q_{nt}^{*} } & = \left( {p_{n1}^{{\prime }} , \ldots ,p_{nT}^{{\prime }} } \right)\left( {F_{T,T - 1} \otimes I_{n} } \right)\left( {F^{\prime}_{T,T - 1} \otimes I_{n} } \right)\left( {q_{n1}^{{\prime }} , \ldots ,q_{nT}^{{\prime }} } \right)^{{\prime }} \\ & = \left( {p_{n1}^{{\prime }} , \ldots ,p_{nT}^{{\prime }} } \right)\left( {J_{T} \otimes I_{n} } \right)\left( {q_{n1}^{{\prime }} , \ldots ,q_{nT}^{{\prime }} } \right)^{{\prime }} = \sum\limits_{t = 1}^{T} {\tilde{p}_{nt}^{{\prime }} \tilde{q}_{nt} } . \\ \end{aligned}$$

Hence, the log-likelihood function can be rewritten as

$$\begin{aligned} \ln L_{n,T} \left( \theta \right) & = - \frac{{n\left( {T - 1} \right)}}{2}\ln \left( {2\pi \sigma^{2} } \right) + \left( {T - 1} \right)\left[ {\ln \left| {S_{n} \left( \lambda \right)} \right| + \ln \left| {R_{n} \left( \rho \right)} \right|} \right] \\ &\quad - \frac{1}{{2\sigma^{2} }}\sum\limits_{t = 1}^{T} {\tilde{\varepsilon }_{nt} \left( \varsigma \right)^{{\prime }} \tilde{\varepsilon }_{nt} \left( \varsigma \right)} , \\ \end{aligned}$$
(19)

where \(\tilde{\varepsilon }_{nt} \left( \varsigma \right) = R_{n} \left( \rho \right)\left[ {S_{n} \left( \lambda \right)\tilde{Y}_{nt} - \tilde{X}_{nt} \beta } \right]\), and \(\tilde{Y}_{nt}\) and \(\tilde{X}_{nt}\) are the deviations of \(Y_{nt}\) and \(X_{nt}\) from their time means. For model (15), the corresponding estimates of \(\beta\) and \(\sigma^{2}\) for given \(\left( {\lambda ,\rho } \right)\) are

$$\hat{\beta }_{nT} \left( {\lambda ,\rho } \right) = \left[ {\sum\limits_{t = 1}^{T} {\tilde{X}_{nt}^{{\prime }} R_{n} \left( \rho \right)^{{\prime }} R_{n} \left( \rho \right)\tilde{X}_{nt} } } \right]^{ - 1} \left[ {\sum\limits_{t = 1}^{T} {\tilde{X}_{nt}^{{\prime }} R_{n} \left( \rho \right)^{{\prime }} R_{n} \left( \rho \right)S_{n} \left( \lambda \right)\tilde{Y}_{nt} } } \right]$$
(20)
$$\begin{aligned} \hat{\sigma }_{nT}^{2} \left( {\lambda ,\rho } \right) & = \frac{1}{{n\left( {T - 1} \right)}}\sum\limits_{t = 1}^{T} {\left[ {S_{n} \left( \lambda \right)\tilde{Y}_{nt} - \tilde{X}_{nt} \hat{\beta }_{nT} \left( {\lambda ,\rho } \right)} \right]}^{{\prime }} R_{n} \left( \rho \right)^{{\prime }} R_{n} \left( \rho \right) \\ & \quad \left[ {S_{n} \left( \lambda \right)\tilde{Y}_{nt} - \tilde{X}_{nt} \hat{\beta }_{nT} \left( {\lambda ,\rho } \right)} \right]. \\ \end{aligned}$$
(21)

and the concentrated log-likelihood function of \(\left( {\lambda ,\rho } \right)\) is

$$\begin{aligned} \ln L_{n,T} \left( {\lambda ,\rho } \right) & = - \frac{{n\left( {T - 1} \right)}}{2}\left( {\ln \left( {2\pi } \right) + 1} \right) - \frac{{n\left( {T - 1} \right)}}{2}\ln \hat{\sigma }_{nT}^{2} \left( {\lambda ,\rho } \right) \\ &\quad + \left( {T - 1} \right)\left[ {\ln \left| {S_{n} \left( \lambda \right)} \right| + \ln \left| {R_{n} \left( \rho \right)} \right|} \right]. \\ \end{aligned}$$
(22)
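The concentration step in (20)–(22) can be illustrated with a short numerical sketch. It is not the authors' implementation. It assumes that \(R_{n}(\rho) = I_{n} - \rho M_{n}\), that each \(U_{jn}\) is diagonal, and that the data enter as deviations from individual time means. The function name `concentrated_loglik`, the array layouts, and the simulated inputs are hypothetical.

```python
import numpy as np

def concentrated_loglik(lams, rho, Y, X, U_list, W, M):
    """Evaluate the concentrated log-likelihood in the spirit of (20)-(22).

    Y has shape (T, n) and X has shape (T, n, k). Both are demeaned over
    time inside the function, which removes individual fixed effects.
    Returns (log-likelihood, beta_hat, sigma2_hat).
    """
    T, n = Y.shape
    Yd = Y - Y.mean(axis=0)                   # deviations from time means
    Xd = X - X.mean(axis=0)
    S = np.eye(n) - sum(l * U @ W for l, U in zip(lams, U_list))
    R = np.eye(n) - rho * M                   # assumption: R_n(rho) = I_n - rho M_n
    sign_S, logdet_S = np.linalg.slogdet(S)
    sign_R, logdet_R = np.linalg.slogdet(R)
    if sign_S <= 0 or sign_R <= 0:            # outside the admissible parameter space
        return -np.inf, None, None
    RX = np.einsum("ij,tjk->tik", R, Xd)      # R X_t for each period, shape (T, n, k)
    RSY = (R @ S @ Yd.T).T                    # R S Y_t for each period, shape (T, n)
    XtX = np.einsum("tik,til->kl", RX, RX)    # sum_t X_t' R' R X_t
    XtY = np.einsum("tik,ti->k", RX, RSY)     # sum_t X_t' R' R S Y_t
    beta = np.linalg.solve(XtX, XtY)          # analogue of (20)
    resid = RSY - RX @ beta                   # R (S Y_t - X_t beta), shape (T, n)
    sigma2 = (resid ** 2).sum() / (n * (T - 1))   # analogue of (21)
    ll = (-0.5 * n * (T - 1) * (np.log(2 * np.pi) + 1)
          - 0.5 * n * (T - 1) * np.log(sigma2)
          + (T - 1) * (logdet_S + logdet_R))  # analogue of (22)
    return ll, beta, sigma2

# Hypothetical inputs, used only to show the call.
rng = np.random.default_rng(1)
n, T, k = 8, 4, 2
A = rng.random((n, n))
np.fill_diagonal(A, 0.0)
W = A / A.sum(axis=1, keepdims=True)          # row-normalized weights
M = W.copy()
U_list = [np.diag(rng.random(n))]
X = rng.normal(size=(T, n, k))
Y = rng.normal(size=(T, n))

ll, beta_hat, sigma2_hat = concentrated_loglik([0.2], 0.1, Y, X, U_list, W, M)
print(ll, beta_hat, sigma2_hat)
```

In practice such a function would be passed to a numerical optimizer over \(\left( {\lambda ,\rho } \right)\) only, which is the point of concentrating out \(\beta\) and \(\sigma^{2}\). Because of the time-demeaning, adding a time-invariant individual effect to `Y` leaves the value unchanged.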

Theorem 1

Under the assumptions presented below, \(\theta_{0}\) is identified, and the quasi-maximum likelihood estimator \(\hat{\theta }_{nT}\) based on (15) is consistent, that is, \(\hat{\theta }_{nT} - \theta_{0} \mathop{\longrightarrow}\limits^{p}0\).

The assumptions are similar to Assumptions 1–8 in Lee and Yu (2010):

Assumption 1

\(W_{n}\) and \(M_{n}\) are non-stochastic spatial weights matrices with zero diagonals.

Assumption 2

The disturbances \(\left\{ {\varepsilon_{it} } \right\}\), \(i = 1, \ldots ,n\) and \(t = 1, \ldots ,T\), are i.i.d. across i and t with zero mean and variance \(\sigma_{0}^{2}\), and \(E|\varepsilon_{it} |^{4 + \eta } < \infty\) for some \(\eta > 0\).

Assumption 3

\(S_{n} \left( \lambda \right)\) and \(R_{n} \left( \rho \right)\) are invertible for all \(\lambda \in \Lambda\) and \(\rho \in {\rm P}\), where \(\Lambda\) is a compact set (\(\lambda\) being a \(p\)-dimensional vector) and \({\rm P}\) is a compact interval. Furthermore, \(\lambda_{0}\) is in the interior of \(\Lambda\), and \(\rho_{0}\) is in the interior of \({\rm P}\).

Assumption 4

n is large, while T can be finite or large.

Assumption 5

The elements of \(X_{nt}\) and \(U_{jn}\) are non-stochastic and bounded, uniformly in j, n, and t. Also, under the asymptotic setting in Assumption 4, the limit of \(\frac{1}{nT}\sum\limits_{t = 1}^{T} {\tilde{X}_{nt}^{\rm T} R_{n}^{\rm T} R_{n} \tilde{X}_{nt} }\) exists and is nonsingular.

Assumption 6

\(W_{n}\) and \(M_{n}\) are uniformly bounded in both row and column sums in absolute value (UB). Also \(S_{n}^{ - 1} \left( \lambda \right)\) and \(R_{n}^{ - 1} \left( \rho \right)\) are UB, uniformly in \(\lambda \in \Lambda\) and \(\rho \in {\rm P}\).

Assumption 7

Either (a) the limit of \(\frac{1}{{n\left( {T - 1} \right)}}\sum\limits_{t = 1}^{T} {\left( {\tilde{X}_{nt} ,G_{n} \tilde{X}_{nt} \beta_{0} } \right)^{\rm T} R_{n}^{\rm T} \left( \rho \right)} R_{n} \left( \rho \right)\left( {\tilde{X}_{nt} ,G_{n} \tilde{X}_{nt} \beta_{0} } \right)\) is nonsingular for each possible \(\rho \in {\rm P}\), and the limit of \(\left( {\frac{1}{n}\ln |\sigma_{0}^{2} \left( {R_{n}^{ - 1} } \right)^{\rm T} \left( {R_{n}^{ - 1} } \right)| - \frac{1}{n}\ln |\sigma_{n}^{2} \left( \rho \right)\left( {R_{n}^{ - 1} \left( \rho \right)} \right)^{\rm T} \left( {R_{n}^{ - 1} \left( \rho \right)} \right)|} \right)\) is not zero for \(\rho \ne \rho_{0}\); or (b) the limit of \(\left( {\frac{1}{n}\ln |\sigma_{0}^{2} \left( {R_{n}^{ - 1} } \right)^{\rm T} \left( {S_{n}^{ - 1} } \right)^{\rm T} \left( {S_{n}^{ - 1} } \right)\left( {R_{n}^{ - 1} } \right)| - \frac{1}{n}\ln |\sigma_{n}^{2} \left( {\lambda ,\rho } \right)\left( {R_{n}^{ - 1} \left( \rho \right)} \right)^{\rm T} \left( {S_{n}^{ - 1} \left( \lambda \right)} \right)^{\rm T} \left( {S_{n}^{ - 1} \left( \lambda \right)} \right)\left( {R_{n}^{ - 1} \left( \rho \right)} \right)|} \right)\) is not zero for \(\left( {\lambda ,\rho } \right) \ne \left( {\lambda_{0} ,\rho_{0} } \right)\), as n tends to infinity.

Denote \(C_{jn} = \ddot{C}_{jn} - \frac{{tr\left( {\ddot{C}_{jn} } \right)}}{n}I_{n}\) and \(D_{n} = M_{n} R_{n}^{ - 1} - \frac{{tr\left( {M_{n} R_{n}^{ - 1} } \right)}}{n}I_{n}\).

Assumption 8

The limit of \(\frac{1}{{n^{2} }}\left[ {tr\left( {C_{jn}^{s} C_{jn}^{s} } \right)tr\left( {D_{n}^{s} D_{n}^{s} } \right) - tr^{2} \left( {C_{jn}^{s} D_{n}^{s} } \right)} \right]\) is strictly positive for any \(j = 1,\ldots ,p\), as n tends to infinity.
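The quantity in Assumption 8 can be computed directly for given matrices. The sketch below is illustrative only. It assumes the common convention \(A^{s} = A + A^{\prime}\), takes \(R_{n} = I_{n} - \rho M_{n}\), and uses an arbitrary random matrix as a stand-in for \(\ddot{C}_{jn}\), which is not constructed here. The function names are hypothetical.

```python
import numpy as np

def sym(A):
    """A^s = A + A' (assumed convention for the superscript s)."""
    return A + A.T

def remove_mean_diagonal(A):
    """A - (tr(A) / n) I_n, so that the result has zero trace."""
    n = A.shape[0]
    return A - np.trace(A) / n * np.eye(n)

def assumption8_stat(C_raw, M, R):
    """(1/n^2) [tr(Cs Cs) tr(Ds Ds) - tr(Cs Ds)^2] for given matrices."""
    n = M.shape[0]
    C = remove_mean_diagonal(C_raw)
    D = remove_mean_diagonal(M @ np.linalg.inv(R))
    Cs, Ds = sym(C), sym(D)
    return (np.trace(Cs @ Cs) * np.trace(Ds @ Ds)
            - np.trace(Cs @ Ds) ** 2) / n ** 2

rng = np.random.default_rng(3)
n = 6
A = rng.random((n, n))
np.fill_diagonal(A, 0.0)
M = A / A.sum(axis=1, keepdims=True)     # row-normalized weights
R = np.eye(n) - 0.3 * M                  # assumption: R_n = I_n - rho M_n, rho = 0.3

C_generic = rng.random((n, n))           # hypothetical stand-in for the C matrix
C_degenerate = M @ np.linalg.inv(R)      # makes C_jn coincide with D_n

stat_generic = assumption8_stat(C_generic, M, R)
stat_degenerate = assumption8_stat(C_degenerate, M, R)
print(stat_generic, stat_degenerate)
```

By the Cauchy–Schwarz inequality for the trace inner product on symmetric matrices, the quantity is never negative, and it equals zero when \(C_{jn}^{s}\) and \(D_{n}^{s}\) are proportional. The assumption therefore rules out the degenerate case in the limit.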

Theorem 2

Under the same set of assumptions, and with \(\Sigma_{{\theta_{0} ,nT}}\) and \(\Omega_{{\theta_{0} ,nT}}\) as defined below,

$$\sqrt {n(T - 1)} \left( {\hat{\theta }_{nT} - \theta_{0} } \right)\mathop{\longrightarrow}\limits^{d}N\left( {0,\lim \Sigma_{{\theta_{0} ,nT}}^{ - 1} \left( {\Sigma_{{\theta_{0} ,nT}} + \Omega_{{\theta_{0} ,nT}} } \right)\Sigma_{{\theta_{0} ,nT}}^{ - 1} } \right).$$
Here the information matrix is

$$\begin{aligned} \Sigma_{{\theta_{0} ,nT}} & = - E\left( {\frac{1}{{n\left( {T - 1} \right)}}\frac{{\partial^{2} \ln L_{nT} \left( {\theta_{0} } \right)}}{{\partial \theta \partial \theta^{\rm T} }}} \right) \\ & = \frac{1}{{\sigma_{0}^{2} }}\left( {\begin{array}{lll} {\frac{1}{{n\left( {T - 1} \right)}}\sum\nolimits_{t = 1}^{T} {\left( {\ddot{X}_{nt} ,\ddot{G}_{1n} \ddot{X}_{nt} \beta_{0} , \ldots ,\ddot{G}_{pn} \ddot{X}_{nt} \beta_{0} } \right)^{\rm T} \left( {\ddot{X}_{nt} ,\ddot{G}_{1n} \ddot{X}_{nt} \beta_{0} , \ldots ,\ddot{G}_{pn} \ddot{X}_{nt} \beta_{0} } \right)} } & * & * \\ {0_{{1 \times \left( {k + p} \right)}} } & 0 & * \\ {0_{{1 \times \left( {k + p} \right)}} } & 0 & 0 \\ \end{array} } \right) \\ & \quad + \left( {\begin{array}{llllll} {0_{k \times k} } & * & \cdots & * & * & * \\ {0_{1 \times k} } & {\frac{1}{n}tr\left( {\ddot{G}_{1n}^{s} \ddot{G}_{1n} } \right)} & \cdots & {\frac{1}{n}tr\left( {\ddot{G}_{1n}^{s} \ddot{G}_{pn} } \right)} & * & * \\ \vdots & \vdots & \ddots & \vdots & * & * \\ {0_{1 \times k} } & {\frac{1}{n}tr\left( {\ddot{G}_{pn}^{s} \ddot{G}_{1n} } \right)} & \cdots & {\frac{1}{n}tr\left( {\ddot{G}_{pn}^{s} \ddot{G}_{pn} } \right)} & * & * \\ {0_{1 \times k} } & {\frac{1}{n}tr\left( {\left( {M_{n} R_{n}^{ - 1} } \right)^{s} \ddot{G}_{1n} } \right)} & \cdots & {\frac{1}{n}tr\left( {\left( {M_{n} R_{n}^{ - 1} } \right)^{s} \ddot{G}_{pn} } \right)} & {\frac{1}{n}tr\left( {\left( {M_{n} R_{n}^{ - 1} } \right)^{s} \left( {M_{n} R_{n}^{ - 1} } \right)} \right)} & * \\ {0_{1 \times k} } & {\frac{1}{{\sigma_{0}^{2} n}}tr\left( {\ddot{G}_{1n} } \right)} & \cdots & {\frac{1}{{\sigma_{0}^{2} n}}tr\left( {\ddot{G}_{pn} } \right)} & {\frac{1}{{\sigma_{0}^{2} n}}tr\left( {M_{n} R_{n}^{ - 1} } \right)} & {\frac{1}{{2\sigma_{0}^{4} }}} \\ \end{array} } \right), \\ \end{aligned}$$

and the additional term, which vanishes when \(\mu_{4} = 3\sigma_{0}^{4}\) (as under normal disturbances), is

$$\begin{aligned} \Omega_{{\theta_{0} ,nT}} & = \frac{{\left( {T - 1} \right)}}{T}\frac{{\left( {\mu_{4} - 3\sigma_{0}^{4} } \right)}}{{\sigma_{0}^{4} }} \\ & \quad \times \left( {\begin{array}{cccccc} {0_{k \times k} } & * & \cdots & * & * & * \\ {0_{1 \times k} } & {\frac{1}{n}\sum\nolimits_{i = 1}^{n} {\left( {\ddot{G}_{1n}^{2} } \right)}_{ii} } & \cdots & {\frac{1}{n}\sum\nolimits_{i = 1}^{n} {\left( {\ddot{G}_{pn} \ddot{G}_{1n} } \right)}_{ii} } & * & * \\ \vdots & \vdots & \ddots & \vdots & * & * \\ {0_{1 \times k} } & {\frac{1}{n}\sum\nolimits_{i = 1}^{n} {\left( {\ddot{G}_{1n} \ddot{G}_{pn} } \right)}_{ii} } & \cdots & {\frac{1}{n}\sum\nolimits_{i = 1}^{n} {\left( {\ddot{G}_{pn}^{2} } \right)}_{ii} } & * & * \\ {0_{1 \times k} } & {\frac{1}{n}\sum\nolimits_{i = 1}^{n} {\left( {\ddot{G}_{1n,ii} } \right)} \left( {M_{n} R_{n}^{ - 1} } \right)_{ii} } & \cdots & {\frac{1}{n}\sum\nolimits_{i = 1}^{n} {\left( {\ddot{G}_{pn,ii} } \right)} \left( {M_{n} R_{n}^{ - 1} } \right)_{ii} } & {\frac{1}{n}\sum\nolimits_{i = 1}^{n} {\left( {M_{n} R_{n}^{ - 1} } \right)}_{ii}^{2} } & * \\ {0_{1 \times k} } & {\frac{1}{{2\sigma_{0}^{2} n}}tr\left( {\ddot{G}_{1n} } \right)} & \cdots & {\frac{1}{{2\sigma_{0}^{2} n}}tr\left( {\ddot{G}_{pn} } \right)} & {\frac{1}{{2\sigma_{0}^{2} n}}tr\left( {M_{n} R_{n}^{ - 1} } \right)} & {\frac{1}{{4\sigma_{0}^{4} }}} \\ \end{array} } \right), \\ \end{aligned}$$

where \(\mu_{4}\) denotes the fourth moment of \(\varepsilon_{it}\) and \(*\) marks entries determined by symmetry.
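The limiting variance in Theorem 2 has a sandwich form, which the following sketch evaluates for given matrices. It is an illustration under stated assumptions rather than the authors' procedure: the two input matrices are arbitrary symmetric placeholders instead of the expressions above, and the function names (`sandwich_cov`, `qmle_std_errors`, `kurtosis_factor`) are hypothetical.

```python
import numpy as np

def sandwich_cov(Sigma, Omega):
    """Sigma^{-1} (Sigma + Omega) Sigma^{-1}, the limiting variance form."""
    Sigma_inv = np.linalg.inv(Sigma)
    return Sigma_inv @ (Sigma + Omega) @ Sigma_inv

def qmle_std_errors(Sigma, Omega, n, T):
    """Standard errors implied by the sqrt(n (T - 1)) normalization."""
    V = sandwich_cov(Sigma, Omega)
    return np.sqrt(np.diag(V) / (n * (T - 1)))

def kurtosis_factor(mu4, sigma2, T):
    """Scalar multiplying the Omega matrix: (T-1)/T * (mu4 - 3 s^4) / s^4."""
    return (T - 1) / T * (mu4 - 3.0 * sigma2 ** 2) / sigma2 ** 2

# Placeholder matrices (hypothetical numbers, not estimates from the paper).
rng = np.random.default_rng(4)
d = 5
A = rng.normal(size=(d, d))
Sigma = A @ A.T + np.eye(d)              # symmetric positive definite
B = rng.normal(size=(d, d))
Omega = 0.1 * B @ B.T                    # symmetric positive semidefinite

se_normal = qmle_std_errors(Sigma, np.zeros((d, d)), n=100, T=3)
se_robust = qmle_std_errors(Sigma, Omega, n=100, T=3)
print(se_normal)
print(se_robust)
print(kurtosis_factor(mu4=3.0, sigma2=1.0, T=3))  # zero when mu4 = 3 sigma^4
```

When the second matrix is zero, the sandwich collapses to the inverse of the first matrix, which is the usual maximum likelihood variance. With the positive semidefinite placeholder used here, the standard errors are weakly larger.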


Lei, J., Ligthart, J., Rider, M. et al. Fiscal fragmentation and crime control: Is there an efficiency-equity tradeoff?. Int Tax Public Finance 29, 751–787 (2022). https://doi.org/10.1007/s10797-021-09692-z
