Abstract
While accurate data are critical in understanding crime and assessing criminal justice policy, data on crime and illicit activities are invariably measured with error. In this chapter, we illustrate and evaluate several examples of measurement error in criminal justice data. Errors are evidently pervasive, systematic, frequently related to behaviors and policies of interest, and unlikely to conform to convenient textbook assumptions. Using both convolution and mixing models of the measurement error generating process, we demonstrate the effects of data error on identification and statistical inference. Even small amounts of data error can have considerable consequences. Throughout this chapter, we emphasize the value of auxiliary data and reasonable assumptions in achieving informative inferences, but caution against reliance on strong and untenable assumptions about the error generating process.
Notes
- 1.
An extensive literature attempts to document the validity and reliability of criminal justice data (Lynch and Addington 2007; Mosher et al. 2002). In this chapter, we make no attempt to fully summarize this literature and offer no specific suggestions for modifying surveys to improve the quality of collected data.
- 2.
To accommodate proxy errors, this model has often been generalized by including a factor of proportionality linking the observed and true variables. For example, \(y = \delta y^{\ast} + \nu\), where \(\delta\) is an unknown parameter. For brevity, we focus on the pure measurement model without an unknown factor of proportionality. Including a scaling factor of unknown magnitude or sign induces obvious complications beyond those discussed here. For additional details, see Wooldridge (2002, 63–67) and Bound et al. (2001, 3715–3716).
- 3.
- 4.
When the available measurement of \(y^{\ast}\) is a proxy variable of the form \(y = \delta y^{\ast} + \nu\), the probability limit of \(\hat{\beta}_{y,x^{\ast}}\) is \(\delta\beta\). If \(\delta\) is known in sign but not magnitude, then the sign, but not scale, of \(\beta\) is identified.
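The probability limit in this note can be checked in a few lines. The derivation below is a sketch under assumptions that are not restated in the note itself: a linear model \(y^{\ast} = \alpha + \beta x^{\ast} + \varepsilon\), a correctly measured regressor \(x^{\ast}\) with variance \(\sigma_{x^{\ast}}^{2}\), and errors \(\varepsilon\) and \(\nu\) that are uncorrelated with \(x^{\ast}\).

```latex
\begin{align*}
\operatorname{plim}\hat{\beta}_{y,x^{\ast}}
  &= \frac{\operatorname{Cov}(y, x^{\ast})}{\operatorname{Var}(x^{\ast})}
   = \frac{\operatorname{Cov}(\delta y^{\ast} + \nu,\; x^{\ast})}{\sigma_{x^{\ast}}^{2}} \\
  &= \frac{\delta\,\operatorname{Cov}(\alpha + \beta x^{\ast} + \varepsilon,\; x^{\ast})
           + \operatorname{Cov}(\nu, x^{\ast})}{\sigma_{x^{\ast}}^{2}}
   = \frac{\delta\beta\,\sigma_{x^{\ast}}^{2}}{\sigma_{x^{\ast}}^{2}}
   = \delta\beta .
\end{align*}
```

Because \(\delta\) and \(\beta\) enter only through their product, knowing the sign of \(\delta\) pins down the sign of \(\beta\) but leaves its magnitude unidentified.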
- 5.
Although a full treatment of the effects of measurement error in multivariate regression is beyond the scope of this chapter, several general results are worth mentioning. First, measurement error in any one regressor will usually affect the consistency of all other parameter estimators. Second, when only a single regressor is measured with classical error, the OLS estimator of the coefficient associated with the error-ridden variable suffers attenuation bias in the standard sense (see, for example, Wooldridge 2002, 75). In general, all other OLS parameters are also inconsistent, and the direction of inconsistency can be asymptotically signed by the available data. Finally, with measurement error in multiple regressors, classical assumptions imply that the probability limit of the OLS parameter vector is usually attenuated in an average sense, but there are important exceptions (Wansbeek and Meijer 2000, 17–20).
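The first two results can be illustrated with a small simulation. This is a minimal sketch, not the chapter's own analysis: the data-generating process (two standard normal regressors with correlation `rho`, unit coefficients, classical error with variance `noise_var` added only to the first regressor) and all function names are assumptions chosen for illustration.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (n - 1)


def ols_two_regressors(y, x1, x2):
    """OLS slopes from regressing y on a constant, x1, and x2."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    slope1 = (s22 * s1y - s12 * s2y) / det
    slope2 = (s11 * s2y - s12 * s1y) / det
    return slope1, slope2


def simulate(n=100_000, rho=0.5, noise_var=0.5, beta1=1.0, beta2=1.0, seed=0):
    """Regress y on a noisy x1 and an error-free x2 (hypothetical setup)."""
    rng = random.Random(seed)
    y, x1_obs, x2 = [], [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x1_true = z1
        x2_i = rho * z1 + (1 - rho ** 2) ** 0.5 * z2  # corr(x1*, x2) = rho
        y.append(beta1 * x1_true + beta2 * x2_i + rng.gauss(0, 1))
        # Classical error: mean zero, independent of everything else.
        x1_obs.append(x1_true + rng.gauss(0, noise_var ** 0.5))
        x2.append(x2_i)
    return ols_two_regressors(y, x1_obs, x2)


b1_hat, b2_hat = simulate()
print(f"coefficient on mismeasured x1: {b1_hat:.3f} (true value 1.0)")
print(f"coefficient on error-free x2:  {b2_hat:.3f} (true value 1.0)")
```

Under these assumed parameter values the large-sample limits are 0.6 for the mismeasured regressor and 1.2 for the correctly measured one, so the error in one variable attenuates its own coefficient and also contaminates the other.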
- 6.
- 7.
Of the required conditions for using a second measurement as an instrumental variable, the assumption that the two errors are uncorrelated, \(\sigma_{\eta,\mu} = 0\), may be the most difficult to satisfy in practice. Even if both errors are classical in other regards, errors in different measurements of the same variable may be expected to correlate so that \(\sigma_{\eta,\mu} > 0\). When covariance between \(\mu\) and \(\eta\) is nonzero, the IV slope estimator is no longer a consistent point estimator of \(\beta\), though it may still provide an informative bound on \(\beta\) under certain circumstances (see, for example, Bound et al. 2001, 3730; Black et al. 2000).
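A short simulation makes the point concrete. The sketch below rests on assumptions that are not spelled out in this note: a bivariate model \(y = \beta x^{\ast} + \varepsilon\), a first measurement \(x_1 = x^{\ast} + \mu\) used as the regressor, a second measurement \(x_2 = x^{\ast} + \eta\) used as the instrument, and equal error variances. The function names and parameter values are hypothetical.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (n - 1)


def simulate_iv(n=100_000, beta=1.0, err_var=0.5, err_corr=0.5, seed=0):
    """Return (OLS slope, IV slope) when the two measurement errors
    have correlation err_corr (hypothetical data-generating process)."""
    rng = random.Random(seed)
    sd = err_var ** 0.5
    y, x1, x2 = [], [], []
    for _ in range(n):
        x_true = rng.gauss(0, 1)
        a, b = rng.gauss(0, 1), rng.gauss(0, 1)
        mu = sd * a                                             # error in x1
        eta = sd * (err_corr * a + (1 - err_corr ** 2) ** 0.5 * b)  # error in x2
        y.append(beta * x_true + rng.gauss(0, 1))
        x1.append(x_true + mu)
        x2.append(x_true + eta)
    ols = cov(x1, y) / cov(x1, x1)   # regress y on the noisy x1
    iv = cov(x2, y) / cov(x2, x1)    # instrument x1 with x2
    return ols, iv


ols_corr, iv_corr = simulate_iv(err_corr=0.5)
ols_zero, iv_zero = simulate_iv(err_corr=0.0)
print(f"correlated errors:   OLS {ols_corr:.3f}, IV {iv_corr:.3f}")
print(f"uncorrelated errors: OLS {ols_zero:.3f}, IV {iv_zero:.3f}")
```

In this setup the IV slope converges to \(\beta\sigma_{x^{\ast}}^{2}/(\sigma_{x^{\ast}}^{2} + \sigma_{\eta,\mu})\). With a positive error covariance that is smaller than the error variance, the IV estimate is attenuated, but less so than OLS, which is one way a second measurement can still bound \(\beta\) from below.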
- 8.
- 9.
Similar logic suggests violation for any discrete variable, and any continuous but bounded variable.
- 10.
Failure of assumption A2 affects inference regarding α, but not β (see, for example, Illustration 1).
- 11.
- 12.
- 13.
Note that the variance of \(\Delta x_{i}^{\ast}\) is smaller when \(x_{i}^{\ast}\) has positive autocorrelation: \(\sigma_{\Delta x^{\ast}}^{2} = 2\sigma_{x^{\ast}}^{2}(1 - \rho_{x_{2}^{\ast},x_{1}^{\ast}})\). To see why the relative strength of serial correlation is a concern, suppose that \(\rho_{x_{2}^{\ast},x_{1}^{\ast}} > 1/2\) while random measurement errors exhibit no autocorrelation. This implies \(\sigma_{\Delta x^{\ast}}^{2} < \sigma_{x^{\ast}}^{2}\) while \(\sigma_{\Delta\mu}^{2} = 2\sigma_{\mu}^{2}\), so attenuation bias will be greater after first differencing the data.
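The comparison can be made numerically with the two variance formulas above. The sketch below assumes the standard classical errors-in-variables result that the OLS slope is scaled by the reliability ratio (signal variance over signal plus error variance), and a two-period panel. The function names and example values are illustrative only.

```python
def reliability_levels(var_x, var_err):
    """Attenuation factor for OLS in levels under classical error."""
    return var_x / (var_x + var_err)


def reliability_first_diff(var_x, var_err, rho_x, rho_err=0.0):
    """Attenuation factor after first differencing a two-period panel.

    var_dx   = 2 * var_x   * (1 - rho_x)    variance of the differenced signal
    var_derr = 2 * var_err * (1 - rho_err)  variance of the differenced error
    """
    var_dx = 2 * var_x * (1 - rho_x)
    var_derr = 2 * var_err * (1 - rho_err)
    return var_dx / (var_dx + var_derr)


# Hypothetical values: signal variance 1, error variance 0.25,
# signal autocorrelation 0.6, serially uncorrelated errors.
levels = reliability_levels(1.0, 0.25)
diffs = reliability_first_diff(1.0, 0.25, rho_x=0.6)
print(f"levels: {levels:.3f}, first differences: {diffs:.3f}")
```

With these values the slope retains 80 percent of its true magnitude in levels but only about 62 percent after differencing, because differencing removes much of the persistent signal while doubling the variance of the serially uncorrelated error.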
- 14.
As discussed in the previous section, when a variable with bounded support is imperfectly classified, it is widely recognized that the classical errors-in-variables model assumption of independence between measurement error and true variable cannot hold. Molinari (2008) presents an alternative and useful conceptualization of the data error problem for discrete outcome variables. This “direct misclassification” approach allows one to focus on assumptions related to classification error rates instead of restrictions on the mixing process.
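To show what reasoning in terms of classification error rates can look like, the sketch below works through the simplest binary case. It is not Molinari's estimator. It uses only the accounting identity that the observed rate equals true positives reported correctly plus true negatives reported as positives, and it assumes the analyst can supply a false positive rate and a false negative rate, or ranges for them. All names and numbers are hypothetical.

```python
from itertools import product


def true_rate(p_obs, false_pos, false_neg):
    """Solve p_obs = (1 - false_neg) * pi + false_pos * (1 - pi) for pi.

    false_pos = P(reported 1 | truly 0), false_neg = P(reported 0 | truly 1).
    The result is clipped to the unit interval.
    """
    denom = 1.0 - false_pos - false_neg
    if denom <= 0:
        raise ValueError("error rates must satisfy false_pos + false_neg < 1")
    return min(1.0, max(0.0, (p_obs - false_pos) / denom))


def true_rate_bounds(p_obs, fp_range, fn_range):
    """Bounds on the true rate when each error rate is only known to lie
    in an interval. The solution is monotone in each rate separately,
    so checking the four corners of the rectangle is sufficient."""
    values = [true_rate(p_obs, fp, fn) for fp, fn in product(fp_range, fn_range)]
    return min(values), max(values)


# Hypothetical example: 10 percent report the behavior; up to 2 percent
# false positives and up to 20 percent false negatives are thought possible.
lower, upper = true_rate_bounds(0.10, fp_range=(0.0, 0.02), fn_range=(0.0, 0.20))
print(f"true rate lies in [{lower:.4f}, {upper:.4f}]")
```

Even modest error rates widen a point estimate of 0.10 into an interval of roughly 0.08 to 0.125 in this example, and the interval collapses back to a point only if both rates are known exactly.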
- 15.
- 16.
In this discussion, we are concerned with drawing inferences on the true rate of aggravated assault reported to the police. Inferences regarding the overall rate of aggravated assault – known and unknown to the police – are complicated by the proxy variables problem discussed previously.
References
Anglin MD, Caulkins JP, Hser Y (1993) Prevalence estimation: policy needs, current status, and future potential. J Drug Issues 23(2):345–360
Azrael D, Cook PJ, Miller M (2001) State and local prevalence of firearms ownership: measurement structure and trends, National Bureau of Economic Research: Working Paper 8570
Bennett T, Holloway K, Farrington D (2008) The statistical association between drug misuse and crime: a meta-analysis. Aggress Violent Behav 13:107–118
Black D (1970) Production of crime rates. Am Sociol Rev 35:733–48
Black DA, Berger MC, Scott FA (2000) Bounding parameter estimates with nonclassical measurement error. J Am Stat Assoc 95(451):739–748
Blumstein A, Rosenfeld R (2009) Factors contributing to U.S. crime trends. In: Goldberger AS, Rosenfeld R (eds) Understanding crime trends: workshop report. Committee on Understanding Crime Trends, Committee on Law and Justice, Division of Behavioral and Social Sciences and Education. The National Academies Press, Washington, DC
Bollinger C (1996) Bounding mean regressions when a binary variable is mismeasured. J Econom 73(2):387–99
Bound J, Brown C, Mathiowetz N (2001) Measurement error in survey data. In: Heckman J, Leamer E (eds) Handbook of econometrics, 5, Ch. 59:3705–3843
Chaiken JM, Chaiken MR (1990) Drugs and predatory crime. Crime Justice: Drugs Crime, 13:203–239
Dominitz J, Sherman R (2004) Sharp bounds under contaminated or corrupted sampling with verification, with an application to environmental pollutant data. J Agric Biol Environ Stat 9(3):319–338
Frazis H, Loewenstein M (2003) Estimating linear regressions with mismeasured, possibly endogenous, binary explanatory variables. J Econom 117:151–178
Frisch R (1934) Statistical confluence analysis by means of complete regression systems. University Institute for Economics, Oslo
Griliches Z, Hausman JA (1985) Errors in variables in panel data: a note with an example. J Econom 31(1):93–118
Harrison LD (1995) The validity of self-reported data on drug use. J Drug Issues 25(1):91–111
Harrison L, Hughes A (1997) Introduction – the validity of self-reported drug use: improving the accuracy of survey estimates. In: Harrison L, Hughes A (eds) The validity of self-reported drug use: improving the accuracy of survey estimates. NIDA Research Monograph, vol 167. US Department of Health and Human Services, Rockville, MD, pp 1–16
Horowitz J, Manski C (1995) Identification and robustness with contaminated and corrupted data. Econometrica 63(2):281–302
Hotz J, Mullins C, Sanders S (1997) Bounding causal effects using data from a contaminated natural experiment: analyzing the effects of teenage childbearing. Rev Econ Stud 64(4):575–603
Huber P (1981) Robust statistics. Wiley, New York
Johnston LD, O’Malley PM, Bachman JG (1998) National survey results on drug use from the monitoring the future study, 1975–1997, Volume I: Secondary school students. NIH Publication No. 98-4345. National Institute on Drug Abuse, Rockville, MD
Kleck G, Gertz M (1995) Armed resistance to crime: The prevalence and nature of self-defense with a gun. J Crim Law Criminol 86:150–187
Klepper S, Leamer EE (1984) Consistent sets of estimates for regressions with errors in all variables. Econometrica 52(1):163–183
Koss M (1993) Detecting the scope of rape: A review of prevalence research methods. J Interpers Violence 8:198–222
Koss M (1996) The measurement of rape victimization in crime surveys. Crim Justice Behav 23:55–69
Kreider B, Hill S (2009) Partially identifying treatment effects with an application to covering the uninsured. J Hum Resour 44(2):409–449
Kreider B, Pepper J (2007) Disability and employment: reevaluating the evidence in light of reporting errors. J Am Stat Assoc 102(478):432–441
Kreider B, Pepper J (2008) Inferring disability status from corrupt data. J Appl Econom 23(3):329–349
Kreider B, Pepper J (forthcoming) Identification of expected outcomes in a data error mixing model with multiplicative mean independence. J Business Econ Stat
Kreider B, Pepper J, Gundersen C, Jolliffe D (2009) Identifying the effects of food stamps on children’s health outcomes when participation is endogenous and misreported. Working Paper
Lambert D, Tierney L (1997) Nonparametric maximum likelihood estimation from samples with irrelevant data and verification bias. J Am Stat Assoc 92:937–944
Lynch JP, Addington LA (eds) (2007) Understanding crime statistics: revisiting the divergence of the NCVS and UCR. Cambridge University Press, Cambridge
Lynch J, Jarvis J (2008) Missing data and imputation in the uniform crime reports and the effects on national estimates. J Contemp Crim Justice 24(1):69–85. doi:10.1177/1043986207313028
Maltz M (1999) Bridging gaps in police crime data: a discussion paper from the BJS Fellows Program Bureau of Justice Statistics, Government Printing Office, Washington, DC
Maltz MD, Targonski J (2002) A note on the use of county-level UCR data. J Quant Criminol 18:297–318
Manski CF (2007) Identification for prediction and decisions. Harvard University Press, Cambridge, MA
Manski CF, Newman J, Pepper JV (2002) Using performance standards to evaluate social programs with incomplete outcome data: general issues and application to a higher education block grant program. 26(4), 355–381
McDowall D, Loftin C (2007) What is convergence, and what do we know about it? In: Lynch J, Addington LA (eds) Understanding crime statistics: revisiting the divergence of the NCVS and UCR, Ch. 4: 93–124
McDowall D, Loftin C, Wiersema B (1998) Estimates of the frequency of firearm self-defense from the redesigned national crime victimization survey. Violence Research Group Discussion Paper 20
Molinari F (2008) Partial identification of probability distributions with misclassified data. J Econom 144(1):81–117
Mosher CJ, Miethe TD, Phillips DM (2002) The mismeasure of crime. Sage Publications, Thousand Oaks, CA
Mullin CH (2005) Identification and estimation with contaminated data: when do covariate data sharpen inference? J Econom 130:253–272
National Research Council (2001) Informing America’s policy on illegal drugs: what we don’t know keeps hurting us. Committee on Data and Research for Policy on Illegal Drugs. In: Manski CF, Pepper JV, Petrie CV (eds) Committee on Law and Justice and Committee on National Statistics. Commission on Behavioral and Social Sciences and Education. National Academy Press, Washington, DC
National Research Council (2003) Measurement problems in criminal justice research: workshop summary. Pepper JV, Petrie CV (eds) Committee on Law and Justice and Committee on National Statistics, Division of Behavioral and Social Sciences and Education. The National Academies Press, Washington, DC
National Research Council (2005) Firearms and violence: a critical review. Committee to improve research information and data on firearms. In: Wellford CF, Pepper JV, Petrie CV (eds) Committee on Law and Justice, Division of Behavioral and Social Sciences and Education. The National Academies Press, Washington, DC
National Research Council (2008) Surveying victims: Options for conducting the national crime victimization survey. Panel to review the programs of the bureau of justice statistics. In: Groves RM, Cork DL (eds). Committee on National Statistics and Committee on Law and Justice, Division of Behavioral and Social Sciences and Education. The National Academies Press, Washington, DC
Office of Applied Studies (2003) Results from the 2002 National Survey on Drug Use and Health: summary of national findings (DHHS Publication No. SMA 03-3836, Series H-22). Substance Abuse and Mental Health Services Administration, Rockville, MD
Pepper JV (2001) How do response problems affect survey measurement of trends in drug use? In: Manski CF, Pepper JV, Petrie C (eds) Informing America’s policy on illegal drugs: What we don’t know keeps hurting us. National Academy Press, Washington, DC, 321–348
Rand MR, Rennison CM (2002) True crime stories? Accounting for differences in our national crime indicators. Chance 15(1):47–51
Tourangeau R, McNeeley ME (2003) Measuring crime and crime victimization: methodological issues. In: Pepper JV, Petrie CV (eds) Measurement problems in criminal justice research: workshop summary. Committee on Law and Justice and Committee on National Statistics, Division of Behavioral and Social Sciences and Education. The National Academies Press, Washington, DC
U.S. Department of Justice (2004) The Nation’s two crime measures, NCJ 122705, http://www.ojp.usdoj.gov/bjs/abstract/ntmc.htm.
U.S. Department of Justice (2006) National Crime Victimization Survey: Criminal Victimization, 2005. Bureau of Justice Statistics Bulletin. http://www.ojp.usdoj.gov/bjs/pub/pdf/cv05.pdf
U.S. Department of Justice (2008) Crime in the United States, 2007. Federal Bureau of Investigation, Washington, DC. (table 1). http://www.fbi.gov/ucr/cius2007/data/table_01.html
Wansbeek T, Meijer E (2000) Measurement error and latent variables in econometrics. Elsevier, Amsterdam
Wooldridge JM (2002) Econometric analysis of cross section and panel data. MIT Press, Cambridge, MA
Acknowledgments
We thank Stephen Bruestle, Alex Piquero, and David Weisburd for their helpful comments. Pepper’s research was supported, in part, by the Bankard Fund for Political Economy.
Copyright information
© 2010 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Pepper, J., Petrie, C., Sullivan, S. (2010). Measurement Error in Criminal Justice Data. In: Piquero, A., Weisburd, D. (eds) Handbook of Quantitative Criminology. Springer, New York, NY. https://doi.org/10.1007/978-0-387-77650-7_18
DOI: https://doi.org/10.1007/978-0-387-77650-7_18
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-77649-1
Online ISBN: 978-0-387-77650-7
eBook Packages: Humanities, Social Sciences and Law; Social Sciences (R0)