Measurement Error in Criminal Justice Data

Pepper, John; Petrie, Carol; Sullivan, Sean

doi:10.1007/978-0-387-77650-7_18

Chapter
First Online: 01 January 2009

Abstract

While accurate data are critical in understanding crime and assessing criminal justice policy, data on crime and illicit activities are invariably measured with error. In this chapter, we illustrate and evaluate several examples of measurement error in criminal justice data. Errors are evidently pervasive, systematic, frequently related to behaviors and policies of interest, and unlikely to conform to convenient textbook assumptions. Using both convolution and mixing models of the measurement error generating process, we demonstrate the effects of data error on identification and statistical inference. Even small amounts of data error can have considerable consequences. Throughout this chapter, we emphasize the value of auxiliary data and reasonable assumptions in achieving informative inferences, but caution against reliance on strong and untenable assumptions about the error generating process.

Notes

1.
An extensive literature attempts to document the validity and reliability of criminal justice data (Lynch and Addington 2007; Mosher et al. 2002). In this chapter, we make no attempt to fully summarize this literature and offer no specific suggestions for modifying surveys to improve the quality of collected data.
2.
To accommodate proxy errors, this model has often been generalized by including a factor of proportionality linking the observed and true variables. For example, \(y = \delta y^{\ast} + \nu\), where \(\delta\) is an unknown parameter. For brevity, we focus on the pure measurement model without an unknown factor of proportionality. Including a scaling factor of unknown magnitude or sign induces obvious complications beyond those discussed here. For additional details, see Wooldridge (2002, 63–67) and Bound et al. (2001, 3715–3716).
3.
The interested reader should consult more detailed presentations in Wooldridge (2002), Wansbeek and Meijer (2000) and Bound et al. (2001).
4.
When the available measurement of \(y^{\ast}\) is a proxy variable of the form \(y = \delta y^{\ast} + \nu\), the probability limit of \(\hat{\beta}_{y,x^{\ast}}\) is \(\delta\beta\). If \(\delta\) is known in sign but not magnitude, then the sign, but not scale, of \(\beta\) is identified.
5.
Although a full treatment of the effects of measurement error in multivariate regression is beyond the scope of this chapter, several general results are worth mentioning. First, measurement error in any one regressor will usually affect the consistency of all other parameter estimators. Second, when only a single regressor is measured with classical error, the OLS estimator of the coefficient associated with the error-ridden variable suffers attenuation bias in the standard sense (see, for example, Wooldridge 2002, 75). In general, all other OLS parameters are also inconsistent, and the direction of inconsistency can be asymptotically signed by the available data. Finally, with measurement error in multiple regressors, classical assumptions imply that the probability limit of the OLS parameter vector is usually attenuated in an average sense, but there are important exceptions (Wansbeek and Meijer 2000, 17–20).
6.
A number of possible strategies are available, and the interested reader should consult the discussions in Wansbeek and Meijer (2000) and Bound et al. (2001).
7.
Of the required conditions for using a second measurement as an instrumental variable, the assumption that the two errors are uncorrelated, \(\sigma_{\eta,\mu} = 0\), may be the most difficult to satisfy in practice. Even if both errors are classical in other regards, errors in different measurements of the same variable may be expected to correlate so that \(\sigma_{\eta,\mu} > 0\). When the covariance between \(\mu\) and \(\eta\) is nonzero, the IV slope estimator is no longer a consistent point estimator of \(\beta\), though it may still provide an informative bound on \(\beta\) under certain circumstances (see, for example, Bound et al. 2001, 3730; Black et al. 2000).
8.
Klepper and Leamer (1984) suggest a similar strategy for the case where multiple regressors are measured with error. The general approach is described by Bound et al. (2001, 3722–3723).
9.
Similar logic suggests violation for any discrete variable, and any continuous but bounded variable.
10.
Failure of assumption A2 affects inference regarding α, but not β (see, for example, Illustration 1).
11.
Bollinger (1996) and Frazis and Loewenstein (2003) derive bounds when a binary regressor is measured with error.
12.
With panel data, assumptions A1–A5 must account for correlations in both the cross-section and the time series dimension. For detailed examples, see Wooldridge (2002) and Griliches and Hausman (1985).
13.
Note that the variance of \(\Delta x_{i}^{\ast}\) is smaller when \(x_{i}^{\ast}\) has positive autocorrelation: \(\sigma_{\Delta x^{\ast}}^{2} = 2\sigma_{x^{\ast}}^{2}(1 - \rho_{x_{2}^{\ast},x_{1}^{\ast}})\). To see why the relative strength of serial correlation is a concern, suppose that \(\rho_{x_{2}^{\ast},x_{1}^{\ast}} > 1/2\) while random measurement errors exhibit no autocorrelation. This implies \(\sigma_{\Delta x^{\ast}}^{2} < \sigma_{x^{\ast}}^{2}\) while \(\sigma_{\Delta\mu}^{2} = 2\sigma_{\mu}^{2}\), so attenuation bias will be greater after first differencing the data.
14.
As discussed in the previous section, when a variable with bounded support is imperfectly classified, it is widely recognized that the classical errors-in-variables assumption of independence between the measurement error and the true variable cannot hold. Molinari (2008) presents an alternative and useful conceptualization of the data error problem for discrete outcome variables. This “direct misclassification” approach allows one to focus on assumptions related to classification error rates instead of restrictions on the mixing process.
15.
This type of restriction is used in the literatures on robust statistics (Huber 1981) and data errors with binary regressors (see, e.g., Bollinger 1996 and Frazis and Loewenstein 2003).
16.
In this discussion, we are concerned with drawing inferences on the true rate of aggravated assault reported to the police. Inferences regarding the overall rate of aggravated assault – known and unknown to the police – are complicated by the proxy variables problem discussed previously.
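
The variance comparison in note 13 can be checked numerically. The sketch below is an illustration rather than anything taken from the chapter: it assumes classical measurement error in a single regressor, equal variance of the true regressor in both periods, and serially uncorrelated errors, so that the OLS slope converges to β times the ratio of true-regressor variance to observed-regressor variance. The function names and the numeric values are hypothetical.

```python
def reliability_levels(var_x_true, var_error):
    """Attenuation factor for OLS in levels under classical error:
    Var(x*) / (Var(x*) + Var(mu))."""
    return var_x_true / (var_x_true + var_error)


def reliability_first_diff(var_x_true, var_error, rho):
    """Attenuation factor after first differencing two periods.

    Uses the expressions in note 13:
      Var(delta x*) = 2 * Var(x*) * (1 - rho)
      Var(delta mu) = 2 * Var(mu)   (errors assumed serially uncorrelated)
    """
    var_dx = 2.0 * var_x_true * (1.0 - rho)
    var_dmu = 2.0 * var_error
    return var_dx / (var_dx + var_dmu)


if __name__ == "__main__":
    # Illustrative values only: Var(x*) = 1, Var(mu) = 0.25, rho = 0.8.
    levels = reliability_levels(1.0, 0.25)
    diffs = reliability_first_diff(1.0, 0.25, 0.8)
    print(f"levels: {levels:.3f}, first differences: {diffs:.3f}")
```

With these illustrative inputs the factor falls from 0.8 in levels to roughly 0.44 in first differences, which is the sense in which differencing can worsen attenuation when the true regressor is strongly autocorrelated and the errors are not.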

References

Anglin MD, Caulkins JP, Hser Y (1993) Prevalence estimation: policy needs, current status, and future potential. J Drug Issues 23(2):345–360
Azrael D, Cook PJ, Miller M (2001) State and local prevalence of firearms ownership: measurement structure and trends, National Bureau of Economic Research: Working Paper 8570
Bennett T, Holloway K, Farrington D (2008) The statistical association between drug misuse and crime: a meta-analysis. Aggress Violent Behav 13:107–118
Black D (1970) Production of crime rates. Am Sociol Rev 35:733–48
Black DA, Berger MC, Scott FA (2000) Bounding parameter estimates with nonclassical measurement error. J Am Stat Assoc 95(451):739–748
Blumstein A, Rosenfeld R (2009) Factors contributing to U.S. crime trends. In: Goldberger AS, Rosenfeld R (eds) Understanding crime trends: workshop report. Committee on Understanding Crime Trends, Committee on Law and Justice, Division of Behavioral and Social Sciences and Education. The National Academies Press, Washington, DC
Bollinger C (1996) Bounding mean regressions when a binary variable is mismeasured. J Econom 73(2):387–99
Bound J, Brown C, Mathiowetz N (2001) Measurement error in survey data. In: Heckman J, Leamer E (eds) Handbook of econometrics, 5, Ch. 59:3705–3843
Chaiken JM, Chaiken MR (1990) Drugs and predatory crime. Crime Justice: Drugs Crime, 13:203–239
Dominitz J, Sherman R (2004) Sharp bounds under contaminated or corrupted sampling with verification, with an application to environmental pollutant data. J Agric Biol Environ Stat 9(3):319–338
Frazis H, Loewenstein M (2003) Estimating linear regressions with mismeasured, possibly endogenous, binary explanatory variables. J Econom 117:151–178
Frisch R (1934) Statistical confluence analysis by means of complete regression systems. University Institute for Economics, Oslo
Griliches Z, Hausman JA (1985) Errors in variables in panel data: a note with an example. J Econom 31(1):93–118
Harrison LD (1995) The validity of self-reported data on drug use. J Drug Issues 25(1):91–111
Harrison L, Hughes A (1997) Introduction – the validity of self-reported drug use: improving the accuracy of survey estimates. In: Harrison L and Hughes A (eds) The validity of self-reported drug use: improving the accuracy of survey estimates. NIDA Research Monograph, vol 167. US Department of Health and Human Services, Rockville, MD, pp 1–16
Horowitz J, Manski C (1995) Identification and robustness with contaminated and corrupted data. Econometrica, 63(2):281–302
Hotz J, Mullins C, Sanders S (1997) Bounding causal effects using data from a contaminated natural experiment: analyzing the effects of teenage childbearing. Rev Econ Stud 64(4):575–603
Huber P (1981) Robust statistics. Wiley, New York
Johnston LD, O’Malley PM, Bachman JG (1998) National survey results on drug use from the monitoring the future study, 1975–1997, Volume I: Secondary school students. NIH Publication No. 98-4345. National Institute on Drug Abuse, Rockville, MD
Kleck G, Gertz M (1995) Armed resistance to crime: The prevalence and nature of self-defense with a gun. J Crim Law Criminol 86:150–187
Klepper S, Leamer EE (1984) Consistent sets of estimates for regressions with errors in all variables. Econometrica 52(1):163–183
Koss M (1993) Detecting the scope of rape: A review of prevalence research methods. J Interpers Violence 8:198–222
Koss M (1996) The measurement of rape victimization in crime surveys. Crim Justice Behav 23:55–69
Kreider B, Hill S (2009) Partially identifying treatment effects with an application to covering the uninsured. J Hum Resour 44(2):409–449
Kreider B, Pepper J (2007) Disability and employment: reevaluating the evidence in light of reporting errors. J Am Stat Assoc 102(478):432–441
Kreider B, Pepper J (2008) Inferring disability status from corrupt data. J Appl Econom 23(3):329–349
Kreider B, Pepper J (forthcoming) Identification of expected outcomes in a data error mixing model with multiplicative mean independence. J Business Econ Stat
Kreider B, Pepper J, Gundersen C, Jolliffe D (2009) Identifying the effects of food stamps on children’s health outcomes when participation is endogenous and misreported. Working Paper
Lambert D, Tierney L (1997) Nonparametric maximum likelihood estimation from samples with irrelevant data and verification bias. J Am Stat Assoc 92:937–944
Lynch JP, Addington LA (eds) (2007) Understanding crime statistics: revisiting the divergence of the NCVS and UCR. Cambridge University Press, Cambridge
Lynch J, Jarvis J (2008) Missing data and imputation in the uniform crime reports and the effects on national estimates. J Contemp Crim Justice 24(1):69–85. doi:10.1177/1043986207313028
Maltz M (1999) Bridging gaps in police crime data: a discussion paper from the BJS Fellows Program. Bureau of Justice Statistics, Government Printing Office, Washington, DC
Maltz MD, Targonski J (2002) A note on the use of county-level UCR data. J Quant Criminol 18:297–318
Manski CF (2007) Identification for prediction and decisions. Harvard University Press, Cambridge, MA
Manski CF, Newman J, Pepper JV (2002) Using performance standards to evaluate social programs with incomplete outcome data: general issues and application to a higher education block grant program. 26(4), 355–381
McDowall D, Loftin C (2007) What is convergence, and what do we know about it? In Lynch J, Addington LA (eds) Understanding crime statistics: revisiting the divergence of the NCVS and UCR, Ch. 4: 93–124
McDowall D, Loftin C, Wiersema B (1998) Estimates of the frequency of firearm self-defense from the redesigned National Crime Victimization Survey. Violence Research Group Discussion Paper 20
Molinari F (2008) Partial identification of probability distributions with misclassified data. J Econom 144(1):81–117
Mosher CJ, Miethe TD, Phillips DM (2002) The mismeasure of crime. Sage Publications, Thousand Oaks, CA
Mullin CH (2005) Identification and estimation with contaminated data: when do covariate data sharpen inference? J Econom 130:253–272
National Research Council (2001) Informing America’s policy on illegal drugs: what we don’t know keeps hurting us. Committee on Data and Research for Policy on Illegal Drugs. In: Manski CF, Pepper JV, Petrie CV (eds) Committee on Law and Justice and Committee on National Statistics. Commission on Behavioral and Social Sciences and Education. National Academy Press, Washington, DC
National Research Council (2003) Measurement problems in criminal justice research: workshop summary. Pepper JV, Petrie CV. Committee on Law and Justice and Committee on National Statistics, Division of Behavioral and Social Sciences and Education. The National Academies Press, Washington, DC
National Research Council (2005) Firearms and violence: a critical review. Committee to Improve Research Information and Data on Firearms. In: Wellford CF, Pepper JV, Petrie CV (eds) Committee on Law and Justice, Division of Behavioral and Social Sciences and Education. The National Academies Press, Washington, DC
National Research Council (2008) Surveying victims: Options for conducting the national crime victimization survey. Panel to review the programs of the bureau of justice statistics. In: Groves RM, Cork DL (eds). Committee on National Statistics and Committee on Law and Justice, Division of Behavioral and Social Sciences and Education. The National Academies Press, Washington, DC
Office of Applied Studies (2003) Results from the 2002 National Survey on Drug Use and Health: summary of national findings (DHHS Publication No. SMA 03-3836, Series H-22). Substance Abuse and Mental Health Services Administration, Rockville, MD
Pepper JV (2001) How do response problems affect survey measurement of trends in drug use? In: Manski CF, Pepper JV, Petrie C (eds) Informing America’s policy on illegal drugs: What we don’t know keeps hurting us. National Academy Press, Washington, DC, 321–348
Rand MR, Rennison CM (2002) True crime stories? Accounting for differences in our national crime indicators. Chance 15(1):47–51
Tourangeau R, McNeeley ME (2003) Measuring crime and crime victimization: methodological issues. In: Pepper JV, Petrie CV (eds) Measurement problems in criminal justice research: workshop summary. Committee on Law and Justice and Committee on National Statistics, Division of Behavioral and Social Sciences and Education. The National Academies Press, Washington, DC
U.S. Department of Justice (2004) The Nation’s two crime measures, NCJ 122705, http://www.ojp.usdoj.gov/bjs/abstract/ntmc.htm.
U.S. Department of Justice (2006) National Crime Victimization Survey: Criminal Victimization, 2005. Bureau of Justice Statistics Bulletin. http://www.ojp.usdoj.gov/bjs/pub/pdf/cv05.pdf
U.S. Department of Justice (2008) Crime in the United States, 2007. Federal Bureau of Investigation, Washington, DC. (table 1). http://www.fbi.gov/ucr/cius2007/data/table_01.html
Wansbeek T, Meijer E (2000) Measurement error and latent variables in econometrics. Elsevier, Amsterdam
Wooldridge JM (2002) Econometric analysis of cross section and panel data. MIT Press, Cambridge, MA

Acknowledgments

We thank Stephen Bruestle, Alex Piquero, and David Weisburd for their helpful comments. Pepper’s research was supported, in part, by the Bankard Fund for Political Economy.

Author information

Authors and Affiliations

Department of Economics, University of Virginia, Charlottesville, VA, USA
John Pepper & Sean Sullivan
Committee on Law and Justice, National Research Council, Washington, DC, USA
Carol Petrie

Editor information

Editors and Affiliations

College of Criminology, Florida State University, West Call Street 643, Tallahassee, 32306, U.S.A.
Alex R. Piquero
Inst. Criminology, Hebrew University of Jerusalem, Jerusalem, 91905, Israel
David Weisburd

About this chapter

Cite this chapter

Pepper, J., Petrie, C., Sullivan, S. (2010). Measurement Error in Criminal Justice Data. In: Piquero, A., Weisburd, D. (eds) Handbook of Quantitative Criminology. Springer, New York, NY. https://doi.org/10.1007/978-0-387-77650-7_18

DOI: https://doi.org/10.1007/978-0-387-77650-7_18
Published: 03 December 2009
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-77649-1
Online ISBN: 978-0-387-77650-7
eBook Packages: Humanities, Social Sciences and Law; Social Sciences (R0)
