Using Booking Data to Model Drug User Arrest Rates: A Preliminary to Estimating the Prevalence of Chronic Drug Use

  • Original Paper
  • Journal of Quantitative Criminology

Abstract

Public policy is often concerned with the size and characteristics of special populations that are difficult to reach in household surveys. Chronic drug users, who often live outside conventional households, provide the illustration motivating this paper. An alternative to household surveys is to question chronic drug users where they congregate—jails, treatment programs, and shelters, for example. Using such opportunistic data for prevalence estimation raises difficult problems for statistical inference: Study subjects who arrive at the collection points cannot be deemed a random sample of the general population. However, if we could estimate the rates at which chronic drug users arrive at the collection points, then we could use those estimates to weight the sample to represent the population. This paper presents a modified Poisson mixture model used to estimate the stochastic process that accounts for how chronic drug users get arrested. It uses that model to estimate arrest rates for 38 counties using up to sixteen quarters of data from the Arrestee Drug Abuse Monitoring survey.

Notes

  1. The term arrest appears interchangeably with the term booking in the text, but both terms mean that a person was detained in jail pending action by the police or courts. The reader should note that an arrest does not always result in an arrestee being booked into jail, because the arresting officer sometimes has authority to release a person after issuing a citation for that person’s appearance in court. Conversely, a booking does not always require an arrest, because probation and parole officers can bring probationers/parolees to jail in response to technical violations of the conditions of supervision. When the term is used in this paper, an arrest means being booked into a jail, meaning a person is fingerprinted and waits for a magistrate or other official to set bail. Typically, this process takes a few hours; sometimes it takes a few days. Because ADAM interviews occur after a person is booked, arrests that do not result in booking are invisible to the ADAM survey.

  2. Suppose a random variable ɛ were distributed as normal with mean 0 and standard deviation σ. Then exp(ɛ) would be distributed as lognormal with mean and variance described in the text.

  3. Simeone et al. (1997) conducted a survey of chronic drug users during 1995 at a random sample of Chicago booking facilities, treatment programs, and shelters. Their sample was drawn from entering cohorts. A roughly three-year average arrest rate was 0.054 arrests per month for the arrestee cohort, 0.016 arrests per month for the treatment cohort, and 0.026 arrests per month for the shelter cohort. The arrest rate for the arrestee cohort is 3.4 times as large as that for the treatment cohort and 2.1 times as large as that for the shelter cohort. This is consistent with the discussion in the text, which predicts that arrest rates for chronic drug users intercepted at a booking facility would be larger than those for chronic drug users in the general community, provided we treat chronic users arriving at treatment programs and shelters as representative of chronic users in the community.

  4. Systematic underreporting of drug use would affect the estimated average arrest rates. For example, suppose that chronic users, who have the highest arrest rates, are more likely to report their drug use than are less frequent users. Then chronic users would be overrepresented in the data, and arrest rates would be biased upward. Based on analysis of urine test results, chronic users appear more willing to report their drug use, so this bias is a possibility, which we mitigate by restricting our estimates to chronic drug users alone. On the other hand, the β parameter estimates should be unbiased (actually, consistent) unless underreporting is correlated with γ.

  5. The ADAM instrument records monthly drug use levels with a scale: 0 is no drug use during the month; 1 is one day per week; 2 is two to three days per week; and 3 is four or more days per week. We averaged self-reports based on this scale over the window period.

  6. The ADAM developers performed a validation exercise by checking self-reports (from the calendar) against booking records for 75 interviewees in New Orleans. We have reanalyzed their data. Over the entire calendar, interviewees reported 76 arrests and the booking records recorded 74 arrests. Regressing the self-reported arrest rates on the official records for the twelve months produced a constant near zero (0.01 with standard error of 0.39) and a slope near one (1.03 with standard error of 0.38). We take this as evidence that self-reports and booking records are consistent, although clearly self-reports and booking records frequently disagree about the specific month. Regressing the official arrest rates on calendar months, we find that the annualized booking rate increases by 0.05 arrests per month (standard error 0.028), suggesting that the Period Effect is more than an artifact of telescoping. However, when we perform the same regression using the self-reported arrests as the dependent variable, we find that the annualized booking rate increases by 0.13 arrests per month (standard error 0.031). A test of the difference in these two slopes is significant at p < 0.05, so we have to be concerned that telescoping may account for some of the estimated acceleration in the arrest rate. Regrettably, these samples are small, and they pertain only to New Orleans.
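The ratios quoted in note 3 follow directly from the three cohort rates reported there. A minimal Python check, using only the figures given in that note (the variable names are illustrative and not taken from the paper):

```python
# Monthly arrest rates reported in note 3 (Simeone et al. 1997), in arrests per month.
rates = {"arrestee": 0.054, "treatment": 0.016, "shelter": 0.026}

# Ratio of the arrestee-cohort rate to the rate in each comparison cohort.
ratio_vs_treatment = rates["arrestee"] / rates["treatment"]
ratio_vs_shelter = rates["arrestee"] / rates["shelter"]

# Rounded to one decimal place, as in the note.
print(round(ratio_vs_treatment, 1), round(ratio_vs_shelter, 1))
```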

References

  • Ahn C, Blumstein A, Schervish M (1990) Estimation of Arrest Careers using Hierarchical Stochastic Models. J Quantitative Criminol 6(2)

  • Blumstein A, Cohen J, Roth J, Visher C (1986) Criminal careers and career criminals, vol 1. National Academy Press, Washington, DC

  • Cameron A, Trivedi P (1998) Regression analysis of count data. Cambridge University Press, Cambridge United Kingdom

  • Cohen J (1992) Incapacitation Effects of Incarcerating Drug Offenders, Final Report submitted to the National Institute of Justice

  • Cohen J, Nagin D, Wallstrom G, Wasserman L (1998) Hierarchical Bayesian Analysis of Arrest Rates. J Am Stat Assoc 93(444):1260–1270

  • Cosslett S (1993) Estimation from endogenously stratified samples. In: Maddala G, Rao C, Vinod H (eds) Handbook of statistics, vol 11. Elsevier Science Publishing

  • Englin J, Shonkwiler J (1995) Estimating social welfare using count data models: an application to long-run recreational demand under conditions of endogenous stratification and truncation. The Rev Econ Stat 104–112

  • Fendrich M, Johnson T, Sudman S, Wislar J, Spiehler V (1999) Validity of Drug Use Reporting in a High-Risk Community Sample: A Comparison of Cocaine and Heroin Survey Reports with Hair Tests. Am J Epidemiol 149(10):955–62

  • Greene W (2003) Econometric analysis, 5th edn. Prentice Hall, Upper Saddle River, NJ

  • Hammett T, Harmon P, Rhodes W (2002) The Burden of Infectious Disease Among Inmates of and Releasees from US Correctional Facilities. Am J Public Health 92:1789–1794

  • Harrell K, Kapsak I, Caisson, Wirtz P (1986) The validity of self-reported drug use data: the accuracy of responses on confidential self-administered answer sheets. Paper prepared for the National Institute on Drug Abuse, Contract Number 271-85-8305

  • Hser Y (1993) Population Estimates of Illicit Drug Users in Los Angeles County. J Drug Issues 23(2):323–334

  • Hser Y, Anglin D, Wickens T, Brecht M, Homer J (1992) Techniques for the estimation of illicit drug-use prevalence. National Institute of Justice Research Report, NCJ 133786

  • Hunt D, Rhodes W (2001) Methodology Guide for ADAM. Report prepared by Abt Associates Inc. for the National Institute of Justice

  • Kalbfleisch J, Prentice R (1980) The statistical analysis of failure time data. Wiley Series in Probability and Mathematical Statistics, New York NY

  • Kish L (1995) Survey sampling. Wiley, New York

  • Lancaster T (1990) The econometric analysis of transition data. Cambridge University Press, Cambridge United Kingdom

  • Land K, McCall P, Nagin D (1996) A comparison of poisson, negative binomial, and semiparametric mixed poisson regression models with empirical applications to criminal careers data. Social Res Methods 24:387–440

  • Land K, Nagin D (1996) Micro-models of criminal careers: a synthesis of the criminal career and life course approaches via semiparametric mixed poisson models with empirical applications. J Quantitative Criminol 12:163–191

  • Lohr S (1999) Sampling: design and analysis. Duxbury Press

  • Maltz M (1996) From poisson to the present: applying operations research to problems of crime and justice. J Quantitative Criminol 12(1):3–61

  • Manski C, Pepper J, Petrie C (2001) Informing America’s policy on illegal drugs: what we don’t know keeps hurting us. National Academy Press, Washington, DC

  • Manski C (1995) Identification problems in the social sciences. Harvard University Press, Cambridge MA

  • McCulloch C, Searle S (2001) Generalized, linear, and mixed models. Wiley Series in Probability and Statistics, New York NY

  • Nagin D, Land K (1993) Age, criminal careers and population heterogeneity: specification and estimation of a nonparametric, mixed poisson model. Criminology 31:327–362

  • Rhodes et al (2005) What America’s users spend on illegal drugs: 1988–2003. Report submitted to the Office of National Drug Control Policy (2004). Publication by ONDCP in preparation

  • Rhodes W, Hyatt R, Scheiman P (1996) Predicting pretrial misconduct with drug tests of arrestees: evidence from eight settings. J Quantitative Criminol 12(3):315–348

  • Rhodes W (1993) Synthetic estimation applied to the prevalence of drug use. The J Drug Issues 23(2):297–321

  • Schafer J (1997) Analysis of incomplete multivariate data. Monographs on Statistics and Applied Probability, 72. Chapman & Hall/CRC, Boca Raton, Florida

  • Simeone R, Rhodes W, Hunt D, Truitt L (1997) A plan for estimating the number of chronic drug users in the United States. Report submitted to the Office of National Drug Control Policy by Abt Associates Inc.

  • Spelman W (1994) Criminal incapacitation. Plenum Press, New York

  • Wish E, Cuadrado M, Martorana J (1987) Drug abuse as a predictor of pretrial failure-to-appear in arrestees in Manhattan. Unpublished paper prepared under Grant 83-IJ-CX-K048 to Narcotic and Drug Research, Inc.

  • Wright D, Gfroerer J, Epstein J (1997) The use of external data sources and ratio estimation to improve estimates of hardcore drug use from the NHSDA. In: Harrison L, Hughes A (eds) The validity of self-reported drug use: improving the accuracy of survey estimates, NIDA Monograph 167, pp 477–497

Acknowledgments

Research was done under contract to the National Institute of Justice and under a McGillis grant from Abt Associates, Inc. The authors thank anonymous reviewers as well as Stephen Kennedy and Dana Hunt, both of Abt Associates, for their helpful comments on earlier drafts.

Author information

Correspondence to William Rhodes.

Appendix: The Distribution of γ

Equation (7) gives us the distribution of γ for “arrested” chronic drug users (CDUs). The mean arrest rate for arrested CDUs will be larger than the mean arrest rate for CDUs in the county. To see this, follow (7A) through (7D) to compute the mean arrest rate for arrested CDUs as:

$$E\left(\gamma\vert {\rm arrest}\right)=\int\limits_{\gamma=0}^{\infty} \gamma g\left(\gamma\vert {\rm arrest}\right)d\gamma $$
(7A)

Substituting (7) from the main text into (7A) gives:

$$ E\left(\gamma\vert {\rm arrest}\right)=\frac{\int\limits_{\gamma=0}^{\infty} \gamma^{2}f\left(\gamma\right)d\gamma}{e^{\left(1/2\right)\sigma^{2}}} =\frac{\int\limits_{\gamma=0}^{\infty}{\gamma^{2}f \left(\gamma\right)d\gamma}}{E\left(\gamma\right)} $$
(7B)

Because γ is distributed as lognormal in the general population, its variance can be written:

$$ {\rm VAR}\left(\gamma\right)=e^{\sigma^{2}}\left(e^{\sigma^{2}}-1\right) =\int\limits_{\gamma=0}^{\infty}\gamma^{2}f\left(\gamma\right)d\gamma -E\left(\gamma\right)^{2} $$
(7C)

The first part of the equality is definitional: this is the variance of a lognormal random variable whose underlying normal has mean 0 and standard deviation σ. The second part of the equality comes from writing the variance of a variable Y as

$$ E\left[\left(Y-\bar{Y}\right)^{2}\right]=E\left[Y^{2}\right] -2E\left[Y\right]\bar{Y}+\bar{Y}^{2}=E\left[Y^{2}\right]-\bar{Y}^{2}. $$

Solving (7C) for \(\int\gamma^{2}f(\gamma)d\gamma\), and substituting the results into (7B), yields the first term on the right of the equality sign in (7D). Because γ is lognormal, we substitute \(E\left[\gamma\right]=e^{\left(1/2\right)\sigma^{2}}\) into the numerator and denominator of (7D) to yield the second term on the right of the equality. The final simplification follows from some algebraic manipulation, which recognizes that \(e^{x}e^{y}=e^{x+y}\).

$$ E\left(\gamma\vert {\rm arrest}\right)=\frac{e^{\sigma^{2}} \left(e^{\sigma^{2}}-1\right)+E\left(\gamma\right)^{2}} {E\left(\gamma\right)}=\frac{e^{\sigma^{2}}\left(e^{\sigma^{2}}-1\right) +e^{\sigma^{2}}}{e^{\left(1/2\right)\sigma^{2}}} =e^{\left(3/2\right)\sigma^{2}} $$
(7D)
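The closed form in (7D) can be checked numerically. The sketch below is not the authors' code. It assumes, as in the appendix, that γ = exp(ɛ) with ɛ normal with mean 0 and standard deviation σ, and it evaluates the moments in (7B) by a simple trapezoid rule. The function names and the value σ = 0.8 are illustrative choices, not values from the paper.

```python
import math

def lognormal_moment(k, sigma, n=20001):
    """E[gamma**k] for gamma = exp(eps), eps ~ Normal(0, sigma**2), by the trapezoid rule."""
    lo = -12.0 * sigma                      # far into the lower tail
    hi = 12.0 * sigma + k * sigma ** 2      # integrand peaks near eps = k * sigma**2
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        eps = lo + i * h
        density = math.exp(-0.5 * (eps / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * math.exp(k * eps) * density
    return total * h

def mean_rate_population(sigma):
    """E(gamma) in the general population of CDUs."""
    return lognormal_moment(1, sigma)

def mean_rate_arrested(sigma):
    """E(gamma | arrest) as the ratio of moments in (7B)."""
    return lognormal_moment(2, sigma) / lognormal_moment(1, sigma)

sigma = 0.8  # illustrative value only
print(mean_rate_population(sigma), math.exp(0.5 * sigma ** 2))
print(mean_rate_arrested(sigma), math.exp(1.5 * sigma ** 2))
```

Under these assumptions each printed pair should agree to many decimal places, and the ratio of the arrested mean to the population mean equals exp(σ²), which exceeds one whenever σ > 0.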

Figure 3, a stylized version of drug user arrest rates, illustrates these ideas. The left panel shows, for small γ, f(γ) as the higher curve and g(γ|arrest) as the lower curve. Inspection of the curves shows that the expected value of γ is higher for a sample of arrested CDUs than it is for the population of CDUs in the county. The right panel shows, for small N, P(N|X) as the higher curve (at N = 0) and P(N|X;arrest) as the lower curve. As would be expected, the probability of zero arrests during the window period is lower for arrested CDUs than it is for the population of CDUs. For a random sample drawn from the community, the observable data would correspond to f(γ) and P(N|X), but for the sample of arrestees, the observable data conform to g(γ|arrest) and P(N|X;arrest). Because the arrestee sample is used to estimate β and σ, the likelihood must be based on these conditional distributions.

Fig. 3 Distribution of γ and arrest rates: (a) f(γ) and g(γ|arrest); (b) P(N|X) and P(N|X;arrest). Notes: In panel a, the higher curve (at low values of the arrest rate γ) represents the distribution of γ for CDUs in the county, and the lower curve represents the distribution of γ for CDUs who are arrested. In panel b, the higher curve (at low values of the number of arrests N) represents the distribution of arrests for CDUs in the county, and the lower curve represents the distribution for CDUs who are arrested.
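The claim that zero arrests are less likely among arrested CDUs can be illustrated with a small calculation. This is a sketch under simplifying assumptions that are not taken from the paper: covariates X are ignored, the window has unit length, N given γ is Poisson with mean γ, and g(γ|arrest) = γ f(γ)/E(γ), the form implied by (7A) and (7B). The helper `expect` and the value σ = 0.8 are hypothetical.

```python
import math

def expect(func, sigma, n=20001, width=12.0):
    """E[func(gamma)] for gamma = exp(eps), eps ~ Normal(0, sigma**2), by the trapezoid rule."""
    lo, hi = -width * sigma, width * sigma
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        eps = lo + i * h
        density = math.exp(-0.5 * (eps / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * func(math.exp(eps)) * density
    return total * h

sigma = 0.8  # illustrative value only

# Mean arrest rate in the population, E(gamma).
mean_gamma = expect(lambda g: g, sigma)

# P(N = 0) when gamma is drawn from f: average of the Poisson zero probability exp(-gamma).
p0_population = expect(lambda g: math.exp(-g), sigma)

# P(N = 0) when gamma is drawn from g(gamma | arrest) = gamma * f(gamma) / E(gamma).
p0_arrested = expect(lambda g: g * math.exp(-g), sigma) / mean_gamma

print(p0_population, p0_arrested)
```

Because the arrested sample over-represents high-γ users, and exp(−γ) falls as γ rises, the second probability is the smaller of the two, which matches the ordering of the curves at N = 0 in panel b.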

About this article

Cite this article

Rhodes, W., Kling, R. & Johnston, P. Using Booking Data to Model Drug User Arrest Rates: A Preliminary to Estimating the Prevalence of Chronic Drug Use. J Quant Criminol 23, 1–22 (2007). https://doi.org/10.1007/s10940-006-9016-9

  • DOI: https://doi.org/10.1007/s10940-006-9016-9
