RNR Simulation Tool: A Synthetic Datasets and Its Uses for Policy Simulations

Simulation Strategies to Reduce Recidivism

Abstract

Evidence-based practice, as the name suggests, requires evidence to support policy and practice. Typically this evidence comes in the form of data—about choices, policy options, and outcomes. However, such data can be very hard to come by for most jurisdictions or agencies. How should they utilize the available evidence to best support sound decisions? This chapter describes the use of synthetic datasets for this purpose. Synthetic datasets have, at their core, theoretically possible attribute profiles. These profiles represent the support space of a population of interest. The profiles are weighted (or re-weighted) to reflect different aggregate properties. The properties may reflect such features as means, rates, variances, covariances, correlations, etc., of various attributes. In effect, once constructed, the synthetic dataset can be analyzed in much the same way as a real sample from the population of interest.
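The idea of a support space of weighted attribute profiles can be illustrated with a minimal sketch. The attribute names, categories, and the uniform starting weights below are hypothetical, chosen only for illustration; they are not the variables or weights used in the chapter.

```python
from itertools import product

# Hypothetical attributes and categories (assumptions, not the chapter's variables).
ATTRIBUTES = {
    "risk": ["low", "medium", "high"],
    "substance_use": [0, 1],
    "age_group": ["under_30", "30_plus"],
}


def build_profiles(attributes):
    """Enumerate every theoretically possible attribute profile (the support space)."""
    names = list(attributes)
    return [dict(zip(names, combo)) for combo in product(*attributes.values())]


def uniform_weights(profiles):
    """Assign equal weights that sum to 1, as a neutral starting point."""
    n = len(profiles)
    return [1.0 / n] * n


def weighted_rate(profiles, weights, predicate):
    """Weighted share of the synthetic population whose profile satisfies `predicate`."""
    total = sum(weights)
    return sum(w for p, w in zip(profiles, weights) if predicate(p)) / total


profiles = build_profiles(ATTRIBUTES)  # 3 * 2 * 2 = 12 profiles
weights = uniform_weights(profiles)
high_risk_rate = weighted_rate(profiles, weights, lambda p: p["risk"] == "high")
```

Once weights are attached, aggregate quantities such as rates or means are computed as weighted sums over the profiles, which is the sense in which the synthetic dataset can be analyzed like a real sample.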

Two aspects of synthetic datasets make them particularly appealing for policy simulations. First, the weights can be constructed to conform to disparate pieces of evidence. Evidence available from different sources can be combined and used to populate the synthetic dataset. Second, the synthetic dataset can be re-weighted to reflect jurisdiction-specific or localized attribute features. In other words, the synthetic datasets can be customized to reflect the characteristics of a local jurisdiction, thereby making it more useful for localized policy simulations. This chapter describes the methodology used in constructing and re-weighting synthetic datasets and demonstrates the procedure with real data from several jurisdictions. Finally, the chapter will describe how this synthetic data is used in the full web-based simulation tool.
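Re-weighting to match evidence from different sources can be sketched with iterative proportional fitting (raking), one common way to adjust weights until marginal shares match given targets. This is an assumed stand-in technique: the text above does not state which algorithm the chapter uses, and the profiles, target shares, and function names here are hypothetical.

```python
from itertools import product


def rake(profiles, weights, targets, iterations=50):
    """Rescale weights until each attribute's weighted shares match its targets.

    `targets` maps attribute -> {category: target share}. Assumes every category
    present in the profiles appears in the targets and has positive current weight.
    """
    w = list(weights)
    for _ in range(iterations):
        for attr, shares in targets.items():
            # Current weighted total of each category of this attribute.
            current = {category: 0.0 for category in shares}
            for p, wi in zip(profiles, w):
                current[p[attr]] += wi
            # Scale each profile so this attribute's shares hit the targets.
            w = [wi * shares[p[attr]] / current[p[attr]] for p, wi in zip(profiles, w)]
    return w


def share(profiles, weights, attr, category):
    """Weighted share of the population in `category` of `attr`."""
    return sum(wi for p, wi in zip(profiles, weights) if p[attr] == category) / sum(weights)


# Hypothetical two-attribute support space with equal base weights.
profiles = [{"risk": r, "substance_use": s} for r, s in product(["low", "high"], [0, 1])]
base = [0.25] * len(profiles)

# Hypothetical local evidence, e.g. each marginal taken from a different source.
local_targets = {
    "risk": {"low": 0.3, "high": 0.7},
    "substance_use": {0: 0.4, 1: 0.6},
}
local = rake(profiles, base, local_targets)
```

The same base profiles can be raked to a different set of targets for each jurisdiction, so the support space is built once and only the weights change. If the targets contradict each other, no set of weights can satisfy them all and the procedure will not settle on a solution.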

Notes

  1. In actual application, the number of requirements needs to be much smaller, because the more constraints we impose, the harder it is to find a set of weights that satisfies all of them simultaneously.

  2. The number in parentheses is the value given to the category in the dataset.

Author information

Corresponding author

Correspondence to Avinash Bhati, Ph.D.

Copyright information

© 2013 Springer Science+Business Media New York

About this chapter

Cite this chapter

Bhati, A., Crites, E.L., Taxman, F.S. (2013). RNR Simulation Tool: A Synthetic Datasets and Its Uses for Policy Simulations. In: Taxman, F., Pattavina, A. (eds) Simulation Strategies to Reduce Recidivism. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-6188-3_8
