Capture-Recapture for Casualty Estimation and Beyond: Recent Advances and Research Directions

Statistics in the Public Interest

Abstract

Perhaps the most basic quantitative question about the consequences of armed conflict is how many people were killed. During and after conflicts, it is common to attempt to create tallies of victims. However, destroyed infrastructure and institutions, danger to field workers, and victim communities' reasonable suspicion of data collection limit the results of these efforts to incomplete and non-representative lists. Capture-Recapture (CR) estimation, also known as Multiple Systems Estimation (MSE) in the context of human populations, is a family of methods for estimating the size of closed populations from matched incomplete samples. CR methods vary in detail and complexity, but they all ultimately rely on analyzing the patterns of inclusion of individuals across samples to estimate the probability of not being observed, and from it the number of unobserved individuals. In this discussion, we describe the versions of MSE with which analysts have estimated the total number of casualties in armed conflicts. We explore the advances of the last 15 years, and we describe outstanding statistical challenges.
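
To make the core idea concrete, the following is a minimal sketch of the simplest member of this family, the classical two-list (Lincoln-Petersen) estimator. It assumes that the two lists are independent, that every individual is equally likely to appear on a given list, and that records are matched without error. It is an illustration only and not the method developed in the chapter; the function names and all counts are hypothetical.

```python
def lincoln_petersen(n1: int, n2: int, m: int) -> float:
    """Estimate the total population size from two matched lists.

    n1, n2 -- number of individuals on list 1 and on list 2
    m      -- number of individuals found on both lists
    """
    if m <= 0:
        raise ValueError("estimator is undefined when the lists do not overlap")
    if m > min(n1, n2):
        raise ValueError("overlap cannot exceed the size of either list")
    # Under independence, m / n2 estimates the probability of being on list 1,
    # so the population size is estimated by n1 / (m / n2) = n1 * n2 / m.
    return n1 * n2 / m


def probability_unobserved(n1: int, n2: int, m: int) -> float:
    """Estimated probability that an individual is missed by both lists."""
    p1 = m / n2  # estimated inclusion probability for list 1
    p2 = m / n1  # estimated inclusion probability for list 2
    return (1 - p1) * (1 - p2)


def number_unobserved(n1: int, n2: int, m: int) -> float:
    """Estimated number of individuals that appear on neither list."""
    observed = n1 + n2 - m  # distinct individuals seen at least once
    return lincoln_petersen(n1, n2, m) - observed


if __name__ == "__main__":
    # Hypothetical counts: 300 records on list 1, 200 on list 2, 60 on both.
    n1, n2, m = 300, 200, 60
    print("estimated population size:", lincoln_petersen(n1, n2, m))
    print("estimated probability of being missed:", probability_unobserved(n1, n2, m))
    print("estimated number unobserved:", number_unobserved(n1, n2, m))
```

With more than two lists, or when inclusion probabilities differ across individuals, these assumptions generally fail and richer models are needed; the sketch only shows how the overlap pattern drives the estimate.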

Corresponding author

Correspondence to Daniel Manrique-Vallier.

Copyright information

© 2022 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Manrique-Vallier, D., Ball, P., Sadinle, M. (2022). Capture-Recapture for Casualty Estimation and Beyond: Recent Advances and Research Directions. In: Carriquiry, A.L., Tanur, J.M., Eddy, W.F. (eds) Statistics in the Public Interest. Springer Series in the Data Sciences. Springer, Cham. https://doi.org/10.1007/978-3-030-75460-0_2
