
Improving Wildlife Population Inference Using Aerial Imagery and Entity Resolution

Journal of Agricultural, Biological and Environmental Statistics


Recent technological advancements have driven rapid growth in the use of imagery data to estimate the abundance and spatial distribution of animal populations. However, the value of imagery data may not be fully exploited under traditional analytical frameworks. We developed a method that leverages aerial imagery data for population modeling through entity resolution, a technique that stochastically links the same individual across multiple images. Resolving duplicate individuals in overlapping, distorted images requires optimally realigning the observed point patterns; however, popular machine learning algorithms for image stitching often do not account for alignment uncertainty. Moreover, duplicated individuals can provide insight about detection probability when overlaps are viewed as replicate surveys. Our model resolves individual identities by linking observed locations to latent activity centers and estimates total population as informed by the linkage structure. We developed a hierarchical framework to achieve entity resolution and abundance estimation cohesively, thereby avoiding the single-direction error propagation that is common in two-stage models. We illustrate our method through simulation and a case study using aerial images of sea otters in Glacier Bay, Alaska. Supplementary materials accompanying this paper appear online.
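The idea that overlapping images act as replicate surveys can be illustrated with a toy calculation. The sketch below is not the hierarchical model described above: it assumes identities in the overlap are already known (perfect linkage, no alignment uncertainty) and swaps in the classical Chapman form of the Lincoln-Petersen estimator. The function names, the population size of 200, and the detection probability of 0.7 are hypothetical values chosen only for illustration.

```python
import random


def simulate_overlap(n_individuals=200, p_detect=0.7, seed=1):
    """Toy simulation: every individual lies in the overlap of two images
    and is detected independently in each image with probability p_detect."""
    rng = random.Random(seed)
    n1 = n2 = both = 0
    for _ in range(n_individuals):
        d1 = rng.random() < p_detect  # detected in image 1?
        d2 = rng.random() < p_detect  # detected in image 2?
        n1 += d1
        n2 += d2
        both += d1 and d2             # duplicate: seen in both images
    return n1, n2, both


def chapman_estimate(n1, n2, both):
    """Chapman's bias-corrected Lincoln-Petersen abundance estimator."""
    return (n1 + 1) * (n2 + 1) / (both + 1) - 1


n1, n2, both = simulate_overlap()
# Share of image-2 detections that were also seen in image 1; under the
# independence assumption this estimates the per-image detection probability.
p_hat = both / n2 if n2 else float("nan")
n_hat = chapman_estimate(n1, n2, both)
print(f"n1={n1}, n2={n2}, duplicates={both}, p_hat={p_hat:.2f}, N_hat={n_hat:.1f}")
```

In this simplified setting the duplicates carry all the information about detection. When linkage itself is uncertain, as in distorted aerial images, a two-stage approach of this kind would pass linkage errors directly into the abundance estimate, which is the error propagation the joint framework is designed to avoid.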





This research was funded by NSF DMS 1614392 and NPS P16AC01524 and P19AC00063. Research and monitoring were conducted under US Fish & Wildlife Service Scientific Research Permit #MA14762C-0 and NPS Scientific Research Permit GLBA-2016-SCI-022. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the US Government. We appreciate the assistance of Dennis Lozier and Louise Taylor with sea otter surveys and image processing.

Author information

Corresponding author

Correspondence to Xinyi Lu.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 772 KB)

Appendix A: Prior Distributions

$$\begin{aligned}
p_0 &\sim \text{Beta}(3, 1),\\
\psi &\sim \text{Beta}(0.001, 1),\\
\sigma^2_u &\sim \text{IG}(100, 25),\\
\boldsymbol{s}_m &\sim \text{Unif}(\mathcal{D}),\quad m = 1, \dots, M,\\
\boldsymbol{\alpha} &\sim \text{N}(\boldsymbol{0}, 0.001\boldsymbol{R} + 0.01\boldsymbol{I}),\\
\boldsymbol{\beta} &\sim \text{N}(\boldsymbol{0}, 0.001\boldsymbol{R} + 0.01\boldsymbol{I}),
\end{aligned}$$

where \(\boldsymbol{R} = (\boldsymbol{D}_2^{-})^{\prime}\boldsymbol{D}_2\) and \(\boldsymbol{D}_2 = \begin{bmatrix} 1 & -2 & 1 & 0 & 0\\ 0 & 1 & -2 & 1 & 0\\ 0 & 0 & 1 & -2 & 1 \end{bmatrix}\).
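To make the scalar priors and the role of the second-order difference matrix concrete, here is a minimal sketch using only the Python standard library. It rests on assumptions not stated in the appendix: IG(100, 25) is read as shape 100 and scale 25 (the reciprocal of a Gamma variable with shape 100 and rate 25), which may differ from the parameterization the authors use. The function names are hypothetical, and the covariance built from R is not reconstructed here.

```python
import random

# Second-order difference matrix D_2 from Appendix A (3 rows, 5 columns).
D2 = [
    [1, -2, 1, 0, 0],
    [0, 1, -2, 1, 0],
    [0, 0, 1, -2, 1],
]


def second_differences(x):
    """Apply D_2 to a length-5 vector: row i gives x[i] - 2*x[i+1] + x[i+2]."""
    return [sum(d * v for d, v in zip(row, x)) for row in D2]


def draw_scalar_priors(rng):
    """Draw (p_0, psi, sigma^2_u) once from the priors listed in Appendix A."""
    p0 = rng.betavariate(3, 1)        # prior mass concentrated near 1
    psi = rng.betavariate(0.001, 1)   # prior mass concentrated near 0
    # ASSUMPTION: IG(100, 25) means shape = 100, scale = 25, sampled as the
    # reciprocal of Gamma(shape = 100, scale = 1/25).
    sigma2_u = 1.0 / rng.gammavariate(100, 1 / 25)
    return p0, psi, sigma2_u


rng = random.Random(2022)
p0, psi, sigma2_u = draw_scalar_priors(rng)
print(p0, psi, sigma2_u)

# D_2 maps any linear sequence to zero, so it measures departure from linearity.
print(second_differences([1, 2, 3, 4, 5]))    # linear input
print(second_differences([0, 1, 4, 9, 16]))   # quadratic input
```

Because \(\boldsymbol{D}_2\) has five columns, this sketch treats \(\boldsymbol{\alpha}\) and \(\boldsymbol{\beta}\) as length-5 coefficient vectors, which is what the dimensions of \(0.001\boldsymbol{R} + 0.01\boldsymbol{I}\) imply if \(\boldsymbol{R}\) is 5 by 5.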

About this article

Cite this article

Lu, X., Hooten, M.B., Kaplan, A. et al. Improving Wildlife Population Inference Using Aerial Imagery and Entity Resolution. JABES 27, 364–381 (2022).
