Abstract
Sample selectivity is a recurrent problem in public health programmes and poses serious challenges to their evaluation. Traditional approaches to handling sample selection tend to rely on restrictive assumptions. The aim of this paper is to illustrate a copula-based selection model for handling sample selection in the evaluation of public health programmes. Motivated by a public health programme to promote physical activity in Leeds (England), we describe the assumptions underlying the copula selection model and its relative advantages compared with commonly used approaches to handling sample selection, such as inverse probability weighting and Heckman’s selection model. We illustrate the methods using the Leeds Let’s Get Active programme and show the implications of method choice for estimating the effect on individuals’ physical activity. The programme was associated with increased physical activity overall, but the magnitude of its effect differed according to the adjustment method. The copula selection model yielded an effect estimate similar to that of Heckman’s approach, but with narrower 95% confidence intervals. These results remained similar when different model specifications and alternative distributional assumptions were considered. The copula selection model can address important limitations of traditional approaches to sample selection, such as the Heckman model, and should be considered in the evaluation of public health programmes, where sample selection is likely to be present.
Ethics declarations
Funding
PC was supported through the White Rose PhD Studentship Network scheme as part of the National Institute for Health Research Collaboration for Leadership in Applied Health Research and Care Yorkshire and Humber.
Conflicts of interest
The authors declare no conflict of interest.
Ethical approval
Analysis of anonymised data did not require ethical approval.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Availability of data and material
No data are available. Programme data have been provided by the local City Council under a Data Processing Agreement.
Code availability
Software code for implementing the proposed copula framework using the R package GJRM is provided.
Authors’ contributions
PC and MG were responsible for designing the study and drafting the manuscript. AJH, SP, AP and CB revised the paper critically for intellectual content. All the authors read and approved the final version of the manuscript.
About this article
Cite this article
Candio, P., Hill, A.J., Poupakis, S. et al. Copula Models for Addressing Sample Selection in the Evaluation of Public Health Programmes: An Application to the Leeds Let’s Get Active Study. Appl Health Econ Health Policy 19, 305–312 (2021). https://doi.org/10.1007/s40258-020-00629-x
DOI: https://doi.org/10.1007/s40258-020-00629-x