Abstract
Comparative effectiveness studies can identify the causal effect of treatment if treatment is unconfounded with outcome conditional on a set of measured covariates. Matching aims to ensure that the covariate distributions are similar between treatment and control groups in the matched samples, and this should be done iteratively by checking and improving balance. However, an outstanding concern facing matching methods is how to prioritise competing improvements in balance across different covariates. We address this concern by developing a ‘loss function’ that an iterative matching method can minimise. Our ‘loss function’ is a transparent summary of covariate imbalance in a matched sample and follows general recommendations in prioritising balance amongst covariates. We illustrate this approach by extending Genetic Matching (GM), an automated approach to balance checking. We use the method to reanalyse a high-profile comparative effectiveness study of right heart catheterisation. We find that our loss function improves covariate balance compared with a standard GM approach and with matching on the published propensity score.
Notes
For example, there may be circumstances where a summary prognostic measure is excluded from the matching because it is highly correlated with the underlying covariates, and the best overall balance may be achieved by matching on the component measures rather than the summary variable. Here it would be important to check balance for both types of variable.
The negative of the p value is used because the measure is required to report imbalance, not balance.
The KS test statistic is the maximum discrepancy in the eQQ plot and is sensitive to imbalance across the empirical distribution.
Using p values may be preferable with the KS bootstrap test since the test statistic is not monotonically related to the p value when there are point masses in the empirical distribution (Abadie 2002).
More generally, the comparison sample can be any sample, not necessarily a PS matched sample.
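As a rough illustration of the KS test statistic described in the notes above, the sketch below computes the largest gap between two empirical distribution functions by hand and compares it with the statistic reported by base R's ks.test. The data are simulated and all variable names are assumptions made for illustration only.

```r
# Simulated covariate values; purely illustrative, not data from the study.
set.seed(1)
x.treated <- rnorm(50, mean = 0.5)
x.control <- rnorm(60, mean = 0)

# Evaluate both empirical distribution functions at every observed value.
grid <- sort(c(x.treated, x.control))
ecdf.gap <- abs(ecdf(x.treated)(grid) - ecdf(x.control)(grid))

# The two-sample KS statistic is the largest gap between the two functions.
ks.manual <- max(ecdf.gap)
ks.builtin <- unname(ks.test(x.treated, x.control)$statistic)
```

With continuous simulated data and no ties, the two quantities agree.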
References
Abadie, A.: Bootstrap tests for distributional treatment effects in instrumental variables models. J. Am. Stat. Assoc. 97, 284–292 (2002)
Abadie, A., Imbens, G.W.: Large sample properties of matching estimators for average treatment effects. Econometrica 74, 235–267 (2006)
Austin, P.C.: A critical appraisal of propensity-score matching in the medical literature between 1996 and 2003. Stat. Med. 27, 2037–2049 (2008)
Austin, P.C.: Balance diagnostics for comparing the distribution of baseline covariates between treatment groups in propensity-score matched samples. Stat. Med. 28, 3083–3107 (2009)
Austin, P.C., Grootendorst, P., Anderson, G.M.: A comparison of the ability of different propensity score models to balance measured variables between treated and untreated subjects: a Monte Carlo study. Stat. Med. 26, 734–753 (2007)
Brookhart, M.A., Schneeweiss, S., Rothman, K.J., Glynn, R.J., Avorn, J., Stürmer, T.: Variable selection for propensity score models. Am. J. Epidemiol. 163, 1149–1156 (2006)
Chittock, D., Dhingra, V., Ronco, J., Russell, J., Forrest, D., Tweeddale, M., Fenwick, J.: Severity of illness and risk of death associated with pulmonary artery catheter use. Crit. Care Med. 32, 911–915 (2004)
Cochran, W.G., Rubin, D.B.: Controlling bias in observational studies: a review. Sankhyā Indian J. Stat. A 35, 417–446 (1973)
Connors, A.F., Speroff, T., Dawson, N.V., Thomas, C., Harrell, F.E., Wagner, D., Desbiens, N., Goldman, L., Wu, A.W., Califf, R.M., Fulkerson, W.J., Vidaillet, H., Broste, S., Bellamy, P., Lynn, J., Knaus, W.A.: The effectiveness of right heart catheterization in the initial care of critically ill patients. J. Am. Med. Assoc. 276, 889–897 (1996)
Dawid, A.P.: Conditional independence in statistical theory. J. R. Stat. Soc. B 41, 1–31 (1979)
Diamond, A., Sekhon, J.: Genetic matching for estimating causal effects: a general multivariate matching method for achieving balance in observational studies. In progress (2010); working paper available from http://sekhon.berkeley.edu/papers/GenMatch.pdf
Drake, C.: Effects of misspecification of the propensity score on estimators of treatment effect. Biometrics 49, 1231–1236 (1993)
Glance, L.G., Osler, T.M., Mukamel, D.B., Dick, A.W.: Use of a matching algorithm to evaluate hospital coronary artery bypass grafting performance as an alternative to conventional risk adjustment. Med. Care 45, 292–299 (2007)
Grootendorst, P.: A review of instrumental variables estimation of treatment effects in the applied health sciences. Health Serv. Outcome Res. Methodol. 7, 159–179 (2007)
Hansen, B.B.: The prognostic analogue of the propensity score. Biometrika 95, 481–488 (2008)
Hansen, B.B., Bowers, J.: Covariate balance in simple, stratified and clustered comparative studies. Stat. Sci. 23, 219–236 (2008)
Harvey, S., Harrison, D.A., Singer, M., Ashcroft, J., Jones, C.M., Elbourne, D., Brampton, W., Williams, D., Young, D., Rowan, K.: Assessment of the effectiveness of pulmonary artery catheters in management of patients in intensive care (PAC-Man): a randomised controlled trial. Lancet 366, 472–477 (2005)
Helfand, M.: Comparative effectiveness research. Med. Decis. Mak. 29, 641 (2009)
Hill, J., Reiter, J.P.: Interval estimation for treatment effects using propensity score matching. Stat. Med. 25, 2230–2256 (2006)
Hirano, K., Imbens, G.W.: Estimation of causal effects using propensity score weighting: an application to data on right heart catheterization. Health Serv. Outcome Res. Methodol. 2, 259–278 (2001)
Ho, D., Imai, K., King, G., Stuart, E.A.: Matching as nonparametric preprocessing for reducing model dependence in parametric causal inference. Political Anal. 15, 199–236 (2007)
Imai, K., King, G., Stuart, E.A.: Misunderstandings between experimentalists and observationalists about causal inference. J. R. Stat. Soc. A 171, 481–502 (2008)
Knaus, W., Lynn, J.: Study to understand prognoses and preferences for outcomes and risks of treatment (SUPPORT) 1989-1997 [Computer file]. ICPSR version. George Washington University [producer], Washington, DC, 2000. Inter-university Consortium for Political and Social Research [distributor], Ann Arbor, MI, 2001 (1997). doi:10.3886/ICPSR02957
Lalonde, R.: Evaluating the econometric evaluations of training programs with experimental data. Am. Econ. Rev. 76, 604–620 (1986)
Levy, A., Harrigan, B., Johnston, K., Briggs, A.: Comparative effectiveness research through the looking glass. Med. Decis. Mak. 29, N6–N8 (2009)
Pearl, J.: Causal diagrams for empirical research. Biometrika 82, 669–710 (1995)
Rosenbaum, P.R.: Observational Studies. Springer, New York (2002)
Rosenbaum, P., Rubin, D.: The central role of the propensity score in observational studies for causal effects. Biometrika 70, 41–55 (1983)
Rosenbaum, P., Rubin, D.B.: Reducing bias in observational studies using subclassification on the propensity score. J. Am. Stat. Assoc. 79, 516–524 (1984)
Rosenbaum, P.R., Rubin, D.B.: Constructing a control group using multivariate matched sampling methods that incorporate the propensity score. Am. Stat. 39, 33–38 (1985)
Rubin, D.B.: Multivariate matching methods that are equal percent bias reducing, I: some examples. Biometrics 32, 109–120 (1976)
Rubin, D.B.: The design versus the analysis of observational studies for causal effects: parallels with the design of randomized trials. Stat. Med. 26, 20–36 (2007)
Sekhon, J.S.: Multivariate and propensity score matching software with automated balance optimization: the matching package for R. J. Stat. Softw. 42, 1–52 (2011)
Appendix: Example of a user defined loss function together with annotated R code
The function takes as input the variables matches, BM and nboots. The variable matches is a matrix with three columns: the indices of the treated units, the indices of the control units for the current matched sample, and the weights corresponding to each matched pair. The variable BM is the balance matrix, whose columns hold the data for the variables used to define the loss function. The variable nboots is the number of bootstrap samples to use when computing the KS-bootstrap test p value.
my.fitfunc <- function(matches, BM, nboots=0){
  index.treated <- matches[,1]
  index.control <- matches[,2]
  weights <- matches[,3]
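To make these inputs concrete, the following sketch builds a small hypothetical matches matrix and balance matrix and extracts the covariate values for each side of the matched pairs. All values and column names are invented for illustration and do not come from the study data.

```r
# Hypothetical matched sample with three pairs; values are illustrative only.
matches <- cbind(c(1, 2, 3),   # row indices of treated units
                 c(7, 5, 7),   # row indices of matched control units
                 c(1, 1, 1))   # weight for each matched pair
index.treated <- matches[, 1]
index.control <- matches[, 2]
weights <- matches[, 3]

# Hypothetical balance matrix: two covariates recorded for eight units.
BM <- cbind(age   = c(61, 70, 55, 48, 69, 52, 60, 66),
            score = c(20, 31, 18, 15, 30, 17, 21, 25))

# Covariate values on each side of the matched pairs.
age.treated <- BM[index.treated, "age"]
age.control <- BM[index.control, "age"]
```

Control unit 7 appears twice here, which is what a matched sample looks like when a control is reused for more than one treated unit.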
The variable nvars is the number of variables in the balance matrix. The variables bm1.nvars and bm2.nvars are the numbers of high and low priority variables respectively; bm1.nvars is not set inside the function, so it must already be defined in the environment from which the function is called. The high priority variables are assumed to occupy the first columns of BM. The vector pvals is initialised with NA and has two entries per high priority variable, one for the KS-test p value and one for the t-test p value.
  nvars <- ncol(BM)
  bm2.nvars <- nvars-bm1.nvars
  pvals <- c(rep(NA,2 * bm1.nvars))
The t-test and KS-test p values are computed for all of the high priority variables and stored in the vector pvals. The ks.boot function performs the KS-bootstrap test with the number of bootstraps set by its parameter nboots; if that parameter is 0, the ordinary KS-test is performed. Here, if the variable nboots is less than 500, the parameter nboots is set to 0 and the warnings about a low number of bootstraps are suppressed; otherwise the parameter is set to the value of the variable nboots. The KS-test p values are stored at the beginning of the vector pvals, followed by the t-test p values.
  for (i in 1:bm1.nvars){
    if(nboots >= 500){
      pvals[i] <- ks.boot(BM[index.treated,i], BM[index.control,i],
                          nboots=nboots)$ks.boot.pvalue
    } else {
      suppressWarnings(pvals[i] <- ks.boot(BM[index.treated,i],
                          BM[index.control,i],nboots=0)$ks$p.value)
    }
    pvals[i+bm1.nvars] <- balanceUV(BM[index.treated,i],
                          BM[index.control,i], paired=TRUE, weights=weights,
                          match=TRUE)$p.value
  }
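The layout of pvals can be sketched with base R alone. In the example below, ks.test and a paired t.test act as stand-ins for ks.boot and balanceUV, the data are simulated, and bm1.nvars is set to an assumed value. It is meant only to show where each p value is stored, not to reproduce the published function.

```r
set.seed(2)
bm1.nvars <- 2    # assumed number of high priority variables
n.pairs <- 40     # assumed number of matched pairs

# Simulated covariate values for the treated and control side of each pair.
treated <- matrix(rnorm(n.pairs * bm1.nvars, mean = 0.2), ncol = bm1.nvars)
control <- matrix(rnorm(n.pairs * bm1.nvars), ncol = bm1.nvars)

pvals <- rep(NA_real_, 2 * bm1.nvars)
for (i in seq_len(bm1.nvars)) {
  # Positions 1..bm1.nvars hold the KS-test p values.
  pvals[i] <- ks.test(treated[, i], control[, i])$p.value
  # Positions bm1.nvars+1..2*bm1.nvars hold the paired t-test p values.
  pvals[i + bm1.nvars] <- t.test(treated[, i], control[, i],
                                 paired = TRUE)$p.value
}
```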
The p values are computed similarly for all of the low priority variables and stored in the vector pvals3 instead of pvals. To reference the appropriate columns of BM, the loop is
  for (i in (bm1.nvars+1):nvars){ }
The variable indx is a logical vector indicating whether each p value in the current matched sample is smaller than the corresponding value stored in conpvals, which is not set inside the function and so must be defined in the calling environment. The offset sqrt(.Machine$double.eps) allows for machine precision. Since the GM algorithm minimises the output of the function, whereas the balance test p values are to be maximised, pvals is multiplied by −1. The variable pvals is then copied to a variable named pvals1.
  indx <- conpvals > pvals + sqrt(.Machine$double.eps)
  pvals <- -1*pvals
  pvals1 <- pvals
One is added to those p values from the current matched sample that are lower than conpvals, i.e. those with indx=TRUE, making them positive. Since GM minimises the loss function, it moves towards matched samples which maximise the number of p values that are better than conpvals.
  for(i in 1:length(pvals)){
    if(indx[i]){
      pvals1[i] <- 1 + pvals[i]
    }
  }
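A small worked example of the three steps above (comparison with conpvals, negation, and the offset of one) may help. The p values are hypothetical, and the vectorised assignment is an assumed equivalent of the loop rather than the published code.

```r
# Hypothetical p values, for illustration only.
conpvals <- c(0.30, 0.10, 0.50, 0.05)   # comparison sample
pvals    <- c(0.20, 0.40, 0.50, 0.01)   # current matched sample

# TRUE where the current sample is worse balanced than the comparison sample.
indx <- conpvals > pvals + sqrt(.Machine$double.eps)

# Negate so that smaller values mean better balance, then shift the
# worse-than-comparison entries by one so that they become positive.
pvals1 <- -pvals
pvals1[indx] <- 1 + pvals1[indx]
```

The first and fourth entries are worse than their comparison values, so they become 0.80 and 0.99; the other two stay negative at -0.40 and -0.50. The third entry ties with its comparison value and is not flagged.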
The following line is very important, since it structures the output according to Fig. 4. The vector pvals1 contains the first two blocks and is sorted in decreasing order. Since the values in pvals1 that correspond to variables with worse balance than conpvals are positive, they fall in the first block and are given highest priority. The vector pvals3 is given lowest priority and is also sorted in decreasing order, as in Fig. 4.
  loss <- c(sort(pvals1,decreasing=TRUE),sort(pvals3,decreasing=TRUE))
  return(loss)
}
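The sketch below illustrates why sorting in decreasing order puts the worst imbalance first. It assumes that the optimiser compares loss vectors lexicographically, that is, element by element from the left, with smaller values preferred. The helper lex.less and all numbers are hypothetical and are not part of the published function.

```r
# Hypothetical helper: TRUE if loss vector a comes before loss vector b
# when compared element by element from the left (smaller is better).
lex.less <- function(a, b) {
  d <- which(a != b)
  length(d) > 0 && a[d[1]] < b[d[1]]
}

# Hypothetical transformed p values for two candidate matched samples.
pvals1.a <- c(-0.40, 0.80);  pvals3.a <- c(-0.60, -0.20)
pvals1.b <- c(-0.40, -0.10); pvals3.b <- c(-0.05, -0.02)

# Assemble each loss vector as in the function above.
loss.a <- c(sort(pvals1.a, decreasing = TRUE), sort(pvals3.a, decreasing = TRUE))
loss.b <- c(sort(pvals1.b, decreasing = TRUE), sort(pvals3.b, decreasing = TRUE))

# Is sample b preferred over sample a under lexicographic comparison?
b.preferred <- lex.less(loss.b, loss.a)
```

Under this assumption sample b is preferred: sample a has a high priority p value that is worse than the comparison sample (the positive entry 0.80), and this outweighs its better balance on the low priority variables.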
Cite this article
Ramsahai, R.R., Grieve, R. & Sekhon, J.S. Extending iterative matching methods: an approach to improving covariate balance that allows prioritisation. Health Serv Outcomes Res Method 11, 95–114 (2011). https://doi.org/10.1007/s10742-011-0075-5
DOI: https://doi.org/10.1007/s10742-011-0075-5