Instrumental variable specifications and assumptions for longitudinal analysis of mental health cost offsets
DOI: 10.1007/s10742-012-0097-7
- Cite this article as:
- O’Malley, A.J. Health Serv Outcomes Res Method (2012) 12: 254. doi:10.1007/s10742-012-0097-7
Abstract
Instrumental variables (IVs) enable causal estimates in observational studies to be obtained in the presence of unmeasured confounders. In practice, a diverse range of models and IV specifications can be brought to bear on a problem, particularly with longitudinal data where treatment effects can be estimated for various functions of current and past treatment. However, in practice the empirical consequences of different assumptions are seldom examined, despite the fact that IV analyses make strong assumptions that cannot be conclusively tested by the data. In this paper, we consider several longitudinal models and specifications of IVs. Methods are applied to data from a 7-year study of mental health costs of atypical and conventional antipsychotics whose purpose was to evaluate whether the newer and more expensive atypical antipsychotic medications lead to a reduction in overall mental health costs.
Keywords
Causal inference, Exclusion restriction, Fixed differences, Instrumental variable, Longitudinal, Mental health costs
1 Introduction
Estimation of causal effects in observational studies is an engrossing and controversial topic in statistics and the social sciences. Some investigators consider observational studies to lack internal validity as the absence of randomization exposes results to bias from unmeasured confounding variables. Yet observational studies are an important part of medical and health care research. They can be performed in situations where randomized trials are infeasible, they generate larger datasets, and they may involve more diverse study populations. Therefore, observational studies allow estimates of treatment effects for more nuanced subpopulations and are better equipped to account for treatment-effect heterogeneity than randomized trials.
Instrumental variables (IVs) identify randomized experiments that are naturally occurring, enabling estimation of causal effects in observational studies. Loosely speaking, an IV must predict treatment but not be directly related to the outcome or any unmeasured confounding variables (Imbens and Angrist 1994; Angrist et al. 1996). An IV extracts the variation in the supposed endogenous predictor(s) that is orthogonal to any unmeasured confounding variables, yielding projected values from which the causal effect of the endogenous predictor(s) on the outcome can be determined. Unlike regression and propensity score methods (Rosenbaum and Rubin 1983), IV methods accommodate unmeasured confounders. Although testing whether a variable predicts treatment is straightforward, the requirement that the same variable does not directly affect the outcome (the exclusion restriction) is the bane of all IV analyses. First, even a small direct effect on the outcome violates the exclusion restriction. Second, it is not possible to test the exclusion restriction by simply including the IV as a predictor of the outcome as its own effect is then confounded with that of any unmeasured confounders (Morgan and Winship 2007, pp 196–197). Therefore, the choice of IVs must be undertaken with great care.
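As an illustration of the idea only, the following sketch simulates a toy data set in which an unmeasured confounder biases the OLS slope while a simple IV (ratio-of-covariances) estimator recovers the true effect. The data-generating values (instrument strength 0.8, confounder weight 2.0, a continuous treatment) are assumptions chosen for the example and are unrelated to the offsets data.

```python
import numpy as np

def simulate_iv_vs_ols(n=20000, true_effect=1.0, seed=0):
    """Toy data: u confounds a and y; z shifts a, is unrelated to u,
    and has no direct path to y (all values are illustrative)."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)                      # unmeasured confounder
    z = rng.normal(size=n)                      # instrument
    a = 0.8 * z + u + rng.normal(size=n)        # treatment depends on z and u
    y = true_effect * a + 2.0 * u + rng.normal(size=n)
    # OLS slope of y on a: biased because a and u are correlated
    ols = np.cov(a, y)[0, 1] / np.var(a, ddof=1)
    # IV estimator with a single instrument: cov(z, y) / cov(z, a)
    iv = np.cov(z, y)[0, 1] / np.cov(z, a)[0, 1]
    return ols, iv

ols_est, iv_est = simulate_iv_vs_ols()
print(f"OLS: {ols_est:.2f}  IV: {iv_est:.2f}  truth: 1.00")
```

In this setup the OLS slope is pulled upward by the confounder, whereas the IV estimate stays near the true value because only the variation in treatment induced by the instrument is used.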
Longitudinal studies generalize cross-sectional designs by accommodating repeated observations over time on the same study unit (e.g., a patient). They allow dynamic treatment effects (e.g., the effect of a change in treatment) and modifying effects (e.g., the effect of current treatment changes with past treatment) to be estimated. In addition, individual dummy variables may be used to block the effects of time-invariant confounders. An important question is whether longitudinal data enhances examination of the IV assumptions.
In this paper we discuss the use of IVs in longitudinal analyses with particular focus on lagged predictors and outcomes. Treatment is represented using contemporaneous, lagged, and modifying variables. Because lagged treatment may be assumed to be endogenous, exogenous, an IV, or to have no effect whatsoever, and lagged outcomes may be predictors, a multitude of longitudinal models are possible.
Various model specifications are compared by evaluating the effect of atypical versus conventional antipsychotic drugs on overall mental health costs, defined as the cost of treatment and subsequent medical care in that year for Medicaid recipients. The same data were analyzed previously by fitting a cross-sectional model using ordinary least squares (OLS) and various IV methods (O’Malley et al. 2011). However, in this cross-sectional setting the IV was borderline weak. Therefore, another key question is whether availability of longitudinal data allows the IV to be strengthened.
There are several important papers on IV methods for longitudinal (Hogan and Lancaster 2004; McClellan and Newhouse 1997) and other data types involving lagged variables, including spatially lagged data (Haining 1978; Kelejian and Robinson 1993). However, while several areas of statistical methodology consider the use of lagged variables as predictors (e.g., longitudinal analysis, time series analysis), their use as IVs has been studied less extensively. An exception is the work of several econometricians on methods for analyzing panel data (Arellano and Honore 2001, chapter 53; Hsiao 2003, chapter 4).
In Sect. 2 we review past work on mental health cost offsets and introduce the data and key variables motivating this work. The implication of differing assumptions about the causal relationships involving unmeasured confounding variables is illustrated using directed acyclic graphs (DAGs) in Sect. 3. In particular, we describe situations where lagged outcomes and treatments have different roles including when they should not be adjusted for, when they should be adjusted for, and when there is ambiguity. In Sect. 4 we introduce notation, models, estimands and IV assumptions for the mental health cost offsets analysis. Section 5 describes the IV requirements for each model and the method of estimation. In Sect. 6 we compare results across the models. The paper concludes with a discussion of the main findings in Sect. 7.
2 Background
2.1 Mental health cost offsets hypothesis
Atypical antipsychotics, including clozapine, olanzapine (Zyprexa), quetiapine (Seroquel), and risperidone (Risperdal), while considerably more expensive than the D2-antagonists, have been associated with a different (neurological versus physical) profile of side effects (O’Malley et al. 2011). It is thought that the greater tolerability of these new antipsychotics improves adherence to treatment regimens, thereby reducing relapses and resulting in declines in the use of hospital and emergency room services. This has led to the offset hypothesis that atypical antipsychotics, while more expensive, ultimately pay for themselves through reductions in other types of health spending (Lichtenberg 2001). However, the hypothesis is disputed (Rosenheck et al. 2006) and testing it is complicated by the fact that patients who receive the newer atypical drugs likely differ from those getting the older drugs on a number of systematic factors, some unobserved.
2.2 Study population and variables
The data motivating this research are from Florida’s Medicaid population over the period July 1994–June 2001. Study years run from July 1 of one year to June 30 of the next year. The analysis sample was restricted to patients continuously enrolled for 6 months or more of a given study year (26,759 individuals).
Log-annual mental health spending is the dependent variable and plurality drug type (defined as a binary variable indicating whether atypical or conventional antipsychotic drugs comprised the majority of an individual’s Medicaid claims for the year) is the key predictor or “treatment.” The assumed exogenous predictors are male, white, black, history of substance abuse, recipient of supplemental security income (SSI), study year and area of residence. Because Miami–Key West is the most populous area, indicator variables for the ten other areas are included as predictors. Unmeasured confounders could include health status of the patient (other physical and mental health comorbidities, severity of illness), patient preferences over treatment, access to skilled physicians, and physician prescribing habits. Many of these are time-varying and therefore cannot be blocked by patient dummies.
The approval status of the atypicals introduced during our study period (Zyprexa, Seroquel, Geodon) and their interactions with area of residence were previously used as IVs.^{1} Clearly, whether a drug has been approved impacts the likelihood an individual receives an atypical at a given time. Because areas have different geographic, cultural, social and economic factors and physicians in them may have varying attitudes, the uptake of atypicals is likely to vary between areas. Thus, the likelihood a patient is prescribed an atypical is expected to depend on where they live (O’Malley et al. 2011). In this paper the consequence of supplementing these IVs with additional variables only available with longitudinal data will be investigated.
Table 1 Basic identification improvements over cross-sectional analysis
Model | Estimate | t-stat | P-value | F_{StageI} |
---|---|---|---|---|
Ordinary least squares | | | | |
Cross-sectional | 1.022 | 76.4 | 0.000 | |
Fixed differences | 0.613 | 44.1 | 0.000 | |
IV regression (two-stage least squares) | | | | |
Cross-sectional | −0.028 | −0.17 | 0.866 | 9.69 |
Fixed differences | −0.590 | −3.46 | 0.001 | 7.31 |
Add a _{ i(t−2)} as IV | 0.133 | 1.28 | 0.199 | 15.5 |
3 Causal assumptions
Conditioning on different subsets of the history of the outcomes or the treatments has been shown to have dramatic effects on the resulting inference (Pepe and Anderson 1994; Vansteelandt 2007). Therefore, it is important to consider the implications of including or excluding each candidate predictor in the model. DAGs are useful for depicting the data generating mechanism and the causal assumptions made by various models. Let Y, A, X, U and Z be random variables denoting the outcome, treatment, exogenous covariates, unmeasured covariates, and IVs for an individual. We use the subscript t for time and for illustration consider the case \(t \in \{0,1\}\).
3.1 Conditioning on lagged treatments and outcomes
In order to obtain a consistent estimate of the effect of A _{1} on Y _{1}, it is necessary to condition on A _{0} as it would otherwise be an unmeasured confounder. However, while conditioning on Y _{0} does not affect the identifiability of the effect of A _{1} on Y _{1}, it has implications for the effect of A _{0} on Y _{1}. If Y _{0} is not conditioned on then the direct effect of A _{0} on Y _{1} is confounded with the effect acting through Y _{0}. If Y _{0} is conditioned on then the unblocked path from A _{0} to Y _{1} through U, which arises because Y _{0} is caused by both A _{0} and U, leads to lack of identifiability (Sharkey and Elwert 2010).^{2} Specifically, one cannot distinguish the effect of A _{0} on Y _{1} from that induced through U. Therefore, whether or not Y _{0} is conditioned on, the direct effect of A _{0} on Y _{1} is not identified.
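The collider mechanism can be seen in a small simulation. The sketch below is a toy example under an assumption made purely to isolate the effect: A _{0} and U are generated independently, and Y _{0} is their sum plus noise. Regressing U on A _{0} alone gives a coefficient near zero, while adding Y _{0} to the regression induces a clearly negative association between A _{0} and U.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50000
a0 = rng.normal(size=n)             # lagged treatment (toy, continuous)
u = rng.normal(size=n)              # unmeasured variable, independent of a0 by construction
y0 = a0 + u + rng.normal(size=n)    # collider: caused by both a0 and u

def ols_coefs(X, y):
    """OLS coefficients with an intercept prepended to X."""
    design = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(design, y, rcond=None)[0]

# Coefficient of a0 when predicting u, without and with conditioning on y0
marginal = ols_coefs(a0, u)[1]
conditional = ols_coefs(np.column_stack([a0, y0]), u)[1]
print(f"a0 coefficient: unconditional {marginal:.3f}, conditional on y0 {conditional:.3f}")
```

Under these assumptions the conditional coefficient approaches −0.5, which is the dependence between A _{0} and U that conditioning on Y _{0} opens up.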
3.2 Need for IVs
Under Figure 3 the IV analysis does not need to involve Y _{0}, A _{0} or X. However, if Z also caused A _{0} it would then be necessary to condition on X and either Y _{0} or A _{0}. Because conditioning on Y _{0} and X blocks all paths from A _{0} to Y _{1}, a test of the validity of the model assumptions is to include A _{0} in the model; a statistically significant coefficient of the effect of A _{0} on Y _{1} would raise concerns about the validity of the model.^{4}
4 Notation and models for offsets analysis
4.1 Longitudinal models
If a _{ it } is endogenous then any variable that interacts with a _{ it } is also endogenous. However, while a _{ it } a _{ i(t−1)} inherits endogeneity from a _{ it }, a _{ i(t−1)} need not be endogenous. For both (2) and (3) we evaluate the consequence of a _{ i(t−1)} being endogenous (as in Fig. 4), exogenous (as in Fig. 1), and usable as an IV (as in Fig. 2 or Fig. 3). Because adjusting for y _{ i(t−1)} can be problematic (Figs. 1, 4), the estimates obtained under this model are compared to those for models that exclude y _{ i(t−1)}.
Although random effect models are common in longitudinal analyses they are problematic when y _{ i(t−1)} (or other lagged outcome) is a predictor as the assumption that random β _{0i } is uncorrelated with the predictors is violated (Wooldridge 2002, p. 256). This is seen from the fact that β _{0i } affects the expected value of all observations on an individual, including y _{ i(t−1)}. Therefore, under a random effects specification, β _{0i } would be correlated with y _{ i(t−1)}, which is a predictor of y _{ it }. Thus, we avoid random effect specifications for β _{0i }. Because we do not model the correlation structure we use robust standard errors to account for dependence within individuals (Huber 1967; White 1982).
5 IV requirements
The general requirements for z _{ it } to be an IV for the effect of a _{ it } on y _{ it } are: (1) it is associated with a _{ it } conditional on x _{ it }, u _{ it }; (2) it is not associated with u _{ it } conditional on x _{ it }; (3) it is not associated with y _{ it } conditional on a _{ it }, x _{ it }, u _{ it }. The more precisely z _{ it } predicts a _{ it } the greater the statistical power of the analysis; perfect predictions typically occur only in randomized studies with 100% compliance with treatment assignment. Condition (2) guards against any backdoor pathways from z _{ it } through u _{ it } to y _{ it } and is sometimes referred to as the “random” requirement. Condition (3), the “exclusion restriction,” excludes z _{ it } from having a direct effect on y _{ it } other than through a _{ it }.
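Condition (1) is the only requirement that can be checked directly, and it is commonly summarized by the stage I F statistic for the excluded instruments. The sketch below computes the classical (homoskedastic) version of that statistic by comparing stage I regressions with and without the instruments. The function name and the simulated data are hypothetical, and the statistic reported by a given software package (for example, one based on robust standard errors) may be computed differently.

```python
import numpy as np

def first_stage_f(a, x, z):
    """Classical F statistic for H0: all coefficients on z are zero
    in the regression of a on (1, x, z)."""
    n = len(a)
    restricted = np.column_stack([np.ones(n), x])   # intercept + exogenous predictors
    full = np.column_stack([restricted, z])         # ... plus the instruments

    def rss(design):
        coef = np.linalg.lstsq(design, a, rcond=None)[0]
        return float(np.sum((a - design @ coef) ** 2))

    q = full.shape[1] - restricted.shape[1]         # number of instruments tested
    resid_df = n - full.shape[1]
    return ((rss(restricted) - rss(full)) / q) / (rss(full) / resid_df)

# Toy comparison of a strong and a weak instrument (illustrative values)
rng = np.random.default_rng(2)
n = 2000
x = rng.normal(size=n)
z = rng.normal(size=n)
noise = rng.normal(size=n)
f_strong = first_stage_f(0.5 * z + x + noise, x, z)
f_weak = first_stage_f(0.02 * z + x + noise, x, z)
print(f"F (strong instrument): {f_strong:.1f}, F (weak instrument): {f_weak:.1f}")
```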
A DAG-based test of z _{ it } as an IV in Fig. 4 is: after removing all arcs out of a _{ it } no path leads from z _{ it } to y _{ it } conditional on x _{ it } (Brito and Pearl 2002; Joffe et al. 2008). Any unmeasured area level variables are absorbed in u _{ it }. However, because such variables are time-invariant the inclusion of the area dummies in x _{ it } blocks their effects.
5.1 Using longitudinal data to enhance IVs
In the cross-sectional analysis of the offsets data, the IVs were contemporaneous indicators of the approval status of zyprexa, seroquel and geodon and their interactions with area of residence. However, the model for the outcome is suggestive of additional IVs; {a _{ i(t−k)}}_{ k>1} do not appear in either (2) or (3), which is consistent with them not having a direct effect on y _{ it }. Because a _{ i(t−2)} is evaluated at least a year earlier than y _{ it }, it is plausible that it is uncorrelated with y _{ it } conditional on (y _{ i(t−1)}, a _{ it }, a _{ i(t−1)}, x _{ it }). If a _{ i(t−2)} is correlated with a _{ it } conditional on (a _{ i(t−1)}, x _{ it }, u _{ it }) then a _{ i(t−2)} is a valid IV. In general, if treatment influences subsequent treatment for a longer period than it influences outcomes, then the lagged treatment variables from the differential period are candidate IVs.^{5}
When β_{3} = 0 in (2), a _{ i(t−1)} is a candidate IV for a _{ it }. However, if a _{ i(t−1)} is associated with an unmeasured confounder (e.g., as in Fig. 1 when Y _{0} is conditioned on), it violates the IV assumptions. If a _{ i(t−2)} or any other variable is known to be a valid IV, the Sargan over-identifying restrictions test (ORT) may be used to evaluate whether a _{ i(t−1)} is a valid IV (Sargan 1958).
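A minimal sketch of the Sargan statistic follows, assuming one endogenous regressor, two candidate instruments, and homoskedastic errors. The 2SLS residuals are regressed on the full instrument set and n times the resulting R² is referred to a chi-square distribution with degrees of freedom equal to the number of instruments minus the number of endogenous regressors. The helper names and simulated data are hypothetical, and the closed-form p-value applies only to the one-degree-of-freedom case used here.

```python
import math
import numpy as np

def fit(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

def sargan_test(y, a, instruments):
    """Sargan statistic for one endogenous regressor a and an (n, q) instrument matrix, q >= 2."""
    n = len(y)
    ones = np.ones(n)
    Z = np.column_stack([ones, instruments])
    a_hat = Z @ fit(Z, a)                              # stage I fitted values
    beta = fit(np.column_stack([ones, a_hat]), y)      # stage II coefficients
    resid = y - np.column_stack([ones, a]) @ beta      # residuals use the observed a
    unexplained = resid - Z @ fit(Z, resid)
    r2 = 1.0 - np.sum(unexplained ** 2) / np.sum((resid - resid.mean()) ** 2)
    return max(n * r2, 0.0), instruments.shape[1] - 1  # statistic, degrees of freedom

def chi2_sf_df1(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

rng = np.random.default_rng(3)
n = 5000
u = rng.normal(size=n)
z1, z2 = rng.normal(size=n), rng.normal(size=n)
a = z1 + z2 + u + rng.normal(size=n)
y_valid = a + u + rng.normal(size=n)      # both instruments satisfy the exclusion restriction
y_invalid = y_valid + 0.5 * z2            # z2 now has a direct effect on the outcome

Z12 = np.column_stack([z1, z2])
stat_valid, df = sargan_test(y_valid, a, Z12)
stat_invalid, _ = sargan_test(y_invalid, a, Z12)
p_valid, p_invalid = chi2_sf_df1(stat_valid), chi2_sf_df1(stat_invalid)
```

The test has power only against instruments that disagree with one another; if every instrument violates the exclusion restriction in the same direction, the statistic can remain small.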
5.2 Estimation: two-stage least squares (2SLS)
To avoid estimating the fixed effects {β _{0i }}_{1:N }, estimation of the longitudinal models is accomplished by regressing the first-differenced outcomes on the first-differenced predictors, with differences taken within individuals (Wooldridge 2002, pp. 279–281). Because differencing accounts for all time-invariant variation, the strength of the IV is governed by the extent to which intra-individual variation in z _{ it } predicts intra-individual variation in a _{ it }. Conversely, the exclusion restriction is only violated by intra-individual variation directly related to y _{ it }.
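The way first differencing removes the individual effects can be seen in a short sketch. The helper below is hypothetical and assumes the rows are sorted by individual and then by time with no gaps; the toy outcome is a noise-free function of an individual intercept and treatment, so the differenced outcome depends on the differenced treatment alone.

```python
import numpy as np

def first_difference(values, ids):
    """Return v_it - v_i(t-1) within each individual.
    Assumes rows are sorted by (id, time) with consecutive time points."""
    values = np.asarray(values, dtype=float)
    ids = np.asarray(ids)
    same_person = ids[1:] == ids[:-1]       # drop differences that span two individuals
    return (values[1:] - values[:-1])[same_person]

ids = np.array([1, 1, 1, 2, 2])
a = np.array([0, 1, 1, 0, 1])               # treatment indicator
beta0 = np.array([5, 5, 5, -3, -3])         # individual fixed effects (illustrative)
y = beta0 + 2 * a                           # outcome with a treatment effect of 2

dy = first_difference(y, ids)               # [2., 0., 2.]
da = first_difference(a, ids)               # [1., 0., 1.]
```

The fixed effects 5 and −3 do not appear in `dy`, which equals twice `da` exactly in this noise-free example.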
A virtue of first differencing over mean-centering (subtraction of the individual sample mean \(\bar{v}_{i}\) from v _{ it }, \(t=1,\ldots,T\)) is that it makes a _{ i(t−2)} more defensible as an IV. This is seen from the fact that under (2) and (3) the first-differenced error, \(\epsilon_{it}-\epsilon_{i(t-1)}\), is independent of a _{ i(t−2)} − a _{ i(t−3)}. However, if a _{ it } depends on \(\epsilon_{it}\) for t = 1,…,T then \(a_{i(t-2)}-\bar{a}_{i}\) and the mean-centered error \(\epsilon_{i(t-2)}-\bar{\epsilon}_{i}\) appear likely to be correlated.
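The contrast can be made explicit under two simplifying assumptions that are introduced here for illustration only: the errors \(\epsilon_{it}\) are serially uncorrelated, and \(a_{is}\) is correlated only with the contemporaneous error, with \(\mathrm{Cov}(a_{is},\epsilon_{is})=c\) for every s. Under these assumptions the differenced instrument is uncorrelated with the differenced error, whereas the mean-centered instrument and the mean-centered error share the terms in the individual means:

$$ \mathrm{Cov}\bigl(a_{i(t-2)}-a_{i(t-3)},\ \epsilon_{it}-\epsilon_{i(t-1)}\bigr) = 0, $$

$$ \mathrm{Cov}\bigl(a_{i(t-2)}-\bar{a}_{i},\ \epsilon_{it}-\bar{\epsilon}_{i}\bigr) = 0 - \frac{c}{T} - \frac{c}{T} + \frac{1}{T^{2}}\sum_{s=1}^{T} c = -\frac{c}{T}. $$

The four terms in the second line are, in order, \(\mathrm{Cov}(a_{i(t-2)},\epsilon_{it})\), \(-\mathrm{Cov}(a_{i(t-2)},\bar{\epsilon}_{i})\), \(-\mathrm{Cov}(\bar{a}_{i},\epsilon_{it})\) and \(\mathrm{Cov}(\bar{a}_{i},\bar{\epsilon}_{i})\). The correlation vanishes only as T grows, which is of little comfort in a panel with at most seven study years.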
By using a _{ i(t−2)} as an IV and basing estimates on first differences, only observations with non-missing (a _{ it }, a _{ i(t−1)}, a _{ i(t−2)}, a _{ i(t−3)}) are used in the analysis, leading to a substantial loss of information. Rather than require that all IVs be available for all observations, we do not use a _{ i(t−2)} as an IV for observations in which it is missing [an approach proposed in Arellano and Bond (1991)]. Let r _{ it } = 1 if a _{ i(t−2)} is missing and r _{ it } = 0 otherwise. Then set the component of z _{ it } corresponding to a _{ i(t−2)} equal to 0 if r _{ it } = 1. Because r _{ it } is not expected to contain any information about y _{ it } we use it as an additional IV. If all of the IVs are valid then the treatment effect is not affected by the removal or addition of any particular IV from the analysis (Small 2007). Therefore, using r _{ it } as an additional IV is only expected to affect the precision of the estimated treatment effects.
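The construction of the lagged-treatment instrument and its missingness indicator can be sketched as follows for a single individual. The function name and the use of `None` to mark an unobserved year are hypothetical conventions for this example; the instrument component is set to 0 whenever the lagged treatment is unavailable, and the indicator records those cases.

```python
def lagged_treatment_instrument(a_by_year, lag=2):
    """For one individual's yearly treatment sequence (None = unobserved), return (z, r):
    z_t = a_(t-lag) when it is observed and 0 otherwise;
    r_t = 1 when a_(t-lag) is missing and 0 otherwise."""
    z, r = [], []
    for t in range(len(a_by_year)):
        lagged = a_by_year[t - lag] if t - lag >= 0 else None
        if lagged is None:
            z.append(0)   # zero-fill so the observation is retained
            r.append(1)
        else:
            z.append(lagged)
            r.append(0)
    return z, r

z, r = lagged_treatment_instrument([1, 0, None, 1, 1])
# z == [0, 0, 1, 0, 0]
# r == [1, 1, 0, 0, 1]
```

The first two years have no treatment history two years back and the final year refers to an unobserved year, so all three are flagged by `r` rather than dropped.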
- 1. Use OLS to fit the “stage I” regression equation
$$ \tilde{a}_{it} = \theta_{1}\tilde{y}_{i(t-1)} + \theta_{2}\tilde{a}_{i(t-1)} + \boldsymbol{\theta}_{3}^{T}\tilde{\mathbf{x}}_{it} + \boldsymbol{\theta}_{4}^{T}\tilde{\mathbf{z}}_{it} + \tilde{\delta}_{it} $$
to obtain fitted values \(\hat{a}_{it}\).
- 2. Use OLS to fit the outcome or “stage II” regression equation
$$ \tilde{y}_{it} = \beta_{1} \tilde{y}_{i(t-1)} + \beta_{2}\hat{a}_{it} + \beta_{3}\tilde{a}_{i(t-1)} + \boldsymbol{\beta}_{5}^{T}\tilde{\mathbf{x}}_{it} + \tilde{\epsilon}_{it}, $$
yielding estimates of β_{2} and the other model parameters.
As depicted above, all exogenous predictors in the outcome (stage II) equation are included in the stage I equation (Angrist and Pischke 2009, p. 189).
A curious feature of (4) is that \(\tilde{y}_{i(t-1)}\), \(\tilde{\mathbf{x}}_{it}\), and \(\tilde{\mathbf{z}}_{it}\) are predictors of \(\tilde{a}_{i(t-1)}\) (second equation). The anomaly that \(\tilde{y}_{i(t-1)}\) is a predictor of \(\tilde{a}_{i(t-1)}\) in (4) emphasizes that the stage I equations do not depict models that we believe in but are artifacts of the estimation procedure. The stage I equations are determined solely by the outcome equation and the designated instruments. In contrast, under a parametric structural equation model such as the “Heckit model” (Arendt and Holm 2008), a bivariate model is assumed in which the predictors in the treatment selection equations (for a _{ it }, a _{ i(t−1)}) need not include the same exogenous predictors as the outcome equation for y _{ it }.
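The two steps can be sketched with plain least squares. The function below is a hypothetical helper, not the Stata routine used for the analysis: it assumes all inputs have already been first-differenced (so no intercept is added), treats a single variable as endogenous, and returns point estimates only. Standard errors taken from the stage II regression as fitted here would be incorrect, since they ignore that the fitted treatment is itself estimated.

```python
import numpy as np

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

def two_stage_least_squares(y, a, x, z):
    """y: outcome; a: endogenous treatment; x: exogenous predictors; z: instruments.
    Inputs are assumed to be first-differenced already, so no intercept is included."""
    stage1_design = np.column_stack([x, z])        # exogenous predictors AND instruments
    a_hat = stage1_design @ ols(stage1_design, a)  # stage I fitted values
    stage2_design = np.column_stack([a_hat, x])    # treatment replaced by its fitted value
    return ols(stage2_design, y)                   # first entry: treatment coefficient

# Toy check with a known treatment effect of -0.5 (illustrative values)
rng = np.random.default_rng(4)
n = 20000
u = rng.normal(size=n)                             # unmeasured confounder
x = rng.normal(size=n)
z = rng.normal(size=n)
a = 0.7 * z + 0.5 * x + u + rng.normal(size=n)
y = -0.5 * a + 0.3 * x + 1.5 * u + rng.normal(size=n)

beta = two_stage_least_squares(y, a, x, z)
print(f"treatment effect: {beta[0]:.2f}, effect of x: {beta[1]:.2f}")
```

Including `x` in both stages mirrors the requirement that every exogenous predictor of the outcome also appears in the stage I equation.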
The Stata procedure xtivreg2 with estimation option “fd” (for first differences) may be used to fit the longitudinal models described above. Example code is provided in the Appendix.
6 Results
6.1 Strengthening IV in cross-sectional model
The potential for longitudinal data to enhance IV estimation is first demonstrated by fitting the cross-sectional model in (1), then first-differencing to account for time-invariant confounders, and finally augmenting the IVs with a _{ i(t−2)}. The substantial difference between the OLS and 2SLS estimates of β _{1} under (1) can be attributed to extensive unmeasured confounding (Table 1). Although the effect of a _{ i(t−2)} is reduced by first-differencing, the IV assumptions are more believable as time-invariant unmeasured variables are blocked. Despite only being identified off intra-individual variation, the doubling of the F_{StageI} statistic reveals that use of a _{ i(t−2)} as an IV substantially improves identification of the effect of a _{ it } on y _{ it }.
6.2 Dynamic model
We consider the four models given by y _{ i(t−1)} (included, excluded) and a _{ i(t−1)} (included, excluded). In 2SLS analyses, two scenarios are considered when a _{ i(t−1)} is included in the model (endogenous, exogenous) and two when it is excluded (IV, not an IV). Throughout the longitudinal analyses a _{ i(t−2)} is embedded in z _{ it }. Unless otherwise stated, results pertain to the case when y _{ i(t−1)} is excluded from the analysis.
Table 2 Longitudinal models with different roles of a _{ i(t−1)}: no treatment modification. Columns marked (excl.) and (incl.) give results with y _{ i(t−1)} excluded from and included in the model, respectively.
Status of a _{ i(t−1)} | Term | Estimate (excl.) | t-stat (excl.) | P-value (excl.) | F _{StageI} (excl.) | Estimate (incl.) | t-stat (incl.) | P-value (incl.) | F _{StageI} (incl.) |
---|---|---|---|---|---|---|---|---|---|
Ordinary least squares | | | | | | | | | |
Exogenous | a _{ it } | 0.625 | 37.7 | 0.000 | | 0.622 | 40.2 | 0.000 | |
 | a _{ i(t−1)} | 0.107 | 7.57 | 0.000 | | 0.288 | 20.7 | 0.000 | |
Exclude | a _{ it } | 0.613 | 44.1 | 0.000 | | 0.603 | 44.0 | 0.000 | |
IV regression (two-stage least squares) | | | | | | | | | |
Endogenous | a _{ it } | −0.686 | −3.42 | 0.001 | 6.04 | −0.997 | −4.49 | 0.000 | 3.91 |
 | a _{ i(t−1)} | 0.374 | 5.53 | 0.000 | | 0.601 | 7.58 | 0.000 | |
Exogenous | a _{ it } | 0.355 | 6.51 | 0.000 | 54.5 | 0.218 | 4.20 | 0.000 | 54.5 |
 | a _{ i(t−1)} | 0.027 | 1.34 | 0.179 | | 0.169 | 8.67 | 0.000 | |
Instrument | a _{ it } | 0.297 | 8.91 | 0.000 | 142.7 | −0.134 | −3.98 | 0.000 | 136.6 |
Exclude | a _{ it } | 0.133 | 1.28 | 0.199 | 15.5 | 0.403 | 4.21 | 0.000 | 14.0 |
Results under IV estimation are well identified when a _{ i(t−1)} is used in some form to predict a _{ it } in the stage I equation (F_{StageI} in excess of 50 as an exogenous predictor and in excess of 100 as an IV), moderately well identified if a _{ i(t−1)} is excluded altogether (F_{StageI} around 15), and poorly identified if a _{ i(t−1)} is endogenous. The level of identification is minimally affected by conditioning on y _{ i(t−1)}. The lack of identifiability in the endogenous case is compounded by high collinearity between a _{ it } and a _{ i(t−1)}, which even in the absence of unmeasured confounders makes it difficult to extract the independent effect of each and often inflates the magnitudes of the estimated coefficients and gives them opposite signs (as for the offsets analysis).
Because the inclusion of y _{ i(t−1)} as a predictor impacts the results in different ways, the three “identified” cases are discussed each in turn. When a _{ i(t−1)} is an exogenous covariate the coefficient of a _{ it } is significant and positive (estimate 0.355, P < 0.001) while the coefficient of a _{ i(t−1)} is not significantly different from 0. The inclusion of y _{ i(t−1)} led to an increase in the effect of a _{ i(t−1)} at the expense of the effect of a _{ it }. Although the estimate of β _{2} (the effect of a _{ it }) is bigger than β _{3} (the effect of a _{ i(t−1)}), the latter has a higher t-statistic due to the fact that it is not instrumented.
When a _{ i(t−1)} is an IV there is only a minor change from the exogenous case, a consequence of the estimated β_{3} being close to 0 when a _{ i(t−1)} is a predictor. However, when y _{ i(t−1)} is included, the estimate of β_{2} is negative and significant (estimate −0.134, P < 0.001). This is the only well-identified longitudinal specification under which atypicals appear to lower the cost of mental health care. However, one reason to doubt analyses with a _{ i(t−1)} as an IV is that \(\tilde{a}_{i(t-1)}=a_{i(t-1)}-a_{i(t-2)}\) and \(\tilde{\epsilon}_{it}=\epsilon_{it}-\epsilon_{i(t-1)}\) seem likely to be correlated as endogenous treatment assignment implies a _{ i(t−1)} and \(\epsilon_{i(t-1)}\) are correlated.
If a _{ i(t−1)} is excluded altogether then β_{2} is estimated to be 0.133 (not significant) when y _{ i(t−1)} is excluded and 0.403 (P < 0.001) when y _{ i(t−1)} is included. Thus, the impact of y _{ i(t−1)} is opposite that when a _{ i(t−1)} is used as an IV. Unfortunately, it is not possible to test empirically whether conditioning on y _{ i(t−1)} is more problematic than not conditioning on y _{ i(t−1)}. However, conditioning generally introduces less bias than not conditioning (Greenland 2003), suggesting that the results under the exogenous specification might be the more trustworthy. Because the estimates of both β_{2} and β_{3} are positive and significant under the exogenous specification, the offsets hypothesis appears to not hold.
6.3 Modified-treatment model
Table 3 Longitudinal models with different roles of a _{ i(t−1)}: treatment modification. Columns marked (excl.) and (incl.) give results with y _{ i(t−1)} excluded from and included in the model, respectively.
Status of a _{ i(t−1)} | Term | Estimate (excl.) | t-stat (excl.) | P-value (excl.) | F _{StageI} (excl.) | Estimate (incl.) | t-stat (incl.) | P-value (incl.) | F _{StageI} (incl.) |
---|---|---|---|---|---|---|---|---|---|
Ordinary least squares | | | | | | | | | |
Exogenous | a _{ it } | 0.635 | 34.0 | 0.000 | | 0.624 | 36.7 | 0.000 | |
 | a _{ i(t−1)} | −0.030 | −1.04 | 0.299 | | −0.007 | −0.25 | 0.800 | |
 | a _{ it } a _{ i(t−1)} | 0.126 | 5.09 | 0.000 | | 0.292 | 12.9 | 0.000 | |
Exclude | a _{ it } | 0.608 | 43.7 | 0.000 | | 0.593 | 43.2 | 0.000 | |
 | a _{ it } a _{ i(t−1)} | 0.100 | 8.75 | 0.000 | | 0.181 | 15.0 | 0.000 | |
Two-stage least squares | | | | | | | | | |
Endogenous | a _{ it } | −0.472 | −2.74 | 0.006 | 2.22 | −0.675 | −3.76 | 0.000 | 2.21 |
 | a _{ i(t−1)} | −0.398 | −1.82 | 0.069 | | −0.499 | −2.29 | 0.022 | |
 | a _{ it } a _{ i(t−1)} | 1.106 | 3.16 | 0.002 | | 1.55 | 4.41 | 0.000 | |
Exogenous | a _{ it } | 0.273 | 4.09 | 0.000 | 7.17 | 0.133 | 2.07 | 0.038 | 7.16 |
 | a _{ i(t−1)} | 0.863 | 3.64 | 0.000 | | 0.930 | 4.10 | 0.000 | |
 | a _{ it } a _{ i(t−1)} | −0.476 | −3.26 | 0.001 | | −0.370 | −2.66 | 0.008 | |
Instrument | a _{ it } | 0.430 | 9.51 | 0.000 | 48.1 | 0.256 | 5.94 | 0.000 | 48.2 |
 | a _{ it } a _{ i(t−1)} | 0.095 | 3.06 | 0.002 | | 0.331 | 11.0 | 0.000 | |
Exclude | a _{ it } | 0.431 | 9.53 | 0.000 | 49.4 | 0.259 | 6.01 | 0.000 | 49.4 |
 | a _{ it } a _{ i(t−1)} | 0.091 | 2.89 | 0.004 | | 0.323 | 10.6 | 0.000 | |
The results under OLS and 2SLS are largely invariant to y _{ i(t−1)}. One explanation that might also account for the sensitivity of the results under the dynamic model to the status of y _{ i(t−1)} is that y _{ i(t−1)} functions like a surrogate for a _{ it } a _{ i(t−1)}. Thus, if a _{ it } a _{ i(t−1)} is excluded from the model its effect in large part transmits through y _{ i(t−1)}. If a _{ it } a _{ i(t−1)} is included then the treatment effect heterogeneity is appropriately accounted for and y _{ i(t−1)} has less impact.
Because F _{StageI} ≤ 2.3 (≤ 7.2) when a _{ i(t−1)} is an endogenous (exogenous) predictor, implying weak identifiability, it is unwise to interpret the associated results. Attempts to strengthen identification by using a _{ i(t−2)} z _{ it } as an IV resulted in at most minor improvements (results not presented). Therefore, the key to identification of endogenous (a _{ it }, a _{ it } a _{ i(t−1)}) is the exclusion of a _{ i(t−1)} from the outcome model. In other words, the required exclusion restriction is that there is no carryover effect of atypical use for individuals who switch to a conventional [β_{3} = 0 in (3)].
If a _{ i(t−1)} is excluded from the outcome equation it makes little empirical difference whether or not it is used as an IV. The two endogenous effects are well identified (F_{StageI} nearly 50) and their estimated effects are similar. However, as for the dynamic model, inclusion of y _{ i(t−1)} led to the term involving a _{ i(t−1)} (in this case a _{ it } a _{ i(t−1)}) having a greater effect. With y _{ i(t−1)} in the model the effect of a _{ it } a _{ i(t−1)} is 50% greater than that of a _{ it }; absent y _{ i(t−1)} the effect is one-quarter the size.
Because the estimated effects under 2SLS are significant and positive under the four well-identified scenarios, the evidence against the offsets hypothesis is again substantial. However, we cannot conclusively discern whether a _{ i(t−1)} operates as a lagged effect or exclusively as a modifying effect distinguishing new and continuing atypical users.
7 Discussion
In testing the offsets hypothesis we found that lagged treatment, a _{ i(t−1)}, has a profound impact on the results of the IV analyses. Furthermore, the estimated coefficients were sensitive to the role of the lagged outcome, y _{ i(t−1)}.
In both the dynamic- and modified-treatment models, endogeneity of a _{ i(t−1)} proved fatal for identification. In the dynamic treatment model (no modification by lagged treatment), the key to identifiability was inclusion of a _{ i(t−1)} in the treatment selection equation for a _{ it }. In the modified-treatment model the key was exclusion of a _{ i(t−1)} from the outcome model. In both cases, a _{ i(t−1)} did not need to be used as an IV in order to obtain statistically significant results.
If y _{ i(t−1)} was excluded then the effect of a _{ it } tended to dominate that of any other treatment variable (a _{ i(t−1)} in the dynamic model and a _{ it } a _{ i(t−1)} in the modified-treatment model) whereas if y _{ i(t−1)} was included lagged treatment had substantially more influence. In all such models the estimated treatment effects were positive. The discrepancy of these results with the cross-sectional analysis may be due to the weakness of the IVs cross-sectionally, violations of the IV assumptions in the longitudinal models, model misspecification, or combinations of these.
The only specification that supported the offsets hypothesis was the dynamic-treatment model when a _{ i(t−1)} was an IV and y _{ i(t−1)} a predictor. In this model, conditioning on y _{ i(t−1)} appears justified since if a _{ i(t−1)} has an effect on y _{ i(t−1)} which in turn has an effect on y _{ it }, conditioning on y _{ i(t−1)} is necessary for a _{ i(t−1)} to be a valid IV (Fig. 2). Furthermore, it is possible that the inclusion of any term involving a _{ i(t−1)} in the outcome equation leads to spurious effects. Therefore, it is plausible that the lone specification that obtained a negative estimate is the only valid specification! However, while use of a _{ i(t−1)} as an additional IV is enticing, its validity relies on an exclusion restriction that is difficult to satisfy, especially when first differencing is used for estimation. Therefore, the results in which a _{ i(t−1)} is not used as an IV appear more trustworthy.
An important new finding is that use of an atypical in the past year may have a carryover effect on mental health costs in the current year. Under the dynamic treatment model there was evidence that individuals who used an atypical in the prior year had greater mental health costs. The well-identified results for the modified-treatment model rely on the exclusion restriction that past treatment is irrelevant for individuals taking conventionals. Unfortunately the IVs are not powerful enough for all treatment variables to simultaneously be modeled as endogenous. Therefore, it is not possible to make a reliable comparison between the dynamic- and modified-treatment models.
While longitudinal designs have clear advantages, the consequences of different assumptions must be carefully considered. Using DAGs to depict theoretical models may generate valuable insights into the variables thought to influence or confound the effects of interest, which in turn can lead to experimental designs and identification strategies that overcome concerns about unmeasured confounders. The sensitivity of the IV results for the offsets analysis to different assumptions about lagged treatment and lagged outcomes illustrates the importance of using external information to help specify the most appropriate model. In addition to using varied specifications to evaluate the sensitivity of results to different models and IV specifications, sensitivity analyses that evaluate the robustness to violations of the IV assumptions (Small 2007) may also be helpful.
Although developed in the 1920s (Wright 1928), IVs and their estimation methods are less well known among statisticians (Dowd 2011). However, the growing importance of and interest in health policy research and the need for IVs in this field is likely to foster increased methodological work and awareness of IVs in the future. In this paper the focus was longitudinal models, inspired in part by the fact that statistical methods developed for longitudinal data have widespread applicability [e.g., generalized estimating equations (Liang and Zeger 1986; Zeger and Liang 1986)]. IV methods for time-to-event and joint longitudinal-survival models are important areas for future research.
^{1} Because Risperdal was introduced prior to 1994 its approval status is constant in the sample and so cannot be used as an IV.
^{2} Because it is caused by both A _{0} and U, Y _{0} is known as a collider. In general, conditioning on colliders is problematic (VanderWeele 2011).
^{3} In DAG terminology, U is a common cause of A _{0} and Y _{0} and therefore a confounder of the effect of A _{0} on Y _{0}.
^{4} Note that A _{0} is an IV conditional on Y _{0} and X. Therefore, the rationale for such a test is the same as that underlying the test of over-identifying restrictions (Small 2007). A significant finding would cast doubt on whether Z is a valid IV or suggest that some other assumption about the model is incorrect.
Acknowledgments
Research for the paper was supported by NIH Grant 1RC4MH092717-01. The dataset analyzed in this paper was developed in collaboration with Sharon-Lise T. Normand and Richard G. Frank on work supported by NIH Grants R01 MH061434 and R01 MH069721. The author also thanks Jaeun Choi for valuable suggestions made on an early draft of the manuscript and Felix Elwert for helpful discussions.
Open Access
This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.