Instrumental variable specifications and assumptions for longitudinal analysis of mental health cost offsets
DOI: 10.1007/s10742-012-0097-7
- Cite this article as:
- O’Malley, A.J. Health Serv Outcomes Res Method (2012) 12: 254. doi:10.1007/s10742-012-0097-7
Abstract
Instrumental variables (IVs) enable causal estimates in observational studies to be obtained in the presence of unmeasured confounders. In practice, a diverse range of models and IV specifications can be brought to bear on a problem, particularly with longitudinal data where treatment effects can be estimated for various functions of current and past treatment. However, the empirical consequences of different assumptions are seldom examined, despite the fact that IV analyses make strong assumptions that cannot be conclusively tested by the data. In this paper, we consider several longitudinal models and specifications of IVs. Methods are applied to data from a 7-year study of mental health costs of atypical and conventional antipsychotics whose purpose was to evaluate whether the newer and more expensive atypical antipsychotic medications lead to a reduction in overall mental health costs.
Keywords
Causal inference, Exclusion restriction, Fixed differences, Instrumental variable, Longitudinal, Mental health costs
1 Introduction
Estimation of causal effects in observational studies is an engrossing and controversial topic in statistics and the social sciences. Some investigators consider observational studies to lack internal validity as the absence of randomization exposes results to bias from unmeasured confounding variables. Yet observational studies are an important part of medical and health care research. They can be performed in situations where randomized trials are infeasible, they generate larger datasets, and they may involve more diverse study populations. Therefore, observational studies allow estimates of treatment effects for more nuanced subpopulations and are better equipped to account for treatment-effect heterogeneity than randomized trials.
Instrumental variables (IVs) identify randomized experiments that are naturally occurring, enabling estimation of causal effects in observational studies. Loosely speaking, an IV must predict treatment but not be directly related to the outcome or any unmeasured confounding variables (Imbens and Angrist 1994; Angrist et al. 1996). An IV extracts the variation in the supposed endogeneous predictor(s) that is orthogonal to any unmeasured confounding variables, yielding projected values from which the causal effect of the endogeneous predictor(s) on the outcome can be determined. Unlike regression and propensity score methods (Rosenbaum and Rubin 1983), IV methods accommodate unmeasured confounders. Although testing whether a variable predicts treatment is straightforward, the requirement that the same variable does not directly affect the outcome (the exclusion restriction) is the bane of all IV analyses. First, even a small direct effect on the outcome violates the exclusion restriction. Second, it is not possible to test the exclusion restriction by simply including the IV as a predictor of the outcome as its own effect is then confounded with that of any unmeasured confounders (Morgan and Winship 2007, pp 196–197). Therefore, the choice of IVs must be undertaken with great care.
Longitudinal studies generalize cross-sectional designs by accommodating repeated observations over time on the same study unit (e.g., a patient). They allow dynamic treatment effects (e.g., the effect of a change in treatment) and modifying effects (e.g., the effect of current treatment changes with past treatment) to be estimated. In addition, individual dummy variables may be used to block the effects of time-invariant confounders. An important question is whether longitudinal data enhances examination of the IV assumptions.
In this paper we discuss the use of IVs in longitudinal analyses with particular focus on lagged predictors and outcomes. Treatment is represented using contemporaneous, lagged, and modifying variables. Because lagged treatment may be assumed to be endogeneous, exogeneous, an IV, or to have no effect whatsoever and lagged outcomes may be predictors, a multitude of longitudinal models are possible.
Various model specifications are compared by evaluating the effect of atypical versus conventional antipsychotic drugs on overall mental health costs, defined as the cost of treatment and subsequent medical care in that year for Medicaid recipients. The same data was analyzed previously by fitting a cross-sectional model using ordinary least squares (OLS) and various IV methods (O’Malley et al. 2011). However, in this cross-sectional setting the IV was borderline weak. Therefore, another key question is whether availability of longitudinal data allows the IV to be strengthened.
There are several important papers on IV methods for longitudinal (Hogan and Lancaster 2004; McClellan and Newhouse 1997) and other data types involving lagged variables, including spatially lagged data (Haining 1978; Kelejian and Robinson 1993). However, while several areas of statistical methodology consider the use of lagged variables as predictors (e.g., longitudinal analysis, time series analysis), their use as IVs has been studied less extensively. An exception is the work of several econometricians on methods for analyzing panel data (Arellano and Honore 2001, chapter 53; Hsiao 2003, chapter 4).
In Sect. 2 we review past work on mental health cost offsets and introduce the data and key variables motivating this work. The implication of differing assumptions about the causal relationships involving unmeasured confounding variables is illustrated using directed acyclic graphs (DAGs) in Sect. 3. In particular, we describe situations where lagged outcomes and treatments have different roles including when they should not be adjusted for, when they should be adjusted for, and when there is ambiguity. In Sect. 4 we introduce notation, models, estimands and IV assumptions for the mental health cost offsets analysis. Section 5 describes the IV requirements for each model and the method of estimation. In Sect. 6 we compare results across the models. The paper concludes with a discussion of the main findings in Sect. 7.
2 Background
2.1 Mental health cost offsets hypothesis
Atypical antipsychotics, including clozapine, olanzapine (zyprexa), quetiapine (seroquel), and risperidone (risperdal), while considerably more expensive than the D2-antagonists, have been associated with a different (neurological versus physical) profile of side effects (O’Malley et al. 2011). It is thought that the greater tolerability of these new antipsychotics improves adherence to treatment regimens, thereby reducing relapses, resulting in declines in the use of hospital and emergency room services. This has led to the offset hypothesis that atypical antipsychotics, while more expensive, ultimately pay for themselves through reductions in other types of health spending (Lichtenberg 2001). However, the hypothesis is disputed (Rosenheck et al. 2006) and testing it is complicated by the fact that patients who receive the newer atypical drugs likely differ from those getting the older drugs on a number of systematic factors, some unobserved.
2.2 Study population and variables
The data motivating this research is from Florida’s Medicaid population over the period July 1994–June 2001. Study years run from July 1 of one year to June 30 of the next year. The analysis sample was restricted to patients continuously enrolled for 6 months or more of a given study year (26,759 individuals).
Log-annual mental health spending is the dependent variable and plurality drug type (defined as a binary variable indicating whether atypical or conventional antipsychotic drugs comprised the majority of an individual’s Medicaid claims for the year) is the key predictor or “treatment.” The assumed exogeneous predictors are male, white, black, history of substance abuse, recipient of supplemental security income (SSI), study year and area of residence. Because Miami–Key West is the most populous area, indicator variables for the ten other areas are included as predictors. Unmeasured confounders could include health status of the patient (other physical and mental health comorbidities, severity of illness), patient preferences over treatment, access to skilled physicians, and physician prescribing habits. Many of these are time-varying and therefore cannot be blocked by patient dummies.
The approval status of the atypicals introduced during our study period—zyprexa, seroquel, geodon—and their interactions with area of residence were previously used as IVs. ^{1} Clearly, whether a drug has been approved impacts the likelihood an individual receives an atypical at a given time. Because areas have different geographic, cultural, social and economic factors and physicians in them may have varying attitudes, the uptake of atypicals is likely to vary between areas. Thus, the likelihood a patient is prescribed an atypical is expected to depend on where they live (O’Malley et al. 2011). In this paper the consequence of supplementing these IVs with additional variables only available with longitudinal data will be investigated.
Table 1 Basic identification improvements over cross-sectional analysis
Model | Estimate | t-stat | P-value | F_{StageI} |
---|---|---|---|---|
Ordinary least squares | | | | |
Cross-sectional | 1.022 | 76.4 | 0.000 | |
Fixed differences | 0.613 | 44.1 | 0.000 | |
IV regression (two-stage least squares) | | | | |
Cross-sectional | −0.028 | −0.17 | 0.866 | 9.69 |
Fixed differences | −0.590 | −3.46 | 0.001 | 7.31 |
Add a_{i(t−2)} as IV | 0.133 | 1.28 | 0.199 | 15.5 |
3 Causal assumptions
Conditioning on different subsets of the history of the outcomes or the treatments has been shown to have dramatic effects on the resulting inference (Pepe and Anderson 1994; Vansteelandt 2007). Therefore, it is important to consider the implications of including or excluding each candidate predictor in the model. DAGs are useful for depicting the data generating mechanism and the causal assumptions made by various models. Let Y, A, X, U and Z be random variables denoting the outcome, treatment, exogeneous covariates, unmeasured covariates, and IVs for an individual. We use the subscript t for time and for illustration consider the case \(t \in \{0,1\}\).
3.1 Conditioning on lagged treatments and outcomes
In order to obtain a consistent estimate of the effect of A_{1} on Y_{1}, it is necessary to condition on A_{0} as it would otherwise be an unmeasured confounder. However, while conditioning on Y_{0} does not affect the identifiability of the effect of A_{1} on Y_{1}, it has implications for the effect of A_{0} on Y_{1}. If Y_{0} is not conditioned on then the direct effect of A_{0} on Y_{1} is confounded with the effect acting through Y_{0}. If Y_{0} is conditioned on then the unblocked path from A_{0} to Y_{1} through U, which arises because Y_{0} is caused by both A_{0} and U, leads to lack of identifiability (Sharkey and Elwert 2010).^{2} Specifically, one cannot distinguish the effect of A_{0} on Y_{1} from that induced through U. Therefore, whether or not Y_{0} is conditioned on, the direct effect of A_{0} on Y_{1} is not identified.
3.2 Need for IVs
Under Figure 3 the IV analysis does not need to involve Y_{0}, A_{0} or X. However, if Z also caused A_{0} it would then be necessary to condition on X and either Y_{0} or A_{0}. Because conditioning on Y_{0} and X blocks all paths from A_{0} to Y_{1}, a test of the validity of the model assumptions is to include A_{0} in the model; a statistically significant coefficient of the effect of A_{0} on Y_{1} would raise concerns about the validity of the model.^{4}
4 Notation and models for offsets analysis
4.1 Longitudinal models
If a_{it} is endogeneous then any variable that interacts with a_{it} is also endogeneous. However, while a_{it}a_{i(t−1)} inherits endogeneity from a_{it}, a_{i(t−1)} need not be endogeneous. For both (2) and (3) we evaluate the consequence of a_{i(t−1)} endogeneous (as in Fig. 4), exogeneous (as in Fig. 1), and usable as an IV (as in Fig. 2 or Fig. 3). Because adjusting for y_{i(t−1)} can be problematic (Figs. 1, 4), the estimates obtained under this model are compared to those for models that exclude y_{i(t−1)}.
Although random effect models are common in longitudinal analyses they are problematic when y_{i(t−1)} (or other lagged outcome) is a predictor as the assumption that random β_{0i} is uncorrelated with the predictors is violated (Wooldridge 2002, p. 256). This is seen from the fact that β_{0i} affects the expected value of all observations on an individual, including y_{i(t−1)}. Therefore, under a random effects specification, β_{0i} would be correlated with y_{i(t−1)}, which is a predictor of y_{it}. Thus, we avoid random effect specifications for β_{0i}. Because we don’t model the correlation structure we use robust standard errors to account for dependence within individuals (Huber 1967; White 1982).
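For orientation, the displayed equations (2) and (3) referred to in this section can be sketched from the stage II equation of Sect. 5.2, which is their first-differenced counterpart. The forms below are an inferred reconstruction, not a reproduction of the original displays; in particular, labeling the interaction coefficient β_{4} is an assumption based on the gap between β_{3} and β_{5} in the stage II equation.

$$ y_{it} = \beta_{0i} + \beta_{1} y_{i(t-1)} + \beta_{2} a_{it} + \beta_{3} a_{i(t-1)} + \boldsymbol{\beta}_{5}^{T}\mathbf{x}_{it} + \epsilon_{it} \quad \text{(dynamic model)} $$

$$ y_{it} = \beta_{0i} + \beta_{1} y_{i(t-1)} + \beta_{2} a_{it} + \beta_{3} a_{i(t-1)} + \beta_{4} a_{it}a_{i(t-1)} + \boldsymbol{\beta}_{5}^{T}\mathbf{x}_{it} + \epsilon_{it} \quad \text{(modified-treatment model)} $$

Under this reading, excluding y_{i(t−1)} corresponds to setting β_{1} = 0 and excluding a_{i(t−1)} corresponds to setting β_{3} = 0.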
5 IV requirements
The general requirements for z_{it} to be an IV for the effect of a_{it} on y_{it} are: (1) it is associated with a_{it} conditional on x_{it}, u_{it}; (2) it is not associated with u_{it} conditional on x_{it}; (3) it is not associated with y_{it} conditional on a_{it}, x_{it}, u_{it}. The more precisely z_{it} predicts a_{it} the greater the statistical power of the analysis; perfect predictions typically occur only in randomized studies with 100 % compliance with treatment assignment. Condition (2) guards against any backdoor pathways from z_{it} through u_{it} to y_{it}—sometimes referred to as the “random” requirement. Condition (3) excludes z_{it} from having a direct effect on y_{it} other than through a_{it}—the “exclusion restriction.”
A DAG-based test of z_{it} as an IV in Fig. 4 is: after removing all arcs out of a_{it} no path leads from z_{it} to y_{it} conditional on x_{it} (Brito and Pearl 2002; Joffe et al. 2008). Any unmeasured area level variables are absorbed in u_{it}. However, because such variables are time-invariant the inclusion of the area dummies in x_{it} blocks their effects.
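The role of the three IV requirements can be illustrated with a small simulation. The sketch below is not based on the offsets data: it assumes a continuous treatment, a single instrument with no covariates, and hypothetical coefficient values, and it uses the simple ratio (Wald-type) IV estimator rather than the estimator used in this paper.

```python
import random

random.seed(1)
n = 20000
true_effect = 0.5  # hypothetical causal effect of treatment on outcome


def cov(p, q):
    """Sample covariance of two equal-length lists."""
    mp, mq = sum(p) / len(p), sum(q) / len(q)
    return sum((pi - mp) * (qi - mq) for pi, qi in zip(p, q)) / (len(p) - 1)


# z satisfies the three requirements by construction: it drives a,
# is independent of u, and enters y only through a.
z = [random.gauss(0, 1) for _ in range(n)]  # instrument
u = [random.gauss(0, 1) for _ in range(n)]  # unmeasured confounder
a = [zi + ui + random.gauss(0, 1) for zi, ui in zip(z, u)]  # treatment
y = [true_effect * ai + 2 * ui + random.gauss(0, 1) for ai, ui in zip(a, u)]

ols_estimate = cov(a, y) / cov(a, a)  # biased: absorbs the effect of u
iv_estimate = cov(z, y) / cov(z, a)   # ratio IV estimator
print(round(ols_estimate, 3), round(iv_estimate, 3))
```

In this setup the OLS slope is pulled well above the true effect by the confounder, whereas the IV ratio stays close to it. Adding a direct term in z to the outcome line would violate the exclusion restriction and bias the IV estimate as well.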
5.1 Using longitudinal data to enhance IVs
In the cross-sectional analysis of the offsets data, the IVs were contemporaneous indicators of the approval status of zyprexa, seroquel and geodon and their interactions with area of residence. However, the model for the outcome is suggestive of additional IVs; {a_{i(t−k)}}_{k>1} do not appear in either (2) or (3), which is consistent with them not having a direct effect on y_{it}. Because a_{i(t−2)} is evaluated at least a year earlier than y_{it}, it is plausible that it is uncorrelated with y_{it} conditional on (y_{i(t−1)}, a_{it}, a_{i(t−1)}, x_{it}). If a_{i(t−2)} is correlated with a_{it} conditional on (a_{i(t−1)}, x_{it}, u_{it}) then a_{i(t−2)} is a valid IV. In general, if treatment influences subsequent treatment for a longer period than it influences outcomes, then the lagged treatment variables from the differential period are candidate IVs.^{5}
When β_{3} = 0 in (2), a_{i(t−1)} is a candidate IV for a_{it}. However, if a_{i(t−1)} is associated with an unmeasured confounder (e.g., as in Fig. 1 when Y_{0} is conditioned on), it violates the IV assumptions. If a_{i(t−2)} or any other variable is known to be a valid IV, the Sargan over-identifying restrictions test (ORT) may be used to evaluate whether a_{i(t−1)} is a valid IV (Sargan 1958).
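A minimal sketch of a Sargan-type statistic follows, assuming one endogenous treatment, two candidate instruments, no covariates, and homoskedastic errors. The simulated data, coefficient values, and function names are hypothetical. The statistic is n times the R² from regressing the 2SLS residuals on the instruments; under the null that all instruments are valid it is approximately chi-squared with one degree of freedom here (two instruments minus one endogenous predictor).

```python
import random


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, k), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(k):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [v - factor * p for v, p in zip(aug[row], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def predict(rows, coef):
    return [sum(c * v for c, v in zip(coef, row)) for row in rows]


def sargan_statistic(y, a, instruments):
    """n * R^2 from regressing the 2SLS residuals on the instruments."""
    n = len(y)
    z_rows = [[1.0] + list(zs) for zs in zip(*instruments)]
    a_hat = predict(z_rows, ols(z_rows, a))            # stage I
    beta = ols([[1.0, ah] for ah in a_hat], y)         # stage II
    # Residuals are formed with the observed treatment, not its fitted value.
    resid = [yi - beta[0] - beta[1] * ai for yi, ai in zip(y, a)]
    fitted = predict(z_rows, ols(z_rows, resid))
    mean_r = sum(resid) / n
    ss_total = sum((e - mean_r) ** 2 for e in resid)
    ss_fitted = sum((f - mean_r) ** 2 for f in fitted)
    return n * ss_fitted / ss_total


def simulate(direct_effect_of_z2, n=5000, seed=3):
    """z1 is always valid; z2 violates exclusion when the direct effect is nonzero."""
    rng = random.Random(seed)
    z1 = [rng.gauss(0, 1) for _ in range(n)]
    z2 = [rng.gauss(0, 1) for _ in range(n)]
    u = [rng.gauss(0, 1) for _ in range(n)]
    a = [p + q + c + rng.gauss(0, 1) for p, q, c in zip(z1, z2, u)]
    y = [0.5 * ai + direct_effect_of_z2 * q + 2 * c + rng.gauss(0, 1)
         for ai, q, c in zip(a, z2, u)]
    return y, a, [z1, z2]


sargan_valid = sargan_statistic(*simulate(0.0))
sargan_invalid = sargan_statistic(*simulate(0.5))
print(round(sargan_valid, 2), round(sargan_invalid, 2))
```

The test only detects disagreement among instruments: if every candidate IV is invalid in the same direction, the statistic can remain small, which is why at least one instrument must be known to be valid.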
5.2 Estimation: two-stage least squares (2SLS)
To avoid estimating the fixed effects {β_{0i}}_{1:N}, estimation of the longitudinal models is accomplished by regressing the individually-first differenced outcomes on the individually-first-differenced predictors (Wooldridge 2002, pp. 279–281). Because differencing accounts for all time-invariant variation, the strength of the IV is governed by the extent to which intra-individual variation in z_{it} predicts intra-individual variation in a_{it}. Conversely, the exclusion restriction is only violated by intra-individual variation directly related to y_{it}.
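The way first differencing removes time-invariant confounding can be sketched with simulated panel data. The setup is hypothetical (a continuous treatment, one predictor, invented variances) and only illustrates the differencing step, not the full 2SLS procedure.

```python
import random

random.seed(2)
N, T = 2000, 5          # individuals and time periods (hypothetical sizes)
true_effect = 0.5


def slope(x, y):
    """Simple-regression slope of y on x (with intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


levels_a, levels_y, diff_a, diff_y = [], [], [], []
for _ in range(N):
    b0 = random.gauss(0, 2)  # time-invariant individual effect (plays the role of beta_0i)
    a_i = [b0 + random.gauss(0, 1) for _ in range(T)]  # treatment correlated with b0
    y_i = [b0 + true_effect * at + random.gauss(0, 1) for at in a_i]
    levels_a += a_i
    levels_y += y_i
    # Differencing within an individual cancels b0 exactly.
    diff_a += [a_i[t] - a_i[t - 1] for t in range(1, T)]
    diff_y += [y_i[t] - y_i[t - 1] for t in range(1, T)]

pooled_estimate = slope(levels_a, levels_y)      # confounded by b0
first_diff_estimate = slope(diff_a, diff_y)      # uses only within-person changes
print(round(pooled_estimate, 3), round(first_diff_estimate, 3))
```

Each individual contributes T − 1 differenced observations, so one period per person is lost, and only changes within an individual inform the estimate.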
A virtue of first differencing over mean-centering (subtraction of the individual sample mean \(\bar{v}_{i}\) from v_{it}, \(t=1,\ldots,T\)) is that it makes a_{i(t−2)} more defensible as an IV. This is seen from the fact that under (2) and (3) the first-differenced error, \(\epsilon_{it}-\epsilon_{i(t-1)}\), is independent of a_{i(t−2)} − a_{i(t−3)}. However, if a_{it} depends on \(\epsilon_{it}\) for t = 1,…,T then \(a_{i(t-2)}-\bar{a}_{i}\) and the mean-centered error \(\epsilon_{i(t-2)}-\bar{\epsilon}_{i}\) appear likely to be correlated.
By using a_{i(t−2)} as an IV and basing estimates on first differences, only observations with non-missing (a_{it}, a_{i(t−1)}, a_{i(t−2)}, a_{i(t−3)}) are used in the analysis, leading to a substantial loss of information. Rather than require that all IVs be available for all observations, we do not use a_{i(t−2)} as an IV for observations in which it is missing [an approach proposed in Arellano and Bond (1991)]. Let r_{it} = 1 if a_{i(t−2)} is missing and r_{it} = 0 otherwise. Then set the component of z_{it} corresponding to a_{i(t−2)} equal to 0 if r_{it} = 1. Because r_{it} is not expected to contain any information about y_{it} we use it as an additional IV. If all of the IVs are valid then the treatment effect is not affected by the removal or addition of any particular IV from the analysis (Small 2007). Therefore, using r_{it} as an additional IV is only expected to affect the precision of the estimated treatment effects.
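The zero-fill construction can be sketched as follows, reading the rule as zero-filling the lagged-treatment component whenever the lag is missing (r_{it} = 1). The function name and the example history are hypothetical; `None` stands for a year in which treatment was not observed.

```python
def lagged_instrument(treatment, lag=2):
    """Return (z, r) for one individual's treatment history.

    z[t] is the treatment `lag` periods earlier, or 0.0 when that lag is unavailable.
    r[t] is 1 when the lagged value is missing and 0 otherwise.
    """
    z, r = [], []
    for t in range(len(treatment)):
        value = treatment[t - lag] if t >= lag else None
        missing = value is None
        z.append(0.0 if missing else float(value))
        r.append(1 if missing else 0)
    return z, r


# Hypothetical history: 1 = atypical, 0 = conventional, None = not observed.
history = [0, 1, None, 1, 1]
z_lag2, r_missing = lagged_instrument(history)
print(z_lag2, r_missing)
```

Carrying r alongside z is what lets the analysis distinguish a zero-filled missing lag from an observed lag of 0 (conventional use), so observations without a usable a_{i(t−2)} need not be dropped.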
- 1. Use OLS to fit the “stage I” regression equation
$$ \tilde{a}_{it} = \theta_{1}\tilde{y}_{i(t-1)} + \theta_{2}\tilde{a}_{i(t-1)} + \boldsymbol{\theta}_{3}^{T}\tilde{\mathbf{x}}_{it} + \boldsymbol{\theta}_{4}^{T}\tilde{\mathbf{z}}_{it} + \tilde{\delta}_{it} $$
  to obtain fitted values \(\hat{a}_{it}\).
- 2. Use OLS to fit the outcome or “stage II” regression equation
$$ \tilde{y}_{it} = \beta_{1} \tilde{y}_{i(t-1)} + \beta_{2}\hat{a}_{it} + \beta_{3}\tilde{a}_{i(t-1)} + \boldsymbol{\beta}_{5}^{T}\tilde{\mathbf{x}}_{it} + \tilde{\epsilon}_{it}, $$
  yielding estimates of β_{2} and the other model parameters.
As depicted above, all exogeneous predictors in the outcome (stage II) equation are included in the stage I equation (Angrist and Pischke 2009, p. 189).
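The two stages can be sketched in a few lines. This is an illustration under simplifying assumptions, not the estimation code used for the analysis: it uses one exogeneous covariate and one instrument, simulated data with hypothetical coefficients, and variables that are taken to be already first-differenced. The helper names are invented for the example.

```python
import random


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, k), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(k):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [v - factor * p for v, p in zip(aug[row], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def two_stage_least_squares(y, a, x, z):
    """Return [intercept, coefficient on x, coefficient on a] from manual 2SLS."""
    # Stage I: regress treatment on the exogeneous covariate and the instrument.
    stage1_rows = [[1.0, xi, zi] for xi, zi in zip(x, z)]
    theta = ols(stage1_rows, a)
    a_hat = [sum(t * v for t, v in zip(theta, row)) for row in stage1_rows]
    # Stage II: regress outcome on the covariate and the fitted treatment.
    stage2_rows = [[1.0, xi, ah] for xi, ah in zip(x, a_hat)]
    return ols(stage2_rows, y)


random.seed(0)
n = 20000
x = [random.gauss(0, 1) for _ in range(n)]  # exogeneous covariate
z = [random.gauss(0, 1) for _ in range(n)]  # instrument
u = [random.gauss(0, 1) for _ in range(n)]  # unmeasured confounder
a = [0.5 * xi + zi + ui + random.gauss(0, 1) for xi, zi, ui in zip(x, z, u)]
y = [0.3 * xi + 0.5 * ai + 2 * ui + random.gauss(0, 1)
     for xi, ai, ui in zip(x, a, u)]

coef_2sls = two_stage_least_squares(y, a, x, z)
coef_ols = ols([[1.0, xi, ai] for xi, ai in zip(x, a)], y)
print(round(coef_ols[2], 3), round(coef_2sls[2], 3))
```

Note that the covariate x appears in both stages, as required. Running the second stage as a plain OLS fit gives the correct point estimates but not the correct standard errors, because the stage II residuals are computed from the fitted rather than the observed treatment; dedicated IV routines adjust for this.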
A curious feature of (4) is that \(\tilde{y}_{i(t-1)}\), \(\tilde{\mathbf{x}}_{it}\), and \(\tilde{\mathbf{z}}_{it}\) are predictors of \(\tilde{a}_{i(t-1)}\) (second equation). The anomaly that \(\tilde{y}_{i(t-1)}\) is a predictor of \(\tilde{a}_{i(t-1)}\) in (4) emphasizes that the stage I equations do not depict models that we believe in but are artifacts of the estimation procedure. The stage I equations are determined solely by the outcome equation and the designated instruments. In contrast, under a parametric structural equation model such as the “Heckit model” (Arendt and Holm 2008), a bivariate model is assumed in which the predictors in the treatment selection equations (for a_{it}, a_{i(t−1)}) need not include the same exogeneous predictors as the outcome equation for y_{it}.
The Stata procedure xtivreg2 with estimation option “fd” (for first differences) may be used to fit the longitudinal models described above. Example code is provided in the Appendix.
6 Results
6.1 Strengthening IV in cross-sectional model
The potential for longitudinal data to enhance IV estimation is first demonstrated by fitting the cross-sectional model in (1), then first-differencing to account for time-invariant confounders, and finally augmenting the IVs with a_{i(t−2)}. The substantial difference between the OLS and 2SLS estimates of β_{1} under (1) can be attributed to extensive unmeasured confounding (Table 1). Although the effect of a_{i(t−2)} is reduced by first-differencing, the IV assumptions are more believable as time-invariant unmeasured variables are blocked. Despite only being identified off intra-individual variation, the doubling of the F_{StageI} statistic reveals that use of a_{i(t−2)} as an IV substantially improves identification of the effect of a_{it} on y_{it}.
6.2 Dynamic model
We consider the four models given by y_{i(t−1)} (included, excluded) and a_{i(t−1)} (included, excluded). In 2SLS analyses, two scenarios are considered when a_{i(t−1)} is included (endogeneous, exogeneous) and excluded (IV, not an IV) from the model. Throughout the longitudinal analyses a_{i(t−2)} is embedded in z_{it}. Unless otherwise stated, results pertain to the case when y_{i(t−1)} is excluded from the analysis.
Table 2 Longitudinal models with different roles of a_{i(t−1)}: no treatment modification
Status of a_{i(t−1)} | Term | y_{i(t−1)} excluded: Estimate | t-stat | P-value | F_{StageI} | y_{i(t−1)} included: Estimate | t-stat | P-value | F_{StageI} |
---|---|---|---|---|---|---|---|---|---|
Ordinary least squares | | | | | | | | | |
Exogeneous | a_{it} | 0.625 | 37.7 | 0.000 | | 0.622 | 40.2 | 0.000 | |
Exogeneous | a_{i(t−1)} | 0.107 | 7.57 | 0.000 | | 0.288 | 20.7 | 0.000 | |
Exclude | a_{it} | 0.613 | 44.1 | 0.000 | | 0.603 | 44.0 | 0.000 | |
IV regression (two-stage least squares) | | | | | | | | | |
Endogeneous | a_{it} | −0.686 | −3.42 | 0.001 | 6.04 | −0.997 | −4.49 | 0.000 | 3.91 |
Endogeneous | a_{i(t−1)} | 0.374 | 5.53 | 0.000 | | 0.601 | 7.58 | 0.000 | |
Exogeneous | a_{it} | 0.355 | 6.51 | 0.000 | 54.5 | 0.218 | 4.20 | 0.000 | 54.5 |
Exogeneous | a_{i(t−1)} | 0.027 | 1.34 | 0.179 | | 0.169 | 8.67 | 0.000 | |
Instrument | a_{it} | 0.297 | 8.91 | 0.000 | 142.7 | −0.134 | −3.98 | 0.000 | 136.6 |
Exclude | a_{it} | 0.133 | 1.28 | 0.199 | 15.5 | 0.403 | 4.21 | 0.000 | 14.0 |
Results under IV estimation are well identified when a_{i(t−1)} is used in some form to predict a_{it} in the stage I equation (F_{StageI} in excess of 50 as an exogeneous predictor and in excess of 100 as an IV), moderately well identified if a_{i(t−1)} is excluded altogether (F_{StageI} around 15), and poorly identified if a_{i(t−1)} is endogeneous. The level of identification is minimally affected by conditioning on y_{i(t−1)}. The lack of identifiability in the endogeneous case is compounded by high collinearity between a_{it} and a_{i(t−1)}, which even in the absence of unmeasured confounders makes it difficult to extract the independent effect of each and often inflates the magnitudes of the estimated coefficients and gives them opposite signs (as for the offsets analysis).
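For readers unfamiliar with the stage I F statistic, the sketch below computes it in the simplest case of one instrument and no covariates, where it compares the stage I regression with and without the instrument. The data and coefficient values are hypothetical, and the F_{StageI} values reported in the tables come from richer models that this sketch does not reproduce.

```python
import random


def first_stage_f(a, z):
    """F statistic for adding z to an intercept-only regression of a."""
    n = len(a)
    mean_a, mean_z = sum(a) / n, sum(z) / n
    szz = sum((zi - mean_z) ** 2 for zi in z)
    sza = sum((zi - mean_z) * (ai - mean_a) for zi, ai in zip(z, a))
    rss_restricted = sum((ai - mean_a) ** 2 for ai in a)  # intercept only
    rss_full = rss_restricted - sza ** 2 / szz            # intercept and z
    # One restriction tested; n - 2 residual degrees of freedom.
    return (rss_restricted - rss_full) / (rss_full / (n - 2))


rng = random.Random(4)
n = 2000
z = [rng.gauss(0, 1) for _ in range(n)]
noise = [rng.gauss(0, 1.4) for _ in range(n)]
a_strong = [zi + ei for zi, ei in zip(z, noise)]       # z strongly predicts a
a_weak = [0.02 * zi + ei for zi, ei in zip(z, noise)]  # z barely predicts a

f_strong = first_stage_f(a_strong, z)
f_weak = first_stage_f(a_weak, z)
print(round(f_strong, 1), round(f_weak, 1))
```

A small F indicates that the instruments explain little of the variation in treatment, which is the situation described above for the endogeneous specification.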
Because the inclusion of y_{i(t−1)} as a predictor impacts the results in different ways, the three “identified” cases are discussed each in turn. When a_{i(t−1)} is an exogeneous covariate the coefficient of a_{it} is significant and positive (estimate 0.355, P < 0.001) while the coefficient of a_{i(t−1)} is not significantly different from 0. The inclusion of y_{i(t−1)} led to an increase in the effect of a_{i(t−1)} at the expense of the effect of a_{it}. Although the estimate of β_{2} (the effect of a_{it}) is bigger than β_{3} (the effect of a_{i(t−1)}), the latter has a higher t-statistic due to the fact that it is not instrumented.
When a_{i(t−1)} is an IV the results change only slightly from the exogeneous case, a consequence of the estimated β_{3} being close to 0 when a_{i(t−1)} is a predictor. However, when y_{i(t−1)} is included, the estimate of β_{2} is negative and significant (estimate −0.134, P < 0.001). This is the only well-identified longitudinal specification under which atypicals appear to lower the cost of mental health care. However, one reason to doubt analyses with a_{i(t−1)} as an IV is that \(\tilde{a}_{i(t-1)}=a_{i(t-1)}-a_{i(t-2)}\) and \(\tilde{\epsilon}_{it}=\epsilon_{it}-\epsilon_{i(t-1)}\) seem likely to be correlated as endogeneous treatment assignment implies a_{i(t−1)} and \(\epsilon_{i(t-1)}\) are correlated.
If a_{i(t−1)} is excluded altogether then β_{2} is estimated to be 0.133 (not significant) when y_{i(t−1)} is excluded and 0.403 (P < 0.001) when y_{i(t−1)} is included. Thus, the impact of y_{i(t−1)} is opposite that when a_{i(t−1)} is used as an IV. Unfortunately, it is not possible to test empirically whether conditioning on y_{i(t−1)} is more problematic than not conditioning on y_{i(t−1)}. However, conditioning generally introduces less bias than not conditioning (Greenland 2003), suggesting that the results under the exogeneous specification might be the more trustworthy. Because the estimates of both β_{2} and β_{3} are positive and significant under the exogeneous specification, the offsets hypothesis appears to not hold.
6.3 Modified-treatment model
Table 3 Longitudinal models with different roles of a_{i(t−1)}: treatment modification
Status of a_{i(t−1)} | Term | y_{i(t−1)} excluded: Estimate | t-stat | P-value | F_{StageI} | y_{i(t−1)} included: Estimate | t-stat | P-value | F_{StageI} |
---|---|---|---|---|---|---|---|---|---|
Ordinary least squares | | | | | | | | | |
Exogeneous | a_{it} | 0.635 | 34.0 | 0.000 | | 0.624 | 36.7 | 0.000 | |
Exogeneous | a_{i(t−1)} | −0.030 | −1.04 | 0.299 | | −0.007 | −0.25 | 0.800 | |
Exogeneous | a_{it}a_{i(t−1)} | 0.126 | 5.09 | 0.000 | | 0.292 | 12.9 | 0.000 | |
Exclude | a_{it} | 0.608 | 43.7 | 0.000 | | 0.593 | 43.2 | 0.000 | |
Exclude | a_{it}a_{i(t−1)} | 0.100 | 8.75 | 0.000 | | 0.181 | 15.0 | 0.000 | |
Two-stage least squares | | | | | | | | | |
Endogeneous | a_{it} | −0.472 | −2.74 | 0.006 | 2.22 | −0.675 | −3.76 | 0.000 | 2.21 |
Endogeneous | a_{i(t−1)} | −0.398 | −1.82 | 0.069 | | −0.499 | −2.29 | 0.022 | |
Endogeneous | a_{it}a_{i(t−1)} | 1.106 | 3.16 | 0.002 | | 1.55 | 4.41 | 0.000 | |
Exogeneous | a_{it} | 0.273 | 4.09 | 0.000 | 7.17 | 0.133 | 2.07 | 0.038 | 7.16 |
Exogeneous | a_{i(t−1)} | 0.863 | 3.64 | 0.000 | | 0.930 | 4.10 | 0.000 | |
Exogeneous | a_{it}a_{i(t−1)} | −0.476 | −3.26 | 0.001 | | −0.370 | −2.66 | 0.008 | |
Instrument | a_{it} | 0.430 | 9.51 | 0.000 | 48.1 | 0.256 | 5.94 | 0.000 | 48.2 |
Instrument | a_{it}a_{i(t−1)} | 0.095 | 3.06 | 0.002 | | 0.331 | 11.0 | 0.000 | |
Exclude | a_{it} | 0.431 | 9.53 | 0.000 | 49.4 | 0.259 | 6.01 | 0.000 | 49.4 |
Exclude | a_{it}a_{i(t−1)} | 0.091 | 2.89 | 0.004 | | 0.323 | 10.6 | 0.000 | |
The results under OLS and 2SLS are largely invariant to y_{i(t−1)}. One explanation that might also account for the sensitivity of the results under the dynamic model to the status of y_{i(t−1)} is that y_{i(t−1)} functions like a surrogate for a_{it}a_{i(t−1)}. Thus, if a_{it}a_{i(t−1)} is excluded from the model its effect in large part transmits through y_{i(t−1)}. If a_{it}a_{i(t−1)} is included then the treatment effect heterogeneity is appropriately accounted for and y_{i(t−1)} has less impact.
Because F_{StageI} ≤2.3 (7.2) when a_{i(t−1)} is an endogeneous (exogeneous) predictor, implying weak identifiability, it is unwise to interpret the associated results. Attempts to strengthen identification by using a_{i(t−2)}z_{it} as an IV resulted in at most minor improvements (results not presented). Therefore, the key to identification of endogeneous (a_{it}, a_{it}a_{i(t−1)}) is the exclusion of a_{i(t−1)} from the outcome model. In other words, the required exclusion restriction is that there is no carryover effect of atypical use for individuals who switch to a conventional [β_{3} = 0 in (3)].
If a_{i(t−1)} is excluded from the outcome equation it makes little empirical difference whether or not it is used as an IV. The two endogeneous effects are well identified (F_{StageI} nearly 50) and their estimated effects are similar. However, as for the dynamic model, inclusion of y_{i(t−1)} led to the term involving a_{i(t−1)} (in this case a_{it}a_{i(t−1)}) having a greater effect. With y_{i(t−1)} in the model the estimated effect of a_{it}a_{i(t−1)} is roughly 25–30 % greater than that of a_{it} (0.331 versus 0.256 when a_{i(t−1)} is an IV; 0.323 versus 0.259 when it is not); absent y_{i(t−1)} it is between one-fifth and one-quarter the size (0.095 versus 0.430; 0.091 versus 0.431).
Because the estimated effects under 2SLS are significant and positive under the four well-identified scenarios, the evidence against the offsets hypothesis is again substantial. However, we cannot conclusively discern whether a_{i(t−1)} operates as a lagged effect or exclusively as a modifying effect distinguishing new and continuing atypical users.
7 Discussion
In testing the offsets hypothesis we found that lagged treatment, a_{i(t−1)}, has a profound impact on the results of the IV analyses. Furthermore, the estimated coefficients were sensitive to the role of the lagged outcome, y_{i(t−1)}.
In both the dynamic- and modified-treatment models, endogeneity of a_{i(t−1)} proved fatal for identification. In the dynamic treatment model (no modification by lagged treatment), the key to identifiability was inclusion of a_{i(t−1)} in the treatment selection equation for a_{it}. In the modified-treatment model the key was exclusion of a_{i(t−1)} from the outcome model. In both cases, a_{i(t−1)} did not need to be used as an IV in order to obtain statistically significant results.
If y_{i(t−1)} was excluded then the effect of a_{it} tended to dominate that of any other treatment variable (a_{i(t−1)} in the dynamic model and a_{it}a_{i(t−1)} in the modified-treatment model) whereas if y_{i(t−1)} was included lagged treatment had substantially more influence. In all such models the estimated treatment effects were positive. The discrepancy of these results with the cross-sectional analysis may be due to the weakness of the IVs cross-sectionally, violations of the IV assumptions in the longitudinal models, model misspecification, or combinations of these.
The only specification that supported the offsets hypothesis was the dynamic-treatment model when a_{i(t−1)} was an IV and y_{i(t−1)} a predictor. In this model, conditioning on y_{i(t−1)} appears justified since if a_{i(t−1)} has an effect on y_{i(t−1)} which in turn has an effect on y_{it}, conditioning on y_{i(t−1)} is necessary for a_{i(t−1)} to be a valid IV (Fig. 2). Furthermore, it is possible that the inclusion of any term involving a_{i(t−1)} in the outcome equation leads to spurious effects. Therefore, it is plausible that the lone specification that obtained a negative estimate is the only valid specification! However, while use of a_{i(t−1)} as an additional IV is enticing, its validity relies on an exclusion restriction that is difficult to satisfy, especially when first differencing is used for estimation. Therefore, the results in which a_{i(t−1)} is not used as an IV appear more trustworthy.
An important new finding is that use of an atypical in the past year may have a carryover effect on mental health costs in the current year. Under the dynamic treatment model there was evidence that individuals who used an atypical in the prior year had greater mental health costs. The well-identified results for the modified-treatment model rely on the exclusion restriction that past treatment is irrelevant for individuals taking conventionals. Unfortunately the IVs are not powerful enough for all treatment variables to simultaneously be modeled as endogeneous. Therefore, it is not possible to make a reliable comparison between the dynamic- and modified-treatment models.
While longitudinal designs have clear advantages, the consequences of different assumptions must be carefully considered. Using DAGs to depict theoretical models may generate valuable insights into the variables thought to influence or confound the effects of interest, which in turn can lead to experimental designs and identification strategies that overcome concerns about unmeasured confounders. The sensitivity of the IV results for the offsets analysis to different assumptions about lagged treatment and lagged outcomes illustrates the importance of using external information to help specify the most appropriate model. In addition to using varied specifications to evaluate the sensitivity of results to different models and IV specifications, sensitivity analyses that evaluate the robustness to violations of the IV assumptions (Small 2007) may also be helpful.
Developed in the 1920s (Wright 1928), IVs and their estimation methods are less well known among statisticians (Dowd 2011). However, the growing importance of and interest in health policy research and the need for IVs in this field is likely to foster increased methodological work and awareness of IVs in the future. In this paper the focus was longitudinal models, inspired in part by the fact that statistical methods developed for longitudinal data have widespread applicability [e.g., generalized estimating equations (Liang and Zeger 1986; Zeger and Liang 1986)]. IV methods for time-to-event and joint longitudinal-survival models are important areas for future research.
^{1} Because risperdal was introduced prior to 1994 its approval status is constant in the sample and so cannot be used as an IV.
^{2} Because it is caused by both A_{0} and U, Y_{0} is known as a collider. In general, conditioning on colliders is problematic (VanderWeele 2011).
^{3} In DAG terminology, U is a common cause of A_{0} and Y_{0} and therefore a confounder of the effect of A_{0} on Y_{0}.
^{4} Note that A_{0} is an IV conditional on Y_{0} and X. Therefore, the rationale for such a test is the same as that underlying the test of over-identifying restrictions (Small 2007). A significant finding would cast doubt on whether Z is a valid IV or suggest that some other assumption about the model is incorrect.
Acknowledgments
Research for the paper was supported by NIH Grant 1RC4MH092717-01. The dataset analyzed in this paper was developed in collaboration with Sharon-Lise T. Normand and Richard G. Frank on work supported by NIH Grants R01 MH061434 and R01 MH069721. The author also thanks Jaeun Choi for valuable suggestions made on an early draft of the manuscript and Felix Elwert for helpful discussions.
Open Access
This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.