1 Introduction

Opioid-induced deaths (hereafter, briefly opioid mortality)Footnote 1 increased sharply in the United States (US) during the past three decades, and opioid overdose mortality has become an important, yet inadequately understood public health problem (Kopel 2019). Although opioid mortality is a spatial characteristic of a population in a specific area (Cossman et al. 2007), only few studies have used spatial econometric methods to control for spatial dependenceFootnote 2 in regression models that aim to identify the impact of determinants on the outcome of the opioid mortality process (see Cushing et al. (2020)). Sparks and Sparks (2010) apply spatial autoregressive models to the study of US county mortality rates, and Yang et al. (2015) explore geographic variation in US mortality rates using a conventional spatial Durbin model. These studies, however, are cross-sectional in nature on the one side and do not contribute to the opioid mortality literature on the other.

To overcome this shortcoming in mortality research in general and opioid mortality research in particular, we use panel rather than cross-sectional data to study the relationship between opioid mortality rates and explanatory variables from a longitudinal perspective that accounts for temporal and spatial dependence simultaneously. Omitting the spatial dependence issue would increase the risk of obtaining biased estimation results, while the use of panel data results in a more accurate inference of model parameters (Baltagi et al. 2007). Panel data is generally more informative, contains more degrees of freedom and less multicollinearity between variables than cross-sectional data, and improves the efficiency of econometric estimates (see Elhorst (2021)). It is more informative since providing information on both the intertemporal dynamics and the individuality of the observational units (regions) and allowing one to control the effects of missing and unobserved variables (for more details, see Hsiao (2007)).

The primary objective of this study is to employ a spatial Durbin panel data model,Footnote 3 an extension of the cross-sectional model to a panel data framework, for estimating direct and indirect (cross-state spillover) effects of a set of factors that may influence opioid mortality rates across space and over time. In this model specification, the characteristics of a specific observation unit (in our case: state) and its neighbors are simultaneously considered. To the best of our knowledge, no opioid mortality research has considered spatial spillovers among observational units. Specifically, we consider five variables to capture the racial/ethnic structure of a state, five variables to cover the state’s age structure, two variables to represent the educational attainment level of its population, and three covariates that address economic distress. In addition to the sociodemographic and socioeconomic variables, we include the opioid prescription rate to account for one of the opioid supply factors, and health insurance coverage that may have the potential to increase the demand for drugs.

Our choice of using states rather than counties as observation units in this study was forced by data availability.Footnote 4 The Annual American Community Survey (ACS) census data is available starting 2014 for the demographic variables included in the study. Hence our study used the period 2014–2019. The model is applied to 49 states over six years, from 2014 to 2019. A k-nearest-neighbor matrix is chosen to represent the topological structure between the states.Footnote 5 The calculation of effects estimates is based on Bayesian estimation and inference that reflects a proper interpretation of the marginal effects for our nonlinear model that involves lags of the dependent variable vector. The study provides a rich picture of how sociodemographic and socioeconomic determinants affect opioid mortality. The findings highlight the critical role indirect (spillover) effects play. A neglection of the spillover effects would cause incorrect inferences.

The remainder of the paper is structured as follows. Section 2 presents the econometric framework, outlines the proposed model, discusses prior selection for Bayesian estimation of the model, and shows how to calculate direct and indirect effects estimates necessary for a proper interpretation of the model estimates.Footnote 6 Section 3 outlines model specification issues, provides a brief summary of the data set used, and presents then the direct and indirect effects estimates of the explanatory variables on opioid mortality rates. The last section summarizes and concludes the paper.

2 Econometric framework

2.1 The model

The econometric approach we employ in this study is a spatial econometric panel data model that is known as spatial Durbin panel data model in the spatial econometrics literature (see, e.g., LeSage (2014)) and can be written in matrix form as

$$\begin{aligned} y&= \rho W y + X \beta + W X \theta + \iota _T \otimes \mu + \nu \otimes \iota _N + \epsilon \end{aligned}$$
(1)

where y is the \(NT \times 1\) dependent variable vector of opioid mortality rates in state i (\(i=1,\ldots ,N\)) at time t (\(t=1,\ldots ,T\)), organized with t being the slow index for elements \(y_{it}\) in the vector y. X is an \(NT \times Q\) matrix of Q explanatory variables that is organized in the same way. \(\beta\) is the associated \(Q \times 1\) parameter vector.

W is an \(NT \times NT\) weight matrix representing some sort of connections among the states. The matrix has a block diagonal form: \(I_T \otimes w\), with \(\otimes\) being a Kronecker product. w is a \(N \times N\) spatial weight matrix reflecting spatial proximity of the N states that make up the panel of states over \(t=1,\ldots ,T\) time periods. The diagonal elements \(w_{ii}=0\) (\(i=1,\ldots ,N\)), and the matrix is row-normalized to have row-sums of unity. The \(NT \times 1\) matrix-vector product Wy is referred to as spatial lag and reflects a linear combination of neighboring state values for the dependent variable. The scalar parameter \(\rho\) denotes the spatial dependence parameter that indicates the strength of spatial dependence between the vector y and the vector Wy.Footnote 7 The \(NT\times Q\) matrix-matrix product WX is used to create spatial lags of the Q explanatory variables of the model, and represents a linear combination of characteristics from neighboring states, with the associated parameters \(\theta\).

\(\iota _T \otimes \mu\) represents an N-vector of state-specific fixed effects \(\mu\), with \(\otimes\) being a Kronecker product that repeats the vector \(\mu\) for each time period. \(\nu \otimes \iota _N\) reflects a Kronecker product of the T-vector of time-specific effects \(\nu\), one for each time period. The \(NT \times 1\) vector \(\epsilon\) is a stochastic disturbance term, assumed to be normally distributed with zero mean and a scalar variance, \(\sigma ^2\).

$$\begin{aligned} \epsilon&\sim \mathcal {N}(0_{NT},\sigma ^{2} I_{NT}) \end{aligned}$$
(2)

2.2 Bayesian estimation

Estimation of the spatial Durbin panel data model outlined in Eq. (1) is based on Markov Chain Monte Carlo (MCMC) estimation, with prior distributions assigned to the parameters \(\rho\), \(\delta = (\beta ,\theta )'\) and \(\sigma ^2\).Footnote 8 Parameter restrictions are imposed on the dependence parameter \(\rho\) during MCMC sampling, using methods described in LeSage (2020), and Fischer and LeSage (2020). MCMC estimation involves sequentially sampling each parameter from conditional distributions. One of the main advantages of MCMC sampling is that conditional distributions for each parameter, given values of all other model parameters, take a form that is computationally simple to sample from.

The prior setup we use is standard. We rely on a normal prior for the parameters \(\delta = (\beta ,\theta )'\), associated with the X- and WX-variables:

$$\begin{aligned} p(\delta )&\sim \mathcal {N}(\bar{\delta },\bar{\Sigma }_\delta ) \end{aligned}$$
(3)

where \(\bar{\delta }\) is a \(2Q\times 1\) vector of prior means, and \(\bar{\Sigma }_\delta\) a \((2Q)\times (2Q)\) prior variance-covariance matrix. We set the prior means to a value of 0.5 and the prior variance to 0.001.

For the dependence parameter \(\rho\) we employ a uniform prior

$$\begin{aligned} p(\rho )&\sim U(-1,1) \end{aligned}$$
(4)

because this scalar parameter is constrained to lie in the open interval \((-1,1)\). The constraint is imposed during MCMC estimation using a Metropolis-Hastings approach with rejection sampling, using methods described in Fischer and LeSage (2020), and LeSage (2020).

A non-informative prior is placed on the noise variance parameter \(\sigma ^2\), because we use an inverse gamma (\(\bar{a}, \bar{b}\)) distribution shown in Eq. (5).

$$\begin{aligned} p(\sigma ^2) = \frac{\bar{b}^{\bar{a}}}{gamma(\bar{a})}\ (\sigma ^2)^{-(\bar{a}+1)}\ exp(-\bar{b}/\sigma ^2) \end{aligned}$$
(5)

where \(\bar{a}\xrightarrow {}0\), \(\bar{b}\xrightarrow {}0\) reflect no prior information for this parameter.

As is traditional in the literature, we assume that priors for \(\rho ,\delta\) and \(\sigma ^2\) are independent. Given these priors we use the conditional distributions described in LeSage and Fischer (2020). A set of 4,000 draws were taken for determining posterior estimates of the parameters, with the first 500 omitted for burn-in of the sampler.

2.3 Direct and indirect impact estimates

LeSage and Pace (2009) point out that the estimates for the coefficients \(\beta\) and \(\theta\) are not meaningful, if one is interested in how changes in the explanatory variables impact the dependent variable outcome, which is the object of interest in this paper. In our model that includes a spatial lag of the dependent and independent variables, a change in a single explanatory variable in state i has a direct impact in i as well as an indirect (spillover) impact from other-states \(j \ne i\) to state i. We follow the suggestion of LeSage and Pace (2009) to calculate direct and indirect effects estimates that reflect a proper interpretation of the marginal effects for our nonlinear model which involves spatial lags of the dependent variable.Footnote 9

The partial derivatives used to calculate the direct and indirect effects estimates take the form of an \(NT\times NT\) matrix. Using matrix notation, the partial derivative for the qth explanatory variable \(X^q\) can be written as shown in Eq. (6).

$$\begin{aligned} \partial y/\partial X^q = (I_{NT}-\hat{\rho } W)^{-1} \ (\hat{\beta } + W \hat{\theta }) \end{aligned}$$
(6)

where \(\hat{\rho }, \hat{\beta }\) and \(\hat{\theta }\) denote parameter estimates. Diagonal elements of this matrix represent own-partial derivatives, while off-diagonal elements reflect cross-partial derivatives. Equation (6) illustrates that the partial derivatives of y with respect to the qth explanatory variable \(X^q\) has two specific properties (Elhorst 2010): First, if a particular explanatory variable \(X^q\) in a particular state i changes, not only the dependent variable in this state i will change, but also in other states \(j \ne i\). Second, direct and indirect effects are different for different states in the sample. Direct effects are different, since the own-partial derivatives of the \(NT\times NT\) matrix are different. And indirect effects are different, since the cross-partial derivatives of this matrix are different for different states.

If we have N states and Q different explanatory variables, then we get Q different \(NT\times NT\) matrices representing direct and indirect effects. To improve the surveyability of the \(Q\times NT\times NT\) effects estimates, LeSage and Pace (2009) propose using the mean of the main diagonal elements of the matrix of partial derivatives to produce a scalar summary of the direct effects. Using the mean of these NT different values generates a scalar summary that may be interpreted as representing how a change in the qth explanatory variable in the typical state impacts outcome y for this state (LeSage and Pace 2021).Footnote 10

Indirect effects represent the impact of the ith state outcomes \(y_i\) from a change in the qth explanatory variable from the jth state \(j \ne i\), and capture the off-diagonal elements of the \(NT \times NT\) matrix. Specifically, the elements in the ith row of the matrix show

$$\begin{aligned} \partial y_i/ \partial X^q_j \ \text {for} \ i\ne j; \ i=1,\ldots ,N \end{aligned}$$
(7)

reflecting how changes in each of the other states’ qth explanatory variable impact outcomes in the ith state.

Note that our model can have positive (or negative) direct effects associated with negative (or positive) indirect effects for the qth variable, so that spillover impacts might work in the opposite direction of direct impacts, arising from changes in each explanatory variable. Total effects represent the sum of direct and indirect effects. In addition to calculating scalar summary measures of the effects, there is a need to calculate an empirical distribution from which measures of dispersion for the effects can be constructed and used for inference regarding the statistical significance of the effects (LeSage and Pace 2021). This is done using a sequence of say 1,000 MCMC draws for the parameters \(\rho , \beta , \theta\) to calculate a sequence of 1,000 scalar summary measures of these effects that reflect the posterior distributions.Footnote 11

Given an estimate of the standard deviation for the scalar summary point estimates, we can test hypotheses regarding the significance of the various types of effects for each of the explanatory variables used in the model. t-statistics, t-probabilities, and lower 0.05 and upper 0.95 credible intervals can be computed using the retained MCMC draws. The mean of the draws for direct effects (for each of the Q explanatory variables), and the standard deviation of the draws is taken to construct the t-statistic, which is then used to find the associated t-probability. The lower 0.05 and upper 0.95 are based on the MCMC draws. Given a set of 1,000 MCMC draws, the lower 0.05 interval would be determined by the 50th value from the lowest set of sorted values, and the upper 0.95 by the 950th value from this sorted set (LeSage 2021).

3 Application of the model

The empirical model estimated uses a pool of the 49 contiguous US states (\(N=49\)) over six years from 2014 to 2019 (\(T=6\)). The dependent variable vector y represents the annual opioid mortality rates, and we consider a suit of Q=17 explanatory variables. The choice of the time was based on the availability of data.Footnote 12

3.1 Model specification issues

The definition of a spatial lag in spatial regression models in general, and our spatial Durbin panel data model in particular, depends on the choice of a spatial weight matrix that summarizes the topology of the data set. Clearly, a large number of weight matrices can be derived for the same spatial layout (see Fischer and Wang (2011) for a review). But k-nearest neighbor matrices that constrain the neighbor structure to the k nearest neighbors have received increasing popularity in spatial econometrics in recent years due to several reasons. First, there are new algorithms for calculating k nearest neighbors, and these are extremely fast for millions of observations (Matlab implemented these).Footnote 13Second, interpretation of spatial spillovers having a constant number of neighbors is more intuitive. The magnitude of spillovers to the k neighbors can easily be calculated and understood. This is not the case where an observation unit (state here) has two neighbors and another has four neighbors as in the case of, for example, a contiguity-based weight matrix.

The number of neighbors, k, is the parameter of this weighting scheme. Its choice is an empirical issue. Our solution to this problem relies on a Bayesian Comparison approach (see LeSage and Fischer (2008)).Footnote 14 This approach calculates the Bayesian posterior model probabilities of ten different k-nearest neighbor matrices with \(k=1,\ldots , 10\), given our spatial Durbin panel data model specification. These probabilities are based on the log-marginal likelihood of the model obtained by integrating out all parameters of the model over the entire parameter space on which they are defined. A log-marginal likelihood of −174.86 and an associated model probability close to one indicates that the model specification with the \(k=8\) nearest neighbor matrix is most consistent with the given panel data sample.

W is an \(294\times 294\) weight matrix representing connections between the states. W has a block diagonal \(I_T \otimes w\), with \(\otimes\) denoting a Kronecker product. w is a \(49 \times 49\) spatial weight matrix constructed based on eight nearest neighboring states and identical during each time period in our model formulation. The diagonal elements \(w_{ii}=0\) (\(i=1,\ldots ,49\)), and the matrix is row-normalized to have row-sums of unity. The \(294 \times 1\) matrix-vector product Wy is referred to as spatial lag and reflects a linear combination of neighboring states’ values for the dependent variable vector of opioid death rates. The \(294 \times Q\) matrix product WX is used to create spatial lags of the \(Q=17\) explanatory variables described in the following subsection.

3.2 Data and data sources

Data for the dependent variable vector y come from the US Centers for Disease Control and Prevention (CDC) Wonder Online Databases of Multiple Cause of Death, filtered for drug-related causes of deaths, estimated per 10,000 population for each state and year. Drug overdose deaths were identified using the International Classification of Diseases, Tenth Revision (ICD-10) underlying cause-of-death codes X40-X44 (unintentional), X60-X64 (suicide), X85 (homocide), or Y10-Y14 (undetermined intent).Footnote 15 Among deaths with drug overdose as the underlying cause, the opioid subcategory was determined by the following ICD-10 multiple cause-of-death codes: opium (T40.0); heroin (T40.1); natural and semisynthetic opioids (T40.2); methadone (T40.3); synthethic opioids other than methadone (T40.4); or other and unspecified narcotics (T40.6) as a contributing cause.Footnote 16

Fig. 1
figure 1

Opioid mortality rates in US, 2014 and 2019

Figure 1 shows a marked temporal growth of the opioid mortality rates over the time frame, and increasing geographic heterogeneity in the spatial distribution of the rates in 2019, with hot spots in the Rust Belt and the Midwest. While the opioid crisis initially hit the Midwest states and New Mexico hardest, Western states such as California, Nevada and Arizona have also experienced increases in opioid mortality rates. Based on prior opioid research we selected the following set of sociodemographic and socioeconomic X-variablesFootnote 17:

  1. (a)

    Five variables are considered to capture the racial/ethnic structure of a state: percentages of the population who identify as White, African American, Native American, Asian or Hispanic/Latino races. The data were derived from the US Census Bureau’s Population Estimates Program called American Community Survey (ACS).

  2. (b)

    To uncover the impact of age on opioid mortality rates, we include age in years based on standard definitions: percentages of people with 18–24 years, 25–34 years, 35–44 years, 45–64 years, and with 65 years and more (Data source: ACS).

  3. (c)

    The level of educational attainment of the population is widely viewed as a relevant factor affecting opioid mortalities (see, for example, Blake-Gonzalez et al. (2021)). Educational attainment is measured in terms of two variables, the percentage of people with 25 years or older who hold a high school’s degree and above and the percentage of people having a bachelor’s degree or higher (Data source: ACS).

  4. (d)

    Economic conditions of a state could account for up to one-tenth of the rise of opioid mortality rates (Blake-Gonzalez et al. 2021). Following (Nosrati et al. 2019) we consider three variables that address economic distress: the poverty rate, the variable of lower income and the unemployment rate. The data for these variables come from the American Community Survey. The poverty rate is calculated as the percentage of the population whose income falls below the state-specific level per year, while lower income is measured in terms of the number of people with income of less than $ 34,999 per 10,000 persons. The unemployment rate denotes the percentage of people aged 16 years and older who are unemployed in the resident labor force.Footnote 18

  5. (e)

    In addition to the above sociodemographic and socioeconomic factors, we include the opioid prescription rate to account for one of the opioid supply factors,Footnote 19 and health insurance coverage. The prescription rate is calculated by dividing the total number of opioid prescriptions dispensed annually by the annual resident population and multiplying the number by 100. The data come from the US Opioid Dispensing Rate Maps provided by IQVIA Xponent from the CDC in 2020. Health insurance coverage is measured in terms of the percentage of people with health insurance. To the extent that individuals are covered by health insurance, they may better afford to purchase prescriptions. Hence, health insurance may have the potential to increase the demand for drugs (Blake-Gonzalez et al. 2021).

3.3 Estimation results

Estimation is based on Markov Chain Monte Carlo estimation as described in Sect. 2.2. The resulting estimates are presented in Table 1. The table contains posterior means for the parameter estimates of the 17 explanatory variables along with the associated (asymptotic) t-statistics and z-probabilities. t-statistics represent the posterior mean divided by the posterior standard deviation both of which are calculated from the retained MCMC draws. The t-statistics are then used to determine the associated z-probabilities. R-square for the estimates is very high (0.94) and sigma-square low (0.0502). We see evidence of significant spatial effects working through the dependent variable, since \(\hat{\rho }=0.2939\) which is significant at the 99% level. The parameter estimates for the \(\beta\)-coefficients on the X-variables and the \(\theta\)-coefficients on the WX-variables are not meaningful if one is interested in how changes in explanatory variables impact the dependent variable outcomes which is the focus of interest in this study. As emphasized in Sect. 2.3, it is is necessary to properly calculate the direct, indirect and total effects associated with changes in the explanatory variables on opioid mortality rates, given our spatial panel data model.

Table 1 Posterior parameter estimates of the spatial Durbin panel data model

Table 2 reports the posterior mean effects estimates along with the 95 percent credible intervals and t-statistics that provide a valid inference with respect to the significance of the impact estimates. The scalar summary measures reflect: (i) a direct impact, (ii) an indirect (spatial spillover) impact and (iii) a total impact on opioid mortality rates for ceteribus paribus changes in the explanatory variable concerned. A positive (negative) mean with positive (negative) lower and upper credible intervals should be interpreted as a positive (negative) effect. Effects whose credible interval span zero are not significant. These statistics for the scalar summary effects estimates are calculated using the retained MCMC draws for the (direct, indirect and total) effects (for each explanatory variable). The standard deviation of the draws is used to construct t-statistics that are taken to determine the associated t-probabilities. The lower 0.05 and upper 0.95 intervals are also based on the MCMC draws.

Focusing first on the direct effects estimates shown in Table 2, we see five out of 17 explanatory variables that are significantly different from zero. All these impact estimates are negative. The direct impact of changes in the senior (65 years and older) population share on opioid mortality is negative with a magnitude of 0.2042. This result matches our intuition, as we should expect that a larger percentage of senior people in the typical state i would reduce the opioid mortality rate in this state. A bit larger in magnitude is the direct effect of the Native American population share (−0.2545) which suggests that a larger percentage of this population in state i would reduce opioid mortality in i, since members of this population tend to die of drug overdose less likely than those from the White, African American and Hispanic subpopulations (for differences in opioid mortality by race see Nechuta et al. (2018)).

Opioid PrescribingFootnote 20 representing the supply side has a negative direct effect of −0.0073, suggesting that this rate in state i would reduce the opioid mortality rate in this state i, in contrast to our expectation. This is likely to arise from the fact that our prescribing data includes only prescriptions for Medicare and Medicaid patients and, thus, underscores the use of prescription opioids, leading to biasing downward the effects estimates to a magnitude close to zero.

The direct impact of changes in the Poverty Rate is significant and negative on the opioid mortality rate in i, reducing this too, but with a much lower magnitude of 0.0327 as compared to the sociodemographic variables. The estimated direct effect of the Unemployment Rate is positive, but not significantly different from zero, suggesting that this rate has no impact on own-state opioid mortality. Its indirect effects estimate, however, indicates a significant and negative spillover impact, implying that increasing unemployment in neighboring state j decreases opioid mortality in state i.

Table 2 Direct, indirect and total impact estimates for changes in X- and W X-variables

The final direct effects estimate that is significantly different from zero, is the direct impact of changes in the percentage of people with 25 years or older who hold at least a high school’s degree. This negative direct effect has a magnitude of 0.0825, implying to reduce the opioid mortality in state i. The direct effects of all other explanatory variables are not significant, suggesting that they have no impact on own-state opioid mortality.

The direct impact estimates differ from the corresponding coefficient estimates outlined in Table 1. The difference is due to some feedback effect that comes into play in the direct effects estimates. The discrepancy is positive since the impact estimates exceed the coefficients, reflecting some positive feedback. Note that it would be a mistake to interpret the \(\beta\)-parameters as representing direct effects estimates. If we would incorrectly view, for example, the model coefficient \(\beta _3\) on the Percent Native American variable as representing the direct impact, this would lead to an effect not significantly different from zero. But the true direct impact estimate points to effects estimates that are negative and significant.

We now turn to considering the indirect (spatial spillover) effects estimates. These effects are measures for the magnitudes of spatial spillovers from the model. We emphasize that it would be a mistake to interpret the \(\theta\)-coefficients as representing spatial spillover magnitudes in our model. To see how incorrect this is, consider the difference between the \(\theta\)-coefficients and the true indirect effects correctly calculated from the cross-partial derivatives of the model, \(\partial y_i/ \partial X^q_j\), accumulated across all states \(j \ne i\) and then averaged. These estimates are referred to as a measure of cumulative spatial spillovers, representing the impact of changes in other-state explanatory variables on own-state outcomes.

Table 2 shows five variables with statistically significant negative indirect effects estimates. The indirect impact of changes in the percentage of the middle-aged (35–44 years) and senior (60 years and older) populations in neighboring states j on the opioid mortality rate in state i is negative with a magnitude of 1.06476 and 1.2530, respectively, implying to reduce opioid mortality in i. This could arise from either product market impacts of the populations staying in neighboring states or from neighboring states economies, or perhaps networks between suppliers in neighboring states and own states. Similar arguments apply for the poverty, the unemployment and opioid prescription rates, resulting, however, in much lower magnitudes (−0.1619, −0.6805 and −0.0336) in comparison with the age-specific indirect effects.

Moreover, we see four positive spillover effects that are significantly different from zero. The effects of White, Asian and Hispanic subpopulations in neighboring states j is positive (0.2851, 0.9952 and 0.8889), suggesting to increase the opioid mortality rate in state i. The same happens with the Lower Income variable that shows a spillover effect 0.2680. These results are not entirely expected, but the reader should note that the relationship being estimated is an unidentified reduced form relationship.

Indirect effects have larger magnitudes than the direct effects. But it must be noted that the indirect effects are cumulated spillovers where cumulation takes place over all neighboring states \(j \ne i\), neighbors to the neighboring states, and so on. While this may seem counter-intuitive, the indirect effect falling on any single state would be generally much smaller, consistent with spillovers being a ’second-order’ effect. Also the largest indirect effects would fall on nearby states. It is the cumulation of the spatial spillovers over all the states that leads to relatively larger indirect than direct effects.

Finally it should be noted that for matter of completeness Table 2 also lists the total effects that are defined as sum of the direct and indirect effects estimates. Inference statistics is based on the sum of two coefficients, \(\hat{\beta } + \hat{\theta }\), and a proper measure for dispersion of this sum of coefficients. This measure can be constructed from the MCMC draws. A final point worth mentioning is that a focus on direct rather than total impacts that include spatial spillover effects would lead to an underestimate of the impacts.

4 Closing remarks

This paper emphasizes the importance of employing a spatial Durbin panel data model for quantifying the impact of a set explanatory variables on opioid mortality rates in the US. The empirical model uses a pool of 49 US states over six years from 2014 to 2019, and a \(k=8\) nearest-neighbor matrix to represent the topological structure between the states. Calculation of the direct (own-state) and indirect (spatial spillover) effects of the explanatory variables is based on Bayesian estimation and inference. The effects estimates reflect a proper interpretation of the marginal effects of the nonlinear model that involves spatial lags of the dependent and independent variables. This study’s methodology and resulting findings provide a useful template for future empirical work using other geographic locations or shifting interest to other pandemics.

Our study has produced several interesting empirical results that carry some policy implications of interest for policy-makers and government agencies alike. First, evidence suggests that opioid mortality depends on characteristics of the state itself (direct impact), and on those of nearby states (indirect impact). Second, direct impacts are important, but externalities or (cross-state) spatial spillovers are more critical. In fact, there is empirical evidence that indirect effects of the poverty rate in neighboring states, for example, has an impact five times larger than the direct impact. When assessing the relative costs versus benefits of regional policies aiming to reduce opioid mortality, focus on direct (beneficial) rather than total (beneficial) effects, including spatial spillover benefits, would lead to underestimating the benefits to society or the returns to such policies. Third, a state’s sociodemographic structure (age and race) is essential. Economic distress of a state covered by the poverty rate, variable of lower income and unemployment rate are significant too, but comparatively less so as indicated by the total impact magnitudes.

A few limitations to our study should be noted. First, all the inferences are made conditional on the data used. The analysis is based on data reported on death certificates which we assume to be correct, but acknowledge they are imperfect. Death certificates may classify causes of death, and results may be biased by geographic heterogeneity in cause-of-death reporting if misclassification existed and/or changed over time and across states (see Alexander et al. 2018). Second, our prescribing opioid data covers only prescriptions for Medicare and Medicaid patients, and underscores the use of prescription opioids. This underscoring may be downward biasing the effects estimates of the prescription opioid supply variable. Finally, scalar summary effect measures provide information on the typical state rather than individual states. This may obscure variations in the direct, indirect, and total effects over space that would be of special interest to policy-makers and government agencies. Further explorations with observation-level direct, indirect, and total effects would produce additional insights to better understand regional differences in impact estimates that would be important for developing prevention and intervention strategies.