Introduction

Suicide and drug use disorder deaths are the leading causes of psychiatric morbidity [1]. These two partially overlapping causes of death are often challenging to distinguish as the determination of suicidal intent in drug use disorder deaths is heavily reliant on suicide notes, which are often not present [2]. Additionally, suicides are specifically prone to being wrongly classified as accidents [3] which results in missing funds for psychiatric care and consequently in absent treatments for both causes of death [1].

Suicide rates differ considerably between countries [4]. Some of this variation may be explained by country-specific differences in suicide risk factors, which may translate into real variations in the prevalence of suicides [5]. However, national suicide statistics rely on official post-mortem certification procedures, which vary across countries. More precisely, after a death occurs, a death certificate is issued indicating the cause of death according to the International Classification of Diseases (ICD). Naturally, countries differ in their national death certification procedures and have varying standards of training, likely affecting the quality and reliability of cause of death statistics [6, 7]. Therefore, heterogeneity in death misclassification can be expected between countries – i.e. some classify more deaths wrongly than others. Literature has suggested that deaths due to suicide are underreported in general and, when compared to other causes of deaths, may be especially subject to misclassification due to the stigma associated with it [8]. Thus, for all these reasons, many studies have questioned the validity of causes of death statistics [9], especially suicide statistics [9, 10], and consequently the validity of all studies relying on official mortality data.

The classic suicide misclassification hypothesis, which emerged in the 1970s, proposes that suicides are `hidden` in other cause-of-death categories [11,12,13]. Inferentially, the theory assumes that misclassification of suicides should be reflected in a negative association with nonspecific death rates. A previous study [14] has scrutinized this theory and found only weak to no associations between suicides, undetermined-, and ill-defined deaths in cross-sectional and longitudinal analyses. The study then proposed a new suicide misclassification hypothesis introducing autopsies as a covariate to investigate the diagnostic accuracy of death classifications. Doing this, the authors found that autopsy rates were strongly associated with, and predictive of, suicide deaths, drawing doubts on the reliability and validity of death classifications—and consequently of mortality statistics in the presence of low autopsy rates [14].

Based on evidence there are good reasons to assume, that [10]-however, past research failed to confirm the hypothesis that suicides are misclassified as neither deaths of undetermined intent nor as ill-defined deaths. Therefore, it is important to investigate which other death categories may account for these misclassifications. Previous research suggested that one of these categories may be unintentional injury deaths (i.e. accidents) [15]. Indeed, there are scenarios in which suicides and accidents could be confused with each other. For example, a fall from height or a death due to a drug use disorder, need further inquiry to be interpreted as either intentional (suicide) or unintentional (accident) death.

This study has two objectives. First, to examine to what degree suicides and accident deaths are misclassified. We hypothesize that most misclassifications of suicide and accidents occur between each other, given their external similarity but uncertainty of intent. Second, we investigate which other causes of deaths may account for the misclassification of accident and suicides deaths. These causes of death were chosen based on previous literature. More precisely, we investigate the categories undetermined, ill-defined, [12] and drug use disorder deaths [2]. Should these aforementioned causes of death account for the majority of misclassifications, the autopsy rates should no longer be associated with the cause of death in question.

Methods

This study examines three external causes of death that are differentiated by intention. These three categories are: Intentional injury deaths, unintentional injury deaths, and undetermined/ill-defined deaths. These followed the definition set out in the WHO Mortality Database, with intentional injury deaths referring to suicide deaths (ICD-10 codes: X60-X84, Y870) – called “suicides” hereafter. Unintentional injury deaths include all kind of accidents (e.g. car accidents) and accidental poisoning – either by medication or alcohol/illegal substances. Hereafter, the former are referred to as “accidents” (V01-X32, X40, X43, X46-X59, Y40-Y86, Y88, Y89, U12.9) and the latter – deaths due to illegal substances and alcohol – as “drug use disorder deaths” (F10-F16, F18-F19, X41-X42, X44, X45). Deaths of undetermined intent (Y10-Y34, Y872) are those, where it is not possible to know whether the death of a person was intentional or not. These include cases of death, in which it remains unclear whether an injury death, as described above, was a suicide or an accident. We also include ill-defined deaths (R00-R94, R96-R99) – these are deaths, where the death certification procedure did not find a specific cause of death (e.g. unattended deaths, death not otherwise specified). Note that the WHO classification of ICD-10 codes provides a set of mutually exclusive categories [16]. In that WHO classification system drug use disorder deaths should not be confused drug poisoning. These are categorized by the WHO into a number of categories, of which the following are relevant for this analysis: “Suicides” (i.e., ICD-10, X60–X69: intentional self-poisoning), “Accidents” (i.e., ICD-10, X40, X43, X46-X49: accidental poisoning), “Deaths of undetermined intent” (i.e., ICD-10, Y10–Y19: poisoning with undetermined intent) and drug use disorder deaths (i.e., ICD-10, X41–X42, X44–X45: accidental drug poisoning not elsewhere classified). Evidently, drug poisoning—as the mechanism of dying – underpins all categories. More precisely, a person can use drug poisoning as a means to commit suicide, poison themselves on accident, the poisoning may be of undetermined intent, or fall under a drug use disorder related death.

Consequently, as drug poisoning can be the mechanism of death in all the above-mentioned categories, this makes misclassification between these categories more likely. More specifically, these categories may be mutually exclusive in theory but are often hard to distinguish in practice, due to their shared mechanism.

Data analysis

Data was obtained for the last forty years, i.e. from 1982 until 2022 from the WHO mortality database. Autopsy rates were obtained from the European regional offices’ European Health for All Database (https://gateway.euro.who.int/en), the CDC (https://wonder.cdc.gov/) and Statistics Canada (https://www.statcan.gc.ca/en/start). For sensitivity analyses, additional indicators were retrieved from the OECD (https://data-explorer.oecd.org/). To improve robustness, individual years were averaged into periods: For the longitudinal analysis, eight five-year periods, starting with 1982–86, were calculated. For cross-sectional analysis, four ten-year periods, starting with 1982–91, were calculated.

For the cross-sectional analysis we conducted spearman correlations, bivariate regressions, and multiple regressions. The regressions were used to investigate the associations of each aforementioned variable with suicide and accident deaths.

Longitudinal data analysis was conducted using generalized estimating equations (GEEs). First, we implemented crude models and then two multiple GEEs were calculated. Similar to the regression models, the GEEs were calculated in order to investigate the associations of each aforementioned variable with both suicide and accidents.

The formula of both of the multiple regression models and of the multiple GEEs used identity links and a gamma distribution and can be described as following: For the model predicting suicides (per 100.000), the predictors were autopsies (per 100 deaths), drug use disorder deaths (per 100.000), undetermined/ill-defined deaths (per 100.000), and accidents (per 100.000). That is to say: Y(suicides)i = β0 + β1 accidents1i + β2 drug use disorders deaths + β3 undetermined/ill-defined deaths3i + β4 autopsy4i + εi). For the model predicting accidents, the predictors were autopsy rates, drug use disorder deaths, undetermined/ill-defined deaths, and suicides (i.e. Y(accidents)i = β0 + β1 suicides1i + β2 drug use disorder deaths2i + β3 undetermined/ill-defined deaths 3i + β4 autopsy4i + εi). Thereby, Y is defined as μ, the mean of the outcome Y, moreover μ is transformed using the identity link: 1/μ. The inclusion of time as an additional predictor to the above-mentioned variables was specific to the longitudinal model.

As accidents and suicides were non-negative and right skewed, gamma distributions were used for both the regression models and GEEs. The correlation structure of the longitudinal model was chosen based on quasi-likelihood under the independence model criterions (QICs) [17].

Sensitivity analysis

Sensitivity analyses were undertaken to investigate the robustness of results. We investigated (a) data quality, (b) indicator specification and (c) the impact of antidepressant use. Data quality was investigated in two ways: First by limiting countries to only EU member states, and second by only including countries with high data quality according to the WHO data quality classification (https://platform.who.int/mortality/about/data-quality). The impact of indicator specifications was investigated by dividing both undetermined/ill-defined deaths and drug use disorders deaths into their subcategories (ill-defined diseases and ill-defined injuries, and illicit drug use disorder deaths and alcohol use disorder deaths). The effect of antidepressants was investigated by including defined daily doses, reported on the national level and provided by the OECD, into the aforementioned models. All analysis was conducted using R version 4.3.2 [18]. This analysis was preregistered on OSF before data retrieval, under: https://osf.io/8pgst.

Results

Autopsy rates, accidents, and suicides, all declined over the decades. In contrast, undetermined/ill-defined deaths increased strongly between 1982 until 2001 but have since experienced a sharp decline (see Table 1). Correlations between causes of death and autopsy rates were often positive and significant, except for undetermined/ill-defined deaths, which mostly correlated negatively with autopsy rates (see Fig. 1).

Table 1 Sample characteristics
Fig. 1
figure 1

Bivariate associations of variables over all time as Spearman correlations, scatterplots, and histograms. Note: Lower triangle: Scatterplot, diagonal histograms; upper triangle spearman: correlations with confidence interval and significance test, ** significant at 0.01, *** significant at < 0.001

Regression models

In the bivariate models – averaging results from all four decades – with suicide as the outcome, the strongest association could be observed between suicides and drug use disorder deaths (average of B = 1.93 CI95% [0.96 – 3.08] p < 0.001; R2 = 0.45). This was followed by autopsy rates (average of B = 0.368 CI95% [0.20 – 0.56] p = 0.03; R2 = 0.33), dropping notably from a substantial R2 = 0.59 in the first decade to a weak R2 = 0.03 [19] in the current decade. Moreover, when using suicide as the outcome, there was a strong association with accidents, explaining an average of B = 0.255 CI95% [0.14 – 0.39] p < 0.001; R2 = 0.33, while undetermined/ill-defined deaths were, on average, only associated weakly (B = −0.01 CI95% [−0.05 – 0.05] p = 0.124; R2 = 0.06), (see Appendix for full results). Multiple regression models with suicide as the outcome explained on average, over all four decades, an R2 = 0.74 and showed that suicide was strongly associated with drug use disorder deaths and accidents. While the relevance of autopsy rates fluctuated throughout the decades, the impact from deaths of undetermined/ill-defined deaths was negligible over all four decades (see Table 2 for an overview of specific decades).

Table 2 Multiple generalised linear models with suicide death as outcome

In the bivariate models – averaging results from all four decades – with accidents as the outcome, the strongest average association with accidents were suicides (average of B = 1.27 CI95% [0.76 – 1.82] p < 0.001; R2 = 0.42), followed by autopsy rates (average of B = 0.85 CI95% [0.38 – 1.41] p = 0.002; R2 = 0.34) and then drug use disorder deaths (average of B = 1.46 CI95% [−0.048 – 3.27] p = 0.11; R2 = 0.21). As in the models with suicide as the outcome, undetermined/ill-defined deaths were only weakly associated with accidents (average of B = 0.11 CI95% [− 0.06–0.33] p = 0.27; R2 = 0.08), (see Appendix for full results). In multiple regression models with accidents as the outcome, suicides were most strongly associated with accidents – while all other associations were weak over all four decades, together explaining on average an R2 = 0.57 (see Table 3 for an overview of specific decades).

Table 3 Multiple generalised linear models with accident deaths as outcome

Generalized estimating equations

Predicting suicide, the best model fit was achieved with an independence correlation structure (QIC: 566). Compared to this, the compound symmetry (i.e., exchangeability) was QIC: 658.8, and the QIC for autoregressive structure was QIC: 764. Results of this GEE can be seen in Fig. 2. When comparing the results of the multiple GEE to the bivariate GEEs, the most striking difference was the change of autopsy rates, with a B = 0.39 (0.25 – 0.54), p < 0.001, and a moderate R2 = 0.16 in the bivariate model. This means that for every 1% increase in autopsy rates, suicide rates increase by 0.39 per 100,000. In contrast, in the multiple GEE, autopsy rates were significantly associated with a slight decrease of −0.03 per 100,000 suicides. After autopsy rates, the most impacted predictors were undetermined/ill-defined deaths which were not predictive of suicides in the crude GEE (B = -0.01CI95% [-0.05 – 0.03] p = 0.6; R2 = 0.002), but became significantly associated in the multiple GEE model, predicting that for every unit reduction in undetermined/ill-defined deaths, an increase of 0.05 per 100,000 suicide deaths could be observed. Compared to this, the other bivariate GEE only reported weak changes for drug use disorder deaths [B = 1.11 CI95% (0.73 – 1.49) p < 0.001 R2 = 0.12], leading to an increase of 1.11 per 100,000 suicides. Notably, accidents reported the same mean effect size in the bivariate GEE (B = 0.25 CI 95% (0.20 – 0.30), p < 0.001; R2 = 0.48) as in the multiple GEE model.

Fig. 2
figure 2

Multiple GEE models. Note: Autopsy rate per 100 deaths; causes of death per 100.000 of WHO standardised population

Predicting accident deaths, the multiple GEE achieved the best model fit with an autoregressive structure (QIC: 477.13). Compared to this, the compound symmetry (i.e., exchangeability) was QIC: 485.24, and the independent structure was QIC: 505.86. Results of this model can be seen in Fig. 2.

Comparing the results of the multiple GEE to the results of the bivariate models, the strongest change was again observed for autopsy rates. In the bivariate GEE, autopsy rates had a strong impact on accident deaths at B = 0.72 CI95% (0.42 – 1.03, p < 0.001; with a moderate R2 = 0.20) and a 1% change in autopsy rates was associated with an increase of 0.72 per 100,000 in accident deaths. The remaining predictors only changed insignificantly, as the bivariate models, drug use disorder deaths (B = 0.18 [0.02 – 0.33] p = 0.69; R2 = 0.02), undetermined/ill-defined deaths (B = 0.08 [0.07 – 0.09] p = 0.12; R2 = 0.03) and suicide deaths (B = 1.41 CI95% [0.82 – 2.00] p < 0.001; R2 = 0.43) reported comparable effects to those of the multiple GEE (see Fig. 2). Therefore, in the bivariate GEE predicting accidents, only suicides were able to explain a substantial degree of variance with an R2 = 0.43. We implemented a set of sensitivity analysis which affected the above presented results only marginally (see Appendix for full results).

Discussion

Confirming prior results [14, 20, 21] we fail to find evidence that suicides are mainly misclassified as either undetermined or ill-defined causes of death. Expanding previous investigations to include further causes of death, we find that suicides are misclassified as accident and drug use disorder deaths. Indeed, when accounting for accident deaths, drug use disorder deaths and undetermined/ill-defined deaths, the association between autopsy rates and suicide weakens considerably, implying that accidents and drug use disorder deaths are the two main causes of death which suicides are misclassified as. Furthermore, we find evidence that the direction of the misclassification of accidents and suicides goes both ways, suggesting that accidents are misclassified as suicides and vice versa. However, drug use disorder deaths were only associated with suicides but not with accidents, i.e. suicides are misclassified as drug use disorder deaths, while accident deaths are not misclassified as drug use disorder deaths. Indeed, autopsies were still associated with accidents deaths after accounting for all other causes of death, suggesting that—while we were able to identify the main causes of misclassification for suicides-, there remains unidentified important causes of misclassification for accidents. Altogether, our study supports the idea that suicides and drug use disorder deaths are uniquely prone to conflation with suicides in death statistics.

Causes for misclassification

Various reasons exist why these causes of death may be confused with each other. Both suicide and accident deaths are regarded as injury deaths by the WHO and the only difference between accidents and suicides is whether the deceased had the intent to die or not. In some cases, intent is apparent: For example, when a suicide note is present [2]. In these cases, death classification is more straightforward. However, in other cases, intent may not be as overt. For example, suicides may be misclassified as accidents when a person intentionally jumps from a roof or in front of a car – rather than falling accidentally [6]. 13/07/2024 14:43:00 PMIn such cases, intent can be hard to investigate, in particular when no suicide note has been both left and found [2]. Contrarily, accidents may be classified as suicides as well, for example a person suffering from depression may accidentally shoot themself while handling a firearm, which may be wrongly classified as a suicide due to their history of mental health problems [22].

It is unsurprising that more suicides are being misclassified as accidents than vice versa because the base assumption of an injury death is usually that an accident has occurred [3]. In case no contradicting evidence is present – e.g. in form of suicide notes/announcements [10]—suicides will thus routinely, and therefore at times wrongly, be classified as accidents. This may be aggravated by an overworked medico-legal system, [10] which counteracts deeper case investigations and forces technical and financial shortcuts due to generally limited resources: For one, such investigations are resource intense – e.g. the family of the deceased needs to be interviewed and the medical history has to be reconstructed in a psychological autopsy to evaluate whether it is sensible to assume the individual acted with suicidal intent. Furthermore, suicides are stigmatized, and officials may at times be biased to classify a death as an accident rather than a suicide, given that the former may be a preferable cause of death for the family and others [8]. In short, suicide as a cause of death has a higher burden of proof when compared to other causes of death [3, 10, 23], which, in an overworked system, may lead to an unconscious but preferred misclassification into other cause of death categories which put less strain on the examiners.

Implications

The accurate assessment of causes of death, or lack thereof, has significant implications. Cause of death statistics – which are based on death certificates—have an important role in the calculation of the burden of diseases and mortality (e.g. disability-adjusted life years [DALYs]), which informs the allocation of resources in the healthcare system—for example towards capacity building for prevention and treatment of mental health disorders and suicide. Thus, accurate death investigations have a decisive indirect effect on valid cause of death statistics and associated health care planning [24].

Autopsy rates have been decreasing in Europe and North America. Our study implies that both suicides and accidents may frequently be misclassified, which has wide ranging consequences. More precisely, when the correct classification of death is reliant on autopsy rates, the comparability of death statistics between countries is limited – i.e. countries with lower autopsy rates tend towards wrongly classifying more suicide and accident deaths than countries with high autopsy rates [25, 26]. Many studies utilize national death statistics and, if these are biased, research using measures derived from this data, such as DALYs, would have limited reliability and validity [9]. If this holds true, the current scientific literature may be less reliable than believed—as may be all public health interventions informed by this data.

Misclassifying causes of death threatens public health and safety and wastes limited public health resources by misallocating funds. Our study emphasises the need to improve cause of death classifications by improving the funding of the medico-legal system and adapting new standards and methods of classifying deaths [27], such as new approaches to categorisations [3], systems [28, 29], or simply better funding for more autopsies to ensure valid classifications of death. Currently, countries do not do this to a satisfactory extent as reflected in the poor quality of death statistics. Furthermore, countries may want to consider using autopsies in unambiguous death cases on a random test basis as a measure for quality control of death certification procedures. This could be used to monitor the accuracy of national death statistics and to inspect the ability of those certifying deaths, thus identifying professions, groups or individuals who require further training [25, 26]. Akin to the WHO data quality statistics [30], these investigations can be openly made available to researchers. More precisely, as long as cause of death misclassifications remain a challenge, we strongly urge that autopsy rates and other methods should be developed and established as a standard measure. Thus, like gender, age, or unemployment, autopsy rates—or other measures of cause of death misclassification—should be broadly included as confounders in the analyses of ecological data. There are limitations to this study. More precisely, as the analyses is based on ecological data, interferences should be made with caution. Moreover medico-legal systems vary between different countries, which makes drawing universal conclusions difficult. Therefore, research should be conducted in individual countries taking differences into account to provide information better tailored to the need of the respective countries and regions.

Conclusion

Our study suggests that suicides may be misclassified as accidents and drug use disorder deaths, and that accidents may be misclassified as suicides. Consequentially, our results imply that death certificates regularly include errors, thereby biasing national death statistics and thus all measures and studies based on these statistics. Therefore, autopsies should be conducted more frequently in equivocal cases—and on a test basis in unequivocal cases—to improve death certifications. Until these changes have been implemented, the present bias can be minimized by including autopsy rates as a confounder in studies based on death statistics.