Background

Verbal autopsy (VA) has become a widely established approach for characterizing cause of death patterns in settings where individual deaths are not routinely certified as to cause, with a variety of methods being used for both interview and interpretation phases [1]. Most often, VA has been applied for particular times, or over relatively short periods, to obtain point estimates of cause-specific mortality. However, as archives of VA data accumulate over time, possibilities of studying epidemic dynamics using VA approaches emerge. This is of interest in terms of measuring potential newly emerging causes of death [2], as well as for monitoring the dynamics of epidemiological transition [3]. But it also raises new methodological challenges, for example around consistent interpretation of VA into causes of death over long periods of time and consequently around practitioners' developing perceptions of new situations. More generally, it raises the question of how effectively VA methods are able to detect newly emerging causes of death.

Over the past two decades, southern Africa has experienced a massive and rapidly developing epidemic of HIV infection and associated mortality [4-6]. However, large-scale modeled estimates provide a rather imperfect picture of the epidemic, given that most deaths in southern Africa are neither certified nor medically investigated [7]. Localized populations with intensive surveillance, such as member centers of the INDEPTH Network [8], provide opportunities to look at specific examples in detail [9-11], even if this may generate a subsequent debate as to generalizability. A number of studies elsewhere have established the validity of VA methods for attributing deaths to HIV/AIDS, particularly among adults [12-16]. Nevertheless, there remain some unresolved issues about how to best handle co-causes of mortality in cases of HIV-related death, and willingness to attribute deaths to HIV, whatever methods are used, may be influenced by nonmedical factors such as social stigmatization [17, 18].

HIV-related deaths are complex to count, since HIV-positive individuals are frequently affected by other diseases as a result of being immunologically compromised, and it can be difficult from VA data, in the absence of HIV serology, to determine the relative significance of AIDS versus other diseases in the processes leading to death. The 10th revision of the International Classification of Diseases (ICD-10) uses codes B20 to B24 as underlying causes representing HIV/AIDS in combination with other disease categories (B20 infectious and parasitic diseases, B21 malignant neoplasms, B22 other diseases including wasting, B23 other conditions, and B24 nonspecific AIDS) [19]. However, differentiating probable HIV-related deaths detected by VA into these subcategories may not be easy to achieve, particularly where there is no explicit evidence of HIV positivity.

The ability to interpret any VA interview reliably depends on several factors, including the quality and detail of information on signs and symptoms provided by the informant. In settings where stigma is high around a particular cause of death - as is often the case for HIV - sensitive information may be withheld from the interviewer. Extent of nondisclosure is likely to vary as an epidemic develops, starting from minimal levels when key symptoms are not yet widely known by informants, and when physicians may also not yet be attuned to a particular diagnosis. As a significant epidemic such as HIV/AIDS develops, stigma is likely to rise, together with nondisclosure of relevant details. In a mature epidemic - particularly in the case of HIV as antiretroviral treatments are rolled out - nondisclosure may wane. These patterns may have significant effects on the outcomes of VA interpretation.

The Agincourt Health and Socio-Demographic Surveillance Site in the rural northeast of South Africa has been documenting a geographically-defined population (around 70,000 people in 2005) since 1992, including registering deaths and following those up with VA interviews [20]. The start of this surveillance in 1992 coincided with the early stages of the HIV epidemic (at least in terms of HIV-related mortality) in this area, and hence the accumulated VA data enable a methodological exploration as to how the epidemic evolved. Our primary aim is to characterize the epidemic of HIV-related mortality in this population, comparing both physician-interpreted causes of death and probabilistically modeled causes of death from the same VA interview material. As subsidiary aims, we investigate (1) approaches for handling common co-causes of HIV-related mortality, such as tuberculosis, malnutrition, and chronic gastroenteritis, and (2) variations between different coding physicians' responses to the emerging epidemic. Although this paper deals specifically with an epidemic of HIV-related mortality, findings are discussed in terms of using VA for monitoring long-term dynamics in mortality patterns.

Methods

The analyses in this paper are based on the entire series of 6,153 deaths (among all ages) in the Agincourt population from 1992 to 2005, as previously described in terms of primary-care planning [21] and in a comparison between physician and modeled VA interpretation [22]. VA interviews were successfully completed for 5,794 deaths (94.2%), using a questionnaire developed before international standards were agreed upon. These VA interviews were subsequently coded by two independent physicians who attempted to reach consensus where their diagnoses differed, with a third reviewing and intervening in case of disagreement. If no consensus could be reached, the cause of death was recorded as "undetermined." During the period from 1992 to 2005, 14 physician reviewers were involved in VA interpretation during various subperiods. In 373 (6.4%) of VA reviews, it was not possible to trace the identities of the coding physicians. The InterVA model (http://www.interva.net) was also applied to the VA interview material, as described previously [22]. This public-domain model relates input indicators (history, signs, symptoms from VA interview material) to likely cause(s) of death using Bayesian probabilities. A standard grid of conditional prior probabilities was defined by an expert panel of physicians [23]. The model has subsequently been evaluated in a number of settings [22, 24]. As a standard model designed for cause of death determination in low- and middle-income countries, it has the advantage of consistency over time and place [25].
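The physician consensus procedure described above can be sketched roughly in code. This is an illustrative simplification only: the function name `consensus_cause`, the representation of diagnoses as plain ICD-10 code strings, and the rule that a third reviewer settles a disagreement only by siding with one of the first two coders are all assumptions made for the sketch. The actual process relied on discussion between physicians, which is not modeled here.

```python
from typing import Optional

UNDETERMINED = "undetermined"


def consensus_cause(first: str, second: str, third: Optional[str] = None) -> str:
    """Return a consensus underlying cause from two or three physician codes.

    Hypothetical rule set (an assumption, not the documented workflow):
    - if the two independent coders agree, their shared code is the consensus;
    - otherwise a third reviewer's code is accepted if it matches either coder;
    - otherwise the cause is recorded as "undetermined".
    """
    if first == second:
        return first
    if third is not None and third in (first, second):
        return third
    return UNDETERMINED


# Example ICD-10 codes: B20 (HIV with infections), E46 (malnutrition), A09 (gastroenteritis)
print(consensus_cause("B20", "B20"))
print(consensus_cause("B20", "E46", third="B20"))
print(consensus_cause("B20", "E46", third="A09"))
```

Under these assumed rules, any case on which no two opinions coincide falls into the "undetermined" category, which is one way in which a consensus process can yield lower cause-specific counts than a single coder.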

A dataset was compiled (using Microsoft FoxPro) containing the two independent physician interpretations (main cause, possible immediate and contributing causes with ICD-10 codes), the physicians' consensus finding as to underlying cause (based primarily on the individual physicians' main cause findings), and the InterVA version 3.2 results (up to three likely causes per case, each associated with a quantified likelihood). The HIV level for the InterVA model was set to "high" and malaria set to "low," based on existing knowledge of causes of death in this population, as discussed previously [22]. The concept behind this setting in the InterVA model is analogous to a coding physician knowing that HIV or malaria represent more-common or less-common public health problems in a particular population, irrespective of the details around any individual death or detailed prior knowledge of cause-specific mortality. Age groups were defined as under 1 year, 1 to 4 years, 5 to 19 years, 20 to 49 years, 50 to 64 years, and 65 years and over. Analyses used Stata 10.
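The role of a population-level "high" or "low" setting in a Bayesian interpretation can be illustrated with a toy example. Every number below is invented for illustration and none is taken from the InterVA probability grid; the cause names, indicator names, and the `posterior` helper are hypothetical, and the sketch assumes that indicators are conditionally independent given the cause, which is a simplification.

```python
def posterior(priors, likelihoods, indicators):
    """Return normalized P(cause | indicators) via Bayes' theorem.

    Assumes indicators are conditionally independent given the cause.
    """
    scores = {}
    for cause, prior in priors.items():
        score = prior
        for indicator in indicators:
            score *= likelihoods[cause][indicator]
        scores[cause] = score
    total = sum(scores.values())
    return {cause: score / total for cause, score in scores.items()}


# Hypothetical conditional probabilities P(indicator | cause)
LIKELIHOODS = {
    "HIV/AIDS": {"weight_loss": 0.8, "chronic_diarrhoea": 0.6},
    "malnutrition": {"weight_loss": 0.9, "chronic_diarrhoea": 0.3},
    "other": {"weight_loss": 0.1, "chronic_diarrhoea": 0.05},
}

# Hypothetical priors: only the HIV/AIDS prior differs by an order of magnitude or more
PRIORS_BY_SETTING = {
    "high": {"HIV/AIDS": 0.10, "malnutrition": 0.05, "other": 0.85},
    "low": {"HIV/AIDS": 0.001, "malnutrition": 0.05, "other": 0.949},
}

for setting, priors in PRIORS_BY_SETTING.items():
    result = posterior(priors, LIKELIHOODS, ["weight_loss", "chronic_diarrhoea"])
    print(setting, {cause: round(p, 3) for cause, p in result.items()})
```

With these made-up values, the same reported indicators point most strongly to HIV/AIDS under the "high" prior and to malnutrition under the "low" prior, which shows how a coarse prior setting can shift the most likely cause without any change in the interview data.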

Surveillance-based studies in the Agincourt subdistrict were reviewed and approved by the Committee for Research on Human Subjects (Medical) of the University of the Witwatersrand, Johannesburg, South Africa (protocol M960720). Informed consent was obtained at the individual and household levels at every follow-up visit, whereas community consent from civic and traditional leadership was secured at the start of surveillance and reaffirmed from time to time. Feedback on cause of death patterns is presented to local communities and health service providers annually.

Results

The evolving epidemic of HIV-related mortality

Figure 1 shows the evolution of HIV-related mortality, both overall and by age group, in the Agincourt population, calculated as the rates (per 1,000 person-years) of physician consensus underlying cause being coded as ICD-10 B20-B24 (1,136 deaths, 18.4%), or the rates of most likely cause from InterVA being HIV/AIDS-related death (1,146 deaths, 18.6%). Both approaches showed very similar patterns over time and within age groups, with a huge increase from no HIV-related deaths in 1992 to 2.5 per 1,000 person-years in 2005 according to physician coding, and correspondingly from 0.2 to 2.6 per 1,000 person-years according to InterVA. Table 1 shows numbers of deaths according to physicians and InterVA, by age, sex, and period.
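For readers less familiar with the rate measure used here, a minimal sketch of a cause-specific mortality rate per 1,000 person-years follows. The helper name `rate_per_1000_py` and the input figures are hypothetical: annual person-years of observation are not reported in this text, so the example simply uses round numbers of a plausible order of magnitude.

```python
def rate_per_1000_py(deaths: int, person_years: float) -> float:
    """Cause-specific mortality rate per 1,000 person-years of observation."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return 1000.0 * deaths / person_years


# Hypothetical inputs (not reported values): 175 deaths over 70,000 person-years
example_rate = rate_per_1000_py(175, 70_000)
print(example_rate)  # 2.5
```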

Figure 1

HIV-specific mortality rates by age group according to (a) physician consensus interpretation of VA data and (b) InterVA interpretation (HIV-related death as most likely cause).

Table 1 Characteristics of HIV-related deaths in Agincourt, South Africa, using VA data interpreted according to physician consensus on underlying cause and InterVA most likely cause

Only 63/6,153 (1.0%) of the overall VA records explicitly mentioned HIV positivity in the interview material, so the overwhelming majority of conclusions on HIV-related deaths both by the physicians and the model reflected circumstantial findings. When data for the period from 1992 to 1994 were rerun with InterVA set to "low" HIV, the number of cases most likely due to HIV-related causes decreased from 57 to eight out of a total of 707 deaths (8.1% to 1.1%). Physician consensus findings for the same period recorded 13 cases (1.8%), although a total of 20 cases (2.8%) were HIV-related according to at least one physician. Among the 51 cases rated as HIV-related by the model ("high" setting) but not by physician consensus for this period, the most common underlying cause attributed by physicians was malnutrition (nine cases, 17.6%). For comparison, overall physician consensus results attributed 4.2% of deaths to malnutrition in 1992-1994, compared with 1.3% in 1995-2005.

Effects of different approaches for estimating HIV-related mortality

In addition to the deaths for which physician consensus identified HIV as the underlying cause, a further 18 cases involved HIV as the physician consensus contributory cause. Of this revised total of 1,154 HIV-related deaths, 693 (60.0%) were concluded in the physician consensus to have an infection (ICD-10 B20), of which 148 (12.7%) were specifically mentioned as tuberculosis. Ten cases (0.9%) had malignancies (B21), and 99 (8.6%) had chronic gastroenteritis or malnutrition (B22).

Using the alternative approach of the InterVA model, a total of 1,237 cases were rated as probably HIV-related, although in 91 of these HIV was not the most likely cause. Of the 1,237 cases, 156 (12.6%) were also identified as being associated with tuberculosis and 10 (0.8%) with other infections (B20), three (0.2%) with malignancies (B21), and 18 (1.5%) with chronic gastroenteritis or malnutrition (B22).
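The counts reported for the InterVA interpretation can be cross-checked with a few lines of arithmetic. The sketch below uses only figures stated in the text; the variable names are chosen for illustration.

```python
# Counts reported in the text for the InterVA interpretation
total_hiv_related = 1237  # cases rated as probably HIV-related
not_most_likely = 91      # of these, cases where HIV was not the most likely cause
co_causes = {
    "tuberculosis": 156,
    "other infections (B20)": 10,
    "malignancies (B21)": 3,
    "chronic gastroenteritis or malnutrition (B22)": 18,
}

# Cases in which HIV/AIDS was the most likely cause
most_likely_hiv = total_hiv_related - not_most_likely

# Share of each co-cause among all probably HIV-related cases, in percent
shares = {
    name: round(100 * count / total_hiv_related, 1)
    for name, count in co_causes.items()
}

print(most_likely_hiv)
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

The difference of 1,237 and 91 reproduces the 1,146 deaths for which HIV/AIDS was the most likely InterVA cause, and the computed shares agree with the percentages quoted above.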

Interphysician variations in attributing HIV-related mortality

Of the 14 physicians coding this series of VAs, two completed very few (two and 16 cases respectively) and have been excluded from further consideration of interphysician variation. Of the 12 remaining, there were between two and five physicians coding VAs in any one year. No individual carried out work over the entire period. Figure 2 shows the overall proportions of physician consensus and InterVA HIV-related deaths by year, together with the proportions rated by the various physicians. In addition, the "low" HIV InterVA results for 1992-1994 are shown. Table 2 shows the proportions of HIV-related deaths as coded by first and second physician coders (irrespective of individual physician identity) compared with the revised physician consensus proportions, by year. The overall proportion of HIV-related mortality after achieving consensus was around 8% lower than single physician opinions (19.9% compared with 21.6%, ratio 0.92).
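The "around 8% lower" figure is a relative difference rather than a difference in percentage points, as a short calculation makes explicit. The two proportions are taken from the text; the variable names are illustrative.

```python
consensus_pct = 19.9  # HIV-related proportion after physician consensus (from the text)
single_pct = 21.6     # HIV-related proportion from single physician opinions (from the text)

ratio = consensus_pct / single_pct
relative_reduction_pct = 100 * (1 - ratio)          # relative difference, in percent
absolute_difference_points = single_pct - consensus_pct  # difference in percentage points

print(round(ratio, 2))
print(round(relative_reduction_pct, 1))
print(round(absolute_difference_points, 1))
```

The ratio of about 0.92 corresponds to a relative reduction of just under 8%, whereas the absolute gap between the two proportions is under two percentage points.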

Figure 2

Proportion of HIV-related deaths by year, according to physician consensus (heavy solid line) and opinions from 12 individual physicians who participated in coding during various periods (thin lines joining markers). The InterVA model results are represented by the heavy dashed line, with the alternative "low" HIV setting for 1992-1994 represented by the dotted line.

Table 2 HIV-related deaths (numbers and proportions) according to first and second physician coders and physician consensus, by year

Discussion

It is clear that the progression of the epidemic of HIV-related mortality in this rural South African community, with population-based rates increasing more than tenfold over a 14-year period, was successfully detected and tracked by means of VA, in the absence of any more rigorous routine procedures for following up deaths and their causes. Although one might not argue for VA as the epidemiological method of choice for this purpose, the reality across much of the world is that there is no realistic alternative for the time being [26]. Even where deaths are supposed to be certified, there can be considerable difficulties in accurately capturing and recording deaths related to HIV/AIDS [27]. How VA material can best be interpreted into cause of death findings including HIV-related mortality is thus a very important issue, which can then form the basis of understandings of population health, for example patterns of social disparities [28].

The validity, reliability, and consistency with which VA data can be interpreted, particularly in terms of HIV-related mortality, are important issues. Both the physician-based and modeled approaches presented here yielded very similar results in terms of characterizing the epidemic. Intuitively plausible trends, such as the increasing age of HIV-related deaths observed as the epidemic developed (according to both approaches), presumably following developments in care and treatment, are encouraging. The InterVA model was not specifically designed to deliver ICD-10 codes, and so the major comparison here was equivalence at the B2* level rather than at the third digit level. As is usually the case where VA is used, there is no gold standard against which to absolutely compare these findings. Even if we knew the HIV serostatus for every death, there would still be difficulties in determining which deaths were actually attributable to HIV. However, it is very unlikely that the closely similar epidemic patterns shown for the two methods in Figure 1 would arise entirely by chance, and in that sense each lends credence to the other. But, as we have noted previously [22], the physician approach was very time-consuming and expensive compared with probabilistic modeling, and the delays and expense involved in the physician process may be hard to justify from these results.

Since the "two physicians plus arbitrator" model of physician interpretation seems to have become a de facto (but not necessarily "gold") standard in much VA work, it is perhaps surprising that there have been few detailed analyses of individual physicians' opinions compared with physician consensus findings in VA studies using this method, with some exceptions [29, 30]. It is also important in this context to remember that concurrent findings do not necessarily constitute "truth" [31]. In the particular setting of this epidemic, where the incidence of HIV-related deaths was changing at a rate that was not necessarily clear to physicians at the time, especially in the early stages of the epidemic, it was particularly relevant to examine the ways in which individual physician interpreters responded to the changing situation, as well as the effect on consensus findings. A relatively large number of individual physicians were involved in the process over the 14-year period, and it would be surprising if this were not the case in most longer-term VA operations. Large studies using multiple physicians to interpret cause of death are therefore difficult to interpret and understand if details about interobserver effects are not presented. It is also clear from the results in Figure 2 that, in general, consensus rates tended to be slightly lower than individual physician rates, particularly in the later years. This could have important implications in considering whether to use only a single coding physician per case, as has previously been suggested [32]. While there was generally good consistency between first and second physician findings (averaging over individual physicians) as shown in Table 2, the generally slightly lower rates of HIV-related mortality from the consensus process would probably result in slightly higher levels of "undetermined" cause of death in an all-cause analysis than might have resulted from using only a single physician coder.

Around the inception of this HIV-related mortality epidemic, the relationships between individual physicians, consensus results, and the "low" and "high" HIV settings for the InterVA model are particularly interesting. The proportional differences in rates among the various approaches were greatest during the first three years, as is clear from Figure 2. Initial work on the InterVA model suggested that only causes likely to vary by an order of magnitude in terms of overall proportion needed to have an adjustment [23], with the crossover between "low" and "high" being at around 1% of total mortality. The "high" setting was therefore the appropriate one overall here. The analogous "setting" in physician coding is represented by a physician's awareness of how common HIV-related mortality is in a population, irrespective of the detailed circumstances of a particular case. Physician consensus rates gave the lowest measure of HIV in the early years, and it seems that in the uncertain early stages of the epidemic it was particularly difficult to achieve consensus, even though some deaths were considered as HIV-related by one physician. This supposition is indirectly supported by the finding that the physicians' highest rates of malnutrition-related mortality were recorded during that period, probably representing a misclassification of deaths that were at least partly HIV-related. Thus the reality here is that the HIV-related mortality rates between 1992 and 1994 were probably somewhere in between the various estimates shown in Figure 2. Conversely, individual physicians recorded appreciably more HIV-related mortality in the later years, compared with both the consensus and modeled findings, possibly reflecting physicians' inflated views of HIV latterly. Additionally, nondisclosure of sensitive details in VA interviews at various stages of the epidemic may have compromised both the physicians' and model's findings. In the case of the model, it is important to note that the HIV rates over the period increased tenfold without any information being given to the model about a likely increase over time. This illustrates the relatively noncritical magnitudes of the cause-specific prior probabilities incorporated in the model, and supports the notion that a single model can be used for interpreting VA data over wide ranges of time and place, maximizing the benefits of consistency for comparative purposes over different settings.

Conclusions

VA was clearly able to identify the emergence and growth of a very significant epidemic of HIV-related mortality in this population, and using either physicians or probabilistic modeling to derive cause of death findings gave closely similar results. The evidence suggests that physicians were perhaps a little slow to recognize the early stages of the epidemic, while the model (at least when set to expect a "high" level of HIV mortality) may have slightly overestimated initially. However, the fact that a numerically constant model was able to characterize a greater-than-tenfold increase in HIV-related mortality over time is an important demonstration of the relative robustness of probabilistic modeling for VA interpretation. This suggests that there is no need for finely tuned "local" versions of models for VA interpretation, the proliferation of which would detract from the comparability of results over time and place.