Introduction

Prevalence of depression is similar in pregnant, postpartum, and nonpregnant women. However, the onset of new depression is higher during the perinatal period (Vesga-Lopez et al. 2008) and postpartum depression is often preceded by antenatal symptomology (Rahman and Creed 2007; Milgrom et al. 2008). Depression during pregnancy has been associated with poor uptake of antenatal care and adverse fetal and obstetric outcomes (Lancaster et al. 2010; Alder et al. 2007; Grote et al. 2010). Increasingly, anxiety during pregnancy has also been shown to be of concern (Alder et al. 2007; Austin 2004). While increased health care contact during pregnancy provides opportunities for screening, prevention, and treatment (Committee on Psychosocial Aspects of Child and Family Health: Task Force on Mental Health 2009), antenatal depression frequently remains undetected and untreated (Goodman and Tyer-Viola 2010).

A systematic review of perinatal mental disorders in low- and middle-income countries (LMIC) found a concerning burden, with weighted mean prevalence between 15 and 20 % (Fisher et al. 2012), and highlighted the dearth of research evidence in LMIC (Parsons et al. 2011) and in Africa (Sawyer et al. 2010), a continent also heavily affected by HIV epidemics. In the sub-Saharan African region alone, in 2010, approximately 1.36 million pregnant women were living with HIV (World Health Organization 2011); antenatal depression has been shown to be elevated in the context of HIV with rates above 40 % (Rubin et al. 2011; Rochat et al. 2011; Rochat et al. 2006). Improving early detection and intervention during routine pregnancy care may reduce risks for postnatal depression and may also improve HIV treatment and prevention outcomes (Psaros et al. 2009; Kapetanovic et al. 2009; Levine et al. 2008).

Despite significant and growing support for universal screening (Mitchell and Coyne 2007; Breedlove and Fryzelka 2011; Kim et al. 2008; Kuchn 2010), multiple challenges exist in ensuring that health care practitioners screen for depression in primary health care (PHC) and among high risk populations such as pregnant women (Gjerdingen and Yawn 2007; Kopelman et al. 2008; Rice et al. 2007). Resistance to screening is often high where primary care is particularly time pressured, creating a need for short, user-friendly, and sensitive tools. A recent pooled analysis (Mitchell and Coyne 2007) found that ultrashort one-item screens have reasonable specificity, but low sensitivity, identifying only three in 10 depressed patients, while two- or three-item ultrashort measures have significantly improved sensitivity, identifying eight in 10 depressed cases. However, there are concerns that this higher sensitivity comes at the expense of high false-positive rates that prove too costly when resources are scarce. Evidence supporting the cost effectiveness of universal screening is mixed (Yonkers et al. 2009; Mitchell and Coyne 2009), with some studies showing that lack of access to treatment undermines the cost effectiveness of universal screening (Paulden et al. 2009), as does offering treatment to women who do not need it or to those who do not want it (Dowswell et al. 2010).
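The trade-off between sensitivity and false positives described above can be made concrete with a small calculation. The sketch below is illustrative only: the sensitivities of 0.3 and 0.8 follow the "three in 10" and "eight in 10" figures quoted from the pooled analysis, whereas the prevalence (0.2), the specificities (0.95 and 0.80), and the function name are assumptions chosen for the example and are not taken from any cited study.

```python
def expected_referrals(n_screened, prevalence, sensitivity, specificity):
    """Project screening outcomes for a clinic population.

    All inputs are proportions between 0 and 1 except n_screened.
    """
    depressed = n_screened * prevalence
    not_depressed = n_screened - depressed
    true_pos = depressed * sensitivity              # depressed women flagged
    false_pos = not_depressed * (1 - specificity)   # non-depressed women flagged
    return {
        "true_positives": true_pos,
        "false_positives": false_pos,
        "missed_cases": depressed - true_pos,
        # Share of flagged women who are actually depressed (PPV).
        "ppv": true_pos / (true_pos + false_pos),
    }


# Hypothetical prevalence and specificities; sensitivities as quoted in the text.
one_item = expected_referrals(1000, 0.2, sensitivity=0.3, specificity=0.95)
multi_item = expected_referrals(1000, 0.2, sensitivity=0.8, specificity=0.80)

for label, result in (("one item", one_item), ("two or three items", multi_item)):
    print(label, {key: round(value, 2) for key, value in result.items()})
```

Under these assumed values the more sensitive screen misses far fewer depressed women but generates several times as many false-positive referrals, which is the cost concern raised for settings where resources are scarce.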

The issue of whether universal screening improves access to treatment, whether it is feasible, or benefits patients, is particularly complex in low-resource settings (Kagee et al. 2012; O'Hara et al. 2012). Barriers to routine screening for antenatal depression include lack of time, stigma, incomplete training, inattention by health professionals, and a lack of referral sources (Earls and The Committee on Psychosocial Aspects of Child Family Health 2010; Honikman et al. 2012). In LMIC settings, screening is often restricted by critical shortages in health care professionals at PHC level (Patel et al. 2009) and task shifting of primary care and prevention functions to community health care workers (CHW) is proposed as a means to improve maternal and child outcomes (Lewin et al. 2010). Approaches that incorporate CHWs in the detection and management of perinatal mental disorders have shown potential, with research demonstrating the capacity of CHWs to deliver treatment for both HIV (Selke et al. 2010) and maternal depression (Rahman 2005; Rahman et al. 2008). However, actualizing this potential at a larger scale requires short, effective screening measures to facilitate detection in two ways: firstly, by facilitating screening by CHWs who may identify risk at a household level and secondly, by facilitating screening by busy health practitioners within antenatal services. Shorter tools provide benefits from the health care perspective and have higher acceptability for women (Milgrom et al. 2011; Breedlove and Fryzelka 2011).

The Edinburgh Postnatal Depression Scale (EPDS) is the most widely used screening tool for the detection of perinatal depression (Lusskin et al. 2007; Breedlove and Fryzelka 2011). It is frequently used in LMIC settings (Parsons et al. 2011; Sawyer et al. 2010; Halbreich and Karkun 2006) and in HIV-endemic communities (Chibanda et al. 2010; Manikkam and Burns 2012), and is widely used in South Africa (Hartley et al. 2011; Rotheram-Borus et al. 2011; Rochat et al. 2006). Preliminary evidence from research in North America (Kabir et al. 2008) found that the three-item anxiety subscale of the EPDS was as effective as longer versions in identifying depression risk in the postnatal period. Research in Asia (Choi et al. 2012) found that two items of the EPDS predicted antenatal risk as well as the full 10-item version. However, interpretation of these findings is limited by the fact that neither of these studies included a diagnostic measure of depression. More recent research in the United States, examining a basket of commonly used items against clinical interview methods, found that a combination of two to three items identified depression as well as the EPDS10 and other commonly used screening tools (O'Hara et al. 2012).

In HIV-epidemic regions of Southern Africa, the burden of antenatal and postnatal depression has been shown to be as high as 30–50 % in multiple studies (Hartley et al. 2011; Manikkam and Burns 2012; Chibanda et al. 2010; Rochat et al. 2006; Rochat et al. 2011; Stewart et al. 2010). In these resource-scarce settings, women are known to be at high risk, but given low availability of resources, they are unlikely to be screened in primary care. Research in antenatal environments in South Africa illustrates that universal screening is feasible and acceptable but that shorter screening tools are needed to facilitate appropriate use of scarce resources in busy PHC settings (Honikman et al. 2012). Task shifting screening to CHWs who are able to screen at routine home visits requires simple, short screening tools with good sensitivity to ensure detection of women in need of referral, balanced with good specificity to ensure that referrals of false positives do not add to the load on already overburdened PHC resources (Kagee et al. 2012). Furthermore, at PHC level, short reliable screens that help nursing professionals determine which women should be referred to a medical officer would also help in making the best use of extremely scarce resources. Yet, no studies to date using clinical interview methods have examined the effectiveness of short and ultrashort versions of the EPDS for this purpose. The aim of this research is to test the hypothesis that shortened versions of the EPDS are as effective as longer versions in identifying antenatal depression as determined by a clinical interview diagnostic method.

Methods

Data were collected at a large centralized PHC facility located in an area with high HIV prevalence, in a predominantly rural part of South Africa (Tanser et al. 2008). The facility is staffed by 20–30 nurses offering a full range of PHC services to approximately 10,000 patients per month, including antenatal outpatient clinics, with an average of 160 first time antenatal attendees per month. This facility offers a 24-h service managing normal deliveries, with between 70 and 100 deliveries monthly (Houlihan et al. 2010). The subdistrict health services include a district hospital with 250 beds and 17 decentralized PHC clinics servicing a population of 228,000 over a geographical area of approximately 1,500 km2.

The research was based on an assessment at one time point, in the second half of pregnancy, and the details are described elsewhere (Rochat et al. 2011). Eligible women were required to attend routine antenatal care, be at least 16 years of age, and be resident in the study area. As part of Prevention of Mother to Child Transmission Programs (PMTCT), all women were screened for HIV in routine antenatal care. Women learning their HIV status for the first time during this current pregnancy were included, regardless of whether they tested HIV positive or HIV negative. HIV testing took place 2–3 weeks prior to the depression assessment (Rochat et al. 2006). Women living with HIV (WLH) prior to this pregnancy were considered a separate group, with different risk profiles, were referred to nonroutine antenatal care, and were thus excluded from the study. Women with chronic health problems such as diabetes and hypertension were also referred to specialist services and excluded. Written informed consent was obtained, and ethical approval was granted by the Biomedical Ethics Review Board of the University of KwaZulu-Natal (E193/3) and the Oxford Tropical Research Ethics Committee (OXTREC 014–04).

Women were interviewed in the local language (Zulu) using the major depression section of the Structured Clinical Interview for Depression (SCID) for Diagnostic and Statistical Manual of Mental Disorders (4th Edition) diagnoses and the EPDS administered in interview format. Anxiety was not assessed with the clinical interview method. Detailed information on cross-cultural validation and scoring methods for the presence/absence of a DSM-IV Major Depressive Episode (MDE) is available elsewhere (Rochat et al. 2011). The highest recommended cutoff of ≥13 was used to define probable depression on the EPDS. HIV status was collected via self-report and verified against clinic records.

Data were double entered for accuracy into Stata 11. Data analysis examined the sensitivity and specificity and the positive predictive value (PPV) of the EPDS against the depression outcome, determined by the SCID. Multiple regression techniques examined EPDS items to identify those items significantly associated with clinical depression; significance was set at p < 0.001. Previously published scoring techniques for using shorter versions of the EPDS were applied (Mitchell and Coyne 2007; Kabir et al. 2008; Choi et al. 2012). The same recommended EPDS cutoff ≥13 was used for all versions of the scale. Receiver operating characteristic (ROC) analysis examined four versions of the EPDS including the full 10-item EPDS (EPDS10), the traditional depression subscale (EPDS7), and the new five-item (EPDS5R) and three-item (EPDS3R) versions identified through item regression analysis. Given that anxiety was not measured using a gold standard measure, the three-item anxiety subscale was not examined in this analysis. Effectiveness was determined by the ability of versions or subscales to predict accurately the presence (sensitivity) or absence (specificity) of depression according to the “gold standard” clinical diagnostic measure. Various statistics were examined, including both positive and negative predictive values in order to take prevalence into account. To fully explore the data, likelihood ratios were calculated conventionally and weighted by prevalence. The kappa statistic and Cronbach's alpha were calculated for each version.
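The accuracy statistics named above all derive from the two-by-two table of screen result against reference diagnosis. The following is a minimal sketch of those definitions, not the Stata procedure used in the study. The class and function names are hypothetical, the counts are invented for illustration and are not the study data, and only the conventional (unweighted) likelihood ratios are shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Screen result cross-tabulated against the reference diagnosis."""
    tp: int  # screen positive, diagnosis positive
    fn: int  # screen negative, diagnosis positive
    fp: int  # screen positive, diagnosis negative
    tn: int  # screen negative, diagnosis negative


def diagnostic_summary(t: TwoByTwo) -> dict:
    """Standard accuracy statistics; assumes no row or column total is zero."""
    n = t.tp + t.fn + t.fp + t.tn
    sensitivity = t.tp / (t.tp + t.fn)
    specificity = t.tn / (t.tn + t.fp)
    observed = (t.tp + t.tn) / n  # proportion of women correctly classified
    # Agreement expected by chance, from the marginal totals (Cohen's kappa).
    expected = ((t.tp + t.fp) * (t.tp + t.fn)
                + (t.fn + t.tn) * (t.fp + t.tn)) / n ** 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": t.tp / (t.tp + t.fp),
        "npv": t.tn / (t.tn + t.fn),
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
        "correctly_classified": observed,
        "kappa": (observed - expected) / (1 - expected),
    }


# Hypothetical counts for illustration only; not the study data.
example = diagnostic_summary(TwoByTwo(tp=40, fn=10, fp=20, tn=80))
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity, specificity, and the likelihood ratios depend only on the test, whereas PPV and NPV shift with the prevalence in the sample, which is why the predictive values are reported alongside them.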

Results

Figure 1 outlines the sample recruitment as published elsewhere (Rochat et al. 2011). A total of 112 women (72 %) completed the depression assessment and comprised the sample for this analysis. While more HIV-positive women were lost to follow up, baseline data showed no significant distinguishing characteristics between women lost to follow-up and those interviewed.

Fig. 1 Sample recruitment (Rochat et al. 2011)

Sociodemographic characteristics (Table 1) are similar to those reported in the larger baseline cohort (Rochat et al. 2006). Most women were young, had completed some or all of their secondary education, were from low-income groups, were unmarried but in a stable relationship with the father of the child, and living with their families rather than cohabiting with partners. The majority (85.3 %) had unplanned pregnancies, and 49/109 (45 %) women were newly diagnosed with HIV during this pregnancy.

Table 1 Sample characteristics

The prevalence of depression was high, with similar rates found on the SCID (47 %, CI 37.2–56.3) and the EPDS10 (44 %, CI 34.57–53.50). Close to a third (14/51, 28 %) of depressed women reported episode duration of between 2 weeks and 2 months, while two thirds (24/51, 66 %) reported episode duration greater than 2 months, and only a few women reported duration as greater than 6 months. Fewer than a quarter (8/51 or 16 %) of depressed women reported a previous episode that had resolved prior to the pregnancy, and two of these eight reported that a previous episode had occurred in the postnatal period of a previous pregnancy. Slightly more HIV-positive women (n = 27) were depressed than HIV-negative women (n = 24). Depression and HIV status were not significantly associated [OR 1.84 (0.86–3.95), p = 0.117].

Regression analysis of items on the EPDS10 against the depression outcome (Table 2) showed five significant items in univariate analysis, and these included: items 2, 7, 8, 9, and 10. These items constituted the novel five-item version of the EPDS (EPDS5R) examined in ROC analysis. When these five items were entered into multivariable analysis, three items remained significant: items 2, 9, and 10. These three items constituted the novel three-item version (EPDS3R) examined in ROC analysis. We termed the five-item version, the short version and the three-item version, the ultrashort version, based on previous suggestions (Mitchell and Coyne 2007).
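To illustrate how a shortened version can be scored against the same cutoff of ≥13, the sketch below prorates the subset score to the 0–30 range of the full scale. This is an assumption: the text states that previously published scoring techniques were applied but does not spell out the rule, so proration is only one plausible reading. The item sets are those reported above; the function names and the example responses are hypothetical, and responses are assumed to be already coded 0–3 per item with any reverse scoring applied.

```python
EPDS5R_ITEMS = (2, 7, 8, 9, 10)  # five-item version reported in the text
EPDS3R_ITEMS = (2, 9, 10)        # three-item version reported in the text
FULL_ITEMS = tuple(range(1, 11))
CUTOFF = 13                      # cutoff used for all versions in the text


def prorated_score(responses, items):
    """Sum the chosen items and rescale to the 0-30 range of the full scale.

    Proration is an assumed scoring rule, not one confirmed by the source.
    `responses` maps item number (1-10) to a score of 0-3.
    """
    for item in items:
        if not 0 <= responses[item] <= 3:
            raise ValueError(f"item {item} must be scored 0-3")
    raw = sum(responses[item] for item in items)
    return raw * 10 / len(items)


def screens_positive(responses, items, cutoff=CUTOFF):
    """True when the prorated score meets or exceeds the cutoff."""
    return prorated_score(responses, items) >= cutoff


# Hypothetical respondent, for illustration only.
responses = {1: 0, 2: 2, 3: 1, 4: 1, 5: 0, 6: 1, 7: 1, 8: 2, 9: 2, 10: 0}
for label, items in (("EPDS10", FULL_ITEMS),
                     ("EPDS5R", EPDS5R_ITEMS),
                     ("EPDS3R", EPDS3R_ITEMS)):
    print(label, round(prorated_score(responses, items), 2),
          screens_positive(responses, items))
```

In this invented case the respondent falls below the cutoff on the full scale but above it on both shortened versions, showing how concentrating on the depressive items can change who is flagged.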

Table 2 EPDS items, standard and revised versions, and univariate and multivariate statistics by item

Table 3 details the performance of each of the four versions of the EPDS examined in data analysis, including PPVs, likelihood ratios, kappa, and Cronbach's alpha statistics. The EPDS10 showed sensitivity of 69 % and specificity of 78 %. An optimal cutoff ≥13 yielded the highest percentage of correctly classified cases (73.3 %). Cronbach's alpha for the EPDS10 was fair (α = 0.6130) but fell short of general guidelines for a stand-alone screening measure (α ≥ 0.70). The performance of the EPDS7 depression subscale was better than the full 10-item scale with high PPV (83.78), improved reliability (α = 0.7033), and a moderate kappa statistic (κ = 0.5129).

Table 3 Diagnostic validity of standard, short, and ultrashort versions of the EPDS for the clinical interview depression outcome

The novel five-item version (EPDS5R) that included five items from the seven-item depression subscale showed the best ROC statistic (AUC 0.8440) and alpha statistic (α = 0.7501) meeting the minimum guideline for a stand-alone tool. The ultrashort three-item version (EPDS3R), which included only three items of the seven-item subscale, had a high PPV (83.78) but was less reliable as a stand-alone tool (α = 0.6092); however, this may be an artifact of the small number of items included.
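Cronbach's alpha depends on the number of items as well as on how strongly they covary, which is why a lower value for a three-item scale may partly reflect its length. The sketch below shows the standard formula, α = k/(k − 1) × (1 − Σ item variances / variance of total score). It is an illustration rather than the Stata routine used in the study; the function name and the response data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents' item scores.

    `rows` holds one equal-length list of item scores per respondent.
    Assumes the total scores are not all identical (non-zero variance).
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != k for row in rows):
        raise ValueError("every respondent must answer the same items")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical two-item responses from four respondents, illustration only.
alpha = cronbach_alpha([[0, 1], [1, 0], [2, 3], [3, 2]])
print(round(alpha, 3))
```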

The ROC graph (Fig. 2) shows the performance of the full 10-item EPDS and the standard seven-item depression subscale. The seven-item depression subscale EPDS7 (AUC = 0.8323) performed slightly better than the full EPDS10 (AUC = 0.8169). Fig. 3 illustrates the performance of the 10-item, seven-item, and new short (EPDS5R) and ultrashort (EPDS3R) versions. The novel five-item (EPDS5R, AUC = 0.844) and three-item (EPDS3R, AUC = 0.8396) versions improve slightly on the performance of the EPDS7 and substantially on the performance of the EPDS10, despite requiring fewer items.
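The AUC values compared here have a direct interpretation: the probability that a randomly chosen depressed woman scores higher on the screen than a randomly chosen non-depressed woman, with ties counted as one half. The sketch below computes the AUC in this pairwise (Mann-Whitney) form, which is equivalent to the area under the empirical ROC curve. It is illustrative only; the function name and the scores are hypothetical, and it is not the Stata routine used in the study.

```python
def auc_pairwise(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison of scores.

    `case_scores` are screen scores of women with the diagnosis and
    `control_scores` those of women without it.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0   # case correctly ranked above control
            elif case == control:
                wins += 0.5   # tie counts as half
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical screen scores, for illustration only.
auc = auc_pairwise([3, 4, 5], [1, 2, 3])
print(round(auc, 4))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so differences in the third decimal place between versions should be read as small.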

Fig. 2 ROC curve for EPDS10 (full item version) and EPDS7 (depression subscale)

Fig. 3 ROC curve including EPDS10, EPDS7, and short EPDS5R and ultrashort EPDS3R

Discussion

The performance of the short EPDS5R and ultrashort EPDS3R versions supports evidence that shorter versions of the EPDS may be effective in detecting antenatal depression. This research raises new considerations for screening research in LMIC heavily affected by HIV. These include considerations related to the performance and validation of screening methods against clinical interviews, the importance of screening for high risk women, the potential of suicide ideation items in screening for risk in the antenatal period, and the feasibility and acceptability of using short and ultrashort tools in PHC in LMIC.

Rates of depression on the screening measure versus the clinical diagnostic interview

Firstly, in this research, we found that the rate of depression was higher on the clinical interview method than on the EPDS screening measure, a finding that is becoming more common in LMIC settings (Parsons et al. 2011) and warrants attention in discussions about screening and treatment in these settings (O'Hara et al. 2012). There is an almost complete absence of studies that include a clinical diagnostic measure from LMICs. More research studies are needed that include both screening and diagnostic tools. A better understanding of the ability of screening measures to detect risk in these settings may prove a more fruitful research agenda in Southern Africa and other LMIC, compared to research using only screening measures that assume similar test performances as seen in international contexts.

Screening for antenatal depression among high risk women in LMIC

The sociodemographic profile of the study women illustrates that these are high risk women, the majority of whom are low income, have unplanned pregnancies, and a high proportion of whom have been diagnosed HIV positive during the current pregnancy. This is an important finding because while low rates of detection are frequently reported in all settings, evidence suggests that low-income women are particularly likely to go undetected and are less likely to access mental health care (Kopelman et al. 2008). However, it is also important to consider that these particularly elevated risk situations may have influenced reporting of symptoms and the rates and severity of depression. Research is needed to replicate these findings among women from lower risk groups, given that high levels of chronic environmental risk and a sense of hopelessness may influence not only the onset and duration of depression but also the symptoms reported. An important point of similarity between this and other research (O'Hara et al. 2012; Dickens et al. 2012; Milgrom et al. 2008; Zelkowitz et al. 2008; Matthey et al. 2012) is that we found a relationship between the role of the EPDS depressive items (items 2, 8, and 9) and a chronic presentation of depression. In this study, two thirds of the women diagnosed with depression had episode duration greater than 2 months, suggesting a predominantly chronic presentation that may have influenced which EPDS items improved detection. Further research is required to elucidate the differences in acute and chronic presentations of depression across pregnancy and in the postnatal period, and to determine the role of a previous history of depression prior to the pregnancy more clearly.

The role of suicide ideation in screening for antenatal depression

Lastly, it is noteworthy that two of the three items found to be highly effective in predicting depression in this research, item 2 (loss of interest) and item 9 (mood), are similar to the two items included in the Patient Health Questionnaire (PHQ) screen that is increasingly recommended and shown to be effective in primary care (Smith et al. 2010). The third item, suicidal ideation (item 10) is rarely used in settings where resources to respond are limited (Dickens et al. 2012). Nonetheless, a recent review (Lindahl et al. 2005) found that pregnancy did not offer a protective effect for suicide ideation, with between 5 and 14 % of women reporting suicide ideation during pregnancy or the postnatal period. There is growing evidence of suicide ideation risk during pregnancy in LMIC (Asad et al. 2010; Breedlove and Fryzelka 2011; Choi et al. 2012) and high numbers of women in this research were suicidal, as reported elsewhere (Rochat et al. 2013). While suicide ideation and attempts are found to be lower during pregnancy and postpartum than in the general population of women, when deaths do occur, suicides account for up to 20 % of postpartum deaths (Oates 2003). Thus, given that this item has high predictive value for depression, its inclusion as a core item in a screening instrument seems highly appropriate.

The application of short and ultrashort versions of the EPDS in antenatal screening in LMIC

The potential of short and ultrashort screening tools for LMIC settings is apparent in two important health care service contexts: at the community level for screening by home visitors and CHWs, and at the PHC level for screening by nurses and other health care professionals and within prevention of mother to child transmission programming in geographical areas heavily affected by HIV. The use of the three- and five-item versions of the EPDS may have particular usability in ensuring that all opportunities for regular and repeated screening are maximized in low-resourced settings.

Community-level screening with the ultrashort EPDS3R

There are increasing calls for community-level human resources to be engaged in delivering PHC in resource-limited settings, under the supervision of nurse practitioners and midwives (Patel and Kirkwood 2008). Given that shorter tools increase the feasibility and acceptability of screening for both lay professionals and women alike, an ultrashort three-item screen could prove highly suitable for routine screening for depression during home visits by CHWs. The brevity, sensitivity, and user friendliness of this version make it suited to first-level screening, since it is unlikely to overburden CHWs or to miss women in need of further screening and referrals to higher level services. Routine use of the ultrashort three-item version by CHWs trained to offer psychoeducation and support could ensure that women identified as being at risk for depression are referred to a professional nurse for further assessment through antenatal services. This may also improve opportunities for contact with health care among depressed women given the evidence that the presence of antenatal depression can impact on uptake of antenatal care and engagement with health services.

Primary health care-level screening with the short EPDS5R

A concern in all LMIC is that a significant portion of antenatal care is delivered by PHC nurses, themselves considered a scarce and overburdened resource. This raises a different set of challenges at the PHC clinic level, where a lack of nurse resources and time constraints may degrade opportunities for routine mental health screening or intervention, regardless of community-level referral systems, and where the cost of treatment of false positives may make treatment unsustainable. While shorter tools may be key to making screening more acceptable and feasible for CHWs, at the level of nursing care, screening tool specificity and reliability also play an important role in ensuring that scarce resources are adequately expended (Kagee et al. 2012).

The slightly longer five-item version (EPDS5R) that has significantly improved reliability over the EPDS10 may have particular usability for nurse screening given that it is still short, performs well as a stand-alone assessment of depression, and has improved specificity over the three-item version. By using the EPDS5R as a tool to further screen women referred by CHWs, the health system would benefit from a filtering system that reduces the number of false positives ensuring that women who require higher level screening by medical doctors or psychologists are flagged and referred. This increased specificity occurs in a context where the nurse professional is also trained and able to make clinical judgments about the women's needs for care.

Barriers and challenges to universal screening at home and in clinics in LMIC

The usefulness of short and ultrashort screens aside, challenges still exist. For example, it is not clear whether increased detection and referral at a community level, or the availability of shorter, more reliable screening tools, would necessarily increase nurse responsiveness to community referrals or improve treatment access or cost effectiveness (Hewitt and Gilbody 2009). Several barriers, beyond detection, may play a role in lowered access and uptake of treatment by depressed women including: financial costs, lack of insurance, lack of transportation, long waiting periods for treatment, previous bad experiences with mental health services and concerns of stigma, reduced autonomy, and loss of maternal rights (Kopelman et al. 2008; Rochat et al. 2006; Paulden et al. 2009). These issues are important as, in some instances, universal screening has not been effective in improving outcomes (Miller et al. 2009) and where it has, the availability of treatment options alongside universal screening has been an important contributor to success. As such, the challenges to integration and delivery of mental health care in resource-poor settings are complex and not limited to the provision of short screening tools.

An equally important priority would be to establish and adapt appropriate community-level interventions (Patel et al. 2010) making use of community human resources to increase access to treatment and to evaluate user-friendly treatment and referral algorithms such as those recently published by the World Health Organization (World Health Organization 2010).

This study had a number of limitations including: the relatively small sample size, the possibility of selection bias, and the high prevalence of HIV in the sample that may have influenced the prevalence and severity of depression, although given the high rates of HIV among urban and rural women in many parts of sub-Saharan Africa (Karim et al. 2011), studies in such communities are important (Shao and Williamson 2012). These limitations raise questions about the generalizability of these findings, especially in low-risk samples. Clearly, replication and further research are urgently required. A major strength of the study is the use of a gold standard clinical diagnostic measure of depression in an under-researched population, in an LMIC. To date, the vast majority of this kind of research has been undertaken in high-income settings. There is increasing evidence of the detrimental role anxiety may play in pregnancy and the importance of screening for anxiety (Matthey et al. 2012; Meades and Ayers 2011). However, this study did not include a clinical measure of anxiety, and hence the performance of the three-item anxiety subscale of the EPDS in detecting antenatal anxiety could not be examined. Future research should examine both antenatal depression and anxiety.

Conclusions

The failure to adequately detect antenatal depression has wide-reaching consequences. Most importantly, it results in the loss of opportunities to prevent postnatal depression that, in turn, is known to result in ongoing difficulties for the mother, negative effects on the children, lowered access to quality antenatal and postnatal care, and higher health care costs (Alder et al. 2007; Oates 2003), regardless of the setting. While preliminary, these results offer support for new, innovative, and feasible approaches to detection of depression in pregnancy that may be well suited to resource-poor settings. However, further research is urgently required to test short screening tools against clinical diagnostic instruments in larger samples.