By 2020, depression is projected to become the second most common cause of loss of disability-adjusted life years in the world [1]. The majority of cases are diagnosed and managed by general practitioners [2]. Pharmacological and psychological interventions have been shown to be effective when tested in efficacy trials in well-controlled settings [3–5]. General practice has been criticized for inadequately recognizing and managing depression, and since the early 1990s there has been an increasing push for primary care to implement chronic illness management and collaborative care models to better manage depression [6]. A number of randomised trials testing these complex interventions for depression management have now been completed and published [6–10]. Policy-makers and clinicians are beginning to implement these models, yet it is not clear to what extent these interventions actually improve remission of depression and, if so, for how long. There have been calls for full remission and functional recovery to be regarded as the most important goal of treatment [11].

We have identified five relevant reviews published in six papers since 2001 [6–10]. Von Korff's editorial reviewed a selection of depression RCTs and concluded that case management was a key ingredient in achieving a positive outcome, yet did not review the quality of the trials included. Gilbody et al focussed on identifying and describing the educational and organisational interventions for the management of depression in primary care, yet did not focus on recovery from depression as an outcome, nor on trial quality. Badamgarav and colleagues focussed on management programs for depression care, but included non-randomised studies and were not specific to primary care. Bijl et al reviewed trials of disease management programs that included screening; they commented on the 'highly divergent' methodological quality of trials yet did not report a formal assessment of trial quality. Dawson et al undertook a meta-analysis of randomised trials recruiting subjects with major depressive disorder conducted in primary care using remission as a key outcome.

These recent systematic reviews have gathered together published articles of randomised trials aimed at improving the management of depression in primary care, yet they differ from this review in scope and inclusion criteria. None include information about trial quality and only one presents any data on recovery [7].

We report a systematic review of the randomized trials testing chronic illness management approaches for depression in primary care. We refer to these trials as 'systems trials' throughout the paper. We examine the quality of reporting of the published randomized trials and discuss the relevance of their findings to primary care led health systems.


We developed inclusion criteria to identify all randomised controlled trials implementing interventions at the 'system' level, aimed at management of depression in adult primary care populations and comparing the new 'system' of care with the existing or 'usual' care. Trials were included only if they used a validated tool to assess participants as depressed at baseline and included a follow-up measure of recovery or remission from depression (or results from which recovery levels could be determined). Clustered and individually randomised trials were included.

Trials were classified as at the 'system level' if they tested interventions that included all of the following:

  1. A multi-professional approach to patient care. This required that a general practitioner (GP) or family physician and at least one other health professional (e.g. nurse, psychologist, psychiatrist, pharmacist) were involved with patient care.

  2. A structured management plan. In line with introducing an organised approach to patient care 'systems' trials were required to offer practitioners access to evidence based management information. This could be in the form of guidelines or protocols. Interventions could include both pharmacological (e.g. antidepressant medication) and non-pharmacological interventions (e.g. patient screening, patient and provider education, counselling, cognitive behaviour therapy).

  3. Scheduled patient follow-ups. A 'systems' approach required interventions to have an organised approach to patient follow-up. We defined this as one or more scheduled telephone or in-person follow-up appointments to provide specific interventions, facilitate treatment adherence, or monitor symptoms or adverse effects.

  4. Enhanced inter-professional communication. This required that the intervention introduced mechanisms to facilitate communication between professionals caring for the depressed person. This included team meetings, case-conferences, individual consultation/supervision, shared medical records, patient-specific written or verbal feedback between care-givers and was sometimes referred to as 'collaborative care' in the publications.

As this review focussed on interventions for the general adult primary care population, studies that selected sub-groups of adult patients with depression (e.g. patients with specific co-morbidities, patients from specific cultural backgrounds only, samples of all women/men, post-natal depression, or elderly-only samples) were excluded.
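The inclusion and exclusion rules described above can be read as a simple conjunction of conditions. The following is a minimal sketch of that logic only; the `Trial` record and its field names are hypothetical labels introduced here for illustration and are not part of the review's actual data extraction template.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    """Hypothetical record of the features screened for each trial."""
    randomised_vs_usual_care: bool   # RCT comparing new 'system' with usual care
    validated_baseline_tool: bool    # depression assessed with a validated tool
    reports_recovery: bool           # recovery/remission measured or derivable
    multi_professional: bool         # criterion 1
    structured_plan: bool            # criterion 2
    scheduled_follow_up: bool        # criterion 3
    enhanced_communication: bool     # criterion 4
    restricted_subgroup: bool        # e.g. elderly-only, post-natal only


def is_system_level(trial: Trial) -> bool:
    # A trial is 'system level' only if ALL four components are present.
    return all((
        trial.multi_professional,
        trial.structured_plan,
        trial.scheduled_follow_up,
        trial.enhanced_communication,
    ))


def is_eligible(trial: Trial) -> bool:
    # Eligibility combines design, measurement and population requirements.
    return (
        trial.randomised_vs_usual_care
        and trial.validated_baseline_tool
        and trial.reports_recovery
        and is_system_level(trial)
        and not trial.restricted_subgroup
    )


# Hypothetical example: meets every requirement except scheduled follow-up.
example = Trial(True, True, True, True, True, False, True, False)
print(is_system_level(example), is_eligible(example))
```

Because the four components are combined with a logical AND, a trial missing any single component (as in the hypothetical example) is not classified as a 'systems trial', however strong its other features.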

Literature search

A search of Medline (Ovid, see Table 1) and the Cochrane Central Register of Controlled Trials (CCRCT) was conducted in July 2004 for all relevant English-language publications. Search terms included depression, primary care, general practice/practitioners and family practice/practitioners/physicians. Searches were conducted using each word-stem (e.g. depress*) to ensure all variants of each word were captured in the search. No limit was placed on the year of publication. For the Medline search, the search terms were combined with Cumbers and Wentz's strategy which is specific for identifying randomised controlled trials [12]. The search was repeated using PubMed and no further studies were identified. Titles and abstracts were independently read and reviewed by JG or JD, and short-listed articles were discussed by both researchers to determine eligibility. In addition to this search strategy, hand-searches of reference lists in relevant papers were conducted.
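Word-stem (truncation) searching can be illustrated outside a bibliographic database with a regular expression. The sketch below is an assumption-laden illustration only: it is not the Ovid Medline syntax or the strategy in Table 1, and the function names and example titles are hypothetical.

```python
import re


def stem_pattern(stem: str) -> re.Pattern:
    """Turn a truncated term such as 'depress*' into a case-insensitive regex.

    The trailing '*' is read as 'any further word characters', so
    'depress*' matches depression, depressed, depressive and so on.
    """
    base = stem.rstrip("*")
    return re.compile(r"\b" + re.escape(base) + r"\w*", re.IGNORECASE)


def matches_all(text: str, stems: list[str]) -> bool:
    # True only if every stem is found somewhere in the text.
    return all(stem_pattern(s).search(text) is not None for s in stems)


# Hypothetical titles, used purely to show the matching behaviour.
titles = [
    "Collaborative care for depressed patients in primary care",
    "Managing hypertension in primary care",
]
for title in titles:
    print(title, "->", matches_all(title, ["depress*", "primary care"]))
```

Truncation widens recall at the cost of precision, which is one reason such terms are usually combined with a trial-specific filter, as described above.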

Table 1 Medline search strategy

Data extraction

JD systematically extracted the following data from the papers: authors and year of publication, study setting and location, method of screening and inclusion/exclusion criteria, method and level of randomisation, components of interventions, sample size, attrition rates, follow-up times, recovery outcome measures and recovery results.

JD and JG independently examined each publication to assess the degree to which it was reported in accordance with CONSORT recommendations [13–15] and entered this information into a template designed using CONSORT criteria. Where a trial was reported in multiple publications we examined each publication in detail. Resulting tables were independently cross-checked by KH and GB. Any discrepancies were discussed until consensus was reached. Limitations of each trial were discussed by all authors until consensus was reached.


We identified 928 articles on the CCRCT, and 669 articles on Medline (many trials being identified on both databases, see Figure 1). Eleven trials met all inclusion criteria [16–26]. Trials that were described in multiple publications were considered as a single study and are named in this paper as the first published study.

Figure 1 Flow diagram of search for relevant publications.

Table 2 summarises the study location, inclusion criteria, randomisation method and study size; Additional file 1 summarises the characteristics of the interventions.

Table 2 Study location, inclusion criteria, method of randomization and sample size.

Representativeness of sample and generalisability of results

Ten of the eleven trials were undertaken in the USA and one in the UK [17]. Three trials [19, 21, 22] used a practice-based screening approach to identify cases of probable depression, whilst the remainder relied upon physician-made referral [16, 17, 20, 23–25] or screening of patients receiving a new antidepressant prescription [18, 26]. Details about the number of eligible cases not recruited into studies were not well reported. Where they were reported, they raise issues about the generalisability of the trial findings to the population of depressed primary care patients. For example, Rost reports that 16% of those approached refused screening and that 27% of those screened refused a baseline interview [27]. Five of the trials recruited only patients willing to take antidepressant medication [16, 18, 20, 24, 26]. The majority of interventions focussed on improving patients' compliance with antidepressant medication and only two trials specifically included a manualised non-pharmacological intervention [16, 21]. All trials were pragmatic trials undertaken in a real world clinical setting.

Table 3 summarises the quality of reporting of trials in accordance with CONSORT criteria (as judged by the authors). No trial was judged as adequately addressing all of the CONSORT criteria. All trials gave good descriptions of the actual interventions delivered. In general, however, the quality of trial reporting when assessed using CONSORT criteria was poor. Of the eleven identified trials, five were randomised by cluster and six by individual. The method used to generate the random allocation sequence was reported for seven trials, yet none included a clear description of the method used to implement the random sequence (allocation concealment). Other common omissions were clearly stated pre-specified objectives, documented primary and secondary outcomes and planned sub-group analyses, relevant sample size calculations, power to assess recovery and a clear diagram showing participant flow. Many studies inadequately reported attrition rates and even those that did report them failed to investigate how these rates could have influenced study findings. Only two trials reported any information about attempts to monitor adverse events. Blinding patients to allocation in a randomised trial of a mental health intervention is often impossible, yet blinding status was poorly reported and no paper discussed the potential biases introduced by the lack of blinding. Whilst statistical methods were generally well reported, many studies appeared to ignore the problems of multiple testing [28].

Table 3 The quality of reporting of trials in accordance with CONSORT criteria [13–15]

Table 4 summarises the follow-up times, attrition rates, measurement tools, blinding and recovery results. Recovery was defined as no longer satisfying criteria for probable depression using the scale included in the study. Some trials reported recovery results as proportions or odds ratios and it was impossible to accurately determine the actual numbers recovered or to independently calculate significance levels. Where actual numbers could be deduced we have included them in the table.

Table 4 Recovery results of system intervention trials.


We were unable to confidently use quantitative data synthesis techniques because of the mix of cluster and individually randomised trials, the lack of reporting of the actual numbers of participants who met recovery criteria, incomplete descriptions of participant flow, and variation in follow-up times, instruments used to measure outcomes, eligibility criteria, severity of depression and co-morbidities.

Trials reported outcomes at varying time-points: three, four, six, 12, 24 and 57 months. It was not always clear why these time-points were chosen. Eight of the trials reviewed showed an increase in the proportion of those recovered in favour of the intervention group (range from 10% to 33%) at the varying follow-up times. Attrition rates ranging from 5% to 50% were reported (see Table 4), yet were not taken into account in the reported recovery rates. No trial reported an intention to treat analysis. Four trials reported recovery outcomes at or beyond one year of follow-up [19, 21, 24, 29, 30], with three of these trials reporting findings in favour of the intervention [19, 21, 29, 30].
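To illustrate why unaccounted attrition matters, the sketch below compares a recovery proportion calculated among completers with proportions calculated over everyone randomised under two extreme assumptions about those lost to follow-up. This is one simple sensitivity approach, not a method used by any of the reviewed trials, and the counts are hypothetical rather than taken from Table 4.

```python
def completer_rate(recovered: int, followed_up: int) -> float:
    """Recovery proportion among participants with follow-up data only."""
    if followed_up <= 0:
        raise ValueError("followed_up must be positive")
    return recovered / followed_up


def worst_case_rate(recovered: int, randomised: int) -> float:
    """Assume every participant lost to follow-up did NOT recover."""
    if randomised <= 0:
        raise ValueError("randomised must be positive")
    return recovered / randomised


def best_case_rate(recovered: int, followed_up: int, randomised: int) -> float:
    """Assume every participant lost to follow-up DID recover."""
    lost = randomised - followed_up
    return (recovered + lost) / randomised


# Hypothetical arm: 100 randomised, 70 followed up (30% attrition), 35 recovered.
randomised, followed_up, recovered = 100, 70, 35
print(f"completers: {completer_rate(recovered, followed_up):.2f}")                    # 0.50
print(f"worst case: {worst_case_rate(recovered, randomised):.2f}")                     # 0.35
print(f"best case:  {best_case_rate(recovered, followed_up, randomised):.2f}")  # 0.65
```

With 30% attrition in this hypothetical arm, the plausible recovery proportion spans 35% to 65%, a range wider than the 10% to 33% between-group differences reported above, which is why recovery rates that ignore attrition are difficult to interpret.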


We identified eleven randomised trials testing a system level intervention in primary care and measuring recovery from depression as an outcome. We were able to use the CONSORT criteria and reach agreement about the quality of each trial reported. Overall the quality of reporting was poor. As expected, more recently published trials were more likely to report along CONSORT criteria, yet no trial fully addressed all criteria. Most of the published studies lacked power to measure the effect of the intervention on recovery. Few clearly stated pre-specified objectives and outcome measures. These limitations, coupled with the lack of intention to treat analysis and the practice of multiple testing and sub-group analyses, make the interpretation of results and the use of meta-analysis techniques problematic.

The trials used a variety of tools to assess depression and recovery and there appeared to be no consensus as to what constitutes a clinically meaningful outcome measure for testing interventions to reduce depression in primary care, nor the best tools to measure it.

Clinical implications

All but one of the trials reviewed were undertaken in the USA. We know that the primary health care system in the USA is very different from those of Europe, Canada, Australia and New Zealand. Translating the findings of systems based intervention trials between countries raises interesting challenges for researchers and policy makers, particularly if we acknowledge the complexity in health care [31].

Most of the trials recruited only patients willing to take, or already prescribed, antidepressant medication and all but one used primarily pharmacologically based interventions. The findings from these trials may not be relevant to the broader primary care population who prefer psychological treatment [32]. This is further supported by the work of Bower and Gilbody who report that system level collaborative care interventions tend to be tested on patients with more severe disorders and focus on drug treatment and patients at risk of relapse and recurrence [33]. These findings suggest the need to reconsider the applicability of system level intervention models to those with milder forms of depression.

Is this review biased?

Our review is biased as we have only included published papers that report recovery data and have judged the trials according to what is recorded in the publication. It should be kept in mind that publication bias tends to favour trials with a positive outcome, and recovery data are probably more likely to be reported when they favour an intervention. We purposefully did not contact authors of the papers included in this review as we wished to assess the evidence as it stands in the public domain. Our review is also limited to English language papers and it is possible that negative trials reported in non-English journals have been excluded.


System level interventions implemented in the USA, with patients willing to take anti-depressant medication, lead to a modest increase in recovery from depression. Whether or not such systems of care are cost-effective in the long-term is unresolved. The relevance of these interventions to countries that have stronger primary care systems (e.g. UK, Netherlands, Canada, Australia, NZ) is not known. It is inappropriate to assume that these types of interventions can be 'transplanted' to a different health care setting with the same effect as observed in the USA. We require adequately powered randomised trials to test the effectiveness of these models of care in settings outside the USA before widespread implementation occurs.

Outcomes for people experiencing depression are suboptimal [34] and it is almost certain that researchers, policy-makers and clinicians will maintain an interest in re-defining the system of depression care in the community setting [6, 35]. It is important that we have high quality randomised trial data to support any major re-engineering of primary care and it appears from our review that the trials testing systems of care for depression managed in the community have suffered from many of the common pitfalls outlined by Chalmers [36].

As a community we need to agree upon the measures to be used when assessing effectiveness of interventions for depression. This is a complicated issue in itself, and Dowrick highlights the need for debate on how we view and measure depression [37]. If we agree that functional recovery and full remission are the goals of management [11], we need to agree upon a consistent way of measuring them.

We hope that this review will assist researchers developing trial protocols for interventions aimed at reducing depression, by encouraging them to think again about: defining the components of their system intervention, planning for a publication that addresses CONSORT reporting criteria, contributing their data to a quantitative meta-analysis and including a cost-effectiveness data analysis.