Key Points for Decision Makers

Substantial methodological differences exist between model-based studies evaluating depression treatments.

Simpler models, such as decision trees and cohort-based Markov models, were more frequently used.

Microsimulation models, such as individual-based state-transition models and discrete-event simulation models, incorporate patient heterogeneity and history, which is important when modelling depression.

1 Introduction

Depression is associated with an enormous burden for patients, healthcare systems and society as a whole. The World Health Organization has estimated that depression will rank first among the most debilitating conditions in terms of disability-adjusted life-years (DALYs) by the year 2030 [1]. Individuals with depression experience impairments in many aspects of their life, resulting in significantly lower quality of life (QoL) than the general population [2,3,4]. Additionally, depression leads to an immense economic burden [5]. In the USA, the total costs of depression were estimated to be $US173 billion in 2005 and $US210 billion in 2010 [6], whereas in Europe the total expenses related to depression were estimated to be approximately €92 billion in 2010 [7, 8].

Depression is very common, with a lifetime prevalence between 10 and 15% [9]. Patients with depression constitute a rather heterogeneous group [10], which may also affect the effectiveness and cost effectiveness of treatments [11, 12]. Evidence exists that various evidence-based treatments are similarly effective against depression, but whether specific treatments can be more beneficial for patients with specific characteristics remains unclear [13]. More than 50% of the patients seeking treatment for depression experience recurrent depressive episodes, the mean duration of which is 16 weeks [14]. Previous episodes are significantly related to the probability of recurrence, with the risk for a new depressive episode increasing by 16% for each successive recurrence [15]. Furthermore, the probability of recurrence decreases as the period that the patient is in recovery increases. Depression is often comorbid with other mental disorders, such as anxiety or substance abuse, or with physical disorders such as diabetes [14, 16]. Therefore, the personal history of each individual is important for the prognosis of depression.

Economic evaluations are frequently used to inform decisions about the allocation of scarce resources in healthcare [17, 18]. They can be conducted either alongside randomized controlled trials (RCTs) or using economic modelling techniques. The current paper focuses on modelling techniques. The three dominant modelling methods used in model-based studies for common mental health disorders are decision trees (DTs), state-transition Markov models (MMs), and discrete-event simulation (DES) models [19,20,21]. These modelling techniques have different strengths and limitations, and the choice of one over the other depends on several factors, such as the complexity of the decision problem, data accessibility and available expertise [22]. A recent literature review compared MMs and DES for the economic evaluation of healthcare technologies such as breast cancer treatments without focusing specifically on depression [20]. The authors concluded that DES showed a few profound advantages over MMs but that the final choice depends on the research question being addressed [20]. So far, the most appropriate modelling method to use when simulating the course of depression and the impact of treatment on this course remains unclear.

Economic modelling in the research field of depression has received increased attention in recent years. Afzali et al. [21] conducted a systematic literature search up to September 2010 reviewing model-based cost-effectiveness studies of depression treatments. The authors focused on the characteristics of the model-based studies and evaluated how these characteristics influenced the results. They identified wide variation in the methodological aspects of the included studies and concluded that these differences potentially influence the accuracy of the cost-effectiveness estimations [21]. Another systematic review examined model-based studies of pharmacotherapy for major depression, published up to May 2010, and reached similar conclusions [23].

Many additional modelling studies have been published since 2010, and we considered it opportune to update the review by Afzali et al. [21] because periodic updating of evidence is important to avoid stakeholders using outdated evidence to inform decisions or address new problems. Therefore, the first objective of the current study was to investigate the methodological characteristics of model-based studies examining the cost effectiveness of treatments for depression. Second, we aimed to examine which of the modelling methods was more appropriate for simulating the course of depression.

2 Methods

2.1 Literature Search

To answer the first research question, we updated the systematic review by Afzali et al. [21]. The first author (SK) searched the databases PubMed, EMBASE and PsycInfo for eligible studies published between 1 January 2002 and 1 October 2016. This timeframe overlaps with that examined by Afzali et al. [21] (2002–2010) because we developed our own search strategy and wanted to ensure the same search criteria were applied for all years considered. We used search terms referring to model-based economic evaluations and depression treatments (see Appendix 1 in the Electronic Supplementary Material [ESM] for the full search string). We searched for published studies written in English. No protocol was published for this review. We input all entries from the database searches into a reference manager and removed the duplicates. The titles and abstracts were screened for eligibility, and the full texts were then examined.

2.2 Inclusion Criteria

Studies were included when they fulfilled the following criteria: (1) used a health economic model, (2) examined the cost effectiveness of treatments for adults with depression and (3) estimated quality-adjusted life-years (QALYs) or DALYs. Health economic models were defined as a mathematical representation of reality that can be used to estimate the cost effectiveness of health technologies for depression [21]. Depression was defined as a diagnosis of major depressive disorder based on a structured clinical interview or a score on a standardized self-report measure of depressive symptom severity that indicates the presence of clinically relevant depression. We included studies using QALYs or DALYs, which is similar to the inclusion criteria used by Afzali et al. [21]. This approach was adopted to increase the homogeneity of the reviewed studies. At the same time, the amelioration of the QoL of patients with depression is one of the main goals of treatment for depression and thus a relevant outcome in the context of this study [24]. No restrictions were applied on treatments evaluated (i.e. pharmacotherapy, psychotherapy, combination treatments, other) or the control groups. Studies on the prevention of depression were excluded. Any ambiguity around whether a study should be included was discussed with another author (JEB) until consensus was reached.

2.3 Data Extraction and Analysis

Data were extracted on the modelling technique used (e.g. DTs, cohort-based state-transition MMs [CMMs]), the structure of the model (i.e. what health states/events were used to represent the natural course of depression), data sources (e.g. meta-analysis, RCTs), time horizon and cycle length, treatment comparisons (i.e. which intervention and control groups were compared), sensitivity analyses and economic perspective (e.g. societal or healthcare system perspective). Data extraction was performed by the first author (SK) using a data form developed based on the review by Afzali et al. [21]. When data were missing or unclear, we emailed the first author of the paper to request additional information. The extracted data were presented using descriptive statistics such as means and percentages whenever possible.

2.4 Comparison of Modelling Techniques

We grouped the modelling methods into four classes: DTs, CMMs, individual-based state-transition models (ISMs), and DES models (see the Results section for a description of each modelling method). We used 11 predefined criteria to evaluate the appropriateness of each modelling technique for simulating the course of depression. These criteria were introduced by Brennan et al. [25] to provide guidance on key requirements of modelling methods and have been previously used in schizophrenia [26]. We aimed to extract information on the included studies for each criterion (e.g. how much building time the model required, were data available as required to populate the model, did the authors comment on the simulation time required, etc.). If information was not available in the included studies for a specific criterion, we searched the literature to identify data on the performance of each model on this criterion. The criteria were building time, data collection, experience, simulation time, clinical representation, patient heterogeneity, timing of events, memory, patient interaction, interaction due to covariates and variability (see Table 1 for a short definition of each criterion). We discussed the performance of each model in each criterion specifically for depression.

Table 1 Description of 11 criteria used to evaluate the appropriateness of each modelling technique to simulate the course of depression.

3 Results

3.1 Literature Search

Figure 1 describes the results of the literature search. The search yielded 7757 titles; 6570 references remained after duplicates were removed. After screening the titles and abstracts, we retrieved the full texts of 113 studies and excluded 72 (16 no depression, 35 no model-based studies, 18 no QALYs or DALYs, two prevention of depression, and one no adults). We included 41 studies compared with the 14 studies included by Afzali et al. [21] (Fig. 1).

Fig. 1 Flowchart describing the search strategy. DALY disability-adjusted life-year, QALY quality-adjusted life-year

The included studies used different health economic modelling techniques (Table 2): 21 (51%) studies used DTs, 15 (37%) used CMMs, two (5%) used ISMs and three (7%) used DES modelling techniques.

Table 2 Summary of model-based studies of depression

3.2 Characteristics of Model-Based Studies for Depression

3.2.1 Time Horizon and Cycle Length

In the studies using DTs, the time horizon varied considerably between 6 weeks and 27 months. Likewise, in those using CMMs, the time horizon varied between 4 months and lifetime and the cycle length ranged from 1 week to 6 months (Table 2). In the two studies using ISMs, the time horizon was 3 and 5 years and the cycle length 2 months and 1 month, respectively. Finally, one of the three studies that used DES models had a time horizon of 5 years [27] and the other two used a lifetime time horizon (Table 2) [28, 29].

3.2.2 Perspective

In total, 12 (29%) studies adopted the societal perspective, which included all relevant direct and indirect healthcare costs and lost productivity costs, and 21 studies (51%) used the healthcare perspective, which included only healthcare costs. Six studies (15%) used the managed care organization perspective, which included the costs from the perspective of the organization providing the healthcare (Table 2), and, finally, one study (2%) used the employer’s perspective and one study (2%) used the payer’s perspective.

3.2.3 Treatment and Comparator

Just over half of the studies (22 [54%]) evaluated the cost effectiveness of pharmacological treatments for depression. Seven (17%) examined combined treatments, mostly a combination of antidepressants and psychological treatments. Four (10%) investigated psychological treatments, and two (5%) investigated antidepressant treatment in combination with genotyping. Two studies (5%) examined electroconvulsive therapy (ECT) and two (5%) studied repetitive transcranial magnetic stimulation (rTMS). Finally, two studies (5%) examined the cost effectiveness of positive mental training and St John’s Wort (Table 2).

Table 2 also presents the comparator in the included models: 23 (56%) studies used antidepressant medication as a comparator; seven (17%) used more than one comparator, including psychological treatments, no treatment, and combination treatments; and five (12%) compared the treatment under study with treatment as usual. Other comparators were computerized cognitive behavioural therapy, placebo pills, rTMS, and ECT (Table 2).

3.2.4 Structure of the Models

The structure of the models varied considerably between the studies. The most frequently used health states or events were response, remission, relapse and death (Table 2). Three studies (7%) used different states for different levels of depression severity (i.e. minimal, mild, moderate and severe depressive symptoms), and 13 studies (32%) included one or more of the health states discontinuation, second-line options and adverse events related to antidepressant medication.

3.2.5 Data Sources

The data sources that can be used as inputs for a health economic model have different strengths and limitations, but they must all be transparent and well documented [30, 31]. The included studies used various data sources to populate the model (Table 3). These sources included but were not limited to published RCTs, meta-analyses of RCTs, observational studies and expert opinions.

Table 3 Data sources and sensitivity analysis

A total of 27 studies (66%) used meta-analysis to derive clinical inputs. Since meta-analyses are based on a systematic literature search using predefined inclusion criteria, the possibility of selection bias is decreased. Expert opinion, which is considered an unreliable data source because of its subjective nature, was used in 15 studies (37%). Most of the studies used published data sources to estimate utility weights, such as RCTs or observational studies. Two studies (5%) used a depressive symptom severity measure to calculate utility weights, and two other studies (5%) converted depression-free days to utility weights (Table 3).

3.2.6 Sensitivity Analysis

All included studies carried out sensitivity analyses to examine the uncertainty around the study parameters (Table 3). The two basic purposes of sensitivity analysis are to investigate the robustness of a chosen course of action and to accumulate additional information surrounding a specific decision [32]. Of the included studies, 29 (71%) performed a univariate sensitivity analysis, six (15%) carried out a multivariate sensitivity analysis and six (15%) conducted both univariate and multivariate analyses (Table 3). The sensitivity analysis was probabilistic in 31 (76%) of the studies, deterministic in three (7%) and both probabilistic and deterministic in seven (17%).
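To illustrate what a probabilistic sensitivity analysis involves, the following minimal Python sketch draws uncertain inputs from distributions and reports how often a treatment has a positive net benefit. It is not taken from any included study: the distributions, their parameters, the fixed QALY gain per remission and the willingness-to-pay threshold are all hypothetical assumptions.

```python
import random

def psa(n_draws=2000, threshold=20000.0, seed=7):
    """Return the share of draws in which the treatment has a positive net benefit."""
    rng = random.Random(seed)
    favourable = 0
    for _ in range(n_draws):
        # Assumed distributions: beta for probabilities, gamma for costs.
        p_remit_treat = rng.betavariate(60, 40)
        p_remit_comp = rng.betavariate(45, 55)
        extra_cost = rng.gammavariate(16.0, 25.0)  # shape 16, scale 25: mean 400
        qaly_gain_per_remission = 0.2  # hypothetical value, held fixed for simplicity
        inc_qaly = (p_remit_treat - p_remit_comp) * qaly_gain_per_remission
        net_benefit = threshold * inc_qaly - extra_cost
        if net_benefit > 0:
            favourable += 1
    return favourable / n_draws

prob_cost_effective = psa()
```

Repeating the call over a range of thresholds would trace out a cost-effectiveness acceptability curve; a univariate deterministic analysis would instead vary one input at a time while holding the others at their base-case values.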

3.3 Comparison of Modelling Techniques

We evaluated the appropriateness of the four modelling methods used in the included studies (DTs, CMMs, ISMs and DES) against the 11 predefined criteria [25], using the information extracted from those studies. An overview of this evaluation is presented in Table 4. A description of the criteria applied to each modelling method, with a specific focus on the importance of these criteria when modelling the cost effectiveness of depression treatments, follows.

Table 4 Strengths and weaknesses of four modelling techniques for simulating the course of depression

3.3.1 Decision Trees

DTs scored positively in four of the 11 criteria (Table 4). DTs are used to represent relatively simple decision problems and are mainly appropriate for modelling interventions over short time horizons [22]. This is reflected in our finding that the average time horizon in the studies using DTs was 10 months. They are generally simple to build and analyse. A DT typically includes a small number of health states and has low data requirements. This was also reflected in the studies included in this review, as the majority used between two and four health states (Table 2). This may pose a threat to the clinical representation of depression in these models, since depression typically has an intermittent course. DT is probably the most popular modelling technique in simulating depression (n = 21 [51%]) because of the low requirements for time to develop, data inputs and expertise. For example, most of the included studies used dedicated software such as TreeAge Pro [33] or other software such as Microsoft® Excel and WinBUGS [34] to develop the DTs.
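The simplicity of a DT can be seen in a minimal Python sketch that rolls back a two-branch tree per strategy to expected costs and QALYs. The branch probabilities, costs and QALY values below are hypothetical placeholders and are not taken from any of the included studies.

```python
# Hypothetical decision tree: each strategy is a list of
# (probability, cost, QALYs) branches, e.g. remission vs. no remission.
def roll_back(branches):
    """Return the expected cost and expected QALYs of one strategy."""
    total_p = sum(p for p, _, _ in branches)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("branch probabilities must sum to 1")
    exp_cost = sum(p * c for p, c, _ in branches)
    exp_qaly = sum(p * q for p, _, q in branches)
    return exp_cost, exp_qaly

# Illustrative inputs only.
treatment = [(0.6, 1000.0, 0.80), (0.4, 1500.0, 0.60)]
comparator = [(0.4, 500.0, 0.80), (0.6, 1000.0, 0.60)]

cost_t, qaly_t = roll_back(treatment)
cost_c, qaly_c = roll_back(comparator)

# Incremental cost-effectiveness ratio: extra cost per QALY gained.
icer = (cost_t - cost_c) / (qaly_t - qaly_c)
```

Note that nothing in this structure represents the passing of time or a patient's history; each branch is a single path from decision to outcome.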

A DT typically focuses on the average patient. For instance, the included studies using DTs did not differentiate between patients with comorbidity or different age groups while simulating the course of depression. Furthermore, a DT does not track the individual history of a patient, and the passing of time is not included in the model. This means that DTs do not include variability in patient characteristics over time (e.g. an increased probability of a future recurrence after each additional depressive episode). This is a disadvantage of DTs, because depression is a chronic disorder for many patients, and it is important to model long-term outcomes that are influenced by previous events.

The time horizon used in the included DT studies varied between 6 weeks and 27 months. This may be a short period for a recurrent disorder such as depression, for which the average episode is 16 weeks and more than half of patients experience more than one episode [14]. However, a longer time horizon would substantially increase the complexity of the model, making it difficult to handle. In addition, patient heterogeneity and interactions between variables are typically not included in DTs, but this can be implemented by running separate models for each subgroup or adding more health states to the model, respectively. However, this may also lead to an exponential increase in model size. These issues may decrease the face validity of DT models for depression. Only three of the reviewed studies that included DTs explicitly reported validation methods. Two used expert panels to validate the data sources and one used individual panellists to verify the model estimates [35,36,37].

3.3.2 Cohort-Based State-Transition Markov Model

The CMM scored positively in five of the 11 criteria (Table 4). The number of studies using CMMs (15 [37%]) indicated it was the second most popular technique to model depression. A CMM is most suitable for situations in which the decision problem can be represented in terms of health states and the population under study is a closed cohort [38]. Portions of the cohort simultaneously move from one health state to the other. The model is ‘memoryless’, a condition known as the ‘Markovian property’, meaning that future health states depend only on the current health state and not on the sequence of previous health states [38]. Therefore, the included studies using CMM did not account for patient history since the initiation of the simulation (e.g. time spent in previous health states). This is an essential drawback of using a CMM to simulate depression, because various patient characteristics (e.g. baseline depression severity, comorbidity, previous episodes, etc.) play an important role in prognosis and treatment decisions [39,40,41].
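The cohort mechanics and the Markovian property can be sketched in a few lines of Python. The three health states and all transition probabilities below are illustrative assumptions, not estimates from the included studies.

```python
# Three hypothetical health states.
STATES = ["depressed", "remission", "dead"]

# Row = current state, column = next state; each row sums to 1.
# Values are illustrative assumptions only.
P = [
    [0.70, 0.29, 0.01],  # from depressed
    [0.10, 0.89, 0.01],  # from remission
    [0.00, 0.00, 1.00],  # dead is an absorbing state
]

def step(cohort, matrix):
    """Advance the cohort distribution by one cycle."""
    n = len(cohort)
    return [sum(cohort[i] * matrix[i][j] for i in range(n)) for j in range(n)]

def run_cohort(start, matrix, n_cycles):
    """Return the Markov trace: the cohort distribution at every cycle."""
    trace = [start]
    for _ in range(n_cycles):
        trace.append(step(trace[-1], matrix))
    return trace

# The whole cohort starts in the depressed state.
trace = run_cohort([1.0, 0.0, 0.0], P, n_cycles=12)
```

Because `step` receives only the current distribution, a patient entering remission for the third time faces the same relapse probability as one entering it for the first time, which is the memoryless behaviour described above.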

Nevertheless, additional health states or ‘tunnel’ states can be added to model patient history and heterogeneity or timing of events such as time spent in a depressive episode; these models are also called semi-Markov models. However, this may increase the complexity of the model considerably, resulting in a model that is rather difficult for analysts and researchers to manage and for decision makers and other stakeholders to interpret [38]. Only three of the 15 included studies used ‘tunnel states’. Thus, most studies using CMM did not include the timing of events and patient heterogeneity in the models.
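A minimal Python sketch, under assumed inputs, shows how tunnel states let the relapse probability depend on time spent in remission, and why the state space grows. The relapse and remission probabilities are hypothetical.

```python
# Hypothetical per-cycle relapse probabilities by number of cycles already
# spent in remission; the last value applies to all later cycles.
RELAPSE_BY_CYCLE = [0.20, 0.12, 0.06]

def build_matrix(relapse_by_cycle, p_remit=0.30):
    """Build a transition matrix with one tunnel state per cycle in remission.

    State order: depressed, remission_1, ..., remission_k.
    p_remit is an assumed per-cycle probability of remission.
    """
    k = len(relapse_by_cycle)
    names = ["depressed"] + [f"remission_{i + 1}" for i in range(k)]
    size = k + 1
    matrix = [[0.0] * size for _ in range(size)]
    matrix[0][0] = 1.0 - p_remit
    matrix[0][1] = p_remit  # remission always starts in the first tunnel state
    for i, p_relapse in enumerate(relapse_by_cycle):
        row = i + 1
        nxt = min(row + 1, k)  # the last tunnel state loops onto itself
        matrix[row][0] = p_relapse
        matrix[row][nxt] += 1.0 - p_relapse
    return names, matrix

names, matrix = build_matrix(RELAPSE_BY_CYCLE)
```

Here a single remission state has become three. Tracking the number of previous episodes or patient subgroups in the same way would multiply the states again, which is the growth in complexity noted above.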

CMM is relatively easy to build, and most of the included studies used Microsoft® Excel and Visual Basic applications or dedicated software such as TreeAge Pro [33] to develop the model. An advantage of CMM is that, similar to DTs, it has relatively low data requirements. However, in comparison with DTs, a CMM can more easily simulate different courses of disease and thus has better clinical representation and face validity. Other validation methods were reported in four CMM studies, including validation of the model structure and approach by a panel of clinicians, verification of the model algorithm (internal validity), validation of the model’s assumptions by experts, and comparison of model predictions with estimates from published literature (external validity) [42,43,44,45].

3.3.3 Individual-Based State-Transition Model

The ISM scored positively in six of the 11 criteria (Table 4). Two studies employed an ISM, which is a state-transition model like a CMM but is not limited by the Markovian property [38]. Instead of cohort simulation, ISM is a microsimulation model that simulates one patient at a time. The two included studies that used ISM tracked the individual history of the patients (in terms of increased risk for a future depressive episode after each new episode). They also included patient heterogeneity in the model by providing different transition probabilities based on patient characteristics, such as sex and age. In other words, ISM can include timing of events and memory in the model. However, this may increase the development and running time of the model considerably. The two studies used TreeAge Pro [33] to build the model.
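The following Python sketch illustrates, under stated assumptions, how an individual-based model can carry memory and heterogeneity. It is not the model used in either included study. The 16% increase in relapse risk per episode follows the figure cited in the Introduction, but applying it as a multiplicative factor on a monthly probability is our simplifying assumption, and the baseline probabilities, the share of women and the 36-cycle horizon are hypothetical.

```python
import random

def simulate_patient(rng, female, n_cycles=36):
    """Simulate one patient over monthly cycles; return the number of episodes."""
    state = "depressed"
    episodes = 1  # the index episode
    for _ in range(n_cycles):
        if state == "depressed":
            if rng.random() < 0.25:  # assumed monthly remission probability
                state = "remission"
        else:
            # Heterogeneity: assumed baseline relapse probability by sex.
            base = 0.03 if female else 0.02
            # Memory: risk rises by 16% per previous episode (assumed form).
            p_relapse = min(base * 1.16 ** (episodes - 1), 1.0)
            if rng.random() < p_relapse:
                state = "depressed"
                episodes += 1
    return episodes

def simulate_cohort(n_patients, seed=1):
    """Simulate patients one at a time and collect their episode counts."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_patients):
        female = rng.random() < 0.6  # assumed share of women
        results.append(simulate_patient(rng, female))
    return results

episodes = simulate_cohort(1000)
mean_episodes = sum(episodes) / len(episodes)
```

The per-patient variable `episodes` is exactly the kind of history that a cohort model can only approximate with additional states. The price is that outcomes are now subject to sampling noise, so many patients must be simulated to obtain stable averages.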

The complexity of the model means data requirements are also higher than with DT and CMM (e.g. separate transition probabilities must be found for men and women or patients in different age groups), and this is a considerable issue. Although the volume of meta-analyses, RCTs and observational studies investigating depression in the literature is always increasing, there is no consensus on instruments to monitor the severity of depression and on the definitions of related health states such as remission and response. As a result, searching for data to populate the models can be inefficient and time consuming. Data availability was also an issue in the included studies; 37% relied at least partly on expert opinion as a data source for some clinical or cost parameters.

A further issue related to the use of ISM that also applies to CMM is the need to define a fixed cycle length [38, 46]. The choice of cycle length depends upon treatment length, data availability and frequency of clinical events, among others. Shorter cycle lengths give a better approximation of the events in real life but increase the computational burden in terms of data requirements and running time [38]. The cycle length should be short enough to incorporate the impact of important events such as treatment outcomes. Thus, considering the intermittent course of depression over time, cycle lengths longer than 3 months may not be appropriate. The two studies we identified as using ISMs had cycle lengths of 1 and 2 months.
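When the cycle length of a model differs from the period over which a probability was reported, the probability has to be rescaled. A commonly used conversion, sketched below in Python, passes through a rate and assumes that the event rate is constant over the period; this assumption and the input values are ours and are not reported by the included studies.

```python
import math

def rescale_probability(p, from_length, to_length):
    """Convert a transition probability between cycle lengths.

    Assumes a constant event rate: rate = -ln(1 - p) / from_length.
    Both lengths must be in the same time unit.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    rate = -math.log(1.0 - p) / from_length
    return 1.0 - math.exp(-rate * to_length)

# Hypothetical 3-month relapse probability converted to a 1-month cycle.
p_monthly = rescale_probability(0.30, from_length=3.0, to_length=1.0)
```

Simply dividing the 3-month probability by three would understate the monthly value, because the proportion of patients still at risk shrinks with every cycle.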

The authors of these two studies did not provide an explicit explanation regarding why they chose a microsimulation approach instead of a CMM. The ability of studies using ISM to model patient heterogeneity and memory can potentially improve the face validity of the model. The authors did not provide further information on other formal methods they used to validate the models.

3.3.4 Discrete-Event Simulation

DES modelling scored positively in seven of the 11 criteria (Table 4). It has been used extensively for other healthcare applications, such as surgical disciplines, but we identified only three studies of depression (7%). Using DES results in a microsimulation model in which each patient is simulated separately, similar to using an ISM. DES models do not use fixed cycles; instead, a constantly running simulation clock tracks time [47]. Thus, the model advances from one event to another and these interim periods (e.g. remission from depression) can be captured by referring to the simulation clock [47]. The time at which an event occurs is based on a survival function that models the time to that specific event.
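A minimal Python sketch of this event-driven logic follows. It assumes exponentially distributed times to remission and relapse, which is our simplification; the 16-week mean episode duration and the 16% figure come from the Introduction, whereas the 104-week mean time to relapse, the way the 16% is applied and the 5-year horizon are hypothetical.

```python
import random

def simulate_patient(rng, horizon_weeks=260.0):
    """Simulate one patient with a running clock.

    Returns (number of episodes, weeks spent depressed).
    """
    clock = 0.0
    depressed = True
    episodes = 1  # the index episode
    weeks_depressed = 0.0
    while True:
        if depressed:
            # Time to remission: exponential, assumed mean of 16 weeks.
            duration = rng.expovariate(1.0 / 16.0)
        else:
            # Time to relapse: assumed mean of 104 weeks, shortened by a
            # factor of 1.16 for each previous episode (assumed form).
            mean_time = 104.0 / (1.16 ** (episodes - 1))
            duration = rng.expovariate(1.0 / mean_time)
        remaining = horizon_weeks - clock
        if duration >= remaining:
            # The next event falls beyond the time horizon: stop here.
            if depressed:
                weeks_depressed += remaining
            break
        if depressed:
            weeks_depressed += duration
        clock += duration  # jump straight to the next event, no fixed cycles
        depressed = not depressed
        if depressed:
            episodes += 1
    return episodes, weeks_depressed

rng = random.Random(42)
results = [simulate_patient(rng) for _ in range(1000)]
mean_weeks_depressed = sum(w for _, w in results) / len(results)
```

The clock advances by the sampled time to the next event rather than by a fixed step, so events can occur at any moment and no cycle length has to be chosen.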

Because of this flexible time management, an event can occur at any time, which is particularly useful when simulating recurrent disorders such as depression, where most patients are expected to experience multiple episodes. This flexibility in time can be approximated in state-transition models such as CMM and ISM. For instance, one of the included studies that used CMM had a cycle length of 2 weeks. Nevertheless, using very short cycle lengths may increase the running time of the state-transition models considerably.

Although patient interactions such as waiting lists, which are common in specialized depression care, can also be implemented in the model, our literature search did not find any studies using this capability. The complexity of the model means the building and running time can be substantial. The expertise required to build and analyse DES models may be behind the limited use of this type of modelling in depression thus far. Nevertheless, dedicated software for DES models does exist, such as Simul8 [48], which one of the three studies used. The other two studies used Microsoft® Excel. In addition, data requirements are higher, posing a concern similar to that discussed for the ISM. However, DES is expected to represent the course of depression more naturally than DT or CMM because it is possible to model the effects of patient characteristics on the course of depression [49].

The authors of studies using DES models reported that they did so because of the complexity of the model they wanted to build and the ability to include patient heterogeneity and history [28, 29]. These advantages of DES modelling can enhance the face validity of the model. One [29] of the three studies described other methods of validation, including a workshop developed to validate the conceptual and mathematical models, testing of the model code and results by a second analyst, and error checking by the project team.

4 Discussion

This study aimed to review the methodological characteristics of model-based studies examining the cost effectiveness of treatments for depression and to evaluate which of the modelling methods was more appropriate for simulating the course of the disorder. The 41 studies had diverse methodological aspects and therefore it was difficult to compare them. Simpler models (i.e. DT and CMM) were used more frequently. However, microsimulation models (i.e. ISM and DES) appeared more appropriate when simulating the course of depression because they can track the personal history of each individual.

About half of the identified studies investigated antidepressants compared with a comparator treatment. However, practice guidelines recommend both psychological and pharmacological treatments for depressive disorders [50,51,52]. In addition, preliminary evidence from a meta-analysis showed that cognitive behavioural therapy and continuation of pharmacotherapy had similar long-term effects on depression [53]. Therefore, model-based economic evaluations of psychological treatments are necessary to examine the cost effectiveness of these interventions in the long term.

The included studies used various data sources to populate the models. Data from RCTs were commonly used to derive effect estimates. RCTs are methodologically sound but they sometimes have low external validity because of strict inclusion criteria [18, 54]. Meta-analysis of RCTs has greater precision and external validity than a single clinical trial, but the results may still be subject to publication bias [54, 55]. Observational studies typically have larger samples and higher external validity than RCTs [21]; however, the quality of the data is usually lower than in an RCT because it is difficult to control for confounding bias. Expert opinion is a controversial choice that may be used only when no other sources are available [18, 56]. We found 15 studies (37%) used expert opinion as the data source for clinical or cost inputs, which may be an indication of a lack of valid data for some parameters. The issue of the limited number of papers available in the literature presenting utility weights for different health states of depression has been addressed previously [57]. A recent systematic review and meta-analysis can provide a useful resource for decision analysts [58]. Overall, researchers should be cautious when using utility weights from the literature and ensure the weights they use are applicable to the specific population and health states they are evaluating.

The structure of the model must address the decision problem adequately [59]. The majority of the studies used relatively simple model structures with only a few health states, sometimes as few as two, with an average of four. Since depression is recurrent, with increased risk for suicide, it seems reasonable that the structure of the models needs to be relatively complex. Key clinical events should include remission (ideally defined by a structured clinical interview), a measure of symptom improvement (such as response or reliable change) and deterioration. Deterioration of depressive symptoms is often only partly included in the models, as relapse into depression after being in remission. However, it is possible that patients get worse while undergoing treatment for depression [60]. Moreover, depressive symptom severity is related to different utility weights, with patients with more severe depression having lower QoL [58]. Thus, accounting for severity of depression would improve the clinical representation of the models. Nevertheless, only three of the included studies used different states for different levels of depression severity. Overall, omitting important health states from the model structure may lead to overestimation or underestimation of the cost effectiveness of the treatment under study.

We searched the included papers for information on model validation. Model validation is crucial since it provides confidence to decision makers that the model accurately predicts the outcomes of interest [61]. We identified minimal information on validation methods. Although the body of literature on the importance of rigorous validation of statistical modelling is substantial [62], it appears not to be common practice in the health economic modelling of depression treatments. This is a shortcoming that may cause flawed decision making.

We identified a larger number of more complex models in studies published after 2010. Afzali et al. [21] found one study using ISM, which was conducted in 2006, and no DES models. After 2010, another study that used ISM and three studies that used DES models were identified. In addition, we found three studies that used ‘tunnel’ states to model time in the CMM (conducted in 2010, 2013 and 2015). This may be an indication of a trend towards more complex modelling methods to investigate the cost effectiveness of depression treatments. More complex modelling methods may improve the face validity of the models.

Since depression is a chronic and recurrent disorder, the history of each patient is crucial for prognosis (e.g. more previous depressive episodes, more severe depression and comorbidity are all related to worse prognosis) [40]. Studies that used DES and ISM included patient heterogeneity (e.g. age, sex) and tracking of individual history (e.g. additional depressive episodes during the simulation) in the model estimations. This may improve not only the face validity of the models but also the validity of model predictions.

Nevertheless, we could not judge the validity of the model predictions because it was not reported in the papers. In contrast, studies that used DTs and CMM were not designed to include patient heterogeneity and history. Therefore, we consider DES and ISM the most appropriate methods to simulate the course of depression.
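The tracking of individual history described above can be sketched as a simple individual-based state-transition model in which each simulated patient carries a count of previous recurrences that modifies the recurrence risk. The 16% increase per successive recurrence is the figure cited in the Introduction [15]; treating it as a compounding relative increase, and the baseline probabilities, cycle structure and function names, are assumptions made only for this illustration.

```python
import random

BASE_RECURRENCE = 0.05  # hypothetical per-cycle recurrence probability
RISK_INCREASE = 0.16    # relative increase per prior recurrence [15];
                        # compounding is an assumption of this sketch
P_REMISSION = 0.30      # hypothetical per-cycle probability of remission


def recurrence_probability(prior_recurrences):
    """Per-cycle recurrence probability given the patient's history."""
    return min(1.0, BASE_RECURRENCE * (1 + RISK_INCREASE) ** prior_recurrences)


def simulate_patient(n_cycles, rng):
    """Simulate one patient and return the number of recurrences experienced."""
    depressed = True
    recurrences = 0
    for _ in range(n_cycles):
        if depressed:
            if rng.random() < P_REMISSION:
                depressed = False
        elif rng.random() < recurrence_probability(recurrences):
            depressed = True
            recurrences += 1  # history is updated and affects future risk
    return recurrences


if __name__ == "__main__":
    rng = random.Random(2024)
    counts = [simulate_patient(120, rng) for _ in range(1000)]
    print("mean recurrences per patient:", sum(counts) / len(counts))
```

Because the history is stored per patient, no additional health states are needed to represent the number of previous episodes, in contrast to a cohort-based model.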

In addition, DES may have some advantages over ISM, such as increased flexibility in time management (no cycles are used since time runs continuously) and the ability to model patient interactions (e.g. waiting lists for access to treatment). However, our search did not detect any study that used a DES model to simulate patient interactions. It may be easier to incorporate important aspects of depression, such as patient heterogeneity and time-varying events, in microsimulation models than in cohort-based models. Only empirical comparisons between the different models will allow us to draw firm conclusions on whether these aspects can be validly incorporated in cohort-based models without resulting in an unwieldy number of health states.

To our knowledge, such comparisons have not yet been conducted for depression treatments. One study has compared DES and CMM for HIV treatments [63] and one has compared them for early breast cancer treatments [64]. Simpson et al. [63] indicated that DES showed better predictive validity over 5 years. Karnon [64] concluded that DES represented the data more flexibly than did CMM but that this advantage was outweighed by the increased efficiency of CMM (i.e. short computational times, easier to develop and analyse). Nevertheless, the extent to which these inferences can be generalized to the evaluation of depression treatments remains unknown.

In a recent review, Karnon and Afzali [65] presented an overview of the costs and benefits of using DES to model healthcare decision problems in general. The authors proposed four factors that decision analysts and researchers can use to evaluate whether using a DES model would be beneficial: baseline heterogeneity, disease progression as a continuous process, time-varying event rates, and prior events affecting subsequent event rates [65]. When considering depression, it seems that all these factors apply. First, the depressed population is rather heterogeneous. Furthermore, depressive symptom severity is continuous, although it is sometimes conventionally modelled as discrete categories (i.e. mild, moderate, severe). In addition, time is crucial when modelling depression: the longer patients have depression, the less likely they are to experience remission. Finally, the number of prior depressive episodes affects the probability of having new episodes. Thus, a DES model seems to be more suitable for modelling the course of depression than simpler methods. However, the results of our review show that DES modelling is not yet commonly used for depression. This may limit the validity of the findings of existing studies.
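As an illustration of how a DES handles time without fixed cycles, the sketch below samples the time to each patient's next event and processes events in chronological order from a priority queue. The mean episode duration of 16 weeks is the figure cited in the Introduction [14]; the exponential distributions, the mean time in remission, the way prior recurrences shorten the time to the next episode and all names are assumptions made only for this example. Patient interactions (e.g. waiting lists) are not modelled here.

```python
import heapq
import random

MEAN_EPISODE_WEEKS = 16.0     # mean episode duration cited in the Introduction [14]
MEAN_REMISSION_WEEKS = 80.0   # hypothetical mean time to first recurrence
RISK_INCREASE = 0.16          # per prior recurrence [15]; applied to the event
                              # rate here as an assumption of this sketch


def simulate(n_patients, horizon_weeks, seed=0):
    """Return (weeks spent depressed, number of recurrences) per patient."""
    rng = random.Random(seed)
    events = []  # priority queue of (event time, patient id, event kind)
    for pid in range(n_patients):
        first = rng.expovariate(1 / MEAN_EPISODE_WEEKS)
        heapq.heappush(events, (first, pid, "remission"))

    weeks_depressed = [0.0] * n_patients
    episode_start = [0.0] * n_patients   # every patient starts depressed at t = 0
    in_episode = [True] * n_patients
    recurrences = [0] * n_patients

    # Time jumps directly from one event to the next; there are no cycles.
    while events and events[0][0] <= horizon_weeks:
        time, pid, kind = heapq.heappop(events)
        if kind == "remission":
            weeks_depressed[pid] += time - episode_start[pid]
            in_episode[pid] = False
            # Prior recurrences shorten the expected time to the next episode.
            mean_gap = MEAN_REMISSION_WEEKS / (1 + RISK_INCREASE) ** recurrences[pid]
            nxt = time + rng.expovariate(1 / mean_gap)
            heapq.heappush(events, (nxt, pid, "recurrence"))
        else:
            recurrences[pid] += 1
            in_episode[pid] = True
            episode_start[pid] = time
            nxt = time + rng.expovariate(1 / MEAN_EPISODE_WEEKS)
            heapq.heappush(events, (nxt, pid, "remission"))

    # Close episodes that are still ongoing at the end of the time horizon.
    for pid in range(n_patients):
        if in_episode[pid]:
            weeks_depressed[pid] += horizon_weeks - episode_start[pid]
    return weeks_depressed, recurrences


if __name__ == "__main__":
    weeks, recs = simulate(n_patients=1000, horizon_weeks=260)
    print("mean weeks depressed:", sum(weeks) / len(weeks))
    print("mean recurrences:", sum(recs) / len(recs))
```

In a full evaluation, the sampled event times would typically be drawn from distributions fitted to patient-level data, and costs and utilities would be accumulated over the time spent in each condition.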

Further guidelines are available to help researchers and decision makers decide on the most suitable modelling method to address the problem of interest. The International Society for Pharmacoeconomics and Outcomes Research (ISPOR) Modeling Good Research Practices task force provided a thorough description of the advantages and disadvantages of the available modelling methods in a series of reports [22, 38, 47]. In addition, the ISPOR Dynamic Simulation Modeling Emerging Good Practices task force provided the SIMULATE checklist, a tool to help modellers decide whether a dynamic modelling method, such as DES, is necessary or whether relatively simpler simulation methods, such as Markov models, are adequate to address the specific decision problem [66, 67]. In general, these guidelines are in accordance with our conclusions, indicating that chronic and recurrent health problems require more complex modelling techniques.

5 Conclusion

The present systematic review described the studies and compared the modelling techniques that have been used in the economic evaluation of depression treatments. There were substantial methodological differences between the studies, which decreased the comparability of the results. Since patient heterogeneity and the individual history of each patient are important for the prognosis of depression, DES and ISM simulation methods may be more appropriate for a pragmatic representation of the course of depression. However, direct comparisons between the available modelling techniques are necessary to yield firm conclusions.