Key Points for Decision Makers

This study indicates a substantial increase in the number of economic evaluation studies on a diverse range of pharmacological, non-pharmacological, diagnostic, and preventive dementia interventions conducted between 2018 and 2022 compared with previous research in the field; however, the studies are restricted to a limited number of countries.

Moreover, the quality of methodology and reporting of these studies, similar to the most recent reviews in this area, exhibits significant weaknesses that should be addressed by researchers to enhance the quality and rigour of future studies.

1 Introduction

Dementia is one of the leading causes of disability and dependency among older adults globally, affecting individuals, their families, communities and societies [1]. In line with population ageing, it has become a significant healthcare challenge of this century [2], and in 2012 dementia was raised as a public health priority by the World Health Organization (WHO) and Alzheimer's Disease International (ADI) [3]. Over 55 million people worldwide lived with dementia in 2020, and this number is predicted to rise to 78 million in 2030 and 139 million by 2050 [4]. The Global Burden of Disease (GBD) 2019 Dementia Forecasting Collaborators have provided higher estimates, with more than half of the cases in high-income countries [5]. Recent evidence also shows that 25.28 million disability-adjusted life-years (DALYs) were attributed to dementia in 2019, compared with 9.66 million in 1990 [6].

In addition, the associated economic costs of dementia were estimated at $818 billion worldwide [4]. While this number is expected to increase 3.44 times by 2030, updated figures as part of the WHO’s Global Status Report revealed a new figure of $1313.4 billion in 2019 [7], which could seriously challenge social and economic development and health and social services provision [4]. Of this amount, 16% represented direct medical costs, 34% were allocated to direct social sector costs (including long-term care) and 50% pertained to the costs of informal care [7]. These challenges would result in higher demand for health care and increasing indirect costs of labour productivity losses [8].

Despite the existence of effective treatments to mitigate the disease’s effects or slow its progression and improve the quality of life of people with dementia, their families and carers, scarce resources may prohibit affordable access to beneficial care and services [9, 10]. Therefore, economic evaluation studies can provide important inputs for the decision-making process of resource allocation. While economic evaluation of alternative interventions can be conducted alongside a trial or as a modelled evaluation [11], the advantages of model-based evaluations include the ability to compare all alternatives, gather the required data from different sources of evidence and estimate consequences over longer time horizons [11,12,13].

Dementia interventions have traditionally focused on pharmacological approaches, although there is a growing recognition of the importance of dementia prevention strategies and non-pharmacological interventions. These interventions encompass a range of activities such as physical exercise, interventions to support and enhance cognitive abilities in people with dementia, for instance, reality orientation, reminiscence therapy or cognitive stimulation, psychological and behavioural therapies, occupational therapy [2, 14,15,16] and psychosocial interventions for carers [17].

In spite of the increasing attention to non-pharmacological interventions, a recent systematic review by Nguyen et al. found that of 67 identified studies, only 5 and 19 studies evaluated non-pharmacological and preventive or diagnostic interventions, respectively, and 43 studies evaluated pharmacological interventions [19]. This trend is further evidenced by the limited number of economic evaluation studies on non-pharmacological interventions identified in another systematic review. Sopina and Sørensen identified only 10 studies between the years 2000 and 2017, highlighting the need for more comprehensive research in this area [18].

Moreover, the identified studies in these systematic reviews have mostly been critiqued in terms of their lack of incorporating long-term outcomes, such as behavioural and psychological symptoms and functional performance [19, 20]. Additionally, according to other literature, decision-analytic models for Alzheimer's disease (AD) for treatments at the very early stages need to encompass people with mild cognitive impairment (MCI), given that the pathophysiological progression of AD starts potentially decades prior to the manifestation of dementia symptoms [20].

The review by Nguyen et al. also revealed that while many studies had improvements in terms of modelling, particularly in relation to the decision problem description, perspective, data inputs, and incorporating disease states reflecting a coherent theory of the health condition, there were also several areas where studies performed poorly. For example, models often lacked transparency regarding the assumptions and did not provide evidence of model validation. Additionally, many studies had shortcomings in evaluating data quality or considering alternative modelling options or uncertainties, leading to biased or unreliable results [19].

The current study aimed to update the review by Nguyen et al. of model-based economic evaluations of interventions for dementia to include the most contemporary evidence, identify potential areas for improvement and conduct a quality assessment of the included studies.

2 Methods

This systematic review was conducted based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [21] and reported according to the Consolidated Health Economic Evaluation Reporting Standards (CHEERS) 2022 checklist [22] for reporting economic evaluations. The study protocol was registered at the International Prospective Register of Systematic Reviews (PROSPERO; CRD42022337417).

2.1 Literature Search

Articles published in the English and German languages and indexed in PubMed, Cochrane Library, Embase, CINAHL, PsycINFO, EconLit, international HTA database (Footnote 1), and Tufts Medical Center Cost-Effectiveness Analysis Registry were searched from February 2018 until 3 August 2022 to identify the most recent economic evaluation studies since the previous systematic review conducted in this area. Compared with the initial review [19], the new search strategy combined a wider array of relevant terms and Boolean operators with Medical Subject Heading (MeSH) terms and added German-language publications, ensuring a more comprehensive and inclusive approach to capturing all studies in the two main blocks. Block 1 included keywords describing the population and health condition, i.e., people diagnosed with dementia of any type, such as AD, and any disease severity or those identified as carers for people with dementia, whereas Block 2 included keywords around the methodology describing model-based economic evaluation, for example, economic, cost effectiveness, and Markov. The two blocks were then combined with ‘AND’ and restricted to the target date range to build the final search syntax. See Online Resource 1 for the full search strategy.

2.2 Inclusion and Exclusion Criteria

This systematic review included model-based economic evaluation studies and modelled extensions of trial data (including cost-effectiveness analysis [CEA], cost-utility analysis [CUA], and cost-benefit analysis [CBA]) in the English and German languages, reflecting the authors’ language proficiency. All model-based economic evaluations in which at least two interventions were compared in terms of their costs and benefits were eligible, including decision-tree, Markov, cohort simulation, and discrete event simulation modelling. Interventions included anything covering surveillance, screening, early diagnosis, treatment, management, and care for people with dementia or people with MCI (as a primary health condition) or their carers. Trial- or regression-based economic evaluations without decision-analytic models, simple cost or outcome description studies of a single intervention, cost-outcome description, cost-of-illness studies and reviews or systematic reviews and/or meta-analyses were excluded. However, for the identified systematic reviews, including the study by Sopina and Sørensen [18], forward and backward citation tracking was conducted by reviewing the reference lists and papers that cited the review to identify other relevant studies that were not initially identified with the search strategy. Studies relevant to animals, those in which patients have had a disease other than dementia-related conditions, or studies that did not focus on specific options were excluded. Conference abstracts, dissertations or studies without full texts were also excluded.

2.3 Selection of Studies and Data Extraction

The references from the bibliographic databases were imported and deduplicated in Covidence [23]. Two reviewers (MGD and DH) independently conducted each step of the process, from title and abstract screening to full-text review, using Covidence software. Disagreements were resolved by consensus or through discussion with a third reviewer (LE). The characteristics of the included studies were documented, using a data extraction form in Microsoft Excel, based on established guidelines for reporting economic evaluations (CHEERS) [22, 24]. Data relating to the properties of the included studies, including first author, year and country of study, intervention and evaluation type, setting, study perspective and population, model type, time horizon and health states, costs and outcomes, discount rate, currency, price date, uncertainty, source of funding and conflicts of interest were extracted by two reviewers independently (MGD and DH) for half of the studies. Given the low rate of disagreement between the extracted information (less than 8%), the data for the remaining articles were only cross-checked by the second reviewer. Any differences that remained unresolved after discussion between the two reviewers were then referred to a third reviewer (LE) for final resolution. Incremental cost-effectiveness ratios (ICERs) were adjusted for price inflation to 2023 United States (US) dollars using the CCEMG–EPPI-Centre Cost Converter (Footnote 2).
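As an illustration of the last step, the following minimal sketch shows how an ICER is formed and then rescaled to a later price year. It is not the method implemented by the CCEMG–EPPI-Centre Cost Converter; it simply assumes a single price-index ratio, and all function names, costs, effects and index values are hypothetical.

```python
def icer(cost_new, cost_old, effect_new, effect_old):
    """Incremental cost-effectiveness ratio: extra cost per extra unit of effect."""
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        raise ValueError("ICER is undefined when the incremental effect is zero")
    return (cost_new - cost_old) / delta_effect


def inflate(amount, index_price_year, index_target_year):
    """Rescale an amount by the ratio of two price-index values (assumed approach)."""
    return amount * index_target_year / index_price_year


# Hypothetical inputs, not taken from any included study.
base_icer = icer(cost_new=52_000, cost_old=40_000, effect_new=6.5, effect_old=6.0)

# Hypothetical index values for the original price year and the target year.
icer_target_year = inflate(base_icer, index_price_year=100.0, index_target_year=125.0)

print(base_icer, icer_target_year)
```

In practice, a currency conversion step would also be needed for studies not reported in US dollars; that step is omitted here.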

2.4 Quality Assessment of Decision-Analytic Models

Given the focus of this study on model-based economic evaluations, and in accordance with the literature, which advocates differentiating between the quality of reporting and the quality of methodology [24,25,26,27], we evaluated the articles using two highly recognized appraisal tools [28]: the Philips appraisal tool, which is more specific to the methodology of modelled economic evaluation, and the CHEERS 2022 checklist, which is considered an industry standard for reporting of economic evaluations. In this regard, one reviewer (MGD) appraised the methodological quality of models in the included studies against the framework for assessment of good practice in decision-analytic models in economic evaluations developed by Philips et al. (hereafter referred to as the Philips checklist) [29] and the quality of reporting based on the CHEERS 2022 checklist [22]. In our evaluation, we focused primarily on the content presented within the main text of the articles.

The Philips checklist [29] addresses three critical aspects of modelling, including structure, data, and consistency, within 58 items. The results of the assessment for each item were described as ‘yes’, ‘no’, ‘unclear’ and ‘not applicable’. The CHEERS 2022 checklist [22] comprises 28 items across seven main domains: (1) Title; (2) Abstract; (3) Introduction; (4) Methods; (5) Results; (6) Discussion; and (7) other relevant information. To ensure the appropriate interpretation of each item description, the Explanation and Elaboration report of the Professional Society for Health Economics and Outcomes Research (ISPOR) CHEERS II was considered [24] and the presence of each item in the selected study was qualitatively described as ‘yes’, ‘no’, ‘partially’ and ‘not applicable’.

A random sample of 30% of the quality assessments of the included studies was checked by a second reviewer (DH) to ensure consistency. Since a low proportion of variation (approximately 6%) was found in this sample, it was concluded that the remaining 70% of the studies were adequately assessed by a single reviewer. Disagreements were resolved through discussions between the two reviewers (MGD, DH), and in cases in which consensus could not be reached, a third author (LE) intervened.

Any item labelled as ‘unclear’ or ‘partially’ was considered to not meet the expected quality for that criterion.
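The scoring rule above can be sketched as a small tally. This is an illustrative sketch only: the function name is hypothetical, and excluding ‘not applicable’ ratings from the denominator is an assumption, since the text does not state how such ratings entered the percentages.

```python
from collections import Counter

MET = {"yes"}
# 'unclear' (Philips) and 'partially' (CHEERS) are treated as not meeting the criterion.
NOT_MET = {"no", "unclear", "partially"}


def proportion_met(ratings):
    """Share of assessed studies meeting one checklist item.

    Assumption: 'not applicable' ratings are left out of the denominator.
    Returns None when no study could be assessed on the item.
    """
    counts = Counter(ratings)
    met = sum(counts[r] for r in MET)
    not_met = sum(counts[r] for r in NOT_MET)
    assessed = met + not_met
    return met / assessed if assessed else None


# Hypothetical ratings of one item across four studies.
example = proportion_met(["yes", "no", "unclear", "not applicable"])
print(example)
```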

2.5 Narrative Analysis of Findings

The findings were synthesized descriptively and summarized through narrative analysis. The relevant information from the CHEERS criteria for each study was extracted, including the description of the study population, interventions compared, perspective adopted, time horizon, discount rate, measurement of outcomes, characterization of uncertainty, and engagement with stakeholders. The results were summarized, providing a concise overview of how each study addressed the different items specified in the CHEERS checklist. The comprehensive assessment of each paper against both the CHEERS and Philips checklists was subsequently conducted, which allowed the examination of the methodological rigour and reporting quality of the included studies. In comparing the findings of the current review with previous studies, the strengths and limitations of the included studies were identified and discussed.

3 Results

3.1 Search Results

The search identified 5275 records. After duplicates were removed, 3182 unique titles and abstracts were screened, which resulted in 69 articles for full-text review. The final number of included studies was 23. The PRISMA flow diagram in Fig. 1 provides details about the selection process and reasons for exclusion.

Fig. 1 Search process and reasons for exclusion. PRISMA flowchart depicting the process of study selection for the systematic review of economic evaluations of dementia interventions, conducted between February 2018 and August 2022. PRISMA Preferred Reporting Items for Systematic Reviews and Meta-Analyses

3.2 Study Characteristics

3.2.1 Population, Country of Study, Interventions, and Study Perspective and Setting

The 23 included studies covered a variety of populations and conditions related to dementia. Specifically, 10 (43%) studies focused on AD [30,31,32,33,34,35,36,37,38,39], five studies (22%) examined MCI and memory concerns [40,41,42,43,44], and three studies (13%) targeted people with other types of dementia [45,46,47] (Table 1). There were also individual studies that addressed certain dementia prevention or diagnostic strategies for specific populations, including couples with one heterozygous Huntington disease individual [48], people at risk of dementia without a specific health condition [49], those with idiopathic normal pressure hydrocephalus (iNPH) [50], or informal caregivers of people with dementia [51]. One study also focused on a population of individuals aged 60–64 years and examined the effectiveness of a dementia preventive programme [52].

Table 1 General characteristics of the included studies

Eight studies (35%) were conducted in Europe [31, 37, 42, 45, 47, 49,50,51], eight (35%) were conducted in the US [32,33,34, 36, 39,40,41, 48], and two (9%) were conducted in Canada [44, 46]. Additionally, single studies (4% each) were conducted in Australia [52], Brazil [30], South Korea [43], Taiwan [38], and Thailand [35].

Pharmacological interventions were the most evaluated type of intervention in the included studies (n = 10, 43%) [30,31,32,33,34,35,36,37, 41, 45] in relation to memory loss and cognitive decline symptoms. Within this range, rivastigmine, galantamine, and donepezil were from the category of cholinesterase inhibitors primarily approved for the treatment of memory loss. Other options were memantine from the class of medications known as N-methyl-D-aspartate (NMDA) receptor antagonists, along with aducanumab and lecanemab as anti-amyloid-β therapies [53, 54]; all are recommended for slowing cognitive decline. Four studies (17%) specifically examined non-pharmacological interventions [44, 46, 47, 51], including two options in relation to memory symptoms and cognitive decline, and two within services supporting activities of daily living and integrated community-based health and social services. Another four studies (17%) investigated preventive approaches in relation to dementia risk factors (primary prevention level) [38, 49, 52] and early intervention and disease modification (secondary prevention level) [50]. Diagnostic methods were the primary focus in four studies (17%) [39, 42, 43, 48], and one study (4%) addressed both diagnostic and pharmacological interventions together [40].

Most studies were conducted from a healthcare perspective (n = 8, 35%) [30, 39, 42,43,44,45,46,47], a societal/modified societal perspective (n = 6, 26%) [31, 38, 48,49,50, 52], or both a healthcare sector and a societal/modified societal perspective (n = 6, 26%) [34,35,36,37, 40, 41]. Two studies (9%) were conducted from both a healthcare payer perspective and a societal perspective [33, 51] and one study (4%) adopted a healthcare payer (insurance) perspective [32].

Of the 23 studies identified, only 12 (52%) provided information on the study setting. Of these, four studies (17%) investigated interventions in both community and residential care settings [33, 40, 41, 47], five studies (22%) in the community care setting only [32, 36, 44, 46, 51], two studies (9%) in the residential care setting [45, 50], and one study (4%) in a tertiary hospital setting [42].

3.2.2 Economic Evaluation Methods, Utilities, and Utilized Software for Modellings

Eighteen (78%) studies used CUA (Table 1), two (9%) used CBA [44, 46], one (4%) used CEA [42], and two studies employed both CUA and CEA [36, 41]. Quality-adjusted life-years (QALYs) were the main measure of outcome used across the studies.

All CUAs included in the analysis, as shown in Table 2, obtained health state utility values from the published literature. One study did not provide specific information regarding the source of the utilities [48]. Incorporating a range of health conditions (cognitively normal state, MCI, and dementia) or settings (community and residential care) into one Markov transition model, several studies employed multiple sources to incorporate utility values in their modelling practices [30, 31, 33, 40, 47]. The sources predominantly relied on two widely recognized generic utility measures, the EQ-5D and Health Utilities Index (HUI) Mark II and III, to assess health-related quality of life (HRQoL) through indirect valuation methods. Among these sources, the cross-sectional studies of patients and caregivers of health utilities in AD using the HUI-II and EQ-5D questionnaire by Neumann et al. [55,56,57] were the most common primary sources of utilities used in seven studies [34, 36, 38,39,40,41, 51]. Another frequently used source of utility values was a cross-sectional observational study on the societal costs of AD by Mesterton et al. [58] referenced in three studies [30, 33, 49]. Mean utility values elicited from the literature ranged from 0.73 to 0.80 for MCI, 0.43 to 0.77 for mild AD, 0.21 to 0.59 for moderate AD, and 0.17 to 0.45 for severe AD (Table 2).

Table 2 Sources of utility scores in included studies

Several other outcomes were also measured in a few of the studies, including the equal value of life-years gained (evLYG) [36, 41], life-years gained (LY) [36, 41], percentage of correctly diagnosed cases [42], and health service utilization reduction and costs averted by the intervention [44, 46].

In most of the included studies, analyses were performed using TreeAge [30, 31, 35, 40, 43, 48, 49] or Microsoft Excel [34, 36, 47, 50, 51], which together accounted for 12 studies (52%). For a CBA conducted through decision tree modelling, a combination of Stata Standard Edition (SE) version 16 and TreeAge Pro 2019 software was used [46]. A web-based and open-source health economic modelling platform, heRo3, was used for Markov modelling in one study [32]. The remaining studies did not specify any particular software used for their modelling analysis.

3.2.3 Incorporated Health States, Time Horizon, Decision-Analytic Model Type and Cycle Length

The distribution of health states (if applicable) varied, with the number of states ranging from three to seven. Specifically, two studies utilized a three-state categorization [45, 51], while another two studies employed a seven-state categorization [39, 43], which was specifically applied in the context of diagnostic interventions. However, the most frequent number of health states observed in the included studies was four and the most commonly encompassed categories of states were mild, moderate, severe, and death (Table 3).

Table 3 Summary of model structures, inputs, cost components and sensitivity analysis

The time horizon of analysis ranged from 3 months (n = 1, 4%) [42] for a diagnostic option, to lifetime as the most common approach, with 13 studies (57%) [Table 3]. Four studies applied time horizons of 10 and 24 years [30, 35, 38, 48], which can be considered equivalent to a lifetime horizon, given that dementia is typically diagnosed in older adults with a shorter life expectancy. The time horizon in four other studies varied between 1 year and 5 years [37, 45, 46, 51] and one study did not detail the time horizon employed [47].

The majority (n = 14, 61%) of the included studies employed Markov transition models to capture disease progression over time, with a cycle length between 1 month and 1 year, and without any explicit relationship between the type of intervention and the cycle length used in the models (Table 3). Among the included studies, 17 employed a cohort modelling framework to simulate the course of events. In addition to the studies using Markov modelling, three studies (13%) utilized decision tree models [42, 46, 48], two (9%) employed discrete-event simulation (DES) [31, 35], and one study each used disease simulation [33] and combined microsimulation and macrosimulation modelling [47]. Two studies (9%) did not specify the decision-analytic approach they used [44, 52].
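To make the cohort approach concrete, the sketch below traces a closed cohort through the four most commonly reported states (mild, moderate, severe, death). It is a minimal illustration, not a reconstruction of any included model: the annual cycle length, the transition probabilities and the function name are all hypothetical.

```python
STATES = ["mild", "moderate", "severe", "death"]

# Hypothetical annual transition probabilities; each row sums to 1.
# Rows are the current state, columns the state in the next cycle.
P = [
    [0.70, 0.20, 0.05, 0.05],  # from mild
    [0.00, 0.65, 0.25, 0.10],  # from moderate
    [0.00, 0.00, 0.80, 0.20],  # from severe
    [0.00, 0.00, 0.00, 1.00],  # death is absorbing
]


def run_cohort(start, matrix, cycles):
    """Return the state distribution at cycle 0, 1, ..., cycles."""
    trace = [list(start)]
    n = len(start)
    for _ in range(cycles):
        current = trace[-1]
        nxt = [sum(current[i] * matrix[i][j] for i in range(n)) for j in range(n)]
        trace.append(nxt)
    return trace


# Whole cohort starts in the mild state; simulate ten annual cycles.
trace = run_cohort([1.0, 0.0, 0.0, 0.0], P, cycles=10)
for year, dist in enumerate(trace):
    print(year, [round(p, 3) for p in dist])
```

Costs and QALYs would then be obtained by weighting each cycle's distribution by state-specific cost and utility values, which are omitted here.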

3.2.4 Costs and Discount Rates

All 23 studies included direct medical costs consisting of costs for medications, diagnostic tests and procedures, and inpatient or outpatient visits according to dementia type, disease stage, intervention and study setting (community or residential care) under investigation (Table 3). Nine studies (39%) incorporated types of non-medical costs, including transportation costs or costs of other facilities such as institutional care, home services, or patient social care [31, 33,34,35, 40, 43, 47, 50, 51]. In all of the studies adopting a societal or modified societal perspective, the costs associated with informal care were considered (Table 1). Only one study [41] considered productivity loss for people with dementia; the remaining studies did not, considering that the average age of dementia onset aligns with the typical retirement age of 65 years or older [4].

The discount rates applied varied between studies. Specifically, 13 studies (57%) used a discount rate of 3% for both costs and benefits (Table 3), two studies (9%) used a discount rate of 5% [30, 43], one study (4%) used a discount rate of 3.5%, and one study (4%) used a discount rate of 3% for costs and 1.5% for QALYs [51]. Discounting was not applicable in three studies (13%) due to the short time horizon of ≤ 1 year [42, 45, 46], and three studies (13%) did not report whether and how costs and effects were discounted [39, 44, 47].
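The effect of such rates can be illustrated with a short present-value calculation. The sketch below applies differential discounting (3% for costs, 1.5% for QALYs) to hypothetical annual streams; the function name and the input values are illustrative, and leaving the first year undiscounted is an assumed convention rather than one reported by the included studies.

```python
def present_value(stream, rate):
    """Discount a per-year stream; year 0 is left undiscounted (assumed convention)."""
    return sum(value / (1 + rate) ** year for year, value in enumerate(stream))


# Hypothetical annual costs and QALYs over three years.
costs = [10_000, 10_000, 10_000]
qalys = [0.7, 0.6, 0.5]

pv_costs = present_value(costs, 0.03)    # 3% for costs
pv_qalys = present_value(qalys, 0.015)   # 1.5% for QALYs

print(round(pv_costs, 2), round(pv_qalys, 4))
```

Over a horizon of 1 year or less, every value falls in year 0 under this convention, which is consistent with discounting being not applicable in the short-horizon studies.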

3.2.5 Cost Effectiveness of Interventions and Associated Uncertainty

Sensitivity analyses were conducted in most studies. These included deterministic sensitivity analyses (DSAs) in 18 studies (Table 3), probabilistic sensitivity analyses (PSAs) in 13 studies (Table 3), and scenario analyses in eight studies [31, 33, 34, 36, 40, 41, 46, 51]. Six studies used both one-way DSAs and PSAs [30, 32, 35, 42, 48, 50]. A more comprehensive approach was taken in five studies by simultaneously employing one-way deterministic, probabilistic, and scenario sensitivity analyses to capture a wider range of uncertainty and potential variations in the evaluated interventions [34, 36, 40, 41, 51].
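For readers less familiar with PSA, the sketch below shows the general idea: uncertain inputs are sampled repeatedly from distributions and the share of draws with a positive net monetary benefit is reported for a given willingness-to-pay threshold. Everything in it is an assumption made for illustration (the function name, the normal and gamma distributions, their parameters and the threshold); it does not reproduce the analysis of any included study.

```python
import random


def probability_cost_effective(n_draws, wtp, seed=0):
    """Share of draws in which net monetary benefit (wtp * dQALY - dCost) is positive."""
    rng = random.Random(seed)  # seeded for reproducibility
    favourable = 0
    for _ in range(n_draws):
        # Hypothetical incremental QALYs: normal with mean 0.10 and SD 0.05.
        d_qaly = rng.normalvariate(0.10, 0.05)
        # Hypothetical incremental cost: gamma with shape 4 and scale 500 (mean 2000).
        d_cost = rng.gammavariate(4.0, 500.0)
        if wtp * d_qaly - d_cost > 0:
            favourable += 1
    return favourable / n_draws


# Hypothetical willingness-to-pay threshold per QALY.
print(probability_cost_effective(n_draws=5_000, wtp=50_000))
```

Repeating the calculation over a range of thresholds would give the points of a cost-effectiveness acceptability curve.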

Based on the findings from DSAs across the studies, the main parameters that had a considerable impact on the cost effectiveness of interventions were treatment effectiveness [33, 35, 36, 40,41,42,43] or health utility values [33, 40], cost of medication [37, 40], treatment duration [31, 35, 51, 52], transition probabilities [34, 36, 39, 41, 51], treatment discontinuation [33, 52], and population characteristics [34, 37, 46, 52]. However, in general, the results of base-case analyses were mostly robust and were not significantly affected by changes to parameters or assumptions in the sensitivity analyses.

The findings, as shown in Table 1, revealed supportive evidence about the cost effectiveness of certain pharmacological and non-pharmacological interventions, and diagnostic strategies such as shunt surgery in iNPH, a medical diet with Souvenaid, FINGER prevention programme, in-home and community-based care combination, a managed pharmacotherapy programme, in vitro fertilization pre-implantation genetic testing, cognitive stimulation therapy, a Primary Care Geriatric Initiative, an online Body-Brain-Life programme and community-based memory programmes. However, options related to aducanumab, donanemab, positron emission tomography, and a general screening policy were not recognized as being cost effective. Moreover, among studies in which interventions were evaluated from multiple perspectives, only the cost effectiveness of donepezil versus no treatment varied depending on the perspective adopted [35]. There were no contradictory results among the studies that investigated the same interventions, despite using potentially different methods.

3.3 Quality of the Included Studies

The methodological rigour and reporting quality of the included studies were assessed using the Philips checklist [29] and the CHEERS 2022 guidelines [22], respectively. The detailed evaluation of each included article against the checklists’ criteria is provided in Online Resource 2.

3.3.1 Quality of Methodology: Philips’ Checklist

Structure (S) The majority of studies substantially developed the decision problem (S1) and stated the scope (S2) and model type selection (S6) of the economic evaluation; however, there was insufficient clarity in the primary decision maker (S13) in most of the studies. Consistency between the model parameters and the stated perspective (S22) was also met in nearly half of the included studies.

Regarding the rationale for the model’s structure (S3), only 13% of the studies discussed competing theories related to model development (S33) and fewer than half of the included studies explicitly specified the sources of the data (S34) used to develop the model structure, or met the expected transparency in discussing structural assumptions (S41 and S42). In addition, a limited proportion of the studies (39%) clearly described the rationale for causal relationships incorporated in their models (S35).

The majority of the studies did not meet the expected quality in terms of considering all feasible and practical options (S52) and providing a clear justification for their exclusion (S53), although approximately half of the studies elaborated on the characteristics of options under evaluation (S51), including time horizon (S7), disease states/pathways (S8) and cycle length (S9).

Data (D) In assessing the included studies against the essential criteria for data identification (D1), it was found that a small fraction of the included studies satisfied the requirements for the use of a clear and systematic approach in data identification, specifically when considering key parameters in line with the objectives of their models (D11, D13, and D14) or providing evidence of utilizing a data quality assessment (D15).

Regarding the utility weights, a majority (65%) of studies appropriately incorporated and referenced utilities, reflecting a positive trend; however, less than half (35%) adequately justified the methods of deriving these utility weights.

The assessment of uncertainty (D4) showed that less than one-fifth of the reviewed studies sufficiently addressed all four types of uncertainty, including methodological uncertainties, or considered heterogeneity by utilizing separate models for different subgroups.

Consistency (C) Thirteen percent of the studies showed evidence of thorough pretesting of the mathematical logic within their models (C1). Moreover, although 70% of the studies managed to draw valid conclusions from their presented data, explanations and justifications for counterintuitive results were scarce (C2), found in only 9% of the studies.

3.3.2 Consolidated Health Economic Evaluation Reporting Standards (CHEERS)

Among the items assessed in the CHEERS checklist, the analysis revealed a significant lack of thorough reporting in several key areas, particularly with the methods employed and the presentation of study results. Most studies did not clearly report the setting and location, time horizon, discount rate, analytics and assumptions, or the characterization of heterogeneity.

The 2022 updated version of CHEERS introduced important additions, including the use of health economic analysis plans, model sharing, and the increasing involvement of stakeholders and engagement with communities, patients, and the public in health research [24]. However, our assessment of the quality of reporting in the included studies revealed that none of them explicitly addressed these newly added aspects.

Overall, items about the methods and results sections were the least adequately reported parts across the studies, despite their importance in understanding, replicating, and assessing the validity of economic evaluations.

4 Discussion

The purpose of this systematic review was to provide a comprehensive overview of model-based economic evaluations of dementia interventions. By synthesizing and analyzing a wide range of studies, we aimed to incorporate the latest evidence into the review, highlight the key characteristics of these studies, and critically appraise the methodological quality and quality of reporting among the included studies. Through this discussion, we shed light on the current state of knowledge, identify research gaps, and provide guidance for developing decision-analytic models in the field of dementia management.

In this systematic review, we identified 23 studies published since February 2018, suggesting a notable increase in the publication rate of model-based economic evaluations of dementia interventions in recent years. Among these studies, AD (43%), MCI (22%), and dementia (in general; 13%) were the most common conditions examined. While the concentration of studies on AD and MCI reflects the higher prevalence and research focus on these conditions within the field of dementia, the limited representation of other types of dementia, such as vascular dementia, Lewy body dementia, and frontotemporal dementia, highlights a significant gap in the literature.

Moreover, despite the growing body of literature that exists regarding the efficacy of non-pharmacological interventions, less than one-fifth of the included studies specifically evaluated these interventions, whereas approximately half of the studies focused on pharmacological interventions. This may have implications for the allocation of dementia care resources, introducing bias in prioritization, and skewing policy decisions in favour of pharmacological interventions.

Studies on pharmacological options covered a comprehensive range of medications spanning the main pharmacological categories used in dementia treatment, from conventional options to those recently approved [59], all recommended for slowing cognitive decline. However, none of the non-pharmacological interventions fell directly within the categories of physical exercise interventions, psychological and behavioural therapies, or occupational therapies. Instead, they belonged to the categories of integrated health and social services, support services for daily living activities, and cognitive stimulation therapies.

This review revealed that the majority of the studies were conducted in the US and a limited number of European countries, indicating a lack of economic evaluations of dementia interventions in many other countries worldwide, even those with high estimated dementia prevalence rates [4], such as Japan, Italy, Greece, Portugal and Germany. Because the outcomes of economic evaluations may not be generalizable across diverse country contexts, evaluations tailored to each country's specific circumstances are needed.

Most studies were CUAs. This finding generally aligns with the existing literature in other clinical areas and is often driven by health technology assessment (HTA) guidelines from agencies such as the National Institute for Health and Care Excellence (NICE) in the UK [60] and the Pharmaceutical Benefits Advisory Committee (PBAC) in Australia [61], which recommend CUAs.

Markov transition models were the most frequently utilized model structure in the included studies (60%), followed by decision trees (13%) and DES models (9%). The use of Markov transition models is well justified given the chronic and progressive nature of the disease, which is characterized by recurring and long-term health states. Because DES models can incorporate individuals' unique demographic or disease characteristics, they can enhance real-world representativeness and are a favoured approach in economic evaluation [62, 63]. Nonetheless, they are used less often because they require detailed patient-level data and substantial computing power [62, 63]. Markov cohort models, by contrast, have been commonly used in dementia research and can reflect the underlying disease progression at an acceptable level, particularly if there is no evidence of substantive heterogeneity. The use of decision tree models in the two studies that investigated diagnostic interventions [42, 48] is also deemed appropriate. However, since the incorporation of surrogate outcomes is generally less favoured by HTA guidelines, including the PBAC guidelines [64], the study by Contador et al. [42] could benefit from methods and assumptions that extend short-term results to final outcomes.
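To make the structure of a severity-based Markov cohort model concrete, the following is a minimal sketch in Python. It is not taken from any of the included studies: the four health states, the transition probabilities, the utilities, the costs, the 20-cycle horizon and the 3% discount rate are all hypothetical placeholders, and the function name `run_cohort` is our own. The sketch omits half-cycle correction and any sensitivity analysis.

```python
# Minimal Markov cohort sketch. All states, probabilities, utilities and costs
# are hypothetical placeholders chosen for illustration only.
STATES = ["mild", "moderate", "severe", "dead"]

# Annual transition probabilities; row = current state, column = next state.
P_USUAL_CARE = [
    [0.70, 0.20, 0.05, 0.05],
    [0.00, 0.65, 0.25, 0.10],
    [0.00, 0.00, 0.80, 0.20],
    [0.00, 0.00, 0.00, 1.00],
]
# Hypothetical treatment that slows progression out of the mild state only.
P_TREATMENT = [
    [0.78, 0.14, 0.03, 0.05],
    [0.00, 0.65, 0.25, 0.10],
    [0.00, 0.00, 0.80, 0.20],
    [0.00, 0.00, 0.00, 1.00],
]
UTILITY = [0.70, 0.50, 0.30, 0.00]               # utility per year in each state
STATE_COST = [10_000.0, 20_000.0, 40_000.0, 0.0]  # care cost per year in each state
TREATMENT_COST = 3_000.0                          # drug cost per year while alive


def run_cohort(matrix, extra_cost_alive=0.0, cycles=20, discount=0.03):
    """Return (discounted cost, discounted QALYs, final state distribution)."""
    dist = [1.0, 0.0, 0.0, 0.0]  # the whole cohort starts in the mild state
    total_cost = 0.0
    total_qaly = 0.0
    for cycle in range(cycles):
        factor = 1.0 / (1.0 + discount) ** cycle
        alive = 1.0 - dist[-1]
        cost = sum(p * c for p, c in zip(dist, STATE_COST)) + alive * extra_cost_alive
        qaly = sum(p * u for p, u in zip(dist, UTILITY))
        total_cost += factor * cost
        total_qaly += factor * qaly
        # Advance one cycle: new_dist[j] = sum over i of dist[i] * matrix[i][j]
        dist = [
            sum(dist[i] * matrix[i][j] for i in range(len(dist)))
            for j in range(len(dist))
        ]
    return total_cost, total_qaly, dist


cost_uc, qaly_uc, dist_uc = run_cohort(P_USUAL_CARE)
cost_tx, qaly_tx, dist_tx = run_cohort(P_TREATMENT, extra_cost_alive=TREATMENT_COST)

# Incremental cost-effectiveness ratio: extra cost per QALY gained.
icer = (cost_tx - cost_uc) / (qaly_tx - qaly_uc)
print(f"Incremental cost per QALY gained: {icer:.0f}")
```

Because every member of the cohort shares the same transition matrix, a model of this kind cannot represent patient-level heterogeneity, which is the limitation that motivates DES approaches.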

There were significant weaknesses in the consideration of competing theories in model structure development and in the sources of data used. In their taxonomy of model structures for the economic evaluation of health technologies, Brennan et al. [65] highlighted that the rationale for selecting a model structure for a specific health economic evaluation context is often neglected or inadequately addressed in published studies. This critique points to a lack of reporting on the assumptions and detailed calculations that are critical for assessing how the models were built (transparency) and whether they sufficiently reproduce reality (validation). Our review of the included studies showed that although non-technical descriptions (model type, funding sources, model parameters, results and limitations) were generally better reported, technical documentation remained mostly unclear, including the methods used to transform or extrapolate data beyond observed values, model validation assumptions, the techniques employed, and the sources of assumptions. In light of this, it is highly recommended that future research considers the recommendations of the ISPOR Modeling Good Research Practices Taskforce [66] and the framework for assessing quality in decision-analytic models suggested by Sculpher et al. [67].

Although the current consensus guidance on developing Health Economics Analysis Plans (HEAPs) has primarily focused on economic evaluations in randomized controlled trials (RCTs), all types of economic evaluations in future research would benefit from such plans. Thorn et al. have developed a 58-item template to prevent bias resulting from selective reporting or analyses and to enhance reproducibility [24, 68]. This template includes ensuring that details of the model structure are explained, published and preferably illustrated in a way that allows replication by any interested researcher [24].

Moreover, based on the findings, economic evaluations in the field of dementia seriously lack engagement in model development with people with dementia, caregivers, the general public, dementia communities, and other stakeholders affected by the study, such as clinicians or payers, or fail to describe their approach to such engagement [24]. This may reduce the relevance, acceptability, and consequently the validity of the research. Patients can potentially be involved in various parts of model development, including agreeing on a common purpose for the model, enhancing model performance based on stakeholders' values and concerns, and discussing uncertainties for decision making [69, 70].

In addition, involving individuals with abstract thinking abilities and some training in health economics modelling can enhance this process [69].

Approximately half of the studies included in the analysis did not adopt a societal perspective when conducting economic evaluations of dementia interventions, despite the fact that the societal perspective is considered the reference case for ensuring the quality and comparability of economic evaluations [71]. One possible explanation is the lack of detail regarding the broader societal impacts of the interventions, or the difficulties involved in quantifying and valuing important costs, such as informal care costs [72], which represent a substantial portion of the non-healthcare costs associated with dementia [73]. Limited time and resources are other practical issues that often lead researchers to exclude the societal perspective and its relevant considerations from their analysis [72]. Nonetheless, omitting the societal perspective can lead to an incomplete understanding of the economic implications of dementia interventions [71].

Clarity in explaining the states and events incorporated in the model structure was lacking in fewer than 60% of the reviewed studies. Additionally, none of these studies considered all three domains of cognition, function, and behaviour comprehensively. Instead, they generally defined severity-oriented health states for the course of the disease, which could result in over- or underestimation of the treatment benefits, as discussed by Hernandez et al. [20] and Önen et al. [39]. However, Cohen and Neumann [74] argued that incorporating multi-attribute health states, including all possible states, is practically infeasible because of the high number of states that would need to be included in the model. In general, the finding concerning the limitations in distinguishing features of the condition is consistent with the results reported in other studies within the same field [19, 75].

We identified a significant limitation in the reporting of utility measurement and valuation methods in the included studies. While utilities are crucial for comparing interventions, the lack of detailed and transparent information on utility measurement and valuation methods may prevent decision-makers from fully evaluating the reliability and robustness of economic evaluations [76] unless they consult the referenced studies for the specific details. Although the ISPOR Task Force recently developed recommendations for identifying, reviewing and synthesizing health state utilities, the source of the values and the methods used to identify them were mostly not reported [77]. Given that journal word limits may restrict detailed reporting, the primary information should at least be included in a supplementary file and referenced in the paper. For outcome measurement, this information should include the choice of instrument, the method of instrument completion (self- or proxy-report), and the mode of administration (pen and paper, or online). For outcome valuation, it should include the preference elicitation technique (e.g., standard gamble or time trade-off), the country-specific value set used, and the composition and relevance of the sample included in the valuation study (e.g., a representative general population) [77].

The expected trend of decreasing utility values with disease progression is supported by the elicited values, as the mean utility values generally decrease from MCI to mild, moderate, and severe AD. Nonetheless, the wide range of values reported for the same disease state in different sources raises concerns about the potential impact on the estimated effectiveness of interventions and, subsequently, their cost effectiveness. Thus, in addition to collecting utility values systematically from sources that reflect population preferences and align with the study objectives [78], employing meta-analytic methods where multiple estimates exist for a particular health state could produce more reliable estimates with less uncertainty [78]. Finally, any uncertainty around the utility values needs to be investigated through parameter sensitivity analysis.
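As an illustration of how multiple utility estimates for one health state might be combined, the sketch below applies fixed-effect inverse-variance pooling in Python. This is only one of several meta-analytic approaches and is not the method of any included study; the three (mean, standard error) pairs for mild AD are hypothetical, and the function name `pool_fixed_effect` is our own. A fixed-effect model assumes the sources estimate a common underlying value, so a random-effects model may be more appropriate when the estimates vary as widely as those observed here.

```python
import math


def pool_fixed_effect(estimates):
    """Pool (mean, standard_error) pairs using inverse-variance weights.

    Returns the pooled mean and its standard error under a fixed-effect
    assumption, i.e. all sources estimate the same underlying utility.
    """
    if not estimates:
        raise ValueError("at least one estimate is required")
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled_mean = sum(w * mean for w, (mean, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_mean, pooled_se


# Hypothetical utility estimates for mild AD from three sources: (mean, SE).
MILD_AD_UTILITIES = [(0.68, 0.03), (0.74, 0.05), (0.60, 0.04)]

pooled_mean, pooled_se = pool_fixed_effect(MILD_AD_UTILITIES)

# Approximate 95% interval, usable as the range in a one-way sensitivity analysis.
lower = pooled_mean - 1.96 * pooled_se
upper = pooled_mean + 1.96 * pooled_se
print(f"Pooled utility: {pooled_mean:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

The resulting interval could then serve as the lower and upper bounds for the utility parameter in a sensitivity analysis of the kind recommended above.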

The NICE methods guide for technology appraisal recommends the use of utility values obtained through the EQ-5D method, along with published weights assigned to each EQ-5D health state [79]. Consistent with the findings in the existing literature, this review also observed that the EQ-5D is the most frequently used source of utility values in economic evaluations of dementia interventions. However, dementia can impact various health aspects that may not be adequately captured by the dimensions defined in generic measures [80].

The use of outcome measures in dementia is challenging and remains under debate. For example, while self-reported measures are widely used in outcome research of health interventions, the reliability and validity of such measures in dementia are debated because of cognitive issues among people with dementia [81, 82]. Previous research has indicated a weak correlation between self and proxy ratings for people in more advanced stages of AD [83]. Therefore, using proxy-report utility measurements rather than self-report instruments might be justified, although there is no consensus on this point [84]. The study by Smith et al. found that instead of substituting proxy-reports for self-reports in dementia, separate self- and proxy-report measures should be developed [85]. Accordingly, future economic evaluations should consider incorporating condition-specific measures alongside generic measures in both self- and proxy-rated modes.

Moreover, future economic evaluations need to clearly define the boundaries of health and social services as well as the informal care components incorporated in the study and the approach employed for their measurement. The duration of care, the number and type of care tasks (housework, personal care, support with mobility, administrative tasks and socializing) and the time specifically invested due to illness are the key elements that should be stated [86]. The current literature shows that approaches to informal care measurement and valuation have been inconsistent [86, 87], and a more consistent approach is needed to ensure comparability.

In general, the results of this study echoed the findings of the previous review by Nguyen et al. regarding the neglect of cost-effectiveness studies on non-pharmacological interventions and on dementia conditions other than AD [19]. There was also a small number of studies with a societal perspective, a high frequency of CUAs, and a predominance of Markov models as the model type. In relation to the quality of the methodology and modelling of CEAs, the results of this study did not show a substantive improvement compared with the previous study. In particular, models that represent dementia progression in all its aspects, such as behavioural, psychological and functional symptoms, are still missing.

Restricting inclusion to studies published in English or German is a limitation of this study. Moreover, because a broad range of interventions and a variety of health conditions were incorporated in different model structures, it was not possible to make a direct quantitative comparison across all findings. Another potential limitation is that data extraction from the included studies was conducted by a single reviewer. However, the strengths of the study include the use of all relevant databases, and the initial screening of eligible articles and cross-checking of the extracted information by two independent researchers.

5 Conclusion

This review informs future research and resource allocation by providing insights into model-based economic evaluations for dementia interventions and highlighting areas for improvement. Overall, the findings of this study indicate a substantial increase in the number of economic evaluation studies on dementia interventions conducted between 2018 and 2022 compared with previous research in the field. These studies have examined a diverse range of pharmacological, non-pharmacological, diagnostic, and preventive interventions in terms of their cost effectiveness. Nonetheless, these studies are restricted to a limited number of countries. Moreover, the quality of methodology and reporting of these studies, similar to the most recent reviews in this area, exhibits significant weaknesses that should be addressed by researchers to enhance the quality and rigour of future studies.