Introduction

It is estimated that 47 million people worldwide are living with dementia, and this is set to increase to more than 131 million by 2050 [1]. People with dementia experience higher levels of co-morbidities and may receive on average more medications than their cognitively intact counterparts [2]. They may also receive suboptimal care that includes poor pain control [3, 4] and prescribing of potentially inappropriate medications (PIMs) [5].

Prescribing is potentially inappropriate where there are drug–disease or drug–drug interactions, where the risks of a medication outweigh the benefits, where there is a lack of evidence for a medication or where time to benefit of treatment exceeds an individual’s life expectancy [6]. Potentially inappropriate prescribing (PIP) has been associated with an increased risk of adverse drug events (ADEs), hospitalisation, mortality and lower quality of life in older people with and without dementia [7, 8].

Many tools have been developed to identify PIP in older people for use in research and in clinical settings. An overview of published tools identified 46 different tools for identifying PIP, of which 36 were developed for use with older people and 6 for use in long-term care settings [9]. The most commonly used tool is the Beers criteria which were first published in 1991 and have been regularly updated since [10,11,12,13,14]. They include lists of medications considered inappropriate for older people in general and also provide a list of PIMs for specific conditions such as dementia and Parkinson’s disease. Another commonly used tool is the STOPP START criteria which in addition to identifying medications that may need to be stopped also identify where there may be a potential under-use of medications [15, 16]. The content of the different tools varies, and there is a lack of consensus about what medications are inappropriate and under what circumstances [17, 18].

Several studies have been published that describe and compare tools to identify PIP; however, they have focused on identifying PIP in older people in general [9, 18, 19]. To date, there has not been a systematic review that has assessed how the tools are being utilised in studies of PIP in older people with dementia. Given the higher number of co-morbidities and the increased drug burden experienced by this group compared with their non-cognitively impaired counterparts [2, 20], a systematic review of studies utilising tools to identify PIP in this cohort is an essential addition to the literature.

The aims of the review were:

  1. To describe and summarise studies that have used a published tool to identify PIP in people with dementia.

  2. To report the prevalence of PIP and the medications identified as inappropriate in the included studies.

  3. To describe the potential advantages, disadvantages or complications of using the tools as identified by the authors of the included studies.

Methods

The review was conducted according to the recommendations set out in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [21].

Search strategy

A systematic search of the published peer-reviewed literature and grey literature was conducted in April 2016. Literature search strategies were developed using Medical Subject Headings (MeSH) terms and text words around the topic of dementia and inappropriate prescribing. Appendix Table 3 documents the full search strategy for Medline, which was then adapted as appropriate for the other databases. The following databases were searched: Medline (Ovid), CINAHL Plus (EBSCO Host), Embase (Ovid), PsycInfo (Ovid), Social Science Citation Index (Web of Science) and the Cochrane Library (Wiley). There was no restriction by year of publication. Searches were also conducted for grey literature using the following online databases: OpenGrey, Grey Literature Report, Mednar, BASE and the National Database for Ageing Research. A search of the reference lists of retrieved papers was also conducted. Table 1 lists the criteria for including studies in the systematic review.

Table 1 Inclusion and exclusion criteria

Data extraction, assessment and analysis

Two authors (DH and JB) independently reviewed the titles and abstracts against the inclusion criteria. The full papers for references that appeared to meet the criteria, or those for which there was uncertainty, were retrieved for independent review by both authors (DH and JB). Disagreements about which papers should be included were resolved by discussion. The third author (UM) was available to resolve any ongoing disagreements about inclusion. Where full-text articles were excluded, the reason for exclusion was recorded (Fig. 1).

Fig. 1 PRISMA flow chart of the literature search

A standardised list of data to be extracted was created, based on the aims and objectives of the review. The data extracted were authors and year of publication, study aims and objectives, study design, tools used, study setting, sample size, demographic data, prevalence of polypharmacy, prevalence of PIP, most commonly prescribed PIP medications, summary of main findings, strengths and limitations of the tools and authors’ conclusions. The data were extracted independently by two of the authors (DH and JB), and any disagreements were resolved by discussion.

Quality assessment

Critical appraisal of the included papers was conducted using the Hawker Tool [22]. The papers were independently assessed by two of the authors (DH and JB) for methodological quality and risk of bias. Studies were weighted but were not excluded from the review on the basis of quality. This tool was chosen because it is designed to assess study quality across multiple study designs and has been used for this purpose in previous systematic reviews [23,24,25,26,27,28]. The Hawker Tool assesses the reporting of a study in the following areas: abstract and title, introduction and aims, method and data, sampling, data analysis, ethics and bias, findings and results, transferability and reliability, and implications and usefulness. Each of the nine areas is given a score of 1 (very poor), 2 (poor), 3 (fair) or 4 (good), which gives a total overall score out of 36 for each paper [22].
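The scoring scheme described above can be illustrated with a minimal Python sketch. This is not the authors' procedure or an official implementation of the Hawker Tool: the function name `hawker_total` and the example scores are hypothetical, and only the nine areas and the 1 to 4 scale are taken from the text.

```python
# Illustrative sketch only: the nine areas and the 1-4 scale come from the text;
# the function name and the example scores below are hypothetical.
HAWKER_AREAS = (
    "abstract and title",
    "introduction and aims",
    "method and data",
    "sampling",
    "data analysis",
    "ethics and bias",
    "findings and results",
    "transferability and reliability",
    "implications and usefulness",
)


def hawker_total(scores):
    """Sum the nine area scores (each 1-4) into a total out of 36."""
    missing = set(HAWKER_AREAS) - set(scores)
    if missing:
        raise ValueError(f"missing areas: {sorted(missing)}")
    for area in HAWKER_AREAS:
        if scores[area] not in (1, 2, 3, 4):
            raise ValueError(f"score for {area!r} must be 1, 2, 3 or 4")
    return sum(scores[area] for area in HAWKER_AREAS)


# Hypothetical paper (not one of the included studies): "fair" in every area
# except ethics and bias, which is scored "poor".
example_scores = {area: 3 for area in HAWKER_AREAS}
example_scores["ethics and bias"] = 2
example_total = hawker_total(example_scores)
print(f"{example_total}/36")
```

Because every area has a minimum score of 1, the lowest possible total is 9 rather than 0.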

Results

Search results

The search of biomedical databases identified 4597 papers, and a further 712 were identified through the grey literature search and hand searching of reference lists. After removal of duplicates, there were 3626 unique studies. After eligibility screening, 47 full-text articles were assessed for inclusion, of which 26 papers were included in the review (Fig. 1).
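As a rough arithmetic check on the flow of records, the counts reported above can be combined as in the Python sketch below. The derived figures (total identified, duplicates removed, records excluded at each stage) are not stated in the text; they are computed here on the assumption that records were lost only through de-duplication, eligibility screening and full-text exclusion.

```python
# Counts reported in the text.
database_records = 4597      # biomedical database search
other_records = 712          # grey literature and reference-list searching
unique_records = 3626        # after removal of duplicates
full_text_assessed = 47      # full-text articles assessed for inclusion
included = 26                # papers included in the review

# Derived values: assumptions of this sketch, not figures reported by the authors.
total_identified = database_records + other_records
duplicates_removed = total_identified - unique_records
excluded_at_screening = unique_records - full_text_assessed
excluded_at_full_text = full_text_assessed - included

print(total_identified, duplicates_removed, excluded_at_screening, excluded_at_full_text)
```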

Quality assessment

The Hawker score for the included studies ranged from 22/36 [29, 30] to 36/36 [31, 32]. The median total score was 29/36. Previous reviews using the Hawker Tool [22] have set a cut-off of 20/36 or above to indicate a study that is of a fair to good quality [25, 27]. On this basis, all 26 studies would be classed as at least fair quality. Overall, the studies scored lowest for reporting on Ethics and Bias and also Transferability and Reliability (Appendix Table 4).

Characteristics of the included studies

Table 2 outlines the characteristics of the 26 included studies including study design, tools used to identify potentially inappropriate prescribing, country, setting, participants and summary of results. Twenty-five were observational studies that used at least one tool to measure the prevalence of PIP [2, 5, 29, 31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52]. One study was a non-randomised evaluation of an intervention to improve medication management [30].

Table 2 Characteristics of included studies

Across all 26 studies, 26,534 participants were recruited, of which 21,285 (80%) had dementia or cognitive impairment, 968 (4%) had mild cognitive impairment and 4281 (16%) were non-cognitively impaired controls. A total of 17,928 of the participants were female (68%). Mean age ranged from 72.5 [2] to 86.8 [5].
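The group percentages quoted above follow directly from the reported counts. The short Python sketch below reproduces them; the variable names are illustrative, and rounding to the nearest whole percent is an assumption about how the figures were presented.

```python
# Participant counts reported in the text.
groups = {
    "dementia or cognitive impairment": 21285,
    "mild cognitive impairment": 968,
    "non-cognitively impaired controls": 4281,
}
female_participants = 17928

total_participants = sum(groups.values())

# Share of each group, rounded to the nearest whole percent (assumed rounding rule).
shares = {name: round(100 * n / total_participants) for name, n in groups.items()}
female_share = round(100 * female_participants / total_participants)

for name, pct in shares.items():
    print(f"{name}: {pct}%")
print(f"female: {female_share}%")
```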

Number of medications and the prevalence of polypharmacy

Ten studies reported a range for number of medications prescribed; this was from 0 to 22 [5, 32, 33, 35, 37, 46,47,48, 50, 51]. Mean number of medications was reported in 19 studies [2, 5, 29,30,31,32, 34, 36, 37, 40,41,42,43, 45,46,47,48, 50, 51] and ranged from 4 [50] to 14 [31]. When comparing the mean number of medications between those with and without dementia, results were mixed, with two studies reporting a significantly lower mean number of medications in those with cognitive impairment [43, 45] and, by contrast, two studies reporting a significantly higher number in those with cognitive impairment [2, 41].

Eleven studies reported the prevalence of polypharmacy as the percentage of participants taking ≥ 5 medications [2, 30,31,32, 35, 38, 40, 42, 44, 46, 51]. The prevalence of polypharmacy ranged from 25% [38] to 98% [51] for people with dementia or cognitive impairment (see Table 2).

Prevalence of PIP

The prevalence of PIP was reported as the percentage of the cohort taking at least one potentially inappropriate medication. For people with dementia, this ranged from 14% [44] to 74% [31]. For non-cognitively impaired controls, the prevalence ranged from 11% [2] to 44% [43] (see Table 2).
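The two prevalence measures used in the included studies (polypharmacy as ≥ 5 medications, and PIP as at least one potentially inappropriate medication) can be sketched in Python as follows. The cohort below is entirely hypothetical and the function name `prevalence` is illustrative; the sketch shows the definitions only and does not reproduce any study's data or analysis.

```python
def prevalence(flags):
    """Return the percentage of participants for whom the flag is true."""
    flags = list(flags)
    if not flags:
        raise ValueError("cohort is empty")
    return 100 * sum(flags) / len(flags)


# Hypothetical cohort: participant id -> (number of medications, number of PIMs).
cohort = {
    "p1": (7, 1),
    "p2": (3, 0),
    "p3": (5, 2),
    "p4": (9, 0),
}

POLYPHARMACY_THRESHOLD = 5  # ">= 5 medications", as defined in the text

polypharmacy_prevalence = prevalence(
    n_meds >= POLYPHARMACY_THRESHOLD for n_meds, _ in cohort.values()
)
pip_prevalence = prevalence(n_pims >= 1 for _, n_pims in cohort.values())

print(f"polypharmacy: {polypharmacy_prevalence:.0f}%, PIP: {pip_prevalence:.0f}%")
```

Under this definition, a participant counts once towards PIP prevalence regardless of how many potentially inappropriate medications they take.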

In studies comparing the prevalence of PIP between those with and without cognitive impairment or dementia, two studies found a significantly higher prevalence in those with dementia [2, 39], three found a significantly lower prevalence in those with dementia [41, 43, 47], and two found no significant difference between the groups [31, 37].

The most commonly prescribed potentially inappropriate medications were anxiolytic–hypnotics and anticholinergic medications [2, 29, 31,32,33, 35, 37,38,39, 43,44,45,46,47, 50, 51]. Rates of anticholinergic prescription ranged from 6% [45] to 46% [32]. Rates of anxiolytic–hypnotic (including benzodiazepines) use ranged from 5% [43] to 38% [51]. Other commonly prescribed potentially inappropriate medications included lipid-lowering medications [34, 48, 49] and antiplatelets [34], oestrogens [36, 40, 41, 47], non-steroidal anti-inflammatory drugs (NSAIDs) [5, 36, 47], antipsychotics [5] and proton pump inhibitors [5].

The use of a tool to identify potentially inappropriate prescribing

The Beers criteria [11,12,13] were the most commonly used of the tools (Table 2). Thirteen out of the 16 studies that used the Beers criteria did not apply the full tool. Of these, seven studies only used the list of potentially inappropriate medications for older people in general [29, 40,41,42, 45, 47, 52] and four applied only the disease-specific part of the tool [33, 35, 38, 39]. Two studies were unable to apply the full criteria as some medications were unavailable in the countries where the studies were conducted [43, 46].

Three of the five studies using the Holmes criteria [57] defined potentially inappropriate medications as only those drugs on the ‘Never Appropriate’ list [38, 48, 49], one defined potentially inappropriate medications as those on the ‘Never Appropriate’ and ‘Rarely Appropriate’ lists [34] and one applied the whole criteria to their dataset [50]. Three of the studies using the STOPP START criteria [15] applied only the STOPP criteria to their data [2, 5, 35].

Eight studies utilised at least one additional tool to identify potentially inappropriate medications, five of which used a published tool to identify anticholinergic medications [2, 32, 35, 39, 46]. Four studies used an additional list of potentially inappropriate medications devised by the authors [39, 42, 43, 46].

Problems using the tools

Seventeen studies identified potential issues associated with using the tools [5, 30, 31, 33, 35, 38, 40,41,42,43,44,45,46,47,48,49,50]. One problem identified was that the consensus methods used to develop the tools meant the risk–benefit profile was subjective [33, 38]. In addition, the NORGEP criteria did not include drugs that may be inappropriate for specific co-morbidities and as such may have resulted in an underestimation of the prevalence of PIP [44].

Also, the type of data collected would determine the extent to which the criteria could be applied. For example, one study could only apply 31 out of the 65 STOPP criteria [15] because of a lack of diagnostic information available in the notes [5]. The authors note that this lack of information could in itself contribute to inappropriate prescribing [5].

Two versions of the Beers criteria [11, 12] were not easily applied in countries outside of the USA because some medications listed were not available in other countries [42, 43, 45, 46]. One study compared the Beers criteria [12] and the Laroche List [59] and found a higher prevalence of PIP using the latter [42]. The authors argue this demonstrates the Laroche List [59] is better suited to a European population. In addition, another study identified that the Beers criteria [12] needed to be updated to include more sedative and anticholinergic medications [40].

The Holmes criteria [57] may need more validating and updating. The authors of the studies using this tool highlighted that some medications may have been placed in the wrong category given more recent evidence for their use in older adults with advanced dementia [38, 48,49,50].

Discussion

This is the first systematic review to describe how tools designed to identify PIP were being used in studies of older people with dementia. In this review, the majority of included studies (21 out of 26) were published from 2010 onwards, perhaps reflecting an increasing awareness of the importance of rational prescribing in this cohort of older patients.

The Beers criteria [11,12,13] were the most commonly used of all the tools. With the exception of the STOPP START [15] and the Holmes criteria [57], all the other tools used were developed as country-specific versions of the Beers criteria. The review demonstrates that even when the same tool was used, how it was applied varied across studies. For example, several studies did not apply the disease-specific part of the Beers criteria to identify PIMs for people with dementia [29, 40,41,42, 45, 47, 52], mainly due to a lack of diagnostic or prescribing information. This may have led to an underestimate of the prevalence of PIP. Three studies compared the prevalence of PIP between those with and without dementia and reported a significantly lower prevalence in the dementia group [41, 47, 52]. Prevalence rates for PIP in people with dementia may have been higher had the disease-specific tools been applied. Prevalence under-reporting was also found in a systematic review of studies using the Beers criteria to identify PIP in older people [61]. When designing a study, researchers need to consider what medication and diagnostic data they will need to collect to fully apply their chosen tool. This will reduce the risk of underestimating the prevalence of PIP. There may also be scope for reviewing how hospitals, general practices and care homes record prescribing and diagnostic information, how this can be improved and whether better recording of such information would lead to improvements in prescribing.

Polypharmacy and inappropriate medication use

This review found that prevalence of polypharmacy in those with dementia (25% to 98%) is high. This range of polypharmacy prevalence is similar to that in previous studies of polypharmacy in older people with dementia [62,63,64,65,66].

The prevalence of PIP in people with dementia ranged from 13 to 74% and from 11 to 39% for people without dementia. This is despite over 20 years of research into inappropriate prescribing in older people [10, 67, 68].

We were unable to reach a conclusion as to whether the prevalence rates of PIP are significantly higher in people with dementia compared with people without cognitive impairment. Some studies reported a higher rate of PIP in those with dementia [2, 39], while some reported the opposite effect [41, 43, 47] and others reported no difference between the two groups [31, 37]. The different findings may be partly due to the different tools used, how they were applied and variations in study design.

The most commonly prescribed potentially inappropriate medications reported were sedatives and anticholinergic medications. This is despite strong evidence that these medications should be avoided because of their adverse cognitive effects in people with dementia [69,70,71,72]. Only two studies correlated the use of such medications with severity of dementia [31, 38], which is surprising given the known risks associated with their use. Future research should focus on the use of medications with strong central nervous system (CNS) effects in people with dementia and how this changes over the disease course. Oestrogen was another commonly prescribed PIM despite an increased risk of endometrial and breast cancer and some evidence that it may increase cognitive decline in postmenopausal women [12, 41, 73, 74].

Advantages and disadvantages of the tools

The included studies highlighted some potential advantages and disadvantages of the tools. One study identified that both the STOPP START and the Beers criteria were internationally recognised and well-validated tools, making them the preferred choice despite any potential disadvantages [35].

However, medication or diagnostic information needed to apply sections of the criteria was not always available. For example, studies using the Beers criteria, or tools adapted from them, were unable to obtain information about dose or diagnosis in patient notes which meant parts of the tools were not used [40,41,42, 47]. In addition, one study had to exclude more than half of the STOPP criteria due to a lack of availability of diagnostic and other clinical information in patient notes [5]. It may be that the STOPP START criteria would be more useful in a clinical setting than in research. It could be used to support medication reviews in the presence of the patient where the necessary clinical information needed to fully utilise the tool can be obtained. The STOPP START is one of the few tools designed to identify both overuse and underuse of medications and as such would enable a comprehensive review of a patient’s medications compared with other tools such as the Beers criteria.

Studies utilising the STOPP START criteria [15] to identify PIMs excluded the START component and therefore did not identify cases where there was under-use of medications [2, 5, 35]. Given that people with dementia may be at risk of under-treatment, including poor pain control [3, 4], it is important to consider this aspect of potentially inappropriate prescribing in both clinical and research settings.

As new evidence becomes available, the tools need regular updating but this may not always be done. The Holmes criteria [57] were found to be in need of updating. In four of the studies using the Holmes criteria [34, 38, 48, 49], acetylcholinesterase inhibitors were among the most commonly prescribed PIMs. These medications were placed in the Never Appropriate category of the Holmes criteria [57] because at the time of the criteria’s development there was little evidence to support their use for moderate to severe dementia [48]. However, recent evidence suggests they may continue to have a positive effect in later stages of the disease [75, 76] and this has resulted in changes to clinical guidelines on use of donepezil in people with dementia [41].

One study highlighted that the Beers criteria [12] were not comprehensive enough, particularly regarding anticholinergics and sedatives [40]. Subsequent versions of the Beers criteria [13, 14] include a list of anticholinergic and sedative medications that should be avoided particularly in older people with dementia or delirium. However, this list is not as comprehensive as the tools developed specifically to identify anticholinergic medications [55, 56, 77] which were used by some of the included studies in this review [2, 32, 35, 39, 46].

In choosing an appropriate tool to identify PIP for a research study, pragmatic consideration needs to be given to its comprehensiveness, how up to date it is and whether the tool can be fully utilised given the available drug and medical history information of the cohort under investigation.

Quality of the included studies

This review found that the reporting of the studies as measured by the Hawker Tool was generally good, with most studies scoring well overall. However, most of the studies could have better reported ethical and bias considerations and discussed generalisability of their findings in more detail. Better reporting of studies is crucial to being able to judge their potential for bias and the reliability of results.

The generalisability of the findings of this review may be limited by the marked heterogeneity of the methodologies employed in the included studies, for example, the variations in which tools were used and how they were utilised and the variations in study population and settings.

Twenty-five out of the 26 included studies were observational in design, and of these, only three were prospective cohort studies [39, 45, 48] while the remaining studies collected cross-sectional or retrospective data. The benefit of a prospective design is the reduced risk of bias and confounding. However, prospective studies conducted over many years run the risk of a high rate of attrition which the authors of one study noted was a limitation of their work [39]. The only intervention study [30] utilised an uncontrolled “before and after” design which risks an overestimation of the effect of the intervention. In addition, four studies may have had low statistical power due to a small sample size of less than 150 which may have affected the reliability of their results [5, 29, 30, 33]. More robust study designs that limit the opportunity for confounding or bias are needed in this area of research. A similar observation was made in a previous review of studies using the STOPP START criteria [78].

Strengths and limitations of the review

To date, this is the first systematic review of studies using tools to identify PIP in people with dementia, and it was designed using robust methodology [21]. A previous literature review [79] covered a similar topic; however, it was not a rigorously conducted systematic review. In addition, this review is more up to date and includes studies published as recently as 2016.

There are several potential limitations. Two papers by the same authors were included despite the potential for some overlap of participants in the two studies [40, 41]. Although the participants in each study appeared to be distinct, it was not entirely clear whether some of the participants may have taken part in both studies; therefore, the demographic data may be slightly overestimated.

A meta-analysis of the extracted data for prevalence of, or factors associated with, PIP was not possible due to the heterogeneity of the included studies in terms of methodology, participants, tools used and the application of those tools. There was also heterogeneity in the reporting of data such as mean/median number of medications and prevalence of polypharmacy across the included studies. Where reported in the original articles, the mean or median number of medications is presented in Table 2.

Implications for practice

This review has highlighted the need for a more standardised approach in the use of the tools that have been developed to identify PIP in older people with dementia. Tools identifying PIP cannot replace clinical judgement, and a medication identified as potentially inappropriate using such tools may subsequently be found to be appropriate following a full clinical assessment. Therefore, the use of such tools should be seen as a guide to aid clinical decision-making.

The use of more than one tool in several of the included studies suggests that current tools need to be more comprehensive to ensure that all potentially inappropriate medications are included. Consideration of whether the use of anxiolytic–hypnotic and anticholinergic medications is appropriate is particularly important given their effects on older people in general, but especially in people with dementia. As a minimum, clinicians should consider focusing on such CNS-PIP medications as a priority for deprescribing in this cohort. One of the published anticholinergic tools such as the Drug Burden Index [56] or the Anticholinergic Cognitive Burden Scale [77] may prove a useful guide to aid decision-making when reviewing a patient’s medications.

When deciding which tool to use, consideration needs to be given to what data will be needed to fully utilise the tool, the location of use, whether these data can be obtained and how to obtain them. Given that the tools have many differences, and none is universally applicable, two (or more) complementary tools (preferably recently updated) might be needed for a thorough assessment of PIP; using a tool which identifies both PIPs and omitted drugs would increase the clinical impact.

Future research

This review has demonstrated that rates of inappropriate prescribing and polypharmacy amongst older people with dementia are high. In particular, rates of anticholinergic and sedative medication use are high despite evidence of the risks associated with their use in people with dementia. Future research should focus on why this is the case and how the prescribing of these medications can be reduced.

As PIP is associated with the number of medications a patient is taking [5, 33, 37, 38, 41, 42, 52], future studies could focus on identifying a standardised way to present drug exposure that includes a definition of excessive polypharmacy (e.g. the use of ≥ 10 medications) and the use of concomitant medications as well as overuse or underuse of medications. Furthermore, given the known risks associated with the use of medications such as anticholinergics and sedatives in people with dementia, studies that correlate potentially inappropriate use of central nervous system medications with disease severity are needed.

Conclusions

This review found that the application of tools varied considerably. This may in part explain the variations in prevalence of PIP found across the studies. To be effective, the tools need to be regularly updated, and they may not yet be comprehensive enough to identify all potentially inappropriate medications. The review also demonstrated that despite long-standing awareness of inappropriate prescribing, prevalence of PIP remains high for both older people in general and older people with dementia in particular.