Background

Relatively little is known about the level of evidence-based care provided to older adults living in long-term care (LTC) at a population level. Knowledge of evidence-based care in this sector is limited to studies of single conditions such as diabetes, of a limited set of indicators, or of a small number of sites [1,2,3]. Unlike in healthcare more broadly, population-based studies using a standardised method across multiple conditions/processes of care have not been undertaken in LTC.

The reliable delivery of evidence-based care to LTC residents is a fundamental human right and is important to maximise their quality of life and reduce the incidence of adverse events. For example, the prevalence and chronicity of pain among LTC residents is under-detected and pain is therefore often inadequately managed [4]. As a result, residents can experience reduced quality of life with impaired physical and cognitive functioning, poor emotional and mental well-being, and increased social isolation [5]. Polypharmacy (> 9 concurrent medications) and overuse of specific agents such as antipsychotics and opiates are common in LTCs and can increase the risk of adverse events including cerebrovascular accidents, cognitive deterioration, and falls [6]. National reports into LTC in Australia [7,8,9], the United States (US) [10], the United Kingdom (UK) [11, 12] and Canada [13] repeatedly highlight major safety and quality issues for residents including neglect of wounds, incontinence, failure to recognise malnutrition, and poor management of medication which can be, in part, related to evidence-based care not being delivered to residents in a reliable manner.

Providing evidence-based care to elderly residents in LTC is likely to become more challenging. Populations in high-income countries are ageing, and worldwide, the number of persons aged ≥ 80 years is expected to triple between 2020 and 2050, reaching 426 million [14]. More elderly residents are presenting with co- and multi-morbidities [15], fragility and cognitive decline [16]. Scarcity of financial resources and appropriately trained staff and a rapidly changing evidence-base provide further stress to LTC systems [16]. Given these sustainability challenges for LTC, understanding the level of evidence-based care delivered to this vulnerable population, now and into the future, to help direct local and system-level quality improvement initiatives, is vital.

The aim of this study, CareTrack Aged, was to estimate the prevalence of evidence-based care, as measured by adherence to clinical practice guideline (CPG) recommendations in the care received by a population-based sample of Australian residents in LTC aged ≥ 65 years in 2021.

Methods

The CareTrack Aged study methods have been published elsewhere [17, 18]. We reviewed a sample of 294 care records of LTC residents aged ≥ 65 years as of 1 March 2021, against indicators derived from CPG recommendations for care delivered between 1 March 2021 and 31 May 2021 (our record review period).

Development and ratification of clinical indicators

We aimed to develop a set of indicators that represented evidence-based care delivered to residents of Australian LTC facilities in 2021. The RAND-UCLA Delphi method to develop indicators was applied [18, 19] (Fig. 1). Sixteen medical conditions or processes of care (Table 1) were selected for inclusion based on a systematic international search for prevalence and burden of disease data, CPGs, and indicator sets relevant to LTC published between 2013 and 2018 [17, 18]. These included high prevalence conditions, such as cognitive impairment which affects over half (54%) of LTC residents [20], and frequently used processes of care, such as medication management [21].

Fig. 1

The process for developing and ratifying CareTrack Aged evidence-based care indicators following Hibbert et al. (2022) [18]

Table 1 Examples of included indicators by phase of care and quality type

Recommendations (n = 5609) were extracted from 139 CPGs relevant to the 16 conditions/processes of care and screened for eligibility; the research team excluded 2136 recommendations by consensus for one or more of four reasons: (1) weak strength of the recommendation indicated by wording such as “may” or “could”; (2) low likelihood of the information being documented; (3) guiding statements without recommended actions (e.g. “consideration should be given to”); and (4) “structure-level” recommendations (e.g. general instructions for personal protective equipment) [18, 22]. The 3473 remaining recommendations were grouped into a standardised indicator format and, after consolidation of similar recommendations, 1790 were used to draft 630 initial indicators [18].

Australian-based LTC experts (n = 41) were recruited to review the draft indicators [18]. Their profiles are outlined in Additional file 1: Table S1 [18]. Experts ratified the proposed indicators over a two-stage modified Delphi process, working independently to minimise group influence [23].

Experts scored the appropriateness of each of the draft indicators on a 9-point Likert scale (9 = highly appropriate, 1 = not at all appropriate) in line with the RAND-UCLA Delphi method [19]. In addition, they scored the indicators against three more specific criteria (acceptability, feasibility and impact, scored as ‘Yes’, ‘No’ or ‘Not Applicable’) [18] consistent with the process used in two previous CareTrack studies measuring evidence-based health care delivered to adults [24] and children [25]. Reviewers could also provide additional comments. Feedback was collated to revise indicators between rounds. Indicators with an average appropriateness score of less than 7 or a majority score of ‘No’ across any of the scoring criteria were excluded. This resulted in the removal of 394 indicators, leaving 236 representing evidence-based care in LTC residents [18]. These indicators were categorised by the type of quality of care addressed (e.g. underuse, overuse) and the phase of care (e.g. diagnosis/assessment, treatment, monitoring/review).
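The exclusion rule described above can be sketched in a few lines of Python. This is an illustrative reading of the rule, not the study's own tooling: the function name `retain_indicator` and the data layout are hypothetical, and "majority" is assumed here to mean more than half of all votes cast on a criterion (including ‘NA’ votes).

```python
from statistics import mean


def retain_indicator(appropriateness_scores, criteria_votes):
    """Return True if a draft indicator survives the exclusion rule.

    appropriateness_scores: list of 1-9 scores, one per expert.
    criteria_votes: dict mapping a criterion name (e.g. 'feasibility')
        to a list of 'Yes' / 'No' / 'NA' votes.
    Assumption: 'majority' means more than half of all votes cast.
    """
    # Exclude when the average appropriateness score is below 7.
    if mean(appropriateness_scores) < 7:
        return False
    # Exclude when 'No' is the majority vote on any single criterion.
    for votes in criteria_votes.values():
        if votes.count("No") > len(votes) / 2:
            return False
    return True


votes = {
    "acceptability": ["Yes", "Yes", "No"],
    "feasibility": ["Yes", "NA", "Yes"],
    "impact": ["Yes", "Yes", "Yes"],
}
print(retain_indicator([8, 9, 7], votes))  # True
```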

A single indicator was frequently separated into multiple indicator questions. For example, one indicator related to residents receiving a comprehensive physical assessment post-fall, within 1 week, of their gait, lower limb muscle strength and joint function. This generated three indicator questions, related to assessment of gait, lower limb muscle strength and joint function. The 236 indicators generated 323 indicator questions that were grouped into 16 conditions/processes of care to assess evidence-based care [18]. Examples of indicators are shown in Table 1, with full listing in Additional file 2: Table S2 [26–129].

Sampling process

A multistage sampling process was applied. Sampling was initially planned within three Australian states, Queensland, New South Wales and South Australia (SA). However, due to constantly changing government restrictions during the COVID-19 pandemic, only LTC facilities in SA were recruited. The profile of the facilities and residents in SA is similar to that of Australia as a whole (Tables 2 and 3) [130, 131].

Table 2 Demographic characteristics of the participants in the study compared to Australian long-term care residents
Table 3 Demographic characteristics of the facilities included in the study compared to Australian long-term care facilities

The sampling frame for LTC facilities was the Aged Care Service List [131], which groups LTC facilities into Aged Care Planning Regions [132]. The list includes the number of licensed beds at each facility, the Australian Standard Geographical Classification of Remoteness Areas (Major Cities, Inner Regional, Outer Regional, Remote and Very Remote) and organisation type (Charitable, Community-based, Local Government, Private, Religious, and State Government).

Within facilities, sampling was restricted to permanent residents aged ≥ 65 years on 1 March 2021 who resided in the facility in the record review period. This period was selected because, in SA, COVID-19 prevalence and associated social restrictions were relatively low.

We aimed to sample the records of 12 residents per facility. We purposively sampled four residents who were admitted within our timeframe (‘admission’) and four who died within it (‘end-of-life’). Within each consenting facility, the eligible residents were identified by the facility and listed in random order; care records were accessed until a quota of 12 was reached.

Recruitment of long-term care facilities

Within SA, as of 30 June 2021, there were 272 listed facilities operated by 94 separate providers, with 18,847 funded residential beds. Of these, the initial sampling frame was created by restricting to services with the ‘Residential’ Care Type (i.e. excluding ‘Multi-purpose’ and ‘National ATSI Aged Care program’ services), to focus on LTC beds in services with 20 or more funded residential beds, for logistical reasons. The initial sampling frame thus comprised 84 providers (89% of total SA providers), operating 235 facilities (86% of SA) with 18,055 beds (96% of SA) (see Additional file 3: Fig. S1).

For practical reasons, reflecting management realities during the COVID-19 pandemic, facilities beyond a 3-hour drive from the SA state capital city (Adelaide) were excluded, as were facilities run by private sector organisations, which were proving difficult to recruit. The final sampling frame was thus reduced to 54 providers (64% of the initial sampling frame) operating 150 services (64%) containing 11,345 funded residential beds (63%).

Twenty-four of the 54 providers (44%) were approached directly after being recommended by colleagues or other providers. Of those, thirteen providers (54%) agreed to participate. For 10 providers, all LTC facilities were included (n = 15 facilities) while for the other three a subset of facilities (n = 10 from a total of 27 eligible) were randomly sampled by the providers; sampled facilities included 1927 of 3233 residential beds (59.6%) operated by these three providers.

Sample considerations

As noted previously, there are 323 indicator questions, grouped into 16 conditions/processes of care. Not all indicator questions are assessable for all residents. The underlying unit of analysis is the assessed indicator. As some questions were anticipated to apply to few people, sample sizes were estimated on the basis of that required to achieve a desired precision when assessed indicators are aggregated at the level of the condition/process of care rather than the individual indicator question. For example, in a condition with 10 questions and an average of 56 assessed indicators per question, there would be 560 assessed indicators for analysis of adherence at the level of the condition.

With simple random sampling, approximately 400 assessed indicators are required to obtain estimates with a precision of ± 5% at an estimated 50% adherence (i.e. the adherence rate that generates the widest binomial confidence interval). We anticipated requiring substantially more than 400 assessed indicator questions per condition to compensate for the ‘design effects’ described below, and therefore deliberately sought to achieve more than 400 assessed indicator questions per condition. This target was not precise, however, because we did not know, a priori, how many indicator questions would be assessable per resident.
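The "approximately 400" figure can be reproduced with the standard normal-approximation formula for a proportion, n = z²p(1 − p)/d². The sketch below is a minimal illustration under that assumption (the function names are hypothetical, and the normal approximation is a simplification of the binomial interval mentioned in the text). The same formula, solved for the half-width, gives the ± 20% precision quoted in the Analysis section for an indicator question with only 25 assessments.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def required_n(half_width, p=0.5, z=Z_95):
    """Assessed indicators needed under simple random sampling."""
    return math.ceil(z**2 * p * (1 - p) / half_width**2)


def half_width(n, p=0.5, z=Z_95):
    """Half-width of the normal-approximation interval for n assessments."""
    return z * math.sqrt(p * (1 - p) / n)


print(required_n(0.05))          # 385, i.e. roughly 400
print(round(half_width(25), 3))  # 0.196, i.e. roughly +/- 20%
```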

For each selected resident, record reviews were conducted for all conditions/processes relevant to their care, and surveyors determined if each indicator was relevant and, if deemed relevant, determined whether or not care was adherent. Multiple assessed indicators are clustered within a resident’s record and these residents are in turn clustered within a service, which is in turn clustered within an organisation. If adherence rates are more similar within than between these clusters, this leads to a design effect, resulting in wider confidence intervals [133]. Estimation of this design effect requires knowledge of the prevalence of the indicator questions, and the intracluster correlation coefficient (ICC), a measure of the extent of within-cluster similarity; neither of these critical pieces of information was available when the study commenced.

Resources had been provided for reviewing 400 care records. As each condition/process contains multiple indicators, each record review can be expected to generate multiple assessed indicators. We therefore generated a set of simulations to assess the implications of clustering by resident, facility and provider on required sample sizes; using the ultimate cluster assumption, the final analysis would adjust for clustering by provider. With resident as the only unit of clustering, any ICC could be tolerated as long as 400 residents were sampled. With facility as the level of clustering, 25 facilities and 80 or more condition-questions assessed per facility, an ICC of 0.05 could be tolerated; with 40 or fewer questions, an ICC of 0.03 could be tolerated. Assuming two facilities per provider, an ICC of 0.02 could be tolerated if 120 or more indicator questions were assessed for the condition; if higher ICCs were encountered, this would result in wider confidence intervals, and vice versa. Based on these simulations it was decided to sample 16 residents in each of 25 facilities (i.e. n = 400 care records in total) which was assumed to generate over 1500 assessed indicators for most conditions/processes of care (i.e. 120/provider or 60/facility). This target number of care records was subsequently reduced to 300 residents in total, as a result of restrictions in response to the COVID-19 pandemic. It was decided to retain the number of facilities but reduce the number of residents/facility to 12, allowing tolerance for a higher ICC.
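The study assessed these trade-offs by simulation. As a rough cross-check only, the closed-form Kish design effect, DEFF = 1 + (m − 1) × ICC for clusters of equal size m, gives effective sample sizes of a similar order. The sketch below assumes equal cluster sizes and a single level of clustering, which the simulations did not necessarily assume; the function names are hypothetical.

```python
def design_effect(cluster_size, icc):
    """Kish design effect for equal-sized clusters."""
    return 1 + (cluster_size - 1) * icc


def effective_n(n_clusters, cluster_size, icc):
    """Number of independent observations the clustered sample is worth."""
    return n_clusters * cluster_size / design_effect(cluster_size, icc)


# 25 facilities, 80 assessed questions per facility, ICC = 0.05
print(round(effective_n(25, 80, 0.05)))  # 404
# 25 facilities, 40 assessed questions per facility, ICC = 0.03
print(round(effective_n(25, 40, 0.03)))  # 461
```

Both scenarios leave an effective sample size close to the roughly 400 independent assessments needed for ± 5% precision, which is consistent with the tolerable ICCs reported above.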

Data collection tools

A bespoke web-based data collection tool, developed for the CareTrack Australia study [24], was modified for LTC conditions/processes and indicators. A manual (which is available on request) was developed which included instructions, condition/process-specific definitions, inclusion and exclusion criteria, and guidance for assessing indicator eligibility.

Reviewer engagement, training and agreement based on kappa scores

Three experienced registered LTC nurses were recruited to review care records. They were all employed by the university and were independent of the recruited facilities. Prior to data collection, the reviewers undertook a 1-week training programme. Care records were reviewed on-site at each facility or off-site depending on accessibility of electronic systems; however, all facilities were visited by surveyors to collect data. Surveyors collected the data between October 2021 and September 2022.

Weekly meetings were held between the research team and the surveyors to harmonise surveyors’ views. Mock records were assessed to calculate inter-rater reliability. Among 300 indicators, a substantial level of inter-rater reliability between reviewers was found for indicator eligibility (k = 0.71, SD = 0.07) and adherence (k = 0.67, SD = 0.06).
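For readers unfamiliar with the statistic, the sketch below computes Cohen's kappa for one pair of raters: observed agreement corrected for the agreement expected by chance. The exact kappa variant and the way the three reviewers' ratings were combined are not stated in this section, so this pairwise form is an assumption, and the function name is hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters scoring the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n**2
    if expected == 1:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


rater_1 = ["Yes", "Yes", "No", "No"]
rater_2 = ["Yes", "No", "No", "No"]
print(cohens_kappa(rater_1, rater_2))  # 0.5
```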

Data collection

Reviewers undertook structured criterion-based care record reviews. One review per eligible condition/process was completed for each resident for the record review period. The reviewers responded to each indicator as ‘Yes’ (care provided during the encounter was consistent with the indicator), ‘No’, or ‘Not Applicable’ (NA; the indicator was not relevant to the encounter because the resident did not meet the inclusion criteria for the indicator). For example, an indicator shown in Table 1, MOBI05, has an inclusion criterion for a particular level of severity (e.g. “medium/high risk of falls”) but this level of severity is not applicable to all residents. If there were multiple instances of a particular indicator (e.g. three falls requiring follow-up in the 3-month period), multiple assessed indicators could be recorded for the same resident.

In a pilot study undertaken in July/August 2021 in two facilities collecting data on 9 residents and 734 eligible indicators, we did not find any residents in the care records who met inclusion criteria for ‘hearing and vision’ and ‘behaviours requiring restraint’. Therefore, we excluded these two conditions from our main data collection as there were no assessed indicators.

Analysis

The maximum number of indicator questions assessable in each condition/process of care reviewed ranged from six for ‘sleep’, to 40 for ‘pain’ [18] (Additional file 2: Table S2), with the number of eligible responses varying depending on age, sex, relevance and clinical criteria. Overall adherence, adherence per condition/process and adherence at any other aggregate level were estimated as the self-weighted average of the constituent indicator questions for which they were assessed. Each resident was allocated a weight indicative of our best estimate of the number of people they represented in the study population; that weight was applied to each indicator question for which they were assessed. More information on weighting is presented in Additional file 4 [134].
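As an illustration of how a weighted adherence estimate of this kind can be formed, the sketch below takes each assessed indicator with its resident's weight, drops ‘NA’ responses, and divides the weighted count of ‘Yes’ responses by the weighted count of ‘Yes’ and ‘No’ responses. The data layout and function name are hypothetical, and the study's actual weights are described in Additional file 4 rather than derived here.

```python
def weighted_adherence(assessments):
    """Weighted proportion of adherent ('Yes') assessed indicators.

    assessments: iterable of (response, resident_weight) pairs, where
        response is 'Yes', 'No' or 'NA'.
    Returns None when no indicator was assessable.
    """
    adherent = total = 0.0
    for response, weight in assessments:
        if response == "NA":
            continue  # not applicable: excluded from numerator and denominator
        total += weight
        if response == "Yes":
            adherent += weight
    return adherent / total if total else None


sample = [("Yes", 2.0), ("No", 1.0), ("NA", 5.0), ("Yes", 1.0)]
print(weighted_adherence(sample))  # 0.75
```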

Indicators were clustered within residents who were in turn clustered within facilities which were in turn clustered within providers. Analysis was undertaken in SAS v9.4, using the SURVEYFREQ procedure to control for clustering at the provider level (for all analyses above the indicator level, except analysis of adherence by facility), as the ultimate cluster, weighted to address selective over-sampling. The overall estimate, the estimate by phase of care and the estimate by indicator type (overuse/underuse) were all stratified by both condition and organisation type, the latter aggregated as three pseudo-strata (community, religious, other [charitable, local government or state government]) to avoid single clusters by strata; condition-specific and indicator-specific estimates were solely stratified by organisation type as a pseudo-stratum. Variance was estimated by Taylor series linearization and exact two-sided 95% confidence intervals were calculated using the modified Clopper-Pearson method.
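To show what an exact interval of this family looks like, the sketch below computes the plain (unmodified) Clopper-Pearson interval for a simple binomial sample using only the Python standard library. It is not the survey-adjusted version produced by SURVEYFREQ: it ignores weights, strata and clustering, so it would generally be narrower than the intervals reported in this study. The function names are hypothetical.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_decreasing(f, target, iterations=100):
    """Bisect on [0, 1] for f(p) == target, where f decreases in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # p is too small, move right
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(successes, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for a binomial proportion."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    # Lower limit p solves P(X >= successes | p) = alpha / 2.
    lower = 0.0 if successes == 0 else _solve_decreasing(
        lambda p: binom_cdf(successes - 1, n, p), 1 - alpha / 2)
    # Upper limit p solves P(X <= successes | p) = alpha / 2.
    upper = 1.0 if successes == n else _solve_decreasing(
        lambda p: binom_cdf(successes, n, p), alpha / 2)
    return lower, upper


low, high = clopper_pearson(5, 10)
print(round(low, 3), round(high, 3))
```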

For the purposes of informing future research, we estimated the ICCs calculable from our data. We used a well-described method of deriving the ICC for binary responses using the random intercept from the generalised linear mixed model [135]; this was operationalised using PROC GLIMMIX, with the estimate calculated using the Laplace method. ICCs were estimated at the level of the ultimate cluster, the provider.
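One commonly used way to derive an ICC from a random-intercept logistic model is the latent-variable formula, ICC = σ²ᵤ / (σ²ᵤ + π²/3), where σ²ᵤ is the random-intercept variance and π²/3 is the variance of the standard logistic distribution. Whether this is the exact formulation used in [135] is an assumption here, as is the logit link; the sketch below only shows the arithmetic, with hypothetical function names.

```python
import math

LOGISTIC_RESIDUAL_VARIANCE = math.pi**2 / 3  # variance of the standard logistic


def latent_icc(random_intercept_variance):
    """ICC on the latent (logit) scale for a random-intercept model."""
    return random_intercept_variance / (
        random_intercept_variance + LOGISTIC_RESIDUAL_VARIANCE
    )


def variance_for_icc(icc):
    """Random-intercept variance implied by a given latent-scale ICC."""
    return icc * LOGISTIC_RESIDUAL_VARIANCE / (1 - icc)


# Provider-level variance that would correspond to an ICC of 0.023
print(round(variance_for_icc(0.023), 4))
```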

For individual indicator questions, confidence intervals would be tighter for common conditions and wider for rarer ones. With 25 assessed indicator questions, a confidence interval around estimated adherence of 50% would have a precision of ± 20% even without adjustment for design effects; it was decided that estimated adherence would not be reported for indicator questions with fewer than 25 assessments.

Results

Characteristics of sampled long-term care residents

Of the 300 residents included from 25 facilities, six were removed; three were admitted and died during the target period and three were found to be aged < 65 years. Of the 294 included residents, 73 were admitted during the target period (24.8% of the sample), 61 died during the record review period (20.7% of the sample) and the remaining 160 were residents throughout the record review period (54.4% of the sample).

The 294 residents received assessable care (i.e. one or more assessed indicator) for one to thirteen separate clinical conditions/processes of care (median = 10 [IQR: 9–11], mean = 9.7 [SD: 2.53]) and had 23 to 221 assessed indicators (median = 85 [IQR: 61–120], mean = 90.9 [SD: 38.9]). Table 2 compares the age composition of this study population to all Australian and SA LTC residents [130]. Characteristics of the included facilities compared to Australian and SA facilities are shown in Table 3.

Quality of care indicators

Of 69,454 potentially assessable indicator questions, 41,021 (61%) were designated as not applicable. This left 26,731 assessed indicator questions. Mean prevalence of adherence with evidence-based care indicators, by clinical condition/process, is shown in Table 4. Estimated adherence ranged from 12.2% (95% CI: 1.6, 36.8) for depression to 81.3% (95% CI: 75.6, 86.3) for bladder and bowel. Overall, quality of care was estimated to be adherent for 53.2% (95% CI: 48.6, 57.7) of indicators. Facility-level adherence ranged from 34.1 to 66.4%.

Table 4 Evidence-based care by condition/process of care and phase of care in Australian long-term care residents, 2021

Mean adherence was also calculated by the selected phase of care. Estimated adherence was 51.5% (95% CI: 41.5, 57.5) for diagnosis/assessment, 61.6% (95% CI: 54.3, 68.5) for treatment and 41.8% (95% CI: 35.0, 48.9) for monitoring/review processes (Table 4). Indicators designed to guard against overuse had an estimated adherence of 91.6% (95% CI: 79.6, 97.7), while those signalling care that is necessary (underuse) had an estimated adherence of 52.0% (95% CI: 47.6, 56.4).

We estimated the actual ICCs associated with overall adherence and condition-level adherence, at the level of the provider. The actual ICC associated with the overall estimate of adherence was 0.023. The median ICC at the condition level was 0.046 (IQR: 0.029 to 0.103). ICCs for each condition are listed in Additional file 5; these ranged from 0.009 for nutrition and hydration indicators to 0.542 for depression, where six of 13 providers had 0% adherence, making inter-provider variation a key component of total variation.

A summary of information about indicator questions is presented in Table 5. The number of indicator questions ranged from six for sleep to 40 for pain. The median number of responses to each question ranged from 5 for depression (IQR: 4 to 56) through to 202 for mobility (IQR 190 to 217). Adherence for each indicator with 25 or more assessments is presented in Additional file 2: Table S2. As reported in Table 5, the number of questions with reported adherence ranged from one for sleep to 38 for pain. Within each condition, the median (and the interquartile range) of reported adherences ranged from 10.6% (IQR 0.6 to 17.2%) for the three reported indicator questions for depression, to 92.4% (IQR: 49.4 to 97.1%) for the eight reported bladder and bowel indicator questions.

Table 5 Information about indicator questions, by condition/process of care

Discussion

This is the first study of adherence to evidence-based care in LTC facilities at a population level using a standardised method across multiple conditions/processes of care. We found LTC residents received 53.2% of recommended care for 14 conditions/processes. Population-level studies in acute care have similarly found that evidence-based care for adults in the US was 55% [136] and in Australia was 57% [24]. Residents received care for an average of 9.7 assessable conditions, much higher than the studies in acute adult care (e.g. 2.5 in the US [136] and 2.9 in Australia [24]), reflecting the residential nature of LTC and vulnerability of the population. There was considerable variation between conditions/processes, which was also found in the two previous adult healthcare studies [24, 136]. Adherence with indicators for the bladder and bowel condition scored highly with over 80% adherence, and another, cognitive impairment, showed adherence in over 70%. However, for care provided for six conditions (skin integrity, end-of-life care, infection, sleep, medication, and depression) adherence was below 50%.

The results provide valuable insights to identify specific conditions and clinical processes where improvement efforts should be targeted. For example, depression symptoms affect just over half (52%) of all permanent LTC residents [137]. When managing older people with depression, greater vigilance is necessary due to reduced bioavailability [88], risk of drug interactions with polypharmacy [138], and rare side effects such as bone loss [139]. However, we found that only 1% of residents who have depression and who had been receiving antidepressants for 4 weeks were monitored on a monthly basis for side effects (Table S2: Indicator no. DEPR07).

In Australian LTC facilities, approximately 83% of residents die in-house [20, 140]. End-of-life care that people receive in the last months or weeks of their lives should meet their cultural, spiritual, psychosocial and physical needs [141]. Family members who are prepared for a resident’s death through clear communication with LTC staff are less likely to experience complicated grief responses [94, 142]. However, we found less than half (47%) of residents who died in LTC had an individualised care plan including resource needs and involvement of family member needs (EOLC19). On the other hand, 93% of residents who died and who had been in pain were treated with morphine or hydromorphone (EOLC27) and 93% were provided with comfort care measures (EOLC31).

Urinary tract infections (UTIs) are the most common infection treated with antibiotics in Australian LTC facilities [143]. Our results show that 92% of residents who had symptoms of a UTI had a urine sample taken to test for signs of infection within 24 h (INFC16) but only 23.5% of residents with UTI symptoms had a full clinical assessment prior to diagnosis (INFC15). In frail older people, UTIs are more challenging to diagnose [144, 145] and urine sample test results should not be interpreted in isolation, without assessment of the resident’s clinical picture [146]. Relying solely on urine sample test results contributes to overdiagnosis of UTI and overuse of antibiotics. At a societal level, this contributes to antimicrobial resistance, which has been declared by the World Health Organization as one of the top 10 threats facing humanity [147].

The Royal Commission into Aged Care [7], which reported in 2021, is the most contemporary and comprehensive account of why the level of care, including experience, safety, access, and evidence-based care, provided to Australian residents is not meeting societal expectations. At a systems level, the reasons include pressure on government budgets, with the LTC sector growing more quickly than revenue; poor regulation, systematic monitoring and scrutiny of process measures of care to residents; absence of a consumer voice in the design and delivery of services; and societal assumptions of ageism, including within governments and providers. At the level of providers, clinical governance knowledge, skills and investment are markedly under-developed.

A significant contributing factor is the workforce, which has changed over the last two decades; nurses comprised about one in three of the LTC workforce at the turn of the century and now comprise about one in four, having been replaced largely by less skilled personal care workers [7]. In addition, due to poor remuneration, access to medical and allied health skills, including pharmacy, is less than optimal [7]. These structural workforce access issues may explain some of the lower adherence results for conditions that require more specialised knowledge and skills such as end-of-life care, depression, medication and infection control. The key contributing factors relating to workforce found in the Royal Commission align with the most frequently found barriers to delivering evidence-based care in the LTC literature, namely knowledge gaps, organisational support, staff profiles and resources [7, 148].

In terms of the way forward for the LTC sector, as well as addressing the structural deficits such as workforce, the broader health care literature may provide some guidance. Evidence-based overarching strategies such as multi-disciplinary teams, structured handovers and communication [149], embedding co-design with residents, and locally agreed clinical pathways based on evidence should be implemented [150]. The adoption of these strategies in LTC should be underpinned by implementation science principles and skilled local clinical governance teams [150]. At a system and facility level, there should be ongoing routine measurement of evidence-based care [24], not just of common conditions as this study has done, but management of multi-morbidities of residents [15]. Adoption of electronic recording of care can improve both the delivery, via decision support, and efficient measurement of evidence-based care [24].

In our experience, developing indicators for evidence-based care from CPGs was more challenging for LTC than for adult [24] and children’s health care [25]. CPGs can apply to all adult care, or more specifically to older adults, and even more specifically to LTC. The evidence base for the latter is less well developed and is more likely to include a diverse range of practices, such as routine care, for example, ensuring activities of daily living are reliably undertaken or monitored (Table 1, Dysphagia indicator example, DYSP07) as well as providing complex medical care (Table 1, end-of-life indicator example, EOLC20).

Assessment of the level of evidence-based care provided to LTC residents using care documentation invariably involves clinical judgement by surveyors. In the surveyor manual, during initial training, and the weekly meetings, surveyors were encouraged to apply clinical judgement in the absence of definitions “to determine what is appropriate and practical”. Their consistent feedback was that pain was the most difficult condition to assess, in particular, defining new exacerbations. Surveyors also encountered circumstances in the care record when there may be justifiable deviations from evidence-based practices as embodied in the indicators. Similar circumstances were also encountered when residents did not consent to evidence-based care. In these cases, the indicator was scored as adherent.

As to limitations of the study, private facilities could not be recruited and were therefore removed from the sampling frame. There is some evidence that private facilities are likely to have lower adherence to care standards and therefore the prevalence of evidence-based care in Australia is likely to be lower than we have documented [151, 152].

Convenience sampling of facilities may mean that the recruited facilities were not representative of the LTC sector. We collected data from one state; however, the profiles of the recruited facilities and residents were similar to those of the whole Australian not-for-profit LTC sector.

There is a potential for self-selection bias. Our provider recruitment rate was 54% (13 of the 24 providers approached), which is at the high end of large-scale quality studies (range 8–92%) [25]. If self-selecting facilities were more likely to provide adherent care, this study would have overestimated the quality of care.

The kappa scores were consistent with other care record review studies but, for logistical reasons, were restricted to mock records. This process may have overestimated agreement between reviewers.

The care documented may not reflect the care delivered. All studies seeking to assess the quality of care based on care record review face this possibility. This could work in two ways. Firstly, care delivered is not documented, leading to an underestimation of evidence-based care delivered. This directional bias is well recognised in large-scale quality studies [24, 25, 136]. Secondly, care is not delivered but is documented which would lead to overestimation of evidence-based care. This has been found when checklists are used in healthcare [153]. There have been few studies, particularly recently, of the accuracy of documentation of care records in LTC for the purpose of collecting quality indicators. However, there is a trend that care records overestimate care delivered to residents in pressure ulcers [154], incontinence care [155], feeding assistance [156] and nutritional intake [157]. This may imply that the CareTrack Aged results overestimate the level of evidence-based care delivered to residents.

The indicators were derived from guidelines that were largely published in the years 2013–2018 [18]. As the data review period was 2021, some of the indicators may not have reflected contemporary evidence-based practice [18]. Finally, estimated adherence has wide confidence intervals for almost all indicator questions, and for some conditions/processes of care, especially sleep with only 93 indicator assessments for the six indicator questions. This principally reflects that a small number of indicators were assessed. The width of the confidence intervals suggests that reasonable caution should be exercised when interpreting these indicators. The ICC for overall adherence was in line with that planned for in the sample size estimation, but the vast majority of conditions had ICCs above that which we were able to cater for, leading to wider confidence intervals than desired unless the number of assessed indicators was substantially higher than anticipated. In light of these, future studies should plan to include as large a number of clusters as possible.

Conclusions

Among a sample of residents in LTC receiving care in Australia in 2021, adherence to evidence-based care indicators for important conditions and processes of care was just over half. Vulnerable older people are not receiving evidence-based care for many physical problems, nor care to support their mental health nor for end-of-life care. The six conditions in which adherence with indicators was less than 50% could be the initial focus of improvement efforts. At a systems level, addressing structural deficits of skills and mix of the workforce, implementing high-reliability practices that we know work, and ongoing measurement of evidence-based practice should be the policy focus.