Introduction

The COVID-19 pandemic had a major impact on adult social care. There was substantial excess mortality in care homes in the UK during the first phase of the COVID-19 pandemic, deaths were estimated 20% higher than previous years, a large portion of which are not registered as due to COVID-19 [1, 2]. The highest proportion of deaths involving COVID-19 of UK care home residents in wave one was in the North East of England (30% of deaths involved COVID-19) [2]. Care homes and long term care facilities have been disproportionately affected by the COVID-19 internationally [3, 4]. Best policy for care homes was uncertain at the beginning of the pandemic. International studies have shown long-term decline in health related quality of life and functional decline in older patients who were hospitalised for COVID-19 globally [5]. Healthcare for vulnerable people required a fine balance, to ensure necessary healthcare was maintained while minimising exposure to COVID-19 which was particularly pertinent in care homes [6].

During the early stages of the pandemic, policy recommendations for care homes were updated and revised rapidly. Between the initial COVID-19 guidance on 25th February 2020 and £850 m social care grant to councils on 16th April 2020, Public Health England and the Department of Health and Social Care provided numerous additional frameworks and guidance doument [7]. These were often vague and difficult to follow [8]. Criticisms have described the UK’s policy response in adult social care as ‘slow, late and inadequate’ [9]. Criticisms of many countries’ pandemic responses with respect to long term care facilities have been made [10].

On 17th March 2020 NHS England advised that all non-urgent elective operations should be postponed, and for all medically fit inpatients to be discharged to free-up capacity [11]. Grimm et al. found that UK care home residents’ use of inpatient care decreased in the early stages of the pandemic and suggest these reductions may result in substantial unmet healthcare need [12]. Internationally, healthcare utilisation decreased by around a third during the pandemic [13]. In a global survey in the early stages of the pandemic, two-thirds of health care professionals for chronic diseases stated moderate or severe effects on their patients due to changes in healthcare services [14].

Our study aims to investigate how care home residents received care in the period immediately surrounding COVID-19 tests. We aim to investigate how care home residents used health services and whether patients were moved around between different care settings. Trajectories of care are mapped over time, with time periods defined by their COVID-19 testing events. We cluster these trajectories to find groups of residents with similar care patterns. To achieve these aims we use a novel method for care pathway analysis, State Sequence Analysis. This allows us to investigate potential shared characteristics between these clusters that may drive the observed care patterns.

Care home residents have high levels of physical dependency, cognitive impairment, multiple morbidity, and polypharmacy [15]. Comorbidities such as diabetes and dementia are prevalent in the population and require ongoing high levels of care from staff and specialists [16, 17]. Dementia was the most common pre-existing condition for residents who died of COVID-19 before the end of 2021 and diabetes was a common comorbidity for male residents who died of COVID-19 in the same period [2]. Dementia patients are prone to confusion and struggle to adhere to social distancing and other restrictions. Hence, investigating how these two characteristics affected use of health services by patients having had a COVID-19 test is of particular interest. Furthermore, diabetes and dementia are not respiratory problems and therefore will be able to be viewed independently of COVID-19. We hypothesise that residents’ frailty and/or the presence of long-term conditions such as diabetes and dementia will influence the care a resident will receive. There is a lack of patient-level data from care homes themselves and it is difficult to identify care home residents from administrative hospital data [18]. This limits studies using routinely collected hospital data on care home residents and reduces the possible evidence base for policymakers [19]. This study is, to our knowledge, the first to investigate resident-level care pathways for care home residents during the COVID-19 pandemic. Synthesising patient-level care pathways during the COVID-19 pandemic is important for policy makers to get an empirical understanding of how residents were cared for overall and allows us to understand how characteristics may impact a patient’s care pathway. Understanding patient specific driving factors for decisions made by care staff about how residents were treated can lead to additional policy and guidelines being introduced when used in conjunction with other research in the area. We demonstrate the application of a novel methodology that can be used for further health pathway analysis using routinely collected data in more settings in the future.

Methods

We produce longitudinal sequences of daily healthcare utilisation in the days before and after COVID-19 testing events for care home residents. We cluster these sequences to find residents with similar patterns of healthcare utilisation around their testing events. We then investigate cluster trajectory characteristics, size as well as associations with resident characteristics. We present our work corresponding to the RECORD guidelines [20].

Data source

We utilised data from the HealthCall Digital Care Homes app that began rollout in the North East of England 3rd August 2018 and covered all homes in the area by the end of the data period in August 2021, the sample size is essentially the complete population of adults in care homes who had Pillar 1 COVID-19 tests. HealthCall is a digital referrals app used by care home staff to gather information and request review from a clinician. Three care home datasets from HealthCall covering resident enrolment, home enrolment, and app uploads are used in this study.

We also use eight routinely collected datasets from County Durham and Darlington NHS Foundation Trust hospitals (CDDFT), including A&E, inpatient, outpatient, and community data (primary care data is not included). Pillar 1 COVID-19 testing in the region is also utilised.

All these datasets were initially stored within the Trust, so could be pseudonymised together via patient/resident NHS number. These datasets were then transferred to a Trusted Research Environment for the researchers to access remotely and securely. We then linked theses datasets together using patients’ pseudonymised NHS numbers. In total eight of the datasets refer to patient healthcare events. Three datasets include additional information about residents and homes. A description of each dataset is contained in the supplementary materials.

The COVID-19 testing data used for this analysis is Pillar 1 PCR test results. Pillar 1 testing is classed as ‘swab testing in Public Health England (PHE) labs and NHS hospitals for those with a clinical need, and health and care workers’ [21]. The testing data consists of tests when a resident is an inpatient, or when a resident is symptomatic or believed to have been exposed to someone with suspected COVID-19.

Since the focus of the analysis is on identifying resident’s healthcare observations (interactions with healthcare systems) missingness was generally not an issue, see the discussion for further elaboration. The Charlson Comorbidity index was missing for a proportion of patients but is likely to be missing not at random and unreliable to impute. We only used cases where it was present and accounted for this in our interpretation. Each observation contained a pseudonymised NHS number and timestamp. Those along with which dataset the data had come from can be used to make longitudinal care sequences.

Dataset descriptive statistics

Monthly numbers of observations are calculated for each of the datasets. Locations of COVID-19 tests and rates of test results at the different location types were calculated and independence of these two factors was tested with a chi-squared test (see supplementary material).

Defining cohort and trajectories

Since the data contains the healthcare interactions of all CDDFT service patients, a cohort of care home residents was defined. Presence of individuals’ NHS numbers in the HealthCall enrolment (activation) dataset indicate care home residency. Observations in other datasets referring to a resident living in the set of HealthCall care homes are used to identify additional care home residents. Residents are included in the study from the identified timepoints at which they became a care home resident to when they died or moved out of the home. All individuals identified as care home residents are included in the cohort. Resident characteristics such as age, gender and comorbidities are also drawn from the available datasets (see supplementary material for methods). The limitations of using routinely collected observational data to compile resident characteristics are discussed in the Discussion section.

We define a resident’s healthcare trajectory as the sequence of care they received each day. To ensure only one state per day, we prioritise more ‘significant’ types of care. The possible states (in order of significance) are:

  • A&E attendance.

  • Inpatient stay in hospital.

  • Outpatient attendance.

  • Appointment in the community.

  • Care home visit by community healthcare staff.

  • Care Home – no actions in the datasets.

Sequence analysis

Four different 10-day sub-sequences of resident trajectories were investigated using index events defined by the available COVID-19 tests. The two index events used are a resident’s first COVID-19 test and a resident’s first positive COVID-19 test. The sequence length of 10 days corresponds to the UK government recommended isolation period for individuals who test positive for the majority of the study period. Residents without a COVID-19 test were not included. Sequences exceeding the boundaries of the study period or a resident’s time in the cohort were excluded from the analysis.

Pairwise distances were calculated between sub-sequences in each of the four sets using the Optimal Matching distance algorithm [22]. Insertion and deletion costs of 1 were used, and substitution costs were based on the transition rate between the two states (see supplementary materials for more information). The sequences were clustered based on the calculated dissimilarity between them using hierarchical clustering and Ward’s criterion. State Sequence Analysis was implemented in R using the TraMineR package [23].

Potential associations between cluster assignment and resident characteristics were investigated to provide insight into which factors are associated with the care a resident received. Specific characteristics were investigated: 28-day mortality after the COVID-19 test and Charlson Comorbidity Index, as well as the prevalent comorbidities: diabetes and dementia. Additional associations with wave of the pandemic and COVID-19 test result are included in the supplementary materials.

Chi-squared tests for independence were used for each of the characteristics separately (or Fisher’s exact test when counts in the elements of the table are ≤ 5) [24], with an adjusted significance level α = 0·00143 as a simple Bonferroni multiple testing correction from α = 0·05 (total number of tests presented in the main paper and supplementary materials = 45, 16 are included in the main paper).

Cluster transitions

Since the sequences defined lead to, and follow on from, index events we use Sankey diagrams to visualise the movement between clusters.

Results

In total from all the datasets there were 10,701,759 observations of 612,408 individual patients who have used the services between April 2018 and August 2021. 8,702 care home residents (those with observations in the Health Call datasets and those discharged to a care home) were identified from 122 care homes. Table 1 provides a summary of the cohort demographic information.

Table 1 Summary tableincluding characteristics of 8,702 identified care home residents

Table 2 summarises 11 datasets, consisting of routinely collected data. The data comes from the CDDFT’s secondary care, community database, observations taken inside the care home on the HealthCall app, and COVID-19 testing data. This data includes residents in the study cohort.

Table 2 Counts of observations and individuals in each data set, filtered for the cohort of care home residents

* Individuals can be in more than one dataset hence the sums do not equal the total.

Trajectories were defined from the set of healthcare interactions included in the dataset. Figure 1 visualises a resident’s care trajectory throughout their time in the study cohort. The longer blue periods represent an inpatient stay.

Fig. 1
figure 1

A 5-month sample of a single resident’s care trajectory, with coloured blocks for each day representing the care the resident received each day

Sequences for clustering were specified based on the COVID-19 testing index events. 4,767 residents have a recorded Pillar 1 COVID-19 PCR test in the dataset, and are therefore included in the analysis, 3,938 were ineligible for analysis due to no testing events. Of these, 1,049 residents test positive for COVID-19 at some point in time and their first tests are used as the index events for the pair of sequences before and after a first positive COVID-19 test.

Sequences before the test are not included when a resident moves into the home in the 10 days before the test (198 removed before first positive test, 1,296 removed before a first test). Sequences after the test are not included when the resident dies in the 10-days after the test, or their test is less than 10 days before the end of the study period (316 removed after a first positive test, 1,547 removed after a first test). The number of residents included for each sequence specification is [1] before a first positive test − 851 [2], after a first positive test − 733 [3], before a first test – 3,345 [4], after a first test – 3,220. The total number of individual residents that appear in the analysis is 3,471.

A visualisation of the four 10-day sequences in their assigned clusters can be seen in Fig. 2. The clusters are generally characterised by a single state. Sequences both before and after the first positive test [1, 2] are demonstrated by two clusters: an inpatient cluster, and a home cluster. The before and after first test sequences [3, 4] are characterised by three clusters each, home, community, and inpatient states and home, inpatient to home transfer and inpatient sequences respectively. The large number of residents in the inpatient cluster after the first test is likely due to testing upon hospital admission. The inclusion of an inpatient to home transfer cluster after a first test may indicate that these tests were testing on discharge from the hospital.

Fig. 2
figure 2

Sequence cluster assignments representing types of care received in the 10 days before (1) and after (2) a resident’s first positive COVID-19 test, and the 10 days before (3) and after (4) a resident’s first COVID-19 test (of any result). The clusters represent the groups of similar sequences, where each sequence represents one resident’s care over the 10 days

Table 3 Table of associations between cluster assignments for each of the sub-sequence groups and resident characteristics/sequence outcomes

Characteristics of the residents in these clusters were assessed. The relative frequencies of the characteristics within each of the clusters can be found in Table 3. The combinations found to be non-independent through the chi-squared test are highlighted in grey. All p-values for these tests can be found in the supplementary materials. A higher proportion of residents with diabetes are found in clusters indicating a higher level of care for all four sequences ([1] p = 0.00026 [2,3,4], p < 0.0001). For example in the 10 days before a resident’s first positive test 35% are diabetic of 142 in the inpatient cluster compared to 21% of 709 in the home cluster. A similar pattern is found after both all and positive tests for dementia patients ([2] p = 0.00036 [4], p < 0.0001). Before all first tests a higher proportion of those in the community cluster have frailty scores of 3 and above (73% of 140 versus 41% and 43% for 3,159 in the home and 172 in the inpatient cluster respectively).

Twenty-eight-day mortality is only associated with clusters 10 days after all tests ([4] p < 0.0001); residents in the inpatient and inpatient transfer cluster have a slightly higher 28-day mortality than those in the home cluster (8% of 810 and 8% of 578 versus 3% of 1,832). The two clusters with inpatient stays have the same 28-day mortality rate, despite one of the clusters demonstrating a discharge from hospital around halfway through the 10-day period ([4] p < 0.0001).

Flow between clusters before and after the positive test were displayed in a Sankey diagram (Fig. 3). Transitions between these clusters may indicate changes in care based on the positive test. The ‘Died’ after test group here is not the same as presented in the cluster associations previously. Here we identify whether they died within 10 days of their test and were therefore not included in any of the clusters.

Fig. 3
figure 3

Sankey diagram demonstrating flow between states before and after a resident’s first COVID-19 positive test

The majority of residents both start and end in the care home cluster. More die within 10 days than are transferred to a stay in hospital. A similar proportion from inpatient care and care homes died within 10 days.

Discussion

For care home residents the common patterns of healthcare before and after a positive Pillar 1 COVID-19 test generally consisted of residents who stayed in the care home for the whole sequence duration, and those who had the entire duration in hospital. The clusters of healthcare before any first COVID-19 test contain an additional group of residents receiving regular community care across the 10-days before. Clusters after first COVID-19 tests included an additional group of residents who were discharged halfway through the sequence.

Diabetes was always associated with clusters representing higher levels of care. Dementia is associated with inpatient care after a testing event and appears to be highly correlated with a short-term discharge from hospital. Residents who were discharged from inpatient care during the 10-days after their first test appeared to have a similar 28-day mortality rate than those who stayed in hospital. Charlson comorbidity coefficient was found only to be associated with the set of sequences where there was a high level of community care cluster. This may have been due to smaller sample sizes since these calculations only included patients with ICD-10 codes.

NHS secondary care use fell during the pandemic. However, the cluster assignments for all the sequences of care before and after COVID-19 tests and positive COVID-19 tests contain a substantial specific inpatient cluster. There was still a group of residents in hospital, despite the decrease in secondary care use for care home residents at the start of the pandemic [25].

Dementia is associated with the cluster assignments in the ‘after’ event cluster assignments. After the tests there are more residents with dementia in the clusters characterised by the inpatient state, in both the ‘positive tests’ and ‘all tests’ cases, indicating as significant proportion of residents with dementia have transferred into hospital after their test. Residents with dementia are most often in the inpatient to home transfer cluster after a first test, which implies that residents with dementia may be more likely to have a shorter stay in hospital. Deciding whether to send residents with dementia for an inpatient stay may be difficult; studies indicate that hospitalisations can be detrimental for individuals with dementia as evidence suggests they are linked with advanced stage of dementia and deterioration of active daily living, among other factors [26, 27]. Evidence suggests that residents with dementia were challenging to care for during the pandemic due to difficulties in adhering to social distancing in both the care home and hospital setting, this may have led to increased hospitalisation as well as high levels of discharge back into homes [28].

The Sankey diagram in Fig. 3 demonstrates movement of residents between clusters before and after their positive test. We see that a similar proportion of residents from the home cluster and the inpatient cluster die within 10 days of their test (and therefore aren’t clustered after their test). This finding could also be an artefact of the usage of Pillar 1 testing data, providing a sample of positive tests that are more likely to be symptomatic in care homes and more routine in hospitals. Alternatively, it may suggest that more residents in the care home should receive hospital care, but also could suggest that the level of care in hospital is not an improvement. We cannot account for how ill a resident is, so this could play a part in increasing inpatient mortality rates.

We provide, to our knowledge, the first in-depth investigation into healthcare patterns of care home residents during the COVID-19 pandemic. Other research provides information on care in the homes during the pandemic, such as that done by Shallcross et al. investigating care home-level risk factors among other work [29]. Our findings can be used in context with research on other aspects of residents’ care during the pandemic, to provide thorough policy guidelines for caring for this vulnerable group of the population.

This study demonstrates evidence of movement of positive and suspected positive (pillar 1 tested) care home residents between care settings in the days surrounding their tests. We also see evidence of residents with dementia experiencing short stays in hospital around the time of their tests. The nature of short stays in hospital for this vulnerable set of patients is likely to be detrimental to patients’ health, in general and within the context of nosocomial infection risk for the resident and the whole home. To our knowledge there was no specific guidance relating to secondary care for residents with dementia during the pandemic. We highlight that this group moved around between high-risk care locations and future policy could be targeted to avoid this in the case of local or more widespread outbreaks.

Our results also imply that residents in hospital are equally likely to die within 10 days of their test as those in the home beforehand and therefore suggests that hospital may not provide significantly improved outcomes. Hospital appointments are potentially disruptive to care home residents’ wellbeing, so should be considered carefully [30]. Future policy could indicate that in the early stages of a novel pandemic with an unvaccinated population, it should be encouraged to keep residents in an environment they are used to. The extra care may not be worth the distress of a hospital visit. We observe that patients with comorbidities such as diabetes are disproportionately represented in the group of patients who receive hospital care for all of the time periods investigated.

This study highlights a need for more admission and discharge guidance on sending residents to secondary care for care home residents in the early stages of the pandemic. guidance should include more consideration of the vulnerabilities of care home residents – such as residents with dementia. Some guidance was released related to the issue of hospital admissions, such as the guidance. One such issue was Overview of adult social care guidance on coronavirus (COVID-19) [31]. As was the case with most guidance, admission and discharge from hospital guidance was generally from an infection prevention and control perspective. Further research is needed into the impacts of secondary care admissions for care home residents during the COVID-19 pandemic to fully understand the implications of sending vulnerable patients to hospital. The negative effects of hospitalisation of care home residents are well documented [30, 32]. These effects may be heightened during a large-scale pandemic. Despite the fact that we see a drop in secondary care usage in our cohort at the start of the pandemic (supplementary materials Fig. 3), this paper highlights the fact that with the (relative lack of) guidance in place at the start of the pandemic, vulnerable residents attended secondary care and moved between care settings. Protocol is needed surrounding this for future infectious disease outbreaks.

One of the strengths of this study is the unique dataset allowing visualisation and analysis of healthcare for care home residents during the COVID-19 pandemic. Data from community care captures much home-based care, but the lack of primary care data means that some information is absent. We have derived some resident characteristics from secondary and community care history and our record of age and gender is incomplete. Diabetes and dementia are drawn from diagnosis codes for hospital stays and community procedures, hence we are likely to identify subset of residents who have more advanced disease or who have accessed external care. This is particularly pertinent in the case of dementia, as hospital admission is more likely to be for management of co-occurring conditions rather than dementia being a primary diagnosis [33]. This “missing data” is a known limitation of using observational, routinely collected data. Additional data sources such as primary care would be more likely to give a more complete, reliable set of patients with these comorbidities since they often contain more background information on patients than those from secondary care sources. A further limitation is that the COVID-19 testing data contains only Pillar 1 tests processed in the Trust’s hospital labs. This may bias the sequences we define (relating to a resident’s first positive COVID-19 test and first COVID-19 test in general), since a large portion of Pillar 1 testing was testing on admission to hospital. Testing outside of hospitals was for those with a clinical need, and are therefore more likely to be tests for symptomatic residents [21]. This is the case when looking at test result rates for the different testing locations, with tests in care homes much more often positive than those in hospital settings (see supplementary materials for breakdown). We find a large portion of the residents in inpatient care before their first positive test, remain in inpatient care afterwards – suggesting COVID-19 may not have been the reason for their admission, but tested positive on arrival. The location of testing differs between wave 1 and wave 2 of the pandemic, we investigated breaking down the clustering analysis into the two waves and found it did not significantly impact the results (both in supplementary materials). The use of Pillar 1 COVID-19 testing allows a consistent level of testing throughout the pandemic, since Pillar 1 testing was introduced first and was conducted over the whole pandemic period. However, a more complete – routine set of COVID-19 tests would give a more accurate description of how residents were treated in general and would allow us to identify residents’ first test and positive test more reliably.

Health services such as the National Health Service of the United Kingdom have large pools of untapped data that can be used for large scale, impactful analyses [34]. Research such as this work is needed to demonstrate the work that can be done going forward using linked, routinely collected datasets. The novel methodology demonstrated can be used in more settings to gain insights to other longitudinal care pathways such as what characteristics define what patterns of long-term cancer care patients receive or a patient’s pattern of outpatient care within a specific system such as neurology [35]. Implications from this study are limited by the nature of Pillar 1 COVID-19 testing. Further updates to this analysis could involve using additional primary care data to generate more complete pictures of pathways and characteristics allowing for more comprehensive results. Additional work on more recent testing data that is less stratified could also provide additional insights, however this would need to be viewed in the context of more recent COVID-19 policies. Comparisons between healthcare patterns during the pandemic and those outside of the pandemic could also give further insight into how typical decision making was altered by pandemic policy. Above all, this study demonstrates the potential for large scale linkage of routinely collected healthcare data to investigate longitudinal pathways of care for future studies going forward.