Expected length of stay at residential aged care facilities in Australia: current and future

This study explores the changing patterns of the length of stay (LOS) at Australian residential aged care facilities during 2008–2018 and likely trends up to 2040. The expected LOS was estimated via the hazard function of exiting from such a facility and its heterogeneity by residents’ sociodemographic characteristics using an improved Cox regression model. Data were sourced from the Australian Institute of Health and Welfare. In-sample modelling results reveal that the estimated LOS differed by age (in general, shorter for older groups), marital status (longer for the widowed) and sex (longer for females). In addition, the estimated LOS increased slowly from 2008–2009 to 2016–2017 but declined steadily thereafter. Out-of-sample predictions suggest that the declining trend of the estimated LOS will continue until 2040 and that the longest LOS (approximately 37 months) will be observed among widowed females aged 50–79 years. Relative uncertainty measures are provided. The results portray the current changing landscape and the future trend of residential aged care use in Australia, which can inform the development of optimised residential aged care policies to support ageing Australians more effectively.


Introduction
The lifespan of Australians has increased significantly over the past century, with life expectancy at birth reaching 81.2 years for males and 85.0 years for females in 2021, compared with 51.1 and 54.8 years, respectively, in 1900 (Australian Bureau of Statistics, 2021). Given the prolonged lifespan, the population of elderly Australians is increasing at an unprecedented rate: the numbers of people aged 65 years and above and 85 years and above are projected to almost double, from 3.8 million and 500,000, respectively, in 2017 to 6.7 million and 1,000,000, respectively, in 2042 (Australian Bureau of Statistics, 2018). This increase will place considerable pressure on Australia's residential aged care sector, which approximately 40% of older adults use to a varying extent during their lifetime (Broad et al., 2015) and which consumes 60% of the Australian Government's funding to the aged care sector (Australian Institute of Health and Welfare, 2022a). Since 2011, the number of permanent residential aged care users in Australia has increased by 15%, despite the Australian Government's efforts to deinstitutionalise aged care and promote 'ageing at home' (Australian Institute of Health and Welfare, 2021). As the number of elderly Australians increases, the country's residential aged care sector will find it increasingly challenging to provide adequate, financially sustainable aged care services to these vulnerable older adults (Ergas & Paolucci, 2011).
The looming challenges associated with caring for the aged could also affect people's financial behaviour, given that individuals need to consider in detail the effects of using aged care services, particularly the costly residential options, on their lifecycle financial plans. Currently, senior Australian aged care residents pay about 25% of the total aged care cost, despite different levels of financial support from the government (Australian Aged Care Collaboration, 2021). These individuals can be classified according to their financial status as fully supported, partially supported and self-funded residents. A fully supported resident needs to pay only a basic daily fee, which is set at 85% of the full pension amount of the individual. Partially supported residents need to pay a basic daily fee and accommodation costs,¹ and self-funded residents need to pay a basic daily fee, accommodation costs and a means-tested care fee.² Given the concerns about the unsustainability of the government's aged care support and the growing expenditure on residential aged care services, the public has an increasing interest in understanding the use of such services and, in particular, in identifying those who use these services, the ways in which they use them and whether the current utilisation patterns are likely to change. This knowledge may help people make optimal financial decisions regarding their whole-life financial plans.
¹ They may have to pay a refundable deposit at the beginning for the accommodation costs. If they do not favour this option, they can choose to pay a non-refundable daily fee, or even use a combination of the refundable deposit and the non-refundable daily fee, to cover the accommodation cost.
² The means-tested care fee is related to financial status. Wealthier people may be required to pay higher fees.
Currently, Australian residential aged care facilities are used by a large number of vulnerable senior citizens for late-life care (Broad et al., 2015). Further, residents in these facilities have a higher mortality rate than the general population (Ferrah et al., 2018; Shah et al., 2013), and their mortality is affected by a wide range of factors, including individual characteristics (e.g., age and gender) and environmental factors (e.g., staffing level) (Damián et al., 2019; Vossius et al., 2018). The profile of these users is gradually changing, with a higher growth rate in the number of people aged 65-74 years and 90 or above, a decline in the number of females aged 75-89 years and an increasing dependency level since 2009 (Cooper-Stanbury & Howe, 2021; Gibson, 2020). However, there is no consistent understanding of how the length of stay (LOS), a key indicator of residential aged care service use, has changed in Australia. In this regard, Gibson (2020) reported a shortened average LOS at Australian residential aged care facilities in 2016-2017 (approximately 30 months) compared with preceding years (34-35 months), whereas Cooper-Stanbury and Howe (2021) reported a small increase in the mean LOS from 27.4 months in 2008-2009 to 29.9 months in 2018-2019. Moreover, little is known about how the patterns of residential aged care service use will change in Australia and how these changes will vary among different subpopulations. The lack of such understanding may hinder current policies and practices in Australia that aim to optimise the quality of residential aged care services, as well as the formulation of future plans for the provision of aged care services to an ageing population.
To fill this knowledge gap, in this study, we comprehensively investigate the changes in residential aged care use in Australia from 2008-2009 to 2018-2019 and estimate such use up to 2040 using data from the Australian Institute of Health and Welfare (AIHW). We conduct this analysis by studying the hazard functions of exiting a residential aged care facility using a weighted Cox regression, which is an improved method to estimate LOS at such facilities. We employ this method for a preliminary determination of the significant factors that influence the hazard ratio of leaving such a facility. The benefit of this method is that the validity of parameter inference is not affected by violations of the proportional hazard assumption, a critical requirement of the ordinary Cox regression (Cox, 1972). Then, using the identified influential factors, we employ the Cox regression to study the in-sample LOS and predict the out-of-sample LOS up to 2040. The violation of the proportional hazard assumption of the Cox regression is addressed by modelling males and females separately, with stratification on marital status and the application of restricted cubic splines (RCS) for the admission year. The analysis results provide both the in-sample estimated LOS and the out-of-sample predicted LOS for residential aged care users with heterogeneous characteristics. To our knowledge, this investigation is the first in the literature to estimate the future trends of LOS at Australian residential aged care facilities. Thus, the results provide a new understanding of the changing landscape of residential aged care use in Australia and can be used to formulate improved policies to enhance the quality and management of residential aged care services in Australia and elsewhere.

Methods
To estimate the expected LOS at a residential aged care facility, regression-based models are commonly used. However, the exact LOS is generally not available for all aged care residents. This is mainly due to the exiting of residents and, in particular, their transfer to other aged care facilities. Since such incomplete observations are right-censored in the sense of survival analysis, we employ the popular Cox (1972) regression, which is specified as follows:
$$h_i(t) = h_0(t)\exp(x_i'\beta), \tag{1}$$
where $h_i(t)$ is the hazard of exiting an aged care facility for the $i$th observation at time $t$, $h_0(t)$ is the unspecified baseline hazard at time $t$, $\beta$ is the vector of coefficients and $x_i$ represents the corresponding covariates that influence the LOS at an aged care facility via $h_i(t)$.
An essential assumption for a valid Cox regression is that of proportional hazards. For two observations $i$ and $j$, using Eq. (1), it can be shown that
$$\frac{h_i(t)}{h_j(t)} = \frac{e^{x_i'\beta}}{e^{x_j'\beta}}.$$
In other words, the ratio of the hazards of any two observations is independent of $t$. Intuitively, this means that the corresponding survival curves will not cross for any two observations over the entire sample period.
If this assumption is violated, non-proportionality will often lead to inaccurate results. For instance, estimates under non-proportionality are argued to be sensitive to the type of departure from proportionality and the censoring pattern of the data (Dunkler et al., 2010; Xu & O'Quigley, 2000). If the covariate is categorical, a simple solution is to adopt stratification. However, the significance of the effect of the stratifying covariate cannot be examined in this case (Schemper et al., 2009).
To resolve this issue, we employ the weighted Cox regression of Schemper et al. (2009) to investigate the significance of covariates, owing to its improved efficiency and interpretability. The weighted Cox regression was originally proposed by Xu and O'Quigley (2000) to provide an (uninterpretable) average effect that is independent of the observed censoring pattern, and was further extended by Schemper et al. (2009) using a revised weighting strategy, such that the estimated average effect is interpretable. This improved weighted Cox regression has been demonstrated to be more efficient than the concordance regression (Dunkler et al., 2010), which is also independent of the type of non-proportional hazards and the censoring pattern. Specifically, in this study, estimates of the coefficients in Eq. (1) for a weighted Cox regression are obtained by solving the following set of equations:
$$\frac{\partial \log L(\beta)}{\partial \beta_r} = \sum_{j=1}^{m} w(t_j)\left[x_{(j)r} - \frac{\sum_{l \in R_j} x_{lr}\exp(x_l'\beta)}{\sum_{l \in R_j}\exp(x_l'\beta)}\right] = 0, \quad r = 1, \ldots, k, \tag{2}$$
where $\log L(\beta)$ is the log-likelihood function corresponding to Eq. (1), $\beta = (\beta_1, \beta_2, \ldots, \beta_k)'$, $t_j$ is the $j$th distinct uncensored survival time, $m$ is the total number of those distinct times, $x_{(j)r}$ is the $r$th covariate of the subject exiting at $t_j$, $R_j$ is the set of subjects without the event and uncensored prior to $t_j$ (also known as the risk set) and $w(t_j)$ is the weight. If all $w(t_j)$ are set to 1, then a standard Cox regression is solved. Let $S(t)$ be the survival function at time $t$ and $G(t)$ the cumulative probability of follow-up until $t$. Schemper et al. (2009) required that $w(t_j) = S(t_j)G(t_j)^{-1}$. Intuitively, those weights are proportional to the expected number of subjects at risk at each $t_j$ had there been no censoring. This then improves the interpretability of the estimated coefficients, compared with those with weights set to $G(t_j)^{-1}$ as in Xu and O'Quigley (2000).
After a weighted Cox regression is fitted, a standard Cox model is then constructed to obtain predictions, using the covariates found to be influential in the weighted model. In this standard regression, a categorical covariate will be stratified if it violates the proportional hazards assumption. Under the more complicated scenario in which non-proportionality is found for a non-categorical variable, we consider including time-by-covariate interactions. Specifically, as recommended by Dunkler et al. (2018), the RCS approach is employed in such cases, to allow for a parsimonious, but flexible, estimation of time-dependent effects for non-categorical covariates.
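The weighted estimating equations described above can be illustrated with a minimal sketch for a single covariate. It assumes no tied event times and takes the weights as an input rather than estimating S(t) and G(t); the function names `weighted_score` and `solve_beta` and the toy data are hypothetical and are not taken from the study. With all weights equal to 1, the sketch reduces to the standard Cox score equation.

```python
import math

def weighted_score(beta, times, events, x, weights=None):
    """Weighted Cox score for one covariate, assuming no tied event times.

    times: observed durations; events: 1 = exit observed, 0 = censored;
    x: covariate values; weights: maps event time -> w(t_j), default 1.
    """
    score = 0.0
    for j, (t_j, d_j) in enumerate(zip(times, events)):
        if not d_j:
            continue  # censored subjects contribute only through risk sets
        # Risk set R_j: subjects still under observation at t_j.
        risk = [l for l, t_l in enumerate(times) if t_l >= t_j]
        denom = sum(math.exp(x[l] * beta) for l in risk)
        numer = sum(x[l] * math.exp(x[l] * beta) for l in risk)
        w = 1.0 if weights is None else weights[t_j]
        score += w * (x[j] - numer / denom)
    return score

def solve_beta(times, events, x, weights=None, lo=-10.0, hi=10.0, tol=1e-10):
    """Find the root of the score by bisection (the score decreases in beta)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if weighted_score(mid, times, events, x, weights) > 0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Hypothetical toy data: months of stay, exit indicator, binary covariate.
toy_times = [2, 3, 5, 7, 8, 11]
toy_events = [1, 1, 0, 1, 1, 0]
toy_x = [1, 0, 1, 0, 1, 0]
beta_standard = solve_beta(toy_times, toy_events, toy_x)
```

Passing a dictionary such as `{2: 1.0, 3: 0.9, 7: 0.6, 8: 0.4}` as `weights` down-weights later event times, which is the mechanism by which a weighted fit yields an average effect under non-proportional hazards.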

Data
We use the residential aged care data provided by the AIHW, sourced from the National Aged Care Data Clearinghouse. This dataset was requested for the purpose of the Aged Care Assessment program and a Commonwealth-funded home-based aged care research project. Data privacy and confidentiality were ensured to fulfil the requirements of the Australian Institute of Health and Welfare Act 1987. Permission to collect this dataset was granted by the AIHW Ethics Committee. For this research project, we use a dataset that contains 793,323 records of all Australian residential aged care residents admitted into residential aged care facilities from 2008-2009 to 2018-2019.³ Detailed information on the residents, including the year of leaving the facility, admission year, the state in which the aged care facility is located, discharge reason, LOS, age group at discharge, gender, preferred language and marital status, is recorded in this dataset.
This dataset contains records for both permanent residential aged care and respite aged care. Permanent residential aged care provides people who cannot live independently with accommodation and aged care services. Respite care offers short-term aged care (which is usually capped at a certain period) to people. As stated in the dataset description, people might use the respite care multiple times as a transient service. Because of the features of respite care (highly censored with very short LOS), we include only the permanent aged care data in our analysis.
To facilitate our research, we treat an observation as censored if the aged care resident was alive at the end of our observation period, was transferred to another aged care facility or exited without a specified reason. Meanwhile, data on residents who died in an aged care facility are considered uncensored. Further, we exclude the 1-49 age group at discharge from our dataset because of its small size (about 0.3%). Our dataset does not contain information on the age at admission of an aged care resident, whereas the admission year and the age at exit are available. To estimate the age group at admission, which is essential to the subsequent analyses, we employ the age at discharge and the admission year via the midpoint approach with a uniform distribution assumption. Specifically, we assume that the population in the same 5-year age group is uniformly distributed; that is, the mean single-year age is the midpoint of that group. For example, for the age group 85-89, the mean single-year age is 87. Using this assumption and the difference between admission and discharge years, the 5-year age group at admission can be estimated. In this example, for residents in the age group 85-89 admitted in 2015 and discharged in 2022, the single-year age and 5-year age group at admission are estimated as 80 and 80-84, respectively. Note that owing to its open-interval nature, the midpoint of the last age group at discharge, the 100+ group, cannot be inferred; given also its small size (about 0.5%), we exclude this age group from our subsequent analyses. Last, to avoid the influence of outliers, we work with the trimmed dataset. In other words, the observations corresponding to the smallest 2.5% (i.e., < 0.3 months) and largest 2.5% (i.e., > 94 months) lengths of stay at an aged care facility are trimmed for the subsequent analyses. As shown in Fig. 1, in the untrimmed case, rare, but extremely large, outliers are evident, which, if not excluded, may reduce the reliability of the subsequent analyses. Furthermore, to be consistent with prior studies (Broad et al., 2015), we employ the LOS measured in months. The dataset is further refined such that there are no missing data in each of the following (categorical) covariates of an aged care resident: sex, age group at admission, residential state, language spoken and marital status. A summary of the untrimmed and trimmed data, in terms of the censorship and average LOS, is presented in Table 1.
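The midpoint approach described above can be sketched as follows. The function name `estimate_admission_age` and the string format of the age group are assumptions made for illustration only; the arithmetic follows the worked example in the text.

```python
def estimate_admission_age(discharge_group, admission_year, discharge_year, width=5):
    """Estimate single-year age and age group at admission (midpoint approach).

    discharge_group: closed age group at discharge, e.g. "85-89".
    Open-ended groups such as "100+" have no midpoint and are not supported.
    """
    lower, upper = (int(part) for part in discharge_group.split("-"))
    # Under the uniform assumption, the mean age is the group midpoint.
    midpoint = (lower + upper) // 2
    age = midpoint - (discharge_year - admission_year)
    group_lower = (age // width) * width
    return age, f"{group_lower}-{group_lower + width - 1}"

# Worked example from the text: admitted 2015, discharged 2022, aged 85-89.
example = estimate_admission_age("85-89", 2015, 2022)
```

For the example in the text, the function returns an estimated age of 80 and the group 80-84.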

All admission years combined
This section describes the investigated sample across those covariates with all admission years (2008-2018) combined. First, we consider the one-dimensional distribution according to each of the examined categorical covariates, with data for 2008-2018 combined. Altogether, there are 772,543 and 731,996 observations in the original and the trimmed datasets, respectively. The distributions (measured in percentages) are listed in Table 2. Overall, female residents were dominant (about 60%), and more than one-fourth of the residents were 85-89 years old. As for the residential state, New South Wales (NSW) and Victoria (VIC) accounted for more than 60% of these residents. Further, English was the preferred language of more than 90% of all residents, and almost half of the residents were widowed. Many of the above distributions (except those of sex and age) are quite consistent with their counterparts for the general Australian population (Raymer et al., 2018a, b).
Using only the uncensored data, we now discuss the average LOS at a residential aged care facility. There are 578,436 such observations in our trimmed dataset, indicating a censoring ratio of 21.0%. Note that the average lengths of stay of the full, censored and uncensored samples are 24.4, 30.9 and 22.6 months, respectively. The average LOS according to each of the categorical covariates is presented in Table 3. We observe that females (25.1 months) spend about 30% more time on average than males (19.0 months) at a residential aged care facility. In addition, the average LOS for the age groups in the 50-to-79-year range were close to each other and were mostly about 30 months. On the residential state dimension, only the average LOS (24.3 months) of the Northern Territory (NT) deviated more than trivially from that of the others (approximately 22-23 months). There were only minor differences at the subcategory level of the language spoken. As for marital status, married residents spent less time (20.5 months) than widowed residents (24.0 months) at an aged care facility. Some differences observed in Table 3 may be explained by the age-specific distributions at the subcategory level of each covariate. To demonstrate this, we plot those distributions in Fig. 2b-e. For instance, there are relatively more older residents in states other than the NT.
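The figures quoted above can be cross-checked with a few lines of arithmetic. This is only a consistency sketch using the rounded values reported in the text, so the pooled mean is expected to match the reported full-sample average only approximately; the variable names are illustrative.

```python
# Values as reported in the text (trimmed dataset).
trimmed_total = 731_996
uncensored = 578_436

# Share of censored observations in the trimmed dataset.
censoring_ratio = 1 - uncensored / trimmed_total

# Pooled mean LOS implied by the reported censored and uncensored means.
mean_uncensored, mean_censored = 22.6, 30.9
pooled_mean = ((1 - censoring_ratio) * mean_uncensored
               + censoring_ratio * mean_censored)

# Relative difference between female and male average LOS.
female_to_male = 25.1 / 19.0

print(f"censoring ratio: {censoring_ratio:.1%}")
print(f"pooled mean LOS: {pooled_mean:.1f} months")
print(f"female/male LOS ratio: {female_to_male:.2f}")
```

The small gap between the pooled mean and the reported 24.4 months is attributable to rounding of the inputs.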

Temporal patterns
In this section, we present the temporal trends and compare relevant differences based on the uncensored data. In Fig. 3a, it can be seen that the yearly average LOS stayed at about 25 months for 2008-2013, and then declined almost monotonically with time, to roughly 14 months in 2018.
It is important to note that the findings and discussion in this section are preliminary and are intended to motivate the study. The observed decline in the average length of stay since 2014 is not necessarily a real trend, as it reflects only the uncensored sample. Censoring is more likely to affect residents admitted later in the dataset (e.g., after 2014), because they are more likely to be still alive and living in aged care at the end of the observation period; those among them who are uncensored must have died early in their stay. The full sample (including censored observations) needs to be analysed with an appropriate model that explicitly incorporates the impact of censoring in the estimation (i.e., the Cox regression), to identify the real trend and obtain an accurate sense of the dynamic LOS. We provide scientifically rigorous results in section "Final model and predictions" and discuss the findings in section "Discussion".
Temporal patterns are further investigated for each of the categorical covariates in Fig. 3b-f. As for sex, despite similar trends, females consistently spent more time than males in all years. The differences, however, narrowed over the years. Regarding the age groups, despite the volatility, the groups in the 50-to-79-year range seemed to have nearly identical patterns, whereas the declining trends were less pronounced for the oldest groups in the 90-to-99-year range. More importantly, those in the older age groups consistently spent less time in an aged care facility. The temporal pattern of the NT is also volatile, owing to its small size. The average LOS in Tasmania (TAS) was somewhat lower than those of the other states, whereas the differences across all states, except the NT, were small for each year. Furthermore, there was a negligible difference in the LOS between residents whose preferred language is not English and those who prefer English as their primary language. Last, on average, married aged care users spent uniformly less time than widowed users in all years.
Overall, it appears that there may be significant temporal trends at the subcategory level of all relevant covariates. Recall that there are potentially significant differences in the average LOS across (old) age groups. Thus, temporal changes in the age-specific distribution of a covariate may explain the corresponding yearly patterns in the average LOS. For instance, it can be seen in Fig. 2a that the curve gradually shifts to the right side (older ages) over time. Thus, even if there are no temporal changes for the underlying age-specific hazard ratio to exit an aged care facility, with more older residents over years, a temporal trend similar to Fig. 3a will be observed. Therefore, we discuss the temporal changes of age and sex-specific distributions of other covariates in the next section.

Age and sex-specific pyramids
From the empirical evidence, it is credible to consider that age and sex are influential covariates on mortality (Australian Bureau of Statistics, 2019) and therefore on the hazard ratios of aged care LOS. Consequently, we examine dynamic changes in age and sex-specific distributions for the investigated covariates. These changes may help explain the observed differences of some covariates in section "Temporal patterns". Specifically, following Raymer et al. (2018a), we present demographic pyramids, which are useful to demonstrate age and sex-specific distributions simultaneously. Moreover, to further incorporate the temporal changes, we contrast the distributions of 2008 and 2018 in those pyramids. In all cases, the base of each proportion is the total population (i.e., male and female population combined). That is, in each year, all proportions of females and males add up to 1. First, pyramids are produced for each state in Fig. 4. Consistently, in all states there were more older female than older male aged care residents. Further, for males, the proportions at older ages were growing over time. For instance, in most cases, the dominant age group for males changed from 80-84 in 2008 to 85-89 in 2018. Owing to the small population of NT residents, the distributions were more irregular and differed from those of other states. This may explain the differences between the temporal trends presented in Fig. 3d.
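The normalisation used for the pyramids (each bar expressed as a share of the combined male and female total) can be sketched as below. The function name `pyramid_proportions` and the counts are hypothetical and serve only to show that the shares for one year sum to 1.

```python
def pyramid_proportions(counts):
    """Convert {(sex, age_group): count} into shares of the combined total."""
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}

# Hypothetical counts for a single state and year (not from the dataset).
counts_example = {
    ("female", "80-84"): 300,
    ("female", "85-89"): 450,
    ("male", "80-84"): 150,
    ("male", "85-89"): 100,
}
shares = pyramid_proportions(counts_example)
```

Because the denominator pools both sexes, the bars of the two sides of a pyramid are directly comparable within a year.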
In addition, pyramids at the subcategory level are generated for preferred language and marital status in Figs. 5 and 6, respectively. Figure 5a, b present nearly identical distributions, whereas many differences are observed across Fig. 6a-e. For instance, there were more males in the married group, whereas females dominated the widowed group. This may help explain the seemingly contradictory observations in Figs. 2e and 3f: there were more younger residents in the married group than in the widowed group (see Fig. 2e), suggesting a longer length of stay for the married group; however, the average LOS of the married group was less than that of the widowed group (see Fig. 3f). The different sex distributions of these two marital categories presented in Fig. 6 help explain this contradiction. That is, given that males tend to have a shorter length of stay than females, the married group, with a high proportion of males (see Fig. 6b), experienced a shorter length of stay than the widowed group, which was dominated by females (see Fig. 6e).
To examine the influences of covariates formally, statistical tests are conducted, as discussed in section "Preliminary analyses".

Preliminary analyses
The details of the preliminary analyses are stated in Appendix A. In summary, our preliminary analyses suggest that the four influential covariates on the hazard ratio of aged care LOS are sex, age, admission year and marital status. Specifically, significant estimates in Table 4 indicate that with all other covariates held the same, males are more likely than females to exit an aged care facility; older residents are expected to stay for a shorter period; and within our sample period, residents may stay for a shorter period in the future (considering the quadratic influence of admission year). More importantly, the age groups may influence males and females differently, which motivates a separate analysis for each sex. In the subsequent analysis, we combine the age groups in the 50-79 (50-69) range into one group for females (males). We discuss the final model using those covariates in a standard Cox model and the predicted results in the next section.

Final model and predictions
First, a standard Cox model consisting of admission year, age (with five subgroups: 50-79, 80-84, 85-89, 90-94 and 95-99 for females and seven subgroups: 50-69, 70-74, 75-79, 80-84, 85-89, 90-94 and 95-99 for males) and marital status (with three subgroups: married, never married and widowed) is examined for females and males. To investigate whether the proportional hazard assumption is met, we perform the test proposed by Grambsch and Therneau (1994). The χ² test statistics of admission year, age, sex and marital status are 123, 7677, 861 and 8607, respectively, indicating significant non-proportional hazards in all cases. As stated in section "Methods" and recommended by Dunkler et al. (2018), we then apply stratification on age and marital status, and employ the RCS approach for the admission year.
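To make the RCS approach more concrete, the sketch below builds a restricted cubic spline basis in the commonly used truncated-power parameterisation (without scaling). The knot locations, the choice of four knots and the function name `rcs_basis` are assumptions for illustration; the study selects its specification by the Bayesian information criterion and does not report knot positions here.

```python
def rcs_basis(x, knots):
    """Restricted cubic spline basis evaluated at a single point x.

    For k knots, returns k - 1 values: the linear term plus k - 2 cubic
    terms that are constrained to be linear beyond the boundary knots.
    """
    def pos_cube(u):
        # Truncated cubic: zero to the left of the knot.
        return max(u, 0.0) ** 3

    t = sorted(knots)
    k = len(t)
    if k < 3:
        raise ValueError("need at least three knots")
    span = t[-1] - t[-2]
    basis = [float(x)]
    for j in range(k - 2):
        term = (pos_cube(x - t[j])
                - pos_cube(x - t[-2]) * (t[-1] - t[j]) / span
                + pos_cube(x - t[-1]) * (t[-2] - t[j]) / span)
        basis.append(term)
    return basis

# Hypothetical knots over the admission years in the sample.
example_knots = [2009, 2012, 2015, 2018]
row_2014 = rcs_basis(2014, example_knots)
```

Because every basis term is linear beyond the last knot, a fitted effect of admission year extrapolates linearly past the observed years, which is one reason such a basis is a cautious choice when predicting out of sample.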
To select the degrees of freedom (knots) of RCS, we use the Bayesian information criterion, which favours more parsimonious specifications. Note that the application of RCS will fit the most appropriate polynomial of temporal patterns, such that a non-linear influence is considered. The final standard Cox model is composed of the RCS of admission year with four degrees of freedom, as well as stratified age and marital status, for both females and males. The log-likelihood ratio and score tests both suggest that the RCS of admission year can significantly influence the hazard ratio of aged care stay, after considering stratified covariates. The test statistics are 3268 and 3263 (1333 and 1332), respectively, for females (males) and both follow a χ² distribution with three degrees of freedom. Therefore, we use the fitted coefficients to estimate in-sample hazard ratios and predict out-of-sample ones. The corresponding survival curves and expected aged care LOS can then be produced.
First, we compare the in-sample estimates of expected aged care LOS using the final model to those fitted by the Kaplan-Meier (KM) method. Results are plotted in Fig. 7. Note that the KM estimates are obtained using subsamples that consider all covariates jointly. For instance, the sample for 2008 of married females in the 50-79 age group is employed to produce the corresponding KM result. In contrast, the Cox regression utilises the full sample (2008-2018) for females and males separately in all cases. This may explain the relative stability of its estimates compared with those of its KM counterparts. For example, the KM results for never-married males aged 95-99 are more volatile over the years, owing to the small subsample sizes. Despite this volatility, the Cox results are overall consistent with the KM estimates. Notably, after considering age, marital status and their interaction, the expected LOS reaches its peak in 2016 and then slowly declines for both female and male groups. This pattern is slightly different from the preliminary observations presented in section "Data", where only the uncensored data were considered.⁴
These results suggest that the fitted Cox model is able to provide accurate in-sample estimates. Since the RCS of the admission year is parsimonious, the overfitting issue is avoided, and the out-of-sample prediction is therefore expected to be reliable. The fitted and predicted expected aged care LOS for 2008-2040 with 95% confidence intervals (CIs) using the final model are plotted in Fig. 8.
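For readers unfamiliar with the KM benchmark, the sketch below computes a Kaplan-Meier survival curve and the area under it up to a fixed horizon, which is one common way to turn a survival curve into an expected length of stay. The text does not state how the expected LOS is derived from the curves, so the restricted-mean step, the function names and the toy data are all assumptions.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times: observed durations in months; events: 1 = death, 0 = censored.
    """
    curve, surv = [], 1.0
    for t in sorted({t for t, d in zip(times, events) if d}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, d in zip(times, events) if u == t and d)
        surv *= 1 - deaths / at_risk
        curve.append((t, surv))
    return curve

def restricted_mean(curve, horizon):
    """Area under the KM step function from 0 to the horizon."""
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in curve:
        if t >= horizon:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    return area + prev_s * (horizon - prev_t)

# Hypothetical subsample: five residents, two of them censored.
km_curve = kaplan_meier([3, 6, 6, 10, 12], [1, 1, 0, 1, 0])
expected_los = restricted_mean(km_curve, horizon=12)
```

With very small subsamples, each death moves the step function substantially, which illustrates why KM estimates for sparse groups are more volatile than model-based ones.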
Overall, we conclude that married residents have the shortest LOS, whereas widowed users have the longest LOS. In all cases, since 2017, the expected LOS has decreased steadily over time. On comparing the estimates/predictions for 2018 and 2040, we find that, on average, the aged care LOS will decrease by about 10 months for females and 3.5 months for males. As of 2018, the shortest expected aged care LOS was 10.5 months for married males aged 95-99, and the longest was 50.6 months for widowed females aged 50-79. In 2040, the shortest and longest expected lengths will be 9.3 months for males aged 95-99, and 40.8 months for females aged 50-79, respectively.
Last, we investigate the fitted and predicted survival curves of aged care stay in 2018 and 2040, respectively. The curves (with 95% CIs) are plotted in Fig. 9. For the same age and marital status subcategory, the predicted curve shifts slightly to the left from 2018 to 2040, indicating a shorter expected LOS. Moreover, a comparison of the widths of the CIs in 2018 and 2040 shows that the uncertainty grows substantially. Nevertheless, consistent with Fig. 8, the estimated survival probabilities are highest for the widowed group (longest expected LOS) and lowest for the married group (shortest expected LOS) in all cases.

Summary of results
Analysing the patterns of use of such services is of crucial importance to improving both the management of patient flow in residential aged care facilities and the policy formulation to support the stable residential aged care population.

Discussion of results
The results demonstrate that females are significantly more likely to have experienced a longer LOS when they exit from an Australian residential aged care facility. This finding is consistent with those of prior studies that females have a greater probability to enter such a facility (Grundy & Jitlal, 2007; Luppa et al., 2009; McCann et al., 2012) and that a large proportion of the residential aged care population is female (Australian Institute of Health and Welfare, 2022b). Females' longer LOS at such facilities might be explained by gender differences in life expectancy, health level, marital pattern and living arrangement in later life. Specifically, females generally outlive males (e.g., in 2019-2021, in Australia, female life expectancy was 85.4 years as against male life expectancy of 81.3 years) (Australian Bureau of Statistics, 2021); however, older females are more likely to be unhealthier than their older male counterparts, given that unhealthy females are more likely to survive to old age than unhealthy males because of the lower mortality of females in young age (Oksuzyan et al., 2008). In addition, females tend to marry slightly older males and have a lower remarriage rate in old age when widowed (Luppa et al., 2009). Therefore, females generally face a longer lifetime of being frail and living without care support from their spouse. This creates a greater demand for aged care services from them, and hence, the longer female LOS at residential aged care facilities.
The study also demonstrates that residents admitted at an older age are more likely to experience a shorter LOS than those admitted at a younger age. The negative association between LOS and age at admission is particularly marked at age 75 and above, with the average LOS declining significantly from approximately 30 months for age group 75-80 to approximately 10 months for age group 95-99. This pattern mirrors the negative association between LOS and age at admission reported previously (Cooper-Stanbury & Howe, 2021). It might stem from the poorer health conditions, and therefore greater care demand, of residents admitted at a young age. That is, given their preference for ageing at home, people generally accept being transferred to residential aged care facilities only when their health deteriorates and they need more intensive care. Therefore, people admitted at a young age are more likely to have long-term health conditions (e.g., disability) and a higher dependency level, and hence they have to access residential aged care at a young age and stay for a long period.
Another interesting finding is that LOS slightly increased from 2008-2009 to 2016-2017 but steadily decreased from 2016-2017 to 2018-2019, with the decreasing trend projected to continue up to 2040. This finding differs from that of Gibson (2020), who reported a decrease in the average LOS from approximately 34-35 months between 2007-2008 and 2010-2011 to around 30 months in 2016-2017 and 2017-2018. It also differs from that of Cooper-Stanbury and Howe (2021), who observed a fluctuating increase in both the mean (from 27.4 to 29.9 months) and the median (from 15.9 to 19.1 months) LOS from 2008-2009 to 2018-2019. Moreover, our finding differs from that reported in an AIHW report (Australian Institute of Health and Welfare, 2023), which indicated an overall increase in median LOS over time (i.e., from 16.8 months in 2012-2013 to 22.4 months in 2021-2022) among Australian residential aged care facilities. These differences suggest that the trends in LOS at such facilities might be more complicated than previously thought. The increase in LOS from 2008-2009 to 2016-2017, which was particularly significant from 2014-2015 to 2016-2017, might be driven by the removal of the distinction between low care and high care in Australian residential aged care facilities in 2014 to achieve 'ageing in place' (Department of Social Services, 2014). Consequently, residents would no longer be transferred because of a change in their care level, resulting in a reduced number of admissions but an increased LOS per admission.
The increase in LOS from 2008-2009 to 2016-2017 was followed by a gradual decrease from 2016-2017 to 2018-2019; this shift is likely due to combined changes in multiple aspects, particularly the increased utilisation of aged care services in home and community settings and changes in the morbidity pattern. Importantly, the growing availability of home- and community-based care services provides older people with alternatives to transferring to residential aged care facilities for care support. In Australia, the number of home care recipients grew substantially from 50,871 to 176,157 (by 246%) during 2011-2021, dwarfing the contemporaneous growth of 14% in residential aged care recipients (Australian Institute of Health and Welfare, 2022b). Studies have demonstrated that the use of home- and community-based care services could reduce, or delay, access to residential aged care facilities (Jorgensen et al., 2018) and hence shorten older people's LOS at these facilities. In addition, the changing morbidity pattern of older Australians might also contribute to the shortened LOS in such facilities. Overall, older Australians are now less likely to live with disability, a major predictor of being institutionalised, given that the prevalence of all types of disability decreased from 82.1% to 78.0% during 2003-2018 (Australian Bureau of Statistics, 2021; Gaugler et al., 2007). In addition, dementia, another major cause of using residential aged care (Gaugler et al., 2007), has also become less common among the residential aged care population, with the age- and gender-standardised prevalence of dementia declining from 50.0% in 2008 to 46.6% in 2014 among Australians accessing residential aged care services (Harrison et al., 2020). The decline in these two major morbidities could have reduced older Australians' need to access these services and thus shortened the LOS.
The results also indicate that widowed/unmarried residents are more likely to have a long LOS than married residents. This finding aligns with that of previous studies, namely, that having a spouse/partner reduces the likelihood of entering residential aged care facilities (Grundy & Jitlal, 2007; Kendig et al., 2017). Possibly, widowed/unmarried individuals use such facilities to a greater extent because they are unable to obtain care support from a spouse/partner when they are ill or disabled, forcing them to access residential aged care earlier. Even when widowed/unmarried people are in reasonably good health, they may prefer entering residential aged care facilities earlier than married people because they need assistance in daily life and have unmet psychological and social needs due to the lack of support and companionship from a spouse/partner. Moreover, having a spouse/partner may also increase the likelihood that married people use respite care services at admission, given their preference for living at home and being cared for by their spouse/partner once their health recovers. The greater use of residential aged care services among widowed/unmarried individuals has implications for Australia: given the growing life expectancy and declining marriage rate in the country, a growing number of Australians might not have support from a spouse/partner in later life, which would increase dependency on the residential aged care sector.

Policy implications
This study has several crucial implications for policy formulation aimed at improving residential aged care services in Australia and elsewhere. First, the strong preponderance of females in residential aged care facilities, characterised by not only a higher probability of admission but also a longer LOS, necessitates further research into female care demand and quality of life as well as gender differences in wellbeing and health outcomes in the residential aged care setting. Second, given the positive effect of marriage in reducing the need to use residential aged care services, the implementation of marriage/family-friendly policies, which could increase working flexibility, ease work and family conflicts and improve work-life satisfaction, is suggested. It is also suggested that the social stigmatisation of remarriage/cohabitation among older adults, which exists in certain cultural contexts, be eliminated to help people re-enter a marriage or a relationship in later life. In addition, it is recommended that the support from family members for residents be strengthened. This outcome might be achieved by establishing more family-friendly spaces for relatives and children to visit and by increasing the accommodation capacity in residential aged care facilities to allow overnight stays by family members. Third, continued policy support is needed for the provision of home-/community-based care services to help older people age at home or in their communities. Such efforts might include measures to increase safety at home, facilitate communication and interaction between community residents, improve home ownership among older people and increase access to clinical services and medical support in the community setting. These measures will alleviate the growing burden borne by the residential aged care system and also contribute to increasing the quality of life of older people before they need to access residential aged care facilities.

Strengths and limitations
This is the first study to project the future LOS at Australia's residential aged care facilities, and thus it provides rare insights into the likely future changes in the utilisation of such facilities in the country. The major strength of this study lies in its use of a large, complete dataset (approximately 770,000 episodes) and of accurate LOS information at the episode level, which provided a solid foundation for the analysis of hazard ratios. Another important strength is the use of a weighted Cox regression model, which does not rely on the proportional hazards assumption of the standard Cox regression model. This enhances the validity of parameter inference and provides unbiased average hazard ratio estimates.
Nevertheless, this study has some limitations. First, we used the age at discharge and admission year to estimate the age at admission. This estimation relies on the midpoint approach, which assumes that the population is uniformly distributed within each age group, and hence extra caution may be needed when adopting this approach. However, this limitation should not be a major concern, given that we removed both the top and the bottom 2.5% outliers of LOS from the analysis. Second, the dataset used in this study does not include non-mainstream programs that largely target Aboriginal and Torres Strait Islander people, such as the National Aboriginal and Torres Strait Islander Flexible Aged Care Program and the Remote and Aboriginal and Torres Strait Islander Aged Care Service Development Assistance Panel. The exclusion of such programs might not significantly affect the estimates for the whole Australian population and for states with large populations (e.g., NSW and VIC), but it might affect the results for states with a small population and a large proportion of Aboriginal and Torres Strait Islander people (e.g., the NT). This factor explains the deviation of the results for the NT from those for other states, as observed in Table 2, although the deviation is found to be insignificant in our subsequent analyses. Third, the dataset does not provide information on care needs, a key predictor of the demand for aged care services, which is vital to determining the use of residential aged care services (Cooper-Stanbury & Howe, 2021). Similarly, given data limitations, we were unable to explore the link between LOS and other important variables, including residents' morbidities, the number of staff at the facility and the facility location. We encourage researchers to explore LOS at residential aged care facilities along these dimensions if data are available. Fourth, this dataset does not include aged care service information about older Aboriginal and Torres Strait Islander Australians who use aged care services via a flexible model. Such information could help explain the age structure of the NT but is unavailable in the dataset that we used.
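The midpoint approach and the 2.5% trimming mentioned under the first limitation can be sketched as follows. This is an illustration under assumptions rather than the study's actual code: the function names, the `'85-89'` label format, the use of LOS (instead of calendar years) to step back from discharge to admission, and the quantile rule in `statistics.quantiles` are all choices made here for demonstration.

```python
import statistics


def admission_age(discharge_group, los_months):
    """Estimate age at admission from a discharge age group such as '85-89'.

    Assumption: ages are uniformly distributed within the group, so the
    group is represented by its midpoint; LOS is converted to years.
    """
    low, high = (int(x) for x in discharge_group.split("-"))
    midpoint = (low + high + 1) / 2  # '85-89' covers [85, 90), midpoint 87.5
    return midpoint - los_months / 12


def trim_tails(values):
    """Keep observations between the 2.5th and 97.5th percentiles."""
    cuts = statistics.quantiles(values, n=40)  # 39 cut points, 2.5% apart
    lower_cut, upper_cut = cuts[0], cuts[-1]
    return [v for v in values if lower_cut <= v <= upper_cut]


# Hypothetical resident: discharged in group 85-89 after a 30-month stay.
example_age = admission_age("85-89", 30)
trimmed = trim_tails(list(range(1, 101)))
```

The midpoint can misstate an individual's age by up to half the width of the group, which is why the text advises caution when this approach is adopted.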
Furthermore, the findings of this study should be treated with caution because projecting the future trends of residential aged care use inevitably involves considerable uncertainty. The sources of uncertainty include unpredictable future changes in aged care policy, changes in the pattern of aged care utilisation (e.g., increased availability of home care) and unforeseen technological innovation that may shape the quality of aged care service provision. Additionally, this study makes projections based on one decade of data (2008-2018) and certain specific assumptions (e.g., a uniform distribution of ages within each admission age group) about future trends of residential aged care use; thus, the projections represent only one of many possibilities. Moreover, while this study has controlled for key covariates (e.g., age, sex and marital status) in the projection model, other factors that may affect residential aged care use (e.g., number of children and income level) are not included in the analysis because of data availability. Future studies are recommended to address these limitations when data are available.

Concluding remarks
In this study, we analysed data from the AIHW on people leaving Australian residential aged care facilities in 2008-2018, using an improved Cox regression model. Our analyses demonstrated that sex, age at admission, marital status and admission year are four prominent covariates that affect LOS at such facilities, with residents who are female, admitted at a young age and widowed more likely to experience a longer LOS. We also found that LOS at these facilities increased slightly from 2008-2009 to 2016-2017, followed by a gradual decrease from 2016-2017 to 2018-2019. Moreover, our analysis indicates a steady trend towards a reduction in LOS until 2040, although at a low rate. These findings reveal the evolving profiles of residential aged care users over more than a decade and the likely future course in Australia. Thus, these findings have important policy implications for enhancing the quality of such care services and for constructing a supportive ageing environment in the residential aged care setting in Australia and other ageing societies.

Appendix A
We consider preliminary pairwise analyses to illustrate the potential influence of each categorical covariate. In all cases, the subsample of 2008 is employed for demonstrative purposes. The traditional log-rank test is not employed to compare whether two fitted survival curves differ significantly, because its power is reduced when the hazard ratio is not constant, especially when the hazard functions of the two groups cross (Yang & Prentice, 2005, 2011; Yang & Zhao, 2012). Instead, the adaptively weighted log-rank (AWL) test proposed in Yang and Prentice (2010) is employed, which is more robust when the hazard ratio is not constant.
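For orientation, the sketch below implements a generic two-sample weighted log-rank statistic in plain Python. With the default weight of 1 it reduces to the classical log-rank test; other weight functions change which part of the follow-up period is emphasised. It is not the AWL test itself, whose adaptive weights are defined in the cited papers; the function name, the weight argument and the toy data are assumptions made for illustration.

```python
import math


def weighted_logrank(times_a, events_a, times_b, events_b, weight=lambda s: 1.0):
    """Two-sample weighted log-rank test; weight = 1 gives the classical test.

    times_*  -- observed durations; events_* -- 1 if exit observed, 0 if censored
    weight   -- function of the pooled Kaplan-Meier survival just before each
                event time (a generic weighting, NOT the AWL scheme)
    Returns (chi-square statistic, p-value on 1 degree of freedom).
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    score, variance, pooled_surv = 0.0, 0.0, 1.0
    for t in event_times:
        n = sum(1 for u, _, _ in data if u >= t)                # at risk, pooled
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)   # at risk, group A
        d = sum(1 for u, e, _ in data if u == t and e)          # events, pooled
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        w = weight(pooled_surv)
        score += w * (d_a - d * n_a / n)                        # observed - expected
        if n > 1:
            variance += w * w * d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
        pooled_surv *= 1 - d / n
    if variance == 0:
        return 0.0, 1.0
    stat = score * score / variance
    p_value = math.erfc(math.sqrt(stat / 2))  # upper tail of chi-square(1)
    return stat, p_value


# Toy data: every exit in group A precedes every exit in group B.
stat, p = weighted_logrank([1, 2, 3], [1, 1, 1], [4, 5, 6], [1, 1, 1])
```

Passing, for example, `weight=lambda s: s` places more weight on early exits, which illustrates how a weighting scheme can target departures from a constant hazard ratio.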
First, we focus on the covariate of age, using the largest age group (85-89) and the next younger category (80-84) for comparison. To control for the potential interaction between sex and age, the subgroup of females is analyzed. The corresponding fitted Kaplan-Meier (KM) survival curves with 95% confidence intervals (CIs) are plotted in Fig. 10a. Significant differences are observed at almost all lengths of stay. The p-value of the AWL test is reported as 0, suggesting that age is a significant factor influencing the hazard ratio. Further, aged care residents in the younger group are likely to stay longer than those in the older group.
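The KM curves and their 95% CIs referred to above can be reproduced in principle with the standard product-limit estimator. The sketch below is a minimal version with Greenwood's variance formula and plain (untransformed) normal intervals; the function name and the toy data are assumptions, and the study's own software and interval type may differ.

```python
import math


def kaplan_meier(times, events, z=1.96):
    """Kaplan-Meier survival estimate with Greenwood-based 95% intervals.

    times  -- length of stay in months
    events -- 1 if the exit was observed, 0 if the episode is censored
    Returns a list of (t, S(t), lower, upper) at each observed exit time.
    """
    curve, surv, greenwood = [], 1.0, 0.0
    for t in sorted({u for u, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        exits = sum(1 for u, e in zip(times, events) if u == t and e)
        surv *= 1 - exits / at_risk                 # product-limit update
        if at_risk > exits:                         # avoid division by zero
            greenwood += exits / (at_risk * (at_risk - exits))
        half_width = z * surv * math.sqrt(greenwood)
        curve.append((t, surv, max(0.0, surv - half_width), min(1.0, surv + half_width)))
    return curve


# Toy episodes (months); the second episode at 4 months is censored.
km = kaplan_meier([2, 4, 4, 6, 8], [1, 1, 0, 1, 1])
```

Two groups whose intervals do not overlap over most of the follow-up period, as described for Fig. 10a, are the visual counterpart of a significant test result.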
Next, we consider the variable of sex. For the same reason of possible interactions, the largest age group (85-89) is examined. As shown in Fig. 10b, over almost the entire range of months (0, 100), females are more likely than males to remain in an aged care facility. The p-value of the AWL test is reported as 0, indicating that sex is also an important covariate.
Since age and sex are both potentially influential, we use females who were aged 85-89 and entered the facility in 2008 in the following analyses. To illustrate the impact of state and marital status, the pair of NSW versus VIC (the two states with the most observations) and that of married versus widowed (the two dominant subcategories) are compared, respectively. The binary subcategories of preferred language are also contrasted. The fitted KM survival curves are plotted in Fig. 10c-e. According to the AWL test, the only covariate that significantly influences the hazard ratio of aged care stay is marital status.
Weighted Cox regression models are now fitted to provide more informative results on the significance of categorical covariates. The main model uses the average hazard ratios (AHR) proposed in Schemper et al. (2009), and the average regression effects (ARE) developed in Xu and O'Quigley (2000) are employed as a robustness check.
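For readers less familiar with this model class, the standard Cox specification on which the weighted variants build can be written as follows. This is the generic textbook form; the symbols $h_0$, $\beta$ and $x$ are notation assumed here and may differ from that used in the Methods section.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right),
\qquad
\mathrm{HR}_k = \frac{h(t \mid x_k = 1)}{h(t \mid x_k = 0)} = \exp(\beta_k)
```

Here $\mathrm{HR}_k$ is the hazard ratio for a binary covariate $x_k$ with all other covariates held fixed. Under proportional hazards, $\mathrm{HR}_k$ does not depend on $t$. When that assumption fails, the ratio varies over time, and the weighted approaches cited above summarise it as a single average effect; their specific weighting schemes are given in the cited papers and are not reproduced here.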
We first focus on the subsample of 2008-2009 with age of 85-89 to illustrate the significance of sex, residential state, preferred language and marital status. The baseline subcategories are female, NSW, English and married, respectively. Results are presented in the left panel of Table 4. It can be seen that the only two significant (at 1%) covariates are sex and marital status. Specifically, the impacts of the divorced and separated groups are statistically indistinguishable from that of the married group. Therefore, we combine the three subcategories and rename the combined group "married" in the subsequent analyses.
Second, we analyze the impact of the non-categorical variable of admission year. Using data on females aged 85-89, we fit this single covariate using weighted Cox regression with AHR and ARE. Because of the potentially quadratic temporal pattern shown in Fig. 3, both the original admission year and its squared term are included. The results are reported in the right panel of Table 4. They indicate that admission year significantly affects the underlying hazard ratio. Thus, at this stage, our preliminary analyses suggest that the covariates of sex and admission year are influential in studying the aged care LOS.
Recall from the "Methods" section that weighted Cox regression cannot be employed for prediction purposes. Instead, the standard Cox model needs to be used. In such a case, if the proportional hazards assumption is violated, the categorical variables need to be stratified. However, this is only practical for covariates with a small number of subcategories (Dunkler et al., 2018). Further, as demonstrated in Table 3, the average lengths of aged care stay for younger age groups may not differ greatly from each other. Thus, it is worth investigating whether some subcategories of age could be combined before we formally test its significance in influencing the aged care LOS. For this aim, we employ the uncensored data only and display the age- and sex-specific average LOS in Fig. 11. Specifically, the female and male yearly averages are presented in (A) and (C), respectively, whereas the female and male means considering all years are plotted in (B) and (D), respectively. For the yearly cases, despite the volatility, female and male residents demonstrate different shapes of declining trends. Using the combined plots, it can be seen that the average LOS of females (males) younger than 80 (70) may be deemed identical. This then provides motivation to combine age groups differently for males and females.
Weighted Cox regressions are then fitted to preliminarily demonstrate the significant differences among age groups and to support such combinations. Using the data of 2008, the results for females (left) and males (right) are presented in Table 5, with a baseline group of 50-54 in both cases. Clearly, the older subcategories were individually significant at 1% in both the AHR and ARE models. Those younger than 80 (70), however, were not individually significant for females (males) and may be combined as a single group.
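The sex-specific merging of age groups suggested by these results can be expressed as a simple recoding rule. The sketch below is illustrative only: the function name, the `'50-54'` label format, the lower bound of 50 (taken from the baseline group) and the handling of open-ended labels such as `'100+'` are assumptions.

```python
def combined_age_group(age_group, sex):
    """Map a five-year admission age group to the merged grouping.

    Assumes labels like '50-54' (or '100+') and merge thresholds of
    80 for females and 70 for males, following the preliminary results.
    """
    low = int(age_group.split("-")[0].rstrip("+"))
    threshold = 80 if sex == "female" else 70
    if low < threshold:
        return f"50-{threshold - 1}"  # single merged group below the threshold
    return age_group                  # older groups are kept separate


# Hypothetical records: (age group at admission, sex)
records = [("75-79", "female"), ("75-79", "male"), ("65-69", "male"), ("85-89", "female")]
recoded = [combined_age_group(group, sex) for group, sex in records]
```

Merging in this way keeps the number of strata small, which matters when a categorical covariate has to be stratified in a standard Cox model.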

Fig. 1 Frequencies of lengths of stay at an aged-care facility. a Untrimmed data (dashed line is the 97.5th percentile, 116.1). b Trimmed data (ranging between 0.4 and 116)

Fig. 2 Age-specific distributions based on population stocks across various covariates. a Year. b Sex. c State. d Preferred language. e Marital status

Fig. 3 Temporal plots of average lengths of stay at an aged care facility across various covariates: uncensored data over 2008-2018. a Year. b Sex. c Age. d State. e Preferred language. f Marital status

Fig. 4 Distributions in pyramids based on population stocks across states. a New South Wales. b Victoria. c Queensland. d South Australia. e Western Australia. f Tasmania. g Australian Capital Territory. h Northern Territory

Fig. 5 Distributions in pyramids across preferred language. a English. b Other languages

Fig. 6 Distributions in pyramids across marital status. a Divorced. b Married. c Never married. d Separated. e Widowed

Fig. 7 Fitted expected lengths of stay at aged care facilities: Cox regression vs Kaplan-Meier estimation. a Female. b Male

Table 1
Summary of untrimmed and trimmed data. N (%) is the number of observations (percentage of the total category). LOS is the length of stay at an aged care facility, measured in months

Table 2
Distributions of categorical covariates, measured in percentages

Table 3
Average LOS in months at an aged care facility (uncensored data)

Using an improved Cox regression model and nationally representative AIHW data on age groups from 2008-2009 to 2018-2019, this study provides an in-depth investigation into the current patterns and identifies future trends of LOS in Australian residential aged care facilities. Results demonstrate that four factors (sex, age, admission year and marital status) affect the hazard ratios of LOS at an Australian residential aged care facility. Specifically, females are significantly more likely to stay longer than males, while residents admitted at an older age are prone to experience a shorter LOS than those with a younger entry age. Furthermore, married residents have the shortest LOS, and widowed residents have the longest LOS. In addition, overall, the average LOS exhibited an increasing trend from 2008-2009 to 2016-2017 and a decreasing trend from 2016-2017 to 2018-2019; this decreasing trend is projected to continue, but at a slower rate, until 2040. These results provide new understanding of the current user profile and the future trajectories of residential aged care utilisation in Australia, offering new insights that would help in formulating various long-term policies to optimise residential aged care in Australia and elsewhere.

Table 4
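Preliminary analyses: weighted Cox regression. AHR and ARE are weighted Cox regressions with average hazard ratios (Schemper et al., 2009) and average regression effects (Xu & O'Quigley, 2000), respectively. Values in parentheses are the corresponding robust standard errors (Lin & Wei, 1989). *** indicates significance at the 1% level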

Table 5
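Preliminary analyses for age groups: weighted Cox regression. AHR and ARE are weighted Cox regressions with average hazard ratios (Schemper et al., 2009) and average regression effects (Xu & O'Quigley, 2000), respectively. Values in parentheses are the corresponding robust standard errors (Lin & Wei, 1989). *** indicates significance at the 1% level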