Measuring the Activity of Mental Health Services in England: Variation in Categorising Activity for Payment Purposes

In the context of international interest in reforming mental health payment systems, national policy in England has sought to move towards an episodic funding approach. Patients are categorised into care clusters, and providers will be paid for episodes of care for patients within each cluster. For the payment system to work, clusters need to be appropriately homogenous in terms of financial resource use. We examine variation in costs and activity within clusters and across health care providers. We find that the large variation between providers with respect to costs within clusters means that a cluster-based episodic payment system would have substantially different financial impacts across providers.


Introduction
There is significant international interest in developing payment approaches in mental health services that create the right incentives to use scarce resources more efficiently without compromising care quality (Moran and Jacobs 2017). Public providers of mental health services in England (National Health Service (NHS) Trusts) have historically been funded through block contracts agreed between commissioners (who purchase care services) and providers of care, and usually this has been on the basis of levels of existing 'inputs' such as the number of beds (Mason et al. 2011). This method of financing offers little financial incentive for providers to efficiently meet health care needs and can have perverse incentives for quality of care (Jacobs 2014).
For England, starting in 2012 the Department of Health mandated that mental health services move away from block contracts and use newly developed care clusters to classify patients according to their needs, with payment related to episodes of care within a cluster, an approach referred to as episodic payment (Department of Health 2010, 2013). Implementation of the use of the clusters and development of payment models has largely been devolved to local commissioners and providers of mental health care. Subsequently, there has been much debate about the use of the clusters, with hope that the approach had merit but also caution that much more work was needed to understand the implementation of care clusters and the development of payment systems based on them; a key need being robust analysis of cluster data (see, for example, NHS Confederation 2011; Clark 2011; Jacobs 2014).
With the sustained focus on the implementation of the cluster payment approach over a number of years, there has only recently been a more substantial quantity of care cluster data available to undertake analysis of the categorisation of patients. We contribute to the need for more robust analysis of clusters. In this paper we examine nationally available cluster data and consider how well the approach to clustering is operating, and reflect on the feasibility of its further development as a basis for episodic payment in mental health care.
The potential move towards an episodic payment system (Khan et al. 2014) would align mental health services more closely to the payment system used for acute physical health care in England, formerly called payment by results (PbR) but now known as the national tariff payment system (NTPS). Potential benefits of an episodic-based payment approach are that it increases transparency and accountability (The Mental Health Taskforce 2016) and can incentivise providers to control unit costs and deliver more efficient care (Jacobs 2014).
There is international interest in developing new forms of payment systems for secondary mental health care but experiences have been mixed. Australia and New Zealand developed casemix classification systems specific to mental health that incorporated information on patient severity and functioning using the Health of the Nation Outcome Scales (HoNOS). In both countries providers were shown to exhibit cost variations that rendered the classification systems unsuitable for payment, although this was an explicit objective in Australia only (Buckingham et al. 1998; Gaines et al. 2003). Some countries, such as the Netherlands, have included psychiatric care in the prospective activity-based payment system used for physical acute care (Kobel et al. 2011). This system takes account of the type of care and treatment provided as well as diagnosis (Forti et al. 2014). Cost control is incentivised by nationally agreed unit prices and the system also incentivises quality improvements that lead to lower resource consumption (Swan-Tan et al. 2011). Other countries have implemented prospective payment but chosen an alternative payment unit to the diagnosis casemix groups used for acute physical health care, such as in the United States which reimburses psychiatric inpatient care under Medicare using a per diem system, reflecting the fact that length of stay reflects cost (Mason and Goddard 2009).
In contrast to standard casemix systems which fix a patient's categorisation and the payment that will be received for their treatment, under the episodic payment approach in England, patients are categorised into one of 21 care clusters according to their need and these are reviewed at set intervals. We describe the cluster model in more detail under the "Methods" section.
Providers of services would be paid for the care a patient receives whilst assigned to a cluster for a defined review period or episode of care. Episodic payment therefore links a provider's payment closely to the volume and type of mental health care activity that it supplies in such a way that the provider would know in advance how much each patient-cluster-period will yield it in terms of income (NHS England and NHS Improvement 2016) but nevertheless builds in flexibility as patients are reviewed. Fixed prices, agreed locally between commissioners and providers, are set for each cluster-episode. Alternatively, national average costs can be used to derive a national prospective fixed price.
Ultimately, an agreed payment rate needs to consider how prices relate to costs and the potential (perverse) incentives for care delivery (Jacobs 2014). A prospective fixed-price mechanism has found favour in many health care systems especially in regard to paying for acute hospital care (O'Reilly et al. 2012) but remains limited in mental health care as it is more difficult to define and cost an episode of care due to the interplay of such factors as the diversity of diagnosis, and the chronic and fluctuating nature of much mental illness (Wolff et al. 2016; Jacobs et al. 2015).
A key consideration for the viability of the episodic payment approach is in limiting the variation in costs associated with the defined payment categories. Cost variation needs to be considered both among patients within a given category or cluster and between providers for a given cluster. In considering the appropriate unit of activity for mental health services, the primary consideration is cost variation within the unit of activity, since if this variation were minimal then all providers would produce at the same cost. However, for a given specification of categories, both forms of variation are important and if either is too great, problems may ensue. First, if a cluster captures a very diverse group of patients, there are a number of risks associated with setting a single price. Patients within the group who are exceptionally costly to treat are potentially loss-making to the provider. There is an incentive to treat only cheaper patients ("cream skimming"), refer more costly ones to other providers ("dumping"), or to reduce the treatments given to try to contain their cost ("skimping") (Jacobs 2014). Unnecessarily moving people to another cluster in order to obtain higher income (cluster "creep") is another option. Hence, excessive variation of cost within a treatment group is concerning (Moscelli et al. 2019).
The impact of variability of cost between providers is not straightforward. On the one hand a fixed price acts as an incentive for high-cost providers to control their costs so that ex ante variability in costs may not be a great concern in the medium to longer term. Alternatively, variation in costs across providers might indicate variation in their case mix and setting a uniform price based on average costs across providers might drive high cost, high quality providers who treat difficult patients into financial distress.
Therefore, for an episodic payment system to work, there should not be too much variation in costs between providers, or within clusters. There needs to be reasonable resource homogeneity within groups (both in terms of activity and costs) i.e. clusters should comprise patients who are homogenous in terms of resource use and ultimately cost. Some observed variation could be the result of poor data quality, and some may be due to legitimate variation in patient care. Either way, excessive variation in activity within each cluster will drive very high cost variation and hinder translation into a prospective cluster price, which will make the operation of an episodic payment approach difficult.

Purpose of the Study
Using data on cluster assignments, costs and activity for all patients in contact with secondary mental health care services in England in the financial year 2014/15 we examine variation in costs and activity within clusters and between providers. We make three specific contributions. First, we provide an analysis of the type and variation of activity undertaken within care clusters, for example, the contacts with health care professionals which patients have. Second, we provide an assessment of how well clusters operate as a unit of activity in terms of categorising patients into resource homogenous groups. Third, by using the national administrative data which is intended to underpin the payment approach, we provide a comprehensive assessment of the proposed episodic payment system for England. Our results enable key policy lessons to be drawn for the future development of the payment approach.

Data
We use two main sources of data, the Mental Health Minimum Data Set (MHMDS) (Health & Social Care Information Centre) and Reference Costs (Department of Health 2015), for the financial year 2014/15. The MHMDS is a patient-level dataset with national coverage of all secondary mental health care for England and contains demographic, diagnostic and treatment information. The MHMDS contains data on the items of the Mental Health Clustering Tool (MHCT) which is used by a clinician or clinical team to assign a patient to one of 21 care clusters following guidance for scores on MHCT items (Monitor and NHS England 2013b). The MHCT consists of 18 items: the 13 items of the Health of the Nation Outcome Scales (HoNOS) (Wing et al. 1994) and the five items of the Summary Assessment of Risk and Need (SARN) (Self et al. 2008; Self and Painter 2009). These are used to assess a patient's need on both a current and historical basis. Each item is rated by staff on a scale of 0, no problems, to 4, severe to very severe problems. Clinicians or clinical teams then translate these scores according to guidance (Self et al. 2008; Self and Painter 2009) into assignments to the 21 clusters which reflect assessments of specific symptoms as well as needs and chronicity. Conceptually, these clusters are grouped into three super-clusters: non-psychotic (clusters 1-8), psychotic (clusters 10-17), and organic (clusters 18-21) (see Table 1). Cluster 9 is designated blank and not used, and 0 is a variance cluster, used to code someone on a short-term basis who cannot at that time be classified into one of the other 20 clusters but who does need some mental health treatment/support (Monitor and NHS England 2013a).
The intention for these clusters was to group patients according to similar symptoms and needs and that these would be associated with similar sets of interventions or care packages (Royal College of Psychiatrists 2014). The clusters were developed according to a classification system of service users based on the similarity of their current needs and the similarity of their care plans (Self and Painter 2009).
Cluster-episodes are periods of time a patient spends assigned to the same cluster and these periods can comprise admitted (inpatient care) and non-admitted days (e.g. outpatient attendances, contacts with health care professionals in the community). The clusters are mutually exclusive meaning that a patient should only be assigned to one at any given time.
The system requires patients to be reviewed and reassigned to clusters or discharged according to cluster review periods (see Table 1). If a person's condition deteriorates and needs increase, they can be moved to a higher need cluster before the end of the review period of their current cluster. They should not be allocated to lower need clusters before designated review periods if their condition improves before then. National guidance specifies the fixed review periods for each cluster, though the actual duration of cluster-episodes does not necessarily reflect the review periods listed in Table 1, since after a review for a cluster-episode it can be decided that the patient should continue in the same cluster.
The MHMDS also records the activity which takes place while a patient is assigned to a cluster. We focus on two types of activity: days spent admitted to hospital and days when the patient had contact with a health care professional. We remove observations where the number of admitted days is greater than the length of the cluster-episode. We count days with contact with a health care professional rather than all health care contacts because we observe cases with several contacts in the same day. Up to a point this is feasible (for example, a patient could have appointments with two different types of staff on one day), but a patient is unlikely to have more than ten contacts in a day, which is observed for a small number of patients.
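The two cleaning rules described above can be illustrated with a minimal sketch. The record layout and all names (`ClusterEpisode`, `contact_days`, `drop_implausible`) are hypothetical and chosen for illustration; they are not MHMDS field names, and the sketch does not reproduce the processing actually applied to the data.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ClusterEpisode:
    """One cluster-episode (hypothetical layout, not the MHMDS schema)."""
    patient_id: str
    cluster: int
    episode_days: int      # length of the cluster-episode in days
    admitted_days: int     # days spent admitted to hospital
    contact_dates: list = field(default_factory=list)  # one date per contact


def contact_days(contact_dates):
    """Count distinct days with at least one contact, not the number of contacts."""
    return len(set(contact_dates))


def drop_implausible(episodes):
    """Keep only episodes whose admitted days do not exceed the episode length."""
    return [e for e in episodes if e.admitted_days <= e.episode_days]


# Made-up records for illustration only.
episodes = [
    ClusterEpisode("a", 4, 100, 2,
                   [date(2014, 5, 1), date(2014, 5, 1), date(2014, 6, 3)]),
    ClusterEpisode("b", 14, 10, 12),  # more admitted days than episode days
]
kept = drop_implausible(episodes)
```

In this made-up example the second record is dropped, and the first contributes two days with contact even though three contacts are recorded.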
Reference Costs are a mandatory national data return for NHS providers in which they report their costs by cluster and by admitted and non-admitted days. These two cost measures (relating to admitted and non-admitted days) are available both at provider level and national level, with the latter reflecting the average of the costs reported by all providers.

Statistical Analysis
We undertake two analyses. First, we analyse the variation between mental health providers in terms of their costs and activity. Second, we analyse the variation within clusters in terms of costs and activity.
Our unit of analysis in both cases is the cluster-episode, which records the time a patient spends assigned to one cluster. For each cluster-episode we observe its cluster number (see Table 1), its length ($\mathit{epidays}_c$) and the activity that takes place during it. We focus on two types of activity: length of stay or days spent admitted to hospital ($\mathit{warddays}_c$) and days when the patient had contact with a health care professional ($\mathit{hpcondays}_c$). The cluster number is used to match the corresponding (provider level and national average) costs, for admitted and non-admitted days, to each cluster-episode.
The cost for each cluster-episode can be calculated as the sum of two costs, one for admitted days and the other for non-admitted days:
$$C_c = \mathit{warddays}_c \cdot rc_{adm,c} + \mathit{nonadm}_c \cdot rc_{nonadm,c} \qquad (1)$$
where $\mathit{warddays}_c$ is the number of admitted days and $\mathit{nonadm}_c$ is the number of non-admitted days while assigned to cluster $c$, $rc_{adm,c}$ is the cost of an admitted day and $rc_{nonadm,c}$ is the cost of a non-admitted day in cluster $c$ obtained from Reference Costs. Since we have two types of cost, one at provider level and a national average, we can calculate two versions of the cost described in Eq. (1) for each provider, one based on the provider's own costs and one based on a national average cost (i.e. a notional fixed price which could be used under an episodic payment approach).
The provider-level cost and the national average cost in turn can be used to calculate a cost index for each provider:
$$CI = \frac{\mathit{warddays}_c \cdot rc^{p}_{adm,c} + \mathit{nonadm}_c \cdot rc^{p}_{nonadm,c}}{\mathit{warddays}_c \cdot rc^{NA}_{adm,c} + \mathit{nonadm}_c \cdot rc^{NA}_{nonadm,c}} \qquad (2)$$
where $\mathit{warddays}_c$ is the number of days spent as an inpatient and $\mathit{nonadm}_c$ is the number of non-admitted days while assigned to cluster $c$. Costs differ in their superscript: $p$ indicates provider level and $NA$ national average; $rc_{adm,c}$ is the cost of an admitted day and $rc_{nonadm,c}$ is the cost of a non-admitted day in cluster $c$. The index is then the provider-level average cluster cost relative to the corresponding national average cost, where values of the index above 1 represent high cost providers while those below 1 are relatively low cost providers.

Table 1

| Super-cluster | Cluster | Label | Review period |
|---|---|---|---|
| | | Common mental health problems | 15 weeks |
| | 3 | Non-psychotic (moderate severity) | 6 months |
| | 4 | Non-psychotic (severe) | 6 months |
| | 5 | Non-psychotic (very severe) | 6 months |
| | 6 | Non-psychotic disorders of overvalued ideas | 6 months |
| | 7 | Enduring non-psychotic disorders (high disability) | Annual |
| | 8 | Non-psychotic chaotic and challenging disorders | Annual |
| N/A | 9 | Blank cluster | N/A |
| Psychosis | 10 | First episode in psychosis | Annual |
| | 11 | Ongoing recurrent psychosis (low symptoms) | Annual |
| | 12 | Ongoing or recurrent psychosis (high disability) | Annual |
| | 13 | Ongoing or recurrent psychosis (high symptom and disability) | Annual |
| | 14 | Psychotic crisis | 4 weeks |
| | 15 | Severe psychotic depression | 4 weeks |
| | 16 | Dual diagnosis (substance abuse and mental illness) | 6 months |
| | 17 | Psychosis and affective disorder difficult to engage | 6 months |
| Organic | 18 | Cognitive impairment (low need) | Annual |
| | 19 | Cognitive impairment or dementia (moderate need) | 6 months |
| | 20 | Cognitive impairment or dementia (high need) | 6 months |
| | 21 | Cognitive impairment or dementia (high physical need or engagement) | 6 months |

When analysing the variation in activity within clusters, we investigate whether longer cluster-episodes of care translate into proportionally more activity. This is relevant because activity drives costs and if activity is not proportional to the length of a cluster-episode the payment for a period of care would have to vary over the treatment period (e.g. pay more at the start of the treatment than at the end). Large within-cluster variation in activity could indicate that clusters are not resource homogenous.
We use multilevel regressions (Rabe-Hesketh and Skrondal 2008) to reflect the hierarchical nature of the data. For all clusters, we consider a three-level model, where cluster-episodes are nested within patients and patients are nested within providers, and report the results as elasticities, i.e. as the proportional change in activity for a proportional change in the length of the cluster-episode.
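Before turning to the regressions in more detail, the cost calculation in Eq. (1) and the provider cost index can be made concrete with a small sketch. It rests on assumptions not stated in the text: the function names are hypothetical, all unit costs and activity figures are made up, and the index is aggregated by summing a provider's cluster-episode costs before taking the ratio, which is one plausible reading rather than a documented procedure.

```python
def episode_cost(ward_days, nonadm_days, rc_adm, rc_nonadm):
    """Eq. (1): admitted and non-admitted days, each times its daily cost."""
    return ward_days * rc_adm + nonadm_days * rc_nonadm


def cost_index(episodes, provider_rc, national_rc):
    """Provider-costed activity divided by the same activity at national costs.

    episodes:    list of (cluster, ward_days, nonadm_days) for one provider
    provider_rc: dict cluster -> (rc_adm, rc_nonadm) reported by the provider
    national_rc: dict cluster -> (rc_adm, rc_nonadm) national averages
    Summing over episodes before dividing is an assumption of this sketch.
    """
    own = sum(episode_cost(w, n, *provider_rc[c]) for c, w, n in episodes)
    nat = sum(episode_cost(w, n, *national_rc[c]) for c, w, n in episodes)
    return own / nat


def max_min_ratio(indices):
    """Ratio between the highest and the lowest cost index."""
    return max(indices) / min(indices)


# Made-up figures for illustration only (costs in pounds per day).
episodes = [(4, 2, 100), (14, 10, 18)]
provider_rc = {4: (400.0, 12.0), 14: (380.0, 15.0)}
national_rc = {4: (365.0, 10.0), 14: (365.0, 10.0)}
index = cost_index(episodes, provider_rc, national_rc)  # above 1: high cost
```

As a check on the arithmetic of such ratios, index values of 1.58 and 0.73 (58% above and 27% below average) give a max/min ratio of about 2.16.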
We regress the number of days with activity (i.e. ward days and contacts with health care professionals) on the length of a cluster-episode to obtain the correlation between the length of the cluster-episode and the activity performed in it. In both cases, the explanatory variable is the length of the cluster-episode. If all providers delivered the same services per period of time in a cluster (for example, 1 year), but reported it at different intervals (for example, quarterly or bi-annually), we would expect longer cluster-episodes to translate into proportionally more services delivered e.g. more contacts with health care professionals in comparison to shorter cluster-episodes. These regression coefficients are reported as elasticities and if activity within cluster-episodes is proportional to their length, we anticipate elasticities to be around one, i.e. longer (shorter) cluster-episodes of care translate into proportionally more (less) activity.
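To see why an elasticity of one corresponds to proportionality, consider a simplified linear relationship between activity $y$ and cluster-episode length $x$. The symbols $a$, $b$ and $\eta$ are introduced here for illustration only, and the sketch ignores the random intercepts of the multilevel model.

```latex
\[
y = a + b\,x, \qquad
\eta \;=\; \frac{\partial y}{\partial x}\cdot\frac{x}{y}
      \;=\; \frac{b\,x}{a + b\,x}.
\]
\[
a = 0 \;\Rightarrow\; \eta = 1, \qquad
a > 0,\; b > 0,\; x > 0 \;\Rightarrow\; 0 < \eta < 1 .
\]
```

Under this simplification, an elasticity below one arises when part of the activity does not scale with the length of the cluster-episode, for example when activity is concentrated at the start of an episode.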
Equation (3) shows the regression for admitted days ($\mathit{warddays}_{cij}$):
$$\mathit{warddays}_{cij} = \beta_0 + \beta_1 \cdot \mathit{epidays}_{cij} + \epsilon_{cij} + \zeta^{(2)}_{ij} + \zeta^{(3)}_{j} \qquad (3)$$
The explanatory variable is the length of the cluster-episode ($\mathit{epidays}_{cij}$); cluster-episodes ($c$) are nested within patients ($i$) and patients are nested within providers ($j$). We do not consider explanatory variables at the patient or provider levels, but these levels are reflected in the random intercepts $\zeta^{(2)}_{ij}$ for patient-provider combinations and $\zeta^{(3)}_{j}$ for providers, which are assumed to have a mean of zero. The regression for days with contact with a health care professional ($\mathit{hpcondays}$) is identical to Eq. (3) except for a switch in the dependent variable.
As a sensitivity analysis, we restrict the sample to finished (with recorded end date) cluster-episodes where patients can be either discharged from care or moved to another cluster-episode. We also estimate the model using only two levels (cluster-episodes nested within providers). All statistical analyses were carried out in Stata 13 (StataCorp. 2013).

Descriptive Statistics
Table 2 shows the descriptive statistics for our sample. Note that the number of observations for the provider reported costs (superscript $p$) is smaller than for the national average costs (superscript $NA$). This is because the provider level costs were not available for all providers but we can still match the national averages to their activity.
Patients spent on average 3½ months in a cluster, with 2 days as an inpatient on a ward and 7 days with contact with a health care professional. On average, cost for inpatient care was around £365 per day whilst care in the community or as an outpatient was £10 per day. It is evident from Table 2 that there is a huge amount of variation between providers in terms of costs, activity and length of stay.
Descriptive statistics for each cluster are shown in Table 3. It shows the distribution of the cluster-episodes across the different care clusters. The largest are clusters 4, 18 and 19 with around 12% of cluster-episodes, while cluster 3 has around 9% of cluster-episodes allocated to it.
Table 4 shows the cost index, as calculated in Eq. (2), for each provider. We observe that the highest-cost provider is 58% above average and the lowest-cost provider is 27% below average, resulting in a ratio between the maximum cost and the minimum cost of 2.16. Table 5 shows the maximum and minimum cost index for each cluster, and the ratio between them. We observe that the clusters where the difference between the most and least expensive provider is greatest are clusters 0, 1, 2 and 3, where the minimum index is below 25%, giving rise to variation within clusters of more than 20-fold between the most- and least-cost providers.

Activity Regressions
Table 6 shows the multilevel model regression results for ward days and Table 7 for days with contact with health care professionals. The results are based on the regression of Eq. (3), using the two different dependent variables (ward days and contacts with health care professionals), employing a post-estimation command in Stata (margins) to obtain the results as elasticities. For both dependent variables all elasticities are significant and smaller than one, which indicates that longer cluster-episodes do not translate into proportionally more activity.
The results are robust to the sensitivity analyses that restricted the sample to include only finished cluster-episodes and used a two-level model (cluster-episodes nested within providers).

Contribution to the Current Evidence Base
This paper has explored the proposed episodic payment approach for mental health services in England whereby clinicians allocate patients into one of 21 clusters on the basis of similar levels of need using the MHCT. For this episodic payment system to work effectively, there should not be too much variation in costs or resource use either within clusters, or between providers. We tested whether the existing unit of activity, namely clusters, which underpin the collection of mental health activity and cost data amongst English mental health providers, would support the new payment system. Specifically, we examined the variation both within clusters and between mental health providers in terms of their costs and activity/resource use.
We contribute to the evidence base by examining the implementation of care clusters as a unit of activity in mental health, with a key need being more robust analysis of cluster data. Our results suggest a large amount of variation between providers in terms of costs, activity rates and length of stay within clusters. There is substantial variability between providers in the length of cluster episodes, and there is huge variability within clusters in terms of the proportion of inpatient days and the proportion of contact with health care professionals. We find longer cluster episodes do not translate into proportionally more activity in terms of either inpatient days or contacts with health care professionals. With high levels of variation within clusters, accurate baseline activity rates cannot be determined for planning and purchasing care. Variation in activity rates means that providers see different numbers of patients, have different treatment approaches, levels of productivity, and put different care pathways and packages of care in place for patients within each cluster. This could lead to differences in care quality and outcomes across providers, generating potential geographic inequalities for patients. While the average costs per cluster (Table 5) broadly correspond to severity as indicated by the cluster labels (Table 1), there is also enormous variation within clusters in terms of costs. Variations in cost mean that patients with similar levels of need may be using different levels of resource, leading to a potential waste of scarce resources or an under-treatment of some people in some localities. This also suggests that the introduction of an episodic payment approach would result in large variation across providers in terms of their financial positions.
Our findings of significant heterogeneity in costs, and significant heterogeneity in terms of resource use, do not bode well for an episodic payment approach which requires resource homogeneity within clusters. The reduction of variation in care, activity levels and costs is pivotal to the establishment of a well-designed classification and payment system.
Of course, some of the variation we have found could be a result of data quality issues, including poor costing systems, poor coding, and differing allocation of patients to clusters between individual clinicians and providers. Even if this is the case, it still does not augur well for an episodic payment system since until the data quality issues are resolved, the analysis underpinning it will result in inappropriate payments to providers.

Policy Implications
Whilst the findings from our study appear discouraging for the prospect of an episodic payment approach in mental health, we would argue that instead of abandoning the episodic payment approach and clustering altogether, a much clearer steer is needed from policymakers to support providers and commissioners to move towards refining and developing episodic payment as a viable payment option. A step in this direction would be increasing investment in information technology and underpinning data systems (such as outcomes and patient level costing systems). The improvement of data quality needs to be a priority in mental health services, since without robust evidence regarding the incurred costs, building any kind of payment system is an impossible task. Another step could be re-designing or refining the clusters to improve homogeneity, and indeed there may be a case for increasing the number of clusters to make them more resource homogenous. In the acute hospital sector it took more than a decade since its implementation to refine and develop the Payment by Results approach (Appleby et al. 2012), and a similar development and refinement period could be anticipated for mental health services to ensure clusters are improved over time and become fit for purpose. Indeed the number of Healthcare Resource Groups (HRGs) has increased exponentially over time as part of this refinement process. There has been significant investment in staff development, information technology and collection of cost and other data over time which the acute hospital sector has benefited from and is still learning how to gain even more from (Marini and Street 2007).
The system also needs to implement change at a pace that does not risk destabilising local health economies and that fits with other priorities and developments in health and social care policy nationally and locally. Having said that, the slow movement within the sector around transitioning to alternative payment approaches and the current policy focus on devolving more decisions to local levels, means that this risk is quite low. These developments also need to bear in mind the evidence that developing and implementing new payment models is a long, complex process (ICF Consulting Services 2015).

Limitations and Future Research
Our research shows that it will be difficult to create a classification and payment system with the currently available data (MHMDS). A key source of the variation we have identified is poor quality data in the MHMDS, e.g. duplicate data, missing data such as end dates, and overlapping cluster-episodes for the same patient, all of which require significant effort and expertise to clean and use. We have specifically limited our analysis to measures of resource use and activity which could be reliably defined to ensure the robustness of our results. However, the data quality would improve significantly if commissioners required all providers to use the MHMDS dataset for contracting and payment, rather than producing their own spreadsheets for such purposes, and this would facilitate greater national benchmarking and expanded research opportunities.
The cost data in particular are of a relatively poor quality and cannot be used at present to identify a reliable pricing system. For the development of any payment system, high quality activity and cost data would be a key requisite. Data quality is a significant challenge with any payment system, but it is at least underway for the episodic payment approach using clustering, collected routinely and there is evidence of some improvements in data quality over time in its collection (Jacobs et al. 2016). Improvements to reference cost data are also essential and the introduction of patient level information costing systems (PLICS) at provider level can support the process of generating this. Future research may explore the advantages in robustness that patient level costing systems could provide compared to cluster level costing and whether these reduce variation within clusters. It would also be valuable for future research to compare and contrast the pros and cons of different units of activity used in payment systems internationally so that policymakers can learn from the design of different payment approaches.

Conclusion
Our analysis of cost and activity for all secondary mental health care in England in the financial year 2014/2015 has revealed substantial variability between providers as well as within clusters. While such variability in and of itself does not make efficient pricing arrangements impossible, our results indicate that a cluster-based episodic payment system would have substantially different financial impacts across providers which could destabilise local health economies. Our analysis further suggests that the inconsistent quality of the currently available mental health care cost and utilization data in England will make it difficult to create a fair and accurate episodic pricing system for mental health care. Greater investment in developing more accurate and consistent information and costing systems for health care may be needed to support such episodic payment systems in the future.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.