Outline of the Study
Anonymized data were obtained in an observational study in which performance of MHS providers was compared. SBG manages a large nationwide dataset covering outcome of full treatment trajectories of patients treated by MHS providers. Of these MHS providers, eight participated in a pilot project to study outcome of treatment corrected for process indicators. The MHS providers were ranked on outcome: the mean of case mix corrected pre-to-posttreatment change scores on symptomatology (attained after a full treatment trajectory). The MHS providers were coded henceforth 1–8 (with 1 having the best results). For the present study, treatment trajectories that started in 2013 or 2014, and concluded in 2014 were selected. Only treatment trajectories with complete pre- and posttreatment data were included (49% of all remunerated treatments, see Table 1). The maximum treatment length was 2 years, and covered 90% of all monitored treatments. For technical reasons, longer treatments (the remaining 10%) were excluded from the present study.
Participants and Patients
The eight providers who participated in the study are among the 16 largest Dutch MHC institutes in terms of yearly overall turnover and represent the various types of MHC providers in The Netherlands well: six are large institutes where many clinicians (predominantly psychologists and psychiatrists) provide inpatient and outpatient care for all types of psychiatric disorders; two specialize in outpatient care of predominantly mood, anxiety, and personality disorders. The latter two providers are both franchise organizations operating nationally; one of these provided nationwide data, the other provided data from a single municipality. There is considerable variation among providers regarding structural factors (some providers are large institutes providing a mix of inpatient and outpatient care, whereas others provide only outpatient treatment and claim to adhere more closely to clinical guidelines for short problem-focused treatments). There is variation in process factors as well (possibly due to different theoretical approaches). Thus, we may expect variance in outcome and efficiency as well.
In order to homogenize the study sample and improve comparability of results, we selected patients with DSM-IV (American Psychiatric Association 1994) depressive disorders and/or anxiety disorders. The patients received predominantly outpatient psychotherapy and/or pharmacotherapy. Table 1 presents an overview of characteristics of included patients.
Treatment outcome was assessed through the repeated use of various reliable and valid self-report questionnaires for general psychopathology. Five questionnaires were used: the symptomatic distress scale of the outcome questionnaire (OQ-45; Lambert et al. 2004); the total score of the depression anxiety stress scales (DASS-21; Lovibond and Lovibond 1993); the total score of the brief symptom inventory (BSI; Derogatis 1975); the total score of the short symptom list (Korte Klachtenlijst, KKL; Appelo 2006); and the problems subscale of the clinical outcomes in routine evaluation-outcome measure (CORE-OM; Evans et al. 2002). Scores on questionnaires were standardized to a common metric: T-scores with M = 50 and SD = 10 at pretreatment (McCall 1922). In addition, scores were transformed to obtain a normal distribution and a true interval scale, as required for the calculation of pre-to-posttreatment change scores (de Beurs 2010). In previous studies we compared these instruments on their responsiveness to change and found some variation, amounting to a 10–15% difference in outcome between pairs of instruments (de Beurs et al. 2012).
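The linear part of this standardization can be illustrated with a minimal sketch. It assumes that the reference mean and SD are taken from the pretreatment scores of the sample at hand; the additional normalizing transformation (de Beurs 2010) is not reproduced here, and the raw scores and function name are hypothetical.

```python
from statistics import mean, stdev


def to_t_scores(raw_scores, ref_mean, ref_sd):
    """Linear T-score: 50 + 10 * z, with z relative to the pretreatment reference."""
    return [50.0 + 10.0 * (x - ref_mean) / ref_sd for x in raw_scores]


# Hypothetical raw questionnaire scores for five patients (illustration only).
pre_raw = [30.0, 40.0, 50.0, 60.0, 70.0]
post_raw = [25.0, 30.0, 45.0, 40.0, 65.0]

# Assumption: the pretreatment distribution defines the metric (M = 50, SD = 10).
ref_mean = mean(pre_raw)
ref_sd = stdev(pre_raw)

pre_t = to_t_scores(pre_raw, ref_mean, ref_sd)
post_t = to_t_scores(post_raw, ref_mean, ref_sd)

# Pre-to-posttreatment change on the common metric (positive = fewer symptoms).
delta_t = [pre - post for pre, post in zip(pre_t, post_t)]
```

Because posttreatment scores are expressed against the pretreatment reference, a posttreatment T-score below 50 indicates fewer symptoms than the average patient had at intake.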
Methods for Rendering Treatment Outcome
Treatment outcome was defined by the pre-to-posttreatment difference in severity of symptoms (Delta T or ΔT). Posttreatment scores and ΔT were corrected for case mix differences. The continuous nature of the T-scale optimizes statistical power and simplifies ranking of MHS providers (de Beurs et al. 2016). However, a limitation of ΔT is that it produces a rather abstract figure, which does not yield any information on the quality and nature of a patient’s clinical end state. An alternative method to denote treatment outcome was proposed by Jacobson and Truax (1991). Their two core concepts are the reliable change index (JTRCI) and clinical significance (JTCS). For JTRCI it should be unlikely (p < .05) that change as expressed in the difference between the pre- and post-test score is due to measurement imprecision. For the JTRCI a value of ΔT = 5.0 is used, which represents half a standard deviation and is considered the minimal clinically important difference (de Beurs et al. 2016; Norman et al. 2003; Sloan et al. 2005). Patients with a case mix corrected ΔT > 5 were considered improved. To attain a clinically significant outcome, or recovered status, a patient’s score must have changed by more than the reliable change criterion (JTRCI), and the posttreatment score must also fall within the functional range (JTCS). For JTCS a cutoff point of T = 42.5 was determined (de Beurs et al. 2016). Combined with case mix correction, JTRCI&CS requires that the case mix corrected posttreatment score is below T = 42.5 and more than five points below the pretreatment T-score.
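The two criteria can be sketched as a simple classification rule. The thresholds (5.0 and 42.5) are taken from the text; the function name and category labels are hypothetical, and the input scores are assumed to be case mix corrected already.

```python
RCI_THRESHOLD = 5.0  # reliable change criterion in T-score points (from the text)
CS_CUTOFF = 42.5     # upper bound of the functional range on the T-scale (from the text)


def classify_outcome(pre_t, post_t):
    """Classify one patient from pre- and posttreatment T-scores.

    Labels are illustrative; patients who deteriorate are not distinguished
    from unchanged patients in this sketch.
    """
    delta_t = pre_t - post_t
    reliably_improved = delta_t > RCI_THRESHOLD              # JTRCI
    in_functional_range = post_t < CS_CUTOFF                 # JTCS
    if reliably_improved and in_functional_range:
        return "recovered"                                   # JTRCI&CS
    if reliably_improved:
        return "improved"
    return "not_improved"


examples = {
    (55.0, 40.0): classify_outcome(55.0, 40.0),  # large change, ends in functional range
    (60.0, 50.0): classify_outcome(60.0, 50.0),  # large change, still above the cutoff
    (45.0, 41.0): classify_outcome(45.0, 41.0),  # below the cutoff, but change too small
}
```

Note that a low posttreatment score alone is not sufficient for recovered status: the third example ends within the functional range but changed by only four points.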
Cost and Treatment Duration
Costs were defined as the direct and indirect cost of therapist time for patient care. They comprise the costs for diagnosis treatment combinations specified in reimbursement rates in the Dutch fee-for-service system (diagnose-behandeling-combinaties, or DBCs, as they are called in Dutch; Tan et al. 2012). These data are embedded in 13 treatment time categories of the DBC-code system. Four categories cover short treatment (0–99, 100–199, 200–399, and ≥400 min). Nine categories cover more comprehensive treatment (250–799, 800–1799, 1800–2999, 3000–5999, 6000–11,999, 12,000–17,999, 18,000–23,999, 24,000–29,999, and ≥30,000 min). These categories were recoded into 13 monetary values based on the cost rates for Dutch MHS in 2014. The cost of treatment for the second year was added to the cost of the first year to arrive at the total cost of the complete treatment trajectory for each patient. Costs for psychiatric hospitalizations, for medications, or for treatment by GPs are not included in this study, as it focuses on outpatient treatments provided by psychiatrists and psychologists who are employed by MHS providers. The cost of a treatment trajectory was on average about 3150 euros (see Table 3).
Treatment duration was calculated in weeks, based on the interval between the date of opening the DBC (usually the first face-to-face diagnostic contact of the patient with the intaker/therapist) and the date of the last face-to-face treatment session. Thus, a possible waiting period between the intake and first treatment session was included in the treatment duration. Both SBG and the Dutch Healthcare Authority (NZa, https://www.nza.nl/organisatie/sitewide/english) provide guidelines and detailed specifications on timing of assessments, on how the start and conclusion date of treatments should be recorded, and on how to log treatment time (the number of minutes). As scrutiny is critical for a fair remuneration system, compliance is monitored through yearly audits by accountants.
To examine the improvement rate over time, duration per outcome was calculated by dividing standardized duration by standardized outcome (both variables were transformed to T-scores, to avoid zero values in the numerator and denominator). The resulting indicator has a value of 1 when treatment duration and outcome are in balance. A value below 1 indicates that it took less time to achieve a similar outcome, or that a better outcome was achieved in the same time. A value above 1 indicates that it took more time or that less improvement was achieved. In a similar vein, cost per outcome was calculated by dividing standardized cost by standardized outcome. Thus, the longer and/or more expensive the treatment and/or the lower the ΔT (a worse outcome), the higher these indicator values will be. Both indicators were calculated for each patient (patient-oriented).
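The patient-oriented ratio can be sketched as follows. This is an illustration under assumptions: both variables are standardized to T-scores within the sample at hand, case mix correction is omitted, and the data and function names are hypothetical.

```python
from statistics import mean, stdev


def t_standardize(values):
    """Express values as T-scores (M = 50, SD = 10) within the given sample."""
    m, s = mean(values), stdev(values)
    return [50.0 + 10.0 * (v - m) / s for v in values]


def per_outcome(resource, outcome):
    """Ratio of standardized resource use (duration or cost) to standardized outcome."""
    return [r / o for r, o in zip(t_standardize(resource), t_standardize(outcome))]


# Hypothetical data for three patients: duration in weeks and change score (ΔT).
duration_weeks = [20.0, 30.0, 40.0]
delta_t = [12.0, 8.0, 4.0]

duration_per_outcome = per_outcome(duration_weeks, delta_t)
# First patient: short treatment, large gain, so the ratio is below 1.
# Second patient: average on both variables, so the ratio equals 1.
# Third patient: long treatment, small gain, so the ratio is above 1.
```

Standardizing to T-scores keeps both numerator and denominator well away from zero, which is why the ratio remains defined even for patients with no change or with deterioration.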
In addition, duration and cost per outcome were calculated for each MHS provider by dividing the mean scores on these variables. Furthermore, duration and cost per reliably improved patient, and duration and cost per recovered patient, were calculated by dividing the average duration or cost of treatment for each MHS provider by the proportion of patients with reliable improvement (JTRCI) or with recovery (JTRCI&CS). For example, if an MHS provider is able to achieve a 25% recovery rate and their average cost of treatment is 2500 euros, cost per recovered patient is 2500/25% = 10,000 euros; if the recovery rate is 50%, the indicator cost per recovery would be 5000 euros. In a similar way, duration per reliably improved patient was calculated by dividing the average duration by the proportion of patients with reliable improvement (JTRCI). If the average duration of treatment for an MHS provider is 30 weeks and the improvement rate is 50%, the indicator “duration per improved patient” is 30/50% = 60 weeks; with 75% improved patients the indicator value is 30/75% = 40 weeks. These six indicators are “service provider-oriented”, as they are derived from the average scores on performance indicators achieved by the MHS providers.
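The worked examples in this paragraph amount to a single division, sketched here for reference. The function name is hypothetical; the input values are those given in the text.

```python
def per_successful_patient(mean_resource, success_proportion):
    """Average cost (or duration) divided by the proportion of improved or recovered patients."""
    if not 0.0 < success_proportion <= 1.0:
        raise ValueError("success_proportion must be in (0, 1]")
    return mean_resource / success_proportion


# Cost per recovered patient (euros), using the figures from the text.
cost_at_25_percent = per_successful_patient(2500.0, 0.25)
cost_at_50_percent = per_successful_patient(2500.0, 0.50)

# Duration per improved patient (weeks), using the figures from the text.
weeks_at_50_percent = per_successful_patient(30.0, 0.50)
weeks_at_75_percent = per_successful_patient(30.0, 0.75)
```

The indicator is undefined for a provider without any improved or recovered patients, which the sketch handles by raising an error rather than dividing by zero.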
Case Mix Correction
As we expected differences in case mix among MHS providers, various demographic and clinical variables were collected. Socio-economic status (SES) and urbanization were each coded in five levels (higher scores indicate higher urbanization or higher SES level) and were derived from the first four digits of the postal codes of patients. Diagnostic information was obtained according to the DSM-IV (American Psychiatric Association 1994), pretreatment disorder severity was operationalized with the pretreatment T-score, and pretreatment functioning with the global assessment of functioning (GAF) scale of the DSM-IV.
There were substantial differences between providers in pretreatment severity of the patients (F (7, 3583) = 19.03, p < .001, η² = 0.04) and GAF-score (F (7, 3561) = 91.99, p < .001, η² = 0.15). Furthermore, the gender distribution differed between providers (χ² (7) = 35.06; p < .001), and their populations also differed in age (F (7, 3583) = 17.53, p < .001, η² = 0.03), socio-economic status (F (6, 3337) = 32.12, p < .001, η² = 0.06), and urbanization (F (6, 3344) = 63.99, p < .001, η² = 0.10). See Table 1 for full details including Bonferroni corrected multiple comparisons of MHS providers on pretreatment severity.
As the populations of MHS providers diverged, all indicators (outcome, duration, and cost) were corrected for case mix differences (Iezzoni 2013). In previous analyses, the pretreatment score appeared to be the most important case mix variable, explaining about 25% of variance in the posttreatment score (Warmerdam et al. 2016). A higher pretreatment level predicts both a higher posttreatment level and a larger ΔT, as it leaves more room for improvement. In addition, outcome was corrected for two other predictors: GAF score and SES. For both variables, a lower score was associated with worse outcome. Other variables (e.g. gender or urbanization) were not associated with outcome. This model explained a substantial 29.0% of posttreatment variance in the national dataset (N = 29,395) (Warmerdam et al. 2017). Case mix corrected ΔT was calculated by correcting the posttreatment level for case mix variables.
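One common form of regression-based adjustment is sketched below; it is not necessarily the exact procedure used in this study. The sketch assumes a single predictor (the pretreatment T-score) fitted with ordinary least squares, whereas the model described above also includes GAF score and SES. The data and function names are hypothetical.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def case_mix_adjusted_post(pre_t, post_t):
    """Grand mean plus each patient's residual from the expected posttreatment score."""
    slope, intercept = fit_simple_ols(pre_t, post_t)
    grand_mean = mean(post_t)
    return [
        grand_mean + (y - (intercept + slope * x))  # observed minus expected
        for x, y in zip(pre_t, post_t)
    ]


# Hypothetical T-scores for four patients (illustration only).
pre_t = [40.0, 50.0, 60.0, 70.0]
post_t = [35.0, 40.0, 50.0, 55.0]

slope, intercept = fit_simple_ols(pre_t, post_t)
adjusted_post = case_mix_adjusted_post(pre_t, post_t)
```

In this form, a patient who ends lower than expected given the intake score receives an adjusted posttreatment score below the grand mean, regardless of how severe the symptoms were at the start.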
Duration was corrected for initial severity level, functioning, age, gender, and diagnoses. Many of these variables showed different associations with outcome. The diagnoses “major depressive disorder, single episode” and “other mood disorder” were associated with shorter treatment; OCD was associated with longer treatment. A higher severity and worse functioning at pretest were associated with longer treatment. A higher age and male gender predicted longer treatment. This model explained only 2.7% of variance in duration.
Cost was corrected for initial severity level, functioning, age, and for the diagnoses “major depressive disorder, recurrent” and OCD. For all these variables, a higher score was associated with higher costs. The model explained a modest 8.4% of variance in cost, which is less than typically found in MHC (Hermann et al. 2007; Iezzoni 2013). This may be explained by diminished variance in the predictors, due to selection of a diagnostically homogeneous patient group, and by diminished variance in the cost variable, as only outpatient treatments were selected. Finally, potentially relevant variables, such as comorbidity, were assessed with insufficient reliability to be included; others, such as education or living situation, had too many missing values (>25%).
The various performance indicators were compared in patient-oriented and service provider-oriented data. First, treatment outcome of patients was compared among service providers with a repeated measures ANOVA. After this omnibus test, we performed post-hoc tests (all possible pairwise comparisons between providers with a Bonferroni correction for multiple testing) to ascertain which service providers had statistically different outcomes. Mean treatment duration, cost of treatment, duration per outcome, and cost per outcome of patients were also compared among service providers with ANOVA. Differences in proportions of recovered and improved patients between service providers were tested with Chi-square tests. The association between duration, cost, and outcome was assessed with correlational analysis (Pearson r). Next, service providers were rank ordered according to each performance indicator (service provider-oriented data). To investigate discordance among indicators (or their potential redundancy, because of concordance), we calculated the correlation between rank orderings of the service providers (Spearman rho rank correlation coefficient). Finally, we investigated the ability of the indicators to discriminate between service providers with stepwise discriminant analysis. As some indicators are correlated (as cost and duration are), two stepwise discriminant analyses were done: one focusing on cost and the other on duration. The option of stepwise entry based on Wilks' lambda was chosen, entering discriminant variables one by one only if they improve the discriminant function significantly. The classification variable is the service provider, the indicators are the independent variables, and each analysis tests which indicators discriminate best between service providers. The first variable to enter maximizes separation among the groups, the next to enter adds the most in further separating the groups, and so on.
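The concordance between two rank orderings can be illustrated with a minimal sketch of the Spearman coefficient. It assumes there are no tied ranks, in which case rho = 1 − 6Σd² / (n(n² − 1)); the two rankings of the eight providers shown here are hypothetical and do not reflect the study results.

```python
def spearman_rho_no_ties(rank_a, rank_b):
    """Spearman rank correlation via the shortcut formula (valid without tied ranks)."""
    if len(rank_a) != len(rank_b):
        raise ValueError("rankings must have equal length")
    n = len(rank_a)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1.0 - 6.0 * sum_d_squared / (n * (n ** 2 - 1))


# Hypothetical rankings of providers 1-8 on two indicators (illustration only).
rank_on_outcome = [1, 2, 3, 4, 5, 6, 7, 8]
rank_on_cost_per_outcome = [2, 1, 3, 4, 6, 5, 8, 7]

rho = spearman_rho_no_ties(rank_on_outcome, rank_on_cost_per_outcome)
```

A rho close to 1 would suggest that two indicators order the providers almost identically and are therefore largely redundant, whereas a low or negative rho would point to discordance. With tied ranks, the shortcut formula no longer holds and a Pearson correlation on the ranks is needed instead.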