“There Are Things I Want to Say But You Do Not Ask”: a Comparison Between Standardised and Individualised Evaluations in Substance Use Treatment

There has been an increasing call for service users to be more actively involved with the evaluation of treatment outcomes. One strategy to improve such involvement is to ask service users to contribute with their own criteria for evaluation by sharing their personal story and perspective about their clinical situation. In this cross-sectional study, we contrasted the contents elicited by service users completing two individualised measures against the contents of three widely used standardised measures. We also compared two methods to generate individualised data using self-report and interview-based instruments (PSYCHLOPS and PQ). Following a thematic comparison approach, we found that one quarter of the problems reported by patients in individualised measures were not covered by any of our standardised comparators. Also, half of our sample generated at least one problem whose theme was not covered by any of the three standardised measures. We also found that patients in this population have many other concerns beyond drug use. These included psychological (e.g. interpersonal relationships) and socio-economic (e.g. money) problems, which were frequently reported. Our study suggests that listening to service users’ stories allows us to capture issues of importance to service users in substance use treatment, which may be underestimated by standardised measures.


Introduction
A growing body of literature suggests that personalising the assessment of outcome in substance use treatment is of central importance (Alves et al. 2015; Trujols et al. 2015). Recent reports have also shown that patients in this population appreciate being actively engaged in outcome assessment. Individualised outcome measures are tools that gather each patient's unique perspective about their clinical condition, but there is little evidence about the extent to which they may complement traditional standardised questionnaires. In this study, we contrasted the contents elicited by patients in individualised outcome measures with the contents of standardised measures, and we compared two measures that use different methods for generating individualised data.

Outcome Assessment in Substance Use Treatment
The evaluation of patients in substance use treatment has been discussed in several international guidelines (e.g. the European Monitoring Centre for Drugs and Drug Addiction, EMCDDA). According to these, evaluation aims "to determine individual needs, to obtain standard somatic, mental and psychological information" (EMCDDA 2007, p. 36) and to focus on "the consequences of treatment for the clients" (EMCDDA 2007, p. 17). Furthermore, the EMCDDA recommends that evaluation, including outcome assessment, should be based on "instruments that are available and validated" (EMCDDA 2007, p. 38).
Instruments recommended for outcome assessment in this field tend to be standardised scales containing quantitative pre-determined items (see the Evaluation Instruments Bank available at http://www.emcdda.europa.eu/eib). These instruments are mostly based on criteria selected by experts, which do not always coincide with what patients deem important (Pulford et al. 2009; Thurgood et al. 2014). A recent study has attempted to overcome this limitation by including patients in the development of a standardised outcome measure, involving them in topic selection. However, even when patients are involved in such a task, the universal scope of standardised measures limits the understanding of each patient's unique problems. More specifically, these measures do not give patients the freedom to express their personal views and they may contain questions that lack personal relevance or omit questions of relevance.

Why Use Individualised Outcome Assessment?
In outcome assessment, capturing the personal issues, concerns or problems is important for several reasons. For instance, Wagner (2002) showed that, in a psychotherapeutic context, nearly 60% of personalised data were not equivalent to any item included in two standardised measures used as comparators. Hunter et al. (2004) carried out a similar study in mental health services, showing that 25% of the information provided by patients was not represented in standardised measures. In 2007, Ashworth and colleagues found that, in the primary care setting, 44% of items in a personalised measure were not covered by a commonly used standardised measure of psychological well-being. These findings indicate that standardised outcome measures potentially fail to capture information about patients that is relevant for evaluation purposes.
Gathering personalised data involves the use of individualised outcome measures. These are open-ended questionnaires, tailored to each individual, and whose items (problems or goals) are generated by patients, in their own words (Ashworth et al. 2004). Items in the individualised questionnaire reflect the patient's reality (e.g. "I haven't spoken with my daughter for two years"), unlike standardised questionnaires which contain issues that apply to the whole population (e.g. "Talking to people has felt too much for me").
There are two main processes for gathering individualised items: the self-report method, where patients are invited to write their concerns in a pen-and-paper format (e.g. Psychological Outcome Profiles, PSYCHLOPS; Ashworth et al. 2004); and open-ended interviews, where patients are asked, in a dialogue, to talk about their problems (e.g. Simplified Personal Questionnaire, PQ; Elliott et al. 2016). Researchers believe that both formats have their own advantages and disadvantages. For instance, self-report individualised measures tend to be briefer, to demand less staff/service resources (e.g. presence of researcher/staff member not necessary), are flexible when it comes to the context of application (e.g. waiting room, private consultation room) and can be administered to multiple patients at the same time (e.g. evaluating group therapy). However, as they are meant for self-completion, patients are required to have a minimum level of literacy skills and be physically able to complete the questionnaire unaided. On the other hand, interview-based individualised protocols can be administered to any patient able to communicate verbally, since data collection is conducted by the interviewer. Nevertheless, interview-based individualised outcome measures are normally lengthy (e.g. 30-60 min) and require completion in a one-to-one format between patient and interviewer (for a review about individualised outcome measures, see Sales and Alves 2016).
Individualised measures are neither a substitute for clinical interviews, which are used by clinicians, e.g. for diagnostic purposes (e.g. Structured Clinical Interview DSM), nor for standardised outcome measures, which are used to evaluate how much a patient has changed during treatment. However, because individualised measures generate unique information about the patient's story, they have been used by clinicians for several tasks, for instance, to complement diagnoses or to support clinical decision-making (Sales et al. 2007; Sales et al. 2014). If used as part of the outcome evaluation process, individualised measures can generate patient-specific information that can be combined with standardised data for a better understanding of the patient's clinical situation (Alves et al. 2015; Sales and Alves 2012). Also, individualised measures provide an evidence-based structured strategy to collect qualitative and personalised information about patients, enabling use for evaluation purposes, unlike other clinically relevant information such as clinical notes (Elliott et al. 2016).

Using Individualised Outcome Measures in Substance Use Treatment
Individualised outcome measures have already been used in other health contexts with promising findings, such as primary care mental health (e.g. Ashworth et al. 2007) and counselling and psychotherapy services (e.g. Elliott et al. 2016). However, individualised outcome measures have only recently been applied to substance use treatment, and little is known about their potential in this field.
We believe that individualised outcome measures may broaden the understanding of outcome assessment in substance use treatment. Patients in this population tend to be stigmatised (Livingston et al. 2012), their perspectives about treatment are seldom taken into account (Orford 2008) and they are rarely involved in outcome assessment (Alves et al. 2015). In a recent study, we found that patients valued the freedom provided by individualised outcome measures to express personal concerns, even when the topic of concern was unrelated to substance use. Patients also reported that they preferred an interview-based procedure, especially if the interviewer was their own therapist. In contrast, they admitted having difficulties identifying personal problems in self-complete individualised outcome measures.

Study Rationale
In this study, we sought to explore the extent to which individualised outcome measures add personalised information to traditional measures of outcome assessment, in substance use treatment. Our aims were the following: (1) to explore the personal problems of patients with individualised outcome measures; (2) to compare the problems elicited from individualised and standardised outcome measures, investigating whether individualised data added information to that obtained from their standardised counterparts; (3) to contrast the problems elicited from two types of individualised measures (self-report vs. interview-based protocol); and (4) to investigate whether prior exposure to standardised measures influenced the contents elicited by patients in individualised measures.

Material and Methods
This study followed a cross-sectional design and was part of a larger project (Alves et al. 2013) that aimed to implement the personalised assessment approach in the field of substance use treatment. Data were collected in four drug and alcohol treatment services in Portugal: three outpatient services and one inpatient therapeutic community. Ethics approval was obtained from the Committee for Health of Lisbon and surrounding areas (ARSLVT, Ref. 8251/CES/2012).

Participants
Our sample included patients starting treatment for substance use. During the recruitment period, all new patients at the four study sites, who met the inclusion criteria, were invited to participate. The inclusion criteria were (1) aged 18 years and over; (2) admitted for a first or new treatment episode (i.e. treatment of a relapse); and (3) fluency in Portuguese. A total of 102 patients were invited for the study. Of these, eight people declined participation and one was excluded on the basis of incomplete data collection. The final sample consisted of 93 respondents, corresponding to a 91% response rate.

Measures
The evaluation protocol included (a) two individualised measures: Psychological Outcome Profiles (PSYCHLOPS; Ashworth et al. 2004), a self-report individualised outcome measure in which patients are invited to answer three open-ended questions ("Choose the problem that troubles you the most", "Choose another problem that troubles you" and "Choose one thing that is hard to do because of your problem(s)") and a fourth, standardised six-point scale question about overall well-being; and the Personal Questionnaire (PQ; Elliott et al. 2016), an interview-based individualised outcome measure whose items are elicited in a semi-structured format. In the interview, the patient is asked to brainstorm his/her current problems, prompted by the question "Describe the main problems that you are having right now that led you to seek treatment". As comparators, we used (b) two standardised psychological well-being scales, namely the Clinical Outcomes in Routine Evaluation-Outcome Measure (CORE-OM; Evans et al. 2002), a standardised self-report measure of generic psychological distress, which contains 34 items covering four domains (well-being, problems/symptoms, functional capacity and risk/harm), and the Patient Health Questionnaire-9 (PHQ-9; Kroenke et al. 2001), a nine-item standardised self-report questionnaire to measure depression; and (c) a standardised substance use-specific scale, the Treatment Outcomes Profile (TOP; Marsden et al. 2008), which is a staff-administered questionnaire focusing on drug and alcohol use, injecting risk behaviours, offending and criminal involvement, and health and social functioning.
PQ and PSYCHLOPS were chosen because they are the most frequently used individualised outcome measures in the mental health field (Sales and Alves 2016); CORE-OM and PHQ-9 are gold standard outcome measures that have already been administered in combination with PQ and PSYCHLOPS with satisfactory/good convergent validity scores (PQ vs. CORE-OM r = 0.80 and PQ vs. PHQ-9 r = 0.44; Elliott et al. 2016; PSYCHLOPS vs. CORE-OM r = 0.6; Ashworth et al. 2007); and TOP is one of the most frequently used measures for outcome evaluation in substance use treatment (see http://www.nta.nhs.uk/top-world-map.aspx).

Data Collection
Data were collected between July 2013 and May 2015 by the first author and five research assistants. All selected participants were asked to complete the evaluation protocol prior to their first treatment session, in a private room. Patients were given a patient information leaflet and consent was obtained before proceeding with questionnaire completion. The four measures of general psychological wellbeing (PQ, PSYCHLOPS, CORE-OM and PHQ-9) were presented in random order. TOP was the final measure to be presented and was not randomised because it focused mainly on drug-related issues. Randomisation was achieved through the use of numbered evaluation packs with each of the 24 possible questionnaire combinations labelled as pack #1 to #24. A random number generator was used to select which pack was administered to each participant.
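The pack-based randomisation described above can be illustrated with a minimal sketch. It assumes that each of the 24 packs corresponds to one ordering of the four randomised measures (4! = 24) with TOP appended last; the function names, the pack numbering order and the seed are hypothetical and are not taken from the study materials.

```python
import itertools
import random

# The four measures whose order was randomised; TOP was always presented last.
RANDOMISED_MEASURES = ["PQ", "PSYCHLOPS", "CORE-OM", "PHQ-9"]


def build_packs(measures=RANDOMISED_MEASURES):
    """Return a dict mapping pack numbers (1..24) to a presentation order.

    Each pack is one permutation of the randomised measures followed by TOP.
    The numbering of permutations here is arbitrary (an assumption).
    """
    orders = itertools.permutations(measures)
    return {number: list(order) + ["TOP"]
            for number, order in enumerate(orders, start=1)}


def draw_pack(packs, rng):
    """Pick one pack uniformly at random for a new participant."""
    number = rng.randint(1, len(packs))
    return number, packs[number]


packs = build_packs()
rng = random.Random(2013)  # hypothetical seed, only for reproducibility
pack_number, presentation_order = draw_pack(packs, rng)
print(pack_number, presentation_order)
```

Enumerating the orderings up front makes it easy to check that every combination exists exactly once before any pack is assigned.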

Data Analysis
To achieve our first aim (exploring patients' individualised problems), we analysed the free text items in PQ and PSYCHLOPS. Responses were categorised according to their content, or sub-theme, based on a previously validated thematic classification system. This classification system comprised 65 mutually exclusive sub-themes of problems and was created to analyse PSYCHLOPS items (Robinson et al. 2006; Sales et al. 2018). We used this classification system to allow our findings to be compared with previous studies and to increase the robustness of our categorisation procedure. The categorisations were made independently by three researchers, followed by inter-rater reliability calculations. Whenever there was disagreement, discussions with an independent expert in individualised measures took place until consensus was reached.
For the second aim (matching the content of individualised and standardised measures), we categorised the sub-themes found in individualised items according to whether their content overlapped with each standardised item in CORE-OM, PHQ-9 and TOP. A binary yes/no scale was used, where "no" meant "individualised item vague, general or completely different from the standardised item" and "yes" meant "individualised item connected, clearly related or completely overlapped with the standardised item". Content overlap was categorised independently by two researchers and inter-rater reliability was also computed. Frequencies of sub-themes with and without overlap with the three standardised measures were calculated.
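As an illustration of the inter-rater reliability calculation mentioned above, the following sketch computes Cohen's kappa for two raters using the standard formula, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The example ratings are invented for illustration and are not study data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who categorised the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must rate the same non-empty item list")
    n = len(rater_a)
    # Observed agreement: share of items given the same category by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions per category.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Invented yes/no overlap ratings for eight sub-themes (illustration only).
ratings_a = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
ratings_b = ["yes", "yes", "no", "no", "yes", "no", "no", "no"]
kappa = cohens_kappa(ratings_a, ratings_b)
print(round(kappa, 2))  # 0.75
```

In this invented example the raters agree on 7 of 8 items (p_o = 0.875) and chance agreement is 0.5, which gives kappa = 0.75.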
The third aim (contrasting the two individualised measures) was attained by comparing the number of items and the type of contents generated in PQ and PSYCHLOPS. The similarity between patients' responses in both measures was explored using Jaccard's similarity index (J) (Real and Vargas 1996) to estimate the percentage of patients that reported the same sub-themes in the two measures. We considered values of J > .3 to indicate strong similarity.
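A minimal sketch of the similarity calculation follows. It assumes that, for each sub-theme, J is the number of patients who reported the sub-theme in both measures divided by the number who reported it in at least one of them; the patient identifiers and the helper name are hypothetical.

```python
def jaccard_index(patients_pq, patients_psychlops):
    """Jaccard's J for one sub-theme: |A intersect B| / |A union B|.

    Returns None when no patient reported the sub-theme in either measure,
    since the index is undefined in that case.
    """
    a, b = set(patients_pq), set(patients_psychlops)
    union = a | b
    if not union:
        return None
    return len(a & b) / len(union)


# Hypothetical patient IDs reporting one sub-theme in each measure.
reported_in_pq = {1, 2, 3}
reported_in_psychlops = {2, 3, 4}
j = jaccard_index(reported_in_pq, reported_in_psychlops)
print(j)  # 0.5
```

Under this definition, J = 0 means that no patient reported the sub-theme in both measures, and J = 1 means that exactly the same patients reported it in each.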
Finally, we investigated whether prior exposure to standardised measures influenced the contents of patient-generated items. To explore this hypothesis, we selected a subsample of people (n = 29) who responded to standardised measures in between individualised measures (i.e. in the following order: first, one individualised measure; then, one or two standardised measures; finally, the other individualised measure). Then, for each patient, we counted the sub-themes elicited from individualised measures that were featured in standardised measures and, when they were, we analysed whether patients mentioned them before or after having contact with the standardised measure. If there was no influence, the proportion of featured sub-themes spontaneously mentioned in individualised measures prior to completing the standardised measure would be at least 50% of the total featured sub-themes mentioned by the patient (within-subject analysis). To test this hypothesis, we used the one-sample t test.
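The test statistic for a one-sample t test can be computed from summary values as t = (mean - mu0) / (SD / sqrt(n)), with n - 1 degrees of freedom. The sketch below applies this formula to the summary values reported in the Results for this analysis (mean proportion 67.8%, SD 38%, n = 29, reference value 50%); the function name is hypothetical and no p value is computed.

```python
import math


def one_sample_t(mean, sd, n, mu0):
    """Return (t, df) for a one-sample t test computed from summary statistics."""
    if n < 2 or sd <= 0:
        raise ValueError("Need n >= 2 and a positive standard deviation")
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error, n - 1


# Proportions expressed on a 0-1 scale.
t_value, df = one_sample_t(mean=0.678, sd=0.38, n=29, mu0=0.5)
print(round(t_value, 1), df)  # 2.5 28
```

The result is consistent with the reported t = 2.5 on 28 degrees of freedom.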

Results
The mean age of our final sample (N = 93) was 42.7 years old (SD = 11.3) and more than half (57%) were male (see Table 1 for a full summary of socio-demographic information). Among the study participants, 92 generated a total of 275 items from PQ (one patient did not complete PQ) and 89 generated 214 items from PSYCHLOPS (four patients did not complete the Problem section of PSYCHLOPS).

What Problems Do Patients Report in Individualised Measures?
Individualised items generated from PQ and PSYCHLOPS were classified into 54 of the available 65 sub-themes, with good inter-rater reliability results (Cohen's kappa between raters ranged from .88 to .93). Altogether, the sub-themes most frequently elicited by patients in individualised measures were "addiction" (72%), "work-related problems" (47%), "general relationship difficulties with family" (21%), "money worries" (19%) and "relationship difficulties with family that involve worrying about another person" (16%) (see Table 2).

Do Individualised Measures Add Information to Standardised Measures?
For the process of analysing content overlap between individualised and standardised outcome measures, all categorisations achieved satisfactory inter-rater reliability results (Cohen's kappa between raters ranged from .66 to 1.0).
Just over two thirds (38 out of 54; 70%) of sub-themes captured by individualised measures were absent from TOP. Among these were sub-themes frequently reported by patients such as "money worries" (19%), "relationship difficulties with family-worry about another" (16%) and "self-image/self-worth" (13%). Among the measures of general psychological well-being, a little over one third (19 out of 54; 35%) of sub-themes captured by individualised measures were not covered by CORE-OM. A large proportion of sub-themes (40 out of 54; 74%) were not covered by PHQ-9. Sub-themes not featuring in CORE-OM and PHQ-9 included topics frequently reported by patients, namely, "addiction", mentioned by 73% of patients, "work-related problems" (47%) and "money worries" (19%).
When considered as a whole, 43 out of 54 (80%) sub-themes reported by patients on individualised measures were captured by one or more of the standardised instruments, whilst 11 out of 54 (20%) were captured by individualised measures only. However, almost half of the patients in our sample (49%) described at least one individualised problem whose content was not covered by any of the three standardised measures. This indicates that even with the inclusion of three standardised measures, certain types of personal problems (e.g. "money worries") were only covered by an individualised measure (see Fig. 1).

Are There Any Differences Between the Two Individualised Measures?
The mean number of items elicited in PQ was 3.0 (SD = 2.1; range 1 to 12) and in PSYCHLOPS was 2.4 items (SD = 0.7; range 0 to 3); this moderate difference was significant (t = 3.2, df = 91, p = .002, Cohen's d = 0.44). Twenty-five patients (27%) reported the same number of items in both instruments; 41% (n = 38) reported more items in PQ and 32% (n = 30) reported more items in PSYCHLOPS.
There was little content overlap between the two individualised instruments. Most sub-themes (72%) present in patient-generated items had a Jaccard's similarity index of 0. This means that the responses elicited by PQ tended not to coincide with those elicited by the same patient in PSYCHLOPS and vice versa. A strong overlap was only found for the following sub-themes: "addiction" (J = .54), "housing worries" (J = .50), "obsessive compulsive disorder" (J = .50) and "worries about health" (J = .31).
Table 1 notes. In Portugal, the first year of school (which is called primary school) starts at the age of six. Secondary education ends in the 12th school year. The mean (SD) and number, n (%) values are given where applicable.
Table 2 (fragment). Sub-theme: n (%); J; coverage marks (✓)
[row truncated]: J = .13; ✓
Relationships-general: 8 (8.6); J = 0; ✓ ✓
Relationship difficulties: family-conflict: 7 (7.5); J = .14; ✓
Socialising: 7 (7.5); J = .14; ✓
Aggression/irritability: 6 (6.5); J = .17; ✓ ✓
Housing worries: 6 (6.5); J = .50*; ✓
Relationship difficulties partner-breaking up: 6 (6.5); [row truncated]
Self-acceptance: 1 (1.1); J = 0; ✓ ✓
Sexual problems: 1 (1.1); J = 0
Another person illness: 0 (0); J = n/a
Avoiding issues: 0 (0); J = n/a
Making decisions: 0 (0); J = n/a
Relationship difficulties: family-development: 0 (0); J = n/a; ✓
Relationship difficulties: partner-forming: 0 (0); J = n/a; ✓
Relaxing: 0 (0); J = n/a; ✓
Self-harm: 0 (0); J = n/a; ✓
Somatic symptoms: 0 (0); J = n/a; ✓ ✓ ✓
Thinking rationally: 0 (0); J = n/a
Thoughts: 0 (0); J = n/a; ✓ ✓
Traumatic event: 0 (0); J = n/a; ✓
Notes: In this table, "n/a" refers to sub-themes that are included in the classification system used in this study, but were not present in any item elicited by this sample.
Our study showed no evidence that prior completion of a standardised measure influenced the items reported by patients in individualised measures. We found that the proportion of CORE-OM and PHQ-9 sub-themes that were mentioned in patient-generated items prior to completing a standardised measure was 67.8% (SD = 38%). This proportion was significantly greater than 50% (t = 2.5, df = 28, p = .02), suggesting that items covered by standardised measures were spontaneously reported by patients more than 50% of the time before they were exposed to standardised measures.

Discussion
Our study suggests that individualised measures allow us to collect additional information regarding the personal problems of patients in substance use treatment, which may not be included in pre-set standardised measures. It was expected that a sample of patients being admitted for drug and alcohol treatment, in specialist services, would mainly discuss issues related to their addiction problems. However, as patients stated in a previous study (Alves et al. 2016), it was not "just about the alcohol" (p. 4) and drugs. Besides their substance use, people made use of individualised outcome measures to express other concerns, such as their financial situation or difficulties in relating with their family members. To learn that patients who seek substance use treatment report problems beyond drug use is a major finding of our study. It confirms the importance of outcome evaluation protocols that include other aspects, such as psychosocial functioning or stress, which go beyond substance use (Tiffany et al. 2012). Both domains suggested by Tiffany et al. (2012) were also expressed by patients in our sample in individualised measures. Moreover, our study reinforces the value of involving patients in the selection of criteria to evaluate treatment success, ensuring that treatment focuses on topics of relevance for patients. As Lee and Zerai (2010) stated, "assuming that [treatment] success itself can be defined, one must accept that is nuanced and, (at least in part), participant-defined" (p. 2423).
Because of its focus on drug-related issues, we were primarily interested in TOP when comparing individualised and standardised measures. More specifically, we expected this measure to have a high content overlap with individualised measures, because it is focused on problems specific to this population. However, the majority of problems reported by patients in an individualised format were overlooked by TOP. By adding measures of general psychological distress and depression (CORE-OM and PHQ-9, respectively), we extended the range of problems covered by the standardised module of the evaluation protocol. This decision followed previous work that emphasised the importance of psychological health as a major factor in recovery from drug and alcohol dependence (Wanigaratne et al. 2005).
It is unlikely that a real-practice evaluation protocol would concurrently administer five outcome measures in total. However, we opted for such a research protocol to expand the possibility of matching individualised information with standardised measures. Even with the inclusion of three standardised measures, 20% of problems reported by patients were still not captured by any of the measures used as comparators.
These findings have various implications for treatment evaluation in this patient group. The failure of standardised instruments to capture a substantial proportion of reported problems implies that current evaluation protocols may need to be revised in order to accommodate the needs of this population. The wide range of reported problems illustrates the importance of including broadly defined psychosocial criteria in evaluation protocols and not merely focusing on drug and alcohol issues. Additionally, our study indicates that individualised measures can be a valuable complement to the existing evaluation protocols, by capturing aspects that are overlooked by standardised measures, but relevant at an individual level. The burden of administering individualised measures (which tend to be lengthy) is potentially outweighed by the type, amount and relevance of the information gathered from a clinical perspective. A previous study reported that, although time consuming, individualised measures were valued by patients, particularly the freedom to express any topic of their concern. In other words, this study suggests that in addition to clinicians' and patients' favourable opinions of individualised measures, these measures are also valuable from a qualitative point of view, by generating information which other measures may not contain.
Patients reported a greater number of items in PQ than in PSYCHLOPS. This finding was expected because PQ imposes no limits on the number of items that patients can create, whereas PSYCHLOPS asks people to generate up to a maximum of three items. However, although significant, the difference between the mean number of items in PQ and PSYCHLOPS was small (3.0 vs. 2.4, respectively), which means that the choice of format might be dependent on available resources. This suggests that, if time constraints are not important, one might opt for a questionnaire without a cap on item number. If time constraints are more of a consideration, the one-page, self-complete format of PSYCHLOPS may be preferable. Moreover, we found that in self-report individualised measures, some patients (4%) did not describe any problems at all, resulting in missing data for outcome assessment. In a previous study, patients reported a preference for someone "pushing them" to facilitate a discussion about their problems rather than documenting their own problems in writing and thus preferred the format of PQ rather than PSYCHLOPS.
The two individualised measures elicited different concerns. This discrepancy may have arisen because patients found it easier to express certain problems in one format rather than another. For example, patients may prefer to report sensitive topics, e.g. expression of suicidal thoughts, in a therapist-administered questionnaire; whereas, others may prefer a written format to report, e.g. communication difficulties. Another explanation is that patients may not have wanted to duplicate reporting across the two measures. If this was the case, we do not know which of the measures elicited the topics of greatest concern for patients. Further research using think-aloud testing would enable us to explore reasons for the unexpected differences in responses elicited by the two individualised measures administered consecutively (Charters 2003).
This study has several strengths and limitations. Even though we used outcome measures, we did not collect any post-treatment data. However, this was not a pre-post study design, but instead an exploratory study to investigate the potential of individualised measures to generate new information not covered by traditional measures. Our findings provide the first thematic comparison in substance use patients between the contents reported in individualised measures and traditional standardised questionnaires. The comparison of items generated in PQ and PSYCHLOPS has not been previously reported and hints at how the method used to generate individualised data may, or may not, influence the type of contents reported. We were able to test for contamination of individualised measure completion through prior use of standardised measures, and our findings suggested little if any evidence of bias arising from this source. However, further testing of the individualised measures is required in order to establish whether reported differences are related to the structure of the measures or to the mode of administration. Also, because we only focused on data collected at the beginning of treatment, we could not capture how target problems and treatment goals may change during treatment. Future studies should focus on analysing how problems vary after treatment entry by comparing session-to-session or pre-post data. Even though we chose standardised comparators in common use both in mental health and substance use treatment, it is possible that other measures, not included in this study, may have a greater content overlap with PQ and PSYCHLOPS. Another concern relates to administration of some measures orally rather than as self-report measures and the degree to which non-verbal cues may have influenced responses.

Conclusions
Overall, we have demonstrated that individualised measures have the potential to capture qualitative information about personal problems, which is likely to be excluded from standardised outcome measures of general psychological distress, depression and drug- and alcohol-specific problems. From a qualitative perspective, the inclusion of individualised outcome measures in routine assessment protocols is likely to enhance patient-relevant information included in outcome assessment and translate into more personalised treatment provision.