Introduction

Quality-of-life measures are increasingly being recognized as an important endpoint for the evaluation of treatment for opioid use disorders, as well as other types of substance use disorders (Krebs et al. 2016; De Maeyer et al. 2013; Miller and Miller 2009; Veilleux et al. 2010; Karow et al. 2010; Strada et al. 2017; Bray et al. 2017; Barati et al. 2021). Studies confirm that individuals who habitually use opioids are more likely to experience ongoing somatic, mental health, and social complications (Hser, Mooney, Saxon, Miotto, Bell, and Huang 2017; Degenhardt et al. 2019; Jones and McCance-Katz 2019; Ashrafioun, Allan, and Stecker 2022; Milaney et al. 2021).

Opioid agonist treatment (OAT) serves as the gold standard of care for patients struggling with opioid addiction, as it can provide relief from withdrawal symptoms, mitigate anxiety caused by seeking drugs, and broaden options for both pharmaceutical and psychosocial treatment of underlying conditions (Krebs et al. 2016; Nosyk et al. 2011; Nordt et al. 2019). Among some OAT patients, unsuccessful treatment can frequently be attributed to continued side use of other fast-acting, instantly gratifying opioids (Vogel et al. 2023). In response, heroin-assisted treatment (HAT), usually consisting of supervised administration of injectable pharmaceutical heroin, has been developed and implemented as an effective variant of OAT (Smart 2018). The HAT treatment option was first offered in Switzerland in 1998 and has since been made available in Canada and several European countries, including Denmark in 2010 (Strang, Groshkova, and Uchtenhagen 2015; Danish Health Authority 2021).

Cost-effectiveness studies consistently find that HAT is an expensive yet cost-effective option for opioid addiction (Watson 2012; Dijkgraaf et al. 2005; Nosyk et al. 2012; Byford et al. 2013; Bansback et al. 2018). These estimates frequently incorporate quality-adjusted life years (QALYs), a measure combining longevity and quality of life that is used in the economic analyses underpinning many health policy choices to quantify the benefits of treatment interventions. Therefore, in addition to yielding a valuable clinical result, self-reported quality-of-life measures that capture the health domains significant to the patient receiving treatment are an essential part of health economic evaluations.

Different instruments can be used to measure the quality of life of opioid-dependent individuals (Strang, Groshkova, and Uchtenhagen 2017) and health-related QALYs (Bray et al. 2017). The Short-Form SF-36 Questionnaire is amenable to the calculation of (health-related) QALYs for use in health economic evaluations and is one of the most widely used quality-of-life measurement scales in the world (Lins and Carvalho 2016; Aguirre et al. 2022; Bray et al. 2017). In the Danish supervised injectable HAT program, patients complete an SF-36 Questionnaire at enrollment and are scheduled to complete further questionnaires as they continue treatment (Danish Health Authority 2021).

In the present study, we evaluated the psychometric properties of the SF-36 Health Survey (structural validity and external validity) in patients enrolled in the Danish Heroin Assisted Treatment Program (HAT). We further investigated structural validity by testing whether the statistical fit of the SF-36 model’s factor structure is consistent over time, thereby providing the basis for future longitudinal analyses in this patient group. Moreover, we assessed external validity by investigating whether different physical and mental health QoL factors were associated with the amount of prior hospital contact (PHC), with the category of hospital contact, either somatic codes (all codes other than F codes) or psychiatric codes (F20-F59 and F340-411 and F603-F608 codes) according to the ICD-10 diagnostic codes (World Health Organization 2015), and with whether the contacts were in-patient or out-patient hospital contacts.

This research aims to fill a knowledge gap regarding the use of the SF-36 Health Survey in heroin-assisted treatment settings. A key purpose of this study was to demonstrate that many patient issues in heroin-assisted treatment, such as somatic and mental health problems, can be anticipated by monitoring patients during HAT. These issues often require care that may not be adequately provided by the healthcare system, emphasizing the need for allocating relevant resources at the HAT clinics. Doing so will enhance the likelihood of providing timely and relevant help as part of the ongoing treatment while at the same time alleviating some of the burden on the general healthcare system, which often lacks the training needed to effectively meet and treat this patient group.

Materials and Methods

Patients

From 2010 to 2018, a total of 541 patients were enrolled in the HAT program at five treatment centers: one each in the cities of Aarhus, Esbjerg, and Odense, and two clinics in Copenhagen (Valmuen and KABS) (Danish Health Authority 2021). The patients were 21–65 years old (mean age 41.4, SD = 8.9), with 399 males and 142 females. All 541 patients completed at least one questionnaire (“SS-FO”, only first observation), giving a total of 2315 SF-36 measurement observations. Each individual was linked with the “BEF” database (Danish Population Registry). We refer to this as the main sample, “S-ALL”. Patients were eligible for HAT if they had regularly used opioids intravenously during the past 12 months in spite of opioid agonist treatment, did not have current or untreated serious somatic conditions (that contraindicate injectable pharmaceutical heroin), mental health problems, benzodiazepine or alcohol use disorders, were above the age of 18, were not pregnant and did not plan to become pregnant, and agreed to attend supervised self-administration of injectable pharmaceutical heroin.

Instruments and Variables

Patients filled in the SF-36 Health Survey at enrollment at the heroin clinics and roughly every year thereafter (M = 314.7 days, SD = 298.9). Patients completed the Danish SF-36 Health Questionnaire (Bjorner, Damsgaard, et al. 1998; Bjorner, Kreiner, et al. 1998; Bjorner, Thunedborg, et al. 1998c). The SF-36 Health Survey has 35 items subdivided into eight dimensions or factors: Physical Function (PF) (10 items); Role Physical (RP) (4 items); Body Pain (BP) (2 items); General Health (GH) (5 items); Vitality (VT) (4 items); Social Function (SF) (2 items); Role Emotional (RE) (3 items); and Mental Health (MH) (5 items). The first four dimensions contribute to the physical health composite factor (PCS), and the remaining four dimensions contribute to the mental health composite factor (MCS). Additionally, the SF-36 includes an item on the change in overall health status from the previous year.

Hospital Contact Data Retrieval

We retrieved data on the patients’ gender and age at enrollment as well as hospital contact history from the Danish National Patient Registry (NPR). The NPR is a high-quality national register covering both public and private hospital-based care that contains ICD-10 diagnostic codes for each contact with hospital-based care (World Health Organization 1992) and whether the episode was in-patient (overnight visit) or out-patient (treatment course) (Schmidt et al. 2015). We defined psychiatric hospital contacts as any hospital contacts during which a diagnosis of psychosis, anxiety/neurotic, mood, or personality disorder was given (F20-F59 and F340-411 and F603-F608 codes). For non-psychiatric hospital contacts, we considered all other diagnoses. Mammograms (“DZ016”) were excluded. Employing this database, we could count the number of hospital contacts prior to a patient's enrollment in the HAT program, thereby defining the amount of prior hospital contact (PHC). Patients had on average 34.5 (SD = 36.3) hospital contacts before HAT enrollment.
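The counting step described above can be sketched as follows. This is an illustrative Python sketch, not the authors' pipeline: the record layout (contact date, ICD-10 code), the function name count_prior_contacts, and the caller-supplied is_psychiatric predicate are assumptions, and the toy predicate and toy codes in the usage example are simplified stand-ins for the code ranges listed in the text.

```python
from datetime import date

MAMMOGRAM_CODE = "DZ016"  # excluded from all counts, as stated in the text


def count_prior_contacts(contacts, enrollment_date, is_psychiatric):
    """Count hospital contacts dated before enrollment, split by type.

    contacts: iterable of (contact_date, icd10_code) tuples (hypothetical layout).
    is_psychiatric: predicate on an ICD-10 code; the exact code ranges are
    left to the caller because they are study-specific.
    """
    counts = {"psychiatric": 0, "somatic": 0}
    for contact_date, code in contacts:
        # Skip contacts on or after enrollment, and skip mammograms.
        if contact_date >= enrollment_date or code == MAMMOGRAM_CODE:
            continue
        key = "psychiatric" if is_psychiatric(code) else "somatic"
        counts[key] += 1
    return counts


# Toy usage with made-up records; the predicate is a simplified placeholder.
toy_contacts = [
    (date(2008, 1, 10), "J189"),   # somatic, before enrollment
    (date(2009, 5, 2), "F329"),    # psychiatric, before enrollment
    (date(2009, 6, 1), "DZ016"),   # mammogram, excluded
    (date(2011, 3, 4), "J189"),    # after enrollment, excluded
]
toy_counts = count_prior_contacts(
    toy_contacts,
    enrollment_date=date(2010, 3, 1),
    is_psychiatric=lambda code: code.startswith("F"),
)
print(toy_counts)
```

Passing the predicate in as an argument keeps the date and exclusion logic separate from the diagnostic classification, which is the part most likely to differ between studies.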

SF-36 Item Coding

SF-36 scores were calculated according to the SF-36 Health Survey Manual and Interpretation Guide (Ware 1993; Ware and Gandek 1998). All variables were recoded so that greater values indicate better health, i.e., greater values on the “body pain” score indicate less pain, greater scores on “role physical” indicate fewer role limitations due to physical health, and greater scores on “physical functioning” indicate better physical functioning. Consequently, a positive correlation (indicated by blue) between QoL items and factors in Fig. 1D and E means that as one QoL factor measure increases, such as “body pain” (less pain), another factor, such as “social functioning”, tends to increase as well. Conversely, a negative correlation (indicated by red/orange) with age indicates that as age increases, QoL factor measures such as “role physical” decrease (i.e., greater role limitations due to physical health with age).
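The direction of scoring described above can be illustrated with a minimal Python sketch. It assumes a plain linear reversal of negatively worded items and the usual 0–100 rescaling of a scale's raw sum; the function names are hypothetical, and the manual's item-level recoding rules are not reproduced here.

```python
def reverse_item(value, low, high):
    """Flip a response so that higher values mean better health."""
    return low + high - value


def scale_score(responses):
    """Rescale the raw sum of a scale's items to the 0-100 range.

    responses: list of (value, low, high) tuples, one per item, where
    low and high are the item's lowest and highest possible responses
    after any reversal has been applied.
    """
    raw = sum(value for value, _, _ in responses)
    lowest = sum(low for _, low, _ in responses)
    highest = sum(high for _, _, high in responses)
    return 100.0 * (raw - lowest) / (highest - lowest)


# Toy usage: ten items answered on a 1-3 scale, all with the middle response.
toy_scale = [(2, 1, 3)] * 10
print(scale_score(toy_scale))
# A response of 1 on a 1-5 item becomes 5 after reversal.
print(reverse_item(1, 1, 5))
```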

Fig. 1

Investigation of SF-36 factor structure in Danish heroin-assisted treatment patients. Analyses use different samples and subsamples. Sample "S-ALL" uses all patients and all observations (N = 541, Obs = 2315). Subsample "SS-FO" (N = 541, Obs = 541) uses all patients for only the first observation. Subsample "SS-FIVE-FIVE" uses only patients with all five "waves" and only those five waves (N = 173, Obs = 870). "SS-TEN" uses all patients for only ten waves (N = 541, Obs = 2133). A Exploratory factor analysis of the SF-36 Questionnaire in Danish heroin-assisted treatment patients. Parallel analysis suggests eight factors and four components (uses "S-ALL"). B Real and permuted SF-36 models RMSEA scores: Investigation of the root mean square error of approximation (RMSEA) fit measure for the “Real SF-36 Model” (i.e., the actual theoretically specified SF-36 factor structure) and "Permuted SF-36 Models" (i.e., all 430 combinations/permutations of one item from the 35-item questionnaire replaced by an item from another factor) (uses “S-ALL”). Results indicate that the Real SF-36 Model provides a better fit (lower RMSEA score) than any permuted SF-36 model. C RMSEA across ten waves: Investigation of factor structure fit across waves (questionnaire time points) using the RMSEA fit measure of the SF-36 model. At each wave, fewer patients complete the SF-36 questionnaire, and RMSEA measures increase, indicating a poorer fit over time. Results indicate that the fit of the factor structure is acceptable across the first five waves. Shaded regions indicate levels of fit: dark red indicates poor fit, red indicates marginal fit, light green indicates acceptable fit, and dark green indicates good fit (uses "SS-TEN").
D Correlogram of correlations among SF-36 items, amount of previous hospital contacts, age, and gender: Results indicate high correlations among clusters of theoretical SF-36 factor structures, as well as correlations to hospital contact, gender, and age. The size of the circles indicates correlation strength (a larger circle means a stronger correlation). Blue dots indicate positive correlations, and red dots indicate negative correlations. Non-significant correlations are left blank. All variables were recoded so that greater values indicate better health (i.e., greater values on the “body pain” items indicate less pain, greater scores on “role physical” items indicate fewer role limitations due to physical health, and greater scores on “physical functioning” items indicate better physical functioning) (uses "SS-FO"). E Correlogram of correlations among SF-36 factors, composite factors, amount of previous hospital contacts, age, and gender. Results show correlations among theoretical SF-36 factor structures, as well as correlations to hospital contact, gender, and age. Non-significant correlations are uncolored (uses "SS-FO") (color figure online)

Data Analyses

Exploratory factor analysis was performed using the fa and fa.parallel functions from the psych R package with 1000 iterations (Revelle and Revelle 2015). We used the R statistical computing software (R Core Team 2013) version 4.2.2 for all analyses.

To investigate whether the fit of the "Real SF-36 Model" could be improved (lower RMSEA score), we adopted a permutation inference approach. We refer to the theoretical factor structure defined in Ware (1993) as the “Real SF-36 Factor Model”. Under the null hypothesis, we calculated "Permuted SF-36 Models", which were all 430 combinations/permutations of one item from the 35-item questionnaire replaced by an item from another factor, and compared RMSEA scores between the Real SF-36 Factor Model and the permuted models.
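The comparison step of this permutation approach can be sketched as below. This minimal Python sketch assumes the RMSEA of each model has already been obtained from a fitted CFA; the function name and the numeric values are hypothetical and serve only to show how the real model is ranked against the permuted ones.

```python
def compare_to_permuted(real_rmsea, permuted_rmseas):
    """Summarize how the real model's RMSEA ranks among permuted models.

    Lower RMSEA means better fit, so a permuted model "beats" the real
    model only if its RMSEA is strictly lower.
    """
    n_better = sum(1 for value in permuted_rmseas if value < real_rmsea)
    return {
        "n_permuted": len(permuted_rmseas),
        "n_better_than_real": n_better,
        "best_permuted_rmsea": min(permuted_rmseas),
    }


# Toy usage with made-up RMSEA values (not results from the study).
toy_summary = compare_to_permuted(0.060, [0.064, 0.071, 0.066, 0.062])
print(toy_summary)
```

If n_better_than_real is zero, no permuted model improves on the theoretical structure, which is the pattern the permutation analysis is designed to detect.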

We calculated the fit of the SF-36 factor model over time and compared RMSEA scores over the ten waves. We refer to questionnaire time points as “waves”. We aimed to identify a subsample of patients who completed a certain number of observations for which the RMSEA score was acceptable. It has been advised that RMSEA values less than 0.05 are good, values between 0.05 and 0.08 are acceptable, values between 0.08 and 0.10 are marginal, and values greater than 0.10 are poor (Kim et al. 2016; Fabrigar et al. 1999). In addition, we wanted to test measurement invariance of the SF-36 factor structure across time points by treating “waves” as a grouping variable (Svetina, Rutkowski, and Rutkowski 2020). As it has been demonstrated that comparisons of alternative fit indices (AFIs) and χ2 across configural and metric invariance models have inconsistent Type I error rates, we adopted a nonparametric permutation approach using the permuteMeasEq function (nPermute = 20) from the semTools R package, comparing factor loadings as the parameter of interest (Jorgensen et al. 2016, 2018).
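The RMSEA cut-offs quoted above can be written as a small helper. This is a minimal Python sketch: the function names are hypothetical, the handling of values falling exactly on a boundary is an assumption, and the RMSEA formula shown is one common sample-based definition that may differ slightly from what a given SEM package reports.

```python
import math


def rmsea_from_chi2(chi2, df, n_obs):
    """One common RMSEA definition: sqrt(max(chi2 - df, 0) / (df * (n_obs - 1)))."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n_obs - 1)))


def classify_rmsea(rmsea):
    """Map an RMSEA value to the fit labels used in the text."""
    if rmsea < 0.05:
        return "good"
    if rmsea <= 0.08:
        return "acceptable"
    if rmsea <= 0.10:
        return "marginal"
    return "poor"


# Toy usage with made-up fit statistics.
toy_rmsea = rmsea_from_chi2(chi2=250.0, df=100, n_obs=501)
print(round(toy_rmsea, 4), classify_rmsea(toy_rmsea))
```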

We investigated intercorrelation across SF-36 Items in order to descriptively inspect clusters of correlations among factors and their relation to demographics and prior hospital contact using the Hmisc and corrplot R packages (Harrell and Harrell 2019; Wei et al. 2017).

To investigate the relationship between SF-36 composite factors and prior hospital contact above and beyond that explained by demographics (gender and age), we constructed regression models with gender and age alone, as well as models that added either the Physical Health Composite (PCS) or the Mental Health Composite (MCS) score. P values were extracted from the comparison between the model containing exclusively age and gender and the model that also included the respective composite score.
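One standard way to compare such nested regression models is a partial F-test. The sketch below is an assumption about how the comparison could be carried out, since the text does not name the test used; it computes only the F statistic from residual sums of squares that are assumed to come from already fitted models, and the function and argument names are hypothetical.

```python
def nested_f_statistic(rss_reduced, rss_full, n_obs, n_params_full, n_added):
    """Partial F statistic for a reduced model nested in a full model.

    rss_reduced: residual sum of squares of the model with age and gender only.
    rss_full: residual sum of squares after adding the composite score.
    n_params_full: number of estimated coefficients in the full model,
    including the intercept.
    n_added: number of predictors added to the reduced model.
    """
    numerator = (rss_reduced - rss_full) / n_added
    denominator = rss_full / (n_obs - n_params_full)
    return numerator / denominator


# Toy usage with made-up sums of squares: intercept + age + gender + one score.
toy_f = nested_f_statistic(
    rss_reduced=120.0, rss_full=100.0, n_obs=104, n_params_full=4, n_added=1
)
print(toy_f)
```

The resulting statistic would be referred to an F distribution with n_added and n_obs - n_params_full degrees of freedom to obtain a p value.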

To evaluate how well individual SF-36 factors predict prior hospital contact (PHC), we employed a stepwise regression approach to assess whether SF-36 factors would be preferentially selected as improving predictability above and beyond demographic variables (gender and age). A tenfold cross-validation (CV) scheme was used to assess out-of-sample predictability using the caret R package (Kuhn 2015). A leap-backward algorithm evaluated the predictability of variables by starting with all independent variables and iteratively removing the least important predictors, stopping when all remaining predictors were statistically significant predictors of the dependent variable. The algorithm was tuned to find the model with the lowest root-mean-squared error (RMSE).
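The cross-validation bookkeeping behind this procedure can be sketched in a few lines. This is a minimal standard-library Python sketch, not the caret implementation: the fold assignment, the function names, and the choice of R2 as 1 - SS_res/SS_tot are assumptions (other software may report the squared correlation between predicted and observed values instead), and the model-fitting and backward-elimination steps are omitted.

```python
import math
import random


def kfold_indices(n_samples, k=10, seed=0):
    """Split range(n_samples) into k shuffled, non-overlapping folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def rmse(y_true, y_pred):
    """Root-mean-squared error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination, taken here as 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy usage: ten folds over 541 patients; each fold is held out once.
folds = kfold_indices(541, k=10)
print([len(fold) for fold in folds])
```

In each round, a model would be fitted on nine folds and the three metrics computed on the held-out fold, with the averages across rounds used to compare candidate predictor sets.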

Results

Exploratory and Confirmatory Factor Analysis

Parallel analysis indicated eight factors and four components (see Fig. 1A). We assessed model fit using fit indices from the confirmatory factor analysis (CFA). The (theoretical/CFA) SF-36 factor structure had a better fit (RMSEA = 0.063) than the (empirical) factor structure derived through the fa function (RMSEA = 0.067), although both were acceptable (see Supplementary material S1). We compared the factors derived through our CFA with the scores automatically computed by the Danish Health Data Agency (Danish Health Authority 2021; “Danish Health Authority” n.d.). Correlations were high (Pearson’s R = 0.95–0.99) for all factors (see Supplementary material S1).

Real and Permuted SF-36 Factor Model

In our permutation-based comparison of the “Real SF-36 Model” with the permuted models, we found that no other model provided an improved fit (see Fig. 1B). See also Fig. 2 and Supplementary material S1 for the complete factor structure. A full list of all permutations and associated RMSEA scores and modification indices calculated by the lavaan R package (Rosseel 2012) is provided in Supplementary material S2.

Fig. 2

The SF-36 Health Survey Questionnaire and a confirmatory factor analysis of the SF-36 factor structure in Danish Heroin Assisted Treatment patients. Subsample "SS-FO" (N = 541, Obs = 541) uses all patients for only the first observation

SF-36 Factor Model Fit Over Time

We found that RMSEA values were acceptable (between 0.05 and 0.08) across the first five waves (see Fig. 1C) (Fabrigar et al. 1999). We could thus identify a subsample of 173 patients completing five or more waves (SS-FIVE) or employ only those five waves (SS-FIVE-FIVE), which we find appropriate for future studies in this population with an appropriate level of model fit. All fit measures for “SS-TEN” and “SS-FIVE-FIVE” are provided in the Supplementary material S3.

Employing this subsample (“SS-FIVE-FIVE”), we could thus test measurement invariance using a nonparametric permutation approach. The omnibus p-value-based nonparametric permutation method for AFIs revealed non-significant p values (0.4–0.5), whereas the χ2 test had p = 0.097 (see Supplementary material S3). These findings suggest that the factor loadings for the SF-36 in Danish HAT remain consistent across the first five waves while displaying an acceptable level of CFA fit for the SF-36 Factor Structure Model. We thus conclude that QoL measures in Danish HAT can be reliably and consistently estimated using the first five waves.

Intercorrelation Among SF-36 Items, SF-36 Factors, Demographics and Prior Hospital Contact

Results of the correlation analyses indicated high correlations among clusters of theoretical SF-36 factor structures, as well as correlations between gender and age and prior hospital contact (see Fig. 1D for SF-36 items and Fig. 1E for SF-36 Factors (uses “SS-FO”)). Noteworthy correlations between age and SF-36 factors were an expected negative correlation between age and physical functioning (R = − 0.22, p < 0.001), whereas older patients tended to score higher on the social functioning factor (R = 0.15, p < 0.001) and higher on the mental health factor (R = 0.15, p < 0.001) (see Supplementary S4 for Pearson correlation coefficients and p-values).

SF-36 Factors Relation to Prior Hospital Contact Beyond Demographics

Our investigation of the relation between SF-36 factors and hospital contact showed that PCS and MCS composite factors were significantly negatively correlated with out-patient and in-patient somatic prior hospital contact (PHC) (R = − 0.23, p < 0.001 and R = − 0.20, p < 0.001) and (R = − 0.13, p = 0.003 and R = − 0.13, p = 0.004). In contrast, correlations with out-patient and in-patient psychiatric prior hospital contact (PHC) were only marginal or non-significant (R = − 0.08, p = 0.050 and R = − 0.06, p = 0.165) and (R = − 0.08, p = 0.091 and R = − 0.07, p = 0.184) (see Supplementary material S5). In our stepwise regression approach, the best model picked for out-patient somatic prior hospital contact (PHC) employed the general health factor (GH), the physical functioning factor (PF), and gender, reaching an R2 of 0.126 between out-of-sample true and predicted values of prior hospital contact. For in-patient somatic prior hospital contact, the best model comprised general health (GH), gender, social functioning (SF), role-emotional (RE), physical functioning (PF), and role-physical (RP), reaching an R2 of 0.105. For out-patient psychiatric (F) prior hospital contact (PHC), the best model comprised physical functioning (PF) and age and reached a low R2 of 0.036. The same held for in-patient psychiatric (F) prior hospital contact (PHC), where the best model comprised physical functioning (PF) alone and reached an R2 of only 0.011 (Fig. 3).

Fig. 3

Correlation and stepwise regression analyses between SF-36 composite factors, SF-36 individual factors, and prior hospital contact (PHC) in Danish Heroin Assisted Treatment patients. The analysis uses the subsample "SS-FO" (N = 541, Obs = 541 for SF-36). From top to bottom: Physical Health Composite (PCS) and Mental Health Composite (MCS) factors were correlated with out-patient and in-patient somatic prior hospital contact (PHC), whereas correlations with out-patient and in-patient psychiatric prior hospital contact (PHC) were only marginal or non-significant. Correlations were only considered significant if they predicted variance above that explained by gender and age. Below each prior hospital contact type, the best model (in Wilkinson notation) of a stepwise regression analysis (leap-backward algorithm) is indicated, using age, gender, and all SF-36 factors as candidates (i.e., PHC ~ age + gender + PF + RP + BP + GH + VT + SF + RE + MH). Results indicate that the best model picked for out-patient somatic prior hospital contact (PHC) employs the general health factor (GH), the physical functioning factor (PF), and gender (with the first-mentioned variables being chosen as stronger predictors). Other factors did not provide significant improvement (lower RMSE) in fit between predicted and true PHC values. A tenfold cross-validation (CV) scheme was used to assess out-of-fold prediction. Root-mean-square error (RMSE), R-squared (R2, the coefficient of determination), and mean absolute error (MAE) are reported for predicted versus true out-of-sample values of PHC. Results indicate that SF-36 composite factors and individual factors correlate with somatic prior hospital contact (0.12–0.10 R2) but not psychiatric prior hospital contact (0.03 and 0.01 R2)

Discussion

This study found support for both the structural and external validity of the SF-36 in patients undergoing heroin-assisted treatment in Denmark. The exploratory factor analysis pointed to eight factors, which is consistent with the theoretical SF-36 factor structure (Ware 1993). Using confirmatory factor analysis (CFA), we found that the SF-36 factor structure had an acceptable level of statistical fit (RMSEA = 0.067). We also found that for a subsample of patients (N = 173) completing the first five questionnaire time points (“waves”), the statistical model fit was stable over time, thus supporting the use of the SF-36 in longitudinal studies in this type of population.

Regarding external validity, we found that outpatient and inpatient somatic hospital contact (non-psychiatric, all ICD-10 codes other than F codes) correlated negatively (R = − 0.23 and R = − 0.22) with the Physical Health Summary score (PCS) and correlated negatively, albeit to a lesser extent (R = − 0.13 and R = − 0.13), with the Mental Health Summary score (MCS). Our analyses of individual factors revealed that the General Health (GH) and Physical Functioning (PF) factor scores were correlated with somatic hospital contacts and were preferentially picked by the stepwise regression approach for predicting prior somatic hospital contacts. Our data thus dovetail with findings that stress the importance of health measures in patients with opioid addictions (Hser, Mooney, Saxon, Miotto, Bell, and Huang 2017; Hser, Mooney, Saxon, Miotto, Bell, Zhu, et al. 2017; Jones and McCance-Katz 2019; De Maeyer et al. 2010; Bray et al. 2017). Future studies are planned to explore the predictive validity of SF-36 factors for future hospital contact in this sample. We argue that this is a worthwhile research agenda that will support a more detailed understanding of the relationship between quality-of-life measures and hospital contacts in opioid users. Such knowledge can guide informed changes in health care services for this population and inform health economic evaluations of which treatment interventions are worth pursuing.

In contrast to somatic hospital contact, we found that psychiatric hospital contact (F00-F99 ICD-10 codes) was only marginally significantly predicted by PCS and MCS. The stepwise regression approach revealed low correlations (R2 values = 0.036–0.011) for predicted versus true psychiatric hospital contacts, thus showing that quality of life measures in this sample do not readily capture the prior psychiatric hospital contact. The lack of correlation between a psychiatric history and the SF-36 factors could be caused by the strict eligibility requirements of the Danish HAT program which might have restricted the inclusion of psychiatric patients with current or untreated mental illness in our sample.

While our findings imply that the eligibility criterion excluding untreated illness in the Danish HAT program could contribute to the absence of a significant association with SF-36 factors, it is also important to consider that the SF-36 may not fully capture the specific nuances related to psychiatric conditions in the study population. To gain a fuller understanding of the impact that psychiatric conditions have on individuals in different HAT settings, future research should incorporate additional questionnaires specifically designed to assess such mental health conditions.

Promoting a better quality of life among patients with opioid use disorder will undeniably provide improvements in the health and psychological well-being (Bray et al. 2017), which calls for short and valid instruments that assess quality of life in this population (Strada et al. 2017). The present study serves as a premise for future studies that aim to assess longitudinal changes in quality of life and whether quality of life measures can be used to predict future hospital contact. One of the primary limitations of our study is the inherent heterogeneity of the participating clinics, including variations in budget, practices, and patient characteristics. This diversity may have affected the generalizability of our findings. Furthermore, regional differences in patient characteristics across Denmark could influence differences in treatment outcomes, necessitating a cautious interpretation of our results. Another crucial limitation worth noting is the potential for selection bias in our study. The individuals who participated in the Danish Heroin Assisted Treatment Program may not represent the total population of individuals struggling with heroin dependence, especially those who refrain from seeking treatment or those who do not have immediate access to treatment. Consequently, it might be challenging to generalize our findings to the population facing dependence.

Conclusion

The analysis of the validity of the SF-36 Health Survey in patients enrolled in the Danish Heroin Assisted Treatment Program (HAT) showed acceptable statistical fit, confirming the factor structure of the SF-36 model. The estimated SF-36 factors were further correlated with prior somatic hospital contacts, thus underlining predictive validity. However, prior psychiatric hospital contacts were not correlated with SF-36 factors. Further studies should assess longitudinal changes and attempt to predict future hospital contacts.