Background

Peripheral intravenous catheters (PIVCs) are the most commonly inserted invasive device in modern healthcare delivery [1]. These devices consist of a small plastic tube (introduced into the bloodstream by a steel needle) and are recommended for administration of intravenous therapies lasting ≤ 5 days. Two in every three patients entering a tertiary institution will require at least one PIVC for the delivery of essential intravenous medications and other therapies [1]. Patients often require multiple attempts to achieve successful PIVC insertion, and PIVC failure, as a result of complications such as occlusion, dislodgement, phlebitis (inflammation), and local or bloodstream infection, occurs in one in every three PIVCs placed [2, 3]. The sequelae of repeated failed attempts at PIVC insertion, and of later PIVC failure, include patient-reported pain and distress [4], missed medication (e.g., antibiotic) doses leading to sub-optimal treatment [5], irreversible damage to vasculature [6], and, in severe cases, morbidity and mortality [7]. While the incidence of clinician-identified PIVC-related harm is often reported at an individual and institutional level [8], it is essential that clinicians and policy makers further consider the patient’s self-reported health outcomes and experiences.

Health-related quality of life (HRQoL) measures (including patient-reported outcome measures (PROMs)) and patient-reported experience measures (PREMs) are common tools within healthcare institutions. They are used to identify clinical problems contemporaneously [9], establish the suitability of healthcare interventions, improve patient-clinician communication [10], and ensure quality and safety in healthcare [11]. These instruments can be generic or disease-specific, and require validation to establish reliability and usefulness [12]. Their use in the context of PIVCs, however, has been limited.

A recent scoping review identified that no generic (whole of treatment/person) HRQoL or PIVC-specific instruments were used to collect/report PIVC outcomes/experiences [13]. Several studies incorporated individual patient-reported items into their data collection (e.g., numerical rating scales) [13]. Overall, the core domains related to five unique patient-reported outcomes: pain, discomfort, distress, anxiety, and fear [13]. Similarly, while several individual questions related to patient experiences existed (e.g., ‘How much difficulty did health staff have when trying to insert an IV cannula?’), only one purpose-built PIVC-specific PREM was found [13]. This instrument was developed in partnership with industry representatives and consumers; however, it requires further testing to establish validity, reliability, and responsiveness. Within the remaining studies, domains related to patient experiences included satisfaction, confidence, and understanding [13]. The scoping review demonstrated a clear need for either generic or purpose-built PROMs and PREMs to establish quality of care and safety for the insertion and care of PIVCs.

Methods

The aim of this secondary analysis was to establish the discrimination and responsiveness of two generic PREMs (the Australian Hospital Patient Experience Question Set (AHPEQS) [14]; the Functional Assessment of Chronic Illness Therapy (Treatment Satisfaction - General) (FACIT-TS-G) [15]), and one generic HRQoL measure (EuroQol Five Dimension, Five Level (EQ5D-5L) [16]), collected as an outcome of a recent clinical trial comparing two PIVC designs (integrated, non-integrated) [17]. Prior to this study, none of the selected instruments (EQ5D-5L, FACIT-TS-G, or AHPEQS) had been used to assess health-related outcomes for patients with PIVCs.

Hypotheses

1A: Null Hypothesis (discrimination): There will be no significant difference in the experiences of participants observed with desirable (completion of therapy) or undesirable (device failure) outcomes using the Functional Assessment of Chronic Illness Therapy (FACIT, PREM), or the Australian Hospital Patient Experience Question Set (AHPEQS, PREM).

1B: Null Hypothesis (responsiveness): There will be no significant difference in the quality of life outcomes of participants observed with desirable (completion of therapy) or undesirable (device failure) outcomes using the EQ5D-5L (HRQoL).

Data collection

A multi-site randomised controlled trial (RCT) comparing the use of integrated- and non-integrated PIVCs, the OPTIMUM Trial (Australian New Zealand Clinical Trials Registry, ACTRN12617000089336) [17], was conducted between July 2017 and December 2019. In total, 1,759 adult participants were recruited from medical, surgical, and emergency settings across three adult tertiary acute care hospitals [17]. Research Nurses prospectively recruited participants prior to PIVC insertion, subsequently collecting data on patient demographics (e.g., gender, age, underlying condition), and device details (e.g., number of insertion attempts, inserting clinician, device location). Participants were assessed daily for signs and symptoms of site complications (e.g., pain, erythema/redness). Upon PIVC removal, device outcome data (e.g., reason for removal, signs, and symptoms of site complications) and patient outcome data (e.g., treatment received, PIVC replacement) were collected.

Concurrently, a convenience sample of patients (n = 685) was approached across two recruiting sites to provide responses to an HRQoL survey (EQ5D-5L) and one of two patient experience surveys (FACIT-TS-G or AHPEQS). Sampling occurred Monday to Friday, based on availability of the Research Nurse. Participants were invited to participate if they were able to provide verbal informed consent and were expected to require the PIVC for > 48 h. The EQ5D-5L was administered at baseline (prior to or immediately following PIVC insertion) and at 36 to 60 h following PIVC insertion (reliant upon participant availability). The FACIT-TS-G (available for collection between July 2017 and December 2018) and AHPEQS (available between January 2019 and December 2019) were also administered at 36 to 60 h following PIVC insertion. The follow-up time-point (i.e., 36 to 60 h) was selected a priori, based on the expected mean dwell time of PIVCs (local average dwell time between 1.5 and 2.5 days) [3, 18], to ensure a higher response rate (minimising attrition related to patients discharged immediately following PIVC removal). Notably, while two instruments (EQ5D-5L and FACIT-TS-G) were administered with an introductory statement asking the patient to relate responses to their outcome and experiences associated with their PIVC, one (AHPEQS) was not. This tool was instead administered with respect to the patient’s (whole) hospital experience.

Instruments

Henceforth, all individual questions within the instruments will be referred to as ‘items’ and values recorded from responses on item scales will be ‘scores.’

EQ5D-5L

HRQoL was assessed using the EQ5D-5L [16]; this measure was selected for use in the multi-site RCT through investigator consensus, based on the widespread use of the tool and its selection for use in other venous access device trials. First published in 1991 as a three-level generic measure (later adapted to a five-level option in 2009), the EQ5D is one of the most widely used quality of life instruments worldwide [19]. It has been validated in many clinical contexts (e.g., orthopaedic, cardiac settings) [20, 21], and is available in more than 150 languages [16]. The EQ5D-5L consists of five domains (mobility, self-care, usual activities, pain/discomfort, and anxiety/depression) measured against five levels (ranging from no problems to extreme problems), with a supplementary visual analogue scale for a self-reported health status measure (0 to 100; worst to best health). The instrument is intended for a patient population of ≥ 16 years of age and takes only a few minutes to complete [16]. EQ5D-5L responses are scored to determine a ‘summary index value’ (continuous variable, henceforth ‘utility’) as per the Australian EQ5D-5L algorithm, which accounts for up to 243 different health states (1.0, perfect health to -0.217, worse than death) [22, 23]. A disutility index value was also created by subtracting the utility estimate from one (perfect health). The introduction statement is available in Supplementary File 1.
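The disutility transformation described above can be sketched in a few lines. This is only an illustration: the function name is hypothetical, the bounds check uses the utility range quoted in the text, and the example inputs are the mean utilities reported in the Results rather than individual participant data. The value-set scoring itself is not reproduced here.

```python
# Minimal sketch (hypothetical helper, not the authors' Stata code).
UTILITY_MIN = -0.217  # lowest utility quoted in the text ("worse than death")
UTILITY_MAX = 1.0     # perfect health


def disutility(utility: float) -> float:
    """Return one (perfect health) minus the utility estimate."""
    if not UTILITY_MIN <= utility <= UTILITY_MAX:
        raise ValueError("utility outside the range quoted in the text")
    return 1.0 - utility


# Example inputs: mean utilities at baseline and follow-up from the Results.
print(round(disutility(0.52), 2), round(disutility(0.55), 2))
```

Because utility can be negative while disutility is never below zero, the transformed value is the one that suits models requiring non-negative outcomes; that motivation is an inference, not something the text states.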

FACIT-TS-G

The Functional Assessment of Chronic Illness Therapy (FACIT) measurement questionnaires, established by FACIT.org, are a series of measures, available in over 80 languages, that assess HRQoL for a number of specific (e.g., cancer/treatment-specific) and general conditions [15]. No PIVC-specific FACIT measurement currently exists. The Functional Assessment of Chronic Illness Therapy – Treatment Satisfaction – General measure (FACIT-TS-G), intended for a population of ≥ 18 years of age undergoing treatment for chronic illness, was selected as the most appropriate generic PREM instrument to pilot-test in this context. This tool was selected by consensus of the investigator team, based on appropriateness of the included items. While the multi-centre RCT included general medical and surgical in-patients, this experience measure (designed for patients with chronic illness) was selected based on known patient demographics at the participating hospitals (which demonstrated high rates of re-admissions and underlying multi-morbidity). The tool comprises eight unique items, which can be collated for a single summary score, and is estimated to require 5 min for completion [15]. The introduction statement is available in Supplementary File 1.

AHPEQS

Following the rigorous development and subsequent release of the AHPEQS by the Australian Commission on Safety and Quality in Health Care in 2017, use of the FACIT-TS-G ceased and it was replaced by the AHPEQS, following investigator consensus. The AHPEQS is a PREM instrument consisting of ten core items (and two sub-items) intended for use by hospitals and other healthcare providers to survey patients on their recent experiences of treatment/care [14]. The instrument is designed for a population of ≥ 18 years of age and takes approximately 10 min to complete. The introduction statement is available in Supplementary File 1.

Outcomes of interest

The performance of the three unique instruments was assessed against two key outcomes of interest, collected during the conduct of the large multi-centre RCT. These included:

  1. All-cause PIVC failure: binary variable, a composite measure of failure resulting from the most commonly occurring PIVC complications, including occlusion (the inability to infuse IV medications/fluids) [17], infiltration (movement of intravenous fluid/medication outside of the vein into the patient’s cell tissue), cell damage from an irritant infusate (extravasation) [17], phlebitis (defined as clinician-reported phlebitis; patient-reported pain/tenderness (≥ 2 on a 0–10 scale) resulting in PIVC removal; or two or more of pain/tenderness (≥ 1 on a 0–10 scale), erythema (redness), swelling, palpable cord, vein streak, or purulent drainage, up to 24 h prior to PIVC removal) [17], dislodgement [17], and local/bloodstream infection (according to the Centers for Disease Control/National Healthcare Safety Network definitions) [24].

  2. Multiple insertion attempt: binary variable, defined as a PIVC requiring more than a single attempt (needle to skin) for successful insertion [17].
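The two binary outcomes defined above can be expressed as simple rules. The sketch below is illustrative only: the class, field, and function names are hypothetical, it assumes the signs and symptoms passed in were already restricted to the 24 h before PIVC removal, and it does not reproduce the trial's actual data dictionary or the infection surveillance definitions.

```python
from dataclasses import dataclass


@dataclass
class SiteAssessment:
    """Hypothetical record of site findings in the 24 h before removal."""
    clinician_reported_phlebitis: bool = False
    pain_score: int = 0            # patient-reported pain/tenderness, 0-10
    removed_for_pain: bool = False
    erythema: bool = False
    swelling: bool = False
    palpable_cord: bool = False
    vein_streak: bool = False
    purulent_drainage: bool = False


def meets_phlebitis_definition(a: SiteAssessment) -> bool:
    """Apply the three alternative phlebitis criteria listed in the text."""
    if a.clinician_reported_phlebitis:
        return True
    if a.pain_score >= 2 and a.removed_for_pain:
        return True
    signs = [a.pain_score >= 1, a.erythema, a.swelling,
             a.palpable_cord, a.vein_streak, a.purulent_drainage]
    return sum(signs) >= 2


def all_cause_failure(phlebitis: bool, occlusion: bool = False,
                      infiltration: bool = False, extravasation: bool = False,
                      dislodgement: bool = False, infection: bool = False) -> bool:
    """Composite outcome: failure from any listed complication."""
    return any([phlebitis, occlusion, infiltration,
                extravasation, dislodgement, infection])


def multiple_insertion_attempts(attempts: int) -> bool:
    """More than a single needle-to-skin attempt for successful insertion."""
    return attempts > 1
```

For example, a participant with pain scored 1 and erythema meets the "two or more signs" criterion, whereas pain scored 1 alone does not.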

Data analysis

Analysis methods were informed by previous studies, which similarly analysed generic HRQoL measures in various clinical contexts [25, 26]. Data were imported into Stata (StataCorp, Release 13. College Station, TX: StataCorp LLC) to analyse the three unique instruments’ discrimination, responsiveness, and ceiling/floor effects. P-values less than 0.05 were considered statistically significant. No formal corrections for multiple comparisons were applied. No data were imputed; where data are missing, the altered sample sizes are provided.

Discrimination

Discrimination (used to measure construct validity) is defined as the ability of the instruments to accurately discriminate between clinical severity levels [25]; in this case, the relationship between patient-reported scores and PIVC failure and non-failure events (e.g., multiple insertion attempts versus a single attempt). This was analysed using a generalised linear regression (gamma) model (EQ5D-5L disutility scores only), an ordinary least squares regression model, an ordered logistic regression model, or a multinomial (polytomous) logistic regression model [27]. All regression models were multivariable, adjusting for clinically important patient/PIVC characteristics (hospital, age, gender, medical/surgical admission, PIVC type (integrated or non-integrated), device location, and gauge size).
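As a reading aid, the multivariable models described above share a common linear predictor, which can be written in a generic form. The notation is an assumption introduced here for illustration (it does not appear in the source): Y_i is the instrument score for participant i, F_i is the binary outcome of interest (all-cause failure or multiple insertion attempts), x_i is the vector of adjustment covariates, and g is the link function.

```latex
g\bigl(\mathbb{E}[\,Y_i \mid F_i, \mathbf{x}_i\,]\bigr)
  = \beta_0 + \beta_1 F_i + \boldsymbol{\gamma}^{\top}\mathbf{x}_i
```

Under this sketch, g is the identity for ordinary least squares, while the ordered and multinomial logistic models apply a logit to cumulative or category probabilities rather than to the mean. The link used for the gamma model is not stated in the text; a log link is a common choice but is only an assumption. The coefficients quoted in the Results would correspond to β₁ in each model.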

Responsiveness

Three statistics were used to assess the responsiveness (the absolute value of change over time, in direction and magnitude) [28] of the EQ5D-5L (only): (i) the effect size (ES), calculated as the mean EQ5D-5L score change (D) divided by the standard deviation (SD) at baseline; (ii) the standardised response mean (SRM), calculated as D divided by the SD of score changes; and (iii) the responsiveness statistic, calculated as D divided by the SD of D among stable participants (those with unchanged responses) [29]. The responsiveness ES was compared against standard thresholds (< 0.2, ‘trivial’; ≥ 0.2 but < 0.5, ‘small’; ≥ 0.5 but < 0.8, ‘moderate’; ≥ 0.8, ‘large’) [30].
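The three responsiveness statistics can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' Stata code: the function names are hypothetical, the sample (n − 1) standard deviation is assumed, the absolute mean change is used in line with the definition above, and the way stable participants are flagged is left to the caller because the text does not specify it.

```python
from statistics import mean, stdev


def _changes(baseline, follow_up):
    """Paired score changes (follow-up minus baseline)."""
    return [f - b for b, f in zip(baseline, follow_up)]


def effect_size(baseline, follow_up):
    """ES: mean change divided by the SD of baseline scores."""
    return abs(mean(_changes(baseline, follow_up))) / stdev(baseline)


def standardised_response_mean(baseline, follow_up):
    """SRM: mean change divided by the SD of the changes."""
    d = _changes(baseline, follow_up)
    return abs(mean(d)) / stdev(d)


def responsiveness_statistic(baseline, follow_up, stable):
    """Mean change divided by the SD of change among stable participants.

    `stable` is a list of booleans flagging stable participants; how they
    are identified is an assumption left to the caller.
    """
    d = _changes(baseline, follow_up)
    stable_d = [c for c, s in zip(d, stable) if s]
    return abs(mean(d)) / stdev(stable_d)


def classify_effect_size(es):
    """Label an ES using the thresholds quoted in the text."""
    if es < 0.2:
        return "trivial"
    if es < 0.5:
        return "small"
    if es < 0.8:
        return "moderate"
    return "large"


# Hypothetical utility scores for four participants (not study data).
baseline = [0.4, 0.5, 0.6, 0.7]
follow_up = [0.5, 0.5, 0.7, 0.7]
print(classify_effect_size(effect_size(baseline, follow_up)))
```

Each statistic divides the same numerator (D) by a different measure of spread, so the three can diverge when baseline scores are much more variable than the changes themselves.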

Ceiling and floor effects

Ceiling and floor effects were assessed for all instruments (EQ5D-5L, FACIT-TS-G, AHPEQS) to test whether the instruments could represent the construct being assessed, because such effects can prevent the identification of a possible genuine difference [31]. Established a priori, a ceiling effect was deemed present when ≥ 80% of responses selected the highest score of the item, and a floor effect when ≥ 80% of responses selected the lowest score of the item.
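The a priori rule for ceiling and floor effects lends itself to a short check per item. The sketch below is hypothetical in its function name and data layout; it assumes missing responses have already been removed (consistent with no imputation) and that the caller knows the lowest and highest score of each item's scale.

```python
from collections import Counter


def ceiling_floor(responses, lowest, highest, threshold=0.80):
    """Flag ceiling/floor effects for one item.

    responses: non-missing scores for the item
    lowest, highest: end points of the item's response scale
    threshold: proportion at which an effect is declared (0.80 per the text)
    """
    n = len(responses)
    if n == 0:
        raise ValueError("no responses for this item")
    counts = Counter(responses)
    prop_highest = counts[highest] / n
    prop_lowest = counts[lowest] / n
    return {
        "prop_highest": prop_highest,
        "prop_lowest": prop_lowest,
        "ceiling": prop_highest >= threshold,
        "floor": prop_lowest >= threshold,
    }


# Hypothetical yes/no item coded 1 = yes, 0 = no (not study data).
item = [1] * 84 + [0] * 16
result = ceiling_floor(item, lowest=0, highest=1)
print(result["ceiling"], result["floor"])  # True False
```

Applying the check item by item, rather than to summary scores, matches how the threshold is defined in the text.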

Results

Of the 685 participants who completed the EQ5D-5L at baseline, 526 (77%) completed a follow-up EQ5D-5L at PIVC removal. The FACIT-TS-G was provided as a supplementary instrument to 264 participants (50%), with the remaining 262 participants completing the AHPEQS instrument. Most participants were male (67%), with a mean age of 62 (SD 15.7) years (Table 1). A large majority of participants (95%) were admitted for emergent or planned surgery, and were from a single large tertiary hospital (94%); this was representative of the patients in the larger multi-centre RCT sample [17]. Devices were commonly 22 or 24 gauge (76%) and inserted in the forearm (71%). Patient and device characteristics were similar between the FACIT-TS-G and AHPEQS groups; there were no AHPEQS instruments completed at site two (small tertiary hospital). Overall, 20% (103/524) of participants completing follow-up instruments had experienced two or more PIVC insertion attempts; 30% (155/524) experienced all-cause PIVC failure, with phlebitis the most reported complication (n = 73, 14%).

Table 1 Participant and device characteristics

At baseline, more than half of participants reported (in relation to their PIVC) either ‘no’ or ‘slight’ problems on the EQ5D-5L with mobility, personal-care, pain/discomfort, or anxiety/depression (68%, 74%, 51%, and 79%, respectively), whilst slightly less than half reported ‘no’ or ‘slight’ problems with usual activities (49%) (Table 2). This was consistent at follow-up, with more than half of participants reporting either ‘no’ or ‘slight’ problems with mobility, personal-care, pain/discomfort, or anxiety/depression (67%, 73%, 61%, and 83%, respectively), and 49% for usual activities. The self-reported overall health score was 61.4 (SD 22.9) and 64.7 (SD 21.3) at baseline and follow-up, respectively. The mean EQ5D-5L utility score was 0.52 (95% confidence interval [CI] 0.49, 0.55) and 0.55 (95% CI 0.52, 0.58) at baseline and follow-up, respectively.

Participants completing the FACIT-TS-G instrument reported poorer outcomes/experiences than those reported on the EQ5D-5L. Compared to what was expected, participants rated the effectiveness and side effects (of their PIVC) as a little/a lot better in 56% and 44% of responses, respectively. Participants most frequently answered “completely agree” that they received assistance in evaluating the effects of their treatment, received treatment that was right for them, and were satisfied with the effects of treatment (49%, 66% and 57%, respectively). Among participants, 84% would recommend this treatment to others and 82% would choose it again. Overall care was reported as very good or excellent among 67% of respondents.

Responses to the AHPEQS instrument (regarding the overall hospital episode) demonstrated positive experiences, with participants responding that they ‘always’ or ‘mostly’ had their views and concerns listened to (89%), had their individual needs met (91%), felt cared for (97%), were involved in decision-making (82%), were kept informed (88%), believed staff communicated with each other (90%), received adequate pain relief (91%), and felt confident in the safety of their care (97%). Overall quality of treatment was reported as ‘very good’ or ‘good’ by 96% of all respondents. Despite this, 19% of participants reported unintentional harm because of their care, including emotional distress, physical harm, or both.

Table 2 Responses EQ5D-5L, FACIT-TS-G, AHPEQS

All-cause PIVC failure and multiple insertion attempts were associated with several individual items in the three instruments (EQ5D-5L, FACIT-TS-G, AHPEQS) (presented here as coefficients and p-values in text). Participants with all-cause failure were more likely to report increased mobility problems within the EQ5D-5L (utility −0.022, p = 0.038; disutility 0.02, p = 0.04; ‘slight problems’ 0.802, p = 0.004; ‘moderate problems’ 0.642, p = 0.035; and ‘unable to mobilise’ 0.713, p = 0.033) (Table 3) (detailed EQ5D-5L analysis results available in Supplementary Table 1). In the ordered logistic regression only, all-cause failure was significantly associated with increased problems with ‘usual activities’ (0.371, p = 0.042). Multiple insertion attempts were not associated with any EQ5D-5L items.

Table 3 EQ5D-5L discrimination results

With respect to items in the FACIT-TS-G, all-cause PIVC failure was significantly associated with lower effectiveness (-0.558, p < 0.001), satisfaction (-0.465, p < 0.001), likelihood to recommend PIVC to others (-0.17, p = 0.015), likelihood to choose PIVC again (-0.226, p = 0.002), and overall rating (-0.392, p = 0.003) (Table 4) (detailed FACIT-TS-G analysis results available in Supplementary Table 2). Additionally, all-cause failure was significantly associated with a reduced likelihood of participants reporting that doctors didn’t help them (“at all”) evaluate the effects of their PIVC (i.e., an increased likelihood of reporting that doctors helped them) (-1.891, p = 0.015). Participants with multiple PIVC insertion attempts were significantly more likely to report lower satisfaction “to some extent” with their PIVC (1.789, p = 0.002).

Table 4 FACIT-TS-G discrimination results

All-cause PIVC failure was significantly associated with participants experiencing unexpected ‘physical and emotional harm’ (1.577, p = 0.005) in the AHPEQS (Table 5) (detailed AHPEQS analysis results available in Supplementary Table 3). Additionally, all-cause failure was significantly associated with participants reporting higher involvement in ‘decision-making’ (0.575, p = 0.049) and greater ‘inter-staff communication’ (0.213, p = 0.046). Multiple PIVC insertion attempts were not associated with any AHPEQS items.

Table 5 AHPEQS discrimination results

Two FACIT-TS-G items (“would you recommend this treatment to others?” 84% and “would you choose this treatment again?” 82%), and one AHPEQS item (“I felt confident in the safety of my treatment and care” 84%) demonstrated a ceiling effect (Table 2). There were no floor effects observed.

The EQ5D-5L responsiveness ES was deemed trivial at 0.16 and 0.15 for all-cause PIVC failure and multiple attempts at PIVC insertion, respectively. The responsiveness statistic demonstrated similar results at 0.16 for both outcomes of interest; the EQ5D-5L SRM overall was 0.157.

Discussion

Our study successfully examined the usefulness of one generic HRQoL and two patient-reported experience instruments among patients with PIVCs. Several individual items demonstrated usefulness in discriminating the incidence of both multiple PIVC insertion attempts and all-cause PIVC failure; however, our investigation suggested the measures may not be useful as a whole. While both EQ5D-5L and FACIT-TS-G have previously been validated (and found reliable) in various other clinical contexts, much of this work has related to complex and/or chronic health conditions (involving multiple human systems) such as multiple sclerosis (FACIT-TS-G) [32], long-COVID [33], and cancer (FACIT-TS-G; EQ5D-5L) [34, 35]. Therefore, they may not be suitable to detect the nuances of small (nevertheless important) elements (or interventions) of healthcare interactions (a phenomenon previously identified in relation to the EQ5D-5L) [35]. Notably, the AHPEQS, as a comparatively new instrument, has undergone little validation to date [36].

Despite this, all-cause failure was correlated with significant differences in responses to several individual items, warranting further investigation. Participants with all-cause PIVC failure were more likely to report increased problems with ‘mobility’ and ‘usual activities’ in the EQ5D-5L; however, the overall EQ5D-5L utility score demonstrated trivial responsiveness. Consequently, the correlation between PIVC failure and these two items may be spurious, particularly if correlations of covariates are large (and self-predictive) (e.g., patient acuity and incidence of PIVC failure may increase in a collinear manner) [37].

Overall, the PREMs (AHPEQS and FACIT-TS-G) included seven items with significant results. These items at times aligned with themes identified in qualitative studies of patients’ lived experiences of PIVCs [4, 38]. For example, patients who experienced all-cause failure were more likely to report lower ‘satisfaction’ and reported “both physical and emotional harm” related to their PIVCs (rather than physical or emotional harm, individually). However, for both FACIT-TS-G and AHPEQS, the direction of effect was reversed for several items, casting doubt on their usefulness. For example, participants with all-cause failure were more likely to report that doctors helped them in “evaluating the effects of their PIVC” (FACIT-TS-G), and reported higher involvement in ‘decision-making’ and greater ‘inter-staff communication’ (AHPEQS). This suggests patients who experienced PIVC failure were more likely to discuss this with their treating clinicians, and to note increased discussion between staff members related to it.

Participants who experienced multiple PIVC insertion attempts were significantly more likely to report the side effects of their PIVC being ‘a little worse’ and lower satisfaction (on the FACIT-TS-G only). No other statistically significant results were noted. In contrast, multiple insertion attempts (and resulting pain, discomfort, and anxiety) are a common issue of high importance identified in recent qualitative studies [4, 38]. Use of such qualitative data to support psychometric validation is essential when determining construct validity [39]. Thus, largely, all three instruments were inadequate for use in this context.

Three items (from a total of 16 across the three instruments) demonstrated a ceiling effect, indicating a generally high level of variability among most items analysed. Of these, two FACIT-TS-G items were yes/no questions (“would you recommend this treatment to others” and “would you choose this treatment again”), which suggests they were meaningfully eliciting a response to identify patients with problems.

Limitations

Limitations of this study include the large attrition of participants between completion of the EQ5D-5L at baseline (time 0 [T0], device insertion) and at time 1 (T1; device removal, or study completion); however, the impact of attrition is likely to be low given the similarities in participant and device characteristics, and outcomes, between the T0 and T1 samples. Whilst the above testing of the performance of the selected instruments is consistent with previously published assessments and is limited by the available data collected alongside a randomised controlled trial, it is noted that the above analyses do not represent all elements considered in the validation of an instrument (see for example COSMIN). However, whilst including a broader set of performance metrics would provide a more complete understanding of the above instruments’ validity within this context, the limitations and capabilities of these instruments as identified within this restricted set of assessments remain.

Additionally, findings are limited by the use of PIVC-contextualised responses for the EQ5D-5L and FACIT-TS-G but not for the AHPEQS, for which patients were asked to provide responses related to their whole hospital experience. AHPEQS responses were also limited to one site (Site 1), resulting from the timing of parent trial recruitment periods. Furthermore, as this study was limited to two metropolitan hospitals in Queensland, Australia, findings may not be externally generalisable. Despite this, the large number of responses elicited, and the high number of outcomes of interest (PIVC failure and multiple insertion attempts), enabled a meaningful analysis. We believe these findings may be useful to clinicians and researchers utilising HRQoL measures and PREMs in a PIVC-specific context in the future.

Conclusions

Initial investigation of the HRQoL and PREM instruments assessed in this secondary analysis suggests these tools are inadequate in the context of PIVCs among hospitalised patients. Several individual items demonstrated significant results in our analysis, which correlated with similar themes identified in recent qualitative studies. Future purpose-built PREM and HRQoL measures, if developed, should consider inclusion of these items, in addition to robust qualitative assessment to ensure their relevance, comprehensiveness and comprehensibility.