Key Points for Decision Makers

Previous economic studies of traditional excisional surgery and stapled haemorrhoidopexy were based on limited quality-of-life data and suggested a shorter operation time for stapled haemorrhoidopexy than for traditional excisional surgery.

The results of this study show that traditional excisional surgery costs less and is associated with higher quality of life than stapled haemorrhoidopexy.

Given the current financial status of the UK national health service, commissioners of healthcare may consider being more prescriptive about procedures being offered for the surgical treatment of haemorrhoids.

1 Introduction

Haemorrhoids occur when the tissues of the distal rectum and anal canal prolapse into the canal because of laxity of the surrounding connective tissues and engorgement of the blood vessels. Symptoms from haemorrhoids include bleeding, pain, prolapse and peri-anal itch, all of which are common within the general population [1]. The widely adopted Goligher system [2], which grades haemorrhoids by their appearance and degree of prolapse, was used:

  • Grade I: The anal cushions bleed but do not prolapse.

  • Grade II: The anal cushions prolapse through the anus on straining but reduce spontaneously.

  • Grade III: The anal cushions prolapse through the anus on straining or exertion and require manual replacement into the anal canal.

  • Grade IV: The prolapse stays out at all times and is irreducible.

The initial management of haemorrhoids is community based, and persistent symptoms are treated with outpatient procedures such as rubber band ligation (RBL) for lower-grade haemorrhoids, whereas surgical interventions are often reserved for higher grade haemorrhoids or when banding has been unsuccessful.

Given the prevalence of the condition, the management of haemorrhoidal disease continues to have considerable workload and cost implications for the UK national health service (NHS), with approximately 38,000 haemorrhoidal procedures being performed as hospital day-case or inpatient admissions in England in 2014–2015 [3]. Over the last two decades, understanding of the anatomy of haemorrhoids has improved, leading to the introduction of new surgical technologies into clinical practice. In 2009, two main surgical treatments for haemorrhoids were available: traditional (or excisional) surgical haemorrhoidectomy (TH) and stapled haemorrhoidopexy (SH). A third treatment, haemorrhoidal artery ligation (HAL), had been introduced but was not in widespread use.

TH involves excision of the haemorrhoidal cushions and has generally been advocated for larger symptomatic haemorrhoids (grades III and IV). SH was first developed by Longo at the end of the last millennium [4]. In contrast to the traditional approach, not all the haemorrhoidal tissue is removed; instead, the abnormally enlarged tissue is removed and the remaining tissue is repositioned into its normal anatomic position. This results in relocation of the cushions and interruption of the feeding arteries. Its potential advantages over traditional surgery included a reduction of operating time, hospital stay, time to return to work and postoperative pain [5]. These features, compared with traditional haemorrhoid surgery, made it attractive to patients and healthcare providers. Nevertheless, uncertainties around complication rates, recurrence of symptoms and costs preclude its widespread use across the NHS.

The economic evaluation addressed the question “what is the relative cost effectiveness, assessed in terms of incremental cost per quality-adjusted life-year (QALY), and net benefits of SH and TH?”. The cost-effectiveness analysis followed the National Institute for Health and Care Excellence (NICE) reference case [6] and the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) recommendations on conducting economic evaluations alongside clinical trials [7].

2 Methods

The economic evaluation was undertaken alongside a multicentre randomised controlled trial comparing SH and TH. This trial was registered with the ISRCTN registry (number ISRCTN80061723). The study was approved by the North of Scotland Research Ethics Committee on 18 June 2010 (reference number 10/20802/17). In brief, 777 participants (389 to receive SH and 388 to receive TH) with grade II–IV haemorrhoids who had not previously undergone SH or TH were recruited from 32 UK hospitals between January 2011 and August 2014 and followed up for 24 months. Sex, haemorrhoid grade and EQ-5D-3L (EuroQol 5 dimensions, 3 levels) baseline scores were included as minimisation variables, thereby ensuring balance between the two treatment groups for these covariates. Median age of the patients was 50 years, and 51% were male. Over 60% of patients had grade III haemorrhoids, and 35.8% of participants in the SH arm and 30.1% in the TH arm had received previous haemorrhoid treatment. Details of the clinical results and study methodology are available elsewhere [8]. The economic analysis was undertaken from the perspective of the UK NHS, and costs are expressed in pounds (£) for the financial year 2016 [8]. Costs and benefits incurred in the second year were discounted at a rate of 3.5% per annum [6]. Details of the methods used to derive resource use are described below.

2.1 Identification of Resources and Measurement of Costs

We considered three broad areas of resource use: intervention, secondary care and primary care. Use of the intervention resource was recorded on a per patient basis. The resources used to provide surgery were established by consulting with relevant staff at participating centres (surgeons, theatre nurses, business managers) and members of the study team to elicit information on consumables such as the type of stapler used, frequency of use and other consumables used during surgery, such as surgical trays, staff mix of the surgical team (e.g. the grades of the operating surgeon, anaesthetist and nurses and number of nurses). The staplers were single use. In addition, operative details and procedure duration were collected on the trial case report forms (CRFs). CRF data were collected in the day case clinic on the day of the operation and at 6 weeks. CRFs were completed in clinic for those who attended the review clinic and from patient notes for those who did not attend.

Length-of-stay information was collected for each participant through CRFs by recording the dates of admission and discharge. For the initial intervention, cost estimation focussed on those resources that differed between the two interventions, i.e. we assumed there would be no difference in time spent in recovery or time on the ward following the procedure (for those managed as day cases) as patient lists for day cases are planned such that all patients are able to leave before the day-case clinic closes. Information was collected for those who were admitted. The use of subsequent care, such as inpatient stay (duration of stay), reoperation or other surgical interventions (such as SH, TH) and outpatient visits over the study follow-up period, was obtained from the CRFs (6 weeks) and patient questionnaire (12 and 24 months). The questionnaires used to collect subsequent resource use data were developed by the trial team to gather the relevant information. All primary care resource use, such as general practice doctor and nurse contacts and medications prescribed to treat haemorrhoids, was obtained from the participant questionnaires administered at 12 and 24 months. Self-reported subsequent resource utilisation data were verified by contacting sites for all but six cases (time constraints meant four cases were not verified; in two cases, the sites did not respond to queries), and data were still included if the site did not confirm an intervention was carried out.

Costs of the health service utilisation were estimated by combining the amount of resource used with unit costs of this resource use. Unit costs were based on study-specific estimates in combination with data from standard sources. Unit costs for the consumables used in stapling were obtained through personal communication with sites using the consumables or from published price lists. Table 1 details the unit costs used, the source of the estimate and any assumptions used to derive them.

Table 1 Unit costs for resources used in the within-trial economic analysis

Unit costs for outpatient visits were obtained from the national reference costs [9]. Unit costs for general practitioner visits were obtained from the Personal Social Services Research Unit [10] unit costs of community care. The unit costs of anaesthetic drugs such as propofol used in the operation and post-surgery were derived from the British National Formulary [11]. For each participant, the number of visits was multiplied by the appropriate unit cost. These costs were summed to produce a total cost per patient. The unit cost of the type of stapler used in the intervention was based on the cost of the stapler specified for each patient.
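The costing step described above (quantity of each resource multiplied by its unit cost, then summed per patient) can be sketched as follows. This is an illustrative sketch only: the item names and unit costs are hypothetical placeholders, not the values reported in Table 1, and the function name `total_cost` is introduced here for illustration.

```python
# Hypothetical unit costs in GBP (placeholders, NOT the Table 1 values).
UNIT_COSTS = {"outpatient_visit": 100.0, "gp_visit": 40.0, "nurse_visit": 10.0}


def total_cost(resource_use, stapler_cost=0.0, unit_costs=UNIT_COSTS):
    """Sum quantity x unit cost over all items, plus the patient's stapler cost."""
    return stapler_cost + sum(
        unit_costs[item] * quantity for item, quantity in resource_use.items()
    )


# Hypothetical patient: two outpatient visits, one GP visit, one stapler.
patient = {"outpatient_visit": 2, "gp_visit": 1, "nurse_visit": 0}
print(total_cost(patient, stapler_cost=300.0))  # 300 + 2*100 + 1*40 + 0*10
```

Passing the stapler cost per patient mirrors the statement that the stapler cost was based on the stapler specified for each patient; a TH patient would simply use the default of zero.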

2.2 Quality of Life

Effectiveness in the economic analysis was measured in terms of QALYs. The EQ-5D-3L [12] generic quality-of-life instrument was administered to all study participants at baseline, 1 week, 3 weeks, 6 weeks, 12 months and 24 months, and UK index values were used. Quality-of-life data were also collected using the 36-item MOS Short Form health questionnaire (SF-36) [13] at baseline, 6 weeks, 12 months and 24 months. These data were converted into a SF-6D utility index using a published algorithm [14].
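The text does not spell out how utility scores were converted into QALYs. A common approach, assumed here, is the area under the utility curve with linear interpolation between assessments (the trapezium rule). The sketch below uses the EQ-5D-3L assessment schedule given above and, as a simplification, discounts the whole second-year interval at 3.5%; the utility values are hypothetical.

```python
# Assessment times in years: baseline, 1, 3 and 6 weeks, 12 and 24 months.
TIMES = [0.0, 1 / 52, 3 / 52, 6 / 52, 1.0, 2.0]


def qalys(utilities, times=TIMES, rate=0.035):
    """Area under the utility curve (trapezium rule), discounting year two."""
    total = 0.0
    for i in range(len(times) - 1):
        width = times[i + 1] - times[i]
        area = width * (utilities[i] + utilities[i + 1]) / 2
        if times[i] >= 1.0:  # interval lies in the second year
            area /= 1 + rate
        total += area
    return total


# Hypothetical utility scores for one participant (not trial data).
print(round(qalys([0.80, 0.60, 0.75, 0.85, 0.90, 0.90]), 3))
```

Under these assumptions, a participant in full health throughout would accrue 1 QALY in year one and 1/1.035 QALYs in year two, which gives a quick sanity check on the function.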

2.3 Missing Data

Missing data can lead to bias in economic evaluations; this is especially true of resource use and quality-of-life data reported in participant-completed questionnaires. The amount of missing EQ-5D-3L and resource use data varied over time. For example, 210 (27%) patients were missing EQ-5D-3L data at 12 months, 375 (48%) patients were missing QALY data at 24 months, 345 (44%) patients were missing resource use and therefore cost data at 12 months, and 421 (54%) were missing total cost data at 24 months. The amount of missing data was similar in both groups. One reason for missing data could be that the data were collected at several time points; a patient may have returned baseline, 1-week, 3-week, 6-week, and 24-month questionnaires but not the 12-month questionnaire. Briggs et al. [15] concluded that imputing the missing data is preferable to a complete or available case analysis.

2.4 Data Analysis

As the amount of data missing for some of the observations was >5% (Ramsey et al. [7]), the primary economic analysis was based on imputation of missing data and included all participants as randomised, irrespective of the treatment received. The imputation analysis was performed using Stata’s multiple imputation (MI) procedure [16]. Components of cost data were imputed based on linear regression models adjusted for the minimisation variables, which were centre, grade of haemorrhoidal disease (II, III or IV), baseline EQ-5D-3L score and sex. Missing utility values were imputed using predictive mean matching, accounting for the five closest estimates. Chained equations were used for the imputations. The imputation procedure generated ten plausible alternative imputed datasets, which was found to be sufficient to provide stable estimates. Analysis of incremental costs and outcomes was undertaken across the ten imputed datasets and combined to generate one imputed estimate of incremental costs and QALYs. Bootstrapping was conducted to calculate confidence intervals for cost-effectiveness ratios. The results of the differences in costs and QALYs were plotted on cost-effectiveness acceptability curves (CEACs). All data analyses were conducted using Stata version 14™ software.
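The combination of estimates across imputed datasets is not described in detail. The standard approach for multiply imputed data is Rubin's rules, which the sketch below assumes: the pooled estimate is the mean of the per-dataset estimates, and the pooled variance adds the average within-imputation variance to an inflated between-imputation variance. The function name `pool_rubin` and the input values are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool per-imputation estimates and standard errors with Rubin's rules."""
    m = len(estimates)
    pooled = mean(estimates)
    within = mean([se ** 2 for se in std_errors])  # average sampling variance
    between = variance(estimates)                  # sample variance across imputations
    total_variance = within + (1 + 1 / m) * between
    return pooled, total_variance ** 0.5


# Hypothetical incremental cost estimates (GBP) from three imputed datasets.
estimate, standard_error = pool_rubin([330.0, 340.0, 341.0], [44.0, 43.0, 45.0])
print(round(estimate, 1), round(standard_error, 1))
```

With ten imputed datasets, as used in the analysis, the same function would be called with ten estimates and ten standard errors.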

We used a generalised linear model (GLM) to explore the skewness of data. The GLM allows for heteroscedasticity by selecting an appropriate distributional family for the data [17]. The family offers alternative specifications to reflect the relationship between the mean and the variance of the estimates under consideration. The most appropriate distributional family was selected by (1) performing a modified Park test, which identified two potentially viable distributional families for costs, namely Gaussian or Gamma; and (2) consulting the Akaike information criterion (AIC), which supported the use of a Gaussian model with an identity link as having the lowest AIC score (15.12) and the most appropriate model fit. A standard ordinary least squares (OLS) model was identified as the most appropriate and was applied to the analysis of incremental QALY gains. All analyses were conducted using robust standard errors. The primary economic analysis presents estimates of the incremental cost per QALY of SH versus TH. The incremental cost-effectiveness ratio (ICER) can be compared against the benchmark willingness-to-pay thresholds for cost effectiveness in the NHS context of £20,000–30,000 per QALY gained, as applied by NICE [18]. Analysis was also undertaken using the number of recurrences of haemorrhoids as an outcome.
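The decision rule behind the ICER can be sketched as below. The ICER is the difference in mean costs divided by the difference in mean QALYs, and it is only informative when one option is both more costly and more effective; otherwise one option dominates. The incremental net monetary benefit (threshold × incremental QALYs − incremental cost) is shown as well because it stays well defined in every case. The function name `compare` and the example figures are hypothetical, not trial results.

```python
def compare(delta_cost, delta_qaly, threshold=20000.0):
    """Return (label, ICER or None, incremental net monetary benefit).

    delta_cost and delta_qaly are new treatment minus comparator.
    """
    nmb = threshold * delta_qaly - delta_cost
    if delta_cost == 0 and delta_qaly == 0:
        return "equivalent", None, nmb
    if delta_cost >= 0 and delta_qaly <= 0:
        return "dominated", None, nmb  # costs more (or the same), no health gain
    if delta_cost <= 0 and delta_qaly >= 0:
        return "dominant", None, nmb   # costs less (or the same), no health loss
    return "trade-off", delta_cost / delta_qaly, nmb


# Hypothetical: 500 GBP more for 0.125 extra QALYs, i.e. 4000 GBP per QALY.
print(compare(500.0, 0.125))
# Hypothetical: more costly and less effective, so no ICER is reported.
print(compare(300.0, -0.05))
```

In the trade-off case, the new treatment would be considered cost effective at a given threshold when the ICER is below it (new treatment more effective) or, equivalently, when the net monetary benefit is positive.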

2.5 Deterministic Sensitivity Analyses

The presentation of CEACs and scatter plots illustrates some of the sampling uncertainty in the data; however, other assumptions surrounding the most appropriate discount rate and analysis models undertaken may create additional uncertainty that is not captured in the presented CEACs. Sensitivity analysis was applied to assess the robustness of the results to realistic variations in the levels of the underlying data and also alternative assumptions. The analyses were conducted using QALYs derived from the SF-36, complete case data and by varying the price of staplers. The impact of using MI to impute missing data was also explored by running a complete case analysis (including only participants with complete cost and QALY data). Sensitivity analysis was also undertaken to explore the cost of staplers as the study was a pragmatic study and the participating centres used the available staplers. Subgroup analyses explored the possible treatment effect modification of clinically important factors (haemorrhoidal grade and sex) through the use of treatment by factor interaction.

3 Results

3.1 Resource Use

On average, the use of intervention resources was similar across both arms. The length of hospital stay (0.4 days) was the same in each group. The number of further interventions was low over the 24-month period. For example, at the 6-week time point, eight participants had undergone TH: six in the SH arm and two in the TH arm. Three (SH) and one (TH) participants had undergone SH, whereas four (SH) and six (TH) had undergone RBL. Five (SH) and four (TH) participants had received further treatment for skin tags. Further non-surgical intervention resource use was similar for both arms at 6 weeks post-treatment. However, the number of participants receiving further interventions was higher for SH than for TH at both 12 months (54 vs. 31) and 24 months (39 vs. 19).

Table 2 provides the details of average resource use costs and cost differences between the two randomised groups based on the available data. The estimates reported in terms of NHS costs (Table 2) incurred after the participants received the treatments show that the total mean cost per patient was £922 ± standard deviation (SD) 587 in the SH arm and £621 ± 582.98 in the TH arm. There was a statistically significant difference in the (adjusted) total mean costs (£323; 95% confidence interval [CI] 237–409) of the interventions.

Table 2 Mean UK national health service costs (£) and adjusted mean difference for study interventions

Low resource use meant that total cost data were highly skewed to the right: most participants had low or no costs, but a few had high costs. Although costs of resource use measured during the intervention, such as staff time, anaesthetic used and admissions, were similar in both arms, the mean cost of interventions was £273 higher in SH (95% CI 240–306) because of the additional costs of the staplers (Table 2). The costs reported at the 6-week visit were £25 lower in SH (95% CI −68 to 17), but this difference was not statistically significant. The other patient-reported costs between 6 weeks and 12 months were £33 higher for SH (95% CI −2 to 68), again not statistically significant, whereas the total 12-month costs were significantly higher for the SH arm than for the TH arm (£309; 95% CI 238–380). The 12- to 24-month costs were significantly higher for the SH arm (£48; 95% CI 13–82) because it had more admissions and outpatient visits. Total mean costs over the 24-month follow-up period were significantly higher for the SH arm (£323; 95% CI 237–410) than for the TH arm. The incremental differences are based on regression models (GLM [costs] and OLS [QALYs]), with adjustments for baseline covariates, including baseline EQ-5D-3L score.

3.2 Quality-Adjusted Life-Years

The EQ-5D-3L scores for study intervention at baseline, 1 week, 3 weeks, 6 weeks, 12 months and 24 months are shown in Table 3. They were higher for SH at 1 week and 3 weeks but lower at 6 weeks, 12 months and 24 months. From these data, we estimated the mean QALYs over the 2 years as 1.676 ± SD 0.384 for the SH arm and 1.738 ± SD 0.334 for the TH arm.

Table 3 Quality of life (EQ-5D and quality-adjusted life-year) by study intervention

The mean difference in EQ-5D-3L scores after adjusting for minimisation variables and baseline EQ-5D-3L scores was statistically significantly higher for SH at 1 week (0.135; 95% CI 0.082–0.188) and 3 weeks (0.05; 95% CI 0.008–0.091) but was significantly lower at 12 months (−0.064; 95% CI −0.095 to −0.033) and 24 months (−0.046; 95% CI −0.079 to −0.013). The QALY difference was −0.071 (95% CI −0.127 to −0.016), which is statistically significantly lower for SH than for TH.

3.3 Cost-Utility Results

As mentioned, the base-case analysis was based on multiple imputed data. The estimated costs and QALYs are reported in Table 4. Total costs were higher for the SH group: mean difference £337 (95% CI 251–423). Total QALYs were lower in the SH group: mean difference −0.070 (95% CI −0.127 to −0.011). Both these differences were statistically significant.

Table 4 Estimation of cost-utility analysis

The CEAC generated from the base-case cost-effectiveness analysis (Fig. 1) shows there is zero probability of SH being cost effective at either the £20,000 or the £30,000 willingness-to-pay threshold. The scatter plot graph (Fig. 2) shows the point estimate and the distribution of the joint differences in costs and effects. The ICER point estimate and almost all of the bootstrapped estimates fall in the north-west quadrant of the cost-effectiveness plane, suggesting SH is significantly more costly and less effective than TH.
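A CEAC of this kind is typically built from bootstrap replicates of the incremental costs and QALYs: at each willingness-to-pay threshold, the probability of being cost effective is the share of replicates with a positive incremental net monetary benefit. The sketch below assumes that construction; the function names and the four replicates are hypothetical and far fewer than a real analysis would use.

```python
def prob_cost_effective(delta_costs, delta_qalys, threshold):
    """Share of replicates where threshold * delta_QALY - delta_cost > 0."""
    wins = sum(
        1
        for dc, dq in zip(delta_costs, delta_qalys)
        if threshold * dq - dc > 0
    )
    return wins / len(delta_costs)


def ceac(delta_costs, delta_qalys, thresholds):
    """Return (threshold, probability) pairs for plotting the curve."""
    return [(t, prob_cost_effective(delta_costs, delta_qalys, t)) for t in thresholds]


# Hypothetical bootstrap replicates (new treatment minus comparator).
costs = [300.0, 350.0, 320.0, 10.0]
qalys_diff = [-0.05, -0.08, 0.001, 0.01]
for threshold, probability in ceac(costs, qalys_diff, [0, 20000, 30000]):
    print(threshold, probability)
```

Replicates in the north-west quadrant (higher cost, fewer QALYs) have a negative net benefit at every non-negative threshold, which is why a cloud concentrated there produces a curve that stays near zero.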

Fig. 1 Cost-effectiveness acceptability curve showing the probability that SH and TH are cost effective at different willingness-to-pay thresholds

Fig. 2 Cost-effectiveness scatter plot illustrating the distribution of differences in costs and QALYs

The results of the cost-effectiveness analysis based on the number of recurrences averted were similar to those of the base case (Table 3). On average, significantly more recurrences occurred in the SH arm (0.18; 95% CI 0.120–0.245), and SH was more costly than TH; therefore, SH was dominated by TH.

3.4 Sensitivity Analysis

The results of the analysis considering the complete cases (based on participants with both cost and QALY data) are presented in Table 4. On average, SH cost £288 (95% CI 190–386) more than TH and generated 0.060 fewer QALYs (difference −0.060; 95% CI −0.113 to −0.007). The results of the complete cases were broadly similar to those of the base-case analysis: on average, SH had higher costs and lower QALYs than TH. The chance that SH might be considered to be cost effective at the £30,000 threshold was 1.7%.

The utility and QALY scores derived from the SF-36 followed a similar pattern to those of the EQ-5D-3L. On average, the 6-week SH utility score was 0.015 lower than that of TH, but this difference was not statistically significant. However, it was lower at 12 months (−0.040) and at 24 months (−0.034), and these differences were statistically significant (p < 0.05). QALYs were 0.04 lower for SH than for TH (difference −0.04; 95% CI −0.069 to −0.013). There is only a 0.1% chance of SH being considered cost effective at willingness-to-pay thresholds of £20,000 or £30,000 (based on QALYs derived from the SF-6D).

Further analysis was undertaken varying the cost of the staplers and using the least expensive of those used in the study (Table 4). The QALY difference remained the same (−0.070 [0.027]; 95% CI −0.127 to −0.011). The results based on using the least expensive stapler were similar to those of the base-case analysis. For costs of £125 or above, SH cost significantly more and had lower QALYs than TH. SH remained marginally more costly than TH unless the cost of the stapler fell to zero. The results of the analysis that assumed no additional cost from staplers suggested SH had a 0.1% chance of being considered cost effective at both the £20,000 and the £30,000 threshold. Results of the analysis incorporating subgroup interaction terms relating to the sex and grade of haemorrhoidal disease were not statistically significant.

4 Discussion

The results of the base-case analysis suggested that, on average, SH cost £337 (95% CI 251–423) more and generated 0.07 fewer QALYs (difference −0.07; 95% CI −0.13 to −0.01) than TH. The cost-utility analysis suggested there was no chance of SH being considered cost effective at £20,000 or £30,000 willingness-to-pay thresholds. These results are robust; none of the sensitivity analyses altered the conclusions that SH always cost more and generated fewer QALYs than TH.

The QALYs derived from the two different instruments (EQ-5D-3L and SF-6D) were similar. The benefit of reduced short-term post-operative pain experienced by patients in the SH arm was reflected in the EQ-5D-3L utility scores at 1 and 3 weeks (statistically significantly higher) and at 6 weeks (not significantly higher), but any gains in quality of life were offset by the higher rate of recurrence at 12 and 24 months. Although the SF-6D results were lower for the SH arm at 6 weeks, the difference was not significant. The 12- and 24-month SF-6D utility scores were similar to those of the EQ-5D-3L.

The major driver for the increased cost of SH was the cost of staplers, but sensitivity analysis conducted varying the cost of staplers indicated the cost-effectiveness conclusions were not particularly sensitive to this parameter because of the superior QALY estimates for TH. The study was a pragmatic study, and research sites were allowed to use their choice of stapler type. The stapler cost analyses suggested the cost of the stapler would have to fall to zero for cost differences to become positive for SH, and even in this instance the costs would not be statistically significantly different. The distribution of the estimates of costs and QALY differences would lie in the south-west quadrant of the cost-effectiveness plane (SH would cost less but have lower QALYs than TH), where cost savings do not outweigh associated QALY losses.

A key strength of this study was that it was an economic evaluation undertaken alongside a large randomised controlled trial to compare SH and TH. It was a multicentre study with centres across the UK that followed up participants for 24 months. This suggests the results could be generalisable to all patient populations seeking treatment for grade II–IV haemorrhoids. The number of patients recruited ensured considerable confidence in the conclusions drawn from the trial-based cost-effectiveness analyses. The existing economic evaluations of these treatments undertaken alongside a randomised controlled trial include low numbers of participants and shorter follow-up times [19, 20]. The other economic evaluations [21, 22] were conducted within a modelling framework and used data from small studies.

This study also captured the effect of these treatments on patient quality of life, using the EQ-5D-3L to measure the effect of pain or complications on quality of life post-surgery over several time points, particularly early time points (1, 3 and 6 weeks) where they were anticipated to affect quality of life. Short-term quality-of-life scores were better for SH, reflecting lower rates of pain in the immediate post-operative period; however, the longer-term scores were better for TH, which had fewer residual haemorrhoidal symptoms, recurrences and re-interventions. The quality-of-life estimates based on the additional instrument, SF-6D, were similar in direction and magnitude to those based on the EQ-5D-3L instrument.

One of the limitations of the economic analysis was the amount of missing data. This could be because the population was based on working age patients and haemorrhoids are a chronic condition that may be considered by some to be a sensitive condition. However, the amount of missing data was similar in both arms. The MI method, which assumed the data to be missing at random, was conducted to address this challenge, and the results of the analyses from the imputed dataset and complete case were similar. The conclusion that, on average, SH cost more and had fewer QALYs than TH remained the same, irrespective of the approach used.

The results of the economic analysis are inconsistent with those published in Burch et al. [21], who reported that TH and SH had similar costs because the staple gun costs in the SH arm were offset by hospital stay savings in the TH arm. Our results suggest that the operation time and time spent in hospital were similar in SH and TH, so there were no cost savings in inpatient stay. The cost differences in our results were driven by the additional cost of stapler guns. Burch et al. [21] reported that the QALYs were similar in both arms. Ho et al. [19] reported that the total costs incurred for TH at 1 year were less (£9210.17 [16.85] vs. £1283.09 [31.59]; p < 0.005). Thaha et al. [20] reported that the extra mean cost (£312.51) incurred for SH was due to additional costs for the stapler. Ribarić et al. [22] reported that an incremental cost of £33 was incurred for SH after 1 year. These cost results were similar to ours, as they indicated that SH cost more than TH.

The results of our study indicated that, for SH, the quality-of-life gains experienced post-surgery were less than the quality-of-life reductions in the 24-month follow-up period. Burch et al. [21] reported no difference in the quality-of-life measures for the two treatments. The quality-of-life results in our study were similar to those of Ribarić et al. [22], Ho et al. [19] and Thaha et al. [20], who all reported that SH was less effective than TH. However, all of the studies indicated a higher rate of prolapse and re-intervention for prolapse in the SH group, which was reflected in the higher follow-up costs and lower QALYs experienced in the SH group in our study.

5 Conclusions

The analysis suggested that SH cost more and was less effective than TH. These results were supported by the sensitivity analyses and the fact that secondary clinical outcomes such as tenesmus were more prevalent in the SH arm (p < 0.001) and more participants reported recurrence of haemorrhoids at both 12 and 24 months. Therefore, TH is a superior surgical treatment for the management of grades II–IV haemorrhoids when compared with SH in terms of both clinical and cost effectiveness. Robust economic data on haemorrhoid surgery are scarce; however, if the results of this study are adopted into practice, substantial annual cost savings in publicly funded health services could be achieved. Given the current financial status of the NHS, commissioners of healthcare may consider being more prescriptive about procedures being offered for surgical treatment of haemorrhoids.