Introduction

There is substantial unexplained geographical and surgeon-to-surgeon variation in rates of surgery [1, 2, 19, 21, 29]. The variation mainly pertains to discretionary procedures rather than clinical decisions that are constrained to a narrow range of treatment options or urgent and emergent surgical needs [2, 21]. Differences in illness burden, diagnostic and screening practices, and patient attitudes explain only a small part of this variation [2, 14]. Physician attitudes and beliefs about indications for surgery seem to explain more of it [24].

One would expect surgeons to treat patients and themselves similarly based on best evidence and accounting for patient preferences. This golden rule or ethic of reciprocity is frequently called on by patients when discussing treatment options: “doctor, if you were in my position, what would you do?” [16]. Understanding how surgeons make decisions, and knowing more about their confidence levels with regard to these decisions, might improve our understanding of treatment variation. Confidence level is the degree to which one believes that his or her decision is the most appropriate. Physicians might be more uncertain (ie, less confident) about the best treatment option when they are not fully informed about a patient’s circumstances, expectations, and considerations, which in turn might result in a recommendation for treatment that does not match the patient’s preferences and values [28].

Therefore, we aimed to assess whether surgeons would recommend similar treatment for their patients as they would choose for themselves and whether they make this decision with the same confidence.

Our primary null hypothesis was that surgeons in general recommend the same treatment for their patients as they do for themselves and with the same confidence. Specifically, we asked the following questions: (1) Are surgeons more likely to recommend surgery when choosing for a patient than for themselves? (2) Are surgeons less confident in deciding for patients than for themselves?

Materials and Methods

Study Design, Setting, and Participants

This cross-sectional survey study was approved by our institutional review board and was conducted among members of the Science of Variation Group (SOVG); the SOVG aims to study variation in the definition and treatment of human illness without financial incentives. All members with emails in the SOVG database (n = 790) were invited to complete a survey evaluating variation in treatment recommendation for upper extremity conditions [7, 9, 12, 13]. Of those, 283 (36%) responded and participated in this randomized study. Because most of the members with emails in the SOVG database are not active participants, the rate of participation is not a true response rate. We excluded physicians (n = 12) who were in training for orthopaedic surgery; 271 participants remained. Of the 271 participants, 254 completed all questions and were kept for analysis. Participants specialized in orthopaedic, trauma, or plastic (hand-wrist) surgery. Areas of interest of the included surgeons were hand-wrist, shoulder-elbow, trauma, or general orthopaedic surgery (Table 1).

Table 1 Baseline characteristics of participating surgeons per group (n = 254)*

The survey was developed in an online survey tool, SurveyMonkey (Palo Alto, CA, USA). Invitations to participate were sent on December 15, 2014. We sent reminders at 2 and 3 weeks. Participants completed two questions for 21 fictional cases: (1) What treatment would you choose/recommend: operative or nonoperative? (2) On a scale from 0 to 10, how confident are you about this decision (0 = not at all confident; 10 = very confident)? The confidence of participants with regard to the treatment decision measures the degree to which one believes that his or her decision is the right one.

We included eight trauma (displaced midshaft clavicle fracture, proximal humerus fracture, distal radius fracture, greater tuberosity fracture, scaphoid fracture, distal biceps rupture, proximal biceps rupture, and lateral clavicle fracture) and 13 nontrauma (small rotator cuff defect [Fig. 1], ganglion cyst, triangular fibrocartilage complex defect, trapeziometacarpal arthrosis, scapholunate ligament insufficiency, mucous cyst, wrist arthrosis, Kienböck disease [two cases], De Quervain tendinopathy, carpal tunnel syndrome, pronator syndrome, and radial tunnel syndrome) cases. All scenarios, except De Quervain tendinopathy and the three nerve entrapment syndromes, contained clinical, radiographic, or MRI images (Appendix 1 [Supplemental materials are available with the online version of CORR®.]). For the trauma cases we explained that there were no signs of neurovascular damage. Participants were asked to assume sufficient symptoms and impact on daily activities to seek specialist attention for every case. For the patient cases, we explained that the patient worked as a professional.

Fig. 1
Group 1: (1) If you, with no comorbidities, have this small rotator cuff defect: What treatment would you prefer: operative or nonoperative? (2) On a scale from 0 to 10, how confident are you about this decision? Group 2: A [xx]-year-old [female/male] with no comorbidities has this small rotator cuff defect. [She/He] works as a professional. (1) What treatment would you recommend: operative or nonoperative? (2) On a scale from 0 to 10, how confident are you about this decision?

Although participating surgeons have experience in treating upper extremity conditions, clinical expertise probably varies among participants. We minimized the risk of confounding by different levels of expertise (and by other known and unknown factors) by randomizing surgeons into two groups. Furthermore, we accounted for imbalances in subspecialization and years in practice in multivariable analyses.

Randomization

On entering the survey, participants were randomized by an automated software algorithm into two groups on a 50/50 basis. Group 1 answered all questions as if they were making treatment decisions for themselves (surgeon cases). Group 2 assessed the cases as if they were making recommendations for a patient (Fig. 1). The age and gender of the patient cases were matched to the age and gender of the participants to minimize the influence of these factors on the treatment decision and confidence. Age was randomly assigned to the patient case within 10 years of the participant’s age. All participants were told at the beginning of the survey that it evaluated treatment variation for upper extremity conditions.
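The allocation and age-matching steps described above can be sketched as follows. This is an illustration only: the survey software's actual algorithm is not described, so simple (unrestricted) 50/50 randomization, an inclusive window of plus or minus 10 years, and the function and field names are all assumptions.

```python
import random


def randomize_participant(age, gender, rng):
    """Allocate one participant and, for Group 2, build a matched patient case.

    Assumptions (not taken from the study software): simple 50/50
    randomization and a patient age drawn uniformly from age-10..age+10.
    """
    group = rng.choice([1, 2])  # 1 = surgeon cases, 2 = patient cases
    if group == 1:
        return {"group": 1, "patient": None}
    patient_age = rng.randint(age - 10, age + 10)  # inclusive on both ends
    return {"group": 2, "patient": {"age": patient_age, "gender": gender}}


rng = random.Random(2014)  # fixed seed so the sketch is reproducible
assignments = [randomize_participant(45, "female", rng) for _ in range(1000)]
group_sizes = {g: sum(a["group"] == g for a in assignments) for g in (1, 2)}
```

Under simple randomization the two groups are rarely exactly equal in size, which is compatible with the unequal allocation reported in this study.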

Sample Size Calculation

We calculated that a minimum sample size of 138 participants (69 per group) would provide 80% statistical power (β = 0.20; α = 0.05) to detect a difference of 20 percentage points in the proportion recommending surgery, assuming a proportion of 10% in one group and 30% in the other.
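As a rough check on the order of magnitude, the sketch below applies the standard normal-approximation formula for comparing two independent proportions. The choice of formula is an assumption; the text does not state which method or correction the authors used. This uncorrected version gives a figure in the low 60s per group, somewhat below the reported 69, and variants with a continuity correction give larger numbers.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided test of two proportions.

    Uses the pooled-variance normal approximation without continuity
    correction (an assumed formula, not necessarily the authors').
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


n = n_per_group(0.10, 0.30)
```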

Outcome Measures and Explanatory Variables

Our primary outcome measures were overall recommendation for operative or nonoperative treatment and overall confidence regarding this decision. Overall recommendation for treatment was expressed as a surgery score per surgeon, calculated by dividing the number of cases they would operate on by the total number of cases (n = 21). The surgery score ranges from 0% to 100% with a higher score indicating a higher likelihood of recommending surgery. Overall confidence regarding the decision for treatment was calculated by taking the mean confidence for all 21 cases per surgeon. The overall confidence score ranges from 0 to 10 with a higher score indicating more overall treatment confidence.
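The two per-surgeon scores can be written out as a short sketch. The response coding ("operative" and "nonoperative" strings, numeric ratings) and the function names are assumptions made for illustration, and the example responses are invented.

```python
from statistics import mean

N_CASES = 21  # number of fictional cases in the survey


def surgery_score(choices):
    """Percentage of the 21 cases for which operative treatment was chosen."""
    if len(choices) != N_CASES:
        raise ValueError("expected one choice per case")
    return 100 * sum(c == "operative" for c in choices) / N_CASES


def confidence_score(ratings):
    """Mean of the 21 confidence ratings, each on a 0 to 10 scale."""
    if len(ratings) != N_CASES:
        raise ValueError("expected one rating per case")
    if any(not 0 <= r <= 10 for r in ratings):
        raise ValueError("ratings must lie between 0 and 10")
    return mean(ratings)


# Invented responses for one surgeon: surgery chosen in 8 of 21 cases.
choices = ["operative"] * 8 + ["nonoperative"] * 13
ratings = [8] * N_CASES
score = surgery_score(choices)          # 8 / 21, about 38.1%
confidence = confidence_score(ratings)  # 8
```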

Secondary outcome measures were the proportion of surgeons recommending operative treatment and confidence regarding the decision for treatment per case.

Participants were asked about their work status, gender, and age. Furthermore, we extracted location of practice, years in practice, supervising trainees, and specialization from the members’ database.

Statistical Analysis

Categorical variables were presented as frequencies with percentages and continuous variables as mean with SD.

The overall surgery and confidence scores were compared between groups using an unpaired t-test. A two-tailed p value < 0.05 was considered significant. Furthermore, we did a multivariable linear regression analysis to assess the difference in surgery score and confidence score between groups and controlled for possible imbalances in all included explanatory variables (gender, location of practice, years in practice, supervising trainees, specialization, work status).
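For readers who want to retrace the unpaired comparison, the sketch below computes a pooled-variance t statistic from summary statistics, using the rounded surgery scores and group sizes reported in the Results (patient group: 44.2 ± 14.0, n = 122; self group: 38.5 ± 15.4, n = 132). The confidence interval uses a normal quantile in place of the t quantile, which is an approximation that is close at 252 degrees of freedom. Because the inputs are rounded, the difference comes out as 5.7 rather than the reported 5.8.

```python
from math import sqrt
from statistics import NormalDist


def unpaired_t_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Equal-variance two-sample t statistic from means, SDs, and sizes."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    diff = m1 - m2
    z = NormalDist().inv_cdf(0.975)  # normal approximation to the t quantile
    return {
        "diff": diff,
        "se": se,
        "t": diff / se,
        "df": df,
        "ci": (diff - z * se, diff + z * se),
    }


result = unpaired_t_from_summary(44.2, 14.0, 122, 38.5, 15.4, 132)
```

The resulting interval lies close to the reported 2.1% to 9.4%; small discrepancies are expected from rounding and from the normal approximation.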

We reported the proportion of surgeons recommending operative treatment per case and compared groups using Fisher’s exact test (Table 2). We presented the relative risk (or risk ratio) per case with a 95% confidence interval (CI). The relative risk expresses the probability of choosing surgery in Group 1 (surgeon cases) relative to Group 2 (patient cases). The confidence regarding the decision for treatment was presented per case, and we compared groups using the unpaired t-test (Table 3). We presented the mean difference between Group 1 (surgeon cases) and Group 2 (patient cases) with a 95% CI.
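A relative risk with its confidence interval can be computed as in the following sketch, which uses the common log-transformation (Katz) interval. Whether the authors' software used this exact interval is an assumption, and the counts in the example are hypothetical, not taken from Table 2.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def relative_risk(events_1, n_1, events_2, n_2, level=0.95):
    """Risk ratio of group 1 versus group 2 with a log-scale Wald interval."""
    rr = (events_1 / n_1) / (events_2 / n_2)
    se_log = sqrt(1 / events_1 - 1 / n_1 + 1 / events_2 - 1 / n_2)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return rr, exp(log(rr) - z * se_log), exp(log(rr) + z * se_log)


# Hypothetical counts: 20 of 100 surgeons choose surgery for themselves,
# 35 of 100 recommend it for a patient.
rr, lower, upper = relative_risk(20, 100, 35, 100)
```

A relative risk below 1 with an upper confidence limit below 1 corresponds to surgeons being less likely to choose surgery for themselves than to recommend it for a patient.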

Table 2 Recommendation of surgery per case
Table 3 Confidence in decision for treatment

All statistical analyses were performed using Stata 12.0 (StataCorp LP, College Station, TX, USA).

Surgeon Characteristics

Two hundred seventy-one surgeons were included, of whom 140 (52%) were randomized into Group 1 and 131 (48%) into Group 2. Two hundred fifty-four (94%) participants completed all questions and were kept for analysis: 132 (52%) in Group 1 and 122 (48%) in Group 2. The number of participants who did not complete the survey did not differ between the two groups according to Fisher’s exact test (p = 0.80).
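The noncompletion comparison can be retraced from the counts above: 140 − 132 = 8 noncompleters in Group 1 and 131 − 122 = 9 in Group 2. The sketch below implements a two-sided Fisher's exact test directly from the hypergeometric distribution. The two-sided rule (summing all tables no more probable than the observed one) is an assumption about the software's convention, although it is a common one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of x events in row 1 with all margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    support = range(max(0, col1 - row2), min(row1, col1) + 1)
    # Sum every table at most as probable as the observed one; the small
    # tolerance guards against floating-point ties.
    return sum(p for p in map(prob, support) if p <= p_observed * (1 + 1e-9))


# Rows: Group 1, Group 2. Columns: did not complete, completed.
p_value = fisher_exact_two_sided(8, 132, 9, 122)
```

The resulting value can be compared with the reported p = 0.80.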

There were 234 (92%) men and the participants were mainly from the United States and Canada (52%) and Europe (35%). Most surgeons supervised trainees (90%) and almost all worked full-time (97%) (Table 1).

Results

Overall Recommendation for Treatment and Confidence

Surgeons were more likely to recommend surgery for a patient (44.2% ± 14.0%) than they were to choose surgery for themselves (38.5% ± 15.4%) with a mean difference of 5.8% (95% CI, 2.1%–9.4%; p = 0.002). The difference in surgery score between groups remained significant after controlling for potential imbalance of confounders in multivariable linear regression analysis (regression coefficient [β] −5.8, standard error [SE] 1.9; 95% CI, −9.5 to −2.1; p = 0.002). Factors associated with recommendation for surgery in multivariable linear regression analysis were location of practice and type of specialization: surgeons from the United States and Canada were less likely to recommend surgery than those from Asia (β −13.1, SE 5.3; 95% CI, −23.5 to −2.6; p = 0.014); hand and wrist surgeons (β 10.4, SE 5.0; 95% CI, 0.48–20.2; p = 0.040) were more likely to recommend surgery than general orthopaedic surgeons.

Surgeons were more confident in deciding for themselves than they were for a patient of similar age and gender (self: 7.9 ± 1.0, patient: 7.5 ± 1.2, mean difference: 0.35 [CI, 0.075–0.62], p = 0.012). The difference in confidence score between groups remained significant after controlling for potential imbalance of confounders in multivariable linear regression analysis (β 0.33, SE 0.14; 95% CI, 0.052–0.60; p = 0.020). Surgeons who were in practice for 21 to 30 years were more confident about their recommendation for treatment as compared with those with 0 to 5 years in practice (β 0.55, SE 0.24; 95% CI, 0.078–1.02; p = 0.023).

Case-by-case Recommendations and Confidence

We found that surgeons were less likely to choose surgery for themselves than they were to recommend surgery for a patient for the following four conditions: rotator cuff defect (relative risk [RR], 0.57; 95% CI, 0.38–0.86; p = 0.006), ganglion cyst (RR, 0.57; 95% CI, 0.37–0.87; p = 0.010), scapholunate ligament insufficiency (RR, 0.83; 95% CI, 0.71–0.98; p = 0.042), and distal biceps rupture (RR, 0.86; 95% CI, 0.77–0.97; p = 0.014) (Table 2). There was no difference in recommendation for surgery among the other 17 conditions.

We found that surgeons were less confident about recommending treatment for their patients compared with choosing treatment for themselves for the following seven cases: displaced midshaft clavicle fracture (self: 8.1 ± 1.7, patient: 7.6 ± 1.6, mean difference: −0.52 [CI, −0.92 to −0.11], p = 0.013), distal radius fracture (self: 8.2 ± 1.7, patient: 7.7 ± 1.8, mean difference: −0.52 [CI, −0.96 to −0.09], p = 0.019), scaphoid fracture (self: 8.3 ± 1.7, patient: 7.7 ± 1.8, mean difference: −0.53 [CI, −0.97 to −0.09], p = 0.019), rotator cuff defect (self: 7.8 ± 1.9, patient: 7.3 ± 1.9, mean difference: −0.49 [CI, −0.96 to −0.01], p = 0.045), ganglion cyst (self: 8.7 ± 1.5, patient: 7.9 ± 2.0, mean difference: −0.77 [CI, −1.20 to −0.33], p < 0.001), mucous cyst (self: 7.9 ± 1.7, patient: 7.3 ± 2.4, mean difference: −0.68 [CI, −1.19 to −0.18], p = 0.009), and Kienböck disease (disease-modifying surgery) (self: 7.3 ± 1.9, patient: 6.7 ± 2.3, mean difference: −0.54 [CI, −1.05 to −0.02], p = 0.042) (Table 3). There was no difference in confidence regarding decision for treatment among the remaining 14 cases.

Discussion

There are substantial unexplained geographical and surgeon-to-surgeon variations in rates of surgery [1, 2, 19, 21, 29]. This study compared the treatment that surgeons choose for themselves with the treatment they recommend for patients of the same age and sex, along with their confidence in these decisions. This might provide us with further understanding of the unexplained variation in rates of surgery. We found that surgeons were slightly (6%) more likely to recommend surgery for a patient than they were to choose surgery for themselves. Surgeons were slightly less confident (certain about the appropriateness) when recommending treatment for their patients compared with choosing treatment for themselves.

This study has several limitations. First, participants might have perceived their own circumstances differently than the patients’ circumstances. We tried to minimize this by matching the age and gender of the patient to those of the participant, by explaining that patients worked as a professional, and by asking participants to assume absence of comorbidities for all cases (including themselves) and sufficient symptoms and impact on daily activities to seek specialist attention. We believe that the simplicity of the information might be considered a strength of the study, because surgeons will “fill in the blanks” with their bias: the bias they bring to the average patient encounter rather than to a specific patient encounter. Second, there is a gap between hypothetical and actual decision-making; thinking about having a certain condition is not equivalent to having the condition [16]. However, this would have influenced both groups and we therefore believe that this did not influence our results. Third, surgeons within the SOVG are a subgroup (most of them are in academic medicine; 90% supervise trainees), and their values, training, and practice probably differ from the larger community of orthopaedic surgeons. Recommendation for treatment and corresponding confidence might be different among surgeons outside of the SOVG. However, we do believe that the finding of surgeons treating themselves and patients slightly differently and with a different confidence level applies to the larger community of orthopaedic surgeons. Fourth, levels of expertise might have varied among participating surgeons. However, we accounted for this, and for other known and unknown confounders, by randomizing surgeons into two groups. Furthermore, we accounted for potential imbalances in randomization by including demographic characteristics in multivariable analysis.
The noncompletion rate of the survey was 6.3% and did not differ between the group recommending treatment for their patients and the group choosing treatment for themselves.

Our study is consistent with prior studies that found that physicians choose different treatment for themselves than they would recommend to a patient. Treatment preferences among patients and physicians are extensively studied and preferences differ between groups for many conditions [10, 16, 20]. The direction and magnitude of this effect are not consistent, but it highlights the importance of shared decision-making as opposed to the health provider-as-agent model [20, 28]. In the health provider-as-agent model, the physician chooses what he or she believes the patient would choose if the patient had their knowledge; however, it is not possible for the physician to fully and accurately understand patients’ preferences [28]. Furthermore, several other studies demonstrated that people confronted with a decision for another person behave differently in comparison to situations in which they have to decide for themselves [15, 18, 27, 30]. A randomized study by Ubel et al [27] assessed how decisions by physicians differed when recommending treatment for themselves or for their patients using two clinical scenarios: (1) having colon cancer and facing two different surgical options; and (2) having a new strain of avian influenza and deciding between experimental and no treatment. Physicians deciding for themselves were more likely to choose the treatment option with a higher risk of death and a lower risk of complications for both scenarios [27]. This study, like ours, does not mean to establish which decision is better; it only demonstrates the difference in recommendations. These differences might be explained by cognitive biases leading to errors in processing information that can interfere with optimal decision-making [8, 23, 24]. For example, there is a difference in weighting of dimensions; someone deciding for others typically weights only one or a few dimensions, whereas people deciding for themselves weight multiple dimensions [18]. 
Surgeons might, for instance, focus on the condition when advising a patient, whereas they balance more factors (family life, sports, work, social activities) when deciding on treatment for themselves. Physicians should be aware of this when asked for recommendations by a patient, because their recommendations have a strong influence on patient choice [11, 22, 25]. Furthermore, surgeons should attempt to learn as much about patients’ preferences and their considerations in decision-making as possible to provide tailored information. Giving patients more autonomy by letting them balance risks and benefits themselves will reduce the influence of cognitive biases. This can be done by providing decision aids. Decision aids are web sites, videos, or pamphlets with simple, clear explanations of the problem, all treatment options, and the risks and benefits of each approach. The information is provided dispassionately and at an eighth-grade reading level. The patient and family can go over the parts important to them repeatedly at home at their own pace. Decision aids help patients explore their own preferences and values and participate more fully in decision-making [5, 17, 26]. These tools improve the patients’ knowledge regarding options, reduce their decisional conflict, and seem to decrease rates of discretionary surgeries [17, 26].

Our finding that the proportion of surgeons choosing/recommending surgery varies by location of practice is supported by previous studies demonstrating large unexplained geographic variation in surgery rates [2]. The higher likelihood of choosing/recommending surgery by hand and wrist surgeons as compared with general orthopaedic surgeons might be a result of differences in clinical knowledge with regard to the presented cases.

Surgeons are slightly more confident when choosing treatment for themselves as compared with recommending treatment to a patient. This means that surgeons were more certain about the appropriateness of a treatment when choosing for themselves. This could be explained by the availability of more circumstantial information when deciding for oneself as compared with deciding for a patient. On the other hand, surgeons might—on average—feel a little less comfortable deciding for another person. Recommending a specific treatment (rather than providing options and helping patients decide on their preferences) may be something we do based on tradition and habit, but not something we feel entirely comfortable with. This further emphasizes the need for studies focusing on decision aids because these might help both patients and surgeons be sure that the patients’ preferences and values are adequately accounted for [17, 26].

The finding that more years in practice was associated with a higher level of surgeons’ decision confidence in our study was in line with previous studies [6].

In conclusion, surgeons are slightly more likely to recommend surgery for a patient than they are to choose surgery for themselves, and they choose for themselves with slightly greater confidence. Different perspectives, preferences, circumstantial information, and cognitive biases might explain the differences found. This emphasizes the importance of (1) understanding patients’ preferences and their considerations for treatment; (2) being aware that surgeons and patients might balance factors influencing their decisions differently; (3) giving patients more autonomy by letting them balance risks and benefits themselves (ie, shared decision-making); and (4) assessing how dispassionate evidence-based decision aids help inform patients and influence their decisional conflict.