INTRODUCTION

Patients presenting with common symptoms often worry about the possibility of having a serious illness.1,2 In situations where the patient’s concerns are not easily explained, responding to patient worries is even more challenging.3–6 In these situations, patient–physician relationships are often stressed,4,6–10 and patients may feel dismissed or not validated.6 Simple reassurance may be counterproductive unless accompanied by medical explanations based on an understanding of the patient’s feelings, expectations, and ideas about causality, a promise of tangible help, and avoidance of blaming the patient.5,6 Failure to explore patients’ symptoms, ideas, expectations, and illness beliefs that underlie their concerns, and failure to validate those concerns using empathic communication, come at a high cost: unnecessary return visits, unnecessary and unwanted somatic treatments,3 excessive diagnostic testing,11 missed diagnoses,12 and symptom amplification.13 In contrast, a true understanding of the patient’s concerns can lead to providing useful information, recommendations, and psychological support. In short, particular kinds of patient-centered responses may be especially important for patients with medically unexplained symptoms (MUS) compared with those whose symptoms are more medically straightforward. Our previous work suggests that patient-centered communication is associated with greater patient trust,14 lower health care costs,11 provision of guideline-concordant care,13 and appropriate counseling.15

Exploring and validating patients’ concerns is not always straightforward. Not all patients who choose to share their worries with their physicians do so directly. Rather, many patients, including those with multiple unexplained symptoms,4 cue their physicians to their concerns by telling stories, amplifying their concerns,2,13,16 or using affectively “loaded questions”. Loaded questions are often superficially straightforward, but reflect underlying feelings of fear, anger, or apprehension that should be addressed. Whether patients express worries directly or indirectly, they often leave it up to physicians to explore the topic further; if physicians do not do so, patients may assume that the concern is either unimportant or that the physician is not interested.6,7,17 When physicians offer acknowledgement, empathy, encouragement, praise, and active help,18 patients are more likely to feel that their physicians are trustworthy and supportive, and are more likely to report improved satisfaction with care, adherence, and chronic disease outcomes.19–21 Research suggests that many patient concerns are minimized or interrupted,6,22 and that expressions of empathy and support are uncommon.13,18

The goal of this study was to evaluate physicians’ responses to a common expression of worry, the “loaded question”.1,2 We used 2 data sources: surveys of study physicians’ current patients and transcripts of covertly audiorecorded unannounced standardized patient (SP) encounters. We used SPs to inquire about whether their primary symptom (chest pain) might represent “something serious” in 2 contexts: a straightforward symptom presentation and a presentation portraying MUS. Using a staged multimethod approach, we first used qualitative methods to develop a taxonomy of physicians’ responses to patient expressions of worry. We then used quantitative methods to characterize the type, frequency, and pattern of physician responses. Next, we explored whether physicians’ communication styles (as manifested by how they responded to patient expressions of worry) were noticed by their patients, that is, whether the physicians’ responses were correlated with ratings of interpersonal aspects of care obtained from a sample of each physician’s patient panel. We then examined whether physicians’ ability to explore and validate patients’ concerns in the context of MUS (compared with the straightforward presentation) would be a stronger predictor of patients’ ratings. Our specific hypotheses were that (1) patients’ ratings of interpersonal care would be associated with physicians’ use of empathy or provision of information about the illness (rather than “empty” reassurance without explanations, or dismissal of the patient’s concerns) and (2) expression of empathy in the setting of MUS would have a greater effect on patient ratings compared with more straightforward symptom presentations.

Finally, we used sequential analyses to explore, in an open-ended way, how desirable physician responses (i.e., those responses that we found to be most closely associated with higher patients’ ratings of care) tended to emerge during clinical conversations. We wanted to see if these desirable responses would be more likely to occur if expressed early in the conversational sequence. Also, we wished to see if particular types of statements tended to be conversation-stoppers (for example, suggesting a test or offering a prescription) that would inhibit further use of desirable responses.

METHODS

Physician Selection

Physician recruitment took place during 2001–2002 in the Greater Rochester (NY) area (population 1.1 million). In 1999, 594 family physicians and general internists with at least 100 insured patients were identified from a large managed care organization (MCO) database. Of 297 eligible physicians contacted, 100 (33%) agreed to participate in the study. Participating physicians consented to have 2 unannounced visits by SPs covertly audiorecorded during a 1-year period after consent, and afterwards to allow a research assistant to collect survey data from 50 patients in the waiting room. Details of the study design have been published elsewhere.11,14

Standardized Patients

SPs are actors trained to portray patients in a realistic and uniform way. SPs have been used extensively to assess physician responses to a variety of scenarios in outpatient practices.23–27 SP methods avoid many biases inherent in using data collected from practice patients, including self-selection, accommodation to the physician’s style, and case mix. Prior research has demonstrated that SPs portray their roles consistently and that physicians can be successfully blinded to their identity.24 In this study, SPs were used to evaluate the physician’s contribution to patient-centered communication with particular attention to physician responses to patient expressions of worry that a common symptom might represent serious disease. Comparisons were made in physician responses to patients presenting with straightforward symptoms and MUS.

Standardized Patient Roles

Each SP presented as a new patient to the physician’s practice with an acute concern, either chest pain characteristic of gastroesophageal reflux disease (the GERD role) or with poorly characterized chest pain (the MUS role). The GERD SP presented as a 48-year-old male or female with nocturnal chest pain and associated fatigue as a result of sleep loss. Symptoms were affected by food intake and partially relieved by antacid use. The MUS SP presented with a constellation of symptoms that have been shown to produce diagnostic ambiguity;7 a 48-year-old male or female with multiple symptoms including poorly characterized generalized chest pain, fatigue, dizziness, and moderate emotional distress.

Standardized Patient Training and Monitoring

We created detailed clinical biographies for each role, which were reviewed by investigators and advisors to ensure that each role represented clinically credible patients who could be managed within the context of a 15- to 20-minute, new-to-the-practice, acute visit. Each SP portrayed the same role throughout the study.

Five SPs (2 male, 3 female) were intensively trained to deliver a medical history and respond to the physical examination in a carefully scripted fashion. Probable physician questions were anticipated, and responses to these questions were prepared and rehearsed by the SPs. The SPs were monitored regularly and were required to maintain 95% accuracy in role portrayal on a 100-item rating form. SPs were randomly assigned by role and gender to visit participating physicians.

The “Something Serious” Prompt

SPs were trained to deliver a prompt signaling worry about their symptoms. Because each physician saw both SP roles, the prompt was slightly different for each role; for the GERD role, this prompt was “Do you think that this could be something serious?” and for the MUS role, the prompt was “I first thought it was heartburn, but just want to make sure that this was not something serious”. This and subsequent prompts such as “you hear a lot about cancer or heart disease, I was worried about that... ” were offered for both roles. Initial prompts were to be delivered 6–8 minutes into the interview but before the physical exam, if possible, and repeated if the physician did not address them directly.

SP Deployment and Detection

The patient–physician interactions were recorded with a hidden digital audiodisk recorder and subsequently transcribed. Two days after each office visit, we notified the physician by fax that they had recently been visited by an SP and asked whether they could identify the SP; 40% of physicians were subsequently able to identify the SP. The majority stated that the reason for detection was that the practice had been closed to new patients. In nearly all cases, physicians rated the detected SPs as “very realistic” (8 or greater on a 1–10 “realism” rating scale). A panel of experts blinded to study hypotheses could not identify differences between audiorecordings of detected and undetected visits, and analyses of other data from this study did not differ when detected visits were excluded.11 Across all SP studies, detection is inversely proportional to the time between the visit and the inquiry; ours was the shortest interval used in SP research and was chosen on the advice of a community physician advisory board to avoid unnecessary efforts to follow up with patients.28

Patient Survey Ratings of Interpersonal Aspects of Care

Approximately 50 consecutive patients aged 18–65 years were approached in each participating physician’s waiting room by a research assistant; those agreeing completed a 10-minute written survey. The survey included demographic data, a list of common chronic illnesses, the SF-12, and a composite rating of the physician’s patient-centered communication. This composite measure was drawn from validated scales measuring satisfaction,29 trust in the physician,19,30 perceived physician knowledge of the patient,30,31 and perceived support for patient autonomy, involvement in care and choice.32 As reported elsewhere,33 we used principal components analysis to show that these 4 scales could reliably be reduced to a single factor, “satisfaction/trust/autonomy support/knowledge” (STAK).

Qualitative Analyses

Using transcripts of recorded visits, we identified every “something serious” and “worry” prompt. These prompts, or “empathic opportunities”, were defined as instances where SPs asked if the physician thought they had “something serious” or stated a worry about heart disease or cancer.13 All subsequent physician responses that pertained to each SP prompt were also identified. Three individuals (CGS, JL, and JC) each reviewed 10 transcripts to generate a list of physician responses to the “something serious” prompts. We refined the list to generate 10 mutually exclusive physician response codes: action (e.g., a prescription or a referral), dismissive/minimizing concern, empathy, explore emotion, explore psychosocial issues, “I don’t know”, medical explanation (without reassurance), nonspecific acknowledgement, reassurance (with or without medical explanation), and redirecting toward biomedical inquiry. Because medical explanations and reassurance were often combined into a single phrase, we coded “medical explanation” as information presented without statements of reassurance; correspondingly, the “reassurance” statements might either include or not include medical information. Two coders (JL, JC) independently analyzed 20 additional transcripts, yielding item intraclass correlation coefficients of 0.80–0.90. The remaining transcripts were coded by 1 of the 2 coders (JL).

Quantitative Analyses

We used descriptive statistics first to analyze the frequency of each type of physician response. Then, we examined crude and adjusted correlations between physician responses to the first MUS and GERD “something serious” prompts and patient survey STAK measures. Multivariate analyses were adjusted for physician and patient gender, race and age, length of patient–doctor relationship, and patient comorbidity indicators. Because patients might cluster within specific physicians (e.g., physicians whose offices are in certain neighborhoods might have more African Americans or older patients, etc.) and because patient ratings of their physicians might differ on the basis of these same factors (race, age, etc.), we used “random effects” models to adjust estimates for the clustering of patient STAK measures within physician. Only the response to the first prompt was used for the correlation and multivariate analyses because it became difficult to determine whether subsequent physician statements were in response to the first or subsequent prompts. Analyses were performed using SAS (Version 9.1, SAS Institute, Cary, NC, USA).
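The text does not give the exact model specification, so the following is only one common way to write a random-intercept model of the kind described above. The symbols are illustrative assumptions: $y_{ij}$ is the standardized STAK score of patient $i$ seen by physician $j$, $\mathrm{Resp}_{j}$ indicates whether physician $j$ gave a particular response type (for example, empathy) to the first prompt, $\mathbf{x}_{ij}$ collects the patient and physician covariates, and $u_j$ is the physician-level random effect that absorbs clustering.

```latex
y_{ij} = \beta_0 + \beta_1\,\mathrm{Resp}_{j} + \boldsymbol{\gamma}^{\top}\mathbf{x}_{ij} + u_j + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2), \quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Under this reading, $\beta_1$ is the adjusted difference in mean STAK score, in SD units, between patients of physicians who did and did not give the response.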

Sequence Analyses

Once the types of physician statements that were associated with positive patient perceptions were established, we used lag sequence analysis and correlation tables to explore the sequence of physician responses that led to these statements. Analyses were based on the statistical concept of conditional probability. The analyses provided information about the probability that any given patient prompt or physician response (for example, empathy) would precede another specific physician response (such as providing reassurance). The data were then recorded as a series of “lags”, each of which represented an event (an utterance) in a temporal sequence.34 For example, lag 1 is the association between a given behavior (in this case, the SPs’ prompts) and the immediate subsequent behavior (in this case, physicians’ responses). Lag 2 refers to the physician’s second response after a prompt. We also noted when the conversation ended or changed topic. We used tree and flow diagrams to illustrate conversational sequences.
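The conditional-probability idea behind lag sequence analysis can be sketched in a few lines of Python. This is a minimal illustration, not the software or procedure used in the study: the function name `lag_transition_probabilities`, the code labels, and the four example encounters are all hypothetical, and each encounter is assumed to have been reduced to an ordered list of coded events beginning with an SP prompt.

```python
from collections import Counter, defaultdict


def lag_transition_probabilities(sequences, lag=1):
    """Estimate P(event at position t + lag | event at position t).

    `sequences` is a list of coded event lists, one per encounter.
    Returns a nested dict: probs[earlier_code][later_code] -> probability.
    """
    pair_counts = defaultdict(Counter)
    for seq in sequences:
        # Count every (earlier, later) pair that is exactly `lag` steps apart.
        for i in range(len(seq) - lag):
            pair_counts[seq[i]][seq[i + lag]] += 1

    probs = {}
    for earlier, followers in pair_counts.items():
        total = sum(followers.values())
        probs[earlier] = {code: n / total for code, n in followers.items()}
    return probs


# Hypothetical coded encounters (illustrative only, not study data).
encounters = [
    ["prompt", "acknowledgement", "biomedical_inquiry", "action", "topic_change"],
    ["prompt", "empathy", "biomedical_inquiry", "reassurance"],
    ["prompt", "acknowledgement", "medical_explanation", "topic_change"],
    ["prompt", "reassurance", "action", "topic_change"],
]

lag1 = lag_transition_probabilities(encounters, lag=1)  # immediate response
lag2 = lag_transition_probabilities(encounters, lag=2)  # second response

print(lag1["prompt"])  # distribution of first responses after a prompt
print(lag2["prompt"])  # distribution of second responses after a prompt
```

Each row of the resulting table sums to 1, so `lag1["prompt"]` can be read directly as the share of prompts followed by each response type, which is the kind of percentage a flow diagram of conversational pathways reports.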

RESULTS

Patient Demographics

Of patients approached in waiting rooms, 96% agreed to participate, yielding 4,746 completed surveys. Patient demographics are displayed in Table 1. Females accounted for 62% of the total sample, and 84% of patients were white. Mean patient age was 45 years and the majority had been with the same physician for more than 5 years.

Table 1 Patient Demographics (n = 4,746)

Physician Demographics

One hundred physicians were enrolled in the study, including 47 family physicians and 53 general internists. Seventy-seven percent were male, 24% were solo practitioners, and 32% practiced in rural areas, similar to local physician demographics. Seven physicians moved or retired after the first SP visit; 4 recordings were not usable because of technical difficulties. Recordings from 189 SP visits to 100 different physicians were available for analysis.

SP Prompts

The SPs, for both roles combined, delivered 613 prompts with a mean of 3.2 (standard deviation [SD] = 1.6) prompts per clinical encounter. The 93 GERD and 96 MUS portrayals included 255 and 358 prompts, respectively. The SPs portraying the GERD role delivered a mean of 2.7 prompts (SD = 0.9) per clinical encounter; the MUS SPs delivered a mean of 3.7 prompts (SD = 1.9).

Physician Responses

Physicians offered a mean of 3.08 responses (SD = 0.58) per prompt to the GERD SPs and a mean of 3.13 responses (SD = 0.86) per prompt to the MUS SPs.

The most frequent responses offered by physicians to the SP prompts were redirecting toward biomedical inquiry, action, nonspecific acknowledgment, medical explanation, and reassurance (Tables 2 and 3). Biomedical inquiry, action, and medical explanation were more common in the GERD role. “I don’t know” responses, exploration of emotion, and psychosocial exploration were relatively uncommon in the MUS role, and even less common in the GERD role. Reassurance and use of dismissive language occurred at similar frequency in both roles despite more persistent prompting in the MUS role. Empathy was expressed in 15% of visits, even after repeated prompting.

Table 2 Physician Responses to SP Prompts by SP Role
Table 3 Frequency of Physician Responses Per Visit to the “Something Serious” Prompt: All Responses

Patient Survey Data

Our initial multivariate model of patient STAK scores looked at whether the physician responded with each of the 10 response types to the first “something serious” prompt encountered in the interview transcript for either the MUS or GERD role. Scores were normalized so that they all had a mean of 0 and an SD of 1. Empathy and nonspecific acknowledgement were marginally associated with increases (improvements) in the mean STAK score of 0.15 SD (p = 0.06; 95% CI = −0.01 to 0.31) and 0.09 SD (p = 0.10; 95% CI = −0.02 to 0.20), respectively, as rated by the physician’s real patients. After adjusting for physician and patient age, gender and race, physician practice location (rural, urban), length of patient–physician relationship, and patient comorbidities, the parameter estimate for empathy was 0.17 SD (p = 0.02; 95% CI = 0.02 to 0.32); the effect for nonspecific acknowledgement was no longer even marginally significant. Additional analyses stratified by role showed that whereas empathic statements and acknowledgement were associated with higher patient STAK ratings for both SP roles, the association was only statistically significant for responses to the MUS role, which was characterized by greater complexity, ambiguity, and uncertainty. Thus, our original hypotheses were supported.
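To make the SD units concrete, the short Python sketch below standardizes a set of scores to mean 0 and SD 1. The raw scores are invented for illustration, and the use of the population SD (`pstdev`) is an assumption; the text does not say which form of SD was used.

```python
from statistics import mean, pstdev


def standardize(scores):
    """Rescale scores to mean 0 and SD 1 (population SD, by assumption)."""
    m = mean(scores)
    sd = pstdev(scores)
    return [(s - m) / sd for s in scores]


# Hypothetical raw composite scores (not study data): mean 5, SD 2.
raw = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z = standardize(raw)

print(z)
# On this invented scale, an effect of 0.17 SD corresponds to
# 0.17 * pstdev(raw) raw points.
print(0.17 * pstdev(raw))
```

An estimate such as 0.17 SD is therefore a difference measured relative to the spread of patient ratings, not in the original units of any one scale.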

Sequence Analyses

The most common physician responses to initial patient prompts were nonspecific acknowledgment, redirecting toward biomedical inquiry, medical explanation, and reassurance. These results are displayed graphically as a tree diagram (Fig. 1). Using combined qualitative analysis of the tree diagram as well as frequency/contingency tables, we constructed a semiquantitative representation of the most common observed sequences of conversation in this sample (Fig. 2), taking into account not only the first patient prompts but all prompts. The thick arrows indicate the most frequent conversational pathways; once an SP prompt was delivered, the most frequent response to follow was nonspecific acknowledgement (40%); of these, 49% were followed by a biomedical inquiry, and, of those, 17% resulted in an action. Empathy was most likely to occur if it was expressed at the beginning of the conversational sequence, and tended to facilitate further biomedical inquiry. In contrast, action and psychosocial exploration tended to close down the discussion, leading to a change of topic. These patterns were similar when they were analyzed separately for the GERD and MUS roles, and thus were combined.

Figure 1

Tree diagram indicating the most likely follow-up (lag 2) statements made by physicians after initial statements indicating nonspecific acknowledgement, biomedical inquiry, medical explanations without reassurance, reassurance (with or without medical explanations), and empathy.

Figure 2

Flow diagram of all physician responses after patient expressions of worry. Thick arrows indicate the most common sequence of physician responses to patients’ expressions of worry. Thin arrows represent less common pathways. Percentages refer to the percent of responses leading to the next response. For example, 51% of the time, physicians’ nonspecific acknowledgements were followed by a biomedical inquiry and 14% of the time these were followed by some clinical action (ordering a test, prescribing a medication, etc.).

DISCUSSION

This study builds on our prior work in which we have described how exploration of patients’ concerns and validation of their illness experience results in improved health care. In this paper, we specifically explored how physicians responded when patients presented a “loaded question” indicating worry about serious illness. In contrast to prior naturalistic studies, we were able to design this study as a true experiment, using SPs to portray the same clinical scenarios to 100 primary care physicians. We conducted qualitative, quantitative, and sequence analyses that revealed predictable sequences of physician response to patient expressions of worry. Consistent with prior work by Eide et al.,35 physicians’ most common initial response was nonspecific acknowledgement; in addition, many physicians also asked biomedical questions and provided medical explanations and reassurance. Later, physicians tended to take action by suggesting a diagnostic test, medication, or other treatment. Not surprisingly, empathy and emotional exploration were much less common.13

Empathy is widely regarded as an essential element in fostering healing relationships.36–45 Our findings demonstrated that the use of specific empathic responses to a patient expression of worry is a marker for greater patient trust, feeling of being known, satisfaction, and feeling supported in their decision-making. Also, empathy was found to be distinct from reassurance, which may paradoxically raise patient anxiety.46,47 Our data suggest that empathy appears to be expressed more variably when the patient presentation involves greater diagnostic ambiguity. Using the SP scenarios as indicators of the “empathic capacity” of the physician, it appears that the ability to be empathic in more challenging situations characterized by ambiguity was a more robust marker of high-quality patient–physician relationships compared with the more straightforward medical situations. In the more challenging situations, clinicians not only had to empathize with patients’ worry, but also express empathy in situations in which they were dealing with their patients’ (and their own) anxiety generated by facing uncertainty.48 This does not diminish, however, the importance of empathy in more straightforward situations.

Furthermore, the MUS role may also have been a more robust test of physicians’ capacity for mindfulness, self-calibration, and lowered emotional reactivity in the face of anxiety. Mindfulness has been promoted as an essential clinical skill to help the clinician understand the patient’s world to a sufficient degree to experience and express empathy.49,50 Finally, the MUS role was a test of the effect of addressing uncertainty on patient ratings of their physicians. Although uncommon, physicians’ expressions of uncertainty (“I don’t know...”) were not associated with lower patient ratings of the physician, in contrast to survey and vignette studies which suggest otherwise.51,52

Concern has been raised about the inappropriate use of reassurance without adequate biomedical explanations or lack of empathic responses when patients present with MUS.5,6 However, even accounting for the increased number of prompts in the MUS role, we found that physicians used empathy at least as frequently in the MUS role compared with the more straightforward GERD presentation, did not attempt to reassure more often, explored psychosocial domains, and admitted uncertainty more often. This study cannot tell us, though, whether their responsiveness to patient distress was sufficient. Actual patients presenting with MUS are likely more anxious and more dissatisfied with their prior care; they may require greater empathy and more careful explanations, and may receive empty reassurances more negatively than patients whose symptoms are more easily explained.

Sequence analyses suggested that if empathy was not expressed early, it was much less likely to be expressed later in the encounter. Empathy tended to facilitate further biomedical inquiry, reassurance, and action. These findings suggest that it might be best to lead with empathy in response to patient concerns about serious illness, followed by appropriate biomedical inquiry. Conversely, leading with biomedical inquiry may tend to result in the physician forgetting the human dimensions of the patient’s concerns and not expressing empathy later in the interaction sequence. It is interesting to note that the exploration of emotion and psychosocial inquiry tended to lead to change of topic and no further addressing of the patient’s worry, implying that these approaches may have been invoked after clinicians had exhausted biomedical avenues.

Limitations and Strengths

Inferences from cross-sectional data must be made cautiously, and we can only regard our findings as suggestive. Participating physicians may be different from nonparticipants; for example, they may have had more confidence in their communication skills than their nonparticipating colleagues. The SP roles sampled only two of many possible patient presentations, and it is likely that physician responses to patient worry would have been quite different in cases where serious illness was highly likely and unambiguous. Further studies are needed to clarify other pathways between expression of worry and physician responses. Although the SPs portrayed their roles with high fidelity and the detected visits were indistinguishable from the undetected ones, SP presentations may have differed in unpredictable ways from the way real patients present. Because SPs were scripted, we could not study antecedents to patient expressions of concern.35 Finally, SP visits are, by definition, first visits, and therefore may not represent physician actions in established relationships.

Strengths of the study include our use of 2 sources of data: patient surveys and unannounced SP encounters. Real patient reports capture overall physician style over time, whereas SPs substantially reduce confounding by patient variability, case mix, and accommodation to the physician’s style. SPs also allowed us to study response to a consistent stimulus across 100 clinicians. Previous sequential research on patient concerns by Eide et al.35 was limited by small sample size, a heterogeneity of patient concerns resulting in difficulty discerning chains of sequential responses, and a preset coding scheme that may not have had sufficient flexibility to accommodate all relevant physician responses. Using emergent coding, we were able to parse elements of conversation based on their natural occurrence, rather than predefined clustering (for example, clustering empathy and reassurance as part of an “affirmation” cluster). Our large sample size allowed us to observe relatively uncommon events, such as expression of empathy. Mixed qualitative and quantitative methods allowed for correlations between emergent-coded observed sequences with meaningful patient ratings.

CONCLUSIONS

We observed patterns of physician response to SP expressions of worry and correlated those patterns with ratings of the physician by a sample of the physicians’ patient panel. Compared to physicians who were initially dismissive of the patient’s concerns or focused only on diagnosis and treatment, physicians who were empathic had patients who experienced greater trust, autonomy support, feeling known, and satisfaction with the patient–physician relationship, especially in situations involving ambiguity and uncertainty. Whereas some of these observations confirm basic principles of communication training (such as expression of empathy as an initial response to patient worries), few studies have actually noted that empathy makes a difference in patients’ experiences of care. Our findings also raise more nuanced questions about how and when empathy is used: Is the expression of empathy more important or effective in situations involving ambiguity and uncertainty? Is the timing of empathic statements within the clinical encounter important? How long does empathy take? If emotions were explored earlier in the interview, would discussion of emotion tend not to terminate conversations? Our observations and the questions they raise further emphasize the importance of the clinical context in evaluating communication behavior.