Introduction

It is essential to evaluate quality of life and the functional results after prolapse surgery [1]. Traditionally, the physician evaluates the outcome by interviewing and examining the patient. In clinical studies, these data are often obtained by a (self-report) questionnaire.

Several studies showed discrepancies in the interpretation of surgical outcome between the physician and the patient [2]. Physicians tended to underestimate the degree of bother in 25–37% of the patients [1]. The evaluation of the severity of the complaints of incontinence before treatment showed contradictory results between patient and physician [3, 4].

Although the above-mentioned aspects have been studied in relation to various surgical interventions, no data are available on the outcome of pelvic organ prolapse (POP) surgery. As POP surgery is an important and growing field of interest, we investigated whether there were discrepancies between the interview recorded by the physician and the answers to a patient self-assessment questionnaire on the functional results after POP surgery.

Materials and methods

Several weeks before the 1-year follow-up appointment, the standard urogynaecological questionnaire (proposed by the Pelvic Floor Committee of the Dutch Gynaecological Society) was sent to all the patients who had undergone vaginal repair POP surgery with porcine denatured dermal collagen (Pelvicol®) between December 2003 and August 2005. Details of the study population and procedures are presented in Table 1.

Table 1 Patient characteristics

The patient self-assessment questionnaire combines well-known, internationally used questionnaires, all of which have been validated for the Dutch language. It contains questions on general quality of life and health derived from the Dutch version of the Euroqol 5D [5], disease-specific questions from the validated Dutch translations of the Incontinence Impact Questionnaire (IIQ-7) [6] and the Urogenital Distress Inventory (UDI-6) [6], and questions from the Defaecatory Distress Inventory (DDI) [7]. If a specific symptom is present, the patient is asked to rate the amount of bother it causes her on a four-point Likert scale. The complete Dutch language standardised version of this questionnaire has been validated [8]. In addition, two questions about urinary symptoms, derived from the Patient Global Impression of Improvement Scale [9], were included.

At the 1-year follow-up visit, all the patients were interviewed by the same surgeon who had performed the surgery, using a checklist with ten symptoms (Table 2). Items were scored as present or absent. The physician was unaware of the patient self-assessment responses to the questionnaire.

Table 2 Inter-rater discrepancies between the physician and the patient

The data obtained with the interviews and the questionnaire were entered into an SPSS 13.0 database. Agreement between the physician and the patient was analysed using Cohen’s kappa coefficient, a statistical measure of inter-rater agreement that takes into account the agreement expected by chance. When raters are in complete agreement, kappa is 1; if there is no agreement other than that expected by chance, kappa is 0. The 95% confidence interval for kappa was also calculated. SPSS 13.0 and SAS version 8.2 were used for the statistical analyses.
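As an illustration of how kappa corrects for chance agreement, the calculation for a single present/absent item can be sketched in Python. This is a minimal sketch, not the SPSS or SAS procedure used in the study: the function name and the cell counts are hypothetical, and the confidence interval uses a simple large-sample approximation of the standard error, which is an assumption and may differ from the interval reported by statistical packages.

```python
import math


def cohen_kappa_2x2(both_yes, physician_only, patient_only, both_no):
    """Cohen's kappa and an approximate 95% CI for one present/absent item.

    The four arguments are the cell counts of the physician-by-patient
    2x2 table. The standard error is a simple large-sample approximation
    (an assumption of this sketch, not taken from the paper).
    """
    n = both_yes + physician_only + patient_only + both_no

    # Observed agreement: both raters say "present" or both say "absent".
    p_observed = (both_yes + both_no) / n

    # Agreement expected by chance, from each rater's marginal "present" rate.
    physician_yes = (both_yes + physician_only) / n
    patient_yes = (both_yes + patient_only) / n
    p_chance = (physician_yes * patient_yes
                + (1 - physician_yes) * (1 - patient_yes))

    if p_chance == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")

    kappa = (p_observed - p_chance) / (1 - p_chance)

    # Approximate standard error and normal-theory 95% interval.
    se = math.sqrt(p_observed * (1 - p_observed) / (n * (1 - p_chance) ** 2))
    return kappa, (kappa - 1.96 * se, kappa + 1.96 * se)


# Hypothetical counts for one symptom in 72 patients (not study data):
# 10 both yes, 2 physician yes/patient no, 20 physician no/patient yes, 40 both no.
kappa, (ci_low, ci_high) = cohen_kappa_2x2(10, 2, 20, 40)
print(f"kappa = {kappa:.3f}, 95% CI {ci_low:.3f} to {ci_high:.3f}")
```

In this hypothetical table the raters agree on 50 of 72 patients, yet much of that agreement is expected by chance because both raters score most patients as symptom-free. Kappa is therefore considerably lower than the raw percentage of agreement.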

Results

A total of 79 patients were sent the questionnaire and invited for a 1-year follow-up visit. Interview and questionnaire data were available for analysis in 72 patients (response rate 91.1%). Reasons for not attending were illness and living in a foreign country; one patient had died from another illness.

The mean age of the participants was 66.6 years (range 37–87); 65.3% had undergone surgery because of recurrent prolapse. Table 2 shows the differences in ratings on ten items between the interview scores recorded by the physician and the patient self-assessments on the questionnaire. The results of the analysis are shown in Table 3. We found bidirectional differences in opinion between the physician and the patient: there was underestimation on eight items and overestimation on two items. The low kappa coefficients on all items indicated poor to slight agreement between the patients and the physician. Table 4 shows the amount of bother reported by patients in the group whose complaints were underestimated by the physician.

Table 3 Analysis of inter-rater discrepancies between the physician and the patient
Table 4 Amount of bother reported by the patients in the physician no/patient yes group whose complaints were underestimated by the physician

Table 5 shows the responses to the Patient Global Impression of Improvement Scale question about urinary tract functioning before and after the operation in the group of patients whose complaints were underestimated by the physician. Most of the patients reported improvement in their symptoms after the operation.

Table 5 Average improvement in urinary symptoms compared to before the operation according to the Patient Global Impression of Improvement Scale in the physician no/patient yes group

Discussion

There were large discrepancies in the judgements of functional outcome between the physician-based interview data and the self-report answers to the questionnaire given by the patients. This is best illustrated by the low kappa coefficients. The physician showed a strong tendency to underestimate the complaints of the patients. This phenomenon is well known in gynaecological follow-up studies and in other fields of medicine [1, 2, 4, 10, 11].

There are several possible explanations for the differences in judgement between the physician and the patients.

First, the complaints may be relatively slight and therefore not mentioned by the patient in the interview. In this study, this applied to a considerable proportion of the patients. Table 4 shows that roughly two-thirds of the patients who had a positive score on a specific complaint on the questionnaire versus a negative score from the physician (physician no/patient yes group) reported slight to moderate complaints in the questionnaire.

On a few items, some data were missing, which might indicate that these patients had problems understanding the questions. However, we only used questions from validated questionnaires; therefore, misinterpretation of the questions by the patients is unlikely to be an explanation. The semantics of the questions asked by the physician were not monitored and could be an important factor. A question might be formulated slightly differently by the physician than in the questionnaire. If the physician asks the patient about a complaint, he might say: “Do you ever have complaints about…?”, while the questionnaire formulates the same question as: “Do you ever have the following symptoms?” In this way, the physician will hear about the patient’s complaints, whereas the questionnaire scores the symptoms.

Another possible explanation is what we call the “waiter effect”. In general, people are reluctant to complain after a meal, even when it was unsatisfactory, because they consider this to be impolite. Along these lines, Hall et al. [12] reported that a patient may believe that being liked is one way to ensure prompt, conscientious, thorough and considerate care. Being a good patient is motivated by the desire to obtain goodwill by minimising the burden placed on health care providers, so that care would be forthcoming when it was really needed [12–14].

Another explanation is that a specific symptom has improved, which reduces its importance during the interview, although the symptom is still present. We compared the surgical outcome to the original situation using a question from the Patient Global Impression of Improvement Scale. Table 5 shows that the physician no/patient yes group (Table 4) had an average improvement of 72.1% compared with the situation before the operation.

Symptoms may have changed in the time between filling in the questionnaire at home and being interviewed by the physician. Although we did not record this time specifically, it is unlikely to be an important factor because the questionnaires were sent to the patients only a few weeks before their visit to the outpatient department.

Physicians tend to overestimate their own surgical results [15]. If an independent physician had interviewed the patients, the results might have been different. However, this would have been contrary to normal clinical practice.

Distress and embarrassment during the interview could lead to denial of symptoms in order to escape awkward situations. However, our patients knew their physician well, so there was less likelihood of hesitance. In the literature, conflicting information is given about how embarrassment influenced the results [16, 17].

Conclusion

The physician was more optimistic about the outcome of the operation than was justified according to the answers to the patient self-assessment questionnaire. We therefore recommend the use of validated self-assessment questionnaires in clinical studies and in the evaluation of surgical outcome because they help to obtain a more realistic view of the functional results.