INTRODUCTION

Episodes of transient loss of consciousness (TLOC) are very common1,2 and can be caused by a large number of clinical disorders.3 Most episodes of TLOC are caused by benign conditions like vasovagal syncope.2 Only a small group of patients with TLOC suffer from potentially lethal cardiac arrhythmias.4

Patients with TLOC have a health-related quality of life (HR-QoL) comparable to that of patients with serious chronic conditions like rheumatoid arthritis, as assessed with generic HR-QoL questionnaires.5–7 The absence of physical symptoms between episodes in combination with the risk of recurrences or even death4,8 makes TLOC a unique clinical syndrome. A disease-specific HR-QoL questionnaire should provide greater insight into the impairment of patients with TLOC than a generic HR-QoL questionnaire.

In an earlier study, Linzer et al. described the development and preliminary validation of the disease-specific Syncope Functional Status Questionnaire (SFSQ).9 Limited validation of the instrument was performed in 49 severely affected American patients9 against the Sickness Impact Profile10 and Symptom Checklist 90-R.11 More extensive validation in a larger and more representative group of patients has not yet been performed.

The full evaluation of any health measurement scale requires establishing its reliability, validity, and responsiveness to change.12 The objective of the present study was therefore to test the reliability, validity and responsiveness of the SFSQ in a large cohort of patients presenting with TLOC.

METHODS

Subjects

This study is part of the Fainting Assessment Study (FAST), which was designed to assess the diagnostic yield and accuracy of attending physicians in patients presenting with TLOC (Van Dijk et al., submitted for publication).6,7 The Academic Medical Center (AMC) is a first-line referral center for patients from the surrounding area and a specialty center for patients with TLOC from the Dutch population. Adult patients presenting to the AMC between January 2000 and July 2002 with at least one episode of TLOC in the 12 months before presentation were eligible for the study. TLOC was defined as self-limiting episodes of loss of consciousness, not lasting longer than 1 hour. Patients incapable of reading Dutch and patients with a physical/mental impairment that prevented their completing the questionnaires were excluded. All patients gave informed consent. This study was approved by the Medical Ethical Committee of our institution.

Data Collection

Clinical Data

At presentation, an initial clinical evaluation was performed, including in most cases a standardized history, a physical examination and a 12-lead ECG.3 If no diagnosis could be made after initial evaluation, additional cardiologic testing and autonomic function testing were performed.

Data on comorbidity were obtained from the medical records and numerically converted using the Charlson comorbidity index.13 Clinical follow-up data were obtained by sending a clinical follow-up questionnaire to all patients at 1 and at least 2 years after presentation. If a questionnaire was not returned, the patient, a family member, the general practitioner, or the medical insurance company was contacted by telephone to obtain the requested information. Final diagnoses were made by an expert committee after 2 years of follow-up (Van Dijk et al., submitted for publication).

Quality-of-life Data

HR-QoL was assessed using the SFSQ as the questionnaire under study9 and the Short Form-36 (SF-36) as a comparative measure.14

The SFSQ consists of 11 yes/no questions, assessing the areas in which syncope interferes with a patient’s daily life, and three eight-point Likert-scale questions that assess a patient’s fear and worry about syncope. The impairment score was calculated by dividing the number of areas in which syncope interferes with the patient’s life (ranging from 0 to 11) by the total number of areas applicable to the patient’s life, multiplied by 100. This score ranges from 0 to 100, with higher scores representing more interference in the patient’s life. The three Likert-scale questions were linearly converted to a 0–100 scale and averaged to calculate the fear–worry score, also scaled from 0 to 100 with higher scores indicating more fear and worry. A syncope dysfunction score (SDS) was then calculated as the mean of the impairment score and the fear–worry score.9 The questionnaire was translated into Dutch by means of the forward–backward method.
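The scoring rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, "not applicable" answers are represented as None, and the eight-point Likert items are assumed to be coded 1 to 8 (the coding is not given in the text, so the low and high anchors are parameters).

```python
from statistics import mean

def impairment_score(answers):
    """answers: 11 items, each True (syncope interferes), False, or None (not applicable)."""
    applicable = [a for a in answers if a is not None]
    if not applicable:
        return None  # undefined when no area applies to the patient
    # Number of affected areas divided by number of applicable areas, times 100.
    return 100.0 * sum(applicable) / len(applicable)

def fear_worry_score(likert, low=1, high=8):
    """Linearly rescale each Likert item from [low, high] to 0-100, then average.
    The 1-8 coding is an assumption, not taken from the source."""
    return mean(100.0 * (x - low) / (high - low) for x in likert)

def syncope_dysfunction_score(answers, likert):
    """SDS: mean of the impairment score and the fear-worry score."""
    imp = impairment_score(answers)
    if imp is None:
        return None
    return (imp + fear_worry_score(likert)) / 2

# Example: 10 applicable areas, 5 of them affected; maximal fear and worry.
answers = [True, True, True, True, True, False, False, False, False, False, None]
print(impairment_score(answers))                      # 50.0
print(syncope_dysfunction_score(answers, [8, 8, 8]))  # 75.0
```

Because the denominator is the number of applicable areas, two patients with the same number of affected areas can receive different impairment scores.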

The SF-36 is a self-administered scale that measures HR-QoL by scoring responses to standardized questions. The SF-36 provides eight scale scores, a score for reported health transition, and two summary scale scores.14 Translation, validation, and norming of the Dutch language version have been performed earlier by Aaronson et al.15

All patients received the questionnaires with a return envelope directly after their initial presentation to our hospital. Follow-up questionnaires were sent to the patients’ home addresses 1 year after initial presentation. Nonrespondents received written reminders 2 and 4 weeks after the questionnaires were sent.

To assess test–retest reliability, a convenience sample of 52 respondents to the first questionnaire was asked to fill out the SFSQ again 1 week after they returned the first questionnaire. A 1-week period was chosen to minimize both memory bias and relevant changes in clinical condition.16

Analysis

All HR-QoL questionnaires were entered into an SPSS database by two different people. Differences were checked and corrected according to data on the original questionnaire. All analyses were performed using SPSS version 11.5.

Sociodemographic and clinical data were expressed as percentages for categorical data and mean (SD) or median (quartiles) for numerical data. Sociodemographic and clinical data of respondents and nonrespondents were compared using a chi-square test, t test, or Mann–Whitney U test where appropriate.

Reliability

Score distributions (skew, floor and ceiling effects, and missing items) were examined for the SFSQ (impairment score, fear–worry score and overall SDS). To assess internal consistency of the impairment score the Kuder–Richardson formula 20 (KR-20) was used,17 while the fear–worry score was evaluated using Cronbach’s alpha. A coefficient >0.7 was considered to indicate adequate internal consistency.12,18
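As an illustration of the two internal-consistency coefficients named above, the following sketch computes Cronbach's alpha and KR-20 with the Python standard library. It assumes complete cases only (no missing or "not applicable" items) and uses population variances throughout; the function names are hypothetical, and the analyses in this study were run in SPSS, not with this code.

```python
from statistics import pvariance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (complete cases only)."""
    k = len(rows[0])
    item_vars = [pvariance([r[j] for r in rows]) for j in range(k)]
    total_var = pvariance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def kr20(rows):
    """KR-20 for 0/1 items: item variance is p*q, with p the proportion scoring 1."""
    n, k = len(rows), len(rows[0])
    p = [sum(r[j] for r in rows) / n for j in range(k)]
    pq = sum(pj * (1 - pj) for pj in p)
    total_var = pvariance([sum(r) for r in rows])
    return k / (k - 1) * (1 - pq / total_var)

# Toy data: six respondents, three yes/no items coded 1/0.
data = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
print(round(kr20(data), 3))
```

On dichotomous items the two functions return the same value, because p*q is the population variance of a 0/1 variable; KR-20 is the special case of alpha for yes/no questions.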

Test–retest reliability of the SFSQ was assessed by testing the mean difference in scores between the two administrations (displayed with 95% CI) and by calculating the intraclass correlation coefficient (ICC; two-way mixed model; consistency definition). To compare the results of the 11 yes/no questions, we used the kappa statistic (0.2–0.4 fair agreement; 0.4–0.6 moderate; 0.6–0.8 good; >0.8 excellent agreement). Correlation between the Likert questions at both moments was assessed using Pearson’s correlation coefficient (r > 0.4 = moderate correlation; r > 0.75 = high correlation).12
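The two agreement statistics used for the test–retest analysis can be sketched as follows. This is a minimal standard-library illustration, not the SPSS procedure used in the study. The function names are hypothetical, and the ICC shown is the single-measure form of the two-way consistency definition; whether single or average measures were reported is not stated in the text, so that choice is an assumption.

```python
def cohen_kappa(first, second):
    """Chance-corrected agreement between two administrations of one yes/no item.
    first, second: lists of category codes (e.g. 1 = yes, 0 = no)."""
    n = len(first)
    categories = set(first) | set(second)
    p_obs = sum(a == b for a, b in zip(first, second)) / n
    p_exp = sum((first.count(c) / n) * (second.count(c) / n) for c in categories)
    return (p_obs - p_exp) / (1 - p_exp)

def icc_consistency(first, second):
    """Single-measure ICC, two-way model, consistency definition, for two occasions."""
    n, k = len(first), 2
    rows = list(zip(first, second))
    grand = (sum(first) + sum(second)) / (n * k)
    row_means = [sum(r) / k for r in rows]
    col_means = [sum(first) / n, sum(second) / n]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between occasions
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ms_rows = ss_rows / (n - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)

test1 = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
test2 = [1, 1, 0, 0, 1, 0, 0, 1, 0, 1]
print(round(cohen_kappa(test1, test2), 2))  # 0.6
```

Under the consistency definition a systematic shift between the two occasions (every retest score 5 points higher, say) does not lower the ICC, which is why the mean difference between occasions is tested separately.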

Validity

To assess the validity of the SFSQ scale scores, we compared them to the inverse results of the subscales of the SF-36. It was expected that the impairment score would show high correlation with physical functioning (PF), role functioning physical (RP) and social functioning (SF). The fear–worry score was expected to show high correlation with mental health (MH), and the SDS with the general health (GH) domain. The relations between the scale scores were calculated using Pearson’s correlation coefficient (r). An r < 0.4 was considered to indicate low, r > 0.4 moderate, and r > 0.75 high correlation.12
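A small sketch of this comparison follows, assuming that the "inverse" of an SF-36 scale score means 100 minus the score (SF-36 scales run from 0 to 100 with higher scores indicating better health). The function names are hypothetical, and the handling of a value of exactly 0.4 is an assumption, since the text leaves that boundary open.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_band(r):
    """Bands from the text: below 0.4 low, above 0.75 high, otherwise moderate.
    Treating exactly 0.4 as moderate is an assumption."""
    r = abs(r)
    if r > 0.75:
        return "high"
    if r >= 0.4:
        return "moderate"
    return "low"

def sfsq_vs_sf36(sfsq_scores, sf36_scores):
    """Correlate an SFSQ scale with the inverted SF-36 scale (assumed 100 - score)."""
    inverted = [100 - s for s in sf36_scores]
    r = pearson_r(sfsq_scores, inverted)
    return r, correlation_band(r)

print(correlation_band(0.58))  # moderate
```

Inverting the SF-36 score only flips the sign of r, so that a positive correlation means that more syncope-related impairment goes together with poorer generic HR-QoL.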

Clinical Validity

The method of known-group comparison19 was used to determine the extent to which the SFSQ and the SF-36 were able to discriminate between mutually exclusive subgroups of patients, differing in age, gender, and number of TLOC episodes. We expected the SF-36 to discriminate between patients with differences in comorbid conditions other than TLOC, but not the SFSQ, because the SFSQ aims to specifically assess impairment caused by syncopal episodes. Results were displayed as mean differences (95% CI) and compared using a Student’s t test.
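A known-group comparison of this kind reduces to a two-sample comparison of scale scores. The sketch below computes the mean difference, the pooled-variance (Student) t statistic, and its degrees of freedom using only the standard library. Note one deliberate simplification: the 95% CI uses the normal quantile in place of the t quantile, which is only reasonable for large groups; the study itself used SPSS. The function name and the example scores are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def known_group_difference(group_a, group_b, level=0.95):
    """Mean difference (a - b), Student t statistic, df, and a large-sample CI."""
    na, nb = len(group_a), len(group_b)
    diff = mean(group_a) - mean(group_b)
    # Pooled variance, as in the equal-variance Student t test.
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    se = sqrt(pooled * (1 / na + 1 / nb))
    # Normal approximation to the t quantile (an assumption for large samples).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return {"diff": diff, "t": diff / se, "df": na + nb - 2,
            "ci": (diff - z * se, diff + z * se)}

# Hypothetical impairment scores: more than one episode vs. a single episode.
result = known_group_difference([10, 20, 30], [0, 10, 20])
print(result["diff"], result["df"])  # 10 4
```

A CI that excludes zero corresponds to a scale that discriminates between the two subgroups at the chosen level.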

Responsiveness

Changes in SFSQ scores were expected to be in agreement with the perceived change in health status in patients after 1 year follow-up. Based on the five answering options to the transition question in the SF-36 (compared to 1 year ago, how would you rate your health in general now), patients were divided into five groups. The changes in SFSQ scores were calculated per group and displayed as mean (standard error of the mean). Differences were tested using ANOVA.
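The responsiveness analysis described above can be sketched as: compute each patient's change score, group by the answer to the SF-36 transition question, report mean (SEM) per group, and compare groups with a one-way ANOVA. The code below is an illustration with hypothetical function names and made-up toy data; it returns the F statistic only, because the standard library has no F distribution for the p-value.

```python
from math import sqrt
from statistics import mean, stdev

def change_by_group(baseline, followup, transition):
    """Group change scores (follow-up minus baseline) by transition answer."""
    groups = {}
    for b, f, g in zip(baseline, followup, transition):
        groups.setdefault(g, []).append(f - b)
    return groups

def mean_sem(values):
    """Mean and standard error of the mean."""
    return mean(values), stdev(values) / sqrt(len(values))

def anova_f(groups):
    """One-way ANOVA F statistic across the groups' change scores."""
    all_values = [v for vals in groups.values() for v in vals]
    grand = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(v) * (mean(v) - grand) ** 2 for v in groups.values())
    ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Toy data with three of the five transition categories.
baseline = [50, 60, 40, 30, 20, 10]
followup = [30, 44, 40, 32, 40, 26]
transition = ["better", "better", "equal", "equal", "worse", "worse"]
groups = change_by_group(baseline, followup, transition)
for label, values in groups.items():
    print(label, mean_sem(values))
```

Because higher SFSQ scores mean more impairment, a negative change score indicates improvement.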

Responsiveness of the SFSQ was also tested by comparing changes between baseline and 1-year follow-up scale scores with changes in clinical condition during follow-up. These changes were defined as: (1) presence or absence of recurrences during follow-up,20,21 (2) a diagnosis for the episodes or not,7 and (3) diagnostic category.7 Differences were assessed using Student’s t test or ANOVA.

RESULTS

Subjects

From January 2000 until July 2002, 503 patients were enrolled in FAST. Thirty-five patients were excluded from this HR-QoL study because of language problems (n = 14) or being physically or mentally unable to complete the questionnaires (n = 21). Of the remaining 468 patients, 385 (82%) returned their questionnaires. Twelve of these respondents were excluded from the 1-year follow-up because of death during the follow-up period (n = 10), severe dementia (n = 1), or detention (n = 1). Of the remaining 373 respondents, 268 (72%) filled in the questionnaires again at 1-year follow-up. Fifty-two patients completed the SFSQ twice for the test–retest analysis, with a mean interim period of 8 days (SD 6.9). The sociodemographic and clinical characteristics of the respondents and nonrespondents at baseline and 1-year follow-up are displayed in Table 1. Nonrespondents were significantly younger, more often female, and had experienced fewer episodes of TLOC than respondents.

Table 1 Sociodemographic and Clinical Characteristics of Study Subjects Completing Baseline and Follow-Up Surveys6,7

Reliability

Table 2 lists the means, SDs, medians, and ranges for the summary scores of the SFSQ. For all scales, the full range of possible scores was seen. Score distributions were asymmetrical, with large peaks at 0 (no impairment) for all scales. The number of missing values ranged from 2.1% (relationship with family/friends) to 5.2% (use of public transportation). Internal consistency was 0.88 for the impairment score and 0.92 for the fear–worry score.

Table 2 Score Distributions of the Summary Scales (Impairment Score and Fear–Worry Score) and Overall Score (SDS) of the SFSQ

Test–retest

Mean kappa for the 11 yes/no questions was good (0.61), and kappas for the individual questions ranged from moderate to good (0.41–0.75). Correlation for the Likert-scale questions was moderate (0.56–0.60), whereas the ICC for the summary scales was high (0.78 for all scales). There was no statistically significant difference between the baseline and retest questionnaire scale scores.

Validity

Correlations between the scale scores of the SFSQ and the SF-36 were low to moderate, ranging from 0.30 to 0.62 (Table 3). The impairment score showed moderate correlation with the PF, RP, and SF scores (0.52, 0.54, and 0.58, respectively). The correlation between the fear–worry score and MH was moderate (0.48) but higher than the correlation with any other scale. The SDS showed moderate correlation with GH (0.50) and similar correlations with the other scale scores of the SF-36.

Table 3 Correlation (r) Between Scale Scores of the SFSQ and Scale Scores of the SF-36 at Baseline

Clinical Validity

The SDS did not discriminate between patients differing in age and gender. Patients with more than one episode in the year before presentation showed a significantly poorer HR-QoL on all SFSQ scales than patients with one episode before presentation. The impairment score was significantly poorer in patients with comorbid conditions than in patients without comorbid conditions (Table 4). These differences were comparable to the minimally important difference for these scales7 and, therefore, can also be considered clinically relevant. The scores on the physical component summary of the SF-36 were different for subgroups of patients differing in gender, number of episodes, and Charlson score. The mental component summary (MCS) scores were not different between any of the subgroups.

Table 4 Known-Group Comparison: Discriminative Properties of the Syncope Functional Status Questionnaire Between Mutually Exclusive Subgroups Differing in Sociodemographic and Clinical Status

Responsiveness

The changes in the scale scores of the SFSQ between baseline and 1-year follow-up, per category of perceived change in health status, are displayed in Figure 1. Changes were linearly related to the change in health status and differed significantly between categories (ANOVA; p < 0.01). The improvement in impairment score (mean difference 11.8; p < 0.01) and in SDS (mean difference 8.2; p = 0.02) was larger in patients without recurrences during follow-up than in patients with recurrences. The changes were not different between patients with or without a diagnosis or with different diagnoses.

Figure 1 Mean changes on scales of the SFSQ related to change in clinical status of the patient over 1 year (much better, somewhat better, equal, somewhat worse, much worse), measured by the health transition question of the SF-36, for n = 268 patients. A drop in score indicates improvement of the patient. Imp, impairment score; F/W, fear–worry score.

DISCUSSION

Summary of Results

In this study we examined the reliability, validity, and responsiveness of a disease-specific functional status questionnaire for patients with syncope. The internal consistency of the impairment score and fear–worry score items was high, with coefficients of 0.88 and 0.92, respectively. Test–retest reliability was moderate to good for the 11 yes/no impairment questions, moderate for the Likert-scale fear–worry questions, and high for the summary scales and the SDS. Validity was moderate when comparing the scale scores of the SFSQ with comparable scale scores of the SF-36. The SFSQ showed adequate discriminative properties between some of the clinically defined subgroups of patients but not between sociodemographically defined patient subgroups. This finding was in concordance with the results of the SF-36. Responsiveness to perceived change in health status and the presence of recurrence during follow-up was good.

Reliability

Both scale scores and the SDS show a large floor effect. This indicates that the SFSQ is not very sensitive to detecting differences between patients with a low impact of the episodes on their lives, i.e., good HR-QoL scores. Thus, when studying a relatively healthy population, the SFSQ might not be adequately sensitive. This could in part be the consequence of the dichotomous choices (yes/no) for the impairment questions. To make the instrument more sensitive to small changes in relatively healthy subjects, one could replace the yes/no response scale with a five-point Likert scale. This would likely lead to the instrument being more sensitive to changes in clinical conditions in all patients. Furthermore, in the existing scoring system, all areas of impairment are considered equally important; these questions may benefit from a weighting system. In future use, we also recommend not combining the impairment score and fear–worry score to obtain the SDS, as the separate scale scores may provide more valid information.

The number of missing items was low, indicating a high acceptability of the questionnaire. Although we expected that the question on sexual functioning would be considered inappropriate by some, this question was answered as often as the other questions.

The impairment score is initially based on 11 questions. In cases where patients have “not applicable” responses (e.g., when the patient does not drive, does not work, has no partner, is isolated socially, and is not sexually active), the utility of the instrument may be diminished. For future iterations of the questionnaire, the term “partner” could be added to the question “does syncope interfere with your relation with spouse/boyfriend/girlfriend” to make the question applicable to more patients.

The internal consistency of the measure is comparable to that found in the earlier study by Linzer et al.9 The test–retest method shows that the within-subject variability is quite small when comparing the results to the mean changes seen in patients with different changes in health status (Fig. 1). This should enable researchers to use the SFSQ in clinical trials with relatively small sample sizes,22 keeping in mind that the test–retest reliability is related to the variance of the study population in which it was tested. Our population was quite large and heterogeneous; therefore, these results might differ in a more homogeneous population.23 Furthermore, the respondents in our population differed from the nonrespondents, which also might affect the generalizability of our results.

Validity

The correlation between the scores of the SFSQ and the preselected scales of the SF-36 was moderate. Although these correlations are among the highest found, they are not markedly different from the correlations with the other scale scores of the SF-36. The relatively high correlation of the impairment score with the role functioning emotional (RE) scale can be explained by the fact that the questions that comprise the RE scale are work-related. The fear–worry score shows moderate correlation with the MH scale of the SF-36. A possible explanation for this moderate correlation is that the items of the SF-36 focus on MH in general, whereas the SFSQ focuses on fear and worry on account of episodes of TLOC. Individuals are likely to have other sources of emotional distress in addition to their syncope episodes.

Clinical Validity

The SFSQ has been shown to differentiate between patients with one versus those with more than one episode and between patients with or without comorbid disorders; it does not differentiate between age groups or between genders. This could indicate that the questionnaire is not very discriminative or that the difference between the selected subgroups was not large enough for any differences to be detected. The second explanation seems plausible because the SF-36, particularly the MCS, does not discriminate between the selected groups either. Another view could be that the lack of difference is actually a strength of the instrument, because influences of age and gender may arise from causes other than syncope.

Responsiveness

The results of the changes in scores related to changes in health status (Fig. 1) and the differentiation between patients with and without recurrences during follow-up indicate that the responsiveness of the SFSQ is appropriate, especially because recurrences have been shown to be one of the main influences on the HR-QoL of patients with TLOC.7,20,21,24 Although the SFSQ was not responsive to changes in HR-QoL in patients with different diagnoses, earlier studies have shown that the difference in the HR-QoL of patients with different diagnoses is modest.7 It therefore seems that other factors are more important contributors to QoL. One possible explanation could be that patients interpret the questions to be related to the moment of an episode instead of their daily life. To prevent this issue, a clearer time frame could be added to the questionnaire, for instance, focusing patients on the last 4 weeks.

Clinical Application of the Instrument

The SFSQ can provide important information on the effect of interventions on HR-QoL of patients, especially in moderately or severely affected patients. With minimal amendments to the instrument, the SFSQ could also be reliable in less affected groups of patients. We recommend that this instrument be used primarily for group comparisons in trials comparing diagnostic tests or treatment effects. It could also be a viable option in cohort studies, where patients are compared over different time points. Furthermore, we feel that individual patient SFSQ scores could be informative for the physician, giving the physician an impression of the patient’s fear and worry and general experience as a result of the episodes.