Introduction

The Neck Pain and Disability Scale (NPAD) was developed in the USA as a comprehensive measure of neck pain and related disability. The 20-item scale measures problems with neck movements, neck pain intensity, the effect of neck pain on emotion and cognition, and the level of interference with daily life activities. It has been found to be easy for patients to complete and simple to score, and it provides a validated measure to evaluate outcomes in patients with neck pain [1]. Since its original development [2], the NPAD has been translated into various other languages [3–9], including a recently introduced German version [10].

Although high levels of reliability and construct validity have been reported for the original and translated versions of the NPAD, data on its sensitivity to change (the capacity of a measure to detect change in the patient status of interest over time) are scarce. When the NPAD is intended to measure patient change over time, sensitivity to change is a central property, as it is essential for rating the clinical meaningfulness of changes in scores [11]. The aim of this study was to assess the sensitivity to change of the NPAD in a sample of German-speaking subjects from a primary care setting.

Methods

Study design

This is a follow-up survey of patients from a primary care setting in Germany with at least one onset of neck pain between March 2005 and April 2006. The study was approved by the local research ethics committee. As part of a project on the quality of medical care in general practice (MedViP), a network of 104 general practices has been established [12]. Fifteen of these practices within a radius of 30 km around Göttingen, a medium-sized town in the middle of Germany, were selected for participation and provided anonymised electronic patient data (year of birth, sex, diagnosis).

Patients were identified via electronic patient records in general practices and included in a list of potentially eligible persons if at least one consultation for neck pain was documented during the study period. All general practitioners were asked to exclude patients from this list if they had a neck pain consultation because of a new trauma, were terminally ill, suffered from cancer, were in need of nursing care or had severe cognitive impairment. In addition, patients seen by locums only, patients who had moved to a region outside of the study area and patients who were not able to speak German were excluded from the study. Because of data protection regulations, eligible persons were invited to participate directly by the general practitioners’ offices, without names and addresses being transferred to the research team. Participants received a comprehensive self-administered questionnaire covering socio-demographic information, anxiety, depression, social support, and neck pain at baseline and at 3-month follow-up. Questionnaires were sent directly to the research team.

Neck Pain and Disability Scale (NPAD) [1, 2]

The NPAD is a 20-item measure that was specifically developed for patients with neck pain. It measures the intensity of pain; its interference with vocational, recreational, social, and functional aspects of living; and the extent of associated emotional factors. Patients responded to each item by marking along a 10-cm visual analogue scale. Item scores range from 0 to 5, and the total score is the sum of the item scores [possible range 0 (no pain)–100 (maximal pain)]. A valid NPAD score can be generated if no more than 15% of the items are missing [10]. The NPAD has been shown to have validity in comparison to other self-reported pain measures [1] as well as supporting constructs of mood and neuroticism [2]. A German version of the NPAD (NPAD-d) was developed recently demonstrating good reliability and validity. Details on the development and on validity and reliability markers of the NPAD-d have been reported elsewhere [10]. Participants of this study completed the NPAD-d.

Psychosocial and socio-demographic variables

Depressive mood and anxiety were measured by the Hospital Anxiety and Depression Scale (HADS) [13–15], a widely used, short self-assessment questionnaire that mainly asks about psychological manifestations of (generalised) anxiety and depressive mood. It consists of two subscales with seven items each. Possible subscale scores range from 0 to 21. According to the German test manual [16], patients with a depression score > 8 were considered depressed and subjects with an anxiety score > 10 were considered anxious. Perceived social support was measured by the 14-item short form of the Social Support Questionnaire (“Fragebogen zur Sozialen Unterstützung”; F-SozU) [17]. The items refer to different aspects of perceived social support (emotional and instrumental support and social integration), resulting in a global scale with higher scores indicating better social support (five-point scale: from “relevant” = 5 to “not relevant” = 1). The overall score is calculated as the mean score of all completed items. The F-SozU scale was dichotomised because of its skewed distribution. Deficits in social support were defined as having 4 or fewer points out of a maximum of 5 on the F-SozU scale [18]. Age, gender, living with a partner, and education were assessed by single items. Persons who had completed more than 10 years at school were considered to have more than basic education. A variable “younger age class” was defined with six age categories including participants ≥70, 60–69, 50–59, 40–49, 30–39, and <30 years. Single-item questions were used to ask about surgical interventions, injuries of the cervical spine, and neck pain frequency prior to completing the questionnaire. Specifically, a non-NPAD single-item question was used to assess the frequency of neck pain in the 3 months prior to questionnaire completion (once/more than once/continuously). A variable “decrease in pain frequency” was created by identifying persons who reported having neck pain less frequently at follow-up than at baseline according to this categorical non-NPAD single item. All variables were assessed at baseline and follow-up.
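
The cut-offs described above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the code used in the study; the function names and the ordering of the three frequency categories (once < more than once < continuously) are assumptions made for the example.

```python
from statistics import mean


def classify_participant(hads_depression, hads_anxiety, fsozu_items):
    """Apply the cut-offs stated in the text to one participant.

    fsozu_items holds the F-SozU item scores (1-5); None marks an
    item that was not completed.
    """
    completed = [x for x in fsozu_items if x is not None]
    fsozu_mean = mean(completed)  # mean of all completed items
    return {
        "depressed": hads_depression > 8,
        "anxious": hads_anxiety > 10,
        "social_support_deficit": fsozu_mean <= 4,
    }


def decreased_pain_frequency(baseline, follow_up):
    """True if the follow-up category is lower than the baseline one.

    The ranking of the categories is an assumption for this sketch.
    """
    order = {"once": 0, "more than once": 1, "continuously": 2}
    return order[follow_up] < order[baseline]
```

For example, a participant with a HADS depression score of 9 and an anxiety score of 10 would be classified as depressed but not anxious under these rules.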

Statistical analyses

First, summary statistics as well as absolute and relative frequencies were computed to describe the baseline characteristics of the sample. Then, NPAD-d total scores were calculated as previously described. Up to three missing item values were imputed by value substitution based on each subject’s valid responses to NPAD-d items. Specifically, imputed values for missing NPAD items were calculated by dividing the sum of the non-missing NPAD-d items by the number of the non-missing items. Mean NPAD-d values at baseline and at follow-up with 95% confidence intervals (95% CI) were calculated.
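
The scoring and imputation rule described above (item scores 0 to 5, at most three missing items, each missing item replaced by the mean of the subject's valid items) might look as follows in Python. The function name `npad_total` and the use of `None` for a missing item are assumptions for this sketch.

```python
def npad_total(items, max_missing=3):
    """Return the NPAD-d total score, or None if too many items are missing.

    items: list of 20 item scores (0-5), with None for a missing item.
    Missing items are replaced by the mean of the subject's valid items.
    """
    if len(items) != 20:
        raise ValueError("expected 20 item scores")
    valid = [x for x in items if x is not None]
    n_missing = len(items) - len(valid)
    if n_missing > max_missing:
        return None  # no valid score can be generated
    person_mean = sum(valid) / len(valid)
    # Sum of observed items plus one imputed value per missing item.
    return sum(valid) + n_missing * person_mean
```

A subject who answered 18 items with a score of 2 each and left two items blank would receive a total of 40 under this rule.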

Minimal detectable change (MDC; defined as the minimal change that falls outside the measurement error in the score of an instrument used to measure a symptom) was calculated as 1.96 × √2 × SEM [19]. The standard error of measurement (SEM) was estimated by dividing the standard deviation of the sample by the square root of the sample size [20]. The proportion of persons whose NPAD-d score at follow-up was equal to or higher than their NPAD-d score at baseline plus the MDC and the proportion of persons whose NPAD-d score at follow-up was equal to or lower than their NPAD-d score at baseline minus the MDC were calculated.
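
As a worked illustration of the formula above, the following Python sketch computes the SEM and the MDC exactly as defined in this section and classifies a subject's change against the MDC. The function names are hypothetical; the example values (standard deviation 18.4, N = 411) are the baseline figures reported in the Results.

```python
import math


def sem_from_sample(sd, n):
    """SEM as defined in the text: sample SD divided by sqrt(sample size)."""
    return sd / math.sqrt(n)


def mdc95(sem):
    """Minimal detectable change: 1.96 * sqrt(2) * SEM."""
    return 1.96 * math.sqrt(2) * sem


def changed_by_mdc(baseline, follow_up, mdc):
    """Classify a subject as increased, decreased or unchanged by the MDC."""
    diff = follow_up - baseline
    if diff >= mdc:
        return "increase"
    if diff <= -mdc:
        return "decrease"
    return "unchanged"


sem = sem_from_sample(18.4, 411)   # about 0.9
mdc = mdc95(sem)                   # about 2.5
```

With these inputs the sketch gives an SEM of about 0.9 and an MDC of about 2.5, which matches the values reported in the Results.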

Then, the sensitivity to change of the NPAD-d was analysed using three different approaches. First, responsiveness was evaluated by the standardised response mean (SRM) [21]. The SRM is the mean change in score divided by the standard deviation of the changes in scores. A larger SRM indicates a greater sensitivity to change. For the total sample and for those persons who changed by the MDC (increase or decrease), the mean change in score was calculated using absolute values for increase and decrease in NPAD-d [22]. The mean change was calculated separately for the group of patients who increased and who decreased by the MDC. In addition, mean change was calculated for the patients who reported a decrease in pain frequency in the non-NPAD item. The SRM was then calculated based on the NPAD-d mean changes for the total sample and for the different subsamples. For sensitivity analysis, the SRM was also calculated using the jackknife resampling technique [23]. In jackknifing, the statistical estimate is systematically recomputed leaving out one observation at a time from the sample set. Second, linear regression analysis was used to investigate the change of the NPAD-d at follow-up by several characteristics that have been shown in earlier studies to ameliorate neck pain (more than basic education [24–26], younger age class [27]) or deteriorate it (depression, anxiety, deficits in social support [27, 28]). The regression models were adjusted for NPAD-d change by MDC from baseline to follow-up (see definition above). This analysis is considered appropriate if identifiable subgroups of patients who are expected to change by different amounts are assessed at two points in time [29, 30].
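
The SRM and its jackknife counterpart can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the Stata procedure used in the study: the function names are hypothetical, and the bias-corrected estimate and standard error follow the textbook leave-one-out jackknife formulas, which may differ in detail from the software's implementation.

```python
import math
from statistics import mean, stdev


def srm(changes):
    """Standardised response mean: mean change / SD of the changes."""
    return mean(changes) / stdev(changes)


def jackknife_srm(changes):
    """Leave-one-out jackknife estimate of the SRM and its standard error.

    Assumes the standard jackknife formulas; needs at least 3 observations.
    """
    n = len(changes)
    full = srm(changes)
    # Recompute the SRM n times, each time leaving out one observation.
    loo = [srm(changes[:i] + changes[i + 1:]) for i in range(n)]
    loo_mean = mean(loo)
    estimate = n * full - (n - 1) * loo_mean  # bias-corrected estimate
    se = math.sqrt((n - 1) / n * sum((t - loo_mean) ** 2 for t in loo))
    return estimate, se
```

For the total sample, `changes` would hold the absolute values of the individual score changes, as described above. An approximate 95% confidence interval could then be formed as the estimate plus or minus 1.96 times the standard error, assuming approximate normality.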

Third, Pearson’s correlation coefficients were calculated between the change in NPAD-d score (baseline minus follow-up) and the change in prognostic markers (HADS depression subscale, HADS anxiety subscale and F-SozU scale; baseline minus follow-up in each case). This was done in the total sample, which consists of patients many of whom are expected to change by different amounts and who are assessed at two points in time. In this case, the ability of the measure to detect change is based on a correlation analysis [21].
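
The correlation of change scores can be illustrated with a short Python sketch. The helper names are hypothetical, and the example is only meant to show how baseline-minus-follow-up differences feed into Pearson's coefficient.

```python
import math


def change_scores(baseline, follow_up):
    """Per-subject change, computed as baseline minus follow-up."""
    return [b - f for b, f in zip(baseline, follow_up)]


def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient of two samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

A call such as `pearson_r(change_scores(npad_t0, npad_t1), change_scores(hads_t0, hads_t1))` would give the coefficient for one prognostic marker, where the four lists are placeholder names for the per-subject scores at each time point.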

For uncertainty analysis, the sensitivity to change analyses were repeated in the sample restricted to subjects with complete data for all 20 NPAD-d items. All analyses were performed using Stata 9.2 (Stata Corporation, College Station, TX, USA).

Results

Of 1,228 persons fulfilling inclusion criteria, 483 persons (39%) were willing to participate in the study. Of those, 35 did not return or complete the questionnaire and for 37 persons there was no NPAD-d score at follow-up available resulting in an analytical study sample consisting of 411 persons (Fig. 1).

Fig. 1 Flowchart of participants

Forty-three percent of participants were 50 years or older and 77% of the study population were female. About two-thirds had more than basic education and almost half did physical exercise once or more a week. More than half of the study population suffered from neck pain on the day of questionnaire completion (56%), 41% reported having had neck pain on more than 100 days in the preceding year, and for more than a quarter (27%) neck pain was constantly present in the last year. Very few participants (2%) reported having had a surgical intervention of the neck, but 20% of the study population had experienced an accidental injury of the neck. One-fifth of the participants presented with depressive mood, one-fourth were found to be anxious, and one-third reported deficits in their social support system (Table 1).

Table 1 Baseline characteristics of the sample (N = 411)

NPAD-d mean values at baseline were higher (48.2, 95% CI 46.4–50.0) than NPAD-d mean values at follow-up (45.8, 95% CI 43.9–47.7), which corresponds to higher levels of neck pain at baseline. The mean change in NPAD-d in the total sample (including both increase and decrease in absolute numbers) was 9.4 (95% CI 8.6–10.2). With a standard deviation of the NPAD-d at baseline of 18.4 and an SEM of 0.9, the MDC was 3 (2.5 rounded up). One hundred thirty-one persons (32%) reported NPAD-d scores that were increased by the MDC from baseline to follow-up, and the mean change in NPAD-d for those persons was 10.6 ± 7.5. One hundred ninety-one persons (46%) had NPAD-d scores at follow-up that were decreased by the MDC, and the mean change in NPAD-d in this subgroup was −12.3 ± 8.9. Seventy-two persons of the total sample reported a decrease in neck pain frequency from baseline to follow-up (19%), and for this subgroup the mean change was −4.1 ± 10.9 (Table 2).

Table 2 Neck Pain and Disability Scale German version (NPAD-d) values at baseline and at 3-month follow-up, mean changes and change in external criterion (N = 411)

The SRM for the total sample was 1.096 (95% CI 0.999–1.193). When restricting the analysis to those persons who changed by the MDC or more, SRM values were considerably larger (increase and decrease: 1.389; increase: 1.418; decrease: −1.389). When the non-NPAD item as an external criterion was used for identification of persons with decrease, the SRM value dropped to −0.374 (Table 3). The subgroup of persons who reported a decrease in neck pain frequency according to the non-NPAD item, however, was small (N = 72) leading to considerable imprecision as measured by the standard deviation of the mean change (Table 2). SRM values using jackknife technique were marginally smaller with a slightly broader confidence interval than those calculated without resampling (e.g. SRM for the total sample 1.002, 95% CI 0.885–1.119).

Table 3 Standardised response mean (SRM) of the Neck Pain and Disability Scale German version (NPAD-d)

Table 4 shows the results of the linear regression analysis of the NPAD-d at follow-up with ameliorating and deteriorating factors for neck pain. These data are in the direction anticipated across all baseline factors investigated. Those having more than basic education and those in a younger age class consistently reported significantly lower average NPAD-d scores at follow-up compared to those with basic education and those in an older age class. In contrast, those who were classified to be depressed or anxious or to have deficits in social support reported significantly higher NPAD-d scores.

Table 4 Regression analysis of the Neck Pain and Disability Scale German version (NPAD-d) at follow-up with ameliorating and deteriorating baseline factors

Table 5 depicts the results of the correlation analysis of the mean difference in NPAD-d values between baseline and follow-up with clinical prognostic measures that are known to be predictive of neck pain. The mean change in NPAD-d correlated significantly with the mean change in the HADS depression subscale, in the anxiety subscale and in the F-SozU scale. The correlation coefficients were in the direction anticipated: A positive coefficient for the two HADS subscales indicates increasing levels of neck pain for subjects with increasing levels of depression or anxiety. A negative coefficient for the F-SozU scale points at increasing levels of neck pain for subjects with decreasing levels of social support.

Table 5 Correlation analysis of the mean change in the Neck Pain and Disability Scale German version (NPAD-d) with the mean change in prognostic measures between baseline and 3-month follow-up

Eighty-one of the 411 persons (20%) who had a valid NPAD-d score had one or more missing values. This resulted in a complete-subject sample of 330 persons. Sensitivity to change analyses of the subsample restricted to subjects with complete data for all 20 NPAD-d items revealed no substantially different results.

Discussion

The NPAD seems to be sensitive to change in neck pain patients from general practice. The NPAD—as measured by the NPAD-d—demonstrated sensitivity to change both in subgroups that were expected to change by different amounts and in the total sample consisting of patients many of whom are expected to change by different amounts. The data presented here are further evidence for the validity of the scale and increase trust in future applications of the NPAD.

This study evaluated sensitivity to change of the scale in relation to psychosocial parameters. Psychosocial factors, e.g. depression and anxiety, predict course and prognosis of unspecific neck pain [27]. Unspecific neck pain is very prevalent in the general practice population targeted in this study. A common design to assess sensitivity to change is to apply interventions of known effectiveness and compare scale scores with a placebo group. However, there is no convincing evidence for the effectiveness of widely used therapeutic options such as exercise, manipulation and mobilisation, acupuncture, or injection therapies in this patient population [31–34]. Thus, we preferred psychosocial parameters as opposed to clinical markers (e.g. range of motion) to evaluate sensitivity to change for this setting. Nevertheless, analyses using clinical markers are useful when evaluating properties of the scale for different settings or in different patient populations.

With the scale ranging from 0 to 100, the MDC of 3 represents an excellent value indicating that the NPAD-d is able to detect very small changes. These results may help to calculate the sample size of future studies aiming to assess the effectiveness of neck pain interventions. The MDC value provided here is a basis for clinicians’ interpretation of their patients’ NPAD values. However, more research on the practicability and utility of self-administered pain scales in busy clinical settings is needed.

There are several limitations to consider in evaluating this research. Firstly, of 1,228 eligible persons, only 411 (33%) returned valid baseline and follow-up questionnaires and were thus included in the study. One reason is that persons were identified via routine electronic data, which was only possible inside the general practitioners’ offices. Therefore, it was not possible to contact eligible persons more than once, to send any reminders, or to analyse differences between participants and non-participants. However, this study was conducted in a relatively large group recruited by a defined algorithm from the whole patient population of various practices. It may, therefore, be largely representative of the typical neck pain patients participating in this kind of study, and the quite large number of exclusions can be traced back to predefined reasons according to this algorithm (Fig. 1). Secondly, the population consisted largely of subjects with mild or moderate unspecific neck pain. Although this may be expected in this adult population, sensitivity to change should also be tested in populations with severe pain and/or disability. Thirdly, we used the German version of the instrument in a German-speaking sample. Therefore, results may not be generalisable to other language versions of the NPAD. Fourthly, factors other than those included in this study influence neck pain (e.g. therapeutic interventions), and their effect on the NPAD has not been investigated. However, this study included several important factors that have been shown to have a significant impact on the course and prognosis of neck pain [27, 28]. Investigating sensitivity to change of the scale by applying therapeutic interventions to one group and comparing their NPAD values with those of an untreated group was not part of this study and should be analysed in other, more appropriate settings (e.g. in spine surgery departments).

In addition, another popular method to assess the sensitivity to change of a measure is the retrospective rating of change. Such a measure was not included in the design of the present study. However, this approach has been challenged because, among other reasons, retrospective judgements of change may be subject to recall bias. So-called “response shifts” can affect or distort outcome measures in medical or psychosocial research [35, 36].

Meanwhile, Bremerich and colleagues [37] proposed a slightly different German version of the NPAD. They used a small sample of patients with a confirmed physical diagnosis accounting for neck pain who were treated at a rheumatology clinic. Our NPAD-d version, in contrast, has been developed especially for use in general practice with a high proportion of patients suffering from unspecific neck pain. Although evaluated in different settings with different patients, reliability and validity parameters of both German NPAD scales were generally consistent; Bremerich et al. reported an MDC of 10.5 compared to 3 in this study. This difference is probably due to the smaller sample size in Bremerich’s study.

Sensitivity to change of the NPAD has not been studied extensively. Goolkasian et al. reported score changes for the NPAD original version after a 16-week injection therapy. Their study demonstrated significant and clinically meaningful differences in NPAD scores according to clinical parameters. A Turkish study investigated sensitivity to change comparing Turkish versions of the Neck Disability Index, the Northwick Park Pain Questionnaire, the Copenhagen Neck Functional Disability Scale and the NPAD [38]. All those scales were highly valid, reliable and sensitive to change as measured by SRM values and correlation analyses. SRMs identified in the present study indicated good sensitivity to change when based on statistical criteria for change in the patients’ health status. However, when an external criterion was applied to identify persons who changed, the SRM dropped to a moderate value. We believe this is due to two reasons. Firstly, the external criterion was a rough measure of change, as it only asks about change in the frequency but not in other dimensions of neck pain. Secondly, the analysis was possible only in a small subgroup of patients, which resulted in comparatively high imprecision of the mean change in this subgroup. This in turn leads to an even smaller SRM value. As this study did not focus on non-NPAD assessment of change in neck pain, future studies should evaluate sensitivity to change in comparison with other, more comprehensive external criteria.

The scales investigated in the Turkish study differed in their acceptability and usefulness with the NPAD showing good acceptability and a low number of missing values. Hence, results from this study are well in line with previous research. However, when making a decision regarding an appropriate scale, researchers and clinicians should consider the scale’s ability to measure change when the quality of a questionnaire is critically appraised. In conclusion, the NPAD seems to be a sensitive measure of neck pain and related disability for use both in clinical and research settings.