Introduction

The Japanese Orthopaedic Association (JOA) score for low back pain [1] uses evaluation criteria based on physiological, biological, and anatomical outcome measures. The criteria include laboratory values, physiological findings, and imaging findings. These findings are significant for doctors but have little meaning for patients. From a patient’s perspective, what has real meaning is the presence of a symptom, its degree, and the resulting functional condition. This means that outcome measures need to be translatable from an objective evaluation to a subjective one, or from the doctor’s perspective to the patient’s perspective. The JOA therefore decided to revise the JOA score for low back pain and develop a new scientific, patient-oriented outcome measure.

The first committee meeting was held in June 2000, and the first survey was initiated in February 2002 using a preliminary questionnaire consisting of 60 items. It was a self-administered, disease-specific measure developed with reference to the Japanese editions of the SF-36 [2,3] and the Roland-Morris Disability Questionnaire (RDQ) [4,5] to assess health-related quality of life. Based on findings of the survey, 25 items were selected for a draft of the JOA Back Pain Evaluation Questionnaire (JOABPEQ) (see Appendix 1).

The second survey was started in January 2004 to evaluate the reliability of the 25 items selected for the draft JOABPEQ. We successfully confirmed the reliability, and the details have been described in the previous reports, Part 1 [6] and Part 2 [7]. Part 3 of this study involves further development of the new JOA questionnaire, evaluation of the validity of the draft JOABPEQ, and establishment of a measurement scale.

Materials and methods

Recruitment of patients

A total of 369 of the 829 Japanese board-certified spine surgeons were randomly selected and asked to recruit at least three patients each to participate in evaluating the JOABPEQ during February 2004. The inclusion criterion was any type of lumbar spine disorder. Exclusion criteria were patients who had:

  • Other musculoskeletal diseases requiring medical treatment

  • Psychiatric disease, potentially leading to inappropriate answers

  • Postoperative condition

  • Participation in previous surveys related to this study

Testing the questionnaire

Each patient was asked to fill in the self-administered questionnaire. The attending surgeon filled out information on the diagnosis, presence or absence of concomitant diseases, and a judgment regarding the severity of symptoms using a three-step rating scale (mild, moderate, severe). The severity of the symptoms was determined subjectively by the attending surgeon, who was asked not to recruit only patients of similar severity. This study was approved by the Ethics Committee of the Japanese Society for Spine Surgery and Related Research, and informed consent was obtained from each patient.

Factor analysis was used to check the statistical validity of the questionnaire and establish the measurement scale. All statistics were calculated using SPSS software (version 12; SPSS, Chicago, IL, USA).

Results

Patient characteristics

Of the 452 patients selected for participation in this survey, 1 patient who was judged unsuitable by the attending doctor and 60 patients with other musculoskeletal diseases requiring medical treatment were excluded. The responses from 36 patients who answered incompletely were also excluded, leaving 355 patients available for analysis: 201 men and 154 women, with a mean ± SD age of 50.7 ± 18.0 years (Table 1). The diagnosis was lumbar disc herniation in 167, lumbar spinal canal stenosis in 103, and spondylolisthesis in 37.

Table 1 Distribution of age and severity of symptoms (n = 355)

According to the judgment of the attending doctor, there were 115 mild, 142 moderate, and 98 severe cases. Table 2 summarizes the severity of low back pain evaluated by the current JOA scoring system and shows that the characteristics of the recruited patients were not specific. There was no marked difference in the distribution of the severity levels between the 451 patients who were initially recruited and the 355 who were finally analyzed.

Table 2 Distribution of the severity evaluated by the current JOA scoring system and finger-floor distance (n = 355)
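As a quick consistency check, the patient counts reported in this subsection can be written out as simple arithmetic. All figures below are taken from the text; the variable names are illustrative, and the count of "other" diagnoses is derived here as a remainder rather than reported by the study.

```python
# Patient flow, using the counts reported in the text.
selected = 452
judged_unsuitable = 1          # excluded by the attending doctor
other_musculoskeletal = 60     # other diseases requiring treatment
incomplete_answers = 36        # questionnaires answered incompletely

analyzed = selected - judged_unsuitable - other_musculoskeletal - incomplete_answers

# Breakdowns of the analyzed group, as reported.
men, women = 201, 154
severity = {"mild": 115, "moderate": 142, "severe": 98}
named_diagnoses = {
    "lumbar disc herniation": 167,
    "lumbar spinal canal stenosis": 103,
    "spondylolisthesis": 37,
}

# Derived, not reported: patients with a diagnosis not named in the text.
other_diagnoses = analyzed - sum(named_diagnoses.values())

print(analyzed)                 # 355
print(men + women)              # 355
print(sum(severity.values()))   # 355
print(other_diagnoses)          # 48
```

The sex and severity breakdowns both sum to the 355 analyzed patients, and the three named diagnoses account for 307 of them.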

Superficial validity

Superficial validity was checked in terms of the completion rate for the questionnaire. Regarding the distribution of responses for each item, none of the questions was judged too difficult to answer because the highest rate of nonresponse was 1.8%. With regard to skew in the answers, the highest concentration on a single answer was 78.3% (“yes” responses to Q1-14), which was judged not to be inappropriate. Therefore, the distribution was not skewed to a degree that would indicate “floor and ceiling” effects (Table 3).

Table 3 Distribution of answers for each item in the questionnaire (n = 451)
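The two quantities used in this check, the nonresponse rate and the concentration of answers on a single choice, can be sketched per item as follows. This is a minimal illustration, not the authors' procedure: the function name, the use of `None` for a missing answer, and the choice to compute the concentration among answered questionnaires only are all assumptions, and the example data are made up.

```python
from collections import Counter

def item_summary(responses):
    """Return (nonresponse_rate, top_share) in percent for one item.

    `responses` holds one answer per patient, with None marking a
    missing answer (an assumed encoding). `top_share` is the share of
    the most frequent answer among patients who answered.
    """
    n = len(responses)
    answered = [r for r in responses if r is not None]
    nonresponse_rate = 100.0 * (n - len(answered)) / n
    if not answered:
        return nonresponse_rate, 0.0
    top_count = max(Counter(answered).values())
    return nonresponse_rate, 100.0 * top_count / len(answered)

# Hypothetical item: 7 "yes", 2 "no", 1 missing.
example = ["yes"] * 7 + ["no"] * 2 + [None]
nonresponse, top_share = item_summary(example)
```

An item would be flagged for review if its nonresponse rate were high (suggesting the question is hard to answer) or if nearly all answers fell on one choice (suggesting a floor or ceiling effect).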

Factor analysis

First, we extracted factors from the 25 items by the maximum likelihood method. The eigenvalue was >1.0 for five factors, and the cumulative contribution ratio up to the fifth factor was 53.1% (Table 4).

Table 4 Results of factor analysis: eigenvalue of each item
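The eigenvalue criterion and the cumulative contribution ratio mentioned above can be illustrated with a short sketch. It only computes eigenvalues of the item correlation matrix and counts those above 1.0; it does not reproduce the maximum likelihood extraction performed in SPSS. The function name and the synthetic data (six items driven by two latent factors) are assumptions made for illustration and have no connection to the study data.

```python
import numpy as np

def eigen_summary(data):
    """Eigenvalues of the item correlation matrix (descending), the number
    of eigenvalues > 1.0, and the cumulative contribution ratio in percent."""
    corr = np.corrcoef(data, rowvar=False)      # items are columns
    eigvals = np.linalg.eigvalsh(corr)[::-1]    # eigvalsh returns ascending order
    n_factors = int(np.sum(eigvals > 1.0))
    cumulative = 100.0 * np.cumsum(eigvals) / eigvals.sum()
    return eigvals, n_factors, cumulative

# Synthetic data: 500 respondents, 6 items, 2 latent factors (illustrative only).
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
pattern = np.array([[0.8, 0.0]] * 3 + [[0.0, 0.8]] * 3)   # 6 items x 2 factors
items = latent @ pattern.T + 0.6 * rng.normal(size=(500, 6))

eigvals, n_factors, cumulative = eigen_summary(items)
```

Because the eigenvalues of a correlation matrix sum to the number of items, the cumulative ratio up to the k-th factor is the sum of the first k eigenvalues divided by the item count.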

Next, we performed oblique rotation by the direct oblimin method. The results are shown in Table 5. Each item was categorized into five factors: four items (Q2-6, Q2-5, Q1-2, Q2-4) related to factor 1; seven items (Q2-8, Q2-7, Q2-11, Q1-13, Q2-9, Q2-10, Q2-1) related to factor 2; six items (Q1-9, Q1-6, Q2-3, Q1-8, Q1-5, Q1-4) related to factor 3; five items (Q1-10, Q2-4, Q1-12, Q1-14, Q2-2) related to factor 4; and the last four items related to factor 5. Although factor loading was <0.30 in three items (Q1-4 to factor 3, Q2-2 to factor 4, Q1-11 to factor 5), we adopted all of them, either because the question itself was important for the factor or because each factor needed to contain at least four questions.

Table 5 Results of factor analysis: factor loading of each item

Factor names were determined based on the commonality of the items that showed a large value on factor loading: factor 1, social function (four items); factor 2, mental health (seven items); factor 3, lumbar function (six items); factor 4, walking ability (five items); and factor 5, low back pain (four items).
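The mechanical part of the categorization described above, relating each item to the factor on which it loads most strongly and flagging loadings below 0.30, can be sketched as follows. This is an illustration under stated assumptions: the function name, the item names, and the loading values are hypothetical and are not taken from Table 5. A single-best-factor rule like this also cannot by itself reproduce the authors' final grouping, which lists Q2-4 under two factors and retains low-loading items on the grounds of content.

```python
import numpy as np

def assign_items(loadings, item_names, threshold=0.30):
    """Map each item to (factor_number, loading, is_weak).

    The factor is the one with the largest absolute loading (1-based);
    is_weak is True when that loading is below `threshold` in magnitude.
    """
    loadings = np.asarray(loadings, dtype=float)
    result = {}
    for name, row in zip(item_names, loadings):
        k = int(np.argmax(np.abs(row)))
        result[name] = (k + 1, float(row[k]), bool(abs(row[k]) < threshold))
    return result

# Hypothetical rotated loadings for three items on two factors.
names = ["item_a", "item_b", "item_c"]
loadings = [
    [0.72, 0.10],
    [-0.05, 0.61],
    [0.22, 0.18],   # weak on every factor; would need a judgment call
]
assignment = assign_items(loadings, names)
```

Items flagged as weak are the ones for which a decision such as the one described above (retain because of content, or to keep enough questions per factor) has to be made by the investigators.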

Measurement scale

To establish a measurement scale for each factor, we determined the size of the coefficient for each item so the difference between the maximum factor scores and minimum factor scores was approximately 100 (Table 6). When a coefficient became a negative numerical value, we changed the coefficient to a positive numerical value by reversing the order of the answer choice. We adjusted the formula so the maximum for each factor score was 100 and the minimum was 0 (see Appendix 2).

Table 6 Coefficient for each item of the formula for measurement scale
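The rescaling described above, in which each factor score runs from 0 to 100, can be sketched as a weighted sum of answers shifted and scaled by its attainable minimum and maximum. This is a minimal sketch assuming positive coefficients and numerically coded answers; the coefficients and answer ranges in the example are hypothetical and are not those of Table 6 or Appendix 2.

```python
def reverse_answer(answer, lo, hi):
    """Reverse the order of the answer choices, e.g. 1..3 becomes 3..1.
    Used so that an item's coefficient can be made positive."""
    return lo + hi - answer

def factor_score(answers, coefficients, ranges):
    """Weighted sum of answers rescaled so that the minimum attainable
    value maps to 0 and the maximum to 100.

    `ranges` gives (lowest, highest) answer code per item.
    Assumes every coefficient is positive.
    """
    raw = sum(c * a for c, a in zip(coefficients, answers))
    raw_min = sum(c * lo for c, (lo, hi) in zip(coefficients, ranges))
    raw_max = sum(c * hi for c, (lo, hi) in zip(coefficients, ranges))
    return 100.0 * (raw - raw_min) / (raw_max - raw_min)

# Hypothetical factor with three items.
coefficients = [20, 10, 30]
ranges = [(1, 2), (1, 3), (1, 2)]

lowest = factor_score([1, 1, 1], coefficients, ranges)    # 0.0
highest = factor_score([2, 3, 2], coefficients, ranges)   # 100.0
middle = factor_score([2, 2, 1], coefficients, ranges)
```

Choosing coefficients so that the raw range is already close to 100, as the text describes, keeps the final scaling factor near 1 and the published formula easy to compute by hand.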

Discussion

It is considered ideal for an outcome measure to evaluate patients from various perspectives, such as dysfunction, disability, handicap, and psychological problems. The outcome measure should be patient-oriented, and its reliability and validity should be confirmed by statistical analysis. However, the current JOA score does not include subjective evaluations and does not meet such requirements. We developed a new questionnaire, JOABPEQ, specifically to evaluate low back pain. It is patient-oriented and mainly based on recognizing problems with activities of daily living. We categorized 25 questions into five factors; each factor is then scored up to 100 points using the measurement scale. The factors are then evaluated separately. It is important to note that totaling the five factors’ scores is meaningless and inadequate; rather, the scores should be treated by nonparametric analysis. The reliability of the 25-item questionnaire for the JOABPEQ was confirmed in Part 2 of this project. The validity of the questionnaire was evaluated using factor analysis, and the measurement scale was established in Part 3 of this study. Further studies must be performed to confirm the responsiveness of the calculated severity scores.

Conclusions

We confirmed the validity of the JOA Back Pain Evaluation Questionnaire (JOABPEQ) and established a measurement scale.