Introduction

Pompe disease (OMIM #232300) is a rare neuromuscular disorder caused by deficiency of the lysosomal enzyme acid α-glucosidase. This deficiency causes glycogen to accumulate in the lysosomes of many tissues, albeit mainly in skeletal muscle. Its major clinical manifestation is progressive muscle weakness, which eventually impairs motor and respiratory function (van der Ploeg and Reuser 2008; Engel and Hirschhorn 1994). The disease manifests across a spectrum of severity and affects infants, children and adults (van den Hout et al. 2003; Winkel et al. 2005; Kishnani et al. 2006). Patients with the classic infantile form present with severe generalized hypotonia and a hypertrophic cardiomyopathy shortly after birth; the disease progresses rapidly, and the patients usually die in their first year of life from cardiorespiratory failure. Childhood, juvenile, and adult forms of the disease are characterized by a more slowly progressive proximal myopathy. Respiratory muscles are affected as well. In these patients, onset of symptoms, disease severity and rate of disease progression vary. Cardiomyopathy rarely occurs. The majority of patients eventually become wheelchair- and respirator-dependent (Hagemans et al. 2006; Laforet et al. 2000; Wokke et al. 1995).

In our centre we follow more than 100 children and adults with Pompe disease. Disease severity differs widely among these patients: some are ambulant, whereas others are completely wheelchair dependent. Currently there is no functional scale that has been standardized for Pompe disease and is capable of rating differences in muscle function with sufficient sensitivity. The need for such a scale has become even more important since marketing approval was given to recombinant human alpha-glucosidase as enzyme replacement therapy for Pompe disease.

The aim of the present study was to construct a functional motor scale specific for Pompe disease that is easy to apply and sufficiently sensitive to assess disease severity and to detect clinically important changes over time, so that it can be used both to monitor disease progression in clinical practice and to evaluate therapeutic effectiveness. For this purpose, we constructed the scale and tested it psychometrically in a large cohort of children and adults with Pompe disease.

Methods

Construction of the test

Motor function items that were specifically difficult for patients with Pompe disease were derived from three sources: the clinical expertise of several neurologists, paediatricians, and physical therapists involved in the treatment and care of Pompe patients; the Gross Motor Function Measure (Russell et al. 1989), an 88-item motor function test that has been validated for cerebral palsy; and the IPA/Erasmus MC Pompe survey, an international questionnaire study performed in over 300 Pompe patients (Hagemans et al. 2005; Hagemans et al. 2007).

The final version of the test consisted of 16 items and was named the Quick Motor Function Test (Supplement 1). Administering this test takes approximately 10 to 15 minutes. An evaluator observes the patient's performance and scores each item separately on a 5-point ordinal scale (ranging from 0 to 4). If an item can be performed with both the left and the right extremities, the right side is used. The total score is obtained by adding the scores of all items and ranges from 0 to 64 points.
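
The scoring rule described above can be illustrated with a minimal sketch. The function name `qmft_total` and the input format (a plain list of 16 integers) are assumptions made for illustration; they are not part of the published scoring sheet.

```python
from typing import Sequence

N_ITEMS = 16         # number of QMFT items stated in the text
MAX_ITEM_SCORE = 4   # each item is scored on a 0-4 ordinal scale


def qmft_total(item_scores: Sequence[int]) -> int:
    """Sum the 16 item scores into a total score between 0 and 64.

    Hypothetical helper: the input layout is an assumption, only the
    item count, the 0-4 range, and the summation come from the text.
    """
    if len(item_scores) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item scores, got {len(item_scores)}")
    for score in item_scores:
        if not isinstance(score, int) or not 0 <= score <= MAX_ITEM_SCORE:
            raise ValueError(f"item score {score!r} is outside 0..{MAX_ITEM_SCORE}")
    return sum(item_scores)


# Usage: a patient scoring 4 on eight items and 2 on the other eight.
example_total = qmft_total([4] * 8 + [2] * 8)
```

Rejecting incomplete or out-of-range input, rather than silently summing it, keeps a total comparable across visits; how missing items were handled in the study itself is not stated.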

Psychometric testing

Subjects

A total of 91 children and adults with Pompe disease who had attended the Erasmus Medical Center between February 2005 and February 2008 were included in the present study. In all patients the diagnosis of Pompe disease was established by measurement of acid α-glucosidase activity in cultured fibroblasts or leukocytes, and by mutation analysis. They were enrolled in one of two observational studies. One study investigated the rate of disease progression in untreated patients; the other study monitored the disease course after start of treatment with recombinant human alpha-glucosidase. Both studies were approved by the Institutional Review Board of the Erasmus MC. All patients (and/or parents if necessary) gave signed informed consent.

Design

In both studies, the newly constructed Quick Motor Function Test (QMFT) was part of a standardized follow-up protocol. The following assessments were performed at baseline and every 3 months thereafter: hand-held dynamometry (HHD) (van der Ploeg 1992; Beenakker et al. 2001), manual muscle testing (MMT) (Brooke et al. 1981), pulmonary function testing in sitting and supine position (ATS/ERS Statement on respiratory muscle testing 2002), and the QMFT. A physical examination was performed at every visit.

The QMFT was administered by two pediatricians and two neurologists. Beforehand, all physicians were trained by testing at least five patients following standardized instructions, while being observed by one of the senior physicians who originally developed the test. The test was performed in a separate examination room and all assessments were videotaped. The scores were recorded on a QMFT scoring sheet.

Reliability

The internal consistency (Nunnally 1978) was measured by Cronbach’s α. To estimate intrarater reliability (Hobart et al. 1996), three evaluators were shown the videotapes of the baseline assessments of 20 of their patients more than one year after the assessments and were asked to rescore the QMFT. The patients were randomly selected and the evaluators were blinded to their initial scoring.
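
As a rough illustration of the internal-consistency statistic named above, the sketch below computes Cronbach's α from a patients-by-items score table using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The function name and data layout are assumptions; the study itself used SPSS.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(scores: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a table with one row per patient, one column per item.

    Hypothetical helper using sample variances (n - 1 denominator).
    """
    n_patients = len(scores)
    n_items = len(scores[0])
    if n_patients < 2 or n_items < 2:
        raise ValueError("need at least two patients and two items")
    if any(len(row) != n_items for row in scores):
        raise ValueError("every patient needs a score for every item")

    # Variance of each item across patients.
    item_variances = [
        variance([row[i] for row in scores]) for i in range(n_items)
    ]
    # Variance of the total score across patients.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")

    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)
```

For the QMFT each row would hold the 16 item scores of one patient.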

To measure the interrater reliability (Hobart et al. 1996), videotapes of assessments from 60 randomly selected patients were scored by all four evaluators.

Test-retest reliability (Hobart et al. 1996) was assessed in 24 patients. Each patient was evaluated at baseline and approximately three months thereafter, on the assumption that little change in functional performance would occur in this period. The evaluators had no access to their initial scoring.

Validity

Validity is defined as the extent to which an instrument measures the concept it is intended to measure (Hobart et al. 1996). If no gold standard exists against which the instrument can be compared, criterion validity (Nunnally 1978; Cronbach and Meehl 1955) may be assessed. Since Pompe disease predominantly presents as a proximal myopathy, we examined whether the QMFT would correlate with other tests that are used to measure proximal muscle weakness. For this purpose, the strength of proximal muscle groups, as assessed by both manual muscle testing (Brooke et al. 1981) and hand-held dynamometry (van der Ploeg 1992; Beenakker et al. 2001), was compared with the QMFT score. The following proximal muscle groups were tested: neck flexors, shoulder abductors, elbow flexors, elbow extensors, hip flexors, hip abductors, knee extensors, and knee flexors. To demonstrate that the QMFT score correlated less well with the strength of other muscles, the following muscle groups were also tested and compared with the QMFT: neck extensors, wrist extensors, wrist flexors, foot dorsal flexors, and foot plantar flexors. Finally, differential validity was assessed by comparing the QMFT scores of patients with different severities of disease. To this end, the patients were classified into three groups based on their ability to walk: patients who were completely ambulant, patients who were able to walk with aids, and patients who were completely wheelchair bound.

Responsiveness

To assess the responsiveness of the QMFT, a clinical-empirical exploration was performed. This analysis included a sub-sample of 18 patients who had been treated with recombinant human α-glucosidase for more than one year, and also a sub-sample of 23 untreated patients who were followed for more than one year. As no gold standard exists, several strategies for assessing responsiveness have been suggested (Husted et al. 2000; Wright and Young 1997; Liang 1995). We therefore used three different methods to explore the responsiveness of the QMFT.

First, we compared the change in score at 12 months follow-up between the untreated and the treated group. Second, we calculated sensitivity to change using the standardized response mean (Wright and Young 1997; Liang et al. 1990), an effect size statistic that is equal to the mean of the change in score of treated patients divided by the standard deviation of the change score. Third, two physicians were asked to judge whether the motor function of individual patients in the treated group had, in their opinion, improved, remained stable, or deteriorated after 12 months of treatment (Liang 1995; Deyo and Centor 1986). These physicians were involved in the care of either children or adults with Pompe disease but had not participated in the construction and administration of the QMFT.
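
The standardized response mean defined in the second method can be sketched as follows. The function name and the paired-list input are assumptions for illustration, and the numbers in the usage line are invented, not study data.

```python
from statistics import mean, stdev
from typing import Sequence


def standardized_response_mean(
    baseline: Sequence[float], follow_up: Sequence[float]
) -> float:
    """Mean change score divided by the standard deviation of the change scores.

    Hypothetical helper: baseline[i] and follow_up[i] are the scores of the
    same patient at the start and after 12 months.
    """
    if len(baseline) != len(follow_up) or len(baseline) < 2:
        raise ValueError("need paired scores for at least two patients")
    changes = [after - before for before, after in zip(baseline, follow_up)]
    spread = stdev(changes)  # sample standard deviation (n - 1 denominator)
    if spread == 0:
        raise ValueError("all change scores are identical; SRM is undefined")
    return mean(changes) / spread


# Usage with made-up scores for three patients.
srm_example = standardized_response_mean([10, 20, 30], [12, 24, 36])
```

Because the denominator is the spread of the change scores rather than of the baseline scores, a small but consistent improvement yields a large value.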

Statistical analysis

All continuous variables are described as mean ± standard deviation, or as median and range; categorical data are presented as percentages. Internal consistency of the test was measured using Cronbach’s α. A Cronbach’s α > 0.80 was considered good. Intrarater reliability, interrater reliability, and test-retest reliability of the test and the separate items were determined by calculating the intraclass correlation coefficient (ICC) with a random-effects ANOVA model. An ICC value of less than 0.40 was considered poor, an ICC value between 0.40 and 0.80 was considered fair, and an ICC value greater than 0.80 was considered excellent. Pearson correlation coefficients were used to determine criterion validity; Spearman rank correlation coefficients were used in case of non-normal distributions. For differential validity, group differences were analyzed by one-way analysis of variance followed by a Bonferroni post hoc correction for multiple testing. P-values < 0.05 (two-tailed) were considered statistically significant.
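
To make the ICC calculation concrete, here is a minimal sketch. The text states only that a random-effects ANOVA model was used. The specific form below (two-way random effects, absolute agreement, single rater, often written ICC(2,1)) is an assumption, as are the function names. The cut-offs in `classify_icc` are the ones given above.

```python
from typing import Sequence


def icc_2_1(ratings: Sequence[Sequence[float]]) -> float:
    """ICC(2,1) from a complete table: ratings[i][j] = subject i, rater j.

    Assumed ICC form; the study does not specify which variant was used.
    """
    n = len(ratings)      # subjects (patients)
    k = len(ratings[0])   # raters, or repeated occasions
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete n x k table with n >= 2 and k >= 2")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                 # between-subjects mean square
    ms_cols = ss_cols / (k - 1)                 # between-raters mean square
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual mean square

    denominator = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denominator == 0:
        raise ValueError("ratings do not vary; ICC is undefined")
    return (ms_rows - ms_error) / denominator


def classify_icc(icc: float) -> str:
    """Apply the cut-offs stated in the text."""
    if icc < 0.40:
        return "poor"
    if icc <= 0.80:
        return "fair"
    return "excellent"
```

For interrater reliability the columns would be the four evaluators; for intrarater and test-retest reliability they would be the two scoring occasions.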

Responsiveness was investigated by calculating the standardized response mean. A value between 0.50 and 0.80 was considered moderate, and a value greater than 0.80 was considered to represent high responsiveness (Liang et al. 1990). A Mann-Whitney test was used to test differences in QMFT scores between the treated patients and the untreated patients. A ROC curve was constructed by plotting the true-positive rate (sensitivity) against the false-positive rate (1-specificity), with the clinical judgment of the two physicians as the external criterion. The area under the curve represents the ability of the test to correctly discriminate between improved and non-improved patients. This area ranges from 0.5 (no discriminating ability) to 1.0 (perfect discriminating ability).
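
The area under the ROC curve described above can be sketched without plotting, using the equivalent rank-based definition: the probability that a randomly chosen improved patient has a larger change in score than a randomly chosen non-improved patient, with ties counted as one half. The function name, the input format, and the example numbers are assumptions for illustration only.

```python
from typing import Sequence


def roc_auc(change_scores: Sequence[float], improved: Sequence[bool]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Hypothetical helper: change_scores[i] is the change in QMFT score of
    patient i, and improved[i] is the physicians' judgment for that patient.
    """
    if len(change_scores) != len(improved):
        raise ValueError("need one judgment per change score")
    positives = [s for s, flag in zip(change_scores, improved) if flag]
    negatives = [s for s, flag in zip(change_scores, improved) if not flag]
    if not positives or not negatives:
        raise ValueError("need at least one improved and one non-improved patient")

    # Count pairs in which the improved patient has the larger change score.
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Usage with invented data: two improved and two non-improved patients.
auc_example = roc_auc([3, 1, 2, 0], [True, True, False, False])
```

A value near 1.0 means that the change in score almost always ranks improved patients above non-improved ones, which is the sense of "discriminating ability" used in the text.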

Statistical analysis was performed using SPSS version 15.0.

Results

Study population

Clinical characteristics of the 91 patients enrolled in the study are shown in Table 1. Nineteen patients were younger than 18 years. The mean QMFT score at baseline assessment was 36.9 ± 16.6 (median 38.5, range 3 to 64), on a possible score range from 0 to 64.

Table 1 Clinical characteristics of the 91 study patients

Reliability

The internal consistency of the QMFT was excellent: Cronbach’s α was 0.94. There were no substantial floor and ceiling effects: none of the patients reached the lowest possible score, and only two of the patients reached the highest possible score of 64. Both patients were diagnosed presymptomatically.

The intraclass correlation coefficient for intrarater reliability was 0.95 for the total scale. The ICCs for the separate items of the scale ranged from 0.78 to 0.98. The intraclass correlation coefficient for interrater reliability was 0.91 for the total test. The ICCs for the separate items of the scale ranged from 0.76 to 0.98.

The intraclass correlation coefficient for test-retest reliability was 0.98 for the total test. The ICCs for the separate items of the scale ranged from 0.84 to 1.00.

Validity

The total QMFT score correlated strongly with the strength sum scores of proximal muscle groups assessed by MMT and HHD (rs(MMT) = 0.89, p < 0.001; rs(HHD) = 0.81, p < 0.01). In sharp contrast, much lower correlations were found between the QMFT and strength sum scores of other muscle groups (rs(MMT) = 0.05, p = 0.33; rs(HHD) = 0.33, p < 0.01) (Fig. 1).

Fig. 1 Relationship of the QMFT score to manual muscle testing: a. sum score of proximal muscle groups, b. sum score of other muscle groups

Differential validity was supported by significant differences between the three groups with different severities of disease (F(2,84) = 66.29, p < 0.001). Mean QMFT scores decreased significantly with decreasing ability to walk: scores were highest in the group that was fully ambulant (47.1 ± 10.7, p < 0.01), followed by patients who were able to walk with aids (32.4 ± 11.0, p < 0.01) and patients who were completely wheelchair bound (16.6 ± 10.6, p < 0.01) (Fig. 2).

Fig. 2 Mean scores (95% CI) on the Quick Motor Function Test of the patients related to three grades of disease severity

Responsiveness

Responsiveness was tested in 41 patients (18 treated; 23 untreated). The median age of the 18 patients who received recombinant human alpha-glucosidase was 51.5 years (range 5 to 76 years). Eight of these patients were wheelchair bound, and ten were dependent on ventilation. The median age of the 23 untreated patients was 52.1 years (range 34 to 72 years). Two of these patients were wheelchair bound, and three used ventilation.

Over the one-year period, the change in QMFT score differed significantly between the treated patients (median change 4.15) and the untreated patients (median change 0) (p < 0.01). The standardized response mean was high (0.81). Fig. 3 shows the ROC curve of the Quick Motor Function Test, with an area under the curve of 0.88 (p < 0.05).

Fig. 3 Receiver operating characteristic (ROC) curve to determine the responsiveness of the Quick Motor Function Test (QMFT), with clinical judgment as the external criterion. Sensitivity was defined as the number of patients identified by the QMFT as having changed, divided by the number of patients who had truly changed according to the judgment of two physicians. Specificity was defined as the number of patients identified by the QMFT as not having changed, divided by the number of patients who had not changed according to the judgment of the same physicians.

Discussion

This study shows that the Quick Motor Function Test is a reliable and valid test for assessing motor function in patients with Pompe disease. It is the first muscle function test designed and validated specifically for Pompe patients. The test had good psychometric properties, including good internal consistency, and good intrarater and interrater reliabilities over the entire test and the separate items. The QMFT score strongly correlated with proximal muscle strength as measured by HHD and MMT, and significantly differentiated between patients with different levels of mobility. The test was evaluated in patients between 5 and 76 years of age, and was easy and quick to administer.

According to the World Health Organization, assessment of health should have a multi-dimensional approach. The International Classification of Functioning, Disability and Health (ICF) (World Health Organization 2001) provides such an interdisciplinary framework and measures consequences of disease in three domains: impairments of body functions and body structures, activity limitations (individual level), and participation restrictions (societal level).

In Pompe disease, the approach towards evaluating disease severity and effect of treatment has become increasingly multi-dimensional over the past years. Measurement tools have been designed and validated for their use in Pompe patients. Currently, a battery of tests is used in the long-term follow-up of Pompe patients. For example, muscle strength, pulmonary function tests, echocardiography, timed tests and the 6-minute walk test are used to evaluate disease consequences and effect of treatment on the level of body functions and body structures. The Pompe Pediatric Evaluation of Disability Inventory (Pompe PEDI), SF-36, and the Rotterdam Handicap Scale are used to assess the level of participation restrictions and quality of life. However, a validated tool to measure activity limitations on an individual level is currently lacking.

In clinical practice, muscle strength tests have often been applied to assess muscle function. However, although closely related, muscle strength and muscle function represent two different entities of the muscle system, and correspond to different levels of the ICF (World Health Organization 2001). Both parameters should therefore be evaluated separately by valid and reliable assessment tools.

In a recent placebo-controlled clinical trial in 90 juvenile and adult patients with Pompe disease, primary outcome measures were the 6 minute walk test (distance walked in 6 minutes), and Forced Vital Capacity in seated position (van der Ploeg et al. 2010). Muscle function was not assessed, because a reliable motor function test validated for Pompe disease did not exist. Although it would have been possible to use scales that were designed for other neuromuscular disorders (Russell et al. 1989; Main et al. 2003; Scott et al. 1982; Lovell et al. 1999; Berard et al. 2005) or a composite disease severity score that covers various domains of the ICF, as developed by Lue et al. for Duchenne Muscular Dystrophy (Lue et al. 2006), none of these scales were validated for Pompe disease. Some were designed for children, while Pompe disease affects all age groups. Our study demonstrates that the QMFT can be used in both children and adults, with different levels of disease severity.

Another quality of the QMFT is that it is easy and quick to perform. The test takes approximately 15 minutes, does not require specialized equipment, and can be performed by a physician in a clinical setting, whereas other scales frequently need to be administered by a physical therapist. It is a practical tool that can be used in all patients, including those who are confined to a wheelchair or dependent on artificial ventilation. The overall responsiveness of the QMFT appears to be good: the test accurately detected change when it had occurred and remained stable when no change had occurred. It also discriminated between varying levels of disease severity. This indicates that the QMFT can serve not only as a tool to estimate disease severity, but also as a longitudinal assessment tool to detect changes in motor function over time. This is useful, as the emergence of new treatment modalities such as enzyme replacement therapy and possibly chaperone therapy will make the (long-term) evaluation of therapeutic effects essential.

Four issues need further attention. First, although responsiveness to change, which was assessed in a subgroup of 18 treated and 23 untreated patients, showed promising results, we recommend that it be confirmed in a large-scale empirical study. The current study was insufficient to demonstrate whether or not the changes observed over time were related to enzyme replacement therapy. Second, the test was validated for patients between 5 and 76 years of age. In the youngest and oldest patients, motor development and age-related motor limitation might have interfered with the test results. Therefore, reference values for age should be obtained. Third, to ensure tester reliability we recommend annual recertification of the physicians who perform the QMFT. Fourth, the present study validated the QMFT in Pompe patients, but the test may also be useful for other neuromuscular disorders, especially those with proximal muscle weakness.

In conclusion, this study shows that the Quick Motor Function Test has good psychometric properties and excellent clinical utility. Our findings indicate that this test can be used to assess motor function and response to treatment in children and adults with different levels of disease severity. The applicability of the test for other neuromuscular disorders deserves further investigation.