Journal of Autism and Developmental Disorders, Volume 38, Issue 9, pp 1751–1757

Characterization of Potential Outcome Measures for Future Clinical Trials in Fragile X Syndrome

Authors

  • E. Berry-Kravis
    • Department of Pediatrics, Rush University Medical Center
    • Department of Neurological Sciences, Rush University Medical Center
    • Department of Biochemistry, Rush University Medical Center
  • Allison Sumis
    • Department of Pediatrics, Rush University Medical Center
    • Department of Molecular and Cellular Oncology
  • Ok-Kyung Kim
    • Rush University School of Medicine
  • Rebecca Lara
    • Department of Pediatrics, Rush University Medical Center
    • Pritzker School of Medicine, University of Chicago
  • Joanne Wuu
    • Department of Neurological Sciences, Rush University Medical Center
    • Department of Neurology, Emory University
Original Paper

DOI: 10.1007/s10803-008-0564-8

Cite this article as:
Berry-Kravis, E., Sumis, A., Kim, O. et al. J Autism Dev Disord (2008) 38: 1751. doi:10.1007/s10803-008-0564-8

Abstract

Clinical trials targeting recently elucidated synaptic defects in fragile X syndrome (FXS) will require outcome measures capable of assessing short-term changes in cognitive functioning. Potentially useful measures for FXS were evaluated here in a test–retest setting in males and females with FXS (N = 46). Good reproducibility, determined by an intraclass correlation coefficient (ICC) or weighted kappa (κ) of 0.7–0.9, was seen for RBANS List and Story Memory, NEPSY Tower, Woodcock–Johnson Spatial Relations and the commissions score from the Carolina Fragile X Project Continuous Performance Test (CPT). This study demonstrates the feasibility of generating test profiles containing reliability data, ability levels required for test performance, and refusal rates to assist with choice of outcome measures in FXS and other cohorts with cognitive disability.

Keywords

Fragile X syndrome, Clinical trials, Outcome measures, FMR1

Fragile X syndrome (FXS) is the most common inherited form of mental retardation, with a population prevalence of about 1/4,000 males and females (Turner et al. 1996). Clinically, FXS is characterized by cognitive and/or learning disabilities ranging from mild to severe and by often debilitating behavioral problems such as hyperactivity, anxiety, sensory hyperarousal, mood lability, aggression, and autistic behaviors (Berry-Kravis et al. 2002). FXS results from an unstable trinucleotide repeat expansion mutation in the FMR1 gene (Verkerk et al. 1991), located on the long arm of the X chromosome. The mutation leads to transcriptional silencing of FMR1; thus, the gene product (FMRP, Fragile X Mental Retardation Protein) is reduced or absent in FXS (Devys et al. 1993). A body of literature has shown that FMRP is an RNA binding protein which shows increased expression in models of CNS activation or experience (Irwin 2005; Gabel et al. 2004). Recent studies suggest that FMRP modulates dendritic maturation and synaptic plasticity through a mechanism involving regulation of activation-mediated dendritic protein synthesis (for reviews see Grossman et al. 2006; Bagni and Greenough 2005).

Further work has shown that FMRP is involved in regulation of group 1 metabotropic glutamate receptor (mGluRs, includes mGluR1 and mGluR5)-mediated dendritic translation (Antar et al. 2004; Aschrafi et al. 2005; Weiler et al. 2004), on which specific forms of synaptic plasticity depend. It appears that mGluR5-mediated activation of translation is regulated, specifically inhibited, by FMRP, based on the finding that mGluR5-dependent LTD is enhanced in the hippocampus of the fmr1 knockout mouse model of FXS (Bear et al. 2004). Numerous additional predicted consequences of excessive activity of mGluR-mediated processes have been identified in the knockout mouse and dFXR mutant Drosophila models of FXS, including reduction of synaptic AMPA receptors in cortex (Huber et al. 2002; Li et al. 2002), structurally immature-appearing elongated dendritic processes (Bagni and Greenough 2005; Grossman et al. 2006; Irwin et al. 2002), and abnormal epileptiform discharges (Chuang et al. 2005). Further, based on known locations and activities of group 1 mGluR receptors in the central and peripheral nervous system, many of the phenotypic features of FXS including seizures, electrical excitability on EEGs, hypersensitivity to tactile stimuli, mental retardation, hyperactivity, strabismus, enhanced fear memory, coordination problems and even loose stools are effects that would be predicted in a setting of enhancement of mGluR-mediated processes that would normally be inhibited by FMRP (Li et al. 2002; Bear 2005).

Based on increasingly convincing evidence to support the hypothesis of overactive group 1 mGluR-mediated activity, new therapeutic options have been proposed to target this underlying disease mechanism in FXS, including the use of negative allosteric modulators of mGluR5, such as MPEP (2-methyl-6-(phenylethynyl)-pyridine), as well as other strategies to mitigate the changes in mGluR activity. Indeed, MPEP and other mGluR negative modulators have been shown to reverse phenotypes in mouse (Chuang et al. 2005; Yan et al. 2005) and Drosophila (McBride et al. 2005) models in which FMRP is absent. Several mGluR5 negative modulators are currently being developed for use in humans to target the underlying disorder in FXS.

If mGluR5 negative modulators do target the underlying disorder in FXS, it is expected that they should produce both cognitive and behavioral improvements. Although validated outcome measures for caregiver rating of behavior such as the Aberrant Behavior Checklist (ABC-C) have been shown to be reproducible in populations with FXS (Berry-Kravis et al. 2006) and demonstrate drug responsiveness in developmental populations (McCracken et al. 2002), no good outcome measures to address cognitive change have been developed for populations with FXS or other developmentally impaired populations. Administration of such measures often presents challenges with regard to subject co-operation, and neither good reproducibility nor drug sensitivity has been shown for any such measure. In order to gather early data regarding reliability of potential outcome measures that might be used in impending clinical trials targeting cognition in FXS, in this study we evaluated completion rates and reproducibility of a battery of measures thought to show promise for assessing short-term cognitive change in FXS.

Methods

Enrollment

Subjects were recruited through the Rush University Fragile X Clinic, which currently provides care to over 300 patients with FXS. Informed consent was obtained from all parents or legal guardians of subjects and subjects signed assent if capable. The study was approved by the Rush University Institutional Review Board. All subjects with a diagnosis of fragile X syndrome, confirmed by DNA analysis documenting a fragile X full mutation with at least partial methylation, were eligible. Parents were asked to provide a copy of the subject’s most recent IQ assessment so that level of functioning could be documented. Age and IQ score did not play a role in determining whether a subject could participate because a goal of the study was to evaluate subjects over a broad age range, in order to determine the age range and range of functioning over which the proposed outcome measures could be used in subjects with FXS. It was thought to be important to assess the outcome measures in both genders and in children and adults, given that new targeted treatments for FXS would likely need to be studied first in adults of both genders, but information would then be needed regarding extension of trial designs and outcome measures to children.

Cognitive Assessment Protocol

Each subject was administered a series of brief tests designed to evaluate attention and aspects of cognitive function. These tests were thought to represent measures that could be used as cognitive outcome measures in cohorts with FXS, but they have not had formal validation in a test–retest paradigm in populations with FXS to show that they have the consistency required for measurement of drug effect across a treatment period. Tests were administered by a research assistant who had experience in working with individuals with FXS and who was trained by a clinical psychologist to administer the measures in a standardized manner. The testing session required about 1–2 h, depending on the level of function of the subject and the number of measures able to be completed. The research assistant was trained to pick up on physical and behavioral cues indicating cognitive exhaustion and loss of focus on the tasks. Given the short attention span and distractibility seen frequently in individuals with FXS, breaks with physical activity were provided as needed between tasks to help maintain attentiveness and avoid tiring. The tester was well trained in how to coax subjects with FXS to start tasks and in the use of frequent re-enforcers and re-direction.

The battery of tests administered included:

(1) Carolina Fragile X Project Continuous Performance Test (CPT): a simplified computerized 10 min CPT with separate measures of auditory and visual attention and impulsivity, in which the subject must push a key when, for example, a picture of a dog appears on the screen and, separately, when a sound is heard. This CPT was chosen because prior experience had suggested that standard CPT measures are too difficult for individuals with FXS (Berry-Kravis et al. 2002) and because this CPT has been piloted in, and could be completed by, a prior cohort of children with FXS in the Carolina Project, although short-term reproducibility was not assessed in that study (Sullivan et al. 2007).

(2) Card Sort Test: a computerized 10–20 min test of visual memory, sequential memory, and working memory, presented with pictures of cards, during which the subject must remember a progressively longer sequence of colors followed by combinations of colors and numbers. This test has been piloted in a cohort with FXS (Johnson-Glenberg 2004) and could be done by many subjects. As a computerized test using a familiar item as the stimulus (cards), it is thought to be more engaging for subjects with FXS than pencil and paper tests.

(3) Non-Verbal Associative Learning Task (NVALT): a 30-min non-verbal measure of object discrimination learning and object discrimination reversal (Boutet et al. 2005b). This task requires no language, social interaction, or eye contact. The subject is presented with several items, under one of which a nickel is hidden. The items change position but the nickel is hidden under the same item, and the subject must make that association by finding the nickel correctly on 10 consecutive tries. This test has been piloted in FXS (Boutet et al. 2005a) and is particularly engaging for subjects with FXS because of the frequent reinforcement the subject receives when he/she gets to keep nickels found, and because the test produces less anxiety than most tests, as the subject does not have to make eye contact or engage socially.

(4) NEPSY Tower subtest: a 10 min standardized measure of cognitive flexibility and problem solving (Korkman et al. 1998). This test was chosen because cognitive flexibility is a particular area of difficulty in FXS and because a score can be obtained from individuals with a mental age as low as 4 years. It was expected that individuals with FXS who cannot do more difficult tests of cognitive flexibility might be able to achieve a score on the NEPSY Tower.

(5) Woodcock–Johnson Spatial Relations subtest: a 5 min standardized measure of visuospatial reasoning validated for children and adults 2–90+ years of age (Woodcock and Johnson 1990). This test was chosen because subtests from the Test of Visual Perceptual Skills (TVPS) evaluating visuospatial reasoning were shown to be too difficult for a significant percentage of individuals with FXS (Berry-Kravis et al. 2002). As such, no measure of visuospatial reasoning has yet been demonstrated to be consistently and successfully completed by a cohort with FXS. It was thought that the Woodcock–Johnson Spatial Relations subtest would be easier than TVPS subtests and might be more applicable in populations with FXS.

(6) Wechsler Intelligence Scale for Children, Fourth Edition (WISC-IV) Symbol Search subtest: a 5 min measure of processing speed validated for use in children age 5–17 years (Wechsler 2003). This test was chosen because no prior measures of processing speed have been piloted formally in cohorts with FXS and it was thought this test could potentially be completed by subjects with FXS.

(7) The Repeatable Battery for the Assessment of Neuropsychological Status (RBANS) List and Story Memory subtests: two 5 min measures of verbal/auditory memory developed for use in subjects age 20–89 with neurological or dementing illnesses and a broad range of cognitive functioning (Randolph 1998) and shown to have high completion rates and reproducibility in adults with FXS in a prior study (Berry-Kravis et al. 2002). This measure was utilized to further characterize completion rates and reproducibility, and to evaluate the measure for learning effects across a cohort with FXS which encompasses a broader developmental range, including children.

Retest

All subjects returned for repeat testing within a 1–2-month time frame. An attempt was made to re-administer all tests (even if a test could not be done the first time) at the second visit in order to determine intra-individual variation in ability to complete the tests due to variation in behavioral state from one visit to the next and to assess intra-individual test–retest reliability in subjects with FXS. Subjects treated with psychoactive medications for behavioral management were not allowed any medication or dose changes between the first and second testing sessions.

Data Analyses

Data from all testing sessions were entered into a computer database. The percentage of subjects with FXS failing to complete the test at either session was determined for each test. Completion failures were further subgrouped into categories of subjects with FXS who attempted but could not understand the test and therefore failed to get a score above baseline, and those who refused to even try the test. The association of chronological age and mental age (MA) versus test score at the first visit for those who attempted each test was assessed by Pearson correlation coefficient (r). The MA cut-off, above which all subjects who attempted the test could achieve a score above baseline, was determined for each cognitive test. For subjects age 5–21, cognitive assessments provided by the parents as available were used for analysis if they had been performed at age 5 or older and within the 2 years (most within 18 months) prior to study participation. For adults, any recent cognitive assessment was used, as IQ and MA are expected to be stable in adults. MA values from the assessments were utilized as estimates of cognitive level rather than IQ, as MA allows determination of the ability level required to do cognitive tests across populations including both children and adults (e.g. a given IQ in a child and in an adult will not represent a comparable functional level). Test–retest reproducibility was determined by comparing raw scores for all subjects successfully achieving a score on the test at both testing sessions using the intraclass correlation coefficient (ICC, for numerical measures) or weighted kappa statistic (κ, for ordinal measures).
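As an illustration of the test–retest statistic for numerical measures, the following is a minimal sketch of an ICC computed from paired raw scores. The paper does not state which form of the ICC was used, so the one-way random-effects form, ICC(1,1), is an assumption here, and the function name and the scores are hypothetical.

```python
from statistics import mean

def icc_oneway(pairs):
    """One-way random-effects ICC(1,1) for (visit 1, visit 2) raw-score pairs.

    Assumption: the ICC form is not given in the paper; this is one common choice.
    """
    n = len(pairs)
    k = 2  # two testing sessions per subject
    grand = mean(score for pair in pairs for score in pair)
    subject_means = [mean(pair) for pair in pairs]
    # Between-subject and within-subject mean squares from a one-way ANOVA
    ms_between = k * sum((m - grand) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (score - m) ** 2
        for pair, m in zip(pairs, subject_means)
        for score in pair
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical raw scores for five subjects who scored at both sessions
scores = [(10, 12), (20, 19), (15, 15), (30, 28), (8, 11)]
print(round(icc_oneway(scores), 2))
```

Only subjects with a score at both sessions enter the calculation, which is why the N for reproducibility can be smaller than the number of subjects able to do a test.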

For the Carolina Fragile X Project CPT, it was noted that Omission scores did not accurately reflect performance on the test because subjects with FXS tended to impulsively and repeatedly hit the computer keys and thus would not miss the correct stimuli when presented due to constant random hits, resulting in very low (good) omissions scores despite actual poor performance. Therefore, Commission scores were felt to be a better reflection of performance for this cohort and only Commission scores were formally analyzed. In order to work with extreme variability in responses of individuals with FXS on CPT measures, raw scores were placed into categories of relatively equivalent performance, as follows: (1) 0 commissions, (2) 1–5 commissions, (3) 6–20 commissions, (4) 21+ commissions, and test–retest performance was analyzed based on reproducibility of category.
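The categorization and category-level agreement described above can be sketched as follows. The category boundaries are those given in the text; the weighting scheme for kappa is not stated in the paper, so linear weights are an assumption, and the function names and commission counts are hypothetical.

```python
def commission_category(commissions):
    """Map a raw commission count to the four ordinal categories in the text."""
    if commissions == 0:
        return 1
    if commissions <= 5:
        return 2
    if commissions <= 20:
        return 3
    return 4

def weighted_kappa(first, second, n_categories=4):
    """Cohen's kappa with linear weights (assumed) for category labels 1..n.

    Undefined (division by zero) if all ratings fall in a single category.
    """
    n = len(first)
    # Observed joint proportions of (visit 1 category, visit 2 category)
    observed = [[0.0] * n_categories for _ in range(n_categories)]
    for a, b in zip(first, second):
        observed[a - 1][b - 1] += 1 / n
    row = [sum(r) for r in observed]
    col = [sum(observed[i][j] for i in range(n_categories))
           for j in range(n_categories)]
    # Weighted observed and chance-expected disagreement
    num = den = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            w = abs(i - j) / (n_categories - 1)
            num += w * observed[i][j]
            den += w * row[i] * col[j]
    return 1 - num / den

# Hypothetical commission counts at the two visits
visit1 = [commission_category(c) for c in [0, 3, 7, 25, 12, 1]]
visit2 = [commission_category(c) for c in [0, 6, 9, 30, 4, 2]]
print(round(weighted_kappa(visit1, visit2), 2))
```

With weighted kappa, a shift to an adjacent category is penalized less than a shift across several categories, which suits ordinal bins of this kind.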

Results

Over a 9-month time period, 46 subjects (39 male, 7 female) with FXS and an FMR1 full mutation were enrolled in this study, ranging in age from 5 to 47 years. Reports of cognitive assessments were available for analysis for 31 subjects (25 male, 6 female) and represented an IQ from 30 to 89, which translated to an MA range from 2.1 to 10.7 years. Demographic data for the subject cohort is illustrated in Fig. 1.
https://static-content.springer.com/image/art%3A10.1007%2Fs10803-008-0564-8/MediaObjects/10803_2008_564_Fig1_HTML.gif
Fig. 1. Age, gender, and cognitive demographics of the study cohort with FXS

Completion rates for the various tests administered are shown in Table 1.
Table 1. Completion and refusal rates for measures administered to subjects with FXS

| Test | N | Able^a, N (%) | Not able^b, N (%) | Subjects refusing^c both sessions, N (%) | Subjects refusing one or both sessions, N (%) |
|---|---|---|---|---|---|
| RBANS list learning | 46 | 43 (93) | 0 (0) | 3 (7) | 4 (9) |
| RBANS story memory | 46 | 40 (87) | 3 (7) | 3 (7) | 3 (7) |
| W–J spatial relations | 46 | 34 (74) | 3 (7) | 9 (20) | 9 (20) |
| Symbol search | 46 | 28 (52) | 4 (17) | 14 (30) | 14 (30) |
| NEPSY tower | 46 | 39 (85) | 2 (4) | 5 (11) | 8 (17) |
| Card task (color) | 41^d | 36 (88) | 0 (0) | 5 (12) | 11 (27) |
| Card task (color/number) | 41^d | 30 (73) | 3 (7) | 8 (20) | 14 (34) |
| FXS project CPT (visual) | 46 | 42 (91) | 0 (0) | 4 (9) | 7 (15) |
| FXS project CPT (auditory) | 41^d | 37 (90) | 0 (0) | 4 (10) | 9 (22) |
| NVALT | 36^e | 35 (97) | 0 (0) | 1 (3) | 4 (11) |

^a Able refers to subjects who achieved a non-basal score on the test at either of the testing visits, indicating that they could do the test

^b Not able refers to subjects who could not do the test in either attempt at administration, despite trying to do the test at least once, and thus never achieved a score on the test

^c Subjects who refused would not even attempt the test even if they were likely capable and therefore no data was available on whether they could do the test at the session(s) indicated

^d Not all subjects were administered these tests due to a problem with computer function on several of the testing days

^e Not all subjects were administered this test because the testing apparatus was not available at the beginning of the study

The NVALT had the highest completion rate (97%), whereas the Symbol Search had the lowest (52%). Table 1 also shows percentages of subjects who failed to complete the tests, subgrouped according to whether they were unable to do the test or refused to even try. Refusals were a common problem for subjects with FXS, although refusals could vary from one session to another, largely due to the expected significant impact of the labile behavioral state seen in individuals with FXS, on ability to participate in an evaluation. Completion rates included subjects who accomplished the test at only one administration but not the other, in order to assess what fraction of subjects had the capability to do the tests, given co-operation. However, the overall percentage of subjects refusing a given test at either or both test sessions was also determined (Table 1) because this best predicts the likelihood of completion of any given measure at both the pre- and post-intervention assessment of a clinical trial (i.e. if either time point is not completed the effect of the trial-related intervention on the outcome measure cannot be determined). Refusals decreased with age, and subjects >18 years accounted for only 11% of the testing refusals. The NVALT had the lowest refusal rate, and the Symbol Search had the highest. Refusals did not seem to be a reflection of cognitive exhaustion or lack of attention as refusals were seen with the first tasks administered as well as those later in the battery, and frequently one task was refused but then the subject would do a different task of higher seeming interest offered subsequently.
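The distinction between the completion rate and the refusal rate at either session can be illustrated with simple arithmetic on one row of Table 1. This is a sketch only: the function names are hypothetical, and the "ceiling" on paired data is an interpretation of the reasoning above (a subject who refuses at either time point cannot contribute a pre/post pair), not a quantity reported in the paper.

```python
def percent(count, total):
    """Whole-number percentage, as reported in Table 1."""
    return round(100 * count / total)

def summarize_row(n, able, refused_any):
    """Summarize one Table 1 row.

    n: subjects administered the test
    able: subjects with a non-basal score at either visit
    refused_any: subjects refusing at one or both visits
    """
    return {
        "completion_pct": percent(able, n),
        "refused_any_pct": percent(refused_any, n),
        # Ceiling on the share of subjects who could supply a pre/post pair
        "max_paired_pct": percent(n - refused_any, n),
    }

# RBANS list learning row of Table 1: N = 46, able = 43, refused one or both = 4
print(summarize_row(46, 43, 4))
# {'completion_pct': 93, 'refused_any_pct': 9, 'max_paired_pct': 91}
```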

Cognitive performance correlated with MA estimates (Table 2, Fig. 2) on all measures except the Visual and Auditory CPT Commissions and the NVALT.
Table 2. Correlation of MA with performance on measures for subjects with FXS

| Test | N | r^a | p^b | MA cut-off^c (years) |
|---|---|---|---|---|
| RBANS list learning | 28 | 0.57 | <0.005 | >3.0 |
| RBANS story memory | 29 | 0.65 | <0.0005 | >5.4 |
| W–J spatial relations | 27 | 0.61 | <0.0005 | >7.1 |
| Symbol search | 21 | 0.66 | <0.001 | >7.1 |
| NEPSY tower | 27 | 0.67 | <0.0005 | >7.1 |
| Card task (color) | 27 | 0.59 | <0.001 | >5.4 |
| Card task (color/number) | 27 | 0.65 | <0.0005 | >6.6 |
| FXS project CPT (visual) | 27 | −0.21 | NS | >3.0 |
| FXS project CPT (auditory) | 22 | −0.26 | NS | >3.0 |
| NVALT | 30 | −0.14 | NS | None |

^a Correlation coefficient

^b Significance

^c MA cut-off is the MA above which all subjects who attempted the test could achieve a score above basal
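One way to read the MA cut-off definition in footnote c is as the highest mental age among subjects who attempted a test but scored at basal, so that every subject above that MA scored above basal. The sketch below follows that reading; the function name and the data are hypothetical, and the paper does not describe the exact procedure used.

```python
def ma_cutoff(attempts):
    """Return the MA above which every attempting subject scored above basal.

    attempts: list of (mental_age_years, scored_above_basal) pairs for
    subjects who tried the test. Returns None when no subject scored at
    basal, i.e. no cut-off applies (reported as "None" for the NVALT).
    """
    basal_mas = [ma for ma, above_basal in attempts if not above_basal]
    return max(basal_mas) if basal_mas else None

# Hypothetical cohort: two subjects at basal, three above basal
cohort = [(2.5, False), (3.0, False), (4.2, True), (6.0, True), (9.1, True)]
print(ma_cutoff(cohort))  # 3.0, which would be reported as ">3.0"
```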

https://static-content.springer.com/image/art%3A10.1007%2Fs10803-008-0564-8/MediaObjects/10803_2008_564_Fig2_HTML.gif
Fig. 2. Correlation plots for MA and representative cognitive measures, the RBANS list learning (N = 28) and Woodcock–Johnson spatial relations tasks (N = 27), in subjects with FXS. Examples of the test profiles that were generated for all tasks in this study are presented alongside the graphs

There was no correlation of performance with chronological age for any of the measures. MA cut-offs varied widely between tests (Table 2), and low functioning subjects with FXS (Mental Age < 3) were hard to test and had difficulty completing any tests among the battery.

Reproducibility of the testing result for all measures except the NVALT is shown in Table 3.
Table 3. Reproducibility of measures for FXS cohort

| Test | Reproducibility | N |
|---|---|---|
| RBANS list learning | ICC = 0.7 | 41 |
| RBANS story memory | ICC = 0.7 | 38 |
| W–J spatial relations | ICC = 0.9 | 34 |
| Symbol search | ICC = 0.6 | 27 |
| NEPSY tower | ICC = 0.7 | 38 |
| Card task | κ = 0.5 (color), κ = 0.6 (number) | 31, 31 |
| FXS Project CPT (commissions) | κ = 0.7 (auditory), κ = 0.8 (visual) | 29, 31 |

Despite the lack of correlation of the score achieved on the NVALT with MA, reproducibility and apparent learning effects seen with the NVALT seemed to be dependent on cognitive ability (learning effects occurring only in higher functioning individuals) and analysis of the NVALT will be the subject of a separate report. The RBANS List Learning and Story Memory subtests, the W–J Spatial Relations subtest, the NEPSY Tower, and the Carolina FXS Project CPT Commissions score all had ICC or weighted kappa for reproducibility of 0.7 or more, suggesting good reproducibility for test–retest in a population with FXS over a broad age range. The Card Task and Symbol Search subtest had more borderline reproducibility values of 0.5–0.6. There was no practice effect demonstrated on any of the measures aside from the NVALT.

Discussion

In this study we evaluated the reproducibility of measures that were thought likely to represent good outcome measures for cognitive change in potential future clinical trials targeting cognition in populations with FXS. A number of the measures had sufficiently good reproducibility on the test–retest analysis. However, based on MA cut-offs, below which basal scores are obtained in a fraction of the group, some of the tests with reasonable reproducibility could not be effectively used in a clinical trial enrolling typical subjects with FXS, as the lower functioning subjects would not be able to achieve a score. Further, the higher functioning subjects with FXS frequently received maximal scores on tests such as the Carolina FXS Project CPT. Since it would be preferable not to limit clinical trial enrollment to higher functioning subjects with FXS, either tests covering a very broad range of ability level or a set of tests addressing different MA ranges will be needed to obtain valid testing data from an entire representative cohort with FXS.

Even if all subjects are capable of obtaining a score above basal on a given measure, refusal is a large problem in cohorts with FXS and can significantly limit data acquisition in a clinical trial setting. This has been a challenge in clinical trials involving testing of subjects with autism as well and has limited cognitive assessment in the large controlled trial of risperidone in autistic subjects (McCracken et al. 2002). Refusals were less problematic for the group of adults with FXS in this study and in our prior CX516 study (Berry-Kravis et al. 2002). Personnel at our center were highly familiar with FXS, most subjects in the study were familiar already with the clinic environment due to prior visits, and care was taken to maximize the comfort level of subjects, minimize anxiety, and provide breaks, re-enforcers and frequent re-direction to task. Thus, conditions during the testing process in this study were optimized for subjects with FXS, suggesting that refusals will occur in most clinical trial settings and need to be taken into account during study design, especially when enrolling children with FXS. Further, in order to minimize refusals, efforts must be made to reduce anxiety for subjects with FXS during clinical trials, by ensuring familiarity and habituating the subject to the testing environment and training of study personnel in behavioral patterns of children with FXS and techniques for motivating performance in subjects with FXS.

The RBANS List Learning and Story Memory subtests appear to be sufficiently reproducible and have a range adequate for use in clinical trials enrolling subjects with FXS across virtually the full range of functioning. The Carolina FXS Project CPT can be used for studies enrolling typical or low functioning subjects with FXS but may not be able to measure improvement in high functioning subjects with FXS, and the W–J Spatial Relations subtest and NEPSY Tower should be sufficiently reproducible for use in clinical trials involving typical to high functioning subjects with FXS. Few of the measures examined in this study could be done by low-functioning subjects with MA <4 years and this is a general problem which limits cognitive testing not only in lower functioning adults and older children, but also in typical functioning young (<5 years) children with FXS. If newer treatments show promise in older age groups, there will be a need to extend these treatments to clinical trials in very young children with FXS, to potentially begin to impact development at the youngest possible age. Thus, there is a significant need for further research to develop valid and reproducible outcome measures that can successfully evaluate cognition in very young children and lower functioning individuals with FXS.

This study serves as an example to demonstrate that test profiles (Fig. 2) involving MA plots, completion rates, and reproducibility can be generated for cohorts of subjects with FXS for the tests utilized in this report and potentially for other measures thought to be applicable in the future. Separate profiles on different or overlapping measures chosen based on the characteristics of the population to be tested could also be generated for cohorts with autism and other developmental disabilities. Such profiles will be extremely valuable in choosing outcome measures for clinical trials involving new therapeutic agents.

Acknowledgments

This study was supported by a grant from the Spastic Paralysis Research Foundation of the Illinois-Eastern Iowa District of Kiwanis International. The authors would like to thank Steve Hooper PhD for providing the Carolina Fragile X Project CPT, Mina Johnson PhD for providing the Card Task, and Isabel Boutet PhD for assisting with training of study staff for use of the NVALT. The authors especially thank the fragile X families who participated in this trial for their time and enthusiasm.

Copyright information

© Springer Science+Business Media, LLC 2008