Fragile X syndrome (FXS) is the most common genetic cause of intellectual impairment (Crawford et al. 2001) and the most common monogenic cause of autism (Cohen et al. 2005). FXS results from an expansion mutation of greater than 200 CGG trinucleotide repeats in the promoter region of the fragile X mental retardation 1 (FMR1) gene on the X chromosome (Verkerk et al. 1991), which is associated with methylation and transcriptional silencing of the gene and consequently leads to reduction or complete absence of fragile X mental retardation protein (FMRP; Devys et al. 1993). The lack of FMRP gives rise to abnormal dendritic spine maturation and synaptic pruning during brain development (Bagni and Greenough 2005; Greenough et al. 2001). In addition to mild to severe intellectual impairment, individuals affected by FXS commonly exhibit behaviors consistent with attention-deficit hyperactivity disorder (ADHD), autism spectrum disorders (ASD), and social anxiety (for reviews see Hagerman 2002; Schneider et al. 2009).

Recent advances in translational research have furthered our understanding of the neurobiology underlying FXS and have led to a surge in the development of pharmacological treatments targeted at ameliorating downstream effects of reduced FMRP. The most prominent framework for the development of drug therapies is the “mGluR theory” of FXS, which suggests that FMRP modulates dendritic maturation through a mechanism involving the repression of metabotropic glutamate receptor (mGluR)-mediated protein synthesis (Bear et al. 2004). Support for this theory comes from animal studies which have demonstrated that without the negative feedback provided by FMRP, mGluR5-dependent hippocampal and cerebellar long-term depression (LTD) is enhanced, cortical synaptic AMPA receptor numbers are reduced, and dendritic processes are structurally immature in the FMR1 knockout mouse (Bear et al. 2004; Huber et al. 2002; Irwin et al. 2002). These and other phenotypic abnormalities are rescued in the mouse (Yan et al. 2005) and fruit fly (McBride et al. 2005) models of FXS with mGluR5 antagonists such as MPEP (2-methyl-6-(phenylethynyl)pyridine). The consistency of these promising results across several laboratories has spurred the development of mGluR5 antagonists as potential pharmacotherapies for targeting the underlying neurobiological pathology of FXS (for reviews see Waung and Huber 2009; Bear 2005).

Phase II neurotherapeutic clinical trials in humans with FXS are currently in progress and additional trials are being designed, generating an urgent need for objective, empirically validated, quantitative outcome measures in order to assess the efficacy of drug treatments. Outcome measures based on standardized assessments and parent questionnaires may be age- or IQ-specific and are often susceptible to strong placebo effects and rater bias. Furthermore, these outcome measures often do not address effects of treatment within a specific cognitive domain let alone a specific neural pathway or system. Psychophysiological tasks, on the other hand, offer significant appeal as outcome measures as much of the underlying neurobiological circuitry involved in these tasks has already been well characterized in the literature. These measures are also much less prone to placebo effects and bias, and may therefore provide greater sensitivity to treatment efficacy than standardized measures, particularly in smaller scale Phase II studies. Furthermore, despite numerous reports of the effectiveness of behavioral interventions among patients with FXS, biological measures that can be used to evaluate the specificity of these programs are lacking. An example of the utility of psychophysiological measures in early-phase clinical trials in FXS was the demonstration of improvement in prepulse inhibition in an open label, single dose trial of the mGluR5-blocker fenobam (Berry-Kravis et al. 2009).

Gaze avoidance is a hallmark behavioral feature of FXS (Cohen et al. 1988; Bregman et al. 1988; Garrett et al. 2004; Farzin et al. 2009; Cohen et al. 1989), and has been physiologically linked with cortisol dysregulation (Hessl et al. 2002, 2006) and enhanced autonomic reactivity (Farzin et al. 2009; Hall et al. 2006, 2009; Belser and Sudhalter 1995). Findings from neuroimaging studies suggest that atypical neural circuitry involved in face processing and social cognition may exist in individuals with FXS (Dalton et al. 2007; Holsen et al. 2008). Recent work has used eye tracking methods to quantify differences in gaze patterns and pupillary reactivity when adolescents and adults with and without FXS passively viewed images of faces, suggesting that eye tracking may hold significant potential as a measure of a specific core behavioral and physiological phenotype observed in individuals with FXS.

The aim of the present study was to examine the feasibility and reliability of eye tracking and pupillometry as potential outcome measures for evaluating the efficacy of psychopharmacological treatments in individuals with FXS. To do so, we utilized a previously developed eye tracking protocol (Farzin et al. 2009) to quantify gaze aversion and face-specific pupillary response across two test sessions in groups of participants with and without FXS. Given that the mechanisms and function of mGluR5 likely contribute to core phenotypic features found in individuals with FXS, we expect that using eye tracking to measure visual processing of faces will prove particularly sensitive to assessing treatment-specific outcomes related to symptoms of social anxiety and sensory hyperarousal.



Participants included 15 individuals with FXS confirmed by DNA testing to carry the full FMR1 mutation (12 males; ages 7–51 years) and 20 neurotypically developing (NT) controls (10 males; ages 7–71 years). Participants were matched based on chronological age (t (33) = 1.522, p = 0.138). Individuals with FXS were recruited through the Fragile X Research Clinic and Research Program at Rush University Medical Center. NT participants were recruited from the Rush campus community and local schools in the Chicago area and were included only if there was no history of psychiatric diagnosis. All participants had normal or corrected-to-normal vision. Adult participants or parents/guardians of child or adult FXS participants provided written consent according to a protocol approved by the Institutional Review Board at Rush University Medical Center.

At the time of testing, 12 individuals with FXS (80% of the group) were being treated with at least one class of psychoactive medication: SSRI/SNRI (46%), stimulant (38%), or antipsychotic (31%).

Cognitive abilities were measured in 13 of the 15 individuals with FXS (10 males) using the Wechsler Scales of Intelligence (Wechsler Intelligence Scale for Children, Fourth Edition; Wechsler Abbreviated Scale of Intelligence; The Psychological Corporation, San Antonio, TX) or the Stanford Binet, Fifth Edition (Riverside Publishing, Rolling Meadows, IL). Males with FXS had a mean IQ of 51.8 (SD = 10.9; range = 36–72) and females with FXS had a mean IQ of 76.3 (SD = 9.3; range = 72–86).

The Aberrant Behavior Checklist (ABC; Aman et al. 1985) is a 58-item rating scale developed for persons with developmental disabilities and was used to assess maladaptive behaviors including hyperactivity, lethargy/social withdrawal, inappropriate speech, and irritability in individuals with FXS.

The Social Responsiveness Scale (SRS; Constantino and Gruber 2005) was administered to individuals with FXS to evaluate the severity of autism spectrum symptoms.

Group characteristics are given in Table 1.

Table 1 Participant characteristics by group (mean ± SD)

Apparatus and Stimuli

A Tobii T120 infrared binocular eye tracker (Tobii Technology, Sweden) was used to record X and Y coordinates of eye position and pupil diameter. This video-based system consists of a high-resolution camera embedded in a 17-inch thin-film transistor LCD monitor (1,280 × 1,024 pixels resolution), which promotes more natural user behavior since it does not place restraints on participants such as a helmet, head-mounted sensor, or glasses. The eye tracker samples the position of the eyes at a rate of 120 Hz (one data point approximately every 8.3 ms), with an average precision of within 0.5° of visual angle.

Stimuli were identical to those used by Farzin et al. (2009). Images consisted of 60 colored photographs of adult human faces (equal numbers of males and females; different races and ethnicities) from the NimStim Face Stimulus Set (Tottenham et al. 2002), each face exhibiting a calm, happy, or fearful expression, and 60 scrambled versions of the face images. To ensure that pupillary response to the onset of a face was independent of a pupillary light reflex, each face and corresponding scrambled image were matched on mean luminance, and equivalence was confirmed using a photometer (Minolta, LS-100, Osaka, Japan). Face images subtended a 12.12° by 17.19° region (the size of an actual human face) when viewed from a distance of 60 cm, and were presented on a standard 50% grey background (RGB: 128, 128, 128).
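For readers who wish to reproduce the stimulus geometry, the relation between visual angle, viewing distance, and on-screen size can be sketched as follows. This is an illustrative calculation only; the function names are hypothetical and are not part of the original protocol.

```python
import math


def size_from_visual_angle(angle_deg: float, distance_cm: float) -> float:
    """Physical extent (cm) that subtends angle_deg at a viewing distance of distance_cm."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def visual_angle_from_size(size_cm: float, distance_cm: float) -> float:
    """Visual angle (degrees) subtended by an object of size_cm at distance_cm."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


# Values reported in the text: 12.12 deg by 17.19 deg at 60 cm.
width_cm = size_from_visual_angle(12.12, 60.0)
height_cm = size_from_visual_angle(17.19, 60.0)
print(f"{width_cm:.1f} cm x {height_cm:.1f} cm")
```

Under these values the face images would measure roughly 12.7 cm by 18.1 cm on the screen.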


Testing was conducted in a quiet room with the lights turned off. The eye tracker was calibrated for each participant at the beginning of each session using a standardized 9-point routine. Following calibration, participants were told to view the pictures shown on the screen. Each trial began with presentation of a scrambled face image for 1 s followed immediately by its matched face image for 3 s. An inter-trial interval (ITI) containing a uniform grey screen was shown for 0.5, 1, or 2 s, randomly determined. The order of face presentation was pseudorandomized and each eye tracking session lasted approximately 6 min. All measurements were analyzed offline.

Test–retest reliability of eye tracking measures was assessed based on two testing sessions separated by no more than 2 weeks. This time interval was chosen in order to match the time frame between clinic visits used in the protocol of an ongoing clinical trial. Test–retest intervals were equivalent between groups (NT controls: 9 days, FXS: 10 days; [t (33) = −0.930, p = 0.359]).


Four area-of-interest (AOI) regions were defined for each face image: eyes (including the eyebrows), nose, mouth, and other. Scrambled faces included a single AOI around the ellipse. Measures included number of fixations (where a fixation was defined as any data point within a 30 pixel radius for a minimum duration of 150 ms) and proportion of looking time to each AOI region (calculated by dividing looking time to AOI region by total looking time to face).
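The fixation and looking-time measures defined above can be sketched as follows. This is a minimal illustration, not the filter used by the eye tracking software: the running-centroid rule for applying the 30 pixel radius, the rectangular AOI boxes, and all function names and coordinates are assumptions made for the example.

```python
from dataclasses import dataclass
from math import hypot

SAMPLE_RATE_HZ = 120      # tracker sampling rate reported in the text
RADIUS_PX = 30.0          # fixation radius reported in the text
MIN_DURATION_MS = 150.0   # minimum fixation duration reported in the text


@dataclass
class Fixation:
    x: float
    y: float
    duration_ms: float


def _centroid(points):
    n = len(points)
    return sum(p[0] for p in points) / n, sum(p[1] for p in points) / n


def _close(window, fixations):
    """Store the window as a fixation if it lasted at least MIN_DURATION_MS."""
    if not window:
        return
    duration_ms = len(window) * 1000.0 / SAMPLE_RATE_HZ
    if duration_ms >= MIN_DURATION_MS:
        cx, cy = _centroid(window)
        fixations.append(Fixation(cx, cy, duration_ms))


def detect_fixations(samples):
    """Group consecutive (x, y) samples that stay within RADIUS_PX of the
    running centroid (an assumed rule; the source gives only radius and duration)."""
    fixations, window = [], []
    for x, y in samples:
        if window:
            cx, cy = _centroid(window)
            if hypot(x - cx, y - cy) > RADIUS_PX:
                _close(window, fixations)
                window = []
        window.append((x, y))
    _close(window, fixations)
    return fixations


def summarize_by_aoi(fixations, aois):
    """aois maps name -> (left, top, right, bottom); the first matching box wins,
    so the catch-all 'other' box should be listed last."""
    counts = {name: 0 for name in aois}
    time_ms = {name: 0.0 for name in aois}
    for f in fixations:
        for name, (left, top, right, bottom) in aois.items():
            if left <= f.x <= right and top <= f.y <= bottom:
                counts[name] += 1
                time_ms[name] += f.duration_ms
                break
    total = sum(time_ms.values())
    proportions = {n: (time_ms[n] / total if total else 0.0) for n in aois}
    return counts, proportions


# Hypothetical AOI boxes (pixels) and a synthetic gaze trace.
aois = {
    "eyes": (300, 250, 500, 350),
    "nose": (350, 350, 450, 500),
    "mouth": (330, 550, 510, 650),
    "other": (250, 150, 550, 750),
}
gaze = ([(400, 300)] * 24                                   # 200 ms on the eyes
        + [(410, 360), (414, 420), (418, 480), (420, 540)]  # saccade samples
        + [(420, 600)] * 36)                                # 300 ms on the mouth
fixations = detect_fixations(gaze)
counts, proportions = summarize_by_aoi(fixations, aois)
print(counts, proportions)
```

In this synthetic trace the saccade samples are too brief to count as fixations, so the proportion of looking time is computed over the two qualifying fixations only.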

Pupil data were filtered to remove points in which both eyes were not successfully recorded, outlier values corresponding to blinks, loss of tracking data, or large changes in head position, and trials in which the participant did not look at the preceding scrambled face image for 3 or more consecutive 250 ms intervals (rendering the baseline pupil diameter invalid). Mean pupil diameter was calculated for interval durations of 250 ms across the 3-s face presentation (12 intervals), time-locked to the onset of the image. The following calculation was used to compute the face-specific pupillary response: mean pupil diameter during the face presentation interval minus mean pupil diameter during the scrambled face presentation. To provide a relative change, we “standardized” this difference value by dividing it by the mean pupil diameter during the scrambled face presentation. Relative change in pupil diameter was averaged across trials of each face emotion. Pupillary response during each interval of the scrambled face presentation (4 intervals) was calculated with respect to the mean pupil diameter during the ITI.
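The face-specific pupillary response described above can be sketched for a single trial as follows. This is an illustration under stated assumptions: invalid samples are represented as None, an interval counts as "not looked at" when it contains no valid samples, and the function names are hypothetical.

```python
from statistics import mean

SAMPLE_RATE_HZ = 120
INTERVAL_MS = 250
N_FACE_INTERVALS = 12                                     # 3 s of face / 250 ms
SAMPLES_PER_INTERVAL = SAMPLE_RATE_HZ * INTERVAL_MS // 1000  # 30 samples


def baseline_is_valid(scrambled_samples):
    """Reject the trial if 3 or more consecutive 250 ms baseline intervals
    contain no valid samples (assumed reading of the exclusion rule)."""
    run = 0
    for start in range(0, len(scrambled_samples), SAMPLES_PER_INTERVAL):
        chunk = scrambled_samples[start:start + SAMPLES_PER_INTERVAL]
        if all(d is None for d in chunk):
            run += 1
            if run >= 3:
                return False
        else:
            run = 0
    return True


def face_specific_response(face_samples, scrambled_samples):
    """Relative pupil change per 250 ms face interval:
    (mean face diameter - mean scrambled diameter) / mean scrambled diameter."""
    if not baseline_is_valid(scrambled_samples):
        return None
    baseline = mean(d for d in scrambled_samples if d is not None)
    responses = []
    for i in range(N_FACE_INTERVALS):
        start = i * SAMPLES_PER_INTERVAL
        chunk = [d for d in face_samples[start:start + SAMPLES_PER_INTERVAL]
                 if d is not None]
        responses.append((mean(chunk) - baseline) / baseline if chunk else None)
    return responses


# Synthetic trial: 4.0 mm baseline, pupil widening by 0.04 mm per interval.
scrambled = [4.0] * 120
face = []
for i in range(N_FACE_INTERVALS):
    face += [4.0 + 0.04 * i] * SAMPLES_PER_INTERVAL
responses = face_specific_response(face, scrambled)
print([round(r, 3) for r in responses])
```

Per-trial responses computed this way would then be averaged across trials of each face emotion, as described in the text.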


All participants successfully completed the experimental procedure during both testing sessions. Individuals with and without FXS provided gaze data for a comparable number of slides of each emotion type across sessions [F (2, 32) = 0.912, p = 0.412], allowing us to rule out possible confounds such as differences in attention to faces or general motivation between groups.

Replicating findings reported in Farzin et al. (2009), individuals with FXS made fewer fixations to and spent less time looking at the eye region of all faces, relative to the NT control group, during both testing sessions (Fig. 1). A repeated measures analysis of variance (ANOVA) with AOI region (eye, nose, mouth, and other), emotion (calm, happy and fear), testing session (1 and 2), and group (FXS and NT) as independent variables and number of fixations as the dependent variable revealed significant main effects of AOI region [F (3, 31) = 22.73, p = 0.0001, η2 = 0.687] and emotion [F (2, 32) = 4.029, p = 0.028, η2 = 0.201], and significant interaction effects between AOI region and group [F (3, 31) = 6.12, p = 0.002, η2 = 0.372] and emotion and group [F (2, 32) = 4.711, p = 0.016, η2 = 0.227]. No effect of session was found [F (1, 33) = 0.280, p = 0.460]. Independent-samples t-tests confirmed that, compared to controls, individuals with FXS made fewer fixations to the eye region of all face images (FXS M = 1.61, SD = 1.04; NT M = 3.99, SD = 1.59; [t (33) = 5.01, p = 0.0001]) and made fewer overall fixations when the happy (FXS M = 2.48, SD = 1.22; NT M = 3.53, SD = 1.19; [t (33) = 3.28, p = 0.002]) and calm (FXS M = 2.47, SD = 1.21; NT M = 3.48, SD = 1.18; [t (33) = 2.05, p = 0.049]) faces were on the screen.

Fig. 1 Mean number of fixations to each AOI region by group for test sessions 1 and 2. Error bars represent the standard errors of the mean. Asterisk and double asterisk indicate significant difference between pairwise comparisons at the p < 0.05 and p < 0.01 level, respectively

A similar repeated measures ANOVA was conducted using proportion of looking time as the dependent variable, which also yielded a main effect of AOI region [F (3, 31) = 22.87, p = 0.0001, η2 = 0.689] and a significant interaction effect between AOI region and group [F (3, 31) = 8.711, p = 0.0001, η2 = 0.457]. A significant interaction effect between AOI region and emotion was identified [F (6, 28) = 6.71, p = 0.0001, η2 = 0.590], driven by generally longer looking to the mouth region of happy faces compared to either of the other two emotions. No effect of session was observed [F (1, 33) = 0.626, p = 0.435]. Independent-samples t-tests confirmed that, across both sessions, individuals with FXS spent approximately half as much time looking at the eye region (FXS: M = 16.2%, SD = 11.41; NT: M = 28.3%, SD = 10.04; [t (33) = 3.33, p = 0.002]) and a larger proportion of time looking at the mouth region (FXS: M = 42.1%, SD = 12.19; NT: M = 30.5%, SD = 4.43; [t (33) = −3.97, p = 0.0001]) of all faces, compared to controls (Fig. 2).

Fig. 2 Mean proportion looking time to each AOI region by group for test sessions 1 and 2. Error bars represent the standard errors of the mean. Asterisk and double asterisk indicate significant difference between pairwise comparisons at the p < 0.05 and p < 0.01 level, respectively

Importantly, no group difference was found for the number of fixations made to [F (1, 33) = 0.710, p = 0.405, η2 = 0.21], or time spent looking at [F (1, 33) = 2.782, p = 0.105, η2 = 0.780], the scrambled images across sessions, suggesting that the group differences in gaze behavior were face-specific. Within the group of individuals with FXS, no effect of gender was found for number of fixations made to the eye region of faces [F (1, 14) = 0.008, p = 0.929], or proportion of time spent looking at the eye region of faces [F (1, 14) = 0.266, p = 0.614] across the two sessions, suggesting that there was no difference in the extent of eye gaze avoidance between males and females with FXS. Likewise, age was not associated with number of fixations made to the eye region of faces [F (1, 14) = 0.487, p = 0.83], or proportion of time spent looking at the eye region of faces [F (1, 14) = 1.526, p = 0.463] across the two sessions in individuals with FXS. Sex and age yielded no effects within the control group either.

Individuals with FXS demonstrated significantly greater pupillary dilation in response to faces, relative to controls, replicating results of Farzin et al. (2009). A repeated measures ANOVA with interval (12), emotion, and test session as independent variables and face-specific change in pupil diameter as the dependent variable was conducted within each group. A significant main effect of interval revealed that pupil diameter increased across time in both groups (FXS: [F (11, 4) = 9.67, p = 0.021, η2 = 0.964], NT: [F (11, 9) = 19.96, p = 0.0001, η2 = 0.961]). This dilation was not modulated by face emotion or session for either group, as illustrated in the scatterplots presented in Fig. 3. We also analyzed pupillary response with group as a between-subject factor and confirmed that, in addition to a significant main effect of interval [F (11, 23) = 14.14, p = 0.0001, η2 = 0.871], individuals with FXS generally showed greater pupil dilation than NT controls (FXS: M = 0.018, SD = 0.03; NT: M = 0.003, SD = 0.02; [F (1, 33) = 4.68, p = 0.038, η2 = 0.124]). Pupillary reactivity to the onset of scrambled faces did not differ between groups [F (1, 34) = 1.980, p = 0.169] or as a function of session [F (1, 33) = 2.044, p = 0.162].

Fig. 3 Mean relative change in pupil diameter (mm) in response to calm, happy, and fear faces for individual participants in each group. On the X-axis are data from test session 1 and on the Y-axis are data from test session 2

Since we were primarily interested in the test–retest stability of fixation count, looking time, and pupillary response measures as potential metrics of change for treatment outcome studies, we computed the intraclass correlation coefficient (ICC; Bradley et al. 2008) between sessions for each group using a two-factor mixed-effects consistency model. If participants performed identically on the two occasions, the ICC value would indicate perfect association and agreement, and would be equal to 1. Because a group difference in gaze behavior was found across all faces, we removed emotion as a factor for the reliability analyses. Pupillary response was averaged across intervals for the reliability analyses. A high degree of reliability was found for all measures; ICCs demonstrated good (ICC > 0.40) to excellent (ICC > 0.75) test–retest reproducibility in both groups (Table 2). In the FXS group, ICCs for number of fixations and proportion looking time to the eye and nose regions were exceptionally high (ICC > 0.90) relative to controls. Pupillometry in response to fear faces was most reproducible in individuals with FXS (ICC = 0.87).
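A two-factor mixed-effects consistency ICC can be sketched from the two-way ANOVA mean squares as shown below. The text does not state whether single-measure or average-measure coefficients were reported; this sketch assumes the single-measure form, often written ICC(3,1), and the data and function name are hypothetical.

```python
def icc_consistency(scores):
    """Single-measure, two-way mixed-effects, consistency ICC.

    scores: one row per participant, one column per test session.
    ICC = (MS_rows - MS_error) / (MS_rows + (k - 1) * MS_error)
    """
    n = len(scores)       # participants
    k = len(scores[0])    # sessions
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Hypothetical fixation counts for three participants across two sessions.
shifted = [(1.0, 2.0), (2.0, 3.0), (3.0, 4.0)]   # same ranking, constant offset
noisy = [(1.0, 1.0), (2.0, 3.0), (3.0, 2.0)]     # ranking partly changes
icc_shifted = icc_consistency(shifted)
icc_noisy = icc_consistency(noisy)
print(icc_shifted, icc_noisy)
```

Because the consistency form removes the session effect, a uniform shift between sessions (the first example) still yields an ICC of 1, whereas changes in participants' relative standing (the second example) lower the coefficient.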

Table 2 Test-retest reliability (Intraclass Correlation Coefficient) of eye tracking measures between test sessions 1 and 2, by group


The main goal of this study was to investigate the test–retest reliability of eye tracking measures in individuals with FXS in order to establish their potential utility for use in clinical drug trials. Here, we present gaze position and pupillometry data recorded while individuals with and without FXS viewed images of faces, collected during two separate test sessions. These data reveal substantial quantitative differences in face processing and autonomic reactivity between groups, replicating Farzin et al. (2009). Most importantly, repeated assessment using these eye tracking measures within the same sample of participants yielded high reliability for both groups. The reduced between-subject variance present in the NT group relative to that of the FXS group may explain the lower ICC values obtained for a few of the eye tracking measures in the NT group. To our knowledge, this is the first study to demonstrate test–retest reliability of eye tracking in this developmentally delayed population. All individuals with FXS were able to complete both test sessions, regardless of age, social function, or IQ, suggesting that this protocol could be effectively used in a clinical trial enrolling children, adolescents, and adults with FXS across a broad range of functioning. Further, individuals with FXS with a wide range of behavioral severity on the ABC were able to complete the task, suggesting that participants whose ABC scores indicate sufficient behavioral dysfunction to qualify for clinical trials should still be able to complete it.

The selection of appropriate outcome measures that are both psychometrically valid and reliable internally, across raters, and across time, is critical for clinical drug trials. Our data suggest that eye tracking and pupillometry are reliable measures for evaluating changes associated with treatments.

Researchers have suggested that social avoidance/anxiety in general, and gaze aversion specifically, may be coping strategies that serve to reduce negative arousal in individuals with FXS (Garrett et al. 2004; Farzin et al. 2009; Hessl et al. 2006). A recent study has shown that eye contact in children with FXS is amenable to improvement through behavioral training (Hall et al. 2009). While the exact biological bases for differences in gaze behaviors are not fully understood, research has shown that secretion of cortisol, by means of a cascade of hormones along the hypothalamic–pituitary–adrenal (HPA) axis, involves feedback between several limbic structures, among which the amygdala plays a leading role. Long-term potentiation in the amygdala requires activation of mGluR5 and is impaired in FMR1 knockout mice (Zhao et al. 2005; Suvrathan et al. 2010; Rodrigues et al. 2002). The present protocol, if used to test the efficacy of neurotherapeutic agents such as mGluR5 antagonists, would provide information regarding not only the therapeutic potential for social anxiety in FXS, but also the primary site(s) and mechanism(s) of action of the drug.