Participants
After exclusion, the final sample included 91 infants, with data from assessments performed at 10, 14, and 18 months of age (see section Assessment at 10, 14 and 18 months of age for the amount of valid gaze data per assessment and Table 1 for group comparisons of background characteristics). All participants were part of the Early Autism Sweden study (EASE; for a general overview, see http://www.earlyautism.se). The EASE study is a prospective longitudinal study of infant siblings at elevated likelihood of ASD. At the time of enrollment, the clinical outcome is unknown; thus, infants with an older sibling diagnosed with ASD represent the “elevated likelihood” group, while infant siblings with no familial history of ASD represent the “low likelihood” group. At 36 months of age, a clinical diagnostic assessment was conducted and children were assigned to (1) an elevated likelihood group with ASD (EL-ASD); (2) an elevated likelihood group without ASD (EL-no-ASD); or (3) a low likelihood group with neurotypical outcome (LL), with the characteristics summarized in Table 1.
Table 1 Participant characteristics by group, final samples (n, Mean, SD) at 10, 14 and 18 months of age
Participating families were recruited through multiple channels, including the project’s website, advertisements, clinical units, and a database of families within the larger Stockholm area who had indicated interest in research participation previously. All children included in the study were born at full term (> 36 weeks) and no infant had any confirmed or suspected medical problems (including visual and auditory impairments).
All families provided written informed consent and the study was approved by the Regional Ethical Board in Stockholm. The study was conducted in accordance with the standards specified in the 1964 Declaration of Helsinki.
Assessment at 10, 14 and 18 months of age
Participants in the EASE study undergo multiple measures and assessments including eye tracking (Falck-Ytter et al. 2018; Nyström et al. 2018, 2019), motion tracking (Achermann et al. 2020), electroencephalography (EEG), magnetic resonance imaging (MRI), parent–child interaction, and developmental assessments, spending 4–5 h in the lab. This study includes data from the eye tracking session and the developmental assessment using the Mullen Scales of Early Learning (MSEL; Mullen, 1995), at 10, 14 and 18 months of age. Regarding the eye tracking session, at 10 months of age 76.8% of infants provided valid gaze data (n = 70), at 14 months of age 70.0% provided valid gaze data (n = 63, 1 infant did not complete the assessment), and at 18 months of age 80.5% of infants provided valid gaze data (n = 70, 4 infants did not complete the assessment).
Assessment at 36 months of age
At 36 months, a clinical diagnostic assessment was conducted by experienced psychologists and included standardized information on medical history, current developmental and adaptive level, as well as autistic symptoms using the Autism Diagnostic Interview-Revised (ADI-R; Rutter et al. 2003), the Autism Diagnostic Observation Schedule 2nd Edition (ADOS-2; Lord et al. 2012), and the Mullen Scales of Early Learning (MSEL; Mullen 1995).
Data collection
The current eye tracking task was part of a larger eye tracking session (total duration of approximately 8 min) including other experiments (Falck-Ytter et al. 2018; Kleberg et al. 2018) which are not relevant for the current research questions and which are not analyzed here. The eye tracking session typically took place after the lunch break and, if the infant was tired, a nap. The infants were seated on their parent’s lap approximately 60 cm from the computer monitor where the stimuli were presented. Gaze data were collected using Tobii corneal reflection eye trackers (Tobii AB, Danderyd, Sweden). During this longitudinal study, equipment changes occurred, such that data were first recorded on a Tobii 1750 at 50 Hz and, after an upgrade, on a Tobii TX300 eye tracker at 120 Hz. The two trackers used different sampling rates and screen resolutions, but after temporal upsampling using linear interpolation and spatial resampling (stimuli were presented at different screen resolutions but the same physical size), all recordings had a resolution of 1024 × 768 pixels and a screen size of 23″ with a sample rate of 300 Hz. Before the eye tracking session, the eye tracker was calibrated using a five-point calibration procedure in which a coloured sphere expanded and contracted on the screen in synchrony with sound. The sphere expanded sequentially at five locations on the screen (i.e., every corner and the centre). The procedure was repeated if necessary, until an acceptable calibration of both eyes was obtained (as in Kleberg et al. 2018; Nyström et al. 2018).
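As an illustration of the temporal upsampling step, a minimal Python sketch is given below. It is not the custom MATLAB code used in the study; the function name upsample_gaze and its parameters are hypothetical, and the sketch assumes monotonically increasing timestamps in milliseconds and a signal without missing samples.

```python
import numpy as np

def upsample_gaze(t_ms, x, target_hz=300.0):
    """Linearly interpolate one gaze coordinate onto a uniform grid at target_hz.

    t_ms : original sample timestamps in milliseconds (assumed increasing)
    x    : gaze coordinate at each timestamp (assumed free of NaN)
    """
    t_ms = np.asarray(t_ms, dtype=float)
    x = np.asarray(x, dtype=float)
    step = 1000.0 / target_hz  # sample spacing of the target grid, in ms
    # Number of target samples between the first and last original timestamp;
    # the small epsilon guards against floating-point rounding at the end point.
    n = int(np.floor((t_ms[-1] - t_ms[0]) / step + 1e-9)) + 1
    t_new = t_ms[0] + step * np.arange(n)
    return t_new, np.interp(t_new, t_ms, x)

# Three samples recorded at 50 Hz (20 ms apart), resampled to 300 Hz.
t_new, x_new = upsample_gaze([0.0, 20.0, 40.0], [0.0, 6.0, 12.0])
```

The same function would be applied separately to the horizontal and vertical coordinates; spatial resampling to a common pixel grid is a separate step not shown here.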
Stimulus
The stimulus consisted of a moving object (a ball; radius = 10 px, 0.44°; Fig. 1) that started moving horizontally from the left side of the screen with constant speed, accompanied by a music track to increase interest in the task. In the middle of the screen, after 960 ms, the moving object disappeared behind a circular occluder (radius = 100 px, 4.36°). At the center of the screen (behind the occluder) the object changed direction 90° counter-clockwise, and continued moving in this direction. The object reappeared after 1120 ms, and continued upward for 960 ms. Then, the object reversed direction and moved back to the starting point along the same trajectory. Thus, the moving object rolled back and forth between two endpoints at constant velocity in a horizontally flipped L-shape (Fig. 1). For analysis, we defined one trial as one occlusion passage from one endpoint to the other (3040 ms), either starting on the side and ending on top, or starting on top and ending on the side. Each infant was presented with 2 blocks consisting of 10 such occlusion passages; hence, every session included 20 occlusion passages (trials). Within the blocks, the trials formed a continuous back-and-forth movement of the ball following its L-shaped trajectory. A subgroup of infants saw 3 blocks of the task due to changes in the experimental setting; however, this additional block was excluded from the analysis in order to create congruent preconditions for all infants included in the study.
To assess gaze data, five areas of interest (AOIs) were defined. These included four oblong rectangles for each possible direction of the moving object and one AOI for the occluder. The size of each AOI was kept constant throughout the experiment (see Fig. 1).
Data reduction
Raw gaze coordinates were analyzed in MATLAB (R2015a, Mathworks Inc., CA, USA) using custom written scripts. First, we measured the time at which the object was completely occluded and the time at which it started to reappear. These timestamps were then used to define a window of interest for the analysis. Occlusion always occurred 1120 ms before reappearance, and the former was defined as time point 0. Trials with less than 50% gaze data prior to the event (i.e., reappearance) were automatically discarded. Infants contributing fewer than 4 trials were excluded from the analysis (n = 19). Next, we interpolated gaze data linearly over gaps shorter than 15 samples (i.e. 50 ms). All trials underwent visual inspection, blinded for infant identity and group status, in order to remove trials containing movement artefacts, noisy data, or missing data close to the occlusion event. Visual inspection was done by plotting the two-dimensional gaze data over time together with the AOIs for the trajectory of the moving object and the occluder (see Supplementary Figures). Gaze data were manually transposed so that the AOIs covered as much gaze as possible, which accounted for gaze calibration drifts during the recording. Then, gaze velocity was calculated and plotted in order to identify the gaze shift latency towards an AOI after occlusion.
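The automatic parts of this procedure (the 50% validity criterion and the interpolation over short gaps) can be sketched in Python as follows. This is an illustrative sketch, not the MATLAB scripts used in the study; the function names are hypothetical, missing samples are assumed to be coded as NaN, and only gaps with valid samples on both sides are filled.

```python
import numpy as np

def has_enough_data(x, min_fraction=0.5):
    """True if at least min_fraction of the samples in the window are valid."""
    x = np.asarray(x, dtype=float)
    return bool(np.mean(~np.isnan(x)) >= min_fraction)

def fill_short_gaps(x, max_gap=15):
    """Linearly interpolate runs of NaN shorter than max_gap samples.

    At 300 Hz, 15 samples correspond to 50 ms. Longer gaps, and gaps at the
    start or end of the signal, are left as NaN.
    """
    x = np.asarray(x, dtype=float).copy()
    missing = np.isnan(x)
    n = len(x)
    i = 0
    while i < n:
        if not missing[i]:
            i += 1
            continue
        j = i
        while j < n and missing[j]:
            j += 1  # j ends at the first valid sample after the gap
        if i > 0 and j < n and (j - i) < max_gap:
            x[i:j] = np.interp(np.arange(i, j), [i - 1, j], [x[i - 1], x[j]])
        i = j
    return x

# A 2-sample gap (filled) followed by a 20-sample gap (left untouched).
raw = np.array([0.0, np.nan, np.nan, 3.0] + [np.nan] * 20 + [5.0])
filled = fill_short_gaps(raw)
```

In this sketch the validity check would be applied to the samples preceding reappearance, before any interpolation.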
The pupillary response to the reappearance of the moving object after temporary occlusion was measured after the gaze shift towards the object. The change in size of the pupil was calculated based on a 2000 ms time interval after the gaze shift relative to a baseline measure prior to the gaze shift (1000 ms), and converted to a percentage (i.e. 100 is same size as baseline, < 100 means pupil constriction, and > 100 means pupil dilation). All pupil measurements were taken from the left eye instead of the average of both eyes to avoid artifacts if the eye tracker lost track of one of the eyes.
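A minimal Python sketch of this baseline-relative measure is shown below. It assumes that pupil size is summarized as the mean over each window, which is not specified in the text; the function name pupil_percent_of_baseline and its parameters are hypothetical.

```python
import numpy as np

def pupil_percent_of_baseline(pupil, shift_idx, hz=300.0,
                              baseline_ms=1000.0, response_ms=2000.0):
    """Pupil size after the gaze shift as a percentage of the pre-shift baseline.

    pupil     : left-eye pupil size per sample (NaN for missing samples)
    shift_idx : sample index of the gaze shift
    Assumption: both windows are summarized by their mean, ignoring NaN.
    """
    pupil = np.asarray(pupil, dtype=float)
    n_base = int(round(baseline_ms * hz / 1000.0))
    n_resp = int(round(response_ms * hz / 1000.0))
    if shift_idx < n_base or shift_idx + n_resp > len(pupil):
        raise ValueError("signal too short for the requested windows")
    baseline = np.nanmean(pupil[shift_idx - n_base:shift_idx])
    response = np.nanmean(pupil[shift_idx:shift_idx + n_resp])
    return 100.0 * response / baseline

# 1 s of baseline at 4.0 mm followed by 2 s at 4.4 mm: a value above 100
# indicates dilation relative to baseline.
signal = np.concatenate([np.full(300, 4.0), np.full(600, 4.4)])
percent = pupil_percent_of_baseline(signal, shift_idx=300)
```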
Dependent variables
The behavioral measures of the study were gaze shift latencies and pupillary responses as described above, which were measured both across trials within each testing session and across the three different time points (10, 14, and 18 months). Individual trial values were used as the most basic dependent variables, and performance on the first trial was of particular interest. In a first step, we were interested in general effects in the entire sample, regardless of group and age. Bayesian one-sample t-tests against 0 were conducted on the average adaptation rate and the first trial response in order to detect general effects regarding gaze shift latency and pupillary responses.
Next, because the focus of the study is predictive and adaptive behavior, the dependent variables included the adaptation rates across trials for gaze shift latency and pupillary responses. Adaptation rate was operationalized as the slope of a linear regression within each participant with at least 4 valid trials (for an example, see Fig. 2, where the trial responses across trials for gaze shift latencies are shown, and the adaptation rate on a group level is represented by the regression line). There were no significant differences between block 1 and 2 in terms of trial means, or adaptation rates within blocks. Therefore, in order to include infants who had spuriously excluded trials in one of the blocks, the two blocks were averaged to increase the number of data points for the individual adaptation rates.
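The adaptation rate can be sketched in Python as follows. The sketch assumes that the two blocks are averaged trial by trial according to position within the block, using whichever block has a valid value when the other is missing; this detail is not stated in the text, and the function names are hypothetical.

```python
import numpy as np

def average_blocks(block1, block2):
    """Trial-wise mean of two blocks, ignoring NaN; NaN where both are missing."""
    stacked = np.vstack([block1, block2]).astype(float)
    counts = (~np.isnan(stacked)).sum(axis=0)
    sums = np.nansum(stacked, axis=0)
    return np.where(counts > 0, sums / np.maximum(counts, 1), np.nan)

def adaptation_rate(trial_numbers, values, min_trials=4):
    """Slope of an ordinary least-squares line of value against trial number.

    Returns NaN when fewer than min_trials valid trials are available.
    """
    t = np.asarray(trial_numbers, dtype=float)
    v = np.asarray(values, dtype=float)
    valid = ~np.isnan(v)
    if valid.sum() < min_trials:
        return float("nan")
    slope, _intercept = np.polyfit(t[valid], v[valid], 1)
    return float(slope)

# Hypothetical gaze shift latencies (ms) for the first five trials of each block.
block1 = [500.0, np.nan, 460.0, 440.0, np.nan]
block2 = [520.0, 480.0, np.nan, 420.0, np.nan]
averaged = average_blocks(block1, block2)
rate = adaptation_rate([1, 2, 3, 4, 5], averaged)
```

A negative slope in this example indicates that latencies shortened across trials.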
To investigate age differences, we calculated a similar linear regression slope across ages (10, 14, and 18 months) for each infant and measure, hereafter termed developmental change. The developmental change was calculated both for the first trial values and for adaptation rates.
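A corresponding sketch for developmental change is given below. It assumes that a slope is computed whenever at least two of the three time points are available, which is an assumption of this example rather than a criterion stated in the text; the function name developmental_change is hypothetical.

```python
import numpy as np

def developmental_change(values, ages=(10, 14, 18), min_points=2):
    """Least-squares slope of a measure against age in months (units per month).

    values     : one value per age, NaN where the assessment is missing
    min_points : assumed minimum number of available time points
    """
    a = np.asarray(ages, dtype=float)
    v = np.asarray(values, dtype=float)
    valid = ~np.isnan(v)
    if valid.sum() < min_points:
        return float("nan")
    slope, _intercept = np.polyfit(a[valid], v[valid], 1)
    return float(slope)

# Hypothetical first-trial latencies (ms); the 14-month value is missing.
change = developmental_change([400.0, np.nan, 360.0])
```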
The approach of testing regression slopes (adaptation rates and developmental change) instead of performing repeated measures analyses makes it possible to include participants who had missing data for one of the time points. Testing the slope between groups is similar to testing the interaction effect between age and group in a repeated measures ANOVA, but allows for missing data. An alternative would be to use Linear Mixed Models (LMMs), but because we use Bayesian statistics (see our motivation below) our approach provides a mathematically simpler solution than using Bayesian LMMs. It is not possible to test for “main effects” of group when comparing slopes, but testing the average values across ages provides equivalent results. Together, the first trial values (similar to an intercept), the slope (adaptation rate and developmental change), and the average across ages give thorough information about the collected data. For completeness, we also show data from specific ages.
In the results section we present all statistical tests using the subsections (1) first trial gaze shift latency, (2) adaptation rate of gaze shift latency, (3) first trial pupil response, and (4) adaptation rate of pupil response.
Statistical analysis
The data were analyzed using Bayesian t-tests and ANOVAs implemented in JASP (JASP Team 2019, Version 0.9.0.1). The support for our hypotheses is described by the Bayes factor (BF). The BF10 describes the ratio between the evidence for the hypothesis H1 relative to another hypothesis H0, where the latter typically is the null hypothesis. For example, when using a Bayesian t-test, BF10 = 5 denotes that the data are five times more likely under the hypothesis H1 (i.e. that there is an effect) than under H0 (i.e. no effect). Conversely, the BF01 (= 1/BF10) describes the likelihood of the data under H0 relative to H1. The strength of evidence is interpreted as follows: BF10 < 1, no evidence; BF10 = 1–3, anecdotal evidence for H1; BF10 = 3–10, moderate evidence for H1; BF10 = 10–30, strong evidence for H1; BF10 = 30–100, very strong evidence for H1; and BF10 > 100, extreme evidence for H1. The equivalent applies when reporting BF01 for H0.
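The interpretation scheme can be written out as a small Python helper. This is an illustrative sketch and not part of JASP; the function names are hypothetical, and because the listed ranges share their end points, the handling of exact boundary values (for example, BF10 = 3 counted as moderate) is an assumption of this example.

```python
def evidence_for_h1(bf10):
    """Verbal label for the strength of evidence for H1 given BF10."""
    if bf10 <= 0:
        raise ValueError("a Bayes factor must be positive")
    if bf10 < 1:
        return "no evidence"
    if bf10 < 3:
        return "anecdotal"
    if bf10 < 10:
        return "moderate"
    if bf10 < 30:
        return "strong"
    if bf10 <= 100:
        return "very strong"
    return "extreme"

def bf01_from_bf10(bf10):
    """BF01 is the reciprocal of BF10."""
    return 1.0 / bf10

label = evidence_for_h1(5.0)       # the BF10 = 5 example from the text
reciprocal = bf01_from_bf10(5.0)
```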
We chose to use Bayesian statistics instead of traditional frequentist t-tests and ANOVAs because Bayesian statistics can give the strength of evidence for the null hypothesis, which frequentist tests do not provide. In addition, using Bayesian statistics, we are provided with richer information on the difference of means and standard deviations, the influence of outliers, and power of the test.