Using Time Perception to Explore Implicit Sensitivity to Emotional Stimuli in Autism Spectrum Disorder

Jones, Catherine R. G.; Lambrechts, Anna; Gaigg, Sebastian B.

doi:10.1007/s10803-017-3120-6

Original Paper
Open access
Published: 20 April 2017

Volume 47, pages 2054–2066, (2017)

Journal of Autism and Developmental Disorders

Abstract

Establishing whether implicit responses to emotional cues are intact in autism spectrum disorder (ASD) is fundamental to ascertaining why their emotional understanding is compromised. We used a temporal bisection task to assess for responsiveness to face and wildlife images that varied in emotional salience. There were no significant differences between an adult ASD and comparison group, with both showing implicit overestimation of emotional stimuli. Further, there was no correlation between overestimation of emotional stimuli and autistic traits in undergraduate students. These data do not suggest a fundamental insensitivity to the arousing content of emotional images in ASD, or in individuals with a high degree of autistic traits. The findings have implications for understanding how emotional stimuli are processed in ASD.

Introduction

Difficulties in understanding and responding appropriately during social exchange are hallmarks of autism spectrum disorder (ASD). These difficulties have led to close scrutiny of the ability to process emotional cues, with a heavy emphasis on recognising emotion in the face (Uljarevic and Hamilton 2013). Investigation of facial emotion recognition in ASD has typically involved labelling faces expressing the six basic emotions (happiness, sadness, fear, anger, surprise, disgust). Against a background of mixed findings, recent meta-analyses have concluded that difficulties in facial emotion recognition are characteristic of ASD, although the severity of impairment varies according to emotion (Lozier et al. 2014; Uljarevic and Hamilton 2013). However, a challenge of any task that involves participants explicitly engaging with the process being measured is that they may use alternative strategies to ‘hack out’ the correct response. For instance, similar behavioural emotion recognition performance in participants with and without ASD is found alongside different patterns of neural activation (e.g. Rahko et al. 2012). A related issue is that uninterrupted time to decide a person’s emotional state does not recreate the demands of real-life social interactions. Therefore, it is arguable that explicit and relatively straightforward measures of emotion recognition provide only limited insight into the more complex realities of processing emotion in ASD.

One way of circumventing these issues is to measure emotion processing indirectly, for example in circumstances where implicit processing of the emotional content of stimuli will influence the response despite no explicit instruction to pay attention to emotion. A similar approach has been taken in characterising theory of mind in ASD, with explicit mentalising being ostensibly unimpaired while implicit and intuitive mentalising abilities are compromised (Senju et al. 2009). For emotion processing, an elegant paradigm that achieves this goal is an adapted version of the temporal bisection task (Droit-Volet et al. 2004). The temporal bisection task is a classic measure of interval timing that was originally used in animals (Wearden 1991). Within the last 25 years, the task has helped characterise the mechanistic structure and psychophysical hallmarks of human perceptual timing in the millisecond- and seconds-range (see Jones and Jahanshahi 2014 for a summary of related tasks). It has been argued that a brain-based internal clock (or clocks) governs this distinct type of timing process (see Buhusi and Meck 2005). The temporal bisection task requires participants to learn short and long standard durations (e.g. 400 and 1600 ms), typically presented as simple visual displays. During the testing phase, stimuli are presented for both the standard and intermediate durations and participants have to classify each as more similar to the short or long standards. The proportion of ‘long’ responses increases monotonically with stimulus duration and can be plotted as a psychophysical function, or bisection curve, with duration along the x axis. Various performance measures can be obtained from the psychophysical function, with the steepness of the slope indexing temporal sensitivity and the lateral displacement along the x axis indexing response bias (see Wearden 1991).
In typical populations, when face stimuli are used during the testing phase the duration of emotional faces is consistently overestimated compared to neutral faces, as demonstrated by a leftward displacement of the bisection curve (e.g. Droit-Volet et al. 2004; Effron et al. 2006; Fayolle and Droit-Volet 2014; Tipples 2008, 2011; Tipples et al. 2015). In contrast to response bias, temporal sensitivity is typically not affected (see Fig. 1 for a hypothetical illustration of typical findings). The intuitive explanation is that implicit recognition of the emotional content of the stimuli is driving the effect.

One explanation for the findings is that the internal clock that times the intervals is sensitive to the arousal induced by viewing emotional faces (see Cheng et al. 2016; Droit-Volet and Meck 2007). In essence, the internal clock is speeded by the increased levels of arousal, which means that more clock ‘ticks’ (temporal units) are accrued and the period of time is judged as longer. This explanation has been interpreted within the most common internal clock model, scalar expectancy theory (SET; Gibbon 1977; Gibbon et al. 1984). SET is an information processing model that conceives that time is processed within a clock system consisting of a pacemaker that emits pulses, which are passed via a switch to an accumulator that represents current elapsed time. Working memory and reference memory processes are used to store time values and a comparator, or decision making process, compares these values to enable a temporal judgement to be made. Despite criticisms of SET (e.g. Buhusi and Meck 2005; Droit-Volet and Meck 2007), the explanation of overestimation being driven by a speeded pacemaker and increased accumulation of temporal units aligns with evidence demonstrating that stimulant drugs lead to overestimation of duration (see Coull et al. 2011; Droit-Volet et al. 2013). This arousal-based explanation also fits the subjective phenomenon of time seeming to slow when in a highly arousing situation such as an accident. A range of evidence has indicated that emotional stimuli trigger activation of the sympathetic autonomic nervous system (e.g. Brouwer et al. 2013). Direct physiological evidence for increased arousal during the emotional temporal bisection task has remained unexplored, partly as the multiple short trials do not lend themselves to accommodating the refractory periods of physiological responses.
However, Gil and Droit-Volet (2012) found that emotional images that were subjectively judged as highly arousing produced greater overestimation than images with lower ratings. Mella et al. (2011) directly measured skin conductance response (SCR) during a duration and emotion discrimination paradigm. High arousing sounds were judged as longer and led to enhanced SCR when participants attended to the emotional intensity of the stimuli, although not all data were compatible with a simple relationship between time, arousal and emotion. Regardless of the physiological underpinnings, overestimation of emotional stimuli is a reliable finding that can be used as an indirect index that the emotional salience of the stimuli has been processed.
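The pacemaker-accumulator account described above can be illustrated with a toy simulation. The sketch below is not taken from the study: the baseline pacemaker rate, the noise level, the faster "aroused" rate and the arithmetic-mean decision criterion are all assumed values, chosen only to show how a speeded pacemaker could push responses towards 'long'.

```python
import random

BASE_RATE_HZ = 50.0   # assumed baseline pacemaker rate (hypothetical value)
NOISE_FRACTION = 0.15  # assumed scalar noise: SD grows in proportion to the mean


def simulate_p_long(duration_ms, pacemaker_hz, n_trials=2000, seed=0):
    """Proportion of 'long' responses for one probe duration in a toy clock model."""
    rng = random.Random(seed)
    # Remembered tick counts for the 400 ms and 1600 ms standards,
    # assumed to have been learned at the baseline pacemaker rate.
    short_ticks = BASE_RATE_HZ * 0.4
    long_ticks = BASE_RATE_HZ * 1.6
    # Assumed decision rule: respond 'long' when the accumulated count
    # exceeds the midpoint of the two remembered standards.
    criterion = (short_ticks + long_ticks) / 2
    n_long = 0
    for _ in range(n_trials):
        mean_ticks = pacemaker_hz * duration_ms / 1000.0
        ticks = rng.gauss(mean_ticks, NOISE_FRACTION * mean_ticks)
        if ticks > criterion:
            n_long += 1
    return n_long / n_trials


# Same 1000 ms probe timed with a baseline and a 10% faster pacemaker.
neutral = simulate_p_long(1000, pacemaker_hz=50.0)
aroused = simulate_p_long(1000, pacemaker_hz=55.0)
print(f"p(long) at 1000 ms: neutral={neutral:.2f}, aroused={aroused:.2f}")
```

Under these assumptions the faster pacemaker produces a higher proportion of 'long' responses at the same physical duration, which corresponds to a leftward shift of the bisection curve.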

The emotional temporal bisection task can therefore give insight into whether the implicit response to emotion is intact in ASD, which is important for illuminating the precise nature of the emotional processing difficulties experienced in ASD. As performance requires a timing judgement, the task also provides information on the accuracy of perceptual timing in ASD. Previous research has argued that there is a fundamental timing difficulty in ASD (Allman et al. 2011; Brodeur et al. 2014; Falter et al. 2012; Karaminis et al. 2016; Kargas et al. 2015; Maister and Plaisted-Grant 2011; Martin et al. 2010; Szelag et al. 2004). However, this is not a universal finding (Jones et al. 2009; Gil et al. 2012; Mostofsky et al. 2000; Wallace and Happé 2008) and the debate remains open. An important secondary aim of the study, therefore, is to add to the small but growing body of literature that considers whether interval timing poses difficulties for individuals with ASD.

As well as investigating the implicit responses to emotional faces in ASD, the studies reported below also examined responses to a set of wildlife images (e.g., spider) that were chosen to vary in emotional salience to a similar extent as the face stimuli. Emotion research in ASD often focuses on the human face, making it difficult to determine whether observed effects are face-specific or reflect a more general difficulty with emotion processing (see Gaigg 2012). Our study was piloted in a large population of typically developing (TD) young adults, reported in Study 1, in which the Autism Quotient (AQ: Baron-Cohen et al. 2001) was used to investigate if there was any meaningful association between task performance and self-reported autistic traits in the general population. Study 2 directly compared adults with ASD to a comparison group without a diagnosis. A reduced emotional temporal bisection effect in ASD would suggest atypical implicit responsiveness to emotional stimuli, whereas an intact emotional temporal bisection effect would indicate that this response, thought to be mediated by sub-cortical arousal mechanisms, is functioning typically.

Study 1: Temporal Bisection of Arousing Face and Wildlife Images in a Typical Adult Population

Method

Participants

Eighty-five undergraduate and postgraduate students (47 female; M = 22 years 7 months; SD = 4.94) from the University of Essex participated. There was no significant difference in age between male and female participants (see Table 1). None of the participants had a history of psychiatric or neurological disorder or illness. As one of the tasks included pictures of spiders, all participants were screened for arachnophobia. No participant had a diagnosis of ASD, or a family member with ASD. An additional four participants were tested but their data in both the face and wildlife conditions were discarded because their responses across the varying durations of the stimuli did not conform to a sigmoid curve (see below), which suggests that they did not follow the task instruction (i.e., the participants did not discriminate between shorter and longer durations). All participants gave informed consent and the study was approved by the Ethics Committee of the University of Essex.

Table 1 Summary of participants in Study 1

Materials and Procedure

Temporal Bisection Tasks

The tasks were programmed in E-Prime 2.0 (Schneider et al. 2002) and displayed on a PC. The experiment consisted of two versions of a temporal bisection task in which participants first learned to discriminate between a short and a long reference duration and then tried to categorise varying durations as more similar to the short or long exemplars. The training phase of each version consisted of 20 trials in which a monochrome grey rectangle (15 cm × 19.3 cm) appeared for either 400 or 1600 ms on a computer monitor. These durations served as the short and long reference durations. During the first 10 training trials the duration of the rectangle alternated, accompanied by a visual display of the appropriate label (‘short’ or ‘long’). For the final 10 training trials, the rectangle appeared randomly for either 400 or 1600 ms and participants were required to label the trial as either short or long by pressing appropriate response keys following the on-screen question, ‘Do you think this was SHORT or LONG?’. Throughout the task, the ‘N’ key (re-labelled ‘S’) on the computer keypad was used for a short response and the ‘M’ key (re-labelled ‘L’) was used for a long response. Participants used their preferred hand/fingers to respond.

Two versions of the bisection task were administered in counterbalanced order across participants. In the face version, photographs of four Caucasian male models (#23, 26, 27 and 36) were selected from the NimStim database (Tottenham et al. 2009), with each posing neutral, happy, angry and fearful expressions (i.e. 16 face stimuli). In the wildlife version, photographs of four different flowers, puppies, snarling canines/felines (snarl) and spiders were sourced from various websites (i.e. 16 wildlife stimuli). The wildlife stimuli were chosen because they were hedonically similar to the neutral, happy, angry and fearful facial expressions, respectively.

All experimental stimuli were converted to 24-bit grey-scale images and cropped to match the dimensions of the grey rectangle used for training. In each of the two versions of the task, the 16 stimuli (4 per hedonic category) were presented once each at 7 different durations (400, 600, 800, 1000, 1200, 1400 and 1600 ms) for a total of 112 trials. The order of presentation was pseudo-randomised with the constraint that no more than 2 successive trials could be of the same duration or hedonic category. Each trial began with a ‘READY’ screen that lasted randomly between 1800 and 2500 ms and was followed by the experimental stimulus at one of the pre-set durations. A blank interval lasting between 200 and 500 ms separated the stimulus from the response prompt, ‘Do you think this was SHORT or LONG?’. The prompt terminated with the participants’ response and was followed by another 200–500 ms blank interval before the ‘READY’ signal reappeared to mark the beginning of the next trial.
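The trial structure described above (16 stimuli × 7 durations = 112 trials, with no more than 2 successive trials sharing a duration or hedonic category) can be sketched as follows. This is an illustration only: the study was run in E-Prime, and the rejection-sampling approach, function names and category labels here are assumptions rather than the authors' procedure.

```python
import random

DURATIONS_MS = [400, 600, 800, 1000, 1200, 1400, 1600]
CATEGORIES = ["neutral", "happy", "angry", "fearful"]  # face version labels
ITEMS_PER_CATEGORY = 4


def build_trials():
    """Every stimulus (category, item index) shown once at every duration."""
    return [
        (category, item, duration)
        for category in CATEGORIES
        for item in range(ITEMS_PER_CATEGORY)
        for duration in DURATIONS_MS
    ]


def longest_run(values):
    """Length of the longest run of identical consecutive values."""
    longest = run = 1
    for previous, current in zip(values, values[1:]):
        run = run + 1 if current == previous else 1
        longest = max(longest, run)
    return longest


def is_valid(order, max_repeat=2):
    """True if neither category nor duration repeats more than max_repeat times in a row."""
    categories = [trial[0] for trial in order]
    durations = [trial[2] for trial in order]
    return longest_run(categories) <= max_repeat and longest_run(durations) <= max_repeat


def pseudo_randomise(trials, seed=None, max_attempts=100_000):
    """Reshuffle until the run-length constraint holds (simple rejection sampling)."""
    rng = random.Random(seed)
    order = list(trials)
    for _ in range(max_attempts):
        rng.shuffle(order)
        if is_valid(order):
            return order
    raise RuntimeError("no valid order found within max_attempts")


order = pseudo_randomise(build_trials(), seed=1)
print(len(order), order[:3])
```

Rejection sampling discards most shuffles for a constraint this strict, but it keeps the logic transparent; a constructive algorithm would be faster.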

To establish that the images produced the predicted subjective feelings of arousal, the participants were required to rate each image for valence and arousal using the Self-Assessment Manikin (SAM: Lang 1980). The SAM uses cartoon images to represent 9-point scales of arousal and valence. The images (16 face and 16 wildlife) were presented in a random order and at a self-paced rate immediately after completion of the temporal bisection tasks. Fifty-four participants completed this stage of the study.

The AQ questionnaire (Baron-Cohen et al. 2001) was used to measure self-reported autistic traits (range available = 0–50). This was administered at the end of the testing session.

Analysis of the Temporal Bisection Data

The proportion of long responses (p(long)) for each category and stimulus duration was calculated (i.e. proportion of long responses out of 4). In addition, we fitted participants’ responses to a cumulative Gaussian sigmoid using the psignifit MATLAB Toolbox (Wichmann and Hill 2001a, b) and extracted the bisection points of the resulting response curves for each individual. The bisection point is the point of subjective equality, i.e. the duration at which short and long responses occur with equal probability. It reflects accuracy in relation to the veridical middle point, with lateral displacement indicating response bias towards either short or long responses. It can be measured as the x-axis value at which sigmoid functions cross the 50% midpoint of the y-axis (p(long) = 0.5). For the current experiment this would be expected to be close to 1000 ms (i.e. half way between the shortest and longest durations). Following similar principles the Weber ratio can be calculated, which reflects the slope of the sigmoid curve and serves as an index of temporal sensitivity. It is half the difference between the upper difference limen (p(long) = 0.75) and the lower difference limen (p(long) = 0.25) divided by the bisection point. A lower score indicates greater temporal sensitivity, reflected in a steeper slope. Because the p(long) and bisection point data both enable inspection of response bias, and to reduce the amount of analysis reported, our main analysis focuses on the p(long) and Weber ratio. The bisection data are provided in tables for interest, presented alongside the Weber ratio data, and provide the index of temporal overestimation that is correlated with the AQ.
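The definitions of the bisection point and Weber ratio given above can be made concrete with a short sketch. The study used the psignifit MATLAB Toolbox; the code below instead uses a coarse grid-search maximum-likelihood fit with the Python standard library, so the fitting routine, grid ranges and example response counts are assumptions for illustration only.

```python
from math import log
from statistics import NormalDist

DURATIONS_MS = [400, 600, 800, 1000, 1200, 1400, 1600]


def neg_log_likelihood(mu, sigma, n_long, n_trials):
    """Binomial negative log-likelihood of the counts under a cumulative Gaussian."""
    curve = NormalDist(mu, sigma)
    total = 0.0
    for duration, k in zip(DURATIONS_MS, n_long):
        # Clip probabilities so log() stays finite at the extremes.
        p = min(max(curve.cdf(duration), 1e-6), 1 - 1e-6)
        total -= k * log(p) + (n_trials - k) * log(1 - p)
    return total


def fit_cumulative_gaussian(n_long, n_trials=4):
    """Coarse grid search over (mu, sigma) in ms; a stand-in for psignifit."""
    candidates = (
        (mu, sigma)
        for mu in range(400, 1601, 10)
        for sigma in range(50, 801, 10)
    )
    return min(candidates, key=lambda ms: neg_log_likelihood(ms[0], ms[1], n_long, n_trials))


def bisection_point_and_weber_ratio(mu, sigma):
    """Bisection point at p(long) = 0.5; Weber ratio from the 0.25 and 0.75 limens."""
    curve = NormalDist(mu, sigma)
    bisection_point = curve.inv_cdf(0.5)
    upper_limen = curve.inv_cdf(0.75)
    lower_limen = curve.inv_cdf(0.25)
    weber_ratio = ((upper_limen - lower_limen) / 2) / bisection_point
    return bisection_point, weber_ratio


# Hypothetical counts of 'long' responses (out of 4) at each duration.
example_counts = [0, 0, 1, 2, 3, 4, 4]
mu, sigma = fit_cumulative_gaussian(example_counts)
bp, wr = bisection_point_and_weber_ratio(mu, sigma)
print(f"bisection point = {bp:.0f} ms, Weber ratio = {wr:.3f}")
```

For a cumulative Gaussian the bisection point equals the fitted mean and the Weber ratio reduces to roughly 0.674 × σ / μ, so a steeper curve (smaller σ) gives a lower Weber ratio.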

To identify participants who may not have followed the instructions and therefore performed no better than chance, we applied a best-fit computation to compare the quality of the response curve within two different models. Using MATLAB, the response curve produced for each participant for each stimulus type was analysed to establish if a sigmoid curve (two-parameter fit) or a horizontal line (one-parameter fit, indicating no differentiation in performance by image duration) was a significantly better fit. If a sigmoid function did not best-fit a participant’s data for any one of the images in a given condition (face or wildlife) then the participant was excluded from that condition.
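The text does not specify how the two models were compared, so the following is only one plausible way to implement the idea. It assumes a binomial likelihood, a grid-search fit for the two-parameter sigmoid, and an approximate likelihood-ratio criterion (chi-square with 1 degree of freedom, critical value 3.841 at α = .05); none of these choices is confirmed by the source.

```python
from math import log
from statistics import NormalDist

DURATIONS_MS = [400, 600, 800, 1000, 1200, 1400, 1600]
CHI2_CRITICAL_DF1 = 3.841  # assumed criterion: chi-square, 1 df, alpha = .05


def binomial_nll(probabilities, n_long, n_trials):
    """Negative log-likelihood of 'long' counts given predicted probabilities."""
    total = 0.0
    for p, k in zip(probabilities, n_long):
        p = min(max(p, 1e-6), 1 - 1e-6)  # keep log() finite
        total -= k * log(p) + (n_trials - k) * log(1 - p)
    return total


def best_sigmoid_nll(n_long, n_trials):
    """Two-parameter model: cumulative Gaussian, fitted by coarse grid search."""
    best = float("inf")
    for mu in range(400, 1601, 20):
        for sigma in range(50, 801, 25):
            curve = NormalDist(mu, sigma)
            predicted = [curve.cdf(d) for d in DURATIONS_MS]
            best = min(best, binomial_nll(predicted, n_long, n_trials))
    return best


def flat_line_nll(n_long, n_trials):
    """One-parameter model: the same p(long) at every duration."""
    p = sum(n_long) / (n_trials * len(n_long))
    return binomial_nll([p] * len(n_long), n_long, n_trials)


def sigmoid_fits_better(n_long, n_trials=4):
    """True if the sigmoid improves on the flat line by more than the criterion."""
    improvement = 2 * (flat_line_nll(n_long, n_trials) - best_sigmoid_nll(n_long, n_trials))
    return improvement > CHI2_CRITICAL_DF1


print(sigmoid_fits_better([0, 0, 1, 2, 3, 4, 4]))  # duration-sensitive responder
print(sigmoid_fits_better([2, 2, 2, 2, 2, 2, 2]))  # responder ignoring duration
```

A participant whose 'long' responses rise with duration is retained under this rule, whereas one who responds at the same rate regardless of duration is flagged for exclusion.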

Analysis of the data was conducted in SPSS (IBM Corp, Version 20.0) using repeated measures analysis of variance (ANOVA). Hypothesis significance testing was supplemented by the calculation of effect size as well as the 90% confidence intervals for the effect sizes (Lakens 2013; Steiger 2004). Effect sizes were calculated using partial eta squared (ηp²), where a small effect is considered 0.01, a medium effect 0.06 and a large effect 0.14.
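Partial eta squared can be recovered from a reported F statistic and its degrees of freedom using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to F values reported in the Results; the function names and the "negligible" label for values under 0.01 are assumptions added for illustration, and the confidence intervals (which require the noncentral F distribution) are not reproduced.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def effect_size_label(eta_p2):
    """Benchmarks used in the text: 0.01 small, 0.06 medium, 0.14 large."""
    if eta_p2 >= 0.14:
        return "large"
    if eta_p2 >= 0.06:
        return "medium"
    if eta_p2 >= 0.01:
        return "small"
    return "negligible"  # label assumed here; the text gives no name for this range


# Main effect of Emotion on p(long) for faces in Study 1: F(3, 246) = 6.16
emotion_faces = partial_eta_squared(6.16, 3, 246)
print(round(emotion_faces, 2), effect_size_label(emotion_faces))

# Fearful versus neutral faces: F(1, 82) = 13.70
fearful_contrast = partial_eta_squared(13.70, 1, 82)
print(round(fearful_contrast, 2), effect_size_label(fearful_contrast))
```

Both values round to the effect sizes reported in the Results (0.07 and 0.14), which provides a quick consistency check on the reported statistics.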

Results

One participant’s face data was lost for technical reasons and another participant did not produce a best-fit sigmoid-shaped curve (therefore, face n = 83). For the wildlife images, one participant did not complete the task due to self-reported arachnophobia and a further three participants did not produce a best-fit sigmoid-shaped curve (therefore, wildlife n = 81).

Autism Spectrum Quotient

Scores ranged from 2 to 29, with no individuals scoring above the cut-off (32) for clinical significance (Baron-Cohen et al. 2001). The mean score for the group was 15.0 (SD = 5.5) with a median of 15.0. There was no significant difference between the scores for males and females (see Table 1).

Subjective Valence and Arousal Ratings for the Face and Wildlife Data

Two repeated measures ANOVAs, run separately on the face and wildlife data (see Table 2), both showed a main effect of emotion on the arousal ratings (Face: F(3, 159) = 15.29, p < .001, ηp² = 0.22, 90% CI [.13, .30]; Wildlife: F(3, 159) = 17.86, p < .001, ηp² = 0.25, 90% CI [.15, .33]). Planned simple contrasts indicated emotional faces were rated as significantly more arousing than the neutral face (all p < .001, ηp² range 0.24–0.38) and emotional wildlife images were significantly more arousing than the neutral flower (all p < .001, ηp² range 0.32–0.41). For the valence ratings (see Table 2), repeated measures ANOVAs again showed main effects of emotion (Face: F(3, 159) = 69.80, p < .001, ηp² = 0.57, 90% CI [.48, .63]; Wildlife: F(3, 159) = 114.20, p < .001, ηp² = 0.68, 90% CI [.61, .73]). For the face stimuli, neutral faces were rated as more positive than angry and fearful faces and less positive than happy faces (all p < .01, ηp² range 0.11–0.69). Similarly, for the wildlife images, flower images were rated as more positive than spider and snarl images but less positive than puppy images (all p < .001, ηp² range 0.28–0.66).

Table 2 Average valence and arousal ratings for the face and wildlife stimuli. Score range was 1–9, with a high score indicating the image produced more positive valence and greater arousal

Temporal Bisection: Face

A 4(Emotion) × 7(Duration) ANOVA of p(long) showed a main effect of Duration (F(3.04, 249.03) = 995.83, p < .001, ηp² = 0.92, 90% CI [.91, .93]), confirming that long responses increased with stimulus duration, and Emotion (F(3, 246) = 6.16, p < .001, ηp² = 0.07, 90% CI [.02, .12]) (see Fig. 2). Planned simple contrasts indicated that fearful (F(1, 82) = 13.70, p = .001, ηp² = 0.14, 90% CI [.05, .26]) and happy (F(1, 82) = 13.05, p = .001, ηp² = 0.14, 90% CI [.04, .25]) faces elicited a significantly higher proportion of long responses than neutral faces. There was no difference between angry and neutral faces (p > .3, ηp² = 0.01, 90% CI [.00, .07]). The interaction of Emotion and Duration was not significant (p > .2, ηp² = 0.01, 90% CI [.00, .02]).

Parallel analyses of the Weber ratio (see Table 3) did not produce significant findings (p > .7, ηp² = 0.006, 90% CI [.00, .02]), indicating that accuracy but not sensitivity was affected by emotional faces.

Table 3 Average bisection points (ms) and Weber ratio for all stimulus categories for the face and wildlife images in Study 1

Temporal Bisection: Wildlife

A 4(Emotion) × 7(Duration) ANOVA of p(long) showed a main effect of Duration (F(2.86, 228.79) = 958.65, p < .001, ηp² = 0.92, 90% CI [.91, .93]) and Emotion (F(3, 240) = 5.66, p = .001, ηp² = 0.07, 90% CI [.02, .11]) (see Fig. 3). Planned simple contrasts showed that both snarl (F(1, 80) = 17.92, p < .001, ηp² = 0.18, 90% CI [.07, .30]) and puppy images (F(1, 80) = 7.16, p = .009, ηp² = 0.08, 90% CI [.01, .19]) elicited a significantly higher proportion of long responses than flower images. There was no difference between spider and flower images (p > .1, ηp² = 0.02, 90% CI [.00, .10]). The interaction of Emotion and Duration was also significant (F(10.16, 812.64) = 3.49, p < .001, ηp² = 0.04, 90% CI [.90, .92]).

There were no significant effects for a parallel analysis of the Weber ratio (p > .1, ηp² = 0.02, 90% CI [.00, .05]) (see Table 3), again showing that response bias but not sensitivity was moderated by the emotional salience of the stimuli.

Association Between Temporal Bisection, Weber Ratio and AQ

For both the temporal bisection points and Weber ratio scores, each emotional image was subtracted from the neutral image to give an index of relative overestimation and of temporal sensitivity, respectively, for each emotional image. This was done for both the face and wildlife images, resulting in six correlations (three temporal bisection point; three Weber ratio) for each category of image. There was no evidence of a substantive or significant correlation with AQ for any measure (Pearson’s r range: −0.16 to 0.08).
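The difference-score and correlation procedure described above can be sketched as follows. All numbers in the example are invented purely to show the computation and are not data from the study; the function names are likewise hypothetical.

```python
from math import sqrt


def overestimation_index(neutral_bp_ms, emotional_bp_ms):
    """Neutral minus emotional bisection point; positive values mean the
    emotional image was judged as lasting longer (leftward curve shift)."""
    return neutral_bp_ms - emotional_bp_ms


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance_x = sum((x - mean_x) ** 2 for x in xs)
    variance_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(variance_x * variance_y)


# Invented values for five hypothetical participants (not study data).
neutral_bps = [1010, 980, 1050, 995, 1020]
fearful_bps = [960, 975, 990, 1000, 940]
aq_scores = [12, 18, 9, 22, 15]

indices = [overestimation_index(n, f) for n, f in zip(neutral_bps, fearful_bps)]
print(indices, round(pearson_r(indices, aq_scores), 2))
```

The same steps would be repeated for each emotional category and for the Weber ratio, giving the six correlations per image type described in the text.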