Cross-modal attentional entrainment: Insights from magicians

Barnhart, Anthony S.; Ehlert, Mandy J.; Goldinger, Stephen D.; Mackey, Alison D.

doi:10.3758/s13414-018-1497-8

Published: 08 March 2018

Volume 80, pages 1240–1249, (2018)

Attention, Perception, & Psychophysics

Abstract

Recently, performance magic has become a source of insight into the processes underlying awareness. Magicians have highlighted a set of variables that can create moments of visual attentional suppression, which they call “off-beats.” One of these variables is akin to the phenomenon psychologists know as attentional entrainment. The current experiments, inspired by performance magic, explore the extent to which entrainment can occur across sensory modalities. Across two experiments using a difficult dot probe detection task, we find that the mere presence of an auditory rhythm can bias when visual attention is deployed, speeding responses to stimuli appearing in phase with the rhythm. However, the extent of this cross-modal influence is moderated by factors such as the speed of the entrainers and whether their frequency is increasing or decreasing. In Experiment 1, entrainment occurred for rhythms presented at .67 Hz, but not at 1.5 Hz. In Experiment 2, entrainment only occurred for rhythms that were slowing from 1.5 Hz to .67 Hz, not speeding. The results of these experiments challenge current models of temporal attention.

Introduction

Upon the magician’s open palm sits a coin. He taps the coin once, twice, but by the third tap the small metal treasure has seemingly disappeared. The method is simple, but the effect is impressive: Before the magic wand strikes it a third time, the coin is thrown from the open palm into the hand holding the wand (Kaufman, 1989; see Supplementary Materials for video). By virtue of entraining the audience’s attention to the rhythmic tapping, the sleight (which occurs during the attentional “trough” between the second and third beats) goes unnoticed.

In order to develop techniques for deceiving the senses, magicians must have hypotheses about the processes underlying perception. Exploration of these hypotheses has shown promise as a means of advancing the laboratory study of attention and perception (Ekroll & Wagemans, 2016; Quian Quiroga, 2016; Rensink & Kuhn, 2015). Thus far, the most fruitful collaborations between magicians and scientists have been in the domains of spatial attention and inattentional blindness (Barnhart & Goldinger, 2014; Kuhn & Martinez, 2012; Kuhn & Teszka, 2015). Here, we argue that the study of attentional deployment in time provides an ideal springboard for the collaboration between magicians and cognitive scientists. The current experiments, while not directly testing techniques from magical performance, explore ideas that underlie their tactics.

Both magicians and cognitive scientists use the analogy of an attentional spotlight; however, this has led to a conceptualization of attention that is biased toward the visuo-spatial domain at the expense of temporal dimensions (Fernandez-Duque & Johnson, 1999; Levin & Saylor, 2008; Nobre & van Ede, 2017). On some level, magicians are aware that attention can be influenced by variables outside of the visuo-spatial domain. They regularly teach that sleight of hand should occur on the “off-beat,” a moment of attentional suppression, to evade detection (Kurtz, 1998; Lamont & Wiseman, 1999). Use of the term “off-beat” implies (1) that attention fluctuates over time, and (2) that its waxing and waning follows a regular time course, like the beats of a metronome.

One variable that magicians employ to create an off-beat is the instantiation of a rhythm to focus attention at predictable points in time while presumably relaxing attention at moments between beats, the strategy used in the vanishing coin trick (Lamont & Wiseman, 1999). The application of rhythmicity to influence attention is often conflated with other factors, in practice. For example, in a classic treatise on the psychology of magic, Dessoir (1893) noted that,

If we count ‘One! two! three!’ before the disappearance of an object, then the actual disappearance must take place before and not just at the ‘three’; for while the attention of the audience is fixed upon ‘three’ anything taking place at ‘one’ or ‘two’ entirely escapes it. (p. 3618)

This example (and the vanishing coin trick) seems to rely on multiple features of temporal expectation (Nobre & van Ede, 2017). While potentially exploiting rhythmicity, it also clearly relies on a strong association of events happening “on three.” Although explicit in the coin trick, implementation of rhythmic misdirection may not always be intentional on the part of the magician. In many cases, it may be a natural effect of using music or rhythmic patter to accompany the performance of magic, and magicians may unwittingly take advantage of the rhythms that are already present during performance.

While this intuition does not fit comfortably into many popular models of attention (Posner & Rothbart, 2007), it is in line with modern dynamic models that tend to focus on temporal over spatial aspects of attention (Large & Jones, 1999; Olivers & Meeter, 2008). The most notable of these is the Dynamic Attending Model (Large & Jones, 1999), which proposes that internal oscillations (or attending rhythms) can be influenced by rhythms ex vivo, such that the attending rhythms entrain to external sources, optimizing attentional resources in anticipation of future events. Attending rhythms are conceptualized as self-sustaining biological oscillations wherein a brief pulse of energy (generated from the external rhythm) can cause a phase shift, aligning one point in the oscillator’s limit cycle with the recurring environmental stimulus. In behavioral terms, the model suggests that attention is deployed as a series of “pulses” over time, with perceptual readiness tracking these pulses.

Laboratory examinations of attentional entrainment have produced results that support the dynamic attending model. Using a metacontrast masking procedure, Mathewson et al. (2010) found that detection rates for subtle visual targets increased when the targets were presented in phase with a visual, rhythmic entrainer. The behavioral outcome reported by Mathewson et al. has been observed repeatedly across both visual and auditory modalities (Hickok, Farahbod, & Saberi, 2015; Jones, Moynihan, MacKenzie, & Puente, 2002; Landau & Fries, 2012; Lawrance, Harper, Cooke, & Schnupp, 2014; Rohenkohl, Cravo, Wyart, & Nobre, 2012). Attention aligns to environmental rhythms as a means of optimizing perception of future events. While the transient deployment of attention in time can enhance stimulus processing, it also comes with a cost. Stimuli appearing at unpredictable time points (such as the tossing of the coin) are less apt to reach awareness.

Although the effects of attentional entrainment within modalities are well known, comparatively little research has assessed cross-modal entrainment, the anecdotal mechanisms that magicians exploit. The frequent covariation of visual and auditory rhythms in the environment should naturally lead to conditions of cross-modal entrainment, as the signal in one modality is highly predictive of the other (Jack & Thurlow, 1973; MacDonald & McGurk, 1978). Indeed, Escoffier, Sheng, and Schirmer (2010) found that participants were faster to make judgments about images that were presented with synchronous auditory rhythms, relative to asynchronous rhythms or silence, suggesting that the mere presence of auditory rhythms can entrain visual attention. Similarly, Miller, Carlson, and McAuley (2013) observed faster fixation times to dot probes aligned to a rhythm, relative to temporally misaligned probes. More recently, Jones (2015) explored cross-modal entrainment using a task with both spatial and temporal cues. Response times to report the location of a spatial target were independently influenced by spatial cueing and temporal cueing. Regardless of whether targets appeared in the cued location, detection was faster when they aligned with the period of a rhythmic cue preceding onset.

The foregoing cross-modal entrainment experiments all employed stimuli that easily captured attention. Thus, they were unable to examine differences in sensitivity to stimulation. As a consequence, it becomes difficult to assess whether entrainment facilitated stimulus detection or simply the execution of a motor response. Under conditions where visual information is noisy, entrainment should facilitate signal detection (not just preparedness for action) to stimuli appearing in phase with the rhythm. The experiments reported here were designed to assess whether entrainment to regular auditory rhythms leads to concurrent optimization of visual attention at coinciding time points. Furthermore, we assessed whether attention toward the rhythmic stimulus is necessary for entrainment effects to occur. Previous entrainment experiments in a single modality have shown that performance in time can be biased by the mere presence of entraining stimuli (Mathewson, Fabiani, Gratton, Beck, & Lleras, 2010). However, this previous work could not manipulate whether participants were attending to the rhythm because it was inextricably linked to stimuli in the primary task (but see Kizuk & Mathewson, 2017).

In the current experiments, entrainment was examined within simple auditory and visual stimulus monitoring tasks wherein the presentation of a subtle visual stimulus was either aligned or misaligned in time with the regularly occurring rhythm of an auditory stimulus stream. If entrainment operates across sensory modalities, visual perception should be more sensitive in moments when an auditory stimulus onset is expected than in the “off-beats” between auditory stimulus presentations.

Experiment 1: Cross-modal entrainment to auditory rhythms

Experiment 1 examined the effect of cross-modal entrainment on the detection of a subtle stimulus. We actively manipulated (between subjects) whether participants needed to attend to the auditory stream: Participants in the Attend Audio condition had to monitor for an oddball tone, while also reporting dot probes. We expected that, when a rhythm was available to one modality, attention would automatically entrain to that signal (regardless of attentional set) and would facilitate the detection of visual stimuli falling on the beat. This prediction follows from the observed tendency for oscillatory mechanisms in the brain to phase-lock across cortical regions (Lakatos, Karmos, Mehta, Ulbert, & Schroeder, 2008).

Method

Participants

Participants were 111 students recruited from Introductory Psychology courses at Arizona State University, all with normal or corrected-to-normal vision. There were 52 participants in the Attend Audio condition and 59 in the Ignore Audio condition. All volunteered for partial course credit. With a similar experimental design, Miller et al. (2013) observed a large entrainment effect (Cohen’s d = 1.6) with 20 participants, suggesting that the current experiment is adequately powered.

Materials and stimuli

Experiments were programmed using E-Prime 1.2 (Schneider, Eschman, & Zuccolotto, 2002) and data were collected on Gateway computers. Visual stimuli were presented on 16-in. flat-screen CRT monitors with refresh rates at 60 Hz. Responses were collected using PST serial response boxes. Auditory stimuli were delivered via Sennheiser HD280 headphones.

Auditory stimuli consisted of streams of 150-ms tones at 750 or 900 Hz. Although tone files did not ramp up/down, there was no perceivable clicking artifact in the stimuli. In half the trials, 750 Hz tones were used as entraining stimuli; 900 Hz tones were used in the other half. In trials with 750 Hz entrainers, the 900 Hz tones were oddball stimuli, and vice versa. Entraining tones were presented at one of two rates, manipulated within-subjects. On fast trials, tones were presented every 650 ms (roughly 1.5 Hz). On slow trials, tones were presented every 1,500 ms (.67 Hz). Visual stimuli consisted of three background images created using Adobe Photoshop (see Fig. 1). The images were generated as 1,024 × 768 pixels to fill the computer screen. In each image, the color value for every pixel was selected randomly, creating a field of visual noise. Six dot probe stimuli were created in a similar fashion. Each dot probe was a 30 × 30 pixel square (roughly 3° of visual angle), generated using the same random pixel color procedure as the background images. Then, a yellow field with 95 % transparency was overlaid upon the probe so that it could be discriminated from the background noise. Background and dot probe stimuli were randomly sampled from this pool on every trial.
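The construction of the dot probes can be illustrated with a minimal sketch. It assumes that the yellow field was combined with the noise by ordinary alpha blending at 5 % opacity (i.e., 95 % transparency); the original stimuli were made in Adobe Photoshop, whose exact blending settings are not reported, and all function names here are hypothetical.

```python
import random

YELLOW = (255, 255, 0)  # assumed RGB value for the yellow field


def noise_patch(width, height, rng):
    """Return a height x width grid of pixels with randomly selected RGB values."""
    return [
        [(rng.randrange(256), rng.randrange(256), rng.randrange(256))
         for _ in range(width)]
        for _ in range(height)
    ]


def overlay(patch, color, opacity):
    """Alpha-blend a uniform color field over every pixel of the patch.

    opacity = 1 - transparency, so 95 % transparency corresponds to 0.05.
    Plain alpha blending is an assumption, not the documented procedure.
    """
    return [
        [tuple(round((1 - opacity) * c + opacity * k) for c, k in zip(px, color))
         for px in row]
        for row in patch
    ]


rng = random.Random(0)               # fixed seed so the sketch is reproducible
background = noise_patch(30, 30, rng)  # same random procedure as the backgrounds
probe = overlay(background, YELLOW, opacity=0.05)
```

Under this assumption the overlay shifts each pixel only slightly toward yellow, which is in keeping with the intention that probes be subtle and difficult to detect.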

Procedure

All procedures were approved by the Arizona State University Institutional Review Board. After obtaining informed consent, participants completed six practice trials (half fast, half slow) followed by 108 experimental trials. In each trial, participants heard a stream of auditory tones while they monitored a visual field of colored noise for the onset of a transient dot probe. Participants in the Attend Audio condition actively monitored the auditory stream to detect an oddball stimulus. Participants in the Ignore Audio condition heard the same auditory sequences, but were not directed to monitor for oddballs. Participants pressed the right-most button on the response box to report detection of dot probes, and those in the Attend Audio condition pressed the left-most response box button upon detecting auditory oddballs.

Each trial lasted 19.5 s (13 tones at the slow rate; 30 tones at the fast rate), but trials were blocked into 36 groups of three (each block at the same entrainment rate) with no explicit boundaries between trials. Thus, participants perceived each trial as lasting 58.5 s. Within each block they encountered one auditory oddball and three visual dot-probes (one per trial). The position of the auditory oddball trial in each block (trial 1, trial 2, or trial 3) was randomized across blocks. Within the auditory oddball trials, the dissimilar tone could appear at one of two positions within the stream, following the first third or preceding the final third of the entraining tones (also selected randomly).
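The timing figures above follow directly from the two inter-onset intervals. The short sketch below reproduces the arithmetic; the constant and function names are illustrative and do not come from the original E-Prime scripts.

```python
# Values taken from the Method section; names are hypothetical.
TRIAL_DURATION_MS = 19_500
INTERVALS_MS = {"fast": 650, "slow": 1_500}
TRIALS_PER_BLOCK = 3
N_BLOCKS = 36


def rate_hz(interval_ms):
    """Presentation rate implied by an inter-onset interval."""
    return 1000 / interval_ms


def tones_per_trial(interval_ms):
    """Number of tone onsets that fit within one trial."""
    return TRIAL_DURATION_MS // interval_ms


def summary():
    """Collect the derived timing values for both entrainment rates."""
    return {
        "fast_hz": round(rate_hz(INTERVALS_MS["fast"]), 2),
        "slow_hz": round(rate_hz(INTERVALS_MS["slow"]), 2),
        "fast_tones": tones_per_trial(INTERVALS_MS["fast"]),
        "slow_tones": tones_per_trial(INTERVALS_MS["slow"]),
        "total_trials": N_BLOCKS * TRIALS_PER_BLOCK,
        "block_duration_s": TRIALS_PER_BLOCK * TRIAL_DURATION_MS / 1000,
    }
```

Calling `summary()` gives 30 tones per fast trial, 13 per slow trial, 108 trials in total, and a block duration of 58.5 s, matching the values reported in the text. The fast rate works out to about 1.54 Hz, which the text rounds to 1.5 Hz.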

The primary visual attention task was adapted from Klein (1988). On each trial, dot probes appeared overlaid on the background of colored noise in one of nine randomly-selected positions in a 3 × 3 grid measuring 624 × 442 pixels, with a random amount of jitter (up to ±50 pixels) added about the X and Y axes. Dot probes appeared at one of three temporal positions relative to the entraining tones in each trial (counterbalanced across trials): following the first quarter of entraining tones, at the midpoint of the auditory stream, or before the final quarter of entraining tones. Within each block of three trials, the onset of the dot probe was temporally aligned with the onset of an entraining tone on one trial, offset by 25 % of the entraining period (the interval between tone onsets) on one trial, and offset by 50 % of the entraining period on one trial (with the order randomized across blocks). Dot probes disappeared 500 ms after their onset regardless of whether participants responded with a button-press, and only one dot probe response was accepted on each trial. No other variables were manipulated or measured.
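To make the three phase conditions concrete, the sketch below converts a phase offset into a probe onset time. It assumes that the first tone begins at time zero and that off-beat probes lag (rather than lead) the tone they are tied to; neither detail is stated explicitly in the text, and the function name is hypothetical.

```python
def probe_onset_ms(tone_index, interval_ms, phase_offset):
    """Probe onset relative to the start of the trial, in milliseconds.

    tone_index   -- zero-based index of the entraining tone the probe is tied to
    interval_ms  -- interval between tone onsets (650 or 1,500 ms in Experiment 1)
    phase_offset -- fraction of the interval by which the probe follows the
                    tone onset (0, 0.25, or 0.5); lagging is an assumption
    """
    if not 0 <= phase_offset < 1:
        raise ValueError("phase_offset must be in [0, 1)")
    return (tone_index + phase_offset) * interval_ms


# Offsets relative to a tone onset for each rate and phase condition.
offsets = {
    rate: [probe_onset_ms(0, interval, phase) for phase in (0, 0.25, 0.5)]
    for rate, interval in {"slow": 1500, "fast": 650}.items()
}
```

With these assumptions, off-beat probes follow a tone onset by 375 or 750 ms at the slow rate and by 162.5 or 325 ms at the fast rate, so the same proportional offset corresponds to very different absolute delays across the two rates.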

Results

Eleven participants were excluded from analyses (eight from the Attend Audio condition; three from the Ignore Audio condition). Six were excluded from Attend Audio for average rates of oddball detection >2.5 standard deviations below the group mean. The remaining five participants were excluded for detecting dot probes at rates >2.5 standard deviations below their group means. Responses falling outside a 1,500 ms window following dot probe onset were classified as erroneous. This criterion led to the exclusion of 31.5 % of all trials from reaction time (RT) analyses (which included false-alarms occurring prior to the dot-probe onset). High error rates in this experiment (and the following experiments) precluded analysis via repeated-measures ANOVA, as many participants had at least one empty cell, and thus would be excluded by list-wise deletion. Consequently, all analyses reported were carried out through linear mixed-effects modeling (LMM), which regresses over missing values while also accounting for variance that arises from individual differences (Baayen, Davidson, & Bates, 2008). All analyses were carried out using R software (R Core Team, 2017) running the lme4 package (Bates, Maechler, Bolker, & Walker, 2015).
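The two exclusion rules can be expressed compactly. The sketch below is illustrative only: it assumes the sample standard deviation was used (the text does not say which estimator was applied), it codes a false alarm before probe onset as a negative RT, and the function names and example data are hypothetical. The analyses themselves were run in R.

```python
from statistics import mean, stdev


def flag_low_performers(rates, cutoff_sd=2.5):
    """Return IDs whose detection rate is more than cutoff_sd SDs below the group mean.

    rates maps participant ID -> proportion of targets detected.
    Uses the sample standard deviation, which is an assumption.
    """
    values = list(rates.values())
    threshold = mean(values) - cutoff_sd * stdev(values)
    return [pid for pid, rate in rates.items() if rate < threshold]


def is_valid_rt(rt_ms, window_ms=1500):
    """True if a response fell within the window following dot probe onset.

    rt_ms is measured from probe onset; None means no response, and a
    negative value means a false alarm before the probe appeared.
    """
    return rt_ms is not None and 0 <= rt_ms <= window_ms


# Hypothetical group: 19 participants at .90 and one at .20.
example_rates = {f"p{i}": 0.9 for i in range(19)}
example_rates["low"] = 0.2
flagged = flag_low_performers(example_rates)
```

In this made-up group only the participant detecting 20 % of targets falls below the cutoff. Note that the rule is applied within each group, so the Attend Audio and Ignore Audio conditions would each be passed to the function separately.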

Reaction times

RTs from trials with accurate responses were log-transformed to counteract non-normality and conform with the assumptions of LMM. RTs were analyzed with Subject as a random effect and fixed effects of Audio Condition (attend audio, ignore audio), Tone Frequency (slow, fast), and Dot Probe Phase (on beat, off 25 %, off 50 %; dummy coded). The full model revealed a main effect of Audio Condition (β=-.05, SE=.02, t=-2.89, p=.004): RTs were significantly faster in the Ignore Audio condition than the Attend Audio condition. However, this factor did not interact with any others, and thus the model was simplified to exclude this factor. Figure 2 depicts untransformed RTs, including the partition by Audio Condition, for the sake of comparison. In the simplified model (see Table 1) RTs were significantly slower for dot probes falling 25 % off the beat (β=.01, SE=.007, t=2.39, p=.01) and 50 % off the beat (β=.02, SE=.007, t=2.97, p=.003) in the slow tone frequency condition. However, the model also produced an unexpected Tone Frequency by Dot Probe Phase interaction. In the fast frequency, dot probes appearing 25 % off the beat (β=-.02, SE=.01, t=-2.04, p=.04) and 50 % off the beat (β=-.02, SE=.01, t=-2.34, p=.02) elicited significantly faster RTs than those appearing on the beat.

Table 1. Experiment 1 linear mixed model output for reaction times

Accuracy

Probe detection accuracy rates were also analyzed via an LMM with Subject as a random effect and fixed effects of Audio Condition (attend audio, ignore audio), Tone Frequency (slow or fast), and Dot Probe Phase (on beat, off 25 %, off 50 %; dummy coded). However, there was no effect of Audio Condition, so the model was simplified to exclude this effect. The simpler model (Table 2) revealed a significant main effect of Tone Frequency (β=.29, SE=.01, t=20.00, p<.001), with higher accuracy in the fast condition. In the slow condition, accuracy was significantly reduced for probes presented 25 % off the beat (β=-.04, SE=.01, t=-2.72, p=.006). Accuracy in the slow condition did not differ between probes presented on the beat or off by 50 %. Dot Probe Phase had no impact on accuracy in the fast condition (see Fig. 3).

Table 2. Experiment 1 linear mixed model output for accuracy

Discussion

Experiment 1 produced clear evidence of cross-modal attentional entrainment effects. Participants responded to dot probe onsets faster and with greater accuracy when they were aligned in time with the onset of an auditory stimulus in the rhythmic stream. However, the effects of attentional entrainment were only evident when the entraining rhythm was relatively slow. Lakatos and colleagues (Lakatos, Karmos, Mehta, Ulbert, & Schroeder, 2008; Lakatos et al., 2005) suggested that the mechanisms underlying attentional entrainment should flexibly adapt to almost any rhythmic stimulus, as entrained neural oscillators modulate both the phase and amplitude of those in other frequency bands. It is possible that the faster entraining rhythm elicited a more vigilant mode of attending, whereas slow rhythms encourage periodic attentional optimization (Schroeder & Lakatos, 2008). The results of Experiment 1 highlight a limitation of the Large and Jones (1999) model, which cannot predict differences in entrainment across frequency bands (or within a frequency band, as was the case in Experiment 1).

Experiment 1 also produced a surprising outcome wherein dot probe detection accuracy rebounded for probes presented exactly between beats. Although this outcome was unpredicted, the rebound effect could be attributed to a few different sources. It could reflect the interplay of endogenous and exogenous influences on attentional deployment. Although attention should naturally entrain to the auditory rhythm, endogenous attentional control mechanisms could fight this tendency, attempting to enhance attention at moments when entrainment would push it to its lower limit. However, this hypothesis is relatively intractable, from an experimental standpoint. Perhaps a better explanation is provided by an experiment conducted by Gomez-Ramirez et al. (2011). They replicated and extended the work of Lakatos and colleagues (Lakatos, Karmos, Mehta, Ulbert, & Schroeder, 2008) by examining entrainment to one channel of an audio-visual stream at a rate of .67 Hz. While they observed significant entrainment to the rhythm, as evidenced by EEG amplitude peaks of a .67-Hz component, they observed peaks of a much higher power at the second harmonic, 1.33 Hz. The current experiment used .67-Hz entrainers, so a substantial second harmonic would fall exactly in between beats, and could have produced the observed rebound effect in accuracy rates.
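The timing logic of the harmonic account can be checked with a small sketch. It idealizes each frequency component as a cosine whose peaks are phase-aligned with the tone onsets; that zero-lag alignment is a simplifying assumption made for illustration, not something measured in the current experiment, and the function name is hypothetical.

```python
import math


def harmonic_value(t_ms, beat_interval_ms, harmonic):
    """Value of an idealized cosine component at time t_ms after a tone onset.

    harmonic = 1 is the entraining frequency itself; harmonic = 2 is the
    second harmonic (1.33 Hz for a 1,500-ms interval). Peaks are assumed
    to coincide with tone onsets.
    """
    return math.cos(2 * math.pi * harmonic * t_ms / beat_interval_ms)


SLOW_INTERVAL_MS = 1500
midpoint = SLOW_INTERVAL_MS * 0.5  # 750 ms, the 50 % off-beat position

fundamental_at_mid = harmonic_value(midpoint, SLOW_INTERVAL_MS, harmonic=1)
second_at_mid = harmonic_value(midpoint, SLOW_INTERVAL_MS, harmonic=2)
```

Under this idealization the fundamental is at its trough 750 ms after a tone (value of -1), whereas the second harmonic is at a peak (value of +1), which is the coincidence the harmonic account relies on.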

An important outcome from Experiment 1 is that entrainment effects did not differ as a consequence of attentional set. Entrainment effects were still observed when the auditory stimuli required no attention at all, suggesting an automatic tendency to integrate information, however irrelevant, across sensory channels in service of generating predictions for perceptual optimization. Cross-modal entrainment is clearly adaptive within a dynamic (but often redundant) world. However, rhythms in the environment are rarely perfectly consistent, and sometimes change at varying rates (as with a horse beginning its gallop). Large and Jones (1999) took rhythmic variability into account: According to their model, attending rhythms will be able to accurately guide attention despite transient fluctuations around a mean frequency. The model also explicitly allows for entrainment to a rhythm that is broadly changing in frequency, so long as the change is consistent over time.

Currently, there is little evidence to suggest that attention can entrain to a changing rhythm. Furthermore, evidence (Cope, Grube, & Griffiths, 2012) seems to suggest the contrary pattern, that people anticipate future temporal events based almost exclusively on the most recent interval. When participants were asked to detect an out-of-place time interval in a changing rhythm, they were more likely to detect changes that exaggerated the pattern. Early tones were easier to detect in a speeding tempo, and late tones were easier to detect in a slowing tempo. Experiment 2 was designed to explore whether attention cross-modally entrains to a consistently changing rhythm, as is predicted by the Dynamic Attending Model (Large & Jones, 1999).

Experiment 2: Cross-modal entrainment to a changing rhythm

Experiment 2 used the same methodology as Experiment 1 to explore the effect of a changing rhythm on the deployment of attention. Participants were presented with consistently-changing (either speeding or slowing) auditory rhythms and were asked to detect visual targets presented on or off the beat. Although the Dynamic Attending Model (Large & Jones, 1999) predicts that attention will entrain to rhythms that change consistently over time, duration estimation experiments predict the failure of entrainment mechanisms (Cope, Grube, & Griffiths, 2012). If the model is correct, participants should be faster to report dot probes presented in phase with the rhythm, regardless of whether the rhythm is speeding or slowing. Cope et al. provide an alternative prediction. If participants over-rely on the previous interval in predicting tone onsets, probe detection should differ across tempo conditions. Participants in the speeding condition will deploy attention too late, missing stimulus onset if it is aligned with the rhythm, but detecting probes shifted off the rhythm. Conversely, participants in the slowing condition will anticipate the onset too early, but this will have the advantage of preparing them for the eventual onset moments later.
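The alternative prediction derived from Cope et al. can be sketched as a simple last-interval rule, in which the next tone is expected one previous interval after the most recent tone. The interval sequences below are hypothetical values chosen only to illustrate a steadily speeding and a steadily slowing rhythm; they are not the stimulus parameters of Experiment 2, and the function names are illustrative.

```python
def onsets_from_intervals(intervals_ms):
    """Cumulative tone onset times, starting at 0 ms."""
    onsets = [0]
    for interval in intervals_ms:
        onsets.append(onsets[-1] + interval)
    return onsets


def last_interval_errors(onsets_ms):
    """Signed prediction error (predicted - actual onset) for each tone.

    The prediction repeats the most recent inter-onset interval.
    Positive values mean attention would be deployed too late;
    negative values mean it would be deployed too early.
    """
    errors = []
    for i in range(2, len(onsets_ms)):
        last_interval = onsets_ms[i - 1] - onsets_ms[i - 2]
        predicted = onsets_ms[i - 1] + last_interval
        errors.append(predicted - onsets_ms[i])
    return errors


# Hypothetical sequences with 200-ms steps between successive intervals.
speeding = onsets_from_intervals([1500, 1300, 1100, 900, 700])
slowing = onsets_from_intervals([700, 900, 1100, 1300, 1500])

speeding_errors = last_interval_errors(speeding)  # all positive: too late
slowing_errors = last_interval_errors(slowing)    # all negative: too early
```

With these illustrative sequences the rule predicts every tone 200 ms late in the speeding rhythm and 200 ms early in the slowing rhythm, which is the asymmetry that motivates the differing predictions for the two tempo conditions.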