Summary statistics
Divided attention
For these analyses, accuracy data were analyzed using repeated measures ANOVAs. Mean audiovisual, auditory, and visual-only accuracy scores for the “yes” responses in the divided attention condition were generally high. The ANOVA results and mean accuracy scores are shown in Tables 3 and 4, respectively. Due to heterogeneity of the variances, accuracy scores were transformed using an arcsine transformation.
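As a minimal illustration of the transformation mentioned above, the sketch below applies the common arcsine square-root form, y = arcsin(√p), to a proportion correct. The exact variant used in the analyses is not stated in this section (some authors use 2·arcsin(√p)), so both the form and the function name are assumptions, and the scores are hypothetical.

```python
import math

def arcsine_transform(p):
    """Arcsine square-root transform of a proportion p in [0, 1].

    Assumed form: arcsin(sqrt(p)), returned in radians.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    return math.asin(math.sqrt(p))

# Hypothetical accuracy scores (proportion correct), not data from the study.
scores = [0.90, 0.95, 0.99]
transformed = [arcsine_transform(p) for p in scores]
```

The transform stretches values near ceiling, which is why it is often used to stabilize variances when accuracy is generally high.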
Table 3 Repeated measures ANOVAs for mean accuracy
Table 4 Mean accuracy in each condition
Overall, results revealed significant differences across stimuli: accuracy was higher in the congruent AV trials (A/b/V/b/) than for the incongruent A/b/V/g/ “McGurk” stimulus. Responses were also more accurate for A/b/V/b/ stimuli than for A/b/ alone or V/b/ alone. Next, mean RTs for each participant (for correct responses) were analyzed using repeated-measures ANOVAs. ANOVA results and mean RTs averaged across participants are shown in Tables 5 and 6, respectively.
Table 5 Repeated-measures ANOVAs for mean RT
Table 6 Mean RTs in each condition
First, we compared mean RTs in the congruent A/b/V/b/ trials to those in the incongruent A/b/V/g/ “McGurk” trials. Results averaged across participants indicated faster RTs for congruent AV trials than for incongruent “McGurk” trials. Next, we compared the congruent A/b/V/b/ mean RTs to the single target trials. In contrast to the accuracy results, the RT results failed to show evidence of facilitation in A/b/V/b/ responses compared to A/b/ or V/b/.
Focused attention
Results for the repeated-measures ANOVAs and mean accuracy (% correct) are displayed in Tables 7 and 8, respectively. Accuracy in the focused attention trials for the “yes” responses was high. To investigate whether accuracy differed across conditions, we carried out ANOVAs comparing mean accuracy for A/b/V/b/ versus A/b/V/g/ trials, and again for A/b/V/b/ versus A/b/ trials. Results showed a significant difference between A/b/V/b/ and A/b/V/g/ trials, but not between A/b/V/b/ and A/b/ trials.
Table 7 Repeated-measures ANOVAs for mean accuracy
Table 8 Mean accuracy in each condition
Finally, we carried out an ANOVA comparing mean RTs across relevant conditions. These results are displayed in Table 9, and mean RTs averaged across participants are displayed in Table 10.
Table 9 Repeated-measures ANOVAs for mean RT
Table 10 Mean RT (SD) in each condition
Results show a marginal though not significant trend toward faster congruent AV RTs compared to the incongruent trials. We also compared the congruent A/b/V/b/ mean RTs to the A-only mean RTs. Results point to a modest slowdown in A/b/V/b/ responses compared to the A/b/ trials. This indicates that participants failed to benefit from congruent visual cues when those cues were not pertinent to the task. To test whether the McGurk effect was stronger in the divided than in the focused attention condition, we carried out an ANOVA on the 2 × 2 interaction between congruency (congruent vs. incongruent) and attention (focused vs. divided). The interaction was significant, indicating a greater congruency facilitation in the divided (988 ms vs. 1,247 ms) than in the focused (1,081 ms vs. 1,194 ms) attention condition.
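The size of the interaction can be read directly off the four mean RTs reported above. The short sketch below works through that arithmetic; the function and variable names are illustrative, and the calculation describes the means only, not the ANOVA itself.

```python
def congruency_effect(congruent_ms, incongruent_ms):
    """Congruency facilitation: how much faster congruent trials are, in ms."""
    return incongruent_ms - congruent_ms

# Mean RTs (ms) as reported in the text: congruent vs. incongruent.
divided_effect = congruency_effect(988, 1247)
focused_effect = congruency_effect(1081, 1194)

# Interaction contrast: difference between the two congruency effects.
interaction_ms = divided_effect - focused_effect

print(divided_effect, focused_effect, interaction_ms)  # 259 113 146
```

On these means, the congruency facilitation is 259 ms under divided attention and 113 ms under focused attention, a difference of 146 ms.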
Divided attention: Capacity
Unlike those applied to mean RT or mean accuracy, significance tests for the capacity coefficients are not computed using parametric measures. Traditionally, capacity has been assessed by comparing calculated capacity values to upper and lower bounds (Townsend & Eidels, 2011; Townsend & Nozawa, 1995; Townsend & Wenger, 2004a, b). These bounds have been translated into capacity space by Townsend and Eidels (2011), and appear in Fig. 2. These comparisons of C(t) to theoretical bounds do not rely on parametric assumptions for RT distributions. In terms of statistical tests, Houpt and Townsend (2012) showed that semiparametric estimates of integrated hazard functions could be used to compute a Z statistic comparing data to race model predictions derived from unisensory trials. Figure 2 shows the values of the capacity coefficient (Eq. 1) for each participant. Recall that we only used the capacity coefficient for the congruent stimuli: ABVB/(AB + VB). Note that in each of the following figures, capacity, and subsequently integrated hazard function ratio values are displayed for time points in which there is overlap between the audiovisual, auditory, and visual-only RT distributions. In Fig. 2, the solid line and dotted line represent the upper and lower bounds, respectively, for parallel independent model predictions.
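To make the ratio ABVB/(AB + VB) concrete, the sketch below computes a capacity coefficient from raw RT samples. Eq. 1 is not reproduced in this section, so the code assumes the standard form C(t) = H_AV(t) / [H_A(t) + H_V(t)] with H(t) = -log S(t) (Townsend & Nozawa, 1995), estimated from the empirical survivor function. The function names and the toy RTs are hypothetical.

```python
import math

def survivor(rts, t):
    """Empirical survivor function S(t) = P(RT > t)."""
    return sum(1 for rt in rts if rt > t) / len(rts)

def integrated_hazard(rts, t):
    """Integrated hazard H(t) = -log S(t); undefined once S(t) reaches 0."""
    s = survivor(rts, t)
    if s <= 0.0:
        raise ValueError("S(t) is 0; t lies beyond the slowest RT")
    return -math.log(s)

def capacity(av_rts, a_rts, v_rts, t):
    """Assumed capacity coefficient C(t) = H_AV(t) / (H_A(t) + H_V(t))."""
    denom = integrated_hazard(a_rts, t) + integrated_hazard(v_rts, t)
    if denom == 0.0:
        raise ValueError("no single-channel responses by time t")
    return integrated_hazard(av_rts, t) / denom

# Toy RTs (ms): the AV distribution is no faster than either single channel.
toy = [400, 500, 600, 700]
c_550 = capacity(toy, toy, toy, 550)
```

With identical distributions in all three conditions, C(t) equals 0.5 wherever it is defined. This is the limited-capacity case in which the second modality adds nothing. The requirement that all survivor functions be positive is also why capacity can only be evaluated where the RT distributions overlap.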
Results showed that capacity was limited (<1) for each of the participants. Notably, all capacity functions also increased as RTs increased, illustrating powerful dynamics in AV perception. The statistical tests from Houpt and Townsend (2012) indicated that capacity was significantly lower than the UCIP prediction of C(t) = 1 for all participants, Z < -100, p < .0001. None of the capacity functions approached the upper Miller bound, suggesting a strong rejection of a pure coactivation model for all individual listeners. The capacity functions for all participants, especially Participants 3, 4, and 5, fell below the lower Grice bound for some time points, implying severely limited capacity. This was especially true for the fast RTs. Data that fall below the Grice bound show that RTs are slower for A/b/V/b/ than for the faster of A/b/ or V/b/. Interestingly, RTs were actually harmed by the congruent audiovisual information in this high-accuracy setting. Thus, because capacity ranges from moderately to severely limited, we can conclude that there is no evidence for either a strongly coactive model or a parallel independent channels model with unlimited capacity.
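The two bounds can be expressed in capacity space from the single-channel survivor functions alone. The sketch below assumes the usual forms: the Miller inequality F_AV(t) ≤ F_A(t) + F_V(t) and the Grice inequality F_AV(t) ≥ max[F_A(t), F_V(t)], each rewritten in terms of S(t) = 1 - F(t) and divided by log[S_A(t) S_V(t)]. The function names and example values are hypothetical.

```python
import math

def _check(s_a, s_v):
    # Both survivor values must lie strictly between 0 and 1 so the logs exist
    # and the denominator log(S_A * S_V) is nonzero.
    if not (0.0 < s_a < 1.0 and 0.0 < s_v < 1.0):
        raise ValueError("survivor values must lie strictly between 0 and 1")

def grice_bound(s_a, s_v):
    """Lower bound: C(t) when the AV survivor equals that of the faster channel."""
    _check(s_a, s_v)
    return math.log(min(s_a, s_v)) / math.log(s_a * s_v)

def miller_bound(s_a, s_v):
    """Upper bound on C(t); unbounded where S_A + S_V - 1 <= 0."""
    _check(s_a, s_v)
    joint = s_a + s_v - 1.0
    if joint <= 0.0:
        return math.inf
    return math.log(joint) / math.log(s_a * s_v)

# Hypothetical survivor values at one time point.
lower = grice_bound(0.8, 0.6)
upper = miller_bound(0.8, 0.6)
```

Values of C(t) below `lower` mean the audiovisual response is slower than the faster single modality, which is the pattern described for the fast RTs.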
Although we can reject these particular models, our results could arise from a more sophisticated coactive or a parallel system. For example, systems that have inhibitory interactions between channels would be capable of yielding limited capacity (e.g., Eidels et al., 2011; Wenger & Townsend, 2006). However, a coactive system would have to include exceedingly strong inhibitory connections in order to produce the severely limited capacity exhibited by subjects 3, 4, and 5. Parsimony suggests, then, that these results are most consistent with a parallel system with limited capacity or a parallel system with inhibitory crosstalk.
Divided attention: Integrated hazard ratios
Next, we carried out comparisons involving the integrated hazard function ratios for each participant in order to supplement the capacity results and to deepen our inferences regarding which model may be more appropriate to describe auditory-visual integration. These analyses included the ratio of empirical integrated hazard functions with congruent ABVB (see Fig. 3) or incongruent ABVG (see Fig. 4) in the numerator. Using AB and VB in the denominator allows us to assess the amount of audiovisual gain or interference provided over each modality.
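A ratio of this kind can be sketched as follows. The code assumes a Nelson-Aalen estimate of the integrated hazard for uncensored RTs, which may differ from the estimator used to produce the figures. The function names and the toy RTs are hypothetical.

```python
def nelson_aalen(rts, t):
    """Nelson-Aalen estimate of the integrated hazard H(t) for uncensored RTs.

    Sums d_i / n_i over distinct response times t_i <= t, where d_i is the
    number of responses at t_i and n_i the number of responses at or after t_i.
    """
    total = 0.0
    for t_i in sorted(set(rts)):
        if t_i > t:
            break
        d_i = sum(1 for rt in rts if rt == t_i)
        n_i = sum(1 for rt in rts if rt >= t_i)
        total += d_i / n_i
    return total

def hazard_ratio(numerator_rts, denominator_rts, t):
    """Ratio of integrated hazards, e.g. ABVB / AB, at time t."""
    denom = nelson_aalen(denominator_rts, t)
    if denom == 0.0:
        raise ValueError("no denominator responses by time t")
    return nelson_aalen(numerator_rts, t) / denom

# Toy RTs (ms): the audiovisual trials are uniformly 100 ms faster.
abvb = [300, 400, 500, 600]
ab = [400, 500, 600, 700]
ratio = hazard_ratio(abvb, ab, 550)
```

A ratio above 1, as in this toy case, means that more work has been completed by time t for the numerator stimulus, that is, responses to it are faster. A ratio below 1 indicates interference.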
Cox proportional hazard regression statistical tests for the congruent ratios are shown in the first two columns of Table 11. Cox tests are semiparametric statistical tests that do not require the assumption of normality of RT distributions (see Altieri, Stevenson, Wallace, & Wenger, 2015; Wenger & Gibson, 2004; Wenger & Rhoten, in press). The Cox regression analysis was used to determine whether the underlying hazard functions from the two different trial types (e.g., ABVB vs. VB) statistically differed from one another. We used Cox regression analysis, similar to Wenger and Gibson (2004) and Altieri et al. (2015), because the method is established for testing differences in hazard functions; furthermore, a significant difference between two hazard functions implies that the integrated hazard functions will also differ.
Table 11 Cox regression analysis (Allison, 1995), % Change (β; p), results for the divided attention study. Positive β values indicate that RTs for the stimulus specified in the numerator were faster than the stimulus specified in the denominator, and negative values indicate the reverse
Figure 3 and the first two columns of Table 11 illustrate the effects of the congruency benefit compared to each individual modality. Both demonstrate that for the ABVB/AB comparison, Participants 1 and 2 showed ratios significantly greater than 1, indicating benefits provided by the congruent audiovisual stimulus (A/b/V/b/) over A/b/ alone. Notably, these two subjects also responded faster to V/b/ than to A/b/ stimuli, as evidenced by the greater efficiency ratio for ABVB/AB compared to ABVB/VB. These two subjects also showed a small redundancy gain over the visual modality, as both ratios appear to be greater than 1. These results are consistent with those for the capacity statistics. The other three participants evidenced similar RTs for V/b/ and A/b/, although Participants 3 and 5 received greater benefit from A/b/V/b/ over V/b/.
Recall that both coactive and UCIP models predict integrated hazard ratios greater than 1, with coactive predictions being much higher than 1. Data from Participants 3 and 4 do not support either of these models, as both congruent integrated hazard ratios tended to be less than 1. However, Participants 1, 2, and 5 showed evidence of AV advantages over at least one of the two modalities. For these participants, AV RTs appear to be driven by the faster of the two modalities, with only Participant 1 and, to a lesser degree, Participant 2 showing any advantage over the fastest single modality (the smaller of the two efficiency ratios is >1). Thus, there is further evidence to reject the coactive and UCIP models for four of the five participants, as these four received little to no benefit from the second modality.
Next, Fig. 4 shows three comparisons involving incongruent audiovisual speech information. These ratios allow assessment of how incongruent information in one modality interferes with the opposing modality. Cox regression tests are shown in the rightmost columns of Table 2, which support the results described below. First, the comparison involving ABVG/AGVB was carried out to determine whether incongruent visual information had a stronger effect on auditory processing, or alternatively, whether incongruent auditory information had a stronger effect on visual identification. A ratio of 1 would imply that there is symmetry in the influence of one modality on the other. The results show that in four out of five cases, visual distractors slowed auditory processing more than the other way around. That is, ABVG/AGVB < 1, and the McGurk perception of A/b/V/g/ led to slower responses than the presumed clustered perception of A/g/V/b/. The exception was Participant 3, who showed no effect either way.
Second, we assessed whether an incongruent V/g/ slowed processing of A/b/ by evaluating the ratio ABVG/AB. All participants showed evidence of slower processing for the A/b/V/g/ trials than for A/b/, consistent with the traditional accuracy-based result that V/g/ inhibits the perception of A/b/. Finally, we evaluated AGVB/VB to determine whether, and to what extent, the incongruent auditory A/g/ inhibited the detection of V/b/. Testing this asymmetry constitutes an advantage of the detection approach, since we can assess the relative influence of the visual modality on auditory perception, and the reverse. Strikingly, all participants except Participant 1 showed evidence of slower processing of the A/g/V/b/ stimulus than of V/b/. Thus, there are effects in both directions: visual incongruence deleteriously affects auditory processing, and auditory incongruence damages visual processing. However, the effect is greater for conflicting visual than for conflicting auditory information.
One other notable finding in these data is that the incongruency effect is largest for short RTs. As subjects take longer to respond to the stimuli, the incongruency effects tend to diminish, and efficiencies even approach 1 for the longest RTs. We also see similar effects in Fig. 2, where efficiencies for congruent stimuli increase with increasing RTs.
Focused attention: Integrated hazard ratios
Figure 5 displays three integrated hazard ratios for the focused attention condition in the right panels separately for each participant. The integrated hazard comparisons include ABVB/AB, ABVB/ABVG, and ABVG/AB. The Cox regression analysis results are displayed in Table 12. Evidence for the ability of listeners to focus their attention on the auditory modality would be revealed in these hazard ratios all equaling 1, suggesting that the presence of the visual stimulus, whether V/b/ or V/g/, would not influence the response. However, Fig. 5 demonstrates that for ABVB/AB and ABVG/AB, the efficiency ratios tend to be less than 1. That is, data from all observers revealed evidence for more efficient processing when only auditory information was present. It is particularly interesting that the irrelevant but congruent cue provided by V/b/ actually hurt performance. For ABVG/AB, we see efficiencies far lower than 1 for all listeners, indicative of a strong inability to filter out the incongruent V/g/ stimulus.
Table 12 Cox regression statistics [β(p)] from the focused attention condition
The comparison between ABVB/AB and ABVG/AB allows a determination of the extent to which the conflicting but irrelevant visual information slowed processing relative to the congruent but irrelevant visual cues. All participants except Participant 3 evidenced significantly poorer efficiency when stimuli were incongruent than when they were congruent: The ratio ABVB/ABVG illustrates that subjects are faster for A/b/V/b/ than for A/b/V/g/, indicative of faster RTs for congruent than for incongruent stimuli. Generally speaking, although the congruent information did not improve performance over that observed with a single modality, we still see a failure of attentional mechanisms to filter out the incongruent information. The incongruent information hurt subjects much more than the congruent (but unhelpful) information.
Divided versus focused attention
To compare relative effects of attention, we compared integrated hazard ratios from the divided attention condition to the analogous ratio from the focused attention condition. The purpose of these comparisons was essentially to test an interaction across conditions in order to address the following question: To what extent was the difference between hazard ratios greater in the divided compared to the focused attention condition? Answering this question would allow us to determine whether the influence of the visual modality was greater when attention was divided. This test was carried out for the three integrated hazard ratios as shown in Fig. 6: DIV/FOC: ABVB/AB, ABVG/AB, ABVB/ABVG. To do this, we used a Z test based on Houpt and Townsend’s (2012) capacity test statistics, modified to test for interactions: for example, the null hypothesis for the ABVB/AB comparison was $[\mathrm{ABVB} - \mathrm{AB}]_{\mathrm{Divided}} - [\mathrm{ABVB} - \mathrm{AB}]_{\mathrm{Focused}} = 0$. Table 13 shows the results from the statistical tests.
Table 13 Statistical tests using estimated integrated hazard functions comparing divided versus focused attention integration (Z(p))
First, we observe that the DIV/FOC ratios for ABVB/AB were significantly greater than 1 for all participants. Thus, all participants benefited more from congruent visual cues when attention was divided than when they focused only on the auditory modality. Note, however, that A/b/V/b/ provides the participant with two opportunities to say “yes” in the divided attention experiment but not in the focused attention experiment. Hence, there may be statistical effects present that are typically associated with multiple targets (Miller, 1982), causing us to overestimate the benefit of dividing attention.
The most straightforward ratio to interpret between the divided and focused attention conditions is that of ABVG/AB because, in both experiments, only A/b/ is associated with a “yes” response. Therefore, when taking the divided hazard ratios over the focused ratios, values of 1 would indicate that attention has little effect on performance. If subjects were faster in the focused attention condition, we would expect values less than 1. Values greater than 1 imply that dividing attention provides faster responses for the incongruent stimulus and that focusing impairs performance. There are clearly some large individual differences, but these ratios are near 1 for only one of the subjects, Participant 1 (see Tables 7 and 8), suggesting that only this participant was immune to the effects of attentional manipulation. Interestingly, Participants 2 and 4 demonstrated divided attention ratios greater than the focused attention ratios, suggesting that focusing their attention came at a cost to the speed of AV processing. However, Participants 3 and 5 showed the opposite effect, indicating that responses to A/b/V/g/ trials, relative to A/b/, were faster in focused attention. The implication is that these subjects possessed a weak ability to focus their attention and were less impaired by the incongruent information in the focused attention case.
Finally, we compared the ABVB/ABVG ratio across attention conditions (see Table 13). This comparison allowed us to examine whether the benefit of congruent audiovisual signals is greater when attention is divided rather than focused. The results were variable, but consistent with the results for the ABVG/AB ratio: Participants 3 and 5 showed a greater effect in the divided attention condition, suggestive of some (albeit weak) ability to focus their attention and filter out the congruent visual stimulus. On the other hand, Participants 2 and 4 showed evidence of a weaker congruency effect in the divided attention condition.
While the influence of the visual signal varied across participants and conditions, the overall pattern of results shows that all of the participants were unsuccessful in completely focusing their attention on the auditory modality. There is some suggestion that there is a small effect of attention for some subjects, but on the whole, there is a global failure to inhibit both congruent and incongruent information. That people are impaired by congruent information also lends credence to a parallel interactive channel model: Congruent information would not be expected to have a negative effect in a coactive model.