
Effects of emotional prosody on novel word learning in relation to autism-like traits


Emotional information can influence various cognitive processes, such as attention, motivation, and memory. Differences in the processing of emotion have been observed in individuals with high levels of autism-like traits. The current study aimed to determine the influence of emotional prosody on word learning ability in neurotypical adults who varied in their levels of autism-like traits. Thirty-eight participants learned 30 nonsense words as names for 30 “alien” characters. Alien names were verbally presented with happy, fearful, or neutral prosody. For all participants, recall performance was significantly worse for words spoken with fearful prosody compared to neutral. Recall performance was also worse for words spoken with happy prosody compared to neutral, but only for those with lower levels of autism-like traits. The findings suggest that emotional prosody can interfere with word learning, and that people with fewer autism-like traits may be more susceptible to such interference due to a higher attention bias toward emotion.


The exchange of emotional information (e.g., facial expressions, tone of voice) is widely accepted to occur automatically during communication (Moses et al. 2001). The processing of emotion not only enhances the quality of social communication, but can aid in the recognition of threat (e.g., an individual expressing fearful or angry emotions may signal threat to others) or benefit (e.g., positive emotions may signal an opportunity for gain) (Kuperman 2013). However, certain individual characteristics may affect the ability to process emotion. For example, along with characteristics such as reduced social functioning and increased rigid and/or repetitive behaviours, individuals with autism spectrum disorders (ASD) show impaired processing and understanding of emotion (e.g. Begeer et al. 2008; Golan et al. 2006; Nuske et al. 2013). There is evidence to suggest that, even without a diagnosis of ASD, individuals with higher levels of autism-like traits have poorer recognition of emotion compared to those with lower levels of autism-like traits (Hermans et al. 2009; Hosokawa et al. 2015; Ingersoll 2010; Poljac et al. 2012). The broader autism phenotype (BAP) refers to the expression of subclinical autism-like traits in neurotypical individuals, reflecting genetic overlap with ASD.

In the general population, research has demonstrated that emotional information is processed more efficiently than neutral information, and can influence various cognitive processes (e.g. Hinojosa et al. 2010). Life events associated with strong emotional reactions are better remembered compared to other events (Calev 1996; Kensinger et al. 2007). Emotional information can also be useful during the acquisition of new knowledge, particularly when emotion can be used as a referent towards a particular piece of novel information. For instance, Hooker et al. (2008) asked participants to learn associations between novel objects and either emotional or neutral faces. Objects were presented alongside images of happy, fearful or neutral facial expressions. Participants were faster at learning objects associated with the emotional expressions compared to the neutral expressions. Such results suggest a processing bias for emotional information during associative learning and memory tasks.

Within language processing, words with positive (e.g., “kindness”) or negative (e.g., “danger”) emotional meaning are more quickly recognised and better remembered compared to non-emotional words, such as “tower” (e.g. Kensinger and Corkin 2003; Kousta et al. 2009). Although some studies have shown preferential processing for positive emotion words (Calev 1996; Hofmann et al. 2009), a large body of evidence shows a stronger bias toward negative emotion words (Peeters and Czapinski 1990; Vaish et al. 2008; Wang and Fu 2011). This negativity bias is believed to be due to the evolutionary advantage in attending quickly to negative emotional information. It is advantageous to attend to positive emotional information as a signal for gaining something beneficial; however, this is often not as immediately essential for survival as recognising negative, and potentially threatening, information (Kuperman 2013; Peeters and Czapinski 1990). Although the influence of emotion on the ability to remember known words has been explored, it is yet to be determined whether an emotion bias influences the ability to learn new words.

Word acquisition is a complex mental process which can be influenced by multiple factors. For example, some factors which enhance the learning and retention of new words include vocal rehearsal (Kaushanskaya and Yoo 2011), native language phonology (Baddeley et al. 1998), tone of voice (Reinisch et al. 2013; Shintel et al. 2014), additional semantic information (Angwin et al. 2014), and greater verbal short term memory ability (Gupta 2003). Interestingly, both Reinisch et al. (2013) and Shintel et al. (2014) found that the way in which a word was said, for example increasing the volume when referring to something large, can aid in forming associations between novel words and meaning. These studies show that an informative tone of voice can be a useful tool during word learning if the tone is consistent with the context. Given the influence of emotion on our attention, an emotional tone of voice during word learning may influence attentional processes and, consequently, word learning processes.

There is evidence indicating that additional semantic information can enhance word learning ability (Angwin et al. 2014), consistent with previous work showing that a deeper level of encoding enhances memory for words (Craik and Tulving 1975). However, research into whether emotional information provides such a benefit is exceptionally limited. Eden et al. (2015) demonstrated an advantage for learning words with an emotional meaning compared to other words in neurotypical adults. Written novel pseudo-words were paired with images of known objects which had been rated as emotionally negative or neutral. A word learning paradigm was used whereby the participants arbitrarily indicated whether a pseudo-word was the appropriate label for an image. They continued to make such judgements until they met a threshold of consistency regarding their chosen word and image pairs. Following this task, a cued recall task was administered in which participants recalled the object that each pseudo-word was associated with to measure the retention of word meaning. The object and word associations were better remembered when the objects were negative compared to neutral. Similar to the memory research described previously (Bower 1981; Calev 1996; Hooker et al. 2008; Kensinger et al. 2007), these results show a processing advantage for information which is explicitly linked with emotion, in this instance in regards to word learning.

The results of Eden et al. (2015) demonstrate that emotional visual stimuli can influence word learning. However, it can be argued that this study was limited in representing natural word learning scenarios in two ways. Firstly, new words are typically linked with unfamiliar objects, without an existing lexical representation, rather than familiar objects. Secondly, it is rarely the case that new words would be paired with a visual stimulus without additional information, such as tone of voice, or vocal prosody. As we are typically exposed to new words via speech, it is possible that the way in which a word is spoken may influence learning (Reinisch et al. 2013; Shintel et al. 2014). Indeed, vocal prosody is also one of the most useful and effective modalities for communicating emotion (Schirmer and Kotz 2003; Shriberg et al. 2001). Thus, it is important to determine the extent to which emotional prosody influences our ability to learn new words.

Berman et al. (2013) investigated whether word learning in 5-year-old children was facilitated by informative emotional prosody. Objects were presented in either a broken or enhanced (i.e., attractive attributes added, such as stars) state and paired with pseudo-words spoken with either positive or negative emotional prosody. Analysis of eye gaze showed that 5-year-olds appropriately associated the broken and enhanced objects with the negatively and positively spoken pseudo-words, respectively. Further, children showed an ability to generalise the pseudo-word learned with negative prosody by correctly associating the word with the object both when the object was returned to its original unbroken state, and when an alternative version of the object was presented. The results from this study suggest that the use of emotional prosody, particularly negative emotional prosody, to guide word-object associations may be an important mechanism for learning new words throughout development. Exploring the extent to which emotional prosody can impact word learning in more complex scenarios in adulthood may further elucidate current understanding regarding how emotion in speech impacts word learning. For example, it is yet to be determined whether emotional prosody can aid in word learning when it is not directly relevant to the referent or the focus of the task.

It is also important to determine whether an emotion bias elicited during word learning varies as a function of autism-like traits. Research has demonstrated a lack of memory bias for emotional stimuli relative to neutral stimuli in adults with ASD when recalling emotional or neutral sentences (Beversdorf et al. 1998), or recognising emotional or neutral images (Deruelle et al. 2008). In these experiments, emotional stimuli were remembered better than neutral stimuli for typical adults, but not for those with ASD, despite having equivalent overall performance. Further, evidence in childhood has shown that children with ASD are less influenced by positive emotions when required to use this information to associate pseudo-words with objects, but do not show difficulty making these associations when given negative emotional information (Thurman et al. 2015).

The results of these studies suggest that individuals with ASD may have a weaker automatic attention bias toward emotional information than those without ASD. It has been theorised that autism-like traits exist on a spectrum, and that every individual in the population has a place along this spectrum (Wing 1988). The higher end of the spectrum includes those who show a lower interest or ability in social behaviours, and more systematic cognitions and behaviours. On the lower end of the spectrum are those who are more socially competent and have a higher capacity for empathy, but may struggle with routines and understanding systems. It is believed that those who have a diagnosis of ASD fall at the extreme high end of this spectrum, and typical adults fall somewhere below this. Baron-Cohen et al. (2001) devised an assessment termed the Autism-Spectrum Quotient (AQ) to measure these traits in the general population. In regards to emotion processing and understanding, research has found that neurotypical adults who obtain high scores on the AQ, indicating a relatively high non-clinical level of autism-like traits, are less able to accurately recognise, as well as imitate, emotional facial expressions compared to individuals who obtain low scores on the AQ (Hermans et al. 2009; Hosokawa et al. 2015; Ingersoll 2010; Poljac et al. 2012). No studies to date have examined the relationship between AQ scores and the extent to which emotional information can influence cognitive processes such as memory and learning.

The current study aimed to determine whether the word learning ability of neurotypical adults with relatively higher or lower levels of autism-like traits is differentially affected by emotion. Participants were required to learn the names of novel alien drawings during a single session. Names were auditory novel pseudo-words spoken with happy, fearful, or neutral emotional prosody. It was hypothesised that words spoken with both positive and negative emotional prosody would be learned more effectively than words spoken with neutral prosody, and that words spoken with emotionally negative prosody would be learned more effectively than words spoken with positive prosody. It was further hypothesised that this emotional bias would be less pronounced or absent in those who score higher on the AQ, relative to those with lower AQ scores.



Thirty-eight healthy adults (22 females, 16 males; M age = 22.6 years, SD = 5.7) participated in the study. Twenty-three participants were students who completed the task to obtain course credit towards an undergraduate psychology course. The remaining 15 participants were recruited via community advertisements and social media, and were entered into a prize draw as incentive to participate. All participants reported having normal vision and hearing, no history of any psychological or neurological conditions (including an autism spectrum condition), and that English was their primary language. Three participants reported being left handed. All participants provided written informed consent prior to participation, and ethical approval for the study was granted by the University of Queensland Medical Research Ethics Committee.


The Autism-Spectrum Quotient (AQ)

The AQ (Baron-Cohen et al. 2001) was used in the current study to measure the level of autism-like traits. The AQ requires participants to indicate the extent to which they agree with 50 statements, using 4-point Likert-type response scales, ranging from “definitely agree” to “definitely disagree”. The 50 statements are intended to measure five different traits related to ASD. These include social skill (e.g. “I prefer to do things with others rather than on my own”), attention switching (e.g. “I prefer to do things the same way over and over again”), imagination (e.g. “If I try to imagine something, I find it very easy to create a picture in my mind”), attention to detail (e.g. “I often notice small sounds when others do not”), and communication (e.g. “Other people frequently tell me that what I’ve said is impolite, even though I think it is polite”). Baron-Cohen et al. (2001) specified a scoring system in which the four possible response options were collapsed and scored as either 1 or 0 depending on whether they indicated the presence of an autism-like trait (i.e., collapsing responses which indicate “slightly” or “definitely” agreeing or disagreeing). Consistent with Austin (2005), the current study utilised an alternative scoring system in which scoring ranges from 1 to 4 in accordance with the Likert-type responses. For instance, for statements which are designed to reflect the presence of an autism characteristic, responses of “slightly agree” or “definitely agree” would receive a score of 3 or 4, respectively, and the levels of disagreement would receive a score of 1 or 2. Some items reflect the opposite to an aspect of an autism characteristic, and are reverse scored. Thus, a total AQ score can range from 50 to 200, with higher scores indicating higher levels of autism-like traits. This system of scoring has been suggested to be more sensitive to detecting levels of autism-like traits among individuals (Austin 2005).
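The 1–4 scoring scheme described above (Austin 2005) can be sketched as follows. This is a minimal illustration only: the item numbers, responses, and the set of agree-keyed items are hypothetical placeholders, not the actual AQ scoring key.

```python
def score_aq(responses, agree_keyed):
    """Score AQ items on the 1-4 Likert scheme (Austin 2005).

    responses  : dict mapping item number -> response 1..4, where
                 1 = definitely disagree ... 4 = definitely agree.
    agree_keyed: set of item numbers for which agreement indicates
                 an autism-like trait; all other items are reverse scored.
    """
    total = 0
    for item, resp in responses.items():
        if not 1 <= resp <= 4:
            raise ValueError(f"item {item}: response must be 1-4")
        # Reverse-keyed items: "definitely disagree" (1) scores 4, etc.
        total += resp if item in agree_keyed else 5 - resp
    return total

# Hypothetical 4-item illustration; the real AQ has 50 items,
# so totals range from 50 to 200.
responses = {1: 4, 2: 1, 3: 3, 4: 2}
agree_keyed = {1, 3}  # items where agreeing is trait-consistent
print(score_aq(responses, agree_keyed))  # 4 + 4 + 3 + 3 = 14
```

With 50 items scored this way, a respondent who gives the least trait-consistent response to every item scores 50, and one who gives the most trait-consistent response to every item scores 200, matching the range stated above.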

Additional measures

Three measures of cognition were also included to determine whether cognition impacted word learning ability relative to AQ scores. These consisted of the National Adult Reading Test (NART; Nelson and Willison 1991), to assess word reading and verbal IQ, a Digit Span forwards and backwards test, to assess working memory, and a measure of nonword repetition ability. The Digit Span tests require participants to immediately repeat strings of numbers which progressively increase in length. The nonword repetition task was developed using nonword stimuli from Gupta et al. (2004) that were not already being used as stimuli in the word learning task. The stimuli progressively increased in number of syllables such that there were 10 items of 3-syllable words (e.g., “bokonip”), 10 items of 5-syllable words (e.g., “ponokasidin”), and 10 items of 7-syllable words (e.g., “tasedibutarimon”). Words were recorded by a female Australian speaker and delivered on a Dell laptop using E-Prime 2 software (Psychology Software Tools, Pittsburgh, PA). Participants were instructed to repeat each nonword immediately after hearing it. Responses were scored as correct only if pronunciation was exact.

Word learning task


The study consisted of a word learning paradigm in which participants were required to learn novel names for images of unique “alien” characters, drawn from Gupta et al. (2004). These names were 30 three-syllable pseudo-words consisting of random combinations of English phonemes, such as “Bekinore”, and were presented vocally. Three of Gupta et al.’s pre-arranged lists of ten pseudo-words were chosen, with each list having equal variation of onset letters, such that each list contained an equal number of words beginning with b, d, g, k, p, and t.

Auditory recordings of each nonword were created by a female acting student who vocalised each pseudo-word with happy, fearful, and neutral emotional prosody. Each of the 30 pseudo-words was recorded twice for each of the three intonations, resulting in 180 recordings. These recordings were then presented to 19 healthy adults (13 female; M age = 29.93 years, SD = 9.41), who were not participating in the word learning experiment, in an online survey for validation of the emotional expression. In the survey, participants listened to the 180 recordings one at a time, and rated each on its level of both fearful and happy emotional expression using Likert-type scales ranging from 1 (not at all) to 7 (very much). Of the two recordings per word for the fear and happy conditions, the recordings with the highest mean rating on the appropriate emotion were chosen for use in the study. Paired samples t tests were performed on the ratings for the chosen stimuli. Due to a technical error, 25.85% of ratings were missing overall. The findings revealed that the 30 fearful recordings were rated significantly higher on the fear scale (M = 4.65, SD = 1.16) compared to the happy scale (M = 1.08, SD = 0.26), t(14) = 12.20, p < .001, d = 3.15, 95% CI [2.94, 4.20], and that 80% of responses were ratings of four or above on the fear scale. For the happy recordings, the selected recordings were rated significantly higher on the happy scale (M = 5.05, SD = 1.49) compared to the fear scale (M = 1.34, SD = 0.42), t(16) = 10.32, p < .001, d = 2.50, 95% CI [2.95, 4.48], and that 79% of responses were ratings of four or above on the happy scale. Of the 60 words spoken in neutral prosody, the 30 which were rated lowest on both the fear and happy scales were chosen for use in the study.
Results revealed that the fear ratings (M = 1.29, SD = 0.51) and happy ratings (M = 1.35, SD = 0.73) for these 30 neutral recordings were not significantly different, t(13) = 0.31, p = .765, ns, and only 5% of responses were ratings of four or above on either the happy or fear scales. In addition, the neutral recordings were rated as significantly less fearful than the fearful recordings, t(14) = 11.52, p < .001, d = 2.98, 95% CI [2.81, 4.09], and significantly less happy than the happy recordings, t(13) = 11.13, p < .001, d = 2.98, 95% CI [2.98, 4.41].

The aliens were 30 unique alien images taken from the Gupta et al. (2004) database. In this database, aliens were developed to vary with respect to four main characteristics: the number of arms, the presence or absence of a tail, the head shape, and the body shape. Three sets of ten aliens were selected for the current study. Each set was matched for the number of aliens with each variation of the previously mentioned characteristics.

Three versions of the experiment were created and were randomly assigned to each participant. Each version consisted of the 30 aliens and 30 pseudo-words, with each of the three lists of ten pseudo-words allocated to an emotional condition. This resulted in ten fearful, ten happy, and ten neutral sounding words. The pairing of each set of aliens with each set of words was randomised using a Latin square and counterbalanced across emotional conditions. This ensured that each version of the experiment contained different alien-word pairings in each emotional condition. See Table 1 for a visual representation of the three versions of the experiment. In addition, trials within the experiment were pseudo-randomised to ensure that the same emotional condition did not occur on five or more subsequent trials, and that each emotional condition was evenly distributed throughout any given sequence.
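The Latin-square counterbalancing described above can be sketched as follows. This is a minimal illustration under stated assumptions: the list and condition labels are placeholders, and rotation is one standard way to generate a 3 × 3 Latin square, not necessarily the exact procedure used.

```python
EMOTIONS = ["happy", "fearful", "neutral"]
WORD_LISTS = ["list_A", "list_B", "list_C"]

def latin_square_versions():
    """Build three counterbalanced experiment versions: within each
    version every word list receives a different emotion, and across
    versions each list appears in each emotion exactly once."""
    versions = []
    for shift in range(3):
        # Rotate the emotion assignment by one position per version.
        version = {WORD_LISTS[i]: EMOTIONS[(i + shift) % 3]
                   for i in range(3)}
        versions.append(version)
    return versions

for v in latin_square_versions():
    print(v)
```

Each participant would then be randomly assigned one of the three versions, so that no alien-word pairing is confounded with a particular emotional condition across the sample.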

Table 1 Counter-balanced experiment versions


Prior to the experiment proper, participants completed a repetition test in order to verify their capacity to accurately pronounce each of the nonword stimuli. Participants were presented with each of the verbal nonwords one at a time, without accompanying visual stimuli, and were asked to repeat each nonword aloud. The experimenter scored the participant’s response by either marking the item as correct if it was pronounced correctly, or by noting the alternate pronunciation used, and then indicated when the participant could proceed to the next trial. Responses were also audio recorded for later verification. If a word was pronounced with more than one syllable incorrect during the word practice, or was pronounced such that it resembled a real word (e.g., “bonikak” pronounced as “bunny-kak”), that item was removed from all analyses for that participant. In total, 7.90% of items were removed on this basis. For words which were pronounced with only one syllable incorrect and did not resemble a real word, the alternative pronunciation was considered correct for that participant for the remainder of the task. Participants wore AKG K550 closed back reference class headphones, and volume was adjusted to a comfortable listening level for each individual.

The word learning task involved three phases: a learning phase, a recall phase, and a recognition phase, with each task presented on a Dell laptop using E-Prime 2 software. In the learning phase, participants viewed images of each alien, one at a time, while simultaneously hearing a vocal recording of a nonword spoken with happy, fearful, or neutral emotional prosody. Participants were instructed to remember the names of each of the aliens. Each trial consisted of a fixation cross in the centre of the screen for 1500 ms, followed by the alien image and the vocal recording. There was a 250 ms delay between the onset of the alien and the onset of the word. Each alien remained on the screen for a total of 5000 ms for each trial, followed by a blank screen for 500 ms before automatically beginning the next trial. The presentation of all 30 aliens and their names was repeated twice before beginning the subsequent recall phase. Participants were given the opportunity for a short break between each set of 30 aliens within a learning phase. The duration of the breaks was controlled by the participants, who were instructed to press a button when they were ready to proceed.

In the recall phase, participants were shown each of the 30 aliens one at a time in a random order, and were required to state the name of each alien aloud while their responses were recorded using an Olympus digital voice recorder. Each alien remained on the screen for a maximum of 5000 ms, however participants could press a button prior to this to proceed to the next trial. The 5000 ms time limit was applied to ensure that participants were not fixating on one stimulus for too long. Responses were scored online as either correct or incorrect, with mispronunciations of any part of the word scored as incorrect. Scoring was later verified from the audio recordings. The learning and recall phases were repeated five times, with a different pseudo-randomised order of trials used in each sequence. Each word-alien pair and emotional condition remained consistent across all sequences for each participant.

At the end of the five learning and recall phases, there was a final recognition phase. In this phase participants were again presented with each of the 30 aliens in a random order. Alongside each alien were three written nonwords which included the ‘correct’ word that had been associated with the alien throughout the experiment, and two ‘incorrect’ words which had been paired with other aliens. Participants were required to indicate which word was the correct name for the alien by pressing one of three buttons on a serial response box. The correct response option was pseudo-randomised such that the same button was not allocated to the correct option more than five times consecutively. The two incorrect options were always selected from the same emotional condition as the target alien for each trial, and all three options always had different onset letters. Each trial consisted of a fixation cross for 1000 ms, followed by the alien and the word options. Participants were instructed to respond as quickly and accurately as possible, and a 4000 ms time limit was imposed on each trial. After a participant had responded or after 4000 ms had elapsed, the trial would automatically proceed to a blank screen for 250 ms prior to the commencement of the next trial. Following the completion of the word learning task, the experimenter administered the NART, digit span, and nonword repetition tests, and the participant completed the AQ.


Scores from the AQ ranged from 63 to 119 (M = 98.47, SD = 12.67). Gender had no significant impact on AQ score (M males = 101.69, M females = 96.14), t(36) = 1.35, p = .186, ns. Additionally, any potential effects of gender on recall or recognition performance were investigated, with no significant main effects or interactions. Participants were allocated into high and low AQ groups using a median split. Thus, the high AQ group comprised 19 participants who scored 100 or above, and the low AQ group included 19 participants who scored 99 or below on the AQ. A Mann–Whitney test indicated that AQ scores were significantly greater in the high AQ group (Mdn = 107) compared to the low AQ group (Mdn = 87), U = 361.00, p < .001.

An arcsine square root transformation was performed on proportional data for both recall accuracy and recognition accuracy in order to improve normality, as is deemed most appropriate for proportional data (Snedecor and Cochran 1989). Prior to performing the transformation, values of 0 were replaced with [1/(4n)], with n equal to the number of trials in each condition (10). Values of 1 were replaced with [1 − (1/(4n))]. To improve normality for recognition latency, a square root transformation was performed on reaction time data (Howell 2007).
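The transformation described above can be sketched directly from the formulas in the text (the 0 and 1 replacements of 1/(4n) and 1 − 1/(4n), followed by arcsin of the square root of the proportion):

```python
import math

def arcsine_transform(correct, n_trials):
    """Arcsine square-root transform of a proportion correct,
    with the adjustments for proportions of 0 and 1."""
    p = correct / n_trials
    if p == 0.0:
        p = 1 / (4 * n_trials)       # replace 0 with 1/(4n)
    elif p == 1.0:
        p = 1 - 1 / (4 * n_trials)   # replace 1 with 1 - 1/(4n)
    return math.asin(math.sqrt(p))

# n = 10 trials per emotion condition, as in the recall task
print(round(arcsine_transform(0, 10), 3))   # asin(sqrt(0.025)) ≈ 0.159
print(round(arcsine_transform(10, 10), 3))  # asin(sqrt(0.975)) ≈ 1.412
```

The adjustment matters because arcsin(√0) = 0 and arcsin(√1) = π/2 sit at the boundaries of the transform's range, where its variance-stabilising property breaks down for extreme proportions.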


A 5 × 3 × 2 mixed analysis of variance (ANOVA) was performed, with recall (phases 1–5) and emotion (happy, fearful, neutral) as within-subjects factors, and AQ (low, high) as a between-subjects factor. The analysis revealed a main effect of recall, F(4, 144) = 188.69, p < .001, ηp² = 0.840, which was indicative of the improved overall recall performance across learning phases. A main effect of emotion, F(2, 72) = 9.40, p < .001, ηp² = 0.207, was evident, with pairwise comparisons revealing that accuracy was significantly worse in the fearful condition overall, compared to the happy, t(37) = 2.78, p = .009, d = 0.45, 95% CI [0.02, 0.13], and neutral, t(37) = 4.38, p < .001, d = 0.71, 95% CI [0.06, 0.15], conditions. A significant interaction was observed between recall and emotion, F(8, 288) = 2.03, p = .043, ηp² = 0.053, which was further explored with pairwise comparisons at each recall phase (see Table 2). These comparisons revealed that performance in the fearful condition was significantly worse than the neutral condition at all recall phases, except for phase 2, and was significantly worse than the happy condition at recall phase 3. Further, although recall performance in the happy and neutral conditions did not significantly differ during recall phases 1–4, performance in the happy condition was significantly worse than the neutral condition at recall phase 5.

Table 2 Results (t statistic) of pairwise comparisons between each emotion condition, at each recall phase (collapsed across AQ)

An interaction between emotion and AQ was just outside significance, F(2, 72) = 2.87, p = .062, ηp² = 0.076. Due to the a priori predictions relating to AQ, this interaction was further investigated by conducting separate 5 (recall) × 3 (emotion) mixed ANOVAs on the high and low AQ groups. The analysis for the low AQ group produced significant main effects of recall, F(4, 72) = 98.55, p < .001, ηp² = 0.846, and emotion, F(2, 36) = 4.152, p = .024, ηp² = 0.187. Additionally, there was a significant interaction between recall and emotion in this group, F(8, 144) = 2.08, p = .042, ηp² = 0.103. Pairwise comparisons revealed that this group showed worse overall performance in the fear condition compared to neutral, t(18) = 2.63, p = .017, d = 0.60, 95% CI [0.02, 0.16], and significantly worse performance in the happy condition compared to neutral, t(18) = 2.24, p = .038, d = 0.51, 95% CI [0.00, 0.14]. The fear and happy conditions did not differ significantly. Comparisons at each recall phase revealed that the impairment in recall performance in the fear condition compared to neutral for the low AQ group was evident at the third recall phase only, t(18) = 3.63, p = .002, d = 0.83, 95% CI [0.07, 0.25], whereas the impairment in performance in the happy condition compared to neutral was evident at the fifth recall phase only, t(18) = 3.32, p = .004, d = 0.76, 95% CI [0.07, 0.29]. No further comparisons were significant.

The analysis for the high AQ group indicated significant main effects of recall, F(4, 72) = 91.54, p < .001, ηp² = 0.836, and emotion, F(2, 36) = 7.67, p = .002, ηp² = 0.299; however, the interaction between recall and emotion was non-significant in this group, F(8, 144) = 1.48, p = .169, ns. Post-hoc pairwise comparisons revealed that accuracy was significantly lower in the fearful condition overall compared to both the happy, t(18) = 3.32, p = .004, d = 0.76, 95% CI [0.05, 0.21], and neutral, t(18) = 3.50, p = .003, d = 0.80, 95% CI [0.05, 0.19], conditions. The happy and neutral conditions did not differ significantly. Figure 1 presents the means for recall accuracy for both the low and high AQ groups for each emotion condition, at each recall phase.

Fig. 1

Arcsine transformed recall accuracy for each emotion condition, separated into low and high groups based on Autism-Spectrum Quotient (AQ) score


Three participants from the low AQ group failed to complete the recognition task due to technical difficulties. Thus, the recognition accuracy analysis included 16 participants in the low AQ group and 19 participants in the high AQ group.


For recognition accuracy, errors included incorrect responses in addition to trials in which a response was not given within the 4000-ms time limit. The overall error rate was 13.8%.

A 3 × 2 (emotion × AQ) mixed ANOVA produced a significant main effect of emotion, F(2, 66) = 12.04, p < .001, ηp² = 0.267. Post-hoc pairwise comparisons revealed that accuracy was significantly worse in the fearful condition compared to both the neutral condition, t(34) = 4.42, p < .001, d = 0.75, 95% CI [0.08, 0.21], and the happy condition, t(34) = 3.76, p = .001, d = 0.64, 95% CI [0.06, 0.20]. There was no significant difference in recognition accuracy between the happy and neutral conditions, and there was no significant interaction with AQ, F(2, 66) = 2.70, p = .075, ns. Table 3 shows the arcsine transformed means and standard deviations for recognition accuracy for each AQ group in each emotion condition.

Table 3 Arcsine transformed mean recognition accuracy as a function of emotion and AQ (SD in parenthesis)
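Accuracy proportions were arcsine transformed before analysis, a standard variance-stabilising step for proportion data (see Snedecor and Cochran 1989 in the reference list). The paper does not report the exact formula used, so the common form 2·arcsin(√p) below is an illustrative assumption:

```python
import math

def arcsine_transform(p: float) -> float:
    """Variance-stabilising arcsine transform for a proportion p in [0, 1].

    Uses the common form 2 * arcsin(sqrt(p)) (Snedecor & Cochran, 1989).
    The exact variant used in the paper is not reported, so this is a
    sketch, not the authors' implementation.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion in [0, 1]")
    return 2.0 * math.asin(math.sqrt(p))
```

The transform stretches the scale near 0 and 1, where proportion variance shrinks, which makes near-ceiling recognition scores better behaved in the ANOVA.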

Reaction time (RT)

Only correct trials were analysed for RT, and further trials were excluded if the RT exceeded two standard deviations from the condition RT mean for a given participant (1.35% of otherwise eligible trials). Additionally, participants were removed from the reaction time analysis if their data did not contribute six or more items to each emotion condition. As a result, data from eight participants were removed from the reaction time analysis: three from the low AQ group and five from the high AQ group. Thus, the reaction time analysis included 13 participants in the low AQ group and 14 participants in the high AQ group.
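The trial-level trimming rule described above can be sketched per participant-by-condition cell. Details such as whether exclusion is applied in a single pass are not stated in the paper, so this single-pass version is an assumption:

```python
from statistics import mean, stdev

def trim_rts(rts, sd_cutoff=2.0):
    """Drop RTs more than `sd_cutoff` SDs from the mean of one
    participant-by-condition cell (single-pass; the paper does not
    specify the implementation, so this is illustrative only)."""
    if len(rts) < 2:
        return list(rts)  # too few trials to estimate a spread
    m, s = mean(rts), stdev(rts)
    return [rt for rt in rts if abs(rt - m) <= sd_cutoff * s]
```

Applied to each of a participant's three emotion conditions in turn, this mirrors the reported exclusion of 1.35% of otherwise eligible trials.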

A 3 × 2 (emotion × AQ) mixed ANOVA produced a main effect of emotion, F(2, 50) = 7.97, p = .001, ηp² = 0.242. Follow-up pairwise comparisons revealed that reaction times were significantly slower in the fearful condition (M = 2223 ms, SD = 774) compared to both the happy condition (M = 2015 ms, SD = 658), t(26) = 3.26, p = .003, d = 0.63, 95% CI [0.79, 3.50], and the neutral condition (M = 1988 ms, SD = 575), t(26) = 3.18, p = .004, d = 0.61, 95% CI [0.81, 3.78]. There were no further significant main or interaction effects for reaction time.

Additional measures

Pearson bivariate correlation analyses were performed for each AQ group separately to determine the relationships between overall recall performance, digit span scores, nonword repetition accuracy, predicted verbal IQ based on the number of errors on the NART (Nelson and Willison 1991), and years of education. For the low AQ group, recall performance was significantly positively associated with the Digit Span forward and backward tests (r = 0.67, p = .002; r = 0.49, p = .032, respectively) and nonword repetition (r = 0.51, p = .027), such that higher performance on these measures was related to higher recall performance. Verbal IQ scores were not significantly correlated with recall performance (r = 0.38, p = .109). Years of education was not correlated with any measure for the low AQ group.

For the high AQ group, overall recall performance was significantly positively associated with only the Digit Span forward test (r = 0.49, p = .035). There were no other significant correlations in this group.

Independent samples t-tests compared scores on these measures between the high and low AQ groups. There were no significant differences on the Digit Span forward and backward tests, the nonword repetition task, or in years of education (all ts(36) < 0.87, ps > .397). Verbal IQ scores based on NART performance differed significantly between the groups, with the high AQ group showing higher verbal IQ, t(36) = 2.05, p = .048, d = 0.67, 95% CI [0.04, 7.96]. Table 4 presents the means and standard deviations of each additional measure for the high and low AQ groups.

Table 4 Mean (SD in parenthesis) Digit Span scores, nonword repetition performance, verbal IQ scores, and years of education for the high and low AQ groups
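The group comparisons above are standard independent-samples t-tests with 36 degrees of freedom (two groups of 19). The software used is not reported; a minimal pooled-variance (Student's) sketch of the statistic is:

```python
from statistics import mean
import math

def independent_t(a, b):
    """Student's independent-samples t statistic with pooled variance.

    A sketch of the comparison reported in the text, assuming equal
    variances; the authors' actual software and settings are unknown.
    Returns (t, df).
    """
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    # Pooled sum of squared deviations across the two groups
    ssa = sum((x - ma) ** 2 for x in a)
    ssb = sum((x - mb) ** 2 for x in b)
    df = na + nb - 2
    sp2 = (ssa + ssb) / df  # pooled variance estimate
    t = (ma - mb) / math.sqrt(sp2 * (1 / na + 1 / nb))
    return t, df
```

With 19 participants per group, df = 19 + 19 − 2 = 36, matching the reported t(36) values.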


This study investigated the influence of vocal emotional prosody on word learning ability in neurotypical adults with higher or lower levels of autism-like traits. It was first hypothesised that, for all participants, words spoken with emotional prosody, particularly negative emotional prosody, would be learned more effectively than words spoken with neutral prosody. The results did not support this hypothesis. However, the hypothesis that those participants with high AQ scores would be less affected by the emotional information than those with low AQ scores was supported.

Overall, emotional information interfered with word learning processes. Much of the previously discussed research showing that emotional information facilitates learning and memory, such as the studies by Hooker et al. (2008), Eden et al. (2015), and Berman et al. (2013), has utilised tasks which require explicit association of the novel stimuli with emotion. Thus, an attentional bias toward emotional information in those tasks was beneficial for linking the novel stimuli with that information. However, in the current study, there was no benefit to attending to the emotional information presented in the task. There is evidence to suggest that emotion can interfere with the processing of unrelated stimuli via the modulation of attention (Anderson and Shimamura 2005; Christianson 1992; Touryan et al. 2007), which may explain the negative impact of emotional prosody on word learning in the present study.

The attentional narrowing hypothesis (Christianson 1992) suggests that automatic attention toward emotional information reduces the processing of irrelevant and non-central information. In support of this, Anderson and Shimamura (2005) presented participants with short video clips that were emotionally positive, negative, or neutral in nature, accompanied by voice clips saying neutral words. Participants were then tested on their memory for the words they heard. Performance was worse for words played during the negative video clips compared to the other clips. Similarly, Touryan et al. (2007) presented participants with negative emotional images and neutral images, each paired with an unrelated object. Participants were then tested on their memory for the images and the associations between the images and the objects. Memory was better for negative emotional images compared to neutral images; however, the associations with the objects were remembered less well when the image was negative. The results of both Anderson and Shimamura (2005) and Touryan et al. (2007) suggest that negative information captures attention more strongly than neutral information, but that this serves as a distraction from associating the stimuli with non-emotional stimuli. These studies are distinct from previously discussed work suggesting a memory advantage for emotional information when the emotion is relevant and central to the task (e.g., Berman et al. 2013; Hooker et al. 2008; Kensinger and Corkin 2003; Kousta et al. 2009). Thus, evidence suggests that task design has a critical influence on the impact of emotional information on performance. In the current study, the emotional prosody in which the words were spoken had no relevance to the word itself or to its association with a given alien. Thus, the emotional prosody captured attention in a way that distracted from the task of associating the words with aliens and hindered learning performance.

With regard to particular emotions, learning was worse for words spoken with a fearful prosody relative to words with happy or neutral prosody across recall phases, whereas recall performance for words spoken with happy prosody was worse than for those spoken with neutral prosody only in the final recall phase of the task. Thus, the emotional interference effect was particularly evident in the fearful condition, suggesting a stronger attentional bias toward negative information in this particular paradigm. This finding supports the negativity bias found in previous research, and the view that it is more advantageous for survival to attend quickly to negative emotional information than to positive information (Kuperman 2013; Peeters and Czapinski 1990). It is interesting to consider the implications of these findings in the context of word learning. This study is unique in investigating the influence of emotional information on word learning, and yet varied degrees of emotional prosody are naturally present in everyday speech. Therefore, it is logical to expect that some type of emotional prosody is often present in circumstances of new word learning. The current findings are the first to suggest that consideration should be given to the emotional context within which language learning occurs. It would be interesting to investigate the effects of a broader range of emotions.

Pertaining to the second hypothesis, emotional prosody had a differing influence on recall performance in those with higher or lower levels of autism-like traits. The interference effect found in the fearful condition was evident for all participants; however, performance in the happy condition was lower than neutral only for those with lower levels of autism-like traits. The findings of the current study are consistent with observations suggesting a reduced emotion bias in individuals with ASD (Beversdorf et al. 1998; Deruelle et al. 2008; Thurman et al. 2015), and extend this research by showing that word learning is impaired by positive emotional information in neurotypical adults with fewer autism-like traits but not in those with more autism-like traits. This research can inform knowledge of the underlying characteristics of ASD and the BAP (Landry and Chouinard 2016), and suggests that a weaker emotion bias is related to the BAP and may be either a consequence of autism-like traits or an underlying factor. The lack of difference between the groups in the fearful condition supports the notion that fearful stimuli capture attention more strongly than happy stimuli; the attentional bias toward this information was equally strong regardless of level of autism-like traits. It would be interesting for future research to determine the comparative performance of individuals with ASD in this paradigm. Further, it may be important to determine whether similar influences of emotional prosody on word learning would be observed in children, as childhood is a more critical period for language development, and such research could therefore have wide implications.

The recognition task revealed close to ceiling performance, particularly in the happy and neutral conditions. This is an unsurprising finding given that this task was completed after five learning and recall sequences, and recognition involves less cognitive effort than recall (Anderson and Bower 1972). However, in revealing less accurate and slower performance for the fearful condition compared to the happy and neutral conditions, the recognition results further support the notion that fearful information demands greater attention than happy information. The simplicity of the recognition task can explain the lack of effects for the happy condition, as well as the lack of difference between levels of autism-like traits.

Overall, this research implies that emotional information can interfere with the processes involved in new word learning. This paper has argued that this interference arises because increased attention is allocated to the emotional properties of the stimuli, distracting from their other properties. The results of this study demonstrate that, for neurotypical adults with low levels of autism-like traits, words spoken in a happy emotional tone are more difficult to learn than words spoken in a neutral tone. However, those with higher levels of autism-like traits show no such difficulty learning happy-sounding words relative to neutral words, suggesting that they attend less to emotional signals. Potential relationships between such findings and symptoms of alexithymia could be considered in future research. Despite this difference with happy-sounding words, all participants showed greater difficulty learning words spoken in a fearful emotional tone compared to words spoken in a neutral tone. This finding suggests that fearful information captures attention in neurotypical adults regardless of individual sensitivity to emotion or level of autism-like traits.


  1. Anderson, J. R., & Bower, G. H. (1972). Recognition and retrieval processes in free recall. Psychological Review, 79(2), 97–123. doi:10.1037/h0033773.

  2. Anderson, L., & Shimamura, A. P. (2005). Influences of emotion on context memory while viewing film clips. The American Journal of Psychology, 118(3), 323–337.

  3. Angwin, A. J., Phua, B., & Copland, D. A. (2014). Using semantics to enhance new word learning: An ERP investigation. Neuropsychologia, 59, 169–178. doi:10.1016/j.neuropsychologia.2014.05.002.

  4. Austin, E. J. (2005). Personality correlates of the broader autism phenotype as assessed by the Autism Spectrum Quotient (AQ). Personality and Individual Differences, 38, 451–460. doi:10.1016/j.paid.2004.04.022.

  5. Baddeley, A., Gathercole, S., & Papagno, C. (1998). The phonological loop as a language learning device. Psychological Review, 105(1), 158–173. doi:10.1037/0033-295X.105.1.158.

  6. Baron-Cohen, S., Wheelwright, S., Skinner, R., Martin, J., & Clubley, E. (2001). The Autism-Spectrum Quotient (AQ): Evidence from Asperger syndrome/high-functioning autism, males and females, scientists and mathematicians. Journal of Autism and Developmental Disorders, 31(1), 5–17. doi:10.1023/A:1005653411471.

  7. Begeer, S., Koot, H. M., Rieffe, C., Terwogt, M. M., & Stegge, H. (2008). Emotional competence in children with autism: Diagnostic criteria and empirical evidence. Developmental Review, 28, 342–369. doi:10.1016/j.dr.2007.09.001.

  8. Berman, M. J., Graham, S. A., Callaway, D., & Chambers, C. G. (2013). Preschoolers use emotion in speech to learn new words. Child Development, 85(5), 1791–1805. doi:10.1111/cdev.12074.

  9. Beversdorf, D. Q., Anderson, J. M., Manning, S. E., Anderson, S. L., Nordgren, R. E., Felopulos, G. J., & Bauman, M. L. (1998). The effect of semantic and emotional context on written recall for verbal language in high functioning adults with autism spectrum disorder. Journal of Neurology, Neurosurgery, & Psychiatry, 65, 685–692. doi:10.1136/jnnp.65.5.685.

  10. Bower, G. H. (1981). Mood and memory. American Psychologist, 36(2), 129–148. doi:10.1037/0003-066X.36.2.129.

  11. Calev, A. (1996). Affect and memory in depression: Evidence of better delayed recall of positive than negative affect words. Psychopathology, 29, 71–76. doi:10.1159/000284974.

  12. Christianson, S.-A. (1992). Emotional stress and eyewitness memory: A critical review. Psychological Bulletin, 112(2), 284–309. doi:10.1037/0033-2909.112.2.284.

  13. Craik, F. I. M., & Tulving, E. (1975). Depth of processing and the retention of words in episodic memory. Journal of Experimental Psychology: General, 104(3), 268–294. doi:10.1037/0096-3445.104.3.268.

  14. Deruelle, C., Hubert, B., Santos, A., & Wicker, B. (2008). Negative emotion does not enhance recall skills in adults with autistic spectrum disorders. Autism Research, 1, 91–96. doi:10.1002/aur.13.

  15. Eden, A. S., Dehmelt, V., Bischoff, M., Zwitserlood, P., Kugel, H., Keuper, K., Zwanzger, P., & Dobel, C. (2015). Brief learning induces a memory bias for arousing-negative words: An fMRI study in high and low trait anxious persons. Frontiers in Psychology, 6(1226), 1–15. doi:10.3389/fpsyg.2015.01226.

  16. Golan, O., Baron-Cohen, S., & Hill, J. (2006). The Cambridge Mindreading (CAM) face-voice battery: Testing complex emotion recognition in adults with and without Asperger Syndrome. Journal of Autism and Developmental Disorders, 36(2), 169–183. doi:10.1007/s10803-005-0057-y.

  17. Gupta, P. (2003). Examining the relationship between word learning, nonword repetition, and immediate serial recall in adults. The Quarterly Journal of Experimental Psychology, 56A(7), 1213–1236. doi:10.1080/02724980343000071.

  18. Gupta, P., Lipinski, J., Abbs, B., Lin, P.-H., Aktunc, E., Ludden, D., Martin, N., & Newman, R. (2004). Space aliens and nonwords: Stimuli for investigating the learning of novel word-meaning pairs. Behaviour Research Methods, Instruments, & Computers, 36(4), 599–603. doi:10.3758/BF03206540.

  19. Hermans, E. J., van Wingen, G., Bos, P. A., Putman, P., & van Honk, J. (2009). Reduced spontaneous facial mimicry in women with autistic traits. Biological Psychology, 80, 348–353. doi:10.1016/j.biopsycho.2008.12.002.

  20. Hinojosa, J. A., Méndez-Bértolo, C., & Pozo, M. A. (2010). Looking at emotional words is not the same as reading emotional words: Behavioral and neural correlates. Psychophysiology, 47, 748–757. doi:10.1111/j.1469-8986.2010.00982.x.

  21. Hofmann, M. J., Kuchinke, L., Tamm, S., Võ, M. L. H., & Jacobs, A. M. (2009). Affective processing within 1/10th of a second: High arousal is necessary for early facilitative processing of negative but not positive words. Cognitive, Affective, & Behavioral Neuroscience, 9(4), 389–397. doi:10.3758/9.4.389.

  22. Hooker, C. I., Verosky, S. C., Miyakawa, A., Knight, R. T., & D’Esposito, M. (2008). The influence of personality on neural mechanisms of observational fear and reward learning. Neuropsychologia, 46, 2709–2724. doi:10.1016/j.neuropsychologia.2008.05.005.

  23. Hosokawa, M., Nakadoi, Y., Watanabe, Y., Sumitani, S., & Ohmori, T. (2015). Association of autism tendency and hemodynamic changes in the prefrontal cortex during facial expression stimuli measured by multi-channel near-infrared spectroscopy. Psychiatry and Clinical Neurosciences, 69, 145–152. doi:10.1111/pcn.12240.

  24. Howell, D. C. (2007). Statistical methods for psychology. Belmont, CA: Cengage Learning.

  25. Ingersoll, B. (2010). Broader autism phenotype and nonverbal sensitivity: Evidence for an association in the general population. Journal of Autism and Developmental Disorders, 40, 590–598. doi:10.1007/s10803-009-0907-0.

  26. Kaushanskaya, M., & Yoo, J. (2011). Rehearsal effects in adult word learning. Language and Cognitive Processes, 26(1), 121–148. doi:10.1080/01690965.2010.486579.

  27. Kensinger, E., & Corkin, S. (2003). Memory enhancement for emotional words: Are emotional words more vividly remembered than neutral words? Memory & Cognition, 31(8), 1169–1180.

  28. Kensinger, E. A., Garoff-Eaton, R. J., & Schacter, D. L. (2007). Effects of emotion on memory specificity in young and older adults. Journal of Gerontology: Psychological Sciences, 62(4), 208–215. doi:10.1093/geronb/62.4.P208.

  29. Kousta, S. T., Vinson, D. P., & Vigliocco, G. (2009). Emotion words, regardless of polarity, have a processing advantage over neutral words. Cognition, 112(3), 473–481. doi:10.1016/j.cognition.2009.06.007.

  30. Kuperman, V. (2013). Accentuate the positive: Semantic access in English compounds. Frontiers in Psychology, 4(203), 1–10. doi:10.3389/fpsyg.2013.00203.

  31. Landry, O., & Chouinard, P. A. (2016). Why we should study the broader autism phenotype in typically developing populations. Journal of Cognition and Development, 17(4), 584–595. doi:10.1080/15248372.2016.1200046.

  32. Moses, L. J., Baldwin, D. A., Rosicky, J. G., & Tidball, G. (2001). Evidence for referential understanding in the emotions domain at twelve and eighteen months. Child Development, 72(3), 718–735. doi:10.1111/1467-8624.00311.

  33. Nelson, H. E., & Willison, J. R. (1991). The revised national adult reading test. Windsor: NFER Nelson.

  34. Nuske, H. J., Vivanti, G., & Dissanayake, C. (2013). Are emotion impairments unique to, universal, or specific in autism spectrum disorder? A comprehensive review. Cognition and Emotion, 27(6), 1042–1061. doi:10.1080/02699931.2012.762900.

  35. Peeters, G., & Czapinski, J. (1990). Positive-negative asymmetry in evaluations: The distinction between affective and informational negativity effects. European Review of Social Psychology, 1, 33–60. doi:10.1080/14792779108401856.

  36. Poljac, E., Poljac, E., & Wagemans, J. (2012). Reduced accuracy and sensitivity in the perception of emotional facial expressions in individuals with high autism spectrum traits. Autism: The International Journal of Research and Practice, 17(6), 668–680. doi:10.1177/1362361312455703.

  37. Psychology Software Tools, Inc. [E-Prime 2.0]. (2012). Retrieved from

  38. Reinisch, E., Jesse, A., & Nygaard, L. C. (2013). Tone of voice guides word learning in informative referential contexts. The Quarterly Journal of Experimental Psychology, 66(6), 1227–1240. doi:10.1080/17470218.2012.736525.

  39. Schirmer, A., & Kotz, S. A. (2003). ERP evidence for a sex-specific stroop effect in emotional speech. Journal of Cognitive Neuroscience, 15(8), 1135–1148. doi:10.1162/089892903322598102.

  40. Shintel, H., Anderson, N. L., & Fenn, K. M. (2014). Talk this way: The effect of prosodically conveyed semantic information on memory for novel words. Journal of Experimental Psychology: General, 143(4), 1437–1442. doi:10.1037/a0036605.

  41. Shriberg, L. D., Paul, R., McSweeny, J. L., Klin, A., Cohen, D. J., & Volkmar, F. R. (2001). Speech and prosody characteristics of adolescents and adults with high-functioning autism and Asperger Syndrome. Journal of Speech, Language, and Hearing Research, 44, 1097–1115. doi:10.1044/1092-4388(2001/087).

  42. Snedecor, G., & Cochran, W. G. (1989). Statistical methods (8th ed.). Ames, IA: Iowa State University Press.

  43. Thurman, A. J., McDuffe, A., Kover, S. T., Hagerman, R., Channell, M. M., Mastergeorge, A., & Abbeduto, L. (2015). Use of emotional cues for lexical learning: A comparison of autism spectrum disorder and fragile X syndrome. Journal of Autism and Developmental Disorders, 45, 1042–1061. doi:10.1007/s10803-014-2260-1.

  44. Touryan, S. R., Marian, D. E., & Shimamura, A. P. (2007). Effect of negative emotional pictures on associative memory for peripheral information. Memory, 15(2), 154–166. doi:10.1080/09658210601151310.

  45. Vaish, A., Grossmann, T., & Woodward, A. (2008). Not all emotions are created equal: The negativity bias in social-emotional development. Psychological Bulletin, 134(3), 383–403. doi:10.1037/0033-2909.134.3.383.

  46. Wang, B., & Fu, X. (2011). Time course effects of emotion on item memory and source memory for Chinese words. Neurobiology of Learning and Memory, 95, 415–424. doi:10.1016/j.nlm.2011.02.001.

  47. Wing, L. (1988). The continuum of autistic characteristics. In E. Schopler & G. B. Mesibov (Eds.), Diagnosis and assessment in autism (pp. 91–110). New York: Springer.



We thank acting student Raechyl French for providing voice recordings of the emotional stimuli. Author 2 was supported by a UQ Vice Chancellor’s Fellowship.

Author information



Corresponding author

Correspondence to Melina J. West.

Ethics declarations

Conflict of interest

The authors report no conflicts of interest.

Ethical approval

All procedures performed in this study were in accordance with the ethical standards of the University of Queensland and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.


About this article


Cite this article

West, M.J., Copland, D.A., Arnott, W.L. et al. Effects of emotional prosody on novel word learning in relation to autism-like traits. Motiv Emot 41, 749–759 (2017).



  • Emotion
  • Word learning
  • Autism
  • Broader autism phenotype
  • Prosody