Introduction

The ability to recognize and interpret emotion portrayed via facial and vocal expressions is important to our overall wellbeing (Barrett et al., 2011; Da Fonseca et al., 2009). Accurate decoding of facial and vocal emotion cues allows us to respond appropriately to the feelings of others, which increases our capacity to build positive relationships with family, peers, teachers, and colleagues (Bloom & Heath, 2010; Laukka et al., 2005). Difficulties with emotion recognition can impact academic and employment success (Byron et al., 2007; Lopes et al., 2006; Riggio, 2006), put people at risk for bullying (Copeland et al., 2013; Woods et al., 2009), and contribute to issues with self-esteem, anxiety, and depression (Pickett et al., 2004; Warnes et al., 2005). Adolescence is a developmental period in which emotion recognition is particularly important because it is during this stage of life that children begin to detach from their parents and build stronger, more complex relationships with peers (Forbes & Dahl, 2012; Vetter et al., 2018). However, how this skill develops and changes across different periods of the lifespan remains ambiguous due to a range of methodology-related inconsistencies across studies. The overall purpose of this systematic review is to synthesize and evaluate the current literature on emotion recognition in adolescents to identify the typical pattern of recognition for facial and vocal emotion expressions. Classifying these recognition patterns may support intervention programs, an important consideration given that adolescence is associated with increased vulnerability to social-emotional disorders (Scherf et al., 2012; Steinberg, 2005). Moreover, there is concern that many mental health issues in adults originate in adolescence (Blakemore & Mills, 2014; Paus et al., 2008).

Adolescence begins with the onset of puberty, which occurs, on average, at approximately 11 years of age (Foulkes & Blakemore, 2018). During this period of life there are not only considerable hormonal changes but also changes in both brain structure and function, particularly in regions of the brain fundamental to processing emotion cues in the face and voice (Blakemore et al., 2010; Giedd et al., 1999; Gogtay et al., 2004; Golarai et al., 2010; Kilford et al., 2016; Monk, 2008; Scherf et al., 2012). Blakemore and Mills (2014) refer to these regions as the social brain network, a network that undergoes developmental changes from childhood through to early adulthood, as reflected in changes in grey matter volume and cortical thickness (Andrews et al., 2020; Blakemore & Mills, 2014). The brain and hormonal changes that occur during adolescence are thought to result in maturation of cognitive processes important to emotion recognition, such as face processing, attention, and memory, which can influence the strategies adolescents use to process emotional cues (Blakemore & Mills, 2014; Crone & Dahl, 2012; Dan & Raz, 2018; Kilford et al., 2016; Scherf et al., 2012).

Recognizing emotions in others also requires adolescents to encode and appraise available cues to make inferences about the emotion being portrayed, processes that are associated with cognitive development (Larsen & Luna, 2018; Scherer et al., 2019). Given that cognitive processes associated with processing emotion cues, such as attentional and working memory abilities, continue to mature during adolescence, ongoing development of facial and vocal emotion recognition might be expected. Traditionally, general cognitive development theory held that face perception is fully mature by 5 years of age (McKone et al., 2012), but more recent evidence suggests that development continues until age 15 (Meinhardt-Injac et al., 2020). Facial emotion recognition, which involves processing interdependent with face perception, has been reported to be fully developed by age 10 (Gao & Maurer, 2009) or 12 (Durand et al., 2007). However, there is some behavioral evidence suggesting that facial emotion recognition abilities continue to develop until age 15 (Herba et al., 2006) or even later (Thomas et al., 2007). Vocal emotion recognition likewise requires interdependent processing: perception of the key features contained in the signal (i.e., acoustic cues) alongside higher-order evaluation and interpretation of those cues (Lavan & Lima, 2014). Not surprisingly, then, there are similar discrepancies in the literature regarding the development of vocal emotion recognition, with some studies reporting ongoing development until around 12 years of age (Brosgole & Weisman, 1995; Zupan, 2015) and others reporting improvements as late as age 17 (Morningstar et al., 2019). In addition to cognitive development, these discrepancies may also be connected to changes in the social brain network that make adolescents sensitive to peer acceptance and social cues (Blakemore & Mills, 2014). However, development of the recognition of facial and vocal cues associated with more complex, social emotions remains understudied (Burnett et al., 2009).

The overall developmental course of adolescents' recognition of facial and vocal emotion expressions remains ambiguous for a number of reasons, a primary one being variation in how age groups are operationalized across studies, as well as in the age ranges examined. For instance, participants of the same age may be categorized as children in one study but adolescents in another. Moreover, the overall span of ages included in those groups can vary widely (e.g., including participants from 4 to 15 years of age in the same group), making it difficult to gain a full understanding of how the ongoing brain and pubertal changes of adolescence may impact emotion recognition. Additionally, since research has often focused on younger children's emotion recognition, studies may find that children do not perform as well as adults, but the age range of participants does not allow for exploration of ongoing developmental changes (e.g., Chronaki et al., 2014).

Another important consideration for facial and vocal emotion recognition in adolescents is that the developmental pattern of recognition may vary depending on the specific emotion. Research suggests that recognition of some emotion categories develops earlier than others, with different emotions more easily recognized in the face versus the voice (Chronaki et al., 2014). For instance, Happy is most easily identified in the face by both children (Gao & Maurer, 2009; Herba & Phillips, 2004) and adults (Franklin & Zebrowitz, 2017). In the voice, Sad is most easily identified by children (Nelson & Russell, 2011) and Angry by adults (Zupan et al., 2009); however, the age at which this pattern of recognition shifts is not clear. It is also unclear whether these patterns remain consistent across the period of adolescence. The importance of emotion type is particularly evident when considering possible sex differences. Females have been shown to have greater emotion recognition accuracy for both facial (Collignon et al., 2010; McClure, 2000) and vocal (Collignon et al., 2010; Grosbras et al., 2018) emotion expressions. Thompson and Voyer (2014) reported that this effect is particularly evident between the ages of 13 and 30, with the largest effect shown in the recognition of angry emotion expressions.

The emotion portrayed via each modality can also vary in the type and strength of cues available; these factors cannot be ignored as potential moderators of emotion recognition in adolescents. For instance, some facial emotion recognition studies use still photographs (e.g., Zupan, 2015) or photographs that change from one emotion expression to another (i.e., morphing) as stimuli (e.g., Sully et al., 2015), whereas others use fully dynamic images (e.g., Wieckowski & White, 2017). Vocal emotion recognition studies may include nonlinguistic stimuli such as vocal bursts (e.g., laughs or cries; Amorim et al., 2021), or linguistic stimuli with pseudo (e.g., Filippa et al., 2022) or semantically neutral content (e.g., Nelson & Russell, 2011). Emotion expressions may also vary in intensity. Research in facial emotion recognition generally shows better recognition accuracy for stimuli with higher emotional intensity (Montirosso et al., 2010; Picardo et al., 2016); however, research in vocal emotion recognition suggests that stimulus intensity may differentially impact the recognition of different emotion categories (Morningstar et al., 2021).

In terms of the emotion categories typically studied, emotion recognition research has generally focused on the "basic" emotions: Happy, Sad, Anger, Fear, Disgust, and Surprise (Ekman, 1992). Not only has research restricted the range to these few emotions, but studies often include only a subset of them. Given that there may be different developmental patterns depending on emotion type, this practice is particularly problematic. For example, Happy is the only positive basic emotion, so its recognition requires only a valence judgment (i.e., positive vs. negative emotion), a distinction even infants can make with facial expressions (Soken & Pick, 1992; Young-Browne et al., 1977). The conclusion that Happy is one of the earliest facial expressions recognized may therefore be an artifact of experimental design. In addition to the imbalance between emotions of positive versus negative valence, the practice of focusing on (subsets of) basic emotions when trying to understand emotion recognition development in adolescence does not account for adolescents' heightened sensitivity to social information. For instance, developmental changes in the social brain network are proposed to increase adolescents' sensitivity to peer evaluation and rejection (Keltner et al., 2019), making them more vulnerable to the experience of social emotions (e.g., shame, desire; Burnett et al., 2009; Garcia & Scherf, 2015). These social emotions are complex, composed of more nuanced cues, and reported to continue developing beyond the period of adolescence (Meinhardt-Injac et al., 2020). Thus, limiting studies to basic emotions may not capture ongoing emotion recognition development across the adolescent period. A final caveat to consider is the specific set of emotions included. Research has found that, even among the basic emotions, the choice of emotions to compare within an identification task can influence participants' performance regardless of their age (Hayes et al., 2020; Zupan et al., 2023).

The Current Study

The ability to recognize emotion in others is central to one's overall well-being and may be particularly important during adolescence to support developing relationships with peers. There are currently no systematic reviews exploring recognition of specific emotions in facial and/or vocal expressions by typically developing adolescents. Thus, the primary aim of this review was to systematically evaluate the current literature to clarify the pattern of development for recognition of facial and vocal emotion expressions in typically developing adolescents. Given the notable differences found in the literature between these two modalities, it was hypothesized that each modality would yield a different pattern of recognition. Based on existing literature in emotion development showing different patterns of recognition for children versus adults, it was also hypothesized that the pattern of recognition in each modality would differ across the adolescent period. Specifically, it was hypothesized that different patterns of recognition would be evident for early versus mid- versus late adolescents, indicating ongoing development of emotion recognition across adolescence. In addition to exploring the overall pattern of recognition for facial versus vocal emotion expressions, this review also aimed to examine how recognition performance in each modality relates to task characteristics (i.e., emotion set) and characteristics of the emotion expressed (i.e., cue type; emotion intensity) across adolescents of different ages. It was hypothesized that the same general pattern of recognition would hold across these variables, such that the emotions found to be easiest and hardest to identify in each modality overall would be retained. The overall purpose of this systematic review is to provide a better understanding of the development of emotion recognition during a critical period of social and emotional development, as well as to identify areas that require further investigation.

Method

This systematic review was conducted and reported using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (Moher et al., 2009; Page et al., 2021). The original systematic review protocol was posted to Open Science Framework (https://osf.io/jgkp5/) and includes the research question, eligibility criteria, information sources, and search terms.

Sources of Information and Search Strategy

Systematic searches were conducted in May 2021, and again in December 2022, in six databases: ERIC, PsycARTICLES, Psychology and Behavioural Sciences Collection (Ebscohost), PsycINFO, PubMed, and Scopus. The Patient, Intervention, Comparison, Outcome (PICO) model (Eriksen & Frandsen, 2018) was used to structure the search strategy, which was adapted per the guidelines of individual databases. Search terms were related to adolescents and facial and vocal emotion recognition abilities (see the example for PsycINFO in Table 1). Backward citation chaining was also applied.

Table 1 Search strategy example for PsycINFO

Eligibility Criteria

Studies identified through the systematic database searches were included if they were peer-reviewed, original studies published in English. Given the individual variability in when people transition in and out of adolescence, it is not surprising that the age range specified for adolescence varies widely in the literature. For the purpose of this review, inclusion criteria required that studies report data specific to adolescents between 11 and 18 years of age, representing the period between the average age of puberty onset and the beginning of adulthood (Blakemore et al., 2010). Although definitions of adolescence often extend the period until an individual reaches adult independence (Foulkes & Blakemore, 2018), and thus can go beyond the age of 18, a scoping search of the literature showed that studies identified individuals in this latter range of adolescence as adults. Studies that focused on special populations were included if data were reported separately for adolescents with no concomitant disorders. In relation to emotion recognition, studies were excluded if they measured face recognition without the presence of emotion or if the facial emotion stimuli were not human faces (e.g., avatars; cartoon drawings). Similarly, studies measuring vocal recognition without the presence of emotion (e.g., suprasegmentals) and/or without a human voice (e.g., emotional music) were also excluded. Study tasks and/or measures needed to focus on accuracy (e.g., identification, labelling) of emotions. Studies that used only matching, same-different tasks, or memory-based tasks were excluded. Since the aim of this review was to report on recognition of facial versus vocal emotion expressions during the adolescent period, studies that reported only on valence and/or intensity of emotion were also excluded, as were studies that reported on auditory-visual recognition of emotion. Authors were contacted to provide full texts when they could not be accessed elsewhere and/or to provide data specific to each emotion when only a total score was reported. Studies for which the full text or emotion-specific data could not be sourced were excluded.

Study Selection

After removing duplicates (n = 703), the first author uploaded the remaining records identified in the database searches to the systematic review web application Rayyan (Ouzzani et al., 2016). Both authors initially screened the abstracts of 10% of the studies against the inclusion and exclusion criteria using blind review. Following this, the authors met to conduct a reliability check in which screening decisions and eligibility criteria were reviewed to ensure consistency in decision making. Records whose abstracts did not contain sufficient detail for an informed decision (e.g., participant age range not specified) were retained for full-text screening. Both authors then conducted blind review of the remaining abstracts. Following screening, discrepancies were resolved by consultation to achieve agreement, resulting in 134 studies for blinded full-text review. Both authors reviewed all full texts against the eligibility criteria. Twelve discrepancies were identified and resolved via discussion until mutual agreement was reached.

Methodological Evaluation and Quality Appraisal

The quality of included studies was assessed using the Mixed Methods Appraisal Tool (MMAT), Version 2018 (Hong et al., 2018). This tool was chosen because it includes appraisal criteria for different study designs. The MMAT requires appraisal of seven criteria for each type of study design. The first two criteria remain the same regardless of study design: rating the clarity of the research question and identifying whether the collected data allow the question to be sufficiently answered. The remaining five criteria differ according to study design. All criteria are rated using one of three responses: Yes, No, or Can't Tell, scored 1, 0, and 0, respectively. Using the total score on the five design-specific questions, each study was classified as high (100%), moderate (60–80%), or low (below 60%) in quality. To ensure reliability of the final ratings, each author rated the quality of 50% of the studies and then reviewed the other author's ratings of the remaining 50%. Any discrepancies in ratings were then discussed.
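Expressed procedurally, this classification reduces to a simple threshold rule. The sketch below (Python; the function name, input format, and example ratings are ours for illustration and are not part of the MMAT itself) shows how the five design-specific ratings map onto the three quality bands used in this review.

```python
# Minimal sketch of the MMAT quality classification described above.
# The helper name and list-of-strings input format are illustrative only.

def mmat_quality(design_ratings: list[str]) -> str:
    """Classify study quality from the five design-specific MMAT criteria.

    Each criterion is rated "Yes" (scored 1), "No" (0), or "Can't Tell" (0).
    """
    assert len(design_ratings) == 5, "five design-specific criteria expected"
    score = sum(rating == "Yes" for rating in design_ratings)
    percent = score / 5 * 100
    if percent == 100:
        return "high"          # all five criteria met
    if percent >= 60:
        return "moderate"      # 60-80%, i.e., three or four criteria met
    return "low"               # below 60%, i.e., two or fewer criteria met

print(mmat_quality(["Yes", "Yes", "Yes", "Yes", "Can't Tell"]))  # -> moderate
```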

Data Extraction and Analysis

Both authors independently extracted the data for all studies that met inclusion criteria using the following predefined fields: (1) author; (2) year of publication; (3) title; (4) research question; (5) participant characteristics; (6) modality (i.e., face, voice); (7) task and measurement tool; (8) emotions studied; and (9) results. The first author then reviewed, compiled, and verified the extraction.

Descriptive results are provided in Table 2, which includes participant details for each study, task characteristics (i.e., modality, emotion set), and emotion expression characteristics (i.e., cue type and intensity). Due to the heterogeneity of tasks and measures used across the studies, a meta-analysis was not possible. Instead, emotion recognition data were synthesized using heat maps. To create these heat maps, findings from each study were converted to ranks that specified the order of recognition in descending order, with the emotion best recognized by participants ranked as 1, the emotion identified second best as 2, and so on. For example, if a study reporting mean accuracy of recognition showed Happy as having the highest mean, followed by Sad, and then Disgust, Happy was given a rank of 1, Sad a rank of 2, and Disgust a rank of 3. Frequency data of ranks were then used to identify the number of times each emotion was ranked first, second, third, et cetera according to modality, and then according to participant characteristics (i.e., age, sex), task characteristics (i.e., emotion set), and emotion expression characteristics (i.e., cue type, intensity of emotion). Since the number of studies that included data for each modality and characteristic varied, frequency data were converted to proportions. For instance, if Happy was ranked first 11 times out of a possible 23, the frequency was recorded as 0.47826 in Microsoft Excel, which was then used to generate the heat map. Studies reporting on more than one modality, or on performance for multiple age groups, cue types, or stimuli of different intensity levels, were extracted as individual data sets. In other words, if a study reported data for responses to both low and high intensity facial emotion expressions, each set of data was included separately in the heat maps. Heat map results are supported via narrative synthesis and reported according to the research questions.
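To make the synthesis procedure concrete, the sketch below re-creates the rank-and-proportion steps in Python/pandas (the review itself used Microsoft Excel; the study labels, accuracy values, and column names here are invented purely for illustration).

```python
# Illustrative sketch of the rank-and-proportion synthesis described above.
import pandas as pd

# One row per emotion per data set; a study reporting, e.g., both low- and
# high-intensity conditions would contribute two data sets.
data = pd.DataFrame({
    "dataset": ["s1", "s1", "s1", "s2", "s2", "s2"],
    "emotion": ["Happy", "Sad", "Disgust", "Happy", "Sad", "Disgust"],
    "accuracy": [0.95, 0.80, 0.72, 0.88, 0.90, 0.60],
})

# Within each data set, rank emotions by accuracy in descending order
# (best-recognized emotion = rank 1; ties share the same rank).
data["rank"] = (data.groupby("dataset")["accuracy"]
                    .rank(ascending=False, method="min")
                    .astype(int))

# Count how often each emotion received each rank, then convert counts to
# proportions of the data sets that included that emotion (e.g., ranked
# first 11 times out of a possible 23 -> 11/23 = 0.47826).
counts = data.groupby(["emotion", "rank"]).size().unstack(fill_value=0)
proportions = counts.div(counts.sum(axis=1), axis=0)
print(proportions)  # the matrix used to colour one heat map
```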

Table 2 Description of study characteristics

Results

Search Results

A total of 1630 unique studies were identified via electronic databases (n = 1628) and citation chaining (n = 2). Figure 1 outlines the studies remaining at each stage of the screening process, including reasons for exclusion at the full-text stage. The full texts for five studies could not be sourced, and these studies were excluded at the full-text stage. Authors were contacted for studies that reported only a total score to seek data for specific emotion categories; four authors provided the data needed for this review. The remaining studies reporting only a total score (n = 10) were excluded.

Fig. 1 Study selection flow chart

Results of Methodological Evaluation and Quality Appraisal

Of the 47 included studies, 24 (51%) used a case–control design, 21 (44.7%) a cross-sectional design, and two (4%) a cohort design. MMAT ratings for each study are provided in Table 2. The majority of studies (n = 38; 80.8%) were rated either four (n = 23; 48.9%) or five (n = 15; 31.9%) out of five. Using the classification system described above, the MMAT results deemed 14 studies high in quality (Bowen et al., 2014; Coffman et al., 2015; Davis et al., 2020; Georgiou et al., 2018; Lawrence et al., 2015; Lee et al., 2013; McCown et al., 1986; Morningstar et al., 2020; Novello et al., 2018; Pozzoli et al., 2017; Rutter et al., 2019; Steenhuis et al., 2020; Thomas et al., 2007; van Rijn et al., 2011); 29 studies moderate in quality; and four studies (Auerbach et al., 2015; Chronaki et al., 2018; Kessels et al., 2014; Lydon & Nixon, 2014) low in quality. See Appendix 1 for MMAT findings.

Study Characteristics

Year of Publication and Location

Figure 2 summarizes the geographical location and year of publication of included studies. Using the location of the first author of each study, geographical distribution showed that nearly half (n = 21; 44.7%) of the included studies were conducted in either the United States (n = 12; 25.5%) or England (n = 9; 19.1%). Year of publication spanned 1986 to 2022. The majority of papers were published between 2010 and 2020 (n = 39), with 40% (n = 19) published in the last 5 years.

Fig. 2 Number of studies published per year by continent

Participant Characteristics

As shown in Table 2, the number of typically developing adolescent participants per study ranged from 10 (Zupan, 2015) to 2059 (Steenhuis et al., 2020), with a mean of 144.89 (SD = 411.03). Based on study criteria, participants ranged in age from 11 to 18, with a mean age of 14.66 (SD = 1.28). The majority of studies (n = 35; 74.5%) included participants whose ages spanned the full adolescent range. Twelve studies focused on participants in early to mid-adolescence (ages 11 to 15) and one study focused on participants in late adolescence (ages 16 to 18; Memisevic et al., 2016). The majority of studies (n = 36; 76.6%) included both male and female participants. Of the remaining 11 studies, 7 (14.9%) included only females (Auerbach et al., 2015; Fairchild et al., 2010; Lule et al., 2014; Porter-Vignola et al., 2021; Sfarlea et al., 2016, 2018; Shenk et al., 2013), and 4 (8.5%) included only males (Bowen et al., 2014; Fairchild et al., 2009; McCown et al., 1986; Smith et al., 2010). The total numbers of females (n = 1103) and males (n = 1062) across studies were similar.

Task Characteristics

Of the 47 studies, 41 (87.2%) included one or more facial emotion recognition tasks, and 7 (14.9%) included a vocal emotion recognition task (Amorim et al., 2021; Chronaki et al., 2018; Davis et al., 2020; Filippa et al., 2022; Grosbras et al., 2018; Morningstar et al., 2020; Zupan, 2015); only one of the included studies evaluated both facial and vocal emotion recognition in the same participants (Zupan, 2015).

As shown in Table 2, the emotion set included in each study varied both in the number of emotions included and in the specific emotions studied. The number of emotions included ranged from 1 (n = 1; 2.1%) to 10 (n = 1; 2.1%). All but three studies focused on basic emotions only, ranging from the inclusion of one emotion (n = 1; Disgust; Whitaker & Widen, 2018) to all six basic emotions; 11 of these studies included Neutral. Only three studies included more complex emotions, and all three used vocal emotion recognition tasks only (Amorim et al., 2021; Davis et al., 2020; Morningstar et al., 2020). Given the limited number of studies that included more complex emotions, and the differences in the complex emotions included, regrettably only the basic emotions and Neutral could be included in the analysis for this systematic review.

Emotion Expression Characteristics

As shown in Fig. 3a and b, the studies evaluated facial and vocal emotion recognition using a range of cue types. Despite ongoing acknowledgement that still photographs lack ecological validity and are not processed in the same way as the dynamic expressions we see in day-to-day interactions (Paiva-Silva et al., 2016), they were the most commonly used cue type in facial emotion recognition studies.

Fig. 3 Cue types used in facial versus vocal emotion recognition tasks

Another variable manipulated in some studies was the intensity of the emotion expression shown to participants; however, data specific to intensity level were not consistently reported (see Table 2). Of the 24 facial emotion recognition studies that referenced intensity levels of their stimuli, only eight (33.3%) reported results for different intensity levels (Auerbach et al., 2015; Bowen et al., 2014; Hauschild et al., 2020; Lee et al., 2013; Martin-Key et al., 2018; Pozzoli et al., 2017; Smith et al., 2010; Zupan, 2015); five (20.8%) additional studies reported accuracy for high intensity expressions only (Fairchild et al., 2009; Legenbauer et al., 2018; Lydon & Nixon, 2014; Sully et al., 2015; Whitaker & Widen, 2018). Of the remaining studies, eight (33.3%) reported results based on overall intensity only (Airdrie et al., 2018; Fairchild et al., 2010; Hauschild et al., 2020; Kessels et al., 2014; McClure et al., 2005; Novello et al., 2018; Shenk et al., 2013; Thomas et al., 2007; van Rijn et al., 2011), and four (16.7%) reported results based on the minimum intensity threshold for recognition (Leganes-Fonteneau et al., 2020; Porter-Vignola et al., 2021; Short et al., 2016; Vanhalst et al., 2017). Only one of the seven studies investigating vocal emotion recognition examined intensity (Zupan, 2015); this study reported overall recognition of vocal emotion expressions as well as recognition of high and low intensity expressions.

Study Results

Facial Versus Vocal Emotion Recognition

As shown in Fig. 3c, a different pattern of recognition emerged for facial versus vocal emotion. For facial emotion recognition, Happy was consistently identified best (i.e., ranked first 90.5% of the time) and Surprise was most frequently identified as ranking second (47.6%); Neutral and Disgust were almost equally likely to be ranked third (31.25% and 29.58%, respectively). Fearful facial emotion expressions were most consistently identified as ranking either fifth or sixth (51.85%). For vocal emotion expressions, Angry was identified best (i.e., ranked first 53.8% of the time), with Neutral most frequently identified as ranking second (80%) and Fearful third (46.15%). Only one study included vocal expressions of Surprise, and this emotion was ranked sixth overall.

Participant Characteristics

Age

The influence of age on emotion recognition was analyzed by organizing available data into three age groups: (1) early adolescence (11 to 13 years old); (2) mid-adolescence (14 to 15 years old); and (3) late adolescence (16 to 18 years old). As shown in Fig. 4, and similar to the overall performance, accuracy for Happy facial expressions most consistently received a ranking of first, and Surprise second, for all three age groups. Similarly, Fearful appeared to be the most difficult facial emotion to recognize, with all groups most frequently ranking this emotion sixth (early and mid-adolescents) or fifth (late adolescents). Studies including Neutral, which was most commonly ranked third in the overall comparisons, were almost entirely conducted with mid-adolescents (i.e., only one was not), and therefore age comparisons cannot be made regarding its recognition. Age appeared to have the largest impact on recognition of Sad. The most frequent ranking for this emotion for late adolescents was second (42.9%), but for early and mid-adolescents it was most frequently ranked fifth. For vocal emotion recognition, only one study reported data for late adolescents, so the impact of age on the pattern of recognition for this modality can only be considered across early and mid-adolescents; the pattern of recognition for Happy, Sad, Angry, Fearful, and Neutral vocal expressions did not differ between these two groups.

Fig. 4 Influence of participant characteristics on recognition of facial and vocal emotion expressions

Sex

Sex differences were reported only for studies that included facial emotion recognition tasks (see Fig. 4). Overall, this variable did not appear to have much impact on the recognition of different emotions. Both males and females consistently identified Happy best, as indicated by the frequency with which these expressions were ranked first (females = 88%; males = 96%); Fearful appeared to be most difficult for both groups. However, some minor variation was seen for Sad and Disgust. Sad was ranked first more often for females (12%) than for males (4%); females also had the same proportion of second and fifth rankings (28%), while males had the same proportion of third and fifth rankings (32%). The frequency of rankings for Disgust showed that the most frequent ranking for females was third (47%) whereas the most frequent ranking for males was fifth (28%).

Task Characteristics

The influence of emotion set on performance for facial and vocal emotion recognition is shown in Fig. 5. As the number of emotions studied in facial emotion recognition studies increased, performance for Happy, Sad, and Fearful declined. The opposite appeared to occur for Angry. The inclusion of Neutral appeared to most greatly impact the pattern of recognition for Happy and Sad facial emotion expressions. For instance, the frequency with which Happy was ranked first decreased from 100% to 83.3% when Neutral was added to the emotion set of Happy, Sad, Angry, and Fearful, and from 93.55% to 0% when it was added to the emotion set that included all six basic emotions. A similar pattern was seen for Sad, with a larger proportion of low rankings occurring for emotion sets that included Neutral than for those without it. Too few studies (i.e., 2) included Neutral with Disgust and Surprise to compare the effect of Neutral on recognition of those emotions. Fearful facial expressions were more challenging to identify when Surprise was added to the set of emotions. Given the limited number of studies of vocal emotion recognition, it is more difficult to draw conclusions about the impact of the emotions studied on performance. However, similar to facial emotion recognition, the inclusion of Neutral appeared to have the greatest impact, with frequencies of lower ranks increasing for Happy and Fearful, and to a lesser extent Sad, when this emotion was included. Also similar to facial emotion recognition, identification of Angry appeared to increase with the inclusion of Neutral.

Fig. 5 Influence of emotion set on recognition of facial and vocal emotion expressions

Emotion Expression Characteristics

Cue Type

As shown in Fig. 6, Happy was the only facial emotion for which recognition appeared unaffected by whether emotion expressions were presented as photos (i.e., stills), video (i.e., dynamic), or morphing stimuli. Performance on the remaining emotions varied differentially according to this variable. For instance, Sad was most difficult to recognize in expressions portrayed via morphing, with higher rankings occurring more frequently for dynamic stimuli. Surprise showed a similar pattern. The opposite pattern was seen for Angry and Disgust. Again, a comparison for Neutral could not be made since almost all studies including Neutral used stills.

Fig. 6 Influence of cue type on recognition of facial and vocal emotion expressions

For vocal emotion recognition, all emotions studied via different cue types across studies showed a variable ranking pattern (see Fig. 6). However, this pattern appeared to be due to the inclusion of non-verbal (i.e., affective bursts) versus verbal (i.e., semantically neutral or pseudo words and sentences) cue types. For instance, Happy and Fearful showed a greater frequency of higher rankings when affective bursts were used compared to semantically neutral or pseudo verbal cue types. However, Angry showed a greater frequency of higher rankings for semantically neutral and pseudo cue types compared to affective bursts.

Emotion Intensity

Overall, for facial expressions, emotions tended to have a higher frequency of low rankings for low versus high intensity stimuli; this pattern can be seen most clearly for Fearful (see Fig. 7). However, for Sad, low and moderate intensity expressions had a higher proportion of first and second rankings than high intensity ones. Low intensity Disgust expressions also had higher rankings than moderate and high intensity expressions. Changes in intensity appeared to least impact the recognition of Happy and Surprise facial emotion expressions, though both showed a higher frequency of higher rankings with more intense stimuli. Only one study examined recognition of vocal emotion expressions of different intensities. This study reported recognition for low and high intensity expressions of Happy, Sad, Angry, and Fearful. For low intensity vocal expressions, Sad was identified best, followed by equal identification of Angry and Fearful, then finally Happy. The pattern for high intensity vocal emotion expressions differed, with Angry most easily recognized, followed by Happy, Sad, and Fearful.

Fig. 7 Influence of intensity of emotion expression on facial emotion recognition

Discussion

The ability to recognize facial and vocal emotion expressions facilitates successful social interactions and is central to building positive relationships with others. This skill is therefore particularly important for adolescents as they individuate from their parents and learn to navigate more complex interactions with peers. Though numerous studies have been published on adolescent emotion recognition, methodological differences across studies have resulted in ambiguity regarding the typical development of facial and vocal emotion recognition across the adolescent period. This systematic review aimed to clarify the pattern of recognition of individual emotions in facial and vocal expressions across the period of adolescence. The relationship of performance to participant, task, and emotion expression characteristics was also studied. Contrary to the main hypothesis of this review, the pattern of recognition was generally consistent across the period of adolescence; minimal sex differences were also found. However, the pattern of recognition varied considerably in response to differences in task characteristics (i.e., emotion set) and emotion expression characteristics (i.e., cue type; intensity).

Facial Versus Vocal Emotion Recognition

Maturation of the social brain network makes adolescents particularly sensitive to social cues, and presumably more aware of complex social emotions (Burnett et al., 2009; Garcia & Scherf, 2015), making these emotions particularly relevant to study. Unfortunately, across studies, the range of emotions examined was almost entirely restricted to the basic emotions, with the exception of three studies of vocal emotion recognition. Furthermore, it was rare for a study to include the entire range of basic emotions. However, within the range of emotions studied, the pattern of recognition for individual emotions differed for facial versus vocal expressions, as expected. As found in prior research with both children and adults, Happy was the emotion most easily recognized in the face and Fearful was the most difficult. Following Happy, adolescents were best able to identify Surprise and Neutral. This overall pattern of facial emotion recognition across the adolescent period differs from that reported for children, in whom recognition of Sad or Angry follows Happy (De Sonneville et al., 2002; Montirosso et al., 2010). This shift may reflect the developmental changes in the social brain network, and in the underlying processes used to interpret social stimuli, that occur during adolescent development.

For vocal emotion expressions, adolescents were best able to identify Angry, followed by Neutral, Fearful, and Happy, indicating that Happy was a more difficult emotion to recognize in this modality. Though Surprise appeared to be the most challenging emotion for adolescents to identify in the voice, this emotion was included in only one study, so it is possible that difficulties were specific to the particular task characteristics of the emotion expressions used. The high recognition of Angry and low recognition of Happy in the voice aligns with the pattern of recognition reported for adults (Zupan et al., 2009). This pattern differs from that of children, who tend to recognize Sad expressions in the voice most accurately and Fearful expressions least accurately.

Participant Characteristics

While the overall results might suggest that emotion recognition abilities in adolescence are further developed than those of children, the intent of this systematic review was to gain a more nuanced view of how emotion recognition might change across adolescence and how characteristics of the task and emotion expression might relate to the pattern of development. One of the primary aims of this review was to explore whether age moderated the overall pattern of recognition for adolescents. It was hypothesized that recognition performance for facial emotion expressions would be more similar to that of children for the youngest group of adolescents (i.e., 11 to 13 year olds) and align with the pattern reported for adults for the oldest group (i.e., 16 to 18 year olds). This prediction was not wholly supported. For facial emotion recognition, results showed that even the youngest adolescents included in the review recognized Happy and Surprise better than other emotions. The lack of a clear difference in the pattern of facial emotion recognition for early versus late adolescents may be related to the fact that only basic emotions were examined. In a study examining the effect of pubertal development on facial emotion processing, Motta-Mena and Scherf (2017) found that pubertal development impacted recognition of complex, but not basic, emotions in the face. Moreover, recognition of basic emotions may not necessitate the same level of cognitive development as recognition of complex emotions, which require deeper appraisal and inferencing of available cues (Larsen & Luna, 2018; Scherer et al., 2019). It should also be noted that the data included in this review focused on participants' recognition accuracy, but studies with children have shown that developmental differences may only be apparent when response times are also considered (De Sonneville et al., 2002; Johnston et al., 2011). Clearer differences in the pattern of recognition might emerge across early, mid-, and late adolescents if both response accuracy and response times are evaluated and if complex emotions are also considered.

Although participants across all age groups generally showed a pattern of recognition for facial emotion expressions that was more similar to adults than to children, it is important to note that some variability was visible, suggesting ongoing development for at least some of these emotion categories. One example was the recognition of Angry facial expressions. The frequency with which Angry was recognized second best was considerably higher for early (34%) versus mid- (27%) versus late (8%) adolescents, supporting a shift across the adolescent period from a pattern associated with emotion recognition in children toward one more similar to adults. Recognition of Sad facial expressions was particularly variable across age groups. While this variability may be similarly indicative of ongoing development, it is also possible that it reflects social factors. Garcia and Tully (2020) suggested that adolescents develop superior recognition of emotions that more readily convey a negative impact on them as individuals. In other words, misperceiving facial expressions such as Angry, which may signal significant displeasure or threat toward the individual, may be more detrimental than misinterpreting expressions such as Sad, where the impact may be less severe at an individual level. This may be particularly relevant during adolescence, when social interactions are strongly motivated by peer acceptance (Allen et al., 2005).

Studies of recognition of vocal emotion expressions were limited overall. Only one study included participants in the late adolescent group, and its results indicated an unexpected pattern of recognition: participants identified Happy vocal expressions most easily, followed by Fearful, Sad, and Angry. Given the increased shift toward more pro-social behaviour that occurs for older adolescents (Crone & Dahl, 2012), it is possible that this older group of participants was processing vocal cues differently to younger adolescents, reflecting a need to engage in positive experiences. However, it is also possible that this difference was specific to characteristics of the emotion expressions and directly related to the use of non-verbal affective bursts as stimuli. While affective vocal bursts provide rich, nuanced information about emotion, their acoustic cues differ from those in linguistic vocal emotion expressions and rely on more automatic processing mechanisms (Castiajo & Pinheiro, 2019).

Only studies focused on facial emotion recognition reported recognition data for males versus females. Though males showed a tendency toward higher recognition of Happy facial expressions (i.e., ranked first 96%) than females (i.e., ranked first 88%), this most likely reflects the fact that Neutral was only included in studies with females. More ambiguous facial expressions tend to be identified as neutral (Durand et al., 2007), so the addition of this category as a response option may have led females to identify more subtle expressions of Happy as Neutral. Slight variation also appeared in the pattern of recognition for Sad, Angry, and Disgust, with females showing a tendency toward higher accuracy for Sad and Disgust facial expressions and males showing higher accuracy for Angry facial expressions. These findings may reflect gender role expectations learned through socialization, whereby parents tend to validate expressions of sadness in their daughters and expressions of anger in their sons (McNeil & Zeman, 2021). This socialization is further reinforced in adolescence, where adolescent females are expected to respond to expressions of sadness in their peers to solidify friendships (Miller-Slough & Dunsmore, 2018). Studies with adults suggest that these socialization effects continue beyond adolescence, since adult females have been found to show a particular advantage for Sad facial expressions (Bonebright et al., 1996), while males tend to show stronger responses to potentially threatening stimuli such as Angry facial expressions (Kret & De Gelder, 2012).

Task Characteristics

The systematic review revealed the importance of the emotion set included in a task for recognition performance. Regrettably, studies in this review primarily included basic emotions, and the actual range of emotions varied widely. Happy, Sad, Angry, and Fearful were most consistently included in studies evaluating both facial and vocal emotion recognition; only a small proportion of studies included Neutral. In general, as the number of emotion categories increased, recognition of Happy and Sad decreased. The addition of Neutral appeared to particularly impact recognition of these two emotions. Research has shown that adults tend to identify more subtle or ambiguous expressions as either Happy or Neutral (Durand et al., 2007), so it is possible that adolescents spread responses for expressions they were uncertain about across these two categories. This result may also be related to the inclusion of only one positive response category (i.e., Happy), such that Neutral provided an alternative response option for emotion expressions participants perceived as positive but that did not wholly fit the Happy category. This type of responding has been shown in research exploring categorization of emotion words (Zupan et al., 2023). Another visible difference in the pattern of recognition for different emotion sets was the decreased recognition of Fearful with the inclusion of Disgust and Surprise. This result was likely driven by the inclusion of Surprise, which shares facial features with Fearful (i.e., wide eyes, open mouth) but does not carry a clear valence (Schlegel et al., 2012), providing an alternative for adolescents who felt uncertain in their identification of Fearful. Overall, this result adds further evidence that Fearful is a challenging emotion for adolescents to identify in the face.

Similar to facial emotion recognition, the inclusion of Neutral appeared to reduce the recognition of Happy vocal expressions; the recognition of Fearful expressions was also reduced. The decline for Happy is not surprising given that this emotion is typically difficult to identify in the voice; it suggests that adolescents may be using Neutral as an identification category for ambiguous expressions. Given that Fearful is the most difficult emotion for children to identify in the voice, the decrease in recognition of this emotion by adolescents in the presence of Neutral may indicate that its recognition is still developing in adolescence. Though Neutral may have been the response option chosen for more ambiguous vocal emotion expressions, its inclusion as a response option increased identification of Sad and Angry vocal expressions. This finding suggests that these vocal expressions may be less ambiguous overall than other basic emotions, which may be due to their distinct acoustic correlates. Of the emotions included in this review, Sad is the only one associated with decreases across all associated acoustic variables, while Angry, associated with overall increases, is distinguished by its harsh vocal quality (Zupan et al., 2009). It is possible that the presence of Neutral vocal stimuli provided a “baseline” that made these patterns more salient for adolescents.

The change in rankings based on the set of emotions included, and the inclusion of Neutral in particular, highlights the importance of which emotions are studied when drawing conclusions regarding emotion recognition. Unfortunately, Neutral is not commonly included in research (e.g., fewer than a third of the emotion recognition tasks in this systematic review included Neutral). Moreover, the six basic emotions may not be representative of the emotions experienced in everyday interactions, which tend to be more complex (e.g., Anxiety) and social (e.g., Pride) (Turkstra et al., 2017). The lack of research including more complex emotions, which require deeper social and cognitive understanding, hampers any attempt to fully capture patterns of development. The Differentiation Model suggests that children develop emotion recognition abilities gradually and learn to apply more refined labels to distinguish between emotions over time (Widen, 2013). Complex emotions (e.g., anxiety) often overlap with basic ones (e.g., fear) but represent a more subtle and nuanced experience (Diener et al., 1995). Therefore, a focus on recognition of only basic emotions during adolescence prevents a full exploration of the gradual narrowing of emotion categories proposed by the Differentiation Model, reducing understanding of how adolescents process and respond to these more socially relevant cues. Not only is adolescence a time when social emotions may relate to heightened attention and development (Garcia & Scherf, 2015), but adding more complex emotions would also provide greater insight into the influence of valence on emotion recognition. This systematic review showed that Happy facial expressions were consistently identified better than other emotion expressions. However, the use of only basic emotion categories results in the inclusion of only one positively-valenced response option, increasing the probability of ceiling effects for Happy. That performance on Happy deteriorated with the inclusion of other emotions, particularly Neutral, illustrates the importance of carefully selecting the emotion set to ensure that responses are not simply based on a process of elimination. In other words, emotion sets should include both basic and complex expressions representing both positive and negative valence.

Emotion Expression Characteristics

An issue related to emotion set is the influence of cue type on recognition performance. Facial emotion recognition tasks included in this review differed in whether they included static images (52%), morphing tasks (42%), or dynamic images (8%). Happy was found to be least impacted by this variable. Cue type differentially impacted recognition of the basic emotions, with still images leading to better recognition of Fearful. Given that Fearful was the most difficult emotion to recognize in the face, it is not surprising that improvements were noticeable in response to static pictures, which tend to present more prototypical expressions, provide additional time for processing, and contribute to ceiling effects (Darke et al., 2019; Paiva-Silva et al., 2016). However, the lack of an overall improvement in recognition of all facial emotion expressions in response to static images, as well as the variability in rankings for this cue type, suggests that the limited ecological validity of these stimuli negatively impacts perception. Facial emotion expressions experienced in everyday life include continually changing facial features for which our visual system is primed (Dobs et al., 2014). It should not be surprising, then, that research comparing recognition of static versus dynamic facial expressions in adults has shown an advantage for dynamic images across a range of emotion categories (Darke et al., 2019; Dobs et al., 2014). Adolescents in this review also appeared to benefit from the movement present in dynamic images, as evidenced by their improved recognition of Happy and Surprise expressions, with a noticeable increase in response to dynamic facial expressions depicting Sad. Research has shown that the speed of facial movements (e.g., eyebrows, mouth) associated with particular emotions contributes to recognition of dynamic facial expressions (Sowden et al., 2021). Of the six basic emotions included in this systematic review, Sad is the only emotion associated with low-speed movements in the face. These results suggest that adolescents may be relying on temporal movement cues in their identification of facial emotions.

It is important to note that the movement associated with morphing tasks did not appear to similarly facilitate recognition of Sad. Unlike dynamic stimuli, morphing stimuli either interpolate between static images of two distinct emotions at different intensity levels or incrementally shift from an image of one emotion (or Neutral) to another, and are thus often described as including motion (Paiva-Silva et al., 2016). In other words, a dynamic morph may begin with a Neutral facial expression and then transition to another expression, such as Angry, by slowly increasing the intensity of the emotional expression. Sad is the only negative basic emotion that is also identified as being low in activation or arousal. Thus, adolescents may have found it harder to extricate this emotion from stimuli in which other negatively-valenced emotions were also presented at very low activation levels. In fact, Lee et al. (2013) reported increased sensitivity to Angry expressions contained in morphing stimuli and decreased recognition of Sad. Lee et al. (2013) further suggest that this increased sensitivity to cues of Anger may be socially motivated and particularly salient in adolescence because these cues signal potential rejection from a peer.

Investigations of vocal emotion recognition included the use of semantically neutral or pseudo words and/or sentences, or affective bursts. The pattern of recognition was similar across studies that used linguistic stimuli, though Sad appeared to be more difficult to identify in pseudo cue types compared to semantically neutral cue types. This finding was surprising given that research has shown that pseudo cue types should yield similar recognition accuracy to semantically neutral ones (Castro & Lima, 2010) and that Sad is considered the most acoustically distinct of the basic emotions (Zupan et al., 2009). The use of affective bursts appeared to have the greatest influence on the pattern of recognition of vocal emotion expressions, resulting in a pattern opposite to the one seen for linguistic stimuli: Happy was most easily recognized, and Angry was most challenging. Angry vocal bursts have also been reported to be poorly identified by adults (Belin et al., 2008; Lausen & Hammerschmidt, 2020; Pell et al., 2015). It is possible that the overall difference in the pattern of recognition for vocal affect bursts reflects a difference in perceptual processing for this type of cue versus linguistic stimuli (Castiajo & Pinheiro, 2019; Liu et al., 2012; Pell et al., 2015). Alternatively, Amorim et al. (2021) suggest that the increased recognition of Happy in response to affective bursts versus words and sentences spoken in a Happy tone of voice is likely due to the social relevance of laughter. This proposition aligns well with those presented by Garcia and Tully (2020) and Lee et al. (2013), who suggest that emotion recognition in adolescence may be socially motivated toward avoidance of peer rejection. In other words, recognition of cues such as laughter would be more likely to result in positive peer interactions.

Cue type is not the only emotion expression characteristic that can influence emotion recognition; intensity should also be considered. Though adolescents tend to report more intense emotions overall (Diener et al., 1985), they still report experiencing more low intensity than high intensity emotions (Larson & Lampman-Petraitis, 1989). However, research has focused much more on the recognition of high intensity emotional expressions. Though the intention of this review was to consider intensity for recognition of both facial and vocal emotion expressions, only one study reported data related to vocal emotion intensity (Zupan, 2015). That study included only 10 adolescent participants, precluding an extensive discussion of those results. For facial emotion recognition, the number of studies that included low, moderate, or high intensity expressions and/or reported accuracy based on threshold varied by emotion (see Fig. 7). Happy remained the easiest emotion to identify and Fearful the most challenging regardless of intensity level. However, different levels of intensity led to considerable variability in rankings for the remaining emotions. For instance, low intensity expressions of Disgust led to much improved recognition compared to moderate and high intensity expressions. Low and moderate intensity exemplars of Sad also appeared to facilitate recognition of this emotion compared to high intensity ones. It has been reported that lower intensity expressions are more likely to be processed on the basis of individual features (Ambadar et al., 2005). That difference in processing may at least partially explain the impact of intensity on the recognition of Disgust and Sad. For instance, Disgust includes two prominent features, a nose wrinkle and an upper lip raise, so processing this emotion feature by feature is still likely to result in good recognition. Sad includes three distinct features: droopy eyelids, a down-turned mouth, and raised inner eyebrows (Ekman, 2003); any one of these cues may have supported recognition of this emotion.

Strengths and Limitations

This review included a comprehensive set of search terms that resulted in the retrieval of a large number of relevant studies. However, two studies (Montirosso et al., 2010; Zupan, 2015) were identified via hand searching and citation chaining that did not appear in any of the six databases. Neither of these studies included any of the adolescent-related search terms used for this review, instead referring to their adolescent participants as children. It is possible that additional studies were excluded from this review for the same reason. Both authors reviewed all abstracts and full texts, participated in quality appraisal, and independently conducted data extraction for each study, increasing the validity of the review's findings. The majority of studies identified for this review were found to be of moderate to high quality; only four studies were rated as low in quality.

A key aim of this systematic review was to identify the pattern of recognition for facial versus vocal emotion expressions. While the review was able to identify an overall pattern for each modality, as well as demonstrate how the pattern of recognition may change according to different variables, the data available and the analyses conducted did not allow for direct comparison of accuracy for facial versus vocal emotion recognition. As a result, it remains unclear whether emotion recognition tends to develop more quickly in one modality than the other (and for which emotions) or whether adolescents find emotion recognition easier in one modality than the other. Of the 47 studies included in this review, only seven investigated vocal emotion recognition, and these used a number of different tasks. Additional research in this area would provide more insight into the development of recognition in each modality, particularly if facial and vocal emotion recognition tasks were administered to the same participants. Given that the voice has been reported as a "privileged source" of information for subtle emotional cues (Morningstar et al., 2018, p. 223), this type of investigation may be particularly important during adolescence, when sensitivity to social cues is heightened.

Despite meta-analyses that have found sex differences in emotion recognition as early as childhood (McClure, 2000), the sex differences found in this review were minimal at best. This could be partially due to the age criteria used for inclusion. For instance, evidence suggests that puberty is related to changes in emotion recognition (Lawrence et al., 2015) and that females typically enter puberty before males; however, pubertal status was not considered as part of this review. Future research should consider these pubertal sex differences in addition to chronological age when comparing the performance of males and females. Another limitation that impacted the analysis of sex differences in this systematic review was the lack of vocal emotion recognition studies that reported separate data for males and females.

Relatedly, this review did not explore effects of differences in the age and sex of the encoder (i.e., the person portraying the emotion). Hauschild et al. (2020) showed that the age of the encoder influences the accuracy of facial emotion recognition for adolescents. Studies with adults have also shown differences in recognition of emotions portrayed by males versus females for both facial and vocal emotion expressions (Eskritt & Zupan, in press; Kret & De Gelder, 2012; Lausen & Schacht, 2018; Milders et al., 2003). Future reviews should explore encoder-related impacts on recognition, particularly since the task and stimuli characteristics investigated here were found to result in variable response patterns. Another task characteristic not included in this review, but one that has been shown to be an important outcome measure in the recognition of emotion expressions, is response time.

For some of the variables analyzed as part of this review, only a limited amount of related research was available, so readers should interpret results related to these variables carefully. For example, of the 47 studies identified for this review, only seven (~ 15%) included a measure of vocal affect recognition. This made it difficult to explore the influence of different characteristics on the pattern of emotion recognition within this modality. Similarly, Neutral appeared to strongly influence emotion recognition performance, but only 13 of the 47 studies (11 of which focused on facial emotion recognition) included Neutral as part of the emotion set, limiting analysis of its influence. This review also highlighted the need for studies that explore recognition of emotions beyond the six basic categories. Despite the search and review not being limited to specific emotion categories, only three studies included complex emotions; none of these were facial emotion recognition studies.

Due to variability in data across the studies included in this review, a meta-analysis could not be conducted; patterns of recognition based on frequency of rankings were reported instead. While this method allowed a novel way to visualize adolescent response patterns according to a range of variables, it did not always allow for clear comparison to existing literature. For instance, studies have shown that females tend to have higher emotion recognition accuracy than males, but the analysis conducted for this review only allowed a comparison of whether the overall pattern of recognition was the same versus different between these two groups, not whether one group was more accurate than the other. A discussion of age differences within the adolescent period was similarly constrained. Finally, this review looked at emotion recognition in two individual modalities: face and voice. Audiovisual and/or multimodal emotion recognition was not explored but should be in future research, since everyday emotion expressions are interpreted on the basis of more than a single cue.

Conclusion

The ability to recognize emotions is particularly important in adolescence, a period of development in which adolescents are individuating from their parents and learning to navigate more complex social interactions. Given the importance of emotional development to adolescent social development and wellbeing, the recent awareness of important maturational changes in the social brain network, and contradictory research findings, the development of emotion recognition in typically developing adolescents requires more careful examination. This review aimed to systematically evaluate the current literature to identify the pattern of recognition for facial and vocal emotion expressions across the developmental period of adolescence. This review included a total of 47 studies, only seven of which included data on vocal emotion recognition. Nearly half of the included studies were published within the last 5 years, indicating increased interest in this topic.

Overall, results showed a more homogeneous pattern of emotion recognition development than expected across early, mid, and late adolescence, with one exception: late adolescents showed better recognition of Sad facial expressions than early and mid-adolescents. However, task and emotion expression characteristics tended to result in greater variability in the pattern of responses than participant characteristics alone, suggesting their importance for fully understanding the sometimes contradictory research findings. The lack of Neutral in many studies, the limited inclusion of low intensity emotion expressions, and the absence of a balance in the valence of emotions included (i.e., an equal number of positive and negative emotions) should be considered in the overall interpretation of the results presented here. Further, based on the studies included in this review, it was only possible to analyze performance for basic emotions, limiting any conclusions that could be drawn about emotion recognition across adolescence in general and highlighting a need for research focused on the recognition of complex emotions, which involve the more sophisticated social understanding that is particularly relevant for adolescents. The results regarding the influence of different characteristics on the pattern of recognition for vocal emotion expressions should also be interpreted carefully given the limited number of studies that reported data for each. As the voice may be an important means of conveying more subtle social emotions, the lack of evidence for this modality in the literature is particularly unfortunate.

The present review provides important insight into the factors that may influence emotion recognition performance in adolescents, but it also revealed that methodological differences and limitations are hindering a full understanding of emotion recognition abilities across this important developmental period. Future reviews on emotion recognition by adolescents might add further insight by exploring additional participant characteristics (e.g., encoder age and sex) as well as task characteristics (e.g., response time). Recognition of multimodal expressions, which are more representative of expressions encountered in everyday social interactions, should also be considered.