Introduction

Language is an area of profound importance in research on individuals with autism spectrum disorders (ASD) as well as in the daily life of individuals on the spectrum. Language is our main means of communication, and therefore constitutes the basis for social interaction, a defining weakness in ASD. The specific characteristics of this disorder, such as Theory of Mind (ToM) deficits, repetitive behaviors, and comorbid conditions (intellectual disability, attention deficits, etc.), are numerous, and evaluating language skills is therefore a complex task: it is not always clear whether low scores result from language deficits per se or from these characteristics, which may prevent individuals on the spectrum from understanding the task demands. In the DSM-5 (American Psychiatric Association, 2013), language impairment is no longer a core diagnostic criterion for ASD, although clinicians are still obliged to specify whether a child has accompanying language impairment or not. In research on children with ASD, language measures are frequently used to match individuals with ASD to control subjects (Battaglia, 2012; Begeer et al., 2013; Brynskov et al., 2016; Condouris et al., 2003; de Marchena et al., 2011; Dunn & Bates, 2005; Haebig et al., 2015; Hala et al., 2007; Hani, 2015; Harper-Hill et al., 2014; Huemer & Mann, 2010; Löfkvist et al., 2014; McGregor et al., 2012; Naigles et al., 2013; Pastor-Cerezuela et al., 2016; Paynter & Peterson, 2010; Singh & Harrow, 2014; Su & Su, 2015; Whitehouse et al., 2007). The language measure typically used is vocabulary, receptive and/or expressive.

Most tests of expressive and receptive vocabulary assess lexical semantics, that is, the meaning of words, which is the focus of this review article. Testing receptive vocabulary typically involves a child listening to a word and selecting the picture that depicts the meaning of that word. To assess expressive vocabulary, a child is shown a picture and must supply the word whose meaning corresponds to it. These two methods, picture matching and picture naming, are the most common ways of assessing lexical semantics. In research on ASD, scores on these types of tests are commonly used for matching participant groups and at times are the only linguistic variable tested. Lexical semantics is also the most common feature of the verbal subparts of different IQ tests, making an understanding of lexical semantic skills even more important for children with ASD. A systematic large-scale study evaluating whether lexical semantic skills can predict structural language abilities has yet to be conducted. Some studies have in fact suggested that the performance of children with ASD on lexical semantic tasks is above their other linguistic abilities and does not predict their structural language abilities (Mottron, 2004; Sukenik, 2017; Tager-Flusberg, 1981; Walenski et al., 2006, 2008).

Previous studies testing lexical semantic abilities of children with ASD do not present a uniform picture, with some finding the lexical semantic abilities of children with ASD to be intact (Begeer et al., 2013; Bowler et al., 2009; Cantiani et al., 2016; Dunn et al., 1999; Ellawadi et al., 2017; Fiebelkorn et al., 2013; Groen et al., 2010; Kamio et al., 2007; Knaus et al., 2008; Speirs et al., 2011; Walenski et al., 2008) and others finding them to be severely impaired (Alqhazo et al., 2018; Battaglia, 2012; Boser et al., 2002; Hartley & Allen, 2013, 2014; Henderson et al., 2011; Lo et al., 2013; McCleery et al., 2010; Naigles et al., 2013; Norbury et al., 2010; Ropar & Peebles, 2007; Singh & Harrow, 2014; Whitehouse et al., 2007). Why are there such differences between study outcomes? One explanation might be that this is simply a result of the well-documented heterogeneity of ASD, both between participants and between different domains within the same individual (Masi et al., 2017). Another hypothesis could be that specific characteristics of children with ASD, such as (non-verbal) IQ or severity of autistic symptoms, influence their lexical semantic abilities and could therefore explain seemingly conflicting results. A third explanation may be related to the age or the developmental stage of the ASD participants, with younger children with ASD showing more lexical semantic deficits than older children do. If this is the case, it would have important implications for assessment and treatment methods.

Given the widespread use of lexical semantic measures in both clinical and research settings, it is essential that we understand the underlying reasons behind the contradictory findings reported in the literature. The aim of the current review is to systematically explore studies that tested lexical semantic abilities in children with ASD. Are there studies on language in children with ASD whose specific focus is lexical semantic abilities? In studies reporting on lexical semantic abilities, how many found children with ASD to have intact versus impaired performance on lexical semantic tasks? And, crucially, is impaired versus intact performance related to any variable(s) that may explain the apparent divergence between study results?

Method

Search Strategies

Three databases were searched: the OCLC WorldCat online catalogue, Web of Science, and ScienceDirect. We furthermore hand-searched three major journals devoted to studies on autism: Autism, Journal of Autism and Developmental Disorders, and Research in Autism Spectrum Disorders. In addition, we searched an open-access dissertation database, OATD. In all searches, the following terms were used in all possible combinations with no Boolean operators: autism, autistic, ASD, autism spectrum disorder, semantic, semantics, lexicon, lexical semantic, naming, PPVT (Peabody Picture Vocabulary Test), CDI (Communicative Development Inventories), lexical retrieval, semantic priming, semantic recall, and verbal. We wanted to use search terms that were as wide as possible; for example, we found that although different language versions of the PPVT and the CDI were used, the acronyms PPVT and CDI were usually present. Furthermore, in order to maximize the results, we used the word “verbal” rather than “vocabulary” because not all studies that included an assessment of lexical semantics necessarily tested vocabulary; some tested categorization abilities, semantic priming, or verbal fluency, to name a few.
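The term list above can be expanded into concrete query strings in several ways. The following is a minimal sketch of one possible reading, in which each autism-related term is paired with each lexical or semantic term. The split into two groups and the names `POPULATION_TERMS`, `TOPIC_TERMS`, and `build_queries` are assumptions made for illustration; they do not describe the procedure actually followed in the searches.

```python
from itertools import product

# Assumption: grouping the terms into "population" and "topic" sets is
# illustrative only; the review lists them as a single set of terms.
POPULATION_TERMS = ["autism", "autistic", "ASD", "autism spectrum disorder"]
TOPIC_TERMS = [
    "semantic", "semantics", "lexicon", "lexical semantic", "naming",
    "PPVT", "CDI", "lexical retrieval", "semantic priming",
    "semantic recall", "verbal",
]


def build_queries(population_terms, topic_terms):
    """Pair every population term with every topic term.

    Terms are joined by a plain space, i.e. without Boolean operators.
    """
    return [f"{p} {t}" for p, t in product(population_terms, topic_terms)]


queries = build_queries(POPULATION_TERMS, TOPIC_TERMS)
print(len(queries))  # 4 population terms x 11 topic terms
for query in queries[:3]:
    print(query)
```

Under this reading, 4 population terms and 11 topic terms give 44 query strings per database. A different reading of "all possible combinations" would yield a different set.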

We chose to concentrate on lexical semantic knowledge, which previous studies have defined as information stored in a storage unit that involves long-term memory procedures (Hall et al., 2017) and that is organized by shared semantic content and contains words and their meanings (semantic category relations to other words, syntactic category, color traits, size) (Friedmann et al., 2013). Our desire to focus on lexical semantic knowledge means that studies on word learning were not included. Word learning is a process in which an individual must process phonological information, notice and differentiate relevant linguistic and non-linguistic contextual cues, attribute the meaning to the correct phonological form, match the new meaning to previous knowledge, retain the form-meaning association, and use the word appropriately (Haebig et al., 2017). Our reason for excluding word learning studies is that, although word learning is an important aspect of lexical functioning, it involves many other cognitive and social mechanisms that go beyond knowledge about words and their lexical properties.

Inclusion Criteria

The current review is based exclusively on articles that were published in English from the year 2000, the year the DSM-IV-TR diagnostic guidelines were published, until the year 2019. We included only studies that provided at least some background information on the participants (verbal IQ, nonverbal IQ, and/or general IQ, autism severity, autism diagnosis, age, control matching criteria). We furthermore included only studies in which all the participants had been given a clinical diagnosis of autism, enabling us to determine that, according to current international diagnostic guidelines, these participants would meet the criteria for an autism diagnosis today. We excluded studies reporting on children with “optimal outcomes,” described as children who improve at such a rate that at some point in their life they no longer qualify for an autism diagnosis. As Suh and colleagues (Suh et al., 2017) pointed out, the factors that determine which individuals will experience an optimal outcome are not yet fully known, and therefore these studies were excluded. Studies that included children described as having “autistic traits,” but no formal diagnosis, were also excluded because there was no way of determining whether the participants would be eligible for an autism diagnosis. Studies assessing bilingual children with autism were excluded because bilingualism is known to play an important role in the linguistic abilities (particularly the vocabulary) of typically developing (TD) children, whereas its impact in children with ASD is not yet understood; this subgroup of children with autism should therefore be considered separately (see Drysdale et al., 2015 and Lund et al., 2017, for comprehensive reviews). We included only studies with participants aged between 0 and 18 years. Besides the relative scarcity of studies on language in adults with autism, and the fact that very little is known about the effects of aging on autism and the interaction with language (Magiati et al., 2014), we chose to concentrate on the years in which language is known to develop in autism (Sigman & McGovern, 2005). Finally, we included studies testing lexical semantics, either receptive or expressive or both, through the use of either standardized tasks with scores or experimental tasks. Both authors screened all studies, and any disagreements were discussed and resolved.

Factors Researched

The following factors were noted for each of the studies: age of participants (mean and range), number of ASD participants, control groups and matching criteria, native language of participants, background measures, and other measures from tasks not specifically testing lexical semantics. We sought to determine whether the ASD participants had intact or impaired performance on the lexical semantic measures, whether they differed from TD participants quantitatively or qualitatively (as manifested by error analysis), and whether or not there was evidence for language impairment, in the area of lexical semantics or in other language domains tested. Since it turned out that all the studies testing lexical semantics also tested other areas of linguistic functioning, we were able to look at this variable as well. Autism severity was a factor we hoped to include in this review, but doing so turned out to be impossible because this variable was described in very different ways in the studies under review: some provided scores from standardized tests like the ADOS, but others reported only broad diagnostic labels such as “high functioning” ASD (HFASD), “low functioning” ASD (LFASD), or pervasive developmental disorder not otherwise specified (PDD-NOS), which ruled out any useful cross-study comparison.

Results

Our first research question was whether there have been studies which included specific information on lexical semantic abilities in children with ASD. Our search criteria yielded 73 studies of which 12 were general literature reviews on language and ASD. The remaining studies reported original data (including four PhD dissertations). Beginning with the 12 literature reviews which corresponded to our search criteria, none of these directly and systematically examined lexical knowledge and its use in children with ASD. For example, Arunachalam and Luyster (2015) reviewed what is known about the cognitive processes which underlie the acquisition of lexical knowledge in autism, and Naigles and Tek (2017) reviewed evidence in favor of the hypothesis that meaning, broadly defined to encompass pragmatics and lexical semantics, is disproportionately impaired in ASD, compared to form, encompassing syntactic and phonological knowledge.

Turning now to the 61 empirical studies matching our search criteria (see Fig. 1), roughly half of these (29/61) did not meet all of our inclusion criteria. Some were excluded because they were published before the year 2000 (n = 10); included children with a diagnosis of “optimal outcome” or “at risk for autism” (n = 4); participants were bilingual children with ASD (n = 10); or participants were all over the age of 18 (n = 5). The remaining 32 studies matched all of our inclusion criteria and constitute the focus of the current review.

Fig. 1 Search criteria flow chart

In order to answer our second research question (how many of the studies that included results on lexical semantics found participants with ASD to have intact versus impaired lexical semantic abilities), study results were classified into three categories: studies that found participants with ASD to have intact lexical semantic abilities (n = 14), studies that found ASD participants to have impaired lexical semantic abilities (n = 14), and studies that found ASD participants to have impaired performance only on some, but not all, lexical semantic tasks (n = 4).
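The screening and classification counts reported above can be cross-checked with a short tally. The sketch below uses only the numbers stated in the text; the variable and function names (`records_found`, `exclusions`, `outcomes`, `tally`) are hypothetical and introduced purely for illustration.

```python
# All counts are taken from the text of this review.
records_found = 73        # records matching the search criteria
literature_reviews = 12   # general literature reviews among them

# Reasons for excluding empirical studies, with the reported n for each.
exclusions = {
    "published before 2000": 10,
    "optimal outcome or at risk for autism": 4,
    "bilingual participants": 10,
    "all participants over age 18": 5,
}

# Classification of the included studies by reported outcome.
outcomes = {"intact": 14, "impaired": 14, "partially intact": 4}


def tally(records, reviews, excluded, outcome_counts):
    """Return (empirical, included) and verify the outcome classes add up."""
    empirical = records - reviews
    included = empirical - sum(excluded.values())
    if sum(outcome_counts.values()) != included:
        raise ValueError("outcome classes do not sum to the included studies")
    return empirical, included


empirical, included = tally(records_found, literature_reviews, exclusions, outcomes)
print(empirical, included)  # 61 32
```

The check confirms that the four exclusion reasons account for all 29 excluded studies and that the three outcome categories together cover the 32 included studies.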

Our third research question was whether any factor(s) emerged from these studies that could explain the presence or absence of impaired lexical semantic abilities. In order to answer this question, we looked at each factor separately and at whether it distinguished the subgroups of studies (i.e., intact lexical semantics, impaired lexical semantics, and partially intact lexical semantics), trying to find common factors that could explain the results. We consider, in turn, age, intellectual ability, task type, and wider linguistic level as manifested by scores on other linguistic tasks.

Age

The first factor we looked at was age of the ASD participants. Children with ASD undergo intensive treatment and intervention over the years, and some children may be slow in developing, reaching normal levels later than TD children; we wanted to assess whether the studies under scrutiny provided evidence that performance improves with age. In the reported studies, first of all, the age of the participants was not equally represented: most studies (n = 19) reported on adolescents (ages 11 to 18). Studies on school-aged children (ages 6 to 10) were less frequent (n = 8), and very few (n = 5) reported on children in the years when rapid vocabulary growth is observed in TD children, below age 6.

Looking at each of these age groups in turn, the following picture emerged. In the 19 studies reporting on adolescents, 8 found ASD participants to have impaired lexical semantic abilities (Battaglia, 2012; Boser et al., 2002; Henderson et al., 2011; Lo et al., 2013; Naigles et al., 2013; Naito & Nagayama, 2004; Ropar & Peebles, 2007; Singh & Harrow, 2014), 10 found ASD participants to have intact lexical semantic abilities (Begeer et al., 2013; Bowler et al., 2009; Dunn & Bates, 2005; Fiebelkorn et al., 2013; Groen et al., 2010; Kamio et al., 2007; Knaus et al., 2008; Speirs et al., 2011; Walenski et al., 2008; Whitehouse et al., 2007), and one study found ASD participants to have partially intact lexical semantic abilities (Harper-Hill et al., 2014). In the eight studies that tested school-aged children, five found ASD participants to have impaired lexical semantic abilities (Alqhazo et al., 2018; Hartley & Allen, 2013, 2014; Norbury et al., 2010; Vogindroukas et al., 2003), two studies found ASD participants to have intact lexical semantic abilities (Cantiani et al., 2016; Ellawadi et al., 2017), and one study found ASD participants to have partially intact lexical semantic abilities (Hani, 2015). Finally, in the five studies which tested children with ASD below age 6, two found the ASD participants to have impaired lexical semantic abilities (Rescorla & Safyer, 2013; Tek et al., 2008), two studies found ASD participants to have intact lexical semantic abilities (McCleery et al., 2010; McGregor et al., 2012), and one study found ASD participants to have partially intact lexical semantic abilities (Barone et al., 2019).

Summarizing, age in and of itself does not seem to be a crucial factor in determining the outcome results of studies assessing lexical semantic competence in children with ASD. However, studies on lexical semantic abilities in younger children with ASD are scarce.

Intellectual Disability

We wanted to determine whether having low IQ scores (e.g., low cognitive abilities) is related to having low lexical semantic abilities. Participants with ASD have typically been identified in studies on language according to their intellectual abilities, and IQ scores have been used to classify participants into subgroups and for matching with other participant groups. Studies vary as to whether this is done on the basis of Full Scale IQ (FSIQ) scores or on the basis of nonverbal IQ (NVIQ) scores. We look at these in turn.

In most of the studies, individual FSIQ scores of the ASD participants were not reported (n = 28). Some of these studies reported group means or stated that ASD participants were FSIQ-matched to a control group, but since no scores were reported, we cannot draw conclusions from these studies regarding the individual cognitive levels and their relationship to lexical semantic abilities. Eight studies reported individual FSIQ scores of their ASD participants (Dunn & Bates, 2005; Fiebelkorn et al., 2013; Groen et al., 2010; Kamio et al., 2007; Knaus et al., 2008; Lo et al., 2013; McCleery et al., 2010; Speirs et al., 2011; Vogindroukas et al., 2003). Of these studies, half had ASD participants with no intellectual disability as seen by the fact that they were matched to the control group on age and FSIQ score (Fiebelkorn et al., 2013; Groen et al., 2010; Kamio et al., 2007; Speirs et al., 2011). Results on lexical semantic tasks in these studies did not pattern with IQ status, as measured by FSIQ scores; in other words, impaired versus intact lexical semantic performance was not related to FSIQ scores. FSIQ thus did not provide any explanation for when lexical semantic performance is impaired or not in children with ASD.

While for the general population FSIQ scores are considered a good indication of the cognitive abilities of children, research has found this is not always the case for children with ASD. Mottron (2004) found that children with ASD exhibit strengths and weaknesses throughout the different parts of IQ tests, exhibiting an uneven pattern of cognitive abilities, wherein the final FSIQ score does not necessarily represent the cognitive abilities of each sub-test. Nonverbal reasoning was found to be a relative strength for children with ASD as it involves abstract and spatial reasoning and does not involve linguistic, social, or cultural aspects (Mottron et al., 2006). Furthermore, studies on language abilities of children with ASD have chosen to use NVIQ because global IQ measures include language skills, making it problematic to look at the relation between language and cognition. For this reason, in recent years, many researchers evaluating different ASD functions chose to include in their background characteristics NVIQ scores rather than FSIQ scores. Out of the studies under review, eleven studies matched the TD control group and the ASD group on the basis of NVIQ scores; in six studies, the TD children were also matched on chronological age, meaning the ASD participants had normal NVIQs (Dunn & Bates, 2005; Ellawadi et al., 2017; Lo et al., 2013; McCleery et al., 2010; Naigles et al., 2013; Norbury et al., 2010). Out of these studies, two found the ASD participants to have intact lexical semantic abilities, while four found them to have impaired lexical semantic abilities. In four studies, TD language- and IQ-matched controls were younger than the ASD participants, indicating that the verbal abilities as well as NVIQ of the ASD participants could be impaired (Barone et al., 2019; Hani, 2015; Hartley & Allen, 2013, 2014). In two of these studies, lexical semantic abilities of the ASD participants were impaired, while in the other two they were found to be partially intact. No discernible pattern emerges regarding the role of NVIQ in impaired/intact lexical semantic performance.

Task Type

A major candidate in explaining the diversity in reported performance on lexical semantic tasks in children with ASD is, of course, the diversity in tasks used to measure this performance, and the possibility that a certain task (type) may be more difficult/easier for children with ASD. The studies under review included tests measuring receptive lexical semantic abilities, expressive lexical semantic abilities, or both. Most studies administered more than one task; see Appendix Table 1 for full list of tasks used in each study.

Receptive lexical semantic tasks included the following types of tasks: (1) word picture matching tasks, the most popular being the PPVT (Dunn & Dunn, 2007) (in several languages; n = 14); (2) parental questionnaires assessing the vocabulary of children, such as the CDI (Luyster et al., 2007) and CELF (Paslawski, 2005); and (3) semantic priming tasks. The expressive lexical semantic tasks used included the following kinds of tasks in various versions: (1) picture naming tasks, (2) verbal fluency and word association tasks, (3) cued and free word recall, (4) categorization tasks, and (5) other experimental tasks.

Results for the different kinds of tasks did not give rise to any pattern. Both receptive and expressive lexical semantic tasks yielded studies that found lexical semantic abilities of children with ASD to be intact as well as studies that found them to be impaired (see Appendix Table 1). No specific type of task seemed more or less challenging for children with ASD. However, comparing the different tasks is difficult, as each study used a slightly different version of any given task and tested participants of different ages with differing levels of cognitive functioning.

Linguistic Level

The final variable we looked at was the linguistic level of the ASD participants as measured by their performance on other language tasks, either scores from verbal indices or subtasks of IQ tests or scores on linguistic tasks testing other language domains, such as syntax or phonology. As many studies assessing lexical semantics (as well as many other areas of functioning) use lexical semantics scores as a proxy for language levels, we wanted to see whether in fact low lexical semantic scores were linked to other language scores and thus whether these latter could in fact predict the differences between study outcomes on lexical semantic abilities.

In ten studies, children with ASD were matched with a group of TD children according to both age and language level—as measured by NVIQ subtasks, linguistic scores on parental questionnaires, or syntactic measures (and sometimes also on reading level), implying that these children had normal verbal levels. In four of these studies, lexical semantic abilities in the ASD participants were found to be intact, i.e., not different from their TD peers (Begeer et al., 2013; Bowler et al., 2009; Ellawadi et al., 2017; Henderson et al., 2011). In five studies, lexical semantic abilities in the ASD participants were found to be impaired (McCleery et al., 2010; Norbury et al., 2010; Ropar & Peebles, 2007; Singh & Harrow, 2014; Tek et al., 2008). The remaining study (McGregor et al., 2012) differentiated between ASD participants with no syntactic language impairment and ASD participants with syntactic impairment. In this study, the ASD participants with normal syntactic functioning had intact lexical semantic abilities while the group of ASD participants with syntactic deficits were also impaired in their lexical semantic abilities.

We also looked specifically at the 16 studies that found the ASD participants’ lexical semantic abilities to be impaired. In these studies, lexical semantics was tested using different experimental tasks, but the ASD participants also had low scores on normed language measures or low verbal IQ scores (Mullen, WISC). These scores were reported as background measures, but in fact most of them were either solely lexical semantic measures (PPVT, CDI) or include assessment of lexical semantics in a composite score (CELF, CDI, VIQ).

Two studies pointed out differences between subgroups of participants with ASD, wherein children with ASD who showed language impairment (on syntactic tasks) also showed impaired lexical semantic abilities, while children with ASD and no evident language impairment showed good lexical semantic abilities (Hani, 2015; McGregor et al., 2012).

Some interesting related differences were noted. In the studies reporting that the ASD participants had impaired lexical semantic abilities, in most cases, it was not only that ASD participants had lower scores than controls but also that their behavioral patterns seemed to be different: longer response time (Battaglia, 2012), need for more input in order to learn words (Norbury et al., 2010), trouble in relating words to other words (Hartley & Allen, 2013), differences in brain activation to semantic stimuli (Groen et al., 2010), categorization and sorting strategies which seemed to match TD children younger in age (Ropar & Peebles, 2007), over-generalization (Hartley & Allen, 2014), or no extension of category knowledge (Naigles et al., 2013). Several studies also reported different error patterns that were not seen in control groups of age-matched TD children and children with learning disabilities. Löfkvist et al. (2014) found many semantically irrelevant answers and perseverations from previous stimuli that were not seen at all in the TD control group. Vogindroukas et al. (2003) found many global and semantic paraphasia errors. Finally, Ropar and Peebles (2007) found that children with ASD who were given a sorting task were more likely to sort according to concrete criteria and had trouble with abstract concepts.

The studies reporting that the lexical semantic scores of participants with ASD did not differ from those of age-matched TD children also reported that, in almost all cases, error patterns seemed to differ qualitatively (although not always quantitatively) from those of TD children. In some studies, the children with ASD seemed to show error patterns similar to those of younger TD children (Rescorla & Safyer, 2013); other studies found the error patterns of children with ASD to be different from both those of TD children and those of other clinical groups, and thus apparently specific to the ASD group (e.g., Begeer et al., 2013, found that on a verbal fluency task, children with ASD produced bigger word clusters with many more atypical items than TD age-matched children).

Three other studies found no differences in behavioral scores between the ASD groups and age-matched controls, but did observe differences in brain activation patterns in response to semantic stimuli (Dunn & Bates, 2005; Fiebelkorn et al., 2013; Groen et al., 2010), suggesting differences in semantic processing or semantic categorization. Whitehouse et al. (2007) tested children with HFA on a recall task and found no significant differences from age-matched controls on behavioral scores. They also found a non-significant tendency for children with ASD to have better memory skills, which likewise suggests differences in cognitive processing.

Discussion

In the current review, we set out to answer three research questions. First, we wanted to see if there are studies on language in children with ASD which include a specific focus on lexical semantic abilities. Our search criteria led us to 32 empirical research studies conducted over the last 20 years; this is a very small number of studies compared to the overall number of studies published on language in ASD.

Our second research question was, in studies reporting on lexical semantic abilities, how many have found children with ASD to have intact versus impaired performance on lexical semantic tasks? The same number of studies (n = 14 each) reported impaired and intact lexical semantic abilities in ASD participants, adding to the overall picture of contradictory results.

Finally, our third research question was to assess whether impaired versus intact performance on lexical semantic tasks of children with ASD found in these studies could be related to any variable(s) that may explain the apparent divergence between study results. No clear result emerged regarding intact or impaired lexical semantic abilities, a result which certainly meshes with the overall heterogeneity found in the ASD population.

Age does not appear to predict whether lexical semantic abilities are impaired or not. However, it must be emphasized that this question is in need of much more enquiry: not all age ranges have been equally studied, and furthermore, many studies report only group results and the groups include very wide age ranges. Our results showed that the majority of studies were conducted on adolescents with ASD (over the age of 10), some included elementary school children (and typically within a very wide age range), while only five studies included children under the age of 6. In TD children, ages 0–6 years are considered the years in which vocabulary growth is rapid and wide (Rom et al., 2003). Literature on the development of lexical semantic abilities of TD children in these ages is extensive and yet very few studies have been conducted on young children with ASD. It should be noted that the average age of ASD diagnosis in many countries is 3–4 years and so including children in research protocols younger than or in this age would involve participants who may not always have a clear official diagnosis. Our decision to include only children with a full, formal diagnosis was with the aim of eliminating interfering variables that may have an effect on the results. For example, high-risk children (such as siblings of children with ASD) may be getting targeted intervention before a formal diagnosis has been established.

In many studies, the age range of the participants was very wide (see Appendix Table 1). Since in the vast majority of studies only group results were reported, reaching conclusions relative to age was not possible. In sum, there is a need for future studies which focus on lexical semantic abilities in young children, preschool, and school-aged children, and there is a general need for reporting individual results, particularly in studies with wide age ranges.

Next, we tried to assess whether FSIQ or NVIQ could be related to outcomes on lexical semantic tasks. Many studies did not report specific IQ scores, but rather relied on IQ measures obtained while the child underwent a diagnostic procedure, and hence reported only that the child was “high functioning,” meaning the child had no intellectual disability (ID), or “low functioning,” indicating ID. In addition, there seems to be a dearth of studies that have assessed both cognitive abilities and lexical semantic abilities in children with low cognitive abilities. Because so few studies have tested children with both ASD and ID, the current review cannot answer the question of whether there is a link between ID and lexical semantic abilities.

Task type was one of the major variables that we believed could most likely explain the heterogeneity between study outcomes, but here too, no specific pattern of impaired versus intact performance arose. Lexical semantic abilities consist of many intertwined cognitive abilities (see Friedmann et al., 2013, for a detailed account), whereas most tasks used tap into a very specific ability that may or may not be related to other lexical semantic features. It seems that since each study tested a different part of lexical semantic functioning, we cannot conclude from the lack of a pattern that task (type) does not affect study results. For example, a study may have found individuals with ASD to have had trouble categorizing items, but we do not know if these same individuals also had trouble naming pictures. A systematic research protocol is needed in order to evaluate the different cognitive mechanisms involved in expressive and receptive lexical semantic abilities in the same group of individuals.

The last variable we looked at, where possible, was the linguistic level of the ASD participants in other language domains. An intriguing result of this review was that in all studies that found the lexical semantic abilities of children with ASD to be impaired, there seemed to be an overall linguistic deficit, or at least a deficit in some other linguistic domain. However, this conclusion is only weakly supported, for several reasons. The biggest difficulty in investigating this question in the existing literature was that very different scores are reported for other language domains: scores on the verbal subparts of IQ tests, scores on syntactic tasks, scores from parental questionnaires, and scores from omnibus language batteries. These scores were usually reported as background measures and used as a matching criterion between ASD participants and control groups. Another major concern was that many studies used tasks that were themselves lexical semantic in nature (PPVT, picture naming, picture matching) to describe the linguistic levels of the participants, and thus did not in fact provide information about other linguistic domains (syntax, pragmatics, phonology). A study assessing lexical semantic abilities as well as linguistic capabilities in other domains (e.g., syntax, phonology, and pragmatics) would give a much more accurate description of both the lexical semantic and the overall linguistic capabilities of an individual. A further point to consider was that only two studies (Hani, 2015; McGregor et al., 2012) distinguished between children with ASD and normal language levels (as tested by syntactic tasks) and children with ASD and language impairment. Given the heterogeneity of ASD, it is crucial that individual scores and error patterns be examined more closely, as group scores hide enormous individual variation.

To summarize, an in-depth analysis of the different studies reporting on lexical semantic abilities in children with ASD showed that, although the findings were contradictory, the ASD participants shared some underlying characteristics. The most distinct one was that children with ASD, regardless of the score they achieved on the different tasks, seemed to show differences in behavioral patterns (different reaction times, needing more input) as well as error patterns (semantic errors and semantic paraphasias) not seen in control groups. Qualitative differences of this kind have been reported in several previous studies (Begeer et al., 2013; Bowler et al., 2009; Riches et al., 2010; Sukenik & Friedmann, 2018; Sukenik et al., 2021), which have shown that even children with ASD who achieved overall scores similar to those of controls (TD and DLD) produced answers that were very different.

We acknowledge that our search criteria imposed some limitations. The current review focused on lexical knowledge and excluded studies on lexical learning. Previous studies found that the word learning mechanisms of children with ASD may be different and widely affected by their social difficulties (see Abdelaziz et al., 2018; Blume et al., 2021; Kelty-Stephen et al., 2020). The fact that some studies reported that children with ASD displayed unusual behavioral patterns, different from those observed in TD (and other clinical) groups, calls for more studies to assess whether lexical semantic development may be different, even in children for whom it does not seem delayed.

Another important point to consider in future studies is that the current review excluded studies testing children labeled as “at-risk” or “optimal outcome.” The reason for this decision was to allow the review to focus on children with a confirmed ASD diagnosis, which we reasoned would make their background characteristics distinct. Children who are at risk for autism and children who have optimal outcomes are usually very high functioning compared to the general ASD population. Unfortunately, excluding studies on at-risk children also substantially restricted the number of studies including children under the age of 5 years, given that the typical age of diagnosis is 3–4 years.

The main conclusion from this review is that we do not know enough about lexical semantic abilities in children with ASD. In other words, the common practice of using measures of lexical semantic ability (e.g., receptive vocabulary) as matching criteria to controls or as an indication of language level is in need of empirical support. In order to understand whether the development of lexical semantics in children with ASD follows the same trajectories as in TD children, research is needed on young children (under the age of 6) as well as school-aged children, including children from across the spectrum in terms of intellectual abilities. In order to test the idea that children who have a lexical semantic deficit also have a wider linguistic impairment, systematic studies assessing lexical semantics together with morphosyntax, phonology, and pragmatics are needed. Finally, given that most studies assessed a very narrow and specific lexical semantic ability, wide-scope studies that assess different aspects of lexical semantic abilities, both receptive and expressive, are called for.