Introduction

Words are at the heart of the language system. They form the basic level of language processing and constitute a milestone in early language development. The very process of language development starts with infants becoming adept at isolating potential word candidates in the speech stream they are exposed to from early on, and gradually associating these sound segments with meaning (Saffran et al., 1996). A common hypothesis about the process of word learning is that, initially, infants work on identifying the phonological segments corresponding to words (speech analysis/segmentation), and only subsequently map those segments onto meaning. There is evidence to support this idea: very young infants are successful at phonetic learning already between 6 and 10 months (Jusczyk & Hohne, 1997; Kuhl et al., 1992; Polka & Werker, 1994), while meaningful (referential) communication typically starts around 11–12 months, marked by the production of referential gestures and speech (first words) (Bates et al., 1975). The first year of life is devoted to “cracking the speech code”, in Kuhl’s apt description (Kuhl, 2004), and to discovering and making a commitment to the sound structure of the native language. There is additional evidence indicating that word form is leading in early word learning: young children have been shown to learn words from larger phonological categories more easily than words belonging to smaller phonological categories (Newman et al., 2008; Altvater-Mackensen & Mani, 2013), suggesting that word form plays a facilitatory role in that process. From a developmental perspective, lexical knowledge measured as vocabulary size is a major predictor of other language competences and skills, such as grammar development (Bates & Goodman, 1997) and oral and reading comprehension.

There exist a number of theories concerning how words are learned. A common denominator in all approaches is the focus on the mechanisms underlying word learning and their possible weighting. While some theories are mainly concerned with the sensorimotor affordances and perceptual properties of referents out in the world, other theories emphasize the importance of language as a system, and the relations among language units (other words or syntax). More detailed approaches consider differences within language as a system and the ways in which language relates to the world in terms of symbol-to-symbol relations (language-internal), symbol-to-percept (language to world) and object-to-object (world to world) relations (Coventry, 2012; Pace et al., 2016). A third type of account suggests that advances in social cognition are an important prerequisite for word learning, and, in particular, for the learning of abstract words (Carpenter et al., 1998; Bergelson & Swingley, 2013). While these theoretical approaches tend to focus on word learning from a single perspective and without reference to learning experience, other explanations make sense of subtle interactions between cognitive and motor input, couched within multimodal and embodied approaches to language acquisition (Louwerse & Jeuniaux, 2008). Given the focus on individual and grounded learning experience, such approaches appear better equipped to explain the already attested individual variation in language competence. Also, given the multiplicity of factors involved in word learning and the variation in contexts in which words are acquired, more sophisticated accounts are, in addition, moving away from classical sound-to-referent mapping approaches (Pace et al., 2016; Rohlfing et al., 2016; Wojcik et al., 2022).

Current embodiment theories of cognition and language propose that human cognition and language are grounded in experience, and that concepts and their linguistic labels are formed in rich interaction with the world. Embodied theories are not new psychological constructs (see Thelen & Smith, 1994). Barsalou’s (1999) classic account characterizes initial bottom-up activation of perceptual input in sensory-motor areas alongside later top-down processing; together this gives rise to creation of perceptual symbolic systems. Central to the theory is the idea that perceptual experiences are stored in the different sensory or motor neural areas in which they were encoded and thus tend to be activated together. Concerning word learning, perceptual attributes of referents feature heavily in the encoding process. For instance, there is abundant evidence that object shape takes precedence as a cue, and overrides other object properties (e.g., colour, size), when acquiring a word (Landau et al., 1998). The shape bias has thus been recognized as a basic mechanism supporting word learning and is well-attested in research on early language development (Yee et al., 2012; see Vulchanova et al., 2019 for a discussion). Importantly, the shape bias appears to be operative only in the context of an explicit label for the object (Landau et al., 1998). Furthermore, the preference for shape over other properties in the context of naming has been attested already at 15 months (Graham & Diesendruck, 2010), in children (Booth et al., 2005; Imai et al., 1994), as well as adults (Landau et al., 1998). Attention to object shape has also been found to be a reliable predictor of noun vocabulary growth (Gershkoff-Stowe & Smith, 2004; Poulin-Dubois et al., 1999; Smith et al., 2002). There is also neuro-physiological evidence of the role of object shape in early word learning. 
Thus, in a longitudinal study during the dynamic period of vocabulary growth between 20 and 24 months, Borgström et al. (2015, 2016) investigated the role of object properties (shape and object part information) in word-object mapping. In this study, neural responses (event-related potentials, ERPs) recorded when words were primed by object shape at 20 months predicted vocabulary size at 24 months in their sample. In contrast, detached object parts failed to function as word primes, regardless of age or vocabulary size, although the part-objects were identified behaviorally. In addition, participating infants showed relatively poor recognition of the part-objects compared to the shape-objects, which suggests a primary role for shape in object naming.

Other factors and mechanisms highlighted in research are the ability to form categories and taxonomic relations (Markman & Hutchinson, 1984), associative learning skills (Allen, 2012; Bloom, 2000), and the ability to integrate information from multiple modalities (multi-modal integration) (see Russo et al., 2010 for new evidence and discussion).

Despite the complexity of word learning, most children successfully enrich their vocabularies in a rather effortless manner. For some children, however, word learning poses difficulties. Developmental disorders have been shown to affect different aspects of word learning. Children with ASD present with problems in that domain, reflected in both how such knowledge is acquired and how words are used (see Vulchanova et al., 2020 for a review and discussion; however, see also Luyster & Lord, 2009). Despite this evidence, the mechanisms that support word learning remain poorly understood and understudied in autistic children (Haebig et al., 2017). An additional problem is the large heterogeneity observed on the spectrum, with varying degrees of structural language competence, where some of the problems might be obscured in children on the highly verbal end (Vulchanova et al., 2020).

Form is easy, meaning is difficult

Naigles (2002) put forth the hypothesis that, in the process of word learning, the formal aspects of words, such as phonology, are relatively easier to acquire than the mapping of form to meaning. There is indeed evidence that the acquisition of word phonology precedes the acquisition of meaning and semantic aspects of words, both in adult word learning and in early language development. In an adult word learning experiment, López-Barroso et al. (2013) document that the ability to learn new words relies on efficient and fast communication between temporal and frontal areas in the brain. This study also demonstrates that the initial stages of word learning apply primarily to phonological (auditory) aspects of words. Studies of word learning in children show that phonological information is learnt swiftly. Thus, Gaskell and Dumay (2003) show that novel words may activate the representation of the closest real word, rather than developing their own lexical representations, as indicated by the facilitatory effect of phonological competition in their study. However, full integration of newly learnt words with existing items develops at a slower rate.

A number of studies provide evidence of the structure of the early lexicon. Thus, upon hearing a word or seeing an object image, toddlers activate not only the corresponding word itself, but also other words that are semantically (Arias-Trejo & Plunkett, 2009), phonologically (Mani & Plunkett, 2010, 2011) or phono-semantically related (Mani et al., 2012) to the heard word or the label of the image. This co-activation of words in the mental lexicon of toddlers appears to be primarily mediated by the phonological overlap between prime and target labels. In a visual world paradigm priming study of German-speaking toddlers, Altvater-Mackensen and Mani (2013) manipulated the phonological similarity between prime and target word. Onset priming had an interference effect, while rhyme priming had a facilitatory effect on target word recognition (e.g., Hund–Mund; Fisch–Tisch). Based on these results, the authors conclude that word retrieval and recognition are modulated by two processes in the developing lexicon, feature-level activation (phonological features: rhyme overlap) and lexical-level activation (word level: onset overlap), which suggests that parsing of word structure is already developed. These studies suggest that the infant/toddler lexicon develops in a dynamic way, reflecting interaction between the phonological properties of words and their lexico-semantic properties. However, over time, semantic relations take precedence, reflecting later stages of word knowledge when words are not only consolidated, but also integrated in the mental lexicon. These studies also indicate rich interaction with the environment, both the visual world (perception of object properties) and the social world.

The verbal profile in ASD

The verbal profile of autism is characterized by huge heterogeneity, and this is also reflected in relative strengths and weaknesses regarding word learning and word processing in this population (Pickles et al., 2014). In word learning there appear to be specific problems in the integration of novel words with already existing lexical knowledge, in contrast to somewhat heightened sensitivity to formal aspects of words (phonology, morpho-phonology, orthography), and strengths in the initial mapping of novel words to new referents, most likely due to intact associative learning mechanisms (Baron-Cohen et al., 1997; Parish-Morris et al., 2007). Associative learning may in fact be the dominant learning style in some individuals with autism, particularly children with substantial language difficulty (Preissler & Carey, 2004), which would predict problems in conceptual abstraction. Problems with semantic aspects of words are also reflected in qualitative differences in the activation of lexical knowledge for the purposes of language understanding. Despite preserved structural language skills and sometimes strengths in grammar, even highly verbal individuals with autism are faced with problems in figurative language comprehension and display a delayed developmental trajectory (Chahboun et al., 2016).

Strengths in word learning in ASD

Specific strengths which have been documented in research are related to the formal aspects of words, such as their phonological forms, and are especially evident in the process of word learning and initial lexical consolidation. They contrast with poor semantic skills, which are in all likelihood indicative of poorer or atypical lexical integration (Henderson et al., 2014; Norbury et al., 2010). A strength in orthographic processing of the type attested in hyperlexia has also been widely documented (Nation, 1999; Saldaña et al., 2009). Research also documents a specific strength in the domain of inflectional morphology/paradigms (Walenski et al., 2014). The latter strength has been attested both in highly verbal individuals with a talent for language learning (Vulchanova et al., 2012a, b) and in savants whose cognitive and verbal resources may be limited (Smith & Tsimpli, 1995).

Concerning mechanisms that support word learning, participants with autism have been shown to use mutual exclusivity in a way similar to typically developing children (Rescorla & Safyer, 2013; Marchena et al., 2011; Eigsti et al., 2007; Preissler & Carey, 2005), and a strength in associative learning has been attested even in low-verbal children in a series of studies by Allen/Preissler and colleagues (Preissler, 2008). Furthermore, there is evidence that statistical learning is largely intact in children with autism (for adult evidence, see Sapey-Triomphe et al., 2021), as documented in a study which compared children with language impairment to autistic children (Haebig et al., 2017), and confirmed in a meta-analysis (Obeid et al., 2016). Importantly, even the children with ASD and comorbid language impairment in the study by Haebig et al. (2017) did not present with statistical learning problems (however, see discussion in Saldaña, 2022). This study also indicates that word segmentation abilities are associated with word learning in school-aged children with typical development and ASD, but not in children with language impairment.
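The statistical learning and word segmentation abilities discussed above are often operationalized, following Saffran et al. (1996), as sensitivity to transitional probabilities between adjacent syllables: syllable pairs inside a word tend to predict each other better than pairs that straddle a word boundary. The following is a minimal illustrative sketch of that idea, not a reconstruction of any procedure used in the studies cited. The three nonsense words, the stream length, the threshold value and all function names are assumptions made purely for illustration.

```python
import random
from collections import Counter

def transitional_probabilities(syllables):
    """Return P(next | current) for every adjacent syllable pair in the stream."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}

def segment(syllables, tps, threshold):
    """Place a word boundary wherever the transitional probability falls below threshold."""
    words, current = [], [syllables[0]]
    for prev, nxt in zip(syllables, syllables[1:]):
        if tps[(prev, nxt)] < threshold:
            words.append("".join(current))
            current = []
        current.append(nxt)
    words.append("".join(current))
    return words

# Hypothetical artificial language: three trisyllabic nonsense words.
LEXICON = [("tu", "pi", "ro"), ("go", "la", "bu"), ("bi", "da", "ku")]

def make_stream(n_words, seed=0):
    """Concatenate randomly chosen words into a continuous syllable stream."""
    rng = random.Random(seed)
    stream = []
    for _ in range(n_words):
        stream.extend(rng.choice(LEXICON))
    return stream

stream = make_stream(300)
tps = transitional_probabilities(stream)
# Threshold is an illustrative assumption, chosen between within-word and across-word values.
recovered = set(segment(stream, tps, threshold=0.75))
print(sorted(recovered))
```

In this toy stream every within-word transition has probability 1, whereas transitions across word boundaries are spread over three possible continuations, so a simple threshold is enough to recover the word forms. Natural speech is far less regular, and the sketch says nothing about how such forms are subsequently mapped onto meaning.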

Problems in word learning in highly verbal individuals on the autism spectrum

Extant research also documents selective impairments in the word learning domain. Semantic problems reflecting poor subsequent word consolidation and integration have been attested (Henderson et al., 2014; Micai et al., 2019) along with impaired interpretation of word meaning, especially in context (Frith & Snowling, 1983; Happé, 1997). These findings have led scholars to suggest a specific semantic problem in autism. While this may be easily formulated as a problem with meaning in language, it is in need of further specification in terms of the level of language structure at which it manifests and the potential underlying factors or mechanisms which cause it. Thus, within the ‘form is easy, meaning is hard’ thesis, Naigles and Tek (2017) propose that the social difficulties of autistic individuals may compromise the meaning-related components of their language, thus leading to a dissociation between semantic and form-related components. A review of studies included in Naigles and Tek (2017) provides evidence that children with ASD demonstrate significant challenges in the areas of pragmatics and lexical/semantic organization and highlights their good performance on assessments in the domain of grammar (from wh-questions to reflexive pronouns). These authors also point out an important gap in extant research, namely the absence of relevant and parallel lexical-semantic data from children with co-morbid language impairment. Naigles and Tek (2017) report direct comparisons of assessments of lexical/semantic organization and grammatical knowledge collected from the same participants in their own sample. Notably, in those data more children at a given age demonstrate adequate grammatical knowledge than adequate semantic organization. The authors thus call for new research in which in-depth data on grammatical knowledge and detailed data on semantic organization are collected in parallel from the same sample.
In addition, this study supports findings of, sometimes subtle, dissociations between domains of language competence on the autism spectrum (Vulchanova et al., 2015; Vulchanova et al., 2012a, b).

Problems in word learning related to cognitive domains beyond language

Early categorization skills are strongly implicated in the ability to acquire the labels of referents (Gelman & Markman, 1986; Markman, 1989; Twomey et al., 2014). Categorization skills have not been extensively studied across the autism spectrum and across dimensions of language ability. There is some evidence that children with ASD perform similarly to TD on tasks sorting words at basic level and superordinate level (Tager-Flusberg, 1985, 2001), yet they appear to be less sensitive to semantic relationships among words. Children with ASD appear to have sparser semantic networks (Schafer et al., 2013) and overall problems with conceptual knowledge and categorical induction.

There is substantial evidence of poor categorization skills in minimally verbal children with autism. In a series of studies, Allen and colleagues document that children with autism who have limited verbal skills need more exemplars to acquire the category label and cannot generalize based on less realistic (e.g., black-and-white) and impoverished representations (Hartley & Allen, 2015). Thus, while typically developing children generalize labels learned via pictures to real referents regardless of iconicity, children with ASD learn labels associatively (Preissler, 2008), and are more likely to map words to objects when the pictures are coloured (Hartley & Allen, 2015). This suggests that autistic children rely heavily on perceptual similarity between picture and referent for the learning of the respective label. Importantly, these findings suggest that minimally-verbal children with ASD have an atypical understanding of symbolic relationships between words, pictures, and the referents of those words (objects). In the studies by Preissler (2008) and Hartley and Allen (2014a), minimally-verbal children with ASD were taught the names of unfamiliar objects depicted in drawings and photographs. When tested, children had to identify the referents of the newly-acquired words by looking at the pictures and their previously unseen depicted referents. In both studies, the children with ASD displayed a strong tendency to select the picture alone, which indicates failure to understand that the label refers symbolically to the object, while the picture is an iconic representation (which may or may not mediate this relationship). Furthermore, Hartley and Allen (2014a) found that minimally-verbal children with ASD may extend labels from pictures to referent objects based either on shape (a category-relevant cue) or on colour (a category-irrelevant cue). These findings are in contrast with the widely attested shape bias in typically developing children.
These studies show that minimally-verbal children with ASD often display atypical symbolic understanding of pictures despite having comparable receptive language skills to TD controls. This impaired extension by shape in such children may be related to pervasive difficulties with generalization in ASD. Interestingly, a deficient shape bias has also been reported in late-talkers and children with developmental language disorder (Collisson et al., 2015). This deficit may point to atypical interaction between language learning and visual attention. Since the shape bias has been attested as a strong predictor of vocabulary size (Smith, 2009; Yee et al., 2012), it appears likely that slower rates of vocabulary development in minimally-verbal autistic children may be caused by impaired mechanisms which underlie word learning, such as poorer categorization skills, a deficient shape bias, and a tendency to attend to irrelevant cues (e.g., colour). This evidence suggests poor symbol skills across the spectrum, which is also evident in problems in the processing of non-literal (figurative) language in highly-verbal individuals and the tendency to interpret figurative stimuli overly literally (Happé, 1995; Kalandadze et al., 2018; Morsanyi & Stamenković, 2021).

There is also evidence that children with ASD can use relative size to infer picture-referent relations in the absence of perceptual resemblance. However, they linked the abstract picture to a perceptually related distractor rather than the intended referent in the study by Hartley and Allen (2014b). In contrast, TD children can use relative size to infer representational status, and link this to the correct real-world referent. Thus, it seems that children with ASD follow a realist route focusing on perceptual similarity, while typically developing children follow an intentional one, relying upon social cognition.

Embodied approaches to symbol formation

The differences in word learning in autism highlighted above, particularly at the semantic level, may be useful to consider from an embodied, sensorimotor approach to language and cognition (Eigsti, 2013). Embodied approaches posit that perceptual experience is encoded alongside contextual cues and stored in olfactory, auditory, motor and visual systems on a neuronal level (Barsalou, 1999). For instance, imagine a child’s first encounter with a monkey in a zoo where a parent gleefully points to a creature in a tree and directs her child’s attention to it (‘look, a monkey!’). Several aspects of this simple interaction may be simultaneously encoded. The word-referent relation may be encoded alongside a particular smell or high-pitched monkey call, which would then be activated upon subsequent recall. As the word becomes entrenched in the lexicon, various aspects are encoded alongside the label (e.g., furry, small, fast, brown, loud), and with repeated exposure to words and referents alongside their sensorimotor context, categorical links are formed. Ample evidence of priming effects and data from psychophysiological measures support the link between sensorimotor information and semantic processing (see Meteyard et al., 2012 for a review), alongside complementary evidence from apraxic populations where action and manipulation are disentangled (Myung et al., 2010).

How well can an embodied approach explain linguistic differences in autism? A well-known feature of autism is pervasive difficulty in the motor domain (Bhat et al., 2011; Whyatt & Craig, 2013) with particular regard to planning (Hughes, 1996). These motor deficits have been linked to differences in acquiring and using conceptual knowledge (Eigsti, 2013; Moseley & Pulvermuller, 2018). Davis et al. (2022) explain that sensorimotor experience, such as manipulating objects, which is so vital for early conceptual knowledge in typical development (Gibson, 1977; Pexman, 2019), may be detached and uncoupled from concept formation in autism. This would have direct effects on how motoric features of concepts are embodied, and disrupt the categorical formation that typically arises from repeated simulations of schematic representations of perceptual elements (Barsalou, 1999, 2008). Evidence for this comes from an eye-tracking paradigm (Davis et al., 2022) in which autistic traits, measured by the Autism Quotient in a typical adult population, were related to object-concept representation. Individuals with higher autistic traits were less likely to look at a semantically related distractor that activated an overlapping concept (e.g., faucet when they heard the word jar, which can be physically manipulated in a similar manner). Further evidence of failure to use motor information for the purposes of encoding novel stimuli in participants on the autism spectrum has been attested in the study by Eigsti et al. (2015). This study demonstrated reduced or absent embodiment effects in the participants with autism, reflected in the fact that motor states involved in the encoding of novel stimuli were not re-activated upon later encounter of the same stimulus. Another study by Linkenauger et al. (2012) has linked the ability to estimate one’s own successful spatial navigation (such as being able to grasp an object or reach through a hole) with language and communication skills in adolescents and adults with autism. This suggests a direct relation between motor representation and linguistic ability.

At the level of word learning in autistic populations, it may be the case that stimuli are encoded on perceptual levels, but without the surrounding sensorimotor representations that allow for later abstraction. In a sense, stimulus encoding may ‘get stuck’ at the level of perceptual features and not be bound with the sensorimotor information that is necessary for conceptual formation to take place. This aligns with impaired categorization and reliance on sometimes incorrect perceptual features (as in a preference for colour rather than shape in Hartley & Allen, 2014a, b) and delays in shape bias formation (Field et al., 2016; Tek et al., 2008). Potrzeba et al. (2015) showed that individual differences exist in the use of a shape bias by children with autism, which related to overall language ability concurrently and longitudinally across a 20-month interval (see also Abdelaziz et al., 2018). Furthermore, Tek et al. (2008) have documented failure in 2- to 4-year-old children with autism to use shape information in a novel word extension task which was repeated four times in the course of a year.

A reduced/altered shape bias in children with autism can be viewed from an embodiment perspective in the context of altered interaction with objects and motor deficiencies/absence of motor encoding. The seminal study by Pereira et al. (2010) indicates that object manipulation and close viewing may be an integral part of acquiring the label associated with that object. It can be thus stipulated that altered motor interaction with objects might also impact on how the sound pattern/word form-meaning package which represents the word is encoded.

Young children at risk for autism are also unable to use feedback to guide successful long-term word retention in situations of referential ambiguity (e.g., mutual exclusivity). This difficulty using feedback is related to receptive vocabulary (Bedford et al., 2013) and suggests that individual differences in encoding strategy, and in turn in embodiment, are also likely to impact language abilities.

The symbolic

Other differences in language acquisition can be attributed to the structure of the system itself. Language is, without doubt, one of the most advanced symbolic systems in human culture. Such systems are characterized by an indirect (or arbitrary) relationship between the sign and its object of reference out in the world. One debate concerning words as symbols is the extent to which users understand the relationship between the word and its referent as symbolic or not (Allen, 2012). On a symbolic account, the symbolic relationship between a linguistic label and its referent is meaningful and is used and interpreted by speakers purposefully and intentionally (Bloom, 2000). Symbolic knowledge of pictures or words requires an individual to have a mental representation of the relationship between a picture/word and its corresponding real-world referent; this is because very often pictures and words stand for, and are used to signify, items that are out of sight (DeLoache, 2004). In this way, symbolic knowledge differs from associative learning, whereby words and objects can be related associatively as pairs. Preissler and Carey (2004) provide evidence that already at 18 months infants understand pictures and words as symbols. The contrast between building an associative link and establishing a symbolic link is further strengthened by the nature of language as an intricate network of multiple symbolic representations.

In this line of thinking, the relationship between a word and its denotatum is not an isolated relationship, but rather a function of the relationship the word token has to other words/tokens in the system (Deacon, 1997). This property of language was recognized as early as European structuralism in defining language as a system where everything holds together (‘où tout se tient’) (e.g., Meillet, 1903). Deacon (1997) provides a schematic model of the construction of symbolic referential relationships as arising from indexical relationships (Fig. 1). This process starts with the individual learning of indices through associations between individual objects of reference and their signs (tokens). In a transitional stage, systematic relations between the tokens are learned as additional indices. Finally, during the symbolic final stage, the load of reference is shifted and relies on the relationships among tokens, in order to pick out categories of objects, rather than individual referents.

Fig. 1

Reproduced original Fig. 3.3 from Deacon (1997). A schematic depiction of the construction of symbolic referential relationships from indexical relationships

Given that this symbol formation schema is inspired by human cognition and the memory system, it can be applied to the process of word learning tracing down the same stages outlined in Fig. 1. It can be stipulated that, initially, associations are formed between individual referents and their word labels (the indexical relations in Fig. 1). In a second step, and as individual vocabularies increase in size, semantic relationships are formed between words as part of the lexicon, and lexical networks begin to form and grow in density. These are relations between words (symbols). Finally, the relationship between words and referents out in the world is organized into a lexical network whereby the relations between words and referents are mediated by the semantic relations among words in the lexicon and the categorical structure imposed on referents out in the world by the existence of word labels. This latter stage highlights the tight interrelationship between linguistic labels (tokens) and the concepts they denote. The well-attested problems in word learning and use in autism can be thus rationalized within this type of framework. Thus, the tendency for associative linking between a concrete referent and its label documented in children with autism maps onto the first stage of symbol construction. The proposed semantic deficit can be explained as arising during the transitional and the final stages of symbol construction due to problems in establishing the logical and systematic relationships between the tokens (words) and using this system to pick out categories, rather than individual objects of reference. Put in other words, the lexical networks in autistic individuals can be sparse and without dense connections between words in the system, but rather build on direct associative links between individual words and their referents, without system-internal links between the symbols. 
The latter property of linguistic reference is well-attested in typical development, as evident in the effect of labels on categorization. In a seminal experiment by Waxman (1998), 13-month-old infants were more successful in picking out an object in the same category when the object was named by the experimenter, as in the instruction “Look at this blick, look at this blick! Find me another blick”. In contrast, when the object was not named by an explicit label, but referred to as “this one”, infants failed to form the category. Furthermore, this reasoning is consistent with findings of sparser lexical networks in children with autism. The study by Schafer et al. (2013) established that High Colorado Meaningful words were underrepresented in the comprehension vocabularies of 2- to 12-year-olds with ASD. Given that the Colorado Meaningfulness test measures how many words can be associated with each of the test words, this study provides evidence of poorer semantic associations between words and weaker links in the lexical networks in autism. Indeed, Norbury et al. (2010) provide evidence of initial word consolidation (based on phonological information), but poorer or atypical lexical integration in their sample of children with autism, which suggests problems in integrating novel words with already existing lexical knowledge.
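The contrast drawn above between a lexicon built on isolated word-referent associations and one in which words are also linked to each other can be made concrete with a toy network. The sketch below is only an illustration under stated assumptions: the five-word vocabulary, the particular semantic links, and the helper functions `density` and `associates` are hypothetical, and the count of associates is merely a crude analogue of a meaningfulness score, not the Colorado Meaningfulness measure itself.

```python
def density(words, links):
    """Share of possible word-word pairs that are actually linked (0 = none, 1 = all)."""
    possible = len(words) * (len(words) - 1) // 2
    return len(links) / possible if possible else 0.0

def associates(word, links):
    """Words directly linked to `word` in the lexical network."""
    return {b if a == word else a for a, b in links if word in (a, b)}

# Hypothetical toy vocabulary. Each word has its own word-referent (indexical) link.
words = ["dog", "cat", "bone", "milk", "ball"]
referents = {w: f"<{w}-object>" for w in words}

# Indexical stage: labels are tied to individual referents only; no word-word links.
indexical_links = set()

# Symbolic stage: reference is additionally mediated by relations among the words.
# These particular links are invented for illustration.
symbolic_links = {("cat", "dog"), ("bone", "dog"), ("cat", "milk"), ("ball", "dog")}

print(density(words, indexical_links), associates("dog", indexical_links))
print(density(words, symbolic_links), sorted(associates("dog", symbolic_links)))
```

On this toy picture, both lexicons contain the same five labels and the same five word-referent pairings; they differ only in whether the labels are connected to one another. A lexicon of the first kind would support naming of individual referents while offering few associates per word, which is the pattern the account above attributes to sparse lexical networks.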

Priming experiments suggest that words are tightly linked with their neighbours both formally, based on similarity of sound, and semantically, based on similarity of meaning (Altmann, 1998; Sperber et al., 1979). Given the structure of the lexicon, if words are learned as mere associations between label and referent/concept, rather than within a tightly interwoven lexical network, it can be predicted that children with autism will not benefit from either phonological or semantic similarity in the process of word learning, a hypothesis which can be tested empirically in future work.

Furthermore, problems in symbolic thinking have been attested both in linking picture representations of objects to their real-world referents (Hartley & Allen, 2014b) and in extending labels acquired in the context of a picture to the real-world referent (Hartley & Allen, 2014a). The symbolic impairment account is also highly consistent with the well-attested problems in figurative language understanding, in particular with metaphors and idioms, where secondary symbolization may be assumed to take place (e.g., by extension from the literal meaning). Further, independent support for the symbolic deficit in autism can be found in the well-documented problem in deictic gesture production in autism (Baron-Cohen, 1989; Goodhart & Baron-Cohen, 1993; Iverson et al., 2017; Manwaring et al., 2018; Mastrogiuseppe et al., 2015; Ramos-Cabo et al., 2021; Stone et al., 1997). Deictic gestures are inherently communicative and intentional. According to Werner and Kaplan (1963), infants' communicative pointing denotes an important first step toward true symbolic understanding. Ramos-Cabo et al. (2021) document that deictic gesture production in a semi-structured elicitation task differed not only quantitatively but also qualitatively between their sample of children with autism and matched typically developing children. A sophisticated pointing gesture is characterized by an extended index finger and absence of contact with the referent, which reflects the symbolic nature of the act of referring by pointing. In this sense, deictic gestures are similar to language and can be used to replace reference by words as part of an integrated system for communication (McNeill, 1992).
The symbolic nature of deictic gestures is also evidenced in the systematic predictive relationship between pointing behaviour and language in early development (Cochet & Vauclair, 2010; Liszkowski & Tomasello, 2011; Ramos-Cabo et al., 2019, 2022). Notably, in the study by Ramos-Cabo et al. (2021), the children with autism produced significantly fewer pointing gestures with an extended index finger and fewer gestures characterized by absence of contact with the referent. This finding suggests a lack of sophistication in deictic gesture production, in all likelihood due to a delay in understanding the symbolic nature of deictic behaviour.

Conclusions

Extant research provides evidence of an interesting word-learning profile in autism, not surprisingly characterized by dissociations. On the one hand, a number of strengths have been documented, specifically on the ‘nuts and bolts’ side, concerning initial word form encoding and morpho-phonological aspects of words (inflections); on the other, problems have been documented with acquiring the meanings of words. The latter problems, often referred to as semantic problems, may be caused by atypical reliance on a number of underlying mechanisms, such as impaired categorization skills, impaired symbol formation, and attention to irrelevant perceptual cues (colour) coupled with a deficient shape bias. From this evidence it can be hypothesized that autistic children may actually be more immediately grounded in reality, as suggested by the heightened attention to perceptual features (both auditory and visual properties of objects), but poor at symbol operations which result from the coupling and integration of sensorimotor information in the semantic circuits in the brain. Deacon (1997) suggests that symbol formation is dynamic and resides in the association between the vehicle (sound segment, in the case of word learning) and specific features of the denotatum, but not in properties of either in isolation. Thus, symbol formation may be difficult as a result of poor information integration and of “being stuck” at initial stages of symbol construction, where visual perceptual properties are the strongest cues. Language is a symbolic system which mediates between the world and human conceptualization of this world. The intra-systemic relations among symbols within that system (lexical networks) are symbol-to-symbol relations. Atypical symbol operations in autism may compromise the emergence of this system and its functioning in both word learning and language comprehension.

Going forward, it will be important to investigate relations between motoric function, priming effects and performance on behavioural language tasks investigating core language mechanisms, such as the shape bias or categorization, in populations diagnosed with autism. Future research should also establish to what extent an embodied cognition-inspired account, which assumes an affordance impairment (Davis et al., 2022; Eigsti, 2013), can go beyond concepts arising in the context of manipulable objects and explain abstract concepts, such as justice, thought and the like. It has been suggested, for instance, that abstract concepts might be acquired in a way similar to concrete ones via a process of metaphor activation (Dijkstra et al., 2014). Given the well-documented problems in metaphor comprehension in individuals on the autism spectrum, future designs will need to study these two domains in comparison and in terms of their interrelationship.