Introduction

The differential diagnosis of motor speech disorders in children is attracting increasing interest in the speech-language pathology literature. In the past few years, several significant papers have been published challenging clinicians’ current motor speech practice and beliefs. This increase can partially be attributed to international efforts to standardize terminology, led by the speech subcommittee of the International Association of Communication Sciences and Disorders, which resulted in a series of papers about diagnosis and diagnostic labels (e.g. [1••, 2•]) and the 2022 Childhood Apraxia of Speech Research Symposium (e.g. [3]). These papers have coincided with work describing current clinical use of diagnostic terminology in paediatric speech sound disorders and confidence in such diagnosis (e.g. [4, 5••]). This paper describes the process of speech production, common speech sound disorders, and newer assessment processes and theoretical considerations.

For readers less familiar with the processes involved in speech production, we provide a general description of speech sound disorders. Here, we are discussing children’s speech production, accuracy, and clarity rather than their language skills or cognitive capacity for communication. Within this subcategory of human behaviour, speech can be thought of as the acoustic end point of a series of neurocognitive processes followed by coordinated anatomical movements.

The Speech Chain

In brief, before a word can be spoken, a series of cognitive and motor processes are required (see [1••] for a summary). A thought is transformed into words and sentences and arranged in the correct order within the cognitive-linguistic system. These words and sentences are then mapped to sounds that are used within the language of the speaker via the phonological system [6]. From there, these ideas of the sound are matched with stored movement memories for sounds and words known as speech-motor plans or schemas [7]. These motor plans are combined and adjusted to account for the combinations of sounds and words and the context in which the word is being said [8]. At that point, the adjusted plan (or program) is forwarded neurologically to the articulators and stored in working memory for immediate comparison with the subsequent movement as an internal reference of correctness [9]. Speech then happens when the combined contraction and release of more than 100 muscles [10], in a coordinated sequence, modifies exhaled breath to make a stream of sound which we recognize as meaningful.

A breakdown anywhere in this speech chain can cause the speaker to be imprecise or inaccurate, potentially leading to reduced intelligibility and reduced communicative participation which, if unresolved, may have flow-on effects in the form of a lifetime of adverse social, academic, and occupational outcomes [11, 12].

Historically, speech disorders and associated treatments have been broadly separated into those with a cognitive-linguistic basis and those with a motor-speech basis (see [2•] for a review). The primary cognitive-linguistic speech disorder is a phonological speech sound disorder and results in the child unconsciously selecting incorrect sounds. This can manifest as only having a small repertoire of sounds, a consistent substitution of one sound or type of sounds with another, or as inconsistent substitutions resulting from unstable underlying representations of the sounds [13]. Mild-moderate phonological speech sound disorders are common and appear as patterns of sound substitutions and omissions and are readily addressed through speech therapy [14]. Severe-profound phonological disorders are less frequent and may be hard to distinguish from other severe-profound, motor-based speech disorders.

Motor speech disorders are the second group of speech disorders in children and include articulation disorder, speech-motor delay, childhood apraxia of speech (CAS), and childhood dysarthria [15, 16]. Articulation disorder is a difficulty with one or two speech sounds arising from a simple mislearning of the affected sound or anomalies with the actual structure of the articulators (i.e. palate, pharynx, tongue, teeth, lips, and cheeks). Once the neurological signal is transmitted to the articulator muscles, the shape, resting position, and intactness of the articulators will influence the accuracy and quality of the sounds produced. For example, children with any type of cleft (i.e. repaired or otherwise) may have difficulty achieving appropriate closure of the velopharynx and thus produce muffled, nasal sounding speech that is inaccurate.

Speech-motor delay is identified when a child’s speech-motor skills are slower than expected to develop but they do not meet the criteria for either CAS or childhood dysarthria [16]. Speech-motor delay often co-occurs with other developmental disorders and may represent broader gross and fine motor difficulties, or it may be a standalone delay. The remaining motor speech disorders, CAS and childhood dysarthria, are often difficult to differentiate from one another [5, 17]. These disorders operate on speech production in the neuro-cognitive sequence described previously and generally are not fully resolved across the lifespan [11, 12]. CAS is a central nervous system disorder of speech motor planning and programming. This results in difficulty coordinating the articulators involved in speech movements, leading to speech that may sound ‘robotic’, distorted, and inconsistent [18]. Children with CAS present with a range of severities and in many cases also have other communication or developmental conditions or disabilities [19]. Children with severe CAS may have difficulty producing any speech. Causes of CAS are diverse but include syndromic presentations, de novo genetic variations, and idiopathic cases where no cause can be determined [20].

Childhood dysarthria is a motor speech disorder resulting from a central or cranial-nerve disruption that prevents the developed speech motor plan from being executed accurately at an appropriate speed. Failure of the neurological signal to be transmitted accurately to the articulators results in imprecise, slow, or effortful speech leading to inconsistent accuracy and unintelligibility [21, 22, 23]. According to the Mayo Clinic auditory-perceptual approach [24], childhood dysarthria results from breakdown in any or all of the four speech subsystems (respiration, phonation, resonance, and articulation). Childhood dysarthria is frequently associated with a known condition such as cerebral palsy, Down syndrome, stroke, or brain injury, but it can also occur seemingly idiopathically and, in rare instances, may be an early indicator of a degenerative neurological condition [25, 26].

Diagnosing children with a specific speech sound disorder is not as simple as matching the child’s speech features (phenotype) with one of the speech sound disorder subtypes. The symptoms of different disorders overlap, and the same speech feature can result from several different breakdowns within the speech chain. For example, excess nasal resonance may occur because the child does not understand when to use nasal speech and when to use oral speech (phonology), but it may also occur because of motor planning difficulties (as in CAS), motor execution difficulties (as in childhood dysarthria), or a structural abnormality of the palate.

How Should a Speech Diagnosis be Made?

To make a diagnosis, speech-language pathologists use clinical reasoning to evaluate a range of signs and symptoms which are both directly observed and reported by others. These are obtained using procedures that may include case history, analysis of speech errors in single word and connected speech production, evaluation of the structure of the speech anatomy and the function of the relevant cranial nerves, and physiological measures, including those of respiration and phonation [4, 27]. Clinicians may then conduct additional assessments to confirm or refute their diagnostic hypothesis. Children with anatomical or physiological findings that are inconsistent with their case history are usually referred for further investigation (e.g. to an ear, nose, and throat specialist). Returning to the previous example of a resonance disorder diagnosis, observation of a submucous cleft or fistula, repeated ear infections, or consistently excess nasal airflow during speech may be indicators for referral for medical review and further diagnostic investigation.

Stand-Alone vs. Co-occurring Speech Disorders

Although separate in underlying cause, many speech disorders may result in the same speech errors (phenotype). For example, Iuzzini-Seigel et al. [5••] reported the challenges of the overlapping symptoms in CAS and childhood dysarthria, such as vowel distortions and difficulties with resonance and prosody. Children may also have more than one speech disorder simultaneously with either separate causes or with a common causal pathway [1••], and it can be challenging for clinicians to disentangle these. Murray et al. [17] reported that community-based clinicians referred 72 children with suspected CAS for inclusion in a research trial. After telephone screening, 47 children were assessed, and of these 28 had CAS alone, and 4 had CAS plus another speech disorder. Of the remaining 15 children, three had previously undiagnosed sub-mucous cleft palates rather than CAS, two had childhood dysarthria alone, and various phonological disorders rather than CAS accounted for the remainder (n = 10). The question that arises from Murray et al.’s [17] paper is why clinicians identified dysarthria, structural anomalies, and phonological disorders as being CAS and what can be done to improve their differential diagnosis and diagnostic confidence.
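The referral figures above can be tallied as a quick arithmetic check. The following is a minimal sketch in Python that uses only the counts quoted in this section from Murray et al. [17]; the variable names and category labels are illustrative assumptions rather than terms from that study, and the reasons why some referred children were not assessed after screening are not given here.

```python
# Counts as quoted in the text from Murray et al. [17].
# Variable names and category labels are illustrative, not from the study.
referred = 72   # children referred by community clinicians with suspected CAS
assessed = 47   # children assessed after telephone screening

outcomes = {
    "CAS alone": 28,
    "CAS plus another speech disorder": 4,
    "sub-mucous cleft palate rather than CAS": 3,
    "childhood dysarthria alone": 2,
    "phonological disorder rather than CAS": 10,
}

# The reported categories should account for every assessed child.
assert sum(outcomes.values()) == assessed

cas_confirmed = outcomes["CAS alone"] + outcomes["CAS plus another speech disorder"]
cas_not_confirmed = assessed - cas_confirmed
not_assessed = referred - assessed

# Percentages of assessed children, rounded to one decimal place.
pct_confirmed = round(100 * cas_confirmed / assessed, 1)
pct_not_confirmed = round(100 * cas_not_confirmed / assessed, 1)

print(f"CAS confirmed: {cas_confirmed}/{assessed} ({pct_confirmed}%)")
print(f"CAS not confirmed: {cas_not_confirmed}/{assessed} ({pct_not_confirmed}%)")
print(f"Referred but not assessed: {not_assessed}")
```

On these figures, roughly one in three of the assessed children referred with suspected CAS received a different diagnosis, which is the gap the question above is concerned with.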

Unfortunately, researchers in children’s speech disorders have tended to report on one speech diagnosis at a time, rather than all diagnoses present, particularly in treatment papers. For example, authors reporting treatment studies describing dysarthria secondary to cerebral palsy have either not discussed other speech diagnoses the children may have had or have excluded children with multiple speech diagnoses from the research [28, 29]. In a recent study, Murray and colleagues [3] explored CAS diagnostic reliability by inviting CAS researchers to decide whether CAS was present or not in audio or video recordings of children with and without CAS. Individual raters were moderately reliable with their own judgements but not reliable with others. In fact, the highest inter-rater reliability was achieved when identifying that a child did not have CAS. It should be noted here that a significant flaw in this study was that children were assumed to have a single CAS diagnosis or not, rather than a mixed presentation with the potential for multiple, interacting diagnoses.

Thus, researchers primarily rely on clear examples of each category or diagnosis, whereas in clinical practice, children often have multiple overlapping symptoms and conditions. This real-life complexity is poorly represented in the research to date. When it is present clinically, clinicians therefore have little evidence to guide practice and rely instead on clinical reasoning and sequential hypothesis development through therapy trials to establish a diagnosis.

Evidence-Based Practice in Speech Diagnosis

Evidence-based practice in speech diagnosis is reported to be difficult for clinicians because of a lack of time and limited confidence to try new ways of working [30]. Many clinicians report insufficient time for assessment, analysis, and diagnosis of childhood speech disorders [27, 31]. This may result from a shortfall between demand for services and the availability of speech-language pathologists and other limits on services (e.g. individual education plan goals focused on therapy rather than detailed assessment, insurance limits). Clinicians may feel pressured to begin therapy so will treat the most obvious speech impairments without a more thorough diagnosis being made [27]. Furthermore, tools to assist with the analysis of complex phonology (e.g. non-linear phonology [32], CHIRPA [33]) are used infrequently [27]. Instead, clinicians tend to rely on readily available materials such as commercially produced or in-house developed single word lists for assessment tasks, and consequently diagnosis may be based on limited evidence [27]. For example, clinicians in the Netherlands do not routinely complete thorough diagnoses because of difficulty in obtaining the case history, especially for clinicians working in schools [27]. A recent study (Staples, submitted) [34] suggests that these time pressures are more acute for less experienced clinicians, with more experienced clinicians conducting more speech assessment tasks and taking longer to resolve their diagnostic questions.

Accurate Diagnosis Is Important

There are several reasons why accurate diagnosis of speech disorders is important both theoretically and clinically. From a theoretical perspective, understanding causal pathways and interactions along the speech chain may inform our understanding of the disorders, their prognosis, and the development of new treatments. Historically, speech disorders have been diagnosed through observation of their phenotype [35], that is, the combination of clinical signs and symptoms which best match an agreed label without regard to the underlying causal pathway [1••]. In some instances, this has been strongly influenced by the child’s medical diagnosis, so a child with speech difficulties and cerebral palsy may be de facto assumed to have dysarthria. This is despite clear evidence that children with cerebral palsy may have CAS and/or phonological disorder in addition to, and separate from, any dysarthria [36]. For these children with childhood dysarthria, the presence of CAS or phonological disorder may impact on intelligibility more than dysarthria alone [37].

Clinically, a diagnosis that considers all possible impairments in the child’s speech should be regarded as superior to one that considers only the primary diagnosis [1••]. Firstly, all speech impairments can impact on intelligibility. Secondly, if only one speech diagnosis is taken as the basis of therapy, selecting the speech impairment based on the phenotype may not reveal the speech impairment which is most influential in intelligibility or accuracy. As described above, a child with cerebral palsy may have dysarthria, but the main cause of the child’s speech difficulties may be CAS. If dysarthria were selected based on their cerebral palsy phenotype only, without consideration of CAS, the intervention may lack efficacy or efficiency. Finally, the cumulative impact of multiple speech disorders within the speech of an individual child is poorly reported but could be expected to be additive [38], if not multiplicative, in reducing accuracy and intelligibility.

Diagnosis Should Lead to Treatment

Despite the concerns outlined above, selection of speech intervention appears most usually to be based on the primary diagnosis alone. For instance, children with CAS are likely to receive the Nuffield Dyspraxia Programme (NDP) [39] in the UK and Australia, or Dynamic Temporal and Tactile Cueing (DTTC) [40] or the Kaufman Speech to Language Protocol [41] in the United States and Canada [42, 43], while children in the United States with dysarthria associated with cerebral palsy typically receive either the Speech Systems Approach [44] or Speech Intelligibility Treatment (SIT) [45].

Although these interventions have shown effectiveness in their targeted diagnostic groups, there is emerging evidence that motor speech treatments have efficacy across different diagnostic groups. There are three prominent examples (in order of depth of evidence): Rapid Syllable Transition Treatment (ReST) [46], DTTC [40], and Prompts for Restructuring Oral Musculature Phonetic Targets (PROMPT) [47]. ReST is an evidence-based speech motor intervention designed for children with CAS which also shows promise in improving dysarthric speech in children with cerebral palsy [28, 29]. DTTC was designed for use with children with CAS [40]; however, the literature includes children with a range of syndromes associated with dysarthria. PROMPT is a motor-speech intervention that has good evidence with children with motor speech delay [47], fair evidence with children with dysarthria [48], and emerging evidence with children with CAS [49].

It is assumed that using the principles of motor learning in treatment is effective for all motor-speech disorders [7] and would be ineffective for a cognitive-linguistic speech disorder [14]. Motor-speech interventions which use the principles of motor learning, such as ReST [46], DTTC [40], and PROMPT [50], focus on the repetition of whole words and phrases, while the child adjusts any speech errors, including any distortions, to achieve an accurate production of the word. Producing many repetitions of a speech target in a structured intervention indirectly targets multiple neurological pathways, activates all processes within the speech chain, and facilitates the development of a new generalized motor program for the target production, adjusting the stored schema of the word. Clinician feedback to the child on overall speech accuracy during therapies such as ReST, DTTC, and PROMPT then enables the child to develop an internal reference of correct production [7], which in turn leads to increased accuracy of subsequent productions, development of self-correction, and associated neuroplastic changes [51].

Confidence in Motor Speech Diagnosis

Although the diagnostic features of both CAS and childhood dysarthria are reasonably well documented, speech-language pathologists report diagnostic uncertainty and low confidence in diagnosing these conditions [5••]. For example, Randazzo [52] reported that clinicians with specialist CAS practice in the United States were confident in diagnosing standalone CAS in children; however, they felt less comfortable making a diagnosis where a child may have either multiple speech disorders or other non-speech conditions. This diagnostic uncertainty can be seen in the research literature and clinically in labels such as “suspected CAS” (e.g. Highman et al. [53], Randazzo [52]). Given that choice of intervention generally depends on confident diagnosis, this leaves the speech-language pathologist in a dilemma which is often solved by resorting to well-described but less well-evidenced general interventions such as articulation therapy or hybridization of multiple interventions designed for a range of disorders [e.g. 43, 54]. Consideration of the capability, opportunity, and motivation model of behaviour change (COM-B) [55] would suggest that clinicians may be reluctant to use motor-based treatments because of this diagnostic uncertainty and continue using therapies which are familiar and routine [30].

Steps to Reduce the Diagnostic Uncertainty

The recent literature on diagnosis of speech disorders in children has attempted to reduce the diagnostic uncertainty described above and address the real-life complexity of children potentially having multiple diagnoses, to provide clinically functional tools and guidance, and to impose an over-arching framework for diagnosis which is based in theory.

Multiple Diagnoses

Iuzzini-Seigel et al. [5••] introduced a process of differential diagnosis using the Profile of Childhood Apraxia of Speech and Dysarthria (Pro-CAD), a clinically oriented flow chart for differential diagnosis of CAS, dysarthria, and a combination of these based on speech characteristics. Firstly, the Pro-CAD tool identifies both the CAS-only and the dysarthria-only features present in the child’s speech. These are then used to determine whether either diagnosis can be ruled out, which diagnoses need further testing, and which diagnosis or diagnoses are most likely. Like the recent Verbal Motor Production Assessment for Children-Revised [56], the Pro-CAD allows for the possibility of a child having a dual speech diagnosis. Previous tools such as the Dynamic Evaluation of Motor Speech Skill (DEMSS) [57] typically focused on identifying the presence or absence of only one speech sound disorder in their scoring systems, which is CAS in the case of the DEMSS.

As a diagnostic tool that has clear clinical utility, the Pro-CAD makes a valuable contribution. It is not, however, without limitations. Firstly, the scope of its reliability and validity assessment is limited. To date, inter-rater reliability has only been established between the three authors of the tool, and the validity evaluation was conducted on 26 children with either CAS or epilepsy, rather than a broad range of children with high frequency dysarthria-associated conditions such as Down syndrome and cerebral palsy. Secondly, in the Pro-CAD, dysarthric features are determined based on auditory-perceptual features alone rather than oral-facial structure and function.

The next step in the development of improved tools for differential diagnosis of motor speech disorders needs to include the other developmental speech disorders (i.e. motor-speech disorders of articulation and speech-motor delay; and cognitive-linguistic phonological disorders). Children with speech disorders need clinicians to be able to consider children’s speech development using a unified tool which contemplates the potential of multiple, interacting diagnoses.

Theoretical Understanding of Speech Diagnosis in Children

As described previously, speech disorder diagnosis has historically been based primarily on the child’s speech phenotype. Two alternative perspectives have emerged this century: one is the use of genotyping, and the other is neuro-cognitive modelling of speech systems and their disorders.

Genetic Understanding of Speech Diagnosis

While it is beyond the scope of this review to detail the diversity of genetic information now available, the literature describes numerous new syndromes and de novo copy number variants (micro deletions and duplications) associated with speech disorders in children (see [20] for a review). The existing evidence suggests that children with genotypic speech disorders should be expected to have multiple overlapping diagnoses. For example, people with FOXP2 variants that lead to speech disorders may have both CAS and childhood dysarthria and may have other speech, language, intellectual, and psychological difficulties [58]. Lauretta et al. [59] argue that speech-language pathologists will need to become knowledgeable in the underlying genetic causes of speech disorders in children.

Modelling of Speech Disorders

Littlejohn and Maas [1••] reviewed speech disorder categorization through a series of theoretical models. They explained the limitations of the current diagnostic process and proposed a shift in thinking away from broad diagnostic categories and toward identifying the underlying processes within the speech production chain causing the child’s speech production difficulty. This type of assessment would move away from the current reliance on perceptual assessment of children’s speech errors and include more sensitive measures such as reaction times and acoustic analysis, perhaps with the aid of artificial intelligence. Using process-oriented diagnosis would likely reveal that broad speech sound disorder classifications, such as phonological disorders and CAS, could consist of several different subtypes. The authors conclude that it is necessary to identify and target the deficits to provide effective intervention but note that there are not yet validated tools for comprehensively assessing all speech production processes. Advances in modelling, including those which use algorithms to weight factors such as signs and symptoms, severity, age, comorbidities, and social factors, may significantly reduce both the time required for clinicians’ diagnostic decision-making and the uncertainty reported in many of the papers described in this review.

Conclusions

Motor speech diagnosis is a topic of high research interest with multiple papers addressing this topic in recent years. Although accurate diagnosis is important for treatment decisions and prognostic statements, accurate diagnosis is difficult. This is because there is overlap between speech features of different speech disorders, comprehensive assessment is time-consuming, and speech-language pathologists report limited confidence in their diagnostic abilities. Historically, the focus was on identifying one speech disorder for each child and treating that singular issue. Recent assessment, diagnosis, and treatment papers better acknowledge the issues of diagnostic uncertainty, co-morbidity, and clinician constraints. Researchers are using new tools including genetics and neuropsychological modelling to better address these issues. Ultimately, the intended outcome of this new effort is that diagnostic uncertainty is reduced and children with motor speech disorders receive an accurate diagnosis and appropriate and timely intervention.