1 Introduction

Learning disabilities or learning disorders, a hypernym for a wide variety of learning problems hinders the skill acquisition activity of an individual. It is not a problem with intelligence. Nevertheless, it can have a negative impact on the self-esteem and confidence of children who are expected to acquire new information and skills day by day. Students with learning disabilities account for the major proportion of the “special educational needs” category [1]. Since these disabilities affect the perception capability of a child, the difficulties may be either in reading, writing, doing math, or any other tasks. Figure 1 shows different learning disabilities found in children.

Fig. 1
figure 1

Types Of learning disabilities

Dyslexia [2] is a kind of learning disability that includes difficulty in reading because of issues in recognizing speech sounds as well as decoding them. Dyslexia is affected in the brain areas which is assigned for processing the language. Dyscalculia [3] is a specific type of learning disorder that involves difficulty in understanding numbers and its related issues in learning mathematics. Dysgraphia is primarily considered as a disorder in written expression. It can affect the spelling, grammar, organization, etc., in addition to the handwriting aspects [4]. Figure 2 summarizes the basic categorization of dysgraphia based on the related symptoms. Although the exact prevalence depends on the definition of dysgraphia, between 10–30% of children face difficulty in handwriting [5]. Handwriting samples of individual with dysgraphia is shown in Fig. 3.

Fig. 2
figure 2

Types Of dysgraphia

Fig. 3
figure 3

Handwriting samples of person with dysgraphia

The typical diagnostic strategy for any learning disability follows a team-based assessment [5], which is often coordinated by a child/adolescent psychiatrist or a neuropediatrician with specialized training in neurodevelopmental disorders. This multidisciplinary team may include professionals such as an occupational therapist, speech therapist, special education teacher, psychologist, and, if necessary, a psychomotrician, who collectively contribute their expertise to the assessment process. In addition, any other prevailing medical conditions such as poor vision, hearing problems, intellectual disability, lack of proper training should be ruled out with the help of a specialist. As far as dysgraphia is concerned, it is indeed important to consider the several contributing factors such as handwriting speed and legibility, inconsistency between spelling ability and verbal intelligence quotient, as well as the pencil grip and writing posture evaluation. However, there is no generalized medical testing strategy available for the diagnosis of dysgraphia. Concise Evaluation Scale for Children’s Handwriting (BHK) for French [6], Detailed Assessment of Speed of Handwriting (DASH) in English [7] and Hebrew Handwriting Evaluation (HHE) in Hebrew [8, 9] are some commonly used standards in the assessment of dysgraphia.

It is quite difficult to diagnose any of the learning disorders because of the multiple cues that should be considered in the assessment. Depending on the age and developmental stage, the warning signs and symptoms may vary too. More importantly, the predefined symptoms should persist for at least 6 months with parallel intervention measures being administered [10]. Dysgraphia may appear in isolation or as a comorbidity with other learning disorders or even autism spectrum disorder (ASD), dyspraxia [11], and Attention Deficit Hyperactivity Disorder. This emphasizes the timely diagnosis of dysgraphia or handwriting disorders in particular. The early recognition and intervention lessen the task and efforts needed to correct the disorders. The manual assessment techniques solely rely on the handwritten product for the final scoring and judgment. This paved the way to propose several automated techniques that can exploit the dynamic characteristics of handwriting as well. Digital tablets capable of capturing these multiple features of handwriting have yielded promising results in the related research. Meantime studies indicate that the writing sensation may be different for stylus-tablet setting in contrast to pencil-paper, which is the commonly used procedure during the skill acquisition phase [12]. Despite the complexity and the multifaceted approaches involved in the diagnosis of dysgraphia, several artificial intelligence-based and non artificial intelligence-based automated technologies have been proposed in recent years [9, 13, 14].

Table 1 Summary of related works

To the best of our knowledge, no other study in the existing literature has provided a comprehensive review specifically focusing on automated dysgraphia diagnosis systems until now. While a few related review articles exist, they predominantly cover a broader spectrum of learning disabilities rather than concentrating solely on dysgraphia. Additionally, some of these reviews overlook dysgraphia entirely. Only two works [24, 25] in the literature have dedicated attention to dysgraphia diagnosis. However, the first work offers only a cursory overview of machine learning-based methods for diagnosing dysgraphia, lacking in-depth explanation regarding the details of feature engineering in machine learning model development. Furthermore, it lacks the review of commercial applications or non-machine learning-based automated systems. The second work primarily focuses on reviewing available tools for dysgraphia diagnosis, encompassing both automated and non-automated options, with a greater emphasis on non-automated tools and lacking a detailed assessment of automated ones. A summary of related works is given in Table 1. As the trend towards automated systems for dysgraphia diagnosis has been steadily increasing in recent years, there exists a clear necessity for a review article addressing this specific area. Such a review can offer valuable insights for researchers and academic students seeking to deepen their understanding of this topic.

This work presents a review of the existing automated dysgraphia diagnosis systems for children in the literature. The main focus of the work is to review artificial intelligence-based systems for dysgraphia diagnosis in children. This work discusses the data collection method, important handwriting features, and machine learning algorithms employed in the literature for the diagnosis of dysgraphia. Apart from that, this article discusses some of the non-artificial intelligence-based automated systems and psychological methods. Furthermore, this article discusses the drawbacks of existing systems and proposes a novel framework for dysgraphia diagnosis.

2 Research methodology

This work followed a systematic approach to preparing the literature review. The systematic approach is followed to find out the specific issues in this research domain. In the systematic approach, we defined certain search terms and pre-selected a few digital databases to find the related research articles. After finding the related articles the most relevant works are sorted out and reviewed in this work.

A research question is a required element for conducting the review in a systematic approach. The generic research questions in this work are:

  • What are the automated systems used for the prediction of dysgraphia in children?

  • How effectively the machine learning technology has employed in the literature for the diagnosis of dysgraphia in children?

  • How the handwritten data are collected for training machine learning models and what are the relevant handwritten features that can discriminate the abnormal and normal handwriting.

  • How do extracted features from handwriting data contribute to the interpretability of dysgraphia cases for clinicians, and what specific features can demonstrate the greatest utility in aiding clinical interpretation.

During the literature search, we utilized four popular digital libraries IEEE Digital Library, Web of Science, PubMed, and Springer Link to find the related articles. The search keywords used for the literature search are given below.

  • “Automated System Dysgraphia”

  • “Machine Learning Dysgraphia”

  • “Deep Learning Dysgraphia”

  • “Automated System Learning Disorder”

  • “Machine Learning for Learning Disorder”

Fig. 4
figure 4

PRISMA flowchart illustrating the systematic review process. The figure depicts the stages of identification, screening, eligibility, and inclusion, providing a visual representation of the study selection process

Among all the obtained search results by using the above-mentioned keywords separately in each database, we found that only about 50 papers are most relevant to the specific topic. Another interesting fact is that most of the papers in the literature are published after 2015. As illustrated in Fig. 4, the review process is outlined in a PRISMA flowchart. The flowchart depicts the number of records identified through searching, screening, and the inclusion/exclusion criteria applied at each stage.

We conducted a comprehensive literature search in databases utilizing keywords mentioned above, and the search yielded a substantial number of articles. We refined the results to focus on studies specifically related to dysgraphia diagnosis that employed automated or semi-automated approaches, including available reviews on the same topic. This process resulted in a total of 42 records. In addition to database searches, we extended our search to other sources such as the Internet and Google, identifying 12 additional works, including commercially available systems. After removing duplicates, the dataset comprised 45 studies. Subsequently, we conducted manual screening, excluding existing reviews on similar topics, leaving us with 40 screened records. From these, we selected works with good quality, reducing the number of records to 30. These 30 records underwent qualitative discussion and analysis. For the quantitative aspect, we focused on works employing machine learning (ML) supervised algorithms, resulting in 21 records. These works were further grouped based on the underlying machine learning algorithm for a more nuanced analysis.

3 Dysgraphia diagnosis methods

In this section, we first discuss the conventional methods used to diagnose dysgraphia and the manual analysis of handwriting. We then review the notable contributions to automated analysis (machine learning methods) and non-ML based diagnosis systems of handwriting for the detection of dysgraphia.

3.1 Conventional methods

Dysgraphia originates from the Greek words ”dys” (meaning difficulty or poor) and ”graphia” (meaning writing), accurately reflecting challenges in the writing process. Initially, the difficulty in handwriting or dysgraphia was described by Hamstra-Bletz and Bolte as difficulty in the construction of letters while writing which is closely related to the mechanics of writing [26]. It has also been referred to as a specific learning disability [27, 28].

Fig. 5
figure 5

Conventional methods for dysgraphia diagnosis: Assessment categories

The popular symptoms of dysgraphia include “messy handwriting, inconsistency in letter spacing and capitalization, pain or discomfort when writing, fine motor skill challenges, trouble with spelling, or trouble with composing written work”. Further, it is also common that the students with dysgraphia can express themselves well while speaking but can’t transfer them onto paper perfectly. The evaluation of various skills, categorized as shown in Fig. 5, to identify the presence of dysgraphia is typically conducted by an occupational therapist or trained psychologist. However, it is crucial to note that the synthesis and assessment should also involve professionals such as child and adolescent psychiatrists or neuropediatricians specializing in neurodevelopmental disorders, along with the expertise of psychomotricians and speech therapists. In alignment with existing conventional approaches in dysgraphia assessment literature, these evaluation categories can be distinctly grouped into three primary domains: cognitive abilities, motor skills, and language and phonological skills.

3.1.1 Cognitive ability

Within the cognitive abilities domain, trained professionals assess key aspects such as retrieval fluency, intelligence measures, working memory, and executive functions. These cognitive functions play a pivotal role in the overall cognitive processing of children. Executive function skills enable the student to plan, concentrate, remember commands, and organize multiple tasks. The student requires all the executive functions to be intact for writing [29]. Rey Complex Figure Test [30] and Behaviour Rating Inventory of Executive Function (BRIEF) [31] are the popular examination test conducted for executive function assessment. In the Rey Complex Figure Test, the students were asked to draw very complex figures. Based on their drawing output the capacity of executive functions is quantified. On the other hand, BRIEF provides a questionnaire to students’ parent and teacher, and ask them to answer for that. The provided questionnaire is a rating form that consists of 86 items to assess eight clinical scales independently. The eight clinical scales include three behavioral regulation scales (Inhibit, shift, emotional control)and five metacognition scales (initiate, working memory, plan, monitor, and organization of materials).

Working memory refers to the cognitive ability of the human memory system that has limited capacity and can keep information for a short time. Generally, dysgraphia can occur with some degree of working memory problems [32]. Test of Memory and Learning - 2 (TOMAL-2) [33] and Wide Range Assessment of Memory and Learning-2 (WRAML-2) [34] are two popular examinations for assessing the decline of working memory. Intelligence measure a.k.a intelligence quotient quantifies human intelligence which varies with age. Wechsler Intelligence Scale for Children (WISC-IV) [35] and Differential Ability Scales (DAS) [36] are the popular tests for quantifying the intelligence of children. Retrieval fluency tests are assessments designed to measure an individual’s ability to efficiently retrieve information from memory, particularly under time constraints. These tests are often used to evaluate cognitive functions, including memory recall and processing speed. Individuals may be asked to generate as many items as possible from a given category or recall specific details within a set time frame. Retrieval fluency issues are commonly found to be less in children with dysgraphia [37].

3.1.2 Motor skill

In the motor skills domain, constructional ability stands out as a crucial component, addressing the intricate relationship between motor coordination and writing abilities. The constructional ability assessment will look for the proficiency of the student to copy or reconstruct lines, shapes, or figures. One of the popular conventional methods utilized by trained psychologists for constructional ability assessment is "Beery Visual Motor Test of Integration - Sixth Edition (VMI-6)” [38]. In the VMI-6, the participants are asked to copy drawings on paper. The complexity of the drawings will be increased throughout each stage of the test and participants are not allowed to erase the drawings. The drawings given for the copying task include overlapped figures, angles, and three-dimensional images. But the visual motor skills purely depend on the age and it is expected that until age 16, the standards of the visual-motor skills will be different for every age until 16. Usually, many students with dysgraphia struggle with eye-hand coordination and planning. The visual-motor integration test conveys more information about the participant’s capacity to understand a drawing and the motor ability to copy the information (motor response). Other popular constructional ability assessment tests are Bender Gestalt II [39], NEPSY-II [40].

In the Bender Gestalt II test [39], participants are instructed to reproduce a series of nine abstract designs on a paper. Throughout the test, the designs’ complexity gradually rises, progressing from straightforward geometric forms to more complicated patterns. The Bender Gestalt II is distinctive in that participants are not permitted to erase any of their drawings, which can provide important information about their perceptual-motor abilities and problem-solving methods. Numerous aspects are present in the drawings, such as overlapping figures, angles, and three-dimensional pictures. The grading for the Bender Gestalt II test, in contrast to the VMI-6, is more qualitative and places greater emphasis on identifying the mistakes, omissions, distortions, and rotations that participants made when duplicating the patterns. These observations reveal important details about the subject’s perceptual-motor abilities and their capacity to accurately interpret and replicate visual stimuli. NEPSY-II [40] is a constructional subtest that evaluates a person’s capacity for drawing out designs on paper. Participants are shown drawings that range in complexity, like in the VMI-6 and Bender Gestalt II, but NEPSY-II goes beyond constructional ability to incorporate a wider range of neuropsychological processes. The NEPSY-II constructional subtest assesses a person’s ability to comprehend a drawing and their motor skills to accurately replicate that information. NEPSY-II, on the other hand, provides scaled scores, percentile ranks, and age equivalents for this subtest, allowing comparison with a normative population. As a result, it provides a thorough assessment.

Building upon the foundation laid by specific constructional ability tests mentioned above, the inclusion of the Developmental Test of Visual Perception (DTVP) [41] and the Movement Assessment Battery for Children, Second Edition (MABC-2) [42] within the motor skills domain serves to enrich the evaluation. These assessments extend beyond exclusive constructional tasks, offering broader insights into a child’s visual-motor integration and overall motor proficiency. DTVP assesses various aspects such as visual-motor coordination, discrimination, spatial relationships, and memory. On the other hand, MABC-2 is a standardized assessment tool designed to evaluate motor skills and identify motor difficulties in children. Developed by Henderson, Sugden, and Barnett, the MABC-2 assesses a range of motor functions, including manual dexterity, aiming and catching, and balance.

Although the DTVP and MABC-2 involves tasks that assess visual-motor integration and motor functions respectively, it covers a broader spectrum of visual perceptual skills beyond constructional ability alone. In the context of dysgraphia, the DTVP holds significance as it provides valuable insights into specific visual processing difficulties that may impact tasks such as handwriting and fine motor skills. While MABC-2 can be valuable for assessing motor skills that contribute to writing and fine motor difficulties. By evaluating a child’s motor proficiency, including aspects related to hand-eye coordination and manual dexterity, the MABC-2 provides insights into potential motor challenges that may impact writing tasks.

3.1.3 Language and phonological skills

The language and phonological skills domain encompasses the evaluation of writing and spelling skills and phonological awareness, shedding light on linguistic aspects integral to dysgraphia. Wechsler Individual Achievement Test (WIAT-III) [43], Woodcock Johnson-III Tests of Achievement (WJ-III) [44], and Test of Written Language-4 (TOWL-4) [45] can assess the writing and spelling skills. Phonological awareness refers to the ability of the human to perceive and work with the audio, especially the sound in the spoken language. It includes understanding patterns like alliteration or rhymes, the ability to segment the sentences into words, understanding of phonemes and syllables, etc. Comprehensive Test of Phonological Processing [46], and NEPSYII phonological processing are the two popular phonological awareness assessment tests.

The Concise Evaluation Scale for Children’s Handwriting (BHK) is another popular manual handwriting analysis method conducted under the supervision of an occupational therapist or psychologist to quantify the speed and standard of writing. BHK test is usually conducted in the individual clinical setting and classical scholar setting. Initially, it was introduced for assessing the handwritten samples of 2nd and 3rd-grade students. Currently, BHK scales are used in the research for constructing the ground truth of the data used for training and evaluation of machine learning-based dysgraphia diagnosis systems.

3.2 Machine learning based methods

This section discusses machine learning based systems for diagnosis of dysgraphia from either the images of handwritten texts or features of writing dynamics. The general flow of machine learning based dysgraphia diagnosis system is shown in Fig. 6. As in other machine learning applications, dysgraphia diagnosis systems also follow a similar workflow. Data collection, preprocessing, feature extraction, feature selection, and training using a machine learning classifier are important steps included in building the machine learning-based dysgraphia diagnosis system. Data is a pivot for any machine learning task. Cleaned and sufficient data is required for any machine learning algorithms to make an accurate prediction. In the case of the dysgraphia diagnosis system, two types of data collection methods are usually implemented. The first approach is offline-based data collection where the subject is asked to write or copy a few words or sentences on the paper or tablet and the resulting handwritten images are used for further analysis. On the other hand, the online-based data collection is focused on collecting the handwriting data during the run time which includes the trajectory of the pen /pencil, writing speed, pressure on the pen tip, etc. Most of the existing dysgraphia diagnosis systems followed the online data collection strategy.

Fig. 6
figure 6

General workflow of machine learning based dysgraphia diagnosis system

3.2.1 Feature extraction

Extraction of features from the data is crucial for training the machine learning models. Multiple features are extracted from the collected handwritten data for further analysis in the existing systems. The important online handwritten features used for machine learning-based systems in the literature for dysgraphia detection are shown in Fig. 7.

Fig. 7
figure 7

Relevant handwriting features

  • Kinematic Features

The kinematics features mainly include the velocity, acceleration, and jerk of the writing. Velocity just quantifies the speed of writing and abnormal variability in writing speed is related to the underlying handwriting problems. Acceleration computes the variability in the velocity during writing. Jerk is the change in acceleration over time per stroke. Unusual changes in acceleration over time per stroke may be related to writing problems. Various studies [47, 48] in the literature have found that the kinematic aspects are affected in children with handwriting disabilities. Along with the average velocity of the whole writing process, few works considered the velocity or speed of writing each word or letter as a new derived feature for enhanced discrimination.

  • Temporal Features

The temporal features mainly quantify the writing/drawing duration or time. Usually, children with dysgraphia will take more time to write or draw compared to normal children. In this sense, the writing or drawing duration has significance in differentiating normal and abnormal handwriting behavior. The total time required to complete the task, the total time spent on the paper, the ratio between the total time spent on the paper and the total time of the task, etc. are the popular temporal or time-related features utilized in the literature for dysgraphia detection.

  • Spatial Features

The width, height, and length of the strokes as well as whole written data are the common spatial features extracted from handwritten data. The writing or drawing activities of children with dysgraphia always contain various inconsistencies. The irregular size of letters and irregular spacing between the words are the common inconsistencies found in the writing. The spatial feature can well discriminate the writing samples of children with dysgraphia. These features can be called as geometric features also. The offline handwritten based dysgraphia diagnosis systems mainly rely on these features to differentiate the normal and abnormal handwriting.

  • Dynamic Features

The popular dynamic features are pressure, tilt, azimuth, etc. Among these, pressure features are quantified to find out the characteristics of pressure induced by the pen tip on the writing surface. Generally, the statistical measures such as mean, median, the standard deviation of pressure are quantified for feature construction. The speed of pressure change and speed of pressure change frequencies are the other derivable features from pressure values. The tilt feature is used to measure the inclination of the pen or pencil used for writing. The azimuth angle of a pen or pencil with respect to the plane of the written surface is also quantified for feature set construction. Tilt and azimuth angle are displayed in Fig. 8.

Fig. 8
figure 8

Tilt and Azimuth angle

  • Other Features

Apart from the main four categories of features (Kinematic, temporal, spatial, dynamic), few works in the literature have considered some other features to improve the dysgraphia screening efficiency. It includes the number of interruptions occurred during the writing, number of pen elevations, as well as number of times the child has made the mistake and erased it (erase count), etc. These features has also have relevance in screening dysgraphia since children with dysgraphia are tends to make more mistakes than normal children while writing.

Most of the works in the literature has used combination of different feature categories to effectively identify the existence of dysgraphia in children using machine learning algorithms. Table 2 displays the different feature combinations used in the literature.

Table 2 Summary of feature combinations used in the literature

3.2.2 Existing machine learning based systems

The use of machine learning algorithms in the diagnosis of dysgraphia has become a viable path for accurate and effective evaluation. The ability to examine enormous databases of handwriting samples and identify complex patterns that may evade human inspection is provided by machine learning techniques. This section goes in-depth on the various machine learning models based systems that are used to diagnose dysgraphia. We seek to present an overview that provides a nuanced comprehension of the various contributions made by different techniques to the field by classifying them according to the specific algorithms they employ. Here, each subsection is devoted to a different machine learning algorithm based automated diagnosis system and provides details on its methodology, possibilities for diagnosis, and performance measures. The popular machine learning algorithms used in the literature are random forest [60], support vector machines [61], convolutional neural networks (CNN), AdaBoost, and XGBoost, etc.

Table 3 Overview of random forest based dysgraphia screening systems

3.2.2.1 Random forest-based dysgraphia diagnosis systems

Random forest algorithm is an ensemble of multiple decision trees. Decision trees are popular in supervised machine learning due to their simplicity and competence in both classification and regression problems. With each node assisting to divide the data into sections based on particular features, these tree-like structures serve as a visual representation of the decision-making process. By utilizing the principles of ensemble learning and elaborating on the idea of bagging, Random forest is used. For each decision tree in this strategy, subsets of features are carefully and randomly chosen. This method reduces variance, especially in datasets with intrinsic noise, while having the potential to add significant computing burden. Table 3 presents the overview of dysgraphia diagnosis systems based on machine learning, particularly those employing the random forest algorithm for decision-making.

Random forest has demonstrated superiority over other machine learning algorithms, such as logistic regression and naïve Bayes, as evidenced in the work by Richard et al. [62]. This study is grounded in the observation that dysgraphic children encounter challenges in various aspects of the handwritten task, including letter formation, maintaining consistent letter size, following straight lines, and ensuring uniform gaps between letters and words. The dataset comprised approximately 1400 handwritten images, each containing four lines of handwritten text. From these images, a diverse set of features, including slant, pressure, amplitude, letter spacing, word spacing, slant regularity, size regularity, size regularity, and horizontal regularity, were manually extracted. Remarkably, among the classifiers implemented in the study, the Random Forest classifier achieved the highest accuracy, boasting an impressive 96.2% accuracy rate.

In Mekyska et al.’s study [49], the random forest classifier played a key role in automating developmental dysgraphia diagnosis and estimating handwriting difficulty levels. Handwriting data from 54 students (27 normal, 27 dysgraphic) were collected using a digitized tablet and COMPET, a standardized online tool. Features extracted from the data, including Kinematic, non-linear dynamic, and other features, were used to train machine learning models. Notably, the random forest algorithm achieved an impressive 96% sensitivity and specificity, demonstrating its effectiveness in classifying dysgraphic and normal handwriting. The work proposed by Asselborn et al. [50] is a compelling automated dysgraphia diagnosis tool for primary school students, prominently featuring the utilization of the random forest machine learning classifier. The study involved 298 students, among whom 56 had been identified with dysgraphia. These students were tasked with a 5-min writing exercise, using a sheet of paper affixed to a Wacom Intuos tablet. 54 distinct features are extracted from the collected handwriting data. These features encompassed a wide range of handwriting characteristics, including Static features, which represented purely geometric attributes of the written text; Kinematic features, offering insights into the dynamics of handwriting paths; Dynamics features, capturing characteristics of the pressure recorded between the pen tip and the tablet surface; and attributes related to pen tilt. The random forest classifier achieved a classification accuracy of 96.5%.

The raw data obtained from graphomotor tests are analyzed using machine learning algorithms for the early detection of dysgraphia in children [53]. Initially, 305 students were asked to participate in performing a graphomotor test, where each student has to write or draw specific predefined shapes or figures called stimuli on the paper attached to the screen of the tablet. Numerous features including time duration of the stimulus, duration of a stroke, duration of lift, velocity, jerk, Renyi Entropy of order 2, signal to noise ratio, etc. were extracted from the raw data for further analysis. Various machine learning algorithms including SVM [61], random forest, MLP [63], Extra Trees, Gaussian Naïve Bayes [64], Ada Boost, etc. are utilized in the proposed work for analyzing the features. Among all implemented machine learning algorithms the random forest classifier trained with features selected using the Linear SVM algorithm achieved the highest classification accuracy of 73.4%.

Dutt et al. [65] also analyzed the performance of multiple machine learning algorithms for the detection of dysgraphia. Initially, a dataset was constructed by conducting various handwritten-based exercises in a group of 240 subjects where 142 have some sort of learning disabilities and 45 subjects have dysgraphia. Boolean features are extracted for the following properties: sentence structure, word formation, and visual-spatial response. Along with that, the images of writing are subjected to SSIM (structural similarity index measure) evaluation, and spellings are checked through a spell checker. Among all the extracted features the relevant features are selected using a feature selection technique known as Elastic Net [66]. The selected relevant features are used to train KNN [67], Naïve Bayes, Decision tree [68], random forest, and SVM classification model for prediction of dysgraphia. The random forest classifier achieved the highest classification accuracy and it can differentiate the dysgraphic and non-dysgraphic subjects with an accuracy of 99% in the experimented dataset.

In [55], an automated system based on random forest algorithm for identifying dysgraphia from digitized handwriting is proposed. Initially, a dataset was meticulously constructed by engaging 78 participants in various handwriting tasks, all executed on a Wacom Intuos Pro Large graphic tablet. Among these 78 participants, 36 were individuals with dysgraphia. From each written sample, a diverse set of features was extracted, including kinematic, temporal, and other features such as pen lifts. Subsequently, three distinct machine learning classifiers - random forest, support vector classifier (SVC), and AdaBoost-were trained using the extracted features to automate the detection of dysgraphia. The outcomes of this study reveal that random forest outperformed the other classifiers, demonstrating superior classification performance.

In [59], a digital system for dysgraphia identification is proposed. It utilizes a Wacom tablet with an interactive stylus for data collection as students perform five handwriting tasks. Key metrics like total time, tip pressure, and pen coordinates are recorded. Nine distinct features including on-paper segments, in-air segments, total time, total airtime, total time spent on paper, mean pressure, the velocity of the pen, etc. are extracted. Machine learning is employed with three classifiers: decision tree, random forest, and SVM. Among them, random forest achieved a maximum classification accuracy of 92.59%.

3.2.2.2 SVM-based dysgraphia diagnosis systems

SVM is a machine learning algorithm used for both classification and regression tasks, with a focus on enhancing prediction accuracy. SVM operates by categorizing each class in a high-dimensional space using one or more hyperplanes. By leveraging labeled input training data, the SVM algorithm constructs an optimal hyperplane for classifying new test data. SVM classifiers excel in binary classification scenarios and can be extended to multi-class classification tasks. This extension is accomplished through two approaches: ”one against all,” where each class distinguishes itself from all other classes, and ”one against one,” where one class is pitted against each of the other classes individually. Table 4 presents an overview analysis of dysgraphia diagnosis systems based on machine learning, particularly those employing the SVM algorithm for decision-making.

In [54], the SVM algorithm was implemented to construct a decision-making model for creating an automated pre-diagnosis tool for dysgraphia. Multiple tablets were employed for data collection to ensure that the proposed system is not dependent on a particular tablet or data collection tool. To the best of our knowledge, this work has generated the most extensive database for dysgraphia diagnosis by involving 580 children in the data collection process. These children were tasked with performing the BHK test (French version) on the tablet, from which 100 different features were extracted to train the SVM classifier. In the test set, the SVM classifier achieved a sensitivity of 91% and specificity of 81% for detecting children with dysgraphia.

A software system, leveraging the SVM algorithm and an Android application, is proposed for identifying dysgraphia based on handwriting data from elementary school students [69]. The Android application incorporates a handwriting recognition tool called ”WritePad” for data acquisition. Students were instructed to write on the smartphone screen, capturing data such as time, pressure, letter spacing, letter size, letter position, boundary consistency, and more, all of which were stored for subsequent analysis. SVM classifiers were implemented in two distinct ways: the One-Versus-One (OVO) and One-Versus-Rest (OVR) approaches for classification. Results obtained after applying three different SVM kernels, namely Linear, Polynomial, and Radial Base Function (RBF) kernels, revealed that the RBF kernel outperformed the others, producing a higher average accuracy of 82.51%, as compared to 78.56% for Linear and 81.40% for Polynomial kernels.

Rosenblum et al. [56] introduced an online tool based on the SVM algorithm for diagnosing dysgraphia in 3rd-grade students. The study involved 99 students, with 50 considered proficient in handwriting. Online handwriting data was collected using a tablet and the ComPET tool, starting with writing task in Hebrew and drawing task. Various time features, static properties of the written product, and pressure features were extracted. Linear SVM algorithm-based classifier trained using these features achieved 90% specificity and sensitivity. Among the extracted features, time and pressure-related features were found to be highly relevant for discriminating between normal and abnormal handwriting.

In [70], the SVM algorithm is implemented with the combination of fused CNN features extracted from handwritten images for the purpose of diagnosing dysgraphia. The initial step involves transforming an existing online dataset into offline handwritten data, which encompasses images related to various handwritten tasks. CNN features are separately extracted from each task-specific image and then combined to facilitate the training of machine learning classifiers. In addition to SVM, AdaBoost and random forest algorithms are employed to train these decision-making classifiers. Notably, among these algorithms, SVM trained on the fused features achieves an outstanding maximum classification accuracy of 97.3%.

3.2.2.3 CNN-based dysgraphia diagnosis systems

CNNs are a type of neural network designed for tasks involving multi-dimensional data such as images or time series information. A CNN comprises several key components or layers, including convolutional layers, pooling layers, fully connected dense layers, and batch normalization layers. Additionally, there are input and output layers.

Convolutional layers form the foundation of deep CNN architectures. They consist of learnable filters with small receptive fields, responsible for generating feature maps through convolution operations. To introduce non-linearity into the network, activation functions, also known as transfer functions, are applied after convolution operations. Batch normalization layers are used to accelerate the learning process by normalizing the output from previous layers.

Pooling layers are employed for dimensionality reduction in deep CNNs. They reduce the spatial size of feature maps, aiding in learning essential features and achieving translation invariance. The multi-dimensional feature maps produced by convolutional layers are flattened into one-dimensional arrays before entering the fully connected layer, also called the dense layer. The output layer, situated at the end of the fully connected layer, is crucial in classification problems. It estimates probabilities for each input belonging to different classes, serving as the network’s prediction function.

Recent literature features the utilization of the CNN algorithm in dysgraphia screening from handwritten data. Table 5 presents an overview analysis of dysgraphia diagnosis systems based on machine learning, particularly those employing the CNN algorithm for decision-making.

Table 4 Overview of SVM based dysgraphia screening systems
Table 5 Overview of CNN based dysgraphia screening systems

In [72], a three-level 2D CNN architecture is employed to automate the detection of dyslexia-dysgraphia from handwritten images. A new dataset is curated to assess the effectiveness of this CNN architecture, comprising a total of 267 handwritten images (in Hindi orthography), with 164 belonging to the dyslexia-dysgraphia class and 103 to the normal class. The proposed CNN architecture is trained using random patches from grayscale versions of these images, achieving an accuracy of 86.14%.

Meanwhile, in [73], a combination of Kekre-Discrete Cosine Transform and deep transfer learning is applied to efficiently distinguish between normally developed handwriting and dysgraphia writing. The input images undergo preprocessing via Kekre-Discrete Cosine Transform, transforming the input images and feeding the resulting hybrid transform into a deep CNN network. This method attains an impressive classification accuracy of 99.75%.

The mobile application ”Pubudu” [71] offers a deep learning-based solution for screening and intervening in multiple learning disorders, including dyslexia, dysgraphia, and dyscalculia among school children. In dysgraphia diagnosis, the application addresses both numerical and letter dysgraphia. For letter dysgraphia screening, a CNN model is trained using 5000 handwritten image samples (comprising 3 Sinhala letters) from non-dysgraphic children. Dysgraphic and non-dysgraphic students are then asked to write the same letters, generating various features such as success probability of written letters, total correct letters, total incorrect letters, number of attempts, total time taken, and erase count. The trained CNN model estimates the success probability, correct letters, and incorrect letters, which are subsequently used to train an SVM model for letter dysgraphia screening. Similarly, for numerical dysgraphia screening, the approach employs the MNIST dataset to train a CNN model for automated feature extraction. The proposed method achieves an accuracy of 88% for letter dysgraphia screening and 90% for numerical dysgraphia screening.

Similarly a mobile-based system ”Nana Shilpa” [74] also utilized deep CNN networks to screen and identify the risk of dysgraphia and dyscalculia in primary school students in Sri Lanka. It employs two risk identification components: letter-level dysgraphia and numeric dysgraphia. A CNN model with four two-dimensional convolutional layers is developed to screen for letter-level dysgraphia. In the case of numeric dysgraphia screening, students are required to write numbers from 0 to 9 on the application’s drawable canvas. Features such as correctness, erase count, and time taken are considered for risk prediction. The accuracy of the written numbers is determined using a CNN model trained on the MNIST dataset. These features are then used to train a machine learning model to predict the risk of numeric dysgraphia. The proposed methods achieve a high accuracy of 99% in predicting the risk of both numeric and letter-level dysgraphia.

3.2.2.4 XGBoost-based dysgraphia diagnosis systems

XGBoost is an exceptionally effective and popular machine learning algorithm renowned for its remarkable accuracy. It works by leveraging the strengths of many decision trees, successively refining predictions, and handling complicated information effectively. Numerous benefits, including integrated regularization, feature importance analysis, and the capacity to manage missing data, are provided by XGBoost. Its scalability, support for parallel and distributed computation, and adjustable objectives make it a viable solution for a variety of machine learning tasks. However, it is noteworthy that the XGBoost algorithm is rarely employed in the literature for the development of dysgraphia diagnosis systems. Table 6 presents an overview analysis of dysgraphia diagnosis systems based on machine learning, particularly those employing the XGBoost algorithm for decision-making.

Table 6 Overview of XGBoost based dysgraphia screening systems

Mekyska et al. [57] and Zvoncak et al. [76] utilized the XGBoost algorithm in the literature to develop a prediction model for the diagnosis of dysgraphia. In [57], a new dataset was created by having a cohort of students draw seven different figures, such as the Archimedean spiral, connected loops, and rainbows, on an A4 paper. The paper is laid down and fixed to a digitizing tablet, specifically the Wacom Intuos Pro L (PHT-80). Five distinct categories of features were extracted from the acquired handwriting data. These extracted features encompassed spatial dimensions, including width, height, and length of the entire product, as well as its strokes, stroke width, height, and length; temporal aspects, such as the duration of drawing; kinematic attributes, including velocity, acceleration, and jerk; dynamic factors, such as pressure, tilt, and azimuth; and other factors, such as the number of interruptions (pen elevations and the relative number of interruptions). The XGBoost algorithm achieved a classification specificity of 90%.

On the other hand, Zvoncak et al. [76] generated a new dataset consisting of data related to paragraph writing exclusively. Apart from extracting commonly used kinematic, dynamic, spatial, and temporal features, four intra-writer normalization approaches (l1, l2, l infinity, and Z-score) were introduced. Unlike [57, 76] treated the diagnosis as a regression problem, predicting the handwriting proficiency score. This implies that XGBoost was applied for a regression task in this context. Observed results indicate that the introduced normalization approach reduced the error rate by 5%.

3.2.2.5 AdaBoost-based dysgraphia diagnosis systems

AdaBoost, which stands for Adaptive Boosting, is a strong ensemble machine learning method that is usually employed for classification tasks. By iteratively training weak learners, such as straightforward decision trees, and giving weights to data points based on classification accuracy, it excels at enhancing their performance. The key to AdaBoost’s success is its adaptive learning methodology, which emphasizes complex data points in each iteration to produce an accurate and robust ensemble model. Due to its capacity to improve classification accuracy and reduce the risk of overfitting, it is widely used in a broad range of fields, including face identification and text categorization. Similar to XGBoost algorithm, AdaBoost is also rarely employed in the literature for the development of dysgraphia diagnosis systems. Table 7 presents an overview of dysgraphia diagnosis systems based on machine learning, particularly those employing the AdaBoost algorithm for decision-making.

In the work introduced in [51], a classifier based on the AdaBoost algorithm outperformed SVM and random forest algorithms in distinguishing normal handwriting from dysgraphic handwriting. Initially, a new dataset was developed in the study, which included online handwritten data collected from 120 students using a digitizing tablet while performing different types of writing activities. From the data of each writing activity, 22 types of features, including spatiotemporal and kinematic features, were extracted to train and evaluate machine learning algorithms. The AdaBoost algorithm achieved a classification accuracy of 79.5%. However, the study involved more than 1000 features.

Table 7 Overview of AdaBoost based dysgraphia screening systems

Kunhoth et al. [77] introduced a method to reduce the number of features without compromising the prediction performance of the AdaBoost algorithm. Instead of extracting features from each task-specific data separately, the features were extracted from the combined data. The performance of these features was evaluated using multiple machine learning algorithms, including SVM, random forest, neural networks, and AdaBoost. The AdaBoost algorithm achieved a maximum classification accuracy of 80.8% with fewer features compared to [51].

3.3 Automated diagnosis systems without ML

Non machine learning based automated digital screening systems are also proposed in the literature to diagnose dysgraphia. This section briefly discusses popular non-ML based digital dysgraphia screening systems. Furthermore, a comparative analysis of available non-ML based dysgraphia screening tools is presented in Table 8. A tablet-based app named "Play-Draw-Write" has been proposed in [78] for screening the handwriting skills of children in the preliteracy stage. The proposed tablet app will quantify three handwriting laws isochrony, homothety, and speed-accuracy trade-off for assessing the existence of dysgraphia markers. “The isochrony principle states that bigger gesture execution is accompanied by an increase in average movement speed to keep the movement time approximately constant [79]. The homothety principle predicts that the fraction of time devoted to each letter of a word is kept constant and is independent of the total word duration [79, 80].” These principles seem to be altered in the subjects with dysgraphia along with dyslexia. The app is developed for Samsung Galaxy Tab A and children can use S-Pen with a rubber tip write or draw on the screen of the tablet App. Two tasks are available in the app, where the first one is a copy game and the second is a tunnel game. In the copy game, the students are asked to copy a few symbols and words to the canvas provided in the app using the stylus. The copying game is targeted for testing the isochrony and homothety. The tunnel game is targeted for evaluating the speed-accuracy trade-off.

Table 8 Comparison of non-machine learning-Based automated digital dysgraphia screening tools

Giordano and Maiorana [81] have developed a web-based software system for multiple handwritten gesture recognition which can be extended for online dysgraphia screening. The proposed software system is designed in a client–server manner and it can be used in any type of smartphone or tablet and computers. A modified version of the dynamic wrapping algorithm is employed in the server to recognize multiple hand gestures. The proposed software system offers multiple functions to execute different writing and drawing tasks. For each task, the system will store the time taken, degree of similarity with the reference line, amount of points outside the reference line in terms of percentage, average horizontal distance, etc. From the stored data, the administrator of the software tool can derive other parameters such as time taken for each stroke or their average time, length of the path, velocity, total air time, trajectory analysis, etc. These features can be utilized for the development of machine learning classifiers for dysgraphia diagnosis. While the system’s architecture and user interface have been carefully developed, it is important to emphasize that comprehensive performance testing and validation with real users have not been conducted as of yet. These forthcoming phases are essential to assess the system’s speed, accuracy, and overall effectiveness, as well as its suitability for practical applications.

Raza et al. [82] proposed an interactive mobile application for dysgraphia screening in children from age 5 to 12. The mobile application offers different activities to assess the handwriting quality of the student, assess the phonological dysgraphia, assess the surface dysgraphia, and assess the copying ability. Surface dysgraphia is characterized by difficulties primarily in spelling irregular words (irregular words are those that do not follow common phonetic patterns or spelling rules, eg: psychology). This condition can impact writing fluency and accuracy, particularly in situations requiring the correct spelling of irregular words. Moreover, it is relatively less common compared to other types of dysgraphia. The student can use any compatible stylus pen to write on the screen of the device. A total of twenty words are given for students to write where seven words are used to estimate the existence of phonological dysgraphia, eight words for surface dysgraphia, and the remaining for copying ability. The software utilizes a handwritten recognition method to recognize the spelling and after recognition, it is compared with the ground truth value to provide a score based on the correctness. However, it’s essential to highlight that the paper lacks comprehensive details regarding the anomaly detection algorithm, which is crucial for assessing the system’s diagnostic capabilities. While the system’s architecture and data collection aspects have been meticulously described, it is evident that the work is in its preliminary stages. Notably, there is a lack of testing on original subjects to validate the system’s practical utility.

TestGraphia [83]is a software system proposed for the early diagnosis of dysgraphia. The proposed system is based on the conventional BHK method. They extended the conventional BHK method to the software device for automated diagnosis. In the BHK test, the students are asked to copy a few given texts to a paper in 5 min. From the handwritten document, scores are given to the 13 features including writing size, skewed writing, sharp angles, etc., and scoring these features is the crucial step in the BHK test. The TestGraphia will automatically compute the scores for 9 features ("non-aligned left margin", "skewed writing", "insufficient space between two words", "sharp angles", " broken links between two letters", "irregular size of letters", " inconsistent height between letters with extension and letters without extension", "ambiguous letters"). Scores of the remaining features can be added by the doctors or occupational therapists manually using the dashboard of the software. For the automated scoring, initially, the images of the writing are subjected to segmentation (individual line segmentation) using an image processing algorithm based on horizontal and vertical histograms. Multiple simple image processing algorithms are employed on the segmented lines to compute the feature score. For each of the nine features considered in automated scoring, a scale of 5 is employed. This scale involves evaluating five lines of handwriting, and each line is assigned a score of 0 if it exhibits normal characteristics or a score of 1 if irregularities are identified. The total score for each feature is then determined by summing the scores assigned to each of the five lines of handwriting. The scores for the remaining four features, rated on a five-point scale, are determined manually in collaboration with an occupational therapist. These manually computed scores are then integrated into the software, which subsequently calculates the overall score based on the BHK scale. Cases with a final score below -2 are categorized as severe instances of dysgraphia. The proposed system was tested with 109 students (2nd - 5th grade). The system demonstrated promising performance with a sensitivity of 0.83, indicating its ability to correctly identify 83% of individuals with dysgraphia, along with a high specificity of 0.98, signifying its precision in classifying those without the condition. With an overall accuracy of 0.96, the system showcased robust diagnostic capabilities. These results suggest that the diagnostic system is highly efficient in identifying dysgraphia, making it a valuable tool in the assessment and diagnosis of this condition.

3.4 Commercial systems for dysgraphia diagnosis

In this section, we briefly discuss the available software systems in the marketplace for dysgraphia diagnosis. There are very few commercially available systems for dysgraphia screening. A comparative analysis of commercially available software tools available for dysgraphia diagnosis is presented in Table 9, delineating their respective advantages and disadvantages.

Table 9 Comparison of dysgraphia diagnosis tools in the marketplace: advantages and disadvantages

‘Lexcercise’ is a company founded in 2008 for supporting students with learning disabilities. Their web application provides an easy and cost-free way for preliminary assessment of dysgraphia in children [84]. The whole procedure for assessment consists of three sections. The parents or caretakers of the student are asked to fill out the answers to the provided questions about the student. In the first section, the questions are related to letter and number writing (such as reverse writing, messy writing, mixing up of lowercase and uppercase). The questions in the second section and third sections are targeted for assessing the writing convention and writing proficiency of the student respectively. They will provide the service of the dysgraphia therapist also. Upon completion of the assessment, the parents or caretaker will be asked to submit the actual writing samples of the student to a dysgraphia therapist, if the assessment displays any risk. [85] also, provides a web-based tool for the free assessment of dysgraphia. It is similar to [84], just a screener test to decide whether the child is having the symptoms of dysgraphia. The assessment consists of 16 basic questions where most of which are considered to be the prevalent symptoms of dysgraphia. All the questions are prepared based on the criteria from the learning disabilities association of America. Unlike [84], here they will not provide the therapist, but they will give suggestions for the parents to check with an occupational therapist if the test is positive.

Dyscreen [86] is a smartphone application developed by the Australian company Dystech in 2020 for self-screening of dysgraphia and dyslexia using the power of artificial intelligence. Dystech claims that they are the pioneer in this field to introduce the first AI-based smartphone application for dysgraphia screening. Currently, the application is available for both iOS and Android. The dysgraphia screening functionality of the application is very simple. Just take the picture of the student’s handwritten text and upload it via the application. The intelligent machine learning model deployed in the cloud will classify the handwritten sample into positive or negative classes and provide feedback to the user. Since the processing of the images is done in the cloud, internet connectivity is required for dysgraphia screening. Dystech claims that the machine learning model can predict the existence of dysgraphia with an accuracy of 95.6%. The dysgraphia screening in Dyscreen is free of cost. The same application provides other services such as dyslexia screening, but it is not free of cost. A team of professionals and engineers from Hongkong university of science and technology has developed an artificial intelligence-based smartphone application named ’AI Dysgraphia Pre-screening’ for dysgraphia screening [87]. The application is free and available on both Android as well as iOS. The application provides a 15-min drawing exercise for the student. Upon completing the drawing exercise the essential dynamic and static features are extracted from the handwritten data and processed for data analysis using trained ML algorithms. Internet connectivity is required to access the app services. Compared to Dyscreen this app considered the dynamics of writings also for screening dysgraphia.

Dynamillis [88], a software developed to enhance children’s handwriting skills, requires an iPad with a stylus pen for operation. Leveraging state-of-the-art artificial intelligence-based machine learning algorithms, Dynamillis not only analyzes children’s handwriting for the early detection of difficulties, ranging from mild to severe, including dysgraphia but also suggests personalized plans to improve handwriting. The tablet application meticulously monitors over 100 dynamic features of writing, including velocity, acceleration, pressure, and pen tilt. Trained on handwriting examples from more than 1000 children, the machine learning algorithm ensures a comprehensive approach to handwriting improvement and further enables progress tracking for personalized learning journeys. Table 8 provides the comparison of tools available in the marketplace for dysgraphia diagnosis.

To the best of our knowledge, these are the available software tools for dysgraphia screening in the market. All are free to access (Dynamillis is not completely free, a free trail is provided for one week), however, there are some limitations. The Dyscreen just considers the images only, so the dynamics of the handwriting are not considered. On the other hand, in [87] the data are collected via smartphone screen, and in Dynamillis data is collected via tablet screen, and it is very different from a normal paper, and friction while writing will be very different.

4 Discussion and future recommendations

The popularity and the built-in capabilities of digitizing tablets to acquire information such as the position of the pen tip, on surface/ in-air pen position, pen tip pressure, the azimuth angle of the pen with respect to the tablet surface, the tilt of pen, timestamp [49] made it a suitable tool for the handwriting analysis. Stroke dimensions, Velocity, Acceleration, Jerk, Pressure, Tilt, Temporal, Azimuth angle, and Number of pen elevations are the features that have more or less equal prominence in the dysgraphia discrimination systems [12, 49,50,51,52, 57, 75, 89]. The most relevant features that are spotted out in a multitude of works are Kinematic and Pen tip pressure. While our review has delved into the various automated dysgraphia assessment methods, it is crucial to emphasize the role of feature extraction for enhanced interpretability. The incorporation of kinematic, temporal, spatial, and dynamic features is not only pivotal for accurate classification but also holds significant implications for fundamental researchers and clinicians. These features provide a nuanced understanding of the intricacies of dysgraphia, aiding researchers in uncovering underlying patterns and clinicians in tailoring interventions based on a more comprehensive set of characteristics.

Compared to the approaches merely based on handwriting images, [90, 91] the tablet-based techniques could explore more characteristics of handwriting, which turned out to be significant for the detection of dysgraphia. The data acquisition in the latter approach involved writing with a normal pen or an electronic pen on paper overlaid on the tablet. However, the lower friction surface of tablet computers modifies graphomotor execution, which in turn contradicts the purpose [92, 93]. Multiple studies have reported in the literature about the issues with lower friction surface of tablet for handwriting analysis. In the study reported in [94] all groups, including experienced writers, demonstrated increased writing velocity when using a tablet due to the lower friction of its surface. The smoother tablet surface demands enhanced control of handwriting movements. In another [95] study, it was noted that when young children use tablets for writing, there are challenges in accurately calculating segment trajectories, while older children seem to struggle with controlling their muscular adjustments. There are few studies [96, 97] which reports there is no performance difference between writing on tablet and writing on paper. However, their results indicates that children who used pencil-paper approach for writing demonstrated better letter recognition skills and improved visuo-spatial abilities. Moreover, using a stylus on a touchscreen appeared to be the least favorable writing tool, potentially due to increased motor control demands. The study suggests that further research with longer observation periods is needed to draw conclusions about the long-term effects of writing tools on literacy acquisition and overall cognitive development. Lower friction surface of tablets can be solved to an extent by using appropriate surface stickers available in the market. However, the pressure sensitivity of these tablets may vary depending on the model [98].

Even though the grip force between the hands and writing instrument has a substantial correlation to fine motor performance [99] and the improper handling of the pen is prevalent in motor dysgraphia [100], it is not considered as an attributing factor for dysgraphia prediction. The existing methodologies are meant to provide an overall diagnostic assessment as being dysgraphic/ non-dysgraphic except a few with a grading scale [52] for handwriting difficulties. In [52], Asselborn et al. implemented K means clustering algorithm to distinguish the handwritings of 448 subjects to different 3 clusters that differentiate different writing profiles and level of dysgraphia severity.

Furthermore, it is essential to highlight that solely relying on a single task, whether it involves copying, writing, or drawing, may introduce constraints in the assessment of dysgraphia. Dysgraphia is inherently complex, exhibiting a spectrum of manifestations that a singular task might not fully capture. The intricacies of this condition necessitate a more comprehensive evaluation that considers diverse aspects of handwriting performance.

To our knowledge, there are the number of gaps in the automated dysgraphia detection system domain. Some of the recommendations for future works are:

  • Since the pressure sensitivity of the tablets varies by models and tablets have lower friction surfaces, future dysgraphia diagnosis systems can consider normal paper (writing on paper) based systems rather than writing on tablets.

  • The future systems can explore the importance of improper handling of pen/pencil and grip force between hand and writing instruments as relevant features for discriminating normal and abnormal handwriting.

  • Instead of developing binary systems (dysgraphia or not) for dysgraphia diagnosis, future work should consider developing systems for recognizing different types of dysgraphia or grading the levels of dysgraphia. Because the symptoms of dysgraphia vary with different types and levels of disability.

5 New framework for diagnosis and assistance evaluation of dysgraphia

To address some of the limitations in the existing frameworks, we propose a new framework for diagnosis and assistance evaluation of dysgraphia. An overview of the framework is shown in Fig. 9. The framework consists of 4 steps. In the first step, children are classified into normal and dysgraphic children using psychological methods. Then, an automatic system is developed that takes into account data collection and processing. In this step, depending on the type of approach used, other processing steps can be added. For example, for handwriting on paper, preprocessing should be added.

Fig. 9
figure 9

Overview of proposed framework

Furthermore, instead of feature extraction and classifier, deep learning methods can also be used. However, deep learning models, especially complex neural networks, often face the challenge of interpretability. Ensuring the interpretability of deep learning models is crucial, especially in applications where decisions have significant consequences. In the collection step, the content that the participants are asked to write or draw is very important. Depending on what equipment is available, a combination of this equipment can also be considered. In feature extraction, off-line and on-line features are considered based on the approach of data collection.

After collecting data with different modalities, artificial intelligence (AI) models can be created for each modality based on traditional methods and Deep Learning methods. The analysis of each group of data should be conducted separately and the integration of the data should be conducted through multimodal data fusion techniques. It is also possible to create models using fusion methods at the level of extracted features and classifiers. There are different approaches for feature-level fusion and classifier-level fusion. The choice of the appropriate fusion method depends on the characteristics of the data and the complexity of the relationships between the modalities. Experimentation and validation with appropriate metrics will determine which fusion method is most appropriate for the proposed dysgraphia detection framework. It should be noted that the fusion methods and the analysis of the integration of data have been increased significantly [101,102,103]. It should be noted that the AI models are created on the basis of training data and that the evaluation of the model is based on test data. The decision to detect dysgraphia is based on analysis and on the best results of the approaches under consideration. This may lead us to use only one modality or other scenarios (such as the fusion of two modalities or all modalities) for the treatment assessment step. Since fusion methods can play an important role in this framework, researchers and future users are invited to have a look at some fusion methods in references [100,101,102, 104] to get a better picture. In the last step, which is very important, an approach is proposed for tracking the dysgraphia cases and their treatment evaluation process. This step can be carried out in the periods proposed by the psychologist. The impact of the treatment method can be explored during the tracking step. It is possible to add new treatments and study their effectiveness.

It should be noted that some tasks and steps can be added to the framework such as detecting the type and severity level of dysgraphia. As the process is repeated until good improvement occurs, this is important to avoid protracted and counterproductive factors. The problem should be solved by considering appropriate time intervals suggested by psychologists and by considering interesting patterns for writing and drawing. It is crucial to tailor the assessment process to the individual’s natural writing and drawing behavior. Another aspect that should be explored in the framework is the acceptability of the tools which is a critical aspect of any diagnostic tool, especially in children, that can greatly influence the success of dysgraphia assessment and improvement tracking. The problems in writing in other neurodevelopmental disorders such as attention deficit/hyperactivity disorder (ADHD), developmental coordination disorder (DCD), dyslexia, and ASD can be studied as an important part of our framework that focuses on the assessment process. It may be interesting to examine the correlation of disorders and motor control that impact on writing. The framework can be seen as an aid to the Cognitive Orientation to Occupational Performance (CO-OP) program [105]. This framework can also be combined with treatment approaches as presented in [106]. Since handwriting reeducation plays an important role in treatment approaches and patients may also have solutions based on their perception of the problem [106], we should seek feedback from both patients and neurologists to update AI methods in this framework. Some aspects of acceptability that should be considered are:

  • Focus groups and simulation exercises are excellent methods to assess the acceptability of sensor-based instruments, especially in the context of dysgraphia assessment. With these approaches, it can gather valuable insights from potential users, educators, and other stakeholders before implementing the technology on a larger scale.

  • Comfort and ease of use of sensors: ensuring that body-worn sensors are non-intrusive, that they are comfortable to wear, and that they do not interfere with the writing process.

  • Calibration and setup to ensure that the setup process is simple and minimizes interference with the child’s routine.

  • Ethical considerations for privacy and informed consent, especially when minors are involved.

Fig. 10
figure 10

Workflow of novel methods

The new framework consists of two novel methods (Fig. 10) that can be used along with existing methods for automatic screening of spatial dysgraphia and motor dysgraphia, respectively. For the former one, the dynamic characteristics of handwriting are captured with the help of two types of sensors- (1) Force-sensitive resistors (FSR) attached to the pencil for capturing pencil grip patterns (2) sEMG sensor attached to the hand for capturing muscle activity. The acquired data is preprocessed followed by feature extraction and binary classification with the help of prebuilt machine learning models.

Handwriting is a prominent fine motor skill acquired during the early developmental stage. Several types of research have been carried out to understand the detailed aspects and characteristics of handwriting. Earlier in 1961, Herrick and Otto have signified the barrel pressure variability as an important distinguishing aspect of individual handwriting [107]. The grip force variability and its effect on handwriting legibility are studied in [108]. The demonstrated results indicated that grip force variability during the entire writing task was lower for non-proficient writers. Lin et al. [109] investigated the pen grip kinetics of school-age children to validate the hypothesis of correlation between force control when handling a pen and fine motor performance. The significance of the middle finger compared to the thumb and index finger for the force control is also demonstrated. The writing speed being unaffected by the pencil grip force and muscular activity holds promise for the application of these cues in the remedial program for disabled persons. Surface Electromyography (sEMG) has proven out to study muscle coordination [110], which can be exploited to understand the poor muscle tone in motor dysgraphia. sEMG signals generated during handwriting have been already applied for character recognition [111, 112] and Parkinson’s disease diagnosis [113]. It has been used in Arabic handwriting character recognition as well [114].

For the latter one, a mobile phone camera is utilized to capture the image of a handwritten piece. Relevant features are extracted from the image and further classification is carried out with the aid of machine learning models. The illegibility aspect of handwriting due to the defect in understanding of space results in spatial dysgraphia. Suitable line/ word segmentation techniques [115] can be utilized to observe and compare the line as well as the word spacing. Further, document analysis algorithms [49] serve to identify dysgraphic characteristics such as skewed writing, irregular writing size, etc.

As a result, the framework was developed and implemented at Qatar University to create a new multimodal database for the diagnosis and assistance evaluation of dysgraphia in students. However, the results are not yet ready for publication and will be published as soon as the results are finalized

6 Conclusion

Learning disabilities are often unrecognized, which makes it often misinterpreted as a lack of intelligence. This comes from the fact that the screening process is quite complex. In this work, we presented a survey of existing tools and methods for the preliminary screening of dysgraphia, which is characterized by impaired handwriting skills. We discussed many AI based and Non AI based automated systems for dysgraphia screening. We provided a comparative analysis of these systems and highlighted the strengths and weaknesses. Later we proposed a novel framework for automated dysgraphia screening by combining the existing methods along with new methods.