Deep learning assisted cognitive diagnosis for the D-Riska application

In this article, we present a system that extends D-Riska, a diagnostic application for Acquired Brain Injury (ABI), with an artificial intelligence module that supports the diagnosis of ABI and enables therapists to evaluate patients in an assisted way. The application collects the data from patients' diagnostic tests and, thanks to a multi-class Convolutional Neural Network (CNN) classifier, makes predictions that facilitate the diagnosis and the final score obtained by the patient in the test. To find the best solution to this problem, the proposed model is compared with different classifiers on various classification metrics. The proposed CNN classifier makes predictions with 93% Accuracy, 94% Precision, 91% Recall and 92% F1-Score.

Figure: View of the set of cards that the patient has to classify in the D-Riska test.

The application is responsible for storing all the information related to the session, and allows the therapist to focus on observing how the patient performs the test to facilitate and improve their diagnosis. In addition, it allows this observation to be less invasive, since they do not even need to be in the same physical space [7].
The decision regarding patients' final diagnosis depends only on the therapist's knowledge and interpretation. However, it is considered essential to endow this type of application with the ability to assist or support therapists' final decision. This paper presents an expert module based on Artificial Intelligence (AI), using techniques that stand out in the current state of the art, to reach this goal. Various techniques are considered to tackle this problem in order to achieve a great capacity for success in the model interpretation.

The system consists of two separate physical UIs which are synchronized. While therapists operate the Therapist UI, patients perform assessments on the Patient UI.

Therapist UI. It enables therapists to conduct and analyze the assessment process. To that end, it enables therapists to enter assessment session information (i.e. patient personal information and condition, therapist personal information, etc.) as well as to have full control of the assessment process in real time, since both UIs are connected and synchronized. For instance, therapists can control the Patient UI during the assessment process.

Patient UI. It enables therapists to evaluate patients' condition. To that end, this UI enables patients to group cards (instead of physical plastic objects) using a touch screen. Card movements performed by patients are transferred to the Therapist UI in real time, so that unexpected patterns or behaviors in patients can be observed as soon as possible and therapists can intervene in the assessment process as required.

This paper proposes the development of an expert module to extend the tool previously developed and evaluated in [5] in order to support patients'

This module is not intended to replace the work done by therapists; it only assists and supports their decisions on the evaluation. In addition, as this tool supports session recording to review the assessment process a posteriori, it enables therapists to move their attention to other aspects, such as taking notes while the patient is performing the test, instead of focusing their full attention on every movement performed by patients. Some of the aspects that the therapist can focus on are the procedure followed to carry out the test, rechecking what happened when the session is recorded, and in

• A greater probability of success at diagnosis time.

• A reduction in the average diagnosis time.

• A self-performing and open-source diagnostic tool.

• Enables the therapists to focus their attention on other fundamental aspects of the evaluation.

To reduce the development time, a set of existing convolutional neural networks presented as part of the state of the art in [9] is used as a stat-

In general, the contribution of this work can be summarized as follows:

• Presents a DUI system which employs an AI module that assists therapists in the diagnosis of patients with ABI, and which is migrated to a CC service provider.

• Collects and stores information obtained from tests carried out on real patients to generate sets of valid data to train and validate the system model, while providing reliable persistence and management of this information.

• Presents a classifier that infers the class of a grouping and its proba-

cess, which inspired the design and implementation of the D-Riska application detailed in [5]. The LOTCA was developed as a technique to assess basic cognitive skills and visual perception in adults with neurological disabilities.

It provides an in-depth assessment of basic cognitive skills that can be used for treatment planning as well as for treatment progress reviews [10].

The LOTCA battery assesses the basic cognitive skills required for daily

The classification of Riska objects consists of two sub-tests. In the unstructured sub-test, therapists ask patients to form groupings of objects spontaneously; in the structured sub-test, therapists ask patients to form groupings of objects according to a class, following a given pattern which is presented as an example.

Thus, the D-Riska application enables therapists to carry out the patient assessment process in a similar way to the traditional one, while providing the advantages that digital technologies introduce in the process [5].

The development of the expert module integrated into the D-Riska application supported by a CC architecture is described in detail in the section

The proposed model assigns classes to card groupings in images using a matching percentage, or value, for each class. The class assigned to an image is the one with the highest predicted probability.
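The class assignment described above can be illustrated with a minimal sketch. It assumes the classifier emits one raw score per class and that a softmax turns those scores into probabilities; the class names are taken from the groupings discussed in the evaluation, while the function names and the example scores are hypothetical and not output of the D-Riska model.

```python
import math

# Class names assumed from the five grouping classes discussed in the text.
CLASSES = ["random", "color", "form", "color form", "complete series"]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(scores, classes=CLASSES):
    """Return (class name, probability) for the most probable class."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return classes[best], probs[best]


# Illustrative scores only (hypothetical values).
label, prob = predict_class([0.2, 2.5, 0.1, 1.0, -0.3])
print(label, round(prob, 2))
```

The returned probability can be shown to the therapist alongside the predicted class, so that low-confidence predictions are easy to spot.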

The advantage of this model is that it allows the extraction of significant

On the other hand, a common and highly effective solution used in Deep Learning, especially with small data sets, as is the case here, is to use pretrained networks, a concept known as Transfer Learning [9]. A pretrained network is a stored network that was previously trained to solve a problem

There are two different approaches to using pretrained networks: feature extraction and fine-tuning. As the second one is not considered in this work, only the first one is detailed in the next section.

All these elements are easy to deploy using the framework that stands

2) to generate their outputs. The following steps summarize the procedure followed for the defined networks:

• Load the training, validation and test data sets.

• Reduce the size of all the sets (image resizing).

• Perform data augmentation of the training set images (rescaled).

• Add classification layers to feature extraction models.

(1) The resolution of these images is 1320 x 410 pixels in RGB format. They were arranged in 5 different proportion classes, taking into account that the minority class is the class that corresponds to a random grouping (the grouping most likely to indicate a pathology).
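The feature-extraction procedure listed above can be sketched as follows. This is a minimal illustration that assumes the Keras API bundled with TensorFlow, which the text does not confirm as the framework used. The function name, the 150 x 150 resized input, the 256-unit hidden layer and the optimizer are all hypothetical choices, not the authors' configuration.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_feature_extraction_classifier(input_shape=(150, 150, 3),
                                        num_classes=5,
                                        weights="imagenet"):
    """Stack a small classifier on top of a frozen, pretrained VGG16 base.

    input_shape, the hidden layer size and the optimizer are assumptions.
    """
    # Pretrained convolutional base without its original classification head.
    conv_base = keras.applications.VGG16(
        weights=weights, include_top=False, input_shape=input_shape
    )
    conv_base.trainable = False  # feature extraction: the base is not updated

    inputs = keras.Input(shape=input_shape)
    x = conv_base(inputs, training=False)
    x = layers.Flatten()(x)
    x = layers.Dense(256, activation="relu")(x)
    # One output per grouping class, normalized to probabilities.
    outputs = layers.Dense(num_classes, activation="softmax")(x)

    model = keras.Model(inputs, outputs)
    model.compile(optimizer="rmsprop",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Calling `build_feature_extraction_classifier()` downloads the ImageNet weights on first use. Only the two dense layers are trained, which is what makes this approach suitable for a small data set.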

Since the number of images is too small to address the problem using

The training, validation and test sets are presented in Table 1.

The  that make it unfeasible to use them to solve the problem.

As for the Precision-Recall balance (F1-Score), the VGG16 classifier is again the one with the best results. This implies that, despite the problem being unbalanced, the model correctly predicts minority classes, and especially the random class, which, as noted earlier, corresponds to the cases most susceptible to disease.
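As a quick consistency check, the F1-Score is the harmonic mean of Precision and Recall. The sketch below applies that definition to the 94% Precision and 91% Recall reported in the abstract. It assumes those aggregate figures can be combined directly, which may differ slightly from an F1-Score averaged per class.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Aggregate Precision and Recall reported for the proposed classifier.
f1 = f1_score(0.94, 0.91)
print(round(f1, 2))  # 0.92
```

The result agrees with the 92% F1-Score reported for the VGG16 classifier.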

In the case of MCC, the VGG16 comes remarkably close to perfect prediction; however, the rest of the models are closer to random prediction (which is also clearly visible in the Accuracy of these models).
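For reference, the multi-class Matthews Correlation Coefficient can be computed directly from a confusion matrix. The sketch below uses the standard multi-class generalization of the metric. The function name is hypothetical and the small matrices are toy inputs, not results from this work.

```python
import math


def multiclass_mcc(cm):
    """MCC from a square confusion matrix (rows: true class, columns: predicted)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))
    true_counts = [sum(cm[k]) for k in range(n)]
    pred_counts = [sum(cm[r][k] for r in range(n)) for k in range(n)]

    numerator = correct * total - sum(
        t * p for t, p in zip(true_counts, pred_counts)
    )
    denominator = math.sqrt(
        (total ** 2 - sum(p * p for p in pred_counts))
        * (total ** 2 - sum(t * t for t in true_counts))
    )
    # By convention, return 0.0 when the coefficient is undefined.
    return numerator / denominator if denominator else 0.0


# Toy matrices: perfect prediction and chance-level prediction.
print(multiclass_mcc([[5, 0], [0, 5]]))
print(multiclass_mcc([[5, 5], [5, 5]]))
```

A value of 1 means perfect prediction, 0 corresponds to random prediction, and -1 to total disagreement, which is the scale used to read the comparison above.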

The matrix in Figure 6 presents the Cohen's Kappa values. The classifier with the greatest agreement with the real classes is the VGG16 (0.91), followed by the MobileNet network (0.32). It is easy to see that there is also a certain agreement between them, since their value is 0.32; however, it is not a high agreement value.
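Cohen's Kappa measures the agreement between two label sequences, for example real classes versus predictions or one classifier versus another, corrected for the agreement expected by chance. The following is a minimal sketch of the standard definition. The function name and the toy labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's Kappa between two equally long sequences of class labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)

    # Observed agreement: fraction of positions where both labels match.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Agreement expected by chance, from each sequence's class frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1:
        return 1.0  # both sequences contain a single, identical class
    return (observed - expected) / (1 - expected)


# Toy labels: identical sequences, then chance-level agreement.
real = ["color", "color", "form", "form"]
print(cohen_kappa(real, real))
print(cohen_kappa(real, ["color", "form", "color", "form"]))
```

Values close to 1 indicate strong agreement, while values near 0 indicate agreement no better than chance.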

Figure 7 compares these results graphically for all the metrics.

The closer the score is to 100, the better the result of the classifier for that  to the others.

Figure 8 shows the confusion matrix for the classifier that obtained the best results on the test set, the VGG16. Analyzing this matrix leads to several conclusions about the behavior of the VGG16 classifier for this problem, and about where it finds the greatest difficulties in prediction. As seen in the first row of the matrix, for the "random" case, the proposed system is always correct, having correctly predicted the 45 "random" cases provided. In the second row, for the "color" case, it is correct in 72 of the 75 cases provided, erroneously classifying two as "form" and one as "color form". In the third row, for the "form" case, it obtains similar results to the "color" case, with 72 correct predictions out of 75, this time erroneously classifying 3 as "color". For the "color form" case, in the fourth row, it again obtains, as in the "random" case, a full set of correct answers, correctly classifying the 60 cases provided. The last row, for the "complete series" case, is where the proposed model makes the most errors. The Recall metric for this case drops to 64 out of 100, indicating that the proposed model is only capable of classifying 64% of "complete series" cases correctly, while the vast majority of the rest are classified as "color form".
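The per-class Recall values discussed above follow directly from the rows of the confusion matrix. The sketch below recomputes them for the four rows whose counts are fully stated in the text. The "complete series" row is omitted because only its Recall (64%) is given, not its counts. The class order and the function name are assumptions.

```python
# Class order assumed from the row order described in the text.
CLASSES = ["random", "color", "form", "color form", "complete series"]

# Rows of the confusion matrix as stated in the text (true class -> predictions).
ROWS = {
    "random": [45, 0, 0, 0, 0],
    "color": [0, 72, 2, 1, 0],
    "form": [0, 3, 72, 0, 0],
    "color form": [0, 0, 0, 60, 0],
}


def recall(row, true_index):
    """Fraction of the cases of one true class that were predicted correctly."""
    total = sum(row)
    return row[true_index] / total if total else 0.0


recalls = {name: recall(row, CLASSES.index(name)) for name, row in ROWS.items()}
for name, value in recalls.items():
    print(f"{name}: {value:.2f}")
```

Precision per class cannot be recomputed in the same way from the text alone, since it requires the full columns of the matrix, including the "complete series" row.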

This implies that our classifier struggles to distinguish between the "color form" and "complete series" cases; that is, there is great uncertainty in the classifier's discrimination power between these two classes; therefore,

For the selection of the CNN classifier to be used in the system, we have carried out an experimental evaluation in which we compared several trained classifiers on test cases of the problem to be solved. Consequently, we have decided that the classifier that best suits this problem is the VGG16, which is capable of making predictions with a 93% hit rate and a 92% F1-Score.

Observing the experimental results for this classifier, we consider that they validate the performance of the proposed model in terms of different classification metrics over other classification models.

As future work, this approach could be improved by implementing an expert system or a rule-based system capable of increasing the Precision with which the VGG16 classifier makes its predictions for the "complete series" case, which is the one that generates the most errors, in order to bring it to a value close to 100%.