Artificial intelligence utilising corneal confocal microscopy for the diagnosis of peripheral neuropathy in diabetes mellitus and prediabetes

Preston, Frank G.; Meng, Yanda; Burgess, Jamie; Ferdousi, Maryam; Azmi, Shazli; Petropoulos, Ioannis N.; Kaye, Stephen; Malik, Rayaz A.; Zheng, Yalin; Alam, Uazman

doi:10.1007/s00125-021-05617-x

Published: 21 November 2021

Diabetologia, volume 65, pages 457–466 (2022)

Abstract

Aims/hypothesis

We aimed to develop an artificial intelligence (AI)-based deep learning algorithm (DLA) that applies attribution methods, without image segmentation, to corneal confocal microscopy images and accurately classifies peripheral neuropathy (or its absence).

Methods

The AI-based DLA utilised convolutional neural networks with data augmentation to increase the algorithm’s generalisability. The algorithm was trained using a high-end graphics processor for 300 epochs on 329 corneal nerve images and tested on 40 images (1 image/participant). Participants consisted of healthy volunteer (HV) participants (n = 90) and participants with type 1 diabetes (n = 88), type 2 diabetes (n = 141) and prediabetes (n = 50) (defined as impaired fasting glucose, impaired glucose tolerance or a combination of both), and were classified into HV, those without neuropathy (PN−) (n = 149) and those with neuropathy (PN+) (n = 130). For the AI-based DLA, a modified residual neural network called ResNet-50 was developed and used to extract features from images and perform classification. The algorithm was tested on 40 participants (15 HV, 13 PN−, 12 PN+). Attribution methods gradient-weighted class activation mapping (Grad-CAM), Guided Grad-CAM and occlusion sensitivity displayed the areas within the image that had the greatest impact on the decision of the algorithm.

Results

The results were as follows: HV: recall of 1.0 (95% CI 1.0, 1.0), precision of 0.83 (95% CI 0.65, 1.0), F₁-score of 0.91 (95% CI 0.79, 1.0); PN−: recall of 0.85 (95% CI 0.62, 1.0), precision of 0.92 (95% CI 0.73, 1.0), F₁-score of 0.88 (95% CI 0.71, 1.0); PN+: recall of 0.83 (95% CI 0.58, 1.0), precision of 1.0 (95% CI 1.0, 1.0), F₁-score of 0.91 (95% CI 0.74, 1.0). The features displayed by the attribution methods demonstrated more corneal nerves in HV, a reduction in corneal nerves for PN− and an absence of corneal nerves for PN+ images.

Conclusions/interpretation

We demonstrate promising results in the rapid classification of peripheral neuropathy using a single corneal image. A large-scale multicentre validation study is required to assess the utility of AI-based DLA in screening and diagnostic programmes for diabetic neuropathy.

Introduction

Diabetes mellitus had an estimated worldwide prevalence in 2017 of 451 million which is expected to rise to 693 million people by 2045 [1]. Neuropathy affects ~50% of people with diabetes and diabetes is the leading cause of neuropathy worldwide [2]. It results in neuropathic pain which impacts on quality of life and may lead to foot ulceration and amputation, with an excess premature mortality rate. Peripheral neuropathy has also been demonstrated in approximately 10% of individuals with prediabetes [3]. Given that prediabetes is projected to affect up to 587 million people (8.3% of the global adult population) by 2040, this represents a major burden on healthcare. Early diagnosis of diabetic neuropathy is essential to prevent progression [4] and subsequent morbidity and mortality rate [2]. A robust screening programme that incorporates reliable state-of-the-art technologies and biomarkers is required to deploy targeted screening for neuropathy in prediabetes and diabetes.

Current screening methods for diabetic neuropathy rely on neurological examination or 10 g monofilament which detect moderate to severe neuropathy affecting the large nerve fibres, yet small nerve fibres are the earliest to be damaged. Skin biopsy with quantification of intra-epidermal nerve fibres is the current reference standard to detect small fibre damage [4], but this method is invasive [5] and there are limited specialist clinical laboratories undertaking this procedure, making it unsuitable for population-level screening of peripheral neuropathy. In vivo corneal confocal microscopy (CCM) is a non-invasive, rapid and reiterative ophthalmic imaging technique that can quantify small nerve fibres in the cornea [4, 5]. Indeed, corneal nerve loss occurs in subclinical diabetic neuropathy [6], increases with the severity of diabetic neuropathy [7] and predicts incident diabetic neuropathy [8]. A large body of published data has shown that CCM can be used to diagnose and monitor progression of diabetic neuropathy [4, 9, 10]. Additionally, CCM detects nerve fibre regeneration in clinical trials of patients with diabetic neuropathy [5, 10], which precedes improvements in symptoms and neurophysiology [11]. Normative ranges have been established [12] and the corneal subbasal nerve plexus remains stable in healthy individuals over 3 years [13].

However, quantitative analysis of the subbasal nerve plexus requires reliable extraction of image features [14], and although manual segmentation of corneal nerve fibres is sensitive [15] and reproducible [16], it is operator-dependent and laborious. Dabbah et al. [14] developed an automated image analysis system using a dual model feature descriptor combined with an artificial neural network which correlated highly with manual measurements [17]. Chen et al. [18] further refined the automated software using either a neural network or random forest for classification and achieved a performance equivalent to that of manual annotation, combined with greater reproducibility and speed. More recently, advanced convolutional neural networks (CNNs), a class of deep learning algorithm (DLA), have been developed to enhance feature detection [19] and quantification of corneal nerve fibre morphology and have produced promising results [20,21,22,23]. Williams et al. [22] compared an artificial intelligence (AI)-based DLA with ACCMetrics [18] and demonstrated more consistent quantification of corneal nerve morphology with a superior diagnostic performance [22]. In a small dataset, Scarpa et al. [21] utilised a CNN on corneal nerve images (without segmentation) and classified individuals who were healthy or had diabetic neuropathy with an accuracy of 96%.

Despite providing accurate decisions comparable to human experts, the deployment of AI into medical practice has been partly hindered by its ‘black-box’ nature and the inability to provide the logic for the decision to end users. Thus, identifying the features by which the AI-based DLA classifies disease, in addition to the quantitative algorithmic performance, is key to promoting acceptance within healthcare and by physicians [24]. The primary modality used to explicate AI-based DLA within medical imaging diagnostics is attribution based, where the contribution to the output decision of each input feature is determined, allowing the generation of heat-maps known as attribution maps [24]. Gradient-weighted class activation mapping (Grad-CAM), Guided Grad-CAM and occlusion sensitivity are extensively used attribution methods which generate visual outcomes via attribution maps [25, 26].

The aim of this study was to develop and refine an AI-based DLA utilising image classification to identify healthy volunteer (HV) participants and individuals with prediabetes and diabetes with and without neuropathy, without using image segmentation. Grad-CAM, Guided Grad-CAM and occlusion sensitivity attribution methods were implemented to provide transparency and explanation of the AI-based DLA decision-making process.

Methods

Participants

All participants provided informed valid consent prior to assessments and the study was conducted in accordance with the Declaration of Helsinki. Ethical and institutional approvals were obtained before the participants completed the scientific protocol including CCM imaging. Other causes of peripheral neuropathy (except for diabetes/prediabetes) were excluded based on a comprehensive medical and family history and blood tests (immunoglobulins, anti-nuclear antibody, vitamin B₁₂ levels, thyroid function tests). Prediabetes was defined using standard international criteria (WHO/ADA) (impaired fasting glucose, impaired glucose tolerance or a combination of both). Peripheral neuropathy was defined according to the Toronto Consensus on diabetic neuropathy, which defined confirmed diabetic neuropathy as a combination of an abnormality of nerve conduction studies and a symptom(s) and/or sign(s) of neuropathy [27]. Participant data upon which the peripheral neuropathy diagnosis was originally confirmed were available in 360/369 participants. These data were independently assessed by two authors (UA and MF) to determine the diagnosis. For any disagreement between authors, a third author (INP) made the final decision. The Cohen’s κ score, which measures inter-rater reliability, between UA and MF was 0.962, demonstrating almost perfect agreement.
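
For reference, Cohen's κ compares the agreement observed between two raters with the agreement expected by chance. The following is a minimal sketch, assuming one categorical label per participant from each rater; the function name and the toy labels are illustrative and are not the study data.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length lists of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same, non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_observed - p_chance) / (1 - p_chance)

# Toy labels (hypothetical, not the study data).
toy_a = ["PN+", "PN+", "PN-", "PN-"]
toy_b = ["PN+", "PN+", "PN-", "PN+"]
kappa = cohens_kappa(toy_a, toy_b)
```

In this toy case the raters agree on three of four items, chance agreement is 0.5, and κ is 0.5; values close to 1, such as the 0.962 reported above, indicate that almost all agreement exceeds what chance would produce.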

Image dataset and dataset preparation

The dataset (Early Neuropathy Assessment [ENA] group, University of Manchester, UK) consisted of images of the corneal subbasal nerve plexus from HV participants and participants with prediabetes and diabetes (n = 369). The CCM images were captured, using a standard, internationally accepted protocol developed by the ENA group, at 400 × 400 μm (384 × 384 pixels) using a Heidelberg Retina Tomograph III using the Rostock Corneal Module (RCM; HRTII32-RCM) confocal laser microscope (Heidelberg Engineering, Heidelberg, Germany). To enable compatibility with the image analysis software, the images were exported in the BMP file format. The images used were from: HV (n = 90); type 1 diabetes with neuropathy (n = 39); type 1 diabetes without neuropathy (n = 49); type 2 diabetes with neuropathy (n = 67); type 2 diabetes without neuropathy (n = 74); prediabetes with neuropathy (n = 24); prediabetes without neuropathy (n = 26). There were 90 HV participants, 149 participants with no peripheral neuropathy (PN−) and 130 participants with peripheral neuropathy (PN+) (Fig. 1). Neuropathy data for each of the three groups are detailed in Fig. 1. In keeping with the neuropathic phenotype, people with confirmed peripheral neuropathy had greater neuropathic deficits with more signs (higher neuropathy disability score) and symptoms (higher neuropathy symptom profile), higher vibration perception threshold, and lower peroneal and sural nerve conduction velocities and amplitudes, corneal nerve fibre length (CNFL), corneal nerve branch density and corneal nerve fibre density. As expected, people with peripheral neuropathy were older and, in those with diabetes, there was a longer duration of disease.

Each of the CCM images was labelled with its respective class, ‘control’, ‘no neuropathy’ or ‘neuropathy’, allowing supervised training to occur. Out of a total of 369 CCM images, 245 (66%) were used in the training set, 84 (23%) in the validation set and 40 (11%) in the test set. Electronic supplementary material (ESM) Table 1 demonstrates the breakdown of participant groups within the training, validation and test sets. The distribution between groups was allocated randomly, using the Python package ‘random’ to generate a random number for each image. A random number was generated between 0 and 1; if it was between 0 and 0.1, the image was put into the test dataset; if it was between 0.1 and 0.3, it was put into the validation dataset; and if it was between 0.3 and 1, it was put into the training dataset. Each participant had up to seven CCM images; however, when all the images were used the AI-based DLA suffered significantly from overfitting. Therefore, a single image for each participant was selected at random. Data augmentation strategies were employed in the training of the algorithm, having been previously shown to increase the generalisability of AI-based DLAs [28], where additional training images were generated via images being either rotated between 0 and 90 degrees or flipped on their horizontal axis.
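
The random allocation described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the function names and the `seed` parameter are hypothetical, and the handling of the exact boundary values 0.1 and 0.3 is assumed.

```python
import random

def assign_split(r):
    """Map a random number r in [0, 1) to a dataset split using the thresholds in the text."""
    # Which split the exact boundary values 0.1 and 0.3 fall into is an assumption.
    if r < 0.1:
        return "test"
    if r < 0.3:
        return "validation"
    return "training"

def split_images(image_ids, seed=None):
    """Assign each image ID to a split by drawing one random number per image."""
    rng = random.Random(seed)  # seed is a hypothetical addition for reproducibility
    splits = {"training": [], "validation": [], "test": []}
    for image_id in image_ids:
        splits[assign_split(rng.random())].append(image_id)
    return splits

splits = split_images(range(369), seed=0)
```

Because each image is assigned independently, the realised proportions vary around the nominal 70%/20%/10%, which is compatible with the 66%/23%/11% split reported above.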

Network architecture

ResNet is a residual neural network proposed by He et al. [29] at the 2015 ImageNet competition, where it achieved first place in the classification task. A ResNet network was chosen as it overcomes the ‘vanishing gradient problem’ [30] by introducing skip (or shortcut) connections, where input from a previous layer can be transferred to the next layer without modification, allowing ResNet to have up to 152 layers [29]. Our ResNet-50 model comprises 50 layers, which culminate in a dense layer of 1000 neurons with a softmax activation function. Two types of shortcut modules allow the ResNet-50 model to employ skip connections: convolution blocks and identity blocks. Convolution blocks contain a convolutional layer within the skip connection, so the input and output dimensions of the block differ. Identity blocks do not contain a convolutional layer within the skip connection, meaning input and output dimensions are the same. In both shortcut modules, a 1 × 1 convolutional layer begins and ends the module, employing a bottleneck design to enable the reduction of parameters without degrading network performance.

A modified version of the ResNet-50 architecture (Fig. 2) was used to extract features from the images and perform classification. Modifications involved replacing the dense layer of 1000 neurons with one of 2048 neurons, adding a dropout layer with a rate of 0.6 after this layer and ending with a final dense layer of three neurons, one per class, to which the softmax activation function was applied. The class with the largest predicted probability (i.e. argmax) was used as the class label. The dropout layer was added to reduce overfitting; it achieves this by randomly dropping neurons and their connections during training, which prevents neurons from co-adapting, where one neuron corrects the mistakes of others in a way that does not generalise to new data [31]. The initial weights of the model were pre-trained on the ‘ImageNet’ dataset [32], with the weights in all layers set to be trainable.
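
The final classification step (softmax over three outputs followed by argmax) can be illustrated with a short sketch. The ordering of the class names and the example scores are assumptions made for illustration; they are not taken from the trained model.

```python
import math

# Class labels from the dataset section; their index order here is an assumption.
CLASS_NAMES = ["control", "no neuropathy", "neuropathy"]

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_label(logits):
    """Return the class with the largest softmax probability (argmax) and all probabilities."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASS_NAMES[best], probs

# Hypothetical raw outputs of the final three-neuron dense layer.
label, probs = predict_label([0.2, 1.5, -0.3])
```

Here the second output is largest, so the predicted label is the second class; the probabilities themselves can be retained when a confidence estimate is needed alongside the label.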

Additional models

Further experiments were conducted to allow comparison with the modified ResNet-50 model. We adopted the backbones of MobileNet and MobileNetV2 [33] to perform the comparison experiments under the same experimental settings. Note that the same modification was made to the model structure. Compared with ResNet-50, MobileNet and MobileNetV2 are lightweight models that contain fewer model parameters. This choice was made to demonstrate the effect of model size in this work.

Implementation details

Before training the model, we pre-processed the input images. First, we resized each image from 384 × 384 to 224 × 224 pixels using bilinear interpolation. We then increased the number of image channels from 1 to 3 by replicating the image along the channel axis. Finally, we scaled the pixel values to [0, 1] and then normalised them to the range [−1, 1] using a mean of 0.5 and an SD of 0.5 for all three channels. The underlying motivations are threefold. First, because of the limited GPU memory, we resized the input images to a lower resolution for training. Second, given the limited dataset size in this work, overfitting may be a potential issue during model training; to address this, we increased the number of image channels so that a model pre-trained on ImageNet [32] could be used. Third, normalisation of pixel values can stabilise the training process and benefit optimisation [34].
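
The scaling, normalisation and channel replication steps can be sketched in a few lines. This sketch assumes 8-bit greyscale input (0 to 255), omits the bilinear resizing step, and uses hypothetical function names and a toy 2 × 2 image.

```python
def normalise_pixel(value, mean=0.5, sd=0.5):
    """Scale an 8-bit intensity to [0, 1], then standardise it to [-1, 1]."""
    scaled = value / 255.0  # assumes 8-bit greyscale input (0-255)
    return (scaled - mean) / sd

def to_three_channels(grey_image):
    """Normalise a greyscale image (rows of intensities) and replicate it across 3 channels."""
    return [[[normalise_pixel(v)] * 3 for v in row] for row in grey_image]

tiny = [[0, 255], [51, 204]]  # hypothetical 2 x 2 image, not study data
out = to_three_channels(tiny)
```

With a mean and SD of 0.5, an intensity of 0 maps to −1 and 255 maps to 1, so the full 8-bit range fills the interval [−1, 1] stated above.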

The model was trained for 300 epochs (passes of the entire training dataset) on the training dataset and evaluated on the validation dataset. In each epoch, the model was trained on the training dataset (245 images) and then used to predict the class of images in the validation dataset (84 images) to determine the validation accuracy. During training, the model’s weights were updated via backpropagation and gradient descent; the weights that achieved the highest validation accuracy were saved and applied to the test set (40 images, corresponding to 40 participants) to perform classification. Experiments were conducted with batch sizes of 12, 24 and 36; learning rates of 0.01, 0.001 and 0.0001; and dropout rates of 0.6, 0.4 and 0.2. The final hyperparameters, set empirically, were a batch size of 12, a learning rate of 0.001 and a dropout rate of 0.6. The optimiser was stochastic gradient descent (SGD), and the loss function was cross entropy. Early stopping monitored validation accuracy and discontinued training if no improvement in validation accuracy occurred within 100 epochs.
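
The checkpointing and early-stopping logic described above can be illustrated independently of any deep learning framework. The sketch below assumes that only a strictly higher validation accuracy counts as an improvement; the function name and the example history are hypothetical.

```python
def select_best_epoch(val_accuracies, patience=100):
    """Return (best_epoch, best_accuracy, stopped_epoch) for a sequence of validation accuracies.

    Epochs are 1-indexed. Training stops once `patience` consecutive epochs
    pass without a strictly higher validation accuracy.
    """
    best_epoch, best_acc = 0, float("-inf")
    for epoch, acc in enumerate(val_accuracies, start=1):
        if acc > best_acc:
            best_epoch, best_acc = epoch, acc   # the point at which weights would be saved
        elif epoch - best_epoch >= patience:
            return best_epoch, best_acc, epoch  # early stop
    return best_epoch, best_acc, len(val_accuracies)

# Hypothetical validation accuracies; a short patience is used to keep the example small.
history = [0.50, 0.62, 0.71, 0.70, 0.69, 0.71, 0.68]
result = select_best_epoch(history, patience=3)
```

In this example the best accuracy is first reached at epoch 3, no later epoch exceeds it, and training stops at epoch 6; the weights saved at epoch 3 would then be applied to the test set.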

The model was developed, tested and trained within Python 3.7 (https://www.python.org/), Tensorflow 2.2.0 (https://www.tensorflow.org/; Google, Mountain View, CA, USA) and Keras 1.0.8 (https://keras.io/) on a high-end graphics processor, NVIDIA GeForce GTX 960M (NVIDIA, Santa Clara, CA, USA).

Performance evaluation

A confusion matrix was developed to ascertain the AI-based DLA performance, displaying the true image classifications against the classifications predicted by the AI-based DLA. Using the confusion matrix, a classification report was produced displaying the widely used performance metrics precision, recall (also known as sensitivity) and F₁-score. Precision is the proportion of true positives among all predicted positives and therefore reflects the effect of false positives. Recall is the proportion of true positives among all actual positives and therefore reflects the effect of false negatives. The F₁-score, F₁ = 2 × (precision × recall)/(precision + recall), is the harmonic mean of precision and recall and measures the trade-off between them. 95% CIs were generated to indicate the uncertainty of each estimate. In detail, 2000 samples of Clopper–Pearson interval [35] were used for precision, recall and F₁. Fivefold cross-validation was done across all experiments to provide more robust results; the performance is reported as the mean of fivefold results.

Attribution maps

The attribution method Grad-CAM utilises the gradients entering the final convolutional layer to generate a coarse attribution map, which demonstrates the areas in the image that have impacted the decision most [25]. Grad-CAM can be further combined with a fine-grained, pixel-level visualisation (guided backpropagation) to generate a high-resolution class-discriminative visualisation known as Guided Grad-CAM [25]. Occlusion sensitivity systematically occludes different areas of the input image with a grey patch, and monitors the effect of this on the classification [26]. A grey patch of 48 pixels was used in this study. Grad-CAM, Guided Grad-CAM and occlusion sensitivity were employed to generate attribution maps for each of the test images.
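
Occlusion sensitivity is simple enough to sketch without a trained network. In the sketch below, `score_fn` is a hypothetical stand-in for the model's score for the class of interest, and the stride and grey value are assumptions; only the 48-pixel patch size comes from the text.

```python
def occlusion_map(image, score_fn, patch=48, stride=48, grey=0.5):
    """Slide a grey patch over `image` and record how much the class score drops.

    image    : 2D list of pixel values
    score_fn : callable returning the model's score for the class of interest
               (a stand-in for the trained network; hypothetical here)
    """
    height, width = len(image), len(image[0])
    baseline = score_fn(image)
    drops = []
    for top in range(0, height, stride):
        row_drops = []
        for left in range(0, width, stride):
            occluded = [row[:] for row in image]  # copy so the input is untouched
            for y in range(top, min(top + patch, height)):
                for x in range(left, min(left + patch, width)):
                    occluded[y][x] = grey
            row_drops.append(baseline - score_fn(occluded))  # large drop = important region
        drops.append(row_drops)
    return drops

# Toy example: a 4 x 4 "image" whose score depends only on the top-left 2 x 2 block.
toy_image = [[1.0] * 4 for _ in range(4)]
toy_score = lambda img: sum(img[y][x] for y in range(2) for x in range(2)) / 4
heat = occlusion_map(toy_image, toy_score, patch=2, stride=2, grey=0.0)
```

Only the patch covering the top-left block changes the toy score, so that cell of the map shows the full drop and the others show none. With a real classifier, the same map highlights the regions whose occlusion most reduces the predicted class score.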

Results

ResNet-50 classification performance

The confusion matrix generated after the trained AI-based DLA had classified the test dataset (n = 40) is displayed in Table 1. All HV images (n = 15) were correctly detected by the AI-based DLA. Out of the PN− images (n = 13), 11 were correctly detected by the AI-based DLA, and two misclassified as HV images. Of the PN+ images (n = 12), ten were correctly detected, with one misclassified as PN− and one as HV.

Table 1 Confusion matrix report from modified ResNet-50 in HV, PN− and PN+

Using the data demonstrated in the confusion matrix, a classification report (Table 2) was produced with the performance metrics described previously. In detecting HV images, the AI-based DLA had a recall of 1.0 (95% CI 1.0, 1.0), precision of 0.83 (95% CI 0.65, 1.0) and F₁-score of 0.91 (95% CI 0.79, 1.0); for PN− images, the AI-based DLA had a recall of 0.85 (95% CI 0.62, 1.0), precision of 0.92 (95% CI 0.73, 1.0) and F₁-score of 0.88 (95% CI 0.71, 1.0); and for PN+ images, the AI-based DLA had a recall of 0.83 (95% CI 0.58, 1.0), precision of 1.0 (95% CI 1.0, 1.0) and F₁-score of 0.91 (95% CI 0.74, 1.0).
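
The point estimates in the classification report can be re-derived from the confusion matrix counts stated in the text (15/15 HV correct; 11/13 PN− correct with two classified as HV; 10/12 PN+ correct with one classified as PN− and one as HV). The sketch below is an illustrative check with hypothetical names; it does not reproduce the CIs.

```python
CLASSES = ["HV", "PN-", "PN+"]
# Rows = true class, columns = predicted class, in the order of CLASSES.
CONFUSION = [
    [15, 0, 0],   # HV: all 15 correct
    [2, 11, 0],   # PN-: 11 correct, 2 predicted as HV
    [1, 1, 10],   # PN+: 10 correct, 1 predicted as HV, 1 as PN-
]

def classification_report(matrix):
    """Per-class (precision, recall, F1), rounded to 2 d.p., from a square confusion matrix."""
    report = {}
    for i, name in enumerate(CLASSES):
        true_pos = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column total
        actual = sum(matrix[i])                    # row total
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[name] = (round(precision, 2), round(recall, 2), round(f1, 2))
    return report

report = classification_report(CONFUSION)
```

The rounded values agree with those reported: 0.83/1.0/0.91 for HV, 0.92/0.85/0.88 for PN− and 1.0/0.83/0.91 for PN+ (precision/recall/F₁-score).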

Table 2 Classification report from modified ResNet-50 in HV, PN− and PN+

MobileNet and MobileNetV2 classification performance

Confusion matrices were also generated for MobileNet (ESM Table 2) and MobileNetV2 (ESM Table 3). Classification reports were produced based on these confusion matrices for both MobileNet (ESM Table 4) and MobileNetV2 (ESM Table 5). In detecting HV images, MobileNet had a recall of 1.0 (95% CI 1.0, 1.0), precision of 0.68 (95% CI 0.50, 0.87) and F₁-score of 0.81 (95% CI 0.67, 0.93); for PN− images, MobileNet had a recall of 0.54 (95% CI 0.25, 0.82), precision of 0.88 (95% CI 0.57, 1.0) and F₁-score of 0.67 (95% CI 0.36, 0.87); and for PN+ images, MobileNet had a recall of 0.75 (95% CI 0.46, 1.0), precision of 0.90 (95% CI 0.67, 1.0) and F₁-score of 0.82 (95% CI 0.58, 0.96). In detecting HV images, MobileNetV2 had a recall of 0.87 (95% CI 0.67, 1.0), precision of 0.72 (95% CI 0.50, 0.93) and F₁-score of 0.79 (95% CI 0.60, 0.92); for PN− images, MobileNetV2 had a recall of 0.62 (95% CI 0.33, 0.90), precision of 0.67 (95% CI 0.36, 0.92) and F₁-score of 0.64 (95% CI 0.38, 0.84); and for PN+ images, MobileNetV2 had a recall of 0.75 (95% CI 0.46, 1.0), precision of 0.90 (95% CI 0.67, 1.0) and F₁-score of 0.82 (95% CI 0.57, 0.97).

The ResNet-50 model had the lowest number of misclassifications (n = 4), followed by MobileNet (n = 9) and MobileNetV2 (n = 10). The ResNet-50 model also matched or exceeded MobileNet and MobileNetV2 in all performance metrics across all classes (HV recall was 1.0 for both ResNet-50 and MobileNet). For instance, in detecting PN+, ResNet-50 achieved 10.7%, 11.1% and 11.0% higher recall, precision and F₁-score, in relative terms, than both MobileNet and MobileNetV2.
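
The percentage differences quoted for PN+ are consistent with relative (not absolute) improvements over the MobileNet values, as the following short check suggests; the function name is hypothetical and the inputs are the rounded metrics reported above.

```python
def relative_gain_percent(new, baseline):
    """Relative improvement of `new` over `baseline`, in percent (1 d.p.)."""
    return round((new - baseline) / baseline * 100, 1)

# PN+ metrics as reported: (ResNet-50, MobileNet and MobileNetV2, which share these values).
pn_plus = {"recall": (0.83, 0.75), "precision": (1.0, 0.90), "f1": (0.91, 0.82)}
gains = {metric: relative_gain_percent(*pair) for metric, pair in pn_plus.items()}
```

For example, the recall gain is (0.83 − 0.75)/0.75 ≈ 10.7%, whereas the absolute difference is 8 percentage points.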

ResNet-50 attribution maps

Figure 3 shows six example CCM images from the test set that were correctly detected, and the resulting Grad-CAM, Guided Grad-CAM and occlusion sensitivity images generated. The attribution maps for correctly detected HV (Fig. 3a,b) highlighted the presence of corneal nerves, focusing on the main nerve segment, emphasised by the Guided Grad-CAM. PN− images that were correctly detected (Fig. 3c,d) had attribution maps which highlighted areas of corneal nerves but displayed shorter segments on the Guided Grad-CAM. Attribution maps from correctly detected PN+ images (Fig. 3e,f) highlighted areas with absence of corneal nerves. Compared with occlusion sensitivity maps, Grad-CAM and Guided Grad-CAM were able to indicate the attribution of the image more clearly. ESM Fig. 1 and ESM Fig. 2 display the attribution maps generated from MobileNet and MobileNetV2, respectively.

Discussion

CCM is a non-invasive ophthalmic imaging modality which may serve as a bona fide biomarker of diabetic neuropathy [36]. It has been posited as a game changer in the screening and diagnosis of diabetic and peripheral neuropathies [37]. Our study demonstrated two main findings: (1) the successful development of an AI-based algorithm without the need for nerve segmentation; and (2) the accurate classification of individuals with and without peripheral neuropathy and healthy individuals. This is the first study to utilise an AI-based DLA for the classification of peripheral neuropathy with the addition of attribution methods to provide transparency and explanation of the decision-making process.

A number of studies have developed automated software or neural networks for the segmentation of CCM images [14, 23, 38, 39]. Dabbah et al. [14] proposed an automated system that quantified the nerve fibres and used them as feature vectors to enable classification via random forest and neural network classifiers, producing results that matched the expert manual annotation. Colonna et al. [38] proposed a U-Net-based CNN for automated tracing of corneal nerves, achieving 95% sensitivity compared with the manual tracing. Zhou et al. [23] also developed an improved U-Net architecture, achieving superior results compared with baseline and superior performance relative to existing DLAs for segmentation. Zhao et al. [39] proposed a noise-constrained Retinex model to first enhance the CCM image and used exponential curve estimation as the tortuosity measure to outperform previously used methods, with results comparable to those of human experts. More recently, Mou et al. [40] proposed a curvilinear structure segmentation network validated using six different imaging modalities including CCM, using both 2D and 3D images, which outperformed a number of other state-of-the-art algorithms.

Previously, Williams et al. [22] presented a novel DLA for estimation of CNFL, which achieved an AUC of 0.83, specificity of 87% and sensitivity of 68% for the diagnosis of peripheral neuropathy. A recent study by Oakley et al. [41] utilised a CNN in macaque CCM images with the advantage of being retrained for additional in vitro [42] and in vivo [43] corneal imaging modalities. For both Williams et al. [22] and Oakley et al. [41], deep learning outperformed ACCMetrics, the current most commonly utilised programme for CNFL estimation. However, the development of such AI-based systems requires the acquisition of large image datasets with human-graded (ground truth) annotations as a reference standard to train the algorithm [44]. Our study validates the use of an AI-based DLA to diagnose peripheral neuropathy without image segmentation prior to classification. The lack of requirement of manual or automated annotation to train the AI-based DLA allows the utilisation of larger datasets as only unannotated CCM images are required [45]. Without reliance on predetermined features and variables, our method enables the AI-based DLA to learn the features it considers of importance, allowing a more complex image analysis. In our study, two non-neuropathic (PN−) images were classified as healthy, suggesting a lack of subclinical small fibre loss; this essentially denotes a correct determination of lack of disease and further adds to the method’s validity.

In general, there is a paucity of studies demonstrating the accurate classification (without segmentation) of peripheral neuropathy based on CCM images. As discussed, our AI-based DLA does not rely on traditional methods of image segmentation. Scarpa et al. [21] also employed an AI-based DLA to classify CCM images without image segmentation utilising a CNN, which analysed three non-overlapping images of each eye per individual, classifying them as either healthy or pathological [21]. Our AI-based DLA achieved comparable results in participants with diabetic neuropathy, but additionally differentiated healthy people from individuals with prediabetes or diabetes without neuropathy, indicating that our AI-based DLA detects early subclinical neuropathy in a real-world clinical setting. Recently, Salahouddin et al. [46] developed a novel automated AI-based analysis system which rapidly quantified CNFL and classified patients with diabetic neuropathy using an adaptive neuro-fuzzy inference system, achieving an AUC of 0.95 (92% sensitivity/80% specificity) for discriminating patients with and without diabetic neuropathy. We propose the instigation of a screening programme for diabetic neuropathy utilising CCM alongside diabetic retinopathy screening [47]. The Food and Drug Administration (FDA) has recently approved the first autonomous AI-based DLA to screen for diabetic retinopathy [48]. In Scotland, an AI-based algorithm was used in a real-world screening service and demonstrated good sensitivity and specificity for detecting high-risk retinopathy, which halved the workload for human graders [49].

Our study was based on a relatively small dataset (N = 369 participants), resulting in wide CIs, but nevertheless achieved reasonable classification accuracy. Furthermore, only one image from each participant was used, unlike previous studies [18, 21, 22, 46] which have used multiple images. Indeed, despite defining diabetic neuropathy using the Toronto criteria [27], which rely on abnormal nerve conduction [5], our AI-based DLA, which identifies small fibre pathology known to precede large fibre involvement, still achieved reasonable outcomes. Further refining the model by including additional clinical and demographic data may help to further improve the diagnostic performance. This AI-based DLA needs to be validated in a larger study utilising small fibre measures to identify neuropathy and prospectively in a large-scale clinical population. If validated, cost-effectiveness models need to be established to ascertain its health economics impact.

In conclusion, our AI-based DLA achieved a good classification between HV participants and people with prediabetes and diabetes with and without neuropathy, and the addition of attribution methods aids transparency in the decision making. This AI-based DLA, if validated in a larger study, has considerable potential to be adopted into a screening programme for diabetic neuropathy.

Abbreviations

AI: Artificial intelligence
CCM: Corneal confocal microscopy
CNFL: Corneal nerve fibre length
CNN: Convolutional neural network
DLA: Deep learning algorithm
ENA: Early Neuropathy Assessment
Grad-CAM: Gradient-weighted class activation mapping
HV: Healthy volunteer
PN+: Participants with peripheral neuropathy
PN−: Participants without peripheral neuropathy

References

Cho NH, Shaw JE, Karuranga S et al (2018) IDF diabetes atlas: global estimates of diabetes prevalence for 2017 and projections for 2045. Diabetes Res Clin Pract 138:271–281. https://doi.org/10.1016/j.diabres.2018.02.023
Article CAS PubMed Google Scholar
Iqbal Z, Azmi S, Yadav R et al (2018) Diabetic peripheral neuropathy: epidemiology, diagnosis, and pharmacotherapy. Clin Ther 40(6):828–849. https://doi.org/10.1016/j.clinthera.2018.04.001
Article PubMed Google Scholar
Kirthi V, Perumbalath A, Brown E et al (2021) Prevalence of peripheral neuropathy in pre-diabetes: a systematic review. BMJ Open Diabetes Res Care 9(1):e002040. https://doi.org/10.1136/bmjdrc-2020-002040
Article PubMed PubMed Central Google Scholar
Alam U, Jeziorska M, Petropoulos IN et al (2017) Diagnostic utility of corneal confocal microscopy and intra-epidermal nerve fibre density in diabetic neuropathy. PLoS One 12(7):e0180175. https://doi.org/10.1371/journal.pone.0180175
Article CAS PubMed PubMed Central Google Scholar
Petropoulos IN, Ponirakis G, Khan A, Almuhannadi H, Gad H, Malik RA (2018) Diagnosing diabetic neuropathy: something old, something new. Diabetes Metab J 42(4):255–269. https://doi.org/10.4093/dmj.2018.0056
Article PubMed PubMed Central Google Scholar
Tavakoli M, Quattrini C, Abbott C et al (2010) Corneal confocal microscopy: a novel noninvasive test to diagnose and stratify the severity of human diabetic neuropathy. Diabetes Care 33(8):1792–1797. https://doi.org/10.2337/dc10-0253
Article PubMed PubMed Central Google Scholar
Petropoulos IN, Alam U, Fadavi H et al (2013) Corneal nerve loss detected with corneal confocal microscopy is symmetrical and related to the severity of diabetic polyneuropathy. Diabetes Care 36(11):3646–3651. https://doi.org/10.2337/dc13-0193
Lewis EJH, Lovblom LE, Ferdousi M et al (2020) Rapid corneal nerve fiber loss: a marker of diabetic neuropathy onset and progression. Diabetes Care 43(8):1829–1835. https://doi.org/10.2337/dc19-0951
Petropoulos IN, Manzoor T, Morgan P et al (2013) Repeatability of in vivo corneal confocal microscopy to quantify corneal nerve morphology. Cornea 32(5):e83–e89. https://doi.org/10.1097/ICO.0b013e3182749419
Tavakoli M, Mitu-Pretorian M, Petropoulos IN et al (2013) Corneal confocal microscopy detects early nerve regeneration in diabetic neuropathy after simultaneous pancreas and kidney transplantation. Diabetes 62(1):254–260. https://doi.org/10.2337/db12-0574
Azmi S, Jeziorska M, Ferdousi M et al (2019) Early nerve fibre regeneration in individuals with type 1 diabetes after simultaneous pancreas and kidney transplantation. Diabetologia 62(8):1478–1487. https://doi.org/10.1007/s00125-019-4897-y
Tavakoli M, Ferdousi M, Petropoulos IN et al (2015) Normative values for corneal nerve morphology assessed using corneal confocal microscopy: a multinational normative data set. Diabetes Care 38(5):838–843. https://doi.org/10.2337/dc14-2311
Dehghani C, Pritchard N, Edwards K et al (2014) Morphometric stability of the corneal subbasal nerve plexus in healthy individuals: a 3-year longitudinal study using corneal confocal microscopy. Invest Ophthalmol Vis Sci 55(5):3195–3199. https://doi.org/10.1167/iovs.14-13959
Dabbah MA, Graham J, Petropoulos IN, Tavakoli M, Malik RA (2011) Automatic analysis of diabetic peripheral neuropathy using multi-scale quantitative morphology of nerve fibres in corneal confocal microscopy imaging. Med Image Anal 15(5):738–747. https://doi.org/10.1016/j.media.2011.05.016
Ahmed A, Bril V, Orszag A et al (2012) Detection of diabetic sensorimotor polyneuropathy by corneal confocal microscopy in type 1 diabetes: a concurrent validity study. Diabetes Care 35(4):821–828. https://doi.org/10.2337/dc11-1396
Kalteniece A, Ferdousi M, Adam S et al (2017) Corneal confocal microscopy is a rapid reproducible ophthalmic technique for quantifying corneal nerve abnormalities. PLoS One 12(8):e0183040. https://doi.org/10.1371/journal.pone.0183040
Petropoulos IN, Alam U, Fadavi H et al (2014) Rapid automated diagnosis of diabetic peripheral neuropathy with in vivo corneal confocal microscopy. Invest Ophthalmol Vis Sci 55(4):2071–2078. https://doi.org/10.1167/iovs.13-13787
Chen X, Graham J, Dabbah MA, Petropoulos IN, Tavakoli M, Malik RA (2017) An automatic tool for quantification of nerve fibers in corneal confocal microscopy images. IEEE Trans Biomed Eng 64(4):786–794. https://doi.org/10.1109/tbme.2016.2573642
LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444. https://doi.org/10.1038/nature14539
Salahuddin T, Al-Maadeed SA, Petropoulos IN, Malik RA, Ilyas SK, Qidwai U (2019) Smart neuropathy detection using machine intelligence: Filling the void between clinical practice and early diagnosis. In: 2019 Third World Conference on Smart Trends in Systems Security and Sustainablity (WorldS4). IEEE, New Jersey, United States, pp 141–146
Scarpa F, Colonna A, Ruggeri A (2020) Multiple-image deep learning analysis for neuropathy detection in corneal nerve images. Cornea 39(3):342–347. https://doi.org/10.1097/ico.0000000000002181
Williams BM, Borroni D, Liu R et al (2020) An artificial intelligence-based deep learning algorithm for the diagnosis of diabetic neuropathy using corneal confocal microscopy: a development and validation study. Diabetologia 63(2):419–430. https://doi.org/10.1007/s00125-019-05023-4
Zhou X, Chen X, Feng S, Shi F (2020) An improved U-Net for nerve fibre segmentation in confocal corneal microscopy images. In: Medical Imaging 2020: Image Processing. Vol. 11313. International Society for Optics and Photonics, Washington, United States, p 113131Z
Singh A, Sengupta S, Lakshminarayanan V (2020) Explainable deep learning models in medical image analysis. J Imaging 6(6):52
Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, Batra D (2017) Grad-CAM: Visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE International Conference on Computer Vision. IEEE, New Jersey, United States, pp 618–626
Zeiler MD, Fergus R (2014) Visualizing and understanding convolutional networks. In: European Conference on Computer Vision. Springer, Berlin/Heidelberg, Germany, pp 818–833
Tesfaye S, Boulton AJ, Dyck PJ et al (2010) Diabetic neuropathies: update on definitions, diagnostic criteria, estimation of severity, and treatments. Diabetes Care 33(10):2285–2293. https://doi.org/10.2337/dc10-1303
Ronneberger O, Fischer P, Brox T (2015) U-Net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, Berlin/Heidelberg, Germany, pp 234–241
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, New Jersey, United States, pp 770–778
Bengio Y, Simard P, Frasconi P (1994) Learning long-term dependencies with gradient descent is difficult. IEEE Trans Neural Netw 5(2):157–166. https://doi.org/10.1109/72.279181
Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15(1):1929–1958
Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L (2009) ImageNet: A large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, New Jersey, United States, pp 248–255
Sandler M, Howard A, Zhu M, Zhmoginov A, Chen L-C (2018) MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, New Jersey, United States, pp 4510–4520
Ioffe S, Szegedy C (2015) Batch normalization: Accelerating deep network training by reducing internal covariate shift. In: International Conference on Machine Learning. PMLR, United States, pp 448–456
Clopper CJ, Pearson ES (1934) The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika 26(4):404–413
Alam U, Jeziorska M, Petropoulos IN et al (2017) Diagnostic utility of corneal confocal microscopy and intra-epidermal nerve fibre density in diabetic neuropathy. PLoS One 12(7):e0180175. https://doi.org/10.1371/journal.pone.0180175
Petropoulos IN, Ponirakis G, Ferdousi M et al (2021) Corneal confocal microscopy: a biomarker for diabetic peripheral neuropathy. Clin Ther S0149-2918(21):00197–00191. https://doi.org/10.1016/j.clinthera.2021.04.003
Colonna A, Scarpa F, Ruggeri A (2018) Segmentation of corneal nerves using a U-Net-based convolutional neural network. In: Computational Pathology and Ophthalmic Medical Image Analysis. Springer, Berlin/Heidelberg, Germany, pp 185–192
Zhao Y, Zhang J, Pereira E et al (2020) Automated tortuosity analysis of nerve fibers in corneal confocal microscopy. IEEE Trans Med Imaging 39(9):2725–2737. https://doi.org/10.1109/tmi.2020.2974499
Mou L, Zhao Y, Fu H et al (2021) CS2-net: deep learning segmentation of curvilinear structures in medical imaging. Med Image Anal 67:101874. https://doi.org/10.1016/j.media.2020.101874
Oakley JD, Russakoff DB, McCarron ME et al (2020) Deep learning-based analysis of macaque corneal sub-basal nerve fibers in confocal microscopy images. Eye Vis 7(1):27. https://doi.org/10.1186/s40662-020-00192-5
Dorsey JL, Mangus LM, Oakley JD et al (2014) Loss of corneal sensory nerve fibers in SIV-infected macaques: an alternate approach to investigate HIV-induced PNS damage. Am J Pathol 184(6):1652–1659. https://doi.org/10.1016/j.ajpath.2014.02.009
Tan B, Hosseinaee Z, Han L, Kralj O, Sorbara L, Bizheva K (2018) 250 kHz, 1.5 μm resolution SD-OCT for in-vivo cellular imaging of the human cornea. Biomed Opt Express 9(12):6569–6583. https://doi.org/10.1364/boe.9.006569
Grzybowski A, Brona P, Lim G et al (2020) Artificial intelligence for diabetic retinopathy screening: a review. Eye (Lond) 34(3):451–460. https://doi.org/10.1038/s41433-019-0566-0
Campanella G, Hanna MG, Geneslaw L et al (2019) Clinical-grade computational pathology using weakly supervised deep learning on whole slide images. Nat Med 25(8):1301–1309. https://doi.org/10.1038/s41591-019-0508-1
Salahouddin T, Petropoulos IN, Ferdousi M et al (2021) Artificial intelligence-based classification of diabetic peripheral neuropathy from corneal confocal microscopy images. Diabetes Care 44(7):e151–e153. https://doi.org/10.2337/dc20-2012
Burgess J, Frank B, Marshall A et al (2021) Early detection of diabetic peripheral neuropathy: a focus on small nerve fibres. Diagnostics 11(2):165. https://doi.org/10.3390/diagnostics11020165
US Food and Drug Administration (2018) FDA permits marketing of artificial intelligence-based device to detect certain diabetes-related eye problems. Available from https://www.fda.gov/news-events/press-announcements/fda-permits-marketing-artificial-intelligence-based-device-detect-certain-diabetes-related-eye. Accessed 01 March 2021
Heydon P, Egan C, Bolter L et al (2021) Prospective evaluation of an artificial intelligence-enabled algorithm for automated diabetic retinopathy screening of 30 000 patients. Br J Ophthalmol 105(5):723–728. https://doi.org/10.1136/bjophthalmol-2020-316594

Author information

Frank G. Preston and Yanda Meng are joint first authors. Yalin Zheng and Uazman Alam are joint senior authors.

Authors and Affiliations

Department of Eye and Vision Science, Institute of Life Course and Medical Sciences, University of Liverpool, Liverpool, UK
Frank G. Preston, Yanda Meng, Stephen Kaye & Yalin Zheng
Institute of Life Course and Medical Sciences and the Pain Research Institute, University of Liverpool and Liverpool University Hospital NHS Foundation Trust, Liverpool, UK
Jamie Burgess & Uazman Alam
Institute of Cardiovascular Science, University of Manchester and Manchester Diabetes Centre, Manchester Foundation Trust, Manchester, UK
Maryam Ferdousi & Shazli Azmi
Weill Cornell Medicine – Qatar, Doha, Qatar
Ioannis N. Petropoulos & Rayaz A. Malik
St Paul’s Eye Unit, Royal Liverpool University Hospital, Liverpool, UK
Yalin Zheng
Division of Endocrinology, Diabetes and Gastroenterology, University of Manchester, Manchester, UK
Uazman Alam

Authors

Frank G. Preston
Yanda Meng
Jamie Burgess
Maryam Ferdousi
Shazli Azmi
Ioannis N. Petropoulos
Stephen Kaye
Rayaz A. Malik
Yalin Zheng
Uazman Alam

Corresponding authors

Correspondence to Yalin Zheng or Uazman Alam.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

ESM 1 (PDF 229 kb)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Preston, F.G., Meng, Y., Burgess, J. et al. Artificial intelligence utilising corneal confocal microscopy for the diagnosis of peripheral neuropathy in diabetes mellitus and prediabetes. Diabetologia 65, 457–466 (2022). https://doi.org/10.1007/s00125-021-05617-x

Received: 03 June 2021
Accepted: 07 October 2021
Published: 21 November 2021
Issue Date: March 2022
DOI: https://doi.org/10.1007/s00125-021-05617-x
