FormalPara Key Summary Points

The maculopathy in highly myopic eyes is complex, and its clinical diagnosis is subjective and imposes a heavy workload.

We developed an accurate and reliable deep learning model based on color fundus images to screen myopic maculopathy.

The artificial intelligence system could detect and classify normal or mild tessellated fundus, severe tessellated fundus, early pathologic myopia, and advanced pathologic myopia.

The model achieved high sensitivities, high specificities, and reliable Cohen's kappa values compared with those of attending ophthalmologists.

The artificial intelligence system was designed for easy integration into a clinical tool that could be applied in large-scale myopia screening.

Introduction

Pathologic myopia (PM) is a major cause of legal blindness worldwide, and the prevalence of myopia-related complications is expected to continue increasing, presenting a great challenge for ophthalmologists [1,2,3,4]. In East and Southeast Asia, the prevalence of myopia and high myopia in young adults is around 80–90% and 10–20%, respectively [5]. In China, the prevalence of myopia in 1995, 2000, 2005, 2010, and 2014 was 35.9%, 41.5%, 48.7%, 57.3%, and 57.1%, respectively, showing a gradual upward trend [6]. According to the META-PM (meta-analyses of pathologic myopia) classification system proposed by Ohno-Matsui et al., PM is defined as "eyes having equal to or more serious than diffuse choroidal atrophy" or "eyes having lacquer cracks, myopic choroidal neovascularization (CNV) or Fuchs spot" [7]. However, manual interpretation of fundus photographs is subject to clinician variability, since clear definitions of the various morphological characteristics are lacking in the META-PM classification system.

Though tessellation is a common characteristic of myopia, it can also be an early sign of chorioretinal atrophy or staphyloma development [8]. The higher the degree of fundus tessellation, the thinner the subfoveal choroidal thickness [9,10,11]. Yan et al. reported that a higher degree of fundus tessellation was significantly associated with longer axial length, more myopic refractive error, and worse best-corrected visual acuity (BCVA) [12]. These reports indicate that severe fundus tessellation might be the first indicator of the myopia-to-PM transition. Foo et al. further demonstrated that tessellated fundus had good predictive value for incident myopic macular degeneration [13]. Therefore, screening for severe fundus tessellation, defined as grade 2 or more serious in the grading system proposed by Yan et al., helps detect people at high risk of PM [12]. Once people present signs of PM, visual acuity may gradually deteriorate. According to recent research, patients with severe PM, defined as patchy chorioretinal atrophy or more serious lesions, foveal detachment, and/or active CNV, presented significantly worse BCVA than those with common PM [14]. In contrast, diffuse atrophy and lacquer cracks (LCs), which cause mild vision impairment and progress slowly, were considered early-stage PM [15, 16]. Considering the complex maculopathy in highly myopic eyes, a simplified PM classification model would facilitate early detection of the population at high risk of PM and stratified management of PM. However, screening the large number of patients with myopia is a huge workload for ophthalmologists.

Fortunately, with the rapid development of artificial intelligence (AI) technologies, AI could provide a potential solution to the increasing burden of myopia owing to its ability to analyze tremendous amounts of data. In ophthalmology, deep learning systems have shown exciting prospects in the detection of papilledema, glaucomatous optic neuropathy, and diabetic retinopathy from color fundus photographs [17,18,19,20]. Because of the complexity of the classification and definition system of PM, applying deep learning to PM lesion screening remains a challenge. As evidenced by Tan et al., Lu et al., and Wu et al., AI models based on fundus images have achieved good performance in diagnosing and classifying high myopia [21,22,23,24]. However, the value of AI for screening severe tessellated fundus in patients with high myopia has not been fully explored. On the basis of our classification system, it is feasible to design an AI algorithm that automatically detects people at high risk of PM and identifies PM.

This study aimed to develop and train a deep learning system to automatically detect normal or mild tessellated fundus, severe tessellated fundus, early-stage PM, and advanced-stage PM using a large data set of color retinal fundus images obtained from hospital ophthalmic clinics.

Methods

Data Acquisition

In this study, the use of retinal fundus images was approved by the Ethics Committee of Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, and adhered to the tenets of the Declaration of Helsinki (Approval ID: No. 2015KY156). Written informed consent forms were obtained from all participants.

The 45° color fundus photographs centered on the macula were collected from 6738 participants at the Shanghai Eye Disease Prevention and Treatment Center (SEDPTC) in China from 2016 to 2018, using a TOPCON DRI Triton camera. Images in which the fovea was not fully visible or more than 50% of the total area was obscured were excluded. Finally, 8210 images with a visible macula from 5778 patients were included for model development. On the basis of each patient's code number, these images were split into a training data set (90% of the images) and a validation data set (10% of the images).

To evaluate model performance, the algorithm was applied to another data set collected from SEDPTC and Shanghai General Hospital (SGH), which consisted of 2137 macula-centered fundus photographs from 1828 participants or patients.

Classification and Labeling of Myopic Maculopathy

All fundus photographs were independently classified and labeled by three retina specialists (YF, WW, and LY). When disagreements occurred, the final diagnosis was confirmed through a group discussion among the retina specialists and another senior expert (XX). Diagnoses made by three attending ophthalmologists (RW, LY, and DS) were recorded to compare with AI performance. The META-PM classification system was slightly modified on the basis of the risk of progression and impact on vision [8, 14, 25, 26]. In accordance with Yan et al., severe tessellated fundus was defined as equal to or more serious than grade 2 in this study [12]. Therefore, the images were classified into four groups: (1) normal or mild tessellated fundus, (2) severe tessellated fundus, (3) early-stage PM, and (4) advanced-stage PM (Fig. 1). Details are illustrated in Table 1.

Fig. 1
figure 1

Typical fundus photographs of four categories. a Normal or mild tessellated fundus. b Severe tessellated fundus. c Early-stage PM. d Advanced-stage PM

Table 1 Detailed classification of myopic maculopathy

Image Processing

Original fundus photographs were preprocessed to enhance salient features and improve classification accuracy [27, 28] (Fig. 2). Image preprocessing comprises the following modules: ROI interception, data denoising, augmentation, and normalization (Supplementary Fig. S1).

Fig. 2
figure 2

Diagrams showing the overview of the deep learning system development (a) and the network architecture based on EfficientNet-B8 (b). The number 61 indicates that 61 MBConv blocks are included in the network. PM pathological myopia

In the ROI interception module, we extracted the effective area by removing the excessive black margins that may interfere with the identification of key features. First, we converted the RGB images to grayscale images, in which the pixel value of the background equals zero while the pixel values of the effective area are greater than zero. Then, we used OpenCV tools to traverse the pixels and locate the effective area in the grayscale images. Finally, the RGB images were cropped to the bounding box of the effective area.
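The three cropping steps can be sketched as follows. This is a simplified stand-in using NumPy alone (the paper used OpenCV tools); the grayscale conversion uses standard luminance weights, and the threshold of zero follows the description above:

```python
import numpy as np

def crop_to_fundus(rgb: np.ndarray) -> np.ndarray:
    """Crop away black margins so only the effective fundus area remains."""
    # Step 1: convert RGB to grayscale (standard luminance weights).
    gray = rgb @ np.array([0.299, 0.587, 0.114])
    # Step 2: locate the effective area, i.e. pixels with value > 0.
    ys, xs = np.nonzero(gray > 0)
    if ys.size == 0:
        return rgb  # fully black image: nothing to crop
    # Step 3: crop the RGB image to the bounding box of the effective area.
    return rgb[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
```

In practice a small positive threshold is often used instead of zero, since camera sensors leave faint noise in the nominally black margins.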

In the data denoising module, an unsharp masking (USM) filter was applied to the cropped images to reduce noise interference during imaging according to the following formula [29]: \(I_{\text{O}} = a \cdot I_{\text{In}} + b \cdot G(\sigma ) * I_{\text{In}} + c\), where \(I_{\text{In}}\) represents the input image, \(I_{\text{O}}\) represents the standardized output image, \(G(\sigma )\) is a Gaussian filter with standard deviation \(\sigma\), and * represents the convolution operator. Parameters a, b, c, and \(\sigma\) were set to 4, 3.5, 128, and 30, respectively, on the basis of experience. The images were resampled to a resolution of 672 × 672 following the EfficientNet reference code on GitHub (https://github.com/tensorflow/tpu/blob/master/models/official/efficientnet/efficientnet_builder.py).
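The formula above can be sketched as a small NumPy/SciPy function with the stated parameter values. Note that this follows the sign convention as printed; many fundus-preprocessing recipes use a negative coefficient on the blurred term so that the blur is subtracted:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def usm_standardize(img: np.ndarray, a=4.0, b=3.5, c=128.0, sigma=30.0) -> np.ndarray:
    """Apply I_O = a*I_In + b*G(sigma)*I_In + c to an (H, W, 3) image."""
    img = img.astype(np.float64)
    # Gaussian blur applied per channel (sigma=0 along the channel axis).
    blurred = gaussian_filter(img, sigma=(sigma, sigma, 0))
    out = a * img + b * blurred + c
    return np.clip(out, 0, 255)  # keep the result in displayable range
```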

In the data augmentation module, to increase the diversity of the data set and reduce the risk of overfitting [30], horizontal and vertical flipping, rotation of up to 60°, brightness shifts within the range of 0.8–1.2, and contrast shifts within the range of 0.9–1.1 were randomly applied to the images in the training data set, increasing its size to five times the original.
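A minimal sketch of these augmentations using NumPy and SciPy is shown below; in a PyTorch pipeline the same operations would typically be expressed with torchvision transforms, and the exact probabilities here are illustrative assumptions:

```python
import numpy as np
from scipy.ndimage import rotate

rng = np.random.default_rng(0)  # seeded for reproducibility

def augment(img: np.ndarray) -> np.ndarray:
    """Randomly apply the paper's augmentations to one (H, W, 3) training image."""
    if rng.random() < 0.5:                      # horizontal flip
        img = img[:, ::-1]
    if rng.random() < 0.5:                      # vertical flip
        img = img[::-1, :]
    angle = rng.uniform(-60, 60)                # rotation up to 60 degrees
    img = rotate(img, angle, axes=(0, 1), reshape=False, mode='nearest')
    img = img * rng.uniform(0.8, 1.2)           # brightness shift in [0.8, 1.2]
    mean = img.mean()
    img = (img - mean) * rng.uniform(0.9, 1.1) + mean  # contrast shift in [0.9, 1.1]
    return np.clip(img, 0, 255)
```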

In the data normalization module, the pixel values of the augmented images were scaled to the range 0–1. Then, z-score standardization was applied to each input image before deep learning.
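The two normalization steps amount to the following short function (a per-image sketch; pipelines often instead standardize with fixed data-set-wide means and standard deviations):

```python
import numpy as np

def normalize(img: np.ndarray) -> np.ndarray:
    """Scale 8-bit pixel values to [0, 1], then apply z-score standardization."""
    img = img.astype(np.float64) / 255.0
    return (img - img.mean()) / (img.std() + 1e-8)  # epsilon avoids division by zero
```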

Deep Learning Algorithm Development

Our training platform was implemented with the PyTorch framework, Python 3.6, and CUDA 10.0. The training equipment comprised a 2.60 GHz Intel(R) CPU and a Tesla V100-SXM2 GPU. We adopted the EfficientNet-B8 architecture, a convolutional neural network well suited to large input images [31]. The EfficientNet-B8 model was initialized with weights pretrained on ImageNet and fine-tuned via transfer learning [32]: we replaced the final classification layers of the network and trained further on our data set.

Cross entropy was used as the objective function during training. Training was performed with an initial learning rate of 0.01, a weight decay coefficient of 1e−5 for L2 regularization, and a dropout rate of 0.5 for the output layer. The stochastic gradient descent (SGD) optimizer was run for up to 80 epochs on the training data set, and after each epoch the model was evaluated on the validation set to determine the final weights. To reduce overfitting, an early stopping strategy and the sharpness-aware minimization (SAM) optimizer were applied. Training was stopped if the validation loss did not improve over 20 consecutive epochs, and the model state with the lowest validation loss was saved as the final state of the model.
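A minimal sketch of this training loop follows, showing the SGD settings and the early stopping and checkpointing logic described above. The SAM wrapper and the dropout layer are omitted for brevity, and the loader interface is an assumption:

```python
import copy
import torch
import torch.nn as nn

def train(model, train_loader, val_loader, epochs=80, patience=20):
    """Train with SGD + cross entropy, early-stopping on validation loss."""
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, weight_decay=1e-5)
    best_loss, best_state, stall = float("inf"), None, 0
    for epoch in range(epochs):
        model.train()
        for x, y in train_loader:
            optimizer.zero_grad()
            criterion(model(x), y).backward()
            optimizer.step()
        # Evaluate after every epoch; keep the weights with the lowest loss.
        model.eval()
        with torch.no_grad():
            val_loss = sum(criterion(model(x), y).item() for x, y in val_loader)
        if val_loss < best_loss:
            best_loss, best_state, stall = val_loss, copy.deepcopy(model.state_dict()), 0
        else:
            stall += 1
            if stall >= patience:   # no improvement for `patience` epochs: stop
                break
    model.load_state_dict(best_state)  # restore the best checkpoint
    return model
```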

Statistical Analysis

To evaluate model performance, receiver operating characteristic (ROC) curves were generated and analyzed with Python. From the classification results, the area under the precision–recall (P–R) curve (the average precision, AP), the area under the ROC curve (AUC), sensitivity, specificity, and overall accuracy were computed for each of the four groups.
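For a multi-class model these per-group metrics are typically computed one-vs-rest; a sketch using scikit-learn (an assumption, since the paper only states that Python was used) is:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

def per_class_metrics(y_true: np.ndarray, y_prob: np.ndarray) -> dict:
    """One-vs-rest AUC, AP, sensitivity, and specificity per class.

    y_true: (N,) integer labels; y_prob: (N, K) predicted class probabilities.
    """
    preds = np.argmax(y_prob, axis=1)
    results = {}
    for k in range(y_prob.shape[1]):
        is_k = (y_true == k).astype(int)     # class k vs. all other classes
        pred_k = (preds == k).astype(int)
        tp = np.sum((pred_k == 1) & (is_k == 1))
        tn = np.sum((pred_k == 0) & (is_k == 0))
        fp = np.sum((pred_k == 1) & (is_k == 0))
        fn = np.sum((pred_k == 0) & (is_k == 1))
        results[k] = {
            "auc": roc_auc_score(is_k, y_prob[:, k]),
            "ap": average_precision_score(is_k, y_prob[:, k]),
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
        }
    return results
```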

Results

Characteristics of the Data Sets

The deep learning system was trained and validated on 8210 fundus photographs collected from 5778 participants (59.00% with photographs of both eyes; mean age 51.36 years; 60.00% female), including 4920 (59.93%) normal or mild tessellated fundus images, 2110 (25.70%) severe tessellated fundus images, 870 (10.60%) early-stage PM images, and 310 (3.78%) advanced-stage PM images. Of these, 10% were randomly selected for internal validation. A separate set of 2137 photographs, including 1053 (49.28%) normal or mild tessellated fundus images, 405 (18.95%) severe tessellated fundus images, 406 (19.00%) early-stage PM images, and 273 (12.77%) advanced-stage PM images, was used for external testing of the deep learning system (Table 2).

Table 2 Characteristics of the training and validation data sets

The characteristics of the eyes are presented in Table 3. Compared with individuals without PM, patients with PM had a significantly longer axial length (P < 0.001), significantly worse BCVA (P < 0.001), and significantly more myopic refractive error (− 10.17 D vs. − 0.70 D, − 9.17 D vs. − 0.93 D, and − 10.23 D vs. − 0.60 D in the training, validation, and testing data sets, respectively; P < 0.001). Similar to previous studies, patients with severe tessellated fundus had a significantly longer axial length (P < 0.001), worse BCVA (P < 0.001), and more myopic refractive error (− 3.54 D vs. − 0.70 D, − 3.30 D vs. − 0.93 D, and − 3.15 D vs. − 0.60 D, respectively; P < 0.001). Moreover, clinical features showed no significant differences between the training set and the internal validation set, indicating that the two sets were homogeneous.

Table 3 Ocular parameters of the training and validation data sets

Classification Performance in the Validation Data Set

In the validation data set (Table 4), the deep learning system discriminated normal or mild tessellated fundus from all the other types with an AUC of 0.98, a sensitivity of 93.10%, and a specificity of 97.60%. The deep learning system discriminated severe tessellated fundus from all the other types with an AUC of 0.95, a sensitivity of 92.90%, and a specificity of 93%. The system showed a sensitivity of 90.80% and specificity of 98.90%, with an AUC of 0.99, for screening early-stage PM from all the other types. Meanwhile, it differentiated advanced-stage PM from all the other types with a sensitivity of 96.80%, specificity of 99.90%, and an AUC of 1.00. The overall accuracy of the model was 92.90%.

Table 4 Classification performance of the model based on the internal validation set

Classification Performance in the External Testing Data Set

The system was further applied to an external testing data set to assess its generalizability. Similar to the results from the validation data set, the system discriminated normal or mild tessellated fundus, severe tessellated fundus, early-stage PM, and advanced-stage PM with average precisions of 0.99, 0.87, 0.94, and 0.97, sensitivities of 92.00%, 92.60%, 88.20%, and 94.90%, and specificities of 98.60%, 93.10%, 98.20%, and 99.50%, respectively; the areas under the ROC curves were 0.99, 0.96, 0.98, and 1.00 (Fig. 3a). P–R curves were used to measure the precision–recall trade-off of the model, given the imbalance of the data sets (Fig. 3b). The overall accuracy of the deep learning system was 91.80%.

Fig. 3
figure 3

Performance of the deep learning model in the external testing data set using receiver operating characteristic (ROC) curves and precision–recall (P–R) curves. The external-testing data sets included ocular fundus photographs from SEDPTC and SGH. a ROC curves for the testing data set among the four categories. The area under ROC curves is presented as AUC. b P–R curves for the testing data set among the four categories. Average precision value (AP) was defined as the area under P–R curve

Meanwhile, the photographs of the external testing data set were also independently graded by three attending ophthalmologists, and their sensitivities and specificities were compared with those of the deep learning system. As illustrated in Table 5, the system showed an equal or better sensitivity than the attending ophthalmologists, especially for discriminating severe tessellated fundus and early-stage PM, with a significantly higher sensitivity at a similar specificity. In addition, the mean overall accuracy of the attending ophthalmologists was 90.07% (range 89.40–91.20%), lower than that of the deep learning model.

Table 5 Classification performance of the model based on the external testing set

Visualizing the Prediction Process of Deep Learning System

The prediction process of the deep learning system was visualized with class activation maps (CAMs). As shown in Fig. 4, the highlighted areas were consistent with the regions of tessellated fundus, diffuse atrophy, and patchy atrophy, indicating that the system learned generalized characteristics of tessellation, early-stage PM, and advanced-stage PM, respectively.
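In its classic form, a CAM is the sum of the final convolutional feature maps weighted by the classifier weights of the chosen class; a minimal NumPy sketch (the paper does not specify its exact CAM variant, e.g., Grad-CAM) is:

```python
import numpy as np

def class_activation_map(features: np.ndarray, fc_weights: np.ndarray,
                         class_idx: int) -> np.ndarray:
    """Classic CAM: weight final conv feature maps (C, H, W) by the
    classifier weights (K, C) of the target class, then normalize to [0, 1]."""
    cam = np.tensordot(fc_weights[class_idx], features, axes=1)  # (H, W)
    cam = np.maximum(cam, 0)              # keep only positive class evidence
    if cam.max() > 0:
        cam = cam / cam.max()             # normalize for overlay on the image
    return cam
```

The resulting map is usually upsampled to the input resolution and overlaid as a heatmap on the fundus photograph, as in Fig. 4.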

Fig. 4
figure 4

Examples of a class activation map (CAM) for the prediction of normal fundus or mild tessellated fundus (a and e), severe tessellated fundus (b and f), early-stage PM (c and g), and advanced-stage PM (d and h) by the trained model using the external testing data set

Discussion

A deep learning algorithm that can accurately screen and assess myopic maculopathy offers significant benefits: it enhances the accessibility and affordability of myopic maculopathy screening for a large at-risk population, improves access to care, and substantially decreases costs, particularly in remote and underserved communities. Moreover, since the outbreak of COVID-19 in 2019, remote medical systems have become increasingly important [33, 34], and an AI-integrated telemedicine platform will be a new direction for myopia healthcare in the post-COVID-19 period [35]. In this study, we developed an effective deep learning model using EfficientNet-B8 based on 8210 color fundus photographs and demonstrated its potential in screening myopic maculopathy. The AI model showed excellent performance in classifying normal fundus, severe tessellation, early-stage PM, and advanced-stage PM. In particular, its performance in classifying severe tessellation and early-stage PM was better than manual classification.

Previous studies have shown that deep learning algorithms can screen for diabetic retinopathy, age-related macular degeneration, glaucoma, and papilledema [17, 19, 36, 37]. The Google team demonstrated that a deep learning system extracting information from fundus photographs could estimate the refractive error [38], which suggests that fundus images contain information about refractive power. Our study also showed that patients with PM had a significantly more myopic refractive error than individuals without PM, as did patients with severe tessellated fundus compared with normal individuals (Table 3). Recently, several automatic systems for detecting PM have also been reported. Devda and Eswari developed a deep learning method with a convolutional neural network for detecting pathologic myopia [39]. Their work showed satisfactory performance in the classification and segmentation of atrophy lesions. However, their system was developed on public databases, the training and testing data sets involved were relatively small, and authoritative criteria for diagnosing PM were lacking. In our work, a large data set of 8210 color fundus photographs was used to develop the algorithm. Compared with public databases, real-world data sets offer greater data complexity and original disease information. Du et al. also developed a deep learning algorithm to automatically categorize myopic maculopathy on the basis of the META-PM classification system [40]. Compared with their system, our training data set was larger and our deep learning system was more powerful. In addition, Lu et al. designed deep learning systems with excellent performance for detecting PM and myopic macular lesions according to the META-PM classification system [22, 23].
Compared with their research, severe tessellated fundus was added to our classification system in order to promptly detect populations at high risk of PM.

Tessellated fundus is generally one of the preliminary signs of myopia and does not impair central vision. However, Fang et al. reported that progressive and continuous thinning of the choroid was associated with the progression to tessellation and diffuse chorioretinal atrophy [16]. Yan et al. also demonstrated that the higher the degree of fundus tessellation, the thinner the subfoveal choroidal thickness [12]. Cheng et al. demonstrated that the grade of fundus tessellation was associated with choroidal thickness and axial length in children and adolescents [41]. Moreover, similarities were found in the distribution pattern of choroidal thinning between tessellated fundus and other lesions of myopic maculopathy [16]. Foo et al. further demonstrated that tessellated fundus had good predictive value for incident myopic macular degeneration [13]. These findings indicate that tessellation might be the first sign of myopia becoming pathologic. In addition, it has been reported that diffuse atrophy in childhood can develop into advanced myopic chorioretinal atrophy in later life, and such lesions usually progress from severe fundus tessellation [42]. Moreover, Kim et al. showed that tessellated fundus was related to myopic regression after corneal refractive surgery, indicating that tessellated fundus is associated with a myopic shift [43]. Therefore, discriminating severe fundus tessellation from common myopia is important for individuals with myopia, especially those with high myopia, and the follow-up frequency of patients with severe tessellated fundus can be increased accordingly.

Moreover, to improve screening efficiency for the population at high risk, the classification of myopic maculopathy lesions was simplified in our work according to the degree of vision impairment. Ruiz-Medrano et al. demonstrated that people presenting with patchy chorioretinal atrophy or more serious lesions, foveal detachment, and/or active CNV showed worse visual acuity than those with common PM [14]. Furthermore, 92.70–100% of eyes with patchy atrophy, myopic CNV, and macular atrophy showed progression associated with significant vision impairment in a 10-year follow-up study [16, 25]. Therefore, these lesions were classified as advanced-stage PM in the present study (Table 1). In addition to timely treatment, vision rehabilitation training and community management of low vision are recommended for individuals diagnosed with advanced-stage PM. Because they cause only mild impairment of central vision, diffuse atrophy and LCs alone were categorized as early-stage PM in the present study. Li et al. reported that half of the participants with diffuse chorioretinal atrophy showed progression during a 4-year follow-up study, manifested as enlargement and newly formed diffuse chorioretinal atrophy [44]. Close follow-up is recommended for individuals diagnosed with early-stage PM.

In addition, our study involved the following technical optimizations. To overcome difficulties caused by complicated manifestations such as atypical lesions, coexisting comorbidities, and posterior staphyloma, a channel attention module was added to suppress uninformative channels, and a spatial attention module was added to capture the most informative regions of the feature maps. Moreover, a weighted cross-entropy loss function was used to reduce the decision-boundary bias caused by the imbalanced data sets; the weight coefficient of each category was set to the reciprocal of its sample count. A label smoothing strategy was applied during training of the mild versus severe tessellated fundus recognition model to reduce the impact of incorrect labels and improve generalization, and a USM filter was used to denoise the images and extract effective information. Additionally, to discriminate severe from mild tessellated fundus, SAM was used as the optimizer on top of SGD.
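The reciprocal-count class weighting and label smoothing can be combined in PyTorch as sketched below. The class counts are those of the training set reported above; the `label_smoothing` value of 0.1 is an illustrative assumption, as the paper does not state the smoothing factor:

```python
import torch
import torch.nn as nn

# Training-set class counts: normal/mild tessellation, severe tessellation,
# early-stage PM, advanced-stage PM (as reported in the Results).
counts = torch.tensor([4920.0, 2110.0, 870.0, 310.0])

# Weight each class by the reciprocal of its sample count to offset imbalance,
# then rescale so the weights average to 1 (a common convention).
weights = 1.0 / counts
weights = weights / weights.sum() * len(counts)

# Weighted cross entropy with label smoothing (0.1 is an assumed value).
criterion = nn.CrossEntropyLoss(weight=weights, label_smoothing=0.1)
```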

As illustrated in the class activation maps (Fig. 4), the deep learning model could identify the position and distinguishing features of lesions, which may facilitate diagnosis. In addition, as shown in Table 5, the sensitivity of the deep learning algorithm for detecting severe tessellated fundus and early-stage PM was 92.60% and 88.20%, respectively, both better than those of the attending ophthalmologists, indicating that the AI system is reliable for screening. With the assistance of the AI system, basic examinations such as fundus photography can be carried out in local community hospitals, which is convenient for both patients and ophthalmologists, especially in remote areas without retina experts.

Limitations of this study also need to be considered. First, although we further confirmed that the model outperformed attending ophthalmologists in detecting atypical lesions (Supplementary Fig. S2), a few photographs were still misdiagnosed, which might be attributed to relatively low image quality or microlesions. This suggests that the model requires higher image quality than retina experts do. In addition, multimodal imaging of myopic eyes can help improve diagnostic accuracy. For example, diffuse atrophy appears as an ill-defined yellowish-white lesion in the posterior fundus on ophthalmoscopy and exhibits mild hyperfluorescence in the late phase on fluorescein angiography (FA) [45]. Moreover, the choroid in the area of diffuse atrophy is markedly thinned on optical coherence tomography (OCT) [16, 46]. LCs appear as yellowish linear lesions on ophthalmoscopy, as linear hypofluorescence in the late phase on indocyanine green angiography (ICGA), and as linear hyperfluorescence from the early to late phases on FA [47, 48]. Therefore, more real-world clinical data, such as FA, ICGA, or OCT images, should be considered together when clinically labeling photographs in the future, and increasing the number of photographs with early pathologic myopia during training may improve the accuracy for this class. Second, fundus color differs with the degree of fundus pigmentation among races, which can decrease the diagnostic accuracy for atrophic lesions; future research is warranted to investigate the model's efficacy in other ethnic groups. Third, although photographs were collected from two different clinical centers, the model's performance on photographs from other cameras remains unclear. Therefore, photographs collected from multiple fundus cameras are needed to further improve the generalizability and reliability of the AI model.

Conclusions

A deep learning algorithm was developed to identify normal fundus or mild tessellation, severe tessellated fundus, early-stage PM, and advanced-stage PM from fundus photographs. Our AI model achieved performance comparable to that of experts. Owing to its promising performance, the AI system can assist ophthalmologists by reducing workload and saving time during large-scale myopia screening and long-term follow-up.