Nuclear cataract classification in anterior segment OCT based on clinical global–local features

Nuclear cataract (NC) is a major cause of blindness and vision impairment globally. Early intervention and cataract surgery can improve the vision and quality of life of NC patients. Anterior segment optical coherence tomography (AS-OCT) imaging is a non-invasive way to capture NC opacity objectively and quantitatively. Recent clinical research has shown a strong correlation between NC severity levels and the mean density on AS-OCT images. In this paper, we present an effective NC classification framework on AS-OCT images, based on feature extraction and feature importance analysis. Motivated by previous clinical knowledge, our method extracts clinical global–local features and then applies Pearson's correlation coefficient and recursive feature elimination to analyze feature importance. Finally, an ensemble logistic regression, which takes the characteristics of different optimization methods into account, is employed to classify NC severity levels. A dataset of 11,442 AS-OCT images is collected to evaluate the method. The results show that the proposed method achieves 86.96% accuracy and 88.70% macro-sensitivity. The performance comparison also demonstrates that the global–local feature extraction method improves accuracy by about 2% over the single region-based feature extraction method.


Introduction
According to the World Report on Vision [27], cataract is the leading cause of blindness and vision impairment, and approximately 65.2 million people suffer from moderate or severe cataract. These patients can improve their vision and quality of life through effective cataract surgery or early intervention, which reduces the burden of bilateral cataract blindness on society.
Nuclear cataract (NC) is one of the most common cataract types, and its clinical manifestations include the gradual clouding and progressive hardening of the nuclear region of the crystalline lens [25]. Over the past years, ophthalmologists have applied several ophthalmic imaging modalities to NC diagnosis based on gold-standard cataract grading protocols. The lens opacity classification system III (LOCS III) [34] is a widely accepted cataract grading protocol built on slit-lamp images. For example, ophthalmologists usually grade NC severity based on slit-lamp images and LOCS III in clinical diagnosis. This manual grading is subjective and error-prone; moreover, it depends heavily on the ophthalmologist's experience and professional knowledge.
Anterior segment optical coherence tomography (AS-OCT) is an OCT imaging technique capable of capturing the whole anterior segment structure, including the crystalline lens. Compared with other ophthalmic imaging modalities such as slit-lamp imaging, it is non-invasive, objective, user-friendly, high-resolution, and quick. Furthermore, it can measure lens opacities quantitatively and objectively. According to the pathological development of NC opacity, NC can generally be divided into three stages based on LOCS III [28]: (1) Stage 0: normal (non-nuclear cataract), without nuclear opacity. (2) Stage 1: low-grade (NC grade = 1 or NC grade = 2), which is asymptomatic. (3) Stage 2: high-grade (NC grade ≥ 3). For subjects with low-grade nuclear cataract, clinical intervention, such as Kary Uni eye drops, can slow nuclear cataract progression, while subjects with high-grade nuclear cataract need cataract surgery and follow-up. Figure 1 shows the three severity levels of nuclear cataract on AS-OCT images.
Over the past years, ophthalmologists have increasingly used AS-OCT images to diagnose anterior segment ophthalmic diseases, e.g., glaucoma and corneal diseases [1,11,12]. Researchers have recently begun to study the opacity relationship between NC grades and the lens nucleus region on AS-OCT images quantitatively and objectively. Wong et al. [33] first used the linear fitting method to build an opacity relationship between NC grades and the mean density of the nuclear region on AS-OCT images, and their statistical results showed that the relationship is strong. Refs. [5,6,15,26] obtained similar statistical results in clinical research, but [26] found a weaker opacity relationship on the down nucleus region than [5,6] found on the whole nucleus region. These statistical results offer a potential contribution to AS-OCT image-based cataract surgery planning and provide clinical support for automatic NC classification. Motivated by clinical AS-OCT image-based NC research, [43] applies a deep learning model to automatic NC classification on the whole lens region of AS-OCT images. It only obtained about 58% accuracy, indicating that automatic NC classification on AS-OCT images is challenging.

This paper presents a simple yet effective nuclear cataract classification framework on AS-OCT images to assist ophthalmologists in diagnosing nuclear cataract accurately and objectively. It includes three steps: feature extraction, feature importance analysis, and classification, as shown in Fig. 2. In the feature extraction step, we devise a clinical global-local feature extraction method that extracts 20 image features from each of the whole nucleus region, the up nucleus region, and the down nucleus region. It is motivated by clinical NC research [5,6,15] and the opacity locations of nuclear cataract subtypes. Moreover, following [19], two nuclear size features are also extracted: nuclear thickness and nuclear diameter.
Hence, the total number of extracted features from AS-OCT images is 62. In the feature importance analysis step, we use Pearson's correlation coefficient (PCC) and the recursive feature elimination method (RFE) to analyze feature importance, considering both the clinical research and classification performance requirements. In the classification step, we then use an ensemble multiclass logistic regression (EMLR) to further improve NC classification performance, in which two different optimization methods are used for two multiclass logistic regression classifiers. Finally, a clinical AS-OCT image dataset is used to evaluate the proposed feature extraction-based framework. The dataset contains 543 subjects, and the total number of AS-OCT images is 11,442. The results demonstrate that the proposed feature extraction-based learning framework is simple and effective compared with strong baselines. Moreover, it can potentially serve as a computer-aided diagnosis (CAD) tool for AS-OCT image-based cataract diagnosis and cataract surgery planning.

Fig. 1 Three nuclear cataract severity levels on AS-OCT images. Normal (a) denotes the nuclear region without nuclear opacity; low-grade (b) denotes the nuclear region with slight nuclear opacity but asymptomatic; high-grade (c) denotes the nuclear region with nuclear opacity and symptomatic
In general, the main contributions of this paper are summarized as follows:
- To obtain more useful features from the nuclear region on AS-OCT images, this paper proposes the global-local feature extraction method, inspired by clinical research on nuclear cataract. Furthermore, we extract two nuclear size features to boost the NC classification results.
- We use the PCC and RFE methods to analyze feature importance, eliminating less important features and selecting useful ones. To further enhance the overall NC classification results, we propose an ensemble multiclass logistic regression classifier that considers the effects of different optimization methods on a single multiclass logistic regression classifier.
- The results on the AS-OCT image dataset demonstrate that the proposed feature extraction-based framework achieves state-of-the-art performance compared with strong baselines.
The rest of this paper is organized as follows. The section "Related work" reviews related work. The section "AS-OCT image dataset" introduces the AS-OCT image dataset. The section "Methodology" elaborates the proposed feature extraction-based framework for automatic AS-OCT image-based NC classification. Experiment settings and evaluation measures are presented in the section "Experiment settings and evaluation measures". We analyze and discuss nuclear cataract classification results in the section "Result analysis and discussion". The section "Conclusion and future work" presents conclusions and future work.

Fig. 2 First, we crop the nucleus region from the AS-OCT image and use the global-local feature-based extraction method to extract features from the up, whole, and down nucleus region. Then, we use PCC and RFE methods to analyze feature importance. Finally, we present an ensemble multiclass logistic regression to distinguish three severity levels of the nuclear region

Related work
In this section, we review recent advances in automatic cataract classification and AS-OCT-based ocular disease diagnosis.

Automatic cataract classification
Over the past years, researchers have developed various artificial intelligence (AI) algorithms for automatic cataract classification based on several ophthalmic imaging modalities (slit-lamp images and fundus images), ranging from conventional machine learning methods to deep learning methods.
Conventional machine learning methods. Refs. [20–23,42] develop an automatic nuclear cataract grading system based on slit-lamp images, comprising lens contour detection, feature extraction, and classification. They used linear regression (LR) as the classifier and achieved a mean error of 0.36. Ref. [38] adopts the bag of words (BOW) method to extract features and obtained 82.5% accuracy via the group sparsity regression (GSR) method on slit-lamp images. Cheng [7] presented the sparse range-constrained learning (SRCL) method for slit-lamp image-based nuclear cataract classification and obtained higher accuracy than previous works [38,39]. Caixinha et al. [2] used ultrasound images for automatic cataract classification based on an animal model. They achieved 95% accuracy in nuclear cataract hardness classification using a multiclass SVM classifier on a small dataset. Ref. [4] proposes an improved Haar wavelet method for cataract screening on fundus images. However, fundus images cannot provide detailed opacity information for different cataract types and can only be used for cataract screening.
Deep learning methods. Gao et al. [14] combined a convolutional neural network (CNN) and a recurrent neural network (RNN) for automatic slit-lamp image-based nuclear cataract classification and achieved 84.2% accuracy. Ref. [36] proposes an end-to-end deep learning framework for both automatic nuclear region contour detection and nuclear cataract classification. Using Faster R-CNN, they achieved 84.7% accuracy. Wu et al. [35] designed a deep learning platform for slit-lamp image-based cataract screening. Xu et al. [37] proposed a hybrid CNN model for cataract screening on retinal images by fusing information from different regions of the images. The results on fundus images showed that the hybrid CNN improved cataract screening results. In [41], researchers apply a deep convolutional neural network (DCNN) to fundus image-based cataract screening and achieve good screening results.

AS-OCT-based ocular disease diagnosis
As stated in the section "Introduction", AS-OCT imaging is non-contact, non-invasive, user-friendly, objective, and quantitative. Moreover, it can capture 2D (two-dimensional) and 3D (three-dimensional) information of the eye's anterior structure. Owing to these characteristics, ophthalmologists have gradually adopted AS-OCT images for ocular disease diagnosis (e.g., corneal diseases) and for scientific research. Refs. [9,16] propose deep CNN-based methods for corneal structure segmentation, which can help clinicians diagnose corneal diseases accurately. Fu et al. [11–13] applied AS-OCT images to diagnose angle-closure glaucoma through deep learning models, which can assist ophthalmologists in diagnosing glaucoma objectively. Wong et al. [33] studied the correlation between nuclear cataract grades and the mean density of the whole nucleus region through the linear fitting method. The statistical results show that the relationship between them is strong. Refs. [5,6,15] obtained similar results between nuclear cataract grades and the whole nucleus region on AS-OCT images. Ref. [26] uses the down nucleus region to study the opacity relationship between nuclear cataract grades and mean density, but finds only a weak relationship. All in all, this clinical AS-OCT image-based cataract research can contribute to nuclear cataract surgery planning and provides clinical support for automatic nuclear cataract classification.
From this review of related work, we draw the following points. (1) Previous works have achieved high cataract classification performance on different ophthalmic images, but most of them focused on cataract screening. (2) Feature extraction methods can obtain competitive performance compared with deep learning methods. Moreover, deep learning methods need massive data to train a good model, and the clinical interpretability of their learned feature representations is poor. (3) Existing automatic nuclear cataract classification works are based only on slit-lamp images, which cannot measure nuclear cataract opacity objectively and quantitatively. (4) AS-OCT images overcome the shortcomings of slit-lamp images, but AS-OCT image-based nuclear cataract classification has not been widely studied.

AS-OCT image dataset
This paper collects a clinical AS-OCT image dataset using a CASIA2 ophthalmology device (Tomey Corporation, Japan). An AS-OCT image captures the whole anterior structure of an eye, as shown in the top left corner of Fig. 2. According to clinical cataract research [6,15,32], only the lens nucleus region is essential for NC classification, as shown in Fig. 1. We use a deep segmentation network [3] to obtain coarse segmentation results of the nuclear region and then correct these results manually with ImageJ software to obtain accurate nuclear region segmentation.
Because there is no clinical nuclear cataract classification system built on AS-OCT images, we construct a mapping between AS-OCT images and slit-lamp images through LOCS III to acquire nuclear cataract grades for AS-OCT images. Three experienced ophthalmologists labeled each subject's NC grade using slit lamps, which ensures the label quality and reliability for AS-OCT images. This paper converts NC severity levels into three stages based on clinical AS-OCT-based classification research, as introduced in the section "Introduction". Stage 0: a subject whose lens nuclear region shows no opacity is normal (non-nuclear cataract). Stage 1: a subject with NC grade 1 or grade 2 is asymptomatic (low-grade). Stage 2: a subject with NC grade greater than or equal to 3 is symptomatic (high-grade).
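The grade-to-stage conversion described above can be sketched as a small lookup. This is only an illustration: the function name `nc_stage` and the assumption that grades arrive as non-negative integers are ours, and the stage numbering (0, 1, 2) follows the section "Introduction".

```python
def nc_stage(locs_grade: int) -> int:
    """Map an integer LOCS III nuclear grade to the three-stage label.

    Hypothetical helper: assumes integer grades, with 0 meaning no opacity.
    """
    if locs_grade < 0:
        raise ValueError("grade must be non-negative")
    if locs_grade == 0:
        return 0  # Stage 0: normal (non-nuclear cataract)
    if locs_grade <= 2:
        return 1  # Stage 1: low-grade (grade 1 or 2)
    return 2      # Stage 2: high-grade (grade >= 3)


STAGE_NAMES = {0: "normal", 1: "low-grade", 2: "high-grade"}
labels = [STAGE_NAMES[nc_stage(g)] for g in (0, 1, 2, 3, 5)]
```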
The AS-OCT image dataset contains 543 subjects, including 422 right eyes and 440 left eyes. The gender and age information of some subjects is missing. The numbers of male and female subjects are 135 and 335, respectively. Four hundred ninety-four subjects have age information, and their ages range from 15 to 94. Each subject contains 128 images. Because adjacent AS-OCT images are highly similar, this paper selects AS-OCT images at regular intervals; thus, 64 AS-OCT images of each subject are used. The number of available AS-OCT images per eye ranges from 1 to 64, because we manually remove poor-quality images under an ophthalmologist's guidance. Because the opacity levels of a subject's two eyes may be correlated, we split the AS-OCT image dataset by subject into two disjoint subsets: a training dataset and a testing dataset. The training dataset and the testing dataset contain 7831 and 3611 AS-OCT images, respectively, and the total number of AS-OCT images is 11,442. Table 1 summarizes the distribution of the three NC severity levels on the AS-OCT image dataset.
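A subject-level split of the kind described above can be sketched as follows. The function name `split_by_subject`, the `test_fraction` parameter, and the random seed are assumptions for illustration; the paper does not state how subjects were assigned to the two subsets.

```python
import random
from collections import defaultdict


def split_by_subject(image_subject_ids, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx) so that no subject appears in both subsets."""
    by_subject = defaultdict(list)
    for idx, subject in enumerate(image_subject_ids):
        by_subject[subject].append(idx)

    # Shuffle subjects (not images) so that both eyes of a subject stay together.
    subjects = sorted(by_subject)
    random.Random(seed).shuffle(subjects)
    n_test = max(1, round(len(subjects) * test_fraction))

    test_idx = [i for s in subjects[:n_test] for i in by_subject[s]]
    train_idx = [i for s in subjects[n_test:] for i in by_subject[s]]
    return train_idx, test_idx


# Toy example: ten images from four hypothetical subjects.
subject_ids = ["s1"] * 3 + ["s2"] * 2 + ["s3"] * 4 + ["s4"] * 1
train_idx, test_idx = split_by_subject(subject_ids, test_fraction=0.25)
```

Splitting on subject identifiers rather than on images keeps images of the same person from leaking between training and testing.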

Methodology
In this paper, we propose a simple yet effective NC classification framework on AS-OCT images, as illustrated in Fig. 2.

Global-local feature extraction
Refs. [5,6,33] and [26] study the opacity relationship between NC grades and mean density on the whole nucleus region and the down nucleus region of AS-OCT images, respectively. The opacity relationship is stronger on the whole nucleus region than on the down nucleus region, which is caused by the opacity locations of nuclear cataract subtypes. Motivated by this clinical finding, this paper extracts features from three different regions: whole, up, and down, as shown in Fig. 2. We extract 20 features from each region using the intensity-based statistics method and the intensity histogram method [17,24,44–46]. Hence, the obtained features can be divided into intensity statistics features and intensity histogram features.

Intensity-based statistical features
Using the intensity-based statistics method, we extract 17 intensity-based statistics features from each lens nucleus region as follows:

1. Mean $\mu$: the average intensity of each nucleus region on AS-OCT images, which is an important indicator for clinical AS-OCT image-based nuclear cataract diagnosis:

$$\mu = \frac{1}{N}\sum_{k=1}^{N} X_k$$

where $X_k$ and $N$ denote the intensity value of a nucleus region pixel and the total number of intensity values in the nucleus region, respectively.

10. Interquartile range (IQR): $\mathrm{IQR} = P_{75} - P_{25}$, where $P_{75}$ and $P_{25}$ denote the 75th percentile and the 25th percentile of the nucleus region intensity values.

11. Energy: because nuclear sizes differ, energy is defined as the average of the squared nucleus region intensities, $\frac{1}{N}\sum_{k=1}^{N} X_k^2$.

12. Variance: it measures how far the nucleus region intensity values are spread out from the average intensity value.

13. Standard deviation (SD) $\sigma$: it measures the dispersion of the nucleus region intensity values.

14. Mean absolute deviation (Mad): it is a measure of dispersion from the average intensity.

15. Skewness $\mu_3$: in probability theory and statistics, skewness measures the asymmetry of the nuclear region intensity distribution and can be expressed via the following equation:

$$\mu_3 = \frac{1}{N}\sum_{k=1}^{N}\left(\frac{X_k-\mu}{\sigma}\right)^{3}$$

16. Kurtosis $\mu_k$: it measures the peakedness [46] of the nuclear region intensity distribution on the AS-OCT image, and we compute it through Eq. (6):

$$\mu_k = \frac{1}{N}\sum_{k=1}^{N}\left(\frac{X_k-\mu}{\sigma}\right)^{4} \qquad (6)$$

17. Root-mean-square intensity (RMS): it is also called the quadratic mean and can be computed as follows:

$$\mathrm{RMS} = \sqrt{\frac{1}{N}\sum_{k=1}^{N} X_k^{2}}$$
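As an illustration of the statistics named above (mean, IQR, energy, variance, SD, Mad, skewness, kurtosis, RMS), the following minimal NumPy sketch computes them for one cropped region. The function name `intensity_statistics`, the population normalization (division by N), and NumPy's default linear-interpolation percentiles are assumptions; the paper does not specify these details.

```python
import numpy as np


def intensity_statistics(region):
    """Compute a subset of the intensity statistics for one nucleus region."""
    x = np.asarray(region, dtype=np.float64).ravel()
    mu = x.mean()
    sigma = x.std()  # population SD (ddof=0); an assumption
    p25, p75 = np.percentile(x, [25, 75])
    centered = x - mu
    # Standardized values; a constant region has undefined skewness/kurtosis,
    # so we fall back to zeros in that case.
    z = centered / sigma if sigma > 0 else np.zeros_like(x)
    return {
        "mean": float(mu),
        "iqr": float(p75 - p25),
        "energy": float(np.mean(x ** 2)),
        "variance": float(sigma ** 2),
        "sd": float(sigma),
        "mad": float(np.mean(np.abs(centered))),
        "skewness": float(np.mean(z ** 3)),
        "kurtosis": float(np.mean(z ** 4)),
        "rms": float(np.sqrt(np.mean(x ** 2))),
    }


# Toy 2x2 "region" with 8-bit intensities.
pixels = np.array([[0, 50], [100, 250]], dtype=np.uint8)
features = intensity_statistics(pixels)
```

Applying the same function to the whole, up, and down nucleus crops would give the per-region feature groups.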

Intensity-based histogram features
Apart from the above 17 intensity-based statistics features, we also apply the intensity histogram method to extract three intensity histogram features from AS-OCT images. The nuclear region intensity (density) values lie between 0 and 255. The bin width of the histogram is 25; hence, the number of histogram bins is 11.
18. Uniformity: the sum of the squared probabilities of the intensity value intervals in the histogram, $\sum_{i} P_i^{2}$ [24]. It measures the randomness of a histogram.

19. Entropy: it is an information-theoretic concept that provides a metric for the AS-OCT image intensity information of nuclear cataract severity levels. This paper uses the following equation to express it:

$$\mathrm{Entropy} = -\sum_{i} P_i \log P_i$$

where $P_i$ denotes the probability of each bin, which is determined by the number of intensity values in the bin.

20. Histogram-based energy (HBE): it measures the intensity distribution, and large values imply that the intensity distribution is uneven.
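The binning and two of the histogram features can be sketched as below. The function name `histogram_features` and the base-2 logarithm are assumptions; HBE is omitted because its formula is not given in the text.

```python
import numpy as np


def histogram_features(region, bin_width=25):
    """Uniformity and entropy of an 8-bit region using fixed-width bins."""
    x = np.asarray(region, dtype=np.float64).ravel()
    # Edges 0, 25, ..., 275 give 11 bins that cover intensities 0..255.
    edges = np.arange(0, 256 + bin_width, bin_width)
    counts, _ = np.histogram(x, bins=edges)
    p = counts / counts.sum()
    nonzero = p[p > 0]  # skip empty bins so that log(0) never occurs
    return {
        "n_bins": int(p.size),
        "uniformity": float(np.sum(p ** 2)),
        "entropy": float(-np.sum(nonzero * np.log2(nonzero))),  # base 2 assumed
    }


flat = histogram_features(np.full((4, 4), 10))      # all pixels in one bin
two_level = histogram_features(np.array([0, 100]))  # two equally likely bins
```

A region whose pixels fall into a single bin has uniformity 1 and entropy 0, while spreading pixels across bins lowers uniformity and raises entropy.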

Nuclear size-based features
Ref. [19] studied the correlation between nuclear size-based features and nuclear cataract severity levels through the linear fitting method. The statistical results show that the relationship between nucleus size features and nuclear cataract grades is strong. In this paper, we extract two nuclear size features, thickness and diameter, which are represented by the height and width of the nucleus region on AS-OCT images. Figure 3 presents the nuclear thickness (red) and nuclear diameter (green) of the nuclear region on AS-OCT images. Overall, the total number of extracted features from the nuclear region is 62 (20 features from each of the three regions plus two size features); see Table 2 for detailed feature information.
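One simple way to read the height and width off a segmented nucleus is to take the extent of its binary mask, as sketched below. This is an assumption about the measurement, not the authors' procedure; the function name `nuclear_size` is hypothetical and the result is in pixels.

```python
import numpy as np


def nuclear_size(mask):
    """Return (thickness, diameter) in pixels as the mask's bounding-box extent."""
    mask = np.asarray(mask, dtype=bool)
    rows = np.flatnonzero(mask.any(axis=1))  # rows containing nucleus pixels
    cols = np.flatnonzero(mask.any(axis=0))  # columns containing nucleus pixels
    if rows.size == 0:
        return 0, 0  # empty mask: no nucleus found
    thickness = int(rows[-1] - rows[0] + 1)  # vertical extent (height)
    diameter = int(cols[-1] - cols[0] + 1)   # horizontal extent (width)
    return thickness, diameter


toy_mask = np.zeros((5, 6), dtype=np.uint8)
toy_mask[1:4, 1:5] = 1
thickness, diameter = nuclear_size(toy_mask)
```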

Feature importance analysis
Considering both clinical research and NC classification performance requirements, this paper uses two feature selection methods to analyze feature importance: Pearson's correlation coefficient (PCC) [22] and the recursive feature elimination method (RFE) [18]. We use the PCC method because it is widely used in clinical research. Accordingly, we construct correlation relationships between nuclear cataract severity levels and the features extracted from the nuclear region on AS-OCT images through linear fitting. The PCC is computed with the following equation:

$$r = \frac{\sum_{K=1}^{n}\left(f_K-\bar{f}\right)\left(y_K-\bar{y}\right)}{\sqrt{\sum_{K=1}^{n}\left(f_K-\bar{f}\right)^{2}}\,\sqrt{\sum_{K=1}^{n}\left(y_K-\bar{y}\right)^{2}}}$$

where $f_K$ and $y_K$ denote the extracted feature value and the NC severity level of the $K$-th AS-OCT image, $\bar{f}$ and $\bar{y}$ denote their means, and $n$ is the number of AS-OCT images. $r$ indicates the PCC value between the extracted feature and NC severity levels.

RFE is another widely used feature selection method for feature importance analysis, which selects features by recursively considering smaller and smaller feature sets. Multiclass logistic regression is used as the estimator for RFE, so features are assessed by NC classification performance. Moreover, we only use 59 features for RFE, because the minimum density values of the nuclear region are 0. To compute feature importance efficiently, we use recursive feature elimination with cross-validation (RFECV) on the training dataset. Tenfold cross-validation is adopted in this paper [29]: the training dataset is divided into ten folds, nine for training and one for testing, which helps multiclass logistic regression achieve good generalization ability.

Before feature selection, the Z-score method is utilized to transform one feature vector space into another through the following equation:

$$\hat{x} = \frac{x-\mu}{\sigma}$$

where $\hat{x}$ is the transformed feature vector space, $x$ is the original feature vector space, and $\mu$ and $\sigma$ denote the mean and standard deviation of each feature vector. It maps features with different scales onto the same scale and removes feature background correlation information. Then, we apply RFECV to analyze feature importance and obtain two feature subsets: an important feature subset, whose features are used for classification, and an unimportant feature subset, whose features are not. Finally, we use multiclass logistic regression to determine the number of selected features based on the classification performance.
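The feature importance analysis described above can be sketched on synthetic data as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn is available, uses `make_classification` as a stand-in for the AS-OCT features, and the helper `pearson_r` and all sizes are hypothetical. For brevity the Z-score is fitted on all samples before cross-validation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression


def pearson_r(f, y):
    """Pearson's correlation coefficient between one feature and the labels."""
    f = np.asarray(f, dtype=float)
    y = np.asarray(y, dtype=float)
    fc, yc = f - f.mean(), y - y.mean()
    denom = np.sqrt(np.sum(fc ** 2) * np.sum(yc ** 2))
    # A constant feature (e.g. a minimum density that is always 0) has no
    # defined correlation; report 0 in that case.
    return float(np.sum(fc * yc) / denom) if denom > 0 else 0.0


# Synthetic stand-in: 300 samples, 10 features, 3 severity levels.
X, y = make_classification(n_samples=300, n_features=10, n_informative=4,
                           n_redundant=2, n_classes=3, random_state=0)

# Z-score normalization: (x - mean) / std per feature.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

# PCC between each feature and the severity level.
pcc = np.array([pearson_r(X_scaled[:, j], y) for j in range(X.shape[1])])

# RFECV with multiclass logistic regression and tenfold cross-validation.
selector = RFECV(LogisticRegression(max_iter=1000), step=1, cv=10,
                 scoring="accuracy")
selector.fit(X_scaled, y)
selected = np.flatnonzero(selector.support_)  # important feature subset
ranking = selector.ranking_                   # 1 = selected, >= 2 = eliminated
```

In `ranking_`, selected features receive rank 1 and eliminated features receive larger values, with higher values eliminated earlier.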

Automatic nuclear cataract classification via ensemble multiclass logistic regression
This paper uses the logistic regression method (LR) for automatic nuclear cataract classification, because previous works have shown that LR achieves promising classification results on various learning tasks [18]. Because nuclear cataract classification is a multiclass task, this paper uses multiclass logistic regression (MLR) through the following equation:

$$p(y=i\mid\varphi) = \frac{\exp\left(w_i^{T}\varphi\right)}{\sum_{j=0}^{2}\exp\left(w_j^{T}\varphi\right)}$$

where $i \in \{0,1,2\}$, $\varphi$ denotes the feature vector $(x_0, x_1, x_2, \ldots, x_M)$, $M$ is the number of features, $w_i$ is the learned parameter vector for the $i$th class, and $p(y=i\mid\varphi)$ is the predicted output for the $i$th class.
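In training, the parameters of multiclass logistic regression are optimized by minimizing the following cost function:

$$E(w_0,w_1,w_2) = -\sum_{K=1}^{n}\sum_{i=0}^{2} t_{Ki}\,\ln p(y=i\mid\varphi_K) \qquad (13)$$

where $\varphi_K$ is the feature vector of the $K$-th training image, $t_{Ki}$ equals 1 if the $K$-th training image belongs to class $i$ and 0 otherwise, and $n$ is the number of training images. Equation (13) is also known as the cross-entropy error function.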
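The softmax form of MLR and its cross-entropy cost can be written out in a few lines of NumPy. This is a sketch under standard textbook definitions; the function names and the toy shapes are assumptions, and no regularization term is included.

```python
import numpy as np


def mlr_probabilities(W, phi):
    """Class probabilities for weights W (3, d) and feature matrix phi (n, d)."""
    scores = phi @ W.T                           # (n, 3) linear scores w_i^T phi
    scores -= scores.max(axis=1, keepdims=True)  # stabilize exp() numerically
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)


def cross_entropy(W, phi, labels):
    """Summed cross-entropy error over all samples (labels in {0, 1, 2})."""
    p = mlr_probabilities(W, phi)
    labels = np.asarray(labels)
    return float(-np.sum(np.log(p[np.arange(labels.size), labels])))


# With all-zero weights every class is equally likely (probability 1/3).
phi_toy = np.array([[1.0, 0.2, -0.5], [1.0, -1.3, 0.7]])
W_zero = np.zeros((3, 3))
p_toy = mlr_probabilities(W_zero, phi_toy)
loss_toy = cross_entropy(W_zero, phi_toy, [0, 2])
```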
In the experiments, we found that MLR classifiers trained with different weight optimization methods obtain different NC classification results. Specifically, different weight optimization methods lead the MLR classifier to pay attention to different nuclear cataract severity levels. Therefore, we present an ensemble multiclass logistic regression (EMLR) framework in which two MLR classifiers are trained with two different optimization methods [8,30,31], chosen according to classification performance. Based on the experimental results, this paper uses the SAGA (a variant of the stochastic average gradient method) and LBFGS (limited-memory Broyden-Fletcher-Goldfarb-Shanno) optimization methods [8,30].
The predicted output of EMLR combines $p_{\mathrm{MLR}}^{\mathrm{lbfgs}}$ and $p_{\mathrm{MLR2}}^{\mathrm{saga}}$, which denote the predicted outputs of the MLR classifiers that use the LBFGS and SAGA optimization methods, respectively.
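A minimal sketch of such an ensemble is shown below, assuming scikit-learn and synthetic data. The combination rule is not recoverable from the text, so averaging the two predicted probability vectors (soft voting) is an assumption made here purely for illustration, as are the data sizes and iteration limits.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic three-class stand-in for the selected AS-OCT features.
X, y = make_classification(n_samples=600, n_features=12, n_informative=6,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Z-score normalization fitted on the training split only.
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# Two multiclass logistic regression classifiers with different solvers.
mlr_lbfgs = LogisticRegression(solver="lbfgs", max_iter=2000)
mlr_saga = LogisticRegression(solver="saga", max_iter=5000, random_state=0)
mlr_lbfgs.fit(X_train, y_train)
mlr_saga.fit(X_train, y_train)

# Assumed combination rule: average the predicted class probabilities.
p_emlr = (mlr_lbfgs.predict_proba(X_test) + mlr_saga.predict_proba(X_test)) / 2.0
y_pred = mlr_lbfgs.classes_[np.argmax(p_emlr, axis=1)]
accuracy = float(np.mean(y_pred == y_test))
```

Other combination rules, such as weighted averaging or taking the more confident classifier per sample, would fit the same structure.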

Experiment settings
We implement the experimental code in Python with the OpenCV package and the PyTorch platform. To demonstrate the performance of the proposed feature extraction-based framework comprehensively, this paper conducts the following comparison experiments.
- Performance comparison of different nucleus region features. This paper extracts features from three lens nucleus regions: the whole nucleus region, the up nucleus region, and the down nucleus region, as shown in Fig. 2.

Evaluation measures
To evaluate the overall performance of methods, we calculate the following commonly accepted evaluation measures: accuracy (ACC), macro precision, macro-sensitivity (Sen), and macro-F1 score. These evaluation measures can be expressed by the following equations:

$$\mathrm{ACC} = \frac{TP+TN}{TP+FP+TN+FN},\qquad \mathrm{Precision} = \frac{TP}{TP+FP},\qquad \mathrm{Sen} = \frac{TP}{TP+FN},\qquad F1 = \frac{2\times\mathrm{Precision}\times\mathrm{Sen}}{\mathrm{Precision}+\mathrm{Sen}}$$

where TP, FP, TN, and FN denote the numbers of true positives, false positives, true negatives, and false negatives, respectively.

Result analysis and discussion

Table 4 presents the NC classification performance on features of different lens nucleus regions and nuclear size via four machine learning methods. It can be seen that, compared to RF, NB, and RE, MLR achieves the best NC classification results (86.71% accuracy and 87.44% macro-F1) on the three lens nucleus region features plus nuclear size features, improving accuracy by over 1%. The four machine learning methods generally achieve better NC classification results on the combined whole, up, and down nucleus region features than on a single region. These results indicate that the fusion of different nuclear region features can boost the classification performance. We can also see that the NC classification results in this paper agree with the linear regression results of clinical works using different nuclear regions of AS-OCT images. RE achieves the best accuracy of 66.99% on the two nuclear size features, which agrees with clinical works. The four machine learning methods generally achieve better NC classification results on the three regions plus nuclear size features. MLR achieves the highest improvement of about 5% on the fusion of up nucleus region features and nuclear size features. The results demonstrate that nuclear size features can enhance NC classification results. Moreover, based on the NC classification results, the fusion of different nuclear region features and nuclear size features is more robust than single nuclear region features or nuclear size features alone: four machine learning methods achieve over 80% accuracy, and three of them obtain more than 86.00% accuracy.
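As a sketch of the macro-averaged measures from the section "Evaluation measures", the following plain-Python function computes them from true and predicted labels. Two choices here are assumptions: accuracy is computed as the fraction of correctly classified samples, and macro-F1 is taken as the mean of the per-class F1 scores.

```python
def macro_metrics(y_true, y_pred, n_classes=3):
    """Accuracy plus macro precision, sensitivity, and F1 for multiclass labels."""
    pairs = list(zip(y_true, y_pred))
    precisions, sensitivities, f1_scores = [], [], []
    for c in range(n_classes):
        # One-vs-rest counts for class c.
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + sensitivity
        f1_scores.append(2 * precision * sensitivity / denom if denom else 0.0)
        precisions.append(precision)
        sensitivities.append(sensitivity)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "macro_precision": sum(precisions) / n_classes,
        "macro_sensitivity": sum(sensitivities) / n_classes,
        "macro_f1": sum(f1_scores) / n_classes,
    }


scores = macro_metrics([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0])
```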
Table 2 presents PCC values between the 62 features and NC severity levels. We can see that the correlation between NC severity levels and IQR is stronger than that of other features on the three nuclear regions. The feature importance of uniformity is second only to IQR. These two features have potential as clinical indicators for clinical NC diagnosis, because they are explainable. Moreover, the PCC value of minimum density is 0, because the minimum density value of the nuclear region is 0. Thus, we do not use minimum density in the following feature importance analysis and NC classification; that is, only 59 features are useful. The PCC value of nuclear diameter is low, which conflicts with clinical findings. This is mainly because we cannot accurately extract the right and left edges of the nucleus, as shown in Fig. 3, which are affected by the scanning angle and environment.

Table 3 presents NC classification results of different features. Features with PCC values > 0.700 on the three nuclear regions are selected, and each selected feature is taken from the region in which it has the highest PCC value. It can be seen that features with high PCC values generally achieve better NC classification performance. It also demonstrates that the machine learning-based classification results agree well with clinical works. Among single features, MLR achieves its best accuracy of 75.71% using skewness, while RF achieves its best accuracy of 71.73% using IQR.

To further study the effect of PCC values on NC classification performance, for each feature we select the region in which it has the highest PCC value; the resulting feature subset is named Hybrid. According to Table 4, the Hybrid subset achieves better performance than the single-region feature subsets using MLR and RF. It demonstrates that features with high PCC values can improve NC classification performance, and that the three regions carry different feature information, which contributes to boosting NC classification results.
Figure 4 presents feature selection results of the RFE method using MLR. The horizontal axis represents the number of features used, chosen by their coefficient values. The vertical axis presents how accuracy changes with the number of features. 46 features (the important feature subset) are selected when MLR achieves the best accuracy on the training data. Table 5 shows the feature importance rankings of unselected features (the unimportant feature subset). The higher the feature importance ranking value, the less important the feature (ranking values start at 2). Figure 5 presents the feature coefficient values of MLR based on RFE for every nuclear cataract severity level.

Figure 6 shows the results for different numbers of features based on MLR when deleting unimportant features. It can be inferred that MLR achieves the best results (86.82% accuracy) when the number of features is 51. The following features are not used: P 10, Down P 10, Up P 10, P 90, Down HBE, Kurtosis, Up kurtosis, and Down entropy, which may provide a reference for future work. Furthermore, the compared machine learning methods also use these 51 features as input in the following experiments.

MLR achieves better NC classification results with the SAGA and LBFGS optimization methods than with other optimization methods. Hence, this paper adopts these two optimization methods for EMLR.

Table 6 presents the NC classification results of machine learning methods and deep learning methods. We can see that the proposed EMLR achieves the best accuracy and the best precision on the AS-OCT image dataset, with 86.96% and 87.31%, respectively. GoogleNet achieves the best F1 and the best sensitivity, with 88.01% and 89.98%. The proposed EMLR and GoogleNet achieve better NC classification results than the other machine learning and deep learning methods. The main reason for the classification results of EMLR is that it combines the advantages of different optimization methods for MLR with the characteristics of features identified by the feature importance analysis methods.

Performance comparison of machine learning methods and deep learning methods
Compared with deep learning methods like ResNets and VGGNets, EMLR, MLR, RE, and SVM achieve competitive classification performance, which confirms the effectiveness of the proposed global-local feature extraction method. Machine learning methods are more explainable than deep learning methods, because the features they use are interpretable, which is significant for clinical disease diagnosis. Moreover, the proposed method outperforms [43] by approximately 30%, because this paper uses the nuclear region for NC classification, while [43] uses the whole lens region as input.

Fig. 8 Confusion matrix of EMLR on AS-OCT image dataset
Furthermore, the proposed feature extraction-based framework has lower hardware requirements than deep learning methods; it also requires less training time and is easy to deploy on photography devices. Figure 8 presents the confusion matrix of EMLR. We can conclude that EMLR classifies all normal AS-OCT images correctly, and the specificity is 90.44%. The precision for the low-grade level is 71.58%, which may be caused by the imbalanced dataset.
All in all, the proposed feature extraction-based framework achieves state-of-the-art nuclear cataract classification results and is also highly explainable. Nevertheless, low-density values occupy a large proportion of the density values, which makes it hard for machine learning methods to distinguish different nuclear cataract severity levels. This challenge will be investigated in future work.

Conclusion and future work
This paper proposes a simple yet effective feature extraction-based framework to distinguish different nuclear cataract severity levels on AS-OCT images, comprising global-local feature extraction, feature importance analysis, and ensemble multiclass logistic regression. The global-local feature extraction method obtains features from three nuclear regions to enhance classification performance. Feature importance analysis helps select useful features. Ensemble multiclass logistic regression combines the advantages of different optimization methods. The results on the AS-OCT image dataset demonstrate that the proposed feature extraction-based framework achieves state-of-the-art nuclear cataract classification results compared with advanced machine learning methods and deep learning methods. Moreover, the proposed framework has potential as a computer-aided diagnosis tool for nuclear cataract diagnosis and cataract surgery planning.
In future work, we will incorporate information from different nuclear regions of AS-OCT images into deep neural network models, which may further improve nuclear cataract classification results.