Introduction

Optic disk edema (ODE) is an abnormal condition describing swelling of the optic disk. Its causes include, for example, toxic optic neuropathy, infiltrative optic neuropathy, malignant hypertension, and papilledema1. Symptoms vary from patient to patient depending on the cause; common ones are eye pain, visual field loss, color vision loss, flashing lights, and even vision loss when left untreated. ODE due to idiopathic intracranial hypertension (IIH), widely known as papilledema, is the most prevalent form. The prevalence of papilledema is as high as 3.5 per 100,000 in females aged between 15 and 44 years2. Among ophthalmic investigations, fundus photography can be used to diagnose ODE. The appearance of an edematous OD differs considerably from that of a non-edematous OD3. Typical characteristics of an edematous OD are a blurred edge, disk hyperemia, elevation, peripapillary hemorrhage, and tortuosity of retinal veins. Figure 1 shows a comparison of retinal images of non-edematous and edematous ODs.

Figure 1

Retinal images of a non-edematous (left) and an edematous (right) OD.

Most work on edematous OD classification has been from a clinical point of view; computer-aided software or algorithms proposed for edematous OD classification are limited. The following summarizes existing work on edematous OD classification, including stage grading where applicable. Deep learning and machine learning are the two main approaches used.

Milea et al.4 used a deep learning approach on 14,341 ocular fundus photographs, including 9156 normal retina images, 2148 papilledema images, and 3037 images of other retinal abnormalities for training and validation of the model, and 1505 images for external testing. Their system classifies images as normal, papilledema, or other abnormalities by applying U-Net to detect the OD location and DenseNet for classification. The model’s performance was evaluated by calculating the area under the curve (AUC), sensitivity, specificity, and accuracy. The overall performance for the detection of papilledema on the external-testing dataset was 96%, 87.5%, 96.4%, and 84.7% for AUC, accuracy, sensitivity, and specificity, respectively. Saba et al.5 proposed a fully automated deep-learning-based papilledema detection system using DenseNet4 for the classification of normal and papilledema OD images. The STARE dataset with 100 images was used in their experiments. The sensitivity, specificity, accuracy, and dice coefficient obtained were 98.63%, 97.83%, 99.17%, and 99.08%, respectively. The other main approach is machine learning. Fatima et al.6 developed a hybrid feature-based papilledema detection system. They first detected the OD region manually. An SVM classifier with thirteen features extracted from color, GLCM, statistical measures, and the intensity line profile was then used for classification. The method was evaluated on a small subset of the STARE dataset comprising 20 swelling cases and ten normal cases. The reported average performance measures were 100% sensitivity, 95% specificity, 91.67% precision, and 96.67% accuracy. Yousaf et al.7 extracted six vascular features and four GLCM texture features from 36 manually cropped non-edematous OD boundaries and ten edematous OD boundaries from the STARE dataset. Classification was performed using a supervised support vector machine (SVM) classifier with a radial basis function kernel. They reported accuracy, sensitivity, precision, and specificity of 95.65%, 100%, 83.30%, and 94.40%, respectively. In both machine learning studies, the OD region used as the domain for feature extraction was manually cropped. In addition, each experiment was done on a single small dataset with an imbalance between non-edematous and edematous ODs.

As automatic OD localization and segmentation are essential steps in our work, reviews of these tasks are also provided. Many localization techniques have been proposed based on optic disk intensity, shape, size, color, and vessel information; work related to OD localization is summarized in Table 1. Although several techniques with many different features have been used for OD segmentation in the past, and some even achieved accuracy as high as one hundred percent, those methods were evaluated on collections in which most images contain non-edematous ODs; they did not consider edematous cases. These OD localization and segmentation methods rely on the intensity, shape, and size of typical normal ODs and are therefore not well suited to edematous ODs, whose appearance (color, brightness, and size) changes physically and whose associated vessel structures tend to be incomplete and tortuous. Thus, the accuracy may not be as good as claimed when more ODE cases are involved. Reviews of automatic classification, localization, and segmentation of edematous OD are provided in the next section. A summary of the techniques used for OD segmentation is shown in Table 2.

Table 1 Reviews on OD localization techniques.
Table 2 Reviews of OD segmentation techniques.

Objectives, novelty, and contributions

This study extends our previous work8, initially presented at the 19th International Conference on Electrical Engineering, Computer, Telecommunications, and Information Technology (ECTI-CON 2022). In that conference work, we introduced factorized gradient vector flow (FGVF)9, a special kind of gradient vector flow for texture segmentation, to segment edematous ODs. Experiments on a small dataset of 35 images showed that it yields high performance.

The extensions of the previous work can be summarized as follows.

  1.

    We experimented with using FGVF to segment non-edematous ODs; the previous work was done only on edematous ODs.

  2.

    We experimented with more images from two additional public datasets containing both types of ODs. The total number of images used in the experiments is 295, with 146 edematous and 149 non-edematous cases.

  3.

    We demonstrated that the precise OD boundary is useful for edematous OD classification.

To the best of our knowledge, the use of FGVF to identify the boundary of the optic disk (OD) is new. This approach shows promise for advancing OD detection methodologies, especially in cases where ODs are swollen. In comparison with four other state-of-the-art methods, our experiments show that FGVF provides precise OD segmentation results regardless of whether the OD is edematous or non-edematous. This is a significant finding for ophthalmic image processing, where images with mixed types of ODs are common. Moreover, the classification of OD types is particularly useful for ODE prescreening.

Methodology

A diagram depicting the procedure of our method is shown in Fig. 2. Our method comprises OD localization, OD boundary segmentation, and edematous classification. The hybrid localization method (HLM)24 was utilized to localize the OD. The OD boundary was then segmented using factorized gradient vector flow (FGVF)8,9, with the computed OD location used as the seed point. After OD boundary segmentation, 27 features were extracted from a region centered at the localized OD, and the type of OD was classified using a linear SVM classifier. The details of each step are provided in the following subsections. The settings used for the SVM are described in the section Datasets, classifier, and evaluation.

Figure 2

Framework of edematous and non-edematous OD classification.

OD localization

To locate the OD in cases where the vascular network is incomplete, which is common in edematous cases, we used the hybrid localization method (HLM)24. HLM is effective on all types of vascular networks, regardless of their structural completeness. The method first analyzes the structure of the vascular network. If the network is complete, the main vessels appear in a horizontal parabolic shape, and HLM takes the vertex of the parabola as the OD location. If the network is incomplete, it appears as several broken lines, and the OD location is determined by the convergence point of straight lines fitted to these broken vessels. Figure 3 illustrates the OD localization step for edematous and non-edematous ODs.
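The full HLM procedure is described in ref. 24. Purely to illustrate the parabola-vertex idea for a complete vascular network, a minimal sketch is given below; the vessel-centerline extraction and the broken-line branch of HLM are omitted, and the sideways-parabola model, the toy data, and the function names are our own assumptions.

```python
import numpy as np

def parabola_vertex_from_vessels(vessel_points: np.ndarray) -> tuple:
    """Fit the main vessel arch with a sideways parabola x = f(y) and
    return its vertex (x, y), which HLM-style methods take as the OD centre.

    vessel_points: (N, 2) array of (x, y) vessel-centreline coordinates.
    """
    x, y = vessel_points[:, 0], vessel_points[:, 1]
    c2, c1, c0 = np.polyfit(y, x, deg=2)           # x = c2*y^2 + c1*y + c0
    y_vertex = -c1 / (2.0 * c2)                    # vertex of the parabola
    x_vertex = np.polyval([c2, c1, c0], y_vertex)
    return float(x_vertex), float(y_vertex)

# Toy usage: noisy points scattered around x = 0.001*(y - 300)^2 + 120
rng = np.random.default_rng(0)
ys = rng.uniform(50, 550, size=400)
xs = 0.001 * (ys - 300.0) ** 2 + 120.0 + rng.normal(0, 2, size=400)
print(parabola_vertex_from_vessels(np.column_stack([xs, ys])))  # ~ (120, 300)
```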

Figure 3

Illustration of the HLM method used for OD localization (rectangle) in non-edematous (left) and edematous ODs (right).

OD boundary segmentation

The OD region of interest (ROI) was first defined as a square centered at the OD location obtained from the HLM method. Based on the size of the ODs in our datasets, we assumed that the diameter of a non-edematous OD was one-sixth of the retina’s diameter24. As an edematous OD is commonly larger than the average non-edematous OD, the width of the ROI square for both edematous and non-edematous ODs was set to one-third of the retina’s diameter. Figure 4a and b depict the original image and the ROI.
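As a concrete illustration of this ROI definition, a small helper along the following lines could crop the square window; the array layout, the clipping at the image border, and the names are our own choices rather than part of the published method.

```python
import numpy as np

def crop_od_roi(image: np.ndarray, od_row: int, od_col: int,
                retina_diameter: float) -> np.ndarray:
    """Crop a square ROI of side retina_diameter / 3 centred at the OD location,
    clipping the window to the image borders."""
    half = int(round(retina_diameter / 3.0 / 2.0))
    r0, r1 = max(0, od_row - half), min(image.shape[0], od_row + half)
    c0, c1 = max(0, od_col - half), min(image.shape[1], od_col + half)
    return image[r0:r1, c0:c1]
```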

Figure 4

FGVF procedure illustration: (a) original image, (b) region of interest, (c) contrast-enhanced image, (d) vessel-removed image, (e) seed point and initial contour, (f) contour after the 100th round of FGVF’s evolution process, (g) contour after the 400th round of FGVF’s evolution process, (h) OD boundary after FGVF’s convergence.

Next, the image contrast was enhanced. A color space transform was applied to convert the ROI from RGB to L*a*b*, and Contrast-Limited Adaptive Histogram Equalization (CLAHE)8 was performed on the L* channel of the ROI. To remove vessels from the ROI, a masked image was created by multiplying the green channel of the original image with the binary ROI mask. Gaussian filtering was applied to the green channel of the masked image to smooth it. Then, region filling based on Laplace’s equation was performed on the pixels within the mask, removing vessel-related regions. Figure 4c and d illustrate the results of these processes.
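A rough Python counterpart of these two preprocessing steps, using scikit-image, is sketched below; note that inpaint_biharmonic is used only as a stand-in for the Laplace-equation region filling described above, and the binary vessel mask is assumed to come from a separate vessel-segmentation step.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from skimage import color, exposure, restoration

def enhance_and_remove_vessels(roi_rgb: np.ndarray, vessel_mask: np.ndarray):
    """Contrast-enhance the ROI and fill in vessel pixels.

    roi_rgb     : (H, W, 3) float RGB ROI in [0, 1].
    vessel_mask : (H, W) boolean mask of vessel pixels to be removed.
    """
    # CLAHE on the L* channel of the L*a*b* representation.
    lab = color.rgb2lab(roi_rgb)
    lab[..., 0] = exposure.equalize_adapthist(lab[..., 0] / 100.0) * 100.0
    enhanced = color.lab2rgb(lab)

    # Smooth the green channel, then fill in the vessel pixels.
    green = gaussian_filter(roi_rgb[..., 1], sigma=1.0)
    vessel_free = restoration.inpaint_biharmonic(green, vessel_mask)
    return enhanced, vessel_free
```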

The factorized gradient vector flow (FGVF) proposed by Gao et al.9 was employed in our work to segment the OD boundary. The following FGVF pseudocode illustrates its main tasks. The initial contour C was defined as a circle with a radius of 1/4 of the retina's width, centered at the OD location. Subsequently, the texture feature matrix Y was computed from the vessel-removed image obtained in the prior step. The FGVF algorithm takes the texture feature matrix Y, the initial contour C, and the number of rounds i as inputs. It repeatedly evolves the contour C toward the OD boundary until convergence.

figure a
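For readability, the control flow of the pseudocode in figure a can be sketched in Python as follows; the contour is represented as an (N, 2) array of points, PerformEvolution is passed in as a callable because its level-set internals are given by Eqs. (1)–(9) below, and the convergence test follows the description of CheckConvergence given after this sketch. This is our own paraphrase, not the authors' implementation.

```python
import numpy as np
from typing import Callable

CONVERGENCE_THRESHOLD = 0.05  # average per-point displacement, as in the paper

def check_convergence(c_new: np.ndarray, c_old: np.ndarray,
                      threshold: float = CONVERGENCE_THRESHOLD) -> bool:
    """True when the mean absolute change along x and y is below the threshold."""
    diff = np.abs(c_new - c_old).mean(axis=0)   # [mean |dx|, mean |dy|]
    return bool(np.all(diff < threshold))

def run_fgvf(Y: np.ndarray, contour: np.ndarray,
             perform_evolution: Callable[[np.ndarray, np.ndarray], np.ndarray],
             max_rounds: int = 1000) -> np.ndarray:
    """Iteratively evolve the contour against the texture feature matrix Y
    until the convergence test succeeds or max_rounds is reached."""
    for _ in range(max_rounds):
        new_contour = perform_evolution(Y, contour)
        if check_convergence(new_contour, contour):
            return new_contour
        contour = new_contour
    return contour
```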

The FGVF algorithm uses the following functions.

MakeFeatureMatrix(Img) takes the image Img as an input. It returns a texture feature matrix calculated using local spectral histograms39 and the factorization-based texture segmentation method proposed by Yuan et al.40.

PerformEvolution(Y, C) takes a texture feature matrix Y and a contour C. It evolves the contour using the level-set function and returns the new contour. The contour evolution is performed using the level-set regularization proposed by Li et al.41.

CheckConvergence(C, C*) takes contours C and C* as inputs. If the average differences along the x and y directions between the input contours are less than a convergence threshold, the function returns true; otherwise, it returns false. In our experiments, we used 0.05 as the convergence threshold.

Figure 4e–h displays contours at the initial round, 100th round, 400th round, and upon convergence.

The contour evolution based on the factorization-based fitting energy and level-set regularization can be explained mathematically as follows. Let \(\phi\) be a signed distance function of the contour curve and R be the representative feature matrix9.

The FGVF energy function \(\left( {E_{FGVF} } \right)\) consists of two energy terms: a factorization-based fitting energy (\(E_{data}\))9 and a level-set regularization term (\(E_{regularization}\))41.

$$E_{FGVF} \left( {\phi ,R} \right) = \tau E_{data} \left( {\phi ,R} \right) + \upsilon E_{regularization} \left( \phi \right),$$
(1)

where \(\tau\) and \(\upsilon\) are two positive constants that control the proportions of \(E_{data}\) and \(E_{regularization}\). For our edematous and non-edematous OD boundary segmentation, we set \(\tau\) = 50 and \(\upsilon\) = 1.5; these values were determined empirically to yield the best results.

The first energy term \(E_{data}\) is derived from matrix factorization techniques. Equations (2)–(5) collectively contribute to \(E_{data}\). The terms \(A\) and \(B\) are determined using the Heaviside function and the weight vectors \(\omega_{o}\) and \(\omega_{b}\), and the weights are calculated from the representative feature \(R\) and the feature matrix \(Y\).

$$E_{data} \left( {\phi , R} \right) = - \int_{\varOmega } {A + Bdx}$$
(2)
$$A = H_{\varepsilon } \left( \phi \right)\omega_{o} \left( {x, R} \right)$$
(3)
$$B = 1 - H_{\varepsilon } \left( \phi \right)\omega_{b} \left( {x, R} \right)$$
(4)
$$\left[ {\omega_{o} , \omega_{b} } \right] = \left( {RR^{T} } \right)^{ - 1} R^{T} Y$$
(5)
$$Y = R\beta + \epsilon$$
(6)

where \(\varOmega\) is the 2D image domain, x is a point in the domain, \(\varOmega_{o}\) and \(\varOmega_{b}\) are the object and background regions (i.e., \(\varOmega = \varOmega_{o} \cup \varOmega_{b}\)), \(H_{\varepsilon } \left( \phi \right)\) is a Heaviside function, \(\omega_{o}\) and \(\omega_{b}\) are the weights of the object and background regions, R is the matrix of representative features, Y is the feature matrix of the ROI calculated using the factorization-based method for textural image segmentation proposed by Yuan et al.40, \(\beta\) is a matrix whose columns are region weight vectors, and \(\epsilon\) is the additive noise. In our work, \(\epsilon\) is set to 0.5. The ROI is divided into object and background parts with different textural feature maps. The Y in Eq. (6) is the matrix returned by the MakeFeatureMatrix(Img) function in the FGVF pseudocode above.
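Estimating the weight vectors in Eq. (5) amounts to a least-squares fit of the linear model in Eq. (6). The toy dimensions below and the use of np.linalg.lstsq (rather than the explicit normal-equation form, which may differ from Eq. (5) only in the transposition convention for R) are our own illustrative choices.

```python
import numpy as np

# Toy dimensions (our own choice): d-dimensional texture features for n pixels,
# with two representative features (object, background) as the columns of R.
rng = np.random.default_rng(1)
d, n = 8, 500
R = rng.normal(size=(d, 2))                        # representative features
true_w = np.vstack([np.linspace(1, 0, n), np.linspace(0, 1, n)])  # 2 x n weights
Y = R @ true_w + 0.05 * rng.normal(size=(d, n))    # Eq. (6): Y = R*beta + noise

# Least-squares estimate of the per-pixel object/background weights.
W, *_ = np.linalg.lstsq(R, Y, rcond=None)          # W is 2 x n
w_o, w_b = W[0], W[1]                              # weights entering Eqs. (3)-(4)
print(np.allclose(W, true_w, atol=0.2))            # True: weights are recovered
```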

The second energy term \(E_{regularization}\) shown in Eq. (1) is expressed as:

$$E_{regularization} \left( \phi \right) = \int_{\varOmega } {\frac{1}{2}\left( {\left| {\nabla \phi \left( x \right)} \right| - 1} \right)^{2} dx}$$
(7)

where \(\nabla \phi\) is the gradient of the level-set function. The deformation process is repeated until the contour converges to the object boundary. Equation (7) enforces regularization and smoothness of the level-set function.

The update of the level set function at each round is described in Eq. (8).

$$\phi_{t + 1} = \phi_{t} + \frac{\partial \phi }{{\partial t}}dt$$
(8)

The evolving contour is a level set of \(\phi\), expressed as in Eq. (9).

$$C = \left\{ {x :\phi \left( x \right) = 0} \right\}$$
(9)

The C in Eq. (9) is the contour returned by PerformEvolution(Y, C) in the FGVF pseudocode.
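Equations (8) and (9) correspond to a forward-Euler update of \(\phi\) followed by extraction of its zero level set. A small illustration is given below; the constant evolution speed and the use of skimage.measure.find_contours for the zero-level extraction are our own choices.

```python
import numpy as np
from skimage.measure import find_contours

def level_set_step(phi: np.ndarray, dphi_dt: np.ndarray, dt: float) -> np.ndarray:
    """One forward-Euler update of the level-set function, as in Eq. (8)."""
    return phi + dphi_dt * dt

def zero_level_contours(phi: np.ndarray) -> list:
    """Extract the evolving contour C = {x : phi(x) = 0}, as in Eq. (9)."""
    return find_contours(phi, level=0.0)

# Toy example: a circle of radius 20 shrinking under a constant speed.
yy, xx = np.mgrid[0:128, 0:128]
phi = np.sqrt((xx - 64) ** 2 + (yy - 64) ** 2) - 20.0     # signed distance function
phi = level_set_step(phi, dphi_dt=np.ones_like(phi), dt=1.0)  # radius shrinks to 19
print(len(zero_level_contours(phi)[0]))  # number of points on the extracted contour
```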

Edematous classification

A compilation of 40 appearance-based and statistical OD features was made from various literature sources. The maximum relevance minimum redundancy (mRMR) algorithm42 was then utilized to pick the most relevant and non-redundant features from this initial list. The mRMR algorithm prioritizes features that are informative about the class while minimizing redundancy among the selected features. It finally picked a set of 27 features, which we grouped into four categories: GLCM, vessel, color, and intensity line profile features. A detailed list of the selected features follows.
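The exact mRMR procedure is given in ref. 42. As a rough illustration of the relevance-versus-redundancy trade-off, a simplified greedy variant using scikit-learn's mutual-information estimators might look like the following; this is our own sketch, not the selection code used in the paper.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression

def mrmr_select(X: np.ndarray, y: np.ndarray, k: int) -> list:
    """Greedy mRMR-style selection: at each step add the feature with the
    largest (relevance to y) minus (mean redundancy with already-selected ones)."""
    n_features = X.shape[1]
    relevance = mutual_info_classif(X, y, random_state=0)
    selected, remaining = [], list(range(n_features))
    while remaining and len(selected) < k:
        scores = []
        for j in remaining:
            if selected:
                redundancy = np.mean(
                    [mutual_info_regression(X[:, [j]], X[:, s], random_state=0)[0]
                     for s in selected])
            else:
                redundancy = 0.0
            scores.append(relevance[j] - redundancy)
        best = remaining[int(np.argmax(scores))]
        selected.append(best)
        remaining.remove(best)
    return selected  # indices of the k selected features
```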

Gray-level co-occurrence matrix features

The gray-level co-occurrence matrix (GLCM) is a statistical technique for analyzing texture that considers the spatial relationship of pixels43,44. The GLCM characterizes texture by counting pairs of pixels with specific values in a specific spatial arrangement. Ten GLCM features are extracted. Let M be an \(N \times N\) co-occurrence matrix, where \(N\) is the number of gray values and the element at coordinates (i, j) counts pixel pairs with intensities i and j; let \(p\) be the normalized co-occurrence matrix, and let \(\mu_{x}, \mu_{y}\) and \(\sigma_{x}, \sigma_{y}\) be the means and standard deviations of the rows and columns of p, respectively.

  1.

    Autocorrelation (autoc) is computed as the sum, over all intensity pairs, of the product of the row and column indices weighted by \(p\left( {i,j} \right)\); it reflects the regularity of the texture. It is high in edematous ODs, whose pixels have similar intensity values, whereas non-edematous ODs have lower autocorrelation because of the strong intensity change between the optic disk and optic cup in the normal condition.

    $$autoc = \mathop \sum \limits_{i} \mathop \sum \limits_{j} \left( {ij} \right)p\left( {i,j} \right)$$
    (10)
  2.

    Contrast (contr) measures the difference in color shades and brightness within the region. A higher contrast value indicates greater variation in gray level between neighboring pixels; thus, a non-edematous OD has a larger contrast value than an edematous one.

    $$contr = \mathop \sum \limits_{n = 0}^{{N_{g} - 1}} n^{2} \left\{ {\mathop \sum \limits_{i = 1}^{{N_{g} }} \mathop \sum \limits_{j = 1}^{{N_{g} }} p\left( {i,j} \right)\left| {\left| {i - j} \right| = n} \right.} \right\}$$
    (11)

    where \(n = \left| {i - j} \right|\) and \(N_{g}\) is the number of quantized gray levels.

  3.

    Correlation (corrp) uses the means and standard deviations to quantify the linear relationship between pixel intensities in the matrix \(p\). The low variation in pixel intensities in the edematous case yields a high correlation.

    $$corrp = \frac{{\mathop \sum \nolimits_{i} \mathop \sum \nolimits_{j} \left( {ij} \right)p\left( {i,j} \right) - \mu_{x} \mu_{y} }}{{\sigma_{x} \sigma_{y} }}$$
    (12)
  4.

    Cluster prominence (cprom) measures the presence of clusters in the image; a higher value indicates greater prominence of clusters. Thus, a non-edematous OD has a high value and an edematous OD a low value.

    $$cprom = \mathop \sum \limits_{i} \mathop \sum \limits_{j} \left( {i + j - \mu_{x} - \mu_{y} } \right)^{4} p\left( {i,j} \right)$$
    (13)
  5.

    Cluster shade (cshad) measures the degree of asymmetry in the gray-level pair distribution. The non-edematous condition shows high asymmetry in the distribution of the matrix \(p\), whereas the edematous case shows low asymmetry.

    $$cshad = \mathop \sum \limits_{i} \mathop \sum \limits_{j} \left( {i + j - \mu_{x} - \mu_{y} } \right)^{3} p\left( {i,j} \right)$$
    (14)
  6.

    Dissimilarity (dissi) measures the average absolute difference between the intensities of pixel pairs in the matrix \(p\). The value is high for a non-edematous OD, which shows significant texture changes between the optic cup and disk.

    $$dissi = \mathop \sum \limits_{i} \mathop \sum \limits_{j} \left| {i - j} \right| \cdot p\left( {i,j} \right)$$
    (15)
  7.

    Energy (energy) measures the uniformity of the image pixels. The texture is likely uniform in an edematous OD and varying in a non-edematous OD; thus, the energy value increases with abnormal disruption of the texture pattern.

    $$energy = \mathop \sum \limits_{i} \mathop \sum \limits_{j} p\left( {i,j} \right)^{2}$$
    (16)
  8.

    Entropy (entro) measures the disorder in the distribution of pixel pairs. The value is high when the elements of the matrix \(p\) are uniformly distributed; the homogeneous character of the edematous OD gives it low entropy.

    $$entro = - \mathop \sum \limits_{i} \mathop \sum \limits_{j} p\left( {i,j} \right)\log \left( {p\left( {i,j} \right)} \right)$$
    (17)
  9.

    Homogeneity (homop) measures image homogeneity, with larger values for smaller gray-tone differences between paired pixels. A non-edematous OD has lower homogeneity than an edematous OD.

    $$homop = \mathop \sum \limits_{i} \mathop \sum \limits_{j} \frac{1}{{1 + \left( {i - j} \right)^{2} }}p\left( {i,j} \right)$$
    (18)
  10.

    Maximum probability (maxpr) is the probability of the most frequently occurring intensity pair in the image. Edematous ODs tend to have a lower maxpr value than non-edematous ones.

    $$maxpr = \mathop {\max }\limits_{i,j} \left( {p\left( {i,j} \right)} \right)$$
    (19)
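Several of the GLCM statistics above (contrast, correlation, dissimilarity, homogeneity) are available directly in scikit-image, and the remainder can be computed from the normalized matrix p, as sketched below; the single offset (distance 1, angle 0) and the quantization to 64 gray levels are our own assumptions.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(gray_roi_uint8: np.ndarray, levels: int = 64) -> dict:
    """GLCM statistics described above for one offset (distance 1, angle 0)."""
    img = (gray_roi_uint8.astype(np.float64) / 256.0 * levels).astype(np.uint8)
    glcm = graycomatrix(img, distances=[1], angles=[0],
                        levels=levels, symmetric=True, normed=True)
    p = glcm[:, :, 0, 0]                          # normalized co-occurrence matrix
    i, j = np.meshgrid(np.arange(levels), np.arange(levels), indexing="ij")
    mu_x, mu_y = (i * p).sum(), (j * p).sum()

    feats = {name: graycoprops(glcm, name)[0, 0]
             for name in ("contrast", "correlation", "dissimilarity", "homogeneity")}
    feats["energy"] = (p ** 2).sum()          # Eq. (16); skimage's 'energy' prop is its square root
    feats["autocorrelation"] = (i * j * p).sum()                          # Eq. (10)
    feats["cluster_prominence"] = ((i + j - mu_x - mu_y) ** 4 * p).sum()  # Eq. (13)
    feats["cluster_shade"] = ((i + j - mu_x - mu_y) ** 3 * p).sum()       # Eq. (14)
    feats["entropy"] = -(p[p > 0] * np.log(p[p > 0])).sum()               # Eq. (17)
    feats["max_probability"] = p.max()                                    # Eq. (19)
    return feats
```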

Vessel features

The following are the vessel features used in the experiment.

  1.

    The vessel disk continuity index (VDI) is the number of disjoint vessel regions in the segmented vascular network of the OD image. A non-edematous OD image usually has a completely connected vascular structure, resulting in a low VDI value, whereas an edematous OD image usually has more broken vessels, especially in severe cases, resulting in a higher VDI value3.

  2.

    The vessel disk continuity index to disk proximity (VDIP) is the VDI calculated within the segmented OD region.

  3.

    The area of the largest vessel region is the number of pixels in the largest connected vessel region. It describes the extent of the largest vessel structure. Edematous ODs tend to have a smaller value than non-edematous ODs because their vascular network is less complete.

  4.

    The mean vessel area is the ratio of the total number of vessel pixels to the number of connected vessel clusters. Edematous ODs usually have a lower value than non-edematous ODs due to the vessel compression effect.

  5.

    The standard deviation (\(\sigma\)) of the intensity distribution is defined as follows.

    $$\sigma = \sqrt {\frac{{\sum \left( {x - \mu } \right)^{2} }}{N}} ,$$
    (20)

    where N is the total number of pixels in the image, x represents each pixel intensity value, and \(\mu\) is the mean of the image intensity distribution. The \(\sigma\) values of edematous ODs tend to be higher than those of non-edematous ODs.

  6.

    The kurtosis (\(\kappa\)) is a measure of the tailedness of an intensity distribution, indicating how often outliers occur. It is defined as follows.

    $$\kappa = \frac{{\frac{1}{N}\sum \left( {x - \mu } \right)^{4} }}{{\sigma^{4} }}$$
    (21)

    where N is the total number of pixels in the image, x represents each pixel intensity value, \(\mu\) is the mean of the image intensity distribution, and \(\sigma\) is the standard deviation. The \(\kappa\) values of edematous ODs tend to be lower than those of non-edematous ODs.
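A sketch of how the connected-component vessel features and the intensity-distribution statistics above could be computed with scikit-image and SciPy is given below; it assumes a binary vessel mask and a binary OD mask produced by earlier steps, and the exact definitions used in the paper may differ in detail.

```python
import numpy as np
from scipy.stats import kurtosis
from skimage.measure import label, regionprops

def vessel_features(vessel_mask: np.ndarray, od_mask: np.ndarray) -> dict:
    """Connected-component vessel features from boolean vessel and OD masks."""
    labels = label(vessel_mask, connectivity=2)
    regions = regionprops(labels)
    areas = np.array([r.area for r in regions]) if regions else np.array([0])
    return {
        "VDI": int(labels.max()),                     # number of disjoint vessel regions
        "VDIP": int(label(vessel_mask & od_mask, connectivity=2).max()),
        "largest_vessel_area": int(areas.max()),
        "mean_vessel_area": float(areas.sum() / max(labels.max(), 1)),
    }

def intensity_distribution_features(gray: np.ndarray) -> dict:
    """Standard deviation (Eq. (20)) and Pearson kurtosis (Eq. (21)) of intensities."""
    return {"std": float(gray.std()),
            "kurtosis": float(kurtosis(gray, axis=None, fisher=False))}
```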

Color features

  1.

    Sharpness (S): the ratio of the sum of all gradient norms to the number of image pixels.

  2.

    The hue value (H) in the HSV color space.

  3.

    The saturation value (S) in the HSV color space.

  4.

    The brightness value (V) in the HSV color space.

  5.

    The mean intensity value.

  6.

    The red/green value (a*) in the L*a*b* color space.

  7.

    The blue/yellow value (b*) in the L*a*b* color space.

Generally, these color features of non-edematous ODs are higher than those of edematous ODs.
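As an illustration, the color features listed above could be computed over the OD ROI roughly as follows; averaging each channel over the ROI is our own assumption about how the per-region values are obtained.

```python
import numpy as np
from skimage import color

def color_features(roi_rgb: np.ndarray) -> dict:
    """Mean color and sharpness features over the OD ROI (RGB floats in [0, 1])."""
    hsv = color.rgb2hsv(roi_rgb)
    lab = color.rgb2lab(roi_rgb)
    gray = color.rgb2gray(roi_rgb)
    gy, gx = np.gradient(gray)
    sharpness = np.hypot(gx, gy).sum() / gray.size   # sum of gradient norms / #pixels
    return {
        "sharpness": float(sharpness),
        "hue": float(hsv[..., 0].mean()),
        "saturation": float(hsv[..., 1].mean()),
        "brightness": float(hsv[..., 2].mean()),
        "mean_intensity": float(gray.mean()),
        "a_star": float(lab[..., 1].mean()),
        "b_star": float(lab[..., 2].mean()),
    }
```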

Image intensity line profile features

A horizontal line centered at the optic disk (OD) location, with a length of one-half of the retina’s diameter, is considered. Figure 5 depicts the intensity profile.

Figure 5

Example of image intensity profiles along a line on edematous and non-edematous OD images.

The following features are extracted from a line profile.

  1.

    The average intensity

  2.

    The minimum intensity

  3.

    The maximum intensity

  4.

    The standard deviation of the intensity

Generally, the averages, maximums, and standard deviations of the intensity of non-edematous ODs are larger than those of edematous ODs. In contrast, the minimum intensity values of non-edematous ODs are lower than those of edematous ODs.
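A minimal sketch of the line-profile features, assuming a grayscale image and the OD center from the localization step, might be:

```python
import numpy as np

def line_profile_features(gray: np.ndarray, od_row: int, od_col: int,
                          retina_diameter: float) -> dict:
    """Statistics of the horizontal intensity profile through the OD centre,
    with a total length of half the retina's diameter."""
    half = int(round(retina_diameter / 4.0))          # half of the profile length
    c0 = max(0, od_col - half)
    c1 = min(gray.shape[1], od_col + half)
    profile = gray[od_row, c0:c1]
    return {"mean": float(profile.mean()), "min": float(profile.min()),
            "max": float(profile.max()), "std": float(profile.std())}
```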

Datasets, classifier, and evaluation

The programs were implemented in MATLAB R2022a and run on a DELL IN5406 (Intel Core i7-1165G7 processor). The experiments were performed on three datasets. The first dataset, with images downloaded from the Internet45,46, includes 35 edematous and 38 non-edematous OD images with dimensions between 600 × 600 and 2300 × 1900 pixels. From the RFMiD public dataset47, 91 edematous and 91 non-edematous OD images with dimensions between 2144 × 1424 and 4288 × 2848 pixels were selected. From the RFMiD2.0 public dataset48, 20 edematous and 20 non-edematous OD images with dimensions between 512 × 512 and 2048 × 1536 pixels were selected. In total, 295 OD images, comprising 146 edematous and 149 non-edematous cases, were used in the experiments.

For the ODE classification, we selected a linear support vector machine (SVM) since it is effective on datasets with many features. To minimize over-fitting, we used a fivefold cross-validation approach with 80% of the data for training and 20% for testing in each fold. However, over-fitting may still occur when dealing with a new image whose characteristics differ from those in the current dataset. Additionally, it is important to note that publicly available retinal images with edematous ODs are limited; thus, it is currently not possible to address over-fitting simply by increasing the size of the dataset.
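A sketch of this evaluation protocol with scikit-learn is shown below; the feature standardization and the stratified splits are our own additions and may differ from the exact MATLAB settings used in the paper.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def evaluate_linear_svm(X: np.ndarray, y: np.ndarray) -> float:
    """Fivefold cross-validated accuracy of a linear SVM on the 27-feature matrix."""
    model = make_pipeline(StandardScaler(), SVC(kernel="linear"))
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    return float(scores.mean())
```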

For OD localization, the performance was measured using a location accuracy (Accloc) defined in Eq. (22).

$$Acc_{loc} = \frac{C}{N},$$
(22)

where C is the number of images in which the method correctly localizes the OD, and N is the total number of images. A case is counted as successful when the method’s computed OD location lies within the ground-truth contour.
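A simple way to implement this success criterion is a point-in-polygon test against the ground-truth contour; the use of matplotlib.path for the test is our own choice.

```python
import numpy as np
from matplotlib.path import Path

def localization_success(od_xy: tuple, ground_truth_contour: np.ndarray) -> bool:
    """True when the computed OD location lies inside the ground-truth contour,
    given as an (N, 2) array of (x, y) boundary points."""
    return bool(Path(ground_truth_contour).contains_point(od_xy))

def localization_accuracy(hits: list) -> float:
    """Acc_loc = C / N from Eq. (22), given a list of per-image booleans."""
    return float(np.mean(hits))
```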

The performance of the OD segmentation method was evaluated using precision, recall, and F1 measures. The evaluation formulas are shown in Eqs. (23)–(25).

$$Precision = \frac{TP}{{TP + FP}},$$
(23)
$$Recall = \frac{TP}{{TP + FN}},$$
(24)
$$F1 \, measure = \frac{2 \times Precision \times Recall}{{Precision + Recall}},$$
(25)

where TP, FP, TN, and FN are the number of pixels that are true positive, false positive, true negative, and false negative, respectively.
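These pixel-wise measures can be computed directly from the predicted and ground-truth binary OD masks, for example:

```python
import numpy as np

def segmentation_scores(pred_mask: np.ndarray, gt_mask: np.ndarray) -> dict:
    """Pixel-wise precision, recall, and F1 (Eqs. (23)-(25)) for binary OD masks."""
    pred, gt = pred_mask.astype(bool), gt_mask.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}
```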

For edematous classification, we compared the performance of each individual feature set and of all features combined using a linear support vector machine (SVM) classifier. The classification accuracy (Accclassify) is defined in Eq. (26).

$$Acc_{classify} = \frac{{C_{Ede} + C_{Non} }}{N},$$
(26)

where \(C_{Ede}\) and \(C_{Non}\) are the numbers of images correctly classified as edematous and non-edematous, respectively, and N is the total number of images.

Numerical results and discussion

This section presents quantitative studies of OD localization, OD segmentation, and edematous classification, comparing our method against existing methods.

OD localization

We compared the hybrid localization method (HLM)24 used by our method against the feature projection (FP)22 and adaptive thresholding (AT)10 methods. Selected cases of localization results from non-edematous and edematous groups of two datasets are depicted in Fig. 6.

Figure 6

Examples of OD localization results from FP (yellow square), AT (red circle), and HLM as used by the FGVF method (blue hexagram) for non-edematous (top) and edematous (bottom) cases.

The numerical results are reported in Table 3. For non-edematous ODs, most methods located the OD efficiently. FP performed worse than the other methods because it relied only on vessels; when the vascular network was incomplete, as in some edematous cases, FP failed. AT sometimes mistook abnormally bright spots for the OD.

Table 3 Comparison of the OD localization performance of FP, AT, and HLM.

The results in Fig. 6 and Table 3 show that the HLM method used by our algorithm achieved the best average Accloc of 97.88% across all three datasets, considerably higher than the FP and AT methods. The Accloc values of all three methods were lower for the edematous cases than for the non-edematous cases. Across all OD types, the average Accloc of HLM was higher than that of FP and AT by 12.48% and 6.04%, respectively. Moreover, HLM localization was markedly superior to the two comparative methods in edematous cases, for which its Accloc exceeded that of FP and AT by as much as 22.49% and 12.03%, respectively.

OD segmentation

We compared the factorized gradient vector flow (FGVF)8,9 used in our work against four other comparative methods: alternated deflation-inflation gradient vector flow (ADI-GVF)37, traditional gradient vector flow (GVF)36, region growing (RG)29, and super-pixel clustering (SPC)30. All methods except super-pixel clustering required initial points. The OD locations obtained from the HLM method were the initial points. Figure 7 shows examples of segmentation results from different approaches for edematous and non-edematous ODs.

Figure 7

Examples of segmentation results of non-edematous cases (top) and edematous (bottom) for ADI-GVF, GVF, RG, SPC, and FGVF (ours).

Most methods performed better on non-edematous ODs than on edematous ODs. For edematous ODs, the methods in the GVF family showed undersegmentation, while region growing and superpixel clustering showed oversegmentation. Table 4 shows the numerical performance comparison of the segmentation methods.

Table 4 OD segmentation performance.

The following findings can be summarized from the results of Table 4.

For non-edematous images, both the GVF and FGVF methods achieved F1 measures significantly higher than those of the other comparative methods. On average, the improvement of FGVF over the second-best method (GVF) is only 0.21%, whereas FGVF outperforms the poorest method (ADI-GVF) by 21.16%.

For images with edema, all methods performed worse than for images without edema. Among all the methods, FGVF was the best and produced significantly better results than GVF and the other methods. On average, the improvement of FGVF over the second-best method (GVF) was 2.99%, while the improvement over the poorest method (RG) was 15.54%.

For the mixed cases, GVF and FGVF have fairly close F1 measures, both significantly better than the other methods. Precision-wise, GVF was slightly better than FGVF, but FGVF had considerably better recall, resulting in a better overall F1 measure for FGVF. The ADI-GVF method was the poorest performer.

The RFMiD2.0 dataset is more challenging for most methods because of its low resolution and the indistinct OD region in its edematous images. Nevertheless, when only edematous OD images in this dataset are considered, FGVF still performs best.

Edema classification

Table 5 compares the classification accuracy of the linear SVM classifier using each individual feature set and all feature sets combined on the different datasets.

Table 5 Accuracy comparisons of the proposed method (all features combined) against each individual feature set and against a state-of-the-art method (Yousaf et al.7).

The findings and discussions from Table 5 are as follows.

  1.

    The proposed method achieved an average accuracy of 99.40%, the highest accuracy recorded.

  2.

    For single feature types, the average classification accuracy ranked from best to worst was the intensity line profile, vessel, color, and GLCM features. Combining all feature types significantly improved the average accuracy: the improvement over the best single type (the intensity line profile) was 2.6%, and over the worst type (GLCM) it was 11.4%.

  3.

    It should be noted that our proposed method performed better than the method proposed by Yousaf et al.7 by 3.34%. This improvement could be attributed to the fact that Yousaf et al. used only ten features from vessels and the GLCM, while our method also employs features from the color and intensity line profiles. This suggests that using additional features can help improve the classification results.

  4.

    After analyzing the unsuccessful cases, we found that the classification accuracy depended on several factors, such as the stage of edema in the dataset and the appearance of the OD. The classification accuracy was lower for images at the mild edema stage, because mild edema causes only minimal changes in the appearance of the disk compared with a normal OD. Additionally, some non-edematous ODs with unclear boundaries were segmented incorrectly, which led to erroneous features and, consequently, inaccurate classification. Figure 8 shows examples of an edematous OD misclassified as non-edematous (false negative) and a non-edematous image misclassified as edematous (false positive).

Figure 8

Examples of false negative (left) and false positive (right) cases. The solid black contour is the ground truth, the hexagon is the computed OD location, and the dashed line is the segmented OD boundary.

Conclusion

This paper presented an automatic segmentation and classification method for edematous and non-edematous optic disks based on the FGVF segmentation model with HLM initialization and a linear SVM classifier. The proposed method was evaluated on 146 edematous and 149 non-edematous images from the Internet and RFMiD datasets by comparing the proposed localization, segmentation, and classification performance against existing methods. HLM worked well for OD localization, achieving an average accuracy of 97.88% over the 295 images. The proposed FGVF achieved an average segmentation precision of 86.56%, recall of 88.19%, and F1 measure of 86.48%. The average classification accuracy was 99.40%. However, the FGVF method used for OD segmentation has limitations, including high computational demands and sensitivity to initial conditions, and the classification accuracy relies heavily on the precision of the OD segmentation. Finding more useful features, such as the cloudiness of the OD boundary and the ratio of the OD diameter to that of the retina, and addressing the limitations of FGVF will be our future work.