1 Introduction

Glaucoma is a group of eye diseases that cause progressive degeneration of the optic nerve fibers, eventually leading to structural damage in the optic nerve head, also known as the optic disc, and gradual vision loss [4, 7]. Early detection is critical to contain its development, since these diseases are asymptomatic in their initial stages.

The optic disc (OD) is the region where ganglion cell axons exit the eye to form the optic nerve, which transmits visual data to the brain. It is formed by a central area known as the optic cup (OC), which is usually brighter than the rest of the disc, and the neuroretinal rim, a peripheral area where the nerve fibers bend into the cup region (Fig. 1).

Fig. 1 Optic disc, optic cup and neuroretinal rim

The loss of optic nerve fibers caused by the progression of glaucoma leads to structural damage to the OD, namely an expansion of the OC region and a thinning of the neuroretinal rim. Due to these changes, a high ratio between the sizes of the optic cup and the optic disc, known as the cup-to-disc ratio (CDR), serves as one of the main indicators of glaucoma: a patient with glaucoma will most often show a larger CDR than a healthy person [2, 16, 49].

Several other parameters are also indicative of glaucoma risk, such as increased intraocular pressure [5], peripapillary atrophy (PPA) [32] or disc diameter [51]. However, since the CDR is the most commonly used, early detection and treatment rely heavily on the accurate detection of the optic disc and cup.

Manually segmenting the OD is a fairly simple task, since this area is usually easy to detect visually. However, the opposite is true for the OC in monocular fundus images. The boundary of the cup region can be detected thanks to its pallor, that is, the color contrast inside the disc region, and the blood vessels that run from the disc into the cup. Since these features are often subtle, the task of segmenting the cup in monocular images can become strenuous and highly subjective [48]. However, as the tissue bends into the optic nerve in the cup area, there is a depth variation that is easily appreciable using 3D information.

The main motivation of this work is that 3D information enriches 2D information in the cup segmentation process which, in turn, will enable better diagnostics and generate more accurate ground truth for automated diagnostic procedures.

Virtual reality (VR) head mounted displays have become very popular in recent years. These devices allow 3D viewing of stereoscopic content. Besides that, mobile phones can be used inside custom-fit headsets, for a virtual reality experience at a fraction of the cost of dedicated VR head mounted displays. A setup composed of a mid-range phone and a standard headset for virtual reality can cost around 200€.

For all these reasons, the main contribution of this work is to develop an affordable virtual reality system which supports the manual segmentation of the OC, by allowing medical experts to rely on visual depth information. This is achieved by displaying overlapping fundus stereo image pairs, arranged so that the three–dimensional shape of the OC becomes visible to the user.

This paper is organized as follows: Section 2 presents the related work; Section 3 presents the implementation details and how to interact with the developed application; Section 4 describes the obtained results, comparing the segmentations obtained by viewing the raw fundus stereo image pairs in 2D with those obtained using the VR application; Section 5 discusses the observed benefits of the presented work; and Section 6 presents the conclusions.

2 Related work

Medical images can be labeled using general–purpose image processing applications, such as GIMP [42], Adobe Photoshop [3] or Inkscape [12], but there are also specific interactive tools designed for such tasks [17, 30]. Most of these tools use three–dimensional gray–scale images as their input, and are not applicable to optic cup segmentation, which requires two–dimensional color images instead.

Segmentation applications designed specifically for monocular retinal fundus images are rare or not easily accessible. DCSeg [28], in its monocular operation mode, is a tool to annotate the optic disc and cup in this kind of image, while separate tools to manually segment the macula, the vessels and the optic disc have also been published [9, 36].

Stereo imaging has also been applied to ophthalmology in the past, as it provides three–dimensional information that monocular images lack. This information is more easily observed with stereoscopic visor devices [11, 27]. The application of stereo-photography to the specific task of the optic disc segmentation for the detection of glaucoma has been compared to more elaborate techniques, such as optical coherence tomography [52] and Heidelberg retina tomography [33]. Other works use stereo imaging to perform automatic OD segmentation. Such works include classic depth imaging [20], physiologically plausible features [14] or graph searching [6].

The optic cup is usually difficult to detect in monocular images, but it can be reliably segmented thanks to its three–dimensional shape. However, it cannot be clearly observed using plain stereo fundus image pairs, as an accurate matching between image pixels would require calculating their relative depth through spatial disparity. Previous works, such as DCSeg [28] in its stereo operation mode, rely on rapid image swapping to simulate depth, but the result can become unpleasant to the user. In contrast, with the tool presented in this work, the three–dimensional shape of the OC becomes clearly visible by overlapping the stereo pairs in a VR environment, without the need for image swapping. The users can also naturally reorient the scene by turning their head, to better perceive the limits of the cup region.

Virtual reality interfaces have been successfully applied to research in numerous medical fields, such as cognitive behavioral therapy [22, 39], magnetic resonance image segmentation [40], and medical education [23]. The main reason for the popularity of VR applications is that they allow users to navigate the scene naturally, by simply walking, glancing around and focusing on any details of interest, thus producing a level of immersion which surpasses the possibilities of classic mouse and keyboard interfaces.

Virtual reality also offers depth information and full 360° views of simulated scenes. Since VR visors occlude all peripheral visual stimuli, they provide an immersive experience without external distractions [18]. The human sensory system has been shown to attune better to virtual reality than to classic computer interfaces, since the sense of physical presence within a synthetic environment improves the understanding of the observed scenes [41, 50].

Regarding the application of virtual reality in the field of image segmentation, most of the published works have the objective of facilitating the visualization of previously segmented data volumes for the medical specialist [25]. Virtual reality technology has become very popular, and is the target of several commercial developments, such as InViewR [15] or the Nextmed project [1]. These tools are mainly used to visualize Magnetic Resonance Imaging (MRI) and Computerized Tomography (CT) medical data, and to facilitate the specialist’s decision making in diagnosis or surgery. Also, although to a lesser extent, there are virtual reality applications that allow the user to interactively participate in the segmentation process of MRI and CT data, such as the NUI-VR system [21].

In the field of Ophthalmology, the use of virtual reality for education and training is increasingly frequent [8, 10, 24, 43, 47, 53]. Virtual reality head-mounted displays are also an attractive technology for viewing intrasurgical optical coherence tomography (OCT) volumes [19, 34]. As can be deduced from the review presented in [31], the number of virtual reality applications in Ophthalmology is increasing: of the 77 works included in the review, 28 evaluated the use of virtual reality in ophthalmic surgical training/assessment and guidance, 7 in clinical training, 23 in diagnosis/screening, and 19 in treatment/therapy. Most studies focused on the validity and usability of these novel technologies.

The use of virtual reality in the problem of glaucoma diagnosis deserves special mention. In recent years, eye-tracking techniques have been used to diagnose visual field defects [26, 38, 44] and perimetries have been performed with virtual reality systems [37].

Regarding the use of virtual reality for the segmentation of the optic disc and cup in retinography images, which is the application proposed in this work, it is important to highlight that we have only found one similar development [45], where stereo images and OCT data are used to generate a 3D volume that, viewed in virtual reality, assists in the segmentation of the optic disc and cup. In this work, trained optometry students assessed the cup-to-disc ratio (CDR) of 10 eyes. The main objective of the study was to verify the influence of virtual reality technology in the training process, and its main finding was that the assessments in VR resulted in larger estimates of CDR compared to static stereoscopic assessments. Our system does not need OCT data to generate depth in VR, as it only uses the overlapping planes of the stereo pair, which can lead to a greater diffusion of the tool. In addition, the presented tool has been tested with 136 eyes of the RIMONE-DL dataset [13] and 30 eyes of the INSPIRE-stereo dataset [46].

3 Methods

The presented work was implemented as a Unity3D application for mobile devices with a virtual reality interface. The user interacts with the application via a gamepad. Mobile phones were chosen as the primary deployment platform due to their widespread usage, which allows the developed tool to reach a larger number of users (Fig. 2).

Fig. 2 Hardware setting for application control

The main objective of this tool is to help the user manually segment the optic cup, by letting them visually perceive the depth of this region through stereoscopic vision. A configuration based on the GoogleVR [35] library provides a pair of virtual stereo cameras and tracks the orientation of the user's head through the gyroscopes and accelerometers of the mobile device. Motion sickness is not expected, because the virtual segmentation environment is static. The users can change the point of view by moving their head around, but they cannot walk in the simulated environment.

In order to generate the perception of depth in the user, the stereo pair image (Fig. 3) is split into two sub–images, which are displayed as overlapping textures in the Unity 3D scene: a plane showing the left sub–image is visible only to the left camera, while another plane showing the right sub–image is visible only to the right camera (Fig. 4).

Fig. 3 Stereo retinal fundus image pair

Fig. 4 Overlapping planes in the Unity 3D scene generate the perception of depth in the user

With this configuration, when the users look at the overlapping planes, their left eye sees only the left sub–image and their right eye sees only the right sub–image. The disparity between the two views produces a perceived depth effect in the optic cup region.
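The splitting step can be illustrated with a minimal Python sketch. It is not the Unity implementation: it assumes the stereo pair is stored side by side in a single image (consistent with each sub–image having half the width of the combined image), and the function name `split_stereo_pair` and the list-of-rows image representation are hypothetical choices made for illustration.

```python
def split_stereo_pair(image):
    """Split a side-by-side stereo image into its left and right sub-images.

    `image` is a list of pixel rows (assumed layout: left view in the left
    half, right view in the right half). If the width is odd, the last
    column is dropped so that both halves have the same size.
    """
    half = len(image[0]) // 2
    left = [row[:half] for row in image]
    right = [row[half:2 * half] for row in image]
    return left, right


# Tiny 2x4 example: 'L' pixels belong to the left view, 'R' to the right view.
pair = [["L", "L", "R", "R"],
        ["L", "L", "R", "R"]]
left, right = split_stereo_pair(pair)
```

In the application, each half would then be used as the texture of the plane shown to the corresponding virtual camera.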

The tool has two operation modes: image selection and image segmentation, as described in Fig. 5. The user can switch between modes with the A button.

In the image selection mode, users can change the image that is being visualized with LB and RB buttons.

In the image segmentation mode, users can interact with a set of linked reference points, placed over the plane that displays the stereoscopic image. These reference points can be selected with the LB and RB buttons and freely repositioned over the plane with the left analog stick (LSB), while the line segments that link them adjust automatically to the new positions. Those segments are constructed using the Catmull-Rom spline [29] between each pair of reference points.
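As an illustration of how such a closed contour could be sampled, the following Python sketch evaluates a uniform Catmull-Rom spline through a cyclic list of reference points. The uniform parameterization, the function names and the number of samples per segment are assumptions made for this example; the text does not specify which Catmull-Rom variant the application uses.

```python
def catmull_rom_point(p0, p1, p2, p3, t):
    """Evaluate the uniform Catmull-Rom segment between p1 and p2 at t in [0, 1].

    p0 and p3 are the neighbouring control points that shape the tangents.
    """
    t2, t3 = t * t, t * t * t
    return tuple(
        0.5 * (2 * b + (c - a) * t
               + (2 * a - 5 * b + 4 * c - d) * t2
               + (-a + 3 * b - 3 * c + d) * t3)
        for a, b, c, d in zip(p0, p1, p2, p3)
    )


def closed_contour(points, samples_per_segment=8):
    """Sample a closed curve that passes through every reference point.

    Indices wrap around, so the last point is linked back to the first one.
    """
    n = len(points)
    contour = []
    for i in range(n):
        p0, p1, p2, p3 = (points[(i + k - 1) % n] for k in range(4))
        for s in range(samples_per_segment):
            contour.append(catmull_rom_point(p0, p1, p2, p3, s / samples_per_segment))
    return contour


# Four hypothetical reference points placed on the corners of a unit square.
contour = closed_contour([(0, 0), (1, 0), (1, 1), (0, 1)], samples_per_segment=4)
```

Because the curve interpolates the reference points, moving a single point only changes the segments that use it as one of their four control points.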

All segments together form a closed contour that delimits the segmentation area of the optic cup. When the user presses the B button, a binary mask is computed from this contour, as a black and white image shaped like one of the stereo sub–images, that is, having the same height and half the width as the original combined stereo pair image. All the pixels of the binary mask located within the optic cup contour are set to their highest value (255), while all outside pixels are set to 0. These segmented images are saved to the gallery folder of the mobile device.
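The mask computation can be sketched in Python as follows. This is only one possible way to rasterize a closed contour, using an even-odd point-in-polygon test at each pixel center; the function names are hypothetical and the actual application may fill the contour differently.

```python
def point_in_polygon(x, y, polygon):
    """Even-odd rule: count how many polygon edges a ray cast to the right crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # the edge straddles the horizontal line at y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def contour_to_mask(polygon, width, height):
    """Return a height x width mask with 255 inside the contour and 0 outside."""
    return [
        [255 if point_in_polygon(col + 0.5, row + 0.5, polygon) else 0
         for col in range(width)]
        for row in range(height)
    ]


# Hypothetical square contour on a 6x6 sub-image.
mask = contour_to_mask([(1, 1), (4, 1), (4, 4), (1, 4)], width=6, height=6)
```

The `width` and `height` arguments would correspond to the size of one stereo sub–image, so that the mask aligns pixel by pixel with it.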

Fig. 5 Segmentation process with the virtual reality app and gamepad interface

4 Results

RIMONE-DL [13] is an extensive dataset of manual segmentations of the optic cup and disc. A total of 136 stereo images from this dataset (85 healthy and 51 glaucoma; resolution of 2144x1424 pixels), captured using a non-mydriatic Kowa WX 3D stereo fundus camera, were used to test the presented work. These images were originally segmented by a medical specialist from the Hospital Universitario de Canarias (HUC) in Tenerife, with 20 years of experience. In order to see how the segmentations made with the presented tool compare with the original segmentations, the same specialist produced new segmentations using the tool presented in this work. Although the optic disc can most often be accurately delimited using only two–dimensional images, its delimitation was included in the experiments along with that of the optic cup, as a comparison baseline.

The metric used to compare the segmentations is the Intersection over Union (IoU) of the original segmentations of RIMONE-DL and the segmentations obtained with the stereo tool. Given two segmentations, A and B, the IoU value is computed as the number of overlapping pixels (\(A\cap B\)) divided by the number of pixels of the union of both sets (\(A\cup B\)). This value ranges from 0 (no overlapping pixels) to 1 (exactly the same pixels in both segmentations).

$$\begin{aligned} \text {IoU} = \frac{|A\cap B|}{|A\cup B|} \end{aligned}$$
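For reference, the IoU of two binary masks can be computed as in the following Python sketch. The function name and the convention of returning 1.0 when both masks are empty are assumptions made for this example.

```python
def iou(mask_a, mask_b):
    """Intersection over union of two binary masks of the same size.

    Any pixel value greater than 0 counts as part of the segmentation.
    """
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            in_a, in_b = a > 0, b > 0
            intersection += in_a and in_b
            union += in_a or in_b
    # Assumption: two empty masks are treated as identical segmentations.
    return intersection / union if union else 1.0


# Two hypothetical 2x3 masks sharing one pixel out of three segmented pixels.
mask_a = [[255, 255, 0],
          [0, 0, 0]]
mask_b = [[0, 255, 255],
          [0, 0, 0]]
score = iou(mask_a, mask_b)  # one overlapping pixel over three in the union
```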

Figure 6a and b show the box plots of the IoU of the segmentations of optic cup and disc, split into healthy eyes and patients with glaucoma.

Fig. 6 Intersection over union (IoU) boxplots of the segmentations done with the stereo application and the ones present in the RIMONE-DL dataset

Figures 7, 8, 9 and 10 show some examples of comparisons between segmentations from RIMONE-DL (red highlighted pixels) and those done with the presented stereo tool (blue highlighted pixels). The intersections of both are displayed in magenta.

Fig. 7 Disc segmentation comparison in glaucoma images

Fig. 8 Disc segmentation comparison in healthy images

Fig. 9 Cup segmentation comparison in glaucoma images

Fig. 10 Cup segmentation comparison in healthy images

Apart from evaluating the IoU values, we also evaluated the impact that the use of this tool has on the cup-to-disc ratio (CDR), which is one of the main indicators of glaucoma.
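As an illustration, a CDR can be derived from a pair of binary masks as in the Python sketch below. The text does not state which CDR definition was used, so this example assumes the vertical CDR (ratio of the vertical extents of cup and disc); an area-based or horizontal ratio would be computed analogously. The function names are hypothetical.

```python
def vertical_extent(mask):
    """Number of rows spanned by the segmented region (0 if the mask is empty)."""
    rows = [i for i, row in enumerate(mask) if any(v > 0 for v in row)]
    return rows[-1] - rows[0] + 1 if rows else 0


def vertical_cdr(cup_mask, disc_mask):
    """Vertical cup-to-disc ratio (assumed definition) from two binary masks."""
    disc_height = vertical_extent(disc_mask)
    if disc_height == 0:
        raise ValueError("disc mask is empty")
    return vertical_extent(cup_mask) / disc_height


# Hypothetical masks: the disc spans 4 rows and the cup spans the 2 central rows.
disc = [[255, 255], [255, 255], [255, 255], [255, 255]]
cup = [[0, 0], [255, 0], [0, 255], [0, 0]]
cdr = vertical_cdr(cup, disc)
```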

Figure 11a and b show the distribution of the CDR obtained from the original segmentations of RIMONE-DL and those obtained from the stereo tool for each category: normal and glaucomatous images.

Fig. 11 Cup-to-disc ratio (CDR) histograms of the segmentations done with the stereo application and the ones present in the RIMONE-DL dataset. Glaucomatous (Red) and Healthy (Green) as labeled in the RIMONE-DL dataset

We conducted another experiment in which we tested the stereo tool with the INSPIRE-stereo dataset [46]. In this experiment, a second expert collaborated in segmenting the images; this expert is denoted as expert 2, while expert 1 is the specialist who conducted the experiments with RIMONE-DL. This dataset contains 30 stereo images (resolution of 1536x1019 pixels) of glaucomatous eyes and, as far as we know, it is the only other public stereo dataset of retinal images apart from RIMONE-DL. Figure 12 shows the intersection over union of the cup segmentations done by both experts, and Fig. 13a and b show a comparison of the CDRs obtained both in 2D and with the stereo tool.

Fig. 12 Intersection over union (IoU) boxplots of the cup segmentations done in 2D and with the stereo tool by both experts with the images of the INSPIRE-stereo dataset

Fig. 13 Cup-to-disc ratio (CDR) boxplots of the segmentations done by both experts with the images present in the INSPIRE-stereo dataset, using both the stereo application and 2D visualization of the stereo pair

5 Discussion

As the box plots in Fig. 6b show, there is no considerable difference between the original optic disc segmentations of the RIMONE-DL dataset and the ones done with the stereo tool (median IoU of 0.87 in glaucomatous images and 0.84 in healthy images). The reason is that the edge of the disc is formed by the axons of the ganglion cells as they exit the back of the eye, producing a slight relief variation that is easy to detect in both kinds of images. However, the specialist showed a tendency to segment slightly outwards in the stereo application and slightly inwards in the original RIMONE-DL dataset segmentations (Figs. 7 and 8), because the former allows for a clearer view of the bending point of the axons forming the optic nerve. This occurs in both healthy and glaucomatous patients.

However, the segmentation of the optic cup shows an appreciable difference between both methods, both for healthy and glaucomatous patients, as seen in Fig. 6a (median IoU of 0.81 in glaucomatous images and 0.58 in healthy images). As a general rule, larger cups show clearer borders and are easier to delimit both in 2D and with the stereo tool, which decreases the segmentation discrepancy. However, the difference becomes noticeable in images in which the cup edge looks blurry (Fig. 9).

Greater differences arise with healthy patients, for whom the optic cup is small and very flat (Fig. 10). In these cases, delimiting the cup in a two–dimensional image becomes very difficult, but the stereo tool provides enough information to overcome this obstacle. In addition, in areas with blood vessels where there is no visible color gradient, three–dimensional data generally allows for a more accurate segmentation, as stereo vision allows the user to determine the depth position of the vessels and whether they are located within the optic cup or not.

Regarding the analysis of the CDR histograms, it can be seen that there is a variation between the classes (healthy in green and glaucoma in red) obtained both with the stereo tool and the original segmentations of RIMONE-DL. Table 1 shows the average and standard deviation of the CDRs. A paired Student’s t-test was performed to determine whether the means of the two groups are significantly different from each other. This was done for both healthy and glaucomatous samples. The results obtained are a P–value of 0.003 for healthy images and a P–value of 0.001 for glaucoma images. These values indicate that there is a statistically significant difference between the two groups that are compared.
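The paired test statistic can be sketched in Python using only the standard library. The CDR values below are hypothetical placeholders, not data from this study, and the function name is an assumption. Obtaining the P–value additionally requires the cumulative distribution of Student's t, which is available, for example, through `scipy.stats.ttest_rel`.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Return the paired t statistic and its degrees of freedom.

    `first` and `second` hold measurements of the same images under two
    conditions (e.g. CDR from the stereo tool and from the original masks).
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two samples of equal length with at least 2 pairs")
    diffs = [a - b for a, b in zip(first, second)]
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample standard deviation, n - 1).
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff / std_err, len(diffs) - 1


# Hypothetical CDR values for four images under both conditions.
cdr_stereo = [0.40, 0.50, 0.45, 0.60]
cdr_original = [0.35, 0.42, 0.44, 0.50]
t_value, dof = paired_t_statistic(cdr_stereo, cdr_original)
```

The test is paired because both sets of segmentations come from the same images, so the statistic is built from per-image differences rather than from two independent groups.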

We believe that this significant difference is due to the fact that with the stereo tool, the expert perceives 3D information that enriches the 2D information in the cup segmentation process which, in turn, will enable better diagnostics and generate more accurate ground truth for automated diagnostic procedures.

Table 1 Average and standard deviation of CDR in healthy and glaucomatous images using the stereo tool and the original segmentations present in the RIMONE-DL dataset

In order to get more insight into the differences observed in Fig. 11a and b, the expert was asked to determine which glaucoma images were the most difficult to categorize because they were in an incipient state. The CDR values corresponding to these images are considered as a new class for the purpose of highlighting where the CDR varies the most.

Figure 14a and b show the distribution of the CDRs as in Fig. 11a and b, but including the images of incipient glaucoma as a separate category. The greater variation observed in this category could be due to the difficulty of appreciating the contour of the cup in monocular images compared to the visualization through the stereo tool. This finding leads us to think that the tool helps to segment these images better.

Fig. 14 Cup-to-disc ratio (CDR) histograms of the segmentations done with the stereo application and the ones present in the RIMONE-DL dataset. Advanced glaucoma (Red), healthy (Green) and Incipient glaucoma (Dashed lines)

Regarding the results obtained with the INSPIRE-stereo dataset, both experts show lower IoU values with greater variation (Fig. 12) than those obtained by expert 1 on the glaucomatous images of the RIMONE-DL dataset (Fig. 6a).

In Fig. 13, it can be seen that the CDR values obtained with the stereo tool are higher than those obtained in 2D for both experts. However, this increase is more pronounced in the case of expert 1. In general, the 2D and stereo segmentations of expert 2 are more similar to each other than those of expert 1. Nevertheless, it must be taken into account that the INSPIRE-stereo dataset contains glaucomatous images only, so the CDR variation is not expected to be as large as in images of healthy patients, where the optic cup region may be more difficult to segment.

6 Conclusion

The presented interactive tool helps to increase the accuracy of the segmentation of the optic cup, by applying a virtual reality environment to stereo retinal fundus images. This is achieved by enabling a natural perception of the three–dimensional structure of the cup region.

In order to compare the segmentations produced by the developed tool with those obtained in 2D, several experiments were carried out by two experts on two public stereo retinal image datasets.

As a result of these experiments, a significant variation of the cup–to–disc ratio (CDR) was observed. In addition, the experts were able to generate more accurate segmentations, by taking advantage of the 3D information provided by the stereo tool, in particular regarding the optic cups of healthy eyes and in areas with blood vessels.