1 Introduction

Diabetes is reaching epidemic proportions worldwide, especially in Asia, due to fast lifestyle changes and genetic factors. Today an estimated 11.6 % of the Chinese population has type 2 diabetes. Diabetic retinopathy is also the main cause of new-onset blindness in the working population, leading to high societal costs. Early detection is the key to prevention and successful treatment of these forms of blindness. However, many cases still go unnoticed and are not treated in time, especially in rural areas.

To set up a screening program for the detection of early signs of diabetic retinopathy (DR), glaucoma and age-related macular degeneration (AMD), a Sino-Dutch consortium was formed and the project RetinaCheck was defined with the following four phases: (1) develop innovative algorithms for automated and quantitative detection of relevant biomarkers, (2) set up a significant validation study correlating the imaging data with relevant clinical metadata, (3) roll out a screening infrastructure in the province of Liaoning, Northeast China, and (4) establish a sustainable commercial infrastructure. The partners include the departments of Biomedical Engineering at Eindhoven University of Technology (TU/e, the Netherlands) and Northeastern University (NEU, Shenyang, China), which develop the computer-aided diagnosis (CAD) software, the clinical partners He University Eye Hospital (HUEH, Shenyang, China) and China Medical University (Shengjing Hospital, Shenyang, China) (Fig. 1), and the fundus camera manufacturer i-Optics (The Hague, the Netherlands).

This paper reports on phase 1 only, focusing in detail on algorithm design. The very important validation phase is ongoing, and will be reported in a forthcoming paper.

Due to its low cost, wide availability, high resolution and ease of use, optical imaging of the retinal fundus is often the method of choice for screening applications. Today, non-mydriatic cameras (no pupil dilation) give good quality and high-resolution images. Quantitative measurement and analysis of geometric attributes of the retinal vasculature is highly informative for diagnosis and monitoring of diabetes at early stages [41]. The many imaging biomarkers extracted in this project will also be exploited for the early detection and monitoring of other diseases, such as glaucoma, age-related macular degeneration (AMD), and cardiovascular and neurodegenerative diseases.

Fig. 1

Clinical data acquisition at the Department of Endocrinology of Shengjing Hospital, Shenyang, China. The camera is an EasyScan scanning laser fundus camera (i-Optics Inc., The Hague, the Netherlands). Extensive clinical metadata are co-recorded for validation.

Several other early DR signs can be measured, such as nerve damage in the cornea with confocal laser microscopy, or changes in retina neural tissue layer thickness with optical coherence tomography (OCT), but these methods are more costly and more labor intensive, especially given the projected huge-scale screening and the much lower availability of OCT in China.

This paper gives an overview of the current brain-inspired applications in the RetinaCheck project. The paper mainly focuses on vessel analysis, while a forthcoming paper will focus on the automated detection of micro-aneurysms and background diagnostic features, such as drusen and exudates.

The paper is organized as follows: first a short introduction is given to the physiological evidence of multi-scale and multi-orientation processing in the visual system in Sect. 2, and a theoretical model for brain-inspired multi-orientation computing in Sect. 3. Then a series of recently developed computer-aided diagnosis (CAD) algorithms is discussed in Sect. 4, mostly based on the mechanism of invertible multi-orientation scores. The paper ends with a description of the current validation study in Sect. 5 and a conclusion in Sect. 6.

2 Brain-inspired computer vision

Modern brain imaging techniques have revolutionized brain research: optical imaging methods such as voltage sensitive dye imaging, calcium intrinsic imaging and optogenetics, as well as structural and functional MRI techniques have revealed intricate details of brain function and structure, especially of the visual system, the best studied brain area today. Many models of the functional mechanisms in early vision have been proposed. This paper focuses on the geometric approach, starting from the multi-scale and multi-orientation structure found in the early stages of vision.

2.1 Multi-scale analysis

Multi-scale analysis is now well established [15, 44, 58]. The center-surround receptive fields in the retina are the first multi-scale sampling step. Our model for V1’s ‘simple cells’ (so called by Hubel and Wiesel) is that these have the function of multi-scale, regularized spatial differential operators, i.e., Gaussian derivatives, possibly up to fourth order [44, 58–60]. The simultaneous sampling of the multi-scale differential structure leads to ‘deep structure’ analysis [43], with applications such as edge focusing, hierarchical top points, and SIFT and SURF keypoints. Nonlinear adaptive multi-scale analysis is developed in the rich field of geometry-driven diffusion [32, 62], which introduced nonlinear PDEs and energy minimizing variational methods into this field of evolutionary geometric computing.

2.2 Multi-orientation analysis

Hubel and Wiesel [37] discovered that the receptive fields in cat’s striate cortex have a strong orientation-selective property (see Fig. 2). A so-called cortical hyper-column, with the characteristic pinwheel structure of equi-orientation lines radiating from a central singularity, can be interpreted as a visual pixel computer, neatly decomposed into a complete set of orientations.

Synaptic physiological studies of horizontal pathways in cat striate cortex show that neurons with aligned receptive field sites excite each other on the cortex between different pinwheels, in an elongated area and over long distances, creating long-range contextual connections [2]. Therefore, the visual system not only constructs the aforementioned score of local orientations, but also accounts for long-range context and alignment by excitation and inhibition a priori, which can be modeled by so-called association fields [15, 20, 27], and left-invariant PDEs and ODEs for contour enhancement and contour completion directly on the score [23, 61].

Fig. 2

Left voltage sensitive dye measurement of the visual cortex of the tree shrew, exhibiting the columnar organization with the typical multi-orientation pinwheel structure. From [40]. Right Orientation score model with orientation on the vertical axis, and cortical columns with horizontal contextual (association field) connections. From [66]

This multi-orientation framework [14, 15, 21, 31] also allows us to generically deal with crossings, as we will show with the application to vessel tracking [6, 8, 9, 52] and segmentation [1, 33, 65]. Moreover, due to the neat organization of image data on the Lie-group SE(2), we are able to design effective detection algorithms [6, 7], geometric feature analysis techniques such as bifurcation detection/analysis [55] and local vessel curvature analysis [11], and enhancement methods [31, 66].

3 Theory

Motivated by this organization, so-called orientation scores are constructed by lifting all elongated structures (in 2D images) along an extra orientation dimension, see Fig. 2. Similar to the perceptual organization of orientation in the visual cortex, a 2D orientation score is an object that maps each 2D position and orientation \((x, y, \theta )\) to a complex scalar. So the original 2D image domain can be extended to the score domain. A great advantage is that it can deal with multiple orientations per position, and the extra dimension enables new techniques for, e.g., contextual geometric reasoning and crossing-preserving enhancement [31].

3.1 The Euclidean motion group SE(2)

The domain \(\mathbb {R}^2 \rtimes S^1\) (2 spatial dimensions, and one orientation dimension) of an orientation score can be identified with the Euclidean motion group SE(2), equipped with the group product

$$\begin{aligned} gg'=(\mathbf x ,\theta )(\mathbf x ',\theta ')=(\mathbf x +\mathbf R _\theta \cdot \mathbf x ',\theta +\theta '), \end{aligned}$$

for all \(g,g' \in \hbox {SE}(2)\), and with \(\textstyle \mathbf R _\theta = \left( \begin{array}{ccc} \cos \theta &{}\quad -\sin \theta \\ \sin \theta &{}\quad \cos \theta \\ \end{array} \right) \) a counterclockwise rotation over angle \(\theta \).
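
The group product can be made concrete with a few lines of code. The following is a minimal sketch, not part of the RetinaCheck software; the function names `se2_product`, `se2_inverse` and `close` are illustrative assumptions. It shows that a rotation followed by a translation differs from the same operations in reverse order, which is the non-commutativity used in Sect. 3.3.

```python
import math

def se2_product(g, h):
    """Group product (x, theta)(x', theta') = (x + R_theta x', theta + theta')."""
    (x, y, th), (xp, yp, thp) = g, h
    c, s = math.cos(th), math.sin(th)
    return (x + c * xp - s * yp, y + s * xp + c * yp, (th + thp) % (2 * math.pi))

def se2_inverse(g):
    """Inverse element: (x, theta)^(-1) = (-R_theta^(-1) x, -theta)."""
    x, y, th = g
    c, s = math.cos(th), math.sin(th)
    return (-(c * x + s * y), -(-s * x + c * y), (-th) % (2 * math.pi))

def close(g, h, tol=1e-9):
    """Compare two group elements; angles are compared modulo 2*pi."""
    dth = (g[2] - h[2] + math.pi) % (2 * math.pi) - math.pi
    return abs(g[0] - h[0]) < tol and abs(g[1] - h[1]) < tol and abs(dth) < tol

g = (1.0, 0.0, math.pi / 2)   # translation by (1, 0) combined with a 90 degree rotation
h = (0.0, 2.0, 0.0)           # pure translation by (0, 2)
gh = se2_product(g, h)        # h's translation is rotated by g before being added
hg = se2_product(h, g)        # g's translation is added unrotated
identity = se2_product(g, se2_inverse(g))
```

Here `gh` and `hg` have the same orientation but different positions, so the two elements do not commute.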

3.2 Invertible orientation scores on SE(2)

An orientation score \(U_f:{\hbox {SE}}(2) \rightarrow \mathbb {C}\) is obtained by convolving an input image f with a specially designed, anisotropic wavelet \(\psi \):

$$\begin{aligned} U_f(\mathbf {x},\theta ) = \left( \overline{\psi _\theta } \star f\right) (\mathbf {x},\theta ) = \int _{\mathbb {R}^2} \overline{\psi \left( R_\theta ^{-1}(\mathbf {y}-\mathbf {x})\right) } \; f(\mathbf {y}) \mathrm {d}\mathbf {y} \end{aligned}$$

where \(R_\theta \) denotes a 2D counterclockwise rotation matrix and \(\psi _\theta (\mathbf {x}) = \psi (R_\theta ^{-1}\mathbf {x})\). Exact image reconstruction is achieved by

$$\begin{aligned} f = \mathcal {F}^{-1}\left[ M_\psi ^{-1} \mathcal {F}\left[ \mathbf {x}\mapsto \frac{1}{2\pi }\int _0^{2\pi }\left( \psi _\theta * U_f\right) (\mathbf {x},\theta )\mathrm {d}\theta \right] \right] \end{aligned}$$

where \(\mathcal {F}[\cdot ]\) represents the unitary Fourier transform on \(\mathbb {L}_2(\mathbb {R}^2)\) and \(M_\psi \) is given by \(\int _0^{2\pi }|\mathcal {F}\left[ \psi _\theta \right] |^2 \mathrm {d}\theta \).

Well-posedness of the reconstruction is controlled by \(M_\psi \) [10, 24]. The function \(M_\psi \): \(\mathbb {R}^{2} \rightarrow \mathbb {R}^{+}\) is calculated by

$$\begin{aligned} M_\psi =\int \nolimits _{0}^{2\pi }\overline{\mathcal {F}[\psi _{\theta }]}\,\mathcal {F}[\psi _{\theta }]\,\mathrm {d}\theta =\int \nolimits _{0}^{2\pi }\left| \mathcal {F}[\psi _{\theta }]\right| ^2\mathrm {d}\theta . \end{aligned}$$

The function \(M_\psi \) provides a stability measure of the inverse transformation. Theoretically, reconstruction is well-posed as long as \(0< \delta< M_\psi< M < \infty \), where \(\delta \) is arbitrarily small, since then the condition number of the orientation score transformation \({\mathcal {W}}_\psi \) is bounded by \(M\delta ^{-1}\) (for a detailed discussion see [22]).

Wavelets that meet this criterion are the so-called cake wavelets described by [6, 21, 31]. They uniformly cover the Fourier domain up to a radius of about the Nyquist frequency \(\rho _n\) (see Fig. 3). They have the advantage over other oriented wavelets (such as Gabor wavelets) that they allow for a stable inverse transformation \({\mathcal {W}}_\psi ^*\) from the orientation score back to the image. As such, cake wavelets ensure that no data-evidence is lost during the forward and backward transformation. The procedure for creating cake wavelets is illustrated in Fig. 3. In detail, the proposed wavelet takes the form

$$\begin{aligned} \psi (\mathbf {x}) = \left( \mathcal {F}^{-1}\left[ \mathbf {\omega }\mapsto \tilde{\psi }(\rho \cos \varphi ,\rho \sin \varphi )\right] \right) (\mathbf {x})\; G_{\hbox {s}}(\mathbf {x}) \end{aligned}$$

where \(G_{\hbox {s}}\) is an isotropic Gaussian window in the spatial domain and \((\rho ,\varphi )^T\) denote polar coordinates in the Fourier domain, i.e., \(\mathbf {\omega }= (\rho \cos \varphi ,\rho \sin \varphi )^T\). The Fourier wavelet \(\tilde{\psi }(\mathbf {\omega }) = A(\varphi )B(\rho )\) has an angular component \(A(\varphi )\) resembling a ‘piece of cake’ (Fig. 3), and a Gaussian weighted radial component \(B(\rho )\):

$$\begin{aligned} \begin{array}{lr} A(\varphi ) = \left\{ \begin{array}{ll} B^k\left( \frac{(\varphi \, \mathrm {mod}\, 2\pi )-\pi /2}{s_\theta }\right) &{}\quad {\mathrm {if}}\;\rho > 0\\ \frac{1}{N_\theta } &{}\quad {\mathrm {if}}\;\rho = 0 \end{array} \right. &{} \\ B(\rho ) = {\hbox {e}}^{-\left( \frac{\rho }{t}\right) ^2} \sum \limits _{i=0}^N \frac{(\rho /t)^{2i}}{i!} \end{array} \end{aligned}$$

where \(B^k(x)\) denotes the kth order B-spline, \(N_\theta \) is the number of samples in the orientation direction and \(s_\theta =2\pi /N_\theta \) is the angular step size. The function \(B(\rho )\) is a Gaussian multiplied with the Taylor series of its inverse up to order N to enforce faster decay. The parameter t is given by \(t^2=2 \hat{\rho }^2/(1+2N)\), where \(\hat{\rho }\) is the inflection point of \(B(\rho )\).
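
The role of the inflection point can be checked numerically. The sketch below evaluates the radial window \(B(\rho )\) with \(t^2=2\hat{\rho }^2/(1+2N)\) and verifies by finite differences that the curvature changes sign at \(\hat{\rho }\). The helper names `radial_window` and `second_derivative` are illustrative, and the Nyquist frequency is assumed to be \(\pi \) radians per pixel (unit pixel spacing); the values \(\hat{\rho }=0.8\rho _n\) and \(N=8\) follow the caption of Fig. 3.

```python
import math

def radial_window(rho, rho_hat, N):
    """B(rho) = exp(-(rho/t)^2) * sum_{i=0}^{N} (rho/t)^(2i) / i!,
    with t^2 = 2 * rho_hat^2 / (1 + 2N)."""
    t2 = 2.0 * rho_hat ** 2 / (1 + 2 * N)
    u = rho ** 2 / t2
    return math.exp(-u) * sum(u ** i / math.factorial(i) for i in range(N + 1))

def second_derivative(f, x, h=1e-3):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h ** 2

rho_n = math.pi          # assumed Nyquist frequency for unit pixel spacing
rho_hat = 0.8 * rho_n    # inflection point, as in Fig. 3
N = 8                    # order of the truncated Taylor series, as in Fig. 3

B = lambda rho: radial_window(rho, rho_hat, N)

curv_at = second_derivative(B, rho_hat)            # approximately zero
curv_below = second_derivative(B, rho_hat - 0.1)   # negative: flat plateau bending down
curv_above = second_derivative(B, rho_hat + 0.1)   # positive: tail levelling off
```

The window stays close to 1 for low frequencies and bends over around \(\hat{\rho }\), which is how the cake wavelets cover the Fourier domain almost uniformly up to the chosen radius.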

As depicted in Fig. 3, the symmetric real part of the kernel picks up lines, whereas the asymmetric imaginary part responds to edges. In the context of vessel filtering, the real part is of primary interest, whereas vessel tracking in the orientation score additionally makes use of the imaginary part [6]. Elongated structures involved in a crossing can now be disentangled for crossing-preserving analysis.

Fig. 3

a Real and b imaginary part of the cake kernel in the spatial domain, c Fourier contours at 70 % of the maximum for all orientations and d \(B(\rho )\) with \(\hat{\rho }=0.8\rho _n\), the Nyquist frequency \(\rho _n\) and \(N=8\). From [6]

3.3 Left-invariant Gaussian derivatives in SE(2)

Theoretically, because of the curved geometry of orientation scores, it is incorrect to take the derivatives in orientation scores using the \(\{\partial _x,\partial _y,\partial _\theta \}\) derivative frame (we use shorthand notation \(\partial _i=\frac{\partial }{\partial i}\)) [21]. Therefore, left-invariant differential operators

$$\begin{aligned} \{\partial _\xi ,\partial _\eta ,\partial _\theta \}= \{ \cos \theta \partial _x+\sin \theta \partial _y, -\sin \theta \partial _x+\cos \theta \partial _y,\partial _\theta \} \end{aligned}$$
(1)

are used in SE(2). The \(\partial _\xi \) and \(\partial _\eta \) are the spatial derivatives tangent and orthogonal to the orientation \(\theta \). It is important to mention that not all the left-invariant derivatives commute, e.g., \(\partial _\theta \partial _\xi U \ne \partial _\xi \partial _\theta U\) [22, 31]. Mathematically this is due to the fact that rotations and translations do not commute. We have the following commutators in the Lie-algebra of SE(2):

$$\begin{aligned} \begin{array}{l} [\partial _{\theta },\partial _{\xi }]=\partial _{\eta }, \quad [\partial _{\theta },\partial _{\eta }]= -\partial _{\xi }, \quad [\partial _{\xi },\partial _{\eta }]= 0, \\ \end{array} \end{aligned}$$
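
As a short worked check of the first relation, apply both orderings to a smooth score U and use the definition in Eq. (1); the only terms that survive are those in which \(\partial _\theta \) acts on the \(\theta \)-dependent coefficients:

$$\begin{aligned} {[}\partial _\theta ,\partial _\xi ]U&= \partial _\theta \left( \cos \theta \,\partial _x U+\sin \theta \,\partial _y U\right) -\left( \cos \theta \,\partial _x+\sin \theta \,\partial _y\right) \partial _\theta U \\&= -\sin \theta \,\partial _x U+\cos \theta \,\partial _y U = \partial _\eta U. \end{aligned}$$

The same computation with \(\partial _\eta \) gives \(-\cos \theta \,\partial _x U-\sin \theta \,\partial _y U=-\partial _\xi U\), and \([\partial _\xi ,\partial _\eta ]=0\) because both operators are combinations of \(\partial _x\) and \(\partial _y\) whose coefficients do not depend on \(\mathbf x \).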

Suitable combinations of derivatives have been widely used to pick up differential invariant features like edges, ridges, corners and so on [59]. However, obtaining derivatives directly is an ill-posed problem. Therefore, we regularize the orientation scores via convolutions with Gaussian kernels \(G_{\sigma _s,\sigma _o}(\mathbf x ,\theta )=G_{\sigma _s}(\mathbf x )G_{\sigma _o}(\theta )\), with a d-dimensional Gaussian given by

$$\begin{aligned} G_{\sigma }(\mathbf x )=(2\pi \sigma ^2)^{-d/2}\, {\hbox {e}}^{-\frac{\Vert \mathbf x \Vert ^2}{2\sigma ^2}} \end{aligned}$$

and where \(\sigma _s\) and \(\sigma _o\) are used to define the spatial scale \(\frac{1}{2}\sigma _s^2\) and orientation scale \(\frac{1}{2}\sigma _o^2\) of the Gaussian kernel. Note that the spatial Gaussian distribution \(G_{\sigma _s}: \mathbb {R}^{2} \rightarrow \mathbb {R}^{+}\) must be isotropic to preserve commutator relations of the SE(2) group for scales \(\sigma _{s}>0\), i.e., to preserve left-invariance.

Detailed numerical approaches for linear left-invariant diffusions on SE(2) have been developed in [66].

4 Computer-aided diagnosis algorithms

The pipeline of functions in the developed CAD system is the following:

  1. Quality assessment
  2. Masking and normalization
  3. Optic nerve head detection
  4. Vessel enhancement and segmentation
    (a) Left-invariant Gaussian derivatives in SE(2)
    (b) Multi-scale and multi-orientation for SLO
    (c) Crossing-preserving multi-scale vesselness
  5. Multi-orientation vessel tracking
  6. Arterio-venous ratio
  7. Fractal dimension
  8. Geometric vessel features
    (a) Bifurcation/crossing detection
    (b) Curvature/tortuosity

Each of the steps will be discussed below in the respectively numbered sections.

4.1 Automatic quality assessment

Sometimes retinal images have a low quality, e.g., due to cataract or other pathology. To exclude non-diagnostic images automatically in the high-volume screening process, a method for image quality verification is developed [26], based on [50]. The supervised method is based on the assumption that sufficient image structure according to a pre-defined distribution must be available. Geometric differential invariants (expressed in 2D gauge coordinates [58]) up to second order and at 4 different scales are used to develop response vectors. Different classifiers are trained with different combinations of features on 100 normal and 100 low-quality images. The ground truth for normal or low-quality images was specified by two expert ophthalmologists. Combining the image structure clusters with RGB color histogram features, the Random Forest classifier proved to be the best classifier, with an area under the curve (AUC) of the receiver operating characteristic (ROC) of 0.984 and an accuracy of 0.91.

4.2 Masking and normalization

All pixels outside the camera field-of-view are masked, according to the camera type. Retinal images often suffer from non-uniform illumination and varying contrast, which may affect the later detection process. We exploit the luminosity and contrast normalization procedures proposed by Foracchia et al. [28].

4.3 Optic nerve head detection

The optic nerve head (ONH) or optic disk is a key landmark in retinal images, and many methods for detection have been proposed [4, 18, 45, 47, 54, 64]. For an extensive and recent overview see [51]. Correct segmentation of the ONH and its rim provides an important biomarker for glaucoma identification [38], and is needed for establishing a metric for regions-of-interest on the retina [36].

Conventional fundus images show the ONH as a bright disk-like feature. However, scanning laser ophthalmoscopy (SLO) cameras generally show dark regions with considerably less contrast. In the presence of large pathologies, classical approaches typically show decreased performance, or fail. Better performance is obtained by including contextual information, e.g., by incorporating the characteristic pattern of large blood vessel arches in the upper and lower retina emerging from the ONH [45, 46, 64], but this comes at higher computational costs.

Exploiting the extra dimensions of orientation scores, we have developed a template matching application via cross-correlation including local orientations [7, 10] (see also [16]). We can now match patterns of orientation distributions, rather than pixel intensities.

The templates are constructed from (a) a disk filter, (b) a model of the vessels radiating from the ONH, and (c) through minimization of a suitable energy functional. In Ref. [7] we found that, in contrast to model-based templates [10] (e.g., disk shapes or vessel indicator functions), the best performing templates are trained by the minimization of an energy functional, consisting of a term where the desired inner product responses are trained, and a regularization term that ensures stability and smoothness of the template. For 2D spatial image templates t, we minimize

$$\begin{aligned} E(t)= & {} \underbrace{\sum _{i=1}^P \left( \left( t,\hat{f}_i \right) _{\mathbb {L}_2(\mathbb {R}^2,m \, {\hbox {d}}\tilde{\mathbf {x}})} - y_i \right) ^2 }_{\mathrm{data term}} \\&\quad +\,\lambda \underbrace{\iint \limits _{\mathbb {R}^2} ||\nabla t ||^2 {\hbox {d}}x{\hbox {d}}y }_{\mathrm{regularization \, term}}, \end{aligned}$$

where \(\hat{f}_i\) is one of P (\(P=100\)) normalized positive patches with ONH (with \(y_i = 1\)), or negative patches without ONH (with \(y_i = 0\)) with typically \(\lambda = 10^{-1.5}\). The regularization term enforces smoothness by punishing the squared gradient magnitude \(||\nabla t ||^2\), and prevents over-fitting. Examples of both positive and negative training patches are given in Fig. 4.
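
To illustrate how the data term and the smoothness term interact, the following toy sketch trains a one-dimensional template on four made-up five-pixel "patches". It is only an analogue of the functional above under loud assumptions: the measure m is taken as 1, the gradient is replaced by forward differences, the template is stored per pixel rather than in a B-spline basis, and plain gradient descent is swapped in for the conjugate gradient solver. All names (`energy`, `gradient`, `train_template`) and all numbers except \(\lambda = 10^{-1.5}\) are hypothetical.

```python
def inner(t, f):
    """Discrete inner product (t, f), assuming m = 1."""
    return sum(a * b for a, b in zip(t, f))

def energy(t, patches, labels, lam):
    """Data term plus lam times the squared forward-difference gradient."""
    data = sum((inner(t, f) - y) ** 2 for f, y in zip(patches, labels))
    reg = sum((t[k + 1] - t[k]) ** 2 for k in range(len(t) - 1))
    return data + lam * reg

def gradient(t, patches, labels, lam):
    """Gradient of the energy with respect to the template samples."""
    g = [0.0] * len(t)
    for f, y in zip(patches, labels):
        r = inner(t, f) - y
        for k in range(len(t)):
            g[k] += 2.0 * r * f[k]
    for k in range(len(t) - 1):
        d = t[k + 1] - t[k]
        g[k] -= 2.0 * lam * d
        g[k + 1] += 2.0 * lam * d
    return g

def train_template(patches, labels, lam=10 ** -1.5, step=0.05, iters=2000):
    """Plain gradient descent from the zero template (not the authors' solver)."""
    t = [0.0] * len(patches[0])
    for _ in range(iters):
        g = gradient(t, patches, labels, lam)
        t = [tk - step * gk for tk, gk in zip(t, g)]
    return t

# Made-up training data: positives are bright in the center, negatives are not.
positives = [[0.0, 0.5, 1.0, 0.5, 0.0], [0.0, 0.4, 0.9, 0.6, 0.1]]
negatives = [[1.0, 0.5, 0.0, 0.5, 1.0], [0.9, 0.1, 0.0, 0.2, 0.8]]
patches = positives + negatives
labels = [1.0, 1.0, 0.0, 0.0]

template = train_template(patches, labels)
responses = [inner(template, f) for f in patches]
```

After training, the responses to the positive patches are close to 1 and those to the negative patches close to 0, while larger values of `lam` trade this fit for a smoother template.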

Fig. 4

Exemplary retinal image patches used for template training. Top row positive patches. Bottom row negative patches. From [7]

Fig. 5

Automatic vessel segmentation examples. a original (green channel), b enhanced, c segmented, d ground truth. Top row DRIVE database example (sensitivity \(=\) 0.8308, specificity \(=\) 0.9824 and accuracy \(=\) 0.9635). Bottom row STARE database (sensitivity \(=\) 0.8587, specificity \(=\) 0.9806 and accuracy \(=\) 0.9676). From [65]

For the orientation score template T, a similar functional is minimized:

$$\begin{aligned} \mathcal {E}(T)= & {} \underbrace{ \sum \limits _{i=1}^P ((T,\hat{U}_{f_i})_{\mathbb {L}_2({\hbox {SE}}(2),M\, {\hbox {d}}\mathbf {x}{\hbox {d}}\theta )} - y_i)^2 }_{\text {data term}} \\&\quad +\,\lambda \underbrace{ \int \!\!\int \!\!\int \limits _{{\hbox {SE}}(2)} ||\nabla T ||^2_D \; {\hbox {d}}x{\hbox {d}}y{\hbox {d}}\theta }_{\text {regularization term}}, \;\;\;\;\;\;\;\; \text {with}\\&\quad \int \!\!\int \!\!\int \limits _{{\hbox {SE}}(2)} ||\nabla T ||^2_D \; {\hbox {d}}x{\hbox {d}}y{\hbox {d}}\theta \\= & {} \int \!\!\int \!\!\int \limits _{{\hbox {SE}}(2)} D_{\xi \xi } \left| \frac{\partial T}{\partial \xi }\right| ^2 \!\!+\!\! D_{\eta \eta } \left| \frac{\partial T}{\partial \eta }\right| ^2 \!\!+\!\! D_{\theta \theta } \left| \frac{\partial T}{\partial \theta }\right| ^2 {\hbox {d}}x{\hbox {d}}y{\hbox {d}}\theta , \end{aligned}$$

with typically \(\lambda = 10\) and \(D_{\theta \theta } = 10^{-3.5}\) and with the left-invariant gradient \( \nabla T = \left( \frac{\partial T}{\partial \xi }, \frac{\partial T}{\partial \eta }, \frac{\partial T}{\partial \theta } \right) ^T \). The templates are represented in a B-spline basis, allowing for efficient and accurate optimization of the energy functionals. The B-spline coefficients are solved by a conjugate gradient approach.

The orientation score SE(2) template method outperforms state-of-the-art methods on publicly available benchmark databases, as it correctly identifies the ONH in 99.7 % of 1737 images of the well-known MESSIDOR, STARE (with a wide variety of pathological images) and DRIVE databases. For more details, see [7, 10].

4.4 Vessel enhancement and segmentation

4.4.1 Vessel enhancement filter

Vessels can be enhanced by oriented filters [5, 29], e.g., constructed from second-order Gaussian derivative operators at appropriate scales and orientations. To solve segmentation problems at complex structures like crossings, we have designed new filters in the orientation score matching the vessel profile as second-order left-invariant Gaussian differential operators perpendicular to the corresponding orientation [65], as described in Sect. 3.

With Gaussian filters, the maximum response occurs at \(\sigma =r/\sqrt{2}\), where r is the vessel radius [44]. Typically vessel calibers of the DRIVE and STARE databases range from 2 to 14 pixels, so we sample the spatial scales \(\sigma _s\) as \({\mathcal {S}}=\{0.7, 1.0, 1.5, 2.0, 2.5, 3.5, 4.5\}\) and angular scale \(\sigma _o=\pi /5\). We exploit \(N_o=36\) orientations between 0 and \(\pi \). Figure 5 shows our segmentation results on the DRIVE and STARE databases. Figure 6 shows that the proposed orientation score based multi-scale filters preserve structure much better than the compared methods on difficult cases such as crossings, vessels with central reflex and closely parallel vessels.
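
The relation between the sampled scales and the vessel sizes they are tuned to can be tabulated directly. The sketch below assumes that "caliber" denotes the full vessel width 2r; the function names are illustrative.

```python
import math

# Spatial scales sigma_s sampled in the text (in pixels).
SCALES = [0.7, 1.0, 1.5, 2.0, 2.5, 3.5, 4.5]

def matched_radius(sigma):
    """Vessel radius r whose filter response peaks at scale sigma (sigma = r / sqrt(2))."""
    return math.sqrt(2.0) * sigma

def matched_scale(radius):
    """Scale sigma at which a vessel of the given radius responds maximally."""
    return radius / math.sqrt(2.0)

# Full vessel widths (2r) matched by each sampled scale, assuming caliber = 2r.
widths = [2.0 * matched_radius(s) for s in SCALES]
```

Under this assumption the sampled scales are matched to widths from roughly 2 to 13 pixels, which covers most of the stated caliber range.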

Fig. 6

Vessel segmentation results of our left-invariant Gaussian derivative method in comparison with state-of-the-art methods on an image of the DRIVE database. a–c and d–f resp. show the vessel segmentation results by the methods of Frangi et al. [29] and Soares et al. [56] on 3 difficult cases: a high curvature change on low intensity vessel part and tiny crossing, b artery and vein crossing with central reflex and c closely parallel vessels; g–i the results of our method, and j–l the corresponding ground truth annotations by a human observer [57]. From [65]

We apply the scale normalization as proposed by Lindeberg [44]. The final image reconstruction from the multi-scale filtered orientation scores is obtained via

$$\begin{aligned} \left( \Upsilon _\eta ^{\sigma _s,\sigma _o}(f)\right) (\mathbf x ):=\underset{\theta _{i}}{\text {max}}\left\{ \sum _{\sigma _s \in \mathcal {S}}\varPhi _{\eta ,\text {norm}}^{\sigma _s,\sigma _o}(U_f)(\mathbf x ,\theta _i)\right\} \end{aligned}$$

with \(i \in \{1,\ldots ,N_o\}\) and where \(N_o\) and \(\mathcal {S}\) represent resp. the number of orientations and the set of spatial scalings, and \(\varPhi _{\eta ,\text {norm}}^{\sigma _s,\sigma _o}\) denotes the left-invariant Gaussian matching filters on the orientation score [65]. The maximum filter response is calculated over all orientations per position. The method is validated on the public databases DRIVE and STARE and is computationally efficient. With a sensitivity and specificity of 0.7744 resp. 0.9708 on DRIVE, and 0.7940 resp. 0.9707 on STARE, the proposed algorithm outperforms most current segmentation schemes. Interestingly, it can also deal with complicated vessel configurations. For more details see [65, 66].
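
The order of operations in the reconstruction formula (first sum over scales, then take the maximum over orientations) can be sketched as follows. The nested-list layout `responses[s][i][p]` and the toy numbers are assumptions made for illustration; a practical implementation would operate on arrays of filtered orientation scores.

```python
def reconstruct(responses):
    """Per pixel: maximum over orientations of the sum over scales.

    responses[s][i][p] is the normalized filter response at scale index s,
    orientation index i and pixel index p.
    """
    n_scales = len(responses)
    n_orient = len(responses[0])
    n_pix = len(responses[0][0])
    out = []
    for p in range(n_pix):
        summed = [sum(responses[s][i][p] for s in range(n_scales))
                  for i in range(n_orient)]
        out.append(max(summed))
    return out

# Toy data: 2 scales, 3 orientations, 2 pixels (made-up numbers).
toy = [
    [[0.1, 0.0], [0.5, 0.2], [0.0, 0.1]],   # scale 0
    [[0.2, 0.1], [0.4, 0.0], [0.1, 0.6]],   # scale 1
]
result = reconstruct(toy)
```

In the toy data the first pixel is dominated by the second orientation and the second pixel by the third, so each pixel keeps the response of its own best-matching orientation.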

4.4.2 Multi-scale and multi-orientation segmentation (BIMSO) for SLO images

For retinal images taken with a scanning laser ophthalmoscope (SLO) camera, new segmentation techniques are required. Such cameras typically exploit two wavelengths, green and infrared and typically exhibit better contrast for hemoglobin (vessels and bleedings). However, they also exhibit speckle noise, due to which conventional analysis methods for RGB images often fail. Very few studies are dedicated to SLO image analysis (e.g., [63]).

The proposed method [1], termed BIMSO (Brain-Inspired Multi-Scale and multi-Orientation), has four main steps: preprocessing, feature extraction, classification and post-processing. The method is developed for the green channel (RGB and SLO).

Noise reduction is achieved by a nonlinear gamma transform of the orientation score, \(\check{U}_f=\alpha ~{|U_f|}^\gamma \), in which the modulus of the score is raised to a power \(\gamma >1\). The absolute value \(|U_f|\) is taken because of the quadrature property of the cake wavelets, and \(\alpha \) is determined by the sign of the real part of the orientation score (\({\hbox {Re}}(U_f)\)). See Fig. 7.
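
A minimal per-sample sketch of this transform is given below. The function name `gamma_transform` is illustrative, and setting \(\alpha =0\) where the real part vanishes is an assumption not stated in the text; the default \(\gamma =1.8\) is the value used in Fig. 7.

```python
def gamma_transform(u, gamma=1.8):
    """Return sign(Re(u)) * |u|**gamma for one complex orientation score sample."""
    re = u.real
    alpha = (re > 0) - (re < 0)   # +1, -1, or 0 when the real part vanishes (assumed)
    return alpha * abs(u) ** gamma

# Made-up complex samples; gamma = 2 keeps the arithmetic easy to follow.
samples = [3 + 4j, -3 + 4j, 0.5 + 0j, 0j]
transformed = [gamma_transform(u, gamma=2.0) for u in samples]
```

For scores normalized to a maximum modulus of 1, a power \(\gamma >1\) suppresses weak responses, which are mostly noise, more strongly than the strong responses on vessels.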

After enhancement, a pattern recognition approach is taken to classify the vessels from the background. A normalized feature vector is constructed per pixel, including the intensity, the half-cake wavelet responses, 3 left-invariant multi-scale first-order Gaussian derivatives and 8 second-order Gaussian derivatives in the pre-processed orientation scores (of the 9 possible second-order left-invariant derivatives two coincide, since \(\partial _\xi \) and \(\partial _\eta \) commute, so only 8 are unique, giving 11 Gaussian derivatives in total). The orientation score intrinsically allows for derivatives in the local directions (\(e_\xi ,e_\eta ,e_\theta \)), and crossings are ‘lifted’ as different orientations in different orientation layers. The method is trained with a feed-forward neural network classifier and validated and tested on two different datasets (DRIVE for RGB and IOSTAR for SLO).

Fig. 7

Top original SLO image (EasyScan, i-Optics Inc.). Bottom nonlinear orientation score gamma transformation with \(\gamma =1.8\). From [1]

The BIMSO method detects blood vessels well, in particular the smaller ones in low contrast regions, and crossing vessels. The performance is better than the best supervised segmentation methods for RGB (e.g., [56]), for both the DRIVE and IOSTAR databases. For more details, see [1].

Fig. 8

a Retinal image f and multi-scale vesselness filtering results for b the Frangi filter \(\mathcal {V}^\mathrm {Fr}(f)\) and our two methods c \(\mathcal {V}^{\xi \eta }(f)\) resp. d \(\mathcal {V}^{\mathbf {a}, \mathbf {b},\mathbf {c}}(f)\) (left to right). From [24]

4.4.3 Crossing-preserving multi-scale vesselness

Frangi’s vesselness filter [29], which is based on geometric relations between the principal curvatures (eigenvalues of the Hessian second-order matrix) to extract cylindrical shapes, has been extended to the orientation score domain, which makes it crossing-preserving [33].

In a single scale layer of the OS transform, the SE(2) vesselness is expressed as Frangi’s vesselness formulation [29]:

$$\begin{aligned}&{\mathcal {V}}_0^ {{\hbox {SE}}(2)}\, (\mathcal {U}_f) \nonumber \\&\qquad = \left\{ \begin{array}{ll} 0 &{} \quad \text{ if } \mathcal {Q} \le 0 \\ \exp \left( -\frac{\mathcal {R}^2}{2 \sigma _1^2}\right) \left[ 1-\exp \left( -\frac{\mathcal {S}}{2\sigma _2^2}\right) \right] &{} \quad \text{ if } \mathcal {Q}> 0 \end{array} \right. \end{aligned}$$

with typically \(\sigma _1=0.5\) and \(\sigma _2=0.2\;||\mathcal {S}||_\infty \).

The coordinates for the calculation of the second-order derivatives can be defined with two possible orthogonal frames: the local moving frame of reference \(\{\partial _\xi ,\partial _\eta ,\partial _\theta \}\) determined by the tangent and normal of the orientation score geodesics, or the gauge frame \(\{\partial _{\mathbf {a}}\), \(\partial _{\mathbf {b}}\), \(\partial _{\mathbf {c}}\}\) [31] determined by the eigendirections of the Hessian matrix \((\mathcal {H}^{s,\beta }\,\mathcal {U}_f^a)(\mathbf {g})\) at scale s and \(\mathbf {g}\in {\hbox {SE}}(2)\), normalized w.r.t. the \(\beta \)-metric. For the first we get:

$$\begin{aligned} \mathcal {R}= & {} \frac{\left( \partial _\xi ^2\,\mathcal {U}_f^a\right) ^{s,\beta }}{\left( \partial _\eta ^2\,\mathcal {U}_f^a\right) ^{s,\beta }}, \quad \mathcal {S}=\left[ \left( \partial _\xi ^2\,\mathcal {U}_f^a\right) ^{s,\beta }\right] ^2\\&+\left[ \left( \partial _\eta ^2\,\mathcal {U}_f^a\right) ^{s,\beta }\right] ^2\;,\\ \mathcal {Q}= & {} \left( \partial _\eta ^2\,\mathcal {U}_f^a\right) ^{s,\beta } \end{aligned}$$

where the superscripts \(^{s,\beta }\) indicate Gaussian derivatives at spatial scale \(s=\frac{1}{2}\sigma _s^2\) and angular scale \(\frac{1}{2}(\beta \sigma _s)^2\).

The second frame leads to (with \(\lambda _i\) the eigenvalues of the left-invariant Hessian):

$$\begin{aligned} \mathcal {R}=\frac{\lambda _1}{c}, \quad \mathcal {S}=\lambda _1^2 + c^2, \quad \mathcal {Q}=c \end{aligned}$$

with \(c=\frac{1}{2}(\lambda _2+\lambda _3)\), which can be interpreted as orientation confidence as defined by [31]. The SE(2) multi-scale vesselness is computed by summation over all scales. Figure 8 shows that the multi-scale vesselness gives much better results at crossings and bifurcations. The gauge frame \(\mathcal {V}^{\mathbf {a},\mathbf {b},\mathbf {c}}\) gives the best results. For more details, see [33].
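
The single-scale vesselness in the \(\{\partial _\xi ,\partial _\eta ,\partial _\theta \}\) frame can be sketched per sample as follows. This is an illustration under assumptions: the second-order derivatives are supplied as plain lists of made-up numbers, the function name `se2_vesselness` is hypothetical, and the sign convention for \(\mathcal {Q}\) simply follows the formula above.

```python
import math

def se2_vesselness(d_xixi, d_etaeta, sigma1=0.5, sigma2_factor=0.2):
    """Single-scale vesselness from second-order left-invariant derivatives.

    d_xixi, d_etaeta: second derivatives along and across the orientation,
    one value per orientation score sample.
    """
    S = [a * a + b * b for a, b in zip(d_xixi, d_etaeta)]
    sigma2 = sigma2_factor * max(S)          # sigma_2 = 0.2 * ||S||_inf
    out = []
    for a, b, s in zip(d_xixi, d_etaeta, S):
        if b <= 0 or sigma2 == 0:            # Q = d_etaeta <= 0: no vessel response
            out.append(0.0)
            continue
        R = a / b                            # anisotropy ratio
        out.append(math.exp(-R * R / (2 * sigma1 ** 2))
                   * (1 - math.exp(-s / (2 * sigma2 ** 2))))
    return out

# Made-up samples: a line-like one, a blob-like one, and one with Q < 0.
d_xixi = [0.0, 1.0, 0.0]
d_etaeta = [1.0, 1.0, -1.0]
v = se2_vesselness(d_xixi, d_etaeta)
```

The line-like sample (no curvature along the orientation, strong curvature across it) scores highest, the blob-like sample is suppressed by the anisotropy factor, and the sample with \(\mathcal {Q}\le 0\) is set to zero. A multi-scale version would sum such outputs over all scales, as described above.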

4.5 Multi-orientation vessel tracking

Vessel tracking has the advantage over pixel classification that it guarantees connectedness of vessel segments. We have developed tracking via orientation scores [6], so that classical difficulties such as crossings, bifurcations, closely parallel vessels and vessels of varying width and high curvature can be dealt with naturally.

To handle bifurcations properly, one-sided kernels are exploited, constructed by decomposition of orientation scores in two opposite directions, weighted in the radial direction with an error function (cumulative Gaussian).

Two tracking methods are developed: Edge Tracking based on Orientation Scores (ETOS) and Centerline Tracking based on multi-scale Orientation Scores (CTOS). ETOS tracks both edges of the vessel simultaneously. The edge positions are detected in the orientation score, as a local minimum (left edge) and maximum (right edge) from the anti-symmetric imaginary part of the tangent plane. CTOS exploits the fact that the disturbing vessel central light reflex is filtered out by Gabor kernels with proper scales. The invertible orientation score has some distinct advantages over Gabor filtering: much lower computational costs as the orientation score filters for all scales simultaneously, whereas the Gabor filter only calculates a single spatial frequency at a time, and the orientation score filtering is more accurate.

Overall, ETOS outperforms CTOS. ETOS gives the best results when applied to invertible orientation scores, can deal with many complex geometries, and gives reliable width measurements.

The algorithm performs well: ETOS detected 76 % (290/381) of the bifurcations and 96 % (109/114) of the crossings correctly. Most mistakes in bifurcations were actually crossings; only 5 % were misclassified. For more details, see [6].

4.5.1 Sub-Riemannian geodesics in SE(2)

In the orientation score, vessels can also be tracked as geodesic curves [8, 9, 52]. Geodesics (optimal curves minimizing some curve-length functional) are found by making proper use of the curved geometry of the domain. On the space of positions and orientations, a Riemannian geometry is defined in which distances between tangent vectors are defined relative to the location in the score (e.g., via the left-invariant \(\{\partial _\xi ,\partial _\eta ,\partial _\theta \}\)-frame). Furthermore, since not all curves in the lifted domain SE(2) are natural, some directions are prohibited, which results in a sub-Riemannian geometry (compare this with the restricted motion of a car, which cannot move sideways). Similarly, the paths of vessels are continuous and do not exhibit sudden jumps to the left or the right.
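The car analogy can be made concrete with a small Python sketch. It assumes the usual left-invariant frame on SE(2), with \(\partial _\xi \) pointing along the orientation \(\theta \) and \(\partial _\eta \) pointing sideways, and it assumes that the prohibited direction is \(\partial _\eta \). The function names `lift_curve` and `frame_components` are hypothetical. A planar curve lifted with its own tangent orientation has no sideways component, whereas a sideways slide does.

```python
import numpy as np

def lift_curve(x, y):
    """Lift planar samples to SE(2) by attaching the tangent orientation."""
    dx, dy = np.gradient(x), np.gradient(y)
    theta = np.arctan2(dy, dx)
    return dx, dy, theta

def frame_components(dx, dy, theta):
    """Express a spatial velocity in the moving frame (forward, sideways)."""
    c_xi = np.cos(theta) * dx + np.sin(theta) * dy    # forward component
    c_eta = -np.sin(theta) * dx + np.cos(theta) * dy  # sideways component
    return c_xi, c_eta

# A quarter circle, lifted with its own tangent direction.
t = np.linspace(0.0, np.pi / 2.0, 200)
dx, dy, theta = lift_curve(np.cos(t), np.sin(t))
c_xi, c_eta = frame_components(dx, dy, theta)
print(np.abs(c_eta).max())  # numerically zero: the lifted curve is admissible

# A sideways slide: the orientation stays 0 while the position moves in y.
side_xi, side_eta = frame_components(np.zeros(5), np.ones(5), np.zeros(5))
print(side_eta)  # non-zero sideways component: not admissible
```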

Fig. 9 Data-adaptive sub-Riemannian geodesics give good and smooth tracking results, also over crossings. From [8]

The algorithms for sub-Riemannian geodesic extraction are implemented via an anisotropic fast marching scheme [49, 52]. As a result, the curve extraction procedure is both fast and robust. In summary, the sub-Riemannian geodesic extraction in SE(2) allows for the robust extraction of curves because: (1) crossing structures are disentangled in the orientation scores, and (2) high curvature (e.g., discrete jumps to the left or right, or a sudden change of direction) is penalized by the restricted sub-Riemannian geometry. See examples in Fig. 9.

4.6 Arterio-venous ratio (AVR)

The separation of vessels into arteries and veins is crucial, as their different physiology, flow and mechanical properties respond differently to disease development. For example, arteriolar narrowing, a decrease of the artery calibers relative to the vein calibers, is an important early biomarker for diabetic retinopathy. Tortuosity measures are also expected to differ between arteries and veins [39]. An automated artery–vein classification is required for high-volume screening.

The ratio of the arteriolar and venular diameters is called the arteriovenous ratio (AVR) and is classically computed from the six widest arteries and veins in a restricted zone around the optic disk [42].
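A minimal Python sketch of such a ratio is given below. The text above does not state how the six calibers are summarized, so the sketch assumes the widely used revised Knudtson formulas, in which the widest and narrowest vessels are paired iteratively with a factor of 0.88 for arterioles and 0.95 for venules. The formulas used in [42] may differ, and the function names `_combine` and `avr` are hypothetical.

```python
import math

def _combine(widths, k):
    """Pair the narrowest with the widest vessel until one value remains."""
    w = sorted(widths)
    if not w:
        raise ValueError("at least one width is required")
    while len(w) > 1:
        paired = []
        while len(w) > 1:
            narrow, wide = w.pop(0), w.pop(-1)
            paired.append(k * math.hypot(narrow, wide))
        paired.extend(w)  # carry over the middle value if the count was odd
        w = sorted(paired)
    return w[0]

def avr(artery_widths, vein_widths, n=6):
    """Arteriovenous ratio from the n widest arteries and veins (assumed scheme)."""
    arteries = sorted(artery_widths, reverse=True)[:n]
    veins = sorted(vein_widths, reverse=True)[:n]
    crae = _combine(arteries, 0.88)  # arteriolar equivalent caliber
    crve = _combine(veins, 0.95)     # venular equivalent caliber
    return crae / crve

# Hypothetical caliber measurements (in pixels) inside the measurement zone.
print(avr([11, 10, 9, 9, 8, 7], [15, 14, 13, 12, 11, 10]))
```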

We developed a novel method for artery/vein classification [25] based on local and contextual feature analysis of retinal vessels. Features are (a) the color, as arteries appear brighter and veins darker due to the oxygen content of the blood, (b) the transverse intensity profile, as arteries have a more pronounced central reflex to the camera flash, and (c) graph path properties of crossings and bifurcations of vessels, as these provide contextual information, because arteries never cross arteries and veins always cross arteries [17].

A non-submodular energy function is defined, integrating the features, and optimized exactly by means of graph cuts. The method was validated with a ground truth data set of 150 fundus images. An accuracy of 88.0 % was achieved for all vessels and 94.0 % for the 6 widest vessels. The contextual information especially benefits the classification of smaller vessels. For more details see [25].

Fig. 10 The pipeline for calculating the fractal dimension from a color fundus image. From [35]

4.7 Fractal dimension

The concept of fractal dimension was initially defined and developed in mathematics. It measures the complexity of self-similar objects that show the same patterns across different scales, e.g., trees and snowflakes. The vascular tree on the human retina also has self-similar branching patterns over different scales. Therefore, there is a growing interest in retinal fractal analysis, exploiting fractal dimension as a biomarker for discriminating healthy subjects from patients with diabetic retinopathy.

Fractal dimensions have been widely investigated. However, conflicting findings have been reported [3, 12]. This motivated us to investigate in particular the stability and reproducibility of fractal dimension measurements.

The retinal vascular network is extracted by the method described in Sect. 4.4.1. The region-of-interest (ROI) must be carefully specified for fractal dimension calculation, because it is a global measurement and different fundus cameras have different fields of view. The ROI is determined by first locating the optic nerve head (ONH) with the template-matching-based method described in Sect. 4.3, which also yields the optic disk (OD) radius. The fovea position is found at the global intensity minimum in a small region at a lateral distance of 5 OD radii from the optic nerve head. For fovea-centered and ONH-centered images, the circular ROI mask is centered at the fovea centralis and the ONH, respectively. For the fractal dimension calculation, we used the box counting method, which paves the full retinal ROI image with square boxes (tiles) of different side-lengths (different scales) [48]. At each scale, several measurements are made: the number of boxes that overlap with the vessels, the entropy of each box, and the probability of finding vessels near each pixel. Finally, these measurements are plotted against the box side-length on a log-log plot, and the fractal dimension is the slope of the fitted regression line. See Fig. 10.
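The box counting step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline of [35]: it covers only the count of occupied boxes (not the entropy or probability based variants), it expects a binary vessel mask that has already been restricted to the ROI, and the function names and the default box sizes are hypothetical. Since the number of occupied boxes decreases as the boxes grow, the dimension is reported as the negative of the fitted slope.

```python
import numpy as np

def box_count(mask, size):
    """Number of size-by-size tiles that contain at least one vessel pixel."""
    mask = np.asarray(mask, dtype=bool)
    h, w = mask.shape
    # Pad with background so that the image is covered by whole tiles.
    padded = np.pad(mask, ((0, -h % size), (0, -w % size)))
    tiles = padded.reshape(padded.shape[0] // size, size,
                           padded.shape[1] // size, size)
    return int(tiles.any(axis=(1, 3)).sum())

def box_counting_dimension(mask, sizes=(2, 4, 8, 16, 32)):
    """Negative slope of log(count) against log(box side-length)."""
    counts = [box_count(mask, s) for s in sizes]
    if min(counts) == 0:
        raise ValueError("the mask contains no vessel pixels")
    slope, _ = np.polyfit(np.log(sizes), np.log(counts), 1)
    return -slope

# Sanity checks on shapes with a known dimension.
full = np.ones((64, 64), dtype=bool)   # filled plane
line = np.zeros((64, 64), dtype=bool)  # straight line
line[0, :] = True
print(box_counting_dimension(full), box_counting_dimension(line))
```

A filled square and a straight line are useful checks because their dimensions (2 and 1) are known; a retinal vessel mask is expected to fall between these two values.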

Fig. 11 Top row: pipeline of the BICROS method. a input image, b preprocessing result, c construction of the orientation score, d candidate selection, and e feature extraction and classification into bifurcations (orange) and crossings (blue). Bottom row: the orientation column at a bifurcation point is f overlayed on an image with a polar plot, and g plotted against the orientation, where the dominant orientations are depicted in red. h A volume rendering of the orientation score. From [55] (color figure online)

We examined the stability of the fractal dimension measurements with respect to a range of variable factors: (1) different vessel annotations obtained from human observers, (2) different automatic segmentation methods, (3) different regions-of-interest, (4) different accuracy of vessel segmentation methods, and (5) different imaging modalities. Our results demonstrate that the relative errors in the measurement of fractal dimensions are significant and vary considerably with the image quality, the modality and the technique used for measuring it. Automated and semi-automated methods for the measurement of fractal dimension are therefore not stable enough, which makes fractal dimension unsuitable as a biomarker in quantitative clinical applications [35].

4.8 Geometric vessel features

4.8.1 Bifurcations/crossings detection

We have developed a fully automatic bifurcation and crossing detection algorithm called Biologically Inspired CRossing detecting in Orientation Scores (BICROS) [55], see Fig. 11, which does not depend on vessel segmentations. The precision of the junction detections is high, and the method can discriminate with high accuracy between crossings and bifurcations. Interestingly, junctions of small vessels, which are typically missing in vessel segmentations, are well detected by BICROS (Fig. 12). Through this, the proposed hybrid method outperforms state-of-the-art techniques [5]. It performs well on both RGB color and noisy SLO retinal images. It provides the orientation-augmented landmarks for image stitching, and because the orientations and the tracks of the branches are provided, useful other biomarkers can be extracted such as segment lengths and bifurcation angles.

Fig. 12 Automated vessel bifurcation (yellow) and vessel crossing (red) detection. From [55] (color figure online)

4.9 Curvature/tortuosity

Vessel curvature is an important biomarker [13, 34, 39, 53]. Most methods require pre-segmentation and center line extraction, but our brain-inspired orientation score method [11] works directly on the image data by fitting so-called exponential curves in the orientation score. An exponential curve is a curve whose spatial projection has a constant curvature (Fig. 13).

The curvature values are computed directly from tangent vectors of exponential curves that locally best fit the data:

$$\begin{aligned} \kappa = \frac{c^\theta {\text {sign}}(c^\xi )}{\sqrt{|c^\xi |^2 + |c^\eta |^2}}. \end{aligned} \tag{2}$$

The coefficients of the tangent vector \(\{c^\xi ,c^\eta ,c^\theta \}\) are expressed in the moving frame (see Eq. (1), [24, 30]).
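Equation (2) translates directly into code. The following Python sketch evaluates it elementwise on arrays of tangent coefficients; the function name `curvature` and the small constant `eps`, which guards against division by zero where the spatial part of the tangent vanishes, are assumptions of this example and not part of [11].

```python
import numpy as np

def curvature(c_xi, c_eta, c_theta, eps=1e-12):
    """Curvature of the spatial projection of an exponential curve, Eq. (2)."""
    c_xi = np.asarray(c_xi, dtype=float)
    c_eta = np.asarray(c_eta, dtype=float)
    c_theta = np.asarray(c_theta, dtype=float)
    spatial_norm = np.sqrt(np.abs(c_xi) ** 2 + np.abs(c_eta) ** 2)
    return c_theta * np.sign(c_xi) / np.maximum(spatial_norm, eps)

# A circle of radius R traversed at unit speed has tangent (1, 0, 1/R).
R = 4.0
print(curvature(1.0, 0.0, 1.0 / R))
```

The sign factor makes the result independent of the direction in which the curve is traversed: reversing the tangent vector flips both \(c^\theta \) and \({\text {sign}}(c^\xi )\), so the curvature is unchanged.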

Fig. 13 SE(2) tortuosity measures per voxel. From [11]

Fig. 14 Box-and-whisker plots of tortuosity measures \(\mu _{|\kappa |}\) and \(\sigma _{|\kappa |}\) in subgroups of the HRF and MESSIDOR database. From [11]

Principal directions are calculated from the eigenvectors of the Hessian matrix, but now in a curved domain, for which we need left-invariant derivatives:

$$\begin{aligned} \mathcal {H}U = \mathbf {M}_{\mu ^{-2}} \left( \begin{array}{ccc} \partial _\xi ^2 U &{} \quad \partial _\xi \partial _\eta U &{} \quad \partial _\theta \partial _\xi U \\ \partial _\xi \partial _\eta U &{} \quad \partial _\eta ^2 U &{} \quad \partial _\theta \partial _\eta U \\ \partial _\xi \partial _\theta U &{} \quad \partial _\eta \partial _\theta U &{} \quad \partial _\theta ^2 U \\ \end{array} \right) \mathbf {M}_{\mu ^{-2}}. \end{aligned}$$

As left-invariant derivatives do not commute, e.g., \(\partial _\theta \partial _\xi U \ne \partial _\xi \partial _\theta U\), we need to make the asymmetric Hessian matrix symmetric via

$$\begin{aligned} \mathcal {H}_\mu U = \mathbf {M}_{\mu } (\mathcal {H}U)^T \mathbf {M}_{\mu ^2} (\mathcal {H}U) \mathbf {M}_{\mu }, \end{aligned}$$

so we can calculate the eigensystem of \(\mathcal {H}_\mu U\). The diagonal matrix \(\mathbf {M}_{\mu ^{-2}}\) compensates for the difference in units between the orientation and spatial dimensions; for details, see [11].
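The lack of commutativity can be checked with a short calculation. Assume the standard left-invariant frame on SE(2), \(\partial _\xi = \cos \theta \, \partial _x + \sin \theta \, \partial _y\) and \(\partial _\eta = -\sin \theta \, \partial _x + \cos \theta \, \partial _y\) (this explicit form is an assumption here, presumably matching the moving frame of Eq. (1)). Because the coefficients of \(\partial _\xi \) depend on \(\theta \), applying \(\partial _\theta \) after \(\partial _\xi \) produces an extra first-order term:

$$\begin{aligned} \partial _\theta \partial _\xi U - \partial _\xi \partial _\theta U = \left( -\sin \theta \, \partial _x + \cos \theta \, \partial _y \right) U = \partial _\eta U. \end{aligned}$$

Under this assumption, the two mixed derivatives differ wherever \(\partial _\eta U \ne 0\), which is why the Hessian of left-invariant derivatives is not symmetric.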

A confidence measure is acquired by calculating the Gaussian Laplacian in the plane orthogonal to the tangent direction of the vascular geodesic. From all pixelwise curvature measures, several global measures are derived, such as the mean and standard deviation. The method was validated on synthetic and retinal images.
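The two global measures can be sketched as follows. How the confidence measure enters the statistics is not specified above, so this example simply assumes an optional boolean selection mask (a hypothetical parameter) and computes the unweighted mean and standard deviation of the absolute curvature.

```python
import numpy as np

def tortuosity_measures(kappa, mask=None):
    """Mean and standard deviation of |kappa| over the selected pixels."""
    abs_kappa = np.abs(np.asarray(kappa, dtype=float))
    if mask is not None:
        # Hypothetical selection, e.g. pixels with sufficient confidence.
        abs_kappa = abs_kappa[np.asarray(mask, dtype=bool)]
    return float(abs_kappa.mean()), float(abs_kappa.std())

# Small example with signed pixelwise curvature values.
kappa = np.array([-1.0, 1.0, 3.0, -3.0])
mu, sigma = tortuosity_measures(kappa)
print(mu, sigma)
```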

Tortuosity measures from the MESSIDOR database (1200 images) showed a significant increase of tortuosity with increasing severity of diabetic retinopathy (R0, R1, R2 and R3), see Fig. 14. For more details, see [11].

5 Next steps

The computer-aided diagnosis algorithms are mostly written in Mathematica 10 (Wolfram Research Inc.). Processing times are 1–15 min per image, but this is shortened by using multi-core (32) servers, both in China and in the Netherlands, and by developing GPU-based implementations, in particular for the OS-based algorithms.

Fig. 15 Workstation design for RetinaCheck retinal image analysis research (by: B. Dashtbozorg)

A dedicated workstation has been designed, to program any pipeline of processes, in batch mode, and with automated report generation (see Fig. 15, see also [19]). Data are acquired with different fundus cameras (EasyScan, DRS, Topcon). The workstation is also used for annotations: at least three experienced ophthalmology experts diagnose and annotate the bio-markers on the fundus image datasets as ground truth or reference standard. We have established a detailed workflow protocol to minimize variations.

The crucial validation phase, not reported in this paper, is ongoing. The algorithms are currently being evaluated in the collaborating hospitals: images and extensive metadata have been acquired from 3000 diabetes patients in Shengjing Hospital, and over 20,000 normal images with first expert reading have been acquired at He Vision shops in Shenyang, China. Studies are ongoing to correlate the rich set of geometric biomarkers described in this paper with the extensive Chinese metadata obtained in the clinical setting, and to find the most predictive and effective (combination of) geometric biomarkers extracted from the retinal fundus images for DR, glaucoma, and AMD.

6 Conclusion

The brain-inspired multi-orientation approach turned out to be highly successful in dealing with many complex vessel geometries and tasks, such as crossings, bifurcations, curvature analysis without segmentation, enhancement and segmentation. The theory for orientation scores in SE(2) is now well developed. The generalized approach of a Lie-group analysis by the visual front-end is currently a broad topic of research. This paper gives a review of many current geometric developments for retinal vessel analysis, with the goal of establishing biomarkers for screening for early signs of a range of systemic, neurodegenerative and cardiovascular diseases.

Major developments are taking place in the parallel field of deep learning. The RetinaCheck team finished in 17th position out of 661 participating teams in the recent Kaggle challenge on Diabetic Retinopathy classification (https://www.kaggle.com/c/diabetic-retinopathy-detection). It can be foreseen that merging geometric analysis and reasoning with deep learning, based on the large volumes of data acquired in this project, can be highly successful. This development is ongoing and will be reported in forthcoming papers.