Machine Vision and Applications

, Volume 27, Issue 8, pp 1137–1149 | Cite as

Supervised vessel delineation in retinal fundus images with the automatic selection of B-COSFIRE filters

  • Nicola StrisciuglioEmail author
  • George Azzopardi
  • Mario Vento
  • Nicolai Petkov
Open Access
Special Issue Paper


The inspection of retinal fundus images allows medical doctors to diagnose various pathologies. Computer-aided diagnosis systems can be used to assist in this process. As a first step, such systems delineate the vessel tree from the background. We propose a method for the delineation of blood vessels in retinal images that is effective for vessels of different thickness. In the proposed method, we employ a set of B-COSFIRE filters selective for vessels and vessel-endings. Such a set is determined in an automatic selection process and can adapt to different applications. We compare the performance of different selection methods based upon machine learning and information theory. The results that we achieve by performing experiments on two public benchmark data sets, namely DRIVE and STARE, demonstrate the effectiveness of the proposed approach.


B-COSFIRE Filters selection Retinal vessels segmentation Trainable filters Vessel delineation 

1 Introduction

Retinal fundus imaging is a noninvasive tool that is widely employed by medical experts to diagnose various pathologies such as glaucoma, age-related macular degeneration, diabetic retinopathy and atherosclerosis. There is also evidence that such images may contain signs of non-eye-related pathologies, including cardiovascular [19] and systemic diseases [32]. Figure 1 shows examples of two retinal fundus images and their corresponding manually segmented vessel trees. In the last years, particular attention by medical communities has been given to early diagnosis and monitoring of diabetic retinopathy, since it is one of the principal causes of blindness in the world [1].

The manual inspection of retinal fundus images requires highly skilled people, which results in an expensive and time-consuming process. Thus, the mass screening of a population is not feasible without the use of computer-aided diagnosis systems. Such systems could be used to refer to medical experts only the patients with suspicious signs of diseases [1, 2]. In this way, the medical professionals could focus on the most problematic cases and on the treatment of the pathologies.

The automatic segmentation of the blood vessel trees in retinal images is a basic step before further processing and formulation of diagnostic hypothesis. This means that the quality of vessel segmentation influences the reliability of the subsequent diagnostic steps. It is, therefore, of utmost importance to obtain accurate measurements about the geometrical structure of the vessels. After segmenting the vessel tree, it is common for many methodologies to detect candidate lesions and then to classify them as healthy or not. The better the segmentation, the less false candidate lesions will be detected.
Fig. 1

Examples of fundus images of a healthy and b unhealthy retinas, together with the corresponding manually segmented vessels taken from the STARE data set [15]

In the last years, this challenge has attracted wide interest of many image processing and pattern recognition researchers. Existing methods can be generally divided into two groups: unsupervised methods are based on filtering, vessel tracking techniques or modeling, while supervised methods train binary classification models using pixel-wise feature vectors.

In the unsupervised approaches, mathematical morphology techniques are used in combination with a priori knowledge about the vessels structure [24, 36] or with curvature analysis [11]. Vessel tracking-based methods start from an automatically or manually chosen set of points and segment the vessels by following their centerline [7, 10, 20, 39]. Methods based on matched filtering techniques, instead, assume that the profile of vessels can be modeled with a two-dimensional Gaussian kernel [3, 8, 15], also in combination with an orientation score [37]. In [22], information about the size, orientation, and width of the vessels is exploited by a region growing procedure. A model of the vessels based on their concavity and built by using a differentiable concavity measure was proposed in [18]. In previous works [6, 35], we introduced trainable filters selective for vessels and vessel-endings. We demonstrated that by combining their responses we could build an effective unsupervised delineation technique. A method for the construction of an orientation map of the vessels was proposed in [13]. The information about the topology of the vessels was used in a graph-based approach [9].

On the other hand, supervised methods are based on computing pixel-wise feature vectors and using them to train a classification model that can distinguish between vessel and non-vessel pixels. Different types of features have been studied in combination with various classification techniques. A k-NN classifier was used in combination with the responses of multi-scale Gaussian filters or ridge detectors in [26] and [33], respectively. Multi-scale features, computed by means of Gabor wavelets, were also used to train a Bayesian classifier in [31]. A feature vector composed of the response of a line operator, together with the information about the green channel and the line width, was proposed in [28] and used to train a support vector machine (SVM) classifier. In [21], a multilayer neural network was applied to classify pixels based on moment-invariant features. An ensemble of bagged and boosted decision trees was employed in [12].

Generally, unsupervised approaches are very efficient, but at the expense of lower effectiveness when compared to their supervised counterparts. Supervised methods, although well-performing, require a thorough feature-engineering step based upon domain knowledge. The sets of features, indeed, are built with the purpose to overcome specific problems of retinal fundus images, such as the presence of red or bright lesions, luminosity variations, among others. For instance, multiscale Gabor filters can be used to eliminate red lesions [31], while morphological transformations can be used for reducing the effects of bright lesions in the segmentation task [12]. Such methods, however, are suitable to cope with the processing of specific kinds of images and cannot be easily applied to delineate elongated structures in other applications (e.g., rivers segmentation in aerial images [38] or wall crack detection [25]).

We propose to address the problem of segmenting elongated structures, such as blood vessels in retinal fundus images, by using a set of B-COSFIRE filters of the type proposed in [6], selective for vessels of various thickness. The B-COSFIRE filter approach was originally proposed for delineation of retinal vessels. Such filters were also employed within a pipeline for the analysis of computed tomography angiography (CTA) images [40]. This demonstrates their suitability for various applications. In [6], two B-COSFIRE filters, one specific for the detection of vessels and the other for the detection of vessel-endings, were combined together by simply summing up their responses. The parameters of the vessel-ending filter were chosen in such a way to maximize the performance of the two filters. This implies a dependence of the configuration of the vessel-ending detector upon the vessel detector. Moreover, the configuration parameters of each filter were chosen in order to perform best on the most common thickness of all vessels.

In this work, we propose to determine a subset of B-COSFIRE filters, selective for vessels of different thickness, by means of information theory and machine learning. We compare the performance achieved by the system with different feature selection methods, including Generalized Matrix Learning Vector Quantization (GMLVQ) [29], class entropy and a genetic algorithm.

The rest of the paper is organized as follows. In Sect. 2, we present the B-COSFIRE filters and the feature selection procedure. In Sect. 3, we introduce the data sets and the tools that we use for the experiments, while in Sect. 4 we report the experimental results. After providing a comparison of the achieved results with the ones of the existing methods and a discussion in Sect. 5, we draw conclusions in Sect. 6.

2 Method

The main idea is to configure a bank of B-COSFIRE filters and to employ information theory and machine learning techniques to determine a subset of filters that maximize the performance in the segmentation task. We consider approaches that take into account the contribution of each feature individually and approaches that evaluate also their combined contribution.

2.1 B-COSFIRE filters

B-COSFIRE filters are trainable and in [6] they were configured to be selective for bar-like structures. Such a filter takes as input the response of a Difference-of-Gaussians (DoG) filter at certain positions with respect to the center of its area of support. The term trainable refers to the ability of determining these positions in an automatic configuration process by using a prototypical vessel or vessel-ending. Figure 2a shows a synthetic horizontal bar, which we use as a prototypical vessel to configure a B-COSFIRE filter.

For the configuration, we first convolve (the convolution is denoted by \(*\)) an input image I with a DoG function of a given standard deviation1 \(\sigma \):
$$\begin{aligned} c_{\sigma } \overset{\mathrm{def}}{=} |I *\hbox {DoG}_{\sigma }|^{+} \end{aligned}$$
where \(|\cdot |^+\) denotes half-wave rectification.2 In Fig. 2b, we show the response image of a DoG filter with \(\sigma =2.5\) applied to the prototype in Fig. 2a. We then consider the DoG responses along concentric circles around a given point of interest, and select from them the ones that have local maximum values (Fig. 2c). We describe each point i by three parameters: the standard deviation \(\sigma _i\) of the DoG filter, and the polar coordinates (\(\rho _i\), \(\phi _i\)) where we consider its response with respect to the center. We form a set \(S=\{(\sigma _i,\rho _i,\phi _i)|i=1, \dots , n\}\) that defines a B-COSFIRE filter that has a selectivity preference for the given prototype. The value of n represents the number of configured tuples.
Fig. 2

Example of the configuration of a B-COSFIRE filter using a a horizontal synthetic prototype vessel. We compute b the corresponding DoG filter response image and select c the local maxima DoG responses along concentric circles around a point of interest (identified by the cross marker in the center). d A sketch of the resulting filter: The sizes of the blobs correspond to the standard deviations of the Gaussian blurring functions

For the application of the resulting filter, we first convolve an input image with a DoG function that has a standard deviation specified in the tuples of the set S. Then, we blur the DoG responses in order to allow for some tolerance in the preferred positions of the concerned points. The blurring operation takes the maximum DoG response in a local neighourhood weighted by a Gaussian function \(G_{\sigma '}(x',y')\), whose standard deviation \(\sigma '\) is a linear function of the distance \(\rho _i\) from the support center of the filter: \(\sigma ' = \sigma '_0 + \alpha \rho _i\) (Fig. 2d). The values of \(\sigma _0'\) and \(\alpha \) are constants, and we tune them according to the application.

We then shift every blurred DoG response by a vector of length \(\rho _i\) in the direction toward the center of the area of support, which is the complimentary angle to \(\phi _i\). The concerned shift vector is \(({\Delta } x_i, {\Delta } y_i)\), where \({\Delta } x_i = -\rho _i\cos \phi _i\) and \({\Delta } y_i = -\rho _i\sin \phi _i\). We define the blurred and shifted DoG response for the tuple \((\sigma _i,\rho _i, \phi _i)\) as:We denote by \(r_{S}(x,y)\) the response of a B-COSFIRE filter by combining the involved blurred and shifted DoG responses by geometric mean:
$$\begin{aligned} r_{S}(x,y) \overset{\mathrm{def}}{=} \left( \prod _{i=1}^{\mid S\mid }\left( s_{\sigma _i,\rho _i,\phi _i}(x,y)\right) \right) ^{1/{\mid S\mid }} \end{aligned}$$
The procedure described above configures a B-COSFIRE filter that is selective for horizontally oriented vessels. In order to achieve multi-orientation selectivity, one can configure a number of B-COSFIRE filters by using prototype patterns in different orientations. Alternatively, we manipulate the parameter \(\phi \) of each tuple and create a new set \(R_{\psi }(S)=\{(\sigma _i,\rho _i,\phi _i + \psi )\mid i=1,\ldots ,n\}\) that represents a B-COSFIRE filter with an orientation preference of \(\psi \) radians offset from that of the original filter S. We achieve a rotation-tolerant response in a location (xy) by taking the maximum response of a group of B-COSFIRE filters with different orientation preferences:where \({\varPsi }=\{0, \frac{\pi }{12}, \frac{\pi }{6}, \ldots , \frac{11\pi }{12}\}\).

2.2 A bank of B-COSFIRE filters

The thickness of the vessels in retinal fundus images may vary from 1 pixel to a number of pixels that depends on the resolution of the input images. For this reason, we configure a large bank of B-COSFIRE filters consisting of 21 vessel detectors \(\{S_1, \ldots S_{21}\}\) and 21 vessel-ending detectors \(\{S_{22},\ldots S_{42}\}\), which are selective for vessels of different thickness.

In Fig. 3, we show the response images of the B-COSFIRE filters that are selective for (left column) vessels and (right column) vessel-endings. In particular, we configure filters selective for thin (second row), medium (third row) and thick (forth row) vessels. It is noticeable how the large-scale filters are selective for thick vessels (Fig. 3g, h) and are robust to background noise but achieve low responses along thin vessels. Conversely, the small-scale vessels (Fig. 3c, d) show higher selectivity for thin vessels but are less robust to background noise. The combination of their responses promises to achieve better delineation performance at various scales [34].

We construct a pixel-wise feature vector \({\mathbf {v}}(x,y)\) for every image location (xy) with the responses of the 42 B-COSFIRE filters in the filterbank, plus the intensity value g(xy) of the green channel in the RGB retinal image:
$$\begin{aligned} {\mathbf {v}}(x,y) = \begin{bmatrix} g(x,y), {\hat{r}}_{1}(x,y), \ldots ,{\hat{r}}_{42}(x,y) \end{bmatrix}^T \end{aligned}$$
where \({\hat{r}}_{i}(x,y)\) is the rotation-tolerant response of a B-COSFIRE filter \(S_{i}\). The inclusion of the intensity value of the green channel is suggested by many existing approaches [12, 28, 31, 33, 34].
Fig. 3

Response images obtained by B-COSFIRE filters that are selective to (left column) vessels and (right column) vessel-endings of different thickness. We consider filters selective for thin (c, d), medium (e, f) and thick (g, h) vessels

2.3 Feature transformation and rescaling

Before classification, we apply the inverse hyperbolic sine transformation function [17] to each element of the feature vector. It reduces the skewness in the data and is defined as:
$$\begin{aligned} f(v_i,\theta ) = \frac{\sinh ^{-1}(\theta v_i)}{\theta } \end{aligned}$$
For large values of \(v_i\) and \(\theta > 0\), the function behaves like a \(\log \) transformation.3 As \(\theta \rightarrow 0\), \(f(v_i,\theta )\rightarrow v_i\). We then compute the Z-score to standardize each of the 43 features. As suggested in [28], we apply the Z-score normalization procedure separately to each image in order to compensate for illumination variation between the images.

2.4 Automatic subset selection of B-COSFIRE filters

The filterbank that we designed in the previous section is overcomplete and might have many redundant filters. We investigate various feature selection approaches to determine the smallest subset of features that maximize the performance of the vessel tree delineation. We use as input the training data that consists of a matrix of size \(N \times 43\), where N corresponds to the number of randomly selected pixels (half of them are vessel pixels, and the other half are non-vessel pixels) from the training images, and the number of columns corresponds to the size of the filterbank plus the green channel.

2.4.1 Entropy score ranking

Entropy characterizes uncertainty about a source of information. The rarer a response in a specific range is the more information it provides when it occurs. We use a filter approach that computes the entropy E of each of the 43 features:
$$\begin{aligned} E= & {} \sum _{i=1}^{n} \sum _{j=1}^cP\left( y_i=j~|~x=\frac{i}{20}\right) \nonumber \\&\log P \left( y_i=j~|~x=\frac{i}{20}\right) \end{aligned}$$
where y is the class label (vessel or non-vessel), c is the number of classes (in this case \(c=2\)), x is a vector of quantized features rounded up to the nearest 0.05 increment and \(n=20\). Before computing the entropy, we first rescale and shift the Z-scored values in the range \(\left[ 0,1 \right] \), such that the minimum value becomes 0 and the maximum value becomes 1.

We rank the 43 features using the reciprocal of their corresponding entropy values and select the highest k ranked features that contribute to the maximum accuracy on the training set.

2.4.2 Genetic algorithm

The nature-inspired genetic algorithms are a family of search heuristics that can be used to solve optimization problems [14, 23]. We use a genetic algorithm to search for the best performing subset of features among the enormous possible combinations. We initialize a population of 400 chromosomes each with 43 random bits. The positions of the one bits indicate the columns (i.e., the green channel and the 42 B-COSFIRE filters) to be considered in the given matrix.
Fig. 4

A bar plot of the relevances of the features on the DRIVE data set

Fig. 5

Sketch of the application phase of the proposed method. The a input retinal image is first b preprocessed. Then, c the responses of the bank of selected B-COSFIRE filters and, possibly, the green channel are used to form a d feature vector. After e transforming and rescaling the features, f a SVM classifier is then used to classify every pixel in the input image and obtain g a response map. h The binary output is obtained by thresholding the SVM probability scores

The fitness function computes the average accuracy in a tenfold cross-validation on the training data with the selected columns. In each fold, we configure an SVM classifier with a linear kernel by using 90 % of the training set and apply it to the remaining 10 %. After every epoch, we sort the chromosomes in descending order of their fitness scores and keep only the top 40 (i.e., 10 %) of the population. We use this elite group of chromosomes to generate 360 offspring chromosomes by a crossover operation to randomly selected pairs of elite chromosomes. Every bit of the newly generated chromosomes has a probability of 10 % to be mutated (i.e., changing the bit from 1 to 0 or from 0 to 1). We run these iterative steps until the elite group of chromosomes stops changing.

Finally, we choose the filters that correspond to the positions of the one bits in the chromosome with the highest fitness score and with the minimum number of one bits.

2.4.3 GMLVQ

The Generalized Matrix Learning Vector Quantization (GMLVQ) [29, 30] computes the pairwise relevances of all features with respect to the classification problem. It generates a full matrix \({\varLambda }\) of relevances that describe the importance of the individual features and pairs of features in the classification task.

We consider the diagonal elements \({\varLambda }_{ii}\) as the ranking (relevant) scores of each feature. The higher the score, the more relevant the corresponding feature is in comparison with the others. In Fig. 4, we show the feature relevances obtained from the training images of the DRIVE data set. In the following, we investigate the selection of the subset of relevant features in two different ways.

Relevance peaks We select only the features that achieve relevance peaks. For instance, from the feature relevances shown in Fig. 4 we select the feature \(\left[ {\hat{r}}_3, {\hat{r}}_8, {\hat{r}}_{10}, {\hat{r}}_{17}, {\hat{r}}_{21}, {\hat{r}}_{24}, {\hat{r}}_{27},\right. \left. {\hat{r}}_{31}, {\hat{r}}_{33}, {\hat{r}}_{36}, {\hat{r}}_{38}, {\hat{r}}_{42} \right] \). It is worth noting that this approach can be used when the feature vector elements are in a systematic order and thus can be compared with their neighboring elements. In our case, the feature vector is constructed by the responses of B-COSFIRE filters whose thickness preference increases systematically, plus the green channel.

Relevance ranking We sort the 43 features in descending order of their relevance scores and select features with the top k relevances. We then determine the value of k that maximizes the accuracy on the training set.

2.5 Classification

We use the selected features to train a SVM classifier with a linear kernel. The SVM classifier is particularly suited for binary classification problems, since it finds an optimal separation hyperplane that maximizes the margin between the classes [16].

2.6 Application phase

In Fig. 5, we depict the architectural scheme of the application phase of the proposed method. First, we preprocess a given retinal fundus image (Fig. 5a, b). We discuss the preprocessing procedure in Sect. 4.1. For each pixel, we construct a feature vector by considering the features selected during the training phase (i.e., possibly the green channel and the responses of a subset of k B-COSFIRE filters) (Fig. 5c, d). Then, we transform and rescale the features and use a SVM classifier to determine the vesselness of each pixel in the input image (Fig. 5e, f). Finally, we compute the binary vessel map by thresholding the output score of the SVM (Fig. 5g, h).

3 Materials

3.1 Data sets

We performed experiments on two data sets of retinal fundus images that are publicly available for benchmarking purpose: DRIVE [33] and STARE [15].

The DRIVE data set is composed of 40 images (of size \(565\times 584\) pixels), divided into a training and a test set of 20 images each. The images in the training set were manually labeled by one human observer, while the images in the test set were labeled by two different observers. For each image in the data set, a binary mask of the field of view (FOV) of the retina is also provided.

The STARE data set consists of 20 retinal fundus images (of size \(700 \times 605\) pixels), 10 of which contain signs of pathology. Each image in the data set was manually labeled by two different human observers.

For both data sets, we consider the manual segmentation provided by the first observer as gold standard and use it as the reference ground truth for the performance evaluation of the algorithms. We use the second set of manually labeled images to compute the performance of the second human observer with respect to the gold standard.

3.2 B-COSFIRE implementation

We used the existing implementation of the B-COSFIRE filtering4 to compute the responses of the involved vessel-selective and vessel-ending-selective filters. Moreover, we provide a new set of MATLAB scripts5 of the proposed supervised delineation technique, including the automatic feature selection.

4 Experiments

4.1 Preprocessing

In our experiments, we considered only the green channel of the RGB retinal images, since it shows the highest contrast between vessels and background [24, 26, 33]. The blue channel has a small dynamic range, while the red channel has low contrast.

We preprocessed the retinal images in the DRIVE and STARE data sets in order to avoid false detection of vessels around the FOV and to further enhance the contrast in the green channel. Due to the high contrast on the border of the FOV of the retina, the B-COSFIRE filters might detect false vessels. We applied the preprocessing step proposed in [31], which aims at dilating the FOV by iteratively enlarging the radius of the region of interest by one pixel at a time. In each iteration, we selected the pixels in the outer border of the FOV and replaced them with the average value of the intensities of the 8-neighbor pixels contained inside the FOV. We iterated this procedure 50 times, as it was sufficient to avoid false detection of lines around the border of the FOV of the retina.

Finally, we applied the contrast-limited adaptive histogram equalization (CLAHE) algorithm [27] in order to enhance the contrast between vessels and background. The CLAHE algorithm improves the local contrast and avoids the over-amplification of the noise in homogeneous regions.

4.2 Evaluation

For the DRIVE data set, we construct the training set by selecting 1000 vessel and 1000 non-vessel pixels from each image of the training set, which correspond to a total of 40,000 feature vectors. The STARE data set does not have separate training and test sets. Thus, we construct the training set by randomly choosing 40,000 pixels from all the 20 images in the data set (1000 vessel pixels and 1000 non-vessel pixels from each image). As suggested in [12, 28], since the size of the selected training set is very small (<0.5 % of the entire data set), we evaluate the performance on the whole set of images.

The output of SVM classifier is continuous (in the range \(\left[ 0,1 \right] \)) and indicates the degree of vesselness of each pixel in a given image. The higher this value, the more likely a pixel is part of a vessel. We thresholded the output of the classifier in order to obtain the binary segmented image. The threshold operation separates the pixels into two categories: vessels and non-vessels.

When comparing the segmented image with the ground truth image, each pixel contributes to the calculation of one of the following measures: A vessel pixel in the segmented image is a true positive (TP) if it is also a vessel pixel in the ground truth, while it is a false positive (FP) if it is a background pixel in the ground truth; a background pixel in the segmented image that is part of the background also in the ground truth image is a true negative (TN); otherwise, it is a false negative (FN). In order to evaluate the performance of the proposed method and compare it with the ones of existing methods, we computed the sensitivity (Se), specificity (Sp), accuracy (Acc) and the Matthews correlation coefficient (MCC), which are defined as follows:
$$\begin{aligned} \mathrm{Acc} = \frac{\mathrm{TP}+\mathrm{TN}}{N},~~ \mathrm{Se} = \frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FN}} ,~~ \mathrm{Sp} = \frac{\mathrm{TN}}{\mathrm{TN}+\mathrm{FP}} \end{aligned}$$
$$\begin{aligned} \hbox {MCC} = \frac{\mathrm{TP}/N-S \times P}{\sqrt{P\times S \times (1 - S) \times (1 - P)}}, \end{aligned}$$
where \(N = \mathrm{TN} + \mathrm{TP} + \mathrm{FN} + \mathrm{FP}\), \(S =(\mathrm{TP}+\mathrm{FN})/N\) and \(P =(\mathrm{TP}+\mathrm{FP})/N\).
Table 1

Comparison of results with different B-COSFIRE approaches on the DRIVE data set


Results on DRIVE data set







# Features

Processing time (s)

Unsupervised B-COSFIRE [6]









   No feature selection








   Genetic algorithm








   GMLVQ-relevance peaks








   GMLVQ-relevance ranking








   Class entropy ranking








The highest value for each performance metric is reported in bold font

Table 2

Comparison of results with different B-COSFIRE approaches on the STARE data set


STARE data set







# Features

Processing time (s)

Unsupervised B-COSFIRE [6]









   No feature selection








   Genetic algorithm








   GMLVQ-relevance peaks








   GMLVQ-relevance ranking








   Class entropy ranking








The highest value for each performance metric is reported in bold font

For a binary classification problem, as in our case, the computation of the accuracy is influenced by the cardinality of the two classes. In the problem at hand, the number of non-vessel pixels is roughly seven times more than the number of vessel pixels. Therefore, the accuracy is biased by the number of true negative pixels. For this reason, we computed the MCC, which quantifies the quality of a binary classifier even when the two classes are imbalanced. It achieves a value of 1 for a perfect classification and a value of \(-1\) for a completely wrong classification. The value 0 indicates a random guess classifier.

Besides the above-mentioned measurements, we also generated a receiver operating characteristics (ROC) curve and computed its underlying area (AUC). The ROC curve is a plot that shows the trade-off between the rate of false positives and the rate of true positives as the classification threshold varies. The higher the AUC the better the performance of the classification system.

4.3 Results

For a given test image and a threshold value t we computed the MCC. Then, we computed the average MCC across all test images and obtained a single performance measure \({\overline{MCC}}\) for every threshold t. We vary the threshold from 0 to 1 in steps of 0.01. Finally, we choose the threshold \(t^*\) for a given data set that provides the maximum value of \({\overline{MCC}}\).

In Tables 1 and 2 we report the results that we achieved with the proposed supervised approach on the DRIVE and STARE data sets, respectively. In order to evaluate the effects of the different feature selection methods, we used as baseline the results (\(\hbox {MCC}=0.7492\) for the DRIVE data set and \(\hbox {MCC} = 0.7537\) for the STARE data set) that we obtained by a linear SVM classifier trained with the responses of the bank of 42 B-COSFIRE filters plus the intensity value in the green channel. This naïve supervised approach achieved better performance than the unsupervised B-COSFIRE filter approach [6], whose results are also reported in the two tables. The use of machine learning or information theory techniques that compute a score of the importance of each feature gives the possibility to select the best performing group of features and, at the same time, to reduce the overall processing time.
Fig. 6

The plots in a and b show the MCC as a function of the top performing features for the DRIVE and STARE data sets, respectively

Fig. 7

ROC curves achieved on a the DRIVE and b the STARE data sets by the selection methods based upon GMLVQ relevance peaks (solid line) and a genetic algorithm (dashed line), and by the unsupervised B-COSFIRE filters (dotted line)

For the methods based on feature ranking, namely GMLVQ and class entropy, we report the results achieved when considering a set of the most k top-scored features. We chose the value of k which provided the highest accuracy on the training set. With this method, we selected 11 features for both DRIVE and STARE data sets by using GMLVQ with relevance ranking. On the other hand, when we ranked the features on the basis of their class entropy score we selected 16 features for DRIVE and 19 for STARE. In Fig. 6a, b, we show how the MCC, on the DRIVE and STARE data sets, is sensitive to an increasing number of features involved in the classification process. We only show the most discriminant 19 features since the performance improvement achieved by further features is negligible. Moreover, the required processing time becomes too high and comparable to the one required to compute the full set of features. We performed experiments on a machine equipped with a 1.8 GHz Intel i7 processor with 4GB of RAM. In Fig. 7, we show the ROC curves obtained by the GMLVQ with relevance peaks (solid line) and by the genetic algorithm (dashed line) features selection methods in comparison with the one of the unsupervised B-COSFIRE filters (dotted line). A substantial improvement of performance is evident for the STARE data set.
Table 3

Comparison of the performance results achieved by the proposed approach with the ones achieved by other existing methods

Performance comparison














   B-COSFIRE [6]









   Hoover et al. [15]





   Mendonca and Campilho. [24]







   Martinez-Perez et al. [22]







   Al-Rawi et al. [3]





   Ricci and Perfetti [28]





   Lam et al. [18]






   Staal et al. [33]





   Soares et al. [31]









   Ricci and Perfetti [28]





   Marin et al. [21]









   Fraz et al. [12]









   Proposed method









4.4 Statistical analysis

We used the right-tailed paired \(t-test\) statistic to quantify the performance improvement that we achieved with the proposed supervised method with respect to the usupervised B-COSFIRE approach. For each data set and for each method we used, the MCC values computed from all test images as explained in Sect. 4.3.

A significant improvement of the results is confirmed for the feature selection method based on GMLVQ with relevance peaks (DRIVE: \(t(19)= 1.33, p < 0.1\); STARE: \(t(19) = 2.589, p < 0.01\)) and for the approach based on a genetic algorithm (DRIVE: \(t(19) = 1.13, p < 0.15\); STARE: \(t(19) = 2.589, p < 0.01\)). On the contrary, the feature selection methods based on ranking the features by their relevance or their class entropy score do not significantly improve the performance results.

For both data sets, the GMLVQ with relevance peaks and the genetic algorithm provide the best performance results. In fact, there is no statistical difference between the two methods.

4.5 Comparison with existing methods

With the proposed approach we achieve better results than many existing methods, which we report in Table 3. The direct evaluation of the results from Table 3 is not trivial. Thus, for comparison purposes, we move along the ROC curves in Fig. 7 and for the same specificity values achieved by other methods, we compare sensitivity values that we achieve to theirs. We refer to the performance achieved by the GMLVQ with relevance peaks feature selection. For the DRIVE data set and for the same specificity reported in [31] (\(\hbox {Sp}=0.9782\)) and in [21] (\(\hbox {Sp}=0.9801\)), we achieve better sensitivity: 0.7425 and 0.7183, respectively. For the same specificity reported in [12] (\(\hbox {Sp}=9807\)), we achieve a lower value of the sensitivity (\(\hbox {Se}= 0.7181\)). Similarly, for the STARE data set and for the same specificity values reported in [31], [21] and [12] (\(\hbox {Sp}=0.9747\), \(\hbox {Sp}=0.9819\) and \(\hbox {Sp}=0.9763\)) we achieve better sensitivity: 0.7806, 0.7316 and 07697, respectively.

5 Discussion

The main contribution of this work is a supervised method for vessels delineation based on the automatic selection of a subset of B-COSFIRE filters selective for vessels of different thickness. We applied various feature selection techniques to a bank of B-COSFIRE filters and compared their performance. The versatility of the B-COSFIRE filters together with the use of a features selection procedure showed high flexibility and robustness in the task of delineating elongated structures in retinal images. The proposed method can be applied to other applications, such as the quantification of length and width of cracks in walls [25] for earthquake damage estimation or for monitoring the flow of rivers in order to prevent flooding disasters [38].

The versatility of the B-COSFIRE filters lies in their trainable character and thus in being domain independent. They can be automatically configured to be selective for various prototype patterns of interest. In this work, we configured filters on some vessel-like prototype patterns. This avoids the need of manually creating a feature set to describe the pixels in the retinal images, which is an operation that requires skills and knowledge of the specific problem. This is in contrast to other methods that use hand-crafted features and thus domain knowledge. For instance, the features proposed in [12] are specifically designed to deal with particular issues of the retinal fundus images, such as bright and dark lesions or non-uniform illumination of the FOV. A specific B-COSFIRE filter is configured to detect patterns that are equivalent or similar to the prototype pattern used for its configuration. In our case, it detects blood vessels of specific thickness. One may also, however, configure B-COSFIRE filters selective for other kinds of patterns such as bifurcations and crossovers [4, 5] and add them to the filterbank.

Although the difference of the performance achieved by the genetic algorithm and by the GMLVQ with relevance peaks is not statistically significant, the latter method seems more stable as it selects a comparable number of features in the both data set. In fact, it selects a comparable number of features in both data sets. Furthermore, the reduced bank of features allows to improve the classification performance together with a reduction in the required processing time. As a matter of fact, the GMLVQ approach selected a subset of 12 features for the DRIVE data set and 10 features for the STARE data set. The technique based on a genetic algorithm selected a set of 17 features for the DRIVE data set and 7 features for the STARE data set.

For the DRIVE data set, we selected five vessel and seven vessel-ending B-COSFIRE filters.6 The value of the green channel was not relevant for this data set. For the STARE data set, instead, we found that the value of the green channel is important. Thus, we constructed the feature vectors with the intensity value of the green channel plus the responses of four vessel- and three vessel-ending B-COSFIRE filters.7

The output of a genetic algorithm is crisp as the selected features have the same weighting. In contrast, the GMLVQ approach shows higher flexibility since it provides a measure of the relevance (in the range [0, 1]) that each filter has in the classification task. The genetic algorithm, however, evaluates the combined contribution of many features, exploring a larger space of solutions, while the GMLVQ considers only the contribution of two features at a time.

Although the two approaches based on GMLVQ and the one based on a genetic algorithm construct different sets of B-COSFIRE filters, we achieve a statistically significant improvement of the performance results with respect to the unsupervised method. This demonstrates that the proposed B-COSFIRE filterbank is robust to the feature selection approach used. The flexibility and generalization capabilities of the B-COSFIRE filters, together with a feature selection procedure, allow the construction of a system that can adapt to any delineation problem.

The method based on the computation of the class entropy score of each feature and the selection of the k top-ranked features does not improve the performance substantially. In fact, in this approach the features are assumed to be statistically independent and their contribution to the classification task is evaluated singularly. This reduces the effectiveness of the selection procedure since it does not take into account eventual mutual contributions of pairs or groups of features to the classification task.

The application of a single B-COSFIRE filter is very efficient [6]. It takes from 3 to 5 s (on a 1.8 GHz Intel i7 processor with 4GB of RAM) to process an image from the DRIVE and the STARE data sets. The responses of a bank of B-COSFIRE filters are computed independently from each other. Therefore, the computation of such responses can be implemented in a parallel way so as to further optimize the required processing time.

6 Conclusions

The supervised method that we propose for the segmentation of blood vessels in retinal images is versatile and highly effective. The results that we achieve on two public benchmark data sets (DRIVE: \(\hbox {Se} = 0.7777\), \(\hbox {Sp}=0.9702\) and \(\hbox {MCC}=0.7525\); STARE: \(\hbox {Se} = 0.8046\), \(\hbox {Sp}=0.9710\) and \(\hbox {MCC} = 0.7536\)) are higher than many existing methods. The proposed approach couples the generalization capabilities of the B-COSFIRE filter with an automatic procedure (GMLVQ with relevance peaks) that selects the best performing ones. The delineation method that we propose can be employed in any application in which the delineation of elongated structures is required.


  1. 1.

    The standard deviation of the inner Gaussian function is 0.5\(\sigma \).

  2. 2.

    Half-wave rectification is an operation that suppresses (sets to 0) the negative values.

  3. 3.

    The value of \(\theta \) has been experimentally determined on a training set (40,000 feature vectors) and set to 1000 for both the DRIVE and STARE data sets.

  4. 4.
  5. 5.

    The new package of MATLAB scripts can be downloaded from

  6. 6.

    The selected scales for the DRIVE data set are \(\sigma _1=1.6, \sigma _2=2.1, \sigma _3=2.3, \sigma _4=3\) and \(\sigma _5=3.4\) for the vessel-selective filters and \(\sigma _6=1.1, \sigma _7=1.4, \sigma _8=1.8, \sigma _9=2, \sigma _{10}=2.3, \sigma _{11}=2.5\) and \(\sigma _{12}=2.9\) for the vessel-ending-selective filters. We set \(\sigma _0=3\) and \(\alpha = 0.7\) for the vessel-selective filters and \(\sigma _0=2\) and \(\alpha = 0.1\) for the vessel-ending-selective filters.

  7. 7.

    The selected scales for the STARE data set are \(\sigma _1=1.8\), \(\sigma _2=2.2\), \(\sigma _3=3\) and \(\sigma _4=3.7\) for the vessel-selective filters and \(\sigma _5=1.7, \sigma _6=1.\), \(\sigma _7=2.2\), \(\sigma _8=2.5\) and \(\sigma _9=3.2\) for the vessel-ending-selective filters. We set \(\sigma _0=1\) and \(\alpha = 0.5\) for the vessel-selective filters and \(\sigma _0=1\) and \(\alpha = 0.1\) for the vessel-ending-selective filters.


  1. 1.
    Abramoff, M., Garvin, M., Sonka, M.: Retinal imaging and image analysis. Biomed. Eng. IEEE Rev. 3, 169–208 (2010). doi: 10.1109/RBME.2010.2084567 CrossRefGoogle Scholar
  2. 2.
    Abramoff, M.D., Niemeijer, M., Russell, S.R.: Automated detection of diabetic retinopathy: barriers to translation into clinical practice. Expert Rev. Med. Devices 7, 287296 (2010). doi: 10.1586/erd.09.76 CrossRefGoogle Scholar
  3. 3.
    Al-Rawi, M., Qutaishat, M., Arrar, M.: An improved matched filter for blood vessel detection of digital retinal images. Comput. Biol. Med. 37(2), 262–267 (2007). doi: 10.1016/j.compbiomed.2006.03.003 CrossRefGoogle Scholar
  4. 4.
    Azzopardi, G., Petkov, N.: Automatic detection of vascular bifurcations in segmented retinal images using trainable COSFIRE filters. Pattern Recognit. Lett. 34, 922–933 (2013)CrossRefGoogle Scholar
  5. 5.
    Azzopardi, G., Petkov, N.: Trainable COSFIRE filters for keypoint detection and pattern recognition. IEEE Trans. Pattern Anal. Mach. Intell. 35, 490–503 (2013). doi: 10.1109/TPAMI.2012.106 CrossRefGoogle Scholar
  6. 6.
    Azzopardi, G., Strisciuglio, N., Vento, M., Petkov, N.: Trainable COSFIRE filters for vessel delineation with application to retinal images. Med. Image Anal. 19(1), 46–57 (2015). doi: 10.1016/ CrossRefGoogle Scholar
  7. 7.
    Bekkers, E., Duits, R., Berendschot, T., ter Haar Romeny, B.: A multi-orientation analysis approach to retinal vessel tracking. J. Math. Imaging Vis. 49(3), 583–610 (2014). doi: 10.1007/s10851-013-0488-6 zbMATHCrossRefGoogle Scholar
  8. 8.
    Chauduri, S., Chatterjee, S., Katz, N., Nelson, M., Goldbaum, M.: Detection of blood-vessels in retinal images using two-dimensional matched-filters. IEEE Trans. Med Imaging 8(3), 263–269 (1989). doi: 10.1109/42.34715 CrossRefGoogle Scholar
  9. 9.
    Chen, L., Huang, X., Tian, J.: Retinal image registration using topological vascular tree segmentation and bifurcation structures. Biomed. Signal Process. Control 16, 22–31 (2015). doi: 10.1016/j.bspc.2014.10.009 CrossRefGoogle Scholar
  10. 10.
    Chutatape, O., Liu Zheng, Krishnan, S.: Retinal blood vessel detection and tracking by matched Gaussian and Kalman filters. In: Proceedings of the 20th Annual International Conference IEEE Engineering in Medicine and Biology Society (EMBS’98), vol. 17, pp. 3144–3149 (1998)Google Scholar
  11. 11.
    Fang, B., Hsu, W., Lee, M.: Reconstruction of vascular structures in retinal images. In: Proceedings 2003 International Conference on Image Processing (Cat. No.03CH37429), vol. 3, pp. II–157–60. IEEE Signal Process. Soc (2003)Google Scholar
  12. 12.
    Fraz, M., Remagnino, P., Hoppe, A., Uyyanonvara, B., Rudnicka, A., Owen, C., Barman, S.: An ensemble classification-based approach applied to retinal blood vessel segmentation. IEEE Trans. Biomed. Eng. 59(9), 2538–2548 (2012). doi: 10.1109/TBME.2012.2205687 CrossRefGoogle Scholar
  13. 13.
    Frucci, M., Riccio, D., di Baja, G.S., Serino, L.: Severe: segmenting vessels in retina images. Pattern Recognit. Lett. (2015). doi: 10.1016/j.patrec.2015.07.002 Google Scholar
  14. 14.
    Goldberg, D.E.: Genetic Algorithms in Search, Optimization and Machine Learning, 1st edn. Addison-Wesley Longman, Boston (1989)zbMATHGoogle Scholar
  15. 15.
    Hoover, A., Kouznetsova, V., Goldbaum, M.: Locating blood vessels in retinal images by piecewise threshold probing of a matched filter response. IEEE Trans. Med. Imaging 19(3), 203–210 (2000). doi: 10.1109/42.845178 CrossRefGoogle Scholar
  16. 16.
    Joachims, T.: Estimating the generalization performance of an SVM efficiently. In: Proceedings of the 17th International Conference on Machine Learning, ICML ’00, pp. 431–438 (2000)Google Scholar
  17. 17.
    Johnson, N.L.: Systems of frequency curves generated by methods of translation. Biometrika 36(1–2), 149–176 (1949). doi: 10.1093/biomet/36.1-2.149 MathSciNetzbMATHCrossRefGoogle Scholar
  18. 18.
    Lam, B., Gao, Y., Liew, A.C.: General retinal vessel segmentation using regularization-based multiconcavity modeling. IEEE Trans. Med Imaging 29(7), 1369–1381 (2010). doi: 10.1109/TMI.2010.2043259 CrossRefGoogle Scholar
  19. 19.
    Liew, G., Wang, J.J., Mitchell, P., Wong, T.: Retinal vascular imaging: a new tool in microvascular disease research. Circ. Cardiovasc. Imaging 1, 156–161 (2008). doi: 10.1161/CIRCIMAGING.108.784876 CrossRefGoogle Scholar
  20. 20.
    Liu, I., Sun, Y.: Recursive tracking of vascular networks in angiograms based on the detection deletion scheme. IEEE Trans. Med. Imaging 12(2), 334–341 (1993). doi: 10.1109/42.232264 CrossRefGoogle Scholar
  21. 21.
    Marin, D., Aquino, A., Emilio Gegundez-Arias, M., Manuel Bravo, J.: A new supervised method for blood vessel segmentation in retinal images by using gray-level and moment invariants-based features. IEEE Trans. Med. Imaging 30(1), 146–158 (2011). doi: 10.1109/TMI.2010.2064333 CrossRefGoogle Scholar
  22. 22.
    Martinez-Pérez, M.E., Hughes, A.D., Thom, S.A., Bharath, A.A., Parker, K.H.: Segmentation of blood vessels from red-free and fluorescein retinal images. Med. Image Anal. 11(1), 47–61 (2007). doi: 10.1016/ CrossRefGoogle Scholar
  23. 23.
    Matou, K., Lep, M., Zeman, J., Ejnoha, M.: Applying genetic algorithms to selected topics commonly encountered in engineering practice. Comput. Methods Appl. Mech. Eng. 190(1314), 1629–1650 (2000). doi: 10.1016/S0045-7825(00)00192-4 CrossRefGoogle Scholar
  24. 24.
    Mendonca, A.M., Campilho, A.: Segmentation of retinal blood vessels by combining the detection of centerlines and morphological reconstruction. IEEE Trans. Med. Imaging 25(9), 1200–1213 (2006). doi: 10.1109/TMI.2006.879955 CrossRefGoogle Scholar
  25. 25.
    Muduli, P., Pati, U.: A novel technique for wall crack detection using image fusion. In: Computer Communication and Informatics (ICCCI), 2013 International Conference on, pp. 1–6 (2013)Google Scholar
  26. 26.
    Niemeijer, M., Staal, J., van Ginneken, B., Loog, M., Abramoff, M.: Comparative study of retinal vessel segmentation methods on a new publicly available database. In: Proceedings of the SPIE—The International Society for Optical Engineering, pp. 648–56 (2004)Google Scholar
  27. 27.
    Pizer, S., Amburn, E., Austin, J., Cromartie, R., Geselowitz, A., Greer, T., Ter Haar Romeny, B., Zimmerman, J., Zuiderveld, K.: Adaptive histogram equalization and its variations. Comput. Vis. Gr. Image Process. 39(3), 355–368 (1987). doi: 10.1016/S0734-189X(87)80186-X
  28. 28.
    Ricci, E., Perfetti, R.: Retinal blood vessel segmentation using line operators and support vector classification. IEEE Trans. Med. Imaging 26(10), 1357–1365 (2007). doi: 10.1109/TMI.2007.898551 CrossRefGoogle Scholar
  29. 29.
    Schneider, P., Biehl, M., Hammer, B.: Adaptive relevance matrices in learning vector quantization. Neural Comput. 21(12), 3532–3561 (2009). doi: 10.1162/neco.2009.11-08-908 MathSciNetzbMATHCrossRefGoogle Scholar
  30. 30.
    Schneider, P., Biehl, M., Hammer, B.: Distance learning in discriminative vector quantization. Neural Comput. 21(10), 2942–2969 (2009). doi: 10.1162/neco.2009.10-08-892 MathSciNetzbMATHCrossRefGoogle Scholar
  31. 31.
    Soares, J.V.B., Leandro, J.J.G., Cesar Jr., R.M., Jelinek, H.F., Cree, M.J.: Retinal vessel segmentation using the 2-D Gabor wavelet and supervised classification. IEEE Trans. Med. Imaging 25(9), 1214–1222 (2006). doi: 10.1109/TMI.2006.879967 CrossRefGoogle Scholar
  32. 32.
    Sree, V., Rao, P.: Diagnosis of ophthalmologic disordersin retinal fundus images. In: ICADIWT, 2014 5th International Conference on the, pp. 131–136 (2014). doi: 10.1109/ICADIWT.2014.6814696
  33. 33.
    Staal, J., Abramoff, M., Niemeijer, M., Viergever, M., van Ginneken, B.: Ridge-based vessel segmentation in color images of the retina. IEEE Trans. Med. Imaging 23(4), 501–509 (2004). doi: 10.1109/TMI.2004.825627 CrossRefGoogle Scholar
  34. 34.
    Strisciuglio, N., Azzopardi, G., Vento, M., Petkov, N.: Multiscale blood vessel delineation using B-COSFIRE filters. In: Computer Analysis of Images and Patterns. Lecture Notes in Computer Science, vol. 9257, pp. 300–312. Springer International Publishing (2015). doi: 10.1007/978-3-319-23117-426
  35. 35.
    Strisciuglio, N., Azzopardi, G., Vento, M., Petkov, N.: Unsupervised delineation of the vessel tree in retinal fundus images. Comput Vis. Med. Image Process. VIPIMAGE 2015, 149–155 (2015)CrossRefGoogle Scholar
  36. 36.
    Zana, F., Klein, J.: Segmentation of vessel-like patterns using mathematical morphology and curvature evaluation. IEEE Trans. Med. Imaging 10(7), 1010–1019 (2001). doi: 10.1109/83.931095 zbMATHGoogle Scholar
  37. 37.
    Zhang, J., Bekkers, E., Abbasi, S., Dashtbozorg, B., ter Haar Romeny, B.: Robust and fast vessel segmentation via gaussian derivatives in orientation scores. In: Image Analysis and Processing ICIAP 2015, Lecture Notes in Computer Science, vol. 9279, pp. 537–547. Springer International Publishing (2015). doi: 10.1007/978-3-319-23231-7_48
  38. 38.
    Zhang, L., Zhang, Y., Wang, M., Li, Y.: Adaptive river segmentation in sar images. J. Electron. (China) 26(4), 438–442 (2009). doi: 10.1007/s11767-007-0113-1 CrossRefGoogle Scholar
  39. 39.
    Zhou, L., Rzeszotarski, M., Singerman, L., Chokreff, J.: The detection and quantification of retinopathy using digital angiograms. IEEE Trans. Med. Imaging 13(4), 619–626 (1994)CrossRefGoogle Scholar
  40. 40.
    Zhu, W.B., Li, B., Tian, L.F., Li, X.X., Chen, Q.L.: Topology adaptive vessel network skeleton extraction with novel medialness measuring function. Comput. Biol. Med. 64, 40–61 (2015). doi: 10.1016/j.compbiomed.2015.06.006 CrossRefGoogle Scholar

Copyright information

© The Author(s) 2016

Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors and Affiliations

  • Nicola Strisciuglio
    • 1
    • 2
    Email author
  • George Azzopardi
    • 1
    • 3
  • Mario Vento
    • 2
  • Nicolai Petkov
    • 1
  1. 1.Johann Bernoulli Institute for Mathematics and Computer ScienceUniversity of GroningenGroningenThe Netherlands
  2. 2.Department of Computer Engineering and Electrical Engineering and Applied MathematicsUniversity of SalernoFiscianoItaly
  3. 3.Intelligent Computer SystemsUniversity of MaltaMsidaMalta

Personalised recommendations