Introduction

Melanoma is the fifth most common malignancy in the United States and has rapidly become one of the leading cancers in the world. Malignant melanoma is the deadliest form of skin cancer and the fastest growing type of skin cancer. In the United States, an estimated 68,720 new cases and 8,441 deaths were expected in 2009 [1]. If it is detected early, melanoma can often be cured with a simple excision. Dermoscopy is the major non-invasive skin imaging technique and is extensively used in the diagnosis of melanoma and other skin lesions. Dermoscopy improves upon simple photography by revealing more of the subsurface structures of the skin, and is now widely used by dermatologists. The contact dermoscopy technique consists of placing a fluid such as mineral oil, water, or alcohol on the skin lesion, which is subsequently inspected using a digital camera with a hand-held dermoscopy attachment such as the Dermlite. The fluid placed on the lesion eliminates surface reflection and renders the cornified layer translucent, allowing better visualization of pigmented structures within the epidermis, the dermoepidermal junction, and the superficial dermis.

For early detection of melanoma, finding the lesion border is the first and key step of diagnosis, since the border structure provides vital information for an accurate assessment. Currently, dermatologists draw borders manually, a procedure that is tedious and time consuming; this is where computer-aided border detection of dermoscopy images comes into the picture. Detecting melanoma early is critical, because melanoma that is not caught early can be fatal. Speed also matters, since there are not enough dermatologists to screen all the images, and physician error increases with rapid evaluation of cases [2]. For these reasons, automated systems would be a significant help for dermatologists. The method proposed in this study is a fully automated system: there is no human intervention prior to or during processing. The purpose of the system is to increase a dermatologist's confidence in his or her decision; the dermatologist, however, always makes the final decision.

Background

Differentiating or partitioning objects from the background or from other objects in an image is called image segmentation. Solutions to the image segmentation problem have widespread applications in various fields, including medical diagnosis and treatment. Many methods have been developed for grayscale and color image segmentation [3–6]. Popular approaches [7] to image segmentation include edge-based methods, threshold techniques [8], neighborhood-based techniques, graph-based methods [9, 10], and cluster-based methods. Edge-based techniques look for discontinuities in the image, whereas neighborhood-based methods examine the similarity among neighboring regions. Threshold methods identify different parts of an image from the peaks and valleys of 1D or 3D (RGB) histograms. There are also numerous graph-based image segmentation approaches in the literature. Shi et al. [9, 10] treated segmentation as a graph partitioning problem and proposed a novel unbiased measure for segregating subgroups of a graph, known as the Normalized Cut criterion. More recently, Felzenszwalb et al. [11] developed another segmentation technique that defines a predicate for the existence of boundaries between regions, using graph-based representations of images. In this study, however, we focus on cluster-based segmentation methods, in which individual image pixels are treated as general data samples and a correspondence is assumed between homogeneous image regions and clusters in the spectral domain.

Dermoscopy involves optical magnification of the region of interest, which makes subsurface structures more easily visible than in conventional macroscopic images [12]. This in turn improves screening performance and provides greater differentiation between difficult lesions such as pigmented Spitz nevi and small, clinically equivocal lesions [13]. However, it has also been demonstrated that dermoscopy may actually lower diagnostic accuracy in the hands of inexperienced dermatologists [14]. Therefore, novel computerized image understanding frameworks are needed to minimize the diagnostic errors that result from the difficulty and subjectivity of visual interpretation [15, 16].

For melanoma investigation, delineation of the region of interest is the key step in the computerized analysis of skin lesion images, for several reasons. First of all, the border structure provides important information for accurate diagnosis: asymmetry, border irregularity, and abrupt border cutoff are a few of the many clinical features computed from the lesion border. Furthermore, the extraction of other important clinical indicators such as atypical pigment networks, globules, and blue-white veil areas critically depends on border detection [17]. The blue-white veil is described as an irregular area of blended blue pigment with a ground-glass (white) haze, as if the image were out of focus.

Automated border detection is usually the first stage in the analysis of dermoscopy images [16]. Many factors make automated border detection challenging, e.g., low contrast between the lesion and the surrounding skin, fuzzy and irregular lesion borders, and artifacts such as air bubbles, blood vessels, hairs, and black frames, to name a few [17]. According to Celebi et al. [17], automated border detection can be divided into four steps: pre-processing, segmentation, post-processing, and evaluation. The pre-processing step involves color space transformations [18, 19], contrast enhancement [20, 21], and artifact removal [22, 23, 24–28]. The segmentation step partitions the image into disjoint regions [23, 28, 29]. The post-processing step is used to obtain the lesion border [16, 30]. The evaluation step involves dermatologists' assessment of the border detection results.
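To make these four steps concrete, the following Python sketch outlines one possible realization of such a pipeline. The specific operations (luminance conversion, median filtering, global thresholding, largest-component selection) are illustrative stand-ins for the families of methods cited above, not the particular techniques used in those works.

```python
import numpy as np
from scipy import ndimage

def detect_lesion_border(rgb):
    """Illustrative four-stage border detection pipeline (sketch only)."""
    # 1. Pre-processing: luminance transform plus median filtering as a
    #    stand-in for contrast enhancement / artifact (hair, bubble) removal.
    gray = rgb.astype(float).mean(axis=2)
    gray = ndimage.median_filter(gray, size=5)
    # 2. Segmentation: simple global threshold (lesions are darker than skin).
    lesion = gray < gray.mean()
    # 3. Post-processing: keep the largest connected component and take its
    #    outline as the lesion border.
    labels, n = ndimage.label(lesion)
    if n == 0:
        return np.zeros_like(lesion)
    sizes = ndimage.sum(lesion, labels, range(1, n + 1))
    largest = labels == (np.argmax(sizes) + 1)
    border = largest ^ ndimage.binary_erosion(largest)
    # 4. Evaluation against dermatologist-drawn borders is done separately.
    return border
```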

Regarding cluster boundaries, Lee and Castro [31] introduced a polygonization algorithm based on the boundaries of resulting point clusters. More recently, Nosovskiy et al. [32] used a different theoretical approach to finding cluster boundaries in order to infer accurate boundaries between close neighboring clusters. These two works principally study the boundaries of finalized data groups (clusters). Schmid et al. [23] proposed an algorithm based on color clustering. First, a two-dimensional histogram is calculated from the first two principal components of the CIE L*u*v* color space. The histogram is then smoothed and initial cluster centers are obtained from its peaks using a perceptron classifier. At the final step, the lesion image is segmented.

In this study, we use two clustering algorithms for computer-aided border detection, density-based spatial clustering of applications with noise (DBSCAN) [33] and multi-level fuzzy c-means (FCM) clustering, and compare their performance on dermoscopy images. In the context of dermoscopic images, clustering amounts to deciding whether or not each pixel in an image belongs to the skin lesion. Automatic border detection makes the dermatologist's tedious manual border drawing procedure faster and easier.

DBSCAN

With the aim of separating the background from the skin lesion to target possible melanoma, we cluster the pixels of thresholded images using DBSCAN. The algorithm takes a binary (segmented) image and delineates only the significantly important regions by clustering. The expected outcome of this framework is the desired boundary of the lesion in a dermoscopy image.

Technically, it is appropriate to tailor a density-based algorithm in which the cluster definition guarantees that the number of positive pixels is equal to or greater than a minimum number of pixels (MinPxl) in a certain neighborhood of core points. A core point is a pixel whose neighborhood of a given radius (Eps) contains at least a minimum number of positive pixels (MinPxl), i.e., the density in the neighborhood has to exceed the pre-defined threshold (MinPxl). The neighborhood is determined by the choice of a distance function for two pixels p and q, denoted dist(p, q). For instance, when the Manhattan distance is used in 2D space, the shape of the neighborhood is rectangular. Note that DBSCAN works with any distance function, so an appropriate function can be designed for other specific applications. DBSCAN is effective in discovering clusters of arbitrary shapes and has been used successfully on synthetic datasets as well as earth science and protein data. Theoretical details of DBSCAN are given in [33]. Once the two parameters Eps and MinPxl are defined, DBSCAN starts to cluster data points (pixels) from an arbitrary point q, as illustrated in Figure 1.

Figure 1

Direct density reachable (left) and density reachable property of DBSCAN (right)

Let I be a subimage of dimension N × N. For a pixel p, let p.x and p.y denote its position, where the top-left corner of I is (0, 0), and let c_xy represent the color at (p.x, p.y). The Eps-neighborhood of a pixel p, denoted N_Eps(p), is defined as N_Eps(p) = {q ∈ I | dist(p, q) ≤ Eps}, where dist is the Euclidean distance. Two kinds of pixels can be found in a cluster: 1) pixels in the interior of the cluster (core pixels) and 2) pixels on the border of the cluster (border pixels). As expected, a neighborhood query for a border pixel returns notably fewer points than a neighborhood query for a core pixel. Thus, in order to include all points belonging to the same segment, we would have to set the minimum number of pixels (MinPxl) to a comparatively low value. This value, however, would not be characteristic for the respective cluster, particularly in the presence of negative (non-cluster) pixels. Therefore, we require that for every pixel p in a cluster C there is a pixel q in C such that p is inside the Eps-neighborhood of q and N_Eps(q) contains at least MinPxl pixels: |N_Eps(q)| ≥ MinPxl and dist(p, q) ≤ Eps. A pixel p is called density-reachable from a pixel q if there is a chain of pixels p_1, p_2, ..., p_n with p_1 = q and p_n = p such that each p_{i+1} lies in the Eps-neighborhood of the core pixel p_i. This is illustrated in Figure 1. A cluster C (segment) in the image is a non-empty subset of pixels given as:

C = {q ∈ I | q is density-reachable from a pixel p with |N_Eps(p)| ≥ MinPxl}.

DBSCAN centers around the key idea that, to form a new cluster or to grow an existing one, the Eps-neighborhood of a point must contain at least a minimum number of points (MinPxl).
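As a concrete illustration of the Eps-neighborhood and the core-pixel condition, the following is a minimal Python sketch assuming the positive pixels are given as a boolean NumPy mask; it is not the authors' implementation.

```python
import numpy as np

def eps_neighborhood(mask, p, eps):
    """N_Eps(p) restricted to positive pixels: all positive pixels q with
    Euclidean dist(p, q) <= eps, where p is a (row, col) coordinate."""
    ys, xs = np.nonzero(mask)
    d = np.hypot(ys - p[0], xs - p[1])
    keep = d <= eps
    return [(int(y), int(x)) for y, x in zip(ys[keep], xs[keep])]

def is_core_pixel(mask, p, eps, min_pxl):
    """Core-pixel condition: |N_Eps(p)| >= MinPxl."""
    return len(eps_neighborhood(mask, p, eps)) >= min_pxl
```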

Algorithm 1 DBSCAN

DBSCAN(SubImage, Eps, MinPxl)
  ClusterId := nextId(NOISE);
  FOR i FROM 1 TO SubImage.height DO
    FOR j FROM 1 TO SubImage.width DO
      Point := SubImage.get(i, j);
      IF Point.Cid = UNCLASSIFIED AND Point.positive() = TRUE THEN
        IF ExpandCluster(SubImage, Point, ClusterId, Eps, MinPxl) THEN
          ClusterId := nextId(ClusterId)
        END IF;
      END IF;
    END FOR;
  END FOR;
END DBSCAN;

Algorithm 1 summarizes DBSCAN for image segmentation. Once the two parameters Eps and MinPxl are defined, DBSCAN starts to cluster data points from an arbitrary point q. It begins by finding the neighborhood of point q, i.e., all points that are directly density-reachable from q. This neighborhood search is called a region query. For an image, we start with the top-left pixel as the first point in the dataset (subimage); any arbitrary pixel could be chosen for the first iteration. We look for the first pixel satisfying the core-pixel condition as a starting (seed) point. If the neighborhood is sparsely populated, i.e., it has fewer than MinPxl points, then point q is labeled as noise. Otherwise, a cluster is initiated and all points in the neighborhood of q are marked with the new cluster's ID. Next, the neighborhoods of all of q's neighbors are examined iteratively to check whether they can be added to the cluster. If a cluster cannot be expanded any further, DBSCAN chooses another arbitrary unlabeled point and repeats the process to form another cluster. This procedure is iterated until all data points in the dataset have been labeled either as noise or with a cluster ID. Figure 2 illustrates an example of cluster expansion.
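For concreteness, the following Python sketch renders Algorithm 1 and the expansion procedure described above on a binary mask. It assumes Euclidean distances and a brute-force region query (the original DBSCAN uses a spatial index for efficiency), so it is an illustration rather than the authors' implementation.

```python
import numpy as np

NOISE, UNCLASSIFIED = -1, 0

def region_query(mask, p, eps):
    """All positive pixels within Euclidean distance eps of pixel p (row, col)."""
    ys, xs = np.nonzero(mask)
    d = np.hypot(ys - p[0], xs - p[1])
    keep = d <= eps
    return [(int(y), int(x)) for y, x in zip(ys[keep], xs[keep])]

def dbscan(mask, eps, min_pxl):
    """Label the positive pixels of a binary image with cluster IDs (1, 2, ...).
    Positive pixels in sparse neighborhoods are labelled NOISE (-1)."""
    labels = np.full(mask.shape, UNCLASSIFIED, dtype=int)
    cluster_id = 0
    for p in map(tuple, np.argwhere(mask)):
        if labels[p] != UNCLASSIFIED:
            continue
        seeds = region_query(mask, p, eps)
        if len(seeds) < min_pxl:               # p is not a core pixel
            labels[p] = NOISE
            continue
        cluster_id += 1                        # start a new cluster and expand it
        for q in seeds:
            labels[q] = cluster_id
        while seeds:
            q = seeds.pop()
            neighbours = region_query(mask, q, eps)
            if len(neighbours) >= min_pxl:     # q is a core pixel: grow the cluster
                for r in neighbours:
                    if labels[r] == UNCLASSIFIED:
                        seeds.append(r)        # examine r's neighborhood later
                        labels[r] = cluster_id
                    elif labels[r] == NOISE:
                        labels[r] = cluster_id # former noise becomes a border pixel
    return labels
```

With the parameter values reported later in the Experiments section, a call would look like labels = dbscan(binary_mask, eps=5, min_pxl=60).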

Figure 2

Example of cluster expansion: the new points (green, circled) expand the cluster.

Fuzzy c-means clustering

Clustering, a major area of study within unsupervised learning, deals with recognizing meaningful groups of similar items. Drawing on fuzzy logic, fuzzy clustering assigns each point a degree of belonging to each cluster, instead of assigning it to exactly one cluster.

In fuzzy event modeling, the pixel colors in a dermoscopy image can be viewed as a probability space in which pixels of certain colors can belong partially to the background class and/or to the skin lesion. The main advantage of this approach is that it does not require a priori knowledge about the number of objects in the image.

The Fuzzy C-Means (FCM) clustering algorithm [34, 35] is one of the most popular fuzzy clustering algorithms. FCM is based on minimization of the objective function F_m(u, c) [35]:

F_m(u, c) = Σ_{i=1..N} Σ_{j=1..c} u_ij^m d^2(x_i, c_j)

FCM computes the memberships u_ij and the cluster centers c_j by:

u_ij = 1 / Σ_{k=1..c} ( d(x_i, c_j) / d(x_i, c_k) )^(2/(m-1))    (1)

c_j = ( Σ_{i=1..N} u_ij^m x_i ) / ( Σ_{i=1..N} u_ij^m )    (2)

where m, the fuzzification factor (a weighting exponent on each fuzzy membership), is any real number greater than 1; u_ij is the degree of membership of x_i in cluster j; x_i is the i-th d-dimensional measured data point; c_j is the d-dimensional center of cluster j; d^2(x_i, c_j) = ||x_i − c_j||^2 is a distance measure between data point x_i and cluster center c_j; and ||·|| is any norm expressing the similarity between the measured data and a center.

The FCM algorithm involves the following steps (a minimal code sketch is given after the list):

1. Set values for c and m.

2. Initialize the membership matrix U = [u_ij] as U(0) (|i| = number of members, |j| = number of clusters).

3. At step k: calculate the centroid of each cluster through equation (2) if k ≠ 0 (if k = 0, initialize the centroid locations randomly).

4. For each member, calculate the membership degree by equation (1) and store it in U(k).

5. If the difference between U(k) and U(k+1) is less than a certain threshold, then stop; otherwise, return to step 3.
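These steps can be written compactly with NumPy. The sketch below follows equations (1) and (2) with randomly chosen initial centroids (step 3, k = 0); it is an illustration under those assumptions rather than the exact implementation used in this study.

```python
import numpy as np

def fcm(X, c, m=2.0, tol=1e-4, max_iter=100, seed=0):
    """Fuzzy c-means on data X (n_samples x n_features).
    Returns the membership matrix U (n_samples x c) and centroids (c x n_features)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    centroids = X[rng.choice(n, size=c, replace=False)]        # step 3, k = 0
    U = np.zeros((n, c))
    for _ in range(max_iter):
        # Equation (1): membership degrees from distances to the centroids.
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)                                   # avoid division by zero
        U_new = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0)), axis=2)
        # Equation (2): centroids as membership-weighted means.
        w = U_new ** m
        centroids = (w.T @ X) / w.sum(axis=0)[:, None]
        # Step 5: stop when the memberships change less than the tolerance.
        if np.linalg.norm(U_new - U) < tol:
            U = U_new
            break
        U = U_new
    return U, centroids
```

For a dermoscopy image, X could be the per-pixel color vectors, e.g. X = img.reshape(-1, 3).astype(float), after which each pixel is assigned to the cluster with the highest membership.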

In FCM, the number of classes (c in equation (1)) is a user input. In our experiments we also varied a threshold (T) on the number of unclassified data points, testing the values 30, 40, 50, 60, and 70. Since the number of classes is a user input in FCM, there is a risk of over-segmentation: for instance, when the number of segments in a skin image is 3 and we force FCM to find 6 clusters, FCM over-segments the image. This was one of the principal challenges we encountered with FCM. We therefore ran FCM for different numbers of clusters and different threshold values and found that, with five initial clusters and a threshold value of 30, FCM gave good segmentation accuracy; we used these values in all of our experiments. Moreover, in all of our experiments the fuzzification factor m was set to 2. Figure 3 shows how the area detected by FCM (red region) changes with the threshold.

Figure 3

Overlay images of FCM with different threshold values: a) 30, b) 40, c) 50, d) 60, e) 70.

Experiments and results

The proposed methods were tested on a set of 100 dermoscopy images obtained from the EDRA Interactive Atlas of Dermoscopy [12]. These are 24-bit RGB color images with dimensions ranging from 577 × 397 pixels to 1921 × 1285 pixels. The benign lesions include nevocellular nevi and dysplastic nevi. The distance function used is the Euclidean distance between pixels p and q, given as dist(p, q) = sqrt((p.x − q.x)^2 + (p.y − q.y)^2), where p.x and p.y denote the column and row position of pixel p with respect to the top-left corner (0, 0) of the image. We ran DBSCAN on each image with an Eps of 5 and a MinPxl of 60.

We evaluated the border detection errors of DBSCAN and FCM by comparing our results with physician-drawn boundaries as the ground truth. Manual borders were obtained by selecting a number of points on the lesion border, connecting these points by a second-order B-spline, and finally filling the resulting closed curve [22]. Using the dermatologist-determined borders, the automatic borders obtained from DBSCAN and FCM are compared using three quantitative metrics: border error, precision, and recall. Border error was developed by Hance et al. [18] and is currently the most important metric for assessing the quality of an automatic border detection algorithm; it is given by:

BorderError = Area(AutomaticBorder ⊕ ManualBorder) / Area(ManualBorder) × 100%,

where AutomaticBorder is the binary image obtained from DBSCAN or FCM and ManualBorder is the binary image obtained from a dermatologist (see Figure 4, right side). The exclusive OR operator, ⊕, essentially emphasizes the disagreement between the target (ManualBorder) and predicted (AutomaticBorder) regions. In information retrieval terminology, the numerator of the border error is the sum of the False Positives (FP) and False Negatives (FN), and the denominator is the sum of the True Positives (TP) and False Negatives (FN). An illustrative example is given in Figure 5.

Figure 4

An exemplary dermoscopy image (left) and the corresponding dermatologist-drawn border (right)

Figure 5

Illustration of components used in accuracy and error quantification.

Regarding the image in Figure 5, assume that the red line is drawn by a dermatologist and the blue line is the automated one. TP indicates the lesion region correctly found automatically. Similarly, TN is the healthy region (background) on which both the manual and the computer assessment agree. FN and FP label the missed lesion regions and the erroneously positive regions, respectively. In addition to the border error, we also report precision (positive predictive value) and recall (sensitivity) for each experimental image, in Table 1 for DBSCAN and Table 2 for FCM. Precision and recall are defined as Precision = TP / (TP + FP) and Recall = TP / (TP + FN), respectively.
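The three metrics can be computed directly from the two binary masks; the following is a minimal sketch assuming boolean NumPy arrays of equal size, not the authors' code.

```python
import numpy as np

def border_metrics(automatic, manual):
    """Border error (%), precision, and recall from two binary masks."""
    tp = np.logical_and(automatic, manual).sum()
    fp = np.logical_and(automatic, ~manual).sum()
    fn = np.logical_and(~automatic, manual).sum()
    # Area(Automatic XOR Manual) / Area(Manual) = (FP + FN) / (TP + FN)
    border_error = 100.0 * (fp + fn) / (tp + fn)
    precision = tp / (tp + fp)   # positive predictive value
    recall = tp / (tp + fn)      # sensitivity
    return border_error, precision, recall
```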

Table 1 DBSCAN border error, precision, and recall measures for each image in the dataset
Table 2 FCM border error, precision, and recall measures for each image in the dataset

Note that all of these definitions operate on the number of pixels in the corresponding regions; analogously, the Area() function returns the number of active pixels in a binary image. Table 1 gives the border error, precision, and recall rates generated by DBSCAN for each image, whereas Table 2 gives those generated by FCM. It can be seen that the results vary significantly across the images.

Figure 4 shows an exemplary dermoscopy image, diagnosed as melanoma, together with its corresponding dermatologist-drawn border. Figure 6 shows the DBSCAN-generated result for the same image in red, overlaid on top of the dermatologist-drawn border image in black. As seen from the figure, hair is detected as a false positive. Figure 3 shows the results generated by FCM with different threshold values.

Figure 6

Overlay image of DBSCAN.

For example, for the melanoma image given in Figure 4, FCM's precision, recall, and border error rates are 99.4%, 75.4%, and 100.4%, respectively, whereas DBSCAN's precision, recall, and border error rates for the same image are 94%, 84%, and 2.2%, respectively. Tables 1 and 2 list the results generated by DBSCAN and FCM for the 100-image dataset.

Since the most important metric for evaluating the performance of a lesion detection algorithm is the border error, the border errors of DBSCAN and FCM are plotted in Figure 7; the X-axis shows the image IDs in random order. As seen from Figure 7, DBSCAN outperforms FCM for lesion border detection on dermoscopy images: the overall average border error is 6.94% for DBSCAN, whereas it is 100% for FCM. As for recall and precision, DBSCAN averaged 76.66% and 99.26%, while FCM averaged 55% and 100%, respectively.

Figure 7

Border errors generated by DBSCAN (red) and FCM (green)

Automatically drawn boundaries are usually found at the more intense regions of a lesion (see Figures 8a, 8b, 8c, 8e, 8f), giving promising assessments with DBSCAN. In Figure 8(d), DBSCAN also marked outer regions; the gradient region between the blue and red boundaries appears to be a major problem for DBSCAN. We believe that, even though inter-dermatologist agreement on manual borders is not perfect, most dermatologists would draw borders approximately at the red borders shown in the images of Figure 8, because the reddish area just outside the obvious tumor border is part of the lesion.

Figure 8

Sample images showing the assessments of the dermatologist (red) and of the automated frameworks introduced in this study, DBSCAN (blue) and FCM (green).

We also made a rough comparison of DBSCAN with the prior state-of-the-art lesion border detection methods proposed by Celebi et al. in 2008 [22] and 2009 [17]. The comparison showed that the mean border error of DBSCAN (6.94%) is clearly lower than their reported results. However, we cannot make an image-by-image comparison, since they used a subset (90 images) of the 100-image dermoscopy dataset and their image IDs may differ from ours even for the same image. Therefore, for now, the mean error rate is the only indication we have that DBSCAN performs better than the studies reported in [17] and [22].

Conclusion

In this study, we introduced two approaches for automatic detection of skin lesions. First, a fast density-based algorithm, DBSCAN, was introduced for dermoscopy imaging; second, FCM was used for lesion border detection. The assessments obtained from both methods were quantitatively analyzed using three accuracy measures: border error, precision, and recall. Along with a low border error and high precision and recall, the visual results showed that DBSCAN effectively delineated the targeted lesions and is promising, whereas FCM performed poorly, especially on the border error metric. As a next step, we will focus in more detail on intra-observer variability and post-assessment during the performance analysis of these intelligent systems. Additionally, the performance of DBSCAN will be evaluated with different polygon-unioning algorithms. In terms of border error, we plan to develop models that are more sensitive to melanoma lesions. A thresholding method that is well integrated with the clustering rationale, such as the one described in [36], will be preferred in the future because of the unexpected difference between the precision and recall rates.