Microcalcification Segmentation from Mammograms: A Morphological Approach

This publication presents a computer method for segmenting microcalcifications in mammograms. It makes use of morphological transformations and is composed of two parts. The first part detects microcalcifications morphologically, thus allowing the approximate area of their occurrence to be determined, the contrast to be improved, and noise to be reduced in the mammograms. In the second part, a watershed segmentation of microcalcifications is carried out. This study was carried out on a test set containing 200 ROIs 512 × 512 pixels in size, taken from mammograms from the Digital Database for Screening Mammography (DDSM), including 100 cases showing malignant lesions and 100 cases showing benign ones. The experiments carried out yielded the following average values of the measured indices: 80.5% (similarity index), 75.7% (overlap fraction), 70.8% (overlap value), and 19.8% (extra fraction). The average time of executing all steps of the methods used for a single ROI amounted to 0.83 s.


Introduction
In mammography imaging, the presence of microcalcifications, i.e., small deposits of calcium in the breast, is the primary indicator of breast cancer. However, not all microcalcifications are proof of malignancy and their distribution within the breast can be used to determine whether the clusters contain benign lesions or constitute a threat indicating a malignancy. Microcalcifications, examples of which are presented in Figs. 1a and 2a and d, appear as small bright spots on mammograms. Unfortunately, the correct detection of microcalcifications in mammograms can often be very difficult. Breasts contain variable quantities of glandular, fatty, and connective tissues, and if there is a lot of glandular tissue, the mammograms are very bright, which makes small microcalcifications poorly visible [1]. If a physician has to examine numerous series of mammograms, their visual assessment capacity is greatly reduced. Consequently, computer-aided diagnosis (CAD) is being developed to make the diagnostic process easier for radiologists [2][3][4][5][6][7]. The standard functions of CAD systems comprise segmentation [8][9][10][11], feature extraction [12][13][14][15], and classification [5,16,17,18,19] to determine whether lesions are present.
Although the improvement of each of the listed functions raises the capacity of the system, segmentation can be considered the most significant, as the precise segmentation of lesions impacts the extraction of features and the classification. Microcalcifications have been segmented using several techniques, such as morphological filters [1,20,21,22,23], machine learning [11,24], the wavelet transform [25], and a method using normalized Tsallis entropy and fuzzy sets [10]. The most recent research based on machine learning [24], the wavelet transform [25], and active contours [8,9] demonstrates that microcalcification segmentation is highly significant, and the researchers report good results for the methods proposed.
Chen et al. [24] analyzed the topology/connectivity of individual microcalcifications inside a cluster using multiscale morphology. In [24], microcalcifications were segmented using a knowledge-based approach [11] with the application of machine learning methods such as the pixel-based boosting classifier, which automatically selects the most salient features of microcalcifications. Chen et al. [24] report high classification accuracies (up to 96%) and good ROC (receiver operating characteristic) results.
Batchelder et al. [25] proposed the 2D wavelet transform modulus maxima (WTMM) method to detect microcalcifications in mammograms. Then, fractal geometry was used to distinguish benign and malignant microcalcification clusters, and in particular, a "fractal zone" and "Euclidean zones" (nonfractal) were defined. The authors analyzed 118 images of 59 patients. According to their results, the probability that fractal breast lesions are malignant is between 74 and 98% and the probability that Euclidean breast lesions are benign is between 76 and 96%.
Arikidis et al. [8] presented a multiscale active contours (MAC) method which enables single microcalcifications to be segmented. This method requires the seed contour to be initiated manually. In [8], rectangular ROIs 81 × 81 pixels in size were analyzed and experiments were carried out for the DDSM database, with a reported mean value of the area overlap measure of 0.61 ± 0.15.
Duarte et al. [9] presented a geometric active contour (GAC) method for segmenting single microcalcifications. In every instance, the active contour is initiated for a single microcalcification. In [9], the researchers used 1000 rectangular ROIs taken from mammograms from the DDSM database, sized from 20 × 20 pixels to 41 × 41 pixels. Duarte et al. [9] report that they obtained a mean value of the area overlap measure of 0.52 ± 0.20.
The purpose of this publication was to propose solutions for the following:
• Obtaining the best possible segmentation results.
• Achieving a fast operation of all the methods used.
This project uses morphological image transformations [23,26] to detect microcalcifications, and then watershed segmentation [23,26], which makes it possible to extract the shape of microcalcifications just as in [22,27]. In publications by Nieniewski [22,27], user interaction is necessary to indicate the seed point of the watershed by immersion segmentation [28,29]. In this project, the whole segmentation process is automated and does not require combining regions by maximizing average contrast, as was done in publications by Nieniewski [22,27,30]. This study makes use of other gradient transformations of the image undergoing watershed segmentation and fewer interim steps during the extraction of the final shape of microcalcifications. This makes it possible to execute the entire segmentation process in a mean time of 0.83 s.

Image Dataset Used
The research project used 220 ROIs with the constant dimensions of 512 × 512 pixels, in an 8-bit format, obtained from mammograms with the original high resolution (43.5 and 50 μm/pixel, digitized using the following scanners: Howtek 960, Lumisys 200 Laser, and Howtek MultiRad850), which came from the publicly accessible DDSM database [31,32]. Of that number, 110 ROIs contain benign lesions, and the remaining 110 ROIs show malignant cases. The images were selected by a breast radiologist with 10 years of experience and are mainly fatty breast cases from different patients. Each ROI corresponds to a different patient. It should be noted that 20 ROIs, or more exactly 10 benign and 10 malignant ones, were used to determine the necessary parameters allowing the segmentation process to be controlled. The remaining 200 ROIs were used to test the presented segmentation method, and the results obtained are presented in the experimental part of this article. Methods which can automatically mark suspicious-looking anomalies containing potential microcalcifications form a useful functionality of CAD software. Examples of such solutions can be found in the literature [33,34]. However, these methods might mark false-positive regions which contain no microcalcifications. Consequently, the correct identification of microcalcification regions by an experienced breast radiologist is indispensable in analyzing the disease. Solutions presented in this publication concern microcalcification segmentation and make use of rectangular ROIs marked by a radiologist on the source mammogram, with suspicious-looking anomalies located in their centers. This is illustrated in Fig. 1. For every case in the DDSM database, there is a radiological diagnosis available. In addition, for images with microcalcifications, a coded contour identifying the area in which microcalcifications occur, called a ground truth area (GTA), is available.
Each case has four images acquired in the CC and MLO projections for the left and the right breast. CC is the cranio-caudal projection showing the central and medial part of the breast. MLO is the medio-lateral oblique projection. In the experiments, a single view was taken for each patient, namely the CC or the MLO view.

Detecting and Segmenting Microcalcifications in Mammograms
The computer-aided detection and segmentation of microcalcifications from mammograms is a complex process, not least because these microcalcifications are often widely dispersed in the analyzed images, have low contrast, and are difficult to distinguish from their surroundings. These features may make it difficult to segment them correctly. Brief characteristics of microcalcifications can be found in [35]. In the light of the above difficulties, the method presented in this publication comprises two consecutive parts making use of the morphological processing of digital images [22,26,27]: the detection of microcalcifications and their segmentation. Detecting microcalcifications allows the contrast to be increased in the image, the noise to be removed from it, and also some of the false-positive signals of microcalcifications to be removed. The results obtained are treated as a "map" on which the approximate areas in which microcalcifications occur are marked and which will be used as an auxiliary image for a more precise determination of their shape. The next step in working with mammograms is to segment microcalcifications. This is done using the watershed segmentation [22,26,27] to extract microcalcification shapes more accurately. Knowing the shape of microcalcifications is very important as, together with their other features, it can indicate tumor malignancy. The following description can be given based on the recently published work by Chen et al. [24]:
• Malignant microcalcifications appear to be small, numerous (>5 concentrated on an area of 1 cm²), and distributed densely because they lie inside milk ducts and associated structures in the breast.
• Benign microcalcifications are generally larger, less numerous (<4-5 per 1 cm²), and more spread out because they form in the breast stroma, cysts, or benign masses.
These differences in the variability of the distribution, the size, and the number of microcalcifications in the ROIs analyzed allow radiologists to decide on the further assessment and the possible biopsy of the breast. Consequently, a correctly performed microcalcification segmentation can greatly simplify decision-making for the doctors. Figure 2a-c shows example ROIs with benign microcalcifications. These microcalcifications are rather spread out, there are cases of a relatively larger size, and they are less numerous than the malignant microcalcifications shown in Fig. 2d-f, which, in contrast, are small, numerous, and densely distributed.

Morphological Detection of Microcalcification
To detect microcalcifications, four stages of subsequent mammogram transformations are executed (shown in Fig. 3):
• Stage 1. The input mammogram, marked I, is shifted 21 gray levels down and then the same number of gray levels up (with saturation at 0) in order to remove small brighter points in the darkest parts of the image, which could be wrongly recognized as microcalcifications. As a result of these operations, the variation of the image for gray levels between 0 and 21 is removed. The output image is marked as I_2.
• Stage 2. The second stage detects microcalcifications of various sizes using the morphological pyramid and a structural element of the constant size of 3 × 3 pixels. The first level of the pyramid is the source image. The second level is obtained by first applying the closing-opening (C-O) filtration [26] with the aforementioned structural element to the source image, and then sampling every second pixel from the image. The third level is produced by conducting the same operations on the second-level image. The C-O filtration and sampling produce an image halved in each dimension but with the useful information about its objects retained. Microcalcifications are detected at the second and third levels of the pyramid using formula (1), in which I is the input image, T is the output image, S is a square structural element 3 × 3 pixels in size, γ_S and φ_S are, respectively, the opening and closing operations [26], and min represents the point minimum. The operation (1) detects small brighter parts of the image (Fig. 3b, c). Pixels with a less irregular brightness distribution in their surroundings receive a higher value. This transformation also constitutes a de-noising filter. The min operation ensures that the result will never be negative. After the microcalcification detection at the second and third levels of the pyramid, the results are thresholded with the threshold equal to 4, i.e., pixels with a gray level below the threshold are assigned the value of 0 and pixels with a gray level equal to 4 or higher are assigned the value of 255. Setting the threshold lower, e.g., at 3, produced too many potential microcalcification signals. The results of thresholding at the second and third levels of the pyramid should then be restored to the dimensions of the input image and combined using the logical OR operator. The size of the input image can be restored by replacing every pixel with a block of 2 × 2 pixels for the second or 4 × 4 pixels for the third pyramid level.
• Stage 3. The third stage consists in extracting all the brighter areas found in the I_2 image produced in stage 1. This is done by the morphological operation of the extended maximum emax [26] with parameter h, in which image I_2 is the mask and image I_2 with h = 5 subtracted from all of its pixels is the marker. The resulting I_emax image is composed of both microcalcifications and other brighter areas of image I_2 (Fig. 3d). In the experiments forming part of this project, the value of h was adopted as 5 because higher values made the bright areas too large. However, in some cases, the image will contain excessively large areas that do not correspond to the physical dimensions of microcalcifications. Such areas must be removed from the image as part of a separate operation. It was decided to eliminate potential signals of microcalcifications inside which a vertical, horizontal, left diagonal, or right diagonal chord at least 50 pixels in length can be drawn. These objects are identified using erosion carried out separately for each of four linear structural elements lying along the above directions. The erosion results should be combined using the logical OR operator. The combined erosion results serve as a marker for the reconstruction by dilation of large regions from the I_emax image, which is the mask in this reconstruction. The result of calculations at this stage is the difference between the I_emax image and the image produced by the reconstruction.
• Stage 4. The purpose of the last stage is to extract the area occupied by microcalcifications detected using Eq. (2). This consists in reconstructing the appropriate areas of image I_emax indicated by signals detected at the second and third levels of the morphological pyramid. The logical overlap of the image indicating the microcalcifications at the given level of the pyramid and the I_emax image constitutes the marker in the reconstruction. The reconstruction results from the second and third levels of the pyramid should then be combined using the logical OR operator, which produces the so-called microcalcification "map" (Fig. 3i).
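Stages 1, 2, and 4 above, together with the extended maximum of stage 3, can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the experiments (which was written in Matlab). All function names are hypothetical. The detection operator is assumed to take the form T = I − min(γ_S(φ_S(I)), I), because formula (1) is not reproduced in this text. The pyramid is assumed to be built from the stage 1 output I_2. The extended maximum follows its standard definition as the regional maxima of the h-maxima transform, and the chord-based removal of large regions in stage 3 is omitted.

```python
import numpy as np

def _win3(img, fn):
    # 3x3 sliding max/min with edge replication (flat square element S)
    p = np.pad(img, 1, mode="edge")
    h, w = img.shape
    return fn([p[i:i + h, j:j + w] for i in range(3) for j in range(3)], axis=0)

def dilate3(img):
    return _win3(img, np.max)

def erode3(img):
    return _win3(img, np.min)

def co_filter(img):
    # closing (dilate, erode) followed by opening (erode, dilate)
    closed = erode3(dilate3(img))
    return dilate3(erode3(closed))

def stage1_shift(img, shift=21):
    # saturating shift down, then up: gray levels 0..shift collapse to `shift`
    down = np.clip(img.astype(np.int32) - shift, 0, 255)
    return (down + shift).astype(np.uint8)

def detect(level, thr=4):
    # ASSUMED form of formula (1): T = I - min(C-O(I), I), then threshold
    lvl = level.astype(np.int32)
    return (lvl - np.minimum(co_filter(lvl), lvl)) >= thr

def upsample(mask, k):
    # replace every pixel with a k x k block
    return np.repeat(np.repeat(mask, k, axis=0), k, axis=1)

def stage2_signals(i2, thr=4):
    # detection at pyramid levels 2 and 3, restored to the input size
    h, w = i2.shape
    lvl2 = co_filter(i2)[::2, ::2]
    lvl3 = co_filter(lvl2)[::2, ::2]
    return (upsample(detect(lvl2, thr), 2)[:h, :w],
            upsample(detect(lvl3, thr), 4)[:h, :w])

def reconstruct(marker, mask):
    # reconstruction by dilation, naive iterate-until-stable version
    cur = np.minimum(marker, mask)
    while True:
        nxt = np.minimum(dilate3(cur), mask)
        if np.array_equal(nxt, cur):
            return cur
        cur = nxt

def extended_maxima(i2, h=5):
    f = i2.astype(np.int32)
    hmax = reconstruct(f - h, f)                     # marker I_2 - h, mask I_2
    return (hmax - reconstruct(hmax - 1, hmax)) > 0  # regional maxima

def stage4_map(i2, thr=4, h=5):
    emax = extended_maxima(i2, h).astype(np.uint8)
    out = np.zeros(i2.shape, dtype=bool)
    for sig in stage2_signals(i2, thr):
        marker = sig.astype(np.uint8) & emax   # logical overlap with I_emax
        out |= reconstruct(marker, emax) > 0   # OR over the pyramid levels
    return out
```

For a flat background of gray level 100 containing a single 4 × 4 spot of gray level 140, `stage4_map(stage1_shift(img))` marks exactly the 16 spot pixels. The iterate-until-stable reconstruction was chosen for readability only; it is slow on 512 × 512 images, where a queue-based reconstruction from an image processing library would be the usual choice.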
In addition, the image obtained at stage 4 was subjected to "cleaning" operations, namely:
• Removing potential microcalcification areas located close to the edge of the image using the reconstruction.
• Removing potential microcalcification areas smaller than 10 pixels or larger than 70 pixels in area by area opening [26] and the logical subtraction of the images.
• Closing holes in image objects.
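The three cleaning operations can be approximated with `scipy.ndimage` as sketched below. The sketch rests on several assumptions: the function name `clean_map` is hypothetical, "close to the edge" is simplified to "touching the image border", and connected-component labeling with an 8-neighborhood stands in for the reconstruction and area opening named in the text. The area limits of 10 and 70 pixels are taken from the text.

```python
import numpy as np
from scipy import ndimage as ndi

def clean_map(mask, min_area=10, max_area=70):
    """Clean a binary microcalcification map (simplified sketch)."""
    eight = np.ones((3, 3), dtype=bool)            # 8-neighborhood
    labels, n = ndi.label(mask, structure=eight)   # label 0 is background
    areas = np.bincount(labels.ravel(), minlength=n + 1)

    # keep only objects whose area lies within [min_area, max_area]
    keep = (areas >= min_area) & (areas <= max_area)
    keep[0] = False

    # drop objects touching the image border (assumed meaning of "close to the edge")
    border = np.unique(np.concatenate(
        [labels[0], labels[-1], labels[:, 0], labels[:, -1]]))
    keep[border] = False

    # close holes in the remaining objects
    return ndi.binary_fill_holes(keep[labels])
```

The area test is applied before the holes are closed, following the order in which the operations are listed above; whether the original implementation measured the area before or after hole closing is not stated.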

The Watershed Segmentation of Microcalcifications
The next step in working with mammograms is to extract the microcalcification shape. The methods discussed in the previous subsection were a way of determining the masks of microcalcifications, but their shape is, in the majority of cases, dependent on the morphological operations executed, such as the emax and the reconstruction [26]. The watershed segmentation [26], coupled with the use of so-called markers which prepare the image for segmenting and control this process, was used to find the shape of microcalcifications. Using markers provides additional knowledge about the objects for the segmentation process and makes their extraction more efficient. It also reduces oversegmentation. A marker is defined as a cohesive area of pixels belonging to the image. An internal marker is related to the object that should be extracted, while an external marker is related to the background. Further down, the sets of internal markers and external markers will be referred to as, respectively, the internal marker and the external marker, and the sum of these two sets simply as a marker. The next steps in segmenting the shape of microcalcifications are as follows:
1. The first step is to find the mask of regional minima (assuming an 8-neighborhood in the analyzed image) for the mammogram image presented in Fig. 4b, filtered through the C-O operation and then inverted. The image of regional minima generated at this step contains areas of microcalcifications as well as other artifacts in the image (Fig. 4d).
2. Having obtained the regional minima mask, its intersection with the map of microcalcifications generated by the morphological microcalcification detection, presented in Fig. 4c, should be found. The result of this intersection constitutes an internal marker (Fig. 4e).
3. Then, the morphological gradient of the filtered and inverted input image referred to in item 1 is calculated. The morphological gradient of image I is defined as the difference between its dilation δ_S and its erosion ε_S, i.e., δ_S(I) − ε_S(I), with a square structural element S 3 × 3 pixels in size. The result is presented in Fig. 4f, in which the image of the gradient has been multiplied by 10 to better distinguish details.
4. The image from Fig. 4b should be subjected to the watershed segmentation in order to obtain an external marker, which consists of the output watershed lines. In Fig. 5a, the marker lines have been overlaid on the image from Fig. 4b.
5. The pixels of the internal marker must not touch the pixels of the external one. For this purpose, the image generated by dilating the external marker image should be subtracted from the image of the internal marker from Fig. 4e. The dilation operation on the external marker is presented in Fig. 5c, in which the lines from the marker image have been widened. The new internal marker should be added to the external marker (not widened) to obtain the complete marker image (Fig. 5d).
6. At the next computational step, the gradient undergoes the minima imposition operation [36] in which the argument is the complete marker obtained in item 5. This operation means that the only regional minima that remain in the gradient will be found in the places "marked" by the marker. The result of the minima imposition operation is presented in Fig. 5e, in which the output image has been multiplied by 10 to better distinguish details. The watershed segmentation is then carried out on the gradient thus modified. Its results consist in watershed lines running along the contour of microcalcifications, without the oversegmentation effect. The result of this method for the image from Fig. 3a is presented in Fig. 5f, in which the watershed lines have been overlaid on the inverted image from Fig. 4b in order to better present the areas of microcalcifications identified by them.
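The gradient, the separation of the markers, and the minima imposition (items 3, 5, and 6) can be sketched with NumPy as follows. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, the internal and external markers are assumed to be given as boolean masks, and minima imposition is written in its standard form as a reconstruction by erosion. The final watershed flooding is left out; in practice it would be delegated to a library routine.

```python
import numpy as np

def _win3(img, fn):
    # 3x3 sliding max/min with edge replication (square element S)
    p = np.pad(img, 1, mode="edge")
    h, w = img.shape
    return fn([p[i:i + h, j:j + w] for i in range(3) for j in range(3)], axis=0)

def dilate3(img):
    return _win3(img, np.max)

def erode3(img):
    return _win3(img, np.min)

def morph_gradient(img):
    # item 3: dilation minus erosion with a 3x3 square
    f = img.astype(np.int32)
    return dilate3(f) - erode3(f)

def complete_marker(internal, external):
    # item 5: subtract the widened external marker from the internal one,
    # then add the original (not widened) external marker
    widened = dilate3(external.astype(np.uint8)) > 0
    return (internal & ~widened) | external

def impose_minima(grad, marker):
    # item 6: reconstruction by erosion of fm over min(grad + 1, fm),
    # where fm is 0 on the marker and a large constant elsewhere
    top = int(grad.max()) + 2
    fm = np.where(marker, 0, top)
    mask = np.minimum(grad + 1, fm)
    cur = fm
    while True:
        nxt = np.maximum(erode3(cur), mask)
        if np.array_equal(nxt, cur):
            return cur
        cur = nxt
```

After `impose_minima`, the output equals 0 exactly on the marker pixels and is at least 1 everywhere else, so a watershed flooding started on this image can only grow from the marked places, which is what suppresses oversegmentation.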

Methods of Measuring and Assessing Microcalcification Segmentations Carried Out
The accuracy of microcalcification segmentation in mammograms from the DDSM database was estimated by measuring four indices, namely the similarity index (SI), the overlap fraction (OF), the overlap value (OV), and the extra fraction (EF). Using the four indices makes it possible to exhaustively compare the similarities and differences between the analyzed regions M and R and determine the overlap fraction, the underestimation, and the extra fraction. In [8,9], only the OV index was analyzed. During the research work, ROIs with the constant dimensions of 512 × 512 pixels were analyzed, while for mammograms from the DDSM database, there are GTA contours identifying the areas in which microcalcifications occur. Therefore, a radiologist participated in the experiments carried out and assessed the detected or undetected actual and assumed microcalcification signals. The radiologist's assessment was used to calculate the mean sensitivity (4) depending on the number of false-positive signals per image (FPI). In contrast, the sensitivity was not analyzed in [8,9]. In [8,9], the segmentation was performed for every microcalcification separately and in addition for patches of various sizes, but after the previous initiation of the active contour, which can, unfortunately, be a painstaking and time-consuming activity if the number of microcalcifications is large. Table 1 presents the values of parameters established for the morphological detection of microcalcifications using 20 preselected ROIs from mammograms from the DDSM database, namely 10 benign and 10 malignant cases. The 20-element training set included ROIs containing microcalcifications of various shapes, sizes, numbers, and distribution as well as brightness levels. The watershed segmentation carried out after the microcalcifications are detected is automated and requires no parameters to be used.

Table 1 The established values of parameters of the morphological detection and extraction of microcalcifications
The parameters presented in Table 1 are selected so that the SI, OF, and OV indices are the highest possible, while the EF index is as low as possible.
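The four indices can be computed from two binary masks as sketched below. The definitions used here are the ones commonly given for these index names and are an assumption, since the formulas are not reproduced in this text: SI = 2|M∩R| / (|M| + |R|), OF = |M∩R| / |R|, OV = |M∩R| / |M∪R|, and EF = |M∖R| / |R|. It is likewise assumed that M is the region produced by the method and R is the reference region traced by the radiologist; the function name is hypothetical.

```python
import numpy as np

def overlap_indices(m, r):
    """SI, OF, OV, EF for method mask m and reference mask r (assumed definitions)."""
    m = np.asarray(m, dtype=bool)
    r = np.asarray(r, dtype=bool)
    inter = np.count_nonzero(m & r)   # |M ∩ R|
    union = np.count_nonzero(m | r)   # |M ∪ R|
    n_m = np.count_nonzero(m)
    n_r = np.count_nonzero(r)
    return {
        "SI": 2 * inter / (n_m + n_r),   # similarity index
        "OF": inter / n_r,               # overlap fraction
        "OV": inter / union,             # overlap value (area overlap measure)
        "EF": (n_m - inter) / n_r,       # extra fraction
    }

# Two 4 x 4 squares shifted by one row: 12 shared pixels, 20 in the union.
r = np.zeros((10, 10), dtype=bool)
r[2:6, 2:6] = True
m = np.zeros((10, 10), dtype=bool)
m[3:7, 2:6] = True
print(overlap_indices(m, r))  # SI 0.75, OF 0.75, OV 0.6, EF 0.25
```

Under these definitions, high SI, OF, and OV together with a low EF indicate that the segmented region covers the reference region without spilling far beyond it, which is the direction in which the parameters were tuned.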

Results and Discussion
After the parameters necessary to control the segmentation had been established, the method of detecting and segmenting microcalcifications was tested on the remaining 200 mammograms, using segmentations done manually by a radiologist and GTA contours. The results of these segmentations are presented in Table 2, which lists calculated statistical parameters such as the maximum value (max), the minimum value (min), the mean value (mean), and the standard deviation (SD) of the following calculated indices: SI, OF, OV, and EF. Figure 6 shows a graph of data from Table 2. The sensitivity results are presented in Table 3. Table 4 shows the time measurements, in seconds, of the morphological extraction and detection method for microcalcifications (M) used for the 200 analyzed mammograms. The presented method was implemented in the Matlab R2015a environment. Time was measured for a PC with an Intel Core i7 2 GHz processor. The average time for a single ROI 512 × 512 pixels in size amounts to 0.83 s, and this includes all steps of the method presented in this publication.

Fig. 6 Graphs of the mean value and the standard deviation based on measurements of four indices: SI, OF, OV, and EF for the applied method, compared to the contours drawn by the radiologist based on the data from Table 2
Examples of differences in the segmentation of microcalcifications by the computer method presented in this publication and the contours manually traced by a radiologist are presented in Figs. 8 and 9. These are typical results obtained during the experiments carried out. In order to make microcalcification imaging easier, all examples of mammograms have been filtered according to stage 1 of the presented method and their gray levels have been inverted. In all examples from Figs. 8 and 9, GTA contours are superposed. Figure 8 shows example results for benign cases and Fig. 9 for malignant ones. The values of calculated indices are presented next to each example extracted by the watershed segmentation. Table 5 collates the results produced by active contour methods: MAC [8] and GAC [9] with those produced in this research work.
The experiments completed and the research results presented in the tables above lead to the following observations:
• Higher values of the SI, OF, and OV indices and a lower value of the EF index were obtained for benign cases, which are relatively larger and less numerous than malignant ones. The values of the SI, OF, OV, and EF indices for the 100 analyzed ROIs containing benign lesions are 83, 78, 73.5, and 14%, while for those with malignant lesions they equal 78, 73, 68, and 22%. In [9], only the OV index was analyzed, and just as here, higher values were obtained for benign lesions than for malignant ones. In [8], in turn, only the sizes of microcalcifications were distinguished, and no results of experiments for types of microcalcifications are presented. This study produced higher average values of the OV index than in [9] (52%) and in [8] (61%). However, it is worth noting that Duarte et al. [9] researched 1000 ROIs from mammograms from the DDSM database, so significantly more experiments were carried out than in this publication (altogether 220 ROIs, with 20 used to determine the parameters of the method, and tests carried out on the remaining 200). What is more, in [9], the researchers analyzed various types of microcalcifications and different types of breast tissues according to their classification into four tissue density categories [37]. In this study, two types of microcalcifications were analyzed, namely those which are symptoms of malignant cases and those which represent benign cases, and they are generally fatty breast cases. Unfortunately, as the manual tracing of individual microcalcifications can be very time consuming for the expert (as long as 30 min for a single ROI), this forms an obstacle to conducting a large number of experiments and significantly prolongs their time. In [8], 128 clusters of microcalcifications were analyzed.
• According to the data from Table 3, there were eight supposed microcalcifications at the most. In the experiments completed, the most frequent ROIs had one, two, and three supposed microcalcifications, and for these cases the standard deviations are the greatest. The minimal number of microcalcifications that occurred was 5 and the lowest value of the standard deviation can be observed for this group. The mean sensitivity in all the experiments amounted to 80% and reached the maximum value of 81%. Examples are presented in Fig. 8e and f as well as Fig. 9b and c. Figure 8e contains five microcalcifications, so not detecting one of them yields a sensitivity of 0.8 (Fig. 8f). In the example from Fig. 9b, the radiologist has found 11 microcalcifications, and if the computer method misses two, the sensitivity amounts to 0.81. Figure 9e, in turn, contains nine microcalcifications, so if the computer detection misses three, this represents a sensitivity of 0.66. According to the data from Table 3, sensitivity falls to 0.5 in the worst case and is equal to 1 in the best case, which means that all microcalcifications had been found.
• The average completion time of all steps of the computer method for ROIs 512 × 512 pixels in size amounted to 0.83 s and consisted of 0.51 s for the morphological detection of microcalcifications and 0.32 s for the watershed segmentation. In comparison, the authors of [8] analyzed ROIs 81 × 81 pixels in size and reported that the average segmentation time of a single microcalcification was 0.42 s. Duarte et al. [9] also give the average segmentation time for a single microcalcification, which amounted to 0.4 s for analyzed ROIs whose dimensions ranged from 20 × 20 to 41 × 41 pixels. It should be noted that the active contour methods presented in [8,9] require a manual initialization for every single microcalcification, which represents a significant limitation because it prolongs the segmentation process, particularly if a large number of ROIs is analyzed and they contain an even greater number of microcalcifications.
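The sensitivity values quoted for the examples in Figs. 8 and 9 follow from the usual definition of sensitivity as the detected fraction of the microcalcifications marked by the radiologist, TP / (TP + FN). The short check below assumes that definition, since formula (4) is not reproduced in this text, and uses a hypothetical helper name.

```python
def sensitivity(marked_by_radiologist, missed_by_method):
    """Detected fraction of the radiologist-marked microcalcifications."""
    detected = marked_by_radiologist - missed_by_method   # true positives
    return detected / marked_by_radiologist               # TP / (TP + FN)

# (figure, microcalcifications marked, microcalcifications missed)
for fig, total, missed in [("8e", 5, 1), ("9b", 11, 2), ("9e", 9, 3)]:
    print(f"Fig. {fig}: {sensitivity(total, missed):.3f}")
# Fig. 8e: 0.800
# Fig. 9b: 0.818
# Fig. 9e: 0.667
```

The ratios 9/11 ≈ 0.818 and 6/9 ≈ 0.667 agree with the quoted values of 0.81 and 0.66 when the digits are truncated rather than rounded.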
In summary, the solutions proposed in this publication are more practical because they do not require initializing in every instance: they allow the segmentation process to be automated not just for single microcalcifications but for many at the same time, inside ROIs larger in size and within a shorter time.

Summary and Conclusion
This publication presents a computer method for detecting and segmenting microcalcifications in mammograms from the DDSM database. It uses morphological transformations and is composed of two parts. The first part detects microcalcifications morphologically, thus allowing the approximate area of their occurrence to be determined, the contrast to be improved, and noise to be reduced in the mammograms. Then, the watershed segmentation of microcalcifications is performed. In the experiments carried out for 200 ROIs taken from mammograms from the DDSM database, the measured values of the SI, OF, OV, and EF indices amounted to, respectively, 80.5, 75.7, 70.8, and 19.8%. Higher values of the SI, OF, and OV indices and a lower value of the EF index were obtained for benign cases than for malignant ones. Compared to other solutions presented in [8,9], the process of microcalcification segmentation was automated and the computer methods used achieved a significant speed. In the experiments completed, the average running time of the entire processing of a single ROI 512 × 512 pixels in size amounted to 0.83 s. Increasing the number of cases from the DDSM database, particularly to include different types of microcalcifications according to the classification presented in [37], should be considered in further research. The segmentation results produced by the computer method should be evaluated by two experienced breast radiologists, which would additionally allow the consistency of these evaluations to be compared. It should be noted that the DDSM database is not new and it will be worthwhile to add examination results produced by the newest generation of mammographs. On the other hand, the publicly accessible DDSM database is the one containing the highest number of images together with the detailed location of lesions and their descriptions, so many researchers are willing to use it.