An energy conduction model for cell image segmentation

Cell image segmentation is an essential step in cytopathological analysis. Although their execution speed is fast, the results of cell image segmentation by conventional pixel-based, edge-based and continuity-based methods are often coarse. Fine structures in a cell image can be obtained with a level set method. However, the processing time of such a method is usually long and the final results may be sensitive to intensity differences and other factors. In this article, a new energy model is proposed that synthesizes a differential equation from the conventional and level set methods, and utilizes the nonuniformity property of cell images (e

Since it was first demonstrated in cervical carcinoma cell images by Babes, the analysis of cellular pathology has become increasingly important for the accurate diagnosis and treatment of cancer. Usually, the first step in a successful cytopathological analysis is to segment the nucleolus and cytoplasm from the background. The segmented results are then used in a quantitative analysis of cellular pathology. The objective of our study was to develop a method for obtaining more accurate and more reliable segmentation results, so that subsequent calculations of other parameters are correct.
Conventional methods used for cell image segmentation include pixel-based, edge-based, and continuity-based methods [1][2][3][4][5][6][7]. These methods are relatively simple and have fast execution speeds. However, most conventional methods are unable to handle detailed structures or obtain fine segmentation results when cell images contain relatively complex features. The active contour model [8][9][10][11][12][13][14][15][16][17][18] was developed recently and has several notable advantages. With this model, the segmentation result can be achieved at a sub-pixel level. The algorithm can be easily implemented in an energy minimization framework in which prior knowledge of shape and intensity can be incorporated [19]. More importantly, the segmentation results are closed curves, which is beneficial to further processing such as shape analysis and pattern recognition. Active contour models are often divided into two main subclasses: edge-based models [20][21][22][23][24] and region-based models [9,10,25,26,27]. Edge-based models use local edge information to attract the active contour toward the object boundaries. In contrast, region-based models aim to identify each region of interest using a certain region descriptor to guide the motion of the active contour. Among the region-based models, Li [19,28,29] proposed a region-scalable fitting energy model to solve the problem of intensity inhomogeneity and accelerate the execution of the algorithm.
When the active contour methods were applied directly to cell image segmentation, we found it was not always possible to achieve satisfactory results in cases where the intensity difference between object and background was minor. In this paper, we first analyzed cell image characteristics, and then proposed an energy conduction model in which the energy is minimized when the three parts of the cell image are well segmented. Finally, a new variational function using the level set framework was constructed to segment cell images.
*Corresponding author (email: chenc@cs.zju.edu.cn)

Characteristic features of cell images
A human body contains many types of cells. While the shapes of these cells can be very different, their basic structures are mostly similar. In general, cells are composed of nucleoli and cytoplasts. The nucleolus is located in the center of a cell and is surrounded by cytoplasts. The density of the nucleolus is higher than that of the cytoplasm. Furthermore, because many different molecules are present in cells, the cell density is spatially uneven.
Digital microscopes are typically used to take cell images from cell smears. A cell image has three components: nucleoli, cytoplasts and background. Additionally, the following features can usually be recognized in a cell image (Figure 1):
(1) the nucleolus part $\Omega_n$ is located within the cytoplasm part $\Omega_c$, and the cytoplasm part is located within the background part $\Omega_b$;
(2) the intensity of the nucleolus $I_n$ is higher than that of the cytoplasm $I_c$, and the intensity of the cytoplasm is higher than that of the background $I_b$;
(3) the intensity gradients of the nucleolus $G_n$ and the cytoplast $G_c$ are larger than that of the background $G_b$ because of the presence of different molecules within the cell.
We hypothesize that better segmentation results can be obtained if both intensity and intensity gradient information are used in the processing. By combining the information of the original image and of its gradient, we expect that the difference between the cellular components of interest and background can be amplified to allow more reliable segmentation of the different cell components.

Energy conduction model for cell image segmentation

Region-based active contour models
Let $\Omega \subset \mathbb{R}^2$ be the image domain, and $I: \Omega \to \mathbb{R}$ be a given gray level image. In [22], the image segmentation problem was formulated by Mumford and Shah as follows: given an image $I$, find a contour $C$ which segments the image into non-overlapping regions. Mumford and Shah proposed the following energy function:

$$F^{MS}(u, C) = \int_{\Omega} (u - I)^2 \, dx + \mu \int_{\Omega \setminus C} |\nabla u|^2 \, dx + \nu |C| \qquad (4)$$

where $u$ is an approximation of the original image that is assumed to be smooth within each of the connected components in the image domain $\Omega$ separated by the contour $C$, $\nabla u$ represents the gradient of $u$, $|C|$ is the length of the contour $C$, $\Omega \setminus C$ is the area of $\Omega$ that excludes $C$, and $\mu$ and $\nu$ are the coefficients weighting the respective terms. In practice, it is difficult to minimize this function. Chan and Vese [9] proposed an active contour approach to the MS (Mumford and Shah) problem for the special case in which the image $u$ in function (4) is a piecewise constant (PC) function. For an image $I$, Chan and Vese proposed to minimize the following function:

$$F^{CV}(C, c_1, c_2) = \lambda_1 \int_{outside(C)} |I(x) - c_1|^2 \, dx + \lambda_2 \int_{inside(C)} |I(x) - c_2|^2 \, dx + \nu |C| \qquad (5)$$

where $outside(C)$ and $inside(C)$ represent the regions outside and inside the contour $C$, $c_1$ and $c_2$ are two constants that approximate the image intensity in $outside(C)$ and $inside(C)$, and $\lambda_1$, $\lambda_2$ and $\nu$ are the weighting coefficients.
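To make the piecewise constant model concrete, the following minimal sketch evaluates its two fitting terms for a given binary partition of a small image. The function names `region_means` and `pc_fitting_energy` and the boolean-mask representation of the contour are illustrative assumptions, not part of the original method. The contour-length term is omitted, and both regions are assumed to be non-empty.

```python
def region_means(image, inside):
    """Return (c1, c2): the mean intensity outside and inside the contour."""
    out_vals = [v for row, mrow in zip(image, inside)
                for v, flag in zip(row, mrow) if not flag]
    in_vals = [v for row, mrow in zip(image, inside)
               for v, flag in zip(row, mrow) if flag]
    # Assumption: both regions contain at least one pixel.
    return sum(out_vals) / len(out_vals), sum(in_vals) / len(in_vals)


def pc_fitting_energy(image, inside, lam1=1.0, lam2=1.0):
    """Sum of squared deviations from the region means (length term omitted)."""
    c1, c2 = region_means(image, inside)
    energy = 0.0
    for row, mrow in zip(image, inside):
        for v, flag in zip(row, mrow):
            if flag:
                energy += lam2 * (v - c2) ** 2   # inside(C) term
            else:
                energy += lam1 * (v - c1) ** 2   # outside(C) term
    return energy


# Toy image: a bright 2x2 object on a dark background.
image = [[0, 0, 0, 0],
         [0, 9, 9, 0],
         [0, 9, 9, 0],
         [0, 0, 0, 0]]
# Partition that matches the object exactly.
good = [[0 < r < 3 and 0 < c < 3 for c in range(4)] for r in range(4)]
# Partition that cuts through the object (left half of the image).
bad = [[c < 2 for c in range(4)] for r in range(4)]
```

In this toy case the fitting energy vanishes for the partition that matches the object and is strictly positive for the one that cuts through it, which is the behaviour the minimization relies on.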
In [10], Vese and Chan subsequently proposed a piecewise smooth (PS) function, eq. (6). The PS model can overcome the limitation of PC models in segmenting images with intensity inhomogeneity. However, the practical application of the PS model is limited because of its high computational load. As a potential solution, Chan and Vese proposed a multiphase level set framework to segment images that contain more than 2 phases [10]; eq. (7) is the function that can segment 4 phases. In [19], Li proposed a region-scalable fitting energy function so that the model can accommodate intensity inhomogeneity and accelerate the segmentation process. The function used by Li is

$$E(C, f_1, f_2) = \sum_{i=1}^{2} \lambda_i \int \left[ \int_{\Omega_i} K(x - y) \, |I(y) - f_i(x)|^2 \, dy \right] dx + \nu |C| \qquad (8)$$

where $\Omega_1 = outside(C)$, $\Omega_2 = inside(C)$, $f_1(x)$ and $f_2(x)$ are the approximate images in each part, and $K$ is the kernel function, which is usually a Gaussian kernel:

$$K_\sigma(u) = \frac{1}{(2\pi)^{n/2} \sigma^n} e^{-|u|^2 / 2\sigma^2} \qquad (9)$$

in which $\sigma > 0$.
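The role of the Gaussian kernel can be illustrated with a short sketch that builds a truncated, normalized discrete kernel and uses it to compute a locally weighted average of an image. The truncation radius of about 2σ, the renormalization at the image border, and the names `gaussian_kernel` and `local_average` are assumptions made for illustration. The average here is taken over all neighbouring pixels rather than over one region only, so it shows the kernel weighting but is not the fitting function of [19] itself.

```python
import math


def gaussian_kernel(sigma, radius=None):
    """Discrete 2D Gaussian kernel, normalized so that its weights sum to 1."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if radius is None:
        radius = int(math.ceil(2 * sigma))  # assumed truncation radius
    offsets = range(-radius, radius + 1)
    weights = [[math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
                for dx in offsets] for dy in offsets]
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]


def local_average(image, kernel, r, c):
    """Kernel-weighted mean around pixel (r, c).

    Neighbours that fall outside the image are skipped and the remaining
    weights are renormalized (an assumed border treatment).
    """
    radius = len(kernel) // 2
    num = 0.0
    den = 0.0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            rr, cc = r + dy, c + dx
            if 0 <= rr < len(image) and 0 <= cc < len(image[0]):
                w = kernel[dy + radius][dx + radius]
                num += w * image[rr][cc]
                den += w
    return num / den


kernel = gaussian_kernel(3.0)
flat = [[7.0] * 10 for _ in range(10)]
corner_value = local_average(flat, kernel, 0, 0)
```

A larger σ spreads the weights over a wider neighbourhood, which is consistent with the trade-off between boundary localization and robustness to initialization discussed for [19].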

Model for cell image segmentation
According to the cell image features specified in eqs. (1)-(3), we propose a new energy functional to segment the nucleolus, cytoplast and background. Referring to eqs. (7) and (8), the function we propose is eq. (10), in which $C_1$ is the contour between cytoplasts and background, $C_2$ is the contour between cytoplasts and the nucleolus, $IG$ is the combination of the original image and gradient information, $f_{10}$ and $f_{11}$ represent the approximate images of the cytoplast part and the nucleolus part, $g_0$ and $g_1$ are the approximate images of $IG$, $inside(C_1)$ represents the cell part, and $outside(C_1)$ represents the background part. The intersection of $inside(C_1)$ and $inside(C_2)$ is the nucleolus part, and the intersection of $inside(C_1)$ and $outside(C_2)$ is the cytoplast part. The precondition of eq. (10) is that the difference between cytoplasts and background is much larger than that between cytoplasts and the nucleolus in $IG$.
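According to the cell feature specified in eq. (1), the 3rd, 4th and 6th terms can be computed after obtaining the results of the 1st, 2nd, and 5th terms. Therefore, the function in eq. (10) can be divided into 2 separate functions, (13) and (14), for faster execution. The last terms in functions (13) and (14) are the lengths of the contours. These lengths are often substituted by [9,10]

$$L(\phi) = \int_{\Omega} \delta(\phi(x)) \, |\nabla \phi(x)| \, dx$$

in which $\delta$ is the Dirac delta function [28]. In practice, the Heaviside function $H$ and its derivative $\delta$ are approximated by

$$H_\varepsilon(x) = \frac{1}{2} \left[ 1 + \frac{2}{\pi} \arctan\left( \frac{x}{\varepsilon} \right) \right], \qquad \delta_\varepsilon(x) = H_\varepsilon'(x) = \frac{1}{\pi} \frac{\varepsilon}{\varepsilon^2 + x^2}$$

To accelerate the calculation process, Li [15] added a level set regularization term to preserve the regularity of the level set function. An example of the regularization terms that can be used is

$$P(\phi) = \int_{\Omega} \frac{1}{2} \left( |\nabla \phi(x)| - 1 \right)^2 dx$$

After these modifications, eqs. (13) and (14) take their final level set form, in which $\mu$ is a positive coefficient.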
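The smoothed Heaviside and Dirac functions mentioned above can be sketched as follows, assuming the arctangent-based approximation that is commonly used with Chan-Vese-type models. The function names and the default ε = 1 are assumptions for illustration only.

```python
import math


def heaviside(x, eps=1.0):
    """Smoothed Heaviside function (assumed arctangent form)."""
    return 0.5 * (1.0 + (2.0 / math.pi) * math.atan(x / eps))


def dirac(x, eps=1.0):
    """Smoothed Dirac delta: the analytic derivative of `heaviside`."""
    return (1.0 / math.pi) * eps / (eps * eps + x * x)


def numeric_derivative(f, x, h=1e-5):
    """Central difference, used only to cross-check `dirac` against `heaviside`."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


# The analytic and numeric derivatives should agree closely at any point.
check_points = [-2.0, -0.5, 0.0, 0.5, 2.0]
errors = [abs(numeric_derivative(heaviside, x) - dirac(x)) for x in check_points]
```

Because this smoothed delta is nonzero everywhere, every pixel contributes to the evolution rather than only those on the zero level set, which is one reason this form is popular in region-based models.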

Energy minimization
We use the standard gradient descent method to minimize the energy function in eq. (15) and use $\partial F / \partial \phi$ to represent the Gateaux derivative of the function $F$:

$$\frac{\partial \phi}{\partial t} = -\frac{\partial F}{\partial \phi}$$

which represents the gradient flow that minimizes the function $F$. Following [29] and keeping $g_1$ and $g_2$ fixed, we can obtain the evolution equation for $\phi_1$, eq. (22). After calculating $\phi_1$, to speed up the computation we initialized $\phi_2$ with positive values in the potential nucleolus regions and with negative values in the potential cytoplast regions; $\phi_2$ is then evolved by eq. (24).
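As a rough illustration of gradient descent on a level set energy, the sketch below evolves a one-dimensional level set function with explicit Euler steps. It uses the piecewise constant fitting terms of Chan and Vese as a stand-in energy, not the functional proposed in this paper, and it leaves out the length and regularization terms. The function names, the toy signal, and the arctangent-based smoothed Dirac function are all assumptions.

```python
import math


def dirac(x, eps=1.0):
    """Smoothed Dirac delta (assumed arctangent-based form)."""
    return (1.0 / math.pi) * eps / (eps * eps + x * x)


def mean(values):
    """Mean of a list, with 0.0 as a fallback for an empty region."""
    return sum(values) / len(values) if values else 0.0


def descent_step(phi, signal, dt=0.1, lam1=1.0, lam2=1.0):
    """One explicit Euler step of phi <- phi - dt * dF/dphi."""
    c1 = mean([v for v, p in zip(signal, phi) if p <= 0])  # outside mean
    c2 = mean([v for v, p in zip(signal, phi) if p > 0])   # inside mean
    new_phi = []
    for v, p in zip(signal, phi):
        # Derivative of the stand-in fitting energy with respect to phi.
        d_f = dirac(p) * (lam2 * (v - c2) ** 2 - lam1 * (v - c1) ** 2)
        new_phi.append(p - dt * d_f)
    return new_phi


# Toy signal: the bright object occupies indices 3 to 5.
signal = [0, 0, 0, 10, 10, 10, 0, 0]
# Deliberately misplaced initial contour (positive on indices 2 to 4).
phi = [-1.0, -1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0]
for _ in range(50):
    phi = descent_step(phi, signal)
segmentation = [p > 0 for p in phi]
```

In this toy run the sign of the level set function moves from the misplaced initial interval to the bright part of the signal, which mirrors how the evolution drives each contour toward a minimum of its energy.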

Implementation
In this model, we first combined the original image $I$ and its gradient $G$ in a suitable ratio to produce a new image $IG$, which therefore carries both intensity and gradient information. Based on the cell image feature specified in eq. (3), the gradient within a cell is larger than that of the background. Therefore, we can differentiate cytoplast and background in the new image more easily than when using only the original image. Using the gradient information can be especially advantageous when the intensity difference between cytoplast and background is small.
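One simple way to form such a combined image is sketched below. The text does not give the exact combination rule, so the weighted sum `I + ratio * G`, the central-difference gradient with clamped borders, and the names `gradient_magnitude` and `combine` are all assumptions made for illustration.

```python
import math


def gradient_magnitude(image):
    """Gradient magnitude by central differences, with neighbours clamped at the border."""
    rows, cols = len(image), len(image[0])
    grad = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            up = image[max(r - 1, 0)][c]
            down = image[min(r + 1, rows - 1)][c]
            left = image[r][max(c - 1, 0)]
            right = image[r][min(c + 1, cols - 1)]
            gx = (right - left) / 2.0
            gy = (down - up) / 2.0
            grad[r][c] = math.hypot(gx, gy)
    return grad


def combine(image, ratio=1.0):
    """Hypothetical combination IG = I + ratio * G (the paper's rule is not stated)."""
    grad = gradient_magnitude(image)
    return [[i + ratio * g for i, g in zip(irow, grow)]
            for irow, grow in zip(image, grad)]


# Three identical rows with a vertical intensity edge between columns 1 and 2.
image = [[0, 0, 4, 4] for _ in range(3)]
ig = combine(image, ratio=1.0)
```

With this rule, pixels next to intensity changes are raised relative to flat pixels, so a textured cell region would stand out more strongly from a smooth background than in the original image alone.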
The first step is to initialize the level set function $\phi_1$. In this work, a statistical method was used to predict the potential cell region and background region, and the initial value of $\phi_1$ was set to be positive in the potential cell region and negative in the potential background region. Afterwards, the contour $C_1$ was obtained by iteratively solving eq. (22). Next, using the statistical method, we initialized the second level set function $\phi_2$ on image $I$, restricting the initialization to the area inside $C_1$. Within this area, the value in the potential nucleolus region was set to be positive and that in the potential cytoplast region to be negative. After this initialization, the second contour $C_2$, which separates cytoplast and nucleolus, was obtained through iteration of eq. (24). The two iterations are nearly identical. When multiple cell images are present, pipeline processing can be used to enhance the efficiency: because the second iteration depends only on the result of the first, the second iteration for one image can run while the first iteration for the next image is in progress.
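The two-stage initialization can be sketched as follows. The statistical method is not specified in the text, so thresholding at the mean intensity of the region under consideration is used here purely as a hypothetical stand-in. The function `init_level_set`, the magnitude 2.0, and the toy image are also assumptions, and the evolution of $\phi_1$ is skipped, so the sign of the initial $\phi_1$ stands in for the interior of $C_1$.

```python
def init_level_set(image, mask=None, value=2.0):
    """Initialize phi as +value above the mean inside `mask`, else -value.

    Mean thresholding is a hypothetical stand-in for the paper's
    unspecified statistical method.
    """
    rows, cols = len(image), len(image[0])
    if mask is None:
        mask = [[True] * cols for _ in range(rows)]
    selected = [image[r][c] for r in range(rows) for c in range(cols) if mask[r][c]]
    threshold = sum(selected) / len(selected)
    return [[value if mask[r][c] and image[r][c] > threshold else -value
             for c in range(cols)] for r in range(rows)]


# Toy image: background 0, cytoplast 5, nucleolus 9.
image = [[0, 0, 0, 0, 0],
         [0, 5, 5, 5, 0],
         [0, 5, 9, 5, 0],
         [0, 5, 5, 5, 0],
         [0, 0, 0, 0, 0]]

# Stage 1: cell versus background (the original image stands in for IG here).
phi1 = init_level_set(image)
# In the real pipeline phi1 would be evolved first; here its sign is used directly.
cell_mask = [[p > 0 for p in row] for row in phi1]
# Stage 2: nucleolus versus cytoplast, restricted to the inside of the first contour.
phi2 = init_level_set(image, mask=cell_mask)
```

Restricting the second stage to the cell mask keeps background pixels from influencing the threshold between cytoplast and nucleolus, which matches the restriction to the area inside $C_1$ described above.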

Experimental results
The proposed method was tested on real cell images that were acquired in our laboratory. Unless otherwise specified, in this paper we used the following parameters for processing the images: $\sigma = 3.0$, $\lambda_1 = \lambda_2 = 1.0$, time step $\Delta t = 0.1$, $\mu = 1$. As described in [19], a smaller scale $\sigma$ produces more accurate localization of object boundaries. In contrast, a larger $\sigma$ value can produce results that are more independent of the location of the initial contours. All experiments were performed on a workstation with an Intel® Core™2 Duo E8200 CPU (2.66 GHz) and 3.25 GB memory. The workstation operated under Microsoft Windows XP Service Pack 2. Matlab 7.6.0 was used for processing. Figure 2 shows an example of the different stages of the segmentation process that we used. Figure 2(a) is the original image; Figure 2(b) is the gradient of the original image (a). As described previously, the gradient within the cell is much bigger than that of the background. The images in Figure 3 were 104×129 pixels. The total processing time was approximately 1.318.

Comparison with other methods
Our proposed method has several potential advantages when compared with other previously published methods. First, the model and algorithm we used are implemented under the level set framework. Second, the different features of cells are used to generate an energy conduction model. As a result, our method can produce finer segmentation results even when the intensity difference between cytoplast and background is small. Figure 4(a) shows an example whereby a conventional level set method failed to produce a correct segmentation of cytoplasm and background because of an insufficient intensity difference between the two regions. In contrast, our method was able to overcome this limitation and properly segment the two regions (Figure 4(b), (c)). Finally, our method uses statistical information to initialize the level set function. As a result, the two iterations can be pipelined in execution, allowing a much shorter processing time. In cases where multiple images need to be segmented, the first and second iterations can even be performed on different computers.

Conclusions
In summary, we analyzed the different features of a cell image and proposed an energy conduction model to improve cell image segmentation. In real cell images that can be challenging for conventional level set based methods, our proposed method was shown to be capable of detecting the small intensity difference between the different cell regions and producing correct image segmentation results. The proposed algorithm was found to be efficient and can be easily implemented on multiple computers for further improvement in the processing speed.