An optimal nuclei segmentation method based on enhanced multi-objective GWO

In breast cancer image analysis, reliable segmentation of nuclei remains an open research problem. In this paper, a new clustering-based nuclei segmentation method is presented. First, the proposed method pre-processes the histopathology image with the SLIC method. Then, a novel variant of the multi-objective grey wolf optimizer is employed to group the obtained super-pixels into optimal clusters. Lastly, the optimal cluster with the minimum value is segmented as the nuclei region. The experimental results demonstrate that the proposed variant of the multi-objective grey wolf algorithm surpasses the existing multi-objective algorithms on ten standard multi-objective benchmark functions belonging to different categories. In particular, the proposed variant achieves the best fitness value of more than 0.90 on 90% of the considered functions. Further, the nuclei segmentation accuracy of the proposed method is validated on H&E-stained estrogen receptor positive (ER+) breast cancer images. Experimental results show that the proposed method attains a dice-coefficient value of more than 0.52 on 80% of the images. This indicates that the proposed method segments breast cancer histology images effectively.


Introduction
Histological images are the gold standard in breast cancer diagnosis, and hematoxylin and eosin (H&E) staining of such images is the standard staining protocol [1]. Manual analysis of these images raises a number of issues, such as variation in the analysis due to differences in observers' experience, the time-consuming nature of the process, and the difficulty of identifying subtle visual features [2]. However, the digitization of pathology systems has successfully mitigated such concerns [3]. In digital pathology, the segmentation of nuclei from the histopathological image is the foremost unit, whose accuracy determines the efficiency of the system [4]. Accordingly, many nuclei segmentation methods have been defined over approaches such as super-pixels, clustering, active contours, watershed, and multi-level thresholding [5][6][7]. Among them, the super-pixel approach is one of the efficient approaches for segmentation. Therefore, this paper introduces an efficient nuclei segmentation method based on super-pixels for histopathological images.

Kapil Sharma (kapil@ieee.org), Delhi Technological University, Delhi, India
Super-pixels divide the image into non-overlapping regions wherein similar pixels are grouped together [8]. The boundary of each irregularly shaped super-pixel follows the edge information in the original image, which makes each super-pixel perceptually meaningful [9,10]. Various computer vision applications, such as image segmentation [11], depth estimation [12], object localization [13], body model estimation [14], bag-of-features [15,16], and skeletonization [17], employ super-pixels to obtain mid-level representations. Fouad et al. [8] employed unsupervised learning on super-pixels to segment a cancer image into different tissues. In the literature, it has been observed that unsupervised learning is quite advantageous for histopathological image analysis, as these methods are efficient in identifying anatomical structures in an image [8,18].
Generally, unsupervised learning methods work on the principle of clustering unlabeled data into homogeneous clusters according to a chosen criterion such as intra-cluster distance [19][20][21]. Some of the popular unsupervised learning methods are K-means and Fuzzy C-Means [22,23]. However, such methods have a number of demerits: they are sensitive to the initial parameter settings, may return locally optimal solutions, and require prior knowledge of the number of clusters to be formed [24][25][26]. To tackle these challenges, nature-inspired algorithms have proven successful in generating efficient clusters [27,28]. Such algorithms have solved many real-world optimization problems [29][30][31][32]. Some of the popular nature-inspired algorithms are genetic algorithm (GA) [33], biogeography-based optimization (BBO) [34,35], salp swarm optimization [36], gravitational search algorithm (GSA) [37], whale optimization algorithm (WOA) [38], and grey wolf optimizer (GWO) [39]. Moreover, nature-inspired algorithms usually consider a single objective, such as intra-cluster distance, to perform clustering, which presents only a single view of the data [40,41]. Multi-objective criteria are a better alternative for obtaining good clusters, as they present multiple views while clustering the data, which results in a comparatively better segmented image. In the literature, there are a number of multi-objective nature-inspired algorithms which try to optimize multiple objective functions simultaneously. Some of the popular ones are non-dominated sorting GA (NSGA-II) [42], Pareto-archived evolution strategy (PAES) [43], and strength-Pareto evolutionary algorithm (SPEA) [44]. Multi-objective GWO (MOGWO) is a recent multi-objective nature-inspired algorithm inspired by the behavior of grey wolves. However, this algorithm suffers from a number of disadvantages, such as a lack of population diversity, high time consumption, and poor exploration capability. Therefore, a novel variant of MOGWO, enhanced multi-objective grey wolf optimizer (EMOGWO), is proposed.

Fig. 1 Histopathological images scanned at ×40 [47]
Further, this paper leverages the strengths of the proposed variant, EMOGWO, to segment the nuclei from histopathological images. Nuclei segmentation in such images is a difficult task due to challenges such as non-uniformity in the nuclei shape, overlapping of nuclei, differences in tissue texture, different stain absorption capabilities of the nucleus, and variation in scanning artifacts [5,45,46]. Figure 1 illustrates sample images, taken from a publicly available breast cancer histopathological dataset [47], to depict the involved complexity. In the literature, multiple clustering criteria have been quite successful in handling such complex environments. Therefore, this paper presents a new multi-objective clustering method, enhanced multi-objective grey wolf optimizer-based super-pixel clustering (EMOGWO-SC), to cluster the super-pixels optimally and efficiently to segment the nuclei from a histopathological image. Thus, the main contribution of this paper is fourfold:
- A new variant, enhanced multi-objective grey wolf optimizer (EMOGWO), is proposed.
- A novel multi-objective clustering method (EMOGWO-SC) is introduced which efficiently clusters the super-pixels to segment the nuclei from a histopathological image.
- The proposed EMOGWO is compared against three other multi-objective nature-inspired algorithms on 10 well-known multi-objective benchmark problems in terms of IGD, SP, and MS.
- To validate the performance of the proposed method (EMOGWO-SC), a publicly available dataset of H&E-stained estrogen receptor positive (ER+) breast cancer images [47,48] has been considered, and an experimental comparison is conducted against GWO-based super-pixel clustering (GWO-SC) and kmeans-based super-pixel clustering (kmeans-SC) in terms of computation time and segmentation accuracy.
The remainder of the paper is organized as follows. "Preliminaries" briefly describes the super-pixel method and MOGWO. The proposed EMOGWO-SC, along with EMOGWO, is presented in "Proposed multi-objective clustering method for nuclei segmentation". The experimental analysis is conducted in "Results and discussion", followed by the conclusion in "Conclusion".

SLIC: a super-pixel method
Super-pixels are atomic and compact regions in an image that are formed by over-segmentation. Generally, super-pixels are utilized for obtaining a mid-level representation of an image [49]. One of the efficient methods for generating super-pixels is simple linear iterative clustering (SLIC) [10]. It requires only a single input, i.e., the number of super-pixels (K) to be formed. SLIC follows two phases, i.e., initialization and local clustering [10,50]. The initialization phase places the K centroids on a regular grid with an interval of $P = \sqrt{U/K}$, where U represents the total number of pixels in the image. In the local clustering phase, the distance (D) between the kth super-pixel centroid and every neighborhood pixel within the 2P × 2P region around it is measured according to Eq. (1).
where m is a constant that weighs spatial proximity against color similarity and i denotes the ith image pixel. For the kth super-pixel and the ith image pixel, $d_c$ measures the Euclidean distance in the CIELab color space, as defined in Eq. (2), while $d_s$ computes the Euclidean distance in the spatial domain according to Eq. (3).
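Eqs. (1)-(3) are not displayed in this text. Assuming they follow the standard SLIC formulation of [10], and writing $(l, a, b)$ for the CIELab components and $(x, y)$ for the pixel coordinates (symbols introduced here only for illustration), they would take the following form:

```latex
\begin{align}
D   &= \sqrt{d_c^{2} + \left(\frac{d_s}{P}\right)^{2} m^{2}} \tag{1}\\
d_c &= \sqrt{(l_k - l_i)^{2} + (a_k - a_i)^{2} + (b_k - b_i)^{2}} \tag{2}\\
d_s &= \sqrt{(x_k - x_i)^{2} + (y_k - y_i)^{2}} \tag{3}
\end{align}
```

Under this form, a larger m gives more weight to spatial proximity and therefore yields more compact super-pixels.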
The measured value of D is used to assign every pixel to the nearest super-pixel centroid. Then, the mean of all the assigned pixels is computed to update the corresponding super-pixel centroid. This process is repeated until the residual error converges. In post-processing, any unassigned pixels are assigned to the nearest super-pixels.
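The assignment and update steps described above can be sketched as follows. This is a minimal illustration, not the authors' Matlab implementation: it assumes the standard SLIC distance, represents each pixel and centroid as a hypothetical `(l, a, b, x, y)` tuple, and uses an illustrative default of `m = 10`.

```python
import math

def slic_distance(center, pixel, P, m=10.0):
    """Distance D between a centroid and a pixel, each given as (l, a, b, x, y)."""
    d_c = math.dist(center[:3], pixel[:3])   # CIELab colour distance
    d_s = math.dist(center[3:], pixel[3:])   # spatial distance
    return math.sqrt(d_c ** 2 + (d_s / P) ** 2 * m ** 2)

def assign_pixels(centers, pixels, P, m=10.0):
    """Label each pixel with its nearest centroid; -1 marks an unassigned pixel."""
    labels = []
    for px in pixels:
        best_k, best_d = -1, math.inf
        for k, c in enumerate(centers):
            # Only centroids whose 2P x 2P window contains the pixel compete.
            if abs(px[3] - c[3]) > P or abs(px[4] - c[4]) > P:
                continue
            d = slic_distance(c, px, P, m)
            if d < best_d:
                best_k, best_d = k, d
        labels.append(best_k)
    return labels

def update_centers(centers, pixels, labels):
    """Move each centroid to the mean of the pixels assigned to it."""
    updated = []
    for k, c in enumerate(centers):
        members = [p for p, lab in zip(pixels, labels) if lab == k]
        if members:
            c = tuple(sum(v) / len(members) for v in zip(*members))
        updated.append(c)
    return updated
```

Calling `assign_pixels` and `update_centers` alternately until the centroids stop moving corresponds to the convergence loop; pixels left with label -1 are the ones handled in post-processing.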

Multi-objective GWO
Multi-objective grey wolf optimizer (MOGWO) [51] is the multi-objective version of the grey wolf optimizer (GWO) [52]. The basic working of MOGWO is inherited from GWO, but MOGWO has two distinguishing components: the archive set and the leader selection. The Pareto optimal solutions found in an iteration are stored in the archive set (ACH), i.e., the set of solutions that are not dominated by any other solution. Let an optimization problem P need to optimize three objective functions, namely $f_1$, $f_2$, and $f_3$, and let there be two solutions, X and Y. Solution X dominates solution Y if X is better than Y in at least one of the objectives $f_i$ and not worse in any of the others. If X does not dominate Y and Y does not dominate X, they are called non-dominated solutions. These solutions are considered better than the other solutions in the population.
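The dominance relation above can be written as a short check. The sketch assumes that every objective is to be minimized and that a solution is represented by its tuple of objective values; the function names are illustrative.

```python
def dominates(x, y):
    """True if objective vector x Pareto-dominates y (all objectives minimized)."""
    no_worse = all(a <= b for a, b in zip(x, y))
    strictly_better = any(a < b for a, b in zip(x, y))
    return no_worse and strictly_better

def non_dominated(points):
    """Keep the points that no other point in the list dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For a maximized objective, negating its value before the comparison gives the same relation.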
In MOGWO, an archive controller is also maintained to control the movement of non-dominated solutions into and out of the archive set. The following rules are applied to the archive set:
1. If the new solution W is dominated by any solution of the ACH, then W is not added to the ACH.
2. If the new solution W dominates one or more solutions of the ACH, then the dominated solutions are deleted from the ACH and W is added.
3. If neither of the above conditions holds, i.e., W and the solutions of the ACH do not dominate each other, then W is also added.
4. If the ACH is full, the most crowded segment is identified using the grid mechanism and one of its solutions is deleted. Then, the solution W is added to the least crowded segment to maintain diversity.
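A minimal sketch of an archive controller along these lines is given here, assuming minimized objectives and the usual MOGWO convention that a new solution dominated by an archive member is rejected. The `segment_of` argument is a hypothetical stand-in for the grid mechanism, which is not specified in this text.

```python
import random

def dominates(x, y):
    """True if x Pareto-dominates y (all objectives minimized)."""
    return all(a <= b for a, b in zip(x, y)) and any(a < b for a, b in zip(x, y))

def update_archive(archive, w, capacity, segment_of, rng=random):
    """Return the archive after offering it the new solution w."""
    # Rule 1: w is rejected if any archive member dominates it.
    if any(dominates(a, w) for a in archive):
        return list(archive)
    # Rule 2: archive members dominated by w are removed.
    kept = [a for a in archive if not dominates(w, a)]
    # Rule 4: if the archive is full, free one slot in the most crowded segment.
    if len(kept) >= capacity:
        groups = {}
        for a in kept:
            groups.setdefault(segment_of(a), []).append(a)
        most_crowded = max(groups.values(), key=len)
        kept.remove(rng.choice(most_crowded))
    # Rules 2 and 3: w enters the archive.
    return kept + [w]
```

In this simplified version W is appended wherever it falls; steering it towards the least crowded segment would additionally require the grid geometry.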
To update the leaders, the leader selection method is applied on the updated ACH. Similar to GWO, the leaders, i.e., the α, β, and δ wolves, are selected from the archive set and represent the three best solutions. To choose them, the fitness values are sorted and the first three are selected as α, β, and δ. The other solutions in the population update their positions according to these leaders only. Therefore, the leader selection method is key to the efficient performance of MOGWO.
For leader selection, a probability function $P_i$ is defined according to the density of the solutions. Assuming there are S non-dominated solutions in the ith segment, $P_i$ is defined as per Eq. (4).
where const > 1. From Eq. (4), it can be observed that the probability of picking a solution from a highly crowded segment is lower, which helps to maintain diversity in the population [53]. This redirects the search to less crowded segments and explores the search space for better solutions. Therefore, MOGWO maintains the diversity in the population and explores each region of the search space to find the optimal solutions. The position update equations and the other steps of the MOGWO algorithm follow GWO [52].
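Eq. (4) is not displayed in this text. The sketch below assumes the form commonly used for MOGWO, $P_i = const / S$, with S the number of non-dominated solutions in the ith segment, and normalizes these weights for roulette wheel selection. The function names and the default `const = 2.0` are illustrative.

```python
import random

def segment_probabilities(segment_sizes, const=2.0):
    """Selection probability per non-empty segment, assuming P_i = const / S."""
    weights = [const / s for s in segment_sizes]
    total = sum(weights)
    return [w / total for w in weights]

def roulette_pick(probabilities, rng=random):
    """Return the index of the segment chosen by roulette wheel selection."""
    r, cumulative = rng.random(), 0.0
    for i, p in enumerate(probabilities):
        cumulative += p
        if r < cumulative:
            return i
    return len(probabilities) - 1  # guard against floating-point round-off
```

With segment sizes 1, 2, and 4, the least crowded segment receives the largest share of the wheel, which is the behavior described above.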

Proposed method
A new multi-objective clustering method, enhanced multi-objective grey wolf optimizer-based super-pixel clustering (EMOGWO-SC), is presented for optimal segmentation of nuclei from a histopathological image. The block diagram of the proposed method is illustrated in Fig. 2. First, an H&E-stained histopathological image is pre-processed by the SLIC method to generate super-pixels. The generated super-pixels are then optimally clustered into g clusters by employing the proposed enhanced multi-objective grey wolf optimizer (EMOGWO). For this purpose, two objective functions are considered, i.e., minimizing the intra-cluster distance and maximizing the inter-cluster distance. The intra-cluster distance measures the compactness of the clusters, while the inter-cluster distance measures the separation among clusters. Together, these two objective functions help in achieving better clustering quality and optimal cluster centers. Therefore, the proposed method optimizes two objective functions simultaneously, which are defined in Eqs. (5) and (6), respectively.
where g and p correspond to the number of required optimal clusters and the total number of pixels in the image, respectively. Further, $C_j$ represents the jth cluster centroid while $x_i$ is the ith image pixel. In an H&E-stained histopathological image, nuclei regions appear in a dark color [54]. Therefore, the cluster with the minimum average value is segmented as the nuclei region. The pseudo-code of the proposed method is presented in Algorithm 1. Further, the proposed variant, EMOGWO, is discussed in the following section.
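Eqs. (5) and (6) are not displayed in this text. One common way to express the two criteria, assumed here purely for illustration, is the sum of distances from each point to its nearest centroid (compactness) and the sum of pairwise distances between centroids (separation).

```python
import math

def intra_cluster_distance(points, centroids):
    """Compactness: total distance of every point to its nearest centroid."""
    return sum(min(math.dist(x, c) for c in centroids) for x in points)

def inter_cluster_distance(centroids):
    """Separation: total pairwise distance between cluster centroids."""
    g = len(centroids)
    return sum(math.dist(centroids[j], centroids[k])
               for j in range(g) for k in range(j + 1, g))

def objectives(points, centroids):
    """Both criteria as minimization targets (the separation term is negated)."""
    return (intra_cluster_distance(points, centroids),
            -inter_cluster_distance(centroids))
```

Negating the separation term lets a single minimizing optimizer handle both criteria.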

Enhanced MOGWO (EMOGWO)
In MOGWO, two new components have been introduced, namely the archive set and the leader selection method. The archive set contains the non-dominated solutions, while the leader selection method identifies the three best solutions from the least crowded segment in the population. However, this may lead to a problem. If there are fewer than three solutions in the least crowded segment, then the next least crowded segment has to be evaluated to select the leaders. If the same situation arises in the second least crowded segment, then the third least crowded segment is selected. This results in increased time complexity. Therefore, this paper proposes an enhanced MOGWO (EMOGWO) with an enhanced leader selection method.
In the proposed method, the whole population is divided into different segments.The segment number is allocated to each solution i based on Eq. (7).
where $x_i$ represents the segment number of solution i and $n_i$ represents the number of solutions that dominate solution i.
Hence, for the non-dominated solutions the segment number will always be 1. The maximum segment number will always be less than the total number of solutions in the population.
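Eq. (7) is not displayed in this text. Since non-dominated solutions always receive segment number 1, the sketch below assumes the form $x_i = 1 + n_i$, with minimized objectives and each solution given as a tuple of objective values.

```python
def dominates(x, y):
    """True if x Pareto-dominates y (all objectives minimized)."""
    return all(a <= b for a, b in zip(x, y)) and any(a < b for a, b in zip(x, y))

def segment_numbers(objs):
    """Segment number of each solution, assuming x_i = 1 + n_i."""
    return [1 + sum(dominates(q, p) for q in objs) for p in objs]
```

For example, a solution dominated by exactly two others is placed in segment 3.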
Once the population is divided into segments, the less crowded solutions are identified. For every solution i in each segment, the crowding count ($cc_i$) is calculated using Eq. (8), where $\lambda(x_i)$ is the number of solutions in segment number $x_i$ and $h(d_{ij})$ is calculated by Eq. (9).
where thres is selected between zero and one and may change based on the application, while $d_{ij}$ is the distance between two solutions i and j in the objective space having M objectives. The distance is calculated by Eq. (10).
where $f_o^{max}$ and $f_o^{min}$ are the maximum and minimum fitness values of the oth objective, and $f_o^i$ and $f_o^j$ are the fitness values of solutions i and j in the oth objective, respectively. The three best solutions are then selected by roulette wheel selection, based on the following probability for each segment.
where c is a constant greater than one and $cc_i$ is the crowding count of solution i in segment k.
From Eq. (11), it can be observed that if $cc_i$ is high, then the probability of this solution becoming a leader is low. This indicates that less crowded solutions will be selected, which improves the population diversity and exploration capability of the algorithm.
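Eqs. (8)-(11) are not displayed in this text, so the following sketch rests on assumed forms: $d_{ij}$ is the Euclidean distance over range-normalized objectives, $h(d_{ij})$ is 1 when $d_{ij}$ is below thres and 0 otherwise, $cc_i$ sums $h$ over the solutions sharing the segment of i (including i itself, so that $cc_i \geq 1$), and the selection weight is $c / cc_i$. All names and default values are illustrative.

```python
import math

def normalized_distance(fi, fj, f_min, f_max):
    """Euclidean distance between two solutions over range-normalized objectives."""
    total = 0.0
    for a, b, lo, hi in zip(fi, fj, f_min, f_max):
        span = (hi - lo) or 1.0          # avoid dividing by zero on a flat objective
        total += ((a - b) / span) ** 2
    return math.sqrt(total)

def crowding_counts(objs, segments, thres=0.2):
    """Count, per solution, the same-segment solutions closer than thres."""
    f_min = [min(col) for col in zip(*objs)]
    f_max = [max(col) for col in zip(*objs)]
    counts = []
    for i, fi in enumerate(objs):
        cc = 0
        for j, fj in enumerate(objs):
            same_segment = segments[j] == segments[i]
            if same_segment and normalized_distance(fi, fj, f_min, f_max) < thres:
                cc += 1                  # j == i is counted, so cc >= 1
        counts.append(cc)
    return counts

def leader_probabilities(counts, c=2.0):
    """Roulette-wheel probabilities, assuming a weight of c / cc_i per solution."""
    weights = [c / cc for cc in counts]
    total = sum(weights)
    return [w / total for w in weights]
```

Under these assumptions an isolated solution has a crowding count of 1 and therefore the highest chance of being chosen as a leader.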

Results and discussion
The performance of the proposed automatic nuclei segmentation method is evaluated in two parts. First, "Performance analysis of EMOGWO" showcases the efficiency of the proposed multi-objective grey wolf optimizer on ten well-known CEC-2009 multi-objective benchmark functions [55], in which the proposed EMOGWO is validated qualitatively on 7 bi-objective and 3 tri-objective test problems. Second, in "Experimental analysis of automatic nuclei segmentation method", EMOGWO is used for nuclei segmentation within H&E-stained breast cancer histology images. For a fair analysis, all the experiments have been performed using Matlab 2017a on a system with a 2.66 GHz Intel Core i3 processor and 8 GB of RAM.

Performance analysis of EMOGWO
The proposed EMOGWO has been tested over 10 multi-objective benchmark functions, including 7 bi-objective (UF1-UF7) and 3 tri-objective test problems (UF8-UF10) [55,56]. Tables 1 and 2 tabulate these benchmark functions along with definitions. These benchmark functions are considered among the most challenging test problems in the literature and include different multi-objective search regions with non-convex, convex, multi-modal, and discontinuous Pareto fronts. To assess the efficiency of the proposed EMOGWO, inverted generational distance (IGD), which measures the convergence of all the considered algorithms, has been used [57].
The maximum spread (MS) and spacing (SP) are used to assess the coverage [44,58]. IGD is an enhanced version of generational distance (GD) [57,59], which was introduced by Sierra and Coello, and is formulated in Eq. (12).
where n represents the number of true Pareto optimal solutions and $dis_j$ refers to the Euclidean distance between the jth true Pareto optimal solution and the closest computed Pareto optimal solution in the reference set.
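Eq. (12) is not displayed in this text. The sketch below assumes the commonly used form $IGD = \sqrt{\sum_{j=1}^{n} dis_j^2} / n$, with both fronts given as lists of objective-value tuples.

```python
import math

def igd(true_front, obtained_front):
    """Inverted generational distance, assuming IGD = sqrt(sum(dis_j^2)) / n."""
    # dis_j: distance from the j-th true Pareto point to its closest obtained point
    dis = [min(math.dist(t, s) for s in obtained_front) for t in true_front]
    return math.sqrt(sum(d * d for d in dis)) / len(true_front)
```

A value of zero means every true Pareto optimal point is matched exactly by an obtained solution, so lower values indicate better convergence.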
The performance metrics SP and MS can be mathematically formulated using Eqs. (13) and (14).
where $\overline{dis}$ is the mean of all $dis_j$, n is the number of Pareto optimal solutions obtained so far, det() computes the Euclidean distance, $a_j$ and $b_j$ represent the maximum and minimum values in the jth objective, and t is the total number of objectives. The performance parameters SP, MS, and IGD quantitatively validate the efficacy, as they compare the mean and standard deviation values of all the considered algorithms. To qualitatively validate the performance, the best set of Pareto optimal solutions of each algorithm is compared. For the comparative analysis, the proposed EMOGWO is compared with MOGWO [56], MOPSO [58], and MOEA/D [60]. To reduce the interference effect and for a fair analysis, each algorithm has been run 10 times. Moreover, the number of iterations (itr) and the population size (N) in all the considered algorithms are set to 1000 and 50, respectively.

In terms of IGD, MOPSO shows better results than the proposed and other considered algorithms for benchmark UF8. Nevertheless, from the above analysis, it can be said that the performance of the proposed EMOGWO is more consistent than that of the other considered algorithms. Furthermore, the SP and MS values are compared in Tables 3 and 4. As MOEA/D is not implemented in Matlab for the tri-objective benchmark problems UF8, UF9, and UF10, the efficiency of the proposed EMOGWO is compared only with MOPSO and MOGWO for these problems. It can be seen from the tables that the proposed EMOGWO shows better coverage and convergence. Although there are some discontinuities on the Pareto optimal front obtained by EMOGWO, the coverage of the whole front is broader than that of MOGWO, MOPSO, and MOEA/D for most of the benchmark problems. Moreover, the Pareto optimal solutions of the proposed EMOGWO are closer to the true Pareto optimal front and evenly distributed for both bi- and tri-objective problems. Thus, the statistical results demonstrate the efficacy of the proposed algorithm.
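The displayed forms of Eqs. (13) and (14) are not reproduced in this text. As a rough sketch, the code below assumes Schott-style spacing, in which $dis_j$ is the smallest sum of absolute objective differences from solution j to any other obtained solution, and a maximum spread equal to the square root of the summed per-objective ranges. Both forms are assumptions made for illustration.

```python
import math

def spacing(front):
    """SP: spread of nearest-neighbour gaps; needs at least two solutions."""
    n = len(front)
    dis = []
    for j, p in enumerate(front):
        # smallest L1 gap from solution j to any other solution on the front
        dis.append(min(sum(abs(a - b) for a, b in zip(p, q))
                       for k, q in enumerate(front) if k != j))
    mean = sum(dis) / n
    return math.sqrt(sum((mean - d) ** 2 for d in dis) / (n - 1))

def maximum_spread(front):
    """MS: square root of the summed ranges (a_j - b_j) over all objectives."""
    return math.sqrt(sum(max(col) - min(col) for col in zip(*front)))
```

An SP of zero corresponds to perfectly even gaps between neighbouring solutions, while a larger MS indicates that the front covers a wider range of each objective.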
Friedman's test [61] is also included in Table 5 to statistically validate the efficacy of the proposed EMOGWO. Friedman's test assesses the efficacy of each approach on each benchmark function and ranks them in order of effectiveness [62]. The best approach is given a score of 1, the second best a score of 2, the third best a score of 3, and so on. If the performance of two approaches is identical, the rank is calculated by averaging the ranks returned in different runs [63]. The Friedman test returns a p-value of 0.004756, which is much lower than the threshold (α = 0.05), indicating that the obtained results are statistically different. The ranks of all the methods returned by the Friedman test are tabulated in Table 5, which shows that the proposed EMOGWO has the lowest ranking value of all the models. As a consequence of the statistical analysis and experimental data, EMOGWO is found to be superior. Further, box-plots for 6 representative benchmark problems, including 3 bi-objective (UF1, UF4, and UF6) and 3 tri-objective (UF8, UF9, and UF10) problems, are plotted in Figs. 3 and 4 to show the variation among the IGD values of the existing algorithms and the proposed EMOGWO over 10 runs. In the box-plots, the algorithms are represented on the horizontal axis while the vertical axis denotes the best IGD values over 10 runs. It can be observed from the figures that the box-plots of the proposed EMOGWO are very narrow and that its IGD value is also lower than that of MOPSO, MOEA/D, and MOGWO for both bi-objective and tri-objective benchmark functions. However, for the tri-objective benchmark function UF8, MOPSO shows comparable results. Hence, the qualitative and quantitative analyses show that the proposed EMOGWO is able to deliver very competitive and promising results over multi-objective benchmark problems.
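The mean ranks used in a Friedman-style comparison can be computed as sketched below. This is an illustration only: it assumes lower scores are better (as with IGD), gives tied scores the average of the ranks they span, and does not compute the p-value.

```python
def average_ranks(values):
    """Rank values in ascending order (1 = best); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # extend the run while the next sorted value ties with the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

def mean_ranks(table):
    """table[b][a] is the score of algorithm a on benchmark b; returns mean rank per algorithm."""
    per_benchmark = [average_ranks(row) for row in table]
    return [sum(col) / len(per_benchmark) for col in zip(*per_benchmark)]
```

The algorithm with the lowest mean rank is the one ranked best overall.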

Experimental analysis of automatic nuclei segmentation method
In this section, the experimental results of the proposed multi-objective enhanced GWO-based automatic nuclei segmentation method (EMOGWO-SC) are presented. A publicly available breast cancer histology dataset [47] is considered for the performance evaluation. The dataset consists of H&E-stained ER+ (estrogen receptor positive) breast cancer images, taken at 40× magnification. The mask images are manually generated by domain experts [47]. The nuclei segmentation results of the proposed and considered methods on four randomly selected images are shown in Fig. 5. It is concluded from the results that the proposed EMOGWO-based method performs better for nuclei segmentation than the other considered methods. A higher value of the dice-coefficient represents better segmentation accuracy. The segmentation accuracies of EMOGWO-SC and the other methods for the considered images are presented in Table 6. The TP value denotes the correctly identified nuclei, while FP represents the false artifacts. Moreover, the value of FN denotes the nuclei which are not identified, whereas DC represents the dice-coefficient value.
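The dice-coefficient can be obtained from these counts with the standard relation $DC = 2TP / (2TP + FP + FN)$. The sketch below also includes a helper that derives the counts from two binary masks; treating the counts as pixel-level is an assumption, since the text does not state how TP, FP, and FN are tallied.

```python
def dice_coefficient(tp, fp, fn):
    """DC = 2*TP / (2*TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    if denom == 0:
        raise ValueError("DC is undefined when TP, FP and FN are all zero")
    return 2 * tp / denom

def confusion_counts(pred, truth):
    """TP, FP, FN for two flattened binary masks of equal length."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    return tp, fp, fn
```

A DC of 1 corresponds to perfect overlap between the predicted and ground-truth masks, and 0 to no overlap.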
It is evident from the table that EMOGWO-SC attains a DC value above 0.5 on 70% of the images, with maximum and minimum DC values of 0.6368 and 0.4813, respectively. In comparison, kmeans-SC and MOGWO-SC obtain a DC of 0.5 on 5 and 2 of the 10 images, respectively. Moreover, EMOGWO-SC has efficiently segmented the major nuclei regions, as reflected by the TP values of the introduced method in Table 6. The results also confirm that EMOGWO-SC has comparatively few false alarms, as indicated by the FN and FP values.
Furthermore, the computation time for nuclei segmentation has also been investigated and is presented in Table 7. From the table, it can be observed that kmeans-SC takes the least computation time among all the methods. However, kmeans-SC performs the poorest in terms of accuracy, which cannot be compromised in nuclei segmentation applications. Also, the computation time of EMOGWO-SC is the lowest among all the meta-heuristic-based methods.

Conclusion
In this paper, a new clustering-based nuclei segmentation method is introduced. The proposed method finds the optimal cluster centroids using a novel variant of the multi-objective grey wolf optimizer. To perform optimal clustering, two objective functions, namely intra-cluster distance and inter-cluster distance, are considered. To validate the efficacy of the proposed variant, standard multi-objective functions consisting of 7 bi-objective and 3 tri-objective benchmark functions have been considered, along with a statistical investigation using box plots. It has been observed that the proposed variant is able to report the best fitness value of more than 0.90 on 90% of the benchmark functions. Further, the proposed EMOGWO has been employed for nuclei segmentation from H&E-stained estrogen receptor positive (ER+) breast cancer images. The results of the proposed method are empirically validated against kmeans-SC and MOGWO-SC in terms of segmentation accuracy. The experimental results demonstrate that EMOGWO-SC attains a DC value of more than 0.52 on 80% of the images and outperforms the considered methods on 90% of the images. Moreover, the average computation time of the proposed method is approximately 4 times lower than that of MOGWO-SC, which is quite promising. In future, fuzzy-based segmentation can also be incorporated. Also, other criteria such as structural similarity and boundary displacement can be explored to tackle large datasets.

Declarations
Conflict of interest The authors have stated that this paper has no potential conflict of interest.

Fig. 2 Flow graph of the proposed multi-objective clustering method (EMOGWO-SC) for nuclei segmentation

Algorithm 1 Multi-objective clustering method for nuclei segmentation
Input: An H&E-stained histopathological image (X) of size m × n.
Output: Nuclei segmented image.
Generate super-pixels by executing the SLIC method on X;
Initialize each individual of EMOGWO randomly; each individual consists of g cluster centroids, {$C_1, C_2, C_3, \ldots, C_i, \ldots, C_g$}, and each $C_i$ is defined as {$c_1, c_2, c_3, \ldots, c_d$} for d-dimensional super-pixels;
Compute the fitness fit of each individual according to the considered objective functions;
Update each individual according to the EMOGWO algorithm;
Use the best individual to optimally cluster the super-pixels;
The cluster centroid with the minimum value corresponds to the nuclei regions.
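The last two steps of Algorithm 1 can be sketched as follows. The sketch assumes that each super-pixel is described by a hypothetical feature tuple (for example its mean colour), that the centroids come from the best EMOGWO individual, and that "minimum value" means the centroid with the lowest mean component.

```python
import math

def nuclei_superpixels(features, centroids):
    """Indices of the super-pixels that fall in the darkest (nuclei) cluster."""
    # Step 5: assign every super-pixel to its nearest cluster centroid.
    labels = [min(range(len(centroids)), key=lambda j: math.dist(f, centroids[j]))
              for f in features]
    # Step 6: the centroid with the minimum mean value marks the nuclei cluster.
    nuclei = min(range(len(centroids)),
                 key=lambda j: sum(centroids[j]) / len(centroids[j]))
    return [i for i, lab in enumerate(labels) if lab == nuclei]
```

Marking the pixels of the returned super-pixels in a binary mask would then give the segmented nuclei image.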

In Eq. (10), $f_o^{max}$ and $f_o^{min}$ are the maximum and minimum fitness values of the oth objective, and $f_o^i$ and $f_o^j$ are the fitness values of solutions i and j in the oth objective, respectively.

Fig. 3 Box-plots of the statistical results for IGD on three representative bi-objective benchmark problems: a UF1, b UF4, and c UF6

Fig. 5 Nuclei segmentation results on representative H&E-stained estrogen receptor positive (ER+) breast cancer images by the proposed and the considered methods

Table 1 Parameter settings for all the considered algorithms

Table 5 Mean ranking of all the considered methods

To appraise the efficacy of the proposed EMOGWO, it is compared with all the considered algorithms in terms of the mean, standard deviation, median, worst, and best values of IGD, MS, and SP. Tables 2, 3, and 4 depict the IGD, SP, and MS values returned by the proposed EMOGWO and the other considered algorithms. From Table 2, it is observed that the proposed EMOGWO obtains the best IGD values for more than 90% of the benchmark problems. As IGD values are good indicators of the convergence of different algorithms, the results depicted in Table 2 signify the better convergence of the proposed EMOGWO. There are a few benchmark problems, such as UF3, UF6, and UF7, in which MOEA/D obtains the best IGD values, while MOPSO shows better results than the proposed and other considered algorithms for benchmark UF8.