A novel density deviation multi-peaks automatic clustering algorithm

The density peaks clustering (DPC) algorithm is a classical and widely used clustering method. However, the DPC algorithm requires manual selection of cluster centers, relies on a single way of calculating density, and cannot effectively handle low-density points. To address these issues, we propose a novel density deviation multi-peaks automatic clustering method (AmDPC) in this paper. Firstly, we propose a new local density and use the deviation to measure the relationship between data points and the cut-off distance d_c. Secondly, we divide the density deviation into multiple equal density levels and extract the points with higher distances in each level. Finally, for the multi-peak points with higher distances at low-density levels, we merge them according to the size difference of the density deviation, and achieve overall automatic clustering by processing the low-density points. To verify the performance of the method, we test it on synthetic datasets, real-world datasets, and the Olivetti Face dataset. The experimental results indicate that the AmDPC method handles low-density points more effectively and exhibits good effectiveness and robustness.


Introduction
Databases are widely used in information systems. The development of internet technology and artificial intelligence imposes higher standards on the collection and processing of information [1]. Machine learning is the data processing approach closest to artificial intelligence systems [2], and data mining is an invaluable means of discovering the hidden worth of data [3]. Cluster analysis is a key data mining method: it categorizes data with unknown relationships according to the similarity between data points, allowing the valuable information hidden in unknown data to be analyzed [4]. In recent years, many scholars have proposed different types of clustering algorithms and solved many practical problems with them [5]. Cluster analysis techniques enjoy widespread adoption in pattern recognition [6], stock prediction [7], bioengineering [8], and transportation [9], and have become an indispensable tool for mining the value of information in modern society.
Density-based clustering includes the density peaks clustering (DPC) algorithm [10], the density distribution function clustering (DENCLUE) algorithm [11], the density-based spatial clustering (DBSCAN) algorithm, and others [12]. The DPC algorithm is the most representative clustering method of recent years. Compared with the k-means algorithm [13], the DPC method does not require the number of clusters to be predetermined and can effectively identify cluster centers in various complex datasets. Like the DBSCAN algorithm, DPC is also good at identifying noise points, and it does not require iteration to reach the best clustering result. Relative to the affinity propagation (AP) algorithm [14], the DPC algorithm is also well suited to medium- and large-scale clustering problems.
In recent years, the DPC algorithm has become a research hotspot. However, the algorithm requires the cluster centers to be selected manually, and this selection lacks an accurate scientific basis and is subjective [15,16]. Ding et al. [17] proposed an automatic clustering algorithm that detects cluster centers by taking the lower value of a judgment metric after normalizing the density and distance ranges. The DPC algorithm assigns data points mainly based on the identified cluster centers, which is very effective for clustering high-density points but struggles with data at low-density points. Wang et al. [18] proposed a systematic clustering method based on low-density representative points, which can effectively identify clusters of different shapes and sizes. Jiang et al. [19] exploited the sensitivity of DBSCAN to noise points for the halo points of the DPC algorithm, making anomaly identification more accurate and efficient.
For data with different density distributions, Chen et al. [20] proposed a domain adaptive density clustering (DADC) algorithm. The DADC algorithm can effectively handle data with different structural characteristics through strategies such as domain-adaptive density measurement, automatic extraction of initial cluster centers, and merging of fragmented clusters. Du et al. [21] proposed a robust border-peeling clustering algorithm, which uses Cauchy kernel density estimation and a bidirectional association strategy to share neighborhood information and improve clustering performance on complex datasets with non-uniform distributions. Qian et al. [22] proposed an adaptive density variation clustering algorithm based on local density information and a nearest-neighbor metric. This algorithm can cluster data of arbitrary shape and size with varying density and has some robustness to outliers. The above methods can all handle data with different density distributions, but they lack a systematic description and analysis of non-convex datasets with low-density points. The DPC method usually uses a Gaussian kernel function to compute the local density. However, the Gaussian kernel considers the entire data space when calculating the local density and may have difficulty distinguishing overlapping clusters and dealing with low-density points [23-25]. Milan et al. [26] used residuals instead of the Gaussian kernel function for the local density calculation to further examine the potential relationships between data points.
To address the above issues, we present a novel density deviation multi-peaks automatic clustering method. We first use the density deviation instead of the original local density calculation, so that the relationship between data points and the cut-off distance d_c can be examined more carefully. In the decision graph of density deviations, we take the rightmost area as the high-density level, the leftmost area as the low-density level, and the rest as the medium-density level. We select the points with higher distances in the different density levels as sub-centers. The multi-peak points in the high-density region are considered cluster centers and can be clustered directly. The multi-peak points in the medium- and low-density levels we call low-density cluster centers; we judge whether they belong to one class according to the distance between the peaks of the low-density centers, and then obtain the final clustering result. To validate the usefulness of the method, we conduct a clustering performance analysis on synthetic and real-world datasets. The experimental results indicate that the AmDPC method achieves a better clustering effect than other state-of-the-art methods.
The rest of this paper is organized as follows. Section "Related work" presents the technical details of the DPC algorithm. Section "A novel density deviation multi-peaks automatic clustering algorithm" focuses on the ideas and clustering details of the AmDPC algorithm. Section "Performance evaluation and experimental analysis" shows the experimental specifics and the comparative analysis results. Section "Conclusion" summarizes the contributions of the algorithm and gives future perspectives.

Related work
Rodriguez and Laio presented an efficient density peaks clustering method in Science in 2014. The method has only one parameter and can rapidly locate the cluster centers of data with arbitrary structure. The overall idea of the DPC algorithm is as follows: (i) each cluster center should be a local density maximum, surrounded by many data points of the same cluster with lower local density; (ii) there is an obvious distinction between different clusters, and the cluster centers are far apart from each other [27,28]. Therefore, data points that are denser and more distant from other sample points at the same density level are candidates to act as cluster centers.
In Eq. (1), we give the computation of the cut-off distance:

d_c = S_rnd(M·p)   (1)

where d_c represents the cut-off distance, S indicates the set of pairwise distances sorted in ascending order, p signifies the user-selected percentage parameter that determines which member of the set S is assigned to d_c, rnd(o) returns the integer value closest to o, and M is the size of the set S.
Assume a dataset S = {x_1, x_2, ..., x_N}, where N denotes the number of sample points. Its local density is defined through the Euclidean distance between data points,

d_ij = ||x_i − x_j||_2   (2)

In Eq. (3), we give the computation of the local density under the Gaussian kernel function:

ρ_i = Σ_{j≠i} exp(−(d_ij / d_c)²)   (3)

Based on our experience, we generally choose 1%-2% of the maximum spacing among all sample points as the cut-off distance d_c.
The DPC algorithm defines two important quantities for each sample: first, its local density, and second, the minimal distance to any data point with greater local density.
The distance δ_i refers to the distance from sample i to the nearest sample that is denser than it. The formula for δ_i is given as follows:

δ_i = min_{j: ρ_j > ρ_i} d_ij; for the point of highest density, δ_i = max_j d_ij   (4)

The decision graph plots each sample point with the density ρ on the horizontal axis and the parameter δ on the vertical axis (see Fig. 1a). The data points at the top right of the decision graph, which are clearly separated from most of the samples, are usually selected as the best cluster centers: the upper-right position corresponds to sample points with both high density and large distance. The strategy of selecting cluster centers from the decision graph is the core of the DPC method. Finally, according to the locations of the cluster centers, DPC assigns the remaining points to their nearest cluster center to obtain the final clustering results [29,30].
The DPC algorithm has very efficient clustering performance. The algorithm does not depend on the order of data input; the data can be shuffled and re-input without affecting the clustering result. Its method of selecting cluster centers is relatively intuitive and stable on some datasets, and the clustering results are reliable within the applicable range. Thanks to the density ranking, density-based clustering is superior to distance-based clustering in dealing with non-spherical clusters. However, the algorithm also has certain shortcomings. The DPC method has a single way of calculating the local density, which cannot effectively measure the relationships between data, and the cluster center selection lacks a scientific theoretical basis and cannot handle data with low-density points. Figure 1 presents the decision graph and the clustering result of the Pathbased dataset under the DPC method. In Fig. 1b, we give the true clustering result for the Pathbased dataset. Since the true result of this dataset has three classes, we select the three points in the red rectangular box at the top right of the decision graph as the cluster centers. In Fig. 1c, we can see that the red points in the low-density region do not obtain the best clustering result. The above analysis shows that the DPC algorithm fails to handle the low-density regions of the dataset efficiently.
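The two DPC quantities described above can be sketched in a few lines of Python. This is a minimal illustration of Eqs. (1)-(4) with a Gaussian kernel, not the authors' implementation; the function and variable names are ours.

```python
import numpy as np

def dpc_quantities(X, p=0.02):
    """Compute the DPC local density rho (Gaussian kernel, Eq. (3))
    and the distance delta to the nearest denser point (Eq. (4))."""
    n = len(X)
    # pairwise Euclidean distances (Eq. (2))
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # cut-off distance d_c: the rnd(M*p)-th smallest pairwise distance (Eq. (1))
    sorted_d = np.sort(d[np.triu_indices(n, k=1)])
    dc = sorted_d[max(int(round(p * len(sorted_d))) - 1, 0)]
    # Gaussian-kernel local density; subtract 1 to drop the self term
    rho = np.exp(-(d / dc) ** 2).sum(axis=1) - 1.0
    # delta: distance to the nearest point of higher density;
    # the globally densest point gets its maximum distance instead
    delta = np.empty(n)
    for i in range(n):
        denser = np.where(rho > rho[i])[0]
        delta[i] = d[i].max() if denser.size == 0 else d[i, denser].min()
    return rho, delta, dc
```

Points with jointly large ρ and δ then appear in the upper right of the decision graph and are taken as cluster centers.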

A novel density deviation multi-peaks automatic clustering algorithm
In this section, we put forward an automatic clustering algorithm named the density deviation multi-peaks automatic clustering algorithm (AmDPC). Firstly, we propose a density deviation framework that can further explore the density relationships between different data points. Then, we divide the density deviations of the data equally into multiple regions. For the different levels of density deviation, we select several peak points as sub-cluster centers; we name these sub-cluster centers multi-peak points. Finally, we set the size of the low-density threshold according to the decision graph to merge the low-density points and process individual high-density points.

Density deviation theory
The most frequently used local density calculation in the DPC algorithm is the Gaussian kernel function. However, this computation cannot measure the potential relationship and dispersion between the data, nor can it further explore the degree of deviation between data points within a certain range. Therefore, a more reasonable local density framework is needed to express the relationships in the data. The deviation reflects the difference between the mean and the true value: it measures both the dispersion of the data distribution and the deviation of each unit's value in the statistical aggregate. Hence, in this paper, we introduce the calculation of deviations into the local density. The relationship between the distance between data points and its deviation from the cut-off distance is given in Eq. (5). We use de to refer to the degree of deviation between the cut-off distance and the data distance. In Eq. (5), d_c denotes the cut-off distance of the AmDPC algorithm (see Eq. (1)) and d_ij represents the distance between data points (see Eq. (2)). To capture the deviation of data points from the cut-off distance within a known range, we introduce the number of neighboring data points N_i in Eq. (5).
According to Eq. (6), we can better measure the relationship between data points when selecting different numbers of neighboring data points N_i, analogous to the way the Gaussian kernel measures the local density of a sample. In Eq. (7), we therefore define the deviation density by accumulating the deviations over the N_i neighboring data points. Compared with the local density, the deviation density provides a better measure of the potential relationships between data points. To capture the relationship between the local density and the deviation density of a sample, we give the following definition.
Definition 1 (Density deviation) The density deviation is the absolute value of the difference between the overall deviation density and the local density of a statistical sample. The specific computation formula is:

d_ρi = |ρ_de_i − ρ_i|   (8)

where ρ_de_i is the deviation density of sample i and ρ_i its local density. Compared with the local density, the density deviation describes the relationship between data points and the cut-off distance in more detail, and captures the degree of density deviation between samples as a whole.
Compared with the local density of the DPC method, the density deviation better captures the potential relationships between data points within a certain range. It can also further measure the density dispersion of the overall sample, which makes it more valuable for measuring density properties.
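As a concrete illustration, a density-deviation computation in the spirit of Eqs. (5)-(8) can be sketched as follows. Since the exact formulas are not reproduced here, this is one plausible reading, assuming the deviation density accumulates |d_ij − d_c| over the N_i nearest neighbours and the density deviation is its absolute difference from the local density (Definition 1); treat it as a hypothetical sketch, not the paper's reference implementation.

```python
import numpy as np

def density_deviation(d, dc, n_neighbors, rho):
    """d: (n, n) pairwise distance matrix; dc: cut-off distance;
    rho: local densities. Returns |deviation density - local density|
    per point (our reading of Definition 1)."""
    n = d.shape[0]
    dev_density = np.empty(n)
    for i in range(n):
        # the n_neighbors nearest neighbours of i (index 0 is i itself)
        nn = np.argsort(d[i])[1:n_neighbors + 1]
        # accumulate each neighbour's deviation from the cut-off distance
        dev_density[i] = np.abs(d[i, nn] - dc).sum()
    return np.abs(dev_density - rho)
```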

Low-density points acquisition
Based on the above analysis, we use the density deviation instead of the local density to measure the relationships among the data, which yields the new clustering decision graph (see Fig. 2b). From Fig. 1, we can see that the DPC method cannot effectively handle the data points in the low-density region: the correct clustering results cannot be obtained by the original data point assignment method alone. In particular, in Fig. 1c, the cluster centroid in the low-density region marked with the blue number 1 could not be processed efficiently. We therefore need another way of assigning cluster centroids, and we propose an alternative method for finding cluster centers.
The DPC method selects the cluster centers manually using rectangular boxes, with some subjectivity, and lacks a reasonable approach for cluster centroids in low-density regions. To observe low-density points more flexibly, we divide the density deviation range of the decision graph equally into multiple regions (Eq. (9)).
In Eq. (9), to prevent the value of d_ρmin from being too small and affecting the selection of cluster centers in low-density regions, we uniformly set any d_ρmin less than 0.001 to 0.1. The λ indicates the number of regions to be split. Hence, we obtain the density intervals in the decision graph as shown in Eq. (10).
In Fig. 2b, C_3 is the region of the points with the highest density deviation. In this paper, we take the points with both higher density deviation and higher distance as high-density cluster centers. In Eq. (11), δ_Cλavg denotes the average distance of the points in the region of highest density deviation. In Eq. (12), C_λ denotes the cluster-centroid partition line within the highest density deviation level; Num_Cλavg denotes the number of data points whose distance is greater than δ_Cλavg in that level; and mean(δ_Cλavg ≤ δ ≤ δ_Cλmax) denotes the mean distance of the data points between δ_Cλavg and δ_Cλmax in that level. With the above equations, we can automatically find appropriate high-density-deviation cluster centers in data with different structures.
After numerous experiments, we found that when λ is 3, we can obtain the sub-cluster centroids of almost all data, so in this paper we uniformly set λ to the constant 3. In Fig. 2b, the red lines indicate that we divide the density deviation into three equal intervals in the decision graph; the vertical coordinate indicates the distance of each data point. To obtain more low-density points, we set a peaks splitting line on the distance axis in each of the three intervals. In Fig. 2b, C_1, C_2, and C_3 are the peaks splitting lines of the three intervals. The points above the splitting lines are collectively called multi-peak points (also called sub-cluster centroids). In Fig. 2c, we extract the multi-peak points separately to provide a reference for processing the density levels.
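The splitting step can be sketched as follows: divide the density-deviation axis into λ equal intervals and, within each interval, keep the points whose distance δ exceeds that interval's splitting line. Here we use the interval's mean δ as the line, in the spirit of δ_Cλavg in Eq. (11); the exact splitting rule of Eqs. (11)-(12) may differ, so this is an illustrative sketch with names of our choosing.

```python
import numpy as np

def multi_peak_points(d_rho, delta, n_levels=3):
    """Split the density-deviation range into n_levels equal intervals
    and return, per interval, the indices of points whose distance delta
    lies above that interval's splitting line (here: the interval mean)."""
    edges = np.linspace(d_rho.min(), d_rho.max(), n_levels + 1)
    peaks = []
    for k in range(n_levels):
        # half-open intervals, closing the last one on the right
        if k < n_levels - 1:
            mask = (d_rho >= edges[k]) & (d_rho < edges[k + 1])
        else:
            mask = d_rho >= edges[k]
        idx = np.where(mask)[0]
        if idx.size == 0:
            peaks.append(idx)
            continue
        line = delta[idx].mean()  # splitting line of this interval
        peaks.append(idx[delta[idx] > line])
    return peaks
```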
We name the multi-peak points in the intervals C_1, C_2, and C_3 as P_1, P_2, and P_3, respectively. During clustering, different orderings of the multi-peak point intervals can produce different clustering results; the specific formula is given in Eq. (13), where w, ascend, and descend are used to control the ordering of the sub-cluster centroids. As shown in Fig. 2c, we set two density-level threshold partition lines A and B (A < B). We name the multi-peak points with density deviation in the range (d_ρmin, A) low-density points, those in (A, B) medium-density points, and those in (B, d_ρmax) high-density points. In summary, we divide the density deviations into different density strata by the threshold partition lines.
In Fig. 2c, we can see that the sub-cluster centers in the low-density region are concentrated, with density values close to the lowest-density points; likewise, the sub-cluster centers in the high-density region are concentrated, with density values close to the highest-density point. We therefore design a more reasonable threshold partition line based on this feature of the data. In Eqs. (14) and (15), ρ_cmax denotes the maximum density value among the sub-cluster centroids, ρ_cmin denotes the minimum, and N_i is the number of neighboring data points.

Automatic clustering of different density levels
Dividing the density levels gives us different density deviation strata; we then merge and assign the data according to these levels. For the division and merging of multi-peak points, we distinguish two cases for clustering: (i) the threshold partition lines A and B both lie between the maximum and minimum density deviations; (ii) the threshold partition line B is smaller than the minimum density deviation d_ρmin. We refer to these two clustering modes as low-density multi-peak points clustering and high-density uniform clustering, respectively.
Low-density multi-peak points clustering: First, we find the ordinal numbers of the locations of the multi-peak points of the density deviations at the different density levels (see Fig. 2c). These ordinal numbers are then sorted from lowest to highest according to the density hierarchy. We first cluster the data points in the highest-density region, then merge the multi-peak points in the low-density region (d_ρmin, A), and finally combine the points in the medium-density region (A, B). This yields the final automatic clustering result without human intervention (see Fig. 2d).
High-density uniform clustering: The previous method targets datasets with low-density regions. However, if all cluster centers in the decision graph have high density deviations, the above assignment no longer applies, so we propose an allocation method for high-density uniform clustering. If the threshold partition line B is smaller than the minimum density deviation d_ρmin, the multi-peak points we obtained are all high-density cluster centroids. In this case, we assign each data point to the nearest point that is denser than itself, and finally obtain the automatic clustering result.
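The high-density uniform clustering branch uses the standard DPC-style assignment: every non-centre point inherits the label of its nearest neighbour of higher density. A minimal sketch, assuming the globally densest point is among the chosen centres, with illustrative names:

```python
import numpy as np

def assign_by_nearest_denser(d, rho, centers):
    """d: (n, n) distance matrix; rho: densities; centers: list of
    centre indices. Assigns each remaining point the label of its
    nearest denser neighbour and returns one label per point."""
    n = len(rho)
    labels = np.full(n, -1)
    for c_label, c in enumerate(centers):
        labels[c] = c_label
    # visit points from densest to least dense so parents are labelled first
    for i in np.argsort(-rho):
        if labels[i] != -1:
            continue
        denser = np.where(rho > rho[i])[0]
        parent = denser[np.argmin(d[i, denser])]
        labels[i] = labels[parent]
    return labels
```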
To contrast the different clustering results on the same dataset, we set the cut-off distance d_c in Figs. 1 and 2 to the same value and observe the effects of the other parameters on the clustering results. To observe the low-density point processing more clearly, we give the true clustering result for the Pathbased dataset in Fig. 2a. In Fig. 2d, the hexagrams are the cluster centroids given by the multi-peak points, and the red numbers are the ordinal numbers of the locations of the data points' density deviations.

Algorithm 1 The proposed AmDPC method
Require: dataset; parameters d_c, N_i; peaks splitting lines C_1, C_2.
Ensure: automatic clustering results.
1: Obtain the distance matrix, the density deviations (d_ρi), and the distances (δ);
2: Classify the density deviation levels according to Eqs. (8) and (9);
3: Select the multi-peak points;
4: for each of the multi-peak point orderings P_1, P_2, P_3 do
5: if ascending order is chosen then
6: sort the multi-peak points in each interval from lowest to highest (see Eq. (13));
7: else
8: sort the multi-peak points in each interval from highest to lowest (see Eq. (13));
9: end if
10: end for
11: for the low-density points acquisition do
12: if the multi-peak points lie in (…
24: Obtain the automatic clustering result;
25: end for

In Fig. 2d, we can see that the data points in the low-density region are effectively clustered, and the Pathbased dataset obtains the best clustering result relative to the DPC algorithm (see Fig. 1c).
In summary, for different types of data, the AmDPC method can divide and process data according to the decision graph properties of the data points. For low-density data, the AmDPC algorithm has very good reliability and generalizability. The entire procedure of AmDPC is shown in Algorithm 1.

Performance evaluation and experimental analysis
In this section, we conduct simulation experiments on synthetic datasets, real-world datasets, and the Olivetti Face dataset to validate the clustering performance of the AmDPC method. We use the performance evaluation metrics normalized mutual information (NMI), accuracy (ACC), adjusted Rand index (ARI), and the F-measure (FM). To understand the clustering ability of the AmDPC algorithm more clearly, we compare these evaluation metrics with those of state-of-the-art clustering methods, namely DBSCAN [12], AP [14], DPC [10], McDPC [31], REDPC [26], and FHC-LDP [32]. In Table 1, we list the eleven synthetic datasets and seven real-world datasets used in the simulation tests.

Clustering performance evaluation indexes
Before analyzing the clustering results, we present the clustering performance evaluation metrics. We use four popular metrics (NMI, ARI, ACC, FM). Their values range between -1 and 1, with values closer to 1 indicating better clustering performance. NMI is a widely used external evaluation metric for clustering. For true labels A and clustered labels B, the values of A form the vector U and the values of B form the vector V. The NMI of U and V can be described as

NMI(U, V) = 2 Σ_{u,v} p(u, v) log( p(u, v) / (p(u) p(v)) ) / (H(U) + H(V))

where p(u) denotes the proportion of u in A, p(v) the proportion of v in B, p(u, v) the joint probability of u and v, and H(·) the entropy. In clustering algorithm evaluation, the NMI of the true labels A and the clustering labels B is often used to weigh the effectiveness of a method. ACC is a classical metric for measuring clustering accuracy. Assume a dataset with n samples, where r_i and s_i denote the clustering label and the true label of sample i, respectively; ACC can be defined as

ACC = ( Σ_{i=1}^{n} δ(s_i, map(r_i)) ) / n

where map(r_i) is the permutation mapping function that matches the clustering labels to the true labels, and δ(x, y) equals 1 if x = y and 0 otherwise.
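As an executable reference, NMI as described above can be computed directly from the joint label distribution. This sketch normalizes the mutual information by the arithmetic mean of the entropies, one common convention; the paper's exact normalization may differ.

```python
import numpy as np

def nmi(a, b):
    """Normalized mutual information of two label vectors,
    NMI = 2 * I(U; V) / (H(U) + H(V)). Assumes at least two distinct
    labels in each vector (otherwise the entropies vanish)."""
    a, b = np.asarray(a), np.asarray(b)
    ua, ub = np.unique(a), np.unique(b)
    # joint distribution p(u, v) and marginals p(u), p(v)
    p_uv = np.array([[np.mean((a == u) & (b == v)) for v in ub] for u in ua])
    p_u, p_v = p_uv.sum(axis=1), p_uv.sum(axis=0)
    mi = sum(p_uv[i, j] * np.log(p_uv[i, j] / (p_u[i] * p_v[j]))
             for i in range(len(ua)) for j in range(len(ub)) if p_uv[i, j] > 0)
    h_u, h_v = -np.sum(p_u * np.log(p_u)), -np.sum(p_v * np.log(p_v))
    return 2.0 * mi / (h_u + h_v)
```

Label vectors that agree up to a renaming of clusters give NMI = 1, while independent labelings give NMI close to 0.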
Suppose the true labels of a dataset are P and the labels of the clustering result are Q. Let c_1 be the number of sample pairs that fall into the same cluster in both P and Q, c_2 the number of pairs in the same cluster in P but not in Q, and c_3 the number of pairs in different clusters in P but the same cluster in Q. Then the FM is calculated as shown in Eq. (22):

FM = c_1 / sqrt( (c_1 + c_2)(c_1 + c_3) )   (22)
The Rand index (RI) values of different methods are generally all high, which makes it difficult to compare clustering ability, so we typically use the ARI instead of the RI to test a method's ability. The ARI is defined as

ARI = (RI − E[RI]) / (max(RI) − E[RI])

where E[RI] is the expected RI under random labelings.
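The pair counts c_1, c_2, c_3 defined above determine FM directly (and the same pair-counting view underlies RI and ARI). A minimal sketch with illustrative names:

```python
from itertools import combinations

def pair_counts(p, q):
    """c1: pairs in the same cluster in both P and Q; c2: same in P,
    different in Q; c3: different in P, same in Q."""
    c1 = c2 = c3 = 0
    for i, j in combinations(range(len(p)), 2):
        same_p, same_q = p[i] == p[j], q[i] == q[j]
        if same_p and same_q:
            c1 += 1
        elif same_p:
            c2 += 1
        elif same_q:
            c3 += 1
    return c1, c2, c3

def fowlkes_mallows(p, q):
    # FM = c1 / sqrt((c1 + c2) * (c1 + c3)), cf. Eq. (22)
    c1, c2, c3 = pair_counts(p, q)
    return c1 / ((c1 + c2) * (c1 + c3)) ** 0.5
```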

Experiment results on synthetic datasets
In Table 2, we give the clustering results of the synthetic datasets under the different evaluation metrics for the seven state-of-the-art methods; the best metric values, corresponding to the optimal parameters, are shown in bold. To see the clustering of the synthetic datasets by the different methods more intuitively, we give the visualization results for all datasets of Table 2 in Figs. 3-13. In Table 2, the Pathbased, Halfkernel, 2circles, Zelink1, Smile2, and Jain datasets are all non-convex datasets with low-density points. From the table, we can see that the AmDPC algorithm obtains the best clustering results on all of the non-convex datasets. We therefore conclude that the AmDPC algorithm is superior to the other algorithms on low-density data points and handles non-convex datasets well. Tables 2 and 3 also show the parameter values corresponding to the clustering results of the different methods. Among the seven state-of-the-art methods, the DBSCAN parameters are the neighbourhood radius (Eps) and the minimum number of points (MinPts); the AP parameters are the median of the similarity matrix (Preference) and the damping factor (λ); the DPC parameter is the neighbor percentage (p); the McDPC algorithm uses the parameters (γ, θ, λ) and the neighbor percentage (p); the REDPC parameters are the neighborhood size (N_i) and the neighbor percentage (p); and the FHC-LDP method uses the number of neighbors (K) and the number of clusters (C). The parameters of the AmDPC method are the number of neighbors (N_i), the peak splitting lines (C_1, C_2), and the neighbor percentage (p); among them, C_1 and C_2 modulate the selection of low-density cluster centers.
We tune these four groups of AmDPC parameters jointly; all reported parameters correspond to the best clustering evaluation index values.
From Table 2, we can see that the AmDPC algorithm obtains the best clustering metric values on the first nine synthetic datasets compared with the other state-of-the-art methods. The AmDPC method is especially advanced on non-convex datasets such as Pathbased and Halfkernel. On the Pathbased dataset, the FHC-LDP algorithm cannot effectively handle the low-density points and has lower metric values than the AmDPC algorithm. On the D2 dataset, the AP, REDPC, and AmDPC algorithms achieve the best clustering results relative to the other methods. In Fig. 8, we can see that the REDPC method identifies the outlier in the D2 dataset. We can therefore conclude that the AmDPC algorithm can very accurately assign outliers at low-density points to the corresponding cluster (see Table 2). In general, the AmDPC algorithm produces very good clustering results on non-convex datasets with low-density points.
The experiments show that the AmDPC algorithm achieves the best clustering results on all datasets except Complex8 and Complex9. On the Pathbased dataset, the DBSCAN, AP, DPC, REDPC, and FHC-LDP algorithms cannot obtain accurate clustering results (see Fig. 3), whereas the AmDPC algorithm obtains the best result by clustering and merging the low-density points. In Fig. 4, the AP, DPC, and REDPC algorithms cannot get accurate results, while the DBSCAN, McDPC, FHC-LDP, and AmDPC algorithms obtain the best clustering results. On the non-convex datasets of Figs. 5, 9, 10, and 11, the AmDPC algorithm obtains the best clustering results, and in Figs. 6, 7, and 8, it still obtains the best results on high-density data points. In Fig. 8, the DBSCAN, AP, and REDPC methods fail to get the correct results, whereas the DPC, McDPC, FHC-LDP, and AmDPC algorithms succeed. This shows that the improved algorithm can accurately handle the data points in low-density regions and automatically reaches the best clustering results without manual intervention.
In Figs. 12 and 13, Complex8 and Complex9 are both datasets with complex structures and non-uniform density. DBSCAN comes closest to the best clustering results on these two datasets. In Fig. 12, the FHC-LDP algorithm can also cluster data with arbitrary structure, but its performance is slightly lower than that of DBSCAN. In Fig. 13, the FHC-LDP algorithm achieves the best clustering result for the Complex9 dataset and DBSCAN the second best. From Table 2, we can conclude that the AmDPC algorithm is second only to the DBSCAN and FHC-LDP methods in terms of evaluation metric values on the Complex8 and Complex9 datasets. From the above figures, we can see that the AmDPC algorithm is superior to the other state-of-the-art methods for clustering non-convex datasets with low-density points, and it outperforms the AP, DPC, McDPC, and REDPC algorithms on non-uniform-density datasets with complex structures.
In general, the AmDPC algorithm has excellent clustering performance on non-convex datasets with low-density points, and also a certain clustering effect on complex datasets with non-uniform density structures. To further validate the method, we confirm its clustering utility on high-dimensional data. Table 3 shows the clustering evaluation metrics of the seven state-of-the-art methods on real-world datasets; the best results are given according to the magnitude of the four clustering index values, and the parameters in the table all correspond to the optimal experimental results of the different algorithms. From Table 3, we can see that the clustering evaluation metrics of the AmDPC method are significantly higher than those of the other methods on the Heart dataset. On the Liver dataset, the AmDPC method has the best NMI, ACC, and ARI values; although the FM metric of the REDPC algorithm is the highest, its ARI value is negative and its overall clustering accuracy is not as high as that of the AmDPC algorithm. On the Sonar dataset, the NMI, ACC, and ARI values of the AmDPC method are significantly higher than those of the other state-of-the-art algorithms; although the McDPC algorithm has the highest FM value, its other index values are 0, and its overall clustering accuracy is not as good as that of the AmDPC method.

Experimental results on real-world datasets
On the Breast dataset, the AmDPC algorithm has three clustering evaluation metrics higher than those of the other algorithms, while its FM metric value is second only to that of the DBSCAN algorithm. On the Pima dataset, the NMI, ACC, and ARI clustering evaluation metric values of the AmDPC algorithm are much higher than those of the other algorithms. Although the FM metric values of the McDPC, REDPC, and FHC-LDP methods are slightly higher than that of the AmDPC algorithm, their other metric values are particularly low and their overall clustering accuracy is not good. Therefore, the clustering performance of the AmDPC algorithm is higher than that of the other algorithms on the Pima dataset. On the Glass dataset, the AmDPC method has two clustering evaluation metrics higher than those of the other algorithms, and another evaluation metric second only to that of the FHC-LDP algorithm. The ACC and ARI evaluation metrics of the AmDPC method achieve the best clustering results on the Letter dataset, which has a large number of sample points. Although the NMI and FM metrics of the FHC-LDP algorithm have the highest values, all of its other values are low, and its overall clustering accuracy is not as high as that of the AmDPC method. In general, the AmDPC algorithm has better clustering performance on the real-world datasets, and it has certain practicality and effectiveness.
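As a concrete illustration of how the four evaluation metrics in these comparisons can be computed, the sketch below uses the standard scikit-learn implementations of NMI, ARI, and FM; the clustering accuracy (ACC) helper is our own hypothetical implementation, which matches predicted labels to true labels with the Hungarian algorithm before counting correct assignments (a common convention, assumed rather than taken from the paper):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import (adjusted_rand_score,
                             fowlkes_mallows_score,
                             normalized_mutual_info_score)

def clustering_accuracy(y_true, y_pred):
    """ACC: best one-to-one match between predicted and true labels
    (Hungarian algorithm), then the fraction of correctly assigned points."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    k = int(max(y_true.max(), y_pred.max())) + 1
    # contingency matrix: count[t, p] = points with true label t, predicted p
    count = np.zeros((k, k), dtype=int)
    for t, p in zip(y_true, y_pred):
        count[t, p] += 1
    row, col = linear_sum_assignment(-count)   # negate to maximize matches
    return count[row, col].sum() / len(y_true)

def evaluate(y_true, y_pred):
    """Return the four metrics used in the comparison tables."""
    return {"NMI": normalized_mutual_info_score(y_true, y_pred),
            "ACC": clustering_accuracy(y_true, y_pred),
            "ARI": adjusted_rand_score(y_true, y_pred),
            "FM":  fowlkes_mallows_score(y_true, y_pred)}
```

For a prediction that is a pure relabeling of the ground truth, all four metrics evaluate to 1, which is why cluster-label permutations do not affect the reported scores.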

Clustering time complexity analysis
In this subsection, we further compare and analyze the time complexity of the DBSCAN, AP, DPC, McDPC, REDPC, FHC-LDP, and AmDPC algorithms.
In AmDPC, the time complexity of calculating the density deviation ρ_i and the distance δ_i is O(n²). As shown in Table 4, the time complexity of the DBSCAN and FHC-LDP algorithms is low, while that of the AP clustering algorithm is relatively high because it requires iterations. In Table 4, n is the number of data points and I is the number of iterations. The time complexity of the DPC, McDPC, REDPC, and AmDPC methods is slightly lower than that of the AP method and higher than that of the DBSCAN and FHC-LDP methods.
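To make the O(n²) cost concrete, the following sketch computes the classical DPC quantities from which AmDPC starts: a full pairwise-distance matrix, a Gaussian-kernel local density ρ_i, and the distance δ_i to the nearest point of higher density. This is a minimal illustration of the standard DPC definitions, not the paper's density-deviation variant; the function name and kernel choice are our own assumptions.

```python
import numpy as np

def dpc_rho_delta(X, dc):
    """O(n^2) computation of DPC-style local density rho and distance delta.

    X  : (n, d) array of points
    dc : cut-off distance
    """
    n = len(X)
    # pairwise Euclidean distances: O(n^2) time and memory
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Gaussian-kernel local density; subtract the self term exp(0) = 1
    rho = np.exp(-(D / dc) ** 2).sum(axis=1) - 1.0
    # delta_i: distance to the nearest point of strictly higher density
    delta = np.zeros(n)
    order = np.argsort(-rho)            # indices by decreasing density
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = D[i].max()       # convention for the densest point
        else:
            delta[i] = D[i, order[:rank]].min()
    return rho, delta
```

Both the distance matrix and the delta loop dominate at O(n²), which matches the complexity attributed to the DPC-family methods in Table 4.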

Clustering performance analysis of facial recognition
In this subsection, we use the ORL database (Olivetti Research Laboratory in Cambridge, UK) to test the practicality of the AmDPC method and the other algorithms. It contains 400 face images of 40 different people, with ten images per person [33]. We take the first ten people, a total of 100 face images of the same size without overlapping, for our experiment. We assign each person an identical class label, for a total of 10 class labels. In Table 5, we give the results of the clustering evaluation metrics for the seven classical methods on the Olivetti Face dataset. The parameter types of the different algorithms in Table 5 are the same as those in Tables 2 and 3, and the parameters correspond to the best clustering results obtained after several trials. In Table 5, we highlight the best clustering metrics. From Table 5, we can see that the evaluation metrics of the AmDPC algorithm are significantly higher than those of the other algorithms, indicating that our proposed algorithm has higher effectiveness and robustness.
In Fig. 14, we also give the visualization results of the AmDPC algorithm on the Olivetti Face dataset, where different colors indicate different categories. In Fig. 14, we can see that the 10 people are divided into 9 categories: the first and the fourth person from the top are grouped into one category. The facial expressions of the second, third, sixth, and seventh people are recognized very accurately. For the fifth and tenth people, nine and eight facial images are identified correctly, respectively, while their remaining images are classified into the categories of the ninth and seventh person, respectively. In general, the AmDPC method is effective and practical for face recognition compared with other advanced methods.

Multi-peak points intervals sequencing ablation experiments
In this subsection, we further discuss the effect of the multi-peak points interval ordering methods on the clustering results. In Eq. (13), we only introduce two ordering methods ({P_1, P_2, P_3} and {P_3, P_2, P_1}) and do not show the effect of the other orderings on the clustering results. There are six possible orderings of P_1, P_2, and P_3. Since we choose only one ordering for each experiment, we analyze each of the six orderings separately (see Table 6). Through the ablation experiments, we find that the clustering results of the other four orderings are the same as the results of the two cases in Eq. (13). The best clustering performance can therefore be achieved using just the two cases in Eq. (13).
We select four synthetic datasets and four real-world datasets for testing from Tables 2 and 3, respectively. To ensure the validity of the experiments, we keep the four parameters of the AmDPC algorithm in Tables 2 and 3 unchanged and observe the effect of the different orderings on the clustering performance. We list the six orderings {P_1, P_2, P_3}, {P_1, P_3, P_2}, {P_2, P_1, P_3}, {P_2, P_3, P_1}, {P_3, P_1, P_2}, and {P_3, P_2, P_1} in Table 6. In the table, we bold the clustering results corresponding to the two orderings in Eq. (13) and further observe their effects on the clustering results.
From Table 6, we can see that the evaluation metric values of the Flame, D2, Liver, and Letter datasets remain unchanged across the six orderings in the AmDPC algorithm, which indicates that the interval ordering of the AmDPC method has no effect on the clustering results for these datasets. We also find that the AmDPC method produces two groups of clustering metric values under the interval orderings on the Pathbased, 2circles, Breast, and Glass datasets. In these four datasets, the orderings {P_1, P_2, P_3}, {P_1, P_3, P_2}, and {P_2, P_1, P_3} all yield the same clustering metric values, and the orderings {P_2, P_3, P_1}, {P_3, P_1, P_2}, and {P_3, P_2, P_1} also yield the same values. This indicates that the interval ordering of multi-peak points has some influence on the clustering performance. The clustering results of the {P_1, P_2, P_3}, {P_1, P_3, P_2}, and {P_2, P_1, P_3} orderings on the Pathbased, Breast, and Glass datasets are better than those of the {P_2, P_3, P_1}, {P_3, P_1, P_2}, and {P_3, P_2, P_1} orderings. On the 2circles dataset, the clustering results obtained by the {P_2, P_3, P_1}, {P_3, P_1, P_2}, and {P_3, P_2, P_1} orderings are 1 and thus better than those obtained by the {P_1, P_2, P_3}, {P_1, P_3, P_2}, and {P_2, P_1, P_3} orderings.
Since the clustering results of the other four orderings are the same as those of the {P_1, P_2, P_3} and {P_3, P_2, P_1} orderings, the best clustering results of the AmDPC algorithm can be fully achieved using only the {P_1, P_2, P_3} and {P_3, P_2, P_1} orderings. This further indicates that the effect of the other orderings on the clustering results is consistent with the two cases in Eq. (13). In summary, we conclude that the AmDPC algorithm can obtain the best clustering results by testing and comparing only the two ordering cases in Eq. (13).
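The ablation protocol above can be expressed compactly: enumerate all six orderings of the three intervals, run the merge step under each, and group orderings by identical outcome. The helper below is a schematic stand-in (both function names are our own), paired with a toy merge whose result depends only on whether P1 precedes P3, reproducing the two outcome groups of three orderings each that Table 6 reports for the Pathbased, 2circles, Breast, and Glass datasets:

```python
from itertools import permutations

def ablate_orderings(run_merge):
    """Group the six orderings of the intervals P1, P2, P3 by the
    clustering outcome that run_merge(order) returns for each."""
    outcomes = {}
    for order in permutations(("P1", "P2", "P3")):
        outcomes.setdefault(run_merge(order), []).append(order)
    return outcomes

# Toy stand-in for the merge step: the outcome depends only on whether
# P1 is processed before P3, which yields exactly two groups of three.
def toy_merge(order):
    return "group-A" if order.index("P1") < order.index("P3") else "group-B"
```

Under this protocol, checking one ordering from each group (e.g. {P1, P2, P3} and {P3, P2, P1}, the two cases in Eq. (13)) is sufficient to cover all possible outcomes.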

Parameters sensitivity analysis of the AmDPC algorithm
The AmDPC algorithm has four parameters: N_i, C_1, C_2, and p. In this subsection, we test the sensitivity of the clustering results to each of these four parameters to further explore their impact on the clustering performance.
We first start with the parameter p, which directly affects the quality of decision graph generation in the AmDPC, REDPC, and DPC algorithms. We set different p values for testing, p = {1, 2, …, 8} (p is a positive integer), keep the other parameters constant, and test on three synthetic and three real-world datasets; the clustering results are shown in Tables 7, 8, 9, and 10. The values in these tables are the means and standard deviations of the eight experimental results of the AmDPC, REDPC, and DPC algorithms for p in the range of one to eight. We bold the mean and standard deviation of the better evaluation metric results. The AmDPC algorithm has better clustering results than the REDPC and DPC methods when p is in the range of one to eight, and the difference in stability between them is not significant. In Table 8, we can see that the REDPC algorithm outperforms the DPC and AmDPC algorithms in terms of mean and standard deviation on the Pima dataset; the main reason is that the tested range of p does not allow the AmDPC algorithm to achieve its best clustering results. From Tables 7, 8, 9, and 10, we can see that the mean values of the evaluation metrics of the AmDPC algorithm on the different datasets are significantly higher than those of the REDPC and DPC methods. The standard deviation of the AmDPC algorithm is higher than that of the REDPC and DPC methods on some datasets, mainly because the AmDPC metric values are higher than those of the other algorithms throughout the sensitivity test of p.
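The role of p can be made concrete with the usual DPC heuristic, in which p selects the cut-off distance d_c as a percentile of all pairwise distances. The sketch below assumes this standard interpretation of p (the function name is ours):

```python
import numpy as np

def cutoff_distance(X, p):
    """Return d_c as the p-th percentile (p in percent, e.g. p = 2)
    of all pairwise Euclidean distances, the usual DPC heuristic."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    iu = np.triu_indices(len(X), k=1)   # each pair once, no self-distances
    return float(np.percentile(D[iu], p))
```

A larger p yields a larger d_c and therefore a smoother density estimate, which is why the clustering metrics vary as p sweeps from one to eight in the sensitivity tests.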
To further analyze the sensitivity of the other parameters of the AmDPC method, we give the range of parameter values for the different datasets in Tables 11, 12, and 13. We divide the ranges of the N_i, C_1, and C_2 parameter values evenly into eight values, respectively, and compute the mean and standard deviation of the eight results. Table 11 shows that the parameter N_i obtains high clustering metric values on the different datasets. Encouragingly, for these datasets, the mean of the clustering results for N_i within a certain range is high and the standard deviation is very low, which indicates that the parameter N_i has little effect on the clustering results. In Tables 12 and 13, the evaluation metric values for the C_1 and C_2 parameters are unchanged, all datasets achieve the best clustering effect within a certain range, and the standard deviation of the AmDPC clustering results is 0. The main reason is that these parameter values do not affect the selection of clustering centers or the low-density processing, so the clustering results do not change. In summary, the clustering results are not sensitive to the parameters N_i, C_1, and C_2 of the AmDPC algorithm, and good clustering results are achieved within a certain range of each.
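The sensitivity protocol just described for N_i, C_1, and C_2 (eight evenly spaced values over a range, then the mean and standard deviation of the resulting metric) can be sketched as follows; `run` is a placeholder for any routine that maps a parameter value to a clustering score:

```python
import numpy as np

def sweep_parameter(lo, hi, run, n_values=8):
    """Evenly sample n_values settings in [lo, hi], score each with the
    caller-supplied run(value), and return (mean, std) of the scores."""
    values = np.linspace(lo, hi, n_values)
    scores = np.array([run(v) for v in values])
    return float(scores.mean()), float(scores.std())
```

A standard deviation of 0, as observed for C_1 and C_2 in Tables 12 and 13, means the score was identical at all eight settings.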
Next, we further analyze the clustering performance of the AmDPC, REDPC, and DPC algorithms with respect to the parameter p on different datasets. In Figs. 15, 16, 17, 18, 19, and 20, we give comparison graphs of the clustering evaluation metrics for different datasets with parameter values p = {1, 2, …, 8}. In these figures, we take the different values of p as the horizontal axis, keep the remaining parameters constant, and take the values of the NMI, ACC, ARI, and FM indicators as the vertical axes, respectively. The performance of the common parameter p of the AmDPC, REDPC, and DPC algorithms is thus further analyzed.
In Fig. 15, we give the performance of the DPC, REDPC, and AmDPC methods on the Pathbased dataset with different parameter values. In Fig. 17, the AmDPC algorithm always has higher clustering metric values than the DPC and REDPC algorithms across the values of p, achieves the best clustering effect, and is stable overall, indicating that the AmDPC algorithm has higher stability and effectiveness. In Fig. 18, for the Breast dataset, the AmDPC algorithm is superior to the DPC and REDPC algorithms overall and more stable in terms of the NMI, ACC, ARI, and FM evaluation metric values. In Fig. 19, the NMI and ARI values are low for the Pima dataset; however, the AmDPC algorithm is overall higher than the REDPC and DPC algorithms for the different parameter values. In Fig. 20, we can see that the AmDPC algorithm has better clustering performance than the REDPC and DPC algorithms for the four clustering metric values and tends to be stable overall for p in the range of one to eight. From these figures, we can see that the AmDPC algorithm has better clustering performance than the REDPC and DPC algorithms with respect to p on the different datasets, and its overall trend is more stable. In summary, the clustering performance of the AmDPC algorithm with respect to p is better than that of the DPC and REDPC methods, and it has better robustness and effectiveness.

Parameters setting of the AmDPC algorithm
In the previous subsection, we analyzed the sensitivity of the AmDPC algorithm parameters p, N_i, C_1, and C_2. Next, we analyze how to set these parameters. Both N_i and p are used to generate the decision graph, and N_i also controls the partitioning of the low-density hierarchies. By analyzing the clustering results under different parameters, we summarize how these four parameters should be set, especially for unknown data.
As mentioned before, p is an important parameter in the AmDPC, REDPC, and DPC algorithms. For the AmDPC algorithm, the parameter p determines the quality of the decision graph generation and the relationship between points of different densities. In Figs. 15, 16, 17, 18, 19, and 20, we show the clustering metric results on different datasets with different p values. It is obvious that the AmDPC algorithm obtains better and more stable clustering results for p in the range of one to eight. Therefore, for datasets with general structural features, we usually set 1 ≤ p ≤ 8. For datasets where the density is concentrated and high-density points cannot be clearly determined, we generally set 0 < p < 1. For datasets with complex structure and uneven density distribution, we set 8 < p ≤ 80.
For the other parameters N_i, C_1, and C_2, the settings depend more on the dataset, and different values need to be set for different datasets. The parameter N_i is used in the generation of the decision graph and in the processing of low-density data, and it has an important influence on whether some low-density points are merged or kept as a separate class. To choose an appropriate N_i value, the user needs to observe whether there are multiple consecutive denser low-density points in the obtained sub-clustering centroid decision graph and, if so, set a lower N_i value; in this case, we typically set 2 ≤ N_i ≤ 8. This places the threshold segmentation line A to the right of the low-density points in the decision graph and the threshold segmentation line B to the left of the high-density points (see Fig. 2c). If there are no continuous low-density points, a correspondingly larger N_i value can be chosen so that the threshold partition line B falls below the low-density points; in this case, for a dataset with n instances, we usually set 8 < N_i ≤ n.
The parameters C_1 and C_2 are used to obtain sub-clustering centroids, which facilitates distinguishing different density layers while obtaining high-quality low-density points. Reasonable values of C_1 and C_2 can divide the decision graph into multiple clear sub-clustering centroids along the x-axis. In practice, a smaller value of C_1 or C_2 can be set if no clear low-density points exist. Although C_1 and C_2 have an important influence on the generation of sub-clustering centroids and the clustering results, they are often easy to choose in practice. Assume that the maximum distance of the dataset in the decision graph is δ_max. For datasets with low-density points, we usually set 20%·δ_max ≤ C_1 ≤ 70%·δ_max. Since the number of low-density points in the C_2 region is relatively small, we generally set 10%·δ_max ≤ C_2 ≤ 80%·δ_max. For datasets that do not contain low-density points, we set C_1 > δ_max or C_2 > δ_max. For the specific values of C_1 and C_2, we can further determine their sizes by observing the minimum distance of the candidate low-density points in the decision graph.
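The rules of thumb above can be collected into a small helper. This is purely illustrative (the function and its flags are our own invention), turning the suggested ranges for p, C_1, and C_2 into starting search intervals given the maximum distance δ_max observed in the decision graph; the 0 < p < 1 regime for concentrated densities and the visual choice of N_i are omitted for brevity:

```python
def suggest_parameters(delta_max, has_low_density=True,
                       complex_structure=False):
    """Return starting ranges for p, C1, and C2 following the heuristics
    in the text; these are ranges to search, not definitive answers."""
    # 1 <= p <= 8 for general datasets, 8 < p <= 80 for complex ones
    p_range = (8, 80) if complex_structure else (1, 8)
    if has_low_density:
        c1_range = (0.20 * delta_max, 0.70 * delta_max)
        c2_range = (0.10 * delta_max, 0.80 * delta_max)
    else:
        # no low-density points: push both thresholds past delta_max
        c1_range = c2_range = (1.01 * delta_max, 1.01 * delta_max)
    return {"p": p_range, "C1": c1_range, "C2": c2_range}
```

For example, with δ_max = 10 in the decision graph of a dataset containing low-density points, the helper suggests searching C_1 in [2, 7] and C_2 in [1, 8].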
Compared with the DPC method, the AmDPC algorithm adds three parameters, but reasonable values yielding the best clustering results can be found for all three by observing the decision graph. Extensive experiments show that the performance of the AmDPC algorithm is much better than that of the DPC method. From the sensitivity tests, it can be seen that, for a given range of parameters, the clustering results of the AmDPC algorithm are more stable and better than those of the DPC algorithm. In Figs. 15, 16, 17, 18, 19, and 20, the parameter values of AmDPC are insensitive within a reasonable range. On the datasets used for the sensitivity tests, AmDPC performs better than the DPC and REDPC methods.
In general, we have given a comparative analysis of the evaluation metrics of the state-of-the-art methods based on the metric values in Tables 2 and 3, respectively. In Fig. 14, we can see that the evaluation metric values of the AmDPC algorithm on these datasets are significantly higher than those of the other algorithms. For the synthetic datasets, the AmDPC algorithm achieves the best clustering results compared with the other methods. Most of these synthetic datasets are non-convex, and the best clustering effect can be reached very efficiently by clustering the low-density points separately and then merging them.
In Figs. 6 and 7, we give the clustering results for high-density datasets without low-density regions. In particular, in Fig. 8, the best clustering results are obtained by the AP and AmDPC algorithms, which shows that the algorithm also performs very well on other types of datasets. On the real-world datasets and the Olivetti Face dataset, the AmDPC method is slightly better than the other methods, indicating that the improved algorithm has higher practical value (see Fig. 14 and Table 3). On the synthetic datasets, the NMI, ACC, ARI, and FM metric values of the AmDPC method are generally better than those of the other advanced methods, and on the real-world datasets, the clustering evaluation metric values of the AmDPC algorithm are generally higher than those of the other state-of-the-art algorithms. From this, we can see that the AmDPC method has higher clustering performance than the other advanced methods. Therefore, we conclude that the AmDPC method has certain effectiveness and robustness.

Conclusion
In overview, our proposed AmDPC method overcomes the shortcoming of the DPC method of manually selecting clustering centers and is very effective for clustering low-density regions with multiple peaks. We use the density deviation to measure the relationship between data points and the cut-off distance. The AmDPC algorithm has very good clustering performance for non-convex datasets with low-density peak points. By dividing the decision graph into density deviation levels, the algorithm provides a novel clustering model for clustering both low-density data points and data points in high-density regions. The simulation results indicate that the AmDPC method also clusters centroids in high-density regions very effectively.
In simulations on the real-world and Olivetti Face datasets, AmDPC has better clustering performance than the other methods. Therefore, the improved algorithm is robust and effective compared with the other algorithms.
However, the AmDPC method needs to control several parameters simultaneously to achieve better clustering effects, so its parameter selection process has a certain complexity. In the future, we will utilize a multi-objective optimization method to optimize these parameters so that the best clustering results can be obtained quickly.