Abstract
Clustering can usually be formulated as an optimization problem, and with tools from graph theory many such optimization problems can be transformed into minimum spanning tree problems. Minimum spanning trees are also widely used in areas closely related to cognitive computing, such as face recognition through face cognition and gene data analysis through gene cognition. However, the conventional minimum spanning tree relies on a simple distance between neighbours, so the algorithm cannot cluster unbalanced data; as a result, the face recognition rate is low and facial expression cognition is difficult. To address this shortcoming, this paper proposes a fuzzy distance-based minimum spanning tree clustering algorithm (FCP). First, a relative neighbourhood distance measure is defined by introducing neighbourhood rough set theory, and a neighbourhood matrix is built from this distance. Second, the minimum spanning tree is computed from the neighbourhood matrix with the Prim algorithm. Finally, the minimum spanning tree is partitioned to obtain the clustering. The UCI datasets and the Olivetti face database are used to verify the performance of the algorithm, which is evaluated under three criteria. The experimental results show that the proposed algorithm can not only cluster data of arbitrary shape but also handle unbalanced data containing noise points. In face cognitive computing in particular, the ACC, AMI, and ARI values reach 0.852, 0.843, and 0.782, respectively. The algorithm obtains very good clustering results for data with good geometric structure, and its overall performance is better than that of the compared algorithms. In face recognition, the improved cognitive computing of faces makes it possible to accurately recognize different expressions from the same person.
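The MST-based clustering pipeline described above (distance matrix → Prim's algorithm → tree partition) can be sketched as follows. This is a minimal illustration, not the paper's FCP method: the fuzzy relative neighbourhood distance from neighbourhood rough set theory is replaced here by plain Euclidean distance, and the tree is partitioned by the common heuristic of cutting the k−1 longest edges; the function names `prim_mst` and `mst_cluster` are illustrative.

```python
import numpy as np

def prim_mst(dist):
    """Prim's algorithm on a full pairwise-distance matrix.
    Returns the n-1 MST edges as (u, v, weight) tuples."""
    n = dist.shape[0]
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = dist[0].copy()            # cheapest known edge from the tree to each vertex
    parent = np.zeros(n, dtype=int)  # tree endpoint of that cheapest edge
    edges = []
    for _ in range(n - 1):
        v = int(np.argmin(np.where(in_tree, np.inf, best)))
        edges.append((int(parent[v]), v, float(best[v])))
        in_tree[v] = True
        closer = dist[v] < best      # update cheapest edges through the new vertex
        best[closer] = dist[v][closer]
        parent[closer] = v
    return edges

def mst_cluster(X, k):
    """Cluster X into k groups by cutting the k-1 longest MST edges."""
    # Euclidean stand-in for the paper's fuzzy neighbourhood distance
    d = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    edges = sorted(prim_mst(d), key=lambda e: e[2])
    keep = edges[:len(edges) - (k - 1)]      # drop the k-1 heaviest edges

    # connected components of the remaining forest via union-find
    parent = list(range(len(X)))
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]    # path halving
            a = parent[a]
        return a
    for u, v, _ in keep:
        parent[find(u)] = find(v)
    return np.array([find(i) for i in range(len(X))])
```

Cutting the longest edges is exactly the step that fails on unbalanced densities (a sparse cluster's internal edges can be longer than the gap between two dense clusters), which is the weakness the paper's fuzzy neighbourhood distance is designed to repair.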
Acknowledgements
This research is financially supported by The National Natural Science Foundation of China (61877065).
Ethics declarations
Ethics Approval
This article does not contain any experiments with human or animal participants performed by any of the authors.
Consent to Participate
Informed consent was obtained from all individual participants included in the study.
Conflict of Interest
The authors declare no competing interests.
Cite this article
Li, Y., Zhou, W. A Novel Fuzzy Distance-Based Minimum Spanning Tree Clustering Algorithm for Face Detection. Cogn Comput 14, 1350–1361 (2022). https://doi.org/10.1007/s12559-022-10002-w