Clustering high-dimensional data under the curse of dimensionality is an arduous task in many application domains. High dimensionality brings complexity-related challenges, while the limited number of records invites overfitting. We propose to tackle this problem with the graphical and probabilistic machinery of Bayesian networks. Our contribution is a new loose hierarchical Bayesian network model that encloses latent variables; these hidden variables are introduced to ensure a multi-view clustering of the records. We also propose a new framework for learning this model: it first extracts cliques of highly dependent features and then learns a representative latent variable for each feature clique. In a comparative analysis on benchmark high-dimensional datasets, the experimental results demonstrate the efficiency of our model in tackling the distance-concentration challenge and the effectiveness of the learning framework in avoiding the overfitting trap.
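The two-step learning framework summarized above (grouping highly dependent features, then learning one latent variable per group) can be sketched in a minimal form. This is an illustrative approximation, not the authors' algorithm: connected components of a thresholded correlation graph stand in for the cliques of dependent features, and quantile binning of each group's mean signal stands in for EM-learned latent states. The names `feature_groups`, `latent_views`, and the `threshold` parameter are hypothetical, introduced here for the sketch.

```python
import numpy as np

def feature_groups(X, threshold=0.7):
    """Group features whose absolute pairwise correlation exceeds `threshold`.

    Connected components of the thresholded correlation graph serve as a
    crude stand-in for the paper's cliques of highly dependent features.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    d = corr.shape[0]
    adj = (corr >= threshold) & ~np.eye(d, dtype=bool)
    groups, seen = [], set()
    for start in range(d):
        if start in seen:
            continue
        stack, component = [start], []
        while stack:                      # depth-first traversal of the graph
            v = stack.pop()
            if v in seen:
                continue
            seen.add(v)
            component.append(v)
            stack.extend(int(u) for u in np.flatnonzero(adj[v]))
        groups.append(sorted(component))
    return groups

def latent_views(X, groups, n_states=2):
    """Attach one discrete latent variable per feature group.

    Quantile binning of the group's mean signal is used here as a simple
    placeholder for latent states that would be learned by EM.
    """
    views = []
    for g in groups:
        signal = X[:, g].mean(axis=1)
        edges = np.quantile(signal, np.linspace(0, 1, n_states + 1)[1:-1])
        views.append(np.digitize(signal, edges))
    # One column per view: the multi-view cluster assignments of the records.
    return np.column_stack(views)
```

On data with two blocks of correlated features, `feature_groups` recovers the two blocks and `latent_views` returns a two-column matrix, one cluster label per view for every record.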
Njah, H., Jamoussi, S. & Mahdi, W. Breaking the curse of dimensionality: hierarchical Bayesian network model for multi-view clustering. Ann Math Artif Intell (2021). https://doi.org/10.1007/s10472-021-09749-z
Keywords:
- Hierarchical Bayesian network
- Multi-view clustering
- Latent model
- High-dimensional data