An automatic clustering for interval data using the genetic algorithm

Vovan, Tai; Phamtoan, Dinh; Tuan, Le Hoang; Nguyentrang, Thao

doi:10.1007/s10479-020-03606-8

S.I.: Data Mining and Decision Analytics
Published: 18 April 2020

Volume 303, pages 359–380, (2021)

Annals of Operations Research

Tai Vovan^1, Dinh Phamtoan^2,3,4, Le Hoang Tuan^3,5 & Thao Nguyentrang^2,3

Abstract

This paper proposes an Automatic Clustering algorithm for Interval data using the Genetic algorithm (ACIG). In this algorithm, the overlapped distance between intervals is used to determine a suitable number of clusters. Moreover, to optimize the clustering, we modify the Davies & Bouldin index and improve the crossover, mutation, and selection operators of the original genetic algorithm. The convergence of ACIG is proved theoretically and illustrated by numerical examples. ACIG can be implemented effectively with the established Matlab procedure. In experiments on data sets with different characteristics, the proposed algorithm shows outstanding advantages over the existing ones. An application to image recognition indicates the potential of this research in real applications.
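
To make the two ingredients named in the abstract more concrete, the following minimal Python sketch shows one possible overlap-based dissimilarity between intervals and the classical Davies & Bouldin (1979) index applied to clusters of intervals. It is an illustration under stated assumptions, not the ACIG procedure: the abstract does not give the paper's overlapped distance or its modified index, so `overlap_dissimilarity`, `endpoint_distance`, the endpoint-mean `centroid`, and the toy data are all hypothetical choices made here.

```python
from math import hypot
from typing import List, Tuple

Interval = Tuple[float, float]  # (lower, upper) with lower <= upper


def overlap_dissimilarity(a: Interval, b: Interval) -> float:
    """Assumed stand-in measure: 1 - overlap length / length of the joint span."""
    overlap = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    span = max(a[1], b[1]) - min(a[0], b[0])
    if span == 0.0:  # two identical degenerate (point) intervals
        return 0.0
    return 1.0 - overlap / span


def endpoint_distance(a: Interval, b: Interval) -> float:
    """Assumed distance: Euclidean distance between (lower, upper) pairs."""
    return hypot(a[0] - b[0], a[1] - b[1])


def centroid(cluster: List[Interval]) -> Interval:
    """Assumed representative: mean lower bound and mean upper bound."""
    n = len(cluster)
    return (sum(lo for lo, _ in cluster) / n, sum(hi for _, hi in cluster) / n)


def davies_bouldin(clusters: List[List[Interval]]) -> float:
    """Classical (unmodified) Davies & Bouldin index; lower is better."""
    k = len(clusters)
    if k < 2:
        raise ValueError("the index needs at least two clusters")
    centers = [centroid(c) for c in clusters]
    # Scatter: average distance of cluster members to their own centroid.
    scatter = [
        sum(endpoint_distance(x, m) for x in c) / len(c)
        for c, m in zip(clusters, centers)
    ]
    total = 0.0
    for i in range(k):
        # Worst-case similarity of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / endpoint_distance(centers[i], centers[j])
            for j in range(k)
            if j != i
        )
    return total / k


# Toy data (invented for illustration): two well-separated groups of intervals.
good = [[(0.0, 1.0), (0.5, 1.5)], [(10.0, 11.0), (10.5, 11.5)]]
bad = [[(0.0, 1.0), (10.0, 11.0)], [(0.5, 1.5), (10.5, 11.5)]]

print(overlap_dissimilarity((0.0, 2.0), (1.0, 3.0)))
print(davies_bouldin(good), davies_bouldin(bad))
```

On this toy data the partition that keeps nearby intervals together receives the smaller index value, which is the property a genetic search over partitions would exploit when such an index serves as the fitness function.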

Acknowledgements

The work of Le Hoang Tuan on this research is funded by Vietnam National University Ho Chi Minh City (VNU-HCM) under Grant Number C2018-26-05.

Author information

Authors and Affiliations

College of Natural Science, Can Tho University, Can Tho, Vietnam
Tai Vovan
University of Science, Ho Chi Minh City, Vietnam
Dinh Phamtoan & Thao Nguyentrang
Vietnam National University, Ho Chi Minh City, Vietnam
Dinh Phamtoan, Le Hoang Tuan & Thao Nguyentrang
Faculty of Engineering, VanLang University, Ho Chi Minh City, Vietnam
Dinh Phamtoan
Department of Mathematics, University of Information Technology, Ho Chi Minh City, Vietnam
Le Hoang Tuan

Corresponding author

Correspondence to Thao Nguyentrang.

About this article

Cite this article

Vovan, T., Phamtoan, D., Tuan, L.H. et al. An automatic clustering for interval data using the genetic algorithm. Ann Oper Res 303, 359–380 (2021). https://doi.org/10.1007/s10479-020-03606-8

Published: 18 April 2020
Issue Date: August 2021
DOI: https://doi.org/10.1007/s10479-020-03606-8
