Abstract
In this paper we propose a novel evolutive agent-based clustering algorithm in which agents act as individuals of an evolving population, each performing a random walk on a different subset of patterns drawn from the entire dataset. The agents are orchestrated by a customised genetic algorithm and perform clustering and feature selection simultaneously. Unlike standard clustering algorithms, each agent is in charge of discovering well-formed (compact and populated) clusters together with a suitable subset of features identifying the subspace where such clusters lie. This follows a local metric learning approach, in which each cluster is characterised by its own subset of relevant features. The approach may lead to a deeper knowledge of the dataset at hand, revealing clusters that are not evident when using the whole set of features, and it is also suitable for large datasets, since each agent processes only a small subset of patterns. We show the effectiveness of our algorithm on synthetic datasets and outline some interesting future work scenarios and extensions.
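As a purely illustrative aid, the following Python sketch shows one way the ideas summarised above could fit together: each agent carries a binary feature mask and a distance threshold, clusters a random subset of patterns with a BSAS-style pass in its own subspace, and is selected and mutated by a very simple evolutionary loop. It is a minimal sketch under stated assumptions, not the E-ABC implementation: all function names, the quality score, the mutation scheme and the default parameters are hypothetical.

```python
import random

def masked_distance(x, y, mask):
    """Euclidean distance computed only on the features selected by mask."""
    return sum(m * (a - b) ** 2 for a, b, m in zip(x, y, mask)) ** 0.5

def centroid(cluster):
    """Component-wise mean of the patterns in a cluster."""
    return [sum(col) / len(cluster) for col in zip(*cluster)]

def agent_clusters(data, mask, theta, sample_size, rng):
    """One agent: sample a subset of patterns and run a BSAS-style pass on it."""
    sample = rng.sample(data, min(sample_size, len(data)))
    clusters = []
    for x in sample:
        # Nearest existing cluster, measured in the agent's own subspace.
        nearest = min(clusters,
                      key=lambda c: masked_distance(x, centroid(c), mask),
                      default=None)
        if nearest is not None and masked_distance(x, centroid(nearest), mask) <= theta:
            nearest.append(x)
        else:
            clusters.append([x])  # too far from every cluster: start a new one
    return clusters

def cluster_quality(cluster, mask, n_sampled):
    """Hypothetical score in (0, 1]: populated and compact clusters score higher."""
    c = centroid(cluster)
    spread = sum(masked_distance(x, c, mask) for x in cluster) / len(cluster)
    return (len(cluster) / n_sampled) / (1.0 + spread)

def agent_fitness(data, agent, sample_size, rng):
    """Assumed fitness: quality of the best cluster the agent discovered."""
    mask, theta = agent
    clusters = agent_clusters(data, mask, theta, sample_size, rng)
    n_sampled = sum(len(c) for c in clusters)
    return max(cluster_quality(c, mask, n_sampled) for c in clusters)

def random_mask(n_features, rng):
    """Random binary mask with at least one active feature."""
    mask = [rng.choice((0, 1)) for _ in range(n_features)]
    if not any(mask):
        mask[rng.randrange(n_features)] = 1
    return tuple(mask)

def mutate(agent, rng):
    """Flip one feature bit and jitter the threshold; keep one feature active."""
    mask, theta = list(agent[0]), agent[1]
    i = rng.randrange(len(mask))
    mask[i] = 1 - mask[i]
    if not any(mask):
        mask[i] = 1
    return tuple(mask), max(1e-6, theta * rng.uniform(0.8, 1.25))

def evolve(data, n_features, pop_size=10, generations=20, sample_size=20, seed=0):
    """Toy evolutionary loop: keep the better half, refill with mutated copies."""
    rng = random.Random(seed)
    population = [(random_mask(n_features, rng), rng.uniform(0.5, 2.0))
                  for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(population,
                        key=lambda a: agent_fitness(data, a, sample_size, rng),
                        reverse=True)
        survivors = scored[:max(1, pop_size // 2)]
        offspring = [mutate(rng.choice(survivors), rng)
                     for _ in range(pop_size - len(survivors))]
        population = survivors + offspring
    return population[0]  # best agent of the last evaluated generation
```

Note that this toy score tends to favour smaller feature subsets, since masked distances shrink as features are switched off; a practical fitness function would need to account for this.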
Notes
1. In the sense that they must be able to deal with a heterogeneous genetic code such as (3.2).
2. This value is user-configurable and is defined as a percentage of the number of surviving agents.
3. This value now acts more as a cluster quality measure than as a fitness value, since we are evaluating a list of clusters rather than individuals from a genetic population.
4. We omit the penalty factor \(\beta \) (and, by extension, the reward factor \(\alpha \)) as they mainly drive RL-BSAS rather than describe the final clusters and/or agents.
Acknowledgements
The authors would like to thank Daniele Sartori for his help in implementing and testing E-ABC.
Copyright information
© 2019 Springer International Publishing AG, part of Springer Nature
About this chapter
Cite this chapter
Martino, A., Giampieri, M., Luzi, M., Rizzi, A. (2019). Data Mining by Evolving Agents for Clusters Discovery and Metric Learning. In: Esposito, A., Faundez-Zanuy, M., Morabito, F., Pasero, E. (eds) Neural Advances in Processing Nonlinear Dynamic Signals. WIRN 2017. Smart Innovation, Systems and Technologies, vol 102. Springer, Cham. https://doi.org/10.1007/978-3-319-95098-3_3
DOI: https://doi.org/10.1007/978-3-319-95098-3_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-95097-6
Online ISBN: 978-3-319-95098-3
eBook Packages: Intelligent Technologies and Robotics (R0)