Best Clustering Configuration Metrics: Towards Multiagent Based Clustering

  • Santhana Chaimontree
  • Katie Atkinson
  • Frans Coenen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6440)

Abstract

Multi-Agent Clustering (MAC) requires a mechanism for identifying the most appropriate cluster configuration. This paper reports on experiments conducted with a number of validation metrics to identify the metric that is most effective in this context. It also describes a process whereby such metrics can be used to determine the optimum parameters typically required by clustering algorithms, and a process for incorporating this into a MAC framework to generate the best cluster configurations with minimum input from end users.
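
The idea of using a validity metric to choose a clustering parameter can be illustrated with a minimal sketch. The abstract does not say which algorithm or metric is used at this step, so the choices below are assumptions made purely for illustration: k-means as the clustering algorithm, the number of clusters k as the parameter, and the mean silhouette coefficient as the validity metric. All function names (`farthest_first`, `kmeans`, `silhouette`, `select_k`) and the toy data are hypothetical and are not taken from the paper.

```python
import math

def farthest_first(points, k):
    """Deterministic seeding: start at the first point, then repeatedly
    add the point farthest from all centroids chosen so far."""
    chosen = [points[0]]
    while len(chosen) < k:
        chosen.append(max(points, key=lambda p: min(math.dist(p, c) for c in chosen)))
    return chosen

def kmeans(points, k, max_iters=100):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centroids = farthest_first(points, k)
    labels = []
    for _ in range(max_iters):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Move each centroid to the mean of its members (unchanged if empty).
        updated = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            updated.append(tuple(sum(xs) / len(members) for xs in zip(*members))
                           if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient over all points (higher is better)."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return 0.0  # undefined for a single cluster; treated as 0 in this sketch
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)

def select_k(points, candidates):
    """Cluster once per candidate k and keep the k with the best metric value."""
    scores = {k: silhouette(points, kmeans(points, k)) for k in candidates}
    return max(scores, key=scores.get), scores

# Toy data (hypothetical): three tight groups of four points each.
offsets = [(0, 0), (1, 0), (0, 1), (1, 1)]
points = [(cx + dx, cy + dy) for cx, cy in [(0, 0), (10, 0), (0, 10)] for dx, dy in offsets]
best_k, scores = select_k(points, range(2, 6))
```

Here `best_k` holds the candidate value of k whose clustering scored highest, and `scores` maps each candidate to its metric value. Swapping in a different validity metric would only require replacing `silhouette`, which is what makes this kind of selection loop easy to automate without end-user input.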

Keywords

Cluster Validity Metrics, Multi-Agent Clustering

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Santhana Chaimontree (1)
  • Katie Atkinson (1)
  • Frans Coenen (1)
  1. Department of Computer Science, University of Liverpool, UK
