Aggregate Distance Based Clustering Using Fibonacci Series-FIBCLUS

  • Conference paper
Web Technologies and Applications (APWeb 2011)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6612)

Abstract

This paper proposes an innovative instance-similarity-based evaluation metric that reduces the search map for clustering to be performed. An aggregate global score is calculated for each instance using the novel idea of the Fibonacci series. The use of Fibonacci numbers separates the instances effectively; hence, the intra-cluster similarity is increased and the inter-cluster similarity is decreased during clustering. The proposed FIBCLUS algorithm is able to handle datasets with numerical attributes, categorical attributes, or a mix of both. Results obtained with FIBCLUS are compared with the results of existing algorithms such as k-means, x-means, expectation maximization and hierarchical algorithms, which are widely used to cluster numeric, categorical and mixed data types. Empirical analysis shows that FIBCLUS produces better clustering solutions in terms of entropy, purity and F-score than these existing algorithms.
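
The abstract reports results in terms of entropy and purity. As an illustration only, the sketch below computes the standard textbook forms of these two measures from cluster assignments and ground-truth class labels. It is not taken from the paper: the function names `purity` and `entropy` are hypothetical, the entropy is left unnormalised (some papers divide by the logarithm of the number of classes), and the exact variants used by the authors may differ. F-score is omitted for brevity.

```python
from collections import Counter
from math import log2


def _class_counts(clusters, labels):
    """Group ground-truth labels by cluster id (hypothetical helper)."""
    by_cluster = {}
    for c, y in zip(clusters, labels):
        by_cluster.setdefault(c, Counter())[y] += 1
    return by_cluster


def purity(clusters, labels):
    """Fraction of instances that belong to the majority class of their cluster."""
    counts = _class_counts(clusters, labels)
    return sum(max(cnt.values()) for cnt in counts.values()) / len(labels)


def entropy(clusters, labels):
    """Size-weighted average of per-cluster class entropy, in bits (unnormalised)."""
    n = len(labels)
    total = 0.0
    for cnt in _class_counts(clusters, labels).values():
        size = sum(cnt.values())
        h = -sum((k / size) * log2(k / size) for k in cnt.values())
        total += (size / n) * h
    return total


if __name__ == "__main__":
    # Toy data, invented for illustration: two clusters of three instances each.
    clusters = [0, 0, 0, 1, 1, 1]
    labels = ["a", "a", "b", "b", "b", "b"]
    print("purity :", purity(clusters, labels))
    print("entropy:", entropy(clusters, labels))
```

Higher purity and lower entropy both indicate clusters that align more closely with the known classes, which is the sense in which the abstract calls a clustering solution "better".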

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rawat, R., Nayak, R., Li, Y., Alsaleh, S. (2011). Aggregate Distance Based Clustering Using Fibonacci Series-FIBCLUS. In: Du, X., Fan, W., Wang, J., Peng, Z., Sharaf, M.A. (eds) Web Technologies and Applications. APWeb 2011. Lecture Notes in Computer Science, vol 6612. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20291-9_6

  • DOI: https://doi.org/10.1007/978-3-642-20291-9_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20290-2

  • Online ISBN: 978-3-642-20291-9

  • eBook Packages: Computer Science, Computer Science (R0)
