A Large-Scale Data Clustering Algorithm Based on BIRCH and Artificial Immune Network

  • Conference paper
Advances in Swarm Intelligence (ICSI 2018)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10941)

Abstract

This paper describes a large-scale data clustering algorithm that combines the Balanced Iterative Reducing and Clustering using Hierarchies algorithm (BIRCH) with the Artificial Immune Network clustering algorithm (aiNet). Compared with traditional clustering algorithms, aiNet adapts better to non-convex datasets and does not require the number of clusters to be specified in advance. However, it is not suitable for large-scale datasets because it needs a long time to evolve. In addition, the aiNet model is very sensitive to noise, which greatly restricts its application. BIRCH, by contrast, processes large-scale datasets efficiently but, like traditional clustering algorithms, cannot handle non-convex datasets and requires the number of clusters as input. Combining the two methods yields a new large-scale data clustering algorithm that inherits the advantages of both BIRCH and aiNet while overcoming their respective disadvantages.
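
The abstract does not say how the two methods are combined. As background only, the sketch below illustrates why BIRCH copes with large datasets: it replaces raw points with compact clustering features (point count, linear sum, sum of squares) that can be merged in constant time. This follows the standard description of BIRCH and is not the authors' combined algorithm. The class `ClusteringFeature`, the function `summarize`, and the flat, tree-free single pass are illustrative assumptions, and the aiNet stage is omitted entirely.

```python
from dataclasses import dataclass
from math import dist, inf, sqrt
from typing import List, Sequence


@dataclass
class ClusteringFeature:
    """BIRCH-style summary of a point set: count, linear sum, sum of squares."""
    n: int
    linear_sum: List[float]
    square_sum: float

    @classmethod
    def from_point(cls, point: Sequence[float]) -> "ClusteringFeature":
        return cls(1, list(point), sum(x * x for x in point))

    def merge(self, other: "ClusteringFeature") -> "ClusteringFeature":
        # Clustering features are additive, so merging needs no raw points.
        return ClusteringFeature(
            self.n + other.n,
            [a + b for a, b in zip(self.linear_sum, other.linear_sum)],
            self.square_sum + other.square_sum,
        )

    def centroid(self) -> List[float]:
        return [s / self.n for s in self.linear_sum]

    def radius(self) -> float:
        # Root-mean-square distance of the summarized points to the centroid:
        # sqrt(SS / N - ||LS / N||^2), clamped at zero against rounding error.
        c = self.centroid()
        variance = self.square_sum / self.n - sum(x * x for x in c)
        return sqrt(max(variance, 0.0))


def summarize(points: Sequence[Sequence[float]],
              threshold: float) -> List[ClusteringFeature]:
    """Single pass over the data using a flat list instead of a CF tree
    (a simplifying assumption): each point is absorbed by the nearest summary
    if the merged radius stays within `threshold`, else it starts a new one."""
    summaries: List[ClusteringFeature] = []
    for p in points:
        cf = ClusteringFeature.from_point(p)
        best, best_d = None, inf
        for i, s in enumerate(summaries):
            d = dist(p, s.centroid())
            if d < best_d:
                best, best_d = i, d
        if best is not None:
            merged = summaries[best].merge(cf)
            if merged.radius() <= threshold:
                summaries[best] = merged
                continue
        summaries.append(cf)
    return summaries


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
    for s in summarize(data, threshold=1.0):
        print(s.n, s.centroid(), round(s.radius(), 3))
```

In a two-stage design of the kind the abstract suggests, a slower clustering method could then run on the centroids of these summaries rather than on every raw point. Whether the paper does exactly this is not stated in the abstract.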

Notes

  1. http://www.ics.uci.edu/~mlearn/MLRepository.html

Acknowledgment

This work was supported by the National Natural Science Foundation of China under Grant 61772399, Grant U170126, Grant 61773304, Grant 61672405, and Grant 61772400; the Program for Cheung Kong Scholars and Innovative Research Team in University under Grant IRT_15R53; the Fund for Foreign Scholars in University Research and Teaching Programs (the 111 Project) under Grant B07048; and the Major Research Plan of the National Natural Science Foundation of China under Grant 91438201.

Author information

Correspondence to Yangyang Li.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Li, Y., Liu, G., Li, P., Jiao, L. (2018). A Large-Scale Data Clustering Algorithm Based on BIRCH and Artificial Immune Network. In: Tan, Y., Shi, Y., Tang, Q. (eds) Advances in Swarm Intelligence. ICSI 2018. Lecture Notes in Computer Science, vol 10941. Springer, Cham. https://doi.org/10.1007/978-3-319-93815-8_32

  • DOI: https://doi.org/10.1007/978-3-319-93815-8_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-93814-1

  • Online ISBN: 978-3-319-93815-8

  • eBook Packages: Computer Science, Computer Science (R0)
