AIDCOR: artificial immunity inspired density based clustering with outlier removal

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

AIDCOR is an artificial-immunity-inspired, density-based clustering algorithm that identifies crisp clusters with a high degree of accuracy, even when the input dataset contains clusters of varied shape, varied density distributions, low inter-cluster separation, and noise or outliers. The algorithm works in two phases: a data preprocessing module and a clustering module. The preprocessing phase is inspired by artificial immune systems and uses a novel approach of somatic hypermutation and affinity maturation with selective antigenic binding to reduce data redundancy while preserving the original data patterns. The clustering phase follows a density-based approach that forms clusters from the compressed dataset and, in doing so, inherently identifies outliers. We have thoroughly analyzed both the theoretical aspects and the experimental results of the proposed algorithm on a wide variety of real and synthetic datasets. The results of AIDCOR are compared with those of several current state-of-the-art algorithms, and AIDCOR gives much higher clustering accuracy for nearly all types of dataset. The time complexity of AIDCOR is sub-quadratic when an indexing data structure is used for nearest-neighbor search and quadratic otherwise. AIDCOR needs three user-defined parameters for its operation, and a heuristic method is proposed to determine them automatically.
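
The two-phase structure described in the abstract (compress the data, then cluster the compressed set by density and flag outliers) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the immune-inspired preprocessing is replaced here by a simple greedy radius-based prototype selection, and the clustering phase is a plain DBSCAN-style procedure. The function names `compress` and `density_cluster` and the parameters `radius`, `eps` and `min_pts` are assumptions chosen for illustration and are not necessarily the three parameters the paper uses. Neighbor search is brute force, so this sketch falls in the quadratic case.

```python
import math

def compress(points, radius):
    """Stand-in for the preprocessing phase (NOT the immune-inspired method):
    keep a point as a prototype only if no kept prototype lies within radius."""
    prototypes = []
    for p in points:
        if all(math.dist(p, q) > radius for q in prototypes):
            prototypes.append(p)
    return prototypes

def density_cluster(points, eps, min_pts):
    """DBSCAN-style clustering with brute-force neighbor search.
    Returns one label per point; -1 marks an outlier."""
    n = len(points)
    # Each neighborhood includes the point itself (distance 0).
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1          # provisional outlier, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        frontier = list(neighbors[i])
        while frontier:
            j = frontier.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former outlier reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                frontier.extend(neighbors[j])  # j is a core point, keep expanding
    return labels

# Toy data (hypothetical): two dense 5x5 grids and one isolated point.
grid = [(0.1 * x, 0.1 * y) for x in range(5) for y in range(5)]
points = grid + [(5 + x, 5 + y) for x, y in grid] + [(10.0, 0.0)]

prototypes = compress(points, radius=0.15)
labels = density_cluster(prototypes, eps=0.3, min_pts=3)
print(len(points), "points compressed to", len(prototypes), "prototypes")
print(labels)
```

On this toy input the compression step keeps far fewer points than it receives, and the isolated point ends up with label -1 while each grid forms its own cluster. Replacing the brute-force neighbor lists with a spatial index is the kind of change the abstract refers to when it mentions sub-quadratic time.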

Author information

Corresponding author

Correspondence to Swarna Kamal Paul.

About this article

Cite this article

Paul, S.K., Bhaumik, P. AIDCOR: artificial immunity inspired density based clustering with outlier removal. Int. J. Mach. Learn. & Cyber. 9, 309–334 (2018). https://doi.org/10.1007/s13042-016-0499-x

  • DOI: https://doi.org/10.1007/s13042-016-0499-x
