An Adaptive Density Data Stream Clustering Algorithm

Cognitive Computation

Abstract

We are now in the age of big data, in which huge amounts of data and information are generated continuously. Traditional data stream algorithms are suited to data streams with low dimensionality and simple structure. However, as information technology develops, the data streams being produced are becoming increasingly complex. It is therefore particularly important to study how to find new associations and patterns in complex data, in order to achieve cognition and judgment abilities like those of the human brain. Clustering data streams with mixed attributes and irregular distributions is a major challenge in data mining. To address this problem, we present an adaptive density data stream clustering algorithm, ADStream. ADStream is based on the online-offline clustering framework. It automatically recognizes the initial clusters by passing messages between data points. A novel time-decay density clustering strategy is then designed to group and update the continuously arriving data. Comprehensive experimental results demonstrate that ADStream adapts to evolving data streams and can generate high-quality clusters at a fast processing rate.
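To make the idea of a time-decay density concrete, the following is a minimal Python sketch of how an online phase might maintain decayed micro-cluster weights. It is an illustration under stated assumptions, not ADStream itself: the abstract does not give the decay function, so the sketch borrows the exponential fading function 2^(-λ·Δt) commonly used in density-based stream clustering. The names `MicroCluster`, `process`, `radius`, and `lam` are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class MicroCluster:
    """Hypothetical summary of nearby points with a time-decayed weight."""
    center_sum: list    # decayed per-dimension sum of absorbed points
    weight: float       # decayed count of absorbed points (the "density")
    last_update: float  # timestamp of the most recent update

    def decay(self, now, lam):
        # Assumed fading function 2^(-lam * dt); not taken from the paper.
        factor = 2.0 ** (-lam * (now - self.last_update))
        self.center_sum = [s * factor for s in self.center_sum]
        self.weight *= factor
        self.last_update = now

    def absorb(self, point, now, lam):
        # Fade the old contribution first, then add the new point with weight 1.
        self.decay(now, lam)
        self.center_sum = [s + x for s, x in zip(self.center_sum, point)]
        self.weight += 1.0

    def center(self):
        # The ratio is unaffected by decay, since sum and weight fade together.
        return [s / self.weight for s in self.center_sum]


def process(stream, radius, lam):
    """Online phase: assign each (timestamp, point) to the nearest
    micro-cluster within `radius`, or open a new micro-cluster."""
    clusters = []
    for t, point in stream:
        best, best_dist = None, float("inf")
        for mc in clusters:
            d = math.dist(mc.center(), point)
            if d < best_dist:
                best, best_dist = mc, d
        if best is not None and best_dist <= radius:
            best.absorb(point, t, lam)
        else:
            clusters.append(MicroCluster(list(point), 1.0, t))
    return clusters
```

In a sketch like this, an offline phase would periodically discard micro-clusters whose decayed weight falls below a threshold and merge the remaining dense ones into final clusters. How ADStream sets such thresholds, and how it obtains its initial clusters through message passing, is described in the full article rather than in the abstract.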



Acknowledgments

This work is supported by the National Natural Science Foundation of China (No. 61379101), and the National Key Basic Research Program of China (No. 2013CB329502).

Author information

Corresponding author

Correspondence to Shifei Ding.


Cite this article

Ding, S., Zhang, J., Jia, H. et al. An Adaptive Density Data Stream Clustering Algorithm. Cogn Comput 8, 30–38 (2016). https://doi.org/10.1007/s12559-015-9342-z


  • DOI: https://doi.org/10.1007/s12559-015-9342-z
