Performance Evaluation of Data Stream Mining Algorithm with Shared Density Graph for Micro and Macro Clustering

  • S. Gopinathan
  • L. Ramesh
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 941)


We address the problem of micro clustering by integrating data stream mining algorithms. Streaming data are potentially infinite sequences of data that arrive at very high speed and may evolve over time. These properties pose several challenges for mining large-scale, high-speed data streams in real time. This paper discusses the challenges associated with mining data streams. Several data stream mining algorithms, covering accuracy and micro clustering, are described along with their key features and significance. The performance evaluation of micro and macro clustering of streaming data with a shared density graph is also explained, and the comparative significance of the approaches is discussed. The paper surveys the streaming data computation platforms that have been developed and discusses them chronologically along with their major capabilities. The performance analysis applies different radius activation functions and various numbers of radii to a data stream mining algorithm with a shared density graph for micro clustering and macro clustering.
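The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea behind a shared density graph, under simplifying assumptions that are not taken from the paper: micro-clusters have a fixed radius and a fixed center, there is no decay or cleanup, and the names `cluster_stream`, `radius` and `min_shared` are hypothetical. Each arriving point is absorbed by every micro-cluster within `radius`; a point that falls inside two micro-clusters at once adds weight to the edge between them, and macro clusters are the connected components of the resulting graph.

```python
from collections import defaultdict
from math import dist


def cluster_stream(points, radius, min_shared=1):
    """Illustrative micro/macro clustering with a shared density graph.

    Assumptions (not from the paper): fixed-radius micro-clusters whose
    center is the first point that opened them, and no fading of weights.
    """
    centers = []                 # one center per micro-cluster
    weights = []                 # points absorbed by each micro-cluster
    shared = defaultdict(int)    # (i, j) -> points lying inside both i and j

    for p in points:
        # Micro-clusters whose radius covers the new point.
        hits = [i for i, c in enumerate(centers) if dist(p, c) <= radius]
        if not hits:
            # No micro-cluster is close enough: open a new one.
            centers.append(tuple(p))
            weights.append(1)
            continue
        for i in hits:
            weights[i] += 1
        # A point covered by several micro-clusters is shared density.
        for a in range(len(hits)):
            for b in range(a + 1, len(hits)):
                shared[(hits[a], hits[b])] += 1

    # Macro clusters: connected components over sufficiently heavy edges,
    # computed with a small union-find.
    parent = list(range(len(centers)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (i, j), count in shared.items():
        if count >= min_shared:
            parent[find(i)] = find(j)

    macro = [find(i) for i in range(len(centers))]
    return centers, weights, dict(shared), macro
```

Varying `radius` in such a sketch changes how many micro-clusters are opened and how often they overlap, which is the kind of parameter the performance analysis above varies.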


Stream mining, Micro clustering, Shared density graph, Macro clustering



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Department of Computer Science, University of Madras, Chennai, India
