Advertisement

Knowledge and Information Systems

, Volume 45, Issue 3, pp 535–569 | Cite as

A survey on data stream clustering and classification

  • Hai-Long NguyenEmail author
  • Yew-Kwong Woon
  • Wee-Keong Ng
Survey Paper

Abstract

Nowadays, with the advance of technology, many applications generate huge amounts of data streams at very high speed. Examples include network traffic, web click streams, video surveillance, and sensor networks. Data stream mining has become a hot research topic. Its goal is to extract hidden knowledge/patterns from continuous data streams. Unlike traditional data mining where the dataset is static and can be repeatedly read many times, data stream mining algorithms face many challenges and have to satisfy constraints such as bounded memory, single-pass, real-time response, and concept-drift detection. This paper presents a comprehensive survey of the state-of-the-art data stream mining algorithms with a focus on clustering and classification because of their ubiquitous usage. It identifies mining constraints, proposes a general model for data stream mining, and depicts the relationship between traditional data mining and data stream mining. Furthermore, it analyzes the advantages as well as limitations of data stream algorithms and suggests potential areas for future research.

Keywords

Data stream mining Clustering Classification Survey 

Notes

Acknowledgments

This work is partially supported by the Wattalyzer grant project of I2R, Singapore with Grant reference: NRF2012EWT-EIRP002-044.

References

  1. 1.
    Adomavicius G, Bockstedt J (2008) C-trend: temporal cluster graphs for identifying and visualizing trends in multiattribute transactional data. IEEE Trans Knowl Data Eng 20:721–735CrossRefGoogle Scholar
  2. 2.
    Aggarwal CC (2009) Data streams: an overview and scientific applications. Springer, Berlin, pp 377–397Google Scholar
  3. 3.
    Aggarwal CC, Han J, Wang J, Yu PS (2003) A framework for clustering evolving data streams. In: Proceedings of the 29th international conference on very large data bases, vol 29, pp 81–92. VLDB EndowmentGoogle Scholar
  4. 4.
    Aggarwal CC, Han J, Wang J, Yu PS (2004) A framework for projected clustering of high dimensional data streams. In: Proceedings of the 13th international conference on very large data bases, vol 30, pp 852–863, 1316763. VLDB EndowmentGoogle Scholar
  5. 5.
    Aggarwal CC, Han J, Wang J, Yu PS (2006) A framework for on-demand classification of evolving data streams. IEEE Trans Knowl Data Eng 18(5):577–589CrossRefGoogle Scholar
  6. 6.
    Aggarwal CC, Philip SY (2010) On clustering massive text and categorical data streams. Knowl Inf Syst 24(2):171–196CrossRefGoogle Scholar
  7. 7.
    Aggarwal CC, Subbian K (2012) Event detection in social streams. In: Proceedings of the SIAM international conference on data mining, pp 624–635Google Scholar
  8. 8.
    Aggarwal CC, Yu P (2005) Online analysis of community evolution in data streams. In: Proceedings of the SIAM international conference on data mining, pp 56–67Google Scholar
  9. 9.
    Aksoy C, Can F, Kocberber S (2012) Novelty detection for topic tracking. J Am Soc Inf Sci Technol 63(4):777–795CrossRefGoogle Scholar
  10. 10.
    Alahakoon D, Halgamuge SK, Srinivasan B (2000) Dynamic self-organizing maps with controlled growth for knowledge discovery. IEEE Trans Neural Netw 11(3):601–614CrossRefGoogle Scholar
  11. 11.
    Alexander H, Er H, Daniel AK (1998) An efficient approach to clustering in large multimedia databases with noise. In: Proceeding of the 1998 international conference knowledge discovery and data mining, pp 58–65Google Scholar
  12. 12.
    Allan J, Papka R, Lavrenko V (1998) On-line new event detection and tracking. In: Proceedings of the 21st annual international ACM SIGIR conference on research and development in information retrieval, pp 37–45. ACMGoogle Scholar
  13. 13.
    Amigo E, Gonzalo J, Artiles J (2009) A comparison of extrinsic clustering evaluation metrics based on formal constraints. Inf Retr 12(4):461–486CrossRefGoogle Scholar
  14. 14.
    Ankerst M, Breunig MM, Kriegel H-P, Sander J (1999) Optics: ordering points to identify the clustering structure. In: Proceedings of the 1999 ACM SIGMOD international conference on management of data, vol 28, pp 49–60, 304187. ACMGoogle Scholar
  15. 15.
    Beckmann N, Kriegel H-P, Schneider R, Seeger B (1990) The R*-tree: an efficient and robust access method for points and rectangles, vol 19. ACMGoogle Scholar
  16. 16.
    Bifet A (2010) Adaptive stream mining: pattern learning and mining from evolving data streams. In: Proceeding of the 2010 conference on adaptive stream mining: pattern learning and mining from evolving data streams. IOS PressGoogle Scholar
  17. 17.
    Bifet A, Holmes G, Kirkby R (2012) Moa: massive online analysis. J Mach Learn Res 11:1601–1604Google Scholar
  18. 18.
    Bohm C, Kailing K, Kriegel H-P, Kroger P (2004) Density connected clustering with local subspace preferences. In: Proceedings of the 4th IEEE international conference on data mining, pp 27–34, 1033433. IEEE Computer SocietyGoogle Scholar
  19. 19.
    Can F (1993) Incremental clustering for dynamic information processing. ACM Trans Inf Syst (TOIS) 11(2):143–164MathSciNetCrossRefGoogle Scholar
  20. 20.
    Can F, Kocberber S, Baglioglu O, Kardas S, Ocalan HC, Uyar E (2010) New event detection and topic tracking in turkish. J Am Soc Inf Sci Technol 61(4):802–819Google Scholar
  21. 21.
    Can F, Ozkarahan EA (1990) Concepts and effectiveness of the cover-coefficient-based clustering methodology for text databases. ACM Trans Database Syst (TODS) 15(4):483–517CrossRefGoogle Scholar
  22. 22.
    Cao F, Ester M, Qian W, Zhou A (2006) Density-based clustering over an evolving data stream with noise. In: Proceedings of the 2006 SIAM international conference on data mining, pp 328–339Google Scholar
  23. 23.
    Chakrabarti D, Kumar R, Tomkins A (2006) Evolutionary clustering. In: Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining, pp 554–560. ACMGoogle Scholar
  24. 24.
    Chen Y, Tu L (2007) Density-based clustering for real-time stream data. In: Proceedings of the 13th ACM SIGKDD international conference on knowledge discovery and data mining, pp 133–142, 1281210. ACMGoogle Scholar
  25. 25.
    Chi Y, Song X, Zhou D, Hino K, Tseng B (2007) Evolutionary spectral clustering by incorporating temporal smoothness. In: Proceedings of the 13th ACM SIGKDD international conference on knowledge discovery and data mining, pp 153–162. ACMGoogle Scholar
  26. 26.
    Chi Y, Song X, Zhou D, Hino K, Tseng B (2009) On evolutionary spectral clustering. ACM Trans Knowl Discov Data 3(4):1–30CrossRefGoogle Scholar
  27. 27.
    Chien S, Immorlica N (2005) Semantic similarity between search engine queries using temporal correlation. In: Proceedings of the 14th international conference on World Wide Web, pp 2–11, 1060752. ACMGoogle Scholar
  28. 28.
    Chow TWS, Sitao W (2004) An online cellular probabilistic self-organizing map for static and dynamic data sets. IEEE Trans Circuits Syst I Regul Pap 51(4):732–747CrossRefGoogle Scholar
  29. 29.
    Chris G, Jiaweihan Y, Jianpei Z, Xifeng Yan Y, Philip SY (2003) Mining frequent patterns in data streams at multiple time granularities. Next Gener Data Min 212:191–212Google Scholar
  30. 30.
    Ciaccia P, Patella M, Zezula P (1997) M-tree: An efficient access method for similarity search in metric spaces. In: Proceedings of the 23rd international conference on very large data bases, pP 426–435, 671005. Morgan Kaufmann Publishers IncGoogle Scholar
  31. 31.
    Cui W, Liu S, Tan L, Shi C, Song Y, Gao Z, Qu H, Tong X (2011) Textflow: Towards better understanding of evolving topics in text. IEEE Trans Vis Comput Graph 17(12):2412–2421CrossRefGoogle Scholar
  32. 32.
    Dang X, Lee V, Ng W, Ciptadi A, Ong K (2009) An EM-based algorithm for clustering data streams in sliding windows, vol 5463 of Lecture notes in computer science, pp 230–235. Springer, BerlinGoogle Scholar
  33. 33.
    de Leeuw J (2005) Applications of convex analysis to multidimensional scaling. North Holland Publishing Company, Amsterdam, pp 133–146Google Scholar
  34. 34.
    Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the em algorithm. J R Stat Soc Ser B (Methodol) 39(1):1–38MathSciNetzbMATHGoogle Scholar
  35. 35.
    Domingos P, Hulten G (2000) Mining high-speed data streams. In: Proceedings of the sixth ACM SIGKDD international conference on knowledge discovery and data mining, pp 71–80, 347107. ACMGoogle Scholar
  36. 36.
    Dork M, Gruen D, Williamson C, Carpendale S (2010) A visual backchannel for large-scale events. IEEE Trans Vis Comput Graph 16(6):1129–1138CrossRefGoogle Scholar
  37. 37.
    Fung BC, Wang K, Yu PS (2005) Top-down specialization for information and privacy preservation. In: 21st international conference on data engineering, 2005. ICDE 2005. Proceedings, pp 205–216. IEEEGoogle Scholar
  38. 38.
    Gaber MM, Zaslavsky A, Krishnaswamy S (2005) Mining data streams: a review. ACM Sigmod Rec 34(2):18–26CrossRefGoogle Scholar
  39. 39.
    Gama J (2013) Data stream mining: the bounded rationality. Informatica (Slovenia) 37(1):21–25Google Scholar
  40. 40.
    Gama J, Gaber MM (2007) Learning from data streams—processing techniques in sensor networks. Springer, BerlinzbMATHGoogle Scholar
  41. 41.
    Gama J, Pinto C (2006) Discretization from data streams: applications to histograms and data mining. In: Proceedings of the 2006 ACM symposium on applied computing, pp 662–667. ACMGoogle Scholar
  42. 42.
    Gama J, Rodrigues P (2009) An overview on mining data streams, vol 206 of Studies in computational intelligence, pp 29–45. Springer, BerlinGoogle Scholar
  43. 43.
    Gama J, Sebastião R, Rodrigues P (2013) On evaluating stream learning algorithms. Mach Learn 90(3):317–346MathSciNetCrossRefzbMATHGoogle Scholar
  44. 44.
    Guha S, Kim C, Shim K (2004) Xwave: optimal and approximate extended wavelets. In: Proceedings of the thirtieth international conference on very large data bases, vol 30, pp 288–299, 1316716. VLDB EndowmentGoogle Scholar
  45. 45.
    Guha S, Meyerson A, Mishra N, Motwani R, O’Callaghan L (2003) Clustering data streams: theory and practice. IEEE Trans Knowl Data Eng 15:515–528CrossRefGoogle Scholar
  46. 46.
    Guha S, Mishra N, Motwani R, O’Callaghan L (2000) Clustering data streams. In: Proceedings of the 41st annual symposium on foundations of computer science, pp 359–366Google Scholar
  47. 47.
    Guha S, Rastogi R, Shim K (1998) Cure: an efficient clustering algorithm for large databases. In: Proceedings of the 1998 ACM SIGMOD international conference on management of data, pp 73–84, 276312. ACMGoogle Scholar
  48. 48.
    Guha S, Rastogi R, Shim K (1999) Rock: a robust clustering algorithm for categorical attributes. In: Proceedings of the 15th international conference on data engineering, p 512, 847264. IEEE Computer SocietyGoogle Scholar
  49. 49.
    Guttman A (1984) R-trees: a dynamic index structure for spatial searching. In: Proceedings of the 1984 ACM SIGMOD international conference on management of data, pp 47–57Google Scholar
  50. 50.
    Hahsler M, Dunham M (2011) Temporal structure learning for clustering massive data streams in real-time. In: SIAM conference on data mining, pp 664–675. SIAMGoogle Scholar
  51. 51.
    Han J (2005) Data mining: concepts and techniques. Morgan Kaufmann Publishers Inc., San FranciscoGoogle Scholar
  52. 52.
    He Q, Chang K, Lim E-P, Zhang J (2007) Bursty feature representation for clustering text streams. In: SIAM international conference on data miningGoogle Scholar
  53. 53.
    Hulten G, Spencer L, Domingos P (2001) Mining time-changing data streams. In: Proceedings of the seventh ACM SIGKDD international conference on knowledge discovery and data mining, pp 97–106, 502529. ACMGoogle Scholar
  54. 54.
    Jain AK, Murty MN, Flynn PJ (1999) Data clustering: a review. ACM Comput Surv (CSUR) 31(3):264–323CrossRefGoogle Scholar
  55. 55.
    Karypis G, Eui-Hong H, Kumar V (1999) Chameleon: hierarchical clustering using dynamic modeling. Computer 32(8):68–75CrossRefGoogle Scholar
  56. 56.
    Kaufman L, Rousseeuw P (1990) Finding groups in data: an introduction to cluster analysis, vol 39. Wiley, New YorkCrossRefGoogle Scholar
  57. 57.
    Kelly MG, Hand DJ, Adams NM (1999) The impact of changing populations on classifier performance. In: Proceedings of the fifth ACM SIGKDD international conference on knowledge discovery and data mining, pp 367–371, 312285. ACMGoogle Scholar
  58. 58.
    Kleinberg J (2003) Bursty and hierarchical structure in streams. Data Min Knowl Discov 7(4):373–397MathSciNetCrossRefGoogle Scholar
  59. 59.
    Kohonen T (1990) The self-organizing map. Proc IEEE 78(9):1464–1480CrossRefGoogle Scholar
  60. 60.
    Kranen P, Assent I, Baldauf C, Seidl T (2011) The clustree: indexing micro-clusters for anytime stream mining. Knowl Inf Syst 29(2):249–272CrossRefGoogle Scholar
  61. 61.
    Kranen P, Günnemann S, Fries S, Seidl T (2010) MC-tree: improving Bayesian anytime classification, vol 6187 of Lecture notes in computer science, pp 252–269. Springer, BerlinGoogle Scholar
  62. 62.
    Kremer H, Kranen P, Jansen T, Seidl T, Bifet A, Holmes G, Pfahringer B (2011) An effective evaluation measure for clustering on evolving data streams. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining, pp 868–876, 2020555. ACMGoogle Scholar
  63. 63.
    Kriegel H-P, Kröger P, Ntoutsi I, Zimek A (2011) Density based subspace clustering over dynamic data, vol 6809 of Lecture notes in computer science, pp 387–404. Springer, BerlinGoogle Scholar
  64. 64.
    Lam W, Meng H, Wong K, Yen J (2001) Using contextual analysis for news event detection. Int J Intell Syst 16(4):525–546CrossRefzbMATHGoogle Scholar
  65. 65.
    Last M (2002) Online classification of nonstationary data streams. Intell Data Anal 6(2):129–147MathSciNetzbMATHGoogle Scholar
  66. 66.
    Lei Y, Huan L (2003) Feature selection for high-dimensional data: a fast correlation-based filter solution. In: The 20th international conference on machine learning, pp 856–863Google Scholar
  67. 67.
    Leite D, Costa P, Gomide F (2010) Evolving granular neural network for semi-supervised data stream classification. In: The 2010 international joint conference on neural networks (IJCNN), pp 1–8. IEEEGoogle Scholar
  68. 68.
    Leite DF, Costa P, Gomide F (2009) Evolving granular classification neural networks. In: International joint conference on neural networks, 2009. IJCNN 2009, pp 1736–1743Google Scholar
  69. 69.
    Lühr S, Lazarescu M (2009) Incremental clustering of dynamic data streams using connectivity based representative points. Data Knowl Eng 68(1):1–27CrossRefGoogle Scholar
  70. 70.
    Liu H, Yu L (2005) Toward integrating feature selection algorithms for classification and clustering. IEEE Trans Knowl Data Eng 17(4):491–502CrossRefGoogle Scholar
  71. 71.
    Martin E, Hans-Peter K, Jörg S, Xiaowei X (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of the 2nd international conference on knowledge discovery and data mining, pp 226–231. AAAI PressGoogle Scholar
  72. 72.
    Muthukrishnan SM (2003) Data streams: algorithms and applications. Now Publishers IncGoogle Scholar
  73. 73.
    Narasimhamurthy A, Kuncheva LI (2007) A framework for generating data to simulate changing environments. In: Proceedings of the 25th IASTED international multi-conference: artificial intelligence and applications, vol 549, p 389. ACTA PressGoogle Scholar
  74. 74.
    Nguyen H-L, Woon Y-K, Ng W-K, Wan L (2012) Heterogeneous ensemble for feature drifts in data streams. In: Proceedings of the 16th Pacific-Asia conference on advances in knowledge discovery and data mining, vol 2, pp 1–12, 2342648. SpringerGoogle Scholar
  75. 75.
    O’Callaghan L, Mishra N, Meyerson A, Guha S, Motwani R (2002) Streaming-data algorithms for high-quality clustering. In: Proceedings of the 18th international conference on data engineering, pp 685–694Google Scholar
  76. 76.
    Oza NC (2005) Online bagging and boosting. In: 2005 IEEE international conference on systems, man and cybernetics, vol 3, pp 2340–2345. IEEEGoogle Scholar
  77. 77.
    Park NH, Lee WS (2004) Statistical grid-based clustering over data streams. ACM SIGMOD Rec 33(1):32–37CrossRefGoogle Scholar
  78. 78.
    Park NH, Lee WS (2007) Cell trees: an adaptive synopsis structure for clustering multi-dimensional on-line data streams. Data Knowl Eng 63(2):528–549CrossRefGoogle Scholar
  79. 79.
    Rai P, Daum H, Venkatasubramanian S (2009) Streamed learning: one-pass svms. In: Proceedings of the 21st international jont conference on artificial intelligence, pp 1211–1216, 1661639. Morgan Kaufmann Publishers IncGoogle Scholar
  80. 80.
    Rodrigues P, Gama J, Pedroso J (2006) Odac: hierarchical clustering of time series data streams. In: SIAM international conference on data mining, Bethesda, Maryland, USA, pp 499–503Google Scholar
  81. 81.
    Sakaki T, Okazaki M, Matsuo Y (2010) Earthquake shakes twitter users: real-time event detection by social sensors. In: Proceedings of the 19th international conference on World wide web, pp 851–860. ACMGoogle Scholar
  82. 82.
    Sattar H, Ying Y, Zahra M, Mohammadreza K (2009) Adapted one-vs-all decision trees for data stream classification. IEEE Trans Knowl Data Eng 21:624–637CrossRefGoogle Scholar
  83. 83.
    Sebastiani F (2002) Machine learning in automated text categorization. ACM Comput Surv (CSUR) 34(1):1–47CrossRefGoogle Scholar
  84. 84.
    Seidl T, Assent I, Kranen P, Krieger R, Herrmann J (2009) Indexing density models for incremental learning and anytime classification on data streams. In: Proceedings of the 12th international conference on extending database technology: advances in database technology, pp 311–322, 1516397. ACMGoogle Scholar
  85. 85.
    Sheikholeslami G, Chatterjee S, Zhang A (2000) Wavecluster: a wavelet-based clustering approach for spatial data in very large databases. VLDB J 8(3–4):289–304CrossRefGoogle Scholar
  86. 86.
    Silva JA, Faria ER, Barros RC, Hruschka ER, Carvalho ACd, Gama J (2013) Data stream clustering: a survey. ACM Comput Surv (CSUR) 46(1):13CrossRefGoogle Scholar
  87. 87.
    Smith T, Alahakoon D (2009) Growing self-organizing map for online continuous clustering, vol 204 of Studies in computational intelligence, pp 49–83. Springer, BerlinGoogle Scholar
  88. 88.
    Sow D, Biem A, Blount M, Ebling M, Verscheure O (2010) Body sensor data processing using stream computing. In: Proceedings of the international conference on multimedia information retrieval, pp 449–458, 1743465. ACMGoogle Scholar
  89. 89.
    Spiliopoulou M, Ntoutsi I, Theodoridis Y, Schult R (2006) Monic: modeling and monitoring cluster transitions. In: Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining, pp 706–711. ACMGoogle Scholar
  90. 90.
    Street WN, Kim Y (2001) A streaming ensemble algorithm (sea) for large-scale classification. In: Proceedings of the seventh ACM SIGKDD international conference on knowledge discovery and data mining, pp 377–382, 502568. ACMGoogle Scholar
  91. 91.
    Sun J, Faloutsos C, Papadimitriou S, Yu PS (2007) Graphscope: parameter-free mining of large time-evolving graphs. In: Proceedings of the 13th ACM SIGKDD international conference on knowledge discovery and data mining, pp 687–696. ACMGoogle Scholar
  92. 92.
    Tasoulis DK, Ross G, Adams NM (2007) Visualising the cluster structure of data streams. In: Proceedings of the 7th international conference on intelligent data analysis, pp 81–92, 1771633. SpringerGoogle Scholar
  93. 93.
    Teh Y, Jordan M, Beal M, Blei D (2006) Hierarchical dirichlet processes. J Am Stat Assoc 101(476):1566–1581MathSciNetCrossRefzbMATHGoogle Scholar
  94. 94.
    Tsang IW, Kocsor A, Kwok JT (2007) Simpler core vector machines with enclosing balls. In: Proceedings of the 24th international conference on machine learning, pp 911–918, 1273611. ACMGoogle Scholar
  95. 95.
    Udommanetanakit K, Rakthanmanon T, Waiyamai K (2007) E-stream: evolution-based technique for stream clustering, vol 4632 of Lecture notes in computer science, pp 605–615. Springer, BerlinGoogle Scholar
  96. 96.
    van Rijsbergen C (1979) Information retrieval, 2nd edn. Butterworths, MassachusettsGoogle Scholar
  97. 97.
    Wan L, Ng WK, Dang XH, Yu PS, Zhang K (2009) Density-based clustering of data streams at multiple resolutions. ACM Trans Knowl Discov Data (TKDD) 3(3):1–28CrossRefGoogle Scholar
  98. 98.
    Wang H, Fan W, Yu PS, Han J (2003) Mining concept-drifting data streams using ensemble classifiers. In: Proceedings of the ninth ACM SIGKDD international conference on knowledge discovery and data mining, pp 226–235, 956778. ACMGoogle Scholar
  99. 99.
    Wang H, Yu PS, Han J (2005) Mining data streams. Springer, US, pp 777–792Google Scholar
  100. 100.
    Wang W, Yang J, Muntz R (1997) Sting: a statistical information grid approach to spatial data mining. In: Proceedings of the international conference on very large data bases, pp 186–195Google Scholar
  101. 101.
    Yang Y, Pierce T, Carbonell J (1998) A study of retrospective and on-line event detection. In: Proceedings of the 21st annual international ACM SIGIR conference on research and development in information retrieval, pp 28–36. ACMGoogle Scholar
  102. 102.
    Yang Y, Zhang J, Carbonell J, Jin C (2002) Topic-conditioned novelty detection. In: Proceedings of the eighth ACM SIGKDD international conference on knowledge discovery and data mining, pp 688–693. ACMGoogle Scholar
  103. 103.
    Zeng H-J, He Q-C, Chen Z, Ma W-Y, Ma J (2004) Learning to cluster web search results. In: Proceedings of the 27th annual international ACM SIGIR conference on research and development in information retrieval, pp 210–217, 1009030. ACMGoogle Scholar
  104. 104.
    Zhang P, Gao B, Zhu X, Guo L (2011) Enabling fast lazy learning for data streams. In: Proceedings of the 11th IEEE international conference on data mining (ICDM-11), pp 932–941. IEEEGoogle Scholar
  105. 105.
    Zhang P, Li J, Wang P, Gao BJ, Zhu X, Guo L (2011) Enabling fast prediction for ensemble models on data streams. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining, pp 177–185, 2020442. ACMGoogle Scholar
  106. 106.
    Zhang P, Zhu X, Shi Y, Guo L, Wu X (2011) Robust ensemble learning for mining noisy data streams. Decis Support Syst 50(2):469–479CrossRefGoogle Scholar
  107. 107.
    Zhang P, Zhu X, Shi Y, Wu X (2009) An aggregate ensemble for mining concept drifting data streams with noise, vol 5476 of Lecture notes in computer science, pp 1021–1029. Springer, BerlinGoogle Scholar
  108. 108.
    Zhang T, Ramakrishnan R, Livny M (1996) Birch: an efficient data clustering method for very large databases. In: Proceedings of the 1996 ACM SIGMOD international conference on management of data, pp 103–114, 233324. ACMGoogle Scholar
  109. 109.
    Zhong S (2005) Efficient streaming text clustering. Neural Netw 18(5):790–798CrossRefzbMATHGoogle Scholar
  110. 110.
    Zhou A, Cao F, Qian W, Jin C (2008) Tracking clusters in evolving data streams over sliding windows. Knowl Inf Syst 15(2):181–214CrossRefGoogle Scholar

Copyright information

© Springer-Verlag London 2014

Authors and Affiliations

  1. 1.Data Analytics Department, Institute for Infocomm Research (I2R) Agency for Science Technology and Research (A*STAR)SingaporeSingapore
  2. 2.Airbus Group Innovations South AsiaSingaporeSingapore
  3. 3.School of Computer EngineeringNanyang Technological UniversitySingaporeSingapore

Personalised recommendations