An Improved Data Stream Algorithm for Clustering

  • Sang-Sub Kim
  • Hee-Kap Ahn
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8392)

Abstract

We present a single-pass, (1.8 + ε)-factor, O(1/ε)-space data stream algorithm for the Euclidean 2-center problem in any fixed dimension d ≥ 1. This improves the approximation factor over the (2 + ε)-factor, O(1/ε)-space algorithms of Ahn et al. [3] and Guha [8]. It can also be viewed as an improvement in space over the (1 + ε)-factor, O(1/ε^d)-space algorithm of Zarrabi-Zadeh [11], at the cost of a slightly larger approximation factor. To the best of our knowledge, this is the first algorithm to achieve an approximation factor below 2 using O(1/ε) space for any fixed d. Our algorithm also extends to the k-center problem with k > 2.
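
To make the stated guarantee concrete, the following minimal sketch evaluates the 2-center objective for a given pair of centers and checks whether a radius is within a (factor + ε) bound of a reference radius. It is not the algorithm of the paper, which the abstract does not describe; the function names `two_center_radius` and `within_factor` and the sample points are assumptions made only for illustration.

```python
import math
from typing import Iterable, Sequence

Point = Sequence[float]


def two_center_radius(points: Iterable[Point], c1: Point, c2: Point) -> float:
    """Smallest common radius r such that the two congruent balls
    B(c1, r) and B(c2, r) together cover every input point.

    Each point is served by its nearer center, so the required radius
    is the largest of those nearest-center distances.
    """
    return max(
        (min(math.dist(p, c1), math.dist(p, c2)) for p in points),
        default=0.0,
    )


def within_factor(radius: float, optimal_radius: float,
                  factor: float, eps: float) -> bool:
    """True if `radius` is at most (factor + eps) times `optimal_radius`.

    With factor = 1.8 this is the kind of bound the abstract states;
    `optimal_radius` must be supplied by the caller (hypothetical input).
    """
    return radius <= (factor + eps) * optimal_radius


# Illustrative data: two well-separated pairs of points on a line.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
r = two_center_radius(points, (0.5, 0.0), (10.5, 0.0))  # 0.5 for this input
```

A streaming algorithm sees the points one at a time and cannot store them all, so in practice it maintains a small summary from which the two centers are derived; the sketch above only measures the quality of whatever centers are produced.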

References

  1. Agarwal, P.K., Sharathkumar, R.: Streaming algorithms for extent problems in high dimensions. In: Proc. of the 21st ACM-SIAM Sympos. Discrete Algorithms, pp. 1481–1489 (2010)
  2. Aggarwal, C.C.: Data streams: models and algorithms. Springer (2007)
  3. Ahn, H.-K., Kim, H.-S., Kim, S.-S., Son, W.: Computing k-center over streaming data for small k. In: Proc. of the 23rd Int. Sympos. Algorithms and Computation, pp. 54–63 (2012)
  4. Bonnell, I., Bate, M., Vine, S.: The hierarchical formation of a stellar cluster. Monthly Notices of the Royal Astronomical Society 343(2), 413–418 (2003)
  5. Chan, T.M., Pathak, V.: Streaming and dynamic algorithms for minimum enclosing balls in high dimensions. In: Proc. of the 12th Int. Conf. on Algorithms and Data Structures, pp. 195–206 (2011)
  6. Clarke, C., Bonnell, I., Hillenbrand, L.: The formation of stellar clusters. In: Mannings, V., Boss, A., Russell, S. (eds.) Protostars and Planets IV, pp. 151–177. University of Arizona Press, Tucson (2000)
  7. Fayyad, U., Piatetsky-Shapiro, G., Smyth, P.: From data mining to knowledge discovery in databases. AI Magazine 17, 37–54 (1996)
  8. Guha, S.: Tight results for clustering and summarizing data streams. In: Proc. of the 12th Int. Conf. on Database Theory, pp. 268–275. ACM (2009)
  9. Han, J., Kamber, M.: Data mining: concepts and techniques. Morgan Kaufmann (2006)
  10. Hershberger, J., Suri, S.: Adaptive sampling for geometric problems over data streams. Computational Geometry 39(3), 191–208 (2008)
  11. Zarrabi-Zadeh, H.: Core-preserving algorithms. In: Proc. of 20th Canadian Conf. on Computational Geometry, pp. 159–162 (2008)
  12. Poon, C.K., Zhu, B.: Streaming with minimum space: An algorithm for covering by two congruent balls. In: Lin, G. (ed.) COCOA 2012. LNCS, vol. 7402, pp. 269–280. Springer, Heidelberg (2012)
  13. Sonka, M., Hlavac, V., Boyle, R.: Image processing, analysis, and machine vision, 3rd edn. Thomson Learning (2007)
  14. Zarrabi-Zadeh, H., Chan, T.: A simple streaming algorithm for minimum enclosing balls. In: Proc. of 18th Canadian Conf. on Computational Geometry, pp. 139–142 (2006)

Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Sang-Sub Kim (1)
  • Hee-Kap Ahn (1)
  1. Department of Computer Science and Engineering, POSTECH, Pohang, Republic of Korea
