Approximate Sorting of Data Streams with Limited Storage

  • Farzad Farnoud (Hassanzadeh)
  • Eitan Yaakobi
  • Jehoshua Bruck
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8591)


We consider the problem of approximately sorting a data stream in one pass with limited internal storage, where the goal is not to rearrange the data but to output a permutation that reflects the ordering of the stream's elements as closely as possible. Our main objective is to study the relationship between the quality of the sorting and the amount of available storage. To measure quality, we use two permutation distortion metrics, the Kendall tau and Chebyshev metrics, as well as the mutual information between the output permutation and the true ordering of the data elements. We provide bounds on the performance of algorithms with limited storage and present a simple algorithm that, in terms of mutual information and average Kendall tau distortion, asymptotically requires storage within a constant factor of that of an optimal algorithm.
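The two distortion metrics named in the abstract can be illustrated with a minimal sketch. It assumes each permutation is given as a list of distinct items in ranked order; under that assumed convention, the Kendall tau distance counts the pairs of items that the two rankings order differently, and the Chebyshev distance is the largest displacement of any item's position. The function names are hypothetical and are not taken from the paper, whose conventions may differ.

```python
from itertools import combinations


def kendall_tau_distance(p, q):
    """Count the pairs of items that rankings p and q place in opposite order."""
    # Position of every item in q, so p can be re-expressed in q's coordinates.
    pos_q = {item: i for i, item in enumerate(q)}
    mapped = [pos_q[item] for item in p]
    # Each inversion in `mapped` is one discordant pair (simple O(n^2) count).
    return sum(
        1
        for i, j in combinations(range(len(mapped)), 2)
        if mapped[i] > mapped[j]
    )


def chebyshev_distance(p, q):
    """Largest difference between an item's position in p and its position in q."""
    pos_p = {item: i for i, item in enumerate(p)}
    pos_q = {item: i for i, item in enumerate(q)}
    return max(abs(pos_p[item] - pos_q[item]) for item in p)


if __name__ == "__main__":
    true_order = [0, 1, 2, 3]
    estimate = [3, 0, 1, 2]  # item 3 was wrongly placed first
    print(kendall_tau_distance(true_order, estimate))
    print(chebyshev_distance(true_order, estimate))
```

The quadratic pair count is chosen for readability; a merge-sort-based inversion count would compute the same Kendall tau distance in O(n log n) time.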







Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Farzad Farnoud (Hassanzadeh), California Institute of Technology, Pasadena, USA
  • Eitan Yaakobi, California Institute of Technology, Pasadena, USA
  • Jehoshua Bruck, California Institute of Technology, Pasadena, USA
