Approximate Sorting of Data Streams with Limited Storage
We consider the problem of approximately sorting a data stream in one pass with limited internal storage, where the goal is not to rearrange the data but to output a permutation that reflects the ordering of the stream's elements as closely as possible. Our main objective is to study the relationship between the quality of the sorting and the amount of available storage. To measure quality, we use permutation distortion metrics, namely the Kendall tau and Chebyshev metrics, as well as the mutual information between the output permutation and the true ordering of the data elements. We provide bounds on the performance of algorithms with limited storage and present a simple algorithm whose storage requirement is asymptotically within a constant factor of that of an optimal algorithm, in terms of mutual information and average Kendall tau distortion.
Keywords: Mutual Information, Data Stream, Deterministic Algorithm, Internal Storage, Limited Storage
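The two distortion metrics named in the abstract can be illustrated with a minimal sketch. It assumes each permutation is given as a rank vector, where position i holds the rank assigned to the i-th stream element. This convention and the function names `kendall_tau_distance` and `chebyshev_distance` are illustrative assumptions, not taken from the paper, whose exact definitions and normalization may differ.

```python
from itertools import combinations


def kendall_tau_distance(p, q):
    """Count the pairs of elements that the rank vectors p and q order differently.

    Assumption: p[i] and q[i] are the ranks given to element i by the two orderings.
    """
    if len(p) != len(q):
        raise ValueError("rank vectors must have the same length")
    return sum(
        1
        for i, j in combinations(range(len(p)), 2)
        if (p[i] - p[j]) * (q[i] - q[j]) < 0  # opposite relative order
    )


def chebyshev_distance(p, q):
    """Largest displacement of any single element's rank between p and q."""
    if len(p) != len(q):
        raise ValueError("rank vectors must have the same length")
    return max((abs(a - b) for a, b in zip(p, q)), default=0)


# Hypothetical example: the approximate output swaps the two smallest elements.
true_ranks = [0, 1, 2, 3]
approx_ranks = [1, 0, 2, 3]
print(kendall_tau_distance(true_ranks, approx_ranks))  # 1
print(chebyshev_distance(true_ranks, approx_ranks))    # 1
```

Under this convention, the Kendall tau distance reflects how many pairwise comparisons the output gets wrong, while the Chebyshev distance reflects the worst-case rank error of any single element. The quadratic pair loop is chosen for readability; a merge-sort-based inversion count would be preferable for long permutations.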