Algorithmica, Volume 79, Issue 3, pp 960–983

Partial Sorting Problem on Evolving Data

  • Qin Huang
  • Xingwu Liu
  • Xiaoming Sun
  • Jialin Zhang

Abstract

In this paper we investigate the top-k-selection problem, that is, determining and sorting the top k elements, in the dynamic data model. Here dynamic means that the underlying total order evolves over time and that the order can only be probed by pairwise comparisons. It is assumed that at each time step only one pair of elements can be compared. This assumption of restricted access is reasonable in the dynamic model, especially for massive data sets where it is impossible to access all the data before the next change occurs. Previously, only two special cases were studied in this model (Anagnostopoulos et al., in: 36th International Colloquium on Automata, Languages and Programming (ICALP), LNCS, vol. 5566, pp. 339–350, 2009): selecting the element of a given rank, and sorting all elements. This paper systematically deals with all \(1\le k\le n\). Specifically, we identify the critical point \(k^*\) such that the top-k-selection problem can be solved error-free with probability \(1-o(1)\) if and only if \(k=o(k^*)\). A lower bound on the error when \(k=\varOmega (k^*)\) is also determined, and this bound is tight under some conditions. In contrast, we show that the top-k-set problem, which asks for the top k elements without sorting them, can be solved error-free with probability \(1-o(1)\) for all \(1\le k\le n\). Additionally, we consider some extensions of the dynamic data model and show that most of these results still hold.
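
The setting described in the abstract can be made concrete with a small simulation. The sketch below is only an illustration under stated assumptions: the abstract does not spell out how the order evolves, so the rule used here (after every comparison, a configurable number of uniformly random adjacent pairs swap) is an assumption, and the names `EvolvingOrder`, `track`, and `top_k_status` are hypothetical. The tracker is a naive cyclic pass of adjacent comparisons, not the algorithm analysed in the paper. It only shows the difference between getting the top-k set right and getting the sorted top-k list right.

```python
import random


class EvolvingOrder:
    """Hidden total order that changes over time; only pairwise probes are allowed."""

    def __init__(self, n, swaps_per_step=1, seed=0):
        self.rng = random.Random(seed)
        self.order = list(range(n))  # order[0] is the current top element
        self.rng.shuffle(self.order)
        self.pos = {e: i for i, e in enumerate(self.order)}
        self.swaps_per_step = swaps_per_step
        self.steps = 0  # number of comparisons (time steps) used so far

    def compare(self, a, b):
        """Return True if a currently ranks above b, then let the order evolve."""
        answer = self.pos[a] < self.pos[b]
        self._evolve()
        self.steps += 1
        return answer

    def _evolve(self):
        # Assumption: each change swaps one uniformly random adjacent pair.
        for _ in range(self.swaps_per_step):
            i = self.rng.randrange(len(self.order) - 1)
            a, b = self.order[i], self.order[i + 1]
            self.order[i], self.order[i + 1] = b, a
            self.pos[a], self.pos[b] = i + 1, i

    def true_top(self, k):
        """Ground truth, used for evaluation only (an algorithm cannot see it)."""
        return list(self.order[:k])


def track(oracle, n, steps):
    """Naive baseline: cyclic passes of adjacent comparisons, one per time step."""
    est = list(range(n))
    i = 0
    for _ in range(steps):
        if not oracle.compare(est[i], est[i + 1]):
            est[i], est[i + 1] = est[i + 1], est[i]
        i = (i + 1) % (n - 1)
    return est


def top_k_status(est, truth, k):
    """Return (top-k set correct, sorted top-k list correct)."""
    set_ok = set(est[:k]) == set(truth[:k])
    order_ok = list(est[:k]) == list(truth[:k])
    return set_ok, order_ok
```

For instance, calling `track(EvolvingOrder(100), 100, 20000)` and then `top_k_status` against `true_top(k)` for several values of k gives a feel for how often the set is correct while the order is not. With `swaps_per_step=0` the order is static and the tracker reduces to ordinary bubble sort.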

Notes

Acknowledgements

The work is partially supported by the National Key Research and Development Program of China (2016YFB1000201, 2016YFB1000604), State Key Laboratory of Software Development Environment Open Fund (SKLSDE-2016KF-01), Science Foundation of Shenzhen City in China (JCYJ20160419152942010), National Natural Science Foundation of China (61222202, 61433014, 61502449, 61602440), and the China National Program for support of Top-notch Young Professionals.

References

  1. Anagnostopoulos, A., Kumar, R., Mahdian, M., Upfal, E.: Sort me if you can: how to sort dynamic data. In: 36th International Colloquium on Automata, Languages and Programming (ICALP). LNCS, vol. 5566, pp. 339–350 (2009)
  2. Ilyas, I., Beskales, G., Soliman, M.: A survey of top-\(k\) query processing techniques in relational database systems. ACM Comput. Surv. 40(4), 11 (2008)
  3. Whang, K., Kim, M., Lee, J.: Linear-time top-k sort method. US Patent 8,296,306 B1 (2012)
  4. Knuth, D.E.: The Art of Computer Programming, vol. 3. Addison-Wesley, Reading (1973)
  5. Kislitsyn, S.S.: On the selection of the \(k\)th element of an ordered set by pairwise comparison. Sibirskii Mat. Zhurnal 5, 557–564 (1964)
  6. Blum, M., Floyd, R., Pratt, V., Rivest, R., Tarjan, R.: Time bounds for selection. J. Comput. Syst. Sci. 7(4), 448–461 (1973)
  7. Dor, D., Zwick, U.: Selecting the median. In: SODA 1995, pp. 28–37 (1995)
  8. Moreland, A.: Dynamic Data: Model, Sorting, Selection. Technical report (2014)
  9. Anagnostopoulos, A., Kumar, R., Mahdian, M., Upfal, E., Vandin, F.: Algorithms on evolving graphs. In: 3rd Innovations in Theoretical Computer Science Conference (ITCS), pp. 149–160. ACM, New York (2012)
  10. Bahmani, B., Kumar, R., Mahdian, M., Upfal, E.: Pagerank on an evolving graph. In: 18th ACM International Conference on Knowledge Discovery and Data Mining (SIGKDD), pp. 24–32. ACM (2012)
  11. Zhuang, H., Sun, Y., Tang, J., Zhang, J., Sun, X.: Influence maximization in dynamic social networks. In: 13th IEEE International Conference on Data Mining (ICDM), pp. 1313–1318. IEEE (2013)
  12. Kanade, V., Leonardos, N., Magniez, F.: Stable matching with evolving preferences (2015). arXiv:1509.01988
  13. Labouseur, A.G., Olsen, P.W., Hwang, J.H.: Scalable and robust management of dynamic graph data. In: 1st International Workshop on Big Dynamic Distributed Data (BD3@VLDB), pp. 43–48 (2013)
  14. Ren, C.: Algorithms for evolving graph analysis. Doctoral dissertation, The University of Hong Kong (2014)
  15. Ajtai, M., Feldman, V., Hassidim, A., Nelson, J.: Sorting and selection with imprecise comparisons. In: 36th International Colloquium on Automata, Languages and Programming (ICALP). LNCS, vol. 5566, pp. 37–48 (2009)
  16. Feige, U., Raghavan, P., Peleg, D., Upfal, E.: Computing with noisy information. SIAM J. Comput. 23(5), 1001–1018 (1994)
  17. Finocchi, I., Grandoni, F., Italiano, G.: Optimal resilient sorting and searching in the presence of memory faults. In: 33rd International Colloquium on Automata, Languages and Programming (ICALP). LNCS, vol. 4051, pp. 286–298 (2006)
  18. Finocchi, I., Italiano, G.: Sorting and searching in the presence of memory faults (without redundancy). In: 36th Annual ACM Symposium on Theory of Computing (STOC), pp. 101–110 (2004)
  19. Erlebach, T., Hoffmann, M., Kammer, F.: On temporal graph exploration. In: 42nd International Colloquium on Automata, Languages and Programming (ICALP). LNCS, vol. 9134, pp. 444–455 (2015)
  20. Babcock, B., Babu, S., Datar, M., Motwani, R., Widom, J.: Models and issues in data stream systems. In: 21st ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS), pp. 1–16. ACM (2002)
  21. Bressan, M., Peserico, E., Pretto, L.: Approximating PageRank locally with sublinear query complexity (2014). arXiv:1404.1864
  22. Fujiwara, Y., Nakatsuji, M., Shiokawa, H., Mishima, T., Onizuka, M.: Fast and exact top-k algorithm for PageRank. In: 27th AAAI Conference on Artificial Intelligence, pp. 1106–1112 (2013)
  23. Albers, S.: Online algorithms: a survey. Math. Programm. 97(1–2), 3–26 (2003)
  24. Kuleshov, V., Precup, D.: Algorithms for multi-armed bandit problems (2014). arXiv:1402.6028
  25. Dubhashi, D.P., Panconesi, A.: Concentration of Measure for the Analysis of Randomized Algorithms. Cambridge University Press, Cambridge (2009)

Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  • Qin Huang (1, 5)
  • Xingwu Liu (2, 3, 4)
  • Xiaoming Sun (4, 5)
  • Jialin Zhang (4, 5)

  1. State Key Laboratory of Software Development Environment, Beihang University, Beijing, China
  2. Research Institute of Beihang University in Shenzhen, Shenzhen, China
  3. Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
  4. University of Chinese Academy of Sciences, Beijing, China
  5. CAS Key Lab of Network Data Science and Technology, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
