
Probabilistic n-of-N skyline computation over uncertain data streams

World Wide Web

Abstract

The skyline operator is a useful tool for multi-criteria decision making in many applications, and uncertainty is inherent in real application data. In this paper, we consider the problem of efficiently computing probabilistic skylines against the most recent N uncertain elements seen so far in a data stream. Specifically, we study the problem in the n-of-N model; that is, computing the probabilistic skyline for the most recent n elements, for any n ≤ N, where an element is a probabilistic skyline element if its skyline probability is not below a given probability threshold q. First, we develop an effective pruning technique that minimizes the number of uncertain elements to be kept. It can be shown that, on average, storing only O(log^d N) uncertain elements from the most recent N elements is sufficient to support the precise computation of all probabilistic n-of-N skyline queries in a d-dimensional space if the data distribution on each dimension is independent. We then propose a novel encoding scheme, together with efficient update techniques, so that the cost of computing a probabilistic n-of-N skyline query in a d-dimensional space is reduced to O(d log log N + s) if the data distribution is independent, where s is the number of skyline points. A trigger-based technique is provided to process continuous n-of-N skyline queries. Extensive experiments demonstrate that the new techniques can support on-line probabilistic skyline query computation over rapid uncertain data streams.
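To make the query semantics concrete, the following is a minimal, naive sketch of a probabilistic n-of-N skyline query. It is not the pruning or encoding technique of the paper, only a quadratic-time baseline. The abstract does not spell out the uncertainty model, so the sketch assumes one that is common in the sliding-window literature: each element carries an occurrence probability, smaller values are preferred on every dimension, and the skyline probability of an element is its own probability multiplied by (1 - P(e')) for every element e' in the queried window that dominates it. The names `Element`, `NaiveNofN`, and `skyline_probability` are hypothetical.

```python
from collections import deque
from typing import List, NamedTuple, Sequence, Tuple


class Element(NamedTuple):
    point: Tuple[float, ...]  # assumption: smaller is better on every dimension
    prob: float               # assumption: occurrence probability in (0, 1]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b if a is no worse everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline_probability(idx: int, window: Sequence[Element]) -> float:
    """Assumed model: P(e) times the probability that no dominator of e occurs."""
    e = window[idx]
    p = e.prob
    for j, other in enumerate(window):
        if j != idx and dominates(other.point, e.point):
            p *= 1.0 - other.prob
    return p


class NaiveNofN:
    """Keeps the most recent N elements and answers any n-of-N query by brute force."""

    def __init__(self, N: int) -> None:
        self.N = N
        self.buf = deque(maxlen=N)  # the oldest element is evicted automatically

    def append(self, element: Element) -> None:
        self.buf.append(element)

    def query(self, n: int, q: float) -> List[Element]:
        """Elements among the most recent n whose skyline probability is at least q."""
        if not 1 <= n <= self.N:
            raise ValueError("n must satisfy 1 <= n <= N")
        recent = list(self.buf)[-n:]
        return [e for i, e in enumerate(recent)
                if skyline_probability(i, recent) >= q]


a = Element((1.0, 1.0), 0.9)
b = Element((2.0, 2.0), 0.8)
c = Element((0.5, 3.0), 0.5)
d = Element((3.0, 0.5), 0.6)

stream = NaiveNofN(N=4)
for e in (a, b, c, d):
    stream.append(e)

full_window = stream.query(n=4, q=0.5)    # b is dominated by a, so it drops out
recent_three = stream.query(n=3, q=0.5)   # a is outside the window, so b qualifies
```

In this toy stream, `b` fails the threshold when all four elements are considered but qualifies among the most recent three, which illustrates why the answer depends on n and why the n-of-N model must support every n ≤ N from a single maintained structure.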



Author information

Corresponding author

Correspondence to Aiping Li.

Additional information

Wenjie Zhang was partially supported by ARC DE120102144 and DP120104168. Aiping Li (Corresponding Author) was partially supported by State Key Development Program of Basic Research of China (No. 2013CB329601) and National Key Technology R&D Program (No. 2012BAH38B-04). Muhammad Aamir Cheema was partially supported by ARC DE130101002 and DP130103405. Ying Zhang was partially supported by ARC DP110104880, DP130103245 and UNSW ECR grant PSE1799.

About this article

Cite this article

Zhang, W., Li, A., Cheema, M.A. et al. Probabilistic n-of-N skyline computation over uncertain data streams. World Wide Web 18, 1331–1350 (2015). https://doi.org/10.1007/s11280-014-0292-2

  • DOI: https://doi.org/10.1007/s11280-014-0292-2
