Top-k Best Probability Queries on Probabilistic Data

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNISA, volume 7239)

Abstract

Top-k queries on probabilistic data have attracted much interest in applications such as market analysis, personalised services, and decision making. The most common problem in answering top-k queries on probabilistic data is choosing the semantics of the results with respect to their scores and top-k probabilities. In this paper, we propose a novel top-k best probability query, which returns results that have not only the best top-k scores but also the best top-k probabilities. We also introduce an efficient algorithm for top-k best probability queries that does not require a user-defined threshold. We then analyse the top-k best probability answer and show that it satisfies the semantic ranking properties of queries on uncertain data [3,18]. Experimental studies on real data verify the effectiveness of top-k best probability queries and the efficiency of our algorithm.
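The abstract does not define the top-k probability of a tuple, so the following is only a minimal sketch of one common way to compute it, not the paper's algorithm. It assumes a tuple-independent uncertain relation in which each tuple has a distinct score and an existence probability; the function name `top_k_probabilities` and the input layout are hypothetical. Under these assumptions, a tuple is in the top-k of a possible world when it exists and fewer than k higher-scoring tuples exist.

```python
from typing import List, Tuple


def top_k_probabilities(
    tuples: List[Tuple[float, float]], k: int
) -> List[Tuple[float, float, float]]:
    """Return (score, probability, top-k probability) per tuple, best score first.

    Assumptions (not taken from the paper): tuples are independent,
    scores are distinct, and k >= 1.
    """
    ranked = sorted(tuples, key=lambda t: t[0], reverse=True)
    # dp[j] = probability that exactly j higher-scoring tuples exist (j < k).
    dp = [1.0] + [0.0] * (k - 1)
    result = []
    for score, p in ranked:
        # The tuple exists and at most k - 1 higher-scoring tuples exist.
        result.append((score, p, p * sum(dp)))
        # Fold this tuple into the count distribution for the next tuples.
        nxt = [0.0] * k
        for j in range(k):
            nxt[j] = dp[j] * (1.0 - p)
            if j > 0:
                nxt[j] += dp[j - 1] * p
        dp = nxt
    return result


if __name__ == "__main__":
    data = [(100.0, 0.5), (90.0, 0.8), (80.0, 1.0)]
    for score, p, tk in top_k_probabilities(data, k=2):
        print(f"score={score} p={p} top-2 probability={tk:.3f}")
```

The sketch runs in O(nk) time after sorting. How the proposed query combines these probabilities with the scores to select the final answer is not stated in the abstract and is not modelled here.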

Keywords

  • Top-k Query
  • Query Processing
  • Uncertain Data


References

  1. Aggarwal, C.C., Yu, P.S.: A survey of uncertain data algorithms and applications. IEEE TKDE 21, 609–623 (2009)

  2. Atallah, M.J., Qi, Y.: Computing all skyline probabilities for uncertain data. In: PODS, pp. 279–287 (2009)

  3. Li, F., Cormode, G., Yi, K.: Semantics of ranking queries for probabilistic data and expected ranks. In: ICDE, pp. 305–316 (2009)

  4. Ge, T., Zdonik, S., Madden, S.: Top-k queries on uncertain data: on score distribution and typical answers. In: SIGMOD, pp. 375–388 (2009)

  5. Getoor, L.: Learning Probabilistic Relational Models. In: Choueiry, B.Y., Walsh, T. (eds.) SARA 2000. LNCS (LNAI), vol. 1864, pp. 322–323. Springer, Heidelberg (2000)

  6. Hua, M., Pei, J., Zhang, W., Lin, X.: Ranking queries on uncertain data: a probabilistic threshold approach. In: SIGMOD, pp. 673–686 (2008)

  7. Ilyas, I.F., Beskales, G., Soliman, M.A.: A survey of top-k query processing techniques in relational database systems. ACM Computing Surveys 40, 1–58 (2008)

  8. Chomicki, J., Godfrey, P., Gryz, J., Liang, D.: Skyline with presorting: theory and optimizations. IIPWM 31, 595–604 (2005)

  9. Jin, C., Yi, K., Chen, L., Yu, J.X., Lin, X.: Sliding-window top-k queries on uncertain streams. In: VLDB, pp. 301–312 (2008)

  10. Lange, K.: Numerical analysis for statisticians. Springer, Heidelberg (1999)

  11. Li, J., Saha, B., Deshpande, A.: A unified approach to ranking in probabilistic databases. In: VLDB, pp. 502–513 (2009)

  12. Tan, P.-N., Steinbach, M., Kumar, V.: Introduction to data mining. Addison-Wesley (2006)

  13. Papadias, D., Tao, Y., Fu, G., Seeger, B.: An optimal and progressive algorithm for skyline queries. In: SIGMOD, pp. 467–478 (2003)

  14. Pei, J., Jiang, B., Lin, X., Yuan, Y.: Probabilistic skylines on uncertain data. In: VLDB, pp. 15–26 (2007)

  15. Sarma, A.D., Benjelloun, O., Halevy, A., Widom, J.: Working models for uncertain data. In: ICDE (2006)

  16. Soliman, M.A., Ilyas, I.F., Chang, K.C.-C.: Top-k query processing in uncertain databases. In: ICDE, pp. 896–905 (2007)

  17. Soliman, M.A., Ilyas, I.F., Chang, K.C.-C.: Probabilistic top-k & ranking-aggregate queries. ACM Trans. Database Syst. 33, 13:1–13:54 (2008)

  18. Zhang, X., Chomicki, J.: Semantics and evaluation of top-k queries in probabilistic databases. Distributed and Parallel Databases 26(1), 67–126 (2009)

  19. Yan, D., Ng, W.: Robust Ranking of Uncertain Data. In: Yu, J.X., Kim, M.H., Unland, R. (eds.) DASFAA 2011, Part I. LNCS, vol. 6587, pp. 254–268. Springer, Heidelberg (2011)

  20. Yi, K., Li, F., Kollios, G., Srivastava, D.: Efficient processing of top-k queries in uncertain databases with x-relations. IEEE TKDE 20, 1669–1682 (2008)

  21. Zhang, S., Zhang, C.: A probabilistic data model and its semantics. Journal of Research & Practice in Information Technology 35, 237–256 (2003)

  22. Zhang, W., Lin, X., Pei, J., Zhang, Y.: Managing uncertain data: probabilistic approaches. In: Web-Age Information Management (2008)

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Le, T.M.N., Cao, J. (2012). Top-k Best Probability Queries on Probabilistic Data. In: Lee, Sg., Peng, Z., Zhou, X., Moon, YS., Unland, R., Yoo, J. (eds) Database Systems for Advanced Applications. DASFAA 2012. Lecture Notes in Computer Science, vol 7239. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29035-0_1

  • DOI: https://doi.org/10.1007/978-3-642-29035-0_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-29034-3

  • Online ISBN: 978-3-642-29035-0

  • eBook Packages: Computer Science, Computer Science (R0)