Getting Critical Categories of a Data Set

  • Cheqing Jin
  • Yizhen Zhang
  • Aoying Zhou
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6897)


Ranking queries are fundamental in database management and are widely used across applications. Most existing work on ranking queries focuses on retrieving the top-k highest-scoring tuples from a data set. This paper instead addresses retrieving the top-k critical categories, where a category is either a single value of a nominal attribute or a combination of values drawn from several nominal attributes. To describe each category precisely, we represent it by the distribution of its score-attribute values, so that the set of all categories can be treated as a probabilistic data set. We devise a novel method for this problem. Theoretical analysis and experimental results demonstrate the effectiveness and efficiency of the proposed method.
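The category representation described above can be illustrated with a minimal Python sketch. It groups tuples by a nominal attribute and builds an empirical score distribution per category. The attribute names (`brand`, `rating`), the sample rows, and both helper functions are hypothetical. Ranking by expected score is an assumption chosen only for illustration; it is not the method proposed in the paper, which the abstract does not specify.

```python
from collections import Counter, defaultdict


def category_distributions(rows, category_key, score_key):
    """Map each category to an empirical score distribution {score: probability}."""
    scores_by_category = defaultdict(list)
    for row in rows:
        scores_by_category[row[category_key]].append(row[score_key])

    distributions = {}
    for category, scores in scores_by_category.items():
        counts = Counter(scores)
        total = len(scores)
        distributions[category] = {s: c / total for s, c in counts.items()}
    return distributions


def top_k_by_expected_score(distributions, k):
    """Rank categories by expected score (illustrative criterion, an assumption)."""
    expected = {
        category: sum(score * p for score, p in dist.items())
        for category, dist in distributions.items()
    }
    # Ties are broken by category name so the result is deterministic.
    return sorted(expected, key=lambda c: (-expected[c], c))[:k]


# Hypothetical data: "brand" is the nominal attribute, "rating" the score attribute.
rows = [
    {"brand": "A", "rating": 5},
    {"brand": "A", "rating": 3},
    {"brand": "B", "rating": 4},
    {"brand": "B", "rating": 4},
    {"brand": "C", "rating": 2},
]

dists = category_distributions(rows, "brand", "rating")
top2 = top_k_by_expected_score(dists, 2)
print(dists)
print(top2)
```

In this toy data, categories A and B have the same expected score but different distributions, which suggests why a single summary value may not be enough to compare categories and why the full distribution is kept.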


critical category, ranking query, possible world





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Cheqing Jin (1)
  • Yizhen Zhang (1)
  • Aoying Zhou (1)
  1. Shanghai Key Laboratory of Trustworthy Computing, Software Engineering Institute, East China Normal University, China
