Fast Top-Q and Top-K Query Answering

  • Conference paper
Future Data and Security Engineering (FDSE 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10646)

Abstract

Efficient retrieval of the most relevant (e.g. top-k, k-NN) tuples is an important requirement in information systems that access large amounts of data. Top-k (or k-nearest-neighbors) queries retrieve the k objects that score best for a specified objective function. However, retrieving the closest objects does not tell the user how close or similar those objects are to the ideal object described by the input query. To support the query issuer more appropriately, we introduce top-q query answering (TQQA), which does not return a fixed number of result tuples but all tuples that are similar to the searched optimum with at least some minimum degree q. We show how to combine top-q queries with top-k queries, enabling the user to pose a large number of interesting queries. To the best of our knowledge, neither such a top-q query answering approach nor a combination with top-k has been proposed before. We implemented our approach and evaluated it against the best position algorithm BPA-2, which proved to be among the fastest threshold-based top-k query answering approaches. Our experiments showed an improvement of one to two orders of magnitude in time and memory requirements.
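
The difference between a top-k and a top-q result can be illustrated with a minimal Python sketch. This is a naive full scan that only shows the query semantics; it is not the TQQA algorithm of the paper. The `similarity` function, the attribute normalization to [0, 1], and the way the optional `k` limit is combined with `q` are assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Row = Tuple[float, ...]


def similarity(row: Sequence[float], ideal: Sequence[float]) -> float:
    """Hypothetical similarity in [0, 1]: 1 minus the mean absolute difference.

    Assumes every attribute is already normalized to the range [0, 1].
    """
    return 1.0 - sum(abs(a - b) for a, b in zip(row, ideal)) / len(ideal)


def top_q(
    rows: Sequence[Row],
    ideal: Row,
    q: float,
    k: Optional[int] = None,
) -> List[Tuple[float, Row]]:
    """Return (score, row) pairs with similarity >= q, best first.

    If k is given, at most k of those pairs are returned. This is one
    plausible way to combine top-q with top-k (an assumption).
    """
    scored = [(similarity(r, ideal), r) for r in rows]
    hits = [sr for sr in scored if sr[0] >= q]
    hits.sort(key=lambda sr: sr[0], reverse=True)  # highest similarity first
    return hits if k is None else hits[:k]


# Illustrative data: two normalized attributes per tuple.
rows = [(0.9, 0.8), (0.5, 0.5), (0.1, 0.2), (0.85, 0.95)]
ideal = (1.0, 1.0)

close_matches = top_q(rows, ideal, q=0.8)     # size depends on the data
best_match = top_q(rows, ideal, q=0.4, k=1)   # top-q capped by top-k
no_matches = top_q(rows, ideal, q=0.95)       # may legitimately be empty
```

Unlike a pure top-k query, the result size here depends on the data: a demanding threshold q can return an empty result, which tells the user that no tuple is sufficiently close to the ideal object.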

The work reported here was supported by the Austrian Ministry of Science and Research within the project GATIB II and BBMRI.AT.

Author information

Corresponding author

Correspondence to Johann Eder.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Dabringer, C., Eder, J. (2017). Fast Top-Q and Top-K Query Answering. In: Dang, T., Wagner, R., Küng, J., Thoai, N., Takizawa, M., Neuhold, E. (eds) Future Data and Security Engineering. FDSE 2017. Lecture Notes in Computer Science, vol 10646. Springer, Cham. https://doi.org/10.1007/978-3-319-70004-5_3

  • DOI: https://doi.org/10.1007/978-3-319-70004-5_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-70003-8

  • Online ISBN: 978-3-319-70004-5

  • eBook Packages: Computer Science, Computer Science (R0)
