Taxonomy-based relaxation of query answering in relational databases

  • Regular Paper
The VLDB Journal

Abstract

Traditional information search in which queries are posed against a known and rigid schema over a structured database is shifting toward a Web scenario in which exposed schemas are vague or absent and data come from heterogeneous sources. In this framework, query answering cannot be precise and needs to be relaxed, with the goal of matching user requests with accessible data. In this paper, we propose a logical model and a class of abstract query languages as a foundation for querying relational data sets with vague schemas. Our approach relies on the availability of taxonomies, that is, simple classifications of terms arranged in a hierarchical structure. The model is a natural extension of the relational model in which data domains are organized in hierarchies, according to different levels of generalization between terms. We first propose a conservative extension of the relational algebra for this model in which special operators allow the specification of relaxed queries over vaguely structured information. We also study equivalence and rewriting properties of the algebra that can be used for query optimization. We then illustrate a logic-based query language that can provide a basis for expressing relaxed queries in a declarative way. We finally investigate the expressive power of the proposed query languages and the independence of the taxonomy in this context.
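As a rough illustration of what relaxing a query through a taxonomy can mean, the sketch below generalizes a selection value a given number of levels up a hierarchy and returns every row whose value falls under the generalized term. The toy taxonomy, the data, and the names `ancestors`, `generalize`, and `relaxed_select` are assumptions made for this example; they are not the operators or notation defined in the paper.

```python
from typing import Dict, List, Optional

# Hypothetical toy taxonomy: child term -> parent term (None marks the root).
PARENT: Dict[str, Optional[str]] = {
    "Rome": "Italy",
    "Milan": "Italy",
    "Paris": "France",
    "Italy": "Europe",
    "France": "Europe",
    "Europe": None,
}


def ancestors(term: str) -> List[str]:
    """Return the term followed by its ancestors, most specific first."""
    chain: List[str] = []
    current: Optional[str] = term
    while current is not None:
        chain.append(current)
        current = PARENT.get(current)  # unknown terms have no parent
    return chain


def generalize(term: str, levels: int) -> str:
    """Climb `levels` steps up the taxonomy, stopping at the root."""
    chain = ancestors(term)
    return chain[min(levels, len(chain) - 1)]


def relaxed_select(rows: List[dict], attribute: str, value: str, levels: int) -> List[dict]:
    """Keep rows whose attribute value is equal to, or a descendant of,
    the selection value generalized by `levels` steps."""
    target = generalize(value, levels)
    return [row for row in rows if target in ancestors(row[attribute])]


# Illustrative data only.
ROWS = [
    {"name": "A", "city": "Rome"},
    {"name": "B", "city": "Milan"},
    {"name": "C", "city": "Paris"},
]
```

With `levels=0` the selection behaves like an ordinary equality test; each additional level widens the answer to the siblings under the next more general term.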

Notes

  1. “MOKA: an infrastructure for public transit integrated car pooling”, a project funded by Politecnico di Milano. Website: http://moka.necst.it/app/index.html

  2. GenData 2020 (“Data-Driven Genomic Computing”) is a project funded by MIUR (Italian Ministry of Education, University and Research) involving a large consortium of Italian universities: see http://gendata.weebly.com/

  3. http://www.geneontology.org

Acknowledgments

The authors acknowledge support from the EC’s FP7 “CUbRIK” project and from the Italian “GenData” PRIN project.

Author information

Corresponding author

Correspondence to Riccardo Torlone.

About this article

Cite this article

Martinenghi, D., Torlone, R. Taxonomy-based relaxation of query answering in relational databases. The VLDB Journal 23, 747–769 (2014). https://doi.org/10.1007/s00778-013-0350-x

  • DOI: https://doi.org/10.1007/s00778-013-0350-x
