Query answering over uncertain RDF knowledge bases: explain and obviate unsuccessful query results

Abstract

Several large uncertain knowledge bases (KBs) are available on the Web, in which each fact is associated with a certainty degree. When querying these uncertain KBs, users seek high-quality results, i.e., results whose certainty degree is greater than a given threshold \(\alpha\). However, as users usually have only partial knowledge of the KB contents, their queries may fail, i.e., return no result for the desired certainty level. To prevent this frustrating situation, instead of returning an empty set of answers, our approach explains the reasons for the failure with a set of \(\alpha\)-minimal failing subqueries (\(\alpha\)MFSs) and computes alternative relaxed queries, called \(\alpha\)-maXimal succeeding subqueries (\(\alpha\)XSSs), that are as close as possible to the initial failing query. Moreover, as the user may not always be able to provide an appropriate threshold \(\alpha\), we propose three algorithms to compute the \(\alpha\)MFSs and \(\alpha\)XSSs for other thresholds, which also constitutes relevant feedback for the user. Multiple experiments with the WatDiv benchmark show the relevance of our algorithms compared to a baseline method.
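To make the two notions concrete, the following Python sketch enumerates every subquery of a small conjunctive query over a toy uncertain KB and reports its \(\alpha\)MFSs and \(\alpha\)XSSs. It is only an illustration under stated assumptions: the data, the helper names (`match`, `answers`, `succeeds`, `mfs_and_xss`), the use of `min` as the trust aggregation function, and the inclusive comparison with \(\alpha\) are all hypothetical choices, and the exhaustive enumeration is a naive stand-in, not one of the three algorithms proposed in the paper.

```python
from itertools import combinations

# Toy uncertain KB of (subject, predicate, object, trust) quads. All data is invented.
KB = [
    ("alice", "worksAt", "acme", 0.9),
    ("alice", "livesIn", "paris", 0.4),
    ("bob", "worksAt", "acme", 0.8),
    ("bob", "livesIn", "lyon", 0.7),
    ("acme", "locatedIn", "paris", 0.6),
]

# A conjunctive query as a list of triple patterns; terms starting with "?" are variables.
T1 = ("?p", "worksAt", "?c")
T2 = ("?p", "livesIn", "?city")
T3 = ("?c", "locatedIn", "?city")
QUERY = [T1, T2, T3]


def match(pattern, quad, binding):
    """Return `binding` extended so that `pattern` matches `quad`, or None."""
    new = dict(binding)
    for term, value in zip(pattern, quad[:3]):
        if term.startswith("?"):
            if new.setdefault(term, value) != value:
                return None  # variable already bound to a different value
        elif term != value:
            return None  # constant mismatch
    return new


def answers(query, kb):
    """Return (binding, trust) pairs; trust is the min over the quads used (assumption)."""
    results = [({}, 1.0)]
    for pattern in query:
        extended = []
        for binding, trust in results:
            for quad in kb:
                new = match(pattern, quad, binding)
                if new is not None:
                    extended.append((new, min(trust, quad[3])))
        results = extended
    return results


def succeeds(query, kb, alpha):
    """A query succeeds at alpha if some answer reaches the threshold (inclusive, assumed)."""
    return any(trust >= alpha for _, trust in answers(query, kb))


def mfs_and_xss(query, kb, alpha):
    """Naive exponential enumeration of all non-empty subqueries."""
    subqueries = [
        frozenset(combo)
        for size in range(1, len(query) + 1)
        for combo in combinations(query, size)
    ]
    ok = {sq: succeeds(sorted(sq), kb, alpha) for sq in subqueries}
    failing = [sq for sq in subqueries if not ok[sq]]
    succeeding = [sq for sq in subqueries if ok[sq]]
    # Minimal failing: no failing proper subquery.
    mfs = [sq for sq in failing if not any(other < sq for other in failing)]
    # Maximal succeeding: no succeeding proper superquery.
    xss = [sq for sq in succeeding if not any(sq < other for other in succeeding)]
    return mfs, xss


if __name__ == "__main__":
    for alpha in (0.5, 0.3):
        mfs, xss = mfs_and_xss(QUERY, KB, alpha)
        print(alpha, [sorted(q) for q in mfs], [sorted(q) for q in xss])
```

With this toy data, the full query fails at \(\alpha = 0.5\): the only \(\alpha\)MFS is the pair of patterns `T2` and `T3`, and the two \(\alpha\)XSSs are `{T1, T2}` and `{T1, T3}`. Lowering the threshold to 0.3 makes the full query succeed, so there is no \(\alpha\)MFS and the query itself is the single \(\alpha\)XSS, which illustrates why feedback for other thresholds can be useful.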

Notes

  1. http://jena.apache.org/documentation/tdb/.
  2. To improve readability, we use names instead of URIs to identify query elements.
  3. Aggreg is the function used for assigning trust values to query results.
  4. For simplicity, this definition is restricted to sets but could be extended to multisets.
  5. http://jena.apache.org/documentation/tdb/quadfilter.html.
  6. https://www.w3.org/wiki/PropertyReificationVocabulary.

Author information

Corresponding author

Correspondence to Stéphane Jean.

About this article

Cite this article

Dellal, I., Jean, S., Hadjali, A. et al. Query answering over uncertain RDF knowledge bases: explain and obviate unsuccessful query results. Knowl Inf Syst 61, 1633–1665 (2019). https://doi.org/10.1007/s10115-019-01332-7

Keywords

  • Uncertain knowledge bases
  • RDF quad
  • SPARQL queries
  • Empty answers
  • Named graph
  • Reification
  • Quadstore