CubeQA—Question Answering on RDF Data Cubes

  • Konrad Höffner
  • Jens Lehmann
  • Ricardo Usbeck
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9981)


Statistical data in the form of RDF Data Cubes is becoming increasingly valuable as it influences decisions in areas such as health care, policy and finance. While a growing amount is becoming freely available through the open data movement, this data is opaque to laypersons. Semantic Question Answering (SQA) technologies provide intuitive access via free-form natural language queries, but general SQA systems cannot process RDF Data Cubes. At the intersection of RDF Data Cubes and SQA, we create a new subfield of SQA, called RDCQA. We create an RDCQA benchmark as task 3 of the QALD-6 evaluation challenge to stimulate further research and enable quantitative comparison between RDCQA systems. We design and evaluate the domain-independent CubeQA algorithm, which is the first RDCQA system and achieves a global \(F_1\) score of 0.43 on the QALD6T3-test benchmark, showing that RDCQA is feasible.


Keywords (added by machine, not by the authors): Parse Tree, Question Answering, Data Cube, SPARQL Query, Component Property



This work was supported by a grant from the EU H2020 Framework Programme provided for the project HOBBIT (GA no. 688227).



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Konrad Höffner (1)
  • Jens Lehmann (2, 3)
  • Ricardo Usbeck (1)
  1. University of Leipzig, Institute of Computer Science, AKSW Group, Leipzig, Germany
  2. Computer Science Institute, University of Bonn, Bonn, Germany
  3. Knowledge Discovery Department, Fraunhofer IAIS, Sankt Augustin, Germany
