AMUSE: Multilingual Semantic Parsing for Question Answering over Linked Data

  • Sherzod Hakimov
  • Soufian Jebbara
  • Philipp Cimiano
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10587)


The task of answering natural language questions over RDF data has received wide interest in recent years, in particular in the context of the QALD benchmark series. The task consists of mapping a natural language question to an executable form, e.g. SPARQL, so that answers can be extracted from a given knowledge base (KB). So far, most proposed systems (i) are monolingual and (ii) rely on a set of hard-coded rules to interpret questions and map them into a SPARQL query. We present the first multilingual QALD pipeline that induces a model from training data, treating the mapping of a natural language question into a logical form as probabilistic inference. In particular, our approach learns to map universal syntactic dependency representations to a language-independent logical form based on DUDES (Dependency-based Underspecified Discourse Representation Structures), which is then mapped to a SPARQL query in a deterministic second step. Our model builds on factor graphs whose features are extracted from the dependency graph and the corresponding semantic representations. We rely on approximate inference techniques, in particular Markov Chain Monte Carlo methods, and on SampleRank to update parameters using a ranking objective. Our focus lies on developing methods that overcome the lexical gap, and we present a novel combination of machine translation and word embedding approaches for this purpose. As a proof of concept, we evaluate our approach on the QALD-6 datasets for English, German and Spanish.
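
The learning scheme named in the abstract (MCMC-style proposals combined with SampleRank updates under a ranking objective) can be illustrated with a small toy sketch. This is not the authors' implementation. The indicator features, the single-label proposal, the accuracy-based objective, the greedy acceptance rule, the example question and labels, and all function names are assumptions chosen only to make the idea concrete. The scoring model here is a plain linear model over (token, label) features, which corresponds to a factor graph with unary factors only.

```python
import random
from collections import defaultdict


def features(tokens, labels):
    """Sparse indicator features: one per (token, label) pair (assumed feature set)."""
    feats = defaultdict(float)
    for tok, lab in zip(tokens, labels):
        feats[(tok, lab)] += 1.0
    return feats


def score(weights, feats):
    """Linear model score of a state given its features."""
    return sum(weights[k] * v for k, v in feats.items())


def objective(labels, gold):
    """Training objective: number of labels that match the gold assignment."""
    return sum(1 for a, b in zip(labels, gold) if a == b)


def propose(labels, candidates, rng):
    """Proposal: resample the label of one randomly chosen token."""
    i = rng.randrange(len(labels))
    new = list(labels)
    new[i] = rng.choice(candidates)
    return new


def sample_rank(tokens, gold, candidates, steps=500, lr=1.0, seed=0):
    """SampleRank-style training loop over a chain of proposed states."""
    rng = random.Random(seed)
    weights = defaultdict(float)
    current = [rng.choice(candidates) for _ in tokens]
    for _ in range(steps):
        proposal = propose(current, candidates, rng)
        f_cur, f_new = features(tokens, current), features(tokens, proposal)
        obj_diff = objective(proposal, gold) - objective(current, gold)
        model_diff = score(weights, f_new) - score(weights, f_cur)
        # Update only when the model ranks the pair differently from the objective.
        better = worse = None
        if obj_diff > 0 and model_diff <= 0:
            better, worse = f_new, f_cur
        elif obj_diff < 0 and model_diff >= 0:
            better, worse = f_cur, f_new
        if better is not None:
            for k, v in better.items():
                weights[k] += lr * v
            for k, v in worse.items():
                weights[k] -= lr * v
        # Greedy acceptance during training: never move to a worse state.
        if obj_diff >= 0:
            current = proposal
    return weights, current


def predict(weights, tokens, candidates):
    """Pick the highest-weighted label for each token independently."""
    return [max(candidates, key=lambda lab: weights[(tok, lab)]) for tok in tokens]


if __name__ == "__main__":
    # Hypothetical toy question and label inventory, for illustration only.
    tokens = ["who", "wrote", "Hamlet"]
    gold = ["NONE", "dbo:author", "dbr:Hamlet"]
    candidates = ["NONE", "dbo:author", "dbr:Hamlet", "dbo:Person"]
    weights, final_state = sample_rank(tokens, gold, candidates)
    print(final_state)
    print(predict(weights, tokens, candidates))
```

The update is perceptron-like: weights move toward the features of the state the objective prefers whenever the model's ranking of the two neighbouring states disagrees with the objective's ranking. A realistic system would use richer factors over the dependency graph and a different acceptance strategy at test time, when no gold assignment is available.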


Keywords: Question answering, Multilinguality, QALD, Probabilistic graphical models, Factor graphs



This work was supported by the Cluster of Excellence Cognitive Interaction Technology ‘CITEC’ (EXC 277) at Bielefeld University, which is funded by the German Research Foundation (DFG).



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Sherzod Hakimov (1)
  • Soufian Jebbara (1)
  • Philipp Cimiano (1)
  1. Semantic Computing Group, Cognitive Interaction Technology – Center of Excellence (CITEC), Bielefeld University, Bielefeld, Germany
