Abstract
Cross-lingual and multilingual question answering is a critical part of a successful and accessible natural language interface. However, many current solutions are unsatisfactory. We believe that recent developments in deep learning are likely to be effective for question answering tasks spanning several languages. This work discusses current achievements and remaining challenges. We outline requirements and suggestions for practical parallel data collection and describe existing methods and datasets. We also demonstrate that simply translating texts can be inadequate for Arabic, English and German (on the InsuranceQA and SemEval datasets), and thus more sophisticated models are required. We hope that our findings will ignite interest in neural approaches to multilingual question answering.
Notes
1. The parameters are as follows: skip-gram, window 5, negative-sampling rate −1/1000.
Acknowledgements
This work was partially supported by the German Federal Ministry of Education and Research (BMBF) through the project DEEPLEE (01IW17001).
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Loginova, E., Varanasi, S., Neumann, G. (2018). Towards Multilingual Neural Question Answering. In: Benczúr, A., et al. New Trends in Databases and Information Systems. ADBIS 2018. Communications in Computer and Information Science, vol 909. Springer, Cham. https://doi.org/10.1007/978-3-030-00063-9_26
DOI: https://doi.org/10.1007/978-3-030-00063-9_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00062-2
Online ISBN: 978-3-030-00063-9
eBook Packages: Computer Science, Computer Science (R0)