Abstract
Neural language models trained on large-scale conversational corpora such as OpenSubtitles have recently demonstrated the ability to simulate conversation and to answer questions that require common-sense knowledge, suggesting that such networks may actually learn to represent and use common-sense knowledge extracted from dialog corpora. If this is the case, large-scale conversational models could be applied to information retrieval (IR) tasks, including question answering, document retrieval, and other problems that require measuring semantic similarity. In this work we analyze the behavior of a number of neural network architectures trained on a Russian conversation corpus containing 20 million dialog turns. We find that small to medium-sized networks do not learn any noticeable common-sense knowledge and operate purely on the level of syntactic features, while large, very deep networks do possess some common-sense knowledge.
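The probing setup sketched in the abstract can be illustrated with a toy example: a conversational language model scores candidate answers to a question by likelihood, and the resulting ranking shows whether any common-sense preference was absorbed from the dialog corpus. The bigram model, corpus, and question below are illustrative stand-ins invented for this sketch, not the networks or data used in the paper.

```python
from collections import defaultdict
import math

class BigramLM:
    """Toy bigram language model standing in for a large conversational
    network; trained on a tiny illustrative 'dialog' corpus."""
    def __init__(self, corpus):
        self.unigrams = defaultdict(int)
        self.bigrams = defaultdict(int)
        for sentence in corpus:
            tokens = ["<s>"] + sentence.split() + ["</s>"]
            for a, b in zip(tokens, tokens[1:]):
                self.unigrams[a] += 1
                self.bigrams[(a, b)] += 1
        self.vocab = len(self.unigrams) + 1  # for add-one smoothing

    def logprob(self, sentence):
        # Add-one smoothed log-probability of the token sequence.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        lp = 0.0
        for a, b in zip(tokens, tokens[1:]):
            lp += math.log((self.bigrams[(a, b)] + 1)
                           / (self.unigrams[a] + self.vocab))
        return lp

# Illustrative stand-in for a dialog corpus of question-answer turns.
corpus = [
    "what color is the sky the sky is blue",
    "the sky is blue and grass is green",
]
lm = BigramLM(corpus)

def best_answer(question, candidates):
    # Probe: rank candidate answers by likelihood of question + answer.
    return max(candidates, key=lambda c: lm.logprob(question + " " + c))

print(best_answer("what color is the sky",
                  ["the sky is blue", "the sky is green"]))
```

Here the model prefers "the sky is blue" only because that association occurs in its training text, which mirrors the paper's question: whether the preferences a conversational model exhibits reflect genuine common-sense knowledge or merely surface statistics of the corpus.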
© 2018 Springer International Publishing AG
Tarasov, D.S., Izotova, E.D. (2018). Common Sense Knowledge in Large Scale Neural Conversational Models. In: Kryzhanovsky, B., Dunin-Barkowski, W., Redko, V. (eds) Advances in Neural Computation, Machine Learning, and Cognitive Research. NEUROINFORMATICS 2017. Studies in Computational Intelligence, vol 736. Springer, Cham. https://doi.org/10.1007/978-3-319-66604-4_6
Print ISBN: 978-3-319-66603-7
Online ISBN: 978-3-319-66604-4