Abstract
Neural language models trained on large-scale conversational corpora such as OpenSubtitles have recently demonstrated the ability to simulate conversation and to answer questions that require common-sense knowledge, suggesting that such networks may actually learn to represent and use common-sense knowledge extracted from dialog corpora. If this is the case, large-scale conversational models could be applied to information retrieval (IR) tasks, including question answering, document retrieval, and other problems that require measuring semantic similarity. In this work we analyze the behavior of a number of neural network architectures trained on a Russian conversation corpus containing 20 million dialog turns. We find that small to medium-sized networks do not learn any noticeable common-sense knowledge and operate purely on the level of syntactic features, while large, very deep networks do possess some common-sense knowledge.
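The probing setup sketched in the abstract can be illustrated with a toy example: a conversational language model scores candidate answers to a question by likelihood, and the resulting ranking shows whether any common-sense preference was absorbed from the dialog corpus. The bigram model, corpus, and question below are illustrative stand-ins invented for this sketch, not the networks or data used in the paper.

```python
from collections import defaultdict
import math

class BigramLM:
    """Toy bigram language model standing in for a large conversational
    network; trained on a tiny illustrative 'dialog' corpus."""
    def __init__(self, corpus):
        self.unigrams = defaultdict(int)
        self.bigrams = defaultdict(int)
        for sentence in corpus:
            tokens = ["<s>"] + sentence.split() + ["</s>"]
            for a, b in zip(tokens, tokens[1:]):
                self.unigrams[a] += 1
                self.bigrams[(a, b)] += 1
        self.vocab = len(self.unigrams) + 1  # for add-one smoothing

    def logprob(self, sentence):
        # Add-one smoothed log-probability of the token sequence.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        lp = 0.0
        for a, b in zip(tokens, tokens[1:]):
            lp += math.log((self.bigrams[(a, b)] + 1)
                           / (self.unigrams[a] + self.vocab))
        return lp

# Illustrative stand-in for a dialog corpus of question-answer turns.
corpus = [
    "what color is the sky the sky is blue",
    "the sky is blue and grass is green",
]
lm = BigramLM(corpus)

def best_answer(question, candidates):
    # Probe: rank candidate answers by likelihood of question + answer.
    return max(candidates, key=lambda c: lm.logprob(question + " " + c))

print(best_answer("what color is the sky",
                  ["the sky is blue", "the sky is green"]))
```

Here the model prefers "the sky is blue" only because that association occurs in its training text, which mirrors the paper's question: whether the preferences a conversational model exhibits reflect genuine common-sense knowledge or merely surface statistics of the corpus.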
© 2018 Springer International Publishing AG
Tarasov, D.S., Izotova, E.D. (2018). Common Sense Knowledge in Large Scale Neural Conversational Models. In: Kryzhanovsky, B., Dunin-Barkowski, W., Redko, V. (eds) Advances in Neural Computation, Machine Learning, and Cognitive Research. NEUROINFORMATICS 2017. Studies in Computational Intelligence, vol 736. Springer, Cham. https://doi.org/10.1007/978-3-319-66604-4_6
Print ISBN: 978-3-319-66603-7
Online ISBN: 978-3-319-66604-4