Advertisement

Supervised Mover’s Distance: A Simple Model for Sentence Comparison

  • Muktabh Mayank Srivastava
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 930)

Abstract

We propose a simple neural network model which can learn relation between sentences by passing their representations obtained from Long Short Term Memory (LSTM) through a Relation Network. The Relation Network module tries to extract similarity between multiple contextual representations obtained from LSTM. The aim is to build a model which is simple to implement, light in terms of parameters and works across multiple supervised sentence comparison tasks. We show good results for the model on two sentence comparison datasets.

Keywords

Supervised Mover’s Distance Sentence comparison Paraphrase detection Natural language inference 

References

  1. 1.
    Cheng, J., Kartsaklis, D.: Syntax-aware multi-sense word embeddings for deep compositional models of meaning. arXiv preprint arXiv:1508.02354 (2015)
  2. 2.
    Croce, D., Moschitti, A., Basili, R.: Structured lexical similarity via convolution kernels on dependency trees. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 1034–1046. Association for Computational Linguistics (2011)Google Scholar
  3. 3.
    Dolan, B., Brockett, C.: Automatically constructing a corpus of sentential paraphrases. In: Third International Workshop on Paraphrasing (IWP2005). Asia Federation of Natural Language Processing, January 2005. https://www.microsoft.com/en-us/research/publication/automatically-constructing-a-corpus-of-sentential-paraphrases/
  4. 4.
    Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)CrossRefGoogle Scholar
  5. 5.
    Huang, G., Guo, C., Kusner, M.J., Sun, Y., Sha, F., Weinberger, K.Q.: Supervised word mover’s distance. In: Advances in Neural Information Processing Systems, pp. 4862–4870 (2016)Google Scholar
  6. 6.
    Iyar, S., Dandekar, N., Csernai, K.: First quora dataset release: question pairs, January 2016. https://data.quora.com/First-Quora-Dataset-Release-Question-Pairs
  7. 7.
    Ji, Y., Eisenstein, J.: Discriminative improvements to distributional sentence similarity. In: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp. 891–896 (2013)Google Scholar
  8. 8.
    Kusner, M.J., Sun, Y., Kolkin, N.I., Weinberger, K.Q.: From word embeddings to document distances. In: Proceedings of the 32nd International Conference on International Conference on Machine Learning, ICML 2015, vol. 37, pp. 957–966. JMLR.org (2015). http://dl.acm.org/citation.cfm?id=3045118.3045221
  9. 9.
    Pennington, J., Socher, R., Manning, C.D.: GloVe: global vectors for word representation. In: Empirical Methods in Natural Language Processing (EMNLP), pp. 1532–1543 (2014). http://www.aclweb.org/anthology/D14-1162
  10. 10.
    Rubner, Y., Tomasi, C., Guibas, L.J.: A metric for distributions with applications to image databases. In: Sixth International Conference on Computer Vision, pp. 59–66. IEEE (1998)Google Scholar
  11. 11.
    Santoro, A., et al.: A simple neural network module for relational reasoning. In: Advances in Neural Information Processing Systems, pp. 4974–4983 (2017)Google Scholar
  12. 12.
    Wang, Z., Hamza, W., Florian, R.: Bilateral multi-perspective matching for natural language sentences. arXiv preprint arXiv:1702.03814 (2017)

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. 1.ParallelDots, Inc.GurugramIndia

Personalised recommendations