Adversarial Training for Relation Classification with Attention Based Gate Mechanism

  • Pengfei Cao
  • Yubo Chen
  • Kang Liu
  • Jun Zhao
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 957)

Abstract

In recent years, deep neural networks have achieved significant success in relation classification and many other natural language processing tasks. However, existing neural networks for relation classification rely heavily on the quality of labelled data and tend to be overconfident in the presence of noisy input signals, which limits their robustness and generalization. In this paper, we apply adversarial training to relation classification by adding perturbations to the input vectors of a bidirectional long short-term memory network rather than to the original inputs themselves. In addition, we propose an attention-based gate module, which not only discerns the important information when learning sentence representations but also adaptively concatenates sentence-level and lexical-level features. Experiments on the SemEval-2010 Task 8 benchmark dataset show that our model significantly outperforms other state-of-the-art models.
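The two ideas summarized above can be sketched in a small toy example. This is an illustrative reconstruction, not the paper's actual architecture: the random matrix `H` stands in for BiLSTM hidden states, `lex` for lexical-level features, and a linear classifier replaces the full network; all shapes, weight names, and the scalar gate are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# --- (a) Attention pooling + gated feature concatenation ------------------
T, d, dl = 6, 8, 4
H = rng.normal(size=(T, d))      # stand-in for BiLSTM hidden states
lex = rng.normal(size=dl)        # stand-in for lexical-level features

w_att = rng.normal(size=d)
alpha = softmax(H @ w_att)       # one attention weight per time step
s = alpha @ H                    # sentence-level representation

W_g = rng.normal(size=d + dl) * 0.1
g = sigmoid(np.concatenate([s, lex]) @ W_g)      # scalar gate in (0, 1)
fused = np.concatenate([g * s, (1.0 - g) * lex])  # adaptive concatenation

# --- (b) Adversarial perturbation on input vectors, not raw tokens --------
C = 4
W = rng.normal(size=(C, d)) * 0.1                # toy relation classifier
y = 2                                            # gold relation label

def loss_and_grad(x):
    p = softmax(W @ x)
    err = p.copy()
    err[y] -= 1.0
    return -np.log(p[y]), W.T @ err              # cross-entropy, d loss/d x

clean_loss, grad = loss_and_grad(s)
eps = 0.1
r_adv = eps * grad / (np.linalg.norm(grad) + 1e-12)  # bounded-norm step
adv_loss, _ = loss_and_grad(s + r_adv)

# The perturbation follows the loss gradient, so the loss does not decrease;
# training would minimize the clean and adversarial losses jointly.
assert adv_loss >= clean_loss
```

In this toy setting the perturbation direction can be computed analytically; in the paper's setting the gradient with respect to the embedded input would come from backpropagation through the BiLSTM.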

Keywords

Relation classification · Adversarial training · Attention-based gate mechanism

Notes

Acknowledgments

This work was supported by the Natural Science Foundation of China (No. 61533018, No. 61702512 and No. 61502493). This work was also supported by Alibaba Group through the Alibaba Innovative Research (AIR) Program and by Huawei Technologies Co. Ltd. through the Huawei Innovation Research Program.

References

  1. Bahdanau, D., Cho, K., Bengio, Y.: Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473 (2014)
  2. Bunescu, R.C., Mooney, R.J.: A shortest path dependency kernel for relation extraction. In: Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, pp. 724–731. ACL (2005)
  3. Chen, J., Tandon, N., de Melo, G.: Neural word representations from large-scale commonsense knowledge. In: 2015 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT), pp. 225–228 (2015)
  4. Chen, J., Ji, D., Tan, C.L., Niu, Z.: Unsupervised feature selection for relation extraction. In: Companion Volume to the Proceedings of the Conference Including Posters/Demos and Tutorial Abstracts (2005)
  5. Elman, J.L.: Finding structure in time. Cogn. Sci. 14(2), 179–211 (1990)
  6. Goodfellow, I., et al.: Generative adversarial nets. In: Advances in Neural Information Processing Systems, pp. 2672–2680 (2014)
  7. Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572 (2014)
  8. GuoDong, Z., Jian, S., Jie, Z., Min, Z.: Exploring various knowledge in relation extraction. In: Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics, pp. 427–434. ACL (2005)
  9. Hashimoto, K., Miwa, M., Tsuruoka, Y., Chikayama, T.: Simple customization of recursive neural networks for semantic relation classification. In: Proceedings of the 2013 Conference on EMNLP, pp. 1372–1376 (2013)
  10. Hendrickx, I., et al.: SemEval-2010 task 8: multi-way classification of semantic relations between pairs of nominals. In: Proceedings of the Workshop on Semantic Evaluations: Recent Achievements and Future Directions, pp. 94–99. ACL (2009)
  11. Hermann, K.M., et al.: Teaching machines to read and comprehend. In: Advances in Neural Information Processing Systems, pp. 1693–1701 (2015)
  12. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
  13. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  14. Liu, Y., Wei, F., Li, S., Ji, H., Zhou, M., Wang, H.: A dependency-based neural network for relation classification. arXiv preprint arXiv:1507.04646 (2015)
  15. Miyato, T., Dai, A.M., Goodfellow, I.: Adversarial training methods for semi-supervised text classification. arXiv preprint arXiv:1605.07725 (2016)
  16. Mooney, R.J., Bunescu, R.C.: Subsequence kernels for relation extraction. In: Advances in Neural Information Processing Systems, pp. 171–178 (2006)
  17. Pennington, J., Socher, R., Manning, C.: GloVe: global vectors for word representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1532–1543 (2014)
  18. Qian, L., Zhou, G., Kong, F., Zhu, Q., Qian, P.: Exploiting constituent dependencies for tree kernel-based semantic relation extraction. In: Proceedings of the 22nd International Conference on Computational Linguistics, pp. 697–704. ACL (2008)
  19. Rink, B., Harabagiu, S.: UTD: classifying semantic relations by combining lexical and semantic resources. In: Proceedings of the 5th International Workshop on Semantic Evaluation, pp. 256–259. ACL (2010)
  20. Santos, C.N.D., Xiang, B., Zhou, B.: Classifying relations by ranking with convolutional neural networks. arXiv preprint arXiv:1504.06580 (2015)
  21. Socher, R., Huval, B., Manning, C.D., Ng, A.Y.: Semantic compositionality through recursive matrix-vector spaces. In: Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp. 1201–1211. ACL (2012)
  22. Socher, R., Pennington, J., Huang, E.H., Ng, A.Y., Manning, C.D.: Semi-supervised recursive autoencoders for predicting sentiment distributions. In: Proceedings of the Conference on EMNLP, pp. 151–161. ACL (2011)
  23. Szegedy, C., et al.: Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199 (2013)
  24. Wu, F., Weld, D.S.: Open information extraction using Wikipedia. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pp. 118–127. ACL (2010)
  25. Wu, Y., Bamman, D., Russell, S.: Adversarial training for relation extraction. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 1778–1783 (2017)
  26. Xie, Z., et al.: Data noising as smoothing in neural network language models. arXiv preprint arXiv:1703.02573 (2017)
  27. Xu, K., et al.: Show, attend and tell: neural image caption generation with visual attention. In: International Conference on Machine Learning, pp. 2048–2057 (2015)
  28. Xu, Y., Mou, L., Li, G., Chen, Y., Peng, H., Jin, Z.: Classifying relations via long short term memory networks along shortest dependency paths. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 1785–1794 (2015)
  29. Yao, X., Van Durme, B.: Information extraction over structured data: question answering with Freebase. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, pp. 956–966 (2014)
  30. Zeng, D., Liu, K., Lai, S., Zhou, G., Zhao, J.: Relation classification via convolutional deep neural network. In: Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers, pp. 2335–2344 (2014)
  31. Zhang, S., Zheng, D., Hu, X., Yang, M.: Bidirectional long short-term memory networks for relation classification. In: Proceedings of the 29th Pacific Asia Conference on Language, Information and Computation, pp. 73–78 (2015)
  32. Zhou, P., et al.: Attention-based bidirectional long short-term memory networks for relation classification. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pp. 207–212 (2016)

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. University of Chinese Academy of Sciences, Beijing, China
  2. Institute of Automation, Chinese Academy of Sciences, Beijing, China
