
Adversarial Training for Relation Classification with Attention Based Gate Mechanism

  • Conference paper

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 957))

Abstract

In recent years, deep neural networks have achieved significant success in relation classification and many other natural language processing tasks. However, existing neural networks for relation classification rely heavily on the quality of labelled data and tend to be overconfident about noise in the input signals, which limits their robustness and generalization. In this paper, we apply adversarial training to relation classification by adding perturbations to the input vectors of a bidirectional long short-term memory network rather than to the original input itself. In addition, we propose an attention-based gate module that not only discerns the important information when learning sentence representations but also adaptively concatenates sentence-level and lexical-level features. Experiments on the SemEval-2010 Task 8 benchmark dataset show that our model significantly outperforms other state-of-the-art models.
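The two ingredients named in the abstract, adversarial perturbations applied to the input embeddings and a gate that fuses sentence-level with lexical-level features, can be sketched roughly as follows. This is a minimal NumPy illustration, assuming an L2-normalised worst-case perturbation in the style of Miyato et al.'s adversarial training for text and a per-dimension sigmoid gate; the function names, the value of `epsilon`, and the exact gate parameterisation are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def adversarial_perturbation(grad, epsilon=0.02):
    """Worst-case perturbation: scale the gradient of the loss w.r.t.
    the input embeddings to a fixed L2 norm epsilon."""
    norm = np.linalg.norm(grad)
    if norm == 0.0:
        return np.zeros_like(grad)
    return epsilon * grad / norm

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_concat(sent_feat, lex_feat, W_s, W_l, b):
    """Gate mechanism: a sigmoid gate decides, per dimension, how much
    of the sentence-level vs. lexical-level feature to keep."""
    g = sigmoid(W_s @ sent_feat + W_l @ lex_feat + b)
    return g * sent_feat + (1.0 - g) * lex_feat

# Toy example: embeddings of a 3-token sentence, dimension 4.
embeddings = np.ones((3, 4))
grad = np.array([[1.0, 0.0, 0.0, 0.0],
                 [0.0, 2.0, 0.0, 0.0],
                 [0.0, 0.0, 2.0, 0.0]])
r_adv = adversarial_perturbation(grad, epsilon=0.02)
perturbed = embeddings + r_adv   # train on both clean and perturbed inputs

# Gate with zero weights gives g = 0.5 everywhere, i.e. plain averaging;
# learned weights let the model weight the two feature sources adaptively.
d = 4
s, l = np.ones(d), np.zeros(d)
fused = gated_concat(s, l, np.zeros((d, d)), np.zeros((d, d)), np.zeros(d))
```

Note that the perturbation is added to the embedding vectors, not to the discrete tokens, which is what makes adversarial training applicable to text input at all.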



Acknowledgments

This work was supported by the National Natural Science Foundation of China (No. 61533018, No. 61702512 and No. 61502493). It was also supported by Alibaba Group through the Alibaba Innovative Research (AIR) Program and by Huawei Technologies Co., Ltd. through the Huawei Innovation Research Program.

Author information

Correspondence to Jun Zhao.


Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Cao, P., Chen, Y., Liu, K., Zhao, J. (2019). Adversarial Training for Relation Classification with Attention Based Gate Mechanism. In: Zhao, J., Harmelen, F., Tang, J., Han, X., Wang, Q., Li, X. (eds) Knowledge Graph and Semantic Computing. Knowledge Computing and Language Understanding. CCKS 2018. Communications in Computer and Information Science, vol 957. Springer, Singapore. https://doi.org/10.1007/978-981-13-3146-6_8

Download citation

  • DOI: https://doi.org/10.1007/978-981-13-3146-6_8

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-3145-9

  • Online ISBN: 978-981-13-3146-6

  • eBook Packages: Computer Science (R0)
