
Adversarial Training for Relation Classification with Attention Based Gate Mechanism

  • Conference paper

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 957))

Abstract

In recent years, deep neural networks have achieved significant success in relation classification and many other natural language processing tasks. However, existing neural networks for relation classification rely heavily on the quality of labelled data and tend to be overconfident about noise in the input signals, which limits their robustness and generalization. In this paper, we apply adversarial training to relation classification by adding perturbations to the input vectors of a bidirectional long short-term memory network rather than to the original input itself. In addition, we propose an attention-based gate module that not only discerns the important information when learning sentence representations but also adaptively concatenates sentence-level and lexical-level features. Experiments on the SemEval-2010 Task 8 benchmark dataset show that our model significantly outperforms other state-of-the-art models.
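The two ingredients named in the abstract, adversarial perturbations applied to the input embeddings and a gate that fuses sentence-level with lexical-level features, can be sketched roughly as follows. This is a minimal NumPy illustration, assuming an L2-normalised worst-case perturbation in the style of Miyato et al.'s adversarial training for text and a per-dimension sigmoid gate; the function names, the value of `epsilon`, and the exact gate parameterisation are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def adversarial_perturbation(grad, epsilon=0.02):
    """Worst-case perturbation: scale the gradient of the loss w.r.t.
    the input embeddings to a fixed L2 norm epsilon."""
    norm = np.linalg.norm(grad)
    if norm == 0.0:
        return np.zeros_like(grad)
    return epsilon * grad / norm

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_concat(sent_feat, lex_feat, W_s, W_l, b):
    """Gate mechanism: a sigmoid gate decides, per dimension, how much
    of the sentence-level vs. lexical-level feature to keep."""
    g = sigmoid(W_s @ sent_feat + W_l @ lex_feat + b)
    return g * sent_feat + (1.0 - g) * lex_feat

# Toy example: embeddings of a 3-token sentence, dimension 4.
embeddings = np.ones((3, 4))
grad = np.array([[1.0, 0.0, 0.0, 0.0],
                 [0.0, 2.0, 0.0, 0.0],
                 [0.0, 0.0, 2.0, 0.0]])
r_adv = adversarial_perturbation(grad, epsilon=0.02)
perturbed = embeddings + r_adv   # train on both clean and perturbed inputs

# Gate with zero weights gives g = 0.5 everywhere, i.e. plain averaging;
# learned weights let the model weight the two feature sources adaptively.
d = 4
s, l = np.ones(d), np.zeros(d)
fused = gated_concat(s, l, np.zeros((d, d)), np.zeros((d, d)), np.zeros(d))
```

Note that the perturbation is added to the embedding vectors, not to the discrete tokens, which is what makes adversarial training applicable to text input at all.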



Acknowledgments

This work was supported by the National Natural Science Foundation of China (No. 61533018, No. 61702512 and No. 61502493). It was also supported by Alibaba Group through the Alibaba Innovative Research (AIR) Program and by Huawei Technologies Co., Ltd. through the Huawei Innovation Research Program.

Author information

Correspondence to Jun Zhao.


Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Cao, P., Chen, Y., Liu, K., Zhao, J. (2019). Adversarial Training for Relation Classification with Attention Based Gate Mechanism. In: Zhao, J., Harmelen, F., Tang, J., Han, X., Wang, Q., Li, X. (eds) Knowledge Graph and Semantic Computing. Knowledge Computing and Language Understanding. CCKS 2018. Communications in Computer and Information Science, vol 957. Springer, Singapore. https://doi.org/10.1007/978-981-13-3146-6_8

Download citation

  • DOI: https://doi.org/10.1007/978-981-13-3146-6_8

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-3145-9

  • Online ISBN: 978-981-13-3146-6

  • eBook Packages: Computer Science (R0)
