Abstract
In natural language, a group of words constitutes a phrase, and several phrases constitute a sentence. However, existing transformer-based models for sentence-level tasks abstract sentence-level semantics directly from word-level semantics. This overrides phrase-level semantics, so these models may be less suited to capturing precise semantics. To resolve this problem, we propose a novel multi-granularity semantic representation (MGSR) model for relation extraction. The model bridges the semantic gap between low-level and high-level semantic abstraction by successively learning word-level, phrase-level, and sentence-level semantic representations. We segment a sentence into entity chunks and context chunks according to an entity pair, so that the sentence is represented as a non-empty segmentation set. The entity chunks are noun phrases, and the context chunks contain the key phrases expressing semantic relations. The MGSR model then uses three different self-attention mechanisms (inter-word, inner-chunk, and inter-chunk, respectively) to learn the multi-granularity semantic representations. Experiments on two standard datasets demonstrate that our model outperforms previous models.
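The segmentation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `segment_by_entity_pair` and `inner_chunk_mask`, the half-open token spans, the dropping of empty context chunks, and the reading of inner-chunk attention as "a token attends only to tokens in its own chunk" are all assumptions made for illustration.

```python
from typing import List, Tuple

Span = Tuple[int, int]          # half-open (start, end) token indices (assumed convention)
Chunk = Tuple[str, List[str]]   # (kind, tokens), where kind is "entity" or "context"


def segment_by_entity_pair(tokens: List[str], e1: Span, e2: Span) -> List[Chunk]:
    """Split a sentence into entity chunks and context chunks (hypothetical helper).

    The two entity spans cut the sentence into at most five pieces:
    context, entity, context, entity, context. Empty context pieces are dropped.
    """
    first, second = sorted([e1, e2])
    if not (0 <= first[0] < first[1] <= second[0] < second[1] <= len(tokens)):
        raise ValueError("entity spans must be non-empty, non-overlapping, and in range")
    pieces = [
        ("context", 0, first[0]),
        ("entity", first[0], first[1]),
        ("context", first[1], second[0]),
        ("entity", second[0], second[1]),
        ("context", second[1], len(tokens)),
    ]
    return [(kind, tokens[start:end]) for kind, start, end in pieces if start < end]


def inner_chunk_mask(chunks: List[Chunk]) -> List[List[bool]]:
    """Boolean mask: entry [i][j] is True when tokens i and j share a chunk.

    Assumed interpretation of inner-chunk self-attention, for illustration only.
    """
    chunk_ids = [idx for idx, (_, toks) in enumerate(chunks) for _ in toks]
    return [[a == b for b in chunk_ids] for a in chunk_ids]


# Example sentence and entity spans chosen for illustration.
tokens = "The burst has been caused by water hammer pressure".split()
chunks = segment_by_entity_pair(tokens, e1=(1, 2), e2=(6, 9))
mask = inner_chunk_mask(chunks)
```

Under these assumptions, an inter-word mechanism would use no mask at all, and an inter-chunk mechanism would operate on one pooled vector per chunk rather than on individual tokens.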
Acknowledgements
We would like to thank the anonymous reviewers. This work is supported by the National Key Research and Development Program of China (No. 2016QY03D0602 and No. 2017YFB0803302) and the National Natural Science Foundation of China (No. 61751201).
Ethics declarations
Conflict of interest
There is no conflict of interest.
About this article
Cite this article
Lei, M., Huang, H. & Feng, C. Multi-granularity semantic representation model for relation extraction. Neural Comput & Applic 33, 6879–6889 (2021). https://doi.org/10.1007/s00521-020-05464-8
DOI: https://doi.org/10.1007/s00521-020-05464-8