Complex Named Entity Recognition via Deep Multi-task Learning from Scratch

Chen, Guangyu; Liu, Tao; Zhang, Deyuan; Yu, Bo; Wang, Baoxun

doi:10.1007/978-3-319-99495-6_19

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11108)

Included in the following conference series:

CCF International Conference on Natural Language Processing and Chinese Computing

Abstract

Named Entity Recognition (NER) is a preliminary task for many basic NLP technologies, and deep neural networks have shown promise on it. However, the NER tasks covered in previous work are relatively simple: they focus on classic entity categories (Persons, Locations, Organizations) and fail to meet the requirements of newly emerging application scenarios, which involve more informal entity categories or even hierarchical category structures. In this paper, we propose a subtask learning strategy based on multi-task learning to combat the complexity of modern NER tasks. We conduct experiments on a complex Chinese NER task, and the experimental results demonstrate the effectiveness of our approach.

This work is supported by the visiting scholar program of the China Scholarship Council and the National Natural Science Foundation of China (Grant No. 61472428 and No. U1711262). The work was done when the first author was an intern at Tricorn (Beijing) Technology Co., Ltd.

Bo Yu is currently working at Baidu, Inc.

Notes

1. https://dueros.baidu.com/
2. https://developer.amazon.com/alexa
3. https://catalog.ldc.upenn.edu/ldc2016t13
4. CATER, HOTEL, SCENE, PROD_TAG, PROD_BRAND, FILM, MUSIC, TV, ENT_OTHER.

Author information

Authors and Affiliations

School of Information, Renmin University of China, Beijing, China
Guangyu Chen & Tao Liu
School of Computer, Shenyang Aerospace University, Shenyang, China
Deyuan Zhang
Tricorn (Beijing) Technology Co., Ltd, Beijing, China
Bo Yu & Baoxun Wang

Corresponding author

Correspondence to Deyuan Zhang.

Editor information

Editors and Affiliations

Soochow University, Suzhou, China
Min Zhang
The University of Texas at Dallas, Richardson, Texas, USA
Vincent Ng
Peking University, Beijing, China
Dongyan Zhao
Peking University, Beijing, China
Sujian Li
Zhengzhou University, Zhengzhou, China
Hongying Zan

About this paper

Cite this paper

Chen, G., Liu, T., Zhang, D., Yu, B., Wang, B. (2018). Complex Named Entity Recognition via Deep Multi-task Learning from Scratch. In: Zhang, M., Ng, V., Zhao, D., Li, S., Zan, H. (eds) Natural Language Processing and Chinese Computing. NLPCC 2018. Lecture Notes in Computer Science, vol 11108. Springer, Cham. https://doi.org/10.1007/978-3-319-99495-6_19

DOI: https://doi.org/10.1007/978-3-319-99495-6_19
Published: 14 August 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99494-9
Online ISBN: 978-3-319-99495-6
eBook Packages: Computer Science, Computer Science (R0)
