SC-NER: A Sequence-to-Sequence Model with Sentence Classification for Named Entity Recognition

  • Yu Wang
  • Yun LiEmail author
  • Ziye Zhu
  • Bin Xia
  • Zheng Liu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11439)


Named Entity Recognition (NER) is a basic task in Natural Language Processing (NLP). Recently, the sequence-to-sequence (seq2seq) model has been widely used in NLP task. Different from the general NLP task, 60% sentences in the NER task do not contain entities. Traditional seq2seq method cannot address this issue effectively. To solve the aforementioned problem, we propose a novel seq2seq model, named SC-NER, for NER task. We construct a classifier between the encoder and decoder. In particular, the classifier’s input is the last hidden state of the encoder. Moreover, we present the restricted beam search to improve the performance of the proposed SC-NER. To evaluate our proposed model, we construct the patent documents corpus in the communications field, and conduct experiments on it. Experimental results show that our SC-NER model achieves better performance than other baseline methods.


Named Entity Recognition Sequence-to-sequence model Deep learning 



This work was partially supported by Natural Science Foundation of China (No. 61603197, 61772284, 61876091, 61802205), Jiangsu Provincial Natural Science Foundation of China under Grant BK20171447, Jiangsu Provincial University Natural Science Research of China under Grant 17KJB520024, the Natural Science Research Project of Jiangsu Province under Grant 18KJB520037, and Nanjing University of Posts and Telecommunications under Grant NY215045.


  1. 1.
    Deft ERE annotation guidelines: Entities. Linguistics Data Consortium (2014)Google Scholar
  2. 2.
    Graves, A., Mohamed, A., Hinton, G.: Speech recognition with deep recurrent neural networks. In: IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 38, pp. 6645–6649. IEEE (2013)Google Scholar
  3. 3.
    Alotaibi, F., Lee, M.: A hybrid approach to features representation for fine-grained Arabic named entity recognition. In: Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers, pp. 984–995 (2014)Google Scholar
  4. 4.
    Bengio, Y., Simard, P., Frasconi, P.: Learning long-term dependencies with gradient descent is difficult. IEEE Trans. Neural Netw. 5, 157–166 (1994)CrossRefGoogle Scholar
  5. 5.
    Bowman, S.R., Vilnis, L., Vinyals, O., Dai, A.M., Jozefowicz, R., Bengio, S.: Generating sentences from a continuous space. arXiv preprint arXiv:1511.06349 (2015)
  6. 6.
    Cheng, Y., et al.: Agreement-based joint training for bidirectional attention-based neural machine translation. arXiv preprint arXiv:1512.04650 (2015)
  7. 7.
    Chiu, J., Nichols, E.: Named entity recognition with bidirectional LSTM-CNNs. Comput. Sci. (2015)Google Scholar
  8. 8.
    Cho, K., et al.: Learning phrase representations using RNN encoder-decoder for statistical machine translation. Comput. Sci. (2014)Google Scholar
  9. 9.
    Collobert, R., Weston, J., Karlen, M., Kavukcuoglu, K., Kuksa, P.: Natural language processing (almost) from scratch. J. Mach. Learn. Res. 12, 2493–2537 (2011)zbMATHGoogle Scholar
  10. 10.
    Doersch, C.: Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908 (2016)
  11. 11.
    Dong, C., Zhang, J., Zong, C., Hattori, M., Di, H.: Character-based LSTM-CRF with radical-level features for Chinese named entity recognition. In: Lin, C.-Y., Xue, N., Zhao, D., Huang, X., Feng, Y. (eds.) ICCPOL/NLPCC -2016. LNCS (LNAI), vol. 10102, pp. 239–250. Springer, Cham (2016). Scholar
  12. 12.
    Gehring, J., Auli, M., Grangier, D., Dauphin, Y.N.: A convolutional encoder model for neural machine translation. arXiv preprint arXiv:1611.02344 (2016)
  13. 13.
    Gehring, J., Auli, M., Grangier, D., Yarats, D., Dauphin, Y.N.: Convolutional sequence to sequence learning. arXiv preprint arXiv:1705.03122 (2017)
  14. 14.
    Graves, A., Mohamed, A.R., Hinton, G.: Speech recognition with deep recurrent neural networks. In: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6645–6649. IEEE (2013)Google Scholar
  15. 15.
    Hai, L., Ng, H.: Named entity recognition with a maximum entropy approach. In: Conference on Natural Language Learning at HLT-NAACL, pp. 160–163 (2003)Google Scholar
  16. 16.
    Ji, Y., Tan, C., Martschat, S., Choi, Y., Smith, N.A.: Dynamic entity representations in neural language models. arXiv preprint arXiv:1708.00781 (2017)
  17. 17.
    Keith, K.A., Handler, A., Pinkham, M., Magliozzi, C., McDuffie, J., O’Connor, B.: Identifying civilians killed by police with distantly supervised entity-event extraction. arXiv preprint arXiv:1707.07086 (2017)
  18. 18.
    Kim, Y.: Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882 (2014)
  19. 19.
    Konstas, I., Iyer, S., Yatskar, M., Choi, Y., Zettlemoyer, L.: Neural AMR: Sequence-to-sequence models for parsing and generation (2017)Google Scholar
  20. 20.
    Lample, G., Ballesteros, M., Subramanian, S., Kawakami, K., Dyer, C.: Neural architectures for named entity recognition. arXiv preprint arXiv:1603.01360 (2016)
  21. 21.
    Li, J., Monroe, W., Jurafsky, D.: A simple, fast diverse decoding algorithm for neural generation. arXiv preprint arXiv:1611.08562 (2016)
  22. 22.
    Li, P.H., Dong, R.P., Wang, Y.S., Chou, J.C., Ma, W.Y.: Leveraging linguistic structures for named entity recognition with bidirectional recursive neural networks. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 2664–2669 (2017)Google Scholar
  23. 23.
    Lin, B.Y., Xu, F., Luo, Z., Zhu, K.: Multi-channel BiLSTM-CRF model for emerging named entity recognition in social media. In: Proceedings of the 3rd Workshop on Noisy User-generated Text, pp. 160–165 (2017)Google Scholar
  24. 24.
    Sundermeyer, M., Ney, H., Schluter, R.: From feedforward to recurrent LSTM neural networks for language modeling. IEEE/ACM Trans. Audio Speech Lang. Process. 23(3), 517–529 (2015)CrossRefGoogle Scholar
  25. 25.
    Ma, X., Hovy, E.: End-to-end sequence labeling via bi-directional LSTM-CNNs-CRF. arXiv preprint arXiv:1603.01354 (2016)
  26. 26.
    Mccallum, A., Li, W.: Early results for named entity recognition with conditional random fields, feature induction and web-enhanced lexicons. In: Conference on Natural Language Learning at HLT-NAACL, pp. 188–191 (2003)Google Scholar
  27. 27.
    Nadeau, D., Sekine, S.: A survey of named entity recognition and classification. Lingvisticae Investig. 30(1), 3–26 (2007)CrossRefGoogle Scholar
  28. 28.
    Passos, A., Kumar, V., McCallum, A.: Lexicon infused phrase embeddings for named entity resolution. arXiv preprint arXiv:1404.5367 (2014)
  29. 29.
    Peng, N., Dredze, M.: Improving named entity recognition for Chinese social media with word segmentation representation learning. In: Meeting of the Association for Computational Linguistics, Berlin, Germany, pp. 149–155, August 2016Google Scholar
  30. 30.
    Shao, L., Gouws, S., Britz, D., Goldie, A., Strope, B.: Generating high-quality and informative conversation responses with sequence-to-sequence models (2017)Google Scholar
  31. 31.
    Su, J., Su, J.: Named entity recognition using an HMM-based chunk tagger. In: Meeting on Association for Computational Linguistics, pp. 473–480 (2002)Google Scholar
  32. 32.
    Sutskever, I., Vinyals, O., Le, Q.V.: Sequence to sequence learning with neural networks. In: Advances in Neural Information Processing Systems, pp. 3104–3112 (2014)Google Scholar
  33. 33.
    Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, pp. 6000–6010 (2017)Google Scholar
  34. 34.
    Wiseman, S., Rush, A.M.: Sequence-to-sequence learning as beam-search optimization. arXiv preprint arXiv:1606.02960 (2016)
  35. 35.
    Wu, Y., Jiang, M., Lei, J., Xu, H.: Named entity recognition in Chinese clinical text using deep neural network. Stud. Health Technol. Inform. 216, 624–628 (2015)Google Scholar
  36. 36.
    Yao, Y., Huang, Z.: Bi-directional LSTM recurrent neural network for Chinese word segmentation. In: Hirose, A., Ozawa, S., Doya, K., Ikeda, K., Lee, M., Liu, D. (eds.) ICONIP 2016. LNCS, vol. 9950, pp. 345–353. Springer, Cham (2016). Scholar

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. 1.Jiangsu Key Laboratory of Big Data Security and Intelligent Processing, School of Computer ScienceNanjing University of Posts and TelecommunicationsNanjingChina

Personalised recommendations