Deep Learning for Character-Based Information Extraction

  • Yanjun Qi
  • Sujatha G. Das
  • Ronan Collobert
  • Jason Weston
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8416)

Abstract

In this paper we introduce a deep neural network architecture for information extraction on character-based sequences, e.g., named-entity recognition on Chinese text or secondary-structure detection on protein sequences. With a task-independent architecture, the deep network relies only on simple character-based features, which obviates the need for task-specific feature engineering. The proposed discriminative framework includes three important strategies: (1) a deep learning module that maps characters to vector representations, capturing the semantic relationships between characters; (2) abundant unlabeled online sequences, used to improve the vector representations through semi-supervised learning; and (3) explicit modeling of the spatial dependencies among output labels within the deep architecture. Experiments on four benchmark datasets demonstrate that the proposed architecture consistently achieves state-of-the-art performance.
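The three strategies above can be illustrated with a minimal sketch. The code below is a hypothetical toy version, not the authors' exact model: it scores each label at each position from a sliding window of character embeddings (strategy 1), and decodes the best label sequence with Viterbi over a label-transition matrix, a common way to model output-label dependencies (strategy 3). All names (`E`, `W`, `A`, `emission_scores`, `viterbi`) and the tiny vocabulary are assumptions for illustration; the semi-supervised pretraining of the embeddings (strategy 2) is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: a character vocabulary, an embedding table,
# a window-based scorer, and label-transition scores.
chars = list("abcde_")                      # '_' pads sequence boundaries
char_to_id = {c: i for i, c in enumerate(chars)}
emb_dim, window, n_labels = 4, 3, 2         # e.g. inside/outside an entity

E = rng.normal(size=(len(chars), emb_dim))          # character embeddings
W = rng.normal(size=(n_labels, window * emb_dim))   # per-position label scorer
A = rng.normal(size=(n_labels, n_labels))           # label-transition scores

def emission_scores(seq):
    """Score each label at each position from a window of char embeddings."""
    pad = "_" * (window // 2)
    padded = pad + seq + pad
    feats = [np.concatenate([E[char_to_id[c]] for c in padded[i:i + window]])
             for i in range(len(seq))]
    return np.stack(feats) @ W.T            # shape (len(seq), n_labels)

def viterbi(scores):
    """Best label sequence under per-position scores plus transitions A."""
    n, k = scores.shape
    best = scores[0].copy()                 # best score ending in each label
    back = np.zeros((n, k), dtype=int)      # backpointers
    for t in range(1, n):
        cand = best[:, None] + A + scores[t][None, :]   # (prev, cur)
        back[t] = cand.argmax(axis=0)
        best = cand.max(axis=0)
    path = [int(best.argmax())]
    for t in range(n - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

labels = viterbi(emission_scores("abcade"))
print(labels)   # one label id (0 or 1) per input character
```

In the full model the window features would pass through nonlinear hidden layers and all parameters would be trained discriminatively; here they are random, so only the shapes and the decoding logic are meaningful.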

Keywords

Information Extraction, Deep Learning, Conditional Random Field, Output Label, Deep Neural Network

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Yanjun Qi (1, 2)
  • Sujatha G. Das (3)
  • Ronan Collobert (4)
  • Jason Weston (5)

  1. Department of Computer Science, University of Virginia, USA
  2. Machine Learning Department, NEC Labs America, USA
  3. Computer Science Department, Penn State University, USA
  4. IDIAP Research Institute, Switzerland
  5. Google, New York, USA