Deep Learning for Character-Based Information Extraction
In this paper we introduce a deep neural network architecture for information extraction on character-based sequences, e.g., named-entity recognition on Chinese text or secondary-structure detection on protein sequences. Because the architecture is task-independent, the deep network relies only on simple character-based features, which obviates the need for task-specific feature engineering. The proposed discriminative framework includes three important strategies: (1) a deep learning module maps characters to vector representations to capture the semantic relationships between characters; (2) abundant unlabeled online sequences are used to improve the vector representations through semi-supervised learning; and (3) the spatial dependency constraints among output labels are modeled explicitly in the deep architecture. Experiments on four benchmark datasets demonstrate that the proposed architecture consistently achieves state-of-the-art performance.
Keywords: Information Extraction, Deep Learning, Conditional Random Field, Output Label, Deep Neural Network
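As a rough illustration of strategy (3), the sketch below shows one common way to model dependencies among output labels: Viterbi decoding over per-character label scores plus label-to-label transition scores. This is not taken from the paper; the function name `viterbi`, the three-label O/B/I scheme, and all numeric scores are hypothetical, and in a real system the emission scores would presumably come from the network applied to character embeddings.

```python
def viterbi(emissions, transitions):
    """Return the highest-scoring label sequence (as label indices).

    emissions[t][k]:   score of label k at position t (hypothetical values,
                       standing in for per-character network outputs).
    transitions[j][k]: score of moving from label j to label k.
    """
    n_labels = len(transitions)
    # best[k] = best score of any path ending in label k at the current position
    best = list(emissions[0])
    backpointers = []
    for scores in emissions[1:]:
        prev_best = best
        best, pointers = [], []
        for k in range(n_labels):
            # choose the best previous label j for current label k
            j = max(range(n_labels), key=lambda j: prev_best[j] + transitions[j][k])
            pointers.append(j)
            best.append(prev_best[j] + transitions[j][k] + scores[k])
        backpointers.append(pointers)
    # trace back from the best final label
    label = max(range(n_labels), key=lambda k: best[k])
    path = [label]
    for pointers in reversed(backpointers):
        label = pointers[label]
        path.append(label)
    path.reverse()
    return path


# Assumed label indices: 0 = O (outside), 1 = B (begin entity), 2 = I (inside entity).
# The transition O -> I is heavily penalized because an entity cannot start with I.
transitions = [
    [0.0, 0.0, -10.0],  # from O
    [0.0, 0.0, 0.0],    # from B
    [0.0, 0.0, 0.0],    # from I
]
# Two characters; picking the best label per character independently gives O, I.
emissions = [
    [1.0, 0.8, 0.0],
    [0.0, 0.0, 2.0],
]
decoded = viterbi(emissions, transitions)
```

With these made-up scores, choosing labels independently per character would yield the invalid sequence O, I, whereas decoding with the transition scores yields B, I. This is the kind of constraint that modeling output-label dependencies explicitly is meant to capture.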
Supplementary material (December 2013): http://www.cs.cmu.edu/~qyj/zhSenna/