Knowledge and Information Systems, Volume 41, Issue 3, pp 611–646

Tackling representation, annotation and classification challenges for temporal knowledge base population

Regular Paper

DOI: 10.1007/s10115-013-0675-1

Cite this article as:
Ji, H., Cassidy, T., Li, Q. et al. Knowl Inf Syst (2014) 41: 611. doi:10.1007/s10115-013-0675-1

Abstract

Temporal Information Extraction (TIE) plays an important role in many natural language processing and database applications. Temporal slot filling (TSF) is a new and ambitious TIE task prepared for the knowledge base population (KBP2011) track of NIST Text Analysis Conference. TSF requires systems to discover temporally bound facts about entities and their attributes in order to populate a structured knowledge base. In this paper, we will provide an overview of the unique challenges of this new task and our novel approaches to address these challenges. We present challenges from three perspectives: (1) Temporal information representation: We will review the relevant linguistic semantic theories of temporal information and their limitations, motivating the need to develop a new (4-tuple) representation framework for the task. (2) Annotation acquisition: The lack of substantial labeled training data for supervised learning is a limiting factor in the design of TSF systems. Our work examines the use of multi-class logistic regression methods to improve the labeling quality of training data obtained by distant supervision. (3) Temporal information classification: Another key challenge lies in capturing relations between salient text elements separated by a long context. We develop two approaches for temporal classification and combine them through cross-document aggregation: a flat approach that uses lexical context and shallow dependency features and a structured approach that captures long syntactic contexts by using a dependency path kernel tailored for this task. Experimental results demonstrated that our annotation enhancement approach dramatically increased the speed of the training procedure (by almost 100 times), and that the flat and structured classification approaches were complementary, together yielding a state-of-the-art TSF system.
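The abstract names a 4-tuple representation and cross-document aggregation without spelling either out. The sketch below is one plausible reading, not the authors' implementation: it assumes the KBP2011 convention in which a tuple (t1, t2, t3, t4) bounds the start of a fact between t1 and t2 and its end between t3 and t4, and it assumes aggregation tightens those bounds by taking the maximum of lower bounds and the minimum of upper bounds. The names `TemporalTuple`, `merge`, and `aggregate` are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable

# Sentinels standing in for "unbounded" (an assumption of this sketch).
NEG_INF = date.min
POS_INF = date.max


@dataclass(frozen=True)
class TemporalTuple:
    """Assumed 4-tuple: the fact starts in [t1, t2] and ends in [t3, t4]."""
    t1: date = NEG_INF  # earliest possible start
    t2: date = POS_INF  # latest possible start
    t3: date = NEG_INF  # earliest possible end
    t4: date = POS_INF  # latest possible end

    def is_consistent(self) -> bool:
        # Each interval must be non-empty, and the start cannot follow the end.
        return self.t1 <= self.t2 and self.t3 <= self.t4 and self.t1 <= self.t4


def merge(a: TemporalTuple, b: TemporalTuple) -> TemporalTuple:
    """Tighten bounds: max of lower bounds, min of upper bounds."""
    return TemporalTuple(
        max(a.t1, b.t1), min(a.t2, b.t2),
        max(a.t3, b.t3), min(a.t4, b.t4),
    )


def aggregate(tuples: Iterable[TemporalTuple]) -> TemporalTuple:
    """Fold per-document tuples into one, skipping any that would contradict."""
    result = TemporalTuple()
    for t in tuples:
        candidate = merge(result, t)
        if candidate.is_consistent():
            result = candidate
    return result


# One document says the fact held on 2003-05-01 (start <= day <= end);
# another says it started on 2001-01-20.
holds = TemporalTuple(t2=date(2003, 5, 1), t3=date(2003, 5, 1))
start = TemporalTuple(t1=date(2001, 1, 20), t2=date(2001, 1, 20),
                      t3=date(2001, 1, 20))
combined = aggregate([holds, start])
```

Under these assumptions, `combined` pins the start to 2001-01-20 and leaves the end bounded below by 2003-05-01 and unbounded above. A tuple that would make the bounds contradictory is simply skipped here; a real system would more likely weight evidence by classifier confidence.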

Keywords

Temporal slot filling, Knowledge base population, Temporal representation, Distant supervision, Dependency kernel, Instance relabeling

Copyright information

© Springer-Verlag London 2013

Authors and Affiliations

  • Heng Ji (1)
  • Taylor Cassidy (2)
  • Qi Li (1)
  • Suzanne Tamang (3)

  1. Computer Science Department, Rensselaer Polytechnic Institute, Troy, USA
  2. Linguistics Department, Graduate Center, City University of New York, New York, USA
  3. Computer Science Department, Graduate Center, City University of New York, New York, USA
