Abstract
In this paper, we address the task of extracting first order temporal facts from free text. This task is a subtask of relation extraction that aims to extract relations between entities and time expressions. Currently, relation extraction research focuses mainly on relations between entities. However, we observe that the multi-granular nature of time expressions can help us divide a dataset constructed by distant supervision into reliable and less reliable subsets, which in turn helps improve the extraction of relations between entities and time. We accordingly contribute the first dataset for the first order temporal fact extraction task, built using distant supervision. To fully utilize both the reliable and the less reliable data, we propose three methods: curriculum learning to rearrange the training procedure, label dropout to make the model more conservative about less reliable data, and instance attention to help the model distinguish important instances from unimportant ones. Experiments show that these methods help the model outperform both a model trained purely on the reliable subset and a model trained on all subsets mixed together.
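As a rough illustration of the granularity-based split and staged training order mentioned in the abstract, the sketch below groups distantly supervised instances by the granularity of their time mention and lists the groups in a caller-chosen order. It is a minimal sketch under stated assumptions, not the authors' implementation: the ISO-like time format, the instance fields (`entity`, `time`, `relation`), the function names, and the stage order passed at the bottom are all hypothetical, and the abstract does not say which granularity is treated as more reliable.

```python
import re
from collections import defaultdict

# Hypothetical ISO-8601-like patterns, checked from finest to coarsest.
# The normalisation actually used for the dataset is not described in the abstract.
GRANULARITY_PATTERNS = [
    ("day", re.compile(r"\d{4}-\d{2}-\d{2}")),
    ("month", re.compile(r"\d{4}-\d{2}")),
    ("year", re.compile(r"\d{4}")),
]


def granularity(time_expr):
    """Return 'day', 'month', 'year', or 'unknown' for a normalised time string."""
    for name, pattern in GRANULARITY_PATTERNS:
        if pattern.fullmatch(time_expr):
            return name
    return "unknown"


def split_by_granularity(instances):
    """Group instances by the granularity of their time mention."""
    subsets = defaultdict(list)
    for inst in instances:
        subsets[granularity(inst["time"])].append(inst)
    return dict(subsets)


def curriculum_stages(subsets, order):
    """List (subset_name, instances) pairs in the requested training order,
    skipping subsets that are empty or absent."""
    return [(name, subsets[name]) for name in order if subsets.get(name)]


# Toy instances; the field names and values are illustrative only.
instances = [
    {"entity": "Barack Obama", "time": "1961-08-04", "relation": "date_of_birth"},
    {"entity": "Barack Obama", "time": "1961", "relation": "date_of_birth"},
    {"entity": "Barack Obama", "time": "1961-08", "relation": "date_of_birth"},
]

subsets = split_by_granularity(instances)
# The stage order is an assumption made for this example, not taken from the paper.
stages = curriculum_stages(subsets, order=("year", "month", "day"))
for name, batch in stages:
    print(name, len(batch))
```

Keeping the stage order as an explicit argument makes it easy to compare different curricula, or to collapse all subsets into a single mixed stage as a baseline.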
Notes
- 1. www.wikidata.org. It is a rapidly growing knowledge base, and Freebase (www.freebase.com) is migrating its data to it.
- 4. The dataset can be downloaded from: github.com/pfllo/TemporalFactExtraction.
Acknowledgement
This work was supported by the National High Technology R&D Program of China (Grant Nos. 2015AA015403, 2014AA015102), the Natural Science Foundation of China (Grant Nos. 61202233, 61272344, 61370055), and the joint project with IBM Research. Please address any correspondence to Yansong Feng.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Luo, B., Feng, Y., Wang, Z., Zhao, D. (2016). Improving First Order Temporal Fact Extraction with Unreliable Data. In: Lin, CY., Xue, N., Zhao, D., Huang, X., Feng, Y. (eds) Natural Language Understanding and Intelligent Applications. ICCPOL 2016, NLPCC 2016. Lecture Notes in Computer Science, vol 10102. Springer, Cham. https://doi.org/10.1007/978-3-319-50496-4_21
DOI: https://doi.org/10.1007/978-3-319-50496-4_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50495-7
Online ISBN: 978-3-319-50496-4
eBook Packages: Computer Science, Computer Science (R0)