Improving First Order Temporal Fact Extraction with Unreliable Data

  • Conference paper
Natural Language Understanding and Intelligent Applications (ICCPOL 2016, NLPCC 2016)

Abstract

In this paper, we address the task of extracting first order temporal facts from free text. This task is a subtask of relation extraction that aims to extract relations between an entity and a time expression. Currently, the field of relation extraction focuses mainly on relations between entities. However, we observe that the multi-granular nature of time expressions can help us divide a dataset constructed by distant supervision into reliable and less reliable subsets, which in turn can improve the extraction of relations between entities and time. We accordingly contribute the first dataset focusing on the first order temporal fact extraction task, built using distant supervision. To fully utilize both the reliable and the less reliable data, we propose curriculum learning to rearrange the training procedure, label dropout to make the model more conservative about less reliable data, and instance attention to help the model distinguish important instances from unimportant ones. Experiments show that these methods help the model outperform both a model trained purely on the reliable subset and a model trained on a dataset in which all subsets are mixed together.
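
The abstract does not give implementation details, so the following is only a minimal sketch of how two of the ideas above might look in code: splitting distantly supervised instances by time granularity, ordering them as a curriculum, and applying label dropout. Everything in it is an assumption made for illustration: the `Instance` type, the ISO-like `time_expr` strings, the rule that full dates count as reliable while coarser expressions count as less reliable, the two-stage schedule, and the choice to drop a label by zeroing its one-hot target. None of these names or choices is taken from the paper. Instance attention is not covered.

```python
import random
from dataclasses import dataclass


@dataclass
class Instance:
    sentence: str
    time_expr: str  # assumed normalized form: "1961", "1961-08" or "1961-08-04"
    label: int      # index of the relation assigned by distant supervision


def granularity(time_expr: str) -> int:
    """Count the date fields present: 1 = year, 2 = year-month, 3 = full date."""
    return len(time_expr.split("-"))


def split_by_reliability(instances, min_fields=3):
    """Assumption: finer-grained time expressions give more reliable labels."""
    reliable = [x for x in instances if granularity(x.time_expr) >= min_fields]
    less_reliable = [x for x in instances if granularity(x.time_expr) < min_fields]
    return reliable, less_reliable


def curriculum_stages(instances):
    """Hypothetical two-stage schedule: reliable data first, then everything."""
    reliable, less_reliable = split_by_reliability(instances)
    return [reliable, reliable + less_reliable]


def label_dropout(label, num_classes, p, rng):
    """With probability p, replace the one-hot target by an all-zero vector.

    This is one possible reading of "label dropout", not the paper's definition.
    """
    if rng.random() < p:
        return [0.0] * num_classes
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


if __name__ == "__main__":
    data = [
        Instance("sentence a", "1961", 1),
        Instance("sentence b", "1961-08-04", 1),
        Instance("sentence c", "1961-08", 2),
    ]
    rng = random.Random(0)
    for stage_no, stage in enumerate(curriculum_stages(data), start=1):
        for inst in stage:
            # Only the less reliable instances have their labels dropped.
            p = 0.0 if granularity(inst.time_expr) >= 3 else 0.3
            target = label_dropout(inst.label, 3, p, rng)
            print(stage_no, inst.sentence, target)
```

In this sketch the dropout probability is zero for reliable instances, so only the coarser-grained instances are ever presented with a suppressed target; the value 0.3 is arbitrary.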

Notes

  1. www.wikidata.org. It is a rapidly growing knowledge base, and Freebase (www.freebase.com) is migrating its data to it.

  2. www.wikipedia.org.

  3. www.wikidata.org/w/api.php.

  4. The dataset can be downloaded from: github.com/pfllo/TemporalFactExtraction.

Acknowledgement

This work was supported by the National High Technology R&D Program of China (Grant Nos. 2015AA015403, 2014AA015102), the Natural Science Foundation of China (Grant Nos. 61202233, 61272344, 61370055), and the joint project with IBM Research. Please direct any correspondence to Yansong Feng.

Corresponding author

Correspondence to Yansong Feng.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Luo, B., Feng, Y., Wang, Z., Zhao, D. (2016). Improving First Order Temporal Fact Extraction with Unreliable Data. In: Lin, C.-Y., Xue, N., Zhao, D., Huang, X., Feng, Y. (eds) Natural Language Understanding and Intelligent Applications. ICCPOL 2016, NLPCC 2016. Lecture Notes in Computer Science, vol 10102. Springer, Cham. https://doi.org/10.1007/978-3-319-50496-4_21

  • DOI: https://doi.org/10.1007/978-3-319-50496-4_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-50495-7

  • Online ISBN: 978-3-319-50496-4

  • eBook Packages: Computer Science, Computer Science (R0)
