Abstract
This paper presents a method for extracting temporal information from raw Polish text. The extracted information consists of the text fragments that describe events, the time expressions, and the temporal relations between them. Combined with temporal reasoning, it can be used in applications such as question answering, text summarization, and information extraction. First, a bilingual corpus was used to project temporal annotations from English to Polish. The projected data was then corrected manually and used to induce classifiers based on Conditional Random Fields (CRF) and a Support Vector Machine (SVM). To evaluate this task, we propose a cross-language method that compares the system’s results with results reported for other languages. The comparison shows that the temporal relations classifier presented here outperforms state-of-the-art systems for English on the macro-averaged F1-measure, which is well suited to this multiclass classification task.
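To illustrate why the macro-averaged F1-measure suits a multiclass task like temporal relation classification, the following minimal sketch compares it with plain accuracy on a skewed label distribution. The labels `BEFORE` and `AFTER` and the toy predictions are assumptions made up for illustration; they are not data or results from the paper, and the function names are hypothetical.

```python
from collections import Counter


def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1 over every label seen in gold or predicted."""
    labels = sorted(set(gold) | set(predicted))
    if not labels:
        return 0.0
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p was wrong
            fn[g] += 1  # gold label g was missed
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def accuracy(gold, predicted):
    """Fraction of items whose predicted label matches the gold label."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


# Hypothetical toy data: a classifier that always predicts the majority class.
gold = ["BEFORE"] * 8 + ["AFTER"] * 2
predicted = ["BEFORE"] * 10

print(f"accuracy = {accuracy(gold, predicted):.3f}")
print(f"macro F1 = {macro_f1(gold, predicted):.3f}")
```

In this toy case the majority-class predictor reaches an accuracy of 0.8, but its macro-averaged F1 is only about 0.44, because the class it never predicts contributes an F1 of zero with the same weight as the frequent class.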
Keywords
- temporal information
- temporal relation
- event extraction
- word alignment
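The annotation projection step mentioned in the abstract can be sketched as follows, assuming token-level tags on the English side and a word alignment given as (source index, target index) pairs. The function name, the tag names, and the example sentence pair are hypothetical and only illustrate the general idea; the paper's actual projection procedure and its manual correction stage are not reproduced here.

```python
def project_annotations(source_tags, alignment, target_length, outside="O"):
    """Copy each annotated source token's tag to the target tokens aligned with it.

    source_tags   -- one tag per source token, `outside` meaning "no annotation"
    alignment     -- iterable of (source_index, target_index) pairs
    target_length -- number of tokens in the target sentence
    """
    target_tags = [outside] * target_length
    for src_idx, tgt_idx in alignment:
        tag = source_tags[src_idx]
        if tag != outside:
            # Naive policy (an assumption): a later aligned tag overwrites an earlier one.
            target_tags[tgt_idx] = tag
    return target_tags


# Hypothetical sentence pair and alignment, made up for illustration.
english = ["He", "left", "yesterday"]
english_tags = ["O", "EVENT", "TIMEX"]
polish = ["Wyjechał", "wczoraj"]
alignment = [(0, 0), (1, 0), (2, 1)]

polish_tags = project_annotations(english_tags, alignment, len(polish))
print(list(zip(polish, polish_tags)))
```

Unaligned target tokens stay untagged, and alignment errors propagate directly into the projected tags, which is one reason a manual correction pass over projected data is useful.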
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jarzębowski, P., Przepiórkowski, A. (2012). Temporal Information Extraction with Cross-Language Projected Data. In: Isahara, H., Kanzaki, K. (eds) Advances in Natural Language Processing. JapTAL 2012. Lecture Notes in Computer Science(), vol 7614. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33983-7_20
DOI: https://doi.org/10.1007/978-3-642-33983-7_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33982-0
Online ISBN: 978-3-642-33983-7
eBook Packages: Computer Science, Computer Science (R0)
