Why Don’t Romanians Have a Five O’clock Tea, Nor Halloween, But Have a Kind of Valentines Day?

  • Corina Forăscu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4919)

Abstract

Recently, the focus on temporal information in NLP applications has increased. Building on general temporal theories, annotations and standards, the paper presents the steps taken towards obtaining a parallel English-Romanian corpus with the temporal information marked in both languages. The automatic import of the TimeML markup from English to Romanian has a success rate of 96.53%. The paper analyzes the main situations that appeared during the automatic import: perfect transfer, impossible transfer, transfer with amendments, and transfer for language-specific phenomena. This corpus study makes it possible to decide how import techniques can be used in the temporal domain.

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Corina Forăscu (1)

  1. University Al.I. Cuza of Iaşi, Faculty of Computer Science, Research Institute for Artificial Intelligence, Romanian Academy, 16, Gen. Berthelot, Iaşi – 700483, Romania
