Abstract
This paper presents a new annotation tool for aligned bilingual corpora. The tool supports annotation ranging from word-level information (such as part-of-speech tags or named entities) to quite complex annotation schemas involving links between aligned segments in the two languages, such as co-reference or translation equivalence. It is implemented as a component of the Ellogon language engineering platform, exploiting the platform's extensive annotation engine, its cross-platform abilities and, where needed, its linguistic processing components. The tool is distributed under an open source license (LGPL) as part of the Ellogon language engineering platform.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Petasis, G., Tsoumari, M. (2012). A New Annotation Tool for Aligned Bilingual Corpora. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2012. Lecture Notes in Computer Science, vol 7499. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32790-2_11
DOI: https://doi.org/10.1007/978-3-642-32790-2_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32789-6
Online ISBN: 978-3-642-32790-2
eBook Packages: Computer Science, Computer Science (R0)