Abstract
The European Digital Single Market, one of the main goals of Europe 2020, is still fragmented by language barriers. Language technologies (LT), such as Machine Translation (MT) solutions, are key to overcoming this fragmentation. Building successful MT solutions, however, requires compiling Language Resources (LRs), benchmarking their quality, and facilitating access to them. The LT_Observatory project (2014–2016), funded by the European Commission through the H2020 programme, was developed with these aims. This article describes its main outputs:
- An online catalogue of language resources in existing pools and other national resources, based on pre-identified user needs.
- Methodologies for improving the quality and usability of language resources.
- National and regional language strategies, policies and funding sources to support language technologies.
- An EcoGuide that adapts the findings of the LT_Observatory project for various stakeholder groups, providing practical information on the operational usability of LRs and tools for MT applications, funding opportunities, and recommendations aimed at European, national and regional policy and decision makers.
The project was carried out by a team of five EU partners with complementary expertise: ZABALA (EU project management and community engagement), EMF (European Multimedia Forum, with experience in outreach, social media and funding, e.g. ESIF and combined funding), LT Innovate (the Language Technology Industry Association), CLARIN ERIC (LT resources and infrastructure, including a Virtual Language Observatory), and University of Vienna/InfoTerm (international information centre for terminology).
All the authors contributed equally to this work.
Notes
- 3. E.g. in the Introduction to the 2014/15 work programmes, p. 10, http://ec.europa.eu/research/participants/data/ref/h2020/wp/2014_2015/main/h2020-wp1415-intro_en.pdf.
- 10. Information was gathered through direct contacts with Managing Authorities and agencies in the Member States.
- 11. For more details, see: How does ESIF work? http://www.lt-innovate.org/lt-observe/how-does-esif-work.
- 12. Information was gathered in a first round through desk research, and in a second round through direct contact with the respective agencies. EUREKA and Eurostars information was collected from the EUREKA (http://www.eurekanetwork.org/eureka-countries) and Eurostars (https://www.eurostars-eureka.eu/eurostars-countries/europe) contacts pages.
- 35. The LT_Observatory project contributed to the Strategic Research and Innovation Agenda for the Multilingual Digital Single Market, see http://www.cracking-the-language-barrier.eu/wp-content/uploads/SRIA-V0.9-final-online.pdf.
- 36. Bilingual corpora are “true translation corpora”, whereas two language versions of a multilingual corpus are not necessarily translations of each other.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Maegaard, B. et al. (2017). Observatory for Language Resources and Machine Translation in Europe – LT_Observatory. In: Quesada, J., Martín Mateos, F.J., López Soto, T. (eds) Future and Emerging Trends in Language Technology. Machine Learning and Big Data. FETLT 2016. Lecture Notes in Computer Science, vol 10341. Springer, Cham. https://doi.org/10.1007/978-3-319-69365-1_2
DOI: https://doi.org/10.1007/978-3-319-69365-1_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-69364-4
Online ISBN: 978-3-319-69365-1
eBook Packages: Computer Science (R0)