Appendices

  • Ahmet Aker
  • Radu Ion
  • Nikos Mastropavlos
  • Monica Paramita
  • Mārcis Pinnis
  • Dan Ştefănescu
  • Fangzhong Su
  • Gregor Thurmair
  • Elena Irimia
  • Nikola Ljubešić
  • Evangelos Kanoulas
  • Judita Preiss
  • Rob Gaizauskas
  • Paul Clough
  • Emma Barker
  • Nikos Glaros
  • Tiberiu Boroș
  • Inguna Skadiņa (email author)
  • Andrejs Vasiļjevs
Chapter
Part of the Theory and Applications of Natural Language Processing book series (NLP)

Abstract

The tools that were developed in the ACCURAT project and are presented in this book are packaged in the ACCURAT toolkit (Pinnis et al. 2012a), a collection of tools for collecting comparable corpora and for analysing and extracting parallel data from them. The ACCURAT toolkit produces

References

  1. ACCURAT D2.6. (2012). Toolkit for multi-level alignment and information extraction from comparable corpora. http://www.accurat-project.eu
  2. Adafre, S. F., & de Rijke, M. (2006). Finding similar sentences across multiple languages in Wikipedia. Proceedings of the EACL Workshop on New Text, Trento, Italy.
  3. Baroni, M., & Bernardini, S. (2004). BootCaT: Bootstrapping corpora and terms from the web. Proceedings of LREC 2004 (pp. 1313–1316).
  4. Brown, P. F., Della Pietra, S. A., Della Pietra, V. J., & Mercer, R. L. (1993). The mathematics of statistical machine translation: Parameter estimation. Computational Linguistics, 19(2), 263–311.
  5. Evert, S. (2005). The statistics of word cooccurrences: Word pairs and collocations. PhD thesis, Institut für Maschinelle Sprachverarbeitung, Universität Stuttgart.
  6. Ion, R., Ceauşu, A., & Irimia, E. (2011). An expectation maximization algorithm for textual unit alignment. Proceedings of the 4th Workshop on Building and Using Comparable Corpora (BUCC 2011) held at the 49th Annual Meeting of the Association for Computational Linguistics (pp. 128–135), Portland, OR, June 24th, 2011. (C) 2011 Association for Computational Linguistics. ISBN: 978-1-937284-01-5.
  7. Ion, R. (2012). PEXACC: A parallel data mining algorithm from comparable corpora. Proceedings of LREC 2012, May 21–27, Istanbul, Turkey.
  8. Pecina, P. (2009). Lexical association measures: Collocation extraction. Studies in computational and theoretical linguistics. Prague, Czech Republic: Institute of Formal and Applied Linguistics.
  9. Petrović, S., Šnajder, J., & Bašić, B. D. (2010). Extending lexical association measures for collocation extraction. Computer Speech and Language, 24(2), 383–394.
  10. Pinnis, M. (2012). Latvian and Lithuanian named entity recognition with TildeNER. Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC 2012), Istanbul, Turkey.
  11. Pinnis, M., Ion, R., Ştefănescu, D., Su, F., Skadiņa, I., Vasiļjevs, A., et al. (2012a). Toolkit for multi-level alignment and information extraction from comparable corpora. Proceedings of ACL 2012, System Demonstrations Track, Jeju Island, Republic of Korea, July 8–14, 2012.
  12. Pinnis, M., Ljubešić, N., Ştefănescu, D., Skadiņa, I., Tadić, M., & Gornostay, T. (2012b). Term extraction, tagging, and mapping tools for under-resourced languages. Proceedings of the 10th Conference on Terminology and Knowledge Engineering (TKE 2012), June 20–21, Madrid, Spain.
  13. Skadiņa, I., Aker, A., Giouli, V., Tufiş, D., Gaizauskas, R., Mieriņa, M., et al. (2010). Collection of comparable corpora for under-resourced languages. Proceedings of the Fourth International Conference Baltic HLT 2010, Frontiers in Artificial Intelligence and Applications (Vol. 219, pp. 161–168). IOS Press.
  14. Ştefănescu, D. (2012). Mining for term translations in comparable corpora. Proceedings of the 5th Workshop on Building and Using Comparable Corpora (BUCC 2012) to be held at the 8th edition of Language Resources and Evaluation Conference (LREC 2012), Istanbul, Turkey, May 23–25, 2012.
  15. Ştefănescu, D., Ion, R., & Hunsicker, S. (2012). Hybrid parallel sentence mining from comparable corpora. Proceedings of the 16th Conference of the European Association for Machine Translation (EAMT 2012), Trento, Italy.
  16. Su, F., & Babych, B. (2012a). Development and application of a cross-language document comparability metric. Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC 2012), Istanbul, Turkey.
  17. Su, F., & Babych, B. (2012b). Measuring comparability of documents in non-parallel corpora for efficient extraction of (semi-) parallel translation equivalents. Proceedings of EACL’12 Joint Workshop on Exploiting Synergies Between Information Retrieval and Machine Translation (ESIRMT) and Hybrid Approaches to Machine Translation (HyTra), Avignon, France.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Ahmet Aker (1)
  • Radu Ion (2)
  • Nikos Mastropavlos (3)
  • Monica Paramita (1)
  • Mārcis Pinnis (4)
  • Dan Ştefănescu (2)
  • Fangzhong Su (5)
  • Gregor Thurmair (6)
  • Elena Irimia (2)
  • Nikola Ljubešić (7)
  • Evangelos Kanoulas (1)
  • Judita Preiss (1)
  • Rob Gaizauskas (1)
  • Paul Clough (1)
  • Emma Barker (1)
  • Nikos Glaros (3)
  • Tiberiu Boroș (2)
  • Inguna Skadiņa (4), email author
  • Andrejs Vasiļjevs (4)
  1. University of Sheffield, Sheffield, UK
  2. Romanian Academy, Research Institute for Artificial Intelligence, Bucharest, Romania
  3. Institute for Language and Speech Processing, Athens, Greece
  4. Tilde, Riga, Latvia
  5. University of Leeds, Leeds, UK
  6. Linguatec, Munich, Germany
  7. Faculty of Humanities and Social Sciences, University of Zagreb, Zagreb, Croatia
