Diachronic Stemmed Corpus and Dictionary of Galician Language

  • Nieves R. Brisaboa
  • Juan-Ramón López
  • Miguel R. Penabad
  • Ángeles S. Places
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2588)

Abstract

In this work we present a manually marked-up corpus of the Old Galician language (460 documents, 5,601,290 running words) and a diachronic dictionary extracted from it. We also outline its potential applications, whose implementation is left for future work.

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Nieves R. Brisaboa (1)
  • Juan-Ramón López (1)
  • Miguel R. Penabad (1)
  • Ángeles S. Places (1)
  1. Dep. de Computación, Universidade da Coruña, Coruña
