Aspects of Automatic Text Analysis, pp 277-299
Affix Discovery by Means of Corpora: Experiments for Spanish, Czech, Ralámuli and Chuj
Chapter
Abstract
Although the focus on morpheme discovery techniques originated within those linguistic schools which inherited from Franz Boas the concern for the unknown languages of the New World, automatic, unsupervised morphological segmentation remains a field of interest for the computational processing and engineering of natural languages, as well as for the plain exercise of getting to know them intimately.
Keywords
Word Segmentation; Word Fragment; Left Segment; Tense Marker; Minimal Distance Method
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© Springer 2007