
Artificial Intelligence Review, Volume 21, Issue 2, pp 113–138

Hebrew Computational Linguistics: Past and Future

  • Shuly Wintner
Article

Abstract

This paper reviews the current state of the art in Natural Language Processing for Hebrew, both theoretical and practical. The Hebrew language, like other Semitic languages, poses special challenges for developers of programs for natural language processing: the writing system, rich morphology, unique word formation process of roots and patterns, lack of linguistic corpora that document language usage, all contribute to making computational approaches to Hebrew challenging. The paper briefly reviews the field of computational linguistics and the problems it addresses, describes the special difficulties inherent to Hebrew (as well as to other Semitic languages), surveys a wide variety of past and ongoing works and attempts to characterize future needs and possible solutions.

Keywords: computational linguistics, Hebrew, natural language processing



Copyright information

© Kluwer Academic Publishers 2004

Authors and Affiliations

  • Shuly Wintner (1)
  1. Department of Computer Science, University of Haifa, Mount Carmel, Haifa, Israel
