Semantic Processing of Semitic Languages

Part of the Theory and Applications of Natural Language Processing book series (NLP)

Abstract

In this chapter, we cover semantic processing in Semitic languages. We present models of semantic processing over words and their relations in sentences, namely paradigmatic and syntagmatic models. We contrast the processing of Semitic languages with that of English, illustrating some of the challenges – and clues – that arise from the unique characteristics of Semitic languages.

Keywords

  • Word Sense
  • Semantic Distance
  • Statistical Machine Translation
  • Parallel Corpus
  • Noun Noun Compound

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

  1. We use Buckwalter transliteration throughout this chapter to render the Arabic script in Romanized form.

  2. We use the following transliteration for Hebrew letters: AbgdhwzHtyklmnsEpcqrST. When denoting pronunciation, some letters may denote one of two sounds each. In such cases, they may be transliterated as follows: b → v, k → x, p → f, non-silent w → V (sounds the same as v but denoted uniquely for disambiguation), non-silent y → Y (h is non-silent unless in final position, so no special denotation is needed), and S → C (sounds the same as s but denoted uniquely for disambiguation). Non-letter (diacritic) vowels are transliterated with the lowercase letters aeiou (simplified as well).

  3. Some researchers distinguish between similarity and relatedness [49]; without going into this distinction, we use here a general notion of semantic distance measures.

  4. http://nlp.cs.swarthmore.edu/semeval/ or http://www.senseval.org/

  5. http://www.wikipedia.org/

  6. Licensing for this data set is obtained through the Linguistic Data Consortium (LDC).

  7. http://www.ldc.upenn.edu/Catalog/catalogEntry.jsp?catalogId=LDC2011T03

  8. http://omega.isi.edu

  9. http://www.ldc.upenn.edu/Catalog/catalogEntry.jsp?catalogId=LDC2009T30

  10. Especially for English, notably the above-mentioned FrameNet (FN) [6] and PropBank (PB) [52].

  11. http://www.ldc.org/Ontonotes

  12. http://verbs.colorado.edu/~mpalmer/projects/verbnet.html

  13. http://www.icsi.berkeley.edu/pubs/ai/HFN.pdf

  14. The exception is that the Arabic sound masculine plural allows for relative disambiguation, since it distinguishes the nominative case from the accusative and genitive cases.

  15. The authors experiment with P1-P6 and find that P3 yields the best performance.

  16. http://ixa2.si.ehu.es/starsem/

  17. https://sites.google.com/site/spsemmrl2012/
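
As an illustration of the Hebrew transliteration scheme described in note 2, the following minimal Python sketch maps Hebrew letters to the Latin symbols AbgdhwzHtyklmnsEpcqrST. It rests on two assumptions that the note does not state explicitly: that the 22 symbols are listed in Hebrew alphabetical order (alef through tav), and that word-final letter forms take the same symbol as their base forms. The names SCHEME, TABLE and transliterate are hypothetical and chosen for this example only; the pronunciation variants (b → v, k → x, and so on) and the vowel diacritics are not handled.

```python
# Latin symbols from note 2, ASSUMED to follow Hebrew alphabetical order.
SCHEME = "AbgdhwzHtyklmnsEpcqrST"

# Unicode code points of the five word-final letter forms
# (final kaf, final mem, final nun, final pe, final tsadi).
FINAL_FORMS = (0x05DA, 0x05DD, 0x05DF, 0x05E3, 0x05E5)

# The 22 base letters: the Hebrew letter range U+05D0..U+05EA without final forms.
BASE_LETTERS = [chr(cp) for cp in range(0x05D0, 0x05EB) if cp not in FINAL_FORMS]

# Pair each base letter with its Latin symbol.
TABLE = dict(zip(BASE_LETTERS, SCHEME))

# ASSUMPTION: a final form is written like its base letter. In Unicode,
# each final form is immediately followed by the corresponding base letter.
for cp in FINAL_FORMS:
    TABLE[chr(cp)] = TABLE[chr(cp + 1)]


def transliterate(text: str) -> str:
    """Replace Hebrew letters by their Latin symbols; leave other characters unchanged."""
    return "".join(TABLE.get(ch, ch) for ch in text)


if __name__ == "__main__":
    # shin, lamed, vav, final mem
    print(transliterate("\u05E9\u05DC\u05D5\u05DD"))  # prints "Slwm"
```

Building the table from code points rather than from a literal Hebrew string avoids right-to-left rendering problems in source files; characters outside the table (spaces, punctuation, diacritics) pass through unchanged.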

References

  1. Abend O, Reichart R, Rappoport A (2009) Unsupervised argument identification for semantic role labeling. In: Proceedings of the joint conference of the 47th annual meeting of the ACL and the 4th international joint conference on natural language processing of the AFNLP, Singapore. Association for Computational Linguistics, Suntec, pp 28–36. http://www.aclweb.org/anthology/P/P09/P09-1004

  2. Agirre E, Lopez de Lacalle Lekuona O (2003) Clustering WordNet word senses. In: Proceedings of the 1st international conference on recent advances in natural language processing (RANLP-2003), Borovets

  3. Al-Haj H, Wintner S (2010) Identifying multi-word expressions by leveraging morphological and syntactic idiosyncrasy. In: Proceedings of the 23rd international conference on computational linguistics (Coling 2010), Coling 2010 Organizing Committee, Beijing, pp 10–18. http://www.aclweb.org/anthology/C10-1002

  4. Androutsopoulos I, Malakasiotis P (2010) A survey of paraphrasing and textual entailment methods. J Artif Intell Res (JAIR) 38:135–187

  5. Attia M, Toral A, Tounsi L, Pecina P, van Genabith J (2010) Automatic extraction of Arabic multiword expressions. In: Proceedings of the 2010 workshop on multiword expressions: from theory to applications, Coling 2010 Organizing Committee, Beijing, pp 19–27. http://www.aclweb.org/anthology/W10-3704

  6. Baker CF, Fillmore CJ, Lowe JB (1998) The Berkeley FrameNet project. In: COLING-ACL ’98: proceedings of the conference, University of Montréal, pp 86–90

  7. Banerjee S, Pedersen T (2003) Extended gloss overlaps as a measure of semantic relatedness. In: Proceedings of the eighteenth international joint conference on artificial intelligence (IJCAI-03), Acapulco, pp 805–810

  8. Banko M, Cafarella MJ, Soderland S, Broadhead M, Etzioni O (2007) Open information extraction from the web. In: IJCAI, Hyderabad, pp 2670–2676

  9. Bannard C, Callison-Burch C (2005) Paraphrasing with bilingual parallel corpora. In: Proceedings of the 43rd annual meeting of the Association for Computational Linguistics (ACL2005), Ann Arbor. Association for Computational Linguistics, pp 597–604

  10. Carpuat M, Diab M (2010) Task-based evaluation of multiword expressions: a pilot study in statistical machine translation. In: Human language technologies: the 2010 annual conference of the North American chapter of the Association for Computational Linguistics, Los Angeles. Association for Computational Linguistics, pp 242–245. http://www.aclweb.org/anthology/N10-1029

  11. Carreras X, Màrquez L (2005) Introduction to the CoNLL-2005 shared task: semantic role labeling. In: Proceedings of the ninth conference on computational natural language learning (CoNLL-2005), Ann Arbor. Association for Computational Linguistics, pp 152–164. http://www.aclweb.org/anthology/W/W05/W05-0620

  12. Chen J, Rambow O (2003) Use of deep linguistic features for the recognition and labeling of semantic arguments. In: Proceedings of the 2003 conference on empirical methods in natural language processing, Sapporo

  13. Cruse DA (1986) Lexical semantics. Cambridge University Press, Cambridge

  14. Culo O, Erk K, Pado S, Schulte im Walde S (2008) Comparing and combining semantic verb classifications. Lang Resour Eval 42(3):265–291. doi:10.1007/s10579-008-9070-z

  15. Dagan I, Itai A (1994) Word sense disambiguation using a second language monolingual corpus. Comput Linguist 20. http://aclweb.org/anthology-new/J/J94/J94-4003.pdf

  16. Dagan I, Lee L, Pereira F (1999) Similarity-based models of cooccurrence probabilities. Mach Learn 34(1–3):43–69

  17. Das D, Smith NA (2011) Semi-supervised frame-semantic parsing for unknown predicates. In: Proceedings of the 49th annual meeting of the Association for Computational Linguistics: human language technologies, Portland. Association for Computational Linguistics, pp 1435–1444. http://www.aclweb.org/anthology/P11-1144

  18. Diab M (2003) Word sense disambiguation within a multilingual framework. PhD thesis, University of Maryland, College Park

  19. Diab MT (2004) An unsupervised approach for bootstrapping Arabic sense tagging. In: Farghaly A, Megerdoomian K (eds) COLING 2004 computational approaches to Arabic script-based languages, COLING, Geneva, pp 43–50

  20. Diab M, Bhutada P (2009) Verb noun construction MWE token classification. In: Proceedings of the workshop on multiword expressions: identification, interpretation, disambiguation and applications, Suntec. Association for Computational Linguistics, pp 17–22. http://www.aclweb.org/anthology/W/W09/W09-2903

  21. Diab M, Krishna M (2009) Handling sparsity for verb noun MWE token classification. In: Proceedings of the workshop on geometrical models of natural language semantics, Athens. Association for Computational Linguistics, pp 96–103. http://www.aclweb.org/anthology/W09-0213

  22. Diab M, Krishna M (2009) Unsupervised classification for VNC multiword expressions tokens. In: CICLING, Mexico City

  23. Diab M, Moschitti A (2007) Semantic parsing for Modern Standard Arabic. In: Proceedings of recent advances in natural language processing (RANLP), Borovets

  24. Diab M, Resnik P (2002) An unsupervised method for word sense tagging using parallel corpora. In: Proceedings of 40th annual meeting of the Association for Computational Linguistics, Philadelphia. Association for Computational Linguistics, pp 255–262, doi:10.3115/1073083.1073126. http://www.aclweb.org/anthology/P02-1033

  25. Diab M, Hacioglu K, Jurafsky D (2004) Automatic tagging of Arabic text: from raw text to base phrase chunks. In: Dumais S, Marcu D, Roukos S (eds) HLT-NAACL 2004: Short papers, Boston. Association for Computational Linguistics, pp 149–152

  26. Diab M, Alkhalifa M, ElKateb S, Fellbaum C, Mansouri A, Palmer M (2007) Semeval-2007 task 18: Arabic semantic labeling. In: Proceedings of the fourth international workshop on semantic evaluations (SemEval-2007), Association for Computational Linguistics, Prague, pp 93–98. http://www.aclweb.org/anthology/S/S07/S07-1017

  27. Diab M, Alkhalifa M, ElKateb S, Fellbaum C, Mansouri A, Palmer M (2007) Semeval-2007 task 18: Arabic semantic labeling. In: Proceedings of the fourth international workshop on semantic evaluations (SemEval-2007), Prague. Association for Computational Linguistics, pp 93–98. http://www.aclweb.org/anthology/W/W07/W07-2017

  28. Diab M, Ghoneim M, Habash N (2007) Arabic diacritization in the context of statistical machine translation. In: Proceedings of machine translation summit (MT-Summit), Copenhagen

  29. Diab M, Moschitti A, Pighin D (2007) Cunit: a semantic role labeling system for Modern Standard Arabic. In: Proceedings of the fourth international workshop on semantic evaluations (SemEval-2007), Prague. Association for Computational Linguistics, pp 133–136. http://www.aclweb.org/anthology/W/W07/W07-2026

  30. Diab M, Moschitti A, Pighin D (2008) Semantic role labeling systems for Arabic using kernel methods. In: Proceedings of ACL-08: HLT, Columbus. Association for Computational Linguistics, pp 798–806. http://www.aclweb.org/anthology/P/P08/P08-1091

  31. Dunning T (1993) Accurate methods for the statistics of surprise and coincidence. Comput Linguist 19(1):61–74

  32. Elkateb S, Black W, Rodriguez H, Alkhalifa M, Vossen P, Pease A, Fellbaum C (2006) Building a wordnet for Arabic. In: Proceedings of the fifth international conference on language resources and evaluation, LREC, Genoa

  33. Elmougy S, Hamza T, Noaman HM (2008) Naive Bayes classifier for Arabic word sense disambiguation. In: Proceedings of INFOS 2008, Cairo, pp 27–29

  34. Erk K, Pado S (2006) Shalmaneser – a toolchain for shallow semantic parsing. In: Proceedings of the language resources and evaluation conference (LREC), Genoa

  35. Fellbaum C (1998) WordNet: an electronic lexical database. MIT, Cambridge

  36. Firth JR (1957) A synopsis of linguistic theory 1930–1955. Studies in linguistic analysis (special volume of the Philological Society). Blackwell, Oxford, pp 1–32

  37. Frege G (1892) Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik 100:25–50

  38. Fung P, Yee LY (1998) An IR approach for translating new words from nonparallel, comparable texts. In: Proceedings of COLING-ACL, Montreal, pp 414–420

  39. Giampiccolo D, Magnini B, Dagan I, Dolan B (2007) The third Pascal recognizing textual entailment challenge. In: Proceedings of the ACL-PASCAL workshop on textual entailment and paraphrasing, Prague. Association for Computational Linguistics, pp 1–9. http://www.aclweb.org/anthology/W/W07/W07-1401

  40. Gildea D, Jurafsky D (2002) Automatic labeling of semantic roles. Comput Linguist 28(3):245–288. http://www.cs.rochester.edu/~gildea/gildea-cl02.pdf

  41. Gildea D, Palmer M (2002) The necessity of parsing for predicate argument recognition. In: Proceedings of the 40th annual conference of the Association for Computational Linguistics (ACL-02), Philadelphia

  42. Goldwasser D, Reichart R, Clarke J, Roth D (2011) Confidence driven unsupervised semantic parsing. In: Proceedings of the 49th annual meeting of the Association for Computational Linguistics: human language technologies, Portland. Association for Computational Linguistics, pp 1486–1495. http://www.aclweb.org/anthology/P11-1149

  43. Habash N (2010) Introduction to Arabic natural language processing. Morgan & Claypool, San Rafael

  44. Habash N, Rambow O (2005) Arabic tokenization, part-of-speech tagging and morphological disambiguation in one fell swoop. In: Proceedings of the 43rd annual meeting of the Association for Computational Linguistics (ACL’05), Ann Arbor. Association for Computational Linguistics, pp 573–580. doi:10.3115/1219840.1219911, http://www.aclweb.org/anthology/P05-1071

  45. Habash N, Rambow O (2007) Arabic diacritization through full morphological tagging. In: Human language technologies 2007: the conference of the North American chapter of the Association for Computational Linguistics; Companion Volume, Short Papers, Rochester. Association for Computational Linguistics, pp 53–56. http://www.aclweb.org/anthology/N/N07/N07-2014

  46. Haghighi A, Toutanova K, Manning C (2005) A joint model for semantic role labeling. In: Proceedings of the ninth conference on computational natural language learning (CoNLL-2005), Ann Arbor. Association for Computational Linguistics, pp 173–176. http://www.aclweb.org/anthology/W/W05/W05-0623

  47. Harris ZS (1940) Review of Louis H. Gray, foundations of language (Macmillan, New York 1939). Language 16(3):216–231

  48. Hawwari A, Bar K, Diab M (2012) Building an Arabic multiword expressions repository. In: Proceedings of the ACL 2012 joint workshop on statistical parsing and semantic processing of morphologically rich languages, Jeju. Association for Computational Linguistics, pp 24–29. http://www.aclweb.org/anthology/W12-3403

  49. Hirst G, Budanitsky A (2005) Correcting real-word spelling errors by restoring lexical cohesion. Nat Lang Eng 11(1):87–111

  50. Jackendoff R (1983) Semantics and cognition. MIT, Cambridge

  51. Jackendoff R (1990) Semantic structures. MIT, Cambridge

  52. Kingsbury P, Palmer M (2003) Propbank: the next level of treebank. In: Proceedings of treebanks and lexical theories, Växjö

  53. Kipper K, Korhonen A, Ryant N, Palmer M (2006) Extending VerbNet with novel verb classes. In: Proceedings of the sixth international conference on language resources and evaluation, Genoa

  54. Lang J, Lapata M (2011) Unsupervised semantic role induction via split-merge clustering. In: Proceedings of the 49th annual meeting of the Association for Computational Linguistics: human language technologies, Portland. Association for Computational Linguistics, pp 1117–1126. http://www.aclweb.org/anthology/P11-1112

  55. Lee JH, Kim MH, Lee YJ (1993) Information retrieval based on conceptual distance in IS-A hierarchies. J Doc 49(2):188–207

  56. Lenci A, McGillivray B, Montemagni S, Pirrelli V (2008) Unsupervised acquisition of verb subcategorization frames from shallow-parsed corpora. In: Proceedings of the sixth international conference on language resources and evaluation (LREC’08), Marrakech. http://www.lrec-conf.org/proceedings/lrec2008/

  57. Lesk M (1986) Automatic sense disambiguation using machine readable dictionaries: how to tell a pine cone from an ice cream cone. In: Proceedings of the 5th annual international conference on systems documentation, Toronto, pp 24–26

  58. Levin B (1993) English verb classes and alternations: a preliminary investigation. University of Chicago Press, Chicago

  59. Levinson D (1999) Corpus-based method for unsupervised word sense disambiguation. In: Proceedings of the workshop on machine learning in human language technology, advanced course on artificial intelligence (ACAI’99), Chania, pp 267–273

  60. Li J, Zhou G, Ng HT (2010) Joint syntactic and semantic parsing of Chinese. In: Proceedings of the 48th annual meeting of the Association for Computational Linguistics, Uppsala. Association for Computational Linguistics, pp 1108–1117. http://www.aclweb.org/anthology/P10-1113

  61. Lin D (1998) An information-theoretic definition of similarity. In: Proceedings of the 15th international conference on machine learning, San Francisco, pp 296–304

  62. Lüdeling A, Kytö M (eds) (2009) Corpus linguistics. An international handbook. Handbooks of linguistics and communication science, vol 2. Mouton de Gruyter, Berlin

  63. Maamouri M, Bies A, Buckwalter T, Diab M, Habash N, Rambow O, Tabessi D (2006) Developing and using a pilot dialectal Arabic treebank. In: Fifth international conference on language resources and evaluation (LREC2006), Genoa

  64. Maamouri M, Bies A, Kulick S (2008) Enhanced annotation and parsing of the Arabic treebank. In: INFOS, Cairo

  65. Madnani N, Dorr B (2010) Generating phrasal and sentential paraphrases: a survey of data-driven methods. Comput Linguist 36(3):341–387

  66. Marton Y (2010) Improved statistical machine translation using monolingual text and a shallow lexical resource for hybrid phrasal paraphrase generation. In: Proceedings of the AMTA, Denver

  67. Marton Y, Callison-Burch C, Resnik P (2009) Improved statistical machine translation using monolingually-derived paraphrases. In: Proceedings of EMNLP, Singapore

  68. Marton Y, Mohammad S, Resnik P (2009) Estimating semantic distance using soft semantic constraints in knowledge-source/corpus hybrid models. In: Proceedings of EMNLP, Singapore

  69. Merhbene L, Zouaghi A, Zrigui M (2010) Ambiguous Arabic words disambiguation. In: SNPD, London, pp 157–164

  70. Mohammad S, Hirst G (2006) Distributional measures of concept-distance: a task-oriented evaluation. In: Proceedings of EMNLP, Sydney

  71. Moschitti A (2004) A study on convolution kernels for shallow semantic parsing. In: Proceedings of the 42nd conference on Association for Computational Linguistic (ACL-2004), Barcelona

  72. Moschitti A (2006) Efficient convolution kernels for dependency and constituent syntactic trees. In: ECML’06, Berlin

  73. Moschitti A (2006) Making tree kernels practical for natural language learning. In: Proceedings of 11th conference of the European chapter of the Association for Computational Linguistics (EACL2006), Trento, pp 113–120

  74. Moschitti A (2008) Kernel methods, syntax and semantics for relational text categorization. In: Proceedings of the 17th ACM conference on information and knowledge management (CIKM), Napa Valley. ACM

  75. Moschitti A, Quarteroni S, Basili R, Manandhar S (2007) Exploiting syntactic and shallow semantic kernels for question answer classification. In: Proceedings of the 45th annual meeting of the Association of Computational Linguistics, Prague. Association for Computational Linguistics, pp 776–783. http://www.aclweb.org/anthology/P/P07/P07-0098

  76. Moschitti A, Pighin D, Basili R (2008) Tree kernels for semantic role labeling. Comput Linguist 34(2):193–224. doi:10.1162/coli.2008.34.2.193, http://www.mitpressjournals.org/doi/abs/10.1162/coli.2008.34.2.193, http://www.mitpressjournals.org/doi/pdf/10.1162/coli.2008.34.2.193

  77. Mousser J (2010) A large coverage verb taxonomy for Arabic. In: Proceedings of the seventh conference on international language resources and evaluation (LREC’10), Valletta

  78. Mousser J (2011) Classifying Arabic verbs using sibling classes. In: Proceedings of the ninth international conference on computational semantics (IWCS 2011), Oxford

  79. Navigli R (2006) Meaningful clustering of senses helps boost word sense disambiguation performance. In: Proceedings of the 21st international conference on computational linguistics and the 44th annual meeting of the association, Sydney, pp 105–112

  80. Navigli R (2009) Word sense disambiguation: a survey. ACM Comput Surv 41(2):10:1–10:69. doi:10.1145/1459352.1459355, http://doi.acm.org/10.1145/1459352.1459355

  81. Nelken R, Shieber SM (2005) Arabic diacritization using weighted finite-state transducers. In: Proceedings of the ACL workshop on computational approaches to Semitic languages, Ann Arbor. Association for Computational Linguistics, pp 79–86. http://www.aclweb.org/anthology/W/W05/W05-0711

  82. Niles I, Pease A (2001) Origins of the IEEE standard upper ontology. In: Working notes of the IJCAI-2001 workshop on the IEEE standard upper ontology, Seattle

  83. Ordan N, Wintner S (2007) Hebrew wordnet: a test case of aligning lexical databases across languages. Int J Trans Spec Issue Lex Resour Mach Trans 19(1):39–58

  84. Pantel P, Lin D (2002) Discovering word senses from text. In: Proceedings of ACM conference on knowledge discovery and data mining (KDD-02), ACM, Edmonton, pp 613–619. http://www.patrickpantel.com/download/papers/2002/kdd02.pdf

  85. Pasca M, Dienes P (2005) Aligning needles in a haystack: paraphrase acquisition across the web. In: Proceedings of IJCNLP, Jeju Island, pp 119–130

  86. Patwardhan S, Pedersen T (2006) Using WordNet based context vectors to estimate the semantic relatedness of concepts. In: Proceedings of making sense of sense EACL workshop, Trento, pp 1–8

  87. Poon H, Domingos P (2009) Unsupervised semantic parsing. In: Proceedings of the 2009 conference on empirical methods in natural language processing, Stroudsburg, EMNLP ’09, vol 1. Association for Computational Linguistics, pp 1–10. http://dl.acm.org/citation.cfm?id=1699510.1699512

  88. Pradhan S, Hacioglu K, Ward W, Martin JH, Jurafsky D (2003) Semantic role parsing: adding semantic structure to unstructured text. In: Proceedings of the international conference on data mining (ICDM-2003), Melbourne

  89. Rada R, Mili H, Bicknell E, Blettner M (1989) Development and application of a metric on semantic nets. IEEE Trans Syst Man Cybern 19(1):17–30

  90. Rapp R (1999) Automatic identification of word translations from unrelated English and German corpora. In: Proceedings of the 37th annual conference of the Association for Computational Linguistics, College Park, pp 519–525

  91. Resnik P (1999) Semantic similarity in a taxonomy: an information-based measure and its application to problems of ambiguity in natural language. J Artif Intell Res (JAIR) 11:95–130

  92. Rodriguez H, Farwell D, Farreres B, Bertran M, Alkhalifa M, Marti M, Black W, Elkateb S, Kirk J, Pease A, Vossen P, Fellbaum C (2008) Arabic wordnet: current state and future extensions. In: Proceedings of global WordNet conference, Szeged

  93. Ross S (1976) A first course in probability. Macmillan, New York

  94. Roth R, Rambow O, Habash N, Diab M, Rudin C (2008) Arabic morphological tagging, diacritization, and lemmatization using lexeme models and feature ranking. In: Proceedings of ACL-08: HLT, Short Papers, Columbus. Association for Computational Linguistics, pp 117–120. http://www.aclweb.org/anthology/P/P08/P08-2030

  95. Rozovskaya A, Sproat R (2007) Multilingual word sense discrimination: a comparative cross-linguistic study. In: Proceedings of the workshop on balto-slavonic natural language processing, Prague. Association for Computational Linguistics, pp 82–87. http://www.aclweb.org/anthology/W/W07/W07-1711

  96. Sag IA, Baldwin T, Bond F, Copestake AA, Flickinger D (2002) Multiword expressions: a pain in the neck for NLP. In: Proceedings of the third international conference on computational linguistics and intelligent text processing, Mexico City. Springer, London, pp 1–15

  97. Salton G, McGill MJ (1983) Introduction to modern information retrieval. McGraw-Hill, New York

  98. Schuetze H (1998) Automatic word sense discrimination. Comput Linguist 24(1):97–123

  99. Schuetze H, Pedersen JO (1997) A cooccurrence-based thesaurus and two applications to information retrieval. Inf Process Manag 33(3):307–318

  100. Schulte im Walde S (2000) Clustering verbs semantically according to their alternation behaviour. In: Proceedings of the 18th international conference on computational linguistics (COLING-00), Saarbrücken, pp 747–753

  101. Schulte im Walde S (2009) The induction of verb frames and verb classes from corpora. In: Lüdeling A, Kytö M (eds) Corpus linguistics. An international handbook. Handbooks of linguistics and communication science, vol 2. Mouton de Gruyter, Berlin, chap 44, pp 952–971

  102. Snider N, Diab M (2006) Unsupervised induction of Modern Standard Arabic verb classes. In: Proceedings of the human language technology conference of the NAACL, Companion Volume: Short Papers, New York. Association for Computational Linguistics, pp 153–156. http://www.aclweb.org/anthology/N/N06/N06-2039

  103. Snider N, Diab M (2006) Unsupervised induction of Modern Standard Arabic verb classes using syntactic frames and LSA. In: Proceedings of the COLING/ACL 2006 main conference poster sessions, Sydney. Association for Computational Linguistics, pp 795–802. http://www.aclweb.org/anthology/P/P06/P06-2102

  104. Sun H, Jurafsky D (2004) Shallow semantic parsing of Chinese. In: Dumais S, Marcu D, Roukos S (eds) HLT-NAACL 2004: main proceedings, Boston. Association for Computational Linguistics, pp 249–256

  105. Sun W (2010) Semantics-driven shallow parsing for Chinese semantic role labeling. In: Proceedings of the ACL 2010 conference short papers, Uppsala. Association for Computational Linguistics, pp 103–108. http://www.aclweb.org/anthology/P10-2019

  106. Thompson CA, Levy R, Manning C (2003) A generative model for semantic role labeling. In: 14th European conference on machine learning, Cavtat-Dubrovnik

  107. Titov I, Klementiev A (2011) A Bayesian model for unsupervised semantic parsing. In: Proceedings of the 49th annual meeting of the Association for Computational Linguistics: human language technologies, Portland. Association for Computational Linguistics, pp 1445–1455. http://www.aclweb.org/anthology/P11-1145

  108. Tsvetkov Y, Wintner S (2010) Extraction of multi-word expressions from small parallel corpora. In: Coling 2010: posters, COLING 2010 Organizing Committee, Beijing, pp 1256–1264. http://www.aclweb.org/anthology/C10-2144

  109. Turney PD, Pantel P (2010) From frequency to meaning: vector space models of semantics. J Artif Intell Res 37:141–188

  110. Vergyri D, Kirchhoff K (2004) Automatic diacritization of Arabic for acoustic modeling in speech recognition. In: Farghaly A, Megerdoomian K (eds) COLING 2004 computational approaches to Arabic script-based languages, Geneva. COLING, pp 66–73

  111. Weaver W (1949) Translation. In: Locke W, Booth A (eds) Machine translation of languages: fourteen essays. MIT, Cambridge

  112. Wu F, Weld DS (2010) Open information extraction using Wikipedia. In: Proceedings of the 48th annual meeting of the Association for Computational Linguistics, ACL ’10, Stroudsburg. Association for Computational Linguistics, pp 118–127. http://dl.acm.org/citation.cfm?id=1858681.1858694

  113. Xue N, Palmer M (2004) Calibrating features for semantic role labeling. In: Lin D, Wu D (eds) Proceedings of EMNLP 2004, Barcelona. Association for Computational Linguistics, pp 88–94

  114. Zaghouani W, Diab M, Mansouri A, Pradhan S, Palmer M (2010) The revised Arabic propbank. In: Proceedings of the fourth linguistic annotation workshop, Uppsala. Association for Computational Linguistics, pp 222–226. http://www.aclweb.org/anthology/W10-1836

  115. Zbib R, Matsoukas S, Schwartz R, Makhoul J (2010) Decision trees for lexical smoothing in statistical machine translation. In: Proceedings of the joint fifth workshop on statistical machine translation and MetricsMATR, Uppsala. Association for Computational Linguistics, pp 428–437. http://www.aclweb.org/anthology/W10-1763

  116. Zitouni I, Sorensen JS, Sarikaya R (2006) Maximum entropy based restoration of Arabic diacritics. In: Proceedings of the 21st international conference on computational linguistics and 44th annual meeting of the Association for Computational Linguistics, Sydney. Association for Computational Linguistics, pp 577–584. doi:10.3115/1220175.1220248. http://www.aclweb.org/anthology/P06-1073

  117. Zouaghi A, Merhbene L, Zrigui M (2010) Combination of information retrieval methods with Lesk algorithm for Arabic word sense disambiguation. Artif Intell Rev 1–13. http://dx.doi.org/10.1007/s10462-011-9249-3

Author information

Corresponding authors

Correspondence to Mona Diab or Yuval Marton.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Diab, M., Marton, Y. (2014). Semantic Processing of Semitic Languages. In: Zitouni, I. (eds) Natural Language Processing of Semitic Languages. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45358-8_4

  • DOI: https://doi.org/10.1007/978-3-642-45358-8_4

  • Published:

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-45357-1

  • Online ISBN: 978-3-642-45358-8

  • eBook Packages: Computer Science, Computer Science (R0)