
From extended chunking to dependency parsing using traditional Arabic grammar

  • Original Paper
Language Resources and Evaluation

Abstract

This paper describes the approach used to develop AlkhalilPArser, which combines a phrase structure grammar with dependency rules. The parser's architecture has three main levels. The first level performs the basic tasks of tokenization, part-of-speech tagging, and chunking. The second level performs the analysis itself, handling common, internal, and external dependencies according to the nature of the clause. The third level detects and corrects anomalies. The parser extends to verbal sentences a system previously designed for parsing nominal sentences. Several modifications were made to adapt that system to all sentence types and to improve its accuracy. Tests on a representative corpus and comparisons with other parsers attest to the robustness of the system.
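The three-level organization described in the abstract can be pictured with a minimal sketch. This is not the AlkhalilPArser implementation: the lexicon, tag names, chunk labels, attachment rules, and every function name below are hypothetical, chosen only to show how chunking can feed rule-based dependency attachment and a final consistency check. The toy words use Buckwalter-style transliteration. The sketch models only internal (within-chunk) and external (between-chunk) links, and its third level only detects anomalies; it does not correct them.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical toy lexicon (Buckwalter-style); not the parser's real resources.
LEXICON: Dict[str, str] = {
    "ktb": "VERB",      # "wrote"
    "AlTAlb": "NOUN",   # "the student"
    "Aldrs": "NOUN",    # "the lesson"
    "fy": "PREP",       # "in"
    "Almdrsp": "NOUN",  # "the school"
}

Deps = Dict[int, Tuple[Optional[int], str]]  # token index -> (head index, link kind)


@dataclass
class Chunk:
    label: str         # "VP", "NP", "PP", or "X" for anything else
    tokens: List[int]  # indices of the tokens covered by the chunk
    head: int          # index of the chunk's head token


# Level 1: tokenization, part-of-speech tagging, chunking.
def tokenize(sentence: str) -> List[str]:
    return sentence.split()


def pos_tag(tokens: List[str]) -> List[str]:
    return [LEXICON.get(tok, "UNK") for tok in tokens]


def chunk(tags: List[str]) -> List[Chunk]:
    chunks: List[Chunk] = []
    i = 0
    while i < len(tags):
        if tags[i] == "PREP" and i + 1 < len(tags) and tags[i + 1] == "NOUN":
            chunks.append(Chunk("PP", [i, i + 1], head=i))  # preposition heads the PP
            i += 2
            continue
        label = {"VERB": "VP", "NOUN": "NP"}.get(tags[i], "X")
        chunks.append(Chunk(label, [i], head=i))
        i += 1
    return chunks


# Level 2: dependency rules applied inside chunks and between chunk heads.
def attach(chunks: List[Chunk]) -> Deps:
    deps: Deps = {}
    for c in chunks:
        for t in c.tokens:
            if t != c.head:
                deps[t] = (c.head, "internal")
    # Toy rule: the first verb is the root and every other chunk head attaches to it.
    root = next((c.head for c in chunks if c.label == "VP"), None)
    for c in chunks:
        if c.head == root:
            deps[c.head] = (None, "root")
        elif root is not None:
            deps[c.head] = (root, "external")
    return deps


# Level 3: anomaly detection (detection only in this sketch).
def find_anomalies(n_tokens: int, deps: Deps) -> List[str]:
    problems: List[str] = []
    roots = [t for t, (head, _) in deps.items() if head is None]
    if len(roots) != 1:
        problems.append(f"expected 1 root, found {len(roots)}")
    for t in range(n_tokens):
        if t not in deps:
            problems.append(f"token {t} has no head")
    return problems


def parse(sentence: str) -> Tuple[Deps, List[str]]:
    tokens = tokenize(sentence)
    deps = attach(chunk(pos_tag(tokens)))
    return deps, find_anomalies(len(tokens), deps)
```

Calling `parse("ktb AlTAlb Aldrs fy Almdrsp")` attaches every chunk head to the verb and reports no anomalies. The verbless input `parse("AlTAlb fy Almdrsp")` is flagged by the third level, because the toy attachment rule knows only verbal roots; a real system would need dedicated rules for nominal sentences.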


Figs. 1–8


Notes

  1. Buckwalter transliteration.

  2. https://www.ethnologue.com/guides/ethnologue200.


Author information


Corresponding authors

Correspondence to Nabil Ababou, Azzeddine Mazroui or Rachid Belehbib.


About this article


Cite this article

Ababou, N., Mazroui, A. & Belehbib, R. From extended chunking to dependency parsing using traditional Arabic grammar. Lang Resources & Evaluation 57, 1011–1043 (2023). https://doi.org/10.1007/s10579-022-09629-w


  • DOI: https://doi.org/10.1007/s10579-022-09629-w
