Building an Arabic Linguistic Resource from a Treebank: The Case of Property Grammar

  • Raja Bensalem Bahloul
  • Marwa Elkarwi
  • Kais Haddar
  • Philippe Blache
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8655)


This paper surveys Arabic treebanks to facilitate their reuse in building new linguistic resources. In our case, we automatically induced a Property Grammar (GP) from a treebank. We therefore discuss the characteristics of these treebanks in order to choose the most appropriate one. To build our resource, we adopted an automatic technique in two steps: first, acquiring a context-free grammar (CFG) from the chosen treebank; second, inducing a GP by generating relations between the grammatical units described in the CFG.
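The two-step technique described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Penn-style bracketed tree, a toy tag set (S, VP, NP, VERB, NOUN, ADJ) and an invented transliterated sentence, and it induces only three simplified relation types (constituency, uniqueness, linearity). All function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def parse_tree(text):
    """Parse a Penn-style bracketed string into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        assert tokens[pos] == "("
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1  # consume ")"
        return (label, children)

    return node()


def extract_rules(tree, rules=None):
    """Step 1: collect CFG rules (lhs, rhs) over phrase-level nodes only."""
    if rules is None:
        rules = set()
    label, children = tree
    rhs = tuple(c[0] for c in children if isinstance(c, tuple))
    if rhs:  # nodes that dominate only a word yield no rule
        rules.add((label, rhs))
        for child in children:
            if isinstance(child, tuple):
                extract_rules(child, rules)
    return rules


def always_precedes(x, y, rhss):
    """True if x and y co-occur and every x comes before every y."""
    together = [rhs for rhs in rhss if x in rhs and y in rhs]
    return bool(together) and all(
        max(i for i, c in enumerate(rhs) if c == x)
        < min(i for i, c in enumerate(rhs) if c == y)
        for rhs in together
    )


def induce_properties(rules):
    """Step 2: derive simplified relations between the units of each rule set."""
    by_lhs = defaultdict(list)
    for lhs, rhs in rules:
        by_lhs[lhs].append(rhs)
    props = {}
    for lhs, rhss in by_lhs.items():
        cats = sorted({c for rhs in rhss for c in rhs})
        props[lhs] = {
            # categories that may appear inside lhs
            "constituency": cats,
            # categories never repeated within one right-hand side
            "uniqueness": [c for c in cats if all(r.count(c) <= 1 for r in rhss)],
            # ordered pairs (x, y) where x always precedes y
            "linearity": [
                (x, y)
                for a, b in combinations(cats, 2)
                for x, y in ((a, b), (b, a))
                if always_precedes(x, y, rhss)
            ],
        }
    return props


# Toy input (assumption): an invented sentence with illustrative tags.
toy = "(S (VP (VERB kataba) (NP (NOUN alwaladu)) (NP (NOUN risalatan) (ADJ tawilatan))))"
rules = extract_rules(parse_tree(toy))
props = induce_properties(rules)
```

On this toy tree, the NP rules (NP → NOUN and NP → NOUN ADJ) yield the linearity relation NOUN before ADJ, and the VP rule marks VERB as unique while NP is not, since it occurs twice. A real treebank would need its own tag set, trace handling and a larger set of relation types.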


Keywords: treebanks, Arabic language, reuse, property grammar





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Raja Bensalem Bahloul (1)
  • Marwa Elkarwi (1)
  • Kais Haddar (1)
  • Philippe Blache (2)
  1. Higher Institute of Computer Science and Multimedia, Multimedia Information Systems and Advanced Computing Laboratory, Sfax, Tunisia
  2. Laboratoire Parole et Langage, CNRS, Université de Provence, France
