International Journal of Speech Technology

Volume 19, Issue 2, pp 325–338

Towards an open platform based on HPSG formalism for the standard Arabic language

  • Mourad Loukam
  • Amar Balla
  • Mohamed Tayeb Laskri
Special Issue Article


This paper presents an open software platform for analysing texts in Standard Arabic. Its originality lies in being an integrated software environment that offers all the resources and tools needed to parse Arabic texts. To formalise the various elements of the language, the Head-driven Phrase Structure Grammar (HPSG) formalism was adopted because of its effectiveness and its adaptability to any natural language. The platform is currently operational and covers an appreciable range of Arabic syntactic structures. In the medium term, our objective is to use the platform to develop applications for the Arabic language, such as interfaces, learning and information retrieval.


Standard Arabic language, HPSG, Software platform, Text parsing, Natural language processing, Linguistic resources, Interface in natural language



This project is supported by the Algerian Ministry of Higher Education and Scientific Research.

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. Higher National School of Computer Science, Algiers, Algeria
  2. Natural Language Processing Team, LMA Laboratory, Faculty of Sciences, Hassiba Benbouali University of Chlef, Chlef, Algeria
  3. LMCS Laboratory, Higher National School of Computer Science, Algiers, Algeria
  4. Department of Computer Science, Faculty of Sciences, Badji Mokhtar University of Annaba, Annaba, Algeria
