Towards an open platform based on HPSG formalism for the standard Arabic language

Abstract

This paper presents an open software platform for analysing texts in Standard Arabic. The originality of the platform is that it is an integrated software environment offering all the resources and tools needed to parse Arabic texts. To formalise the various elements of the language, the HPSG (Head-driven Phrase Structure Grammar) formalism has been adopted because of its effectiveness and its adaptability to any natural language. The platform is currently operational and covers an appreciable range of Arabic syntactic structures. In the medium term, our objective is to use the platform to develop applications for the Arabic language, such as interfaces, learning, and information retrieval.

Acknowledgments

This project is supported by the Algerian Ministry of Higher Education and Scientific Research.

Author information

Corresponding author

Correspondence to Mourad Loukam.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Loukam, M., Balla, A. & Laskri, M.T. Towards an open platform based on HPSG formalism for the standard Arabic language. Int J Speech Technol 19, 325–338 (2016). https://doi.org/10.1007/s10772-015-9314-4

Keywords

  • Standard Arabic language
  • HPSG
  • Software platform
  • Text parsing
  • Natural language processing
  • Linguistic resources
  • Interface in natural language