A Statistical Approach for the Best Deep Neural Network Configuration for Arabic Language Processing

  • Abdelhalim Saadi
  • Hacene Belhadef
Conference paper
Part of the Lecture Notes in Networks and Systems book series (LNNS, volume 64)

Abstract

The widespread adoption of computer technology and the Internet has led to a massive amount of textual information being available in written Arabic. The more text is available, the more difficult it becomes to extract the relevant information. To meet this challenge, many researchers have turned to developing information retrieval systems based on syntactic and semantic parsing. In Arabic, this field is restricted by the lack of labeled datasets. Thus, it is important to build systems for part-of-speech tagging and language modeling and to use their results for further syntactic and semantic parsing in tasks such as chunking, semantic role labeling, information extraction, named entity recognition, and statistical machine translation. Deep neural networks have proved effective in fields such as imaging and acoustics and, more recently, in natural language processing. In this study, we used the Taguchi method to find the optimal parameter combination for a deep neural network architecture, so that the network obtains its most accurate results. The main use of the Taguchi method in our work is to help choose the best context, that is, the number of words before and after the word on which training is performed.
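To make the idea of a Taguchi search concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard L9(3^4) orthogonal array and the larger-the-better signal-to-noise ratio; the abstract does not say which array or criterion was used. The factor names, their levels, and the `toy_accuracy` function are hypothetical placeholders; in practice `evaluate` would train the network with the given configuration and return its validation accuracy.

```python
import math
from collections import defaultdict

# Standard L9(3^4) orthogonal array: 9 runs, up to 4 factors, 3 levels each.
# Entries are level indices (0, 1, 2).
L9 = [
    (0, 0, 0, 0), (0, 1, 1, 1), (0, 2, 2, 2),
    (1, 0, 1, 2), (1, 1, 2, 0), (1, 2, 0, 1),
    (2, 0, 2, 1), (2, 1, 0, 2), (2, 2, 1, 0),
]

# Hypothetical factors and levels (assumptions, not taken from the paper).
FACTORS = {
    "words_before": [1, 2, 3],
    "words_after": [1, 2, 3],
    "hidden_units": [100, 200, 300],
    "learning_rate": [0.1, 0.01, 0.001],
}


def sn_larger_is_better(values):
    """Taguchi signal-to-noise ratio when a larger response is better."""
    return -10.0 * math.log10(sum(1.0 / (v * v) for v in values) / len(values))


def taguchi_best_levels(factors, evaluate, repeats=1):
    """Run the 9 L9 configurations and pick, per factor, the level with the
    highest mean S/N ratio (main-effects analysis)."""
    names = list(factors)
    effects = {name: defaultdict(list) for name in names}
    for run in L9:
        config = {name: factors[name][lvl] for name, lvl in zip(names, run)}
        scores = [evaluate(config) for _ in range(repeats)]
        sn = sn_larger_is_better(scores)
        for name, lvl in zip(names, run):
            effects[name][lvl].append(sn)
    best = {}
    for name in names:
        mean_sn = {lvl: sum(v) / len(v) for lvl, v in effects[name].items()}
        best[name] = factors[name][max(mean_sn, key=mean_sn.get)]
    return best


def toy_accuracy(config):
    """Synthetic stand-in for 'train the network and return accuracy'.
    The numbers are invented for illustration only."""
    score = 0.95
    score *= {1: 0.90, 2: 1.00, 3: 0.97}[config["words_before"]]
    score *= {1: 0.95, 2: 1.00, 3: 0.98}[config["words_after"]]
    score *= {100: 0.96, 200: 0.99, 300: 1.00}[config["hidden_units"]]
    score *= {0.1: 0.97, 0.01: 1.00, 0.001: 0.94}[config["learning_rate"]]
    return score


best = taguchi_best_levels(FACTORS, toy_accuracy)
print(best)
```

The point of the orthogonal array is that only 9 configurations are evaluated instead of the 81 in the full grid, while every pair of factor levels still appears together exactly once. The main-effects step assumes that interactions between factors are weak, which may not hold for a real network.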

Keywords

Deep neural networks; Arabic language processing; Taguchi method; Statistical machine translation; Parallel computing

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Faculty of Technology, Department of Basic Education in Technology, University Ferhat ABBAS Setif 1, Setif, Algeria
  2. MISC Laboratory, NTIC Faculty, Abdelhamid Mehri Constantine 2 University, Constantine, Algeria
