Cluster Computing, Volume 21, Issue 1, pp 515–522

Urdu word sense disambiguation using machine learning approach

  • Muhammad Abid (Email author)
  • Asad Habib
  • Jawad Ashraf
  • Abdul Shahid
Article

Abstract

This paper addresses the word sense disambiguation (WSD) problem for the Urdu language. WSD is the task of disambiguating text so that a machine (computer) can deduce the correct sense of each given word. WSD is critical for natural language engineering (NLE) tasks such as machine translation and speech processing. It also improves the performance of other tasks such as text retrieval, document classification, and document clustering. WSD research has progressed to varying extents in the computationally developed languages of the world. For Urdu, NLE research in general, and WSD research in particular, is still in its infancy owing to the rich morphological structure of the language. In this paper, we use machine learning (ML) approaches such as the Bayes net classifier (BN), support vector machine (SVM), and decision tree (DT) for WSD in native-script Urdu text. The results show that BN achieves a better F-measure than SVM and DT. The maximum F-measure of 0.711, over a raw Urdu corpus of 2.5 million words, was recorded for the Bayes net classifier.
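The F-measure used to compare the classifiers combines precision and recall. The following is only a minimal sketch of how such a score might be computed over predicted sense labels. The abstract does not say which averaging scheme or which beta value was used, so the macro-averaging, the default `beta=1.0`, the function name `f_measure`, and the toy sense labels are all assumptions made for illustration.

```python
def f_measure(gold, predicted, beta=1.0):
    """Macro-averaged F-measure over sense labels.

    Assumption: macro-averaging with beta=1 (the abstract does not
    specify how its F-measure was averaged).
    """
    labels = sorted(set(gold) | set(predicted))
    pairs = list(zip(gold, predicted))
    b2 = beta ** 2
    scores = []
    for label in labels:
        # Per-label counts of true positives, false positives, false negatives.
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        if precision + recall == 0:
            scores.append(0.0)
        else:
            scores.append((1 + b2) * precision * recall / (b2 * precision + recall))
    return sum(scores) / len(scores)


# Hypothetical sense labels for four occurrences of one ambiguous word.
gold = ["sense_1", "sense_1", "sense_2", "sense_2"]
predicted = ["sense_1", "sense_2", "sense_2", "sense_2"]
score = f_measure(gold, predicted)
print(round(score, 3))
```

With a real evaluation, `gold` would hold the annotated senses and `predicted` the output of a classifier such as BN, SVM, or DT.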

Keywords

Machine learning approaches for Urdu text; Bayes net classifier; Support vector machine; Decision tree; Urdu word sense disambiguation; Ambiguous named entities in Urdu language

Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  • Muhammad Abid (1), Email author
  • Asad Habib (1)
  • Jawad Ashraf (1)
  • Abdul Shahid (1)
  1. Institute of Information Technology, Kohat University of Science and Technology, Kohat, Khyber, Pakistan
