Effectiveness Analysis of Different POS Tagging Techniques for Bangla Language

  • Conference paper
Smart Systems: Innovations in Computing

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 235)

Abstract

Part-of-speech (POS) tagging plays an important role in natural language processing (NLP) applications such as information retrieval, machine translation, spell checking, and sentiment analysis. Much work has been done on Bangla POS tagging using machine learning, but the results are not yet satisfactory. Moreover, not a single effective research work has been conducted on Bangla POS tagging using deep learning, owing to data scarcity. With this in mind, our work addresses Bangla POS tagging with both machine learning and deep learning approaches. We compare several well-known supervised POS tagging approaches (Brill, HMM, unigram, bigram, trigram, and recurrent neural network) for the Bangla language. Supervised POS tagging requires a large data set to tag accurately. We therefore use a large data set to build a tagger that accepts raw Bangla text and produces POS-tagged output that can be used directly in other NLP applications. After the comparison, we identify the tagging approach that performs best. Bangla is an inflectional language, which makes assigning grammatical categories to its words a difficult task. Nevertheless, our proposed model works well for the Bangla language.
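
As a rough illustration of how the n-gram taggers named in the abstract can be compared, the following minimal Python sketch trains a unigram tagger and a bigram tagger that backs off to the unigram tagger, then scores both by token accuracy. The toy sentences, the tag names (PRON, NOUN, VERB), the default tag, and all function names are assumptions made for illustration. They are not the authors' data set, tag set, or implementation.

```python
from collections import Counter, defaultdict

# Toy hand-made corpus (assumption: illustrative only, not the paper's data).
TRAIN = [
    [("আমি", "PRON"), ("ভাত", "NOUN"), ("খাই", "VERB")],
    [("সে", "PRON"), ("বই", "NOUN"), ("লেখে", "VERB")],
    [("আমি", "PRON"), ("বই", "NOUN"), ("লিখি", "VERB")],
]
TEST = [
    [("সে", "PRON"), ("ভাত", "NOUN"), ("দেখে", "VERB")],
]


def train_unigram(sentences):
    """Map each word to its most frequent tag in the training data."""
    counts = defaultdict(Counter)
    for sent in sentences:
        for word, tag in sent:
            counts[word][tag] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}


def train_bigram(sentences):
    """Map (previous tag, word) to the most frequent tag in that context."""
    counts = defaultdict(Counter)
    for sent in sentences:
        prev = "<s>"  # sentence-start marker
        for word, tag in sent:
            counts[(prev, word)][tag] += 1
            prev = tag
    return {ctx: c.most_common(1)[0][0] for ctx, c in counts.items()}


def tag_words(words, unigram, bigram=None, default="NOUN"):
    """Tag words left to right; back off bigram -> unigram -> default tag."""
    tags, prev = [], "<s>"
    for w in words:
        t = bigram.get((prev, w)) if bigram is not None else None
        if t is None:
            t = unigram.get(w, default)
        tags.append(t)
        prev = t
    return tags


def accuracy(sentences, unigram, bigram=None):
    """Fraction of tokens whose predicted tag matches the gold tag."""
    correct = total = 0
    for sent in sentences:
        words = [w for w, _ in sent]
        gold = [t for _, t in sent]
        pred = tag_words(words, unigram, bigram)
        correct += sum(p == g for p, g in zip(pred, gold))
        total += len(gold)
    return correct / total


uni = train_unigram(TRAIN)
bi = train_bigram(TRAIN)
print("unigram accuracy:", accuracy(TEST, uni))
print("bigram+backoff accuracy:", accuracy(TEST, uni, bi))
```

On a corpus this small the two taggers behave alike, because any unseen word falls through to the default tag. Differences between tagging approaches only become meaningful with a large tagged data set, which is the point the abstract makes about supervised tagging.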

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Cite this paper

Jueal Mia, M., Hassan, M., Biswas, A.A. (2022). Effectiveness Analysis of Different POS Tagging Techniques for Bangla Language. In: Somani, A.K., Mundra, A., Doss, R., Bhattacharya, S. (eds) Smart Systems: Innovations in Computing. Smart Innovation, Systems and Technologies, vol 235. Springer, Singapore. https://doi.org/10.1007/978-981-16-2877-1_13
