Analysis of Bangla Transformation of Sentences Using Machine Learning

  • Conference paper
Key Digital Trends in Artificial Intelligence and Robotics (ICDLAIR 2022)

Abstract

Language processing tools have been developed for many languages, and work on Bengali NLP is growing steadily. Sentence pattern recognition in Bangla is an active topic of interest, and our motivation was to bring this pattern recognition concept into user-friendly applications. We therefore developed an approach that identifies the construction type of a sentence: sorol, jotil or jougik. Our model accepts a Bangla sentence as input, determines its construction type, and outputs that type. We used six well-known supervised machine learning algorithms to classify the three types of sentence formation: Sorol Bakko (simple sentence), Jotil Bakko (complex sentence) and Jougik Bakko (compound sentence). We trained and tested the models on our dataset, which contains 2727 entries collected from various sources, and evaluated them using accuracy, precision, recall, F1-score and the confusion matrix. The decision tree classifier achieved the highest accuracy, 93.72%.
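The evaluation measures named in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' code: the function names, the short label strings and the six toy predictions are assumptions made up for illustration, and they do not reproduce any result from the paper.

```python
from collections import Counter

# Hypothetical label names for the three sentence types
# (simple, complex, compound); the paper's actual encoding is not given.
LABELS = ["sorol", "jotil", "jougik"]


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Build a matrix whose rows are true labels and columns are predictions."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def per_class_metrics(matrix, labels=LABELS):
    """Compute precision, recall and F1 for each label."""
    metrics = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column sum
        actual = sum(matrix[i])                    # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Made-up labels for illustration only; not data from the paper.
y_true = ["sorol", "sorol", "jotil", "jotil", "jougik", "jougik"]
y_pred = ["sorol", "jotil", "jotil", "jotil", "jougik", "sorol"]

cm = confusion_matrix(y_true, y_pred)
print(cm)
print(accuracy(cm))
print(per_class_metrics(cm))
```

In practice these figures would be computed on a held-out test split for each of the six classifiers, so that the per-class scores show which sentence type a given model confuses most often.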

Author information

Correspondence to Moshfiqur Rahman Ajmain.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Das, R.K., Sammi, S.S., Kobra, K., Ajmain, M.R., Khushbu, S.A., Noori, S.R.H. (2023). Analysis of Bangla Transformation of Sentences Using Machine Learning. In: Troiano, L., Vaccaro, A., Kesswani, N., Díaz Rodriguez, I., Brigui, I., Pastor-Escuredo, D. (eds) Key Digital Trends in Artificial Intelligence and Robotics. ICDLAIR 2022. Lecture Notes in Networks and Systems, vol 670. Springer, Cham. https://doi.org/10.1007/978-3-031-30396-8_4
