Abstract
Language processing tools have been developed for many languages, and work on Bengali NLP grows richer by the day. Recognizing sentence patterns in Bangla is one topic that has drawn attention. Our motivation was to bring this pattern-recognition concept into user-friendly applications. We therefore developed an approach that identifies the construction type of a sentence: sorol, jotil, or jougik. Our model accepts a Bangla sentence as input and outputs its construction type. Six popular, well-known supervised machine learning algorithms were used to classify the three types of sentence formation: Sorol Bakko (simple sentence), Jotil Bakko (complex sentence), and Jougik Bakko (compound sentence). We trained and tested the models on our dataset, which contains 2727 samples collected from various sources. We evaluated the results using accuracy, precision, recall, F1-score, and the confusion matrix. The decision tree classifier achieved the highest accuracy, 93.72%.
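The evaluation measures named in the abstract can be illustrated with a minimal sketch for the three sentence classes. This is not the authors' code: the label strings, the function names, and the toy predictions below are all hypothetical, invented only to show how the confusion matrix, accuracy, precision, recall, and F1-score relate to one another.

```python
from collections import Counter

# Hypothetical label strings for simple, complex, and compound sentences.
LABELS = ["sorol", "jotil", "jougik"]


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def per_class_scores(matrix, labels=LABELS):
    """Return {label: (precision, recall, f1)} from a confusion matrix."""
    scores = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column total
        actual = sum(matrix[i])                    # row total
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[label] = (precision, recall, f1)
    return scores


def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Toy labels invented for illustration; not taken from the paper's dataset.
y_true = ["sorol", "sorol", "jotil", "jotil", "jougik", "jougik"]
y_pred = ["sorol", "jotil", "jotil", "jotil", "jougik", "sorol"]

cm = confusion_matrix(y_true, y_pred)
scores = per_class_scores(cm)
acc = accuracy(cm)

print(cm)
print(f"accuracy = {acc:.4f}")
for label, (p, r, f) in scores.items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

In practice a library routine would normally replace these hand-written helpers; the plain-Python version is used here only so that each quantity is visible as a row total, column total, or diagonal entry of the confusion matrix.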
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Das, R.K., Sammi, S.S., Kobra, K., Ajmain, M.R., Khushbu, S.A., Noori, S.R.H. (2023). Analysis of Bangla Transformation of Sentences Using Machine Learning. In: Troiano, L., Vaccaro, A., Kesswani, N., Díaz Rodriguez, I., Brigui, I., Pastor-Escuredo, D. (eds) Key Digital Trends in Artificial Intelligence and Robotics. ICDLAIR 2022. Lecture Notes in Networks and Systems, vol 670. Springer, Cham. https://doi.org/10.1007/978-3-031-30396-8_4
DOI: https://doi.org/10.1007/978-3-031-30396-8_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-30395-1
Online ISBN: 978-3-031-30396-8
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)