Abstract
Suicide is a tragic reaction to stressful life situations. The suicide rate in Telangana is rising noticeably every year [1]. Most of those affected are adolescents and young adults, although other age groups are affected as well. There is therefore an urgent need for research on suicidal ideation and on preventive methods that support mental health professionals and psychotherapists. This paper aims to develop a technological solution to this problem. Suicides can be prevented if the mental health condition of a person with suicidal ideation is identified and its severity is predicted early [2]. To that end, we apply machine learning algorithms to identify persons with suicidal ideation from data recorded as textual questionnaires during an adolescent's visit to a mental health professional. The data is recorded in the native Telugu language during the session, because most cases involve people who are illiterate [1, 3]. Classifying patient test data more accurately therefore requires a Telugu language corpus that covers suicidal ideation. This paper offers insight into the creation of such a suicidal-ideation corpus in native Telugu.
References
Nilesh V (2019) Telangana has third-highest suicide rate in India. NCRB. https://www.newindianexpress.com/states/telangana/2019/nov/11/telangana-has-third-highest-suicide-rate-in-india-ncrb-2060087.html
Choudhary N, Singh R, Bindlish I, Shrivastava M (2018a) Emotions are universal: learning sentiment based representations of resource-poor languages using siamese networks. arXiv preprint arXiv:1804.00805
Rohit PS, State records highest suicide rate in country. https://www.thehindu.com/news/national/telangana/state-records-highest-suicide-rate-in-country/article8433720.ece#comments_14219168
Naidu R, Bharti SK, Babu KS, Mohapatra RK (2017) Sentiment analysis using Telugu SentiWordNet. In: 2017 international conference on wireless communications, signal processing and networking (WiSPNET), Chennai, pp 666–670. https://doi.org/10.1109/wispnet.2017.8299844
Magdum D, Dubey MS, Patil T, Shah R, Belhe S, Kulkarni M (2015) Methodology for designing and creating Hindi speech corpus. In: 2015 international conference on signal processing and communication engineering systems, Guntur, pp 336–339. https://doi.org/10.1109/spaces.2015.7058279
Gangula RR, Mamidi R (2018) Resource creation towards automated sentiment analysis in Telugu (a low resource language) and integrating multiple domain sources to enhance sentiment prediction. In: Language resources and evaluation conference (LREC), Miyazaki, Japan
Srirangam V, Abhinav A, Singh V, Shrivastava M (2019) Corpus creation and analysis for named entity recognition in Telugu-English code-mixed social media data. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop. https://doi.org/10.18653/v1/p19-2025
Abdelali A, Guzman F, Sajjad H, Vogel S (2014) The AMARA corpus: Building parallel language resources for the educational domain. In: Proceedings of the ninth international conference on language resources and evaluation (LREC’14). European Language Resources Association (ELRA), Reykjavik, Iceland, pp 1856–1862
Lu X (2017) Automated measurement of syntactic complexity in corpus-based L2 writing research and implications for writing assessment. SAGE J. https://doi.org/10.1177/0265532217710675
Choi Y, Wiebe J (2014) Effectwordnet: sense-level lexicon acquisition for opinion inference. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), pp 1181–1191
Wołk K, Marasek K (2014) A sentence meaning based alignment method for parallel text corpora preparation. Adv Intell Syst Comput 275:107–114. arXiv:1509.09090
Chen T, Guestrin C (2016) XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 785–794
Aguilar WG, Alulema D, Limaico A, Sandoval D (2017) Development and verification of a verbal corpus based on natural language for Ecuadorian dialect. In: IEEE 11th International Conference on Semantic Computing (ICSC), San Diego, CA, 2017, pp 515–519. https://doi.org/10.1109/icsc.2017.82
Choudhary N, Singh R, Bindlish I, Shrivastava M (2018b) Sentiment analysis of code-mixed languages leveraging resource rich languages. arXiv preprint arXiv:1804.00806
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Soumya, K., Garg, V.K. (2021). Building a Language Data Set in Telugu Using Machine Learning Techniques to Address Suicidal Ideation and Behaviors in Adolescents. In: Choudhary, A., Agrawal, A.P., Logeswaran, R., Unhelkar, B. (eds) Applications of Artificial Intelligence and Machine Learning. Lecture Notes in Electrical Engineering, vol 778. Springer, Singapore. https://doi.org/10.1007/978-981-16-3067-5_1
DOI: https://doi.org/10.1007/978-981-16-3067-5_1
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-3066-8
Online ISBN: 978-981-16-3067-5
eBook Packages: Computer Science, Computer Science (R0)