Sentiment Analysis of Code-Mixed Languages Leveraging Resource Rich Languages

Choudhary, Nurendra; Singh, Rajat; Bindlish, Ishita; Shrivastava, Manish

doi:10.1007/978-3-031-23804-8_9

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13397)

Included in the following conference series:

International Conference on Computational Linguistics and Intelligent Text Processing

Abstract

Code-mixed data poses an important challenge for natural language processing because its characteristics differ completely from the traditional structures of standard languages.

In this paper, we propose a novel approach called Sentiment Analysis of Code-Mixed Text (SACMT), which uses contrastive learning to classify sentences by sentiment: positive, negative, or neutral. We utilize the shared parameters of siamese networks to map sentences in code-mixed and standard languages to a common sentiment space. We also introduce a basic clustering-based preprocessing method to capture variations of transliterated code-mixed words. Our experiments reveal that SACMT outperforms state-of-the-art approaches to sentiment analysis of code-mixed text by 7.6% in accuracy and 10.1% in F-score.
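The idea of shared parameters and a contrastive objective can be illustrated with a minimal sketch. It is not the authors' implementation: the one-layer `tanh` encoder, the feature vectors, the margin value, and all function names are assumptions chosen for illustration, and the loss follows the common margin-based contrastive form rather than anything stated in this abstract. The point it shows is that a single set of weights encodes both the code-mixed and the standard-language sentence, so both land in the same space, where same-sentiment pairs are pulled together and different-sentiment pairs are pushed at least a margin apart.

```python
import math
import random


def make_encoder(in_dim, out_dim, seed=0):
    """Build a toy encoder (hypothetical stand-in for a real sentence encoder).

    The returned function closes over ONE weight matrix, so every sentence,
    code-mixed or standard, is projected with the same shared parameters.
    """
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)]
               for _ in range(out_dim)]

    def encode(vec):
        # One tanh layer: each output unit is a squashed weighted sum.
        return [math.tanh(sum(w * x for w, x in zip(row, vec)))
                for row in weights]

    return encode


def euclidean(a, b):
    """Euclidean distance between two embeddings of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(distance, same_sentiment, margin=1.0):
    """Margin-based contrastive loss for a single pair.

    Same-sentiment pairs are penalised by their squared distance;
    different-sentiment pairs are penalised only while they are closer
    than `margin` (an assumed hyperparameter).
    """
    if same_sentiment:
        return distance ** 2
    return max(0.0, margin - distance) ** 2


# Both branches of the siamese pair use the same `encode` function.
encode = make_encoder(in_dim=4, out_dim=3)
code_mixed = [0.2, 0.9, 0.1, 0.4]  # made-up features, code-mixed sentence
standard = [0.3, 0.8, 0.0, 0.5]    # made-up features, standard-language sentence

d = euclidean(encode(code_mixed), encode(standard))
loss_if_same = contrastive_loss(d, same_sentiment=True)
loss_if_different = contrastive_loss(d, same_sentiment=False)
```

In a trained system the weights would be updated to reduce this loss over many labelled pairs; the sketch only evaluates the loss for one pair with fixed random weights.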

N. Choudhary and R. Singh contributed equally to this work.

Notes

1. https://www.cs.york.ac.uk/semeval-2013/task2/index.html

Author information

Authors and Affiliations

Language Technologies Research Centre (LTRC), Kohli Center on Intelligent Systems (KCIS), International Institute of Information Technology, Hyderabad, India
Nurendra Choudhary, Rajat Singh, Ishita Bindlish & Manish Shrivastava

Corresponding author

Correspondence to Nurendra Choudhary.

Editor information

Editors and Affiliations

Instituto Politécnico Nacional, Mexico City, Mexico
Alexander Gelbukh

About this paper

Cite this paper

Choudhary, N., Singh, R., Bindlish, I., Shrivastava, M. (2023). Sentiment Analysis of Code-Mixed Languages Leveraging Resource Rich Languages. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2018. Lecture Notes in Computer Science, vol 13397. Springer, Cham. https://doi.org/10.1007/978-3-031-23804-8_9

DOI: https://doi.org/10.1007/978-3-031-23804-8_9
Published: 26 February 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-23803-1
Online ISBN: 978-3-031-23804-8
eBook Packages: Computer Science, Computer Science (R0)
