Abstract
In this paper, we present an approach to automatic sentence boundary detection for Hindi–English code-mixed social media text. We develop a corpus of Hindi–English code-mixed posts collected from Facebook and study in depth the limitations of existing rule-based sentence boundary detection systems when applied to code-mixed social media text. Our proposed approach is rule-based; tested on our corpus, it outperforms the existing approaches.
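The abstract does not list the rules themselves. As an illustration only, the following is a minimal sketch of what a rule-based sentence boundary detector for romanized Hindi–English text might look like. The abbreviation list, the two rules, and the name `split_sentences` are assumptions made for this example and are not taken from the paper.

```python
import re

# Hypothetical abbreviation list for illustration; not from the paper.
ABBREVIATIONS = {"dr", "mr", "mrs", "prof", "etc", "vs"}

# Candidate boundary: a run of '.', '!', '?' or the Devanagari danda (U+0964).
BOUNDARY = re.compile(r"[.!?\u0964]+")


def split_sentences(text):
    """Split text into sentences using two simple, assumed rules."""
    sentences = []
    start = 0
    for match in BOUNDARY.finditer(text):
        end = match.end()
        # Rule 1: punctuation inside a token (URLs, decimals) is not a
        # boundary; it must be followed by whitespace or end of text.
        if end < len(text) and not text[end].isspace():
            continue
        # Rule 2: a single period right after a known abbreviation
        # is not a boundary.
        words = text[start:match.start()].split()
        previous = words[-1].lower() if words else ""
        if match.group() == "." and previous in ABBREVIATIONS:
            continue
        sentence = text[start:end].strip()
        if sentence:
            sentences.append(sentence)
        start = end
    # Social media posts often end without terminal punctuation.
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


post = ("Kal movie dekhi... bahut achhi thi!! "
        "Dr. Sharma bhi aaye the. www.example.com dekho")
for sentence in split_sentences(post):
    print(sentence)
```

This sketch treats repeated punctuation such as `...` and `!!` as a single boundary, which is one plausible choice for informal text. A real system would also need rules for emoticons, missing punctuation, and other social media conventions.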
Acknowledgements
Thanks to Assistant Professor Anupam Jamatia and Assistant Professor Dwijen Rudrapal, Computer Science and Engineering Department, National Institute of Technology, Agartala for their support and guidance throughout our work.
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Singh, A., Singh, B.P., Poddar, A.K., Singh, A. (2018). Sentence Boundary Detection for Hindi–English Social Media Text. In: Sa, P., Bakshi, S., Hatzilygeroudis, I., Sahoo, M. (eds) Recent Findings in Intelligent Computing Techniques. Advances in Intelligent Systems and Computing, vol 709. Springer, Singapore. https://doi.org/10.1007/978-981-10-8633-5_22
DOI: https://doi.org/10.1007/978-981-10-8633-5_22
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-8632-8
Online ISBN: 978-981-10-8633-5
eBook Packages: Engineering, Engineering (R0)