Abstract
This paper reports a system for retrieving similar Frequently Asked Questions (FAQs) when queries are submitted via Short Message Service (SMS). The system was developed to participate in the FIRE 2011 SMS-based FAQ Retrieval track (Monolingual). SMS text contains various user improvisations and typographical errors. The proposed approach uses approximate string matching (ASM) techniques to normalize the SMS query with minimal linguistic resources. The mean reciprocal rank (MRR) obtained for English, Hindi and Malayalam is 0.85, 0.93 and 0.92, respectively.
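To make the two technical ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that approximate string matching is done with Levenshtein edit distance against a small in-domain vocabulary; the abstract does not say which ASM technique the system uses. The function names (`levenshtein`, `normalize_token`, `normalize_query`, `mean_reciprocal_rank`) and the sample vocabulary are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Sequence


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion and substitution."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def normalize_token(token: str, vocabulary: Sequence[str]) -> str:
    """Map a noisy token to the closest vocabulary word.

    Assumption: ties are broken by vocabulary order (first minimum wins).
    """
    token = token.lower()
    return min(vocabulary, key=lambda word: levenshtein(token, word))


def normalize_query(sms: str, vocabulary: Sequence[str]) -> List[str]:
    """Normalize every whitespace-separated token of an SMS query."""
    return [normalize_token(tok, vocabulary) for tok in sms.split()]


def mean_reciprocal_rank(ranks: Sequence[Optional[int]]) -> float:
    """MRR over queries; each entry is the 1-based rank of the first
    correct FAQ, or None when no correct FAQ was retrieved."""
    return sum(1.0 / r for r in ranks if r) / len(ranks)


if __name__ == "__main__":
    vocab = ["what", "is", "train", "time"]  # hypothetical FAQ vocabulary
    print(normalize_query("wat is trn tim", vocab))
    print(mean_reciprocal_rank([1, 2, None]))
```

A plain edit distance treats all character edits equally, so a real system would likely need extra handling for SMS-specific habits such as vowel dropping or phonetic substitutions; this sketch leaves that out.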
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Singhal, K., Arora, G., Kumari, S., Majumder, P. (2013). SMS Normalization for FAQ Retrieval. In: Majumder, P., Mitra, M., Bhattacharyya, P., Subramaniam, L.V., Contractor, D., Rosso, P. (eds) Multilingual Information Access in South Asian Languages. Lecture Notes in Computer Science, vol 7536. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40087-2_16
DOI: https://doi.org/10.1007/978-3-642-40087-2_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40086-5
Online ISBN: 978-3-642-40087-2
eBook Packages: Computer Science, Computer Science (R0)