Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7536)

Abstract

This paper describes a system for retrieving similar frequently asked questions (FAQs) when queries are submitted via Short Message Service (SMS). The system was developed for the FIRE 2011 SMS-based FAQ Retrieval track (Monolingual). SMS text contains various user improvisations and typographical errors. The proposed approach uses approximate string matching (ASM) techniques to normalize the SMS query with minimal linguistic resources. The mean reciprocal rank (MRR) obtained for English, Hindi and Malayalam is 0.85, 0.93 and 0.92, respectively.
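
As an illustration of the general idea, ASM-based normalization can be sketched as mapping each noisy SMS token to its nearest in-vocabulary word by edit distance. The abstract does not specify the authors' actual matching procedure, so everything below is an assumption made for illustration: the function names (`edit_distance`, `normalize_token`, `normalize_query`), the `max_ratio` threshold, and the toy vocabulary are hypothetical, and plain Levenshtein distance stands in for whatever ASM technique the paper uses.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalize_token(token: str, vocabulary: set, max_ratio: float = 0.5) -> str:
    """Return the closest vocabulary word, or the token itself if none is close.

    max_ratio is a hypothetical cut-off on distance / longer-word-length;
    it is not a value taken from the paper.
    """
    token = token.lower()
    if not vocabulary or token in vocabulary:
        return token
    # Ties on distance are broken alphabetically so the result is deterministic.
    best = min(vocabulary, key=lambda w: (edit_distance(token, w), w))
    distance = edit_distance(token, best)
    if distance / max(len(token), len(best)) <= max_ratio:
        return best
    return token


def normalize_query(sms: str, vocabulary: set) -> str:
    """Normalize every whitespace-separated token of an SMS query."""
    return " ".join(normalize_token(t, vocabulary) for t in sms.split())


if __name__ == "__main__":
    # Toy vocabulary, assumed to be drawn from the FAQ collection.
    vocab = {"what", "is", "the", "admission", "deadline", "please"}
    print(normalize_query("wat is the admision dedline", vocab))
```

With the toy vocabulary above, the query "wat is the admision dedline" is normalized to "what is the admission deadline". A sketch like this needs only a word list built from the FAQ collection, which is in keeping with the stated goal of minimal linguistic resources.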

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Singhal, K., Arora, G., Kumari, S., Majumder, P. (2013). SMS Normalization for FAQ Retrieval. In: Majumder, P., Mitra, M., Bhattacharyya, P., Subramaniam, L.V., Contractor, D., Rosso, P. (eds) Multilingual Information Access in South Asian Languages. Lecture Notes in Computer Science, vol 7536. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40087-2_16

  • DOI: https://doi.org/10.1007/978-3-642-40087-2_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-40086-5

  • Online ISBN: 978-3-642-40087-2

  • eBook Packages: Computer Science, Computer Science (R0)
