Preprocessing Phase of Punjabi Language Text Summarization

  • Conference paper
Information Systems for Indian Languages (ICISIL 2011)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 139)

Abstract

Punjabi text summarization is the process of condensing a source Punjabi text into a shorter version while preserving its information content and overall meaning. It comprises two phases: (1) preprocessing and (2) processing. Preprocessing produces a structured representation of the Punjabi text. This paper concentrates on the preprocessing phase of Punjabi text summarization. Its sub-phases are: Punjabi word boundary identification, Punjabi stop word elimination, Punjabi noun stemming, finding common English-Punjabi noun words, finding Punjabi proper nouns, Punjabi sentence boundary identification, and identification of Punjabi cue phrases in a sentence.
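Three of the sub-phases listed in the abstract (sentence boundary identification, word boundary identification, and stop word elimination) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stop word list is a small illustrative sample, the sentence terminators (danda, double danda, `?`, `!`) and the Gurmukhi code point range used for tokenization are assumptions, and all function names are hypothetical.

```python
import re

# Illustrative sample only; the paper's actual stop word list is not reproduced here.
STOP_WORDS = {"ਦਾ", "ਦੇ", "ਦੀ", "ਨੇ", "ਹੈ", "ਅਤੇ", "ਨੂੰ", "ਤੇ"}

# Assumed sentence terminators: danda (U+0964), double danda (U+0965), '?' and '!'.
SENTENCE_END = re.compile(r"[\u0964\u0965?!]")

# Assumed word pattern: runs of code points in the Gurmukhi block (U+0A00 to U+0A7F).
WORD = re.compile(r"[\u0A00-\u0A7F]+")


def split_sentences(text):
    """Split text at the assumed terminators and drop empty pieces."""
    return [s.strip() for s in SENTENCE_END.split(text) if s.strip()]


def split_words(sentence):
    """Return the Gurmukhi word tokens of one sentence."""
    return WORD.findall(sentence)


def remove_stop_words(words):
    """Drop tokens that appear in the illustrative stop word list."""
    return [w for w in words if w not in STOP_WORDS]


def preprocess(text):
    """Return one list of content words per sentence."""
    return [remove_stop_words(split_words(s)) for s in split_sentences(text)]


if __name__ == "__main__":
    sample = "ਰਾਮ ਨੇ ਕਿਤਾਬ ਪੜ੍ਹੀ। ਇਹ ਕਿਤਾਬ ਚੰਗੀ ਹੈ।"
    for words in preprocess(sample):
        print(words)
```

The remaining sub-phases (noun stemming, proper noun and common English-Punjabi noun detection, cue phrase identification) depend on language resources such as suffix lists and dictionaries, so they are left out of this sketch.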

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gupta, V., Lehal, G.S. (2011). Preprocessing Phase of Punjabi Language Text Summarization. In: Singh, C., Singh Lehal, G., Sengupta, J., Sharma, D.V., Goyal, V. (eds) Information Systems for Indian Languages. ICISIL 2011. Communications in Computer and Information Science, vol 139. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19403-0_43

  • DOI: https://doi.org/10.1007/978-3-642-19403-0_43

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-19402-3

  • Online ISBN: 978-3-642-19403-0

  • eBook Packages: Computer Science, Computer Science (R0)
