Preprocessing Phase of Punjabi Language Text Summarization

Gupta, Vishal; Lehal, Gurpreet Singh

doi:10.1007/978-3-642-19403-0_43

Part of the book series: Communications in Computer and Information Science (CCIS, volume 139)

Included in the following conference series:

International Conference on Information Systems for Indian Languages

Abstract

Punjabi text summarization is the process of condensing a source Punjabi text into a shorter version while preserving its information content and overall meaning. It comprises two phases: (1) preprocessing and (2) processing. Preprocessing produces a structured representation of the Punjabi text. This paper concentrates on the preprocessing phase of Punjabi text summarization. Its sub-phases are: Punjabi word boundary identification, Punjabi stop word elimination, Punjabi noun stemming, finding common English-Punjabi noun words, finding Punjabi proper nouns, Punjabi sentence boundary identification, and identification of Punjabi cue phrases in a sentence.
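As a rough illustration of three of the sub-phases listed above (sentence boundary identification, word boundary identification, and stop word elimination), the following is a minimal sketch, not the authors' implementation. It assumes that sentences end at the danda (U+0964), the double danda (U+0965), "?" or "!", and that a word is a maximal run of characters from the Gurmukhi Unicode block (U+0A00 to U+0A7F). The function names and the tiny stop word list are hypothetical and chosen only for the example.

```python
import re

# Assumption: a sentence ends at a danda (U+0964), double danda (U+0965), '?' or '!'.
SENTENCE_END = re.compile(r"[\u0964\u0965?!]+")

# Assumption: a word is a maximal run of Gurmukhi block characters (U+0A00-U+0A7F).
GURMUKHI_WORD = re.compile(r"[\u0A00-\u0A7F]+")

# Hypothetical, deliberately tiny stop word list (illustration only,
# not the list used in the paper).
STOP_WORDS = {"ਦਾ", "ਦੇ", "ਦੀ", "ਹੈ", "ਤੇ", "ਨੂੰ", "ਅਤੇ"}


def split_sentences(text):
    """Sentence boundary identification: split on sentence-final marks."""
    return [part.strip() for part in SENTENCE_END.split(text) if part.strip()]


def split_words(sentence):
    """Word boundary identification: extract runs of Gurmukhi characters."""
    return GURMUKHI_WORD.findall(sentence)


def remove_stop_words(words):
    """Stop word elimination: drop words found in the stop word list."""
    return [word for word in words if word not in STOP_WORDS]


def preprocess(text):
    """Return one list of content words per sentence."""
    return [remove_stop_words(split_words(s)) for s in split_sentences(text)]
```

Calling `preprocess` on a short two-sentence text returns one word list per sentence with the listed stop words removed. The remaining sub-phases (noun stemming, proper noun and common English-Punjabi noun lookup, cue phrase identification) depend on language resources that the abstract does not specify, so they are left out of this sketch.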

Author information

Authors and Affiliations

Computer Science & Engineering, University Institute of Engineering & Technology, Panjab University, Chandigarh, India
Vishal Gupta
Department of Computer Science, Punjabi University, Patiala, Punjab, India
Gurpreet Singh Lehal

Editor information

Editors and Affiliations

Department of Computer Science, Punjabi University, Patiala, India
Chandan Singh, Gurpreet Singh Lehal, Jyotsna Sengupta, Dharam Veer Sharma & Vishal Goyal

About this paper

Cite this paper

Gupta, V., Lehal, G.S. (2011). Preprocessing Phase of Punjabi Language Text Summarization. In: Singh, C., Singh Lehal, G., Sengupta, J., Sharma, D.V., Goyal, V. (eds) Information Systems for Indian Languages. ICISIL 2011. Communications in Computer and Information Science, vol 139. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19403-0_43

DOI: https://doi.org/10.1007/978-3-642-19403-0_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19402-3
Online ISBN: 978-3-642-19403-0
eBook Packages: Computer Science, Computer Science (R0)