Venn Diagram-Based Feature Ranking Technique for Key Term Extraction

Chakraborty, Neelotpal; Mukherjee, Sambit; Naskar, Ashes Ranjan; Malakar, Samir; Sarkar, Ram; Nasipuri, Mita

doi:10.1007/978-981-10-3153-3_33

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 515)

Abstract

Text documents in a large collection are usually classified on the basis of key terms that distinguish a particular document set from the universal set. These key terms are generally identified using feature sets, which can be statistical, rule-based, linguistic, or hybrid in nature. This paper develops a simple Venn diagram-based technique to prioritize the standard features available in the literature, which in turn reduces the dimension of the feature sets used for document classification.

Acknowledgements

The authors are thankful to the Center for Microprocessor Applications for Training Education and Research (CMATER) of C.S.E. Dept., JU, for providing infrastructural facilities during the progress of the work. The work reported here has been partially funded by the Technical Education Quality Improvement Programme Phase–II (TEQIP-II), Jadavpur University, Kolkata, India.

Author information

Authors and Affiliations

Jadavpur University, Kolkata, India
Neelotpal Chakraborty, Ashes Ranjan Naskar, Ram Sarkar & Mita Nasipuri
Future Institute of Engineering and Management, Kolkata, India
Sambit Mukherjee
MCKV Institute of Engineering, Howrah, India
Samir Malakar

Corresponding author

Correspondence to Neelotpal Chakraborty.

Editor information

Editors and Affiliations

Comp. Sci. & Engg. Dept., ANITS, Visakhapatnam, Andhra Pradesh, India
Suresh Chandra Satapathy
Dept. of ECE, Shri Ramswaroop Mem. Group of Prof. Clg, Lucknow, Uttar Pradesh, India
Vikrant Bhateja
SCIS, University of Hyderabad, Hyderabad, India
Siba K. Udgata
School of Computer Engineering, KIIT University, Bhubaneswar, Odisha, India
Prasant Kumar Pattnaik

Cite this paper

Chakraborty, N., Mukherjee, S., Naskar, A.R., Malakar, S., Sarkar, R., Nasipuri, M. (2017). Venn Diagram-Based Feature Ranking Technique for Key Term Extraction. In: Satapathy, S., Bhateja, V., Udgata, S., Pattnaik, P. (eds) Proceedings of the 5th International Conference on Frontiers in Intelligent Computing: Theory and Applications. Advances in Intelligent Systems and Computing, vol 515. Springer, Singapore. https://doi.org/10.1007/978-981-10-3153-3_33

DOI: https://doi.org/10.1007/978-981-10-3153-3_33
Published: 17 March 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-3152-6
Online ISBN: 978-981-10-3153-3
eBook Packages: Engineering, Engineering (R0)
