Review on Natural Language Processing Trends and Techniques Using NLTK

  • Deepa Yogish
  • T. N. Manjunath
  • Ravindra S. Hegadi
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1037)

Abstract

In the modern age of information explosion, millions of gigabytes of data are generated every day in the form of documents, web pages, e-mail, social media text, blogs, and so on. Effective and efficient Natural Language Processing (NLP) techniques have therefore become crucial for information retrieval systems, text summarization, sentiment analysis, information extraction, named entity recognition, relationship extraction, social media monitoring, text mining, language translation programs, and question answering systems. NLP is a computational technique that applies different levels of linguistic analysis to convert natural language into a representation that is useful for further processing. It is recognized as a challenging task in computer science and artificial intelligence because understanding human language depends not only on the words themselves but also on how those words are linked together to form a precise meaning. Although language is one of the easiest things for humans to learn, training computers to understand it is difficult because of the ambiguity of language syntax and semantics. NLP techniques process documents or text in ways that reduce storage space and index size, and they help interpret the given information so that it satisfies the user's need. They improve the efficiency of information retrieval and the effectiveness of document processing. Common NLP techniques include tokenization, stop word removal, stemming, lemmatization, part-of-speech tagging, chunking, and named entity recognition, all of which enhance the performance of NLP applications. The Natural Language Toolkit (NLTK) is a well-suited starting point for learning the basics of the NLP domain. NLTK is a collection of application packages that supports researchers and learners in natural language processing, computational linguistics, and artificial intelligence.
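
As a rough illustration of the first preprocessing steps named in the abstract, the following is a minimal, dependency-free sketch of tokenization, stop word removal, and a crude suffix-stripping stand-in for stemming. It does not use NLTK and is not the Porter algorithm; the stop word list, the suffix list, and the function names `tokenize`, `remove_stop_words`, `crude_stem`, and `preprocess` are all assumptions made for this example. NLTK offers fuller counterparts such as `nltk.word_tokenize`, `nltk.corpus.stopwords`, and `nltk.stem.PorterStemmer`.

```python
import re

# Tiny illustrative stop word list (an assumption; real lists are far larger).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "and", "in", "to", "for"}

# Suffixes removed by the toy stemmer, checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the illustrative stop word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def crude_stem(token):
    """Strip at most one suffix, keeping at least three characters of stem."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Run tokenization, stop word removal, and crude stemming in sequence."""
    return [crude_stem(t) for t in remove_stop_words(tokenize(text))]


if __name__ == "__main__":
    # Prints ['retrieval', 'system', 'index', 'document']
    print(preprocess("The retrieval systems are indexing documents."))
```

Dropping stop words and collapsing inflected forms is what shrinks the index in an information retrieval setting, since fewer distinct terms need to be stored. Lemmatization, part-of-speech tagging, chunking, and named entity recognition need linguistic resources and are not attempted in this sketch.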

Keywords

Natural Language Processing (NLP), Artificial Intelligence (AI), Information Retrieval (IR), Natural Language Tool Kit (NLTK)

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. VTU-RC-ISE, BMSIT&M, Bangalore, India
  2. ISE, BMSIT&M, Bangalore, India
  3. School of Computational Sciences, Solapur University, Solapur, India
