Natural Language Processing and Machine Learning for Big Data

  • Joy Mustafi
Part of the Studies in Big Data book series (SBD, volume 17)


This chapter focuses on how big data challenges can be handled from a data science perspective. The data available for analysis differ in volume, velocity, variety, and veracity. The objective is to resolve some of these real-world problems using natural language processing, which transforms unstructured data into meaningful structured information, and machine learning, which draws further insight from the information that is available or derived. Combining multiple algorithms can play a major role in the overall field of cognitive computing. The chapter covers the important methodologies and explains what to apply, where, and when. Some open research problems are shared for budding data scientists. The chapter may serve as a basic introduction to data science.


Natural Language Processing; Sentiment Analysis; Inductive Logic Programming; Name Entity Recognition; Word Sense Disambiguation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Cognitive Computing, Data Science, Advanced Analytics, Bangalore, India
