Natural Language Processing and Machine Learning for Big Data
This chapter focuses on how Big Data challenges can be handled from a data science perspective. The data available for analysis vary in volume, velocity, variety, and veracity. The objective is to address some of these real-world problems using natural language processing, which transforms unstructured data into meaningful structured information, and machine learning, which draws further insight from the information that is available or derived. Combining multiple algorithms can play a major role in the broader field of cognitive computing. The chapter covers the important methodologies and explains what to apply, where, and when. Some open research problems are shared for budding data scientists. The chapter may serve as a basic introduction to data science.
Keywords: Natural Language Processing; Sentiment Analysis; Inductive Logic Programming; Named Entity Recognition; Word Sense Disambiguation