Study of Sentiment Analysis Using Hadoop

Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 654)

Abstract

In today's Internet-driven world, people express their views and feelings about specific topics or entities through various social media applications. These user posts give organizations a significant opportunity to increase their market value by analyzing them and using the resulting information in decision making. The sentiment of such posts can be extracted using various machine learning and lexicon-based approaches. As more and more people move online, huge volumes of data are produced every second, and the challenge is to store this data and process it efficiently in real time to infer knowledge from it. This paper presents different approaches for performing real-time, scalable sentiment analysis with Hadoop in a time-efficient manner. It surveys how Hadoop and its component tools, such as MapReduce, Mahout, and Hive, are used in various scholarly articles.
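As a rough illustration of how a lexicon-based approach could be expressed in the MapReduce style, the sketch below scores posts against a word list in a map step and aggregates the scores per topic in a reduce step. It is not taken from any of the surveyed articles: the lexicon, the sample posts, and the names `map_post`, `reduce_scores`, and `run_job` are all hypothetical, and the job is simulated locally in plain Python rather than run on a Hadoop cluster.

```python
from collections import defaultdict

# Hypothetical toy lexicon: word -> polarity score (an assumption for illustration).
LEXICON = {"good": 1, "great": 2, "love": 2, "bad": -1, "terrible": -2, "hate": -2}


def map_post(topic, text):
    """Map step: emit (topic, score), where score sums the polarity of known words."""
    score = sum(LEXICON.get(word.strip(".,!?").lower(), 0) for word in text.split())
    yield topic, score


def reduce_scores(topic, scores):
    """Reduce step: combine the per-post scores of one topic into a total and a label."""
    total = sum(scores)
    label = "positive" if total > 0 else "negative" if total < 0 else "neutral"
    return topic, total, label


def run_job(posts):
    """Local stand-in for the shuffle phase: group mapper output by key, then reduce."""
    grouped = defaultdict(list)
    for topic, text in posts:
        for key, value in map_post(topic, text):
            grouped[key].append(value)
    return [reduce_scores(key, values) for key, values in sorted(grouped.items())]


if __name__ == "__main__":
    # Made-up sample posts, keyed by the topic they mention.
    sample_posts = [
        ("phone", "I love this phone, great camera!"),
        ("phone", "Battery life is bad."),
        ("laptop", "Terrible keyboard, I hate it."),
    ]
    for topic, total, label in run_job(sample_posts):
        print(topic, total, label)
```

On a real cluster the same two functions would typically be split into separate mapper and reducer programs, for example under Hadoop Streaming, with Hadoop itself handling the grouping that `run_job` imitates here.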

Keywords

Sentiment analysis, Twitter, Hadoop, MapReduce, Hive, Mahout

Notes

Acknowledgements

The author is grateful to Dr. Divakar Singh, HOD, Department of CSE, UIT, Barkatullah University, Bhopal, for his valuable suggestions and comments on this paper. The author also wishes to thank Puneet Sharma for his continuous encouragement, inspiration, and guidance, drawing on his extensive experience in data mining.

Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  1. IBM India, Bengaluru, India
  2. UIT, Barkatullah University, Bhopal, India
