Analysis and Prediction of Customers’ Reviews with Amazon Dataset on Products

  • Shitanshu Jain
  • S. C. Jain
  • Santosh Vishwakarma
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1077)


The main objective of this paper is to gain a deeper understanding of the text classification methods used in text mining. The paper describes several methods and algorithms used in text mining. Text preprocessing steps such as tokenization, case folding, stemming, and stopword removal are applied. Customer reviews posted on the Amazon website serve as the training set for several classifiers: Naive Bayes, K-nearest neighbor (KNN), random forest, and decision tree. The performance of each method is measured with standard evaluation parameters such as precision, recall, and kappa. The results show that the K-nearest neighbor method gives the best performance among the classifiers compared on the same dataset.
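The abstract names the preprocessing steps and evaluation measures but does not show how they are computed. The following is a minimal Python sketch, not the authors' implementation. The stopword list, the suffix-stripping rule (a crude stand-in for a real stemmer such as Porter's), and all function names are assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "a", "an", "is", "it", "and", "this", "was", "i", "to", "of"}


def crude_stem(token):
    """Strip a few common English suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Case-fold, tokenize on letters, drop stopwords, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOPWORDS]


def precision_recall(y_true, y_pred, positive):
    """Precision and recall for one class treated as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    print(preprocess("The battery was working great and it charged quickly"))
    y_true = ["pos", "pos", "neg", "neg"]
    y_pred = ["pos", "neg", "neg", "neg"]
    print(precision_recall(y_true, y_pred, positive="pos"))
    print(cohen_kappa(y_true, y_pred))
```

The token lists produced by `preprocess` would then be turned into feature vectors (for example, term counts) before being passed to a classifier; the predicted labels are what `precision_recall` and `cohen_kappa` compare against the true labels.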


Keywords: Text mining methods and techniques; Naive Bayes; KNN; Decision tree; Performance parameter



Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  • Shitanshu Jain (1, corresponding author)
  • S. C. Jain (1)
  • Santosh Vishwakarma (2)
  1. Amity University, Madhya Pradesh, India
  2. Gyan Ganga Institute of Technology and Sciences, Jabalpur, India
