A Supervised Machine Learning Approach for the Credibility Assessment of User-Generated Content

Wireless Personal Communications

Abstract

Consumers increasingly rely on online reviews when making buying decisions. With the rising popularity of e-commerce websites, hotel review platforms, and social media, online reviews have become an active research field in recent years. Online reviews affect people’s day-to-day decisions, so fake reviews harm both consumers and business organizations. Businesses need to know how different types of consumers use the feedback that shapes their opinions. Automatic detection of fake reviews is difficult because their authors write them to read like genuine reviews. Previous work has tackled the identification of fake reviews in many domains, including reviews of food, restaurants, and hotels. In this study, we propose a fully supervised approach to identifying opinion spam in online reviews. We use labeled data to classify reviews as real or fake, and we apply several machine learning algorithms to two datasets (the Yelp hotel review dataset and the Yelp restaurant review dataset). Classification is performed on the feature-engineered datasets. Our experimental results show that logistic regression outperforms the other algorithms in most cases. We conclude that the presented study contributes to the existing literature by achieving better accuracy.
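
As a rough illustration of the kind of supervised pipeline the abstract describes (hand-engineered features fed to logistic regression), the following is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not list the engineered features, so the three features in extract_features are hypothetical, and the four labeled reviews are invented toy examples rather than Yelp data.

```python
import math

def extract_features(text):
    """Hypothetical hand-crafted features (assumed, not taken from the paper)."""
    words = text.lower().split()
    n = max(len(words), 1)
    first_person = sum(w.strip(".,;!?") in {"i", "my", "me", "we", "our"} for w in words)
    return [
        1.0,                       # bias term
        min(len(words), 50) / 50,  # review length, scaled to [0, 1]
        text.count("!") / n,       # exclamation marks per word
        first_person / n,          # share of first-person words
    ]

def predict_proba(weights, x):
    """Probability that a review is fake under the logistic model."""
    z = sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def log_loss(weights, X, y):
    """Mean binary cross-entropy over the dataset."""
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        p = predict_proba(weights, xi)
        total -= yi * math.log(p + eps) + (1 - yi) * math.log(1 - p + eps)
    return total / len(X)

def train(X, y, lr=0.5, epochs=500):
    """Fit logistic regression with full-batch gradient descent."""
    weights = [0.0] * len(X[0])
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for xi, yi in zip(X, y):
            err = predict_proba(weights, xi) - yi
            for j, xij in enumerate(xi):
                grad[j] += err * xij
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad)]
    return weights

# Invented toy data: label 1 = fake, label 0 = real.
reviews = [
    ("Best hotel ever!!! Amazing!!! Must visit!!!", 1),
    ("Perfect!!! Wonderful!!! Go now!!!", 1),
    ("The room was clean and the staff were polite, though breakfast ended early.", 0),
    ("We stayed two nights; parking was tight but the location suited our plans.", 0),
]
X = [extract_features(text) for text, _ in reviews]
y = [label for _, label in reviews]

initial_loss = log_loss([0.0] * len(X[0]), X, y)
weights = train(X, y)
final_loss = log_loss(weights, X, y)
predictions = [1 if predict_proba(weights, xi) >= 0.5 else 0 for xi in X]

print("loss before training:", round(initial_loss, 3))
print("loss after training:", round(final_loss, 3))
print("predicted labels:", predictions)
```

The sketch only shows the mechanics of training and thresholding at 0.5 on a tiny training set; a real evaluation would use held-out data and a feature set like the one the full paper engineers.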

Author information

Corresponding author

Correspondence to Praphula Kumar Jain.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Jain, P.K., Pamula, R. & Ansari, S. A Supervised Machine Learning Approach for the Credibility Assessment of User-Generated Content. Wireless Pers Commun 118, 2469–2485 (2021). https://doi.org/10.1007/s11277-021-08136-5

  • DOI: https://doi.org/10.1007/s11277-021-08136-5
