Fake review and reviewer detection through behavioral graph partitioning integrating deep neural network

  • S.I. : 2019 India Intl. Congress on Computational Intelligence
Neural Computing and Applications

Abstract

Because online reviews strongly influence customers’ decisions about purchasing products or services, untruthful (fake) reviews written to misrepresent product quality and gain unfair commercial benefits have become a serious problem. In this work, we propose a graph partitioning approach (BeGP) and its extension (BeGPX) to distinguish fake reviewers from benign ones. The main idea of BeGP is to first construct a behavioral graph in which reviewers are connected if they share characteristic features that capture similar behavior. The algorithm then starts with a small subgraph of known fake reviewers and repeatedly expands it by inducing other connected suspicious reviewers. All reviews written by those suspects are subsequently hypothesized to be untruthful. To further improve fake review(er) detection, BeGPX additionally analyzes the semantic content and emotions expressed in reviews. In particular, we use a deep neural network to learn word-embedding representations and lexicon-based emotion indicators, which are then integrated into the graph construction process. We demonstrate the effectiveness of BeGP and BeGPX on two real-world review datasets from Yelp.com. The results show that both approaches outperform state-of-the-art methods in accurately identifying fake review(er)s within the top-k ranked results. In addition, BeGPX shows significant improvement even when provided with only a small amount of labeled training data.
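
The seed-and-expand idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' BeGP implementation: the behavior labels, the `min_shared` edge threshold, the `min_links` induction rule, and both function names are hypothetical choices made only to show the general shape of building a behavioral graph and growing a suspicious subgraph from known fake reviewers.

```python
from itertools import combinations


def build_behavior_graph(features, min_shared=2):
    """Connect reviewers who share at least `min_shared` behavior labels.

    `features` maps reviewer id -> set of discretized behavior labels.
    The labels and the threshold are illustrative assumptions.
    """
    graph = {reviewer: set() for reviewer in features}
    for a, b in combinations(features, 2):
        if len(features[a] & features[b]) >= min_shared:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def expand_suspects(graph, seeds, min_links=1, max_rounds=10):
    """Grow the suspicious subgraph outward from known fake reviewers.

    A reviewer joins once at least `min_links` of its neighbors are already
    suspects (a hypothetical induction rule). Returns reviewer -> round added,
    where seeds are round 0.
    """
    joined = {seed: 0 for seed in seeds}
    for rnd in range(1, max_rounds + 1):
        # Collect this round's candidates before updating `joined`,
        # so every candidate is judged against the previous round's suspects.
        new = [
            r for r in graph
            if r not in joined
            and sum(1 for n in graph[r] if n in joined) >= min_links
        ]
        if not new:
            break
        for r in new:
            joined[r] = rnd
    return joined


# Toy data: all reviewer ids and behavior labels are made up.
features = {
    "r1": {"burst", "extreme_rating", "short_text"},
    "r2": {"burst", "extreme_rating", "single_day"},
    "r3": {"burst", "extreme_rating", "short_text", "duplicate_text"},
    "r4": {"long_text", "varied_rating"},
    "r5": {"duplicate_text", "short_text", "single_day"},
}
graph = build_behavior_graph(features)
ranking = expand_suspects(graph, seeds={"r1"})
print(ranking)
```

In this toy run, `r2` and `r3` are reached in the first round, `r5` in the second (through `r3`), and `r4` is never reached because it shares no behavior with the others. The round number can serve as a crude ordering of suspects; the ranking used in the paper may be defined differently.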

Notes

  1. We also conducted experiments using the CBOW model. The results showed that the Skip-Gram model performed better than the CBOW, although the CBOW was slightly faster.

  2. https://www.yelp.com/dataset
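
As background for Note 1, the following sketch shows how the two word2vec objectives frame their training examples. It only illustrates the data framing, not the training itself or the authors' configuration; the function names, the window size, and the sample sentence are assumptions.

```python
def skipgram_pairs(tokens, window=1):
    """Skip-Gram: each center word predicts every context word in its window."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        pairs.extend((center, tokens[j]) for j in range(lo, hi) if j != i)
    return pairs


def cbow_examples(tokens, window=1):
    """CBOW: the context words in the window jointly predict the center word."""
    examples = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        examples.append((context, center))
    return examples


tokens = ["great", "food", "here"]  # made-up review fragment
sg = skipgram_pairs(tokens)
cb = cbow_examples(tokens)
print(len(sg), sg)
print(len(cb), cb)
```

Skip-Gram emits one prediction per (center, context) pair, while CBOW emits one prediction per position, which is one commonly cited reason CBOW tends to train faster.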

Author information

Corresponding author

Correspondence to Bundit Manaskasemsak.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Manaskasemsak, B., Tantisuwankul, J. & Rungsawang, A. Fake review and reviewer detection through behavioral graph partitioning integrating deep neural network. Neural Comput & Applic 35, 1169–1182 (2023). https://doi.org/10.1007/s00521-021-05948-1

  • DOI: https://doi.org/10.1007/s00521-021-05948-1
