Factitious or fact? Learning textual representations for fake online review detection

Mohawesh, Rami; Al-Hawawreh, Muna; Maqsood, Sumbal; Alqudah, Omar

doi:10.1007/s10586-023-04148-x

Published: 28 September 2023

Cluster Computing

Abstract

User reviews can strongly influence a company's income in the e-commerce industry, because online users rely on reviews before choosing a product or service. The trustworthiness of online reviews is therefore vital for organisations and can directly affect their reputation and revenue. For this reason, some firms pay spammers to publish false reviews. Most recent studies on fake review detection utilise supervised learning, and neural network techniques in particular have been applied extensively and have demonstrated their ability to detect fake reviews. This paper makes the following contributions. First, we provide a benchmark study that analyses the performance of various machine learning algorithms with different feature extraction methods on five fake review datasets. Second, we propose three advanced language models for embedding reviews into the classifiers. Third, we conduct an exhaustive feature set evaluation study to find the best features for detecting fake reviews. Fourth, we analyse the performance of traditional machine learning, deep learning, and advanced deep learning models using different feature extraction methods on the same five datasets. Finally, we integrate the ELECTRA model with a CNN to classify reviews as real or fake. For deep contextualised representation and neural classification, we place a Single-Layer Perceptron (SLP), a Multi-Layer Perceptron (MLP), or a Convolutional Neural Network (CNN) after the embedding layer of pre-trained models such as ELMo, ELECTRA, and GPT2. We use accuracy, precision, recall, and F1 score as the criteria for assessing the proposed model. The experimental results indicate that our proposed model outperforms state-of-the-art methods, with improvements ranging from 1 to 7% in accuracy and F1 score. To the best of our knowledge, no prior work has evaluated the efficiency of such advanced pre-trained models in detecting fake reviews. Further, this research comprehensively evaluates several machine learning approaches and feature extraction strategies for fake online review detection.
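
As an illustration of the four assessment criteria named in the abstract, the following is a minimal sketch of how accuracy, precision, recall, and F1 score can be computed for a binary fake/genuine classifier. It is not taken from the article: the function name `binary_metrics`, the label convention (1 = fake, 0 = genuine), and the example labels are assumptions made here for illustration only.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 for binary labels.

    Assumed convention (not from the article): 1 = fake review, 0 = genuine.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one label is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fake, flagged as fake
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # genuine, flagged as fake
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fake, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # genuine, passed

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels for eight reviews (illustrative data only).
example_true = [1, 1, 1, 0, 0, 0, 1, 0]
example_pred = [1, 0, 1, 0, 1, 0, 1, 0]
example_scores = binary_metrics(example_true, example_pred)
print(example_scores)
```

Precision and recall are reported alongside accuracy because a fake review detector can reach high accuracy on an imbalanced dataset while still missing most fake reviews; F1 combines the two into a single number.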

Data availability

All datasets are open-source, and the sources are cited.

Notes

https://www.nltk.org/_modules/nltk/tag.html.

Funding

No Funding.

Author information

Authors and Affiliations

Cybersecurity Department, College of Engineering, Al Ain University, Al Ain, Abu Dhabi, United Arab Emirates
Rami Mohawesh
School of Engineering and Information Technology, University of New South Wales (UNSW), Campbell, ACT, 2612, Australia
Muna Al-Hawawreh
School of Information Technology, University of Tasmania, Hobart, TAS, Australia
Sumbal Maqsood
Jordan University of Science and Technology, Irbid, Jordan
Omar Alqudah

Contributions

RM initiated the project, managed the study, collected and analyzed the data, performed the formal analysis, and wrote the initial manuscript. All authors contributed to the editing of the paper.

Corresponding author

Correspondence to Rami Mohawesh.

Ethics declarations

Conflict of interest

No competing interest.

Ethical approval

No ethical issue involved.

Informed consent

No ethical issue involved.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mohawesh, R., Al-Hawawreh, M., Maqsood, S. et al. Factitious or fact? Learning textual representations for fake online review detection. Cluster Comput (2023). https://doi.org/10.1007/s10586-023-04148-x

Received: 25 April 2023
Revised: 28 August 2023
Accepted: 04 September 2023
Published: 28 September 2023
DOI: https://doi.org/10.1007/s10586-023-04148-x
