Deceptive opinion spam detection using feature reduction techniques

Maurya, Sushil Kumar; Singh, Dinesh; Maurya, Ashish Kumar

doi:10.1007/s13198-023-02208-4

ORIGINAL ARTICLE
Published: 19 December 2023

Volume 15, pages 1210–1230, (2024)

International Journal of System Assurance Engineering and Management


Abstract

People often read online reviews before purchasing a product. Sellers sometimes post deceptive reviews that imitate genuine user experience in order to increase profits. Deceptive opinion spam detection has emerged as a challenging task in the field of opinion mining. Feature reduction techniques play a central role in data mining: they identify the essential features and remove unnecessary dimensions that contribute only noise. This article extracts various textual features from gold-standard deceptive hotel reviews using different representation techniques: Part-of-Speech (POS) tags, Bag of Words (BoW), and Doc2Vec. Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) are applied to reduce the dimensionality of the features. Several supervised classifiers, namely Decision Tree (DT), Naïve Bayes (NB), Logistic Regression (LR), and Support Vector Machine (SVM), are used to classify opinions as deceptive or truthful. The features used by these supervised classifiers cannot retain sequential information from reviews. To overcome this problem, we use the Words Attention-based Bidirectional Long Short-Term Memory (WABiLSTM) network model, which is trained to learn the patterns of words. The article examines machine learning and deep learning-based spam detection models and provides their outline and results. Accuracy, precision, recall, and F-measure are used to analyze the performance of these classification models. The experimental results show that model performance improved after feature reduction.
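The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes a binary setup in which label 1 means deceptive and 0 means truthful; the function name `binary_metrics` and the sample labels are hypothetical and are not taken from the article, and F-measure is computed here as the balanced F1 score, which is an assumption.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                   positive: int = 1) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 for a binary classifier.

    `positive` is the label treated as the target class (assumed here to be
    1 = deceptive). Ratios with a zero denominator are reported as 0.0.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f1}


# Hypothetical labels for eight reviews (1 = deceptive, 0 = truthful).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
scores = binary_metrics(y_true, y_pred)
print(scores)
```

With these sample labels there are three true positives, three true negatives, one false positive, and one false negative, so all four scores come out to 0.75. On real data the metrics generally differ from one another, which is why all four are reported.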

Funding

No funding.

Author information

Authors and Affiliations

Computer Science and Engineering, Motilal Nehru National Institute of Technology Allahabad, Prayagraj, 211004, India
Sushil Kumar Maurya, Dinesh Singh & Ashish Kumar Maurya

Corresponding author

Correspondence to Sushil Kumar Maurya.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Human and animal participants

This article does not involve human participants or animals.

Informed consent

There is no plagiarism.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Maurya, S.K., Singh, D. & Maurya, A.K. Deceptive opinion spam detection using feature reduction techniques. Int J Syst Assur Eng Manag 15, 1210–1230 (2024). https://doi.org/10.1007/s13198-023-02208-4

Received: 28 October 2021
Revised: 29 September 2023
Accepted: 17 November 2023
Published: 19 December 2023
Issue Date: March 2024
DOI: https://doi.org/10.1007/s13198-023-02208-4
