Abstract
Identifying manipulative behavior and the corresponding suspects is essential for keeping the reputation systems integrated into review websites robust, yet it remains a great challenge. In this paper, we present a supervised learning approach for automatically detecting suspicious behavior on travel websites. We distinguish between two types of manipulation and treat them as separate tasks: promoting manipulation, which is performed to push the reputation of a hotel, and demoting manipulation, which is used to demote competitors. Each task comprises three levels: detecting suspicious reviews (review level), suspicious reviewers (reviewer level), and suspicious objects of the reviews, i.e., hotels (object level). A separate classifier is trained for each level on various sets of textual and non-textual features, using state-of-the-art machine learning algorithms such as Support Vector Machines. We evaluate our approach on a new dataset that we created from TripAdvisor reviews and that was carefully annotated by human judges. The results show that manipulating reviewers and objects of manipulation can be identified with over 90% accuracy. Identifying suspicious reviews, however, appears to be a much harder task: our classifier achieves an accuracy of 68% for promoting manipulation and 84% for demoting manipulation. We argue that more effective features need to be identified for classification at the review level. Finally, we analyze and discuss statistical characteristics of manipulative behavior based on the predictions of the reviewer-level and object-level classifiers.
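As a rough illustration of the review-level classifier described in the abstract, the following Python sketch trains a linear SVM on a concatenation of textual and non-textual features. Everything specific here is an assumption made for illustration and is not taken from the paper: the `Review` fields, the toy vocabulary and examples, the two non-textual features, and the plain subgradient-descent trainer, which stands in for whichever SVM implementation the authors used.

```python
from dataclasses import dataclass


@dataclass
class Review:
    text: str
    rating: int  # star rating, 1-5
    label: int   # +1 = suspicious, -1 = genuine (hypothetical annotation)


def features(review, vocabulary):
    """Concatenate textual (bag-of-words counts) and non-textual features."""
    tokens = review.text.lower().split()
    textual = [float(tokens.count(word)) for word in vocabulary]
    # Hypothetical non-textual features: scaled rating and scaled review length.
    non_textual = [review.rating / 5.0, len(tokens) / 10.0]
    return textual + non_textual + [1.0]  # trailing 1.0 acts as a bias term


def train_linear_svm(X, y, epochs=200, lr=0.1, reg=0.01):
    """Linear SVM via subgradient descent on the L2-regularized hinge loss."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            for j in range(len(w)):
                # The hinge term contributes only when the margin is violated.
                grad = reg * w[j] - (yi * xi[j] if margin < 1 else 0.0)
                w[j] -= lr * grad
    return w


def predict(w, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1


def accuracy(w, X, y):
    return sum(predict(w, xi) == yi for xi, yi in zip(X, y)) / len(y)


# Toy data for the promoting-manipulation task at the review level (invented).
vocabulary = ["best", "amazing", "room"]
reviews = [
    Review("best hotel ever best staff", 5, 1),
    Review("amazing best place best value", 5, 1),
    Review("clean room and quiet street", 4, -1),
    Review("small room but friendly staff", 3, -1),
]
X = [features(r, vocabulary) for r in reviews]
y = [r.label for r in reviews]
w = train_linear_svm(X, y)
train_accuracy = accuracy(w, X, y)
```

Under the same assumptions, a second classifier for demoting manipulation would be trained in the same way on labels for that task, and the reviewer-level and object-level classifiers would each use their own feature sets.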
Keywords
- reputation system
- trust management
- manipulative behavior identification and analysis
- opinion mining
- supervised learning
- TripAdvisor
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Duan, H., Zirn, C. (2012). Can We Identify Manipulative Behavior and the Corresponding Suspects on Review Websites Using Supervised Learning? In: Jøsang, A., Carlsson, B. (eds) Secure IT Systems. NordSec 2012. Lecture Notes in Computer Science, vol 7617. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34210-3_15
DOI: https://doi.org/10.1007/978-3-642-34210-3_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34209-7
Online ISBN: 978-3-642-34210-3
eBook Packages: Computer Science, Computer Science (R0)
