Stay On-Topic: Generating Context-Specific Fake Restaurant Reviews

  • Mika Juuti
  • Bo Sun
  • Tatsuya Mori
  • N. Asokan
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11098)

Abstract

Automatically generated fake restaurant reviews are a threat to online review systems. Recent research has shown that users have difficulty detecting machine-generated fake reviews hidden among real restaurant reviews. The method used in that work (char-LSTM) has one drawback: it has difficulty staying in context, i.e. when it generates a review for a specific target entity, the resulting review may contain phrases that are unrelated to the target, thus increasing its detectability. In this work, we present and evaluate a more sophisticated technique based on neural machine translation (NMT) with which we can generate reviews that stay on-topic. We test multiple variants of our technique using native English speakers on Amazon Mechanical Turk. We demonstrate that reviews generated by the best variant have almost optimal undetectability (class-averaged F-score 47%). We conduct a user study with experienced users and show that our method evades detection more frequently than the state of the art (average evasion 3.2 / 4 vs 1.5 / 4), with statistical significance at level \(\alpha = 1\%\) (Sect. 4.3). We develop very effective detection tools and reach an average F-score of \(97\%\) in classifying these reviews. Although fake reviews are very effective in fooling people, effective automatic detection is still feasible.
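To make the "class-averaged F-score" figure concrete, the following is a minimal sketch of how such a score could be computed. The abstract does not spell out the exact formula, so the sketch assumes the unweighted mean of the per-class F1 scores over the two classes "real" and "fake". The function names and the toy labels are hypothetical and are not taken from the paper. On a balanced set, a judge who is right only half the time in each class scores 50%, which is why a value near 47% suggests the generated reviews are close to indistinguishable from real ones for human readers.

```python
def f_score(y_true, y_pred, positive):
    """F1 score for one class, treating `positive` as the target label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no correct hits for this class
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def class_averaged_f_score(y_true, y_pred, classes=("real", "fake")):
    """Unweighted mean of per-class F1 (assumed meaning of 'class-averaged')."""
    return sum(f_score(y_true, y_pred, c) for c in classes) / len(classes)


# Hypothetical toy data: 4 real and 4 fake reviews.
y_true = ["real"] * 4 + ["fake"] * 4
# A judge who labels half of each class correctly.
y_guess = ["real", "real", "fake", "fake", "real", "real", "fake", "fake"]

print(class_averaged_f_score(y_true, y_guess))  # chance-level judge
print(class_averaged_f_score(y_true, y_true))   # perfect detector
```

Under the same assumption, the automatic detector's reported average F-score of 97% would correspond to a result close to the perfect-detector case.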

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Aalto University, Espoo, Finland
  2. Cybersecurity Research Institute, National Institute of Information and Communications Technology, Tokyo, Japan
  3. Department of Computer Science and Communication Engineering, Waseda University, Tokyo, Japan
  4. Center for Advanced Intelligence Project, RIKEN, Tokyo, Japan