Joint European Conference on Machine Learning and Knowledge Discovery in Databases

ECML PKDD 2015: Machine Learning and Knowledge Discovery in Databases pp 267-282

Discovering Opinion Spammer Groups by Network Footprints

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9284)

Abstract

Online reviews are an important source of information for consumers evaluating products and services on the Internet (e.g., Amazon, Yelp). However, a growing number of fraudulent reviewers write fake reviews to mislead users. To maximize their impact and share effort, many spam attacks are organized as campaigns by groups of spammers. In this paper, we propose a new two-step method to discover spammer groups and their targeted products. First, we introduce NFS (Network Footprint Score), a new measure that quantifies the likelihood of products being spam campaign targets. Second, we carefully devise GroupStrainer to cluster spammers on a 2-hop subgraph induced by top-ranking products. We demonstrate the efficiency and effectiveness of our approach on both synthetic and real-world datasets from two different domains with millions of products and reviewers. Moreover, we discover interesting strategies that spammers employ through case studies of our detected groups.
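
The abstract does not spell out how the 2-hop subgraph is built, so the following is only a minimal sketch under stated assumptions. It assumes the review data is a list of (reviewer, product) pairs, that each product already has a suspiciousness score (a stand-in for NFS, which is not computed here), and that "2-hop" means: start from the top-k products, move to their reviewers, then to every product those reviewers reviewed. The function name `two_hop_subgraph`, the parameter `k`, and the toy data are hypothetical.

```python
def two_hop_subgraph(reviews, scores, k):
    """Sketch: subgraph reachable within two hops of the k top-scored products.

    reviews: iterable of (reviewer, product) pairs (assumed input format).
    scores:  dict mapping product -> score (placeholder for an NFS-like score).
    k:       number of top-ranked products to seed from (hypothetical parameter).
    """
    reviews = list(reviews)
    # Rank products by score, highest first; break ties by product id.
    top = sorted(scores, key=lambda p: (-scores[p], p))[:k]
    top_set = set(top)
    # Hop 1: reviewers who reviewed at least one top-ranked product.
    reviewers = {r for r, p in reviews if p in top_set}
    # Hop 2: every review edge written by those reviewers.
    edges = [(r, p) for r, p in reviews if r in reviewers]
    products = {p for _, p in edges}
    return top, reviewers, products, edges


# Toy data, invented purely for illustration.
reviews = [("u1", "a"), ("u2", "a"), ("u1", "b"),
           ("u2", "b"), ("u3", "c"), ("u2", "c")]
scores = {"a": 0.9, "b": 0.8, "c": 0.1}
top, reviewers, products, edges = two_hop_subgraph(reviews, scores, k=1)
```

In this toy run, reviewer `u3` is left out because it never reviewed the seed product, while product `c` is kept because a seed reviewer also reviewed it. Any clustering step would then operate on `reviewers` and `edges`; the clustering itself is outside the scope of this sketch.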

Keywords

Opinion spam, Spammer groups, Spam detection, Graph anomaly detection, Efficient hierarchical clustering, Network footprints



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Department of Computer Science, Stony Brook University, Stony Brook, USA
