ConOut: Contextual Outlier Detection with Multiple Contexts: Application to Ad Fraud

  • M. Y. Meghanath
  • Deepak Pai
  • Leman Akoglu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11051)

Abstract

Outlier detection has numerous applications in different domains. A family of techniques, called contextual outlier detectors, is based on a single, user-specified demarcation of data attributes into indicators and contexts. In this work, we propose ConOut, a new contextual outlier detection technique that leverages multiple, automatically identified contexts. Importantly, ConOut is a one-click algorithm: it does not require any user-specified (hyper)parameters. Through experiments on various real-world data sets, we show that ConOut outperforms existing baselines in detection accuracy. Further, we motivate and apply ConOut to the advertisement domain to identify fraudulent publishers, where ConOut not only improves detection but also provides statistically significant revenue gains to advertisers: a minimum of 57% compared to a naïve fraud detector, and \(\sim\)20% in revenue gains as well as \(\sim\)34% in mean average precision compared to its nearest competitor. Code related to this paper is available at: https://github.com/meghanathmacha/ConOut, https://cmuconout.github.io/.
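To make the indicator/context split concrete, the following is a minimal sketch of a generic single-context detector of the kind the abstract contrasts ConOut with. It is not the ConOut algorithm and is not taken from the authors' code. The function name `contextual_scores`, the field names `site_type` and `ctr`, and the publisher records are all hypothetical, chosen only for illustration. Each record is scored by how far its indicator lies from the mean of the records sharing its context value.

```python
from collections import defaultdict
from statistics import mean, pstdev


def contextual_scores(rows, context, indicator):
    """Score each row by the absolute z-score of its indicator value,
    computed only among rows that share the same context value."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[context]].append(row[indicator])

    # Per-context mean and (population) standard deviation.
    stats = {c: (mean(v), pstdev(v)) for c, v in groups.items()}

    scores = []
    for row in rows:
        mu, sigma = stats[row[context]]
        # A context with no spread gives no evidence of deviation.
        scores.append(0.0 if sigma == 0 else abs(row[indicator] - mu) / sigma)
    return scores


# Hypothetical publishers: context = site type, indicator = click-through rate.
rows = [
    {"site_type": "news", "ctr": 0.010},
    {"site_type": "news", "ctr": 0.012},
    {"site_type": "news", "ctr": 0.011},
    {"site_type": "news", "ctr": 0.009},
    {"site_type": "news", "ctr": 0.050},  # typical for games, unusual for news
    {"site_type": "games", "ctr": 0.050},
    {"site_type": "games", "ctr": 0.055},
    {"site_type": "games", "ctr": 0.048},
    {"site_type": "games", "ctr": 0.052},
]

scores = contextual_scores(rows, context="site_type", indicator="ctr")
most_suspicious = scores.index(max(scores))
print(most_suspicious, rows[most_suspicious])
```

In this toy data a click-through rate of 0.050 is unremarkable across all publishers but stands out among news sites, so the news publisher with that rate receives the highest score. The sketch fixes one hand-picked context, which is exactly the user-specified choice the abstract says ConOut avoids by identifying multiple contexts automatically.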

Notes

Acknowledgments

This research is sponsored by Adobe University Marketing Research Award, NSF CAREER 1452425 and IIS 1408287. Any conclusions expressed in this material do not necessarily reflect the views expressed by the funding parties.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Heinz College of Information Systems and Public Policy, Carnegie Mellon University, Pittsburgh, USA
  2. Adobe, Bangalore, India
