You must have clicked on this ad by mistake! Data-driven identification of accidental clicks on mobile ads with applications to advertiser cost discounting and click-through rate prediction

  • Regular Paper
  • International Journal of Data Science and Analytics

Abstract

In the cost-per-click pricing model, an advertiser pays an ad network only when a user clicks on an ad; in turn, the ad network gives a share of that revenue to the publisher where the ad was impressed. Still, advertisers may be unsatisfied with ad networks charging them for “valueless” clicks, or so-called accidental clicks. These happen when users click on an ad, are redirected to the advertiser website, and bounce back without spending any time on the ad landing page. Charging advertisers for such clicks is detrimental in the long term, as the advertiser may decide to run their campaigns on other ad networks. In addition, machine-learned click models trained to predict which ad will bring the highest revenue may overestimate an ad's click-through rate and, as a consequence, negatively impact revenue for both the ad network and the publisher. In this work, we propose a data-driven method to detect accidental clicks from the perspective of the ad network. We collect observations of the time spent by users on a large set of ad landing pages, i.e., dwell time. We observe that the majority of per-ad dwell time distributions fit a mixture of distributions, where each component may correspond to a particular type of click, the first one being accidental. We then estimate dwell time thresholds for accidental clicks from that component. Using our method to identify accidental clicks, we propose a technique that smoothly discounts the advertiser’s cost of accidental clicks at billing time. Experiments conducted on a large dataset of ads served on Yahoo mobile apps confirm that our thresholds are stable over time and that revenue loss in the short term is marginal. We also compare the performance of an existing machine-learned click model trained on all ad clicks with that of the same model trained only on non-accidental clicks. There, we observe an increase in both ad click-through rate (+3.9%) and revenue (+0.2%) on ads served by the Yahoo Gemini network when using the latter. These two applications validate the need to consider accidental clicks for both billing advertisers and training ad click models.
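The pipeline outlined in the abstract (fit a mixture to per-ad dwell times, read a threshold off the first component, discount cost smoothly) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes two Gaussian components on log dwell time, a threshold placed where both components are equally likely, and a discount equal to the posterior probability of the accidental component. All function names and the synthetic data are hypothetical.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of a Normal(mu, sigma^2) distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_two_gaussians(xs, iters=200):
    """Fit a two-component 1-D Gaussian mixture with plain EM.

    Returns [(weight, mean, std), (weight, mean, std)] sorted by mean.
    """
    xs = sorted(xs)
    n = len(xs)
    half = n // 2
    # Crude initialisation (an assumption): lower vs upper half of the sample.
    params = []
    for part in (xs[:half], xs[half:]):
        mu = sum(part) / len(part)
        var = sum((x - mu) ** 2 for x in part) / len(part)
        params.append([0.5, mu, max(math.sqrt(var), 1e-3)])
    for _ in range(iters):
        # E-step: responsibility of component 0 for every observation.
        resp = []
        for x in xs:
            p0 = params[0][0] * normal_pdf(x, params[0][1], params[0][2])
            p1 = params[1][0] * normal_pdf(x, params[1][1], params[1][2])
            resp.append(p0 / (p0 + p1 + 1e-300))
        # M-step: re-estimate weight, mean and std of each component.
        for k in (0, 1):
            r = resp if k == 0 else [1.0 - v for v in resp]
            total = max(sum(r), 1e-12)
            mu = sum(ri * x for ri, x in zip(r, xs)) / total
            var = sum(ri * (x - mu) ** 2 for ri, x in zip(r, xs)) / total
            params[k] = [total / n, mu, max(math.sqrt(var), 1e-3)]
    return sorted((tuple(p) for p in params), key=lambda p: p[1])


def accidental_probability(dwell_seconds, components):
    """Posterior probability that a click belongs to the shortest component."""
    x = math.log(dwell_seconds)
    dens = [w * normal_pdf(x, mu, sd) for w, mu, sd in components]
    total = sum(dens)
    return dens[0] / total if total > 0 else 0.0


def threshold_seconds(components):
    """Dwell time at which both components are equally likely (bisection)."""
    lo, hi = components[0][1], components[1][1]  # component means, log-space
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if accidental_probability(math.exp(mid), components) > 0.5:
            lo = mid
        else:
            hi = mid
    return math.exp((lo + hi) / 2.0)


def discounted_cost(cost, dwell_seconds, components):
    """Charge a click in proportion to its probability of being intentional."""
    return cost * (1.0 - accidental_probability(dwell_seconds, components))


# Synthetic log dwell times for one hypothetical ad: 30% very short visits
# (around 1 second) and 70% longer visits (around 30 seconds).
rng = random.Random(0)
log_dwell = [rng.gauss(0.0, 0.4) for _ in range(300)]
log_dwell += [rng.gauss(3.4, 0.8) for _ in range(700)]

components = fit_two_gaussians(log_dwell)
tau = threshold_seconds(components)
```

Using the posterior as the discount factor gives a smooth transition around the threshold instead of a hard cut-off, which is one plausible reading of "smoothly discounts"; other smoothing functions would fit the description equally well.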

Notes

  1. There is no consensus around what a conversion is; it is up to the advertiser to specify it.

  2. https://gemini.yahoo.com/.

  3. We use the terms Normal distribution and Gaussian distribution interchangeably.

  4. In practice, we often seek \(\hat{{\varvec{\Theta }}} _{\text {MLE}}\) by maximising the log-likelihood function \(\text {ln}(L(\varvec{\Theta };{\mathcal {D}}_j))\), since this is equivalent (the natural logarithm is monotonically increasing) but simpler because products turn into summations.

  5. This is also often referred to as the bias-variance trade-off.

  6. A threshold already used in previous work [16].

  7. The actual revenue loss is not shown due to business confidentiality.

  8. In this setting, \(p=\text {NACR}\) and \({{\hat{p}}} = {\widehat{\text {NACR}}}_{\text {MLE}}\).

  9. This may happen if the same pivoting app has been running for a long time and a new, better-performing app slightly overtakes it.

  10. CPM stands for cost per mille (impressions) and indicates the earnings gained every thousand ad impressions sold.
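The equivalence stated in note 4 can be written out explicitly. The display below is a sketch under assumptions not spelled out in the notes: the observations \(x_1,\ldots ,x_{n_j}\) in \({\mathcal {D}}_j\) are taken to be independent and identically distributed with density \(f(\cdot ;\varvec{\Theta })\), and the symbols \(x_i\), \(n_j\) and \(f\) are introduced here for illustration only.

\[
L(\varvec{\Theta };{\mathcal {D}}_j) = \prod _{i=1}^{n_j} f(x_i;\varvec{\Theta })
\quad \Longrightarrow \quad
\text {ln}\, L(\varvec{\Theta };{\mathcal {D}}_j) = \sum _{i=1}^{n_j} \text {ln}\, f(x_i;\varvec{\Theta }),
\qquad
\hat{\varvec{\Theta }}_{\text {MLE}} = \mathop {\text {arg\,max}}_{\varvec{\Theta }} \sum _{i=1}^{n_j} \text {ln}\, f(x_i;\varvec{\Theta }).
\]

Because the natural logarithm is strictly increasing, the maximiser of the sum on the right is the same as the maximiser of the product on the left.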

References

  1. Akaike, H.: Information theory and an extension of the maximum likelihood principle. In: Petrov, B.N., Csaki, F. (eds.) Selected Papers of Hirotugu Akaike, pp. 199–213. Springer, Budapest, Hungary (1973)

  2. Azimi, J., Zhang, R., Zhou, Y., Navalpakkam, V., Mao, J., Fern, X.: Visual appearance of display ads and its effect on click through rate. In: Chen, X-w, Lebanon, G., Wang, H., Zaki, M.J. (eds.) 21st ACM International Conference on Information and Knowledge Management, CIKM’12, pp. 495–504. Maui, HI, USA (2012)

  3. Barajas, J., Akella, R., Holtan, M., Kwon, J., Flores, A., Andrei, V.: Dynamic effects of ad impressions on commercial actions in display advertising. In: Chen, X-w, Lebanon, G., Wang, H., Zaki, M.J. (eds.) 21st ACM International Conference on Information and Knowledge Management, pp. 1747–1751. Maui, HI, USA (2012)

  4. Barbieri, N., Silvestri, F., Lalmas, M.: Improving post-click user engagement on native ads via survival analysis. In: Chen, X-w., Lebanon, G., Wang, H., Zaki, M.J. (eds.) 21st ACM International Conference on Information and Knowledge Management, CIKM’12, pp. 1747–1751. Maui, HI, USA (2012)

  5. Becker, H., Broder, A., Gabrilovich, E., Josifovski, V., Pang, B.: What happens after an ad click?: Quantifying the impact of landing pages in web advertising. In: Cheung, D.W-L., Song, Il-Y., Chu, W.W., Hu, X., Lin, J.J. (eds.) Proceedings of the 18th ACM Conference on Information and Knowledge Management, CIKM 2009, pp. 57–66. Hong Kong, China, 2–6 November 2009

  6. Bishop, C.M.: Pattern Recognition and Machine Learning (Information Science and Statistics). Springer, New York (2006)

  7. Brown, L.D., Cai, T.T., DasGupta, A.: Interval estimation for a binomial proportion. Stat. Sci. 16(2), 101–117 (2001)

  8. Daswani, N., Stoppelman, M.: The anatomy of Clickbot.A. In: Provos, N. (ed.) First Workshop on Hot Topics in Understanding Botnets, HotBots’07, p. 11. Cambridge, MA, USA (2007)

  9. Elkan, C.: Mixture Models. http://cseweb.ucsd.edu/~elkan/250Bwinter2011/mixturemodels.pdf. Accessed 4 Mar 2010

  10. Goldman, M., Rao, J.M.: Experiments as instruments: heterogeneous position effects in sponsored search auctions. https://ssrn.com/abstract=2524688. Accessed 20 Nov 2014

  11. Grbovic, M., Djuric, N., Radosavljevic, V., Silvestri, F., Baeza-Yates, R., Feng, A., Ordentlich, E., Yang, L., Owens, G.: Scalable semantic matching of queries to ads in sponsored search advertising. In: Perego, R., Sebastiani, F., Aslam, J.A., Ruthven, I., Zobel, J. (eds.) Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2016, pp. 375–384. Pisa, Italy, 17–21 July 2016

  12. Jacobson, A.: Preventing accidental clicks for a better mobile ads experience. https://adwords.googleblog.com/2016/05/preventing-accidental-clicks-for-better-mobile-ads.html. Accessed 6 May 2016

  13. James, G., Witten, D., Hastie, T., Tibshirani, R.: An Introduction to Statistical Learning: With Applications in R. Springer, Berlin (2014)

  14. Kae, A., Kan, K., Narayanan, V.K., Yankov, D.: Categorization of display ads using image and landing page features. Proceedings of the Third Workshop on Large Scale Data Mining: Theory and Applications, pp. 1:1–1:8. San Diego, California

  15. Kim, Y., Hassan, A., White, R.W., Zitouni, I.: Modeling dwell time to predict click-level satisfaction. In: Carterette, B., Diaz, F., Castillo, C., Metzler, D. (eds.) Seventh ACM International Conference on Web Search and Data Mining, WSDM 2014, pp. 193–202. New York, NY, USA, 24–28 February 2014

  16. Lalmas, M., Lehmann, J., Shaked, G., Silvestri, F., Tolomei, G.: Promoting positive post-click experience for in-stream yahoo gemini users. In: Cao, L., Zhang, C., Joachims, T., Webb, G.I., Margineantu, D.D., Williams, G. (eds.) Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1929–1938. Sydney, NSW, Australia, 10–13 August 2015

  17. Lindsay, B.G.: Mixture models: theory, geometry and applications. In: NSF-CBMS Regional Conference Series in Probability and Statistics, Pennsylvania State University (1995)

  18. Liu, C., White, R.W., Dumais, S.: Understanding web browsing behaviors through Weibull analysis of dwell time. In: Crestani, F., Marchand-Maillet, S., Chen, H-H., Efthimiadis, E.N., Savoy, J. (eds.) Proceeding of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2010, pp. 379–386. Geneva, Switzerland, 19–23 July 2010

  19. Mclachlan, G., Peel, D.: Finite Mixture Models, 1st edn. Wiley, Hoboken (2000)

  20. Papoulis, A., Pillai, S.U.: Probability, Random Variables, and Stochastic Processes, 4th edn. McGraw-Hill Higher Education, New York (2002)

  21. Raghavan, H., Hillard, D.: A relevance model based filter for improving ad quality. In: Allan, J., Aslam, J.A., Sanderson, M., Zhai, C.X., Zobel, J. (eds.) Proceedings of the 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2009, pp. 762–763. Boston, MA, USA, 19–23 July 2009

  22. Rosales, R., Cheng, H., Manavoglu, E.: Post-click conversion modeling and analysis for non-guaranteed delivery display advertising. In: Adar, E., Teevan, J., Agichtein, E., Maarek, Y. (eds.) Proceedings of the Fifth International Conference on Web Search and Web Data Mining, WSDM 2012, pp. 293–302. Seattle, WA, USA, 8–12 February 2012

  23. Schlattmann, P.: Medical Applications of Finite Mixture Models. Statistics for Biology and Health. Springer, Berlin (2009)

  24. Schwarz, G.: Estimating the dimension of a model. Ann Stat 6, 461–464 (1978)

  25. Sculley, D., Malkin, R.G., Basu, S., Bayardo, R.J.: Predicting bounce rates in sponsored search advertisements. In: Elder IV, J.F., Fogelman-Soulié, F., Flach, P.A., Zaki, M.J. (eds.) Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1325–1334. Paris, France (2009)

  26. Sodomka, E., Lahaie, S., Hillard, D.: A predictive model for advertiser value-per-click in sponsored search. In: Schwabe, D., Almeida, V.A.F., Glaser, H., Baeza-Yates, R.A., Moon, S.B. (eds.) 22nd International World Wide Web Conference, WWW ’13, pp. 1179–1190. Rio de Janeiro, Brazil, 13–17 May 2013

  27. Stewart, C., Hoggan, E., Haverinen, L., Salamin, H., Jacucci, G.: An exploration of inadvertent variations in mobile pressure input. In: Churchill, E.F., Subramanian, S., Baudisch, P., O’Hara, K. (eds.) Mobile HCI ’12, Proceedings of the 14th International Conference on Human-Computer Interaction with Mobile Devices and Services, pp. 35–38. San Francisco, CA, USA, 21–24 September 2012

  28. Stone-Gross, B., Stevens, R., Zarras, A., Kemmerer, R., Kruegel, C., Vigna, G.: Understanding fraudulent activities in online ad exchanges. In: Thiran, P., Willinger, W. (eds.) Proceedings of the 11th ACM SIGCOMM Internet Measurement Conference, IMC ’11, pp. 279–294. Berlin, Germany (2011)

  29. Yi, X., Hong, L., Zhong, E., Liu, N.N., Rajan, S.: Beyond clicks: dwell time for personalization. In: Kobsa, A., Zhou, M.X., Ester, M., Koren, Y. (eds.) Eighth ACM Conference on Recommender Systems, RecSys ’14, pp. 113–120. Foster City, Silicon Valley, CA, USA, 06–10 October 2014

  30. Yin, P., Luo, P., Lee, W.-C., Wang, M.: Silence is also evidence: interpreting dwell time for recommendation from psychological perspective. In: Dhillon, I.S., Koren, Y., Ghani, R., Senator, T.E., Bradley, P., Parekh, R., He, J., Grossman, R.L., Uthurusamy, R. (eds.) The 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2013, pp. 989–997. Chicago, IL, USA, 11–14 August 2013

Acknowledgements

The authors would like to thank Michal Aharon and Marc Bron for their support in setting up the online A/B test, which allowed them to deploy and assess their approach on a second use case, i.e., the ad click model.

Author information

Corresponding author

Correspondence to Gabriele Tolomei.

Ethics declarations

Conflict of interest

The authors declared that they have no conflict of interest.

Additional information

All the authors contributed to this work while employed at Yahoo Research.

Cite this article

Tolomei, G., Lalmas, M., Farahat, A. et al. You must have clicked on this ad by mistake! Data-driven identification of accidental clicks on mobile ads with applications to advertiser cost discounting and click-through rate prediction. Int J Data Sci Anal 7, 53–66 (2019). https://doi.org/10.1007/s41060-018-0122-1

  • DOI: https://doi.org/10.1007/s41060-018-0122-1
