Soft Computing, Volume 21, Issue 3, pp 651–665

A methodology based on Deep Learning for advert value calculation in CPM, CPC and CPA networks

  • Luis Miralles-Pechuán
  • Dafne Rosso
  • Fernando Jiménez
  • Jose M. García
Focus

Abstract

In this research, we propose a methodology for calculating advert value in CPM, CPC and CPA networks. Accurately estimating this value increases the income of all three types of network, because the most profitable advert can be selected. With higher income, publishers are better paid and advertisers can be offered improved services. To develop this methodology, we propose a system based on traditional Machine Learning methods and Deep Learning (DL) methods. The system has two inputs and one output. The inputs are the user visit and the data about the advertiser. The output is the advert value expressed in dollars. Deep Learning predicts model behavior more precisely for many supervised problems. The three experiments carried out allow us to conclude that DL is a supervised method that is very efficient in the classification of spam adverts and in the estimation of the click-through rate (CTR). In the prediction of online sales, Deep Learning neural networks (DLNN) showed, on average, worse performance than the cubist and random forest methods, although better performance than the model tree, model rules and linear regression methods.
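As an illustration of the input/output structure described above, the following minimal sketch computes an expected advert value in dollars per impression and selects the most profitable advert. It is not the authors' implementation: the class `AdvertEstimate`, the functions `advert_value` and `most_profitable`, the use of a conversion rate for CPA, and the `(1 - p_spam)` discount are all assumptions made for this example. It relies only on the standard payment definitions (CPM pays per thousand impressions, CPC per click, CPA per action), and the probabilities are assumed to come from models such as those compared in the experiments.

```python
from dataclasses import dataclass


@dataclass
class AdvertEstimate:
    """Hypothetical container for per-advert model outputs (illustrative names)."""
    model: str           # payment model: "CPM", "CPC" or "CPA"
    price: float         # dollars per 1000 impressions (CPM), per click (CPC) or per action (CPA)
    p_spam: float = 0.0  # estimated probability that the advert is spam
    ctr: float = 0.0     # estimated click-through rate
    cvr: float = 0.0     # estimated conversion rate given a click (assumption)


def advert_value(ad: AdvertEstimate) -> float:
    """Expected income in dollars from showing the advert once."""
    if ad.model == "CPM":
        per_impression = ad.price / 1000.0
    elif ad.model == "CPC":
        per_impression = ad.ctr * ad.price
    elif ad.model == "CPA":
        per_impression = ad.ctr * ad.cvr * ad.price
    else:
        raise ValueError(f"unknown payment model: {ad.model}")
    # Assumption: a spam advert yields no income, so discount by (1 - p_spam).
    return (1.0 - ad.p_spam) * per_impression


def most_profitable(ads):
    """Return the advert with the highest expected value."""
    return max(ads, key=advert_value)


if __name__ == "__main__":
    candidates = [
        AdvertEstimate("CPM", price=2.0),
        AdvertEstimate("CPC", price=0.5, ctr=0.01),
        AdvertEstimate("CPA", price=20.0, ctr=0.01, cvr=0.05),
    ]
    best = most_profitable(candidates)
    print(best.model, advert_value(best))
```

Expressing all three payment models in the same unit (dollars per impression) is what makes adverts from different network types directly comparable in this sketch.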

Keywords

  • Advertisement value calculation in CPM, CPC and CPA networks
  • Deep Learning methods in online advertising
  • Sales prediction
  • Spam probability calculation
  • CTR estimation
  • Deep Learning in advertisement value calculation

Notes

Compliance with ethical standards

Conflict of interest

Luis Miralles-Pechuán, Dafne Rosso, Fernando Jiménez and Jose M. García declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Funding

This work was funded in part by the Spanish Ministerio de Economía y Competitividad (MINECO) and the European Commission FEDER under grants TIN2013-45491-R and TIN2015-66972-C5-3-R.

Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  1. Universidad Panamericana, Campus México, Facultad de Ingeniería, Ciudad de México, México
  2. Faculty of Computer Science, Universidad de Murcia, Murcia, Spain
