World Wide Web, Volume 13, Issue 3, pp 251–274

A Wikipedia Matching Approach to Contextual Advertising

  • Alexander N. Pak
  • Chin-Wan Chung

Abstract

Contextual advertising is an important part of today’s Web. It benefits all parties: Web site owners and the advertising platform share the revenue, advertisers gain new customers, and Web site visitors get useful reference links. The relevance of the ads selected for a Web page is essential for the whole system to work. Problems such as homonymy and polysemy, low keyword intersection, and context mismatch can lead to the selection of irrelevant ads, so simple keyword matching gives poor accuracy. In this paper, we propose a method for improving the relevance of contextual ads: a novel “Wikipedia matching” technique that uses Wikipedia articles as “reference points” for ad selection. We show how to combine the new method with existing solutions to increase overall performance. An experimental evaluation is conducted on a set of real ads and a set of pages from news Web sites. Test results show that the proposed method performs better than existing matching strategies, and that using Wikipedia matching in combination with existing approaches provides up to a 50% lift in average precision. The TREC standard measure bpref-10 also confirms the positive effect of Wikipedia matching on ad selection.
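
The abstract does not spell out the matching algorithm, so the following is only a minimal sketch of the general idea under stated assumptions: a page and an ad are each described by their cosine similarity to a small set of reference articles, the two resulting profiles are compared, and that score is blended linearly with a plain keyword-matching score. The function names, the whitespace tokenizer, the toy article texts, and the weight `alpha` are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term counts (lowercased, whitespace-tokenized)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dict-like maps."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def reference_profile(text, articles):
    """Similarity of a text to every reference article (hypothetical helper)."""
    vec = tf_vector(text)
    return {title: cosine(vec, tf_vector(body)) for title, body in articles.items()}


def keyword_score(page, ad):
    """Baseline: direct keyword overlap between page and ad."""
    return cosine(tf_vector(page), tf_vector(ad))


def wiki_score(page, ad, articles):
    """Compare page and ad through their reference-article profiles."""
    return cosine(reference_profile(page, articles), reference_profile(ad, articles))


def combined_score(page, ad, articles, alpha=0.5):
    """Linear blend of both scores; alpha is an assumed tuning weight."""
    return alpha * keyword_score(page, ad) + (1 - alpha) * wiki_score(page, ad, articles)


# Toy reference articles standing in for Wikipedia (assumed content).
articles = {
    "Jaguar (animal)": "jaguar big cat predator jungle wildlife",
    "Jaguar Cars": "jaguar car luxury vehicle engine dealer",
}
page = "the jaguar is a predator of the jungle"
ad_safari = "wildlife safari tours jungle"
ad_car = "luxury car dealer engine service"

ranked = sorted(
    [ad_safari, ad_car],
    key=lambda ad: combined_score(page, ad, articles),
    reverse=True,
)
```

In this toy setup the car ad shares no keywords with the page, so its keyword score is zero, while the reference profiles still give it a small nonzero score through the shared "Jaguar Cars" article; the safari ad ranks first under both scores. A real system would need proper tokenization, stemming, term weighting, and a far larger article collection.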

Keywords

contextual advertising wikipedia matching 

Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  1. Division of Computer Science, Department of EECS, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Korea