Web Spam, Social Propaganda and the Evolution of Search Engine Rankings

  • Conference paper
Web Information Systems and Technologies (WEBIST 2009)

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 45)

Abstract

Search engines have greatly influenced the way we experience the web. Since the early days of the web, users have relied on them to get informed and make decisions. When the web was relatively small, web directories were built and maintained by human experts who screened and categorized pages according to their characteristics. By the mid-1990s, however, it was apparent that the human-expert model of categorizing web pages did not scale. The first search engines appeared, and they have been evolving ever since, taking over the role that web directories used to play.

But what need makes a search engine evolve? Beyond the financial objectives, there is a need for quality in search results. Search engines know that the quality of their ranking will determine how successful they are. Search results, however, are not based solely on well-designed scientific principles; they are also influenced by web spammers. Web spamming, the practice of introducing artificial text and links into web pages to affect the results of web searches, has been recognized as a major problem for search engines. It is also a serious problem for users, because they are not aware of it and tend to confuse trusting the search engine with trusting the results of a search.

In this paper, we analyze the influence that web spam has on the evolution of search engines, and we identify the strong relationship between spamming methods on the web and propagandistic techniques in society. Our analysis provides a foundation for understanding why spamming works and offers new insight into how to address it. In particular, it suggests that one could use social anti-propagandistic techniques to recognize web spam.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Metaxas, P.T. (2010). Web Spam, Social Propaganda and the Evolution of Search Engine Rankings. In: Cordeiro, J., Filipe, J. (eds) Web Information Systems and Technologies. WEBIST 2009. Lecture Notes in Business Information Processing, vol 45. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12436-5_13

  • DOI: https://doi.org/10.1007/978-3-642-12436-5_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12435-8

  • Online ISBN: 978-3-642-12436-5

  • eBook Packages: Computer Science, Computer Science (R0)
