Abstract
Search engines have greatly influenced the way we experience the web. Since the early days of the web, users have relied on them to get informed and make decisions. When the web was relatively small, web directories were built and maintained by human experts who screened and categorized pages according to their characteristics. By the mid-1990s, however, it was apparent that the human-expert model of categorizing web pages did not scale. The first search engines appeared, and they have been evolving ever since, taking over the role that web directories used to play.
But what need makes a search engine evolve? Beyond the financial objectives, there is a need for quality in search results. Search engines know that the quality of their ranking will determine how successful they are. Search results, however, are not based simply on well-designed scientific principles; they are also influenced by web spammers. Web spamming, the practice of introducing artificial text and links into web pages to affect the results of web searches, has been recognized as a major search engine problem. It is also a serious problem for users, because they are not aware of it and tend to confuse trusting the search engine with trusting the results of a search.
In this paper, we analyze the influence that web spam has on the evolution of search engines, and we identify the strong relationship between spamming methods on the web and propagandistic techniques in society. Our analysis provides a foundation for understanding why spamming works and offers new insight into how to address it. In particular, it suggests that one could use social anti-propagandistic techniques to recognize web spam.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Metaxas, P.T. (2010). Web Spam, Social Propaganda and the Evolution of Search Engine Rankings. In: Cordeiro, J., Filipe, J. (eds) Web Information Systems and Technologies. WEBIST 2009. Lecture Notes in Business Information Processing, vol 45. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12436-5_13
DOI: https://doi.org/10.1007/978-3-642-12436-5_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12435-8
Online ISBN: 978-3-642-12436-5
eBook Packages: Computer Science, Computer Science (R0)