The core of the Web is a hyperlink navigation system collaboratively set up by webmasters to help users find desired information. While it is well known that search engines are important for navigation, the extent to which search has led to a mismatch between hyperlinks and the pathways that users actually take has not been quantified. By applying network science to publicly available hyperlink and clickstream data for approximately 1,000 of the top Web sites, we show that the mismatch between hyperlinks and clickstreams is indeed substantial. We demonstrate that this mismatch has arisen because webmasters attempt to build a global virtual world without geographical or cultural boundaries, but users in fact prefer to navigate within more fragmented, language-based groups of Web sites. We call this type of behavior “preferential navigation” and find that it is driven by “local” search engines.
Because we manually identified communities for our analysis of the importance of search engines in user navigation, the clickstreams used for Fig. 4 are inconsistent with those underlying Table 2 (where communities were automatically identified). For example, the sum of the clickstreams for the four sites from the Korean community reported in Fig. 4 is 3.7 million, which exceeds the total daily clickstreams for the entire community reported in Table 2 (2.29 million). However, this inconsistency has no impact on our qualitative findings.
We thank Jonathan J. H. Zhu, Lexing Xie, Paul Thomas, Hai Liang, and the reviewers for providing comments on an earlier version of this paper.
Cite this article
Wu, L., Ackland, R. How Web 1.0 fails: the mismatch between hyperlinks and clickstreams. Soc. Netw. Anal. Min. 4, 202 (2014). https://doi.org/10.1007/s13278-014-0202-8
- Search engine
- Social network analysis