How Web 1.0 fails: the mismatch between hyperlinks and clickstreams

Abstract

The core of the Web is a hyperlink navigation system collaboratively set up by webmasters to help users find desired information. While it is well known that search engines are important for navigation, the extent to which search has led to a mismatch between hyperlinks and the pathways that users actually take has not been quantified. By applying network science to publicly available hyperlink and clickstream data for approximately 1,000 of the top Web sites, we show that the mismatch between hyperlinks and clickstreams is indeed substantial. We demonstrate that this mismatch has arisen because webmasters attempt to build a global virtual world without geographical or cultural boundaries, but users in fact prefer to navigate within more fragmented, language-based groups of Web sites. We call this type of behavior “preferential navigation” and find that it is driven by “local” search engines.
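The abstract does not say how the mismatch is measured, so the following is only a minimal sketch of one way such a comparison could be set up, not the authors' method. It treats hyperlinks and clickstreams as two sets of directed site-to-site edges and reports their overlap. The function names, the Jaccard measure, and the toy site names and traffic counts are all assumptions made for illustration.

```python
# Illustrative sketch only: the measures and the toy data are assumptions,
# not the procedure or the data used in the paper.

def edge_jaccard(hyperlinks, clicked_edges):
    """Jaccard similarity between two sets of directed (source, target) edges."""
    union = hyperlinks | clicked_edges
    if not union:
        return 1.0  # two empty networks are trivially identical
    return len(hyperlinks & clicked_edges) / len(union)


def unused_link_share(hyperlinks, click_counts):
    """Fraction of hyperlinks that carry no recorded traffic."""
    if not hyperlinks:
        return 0.0
    unused = [edge for edge in hyperlinks if click_counts.get(edge, 0) == 0]
    return len(unused) / len(hyperlinks)


# Hypothetical hyperlink network: an edge exists if site A links to site B.
hyperlinks = {
    ("a.com", "b.com"),
    ("a.com", "c.kr"),
    ("b.com", "c.kr"),
    ("c.kr", "d.kr"),
}

# Hypothetical clickstream network: daily visits moving from site A to site B.
click_counts = {
    ("a.com", "b.com"): 120,
    ("c.kr", "d.kr"): 300,
    ("d.kr", "c.kr"): 80,  # traffic with no matching hyperlink
}

clicked_edges = {edge for edge, count in click_counts.items() if count > 0}

overlap = edge_jaccard(hyperlinks, clicked_edges)
unused = unused_link_share(hyperlinks, click_counts)

print(f"edge overlap (Jaccard): {overlap:.2f}")
print(f"share of hyperlinks with no traffic: {unused:.2f}")
```

A low overlap together with a high share of unused hyperlinks would point to the kind of mismatch the abstract describes. A fuller analysis would also weight edges by traffic volume and compare community structure across the two networks.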

Notes

  1. Because we manually identified communities for our analysis of the importance of search engines in user navigation, the clickstreams used for Fig. 4 are inconsistent with those underlying Table 2 (where communities were identified automatically). For example, the clickstreams for the four sites from the Korean community reported in Fig. 4 sum to 3.7 million, which exceeds the total daily clickstreams for the entire community reported in Table 2 (2.29 million). This inconsistency does not affect our qualitative findings.

Acknowledgments

We thank Jonathan J. H. Zhu, Lexing Xie, Paul Thomas, Hai Liang, and the reviewers for providing comments on an earlier version of this paper.

Author information

Corresponding author

Correspondence to Robert Ackland.

About this article

Cite this article

Wu, L., Ackland, R. How Web 1.0 fails: the mismatch between hyperlinks and clickstreams. Soc. Netw. Anal. Min. 4, 202 (2014). https://doi.org/10.1007/s13278-014-0202-8

Keywords

  • Clickstream
  • Hyperlink
  • Search engine
  • Navigation
  • Social network analysis