Abstract
The link structure of the Web is generally viewed as a graph, the webgraph. One of the main objectives of web structure mining is to find hidden communities on the Web based on the webgraph, and one approach is to enumerate substructures, each of which corresponds to the set of web pages of a community or its core. Such research has shown that certain substructures find sets of pages that are inherently irrelevant to communities. In this paper, we propose a model, which we call contracted webgraphs, in which such substructures are contracted into single nodes to hide useless information. We then apply structure mining iteratively to these contracted webgraphs, since we can expect to find further hidden information once irrelevant information is eliminated. We also explore structural properties of contracted webgraphs from the viewpoint of scale-freeness, and we observe that they exhibit novel and extreme self-similarities.
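The contraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `contract`, the edge-list representation, and the choice to drop self-loops and parallel edges are all assumptions made for illustration, and the node sets to contract are taken as given input rather than found by any particular mining method.

```python
def contract(edges, groups):
    """Contract each group of nodes into a single node (illustrative sketch).

    edges:  iterable of (source, target) pairs of a directed graph
    groups: list of disjoint node sets, each standing in for one substructure
    Returns the edge set of the contracted graph. Self-loops and parallel
    edges are dropped, which is an assumption of this sketch.
    """
    # Map every grouped node to a label for its group; other nodes keep their own name.
    representative = {}
    for index, group in enumerate(groups):
        for node in group:
            representative[node] = ("contracted", index)

    contracted_edges = set()
    for source, target in edges:
        new_source = representative.get(source, source)
        new_target = representative.get(target, target)
        if new_source != new_target:  # skip edges internal to a contracted group
            contracted_edges.add((new_source, new_target))
    return contracted_edges


# Hypothetical toy graph: pages "a" and "b" form one substructure to hide.
toy_edges = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "d"), ("a", "d")]
toy_result = contract(toy_edges, [{"a", "b"}])
```

Because the result is again a plain edge set, the same function can be applied repeatedly, which mirrors the iterative mining on contracted webgraphs that the abstract mentions.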
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Uno, Y., Oguri, F. (2011). Contracted Webgraphs: Structure Mining and Scale-Freeness. In: Atallah, M., Li, XY., Zhu, B. (eds) Frontiers in Algorithmics and Algorithmic Aspects in Information and Management. Lecture Notes in Computer Science, vol 6681. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21204-8_31
DOI: https://doi.org/10.1007/978-3-642-21204-8_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21203-1
Online ISBN: 978-3-642-21204-8
eBook Packages: Computer Science (R0)