
Web Structure Mining by Isolated Stars

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4936)

Abstract

The link structure of the Web is generally modeled as a graph, the webgraph. Web structure mining is a research area that analyzes the webgraph, mainly to find hidden communities in the Web. In this paper, we identify a substructure that occurs frequently in the webgraph and define it as an isolated star (i-star). We propose an efficient algorithm for enumerating i-stars and apply it to real web data. We observe that most i-stars correspond to index structures within single domains, while some are verified to represent useful communities, which supports i-stars as a candidate substructure for structure mining. We also suggest that i-stars can help in preprocessing the webgraph into a more succinct representation for further structure mining.
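The abstract does not give the formal definition of an i-star or the enumeration algorithm itself, so the following is only a minimal sketch under a working assumption that is not taken from the paper: links are treated as undirected, and an i-star is a center vertex together with at least `min_leaves` neighbors whose only neighbor is that center. The function name `enumerate_istars` and the parameter `min_leaves` are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def enumerate_istars(edges, min_leaves=2):
    """Enumerate star-like structures under an ASSUMED working definition.

    Assumption (not from the paper): an i-star is a center vertex plus
    at least `min_leaves` neighbors that have no neighbor other than
    the center, with links treated as undirected.

    Returns a dict mapping each center to its sorted list of leaves.
    """
    # Build an undirected adjacency structure, ignoring self-loops.
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    # A vertex with exactly one neighbor is a leaf of that neighbor.
    leaves_of = defaultdict(list)
    for vertex, neighbors in adj.items():
        if len(neighbors) == 1:
            (center,) = neighbors
            leaves_of[center].append(vertex)

    # Keep only centers with enough leaves. With min_leaves >= 2 an
    # isolated edge is never reported, since each endpoint would be
    # the other's single leaf.
    return {
        center: sorted(leaves)
        for center, leaves in leaves_of.items()
        if len(leaves) >= min_leaves
    }


if __name__ == "__main__":
    # Toy graph: "hub" links to three pages that link nowhere else.
    toy_edges = [
        ("hub", "a"), ("hub", "b"), ("hub", "c"),
        ("hub", "x"), ("x", "y"), ("y", "z"), ("y", "hub"),
    ]
    print(enumerate_istars(toy_edges))
```

Under this assumed definition the scan touches each link a constant number of times when building the adjacency sets and each vertex once afterwards, so it runs in time roughly linear in the size of the graph. The hub-with-leaves shape in the toy graph is the kind of pattern the abstract associates with index structures inside a single domain.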



Editor information

William Aiello, Andrei Broder, Jeannette Janssen, Evangelos Milios

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Uno, Y., Ota, Y., Uemichi, A. (2008). Web Structure Mining by Isolated Stars. In: Aiello, W., Broder, A., Janssen, J., Milios, E. (eds) Algorithms and Models for the Web-Graph. WAW 2006. Lecture Notes in Computer Science, vol 4936. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78808-9_14

  • DOI: https://doi.org/10.1007/978-3-540-78808-9_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-78807-2

  • Online ISBN: 978-3-540-78808-9

  • eBook Packages: Computer Science, Computer Science (R0)
