Mining Access Patterns Efficiently from Web Logs

  • Conference paper
Knowledge Discovery and Data Mining. Current Issues and New Applications (PAKDD 2000)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1805)

Abstract

With the explosive growth of data available on the World Wide Web, the discovery and analysis of useful information from the Web has become a practical necessity. A Web access pattern, that is, a sequence of accesses frequently pursued by users, is an interesting and useful kind of knowledge in practice.

In this paper, we study the problem of mining access patterns from Web logs efficiently. A novel data structure, called the Web access pattern tree, or WAP-tree for short, is developed for efficient mining of access patterns from pieces of logs. The WAP-tree stores highly compressed, critical information for access pattern mining and facilitates the development of novel algorithms for mining access patterns in a large set of log pieces. Our algorithm can find access patterns from Web logs quite efficiently. The experimental and performance studies show that our method is in general an order of magnitude faster than conventional methods.
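
The abstract does not spell out how the tree is laid out, so the following is only a minimal sketch of the general idea of storing log pieces in compressed form. It assumes a count-annotated prefix tree built over frequent events only; the names `Node`, `build_prefix_tree`, and `min_support`, as well as the sample log pieces, are hypothetical and are not taken from the paper.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """One tree node: a page label, a visit count, and children keyed by label."""
    label: str = ""
    count: int = 0
    children: dict = field(default_factory=dict)


def build_prefix_tree(sequences, min_support):
    """Build a count-annotated prefix tree over frequent events (assumed layout)."""
    # Pass 1: for each event, count how many sequences contain it at least once.
    support = Counter()
    for seq in sequences:
        support.update(set(seq))
    frequent = {event for event, n in support.items() if n >= min_support}

    # Pass 2: insert each sequence restricted to frequent events, so that
    # shared prefixes are stored once and their counts are accumulated.
    root = Node()
    for seq in sequences:
        node = root
        for event in seq:
            if event not in frequent:
                continue
            node = node.children.setdefault(event, Node(label=event))
            node.count += 1
    return root, frequent


# Hypothetical log pieces; each character stands for one accessed page.
logs = ["abdac", "eaebcac", "babfaec", "afbacfc"]
root, frequent = build_prefix_tree(logs, min_support=3)
print(sorted(frequent))
print({label: child.count for label, child in root.children.items()})
```

In this sketch, three of the four log pieces share the prefix "ab" once infrequent pages are dropped, so that prefix is stored once with a count instead of three times. This is the kind of compression the abstract alludes to; the mining algorithm itself is not sketched here.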

The work was supported in part by the Natural Sciences and Engineering Research Council of Canada (grant NSERC-A3723), the Networks of Centres of Excellence of Canada (grant NCE/IRIS-3), and the Hewlett-Packard Lab.

Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Pei, J., Han, J., Mortazavi-asl, B., Zhu, H. (2000). Mining Access Patterns Efficiently from Web Logs. In: Terano, T., Liu, H., Chen, A.L.P. (eds) Knowledge Discovery and Data Mining. Current Issues and New Applications. PAKDD 2000. Lecture Notes in Computer Science, vol 1805. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45571-X_47

  • DOI: https://doi.org/10.1007/3-540-45571-X_47

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-67382-8

  • Online ISBN: 978-3-540-45571-4

  • eBook Packages: Springer Book Archive
