Web Usage Mining: Sequential Pattern Extraction with a Very Low Support

  • Conference paper
Advanced Web Technologies and Applications (APWeb 2004)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 3007))

Abstract

The goal of this work is to increase the relevance and interestingness of the patterns discovered by a Web Usage Mining process. Sequential patterns extracted from web log files, unless they are mined under constraints, often lack interest because their content is obvious. We aim to discover coherent behaviors of minority users that are worth knowing about (such as hacking activities on the Web site, or user activity limited to a specific part of the Web site). Using a clustering method applied to the extracted sequential patterns, we propose a recursive division of the problem. The clustering method is based on pattern summaries and neural networks. Our experiments show that we obtain the targeted patterns, whereas extracting them with a classical process is impossible because of their very low support (down to 0.006%). The diversity of users’ behaviors is so large that the minority ones are both numerous and difficult to locate.
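To make the notion of support concrete, here is a minimal Python sketch. It is not the authors' algorithm, which the abstract does not detail; it only illustrates the standard definition of support as the fraction of sessions that contain a pattern as an ordered subsequence. The function names, the toy sessions, and the simplification that each step of a pattern is a single page are all assumptions made for illustration.

```python
from typing import Sequence


def contains_subsequence(session: Sequence[str], pattern: Sequence[str]) -> bool:
    """Return True if the pages of `pattern` occur in `session` in order (gaps allowed)."""
    it = iter(session)
    # `page in it` advances the iterator, so each match must come after the previous one.
    return all(page in it for page in pattern)


def support(sessions: Sequence[Sequence[str]], pattern: Sequence[str]) -> float:
    """Return the fraction of sessions that contain `pattern` as a subsequence."""
    if not sessions:
        return 0.0
    hits = sum(contains_subsequence(s, pattern) for s in sessions)
    return hits / len(sessions)


# Hypothetical toy log (invented for this example): four user sessions.
sessions = [
    ["/index", "/docs", "/download"],
    ["/index", "/login", "/admin", "/admin/config"],
    ["/index", "/docs"],
    ["/index", "/news", "/docs", "/download"],
]

frequent = support(sessions, ["/index", "/docs"])  # obvious, majority behavior
rare = support(sessions, ["/login", "/admin"])     # minority behavior
```

A miner run with a minimum support above the value of `rare` would report only the obvious majority pattern. At a threshold as low as the 0.006% mentioned in the abstract, a pattern needs to occur in only about 6 sessions out of every 100,000 to qualify, which is why a direct search over the whole log becomes impractical.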

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Masseglia, F., Tanasa, D., Trousse, B. (2004). Web Usage Mining: Sequential Pattern Extraction with a Very Low Support. In: Yu, J.X., Lin, X., Lu, H., Zhang, Y. (eds) Advanced Web Technologies and Applications. APWeb 2004. Lecture Notes in Computer Science, vol 3007. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24655-8_56

  • DOI: https://doi.org/10.1007/978-3-540-24655-8_56

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-21371-0

  • Online ISBN: 978-3-540-24655-8

  • eBook Packages: Springer Book Archive
