Abstract
The goal of this work is to increase the relevance and interestingness of the patterns discovered by a Web Usage Mining process. Sequential patterns extracted from web log files, unless they are mined under constraints, often lack interest because their content is obvious. We aim to discover coherent behaviors of minority user groups that we want to be aware of, such as hacking activity on the Web site or user activity confined to a specific part of the site. We propose a recursive division of the problem, driven by a clustering method applied to the extracted sequential patterns. This clustering method is based on pattern summaries and neural networks. Our experiments show that we obtain the targeted patterns, whereas a classical extraction process cannot find them because their support is very low (down to 0.006%). The diversity of user behaviors is so large that the minority ones are both numerous and difficult to locate.
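To illustrate why a very low support defeats a classical extraction and why dividing the problem helps, here is a minimal sketch on invented toy data. It is not the method of the paper: the clustering based on pattern summaries and neural networks is replaced by a deliberately trivial grouping on the page visited at a given position, and the names `is_subsequence`, `support`, and `split_by_page`, as well as the page names, are hypothetical.

```python
def is_subsequence(pattern, session):
    """True if the pages of `pattern` occur in `session` in order (gaps allowed)."""
    pages = iter(session)
    # `page in pages` advances the iterator, so the order of pages is respected.
    return all(page in pages for page in pattern)


def support(pattern, sessions):
    """Fraction of sessions that contain `pattern` as a subsequence."""
    return sum(is_subsequence(pattern, s) for s in sessions) / len(sessions)


def split_by_page(sessions, position):
    """Group sessions by the page found at `position`.

    Assumption: a crude stand-in for the clustering step, used only to
    show the effect of dividing the log into sub-populations.
    """
    groups = {}
    for s in sessions:
        key = s[position] if position < len(s) else None
        groups.setdefault(key, []).append(s)
    return groups


# Invented toy log: 100,000 sessions, 6 of which reach an admin login page.
sessions = (
    [["home", "products", "cart"]] * 90_000
    + [["home", "docs", "api"]] * 9_994
    + [["home", "docs", "admin", "login"]] * 6
)
pattern = ["admin", "login"]

# Support over the whole log: 6 sessions out of 100,000.
global_support = support(pattern, sessions)

# First division: keep only the sessions whose second page is "docs".
docs_sessions = split_by_page(sessions, 1)["docs"]
docs_support = support(pattern, docs_sessions)

# Second division, inside the "docs" group, on the third page.
admin_sessions = split_by_page(docs_sessions, 2)["admin"]
admin_support = support(pattern, admin_sessions)
```

On this toy log the pattern is supported by 6 of 100,000 sessions (0.006%) globally, by 6 of 10,000 inside the first sub-population, and by all 6 sessions of the second. A minimum support that would be impractically low on the full log becomes an ordinary threshold once the sessions have been divided.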
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Masseglia, F., Tanasa, D., Trousse, B. (2004). Web Usage Mining: Sequential Pattern Extraction with a Very Low Support. In: Yu, J.X., Lin, X., Lu, H., Zhang, Y. (eds) Advanced Web Technologies and Applications. APWeb 2004. Lecture Notes in Computer Science, vol 3007. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24655-8_56
DOI: https://doi.org/10.1007/978-3-540-24655-8_56
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21371-0
Online ISBN: 978-3-540-24655-8
eBook Packages: Springer Book Archive