An Efficient Algorithm for Finding Frequent Sequential Traversal Patterns from Web Logs Based on Dynamic Weight Constraint

Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 150)

Abstract

Many frequent sequential traversal pattern mining algorithms have been developed to mine the set of frequent traversal subsequences that satisfy a minimum support constraint in a session database. However, these algorithms give equal weight to all sequential traversal patterns, even though the pages in those patterns differ in importance and therefore in weight. Another main problem with most of these algorithms is that they produce a large number of sequential traversal patterns when the minimum support is lowered, and they provide no way to adjust the number of patterns other than increasing the minimum support. In this paper, we propose a frequent sequential traversal pattern mining algorithm with weight constraints. Our main approach is to add weight constraints to sequential traversal pattern mining while maintaining the downward closure property. To maintain this property, a weight range is defined: pages are given different weights, and traversal sequences are assigned a minimum and a maximum weight. While scanning a session database, the maximum and minimum weights in the session database are used to prune infrequent sequential traversal subsequences, so that the downward closure property is maintained.
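
The pruning idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract gives no definitions, so the sketch assumes that the weighted support of a pattern is its support count multiplied by the average weight of its pages, and that a candidate is pruned when its support multiplied by the maximum page weight falls below the threshold. The names `mine_weighted_patterns`, `min_wsup`, and the sample data are hypothetical. Because support never increases when a pattern is extended and the maximum weight is a fixed upper bound, pruning on this bound cannot discard a pattern that has a qualifying extension, which is one way to keep a downward-closure-style guarantee.

```python
from typing import Dict, List, Sequence, Tuple

Pattern = Tuple[str, ...]


def is_subsequence(pattern: Sequence[str], session: Sequence[str]) -> bool:
    """True if the pages of `pattern` occur in `session` in order (gaps allowed)."""
    it = iter(session)
    return all(page in it for page in pattern)


def support(pattern: Sequence[str], sessions: List[Sequence[str]]) -> int:
    """Number of sessions that contain `pattern` as a subsequence."""
    return sum(is_subsequence(pattern, s) for s in sessions)


def mine_weighted_patterns(
    sessions: List[Sequence[str]],
    weights: Dict[str, float],
    min_wsup: float,
) -> Dict[Pattern, float]:
    """Level-wise search for patterns with weighted support >= min_wsup.

    Assumed definition (not taken from the paper):
        weighted support = support * average page weight of the pattern.
    """
    if min_wsup <= 0:
        raise ValueError("min_wsup must be positive so the search terminates")
    max_w = max(weights.values())  # upper bound on any pattern's average weight
    pages = sorted(weights)
    results: Dict[Pattern, float] = {}
    frontier: List[Pattern] = [()]
    while frontier:
        next_frontier: List[Pattern] = []
        for prefix in frontier:
            for page in pages:
                cand = prefix + (page,)
                sup = support(cand, sessions)
                # Prune: no extension of cand can exceed sup * max_w.
                if sup * max_w < min_wsup:
                    continue
                next_frontier.append(cand)
                avg_w = sum(weights[p] for p in cand) / len(cand)
                if sup * avg_w >= min_wsup:
                    results[cand] = sup * avg_w
        frontier = next_frontier
    return results


# Hypothetical session database and page weights, for illustration only.
sessions = [["a", "b", "c"], ["a", "c"], ["a", "b", "c"], ["b", "c"]]
weights = {"a": 0.9, "b": 0.3, "c": 0.6}
found = mine_weighted_patterns(sessions, weights, min_wsup=1.8)
```

In this sketch the threshold `min_wsup` gives a second way to control the number of reported patterns besides the raw minimum support, which is the adjustment the abstract says earlier algorithms lack. A candidate such as `("a", "b")` survives pruning because its upper bound still meets the threshold, but it is not reported because the low weight of page `b` pulls its weighted support below it.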

Keywords

Sequential traversal pattern mining; Weight constraint; Web usage mining; Data mining

Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  1. Shri Vaisnav Institute of Technology and Science, Indore, India
  2. Shri Vaisnav Institute of Technology and Science, Indore, India
