Hierarchical Directed Acyclic Graph (HDAG) Based Preprocessing Technique for Session Construction

  • S. Chitra
  • B. Kalpana
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 177)


Web access log analysis examines patterns of web site usage and features of user behavior. Preprocessing the log data is essential for efficient web usage mining because raw log data is very noisy. Session construction is a vital step in the preprocessing phase. Recently, various real-world problems have been modeled as traversals on a graph, and mining these traversals provides effective results. Existing work, however, considers only traversals on unweighted graphs. This paper generalizes this to the case where the vertices of the graph are given weights to reflect their significance. Patterns are closed frequent Directed Acyclic Graphs with page browsing time. The proposed method constructs sessions using an efficient Directed Acyclic Graph approach that contains pages with calculated weights. A Hierarchical Directed Acyclic Graph (HDAG) Kernel approach is used for session construction. The HDAG Kernel directly accepts several levels of both chunks and their relations, and then efficiently computes the weighted sum of the number of common attribute sequences of the HDAGs. This will help site administrators find the pages that interest users and redesign their web pages. After weighting each page according to browsing time, a DAG structure is constructed for each user session.
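
The abstract gives no algorithmic detail, so the following is only a minimal sketch of weighting pages by browsing time and building one DAG per session. It is not the authors' HDAG Kernel method. The 30-minute inactivity timeout, the handling of the last page's unknown browsing time, the rule of skipping edges that would close a cycle, and all function names are assumptions made for illustration.

```python
from collections import defaultdict

SESSION_TIMEOUT = 30 * 60  # seconds; assumed threshold, not taken from the paper


def split_sessions(requests, timeout=SESSION_TIMEOUT):
    """Split one user's (timestamp, page) requests into sessions by inactivity gap."""
    sessions, current = [], []
    for ts, page in sorted(requests):
        if current and ts - current[-1][0] > timeout:
            sessions.append(current)
            current = []
        current.append((ts, page))
    if current:
        sessions.append(current)
    return sessions


def page_weights(session):
    """Weight each page by its share of the session's browsing time.

    The last page's browsing time cannot be read from the log, so it is
    set to the mean of the observed gaps (an assumption of this sketch).
    """
    times = defaultdict(float)
    gaps = []
    for (ts, page), (next_ts, _) in zip(session, session[1:]):
        times[page] += next_ts - ts
        gaps.append(next_ts - ts)
    times[session[-1][1]] += sum(gaps) / len(gaps) if gaps else 0.0
    total = sum(times.values())
    if total == 0:  # e.g. a single-page session: fall back to uniform weights
        return {page: 1.0 / len(times) for page in times}
    return {page: t / total for page, t in times.items()}


def reachable(edges, src, dst):
    """Return True if dst can be reached from src by following edges."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(edges.get(node, ()))
    return False


def build_dag(session):
    """Build an adjacency map from consecutive page visits, kept acyclic.

    Self-loops and edges that would close a cycle (for example, a back
    navigation) are skipped; this rule is an assumption of the sketch.
    """
    edges = {page: set() for _, page in session}
    for (_, a), (_, b) in zip(session, session[1:]):
        if a != b and not reachable(edges, b, a):
            edges[a].add(b)
    return edges


# Hypothetical log entries for one user: (seconds since start, requested page).
requests = [(0, "/home"), (40, "/products"), (100, "/home"),
            (130, "/cart"), (5000, "/home")]
for session in split_sessions(requests):
    print(page_weights(session), build_dag(session))
```

With these hypothetical entries, the long gap before the final request starts a second session, and the return from /products to /home is dropped from the first session's graph so that it stays acyclic.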


Web Usage Mining; Session Construction; Directed Acyclic Graph (DAG); Preprocessing; Robots Cleaning





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. Department of Computer Science, Government Arts College, Coimbatore, India
  2. Avinashilingam Institute for Home Science and Higher Education for Women, Coimbatore, India
