Application of Closed Gap-Constrained Sequential Pattern Mining in Web Log Data

  • Conference paper
Advances in Control and Communication

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 137)

Abstract

Discovering information in web log data is a popular research area in data mining. Two common objectives are to obtain useful information about web users' behavior and to analyze the structure of web sites. In this paper, we propose a novel approach that generates web sequential patterns from web log data using a gap-constrained method. The mining process has two steps. First, the raw web log data are pre-processed by removing irrelevant or redundant items, grouping the requests of the same user, and transforming the log into a set of tuples (sequence identifier, sequence) constrained by visiting time. Second, web access patterns, which are closed sequential patterns with gap constraints, are generated from the web log data using the Gap-BIDE algorithm with two parameters: a minimum support threshold and a gap constraint. The experiment uses a data set derived from http://www.vtsns.edu.rs/maja/, which was introduced in [1]. The results show that applying sequential pattern mining to web log data as presented in this paper reveals the navigational behavior of web users, and that the order of the obtained patterns can guide a more reasonable design of the web page structure.
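
The two steps in the abstract can be illustrated with a small Python sketch. It is not the Gap-BIDE algorithm and not the authors' implementation: it is a brute-force miner, workable only on toy data, and every name in it (build_sequences, occurs, mine_frequent, closed_only, the 1800-second timeout) is a hypothetical choice made for illustration. It assumes that support counts the sequences containing a pattern, that a gap is the number of pages skipped between two consecutive pattern items, and that a pattern is closed when no frequent pattern containing it as an ordinary subsequence has the same support; the definitions used in the paper may differ.

```python
from collections import defaultdict

def build_sequences(records, timeout=1800):
    """Group (user, timestamp_seconds, page) records per user and cut a new
    sequence whenever two visits are more than `timeout` seconds apart.
    Returns a list of (sequence_id, [pages]) tuples."""
    by_user = defaultdict(list)
    for user, ts, page in records:
        by_user[user].append((ts, page))
    sequences = []
    for user in sorted(by_user):
        current, last_ts = [], None
        for ts, page in sorted(by_user[user]):
            if last_ts is not None and ts - last_ts > timeout:
                sequences.append((len(sequences) + 1, current))
                current = []
            current.append(page)
            last_ts = ts
        sequences.append((len(sequences) + 1, current))
    return sequences

def occurs(pattern, seq, min_gap, max_gap):
    """True if `pattern` appears in `seq` with between min_gap and max_gap
    pages skipped between each pair of consecutive pattern items."""
    ends = {i for i, x in enumerate(seq) if x == pattern[0]}
    for item in pattern[1:]:
        ends = {j for i in ends
                for j in range(i + 1 + min_gap, min(len(seq), i + 2 + max_gap))
                if seq[j] == item}
        if not ends:
            return False
    return bool(ends)

def mine_frequent(sequences, min_sup, min_gap, max_gap):
    """Level-wise prefix growth: extend each frequent pattern by one item.
    A pattern can only match if its prefix matches, so this is complete."""
    items = sorted({x for _, s in sequences for x in s})
    frequent, frontier = {}, [(x,) for x in items]
    while frontier:
        next_frontier = []
        for pat in frontier:
            sup = sum(occurs(pat, s, min_gap, max_gap) for _, s in sequences)
            if sup >= min_sup:
                frequent[pat] = sup
                next_frontier.extend(pat + (x,) for x in items)
        frontier = next_frontier
    return frequent

def is_subsequence(small, big):
    it = iter(big)
    return all(x in it for x in small)

def closed_only(frequent):
    """Keep patterns that have no longer frequent super-pattern with equal
    support (simplified closure check, see the assumptions above)."""
    return {p: s for p, s in frequent.items()
            if not any(len(q) > len(p) and t == s and is_subsequence(p, q)
                       for q, t in frequent.items())}

# Toy log: (user, seconds, page). The third user's last visit is far later,
# so it becomes a separate sequence.
records = [("u1", 0, "A"), ("u1", 10, "B"), ("u1", 20, "C"),
           ("u2", 0, "A"), ("u2", 5, "B"), ("u2", 15, "C"),
           ("u3", 0, "A"), ("u3", 30, "C"), ("u3", 5000, "B")]
sequences = build_sequences(records)
patterns = closed_only(mine_frequent(sequences, min_sup=2, min_gap=0, max_gap=0))
print(patterns)
```

With a gap of exactly 0 only adjacent pages can form a pattern, so on this toy log the page pair A, C is not frequent; widening the constraint to max_gap=1 lets A followed by C match in three sequences. The brute-force search re-scans every sequence for every candidate, which is the cost a dedicated algorithm such as Gap-BIDE is designed to avoid.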

References

  1. Dimitrijević, M., Bošnjak, Z., Subotica, S.: Discovering Interesting Association Rules in the Web Log Usage Data. Interdisciplinary Journal of Information, Knowledge, and Management 5, 191–207 (2010)

  2. Grace, L.K.J., Maheswari, V., Nagamalai, D.: Analysis of Web Logs and Web User in Web Mining. International Journal of Network Security & Its Applications (IJNSA) 3(1) (2011)

  3. Saxena, K., Shukla, R.: Significant Interval and Frequent Pattern Discovery in Web Log Data. International Journal of Computer Science Issues, IJCSI 7(1(3)) (2010)

  4. Suresh, K., Paul, S.: Distributed Linear Programming for Weblog Data using Mining Techniques in Distributed Environment. International Journal of Computer Applications (0975-8887) 11(7) (2010)

  5. Wang, Y., Le, J., Huang, D.: A Method for Privacy Preserving Mining of Association Rules Based on Web Usage Mining. In: 2010 International Conference on Web Information Systems and Mining, WISM, vol. 1, pp. 33–37. IEEE Computer Society, Washington, DC (2010)

  6. Wei, C., Sen, W., Yuan, Z., Chang, C.L.: Algorithm of mining sequential patterns for web personalization services. In: ACM SIGMIS Database, vol. 40. ACM, New York (2009)

  7. Zhu, J., Wu, H., Gao, G.: An Efficient Method of Web Sequential Pattern Mining Based on Session Filter and Transaction Identification. Journal of Networks (JNW) 5(9), 1017–1024 (2010) ISSN 1796-2056

  8. Santini, M.: Cross-Testing a Genre Classification Model for the Web. Genres on the Web, Part 3 42, 87–128 (2011)

  9. Rho, J.J., Moon, B.J., Kim, Y.J., Yang, D.H.: Internet Customer Segmentation Using Web Log Data. Journal of Business & Economics Research (JBER) 2(11) (2004)

  10. Kejžar, N., Černe, S.K., Batagelj, V.: Network Analysis of Works on Clustering and Classification from Web of Science. In: Proceedings of the 11th IFCS, Part 3, pp. 525–536 (2010)

  11. Xu, G., Zong, Y., Dolog, P., Zhang, Y.: Co-clustering Analysis of Weblogs Using Bipartite Spectral Projection Approach. In: Setchi, R., Jordanov, I., Howlett, R.J., Jain, L.C. (eds.) KES 2010. LNCS, vol. 6278, pp. 398–407. Springer, Heidelberg (2010)

  12. Makanju, A.A.O., Zincir-Heywood, A.N., Milios, E.E.: Clustering Event Logs Using Iterative Partitioning. In: Proceeding of KDD 2009, pp. 1255–1263 (2009)

  13. Wang, J., Mo, Y., Huang, B., Wen, J., He, L.: Web Search Results Clustering Based on a Novel Suffix Tree Structure. In: Rong, C., Jaatun, M.G., Sandnes, F.E., Yang, L.T., Ma, J. (eds.) ATC 2008. LNCS, vol. 5060, pp. 540–554. Springer, Heidelberg (2008)

  14. Chen, J., Cook, T.: Mining contiguous sequential patterns from web logs. In: Proceedings of WWW, pp. 1177–1178 (2007)

  15. Iváncsy, R., Vajk, I.: Frequent Pattern Mining in Web Log Data. Acta Polytechnica Hungarica 3(1), 77–90 (2006)

  16. Mabroukeh, N.R., Ezeife, C.I.: A taxonomy of sequential pattern mining algorithms. ACM Computing Surveys (CSUR) 43(1) (2010)

  17. Shivaprasad, G., Subbareddy, N.V., Acharya, U.D.: Knowledge Discovery from Web Usage Data: Research and Development of Web Access Pattern Tree Based Sequential Pattern Mining Techniques: A Survey. In: AIP Conference, vol. 1324, pp. 319–323 (2010)

  18. Grace, L.K.J., Maheswari, V., Nagamalai, D.: Web Log Data Analysis and Mining. In: Meghanathan, N., Kaushik, B.K., Nagamalai, D. (eds.) CCSIT 2011, Part III. Communications in Computer and Information Science (CCIS), vol. 133, pp. 459–469. Springer, Heidelberg (2011)

  19. Li, C., Wang, J.: Efficiently Mining Closed Subsequences with gap constraints. In: Proceeding of 2008 SIAM International Conference on Data Mining, pp. 313–322 (2008)


Corresponding author

Correspondence to Xiuming Yu.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yu, X., Li, M., Lee, D.G., Kim, K.D., Ryu, K.H. (2012). Application of Closed Gap-Constrained Sequential Pattern Mining in Web Log Data. In: Zeng, D. (eds) Advances in Control and Communication. Lecture Notes in Electrical Engineering, vol 137. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-26007-0_80

  • DOI: https://doi.org/10.1007/978-3-642-26007-0_80

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-26006-3

  • Online ISBN: 978-3-642-26007-0

  • eBook Packages: Engineering, Engineering (R0)
