Modeling Navigation Patterns of Visitors of Unstructured Websites

Balog, K.; Hofgesang, P.; Kowalczyk, W.

doi:10.1007/978-1-84628-226-3_10

Conference paper

Abstract

In this paper we describe a practical approach for modeling the navigation patterns of visitors of unstructured websites. These patterns are derived from web logs that are enriched with three kinds of information: (1) the content type of visited pages, (2) the visitor type, and (3) the location of the visitor. We developed an intelligent Text Mining system, iTM, which supports the process of classifying web pages into a number of pre-defined categories. With the help of this system we were able to reduce the labeling effort by a factor of 10–20 without affecting the accuracy of the final result too much. Another feature of our approach is the use of a new technique for modeling navigation patterns: navigation trees. They provide a very informative graphical representation of the most frequent sequences of categories of visited pages.
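The abstract does not spell out how a navigation tree is built, so the following is only an illustrative sketch under one plausible reading: a prefix tree over per-session sequences of page categories, where each node counts the sessions that share that prefix. The names `Node`, `build_navigation_tree`, `frequent_paths`, and the `min_count` threshold are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a prefix tree over page-category sequences (assumed structure)."""
    count: int = 0
    children: dict = field(default_factory=dict)


def build_navigation_tree(sessions):
    """Insert every session (a list of category labels) into a prefix tree.

    Each node's count is the number of sessions that start with the
    category sequence leading to that node.
    """
    root = Node()
    for session in sessions:
        root.count += 1
        node = root
        for category in session:
            node = node.children.setdefault(category, Node())
            node.count += 1
    return root


def frequent_paths(root, min_count):
    """Return (path, count) pairs for prefixes seen at least min_count times.

    Branches below the threshold are pruned, which keeps only the most
    frequent category sequences (the threshold is an assumption).
    """
    paths = []
    stack = [((), root)]
    while stack:
        prefix, node = stack.pop()
        for category, child in node.children.items():
            if child.count >= min_count:
                path = prefix + (category,)
                paths.append((path, child.count))
                stack.append((path, child))
    return sorted(paths)
```

Given sessions already mapped to category labels, `frequent_paths(build_navigation_tree(sessions), min_count)` lists the surviving branches, which could then be drawn as a tree with edge weights taken from the counts.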

Author information

Authors and Affiliations

Department of Computer Science, Free University Amsterdam, De Boelelaan 1081 A, 1081HV, Amsterdam, The Netherlands
K. Balog, P. Hofgesang & W. Kowalczyk

Editor information

Editors and Affiliations

Faculty of Technology, University of Portsmouth, Portsmouth, UK
Max Bramer BSc, PhD, CEng, FBCS, FIEE, FRSA
Department of Computer Science, University of Liverpool, Liverpool, UK
Frans Coenen PhD
Nottingham Trent University, UK
Tony Allen PhD

About this paper

Cite this paper

Balog, K., Hofgesang, P., Kowalczyk, W. (2006). Modeling Navigation Patterns of Visitors of Unstructured Websites. In: Bramer, M., Coenen, F., Allen, T. (eds) Research and Development in Intelligent Systems XXII. SGAI 2005. Springer, London. https://doi.org/10.1007/978-1-84628-226-3_10

DOI: https://doi.org/10.1007/978-1-84628-226-3_10
Publisher Name: Springer, London
Print ISBN: 978-1-84628-225-6
Online ISBN: 978-1-84628-226-3
eBook Packages: Computer Science, Computer Science (R0)
