Abstract
We present a case study on applying the inductive database approach to the analysis of Web logs. We consider rich XML Web logs, called conceptual logs, that are generated by Web applications designed with the WebML conceptual model and developed with the WebRatio CASE tool. Conceptual logs integrate the usual information about user requests with meta-data concerning the structure of the content and the hypertext of a Web application. We apply a data mining language (MINE RULE) to conceptual logs in order to identify different types of patterns, such as recurrent navigation paths, the most frequently visited page contents, and anomalies (e.g., intrusion attempts or harmful usages of resources). We show that exploiting the nuggets of information embedded in the logs, together with the specialized mining constructs provided by the query language, enables rapid customization of the mining procedures according to the Web developers’ needs. Based on our field experience, we also suggest that the use of queries in advanced languages, as opposed to ad-hoc heuristics, eases the specification and the discovery of a large spectrum of patterns.
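As a rough illustration of the kind of pattern such queries target, the following Python sketch extracts simple page-to-page association rules from request records grouped by session. It is not MINE RULE syntax and not the procedure used in the case study; the record layout (a session identifier paired with a page name), the function name `mine_page_rules`, the thresholds, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import permutations


def mine_page_rules(requests, min_support=0.5, min_confidence=0.7):
    """Return rules (body_page, head_page, support, confidence).

    `requests` is an iterable of (session_id, page) pairs; this layout is a
    hypothetical simplification of a Web log, not the conceptual log format.
    """
    # Group the distinct pages requested in each session.
    sessions = defaultdict(set)
    for session_id, page in requests:
        sessions[session_id].add(page)

    total = len(sessions)
    if total == 0:
        return []

    page_count = defaultdict(int)  # sessions containing a given page
    pair_count = defaultdict(int)  # sessions containing both pages of a pair
    for pages in sessions.values():
        for page in pages:
            page_count[page] += 1
        for body, head in permutations(sorted(pages), 2):
            pair_count[(body, head)] += 1

    rules = []
    for (body, head), count in pair_count.items():
        support = count / total              # share of sessions with both pages
        confidence = count / page_count[body]  # share of body sessions with head
        if support >= min_support and confidence >= min_confidence:
            rules.append((body, head, support, confidence))
    return sorted(rules)


# Hypothetical toy log: four sessions over four pages.
requests = [
    ("s1", "home"), ("s1", "catalog"), ("s1", "cart"),
    ("s2", "home"), ("s2", "catalog"),
    ("s3", "home"), ("s3", "news"),
    ("s4", "catalog"), ("s4", "cart"),
]

for body, head, support, confidence in mine_page_rules(requests):
    print(f"{body} => {head} (support={support:.2f}, confidence={confidence:.2f})")
```

With the default thresholds, the only rule that survives on this toy log is `cart => catalog`: both pages co-occur in half of the sessions, and every session that contains `cart` also contains `catalog`. Raising or lowering the two thresholds plays the role that support and confidence constraints play in an association-rule query.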
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Meo, R., Lanzi, P.L., Matera, M., Esposito, R. (2006). Integrating Web Conceptual Modeling and Web Usage Mining. In: Mobasher, B., Nasraoui, O., Liu, B., Masand, B. (eds) Advances in Web Mining and Web Usage Analysis. WebKDD 2004. Lecture Notes in Computer Science, vol 3932. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11899402_9
DOI: https://doi.org/10.1007/11899402_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-47127-1
Online ISBN: 978-3-540-47128-8
eBook Packages: Computer Science (R0)