Improving Web Sites with Web Usage Mining, Web Content Mining, and Semantic Analysis

  • Jean-Pierre Norguet
  • Esteban Zimányi
  • Ralf Steinberger
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3831)


With the emergence of the World Wide Web, Web sites have become a key communication channel for organizations. In this context, analyzing and improving Web communication is essential to better satisfy the objectives of the target audience. Web communication analysis is traditionally performed by Web analytics software, which produces long lists of audience metrics. These metrics contain little semantics and are too detailed to be exploited by organization managers and chief editors, who need summarized and conceptual information to make decisions. Our solution to obtain such conceptual metrics is to analyze the content of the Web pages output by the Web server. In this paper, we first present a list of methods that we conceived to mine the output Web pages. Then, we explain how term weights in these pages can be used as audience metrics, and how they can be aggregated using OLAP tools to obtain concept-based metrics. Finally, we present the concept-based metrics that we obtained with our prototype WASA and SQL Server OLAP tools.
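The aggregation idea in the abstract can be illustrated with a minimal sketch. It is not the WASA prototype or the paper's actual weighting scheme: the served pages, hit counts, and concept hierarchy below are hypothetical, term weights are assumed to be plain occurrence counts multiplied by the number of times a page was output, and a simple roll-up in Python stands in for the OLAP tools.

```python
from collections import Counter, defaultdict
import re

# Hypothetical data: (text of an output page, number of times it was served).
served_pages = [
    ("database systems and data warehouse design", 3),
    ("olap cube design", 2),
    ("network engineering courses", 1),
]

# Hypothetical concept hierarchy: child -> parent.
# Lower-case keys are terms, capitalised names are concepts.
parent = {
    "database": "Data management",
    "warehouse": "Data management",
    "olap": "Data management",
    "network": "Networking",
    "Data management": "Computer science",
    "Networking": "Computer science",
}


def term_weights(pages):
    """Count term occurrences, weighted by how often each page was served."""
    weights = Counter()
    for text, hits in pages:
        for term in re.findall(r"[a-z]+", text.lower()):
            weights[term] += hits
    return weights


def roll_up(weights, hierarchy):
    """Add each term's weight to every ancestor concept in the hierarchy."""
    totals = defaultdict(int)
    for term, weight in weights.items():
        node = term
        while node in hierarchy:
            node = hierarchy[node]
            totals[node] += weight
    return dict(totals)


weights = term_weights(served_pages)
concept_metrics = roll_up(weights, parent)
for concept, value in sorted(concept_metrics.items()):
    print(concept, value)
```

Terms that do not appear in the hierarchy (for example "design") keep a term-level weight but contribute to no concept, which is one reason a real deployment would depend on the coverage of the chosen ontology.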


Keywords: Organization Manager, Site Editor, Chief Editor, Page Request, Network Monitor
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Jean-Pierre Norguet (1)
  • Esteban Zimányi (1)
  • Ralf Steinberger (2)
  1. Department of Computer & Network Engineering, Université Libre de Bruxelles, Brussels, Belgium
  2. Joint Research Centre, European Commission, Ispra, Italy
