Distributed and Parallel Databases, Volume 13, Issue 2, pp 161–180

On Using a Warehouse to Analyze Web Logs

  • Karuna P. Joshi
  • Anupam Joshi
  • Yelena Yesha


Analyzing web logs for usage and access trends can not only provide important information to web site developers and administrators, but also help in creating adaptive web sites. While many existing tools generate fixed reports from web logs, they typically do not allow ad-hoc analysis queries. Moreover, such tools cannot discover hidden patterns of access embedded in the access logs. We describe a relational OLAP (ROLAP) approach for creating a web-log warehouse. The warehouse is populated both from web logs and from the results of mining web logs. We discuss the design criteria that influenced our choice of dimensions, facts, and data granularity. We developed a web-based ad-hoc tool for analytic queries on the warehouse. We present some of the performance-specific experiments that we performed on our warehouse.

Keywords: web mining, data warehouse, web logs, e-commerce, adaptive websites




Copyright information

© Kluwer Academic Publishers 2003

Authors and Affiliations

  • Karuna P. Joshi (1)
  • Anupam Joshi (1)
  • Yelena Yesha (1)
  1. Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore, USA
