Analysis of Web Proxy Logs

  • Bennie Fei
  • Jan Eloff
  • Martin Olivier
  • Hein Venter
Part of the IFIP Advances in Information and Communication book series (IFIPAICT, volume 222)


Network forensics involves capturing, recording and analysing network audit trails. A crucial part of network forensics is gathering evidence at the server level, at the proxy level and from other sources. A web proxy relays URL requests from clients to a server. Analysing web proxy logs can give unobtrusive insight into the browsing behavior of computer users and provide an overview of Internet usage in an organisation. More importantly, in terms of network forensics, it can aid in detecting anomalous browsing behavior. This paper demonstrates the use of a self-organising map (SOM), a powerful data mining technique, in network forensics. In particular, it focuses on how a SOM can be used to analyse data gathered at the web proxy level.
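As a rough illustration of the idea, and not the authors' implementation, the sketch below trains a small SOM in plain Python and uses the distance to the best matching unit as a simple anomaly indicator. The two features per request (a scaled hour of day and a scaled response size), the 3x3 map size, the decay schedules and the sample values are all assumptions made for this example; they are not taken from the paper.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid coordinate of the unit whose weights are closest to x."""
    return min(
        weights,
        key=lambda u: sum((w - v) ** 2 for w, v in zip(weights[u], x)),
    )


def quantisation_error(weights, x):
    """Euclidean distance from x to the weights of its best matching unit."""
    bmu = best_matching_unit(weights, x)
    return math.sqrt(sum((w - v) ** 2 for w, v in zip(weights[bmu], x)))


def train_som(data, rows=3, cols=3, epochs=50, lr0=0.5, seed=0):
    """Train a rectangular SOM on equal-length numeric vectors (hypothetical helper)."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = max(rows, cols) / 2.0
    # One weight vector per map unit, initialised at random in [0, 1).
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total
            lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood radius shrinks
            bmu = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                grid_d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


# Assumed features per request: [hour_of_day / 24, scaled response size].
sample_rng = random.Random(1)
typical_requests = [
    [0.4 + sample_rng.uniform(-0.05, 0.05), 0.2 + sample_rng.uniform(-0.05, 0.05)]
    for _ in range(40)
]
som = train_som(typical_requests)

unusual_request = [0.95, 0.95]  # e.g. a late-night, very large transfer
print("typical error:", quantisation_error(som, typical_requests[0]))
print("unusual error:", quantisation_error(som, unusual_request))
```

A request that lies far from every map unit has a large quantisation error, which is one simple way to flag it for closer inspection. In practice the feature set would be derived from the actual proxy log fields, and the threshold would need tuning.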


Network forensics, web proxy logs, self-organising map, data analysis, anomalous behavior



Copyright information

© IFIP International Federation for Information Processing 2006

Authors and Affiliations

  • Bennie Fei (1)
  • Jan Eloff (1)
  • Martin Olivier (1)
  • Hein Venter (1)

  1. University of Pretoria, Pretoria, South Africa
