Extracting Extended Web Logs to Identify the Origin of Visits and Search Keywords

  • Jeeva Jose
  • P. Sojan Lal
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 182)


Web Usage Mining is the extraction of information from web log data. The extended web log file records user traffic and behavior, along with the browser type, browser version, and operating system used. Mining these web logs reveals the origin of each visit (the referring website) and the popular keywords used to reach a website. This paper proposes an approach based on the indiscernibility relation in rough set theory to extract information from extended web logs, identifying the origin of visits and the keywords used to visit a website. The results can support better website design and search engine optimization.
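As an illustration of the idea summarized above, the following minimal sketch parses log lines, extracts the referring host and search keywords, and groups visits into indiscernibility classes, that is, sets of records that cannot be told apart on a chosen set of attributes. It is not the authors' implementation. The log layout (Apache-style Combined Log Format), the `q` query parameter, the sample lines, and the names `parse_line` and `indiscernibility_classes` are all assumptions made for this example.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse, parse_qs

# Assumed layout: Combined Log Format (the paper's exact log format is not given here).
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return origin, keywords and user agent for one log line, or None if it does not match."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    ref = m.group("referrer")
    parsed = urlparse(ref) if ref not in ("", "-") else None
    # A missing referrer is treated as a direct visit (an assumption).
    origin = parsed.netloc.lower() if parsed and parsed.netloc else "direct"
    keywords = ()
    if parsed:
        # Assumes the search engine passes the query in a "q" parameter.
        q = parse_qs(parsed.query).get("q", [])
        if q:
            keywords = tuple(q[0].lower().split())
    return {"origin": origin, "keywords": keywords, "agent": m.group("agent")}

def indiscernibility_classes(records, attributes):
    """Group record indices that agree on every attribute in `attributes`."""
    classes = defaultdict(list)
    for idx, rec in enumerate(records):
        key = tuple(rec[a] for a in attributes)
        classes[key].append(idx)
    return dict(classes)

# Hypothetical sample log lines, invented for this example.
SAMPLE = [
    '10.0.0.1 - - [10/Oct/2011:13:55:36 +0530] "GET /index.html HTTP/1.1" 200 2326 '
    '"http://www.google.com/search?q=web+usage+mining" "Mozilla/5.0"',
    '10.0.0.2 - - [10/Oct/2011:13:56:01 +0530] "GET /index.html HTTP/1.1" 200 2326 '
    '"http://www.google.com/search?q=rough+sets" "Mozilla/5.0"',
    '10.0.0.3 - - [10/Oct/2011:13:57:12 +0530] "GET /about.html HTTP/1.1" 200 1024 '
    '"-" "Opera/9.80"',
]

records = [parse_line(line) for line in SAMPLE]
by_origin = indiscernibility_classes(records, ["origin"])
```

Here `by_origin` places the two search-engine visits in one class and the direct visit in another. Passing `["origin", "keywords"]` instead refines the partition, so visits from the same referrer with different search terms fall into separate classes.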


Web Usage Mining, Extended Web Log, Keyword Search





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. BPC College, Piravom, India
  2. School of Computer Science, Mahatma Gandhi University, Kottayam, India
