Classification Technique for Improving User Access on Web Log Data

  • Bina Kotiyal
  • Ankit Kumar
  • Bhaskar Pant
  • R. H. Goudar
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 243)

Abstract

The Internet now plays a significant role in everyday life, and it is difficult to do without it. Web log files, which record users’ access to the Web, can yield valuable information about those users when mined. At the same time, the rapid growth of data mining applications has shown the need for machine learning algorithms that can be applied to large-scale data. In this paper, we apply the naïve Bayesian (NB) classification technique in Weka to identify frequent access patterns. The main objective is to categorize users’ browsing behavior based on their position. We run experiments that classify user access behavior from large databases, which could increase the efficiency and effectiveness of the system by reducing users’ browsing time and speeding up information retrieval.
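
As a rough illustration of the kind of classification described in the abstract, the sketch below fits a categorical naïve Bayes model with add-one (Laplace) smoothing in plain Python. It is not the authors' Weka setup: the feature names (user position, time of day), the page-category labels, the toy records, and the helper names `train_nb` and `predict_nb` are all assumptions made up for this example.

```python
from collections import Counter, defaultdict
import math


def train_nb(rows, labels):
    """Count class and per-feature value frequencies for categorical naive Bayes."""
    class_counts = Counter(labels)
    # feature_counts[label][feature_index][value] = number of occurrences
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    values = defaultdict(set)  # distinct values seen for each feature
    for row, label in zip(rows, labels):
        for i, v in enumerate(row):
            feature_counts[label][i][v] += 1
            values[i].add(v)
    return class_counts, feature_counts, values


def predict_nb(model, row):
    """Return the label with the highest smoothed log-posterior for one record."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior
        for i, v in enumerate(row):
            count = feature_counts[label][i][v]
            # Add-one smoothing keeps unseen values from zeroing the product.
            score += math.log((count + 1) / (n + len(values[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical preprocessed log records: (user position, time of day)
rows = [
    ("student", "morning"),
    ("student", "evening"),
    ("student", "evening"),
    ("faculty", "morning"),
    ("faculty", "morning"),
    ("faculty", "evening"),
]
# Hypothetical page category requested in each record
labels = ["courses", "forum", "forum", "research", "research", "courses"]

model = train_nb(rows, labels)
print(predict_nb(model, ("student", "evening")))
print(predict_nb(model, ("faculty", "morning")))
```

Working in log space avoids underflow when many features are multiplied together, and the smoothing term is one common choice rather than anything the paper specifies.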

Keywords

Data mining · Weka · Classification · Web data · Web usage mining · Preprocessing · Pattern discovery · Naïve Bayesian


Copyright information

© Springer India 2014

Authors and Affiliations

  • Bina Kotiyal (1)
  • Ankit Kumar (2)
  • Bhaskar Pant (2)
  • R. H. Goudar (1)
  1. Computer Science and Engineering Department, Graphic Era University, Dehradun, India
  2. Information Technology Department, Era University, Dehradun, India