Prediction of users webpage access behaviour using association rule mining

Sadhana

Abstract

Web usage mining is a technique for identifying user needs from web logs, and discovering hidden patterns in these logs is an emerging research area. Association rules play an important role in many web mining applications by detecting interesting patterns. However, association rule mining generates an enormous number of rules, so researchers must spend considerable time and expertise to find the truly interesting ones. This paper works on server logs from the MSNBC dataset for the month of September 1999. The research aims to predict the page a user is likely to visit next, among the web pages listed in this data, from the user's navigation behaviour by using the Apriori prefix tree (PT) algorithm. The generated rules were ranked by the support, confidence and lift evaluation measures. The final predictions revealed that the interestingness of pages depended mainly on the support and lift measures, whereas confidence took a uniform value across all pages. The results showed that the system guaranteed 100% confidence at a support of 1.3E−05. Pages such as Front page, On-air, News, Sports and BBS attracted more subsequent visits than Travel, MSN-News and MSN-Sports, which were of less interest.
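
As a rough illustration of the three measures named in the abstract, the sketch below computes support, confidence and lift for single-page rules over a handful of browsing sessions and ranks the rules by lift and then support. It is a minimal sketch under stated assumptions: the sessions are invented toy data rather than the MSNBC logs, the function names (`support`, `rule_measures`, `rank_rules`) are hypothetical, the ranking order is an arbitrary choice, and the rules are enumerated by brute force rather than with the Apriori prefix tree algorithm used in the paper.

```python
from itertools import permutations

# Hypothetical toy sessions: each set holds the page categories one user visited.
# These are illustrative only and are NOT taken from the MSNBC dataset.
sessions = [
    {"frontpage", "news", "sports"},
    {"frontpage", "news"},
    {"frontpage", "on-air"},
    {"news", "sports"},
    {"frontpage", "news", "bbs"},
]


def support(itemset, sessions):
    """Fraction of sessions that contain every page in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= s for s in sessions) / len(sessions)


def rule_measures(antecedent, consequent, sessions):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    a, c = frozenset(antecedent), frozenset(consequent)
    supp_a, supp_c = support(a, sessions), support(c, sessions)
    if supp_a == 0 or supp_c == 0:
        raise ValueError("antecedent and consequent must each occur in some session")
    supp = support(a | c, sessions)   # P(A and C)
    conf = supp / supp_a              # P(C | A)
    lift = conf / supp_c              # P(C | A) / P(C)
    return supp, conf, lift


def rank_rules(sessions):
    """Brute-force all single-page rules a -> c; sort by lift, then support."""
    pages = sorted(set().union(*sessions))
    rules = []
    for a, c in permutations(pages, 2):
        if support({a, c}, sessions) == 0:
            continue  # the two pages never co-occur, so no rule is formed
        supp, conf, lift = rule_measures({a}, {c}, sessions)
        rules.append((a, c, supp, conf, lift))
    return sorted(rules, key=lambda r: (r[4], r[2]), reverse=True)


if __name__ == "__main__":
    for a, c, supp, conf, lift in rank_rules(sessions):
        print(f"{a} -> {c}: support={supp:.2f} confidence={conf:.2f} lift={lift:.2f}")
```

In this toy data the rule sports -> news holds in every session that contains sports, so its confidence is 1 while its support is only 0.4. This shows on a small scale how a rule can reach full confidence at a low support, and why support and lift are then needed to separate the rules.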

Author information

Corresponding author

Correspondence to SHOMONA G JACOB.

About this article

Cite this article

GEETHARAMANI, R., REVATHY, P. & JACOB, S.G. Prediction of users webpage access behaviour using association rule mining. Sadhana 40, 2353–2365 (2015). https://doi.org/10.1007/s12046-015-0424-0

  • DOI: https://doi.org/10.1007/s12046-015-0424-0
