Information Retrieval Journal, Volume 18, Issue 3, pp 167–214

An effective Web page recommender using binary data clustering

  • Rana Forsati (Email author)
  • Alireza Moayedikia
  • Mehrnoush Shamsfard


With the growth of the Web, the amount of data online is increasing in an uncontrolled way, which makes it hard for users to find relevant and required information, an issue usually referred to as information overload. Recommender systems are among the appealing methods that can handle this problem effectively. These systems are either based on collaborative filtering and content-based approaches, or rely on ratings of items and the behavior of users to generate customized recommendations. In this paper we propose an efficient Web page recommender that exploits the session data of users. To this end, we propose a novel clustering algorithm to partition the binary session data into a fixed number of clusters and utilize the partitioned sessions to make recommendations. The proposed binary clustering algorithm is scalable and employs a novel method for finding the representative of a set of binary vectors to serve as the cluster center, which might be interesting in its own right. In addition, the proposed clustering algorithm is integrated with the \(k\)-means algorithm to achieve better clustering quality by combining its explorative power with the fine-tuning power of \(k\)-means. We have performed extensive experiments on a real dataset to demonstrate the advantages of the proposed binary data clustering methods and Web page recommendation algorithm. In particular, the proposed recommender system is compared with previously published work, with respect to minimum frequency and the number of recommended pages, to show its superiority in terms of accuracy, coverage and F-measure.
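The abstract names accuracy, coverage and F-measure without defining them. As a minimal sketch, assuming the definitions commonly used in Web usage mining (accuracy as the fraction of recommended pages that the user actually visits, coverage as the fraction of visited pages that were recommended, and F-measure as their harmonic mean), the three scores for a single session could be computed as follows. The function name `evaluate_recommendations` and the page identifiers are hypothetical and are not taken from the paper.

```python
def evaluate_recommendations(recommended, visited):
    """Return (accuracy, coverage, f_measure) for one user session.

    Assumed definitions (not confirmed by the abstract):
      accuracy  = |recommended & visited| / |recommended|
      coverage  = |recommended & visited| / |visited|
      f_measure = harmonic mean of accuracy and coverage
    """
    rec, vis = set(recommended), set(visited)
    hits = len(rec & vis)  # pages that were both recommended and visited

    # Guard against empty sets so the ratios are always defined.
    accuracy = hits / len(rec) if rec else 0.0
    coverage = hits / len(vis) if vis else 0.0

    total = accuracy + coverage
    f_measure = 2 * accuracy * coverage / total if total else 0.0
    return accuracy, coverage, f_measure


# Hypothetical session: four pages recommended, three pages later visited.
scores = evaluate_recommendations(
    recommended=["a", "b", "c", "d"],
    visited=["b", "c", "e"],
)
print(scores)
```

Under these assumed definitions, recommending more pages tends to raise coverage while lowering accuracy, which is one reason results are typically reported against the number of recommended pages.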


Recommender systems; Binary data clustering; k-Means; Harmony search optimization



The authors would like to thank the Associate Editor and anonymous reviewers for their immensely insightful comments and helpful suggestions on the original version of this paper. The authors also would like to acknowledge the assistance of Professor Robin Burke, School of Computing, DePaul University in reviewing early drafts of this manuscript.



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Rana Forsati (1), Email author
  • Alireza Moayedikia (2)
  • Mehrnoush Shamsfard (3)

  1. Department of Computer Science and Engineering, Michigan State University, East Lansing, USA
  2. Department of Information Systems and Business Analytics, Deakin University, Burwood, Australia
  3. Natural Language Processing (NLP) Research Laboratory, Faculty of Electrical and Computer Engineering, Shahid Beheshti University, G. C., Tehran, Iran
