A Neighborhood-based Matrix Factorization Technique for Recommendation
Abstract
Data sparsity and prediction quality are recognized as the key challenges in existing recommender systems. Most existing recommender systems depend on the collaborative filtering (CF) method, which mainly leverages the user-item rating matrix representing the relationship between users and items. However, CF-based methods sometimes fail to provide accurate information for predicting recommendations because they assume that the attributes of items are independent and identically distributed. In real applications, several kinds of coupling relationships or connections often exist among users or items. In this paper, we incorporate coupling relationship analysis to capture the under-discovered relationships between items and aim to make the ratings more reasonable. We then propose a neighborhood-based matrix factorization model, which considers both the explicit and implicit correlations between items, to suggest more reasonable items to users. The experimental evaluations demonstrate that the proposed algorithms outperform state-of-the-art algorithms in both warm-start and cold-start settings.
Keywords
Recommender systems, Coupling relationship, Matrix factorization, Cold-start, Predicting
1 Introduction
A recommender system (RS) is an important way to deal with information overload, since it applies information filtering approaches to provide users with suggestions that suit their preferences and tastes [1]. However, most existing RSs do not perform well because they suffer from the cold-start and sparsity problems caused by the massive growth of new items that have no ratings. As a result, the user and item vectors cannot be properly learned due to the lack of information, even though the decision-making quality of a recommender system in real applications always depends on the rating data from users.
Based on the ideas mentioned above, Fig. 1 illustrates a movie recommendation problem with coupling relationships. For example, “director”, “artists”, and “genre” are attributes of a movie, and the different directors, actors, and genres form the corresponding attribute values. Since the viewing records of each user provide information about the relevant attribute values, correlations between different movies can be established through the similarity of their attribute values. These values are often coupled together and serve as an extra source of information that helps to indicate why a user gives a particular rating.
The main reason for these problems is that traditional recommendation strategies such as collaborative filtering (CF) generally depend on the user-item rating matrix, which is usually only partially filled. Latent factor models, such as matrix factorization (MF), attempt to explain the ratings by mapping both items and users to the same latent factor space. MF models are effective at estimating the overall structure that relates to most or all items simultaneously.
This paper improves on such models by considering the Coupled Attribute Value Similarity. In addition, empirical evidence suggests that item coupling has strong explanatory power for the existing ratings. Therefore, incorporating this relevance can potentially benefit RS.
This paper is inspired by previous studies such as [5] and [6]. We address the CF problem by combining the ratings with the inter-relations of items. The main contributions are as follows:

We extend the item-coupling analysis method to reveal the implicit relationships between items, which enables it to deal effectively with sparse datasets;

We infer the implicit relationships from the item coupling relationship and integrate them, together with users’ subjective rating preferences, into the matrix factorization learning model;

We conduct an extensive experimental study on two real data sets and show that the proposed methods outperform three state-of-the-art methods for item cold-start recommendation.
2 Related Work
In this section, we review some classical approaches in recommender systems, mainly collaborative filtering (CF) and content-based (CB) techniques.
Since its emergence in the RS field, CF [1, 7] has achieved great success: CF methods are domain independent and rely only on historical user records without requiring explicit profiles, and they can capture subtle aspects of the data that are difficult to profile by other means. CF is most adept at detecting relationships between items or, alternatively, between users in order to generate recommendations. CF can be further categorized into neighborhood-based and model-based methods.
In recent years, neighborhood-based techniques have been effectively deployed and widely investigated by several researchers. Neighborhood-based methods include user-oriented [8] and item-oriented approaches [9]. User-oriented methods identify like-minded users with similar historical actions or ratings, while item-oriented methods estimate unknown ratings on the basis of similar items that tend to be rated similarly. User-oriented and item-oriented methods commonly model the similarities of users or items explicitly, or merge the two together [10]. The neighborhood-based method is widely used since it is intuitive and easy to implement. However, although it can generate reasonably precise results, it suffers from a serious scalability limitation as the number of users and items grows.
An alternative approach to collaborative filtering is the model-based method, which uses the observed ratings to train a well-designed model. The unknown ratings can then be estimated via the model instead of by handling the original rating matrix. The Bayesian hierarchical model [11, 12], the clustering model [13], and the latent factor model [7] are well-known examples in collaborative filtering. Among them, the most widely used single model is matrix factorization (MF). The merit of the MF approach is its flexibility in adding fundamental extensions to the basic model. MF techniques can be a potentially more effective method for elusive relational data, owing to their remarkable precision and scalability. Koren et al. contributed a considerable amount of related work that has guided the development of this area [7].
The main challenge in CF is to forecast the preferences of users effectively. However, traditional CF mainly focuses on building the user-item matrix, while a wealth of items are rated by only a small fraction of users. This leaves the majority of user-item relations unknown, an issue currently referred to as the cold-start problem [14]. CF settles the problem only to a certain degree and cannot provide a complete solution. Therefore, more advanced methods have been developed that fuse auxiliary information in order to recommend items with low popularity or new arrivals effectively. These approaches can improve the performance of personalized item recommendation.
Clearly, CF and CB methods complement each other because they typically deal with the same issue from different perspectives. Combining user-oriented and item-oriented filtering with extra information can strengthen the performance of the RS framework. Several hybrid algorithms that fuse the attribute similarities of users or items have been developed [14]. They focus on mining the dependence between user and item characteristics and the transitivity of feedback from indirect neighbors in the data sets. Based on this intuition, the similarity can capture the association of a new user-item pair and complement the predicted scores of the closest neighbors.
In summary, a number of recent studies have modified basic matrix factorization methods to improve the performance of recommender systems, but they presume that the attributes of items are independent and identically distributed (i.i.d.) and ignore the coupling relationships between items, which is not consistent with reality. Moreover, little work has been done to fuse the relations between item attributes into the model. Inspired by the concept of coupled attribute value similarity (CAVS), this paper focuses on approaches centered on item-item similarity, which predict a user’s rating for an item from the ratings that the user has given to similar items.
3 Coupling Relationship Analysis
Most similarity measures depend mainly on the historical rating scores, which are usually insufficient, or deal with the relationships between the categorical attributes of items separately. The commonly used similarity metric is the Pearson correlation coefficient (PCC) [18], which assumes that there is a linear relationship between the variables. In fact, the relationships between the attributes of items should be considered together when measuring similarity.
In this section, to uncover the implicit relationships, we leverage the information in categorical attributes to compute the CAVS. CAVS is composed of both intra-coupled and inter-coupled value similarities, and can capture relatively accurate relationships between items. The work in [19, 20] presented a detailed analysis, as shown in Fig. 2.
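To make the limitation of rating-based similarity concrete, the following minimal sketch computes the PCC between two items over the users who rated both. The dictionary layout and the function name `pearson_item_similarity` are assumptions made for illustration and are not taken from the paper.

```python
from math import sqrt

def pearson_item_similarity(ratings_i, ratings_j):
    """PCC of two items over the users who rated both.

    Each argument is a dict {user_id: rating} (a hypothetical layout).
    """
    common = sorted(set(ratings_i) & set(ratings_j))
    if len(common) < 2:
        return 0.0  # not enough co-ratings to estimate a correlation
    xs = [ratings_i[u] for u in common]
    ys = [ratings_j[u] for u in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant ratings carry no linear signal
    return cov / sqrt(var_x * var_y)

# Toy data: only users u1 and u2 rated both items.
item_a = {"u1": 5, "u2": 3, "u3": 1}
item_b = {"u1": 4, "u2": 2, "u4": 5}
sim = pearson_item_similarity(item_a, item_b)
```

With only two co-rating users the estimate rests on very little evidence, and an item with no ratings at all receives no similarity, which is the situation that attribute-based coupling is meant to address.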
3.1 Intra-coupled Attribute Value Similarity
3.2 Inter-coupled Attribute Value Similarity
Based on the exemplification, we employ Shannon entropy to determine the attribute weights. Since it is hard to obtain credible subjective weights, objective weights are required. One objective weighting measure that has been widely used in the multi-criteria decision making (MCDM) domain is Shannon’s entropy [21]. The notion of entropy relates to the quantity of information in a message as a statistical measure. Shannon’s entropy is a general measure of uncertainty in information, formulated in terms of probability theory. The entropy weight is a parameter that describes how much the different alternatives approach one another with respect to a certain criterion.
The degree of diversification is \(d_{k}=1-h_{k}\), \(k = 1,\ldots ,N\), where \(h_{k}\) is the entropy of attribute k.
The weight (degree of importance) of attribute k is \(\gamma _k =\frac{d_k}{\sum \nolimits _{j=1}^{N}{d_j}}\).
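The two quantities above can be sketched in a few lines. The definition of \(h_k\) is not reproduced in this text, so the sketch assumes the usual normalized Shannon entropy of the value distribution of attribute k, divided by the logarithm of the number of values; the function name `entropy_weights` and the count-based input are likewise assumptions.

```python
from math import log

def entropy_weights(value_counts):
    """Map per-attribute value counts to entropy weights gamma_k.

    value_counts[k] lists how often each value of attribute k occurs.
    Assumes every attribute has at least two possible values and that
    at least one attribute is not uniformly distributed.
    """
    diversification = []
    for counts in value_counts:
        total = sum(counts)
        probs = [c / total for c in counts if c > 0]
        # Assumed normalisation: divide by log(number of values) so h_k is in [0, 1].
        h_k = -sum(p * log(p) for p in probs) / log(len(counts))
        diversification.append(1.0 - h_k)  # d_k = 1 - h_k
    total_d = sum(diversification)
    return [d / total_d for d in diversification]  # gamma_k = d_k / sum_j d_j

# Three toy attributes: uniform, skewed, and fully concentrated.
weights = entropy_weights([[50, 50], [90, 10], [100, 0]])
```

Under this assumption a uniformly distributed attribute has maximal entropy and receives a weight near zero, while a more concentrated attribute receives a larger weight.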
3.3 Item Coupling
4 Neighborhood-based MF Model
In this section, we briefly introduce the basic MF model and then present the improved coupling item-based MF model. Our algorithm is based on the Neighbor-Integrated Matrix Factorization technique introduced in [17].
4.1 The Basic Matrix Factorization
The basic idea of the MF approach is to decompose the sparse user-item matrix into a joint latent factor space of low dimensionality f, and to use the inner products of the factorized user-specific and item-specific vectors to make predictions in that space.
Given an \(\hbox {M}\times \hbox {N}\) rating matrix \(E=\{r_{ui}\}\) that represents M users’ ratings on N items, each user u is described by a vector \(p_{u}\in \mathbb {R}^{f}\), which gauges the user’s affinity for the corresponding latent factors, and each item i is described by an item-factor vector \(q_{i}\in \mathbb {R}^{f}\), which measures the item’s relevance to the corresponding latent factors. The latent space \(\mathbb {R}^{f}\) captures the principal features of the data, where \(f < \min (M, N)\). The missing entries are estimated by the inner product of the corresponding user and item feature vectors, i.e. \(\hat{r}_{ui} = p_{u}^{\mathrm{T}}q_{i}\). A greater inner product between a user feature vector and an item feature vector indicates a stronger preference of the user for the item. The regularized squared loss is a frequently used error function [4, 7].
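As an illustration of the basic model, the sketch below fits \(p_u\) and \(q_i\) by stochastic gradient descent on the regularized squared loss over the observed ratings. It is a minimal sketch under stated assumptions: the learning rate, number of epochs, initialization, and the helper names `train_mf`, `predict`, and `training_rmse` are illustrative choices that are not specified in the paper.

```python
import random
from math import sqrt

def train_mf(ratings, n_users, n_items, f=5, lr=0.02, reg=0.001,
             epochs=500, seed=0):
    """Fit user factors P and item factors Q on (user, item, rating) triples."""
    rng = random.Random(seed)
    P = [[rng.gauss(0, 0.1) for _ in range(f)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(f)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for k in range(f):
                pu, qi = P[u][k], Q[i][k]
                # Gradient step on (r - p.q)^2 + reg * (|p|^2 + |q|^2).
                P[u][k] += lr * (err * qi - reg * pu)
                Q[i][k] += lr * (err * pu - reg * qi)
    return P, Q

def predict(P, Q, u, i):
    """Predicted rating: inner product of user and item factor vectors."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))

def training_rmse(P, Q, ratings):
    """Root mean squared error on the given rating triples."""
    sq = [(r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings]
    return sqrt(sum(sq) / len(sq))

# Toy data: 3 users, 3 items, 6 observed ratings.
toy = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
       (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
P0, Q0 = train_mf(toy, 3, 3, epochs=0)   # untrained factors
P, Q = train_mf(toy, 3, 3)               # trained factors
before, after = training_rmse(P0, Q0, toy), training_rmse(P, Q, toy)
```

Only the observed entries enter the loss, which is why items with no ratings keep essentially arbitrary factor vectors in this basic model.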
The traditional matrix factorization technique is context-unaware, since it concentrates primarily on the known entries; the rating information is usually sparse, and the method also suffers from poor scalability. Hence this approach is insensitive to a subgroup of items or users that are relatively similar, and relying on the overall structure alone results in information loss.
In the next section we present a variant of the MF technique that improves the quality of recommendation significantly.
4.2 The Coupling Item-based Matrix Factorization
In this subsection, we propose our approach, which combines the Coupled Attribute Value Similarity between items with the classic matrix factorization model for recommendation. It should be pointed out that, although there are some differences between items, the normalized neighbors of a certain item share similar properties in some aspects, which reflects the propagation of the item’s traits. Namely, the item latent feature vector \(q_i\) and the feature vectors of its neighbors tend to be similar in the latent space. Building on this observation, we combine the overall structure with this local information to form the prediction model holistically.
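The model equations are not reproduced in this text, so the following is only one plausible reading of the idea, not the authors’ formulation: the prediction blends the item’s own factor vector with a similarity-weighted average over its Top-k most similar items, with a mixing weight \(\alpha \). The blending form, the weight normalization, and the names `top_k_neighbors` and `predict_with_neighbors` are all assumptions.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def top_k_neighbors(similarity_row, item, k):
    """Return the k items most similar to `item`, with weights summing to 1."""
    candidates = [(j, s) for j, s in enumerate(similarity_row)
                  if j != item and s > 0]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    top = candidates[:k]
    total = sum(s for _, s in top)
    return [(j, s / total) for j, s in top]

def predict_with_neighbors(p_u, Q, item, similarity, alpha=0.6, k=10):
    """Assumed form: alpha * p.q_i + (1 - alpha) * sum_j w_ij * p.q_j."""
    own = dot(p_u, Q[item])
    neighbors = top_k_neighbors(similarity[item], item, k)
    if not neighbors:
        return own  # no usable neighbors: fall back to plain MF
    neighbor_part = sum(w * dot(p_u, Q[j]) for j, w in neighbors)
    return alpha * own + (1 - alpha) * neighbor_part

# Toy example with 3 items and 2 latent factors.
p_u = [1.0, 2.0]
Q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
similarity = [[1.0, 0.75, 0.25], [0.75, 1.0, 0.5], [0.25, 0.5, 1.0]]
score = predict_with_neighbors(p_u, Q, 0, similarity, alpha=0.6, k=2)
```

In this reading the similarity matrix would come from the coupled attribute value similarity rather than from ratings, so a new item with no ratings can still borrow information from its attribute neighbors.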
5 Experiments
In this section, we aim to verify the accuracy of the proposed coupling item-based matrix factorization method (CISMF). We use five-fold cross-validation for training and testing: we randomly split each data set into five folds, and in each iteration four of them serve as the training set while the remaining one serves as the test set.
5.1 Experimental Settings
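The five-fold protocol can be sketched as follows. The random seed, the triple layout of the ratings, and the function name `five_fold_splits` are assumptions for illustration.

```python
import random

def five_fold_splits(ratings, n_folds=5, seed=42):
    """Yield (train, test) pairs; each rating lands in exactly one test fold."""
    shuffled = list(ratings)
    random.Random(seed).shuffle(shuffled)
    # Deal the shuffled ratings into n_folds nearly equal folds.
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    for held_out in range(n_folds):
        test = folds[held_out]
        train = [r for idx, fold in enumerate(folds)
                 if idx != held_out for r in fold]
        yield train, test

# Toy data: 10 users x 5 items, every pair rated 3.
toy_ratings = [(u, i, 3) for u in range(10) for i in range(5)]
splits = list(five_fold_splits(toy_ratings))
```

Reported scores would then be averaged over the five test folds.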
5.1.1 Data Set
Experiments are conducted on two publicly available collaborative filtering datasets, MovieLens 100k (ML100k) and MovieLens 1M (ML1M). These two datasets are provided by the GroupLens research group in the Department of Computer Science and Engineering at the University of Minnesota, and are widely adopted in current research.
ML100k consists of 100,000 ratings (1–5) from 943 users on 1682 movies, and ML1M offers 1 million ratings from 6040 users on 3900 movies. In both datasets all users have rated at least 20 movies, and the sparsity of the datasets is 0.9369 and 0.9553, respectively. Apart from the historical scores, the datasets also supply extra information about movie attributes, including genre and release year, which makes them particularly meaningful for item-oriented recommendation.
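Sparsity here is taken to mean the fraction of empty cells in the user-item matrix, which is an assumption about the definition used. A quick check for ML100k with the counts given above:

```python
def sparsity(n_ratings, n_users, n_items):
    """Fraction of user-item pairs that have no rating."""
    return 1.0 - n_ratings / (n_users * n_items)

# ML100k: 100,000 ratings from 943 users on 1682 movies.
ml100k = sparsity(100_000, 943, 1682)
```

The result is about 0.937, in line with the value of 0.9369 reported for ML100k.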
5.1.2 Evaluation Metrics
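The results in this section are reported as mean absolute error (MAE) and root mean squared error (RMSE). The metric formulas are not reproduced in this text, so the sketch below assumes the standard definitions over the test ratings; the function names are illustrative.

```python
from math import sqrt

def mae(actual, predicted):
    """Mean absolute error between true and predicted ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error between true and predicted ratings."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Toy example: four test ratings and their predictions.
actual = [4, 3, 5, 1]
predicted = [3.5, 3, 4, 2]
toy_mae = mae(actual, predicted)    # 0.625
toy_rmse = rmse(actual, predicted)  # 0.75
```

Lower values are better for both metrics; RMSE penalizes large individual errors more heavily than MAE.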
5.2 Comparison with Other Methods
 A. RSVD: regularized singular value decomposition, introduced in [22], is a classic baseline model.
 B. NMF: non-negative matrix factorization, presented in [23], restricts the latent features to non-negative values during the learning process.
 C. PMF: probabilistic matrix factorization, proposed in [11], is a well-known method used in traditional recommender systems.
 D. BPMF: Bayesian probabilistic matrix factorization, proposed in [12], efficiently employs Markov chain Monte Carlo methods.
Table 1 Results of comparative study on the MovieLens datasets
Dataset         Metric  RSVD    NMF     PMF     BPMF    CISMF
MovieLens 100K  MAE     0.7433  0.7724  0.7522  0.7465  0.7279
MovieLens 100K  RMSE    0.9473  0.9874  0.9667  0.9533  0.9268
MovieLens 1M    MAE     0.6885  0.7286  0.7306  0.7023  0.6814
MovieLens 1M    RMSE    0.8670  0.9203  0.9234  0.8907  0.8592
The parameter settings of our method in the experiments are \(\alpha = 0.6\), Top-k = 10, \(\lambda _{p}=\lambda _{q} = 0.001\), and \(d=5\). Table 1 summarizes the results on the test data; our results, shown in the CISMF column, outperform the other methods on both commonly used data sets. A larger dimension may bring more noise into the model during the learning procedure. CISMF improves on the best baseline (RSVD) in both MAE and RMSE on both datasets, which points to a promising direction for recommendation. In the following, we explore the other factors in more detail and report performance only in MAE.
5.3 Validation on Cold Start Items
Figure 3 shows the quantitative results for cold-start items with respect to different categories. As shown in the figure, CISMF achieves the best results, which indicates that considering the coupling relationship is effective.
5.4 Validation of Parameter
Figure 4 shows how the performance of CISMF, which fuses similar neighborhoods, changes with \(\alpha \). The best performance is reached at \(\alpha = 0.5\) on MovieLens 100K and \(\alpha = 0.6\) on MovieLens 1M. This suggests that the neighbors’ features are valuable for our model.
5.5 Validation of Size of Neighborhood
Figure 5a shows the impact of the neighborhood size on MAE for MovieLens 100K. The MAE reaches its minimum at Top-k \(=\) 10 and rises slightly as Top-k increases from 10 to 50. Figure 5b shows the influence of the neighborhood size on MAE for MovieLens 1M. Top-k ranges from 20 to 100 with a step size of 20, and the MAE shows no evident fluctuation from Top-k \(=\) 40 onwards. Our analysis indicates that too few neighbors may not provide enough information, while too many neighbors may introduce uncorrelated information; both can decrease accuracy.
6 Conclusion
In this paper, we addressed the cold-start problem for new items and items that have received few ratings, which has not been well studied. Following the intuition that item attribute information can boost the accuracy of prediction, we fused a novel coupling similarity measure into matrix factorization for recommender systems. According to our analysis and experiments, the coupling relationship serves as a better source of information about similar items.
In future research, we will collect more datasets with correlated attributes and use them to enhance our algorithm. We have not considered cold-start users in this paper; it would be interesting to incorporate rich social relationships into the recommendation framework, and we plan to investigate the new algorithm further.
Acknowledgments
This work was supported by the Natural Science for Youth Foundation of China (No. 61003162) and the Young Scholars Growth Plan of Liaoning (No. LJQ2013038).
References
 1. Linden G, Smith B, York J (2003) Amazon.com recommendations: item-to-item collaborative filtering. IEEE Internet Comput 7(1):76–80
 2. Jaschke R, Marinho L, Hotho A, Schmidt L, Stumme G (2007) Tag recommendations in folksonomies. Proceedings of the 11th conference on european conference on principles and practice of knowledge discovery in databases, pp 506–514
 3. Levandoski J, Sarwat M, Eldawy A, Mokbel F (2012) LARS: a location-aware recommender system. Proceedings of the 28th conference on data engineering, pp 450–461
 4. McAuley J, Leskovec J (2013) From amateurs to connoisseurs: modeling the evolution of user expertise through online reviews. Proceedings of the 22nd international conference on world wide web, pp 897–908
 5. Ma H, King I, Lyu MR (2009) Learning to recommend with social trust ensemble. Proceedings of the 32nd conference on research and development in information retrieval, pp 203–210
 6. Ma H, Zhou D, Liu C (2011) Recommender system with social regularization. Proceedings of the 4th conference on web search and data mining, pp 287–296
 7. Koren Y (2010) Collaborative filtering with temporal dynamics. Commun ACM 53(4):89–97
 8. Middleton SE, Shadbolt NR, De Roure DC (2004) Ontological user profiling in recommender systems. ACM Trans Inf Syst 22(1):54–88
 9. Paterek A (2007) Improving regularized singular value decomposition for collaborative filtering. Proceedings of the 13th international conference on knowledge discovery and data mining, pp 5–8
 10. Salakhutdinov R, Mnih A (2008) Bayesian probabilistic matrix factorization using Markov chain Monte Carlo. Proceedings of the 25th conference on international conference on machine learning, pp 880–887
 11. Sarwar M, Karypis G, Konstan J, Riedl J (2002) Recommender systems for large-scale e-commerce: scalable neighborhood formation using clustering. Proceedings of the 5th international conference on computer and information technology, p 1
 12. Salakhutdinov R, Mnih A (2007) Probabilistic matrix factorization. Proceedings of the 20th conference on neural information processing systems foundation, pp 1257–1264
 13. Sarwar B, Karypis G, Riedl J (2001) Item-based collaborative filtering recommendation algorithms. Proceedings of the 10th international conference on world wide web, pp 285–295
 14. Gantner Z, Drumond L, Freudenthaler C, Rendle S, Schmidt-Thieme L (2010) Learning attribute-to-feature mappings for cold-start recommendations. Proceedings of the 10th international conference on data mining, pp 176–185
 15. Balabanovic M, Shoham Y (1997) Fab: content-based collaborative filtering. Commun ACM 40(3):66–72
 16. Hotho A, Jaschke R, Schmitz C, Stumme G (2006) FolkRank: a ranking algorithm for folksonomies. LWA 1:111–114
 17. Li FF, Xu GD, Cao LB (2014) Coupled item-based matrix factorization. Proceedings of the 15th international conference on web information systems engineering, pp 1–14
 18. Breese JS, Heckerman D, Kadie C (1998) Empirical analysis of predictive algorithms for collaborative filtering. Proceedings of the 14th conference on uncertainty in artificial intelligence, pp 43–52
 19. Cao L, Ou Y, Yu PS (2012) Coupled behavior analysis with applications. IEEE Trans Knowl Data Eng 24(8):1378–1392
 20. Wang J, De Vries AP, Reinders MJT (2006) Unifying user-based and item-based collaborative filtering approaches by similarity fusion. Proceedings of the 29th conference on research and development in information retrieval, pp 501–508
 21. Lotfi H, Fallahnejad R (2010) Imprecise Shannon’s entropy and multi attribute decision making. Entropy 12(1):53–62
 22. Nguyen JJ, Zhu M (2013) Content-boosted matrix factorization techniques for recommender systems. Stat Anal Data Mining 6(4):286–301
 23. Lee DD, Seung HS (2001) Algorithms for non-negative matrix factorization. Proceedings of the 14th conference on advances in neural information processing systems, pp 556–562