# A Neighborhood-based Matrix Factorization Technique for Recommendation

## Abstract

Data sparsity and prediction quality are recognized as the key challenges in existing recommender systems. Most existing recommender systems depend on the collaborative filtering (CF) method, which mainly leverages the user-item rating matrix representing the relationship between users and items. However, CF-based methods sometimes fail to provide accurate information for predicting recommendations because they assume that the attributes of items are independent and identically distributed. In real applications, several kinds of coupling relationships or connections often exist among users or items. In this paper, we incorporate coupling relationship analysis to capture the under-discovered relationships between items and to make the ratings more reasonable. We then propose a neighborhood-based matrix factorization model, which considers both the explicit and implicit correlations between items, to suggest more reasonable items to the user. The experimental evaluations demonstrate that the proposed algorithms outperform state-of-the-art algorithms in both warm- and cold-start settings.

### Keywords

Recommender systems, Coupling relationship, Matrix factorization, Cold-start, Predicting

## 1 Introduction

A recommender system (RS) is an important way to deal with information overload, since it applies information filtering approaches to provide users with suggestions that suit their preferences and tastes [1]. However, most existing RS do not perform well because they suffer from the cold-start and sparsity problems caused by the massive growth of new items that enter the system with no ratings. As a result, the user and item vectors cannot be learned properly due to the lack of information. At the same time, the decision-making quality of a recommender system in real applications always depends on the rating data supplied by users.

Based on the ideas mentioned above, an illustration of a movie recommendation problem with coupling relationships can be found in Fig. 1. For example, the attributes of a movie consist of “director”, “artists”, and “genre”. Different directors, actors and genres jointly form the corresponding attribute values. Since the viewing records of each user offer information about the relevant attribute values, correlations between different movies can be established from the similarity of their attribute values. These values are often coupled together and serve as an extra source of information that helps indicate why a user gives a particular rating.

The main reason for these problems is that traditional recommendation strategies such as collaborative filtering (CF) generally depend on the user-item rating matrix, which is usually only partially filled. Latent factor models, such as matrix factorization (MF), attempt to explain the ratings by transforming both items and users into the same latent factor space. MF models are effective at estimating overall structure that relates to most or all items simultaneously.

This paper improves on such models by considering Coupled Attribute Value Similarity. In addition, empirical evidence suggests that item coupling has considerable explanatory power for the existing ratings. Therefore, combining these relevance signals can potentially be utilized in RS.

This paper is inspired by previous studies such as [5] and [6]. We address the CF problem by combining the ratings with the inter-relations between items. The main contributions are as follows:

- We extend the item-coupling analysis method to reveal the implicit relationships between items, which enables the model to deal effectively with sparse datasets;
- We infer the implicit relationships from the item coupling and combine them with users’ subjective rating preferences in the matrix factorization learning model;
- We conduct an extensive experimental study on two real data sets and show that the proposed method outperforms three state-of-the-art methods for item cold-start recommendation.

## 2 Related Work

In this section, we review some classical approaches which mainly include collaborative filtering (CF) and content-based (CB) techniques in Recommender Systems.

Since its emergence in the RS field, CF [1, 7] has achieved great success because CF methods are domain independent and rely only on historical user records without requiring explicit profiles, and because they can capture subtle aspects of the data that are difficult to profile by other means. CF is most adept at detecting relationships between items or, alternatively, between users for generating recommendations. CF can be further categorized into neighborhood-based and model-based methods.

In recent years, neighborhood-based techniques have been effectively deployed and widely investigated by many researchers. Neighborhood-based methods include user-oriented [8] and item-oriented approaches [9]. User-oriented methods identify like-minded users with similar historical actions or ratings, while item-oriented methods estimate unknown ratings on the basis of similar items that tend to be rated similarly. User-oriented and item-oriented methods commonly model the similarities of users or items explicitly, or merge the two [10]. The neighborhood-based method is widely used since it is intuitive and easy to implement. However, although it can generate reasonably precise results, it suffers from a serious scalability limitation as the number of users and items grows.

An alternative form of collaborative filtering is the model-based method, which uses the observed ratings to train a well-designed model. Unknown ratings are then predicted via the model instead of by handling the original rating matrix directly. The Bayesian hierarchical model [11, 12], the clustering model [13], and the latent factor model [7] are well-known examples in collaborative filtering. Among them, the most widely used single model is matrix factorization (MF). The merit of the MF approach is its flexibility in accommodating extensions to the primary model. MF techniques can be a more effective method for hard-to-capture relational data, owing to their precision and scalability. Koren et al. carried out a substantial amount of related work that has guided the development of this area [7].

The main challenge in CF is to forecast the preferences of users effectively. However, traditional CF mainly focuses on building the user-item matrix, while a wealth of items are rated by only a small fraction of users. This leaves the majority of user-item relations unknown, an issue commonly referred to as the cold-start problem [14]. CF settles the problem only to a certain degree and cannot provide a complete solution. Therefore, more advanced methods have been developed that fuse auxiliary information in order to recommend items with low popularity or new arrivals effectively. These approaches can improve personalized item recommendation performance.

Clearly, CF and CB methods complement each other because they typically deal with the same issue from different perspectives. Embedding user- and item-oriented filtering together with extra information can strengthen the performance of an RS framework. Several algorithms and hybridizations that fuse attribute similarities of users or items have been developed [14]. They mainly exploit the dependence between user and item characteristics and the transitivity of feedback from indirect neighbors in the data sets. Based on this intuition, the similarity can capture the association of a new user-item pair and complement the predicted scores of the closest neighbors.

In summary, a number of recent studies have modified basic matrix factorization methods to improve the performance of recommender systems, but they presume that the attributes of items are independent and identically distributed (i.i.d.) and ignore the coupling relationships between items, which is not compatible with reality. Moreover, not much work has been done to fuse the relations between item attributes within the model. Inspired by the concept of coupled attribute value similarity (CAVS), this paper is concerned with approaches centered on item-item similarity, which predict a user's rating for an item from that user's ratings on similar items.

## 3 Coupling Relationship Analysis

Most similarity measures either depend mainly on historical rating scores, which are usually insufficient, or deal with the categorical attributes of items separately. The commonly used similarity metric is the Pearson correlation coefficient (PCC) [18], which assumes that there exists a linear relationship between the variables. In practice, the relationships between the attributes of items should be considered jointly when measuring similarity.

In this section, to uncover the implicit relationships between items, we leverage the information in categorical attributes through CAVS. CAVS is composed of both intra-coupled and inter-coupled value similarities, which together yield a relatively accurate relationship between items. The work in [19, 20] presented a detailed analysis, as shown in Fig. 2.

### 3.1 Intra-coupled Attribute Value Similarity

*x* and *y*, respectively.
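As a hedged illustration of an intra-coupled similarity, the sketch below assumes the frequency-based form commonly given in the coupled-similarity literature, \(|g(x)|\cdot|g(y)| / (|g(x)|+|g(y)|+|g(x)|\cdot|g(y)|)\), where \(g(v)\) is the set of objects whose attribute takes the value \(v\). The function name `intra_coupled_similarity` and the toy `genres` column are hypothetical, and this assumed form may differ from the exact definition used in this paper.

```python
from collections import Counter


def intra_coupled_similarity(column, x, y):
    """Frequency-based similarity of two values x and y of one attribute.

    Assumed form (not confirmed by the text above):
        |g(x)| * |g(y)| / (|g(x)| + |g(y)| + |g(x)| * |g(y)|)
    where g(v) is the set of objects whose attribute value equals v.
    """
    counts = Counter(column)
    gx, gy = counts[x], counts[y]
    if gx == 0 or gy == 0:
        # A value that never occurs carries no frequency information.
        return 0.0
    return (gx * gy) / (gx + gy + gx * gy)


# Toy attribute column: the genre of six hypothetical movies.
genres = ["drama", "drama", "comedy", "action", "comedy", "drama"]
sim = intra_coupled_similarity(genres, "drama", "comedy")
```

Under this assumed form, two values that both occur frequently are treated as more similar than two rare values, and the result always lies strictly below 1.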

### 3.2 Inter-coupled Attribute Value Similarity

*x* and *y* of attribute \(A_{j}\) is defined as follows.

Following this example, we employ Shannon entropy to determine the attribute weights. Since it is difficult to obtain credible subjective weights, objective weights are required. Among objective weighting methods, Shannon’s entropy concept [21] has been widely used in the multi-criteria decision making (MCDM) domain. Entropy is a statistical measure related to the quantity of information in a message. Shannon’s entropy is a general measure of uncertainty in information, formulated in terms of probability theory. The entropy weight of a criterion reflects how closely the different alternatives approach one another with respect to that criterion.

*x* with corresponding attribute *k*, \(p_{ik}\) is the probability of the occurrence of the *k*th attribute in the attribute set, and \(h_{k}\) is the entropy of the *k*th attribute, computed as:

The quantity \(d_{k}=1-h_{k}\), \(k = 1,\ldots ,N\), is the degree of diversification.

The weight \(\gamma _k =\frac{d_k}{\sum \nolimits _{k=1}^{N}{d_k}}\) is the degree of importance of attribute *k*.
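The entropy weighting steps can be sketched as follows. The relations \(d_k = 1 - h_k\) and \(\gamma_k = d_k / \sum_k d_k\) are taken from the text; the normalized Shannon entropy \(h_k = -\frac{1}{\ln m}\sum_i p_{ik}\ln p_{ik}\) (with *m* the number of probabilities for attribute *k*) is an assumption that follows the usual MCDM convention, and the function name `entropy_weights` is hypothetical.

```python
import math


def entropy_weights(prob_columns):
    """Return one weight gamma_k per attribute.

    prob_columns[k] is a list of probabilities p_ik for attribute k
    (at least two entries, summing to 1).
    """
    diversification = []
    for probs in prob_columns:
        m = len(probs)
        # Assumed normalised Shannon entropy h_k in [0, 1]; 0 * ln(0) is treated as 0.
        h = -sum(p * math.log(p) for p in probs if p > 0) / math.log(m)
        diversification.append(1.0 - h)  # d_k = 1 - h_k
    total = sum(diversification)
    if total == 0:
        # Every attribute is uniform: fall back to equal weights.
        return [1.0 / len(prob_columns)] * len(prob_columns)
    return [d / total for d in diversification]  # gamma_k = d_k / sum_k d_k


# Attribute 0 is uniform (no information), attribute 1 is skewed.
weights = entropy_weights([[0.5, 0.5], [0.9, 0.1]])
```

An attribute whose values are uniformly distributed has maximal entropy and therefore receives zero weight, while more concentrated attributes receive larger weights.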

*w* of attribute \(A_{k}\) based on the other attribute value *x* of attribute \(A_{j}\), which can be computed by,

### 3.3 Item Coupling

## 4 Neighborhood-based MF Model

In this part, we give a brief introduction to the basic MF model and then introduce the improved coupling item-based MF model. Our algorithm is based on the Neighbor-Integrated Matrix Factorization technique introduced by [17].

### 4.1 The Basic Matrix Factorization

The basic idea of the MF approach is to decompose the sparse user-item matrix into a joint latent factor space of low dimensionality *f*, and to use the inner product of the factorized user-specific and item-specific vectors to make predictions in that space.

Consider an \(M\times N\) rating matrix \(E=\{r_{ui}\}\) that represents the ratings of M users on N items. The character of user *u* is depicted by the vector \(p_{u}\in \mathbb {R}^{f}\), which gauges the user's affinity for the corresponding latent factors, and each item *i* is associated with an item-factor vector \(q_{i}\in \mathbb {R}^{f}\), which measures the item's relevance to the corresponding latent factors. The *f*-dimensional latent space captures the most important features of the data, where \(f < \min (M, N)\). The missing entries are obtained by multiplying the corresponding user and item feature vectors, e.g. \(\hat{r}_{ui} \approx p_{u}^{\mathrm{T}}q_{i}\). A greater inner product between a user feature vector and an item feature vector indicates a stronger preference of the user for the item. The regularized squared loss is a frequently used error function [4, 7].

(*u*, *i*) pairs for which \(r_{ui}\) is the actual rating value. Here \(I_{ui}\) is the indicator function that equals 1 if user *u* has rated item *i* and 0 otherwise, \(||\cdot ||_F\) denotes the Frobenius norm, and \(\lambda _{p}\), \(\lambda _{q}>0\) are the two regularization parameters. A gradient-based method is able to find a local minimum. The regularization term \(\frac{\lambda _{p}}{2}||p_u||_F^{2}+\frac{\lambda _q}{2}||q_i ||_F^{2}\) penalizes the magnitudes of the parameters to avoid over-fitting [3].
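A minimal sketch of the basic MF model trained by stochastic gradient descent is given below. It assumes the loss \(\frac{1}{2}\sum _{u,i} I_{ui}(r_{ui}-p_u^{\mathrm{T}}q_i)^2+\frac{\lambda _p}{2}\sum _u||p_u||^2+\frac{\lambda _q}{2}\sum _i||q_i||^2\), which is consistent with the terms named above but whose leading factor of one half is an assumption. The function names, learning rate, number of epochs and toy ratings are all hypothetical.

```python
import numpy as np


def regularized_loss(ratings, P, Q, lam_p, lam_q):
    """Assumed loss: half the squared error on observed ratings plus L2 penalties."""
    squared_error = sum((r - P[u] @ Q[i]) ** 2 for u, i, r in ratings)
    return 0.5 * squared_error + 0.5 * lam_p * np.sum(P ** 2) + 0.5 * lam_q * np.sum(Q ** 2)


def train_mf(ratings, n_users, n_items, f=5, lam_p=0.001, lam_q=0.001,
             lr=0.01, epochs=200, seed=0):
    """Learn user factors P (n_users x f) and item factors Q (n_items x f) by SGD."""
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n_users, f))
    Q = 0.1 * rng.standard_normal((n_items, f))
    for _ in range(epochs):
        # Only observed (u, i) pairs are visited, which plays the role of I_ui.
        for idx in rng.permutation(len(ratings)):
            u, i, r = ratings[idx]
            err = r - P[u] @ Q[i]
            p_old = P[u].copy()
            P[u] += lr * (err * Q[i] - lam_p * P[u])
            Q[i] += lr * (err * p_old - lam_q * Q[i])
    return P, Q


# Toy data: (user index, item index, rating) triples.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0),
           (2, 1, 2.0), (2, 2, 5.0), (3, 0, 4.0), (3, 2, 2.0)]
P0, Q0 = train_mf(ratings, 4, 3, epochs=0)   # initial factors only
P, Q = train_mf(ratings, 4, 3)               # trained factors
loss_before = regularized_loss(ratings, P0, Q0, 0.001, 0.001)
loss_after = regularized_loss(ratings, P, Q, 0.001, 0.001)
```

A missing rating would then be predicted as `P[u] @ Q[i]`. The defaults `f=5` and `lam_p = lam_q = 0.001` mirror the parameter values reported in the experiments; the remaining settings are illustrative only.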

The traditional matrix factorization technique is a context-unaware method, since it concentrates primarily on the known entries; in particular, the rating information is usually sparse, and the method suffers from poor scalability. Hence, this approach is insensitive to subgroups of items or users that are relatively similar. Relying on the overall structure alone results in information loss.

In the next section, we present a suitable variant of the MF technique that improves the quality of recommendation significantly.

### 4.2 The Coupling Item-based Matrix Factorization

In this subsection, we propose our approach, which combines the Coupled Attribute Value Similarity between items with the classic matrix factorization model for recommendation. It should be pointed out that, although there are differences between items, the normalized neighbors of a given item share similar properties in some respects, which reflects the propagation of the item's traits. Namely, the item latent feature vector \(q_{i}\) and the feature vectors of its neighbors tend to resemble each other in the latent space. Building on this observation, we combine the global structure with local neighborhood information in a single prediction model.

*i*, it then can be represented by the normalized similarity as follows:

*Top-K* similar items of item \(o_{i}\).

*g*(x).

*N*(*i*) is the set of the *k* most similar neighbors of item \(o_{i}\).
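The neighbor selection and normalization step can be sketched as follows. The sketch assumes that an item-item similarity matrix `S` is already available (for example from the item coupling analysis) and that each item's *Top-K* similarities are rescaled to sum to one; the function name `topk_normalized_similarity` and the toy matrix are hypothetical, and how the resulting weights enter the objective is not shown here.

```python
import numpy as np


def topk_normalized_similarity(S, k):
    """For each item keep its k most similar other items and normalise their weights."""
    S = np.asarray(S, dtype=float).copy()
    n = S.shape[0]
    k = min(k, n - 1)                 # an item cannot be its own neighbor
    np.fill_diagonal(S, -np.inf)      # exclude self-similarity from the ranking
    W = np.zeros((n, n))
    for i in range(n):
        neighbors = np.argsort(S[i])[::-1][:k]   # indices of the k largest similarities
        weights = S[i, neighbors]
        total = weights.sum()
        if total > 0:
            W[i, neighbors] = weights / total     # row i now sums to 1 over N(i)
    return W


# Toy symmetric similarity matrix for four items.
S = [[1.0, 0.8, 0.2, 0.4],
     [0.8, 1.0, 0.5, 0.1],
     [0.2, 0.5, 1.0, 0.6],
     [0.4, 0.1, 0.6, 1.0]]
W = topk_normalized_similarity(S, k=2)
```

Row *i* of `W` is non-zero only on *N*(*i*), so a weighted average of neighbor feature vectors can be formed as `W[i] @ Q` for an item-factor matrix `Q`.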

## 5 Experiments

In this section, we verify the accuracy of the proposed coupling item-based matrix factorization method (CISMF). We use fivefold cross-validation for training and testing: each data set is randomly split into five folds, and in each iteration four folds serve as the training set while the remaining fold serves as the test set.
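The fivefold protocol can be sketched as below, assuming the ratings are held as a list of (user, item, rating) triples. The function name `five_fold_splits`, the fixed seed and the toy data are hypothetical choices for illustration.

```python
import random


def five_fold_splits(ratings, n_folds=5, seed=0):
    """Yield (train, test) pairs; each fold is the test set exactly once."""
    indices = list(range(len(ratings)))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds nearly equal folds.
    folds = [indices[k::n_folds] for k in range(n_folds)]
    for k in range(n_folds):
        test = [ratings[j] for j in folds[k]]
        train = [ratings[j] for m, fold in enumerate(folds) if m != k for j in fold]
        yield train, test


# Toy data: 23 distinct (user, item, rating) triples.
ratings_demo = [(u, i, 3.0) for u in range(5) for i in range(5)][:23]
splits = list(five_fold_splits(ratings_demo))
```

The error metric would then be averaged over the five test folds.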

### 5.1 Experiments Settings

#### 5.1.1 Data Set

Experiments are conducted on two publicly available collaborative filtering datasets, MovieLens 100k (ML-100k) and MovieLens 1M (ML-1M). Both datasets are provided by the GroupLens research group in the Department of Computer Science and Engineering at the University of Minnesota, and are widely adopted in current research.

ML-100k consists of 100,000 ratings (1–5) from 943 users on 1682 movies, and ML-1M offers 1 million ratings by 6040 users on 3900 movies. In both datasets, every user has rated at least 20 movies, and the sparsity of the datasets is 0.9369 and 0.9553, respectively. Apart from the historical ratings, the datasets also supply extra information about movie attributes, including genre and release year, which makes them particularly meaningful for item-oriented recommendation.
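The sparsity figures can be checked with a short calculation, taking sparsity to be the fraction of user-item cells without a rating (an assumed but standard definition). The ML-100k counts come from the text. For ML-1M, the reported 0.9553 is reproduced when the 1,000,209 ratings and the 3,706 movies that actually receive ratings in the public release are used; these two counts are not stated in the text and are included here as an assumption.

```python
def sparsity(n_ratings, n_users, n_items):
    """Fraction of cells in the user-item matrix that have no rating."""
    return 1.0 - n_ratings / (n_users * n_items)


ml100k = sparsity(100_000, 943, 1682)          # counts as given in the text
ml1m_text = sparsity(1_000_000, 6040, 3900)    # rounded counts as given in the text
ml1m_rated = sparsity(1_000_209, 6040, 3706)   # assumed counts from the public release
```

With these inputs, `ml100k` is about 0.93695, `ml1m_text` is about 0.9575, and `ml1m_rated` is about 0.9553.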

#### 5.1.2 Evaluation Metrics

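We use the mean absolute error (*MAE*) and root mean squared error (*RMSE*) metrics to estimate the quality of our proposed algorithms. *MAE* and *RMSE* are defined as follows, respectively.

\[MAE=\frac{1}{r_{test}}\sum _{(u,i)}\left| r_{ui}-\hat{r}_{ui}\right| ,\qquad RMSE=\sqrt{\frac{1}{r_{test}}\sum _{(u,i)}\left( r_{ui}-\hat{r}_{ui}\right) ^{2}}\]

where \(r_{ui}\) and \(\hat{r}_{ui}\) denote the actual and predicted ratings of user *u* on item *i*, the sums run over the user-item pairs in the test set, and \(r_{test}\) is the number of all user-item pairs in the test set. A smaller *MAE* or *RMSE* represents superior prediction accuracy.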
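Both metrics can be computed with a few lines of code. The sketch below uses the standard definitions of MAE and RMSE over paired lists of actual and predicted test ratings; the function names and the toy values are hypothetical.

```python
import math


def mae(actual, predicted):
    """Mean absolute error over paired actual and predicted ratings."""
    return sum(abs(r - p) for r, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error over paired actual and predicted ratings."""
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(actual, predicted)) / len(actual))


actual = [4.0, 3.0, 5.0, 2.0]
predicted = [3.5, 3.0, 4.0, 3.0]
mae_value = mae(actual, predicted)     # (0.5 + 0 + 1 + 1) / 4 = 0.625
rmse_value = rmse(actual, predicted)   # sqrt((0.25 + 0 + 1 + 1) / 4) = 0.75
```

Because RMSE squares the errors before averaging, it penalizes large individual errors more heavily than MAE and is never smaller than MAE on the same predictions.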

### 5.2 Comparison with Other Methods

- A. RSVD: regularized singular value decomposition, introduced in [22], is a classic baseline model.
- B. NMF: non-negative matrix factorization, presented in [23], restricts the latent features to non-negative values during the learning process.
- C. PMF: probabilistic matrix factorization, proposed in [11], is a well-known method used in traditional recommender systems.
- D. BPMF: Bayesian probabilistic matrix factorization, proposed in [12], trains the model efficiently with Markov chain Monte Carlo methods.

Table 1 Results of comparative study on the MovieLens datasets

| Dataset | Metric | RSVD | NMF | PMF | BPMF | CISMF |
|---|---|---|---|---|---|---|
| MovieLens 100K | MAE | 0.7433 | 0.7724 | 0.7522 | 0.7465 | 0.7279 |
| MovieLens 100K | RMSE | 0.9473 | 0.9874 | 0.9667 | 0.9533 | 0.9268 |
| MovieLens 1M | MAE | 0.6885 | 0.7286 | 0.7306 | 0.7023 | 0.6814 |
| MovieLens 1M | RMSE | 0.8670 | 0.9203 | 0.9234 | 0.8907 | 0.8592 |

The parameter settings of our method in the experiments are \(\alpha = 0.6\), *Top-k* = 10, \(\lambda _{p}=\lambda _{q} = 0.001\), and dimensionality \(d=5\). Table 1 summarizes the results on the test data: CISMF attains the lowest *MAE* and *RMSE* among the compared methods on both commonly used data sets. Relative to the best baseline (RSVD), *MAE* decreases from 0.7433 to 0.7279 on MovieLens 100K and from 0.6885 to 0.6814 on MovieLens 1M. A larger dimensionality may bring more noise into the model during the learning procedure. These improvements suggest that the approach is a promising direction for recommendation. In the following, we explore other factors in more detail and report performance only in *MAE*.

### 5.3 Validation on Cold Start Items

Figure 3 shows the quantitative results for cold-start items with respect to different categories. As shown in the figure, CISMF achieves the best results, which indicates that taking the coupling relationship into account is effective.

### 5.4 Validation of Parameter

Figure 4 shows how the performance of CISMF, which fuses similar neighborhoods, changes as \(\alpha \) varies. The best performance is reached at \(\alpha = 0.5\) on MovieLens 100K and \(\alpha = 0.6\) on MovieLens 1M. This suggests that the neighbors’ features are valuable for our model.

### 5.5 Validation of Size of Neighborhood

*Top-k* determines the number of similar items and thus affects model performance. We investigate sizes in the range 10–50 with an interval of 10 on MovieLens100K and in the range 20–100 with an interval of 20 on MovieLens1M, setting \(\alpha =0.5\) and \(\alpha =0.6\), respectively.

Figure 5a shows the impact of the neighborhood size on *MAE* in MovieLens100K. From the figure, we can observe that the error reaches its minimum at *Top-k* \(=\) 10, and that the *MAE* rises slightly as *Top-k* increases from 10 to 50. Figure 5b shows the influence of the neighborhood size on *MAE* in MovieLens1M, where *Top-k* ranges from 20 to 100 with a step size of 20; the *MAE* values show no evident fluctuation from *Top-k* \(=\) 40 onward. Our analysis indicates that too few neighbors may not provide enough information, while too many neighbors may bring in uncorrelated information; both can reduce accuracy.

## 6 Conclusion

In this paper, we addressed the cold-start problem for new items and items that receive few ratings, which has not been well studied. Based on the intuition that items’ attribute information can boost prediction accuracy, we fused a novel coupling similarity measure into matrix factorization for recommender systems. According to our analysis and experiments, the coupling relationship serves as a better source of information about similar items.

In future research, we will collect more datasets with correlated attributes and use them to enhance our algorithm. We have not considered cold-start users in this paper; it would be interesting to incorporate rich social relationships into the recommendation framework, and we plan to investigate the new algorithm further.

## Notes

### Acknowledgments

This work was supported by the Natural Science for Youth Foundation of China (No. 61003162) and the Young Scholars Growth Plan of Liaoning (No. LJQ2013038).

### References

- 1. Linden G, Smith B, York J (2003) Amazon.com recommendations: item-to-item collaborative filtering. IEEE Internet Comput 7(1):76–80
- 2. Jaschke R, Marinho L, Hotho A, Schmidt L, Stumme G (2007) Tag recommendations in folksonomies. Proceedings of the 11th European conference on principles and practice of knowledge discovery in databases, pp 506–514
- 3. Levandoski J, Sarwat M, Eldawy A, Mokbel F (2012) LARS: a location-aware recommender system. Proceedings of the 28th conference on data engineering, pp 450–461
- 4. McAuley J, Leskovec J (2013) From amateurs to connoisseurs: modeling the evolution of user expertise through online reviews. Proceedings of the 22nd international conference on world wide web, pp 897–908
- 5. Ma H, King I, Lyu MR (2009) Learning to recommend with social trust ensemble. Proceedings of the 32nd conference on research and development in information retrieval, pp 203–210
- 6. Ma H, Zhou D, Liu C (2011) Recommender system with social regularization. Proceedings of the 4th conference on web search and data mining, pp 287–296
- 7. Koren Y (2010) Collaborative filtering with temporal dynamics. Commun ACM 53(4):89–97
- 8. Middleton SE, Shadbolt NR, De Roure DC (2004) Ontological user profiling in recommender systems. ACM Trans Inf Syst 22(1):54–88
- 9. Paterek A (2007) Improving regularized singular value decomposition for collaborative filtering. Proceedings of the 13th international conference on knowledge discovery and data mining, pp 5–8
- 10. Salakhutdinov R, Mnih A (2008) Bayesian probabilistic matrix factorization using Markov chain Monte Carlo. Proceedings of the 25th international conference on machine learning, pp 880–887
- 11. Sarwar M, Karypis G, Konstan J, Riedl J (2002) Recommender systems for large-scale e-commerce: scalable neighborhood formation using clustering. Proceedings of the 5th international conference on computer and information technology, p 1
- 12. Salakhutdinov R, Mnih A (2007) Probabilistic matrix factorization. Proceedings of the 20th conference on neural information processing systems foundation, pp 1257–1264
- 13. Sarwar B, Karypis G, Riedl J (2001) Item-based collaborative filtering recommendation algorithms. Proceedings of the 10th international conference on world wide web, pp 285–295
- 14. Gantner Z, Drumond L, Freudenthaler C, Rendle S, Schmidt-Thieme L (2010) Learning attribute-to-feature mappings for cold-start recommendations. Proceedings of the 10th international conference on data mining, pp 176–185
- 15. Balabanovic M, Shoham Y (1997) Fab: content-based collaborative filtering. Commun ACM 40(3):66–72
- 16. Hotho A, Jaschke R, Schmitz C, Stumme G (2006) FolkRank: a ranking algorithm for folksonomies. LWA 1:111–114
- 17. Li FF, Xu GD, Cao LB (2014) Coupled item-based matrix factorization. Proceedings of the 15th international conference on web information systems engineering, pp 1–14
- 18. Breese JS, Heckerman D, Kadie C (1998) Empirical analysis of predictive algorithms for collaborative filtering. Proceedings of the 14th conference on uncertainty in artificial intelligence, pp 43–52
- 19. Cao L, Ou Y, Yu PS (2012) Coupled behavior analysis with applications. IEEE Trans Knowl Data Eng 24(8):1378–1392
- 20. Wang J, De Vries AP, Reinders MJT (2006) Unifying user-based and item-based collaborative filtering approaches by similarity fusion. Proceedings of the 29th conference on research and development in information retrieval, pp 501–508
- 21. Lotfi H, Fallahnejad R (2010) Imprecise Shannon’s entropy and multi attribute decision making. Entropy 12(1):53–62
- 22. Nguyen JJ, Zhu M (2013) Content-boosted matrix factorization techniques for recommender systems. Stat Anal Data Mining 6(4):286–301
- 23. Lee DD, Seung HS (2001) Algorithms for non-negative matrix factorization. Proceedings of the 14th conference on advances in neural information processing systems, pp 556–562