
Adaptive personalization using social networks


This research provides insights into the following questions regarding the effectiveness of mobile adaptive personalization systems: (1) to what extent can adaptive personalization produce a better service/product over time? (2) does adaptive personalization work better than self-customization? (3) does the use of the customer’s social network result in better personalization? To answer these questions, we develop and implement an adaptive personalization system for personalizing mobile news based on recording and analyzing customers’ behavior, plus information from their social network. The system learns from an individual’s reading history, automatically discovers new material as a result of shared interests in the user’s social network, and adapts the news feeds shown to the user. Field studies show that (1) repeatedly adapting to the customer’s observed behavior improves personalization performance; (2) personalizing automatically, using a personalization algorithm, results in better performance than allowing the customer to self-customize; and (3) using the customer’s social network for personalization results in further improvement. We conclude that mobile automated adaptive personalization systems that take advantage of social networks may be a promising approach to making personalization more effective.



  1. We also tested a Bayesian Logistic Regression model. That model, although seemingly more sophisticated than the Naïve Bayes approach, and a better performer in in-sample testing, actually performed substantially worse in out-of-sample tests, suggesting that the added complexity of the Bayesian Logistic Regression model resulted in over-fitting. Thus, we focus on the Naïve Bayes algorithm for the remainder of this paper.

  2. Although the use of these terms may seem non-standard to a marketing audience, we retain them to maintain consistency with the classification literature.

  3. A technical appendix describing the simulation is available from the authors.

  4. Obtained from


  • Adomavicius, G., & Tuzhilin, A. (2005). Personalization technologies: A process-oriented perspective. Communications of the ACM, 48(10), 83–90.
  • Ansari, A., Essegaier, S., & Kohli, R. (2000). Internet recommendation systems. Journal of Marketing Research, 37(3), 363–375.
  • Ansari, A., & Mela, C. F. (2003). E-customization. Journal of Marketing Research, 40(2), 131–145.
  • Ansari, A., Koenigsberg, O., & Stahl, F. (2011). Modeling multiple relationships in social networks. Journal of Marketing Research, 48(4), 713–728.
  • Atahan, P., & Sarkar, S. (2011). Accelerated learning of user profiles. Management Science, 57(2), 215–239.
  • Bechwati, N. N., & Xia, L. (2003). Do computers sweat? The impact of perceived effort of online decision aids on consumers’ satisfaction with the decision process. Journal of Consumer Psychology, 13(1&2), 139–148.
  • Bell, D. R., & Song, S. Y. (2007). Neighborhood effects and trial on the Internet: Evidence from online grocery retailing. Quantitative Marketing and Economics, 5(4), 361–400.
  • Capgemini. (2008). Closed loop marketing: Unlocking the benefits of customer centricity. Accessed 18 Aug 2011.
  • Caumont, A. (2013). 12 trends shaping digital news. Accessed 1 Nov 2013.
  • Cerquides, J., & Màntaras, R. L. (2003). The indifferent naïve Bayes classifier. Proceedings of the 16th International FLAIRS Conference.
  • Chung, T. S., Rust, R. T., & Wedel, M. (2009). My mobile music: An adaptive personalization system for digital audio players. Marketing Science, 28(1), 52–68.
  • Cialdini, R. B. (2001). Influence: Science and practice (4th ed.). New York: Harper Collins.
  • Cohen, W. W., & Singer, Y. (1999). Context-sensitive learning methods for text categorization. ACM Transactions on Information Systems, 17(2), 141–173.
  • Cooke, A. D. J., Sujan, H., Sujan, M., & Weitz, B. A. (2002). Marketing the unfamiliar: The role of context and item-specific information in electronic agent recommendations. Journal of Marketing Research, 39(4), 488–497.
  • Das, A., Datar, M., Garg, A., & Rajaram, S. (2007). Google news personalization: Scalable online collaborative filtering. In Proceedings of the 16th International Conference on World Wide Web (pp. 271–280).
  • Edmonds, R., Guskin, E., Mitchell, A., & Jurkowitz, M. (2013). The state of the news media 2013. Accessed 1 Nov 2013.
  • Eyheramendy, S., Lewis, D. D., & Madigan, D. (2003). On the naïve Bayes model for text categorization. Proceedings of the 9th International Conference on Artificial Intelligence and Statistics (AISTATS).
  • Fichman, R. G., & Cronin, M. J. (2003). Information-rich commerce at a crossroads: Business and technology adoption requirements. Communications of the ACM, 46(9), 96–102.
  • Franke, N., Keinz, P., & Steger, C. J. (2009). Testing the value of customization: When do customers really prefer products tailored to their preferences? Journal of Marketing, 73(5), 103–121.
  • Freeman, L. C. (1977). A set of measures of centrality based on betweenness. Sociometry, 40(1), 35–41.
  • Godes, D. (2011). Invited comment on ‘Opinion leadership and social contagion in new product diffusion’. Marketing Science, 30(2), 224–229.
  • Godes, D., & Mayzlin, D. (2009). Firm-created word-of-mouth communication: Evidence from a field study. Marketing Science, 28(4), 721–739.
  • Good, P. (2005). Introduction to statistics through resampling methods and R/S-PLUS. Hoboken: Wiley.
  • Google Play (2014). Accessed 19 Aug 2014.
  • Hair, J. F., Black, B., Babin, B., & Anderson, R. (2005). Multivariate data analysis (6th ed.). Saddle River: Prentice Hall.
  • Hand, D. J., & Yu, K. (2001). Idiot’s Bayes: Not so stupid after all? International Statistical Review, 69(3), 385–398.
  • Hartmann, W. R. (2010). Demand estimation with social interactions and the implications for targeted marketing. Marketing Science, 29(4), 585–601.
  • Hartmann, W., Manchanda, P., Nair, H., Bothner, M., Dodds, P., Godes, D., Hosanagar, K., & Tucker, C. (2008). Modeling social interactions: Identification, empirical methods and policy implications. Marketing Letters, 19(3), 287–304.
  • Häubl, G., & Trifts, V. (2000). Consumer decision making in online shopping environments: The effects of interactive decision aids. Marketing Science, 19(1), 4–21.
  • Hauser, J. R., Liberali, G., & Urban, G. L. (2014). Website morphing 2.0: Switching costs, partial exposure, random exit, and when to morph. Management Science, 60(6), 1594–1616.
  • Hauser, J. R., Urban, G. L., Liberali, G., & Braun, M. (2009). Website morphing. Marketing Science, 28(2), 202–223.
  • Herlocker, J. L., Konstan, J. A., & Riedl, J. (2000). Explaining collaborative filtering recommendations. Proceedings of the 2000 ACM Conference on Computer Supported Cooperative Work, 241–250.
  • Herlocker, J. L., Terveen, L. G., Konstan, J. A., & Riedl, J. T. (2004). Evaluating collaborative filtering recommender systems. ACM Transactions on Information Systems, 22, 5–53.
  • Hosanagar, K., Fleder, D., Lee, D., & Buja, A. (2013). Will the global village fracture into tribes? Recommender systems and their effects on consumer fragmentation. Management Science, 60(4), 805–823.
  • Iyengar, R., Van den Bulte, C., & Valente, T. W. (2011). Opinion leadership and social contagion in new product diffusion. Marketing Science, 30(2), 195–212.
  • Karr, D. (2014). The state of mobile in the US. Accessed 24 Oct 2014.
  • Keath, J. (2011). Instagram becomes the largest mobile social network. Accessed 17 Dec 2011.
  • Khan, R., Lewis, M., & Singh, V. (2009). Dynamic customer management and the value of one-to-one marketing. Marketing Science, 28(6), 1063–1079.
  • Kirchoff, S. M. (2010). The U.S. newspaper industry in transition. Congressional Research Service Report for Congress, 7–5700.
  • Levin, D., & Cross, R. (2004). The strength of weak ties you can trust: The mediating role of trust in effective knowledge transfer. Management Science, 50(11), 1477–1490.
  • Li, L., Wang, D. D., Zhu, S. Z., & Li, T. (2011). Personalized news recommendation: A review and an experimental investigation. Journal of Computer Science and Technology, 26(5), 754–766.
  • Liang, T. P., Yang, Y. F., Chen, D. N., & Ku, Y. C. (2008). A semantic-expansion approach to personalized knowledge recommendation. Decision Support Systems, 45(3), 401–412.
  • Liu, D. R., Tsai, P. Y., & Chiu, P. H. (2011). Personalized recommendation of popular blog articles for mobile applications. Information Sciences, 181(9), 1552–1572.
  • Liu, J. H., Dolan, P., & Pedersen, E. R. (2010). Personalized news recommendation based on click behavior. Accessed 13 Aug 2014.
  • Lunneborg, C. E. (2000). Data analysis by resampling: Concepts and applications. Belmont: Duxbury Press.
  • Lyytinen, K., & Yoo, Y. J. (2002). Research commentary: The next wave of nomadic computing. Information Systems Research, 13(4), 377–388.
  • Lyytinen, K., Yoo, Y. J., Varshney, U., Ackerman, M., Davis, G., Avital, M., Robey, D., Sawyer, S., & Sorensen, C. (2004). Surfing the next wave: Design and implementation challenges of ubiquitous computing. Communications of the Association for Information Systems, 13(40), 697–716.
  • McNee, S. M., Riedl, J., & Konstan, J. A. (2006). Being accurate is not enough: How accuracy metrics have hurt recommender systems. CHI ’06 Extended Abstracts on Human Factors in Computing Systems, 1097–1101.
  • McPherson, M., Smith-Lovin, L., & Cook, J. M. (2001). Birds of a feather: Homophily in social networks. Annual Review of Sociology, 27, 415–444.
  • Moon, S., & Russell, G. J. (2008). Predicting product purchase from inferred customer similarity: An autologistic model approach. Management Science, 54(1), 71–82.
  • Nair, H., Manchanda, P., & Bhatia, T. (2010). Asymmetric social interactions in physician prescription behavior: The role of opinion leaders. Journal of Marketing Research, 47(5), 883–895.
  • Narayan, V., Rao, V. R., & Saunders, C. (2011). How peer influence affects attribute preferences: A Bayesian updating mechanism. Marketing Science, 30(2), 368–384.
  • Newspaper Association of America. (2014). Business model evolving, circulation revenue rising. Accessed 6 Nov 2014.
  • Picault, J., & Ribière, M. (2008). Method of adapting a user profile including user preferences and communication device. European Patent EP08290033.
  • Rossi, P. E., Gilula, Z., & Allenby, G. M. (2001). Overcoming scale usage heterogeneity: A Bayesian hierarchical approach. Journal of the American Statistical Association, 96(453), 20–31.
  • Rust, R. T., & Huang, M. H. (2014). The service revolution and the transformation of marketing science. Marketing Science, 33(2), 206–221.
  • Sebastiani, F. (2002). Machine learning in automated text categorization. ACM Computing Surveys, 34, 1–47.
  • Shapira, B., Shoval, P., Meyer, J., Tractinsky, N., & Mimran, D. (2009). ePaper: A personalized mobile newspaper. Journal of the American Society for Information Science and Technology, 60(11), 2333–2346.
  • Shugan, S. M. (1980). The cost of thinking. Journal of Consumer Research, 7(2), 99–111.
  • Smith, D., Menon, S., & Sivakumar, K. (2005). Online peer and editorial recommendations, trust, and choice in virtual markets. Journal of Interactive Marketing, 19(3), 15–37.
  • Subramanian, C. (2012). Gadget users hungrier for news, but media still lags in profit. Accessed 21 Mar 2012.
  • TechCrunch (2014). Accessed 19 Aug 2014.
  • Thompson, D. V., Hamilton, R. W., & Rust, R. T. (2005). Feature fatigue: When product capabilities become too much of a good thing. Journal of Marketing Research, 42(4), 431–442.
  • Urban, G. L., Liberali, G., MacDonald, E., Bordley, R., & Hauser, J. R. (2014). Morphing banner advertising. Marketing Science, 33(1), 27–46.
  • Van Rijsbergen, C. J. (1979). Information retrieval. London: Butterworth.
  • Van Roy, B., & Yan, X. (2010). Manipulation robustness of collaborative filtering. Management Science, 56(11), 1911–1929.
  • Varki, S., & Rust, R. T. (1998). Technology and optimal segment size. Marketing Letters, 9(2), 147–167.
  • Wang, J., Aribarg, A., & Atchadé, Y. F. (2013). Modeling choice interdependence in a social network. Marketing Science, 32(6), 977–997.
  • Webb, G. I., Boughton, J. R., & Wang, Z. H. (2005). Not so naïve Bayes: Aggregating one-dependence estimators. Machine Learning, 58(1), 5–24.
  • Yaniv, I. (2004). Receiving other people’s advice: Influence and benefit. Organizational Behavior and Human Decision Processes, 93(1), 1–13.
  • Ying, Y. P., Feinberg, F., & Wedel, M. (2006). Leveraging missing ratings to improve online recommendation systems. Journal of Marketing Research, 43(3), 355–365.
  • Zhang, J., & Krishnamurthi, L. (2004). Customizing promotions in online stores. Marketing Science, 23(4), 561–578.
  • Zhang, J. J. (2010). The sound of silence: Observational learning in the U.S. kidney market. Marketing Science, 29(2), 315–335.
  • Zhao, Y., Yang, S., Narayan, V., & Zhao, Y. (2013). Modeling consumer learning from online product reviews. Marketing Science, 32(1), 153–169.


Author information



Corresponding author

Correspondence to Roland T. Rust.



Appendix 1 - Procedure for filtering news feeds

To predict whether an article will be read, we develop a fully automated probabilistic modified Naïve Bayes approach for our Adaptive Personalization (AP) algorithm. As input, the approach uses keywords in the text of news articles that describe an individual’s interests in news. Assume, for one user (with the user-specific subscript suppressed for convenience), that every article is tagged as r = 1 or r = 0 depending on whether it is read or not. Assume there are \( s_0 = 1, \ldots, S_0 \) unread and \( s_1 = 1, \ldots, S_1 \) read news articles, with a set of W unique words across all articles. At any point in time, the data for the user consist of an \( S \times W \) table \( Y=\left(\begin{array}{c} Y^0 \\ Y^1 \end{array}\right) \), with \( Y^0 = y^0_{1:S_0,1:W} \) and \( Y^1 = y^1_{1:S_1,1:W} \) the counts of words in unread and read articles, respectively, where \( (1:S_0, 1:W) \) indexes a matrix. Summing the table across \( s_0 = 1, \ldots, S_0 \) and \( s_1 = 1, \ldots, S_1 \) yields \( y^0_{1:W} \) and \( y^1_{1:W} \), \( (W \times 1) \) vectors of word counts. This comprises the training dataset. The algorithm assumes a mixture-Poisson distribution for the joint distribution of the counts of the W keywords, with parameters \( \lambda_{r,w} \) describing the intensity of keyword w in class r, which has proportion \( \pi_r \):

$$ P(Y)={\displaystyle \sum_{r=0,1}{\pi}_r{\displaystyle \prod_{w=1}^W\frac{\lambda_{r,w}^{y_w^r}{e}^{-{\lambda}_{r,w}}}{y_w^r!}}}. $$

A mixture-Poisson distribution has a gamma distribution as the mixing distribution of the Poisson rate. Equation 1 is estimated on the word count data Y each time a new set of articles is to be downloaded. Let a batch of \( b = 1, \ldots, B \) new articles be characterized by word counts \( x_{1:B,1:W+V} \). There are V new words in that batch that are not in the training set. These new words are irrelevant for classifying the new articles (because there is no prior data on them), but they become relevant for updating the estimates once this new batch of articles, with its V new words, becomes part of the training data. The estimates of the parameters \( \lambda_{r,w} \) are updated before every new download cycle.
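To make the data structures concrete, the per-class word-count vectors \( y^0_{1:W} \) and \( y^1_{1:W} \) and the per-class Poisson intensities can be sketched as follows; the articles and vocabulary here are hypothetical stand-ins for a user's reading history:

```python
from collections import Counter

def word_count_vectors(articles, vocab):
    """Sum word counts over read (r = 1) and unread (r = 0) articles.

    articles: list of (tokens, r) pairs; vocab: list of W keywords.
    Returns y0, y1 (per-class count vectors) and S0, S1 (article counts).
    """
    vocab_set = set(vocab)
    y = {0: Counter(), 1: Counter()}
    s = {0: 0, 1: 0}
    for tokens, r in articles:
        s[r] += 1
        y[r].update(t for t in tokens if t in vocab_set)
    y0 = [y[0][w] for w in vocab]
    y1 = [y[1][w] for w in vocab]
    return y0, y1, s[0], s[1]

# hypothetical reading history: (tokens, read-flag) pairs
articles = [
    ("stocks fall as markets open".split(), 1),
    ("local team wins final".split(), 0),
    ("markets rally on tech stocks".split(), 1),
]
vocab = ["markets", "stocks", "team"]
y0, y1, S0, S1 = word_count_vectors(articles, vocab)
# per-class Poisson intensities: lambda_{r,w} = y_w^r / S_r
lam1 = [c / S1 for c in y1]
lam0 = [c / S0 for c in y0]
```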

Keyword selection and updating

Given that we use a probabilistic classification, a coherent selection mechanism to obtain the W most discriminating keywords is based on the classification odds-ratio:

$$ {o}_s\left({y}_w\right)=\frac{P\left({y}_w|r=1\right)\left(1-P\left({y}_w|r=0\right)\right)}{P\left({y}_w|r=0\right)\left(1-P\left({y}_w|r=1\right)\right)}=\frac{\lambda_{1,w}^{y_w^1}{e}^{-{\lambda}_{1,w}}\left({y}_w^0!-{\lambda}_{0,w}^{y_w^0}{e}^{-{\lambda}_{0,w}}\right)}{\lambda_{0,w}^{y_w^0}{e}^{-{\lambda}_{0,w}}\left({y}_w^1!-{\lambda}_{1,w}^{y_w^1}{e}^{-{\lambda}_{1,w}}\right)}. $$

This classification odds-ratio is large for words that discriminate well between the read (r = 1) and unread (r = 0) articles. It is one of the best-performing procedures for keyword selection and may reduce the number of keywords by a factor of 100 without loss of performance (Sebastiani 2002). In our algorithm, every time a new batch of articles is downloaded and reading behavior is observed, the words in all available articles are sorted by the magnitude of their odds-ratios in Eq. 2, using the data up to that point. The sorted list is then truncated at a pre-specified number of keywords, W. The cutoff is set to W = 55 words from the entire text, based on experimentation with synthetic data, which is detailed in Appendix 3.
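The selection step can be sketched in Python. The Poisson probabilities follow Eq. 2; ranking by the absolute log odds-ratio (so that words discriminating in either direction score high) is our reading of sorting "by magnitude", and the function and variable names are our own:

```python
import math

def poisson_pmf(y, lam):
    """Poisson probability of observing count y at rate lam."""
    return lam ** y * math.exp(-lam) / math.factorial(y)

def odds_ratio(y1, lam1, y0, lam0):
    """Classification odds-ratio of Eq. 2 for one word, built from the
    Poisson probabilities of its read (y1) and unread (y0) counts."""
    p1 = poisson_pmf(y1, lam1)
    p0 = poisson_pmf(y0, lam0)
    return p1 * (1 - p0) / (p0 * (1 - p1))

def select_top_keywords(vocab, y1, lam1, y0, lam0, w_max=55):
    """Rank words by |log odds-ratio| and keep the top w_max."""
    scores = [abs(math.log(odds_ratio(a, b, c, d)))
              for a, b, c, d in zip(y1, lam1, y0, lam0)]
    ranked = sorted(zip(scores, vocab), reverse=True)
    return [word for _, word in ranked[:w_max]]
```

A word with identical counts and rates in both classes has an odds-ratio of 1 (log magnitude 0) and is dropped first.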

At any time, a moving window of M batches of articles is included, and articles older than these M batches are removed. This helps to dynamically remove keywords that no longer sufficiently reflect a user’s interests, and thus adapts the classification to the changing interests of the user. We suggest setting M at a value that reasonably reflects (1) how frequently user reading interests change, (2) how frequently and how drastically news content changes, and (3) how frequently users engage with the system. For our application, we judge that M = 7, which amounts to about one week of articles if articles are downloaded in daily cycles, weighs these three factors appropriately. M = 7 yields satisfactory performance in our example (see below), but we have not tested the impact and effectiveness of other values of M. Different applications may require different values of M; for example, when a user accesses the system rarely, a larger value of M may be needed, because it will take longer to learn the user’s preferences.
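The moving window maps naturally onto a bounded queue; a minimal sketch, assuming daily download cycles and our own batch representation:

```python
from collections import deque

M = 7  # retain the last M batches, about one week at daily downloads

def add_batch(window, batch):
    """Add the newest batch of (tokens, read-flag) pairs; the deque's
    maxlen silently drops batches older than M download cycles."""
    window.append(batch)
    # the training set is rebuilt from whatever still sits in the window
    return [article for b in window for article in b]

window = deque(maxlen=M)
```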

Keyword dependence

Maximum likelihood inference for the parameters in (1) leads to the closed-form estimators \( {\widehat{\lambda}}_{r,1:W}={y}_{1:W}^r/{S}_r \) and \( {\widehat{\pi}}_r={S}_r/\left({S}_0+{S}_1\right) \). However, zero values, arising when a keyword does not occur, are common. An empirical Bayes procedure derived from an indifference Gamma prior with parameters \( c_1 \) and \( c_2 \) is used to address this problem. This prior reduces the classification error, especially when little data are available (Cerquides and Màntaras 2003; Eyheramendy et al. 2003). A limitation of (1) is its assumption that the occurrences of words are conditionally independent given the classes r. This assumption can be relaxed by averaging across all classifiers obtained conditional on the presence of a certain set of keywords (Webb et al. 2005). The conditional rate of occurrence of word w, \( \lambda_{r,w|z_k} \), given that a certain keyword k is present (\( z_k = 1 \)) or absent (\( z_k = 0 \)), is estimated as:

$$ {\widehat{\lambda}}_{r,w|{z}_k}=\frac{c_1+{y}_{r,w|{z}_k}}{c_2+{S}_{r|{z}_k}+2}, $$

where, for class r, \( y_{r,w|z_k} \) is the number of times word w appears in articles that contain (\( z_k = 1 \)) or do not contain (\( z_k = 0 \)) keyword k, and \( S_{r|z_k} \) is the number of class-r news articles with or without keyword k.

The joint probability of reading an article in class r that contains (or does not contain) keyword k, \( P(r, z_k) \), is estimated as:

$$ \widehat{P}\left(r,{z}_k\right)=\frac{S_{r|{z}_k}+W+1}{S_0+{S}_1+2\times W+2}. $$

The probability of reading or not reading a new article b is then estimated by averaging Eq. 4 over both values \( z_k = 0, 1 \) for keywords k = 1, …, K:

$$ \widehat{P}\left(r|{x}_{b,1:W+V}\right)\propto {\displaystyle \sum_{k=1}^K{\displaystyle \sum_{z_k=0,1}\left(\widehat{P}\left(r,{z}_k\right){\displaystyle \prod_{w=1}^W{\widehat{\lambda}}_{r,w|{z}_k}^{x_{w,k}}{e}^{-{\widehat{\lambda}}_{r,w|{z}_k}}}\right)}}, $$

where expression 5 is normalized by its sum over r = 0 and r = 1. Here, \( x_{w,k} \) is the frequency with which word w occurs jointly with keyword k in the new article. Substituting Eqs. 3 and 4 yields, for a specific user, the probability of reading the new news article given the keyword count vector:

$$ \widehat{P}\left(r|{x}_{b,1:W+V}\right)\propto {\displaystyle \sum_{k=1}^K{\displaystyle \sum_{z_k=0,1}\left[\frac{S_{r|{z}_k}+W+1}{S_0+{S}_1+2\times W+2}{\displaystyle \prod_{w=1}^W\;{\left(\frac{y_{r,w|{z}_k}+{c}_1}{S_{r|{z}_k}+{c}_2+2}\right)}^{x_{w,k}} \exp\;\left(-\frac{y_{r,w|{z}_k}+{c}_1}{S_{r|{z}_k}+{c}_2+2}\right)}\right]}}. $$

A crisp classification of the new article is obtained as \( \underset{r}{ \arg \max}\widehat{P}\left(r\Big|{x}_{b,1:W+V}\right) \). This classification is the basis for the filtering of news articles in the adaptive personalization system.
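The classification rule above can be sketched in Python, accumulating the product of Eq. 6 in log space for numerical stability. For brevity we assume a single conditioning keyword (K = 1); the count tables, class totals, and smoothing constants below are hypothetical:

```python
import math

def predict_read_prob(x, tables, S, W, c1=1.0, c2=2.0):
    """Estimate P(r | article counts x) following Eq. 6, averaging over
    presence/absence (z_k) of one conditioning keyword (K = 1).

    x: {word: count} for the new article over the W keywords.
    tables[r][zk][word]: counts y_{r,w|z_k};  S[r][zk]: counts S_{r|z_k}.
    """
    total_articles = S[0][0] + S[0][1] + S[1][0] + S[1][1]  # S_0 + S_1
    scores = {}
    for r in (0, 1):
        score = 0.0
        for zk in (0, 1):
            # log of P_hat(r, z_k), Eq. 4
            log_term = (math.log(S[r][zk] + W + 1)
                        - math.log(total_articles + 2 * W + 2))
            for w, xw in x.items():
                # smoothed conditional rate, Eq. 3
                lam = (tables[r][zk].get(w, 0) + c1) / (S[r][zk] + c2 + 2)
                log_term += xw * math.log(lam) - lam  # Poisson kernel
            score += math.exp(log_term)
        scores[r] = score
    z = scores[0] + scores[1]
    return {r: s / z for r, s in scores.items()}

def classify(x, tables, S, W):
    """Crisp classification: argmax over r of the normalized probabilities."""
    p = predict_read_prob(x, tables, S, W)
    return max(p, key=p.get)
```

With read articles dominated by one topic word and unread articles by another, a new article heavy in the first word is classified as read.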

Including the social network

As a key feature of our approach, we include the social network of a target user in the personalization. In the user’s news scroll, we automatically include articles read by the user’s friends that the user would otherwise not have been shown. Peer influences are stronger when an individual is less certain about his or her preferences (Narayan et al. 2011). We thus focus on articles toward which the user’s preference is ambivalent, because when the uncertainty of liking or disliking an article is highest, the potential for peer influence is largest. We therefore restrict the use of the social network to only those news articles toward which a user is predicted to be indifferent; articles that the user is predicted to read are downloaded to the mobile device anyway. This circumvents the problem of downloading articles from the social network that respondents are likely to dislike.

Thus, any news article b toward which a target user is predicted to be more or less indifferent is downloaded to that user’s mobile device if anyone in her social network has read it. In our model, a user is predicted to be indifferent to an article when the odds of reading the news article,

$$ {o}_r\left({x}_{b,1:W+V}\right)=\frac{\widehat{P}\left(r=1|{x}_{b,1:W+V}\right)}{\widehat{P}\left(r=0|{x}_{b,1:W+V}\right)}, $$

is close to 1. That is, 1 − ε < o r (x b,1 : W + V ) < 1 + ε, for some small value ε that ensures that the algorithm is very selective. We should emphasize that when the odds of reading an article are close to 1, the individual in question may or may not in fact be indifferent to the article: odds close to 1 may reflect the system’s uncertainty about the user’s preference rather than true indifference. The parameter ε can be interpreted as a “social influence parameter,” because its value controls how many news articles enter the user’s news scroll through the social network. To be selective, the default value of the social influence parameter is set to a small value.
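The gating rule is simple enough to state directly in code; the default ε = 0.05 below is a hypothetical value, not the one used in the field study:

```python
def socially_recommended(p_read, peer_read, eps=0.05):
    """Download an article from the social network only if the model is
    indifferent (odds of reading within 1 - eps and 1 + eps) and at
    least one peer has read it."""
    odds = p_read / (1 - p_read)
    indifferent = 1 - eps < odds < 1 + eps
    return indifferent and peer_read
```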

This feature is expected to improve personalization when users in the social network share preferences for news articles. In addition, it offers the important advantage of preventing the algorithm from zooming in on too narrow a set of news articles early on, by introducing a certain level of ‘surprise.’ That is, new articles are introduced in an informed manner based on the social network, which enables the algorithm to learn the user’s preferences for new topics to which he or she would not otherwise be exposed. Because users cannot see the reading behavior of their peers, the influence of the social network in our system operates mostly through a mechanism of collaborative filtering. This helps improve preference estimation, especially when the reading preferences of the peers and the target individual match well. Effects through homophily of peers in the network do occur, however, since if a news article is preferred by the peers, this also increases the chance that the target user is exposed to the same article.

Appendix 2 – Randomization test

The randomization test (Lunneborg 2000) is used to test the significance of differences in the performance of the adaptive personalization models, because measures such as precision and F1 are nonlinear and have too complex a distributional form to permit a traditional test of significance. Randomization tests differ from parametric significance tests in several respects, among them the absence of assumptions of normality or homoscedasticity. In addition, the test statistic is not compared to a tabled distribution (e.g., the normal distribution); instead, it is compared to the results obtained by repeatedly randomizing the data across the comparison groups. An important assumption behind a randomization test is that the observations are exchangeable under the null hypothesis; under that assumption, randomization tests are exact. The primary purpose is to obtain p-values: the probability of obtaining a result at least as extreme under the null hypothesis of no difference. Re-sampling is done without replacement. Here, the purpose is to obtain p-values for differences in the estimated values of F1 and the other measures between the different personalization models. Because respondents were randomly assigned to conditions, the assumption of exchangeability likely holds. Compared to the randomization test, the bootstrap rests on less strict assumptions but is not an exact test; it is primarily used to obtain confidence intervals and is based on sampling with replacement (Good 2005).

The null hypothesis is that the two models are not different, so that any prediction produced by one model could just as likely have come from the other. In the randomization test, we shuffle the predictions between the two models and see how often this produces a difference in the performance measures at least as large as the difference observed in the data. To illustrate, suppose model 1 performs better than model 2. We would then expect that randomly shuffling the predictions of the two models rarely gives a larger difference in performance: a mixture of the predictions from models 1 and 2 is unlikely to outperform the predictions of model 1 alone. Using 10,000 randomizations, we reshuffle the predictions of models 1 and 2 and count the number of times the shuffled data yield a larger difference in performance than that observed for models 1 and 2. The resulting proportion is the probability that the randomized predictions beat the observed predictions, and can be interpreted as a ‘p-value’ measuring the significance of the differences between the models.

The steps involved in a randomization test are:

  1. Select a metric for comparison between the different personalization models (e.g., F1).

  2. Calculate the metric for the different models. We denote the personalization model with the better performance as model 1 and the other as model 2.

  3. Repeat the following 10,000 times (a value we chose for this study):

     a. Shuffle the data between the two models, that is, shuffle the predictions for the individual news articles between the models,

     b. Calculate the metric for the shuffled data,

     c. If, with the shuffled data, the performance of model 2 is better than that of model 1, increase the value of a counter by 1.

  4. Divide the value of the counter by 10,000 to get the proportion of times the performance of model 2 is better than that of model 1 with the shuffled data. Reject or retain the null hypothesis on the basis of this probability.
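The steps above can be sketched in Python. The F1 computation is standard; the per-article swap with probability 1/2 is one common way to implement the shuffle, and the data and seed below are hypothetical:

```python
import random

def f1_score(preds, actuals):
    """F1 = 2PR / (P + R) for binary read(1)/skip(0) predictions."""
    tp = sum(p == 1 and a == 1 for p, a in zip(preds, actuals))
    fp = sum(p == 1 and a == 0 for p, a in zip(preds, actuals))
    fn = sum(p == 0 and a == 1 for p, a in zip(preds, actuals))
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def randomization_test(preds1, preds2, actuals, metric=f1_score,
                       n=10_000, seed=0):
    """p-value = share of shuffles whose metric difference is at least
    as large as the observed difference (model 1 minus model 2)."""
    rng = random.Random(seed)
    observed = metric(preds1, actuals) - metric(preds2, actuals)
    counter = 0
    for _ in range(n):
        a, b = [], []
        for p1, p2 in zip(preds1, preds2):
            # swap each article's pair of predictions with probability 1/2
            if rng.random() < 0.5:
                a.append(p2); b.append(p1)
            else:
                a.append(p1); b.append(p2)
        if metric(a, actuals) - metric(b, actuals) >= observed:
            counter += 1
    return counter / n
```

A perfect model tested against a model that never recommends yields a p-value near zero, while testing a model against itself yields a p-value of 1.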

Appendix 3 – Tests on simulated data

We conduct a simulation study to investigate the performance of the Naïve Bayes algorithm for various amounts of input data. For this purpose, a total of 482 news articles, downloaded from the Channel News Asia Business RSS feeds between 1st February 2010 and 9th February 2010, are used for the simulation. These articles span a period of 7 days. The moving window from which articles are retained is set to M = 7, since most articles lose much of their news value after 7 days. The news categories of these articles are Asia Pacific (69 articles), Business (69), Entertainment (70), Local (70), Sports (70), Technology (66), and World (68). A discriminant analysis is used to see which keywords helped to discriminate between the news articles. The recommended ratio of observations to predictor variables is at least 5:1 (Hair et al. 2005), so with 482 observations we select the 96 keywords with the highest frequency.

We use the counts of these keywords in an individual-level logistic regression for ten individuals to simulate their reading preferences. The coefficients are drawn from a uniform distribution. The logistic regression produces an indicator variable that marks each of the 482 articles as either ‘read’ (0) or ‘skipped’ (1) for each individual, and this is matched against the prediction of our algorithm. The articles from the first day are used to initialize the procedure (i.e., to generate the keywords and the word tables for read and skipped articles). The keywords and word tables generated from the first day are used to personalize the articles for day two, and so on. A total of 412 articles is then available. The social network is simulated by assuming that peers in the social network have identical reading preferences, up to a random error. Because we are interested in personalizing the articles on an individual basis, we simulate 10 users.
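The data-generating step can be sketched as follows; the keyword-count matrix, the coefficient range, and the seed are hypothetical stand-ins for the real article data, and labels follow the text's coding (0 = read, 1 = skipped):

```python
import math
import random

rng = random.Random(42)
NUM_KEYWORDS, NUM_ARTICLES, NUM_USERS = 96, 482, 10

# hypothetical keyword-count matrix for the articles
counts = [[rng.randint(0, 5) for _ in range(NUM_KEYWORDS)]
          for _ in range(NUM_ARTICLES)]

def sigmoid(u):
    """Numerically stable logistic function."""
    if u >= 0:
        return 1 / (1 + math.exp(-u))
    e = math.exp(u)
    return e / (1 + e)

def simulate_user(counts, rng):
    """Draw uniform logistic coefficients for one user and label every
    article 0 ('read') or 1 ('skipped')."""
    beta = [rng.uniform(-1, 1) for _ in range(NUM_KEYWORDS)]
    intercept = rng.uniform(-1, 1)
    labels = []
    for x in counts:
        utility = intercept + sum(b * xi for b, xi in zip(beta, x))
        p_read = sigmoid(utility)
        labels.append(0 if rng.random() < p_read else 1)
    return labels

users = [simulate_user(counts, rng) for _ in range(NUM_USERS)]
```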

The classification performance of the approach depends critically on what and how much data are used as input. As there are no theoretical guidelines available for these choices, we test the performance of our approach using four alternative inputs:

  1. The words in the headlines of the news articles,

  2. The words in the first three paragraphs of the news articles,

  3. The words in the entire contents of the news articles,

  4. The entire contents of the articles, incorporating the social network.

Performance will also depend on how many keywords are used. For each scenario, we therefore compare the performance of the approach as a function of the maximum number of keywords used for classification, varying it from 25 to 55 in increments of 10. The maximum of 55 keywords is dictated by the memory and processing limitations of the mobile devices. The values of the F1 measure for the resulting scenarios are shown in Table 5.
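The F1 measure combines precision and recall as their harmonic mean. A minimal sketch, which also handles the degenerate case in which no articles are recommended at all (no positives), where we follow the common convention of reporting F1 = 0:

```python
def f1_score(true_pos, false_pos, false_neg):
    """F1 = harmonic mean of precision and recall.
    If nothing is recommended (or nothing relevant is found),
    precision or recall is undefined; report 0 by convention."""
    predicted_pos = true_pos + false_pos
    actual_pos = true_pos + false_neg
    if true_pos == 0 or predicted_pos == 0 or actual_pos == 0:
        return 0.0
    precision = true_pos / predicted_pos
    recall = true_pos / actual_pos
    return 2 * precision * recall / (precision + recall)
```

For example, `f1_score(5, 5, 5)` gives precision = recall = .5 and hence F1 = .5, while a run that recommends no articles gives F1 = 0 regardless of how many negatives are classified correctly.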

The motivation for testing our algorithm using only the headlines as data comes from the observation that users only see the headlines before deciding whether to read the news articles. Because headlines contain few words, F1 performance does not improve as the maximum number of keywords increases; in fact, for every setting from 25 to 55 keywords, no articles are recommended at all. This yields zero precision and zero recall, and therefore an F1 of zero. In addition, with no articles recommended (i.e., no positives), the accuracy measure simplifies to the ratio of true negatives to total negatives; since the simulated users prefer only 11% of the articles on average, this produces an artificially high accuracy. This performance is clearly unacceptable, and we conclude that using only the headlines does not yield effective personalization.

The motivation for testing the performance of the algorithm using only the keywords in the first three paragraphs of each news article is that users tend to browse through articles and classify a document as interesting after reading only the first few paragraphs, without scrolling. Using 25 keywords in this case yields F1 = .207, some improvement over using keywords from the headlines only. Yet when the number of keywords is increased, performance decreases, ultimately to F1 = .019 for 55 keywords. Inspection of the data shows that this is due to the algorithm's tendency to recommend fewer articles as the number of keywords increases. We therefore conclude that using the first three paragraphs does not yield acceptable performance either.

When keywords from the entire contents of the news articles are used, the performance of the algorithm improves with the number of keywords: from F1 = .206 at 25 keywords to F1 = .255 at 55. Inspection of the data shows that the algorithm is good at screening out unwanted news articles, but less good at finding news items that are preferred by the user. Compared to the previous two scenarios, however, these results are the best. These simulations show that there is little opportunity to improve computational performance by reducing the data input: satisfactory performance requires the entire news article as data, with the maximum of 55 keywords used to classify each article.
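A minimal sketch of a keyword-count Naïve Bayes of the kind used here, maintaining word tables for ‘read’ and ‘skipped’ articles. Laplace smoothing and the log-probability comparison are standard choices assumed for the sketch; the paper does not specify these implementation details.

```python
import math
from collections import Counter

class NaiveBayesNews:
    """Naïve Bayes over keyword counts, with separate word tables
    for 'read' and 'skipped' articles and Laplace smoothing
    (an assumed, standard smoothing scheme)."""

    def __init__(self, keywords):
        self.keywords = set(keywords)
        self.tables = {"read": Counter(), "skipped": Counter()}
        self.doc_counts = {"read": 0, "skipped": 0}

    def update(self, words, label):
        # Incrementally update the word table for the observed label.
        self.doc_counts[label] += 1
        self.tables[label].update(w for w in words if w in self.keywords)

    def predict(self, words):
        total_docs = sum(self.doc_counts.values())
        vocab = len(self.keywords)
        scores = {}
        for label, table in self.tables.items():
            # Smoothed log prior plus smoothed log likelihoods.
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            n = sum(table.values())
            for w in words:
                if w in self.keywords:
                    score += math.log((table[w] + 1) / (n + vocab))
            scores[label] = score
        return max(scores, key=scores.get)

nb = NaiveBayesNews(["stocks", "tech", "sports"])
nb.update(["stocks", "tech"], "read")
nb.update(["sports"], "skipped")
```

Day-by-day updating, as described above, corresponds to calling `update` on each day's labeled articles before calling `predict` on the next day's feed.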

Finally, when not only the entire content of the news articles but also the behavior of friends in the social network of the target user is used, F1 improves monotonically from .317 (25 keywords) to .619 (55 keywords) (Table 5), which is substantially better than with the same number of keywords without the social network. Breaking down performance by recommendation cycle shows that the F1 value increases from .614 for the first cycle to .662 for the sixth and last cycle, revealing that the approach adapts better to user interests over time. Based on these results, we decide to use 55 keywords from the entire text, together with the social network, in the algorithm.
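One simple way to bring the social network into the classifier is to pool peers' labeled articles with the target user's own reading history before updating the word tables. The pooling and the `peer_weight` repetition below are hypothetical simplifications for illustration, not the paper's exact weighting scheme.

```python
def pooled_training_set(own_history, peer_histories, peer_weight=1):
    """Pool a user's labeled articles with those of peers in the
    social network. Each history is a list of (words, label) pairs.
    peer_weight repeats each peer example, a crude hypothetical
    stand-in for weighting peer data relative to the user's own."""
    pooled = list(own_history)
    for history in peer_histories:
        pooled.extend(history * peer_weight)
    return pooled
```

Because peers are simulated to share the target user's preferences up to random error, pooling their labeled articles enlarges the effective training set, which is consistent with the F1 gains reported for the social-network scenario.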

Table 5 F1 Performance measures for the simulated data


Chung, T.S., Wedel, M. & Rust, R.T. Adaptive personalization using social networks. J. of the Acad. Mark. Sci. 44, 66–87 (2016).


  • Personalization
  • Social networks
  • News
  • Bayes classifier
  • Recommendation systems
  • Mobile commerce
  • Smart phones
  • Service marketing