Abstract
Personalized recommender systems rely on each user’s personal usage data to assist in decision making. However, privacy policies that protect users’ rights prevent these highly personal data from being made publicly available to the wider research community. In this work, we propose a memory biased random walk model on a multilayer sequence network as a generator of synthetic sequential data for recommender systems. We demonstrate the applicability of the generated synthetic data for training recommender system models in cases where privacy policies restrict the publishing of clickstream data.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Antulov-Fantulin, N., Bošnjak, M., Zlatić, V., Grčar, M., Šmuc, T. (2014). Synthetic Sequence Generator for Recommender Systems – Memory Biased Random Walk on a Sequence Multilayer Network. In: Džeroski, S., Panov, P., Kocev, D., Todorovski, L. (eds) Discovery Science. DS 2014. Lecture Notes in Computer Science, vol 8777. Springer, Cham. https://doi.org/10.1007/978-3-319-11812-3_3
DOI: https://doi.org/10.1007/978-3-319-11812-3_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11811-6
Online ISBN: 978-3-319-11812-3
eBook Packages: Computer Science, Computer Science (R0)