Abstract
Data collected for providing recommendations can be partitioned among different parties. Offering predictions based on such distributed data is popular because it benefits all the parties involved. It is almost impossible to present trustworthy referrals with decent accuracy from split data alone; meaningful outcomes can be drawn only from adequate data. Companies holding distributed data might therefore want to collaborate to produce accurate and dependable recommendations for their customers. However, they hesitate or refuse to collaborate because of privacy, financial, and legal concerns. If privacy-preserving measures are provided, such data holders might decide to collaborate for better predictions. In this study, we investigate how to provide predictions based on vertically distributed data (VDD) among multiple parties without deeply jeopardizing their confidentiality. Users are first grouped into clusters off-line using self-organizing map clustering while protecting the online vendors’ privacy. Recommendations are then produced from the partitioned data using a nearest neighbour prediction algorithm, again with privacy protection in place. We analyse our privacy-preserving scheme in terms of confidentiality and supplementary costs. Our analysis shows that the method offers recommendations without greatly exposing data holders’ privacy and that the privacy measures cause only negligible additional costs. To evaluate the scheme in terms of accuracy, we perform experiments on real data. The results demonstrate that the scheme is still able to provide accurate predictions.
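To illustrate why nearest neighbour prediction is feasible on vertically distributed data, the following is a minimal sketch, not the scheme analysed in the paper. It assumes cosine similarity over co-rated items (the abstract names only "a nearest neighbour prediction algorithm"), two hypothetical parties A and B holding disjoint item columns for the same users, and it omits both the self-organizing map clustering step and every privacy-preserving measure. All function and variable names are made up for illustration. The point it shows is that the similarity between two users decomposes into per-party partial sums that can be added together.

```python
import math

def partial_sums(active, others):
    """One party's local contribution over its own item columns only.
    Returns (dot product, active squared norm, other squared norm) per user.
    None marks an unrated item; only co-rated items are counted."""
    out = []
    for row in others:
        dot = na = nu = 0.0
        for a, u in zip(active, row):
            if a is not None and u is not None:
                dot += a * u
                na += a * a
                nu += u * u
        out.append((dot, na, nu))
    return out

def combine(parts_per_party):
    """Add the per-party partial sums and convert them to cosine similarities."""
    sims = []
    for per_user in zip(*parts_per_party):
        dot = sum(p[0] for p in per_user)
        na = sum(p[1] for p in per_user)
        nu = sum(p[2] for p in per_user)
        denom = math.sqrt(na) * math.sqrt(nu)
        sims.append(dot / denom if denom else 0.0)
    return sims

def predict(sims, target_ratings, k):
    """Similarity-weighted average over the k most similar users
    who rated the target item; None if no such user exists."""
    rated = [(s, r) for s, r in zip(sims, target_ratings)
             if r is not None and s > 0]
    top = sorted(rated, reverse=True)[:k]
    total = sum(s for s, _ in top)
    return sum(s * r for s, r in top) / total if total else None

# Hypothetical data: party A holds items 0-1, party B holds items 2-3.
# The active user has not rated item 3, which is the prediction target.
active_A, active_B = [5, 3], [4, None]
others_A = [[5, 3], [1, 5], [4, None]]   # same three users at both parties
others_B = [[4, 5], [2, 1], [5, 4]]

sims = combine([partial_sums(active_A, others_A),
                partial_sums(active_B, others_B)])
target_ratings = [row[1] for row in others_B]  # item 3, held by party B
prediction = predict(sims, target_ratings, k=2)
print(sims, prediction)
```

In this plain form each party would reveal its partial sums to whoever combines them, which is exactly the kind of exposure a privacy-preserving scheme has to prevent; the sketch only shows the arithmetic that such a scheme must reproduce.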
Acknowledgements
This work is supported by Grant 108E221 from TUBITAK.
Cite this article
Kaleli, C., Polat, H. SOM-based recommendations with privacy on multi-party vertically distributed data. J Oper Res Soc 63, 826–838 (2012). https://doi.org/10.1057/jors.2011.76
DOI: https://doi.org/10.1057/jors.2011.76