Frequent Itemset Mining with Differential Privacy Based on Transaction Truncation
Frequent itemset mining is the basis for discovering relationships among transactions and for providing information services such as recommendation. However, when a transaction database contains sensitive information about individuals, directly releasing frequent itemsets and their supports can put users' privacy at risk. Differential privacy offers rigorous protection: it perturbs the released statistics so that an attacker cannot reliably recover an individual's sensitive data from them. The sensitivity for counting occurrences (SCO) in a transaction database depends on transaction length, and a larger SCO reduces the availability of frequent itemsets under ε-differential privacy. It is therefore necessary to truncate long transactions in the database. We propose the algorithm FI-DPTT, in which a quality function is designed for the exponential mechanism (EM) to select the optimal transaction length, with the aim of minimizing the noise in the released supports. Experimental results show that the proposed algorithm efficiently improves both availability and privacy.
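The pipeline summarized above (pick a truncation length with the exponential mechanism, truncate, then release Laplace-perturbed supports) can be illustrated with a minimal Python sketch. This is not the FI-DPTT algorithm itself: the abstract does not specify the quality function or the truncation rule, so `quality`, its `penalty` weight, the "keep the first items in sorted order" rule, and all function names below are assumptions made for illustration only. The sketch covers single-item supports and assumes the privacy budget is split by the caller between the selection step and the noise step.

```python
import math
import random
from collections import Counter


def truncate(transactions, max_len):
    """Keep at most max_len items per transaction.

    Assumption: items are kept in sorted order; the source does not say
    which items a truncated transaction should retain.
    """
    return [sorted(t)[:max_len] for t in transactions]


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_item_supports(transactions, items, max_len, epsilon, rng):
    """Release a Laplace-perturbed support for every item in `items`.

    After truncation one transaction touches at most max_len item counts,
    so the L1 sensitivity of the count vector is max_len and the Laplace
    scale is max_len / epsilon.
    """
    counts = Counter(i for t in truncate(transactions, max_len) for i in set(t))
    scale = max_len / epsilon
    return {i: counts.get(i, 0) + laplace_noise(scale, rng) for i in items}


def quality(transactions, length, penalty=1.0):
    """Hypothetical quality function (NOT the one from the paper).

    Rewards lengths that leave many transactions untouched and penalizes
    long lengths, which require more noise. The data-dependent part changes
    by at most 1 when one transaction is added or removed, so its
    sensitivity is 1.
    """
    covered = sum(1 for t in transactions if len(t) <= length)
    return covered - penalty * length


def choose_length(transactions, candidates, epsilon, rng, sensitivity=1.0):
    """Exponential mechanism: P(l) is proportional to exp(eps * q(l) / (2 * sensitivity))."""
    scores = [epsilon * quality(transactions, l) / (2.0 * sensitivity)
              for l in candidates]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    return rng.choices(candidates, weights=weights)[0]
```

A caller would typically spend part of ε in `choose_length` and the remainder in `noisy_item_supports`, for example `l = choose_length(db, [1, 2, 3, 4], 0.2, rng)` followed by `noisy_item_supports(db, items, l, 0.8, rng)`. A smaller `l` lowers the noise scale but discards more items from long transactions, which is the trade-off the truncation length has to balance.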
Keywords: Frequent itemset mining; Differential privacy; Exponential mechanism; Quality function; Laplace mechanism; Transaction truncation
This work is funded by the Chongqing Natural Science Foundation (cstc2014kjrc-qnrc40002) and the Scientific and Technological Research Program of Chongqing Municipal Education Commission (KJ1500431, KJ1400429).