Cluster Computing, Volume 22, Supplement 1, pp 1415–1428

Lossless and robust privacy preservation of association rules in data sanitization

  • Geeta S. Navale
  • Suresh N. Mali


Data sanitization is a novel research area concerned with concealing the sensitive rules, identified by experts, that are present in an original database. The database is modified appropriately and only the modified version is released, so that unauthorized persons cannot discover the sensitive rules and the confidentiality of the data is preserved against data mining methods. This paper focuses on building an effective sanitizing algorithm for hiding the sensitive rules specified by experts or users. The proposed method uses the Firefly optimization algorithm to minimize four side effects of sanitization: hiding failure, information loss, false rule generation, and modification degree. Compared with existing sanitizing algorithms, the proposed method shows considerable improvement on these four measures, which in turn helps secure the selected database.
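The four side effects named in the abstract can be made concrete with a small sketch. The abstract does not give the paper's formulas, so the definitions below follow common usage in the rule-hiding literature and are assumptions, as are the function name `sanitization_metrics`, the representation of rules as (antecedent, consequent) tuples, and the representation of a database as a list of item sets with the same transaction order before and after sanitization.

```python
def sanitization_metrics(original_rules, sanitized_rules, sensitive_rules,
                         original_db, sanitized_db):
    """Return four side-effect ratios in [0, 1] (assumed definitions).

    Rules are hashable values such as (antecedent, consequent) tuples.
    Databases are equal-length lists of item sets, one set per transaction.
    """
    original_rules = set(original_rules)
    sanitized_rules = set(sanitized_rules)
    sensitive_rules = set(sensitive_rules)
    non_sensitive = original_rules - sensitive_rules

    def ratio(numerator, denominator):
        # Guard against empty denominators.
        return numerator / denominator if denominator else 0.0

    # Hiding failure: sensitive rules that can still be mined afterwards.
    hiding_failure = ratio(len(sensitive_rules & sanitized_rules),
                           len(sensitive_rules))

    # Information loss: non-sensitive rules that disappeared.
    information_loss = ratio(len(non_sensitive - sanitized_rules),
                             len(non_sensitive))

    # False rule generation: rules that exist only after sanitization.
    false_rules = ratio(len(sanitized_rules - original_rules),
                        len(sanitized_rules))

    # Modification degree: items inserted or deleted, relative to the
    # number of items in the original database.
    changed = sum(len(before ^ after)
                  for before, after in zip(original_db, sanitized_db))
    total = sum(len(before) for before in original_db)
    modification_degree = ratio(changed, total)

    return {
        "hiding_failure": hiding_failure,
        "information_loss": information_loss,
        "false_rules": false_rules,
        "modification_degree": modification_degree,
    }


# Toy example: item "b" is removed from the first transaction.
example = sanitization_metrics(
    original_rules={("a", "b"), ("b", "c"), ("a", "c")},
    sanitized_rules={("b", "c"), ("c", "a")},
    sensitive_rules={("a", "b")},
    original_db=[{"a", "b", "c"}, {"a", "b"}, {"b", "c"}],
    sanitized_db=[{"a", "c"}, {"a", "b"}, {"b", "c"}],
)
```

An optimizer such as the Firefly algorithm mentioned above would typically search for the database modification that keeps all four ratios low; how the paper combines them into an objective is not stated here.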


Association rule mining; Data sanitization; Sensitive rules; Privacy-preserving data mining; Khatri-Rao product



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Smt. Kashibai Navale College of Engineering, Pune, India
  2. Savitribai Phule Pune University, Pune, India
  3. Sinhgad Institute of Technology and Science, Pune, India
