A new algorithm using integer programming relaxation for privacy-preserving in utility mining

Applied Intelligence

Abstract

High-utility itemset mining (HUIM) is an effective technique for discovering significant information in data. However, data containing sensitive and private information may raise privacy concerns. Therefore, privacy-preserving utility mining (PPUM) has recently become a critical research area. PPUM is the process of transforming a quantitative transactional database into a sanitised one, so that utility mining algorithms cannot discover sensitive information. The sanitisation process can have several side effects, including the loss of non-sensitive information and the introduction of redundant information. Additionally, heuristic algorithms for sanitising data have high running times. To minimise these side effects and lower the execution time of the hiding process, we propose the G-ILP algorithm, which combines a GPU parallel programming method for preprocessing with a new, efficient constraint satisfaction problem formulation for hiding data. Experimental evaluations of G-ILP show the algorithm’s efficiency in terms of running time and its ability to minimise side effects on large datasets.
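To make the PPUM setting concrete, the following is a minimal sketch of what sanitising a quantitative transactional database means. It is not the G-ILP algorithm: it assumes the standard HUIM definition of utility (quantity times unit profit, summed over the transactions that contain the whole itemset) and uses a naive greedy quantity reduction in place of the paper's constraint-based hiding step. The names `itemset_utility` and `hide_itemset`, the toy database, the profit table, and the threshold are all hypothetical.

```python
from typing import Dict, FrozenSet, List

Transaction = Dict[str, int]  # item -> purchased quantity


def supports(tx: Transaction, itemset: FrozenSet[str]) -> bool:
    """True if the transaction contains every item of the itemset."""
    return all(tx.get(item, 0) > 0 for item in itemset)


def itemset_utility(db: List[Transaction], profit: Dict[str, int],
                    itemset: FrozenSet[str]) -> int:
    """Sum of quantity * unit profit over all supporting transactions."""
    return sum(
        tx[item] * profit[item]
        for tx in db if supports(tx, itemset)
        for item in itemset
    )


def hide_itemset(db: List[Transaction], profit: Dict[str, int],
                 itemset: FrozenSet[str], min_util: int) -> List[Transaction]:
    """Naive greedy sanitisation (an assumption, not the paper's method):
    lower the quantity of the most profitable item of the sensitive itemset
    until its utility falls below min_util. The input database is not modified."""
    sanitised = [dict(tx) for tx in db]
    victim = max(sorted(itemset), key=lambda item: profit[item])
    for tx in sanitised:
        while (itemset_utility(sanitised, profit, itemset) >= min_util
               and supports(tx, itemset)):
            tx[victim] -= 1
            if tx[victim] == 0:
                del tx[victim]  # the item disappears from this transaction
    return sanitised


# Hypothetical toy data: unit profits and three transactions.
profit = {"a": 5, "b": 2, "c": 1}
db = [{"a": 2, "b": 1}, {"a": 1, "b": 3, "c": 2}, {"b": 2, "c": 4}]
sensitive = frozenset({"a", "b"})
min_util = 15

sanitised = hide_itemset(db, profit, sensitive, min_util)
before = itemset_utility(db, profit, sensitive)
after = itemset_utility(sanitised, profit, sensitive)
# Side effect: the non-sensitive itemset {a} also loses utility.
a_before = itemset_utility(db, profit, frozenset({"a"}))
a_after = itemset_utility(sanitised, profit, frozenset({"a"}))
```

In this toy run the sensitive itemset {a, b} drops below the threshold, but the single item {a}, which was not marked sensitive, falls below it as well. That is an instance of the loss of non-sensitive information described above, and it is the kind of side effect that a constraint-based formulation aims to limit.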

Acknowledgements

This research is funded by the University of Science, VNU-HCM, under grant number CNTT 2021-05.

Author information

Corresponding author

Correspondence to Bac Le.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Nguyen, D., Tran, MT. & Le, B. A new algorithm using integer programming relaxation for privacy-preserving in utility mining. Appl Intell 53, 25106–25118 (2023). https://doi.org/10.1007/s10489-023-04913-w

  • DOI: https://doi.org/10.1007/s10489-023-04913-w
