Secure itemset hiding in smart city sensor data

Cluster Computing

Abstract

Sensor data collected in the Internet of Things (IoT) or any smart city environment should be protected against security and privacy threats. Sensor data shared by devices in smart cities often contains sensitive or private information that can be passed across different networks and between different smart applications. In the last decade, Privacy-Preserving Data Mining (PPDM) has received a lot of attention because of the huge amount of data collected daily. Unfortunately, PPDM mostly applies to binary data. To improve the usefulness of PPDM, we present a more usable version for smart cities called Privacy-Preserving Utility Mining (PPUM), in the form of a Maximal Sensitive Utility-Maximal Sensitive ConflIct (MSU-MSI) algorithm. MSU-MSI finds conflicting items that may belong to sensitive high-utility itemsets and sanitizes them, stripping sensitive and private information while maintaining utility. Any transactions that contain sensitive itemsets are first fed through the sanitization process. The total number of conflicting items is then calculated, and those items are removed so that sanitization operates more efficiently and does not repeat steps already performed. We conduct an in-depth experimental analysis in which our method is compared directly with state-of-the-art approaches such as MSU-MIU, MSU-MAU, HHUIF and MSICF. The proposed MSU-MSI performs better on missing cost, in particular on highly dense or highly sparse datasets. Moreover, our framework achieves excellent performance with regard to database structure similarity and database utility.
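The general idea of hiding sensitive high-utility itemsets can be illustrated with a minimal Python sketch. This is not the authors' MSU-MSI algorithm, whose details are not given in the abstract. It assumes a toy quantitative transaction database, a per-item unit profit table, and a greedy rule that deletes the item shared by the most sensitive itemsets (its "conflict count") from supporting transactions until each sensitive itemset falls below the minimum utility threshold. All names (`itemset_utility`, `conflict_counts`, `sanitize`), the data, and the threshold are hypothetical.

```python
from typing import Dict, FrozenSet, List

Transaction = Dict[str, int]  # item -> quantity recorded in one transaction


def supports(tx: Transaction, itemset: FrozenSet[str]) -> bool:
    """True if the transaction contains every item of the itemset."""
    return all(tx.get(item, 0) > 0 for item in itemset)


def itemset_utility(db: List[Transaction], profit: Dict[str, int],
                    itemset: FrozenSet[str]) -> int:
    """Sum of quantity * unit profit over all transactions supporting the itemset."""
    return sum(
        tx[item] * profit[item]
        for tx in db if supports(tx, itemset)
        for item in itemset
    )


def conflict_counts(sensitive: List[FrozenSet[str]]) -> Dict[str, int]:
    """Number of sensitive itemsets in which each item appears."""
    counts: Dict[str, int] = {}
    for itemset in sensitive:
        for item in itemset:
            counts[item] = counts.get(item, 0) + 1
    return counts


def sanitize(db: List[Transaction], profit: Dict[str, int],
             sensitive: List[FrozenSet[str]], min_util: int) -> List[Transaction]:
    """Return a copy of db in which every sensitive itemset has utility < min_util.

    Assumed greedy rule (illustrative only): the victim item is the one with the
    highest conflict count, and it is deleted from supporting transactions one
    at a time until the itemset is hidden.
    """
    out = [dict(tx) for tx in db]  # leave the caller's database untouched
    counts = conflict_counts(sensitive)
    for itemset in sensitive:
        # sorted() makes tie-breaking deterministic
        victim = max(sorted(itemset), key=lambda item: counts[item])
        for tx in out:
            if itemset_utility(out, profit, itemset) < min_util:
                break  # already hidden, no further changes needed
            if supports(tx, itemset):
                del tx[victim]
    return out


# Toy data (hypothetical): unit profits and four sensor "transactions".
profit = {"a": 5, "b": 2, "c": 1}
db = [
    {"a": 2, "b": 1},
    {"a": 1, "b": 3, "c": 2},
    {"b": 2, "c": 4},
    {"a": 3, "c": 1},
]
sensitive = [frozenset({"a", "b"}), frozenset({"b", "c"})]
min_util = 15

print(itemset_utility(db, profit, sensitive[0]))  # 23, above the threshold
print(itemset_utility(db, profit, sensitive[1]))  # 16, above the threshold

clean = sanitize(db, profit, sensitive, min_util)
print(itemset_utility(clean, profit, sensitive[0]))  # 0, now hidden
print(itemset_utility(clean, profit, sensitive[1]))  # 8, now hidden
```

In this toy run, item `b` appears in both sensitive itemsets, so deleting it from one transaction can lower the utility of both at once. That is the intuition behind preferring conflicting items as victims, although a real sanitization algorithm would also weigh side effects such as missing cost.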

Data Availability

Data available on request from the authors.

Code Availability

Code available on request from the authors.

Acknowledgements

This work is partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) through a Discovery Grant held by Dr. Gautam Srivastava (RGPIN-2020-05363).

Funding

This work is partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) through a Discovery Grant held by Dr. Gautam Srivastava (RGPIN-2020-05363).

Author information

Contributions

GS: Conceptualization, Methodology, Software, Data curation, Validation, Investigation, Visualization, Writing - original draft. JC-WL: Supervision, Conceptualization, Methodology, Investigation, Writing - review & editing. GL: Methodology, Validation, Writing - review & editing.

Corresponding author

Correspondence to Gautam Srivastava.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare for this manuscript.

Ethical Approval

For this type of study, formal consent was not required. This manuscript does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Srivastava, G., Lin, J.C.-W. & Lin, G. Secure itemset hiding in smart city sensor data. Cluster Comput 27, 1361–1374 (2024). https://doi.org/10.1007/s10586-023-04000-2

  • DOI: https://doi.org/10.1007/s10586-023-04000-2
