Abstract
Deduplication is an important technology in cloud storage services. To protect user privacy, sensitive data usually has to be encrypted before outsourcing, which makes secure data deduplication a challenging task. Although convergent encryption can securely eliminate duplicate copies of encrypted data, such secure deduplication techniques support only exact data deduplication; that is, traditional deduplication schemes tolerate no differences between copies. This requirement is too strict for multimedia data such as images. For images, typical modifications such as resizing and compression change only the binary representation while preserving human visual perception, so the modified copies should be eliminated as duplicates. Such perceptually similar images occupy a lot of storage space on the remote server and greatly reduce the efficiency of the deduplication system. In this paper, we first formalize and solve the problem of effective fuzzy image deduplication while maintaining user privacy. Our solution eliminates duplicate images based on a measurement of image similarity over encrypted data. The robustness evaluation demonstrates that this fuzzy deduplication system is able to deduplicate perceptually similar images, which greatly reduces the storage and bandwidth overhead of the cloud storage service.
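The gap between exact and fuzzy deduplication can be illustrated with a minimal sketch. It is not the scheme proposed in this paper: it works on plaintext pixels rather than encrypted data, and it assumes a simple block-mean perceptual hash compared by Hamming distance. The function names, the 4 x 4 grid, and the threshold of 2 bits are hypothetical choices made only for illustration.

```python
import hashlib


def block_mean_hash(pixels, grid=4):
    """Hash a grayscale image given as a list of equal-length rows (0-255).

    The image is split into grid x grid blocks. Each output bit records
    whether a block's mean is at or above the median of all block means.
    (Illustrative block-mean hash; the grid size is an assumption.)
    """
    h, w = len(pixels), len(pixels[0])
    means = []
    for by in range(grid):
        for bx in range(grid):
            rows = range(by * h // grid, (by + 1) * h // grid)
            cols = range(bx * w // grid, (bx + 1) * w // grid)
            block = [pixels[y][x] for y in rows for x in cols]
            means.append(sum(block) / len(block))
    median = sorted(means)[len(means) // 2]
    return [1 if m >= median else 0 for m in means]


def hamming(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def exact_fingerprint(pixels):
    """Exact fingerprint: any changed byte yields a different digest."""
    return hashlib.sha256(bytes(v for row in pixels for v in row)).hexdigest()


def is_fuzzy_duplicate(a, b, threshold=2):
    """Treat two images as duplicates if their hashes differ in few bits.

    The threshold is a hypothetical value, not one taken from the paper.
    """
    return hamming(block_mean_hash(a), block_mean_hash(b)) <= threshold


# Synthetic 8x8 test images: a gradient, a slightly brightened copy,
# and an inverted copy that looks clearly different.
original = [[(x + y) * 16 for x in range(8)] for y in range(8)]
brighter = [[v + 3 for v in row] for row in original]
inverted = [[255 - v for v in row] for row in original]

print(exact_fingerprint(original) == exact_fingerprint(brighter))
print(is_fuzzy_duplicate(original, brighter))
print(is_fuzzy_duplicate(original, inverted))
```

In this sketch the brightened copy has a different exact fingerprint from the original, so an exact scheme would store both, while the perceptual hashes agree and the fuzzy check treats them as one image. The inverted copy is rejected by both checks.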
Additional information
Communicated by V. Loia.
Cite this article
Li, X., Li, J. & Huang, F. A secure cloud storage system supporting privacy-preserving fuzzy deduplication. Soft Comput 20, 1437–1448 (2016). https://doi.org/10.1007/s00500-015-1596-6
DOI: https://doi.org/10.1007/s00500-015-1596-6