A secure cloud storage system supporting privacy-preserving fuzzy deduplication

  • Methodologies and Application
Soft Computing

Abstract

Deduplication is an important technology in cloud storage services. To protect user privacy, sensitive data usually have to be encrypted before outsourcing, which makes secure deduplication a challenging task. Although convergent encryption can securely eliminate duplicate copies of encrypted data, such techniques support only exact deduplication; that is, traditional deduplication schemes tolerate no differences between copies. This requirement is too strict for multimedia data such as images. For images, typical modifications such as resizing and compression change only the binary representation while preserving human visual perception, so the modified copies should also be eliminated as duplicates. Such perceptually similar images occupy a large amount of storage space on the remote server and greatly reduce the efficiency of the deduplication system. In this paper, we first formalize and solve the problem of effective fuzzy image deduplication while maintaining user privacy. Our solution eliminates duplicate images based on a measurement of image similarity over encrypted data. A robustness evaluation demonstrates that this fuzzy deduplication system is able to deduplicate perceptually similar images, which greatly reduces the storage and bandwidth overhead of the cloud storage service.
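
The gap between exact and fuzzy deduplication described above can be illustrated with a minimal sketch. It is not the scheme proposed in the paper, which measures similarity over encrypted data; this toy version works on plaintext pixels only. The function names, the 8×8 test image, and the block-mean hash used as the similarity measure are all assumptions made for illustration.

```python
import hashlib


def convergent_key(data: bytes) -> bytes:
    # Convergent encryption derives the key from the content itself,
    # so identical plaintexts always yield the same key.
    return hashlib.sha256(data).digest()


def dedup_tag(data: bytes) -> str:
    # Hypothetical tag a server could compare to spot exact duplicates:
    # here, simply a hash of the convergent key.
    return hashlib.sha256(convergent_key(data)).hexdigest()


def block_mean_hash(pixels, blocks=4):
    # Toy perceptual hash for a square grayscale image (list of rows).
    # Assumes the side length is divisible by `blocks`.
    size = len(pixels) // blocks
    means = []
    for br in range(blocks):
        for bc in range(blocks):
            total = sum(
                pixels[r][c]
                for r in range(br * size, (br + 1) * size)
                for c in range(bc * size, (bc + 1) * size)
            )
            means.append(total / (size * size))
    threshold = sum(means) / len(means)
    # One bit per block: 1 if the block is brighter than the average block.
    return [1 if m > threshold else 0 for m in means]


def hamming(a, b):
    # Number of positions at which two equal-length bit lists differ.
    return sum(x != y for x, y in zip(a, b))


def to_bytes(pixels):
    # Flatten the image into its raw byte representation.
    return bytes(p for row in pixels for p in row)


# Synthetic 8x8 gradient and a copy with small pixel-level noise,
# standing in for a recompressed version of the same picture.
original = [[4 * (8 * r + c) for c in range(8)] for r in range(8)]
noisy = [[p + (r + c) % 2 for c, p in enumerate(row)]
         for r, row in enumerate(original)]

# Exact deduplication treats the two images as unrelated files.
print(dedup_tag(to_bytes(original)) == dedup_tag(to_bytes(noisy)))  # False
# The perceptual hashes agree, so a fuzzy scheme could merge them.
print(hamming(block_mean_hash(original), block_mean_hash(noisy)))   # 0
```

In this sketch a single changed byte yields a different exact-match tag, while the block-level hash is unaffected by the noise. A practical system would compare the Hamming distance against a chosen threshold rather than require equality.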



Author information

Corresponding author

Correspondence to Xuan Li.

Additional information

Communicated by V. Loia.


About this article


Cite this article

Li, X., Li, J. & Huang, F. A secure cloud storage system supporting privacy-preserving fuzzy deduplication. Soft Comput 20, 1437–1448 (2016). https://doi.org/10.1007/s00500-015-1596-6


  • DOI: https://doi.org/10.1007/s00500-015-1596-6
