A secure framework for managing data in cloud storage using rapid asymmetric maximum based dynamic size chunking and fuzzy logic for deduplication

  • Original Paper
  • Journal: Wireless Networks

Abstract

Cloud storage is well suited to outsourcing big data because the cloud can store large volumes of data. However, it also raises problems of data duplication, fine-grained access control, and privacy, all of which are crucial for large-scale cloud data storage. Existing deduplication approaches based on encrypted data schemes do not provide fine-grained access control. This paper proposes a secure framework for managing data using rapid asymmetric maximum based dynamic size chunking and fuzzy logic for deduplication. Chunking, fingerprinting, hashing, and writing are the four main processes of the proposed method. First, files are split into chunks using Rapid Asymmetric Maximum (RAM) based Dynamic Size Chunking (DSC). The chunks are then fingerprinted using a hashing process to ensure data authentication. A B-tree indexing approach is used to keep the fingerprints in an organized state. General type-2 fuzzy logic tuned with Ant Lion Optimization (ALO) is used to detect duplicate files among the documents. Only non-duplicate documents are kept in the cloud storage platform. The Triple Data Encryption Standard (Triple DES) is used to secure the non-duplicate data before it is outsourced to a third-party cloud server. The total computation time of the proposed technique is 0.4 s in the inline phase and 0.04 s in the offline phase, and the deduplication ratio is 95% in the inline phase and 90% in the offline phase. The proposed deduplication approach requires less storage, which reduces memory use and processing time.
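
The chunk, fingerprint, index, and write stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, since the abstract does not give the exact cut-point rule or parameters. The sketch assumes a simplified asymmetric-maximum rule (take the largest byte in a fixed window at the start of the chunk, then cut at the first later byte that is at least as large), SHA-256 as the fingerprint hash, and a plain dictionary standing in for the B-tree index. Exact fingerprint matching replaces the fuzzy-logic duplicate detector, and encryption is omitted. The names `ram_style_chunks`, `deduplicate`, and `restore`, and the `window` and `max_size` values, are hypothetical.

```python
import hashlib


def ram_style_chunks(data: bytes, window: int = 16, max_size: int = 256) -> list:
    """Split data into variable-size chunks with a simplified
    asymmetric-maximum rule (an assumption, not the paper's exact rule)."""
    chunks = []
    start, n = 0, len(data)
    while start < n:
        fixed_end = min(start + window, n)
        if fixed_end == n:
            # Remaining bytes fit inside the fixed window: emit the tail.
            chunks.append(data[start:n])
            break
        # Largest byte value seen in the fixed window.
        threshold = max(data[start:fixed_end])
        # Hard cap on chunk size (assumed; bounds the scan on flat data).
        limit = min(start + max_size, n)
        cut = limit
        for i in range(fixed_end, limit):
            if data[i] >= threshold:
                cut = i + 1  # the boundary byte is included in the chunk
                break
        chunks.append(data[start:cut])
        start = cut
    return chunks


def fingerprint(chunk: bytes) -> str:
    """Fingerprint a chunk; SHA-256 is an assumed choice of hash."""
    return hashlib.sha256(chunk).hexdigest()


def deduplicate(files: dict):
    """Store each distinct chunk once and keep a per-file recipe.

    `store` maps fingerprint -> chunk (a dict stands in for the B-tree);
    `recipes` maps file name -> ordered list of fingerprints.
    """
    store, recipes = {}, {}
    for name, data in files.items():
        recipe = []
        for chunk in ram_style_chunks(data):
            fp = fingerprint(chunk)
            store.setdefault(fp, chunk)  # write only unseen chunks
            recipe.append(fp)
        recipes[name] = recipe
    return store, recipes


def restore(name: str, store: dict, recipes: dict) -> bytes:
    """Rebuild a file from its recipe."""
    return b"".join(store[fp] for fp in recipes[name])
```

Because cut points depend only on the content that follows the previous boundary, two identical files yield identical chunk sequences, so the second file adds no new entries to the store and is recorded only as a recipe of existing fingerprints.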

Data availability statement

All data, models, and code generated or used during the study appear in the submitted article; no data needs to be specifically requested.

Code availability

No code is available for this manuscript.

Funding

There is no funding provided to prepare the manuscript.

Author information

Corresponding author

Correspondence to K. Rajkumar.

Ethics declarations

Conflict of interest

The process of writing and the content of the article do not give grounds for raising the issue of a conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Consent to participate

I have read and I understand the provided information.

Consent to publish

This article does not contain any image or video that requires permission to publish.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Rajkumar, K., Hariharan, U., Dhanakoti, V. et al. A secure framework for managing data in cloud storage using rapid asymmetric maximum based dynamic size chunking and fuzzy logic for deduplication. Wireless Netw 30, 321–334 (2024). https://doi.org/10.1007/s11276-023-03448-9

  • DOI: https://doi.org/10.1007/s11276-023-03448-9
