Semantic and secure search over encrypted outsourcing cloud based on BERT

  • Research Article
  • Frontiers of Computer Science

Abstract

Searchable encryption is an effective way to protect data security and privacy in cloud storage: users can retrieve encrypted data from the cloud while keeping their data secure and private. However, most current content-based retrieval schemes do not capture enough of the semantic information in the text. In this paper, we propose two secure and semantic retrieval schemes based on BERT (bidirectional encoder representations from transformers), named SSRB-1 and SSRB-2. By training on the documents with BERT, the schemes generate keyword vectors that contain more semantic information about the documents, which improves retrieval accuracy and makes the results more consistent with the user’s intention. Finally, tests on real data sets show that both schemes are feasible and effective.
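As a rough illustration of why semantic vectors help retrieval, the sketch below ranks documents by cosine similarity between a query vector and per-document vectors. It is not the SSRB-1 or SSRB-2 construction: the vectors are hypothetical three-dimensional stand-ins for BERT-derived keyword vectors, the names `cosine`, `rank`, `doc_vecs`, and `query_vec` are assumptions made for this example, and the encryption layer is omitted entirely.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank(query_vec, doc_vecs, top_k=2):
    """Return the ids of the top_k documents most similar to the query."""
    scored = sorted(
        doc_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in scored[:top_k]]


# Hypothetical toy vectors; a real system would derive these from BERT.
doc_vecs = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.1, 0.9, 0.2],
    "doc_c": [0.7, 0.3, 0.1],
}
query_vec = [1.0, 0.2, 0.0]

print(rank(query_vec, doc_vecs))
```

Here `doc_a` and `doc_c` point in nearly the same direction as the query and are returned first, while `doc_b` ranks last. In a secure scheme, the index and query vectors would be encrypted so that the server can still compute this ranking without seeing the plaintext vectors.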

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Nos. U1836110 and U1836208) and by the Jiangsu Basic Research Programs-Natural Science Foundation (Grant No. BK20200039).

Author information

Corresponding author

Correspondence to Zhangjie Fu.

Additional information

Zhangjie Fu received his PhD in computer science from the College of Computer, Hunan University, China in 2012. He is currently a Professor at the School of Computer and Software, Nanjing University of Information Science and Technology, China. He was a visiting scholar in Computer Science and Engineering at the State University of New York at Buffalo, USA from March 2015 to March 2016. His research interests include cloud and outsourcing security, digital forensics, and network and information security. His research has been supported by NSFC, PAPD, and GYHY. He is a member of IEEE and ACM.

Yan Wang received his BE in Internet of Things Engineering from Nanjing University of Information Science and Technology, China in 2018. He is currently pursuing his MS in computer science and technology at the Department of Computer and Software, Nanjing University of Information Science and Technology, China. His research interests include cloud security and information security.

Xingming Sun received the BS degree in mathematics from Hunan Normal University, China in 1984, the MS degree in computing science from the Dalian University of Science and Technology, China in 1988, and the PhD degree in computer science from Fudan University, China in 2001. He is currently a Professor with the School of Computer and Software, Nanjing University of Information Science and Technology, China. His research interests include network and information security, digital watermarking, cloud computing security, and wireless network security.

Xiaosong Zhang received the MS and PhD degrees in computer science from the University of Electronic Science and Technology of China (UESTC), China in 1999 and 2011, respectively. He is currently a Professor with the School of Computer Science and Engineering, University of Electronic Science and Technology of China, China. His research interests include blockchain, big data security, and AI security.

Cite this article

Fu, Z., Wang, Y., Sun, X. et al. Semantic and secure search over encrypted outsourcing cloud based on BERT. Front. Comput. Sci. 16, 162802 (2022). https://doi.org/10.1007/s11704-021-0277-0
