A novel semantic-aware search scheme based on BCI-tree index over encrypted cloud data

World Wide Web

Abstract

Most traditional privacy-preserving search schemes for the cloud adopt the TF-IDF model, which is based on keyword frequency statistics. The embedded semantic associations between keywords and documents are not considered. To solve this problem, we propose a novel semantic-aware search scheme based on a BCI-tree index over encrypted cloud data. The LDA model is adopted to generate vectors for documents and queried keywords, and these vectors contain topic-based semantic information. Homomorphic encryption of the vectors is used to compute the semantic relevance scores between queried keywords and documents in a privacy-preserving manner. To achieve efficient search processing, a novel binary clustering tree-based index (BCI-tree index) is designed, which is constructed following the divisive hierarchical clustering algorithm. Based on the BCI-tree index, a depth-first recursive search algorithm is proposed. In addition, a threshold presetting-based optimization is applied to further accelerate the search. The experimental results show that the proposed scheme performs well in terms of both the semantic precision of search results and the search time cost.
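
To make the index and search ideas in the abstract concrete, the following is a minimal plaintext sketch, not the authors' construction. It assumes documents and queries are already non-negative topic vectors (for example, LDA topic distributions) and scores relevance by an inner product. It omits the homomorphic encryption layer entirely. The names `build`, `search`, `bound`, and `threshold`, the farthest-pair splitting rule, and the element-wise-max pruning vector are all illustrative assumptions. The paper's actual clustering criterion, node contents, and threshold presetting rule are not given in this preview.

```python
import heapq
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def sq_dist(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


@dataclass
class Node:
    bound: Vector                  # element-wise max of all vectors in the subtree
    doc_id: Optional[int] = None   # set only on leaves
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(items: List[Tuple[int, Vector]]) -> Node:
    """Divisively split (doc_id, vector) pairs into a binary tree (assumed rule)."""
    if len(items) == 1:
        doc_id, vec = items[0]
        return Node(bound=list(vec), doc_id=doc_id)
    # Pick two far-apart seeds, then assign every vector to the nearer seed.
    seed_b = max(items, key=lambda it: sq_dist(it[1], items[0][1]))[1]
    seed_a = max(items, key=lambda it: sq_dist(it[1], seed_b))[1]
    left = [it for it in items if sq_dist(it[1], seed_a) <= sq_dist(it[1], seed_b)]
    right = [it for it in items if sq_dist(it[1], seed_a) > sq_dist(it[1], seed_b)]
    if not right:  # all vectors identical: fall back to an even split
        mid = len(items) // 2
        left, right = items[:mid], items[mid:]
    l, r = build(left), build(right)
    return Node(bound=[max(x, y) for x, y in zip(l.bound, r.bound)], left=l, right=r)


def search(root: Node, query: Vector, k: int,
           threshold: float = 0.0) -> List[Tuple[int, float]]:
    """Depth-first top-k search; `threshold` is a preset minimum score."""
    if k < 1:
        return []
    heap: List[Tuple[float, int]] = []  # min-heap of (score, doc_id)

    def visit(node: Node) -> None:
        # Until k results exist, the preset threshold is the pruning cutoff.
        cutoff = heap[0][0] if len(heap) == k else threshold
        upper = dot(node.bound, query)  # upper bound on any score below node
        if upper < cutoff:
            return
        if node.doc_id is not None:     # leaf: the bound is the exact score
            if len(heap) < k:
                heapq.heappush(heap, (upper, node.doc_id))
            elif upper > heap[0][0]:
                heapq.heapreplace(heap, (upper, node.doc_id))
            return
        # Visit the more promising child first so the cutoff rises sooner.
        for child in sorted((node.left, node.right),
                            key=lambda c: dot(c.bound, query), reverse=True):
            visit(child)

    visit(root)
    return [(doc_id, score) for score, doc_id in sorted(heap, reverse=True)]


docs = [[0.9, 0.1, 0.0], [0.8, 0.1, 0.1], [0.1, 0.8, 0.1],
        [0.0, 0.9, 0.1], [0.1, 0.1, 0.8], [0.2, 0.2, 0.6]]
query = [0.7, 0.0, 0.3]
root = build(list(enumerate(docs)))
top2 = search(root, query, k=2)
```

The pruning step is only sound under the non-negativity assumption: when every query component is non-negative, the inner product with the element-wise maximum can never be smaller than the inner product with any vector in the subtree. A preset threshold lets the search discard subtrees before it has collected k candidates, which is one plausible reading of the threshold presetting-based optimization mentioned above.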

Data Availability

Details of the datasets have been described in Section 8.

Acknowledgements

Not applicable.

Funding

This work was supported by the National Natural Science Foundation of China under grant Nos. 61902199, 61872197, and 61972209, and by the Jiangsu Province Postgraduate Scientific Research Innovation Program under grant No. KYCX22_0984.

Author information

Contributions

The authors contributed to this work as follows: Qian Zhou and Hua Dai contributed to the conception of the study; Qian Zhou and Yuanlong Liu performed the experiments; Qian Zhou and Zheng Hu contributed significantly to the security analysis and manuscript preparation; Qian Zhou and Hua Dai performed the data analyses and wrote the manuscript; Geng Yang and Xun Yi helped with the analysis through constructive discussions.

Corresponding author

Correspondence to Qian Zhou.

Ethics declarations

Ethical Approval and Consent to participate

This manuscript is original and has not been submitted to multiple journals for simultaneous consideration. All authors agree with the content of the article.

Human and Animal Ethics

Not applicable.

Consent for publication

Our manuscript is approved by all authors for publication.

Competing interests

No conflict of interest exists in the submission of this manuscript, and the manuscript is approved by all authors for publication.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhou, Q., Dai, H., Liu, Y. et al. A novel semantic-aware search scheme based on BCI-tree index over encrypted cloud data. World Wide Web 26, 3055–3079 (2023). https://doi.org/10.1007/s11280-023-01176-w

  • DOI: https://doi.org/10.1007/s11280-023-01176-w
