
An efficient distributed search solution for federated cloud

Distributed and Parallel Databases

Abstract

A federated cloud enables data sharing across different organizations. This paper proposes a distributed search index (DS-index) solution that provides distributed search capability in a federated cloud, aimed at supporting multi-attribute range queries with secondary indexes. DS-index uses a three-layer architecture consisting of a multi-attribute index overlay, a tree-based P2P network layer, and a federated cloud layer. We present a distributed search algorithm for this three-layer architecture. To facilitate distributed search, we propose a dynamic mapping algorithm that maps between the layers of DS-index. In addition, we define a Markov chain-based cost model to guide node selection and mapping. With the dynamic mapping algorithm and the cost model, DS-index is more cost-effective than traditional P2P networks because it reduces network cost, in terms of both index maintenance cost and node selection cost. Experiments demonstrate that DS-index with its cost model saves computation resources and reduces network bandwidth consumption by around 30% compared with the same solution without the cost model. It also reduces the number of node splits and merges by around 20% compared with the state-of-the-art solution.
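To make the term "multi-attribute range query" concrete, the following is a minimal sketch and not the DS-index algorithm itself, which this preview does not describe. It assumes each index node owns a rectangular region of the attribute space, and it uses a flat scan over nodes as a stand-in for routing through a tree-based overlay. The `Node` class, the `range_query` function, and the attribute names `temp` and `humidity` are all hypothetical.

```python
from dataclasses import dataclass, field

Range = tuple[float, float]  # closed interval [low, high]


@dataclass
class Node:
    """Hypothetical index node owning a rectangular region of attribute space."""
    name: str
    region: dict[str, Range]
    records: list[dict[str, float]] = field(default_factory=list)


def overlaps(a: Range, b: Range) -> bool:
    """True if the two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def range_query(nodes, query):
    """Return (names of visited nodes, matching records) for a range query.

    `query` maps each attribute name to the closed interval it must fall in.
    """
    visited, hits = [], []
    for node in nodes:
        # Prune nodes whose region misses the query on any attribute.
        if not all(overlaps(node.region[attr], rng) for attr, rng in query.items()):
            continue
        visited.append(node.name)
        for rec in node.records:
            if all(lo <= rec[attr] <= hi for attr, (lo, hi) in query.items()):
                hits.append(rec)
    return visited, hits


# Three assumed nodes partitioning a two-attribute space.
nodes = [
    Node("n1", {"temp": (0, 50), "humidity": (0, 50)}, [{"temp": 20, "humidity": 30}]),
    Node("n2", {"temp": (50, 100), "humidity": (0, 50)}, [{"temp": 70, "humidity": 10}]),
    Node("n3", {"temp": (0, 50), "humidity": (50, 100)}, [{"temp": 40, "humidity": 80}]),
]

visited, hits = range_query(nodes, {"temp": (10, 45), "humidity": (60, 90)})
```

In this toy setup only `n3` is visited, because the regions of `n1` and `n2` each miss the query on one attribute. Pruning of this kind is what lets a range index avoid contacting every node.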





Author information

Corresponding author

Correspondence to Yongqing Zhu.


About this article


Cite this article

Zhu, Y., Xu, Q., Shi, H. et al. An efficient distributed search solution for federated cloud. Distrib Parallel Databases 35, 411–433 (2017). https://doi.org/10.1007/s10619-017-7201-5


  • DOI: https://doi.org/10.1007/s10619-017-7201-5
