Multi-dimensional Index over a Key-Value Store for Semi-structured Data

  • Conference paper
Big Scientific Data Management (BigSDM 2018)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11473)

Abstract

Loosely defined schemas and data volumes on the order of trillions of records make it challenging for databases to store and retrieve semi-structured data. Most researchers address these issues with R-trees, KD-trees and space-filling curves, but these structures are not well suited to the default and discrete values found in semi-structured data, and some even require sampling before storage. We present MD-Index, a scalable multi-dimensional indexing system that supports high-throughput, real-time range queries. MD-Index builds a bitmap index of sliced data over a range-partitioned key-value store. The underlying key-value store provides high throughput, large storage capacity, high availability and fault tolerance, while the bitmaps provide the multi-dimensional index over the data. In addition, MD-Index encodes the discrete values as the hash code of a slice, and stores the data and the bitmap of a slice in the same region (a storage unit of the range-partitioned key-value store) to take advantage of distributed computing and data locality. Our prototype of MD-Index is built on HBase, a widely used key-value database. Experimental results show that MD-Index can store and retrieve trillions of semi-structured records and achieve a throughput of two million records per second.
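The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of a per-slice bitmap index, not the authors' design. It assumes a fixed slice size, keeps each slice's records next to its bitmaps as a stand-in for storing both in the same region, and uses Python integers as bitsets. The names `SLICE_SIZE`, `SliceIndex` and `ToyIndex` are hypothetical, and the hash encoding of discrete values and the HBase layer are omitted.

```python
from collections import defaultdict

SLICE_SIZE = 4  # records per slice (hypothetical; tiny for illustration)


class SliceIndex:
    """One slice: its records plus one bitmap per (field, value) pair."""

    def __init__(self):
        self.bitmaps = defaultdict(int)  # (field, value) -> int used as a bitset
        self.rows = []  # records kept alongside the bitmaps (same "region")

    def add(self, record):
        pos = len(self.rows)
        self.rows.append(record)
        # A field that is absent from the record simply sets no bit.
        for field, value in record.items():
            self.bitmaps[(field, value)] |= 1 << pos

    def query(self, predicates):
        """predicates maps each field to an iterable of accepted values."""
        result = (1 << len(self.rows)) - 1  # start with every row selected
        for field, allowed in predicates.items():
            field_bits = 0
            for value in allowed:  # OR within one dimension
                field_bits |= self.bitmaps.get((field, value), 0)
            result &= field_bits  # AND across dimensions
        return [row for i, row in enumerate(self.rows) if (result >> i) & 1]


class ToyIndex:
    """Appends records to fixed-size slices and queries each slice in turn."""

    def __init__(self):
        self.slices = []

    def put(self, record):
        if not self.slices or len(self.slices[-1].rows) >= SLICE_SIZE:
            self.slices.append(SliceIndex())
        self.slices[-1].add(record)

    def query(self, predicates):
        matches = []
        for s in self.slices:  # in a distributed store, slices could be scanned in parallel
            matches.extend(s.query(predicates))
        return matches


index = ToyIndex()
for rec in [
    {"city": "A", "temp": 10},
    {"city": "B"},              # semi-structured: no "temp" field
    {"city": "A", "temp": 12},
    {"temp": 10},               # no "city" field
    {"city": "A", "temp": 30},
]:
    index.put(rec)

# Range predicate on "temp" (10 <= temp < 13) combined with an equality on "city".
hits = index.query({"city": {"A"}, "temp": range(10, 13)})
```

In this sketch a range over a discrete field is expressed as the union of the bitmaps of the values inside the range, and records with missing fields need no special handling because they never set a bit for that field.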

Supported by 2016YFB1000604.

Author information

Corresponding author

Correspondence to Xin Gao.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Gao, X., Qi, Y., Hou, D. (2019). Multi-dimensional Index over a Key-Value Store for Semi-structured Data. In: Li, J., Meng, X., Zhang, Y., Cui, W., Du, Z. (eds) Big Scientific Data Management. BigSDM 2018. Lecture Notes in Computer Science, vol 11473. Springer, Cham. https://doi.org/10.1007/978-3-030-28061-1_18

  • DOI: https://doi.org/10.1007/978-3-030-28061-1_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-28060-4

  • Online ISBN: 978-3-030-28061-1

  • eBook Packages: Computer Science (R0)
