Pipelined Query Processing Using Non-volatile Memory SSDs

Part of the Lecture Notes in Computer Science book series (LNISA, volume 12318)

Abstract

NVM Optane SSDs are faster than traditional flash-based SSDs and cheaper than DRAM main memory, so we explore query processing with the inverted index stored on NVM to reduce cost. However, this introduces NVM-to-DRAM I/O, which hurts the search engine’s responsiveness. To alleviate this problem, we propose a pipelining scheme that overlaps CPU computation with NVM-to-DRAM I/O, together with three further optimizations: a variable coalesced block size, data prefetching, and block skipping.

Experiments on the Gov2 and ClueWeb document corpora show that, for Maxscore, Wand, and BlockMaxWand queries, pipelining reduces the CPU waiting time caused by NVM-to-DRAM I/O by around 85% compared with the non-pipelined scheme, while keeping query throughput within 6% of an in-memory (DRAM-based) inverted index. For RankAnd queries, caching just 3% of the inverted index in memory achieves query efficiency within 6% of the DRAM-based scheme.
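As a rough illustration of the pipelining idea described above (a minimal sketch, not the authors’ implementation), the double-buffered loop below overlaps the NVM-to-DRAM read of the next coalesced block with the decoding and scoring of the current one. The index file name, the block size, and the decode_and_score() placeholder are assumptions made for this example.

    // Minimal double-buffering sketch (C++17): while the CPU decodes and scores
    // block i, a background task reads block i+1 from the NVM-resident index.
    // "inverted_index.bin", kBlockBytes, and decode_and_score() are illustrative
    // assumptions, not names taken from the paper.
    #include <cstdint>
    #include <cstdio>
    #include <future>
    #include <vector>

    static constexpr size_t kBlockBytes = 1 << 20;  // assumed coalesced block size (1 MiB)

    // Read the next coalesced block of postings from the (NVM-backed) index file.
    static std::vector<uint8_t> read_block(FILE* f) {
        std::vector<uint8_t> buf(kBlockBytes);
        buf.resize(std::fread(buf.data(), 1, buf.size(), f));
        return buf;  // an empty vector signals end of file
    }

    // Placeholder for decompressing and scoring one block of postings.
    static void decode_and_score(const std::vector<uint8_t>& block) {
        volatile uint64_t sum = 0;
        for (uint8_t b : block) sum += b;  // stand-in for real CPU work
    }

    int main() {
        FILE* f = std::fopen("inverted_index.bin", "rb");  // hypothetical on-NVM index file
        if (!f) return 1;

        // Prime the pipeline with the first read.
        auto pending = std::async(std::launch::async, read_block, f);
        for (;;) {
            std::vector<uint8_t> current = pending.get();             // wait for I/O of block i
            if (current.empty()) break;                               // end of posting list
            pending = std::async(std::launch::async, read_block, f);  // start I/O for block i+1
            decode_and_score(current);                                // CPU work overlaps that I/O
        }
        std::fclose(f);
        return 0;
    }

The same loop structure leaves room for the further optimizations listed above: the coalesced block size could be varied per posting list, and a block can be skipped before its read is ever issued.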

Keywords

  • NVM Optane SSD
  • Query processing
  • Pipeline
  • Prefetch

This work is partially supported by the National Natural Science Foundation of China (61872201, 61702521, U1833114) and the Science and Technology Development Plan of Tianjin (17JCYBJC15300, 18ZXZNGX00140, 18ZXZNGX00200).

Author information

Correspondence to Gang Wang or Xiaoguang Liu.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Liu, X. et al. (2020). Pipelined Query Processing Using Non-volatile Memory SSDs. In: Wang, X., Zhang, R., Lee, Y.K., Sun, L., Moon, Y.S. (eds) Web and Big Data. APWeb-WAIM 2020. Lecture Notes in Computer Science, vol 12318. Springer, Cham. https://doi.org/10.1007/978-3-030-60290-1_35

  • DOI: https://doi.org/10.1007/978-3-030-60290-1_35

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-60289-5

  • Online ISBN: 978-3-030-60290-1
