
Indexing in flash storage devices: a survey on challenges, current approaches, and future trends

  • Special Issue Paper
  • The VLDB Journal

Abstract

Indexes are special-purpose data structures designed to facilitate and speed up access to the contents of a file. Indexing has been actively and extensively investigated in DBMSes equipped with hard disk drives (HDDs). In recent years, solid-state drives (SSDs) based on NAND flash technology have started replacing magnetic disks due to their appealing characteristics: high throughput and low latency, shock resistance, absence of mechanical parts, and low power consumption. However, treating SSDs as simply another category of block devices ignores their idiosyncrasies, such as erase-before-write, wear-out, and read/write asymmetry, and may lead to poor performance. These peculiarities of SSDs dictate the refactoring or even the reinvention of indexing techniques that were designed primarily for HDDs. In this work, we present a concise overview of SSD technology and the challenges it poses. We broadly survey 62 flash-aware indexes for various data types, analyze the main techniques they employ, and comment on their main advantages and disadvantages, aiming to provide a systematic and valuable resource for researchers working on algorithm design and index development for SSDs. Additionally, we discuss future trends and new lines of research related to this field.
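
To make the erase-before-write and wear-out idiosyncrasies concrete, the following is a minimal toy sketch and not a model taken from the survey. It assumes a hypothetical `FlashBlock` with only four pages (real NAND blocks hold far more) and compares an HDD-style in-place update with a log-style out-of-place append. All names and constants are illustrative assumptions.

```python
from dataclasses import dataclass, field

PAGES_PER_BLOCK = 4  # assumption for illustration; real blocks are much larger


@dataclass
class FlashBlock:
    """Toy NAND block: a page must be erased before it can be programmed again."""
    pages: list = field(default_factory=lambda: [None] * PAGES_PER_BLOCK)
    erase_count: int = 0  # crude proxy for wear-out

    def program(self, page_no, data):
        # Erase-before-write: a programmed page cannot be overwritten directly.
        if self.pages[page_no] is not None:
            raise ValueError("page already programmed; erase the block first")
        self.pages[page_no] = data

    def erase(self):
        # Erasure works on the whole block, not on a single page.
        self.pages = [None] * PAGES_PER_BLOCK
        self.erase_count += 1


def update_in_place(block, page_no, data):
    """HDD-style update: copy live pages out, erase the block, rewrite them."""
    live = list(block.pages)
    live[page_no] = data
    block.erase()
    for i, d in enumerate(live):
        if d is not None:
            block.program(i, d)


def append_out_of_place(block, next_free, data):
    """Log-style update: program the next erased page; the old version goes stale."""
    block.program(next_free, data)
    return next_free + 1


# Write one record, then update it three times under each strategy.
hdd_style = FlashBlock()
hdd_style.program(0, "v0")
for v in range(1, 4):
    update_in_place(hdd_style, 0, f"v{v}")

log_style = FlashBlock()
nxt = append_out_of_place(log_style, 0, "v0")
for v in range(1, 4):
    nxt = append_out_of_place(log_style, nxt, f"v{v}")

print(hdd_style.erase_count, log_style.erase_count)  # prints: 3 0
```

In this sketch the in-place strategy pays one block erase per update, while the append strategy defers erasure until the block fills up and stale pages have to be reclaimed. The sketch leaves that reclamation step (garbage collection) out entirely.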





References

  1. Aggarwal, A., Vitter, J., et al.: The input/output complexity of sorting and related problems. Commun. ACM 31(9), 1116–1127 (1988)

    MathSciNet  Google Scholar 

  2. Agrawal, D., Ganesan, D., Sitaraman, R., Diao, Y., Singh, S.: Lazy-adaptive tree: an optimized index structure for flash devices. Proc. VLDB Endow. 2(1), 361–372 (2009)

    Google Scholar 

  3. Agrawal, N., Prabhakaran, V., Wobber, T., Davis, J.D., Manasse, M.S., Panigrahy, R.: Design tradeoffs for SSD performance. In: Proceedings of the USENIX Annual Technical Conference (ATC), Boston, MA, pp. 57–70 (2008)

  4. Ajwani, D., Beckmann, A., Jacob, R., Meyer, U., Moruz, G.: On computational models for flash memory devices. In: Proceedings of the 8th International Symposium on Experimental Algorithms (SEA), Dortmund, Germany, pp. 16–27 (2009)

  5. Ajwani, D., Malinger, I., Meyer, U., Toledo, S.: Characterizing the performance of flash memory storage devices and its impact on algorithm design. In: Proceedings of the 7th International Workshop on Experimental Algorithms (WEA), Provincetown, MA, pp. 208–219 (2008)

  6. Andersson, A., Nilsson, S.: Efficient implementation of suffix trees. Softw. Pract. Exp. 25(2), 129–141 (1995)

    Google Scholar 

  7. Athanassoulis, M., Ailamaki, A.: BF-tree: approximate tree indexing. Proc. VLDB Endow. 7(14), 1881–1892 (2014)

    Google Scholar 

  8. Barbalace, A., Iliopoulos, A., Rauchfuss, H., Brasche, G.: It’s time to think about an operating system for near data processing architectures. In: Proceedings of the 16th Workshop on Hot Topics in Operating Systems (HotOS), Whistler, Canada, pp. 56–61 (2017)

  9. Bayer, R., McCreight, E.M.: Organization and maintenance of large ordered indexes. Acta Inform. 1(3), 173–189 (1972)

    MATH  Google Scholar 

  10. Beckmann, N., Kriegel, H.P., Schneider, R., Seeger, B.: The R*-tree: an efficient and robust access method for points and rectangles. In: Proceedings of the ACM International Conference on Management of Data (SIGMOD), Atlantic City, NJ, pp. 322–331 (1990)

  11. Bender, M.A., Farach-Colton, M., Johnson, R., Mauras, S., Mayer, T., Phillips, C.A., Xu, H.: Write-optimized skip lists. In: Proceedings of the 36th ACM Symposium on Principles of Database Systems (PODS), Chicago, IL, pp. 69–78 (2017)

  12. Bentley, J.L., Saxe, J.B.: Decomposable searching problems, I. Static-to-dynamic transformation. J. Algorithms 1(4), 301–358 (1980)

    MathSciNet  MATH  Google Scholar 

  13. Bityuckiy, A.B.: JFFS3 design issues. Technical report, Memory technology device (MTD) subsystem for Linux (2005)

  14. Bjørling, M., González, J., Bonnet, P.: LightNVM: the Linux Open-Channel SSD subsystem. In: Proceedings of the 15th USENIX Conference on File & Storage Technologies (FAST), Santa Clara, CA, pp. 359–374 (2017)

  15. Blelloch, G.E., Fineman, J.T., Gibbons, P.B., Gu, Y., Shun, J.: Efficient algorithms with asymmetric read and write costs. In: Proceedings of the 24th Annual European Symposium on Algorithms (ESA), Schloss Dagstuhl, Germany (2016)

  16. Bloom, B.: Space/time trade-offs in hash coding with allowable errors. Commun. ACM 13(7), 422–426 (1970)

    MATH  Google Scholar 

  17. Bonnet, P.: What’s up with the storage hierarchy? In: Proceedings of the 8th Biennial Conference on Innovative Data Systems Research (CIDR), Chaminade, CA (2017)

  18. Bouganim, L., Jónsson, B., Bonnet, P.: uFLIP: understanding flash IO patterns (2009). arXiv preprint arXiv:0909.1780

  19. Byun, S., Huh, M., Hwang, H.: An index rewriting scheme using compression for flash memory database systems. J. Inf. Sci. 33(4), 398–415 (2007)

    Google Scholar 

  20. Cai, Y., Ghose, S., Haratsch, E.F., Luo, Y., Mutlu, O.: Error characterization, mitigation, and recovery in flash-memory-based solid-state drives. Proc. IEEE 105(9), 1666–1704 (2017)

    Google Scholar 

  21. Canim, M., Lang, C.A., Mihaila, G.A., Ross, K.A.: Buffered Bloom filters on solid state storage. In: Proceedings of the 1st International Workshop on Accelerating Data Management Systems Using Modern Processor & Storage Architectures (ADMS), Singapore, pp. 1–8 (2010)

  22. Cao, Z., Zhou, S., Li, K., Liu, Y.: Flashsearch: document searching in small mobile device. In: Proceedings of the International Seminar on Business & Information Management, Wuhan, China, pp. 79–82 (2008)

  23. Carniel, A.C., Ciferri, R.R., de Aguiar Ciferri, C.D.: The performance relation of spatial indexing on hard disk drives and solid state drives. In: Proceedings of the XVII Brazilian Symposium on Geoinformatics (GeoInfo), Campos do Jordão, SP, Brazil, pp. 263–274 (2016)

  24. Carniel, A.C., Ciferri, R.R., de Aguiar Ciferri, C.D.: Analyzing the performance of spatial indices on hard disk drives and flash-based solid state drives. J. Inf. Data Manag. 8(1), 34 (2017)

    Google Scholar 

  25. Carniel, A.C., Ciferri, R.R., de Aguiar Ciferri, C.D.: A generic and efficient framework for spatial indexing on flash-based solid state drives. In: Proceedings of the 21st European Conference on Advances in Databases & Information Systems (ADBIS), Nicosia, Cyprus, pp. 229–243 (2017)

  26. Carniel, A.C., Ciferri, R.R., Ciferri, C.D.: A generic and efficient framework for flash-aware spatial indexing. Inf. Syst. 82, 102–120 (2019)

    Google Scholar 

  27. Carniel, A.C., Roumelis, G., Ciferri, R.R., Vassilakopoulos, M., Corral, A., Cifferi, C.D.d.A.: An efficient flash-aware spatial index for points. In: Proceedings of the XIX Brazilian Symposium on Geoinformatics (GEOINFO), Campina Grande, Brazil, pp. 65–79 (2018)

  28. Chakrabarti, D.R., Boehm, H.J., Bhandari, K.: Atlas: leveraging locks for non-volatile memory consistency. ACM SIGPLAN Not. 49(10), 433–452 (2014)

    Google Scholar 

  29. Chazelle, B., Guibas, L.J.: Fractional cascading: a data structuring technique. Algorithmica 1(1–4), 133–162 (1986)

    MathSciNet  MATH  Google Scholar 

  30. Chen, F., Hou, B., Lee, R.: Internal parallelism of flash memory-based solid-state drives. ACM Trans. Storage 12(3), 13 (2016)

    Google Scholar 

  31. Chen, F., Koufaty, D.A., Zhang, X.: Understanding intrinsic characteristics and system implications of flash memory based solid state drives. In: Proceedings of the 11th International Joint Conference on Measurement & Modeling of Computer Systems (SIGMETRICS/Performance), Seattle, WA, pp. 181–192 (2009)

  32. Cho, S., Chang, S., Jo, I.: The solid-state drive technology, today and tomorrow. In: Proceedings of the 31st IEEE International Conference on Data Engineering (ICDE), Seoul, Korea, pp. 1520–1522 (2015)

  33. Cho, S., Park, C., Oh, H., Kim, S., Yi, Y., Ganger, G.R.: Active disk meets flash: a case for intelligent SSDs. In: Proceedings of the 27th ACM International Conference on Supercomputing (ICS), Eugene, OR, pp. 91–102 (2013)

  34. Choi, W.G., Shin, M., Lee, D., Park, H., Park, S.: Optimization of a multiversion index on SSDs to improve system performance. In: Proceedings of the IEEE International Conference on Systems, Man & Cybernetics (SMC), Budapest, Hungary, pp. 1620–1625 (2016)

  35. Chowdhury, N.M.M.K., Akbar, M.M., Kaykobad, M.: DiskTrie: an efficient data structure using flash memory for mobile devices. In: Proceedings of the 1st Workshop on Algorithms & Computation (WALCOM), Dhaka, Bangladesh, pp. 76–87 (2007)

  36. Comer, D.: The ubiquitous B-tree. ACM Comput. Surv. 11(2), 121–137 (1979)

    MathSciNet  MATH  Google Scholar 

  37. Cornwell, M.: Anatomy of a solid-state drive. Commun. ACM 55(12), 59–63 (2012)

    Google Scholar 

  38. Cui, K., Jin, P., Yue, L.: HashTree: a new hybrid index for flash disks. In: Proceedings of the 12th International Asia-Pacific Web Conference (APWeb), Busan, Korea, pp. 45–51 (2010)

  39. Dai, H., Neufeld, M., Han, R.: Elf: an efficient log-structured flash file system for micro sensor nodes. In: Proceedings of the 2nd International Conference on Embedded Networked Sensor Systems (SenSys), Baltimore, MD, pp. 176–187 (2004)

  40. Debnath, B., Sengupta, S., Li, J.: SkimpyStash: RAM space skimpy key-value store on flash-based storage. In: Proceedings of the ACM International Conference on Management of Data (SIGMOD), Athens, Greece, pp. 25–36 (2011)

  41. Debnath, B., Sengupta, S., Li, J., Lilja, D.J., Du, D.H.: BloomFlash: bloom filter on flash-based storage. In: Proceedings of the 31st International Conference on Distributed Computing Systems (ICDCS), Minneapolis, MN, pp. 635–644 (2011)

  42. Driscoll, J.R., Sarnak, N., Sleator, D.D., Tarjan, R.E.: Making data structures persistent. J. Comput. Syst. Sci. 38(1), 86–124 (1989)

    MathSciNet  MATH  Google Scholar 

  43. Engel, J., Mertens, R.: LogFS-finally a scalable flash file system. In: Proceedings of the 12th International Linux System Technology Conference, Hamburg, Germany (2005)

  44. Fagin, R., Nievergelt, J., Pippenger, N., Strong, H.R.: Extendible hashing: a fast access method for dynamic files. ACM Trans. Database Syst. 4(3), 315–344 (1979)

    Google Scholar 

  45. Fang, H.W., Yeh, M.Y., Suei, P.L., Kuo, T.W.: An adaptive endurance-aware B\(^+\)-tree for flash memory storage systems. IEEE Trans. Comput. 63(11), 2661–2673 (2014)

    MathSciNet  MATH  Google Scholar 

  46. Fevgas, A., Bozanis, P.: Grid-file: towards to a flash efficient multi-dimensional index. In: Proceedings of the 29th International Conference on Database & Expert Systems Applications (DEXA), Regensburg, Germany, vol. II, pp. 285–294 (2015)

  47. Fevgas, A., Bozanis, P.: LB-Grid: an SSD efficient grid file. Data Knowl. Eng. 121, 18–41 (2019)

    Google Scholar 

  48. Finkel, R.A., Bentley, J.L.: Quad trees a data structure for retrieval on composite keys. Acta Inform. 4(1), 1–9 (1974)

    MATH  Google Scholar 

  49. Gaede, V., Günther, O.: Multidimensional access methods. ACM Comput. Surv. 30(2), 170–231 (1998)

    Google Scholar 

  50. Gao, C., Shi, L., Ji, C., Di, Y., Wu, K., Xue, C.J., Sha, E.H.M.: Exploiting parallelism for access conflict minimization in flash-based solid state drives. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 37(1), 168–181 (2018)

    Google Scholar 

  51. Gong, X., Chen, S., Lin, M., Liu, H.: A write-optimized B-tree layer for NAND flash memory. In: Proceedings of the 7th International Conference on Wireless Communications, Networking & Mobile Computing (WiCOM), Wuhan, China, pp. 1–4 (2011)

  52. González, J., Bjørling, M.: Multi-tenant I/O isolation with open-channel SSDs. In: Proceedings of the 8th Annual Non-Volatile Memories Workshop (NVMW), San Diego, CA (2017)

  53. Graefe, G.: Write-optimized B-trees. In: Proceedings of the 30th International Conference on Very Large Data Bases (VLDB), Toronto, Canada, pp. 672–683 (2004)

    Google Scholar 

  54. Gu, B., Yoon, A.S., Bae, D.H., Jo, I., Lee, J., Yoon, J., Kang, J.U., Kwon, M., Yoon, C., Cho, S., et al.: Biscuit: a framework for near-data processing of big data workloads. In: Proceedings 43rd ACM/IEEE Annual International Symposium on Computer Architecture (ISCA), Seoul, Korea, pp. 153–165 (2016)

  55. Gu, Y.: Survey: computational models for asymmetric read and write costs. In: Proceedings of the IEEE International Parallel & Distributed Processing Symposium Workshops (IPDPSW), Vancouver, Canada, pp. 733–743 (2018)

  56. Guttman, A.: R-trees: a dynamic index structure for spatial searching. In: Proceedings of the ACM International Conference on Management of Data (SIGMOD), Boston, MA, pp. 47–57 (1984)

  57. Haapasalo, T., Jaluta, I., Seeger, B., Sippu, S., Soisalon-Soininen, E.: Transactions on the multiversion B\(^+\)-tree. In: Proceedings of the 12th International Conference on Extending Database Technology (EDBT), Saint-Petersburg, Russia, pp. 1064–1075 (2009)

  58. Hady, F.T., Foong, A., Veal, B., Williams, D.: Platform storage performance with 3D XPoint technology. Proc. IEEE 105(9), 1822–1833 (2017)

    Google Scholar 

  59. Havasi, F.: An improved B\(^+\)-tree for flash file systems. In: Proceedings of the 37th International Conference on Current Trends in Theory & Practice of Computer Science (SOFSEM), Nový Smokovec, Slovakia, pp. 297–307 (2011)

  60. Hinrichs, K.: Implementation of the grid file: design concepts and experience. BIT Numer. Math. 25(4), 569–592 (1985)

    MathSciNet  MATH  Google Scholar 

  61. Ho, V.P., Park, D.J.: WPCB-tree: a novel flash-aware B-tree index using a write pattern converter. Symmetry 10(1), 18 (2018)

    MathSciNet  Google Scholar 

  62. Hu, Y., Jiang, H., Feng, D., Tian, L., Luo, H., Zhang, S.: Performance impact and interplay of SSD parallelism through advanced commands, allocation strategy and data granularity. In: Proceedings of the 25th International Conference on Supercomputing (ICS), Tucson, AZ, pp. 96–107 (2011)

  63. Jacob, R., Sitchinava, N.: Lower bounds in the asymmetric external memory model. In: Proceedings of the 29th ACM Symposium on Parallelism in Algorithms & Architectures (SPAA), Washington, DC, pp. 247–254 (2017)

  64. Jiang, Z., Wu, Y., Zhang, Y., Li, C., Xing, C.: AB-tree: a write-optimized adaptive index structure on solid state disk. In: Proceedings of the 11th Web Information System & Application Conference (WISA), Tianjin, China, pp. 188–193 (2014)

  65. Jin, P., Xie, X., Wang, N., Yue, L.: Optimizing R-tree for flash memory. Expert Syst. Appl. 42(10), 4676–4686 (2015)

    Google Scholar 

  66. Jin, P., Yang, C., Jensen, C.S., Yang, P., Yue, L.: Read/write-optimized tree indexing for solid-state drives. VLDB J. 25(5), 695–717 (2016)

    Google Scholar 

  67. Jin, P., Yang, C., Wang, X., Yue, L., Zhang, D.: SAL-hashing: a self-adaptive linear hashing index for SSDs. IEEE Trans. Knowl. Data Eng. (2018). https://doi.org/10.1109/TKDE.2018.2884714

  68. Jin, P., Yang, P., Yue, L.: Optimizing B+-tree for hybrid storage systems. Distrib. Parallel Databases 33(3), 449–475 (2015)

    Google Scholar 

  69. Jin, R.: B-tree index layer for multi-channel flash memory. In: Proceedings of the 4th International Conference on Mobile & Wireless Technology (ICMWT), Kuala Lumpur, Malaysia, pp. 197–202 (2017)

  70. Jin, R., Cho, H.J., Chung, T.S.: A group round robin based B-tree index storage scheme for flash memory devices. In: Proceedings of the 8th International Conference on Ubiquitous Information Management & Communication (ICUIMC), Siem Reap, Cambodia, p. 29 (2014)

  71. Jin, R., Cho, H.J., Chung, T.S.: LS-LRU: a lazy-split LRU buffer replacement policy for flash-based B\(^+\)-tree index. J. Inf. Sci. Eng. 31(3), 1113–1132 (2015)

    Google Scholar 

  72. Jin, R., Cho, H.J., Lee, S.W., Chung, T.S.: Lazy-split B\(^+\)-tree: a novel B\(^+\)-tree index scheme for flash-based database systems. Des. Autom. Embed. Syst. 17(1), 167–191 (2013)

    Google Scholar 

  73. Jin, R., Kwon, S.J., Chung, T.S.: FlashB-tree: a novel B-tree index scheme for solid state drives. In: Proceedings of the ACM Symposium on Research in Applied Computation (RACS), Miami, FL, pp. 50–55 (2011)

  74. Jo, I., Bae, D.H., Yoon, A.S., Kang, J.U., Cho, S., Lee, D.D., Jeong, J.: YourSQL: a high-performance database system leveraging in-storage computing. Proc. VLDB Endow. 9(12), 924–935 (2016)

    Google Scholar 

  75. Jørgensen, M.V., Rasmussen, R.B., Šaltenis, S., Schjønning, C.: FB-tree: a B\(^+\)-tree for flash-based SSDs. In: Proceedings of the 15th Symposium on International Database Engineering & Applications (IDEAS), Lisbon, Portugal, pp. 34–42 (2011)

  76. Jung, W., Roh, H., Shin, M., Park, S.: Inverted index maintenance strategy for flashSSDs: revitalization of in-place index update strategy. Inf. Syst. 49, 25–39 (2015)

    Google Scholar 

  77. Kang, D., Jung, D., Kang, J.U., Kim, J.S.: \(\mu *\)-tree: an ordered index structure for NAND flash memory with adaptive page layout scheme. IEEE Trans. Comput. 62(4), 784–797 (2007)

    MathSciNet  MATH  Google Scholar 

  78. Kang, J.U., Hyun, J., Maeng, H., Cho, S.: The multi-streamed solid-state drive. In: Proceedings of the 6th USENIX Workshop on Hot Topics in Storage & File Systems (HotStorage), Philadelphia, PA (2014)

  79. Kim, B., Lee, D.H.: LSB-tree: a log-structured B-tree index structure for NAND flash SSDs. Des. Autom. Embed. Syst. 19(1–2), 77–100 (2015)

    Google Scholar 

  80. Kim, B.K., Lee, S.W., Lee, D.H.: h-Hash: a hash index structure for flash-based solid state drives. J. Circuits Syst. Comput. 24(9), 1550128 (2015)

    Google Scholar 

  81. Kim, E.: SSD performance: a primer. Technical report, Solid State Storage Initiative (2013)

  82. Kim, H.J., Lee, Y.S., Kim, J.S.: NVMeDirect: a user-space I/O framework for application-specific optimization on NVMe SSDs. In: Proceedings of the 8th USENIX Workshop on Hot Topics in Storage & File Systems (HotStorage), Denver, CO (2016)

  83. Kim, S., Oh, H., Park, C., Cho, S., Lee, S.W., Moon, B.: In-storage processing of database scans and joins. Inf. Sci. 327, 183–200 (2016)

    Google Scholar 

  84. Koltsidas, I., Hsu, V.: IBM storage and NVM express revolution. Technical report, IBM (2017)

  85. Koltsidas, I., Pletka, R., Mueller, P., Weigold, T., Eleftheriou, E., Varsamou, M., Ntalla, A., Bougioukou, E., Palli, A., Antanokopoulos, T.: PSS: a prototype storage subsystem based on PCM. In: Proceedings of the 5th Annual Non-Volatile Memories Workshop (NVMW), San Diego, CA (2014)

  86. Kourtis, K., Ioannou, N., Koltsidas, I.: Reaping the performance of fast NVM storage with uDepot. In: Proceedings of the 17th USENIX Conference on File & Storage Technologies (FAST), Boston, MA, pp. 1–15 (2019)

  87. Kwon, S.J., Ranjitkar, A., Ko, Y.B., Chung, T.S.: FTL algorithms for NAND-type flash memories. Des. Autom. Embed. Syst. 15(3), 191–224 (2011)

    Google Scholar 

  88. Lee, H.S., Lee, D.H.: An efficient index buffer management scheme for implementing a B-tree on NAND flash memory. Data Knowl. Eng. 69(9), 901–916 (2010)

    Google Scholar 

  89. Lee, S.W., Moon, B.: Design of flash-based DBMS: an in-page logging approach. In: Proceedings of the ACM International Conference on Management of Data (SIGMOD), Beijing, China, pp. 55–66 (2007)

  90. Lee, Y.G., Jung, D., Kang, D., Kim, J.S.: \(\mu \)-tree: a memory-efficient flash translation layer supporting multiple mapping granularities. In: Proceedings of the 8th ACM International Conference on Embedded Software (EMSOFT), Atlanta, GA, pp. 21–30 (2008)

  91. Lee, Y.S., Quero, L.C., Lee, Y., Kim, J.S., Maeng, S.: Accelerating external sorting via on-the-fly data merge in active SSDs. In: Proceedings of the 6th USENIX Workshop on Hot Topics in Storage & File Systems (HotStorage), Philadelphia, PA (2014)

  92. Levandoski, J.J., Lomet, D.B., Sengupta, S.: The Bw-tree: a B-tree for new hardware platforms. In: Proceedings of the 29th IEEE International Conference on Data Engineering (ICDE), Washington, DC, pp. 302–313 (2013)

  93. Levandoski, J.J., Sengupta, S., Redmond, W.: The BW-tree: a latch-free B-tree for log-structured flash storage. IEEE Data Eng. Bull. 36(2), 56–62 (2013)

    Google Scholar 

  94. Li, G., Zhao, P., Yuan, L., Gao, S.: Efficient implementation of a multi-dimensional index structure over flash memory storage systems. J. Supercomput. 64(3), 1055–1074 (2013)

    Google Scholar 

  95. Li, H., Hao, M., Tong, M.H., Sundararaman, S., Bjørling, M., Gunawi, H.S.: The CASE of FEMU: cheap, accurate, scalable and extensible flash emulator. In: Proceedings of the 16th USENIX Conference on File & Storage Technologies (FAST), Oakland, CA, pp. 83–90 (2018)

  96. Li, R., Chen, X., Li, C., Gu, X., Wen, K.: Efficient online index maintenance for SSD-based information retrieval systems. In: Proceedings of the 14th IEEE International Conference on High Performance Computing & Communication (HPCC), Liverpool, UK, pp. 262–269 (2012)

  97. Li, X., Da, Z., Meng, X.: A new dynamic hash index for flash-based storage. In: Proceedings of the 9th International Conference on Web-Age Information Management (WAIM), Zhangjiajie, China, pp. 93–98 (2008)

  98. Li, Y., He, B., Luo, Q., Yi, K.: Tree indexing on flash disks. In: Proceedings of the 25th IEEE International Conference on Data Engineering (ICDE), Shanghai, China, pp. 1303–1306 (2009)

  99. Li, Y., He, B., Yang, R.J., Luo, Q., Yi, K.: Tree indexing on solid state drives. Proc. VLDB Endow. 3(1–2), 1195–1206 (2010)

    Google Scholar 

  100. Lin, S., Zeinalipour-Yazti, D., Kalogeraki, V., Gunopulos, D., Najjar, W.A.: Efficient indexing data structures for flash-based sensor devices. ACM Trans. Storage 2(4), 468–503 (2006)

    Google Scholar 

  101. Litwin, W.: Linear hashing: a new tool for file and table addressing. In: Proceedings of the 6th International Conference on Very Large Data Bases (VLDB), Montreal, Canada, pp. 212–223 (1980)

  102. Lu, G., Debnath, B., Du, D.H.: A forest-structured Bloom filter with flash memory. In: Proceedings of the 27th IEEE Symposium on Mass Storage Systems & Technologies (MSST), Denver, CO, pp. 1–6 (2011)

  103. Lv, Y., Li, J., Cui, B., Chen, X.: Log-compact R-tree: an efficient spatial index for SSD. In: Proceedings of the 16th International Conference on Database Systems for Advanced Applications (DASFAA), Hong Kong, China, vol. III, pp. 202–213 (2011)

    Google Scholar 

  104. Manolopoulos, Y., Nanopoulos, A., Papadopoulos, A.N., Theodoridis, Y.: R-Trees: Theory and Applications. Springer, Berlin (2010)

    MATH  Google Scholar 

  105. Mehlhorn, K., Näher, S.: Dynamic fractional cascading. Algorithmica 5(1–4), 215–241 (1990)

    MathSciNet  MATH  Google Scholar 

  106. Meza, J., Wu, Q., Kumar, S., Mutlu, O.: A large-scale study of flash memory failures in the field. In: Proceedings of the ACM International Conference on Measurement & Modeling of Computer Systems (SIGMETRICS), Portland, OR, pp. 177–190 (2015)

  107. Micheloni, R.: 3D Flash Memories. Springer, Berlin (2016)

    Google Scholar 

  108. Micheloni, R.: Solid-State-Drives Modeling. Springer, Berlin (2017)

    Google Scholar 

  109. Mittal, S., Vetter, J.S.: A survey of software techniques for using non-volatile memories for storage and main memory systems. IEEE Trans. Parallel Distrib. Syst. 27(5), 1537–1550 (2016)

    Google Scholar 

  110. Na, G.J., Lee, S.W., Moon, B.: Dynamic in-page logging for b+-tree index. IEEE Trans. Knowl. Data Eng. 24(7), 1231–1243 (2012)

    Google Scholar 

  111. Na, G.J., Moon, B., Lee, S.W.: IPLB\(^+\)-tree for flash memory database systems. J. Inf. Sci. Eng. 27(1), 111–127 (2011)

    Google Scholar 

  112. Nanavati, M., Schwarzkopf, M., Wires, J., Warfield, A.: Non-volatile storage: implications of the datacenter’s shifting center. ACM Queue 13(9), 20 (2015)

    Google Scholar 

  113. Narayanan, I., Wang, D., Jeon, M., Sharma, B., Caulfield, L., Sivasubramaniam, A., Cutler, B., Liu, J., Khessib, B., Vaid, K.: SSD failures in datacenters: What? when? and why? In: Proceedings of the 9th ACM International on Systems & Storage Conference (SYSTOR), Haifa, Israel (2016)

  114. Nath, S., Kansal, A.: FlashDB: dynamic self-tuning database for NAND flash. In: Proceedings of the 6th International Symposium on Information Processing in Sensor Networks (IPSN), Cambridge, MA, pp. 410–419 (2007)

  115. Nievergelt, J., Hinterberger, H., Sevcik, K.C.: The Grid file: an adaptable, symmetric multikey file structure. ACM Trans. Database Syst. 9(1), 38–71 (1984)

    Google Scholar 

  116. On, S.T., Hu, H., Li, Y., Xu, J.: Lazy-update B\(^+\)-tree for flash devices. In: Proceedings of the 10th International Conference on Mobile Data Management (MDM), Taipei, Taiwan, pp. 323–328 (2009)

  117. O’Neil, P., Cheng, E., Gawlick, D., O’Neil, E.: The log-structured merge-tree (LSM-tree). Acta Inform. 33(4), 351–385 (1996)

    MATH  Google Scholar 

  118. Park, C., Cheon, W., Kang, J., Roh, K., Cho, W., Kim, J.S.: A reconfigurable FTL architecture for NAND flash-based applications. ACM Trans. Embed. Comput. Syst. 7(4), 38 (2008)

    Google Scholar 

  119. Pawlik, M., Macyna, W.: Implementation of the aggregated R-tree over flash memory. In: Proceedings of the 17th International Conference on Database Systems for Advanced Applications (DASFAA), International Workshops: FlashDB, ITEMS, SNSM, SIM3, DQDI, Busan, Korea, pp. 65–72 (2012)

    Google Scholar 

  120. Pearce, R., Gokhale, M., Amato, N.M.: Multithreaded asynchronous graph traversal for in-memory and semi-external memory. In: Proceedings of the ACM/IEEE International Conference on High Performance Computing, Networking, Storage & Analysis (SC), New Orleans, LA, pp. 1–11 (2010)

  121. Pugh, W.: Skip lists: a probabilistic alternative to balanced trees. Commun. ACM 33(6), 668–677 (1990)

    Google Scholar 

  122. Robinson, J.T.: The KDB-tree: a search structure for large multidimensional dynamic indexes. In: Proceedings of the ACM International Conference on Management of Data (SIGMOD), Ann Arbor, MI, pp. 10–18 (1981)

  123. Roh, H., Kim, S., Lee, D., Park, S.: AS B-tree: a study of an efficient B\(^+\)-tree for SSDs. J. Inf. Sci. Eng. 30(1), 85–106 (2014)

    Google Scholar 

  124. Roh, H., Kim, W.C., Kim, S., Park, S.: A B-tree index extension to enhance response time and the life cycle of flash memory. Inf. Sci. 179(18), 3136–3161 (2009)

    MathSciNet  Google Scholar 

  125. Roh, H., Park, S., Kim, S., Shin, M., Lee, S.W.: B\(^+\)-tree index optimization by exploiting internal parallelism of flash-based solid state drives. Proc. VLDB Endow. 5(4), 286–297 (2011)

    Google Scholar 

  126. Roh, H., Park, S., Shin, M., Lee, S.W.: MPSearch: multi-path search for tree-based indexes to exploit internal parallelism of flash SSDs. IEEE Data Eng. Bull. 37(2), 3–11 (2014)

    Google Scholar 

  127. Ross, K.A.: Modeling the performance of algorithms on flash memory devices. In: Proceedings of the 4th International Workshop on Data management on New Hardware (DaMoN), Vancouver, Canada, pp. 11–16 (2008)

  128. Roumelis, G., Fevgas, A., Vassilakopoulos, M., Corral, A., Bozanis, P., Manolopoulos, Y.: Bulk-loading and bulk-insertion algorithms for xBR-trees in solid state drives. Computing (2019). https://doi.org/10.1007/s00607-019-00709-4

    Article  Google Scholar 

  129. Roumelis, G., Vassilakopoulos, M., Corral, A., Fevgas, A., Manolopoulos, Y.: Spatial batch-queries processing using xBR\(^+\)-trees in solid-state drives. In: Proceedings of the 8th International Conference on Model & Data Engineering (MEDI), Marrakesh, Morocco, pp. 301–317 (2018)

  130. Roumelis, G., Vassilakopoulos, M., Loukopoulos, T., Corral, A., Manolopoulos, Y.: The xBR\(^+\)-tree: an efficient access method for points. In: Proceedings of the 26th International Conference on Database & Expert Systems Applications (DEXA), Valencia, Spain, pp. 43–58 (2015)

  131. Sarwat, M., Mokbel, M.F., Zhou, X., Nath, S.: Fast: a generic framework for flash-aware spatial trees. In: Proceedings of the 12th International Symposium in Advances in Spatial & Temporal Databases (SSTD), Minneapolis, MN, pp. 149–167 (2011)

  132. Sarwat, M., Mokbel, M.F., Zhou, X., Nath, S.: Generic and efficient framework for search trees on flash memory storage systems. GeoInformatica 17(3), 417–448 (2013)

    Google Scholar 

  133. Schierl, A., Schellhorn, G., Haneberg, D., Reif, W.: Abstract specification of the UBIFS file system for flash memory. In: Proceedings of the 16th International Symposium on Formal Methods (FM), Eindhoven, the Netherlands, pp. 190–206 (2009)

  134. Schroeder, B., Lagisetty, R., Merchant, A.: Flash reliability in production: the expected and the unexpected. In: Proceedings of the 14th USENIX Conference on File & Storage Technologies (FAST), Santa Clara, CA, pp. 67–80 (2016)

  135. Shen, Z., Chen, F., Jia, Y., Shao, Z.: Didacache: an integration of device and application for flash-based key-value caching. ACM Trans. Storage 14(3), 26:1–26:32 (2018)

    Google Scholar 

  136. Son, Y., Kang, H., Han, H., Yeom, H.Y.: An empirical evaluation and analysis of the performance of nvm express solid state drive. Clust. Comput. 19(3), 1541–1553 (2016)

    Google Scholar 

  137. Tan, C.C., Sheng, B., Wang, H., Li, Q.: Microsearch: a search engine for embedded devices used in pervasive computing. ACM Trans. Embed. Comput. Syst. 9(4), 43 (2010)

    Google Scholar 

  138. Teng, D., Guo, L., Lee, R., Chen, F., Zhang, Y., Ma, S., Zhang, X.: A low-cost disk solution enabling lsm-tree to achieve high performance for mixed read/write workloads. ACM Trans. Storage 14(2), 15 (2018)

    Google Scholar 

  139. Thonangi, R., Babu, S., Yang, J.: A practical concurrent index for solid-state drives. In: Proceedings of the 21st ACM International Conference on Information & Knowledge Management (CIKM), Maui, HI, pp. 1332–1341 (2012)

  140. Thonangi, R., Yang, J.: On log-structured merge for solid-state drives. In: Proceedings of the 33rd IEEE International Conference on Data Engineering (ICDE), San Diego, CA, pp. 683–694 (2017)

  141. Viglas, S.D.: Adapting the B\(^+\)-tree for asymmetric I/O. In: Proceedings of the 16th East European Conference on Advances in Databases & Information Systems (ADBIS), Poznan, Poland, pp. 399–412 (2012)

  142. Wang, H., Feng, J.: FlashSkipList: indexing on flash devices. In: Proceedings of the ACM Turing 50th Celebration Conference (ACM TUR-C), Shanghai, China (2017)

  143. Wang, J., Park, D., Kee, Y.S., Papakonstantinou, Y., Swanson, S.: SSD in-storage computing for list intersection. In: Proceedings of the 12th International Workshop on Data Management on New Hardware (DaMoN). San Francisco, CA (2016)

  144. Wang, J., Park, D., Papakonstantinou, Y., Swanson, S.: SSD in-storage computing for search engines. IEEE Trans. Comput. (2016). https://doi.org/10.1109/TC.2016.2608818

  145. Wang, L., Wang, H.: A new self-adaptive extendible hash index for flash-based DBMS. In: Proceedings of the IEEE International Conference on Information & Automation (ICIA), Harbin, China, pp. 2519–2524 (2010)

  146. Wang, N., Jin, P., Wan, S., Zhang, Y., Yue, L.: OR-tree: an optimized spatial tree index for flash-memory storage systems. In: Proceedings of the 3rd International Conference in Data & Knowledge Engineering (ICDKE), Wuyishan, China, pp. 1–14 (2012)

  147. Wang, P., Sun, G., Jiang, S., Ouyang, J., Lin, S., Zhang, C., Cong, J.: An efficient design and implementation of LSM-tree based key-value store on open-channel SSD. In: Proceedings of the 9th EuroSys Conference, Amsterdam, The Netherlands (2014)

  148. NVM Express Workgroup: NVMe overview (online). http://nvmexpress.org/wp-content/uploads/NVMe_Overview.pdf. Accessed 29 Apr 2019

  149. Wu, C.H., Chang, L.P., Kuo, T.W.: An efficient B-tree layer for flash-memory storage systems. In: Revised Papers of the 9th International Conference on Real-Time & Embedded Computing Systems & Applications (RTCSA), Tainan, Taiwan, pp. 409–430 (2003)

  150. Wu, C.H., Chang, L.P., Kuo, T.W.: An efficient R-tree implementation over flash-memory storage systems. In: Proceedings of the 11th ACM International Symposium on Advances in Geographic Information Systems (GIS), New Orleans, LA, pp. 17–24 (2003)

  151. Wu, C.H., Kuo, T.W., Chang, L.P.: An efficient B-tree layer implementation for flash-memory storage systems. ACM Trans. Embed. Comput. Syst. 6(3), 19 (2007)

  152. Xiang, X., Yue, L., Liu, Z., Wei, P.: A reliable B-tree implementation over flash memory. In: Proceedings of the 23rd ACM Symposium on Applied Computing (SAC), Fortaleza, Brazil, pp. 1487–1491 (2008)

  153. Xu, J., Kim, J., Memaripour, A., Swanson, S.: Finding and fixing performance pathologies in persistent memory software stacks. In: Proceedings of the 24th International Conference on Architectural Support for Programming Languages & Operating Systems (ASPLOS), Providence, RI, pp. 427–439 (2019)

  154. Xu, J., Swanson, S.: NOVA: a log-structured file system for hybrid volatile/non-volatile main memories. In: Proceedings of the 14th USENIX Conference on File & Storage Technologies (FAST), Santa Clara, CA, pp. 323–338 (2016)

  155. Xu, Q., Siyamwala, H., Ghosh, M., Suri, T., Awasthi, M., Guz, Z., Shayesteh, A., Balakrishnan, V.: Performance analysis of NVMe SSDs and their implication on real world databases. In: Proceedings of the 8th ACM International Systems & Storage Conference (SYSTOR), Haifa, Israel (2015)

  156. Yang, C., Jin, P., Yue, L., Zhang, D.: Self-adaptive linear hashing for solid state drives. In: Proceedings of the 32nd IEEE International Conference on Data Engineering (ICDE), Helsinki, Finland, pp. 433–444 (2016)

  157. Yang, C.W., Lee, K.Y., Kim, M.H., Lee, Y.J.: An efficient dynamic hash index structure for NAND flash memory. IEICE Trans. Fundam. Electron. Commun. 92–A(7), 1716–1719 (2009)

  158. Yang, J., Wei, Q., Chen, C., Wang, C., Yong, K.L., He, B.: NV-tree: reducing consistency cost for NVM-based single level systems. In: Proceedings of the 13th USENIX Conference on File & Storage Technologies (FAST), Santa Clara, CA, pp. 167–181 (2015)

  159. Yang, Z., Harris, J.R., Walker, B., Verkamp, D., Liu, C., Chang, C., Cao, G., Stern, J., Verma, V., Paul, L.E.: SPDK: a development kit to build high performance storage applications. In: Proceedings of the IEEE International Conference on Cloud Computing Technology & Science (CloudCom), Hong Kong, China, pp. 154–161 (2017)

  160. Yin, S., Pucheral, P.: PBFilter: a flash-based indexing scheme for embedded systems. Inf. Syst. 37(7), 634–653 (2012)

  161. Zobel, J., Moffat, A.: Inverted files for text search engines. ACM Comput. Surv. 38(2), 6 (2006)

Author information

Corresponding author

Correspondence to Panayiotis Bozanis.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Fevgas, A., Akritidis, L., Bozanis, P. et al. Indexing in flash storage devices: a survey on challenges, current approaches, and future trends. The VLDB Journal 29, 273–311 (2020). https://doi.org/10.1007/s00778-019-00559-8

  • DOI: https://doi.org/10.1007/s00778-019-00559-8
