Acta Informatica, Volume 33, Issue 4, pp 351–385

The log-structured merge-tree (LSM-tree)

  • Patrick O’Neil
  • Edward Cheng
  • Dieter Gawlick
  • Elizabeth O’Neil


High-performance transaction system applications typically insert rows in a History table to provide an activity trace; at the same time the transaction system generates log records for purposes of system recovery. Both types of generated information can benefit from efficient indexing. An example in a well-known setting is the TPC-A benchmark application, modified to support efficient queries on the history for account activity for specific accounts. This requires an index by account-id on the fast-growing History table. Unfortunately, standard disk-based index structures such as the B-tree will effectively double the I/O cost of the transaction to maintain an index such as this in real time, increasing the total system cost up to fifty percent. Clearly a method for maintaining a real-time index at low cost is desirable. The log-structured merge-tree (LSM-tree) is a disk-based data structure designed to provide low-cost indexing for a file experiencing a high rate of record inserts (and deletes) over an extended period. The LSM-tree uses an algorithm that defers and batches index changes, cascading the changes from a memory-based component through one or more disk components in an efficient manner reminiscent of merge sort. During this process all index values are continuously accessible to retrievals (aside from very short locking periods), either through the memory component or one of the disk components. The algorithm has greatly reduced disk arm movements compared to traditional access methods such as B-trees, and will improve cost-performance in domains where disk arm costs for inserts with traditional access methods overwhelm storage media costs. The LSM-tree approach also generalizes to operations other than insert and delete. However, indexed finds requiring immediate response will lose I/O efficiency in some cases, so the LSM-tree is most useful in applications where index inserts are more common than finds that retrieve the entries. This seems to be a common property for history tables and log files, for example. The conclusions of Sect. 6 compare the hybrid use of memory and disk components in the LSM-tree access method with the commonly understood advantage of the hybrid method to buffer disk pages in memory.
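
The deferred, batched cascade described in the abstract can be illustrated with a minimal sketch. It assumes only two components (a memory component C0 and a single "disk" component C1 that is simulated here by sorted in-memory lists), and it replaces the paper's incremental merge with a simple whole-component merge triggered by a size limit. The names `ToyLSMTree`, `c0_limit`, and the tombstone marker are hypothetical choices for this illustration and are not taken from the paper.

```python
import bisect

_TOMBSTONE = object()  # hypothetical marker recording a deferred delete


class ToyLSMTree:
    """Two-component toy: C0 in memory, C1 standing in for a sorted disk file."""

    def __init__(self, c0_limit=4):
        self.c0 = {}          # memory component: key -> value or tombstone
        self.c1_keys = []     # simulated disk component: sorted keys
        self.c1_vals = []     # values parallel to c1_keys
        self.c0_limit = c0_limit
        self.merges = 0       # how many batched merges have run

    def insert(self, key, value):
        self.c0[key] = value  # the change is deferred: only memory is touched
        if len(self.c0) >= self.c0_limit:
            self._merge()

    def delete(self, key):
        self.c0[key] = _TOMBSTONE
        if len(self.c0) >= self.c0_limit:
            self._merge()

    def find(self, key):
        # Entries stay reachable throughout: check C0 first, then C1.
        if key in self.c0:
            value = self.c0[key]
            return None if value is _TOMBSTONE else value
        i = bisect.bisect_left(self.c1_keys, key)
        if i < len(self.c1_keys) and self.c1_keys[i] == key:
            return self.c1_vals[i]
        return None

    def _merge(self):
        # One sequential pass over both sorted inputs, as in merge sort.
        new_keys = sorted(self.c0)
        out_k, out_v = [], []
        i = j = 0
        while i < len(new_keys) or j < len(self.c1_keys):
            take_new = j >= len(self.c1_keys) or (
                i < len(new_keys) and new_keys[i] <= self.c1_keys[j])
            if take_new:
                key = new_keys[i]
                value = self.c0[key]
                i += 1
                if j < len(self.c1_keys) and self.c1_keys[j] == key:
                    j += 1    # the newer entry supersedes the older one
                if value is _TOMBSTONE:
                    continue  # deleted keys are dropped during the merge
            else:
                key, value = self.c1_keys[j], self.c1_vals[j]
                j += 1
            out_k.append(key)
            out_v.append(value)
        self.c1_keys, self.c1_vals = out_k, out_v
        self.c0 = {}
        self.merges += 1


tree = ToyLSMTree(c0_limit=3)
for account_id in [5, 1, 9, 3, 7]:
    tree.insert(account_id, f"row{account_id}")
tree.delete(5)
print(tree.c1_keys, tree.find(9), tree.find(5), tree.merges)
```

In this toy, six index changes reach the simulated disk component in two batched passes rather than six separate updates, which is the kind of saving the abstract attributes to deferring and batching. A find may have to consult both components, which reflects the abstract's caveat about finds that need an immediate response.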







Copyright information

© Springer-Verlag 1996

Authors and Affiliations

  • Patrick O’Neil (1)
  • Edward Cheng (2)
  • Dieter Gawlick (3)
  • Elizabeth O’Neil (1)

  1. Department of Mathematics and Computer Science, University of Massachusetts/Boston, Boston, USA
  2. Digital Equipment Corporation, Palo Alto, USA
  3. Oracle Corporation, Redwood Shores, USA
