Compound Treatment of Chained Declustered Replicas Using a Parallel Btree for High Scalability and Availability

  • Min Luo
  • Akitsugu Watanabe
  • Haruo Yokota
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6262)


Scalability and availability are key features of parallel database systems. To realize scalability, many dynamic load-balancing methods with data placement and parallel index structures on shared-nothing parallel infrastructure have been proposed. Data migration with range-partitioned placement using a parallel Btree is one solution. The combination of range partitioning and chained declustered replicas provides high availability while preserving scalability. However, independent treatment of the primary and backup data in each node results in long failover times. We propose a novel method for compound treatment of chained declustered replicas using a parallel Btree, called the Fat-Btree. In the proposed method, the single Fat-Btree provides access paths to both primary and backup data in all processor elements, which greatly reduces failover time. Moreover, it enables dynamic load balancing without physical data migration, and improves memory space utilization for processing the index. Experiments using PostgreSQL on a 160-node PC cluster demonstrate the effect.


Leaf Node Intermediate Node Data Migration Access Path Metadata Server 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.


  1. 1.
    Achyutuni, K.J., Omiecinski, E., Navathe, S.B.: Two techniques for on-line index modification in shared nothing parallel databases. In: Proceedings of the ACM SIGMOD Int’l. Conf. on Management of data, June 04-06, pp. 125–136 (1996)Google Scholar
  2. 2.
    Arnan, R., Bachemat, E., Lam, T.K., Michel, R.: Dynamic data reallocation in disk arrays. ACM Transactions on Storage (TOS) 3(1), 2-es (2007)Google Scholar
  3. 3.
    Dewitt, D., Gray, J.: Parallel database systems: the future of high performance database systems. Communications of the ACM 35(6), 85–98 (1992)CrossRefGoogle Scholar
  4. 4.
    Elnikety, S., Zwaenepoel, W., Pedone, F.: Database replication using generalized snapshot isolation. In: SRDS 2005, October 26-28, pp. 73–84 (2005)Google Scholar
  5. 5.
    Feelifl, H., Kitsuregawa, M., Ooi, B.C.: A fast convergence technique for online heat-balancing of btree indexed database over shared-nothing parallel systems. In: Ibrahim, M., Küng, J., Revell, N. (eds.) DEXA 2000. LNCS, vol. 1873, pp. 846–858. Springer, Heidelberg (2000)CrossRefGoogle Scholar
  6. 6.
    Fekete, A.: Allocating isolation levels to transactions. In: Proceedings of 24th ACM SIG-(MOD/ACT/ART) symposium on principles of database systems (June 2005)Google Scholar
  7. 7.
    Haerder, T., Reuter, A.: Principles of transaction-oriented database recovery. ACM Computing Surveys (CSUR) 15(4), 287–317 (1983)CrossRefMathSciNetGoogle Scholar
  8. 8.
    Hsiao, H.-I., DeWitt, D.J.: A performance study of three high availability data replication strategies. Distributed & Parallel Databases 1(1), 53–80 (1993)CrossRefGoogle Scholar
  9. 9.
    Hsiao, H.-I., DeWitt, D.J.: Chained declustering: A new availability strategy for multiprocessor database machines. In: Proceedings of ICDE 1990, pp. 456–465 (1990)Google Scholar
  10. 10.
    Hvasshovd, S.-O.: Recovery in Parallel Database Systems, 2nd edn. Morgan Kaufmann Publishers, San Francisco (1999)MATHGoogle Scholar
  11. 11.
    Kemme, B., Alonso, G.: Don’t be lazy, be consistent: postgres-r, a new way to implement database replication. In: Proceedings of VLDB 2000, pp. 134–143 (September 2000)Google Scholar
  12. 12.
    Feelifl, H., Kitsuregawa, M.: RING: a strategy for minimizing the cost of online data placement reorganization for btree indexed database over shared-nothing machines. In: Proceedings of the 7th Int’l. Conf. on DASFAA 2001, pp. 190–199 (April 2001)Google Scholar
  13. 13.
    Lee, M.L.: Towards self-tuning data placement in parallel database. In: Proceedings of the ACM SIGMOD Int’l. Conf. on Management of data, pp. 225–236 (May 2000)Google Scholar
  14. 14.
    Miyazaki, J., Yokota, H.: Concurrency control and performance evaluation of parallel B-tree structures. IEICE Trans. INF. & SYST. E85-D(8), 1269–1283 (2002)Google Scholar
  15. 15.
    Ouyang, X., Yokota, H.: An efficient commit protocol exploiting primary–backup placement in a distributed storage system. In: Proceedings of the 12th Pacific Rim International Symposium on Dependable Computing, December 18-20, pp. 238–247 (2006)Google Scholar
  16. 16.
    Zezula, P., Amato, G., Dohnal, V., Batko, M.: Similarity Search The Metric Space Approach. In: Parallel and Distributed Indexes, ch. 5, Springer, US (2006)Google Scholar
  17. 17.
    Seeger, B., Larson, P.: Multi-disk B-trees. In: Proceedings of the 1991 ACM SIGMOD international conference on Management of data, May 29-31, pp. 436–445 (1991)Google Scholar
  18. 18.
    Taniar, D., Rahayu, J.W.: Global parallel index for multi-processor DB systems. Information Sciences 165(1-2), 103–127 (2004)MATHCrossRefGoogle Scholar
  19. 19.
    Yoshihara, T., Kobayashi, D., Yokota, H.: Mark-opt: A concurrency control protocol for parallel B-tree structures to reduce the cost of SMOs. IEICE Transactions on information and systems 90, 1213–1224 (2007)Google Scholar
  20. 20.
    Renesse, R.V., Schneider, F.B.: Chain replication for supporting high throughput & availability. In: Proceedings of the 6th USENIX Symposium, OSDI 2004, p. 7 (2004)Google Scholar
  21. 21.
    Watanabe, A., Yokota, H.: A directory traverse cost based skew handling for parallel data access. Transactions of IEICE (D-I) J85-D-I, 877–886 (2002)Google Scholar
  22. 22.
    Watanabe, A., Yokota, H.: Adaptive overlapped declustering: a highly available data-placement method balancing access load and space utilization. In: Proceedings of the 21st Int’l. Conf. on Data Engineering, pp. 828–839 (2005)Google Scholar
  23. 23.
    Wu, S., Kemme, B.: Postgres-r(si): Combining replica control with concurrency control based on snapshot isolation. In: Proceedings of ICDE 2005, pp. 422–433 (2005)Google Scholar
  24. 24.
    Yokota, H., et al.: Fat-Btree: An update conscious parallel directory structure. In: Proceedings of the 15st Int’l. Conf. on Data Engineering, p. 448 (1999)Google Scholar
  25. 25.
    Yoshihara, T., Kobayashi, D., Yokota, H.: A concurrency control protocol for parallel b-tree structures without latch-coupling for explosively growing digital content. In: Proceedings of the 11th Int’l. Conf. on EDBT 2008, vol. 261, pp. 133–144 (2008)Google Scholar
  26. 26.
    Yu, H., Vahdat, A.: The costs and limits of availability for replicated services. In: Proceedings of the 18th ACM SOSP 2001, October 21-24 (2001)Google Scholar
  27. 27.
    Amazon Web Service LLC (2009),

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Min Luo
    • 1
  • Akitsugu Watanabe
    • 1
  • Haruo Yokota
    • 1
    • 2
  1. 1.Department of Computer ScienceTokyo Institute of TechnologyTokyoJapan
  2. 2.Global Scientific Information and Computing CenterTokyo Institute of TechnologyTokyoJapan

Personalised recommendations