Towards High Performance and High Availability Clusters of Archived Stream

  • Kai Du
  • Huaimin Wang
  • Shuqiang Yang
  • Bo Deng
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4505)

Abstract

Some burgeoning web applications, such as web search engines, need to track, store, and analyze massive real-time user access logs with 24×7 availability. Traditional high-availability approaches designed for general-purpose transactional applications are often not efficient enough to store these high-rate, insertion-only archived streams. This paper presents an integrated approach to storing these archived streams in a database cluster and recovering the cluster quickly. The approach is based on our simplified replication protocol and a high-performance data loading and query strategy. Experiments show that the approach achieves efficient data loading and querying, with shorter recovery times than traditional database cluster recovery methods.

Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Kai Du (1)
  • Huaimin Wang (1)
  • Shuqiang Yang (1)
  • Bo Deng (1)

  1. School of Computer Science, National University of Defense Technology, Changsha 410073, China
