Software Fault-Tolerance with Off-the-Shelf SQL Servers

  • P. Popov
  • L. Strigini
  • A. Kostov
  • V. Mollov
  • D. Selensky
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2959)


With off-the-shelf software, software fault tolerance is almost the only means available for assuring better dependability than the off-the-shelf software offers, without the much higher costs of bespoke development or extra V&V. We report our experience with an experimental setup we have developed with off-the-shelf SQL database servers. First, we describe the use of a protective wrapper to mask the effects of a bug in one of the servers, without depending on an adequate fix from the vendors. We then discuss how to combine the diverse off-the-shelf servers into a diverse modular redundant configuration (N-version software or N-self-checking software). A wrapper guarantees the consistency between the diverse replicas of the database, serving multiple clients, by restricting the concurrency between the client transactions. We thus show that diverse modular redundancy with protective wrapping is a viable way of achieving fault tolerance even with complex off-the-shelf components, such as database servers.
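The wrapper idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, which this page does not show. All names here (`DiverseWrapper`, `ReplicaMismatch`, `make_replica`) are hypothetical, two in-memory SQLite databases stand in for the diverse off-the-shelf servers (so there is no real diversity), and a single lock is assumed as the crudest way of restricting concurrency between client transactions.

```python
import sqlite3
import threading


class ReplicaMismatch(Exception):
    """Raised when replicas return different results for the same statements."""


class DiverseWrapper:
    """Hypothetical wrapper: runs each client transaction on every replica."""

    def __init__(self, replicas):
        self._replicas = replicas
        # Assumption: one global lock serialises all client transactions,
        # so every replica sees the same order of transactions.
        self._lock = threading.Lock()

    def run_transaction(self, statements):
        """Execute (sql, params) pairs on all replicas; return the agreed results."""
        with self._lock:
            per_replica = []
            try:
                for conn in self._replicas:
                    results = [conn.execute(sql, params).fetchall()
                               for sql, params in statements]
                    per_replica.append(results)
                # Compare replica outputs. Truly diverse servers would also
                # need row-order and type normalisation, omitted here.
                if any(r != per_replica[0] for r in per_replica[1:]):
                    raise ReplicaMismatch("replicas disagree")
            except Exception:
                # Undo the transaction everywhere before reporting the failure.
                for conn in self._replicas:
                    conn.rollback()
                raise
            for conn in self._replicas:
                conn.commit()
            return per_replica[0]


def make_replica():
    """Create one stand-in replica with a tiny example table."""
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO account VALUES (1, 100)")
    conn.commit()
    return conn


replicas = [make_replica(), make_replica()]
wrapper = DiverseWrapper(replicas)
results = wrapper.run_transaction([
    ("UPDATE account SET balance = balance - 30 WHERE id = ?", (1,)),
    ("SELECT balance FROM account WHERE id = ?", (1,)),
])
```

In this sketch a disagreement only aborts the transaction, which corresponds to detection rather than masking. With three or more replicas the comparison could be replaced by a majority vote, at the cost of deciding how to resynchronise the replica that was outvoted.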


Keywords: Client Application; Snapshot Isolation; Software Fault Tolerance; Server Thread; Read Transaction
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • P. Popov (1)
  • L. Strigini (1)
  • A. Kostov (2)
  • V. Mollov (2)
  • D. Selensky (2)
  1. Centre for Software Reliability, City University, London, UK
  2. Department of Computing, Technical University, Plovdiv, Bulgaria
