Software Quality Journal, Volume 21, Issue 2, pp 203–240

Optimizing decomposition of software architecture for local recovery

  • Hasan Sözer
  • Bedir Tekinerdoğan
  • Mehmet Akşit


The increasing size and complexity of software systems have increased the number of potential failures, making software reliability harder to ensure. Since it is usually hard to prevent all failures, fault tolerance techniques have become more important. An essential element of fault tolerance is recovery from failures. Local recovery is an effective approach in which only the erroneous parts of the system are recovered while the other parts remain available. To achieve local recovery, the architecture needs to be decomposed into separate units that can be recovered in isolation. Usually, there are many alternative ways to decompose the system into recoverable units, and each of these decomposition alternatives performs differently with respect to availability and performance metrics. We propose a systematic approach dedicated to optimizing the decomposition of software architecture for local recovery. The approach provides systematic guidelines to depict the design space of the possible decomposition alternatives, to reduce the design space with respect to domain and stakeholder constraints, and to balance the feasible alternatives with respect to availability and performance. The approach is supported by an integrated set of tools and is illustrated for the open-source MPlayer software.
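The three steps named in the abstract (depict the design space, reduce it by constraints, balance the feasible alternatives) can be pictured with a small sketch. It assumes that each decomposition alternative is a partition of the architecture's modules into recoverable units, that constraints take the form of module pairs that must share or must not share a unit, and that each alternative already carries an availability estimate and a performance overhead estimate. The module names, the constraint format and the function names are hypothetical placeholders, not the authors' tool set or the MPlayer case study.

```python
def partitions(modules):
    """Yield every way to split `modules` into non-empty recoverable units."""
    if not modules:
        yield []
        return
    first, rest = modules[0], modules[1:]
    for smaller in partitions(rest):
        # Place `first` into each existing unit in turn ...
        for i, unit in enumerate(smaller):
            yield smaller[:i] + [[first] + unit] + smaller[i + 1:]
        # ... or give it a recoverable unit of its own.
        yield [[first]] + smaller


def feasible(partition, together=(), apart=()):
    """Check hypothetical constraints: pairs that must or must not share a unit."""
    unit_of = {m: i for i, unit in enumerate(partition) for m in unit}
    return (all(unit_of[a] == unit_of[b] for a, b in together)
            and all(unit_of[a] != unit_of[b] for a, b in apart))


def pareto_front(scored):
    """Keep alternatives not dominated by any other alternative.

    `scored` holds (alternative, availability, overhead) tuples, where
    higher availability and lower overhead are better.
    """
    front = []
    for alt, av, ov in scored:
        dominated = any(
            av2 >= av and ov2 <= ov and (av2 > av or ov2 < ov)
            for _, av2, ov2 in scored
        )
        if not dominated:
            front.append((alt, av, ov))
    return front


if __name__ == "__main__":
    modules = ["A", "B", "C", "D"]  # placeholder module names
    design_space = list(partitions(modules))
    print(len(design_space))  # 15 alternatives for 4 modules (Bell number B(4))

    reduced = [p for p in design_space
               if feasible(p, together=[("A", "B")], apart=[("A", "C")])]
    print(len(reduced))  # 3 alternatives satisfy both constraints
```

The design space grows with the Bell numbers, so exhaustive enumeration like this is only practical for a small number of modules; for larger architectures some form of constraint-based pruning or heuristic search would be needed.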


Keywords

Software architecture design, Fault tolerance, Local recovery, Availability, Performance



We acknowledge the feedback from the discussions with our TRADER project (TRADER, 2011) partners from NXP Research, NXP Semiconductors, TASS, Philips Consumer Electronics, Design Technology Institute, Embedded Systems Institute, IMEC, Leiden University and Delft University of Technology. We thank the anonymous reviewers for their feedback to improve this paper.


  1. Aleti, A., Björnander, S., Grunske, L., & Meedeniya, I. (2009). Archeopterix: An extendable tool for architecture optimization of AADL models. In Proceedings of the ICSE 2009 workshop on model-based methodologies for pervasive and embedded software (MOMPES), Vancouver, Canada, pp. 61–71.
  2. Alexander, C. (1964). Notes on the synthesis of form. Cambridge, MA: Harvard University Press.
  3. Anquetil, N., Fourrier, C., & Lethbridge, T. (1999). Experiments with clustering as a software remodularization method. In Proceedings of the 6th working conference on reverse engineering (WCRE), IEEE Computer Society, pp. 235–245.
  4. Athon, T., & Papalambros, P. (1996). A note on weighted criteria methods for compromise solutions in multi-objective optimization. Engineering Optimization, 27(2), 155–176.
  5. Avizienis, A., Laprie, J. C., Randell, B., & Landwehr, C. (2004). Basic concepts and taxonomy of dependable and secure computing. IEEE Transactions on Dependable and Secure Computing, 1(1), 11–33.
  6. Bachman, F., Bass, L., & Klein, M. (2003). Deriving architectural tactics: A step toward methodical architectural design. Tech. Rep. CMU/SEI-2003-TR-004, SEI, Pittsburgh, PA, USA.
  7. Boudali, H., Sozer, H., & Stoelinga, M. (2009). Architectural availability analysis of software decomposition for local recovery. In Proceedings of the third IEEE international conference on secure software integration and reliability improvement, Shanghai, China, pp. 14–22.
  8. Buschmann, F., Meunier, R., Rohnert, H., Sommerlad, P., & Stal, M. (1996). Pattern-oriented software architecture, a system of patterns. Wiley.
  9. Candea, G., Cutler, J., & Fox, A. (2004). Improving availability with recursive micro-reboots: A soft-state system case study. Performance Evaluation, 56(1-4), 213–248.
  10. Candea, G., Kawamoto, S., Fujiki, Y., Friedman, G., & Fox, A. (2004b). Microreboot: A technique for cheap recovery. In Proceedings of the 6th symposium on operating systems design and implementation (OSDI), San Francisco, CA, USA, pp. 31–44.
  11. Clements, P., Bachmann, F., Bass, L., Garlan, D., Ivers, J., Little, R., Nord, R., & Stafford, J. (2002a). Documenting software architectures: Views and beyond. Boston, MA: Addison-Wesley.
  12. Clements, P., Kazman, R., & Klein, M. (2002b). Evaluating software architectures: Methods and case studies. Boston: Addison-Wesley.
  13. Patterson, D. et al. (2002). Recovery oriented computing (ROC): Motivation, definition, techniques, and case studies. Technical Report UCB/CSD-02-1175, University of California, Berkeley.
  14. Dashofy, E., van der Hoek, A., & Taylor, R. (2002). An infrastructure for the rapid development of XML-based architecture description languages. In Proceedings of the 24th international conference on software engineering (ICSE), ACM, Orlando, FL, USA, pp. 266–276.
  15. Davey, J., & Burd, E. (2000). Evaluating the suitability of data clustering for software remodularization. In Proceedings of the 7th working conference on reverse engineering (WCRE). IEEE Computer Society, pp. 268–278.
  16. Dobrica, L., & Niemela, E. (2002). A survey on software architecture analysis methods. IEEE Transactions on Software Engineering, 28(7), 638–654.
  17. Fenlason, J., & Stallman, R. (2000). GNU gprof: The GNU profiler. Free Software Foundation.
  18. Gokhale, S. (2007). Architecture-based software reliability analysis: Overview and limitations. IEEE Transactions on Dependable and Secure Computing, 4(1), 32–40.
  19. Grassi, V., Mirandola, R., & Sabetta, A. (2005). An XML-based language to support performance and reliability modeling and analysis in software architectures. In R. Reussner, J. Mayer, J. Stafford, S. Overhage, S. Becker, & P. Schroeder (Eds.), QoSA/SOQUA, Springer, Lecture Notes in Computer Science, Vol. 3712, pp. 71–87.
  20. Grunske, L., Lindsay, P., Bondarev, E., Papadopoulos, Y., & Parker, D. (2007). An outline of an architecture-based method for optimizing dependability attributes of software-intensive systems. In R. de Lemos, C. Gacek, & A. B. Romanovsky (Eds.), Architecting dependable systems IV (pp. 188–209). Berlin: Springer.
  21. Harris, J., Hirst, J., & Mossinghoff, M. (2000). Combinatorics and graph theory. New York: Springer.
  22. Herder, J., Bos, H., Gras, B., Homburg, P., & Tanenbaum, A. (2007). Failure resilience for device drivers. In Proceedings of the 37th annual IEEE/IFIP international conference on dependable systems and networks (DSN). Edinburgh, UK, pp. 41–50.
  23. Heyliger, G. (1994). Coupling. In J. Marciniak (Ed.), Encyclopedia of software engineering (pp. 220–228). Wiley.
  24. Huang, Y., & Kintala, C. (1995). Software fault tolerance in the application layer. In M. R. Lyu (Ed.), Software fault tolerance, chapter 10 (pp. 231–248). New York: Wiley.
  25. Hunt, G., Aiken, M., Fähndrich, M., Hawblitzel, C., Hodson, O., Larus, J., Levi, S., Steensgaard, B., Tarditi, D., & Wobber, T. (2007). Sealing OS processes to improve dependability and safety. SIGOPS Operating Systems Review, 41(3), 341–354.
  26. Jokiaho, T., Herrmann, F., Penkler, D., & Moser, L. (2003). The service availability forum application interface specification. RTC Magazine, 12(6), 52–58.
  27. Kang, K., Cohen, S., Hess, J., Novak, W., & Peterson, A. (1990). Feature-oriented domain analysis (FODA) feasibility study. Tech. Rep. CMU/SEI-90-TR-21, SEI.
  28. Laprie, J. C., Arlat, J., Beounes, C., & Kanoun, K. (1995). Architectural issues in software fault tolerance. In M. R. Lyu (Ed.), Software fault tolerance, chapter 3 (pp. 47–80). Chichester: Wiley.
  29. Lung, C. H., Xu, X., & Zaman, M. (2007). Software architecture decomposition using attributes. International Journal of Software Engineering and Knowledge Engineering, 17, 599–613.
  30. Medvidovic, N., & Taylor, R. N. (2000). A classification and comparison framework for software architecture description languages. IEEE Transactions on Software Engineering, 26(1), 70–93.
  31. Meedeniya, I., Buhnova, B., Aleti, A., & Grunske, L. (2011). Reliability-driven deployment optimization for embedded systems. Journal of Systems and Software, 84(5), 835–846.
  32. Mitchell, B. S., & Mancoridis, S. (2006). On the automatic modularization of software systems using the bunch tool. IEEE Transactions on Software Engineering, 32(3), 193–208.
  33. MPlayer (2010). MPlayer official website. Accessed 31 Mar 2011.
  34. Necula, G., McPeak, S., Rahul, S., & Weimer, W. (2002). CIL: Intermediate language and tools for analysis and transformation of C programs. In Proceedings of the conference on compiler construction, pp. 213–228.
  35. Nethercote, N., & Seward, J. (2007). Valgrind: a framework for heavyweight dynamic binary instrumentation. SIGPLAN Notices, 42(6), 89–100.
  36. Nguyen, G., Hluchý, L., Tran, V., & Kotocova, M. (2001). DDG task recovery for cluster computing. In Proceedings of the 4th international conference on parallel processing and applied mathematics, Springer, Naleczow, Poland, Lecture Notes in Computer Science, Vol. 2328, pp. 369–378.
  37. Object Management Group (2001). Fault tolerant CORBA. Tech. Rep. OMG Document formal/2001-09-29, Object Management Group.
  38. Pareto, V. (1896). Cours d'économie politique. Lausanne, Switzerland: F. Rouge.
  39. Ross, S. (2007). Introduction to probability models. San Diego: Elsevier Inc.
  40. di Ruscio, D., Malavolta, I., Muccini, H., Pelliccione, P., & Pierantonio, A. (2010). Developing next generation ADLs through MDE techniques. In Proceedings of the 32nd international conference on software engineering (ICSE), Cape Town, South Africa, pp. 85–94.
  41. Ruskey, F. (1993). Simple combinatorial gray codes constructed by reversing sublists. In Proceedings of the 4th international symposium on algorithms and computation (ISAAC), Springer, Lecture Notes in Computer Science, Vol. 762, pp. 201–208.
  42. Ruskey, F. (2003). Combinatorial generation. University of Victoria, Victoria, BC, Canada, manuscript CSC-425/520.
  43. Santos, G., Duarte, A., Rexachs, D., & Luque, E. (2008). Increasing the performability of computer clusters using RADIC II. In Proceedings of the third international conference on availability, reliability and security, IEEE Computer Society, pp. 653–658.
  44. Sozer, H., & Tekinerdogan, B. (2008). Introducing recovery style for modeling and analyzing system recovery. In Proceedings of the 7th working IEEE/IFIP conference on software architecture (WICSA). Vancouver, BC, Canada, pp. 167–176.
  45. Sozer, H., Tekinerdogan, B., & Aksit, M. (2009). FLORA: A framework for decomposing software architecture to introduce local recovery. Software: Practice and Experience, 39(10), 869–889.
  46. Teitelbaum, T. (2000). Codesurfer. SIGSOFT Software Engineering Notes, 25(1), 99.
  47. Tekinerdogan, B., Sozer, H., & Aksit, M. (2008). Software architecture reliability analysis using failure scenarios. Journal of Systems and Software, 81(4), 558–575.
  48. TRADER (2011). TRADER project, ESI. Accessed 31 Mar 2011.
  49. de Visser, I. (2008). Analyzing user perceived failure severity in consumer electronics products. PhD thesis, Technische Universiteit Eindhoven, Eindhoven, The Netherlands.
  50. White, J., Dougherty, B., Strowd, H., & Schmidt, D. (2009). Creating self-healing service compositions with feature models and microrebooting. International Journal of Business Process Integration and Management, 4, 35–46.
  51. Wiggerts, T. (1997). Using clustering algorithms in legacy systems remodularization. In Proceedings of the 4th working conference on reverse engineering (WCRE), IEEE Computer Society, pp. 33–43.
  52. Yacoub, S., Cukic, B., & Ammar, H. (2004). A scenario-based reliability analysis approach for component-based software. IEEE Transactions on Reliability, 53(14), 465–480.

Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Hasan Sözer (1, corresponding author)
  • Bedir Tekinerdoğan (2)
  • Mehmet Akşit (3)

  1. Department of Computer Science, Özyeğin University, İstanbul, Turkey
  2. Department of Computer Engineering, Bilkent University, Ankara, Turkey
  3. Department of Computer Science, University of Twente, Enschede, The Netherlands
