Abstract
Applications running in large-scale distributed systems face many challenges. Constraints imposed on such systems must be checked thoroughly to ensure proper service delivery to the client. This paper proposes a monitoring solution for large-scale distributed systems that relies on abstract state machines. Data gathered from the monitoring components are used to calculate metrics and establish a diagnosis of the system. Emphasis is placed on failure detection and on ensuring non-functional requirements of the system, such as fault tolerance and resilience. The model introduced in this paper will be integrated into a cloud-enabled large-scale distributed system. The novelty of the solution lies in finding the best integration architecture for state-of-the-art algorithms and tools and refining them into an efficient version for large-scale distributed systems.
Notes
1. A tuple \((\mathrm{M}_{k}, \mathrm{T}_{k}, \mathrm{I}_{k})\) refers to \((\mathrm{Monitoring\ Component}_{k}, \mathrm{Topology}_{k}, \mathrm{Metrics\ Set}_{k})\).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Buga, A. (2015). A Scalable Monitoring Solution for Large-Scale Distributed Systems. In: Moreno-Díaz, R., Pichler, F., Quesada-Arencibia, A. (eds) Computer Aided Systems Theory – EUROCAST 2015. EUROCAST 2015. Lecture Notes in Computer Science, vol 9520. Springer, Cham. https://doi.org/10.1007/978-3-319-27340-2_28
DOI: https://doi.org/10.1007/978-3-319-27340-2_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27339-6
Online ISBN: 978-3-319-27340-2
eBook Packages: Computer Science, Computer Science (R0)