RV 2017: Runtime Verification, pp. 277-293
Monitoring Partially Synchronous Distributed Systems Using SMT Solvers
Abstract
In this paper, we discuss the feasibility of monitoring partially synchronous distributed systems to detect latent bugs, i.e., errors caused by concurrency and race conditions among concurrent processes. We present a monitoring framework where we model both system constraints and latent bugs as Satisfiability Modulo Theories (SMT) formulas, and we detect the presence of latent bugs using an SMT solver. We demonstrate the feasibility of our framework using both synthetic applications where latent bugs occur at any time with random probability and an application involving exclusive access to a shared resource with a subtle timing bug. We illustrate how the time required for verification is affected by parameters such as communication frequency, latency, and clock skew. Our results show that our framework can be used for real-life applications, and because our framework uses SMT solvers, the range of appropriate applications will increase as these solvers become more efficient over time.
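As a rough illustration of the idea of treating a latent bug as a satisfiability question, the sketch below checks the shared-resource scenario for a possible mutual-exclusion violation. It is not the encoding used in the paper: it assumes integer local clock ticks, a single hypothetical skew bound `eps` between any two process clocks, and it substitutes an exhaustive search for the SMT solver. The function name `find_violation` and the interval format are likewise assumptions made for this example.

```python
from itertools import product


def find_violation(intervals, eps):
    """Search for a witness that two processes may hold the resource at once.

    intervals: dict mapping a process name to (start, end), the local clock
               ticks (integers, inclusive) during which it held the resource.
    eps:       assumed bound on the clock skew between any two processes.

    Two local timestamps are treated as possibly simultaneous when they
    differ by at most eps. Returns (p, tp, q, tq) for the first witness
    found, or None when the constraints are unsatisfiable.
    """
    names = sorted(intervals)
    for i, p in enumerate(names):
        for q in names[i + 1:]:
            (p_start, p_end) = intervals[p]
            (q_start, q_end) = intervals[q]
            # Brute-force stand-in for the solver: try every pair of ticks.
            for tp, tq in product(range(p_start, p_end + 1),
                                  range(q_start, q_end + 1)):
                if abs(tp - tq) <= eps:
                    return (p, tp, q, tq)
    return None


if __name__ == "__main__":
    held = {"A": (0, 10), "B": (12, 20)}
    print(find_violation(held, eps=1))
    print(find_violation(held, eps=2))
```

With these inputs the two holding intervals are separated by two ticks, so no witness exists for `eps=1`, while `eps=2` admits one. This mirrors, in a very small way, how a larger clock skew enlarges the set of states a monitor has to consider. An actual SMT encoding would state the same constraints declaratively and let the solver search for the witness.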