Abstract
Today’s distributed systems need runtime error detection to catch errors arising from software bugs, hardware errors, or unexpected operating conditions. A prominent class of error detection techniques is stateful: it tracks the state of the monitored application and matches state-based rules against it. Large-scale distributed applications generate a high volume of messages that can overwhelm a stateful detection system. An existing approach to this problem is to randomly sample the messages and process only a subset. However, this approach leads to non-determinism in the detection system’s view of which state the application is in, which in turn degrades the quality of detection. We present an intelligent sampling algorithm and a Hidden Markov Model (HMM)-based algorithm that select the messages the detection system processes and determine the application states so that this non-determinism is minimized. We also present a mechanism for selectively triggering computationally intensive rules, using a lightweight check to determine whether a rule is likely to be flagged. We demonstrate the techniques in a detection system called Monitor, applied to a J2EE multi-tier application. We empirically evaluate the performance of Monitor under different load conditions and error scenarios and compare it to a previous system called Pinpoint.
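To make the non-determinism concrete, the following is a minimal sketch of how a dropped message widens the set of states a stateful detector must consider. It is not Monitor's algorithm: the state names, message names, and transition table are hypothetical, and the sketch assumes for simplicity that the detector knows when a message was skipped.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical application state machine: (state, message) -> next state.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("S0", "login"): "S1",
    ("S1", "browse"): "S2",
    ("S1", "search"): "S3",
    ("S2", "buy"): "S4",
    ("S3", "buy"): "S4",
}


def step(states: Set[str], message: str) -> Set[str]:
    """Advance every candidate state on an observed message."""
    return {
        TRANSITIONS[(s, message)]
        for s in states
        if (s, message) in TRANSITIONS
    }


def skip_one(states: Set[str]) -> Set[str]:
    """States reachable after one message that sampling dropped."""
    return {nxt for (s, _msg), nxt in TRANSITIONS.items() if s in states}


def possible_states(start: str, observed: List[Optional[str]]) -> Set[str]:
    """Track the candidate state set; None marks a dropped message."""
    states = {start}
    for msg in observed:
        states = skip_one(states) if msg is None else step(states, msg)
    return states
```

With every message observed, the detector's view stays a single state; one dropped message leaves it with two candidates until a later message disambiguates them. Choosing which messages to process so that this candidate set stays small is the kind of goal the abstract describes.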
Copyright information
© 2009 IFIP International Federation for Information Processing
About this paper
Cite this paper
Laguna, I., Arshad, F.A., Grothe, D.M., Bagchi, S. (2009). How to Keep Your Head above Water While Detecting Errors. In: Bacon, J.M., Cooper, B.F. (eds) Middleware 2009. Lecture Notes in Computer Science, vol 5896. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10445-9_11
DOI: https://doi.org/10.1007/978-3-642-10445-9_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10444-2
Online ISBN: 978-3-642-10445-9
eBook Packages: Computer Science (R0)