Dynamic Combination of Multiple Host-Based Anomaly Detectors with Broader Detection Coverage and Fewer False Alerts
To achieve broader detection coverage with fewer false alarms, this paper proposes a POMDP-based anomaly detection model that combines several state-of-the-art host-based anomaly detectors. An optimal way of combining them is expected to be discovered through a policy-gradient reinforcement learning algorithm that operates on the independent actions of those detectors, and the behavior of the proposed model can be adjusted through a global reward signal to adapt to various system situations. A preliminary experiment and several comparative studies are carried out to validate its performance.
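The idea of learning a combination of detectors from a single global reward can be illustrated with a small sketch. This is not the paper's algorithm or model: it is a plain REINFORCE-style policy-gradient update on a logistic policy, run on synthetic data. The detector hit rates in `DETECTORS`, the `false_alarm_cost` parameter, and all function names are assumptions made purely for illustration.

```python
import math
import random

# Hypothetical detectors as (true-positive rate, false-positive rate).
# These numbers are illustrative and are not taken from the paper.
DETECTORS = [(0.90, 0.05), (0.60, 0.30), (0.50, 0.50)]


def sample_event(rng):
    """Return (alerts, is_intrusion) for one synthetic event."""
    is_intrusion = rng.random() < 0.5
    alerts = [
        1.0 if rng.random() < (tpr if is_intrusion else fpr) else 0.0
        for tpr, fpr in DETECTORS
    ]
    return alerts, is_intrusion


def alarm_probability(weights, bias, alerts):
    """Stochastic policy: probability of raising a combined alarm."""
    z = bias + sum(w * x for w, x in zip(weights, alerts))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
    return 1.0 / (1.0 + math.exp(-z))


def global_reward(alarm, is_intrusion, false_alarm_cost):
    """One reward shared by the whole combination (assumed shape)."""
    if alarm == is_intrusion:
        return 1.0
    return -false_alarm_cost if alarm else -1.0


def train(steps=20000, lr=0.05, false_alarm_cost=2.0, seed=0):
    """Learn detector weights with a REINFORCE-style update."""
    rng = random.Random(seed)
    weights, bias = [0.0] * len(DETECTORS), 0.0
    for _ in range(steps):
        alerts, is_intrusion = sample_event(rng)
        p = alarm_probability(weights, bias, alerts)
        alarm = rng.random() < p
        reward = global_reward(alarm, is_intrusion, false_alarm_cost)
        # For a Bernoulli-logistic policy, grad log pi(a|x) = (a - p) * x.
        scale = lr * reward * ((1.0 if alarm else 0.0) - p)
        weights = [w + scale * x for w, x in zip(weights, alerts)]
        bias += scale
    return weights, bias


def accuracy(weights, bias, trials=5000, seed=1):
    """Fraction of fresh synthetic events classified correctly (greedy)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        alerts, is_intrusion = sample_event(rng)
        decision = alarm_probability(weights, bias, alerts) > 0.5
        hits += decision == is_intrusion
    return hits / trials


if __name__ == "__main__":
    learned_weights, learned_bias = train()
    print("weights:", learned_weights, "bias:", learned_bias)
    print("accuracy:", accuracy(learned_weights, learned_bias))
```

Raising `false_alarm_cost` in this sketch makes the learned policy more reluctant to raise an alarm, which loosely mirrors the abstract's point that a global reward signal can steer the combined behavior. A real POMDP formulation would also have to handle partial observability over sequences of events, which this single-step example ignores.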