Abstract
We review three recently proposed scan statistic methods for multivariate pattern detection. Each method models the relationship between multiple observed and hidden variables using a Bayesian network structure, drawing inferences about the underlying pattern type and the affected subset of the data. We first discuss the multivariate Bayesian scan statistic (MBSS) proposed by Neill and Cooper (2008). MBSS is a stream-based event surveillance framework that detects and characterizes events given the aggregate counts for multiple data streams. Next, we describe the agent-based Bayesian scan statistic (ABSS) proposed by Jiang et al. (2008). ABSS performs event detection and characterization given individual-level data for each agent in a population. Finally, we review the anomalous group detection (AGD) method proposed by Das, Schneider, and Neill (2008). AGD is a general pattern detection approach that learns a Bayesian network structure from data and detects anomalous groups of records.
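The idea shared by these methods, inferring both the pattern type and the affected subset, can be illustrated with a minimal sketch. This is not the implementation from any of the reviewed papers. It assumes Poisson-distributed counts, known baselines, a fixed multiplicative effect per event type and data stream, and a uniform prior over the alternative hypotheses. The function name, its parameters, and the toy data are all hypothetical.

```python
import math

def posterior_over_hypotheses(counts, baselines, regions, effects, p_event=0.01):
    """Posterior over the null hypothesis (key None) and every
    (region, event_type) alternative, computed with Bayes' theorem.

    counts[m][i], baselines[m][i]: observed and expected count for
    data stream m at location i.
    regions: list of tuples of location indices (candidate subsets).
    effects: dict mapping event type -> per-stream multiplicative factors.
    p_event: assumed total prior probability that some event is occurring.
    """
    # Assumption: the event prior is spread uniformly over all alternatives.
    prior_alt = p_event / (len(regions) * len(effects))
    # The null has likelihood ratio 1, so its log term is just the log prior.
    log_terms = {None: math.log(1.0 - p_event)}
    for region in regions:
        for event, factors in effects.items():
            # Log ratio of Poisson(c; x*b) to Poisson(c; b) is
            # c*log(x) - (x - 1)*b, summed over streams and locations.
            llr = 0.0
            for m, x in enumerate(factors):
                for i in region:
                    llr += counts[m][i] * math.log(x) - (x - 1.0) * baselines[m][i]
            log_terms[(region, event)] = math.log(prior_alt) + llr
    # Normalize in log space for numerical stability.
    top = max(log_terms.values())
    total = sum(math.exp(v - top) for v in log_terms.values())
    return {h: math.exp(v - top) / total for h, v in log_terms.items()}

# Toy data: 2 streams, 4 locations, with stream 0 elevated at locations 1 and 2.
counts = [[10, 25, 24, 9], [11, 10, 9, 10]]
baselines = [[10.0] * 4, [10.0] * 4]
# Candidate regions: all contiguous windows of locations.
regions = [tuple(range(a, b)) for a in range(4) for b in range(a + 1, 5)]
# Hypothetical event types: "A" doubles stream 0, "B" doubles stream 1.
effects = {"A": [2.0, 1.0], "B": [1.0, 2.0]}

posterior = posterior_over_hypotheses(counts, baselines, regions, effects)
best = max(posterior, key=posterior.get)
print(best, round(posterior[best], 3))
```

Because the result is a full posterior distribution, it can be summed over regions to get the probability of each event type, or over event types to get the probability that each location is affected.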
References
Bronstein, A., Das, J., Duro, M., Friedrich, R., Kleyner, G., Mueller, M., Singhal, S., and Cohen, I. (2001). Bayesian networks for detecting anomalies in internet-based services, In Intl. Symposium on Integrated Network Mgmt.
Clayton, D., and Kaldor, J. (1987). Empirical Bayes estimates of age-standardized relative risks for use in disease mapping, Biometrics, 43, 671–681.
Cooper, G. F., Dash, D. H., Levander, J. D., Wong, W.-K., Hogan, W. R., and Wagner, M. M. (2004). Bayesian biosurveillance of disease outbreaks, In Proc. Conference on Uncertainty in Artificial Intelligence.
Cooper, G. F., Dowling, J. N., Levander, J. D., and Sutovsky, P. (2007). A Bayesian algorithm for detecting CDC Category A outbreak diseases from emergency department chief complaints, Advances in Disease Surveillance, 2, 45.
Das, K., Schneider, J., and Neill, D. B. (2008). Detecting anomalous groups in categorical datasets, submitted for publication, Carnegie Mellon University, School of Computer Science.
Dong-Her, S., Hsiu-Sen, C., Chun-Yuan, C., and Lin, B. (2004). Internet security: malicious e-mails detection and protection, Industrial Mgmt. and Data Sys., 104, 613–623.
Duczmal, L., and Assuncao, R. (2004). A simulated annealing strategy for the detection of arbitrary shaped spatial clusters, Computational Statistics and Data Analysis, 45, 269–286.
Heckerman, D., Geiger, D., and Chickering, M. (1995). Learning Bayesian networks: the combination of knowledge and statistical data, Machine Learning, 20, 197–243.
Hjalmars, U., Kulldorff, M., Gustafsson, G., and Nagarwalla, N. (1996). Childhood leukemia in Sweden: using GIS and a spatial scan statistic for cluster detection, Statistics in Medicine, 15, 707–715.
Jiang, X., Neill, D. B., and Cooper, G. F. (2008). A Bayesian network model for spatial event surveillance, International Journal of Approximate Reasoning, in press.
Kleinman, K., Abrams, A., Kulldorff, M., and Platt, R. (2005). A model-adjusted space-time scan statistic with an application to syndromic surveillance, Epidemiology and Infection, 133(3), 409–419.
Kulldorff, M. (1997). A spatial scan statistic, Communications in Statistics: Theory and Methods, 26(6), 1481–1496.
Kulldorff, M. (2001). Prospective time-periodic geographical disease surveillance using a scan statistic, Journal of the Royal Statistical Society A, 164, 61–72.
Kulldorff, M., Athas, W., Feuer, E., Miller, B., and Key, C. (1998). Evaluating cluster alarms: a space-time scan statistic and cluster alarms in Los Alamos, American Journal of Public Health, 88, 1377–1380.
Kulldorff, M., Feuer, E. J., Miller, B. A., and Freedman, L. S. (1997). Breast cancer clusters in the northeast United States: a geographic analysis, American Journal of Epidemiology, 146(2), 161–170.
Kulldorff, M., Huang, L., Pickle, L., and Duczmal, L. (2006). An elliptic spatial scan statistic, Statistics in Medicine, 25, 3929–3943.
Kulldorff, M., Mostashari, F., Duczmal, L., Yih, W. K., Kleinman, K., and Platt, R. (2007). Multivariate scan statistics for disease surveillance, Statistics in Medicine, 26, 1824–1833.
Kulldorff, M., and Nagarwalla, N. (1995). Spatial disease clusters: detection and inference, Statistics in Medicine, 14, 799–810.
Mollié, A. (1999). Bayesian and empirical Bayes approaches to disease mapping, In Lawson, A. B., Biggeri, A., Böhning, D., Lesaffre, E., Viel, J.-F., and Bertollini, R., Disease Mapping and Risk Assessment for Public Health, Wiley, New York.
Moore, A., and Wong, W.-K. (2003). Optimal reinsertion: a new search operator for accelerated and more accurate Bayesian network structure learning, In Proceedings of the 20th Intl. Conf. on Machine Learning, 552–559.
Mostashari, F., Kulldorff, M., Hartman, J. J., Miller, J. R., and Kulasekera, V. (2003). Dead bird clustering: a potential early warning system for West Nile virus activity, Emerging Infectious Diseases, 9, 641–646.
Neill, D. B. (2006). Detection of spatial and spatio-temporal clusters, CMU-CS-06-142, Ph.D. thesis, Carnegie Mellon University, School of Computer Science.
Neill, D. B. (2007). Incorporating learning into disease surveillance systems, Advances in Disease Surveillance, 4, 107.
Neill, D. B., and Cooper, G. F. (2008). A multivariate Bayesian scan statistic for early event detection and characterization, Machine Learning, in press.
Neill, D. B., and Lingwall, J. (2007). A nonparametric scan statistic for multivariate disease surveillance, Advances in Disease Surveillance, 4, 106.
Neill, D. B., and Moore, A. W. (2004). Rapid detection of significant spatial clusters, In Proc. 10th ACM SIGKDD Conf. on Knowledge Discovery and Data Mining, 256–265.
Neill, D. B., Moore, A. W., and Cooper, G. F. (2006). A Bayesian spatial scan statistic, In Advances in Neural Information Processing Systems 18, 1003–1010.
Neill, D. B., Moore, A. W., and Cooper, G. F. (2007). A multivariate Bayesian scan statistic, Advances in Disease Surveillance, 2, 60.
Neill, D. B., Moore, A. W., and Sabhnani, M. R. (2005a). Detecting elongated disease clusters, Morbidity and Mortality Weekly Report, 54 (Supplement on Syndromic Surveillance), 197.
Neill, D. B., Moore, A. W., Sabhnani, M. R., and Daniel, K. (2005b). Detection of emerging space-time clusters, In Proc. 11th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining.
Neill, D. B., and Sabhnani, M. R. (2007). A robust expectation-based spatial scan statistic, Advances in Disease Surveillance, 2, 61.
Patil, G. P., and Taillie, C. (2004). Upper level set scan statistic for detecting arbitrarily shaped hotspots, Envir. Ecol. Stat., 11, 183–197.
Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, San Mateo, CA.
Tango, T., and Takahashi, K. (2005). A flexibly shaped spatial scan statistic for detecting clusters, International Journal of Health Geographics, 4, 11.
Wong, W.-K., Moore, A. W., Cooper, G. F., and Wagner, M. M. (2003a). Bayesian network anomaly pattern detection for disease outbreaks, In Proc. 20th International Conference on Machine Learning.
Wong, W.-K., Moore, A. W., Cooper, G. F., and Wagner, M. M. (2003b). WSARE: What’s strange about recent events? Journal of Urban Health, 80(2 Suppl. 1), i66–i75.
Ye, N., and Xu, M. (2000). Probabilistic networks with undirected links for anomaly detection, In IEEE Systems, Man, and Cybernetics Information Assurance and Security Workshop, 175–179.
Acknowledgements
This work was partially supported by NSF grant IIS-0325581 and CDC grant 8 R01 HK000020 02. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of NSF or CDC.
Copyright information
© 2009 Birkhäuser Boston, a part of Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Neill, D.B., Cooper, G.F., Das, K., Jiang, X., Schneider, J. (2009). Bayesian Network Scan Statistics for Multivariate Pattern Detection. In: Glaz, J., Pozdnyakov, V., Wallenstein, S. (eds) Scan Statistics. Statistics for Industry and Technology. Birkhäuser Boston. https://doi.org/10.1007/978-0-8176-4749-0_11
DOI: https://doi.org/10.1007/978-0-8176-4749-0_11
Publisher Name: Birkhäuser Boston
Print ISBN: 978-0-8176-4748-3
Online ISBN: 978-0-8176-4749-0
eBook Packages: Mathematics and Statistics (R0)