Bayesian Network Scan Statistics for Multivariate Pattern Detection

Part of the book series: Statistics for Industry and Technology (SIT)

Abstract

We review three recently proposed scan statistic methods for multivariate pattern detection. Each method models the relationship between multiple observed and hidden variables using a Bayesian network structure, drawing inferences about the underlying pattern type and the affected subset of the data. We first discuss the multivariate Bayesian scan statistic (MBSS) proposed by Neill and Cooper (2008). MBSS is a stream-based event surveillance framework that detects and characterizes events given the aggregate counts for multiple data streams. Next, we describe the agent-based Bayesian scan statistic (ABSS) proposed by Jiang et al. (2008). ABSS performs event detection and characterization given individual-level data for each agent in a population. Finally, we review the anomalous group detection (AGD) method proposed by Das, Schneider, and Neill (2008). AGD is a general pattern detection approach which learns a Bayesian network structure from data and detects anomalous groups of records.

References

  1. Bronstein, A., Das, J., Duro, M., Friedrich, R., Kleyner, G., Mueller, M., Singhal, S., and Cohen, I. (2001). Bayesian networks for detecting anomalies in internet-based services, In Intl. Symposium on Integrated Network Mgmt.

  2. Clayton, D., and Kaldor, J. (1987). Empirical Bayes estimates of age-standardized relative risks for use in disease mapping, Biometrics, 43, 671–681.

  3. Cooper, G. F., Dash, D. H., Levander, J. D., Wong, W.-K., Hogan, W. R., and Wagner, M. M. (2004). Bayesian biosurveillance of disease outbreaks, In Proc. Conference on Uncertainty in Artificial Intelligence.

  4. Cooper, G. F., Dowling, J. N., Levander, J. D., and Sutovsky, P. (2007). A Bayesian algorithm for detecting CDC Category A outbreak diseases from emergency department chief complaints, Advances in Disease Surveillance, 2, 45.

  5. Das, K., Schneider, J., and Neill, D. B. (2008). Detecting anomalous groups in categorical datasets, submitted for publication, Carnegie Mellon University, School of Computer Science.

  6. Dong-Her, S., Hsiu-Sen, C., Chun-Yuan, C., and Lin, B. (2004). Internet security: malicious e-mails detection and protection, Industrial Mgmt. and Data Sys., 104, 613–623.

  7. Duczmal, L., and Assuncao, R. (2004). A simulated annealing strategy for the detection of arbitrary shaped spatial clusters, Computational Statistics and Data Analysis, 45, 269–286.

  8. Heckerman, D., Geiger, D., and Chickering, M. (1995). Learning Bayesian networks: the combination of knowledge and statistical data, Machine Learning, 20, 197–243.

  9. Hjalmars, U., Kulldorff, M., Gustafsson, G., and Nagarwalla, N. (1996). Childhood leukemia in Sweden: using GIS and a spatial scan statistic for cluster detection, Statistics in Medicine, 15, 707–715.

  10. Jiang, X., Neill, D. B., and Cooper, G. F. (2008). A Bayesian network model for spatial event surveillance, International Journal of Approximate Reasoning, in press.

  11. Kleinman, K., Abrams, A., Kulldorff, M., and Platt, R. (2005). A model-adjusted space-time scan statistic with an application to syndromic surveillance, Epidemiology and Infection, 133(3), 409–419.

  12. Kulldorff, M. (1997). A spatial scan statistic, Communications in Statistics: Theory and Methods, 26(6), 1481–1496.

  13. Kulldorff, M. (2001). Prospective time-periodic geographical disease surveillance using a scan statistic, Journal of the Royal Statistical Society A, 164, 61–72.

  14. Kulldorff, M., Athas, W., Feuer, E., Miller, B., and Key, C. (1998). Evaluating cluster alarms: a space-time scan statistic and cluster alarms in Los Alamos, American Journal of Public Health, 88, 1377–1380.

  15. Kulldorff, M., Feuer, E. J., Miller, B. A., and Freedman, L. S. (1997). Breast cancer clusters in the northeast United States: a geographic analysis, American Journal of Epidemiology, 146(2), 161–170.

  16. Kulldorff, M., Huang, L., Pickle, L., and Duczmal, L. (2006). An elliptic spatial scan statistic, Statistics in Medicine, 25, 3929–3943.

  17. Kulldorff, M., Mostashari, F., Duczmal, L., Yih, W. K., Kleinman, K., and Platt, R. (2007). Multivariate scan statistics for disease surveillance, Statistics in Medicine, 26, 1824–1833.

  18. Kulldorff, M., and Nagarwalla, N. (1995). Spatial disease clusters: detection and inference, Statistics in Medicine, 14, 799–810.

  19. Mollié, A. (1999). Bayesian and empirical Bayes approaches to disease mapping, In Lawson, A. B., Biggeri, A., Böhning, D., Lesaffre, E., Viel, J.-F., and Bertollini, R. (eds.), Disease Mapping and Risk Assessment for Public Health, Wiley, New York.

  20. Moore, A., and Wong, W.-K. (2003). Optimal reinsertion: a new search operator for accelerated and more accurate Bayesian network structure learning, In Proceedings of the 20th Intl. Conf. on Machine Learning, 552–559.

  21. Mostashari, F., Kulldorff, M., Hartman, J. J., Miller, J. R., and Kulasekera, V. (2003). Dead bird clustering: a potential early warning system for West Nile virus activity, Emerging Infectious Diseases, 9, 641–646.

  22. Neill, D. B. (2006). Detection of spatial and spatio-temporal clusters, CMU-CS-06-142, Ph.D. thesis, Carnegie Mellon University, School of Computer Science.

  23. Neill, D. B. (2007). Incorporating learning into disease surveillance systems, Advances in Disease Surveillance, 4, 107.

  24. Neill, D. B., and Cooper, G. F. (2008). A multivariate Bayesian scan statistic for early event detection and characterization, Machine Learning, in press.

  25. Neill, D. B., and Lingwall, J. (2007). A nonparametric scan statistic for multivariate disease surveillance, Advances in Disease Surveillance, 4, 106.

  26. Neill, D. B., and Moore, A. W. (2004). Rapid detection of significant spatial clusters, In Proc. 10th ACM SIGKDD Conf. on Knowledge Discovery and Data Mining, 256–265.

  27. Neill, D. B., Moore, A. W., and Cooper, G. F. (2006). A Bayesian spatial scan statistic, In Advances in Neural Information Processing Systems 18, 1003–1010.

  28. Neill, D. B., Moore, A. W., and Cooper, G. F. (2007). A multivariate Bayesian scan statistic, Advances in Disease Surveillance, 2, 60.

  29. Neill, D. B., Moore, A. W., and Sabhnani, M. R. (2005a). Detecting elongated disease clusters, Morbidity and Mortality Weekly Report, 54 (Supplement on Syndromic Surveillance), 197.

  30. Neill, D. B., Moore, A. W., Sabhnani, M. R., and Daniel, K. (2005b). Detection of emerging space-time clusters, In Proc. 11th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining.

  31. Neill, D. B., and Sabhnani, M. R. (2007). A robust expectation-based spatial scan statistic, Advances in Disease Surveillance, 2, 61.

  32. Patil, G. P., and Taillie, C. (2004). Upper level set scan statistic for detecting arbitrarily shaped hotspots, Envir. Ecol. Stat., 11, 183–197.

  33. Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, San Mateo, CA.

  34. Tango, T., and Takahashi, K. (2005). A flexibly shaped spatial scan statistic for detecting clusters, International Journal of Health Geographics, 4, 11.

  35. Wong, W.-K., Moore, A. W., Cooper, G. F., and Wagner, M. M. (2003a). Bayesian network anomaly pattern detection for disease outbreaks, In Proc. 20th International Conference on Machine Learning.

  36. Wong, W.-K., Moore, A. W., Cooper, G. F., and Wagner, M. M. (2003b). WSARE: What’s strange about recent events? Journal of Urban Health, 80(2 Suppl. 1), i66–i75.

  37. Ye, N., and Xu, M. (2000). Probabilistic networks with undirected links for anomaly detection, In IEEE Systems, Man, and Cybernetics Information Assurance and Security Workshop, 175–179.

Acknowledgements

This work was partially supported by NSF grant IIS-0325581 and CDC grant 8 R01 HK000020 02. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of NSF or CDC.

Copyright information

© 2009 Birkhäuser Boston, a part of Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Neill, D.B., Cooper, G.F., Das, K., Jiang, X., Schneider, J. (2009). Bayesian Network Scan Statistics for Multivariate Pattern Detection. In: Glaz, J., Pozdnyakov, V., Wallenstein, S. (eds) Scan Statistics. Statistics for Industry and Technology. Birkhäuser Boston. https://doi.org/10.1007/978-0-8176-4749-0_11
