Data Mining and Knowledge Discovery, Volume 20, Issue 3, pp 328–360

A real-time temporal Bayesian architecture for event surveillance and its application to patient-specific multiple disease outbreak detection

  • Xia Jiang
  • Gregory F. Cooper


Reliable and accurate detection of disease outbreaks remains an important research topic in disease outbreak surveillance. A temporal surveillance system bases its analysis not only on data from the most recent time period but also on data from previous time periods, whereas a non-temporal system looks only at data from the most recent time period. A non-temporal system faces two difficulties when it is used to monitor real data, which often contain noise. First, it is prone to produce false positive signals during non-outbreak time periods. Second, during an outbreak, it tends to produce false negative signals early in the outbreak, which can adversely affect the decision making of the system's user. We conjecture that converting a non-temporal system to a temporal one may attenuate these difficulties. In this paper, we propose a Bayesian network architecture for a class of temporal event surveillance models called BayesNet-T. Using this architecture, we can convert certain non-temporal surveillance systems to temporal ones. We apply the architecture to a previously developed non-temporal multiple-disease outbreak detection system called PC and create a temporal system called PCT. PCT takes Emergency Department (ED) patient chief complaint data as its input. The PCT system was constructed using both data (for non-outbreak diseases) and expert assessments (for outbreak diseases). We compare PCT to PC using a real influenza outbreak. Furthermore, we compare PCT to both PC and the classic statistical methods CUSUM and EWMA using a total of 240 influenza and Cryptosporidium disease outbreaks created by injecting stochastically simulated outbreak cases into real ED admission data. Our results indicate that PCT has a smaller mean time to detection than PC at low false alarm rates, and that PCT is more stable than PC in that, once an outbreak is detected, PCT is better at maintaining the detection signal on future days.
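For readers unfamiliar with the two classic baselines named above, the following is a minimal sketch of textbook one-sided CUSUM and EWMA statistics applied to a series of daily counts. It is not the paper's implementation: the function names, the reference value `k`, the threshold `h`, the smoothing weight `lam`, and the example counts are all illustrative assumptions.

```python
from typing import List, Optional, Sequence


def cusum(counts: Sequence[float], mean: float, std: float, k: float = 0.5) -> List[float]:
    """Upper one-sided CUSUM on standardized counts.

    S_t = max(0, S_{t-1} + z_t - k), with z_t = (x_t - mean) / std.
    `mean` and `std` describe the assumed non-outbreak baseline.
    """
    s = 0.0
    out = []
    for x in counts:
        z = (x - mean) / std
        s = max(0.0, s + z - k)
        out.append(s)
    return out


def ewma(counts: Sequence[float], start: float, lam: float = 0.3) -> List[float]:
    """Exponentially weighted moving average.

    E_t = lam * x_t + (1 - lam) * E_{t-1}, with E_0 = start.
    """
    e = start
    out = []
    for x in counts:
        e = lam * x + (1 - lam) * e
        out.append(e)
    return out


def first_alarm(stat: Sequence[float], h: float) -> Optional[int]:
    """Index of the first day the statistic exceeds threshold h, or None."""
    for day, value in enumerate(stat):
        if value > h:
            return day
    return None


# Hypothetical daily ED counts: four baseline days, then a rise.
daily_counts = [10, 10, 10, 10, 16, 18, 20]

cusum_stat = cusum(daily_counts, mean=10.0, std=2.0, k=0.5)
ewma_stat = ewma(daily_counts, start=10.0, lam=0.5)

print(cusum_stat)                      # CUSUM statistic per day
print(ewma_stat)                       # EWMA statistic per day
print(first_alarm(cusum_stat, h=4.0))  # first day CUSUM exceeds the assumed threshold
```

Both statistics carry information forward from earlier days, which is why they serve as natural temporal baselines; in practice the baseline mean, standard deviation, and alarm threshold would be estimated from historical non-outbreak data rather than fixed by hand as here.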


Keywords: Temporal disease outbreak detection; Bayesian network; Patient-specific model; Mining ED chief complaint data; Uncertainty modeling; Biosurveillance





Copyright information

© The Author(s) 2009

Authors and Affiliations

  1. Department of Biomedical Informatics, School of Medicine, University of Pittsburgh, Pittsburgh, USA
