Abstract
Most multimedia surveillance and monitoring systems today use multiple types of sensors to detect events of interest as they occur in the environment. However, because the sensors are diverse and asynchronous, information assimilation (how to combine the information obtained from asynchronous and multifarious sources) is an important and challenging research problem. In this paper, we propose a framework for information assimilation that addresses the questions of “when”, “what” and “how” to assimilate the information obtained from different media sources in order to detect events in multimedia surveillance systems. The proposed framework adopts a hierarchical probabilistic assimilation approach to detect atomic and compound events. To detect an event, our framework uses not only the media streams available at the current instant but also two important properties of those streams: first, the accumulated history of whether they have provided concurring or contradictory evidence, and second, the system designer’s confidence in them. The experimental results show the utility of the proposed framework.
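The two ingredients named in the abstract, per-stream confidence and accumulated agreement history, can be illustrated with a minimal sketch. This is not the paper's formulation: the confidence-weighted log-linear opinion pool, the running-mean agreement measure, and the names `assimilate` and `update_agreement` are all assumptions chosen for illustration, and how the agreement value would enter the combination is left open because the abstract does not specify it.

```python
import math


def update_agreement(prev, p_a, p_b, n):
    """Running mean of agreement between two streams (hypothetical measure).

    Agreement at one instant is 1 - 2*|p_a - p_b|, which is +1 when both
    streams report the same probability and -1 when they are fully opposed.
    `prev` is the mean over the `n` instants seen so far.
    """
    current = 1.0 - 2.0 * abs(p_a - p_b)
    return (prev * n + current) / (n + 1)


def assimilate(probs, confidences):
    """Confidence-weighted log-linear pool of per-stream event probabilities.

    `probs[i]` is stream i's probability that the event is occurring and
    `confidences[i]` is the designer-assigned confidence in that stream
    (assumed non-negative with a positive sum).
    """
    total = sum(confidences)
    weights = [c / total for c in confidences]
    eps = 1e-9  # guards against log(0) for probabilities of exactly 0 or 1
    log_yes = sum(w * math.log(max(p, eps)) for w, p in zip(weights, probs))
    log_no = sum(w * math.log(max(1.0 - p, eps)) for w, p in zip(weights, probs))
    yes, no = math.exp(log_yes), math.exp(log_no)
    return yes / (yes + no)


if __name__ == "__main__":
    # Two streams (for example, one audio and one video detector), with more trust in the first.
    print(assimilate([0.9, 0.6], [0.7, 0.3]))
    # Agreement history after one concurring and one contradicting instant.
    g = update_agreement(0.0, 0.8, 0.8, 0)
    print(update_agreement(g, 1.0, 0.0, 1))
```

In this sketch a stream with zero confidence has no influence on the result, and streams that report the same probability reproduce that probability regardless of how the confidence is split between them.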
Cite this article
Atrey, P.K., Kankanhalli, M.S. & Jain, R. Information assimilation framework for event detection in multimedia surveillance systems. Multimedia Systems 12, 239–253 (2006). https://doi.org/10.1007/s00530-006-0063-8
DOI: https://doi.org/10.1007/s00530-006-0063-8