Abstract
Latent semantic analysis (LSA) has been widely used in computer vision and pattern recognition. Most existing LSA-based work focuses on behavior recognition and motion classification. In visual surveillance applications, accurate tracking of moving people in surveillance scenes is regarded as a preliminary requirement for other tasks such as object recognition or segmentation. However, accurate tracking is extremely hard in challenging surveillance scenes where multiple objects look similar or occlude one another. Conventional tracking algorithms based on temporal Markov chains suffer from the ‘tracking error accumulation problem’: the accumulated errors can eventually cause the tracker to drift away from the target. To handle tracking drift, some authors have proposed using detection along with tracking as an effective solution. However, many critical issues remain unsettled in these detection-based tracking algorithms. In this paper, we propose a novel method for tracking moving people with detection based on (probabilistic) LSA. By employing a novel ‘twin-pipeline’ training framework to find the latent semantic topics of ‘moving people’, the proposed detection can effectively detect interest points on moving people in different indoor and outdoor environments with camera motion. Since the detected interest points on different body parts can be used to locate moving people more accurately, combining the detection with incremental subspace learning based tracking allows the proposed algorithm to resolve the problem of tracking drift during each target appearance update. In addition, due to the time-independent processing mechanism of the detection, the proposed method is also able to handle the error accumulation problem: the detection calibrates the tracking errors each time the state of the tracking algorithm is updated.
Extensive experiments in various surveillance environments using different benchmark datasets demonstrate the accuracy and robustness of the proposed tracking algorithm. Furthermore, the experimental comparisons clearly show that the proposed tracking algorithm outperforms well-known tracking algorithms such as ISL, AMS and WSL. The speed of the proposed method is also satisfactory for realistic surveillance applications.
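As a rough illustration of the probabilistic LSA (pLSA) model that the detection builds on, the sketch below fits the standard pLSA mixture P(w|d) = Σ_z P(w|z) P(z|d) with the EM algorithm on a small count matrix. It is only a minimal sketch of generic pLSA, not the ‘twin-pipeline’ training framework of the paper, whose details are not given in the abstract. The function name `plsa`, its parameters, and the reading of rows as image patches ("documents") and columns as quantized interest-point descriptors ("visual words") are assumptions made for illustration.

```python
import random


def _normalize(row, eps=1e-12):
    """Return row scaled to sum to 1; eps keeps every entry strictly positive."""
    shifted = [v + eps for v in row]
    total = sum(shifted)
    return [v / total for v in shifted]


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit generic pLSA by EM (hypothetical helper, not the paper's code).

    counts[d][w] is how often word w occurs in document d.
    Returns (p_z_d, p_w_z): P(topic | document) and P(word | topic).
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    # Random initialisation; every row is a probability distribution.
    p_z_d = [_normalize([rng.random() for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [_normalize([rng.random() for _ in range(n_words)])
             for _ in range(n_topics)]

    for _ in range(n_iter):
        acc_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        acc_w_z = [[0.0] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                n_dw = counts[d][w]
                if n_dw == 0:
                    continue
                # E-step: posterior P(z | d, w) is proportional to P(w|z) P(z|d).
                joint = [p_w_z[z][w] * p_z_d[d][z] for z in range(n_topics)]
                norm = sum(joint)
                for z in range(n_topics):
                    resp = n_dw * joint[z] / norm
                    acc_z_d[d][z] += resp
                    acc_w_z[z][w] += resp
        # M-step: re-estimate both distributions from the expected counts.
        p_z_d = [_normalize(row) for row in acc_z_d]
        p_w_z = [_normalize(row) for row in acc_w_z]
    return p_z_d, p_w_z


if __name__ == "__main__":
    # Toy data (made up): two groups of documents using disjoint words.
    toy = [[4, 3, 5, 0, 0, 0],
           [5, 4, 3, 0, 0, 0],
           [0, 0, 0, 4, 5, 3],
           [0, 0, 0, 3, 4, 5]]
    topic_given_doc, word_given_topic = plsa(toy, n_topics=2)
    for row in topic_given_doc:
        print([round(p, 3) for p in row])
```

In a detection setting along these lines, one would presumably inspect P(z|d) for each patch and keep the interest points whose dominant topic corresponds to ‘moving people’; how the paper actually selects and uses the topics is not specified in the abstract.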
Acknowledgements
The authors would like to thank the senior software engineers, Mr. Wenbo Hu and Mr. Linshu Bai, for their invaluable support in implementing the proposed algorithms using NVIDIA CUDA programming.
Cite this article
Zhang, P., Zhang, Y., Thomas, T. et al. Moving people tracking with detection by latent semantic analysis for visual surveillance applications. Multimed Tools Appl 68, 991–1021 (2014). https://doi.org/10.1007/s11042-012-1110-4
DOI: https://doi.org/10.1007/s11042-012-1110-4