Entity Association Using Context for Wide Area Motion Imagery Target Tracking

  • Erik Blasch
  • Pengpeng Liang
  • Xinchu Shi
  • Peiyi Li
  • Haibin Ling
Chapter
Part of the Advances in Computer Vision and Pattern Recognition book series (ACVPR)

Abstract

Entity estimation includes tracking the numbers and types of targets in a scene. It is challenging in large area surveillance because of high target density, severe ambiguity among similar targets, and a low sensor frame rate. Context information can aid moving vehicle detection from wide area aerial surveillance. In this paper, we use the maximum consistency context (MCC) as spatiotemporal information to estimate multiple targets, and temporal context (TC) to capture road information. For a candidate association, the MCC is defined as the most consistent association in its neighborhood. This maximum selection keeps the reliable neighborhood context information while filtering out noisy and distracting data. In contrast with previous methods that exploit road information, TC does not require locating the road first or using geographical information system (GIS) data. We first use background subtraction to generate candidates and then build MCC/TC from the candidates that have been classified as positive by histograms of oriented gradients (HOG) with multiple kernel learning (MKL). For each positive candidate, a region around the candidate is divided into several subregions based on the direction of the candidate, each subregion is divided into 12 bins of fixed length, and the TC, a histogram, is built from the positions of the positive candidates in eight consecutive frames. To benefit from both appearance and context information, we use MKL to combine MCC/TC and HOG. We demonstrate the usefulness of context modeling for multi-target tracking on three challenging wide area motion imagery (WAMI) sequences from the publicly available Columbus Large Image Format (CLIF) 2006 dataset. Both quantitative and qualitative results show the effectiveness of MCC and TC information compared with algorithms that use no context information. The experiments also show that the proposed MCC/TC help remove false positives that lie away from the road, and that combining TC and HOG with MKL outperforms using TC or HOG alone.
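
The TC construction described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the 12 bins per subregion and the eight-frame window come from the abstract, while the four angular subregions, the 10-pixel bin length, the normalization, and the function name `temporal_context` are hypothetical choices made only for this example.

```python
import numpy as np

N_BINS = 12        # bins per subregion (stated in the abstract)
N_FRAMES = 8       # consecutive frames pooled (stated in the abstract)
N_SECTORS = 4      # ASSUMPTION: number of direction-aligned subregions
BIN_LENGTH = 10.0  # ASSUMPTION: fixed bin length, in pixels


def temporal_context(center, direction, positives_per_frame,
                     n_sectors=N_SECTORS, n_bins=N_BINS,
                     bin_length=BIN_LENGTH):
    """Histogram of positive-candidate positions around one candidate.

    center: (x, y) of the candidate.
    direction: (dx, dy) motion direction of the candidate.
    positives_per_frame: list of frames, each a list of (x, y) positions
        of candidates classified as positive in that frame.
    Returns a flattened, normalized histogram of length n_sectors * n_bins.
    """
    center = np.asarray(center, dtype=float)
    heading = np.arctan2(direction[1], direction[0])
    sector_width = 2.0 * np.pi / n_sectors
    hist = np.zeros((n_sectors, n_bins))

    for frame in positives_per_frame[:N_FRAMES]:
        pts = np.asarray(frame, dtype=float).reshape(-1, 2)
        offsets = pts - center
        dist = np.hypot(offsets[:, 0], offsets[:, 1])
        # Angle of each neighbor relative to the candidate's heading;
        # the half-sector shift centers sector 0 on the heading itself.
        rel = np.arctan2(offsets[:, 1], offsets[:, 0]) - heading
        rel = (rel + sector_width / 2.0) % (2.0 * np.pi)
        sector = (rel // sector_width).astype(int) % n_sectors
        # Fixed-length distance bins within each subregion.
        rbin = (dist // bin_length).astype(int)
        # Skip points at the candidate's own position and points that
        # fall beyond the last bin.
        keep = (dist > 0) & (rbin < n_bins)
        np.add.at(hist, (sector[keep], rbin[keep]), 1)

    total = hist.sum()
    return (hist / total if total > 0 else hist).ravel()
```

Because the subregions are aligned with the candidate's direction, a vehicle on a road tends to accumulate mass in the bins ahead of and behind it over the eight frames, whereas an isolated false positive away from the road yields a nearly empty histogram. The resulting vector could then be passed to a kernel alongside HOG features.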

Keywords

  • Wide area motion imagery
  • Temporal context
  • Spatial context
  • Maximum consistency context
  • Background subtraction
  • Histogram of gradients
  • Multiple kernel learning
  • Target tracking
  • Road abstraction

Notes

Acknowledgments

This work was supported in part by the Air Force Office of Scientific Research (AFOSR) under the Dynamic Data Driven Application Systems program and by the Air Force Research Lab, and in part by NSF Grants IIS-1218156 and IIS-1350521.

Copyright information

© Springer International Publishing Switzerland (outside the USA) 2016

Authors and Affiliations

  • Erik Blasch (1, corresponding author)
  • Pengpeng Liang (2)
  • Xinchu Shi (2)
  • Peiyi Li (2)
  • Haibin Ling (2)

  1. Air Force Research Lab, Rome, USA
  2. Department of Computer and Information Sciences, Temple University, Philadelphia, USA