Scalable Multi-camera Tracking in a Metropolis

  • Yogesh Raja
  • Shaogang Gong
Part of the Advances in Computer Vision and Pattern Recognition book series (ACVPR)


Most work in person re-identification focuses on the matching process at an algorithmic level, from identifying reliable features to formulating effective classifiers and distance metrics, with the aim of improving matching scores on established ‘closed-world’ benchmark datasets of limited scope and size. Very little work has explored the pragmatic and ultimately challenging question of how to engineer working systems that best leverage the strengths and tolerate the weaknesses of current state-of-the-art re-identification techniques, and that are capable of scaling to ‘open-world’ operational requirements in a large urban environment. In this work, we present the design rationale, implementation considerations and quantitative evaluation of a retrospective forensic tool known as Multi-Camera Tracking (MCT). The MCT system was developed for re-identifying and back-tracking individuals within huge quantities of open-world CCTV video data sourced from a large distributed multi-camera network encompassing different public transport hubs in a metropolis. Three key characteristics of MCT (associativity, capacity and accessibility) underpin its scalability to spatially large, temporally diverse, highly crowded and topologically complex urban environments with transport links. We discuss a multitude of functional features that in combination address these characteristics. We consider computer vision techniques and machine learning algorithms, including relative feature ranking for inter-camera matching, global (crowd-level) and local (person-specific) space–time profiling, attribute re-ranking and machine-guided data mining using a ‘man-in-the-loop’ interactive paradigm. We also discuss implementation considerations designed to facilitate linear scalability to an arbitrary number of cameras by employing a distributed computing architecture.
We conduct quantitative trials to illustrate the potential of the MCT system and its performance characteristics in coping with very large-scale open-world multi-camera data covering crowded transport hubs in a metropolis.
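
To make the combination of relative feature ranking and space–time profiling more concrete, the following is a minimal illustrative sketch and not the MCT implementation. It assumes that per-feature weights are already available (for example from a learned ranking model), that appearance is compared with a weighted L1 distance, and that the expected transit time between two camera views is summarised by a Gaussian with a given mean and standard deviation. All function and parameter names (`weighted_distance`, `transit_penalty`, `rank_candidates`, `prior_weight`) are hypothetical.

```python
def weighted_distance(query, candidate, weights):
    """Weighted L1 distance between two feature vectors (lower is more similar)."""
    return sum(w * abs(q - c) for w, q, c in zip(weights, query, candidate))


def transit_penalty(delta_t, mean_t, std_t):
    """Negative log of an unnormalised Gaussian over inter-camera transit time.

    delta_t, mean_t and std_t are in seconds; the Gaussian form is an
    assumption made for this sketch only.
    """
    return 0.5 * ((delta_t - mean_t) / std_t) ** 2


def rank_candidates(query, candidates, weights, mean_t, std_t, prior_weight=1.0):
    """Return candidate ids ordered from best to worst match.

    Each candidate is a tuple (candidate_id, feature_vector, delta_t), where
    delta_t is the time elapsed between the query sighting and the candidate.
    """
    scored = []
    for cand_id, features, delta_t in candidates:
        appearance = weighted_distance(query, features, weights)
        # Implausible transit times push a candidate down the ranking.
        score = appearance + prior_weight * transit_penalty(delta_t, mean_t, std_t)
        scored.append((score, cand_id))
    return [cand_id for _, cand_id in sorted(scored)]
```

In this sketch, setting `prior_weight` to zero ranks candidates by appearance alone, whereas a positive value demotes candidates whose transit time is implausible for the camera pair, even when their appearance matches the query closely.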


Keywords (added by machine, not by the authors): Camera View, Correct Match, Query Engine, Candidate Match, Search Iteration



We thank Lukasz Zalewski, Tao Xiang, Robert Koger, Tim Hospedales, Ryan Layne, Chen Change Loy and Richard Howarth of Vision Semantics and Queen Mary University of London who contributed to this work; Colin Lewis, Gari Owen and Andrew Powell of the UK MOD SA(SD) who made this work possible; Zsolt Husz, Antony Waldock, Edward Campbell and Paul Zanelli of BAE Systems who collaborated on this work; and Toby Nortcliffe of the UK Home Office CAST who assisted in setting up the trial environment and data capture.



Copyright information

© Springer-Verlag London 2014

Authors and Affiliations

  1. Vision Semantics Ltd, London, UK
  2. Queen Mary University of London, London, UK
