Abstract
In this paper, we present the design and implementation of an appearance-based system for person re-identification in a camera network. The system builds an online database that contains the history of every person who enters the field of view of the cameras. It assigns an identifier to each detected person and keeps that identifier within the same camera and across the other cameras, even if the person disappears and then reappears. The system comprises a moving-object detection step that combines the Mixture of Gaussians method with a proposed difference method to improve the detection results, and a tracking step implemented with the sum of squared differences algorithm. The re-identification stage relies on two techniques: the intersection of tracking and detection for temporal association, and histogram comparison. The complete system was tested on a real data set collected by three cameras. The experimental results show that our approach gives very satisfactory results.
1 Introduction
In recent years, the use of video surveillance has expanded steadily. This has led to an increase in the number of cameras installed in different places (private or public), making their exploitation and monitoring very difficult for human operators. That is why much research has been done to create intelligent vision systems that can help operators interpret scenes and raise alarms in case of an anomaly. There are currently several types of video surveillance systems (access control in sensitive locations, people recognition, control of traffic congestion, etc.).
In this paper, we are interested in the problem of person re-identification in a camera network. Re-identification in computer vision systems aims to follow a person, assign him or her an identifier, and store it in a database. If the person leaves the scene and then reappears in the field of view of any camera, he or she is assigned the same identifier. In a crowded and uncontrolled environment observed by cameras from unknown distances, person re-identification relying on conventional biometrics, such as face recognition, is neither feasible nor reliable, because the conditions are insufficiently constrained and the images lack the detail needed to extract robust biometrics [17]. Instead, visual features based on the appearance of people, determined by their clothing and by objects carried or associated with them, can be exploited more reliably for re-identification.
The remainder of this paper is organized as follows: Sect. 2 presents some related works from the literature. Section 3 describes in detail each block of the proposed system. The experimental results and their discussion are presented in Sect. 4. Finally, some conclusions are drawn in Sect. 5.
2 Related Works
In the literature, re-identification approaches can be grouped into several classes, according to several criteria [12]:
1. The number of images per person: This class comprises two families. The first is the family of mono-sample methods, where the signature of a person is extracted from a single image, as in [1, 3, 6, 15, 16, 24]. The second is the family of multi-sample methods, where multiple images are used to compute the signature of a person, as in [4, 5, 7, 11, 14, 19, 21].
2. The type of representation: The first family in this class is that of global approaches, where all the information in the image is exploited to compute the person's signature, as in [1, 2, 13]. The second family is that of local approaches, which represent the image by several feature vectors, each describing a region or a locally detected point, as in [5, 8, 9].
3. The existence of a set of images mapped a priori: This class includes supervised approaches, as in [2, 3, 14], and unsupervised approaches, as in [10, 23].
A comprehensive survey of people re-identification approaches is presented in [22]. There, the approaches are organized in a multidimensional taxonomy according to camera setting, sample set cardinality, signature, adoption of a body model, machine learning techniques and application scenario.
3 Description of the Proposed System
In this section, we describe the different stages of the proposed system for person re-identification in a non-overlapping camera network. These stages are: person detection, localization, verification, tracking and re-identification. The overall flowchart of the proposed system is shown in Fig. 1.
3.1 Person Detection
This initial stage is accomplished by combining the Mixture of Gaussians (MoG) method [20] and the difference method. The MoG is one of the most used and successful methods in surveillance systems, because it is adaptive, and can handle multimodal backgrounds.
In the difference method, we first take the absolute difference between two successive grayscale images \( I_{g(t)} \) and \( I_{g(t-1)} \), as in Eq. (1), and then compare the resulting difference image \( I_{diff} \) to a threshold to detect moving pixels.
\[ I_{diff} = \left| I_{g(t)} - I_{g(t-1)} \right| \qquad (1) \]
The detections obtained from the MoG and difference methods are then combined using the logical OR operation.
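The difference method and the OR fusion can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function names, the use of an absolute difference, and the threshold value of 25 are ours, and the MoG mask is taken as an already computed boolean array.

```python
import numpy as np

def difference_mask(gray_prev, gray_curr, threshold=25):
    """Binary motion mask from two successive grayscale frames (uint8 arrays).

    The threshold value is a hypothetical choice, not taken from the paper.
    """
    # Cast to a signed type first so the subtraction cannot wrap around.
    diff = np.abs(gray_curr.astype(np.int16) - gray_prev.astype(np.int16))
    return diff > threshold

def combine_masks(mog_mask, diff_mask):
    """Fuse the two detections with a pixel-wise logical OR."""
    return np.logical_or(mog_mask, diff_mask)

# Toy example: one pixel changes between frames, another is flagged by MoG.
prev = np.zeros((4, 4), dtype=np.uint8)
curr = prev.copy()
curr[1, 1] = 200
mog = np.zeros((4, 4), dtype=bool)
mog[2, 2] = True
fused = combine_masks(mog, difference_mask(prev, curr))
```

In the toy example, the fused mask keeps both the pixel found by frame differencing and the one found by the background model, which is the intended effect of the OR combination.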
After detecting moving objects we fill the holes [18]. The holes of a binary image correspond to the set of its regional minima, which are not connected to the image border.
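One simple way to realize this definition, sketched here as an assumption rather than as the implementation used in the system, is to flood-fill the background starting from the image border: any background pixel that the fill cannot reach is a hole. The function name `fill_holes` is ours.

```python
from collections import deque
import numpy as np

def fill_holes(mask):
    """Fill background regions that are not connected to the image border."""
    mask = np.asarray(mask, dtype=bool)
    h, w = mask.shape
    reached = np.zeros_like(mask)  # background pixels reachable from the border
    queue = deque()
    # Seed the flood fill with every background pixel lying on the border.
    for r in range(h):
        for c in range(w):
            on_border = r in (0, h - 1) or c in (0, w - 1)
            if on_border and not mask[r, c]:
                reached[r, c] = True
                queue.append((r, c))
    # 4-connected flood fill over the background.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < h and 0 <= nc < w
                    and not mask[nr, nc] and not reached[nr, nc]):
                reached[nr, nc] = True
                queue.append((nr, nc))
    # Everything the fill did not reach is either foreground or a hole.
    return ~reached

# Toy example: a 3x3 ring with a one-pixel hole in its centre.
ring = np.zeros((5, 5), dtype=bool)
ring[1:4, 1:4] = True
ring[2, 2] = False
filled = fill_holes(ring)
```

Library routines such as `scipy.ndimage.binary_fill_holes` provide the same operation; the explicit version above only makes the border-connectivity criterion visible.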
3.2 Person Localization
The localization of the detected person is done using the labeling technique. This technique consists in separating the areas in the mask obtained from the detection step. We associate an integer value (label) with each area by using an 8-connected neighborhood, then we calculate some properties for each area, e.g. x and y coordinates, height, width and sum of foreground pixels.
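The labeling step can be illustrated with a small breadth-first search over the binary mask. The sketch below is an assumption-laden example: the function name, the dictionary keys, and the choice of the top-left corner as the (x, y) coordinates are ours.

```python
from collections import deque
import numpy as np

def label_regions(mask):
    """Label 8-connected foreground areas and collect their properties."""
    mask = np.asarray(mask, dtype=bool)
    h, w = mask.shape
    labels = np.zeros((h, w), dtype=np.int32)  # 0 means "not labeled"
    regions = []
    next_label = 0
    for r0 in range(h):
        for c0 in range(w):
            if not mask[r0, c0] or labels[r0, c0]:
                continue
            next_label += 1
            labels[r0, c0] = next_label
            queue = deque([(r0, c0)])
            rows, cols = [], []
            while queue:
                r, c = queue.popleft()
                rows.append(r)
                cols.append(c)
                # Visit the 8 neighbours (the centre is already labeled).
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = r + dr, c + dc
                        if (0 <= nr < h and 0 <= nc < w
                                and mask[nr, nc] and not labels[nr, nc]):
                            labels[nr, nc] = next_label
                            queue.append((nr, nc))
            regions.append({
                "label": next_label,
                "x": min(cols),                       # left edge of bounding box
                "y": min(rows),                       # top edge of bounding box
                "width": max(cols) - min(cols) + 1,
                "height": max(rows) - min(rows) + 1,
                "foreground": len(rows),              # number of foreground pixels
            })
    return labels, regions

mask = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 0, 1]], dtype=bool)
labels, regions = label_regions(mask)
```

With 8-connectivity the two diagonal pixels in the top-left corner form a single area, whereas a 4-connected neighborhood would split them into two.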
3.3 Verification
To eliminate false detections, we propose a verification phase. To be validated, each detected person has to satisfy the following three conditions:
- The ratio of width to height has to lie between min and max thresholds.
- The surface of the rectangle containing the person (surface = height \(\times \) width) has to lie between min and max thresholds. This eliminates very small and very big objects caused by false detections.
- The ratio of the sum of foreground pixels to the surface also has to lie within set limits.
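The three conditions translate directly into a small predicate. The sketch below assumes each detected area is described by its width, height and number of foreground pixels; all threshold values are hypothetical placeholders, since no numerical values are given here, and would have to be tuned for the camera setup.

```python
def is_valid_person(region,
                    ratio_range=(0.2, 0.8),      # hypothetical width/height bounds
                    surface_range=(800, 40000),  # hypothetical bounds, in pixels
                    fill_range=(0.3, 1.0)):      # hypothetical foreground/surface bounds
    """Return True if a detected area passes the three verification tests."""
    width, height = region["width"], region["height"]
    surface = width * height
    if surface == 0:
        return False
    aspect = width / height
    fill = region["foreground"] / surface
    return (ratio_range[0] <= aspect <= ratio_range[1]
            and surface_range[0] <= surface <= surface_range[1]
            and fill_range[0] <= fill <= fill_range[1])

# A tall, reasonably filled box is kept; a tiny square blob is rejected.
person = {"width": 40, "height": 100, "foreground": 2400}
noise = {"width": 5, "height": 5, "foreground": 20}
kept = is_valid_person(person)
rejected = is_valid_person(noise)
```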
3.4 Person Tracking
The person tracking process is done by template matching using the Sum of Squared Differences (SSD) algorithm. In digital image processing, the SSD measures how closely two image blocks match: the lower the value, the more similar the blocks. It is calculated by taking the square of the difference between each pixel in the candidate block X (a portion of the current frame) and the corresponding pixel in the block Y used for comparison (the model from the previous detection).
These squared differences are summed to create a simple metric of block similarity, as in Eq. (2); a value of zero means that the two blocks are identical. We sweep all positions in the frame, and the block with the smallest metric is the tracked block.
The SSD value for two blocks X and Y is calculated by:
\[ SSD(X, Y) = \sum_{i}\sum_{j} \left( X(i,j) - Y(i,j) \right)^{2} \qquad (2) \]
For a given Y model, the most similar block X is the one that minimizes the SSD.
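A minimal NumPy sketch of this exhaustive search is given below, assuming grayscale frames stored as 2-D arrays. The function names and the (x, y) convention for the top-left corner of the best block are our own choices.

```python
import numpy as np

def ssd(block_x, block_y):
    """Sum of squared differences between two equally sized blocks."""
    d = block_x.astype(np.float64) - block_y.astype(np.float64)
    return float(np.sum(d * d))

def track_by_ssd(frame, model):
    """Sweep every position in the frame and keep the block with the lowest SSD."""
    fh, fw = frame.shape[:2]
    mh, mw = model.shape[:2]
    best_score, best_pos = None, None
    for y in range(fh - mh + 1):
        for x in range(fw - mw + 1):
            score = ssd(frame[y:y + mh, x:x + mw], model)
            if best_score is None or score < best_score:
                best_score, best_pos = score, (x, y)
    return best_pos, best_score

# Toy example: the model is cut out of the frame, so the best SSD is zero.
frame = np.arange(36, dtype=np.uint8).reshape(6, 6)
model = frame[2:4, 3:5].copy()
position, score = track_by_ssd(frame, model)
```

The conversion to floating point before subtracting matters for 8-bit images, where an unsigned subtraction would otherwise wrap around.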
3.5 Re-identification and Association
Following the stages of detection, localization, verification, and tracking comes the stage of re-identification and online construction of the database DB, which contains the history of each person that appeared in the field of view of the cameras. Figure 2 presents a detailed flowchart of this stage.
This stage deals with the moving objects obtained from the detection and tracking stages, which are called ‘found person’.
First, we calculate the intersection between the found persons resulting from detection and those resulting from tracking. The intersection \( (A\cap B) \) of two rectangles A and B is the rectangle that contains all elements of A that also belong to B.
Then we test whether each found person results from detection only, from tracking only, or from both. If a found person comes from the intersection or from tracking only, we update the database with the identifier of the tracked person.
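The temporal association can be sketched as follows, assuming rectangles are stored as (x, y, width, height) tuples and tracked persons are kept in a dictionary keyed by identifier. These data layouts, the function names and the rule that any non-empty overlap counts as an intersection are illustrative assumptions.

```python
def intersect(a, b):
    """Intersection of two rectangles (x, y, w, h); None if they do not overlap."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2 = min(a[0] + a[2], b[0] + b[2])
    y2 = min(a[1] + a[3], b[1] + b[3])
    if x2 <= x1 or y2 <= y1:
        return None
    return (x1, y1, x2 - x1, y2 - y1)

def classify_found_persons(detections, tracks):
    """Tag each found person as 'both', 'detection_only' or 'tracking_only'."""
    matched_ids = set()
    results = []
    for det in detections:
        hit = None
        for ident, trk in tracks.items():
            if intersect(det, trk) is not None:
                hit = ident
                break
        if hit is None:
            results.append(("detection_only", det, None))
        else:
            matched_ids.add(hit)
            results.append(("both", det, hit))
    # Tracked rectangles with no overlapping detection keep their identifier.
    for ident, trk in tracks.items():
        if ident not in matched_ids:
            results.append(("tracking_only", trk, ident))
    return results

detections = [(10, 10, 20, 40), (100, 10, 20, 40)]
tracks = {1: (15, 12, 20, 40), 2: (200, 50, 20, 40)}
overlap = intersect(detections[0], tracks[1])
found = classify_found_persons(detections, tracks)
```

Entries tagged `both` or `tracking_only` already carry an identifier, so only the `detection_only` entries need to go through the appearance comparison.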
On the other hand, if the found person comes from detection only, we calculate its histogram. An image histogram is a graphical representation of the tonal distribution in a digital image: it gives the number of pixels for each tonal value. The histogram of the found person is compared to the histograms of the identified persons stored in the database. If there is a match, we update the database by associating that person with the matched identifier; otherwise, we consider this person to be new and assign him or her a new identifier, which is added to the database.
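The match-or-create logic can be sketched as below. The comparison measure is not specified in the text, so this example assumes histogram intersection on normalized histograms as a stand-in; the number of bins, the matching threshold of 0.8 and the dictionary used as database are likewise hypothetical.

```python
import numpy as np

def normalized_histogram(pixels, bins=32):
    """Histogram of 8-bit intensity values, scaled to sum to 1."""
    hist, _ = np.histogram(pixels, bins=bins, range=(0, 256))
    total = hist.sum()
    return hist / total if total > 0 else hist.astype(np.float64)

def histogram_intersection(h1, h2):
    """Similarity in [0, 1] for normalized histograms (1 means identical)."""
    return float(np.minimum(h1, h2).sum())

def reidentify(hist, database, threshold=0.8):
    """Return the matched identifier, or register the person under a new one."""
    best_id, best_score = None, threshold
    for ident, stored in database.items():
        score = histogram_intersection(hist, stored)
        if score >= best_score:
            best_id, best_score = ident, score
    if best_id is None:
        # No stored histogram is similar enough: create a new identifier.
        best_id = max(database, default=0) + 1
        database[best_id] = hist
    return best_id

db = {}
id_a = reidentify(normalized_histogram(np.full(500, 40)), db)   # first person
id_b = reidentify(normalized_histogram(np.full(500, 200)), db)  # different look
id_c = reidentify(normalized_histogram(np.full(300, 42)), db)   # resembles the first
```

Normalizing the histograms makes the comparison independent of the size of the detected area, which changes as the person moves relative to the camera.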
4 Experimental Results
In this section, we present the hardware and the database used, the experimental results, and their discussion.
4.1 System Development Environment
The hardware used for the development of our application is:
1. A laptop with:
   - Processor: Intel Core i7-4702MQ CPU @ 2.20 GHz
   - RAM: 8.00 GB
   - Operating system: Windows 8.1, 64-bit
   - Hard drive: 1 TB
2. A digital video recorder (DVR).
3. A camera with the following characteristics:
   - 1/3 Sony HR CCD
   - 420 TV lines
   - 0.2 Lux
   - Adjustable focal length (3 mm to 8 mm)
To test our system, we built our own database, composed of image sequences recorded on the third floor of the Department of Electronics at USTO university. Three cameras, mounted at a height of 2.30 m and with an angle of −30°, were used to take these images. Each sequence contains one to three people walking in the fields of view of the three cameras. The cameras were placed as shown in the layout presented in Fig. 3.
To fulfil the condition of a non-overlapping camera network, the database was recorded so that, at any given instant, a person is in the field of view of only one camera. Figure 4 shows the fields of view of the three cameras.
4.2 Experimental Results and Discussion
In this section, we present and discuss the results of each step of the proposed system. The Mixture of Gaussians gives raw detection results for each camera, after suitable settings have been chosen according to criteria such as indoor or outdoor environment, the speed of people's movement, and lighting changes. Figure 5(b) presents an example of these results. To improve these raw results, we combine them with the results of the difference method (Fig. 5(c)), which detects the edges of moving objects. We then fill the holes in the resulting image to obtain better results, as illustrated in Fig. 5(d).
In Fig. 6, we present the localization and verification results. After localization by the labeling technique, we apply the verification procedure to each detected person. In Fig. 6(a), only the persons that satisfy the validation conditions are kept (the person in the green rectangle); the others, in red, are ignored. Figure 6(b) presents the detection results.
The tracking step runs in parallel with the detection step and is realized by the SSD. To accelerate its execution, we apply it only to a limited region of interest instead of searching the whole frame. This region is determined by the coordinates of the model to track. The tracking results obtained are satisfactory. Figure 7 shows an example: Fig. 7(a) is a detection in frame 159 and Fig. 7(b) is its tracking in frame 222.
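The region-of-interest restriction can be sketched by limiting the SSD sweep to a window around the last known position of the model. How the region is sized is not detailed here, so the `margin` parameter (in pixels) and the function name are assumptions made for illustration.

```python
import numpy as np

def track_in_roi(frame, model, prev_xy, margin=16):
    """SSD search limited to a window around the previous top-left position."""
    fh, fw = frame.shape[:2]
    mh, mw = model.shape[:2]
    px, py = prev_xy
    # Clamp the search window so that every candidate block fits in the frame.
    x_lo, x_hi = max(0, px - margin), min(fw - mw, px + margin)
    y_lo, y_hi = max(0, py - margin), min(fh - mh, py + margin)
    model_f = model.astype(np.float64)
    best_score, best_pos = None, None
    for y in range(y_lo, y_hi + 1):
        for x in range(x_lo, x_hi + 1):
            d = frame[y:y + mh, x:x + mw].astype(np.float64) - model_f
            score = float(np.sum(d * d))
            if best_score is None or score < best_score:
                best_score, best_pos = score, (x, y)
    return best_pos, best_score

# Toy example: the target moved by (+2, +1) pixels since the previous frame.
frame = np.arange(400).reshape(20, 20)
model = frame[6:10, 7:11].copy()
position, score = track_in_roi(frame, model, prev_xy=(5, 5), margin=3)
```

With a margin of m pixels, at most (2m + 1)² candidate blocks are evaluated instead of one per frame position, which is where the speed-up comes from. The trade-off is that a person who moves further than the margin between two frames is missed.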
The re-identification stage is realized using two techniques: the intersection of detection and tracking for temporal association, and histogram comparison. In Fig. 8, we present the multiplication of the detected person by its mask, which extracts the silhouette only. Then, we calculate the histograms of the red, green and blue channels and of the grayscale image, as shown in Fig. 9. We use the histogram of the silhouette to avoid the effect of the background. These histograms are compared with the models stored in the database. If there is a match, we associate the matched identifier with the current person; otherwise, we consider the current person to be new and add him or her to the database with a new identifier.
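The silhouette histograms can be sketched as follows. Instead of multiplying the image by the mask, this example selects the masked pixels directly, which has the same purpose of excluding the background but also keeps the zeroed pixels out of the histogram. The number of bins, the grayscale weights (the common 0.299/0.587/0.114 luma coefficients) and the function name are assumptions.

```python
import numpy as np

def silhouette_histograms(patch_rgb, mask, bins=32):
    """Normalized R, G, B and grayscale histograms of the masked pixels only."""
    mask = np.asarray(mask, dtype=bool)
    pixels = patch_rgb[mask]  # shape: (number of silhouette pixels, 3)
    gray = pixels @ np.array([0.299, 0.587, 0.114])
    channels = {
        "red": pixels[:, 0],
        "green": pixels[:, 1],
        "blue": pixels[:, 2],
        "gray": gray,
    }
    hists = {}
    for name, values in channels.items():
        hist, _ = np.histogram(values, bins=bins, range=(0, 256))
        hists[name] = hist / max(hist.sum(), 1)  # guard against an empty mask
    return hists

# Toy example: a red 2x2 silhouette on a white background.
patch = np.full((4, 4, 3), 255, dtype=np.uint8)
patch[1:3, 1:3] = (200, 0, 0)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
hists = silhouette_histograms(patch, mask)
```

In the toy example the white background does not contribute to any histogram, so the red channel has all its mass in the bin that contains the value 200.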
A sample of the constructed database is presented in Fig. 10. This database contains the history of every person that enters the field of view of the cameras.
5 Conclusion
In this paper, we presented the design and implementation of an appearance-based system for person re-identification in a camera network. This system aims to build an online database that contains the history of every person captured by the cameras.
The system assigns an identifier to each detected person and keeps it throughout the fields of view of the cameras, even if the person disappears and then reappears.
Our system implements an improved detection technique that combines the Mixture of Gaussians method and the difference method. The SSD algorithm with an acceleration strategy is used for the tracking step, whereas the re-identification stage is realized using two techniques: the intersection for temporal association and histogram comparison.
The complete system was tested on a real data set collected by three cameras. The experimental results are very satisfactory, with room for improvement in the re-identification stage by using local histograms instead of the global one. As future work, we also plan to evaluate our method quantitatively and compare it with other methods.
References
1. An, L., Kafai, M., Yang, S., Bhanu, B.: Reference-based person re-identification. In: 2013 10th IEEE International Conference on Advanced Video and Signal Based Surveillance, pp. 244–249. IEEE, August 2013
2. Bauml, M., Stiefelhagen, R.: Evaluation of local features for person re-identification in image sequences. In: 2011 8th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), pp. 291–296. IEEE, August 2011
3. Cai, Y., Huang, K., Tan, T.: Human appearance matching across multiple non-overlapping cameras. In: 19th International Conference on Pattern Recognition, pp. 1–4. IEEE, December 2008
4. Cheng, D.S., Cristani, M., Stoppa, M., Bazzani, L., Murino, V.: Custom pictorial structures for re-identification. In: Proceedings of the British Machine Vision Conference, pp. 68.1–68.11. British Machine Vision Association (2011)
5. de Oliveira, I.O., De Souza Pio, J.L.: People reidentification in a camera network. In: 2009 Eighth IEEE International Conference on Dependable, Autonomic and Secure Computing, pp. 461–466. IEEE, December 2009
6. Dikmen, M., Akbas, E., Huang, T.S., Ahuja, N.: Pedestrian recognition with a learned metric. In: Kimmel, R., Klette, R., Sugimoto, A. (eds.) ACCV 2010. LNCS, vol. 6495, pp. 501–512. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-19282-1_40
7. Farenzena, M., Bazzani, L., Perina, A., Murino, V., Cristani, M.: Person re-identification by symmetry-driven accumulation of local features. In: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 2360–2367. IEEE, June 2010
8. Gheissari, N., Sebastian, T.B., Hartley, R.: Person reidentification using spatiotemporal appearance. In: 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2 (CVPR 2006), vol. 2, pp. 1528–1535. IEEE (2006)
9. Hamdoun, O., Moutarde, F., Stanciulescu, B., Steux, B.: Person re-identification in multi-camera system by signature based on interest point descriptors collected on short video sequences. In: 2008 Second ACM/IEEE International Conference on Distributed Smart Cameras, pp. 1–6. IEEE, September 2008
10. Hirzer, M., Roth, P.M., Bischof, H.: Person re-identification by efficient impostor-based metric learning. In: 2012 IEEE Ninth International Conference on Advanced Video and Signal-Based Surveillance, pp. 203–208. IEEE, September 2012
11. Huang, C.-H., Wu, Y.-T., Shih, M.-Y.: Unsupervised pedestrian re-identification for loitering detection. In: Wada, T., Huang, F., Lin, S. (eds.) PSIVT 2009. LNCS, vol. 5414, pp. 771–783. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-540-92957-4_67
12. Khedher, M.I.: Ré-identification de personnes à partir des séquences vidéo. Dissertations, Institut National des Télécommunications, Paris (2014)
13. Ijiri, Y., Lao, S.: Human re-identification through distance metric learning based on jensen-shannon kernel. In: VISAPP: - International Conference on Computer Vision Theory and Applications, pp. 603–612 (2012)
14. Jungling, K., Arens, M.: View-invariant person re-identification with an implicit shape model. In: 2011 8th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), pp. 197–202. IEEE, August 2011
15. Park, U., Jain, A.K., Kitahara, I., Kogure, K., Hagita, N.: ViSE: visual search engine using multiple networked cameras. In: 18th International Conference on Pattern Recognition (ICPR 2006), pp. 1204–1207. IEEE (2006)
16. Schwartz, W.R., Davis, L.S.: Learning discriminative appearance-based models using partial least squares. In: 2009 XXII Brazilian Symposium on Computer Graphics and Image Processing, pp. 322–329. IEEE, October 2009
17. Gong, S., Cristani, M., Loy, C.C., Hospedales, T.M.: The re-identification challenge. In: Gong, S., Cristani, M., Yan, S., Loy, C.C. (eds.) Person Re-Identification. ACVPR, pp. 1–20. Springer, London (2014). https://doi.org/10.1007/978-1-4471-6296-4_1
18. Soille, P.: Morphological Image Analysis. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-662-05088-0
19. Souded, M.: People Detection, Tracking and Re-identification Through a Video Camera Network. Ph.D. thesis, Signal and Image processing, Institut National de Recherche en Informatique et en Automatique, Universite de Nice - Sophia Antipolis Ecole, Nice (2013)
20. Stauffer, C., Grimson, W.E.L.: Adaptive background mixture models for real-time tracking. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 246–252. IEEE Computer Society (1999)
21. Cong, D.-N.T., Achard, C., Khoudour, L.: People re-identification by classification of silhouettes based on sparse representation. In: 2010 2nd International Conference on Image Processing Theory, Tools and Applications, pp. 60–65. IEEE, July 2010
22. Vezzani, R., Baltieri, D., Cucchiara, R.: People reidentification in surveillance and forensics: a Survey. ACM Comput. Surv. 46(2), 1–37 (2013)
23. Wang, T., Gong, S., Zhu, X., Wang, S.: Person re-identification by video ranking. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8692, pp. 688–703. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10593-2_45
24. Wang, X., Doretto, G., Sebastian, T., Rittscher, J., Tu, P.: Shape and appearance context modeling. In: 2007 IEEE 11th International Conference on Computer Vision, pp. 1–8. IEEE (2007)
© 2018 IFIP International Federation for Information Processing
Cite this paper
Baquhaizel, A.S., Kholkhal, S., Alshaqaqi, B., Keche, M. (2018). SSD and Histogram for Person Re-identification System. In: Amine, A., Mouhoub, M., Ait Mohamed, O., Djebbar, B. (eds) Computational Intelligence and Its Applications. CIIA 2018. IFIP Advances in Information and Communication Technology, vol 522. Springer, Cham. https://doi.org/10.1007/978-3-319-89743-1_50
DOI: https://doi.org/10.1007/978-3-319-89743-1_50
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-89742-4
Online ISBN: 978-3-319-89743-1
eBook Packages: Computer Science (R0)