
A multitask joint framework for real-time person search

  • Regular Paper

Abstract

Person search generally involves three parts: person detection, feature extraction, and identity comparison. A person search system that integrates detection, extraction, and comparison, however, has two drawbacks. First, detection errors propagate and limit the accuracy of identity comparison. Second, real-time performance is difficult to achieve in real-world applications. To address these problems, we propose a multitask joint framework for real-time person search (MJF) that jointly optimizes person detection, feature extraction, and identity comparison. For the person detection module, we propose the YOLOv5-GS model, trained on a person dataset; it combines the advantages of GhostNet and the squeeze-and-excitation block to increase detection speed. For the feature extraction module, we design a model adaptation architecture that selects different networks according to the number of detected people, balancing accuracy against speed. For identity comparison, we propose a 3D pooled table and a matching strategy that improve identification accuracy. On 1920 × 1080 video with a 200-ID table, our method reaches an identification rate (IR) of 82.69% at 25.14 frames per second (FPS). The MJF therefore achieves real-time person search.
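The article preview does not include code, so the following is only a minimal sketch of the kind of "GS" building block the abstract describes for YOLOv5-GS: a GhostNet-style ghost convolution followed by a squeeze-and-excitation (SE) channel-attention block. The class names, channel split, SiLU activation, and reduction ratio are illustrative assumptions, not the authors' implementation.

```python
# Sketch (not the authors' code) of a ghost convolution + SE attention block,
# the two ingredients the abstract names for YOLOv5-GS.
import torch
import torch.nn as nn


class GhostConv(nn.Module):
    """Ghost module: half of the output channels come from an ordinary 1x1
    convolution, the other half from a cheap depthwise convolution on them."""

    def __init__(self, in_ch: int, out_ch: int, dw_kernel: int = 3):
        super().__init__()
        half = out_ch // 2  # assumes out_ch is even
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, half, 1, bias=False),
            nn.BatchNorm2d(half),
            nn.SiLU(),
        )
        self.cheap = nn.Sequential(
            nn.Conv2d(half, half, dw_kernel, padding=dw_kernel // 2,
                      groups=half, bias=False),
            nn.BatchNorm2d(half),
            nn.SiLU(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)


class SEBlock(nn.Module):
    """Squeeze-and-excitation: globally pool the feature map, learn per-channel
    weights through a small bottleneck, and rescale the channels."""

    def __init__(self, ch: int, reduction: int = 16):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(ch, ch // reduction, 1),
            nn.SiLU(),
            nn.Conv2d(ch // reduction, ch, 1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.gate(x)


class GhostSE(nn.Module):
    """Ghost convolution followed by SE attention: cheaper feature maps
    with learned channel re-weighting."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.conv = GhostConv(in_ch, out_ch)
        self.se = SEBlock(out_ch)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.se(self.conv(x))


if __name__ == "__main__":
    block = GhostSE(64, 128)
    print(block(torch.randn(1, 64, 80, 80)).shape)  # torch.Size([1, 128, 80, 80])
```

The ghost convolution reduces the cost of generating feature maps (which is what makes the detector faster), while the SE block adds channel attention at negligible cost; combining the two is consistent with the speed/accuracy trade-off the abstract claims.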

Funding

This work was supported in part by the Natural Science Foundation of Xinjiang Uygur Autonomous Region (no. 2022D01B05).

Author information

Corresponding authors

Correspondence to Xinzhong Wang, Guangqiang Yin or Zhiguo Wang.

Ethics declarations

Data availability

The data that support the findings of this study are available online. The datasets were derived from the following public-domain resources: COCO, CrowdHuman, Market-1501, and DukeMTMC.

Additional information

Communicated by C. Yan.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Li, Y., Yin, K., Liang, J. et al. A multitask joint framework for real-time person search. Multimedia Systems 29, 211–222 (2023). https://doi.org/10.1007/s00530-022-00982-y

