An Object Detection and Tracking Algorithm Combined with Semantic Information

  • Conference paper
Mobile Multimedia Communications (MobiMedia 2021)

Abstract

This paper proposes a novel algorithm for object detection and tracking combined with attribute recognition that can be used in embedded systems. The algorithm, which is based on a single-shot multi-box detector (SSD) and a kernelized correlation filter (KCF), can distinguish the tracked object from other, similar objects, thereby solving the problem of confusion between similar objects. The classification tasks in the multi-attribute recognition algorithm share a feature extraction module, which uses depthwise separable convolution and global pooling instead of standard convolution and fully connected (FC) layers, thereby improving the overall recognition accuracy and computational efficiency. Additionally, an attribute weight fine-tuning mechanism is added to improve the overall precision and ensure that each task is fully learned according to its degree of difficulty. Moreover, the algorithm reduces the size of the model without decreasing the accuracy, making it possible to run on an embedded device. Experiments performed on OTB-100 demonstrate that an accuracy of 82.85% is achieved, and the precision and F1 values reach 69.84% and 70.85%, respectively. The precision rate (PR) and success rate (SR) of the overall algorithm reach 85.34% and 80.88%, respectively, which are higher than those achieved by the SSD algorithm. At the same time, the size of the proposed attribute recognition algorithm is only 3.05 MB, about 1% of the size of other algorithms, indicating that the proposed algorithm not only improves the overall recognition accuracy but also effectively reduces the model size.
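
To give a rough sense of why replacing standard convolution and FC layers with depthwise separable convolution and global pooling shrinks a model, the following minimal sketch counts weights for each choice. The layer sizes (128 input channels, 256 output channels, 3 x 3 kernels, a 7 x 7 feature map, 10 attribute outputs) and all function names are hypothetical, chosen only for illustration; they are not taken from the paper, and the sketch does not reproduce the authors' network.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """One k x k depthwise filter per input channel, then a 1 x 1 pointwise convolution."""
    return k * k * c_in + c_in * c_out


def fc_head_params(height, width, channels, n_out):
    """Weights of an FC layer applied to a flattened height x width x channels feature map."""
    return height * width * channels * n_out


def global_pool_head_params(channels, n_out):
    """Weights of a classifier applied after global pooling (one value per channel)."""
    return channels * n_out


if __name__ == "__main__":
    # Hypothetical layer sizes, for illustration only.
    c_in, c_out, k = 128, 256, 3
    std = standard_conv_params(c_in, c_out, k)
    sep = depthwise_separable_params(c_in, c_out, k)
    print(f"standard conv:            {std} weights")
    print(f"depthwise separable conv: {sep} weights ({sep / std:.1%} of standard)")

    fc = fc_head_params(7, 7, 256, 10)
    gp = global_pool_head_params(256, 10)
    print(f"FC head on 7x7x256:       {fc} weights")
    print(f"global-pool head:         {gp} weights ({gp / fc:.1%} of FC)")
```

For a single layer, the ratio of depthwise separable to standard weights is 1/c_out + 1/k^2, so the saving grows with the number of output channels and the kernel size. The overall model size reported in the abstract depends on the full architecture, which this sketch does not model.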

This work is supported by the National Key Research and Development Program of China under Grant 2018AAA0102702; Natural Science Foundation of Heilongjiang Province (JJ2019LH2398); Fundamental Research Funds for the Central Universities (3072020CFT0801, 3072019CF0801 and 3072019CFM0802).


Corresponding author

Correspondence to Changbo Hou.

Copyright information

© 2021 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Ji, Q., Liu, H., Hou, C., Zhang, Q., Mo, H. (2021). An Object Detection and Tracking Algorithm Combined with Semantic Information. In: Xiong, J., Wu, S., Peng, C., Tian, Y. (eds) Mobile Multimedia Communications. MobiMedia 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 394. Springer, Cham. https://doi.org/10.1007/978-3-030-89814-4_62

  • DOI: https://doi.org/10.1007/978-3-030-89814-4_62

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-89813-7

  • Online ISBN: 978-3-030-89814-4

  • eBook Packages: Computer Science, Computer Science (R0)
