Abstract
This paper proposes a novel algorithm that combines object detection and tracking with attribute recognition and is suitable for embedded systems. The algorithm, which is based on a single-shot multi-box detector (SSD) and a kernelized correlation filter (KCF), can distinguish the tracked object from other, similar objects, thereby solving the problem of confusion between similar objects. The classification tasks in the multi-attribute recognition algorithm share a feature extraction module, which uses depthwise separable convolution and global pooling in place of standard convolution and fully connected (FC) layers, thereby improving overall recognition accuracy and computational efficiency. In addition, an attribute weight fine-tuning mechanism is added to improve overall precision and to ensure that each task is fully learned according to its degree of difficulty. The algorithm also reduces the size of the model without decreasing accuracy, which makes it feasible to run on an embedded device. Experiments on OTB-100 show that the algorithm achieves an accuracy of 82.85%, with precision and F1 values of 69.84% and 70.85%, respectively. The precision rate (PR) and success rate (SR) of the overall algorithm reach 85.34% and 80.88%, respectively, both higher than those achieved by the SSD algorithm. Moreover, the proposed attribute recognition algorithm is only 3.05 MB in size, about 1% of the size of other algorithms, indicating that it not only improves overall recognition accuracy but also effectively reduces model size.
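As a rough illustration of why replacing standard convolution and FC layers reduces model size, the following sketch counts weights for each layer type. It is not the authors' implementation: the function names and the layer sizes (128 input channels, 256 output channels, a 3 x 3 kernel, a 7 x 7 feature map, 10 outputs) are hypothetical values chosen only for this example, and biases are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a k x k standard convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def fc_head_params(c, h, w, n_out):
    """Weights in an FC layer applied to a flattened c x h x w feature map."""
    return c * h * w * n_out


def gap_head_params(c, n_out):
    """Weights in a linear classifier applied after global average pooling.

    Global pooling collapses each h x w channel to one value, so the
    classifier sees only c inputs regardless of spatial size.
    """
    return c * n_out


if __name__ == "__main__":
    # Hypothetical layer sizes, chosen for illustration only.
    c_in, c_out, k = 128, 256, 3
    h = w = 7
    n_out = 10

    std = standard_conv_params(c_in, c_out, k)
    sep = depthwise_separable_params(c_in, c_out, k)
    print(f"standard conv weights:            {std}")
    print(f"depthwise separable conv weights: {sep}")
    print(f"reduction factor:                 {std / sep:.2f}")

    fc = fc_head_params(c_out, h, w, n_out)
    gap = gap_head_params(c_out, n_out)
    print(f"FC head weights:                  {fc}")
    print(f"global pooling head weights:      {gap}")
```

For a k x k kernel the depthwise separable layer needs a fraction of roughly 1/c_out + 1/k² of the weights of the standard layer, so the saving grows with the number of output channels. The 3.05 MB figure reported above depends on the full architecture, which this sketch does not reproduce.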
This work was supported by the National Key Research and Development Program of China under Grant 2018AAA0102702, the Natural Science Foundation of Heilongjiang Province (JJ2019LH2398), and the Fundamental Research Funds for the Central Universities (3072020CFT0801, 3072019CF0801, and 3072019CFM0802).
Copyright information
© 2021 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Ji, Q., Liu, H., Hou, C., Zhang, Q., Mo, H. (2021). An Object Detection and Tracking Algorithm Combined with Semantic Information. In: Xiong, J., Wu, S., Peng, C., Tian, Y. (eds) Mobile Multimedia Communications. MobiMedia 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 394. Springer, Cham. https://doi.org/10.1007/978-3-030-89814-4_62
DOI: https://doi.org/10.1007/978-3-030-89814-4_62
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-89813-7
Online ISBN: 978-3-030-89814-4
eBook Packages: Computer Science, Computer Science (R0)