Visual Object Tracking Using Machine Learning

Odeh, Ammar; Keshta, Ismail; Al-Fayoumi, Mustafa

doi:10.1007/978-3-031-40398-9_4

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1809)

Included in the following conference series:

International Conference on Science, Engineering Management and Information Technology

Abstract

Visual object tracking has become a very active research area in recent years, and a growing number of tracking algorithms are proposed each year. Object detection and tracking are critical and challenging tasks in many computer vision applications, including automated video surveillance, traffic monitoring, autonomous robot navigation, and intelligent environments. Object tracking involves segmenting an object of interest and following its velocity, orientation, and occlusion across a video scene in order to extract useful information. Over the last two decades, several object tracking approaches have been developed with the aim of designing a robust tracker that copes with the practical obstacles of real-world operation. This paper reviews recent trends and advances in tracking and assesses the reliability of various trackers based on their feature extraction techniques. Visual tracking has a wide range of applications in video processing. When a target is identified in one video frame, it is frequently advantageous to track that object in subsequent frames. Each frame in which the target is successfully tracked yields more information about the target’s identity and activity. Because tracking is more straightforward than detection, tracking algorithms can require fewer computational resources than object detectors.

Acknowledgments

This research was supported by Princess Sumaya University for Technology (PSUT) and by the Researchers Supporting Program (TUMA-Project-2021-14), AlMaarefa University.

Author information

Authors and Affiliations

Department of Computer Science, King Hussein School of Computing Sciences, Princess Sumaya University for Technology, Amman, Jordan
Ammar Odeh & Mustafa Al-Fayoumi
Department of Computer Science, College of Applied Sciences, AlMaarefa University, Riyadh, Kingdom of Saudi Arabia
Ismail Keshta

Corresponding author

Correspondence to Ammar Odeh.

Editor information

Editors and Affiliations

Kharazmi University, Tehran, Iran
Abolfazl Mirzazadeh
Ankara Yıldırım Beyazıt University, Ankara, Türkiye
Babek Erdebilli
Istinye University, Istanbul, Türkiye
Erfan Babaee Tirkolaee
Poznań University of Technology, Poznań, Poland
Gerhard-Wilhelm Weber
Indian Institute of Technology Delhi, New Delhi, India
Arpan Kumar Kar

Ethics declarations

Conflict of Interest

The authors declare no conflict of interest.

Cite this paper

Odeh, A., Keshta, I., Al-Fayoumi, M. (2023). Visual Object Tracking Using Machine Learning. In: Mirzazadeh, A., Erdebilli, B., Babaee Tirkolaee, E., Weber, GW., Kar, A.K. (eds) Science, Engineering Management and Information Technology. SEMIT 2022. Communications in Computer and Information Science, vol 1809. Springer, Cham. https://doi.org/10.1007/978-3-031-40398-9_4

DOI: https://doi.org/10.1007/978-3-031-40398-9_4
Published: 21 August 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-40397-2
Online ISBN: 978-3-031-40398-9
eBook Packages: Computer Science, Computer Science (R0)
