OBB detector: occluded object detection based on geometric modeling of video frames

  • Original article
  • Published in: The Visual Computer

Abstract

Object detection is an important research area in video surveillance systems, aimed at identifying and locating target objects within recorded scenes. Many object detectors fail under partial occlusion, in which only some features of an object are visible because bounding boxes overlap. This can lead to miscounted objects and misaligned bounding boxes, resulting in localization loss. To address these problems, we propose a geometry-based axis-aligned bounding box method with occlusion prior conditions that estimates the location of overlapped bounding boxes from a single viewpoint. First, the proposed method finds the closest points of the detected bounding boxes by extracting geometric features, namely the width, height, and area of the detected objects. Second, an occlusion prior condition is used to detect partial occlusion and compute the overlapped area under different levels of occlusion: (i) 20–40% and (ii) 40–70%. The proposed method was tested on two benchmark datasets, Highway and PETS 2006, both containing outdoor video frames. The experimental results show that the proposed method detects objects that are approximately 65% occluded with 92.7% precision on the Highway dataset and 85.1% precision on the PETS 2006 dataset. In addition, the bounding box localization loss improved by 1.76% on the Highway dataset and 2% on the PETS 2006 dataset, because the method generates correctly aligned bounding boxes on the detected objects, especially under partial occlusion.
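The geometric idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It computes the width, height, and area of two axis-aligned boxes, their overlapped area, and the fraction of one box covered by the other, then bins that fraction into the two occlusion levels named above. The `Box` type, the function names, the choice to normalise by the occluded box's area, and the handling of the 40% boundary are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical helper type)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def width(self) -> float:
        return self.x_max - self.x_min

    @property
    def height(self) -> float:
        return self.y_max - self.y_min

    @property
    def area(self) -> float:
        return self.width * self.height


def overlap_area(a: Box, b: Box) -> float:
    """Area of the intersection rectangle, or 0.0 if the boxes do not overlap."""
    w = min(a.x_max, b.x_max) - max(a.x_min, b.x_min)
    h = min(a.y_max, b.y_max) - max(a.y_min, b.y_min)
    return w * h if w > 0 and h > 0 else 0.0


def occluded_fraction(occluded: Box, occluder: Box) -> float:
    """Share of `occluded` covered by `occluder` (assumed normalisation)."""
    if occluded.area <= 0:
        return 0.0
    return overlap_area(occluded, occluder) / occluded.area


def occlusion_level(fraction: float) -> Optional[str]:
    """Map a fraction to the abstract's two levels; None outside 20-70%."""
    if 0.20 <= fraction < 0.40:
        return "20-40%"
    if 0.40 <= fraction <= 0.70:
        return "40-70%"
    return None
```

For example, with `car = Box(0, 0, 100, 50)` and `truck = Box(50, 0, 200, 50)`, the overlapped area is 2500 square pixels, half of the car's area, so `occlusion_level(occluded_fraction(car, truck))` falls in the second level.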

Data and materials availability

The data are publicly available online at http://changedetection.net/.

Acknowledgements

Not applicable.

Funding

Not applicable.

Author information

Contributions

Author 1: wrote the manuscript. Author 2: reviewed the manuscript.

Corresponding author

Correspondence to Supriya Agrawal.

Ethics declarations

Conflict of interest

The authors declare that they have no competing interests.

Ethics approval

Not applicable.

Consent for publication

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

This section shows the input frame and the ground-truth frame for the Highway and PETS 2006 datasets. The red bounding box marks an occluded area.

[Figure e: input and ground-truth frames for the Highway and PETS 2006 datasets]

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Agrawal, S., Natu, P. OBB detector: occluded object detection based on geometric modeling of video frames. Vis Comput (2024). https://doi.org/10.1007/s00371-024-03374-7

  • DOI: https://doi.org/10.1007/s00371-024-03374-7
