UC-OWOD: Unknown-Classified Open World Object Detection

  • Conference paper
Computer Vision – ECCV 2022 (ECCV 2022)

Abstract

Open World Object Detection (OWOD) is a challenging computer vision problem that requires detecting unknown objects and incrementally learning the unknown classes once they are identified. However, existing OWOD formulations cannot separate unknown instances into multiple unknown classes. In this work, we propose a novel OWOD problem called Unknown-Classified Open World Object Detection (UC-OWOD), which aims to detect unknown instances and classify them into different unknown classes. We formulate the problem and devise a two-stage object detector to solve it. First, an unknown label-aware proposal and an unknown-discriminative classification head are used to detect known and unknown objects. Then, similarity-based unknown classification and unknown clustering refinement modules are constructed to distinguish multiple unknown classes. Moreover, two novel evaluation protocols are designed to evaluate unknown-class detection. Extensive experiments and visualizations demonstrate the effectiveness of the proposed method. Code is available at https://github.com/JohnWuzh/UC-OWOD.
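
To make the idea of classifying unknown instances into different unknown classes concrete, the following is a minimal sketch and not the authors' implementation. It assumes each detected unknown object is already represented by a feature vector, and it groups those vectors greedily by cosine similarity against running class prototypes. The function names (`cosine`, `assign_unknown_classes`), the threshold value, and the greedy prototype update are all hypothetical choices made for illustration; the paper's similarity-based classification and clustering refinement modules may work quite differently.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def assign_unknown_classes(features, threshold=0.8):
    """Greedily group unknown-object features into unknown classes.

    Hypothetical illustration: a feature joins the most similar existing
    prototype when the cosine similarity reaches `threshold`; otherwise it
    starts a new unknown class. Returns (labels, prototypes).
    """
    prototypes = []  # running mean feature of each unknown class
    counts = []      # number of features merged into each prototype
    labels = []
    for f in features:
        sims = [cosine(f, p) for p in prototypes]
        best = max(range(len(sims)), key=sims.__getitem__) if sims else -1
        if best >= 0 and sims[best] >= threshold:
            n = counts[best]
            # Update the running mean of the matched prototype.
            prototypes[best] = [
                (p * n + x) / (n + 1) for p, x in zip(prototypes[best], f)
            ]
            counts[best] = n + 1
            labels.append(best)
        else:
            prototypes.append(list(f))
            counts.append(1)
            labels.append(len(prototypes) - 1)
    return labels, prototypes


# Toy usage: two features point roughly along x, two roughly along y.
toy_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.95]]
toy_labels, toy_prototypes = assign_unknown_classes(toy_features)
print(toy_labels)
```

In this toy run the first two vectors fall into one unknown class and the last two into another. A greedy single pass is order-dependent, which is one reason a separate refinement step over the resulting clusters can be useful.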

Acknowledgements

This work was supported in part by the National Key Research and Development Program of China under Grant 2019YFB1310300 and in part by the National Natural Science Foundation of China under Grant 62022090.

Corresponding author

Correspondence to Zhengxing Wu.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Wu, Z., Lu, Y., Chen, X., Wu, Z., Kang, L., Yu, J. (2022). UC-OWOD: Unknown-Classified Open World Object Detection. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13670. Springer, Cham. https://doi.org/10.1007/978-3-031-20080-9_12

  • DOI: https://doi.org/10.1007/978-3-031-20080-9_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-20079-3

  • Online ISBN: 978-3-031-20080-9

  • eBook Packages: Computer Science, Computer Science (R0)
