CDTD: A Large-Scale Cross-Domain Benchmark for Instance-Level Image-to-Image Translation and Domain Adaptive Object Detection


Cross-domain visual problems, such as image-to-image translation and domain adaptive object detection, have attracted increasing attention in the last few years and have become new and challenging directions for the computer vision community. Despite the field's enormous data collection efforts, few datasets cover instance-level image-to-image translation and domain adaptive object detection simultaneously. In this work, we introduce CDTD, a large-scale cross-domain benchmark for the new instance-level translation and object detection tasks. It contains 155,529 high-resolution natural images across four different modalities with object bounding box annotations; a summary of the entire dataset is provided in the following sections. The dataset is available online. We provide comprehensive baseline results on the benchmark for both tasks. Moreover, we propose a novel instance-level image-to-image translation approach called INIT and a gradient detach method for domain adaptive object detection, both designed to exploit the dataset's instance-level annotations across different domains.

  1. CDTD is the abbreviation of "A Cross-Domain Benchmark for Translation and Detection tasks".

  2. For safety, we collected the rainy images after the rain, so this category looks more like overcast weather with wet roads.



  • Cross-domain benchmark
  • Instance-level image-to-image translation
  • Domain adaptive object detection