International Journal of Computer Vision - Call for Papers: Special Issue on Visual Datasets

Guest Editors

  • Xin Zhao, University of Science and Technology Beijing, China
  • Liang Zheng, Australian National University, Australia
  • Qiang Qiu, Purdue University, USA
  • Yin Li, University of Wisconsin-Madison, USA
  • Limin Wang, Nanjing University, China
  • José Lezama, Google Research, USA
  • Qiuhong Ke, Monash University, Australia
  • Yongchan Kwon, Columbia University, USA
  • Ruoxi Jia, Virginia Tech, USA
  • Jungong Han, University of Sheffield, UK

Data is the fuel of computer vision, on which state-of-the-art systems are built. A robust computer vision system needs not only a strong model architecture and learning algorithm but also a comprehensive, large-scale training set. Despite the pivotal importance of datasets, existing research in computer vision is usually algorithm-centric: given fixed training and test data, it is the algorithms or models that are targeted for improvement. As a result, while significant progress has been made in understanding and improving algorithms, the community has devoted much less effort to dataset-level analysis. For example, compared with the large number of algorithm-centric works in domain adaptation, quantitative understanding of the domain gap is far more limited. Likewise, there are currently few investigations into the representations of datasets, in contrast to the abundant literature on representing images or videos, the essential elements of datasets.

Dataset-centered research can bring much benefit. For example, if we can quantify the distribution difference in a more principled way, such as through end-to-end training, we will have a better idea of how datasets differ from each other and thus be able to design better domain adaptation algorithms. If we can learn to predict the level of labeling noise in a training set, we will be better positioned to design specific noise-resistant learning schemes. Moreover, by quantifying the difficulty of test datasets, it may eventually become possible to predict model performance under evolving environments and so ensure safe applications of AI. This will be made possible through the launch of the label-free model evaluation challenge.

Topics of Interest

This special issue invites original research articles focusing on Visual Datasets. Appropriate submissions include, but are not limited to, the following forms or a combination thereof:

  • Properties and attributes of vision datasets: The first and foremost problem is the definition of dataset-level properties. While computer vision typically studies image-level properties, such as image category, human identities, object bounding boxes and region semantics, the dataset-level counterparts are not extensively studied. Examples of such dataset properties include but are not limited to the level of database noise, the extent to which a dataset looks realistic, dataset diversity, bias and fairness, its quality as a training set, and its difficulty level as a test or validation set. Analogous to image-level properties, dataset properties will require dedicated methods to evaluate and will induce the problem of dataset representation learning and end-to-end training.
  • Application of dataset-level analysis: We find numerous application opportunities for dataset-level analysis. For example, understanding the quality of datasets allows us to better design dataset composition schemes, thus obtaining higher accuracy for given models. This creates interesting directions for the computer vision community, especially considering that datasets for existing tasks are usually fixed, and that dynamic dataset composition would foster new opportunities. Moreover, mining and understanding the content and label bias of datasets will give us a clearer picture of the generalization ability of models trained on such datasets, and subsequently allow us to make corresponding improvements to the datasets. For example, having automated dataset-level metrics could be very beneficial for active learning.
  • Representations of and similarities between vision datasets: While image representations, whether hand-crafted or deeply learned, have been widely studied, those of datasets are much less investigated. Work on dataset representations has largely focused on first- and second-order statistics, which is somewhat analogous to the hand-crafted features in computer vision. In the context of computer vision, it would be very interesting to explore how to extract relevant image characteristics and aggregate them into global set representations. For example, when analyzing label noise levels, it would be beneficial to separate image foregrounds and backgrounds when computing dataset features. A more exciting topic is to perform (semi-)end-to-end learning, where the entire dataset or selected parts of it are fed into neural networks so that task-oriented feature representations can be learned.
  • Improving vision dataset quality through generation and simulation: The community has seen interesting research that composes new training sets from synthetic data generated by simulation engines or generative adversarial nets (GANs), or from existing real data. These methods offer flexible and inexpensive solutions for scenarios where training data are expensive to collect or where corner cases occur.
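
As an illustration of the first- and second-order dataset statistics mentioned above, the following is a minimal sketch, not a method prescribed by this call. It summarizes each dataset by the per-dimension mean and variance of its feature vectors and compares two datasets with the Fréchet distance between Gaussians, simplified here by assuming diagonal covariances. The function names and the synthetic "features" are hypothetical; in practice the vectors would come from an image feature extractor.

```python
import random
from statistics import fmean, pvariance


def dataset_statistics(features):
    """Per-dimension mean and variance of a list of equal-length feature vectors."""
    columns = list(zip(*features))  # transpose: one tuple per feature dimension
    means = [fmean(col) for col in columns]
    variances = [pvariance(col) for col in columns]
    return means, variances


def diagonal_frechet_distance(feats_a, feats_b):
    """Squared Frechet distance between two Gaussians with diagonal covariance.

    Assumption: feature dimensions are treated as independent, so the
    covariance term reduces to a sum over per-dimension standard deviations.
    """
    means_a, vars_a = dataset_statistics(feats_a)
    means_b, vars_b = dataset_statistics(feats_b)
    mean_term = sum((ma - mb) ** 2 for ma, mb in zip(means_a, means_b))
    std_term = sum((va ** 0.5 - vb ** 0.5) ** 2 for va, vb in zip(vars_a, vars_b))
    return mean_term + std_term


def synthetic_features(rng, center, n=500, dim=8):
    """Hypothetical stand-in for features extracted from one dataset."""
    return [[rng.gauss(center, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
source = synthetic_features(rng, 0.0)
same_domain = synthetic_features(rng, 0.0)     # drawn from the same distribution
shifted_domain = synthetic_features(rng, 2.0)  # mean shifted in every dimension

gap_same = diagonal_frechet_distance(source, same_domain)
gap_shifted = diagonal_frechet_distance(source, shifted_domain)
print(f"same-distribution gap: {gap_same:.3f}")
print(f"shifted-distribution gap: {gap_shifted:.3f}")
```

Such a statistic is only a hand-crafted baseline: it ignores correlations between dimensions and any task-specific structure, which is part of what learned dataset representations would aim to capture.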

In summary, the questions related to the proposed Special Issue include but are not limited to:

  • Can vision datasets be analyzed on a large scale?
  • How to holistically understand the visual semantics contained in a dataset?
  • How to define vision-related properties and problems on the dataset level?
  • How can we improve algorithm design by better understanding vision datasets?
  • Can we predict the performance of an existing model in a new dataset?
  • What are good dataset representations? Can they be hand-crafted, learned through neural nets or a combination of both?
  • How do we measure similarities between datasets and their bias and fairness?
  • Can we improve training data quality through data engineering or simulation?
  • How to efficiently create labelled datasets under new environments?
  • How to create realistic datasets that serve our real-world application purpose?
  • How can we alleviate the need for large-scale labelled datasets in deep learning?
  • How to best analyze model performance under various environments without requiring access to ground-truth labels?
  • How to evaluate diffusion models and large language models?

Important Dates

  • Paper submission deadline: 30 September 2024
  • First review decision: 30 December 2024
  • Revision deadline: 15 March 2025
  • Final decision: 15 June 2025

Submission Guidelines

Please submit via IJCV Editorial Manager:

Choose SI: Visual Datasets from the Article Type dropdown.

  • Submitted papers should present original, unpublished work, relevant to one of the topics of the Special Issue.
  • All submitted papers will be evaluated on the basis of relevance, significance of contribution, technical quality, scholarship, and quality of presentation, by at least two independent reviewers. 
  • It is the policy of the journal that no submission, or substantially overlapping submission, be published or be under review at another journal or conference at any time during the review process. 
  • Manuscripts must conform to the author guidelines available on the IJCV website at:
  • Peer reviewing will follow the standard rigorous IJCV review process. Full length manuscripts are expected to follow IJCV guidelines outlined in the submission guidelines:

Author Resources

Authors are encouraged to submit high-quality, original work that has neither appeared in, nor is under consideration by, other journals. Springer provides a host of information about publishing in a Springer journal on our Journal Author Resources page, including FAQs and tutorials, along with Help and Support.