
A Tool for Building Multi-purpose and Multi-pose Synthetic Data Sets

  • Conference paper
  • VipIMAGE 2019

Abstract

Modern computer vision methods typically require expensive data acquisition and accurate manual labeling. In this work, we instead leverage recent progress in computer graphics to propose a novel approach for designing and generating large-scale, multi-purpose image data sets directly from 3D object models, captured from multiple categorized camera viewpoints under controlled environmental conditions. The set of rendered images provides data for geometric computer vision problems such as depth estimation, camera pose estimation, 3D box estimation, 3D reconstruction, and camera calibration, as well as pixel-perfect ground truth for scene understanding problems such as semantic and instance segmentation and object detection, to cite just a few. In this paper, we also survey the most well-known synthetic data sets used in computer vision tasks, pointing out the relevance of rendered images for training deep neural networks. Compared to similar tools, our generator offers a wide set of easily extensible features and can build image sets in the MSCOCO format, making them directly usable in deep learning pipelines. To the best of our knowledge, the proposed tool is the first to generate large-scale, multi-pose, synthetic data sets automatically, allowing for training and evaluation of supervised methods for all of the covered features.
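The core idea of the abstract — rendering a 3D model from categorized camera viewpoints and emitting MSCOCO-format annotations — can be sketched in plain Python. This is an illustrative sketch, not the authors' actual tool: the viewpoint binning scheme, the `sample_viewpoints` and `coco_skeleton` names, and the placeholder full-image bounding box are all assumptions; the real pipeline renders via Blender and projects the model's 3D bounding box to obtain pixel-perfect ground truth.

```python
import json
import math

def sample_viewpoints(n_azimuth=8, elevations_deg=(15, 45, 75), radius=3.0):
    """Sample categorized camera positions on a sphere around the object.

    Each position is labeled with its azimuth/elevation bin, mimicking the
    paper's notion of capturing a model from controlled, categorized
    camera viewpoints.
    """
    views = []
    for ei, elev in enumerate(elevations_deg):
        el = math.radians(elev)
        for ai in range(n_azimuth):
            azim = 2.0 * math.pi * ai / n_azimuth
            views.append({
                "position": (radius * math.cos(el) * math.cos(azim),
                             radius * math.cos(el) * math.sin(azim),
                             radius * math.sin(el)),
                "category": f"az{ai}_el{ei}",   # viewpoint bin label
            })
    return views

def coco_skeleton(views, width=640, height=480):
    """Build a minimal MSCOCO-style dataset dict (images + annotations)."""
    images, annotations = [], []
    for i, v in enumerate(views):
        images.append({"id": i, "file_name": f"render_{v['category']}.png",
                       "width": width, "height": height})
        # Placeholder bbox covering the whole frame: a real pipeline would
        # project the model's 3D bounding box into image coordinates.
        annotations.append({"id": i, "image_id": i, "category_id": 1,
                            "bbox": [0, 0, width, height], "iscrowd": 0})
    return {"images": images, "annotations": annotations,
            "categories": [{"id": 1, "name": "object"}]}

views = sample_viewpoints()
dataset = coco_skeleton(views)
print(len(views))  # 24 viewpoints (8 azimuths x 3 elevations)
print(json.dumps(dataset["images"][0]))
```

In a Blender-driven pipeline, each entry of `views` would position the scene camera before rendering, and the resulting JSON would be serialized alongside the rendered frames so the set can be consumed by standard COCO-format data loaders.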


Notes

  1. Free open-source 3D software: https://www.blender.org.


Author information

Correspondence to Marco Ruiz.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Ruiz, M., Fontinele, J., Perrone, R., Santos, M., Oliveira, L. (2019). A Tool for Building Multi-purpose and Multi-pose Synthetic Data Sets. In: Tavares, J., Natal Jorge, R. (eds) VipIMAGE 2019. VipIMAGE 2019. Lecture Notes in Computational Vision and Biomechanics, vol 34. Springer, Cham. https://doi.org/10.1007/978-3-030-32040-9_41
