Automatic “Ground Truth” Annotation and Industrial Workpiece Dataset Generation for Deep Learning

  • Research Article
  • Published: 2020
International Journal of Automation and Computing

Abstract

In industry, detecting and recognizing industrial workpieces with deep learning methods is becoming common practice. The scarcity of datasets is a major problem in this field, since collecting and annotating them is very labor intensive: researchers who generate a dataset themselves must also annotate it by hand. This is one of the factors that keeps current deep learning based methods from scaling well. At present, very few workpiece datasets exist for industrial applications, and the existing ones are generated from idealized computer aided design (CAD) models of workpieces, with few images of actual workpieces collected and utilized. We propose an automatic industrial workpiece dataset generation method and an automatic ground truth annotation method. Our methods include three algorithms: a point cloud based spatial plane segmentation algorithm that segments the workpieces in the real scene and obtains annotation information for the workpieces in images captured there; a random multiple workpiece generation algorithm that generates abundant composite dataset images with randomly rotated workpiece angles and positions; and a tangent vector based contour tracking and completion algorithm that yields improved contour images. With these procedures, annotation information is obtained automatically, and a JSON format file is generated upon completion of the annotation process. Faster R-CNN (faster region-based convolutional neural network), SSD (single shot multibox detector) and YOLO (you only look once) models are trained on the proposed datasets. The experimental results show the effectiveness and integrity of this dataset generation and annotation method.
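The page itself contains no code, but the first of the three algorithms lends itself to a short illustration. Below is a minimal sketch of plane-removal-based workpiece segmentation on a captured point cloud; it is not the authors' implementation. It assumes the Open3D library (version 0.10 or later), and the file name, RANSAC threshold, and DBSCAN parameters are illustrative placeholders.

```python
# Minimal sketch: isolate workpieces by removing the dominant plane
# (the workbench) from a captured point cloud, then cluster what
# remains. Open3D >= 0.10 assumed; "scene.pcd" and all thresholds
# are illustrative, not values from the paper.
import numpy as np
import open3d as o3d

pcd = o3d.io.read_point_cloud("scene.pcd")

# Fit the dominant spatial plane (the workbench surface) with RANSAC.
plane_model, inliers = pcd.segment_plane(distance_threshold=0.005,
                                         ransac_n=3,
                                         num_iterations=1000)

# Drop the plane points; the remainder belongs to the workpieces.
workpieces = pcd.select_by_index(inliers, invert=True)

# Separate individual workpieces with DBSCAN (label -1 marks noise).
labels = np.array(workpieces.cluster_dbscan(eps=0.01, min_points=50))

# One 3D bounding box per cluster; projecting these boxes through the
# camera intrinsics would give the 2D annotation boxes.
for k in range(labels.max() + 1):
    cluster = workpieces.select_by_index(np.where(labels == k)[0].tolist())
    box = cluster.get_axis_aligned_bounding_box()
    print(f"workpiece {k}: extent = {box.get_extent()}")
```

In this reading, the per-cluster boxes (or the cluster points projected back into the RGB image) would supply the bounding boxes and contours that end up in the generated JSON annotation file.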

References

  1. J. X. Xiao, K. A. Ehinger, J. Hays, A. Torralba, A. Oliva. SUN database: Exploring a large collection of scene categories. International Journal of Computer Vision, vol. 119, no. 1, pp. 3–22, 2016. DOI: https://doi.org/10.1007/s11263-014-0748-y.

  2. A. Torralba, R. Fergus, W. T. Freeman. 80 million tiny images: A large data set for nonparametric object and scene recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 11, pp. 1958–1970, 2008. DOI: https://doi.org/10.1109/TPAMI.2008.128.

  3. M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, A. Zisserman. The pascal visual object classes (VOC) challenge. International Journal of Computer Vision, vol. 88, no. 2, pp. 303–338, 2010. DOI: https://doi.org/10.1007/s11263-009-0275-4.

  4. J. Deng, W. Dong, R. Socher, L. J. Li, K. Li, L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Miami, USA, pp. 248–255, 2009. DOI: https://doi.org/10.1109/CVPR.2009.5206848.

  5. T. Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollar, C. L. Zitnick. Microsoft COCO: Common objects in context. In Proceedings of the 13th European Conference on Computer Vision, Springer, Zurich, Switzerland, pp. 740–755, 2014. DOI: https://doi.org/10.1007/978-3-319-10602-1_48.

  6. B. L. Zhou, A. Lapedriza, A. Khosla, A. Oliva, A. Torralba. Places: A 10 million image database for scene recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 6, pp. 1452–1464, 2018. DOI: https://doi.org/10.1109/tpami.2017.2723009.

  7. I. Krasin, T. Duerig, N. Alldrin, V. Ferrari, S. Abu-El-Haija, A. Kuznetsova, H. Rom, J. Uijlings, S. Popov, S. Kamali, M. Malloci, J. Pont-Tuset, A. Veit, S. Belongie, V. Gomes, A. Gupta, C. Sun, G. Chechik, D. Cai, Z. Feng, D. Narayanan, K. Murphy. OpenImages: A public dataset for large-scale multi-label and multi-class image classification, [Online], Available: https://storage.googleapis.com/openimages/web/index.html, October 6, 2019.

  8. J. Tremblay, T. To, A. Molchanov, S. Tyree, J. Kautz, S. Birchfield. Synthetically trained neural networks for learning human-readable plans from real-world demonstrations. In Proceedings of IEEE International Conference on Robotics and Automation, IEEE, Brisbane, Australia, pp. 5659–5666, 2018. DOI: https://doi.org/10.1109/ICRA.2018.8460642.

  9. J. Tremblay, T. To, S. Birchfield. Falling things: A synthetic dataset for 3D object detection and pose estimation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, IEEE, Salt Lake City, USA, pp. 2119–21193, 2018. DOI: https://doi.org/10.1109/cvprw.2018.00275.

  10. B. Calli, A. Singh, A. Walsman, S. Srinivasa, P. Abbeel, A. M. Dollar. The YCB object and model set: Towards common benchmarks for manipulation research. In Proceedings of International Conference on Advanced Robotics, IEEE, Istanbul, Turkey, pp. 510–517, 2015. DOI: https://doi.org/10.1109/ICAR.2015.7251504.

  11. M. Arsenovic, S. Sladojevic, A. Anderla, D. Stefanovic, B. Lalic. Deep learning powered automated tool for generating image based datasets. In Proceedings of the 14th IEEE International Scientific Conference on Informatics, IEEE, Poprad, Slovakia, pp. 13–17, 2017. DOI: https://doi.org/10.1109/informatics.2017.8327214.

  12. J. Sun, P. Wang, Y. K. Luo, G. M. Hao, H. Qiao. Precision work-piece detection and measurement combining top-down and bottom-up saliency. International Journal of Automation and Computing, vol. 15, no. 4, pp. 417–430, 2018. DOI: https://doi.org/10.1007/s11633-018-1123-1.

  13. N. Poolsawad, L. Moore, C. Kambhampati, J. G. F. Cleland. Issues in the mining of heart failure datasets. International Journal of Automation and Computing, vol. 11, no. 2, pp. 162–179, 2014. DOI: https://doi.org/10.1007/s11633-014-0778-5.

  14. X. Y. Gong, H. Su, D. Xu, Z. T. Zhang, F. Shen, H. B. Yang. An overview of contour detection approaches. International Journal of Automation and Computing, vol. 15, no. 6, pp. 656–672, 2018. DOI: https://doi.org/10.1007/s11633-018-1117-z.

  15. A. Aldoma, T. Fäulhammer, M. Vincze. Automation of “ground truth” annotation for multi-view RGB-D object instance recognition datasets. In Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems, IEEE, Chicago, USA, pp. 5016–5023, 2014. DOI: https://doi.org/10.1109/IROS.2014.6943275.

  16. K. Lai, L. F. Bo, X. F. Ren, D. Fox. A large-scale hierarchical multi-view RGB-D object dataset. In Proceedings of IEEE International Conference on Robotics and Automation, IEEE, Shanghai, China, pp. 1817–1824, 2011. DOI: https://doi.org/10.1109/icra.2011.5980382.

  17. M. Di Cicco, C. Potena, G. Grisetti, A. Pretto. Automatic model based dataset generation for fast and accurate crop and weeds detection. In Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems, IEEE, Vancouver, Canada, pp. 5188–5195, 2017. DOI: https://doi.org/10.1109/IROS.2017.8206408.

  18. S. Greuter, J. Parker, N. Stewart, G. Leach. Real-time procedural generation of ‘pseudo infinite’ cities. In Proceedings of the 1st International Conference on Computer Graphics and Interactive Techniques in Australasia and South East Asia, ACM, Melbourne, Australia, pp. 87–94, 2003. DOI: https://doi.org/10.1145/604487.604490.

  19. R. Van Der Linden, R. Lopes, R. Bidarra. Procedural generation of dungeons. IEEE Transactions on Computational Intelligence and AI in Games, vol. 6, no. 1, pp. 78–89, 2013. DOI: https://doi.org/10.1109/tciaig.2013.2290371.

  20. S. R. Richter, V. Vineet, S. Roth, V. Koltun. Playing for data: Ground truth from computer games. In Proceedings of 14th European Conference on Computer Vision, Springer, Amsterdam, The Netherlands, pp. 102–118, 2016. DOI: https://doi.org/10.1007/978-3-319-46475-6_7.

  21. P. Marion, P. R. Florence, L. Manuelli, R. Tedrake. LabelFusion: A pipeline for generating ground truth labels for real RGBD data of cluttered scenes. In Proceedings of IEEE International Conference on Robotics and Automation, IEEE, Brisbane, Australia, pp. 3235–3242, 2018. DOI: https://doi.org/10.1109/icra.2018.8460950.

  22. T. Hodan, P. Haluza, S. Obdrzalek, J. Matas, M. Lourakis, X. Zabulis. T-LESS: An RGB-D dataset for 6D pose estimation of texture-less objects. In Proceedings of IEEE Winter Conference on Applications of Computer Vision, IEEE, Santa Rosa, USA, pp. 880–888, 2017. DOI: https://doi.org/10.1109/WACV.2017.103.

  23. H. Hattori, V. Naresh Boddeti, K. Kitani, T. Kanade. Learning scene-specific pedestrian detectors without real data. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Boston, USA, pp. 3819–3827, 2015. DOI: https://doi.org/10.1109/CVPR.2015.7299006.

  24. H. S. Koppula, A. Anand, T. Joachims, A. Saxena. Semantic labeling of 3D point clouds for indoor scenes. In Proceedings of the 24th International Conference on Neural Information Processing Systems, ACM, Red Hook, USA, pp. 244–252, 2011.

  25. J. Xie, M. Kiefel, M. T. Sun, A. Geiger. Semantic instance annotation of street scenes by 3D to 2D label transfer. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 3688–3697, 2016. DOI: https://doi.org/10.1109/CVPR.2016.401.

  26. B. Zoph, E. D. Cubuk, G. Ghiasi, T. Y. Lin, J. Shlens, Q. V. Le. Learning data augmentation strategies for object detection. ArXiv preprint ArXiv: 1906.11172, 2019.

  27. A. Dutta, A. Zisserman. The VIA annotation software for images, audio and video. ArXiv preprint ArXiv: 1904.10699, 2019.

  28. L. Von Ahn, L. Dabbish. Labeling images with a computer game. In Proceedings of SIGCHI Conference on Human Factors in Computing Systems, ACM, New York, USA, pp. 319–326, 2004. DOI: https://doi.org/10.1145/985692.985733.

  29. C. H. Zhang, K. Loken, Z. Y. Chen, Z. Y. Xiao, G. Kunkel. Mask Editor: An image annotation tool for image segmentation tasks. ArXiv preprint ArXiv: 1809.06461v1, 2018.

  30. B. C. Russell, A. Torralba, K. P. Murphy, W. T. Freeman. LabelMe: A database and web-based tool for image annotation. International Journal of Computer Vision, vol. 77, no. 1–3, pp. 157–173, 2008. DOI: https://doi.org/10.1007/s11263-007-0090-8.

  31. M. Johnson-Roberson, C. Barto, R. Mehta, S. N. Sridhar, K. Rosaen, R. Vasudevan. Driving in the matrix: Can virtual worlds replace human-generated annotations for real world tasks? In Proceedings of IEEE International Conference on Robotics and Automation, IEEE, Singapore, pp. 746–753, 2017. DOI: https://doi.org/10.1109/icra.2017.7989092.

  32. B. T. Phong. Illumination for computer generated pictures. Communications of the ACM, vol. 18, no. 6, pp. 311–317, 1975. DOI: https://doi.org/10.1145/360825.360839.

  33. S. Q. Ren, K. M. He, R. Girshick, J. Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. In Proceedings of the 28th International Conference on Neural Information Processing Systems, ACM, Cambridge, USA, pp. 91–99, 2015.

  34. W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C. Y. Fu, A. C. Berg. SSD: Single shot multibox detector. In Proceedings of the 14th European Conference on Computer Vision, Springer, Amsterdam, The Netherlands, pp. 21–37, 2016. DOI: https://doi.org/10.1007/978-3-319-46448-0_2.

  35. J. Redmon, S. Divvala, R. Girshick, A. Farhadi. You only look once: Unified, real-time object detection. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 779–788, 2016. DOI: https://doi.org/10.1109/CVPR.2016.91.

  36. F. Q. Liu, Z. Y. Wang. PolishNet-2d and PolishNet-3d: Deep learning-based workpiece recognition. IEEE Access, vol. 7, pp. 127042–127054, 2019. DOI: https://doi.org/10.1109/ACCESS.2019.2940411.

Author information

Corresponding author

Correspondence to Fu-Qiang Liu.

Additional information

Fu-Qiang Liu received the M.Sc. degree in computer technology from Harbin Engineering University, China in 2013. He is currently a Ph.D. candidate in control science and engineering at the College of Automation, Harbin Engineering University, China.

His research interests include computer vision, deep learning, artificial intelligence, deep neural networks, simultaneous localization and mapping (SLAM), and robotics.

Zong-Yi Wang received the Ph.D. degree in control theory and control engineering from Harbin Engineering University, China in 2005. He is a professor at the College of Automation, Harbin Engineering University, China. He won the first prize of the Heilongjiang Provincial Scientific and Technological Progress Award in 2004.

His research interests include computer vision, robotics, and intelligent welding and cutting.

About this article

Cite this article

Liu, FQ., Wang, ZY. Automatic “Ground Truth” Annotation and Industrial Workpiece Dataset Generation for Deep Learning. Int. J. Autom. Comput. 17, 539–550 (2020). https://doi.org/10.1007/s11633-020-1221-8
