Using Meta-Learning to Reduce the Effort of Training New Workpiece Geometries for Entanglement Detection in Bin-Picking Applications

Moosmann, Marius; Bleifuß, Julian; Rosport, Johannes; Spenrath, Felix; Kraus, Werner; Bormann, Richard; Huber, Marco F.

doi:10.1007/978-3-031-27933-1_14

Part of the book series: ARENA2036

Included in the following conference series:

Stuttgart Conference on Automotive Production

Abstract

In this paper, we introduce a scaling method for the training of neural networks used for entanglement detection in Bin-Picking. In the Bin-Picking process of complex-shaped and chaotically stored objects, entangled workpieces are a common source of problems. It has been shown that deep neural networks trained with supervised learning can detect entangled workpieces. However, this strategy requires time-consuming data generation and an additional training process when adapting to previously unseen geometries. To solve this problem, we analyze and compare several Meta-Learning techniques, namely Reptile, MAML and TAMS, for their feasibility as a scaling method for the entanglement detection. These methods search for a strongly generalized model for entanglement detection by learning from the training process of various workpieces with different geometries. Using this generalized model as initialization increases the learning success within only a few training epochs and significantly reduces the required amount of data and therefore the setup effort.

1 Introduction

In the Bin-Picking process, entangled workpieces are a common cause of incorrect handling. To increase the robustness of grasping in Bin-Picking, the application can be extended with an entanglement detection and, furthermore, with separation methods [1,2,3,4,5]. It has been shown that such an entanglement detection can be realized with neural networks in a model-based approach [6]. However, when applied to new workpiece geometries, this supervised learning approach requires considerable effort.

The current supervised entanglement detection [6] uses a deep convolutional neural network. Its architecture is inspired by DenseNet [7], and it is trained on grayscale depth maps of potentially entangled situations.

The depth maps are generated in a simulation and later transferred to reality using Sim-to-Real methods. To conduct the Sim-to-Real transfer, we use CycleGAN [22] as a domain adaptation method together with several domain randomization parameters, for example Gaussian noise on the input images. As the approach is model-based, the simulation needs the geometric information of the workpiece. Each workpiece therefore requires its own data generation and training for the entanglement detection. To obtain a high-performance entanglement detector, the training of the neural network requires up to 20,000 depth maps as training inputs. The training process and the data generation together take 46 h on standard hardware. In summary, the current approach leaves room to reduce the effort of adapting the entanglement detection to new workpieces.
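The Gaussian-noise randomization mentioned above can be illustrated with a minimal sketch. It assumes depth maps stored as NumPy arrays scaled to [0, 1]; the function names and the noise range `sigma_range` are hypothetical and are not taken from the paper.

```python
import numpy as np

def add_gaussian_noise(depth_map, sigma, rng):
    """Return a noisy copy of a grayscale depth map with values in [0, 1]."""
    noise = rng.normal(loc=0.0, scale=sigma, size=depth_map.shape)
    return np.clip(depth_map + noise, 0.0, 1.0)

def randomize(depth_map, rng, sigma_range=(0.0, 0.05)):
    """Draw a fresh noise level per image (assumed range, for illustration only)."""
    sigma = rng.uniform(*sigma_range)
    return add_gaussian_noise(depth_map, sigma, rng)

rng = np.random.default_rng(0)
depth = np.full((64, 64), 0.5)   # placeholder for a simulated depth map
noisy = randomize(depth, rng)
```

Drawing a new noise level for every image, rather than fixing one, is what makes this a randomization parameter: the network sees a range of sensor noise levels during training.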

Meta-Learning has shown great success in accelerating the adaptation of neural networks and in creating strong classification models from only a few data samples. For this reason, we investigated different Meta-Learning methods for their suitability to reduce the effort of training new workpiece geometries for the entanglement detection.

In summary, the main contributions of this paper are:

a base dataset for Meta-Learning based entanglement detection
the comparison of the Meta-Learning methods applied to the entanglement detection
the validation of the practical feasibility of reducing the effort of adapting the entanglement detection to new workpiece geometries with Meta-Learning

2 Meta-Learning

Meta-Learning enables machine learning models to use experience gained from related tasks [8]. It transfers previously learned knowledge of the training process and enables a neural network to learn a new task faster and better. Meta-Learning is a learning process on two levels: an inner level that adapts to a single task and an outer level that learns across tasks. The general procedure depends on the Meta-Learning method in use and will be explained in the course of this work. Meta-Learning is used to realise powerful classification models with only a small amount of training data. In this way, several Meta-Learning methods achieve high performance in few-shot image classification [9,10,11,12,13,14] or object detection [15,16,17].

As interest in Meta-Learning grows, a variety of methods has emerged, which can be divided into gradient-based and metric-based algorithms, among others [18]. For the entanglement detection, we perform experiments with the gradient-based algorithms Reptile [10] and the more complex MAML [9]. MAML achieves great success in generating a task-agnostic network that can adapt to new tasks in a few gradient steps. To this end, MAML uses second-order derivatives to compute the meta-gradient. The Reptile algorithm simplifies this method by using only first-order information and is able to meta-learn successfully with fewer classes that are sufficiently populated [19]. As a metric-based algorithm we test TAMS [20], which is based on prototypical networks [11] and designed for medium-shot applications. Since the entanglement detection resembles medium-shot classification, with fewer available classes but sufficient shots, we chose Reptile and TAMS for the experiments in addition to MAML.
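To make the Reptile idea concrete, the following sketch applies the update rule from [10], θ ← θ + ε(φ − θ), where φ denotes the weights after a few SGD steps on one sampled task. It is a toy illustration only: the task family (1-D lines), the linear model and all hyperparameters are assumptions chosen for readability and have nothing to do with the depth-map network used in this work.

```python
import numpy as np

def sgd_steps(theta, x, y, lr, steps):
    """Inner loop: a few plain gradient steps on one task (mean squared error)."""
    phi = theta.copy()
    for _ in range(steps):
        err = phi[0] * x + phi[1] - y
        grad = np.array([np.mean(2 * err * x), np.mean(2 * err)])
        phi -= lr * grad
    return phi

def reptile(theta, sample_task, meta_epochs,
            inner_lr=0.05, inner_steps=5, meta_lr=0.1):
    """Outer loop: move the initialization toward the task-adapted weights."""
    for _ in range(meta_epochs):
        x, y = sample_task()
        phi = sgd_steps(theta, x, y, inner_lr, inner_steps)
        theta = theta + meta_lr * (phi - theta)   # Reptile update, first-order only
    return theta

rng = np.random.default_rng(0)

def sample_task():
    """Toy task family (assumed): lines with slope near 2 and intercept near -1."""
    slope = 2.0 + 0.1 * rng.normal()
    intercept = -1.0 + 0.1 * rng.normal()
    x = rng.uniform(-1.0, 1.0, size=20)
    return x, slope * x + intercept

theta = reptile(np.zeros(2), sample_task, meta_epochs=2000)
```

No gradient is propagated through the inner loop, which is the simplification relative to MAML: the outer update only needs the difference between the adapted and the initial weights.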

3 Meta-Learning Applied to the Entanglement Detection

This section presents a brief overview of the base dataset used for the Meta-Learning based entanglement detection. Furthermore, it introduces the investigated Meta-Learning methods as applied to the entanglement detection.

3.1 Base Dataset

Successful Meta-Learning requires a base dataset of source tasks closely related to the later target task. In this case, the target task is the classification between entangled and non-entangled workpieces of an unknown geometry. Therefore, the base dataset consists of 54 different workpieces with various geometries. Each workpiece provides entanglement detection as a classification task with 200 synthetic depth maps, half of them showing entangled workpieces.
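The size of the base dataset follows directly from these numbers. The sketch below builds a hypothetical index of (workpiece, sample, label) records to make the arithmetic explicit; the record layout and the ordering of entangled and non-entangled samples are assumptions for illustration.

```python
from collections import Counter

NUM_WORKPIECES = 54        # workpiece geometries in the base dataset
MAPS_PER_WORKPIECE = 200   # synthetic depth maps per workpiece

def build_index(num_workpieces=NUM_WORKPIECES,
                maps_per_workpiece=MAPS_PER_WORKPIECE):
    """Return (workpiece_id, sample_id, entangled) records for the base dataset."""
    index = []
    for wp in range(num_workpieces):
        for i in range(maps_per_workpiece):
            entangled = i < maps_per_workpiece // 2   # half of the maps are entangled
            index.append((wp, i, entangled))
    return index

index = build_index()
label_counts = Counter(entangled for _, _, entangled in index)
print(len(index), label_counts[True], label_counts[False])  # 10800 5400 5400
```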

To validate the Meta-Learning implementation, the Omniglot [21] dataset was used. This dataset consists of images of handwritten characters and is similar to the depth maps in that both have only one channel. Even though the two image datasets differ in their structure, we observed a benefit for Meta-Learning from pretraining the models with Omniglot. The Omniglot dataset was therefore used both to pretrain the models and to verify the functionality of the Meta-Learning methods.

3.2 Implementation to the Entanglement Detection

The Meta-Learning based entanglement detection is formulated as a K-shot N-way classification, where K is the number of training samples per class and N the number of classes to be distinguished.

For each meta-batch, N/2 workpiece geometries are sampled randomly from the base dataset, since each geometry contributes two classes (entangled and non-entangled). Based on experimental experience, both gradient-based algorithms use a 5-shot 6-way classification. One meta-batch therefore represents the simultaneous entanglement detection of three different workpiece geometries. This scheme is sketched in Fig. 1. For TAMS, a 5-shot 20-way classification was selected as the best hyperparameter setting for the entanglement detection application. After each Meta-Learning epoch, the K-shot N-way classification is repeated with unknown geometries to evaluate the Meta-Learning model without updating it.
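A meta-batch of this kind could be sampled as in the following sketch. It assumes that every geometry contributes exactly two classes, entangled and non-entangled, which is consistent with a 6-way task covering three geometries. The dataset layout, the key names and the label encoding are hypothetical.

```python
import random

def sample_episode(dataset, n_way=6, k_shot=5, rng=random):
    """Sample one K-shot N-way meta-batch.

    dataset maps a geometry name to {"entangled": [...], "free": [...]}.
    Each geometry contributes two classes, so n_way // 2 geometries are drawn.
    """
    if n_way % 2:
        raise ValueError("n_way must be even: two classes per geometry")
    geometries = rng.sample(sorted(dataset), n_way // 2)
    episode = []
    for offset, geometry in enumerate(geometries):
        for state_id, state in enumerate(("entangled", "free")):
            label = 2 * offset + state_id   # one label per (geometry, state) pair
            for depth_map in rng.sample(dataset[geometry][state], k_shot):
                episode.append((depth_map, label))
    return episode

# Placeholder dataset: 46 training geometries with 100 file names per state.
toy = {
    f"wp{g:02d}": {s: [f"wp{g:02d}_{s}_{i}" for i in range(100)]
                   for s in ("entangled", "free")}
    for g in range(46)
}
episode = sample_episode(toy, rng=random.Random(0))
```

With the default arguments, one episode holds 6 classes with 5 shots each, i.e. 30 labelled depth maps drawn from three geometries.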

The Meta-Learning results in a strongly generalised network for the entanglement detection of multiple workpiece geometries. In order to finetune it to the unknown workpiece geometry with transfer learning later, the classification layer of the generalised network is modified to a binary classifier.
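The text does not specify how the classification layer is modified. One common option, shown here purely as an assumption, is to keep the meta-learned feature extractor and attach a freshly initialized two-class layer. The parameter names (`backbone_w`, `head_w`, `head_b`) and the dictionary layout are hypothetical.

```python
import numpy as np

def replace_head(params, num_classes, rng):
    """Copy the meta-learned backbone and attach a newly initialized classifier."""
    feature_dim = params["head_w"].shape[0]
    new_params = {name: value.copy()
                  for name, value in params.items()
                  if not name.startswith("head_")}
    # Assumed initialization: small random weights and zero bias.
    new_params["head_w"] = rng.normal(0.0, 0.01, size=(feature_dim, num_classes))
    new_params["head_b"] = np.zeros(num_classes)
    return new_params

rng = np.random.default_rng(0)
meta_params = {
    "backbone_w": np.ones((8, 4)),   # placeholder for the feature extractor
    "head_w": np.ones((4, 6)),       # 6-way head used during Meta-Learning
    "head_b": np.zeros(6),
}
binary_params = replace_head(meta_params, num_classes=2, rng=rng)
```

The meta-learned parameters are copied rather than modified in place, so the same generalised model can be reused as the starting point for further workpiece geometries.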

3.3 Training of the Meta-Learning Methods

The training of the Meta-Learning applied to the entanglement detection is monitored using a subset of workpiece geometries that was separated from the base dataset beforehand. We use a split of 46 workpiece geometries for training and eight workpiece geometries for validating the adaptability. Figure 2 shows example training plots from the Reptile and MAML Meta-Learning. During training, the validation accuracy is sometimes higher than the training accuracy. We attribute this behavior to the quality of the training and validation data. The training data was generated some time ago with an outdated physics simulation, while the validation data comes from a revised version. Specifically, the data acquisition with a virtual depth image sensor and the physical interaction of the components in the bin have been improved through optimizations in the simulation.

The Reptile Meta-Learning converges within 5,000 Meta-Learning epochs, which takes about 4.5 h. The MAML Meta-Learning needs about 24 h for 8,000 epochs and then starts overfitting. We attribute this to the small number of source tasks in the base dataset. With a larger base dataset containing hundreds of different workpiece geometries, the performance of MAML is expected to improve. The TAMS algorithm also suffers from the small number of classes and does not make any significant progress in the Meta-Learning. The structure of the base dataset, with sufficient labeled data points per class but only few training tasks, fits the Reptile algorithm best [19].
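As a rough worked comparison derived only from the durations and epoch counts stated above (which are themselves approximate), the average time per Meta-Learning epoch is:

```latex
\begin{align*}
t_{\text{Reptile}} &\approx \frac{4.5 \cdot 3600\ \text{s}}{5000} = 3.24\ \text{s per epoch},\\
t_{\text{MAML}}    &\approx \frac{24 \cdot 3600\ \text{s}}{8000} = 10.8\ \text{s per epoch},\\
\frac{t_{\text{MAML}}}{t_{\text{Reptile}}} &\approx 3.3 .
\end{align*}
```

Under these figures, one MAML epoch costs roughly three times as much as one Reptile epoch, in addition to MAML running for more epochs.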

4 Results

4.1 Comparison of Applied Meta-Learning Methods

To compare the Meta-models generated by Reptile, MAML and TAMS, we use three new workpiece geometries that were not used in the Meta-Learning process. We also add a model with randomly initialized network parameters to the comparison, which has to learn the entanglement detection from scratch. We test the adaptation of the four models as a function of the amount of training data. While the amount of training data varies, the 2,500 depth maps for testing remain the same for each workpiece. We repeat each adaptation training eight times with different dropout rates for regularisation and record the best performance for each training data amount. Figure 3 shows the results for the three chosen workpiece geometries.
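The selection step of this protocol, keeping the best of the repeated runs for every training data amount, can be sketched as follows. The tuple layout is an assumption, and the accuracy values in the example are made-up placeholders, not results from this work.

```python
def best_per_data_amount(runs):
    """Keep the best run for each training data amount.

    runs: iterable of (num_train_samples, dropout_rate, test_accuracy).
    Returns {num_train_samples: (dropout_rate, test_accuracy)}.
    """
    best = {}
    for n_train, dropout, acc in runs:
        if n_train not in best or acc > best[n_train][1]:
            best[n_train] = (dropout, acc)
    return dict(sorted(best.items()))

# Placeholder values for illustration only.
runs = [
    (500, 0.1, 0.71), (500, 0.3, 0.74), (500, 0.5, 0.69),
    (2500, 0.1, 0.88), (2500, 0.3, 0.86), (2500, 0.5, 0.90),
]
summary = best_per_data_amount(runs)
```

Reporting the maximum over repeated runs shows what each initialization can reach with tuned regularisation; it does not show the spread between runs.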

Comparing the workpiece geometries with each other, it is noticeable that the entanglement detection of the connecting rods is easier to learn than that of the u-bolts, which results in higher test accuracies. The four models therefore do not differ much in performance after the adaptation to the connecting rods. For the double hook and the u-bolt, the Reptile model clearly outperforms the other Meta-models and the random model for nearly every training data amount. The biggest benefit of the Meta-Learning can be observed in the adaptation to the double hooks with 2,500 depth maps.

4.2 Performance Validation of the Meta-trained Entanglement Detection on Unseen Workpiece Geometries

Based on the model comparison, we choose the Reptile algorithm as the Meta-Learning method to reduce the effort of training new workpiece geometries for the entanglement detection. This method is validated once more with the metal holder shown in Fig. 1d, a workpiece that is of interest for industrial applications at a customer site. In doing so, we record the direct gain of Reptile as a method to adapt to new workpieces with less effort, referred to as the scaling method in the following.

To this end, the Reptile model and a model without prior knowledge from Meta-Learning are adapted with the same depth maps in a training with identical parameters. Figure 4 shows the progression of test accuracy and test loss during training with a training dataset consisting of 2,500 (green) data samples for the Reptile model and 2,500 (blue) and 5,000 (purple) data samples for the random model. The test dataset consists of the same 5,000 depth maps in all cases.

One can observe that the Reptile model immediately starts adapting to the new workpiece and reaches a high classification performance in significantly fewer training epochs than the model without Meta-Learning. If the training data is doubled to 5,000 instances, the random model can reach a performance similar to that of the Meta-Learning model, but only after a significantly larger number of epochs. In all cases, the loss of the adaptation training indicates that the model starts overfitting after it has converged. The test accuracy, however, remains stable, and Reptile outperforms the current state of the entanglement detection by 100 epochs and 2,500 training samples.

5 Summary and Outlook

In this paper we introduced a scaling method to reduce the effort of training new workpiece geometries for entanglement detection in Bin-Picking. We compared three state-of-the-art Meta-Learning methods with regard to their usability for the entanglement detection. The chosen scaling method, based on the Reptile algorithm, reduces the number of training epochs, and therefore the training time, by about 80 percent and halves the amount of training data, which also halves the simulation time.

The scaling method enables the entanglement detection to respond faster to new workpiece geometries in the Bin-Picking application. However, to achieve an entanglement detection with a higher performance than with Meta-Learning, more training data can be used to train a workpiece-specific machine learning model. In conclusion, the Meta-Learning method helps to react quickly to customer requests, and a more accurate entanglement detection model can be supplied later as an update.

To further improve the scaling method in future work, it is of interest how the Meta-Learning improves as the base dataset grows with new workpiece geometries.

References

1. Moosmann, M., et al.: Using deep neural networks to separate entangled workpieces in random bin picking. In: SCAP Stuttgart Conference on Automotive Production (2020)
2. Moosmann, M., et al.: Separating entangled workpieces in random bin picking using deep reinforcement learning. In: CIRP Conference on Manufacturing Systems (2021)
3. Matsumura, R., Domae, Y., Wan, W., Harada, K.: Learning based robotic bin-picking for potentially tangled objects. In: Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (2019)
4. Leão, G., Costa, C.M., Sousa, A., Veiga, G.: Perception of entangled tubes for automated bin picking. In: Iberian Robotics Conference, pp. 619–631 (2019)
5. Leão, G., Costa, C.M., Sousa, A., Veiga, G.: Detecting and solving tube entanglement in bin picking operations. In: Applied Sciences (2020)
6. Moosmann, M., et al.: Increasing the robustness of random bin picking by avoiding grasps of entangled workpieces. In: CIRP Conference on Manufacturing Systems, Chicago (2020)
7. Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708 (2017)
8. Hospedales, T., Antoniou, A., Micaelli, P., Storkey, A.: Meta-learning in neural networks: a survey. In: IEEE Transactions on Pattern Analysis and Machine Intelligence (2021)
9. Finn, C., Abbeel, P., Levine, S.: Model-agnostic meta-learning for fast adaptation of deep networks. In: International Conference on Machine Learning, pp. 1126–1135. PMLR (2017)
10. Nichol, A., Achiam, J., Schulman, J.: On first-order meta-learning algorithms. arXiv preprint arXiv:1803.02999 (2018)
11. Snell, J., Swersky, K., Zemel, R.: Prototypical networks for few-shot learning. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
12. Raghu, A., Raghu, M., Bengio, S., Vinyals, O.: Rapid learning or feature reuse? Towards understanding the effectiveness of MAML. In: International Conference on Learning Representations (2020)
13. Ravichandran, A., Bhotika, R., Soatto, S.: Few-shot learning with embedded class models and shot-free meta training. In: International Conference on Computer Vision (2019)
14. Sun, Q., Liu, Y., Chua, T., Schiele, B.: Meta-transfer learning for few-shot learning. In: Conference on Computer Vision and Pattern Recognition (2019)
15. Wu, X., Sahoo, D., Hoi, S.: Meta-RCNN: meta learning for few-shot object detection. In: ACM International Conference on Multimedia, pp. 1679–1687 (2020)
16. Kang, B., Liu, Z., Wang, X., Yu, F., Feng, J., Darrell, T.: Few-shot object detection via feature reweighting. In: IEEE International Conference on Computer Vision, pp. 8420–8429 (2019)
17. Wang, Y., Ramanan, D., Hebert, M.: Meta-learning to detect rare objects. In: IEEE International Conference on Computer Vision, pp. 9925–9934 (2019)
18. Chen, W.Y., et al.: A closer look at few-shot classification. In: International Conference on Learning Representations (2019)
19. Al-Shedivat, M., Li, L., Xing, E., Talwalkar, A.: On data efficiency of meta-learning. In: International Conference on Artificial Intelligence and Statistics, pp. 1369–1377. PMLR (2021)
20. Jiang, X., Ding, L., Havaei, M., Jesson, A., Matwin, S.: Task adaptive metric space for medium-shot medical image classification. In: Shen, D., et al. (eds.) MICCAI 2019. LNCS, vol. 11764, pp. 147–155. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-32239-7_17
21. Lake, B.M., Salakhutdinov, R., Tenenbaum, J.B.: Human-level concept learning through probabilistic program induction. Science 350(6266), 1332–1338 (2015)
22. Zhu, J.-Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-consistent adversarial networks. In: IEEE International Conference on Computer Vision ICCV (2017)

Acknowledgement

This work was partially supported by the German Federal Ministry of Education and Research (Deep Picking - Grant No. 01IS20005C) and the Ministry of Economic Affairs of the state Baden-Württemberg (Center for Cognitive Robotics - Grant No. 017-180069 and Center for Cyber Cognitive Intelligence (CCI) - Grant No. 017-192996).

Author information

Authors and Affiliations

Department Robot and Assistive Systems, Fraunhofer Institute for Manufacturing Engineering and Automation IPA, Nobelstraße 12, 70569, Stuttgart, Germany
Marius Moosmann, Julian Bleifuß, Johannes Rosport, Felix Spenrath, Werner Kraus & Richard Bormann
Department Cyber Cognitive Intelligence (CCI), Fraunhofer IPA and Institute of Industrial Manufacturing and Management IFF, University of Stuttgart, Stuttgart, Germany
Marco F. Huber

Corresponding author

Correspondence to Marius Moosmann.

Editor information

Editors and Affiliations

ARENA2036 e.V., Stuttgart, Germany
Niklas Kiefl
ARENA2036 e.V., Stuttgart, Germany
Frederik Wulle
ARENA2036 e.V., Stuttgart, Germany
Clemens Ackermann
ARENA2036 e.V., Stuttgart, Germany
Daniel Holder

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

About this paper

Cite this paper

Moosmann, M. et al. (2023). Using Meta-Learning to Reduce the Effort of Training New Workpiece Geometries for Entanglement Detection in Bin-Picking Applications. In: Kiefl, N., Wulle, F., Ackermann, C., Holder, D. (eds) Advances in Automotive Production Technology – Towards Software-Defined Manufacturing and Resilient Supply Chains. SCAP 2022. ARENA2036. Springer, Cham. https://doi.org/10.1007/978-3-031-27933-1_14

DOI: https://doi.org/10.1007/978-3-031-27933-1_14
Published: 05 June 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-27932-4
Online ISBN: 978-3-031-27933-1
eBook Packages: Engineering, Engineering (R0)
