Robot End-Effector Mounted Camera Pose Optimization in Object Detection-Based Tasks

  • Regular paper
  • Journal of Intelligent & Robotic Systems

Abstract

Robots equipped with a vision system at the end-effector are a powerful combination in industrial contexts, enabling a wide range of manufacturing tasks such as inspection. While many works are dedicated to machine vision algorithms, the optimization of the vision system pose is rarely addressed, even though optimizing the sensor pose can increase object detection performance while avoiding occlusions and collisions in the real working scene. The development of an approach capable of optimizing the pose of a vision system is therefore the main objective of this paper. A complete pipeline for such optimization is proposed, composed of the following main components: working scene reconstruction, robot-environment collision modeling, object detection, sensor pose optimization (exploiting Bayesian Optimization, a state-of-the-art methodology), and collision-free robot motion planning. To validate the proposed approach, experimental tests have been executed on two object detection-based tasks, employing a Franka Emika Panda robot equipped with an Intel RealSense D400 at its end-effector as the robotic platform. The results show a high-fidelity reconstruction of the real working environment for offline optimization (i.e., performed in simulation), as well as the capability of the Bayesian Optimization-based approach to define the sensor pose. The proposed methodology has been compared with a grid-point approach, showing improved performance for camera pose optimization. An additional experiment demonstrates the possibility of exploiting a digital twin of the working scene (if available) instead of the environment reconstruction, reducing the computational resources required and avoiding measurement noise in the 3D reconstruction. The obtained results show the feasibility of the proposed pipeline when such a digital twin is employed.
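To make the sensor pose optimization step concrete, the minimal sketch below runs Bayesian Optimization over a spherical viewpoint parameterization (azimuth, elevation, stand-off distance) around the target part. It uses scikit-optimize's Gaussian-process optimizer as a stand-in for the optimizer employed in the paper, and detection_score is only a placeholder for the detection fitness evaluated in the reconstructed scene; all names, bounds, and the toy score are illustrative assumptions, not the authors' released implementation.

```python
# Hedged sketch: Bayesian Optimization of a camera viewpoint.
# All names, bounds, and the toy objective are illustrative assumptions.
import numpy as np
from skopt import gp_minimize
from skopt.space import Real

def detection_score(azimuth, elevation, radius):
    """Placeholder for the detection fitness evaluated at this camera pose
    in the reconstructed scene (higher is better)."""
    target = np.array([0.8, 0.9, 0.6])          # hypothetical best viewpoint parameters
    x = np.array([azimuth, elevation, radius])
    return float(np.exp(-np.sum((x - target) ** 2)))

def objective(params):
    azimuth, elevation, radius = params
    # gp_minimize minimizes, so negate the score we want to maximize.
    return -detection_score(azimuth, elevation, radius)

search_space = [
    Real(0.0, 2.0 * np.pi, name="azimuth"),     # rotation around the part [rad]
    Real(0.1, 0.5 * np.pi, name="elevation"),   # camera elevation angle [rad]
    Real(0.3, 0.9, name="radius"),              # stand-off distance [m]
]

result = gp_minimize(objective, search_space, n_calls=40,
                     n_initial_points=10, random_state=0)
print("Best viewpoint (azimuth, elevation, radius):", result.x)
print("Best detection score:", -result.fun)
```

In the real pipeline, each evaluation would render or acquire the view at the candidate pose, run the object detector, and return its score, with infeasible (colliding or unreachable) poses penalized.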

Data Availability Statement

Open source code is available at https://github.com/LorisR/BO_best_view, providing all the developed components explained in the paper.

References

  1. Lasi, H., Fettke, P., Kemper, H.-G., Feld, T., Hoffmann, M.: Industry 4.0. Business & Information Systems Engineering 6(4), 239–242 (2014)

  2. Roveda, L., Magni, M., Cantoni, M., Piga, D., Bucca, G.: Human-robot collaboration in sensorless assembly task learning enhanced by uncertainties adaptation via bayesian optimization. Robot. Auton. Syst., pp 103711 (2020)

  3. Roveda, L., Maskani, J., Franceschi, P., Abdi, A., Braghin, F., Tosatti, L.M., Pedrocchi, N.: Model-based reinforcement learning variable impedance control for human-robot collaboration. J. Intell. Robot. Syst., pp 1–17 (2020b)

  4. Pérez, L., Rodríguez, Í., Rodríguez, N., Usamentiaga, R., García, D.F.: Robot guidance using machine vision techniques in industrial environments: A comparative review. Sensors 16(3), 335 (2016)

  5. Vozel, K.: The details of vision guided robotics. Quality, pp 38–40 (2020)

  6. Shamsfakhr, F., Bigham, B.S.: Gsr: geometrical scan registration algorithm for robust and fast robot pose estimation. Assembly Automation (2020)

  7. Nerakae, P., Uangpairoj, P., Chamniprasart, K.: Using machine vision for flexible automatic assembly system. Procedia Computer Science 96, 428–435 (2016)

  8. Roveda, L., Castaman, N., Ghidoni, S., Franceschi, P., Boscolo, N., Pagello, E., Pedrocchi, N.: Human-robot cooperative interaction control for the installation of heavy and bulky components. In: 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 339–344. IEEE (2018)

  9. Balatti, P., Kanoulas, D., Tsagarakis, N., Ajoudani, A.: A method for autonomous robotic manipulation through exploratory interactions with uncertain environments. Autonomous Robots 44(8), 1395–1410 (2020)

  10. Zhihong, C., Hebin, Z., Yanbo, W., Binyan, L., Yu, L.: A vision-based robotic grasping system using deep learning for garbage sorting. In: 2017 36th Chinese Control Conference (CCC), pp. 11223–11226. IEEE (2017)

  11. Frank, D., Chhor, J., Schmitt, R.: Stereo-vision for autonomous industrial inspection robots. In: 2017 IEEE International Conference on Robotics and Biomimetics (ROBIO), pp. 2555–2561. IEEE (2017)

  12. Militaru, C., Mezei, A.-D., Tamas, L.: Object handling in cluttered indoor environment with a mobile manipulator. In: 2016 IEEE International Conference on Automation, Quality and Testing, Robotics (AQTR), pp. 1–6. IEEE, (2016)

  13. Kragic, D.: Free space of rigid objects: Caging, path non-existence, and narrow passage detection. In: Algorithmic Foundations of Robotics XIII: Proceedings of the 13th Workshop on the Algorithmic Foundations of Robotics, vol. 14, p. 19. Springer Nature (2020)

  14. Nair, D., Pakdaman, A., Plöger, P.G.: Performance evaluation of low-cost machine vision cameras for image-based grasp verification. arXiv:2003.10167 (2020)

  15. Cheng, S., Leng, Z., Cubuk, E.D., Zoph, B., Bai, C., Ngiam, J., Song, Y., Caine, B., Vasudevan, V., Li, C., et al.: Improving 3d object detection through progressive population based augmentation. In: European Conference on Computer Vision, pp. 279–294. Springer (2020)

  16. Pi, Y., Nath, N.D., Behzadan, A.H.: Convolutional neural networks for object detection in aerial imagery for disaster response and recovery. Advanced Engineering Informatics 43, 101009 (2020)

  17. Arnold, E., Al-Jarrah, O.Y., Dianati, M., Fallah, S., Oxtoby, D., Mouzakitis, A.: A survey on 3d object detection methods for autonomous driving applications. IEEE Transactions on Intelligent Transportation Systems 20(10), 3782–3795 (2019)

  18. Zou, Z., Shi, Z., Guo, Y., Ye, J.: Object detection in 20 years: A survey. arXiv:1905.05055 (2019)

  19. Du, G., Wang, K., Lian, S.: Vision-based robotic grasping from object localization, pose estimation, grasp detection to motion planning: A review. arXiv:1905.06658 (2019)

  20. Chen, J., Zhang, L., Liu, Y., Xu, C.: Survey on 6d pose estimation of rigid object. In: 2020 39th Chinese Control Conference (CCC), pp. 7440–7445. IEEE (2020)

  21. Roveda, L., Ghidoni, S., Cotecchia, S., Pagello, E., Pedrocchi, N.: Eureca h2020 cleansky 2: a multi-robot framework to enhance the fourth industrial revolution in the aerospace industry. In: Robotics and Automation (ICRA), 2017 IEEE Int Conf on, Workshop on Industry of the Future: Collaborative, Connected, Cognitive. Novel approaches stemming from Factory of the Future and Industry 4.0 initiatives (2017)

  22. Vicentini, F., Pedrocchi, N., Beschi, M., Giussani, M., Iannacci, N., Magnoni, P., Pellegrinelli, S., Roveda, L., Villagrossi, E., Askarpour, M., et al.: Piros: Cooperative, safe and reconfigurable robotic companion for cnc pallets load/unload stations. In: Bringing Innovative Robotic Technologies from Research Labs to Industrial End-users, pp. 57–96. Springer (2020)

  23. Ercan, A.O., Yang, D.B., El Gamal, A., Guibas, L.J.: Optimal placement and selection of camera network nodes for target localization. In: International Conference on Distributed Computing in Sensor Systems, pp. 389–404. Springer (2006)

  24. Olague, G., Mohr, R.: Optimal camera placement for accurate reconstruction. Pattern Recognition 35(4), 927–944 (2002)

  25. Chen, S.Y., Li, Y.F.: Automatic sensor placement for model-based robot vision. IEEE Transactions on Systems Man and Cybernetics Part B (Cybernetics) 34(1), 393–408 (2004)

  26. Dunn, E., Olague, G.: Pareto optimal camera placement for automated visual inspection. In: 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 3821–3826. IEEE (2005)

  27. McGreavy, C., Kunze, L., Hawes, N.: Next best view planning for object recognition in mobile robotics. CEUR Workshop Proceedings (2017)

  28. Iversen, T.M., Kraft, D.: Optimizing sensor placement: A mixture model framework using stable poses and sparsely precomputed pose uncertainty predictions. In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 6652–6659. IEEE (2018)

  29. Mosbach, D., Gospodnetić, P., Rauhut, M., Hamann, B., Hagen, H.: Feature-driven viewpoint placement for model-based surface inspection. Machine Vision and Applications 32(1), 1–21 (2020)

  30. Ajoudani, A., Zanchettin, A.M., Ivaldi, S., Albu-Schäffer, A., Kosuge, K., Khatib, O.: Progress and prospects of the human-robot collaboration. Autonomous Robots 42(5), 957–975 (2018)

  31. Pelikan, M., Goldberg, D.E., Cantú-Paz, E., et al.: Boa: The bayesian optimization algorithm. In: Proceedings of the Genetic and Evolutionary Computation Conference GECCO-99, vol. 1, pp. 525–532. Citeseer (1999)

  32. Brochu, E., Cora, V.M., De Freitas, N.: A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning. arXiv:1012.2599 (2010)

  33. Letham, B., Karrer, B., Ottoni, G., Bakshy, E., et al.: Constrained bayesian optimization with noisy experiments. Bayesian Analysis 14(2), 495–519 (2019)

  34. Schleicher, T., Bullinger, A.C.: Assistive robots in highly flexible automotive manufacturing processes. In: Congress of the International Ergonomics Association, pp. 203–215. Springer (2018)

  35. Ciszak, O.: Industry 4.0–industrial robots. In: MMS 2018: 3rd EAI International Conference on Management of Manufacturing Systems, pp. 52. European Alliance for Innovation (2018)

  36. Cully, A., Clune, J., Tarapore, D., Mouret, J.-B.: Robots that can adapt like animals. Nature 521(7553), 503 (2015)

  37. Drieß, D., Englert, P., Toussaint, M.: Constrained bayesian optimization of combined interaction force/task space controllers for manipulations. In: 2017 IEEE International Conference on Robotics and Automation (ICRA), pp. 902–907. IEEE (2017)

  38. Yuan, K., Chatzinikolaidis, I., Li, Z.: Bayesian optimization for whole-body control of high degrees of freedom robots through reduction of dimensionality. IEEE Robot. Autom. Lett. (2019)

  39. Rozo, L.: Interactive trajectory adaptation through force-guided bayesian optimization. arXiv:1908.07263 (2019)

  40. Roveda, L., Forgione, M., Piga, D.: Robot control parameters auto-tuning in trajectory tracking applications. Control Engineering Practice 101, 104488 (2020)

  41. Roveda, L., Castaman, N., Franceschi, P., Ghidoni, S., Pedrocchi, N.: A control framework definition to overcome position/interaction dynamics uncertainties in force-controlled tasks. In: 2020 IEEE International Conference on Robotics and Automation (ICRA), pp. 6819–6825. IEEE (2020d)

  42. Hornung, A., Wurm, K.M., Bennewitz, M., Stachniss, C., Burgard, W.: Octomap: An efficient probabilistic 3d mapping framework based on octrees. Autonomous Robots 34(3), 189–206 (2013)

  43. Hodan, T., Michel, F., Brachmann, E., Kehl, W., GlentBuch, A., Kraft, D., Drost, B., Vidal, J., Ihrke, S., Zabulis, X., et al.: Bop: Benchmark for 6d object pose estimation. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 19–34 (2018)

  44. IFL-CAMP: easy_handeye. https://github.com/IFL-CAMP/easy_handeye. Last visited: January 2021

  45. Wang, J., Olson, E.: Apriltag 2: Efficient and robust fiducial detection. In: 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 4193–4198. IEEE (2016)

  46. Drost, B., Ulrich, M., Navab, N., Ilic, S.: Model globally, match locally: Efficient and robust 3d object recognition. In: 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE (2010)

  47. Hinterstoisser, S., Lepetit, V., Rajkumar, N., Konolige, K.: Going further with point pair features. In: European Conference on Computer Vision (ECCV). Springer (2016)

  48. Maroni, M., Praolini, L.: Best view methodology enhanced by bayesian optimization for robotic motion planning in quality inspection tasks. Master’s thesis, Politecnico di Milano (2020)

  49. Mazzuchelli, L.: Robotized quality inspection approach enhanced by bayesian optimization through point cloud based sensors. Master’s thesis, Politecnico di Milano (2020)

  50. Chitta, S., Sucan, I., Cousins, S.: MoveIt! [ROS Topics]. IEEE Robotics & Automation Magazine 19(1), 18–19 (2012)

  51. Cully, A., Chatzilygeroudis, K., Allocati, F., Mouret, J.-B.: Limbo: A fast and flexible library for bayesian optimization. arXiv:1611.07343 (2016)

  52. Singh, A., Sha, J., Narayan, K.S., Achim, T., Abbeel, P.: Bigbird: A large-scale 3d database of object instances. In: 2014 IEEE International Conference on Robotics and Automation (ICRA), pp. 509–516. IEEE (2014)

  53. Xiang, Y., Kim, W., Chen, W., Ji, J., Choy, C., Su, H., Mottaghi, R., Guibas, L., Savarese, S.: Objectnet3d: A large scale database for 3d object recognition. In: European Conference on Computer Vision, pp. 160–176. Springer (2016)

  54. Barbu, A., Mayo, D., Alverio, J., Luo, W., Wang, C., Gutfreund, D., Tenenbaum, J., Katz, B.: Objectnet: A large-scale bias-controlled dataset for pushing the limits of object recognition models. In: Advances in Neural Information Processing Systems, pp. 9453–9463 (2019)

  55. Su, H., Qi, C.R., Li, Y., Guibas, L.J.: Render for cnn: Viewpoint estimation in images using cnns trained with rendered 3d model views. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2686–2694 (2015)

  56. Peng, X., Sun, B., Ali, K., Saenko, K.: Learning deep object detectors from 3d models. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1278–1286 (2015)

  57. Movshovitz-Attias, Y., Kanade, T., Sheikh, Y.: How useful is photo-realistic rendering for visual learning? In: European Conference on Computer Vision, pp. 202–217. Springer (2016)

  58. Mitash, C., Bekris, K.E., Boularias, A.: A self-supervised learning system for object detection using physics simulation and multi-view pose estimation. In: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 545–551. IEEE (2017)

  59. Tushar, J., Sardana, H.K., et al.: Mechanical cad parts recognition for industrial automation. In: Smart Computing and Informatics, pp. 341–349. Springer (2018)

  60. Ben Abdallah, H., Jovančević, I., Orteu, J.-J., Brèthes, L.: Automatic inspection of aeronautical mechanical assemblies by matching the 3d cad model and real 2d images. Journal of Imaging 5(10), 81 (2019)

  61. Song, K.-T., Wu, C.-H., Jiang, S.-Y.: Cad-based pose estimation design for random bin picking using a rgb-d camera. Journal of Intelligent & Robotic Systems 87(3–4), 455–470 (2017)

  62. Murphy, K., Torralba, A., Eaton, D., Freeman, W.: Object detection and localization using local and global features. In: Toward Category-Level Object Recognition, pp. 382–400. Springer (2006)

  63. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision 60(2), 91–110 (2004)

  64. Czajewski, W., Kołomyjec, K.: 3d object detection and recognition for robotic grasping based on rgb-d images and global features. Foundations of Computing and Decision Sciences 42(3), 219–237 (2017)

  65. Sukanya, C.M., Gokul, R., Paul, V.: A survey on object recognition methods. International Journal of Science, Engineering and Computer Technology 6(1), 48 (2016)

  66. Castellani, U., Cristani, M., Fantoni, S., Murino, V.: Sparse points matching by combining 3d mesh saliency with statistical descriptors. In: Computer Graphics Forum, vol. 27, pp. 643–652. Wiley Online Library (2008)

  67. Digne, J., Cohen-Steiner, D., Alliez, P., De Goes, F., Desbrun, M.: Feature-preserving surface reconstruction and simplification from defect-laden point sets. Journal of Mathematical Imaging and Vision 48(2), 369–382 (2014)

  68. Zhang, J., Marszałek, M., Lazebnik, S., Schmid, C.: Local features and kernels for classification of texture and object categories: A comprehensive study. International Journal of Computer Vision 73(2), 213–238 (2007)

  69. Guo, Y., Bennamoun, M., Sohel, F., Lu, M., Wan, J.: 3d object recognition in cluttered scenes with local surface features: a survey. IEEE Transactions on Pattern Analysis and Machine Intelligence 36(11), 2270–2287 (2014)

  70. do Monte Lima, J.P.S., Teichrieb, V.: An efficient global point cloud descriptor for object recognition and pose estimation. In: 2016 29th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), pp. 56–63. IEEE (2016)

  71. Alhamzi, K., Elmogy, K., Barakat, S.: 3d object recognition based on local and global features using point cloud library. International Journal of Advancements in Computing Technology 7(3), 43 (2015)

  72. Rusu, R.B., Cousins, S.: 3d is here: Point cloud library (pcl). In: 2011 IEEE International Conference on Robotics and Automation, pp. 1–4. IEEE (2011)

  73. Hinterstoisser, S., Holzer, S., Cagniart, C., Ilic, S., Konolige, K., Navab, N., Lepetit, V.: Multimodal templates for real-time detection of texture-less objects in heavily cluttered scenes. In: 2011 international conference on Computer Vision, pp. 858–865. IEEE (2011)

  74. Hodaň, T., Zabulis, X., Lourakis, M., Obdržálek, Š., Matas, J.: Detection and fine 3d pose estimation of texture-less objects in rgb-d images. In: 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 4421–4428. IEEE (2015)

  75. Kotsiantis, S.B., Zaharakis, I., Pintelas, P.: Supervised machine learning: A review of classification techniques. Emerging Artificial Intelligence Applications in Computer Engineering 160(1), 3–24 (2007)

  76. Wang, C., Martín-Martín, R., Xu, D., Lv, J., Lu, C., Fei-Fei, L., Savarese, S., Zhu, Y.: 6-pack: Category-level 6d pose tracker with anchor-based keypoints. In: 2020 IEEE International Conference on Robotics and Automation (ICRA), pp. 10059–10066. IEEE (2020)

  77. Tjaden, H., Schwanecke, U., Schömer, E., Cremers, D.: A region-based gauss-newton approach to real-time monocular multiple object tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence 41(8), 1797–1812 (2018)

  78. Song, C., Song, J., Huang, Q.: Hybridpose: 6d object pose estimation under hybrid representations. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 431–440 (2020)

  79. Kehl, W., Milletari, F., Tombari, F., Ilic, S., Navab, N.: Deep learning of local rgb-d patches for 3d object detection and 6d pose estimation. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) Computer Vision – ECCV 2016, pp. 205–220. Springer International Publishing, Cham (2016). ISBN 978-3-319-46487-9

Funding

The work has been developed within the project ASSASSINN, funded by H2020 CleanSky 2 under grant agreement no. 886977.

Author information

Contributions

Methodology: L. Roveda, M. Maroni, L. Mazzuchelli, L. Praolini; software implementation: M. Maroni, L. Mazzuchelli, L. Praolini, L. Roveda; experimental tests: M. Maroni, L. Mazzuchelli, L. Praolini, L. Roveda; work supervision: G. Bucca, D. Piga; funds acquisition: L. Roveda; paper editing: L. Roveda, A. A. Shahid, M. Maroni, L. Mazzuchelli, L. Praolini.

Corresponding author

Correspondence to Loris Roveda.

Ethics declarations

Ethical approval

Not applicable.

Consent to participate

Not applicable.

Consent to publish

The authors consent to the publication of the work presented here.

Competing interests

Not applicable.

Appendix A: Object Detection State of the Art

Many object detection approaches rely on dedicated datasets (created either by human annotation or by incrementally placing one object in the scene and using foreground masking) for the detection of parts in operating environments [52,53,54]. Other approaches instead make use of the CAD files of the target parts to be detected. Some of these works use synthetic datasets generated by rendering the 3D CAD models of the target objects from different viewpoints, avoiding manual labeling [55]; training for detection and pose estimation is then performed offline (i.e., in simulation). However, several issues still make it difficult to transfer the trained behavior from simulation to the real task: modeling differences between the virtual training environment and the real testing scenario (i.e., the training may not suit the target application), the generation of training object poses that are not necessarily physically realistic (i.e., increasing processing time without providing useful information to the algorithm), and occlusions that are usually treated in a simplified manner (i.e., unrealistic scenes resulting in possible failures when moving to the real task) [56, 57]. An autonomous process for training a Convolutional Neural Network for object detection and pose estimation that (partially) overcomes such issues has been proposed in [58]: a physics engine generates synthetic but physically realistic images, and multiple views are exploited to perform object detection and pose estimation. Other CAD-based approaches perform object detection and pose estimation on real data acquired from the operating scene [59], using either 2D images [60] or RGB-D data [61].

In this context, feature-based methods [62], in which object detection relies on 3D data, are among the most popular solutions in robotic applications [41]. They can be divided into two main groups: local feature-based [63] and global feature-based [64] methods. Local feature-based approaches match descriptors of local surface characteristics and comprise three main stages [65]: 3D keypoint detection, local surface feature description, and surface matching. The first stage is the most important: a set of points is labeled as keypoints according to the chosen detection method (e.g., sparse surface sampling, mesh decimation, fixed-scale or adaptive-scale detection [66, 67]), and the subsequent detection is based on these points. Once a keypoint has been detected, the geometric information of the local surface around it is extracted and encoded into a feature descriptor. According to how the descriptors are constructed, existing methods can be classified into three broad categories [68]: signature-based, histogram-based, and transform-based methods. Finally, the surface matching step establishes a set of feature correspondences between the operating scene and the target model by matching the scene features against the model features. A comprehensive survey of these methods is given in [69]. Global feature-based methods, instead, follow a different pipeline in which the whole object surface is described by a single descriptor or a small set of descriptors.
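As a minimal illustration of the local-feature pipeline outlined above (surface feature description followed by scene-to-model surface matching), the sketch below uses Open3D's FPFH descriptors and RANSAC-based feature matching. It is a generic stand-in rather than the specific method of any work cited here; it assumes a recent Open3D release (≥ 0.12, where the registration API lives under o3d.pipelines.registration), and the file names and thresholds are hypothetical.

```python
# Hedged sketch of local feature description + surface matching with Open3D.
# "model.pcd" (e.g., sampled from the CAD file) and "scene.pcd" (camera
# acquisition) are hypothetical file names.
import open3d as o3d

VOXEL = 0.005  # 5 mm downsampling; tune to the sensor resolution

def preprocess(pcd, voxel):
    """Downsample, estimate normals, and compute FPFH descriptors."""
    down = pcd.voxel_down_sample(voxel)
    down.estimate_normals(
        o3d.geometry.KDTreeSearchParamHybrid(radius=2 * voxel, max_nn=30))
    fpfh = o3d.pipelines.registration.compute_fpfh_feature(
        down, o3d.geometry.KDTreeSearchParamHybrid(radius=5 * voxel, max_nn=100))
    return down, fpfh

model = o3d.io.read_point_cloud("model.pcd")
scene = o3d.io.read_point_cloud("scene.pcd")
model_down, model_fpfh = preprocess(model, VOXEL)
scene_down, scene_fpfh = preprocess(scene, VOXEL)

# Surface matching: estimate the 6D pose of the model in the scene from
# feature correspondences, with RANSAC rejecting wrong matches.
coarse = o3d.pipelines.registration.registration_ransac_based_on_feature_matching(
    model_down, scene_down, model_fpfh, scene_fpfh,
    mutual_filter=True,
    max_correspondence_distance=1.5 * VOXEL,
    estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPoint(False),
    ransac_n=4,
    checkers=[o3d.pipelines.registration.CorrespondenceCheckerBasedOnDistance(1.5 * VOXEL)],
    criteria=o3d.pipelines.registration.RANSACConvergenceCriteria(100000, 0.999))
print("Coarse 6D pose of the model in the scene:\n", coarse.transformation)
```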
Global point cloud descriptors are described extensively in [70]. Local feature-based techniques are more robust to cluttered environments and partial occlusions, which are frequent in real-world applications, whereas global feature-based methods are better suited to model retrieval and 3D shape classification, especially for objects with weak geometric structure. Alhamzi et al. [71] describe an approach that exploits both local and global feature techniques, built on the PCL library [72]. Further approaches have been developed for CAD-based object detection and pose identification: template matching techniques exploiting RGB-D data have been proposed, as in [73], where quantized surface normals are used as the depth cue, and more recently [74] applied the multimodal matching concept of [73] within an efficient cascade-style evaluation strategy.

Techniques based on supervised machine learning have also been used for object detection and pose estimation on RGB-D data. A review of the classification techniques used in supervised machine learning is given in [75], explaining how the goal of supervised learning is to build a concise model of the distribution of class labels in terms of predictor features. One method for pose estimation is a deep learning approach to category-level 3D object pose tracking on RGB-D data based on keypoints [76]: it tracks novel object instances of known categories (such as bowls, laptops, and mugs) in real time, learning to represent an object compactly by a handful of 3D keypoints from which the inter-frame motion of an object instance can be estimated through keypoint matching. Another algorithm performs real-time 6-DOF pose estimation and tracking of rigid 3D objects with a monocular RGB camera [77]; its key idea is to derive a region-based cost function using temporally consistent local color histograms. While such region-based cost functions are commonly optimized with first-order gradient descent, a Gauss-Newton optimization scheme is proposed instead, yielding drastically faster convergence and highly accurate, robust tracking. In numerous preliminary experiments performed by the authors [48, 49], this Gauss-Newton approach has been shown to outperform existing approaches in the presence of cluttered backgrounds, heterogeneous objects, and partial occlusions. HybridPose [78], instead, leverages multiple intermediate representations to express the geometric information in the input image for pose estimation: in addition to keypoints, it integrates a prediction network that outputs edge vectors between adjacent keypoints and, since most objects possess a partial reflection symmetry, it also exploits predicted dense pixel-wise correspondences that reflect the underlying symmetric relations between pixels. Finally, neural networks coupled with a local voting-based approach have been shown to perform reliable 3D object detection and pose estimation in cluttered environments with occlusions [79].
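Coarse, feature-based pose estimates such as the one sketched above are commonly refined to a fine 6D pose with ICP, in the spirit of the coarse-to-fine estimation discussed in this appendix. The snippet below is a hedged sketch of that refinement using Open3D's point-to-plane ICP; it continues from the previous example (reusing model_down, scene_down, and coarse) and is a generic refinement step, not the method of any cited work.

```python
# Hedged sketch: point-to-plane ICP refinement of the coarse RANSAC estimate.
# Assumes model_down, scene_down (with normals) and coarse from the sketch above.
import open3d as o3d

fine = o3d.pipelines.registration.registration_icp(
    model_down, scene_down,
    max_correspondence_distance=0.01,   # 1 cm gate; tune to the sensor noise
    init=coarse.transformation,         # start from the feature-matching pose
    estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPlane())

print("Refined 6D pose of the model in the scene:\n", fine.transformation)
print("Inlier fitness:", fine.fitness, " inlier RMSE:", fine.inlier_rmse)
```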

About this article

Cite this article

Roveda, L., Maroni, M., Mazzuchelli, L. et al. Robot End-Effector Mounted Camera Pose Optimization in Object Detection-Based Tasks. J Intell Robot Syst 104, 16 (2022). https://doi.org/10.1007/s10846-021-01558-0
