Robot End-Effector Mounted Camera Pose Optimization in Object Detection-Based Tasks

  • Regular paper
  • Journal of Intelligent & Robotic Systems

Abstract

Robots equipped with a vision system at the end-effector are a powerful combination in industrial contexts, enabling a wide range of manufacturing tasks such as inspection. While many works are dedicated to machine vision algorithms, the optimization of the vision system pose is rarely addressed, even though optimizing the sensor pose can increase object detection performance while avoiding occlusions and collisions in the real working scene. The development of an approach capable of optimizing the pose of a vision system is therefore the main objective of this paper. A complete pipeline for such optimization is proposed, composed of the following main components: working scene reconstruction, robot-environment collision modeling, object detection, sensor pose optimization (exploiting Bayesian Optimization, a state-of-the-art methodology), and collision-free robot motion planning. To validate the proposed approach, experimental tests have been executed on two object detection-based tasks, employing a Franka Emika Panda robot equipped with an Intel RealSense D400 at its end-effector as the robotic platform. The results show a high-fidelity reconstruction of the real working environment for offline optimization (i.e., performed in simulation), as well as the capability of the Bayesian Optimization-based approach to define the sensor pose. The proposed methodology has been compared with a grid-point approach, showing improved performance for camera pose optimization. An additional experiment demonstrates the possibility of exploiting a digital twin of the working scene (if available) instead of the environment reconstruction, reducing the computational resources required and avoiding measurement noise in the 3D reconstruction. The obtained results show the feasibility of the proposed pipeline when such a digital twin is employed.
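To make the sensor pose optimization step concrete, the minimal sketch below runs Bayesian Optimization over a spherical viewpoint parameterization (azimuth, elevation, stand-off distance) around the target part. It uses scikit-optimize's Gaussian-process optimizer as a stand-in for the optimizer employed in the paper, and detection_score is only a placeholder for the detection fitness evaluated in the reconstructed scene; all names, bounds, and the toy score are illustrative assumptions, not the authors' released implementation.

```python
# Hedged sketch: Bayesian Optimization of a camera viewpoint.
# All names, bounds, and the toy objective are illustrative assumptions.
import numpy as np
from skopt import gp_minimize
from skopt.space import Real

def detection_score(azimuth, elevation, radius):
    """Placeholder for the detection fitness evaluated at this camera pose
    in the reconstructed scene (higher is better)."""
    target = np.array([0.8, 0.9, 0.6])          # hypothetical best viewpoint parameters
    x = np.array([azimuth, elevation, radius])
    return float(np.exp(-np.sum((x - target) ** 2)))

def objective(params):
    azimuth, elevation, radius = params
    # gp_minimize minimizes, so negate the score we want to maximize.
    return -detection_score(azimuth, elevation, radius)

search_space = [
    Real(0.0, 2.0 * np.pi, name="azimuth"),     # rotation around the part [rad]
    Real(0.1, 0.5 * np.pi, name="elevation"),   # camera elevation angle [rad]
    Real(0.3, 0.9, name="radius"),              # stand-off distance [m]
]

result = gp_minimize(objective, search_space, n_calls=40,
                     n_initial_points=10, random_state=0)
print("Best viewpoint (azimuth, elevation, radius):", result.x)
print("Best detection score:", -result.fun)
```

In the real pipeline, each evaluation would render or acquire the view at the candidate pose, run the object detector, and return its score, with infeasible (colliding or unreachable) poses penalized.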

Data Availability Statement

Open source code is available at https://github.com/LorisR/BO_best_view, providing all the developed components explained in the paper.

References

  1. Lasi, H., Fettke, P., Kemper, H.-G., Feld, T., Hoffmann, M.: Industry 4.0. Business & Information Systems Engineering 6(4), 239–242 (2014)

  2. Roveda, L., Magni, M., Cantoni, M., Piga, D., Bucca, G.: Human-robot collaboration in sensorless assembly task learning enhanced by uncertainties adaptation via bayesian optimization. Robot. Auton. Syst., pp 103711 (2020)

  3. Roveda, L., Maskani, J., Franceschi, P., Abdi, A., Braghin, F., Tosatti, L.M., Pedrocchi, N.: Model-based reinforcement learning variable impedance control for human-robot collaboration. J. Intell. Robot. Syst., pp 1–17 (2020b)

  4. Pérez, L., Rodríguez, Í., Rodríguez, N., Usamentiaga, R., García, D.F.: Robot guidance using machine vision techniques in industrial environments: A comparative review. Sensors 16(3), 335 (2016)

  5. Vozel, K.: The details of vision guided robotics. Quality, pp 38–40 (2020)

  6. Shamsfakhr, F., Bigham, B.S.: Gsr: geometrical scan registration algorithm for robust and fast robot pose estimation. Assembly Automation (2020)

  7. Nerakae, P., Uangpairoj, P., Chamniprasart, K.: Using machine vision for flexible automatic assembly system. Procedia Computer Science 96, 428–435 (2016)

  8. Roveda, L., Castaman, N., Ghidoni, S., Franceschi, P., Boscolo, N., Pagello, E., Pedrocchi, N.: Human-robot cooperative interaction control for the installation of heavy and bulky components. In: 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 339–344. IEEE (2018)

  9. Balatti, P., Kanoulas, D., Tsagarakis, N., Ajoudani, A.: A method for autonomous robotic manipulation through exploratory interactions with uncertain environments. Autonomous Robots 44(8), 1395–1410 (2020)

  10. Zhihong, C., Hebin, Z., Yanbo, W., Binyan, L., Yu, L.: A vision-based robotic grasping system using deep learning for garbage sorting. In: 2017 36th Chinese Control Conference (CCC), pp. 11223–11226. IEEE (2017)

  11. Frank, D., Chhor, J., Schmitt, R.: Stereo-vision for autonomous industrial inspection robots. In: 2017 IEEE International Conference on Robotics and Biomimetics (ROBIO), pp. 2555–2561. IEEE (2017)

  12. Militaru, C., Mezei, A.-D., Tamas, L.: Object handling in cluttered indoor environment with a mobile manipulator. In: 2016 IEEE International Conference on Automation, Quality and Testing, Robotics (AQTR), pp. 1–6. IEEE, (2016)

  13. Kragic, D.: Free space of rigid objects: Caging, path non-existence, and narrow passage detection. In: Algorithmic Foundations of Robotics XIII: Proceedings of the 13th Workshop on the Algorithmic Foundations of Robotics, vol. 14, p. 19. Springer Nature (2020)

  14. Nair, D., Pakdaman, A., Plöger, P.G.: Performance evaluation of low-cost machine vision cameras for image-based grasp verification. arXiv:2003.10167 (2020)

  15. Cheng, S., Leng, Z., Cubuk, E.D., Zoph, B., Bai, C., Ngiam, J., Song, Y., Caine, B., Vasudevan, V., Li, C., et al.: Improving 3d object detection through progressive population based augmentation. In: European Conference on Computer Vision, pp. 279–294. Springer (2020)

  16. Pi, Y., Nath, N.D., Behzadan, A.H.: Convolutional neural networks for object detection in aerial imagery for disaster response and recovery. Advanced Engineering Informatics 43, 101009 (2020)

  17. Arnold, E., Al-Jarrah, O.Y., Dianati, M., Fallah, S., Oxtoby, D., Mouzakitis, A.: A survey on 3d object detection methods for autonomous driving applications. IEEE Transactions on Intelligent Transportation Systems 20(10), 3782–3795 (2019)

  18. Zou, Z., Shi, Z., Guo, Y., Ye, J.: Object detection in 20 years: A survey. arXiv:1905.05055 (2019)

  19. Du, G., Wang, K., Lian, S.: Vision-based robotic grasping from object localization, pose estimation, grasp detection to motion planning: A review. arXiv:1905.06658 (2019)

  20. Chen, J., Zhang, L., Liu, Y., Xu, C.: Survey on 6d pose estimation of rigid object. In: 2020 39th Chinese Control Conference (CCC), pp. 7440–7445. IEEE (2020)

  21. Roveda, L., Ghidoni, S., Cotecchia, S., Pagello, E., Pedrocchi, N.: Eureca h2020 cleansky 2: a multi-robot framework to enhance the fourth industrial revolution in the aerospace industry. In: Robotics and Automation (ICRA), 2017 IEEE Int Conf on, Workshop on Industry of the Future: Collaborative, Connected, Cognitive. Novel approaches stemming from Factory of the Future and Industry 4.0 initiatives (2017)

  22. Vicentini, F., Pedrocchi, N., Beschi, M., Giussani, M., Iannacci, N., Magnoni, P., Pellegrinelli, S., Roveda, L., Villagrossi, E., Askarpour, M., et al.: Piros: Cooperative, safe and reconfigurable robotic companion for cnc pallets load/unload stations. In: Bringing Innovative Robotic Technologies from Research Labs to Industrial End-users, pp. 57–96. Springer (2020)

  23. Ercan, A.O., Yang, D.B., El Gamal, A., Guibas, L.J.: Optimal placement and selection of camera network nodes for target localization. In: International Conference on Distributed Computing in Sensor Systems, pp. 389–404. Springer (2006)

  24. Olague, G., Mohr, R.: Optimal camera placement for accurate reconstruction. Pattern Recognition 35(4), 927–944 (2002)

  25. Chen, S.Y., Li, Y.F.: Automatic sensor placement for model-based robot vision. IEEE Transactions on Systems Man and Cybernetics Part B (Cybernetics) 34(1), 393–408 (2004)

  26. Dunn, E., Olague, G.: Pareto optimal camera placement for automated visual inspection. In: 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 3821–3826. IEEE (2005)

  27. McGreavy, C., Kunze, L., Hawes, N.: Next best view planning for object recognition in mobile robotics. CEUR Workshop Proceedings (2017)

  28. Iversen, T.M., Kraft, D.: Optimizing sensor placement: A mixture model framework using stable poses and sparsely precomputed pose uncertainty predictions. In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 6652–6659. IEEE (2018)

  29. Mosbach, D., Gospodnetić, P., Rauhut, M., Hamann, B., Hagen, H.: Feature-driven viewpoint placement for model-based surface inspection. Machine Vision and Applications 32(1), 1–21 (2020)

  30. Ajoudani, A., Zanchettin, A.M., Ivaldi, S., Albu-Schäffer, A., Kosuge, K., Khatib, O.: Progress and prospects of the human-robot collaboration. Autonomous Robots 42(5), 957–975 (2018)

  31. Pelikan, M., Goldberg, D.E., Cantú-Paz, E., et al.: Boa: The bayesian optimization algorithm. In: Proceedings of the Genetic and Evolutionary Computation Conference GECCO-99, vol. 1, pp. 525–532. Citeseer (1999)

  32. Brochu, E., Cora, V.M., De Freitas, N.: A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning. arXiv:1012.2599 (2010)

  33. Letham, B., Karrer, B., Ottoni, G., Bakshy, E., et al.: Constrained bayesian optimization with noisy experiments. Bayesian Analysis 14(2), 495–519 (2019)

  34. Schleicher, T., Bullinger, A.C.: Assistive robots in highly flexible automotive manufacturing processes. In: Congress of the International Ergonomics Association, pp. 203–215. Springer (2018)

  35. Ciszak, O.: Industry 4.0–industrial robots. In: MMS 2018: 3rd EAI International Conference on Management of Manufacturing Systems, pp. 52. European Alliance for Innovation (2018)

  36. Cully, A., Clune, J., Tarapore, D., Mouret, J.-B.: Robots that can adapt like animals. Nature 521(7553), 503 (2015)

  37. Drieß, D., Englert, P., Toussaint, M.: Constrained bayesian optimization of combined interaction force/task space controllers for manipulations. In: 2017 IEEE International Conference on Robotics and Automation (ICRA), pp. 902–907. IEEE (2017)

  38. Yuan, K., Chatzinikolaidis, I., Li, Z.: Bayesian optimization for whole-body control of high degrees of freedom robots through reduction of dimensionality. IEEE Robot. Autom. Lett. (2019)

  39. Rozo, L.: Interactive trajectory adaptation through force-guided bayesian optimization. arXiv:1908.07263 (2019)

  40. Roveda, L., Forgione, M., Piga, D.: Robot control parameters auto-tuning in trajectory tracking applications. Control Engineering Practice 101, 104488 (2020)

  41. Roveda, L., Castaman, N., Franceschi, P., Ghidoni, S., Pedrocchi, N.: A control framework definition to overcome position/interaction dynamics uncertainties in force-controlled tasks. In: 2020 IEEE International Conference on Robotics and Automation (ICRA), pp. 6819–6825. IEEE (2020d)

  42. Hornung, A., Wurm, K.M., Bennewitz, M., Stachniss, C., Burgard, W.: Octomap: An efficient probabilistic 3d mapping framework based on octrees. Autonomous Robots 34(3), 189–206 (2013)

  43. Hodan, T., Michel, F., Brachmann, E., Kehl, W., GlentBuch, A., Kraft, D., Drost, B., Vidal, J., Ihrke, S., Zabulis, X., et al.: Bop: Benchmark for 6d object pose estimation. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 19–34 (2018)

  44. IFL-CAMP: easy_handeye. https://github.com/IFL-CAMP/easy_handeye. Last visited: January 2021

  45. Wang, J., Olson, E.: Apriltag 2: Efficient and robust fiducial detection. In: 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 4193–4198. IEEE (2016)

  46. Drost, B., Ulrich, M., Navab, N., Ilic, S.: Model globally, match locally: Efficient and robust 3d object recognition. In: 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE (2010)

  47. Hinterstoisser, S., Lepetit, V., Rajkumar, N., Konolige, K.: Going further with point pair features. In: European Conference on Computer Vision (ECCV). Springer (2016)

  48. Maroni, M., Praolini, L.: Best view methodology enhanced by bayesian optimization for robotic motion planning in quality inspection tasks. Master’s thesis, Politecnico di Milano (2020)

  49. Mazzuchelli, L.: Robotized quality inspection approach enhanced by bayesian optimization through point cloud based sensors. Master’s thesis, Politecnico di Milano (2020)

  50. Chitta, S., Sucan, I., Cousins, S.: MoveIt! [ROS Topics]. IEEE Robotics & Automation Magazine 19(1), 18–19 (2012)

  51. Cully, A., Chatzilygeroudis, K., Allocati, F., Mouret, J.-B.: Limbo: A fast and flexible library for bayesian optimization. arXiv:1611.07343 (2016)

  52. Singh, A., Sha, J., Narayan, K.S., Achim, T., Abbeel, P.: Bigbird: A large-scale 3d database of object instances. In: 2014 IEEE International Conference on Robotics and Automation (ICRA), pp. 509–516. IEEE (2014)

  53. Xiang, Y., Kim, W., Chen, W., Ji, J., Choy, C., Su, H., Mottaghi, R., Guibas, L., Savarese, S.: Objectnet3d: A large scale database for 3d object recognition. In: European Conference on Computer Vision, pp. 160–176. Springer (2016)

  54. Barbu, A., Mayo, D., Alverio, J., Luo, W., Wang, C., Gutfreund, D., Tenenbaum, J., Katz, B.: Objectnet: A large-scale bias-controlled dataset for pushing the limits of object recognition models. In: Advances in Neural Information Processing Systems, pp. 9453–9463 (2019)

  55. Su, H., Qi, C.R., Li, Y., Guibas, L.J.: Render for cnn: Viewpoint estimation in images using cnns trained with rendered 3d model views. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2686–2694 (2015)

  56. Peng, X., Sun, B., Ali, K., Saenko, K.: Learning deep object detectors from 3d models. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1278–1286 (2015)

  57. Movshovitz-Attias, Y., Kanade, T., Sheikh, Y.: How useful is photo-realistic rendering for visual learning? In: European Conference on Computer Vision, pp. 202–217. Springer (2016)

  58. Mitash, C., Bekris, K.E., Boularias, A.: A self-supervised learning system for object detection using physics simulation and multi-view pose estimation. In: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 545–551. IEEE (2017)

  59. Tushar, J., Sardana, H.K., et al.: Mechanical cad parts recognition for industrial automation. In: Smart Computing and Informatics, pp. 341–349. Springer (2018)

  60. Ben Abdallah, H., Jovančević, I., Orteu, J.-J., Brèthes, L.: Automatic inspection of aeronautical mechanical assemblies by matching the 3d cad model and real 2d images. Journal of Imaging 5(10), 81 (2019)

  61. Song, K.-T., Wu, C.-H., Jiang, S.-Y.: Cad-based pose estimation design for random bin picking using a rgb-d camera. Journal of Intelligent & Robotic Systems 87(3–4), 455–470 (2017)

  62. Murphy, K., Torralba, A., Eaton, D., Freeman, W.: Object detection and localization using local and global features. In: Toward Category-Level Object Recognition, pp. 382–400. Springer (2006)

  63. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision 60(2), 91–110 (2004)

  64. Czajewski, W., Kołomyjec, K.: 3d object detection and recognition for robotic grasping based on rgb-d images and global features. Foundations of Computing and Decision Sciences 42(3), 219–237 (2017)

  65. Sukanya, C.M., Gokul, R., Paul, V.: A survey on object recognition methods. International Journal of Science, Engineering and Computer Technology 6(1), 48 (2016)

  66. Castellani, U., Cristani, M., Fantoni, S., Murino, V.: Sparse points matching by combining 3d mesh saliency with statistical descriptors. In: Computer Graphics Forum, vol. 27, pp. 643–652. Wiley Online Library (2008)

  67. Digne, J., Cohen-Steiner, D., Alliez, P., De Goes, F., Desbrun, M.: Feature-preserving surface reconstruction and simplification from defect-laden point sets. Journal of Mathematical Imaging and Vision 48(2), 369–382 (2014)

  68. Zhang, J., Marszałek, M., Lazebnik, S., Schmid, C.: Local features and kernels for classification of texture and object categories: A comprehensive study. International Journal of Computer Vision 73(2), 213–238 (2007)

  69. Guo, Y., Bennamoun, M., Sohel, F., Lu, M., Wan, J.: 3d object recognition in cluttered scenes with local surface features: a survey. IEEE Transactions on Pattern Analysis and Machine Intelligence 36(11), 2270–2287 (2014)

  70. do Monte Lima, J.P.S., Teichrieb, V.: An efficient global point cloud descriptor for object recognition and pose estimation. In: 2016 29th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), pp. 56–63. IEEE (2016)

  71. Alhamzi, K., Elmogy, K., Barakat, S.: 3d object recognition based on local and global features using point cloud library. International Journal of Advancements in Computing Technology 7(3), 43 (2015)

  72. Rusu, R.B., Cousins, S.: 3d is here: Point cloud library (pcl). In: 2011 IEEE International Conference on Robotics and Automation, pp. 1–4. IEEE (2011)

  73. Hinterstoisser, S., Holzer, S., Cagniart, C., Ilic, S., Konolige, K., Navab, N., Lepetit, V.: Multimodal templates for real-time detection of texture-less objects in heavily cluttered scenes. In: 2011 international conference on Computer Vision, pp. 858–865. IEEE (2011)

  74. Hodaň, T., Zabulis, X., Lourakis, M., Obdržálek, Š., Matas, J.: Detection and fine 3d pose estimation of texture-less objects in rgb-d images. In: 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 4421–4428. IEEE (2015)

  75. Kotsiantis, S.B., Zaharakis, I., Pintelas, P.: Supervised machine learning: A review of classification techniques. Emerging Artificial Intelligence Applications in Computer Engineering 160(1), 3–24 (2007)

  76. Wang, C., Martín-Martín, R., Xu, D., Lv, J., Lu, C., Fei-Fei, L., Savarese, S., Zhu, Y.: 6-pack: Category-level 6d pose tracker with anchor-based keypoints. In: 2020 IEEE International Conference on Robotics and Automation (ICRA), pp. 10059–10066. IEEE (2020)

  77. Tjaden, H., Schwanecke, U., Schömer, E., Cremers, D.: A region-based gauss-newton approach to real-time monocular multiple object tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence 41(8), 1797–1812 (2018)

  78. Song, C., Song, J., Huang, Q.: Hybridpose: 6d object pose estimation under hybrid representations. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 431–440 (2020)

  79. Kehl, W., Milletari, F., Tombari, F., Ilic, S., Navab, N.: Deep learning of local rgb-d patches for 3d object detection and 6d pose estimation. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) Computer Vision – ECCV 2016, pp. 205–220. Springer International Publishing, Cham (2016). ISBN 978-3-319-46487-9

Funding

The work has been developed within the project ASSASSINN, funded by H2020 CleanSky 2 under grant agreement no. 886977.

Author information

Contributions

Methodology: L. Roveda, M. Maroni, L. Mazzuchelli, L. Praolini; software implementation: M. Maroni, L. Mazzuchelli, L. Praolini, L. Roveda; experimental tests: M. Maroni, L. Mazzuchelli, L. Praolini, L. Roveda; work supervision: G. Bucca, D. Piga; funds acquisition: L. Roveda; paper editing: L. Roveda, A. A. Shahid, M. Maroni, L. Mazzuchelli, L. Praolini.

Corresponding author

Correspondence to Loris Roveda.

Ethics declarations

Ethical approval

Not applicable.

Consent to participate

Not applicable.

Consent to publish

The authors consent to the publication of the work presented here.

Competing interests

Not applicable.

Appendix A: Object Detection State of the Art

Many object detection approaches rely on dedicated datasets (created either by human annotation or by incrementally placing one object in the scene and using foreground masking) for the detection of parts in operating environments [52,53,54]. Other approaches instead make use of the CAD files of the target parts to be detected. Some of these works use synthetic datasets generated by rendering the 3D CAD models of the target objects from different viewpoints, avoiding manual labeling [55]; training for detection and pose estimation is then performed offline (i.e., in simulation). However, several issues still make it difficult to transfer the trained behavior from simulation to the real task: modeling differences between the virtual training environment and the real testing scenario (i.e., the training may not suit the target application), the generation of training object poses that are not necessarily physically realistic (i.e., increasing processing time without providing useful information to the algorithm), and occlusions that are usually treated in a simplified manner (i.e., unrealistic scenes resulting in possible failures when moving to the real task) [56, 57]. An autonomous process for training a Convolutional Neural Network for object detection and pose estimation that (partially) overcomes such issues has been proposed in [58]: a physics engine generates synthetic but physically realistic images, and multiple views are exploited to perform object detection and pose estimation. Other CAD-based approaches perform object detection and pose estimation on real data acquired from the operating scene [59], using either 2D images [60] or RGB-D data [61].

In this context, feature-based methods [62], in which object detection relies on 3D data, are among the most popular solutions in robotic applications [41]. They can be divided into two main groups: local feature-based [63] and global feature-based [64] methods. Local feature-based approaches match descriptors of local surface characteristics and comprise three main stages [65]: 3D keypoint detection, local surface feature description, and surface matching. The first stage is the most important: a set of points is labeled as keypoints according to the chosen detection method (e.g., sparse surface sampling, mesh decimation, fixed-scale or adaptive-scale detection [66, 67]), and the subsequent detection is based on these points. Once a keypoint has been detected, the geometric information of the local surface around it is extracted and encoded into a feature descriptor. According to how the descriptors are constructed, existing methods can be classified into three broad categories [68]: signature-based, histogram-based, and transform-based methods. Finally, the surface matching step establishes a set of feature correspondences between the operating scene and the target model by matching the scene features against the model features. A comprehensive survey of these methods is given in [69]. Global feature-based methods, instead, follow a different pipeline in which the whole object surface is described by a single descriptor or a small set of descriptors.
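As a minimal illustration of the local-feature pipeline outlined above (surface feature description followed by scene-to-model surface matching), the sketch below uses Open3D's FPFH descriptors and RANSAC-based feature matching. It is a generic stand-in rather than the specific method of any work cited here; it assumes a recent Open3D release (≥ 0.12, where the registration API lives under o3d.pipelines.registration), and the file names and thresholds are hypothetical.

```python
# Hedged sketch of local feature description + surface matching with Open3D.
# "model.pcd" (e.g., sampled from the CAD file) and "scene.pcd" (camera
# acquisition) are hypothetical file names.
import open3d as o3d

VOXEL = 0.005  # 5 mm downsampling; tune to the sensor resolution

def preprocess(pcd, voxel):
    """Downsample, estimate normals, and compute FPFH descriptors."""
    down = pcd.voxel_down_sample(voxel)
    down.estimate_normals(
        o3d.geometry.KDTreeSearchParamHybrid(radius=2 * voxel, max_nn=30))
    fpfh = o3d.pipelines.registration.compute_fpfh_feature(
        down, o3d.geometry.KDTreeSearchParamHybrid(radius=5 * voxel, max_nn=100))
    return down, fpfh

model = o3d.io.read_point_cloud("model.pcd")
scene = o3d.io.read_point_cloud("scene.pcd")
model_down, model_fpfh = preprocess(model, VOXEL)
scene_down, scene_fpfh = preprocess(scene, VOXEL)

# Surface matching: estimate the 6D pose of the model in the scene from
# feature correspondences, with RANSAC rejecting wrong matches.
coarse = o3d.pipelines.registration.registration_ransac_based_on_feature_matching(
    model_down, scene_down, model_fpfh, scene_fpfh,
    mutual_filter=True,
    max_correspondence_distance=1.5 * VOXEL,
    estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPoint(False),
    ransac_n=4,
    checkers=[o3d.pipelines.registration.CorrespondenceCheckerBasedOnDistance(1.5 * VOXEL)],
    criteria=o3d.pipelines.registration.RANSACConvergenceCriteria(100000, 0.999))
print("Coarse 6D pose of the model in the scene:\n", coarse.transformation)
```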
Global point cloud descriptors are described extensively in [70]. Local feature-based techniques are more robust to cluttered environments and partial occlusions, which are frequent in real-world applications, whereas global feature-based methods are better suited to model retrieval and 3D shape classification, especially for objects with weak geometric structure. Alhamzi et al. [71] describe an approach that exploits both local and global feature techniques, built on the PCL library [72]. Further approaches have been developed for CAD-based object detection and pose identification: template matching techniques exploiting RGB-D data have been proposed, as in [73], where quantized surface normals are used as the depth cue, and more recently [74] applied the multimodal matching concept of [73] within an efficient cascade-style evaluation strategy.

Techniques based on supervised machine learning have also been used for object detection and pose estimation on RGB-D data. A review of the classification techniques used in supervised machine learning is given in [75], explaining how the goal of supervised learning is to build a concise model of the distribution of class labels in terms of predictor features. One method for pose estimation is a deep learning approach to category-level 3D object pose tracking on RGB-D data based on keypoints [76]: it tracks novel object instances of known categories (such as bowls, laptops, and mugs) in real time, learning to represent an object compactly by a handful of 3D keypoints from which the inter-frame motion of an object instance can be estimated through keypoint matching. Another algorithm performs real-time 6-DOF pose estimation and tracking of rigid 3D objects with a monocular RGB camera [77]; its key idea is to derive a region-based cost function using temporally consistent local color histograms. While such region-based cost functions are commonly optimized with first-order gradient descent, a Gauss-Newton optimization scheme is proposed instead, yielding drastically faster convergence and highly accurate, robust tracking. In numerous preliminary experiments performed by the authors [48, 49], this Gauss-Newton approach has been shown to outperform existing approaches in the presence of cluttered backgrounds, heterogeneous objects, and partial occlusions. HybridPose [78], instead, leverages multiple intermediate representations to express the geometric information in the input image for pose estimation: in addition to keypoints, it integrates a prediction network that outputs edge vectors between adjacent keypoints and, since most objects possess a partial reflection symmetry, it also exploits predicted dense pixel-wise correspondences that reflect the underlying symmetric relations between pixels. Finally, neural networks coupled with a local voting-based approach have been shown to perform reliable 3D object detection and pose estimation in cluttered environments with occlusions [79].
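Coarse, feature-based pose estimates such as the one sketched above are commonly refined to a fine 6D pose with ICP, in the spirit of the coarse-to-fine estimation discussed in this appendix. The snippet below is a hedged sketch of that refinement using Open3D's point-to-plane ICP; it continues from the previous example (reusing model_down, scene_down, and coarse) and is a generic refinement step, not the method of any cited work.

```python
# Hedged sketch: point-to-plane ICP refinement of the coarse RANSAC estimate.
# Assumes model_down, scene_down (with normals) and coarse from the sketch above.
import open3d as o3d

fine = o3d.pipelines.registration.registration_icp(
    model_down, scene_down,
    max_correspondence_distance=0.01,   # 1 cm gate; tune to the sensor noise
    init=coarse.transformation,         # start from the feature-matching pose
    estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPlane())

print("Refined 6D pose of the model in the scene:\n", fine.transformation)
print("Inlier fitness:", fine.fitness, " inlier RMSE:", fine.inlier_rmse)
```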

About this article

Cite this article

Roveda, L., Maroni, M., Mazzuchelli, L. et al. Robot End-Effector Mounted Camera Pose Optimization in Object Detection-Based Tasks. J Intell Robot Syst 104, 16 (2022). https://doi.org/10.1007/s10846-021-01558-0
