Abstract
We use a template descriptor on the image to find the object. However, the sparse 3D point cloud of the world that is also available is not used at all when searching for the object in the images. Because the detector produces many false alarms, we explore how to combine the image detections with the 3D point cloud in order to reject detection outliers. In this experiment, we use semi-direct monocular visual odometry (SVO) to provide the 3D point coordinates and camera poses needed to project 3D points to 2D image coordinates. By un-projecting the points that fall inside the tracking on the selection tree (TST) detection box back to 3D space, we can fit a 3D Gaussian ellipsoid to them and determine the scale of the detected object. By ruling out detections whose scale differs from that of the object, we can reject most of the detection outliers.
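The pipeline described above can be illustrated with a minimal sketch. This is not the implementation used in the article; it assumes an undistorted pinhole camera with intrinsics `K` and a world-to-camera pose `(R, t)` of the kind a visual odometry system such as SVO would supply. The function names and the parameters `expected_size`, `tol`, `n_sigma` and `min_points` are hypothetical and chosen only for illustration.

```python
import numpy as np


def project_points(points_w, K, R, t):
    """Project Nx3 world points to pixels with a pinhole model (x_c = R x_w + t)."""
    pts_c = points_w @ R.T + t
    in_front = pts_c[:, 2] > 1e-9            # keep only points in front of the camera
    uv = np.full((len(points_w), 2), np.inf)  # points behind the camera never match a box
    proj = pts_c[in_front] @ K.T
    uv[in_front] = proj[:, :2] / proj[:, 2:3]
    return uv, in_front


def points_in_box(uv, in_front, box):
    """Boolean mask of points whose projection lies in box = (u_min, v_min, u_max, v_max)."""
    u_min, v_min, u_max, v_max = box
    inside = (
        (uv[:, 0] >= u_min) & (uv[:, 0] <= u_max)
        & (uv[:, 1] >= v_min) & (uv[:, 1] <= v_max)
    )
    return inside & in_front


def ellipsoid_semi_axes(points, n_sigma=2.0):
    """Fit a 3D Gaussian; return its mean and the n-sigma semi-axes, largest first."""
    mean = points.mean(axis=0)
    cov = np.cov(points, rowvar=False)
    eigvals = np.linalg.eigvalsh(cov)         # ascending eigenvalues of the covariance
    semi_axes = n_sigma * np.sqrt(np.clip(eigvals, 0.0, None))[::-1]
    return mean, semi_axes


def is_plausible_detection(points_w, K, R, t, box, expected_size,
                           tol=0.5, min_points=5):
    """Accept a detection box only if the 3D points behind it have the expected scale.

    expected_size is the assumed largest semi-axis of the object in world units;
    tol is the allowed relative deviation. Both are illustrative assumptions.
    """
    uv, in_front = project_points(points_w, K, R, t)
    mask = points_in_box(uv, in_front, box)
    if mask.sum() < min_points:               # too few 3D points to estimate a scale
        return False
    _, semi_axes = ellipsoid_semi_axes(points_w[mask])
    return bool(abs(semi_axes[0] - expected_size) <= tol * expected_size)
```

Here the scale of a detection is summarized by the largest semi-axis of the fitted ellipsoid. A box that covers a compact cluster of points yields a small ellipsoid, whereas a false alarm on background structure typically collects points spread over a large depth range and is rejected.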
Cite this article
Guo, L. A comprehensive method to reject detection outliers by combining template descriptor with sparse 3D point clouds. J. Shanghai Jiaotong Univ. (Sci.) 22, 188–192 (2017). https://doi.org/10.1007/s12204-017-1820-x
DOI: https://doi.org/10.1007/s12204-017-1820-x
Key words
- semi-direct-monocular visual odometry (SVO)
- tracking on the selection tree (TST)-recognizer
- 3D point-clouds
- Gaussian ellipsoid fitting