Abstract
This paper describes a new method for moving object detection using an IMU sensor and instance image segmentation. In the proposed method, feature points are first extracted by a detector, and an initial fundamental matrix is calculated from the IMU data. The epipolar lines derived from this matrix are then used to classify the extracted feature points. From the matched background feature points, the fundamental matrix is recalculated iteratively to minimize the classification error. After the feature point classification, instance image segmentation is applied to refine the result. The proposed method is implemented, tested on real-world driving videos, and compared with previous works.
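The core classification step described above checks how far each matched feature point lies from the epipolar line induced by the current fundamental matrix: points that satisfy the epipolar constraint are treated as background, the rest as candidate moving points. A minimal sketch of that test, assuming NumPy and hypothetical function names (the paper's actual implementation details, thresholds, and IMU-based initialization are not shown here):

```python
import numpy as np

def epipolar_distance(F, pts1, pts2):
    """Distance of each point in pts2 to the epipolar line F @ x1.

    F: 3x3 fundamental matrix; pts1, pts2: (N, 2) matched pixel coordinates.
    """
    n = len(pts1)
    # Lift matches to homogeneous coordinates
    x1 = np.hstack([pts1, np.ones((n, 1))])
    x2 = np.hstack([pts2, np.ones((n, 1))])
    # Epipolar lines in image 2 induced by points in image 1; each row is (a, b, c)
    l2 = x1 @ F.T
    # Point-to-line distance |a*x + b*y + c| / sqrt(a^2 + b^2)
    num = np.abs(np.sum(l2 * x2, axis=1))
    den = np.sqrt(l2[:, 0] ** 2 + l2[:, 1] ** 2)
    return num / den

def classify_background(F, pts1, pts2, thresh=1.0):
    """True for matches consistent with the epipolar constraint (background)."""
    return epipolar_distance(F, pts1, pts2) < thresh
```

In the iterative step of the pipeline, the fundamental matrix would be re-estimated from only the points labeled `True`, and the classification repeated until it stabilizes; `thresh` here is an illustrative pixel tolerance, not a value from the paper.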
Acknowledgements
This work was supported by the IT R&D program of MSIT/IITP [R2020040040, Development of 5G-based 3D spatial scanning device technology for virtual space composition].
Cite this article
Jung, S., Cho, Y., Lee, K. et al. Moving Object Detection with Single Moving Camera and IMU Sensor using Mask R-CNN Instance Image Segmentation. Int. J. Precis. Eng. Manuf. 22, 1049–1059 (2021). https://doi.org/10.1007/s12541-021-00527-9