Abstract
We present a new approach for determining the 3D motion of a moving rigid object relative to a single camera from an image sequence. The motion parameters, a 3D rotation and a 3D translation, are estimated by solving nonlinear least-squares equations formulated over corresponding object features observed in images taken at different times. Because convergence of these equations is ill-conditioned, good initial values are obtained from a paraperspective projection model.
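The estimation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes known 3D feature coordinates on the object, a pinhole camera with focal length f, an axis-angle rotation parameterization, and a plain Gauss-Newton loop with a finite-difference Jacobian. The names rodrigues, project, residuals, and estimate_motion are hypothetical, and the hand-picked starting point merely stands in for the paraperspective initial values mentioned in the abstract.

```python
import numpy as np

def rodrigues(w):
    """Rotation matrix for an axis-angle vector w of shape (3,)."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    k = w / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def project(params, pts, f=1.0):
    """Apply the rigid motion (rotation, translation), then project perspectively."""
    R = rodrigues(params[:3])
    cam = pts @ R.T + params[3:]          # object points in camera coordinates
    return f * cam[:, :2] / cam[:, 2:3]   # pinhole projection (assumed camera model)

def residuals(params, pts, obs, f=1.0):
    """Stacked reprojection errors between predicted and observed features."""
    return (project(params, pts, f) - obs).ravel()

def estimate_motion(pts, obs, x0, f=1.0, iters=50, tol=1e-12, eps=1e-6):
    """Gauss-Newton refinement of [rx, ry, rz, tx, ty, tz] starting from x0."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(iters):
        r = residuals(x, pts, obs, f)
        J = np.empty((r.size, x.size))
        for j in range(x.size):           # forward-difference Jacobian column
            dx = np.zeros_like(x)
            dx[j] = eps
            J[:, j] = (residuals(x + dx, pts, obs, f) - r) / eps
        step = np.linalg.lstsq(J, -r, rcond=None)[0]
        x += step
        if np.linalg.norm(step) < tol:
            break
    return x

# Synthetic check: corners of a unit cube as object features (illustrative data only).
pts = np.array([[x, y, z] for x in (-0.5, 0.5)
                for y in (-0.5, 0.5) for z in (-0.5, 0.5)])
true_motion = np.array([0.1, -0.2, 0.15, 0.1, -0.2, 5.0])
obs = project(true_motion, pts)

# Hand-picked initial guess; a stand-in for the paraperspective initial values.
x0 = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 4.5])
est = estimate_motion(pts, obs, x0)
print(est)
```

The iteration only behaves well when the starting point is reasonably close to the true motion, which is the role the abstract assigns to the paraperspective initial values.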
Acknowledgements
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (No. NRF-2016R1A2B4013017).
Cite this article
Tumurbaatar, T., Kim, T. Development of real-time object motion estimation from single camera. Spat. Inf. Res. 25, 647–656 (2017). https://doi.org/10.1007/s41324-017-0130-6
DOI: https://doi.org/10.1007/s41324-017-0130-6