A deep learning based fusion of RGB camera information and magnetic localization information for endoscopic capsule robots
Abstract
A reliable, real-time localization functionality is crucial for actively controlled endoscopic capsule robots, an emerging, minimally invasive diagnostic and therapeutic technology for the gastrointestinal (GI) tract. In this study, we extend the success of deep learning approaches from various research fields to the problem of sensor fusion for endoscopic capsule robots. We propose a multi-sensor fusion based localization approach that combines endoscopic camera information with magnetic-sensor-based localization information. Experiments performed on a real pig stomach dataset show that our method achieves submillimeter precision for both translational and rotational movements.
Keywords
Deep learning based sensor fusion · Endoscopic capsule robots · RNN · CNN (RNN: recurrent neural network, CNN: convolutional neural network)
1 Introduction
Robot localization denotes the robot’s ability to establish its position and orientation within a frame of reference. The different sensors used in medical milliscale robot localization have their own particular strengths and weaknesses, which makes sensor data fusion an attractive solution. Monocular visual-magnetic odometry approaches, for example, have received considerable attention in the mobile robotic sensor fusion literature. In general, localization techniques for endoscopic capsule robots can be categorized into three main groups: electromagnetic wave-based techniques, magnetic field strength-based techniques, and hybrid techniques (Umay et al. 2017).
In recent years, numerous electromagnetic wave-based approaches have been proposed, including methods based on time of flight (ToF), time difference of arrival (TDoA), received signal strength (RSS), RF identification (RFID), and angle of arrival (AoA) (Wang et al. 2011; Fischer et al. 2004; Wang et al. 2009; Ye 2013; Hou et al. 2009).
In magnetic localization systems, the magnetic source and the magnetic sensor system are the essential components. The magnetic source can be designed in different ways: a permanent magnet, an embedded secondary coil, or a tri-axial magnetoresistive sensor. Magnetic sensors located outside the human body detect the magnetic flux density in order to estimate the location of the capsule (e.g., Popek et al. 2013; Di Natali et al. 2016; Yim and Sitti 2013). One major advantage of magnetic field strength-based localization techniques is that they couple well with magnetic locomotion systems, whether based on magnetic steering, magnetic levitation, or remote magnetic manipulation. Another advantage is their robustness against attenuation by the human body. Their disadvantage is susceptibility to interference from the environment, which is typically handled by adding hardware dedicated to the localization problem.
The final group of endoscopic capsule robot localization techniques is the hybrid techniques, which integrate several sources at once, such as RF sensors, magnetic sensors, and RGB sensors. The core idea is to combine data from different sources that complement each other, producing more accurate localization estimates. A common approach is to use the Kalman filter and its derivatives to fuse RF electromagnetic signal data, magnetic sensor data, and video data. The first group of hybrid methods fuses RF and video signals (Geng and Pahlavan 2016; Bao et al. 2015), whereas the second group focuses on fusion of magnetic and RF signal data (Umay and Fidan 2016; Geng and Pahlavan 2016; Umay and Fidan 2017) and the last group on fusion of magnetic and video data (Gumprecht et al. 2013).
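The measurement update at the heart of such Kalman-filter fusion schemes can be sketched as follows. This is the textbook linear update, not the implementation of any cited system; the observation model `H` and noise covariance `R` are placeholders that a concrete sensor (magnetic, RF, or visual) would supply:

```python
import numpy as np

def kf_update(x, P, z, H, R):
    """Standard Kalman measurement update: fuse observation z into state (x, P).

    x: state mean, P: state covariance,
    z: measurement, H: observation matrix, R: measurement noise covariance.
    """
    S = H @ P @ H.T + R                    # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)         # Kalman gain
    x = x + K @ (z - H @ x)                # corrected state mean
    P = (np.eye(len(x)) - K @ H) @ P       # corrected state covariance
    return x, P
```

With two sensors, the update is simply applied once per incoming measurement, each with its own `H` and `R`; sensors with lower noise (smaller `R`) pull the state estimate more strongly.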
2 System architecture details
Our system consists of three main modules:
 1. Optical flow estimation.
 2. CNN-based feature vector extraction.
 3. LSTM-based sensor fusion.
2.1 Preprocessing
Although one appeal of deep learning is said to be its ability to process raw input data without requiring any pre- or post-processing, we do apply preprocessing, since in our evaluations it increased the accuracy of our method. This section explains the preprocessing operations applied to the raw RGB image data before it is passed into the deep neural network: vessel detection and enhancement, and keyframe selection.
2.1.1 Multiscale vessel enhancement
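The section title refers to the Hessian-eigenvalue vesselness filter of Frangi et al. (1998), cited in the references. A minimal single-channel 2-D sketch of that filter is shown below; the scale set `sigmas` and the parameters `beta` and `c` are illustrative defaults, not the paper's values:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def vesselness_2d(img, sigmas=(1.0, 2.0, 4.0), beta=0.5, c=0.5):
    """Minimal 2-D Frangi-style vesselness: max response over scales."""
    img = np.asarray(img, dtype=np.float64)
    best = np.zeros_like(img)
    for s in sigmas:
        # Scale-normalised Hessian entries via Gaussian derivatives.
        hxx = gaussian_filter(img, s, order=(0, 2)) * s ** 2
        hyy = gaussian_filter(img, s, order=(2, 0)) * s ** 2
        hxy = gaussian_filter(img, s, order=(1, 1)) * s ** 2
        # Eigenvalues of the symmetric 2x2 Hessian, ordered |l1| <= |l2|.
        root = np.sqrt(((hxx - hyy) / 2.0) ** 2 + hxy ** 2)
        mid = (hxx + hyy) / 2.0
        ev1, ev2 = mid + root, mid - root
        l1 = np.where(np.abs(ev1) <= np.abs(ev2), ev1, ev2)
        l2 = np.where(np.abs(ev1) <= np.abs(ev2), ev2, ev1)
        rb = l1 / (l2 + 1e-12)            # blobness-vs-tubularity ratio
        st = np.sqrt(l1 ** 2 + l2 ** 2)   # second-order "structureness"
        v = np.exp(-rb ** 2 / (2 * beta ** 2)) * (1 - np.exp(-st ** 2 / (2 * c ** 2)))
        v[l2 > 0] = 0.0  # bright tubular structures have l2 < 0
        best = np.maximum(best, v)
    return best
```

For dark vessels on a bright background the sign test flips to `l2 < 0`; ready-made implementations (e.g. scikit-image's `filters.frangi`) expose this as an option.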
2.1.2 Keyframe selection
 1. Choose a candidate keyframe and extract the Farneback optical flow between it and the reference keyframe.
 2. Compute the magnitude of the extracted optical flow vector at each pixel.
 3. Calculate the cumulative value by summing up all the magnitude values.
 4. Normalize the cumulative value by the total number of pixels.
 5. If the normalized cumulative value is less than \(\tau \), move on to the next frame; otherwise, identify the candidate keyframe as a keyframe and repeat the process.
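The selection loop above can be sketched as follows. Here `flow_fn` is a placeholder for a dense flow estimator (e.g. OpenCV's Farneback implementation as used in the paper); the frame representation is an assumption for illustration:

```python
import numpy as np

def mean_flow_magnitude(flow):
    """Cumulative flow magnitude normalised by pixel count (steps 2-4)."""
    # flow: H x W x 2 displacement field; norm along the last axis gives
    # the per-pixel magnitude, and .mean() divides the sum by H*W.
    return float(np.linalg.norm(flow, axis=2).mean())

def select_keyframes(frames, flow_fn, tau):
    """Greedy keyframe selection: keep a candidate frame once the mean
    optical-flow magnitude w.r.t. the last keyframe reaches tau (step 5)."""
    keyframes = [0]  # the first frame seeds the reference keyframe
    for i in range(1, len(frames)):
        flow = flow_fn(frames[keyframes[-1]], frames[i])
        if mean_flow_magnitude(flow) >= tau:
            keyframes.append(i)  # candidate becomes the new reference
    return keyframes
```

In the real pipeline `flow_fn` would be something like `lambda a, b: cv2.calcOpticalFlowFarneback(a, b, None, 0.5, 3, 15, 3, 5, 1.2, 0)` over grayscale frames.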
2.2 Optical flow extraction
2.3 Magnetic localization system
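As background for magnetic field strength-based localization, such systems typically invert the point-dipole model of the capsule's magnet from flux measurements at known sensor positions. The model itself is standard magnetostatics (not the paper's specific sensor arrangement) and can be written down directly:

```python
import numpy as np

MU0 = 4 * np.pi * 1e-7  # vacuum permeability (T*m/A)

def dipole_field(m, r):
    """Magnetic flux density (tesla) of a point dipole.

    m: dipole moment vector (A*m^2), r: offset from dipole to sensor (m).
    B(r) = mu0/(4*pi) * (3*rhat*(m.rhat) - m) / |r|^3
    """
    m = np.asarray(m, dtype=np.float64)
    r = np.asarray(r, dtype=np.float64)
    rn = np.linalg.norm(r)
    rhat = r / rn
    return MU0 / (4 * np.pi) * (3 * rhat * (m @ rhat) - m) / rn ** 3
```

Localization then amounts to finding the pose whose predicted fields best match the sensor-array readings, e.g. by least squares; the Jacobian-based iteration of Di Natali et al. (2016) in the references is one such scheme.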
2.4 Deep CNN-RNN architecture for sensor fusion
The network was trained with the following hyperparameters:
 learning rate: 0.001
 momentum 1 (\(\beta _1\)): 0.9
 momentum 2 (\(\beta _2\)): 0.999
 epsilon: \(10^{-8}\)
 solver type: Adam
 batch size: 64
 GPU: NVIDIA K80
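For reference, a single Adam update with exactly the hyperparameters listed above (Kingma and Ba's standard formulation; the function name and numpy phrasing are ours, not the paper's training code):

```python
import numpy as np

def adam_step(theta, grad, m, v, t,
              lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam parameter update at step t (t starts at 1)."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

Because both moment estimates are bias-corrected, the very first step moves each parameter by roughly the full learning rate in the gradient's direction, regardless of the gradient's scale.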
3 Dataset
4 Evaluation
We evaluate the performance of our system both quantitatively and qualitatively in terms of trajectory estimation. We also report the computational time requirements of the method.
4.1 Trajectory estimation
The absolute trajectory error (ATE) root-mean-square error (RMSE) metric is used for quantitative comparisons; it measures the root-mean-square of the Euclidean distances between all estimated endoscopic capsule robot poses and the ground-truth poses. We created six different trajectories with various complexity levels. Overfitting, which would make the resulting pose estimator inapplicable to other scenarios, was prevented using dropout and early stopping. Dropout, which samples a sub-network of the whole network and updates only its parameters on each input batch, is a simple and extremely effective regularization technique for avoiding overfitting.
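The ATE RMSE itself is straightforward to compute once the two trajectories are expressed in a common frame; a minimal sketch over aligned position sequences (function name ours):

```python
import numpy as np

def ate_rmse(est_positions, gt_positions):
    """Root-mean-square of per-pose Euclidean distances between an
    estimated trajectory and the ground truth (both N x 3, pre-aligned)."""
    est = np.asarray(est_positions, dtype=np.float64)
    gt = np.asarray(gt_positions, dtype=np.float64)
    dists = np.linalg.norm(est - gt, axis=1)   # Euclidean error per pose
    return float(np.sqrt(np.mean(dists ** 2)))
```

Note that RMSE weights large excursions more heavily than the mean error, so a single drift spike dominates the score.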
5 Conclusion
In this study, we presented, to the best of our knowledge, the first sensor fusion method based on deep learning for endoscopic capsule robots. The proposed CNN-RNN fusion architecture achieves simultaneous learning and sequential modeling of motion dynamics across frames and magnetic data streams. Since it is trained end-to-end, there is no need to carefully hand-tune the parameters of the system, apart from the hyperparameters. In the future, we will incorporate controlled actuation into the scenario to investigate a more complete system, and we will seek ways to make the system more robust against representational singularities in the rotation data.
Acknowledgements
Open access funding provided by Max Planck Society.
References
 Arshak, K., Adepoju, F.: Capsule tracking in the GI tract: a novel microcontroller based solution. In: Sensors Applications Symposium, 2006. Proceedings of the 2006 IEEE, pp. 186–191. IEEE (2006)
 Bao, G., Pahlavan, K., Mi, L.: Hybrid localization of microrobotic endoscopic capsule inside small intestine by data fusion of vision and RF sensors. IEEE Sens. J. 15(5), 2669–2678 (2015)
 Beauchemin, S.S., Barron, J.L.: The computation of optical flow. ACM Comput. Surv. 27(3), 433–466 (1995)
 Cornelius, N., Kanade, T.: Adapting optical flow to measure object motion in reflectance and X-ray image sequences. In: Proceedings of ACM SIGGRAPH Computer Graphics, vol. 18, pp. 24–25. ACM, New York, NY (1984). https://doi.org/10.1145/988525.988537
 Clark, R., Wang, S., Wen, H., Markham, A., Trigoni, N.: VINet: visual-inertial odometry as a sequence-to-sequence learning problem. In: AAAI, pp. 3995–4001 (2017)
 Di Natali, C., Beccani, M., Simaan, N., Valdastri, P.: Jacobian-based iterative method for magnetic localization in robotic capsule endoscopy. IEEE Trans. Robot. 32(2), 327–338 (2016)
 Fischer, D., Schreiber, R., Levi, D., Eliakim, R.: Capsule endoscopy: the localization system. Gastroint. Endosc. Clin. 14(1), 25–31 (2004)
 Farnebäck, G.: Two-frame motion estimation based on polynomial expansion. In: Bigun, J., Gustavsson, T. (eds.) Image Analysis. Lecture Notes in Computer Science, vol. 2749, pp. 363–370. Springer, Heidelberg (2003)
 Frangi, A.F., Niessen, W.J., Vincken, K.L., Viergever, M.A.: Multiscale vessel enhancement filtering. In: Wells, W.M., Colchester, A., Delp, S. (eds.) Medical Image Computing and Computer-Assisted Intervention—MICCAI’98. Lecture Notes in Computer Science, vol. 1496, pp. 130–137. Springer, Heidelberg (1998)
 Geng, Y., Pahlavan, K.: Design, implementation, and fundamental limits of image and RF based wireless capsule endoscopy hybrid localization. IEEE Trans. Mob. Comput. 15(8), 1951–1964 (2016)
 Gumprecht, J.D., Lueth, T.C., Khamesee, M.B.: Navigation of a robotic capsule endoscope with a novel ultrasound tracking system. Microsyst. Technol. 19(9–10), 1415–1423 (2013)
 Hou, J., Zhu, Y., Zhang, L., Fu, Y., Zhao, F., Yang, L., Rong, G.: Design and implementation of a high resolution localization system for in-vivo capsule endoscopy. In: Dependable, Autonomic and Secure Computing, 2009. DASC’09. Eighth IEEE International Conference on, pp. 209–214. IEEE (2009)
 Popek, K.M., Mahoney, A.W., Abbott, J.J.: Localization method for a magnetic capsule endoscope propelled by a rotating magnetic dipole field. In: 2013 IEEE International Conference on Robotics and Automation (ICRA), pp. 5348–5353. IEEE (2013)
 Son, D., Dogan, M.D., Sitti, M.: Magnetically actuated soft capsule endoscope for fine-needle aspiration biopsy. In: 2017 IEEE International Conference on Robotics and Automation (ICRA), pp. 1132–1139. IEEE (2017)
 Son, D., Yim, S., Sitti, M.: A 5-D localization method for a magnetically manipulated untethered robot using a 2-D array of Hall-effect sensors. IEEE/ASME Trans. Mechatron. 21(2), 708–716 (2016)
 Than, T.D., Alici, G., Harvey, S., O’Keefe, G., Zhou, H., Li, W., Cook, T., Alam-Fotias, S.: An effective localization method for robotic endoscopic capsules using multiple positron emission markers. IEEE Trans. Robot. 30(5), 1174–1186 (2014)
 Turan, M., Almalioglu, Y., Araujo, H., Konukoglu, E., Sitti, M.: A non-rigid map fusion-based RGB-depth SLAM method for endoscopic capsule robots. arXiv preprint arXiv:1705.05444 (2017)
 Umay, I., Fidan, B.: Adaptive magnetic sensing based wireless capsule localization. In: 2016 10th International Symposium on Medical Information and Communication Technology (ISMICT), pp. 1–5. IEEE (2016)
 Umay, I., Fidan, B.: Adaptive wireless biomedical capsule tracking based on magnetic sensing. Int. J. Wirel. Inf. Netw. 24(2), 189–199 (2017)
 Umay, I., Fidan, B., Barshan, B.: Localization and tracking of implantable biomedical sensors. Sensors 17(3), 583 (2017)
 Wang, L., Hu, C., Tian, L., Li, M., Meng, M.Q.H.: A novel radio propagation radiation model for location of the capsule in GI tract. In: 2009 IEEE International Conference on Robotics and Biomimetics (ROBIO), pp. 2332–2337. IEEE (2009)
 Wang, Y., Fu, R., Ye, Y., Khan, U., Pahlavan, K.: Performance bounds for RF positioning of endoscopy camera capsules. In: 2011 IEEE Topical Conference on Biomedical Wireless Technologies, Networks, and Sensing Systems (BioWireleSS), pp. 71–74. IEEE (2011)
 Ye, Y.: Bounds on RF cooperative localization for video capsule endoscopy. Ph.D. thesis, Worcester Polytechnic Institute (2013)
 Yim, S., Sitti, M.: 3-D localization method for a magnetically actuated soft capsule endoscope and its applications. IEEE Trans. Robot. 29(5), 1139–1151 (2013)
Copyright information
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.