Abstract
Navigated transcranial magnetic stimulation (nTMS) is a valuable tool for non-invasive brain stimulation. Currently, nTMS requires fixing markers to the patient’s head. Head-marker displacements lead to changes in coil placement and inaccurate brain stimulation. A markerless neuronavigation method is needed to increase the reliability of nTMS and to simplify the nTMS protocol. In this study, we introduce and release MarLe, a Python-based markerless head-tracking neuronavigation software for TMS. This novel software uses computer-vision techniques combined with low-cost cameras to estimate the head pose for neuronavigation. A coregistration algorithm, based on a closed-form solution, was designed to track the patient’s head and the TMS coil referenced to the individual’s brain image. We show that MarLe can estimate the head pose based on real-time video processing. An intuitive pipeline was developed to connect MarLe and the nTMS neuronavigation software. MarLe achieved acceptable accuracy and stability in a mockup nTMS experiment. MarLe allows real-time tracking of the patient’s head without any markers. The combination of face detection and a coregistration algorithm can overcome concerns about nTMS head-marker displacement. MarLe can improve the reliability, simplify the protocol, and reduce the protocol time of brain intervention techniques such as nTMS.
Introduction
Neuronavigation is crucial for transcranial magnetic stimulation (TMS) applications to increase the stimulation accuracy, reproducibility, and to reduce response variability [1,2,3]. Neuronavigation uses a tracking device to monitor the position and orientation of the TMS coil relative to the patient’s head. Tracking devices utilize markers or sensors fixed on the patient’s head and on the TMS coil for real-time position tracking [4]. However, marker fixation requires caution, as it must remain static during the entire treatment or experimental protocol. The accuracy of neuronavigation is compromised if the tracking markers move with respect to the brain. Small head-marker displacements can be left unnoticed by the operator, leading to critical inaccuracies in monitoring the TMS coil position relative to the brain [5].
In TMS, a strong current pulse applied to a coil placed on the scalp induces an electric field in the cortical tissue that can depolarize neurons [6]. Navigated TMS has been used in clinical applications for the treatment of neurological disorders [7, 8], in basic neuroscience research [9,10,11], and for mapping cortical functions prior to neurosurgery [12], such as motor [13] and speech areas [14, 15]. However, slight changes in the coil placement may cause considerable unintended changes in the physiological responses [3, 5, 16, 17]. In conventional navigated TMS, head markers are susceptible to displacements during coil positioning and can be challenging when combined with electroencephalography caps [18, 19]. Markerless tracking of the head position would increase the reliability of navigation and make the experimental procedure simpler and more accurate.
Face detection and recognition are well-established computer-vision techniques. Existing algorithms can accurately distinguish face from non-face images and can discriminate between different faces [20,21,22]. These techniques have been used in various applications, such as surveillance systems and neuroscience studies based on facial-expression and behavior recognition [23]. The combination of face detection and recognition enables real-time head-pose estimation from video recordings through facial landmark detection [24]. Automated head-pose estimation opens the possibility of developing markerless neuronavigation. Studies have shown that markerless neuronavigation can aid in radiotherapy [25, 26], positron emission tomography [27], and neurosurgery [28]. However, the combination of markerless head-pose estimation and TMS is a novel methodology.
To overcome the limitations imposed by physical head-tracker displacements, we developed a markerless head-tracking neuronavigation method for TMS, which we call MarLe. The main contributions of this study are: (1) a new method based on real-time head-pose estimation that can be used with low-cost cameras and multiple tracking devices for nTMS; (2) an implementation of the MarLe algorithm distributed in our open-source neuronavigation system InVesalius [3]; (3) a full characterization of the accuracy and reliability of TMS navigation with markerless head-pose estimation; and (4) a streamlined process employing MarLe, which has the potential to enhance the ease of TMS targeting and reduce the duration of TMS procedures.
Methods
MarLe was implemented as a Python library distributed via cross-platform binary wheel files. Apart from conventional Python libraries, such as NumPy, the main dependencies of MarLe are:
1. OpenCV library [29] for processing, calibration, and camera communication
2. Dlib library [30] for face detection and head-pose estimation
MarLe combines head-pose estimation from live video streaming with tool tracking. The live video can be provided by sufficiently high-definition video cameras, while the tool tracking is optimally performed by dedicated spatial tracking devices. In this study, we validated and characterized the MarLe algorithm with three different video cameras: the built-in live-stream video camera (resolution of 2048 × 1536 pixels; 20 frames per second (FPS)) of the video camera unit (VCU) Polaris Vega VT (Northern Digital Inc., Canada), and two low-cost webcams, the c270 (Logitech, Switzerland; resolution of 1280 × 720 pixels; 30 FPS) and the c920 (Logitech, Switzerland; 1920 × 1080 pixels; 30 FPS). Tracking of the TMS coil and of the fiducial-collection probe was performed with the Polaris Vega VT infrared camera and markers.
Camera calibration
The camera calibration converts real-world three-dimensional (3D) position measurements to the camera’s coordinate system. Overall, the calibration estimates the intrinsic and extrinsic parameters of the camera. The intrinsic parameters describe the camera geometry and lens distortion. The extrinsic parameters are the rotation and translation matrix of the camera relative to the tracked object. The intrinsic and extrinsic parameters are input arguments for MarLe. The camera calibration algorithm followed the OpenCV checkerboard-pattern calibration [31]. First, each camera captured 1000 checkerboard image samples from different viewpoints. Next, the camera parameters were computed based on Zhang’s [32] closed-form solution. Given the camera parameters, we took the detected coordinates of all checkerboard corners and transformed them back to the two-dimensional (2D) camera coordinate system. Finally, we verified the camera calibration accuracy by computing the re-projection error, calculated as the absolute norm between the transformed and the detected checkerboard corner locations. The camera calibration algorithm was developed independently from MarLe.
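The re-projection check at the end of the calibration can be sketched as follows. This is an illustrative NumPy implementation of the error metric only, not MarLe's calibration code; the intrinsic matrix, board geometry, and detection offset are hypothetical values chosen for the example:

```python
import numpy as np

def project(points_3d, K, R, t):
    """Pinhole projection of 3D world points to pixel coordinates."""
    cam = (R @ points_3d.T).T + t          # world frame -> camera frame
    uv = K @ cam.T                         # apply the intrinsic matrix
    return (uv[:2] / uv[2]).T              # perspective division -> (N, 2) pixels

def reprojection_error(detected, points_3d, K, R, t):
    """Mean Euclidean distance (pixels) between detected and reprojected corners."""
    return np.linalg.norm(detected - project(points_3d, K, R, t), axis=1).mean()

# Hypothetical intrinsics for a 1280x720 webcam; extrinsics place a 9x6
# checkerboard with 25-mm squares 1 m in front of the camera.
K = np.array([[900.0, 0.0, 640.0],
              [0.0, 900.0, 360.0],
              [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.array([0.0, 0.0, 1.0])
board = np.array([[x * 0.025, y * 0.025, 0.0] for y in range(6) for x in range(9)])

corners = project(board, K, R, t)                 # ideal corner detections
noisy = corners + 0.5                             # constant 0.5-px detection offset
err = reprojection_error(noisy, board, K, R, t)   # = 0.5 * sqrt(2) pixels here
```

In the actual pipeline, `cv2.calibrateCamera` returns the intrinsic and extrinsic parameters and `cv2.projectPoints` would replace the hand-written projection; the error computation is the same.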
MarLe workflow
The workflow of MarLe has five steps:
1. Establish communication between the camera and the neuronavigation system;
2. Define the transformation matrix between the TMS coil tracking device and the camera;
3. Detect the face;
4. Transform the head-pose estimation to the tracking device coordinate system;
5. Apply filtering to reduce jittering and measurement errors.
OpenCV was used to establish the communication between the camera device and the MarLe software. The user can set which camera will be used if multiple cameras are connected to the computer. The GStreamer backend, along with OpenCV, was used to communicate with the VCU Vega VT. Once the camera communication is established, MarLe determines the position and orientation of the head. The MarLe algorithm employs the Dlib library together with a pre-trained network of facial landmarks to estimate the real-time 2D locations of 68 facial landmarks [33]. The pre-trained network was generated based on the iBUG 300-W facial dataset [34]. Then, we use OpenCV, together with the camera calibration parameters, to find the projection of the 2D facial landmarks to 3D. The projection was performed using a 3D anthropometric model proposed by Martins and Batista [24]. We selected 14 facial structures for the 3D fitting: the inner and outer corners of both eyebrows, the inner and outer corners of both eyes, the right and left bottom corners of the nostrils, both labial commissures and the middle lower lip of the mouth, and finally the chin, as shown in Fig. 1a.
We should note that MarLe provides only the markerless tracking of the head position, requiring a second device to track the TMS coil position. MarLe can be operated with any tracking device supported by the neuronavigation system. MarLe and the secondary tracking device have different coordinate systems. Therefore, coregistration is required to operate both trackers in the same coordinate system. The coregistration consists of simultaneously acquiring the head pose in both coordinate systems. MarLe collects the head pose based on the face detection, and the secondary tracker collects the pose of a marker attached to the subject’s head, following the conventional neuronavigation workflow. We developed a graphical user interface to perform the coregistration between the two camera systems. The coregistration requires at least 500 head positions from each camera. The head-pose estimation is transformed to the secondary tracking device coordinate system using a closed-form solution through the equation:
$${\mathbf{T}}_{tracker}^{marker}\,{\mathbf{T}}_{camera}^{tracker}={\mathbf{T}}_{head pose}^{marker}\,{\mathbf{T}}_{camera}^{head pose},$$

where \({\mathbf{T}}_{camera}^{tracker}\) and \({\mathbf{T}}_{head pose}^{marker}\) are homogeneous transformation matrices from the camera to the tracking device and from the head pose to the head tracker, respectively. The \({\mathbf{T}}_{tracker}^{marker}\) and \({\mathbf{T}}_{camera}^{head pose}\) matrices are head-pose-paired measurements from MarLe and the secondary tracking device. Once the coordinate systems of the head-pose estimation and of the secondary tracking device are fixed, we compute the unknown transformation matrices from the head-pose-paired measurements (\({\mathbf{T}}_{tracker}^{marker}\) and \({\mathbf{T}}_{camera}^{head pose}\)) using the least-squares method. Figure 1b illustrates the transformation matrices. Once the coregistration is done, the head marker is removed from the subject’s head.
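The least-squares principle behind the coregistration can be sketched with a standard SVD-based (Kabsch) rigid registration of paired 3D positions. MarLe's closed-form solver operates on full paired poses, so the following is only an illustration of recovering a rotation and translation between two tracker coordinate systems from synthetic paired measurements:

```python
import numpy as np

def rigid_register(src, dst):
    """Least-squares rigid transform (R, t) such that dst ≈ src @ R.T + t (Kabsch)."""
    src_c = src - src.mean(axis=0)
    dst_c = dst - dst.mean(axis=0)
    U, _, Vt = np.linalg.svd(src_c.T @ dst_c)      # covariance H = src_c^T dst_c
    d = np.sign(np.linalg.det(Vt.T @ U.T))         # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst.mean(axis=0) - R @ src.mean(axis=0)
    return R, t

# Simulate 500 paired head positions, as collected during coregistration:
# the same physical points expressed in the camera and tracker frames.
rng = np.random.default_rng(7)
cam_pts = rng.normal(0.0, 50.0, (500, 3))          # mm, camera frame (synthetic)
angle = np.deg2rad(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle), np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([100.0, -40.0, 250.0])
trk_pts = cam_pts @ R_true.T + t_true              # same points, tracker frame

R_est, t_est = rigid_register(cam_pts, trk_pts)
```

With noiseless pairs the transform is recovered exactly; with real, noisy measurements the 500-pair redundancy averages out the per-sample error.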
High-frequency noise oscillations, i.e., spatial jittering, in the head-pose measurements decrease the head-tracking accuracy [35]. The final step is a real-time smoothing filter to minimize the jittering of the head-pose estimation. We implemented a Savitzky–Golay filter [36] based on the Python library SciPy [37]. This filter uses a convolution process to fit successive data subsets with a low-degree polynomial; the data subsets are given by a fixed window size. A first-order polynomial filter was used with a window size of 5 frames. We also evaluated the responses of Kalman and Grubbs filters, as described in the Supplementary Material. After comparing their performances, we chose the Savitzky–Golay filter due to its superior attenuation of jittering.
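The smoothing step can be sketched with SciPy's implementation, using the same first-order polynomial and 5-frame window described above; the simulated coordinate trace and noise level are illustrative:

```python
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(42)
t = np.linspace(0.0, 6.0, 180)                # 6 s of one coordinate at 30 FPS
truth = 100.0 + 0.5 * np.sin(0.8 * t)         # slow head drift (mm, synthetic)
raw = truth + rng.normal(0.0, 0.3, t.size)    # ~0.3-mm spatial jitter on top

# First-order Savitzky-Golay filter with a 5-frame window, as used in MarLe.
smooth = savgol_filter(raw, window_length=5, polyorder=1)
```

With `polyorder=1` and a symmetric 5-sample window, the filter behaves like a short moving average at the window center, attenuating the jitter while preserving slow head motion; in a causal real-time setting, the window introduces a delay of roughly half its length, which motivates keeping it small.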
Characterization
We characterized the stability of MarLe for different distances between the camera and the subject’s head, the jittering, and the accuracy in repositioning the TMS coil. To measure the stability, we used a frontal photo of a human face with a neutral expression, printed at the original head size on A4 office paper. We collected the head pose at three distances between the camera and the face picture: 100, 125, and 150 cm. For each condition, the head pose was recorded every two seconds for three minutes. To evaluate the effect of the filter on the accuracy and stability, we recorded the head pose before and after applying the Savitzky–Golay filter. These measurements were repeated three times for each of the three cameras: Logitech c270, Logitech c920, and the VCU of the Polaris Vega VT. The jittering was estimated as the 95% interval (1.96 times the standard deviation) of each acquired coordinate, using the same acquisition as in the stability evaluation, with a 100-cm camera–face distance, with filtering, for the three cameras.
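The stability and jittering metrics described above can be sketched as follows; the static pose trace is synthetic and its noise level is illustrative, not measured data:

```python
import numpy as np

rng = np.random.default_rng(0)
# 3-minute static acquisition sampled every 2 s -> 90 poses (x, y, z in mm).
poses = np.array([120.0, -15.0, 1000.0]) + rng.normal(0.0, 0.2, (90, 3))

# Stability: SD of the Euclidean distance of each pose from the mean pose.
dist = np.linalg.norm(poses - poses.mean(axis=0), axis=1)
stability = dist.std()

# Jittering: half-width of the 95% interval (1.96 * SD) per coordinate.
jitter = 1.96 * poses.std(axis=0)
```

With a 0.2-mm per-axis noise SD, the per-axis jitter comes out near 1.96 × 0.2 ≈ 0.39 mm, well within the ±2-mm range reported in the Results.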
For validating the markerless navigation, we implemented MarLe on the neuronavigation software platform InVesalius [38]. InVesalius is an open-source and free software for navigated TMS. InVesalius supports multiple tracking devices. We characterized and demonstrated the MarLe with the Polaris Vega VT as the secondary tracking device. The accuracy of MarLe in repositioning the TMS coil was evaluated in a mockup experiment that followed a conventional TMS procedure [2] except that no TMS pulses were delivered to the subject. The study was approved by the local ethics committee of the University of São Paulo (CAAE: 54674416.9.0000.5407) in accordance with the Declaration of Helsinki.
Figure 2 depicts the experimental setup. The participant (the first author: a 29-year-old man with no known neurological disorders or visible facial deformations) was instructed to sit in a chair and to stay relaxed with a neutral facial expression. For neuronavigation, we used a T1-weighted magnetic resonance image acquired with a volumetric gradient-echo sequence (voxel size 1 × 1 × 1 mm3; 240 × 240 × 240 acquisition matrix) in an Achieva 3 T scanner (Philips Healthcare, Netherlands). A figure-of-eight TMS coil (Neurosoft, Russia) was placed over the scalp directly above the left hand knob of the primary motor cortex, and the scalp coordinate was set as the target in InVesalius. The coil was oriented approximately perpendicular to the central sulcus. The revisiting experimental procedure included three steps: (1) coregistration between MarLe and the secondary tracker device, (2) neuronavigation coregistration, and (3) repositioning of the TMS coil. For the coregistration of the trackers, we collected 500 paired poses. The neuronavigation coregistration was performed using three fiducial landmarks: the nasion and the right and left ear tragi [3]. The TMS coil was initially placed on a side table, which was defined as the home position. The coil was repositioned 10 times, alternating between the home position and the scalp target. The InVesalius Guiding interface was used to place the coil at the target, and the coil coordinates were saved when the user reached the target within a 3-mm and 3° range. The experimental procedure was repeated three times for each of the three cameras. The experimental procedures followed our previous characterization protocol employed for the InVesalius Navigator [3].
Statistical analysis
The stability was evaluated as the standard deviation (SD) of the difference of the translation (Euclidean distance) and the three orientation vectors from the average pose coordinates during the acquisition. The effects of the camera model, the camera–face distance, and the presence of the filter were assessed with a two-way analysis of variance (ANOVA). The jittering of the translation axes and rotation angles was evaluated with a one-way ANOVA. The accuracy of revisiting a TMS coil position was estimated as the average Euclidean distance and angle difference between the acquired coil pose and the predefined coil target. We used a two-way ANOVA to evaluate whether the camera model and the coordinate axes (translation, yaw, pitch, and roll) affect the revisiting accuracy. Tukey HSD post-hoc multiple comparisons were performed for stability, jittering, and repeatability of coil positioning; the threshold for statistical significance was set at p < 0.05. The software R 4.2 (R Core Team, Austria) was used for all statistical analyses.
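The statistics were run in R, but an equivalent one-way ANOVA can be reproduced in Python with SciPy. The three groups below are synthetic stand-ins for, e.g., per-camera accuracy measurements, not the study's data:

```python
import numpy as np
from scipy.stats import f_oneway

# Synthetic accuracy samples (mm) for three hypothetical camera conditions.
cam_a = np.array([1.2, 1.4, 1.3, 1.5, 1.1, 1.3])
cam_b = np.array([1.3, 1.2, 1.4, 1.3, 1.5, 1.2])
cam_c = np.array([2.4, 2.6, 2.5, 2.7, 2.3, 2.5])   # clearly worse condition

f_stat, p_value = f_oneway(cam_a, cam_b, cam_c)    # one-way ANOVA across groups
significant = p_value < 0.05
```

A Tukey HSD post-hoc test (e.g., `statsmodels.stats.multicomp.pairwise_tukeyhsd`) would then identify which specific pair of conditions differs.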
Results
MarLe stability and jittering
The measured stability for each distance, filter condition, and camera is presented in Table 1. We observed that the stability, with filtering, varied across camera models (\({F}_{\text{2,16191}}=7.88\); p < 0.001). The average standard deviation of the c270 was 0.02 mm smaller than that of the VCU Vega VT (p < 0.001) and 0.01 mm larger than that of the c920 (p = 0.02). The average standard deviation of the non-filtered coordinates was 68 ± 5% higher than that of the filtered coordinates for the camera c270, 70 ± 2% higher for the camera c920, and 37 ± 9% higher for the built-in camera of the Vega VT. The filtered coordinates did not reveal relevant differences between the camera–face distances (\({F}_{\text{2,16191}}=0.94\); p = 0.39). The unfiltered coordinates did not reveal relevant differences between the cameras (\({F}_{\text{2,16191}}=2.22\); p = 0.11) or between the camera–face distances (\({F}_{\text{2,16191}}=0.10\); p = 0.90). No difference was found between the three trials (p = 0.99).
MarLe jittering was estimated with a camera–head distance of 100 cm; the results are illustrated in Fig. 3. The chart points (black circles) are the distances between the measured head pose and the average head-pose coordinate, recorded over 180 s for each translation axis (x, y, z) and rotation angle (yaw, pitch, roll). The red dashed lines represent the jittering, i.e., the 95% intervals (1.96 times the standard deviation) of the static head-pose estimation. The camera c270 showed a significant difference between the translation axes (Fig. 3a; p < 0.001; \({F}_{\text{2,897}}=21.30\)). The jittering for the x axis is 0.11 mm smaller than for the y axis (Fig. 3a; p < 0.001) and 0.14 mm smaller than for the z axis (Fig. 3a; p < 0.001). No difference was found for the rotation angles (Fig. 3a; p = 0.71; \({F}_{\text{2,897}}=0.34\)). We did not find any difference between the translation axes or rotation angles for the cameras c920 (Fig. 3b; translation: p = 0.84; \({F}_{\text{2,897}}=0.18\); rotation: p = 0.39; \({F}_{\text{2,897}}=0.94\)) and VCU Vega VT (Fig. 3c; translation: p = 0.19; \({F}_{\text{2,897}}=1.67\); rotation: p = 0.52; \({F}_{\text{2,897}}=0.61\)). The smallest deviation was found for the c920: ± 0.12 mm for the y axis and ± 0.06° for the roll rotation (Fig. 3b). The largest deviations, ± 1.78 mm for the z axis and ± 0.62° for the yaw rotation, were both obtained for the Vega VT camera (Fig. 3c). The 95% intervals were within the range of ± 2 mm and ± 1° for all cameras.
Accuracy of revisiting a TMS target
We evaluated the accuracy of MarLe for revisiting a TMS coil placement in the mockup navigated TMS experiment. The coregistration error between the camera and the TMS coil tracker was 1.56 ± 0.96 mm for the built-in Vega VT, 1.48 ± 1.10 mm for the low-cost camera c270, and 1.38 ± 0.84 mm for the camera c920. The neuronavigation fiducial registration error was lower than 3 mm for all cameras and trials. The comparison of the translation vector and the rotation angles between the three cameras is illustrated in Fig. 4. The difference between the collected coordinates and the target varied depending on the rotation angle (\({F}_{\text{2,288}}=12.32\); p < 0.001) and the camera model (\({F}_{\text{2,288}}=4.63\); p = 0.01). However, the camera model had no significant effect on the translation vector (Fig. 4a; \({F}_{\text{2,96}}=1.79\); p = 0.17). MarLe showed a significant deviation of 0.43° between pitch and roll (Fig. 4c and d; p < 0.001) and of 0.55° between yaw and roll (Fig. 4b and d; p < 0.001). The cameras c270 and c920 showed a significant deviation of 0.86° in pitch (Fig. 4c; p = 0.002).
Discussion
During conventional navigated TMS, the markers or sensors fixed on the patient’s head are susceptible to displacements. Uncontrollable factors, such as sweating or oily skin, can cause the markers to move from their initial position, or the skin may move due to head muscle activation, leading to neuronavigation inaccuracies. Neuronavigation systems cannot detect head-marker displacements; the mismatch detection depends on the user’s expertise [39].
To overcome these limitations, we developed a markerless head-tracking algorithm for navigated TMS, named MarLe, which can estimate the head pose based on real-time video processing. We combined, in a single device, the optimal accuracy of infrared marker tracking for the TMS coil with MarLe running on the built-in video camera. Alternatively, standalone low-cost cameras can be used to perform the head-pose estimation. In a simulated TMS experiment, neuronavigation with MarLe achieved acceptable accuracy and stability with the built-in Vega VT camera and with two low-cost webcams, comparable to dedicated tracking devices supported by InVesalius [3].
The camera calibration errors were below 1 mm for all cameras. Camera calibration is a critical component of accurate tracking [31]. Our MarLe algorithm estimates the head pose by relating the camera units (pixels) to the physical world (meters) based on the camera calibration parameters; therefore, the camera calibration error affects the navigation accuracy. The neuronavigation error is mainly due to coregistration variability, tracking device inaccuracy, and distortions in the anatomical images [40]. The typically recommended error limits for neuronavigation systems are 3 mm and 3° for positional and orientation accuracy, respectively, i.e., the inherent error of a neuronavigation system in reaching a target [4]. Some studies define the neuronavigation error based on the 95th percentile of the distribution of Euclidean distances between the target and the collected coordinate; in this approach, the accepted neuronavigation error is up to 3–4 mm [3, 41]. The stability and jittering experiments enabled the assessment of the tracking device accuracy, while revisiting a TMS coil placement assessed the overall neuronavigation error with MarLe.
Our measurements revealed that the distance between the camera and the face, ranging from 100 to 150 cm, did not affect the stability of MarLe’s head-pose estimation. One likely explanation is that we were operating within the camera’s optimal measurement volume. The MarLe operational distance range provides setup flexibility for the users, reducing the time spent on tracker arrangement for navigated TMS. The stability of MarLe, for filtered coordinates, depends on the resolution and acquisition frame rate of the camera. The VCU Vega VT has the highest resolution (2048 × 1536 pixels), followed by the c920 (1920 × 1080), with the c270 having the lowest resolution (1280 × 720); the resolution affects the accuracy of head-pose determination [31]. A higher resolution means a smaller pixel size, providing a better capability to detect small head movements. MarLe’s head-pose estimation stability seems to be the same for cameras with resolutions of 2048 × 1536 pixels and 1920 × 1080 pixels, but a lower resolution may affect the stability.
The stability of the non-filtered coordinates is not affected by the camera device or the camera–face distance. In turn, filtering can increase stability, but this depends on the frame rate; namely, the Savitzky–Golay filter is affected by the amount of input data over time [42]. A high acquisition rate results in a stronger filter effect, increasing the smoothing and stability of the head-pose estimation. However, the resolution seems to have no effect on the filtering. For example, the built-in camera of the Vega VT has the lowest frame rate (20 Hz) and the smallest increase in stability (37 ± 9%) after filtering. The cameras c270 and c920 have a 30-Hz acquisition rate; both have similar stability increases after filtering, 68 ± 5% and 70 ± 2%, respectively. The Savitzky–Golay filter makes an important contribution to increasing the stability and, consequently, the accuracy of MarLe. It should be noted that the Savitzky–Golay filter parameters were optimized to have the lowest delay for a given smoothness. The first-order polynomial was used due to its good response, in terms of smoothness, at low frequencies (less than 100 Hz) [43]. The window size affects the time response; a large window size introduces delays in the filtered signals. We found five frames to be optimal: there was no noticeable visual delay, and a good filter response was obtained.
The MarLe jittering for all cameras was less than 2 mm and 1° for translation and rotation, i.e., lower than the acceptance limits of 3 mm and 3°. Interestingly, fluctuations when using the MarLe algorithm are clearly smaller than those obtained with other head-pose estimation algorithms published so far [24, 44]. Martins and Batista [24] found a jitter of 1 cm and 2°. This discrepancy may be caused by the face-detection algorithm: Martins and Batista [24] used a statistical matching method (active appearance model) to track facial characteristics, whereas we use a pre-trained network based on the 300-W face dataset to perform facial landmark detection [34]. This dataset has 300 indoor and 300 outdoor images covering a broad range of ages, sexes, facial expressions, face sizes, face shapes, skin tones, illumination conditions, and occlusions, which may provide an improved performance of markerless navigation under varying room conditions and for distinct facial characteristics [34].
The z translation axis has the highest deviation, as illustrated in Fig. 3. The z direction defines the distance between the source, i.e., the camera, and the tracked object, i.e., the face. The head-pose estimation algorithm is based on a face model; once the algorithm detects a face, a closed-form solution finds the scaling required to fit the face model to the detected face. The translation along z is derived from this scaling factor, making it more susceptible to fluctuations than the x and y axes [31]. However, no significant difference was found between the translation axes (x, y, z) or between the rotation angles (yaw, pitch, roll) for the c920 and the built-in Vega VT. The camera c270 showed a deviation only between the translation axes, which might be caused by its low resolution [31].
Finally, the MarLe accuracy for revisiting a coil placement was in a range of 3–4 mm and 3–4° for the 95th percentile, corroborating previous findings [3, 41]. As was expected, MarLe connected to c270 showed the highest deviations for revisiting a TMS target. Again, this association might point toward low camera resolution [31].
Limitations
It is important to highlight that the TMS coil tracker and the camera must remain stationary throughout the TMS session. The transformation between the camera and the coil tracker is based on the physical location of the devices; displacements invalidate the transformation matrix, and the coregistration must be redone. However, this limitation can be mitigated with the Polaris Vega VT, as its video camera and infrared coil tracker are built into a single device.
We should note that the MarLe characterization and the simulated TMS experiment were performed with neutral facial expressions in a well-illuminated room, which is the most common condition in a navigated TMS experiment with healthy participants. The facial landmark detection method implemented in MarLe has excellent performance for a variety of facial characteristics [34]. However, in this study, the specific effects of varying conditions, such as skin tone, sex, and age, on TMS targeting accuracy and stability were not characterized; these are planned to be explored in a future study. Furthermore, outdoor scenes with poor illumination and facial expressions such as surprise and screaming can negatively impact the performance of data-driven face detection methods [34]. For MarLe, we visually observed that the facial detection stability decreases when the participant smiles or talks. One possible solution is to combine facial expression recognition with face detection to warn the user when a non-neutral expression is identified, avoiding unstable pose estimations. One might also be able to teach the algorithm to use facial landmarks that do not move when facial expressions change, improving stability and accuracy in dynamic conditions. Lastly, we did not test our method on participants with facial deformities. The heterogeneity of facial deformities may impose difficulties in obtaining accurate and generalizable landmark detection, which may provide an important topic to be investigated [45].
Conclusion
We developed a markerless head-pose estimation algorithm for navigated TMS that utilizes low-cost cameras. We successfully demonstrated the registration of the neuronavigation system, the tracking device, and the camera used for head-pose estimation through transformation matrices. MarLe has the potential to improve the reliability and ease of TMS targeting, simplifying the procedure and reducing the time needed to determine the head pose, as there is no need to be concerned about the displacement of head-tracker markers.
Abbreviations
ANOVA: Analysis of variance
FPS: Frames per second
nTMS: Navigated transcranial magnetic stimulation
SD: Standard deviation
3D: Three-dimensional
TMS: Transcranial magnetic stimulation
2D: Two-dimensional
VCU: Video camera unit
References
Lefaucheur J-P (2010) Why image-guided navigation becomes essential in the practice of transcranial magnetic stimulation. Neurophysiol Clin 40:1–5. https://doi.org/10.1016/j.neucli.2009.10.004
Julkunen P (2014) Methods for estimating cortical motor representation size and location in navigated transcranial magnetic stimulation. J Neurosci Methods 232:125–133. https://doi.org/10.1016/j.jneumeth.2014.05.020
Souza VH, Matsuda RH, Peres ASC et al (2018) Development and characterization of the InVesalius Navigator software for navigated transcranial magnetic stimulation. J Neurosci Methods 309:109–120. https://doi.org/10.1016/j.jneumeth.2018.08.023
Ruohonen J, Karhu J (2010) Navigated transcranial magnetic stimulation. Neurophysiol Clin 40:7–17. https://doi.org/10.1016/j.neucli.2010.01.006
Nieminen AE, Nieminen JO, Stenroos M et al (2022) Accuracy and precision of navigated transcranial magnetic stimulation. J Neural Eng 19. https://doi.org/10.1088/1741-2552/aca71a
Barker AT, Jalinous R, Freeston IL (1985) Non-invasive magnetic stimulation of human motor cortex. The Lancet 325:1106–1107. https://doi.org/10.1016/S0140-6736(85)92413-4
Rossini PM, Burke D, Chen R et al (2015) Non-invasive electrical and magnetic stimulation of the brain, spinal cord, roots and peripheral nerves: basic principles and procedures for routine clinical and research application: an updated report from an I.F.C.N. Committee. Clin Neurophysiol 126:1071–1107. https://doi.org/10.1016/j.clinph.2015.02.001
Somaa FA, de Graaf TA, Sack AT (2022) Transcranial magnetic stimulation in the treatment of neurological diseases. Front Neurol 13. https://doi.org/10.3389/fneur.2022.793253
Nazarova M, Novikov P, Ivanina E et al (2021) Mapping of multiple muscles with transcranial magnetic stimulation: absolute and relative test–retest reliability. Hum Brain Mapp 42:2508–2528. https://doi.org/10.1002/hbm.25383
Tardelli GP, Souza VH, Matsuda RH et al (2022) Forearm and hand muscles exhibit high coactivation and overlapping of cortical motor representations. Brain Topogr 35:322–336. https://doi.org/10.1007/s10548-022-00893-1
Souza VH, Nieminen JO, Tugin S et al (2022) TMS with fast and accurate electronic control: measuring the orientation sensitivity of corticomotor pathways. Brain Stimul 15:306–315. https://doi.org/10.1016/j.brs.2022.01.009
Haddad AF, Young JS, Berger MS, Tarapore PE (2021) Preoperative applications of navigated transcranial magnetic stimulation. Front Neurol 11. https://doi.org/10.3389/fneur.2020.628903
Umana GE, Scalia G, Graziano F et al (2021) Navigated transcranial magnetic stimulation motor mapping usefulness in the surgical management of patients affected by brain tumors in eloquent areas: a systematic review and meta-analysis. Front Neurol 12. https://doi.org/10.3389/fneur.2021.644198
Natalizi F, Piras F, Vecchio D et al (2022) Preoperative navigated transcranial magnetic stimulation: new insight for brain tumor-related language mapping. J Pers Med 12:1589. https://doi.org/10.3390/jpm12101589
Lioumis P, Zhdanov A, Mäkelä N et al (2012) A novel approach for documenting naming errors induced by navigated transcranial magnetic stimulation. J Neurosci Methods 204:349–354. https://doi.org/10.1016/j.jneumeth.2011.11.003
Souza VH, Vieira TM, Peres ASC et al (2018) Effect of TMS coil orientation on the spatial distribution of motor evoked potentials in an intrinsic hand muscle. Biomedical Engineering / Biomedizinische Technik 63:635–645. https://doi.org/10.1515/bmt-2016-0240
Nieminen JO, Sinisalo H, Souza VH et al (2022) Multi-locus transcranial magnetic stimulation system for electronically targeted brain stimulation. Brain Stimul 15:116–124. https://doi.org/10.1016/j.brs.2021.11.014
Lioumis P, Rosanova M (2022) The role of neuronavigation in TMS–EEG studies: current applications and future perspectives. J Neurosci Methods 380:109677. https://doi.org/10.1016/j.jneumeth.2022.109677
Nobakhsh B, Shalbaf A, Rostami R et al (2022) An effective brain connectivity technique to predict repetitive transcranial magnetic stimulation outcome for major depressive disorder patients using EEG signals. Phys Eng Sci Med. https://doi.org/10.1007/s13246-022-01198-0
Rowley HA, Baluja S, Kanade T (1998) Neural network-based face detection. IEEE Trans Pattern Anal Mach Intell 20:23–38. https://doi.org/10.1109/34.655647
Hangaragi S, Singh T, N N (2023) Face detection and recognition using face mesh and deep neural network. Procedia Comput Sci 218:741–749. https://doi.org/10.1016/j.procs.2023.01.054
Zhang H, Chi L (2020) End-to-end spatial transform face detection and recognition. Virtual Real Intell Hardw 2:119–131. https://doi.org/10.1016/j.vrih.2020.04.002
Lisetti CL, Schiano DJ (2000) Automatic facial expression interpretation. Pragmat Cogn 8:185–235. https://doi.org/10.1075/pc.8.1.09lis
Martins P, Batista J (2008) Single view head pose estimation. In: 2008 15th IEEE International Conference on Image Processing. IEEE, pp 1652–1655
Miura H, Ozawa S, Matsuura T et al (2017) Proposed patient motion monitoring system using feature point tracking with a web camera. Australas Phys Eng Sci Med 40:939–942. https://doi.org/10.1007/s13246-017-0589-4
Ohashi A, Nishio T, Saito A et al (2022) Baseline drift vector of multiple points on body surface using a near-infrared camera. Phys Eng Sci Med 45:143–155. https://doi.org/10.1007/s13246-021-01097-w
Goddard J, Mandelkern M (2019) Non-invasive PET head-motion correction via optical 3d pose tracking. In: 2019 IEEE Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC). IEEE, pp 1–4
Jiang L, Zhang S, Yang J et al (2015) A robust automated markerless registration framework for neurosurgery navigation. Int J Med Rob Comput Assist Surg 11:436–447. https://doi.org/10.1002/rcs.1626
Bradski G (2000) The OpenCV Library. Dr Dobb’s Journal of Software Tools
King DE (2009) Dlib-ml: a machine learning toolkit. J Mach Learn Res 10:1755–1758
Bradski G, Kaehler A (2008) Learning OpenCV: computer vision with the OpenCV library. O’Reilly Media
Zhang Z (2000) A flexible new technique for camera calibration. IEEE Trans Pattern Anal Mach Intell 22:1330–1334. https://doi.org/10.1109/34.888718
Kazemi V, Sullivan J (2014) One millisecond face alignment with an ensemble of regression trees. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, pp 1867–1874
Sagonas C, Tzimiropoulos G, Zafeiriou S, Pantic M (2013) 300 faces in-the-wild challenge: the first facial landmark localization challenge. Proc IEEE Int Conf Comput Vis 397–403. https://doi.org/10.1109/ICCVW.2013.59
Zeng A, Yang L, Ju X et al (2022) SmoothNet: a plug-and-play network for refining human poses in videos. Eur Conf Comput Vis. https://doi.org/10.48550/arXiv.2112.13715
Vivó-Truyols G, Schoenmakers PJ (2006) Automatic selection of optimal Savitzky-Golay smoothing. Anal Chem 78:4598–4608. https://doi.org/10.1021/ac0600196
Virtanen P, Gommers R, Oliphant TE et al (2020) SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat Methods 17:261–272. https://doi.org/10.1038/s41592-019-0686-2
Amorim P, Moraes T, Silva J, Pedrini H (2015) InVesalius: an interactive rendering framework for health care support. Springer International Publishing, pp 45–54
Schönfeldt-Lecuona C, Thielscher A, Freudenmann RW et al (2005) Accuracy of stereotaxic positioning of transcranial magnetic stimulation. Brain Topogr 17:253–259. https://doi.org/10.1007/s10548-005-6033-1
Steinmeier R, Rachinger J, Kaus M et al (2000) Factors influencing the application accuracy of neuronavigation systems. Stereotact Funct Neurosurg 75:188–202. https://doi.org/10.1159/000048404
Mascott CR (2006) In vivo accuracy of image guidance performed using optical tracking and optimized registration. J Neurosurg 105:561–567. https://doi.org/10.3171/jns.2006.105.4.561
Savitzky A, Golay MJE (1964) Smoothing and differentiation of data by simplified least squares procedures. Anal Chem 36:1627–1639. https://doi.org/10.1021/ac60214a047
Seo J, Ma H, Saha TK (2018) On Savitzky-Golay filtering for online condition monitoring of transformer on-load tap changer. IEEE Trans Power Delivery 33:1689–1698. https://doi.org/10.1109/TPWRD.2017.2749374
Darby J, Sánchez MB, Butler PB, Loram ID (2016) An evaluation of 3D head pose estimation using the Microsoft Kinect v2. Gait Posture 48:83–88. https://doi.org/10.1016/j.gaitpost.2016.04.030
Workman CI, Chatterjee A (2021) The Face Image Meta-Database (fIMDb) & ChatLab Facial Anomaly Database (CFAD): tools for research on face perception and social stigma. Methods in Psychology 5. https://doi.org/10.1016/j.metip.2021.100063
Acknowledgements
We are thankful to Lourenço Rocha and Carlos Renato Silva for technical support and to Prof. Dr. Jarbas Caiado de Castro Neto, IFSC–USP for the insightful discussions at the beginning of this research.
Funding
This work has received funding from the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) (grant No. 141056/2018-5, No. 304107/2019-0, and No. 118882/2019-8) and from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 810377). This article was produced as part of the activities of the FAPESP Research, Innovation and Dissemination Center for Neuromathematics (grant No. 2013/07699-0). VHS has received funding from the Academy of Finland (decision No. 349985). RM is a post-doctoral fellow of FAPESP (grant No. 2022/14526-3).
Open Access funding provided by Aalto University.
Author information
Contributions
Renan H. Matsuda: Conceptualization, Methodology, Software, Formal Analysis, Investigation, Data Curation, Writing – Original draft, Writing – Review & Editing, Visualization; Victor Hugo Souza: Conceptualization, Methodology, Formal Analysis, Writing – Original draft, Writing – Review & Editing; Petrus N. Kirsten: Methodology, Software, Investigation, Writing – Review and Editing; Risto J. Ilmoniemi: Conceptualization, Supervision, Funding acquisition, Writing – Review and Editing; Oswaldo Baffa: Conceptualization, Resources, Supervision, Funding acquisition, Writing – Review and Editing.
Ethics declarations
Conflict of interest
V.H.S. and O.B. are inventors in a patent application on neuronavigation technology related to the methodology utilized in this work. The other authors declare no conflicts of interest.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Cite this article
Matsuda, R.H., Souza, V.H., Kirsten, P.N. et al. MarLe: Markerless estimation of head pose for navigated transcranial magnetic stimulation. Phys Eng Sci Med 46, 887–896 (2023). https://doi.org/10.1007/s13246-023-01263-2