Needle placement is a challenging problem for a variety of medical interventions, including brachytherapy or biopsy [12]. The force acting on the needle tip allows for inference about the currently penetrated tissue. This information can be used to navigate the needle and to prevent injuries of delicate structures [9]. In order to distinguish tissue based on tip forces, it may be required to measure those with an accuracy of approximately \(0.01 \, \hbox {N}\) [8]. Tip forces cannot be measured with external sensors due to friction forces at the needle shaft [5]. Therefore, small-scale fiber-optical force estimation methods have been directly integrated into the needle tip. Several sensor concepts are based on Fabry–Pérot interferometry [1] and fiber Bragg gratings [6]. Here, we consider a setting where optical coherence tomography is available, e.g., to study tissue deformation [10] or to realize elastography [7]. While OCT has been proposed for tip forces estimation before [2, 3], these approaches rely on the reconstructed gray value data. However, using the reconstructed data has two limitations. First, the signal processing is based on a number of assumptions which may cause some loss of signal information. Second, it does not incorporate the phase part of the complex OCT signal that is particularly sensitive to small axial shifts. Therefore, we explore whether the tip force estimation accuracy can be improved by directly using the raw spectral OCT data. Thus, we perform a calibration between the optical signal and forces applied to the needle tip with convolutional neural networks. We validate our approach with three different needles using a new, improved needle design.

Fig. 1
figure 1

The design of the needles used in this work. In the upper image, one of our needles is depicted below, in comparison with a standard G18 biopsy needle above. In the schema, the brass tip with a piston and the brass sleeve are depicted in yellow. The ferrule guiding the fiber is depicted in gray. The protection tube is depicted in orange

Fig. 2
figure 2

Example OCT image data over a period of time. Left, the raw OCT signal is shown. Right, the reconstructed A-Scans (depth profiles) are shown. One individual scan is depicted in one column. The scans were collected as part of a time series. Subsequent scans are adjacently depicted (M-Scan). The reconstructed scans allow human interpretation to some extent, e.g., the piston surface is visible. Patterns in the raw OCT data are much more subtle and not interpretable by the human eye


Needle design

We used an improved needle design for force estimation at the needle tip. A scheme and an image of the needle are shown in Fig. 1. A brass tip with a piston is put on a brass sleeve such that it is able to perform a sliding motion inside of it. An epoxy layer between tip and sleeve acts as a spring. An optical fiber is embedded into a ceramic ferrule. The ferrule is positioned relative to the tip piston such that the light beam travels a distance of approximately \(1 \, \hbox {mm}\) through air until hitting the surface of the piston. For protection, the ferrule is embedded into a polymer tube which is glued to the brass sleeve. When forces act on the needle tip, the epoxy layer is compressed and the piston moves closer to the exit point of the laser beam which can be detected in the OCT signal. The diameter of the needle is \(2 \, \hbox {mm}\).

In [2], the needle tip was constructed as a cone and attached to the needle shaft with a deformable epoxy layer. Thus, radial forces on the needle tip could easily tilt it. The improved piston construction has the advantage that it guides the tip in axial needle direction and prevents tilt. This contributes to a more reproducible signal, an important aspect in the calibration of the needle, and to the overall durability.

Table 1 Mean absolute error results in mN and inference times in ms
Fig. 3
figure 3

Bar plot showing the relative difference of the mean absolute error when learning from raw data

Calibration data

We acquire calibration data for three custom build needles identical in construction. The data acquisition is performed similar to [3] where a needle is driven against a flat surface with a stepper motor. The force in the axial direction of the needle is measured with a force sensor and recorded together with the associated raw OCT data. Approximately 180 000 OCT-force pairs are collected for each needle for forces between \(0 \, \hbox {N}\) and \(1 \, \hbox {N}\). We perform regular OCT data reconstruction, which includes the following steps:

  1. 1.

    Dechirping the data by resampling and interpolating to new sampling points, based on manufacturer specifications

  2. 2.

    Estimation of the DC spectrum using an exponential moving average with a damping coefficient of \(d = 0.05\)

  3. 3.

    Subtraction of the DC spectrum

  4. 4.

    Apodization by using a Hann window for filtering the spectral data

  5. 5.

    Fourier transform for mapping frequency values to spatial values

  6. 6.

    Selection of the absolute signal value as the final reconstructed intensity image

Throughout the reconstruction process, information can be lost due to the DC spectrum estimation strategy (2), the apodization (4) which eliminates high-frequency signal parts and the selection of the absolute signal value (6) as the final image. Figure 2 shows an excerpt of the collected data, both in raw (left) and reconstructed form (right).

The raw OCT signal has a size of \(1024 \times N_t\) where \(N_t\) is the number of scans acquired over time. Reconstruction up to step (5) results in a complex signal which is also of size \(1024 \times N_t\). Finally, by taking the absolute signal value, the A-Scan image sequence of size \(512 \times N_t\) is obtained. This can be interpreted as a sequence of 1D depth images over time.

Deep learning architectures

We compare prediction performance for needle tip forces from both raw and reconstructed OCT data with different convolutional neural network (CNN) architectures. The considered architectures are variants of the ResNet [4], which is an extension of CNNs that enables better training through improved gradient flow. The ResNet models were originally built for 2D images with 2D convolutions. Here, the A-Scan data represent 1D images. Therefore, we adapt the architectures by replacing 2D convolutions by 1D convolutions. Furthermore, we replace the network’s original output layer for multi-class classification with a fully connected layer with one output for needle force regression. We consider several ResNet variants, resembling different network sizes, the regular ResNet34 and ResNet18 architecture as well as a smaller architecture with 2 residual blocks and 6 convolutional layers, which we name ResNet6. All deep learning models are implemented in PyTorch [11]. Learning is performed over 150 epochs with a batch size of \(N_B = 128\) and a learning rate of 0.005 using the Adam optimizer. As a loss function, we use the mean squared error which is defined as

$$\begin{aligned} MSE = \frac{1}{N_B}\sum _{j=1}^{N_B}(y^{j}-{\hat{y}}^{j})^2 \end{aligned}$$

where y is the ground-truth force and \({\hat{y}}\) is the predicted force value. We use \(20\%\) of the data as a hold-out validation set. We performed five training runs with different random seeds and averaged the individual results.


We report the mean absolute error (MAE) in mN between force predictions and force targets on the validation set. Additionally, we report inference times for the examined neural network architectures. All results are shown in Table 1. Figure 3 shows the relative differences in errors graphically. For most combinations of needle and model architecture, the error for learning on raw OCT data is lower compared with learning from reconstructed OCT scans. Particularly for the ResNet34 architecture, the calibration performance improves for all needles.

Discussion and conclusion

In this paper, we address the calibration problem of OCT-based needle tip force estimation using new and improved sensor concept with a piston and a guiding sleeve. In contrast to a previous OCT-based concept [2], the needle is not as sensitive to lateral forces by design while achieving similar calibration results with intensity data.

Furthermore, we study an approach for improving deep learning-based calibration performance even further. During OCT acquisition, a spectral signal is obtained which is typically reconstructed to an intensity depth image. We illustrate that learning is possible in an end-to-end fashion, i.e., the process of image reconstruction can be avoided. Moreover, the results improve in six out of nine setups, indicating that there may be information lost in the original processing that can be used when training the network on the raw signal. The proposed approach allows for precise estimation of forces at the needle tip, which is particularly interesting for force-based robotic needle placement. With typical robot control cycles of \(1 \, \hbox {ms}\), a trade-off between accuracy and inference time must be balanced.

We find that learning forces from raw OCT data instead of reconstructed images work well, in particular, for larger deep learning models. For future work, our approach could be studied in more detail with additional deep learning methods and in different applications scenarios. Also, our approach could be extended to other applications where an imaging modality is used as a sensor signal, for example, in the context of motion tracking with OCT or ultrasound.