We used two public datasets to validate our segmentation-based registration method. Most of our experiments were conducted on the RESECT dataset, which includes clinical cases of low-grade gliomas (Grade II) acquired from adult patients between 2011 and 2016 at St. Olavs University Hospital, Norway. There is no selection bias, and the dataset includes tumors at various locations within the brain. For 17 patients, B-mode US-reconstructed volumes with good coverage of the resection site were acquired. No blood clotting agent, a well-known source of artefacts, was used. US acquisitions were performed at three different phases of the procedure (before, during and after resection), and different US probes were utilized. The dataset is designed to test intra-modality registration of US volumes, and two sets of landmarks are provided: one to validate the registration of volumes acquired before, during and after resection, and a second that increases the number of landmarks between the volumes obtained before and during resection. In both sets, the reference landmarks are selected in the volumes acquired before resection and then used as references to pick the corresponding landmarks in the US volumes acquired during and after tumor removal. In the RESECT dataset, landmarks are placed in the proximity of deep grooves and corners of sulci, convex points of gyri and vanishing points of sulci. The numbers of landmarks in the first and second sets can be found in the second column of Tables 4 and 5, respectively.
In addition to the RESECT volumes, the BITE dataset is also used to test our registration framework. It contains 14 US-reconstructed volumes from 14 different patients with an average age of 52 years. The study includes four low-grade and ten high-grade gliomas, all supratentorial, with the majority in the frontal lobe (9/14). For 13 cases, acquisitions were obtained before and after tumor resection. Ten homologous landmarks are provided per volume, together with the initial mTREs. The quality of the BITE acquisitions is lower than that of the RESECT dataset, mainly because a blood clotting agent was used, which creates large artefacts.
We used MeVisLab for implementing (a) an annotation tool for medical images, (b) a 3D segmentation method based on a CNN and (c) a registration framework for three-dimensional data.
Manual segmentation of anatomical structures
The first step of our method consists of the 3D segmentation of anatomical structures in US volumes acquired at different stages of the procedure. Both the RESECT and BITE datasets were designed to test registration algorithms, and no ground truth is provided for validating segmentation methods. Therefore, we conducted a manual annotation of the structures of interest in the RESECT volumes acquired before resection. Pathological tissue was excluded from the manual annotation since it is progressively removed during resection, so correspondences could not be found in volumes acquired at subsequent stages. Instead, we focused on other hyperechogenic elements (i.e., structures with an increased response, or echo, during ultrasound examination) such as the sulci and the falx cerebri. We consider these elements valid correspondences because most of them have a high chance of remaining visible throughout the different stages of the procedure.
The manual segmentations were performed with a web-based annotation tool. As shown in Fig. 1, each RESECT volume can be simultaneously visualized on three different projection planes (axial, sagittal and coronal). The segmentation task is accomplished by contouring each structure of interest on the axial view (yellow contour in the first frame of Fig. 1). The drawn contours are then projected onto the other two views (blue overlay in the second frame of Fig. 1), so that the segmentation can be better assessed by observing the structures in different projections. The annotation process is quick and smooth, and interpolated 3D volumes can then be obtained by rasterizing the drawn contours, as sketched below. As shown in Fig. 1, the contours are well defined in the axial view, but several elements are not correctly included when considering the other two views. This is a common issue in our annotations, and correcting it would require considerable time and effort; we therefore set a maximum annotation time of 2 h per volume. The obtained masks correctly include the major structures of interest, but some elements, such as minor sulci, are missing. Despite the sparseness of our annotations, we expect the training set to be good enough to train our model to segment more refined structures of interest [29, 30].
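For illustration, the sketch below shows how per-slice axial contours can be rasterized into a binary 3D mask. The contour format (a mapping from axial slice index to polygon vertices) and the helper name are our own illustrative assumptions, not the actual data structures of the annotation tool; slice-to-slice interpolation of missing contours is omitted.

```python
# Minimal sketch (assumed data layout): rasterize axial polygon contours
# into a binary 3D mask, one slice at a time.
import numpy as np
from skimage.draw import polygon

def rasterize_axial_contours(contours, shape):
    """contours: {z: (N, 2) array of (row, col) polygon vertices}; shape: (Z, Y, X)."""
    mask = np.zeros(shape, dtype=np.uint8)
    for z, verts in contours.items():
        # fill the interior of the contour on this axial slice
        rr, cc = polygon(verts[:, 0], verts[:, 1], shape=shape[1:])
        mask[z, rr, cc] = 1
    return mask

# Example: a single square contour on axial slice 10 of a 64^3 volume.
demo = {10: np.array([[20, 20], [20, 40], [40, 40], [40, 20]])}
mask = rasterize_axial_contours(demo, (64, 64, 64))
```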
The manual annotation was performed by the main author of this work (L.C.), who has two years of experience in medical imaging and almost one year in US imaging for neurosurgery. A neurosurgeon with many years of experience in the use of US for tumor resection then reviewed and rated the manual annotations, taking into account the sparseness of the dataset. According to the defined criteria, each volume was rated on a scale from 1 to 4. A score of 1 means that the main structures (falx cerebri and major sulci) are correctly segmented, and only minor changes are needed to exclude parts of no interest (i.e., slightly over-segmented elements). A score of 2 indicates that the main structures are correctly segmented, but major corrections are needed to exclude structures of no interest. A score of 3 indicates that some main structures were missed in the manual annotation, which is, however, still acceptable. A score of 4 means that many major structures are missing and the annotation of the volume of interest cannot be accepted. The neurosurgeon evaluated the annotations by looking at the projections of the drawn contours on the sagittal and coronal views. Table 1 shows the results of the rating process for the volumes of interest.
A convolutional neural network for volumetric segmentation is trained on the manual annotations. We utilized the original 3D U-net architecture, with a few modifications with respect to the original implementation: (a) the analysis and synthesis paths have two resolution steps and (b) a dropout with a rate of 0.4 is applied before each convolution layer of the upscaling path to prevent the network from overfitting. The training is conducted with a patch size of (30, 30, 30), a padding of (8, 8, 8) and a batch size of 15 samples. The learning rate was set to 0.001, and the best model was saved according to the Jaccard index computed on 75 validation samples every 100 iterations. The architecture modifications, as well as the training parameters, were chosen by conducting several experiments and selecting the configuration providing the best results. The seventeen manually annotated volumes acquired before resection were split into training, validation and test sets as follows: the training set includes volumes 1 to 15, the validation set volumes 16 to 21 and the test set volumes 24, 25 and 27.
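The sketch below gives a PyTorch approximation of this modified 3D U-net: two resolution steps in the analysis/synthesis paths and Dropout(0.4) before each convolution of the upscaling path. The channel widths, normalization layers and layer ordering are our assumptions; the actual MeVisLab implementation may differ.

```python
# Hedged sketch of the modified 3D U-net (assumed channel widths).
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch, dropout=0.0):
    layers = []
    for ch_in, ch_out in [(in_ch, out_ch), (out_ch, out_ch)]:
        if dropout > 0:  # dropout before each conv, synthesis path only
            layers.append(nn.Dropout3d(dropout))
        layers += [nn.Conv3d(ch_in, ch_out, 3, padding=1),
                   nn.BatchNorm3d(ch_out), nn.ReLU(inplace=True)]
    return nn.Sequential(*layers)

class SmallUNet3D(nn.Module):
    def __init__(self, in_ch=1, base=16):
        super().__init__()
        self.enc1 = conv_block(in_ch, base)          # analysis step 1
        self.enc2 = conv_block(base, base * 2)       # analysis step 2
        self.bottom = conv_block(base * 2, base * 4)
        self.pool = nn.MaxPool3d(2)
        self.up2 = nn.ConvTranspose3d(base * 4, base * 2, 2, stride=2)
        self.dec2 = conv_block(base * 4, base * 2, dropout=0.4)  # synthesis
        self.up1 = nn.ConvTranspose3d(base * 2, base, 2, stride=2)
        self.dec1 = conv_block(base * 2, base, dropout=0.4)
        self.head = nn.Conv3d(base, 1, 1)            # structure vs. background

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottom(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)

# Smoke test: spatial dims must be divisible by 4 in this sketch (the paper
# trains on 30^3 patches with 8-voxel padding; odd-size handling is omitted).
net = SmallUNet3D()
out = net(torch.randn(2, 1, 32, 32, 32))  # -> (2, 1, 32, 32, 32)
```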
Once the best model for segmenting anatomical structures in pre-resection US volumes was found, we applied it to segment the ultrasound volumes acquired at the other surgical phases.
The masks automatically segmented by our trained model are used to register the US volumes. The proposed method is a variational image registration approach: the registration process can be seen as an iterative optimization algorithm in which the search for the correct registration between two images corresponds to finding a global minimum of an objective function. The minimization is performed according to the discretize-then-optimize paradigm: the discretization of the various parameters is followed by their optimization. The objective function to be minimized is composed of a distance measure, which quantifies the similarity between the deformed template image and the reference image, and a regularizer, which penalizes undesired transformations. In our approach, the binary 3D masks generated in the previous step are used as input for the registration task, which can therefore be treated as a mono-modality intensity-based problem. We thus chose the sum of squared differences (SSD) as the similarity measure, which is usually suggested for registering images with similar intensity values. Moreover, to limit the possible transformations in the deformable step, we utilized the elastic regularizer, one of the most commonly used. The optimal transformation parameters are found with the quasi-Newton L-BFGS optimizer, chosen for its speed and memory efficiency. The stopping criteria for the optimization were defined empirically: the minimal progress, the minimal and relative gradient norms and the minimal step length were all set to 0.001, and the maximum number of iterations to 100.
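As a minimal illustration of this formulation, the sketch below minimizes SSD between a deformed template mask and a reference mask with PyTorch's L-BFGS optimizer. For brevity it replaces the elastic regularizer with a simpler first-order (diffusion-style) smoothness penalty on the displacement field; the tolerance values mirror the 0.001 thresholds above, but the loss weight and data layout are illustrative assumptions.

```python
# Hedged sketch: SSD + smoothness penalty minimized with L-BFGS.
import torch
import torch.nn.functional as F

def warp(template, disp):
    """template: (1,1,D,H,W); disp: (1,3,D,H,W), channels (dx,dy,dz) in voxels."""
    D, H, W = template.shape[2:]
    zz, yy, xx = torch.meshgrid(torch.arange(D), torch.arange(H),
                                torch.arange(W), indexing="ij")
    base = torch.stack([xx, yy, zz]).float()[None]        # identity grid (x,y,z)
    scale = torch.tensor([W - 1., H - 1., D - 1.]).view(1, 3, 1, 1, 1)
    grid = (2 * (base + disp) / scale - 1).permute(0, 2, 3, 4, 1)
    return F.grid_sample(template, grid, align_corners=True)

def register(template, reference, init=None, alpha=0.1, max_iter=100):
    disp = (torch.zeros(1, 3, *template.shape[2:]) if init is None
            else init.clone()).requires_grad_(True)
    opt = torch.optim.LBFGS([disp], max_iter=max_iter,
                            tolerance_grad=1e-3, tolerance_change=1e-3)
    def closure():
        opt.zero_grad()
        ssd = ((warp(template, disp) - reference) ** 2).sum()        # distance
        reg = sum((disp.diff(dim=k) ** 2).sum() for k in (2, 3, 4))  # smoothness
        loss = ssd + alpha * reg
        loss.backward()
        return loss
    opt.step(closure)
    return disp.detach()
```

Here the binary masks are expected as float tensors of shape (1, 1, D, H, W); the returned displacement field can be applied with `warp` to obtain the deformed template.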
Our registration method aims to provide a deformable solution that compensates for the anatomical changes happening during tumor resection. As commonly suggested for non-rigid registration tasks, the proposed solution includes an initial parametric registration, which is then used to initialize the nonparametric one. First, the parametric approach uses the information provided by the optical tracking system as an initial guess. Starting from this pre-registration, a two-step approach is conducted: a translation followed by a rigid transformation. In this stage, to speed up the optimization process, the images are registered at a resolution one level coarser than the original one. The transformation computed during the parametric registration is then used as the initial condition for the nonparametric step. In this stage, to reduce the chance of reaching a local minimum, a multilevel technique is introduced: the images are registered at three different scales, from three levels coarser to one level coarser. The output of the registration step is the deformed template image.
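The coarse-to-fine scheme can be sketched as below, reusing the `register` function from the previous sketch. The rigid pre-alignment seeded by the tracking data is omitted (its output is assumed to be baked into the input volumes), and the warm-starting strategy between levels is our own illustrative choice.

```python
# Hedged sketch of the multilevel nonparametric step (levels 3 -> 2 -> 1).
import torch
import torch.nn.functional as F

def multilevel_register(template, reference, levels=(3, 2, 1)):
    disp = None
    for lev in levels:                    # from three levels coarser to one
        f = 2 ** lev                      # downsampling factor at this level
        t = F.avg_pool3d(template, f)     # dims assumed divisible by f
        r = F.avg_pool3d(reference, f)
        if disp is not None:
            # warm start: upsample the previous field and double the voxel
            # displacements, since the resolution doubles between levels
            disp = 2 * F.interpolate(disp, size=t.shape[2:],
                                     mode="trilinear", align_corners=True)
        disp = register(t, r, init=disp)  # nonparametric step (sketch above)
    return disp
```

Running the optimization first on heavily downsampled masks captures the large deformations cheaply, and each finer level then only needs to refine the upsampled field, which is what makes local minima less likely.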