1 Introduction

White matter (WM) brain fiber parcellation encompasses methods that aim to classify and group together fiber entities, i.e. streamlines. The task is also named bundling; segmentation, especially when it provides a voxel-based output; or “virtual dissection”, when it is done semi-automatically or with some manual intervention. Bundling is an essential processing step in tractography pipelines, as it allows the tracks of interest to be identified across different brain regions. The large number of streamlines contained in an average tractogram calls for automated procedures. Streamline classification for bundling purposes is most commonly performed using either or both of two criteria [6]: (i) the streamline similarity (defined according to some distance measure); and (ii) the regions of interest (ROIs) that streamlines traverse, or the (gray matter) brain regions that their endpoints connect. Despite being a seemingly simple geometrical entity, a streamline is still challenging to characterize adequately. Although several distance measures (such as the closest point distance, the Hausdorff distance, the Mahalanobis distance, or the Minimum average Direct and Flip distance (MDF), among others) have been proposed in the literature [6, 17], streamline-space point-wise distance computation and full pair-wise comparisons are computationally expensive, and might not capture other relevant features. Clustering can be performed in the streamline native space, or in some other representation space (e.g. [17, 23, 27]), and some methods provide a volumetric result of streamline groups (bundles) (e.g. [13, 14, 22]).

We propose to extend the autoencoder-based latent space nearest neighbor tractography framework proposed in [12] to cluster streamlines into bundles. We show that the proposed autoencoder-based method is successful at bundling streamlines on synthetic and clinical-style realistic phantom data, as well as on in vivo human brain data. The method (i) does not need to be trained on labelled data, (ii) uses a single model, trained only once, to classify streamlines, and (iii) does not require any distance thresholding parameter to generate the clusters.

1.1 Related Work

Automatic bundle identification of deep white matter pathways has been performed using a variety of methods: (i) anatomical filtering; (ii) clustering; (iii) atlas-based; (iv) graph-based; (v) dictionary learning; (vi) segmentation-based; and, more recently, (vii) deep learning-based methods [23]. Automatic anatomical filtering methods (e.g. [26]), including query languages [21], often offer results of limited quality due to the variability of the streamline locations across subjects, and are highly sensitive to the streamlines’ waypoints (e.g. streamlines that are a few voxels short of reaching the gray matter, or that lie apart from each other at a few locations, might be discarded or classified into different groups).

Clustering methods [2, 9, 15, 17, 19] use a given streamline similarity distance definition. These approaches may include some form of hierarchical approach to progressively improve the results (e.g. [9]). Several methods have used unsupervised machine learning strategies, such as Expectation-Maximization (EM) [15] or k-means [9]. Similarly, the use of streamline feature descriptors that aim to capture and summarize the relevant information for the classification, along with the use of some form of embedding space where the clustering takes place, have also been proposed [2, 17, 19]. Some of these methods (e.g. [17]) require computing pair-wise streamline distances, which has a complexity of \(\mathcal {O}(N^{2})\).
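To make the quadratic cost concrete, the following is a minimal sketch of a full pair-wise comparison using the MDF distance mentioned in the introduction. It assumes that all streamlines have already been resampled to the same number of points; the function names `mdf` and `pairwise_mdf` are ours, chosen for illustration, and do not correspond to any of the cited implementations.

```python
import numpy as np


def mdf(s, t):
    """Minimum average Direct and Flip distance between two streamlines.

    Both inputs are (K, 3) arrays with the same number of points K.
    """
    direct = np.linalg.norm(s - t, axis=1).mean()
    flipped = np.linalg.norm(s - t[::-1], axis=1).mean()
    return min(direct, flipped)


def pairwise_mdf(streamlines):
    """Symmetric (N, N) distance matrix; needs N * (N - 1) / 2 evaluations."""
    n = len(streamlines)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = mdf(streamlines[i], streamlines[j])
    return dist


# Toy data: a straight line, the same line reversed, and a shifted copy.
a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
b = a[::-1]
c = a + np.array([0.0, 1.0, 0.0])
dist = pairwise_mdf([a, b, c])
```

In this toy case the reversed line is at distance 0 from the original, since MDF is insensitive to the point ordering, and the shifted copy is at distance 1 from both. The nested loop is what grows quadratically with the number of streamlines.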

Atlas-based methods such as the ones proposed in [6, 25] rely on the anatomical priors provided by the atlas to assign streamlines to a given bundle. They use bundle or cluster “models” to recognize streamlines in the target tractogram according to a given threshold with respect to the streamline- or feature-space centroids. Some of these methods, such as [6], might yield a variable number of clusters across subjects, or differing results depending on the initial sorting of the streamlines in the tractogram.

Graph-based strategies [18, 20] consider the clustering task as a graph partitioning problem that seeks to cluster the nodes based on a similarity measure. Dictionary learning methods [23], in turn, generally assume that a dictionary that contains a representative signature for each bundle can be computed (or learned), and posit the task of finding the class a streamline belongs to as an optimization problem that seeks to find the coefficients that fit a given bundle representation for each streamline.

Lately, deep learning-based methods have also been applied to the bundling task, and have compared favorably to the mentioned conventional methods within the studied contexts. Several authors [11, 27] have used recurrent neural networks (RNNs) to solve the clustering problem as a classification problem. Similarly, regular classification convolutional neural networks (CNNs) have been employed [10, 19, 24] to predict the streamline bundle labels. In [3], the authors proposed a Deep Embedded Clustering-based (DEC) framework to provide the cluster assignments. Finally, a number of deep learning-based methods have cast the problem as a segmentation task, yielding bundle-wise voxel masks [13, 14, 22].

Classification neural networks are trained to reliably provide a prediction on a fixed-length probability vector, and hence do not allow the number of target labels (i.e. bundles) to be changed without retraining. Tractography segmentation methods, in turn, are inherently binary classification methods: given that the same voxel cannot be assigned to multiple labels (even if multiple streamlines belonging to different bundles may traverse the same voxel), such methods require a separate model to be trained for each bundle.

2 Material and Methods

The same deep autoencoder architecture presented in [12] is used in this work. The chosen autoencoder is a regular convolutional deep neural network, trained to minimize the mean squared-error loss between the input streamlines and their reconstructions at the output of the autoencoder.
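For reference, the reconstruction objective can be written as below. The notation is ours and is not taken from [12]: \(x_i\) is the \(i\)-th of \(N\) input streamlines, stored as a \(P \times 3\) array of point coordinates, \(f_\theta\) and \(g_\phi\) denote the encoder and the decoder, and \(\hat{x}_i\) is the reconstruction. The exact normalization used in [12] may differ.

```latex
\[
\mathcal{L}_{\mathrm{MSE}}
  = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{3P}
    \left\lVert x_i - \hat{x}_i \right\rVert_F^{2},
\qquad
\hat{x}_i = g_\phi\bigl(f_\theta(x_i)\bigr)
\]
```

Under this notation, the latent representation of a streamline is \(z_i = f_\theta(x_i)\).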

We propose to cluster streamlines using a k-NN approach in the latent space learned by autoencoding streamlines. It is essentially assumed that similar data points (streamlines in our case) will be concentrated in neighboring regions, in the Euclidean sense, of the latent space [1, 8]. Thus, given (i) an autoencoder; (ii) a set of streamlines to train the autoencoder; (iii) the anatomical bundle classes of a subset of the preceding streamlines; and (iv) a new tractogram that needs to be split into the same set of available bundles, the proposed method proceeds as follows:

  1. Train an autoencoder using raw, unlabelled streamlines, generated by a predetermined tractography algorithm.

  2. Select a subset of streamlines whose bundle class is known so that they can be used as the reference set to bundle new streamlines. Project such streamlines to the latent space.

  3. Project to the latent space the streamlines in a new, to-be-bundled tractogram.

  4. Apply a k-NN method using the readily available labelled (reference) streamlines to determine the bundle class of the new streamlines.

We have dubbed the above method CINTA, Clustering in Tractography using Autoencoders. The method requires all streamline data to reside in a common or standard reference space (such as the MNI space).
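The projection and k-NN steps of the procedure can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in this work: the `encode` function is a hypothetical stand-in for the trained encoder (it merely subsamples and flattens each streamline), the data are two synthetic straight "bundles", and the nearest neighbors are found by brute force with a majority vote.

```python
import numpy as np

rng = np.random.default_rng(0)
N_POINTS = 16  # toy value; hypothetical, kept small for readability


def make_bundle(y_offset, n_streamlines):
    """Synthetic bundle: noisy straight lines along x at a given y offset."""
    t = np.linspace(0.0, 10.0, N_POINTS)
    base = np.stack(
        [t, np.full(N_POINTS, y_offset), np.zeros(N_POINTS)], axis=1
    )
    noise = rng.normal(scale=0.2, size=(n_streamlines, N_POINTS, 3))
    return base[None] + noise


def encode(streamlines):
    """Placeholder for the trained encoder (an assumption, not a real model):
    keeps every 4th point and flattens to a low-dimensional vector."""
    return streamlines[:, ::4, :].reshape(len(streamlines), -1)


def knn_bundle_labels(ref_latent, ref_labels, new_latent, k=5):
    """Majority vote among the k nearest reference vectors (Euclidean)."""
    diff = new_latent[:, None, :] - ref_latent[None, :, :]
    sq_dist = (diff ** 2).sum(axis=-1)             # (n_new, n_ref)
    nearest = np.argsort(sq_dist, axis=1)[:, :k]   # indices of k neighbors
    neighbor_labels = ref_labels[nearest]          # (n_new, k)
    n_classes = int(ref_labels.max()) + 1
    votes = np.stack(
        [np.bincount(row, minlength=n_classes) for row in neighbor_labels]
    )
    return votes.argmax(axis=1)


# Step 2: labelled reference streamlines, projected to the latent space.
reference = np.concatenate([make_bundle(0.0, 20), make_bundle(10.0, 20)])
reference_labels = np.repeat([0, 1], 20)
# Step 3: streamlines of a new tractogram, projected to the latent space.
new = np.concatenate([make_bundle(0.0, 5), make_bundle(10.0, 5)])
# Step 4: k-NN assignment of bundle labels.
predicted = knn_bundle_labels(
    encode(reference), reference_labels, encode(new), k=5
)
```

Note that the reference set can be replaced, or extended with new bundle classes, without touching the encoder, since the labels only enter at the voting step.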

Fig. 1. Conceptual illustration of CINTA (Clustering in Tractography using Autoencoders). The streamlines that belong to the same bundle are naturally clustered together in the latent space of a trained autoencoder. A k-NN method is applied to assign the bundle label to such streamlines.

3 Experiments

CINTA’s performance is quantitatively measured on (i) the “Fiber Cup” synthetic tractography dataset [4, 5], and (ii) the clinical-style realistic ISMRM 2015 Tractography Challenge dataset [16]. A subject from the Human Connectome Project (HCP) dataset [7] was used to qualitatively demonstrate CINTA’s bundling ability on in vivo human brain tractography data. Local probabilistic tracking (“Fiber Cup”; ISMRM 2015 Tractography Challenge) and global tracking (HCP) were employed to reconstruct streamlines. The ground truth WM parcellations were obtained according to the data preparation procedure described in [12]. Streamlines had their head-to-tail orientations flipped according to a reference, and were resampled to 256 points prior to training the autoencoder. The k parameter for the k-NN clustering method was chosen experimentally from the set \(\{3, 5\}\): it was fixed to a value of 5, as it provided a better F1-score on the ISMRM 2015 Tractography Challenge dataset (an identical performance was registered for both values on the “Fiber Cup” dataset). RecoBundles [6] was used as the baseline method (using the synthetic bundle models available in each dataset).
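The two streamline preprocessing operations can be sketched as follows. This is only an illustration: we assume equidistant resampling along the arc length, and a flipping criterion that keeps whichever point ordering is closer, on average, to a reference streamline. The actual criterion follows [12] and may differ; the function names are ours.

```python
import numpy as np


def resample_streamline(points, n_points=256):
    """Resample an (M, 3) polyline to n_points equally spaced along its
    arc length, using linear interpolation. Assumes a non-zero length."""
    segment_lengths = np.linalg.norm(np.diff(points, axis=0), axis=1)
    arc = np.concatenate([[0.0], np.cumsum(segment_lengths)])
    target = np.linspace(0.0, arc[-1], n_points)
    return np.stack(
        [np.interp(target, arc, points[:, dim]) for dim in range(3)], axis=1
    )


def orient_like(streamline, reference):
    """Return the streamline in the point ordering (as given or reversed)
    that is closer, point by point, to the reference.
    Both arrays must have the same number of points."""
    direct = np.linalg.norm(streamline - reference, axis=1).mean()
    flipped = np.linalg.norm(streamline[::-1] - reference, axis=1).mean()
    return streamline[::-1] if flipped < direct else streamline


# Toy example with 4 points instead of 256.
line = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [3.0, 0.0, 0.0]])
resampled = resample_streamline(line, n_points=4)
reoriented = orient_like(resampled[::-1], reference=resampled)
```

Here the unevenly spaced input points end up at x = 0, 1, 2 and 3, and the reversed copy is flipped back to match the reference.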

The following results are reported:

  • Accuracy: proportion of correct predictions (true positives and true negatives) over the total number of streamlines.

  • Sensitivity (recall): proportion of relevant instances that are predicted as positives (true positives) among all positive streamlines in the data.

  • Precision: proportion of correctly predicted positives (true positives) among all retrieved (predicted) positive streamlines.

  • F1-score: harmonic mean of precision and sensitivity.

For each bundle, the positive instances are those corresponding to the streamlines that are labelled with the given bundle class as determined by the underlying scoring method, the negatives being any other streamline in the whole tractogram.
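The measures above can be computed per bundle in a one-versus-rest fashion, as in the following minimal sketch. The function name `bundle_scores`, the toy labels, and the convention of returning 0 when a denominator is empty are ours, for illustration only.

```python
import numpy as np


def bundle_scores(true_labels, pred_labels, bundle):
    """One-versus-rest scores for a single bundle class."""
    is_pos = true_labels == bundle
    pred_pos = pred_labels == bundle
    tp = int(np.sum(is_pos & pred_pos))
    tn = int(np.sum(~is_pos & ~pred_pos))
    fp = int(np.sum(~is_pos & pred_pos))
    fn = int(np.sum(is_pos & ~pred_pos))

    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2.0 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
    }


# Toy tractogram with six streamlines and three bundles.
true_labels = np.array([0, 0, 0, 1, 1, 2])
pred_labels = np.array([0, 0, 1, 1, 1, 2])
scores = bundle_scores(true_labels, pred_labels, bundle=1)
```

For bundle 1 in this toy case there are 2 true positives, 1 false positive, no false negatives and 3 true negatives, which gives an accuracy of 5/6, a sensitivity of 1, a precision of 2/3 and an F1-score of 0.8. Averaging such per-bundle values yields the mean and standard deviation across bundles.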

4 Results

Table 1 shows CINTA’s performance for the “Fiber Cup” and ISMRM 2015 Tractography Challenge datasets averaged over all bundles. As the reported measures reveal, the proposed autoencoder-based tractography bundling procedure achieves perfect and close to perfect scores on the respective datasets, and consistently outperforms the RecoBundles baseline. Additionally, as can be seen in figure 2, the classification performance is highly consistent across bundles on both datasets.

Table 1. Bundling classification scores. Mean and standard deviation values across bundles.

Fig. 2. CINTA’s classification performance bundle-wise breakup: (a) “Fiber Cup” dataset; and (b) ISMRM 2015 Tractography Challenge dataset.

Fig. 3. Autoencoder-based bundling on the “Fiber Cup” dataset: (a) all bundles; (b) bundle 5; (c) bundle 6; and (d) bundle 7 (following the numbering in [4]).

Fig. 4. Autoencoder-based bundling on the ISMRM 2015 Tractography Challenge dataset: (a, b, c) all bundles (axial superior, coronal anterior, sagittal left views, respectively); (d) left SLF (axial superior view); (e) left CST (coronal anterior view); and (f) Fornix (sagittal left view) (see [16] for the bundle acronyms and names).

Figures 3 and 4 show the bundles as classified with the proposed method. As expected from the scores in table 1, the latent space-based bundling predictions closely follow the anatomically coherent streamline-space bundle partitions. Furthermore, following from the reconstruction difficulty analysis on the ISMRM 2015 Tractography Challenge dataset [16], which revealed 18 hard or very hard bundles, results indicate that CINTA reliably identifies hard-to-track bundles in the data (e.g. left CST and fornix; see (e) and (f) subplots in figure 4).

Figure 5 shows the bundling results on the HCP data subject. As can be seen from the figure, CINTA successfully clusters streamlines into the corresponding anatomically meaningful bundles.

Fig. 5. Autoencoder-based bundling on the HCP dataset: (a, b, c) all bundles (axial superior, coronal anterior, sagittal left views, respectively); (d) right ILF (sagittal right view); (e) left OR (axial superior view); and (f) CC (sagittal left view).

5 Discussion

The results in section 4 show that the latent space learned by the proposed autoencoder provides a low-dimensional representational space where similar streamlines are clustered close to each other. Thus, streamlines can be appropriately classified into anatomically coherent bundles in such a space.

Our clustering approach only requires a single parameter to be fixed (the neighborhood value k), and it is experimentally verified that, for the tested values (3 and 5), its value does not significantly influence the results. Its worst-case computational time performance is linear (\(\mathcal {O}(Nd) \approx \mathcal {O}(N)\), where N is the number of data points and d the number of features, assuming \(N \gg d\)) (see section A.2 for an experimental demonstration). The complexity is thus dominated by the number of samples. Our clustering framework uses a single model to classify all streamlines at once. Additionally, CINTA can accommodate a variable number of bundles: the autoencoder does not need to be retrained if the number of bundles to be identified changes.

The proposed procedure does not incur notable misclassification errors: it is verified that when a streamline is assigned to the wrong bundle, it is anatomically close to the wrongly assigned class (e.g. left CST streamlines being classified as left FPT streamlines; see section A.1 for an example). This constitutes indirect evidence that the latent space of our autoencoder appropriately encodes the necessary anatomical information about the input streamlines.

CINTA requires a subset of the training streamlines to be appropriately labelled so that streamlines in any new tractogram can be classified according to their nearest neighbors in such a set. This set of labelled streamlines needs to be built only once (for a target bundle mapping). Investigating how the classification performance depends on the number of available labelled streamlines, or whether and how that number may vary across bundles or target bundle mappings, is left for future work. Similarly, a multi-subject dataset comparative analysis of CINTA is left for a separate piece of work.

6 Conclusion

We present an extension to an autoencoder-based framework to cluster tractography streamlines into anatomically consistent bundles. We demonstrate that the autoencoder-based tractography latent space offers a versatile representational space to classify streamlines in a straightforward fashion. CINTA (Clustering in Tractography using Autoencoders) obtains excellent scores on synthetic and clinical-style realistic phantom data, and outperforms the RecoBundles baseline method. It also obtains anatomically consistent results on in vivo human brain data. The method (i) does not need to be trained on labelled data, (ii) uses a single model, trained only once, to classify streamlines, and (iii) does not require any distance thresholding parameter to generate the clusters.

Conflict of Interest. Pierre-Marc Jodoin and Maxime Descoteaux report a relationship with Imeka Solutions inc. that includes board membership and employment. Jon Haitz Legarreta, Pierre-Marc Jodoin and Maxime Descoteaux have patent #17/337,413 pending to Imeka Solutions inc.