Analyzing Thalamocortical Tract-Tracing Experiments in a Common Reference Space

Current mesoscale connectivity atlases provide limited information about the organization of thalamocortical projections in the mouse brain. Labeling the projections of spatially restricted neuron populations in thalamus can provide a functionally relevant level of connectomic analysis, but such experiments need to be integrated within a common reference space. Here, we present a pipeline for the segmentation, registration, integration and analysis of multiple tract-tracing experiments. The key difference from other workflows is that the data is transformed to fit the reference template. As a test-case, we investigated the axonal projections and intranuclear arrangement of seven neuronal populations of the ventral posteromedial nucleus of the thalamus (VPM), which we labeled with an anterograde tracer. Their soma positions corresponded, from dorsal to ventral, to cortical representations of the whiskers, nose and mouth. They strongly targeted layer 4, with the majority exclusively targeting one cortical area and the ones in ventrolateral VPM branching to multiple somatosensory areas. We found that our experiments were more topographically precise than similar experiments from the Allen Institute, and projections to the primary somatosensory area were in agreement with single-neuron morphological reconstructions from publicly available databases. This pilot study sets the basis for a shared virtual connectivity atlas that could be enriched with additional data for studying the topographical organization of different thalamic nuclei. The pipeline is accessible with only minimal programming skills via a Jupyter Notebook, offers multiple visualization tools, such as cortical flatmaps, subcortical plots and 3D renderings, and can be used with custom anatomical delineations.

Supplementary Information: The online version contains supplementary material available at 10.1007/s12021-023-09644-4.

1 Supplementary Materials

AMBCA
The Allen Mouse Brain Connectivity Atlas (AMBCA) repository offers the largest collection of anterograde tract-tracing experiments spanning the entire mouse brain to date (see 1). It is an ensemble of ∼1600 experiments performed on 56-day-old (P56) wild-type (C57BL/6J) and transgenic cre-line male mice, with 16 different driver lines in total (Oh et al., 2014; Harris et al., 2019). To label the wild-type mice, they used a recombinant adeno-associated virus (rAAV) expressing enhanced green fluorescent protein (EGFP) (Oh et al., 2014). To label the transgenic cre-line mice, they used a cre-recombinase-dependent rAAV expressing either EGFP in the neuronal cytoplasm or a synaptophysin-EGFP fusion that labels the presynaptic terminals (sypEGFP) (Harris et al., 2018). The main difference is that in the wild-type experiments most neurons at the injection location were labeled, whereas in the cre-line experiments only neurons belonging to specific cell classes were labeled. They visualized the labeled axonal projections using two-photon microscopy and produced brain slice images at 1 µm resolution, which have been registered to CCF v3.0 (Wang et al., 2020), thus producing a whole-brain volumetric dataset.
A number of cre-line experiments targeted the projections of multiple thalamic nuclei, which were thus classified into different projection classes, feedforward or feedback, based on the cortical hierarchy of their projections (Harris et al., 2019). In particular, nuclei were classified as core (predominantly targeting layer 4 of cortical areas), intralaminar (predominantly targeting the striatum), matrix focal (predominantly targeting layer 1 of a single area) and matrix multi-areal (predominantly targeting layer 1 of multiple areas).
That said, the coarse delineation of mouse brain areas based on their projection profile is a consequence of the technical constraints of the experiments performed for AMBCA. A major limitation is the dense labeling induced by the selected viral tracers, which makes it impossible to further subdivide brain areas and nuclei into topographically organized neuronal sub-populations with different projection patterns. Likewise, it is not possible to distinguish the precise termination patterns of different classes of cells labeled by the cre-line tracers. While this issue has been partly mitigated by the emergence of large-scale datasets of axonal morphological reconstructions of single neurons (Winnubst et al., 2019; Peng et al., 2021), thousands of similar reconstructions will be required to reach a coverage of thalamic nuclei similar to that of Harris et al. (2019).

QUINT workflow
The QUINT workflow comprises three well-established neuroinformatics tools, each corresponding to a step of the workflow. First, the QuickNII software (Puchades et al., 2019) is used to register experimental section images to the CCF template and to generate images of the corresponding template sections. Second, the ilastik software (Berg et al., 2019) is used to segment objects of interest, such as cell bodies or neurites, out of the experimental sections. Third, the Nutil software (Groeneboom et al., 2020) is used to quantify the counts of the segmented objects from step 2 across the delineated brain regions that characterize the template section from step 1. In the following paragraphs, we provide a short description of the QuickNII and ilastik tools. We also describe two additional tools related to QUINT that were incorporated in our pipeline, namely VisuAlign (see 1) and DeepSlice (Carey et al., 2022), which supported the registration procedure. Lastly, we did not use the Nutil software but instead used the Python libraries NumPy and Matplotlib to compute counts of segmented pixels directly within the resulting 3D volume (see 1) that was generated by the finalized registration of a complete experimental dataset.
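The per-region counting step can be sketched in a few lines of NumPy. This is an illustrative example, not the pipeline's actual code; the region ids and array shapes are hypothetical. It tallies segmented (labeled-axon) voxels against a registered annotation volume of the same shape:

```python
import numpy as np

def count_labels_per_region(segmentation, annotation):
    """Count segmented (non-zero) voxels per atlas region.

    segmentation: 3D array marking labeled voxels (non-zero = labeled).
    annotation:   3D int array of the same shape, each voxel holding an
                  atlas region id (0 = outside the brain).
    Returns a dict {region_id: voxel_count}.
    """
    region_ids = annotation[segmentation > 0]            # region id of every labeled voxel
    ids, counts = np.unique(region_ids, return_counts=True)
    return {int(i): int(c) for i, c in zip(ids, counts) if i != 0}

# Toy volume: two hypothetical regions (ids 101 and 102), three labeled voxels
annotation = np.zeros((4, 4, 4), dtype=int)
annotation[:2] = 101
annotation[2:] = 102
segmentation = np.zeros_like(annotation)
segmentation[0, 0, 0] = 1
segmentation[3, 3, 3] = 1
segmentation[3, 2, 1] = 1
print(count_labels_per_region(segmentation, annotation))  # {101: 1, 102: 2}
```

`np.unique` with `return_counts=True` performs the whole tally in a single pass over the labeled voxels.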

QuickNII
QuickNII is a semi-automated open-source software tool that assigns a spatial location to serial section images. In particular, it performs affine spatial registration of sectional image data to a 3D reference atlas coordinate framework. It enables the manual selection of the best-fitting atlas slice for a given experimental slice by visually maximizing the overlap of the two slice images.
QuickNII produces customised atlas maps adapted to match the cutting plane and proportions of the sections. The location is defined by superimposing the reference atlas onto the section images (anchoring). The cutting angle of the reference atlas is adjusted to match the section plane using an affine transformation, followed by a manual adaptation of the atlas section to match the experimental section.
Then, the transformation of pixel coordinates to the corresponding voxel coordinates in the atlas, generated by QuickNII, is given by

(x_v, y_v, z_v) = o + (x_p / w)(u − o) + (y_p / h)(v − o),

where (x_v, y_v, z_v) are 3D voxel coordinates in atlas space, (x_p, y_p) are 2D pixel coordinates of the experimental slice, u, v, o correspond to the top right, bottom left and top left corners of the image in atlas space, respectively, and w, h correspond to the width and height of the registered image.
In the case of non-linear refinements applied on top of the linear registration, the experimental slice pixel coordinates are non-linearly transformed to match the pixel coordinates of the corresponding atlas slice, (x̂_p, ŷ_p) (see section S1.2.3), and the above equation is applied to the transformed coordinates instead.
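Under the corner-based convention described above (o, u and v as the top-left, top-right and bottom-left corners of the registered image in atlas space), the pixel-to-voxel mapping can be sketched as follows. This is an illustrative implementation, not QuickNII's own code, and the corner coordinates in the example are hypothetical:

```python
import numpy as np

def pixel_to_voxel(xp, yp, o, u, v, w, h):
    """Map a 2D section pixel (xp, yp) to 3D atlas voxel coordinates.

    o, u, v: 3D atlas-space positions of the top-left, top-right and
             bottom-left corners of the registered image (hypothetical here).
    w, h:    width and height of the registered image in pixels.
    """
    o, u, v = (np.asarray(a, dtype=float) for a in (o, u, v))
    return o + (xp / w) * (u - o) + (yp / h) * (v - o)

# Sanity check: the image corners map onto the stated atlas corners
o, u, v = [0, 0, 50], [100, 0, 50], [0, 80, 50]
print(pixel_to_voxel(0, 0, o, u, v, 200, 160))      # -> top-left corner o
print(pixel_to_voxel(200, 160, o, u, v, 200, 160))  # -> opposite corner u + v - o
```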

ilastik
Ilastik is a user-friendly tool that performs image analysis using machine learning methods without requiring methodological expertise from the user. It solves multiple image analysis problems with the use of supervised classification algorithms. Classification focuses on training models to learn how to assign data to two or more categories by providing them exemplar input data that have been a priori associated with certain categories, followed by testing the performance of the model on previously unseen examples (Tan et al., 2005; Bishop, 2006). One of the major workflows of ilastik is the pixel classification workflow, which aims to produce a characterization of the pixels in an input image that is of significance to the user. This workflow segments images in a semantic fashion: it associates a label, defined by the user through a graphical user interface, with each image pixel. Given the semantic classes to which the labels refer, the user provides examples within the image that correspond to the different classes by painting each example with a graphical brush in a colour associated with the class. Ilastik then estimates the probability that each pixel belongs to a class. In terms of technical implementation, the input of this workflow is the output of a number of filters applied to the image, which have also been selected by the user. The classifier is typically a Random Forest of 100 decision trees (Breiman, 2001), motivated by the low number of required parameters and its high performance in image classification tasks (Tan et al., 2005).
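The same scheme can be approximated with off-the-shelf libraries. The sketch below uses scikit-learn and SciPy as stand-ins for ilastik's internals (it is not ilastik's code): a small filter bank provides per-pixel features on a synthetic image, a 100-tree Random Forest is trained on a handful of sparsely "brushed" pixels, and a per-pixel foreground probability is predicted:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, gaussian_laplace
from sklearn.ensemble import RandomForestClassifier

# Synthetic image with a bright "foreground" blob
rng = np.random.default_rng(0)
image = rng.random((64, 64))
image[20:40, 20:40] += 2.0

def features(img):
    """Per-pixel filter bank: raw intensity, Gaussian smoothing, Laplacian of Gaussian."""
    return np.stack([img,
                     gaussian_filter(img, 2),
                     gaussian_laplace(img, 2)], axis=-1).reshape(-1, 3)

X = features(image)
labels = np.full(image.size, -1)        # -1 = unlabeled pixel
labels[image.ravel() > 2.0] = 1         # sparse "brushed" foreground annotations
labels[image.ravel() < 0.5] = 0         # sparse "brushed" background annotations
mask = labels >= 0

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X[mask], labels[mask])
prob = clf.predict_proba(X)[:, 1].reshape(image.shape)  # P(foreground) per pixel
print(prob[30, 30] > 0.5, prob[5, 5] < 0.5)
```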

VisuAlign and reverse registration
VisuAlign is a sibling tool to QuickNII that provides a user interface for refining an already existing 2D affine registration in a non-linear fashion. It has been specifically developed for curating the spatial variability remaining after a linear registration of an experimental section image to an atlas-derived one. VisuAlign thus employs user-generated landmarks in the user interface to perform a non-linear transformation, or "warping", of the atlas slice to the experimental one.
To understand how the reverse non-linear registration of an experimental section to an atlas section was achieved, it is first important to clarify the implementation of the forward registration. For that purpose, let us label the pixel coordinates of the experimental section as the "to" coordinates, and the corresponding atlas section ones as the "from" coordinates. The forward registration has been selected as the default option in VisuAlign's user interface because it eliminates the issue of folding, whereby a single point in the "from" space can transform into multiple points in the "to" space.
VisuAlign utilizes the Delaunay triangulation technique, implemented by the incremental Bowyer-Watson algorithm (Bowyer, 1981; Watson, 1981). First, a rectangle is used to represent the original pixel coordinate space, on which the experimental and atlas sections are overlaid. The rectangle is defined by the width and height of the experimental section, spanning [−0.05w, 1.05w] × [−0.05h, 1.05h] with corners P0, P1, P2, P3, where w and h correspond to the width and height of the pixel coordinate space. The rectangle occupies a 10% larger space than the original pixel space to allow for a wider range of overlay between the two sections. For the triangulation to take place, the rectangle is partitioned into two triangles, formed by [P0, P1, P2] and [P1, P2, P3], respectively. The formation of the two triangles allows each pixel inside the image to be represented via barycentric coordinates (λ1, λ2, λ3), which return the original pixel coordinates when their dot product is taken with the triangle vertices. This barycentric coordinate representation results in a computationally efficient mapping between the "to" and "from" coordinate systems.
The forward transformation is thus achieved by estimating, for each pixel, the respective "from" coordinate, followed by transforming the "from" coordinates to the "to" coordinates in their barycentric form. Once the forward mapping has been completed, the reverse transformation from a "to" point to a "from" point can be achieved in the reverse fashion.
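The barycentric mapping between two triangles can be sketched as follows. This is an illustrative implementation with toy triangles, not VisuAlign's actual code: a point is expressed in barycentric coordinates relative to one triangle, and the same weights applied to the other triangle's vertices carry it across:

```python
import numpy as np

def barycentric(p, a, b, c):
    """Barycentric coordinates (l1, l2, l3) of point p in triangle (a, b, c)."""
    T = np.array([[a[0] - c[0], b[0] - c[0]],
                  [a[1] - c[1], b[1] - c[1]]], dtype=float)
    l1, l2 = np.linalg.solve(T, np.asarray(p, float) - np.asarray(c, float))
    return l1, l2, 1.0 - l1 - l2

def apply_map(p, tri_from, tri_to):
    """Carry p from one triangle to the other with identical barycentric weights."""
    weights = barycentric(p, *tri_from)
    return sum(w * np.asarray(v, float) for w, v in zip(weights, tri_to))

tri_from = [(0, 0), (1, 0), (0, 1)]
tri_to   = [(0, 0), (2, 0), (0, 2)]                # same triangle, scaled by 2
print(apply_map((0.25, 0.25), tri_from, tri_to))   # -> (0.5, 0.5)
```

Because the weights sum to one and are affine-invariant, the dot product of the weights with the vertex positions reproduces the point exactly, which is what makes the "to"/"from" lookup cheap.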

DeepSlice
DeepSlice is a deep neural network that has been trained on thousands of histological images to automatically register coronal section images to the Allen Mouse Brain CCF v3.0 (Carey et al., 2022). It uses the Xception front end (Chollet, 2017), a deep architecture of convolutional layers that has performed well in image classification, combined with a new back end that predicts where the corners of a slice are placed within the Allen Mouse Brain Atlas (origin, left edge, bottom edge). The model was trained on a variety of brain section images registered to the Allen CCF v3.0 by the Allen image registration pipeline (see 1), as well as on synthetic slices generated from Nissl and serial two-photon image volumes from the Allen Institute. Performance improves when the slice index (from the experiment) and the cutting angle are integrated across the slices.
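The benefit of integrating predictions across a section series can be illustrated with a toy example (this is an illustrative sketch, not DeepSlice's actual post-processing): since slices are cut at a fixed interval, fitting a line of predicted anterior-posterior position against section index pools information across the whole series and enforces even spacing:

```python
import numpy as np

# Hypothetical series of 10 coronal sections cut at a fixed interval
section_index = np.arange(10)
true_ap = 2.0 + 0.5 * section_index                  # evenly spaced ground truth
rng = np.random.default_rng(1)
predicted_ap = true_ap + rng.normal(0, 0.3, 10)      # noisy per-slice predictions

# Pool information across slices: linear fit of position vs. section index
slope, intercept = np.polyfit(section_index, predicted_ap, 1)
refined_ap = intercept + slope * section_index       # refined, evenly spaced positions

print(np.abs(predicted_ap - true_ap).mean(), np.abs(refined_ap - true_ap).mean())
```

The refined positions are evenly spaced by construction, and the fit typically reduces the per-slice positional error relative to the raw predictions.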

Flatmap creation
To provide a 2D visualization of neuroanatomical information within the spatial context of the cortex, in the form of a flatmap, the following two-step procedure was followed (Harris et al., 2019). First, a curved cortical coordinate system was defined, followed by a smooth mapping of the cortical surface to a flatmap. For the first step, the orthogonal path between the pial surface and the white matter was computed using the Laplace equation (Griffiths, 2013). This resulted in the creation of streamlines as the third axis of the cortical coordinate system, which match columnar cortical structures and provide information about cortical depth. Data could thus be averaged along the streamlines over the desired range of cortical depths, leading to a 2D embedding of the data. For the second step of the flatmap creation, the geodesic distance was computed from each point of the surface to pairs of anchor points that formed the two respective axes of the 2D embedding. The position of each cortical source point in the flatmap was inferred by finding the flatmap point at which the radial distances from the anchor points in 2D equal the geodesic distances between the source point and the anchors in 3D.
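The anchor-matching step of the second stage amounts to a circle intersection in the plane. The sketch below is illustrative (hypothetical anchors and distances, not the published implementation): it places a point in the flatmap so that its planar distances to two anchors equal the given 3D geodesic distances:

```python
import numpy as np

def flatmap_position(anchor_a, anchor_b, d_a, d_b):
    """Intersect two circles of radii d_a, d_b around the 2D anchors.

    Of the two mirror-symmetric intersections, the one with a
    non-negative perpendicular offset is kept (an assumption of
    this sketch; a real mapping would fix the sign consistently).
    """
    a, b = np.asarray(anchor_a, float), np.asarray(anchor_b, float)
    d = np.linalg.norm(b - a)
    x = (d**2 + d_a**2 - d_b**2) / (2 * d)   # offset along the a->b axis
    y = np.sqrt(max(d_a**2 - x**2, 0.0))     # offset perpendicular to it
    ex = (b - a) / d
    ey = np.array([-ex[1], ex[0]])
    return a + x * ex + y * ey

# A point 5 units from (0, 0) and 5 units from (6, 0) lands at (3, 4)
print(flatmap_position((0, 0), (6, 0), 5, 5))
```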

EUAL parcellation
The Enhanced and Unified Anatomical Labeling (EUAL) parcellation is a unified anatomical labeling system that integrates the labeling of the Franklin-Paxinos (FP) atlas (Paxinos and Franklin, 2019) into the Allen CCF v3.0 (Chon et al., 2019). The aim of this system is to minimize discrepancies in the anatomical delineations of the mouse brain between the two predominant atlases, namely FP and ARA. The integration was achieved by first identifying and aligning corresponding points between the FP and ARA labels. They then used spatial variations in volumetric cell density data, obtained from transgenic cre-line mice (Kim et al., 2017), as a proxy for further subdividing ARA labels whose corresponding FP labels have a higher degree of subdivision. In addition, they validated the parcellation by performing a topographical analysis of connectivity patterns obtained from tract-tracing data (Oh et al., 2014; Hintiryan et al., 2016; Harris et al., 2019).

SBA Composer
The Scalable Brain Atlas (SBA) Composer is a web-based application for displaying brain imaging, volumetric and 3D-object-based data embedded within a number of available brain atlases (Bakker et al., 2015). It can be used directly by users or by third-party websites as a visualization front end (see Main table 1 for more details). It uses the anatomical parcellation of the Allen Brain Atlas, the areas of which can be rendered alongside the imported data with user-selectable colour and transparency parameters. Moreover, the user can navigate with the keyboard and mouse by rotating, scaling and translating the rendered brain objects in the generated scene. Each registered population is visualized in SBA as a cloud of cubes that represents the 3D coordinates of 10 µm voxels containing labeled axons in RAS space (Left-Right, Posterior-Anterior, Inferior-Superior, with origin at the anterior commissure). To deal with the sparsity of labeled axons along the anterior-posterior axis due to the inter-spacing of coronal slices, a random shift is introduced along this axis to spread the cubes more evenly across the template. A pitfall of this approach is that at 10 µm resolution the point cloud is converted to small 3D cubes that are hard to distinguish; to increase cube visibility, the linear size of each cube is multiplied by a factor of 20. For storing the data, the Extensible 3D (x3d) file format is used (see 1). Under the x3d format, the data are uploaded to SBA via an Application Programming Interface.
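The jitter-and-scale preparation of the cube cloud can be sketched as follows. This is an illustrative example, not the pipeline's exact code: the voxel coordinates are hypothetical, the inter-slice spacing is an assumed value, and the factor of 20 is the visibility scaling mentioned above:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical labeled-axon voxel coordinates in 10 um RAS order (LR, PA, IS)
voxels = np.array([[120, 80, 60],
                   [121, 80, 60],
                   [125, 82, 61]], dtype=float)

SLICE_SPACING = 10   # assumed inter-slice spacing, in units of 10 um voxels
CUBE_SCALE = 20      # visibility enlargement factor for the rendered cubes

# Random shift along the anterior-posterior axis (second coordinate) fills
# the gaps between coronal slices
jittered = voxels.copy()
jittered[:, 1] += rng.uniform(-SLICE_SPACING / 2, SLICE_SPACING / 2, len(voxels))

# Each 10 um voxel is rendered as a cube with an enlarged edge length
cube_edge = 1.0 * CUBE_SCALE
print(cube_edge)   # 20.0
```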

Morphology selection
The MouseLight database comprises ∼1500 complete morphological reconstructions of single neurons (Economo et al., 2016). These neurons are primarily located in the thalamus, motor cortex, subiculum and hypothalamus, and their morphologies were reconstructed with a high-throughput procedure involving viral projection labeling, high-resolution imaging and anatomical segmentation. Single neurons were first labeled by injecting 56-day-old (P56) male wild-type mice with two adeno-associated viruses (AAV): a low-titer AAV expressing cre-recombinase (AAV Syn-iCre) and a high-titer AAV coding for a green fluorescent protein (AAV CAG-Flex eGFP/tdTomato). To produce high-resolution images containing the labeled neurites, they used serial two-photon tomography with an integrated vibratome. For tracing and reconstructing the neurons, they implemented a semi-automated algorithm for the segmentation and reconstruction of the soma, axons and dendrites that required manual intervention for distinguishing branches at intersection points (Winnubst et al., 2019).
The Braintell database comprises ∼1700 fully reconstructed single-neuron morphologies from cortex, claustrum, thalamus and striatum (Peng et al., 2021). This repository is the product of a pipeline designed for labeling, imaging, reconstructing, registering and analysing single neurons from these areas. To label single cells, they used a combination of two transgenic mouse lines, namely the GFP-expressing Ai139 or Ai140 TIGRE2.0 reporter and the TIGRE-MORF reporter (Madisen et al., 2015; Daigle et al., 2018; Veldman et al., 2020), which leads to highly sparse labeling of the complete axonal and dendritic arborization of a cell. They thus performed anterograde tracing on ∼140 P56 male and female transgenic mice. They imaged whole brains for each experiment using the fluorescence micro-optical sectioning tomography (fMOST) imaging platform (Li et al., 2010), which combines an epifluorescence microscope with a mechanical sectioning system. For each imaged 3D brain, they automatically reconstructed the underlying neuronal morphologies using the Vaa3D open-source software (Peng et al., 2014), and the reconstructions were then registered to the CCF with the mBrainAligner tool (Peng et al., 2011). They then used this pipeline to characterize the morphological diversity of the aforementioned areas in terms of projection-pattern-specific cell types. These projection-specific types were found to be correlated with their transcriptomic composition, the divergence/convergence of their projections, layer specificity in the case of cortical neurons, and somatotopic organization.
For integrating single-neuron morphological reconstructions in our analysis, we performed the following data retrieval and pre-processing steps. We first downloaded all VPM neurons from their respective repositories in the form of swc files (see 1 for the respective hyperlinks); the swc file format is the standardized format for representing morphological reconstructions (Stockley et al., 1993). We only selected morphologies whose soma was located in the VPM. When selecting neurons from the Braintell database, we applied a strict selection criterion based on the quality of the registration procedure: because the fully automated reconstruction pipeline can introduce errors, the soma position of each Braintell neuron had been manually corrected, and we only used neurons whose corrected soma position lay within the boundaries of VPM and that were confirmed as VPM neurons by the authors in their supplementary tables. Following this selection criterion, we retrieved 257 MouseLight and Braintell morphologies in total for analysis.
We loaded each morphology as a list of two arrays, in which the first contained the anatomical coordinates of each point in the CCF and the second contained additional information about each point, such as its identity (soma, dendrite, axon), its index in the first array and the index of its parent point, where a parent point corresponds to its direct ancestor in the morphological tree and can belong to an axonal branch, a dendritic branch or the soma. We first had to ensure, by means of an affine transformation, that all neurons were represented in the same orientation system as the CCF, which is PIR (Anterior-Posterior, Superior-Inferior, Left-Right, with origin at the Anterior-Superior-Left corner) at 10 µm resolution, the highest available resolution of the ARA template. Braintell neurons are oriented in PIR at 1 µm resolution, with only a small fraction at 25 µm resolution; we thus re-scaled the anatomical coordinates of the two Braintell groups by factors of 10 and 0.4, respectively. MouseLight neurons are oriented in LIP (Right-Left, Superior-Inferior, Anterior-Posterior, corner origin) at 1 µm resolution; for the anatomical coordinates of each neuron, we switched the first and the third coordinate, mirrored the third coordinate and re-scaled all coordinates by a factor of 10.
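The MouseLight reorientation can be sketched as follows. This is an illustrative implementation of the swap-mirror-rescale steps described above, not our pipeline's exact code; the input point is hypothetical, and the 10 µm CCF extent of (1320, 800, 1140) voxels along (P, I, R) is an assumption of this sketch:

```python
import numpy as np

def mouselight_to_pir(coords_um, template_shape_10um):
    """Convert MouseLight LIP coordinates (1 um) to PIR at 10 um.

    Steps: swap the first and third axes (LIP -> PIL), re-scale from
    1 um to 10 um voxels, then mirror the third axis (L -> R) against
    the template extent along that axis.
    """
    c = np.asarray(coords_um, dtype=float)[:, [2, 1, 0]]   # LIP -> PIL axis order
    c = c / 10.0                                           # 1 um -> 10 um voxels
    c[:, 2] = template_shape_10um[2] - c[:, 2]             # mirror the L axis to R
    return c

# Hypothetical point (L, I, P) in um, with an assumed CCF extent of
# (1320, 800, 1140) 10 um voxels along (P, I, R)
pts = np.array([[2000.0, 3000.0, 4000.0]])
print(mouselight_to_pir(pts, (1320, 800, 1140)))   # -> [[400. 300. 940.]]
```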
Since we were exclusively interested in the axonal terminal branches, we first identified the terminals by finding axonal points that had no children. We then extracted the branch of a given terminal by backtracking from the terminal point through its ancestor points until a branching point was reached. We thereby obtained a list of the axonal terminal branches of each neuron across all somatosensory areas.
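The terminal-branch extraction can be sketched on an SWC-like parent array. This is an illustrative implementation with a toy tree, not our pipeline's exact code; it assumes the SWC convention that type 2 marks axonal points:

```python
import numpy as np

def terminal_branches(parent, node_type, axon_type=2):
    """Extract axonal terminal branches from an SWC-like tree.

    parent[i]:    index of node i's parent (-1 for the root).
    node_type[i]: SWC point type (assumed here: 2 = axon).
    Returns, for each axonal leaf, the node indices from just after the
    nearest branching point down to the terminal.
    """
    parent = np.asarray(parent)
    # Number of children per node; leaves have zero
    children = np.bincount(parent[parent >= 0], minlength=len(parent))
    leaves = [i for i in range(len(parent))
              if children[i] == 0 and node_type[i] == axon_type]
    branches = []
    for leaf in leaves:
        branch = [leaf]
        node = int(parent[leaf])
        while node >= 0 and children[node] == 1:   # stop at a branching point
            branch.append(node)
            node = int(parent[node])
        branches.append(branch[::-1])
    return branches

# Toy tree: 0 (soma) -> 1 -> 2 -> {3 -> 4, 5}; nodes 1-5 are axonal
parent    = [-1, 0, 1, 2, 3, 2]
node_type = [1, 2, 2, 2, 2, 2]
print(terminal_branches(parent, node_type))   # [[3, 4], [5]]
```

Node 2 is a branching point (two children), so backtracking from terminal 4 stops after collecting node 3, and terminal 5 forms a one-node branch on its own.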

Fig. 1: Anatomical integration of registered population experiments with morphological reconstructions of long-range projecting single-neurons (LRPN) in the CCF suggests proximity between the two data modalities at the level of their source location inside VPM. (A-E) Maximum projection plots of VPM across the coronal plane (10 µm) that illustrate the proximity between the injection volume location of the populations and the soma location of LRPN neurons. The black contour lines delineate the anatomical borders of VPM according to the ARA parcellation (see section 2.6). The letters A to G label the populations that correspond to experiment ids 1-7. Red represents the source location of a population and orange the soma locations of proximal LRPN neurons.