Main

The brain functions as a network of chemical and electrical activity, so identifying how neurons connect across brain regions is central to understanding how the brain works, and how to treat brain diseases. Modern neuroscience techniques can image single neuron morphology at scale (Economo et al., 2016), and subsequent neuron tracing can help discover new morphological subtypes (Winnubst et al., 2019). Due to anatomical variation, and deformations that may have occurred during tissue preparation, neuron traces need to be mapped between coordinate spaces to compare morphologies from different brain samples. Brain registration software often includes neuron mapping implementations, but these implementations have not been thoroughly characterized from a numerical analysis perspective.

This question is relevant to the ongoing work of the international neuroscience community, including the Brain Initiative Cell Census Network (BICCN), to establish comprehensive neuronal atlases of the mammalian brain (BRAIN Initiative Cell Census Network, 2021). This effort has produced many images of stained or fluorescently labeled brains, which are being used to identify whole-brain connectivity patterns. Sometimes this data is analyzed with density-based methods to measure connectivity between brain regions (Watakabe et al., 2023; Athey et al., 2023). In other cases, when individual neurons can be resolved, it is possible to generate digital neuron traces for morphological analysis (Skibbe et al., 2018; Athey et al., 2022). This paper focuses on neuron traces, which are commonly stored as a set of connected 3D coordinates, or knots, such as in the SWC format (Stockley et al., 1993; Cannon et al., 1998). The connections between the knots are classically represented as cylinders (Cannon et al., 1998), or conical frustums (O’Halloran, 2020), but here we ignore radius information, since it is not generated by all neuron tracing methods. Consequently, the whole neuron trace is considered to be a tree of piecewise linear curves.

In order to assemble these traces into a complete picture of the various neuron morphologies in the brain, scientists need a way to map neuron traces into common coordinate systems. Several popular software applications exist for this task and are used to assemble atlases of neuron morphology. For example, Peng et al. (2021) used mBrainAligner (Qu et al., 2022), Gao et al. (2022) used the Computational Morphometry Toolkit, and the MouseLight project (Winnubst et al., 2019) used displacement fields from Fedorov et al. (2012). Existing methods use what we call zeroth order curve mapping in that they only map the positions of the knots (also known as trace points). However, depending on the nonlinearity of the mapping, and the continuous representation of the neuron trace, zeroth order mapping is sensitive to different samplings of the original neuronal curve (Fig. 1a,b). In other words, sampling the same curve different ways while tracing in the original image may lead to different mapped morphologies. It is critical that neuron mapping methods preserve the geometry of digital neuron traces in order to build reliable atlases of neuron morphology, and to accurately identify deviations in diseased brains.

In this work, we introduce a method to preserve derivative information when mapping neuronal curves, and investigate the conditions under which this technique is advantageous to existing methods (Figure 1). We applied our method to both simulated data and real neuron traces from a whole mouse brain image, and the code used developed in this work is freely available in our Python package brainlit.

Fig. 1
figure 1

Neglecting the action of a nonlinear mapping on a curve’s derivatives can introduce errors. a-b Different samplings of a curve can lead to different results under nonlinear deformations, such as only sampling the endpoints (a) versus sampling several times along the curve (b). c-d Large distances between trace points can contribute to mapping inaccuracies. The green line segment following cortical layers 2/3 in a synthetic mouse brain image (c) is defined only by its endpoints. Transforming only the positions of the endpoints (zeroth order mapping, d), is less accurate than incorporating the action on the derivatives as well (first order mapping, d). e-f Quantitative descriptions of the mapping from target to atlas via the logarithm of the Jacobian determinant, which quantifies expansion and compression (e), and the spectral norm of the displacement field, which plays a role in an error bound of zeroth order mapping (f)

Results

Action of Diffeomorphisms on Discrete Samplings

In the following sections, we use \(C^k\) to represent the space of continuous functions with k continuous derivatives, where the domain and range can be inferred by the context. We model a neuronal branch (dendrite or axon) as a regular 3D curve \(c:[0,L] \rightarrow \mathbb {R}^3\), \(c \in C^k\), and \(\vert \dot{c} \vert > 0\), where L, without loss of generality, is the arc length of the curve (Younes, 2010). When a neuronal curve is traced, it is typically stored as a sequence of points \(\{x_i=c(t_i): t_i < t_{i+1}\}_{i=1}^n\), where the independent variables \(t_i\) can be taken to be the indices of the points. When there is a diffeomorphism between coordinate systems \(\phi : \mathbb {R}^3\rightarrow \mathbb {R}^3\), these traces are mapped via the group action:

$$\begin{aligned} \phi \cdot \{x_i\}_{i=1}^n&= \{\phi (x_i)\}_{i=1}^n \end{aligned}$$

We want to extend the space of traces, and the associated action, to include derivatives of the underlying curve denoted \(\partial _t c\). This can be done using the jet space \(J^k\). In our setting, \(J^k=[0,L] \times X^{(k)}\), where an element of \(X^{(k)}\) is a \(k+1\)-tuple \((x^0,x^1,...,x^k) \in (\mathbb R^3)^{k+1}\) representing a position and first k derivatives of a curve in \(\mathbb R^3\). A \(C^k\) curve \(c: [0,L] \rightarrow \mathbb R^3\) can be extended to a curve \(\hat{c}: [0,L] \rightarrow X^{(k)}\) simply by adding derivatives, with \(\hat{c}(t) = (c(t), \partial _t c(t), \ldots , \partial _t^k c(t)) \in X^{(k)}\) (Olver, 1995).

The \(C^k\) diffeomorphisms have a natural group action on the jet space \(J^k\), ensuring the commutation between the standard action of diffeomorphisms on curves, \((\phi , c) \mapsto \phi \circ c\) and their extensions, such that the identity \(\phi \cdot \hat{c}(t) = \widehat{\phi \circ c}(t)\) holds for all curves c and times t, defining the left-hand side. For example, for \(k=2\), this provides

$$\phi \cdot (t, x^{0}, x^{1}, x^{2}) = (t, \phi (x^{0}), D\phi (x^{0}) x^{1}, D\phi (x^{0}) x^{2} + D^2\phi (x^{0}) (x^{1}, x^{1}))$$

Neuron traces, as mentioned before, involve a sequence of samples with time-stamps \(\{(t_i,x^{(k)}_i)\}_{i=1}^n\), identified as elements of \((J^k)^n\), the n-fold Cartesian product of \(J^k\). Our diffeomorphisms will act on such a sequence as follows:

Statement 1

For a sequence of time-stamped elements on the jet space, \(T = \{(t_i, x_i^{(k)})\}_{i=1}^n\) in \((J^k)^n\), we define the action of diffeomorphisms

$$\begin{aligned} \phi \cdot T = \{(t_i, \phi \cdot x_i^{(k)})\}_{i=1}^n \end{aligned}$$
(1)

The fact that this operation provides an action is an established result (Olver, 1995), and the proof is provided in the Supplement. We will define k’th order discrete mapping to be the action in Eq. 1 of a diffeomorphism on a curve sampling that includes k derivatives. The axioms that define group actions are important to verify because they ensure that applying the identity transformation does not change the object, and that applying a composition of transformations is equivalent to applying the individual transformations successively. Further, group actions can exchange mathematical structure between the acting group and the set being acted upon, and they are at the core of several important theorems (Suksumran, 2016).

The k’th order discrete mapping method allows us to compute the first k derivatives of the transformed curve. We will interpolate the transformed curve using splines of order \(2k+1\) that satisfy the derivative values. For example, zeroth order mapping will produce a first order spline and first order mapping will produce a cubic Hermite spline (Spitzbart, 1960).

Error Analysis of Zeroth and First Order Mapping

Now we will examine the error introduced by zeroth order mapping, which is used by existing neuron mapping methods. First, note that under affine transformations, zeroth order mapping of piecewise linear curves introduce no error, so these results are only useful under non-affine transformations. The following results require that the curve c be parameterized by arc length. However, we note that all continuously differentiable regular curves can be reparameterized by arc length (Smale, 1958). We use \(\vert \cdot \vert\) to denote the Euclidean norm for elements of \(\mathbb {R}^d\), and the spectral norm for matrices.

Proposition 1 [Zeroth Order Mapping Error Bound]

Say \(\phi : \mathbb {R}^3 \rightarrow \mathbb {R}^3\) is a \(C^1\) diffeomorphism and \(c: [0,L] \rightarrow \mathbb {R}^3\) is a continuous, piecewise linear curve parameterized by arc length with knots \(\{t_i: t_1=0, t_n=L, t_{i-1} < t_i\}_{i=1}^n\). For the transformed curve \(f=\phi \, \circ \, c\), the zeroth order mapping defines a first order spline g which satisfies:

$$\begin{aligned} \max _{t \in [0,L]} \vert f(t)-g(t)\vert&\le \max _{i \in \{0,...,n\}, t \in [t_{i-1}, t_i]} \frac{1}{2} \left( \vert D\phi \, \circ \, c(t) - I \vert \vert t_i-t_{i-1} \vert + \vert \epsilon _i - \epsilon _{i-1}\vert \right) \end{aligned}$$
(2)

where \(\epsilon _i\triangleq c(t_i)-\phi (c(t_i))\) and \(D\phi \, \circ \, c(t)\) is the Jacobian of \(\phi\) evaluated at c(t).

This shows how the error introduced by the state of the art mapping method is related to the displacement magnitude, \(\epsilon\), and the extent to which the Jacobian of the transformation, \(D\phi\), differs from the identity matrix. Note that the bound in Eq. 2 goes to zero as \(\phi\) approaches the identity map (in which case zeroth order mapping has zero error for piecewise linear curves). It depends on the arc lengths of the original curve segments and the spectral norm of \(D\phi\), which is related to the finite time Lyapunov exponent (\(\log \vert D\phi \vert\)), a well-known quantity in field dynamics which characterizes the amount of stretching in a differentiable flow. Also, the bound applies to \(\max _{t \in [0,L]} \vert f(t)-g(t)\vert\), which is not parameterization invariant, and therefore not a strictly geometric quantity. However we note that this quantity is an upper bound of the Frechet distance, which is parameterization invariant.

In this paper we demonstrate first order mapping in an effort to mitigate this mapping error. Such a method has the advantage of having superior error convergence at the knots as a consequence of Taylor’s theorem. Further, we present a set of error bounds that helps clarify the advantage of first order mapping.

Proposition 2 [Comparable Bounds for Zeroth and First Order Mapping]

Say \(\phi : \mathbb {R}^3 \rightarrow \mathbb {R}^3\) is a \(C^4\) diffeomorphism and \(c: [a,b] \rightarrow \mathbb {R}^3\) is a continuous, piecewise \(C^4\) curve parameterized with knots \(\{t_i: t_1=a, t_n=b, t_{i-1} < t_i\}_{i=1}^n\). For the transformed curve \(f=\phi \circ c\) defined by coordinate functions \(f=(f^0,f^1,f^2)^T\), the zeroth order mapping defines a first order spline \(g_0\) which satisfies:

$$\begin{aligned} \begin{aligned} \max _{t \in [a,b]} \vert f(t)-g_0(t)\vert&\le \frac{\sqrt{3}}{4} \max _{t \in [a,b], j \in \{0,1,2\}} \vert \partial ^{(4)}_t f^j(t)\vert \left( \frac{\delta }{2}\right) ^4 + \\&\frac{\sqrt{3}}{2} \left( \frac{\delta }{2}\right) ^2 \max _{i \in \{1...n\}, j \in \{0,1,2\}} \vert \partial ^{(3)}_t f^j(t_i)\vert \left( \frac{\delta }{2}\right) + \\&\frac{\sqrt{3}}{2} \left( \frac{\delta }{2}\right) ^2 \max _{i \in \{1...n\}, j \in \{0,1,2\}} \vert \partial ^{(2)}_t f^j(t_i)\vert \end{aligned} \end{aligned}$$
(3)

where \(\delta \triangleq \max _{2 \le i\le n} \vert t_i-t_{i-1}\vert\) and \(\partial ^{(k)}_t f^j(t)\) is the k’th derivative of \(f^j\) evaluated at t. Also, the first order mapping defines a third order spline \(g_1\), which satisfies

$$\begin{aligned} \max _{t \in [a,b]} \vert f(t)-g_1(t)\vert&\le \frac{\sqrt{3}}{4!} \max _{t \in [a,b], j\in \{0,1,2\}} \vert \partial ^{(4)}_t f^j(t)\vert \left( \frac{\delta }{2}\right) ^4 \end{aligned}$$
(4)

and we note that the bound in (4) is tighter than the bound in (3). Further, there exists a transformed curve f and a set of knots \(\{t_i\}_{i=1}^n\) that achieves both bounds exactly.

Thus, we have made a connection between the state of the art (zeroth order mapping) and a higher order method (first order mapping) via worst-case bounds on mapping error. The error bound for first order mapping is smaller than that for zeroth order mapping, though for any given curve, either method may produce smaller error than the other. Proofs for the propositions are in the supplement.

Software Implementation

We implemented both zeroth and first order discrete mapping in our our open-source Python package brainlit. For first order mapping, we compute one-sided derivatives at the knots of the curve from first order splines in accordance with original SWC formulation (Stockley et al., 1993; Cannon et al., 1998). Then, once the knot positions and derivatives are transformed, we generate a continuous curve in the new space using Hermite interpolation. Further details of our implementation can be found in the Methods.

First order mapping involves more computations than zeroth order mapping. First, if the Jacobian of the transformation, \(D\phi\), is not immediately available, it needs to be approximated. Our software uses a finite difference method which, to approximate \(D\phi (x)\), involves calling \(\phi\) four times, three vector addition operations, and three vector scaling operations. Next, the derivatives at the knots need to be approximated and transformed by \(D\phi\). For each line segment, our method computes one vector addition and one vector scaling to approximate one-sided derivatives, and computes two matrix-vector products to transform the derivatives (Algorithm 3). Lastly, generating and evaluating cubic splines involves more computations than first-order splines (Kincaid & Cheney, 2002). Despite these differences, first order mapping still scales linearly with the number of trace nodes, and the number of computations differs from zeroth order mapping by a constant factor of computations. In practice, both zeroth order and first order mapping take on the order of seconds for traces with thousands of nodes, which should not be a bottleneck in neuron morphology studies.

Figure 2 shows examples of our method on simulated data, compared to the zeroth order method, and the “ground truth” where we map a dense sampling of points along the first order spline of the original curve.

Fig. 2
figure 2

Preserving derivative information can mitigate errors when transforming discretized curves. a-b Applying a nonlinear deformation field to a single line segment (a) using zeroth and first order mapping (b). c-d Applying a nonlinear deformation field to a piecewise linear curve (c) using zeroth and first order mapping (d). Zeroth and first order discrete mapping methods are shown relative to ground truth considered to be the application of the vector field to a dense sampling of the original curves

Application to Real Neurons

We applied our method to 20 reconstructed neurons in SWC format from a whole mouse brain image from the Janelia MouseLight project (Winnubst et al., 2019). We selected the first 20 SWC files that successfully downloaded from MouseLight’s NeuronBrowser repository and did not have repeat trace nodes. Neurons have a tree-like structure, and we split them into non-branching curves in order to apply our mapping methods. We follow a method introduced previously (Athey et al., 2021) where the root to leaf path with the longest arc length is recursively removed until the tree is reduced to non-bifurcating “branches”. The transformed branches are then reconnected to maintain the topology of the original trace, and therefore our methods generalize naturally from 3D curves, to the branching structure of neuron traces.

We generate random transformations using the Large Deformation Diffeomorphic Metric Mapping (LDDMM) framework described in Miller et al. and applied in Tward and Miller (Miller et al., 2006; Tward & Miller, 2017). We generate an initial momentum field by sampling Gaussian noise with zero mean and varying standard deviation, \(\sigma\). The momentum is smoothed to construct a velocity field, and integrated in time according to the conservation laws established in Miller et al. to generate a diffeomorphic transformation (Miller et al., 2006). We generated four diffeomorphisms with \(\sigma\) levels of \(80\), \(160\), \(320\) and \(640\) \(\mu m/\text {time}\). The position and tangent displacement profiles of these four diffeomorphisms are shown in Figure 3a. We centered the neuron traces at the origin then applied the random diffeomorphisms to compare zeroth and first order mapping to ground truth (Fig. 3b-g). Ground truth was generated by upsampling the original traces to a maximum node spacing of \(2 \mu m\) followed by zeroth order mapping.

Fig. 3
figure 3

Application of zeroth and first order mapping of neuron traces under diffeomorphisms derived from random Gaussian initial momenta. a Different values of \(\sigma\) produced diffeomorphisms with different position and tangent vector displacement profiles. The positions and tangent vectors sampled in the histogram were distributed as a uniform grid with a spacing of \(500 \mu m\). b-g Two examples of the diffeomorphism with \(\sigma =640\) applied to neuron traces to produce zeroth and first order mappings, along with ground truth. Both examples show the original trace and the transformation (b, e), the results of the different transformation methods (c, f), and a zoomed in view of the region outlined by the dotted line to show discrepancies between the methods (d, g). Plot axes are in units of microns

For each neuron trace, we computed the discrete Frechet error from ground truth (Fig. 4a). We also wanted to measure which mapping method better matched the ground truth with respect to a neuron’s distribution of common morphometric quantities, such as path angle, branch angle, tortuosity, and segment length. We used the Kolmogorov-Smirnov test statistic to measure how much the distribution of these quantities differed from ground truth (Fig. 4b). We performed two-sided Wilcoxon signed-rank tests for each comparison and used a Bonferroni correction across the different \(\sigma\) values (Fig. 4b). Lastly, we compared the discrete Frechet errors to the average sampling period of the trace i.e. the average distance between trace nodes (Fig. 4c).

Fig. 4
figure 4

Comparison of zeroth and first order mapping of neuron traces under random diffeomorphisms. a Discrete Frechet error was computed between the different order mappings, and ground truth. b Distributions of common morphometric quantities were compared to that of ground truth using the Kolmogorov-Smirnov test statistic. Differences between zeroth and first order methods were tested using Wilcoxon signed-rank test with Bonferroni correction across different values of \(\sigma\) (\(*: p \le 0.05\), \(**: p \le 0.01\), \(***: p \le 0.001\), \(****\, p\le 0.0001\)). Box plots show median, upper and lower quartiles and whiskers have a maximum length of 1.5x the interquartile range with other outlier data marked with points. c Relationship between discrete Frechet error and average sampling period (distance between trace points) under the random diffeomorphisms

To explore the effect of downsampling neuron traces on mapped morphologies, we identified non-branching nodes in straight portions of the trace, and measured the impact of removing those nodes from the trace. Specifically, we performed both zeroth and first order mapping on the segment with the node removed, and compared it to the ground truth mapping of the original segment. We determined which fraction of nodes maintained a discrete Frechet error less than one micron, serving as an estimate for the fraction of nodes which are not necessary to maintain the mapped morphology (Fig. 5c). We performed two-sided Wilcoxon signed-rank tests to compare zeroth and first order methods and used a Bonferroni correction across the different values of \(\sigma\).

Fig. 5
figure 5

Counting how many nodes in MouseLight neuron traces can be removed without affecting the mapped morphology. a For each non-branching node with path angle above 170 degrees, we generated a line segment with that node removed. b We performed first order mapping on the downsampled line segment and compared the result with the ground truth mapping of the original curve. c For each mapped neuron trace, we determined the fraction of nodes where the discrete Frechet error is less than or equal to one micron under the four random diffeomorphisms. Box plots show median, upper and lower quartiles and whiskers have a maximum length of 1.5x the interquartile range with other outlier data marked with points. We performed Wilcoxon signed-rank test between the paired neuron traces at \(\alpha =0.05\) with Bonferroni correction across different values of \(\sigma\) (\(*: p \le 0.05\), \(***: p \le 10^{-3}\))

Discussion

In this paper we examine the “naive” approach to mapping discretely sampled one-dimensional structures by simply transforming the positions of the knots, i.e. mapping line segments to line segments. We show that this method can be inaccurate when the Jacobian of the transformation is non-constant. We describe how to preserve derivative information which will lead to more accurate mappings in neighborhoods of the knots. We offer an implementation of a first-order mapping technique which, empirically, is more accurate on discretely sampled differentiable curves. Our mathematical framework is described in terms of 3D curves, but we also show how our method can handle branching. We apply our method to real neuron reconstructions and show that it more accurately matches ground truth in both frechet error, and a variety of morphometric quantities.

In our experiment with real neuron reconstructions, it is important to note what we are considering ground truth. Since the original reconstructions are in SWC format, only the knot positions are known, and the neurons are typically represented as piecewise linear structures. Real neuron morphologies are not piecewise linear, and instead are continuously curving as they pass through dense brain tissue. Nonetheless, because we have no further information about the neuron trajectories, we consider the original reconstructions to be piecewise linear, and generate the ground truth mappings by transforming the straight lines between the knots.

The transformations in our experiments were generated by “shooting” a random initial momenta field (Miller et al., 2006). In neuromorphology studies, transformations are typically generated via image registration to an atlas for which several approaches exist (Toga & Thompson, 2001; Chandrashekhar et al., 2021). This work is only relevant to non-affine registration techniques since affine transformations preserve straight lines. The utility of higher order mapping depends on the extent to which the brain sample is deformed nonlinearly. In practice, investigators can look at the profiles of position and tangent vector displacements to identify which regime (\(\sigma\) level) is most similar to their transformation (Fig. 3a). At low values of \(\sigma\), Frechet error of both zeroth and first order methods are in the range of \(1-10\) microns (Fig. 3c), which is likely negligble for mesoscale neuromorphology. However, under more extreme transformations, the first order mapping offers a more significant improvement in both Frechet error and distributions of morphometric quantities (Fig. 3c, d).

As mentioned previously, existing mapping methods use zeroth order mapping. Investigators can use the error bound in Eq. 2 to determine whether zeroth order mapping is adequate. If Jacobian and displacement values of the transformation at hand are not easily accessible, our empirical results can offer guidance. For example, we found that under less extreme transformations (\(\sigma =80,160\)), the frechet errors remained below ten microns for both zeroth and first order methods. However, as transformations got more extreme, it became more important to either keep the sampling period small, or to use first order mapping. Specifically, if the sampling period was less than ten microns, then both zeroth and first order mapping had low error. For higher sampling periods, first order mapping offered more significant improvements.

Conversely our results can be used to make manual tracing more efficient. If the registration transformation, \(\phi\), is know a priori, and there are stretches where a neuronal branch is straight, then it is possible to compute the minimum sampling rate while still controlling the amount of error introduced during mapping to atlas coordinates. The neuron trace files examined here are at most a couple megabytes, so this approach is not likely produce significant data storage gains. However, it could allow manual tracers to sample more sparsely along straight stretches of axons, possibly leading to faster reconstruction. As a preliminary experiment, we computed the fraction of nodes which could be removed, while maintaining a submicron error after first order mapping (Fig. 5). On average, around 5% of nodes achieved submicron error for both zeroth and first order mapping. This fraction decreased with larger sigma, indicating the importance of a higher sampling rate under more extreme transformations. First order mapping led to a statistically significant increase in the fraction of nodes with submicron error for \(\sigma =160,320,640\), though this increase was small. It is important to note that since each node was examined individually, it is not necessarily the case that removing all the nodes together would maintain submicron error. In the worst case, if all the nodes were located consecutively along the trace, only every other node could be removed to maintain submicron error. Further, it is unknown whether skipping the nodes identified in our experiment would have saved time in the MouseLight tracing protocol. A proper experiment to test this hypothesis would involve both registration and neuron reconstruction in real whole-brain images and thus is reserved as a potential avenue of future study. However, given that manual tracing remains a bottleneck and requires several person-hours per neuron (Winnubst et al., 2019), making tracing process just a couple percentage points faster would tangibly accelerate neuromorphological experiments.

It may be tempting to use our “ground-truth” mapping method, i.e. upsampling a linear interpolation then performing zeroth order mapping, as a neuron mapping method. While this may be appropriate in some settings, this approach has two primary disadvantages. First, as stated before, neurons are not piecewise linear structures so, while the knot positions can be generally regarded as lying on the neuron, the linear interpolation cannot. Therefore, it would be necessary to keep track of which knots are from the original trace, and which knots are from the upsampling in order to preserve the original trace information. This would require existing file formats to expand their metadata conventions. Secondly, for large traces, the upsampled data could become computationally cumbersome to store.

We want to highlight work in the adjacent field of neuron reconstruction where algorithms such as Li et al. (2020) can convert reconstruction knots into dense image segmentations which capture neuron trajectories at finer resolutions. Algorithms to automatically trace images of single neurons have been under development for decades (Peng et al., 2015; Athey et al., 2022). They could be adapted to generate both denser neuron samplings, and more accurate derivative estimates at the sampled points. These methods could improve both zeroth and first order mapping methods, so weighing these effects alongside the accuracy required for the given scientific goal would help determine which mapping method is appropriate.

Numerical error in mapping 3D curves, the subject of this paper, is only one source of error in neuron reconstruction studies. Error can also be introduced during the neuron tracing process. However, tracing errors generally come in the form of missing branches, or fusing two unconnected branches, rather than incorrect placement of trace points (Winnubst et al., 2019). Indeed, many tracing workflows involve semi-automated tools that “snap” the trace points to the fluorescent signal, and therefore tracing error of point placement can be assumed to be roughly equal to the resolution of the image. In the case of the MouseLight project, this was \(0.3 \times 0.3 \times 1 \mu m^3\), less than the mapping errors in our experiments. A more likely location of trace error is between trace points, if the neuron does not follow a straight path between trace paths. However, in well-designed experiments, neuron tracers are trained to place enough trace points to closely follow the neuron’s trajectory. If a neuron’s morphology includes sharp turns which deviate several microns from the trace, then tracing error may dominate mapping error.

Neuron traces are not the only 3D curves being mapped in neuroscience. Tractographic data in diffusion MRI involves similar data structures and mapping algorithms, so we believe the relevance of our work extends beyond neuron tracing studies in mice (Tournier et al., 2019). We believe this work could also be extended to other geometric objects other than branching curves, such as closed networks of curves, or surfaces.

As brain mapping efforts expand to include other species, such as non-human primates, we believe it will remain important to design algorithms that mitigate numerical error when handling digital neuron traces. Non-human primate brains are orders of magnitudes larger than those in mice, and also have shown more inter-individual variability, presenting new challenges for brain mapping (Ose et al., 2022).

Methods

Software Implementation

Our mapping framework was described in terms of non-branching curves in 3D, but can be naturally extended to neuron traces that have tree-like topology, i.e. have branching, but no loops. We decompose a branching trace into “branches” by recursively removing the root to leaf path with the longest arc length (Athey et al., 2021). The trace points where branching occurs are copied into each of their associated branches e.g. the trace point of at a bifurcation will be part of two neuron branches. Thus, the tree-like graph of trace points is decomposed into non-branching subgraphs.

Each branch is considered to be a first order spline, i.e. the curve is piecewise linear. In this representation, mapping a neuron branch can be decomposed into mapping a collection of line segments that connect consecutive trace points, \(x_i=c(t_i),x_{i+1}=c(t_{i+1})\). Algorithm 1 illustrates our process for generating ground truth, where we linearly interpolate the line segment every 2 microns, then apply \(\phi\) to each coordinate. Algorithm 2 illustrates the zeroth order mapping method, where we apply \(\phi\) to the endpoints of the line segment, then perform linear interpolation. Algorithm 3 illustrates our first order mapping method, where \(\frac{x_{i+1}-x_i}{\vert x_{i+1}-x_i\vert }\) is the (one-sided) derivative at each endpoint, then we use \(\phi\) to map the endpoint positions, and \(D\phi\) to map the endpoint derivatives. The mapped positions and derivatives are used to define a cubic Hermite spline (Kincaid & Cheney, 2002). Specifically, we use the SciPy implementation of cubic Hermite splines (Virtanen et al., 2020).

After mapping is applied to each branch, the branches are re-connected by matching and “fusing” the trace points that were copied into multiple branches. This way, the topology of the original branching trace is preserved. Figure 3b-g depicts two examples of branching neuron traces which were transformed with both zeroth and first order mapping. The mapped morphologies are written to SWC format.

Algorithm 1
figure a

Ground Truth Mapping

Algorithm 2
figure b

Zeroth Order Mapping

Algorithm 3
figure c

First Order Mapping

Quantitatively Comparing Curves

We define the Frechet distance between neuron traces to be the maximal Frechet distance between matched branches. Since we are comparing different mappings of the same neuron, matching branches to each other is trivial. We used the package from Jekel et al. to compute discrete frechet distance (Jekel et al., 2019). Discrete Frechet distance is an approximation of, and upper bound to Frechet distance (Eiter & Mannila, 1994). We used nGauge to load the transformed SWC files and compute morphometric quantities. We used SciPy to perform Kolmogorov-Smirnov statistics (Walker et al., 2022; Virtanen et al., 2020).

Further details about our implementation can be found in our open-source Python package brainlit: http://brainlit.neurodata.io/.

Information Sharing Statement

The neuron traces used in this work were from the MouseLight project’s NeuronBrowser website: https://ml-neuronbrowser.janelia.org/. Specifically, we used traces: AA-1087, 1089-1093, 1149, 1192, 1196, 1223, 1413, 1417, 1477, 1493, 1537-1540, 1543 and 1548. The code used in this work is available in our open-source Python package brainlit: http://brainlit.neurodata.io/.