# Automated Tracing of Neurites from Light Microscopy Stacks of Images

## Authors

DOI: 10.1007/s12021-011-9121-2

- Cite this article as:
- Chothani, P., Mehta, V. & Stepanyants, A. Neuroinform (2011) 9: 263. doi:10.1007/s12021-011-9121-2

## Abstract

Automating the process of neural circuit reconstruction on a large scale is one of the foremost challenges in the field of neuroscience. In this study we examine the methodology for circuit reconstruction from three-dimensional light microscopy (LM) stacks of images. We show how the minimal error-rate of an ideal reconstruction procedure depends on the density of labeled neurites, giving rise to the fundamental limitation of an LM-based approach for neural circuit research. Circuit reconstruction procedures typically involve steps related to neuron labeling and imaging, and subsequent image pre-processing and tracing of neurites. In this study, we focus on the last step: detection of traces of neurites from already pre-processed stacks of images. Our automated tracing algorithm, implemented as part of the Neural Circuit Tracer software package, consists of the following main steps. First, the image stack is filtered to enhance labeled neurites. Second, the centerline of the neurites is detected and optimized. Finally, individual branches of the optimal trace are merged into trees based on a cost minimization approach. The cost function accounts for branch orientations, distances between their end-points, curvature of the merged structure, and its intensity. The algorithm is capable of connecting branches which appear broken due to imperfect labeling and can resolve situations where branches appear to be fused due to the limited resolution of light microscopy. The Neural Circuit Tracer software is designed to automatically incorporate ImageJ plug-ins and functions written in MatLab and provides roughly a 10-fold increase in speed in comparison to manual tracing.

### Keywords

Tracing, Segmentation, Axon, Dendrite, Confocal, Stack

## Introduction

It is evident that a complete understanding of brain function can only be derived from a substantially detailed account of synaptic connectivity in the underlying neural circuit. Can such an account be given in the form of a complete connectome of the circuit (Lichtman and Sanes 2008; Sporns et al. 2005)? While the connectomes of small invertebrate circuits can be fully determined electron-microscopically (EM) (Chen et al. 2006; White et al. 1986), this technique, in spite of a number of impressive developments in recent years [see e.g. (Briggman and Denk 2006; Denk and Horstmann 2004; Hayworth et al. 2006; Mishchenko et al. 2010)], still lacks the capacity to reconstruct connectivity on a larger scale. Consider a hypothetical serial-section EM reconstruction of 1 mm^{3} of brain tissue. A volume of this size roughly corresponds to a blowfly brain or a mammalian neocortical column. At 5 nm × 5 nm × 50 nm spatial resolution, the amount of data required to describe the volume completely is on the order of 800 TB, which is too large to be easily processed by modern day computers. More importantly, even at very low probabilities of errors in following small axon profiles from one serial section to the next, say 0.01%, the cumulative probability of error in connectivity over a distance of about 1 mm (or 20,000 serial sections) exceeds 85% if errors in successive sections are independent, making the resulting connectivity diagram unusable. This fundamental limitation of the technique makes the complete connectome description of large circuits unrealistic, at least for the time being.
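The two estimates in this paragraph can be checked with a few lines of arithmetic. The sketch below assumes 1 byte per voxel and statistically independent tracking errors in successive sections; neither assumption is stated in the text, and the variable names are illustrative.

```python
# Back-of-the-envelope check of the serial-section EM estimates.
# Assumptions (not from the text): 1 byte per voxel, independent errors per section.

side_nm = 1e6                   # edge of a 1 mm^3 cube, in nanometres
voxel_nm = (5.0, 5.0, 50.0)     # x, y, z resolution, in nanometres

n_voxels = (side_nm / voxel_nm[0]) * (side_nm / voxel_nm[1]) * (side_nm / voxel_nm[2])
terabytes = n_voxels * 1 / 1e12             # decimal terabytes at 1 byte per voxel

n_sections = int(side_nm / voxel_nm[2])     # serial sections across 1 mm
p_section = 1e-4                            # 0.01% error per section
p_cumulative = 1.0 - (1.0 - p_section) ** n_sections

print(f"voxels: {n_voxels:.1e}, data volume: {terabytes:.0f} TB")
print(f"sections: {n_sections}, cumulative error probability: {p_cumulative:.2f}")
```

Under these assumptions the cumulative probability comes out near 1 − e^{−2}, so at least one tracking error along a 1 mm path is by far the most likely outcome.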

Other issues further compound the problem (Stepanyants and Chklovskii 2005). First, synaptic connectivity in large circuits can change over time due to the formation and elimination of synapses [see e.g. (Grutzendler et al. 2002; Trachtenberg et al. 2002; Yuste and Bonhoeffer 2001)]. Second, synaptic connectivity in large circuits is bound to be variable from one brain to another (Bohland et al. 2009). Hence, the connectome of a large neural circuit is not likely to be time and brain invariant. To understand the extent of the variability it would be necessary to compare connectomes of different brains. However, this is not a trivial task since neurons in large brain circuits may not be identifiable from one animal to another. In the end, one will be forced to abandon the idea of the complete connectome description, and instead resort to a partial or statistical/probabilistic account of connectivity.

As an alternative to a statistical connectome description one can base the account of connectivity on features of the circuit that are more stable over time and may be more invariant among different brains. Since a synaptic contact requires physical proximity between axonal and dendritic branches of pre- and postsynaptic neurons, such an account can be made in terms of the relative layout of branches, or potential synapses (Stepanyants and Chklovskii 2005; Stepanyants et al. 2002; Stepanyants et al. 2008). Description of connectivity in terms of potential synapses, though far from being complete, is particularly useful for characterizing large circuits. This is because while actual synaptic connectivity may change over time, potential connectivity is typically much more stable (Holtmaat et al. 2005; Trachtenberg et al. 2002). What is more, potential connectivity in different animals (same species, age, brain area, neuron classes, etc.) appears to be less variable than actual connectivity (Jefferis et al. 2007; Stepanyants et al. 2008). This is because while the number of potential synapses between neurons depends mainly on their class and positions in the brain, the number of actual synapses, in addition, depends on neurons’ functional properties (e.g. orientation preference in primary visual cortex). Finally, because potential connectivity is defined by the appositions between axonal and dendritic branches on a micrometer scale, reconstructions of potential connectivity can be done with light confocal or two-photon microscopy (LM), taking advantage of the multitude of imaging and cell labeling techniques (Wilt et al. 2009).

With this study, we aim to develop a tool which will automate the process of large-scale 3D reconstruction of neurites. A typical circuit reconstruction procedure contains several general steps (Meijering 2010; Russ 2007): neuron labeling and imaging, image pre-processing, tracing of the wires, and post-processing. The pre-processing stage may include alignment of individual images within the stack, deconvolution of the stack, and the application of various noise reduction and feature enhancement filters. The post-processing stage could contain detection of features such as spines, boutons and synapses, 3D rendering of neuronal branches, and tiling of the reconstructions of individual stacks. In this study, we will mainly focus on the wire tracing part, as it is arguably the main obstacle on the way to automating the circuit reconstruction process. As neurites of many cell types can span the entire brain of an animal (e.g. cortical pyramidal cell axons) or the entire animal itself (e.g. *C. elegans*), our ultimate goal is to be able to perform reconstructions on a large scale, and to recover the traces of axonal and dendritic arbors of sparsely labeled populations of neurons in their entirety. But how sparse should the labeling be?

## Analysis

### How Does the Minimal LM Reconstruction Error-Rate Depend on the Sparseness of Labeled Neurites?

Reconstruction from LM images becomes difficult when the length density of labeled neurites, *ρ*, is high. Length density is defined as the combined length of all processes per unit volume (Escobar et al. 2008). At high *ρ* neurites may appear fused in 3D (Fig. 1a) and it is often difficult to resolve their individual branches or the branching pattern. Assuming that branches are distributed isotropically and uniformly in the stack of volume *V*, and that their typical diameter in the image is *d* (apparent, not real diameter), we can estimate the expected number of places in the stack where two branches appear to be in contact. This number is *πdρ*^{2}*V*/2 (Escobar et al. 2008). In these locations one would have to rely on additional characteristics of labeled neurites in order to tell them apart. Such characteristics may include relative branch orientations, branch thicknesses, types (e.g. axons or dendrites), brightness of the label, and color. Below we examine the possibility of discriminating fused branches based on their relative orientations.

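The orientation of a neurite drifts gradually along its length, which we describe with a power law:

\[ \delta\theta = \left( \frac{l}{l_p} \right)^{\gamma} \qquad (1) \]

This expression shows how the average deviation of a neurite’s orientation from a straight path, *δθ* (in radians), depends on the path length along the neurite, *l*. Parameter *l*_{p} is referred to as the effective persistence length, and *γ* is the scaling exponent. Over the length *l*_{p} branch orientation changes on average by 1 radian (or 57°), assuming that there is no branching along the way.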

Figure 1c shows the dependence of *δθ* on *l* for mouse axons of neocortical layer 6 neurons (sample reconstructions are shown in Fig. 8). In calculating *δθ* we used the reconstructed traces of neurites and uniformly sampled from them neurite segments which do not traverse branch points. Deviation in orientation was calculated for every such segment (see inset in Fig. 1c) and the results were combined within groups of similar length segments (black error-bars in Fig. 1c). The effective persistence length, *l*_{p}, and the scaling exponent, *γ*, were determined from the nonlinear least squares regression of this data with Eq. 1: *l*_{p} = 2050 ± 190 μm (mean ± S.D.) and *γ* = 0.18 ± 0.01. The adjusted *R*^{2} = 0.985 illustrates the goodness of the fit (red line).

Two branches which appear to be in contact cannot be resolved if the deviations in their individual orientations, *δθ*, over the length of the contact, *l*, are comparable to the angle of branch incidence, *θ* ≈ 2 *d/l* (Fig. 1b). As a result, contacting branches incident at *θ < θ*_{c} = (2 *d/l*_{p})^{γ/(1+γ)} are unlikely to be resolved. Assuming isotropy, the fraction of such branch orientations in 3D is equal to (1-cos(*θ*_{c}))/2. As *θ*_{c} < 1, this expression is well approximated with the leading term of its expansion over *θ*_{c}, resulting in the fraction of unresolvable relative branch orientations of (2 *d/l*_{p})^{2γ/(1+γ)}/4.
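The critical angle follows in two steps if one assumes the power-law form *δθ* = (*l*/*l*_{p})^{γ} for Eq. 1, with angles in radians. The symbol *l*_{c}, the contact length at which the two angles become equal, is introduced here only for the derivation.

```latex
\begin{aligned}
\delta\theta(l_c) = \theta(l_c)
  &\;\Rightarrow\; \left(\frac{l_c}{l_p}\right)^{\gamma} = \frac{2d}{l_c}
  \;\Rightarrow\; l_c = \left(2d\, l_p^{\gamma}\right)^{1/(1+\gamma)}, \\
\theta_c = \frac{2d}{l_c}
  &= \left(\frac{2d}{l_p}\right)^{\gamma/(1+\gamma)}, \\
\frac{1-\cos\theta_c}{2}
  &\approx \frac{\theta_c^{2}}{4}
  = \frac{1}{4}\left(\frac{2d}{l_p}\right)^{2\gamma/(1+\gamma)} .
\end{aligned}
```

With *d* = 1 μm, *l*_{p} = 2050 μm, and *γ* = 0.18, this gives *θ*_{c} of about 0.35 rad (roughly 20°), which is consistent with the condition *θ*_{c} < 1 used in the expansion.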

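Multiplying the density of apparent contacts, *πdρ*^{2}/2, by the fraction of unresolvable orientations gives the minimal error-rate per unit volume, *e*_{V}, and, after dividing by *ρ*, the minimal error-rate per unit length of neurite, *e*_{L}, as functions of *ρ*:

\[ e_V = \frac{\pi d \rho^{2}}{8} \left( \frac{2d}{l_p} \right)^{2\gamma/(1+\gamma)} \qquad (2) \]

\[ e_L = \frac{e_V}{\rho} = \frac{\pi d \rho}{8} \left( \frac{2d}{l_p} \right)^{2\gamma/(1+\gamma)} \qquad (3) \]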

This calculation makes it possible to estimate the lower bound of the error-rate for different levels of label sparseness. As the combined length of an axonal arbor, *L*_{a}, is much larger than that of a dendritic arbor for many central neuron classes, in labeling neurites for circuit level analyses it is essential to ensure that the minimal error-rate does not exceed about one error per axon, or *e*_{L}*L*_{a} < 1. Otherwise, results of automated, or even manual reconstruction procedures, may be unreliable.

Consider an attempt to reconstruct potential connectivity on the scale of the entire mouse cerebral cortex. The total length density of axons in the mouse neocortex is estimated at about 4 μm^{-2} (Stepanyants et al. 2009). Assuming that the apparent axon diameter *d* ≈ 1 μm, the typical values of the effective persistence length of cortical axons (most of which are the axons of pyramidal neurons) and the scaling exponents, *γ*, are similar to those obtained from Fig. 1c, we estimate that reconstruction of 0.01% of excitatory neurons uniformly labeled throughout the cortex (only 1 out of 10,000 neurons labeled, *ρ* = 0.0001× 4 μm^{-2}) can be done reliably. This reconstruction would result in about 8 errors per mm^{3} of cortical gray matter according to Eq. 2, or 1 error per 50 mm of axon length according to Eq. 3. As the average axon length per neuron in mouse neocortex is about 40 mm (Braitenberg and Schüz 1998), the above error-rate is sufficiently low, making reconstructions suitable for circuit level analyses of connectivity. It is worth noting that by using multiple color labels (Lichtman et al. 2008; Livet et al. 2007) one can increase the fraction of labeled neurons many-fold without affecting the error-rate significantly.
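The numbers quoted above can be reproduced with a short calculation. The sketch below assumes that the error-rate per unit volume is the contact density *πdρ*^{2}/2 times the unresolvable fraction (2*d*/*l*_{p})^{2γ/(1+γ)}/4, and that the error-rate per unit length is this quantity divided by *ρ*; the function name `minimal_error_rates` is illustrative.

```python
import math

def minimal_error_rates(d, l_p, gamma, rho):
    """Return (errors per um^3, errors per um of neurite) under the assumed forms."""
    unresolved_fraction = (2.0 * d / l_p) ** (2.0 * gamma / (1.0 + gamma)) / 4.0
    contacts_per_um3 = math.pi * d * rho ** 2 / 2.0
    e_v = contacts_per_um3 * unresolved_fraction
    e_l = e_v / rho
    return e_v, e_l

# Values quoted in the text
d = 1.0                   # apparent axon diameter, um
l_p = 2050.0              # effective persistence length, um
gamma = 0.18              # scaling exponent
rho = 1e-4 * 4.0          # 0.01% of the total axonal length density of 4 um^-2

e_v, e_l = minimal_error_rates(d, l_p, gamma, rho)
errors_per_mm3 = e_v * 1e9            # 1 mm^3 = 1e9 um^3
mm_per_error = 1.0 / e_l / 1e3        # axon length per error, in mm
errors_per_axon = e_l * 40e3          # 40 mm of axon per neuron

print(f"{errors_per_mm3:.1f} errors per mm^3")
print(f"one error per {mm_per_error:.0f} mm of axon")
print(f"{errors_per_axon:.2f} errors per axon")
```

Because the error-rate per unit length is linear in *ρ* under these assumptions, doubling the labeled fraction doubles the expected number of errors per axon, which gives a quick way to see where the condition *e*_{L}*L*_{a} < 1 starts to fail.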

## Methods

### Outline of the Automated Tracing Algorithm

There are numerous approaches to automated tracing of linear structures of neurites from 3D LM images [see (Meijering 2010) for review]. In general, these approaches fall into two main categories. The first category of algorithms is based on local neurite tracking (or tracing) (Al-Kofahi et al. 2002; Bas and Erdogmus 2010a, b; Can et al. 1999; Srinivasan et al. 2007; Wang et al. 2007). Tracking is typically initiated from a seed point provided by the user and the algorithm steps along the structures of neurites by analyzing intensity distribution in a small neighborhood of the current location. Tracking based algorithms are computationally inexpensive and can successfully interpolate through small intensity gaps in the image. On the downside, tracking based algorithms usually run into problems at branch points or branch crossover regions. This is because, at a given step of a tracking algorithm, information about neurite structures contained in the image is limited only to the structures previously visited by the algorithm.

The second group of algorithms is based on image segmentation. Here, some measure of tubularity is calculated for every voxel in the image in order to delineate voxels belonging to the neurites from the ones belonging to the background. Next, centerline of the segmented regions is determined by means of skeletonization (or thinning) (Lee et al. 1994; Palagyi and Kuba 1998; Weaver et al. 2004), voxel coding (Vasilkoski and Stepanyants 2009; Zhou and Toga 1999; Zhou et al. 1998), or by optimally connecting a set of maximum tubularity seed points (Deschamps and Cohen 2001; Dijkstra 1959; González et al. 2008; 2010; Sethian 1999; Wink et al. 2002; Xie et al. 2010). Segmentation based algorithms are computationally expensive because they operate on the entire image. The resulting centerline is often broken into multiple pieces, which need to be interconnected, and contains numerous spurious branches, which must be pruned. Additional algorithms are invoked to accomplish these tasks.

Table 1. Numerical values of parameters and processing times of automated tracing procedures leading to the results shown in Figs. 7, 8 and 9. The average processing time was measured on an Intel Core 2 Duo, 2.20 GHz, 4 GB, 64-bit computer. Image stacks containing neuromuscular projection fibers were reduced 3-fold with a median filter along the *x*, *y*, and *z* dimensions before tracing. Branch merger parameters are listed separately for 2, 3, and 4 branch end-point mergers. NA for 4 branch end-point mergers indicates that this merger type was suppressed.

Dataset | Stack size (voxels) | Average processing time (seconds) | CSF size (voxels) | Intensity threshold (% of max) | Region size threshold (voxels) | Branch length threshold (voxels) | Trace optimization *a*_{1} | Trace optimization *b*_{1} | Trace optimization *a*_{2} | Trace optimization *b*_{2}
---|---|---|---|---|---|---|---|---|---|---
Olfactory axons (Fig. 7) | 512 × 512 × 60 | 190 | [1, 1.5, 2] | 7 | 200 | 6 | 0.5 | 0.05 | 0.2 | 0.05
Neocortical axons (Fig. 8) | 512 × 512 × 46 | 125 | 1 | 10 | 200 | 10 | 0.5 | 0.05 | 0.2 | 0.05
Neuromuscular axons (Fig. 9) | 341 × 341 × 54 | 195 | 2.5 | 5 | 200 | 15 | 0.5 | 0.05 | 0.2 | 0.05

Dataset | Branch merger type (end-points) | *α* | *β* | *γ* | *δ* | *ε*
---|---|---|---|---|---|---
Olfactory axons (Fig. 7) | 2 | 0.1 | 2.0 | 1.0 | 10 | 1.0
Olfactory axons (Fig. 7) | 3 | 0.1 | 0.5 | 1.0 | 0 |
Olfactory axons (Fig. 7) | 4 | 0.1 | 0 | 3.0 | 0 |
Neocortical axons (Fig. 8) | 2 | 0.1 | 1.0 | 1.0 | 1.5 | 1.5
Neocortical axons (Fig. 8) | 3 | 0.2 | 1.0 | 1.0 | 1.0 |
Neocortical axons (Fig. 8) | 4 | NA | NA | NA | NA |
Neuromuscular axons (Fig. 9) | 2 | 0.1 | 2.0 | 1.0 | 10 | 2.0
Neuromuscular axons (Fig. 9) | 3 | 0.2 | 2.0 | 1.0 | 10 |
Neuromuscular axons (Fig. 9) | 4 | NA | NA | NA | NA |

The automated tracing algorithm consists of the following main steps:

1. The image stack (see Fig. 2a) is pre-processed if necessary. This step may include alignment of images within the stack, stack deconvolution, conversion of color formats to gray scale, and elimination of cell bodies and other non-neurite structures. Pre-processing steps strongly depend on the details of neuron labeling and imaging and are not addressed in this study.

2. The pre-processed image is filtered to enhance the neurites and then binarized (thresholded). Separate regions containing small numbers of voxels are eliminated. The outcome of this step is illustrated in Fig. 2b.

3. The centerline of the remaining structures in the binary image is detected. This centerline is represented with a graph consisting of short connected straight line segments. Due to noise and imperfect labeling, the centerline usually contains short erroneous terminal branches and small nested loops. Erroneous terminal branches are eliminated by setting a branch length threshold. Similarly, every small nested loop cluster is replaced with a single vertex located at the cluster’s center of intensity. The result is referred to as the initial trace. The outcome of this step is illustrated in Fig. 2c.

4. The initial trace is optimized with two sequential modified active contour methods (Kass et al. 1988; Vasilkoski and Stepanyants 2009), Fig. 2d. End-points of the trace remain fixed during the first optimization, but the branch- and intermediate-points are allowed to move to their optimal positions. The second optimization algorithm improves the placement of the branch- and end-points of the trace.

5. Individual branches contained in the optimal trace are merged into branching tree structures. A branch here is defined as a neurite connecting the root or a branch-point to a successive branch- or end-point. This step is based on a novel cost function optimization method. Special care is taken to prevent the formation of loops at every step of the branch merging process. The result is illustrated in Fig. 2e.

In the following subsections we provide more details for the steps 2, 3, and 4 of the algorithm.

### Filtering Tubular Structures of Neurites

Multi-scale filters can be used successfully to enhance tubular structures. Some such filters are based on the analyses of eigenvalues of the Hessian matrix (Frangi et al. 1998; Lorenz et al. 1997; Sato et al. 1998; Streekstra and van Pelt 2002). These eigenvalues are calculated at different spatial scales for every voxel in the image, and the filter is constructed in such a way that voxels with high intensity variations along one direction relative to the orthogonal directions produce high responses. Filter responses at different spatial scales are combined by taking the maximum over the range of scales. An alternative approach is to use a bank of steerable filters designed to enhance different orientations and take the maximum over the range of orientations (Freeman et al. 1991; Gonzalez et al. 2009; Jacob and Unser 2004). This approach is computationally very expensive as it is usually necessary to use hundreds of filters to resolve orientations of neurites in 3D.

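In this study we use a center-surround filter (CSF). If the size of the filter, *σ*, is matched with the caliber of neurites, CSF can smooth out the intensity within the boundaries of the neurites, sharpen the boundaries, and reduce the background noise. A popular choice of a CSF used in detecting linear structures is the Laplacian of Gaussian (LoG) (Marr and Hildreth 1980). In 3D LoG has the form:

\[ \mathrm{LoG}(\vec{r}, \sigma) = \frac{e^{3/2}}{4\pi\sqrt{3}\,\sigma^{3}} \left( 1 - \frac{r^{2}}{3\sigma^{2}} \right) e^{-r^{2}/(2\sigma^{2})}, \qquad r = |\vec{r}| \qquad (4) \]

Numerical coefficients in this equation are chosen in such a way that both positive and negative components of the filter are normalized to unity and the filter as a whole is normalized to zero.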

The LoG filter with *σ* = 3 voxel sizes is shown in Fig. 3a. If applied to a cylindrical neurite of a uniform intensity (intensity of 1 in Fig. 3a), the LoG filter produces the response shown in Fig. 3b. As becomes evident from examining the figure, a problem arises if the image stack contains neurites of different calibers. LoG will reduce the intensity inside thick branches, effectively carving out their interiors (e.g. *R* = 8 voxel sizes), as well as reduce the intensity of very thin branches (e.g. *R* = 1 voxel size), which may lead to branch breaking. To circumvent these problems one may combine the outputs from different size filters applied to an image, *I*, by taking the maximum output at every voxel:

\[ I_{CSF} = \max_{\sigma} \left( \mathrm{LoG}_{\sigma} * I \right) \qquad (5) \]

Asterisk in this expression denotes convolution. The response of such multi-scale CSF, Fig. 3c (*σ* = 2, 3, and 4 voxel sizes), encompasses a larger range of neurite calibers, *R*, which is advantageous. For real neurites, Fig. 3d, comparison of the results of single- and multi-scale LoG filtering is shown in Fig. 3e and f.
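The normalization and the benefit of combining scales can be illustrated numerically. The sketch below assumes an isotropic LoG with a positive center, (e^{3/2}/(4π√3 σ^{3}))(1 − r^{2}/3σ^{2}) exp(−r^{2}/2σ^{2}), which satisfies the stated normalization; `center_response` uses a closed form obtained here by integrating this kernel over an infinitely long uniform cylinder, and all function names are illustrative.

```python
import math

def log3d(r, sigma):
    """Radial profile of the assumed 3D LoG (positive centre, negative surround)."""
    amp = math.exp(1.5) / (4.0 * math.pi * math.sqrt(3.0) * sigma ** 3)
    return amp * (1.0 - r * r / (3.0 * sigma ** 2)) * math.exp(-r * r / (2.0 * sigma ** 2))

def lobe_integrals(sigma, r_max_in_sigma=12.0, n=20000):
    """Midpoint-rule volume integrals of the positive and negative lobes."""
    dr = r_max_in_sigma * sigma / n
    pos = neg = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        w = 4.0 * math.pi * r * r * log3d(r, sigma) * dr
        if w > 0.0:
            pos += w
        else:
            neg += w
    return pos, neg

def center_response(radius, sigma):
    """Filter response on the axis of a long cylinder of unit intensity.

    Closed form for the assumed kernel: const * t * exp(-t), t = R^2 / (2 sigma^2).
    """
    t = radius ** 2 / (2.0 * sigma ** 2)
    const = math.exp(1.5) * (2.0 * math.pi) ** 1.5 * (2.0 / 3.0) / (4.0 * math.pi * math.sqrt(3.0))
    return const * t * math.exp(-t)

def multiscale_response(radius, sigmas):
    """Maximum over filter sizes, mirroring the multi-scale CSF at a single voxel."""
    return max(center_response(radius, s) for s in sigmas)

pos, neg = lobe_integrals(3.0)
print(f"positive lobe: {pos:.3f}, negative lobe: {neg:.3f}")
for R in (1.0, 4.0, 8.0):
    single = center_response(R, 3.0)
    multi = multiscale_response(R, (2.0, 3.0, 4.0))
    print(f"R = {R}: single-scale {single:.3f}, multi-scale {multi:.3f}")
```

In this idealized setting the single-scale response peaks when *R* = √2 *σ* and falls off for both thinner and thicker cylinders, which is the behavior the multi-scale maximum is meant to compensate.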

### Extracting Initial Traces of Neurites

There are multiple methods that can be used for detecting the centerline of neurites. Some methods are based on skeletonization or thinning of binary images [see e.g. (Lee et al. 1994; Palagyi and Kuba 1998; Weaver et al. 2004)]. The basic idea behind these methods is an iterative removal of voxels from the surface of the segmented image in a way that preserves the topology of the contained structure and does not erode terminal branches. This can be accomplished by fixing the tips of the terminal branches and monitoring the number of disconnected structures at every step of the algorithm. The algorithm proceeds until no more voxels can be removed from the image without increasing the number of regions. Alternatively, the centerline of neurites can be obtained by using the Minimum Cost Path methods [see e.g. (Deschamps and Cohen 2001; Dijkstra 1959; González et al. 2008; González et al. 2010; Sethian 1999; Wink et al. 2002; Xie et al. 2010)].

The centers of intensity of consecutive wave fronts (graph nodes) are connected with straight line segments (graph edges) to form a graph. The branch-points of the graph correspond to the centers of intensity of the dividing wave fronts, and end-points are the centers of intensity of the terminal wave fronts (see Fig. 4a). Due to noise and imperfect labeling, voxel coding algorithm may result in a graph containing short erroneous terminal branches and small nested loops. Erroneous terminal branches are eliminated by setting a branch length threshold. Similarly, every small nested loop cluster is replaced with a single vertex located at the cluster’s center of intensity. This reduced graph is referred to as the initial trace of neurites. It usually consists of multiple unconnected sub-graphs which, in turn, may contain multiple branches and branch segments (see Fig. 4b).
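The branch length threshold can be illustrated on a small graph. The sketch below is a simplified stand-in rather than the published procedure: the trace is stored as a dictionary of vertex coordinates plus a dictionary of neighbor sets, and a terminal branch (the path from an end-point to the nearest branch-point) is removed when it is shorter than `min_length`. Function names and the data layout are assumptions.

```python
import math

def walk_from_tip(coords, adj, tip):
    """Follow degree-2 vertices from an end-point; return (path, path length)."""
    path, length = [tip], 0.0
    prev, cur = None, tip
    while True:
        if cur != tip and len(adj[cur]) != 2:
            break                      # reached a branch-point or another end-point
        nxt = [u for u in adj[cur] if u != prev]
        if not nxt:
            break
        prev, cur = cur, nxt[0]
        length += math.dist(coords[prev], coords[cur])
        path.append(cur)
    return path, length

def prune_short_terminal_branches(coords, adj, min_length):
    """Return a copy of adj without terminal branches shorter than min_length."""
    adj = {v: set(nb) for v, nb in adj.items()}
    changed = True
    while changed:
        changed = False
        tips = [v for v, nb in adj.items() if len(nb) == 1]
        for tip in tips:
            if tip not in adj or len(adj[tip]) != 1:
                continue               # vertex was removed or changed in this pass
            path, length = walk_from_tip(coords, adj, tip)
            # Only prune branches that end at a branch-point (degree >= 3)
            if len(adj[path[-1]]) >= 3 and length < min_length:
                for v in path[:-1]:
                    for u in adj.pop(v):
                        adj[u].discard(v)
                changed = True
    return adj

# A straight neurite (vertices 0..4) with a one-voxel spur at vertex 2
coords = {0: (0, 0, 0), 1: (5, 0, 0), 2: (10, 0, 0), 3: (15, 0, 0), 4: (20, 0, 0), 5: (10, 1, 0)}
adj = {0: {1}, 1: {0, 2}, 2: {1, 3, 5}, 3: {2, 4}, 4: {3}, 5: {2}}
pruned = prune_short_terminal_branches(coords, adj, min_length=3.0)
print(sorted(pruned))
```

Unbranched pieces that end in two end-points are left untouched in this sketch, since removing them would delete a whole sub-graph rather than a spur.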

### Trace Optimization

The initial trace, obtained with the voxel coding algorithm, usually represents the structure of neurites only approximately (Fig. 4b). A smoother and more accurate representation can be achieved with fitness function optimization algorithms. Because the initial trace typically lies sufficiently close to the optimal solution, such optimization can be performed with a fast iterative gradient ascent procedure (Boyd and Vandenberghe 2004).

The first fitness function, *F*_{1}, was constructed from intensity integrated along the trace (average intensity along the trace times the trace length) and the elastic energy of the trace:

Vectors \( {\vec{r}_k} \) in this expression specify the positions of vertices of the trace and \( {\vec{R}_i} \) denote the positions of voxel centers in the image stack. Parameter *s* denotes voxel size which, for simplicity, is assumed to be the same in all 3 dimensions. Index *k*_{l} in the second term enumerates all the vertices of the trace that are connected to the vertex *k*. For example, if vertex *k* is an intermediate node, *k*_{l} enumerates the 2 of its neighboring vertices, and if vertex *k* is a bifurcating node, *k*_{l} describes 3 vertices. Parameter *λ* denotes the average density of segments along the trace (number of segments per unit length of the trace), and *a*_{1} > 0 controls the tension in the trace. Parameter *σ* should be of the order of the typical radius of neurites contained in the image, and, due to the fast decay of the Gaussian pre-factor, summation over *i* in Eq. 6 can be restricted to a small number of voxels in the vicinity of the trace.

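Maximization of *F*_{1} can be achieved with the following gradient ascent procedure:

\[ \vec{r}_k^{\,n+1} = \vec{r}_k^{\,n} + b_1 \frac{\partial F_1}{\partial \vec{r}_k} \qquad (7) \]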

Here, superscript *n* enumerates the steps of the algorithm and parameter *b*_{1} > 0 controls the step size. Terminal vertices of the trace must remain fixed during this procedure to prevent the sub-trees within the trace from collapsing to single points. Details related to the stability of the algorithm and the appropriate range of values of parameters *a*_{1} and *b*_{1} are addressed in (Vasilkoski and Stepanyants 2009).
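A toy version of this first optimization helps to see how the intensity and elastic terms interact. The sketch below is not the published fitness function: it works in 2D, replaces the image with an analytic Gaussian ridge along *y* = 0, and uses a fitness of the form Σ_k I(r_k) − (a/2) Σ |r_k − r_{k+1}|². The values a = 0.5 and b = 0.05 are taken from Table 1; everything else is an assumption.

```python
import math

def intensity_gradient(point, sigma=1.0):
    """Gradient of a synthetic ridge I(x, y) = exp(-y^2 / (2 sigma^2))."""
    _, y = point
    value = math.exp(-y * y / (2.0 * sigma ** 2))
    return 0.0, -y / sigma ** 2 * value

def optimize_trace(points, a=0.5, b=0.05, n_steps=50):
    """Synchronous gradient ascent; the two terminal vertices stay fixed."""
    pts = [tuple(p) for p in points]
    for _ in range(n_steps):
        new_pts = list(pts)
        for k in range(1, len(pts) - 1):
            gx, gy = intensity_gradient(pts[k])
            # Elastic term pulls the vertex toward each of its neighbours
            for nb in (pts[k - 1], pts[k + 1]):
                gx -= a * (pts[k][0] - nb[0])
                gy -= a * (pts[k][1] - nb[1])
            new_pts[k] = (pts[k][0] + b * gx, pts[k][1] + b * gy)
        pts = new_pts
    return pts

# Zigzag initial trace around the ridge, as might result from voxel coding
zigzag = [(float(k), 0.0 if k in (0, 10) else 0.8 * (-1) ** k) for k in range(11)]
optimized = optimize_trace(zigzag)
print(max(abs(y) for _, y in zigzag), max(abs(y) for _, y in optimized))
```

Keeping the end-points fixed matters here for the reason given in the text: with the elastic term alone, free end-points would let the trace shrink toward a point.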

In the second optimization, a fitness function, *F*_{2}, reflecting the intensity and straightness of the trace is maximized:

Index *k* in this expression runs over the intermediate vertices only, and dot in the second term denotes the scalar product. At every intermediate vertex, *k*, the notion of straightness is captured by the cosine of the angle between segments connecting *k* with its two first order neighbors (denoted with *k*_{−1} and *k*_{+1} in Eq. 8). The gradient ascent procedure for this functional is:

Positions of all vertices of the trace, including branch- and end-points, are updated during this procedure. For brevity, Eq. 9 is written in a way where it is only applicable to vertices, *k*, with strictly two first and second order neighbors, i.e. *k*_{−2}, *k*_{−1} and *k*_{+1}, *k*_{+2}. To describe the steps of other types of vertices, Eq. 9 must be modified in the following way: (i) if some of the neighboring vertices of *k* do not exist (e.g. if *k* is an end-point vertex), or (ii) if the number of first or second order neighbors of *k* is greater than two due to branching (e.g. if *k* is a bifurcation point it has 3 first order vertices) then the terms corresponding to such neighboring vertices must be dropped from Eq. 9.

At every iteration step of the gradient ascent algorithms of Eqs. 7 and 9, positions of all vertices are synchronously updated. Long segments are subdivided and short segments are combined to ensure stability of the trace and adequate representation of intensity in the image (Vasilkoski and Stepanyants 2009). The sequential optimization scheme was designed to improve trace precision while maintaining stability. The initial trace is stretched in the course of the first optimization, straightening the zigzags resulting from voxel coding (Fig. 4c). This step is essential, as the second optimization, which improves the placement of branch- and end-points (Fig. 4d), may lead to instability if applied directly to the initial (wave) trace. Numerical values of parameters used in both optimization procedures are shown in Table 1. For the 3 datasets examined in this study, both optimizations converged to their optimal solutions (as judged by monitoring the fitness functions) within 50 iteration steps. Optimization improves the placement of branch- and end-points by about 2–3 μm over the initial trace (Vasilkoski and Stepanyants 2009) (compare Fig. 4b and d). More importantly, optimization smooths out the zigzags present in the initial trace. This is essential for the next step of the algorithm, during which individual branches are merged into tree structures based on their orientations, intensities, and curvatures. Calculation of these quantities relies on smoothness of the trace.
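The straightness notion used in the second optimization can be made concrete with a small helper. The sketch below averages, over intermediate vertices, the cosine of the angle between the incoming and outgoing segment directions, so that a straight trace scores 1; this sign convention and the function name `straightness` are assumptions made for illustration.

```python
import math

def straightness(points):
    """Mean cosine between consecutive segment directions at intermediate vertices."""
    total, count = 0.0, 0
    for k in range(1, len(points) - 1):
        u = [c - p for p, c in zip(points[k - 1], points[k])]   # incoming segment
        v = [n - c for c, n in zip(points[k], points[k + 1])]   # outgoing segment
        norm_u, norm_v = math.hypot(*u), math.hypot(*v)
        if norm_u == 0.0 or norm_v == 0.0:
            continue                                            # skip degenerate segments
        total += sum(ui * vi for ui, vi in zip(u, v)) / (norm_u * norm_v)
        count += 1
    return total / count if count else 1.0

straight = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
zigzag = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0), (3.0, 1.0)]
print(straightness(straight), straightness(zigzag))
```

A voxel-coding zigzag scores well below a smooth trace on this measure, which is why orientation and curvature estimates taken from the initial trace would be unreliable.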

### Automated Branch Merger

Branch merging is the most important stage of the described automated tracing procedure. At this stage we invariably risk creating erroneously connected neurites. Such false positive mergers are difficult to find and to correct. Hence, the best reconstruction strategy in our opinion is to merge branches only when the confidence level is relatively high. Regions with complex merger patterns are better handled by trained users.

In merging branches of the optimal trace we first identify the end-points of all the branches and group them into spatially segregated clusters. For this, we consider a graph in which the nodes represent the branch end-points and the edges are placed only between spatially neighboring nodes (distance less than a threshold set by the user). Branch end-point clusters are sorted based on the number of contained end-points and the algorithm proceeds from merging 2 end-point clusters (in the order of increasing cost, see Eq. 10), to 3 end-point clusters, to the higher order clusters.

Index *i* in this expression enumerates separate branch end-point mergers within the considered merger scenario. *D*_{i} denotes the total distance between all merging branch end-point pairs within the *i*-th merger. *χ*_{ij} represents the angle made by the branch pair *j* within the *i*-th merger. It is calculated as the angle between tangents to the two branches in the vicinity of the considered end-points. By default, *χ*_{i0} equals 180° for 2 branch end-point mergers, 120° for bifurcations, and 90° for trifurcations. *I*_{i} is the average intensity along the optimally connecting traces of the *i*-th merger, and *I*_{0} is the average intensity along the optimal trace of the entire image. These quantities are calculated by dividing the first terms in Eqs. 6 or 8 by the trace length. *K*_{i} is the average curvature of the optimally connecting traces of merger *i*. As before, the curvature is calculated only for the intermediate vertices. *N* represents the number of remaining (not merged) branch end-points. Figure 5 shows individual components of the cost for 3 out of 15 possible merger scenarios involving 4 branch end-points.
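The grouping of end-points and the enumeration of merger scenarios can be sketched as follows. This is an illustration under assumptions, not the published implementation: end-points closer than a user threshold are joined into clusters with a union-find structure, and a merger scenario is taken to be any partition of a cluster into groups, where a group of one is an end-point left unmerged. Counting scenarios this way reproduces the 15 scenarios quoted for 4 end-points.

```python
import math

def cluster_end_points(end_points, max_dist):
    """Group end-point indices whose pairwise distance chain stays below max_dist."""
    parent = list(range(len(end_points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    for i in range(len(end_points)):
        for j in range(i + 1, len(end_points)):
            if math.dist(end_points[i], end_points[j]) < max_dist:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(end_points)):
        clusters.setdefault(find(i), []).append(i)
    # Isolated end-points are dropped; smaller clusters are processed first
    return sorted((c for c in clusters.values() if len(c) > 1), key=len)

def merger_scenarios(cluster):
    """Yield every partition of a cluster; singleton groups are unmerged end-points."""
    items = list(cluster)
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in merger_scenarios(rest):
        yield [[first]] + partition                     # leave `first` on its own
        for i in range(len(partition)):                 # or add it to an existing group
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]

end_points = [(0, 0, 0), (1, 0, 0), (10, 0, 0), (10, 1, 0), (11, 0, 0), (50, 50, 50)]
clusters = cluster_end_points(end_points, max_dist=2.0)
print(clusters)
print([len(list(merger_scenarios(range(n)))) for n in (2, 3, 4)])
```

A practical implementation would also need to exclude scenarios that join the two ends of the same branch or that close a loop, which this sketch does not attempt.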

The cost function in Eq. 10 depends on 13 non-negative parameters (Table 1): three sets of parameters *α*, *β*, *γ*, *δ* (one set for each allowed merger type, i.e. 2, 3, and 4 branch end-point mergers), as well as the parameter *ε*. These parameters are determined with the perceptron learning algorithm (Engel and Broeck 2001) during a user-assisted branch merging procedure. Here, for every cluster of branch end-points in the training stack of images, the user is sequentially presented with different merger scenarios and has to identify the correct one. Once the correct merger is identified, numerical values of *D*, *χ*, *I*, *K*, and *N* for this merger are associated with 1 (or correct merger) and those for all other merger scenarios within the cluster are associated with −1 (or incorrect merger). The algorithm proceeds to other clusters of end-points until training is complete. The perceptron is trained to identify the correct merger within each cluster by solving the system of inequalities which ensures that the costs of the correct mergers are the lowest within each group.
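The training step can be sketched as a perceptron-style update on cost differences. The code below is a simplified illustration rather than the published procedure: the cost is assumed to be a weighted sum of non-negative features, each training cluster supplies the feature vector of the correct scenario and those of the incorrect ones, and weights are clipped at zero to keep them non-negative. The three toy features (distance, orientation mismatch, number of unmerged end-points) and all numerical values are invented for the example.

```python
def cost(weights, features):
    """Linear cost of a merger scenario (assumed form)."""
    return sum(w * f for w, f in zip(weights, features))

def train_cost_weights(clusters, n_features, lr=0.1, epochs=100, margin=1e-3):
    """Adjust weights until every correct scenario is the cheapest in its cluster."""
    weights = [1.0] * n_features
    for _ in range(epochs):
        mistakes = 0
        for correct, others in clusters:
            for wrong in others:
                diff = [fw - fc for fw, fc in zip(wrong, correct)]
                # Violated inequality: the wrong scenario is not costlier by the margin
                if cost(weights, diff) <= margin:
                    weights = [max(0.0, w + lr * d) for w, d in zip(weights, diff)]
                    mistakes += 1
        if mistakes == 0:
            break
    return weights

def cheapest(weights, scenarios):
    """Index of the lowest-cost scenario."""
    return min(range(len(scenarios)), key=lambda i: cost(weights, scenarios[i]))

# (correct scenario, [incorrect scenarios]) with features (D, mismatch, N)
training = [
    ((1.0, 0.1, 0.0), [(0.0, 0.0, 2.0)]),   # short gap, aligned: merging is correct
    ((3.0, 0.2, 0.0), [(0.0, 0.0, 2.0)]),   # longer gap, still aligned: merge
    ((0.0, 0.0, 2.0), [(0.5, 1.5, 0.0)]),   # poorly aligned branches: do not merge
]
weights = train_cost_weights(training, n_features=3)
print(weights)
print([cheapest(weights, [c] + o) for c, o in training])
```

When the inequalities cannot all be satisfied, this loop simply stops after the last epoch, which is one way a small fraction of training mergers can remain unrecognized.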

Numerical values of the cost function parameters *α*, *β*, *γ*, and *δ* are obtained from the perceptron learning, during which more than 95% of mergers are typically recognized correctly. Reconstructions are not very sensitive to the exact values of *α*, *β*, *γ*, and *δ*, and the parameter sets generalize well on test images obtained under similar experimental conditions. For example, in reconstructing datasets of *Drosophila* olfactory neuron axons used in the DIADEM challenge (Brown et al. 2011), we trained the algorithm on 3 training image stacks and tested its performance on 3 qualifier image stacks by using the DIADEM metric. The metric scores were not statistically different between the training (0.89, 0.89, and 0.67) and test groups (0.66, 0.79, and 0.90).

## Implementation

### Parameters and Data Formats

Our algorithm depends on a number of parameters, some of which are set by the user while others are learned from the training data and are generalized on images of the same type (see Table 1). Parameters set by the user include: the multi-scale CSF sizes, thresholds for intensity, region size, and branch length, as well as trace optimization parameters. There are no strict guidelines on how to choose the values of these parameters, and the best choice of parameters is dependent on the image type.

In general, the multi-scale CSF sizes must span the range of neurites’ radii. Because the voxel coding algorithm works on binary images, intensity thresholding is a necessary step. The threshold must be high enough to eliminate the background and separate neurites as much as possible without breaking them into many small regions. Region size threshold is used to eliminate small noise remaining in the image after filtering and thresholding. Typically, a few hundred voxels are sufficient to remove such noise without affecting regions of neurites. Branch length threshold is used to eliminate short terminal branches. This is usually necessary for simplifying the topology of branch merging patterns. It is recommended to set this threshold to at least a few voxel sizes in order to reduce artifacts of the voxel coding algorithm. For neurites containing short terminal branches (2–5 μm), spines, or filopodia, the branch length threshold should be high enough to eliminate the traces of such structures. These structures must be detected with separate algorithms, after the main trace of neurites is determined. The choice of optimization parameters was previously described (Vasilkoski and Stepanyants 2009). Best values of parameters *a*_{1,2}, which control the stiffness and straightness of the trace, depend on the class of neurites. For example, in the neocortex axons of many inhibitory neuron classes are much more tortuous than axons of excitatory neurons (Stepanyants et al. 2004), and therefore they must be reconstructed by using lower values of *a*_{1,2}. More studies are needed to determine the best values of the trace optimization parameters for different neuron classes.

Traces of neurites in this study are built out of short connected straight line segments. Their topological structure is conveniently represented with an adjacency matrix. This format makes it possible to describe structures containing loops, which are often present at the initial stages of the algorithm. This data format also allows for fast traversing through the trace and identifying its branch- and end-points. The final trace of all the neurites contained in the image stack can be converted from the adjacency matrix format to a more traditional SWC format of neuron morphology (Cannon et al. 1998).
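As a rough illustration of this representation, the sketch below finds end-points and branch-points from vertex degrees and writes a single loop-free tree as SWC rows (id, type, x, y, z, radius, parent). The function names, the placeholder radius of 1.0, and the node type 0 (undefined) are assumptions for illustration, not details taken from the software.

```python
def classify_vertices(adjacency):
    """Return (end_points, branch_points) of a symmetric 0/1 adjacency matrix."""
    degrees = [sum(row) for row in adjacency]
    end_points = [i for i, d in enumerate(degrees) if d == 1]
    branch_points = [i for i, d in enumerate(degrees) if d >= 3]
    return end_points, branch_points


def to_swc(adjacency, coords, root=0, radius=1.0, node_type=0):
    """Convert one tree to SWC rows: (id, type, x, y, z, radius, parent).

    Assumes the component containing root is loop-free; radius and
    node_type are placeholders because the matrix does not store them.
    """
    rows, swc_id, parent_of = [], {}, {root: -1}
    stack = [root]
    while stack:  # depth-first traversal starting from the root
        v = stack.pop()
        swc_id[v] = len(rows) + 1  # SWC identifiers are 1-based
        parent = parent_of[v]
        x, y, z = coords[v]
        rows.append((swc_id[v], node_type, x, y, z, radius,
                     -1 if parent == -1 else swc_id[parent]))
        for u, connected in enumerate(adjacency[v]):
            if connected and u not in parent_of:
                parent_of[u] = v
                stack.append(u)
    return rows


# Y-shaped trace: vertex 1 is connected to vertices 0, 2, and 3.
adjacency = [[0, 1, 0, 0],
             [1, 0, 1, 1],
             [0, 1, 0, 0],
             [0, 1, 0, 0]]
coords = [(0, 0, 0), (1, 0, 0), (2, 1, 0), (2, -1, 0)]
end_points, branch_points = classify_vertices(adjacency)
swc_rows = to_swc(adjacency, coords)
```

Because a parent is always numbered before its children, the parent column of every row refers to an earlier row, as SWC readers generally expect.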

### The Neural Circuit Tracer Software

The Neural Circuit Tracer is open source software built using Java (Sun Microsystems) and MatLab (MathWorks, Inc., Natick MA). It is based on the core of ImageJ (http://rsbweb.nih.gov/ij), and the graphical user interface has been developed by using Java Swing. The software combines a number of functionalities of ImageJ with several newly developed functions for automated and manual tracing of neurites. The Neural Circuit Tracer is designed in a way that allows users to add any plug-ins developed for ImageJ. More importantly, functions written in MatLab and converted into Java with the MatLab JA toolbox can also be added to the Neural Circuit Tracer.

- Loading a stack of images with the option to reduce the stack in *xy* or *xyz* dimensions. The reduction can be accompanied with 3D median, mean, minimum, maximum, or standard deviation filtering of the stack. Image stack can be loaded in Tagged Image File (.tif) and MatLab (.mat) formats.
- Multi-scale CSF and the region size filters (described above).
- Automated tracing module. This function is based on the algorithm described above.
- Manual editing module. The trace is conveniently displayed in 2D on top of the ImageJ stack view and in 3D inside the ImageJ 3D viewer plug-in (see Fig. 6). Trace can be zoomed in/out and rotated in 3D. Individual tree segments can be added and deleted. Tracing can be continued from any vertex of the trace. Trees can be connected, disconnected and deleted.
- Two trace optimization functions (described above).
- Saving and loading of the stack and the trace in MatLab (.mat) format.
- Exporting the trace in SWC format (Cannon et al. 1998).
- Adding Java and MatLab based plug-ins.
- Building, saving, and running macros. The user can save a group of functions as macros and run them consecutively.

## Results and Discussion

Performance of the automated tracing algorithm presented in this study was evaluated on three DIADEM challenge datasets (Brown et al. 2011). Neurites in the selected datasets were fluorescently labeled and imaged with confocal or two-photon microscopy. Other DIADEM datasets, obtained with brightfield microscopy, were not examined due to the beaded appearance of neurites. The voxel coding algorithm cannot be used for extracting the initial traces of neurites from such images and must be replaced with the Minimum Cost Path or tracking-based methods. After the initial trace is detected, the optimization and branch merging steps of the algorithm can be applied without changes.

Automated and manual (gold standard) traces can be compared quantitatively by using the DIADEM metric (Gillette et al. 2011) to evaluate the performance of the algorithm. However, this comparison is hindered by several considerations. First, the DIADEM metric relies heavily on the topologies of the reconstructions, yet it only compares connected tree structures. Broken trees score lower because only the tree sections connected to the roots contribute to the score. In general, topological errors which occur closer to the root of the reconstruction receive larger penalties. This seems reasonable if the root is selected at or close to the neuron’s cell body, but is somewhat arbitrary otherwise (see e.g. Fig. 8). Second, gold standard reconstructions obtained manually are usually not perfect. For example, comparisons of reconstructions performed manually by different users do not result in perfect scores (Gillette et al. 2011). Hence, automated reconstructions, even if more accurate than the gold standard, will not receive perfect scores. Finally, scoring automated reconstructions with the DIADEM metric becomes meaningless in situations where the numbers of theoretically expected ambiguities or errors are high.
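The first consideration, that only tree sections connected to the root contribute to the score, can be made concrete with a toy calculation. The sketch below is not the DIADEM metric; it only computes the fraction of trace vertices reachable from a chosen root, under the assumption that the trace is given as a symmetric 0/1 adjacency matrix. The function name is hypothetical.

```python
def fraction_connected_to_root(adjacency, root=0):
    """Fraction of vertices reachable from root in a 0/1 adjacency matrix.

    Illustration only: this is not the DIADEM metric, just a measure of
    how much of a trace remains attached to the root after a break.
    """
    seen, stack = {root}, [root]
    while stack:  # depth-first search over the connected component
        v = stack.pop()
        for u, connected in enumerate(adjacency[v]):
            if connected and u not in seen:
                seen.add(u)
                stack.append(u)
    return len(seen) / len(adjacency)


# Chain 0-1-2-3 with the link between vertices 1 and 2 missing.
broken_chain = [[0, 1, 0, 0],
                [1, 0, 0, 0],
                [0, 0, 0, 1],
                [0, 0, 1, 0]]
attached = fraction_connected_to_root(broken_chain, root=0)
```

A single break near the root detaches most of the structure, whereas the same break near a terminal detaches little, which mirrors why errors close to the root are penalized more heavily.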

## Conclusions

In this study we describe a methodology for automated tracing of neurites from 3D LM stacks of images. Our algorithm and software can be used for tracing neurites labeled with fluorescent and non-fluorescent markers and imaged with confocal, two-photon, or brightfield microscopy. We estimate how the minimal error-rate of an LM-based reconstruction approach depends on the sparseness of labeled neurites. Reconstructions can only be useful for circuit level analyses if the labeling is sufficiently sparse. If the threshold of label sparseness is exceeded, neurites cannot be reconstructed with confidence, even manually by trained operators.

Automated reconstruction tools, including the one described here, can substantially aid more time-consuming manual reconstruction techniques, but are unlikely to replace them completely, at least in the near future. Major improvements in the reconstruction methodology may be easier to achieve at the experimental rather than the computational end of the problem. Because there are no automated tools that perform well on diverse experimental datasets, the optimal reconstruction strategy may be to extract the initial traces of neurites, but merge branches automatically into tree-like structures only when the confidence for such mergers is relatively high. The remaining mergers can be done by the user without significant time investment. It is usually much faster and easier to connect broken branches with editing software than to find and repair erroneous connections.

## Information and Sharing Agreement

More information on the Neural Circuit Tracer software can be found at http://www.neurogeometry.net.

## Acknowledgements

We would like to acknowledge Dr. Zlatko Vasilkoski’s contribution to the implementation of the voxel coding algorithm and discussions related to the subject of this study. This work was supported by the NIH grant NS063494.