1 Introduction

Pioneered by Korbinian Brodmann in 1909 [3], the classical approach to mapping distinct brain regions is based on visually recognizing the cellular textures (cytoarchitecture) in images of brain sections. Several paper atlases have been created in this way for the brains of different species [10].

The primary methods for expert annotation of brain regions have changed little since then. Annotation is still a labor-intensive process performed only by the most experienced neuroanatomists. In this paper we propose a machine learning approach to atlas construction that uses automated texture recognition to imitate human pattern recognition in the annotation task.

There exist several section-based digital atlases that were constructed using automated registration algorithms. The best known is the Allen Reference Atlas for mouse [1, 4, 6], which is based on downsampled images of 50 \(\upmu \mathrm{m}\) per pixel. At this resolution, registration can be performed by maximizing intensity similarity using metrics such as correlation and mutual information.

Fig. 1.
A demonstration of the limitation of reduced resolution brain images. The “Original” image was taken at 0.5 \(\upmu \mathrm{m}\)/pixel. “Detector 1, 2, 3” represent the detection of three brain structures based on texture by the trained classifiers. The “Downsampled” image lacks the high-resolution details needed to distinguish the structure. (Best viewed in color)

The problem is that at this resolution the information on cellular texture is discarded, which results in poor localization in regions that lack high contrast boundaries (see Fig. 1). In this work we focus on the mouse brainstem, a part that has numerous cytoarchitecturally identifiable nuclei but is relatively homogeneous at low resolution. To overcome this limitation we have developed the active atlas, a texture-based atlas that operates on the full-resolution images and uses texture classifiers to differentiate structures not identifiable at low resolution. This distinguishes our approach from both the Allen atlas and those based on MRI or optical volumes [8, 9, 11, 13].

The contributions of this work are:

\(\bullet \) Detection of cytoarchitectural textures visible only at high resolution.

\(\bullet \) Identification of discrete structures in addition to overall registration.

\(\bullet \) Characterization of the positional variability of brain structures.

\(\bullet \) Use of iterative refinement to reduce human annotation effort.

The paper is organized as follows. Section 2 describes the procedure for building an active atlas. Section 3 presents evaluation results that demonstrate the confidence of registration and accuracy of texture detection.

2 The Active Atlas

The active atlas has two components:

  1. Anatomical model: stores, for each of 28 structures in the brainstem, the position statistics and probabilistic shape.

  2. Texture classifiers: predict the probability that a given image patch corresponds to a particular structure.

The construction of the atlas is iterative, starting with an initialization step that requires significant human labor, followed by refinement steps that require little or no expert labor (see Fig. 2). In our case, the initial step was to annotate three stacks of images, which required 30 h of work by an experienced neuroanatomist. From these annotated stacks, an initial anatomical model and a set of texture classifiers were constructed. The refinement then uses nine additional stacks that were not annotated. These stacks were aligned to the initial atlas, and the information from this alignment was used to refine the atlas and to estimate the variability from brain to brain. Below we provide more details on each step.

Fig. 2.
Incremental atlas building workflow

2.1 Preprocessing

Our dataset consists of sagittal brain sections from twelve mice of identical strain and age. The 20 \(\upmu \mathrm{m}\) sections are mounted with a tape-transfer system [12] to ensure minimal distortion. Each specimen gives roughly 400 sections, stained with Nissl and scanned at 0.5 \(\upmu \mathrm{m}\) resolution, demonstrating clear cytoarchitectonic features. The sections of each brain are registered via in-plane correlation-maximizing rigid transforms and stacked to reconstruct a 3D volume. The sufficiency of rigid transforms is supported by the smooth structure boundaries on virtual coronal sections of the reconstructed volumes.

2.2 Estimation of Anatomical Model

Model estimation takes as input a current model (initially null) and a set of manually or automatically annotated brains. From each annotated brain one can collect an aligned contour set for each structure (Fig. 3b), which can be converted into a 3D mesh or volume (Fig. 3c). Based on them we derive the average centroid position and the average shape of each structure, which constitute a refined model.

2.2.1 Position Estimation

All brains are co-registered using the method described in Sect. 2.4. Centroid positions of the same structure in the common space are averaged over all brains. Those of paired structures are further adjusted to ensure symmetry of the left and right hemispheres. The covariance matrices of centroid positions are also computed. They quantify brain variability and are used as structure-specific constraints for aligning future data.
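The position statistics described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the toy centroids, and in particular the symmetrization rule (mirroring about a plane at `midline` along the stacking axis and averaging) are assumptions, since the text does not specify how symmetry is enforced.

```python
import numpy as np


def estimate_position_stats(centroids):
    """Mean and covariance of one structure's centroid across brains.

    centroids: array-like of shape (n_brains, 3), in the common atlas space.
    """
    centroids = np.asarray(centroids, dtype=float)
    mean = centroids.mean(axis=0)
    # rowvar=False: each row is one brain, each column one spatial axis.
    cov = np.cov(centroids, rowvar=False)
    return mean, cov


def symmetrize_pair(mean_left, mean_right, axis=2, midline=0.0):
    """Make paired means mirror images about the plane x[axis] = midline.

    Assumed rule: mirror the right mean onto the left side, average it with
    the left mean, then mirror the average back to obtain the right mean.
    """
    mean_left = np.asarray(mean_left, dtype=float)
    mirrored_right = np.asarray(mean_right, dtype=float).copy()
    mirrored_right[axis] = 2.0 * midline - mirrored_right[axis]
    sym_left = 0.5 * (mean_left + mirrored_right)
    sym_right = sym_left.copy()
    sym_right[axis] = 2.0 * midline - sym_right[axis]
    return sym_left, sym_right


# Toy centroids (micrometers) of one structure in four co-registered brains.
toy = [[100.0, 200.0, 50.0], [110.0, 190.0, 55.0],
       [90.0, 205.0, 45.0], [100.0, 205.0, 50.0]]
mean, cov = estimate_position_stats(toy)
```

The inverse of `cov` would play the role of the structure-specific constraint used later during registration.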

2.2.2 Shape Estimation

All meshes of the same structure are aligned using the Iterative Closest Point algorithm [2] (Fig. 3d) and converted to aligned volumes. The average shape, represented as a probabilistic volume, is then computed by voxel-voting (Fig. 3e).
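Voxel-voting can be illustrated with a few lines of NumPy. The sketch below assumes the aligned volumes are already available as binary masks on a common grid (the ICP alignment and mesh-to-volume conversion are not shown); `probabilistic_shape` is a hypothetical name.

```python
import numpy as np


def probabilistic_shape(aligned_masks):
    """Fraction of instances that occupy each voxel.

    aligned_masks: sequence of boolean arrays with identical shape, one per
    aligned instance of the structure.
    """
    masks = np.stack([np.asarray(m, dtype=bool) for m in aligned_masks])
    # Each instance casts one vote per occupied voxel; the mean is the
    # empirical occupancy probability.
    return masks.mean(axis=0)


# Three toy 1D "volumes" standing in for aligned 3D masks.
votes = probabilistic_shape([[0, 1, 1, 0], [0, 1, 1, 1], [0, 0, 1, 0]])
```

Here `votes` holds, for each voxel, the fraction of instances that contain it.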

Combining average shapes with average centroid positions, we obtain a probabilistic atlas volume \(\mathbf {A}\) where \(\mathbf {A}(\mathbf {p})\) denotes the 28-dimensional probability vector at location \(\mathbf {p}\).

Fig. 3.
(a) Structure boundaries drawn by an expert (b) Aligned contour series in 3D (c) Facial motor nucleus from both hemispheres of different brains (d) Meshes aligned (e) Probabilistic average shape (f) Anatomical model with 28 structures

2.3 Learning Texture Classifiers

We train texture classifiers to differentiate a structure from its immediate surrounding region. We found that this gives better results than training against the entire background. The probable reason is that the anatomical model eliminates most of the uncertainty in gross positions, allowing the texture classifiers to focus on correcting small-scale error.

Image patches roughly 100 \(\upmu \mathrm{m}\) by 100 \(\upmu \mathrm{m}\) are used as units for classification. For each structure, a binary logistic regression classifier is trained using a positive patch set extracted from the interior of structure boundaries and a negative set extracted from the surrounding region within 50 \(\upmu \mathrm{m}\) from the boundaries. The feature vectors encoding the patches are the 1024-dimensional output of a pre-trained deep convolutional neural network (Inception-BN [7]). Although the network was originally trained for classifying natural images, it proves effective also for classifying histology textures.
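As a rough illustration of the per-structure classifier, the sketch below trains a binary logistic regression by plain gradient descent on precomputed feature vectors. Everything specific here is an assumption: the features are random stand-ins for the network outputs (16-dimensional instead of 1024-dimensional to keep the example small), and the learning rate, iteration count, and L2 weight are arbitrary choices, not values from the paper.

```python
import numpy as np


def sigmoid(z):
    # Clip to avoid overflow in exp for very large |z|.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -50.0, 50.0)))


def train_logistic(X, y, lr=0.1, n_iter=500, l2=1e-3):
    """Fit weights w and bias b by gradient descent on the logistic loss."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_iter):
        p = sigmoid(X @ w + b)
        w -= lr * (X.T @ (p - y) / len(y) + l2 * w)
        b -= lr * np.mean(p - y)
    return w, b


def predict_proba(X, w, b):
    """Probability that each patch belongs to the structure."""
    return sigmoid(np.asarray(X, dtype=float) @ w + b)


# Synthetic stand-ins: "interior" patches vs. "surrounding" patches.
rng = np.random.default_rng(0)
dim = 16
X = np.vstack([rng.normal(0.5, 1.0, size=(200, dim)),
               rng.normal(-0.5, 1.0, size=(200, dim))])
y = np.concatenate([np.ones(200), np.zeros(200)])
w, b = train_logistic(X, y)
accuracy = np.mean((predict_proba(X, w, b) > 0.5) == (y == 1))
```

In the setting described above, one such classifier would be trained per structure, with positives taken from the structure interior and negatives from its immediate surround.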

For an unannotated image, these classifiers are applied to patches with 25 \(\upmu \mathrm{m}\) spacing, resulting in score maps for different structures. All score maps of the same structure in one stack undergo the previously computed intra-stack alignment to form a set of 3D score volumes. Each volume represents a probabilistic estimate of a particular structure’s position in the reconstructed specimen (Fig. 4). Denote by \(\mathbf {S}(\mathbf {p})\) the vector consisting of the scores for different structures at location \(\mathbf {p}\).
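The patch-wise scoring can be sketched as a sliding window. At 0.5 \(\upmu \mathrm{m}\) per pixel, a 100 \(\upmu \mathrm{m}\) patch is 200 pixels wide and a 25 \(\upmu \mathrm{m}\) spacing is a stride of 50 pixels. The code below is an illustration under simplifying assumptions: `score_fn` is a placeholder for "extract features, then apply the structure's classifier", and the intra-stack alignment is taken to be the identity when stacking.

```python
import numpy as np


def score_map(image, score_fn, patch_px=200, stride_px=50):
    """Apply score_fn to every patch on a regular grid of top-left corners."""
    h, w = image.shape
    ys = np.arange(0, h - patch_px + 1, stride_px)
    xs = np.arange(0, w - patch_px + 1, stride_px)
    out = np.zeros((len(ys), len(xs)))
    for i, y0 in enumerate(ys):
        for j, x0 in enumerate(xs):
            out[i, j] = score_fn(image[y0:y0 + patch_px, x0:x0 + patch_px])
    return out


def score_volume(sections, score_fn, **kwargs):
    """Stack per-section score maps along a third axis (alignment omitted)."""
    return np.stack([score_map(s, score_fn, **kwargs) for s in sections],
                    axis=-1)


# Toy section with one bright square; mean intensity stands in for a score.
section = np.zeros((400, 500))
section[100:300, 150:350] = 1.0
toy_map = score_map(section, np.mean)
toy_volume = score_volume([section, section, section], np.mean)
```

Each entry of `toy_map` corresponds to one patch position, so the map resolution is set by the stride rather than by the pixel size.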

Fig. 4.
(a) An example score map for facial motor nucleus (b) Stacking 2D score maps forms a 3D score volume. (c,d) Score volumes for other structures.

2.4 Registering Atlas to Specimen

Registration is driven by maximizing the correlation at all voxels between the score vectors of the specimen volume and the probability vectors of the atlas volume. A global 3D transform first aligns the atlas roughly with the whole specimen. An affine transform is used to account for non-vertical cutting angles and scale changes due to dehydration. Separate 3D translations are then applied to each structure so that independent variations can be captured.

Let \(\varOmega \) be the domain of the atlas. For global transform, the objective to maximize is simply \(F^{global}(\mathbf {L}, \mathbf {b}) = \sum _{\mathbf {p}\in \varOmega } \mathbf {A}(\mathbf {p}) \cdot \mathbf {S}(\mathbf {L}\mathbf {p}\,+\,\mathbf {b})\), where \(\mathbf {L}\in \mathbb {R}^{3\times 3}\) and \(\mathbf {b}\in \mathbb {R}^3\) are respectively the linear and translation parts of the affine transform.
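The global objective can be evaluated directly from its definition. The sketch below is one possible reading, with assumptions stated plainly: volumes are dense arrays of shape (X, Y, Z, K) with K channels per voxel, transformed coordinates are rounded to the nearest voxel instead of being interpolated, and atlas voxels that map outside the specimen volume are ignored.

```python
import numpy as np


def global_objective(A, S, L, b):
    """Sum over atlas voxels p of A(p) . S(L p + b), nearest-neighbor lookup."""
    axes = [np.arange(n) for n in A.shape[:3]]
    p = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, 3)
    q = np.rint(p @ np.asarray(L, dtype=float).T
                + np.asarray(b, dtype=float)).astype(int)
    # Keep only atlas voxels whose image lies inside the specimen volume.
    inside = np.all((q >= 0) & (q < np.array(S.shape[:3])), axis=1)
    a = A.reshape(-1, A.shape[-1])[inside]
    s = S[q[inside, 0], q[inside, 1], q[inside, 2]]
    return float(np.sum(a * s))


# Toy example: the specimen contains the atlas cube shifted by one voxel in x.
A = np.zeros((6, 6, 6, 2))
A[2:4, 2:4, 2:4, 0] = 1.0
S = np.zeros((6, 6, 6, 2))
S[3:5, 2:4, 2:4, 0] = 1.0
f_aligned = global_objective(A, S, np.eye(3), [1, 0, 0])
f_identity = global_objective(A, S, np.eye(3), [0, 0, 0])
```

In this toy case the correct translation gives the larger objective (`f_aligned` exceeds `f_identity`), which is the behavior the maximization relies on.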

For the local transform of structure k, only the voxels inside the structure and those in a surrounding region within a 50 \(\upmu \mathrm{m}\) radius are considered. Denote the two sets by \(\varOmega _k^+\) and \(\varOmega _k^-\) respectively. The objective is

$$\begin{aligned} F^{local}(\mathbf {t}) = \sum _{\mathbf {p}\in \varOmega _k^+} \mathbf {A}(\mathbf {p}) \cdot \mathbf {S}'(\mathbf {p}+\mathbf {t}) - \sum _{\mathbf {p}\in \varOmega _k^-} \mathbf {A}(\mathbf {p}) \cdot \mathbf {S}'(\mathbf {p}+\mathbf {t})\; - \; \eta \mathbf {t}^\mathsf {T}C_k\mathbf {t}\;, \end{aligned}$$
(1)

where \(\mathbf {t}\in \mathbb {R}^3\) is the local translation and \(\mathbf {S}'\) is the globally transformed score volume. The regularization term penalizes deviation from the mean position defined in the atlas model, where \(C_k\) is the inverse of the position covariance matrix (see Sect. 2.2.1).
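Equation (1) can be transcribed term by term. The sketch below makes several assumptions for illustration: \(\varOmega _k^+\) and \(\varOmega _k^-\) are passed in as integer voxel coordinates (how the surrounding region is constructed is not shown), lookups in \(\mathbf {S}'\) use nearest-neighbor rounding, and the value of \(\eta \) is arbitrary.

```python
import numpy as np


def sum_of_dot_products(A, S_prime, voxels, t):
    """Sum of A(p) . S'(p + t) over the given voxels p."""
    voxels = np.asarray(voxels, dtype=int)
    q = np.rint(voxels + np.asarray(t, dtype=float)).astype(int)
    inside = np.all((q >= 0) & (q < np.array(S_prime.shape[:3])), axis=1)
    p, q = voxels[inside], q[inside]
    a = A[p[:, 0], p[:, 1], p[:, 2]]
    s = S_prime[q[:, 0], q[:, 1], q[:, 2]]
    return float(np.sum(a * s))


def local_objective(A, S_prime, omega_pos, omega_neg, t, C_k, eta=1.0):
    """Interior term minus surround term minus the position regularizer."""
    t = np.asarray(t, dtype=float)
    interior = sum_of_dot_products(A, S_prime, omega_pos, t)
    surround = sum_of_dot_products(A, S_prime, omega_neg, t)
    return interior - surround - eta * float(t @ C_k @ t)


# Toy atlas: probability 1 in a core cube, 0.5 in a one-voxel shell around it.
A = np.zeros((8, 8, 8, 1))
A[2:6, 2:6, 2:6, 0] = 0.5
A[3:5, 3:5, 3:5, 0] = 1.0
omega_pos = np.argwhere(A[..., 0] == 1.0)
omega_neg = np.argwhere(A[..., 0] == 0.5)
# Toy specimen scores: the structure sits one voxel further along x.
S_prime = np.zeros((8, 8, 8, 1))
S_prime[4:6, 3:5, 3:5, 0] = 1.0
C_k = np.eye(3)
f_shifted = local_objective(A, S_prime, omega_pos, omega_neg, [1, 0, 0], C_k, eta=0.1)
f_zero = local_objective(A, S_prime, omega_pos, omega_neg, [0, 0, 0], C_k, eta=0.1)
```

The surround term rewards translations that move high scores out of the region just outside the structure, while the regularizer keeps the translation small along directions in which the structure's position varies little across brains.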

Optimization for both cases starts with grid search, followed by gradient descent where the learning rate is determined using Adagrad [5]. From Fig. 5 one can visually verify the accuracy of registration. This registration effectively annotates new stacks for the 28 structures.
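The two-stage optimization can be sketched on a toy objective. This is an assumption-laden illustration, not the authors' code: the objective is a simple concave quadratic standing in for the registration objective, gradients are taken by central finite differences, and the grid, learning rate, and iteration count are arbitrary. Because the objective is maximized, the Adagrad update is written as an ascent step.

```python
import itertools

import numpy as np


def grid_search(f, axis_values):
    """Return the grid point with the highest objective value."""
    best = max(itertools.product(*axis_values), key=f)
    return np.array(best, dtype=float)


def numerical_gradient(f, x, h=1e-3):
    """Central finite-difference gradient of f at x."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = h
        grad[i] = (f(x + step) - f(x - step)) / (2.0 * h)
    return grad


def adagrad_ascent(f, x0, lr=0.5, n_iter=200, eps=1e-8):
    """Gradient ascent with per-coordinate Adagrad step sizes."""
    x = np.array(x0, dtype=float)
    accum = np.zeros_like(x)  # running sum of squared gradients
    for _ in range(n_iter):
        g = numerical_gradient(f, x)
        accum += g * g
        x += lr * g / (np.sqrt(accum) + eps)
    return x


# Toy objective with its maximum at a known translation.
target = np.array([3.2, -1.7, 0.4])


def toy_objective(t):
    return -float(np.sum((np.asarray(t, dtype=float) - target) ** 2))


coarse = grid_search(toy_objective, [np.arange(-5, 6, 2)] * 3)
refined = adagrad_ascent(toy_objective, coarse)
```

The grid search supplies a starting point near the best basin, and the gradient stage then refines it to sub-grid precision.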

Fig. 5.
(a) Reference model globally registered to the specimen. (b) Global registration. Showing the structure contours on a section. Structures are roughly aligned. (c) Local registration. Structures are aligned perfectly.

2.5 Evaluating Registration Confidence

The registration algorithm seeks a local maximum of the objective functions. We quantify the confidence of the registration by considering the height and the width of the local maximum to which it converged. The height of the peak is normalized by considering a z-test relative to the variance within a sphere around the peak. The width can be computed for any direction, based on the Hessian of the z-score around the peak, as the distance from the peak at which the z-score drops to 0. Figure 6 shows examples where different directions have different localization confidence.
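One way to read these two confidence measures is through a second-order expansion of the z-score around the peak: along a unit direction \(\mathbf {u}\), \(z(d) \approx z_0 + \tfrac{1}{2} d^2\, \mathbf {u}^\mathsf {T} H \mathbf {u}\), which reaches 0 at \(d = \sqrt{2 z_0 / (-\mathbf {u}^\mathsf {T} H \mathbf {u})}\). The sketch below follows that reading. The exact form of the z-test (here, the peak value standardized by the mean and standard deviation of objective values sampled in the sphere) is an assumption, as are the function names.

```python
import numpy as np


def peak_zscore(peak_value, values_in_sphere):
    """Standardize the peak against objective values sampled around it."""
    values = np.asarray(values_in_sphere, dtype=float)
    return float((peak_value - values.mean()) / values.std())


def peak_width(z_peak, hessian, direction):
    """Distance along `direction` at which the quadratic model hits zero."""
    u = np.asarray(direction, dtype=float)
    u = u / np.linalg.norm(u)
    curvature = float(u @ np.asarray(hessian, dtype=float) @ u)
    if curvature >= 0.0:
        # Not a maximum along this direction: the model never drops to zero.
        return float("inf")
    return float(np.sqrt(2.0 * z_peak / -curvature))


# Toy peak: sharp along x, flat along z.
H = np.diag([-2.0, -0.5, -0.125])
z0 = peak_zscore(6.0, [0.0, 1.0, 2.0, 3.0, 4.0])
width_x = peak_width(2.0, H, [1, 0, 0])
width_z = peak_width(2.0, H, [0, 0, 1])
```

Evaluating the width along the eigenvectors of the Hessian gives the axes of an uncertainty ellipsoid, with the flattest direction producing the longest axis.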

Fig. 6.
(a, b) Confident structures (c) Two unconfident structures (d) Uncertainty ellipsoids. The elongated structure VLL (yellow) is uncertain only in its axial direction, while Sp5I (red) is uncertain in rostral-caudal direction because its rostral and caudal boundaries are ambiguous. (Best viewed in color)

2.6 Updating Atlas

After new brains are co-registered with the atlas, average positions and shapes for all structures are re-estimated. Additional training patches can also be collected from the automatic annotations to improve the classifiers.

3 Results

3.1 Confidence of Registrations

The global registrations across all specimens have an average peak z-score of 2.06. The average peak radius is 98 \(\upmu \mathrm{m}\) in the steepest direction and 123 \(\upmu \mathrm{m}\) in the flattest direction. This suggests that the derived reference model captures the common anatomy of this population and matches all specimens with little room for adjustment. Figures 7 and 8 show the corresponding z-scores and peak widths for the per-structure registrations. The average per-structure z-score is 1.79 and the width is between 90 \(\upmu \mathrm{m}\) and 250 \(\upmu \mathrm{m}\) for most structures. Generally, small structures tend to be registered more confidently than large ones. This aligns well with intuition if one considers how position shifts affect the overlap between the structure and the texture map. For a small structure, a small translation might completely eliminate any overlap, while a large structure is less sensitive.

Fig. 7.
Average z-scores of the local registrations of different structures.

Fig. 8.
Average peak width of per-structure registrations of different structures.

3.2 Variability of Structure Position

Variability is captured by the amount of per-structure translation. Figure 9 shows these translations for different structures across all specimens. Most structures vary within 100 \(\upmu \mathrm{m}\) of the mean position defined in the atlas. Some structures are particularly variable; these are also the ones whose boundaries are difficult to define. The same structure in the left and right hemispheres generally has similar variability.

Fig. 9.
Variability of centroid positions for different structures. Same color indicates the same structure in left (L) and right (R) hemispheres.

3.3 Accuracy of Texture Classifiers

Figure 10 shows the test accuracy for the classification of different structures. The accuracies range from 0.7 to 0.9 with a mean of 0.79. Larger structures tend to be harder to classify, possibly because their texture is more inhomogeneous.

Fig. 10.
Accuracy of texture classifiers. (Structures sorted by increasing size)

4 Conclusion

The results demonstrate a form of co-training between the anatomical model and the texture classifiers. On the one hand, registrations perform well even though the classifiers for some structures are suboptimal, owing to the strong constraint imposed by the anatomical model. On the other hand, confident detection of the characteristic textures of many structures allows specimen-specific deviations from the current anatomical model to be discovered, contributing to a more accurate estimate of variability. The synergy between the anatomical information and textural information is the key feature of the proposed active atlas.