Abstract
The brain is an intricate system that controls a variety of functions. It consists of a vast number of cells that exhibit diverse characteristics. To understand brain function in health and disease, it is crucial to classify neurons accurately. Recent advancements in machine learning have provided a way to classify neurons based on their electrophysiological activity. This paper presents a deep-learning framework that classifies neurons solely on this basis. The framework uses data from the Allen Cell Types database, which contains a survey of biological features derived from single-cell recordings from mice and humans. The shared information from both sources is used to classify neurons into their broad types with the help of a joint model. An accurate domain-adaptive model, integrating electrophysiological data from both mice and humans, is implemented. Furthermore, data from mouse neurons, which also includes labels of transgenic mouse lines, is further classified into subtypes using an interpretable neural network model. The framework provides state-of-the-art results in terms of accuracy and precision while also providing explanations for the predictions.
Introduction
The classification of neurons, which are the basic units of the nervous system, has been a persistent challenge in neuroscience since the publication of Ramon y Cajal’s ’Histology of the Nervous System of Man and Vertebrates’ (Cajal, 1995). Classification of neurons is crucial for understanding brain function in both healthy and diseased states. It allows for consistent identification of neurons across different laboratories and experimental conditions (Zeng & Sanes, 2017). At the most fundamental level, cells can be categorized into non-neuronal cells and neurons. While neurons share the fundamental cellular structures of other cells, they are specialized in their ability to communicate through electrochemical signaling (Bean, 2007).
Neurons are cells that can produce and transmit electrical signals known as Action Potentials (AP). The electrical potential of neurons can be measured using micro-electrodes placed across their membrane. Normally, the electrical potential of neurons is negative, ranging from -40 to -90 millivolts. This potential difference is created by the concentration of ions inside and outside the cell. When a stimulus is applied to the cell, the membrane potential increases. If the stimulus is strong enough, it can cause the membrane potential to reach a threshold, which triggers the neuron to transmit an AP (Carter & Bean, 2009; Baranauskas, 2007).
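The threshold mechanism described above can be made concrete with a toy leaky integrate-and-fire sketch. All parameter values below (resting potential, threshold, input resistance, time constant) are illustrative round numbers, not the recording conditions or cell parameters from the dataset:

```python
def simulate_lif(i_inj_pa, v_rest=-70.0, v_thresh=-50.0, v_reset=-65.0,
                 tau_ms=10.0, r_megaohm=100.0, dt_ms=0.1, t_ms=100.0):
    """Toy leaky integrate-and-fire neuron returning spike times in ms.

    A constant injected current (pA) drives the membrane potential up from
    rest; if it reaches threshold, a spike is emitted and the potential
    resets. Parameter values are illustrative, not fitted to the ACTD.
    """
    v = v_rest
    spikes = []
    for step in range(int(t_ms / dt_ms)):
        # Membrane dynamics: dV/dt = (-(V - V_rest) + R*I) / tau.
        # pA * MOhm gives microvolts, so divide by 1000 to work in mV.
        dv = (-(v - v_rest) + r_megaohm * i_inj_pa / 1000.0) * dt_ms / tau_ms
        v += dv
        if v >= v_thresh:
            spikes.append(step * dt_ms)
            v = v_reset
    return spikes

# A weak stimulus stays subthreshold; a stronger one triggers APs.
assert simulate_lif(i_inj_pa=100.0) == []
assert len(simulate_lif(i_inj_pa=250.0)) > 0
```

With these values, 100 pA depolarizes the membrane by only 10 mV (to -60 mV, below threshold), whereas 250 pA drives it past -50 mV and the model fires repeatedly.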
Neurons can be divided into two categories: excitatory and inhibitory neurons (Melzer & Monyer, 2020). The main difference between the two types is that excitatory neurons release neurotransmitters, typically glutamic acid, that stimulate the postsynaptic neuron to produce an action potential. Conversely, inhibitory neurons release neurotransmitters, such as gamma-aminobutyric acid (GABA), which prevent the firing of an action potential. Although inhibitory interneurons make up only 10-20% of the total neural population in the cortex, they play a crucial role in sensation, movement, and cognition (Swanson & Maffei, 2019).
Physiologically, neurons consist of dendrites that serve as the input, the soma (cell body) that contains the nucleus, and the axon that serves as the output. Excitatory neurons are generally spiny in structure, with a long apical dendrite. They exhibit less variability in their electrophysiological characteristics, which makes it challenging to differentiate between subtypes of excitatory cells using electrophysiological features alone. Inhibitory neurons, on the other hand, are typically either aspiny or sparsely spiny, with a more compact dendritic structure. They show greater variance in electrophysiological properties and tend to spike faster (Kawaguchi, 1997; Strübing et al., 1995). Neurons can also be classified by their neurotransmitter: GABAergic neurons, which are inhibitory, and Glutamatergic neurons, which are typically excitatory and brain-area specific. Notably, electrophysiological features are easier to measure than morphological or genetic ones (Zeng & Sanes, 2017); hence, we focus on neuronal classification relying exclusively on electrophysiology.
Many neuroscientists also consider GABAergic neurons as belonging to one of four subclasses based on the expression of specific principal gene markers (Tremblay et al., 2016): Pvalb (parvalbumin) positive, Vip (vasoactive intestinal peptide) positive, Sst (somatostatin) positive, and Htr3a (5-hydroxytryptamine receptor) positive but Vip negative. These subclasses of GABAergic interneurons account for most neurons in specific brain regions, as demonstrated in Fig. 1. The classes are expressed in a non-overlapping manner: each neuron belongs to exactly one class, and distinct cell types are accompanied by different physiological properties.
In 2019, the Allen Cell Types Database (ACTD) became publicly available (For Brain Science AI, 2015; Gouwens et al., 2019). Thanks to recent advancements in computing capabilities and the rapid development of machine and deep learning methods, the field of neuronal cell-type classification has made significant progress. The ACTD identified 17 electrophysiological neuronal types, four of which were classified as excitatory subtypes, and 13 as inhibitory subtypes. The 13 inhibitory subtypes were further categorized into the four inhibitory interneuron types based on genetic tags: Vip, Ndnf, Sst, and Pvalb. Additionally, the researchers identified 38 morphological and 46 morpho-electric neuronal types, all categorized using current clamp electrophysiological recordings. These groups underwent a dimensionality reduction process using algorithms such as principal component analysis (Abdi & Williams, 2010) and t-distributed stochastic neighbor embedding (Van der Maaten, 2008), and were then clustered using a Gaussian mixture model (Reynolds, 2009).
In their study, Ghaderi et al. (2018) developed a semi-supervised method using data from the ACTD to classify neurons into three types: excitatory pyramidal cells (Pyr), Pvalb interneurons, and Sst interneurons from layer 2/3 of the mouse primary visual cortex. The authors achieved accuracies of 91.59 ± 1.69%, 97.47 ± 0.67%, and 89.06 ± 1.99% for Pvalb, Pyr, and Sst neurons, respectively, resulting in an overall accuracy of 92.67 ± 0.54% for the classification of the three neuron types.
Seo and Lee (2019) used machine learning to predict transgenic markers of neurons using electrophysiological recordings. The work evaluated three different methods: Random Forest (RF) (Ho, 1995), Least Absolute Shrinkage and Selection Operator (Tibshirani, 1996), and Fully Connected Neural Network (FCNN) (Rosenblatt, 1958). The three models performed similarly in predicting transgenic markers for excitatory neurons (Ctgf, Cux2 & Slc17, Nr5a1 & Scnn1a, Ntsr1, Rbp4, and Rorb) with an accuracy range of 28.57-46.93%. They also predicted transgenic markers for inhibitory neurons (Chrna2, Gad2, Htr3a\(+|\)Vip−, Ndnf, Nkx2, Pvalb, Sst, Vip & Chat) with an accuracy range of 59.03-73.49%.
More recently, Rodríguez-Collado and Rueda (2021) revealed a circular ordered taxonomy by transforming the first two principal components of the electrophysiological features. They validated the proposed taxonomy with several machine learning models: linear discriminant analysis (Balakrishnama & Ganapathiraju, 1998), RF, gradient boosted decision trees (Chen & Guestrin, 2016; Natekin & Knoll, 2013), support vector machine (Cortes & Vapnik, 1995), and an FCNN ensemble (Zhou & Chen, 2002). These models discriminated the different neuron types (four types of inhibitory neurons - Pvalb, Htr3a\(+|\)Vip−, Sst, Vip - as well as Glutamatergic excitatory cells) using electrophysiological features, with accuracy ranging between 66.1-75.2% on the raw data and 72.0-80.3% on a subset of the data cleaned using anomaly detectors.
The studies mentioned so far have relied only on neuronal data collected from mice, as access to human neuronal data is limited. Even though mouse and human neuronal cell types and functionality are similar (Wong et al., 2023), electrophysiological differences between neurons from the two species still exist; for example, human pyramidal neurons have a higher action potential threshold voltage, a lower input resistance, and larger dendritic arbors (Mihaljević et al., 2021).
One can overcome the limitation of scarce neuronal data availability using a machine-learning model that is trained on data from both humans and mice. However, this approach may result in overfitting to the most abundant domain, leading to performance gaps on data from other domains (Novak et al., 2018; Farahani et al., 2021). This problem is known as the domain shift problem, and it can be addressed with tools from the domain adaptation field (Zhou et al., 2022). Additionally, due to their complexity, neural networks are often referred to as "black boxes", making it difficult to interpret their predictions (Gunning et al., 2019). In the field of biomedicine, model interpretability is crucial, as practitioners need to trust and validate the predictions made by the model.
This paper presents a machine-learning framework that addresses the challenge of predicting neuronal cell types in two defined tasks. In the first task, the framework suggests enriching the scarce human neuronal data using data from mice. The same neural network is used to classify broad neuronal types (excitatory vs. inhibitory) in both human and mouse samples. The mouse source data (which is easier to obtain) is used to learn a distribution over the human target data by embedding mutual information of the two domain distributions. This improves the model’s results of the target data, i.e., human neuronal data. The second task involves classifying neurons into five neuronal subtypes, including Excitatory Glutamatergic cells and the different subclasses of inhibitory GABAergic cells (Pvalb, Htr3a\(+|\)Vip−, Sst, and Vip) in an explainable way. An inherently interpretable model is used to predict the five neuronal subclasses in a clear and understandable manner. Our algorithms achieve state-of-the-art evaluation metrics in classifying mouse neuronal types and add the ability to classify human neuronal types in a joint classification approach.
The code and data used for this research are publicly available at Footnote 1 and Footnote 2, respectively.
Methods
Data
The Mouse Neuronal Data contains recordings of whole-cell current clamp from identified fluorescent Cre-positive neurons or nearby Cre-negative neurons in acute brain slices derived from adult mice. To identify cells from mice, transgenic mouse lines harboring fluorescent reporters are used, along with drivers that allow for enrichment of cell classes based on marker genes. The Human Neuronal Data, on the other hand, is obtained from donated ex vivo brain tissue from neurosurgical and postmortem sources. This data is less abundant than data from mice and is challenging to obtain; it is available thanks to the generosity of tissue donors. The ACTD contains electrophysiological recordings from 1920 mouse cells and 413 human cells. Each whole-cell current clamp recording responds to a stimulation recorded at 200 kHz (before 2016) or 50 kHz (after 2016).
The dataset contains four stimulating conditions that evoke AP responses from the neurons. The first type, noise stimulation, comprises noise pulses of square current injections. The second type, ramp stimulation, involves gradually increasing the intensity of square current injections at a rate slower than the neuron’s time constant. The third stimulation type, long square stimulation, employs square pulses of extended duration to induce a response from the neuron. Finally, the fourth stimulation type, short square stimulation, delivers brief square pulses designed to elicit a single AP from the neuron. This focused stimulation allows a simple protocol to produce an AP.
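For illustration, simplified versions of these four stimulus waveforms can be generated as follows. The amplitudes, durations, and noise level are arbitrary placeholder values, not the Allen protocol parameters:

```python
import numpy as np

def make_stimuli(fs_hz=50_000, dur_s=1.0, amp_pa=100.0, seed=0):
    """Illustrative versions of the four stimulus classes (arbitrary values)."""
    rng = np.random.default_rng(seed)
    n = int(fs_hz * dur_s)
    t = np.arange(n) / fs_hz
    long_square = np.full(n, amp_pa)                         # sustained step
    short_square = np.where(t < 0.003, amp_pa, 0.0)          # brief 3 ms pulse
    ramp = amp_pa * t / dur_s                                # slowly rising current
    noise = long_square + rng.normal(0.0, 0.1 * amp_pa, n)   # noisy square pulses
    return {"long_square": long_square, "short_square": short_square,
            "ramp": ramp, "noise": noise}

stims = make_stimuli()
# The ramp rises from zero to full amplitude over the sweep.
assert stims["ramp"][0] == 0.0 and np.isclose(stims["ramp"][-1], 100.0, atol=0.01)
# The short square is nonzero only for the first 3 ms (150 samples at 50 kHz).
assert (stims["short_square"] > 0).sum() == 150
```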
The classification of neurons into their broad types in humans and mice, and the classification of mouse cells into their specific neuronal subclasses, rely on analyzing 41 electrophysiological tabular features. Although morphology-related features are attached to some neurons in the dataset, they were not used for classification: all 41 features are based solely on electrophysiological behavior, so that the presented method can be used in real-time clinical applications, where morphological features are lacking. Nevertheless, the morphology of a neuron is still expressed via its electrophysiological features, since morphology affects the propagation of the electrophysiological signal in the neuronal branching trees (Ofer et al., 2017). These features are extracted from APs captured within various stimulations, such as AP width, height, and threshold, along with features related to AP trains, such as firing speed. Recordings are performed using a range of stimulus protocols to characterize the intrinsic properties of the neurons: short pulses (3 ms current injections used to find the action potential threshold within 10 pA), long steps (1 s current injections from -110 pA to rheobase +160 pA, in 20 pA increments), slow ramps (25 pA per second, terminated after a series of action potentials is acquired), and naturalistic noise (pink noise scaled to three amplitudes: 0.75, 1, and 1.5 times rheobase). For more information on the electrophysiological features or stimulations used, please refer to the Allen Cell Electrophysiology Overview documentation (Footnote 3).
The Allen Institute has identified the dendritic morphology of each neuron by categorizing it as either aspiny, sparsely spiny or spiny. This was done by observing the slides of the neuron’s dendrites under a microscope at 20X or 63X magnification. These different dendritic types can be roughly classified as interneurons (aspiny and sparsely spiny) and pyramidal or spiny stellate neurons (spiny).
Aspiny dendrites are characterized by the absence of spiny protrusions, the lack of a pronounced apical dendrite and/or an axon that emerged from the soma or dendrite at odd angles, and extensive local branching. Sparsely spiny dendrites are defined by infrequent to moderately frequent spiny protrusions (approximately one spine per 10 microns), the lack of a pronounced apical dendrite and/or an axon that emerged from the soma or dendrite at odd angles, extensive local branching, and/or projections up to layer 1.
Spiny dendrites are defined by the presence of frequent spiny protrusions (approximately one spine per 1-2 microns), an axon that descended perpendicularly down to the white matter with sparse, proximal branching occurring at right angles to the primary axonal branch and/or a pronounced primary, apical dendrite (For Brain Science AI, 2015).
Figure 2 shows an example of an electrophysiological response of a neuron to a noise-type stimulation.
Neurons can be classified into two categories: GABAergic and Glutamatergic. GABAergic neurons are further divided into four subclasses based on their expressed Cre lines: Pvalb (parvalbumin) positive, Vip (vasoactive intestinal peptide) positive, Sst (somatostatin) positive, and 5-hydroxytryptamine receptor 3A (Htr3a) positive and Vip negative. Glutamatergic neurons, on the other hand, are classified based on their laminar locations and the locations to which they project their axons, as highlighted in Tremblay et al. (2016).
Using the ACTD, researchers have defined five transcriptomic-electrophysiological subclasses, including four major GABAergic subclasses and one Glutamatergic subclass. The subclasses are specified in Fig. 3, as per Rodríguez-Collado and Rueda (2021); Tremblay et al. (2016).
After the preprocessing stage, which is visualized in Fig. 4, we are left with 1424 mouse samples and 299 human samples for further analysis. These samples will be used to train, validate, and test our classification models. Among the mouse samples, 700 samples are classified as Glutamatergic and correspond to spiny neurons, while the other 724 samples are classified as GABAergic and pertain to aspiny neurons. In the case of human samples, we have 231 spiny neurons and 68 aspiny neurons available for analysis. The distribution of inhibitory and excitatory cells in humans and mice is illustrated in Fig. 5.
Examining the Cre-line subclasses present in the GABAergic mouse samples, we find four types: 231 neurons belong to the Pvalb subclass, 199 neurons are Htr3a positive and Vip negative, 173 neurons fall into the Sst subclass, and 121 neurons are Vip positive. A graphical representation of this distribution is shown in Fig. 6.
The extracted tabular AP features can be found in the supplementary material.
Classification Models
Artificial neural networks (ANN) are a class of machine learning models inspired by biological neural networks. ANNs rely on matrix multiplications followed by nonlinear activation functions to learn complex relations between input and output. ANNs are comprised of artificial neurons connected through edges. Each edge typically has a weight that adjusts the strength of the signal at that connection, and the weights are ’learned’ through an optimizer such as Stochastic Gradient Descent (SGD) (Ruder, 2016).
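As a minimal illustration of how such weights are learned, the sketch below performs SGD updates on a single sigmoid neuron with squared loss. This is a toy example, not the architecture or loss used in this paper:

```python
import numpy as np

def sgd_step(w, b, x, y, lr=0.1):
    """One SGD update for a single sigmoid neuron with squared loss.

    Forward: p = sigmoid(w.x + b); loss = (p - y)^2.
    Backward: chain rule through the activation, then a step against
    the gradient. Returns updated weights and the loss *before* the step.
    """
    z = np.dot(w, x) + b
    p = 1.0 / (1.0 + np.exp(-z))             # nonlinear activation
    grad_z = 2.0 * (p - y) * p * (1.0 - p)   # dLoss/dz via the chain rule
    w_new = w - lr * grad_z * x              # the edge weights are 'learned'
    b_new = b - lr * grad_z
    return w_new, b_new, (p - y) ** 2

w, b = np.zeros(3), 0.0
x, y = np.array([1.0, -2.0, 0.5]), 1.0
loss_before = sgd_step(w, b, x, y)[2]
w, b, _ = sgd_step(w, b, x, y)
loss_after = sgd_step(w, b, x, y)[2]
assert loss_after < loss_before  # one update reduces the loss on this sample
```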
Over the last decade, numerous neural network architectures have been developed for diverse applications (Liu et al., 2017). In this paper, we focus on fully connected neural networks, also referred to as multi-layer perceptron (MLP), or just a ’neural network’ (NN) (Krogh, 2008). We also use a new type of NN designed for tabular data, namely the Locally SParse Interpretable Network - LSPIN (Yang et al., 2022).
We focus on two classification tasks that rely solely on electrophysiological features. In the first task, we use a joint model to classify neurons from humans and mice according to their dendrite type (spiny vs. aspiny). In the second task, we classify neurons from mouse samples into their respective cell classes based on marker genes (multi-label classification). These include Pvalb, Sst, Vip, Htr3a\(+|\)Vip−, and Glutamatergic neurons. For the first task, we introduce a domain adaptation component to handle measurements from humans and mice using a joint model based on mutual information from the two domains. For the second task, we use a NN with a sample-specific feature selection mechanism, namely LSPIN, to reduce model overfitting in low-sample-size data and obtain an interpretable model.
Domain-adaptive Classification
In this section, we aim to design a model that classifies neurons from human samples into neuronal types, which is difficult because human neuronal samples are scarce and hard to obtain. To overcome this issue, we use both mouse and human samples to establish a shared distribution of similar characteristics from both domains. The underlying assumption is rooted in the biological similarity between mouse and human neurons, both originating from mammalian brain tissue. We effectively classify neuronal samples by utilizing common information within samples from both species. However, conventional ANNs may underperform in such a scenario due to the domain shift arising from the distinct distributions of mouse and human neuronal data. We overcome this limitation by introducing an adversarial domain adaptation scheme, namely DANN (Ganin et al., 2016), designed to mitigate the influence of this domain shift. This scheme aligns the distributions of mouse and human neuronal data, enhancing the model’s ability to classify human neuronal samples accurately.
We consider \(X \in \mathbb {R}^{D}\) the input space, and \(Y \in \{0, 1\}\) the output space, where 0 denotes an excitatory cell and 1 an inhibitory cell. We define S to be the source distribution over \(X \times Y\), and \(D_S\) to be the marginal distribution such that \(S = \{(\varvec{x}_i, y_i)\}_{i=1}^n \sim D_S\). We define T to be the target distribution over \(X \times Y\), and \(D_T\) to be the marginal distribution such that \(T = \{(\varvec{x}_i, y_i)\}_{i={n+1}}^{N} \sim D_T\), where n is the number of source samples and N is the total number of samples. We aim to define a classifier \(\eta : X \rightarrow {} Y\) for which the target risk in Eq. (1) is low while maintaining a low source risk:
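(The referenced Eq. (1) is not reproduced here; following the notation of Ganin et al. (2016), the target risk presumably reads:)

```latex
R_{D_T}(\eta) = \Pr_{(\varvec{x}, y) \sim D_T} \left( \eta(\varvec{x}) \neq y \right)
```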
Since there may be a shift between \(D_S\) and \(D_T\), training a naive model based on Eq. (1) can be biased towards the more abundant domain \(D_S\). To alleviate such bias, Ganin et al. (2016) introduced a technique called DANN that combines representation learning (i.e., deep feature learning) and unsupervised domain adaptation in an end-to-end training process. DANN jointly optimizes two adversarial losses, minimizing the loss of a label classifier and maximizing the loss of a domain classifier. Training both losses can be considered a form of adversarial neural network regularization. On the one hand, the network needs to classify the data into the correct labels. On the other hand, the predictions made by the network must be based on features that cannot discriminate between the source domain and the target domain. In our setting, mouse cells are considered the source distribution and are more abundant, and the human cells serve as the target distribution.
The prediction loss and domain loss are respectively defined as:
where \(\theta _f, \theta _y, \theta _d\) are the parameters of the feature extractor, label classifier, and domain classifier, respectively, \(G_f, G_y, G_d\) are the function outputs of the feature extractor, label classifier and domain classifier, respectively, and \(d_i\) is the domain label of sample i as illustrated in Fig. 7.
Overall, training the model consists of optimizing Eq. (2):
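(Eq. (2) is not reproduced here; in the formulation of Ganin et al. (2016), the joint objective combining the prediction loss and the domain loss presumably takes the form:)

```latex
E(\theta_f, \theta_y, \theta_d) =
  \frac{1}{n} \sum_{i=1}^{n} \mathcal{L}_y^{i}(\theta_f, \theta_y)
  - \lambda \left( \frac{1}{n} \sum_{i=1}^{n} \mathcal{L}_d^{i}(\theta_f, \theta_d)
  + \frac{1}{N-n} \sum_{i=n+1}^{N} \mathcal{L}_d^{i}(\theta_f, \theta_d) \right)
```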
by finding the saddle point \(\hat{\theta }_f, \hat{\theta }_y, \hat{\theta }_d\) such that:
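(Eq. (3) is not reproduced here; following Ganin et al. (2016), the saddle point presumably satisfies:)

```latex
(\hat{\theta}_f, \hat{\theta}_y) = \underset{\theta_f, \theta_y}{\arg\min}\; E(\theta_f, \theta_y, \hat{\theta}_d),
\qquad
\hat{\theta}_d = \underset{\theta_d}{\arg\max}\; E(\hat{\theta}_f, \hat{\theta}_y, \theta_d)
```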
To optimize over Eq. (3), we can use gradient descent, which relies on the following update rules:
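(These update rules are not reproduced here; following Ganin et al. (2016), they presumably read:)

```latex
\theta_f \leftarrow \theta_f - \mu \left( \frac{\partial \mathcal{L}_y^{i}}{\partial \theta_f} - \lambda \frac{\partial \mathcal{L}_d^{i}}{\partial \theta_f} \right),
\qquad
\theta_y \leftarrow \theta_y - \mu \frac{\partial \mathcal{L}_y^{i}}{\partial \theta_y},
\qquad
\theta_d \leftarrow \theta_d - \mu \lambda \frac{\partial \mathcal{L}_d^{i}}{\partial \theta_d}
```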
where \(\mu\) is the learning rate.
Using the aforementioned NN architecture, domain adaptation is achieved by forcing the prediction to be based on features that cannot discriminate between mouse and human samples. Final classification decisions are made using discriminative features that are domain (organism) invariant. We assume that a good representation for cross-domain transfer is one for which an algorithm cannot discriminate between the two domains (Farahani et al., 2021; Rozner et al., 2023).
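The mechanism that realizes this adversarial objective in DANN is the gradient reversal layer: identity in the forward pass, sign-flipped (and scaled) gradients in the backward pass. A minimal numpy sketch of its semantics (illustrative only, not our training code) is:

```python
class GradReverse:
    """Gradient-reversal layer in the style of DANN (Ganin et al., 2016).

    The forward pass is the identity; the backward pass multiplies incoming
    gradients by -lambda. The domain classifier therefore descends its own
    loss, while the feature extractor ascends it, erasing domain information.
    """
    def __init__(self, lam=1.0):
        self.lam = lam

    def forward(self, x):
        return x  # identity in the forward direction

    def backward(self, grad_out):
        return -self.lam * grad_out  # sign flip toward the feature extractor

# Toy check with one scalar feature f and domain loss L_d = 0.5 * f^2.
grl = GradReverse(lam=0.5)
f = 2.0
d_loss_d_f = f                          # gradient of L_d w.r.t. the feature
g_to_features = grl.backward(d_loss_d_f)
f_new = f - 0.1 * g_to_features         # gradient-descent step on the feature
assert g_to_features == -1.0
# Because of the reversal, the step *increases* the domain loss:
assert 0.5 * f_new**2 > 0.5 * f**2
```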
Multi-label Classification Using Locally Sparse Networks
Collecting whole-cell current clamp recordings is labor intensive; for instance, the ACTD contains only 1920 mouse and 413 human cells. The low number of samples makes it challenging to train an over-parametrized NN while avoiding overfitting. To overcome this limitation, we adopt a recently proposed method for fitting ANN models to low-sample-size data, designed for tabular, heterogeneous biological data such as whole-cell current clamp recordings of neurons in various brain areas in mice. Specifically, we use the Locally SParse Interpretable Network - LSPIN (Yang et al., 2022), an intrinsically interpretable model with the benefit of exposing the features it relied on during inference. We use LSPIN to predict five distinct neuronal types: four subclasses of GABAergic neurons and one class of Glutamatergic neurons. We achieve state-of-the-art results using the proposed method, surpassing other machine learning models and prior work in this field.
LSPIN is a locally sparse neural network in which the local sparsity is learned to identify the subset of the most relevant features for each sample. LSPIN includes two neural networks trained in tandem. The first is the gating network that predicts the sample-specific sparsity patterns, and the second is the prediction network that classifies the neuron type. By forcing the model to select a subset of the most informative features for each sample, we can reduce overfitting in low sample size data. Another benefit of this model is that by predicting the most informative features locally, we obtain an interpretation of the predictions.
Consider labeled observations \(\{\varvec{x}^{(i)}, y^{(i)}\}_{i=1}^N\), where \(\varvec{x}^{(i)} \in \mathbb {R}^D\), \(x_d^{(i)}\) represents the dth feature of the ith sample, and \(y^{(i)}\) is the label of the ith sample. We want to learn a global prediction function \(\varvec{f_\theta }\), and a set of parameters \(\{\mu _d^{(i)}\}_{d=1, i=1}^{D, n}\) such that \(\mu _d^{(i)}\) depicts the behavior of the local stochastic gates \(z_d^{(i)} \in [0, 1]\) that sparsify (for each instance i) the set of features that propagate into the prediction model \(\varvec{f_\theta }\). Stochastic gates (Yamada et al., 2020) are continuously relaxed Bernoulli variables that are highly effective for the sparsification of ANNs. They were previously used for several applications, including feature selection (Shaham et al., 2022; Jana et al., 2021) and sparse Canonical Correlation Analysis (Lindenbaum et al., 2021).
Each stochastic gate (for feature d and sample i) is defined based on the threshold function in Eq. (4):
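(Eq. (4) is not reproduced here; in the notation of Yamada et al. (2020), the hard-thresholded gate presumably reads:)

```latex
z_d^{(i)} = \max \left( 0, \min \left( 1, \mu_d^{(i)} + \epsilon_d^{(i)} \right) \right)
```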
where \(\epsilon_d^{(i)}\sim\mathcal N(0,\sigma^2)\) and \(\sigma\) is fixed to a constant during training, and equals 0 during inference. The sample-specific parameters \(\varvec{\mu }^{(i)} \in \mathbb {R}^D, i = 1,...,N\) are predicted based on the gating network \(\varvec{\psi }\) such that \(\varvec{\mu }^{(i)} = \varvec{\psi }(\varvec{x}^{(i)}|\varvec{\Omega })\), where \(\varvec{\Omega }\) are the weights of the gating network. These weights are learned simultaneously with the weights of the prediction network by minimizing the loss in Eq. (5):
where \(\mathcal {L}\) is a desired loss (e.g. cross-entropy). \(\odot\) represents the Hadamard product (element-wise multiplication), and \(\mathcal {R}(\varvec{z}^{(i)})\) is a regularizer term defined in Eq. (6):
where \(K_{i, j} \ge 0\) is a user-defined kernel (e.g., radial basis function). The architecture of the LSPIN model is illustrated in Fig. 8.
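A minimal numpy sketch of the gating mechanism shows how per-sample gates zero out irrelevant features before they reach the prediction network. The \(\mu\) values below are hand-picked for illustration; in LSPIN they are produced by the trained gating network:

```python
import numpy as np

def stochastic_gates(mu, sigma=0.5, training=True, seed=0):
    """Hard-thresholded Gaussian gates in the style of STG/LSPIN (illustrative).

    During training, Gaussian noise makes the gates stochastic so gradients
    can flow; at inference the noise is removed (sigma effectively 0) and
    the gates become deterministic values in [0, 1].
    """
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma, size=mu.shape) if training else 0.0
    return np.clip(mu + eps, 0.0, 1.0)

# Per-sample sparsity: each row of mu comes from the gating network.
mu = np.array([[0.9, -0.8, 0.2],    # sample 1: feature 2 gated off
               [-0.7, 0.95, 0.1]])  # sample 2: feature 1 gated off
z = stochastic_gates(mu, training=False)
assert z[0, 1] == 0.0 and z[1, 0] == 0.0   # irrelevant features are zeroed
x = np.ones_like(mu)
sparse_x = x * z   # Hadamard product fed into the prediction network
```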
Results
Classification of Excitatory Vs. Inhibitory Neurons in Humans and Mice
Using the DANN model, we establish that whole-cell current clamp recordings of neurons in mouse brains are similar to those of human brains. Moreover, we show that implementing a joint ANN model to classify cells from human and mouse samples using mutual information from the two domains is possible.
Domain adaptation in low sample size settings is crucial since it enriches data in size and variability. In our example, human neuronal data is more challenging to acquire than data from mice. The significance of our results is that we prove common information is stored in the neuronal data from humans and mice. Since mouse neuronal data is more abundant than neuronal data from human samples, it can be used to strengthen classification models of human neuronal cell types.
We optimized the DANN model on 1378 training samples of cells from both humans and mice. The data were split into 1171 training samples and 207 validation samples, with 60 human and 285 mouse cells held out for testing the model. We used the hyperparameter optimization library ’Optuna’ (Akiba et al., 2019), which enables automated and systematic exploration of the hyperparameter space, to identify the most effective configuration for the model. ’Optuna’ searches over a grid of pre-defined hyperparameters such as the model’s architecture, regularization techniques, learning rates, and optimizers; each hyperparameter is given a set of candidate values, and ’Optuna’ chooses the values that obtain the top metrics on the validation set. The hyperparameters optimized in this manner are the weight decay (between 0.0001 and 1), network architecture (one of 6 architectures with 3 to 4 hidden layers), activation function in each layer (’relu’, ’selu’, or ’swish’), learning rate (between 0.001 and 0.2), dropout rate (between 0 and 0.5), batch size (32 or 64), number of epochs (500, 1000, 1500, or 2000), and optimizer (’SGD’, ’RMSProp’, or ’Adam’). The performance of the DANN model is shown in Figs. 9, 10 and Table 1. Using this method, we show that the model generalizes to the human and mouse domains while providing excellent classification results in accuracy, F1 score, precision, and recall.
The performance of the DANN model was evaluated using the accuracy, F1 score, precision, and recall evaluation metrics. The evaluation shows that the model successfully predicts cell types from humans and mice. Additionally, we compare the DANN model to an FCNN model that was trained and tested on the same dataset without an adversarial approach; the FCNN was optimized using the ’Optuna’ library under the same constraints as the DANN model. The training process of the DANN model is publicly available at Footnote 4, and that of the FCNN model at Footnote 5. We demonstrate that optimizing the model using the DANN scheme benefits the classification results in both domains; see Table 1. The method classifies neurons into their broad types with 95.0% accuracy on human samples and 97.4% accuracy on mouse samples using the same model weights, delivering a model that generalizes to both the human and mouse domains and classifies samples from both domains simultaneously.
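For reference, the reported metrics follow the standard confusion-matrix definitions. The sketch below uses hypothetical counts chosen for illustration only, not the counts from our test sets:

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)   # of predicted positives, how many are right
    recall = tp / (tp + fn)      # of actual positives, how many are found
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Hypothetical counts (100 cells total), for illustration only:
acc, prec, rec, f1 = binary_metrics(tp=45, fp=3, fn=2, tn=50)
assert abs(acc - 0.95) < 1e-9
assert abs(f1 - 2 * prec * rec / (prec + rec)) < 1e-12
```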
Ablation: We consider a neural network with the same feature extractor and label classifier architecture as the DANN model, but with the domain classifier and adversarial scheme removed. The model is trained on a single source of data and tested on the other source (trained on mouse data and tested on human data, and vice versa). A Fully Connected NN trained on mouse neuronal data and tested on human neuronal data achieved an accuracy of 0.646, an F1 score of 0.715, a precision of 0.973, and a recall of 0.565. Conversely, a Fully Connected NN trained on human neuronal data and tested on mouse neuronal data achieved an accuracy of 0.596, an F1 score of 0.712, a precision of 0.554, and a recall of 0.997. Without an adversarial domain adaptation scheme, results on the domain the model was not trained on are lacking. These results demonstrate the domain shift between the two distributions, which the DANN model overcomes by finding a shared embedding of the two data sources. When classifying samples of neurons from humans and mice, it is therefore important to address the domain shift: treating the data as a collection of distinct neurons across varied samples leads to lower evaluation metrics than using a DANN model, as shown in Table 1, since the samples are too heterogeneous to be treated as coming from a common source.
Classification of Neurons to Marker Gene Classes in Mice
Our research proposes the use of the LSPIN model to classify neurons into their five defined subclasses with exceptional accuracy. We achieved an accuracy score of 0.916, an F1 score of 0.877, a precision score of 0.886, and a recall score of 0.873, surpassing other machine learning models such as RF (Random Forest), SVC (C-Support Vector Classifier), and XGBoost (eXtreme Gradient Boosting) on all of the aforementioned metrics. Our results also exceed those of other related studies, as detailed in the introduction. The classification results are presented in Fig. 11 and Table 2.
During the training process, LSPIN was trained on 1105 samples from mice and tested on 277 (42 samples with NaN values from the ACTD were excluded). The prediction network included two hidden dense layers of sizes 40 and 20 (with an input layer of size 41 and an output layer of size 5). The gating network consisted of three layers of 50 neurons each. Tanh was used as the activation function for both the gating and prediction networks. The hyperparameters were \(\lambda _1 = 0.01047\), \(\lambda _2 = 0\), and \(\sigma = 0.5\); the network was trained for 1000 epochs with a learning rate of 0.0599 using the SGD optimizer.
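The stochastic-gates idea behind LSPIN (Yang et al., 2022) can be sketched as a forward pass: a gating network maps each sample to per-feature gate values that mask the input before the prediction network sees it. The weights below are illustrative, not trained, and the gates are shown in their deterministic test-time form (hard-threshold of the gating output shifted by 0.5):

```python
import math

def tanh_layer(x, W, b):
    """Dense layer with tanh activation; W is out-by-in."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

def gates(mu):
    """Test-time stochastic gates: clip mu + 0.5 into [0, 1]."""
    return [min(max(m + 0.5, 0.0), 1.0) for m in mu]

# Toy 3-feature sample; identity-like gating weights are purely illustrative.
x = [0.2, -1.0, 0.7]
W_gate = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
b_gate = [0.0, -0.5, 0.0]

mu = tanh_layer(x, W_gate, b_gate)
z = gates(mu)                                  # per-sample feature mask in [0, 1]
x_masked = [zi * xi for zi, xi in zip(z, x)]   # sparsified input to the prediction net
```

Here the second feature is fully gated out (its gate is 0) while the third passes through unchanged (gate 1), so the prediction network only sees the features the gating network deems informative for this particular sample.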
We can interpret the decisions of the prediction network via the outputs of the gating network and its stochastic gates. The model is forced to select a subset of the most informative features for each sample; this shrinks the feature space and reduces overfitting on low-sample-size data. The selection of relevant features produces an interpretable classification verdict in which relevant and irrelevant features are identified for each neuronal subclass: less essential features are attenuated, while important features are kept. See Fig. 12.
Discussion
The paper introduces two classification frameworks for identifying neuronal types based on electrophysiological recordings. The first method enriches mouse data by exploiting shared information from mouse and human neuronal samples. To overcome the domain shift between the mouse and human distributions, we use DANN. The joint model accurately classifies cells into their broad types with 95.0% accuracy for human cells and 97.4% accuracy for mouse cells.
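The mechanism that lets DANN find a domain-invariant embedding is the gradient-reversal layer (Ganin et al., 2016): during backpropagation, the gradient flowing from the domain classifier into the feature extractor is sign-flipped and scaled by lambda, so the extractor learns features the domain classifier cannot use. A scalar sketch with illustrative weights (not the paper's network):

```python
def dann_feature_grad(x, y_dom, w_f, w_d, lam):
    """Gradient reaching the feature weight through a gradient-reversal layer.

    Scalar toy model: h = w_f * x is the feature, d = w_d * h the domain
    logit, and L = (d - y_dom)**2 the domain loss. Ordinary backprop would
    send dL/dh to the feature extractor; the GRL multiplies it by -lam.
    """
    h = w_f * x
    d = w_d * h
    dL_dh = 2.0 * (d - y_dom) * w_d   # ordinary gradient at the feature
    return -lam * dL_dh * x           # sign-flipped and scaled by the GRL

# With x=1, y_dom=0, w_f=w_d=1, lam=1: ordinary gradient is +2,
# the reversed gradient is -2, pushing the extractor toward
# features that confuse the domain classifier.
g = dann_feature_grad(1.0, 0.0, 1.0, 1.0, 1.0)
```

In a deep-learning framework the same effect is implemented as an identity function in the forward pass whose backward pass negates the incoming gradient.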
The second method uses LSPIN, a locally sparse ANN, to identify five neuronal subtypes. It overcomes low-sample-size data by automatically selecting a subset of the feature space and excluding non-essential features. Additionally, it is inherently interpretable, revealing which features are more or less significant for the classification. In our case, it provides biological insights into the defining features of each neuronal class.
It is also possible to combine the two methods by modifying the DANN architecture to have the LSPIN’s gating network as the DANN’s feature extractor, so that the DANN’s feature extractor provides local sparsification of features.
There are several avenues for future research in the field of neuronal classification. Firstly, expanding the range of species included in the classification process could provide valuable insights into the evolutionary conservation of different types of neurons and their functional properties. The DANN method is capable of handling domain adaptation from multiple sources, making it suitable for this type of research. Secondly, utilizing the DANN method in an unsupervised scheme could improve the accuracy of interneuron subcategory predictions for human neuronal samples. Thirdly, the generalization ability of the models should be investigated to determine their applicability in real-world scenarios. This can be achieved by evaluating the framework’s performance on independent datasets from different laboratories and experimental conditions. Finally, exploring the biological insights of each neuronal type would enhance the interpretability of the model’s predictions. This would provide a better understanding of the specific features and regions of the electrophysiological signals that contribute most significantly to the classification and characterization of different neuronal types.
Information Sharing Statement
Data sharing is not applicable to this article, as no new data were created in this study. The data used in the preparation of this article are publicly available at https://celltypes.brain-map.org/data (ACTD). The AllenSDK software library is open source and available at https://allensdk.readthedocs.io/en/latest/, and the source code of this research is publicly available at https://github.com/ofek181/Neuronal-Cell-Type-Classification.
Data Availability
All data used in the research is from the Allen Cell Types Database, found at: https://celltypes.brain-map.org/data.
References
Abdi, H., & Williams, L. J. (2010). Principal component analysis. Wiley interdisciplinary reviews: computational statistics, 2(4), 433–459.
Akiba, T., Sano, S., Yanase, T. et al. (2019). Optuna: A next-generation hyperparameter optimization framework. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp 2623–2631.
Balakrishnama, S., & Ganapathiraju, A. (1998). Linear discriminant analysis-a brief tutorial. Institute for Signal and information Processing, 18(1998), 1–8.
Baranauskas, G. (2007). Ionic channel function in action potential generation: current perspective. Molecular neurobiology, 35, 129–150.
Bean, B. P. (2007). The action potential in mammalian central neurons. Nature Reviews Neuroscience, 8(6), 451–465.
Cajal, S. R. et al. (1995). Histology of the nervous system of man and vertebrates. History of Neuroscience (Oxford Univ Press, New York) 6.
Carter, B. C., & Bean, B. P. (2009). Sodium entry during action potentials of mammalian neurons: incomplete inactivation and reduced metabolic efficiency in fast-spiking neurons. Neuron, 64(6), 898–909.
Chen, T., & Guestrin, C. (2016). Xgboost: A scalable tree boosting system. In: Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining, pp 785–794.
Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine learning, 20, 273–297.
Farahani, A., Voghoei, S., Rasheed, K. et al. (2021). A brief review of domain adaptation. Advances in Data Science and Information Engineering, pp 877–894.
Allen Institute for Brain Science. (2015). Allen Cell Types Database.
Ganin, Y., Ustinova, E., Ajakan, H., et al. (2016). Domain-adversarial training of neural networks. The journal of machine learning research, 17(1), 2096–2030.
Ghaderi, P., Marateb, H. R., & Safari, M. S. (2018). Electrophysiological profiling of neocortical neural subtypes: a semi-supervised method applied to in vivo whole-cell patch-clamp data. Frontiers in neuroscience, 12, 823.
Gouwens, N. W., Sorensen, S. A., Berg, J., et al. (2019). Classification of electrophysiological and morphological neuron types in the mouse visual cortex. Nature neuroscience, 22(7), 1182–1195.
Gunning, D., Stefik, M., Choi, J. et al. (2019). Xai explainable artificial intelligence. Science Robotics, 4(37):eaay7120.
Ho, T. K. (1995). Random decision forests. In: Proceedings of 3rd international conference on document analysis and recognition, IEEE, pp 278–282.
Jana, S., Li, H., Yamada, Y. et al. (2021). Support recovery with stochastic gates: Theory and application for linear models. arXiv preprint arXiv:2110.15960.
Kawaguchi, Y. (1997). Neostriatal cell subtypes and their functional roles. Neuroscience research, 27(1), 1–8.
Krogh, A. (2008). What are artificial neural networks? Nature biotechnology, 26(2), 195–197.
Lindenbaum, O., Salhov, M., Averbuch, A. et al. (2021). L0-sparse canonical correlation analysis. In: International Conference on Learning Representations.
Liu, W., Wang, Z., Liu, X., et al. (2017). A survey of deep neural network architectures and their applications. Neurocomputing, 234, 11–26.
Melzer, S., & Monyer, H. (2020). Diversity and function of corticopetal and corticofugal gabaergic projection neurons. Nature Reviews Neuroscience, 21(9), 499–515.
Mihaljević, B., Larrañaga, P., & Bielza, C. (2021). Comparing the electrophysiology and morphology of human and mouse layer 2/3 pyramidal neurons with bayesian networks. Frontiers in neuroinformatics, 15, 580873.
Natekin, A., & Knoll, A. (2013). Gradient boosting machines, a tutorial. Frontiers in neurorobotics, 7, 21.
Novak, R., Bahri, Y., Abolafia, D. A. et al. (2018). Sensitivity and generalization in neural networks: an empirical study. arXiv preprint arXiv:1802.08760.
Ofer, N., Shefi, O., & Yaari, G. (2017). Branching morphology determines signal propagation dynamics in neurons. Scientific reports, 7(1), 8877.
Reynolds, D. A., et al. (2009). Gaussian mixture models. Encyclopedia of biometrics, 741, 659–663.
Rodríguez-Collado, A., Rueda, C. (2021). Electrophysiological and transcriptomic features reveal a circular taxonomy of cortical neurons. Frontiers in Human Neuroscience, p 410.
Rosenblatt, F. (1958). The perceptron: a probabilistic model for information storage and organization in the brain. Psychological review, 65(6), 386.
Rozner, A., Battash, B., Wolf, L. et al. (2023). Domain-generalizable multiple-domain clustering. arXiv preprint arXiv:2301.13530.
Ruder, S. (2016). An overview of gradient descent optimization algorithms. arXiv preprint arXiv:1609.04747.
Seo, I., & Lee, H. (2019). Predicting transgenic markers of a neuron by electrophysiological properties using machine learning. Brain Research Bulletin, 150, 102–110.
Shaham, U., Lindenbaum, O., Svirsky, J., et al. (2022). Deep unsupervised feature selection by discarding nuisance and correlated features. Neural Networks, 152, 34–43.
Strübing, C., Ahnert-Hilger, G., Shan, J., et al. (1995). Differentiation of pluripotent embryonic stem cells into the neuronal lineage in vitro gives rise to mature inhibitory and excitatory neurons. Mechanisms of development, 53(2), 275–287.
Swanson, O. K., & Maffei, A. (2019). From hiring to firing: activation of inhibitory neurons and their recruitment in behavior. Frontiers in molecular neuroscience, 12, 168.
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267–288.
Tremblay, R., Lee, S., & Rudy, B. (2016). Gabaergic interneurons in the neocortex: from cellular properties to circuits. Neuron, 91(2), 260–292.
Van der Maaten, L., & Hinton, G. (2008). Visualizing data using t-SNE. Journal of machine learning research, 9, 2579–2605.
Wong, H. H. W., Chou, C. Y. C., Watt, A. J., et al. (2023). Comparing mouse and human brains. eLife, 12, e90017.
Yamada, Y., Lindenbaum, O., Negahban, S., et al. (2020). Feature selection using stochastic gates. In: International Conference on Machine Learning, PMLR, pp 10648–10659.
Yang, J., Lindenbaum, O., & Kluger, Y. (2022). Locally sparse neural networks for tabular biomedical data. In: International Conference on Machine Learning, PMLR, pp 25123–25153.
Zeng, H., & Sanes, J. R. (2017). Neuronal cell-type classification: challenges, opportunities and the path forward. Nature Reviews Neuroscience, 18(9), 530–546.
Zhou, K., Liu, Z., Qiao ,Y. et al. (2022). Domain generalization: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence.
Zhou, Z. H., & Chen, S. (2002). Neural network ensemble. Chinese Journal of Computers, 25(1), 1–8.
Funding
Open access funding provided by Bar-Ilan University. This work was supported by the Israel Science Foundation (#1066/20).
Author information
Authors and Affiliations
Contributions
Ofek Ophir wrote the main manuscript and Orit Shefi as well as Ofir Lindenbaum supervised and mentored the research. All authors reviewed the manuscript.
Corresponding authors
Ethics declarations
Competing Interest
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Ophir, O., Shefi, O. & Lindenbaum, O. Classifying Neuronal Cell Types Based on Shared Electrophysiological Information from Humans and Mice. Neuroinform (2024). https://doi.org/10.1007/s12021-024-09675-5