Abstract
Spike sorting is the process of retrieving the spike times of individual neurons that are present in an extracellular neural recording. Over the last decades, many spike sorting algorithms have been published. In an effort to guide a user towards a specific spike sorting algorithm, given a specific recording setting (i.e., brain region and recording device), we provide an open-source graphical tool, written in Python, for the generation of hybrid ground-truth data. Hybrid ground-truth data is a data-driven modelling paradigm in which spikes from a single unit are moved to a different location on the recording probe, thereby generating a virtual unit whose spike times are known. The tool enables a user to efficiently generate hybrid ground-truth datasets and to make informed decisions between spike sorting algorithms, fine-tune the algorithm parameters for the recording setting at hand, or gain a deeper understanding of those algorithms.
Notes
The tool is available at https://github.com/jwouters91/shybrid.
Please consult the phy documentation (https://phy.readthedocs.io/en/latest/) for more information about the template-gui format.
Acknowledgment
The authors would like to thank Jonathan Dan and Jonathan Moeyersons for their time spent on thoroughly testing the software and for their valuable feedback.
Additional information
This work was carried out at the ESAT Laboratory of KU Leuven, in the frame of KU Leuven Special Research Fund projects C14/16/057, and the Research Foundation Flanders (FWO) project FWO G0D7516N. This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 802895). This research received funding from the Flemish Government under the “Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen” programme. The scientific responsibility is assumed by its authors.
Appendix
A Auto hybridization fitting factor bounds
The calculation of the fitting factor bounds during the automatic hybridization is based on robust statistics, which are commonly used for the detection and removal of outliers. The automatic bounds selection is rather conservative, i.e., it is likely that quite a few good spikes are excluded from the hybridization when using the automated approach.
Consider \({\mathscr{B}}^{\left (n\right )} = \left \{\log _{10} \ {\upbeta }_{s}^{\left (n\right )} \ \vert \ s \in \mathcal {S}^{\left (n\right )} \right \}\), which is the set of the logarithms of the fitting factors (see “Hybrid Ground-Truth Model”) for a certain neuron n. The logarithm is used so that fitting factors close to zero can also be removed based on simple statistics. Given \({\mathscr{B}}^{\left (n\right )}\), the first and third quartiles are calculated, denoted by \(Q_{1}\) and \(Q_{3}\), respectively. From those quartile values the interquartile range (IQR) is calculated as \(\text{IQR} = Q_{3} - Q_{1}\). From those statistics the lower and upper bounds on the logarithm of the fitting factors are calculated:
\(Q_{1} - \frac {3}{4}\,\text{IQR}\)
and
\(Q_{3} + \frac {3}{4}\,\text{IQR},\)
where the IQR scaling factor (i.e., \(\frac {3}{4}\)) was determined experimentally.
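The bounds selection described in this appendix can be sketched in Python as follows. This is a minimal sketch, not the tool's actual implementation: it assumes that the bounds take the usual outlier-fence form (first quartile minus, and third quartile plus, a scaled IQR) in the log10 domain, that quartiles are computed with NumPy's default linear interpolation, and that all fitting factors are strictly positive. The function names `fitting_factor_bounds` and `select_spikes` are hypothetical.

```python
import numpy as np


def fitting_factor_bounds(fitting_factors, iqr_scale=0.75):
    """Return (lower, upper) bounds on the fitting factors in the linear domain.

    Assumption: the bounds are outlier fences computed on log10 of the
    fitting factors, with the IQR scaling factor of 3/4 from the text.
    """
    betas = np.asarray(fitting_factors, dtype=float)
    if np.any(betas <= 0):
        # Assumption: non-positive factors are rejected up front,
        # because log10 is undefined for them.
        raise ValueError("fitting factors must be strictly positive")
    log_betas = np.log10(betas)
    q1, q3 = np.percentile(log_betas, [25, 75])
    iqr = q3 - q1
    lower = q1 - iqr_scale * iqr
    upper = q3 + iqr_scale * iqr
    # Map the log-domain bounds back to the linear domain.
    return 10.0 ** lower, 10.0 ** upper


def select_spikes(fitting_factors, iqr_scale=0.75):
    """Boolean mask of the spikes whose fitting factor lies within the bounds."""
    betas = np.asarray(fitting_factors, dtype=float)
    lower, upper = fitting_factor_bounds(betas, iqr_scale)
    return (betas >= lower) & (betas <= upper)
```

Because the fences are computed on the logarithm, a fitting factor that is orders of magnitude too small is rejected just as readily as one that is orders of magnitude too large.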
B Auto hybridization random unit relocation
During the automatic hybridization, a random unit relocation is calculated for every neuron. For this relocation, only a shift in the y-direction is considered. The random shift is determined by drawing a y-position on the probe grid model (see “Hybrid Ground-Truth Model”) from a discrete uniform distribution. The channel with the maximal deflection in the spike template is shifted to this random y-position, which prevents the complete template from being shifted off the probe. The actual shift is then calculated as the random y-position minus the y-position of the channel with maximal deflection in the original template. A minimum shift of two channels is enforced to ensure that the re-inserted unit is sufficiently separable from the original unit.
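The relocation procedure can be illustrated with a short Python sketch. It rests on several assumptions that are not stated in the text: y-positions are represented as integer row indices of the probe grid, the minimum shift of two channels is interpreted as two grid rows, and the minimum shift is enforced by excluding too-close rows from the candidate set before drawing (the tool may enforce it differently, e.g., by redrawing). The function name `random_y_shift` is hypothetical.

```python
import numpy as np


def random_y_shift(y_positions, peak_y, min_shift=2, rng=None):
    """Draw a random shift in the y-direction for one unit.

    y_positions: integer row indices available on the probe grid
    peak_y: row index of the channel with maximal template deflection
    min_shift: minimum absolute shift, in grid rows (assumed interpretation)
    """
    rng = np.random.default_rng() if rng is None else rng
    y_positions = np.asarray(y_positions)
    # Keep only target rows that are far enough from the original peak row.
    candidates = y_positions[np.abs(y_positions - peak_y) >= min_shift]
    if candidates.size == 0:
        raise ValueError("no y-position satisfies the minimum shift")
    # Discrete uniform draw over the remaining rows.
    target_y = rng.choice(candidates)
    # Shift = random y-position minus the original peak y-position.
    return int(target_y - peak_y)
```

Because the target row is drawn for the peak channel itself, the peak of the relocated template always stays on the probe, even if part of the template footprint falls outside it.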
C External template import
When an external template is imported, neither the spike times nor the template scaling is known. The spike occurrences are therefore modelled as a Poisson point process, for which the inter-spike interval \({\Delta }_{\text {ISI}}\) follows an exponential distribution:
\(p\left ({\Delta }_{\text {ISI}}\right ) = \lambda e^{-\lambda {\Delta }_{\text {ISI}}},\)
where λ represents the desired spike rate. Every inter-spike interval sample \(\hat {\Delta }_{\text {ISI}}\) is enforced to last at minimum the user-defined refractory period \({\Delta }_{\min }\):
\(\hat {\Delta }_{\text {ISI}} \leftarrow \max \left (\hat {\Delta }_{\text {ISI}}, {\Delta }_{\min }\right ).\)
The simulated spike times are obtained by calculating the cumulative sum over the inter-spike interval samples. Those spike times are then discretized by multiplying them with the recording sampling frequency and rounding each product to its nearest integer. This gives rise to a set of discrete spike times \(\mathcal {S}^{\text {ext}} = \left \{ k_{\text {sim}} \right \}\).
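The spike time simulation can be sketched in Python as follows. This is an illustrative sketch rather than the tool's code: it assumes that the refractory period is enforced by raising every too-short interval to the refractory period, and that spikes are generated until a given recording duration is reached. The function name `simulate_spike_times` and its parameters are hypothetical.

```python
import numpy as np


def simulate_spike_times(rate_hz, refractory_s, duration_s, fs_hz, rng=None):
    """Simulate discrete spike times (sample indices) for an external template.

    rate_hz: desired spike rate (lambda)
    refractory_s: user-defined refractory period in seconds
    duration_s: recording duration in seconds (assumed stopping criterion)
    fs_hz: recording sampling frequency
    """
    rng = np.random.default_rng() if rng is None else rng
    times = []
    t = 0.0
    while True:
        # Exponential inter-spike interval with mean 1 / rate.
        isi = rng.exponential(scale=1.0 / rate_hz)
        # Assumption: short intervals are raised to the refractory period.
        isi = max(isi, refractory_s)
        # Cumulative sum over the inter-spike intervals.
        t += isi
        if t >= duration_s:
            break
        times.append(t)
    # Discretize: multiply by the sampling frequency and round to nearest integer.
    return np.rint(np.asarray(times) * fs_hz).astype(np.int64)
```

Note that enforcing the refractory period in this way lowers the effective spike rate slightly below the requested rate, because short intervals are lengthened while long ones are left untouched.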
The template scaling is derived from the user-defined desired peak-signal-to-noise ratio (PSNR \(= 10\log _{10}\frac {P_{\text {peak}}}{P_{\text {noise}}}\)). The scaling factor that brings the template to the desired PSNR is calculated as follows:
\(\sqrt {10^{\text {PSNR}/10} \, \frac {P_{\text {noise}}}{P_{\text {peak}}}},\)
with \(P_{\text {peak}}\) equal to the square of the peak absolute value over all channels of the external template and \(P_{\text {noise}}\) equal to a robust estimate (based on the median absolute deviation) of the noise variance of the channel on which the template reaches its peak absolute value.
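A possible Python sketch of the scaling calculation is given below. It is not taken from the tool's source: it assumes that the scaling factor is the amplitude gain that makes the scaled template reach the desired PSNR, and that the noise standard deviation is estimated as the median absolute deviation divided by 0.6745 (the usual consistency constant under a Gaussian noise assumption). The function names `mad_noise_variance` and `template_scaling` and the array layouts are hypothetical.

```python
import numpy as np


def mad_noise_variance(signal):
    """Robust noise variance estimate based on the median absolute deviation."""
    signal = np.asarray(signal, dtype=float)
    mad = np.median(np.abs(signal - np.median(signal)))
    # Assumption: Gaussian noise, so sigma is approximately MAD / 0.6745.
    sigma = mad / 0.6745
    return sigma ** 2


def template_scaling(template, recording, psnr_db):
    """Amplitude scaling that gives the template the desired PSNR (in dB).

    template: array of shape (n_channels, n_template_samples)
    recording: array of shape (n_channels, n_recording_samples)
    """
    template = np.asarray(template, dtype=float)
    abs_template = np.abs(template)
    # Channel on which the template reaches its peak absolute value.
    peak_channel, _ = np.unravel_index(np.argmax(abs_template), template.shape)
    p_peak = abs_template.max() ** 2
    p_noise = mad_noise_variance(recording[peak_channel])
    # Solve PSNR = 10 log10(scale^2 * P_peak / P_noise) for scale.
    return np.sqrt(10.0 ** (psnr_db / 10.0) * p_noise / p_peak)
```

The median-based estimate is used here because spikes already present in the recording would inflate an ordinary sample variance.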
The hybrid data generated from an external template can then be described as follows:
where \(\mathbf {t}_{c,\left (x,y\right )}^{\text {ext}}\) denotes the imported external template at channel c. Note that the template temporal window is derived from the external template directly. The external template is assumed to match the sampling frequency of the recording data that is being hybridized.
D Automatic merging
The merging framework for a specific ground-truth spike train consists of the following steps:
1) Compute the correspondence between the ground-truth spike train and all automatically recovered spike clusters in terms of precision and recall. More information on those performance metrics can be found in “Performance metrics calculation”.
2) Sort all clusters on descending precision, such that the cluster with the highest fraction of true spike times is on top of the list.
3) Merge the ordered clusters together in a top-down fashion, i.e., starting from the cluster with the highest precision, as long as the merge operation increases the F1-score of the new cluster that contains all previously merged clusters.
Initially, merging clusters with a high precision will increase the recall (sensitivity) at only a very small drop in precision, which will likely increase the F1-score. At a certain point, clusters will start containing significant amounts of false positives that notably decrease the precision of the merged cluster, which in turn decreases the F1-score. The proposed approach tries to find the combination of clusters with maximal F1-score without explicitly considering all possible combinations, thereby avoiding a combinatorial explosion.
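The merging steps above can be sketched in Python as follows. This is a simplified illustration, not the tool's implementation: it assumes that a recovered spike counts as a true positive when it can be matched one-to-one to a ground-truth spike within a tolerance of a few samples, which may differ from the matching used in “Performance metrics calculation”. All function names and the `tol` parameter are hypothetical.

```python
import numpy as np


def count_true_positives(gt, found, tol):
    """One-to-one matching of two sorted spike trains within +/- tol samples."""
    i = j = tp = 0
    while i < len(gt) and j < len(found):
        if abs(gt[i] - found[j]) <= tol:
            tp += 1
            i += 1
            j += 1
        elif found[j] < gt[i]:
            j += 1
        else:
            i += 1
    return tp


def scores(gt, found, tol):
    """Return (precision, recall, F1) of a recovered spike train."""
    if len(gt) == 0 or len(found) == 0:
        return 0.0, 0.0, 0.0
    tp = count_true_positives(gt, found, tol)
    if tp == 0:
        return 0.0, 0.0, 0.0
    precision = tp / len(found)
    recall = tp / len(gt)
    return precision, recall, 2 * precision * recall / (precision + recall)


def greedy_merge(gt, clusters, tol=2):
    """Merge clusters in order of descending precision while F1 increases.

    clusters: dict mapping a cluster id to its spike times (sample indices)
    Returns the ids of the merged clusters and the resulting F1-score.
    """
    gt = np.sort(np.asarray(gt))
    trains = {cid: np.sort(np.asarray(s)) for cid, s in clusters.items()}
    # Steps 1 and 2: precision per cluster, sorted in descending order.
    order = sorted(trains, key=lambda cid: scores(gt, trains[cid], tol)[0],
                   reverse=True)
    merged_ids, merged, best_f1 = [], np.array([], dtype=gt.dtype), 0.0
    # Step 3: merge top-down and stop at the first merge that does not help.
    for cid in order:
        candidate = np.sort(np.concatenate([merged, trains[cid]]))
        f1 = scores(gt, candidate, tol)[2]
        if f1 <= best_f1:
            break
        merged_ids.append(cid)
        merged, best_f1 = candidate, f1
    return merged_ids, best_f1
```

The loop evaluates at most one candidate merge per cluster, so the cost grows linearly with the number of clusters instead of exponentially with the number of possible cluster subsets.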
Cite this article
Wouters, J., Kloosterman, F. & Bertrand, A. SHYBRID: A Graphical Tool for Generating Hybrid Ground-Truth Spiking Data for Evaluating Spike Sorting Performance. Neuroinform 19, 141–158 (2021). https://doi.org/10.1007/s12021-020-09474-8
DOI: https://doi.org/10.1007/s12021-020-09474-8
Keywords
- Spike sorting
- Validation
- Hybrid ground truth
- GUI