Most analysis methods for neural spike data require knowledge of the exact spike times of individual neurons. To obtain this information from extracellular recordings, spike sorting is necessary to exclude measurement artifacts and to separate spikes from different neurons. The quality of spike sorting can have significant effects on the results of spike train analyses [1]. However, manual spike sorting leads to extremely variable results because different people use different sorting strategies [2]. Reproducible results therefore require automated spike-sorting methods, even though these also produce sorting errors.

The goal of this study was to evaluate the performance of two automated spike sorting methods, valley seeking and the T-distributed EM algorithm, provided by the Plexon offline spike sorter [3]. Since the "true" origin of each spike is not known for experimental data, we used artificial data to compare the performances of the spike sorting algorithms. The artificial data sets were based on the statistical properties of spike and noise waveforms obtained from a large set of extracellular recordings from the turtle retina. To avoid a biased evaluation of the spike sorting algorithms, we used three separate artificial data sets, based on the same neuronal data, which were sorted 1) manually, 2) with valley seeking and 3) with the T-distributed EM algorithm before calculating the statistical properties. Each artificial data set consisted of 300 traces, containing between one and six classes of waveforms and representing spikes of up to three neurons and noise.
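
As a rough illustration of how labeled artificial waveforms with a known "true" origin can be produced, the following minimal sketch adds Gaussian noise to a mean waveform per class. It is not the generation procedure used in this study, which is not detailed here; the function name `make_artificial_waveforms`, the template values, and the assumption of independent Gaussian noise per sample are all hypothetical.

```python
import random


def make_artificial_waveforms(templates, noise_std, n_per_class, seed=0):
    """Return (waveforms, labels) built from mean waveforms plus noise.

    templates   -- list of mean waveforms, one list of samples per class
                   (hypothetical stand-ins for measured class statistics)
    noise_std   -- standard deviation of the additive Gaussian noise
                   (assumption: independent noise on every sample)
    n_per_class -- number of waveforms to draw for each class
    """
    rng = random.Random(seed)  # fixed seed keeps the data set reproducible
    waveforms, labels = [], []
    for label, template in enumerate(templates):
        for _ in range(n_per_class):
            # One noisy copy of the class template.
            waveforms.append([v + rng.gauss(0.0, noise_std) for v in template])
            # The class index is the known "true" origin of this waveform.
            labels.append(label)
    return waveforms, labels


# Two made-up classes with five waveforms each.
waveforms, labels = make_artificial_waveforms(
    templates=[[0.0, 1.0, 0.0], [0.0, -2.0, 0.5]],
    noise_std=0.1,
    n_per_class=5,
)
```

Because every waveform carries its class label, the output of any sorting algorithm can later be compared against a known ground truth.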

Both algorithms were applied to the first three principal components of the artificial waveforms for automatic spike sorting of all three artificial data sets with a range of different sorting parameter combinations. To quantify the spike sorting performance on a scale between 0 (chance) and 1 (identity), the Rand-index [4] was used to calculate the similarity of the correct classification and the classification obtained by a spike-sorting algorithm.
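
The comparison of two classifications can be sketched as follows. The plain Rand index is the fraction of waveform pairs on which both classifications agree (same class in both, or different classes in both). Since the scale above assigns 0 to chance, the sketch also includes the chance-corrected (adjusted) variant; which variant was used in the study is an assumption here, and the function names are hypothetical.

```python
from collections import Counter
from math import comb


def _pair_sums(true_labels, sorted_labels):
    """Count same-cluster pairs in the contingency table and its margins."""
    if len(true_labels) != len(sorted_labels):
        raise ValueError("both label lists must have the same length")
    cells = Counter(zip(true_labels, sorted_labels))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_true = sum(comb(c, 2) for c in Counter(true_labels).values())
    sum_sorted = sum(comb(c, 2) for c in Counter(sorted_labels).values())
    return sum_cells, sum_true, sum_sorted, comb(len(true_labels), 2)


def rand_index(true_labels, sorted_labels):
    """Fraction of waveform pairs treated alike by both classifications."""
    sum_cells, sum_true, sum_sorted, total = _pair_sums(true_labels, sorted_labels)
    if total == 0:
        return 1.0  # fewer than two waveforms: nothing to disagree on
    agreements = total + 2 * sum_cells - sum_true - sum_sorted
    return agreements / total


def adjusted_rand_index(true_labels, sorted_labels):
    """Chance-corrected Rand index: 0 expected at chance, 1 for identity."""
    sum_cells, sum_true, sum_sorted, total = _pair_sums(true_labels, sorted_labels)
    if total == 0:
        return 1.0
    expected = sum_true * sum_sorted / total
    maximum = (sum_true + sum_sorted) / 2
    if maximum == expected:
        return 1.0  # degenerate case: both classifications are trivial
    return (sum_cells - expected) / (maximum - expected)


# Class names differ, but the grouping is identical.
true_labels = [0, 0, 1, 1, 2, 2]
sorted_labels = [5, 5, 3, 3, 9, 9]
perfect_score = adjusted_rand_index(true_labels, sorted_labels)
```

Both measures depend only on which waveforms are grouped together, so the arbitrary unit numbers assigned by a sorting algorithm do not need to match the true class labels.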

We found that the T-distributed EM algorithm was clearly superior to valley seeking for all artificial data sets, independent of the sorting method on which the statistical properties of the waveforms were based. Using optimal sorting parameters, the EM algorithm led to an average Rand-index of 0.95 (std 0.13), whereas valley seeking yielded a maximal average Rand-index of only 0.57 (std 0.44). Compared to this marked difference between the two algorithms, the choice of the sorting parameters had only minor effects on the classification performance.