# NeuroXidence: reliable and efficient analysis of an excess or deficiency of joint-spike events

- First Online:

- Received:
- Revised:
- Accepted:

- 29 Citations
- 744 Downloads

## Abstract

We present a non-parametric and computationally efficient method named NeuroXidence that detects coordinated firing of two or more neurons and tests whether the observed level of coordinated firing is significantly different from that expected by chance. The method considers the full auto-structure of the data, including the changes in the rate responses and the history dependencies in the spiking activity. Also, the method accounts for trial-by-trial variability in the dataset, such as the variability of the rate responses and their latencies. NeuroXidence can be applied to short data windows lasting only tens of milliseconds, which enables the tracking of transient neuronal states correlated to information processing. We demonstrate, on both simulated data and single-unit activity recorded in cat visual cortex, that NeuroXidence discriminates reliably between significant and spurious events that occur by chance.

### Keywords

Coordinated firing Temporal code Rate code Assembly hypothesis Synchrony Detection of joint-spike events Cat visual cortex Area 17 Moving grating Non-parametric and parametric significance estimation## 1 Introduction

It is commonly held that neurons encode information by modulations of their discharge rate. A complementary hypothesis is that information is also encoded in the precise relation between the discharges of spatially distributed neurons (Hebb 1949; von der Malsburg 1981; Fetz 1997; Singer 1999). These complementary views are addressed in the literature as the rate coding and the temporal coding hypotheses. Multiple methods have been developed to detect temporal relations between spiking events and to investigate whether these relations that are forming a spike pattern are correlated with stimulus configurations, behavior, or particular states of neuronal systems (Gerstein and Perkel 1969, 1972; Abeles and Gerstein 1988; Gray et al. 1989; Brown et al. 1998, 2004; Grün et al. 1999, 2002a, b; Martignon et al. 2000; Tetko and Villa 2001; Nakahara and Amari 2002; Pipa and Grün 2003; Barbieri et al. 2004, 2005; Sharpee et al. 2004; Samonds and Bonds 2004; Ikegaya et al. 2004; Czanner et al. 2005; Okatan et al. 2005; Kass et al. 2005; Pipa et al. 2006; Oweiss et al. 2007; Schneider et al. 2006). The methods differ in the definitions of the spike patterns, the techniques to detect these patterns, and the approaches to analyze the resulting data (descriptive, statistical hypothesis testing, maximum likelihood, and Bayesian approaches). Even though the temporal coding hypothesis formulates precisely what constitutes a spike pattern, it turns out to be a non-trivial problem to design a method that detects the existence of such patterns and to investigate their information content, without being confounded by other properties of the data (Oram et al. 1999; Brody 1999; Roy et al. 2000; Baker and Gerstein 2001; Grün et al. 2003).

In this article we present the novel method NeuroXidence^{1} (Pipa 2003; Pipa et al. 2006) for analyzing precise and consistent temporal relations between discharges of simultaneously recorded neurons that are referred to as coordinated firing events. NeuroXidence (see www.NeuroXidence.com) has been designed to deal with four main properties of neuronal spike trains that make the analysis of coordinated firing events difficult: variability, short timescales, history dependencies and auto-structures of the spiking activities, as well as the rareness of coordinated firing events. First, neuronal recordings show a high degree of variability, which is partially caused by noise, changes in the properties of neuronal responses, and by the dynamics of the information processing. Any analysis of coordinated firing events has to consider all three kinds of variability. Second, evidence indicates that neuronal states linked to the processing of information can last for only a short period of time (e.g., a few tens of milliseconds; Oram and Perrett 1992; Thorpe et al. 1996). This implies that analysis methods need to operate on short time series containing a very limited number of samples. Third, neuronal activity might have strong history dependencies, where the likelihood that a spike will occur at a certain point in time depends on the times at which previous spikes have occurred. Therefore, the auto-structures of the spike trains need to be considered in the analysis of potentially meaningful coordinated firing events. Fourth, past investigations of coordinated firing events indicated that these events are rare (Abeles and Gerstein 1988; Ikegaya et al. 2004). Thus, analysis of coordinated firing events might be difficult, since even highly parallel recordings from up to a hundred simultaneous recording sites might still grossly under-sample the neurons that participate in the actual temporal code.

In the present article we first describe the statistical hypothesis test and the detection algorithm used by NeuroXidence. Next, we utilize simulated data to illustrate the statistical properties, performance, and applicability of NeuroXidence based on calibration studies for false positives and for test-power. To this end we use simulated data with and without spike-rate modulations as well as epochs in which spike-rate modulations coincide with epochs of coordinated firing. We next compare the performance and concept of NeuroXidence to related methods such as the unitary events analysis (Grün et al. 2002a, b), the conventional cross-correlation analysis (König 1994), and maximum entropy approaches (Schneidman et al. 2006). Finally, we apply NeuroXidence to an example dataset consisting of 42 single units recorded simultaneously from visual area 17 of an anesthetized cat.

## 2 Methods

### 2.1 Coordinated firing events

Coordinated spiking has been defined in at least two different ways. The first definition is based on the synchronous spiking of at least two neurons with millisecond precision. These coordinated events have been referred to as synchronous firings (Gray et al. 1989), temporal-patterns (Vaadia and Abeles 1987), or joint-spike events (Grün et al. 2002a, b). The second type of coordinated firing is defined as two or more neurons spiking in sequence over a finite period of time. Such a sequence has been referred to as a spatiotemporal-pattern (Abeles and Gerstein 1988) or, more simply, a pattern (Ikegaya et al. 2004). NeuroXidence is designed only for the analysis of coordinated spiking of the first type.

*order 2*and (c),

*order 3*)), the order is not necessarily the same as the complexity, which is the total number of neurons participating in the resulting spike pattern.

Methods that were developed to identify the complete or partial structure of neuronal coupling (Martignon et al. 2000; Nakahara and Amari 2002; Okatan et al. 2005; Schneidman et al. 2006) needed large datasets, in which at least the structure of neuronal coupling had to be assumed to be fixed. To the best of our knowledge, currently available methods fail to extract the full coupling structures from massive parallel recordings (activity of up to hundreds of neurons recorded simultaneously) if the activity involves fast changes in the properties of the responses and a high degree of trail-by-trial variability or the amount of data is very limited. NeuroXidence addresses the problems of under-sampling and variability and does not try to derive a model of the coupling structure, but rather it tests whether coordinated firing is random or likely to be induced by, as well as correlated to, information processing or neuronal states.

### 2.2 Detection of joint-spike events

NeuroXidence analyzes coordinated neuronal firing utilizing a novel and computationally efficient algorithm for identifying and counting JSEs that are based on joint-spike activity that is not precisely synchronous. The central feature of this method is the supposition of a timescale *τ*_{c}, which defines the required temporal precision of fine temporal cross-structure that is typically assumed to be in the range between 1–10 ms (Markram et al. 1997; Bi and Poo 1998; Grün et al. 1999; Hopfield and Brody 2000; Aertsen et al. 2001; Hopfield and Brody 2001; Sjostrom et al. 2001). NeuroXidence utilizes *τ*_{c} as the maximum temporal difference between the first and the last spike of a JSE (Fig. 1(d)).

*τ*

_{c}in this way, NeuroXidence avoids the problems of previous approaches that utilized

*τ*

_{c}for exclusive binning (Grün et al. 2002a; Pipa and Grün 2003). Exclusive binning allows for detecting JSEs that involve random or systematic deviations from precise synchronization, also referred to as jitter. However, exclusive binning does not consider the exact difference between the spike times of the neurons, rather it reduces the precision of the sampling to

*τ*

_{c}(Fig. 2(g)). This poses the problem that spikes, which are less than

*τ*

_{c}apart and should be considered as a JSE, can be detected as non-coincident if they happen to fall across separate bins (Grün et al. 2002a). For that reason, exclusive binning likely does not detect all joint-spike events. This problem is further aggravated by increases in either the complexity of the JSE and the bin size

*τ*

_{c}(Grün et al. 2002a). One can overcome the problems of exclusive binning with the multiple-shift method (Grün et al. 1999), that shifts entire spike trains in relation to one another, for all existing combinations of shifts. However, this method can be used only on datasets with a small number of neurons because the number of combinations of shifts and the resulting computational complexity increase exponentially with the number of neurons.

NeuroXidence is more efficient because it takes advantage of the sparseness of spiking activity to reduce the computational complexity of detecting JSEs. The algorithm also returns an exact solution and allows for variations in the timing of each individual spike that are less than the defined amount of jitter, *τ*_{c}. This is accomplished by the use of binning to represent the spiking data as a time-discrete process, where the bin size, *b*, is considerably shorter than *τ*_{c} (e.g., *b* <= 1 ms and *τ*_{c} = 5 ms). Thus, this method preserves the precision of the spike times in the original recording when detecting JSEs (Fig. 1(d)).

There are three key steps in the detection of JSEs: the pre-processing of spike trains (see Appendix 1: Pre-processing and Fig. 12, Appendix 1), the detection of jittered JSEs, and the identification of JS-patterns and their complexity. The pre-processing step ensures that the minimal inter-spike interval of any unit is larger than *τ*_{c}, corresponding to the kernel G^{+/−} defined in units of the bin size, *b*. This step prevents temporally overlapping JSEs, while preserving each individual JSE and its exact frequency of occurrence. In the second step, NeuroXidence detects jittered JSEs by extending the temporal extent of each individual spike to the number of bins defined by G^{+/−}. For that purpose, each spike and its neighborhood are replaced by a kernel, which contains G^{+/−} bins of ones (Fig. 2(a–c)). The third step in JSE detection consists of identifying JS-patterns and determining their complexity. This is accomplished by evaluating the local complexity (i.e. the number of synchronous spikes) per time bin (Fig. 2(d)). Each local maximum of the local complexity corresponds to one instance of joint-spiking (Fig. 2(e)). Thus, zones of maximal local complexity are used to derive the corresponding JSE within an identified zone of joint-spiking (Fig. 2(f)).

The determination of the total frequency of occurrences of one particular JS-pattern in the dataset involves the sum of two pattern frequencies. One is the frequency of JSEs that are identical to the JS-pattern of interest. The other is the frequency of all supra-patterns in which the JS-pattern is included as a sub-pattern.

### 2.3 Definition of the hypothesis test

To test whether JSEs of one type of JS-pattern occurs more or less often than expected by chance, a statistical hypothesis test is used. The null hypothesis (H0) states that JS-patterns occur by chance. Therefore, H0 assumes that the precise timing of neuronal firing, on a timescale that is defined by the precision of JS-pattern *τ*_{c}, is mutually independent across neurons (Fig. 1(a)). The alternative hypothesis (H1) assumes that there are more (or less) coordinated firing events than expected by chance and thus, that the spiking events are dependent. Consequently, the observed frequency of JSEs has to be compared to the expected frequency of JSEs in the case where the neurons are assumed to be independent. Therefore, the core of the statistical hypothesis test is the estimate of the expected frequency of JSEs.

This estimation can be based on either a parametric or a non-parametric approach. Parametric approaches hypothesize an explicit model of the data, while non-parametric methods do not pose such assumptions. Since, violations of the assumptions of the model often lead to false rejection of H0, a non-parametric method might provide considerable advantages in detecting JSEs. An example of a parametric method that has been discussed as yielding increased false positive levels is the unitary events analysis (Grün et al. 2002a, b), which hypothesizes that spike trains are stationary Bernoulli processes. Such a process is binary and discrete, with constant probability of spike generation across time as well as with no history dependency. Therefore, several factors might induce false positives: (1) low firing rates (Roy et al. 2000), (2) fast rate modulations (Grün et al. 2002b), (3) auto-structure in the spike trains (Baker and Lemon 2000; Pipa et al., submitted), and (4) trial-by-trail variability in firing rates and in latencies of responses (Brody 1999; Baker and Gerstein 2001; Grün et al. 2003).

### 2.4 Generation of surrogate data

The non-parametric approach of NeuroXidence estimates the chance frequency of JSEs (H0) based on a surrogate dataset, a technique that has been widely used (Schreiber and Schmitz 2000; Baker and Gerstein 2001). The art of developing a non-parametric method is to produce surrogate data that differs from the original dataset in only one property. This property defines the alternative hypothesis, H1. Hence, the ideal surrogate dataset is derived from the original dataset, such that it maintains all of the original properties of the data but does not contain any coordinated firing. The surrogate dataset should preserve the auto-structure of each individual spike train, trial-by-trial variability, rate modulations, and all history dependencies.

*τ*

_{c}, has been already described and corresponds to fine temporal cross-structures of interest for JSEs. This range is typically between 1–10 ms (Markram et al. 1997; Bi and Poo 1998; Grün et al. 1999; Hopfield and Brody 2000; Aertsen et al. 2001; Hopfield and Brody 2001; Sjostrom et al. 2001). The second timescale,

*τ*

_{r}, corresponds to the scale of the rate coding and is defined to be distinguishably slower than

*τ*

_{c}; therefore, a factor,

*η*, is introduced, which is the ratio between the two timescales:

*η*is chosen to be in the range of 2–5, meaning that

*τ*

_{r}is two to five times slower than

*τ*

_{c}. As we will show, this difference in timescales is sufficient to test reliably for the presence of fine temporal cross-structure defined by JSEs, even in data in which the auto-structure of the spike trains changes on the timescale of

*τ*

_{c}.

The surrogate dataset is derived from the original spike trains by jittering spikes by an amount given by the slow timescale, *τ*_{r} (Fig. 1(e and f)). In contrast to random jittering, also referred to as ‘dithering,’ of each individual spike (Fig. 1(f)), which destroys the fine temporal auto-structure, NeuroXidence jitters each spike of the same spike train by the same amount (Fig. 1(e)). Thus, the jittering is equivalent to random shifts of an entire spike train, which destroys fine temporal cross-structure on timescales faster than *τ*_{r}, but preserves auto-structures on all timescales. Hence, random shifting of entire spike trains is an optimal strategy to produce surrogate data, and it minimizes the chances of generating inappropriate surrogates.

### 2.5 Implementation of the statistical test

*k,*and trial,

*t*, we compute the difference, Δ

*f*

_{t}

^{k}, between the two frequencies:

*f*

_{t}

^{k}(org) ∈

**N**_{0}, for trial

*t*= 1...

*T*and JS-pattern

*k*= 1...

*K*. Thus, the variability of Δ

*f*

_{t}

^{k}is assessed on a trial-by-trial basis.

*S*). Each surrogate sample is based on random and independent shifts to derive the average chance frequency of JSEs per trial. Thus, for

*S*> 1, the difference Δ

*f*

_{t}

^{k}becomes the average difference between the original data and the

*S*surrogates \( \overline{\Delta } f^{k}_{t} \):

*k*, in the original dataset. Similarly, a negative value of \( \overline{\Delta } f^{k}_{t} \) indicates a deficiency of JSEs in the original dataset. The differences \( \overline{\Delta } f^{k}_{t} \) for one particular JS-pattern

*k*computed for all available trials form the set \( \overline{\Delta } F^{k} \):

### 2.6 Rare events versus significant events

The estimation of the statistical significance utilized by NeuroXidence is derived from the trial-based evaluation of \( \overline{\Delta } f^{k}_{t} \ne 0 \). Therefore, NeuroXidence requires that a significant increase or decrease of the frequency of JSEs has to be statistically consistent across trials. This dissociates NeuroXidence from other methods that consider only the total number of JSEs across trials and do not take trial-by-trial variability into consideration (Abeles and Gerstein 1988; Grün et al. 1999, 2002a, b; Ikegaya et al. 2004). To test if \( \ifmmode\expandafter\bar\else\expandafter\=\fi{\overline\Delta }f^{k}_{t} \) is a significant increase or decrease of the frequency of JSEs, NeuroXidence allows testing whether either the mean (*t* test) or median (Wilcoxon rank test) of \( \ifmmode\expandafter\bar\else\expandafter\=\fi{\overline\Delta }f^{k}_{t} \) is different from zero. The Wilcoxon test is, in many cases, a much better choice, especially when spike rates are low and complexities of JSEs are greater than two (see Appendix 2).

### 2.7 Parameters of the NeuroXidence significance estimation

We next investigate the properties of the parameters *η* and *S* utilized by NeuroXidence to estimate the statistical significance of JSEs. To this end, we discuss how both affect the false-positive rate and test-power of the method.

### The scaling factor *η*

As mentioned, NeuroXidence differentiates between two timescales at which neural processes operate. The first, *τ*_{c}, defines the timescale of JSEs. The second timescale, *τ*_{r}, is slower and is relevant for the other processes considered by the significance evaluation. The ratio between the two timescales is given by *η* = *τ*_{r}/*τ*_{c} (Eq. (1)). The selection of *η* is based on a trade-off between the information maintained in the surrogate data and the information destroyed by jittering the spike trains. On the one hand, *τ*_{r} should be as fast as possible to account for rapid changes in the cross-structure, such as co-variations in firing rates (Brody 1999; Baker and Gerstein 2001). On the other hand, a large value of *τ*_{r} (i.e. large *η*) ensures that jittering destroys most of the coordinated firing in the surrogate data, which is necessary for a high test-power (probability to reject H0 if H1 is true). The probability that jittering destroys JSEs increases exponentially with the complexity of JSEs, since the number of existing, different jitter configurations is exponentially growing with the complexity of the JS-pattern (see Electronic Supplementary Fig. 2A). Therefore, large *η* improves mainly the test-power for JS-patterns, whose complexities do not exceed two or three. In conclusion, *η* should be chosen to be in the range of 2–5, which provides a good balance between the test-power and the dynamics of the cross-structure.

### Number of surrogates *S*

*S*> 1 random surrogates per trial improves the estimation of the frequency,

*f*

_{t}

^{k}(sur), of JSEs by using the average frequency, \( \overline{f } ^{\,k}_{t} {\left( {sur} \right)} \), for each particular JS-pattern,

*k*. This improves the estimate of the expected number of JSEs under H0 and increases the reliability of the estimated

*p*value. To understand how multiple surrogates influence the significance estimation in detail, we next derive the impact of

*S*> 1 on the probability distribution of the average difference, \( \overline{\Delta } f = f_{{\operatorname{org} }} - \overline{f} _{{\operatorname{sur} }} \), between the frequency of JSEs in the original dataset,

*f*

_{org}, and the average frequency in the surrogate dataset, \( \overline{f} _{{\operatorname{sur} }} \). The distribution of the average difference, \( \overline{\Delta } f \), is given by a convolution of the distributions \( p_{\,{\overline{f} _{{sur}} }} \) and \( p_{{f_{{org}} }} \):

Since \( p_{{f_{{\operatorname{org} }} }} \) is not affected by different numbers of surrogates, we focus solely on changes of \( p_{\,{\overline{f} _{{sur}} }} \) and assume that events are rare such that the distribution of \( p_{\,{\overline{f} _{{sur}} }} \) is expected to be skewed, with a median smaller than the mean. However, with increasing numbers of surrogates, the distribution of the average frequency, \( \overline{f} _{{\operatorname{sur} }} \), becomes more normally distributed than \( f_{{\operatorname{sur} }} \) because of the central limit theorem. Therefore, the median of the distribution \( p_{\,{\overline{f} _{{sur}} }} \) becomes larger with increasing *S* and approaches the value of the mean.

To illustrate the impact of this mapping on the significance estimation, we assume the null hypothesis, H0, and define that both \( \overline{f} _{{\operatorname{sur} }} \) and *f*_{org} are sampled from the same distribution \( {\left( {{\text{H}}0:p_{{f_{{sur}} }} \equiv p_{{f_{{org}} }} } \right)} \). In this case, the median and the mean of \( p_{{f_{{sur}} }} \) and \( p_{{f_{{org}} }} \) are, by definition, identical. However, as described above, multiple surrogates, first, make the distribution \(p\,_{{\overline{f} _{{sur}} }} \) more normal than \( p_{{f_{{org}} }} \) and, second, make the median of \( p\,_{{\overline{f} _{{sur}} }} \) larger than the median of \( p_{{f_{{org}} }} \). Hence, it follows that in the case of H0, multiple surrogates, i.e., *S* > 1, bias the median of \( \overline{\Delta } f \) (see Eq. (5)) towards negative values (see Electronic Supplementary Fig. 1A). This makes the reliance of NeuroXidence on a median test more conservative when testing for an excess of JSEs and more liberal when testing for a deficiency of JSEs (i.e. increased numbers of false-positive events). Therefore, it is recommended that one use larger values of *S* only for testing for an excess of JSEs, while one needs to be cautious when testing for a deficiency of JSEs. In the latter case, *S* = 1 is a more proper choice to minimize the risk of type-1 errors.

### 2.8 False-positive rates

False-positive events are defined as false rejections of the null hypothesis (H0). If a statistical hypothesis test is applied to a recording of large sets of neurons, one has to differentiate between two different scenarios leading to different definitions of false error rates.

### Scenario one

The test is applied to only one JS-pattern of interest. In this case, the rate of false positives is defined by the percentage of *p* values lower than the test-level (e.g., *p* < 0.05). This percentage of false positives is referred to as the *individual-false-positive-rate*, since it is defined for a single JS-pattern. The inclusion of more than one JS-pattern requires a multiple-comparison correction (e.g. Bonferroni).

### Scenario two

The test is applied only to those JS-patterns that have occurred in the dataset at least once. In this case, the grand-average false error rate is referred to as the *actual-false-positive-rate*. A test that ensures a conservative actual-false-positive-rate has to ensure that the average false-positive-rates of all occurring JS-patterns are below the test-level. This implies that the statistical significance must be independent of the occurrence of the JS-patterns, which is achieved by evaluating the significance on the basis of multiple trials. The consideration of the trial-by-trial variability of the frequencies of occurrences prevents JS-patterns, which are by chance occurring in a few or even only one trial, from being considered to be significant. Thus, significant JS-patterns are required to occur reliably across trials.

In the context of simultaneous recordings of large sets of neurons, only scenario two is of practical usage, since, for scenario one, a multiple-comparison correction has to be performed (e.g. Bonferroni). The latter corrects the test-level for the number of JS-patterns that were tested (*K*). Since the number of JS-patterns can be very large, the Bonferroni correction can lead to a corrected test-level that is extremely small (e.g. for 20 neurons *K* > 10^{6}, and a 1% level leads to a corrected level of <10^{−6}), reducing thus considerably the statistical power.

## 3 Results

In the following sections we report the statistical and computational properties of NeuroXidence. To this end we utilize simulated data as well as an example dataset consisting of 42 single units simultaneously recorded from cat visual area 17 of an anaesthetized cat.

### 3.1 False positives for stationary processes

*T*), a mean spike rate of 15 ap/s (

*r*), 20 surrogates samples (

*S*), and

*η*equal to 3. From this standard set, 15 different combinations of parameters were derived by varying the number of trials (

*T*= 20, 50, 100, 200), the number of surrogates (

*S*= 1, 50, 250), the mean spiking rate (

*r*= 7, 10, 30, 60, 90 ap/s), and the scaling factor (

*η*= 2, 5, 7). NeuroXidence was applied to each simulated dataset using a sliding window with durations of

*l*= 200, 400, and 800 ms. In total, 48 different simulated-data models were used to derive the individual-false-positive-rate for 4 JS-patterns of complexities 2–5. None of the individual-false-positive-rates for any of the parameter sets was above chance-level, for either test-level of 5 or 1% (Fig. 3), demonstrating that NeuroXidence is a conservative statistical hypothesis test. Furthermore, NeuroXidence becomes more conservative with increasing complexity.

### 3.2 Test-power for stationary processes

*η*was 3. Each of the remaining 15 sets was derived from the first set by variations of either the sliding window length (

*l*= 200, 400, 800 ms, Fig. 4(a, rows

*1*–

*4*)), the number of trials (

*T*= 20, 50, 100, 200, Fig. 4(b, rows

*1*–

*4*)), the background spike rate (

*r*= 10, 15, 30, 60, 90 ap/s, Fig. 4(c, rows

*1*–

*4*)), or the number of surrogate datasets (

*S*= 1, 20, 50, 250, Fig. 4(d, rows

*1*–

*4*)).

Based on these 16 simulated datasets, the test-power was derived for four JS-patterns of complexity 2–5. The test-level used by NeuroXidence was 1%. Increasing the length of the analysis window, as well as increasing the number of trials, led to increased test-power. Given the same frequency of excess JSEs, the test-power tended to increase along with JSE complexities. This was expected since the probability of a chance occurrence of a JSE exponen-tially decreased with its complexity. The number of surrogates used for the estimation of the average frequency \( \overline{\Delta } f^{k}_{t} \) changed the test-power only very subtly in compari-son to the changes observed when varying the window length, the number of trials, and the spike rate. In particular, the differences in test-power for *S* > 20 were negligible, indicating that going beyond 20 surrogates does not substantially improve the reliability of the estimates (Fig. 4(d, rows *1*–*4*)).

### 3.3 Sub-patterns and supra-patterns of induced JS-patterns

Any JS-pattern of complexity higher than 2 includes sub-patterns that are expected to be detected at least as frequently as the mother-pattern. On the other hand, the test-power of each of these sub-patterns is expected to be lower than the test-power of the mother-pattern, since the lower complexity leads to a higher chance frequency under H0. Thus, when sub-patterns are only induced by a single mother-process and not by additional correlations with orders equal to or smaller than the complexity of the sub-patterns, the test-power is expected to decrease with the decreasing complexity of the sub-pattern.

Supra-patterns are composed of the mother-pattern itself and spikes from additional neurons, which by chance coincide with the mother-pattern. Therefore, supra-patterns are of higher complexity than the induced mother-pattern. The maximal frequency of a supra-pattern is bound by the maximal frequency of any sub-pattern, including the induced mother-pattern. Furthermore, since the additional spikes are coinciding by chance, the frequency of the supra-pattern occurring is expected to be smaller than that for the mother-pattern. Thus, as long the excess frequency of the mother-pattern is not so high that the supra-pattern still occurs reliably across trials, the test-power of the supra-pattern is expected to be substantially reduced in relation to the mother-pattern.

### 3.4 False positives and test-power for a non-stationary process

Spike trains during periods 1–13 were independent, which implied that H0 should not be rejected more often than the test-level (5%), if the actual-false-positive-rate were conservative. The percentage of false rejections is clearly below 5% for all complexities and throughout all sliding windows during periods 1–13. Therefore, NeuroXidence is a hypothesis test with a conservative actual-false-positive-rate that is effected neither by low rates, rate modulation, latency variability, and cross-trial rate changes nor by the different model processes (γ-process, Poisson) used to generate the spike train. High test-power during periods 14 and 15, which were characterized by increased frequencies of JSEs based on multiple-interaction process correlations with *ζ* = 0.12 (period 14) and *ζ* = 0.3 (period 15), demonstrates the applicability of NeuroXidence for the detection of excesses of JSEs.

Unlike NeuroXidence, the shuffle-corrected cross-correlogram (bin width = 20 ms, data segments = 800 ms) indicated the existence of cross-structure due to its modulation in periods 10 and 12 (Fig. 7(a)). The reasons for these modulations are the shuffle-correction and latency co-variations across neurons that characterize periods 10 and 12 (Brody 1999; Baker and Gerstein 2001). This indicates that modulation in the shuffle-corrected cross-correlogram does not necessarily indicate an excess of fine temporal cross-structure.

We also applied the UE-method (see Appendix 3) to the same dataset (Fig. 7(b–d)). Since the method assumes that data in every sliding window can be described by a stationary Bernoulli processes, the sliding window was chosen to be rather short (100 ms) compared to the 800 ms-long NeuroXidence window. This window length of 100 ms justifies the assumption of a stationary Bernoulli process for a slower modulation of the rate profiles. Therefore, 100 ms is a good trade-off between the danger of false-positive events induced by non-stationarity and the reduced test-power due to the small amount of data contained in each window.

For the UE-method, we derived the actual-false-positive-rate per JS-pattern complexity, which ranged from 2–6 during periods 1–13. As with NeuroXidence, the actual-false-positive-rate was defined as the rate of false rejections of H0, normalized by the number of occurred JS-patterns. As expected, fast changing rates were nearly fully compensated by the short sliding window (100 ms) for complexity 2 JSEs (Fig. 7(b)), while high-complexity JS-patterns led to an increased actual-false-positive-rate during periods with fast changing rates (periods 7–12; Fig. 7(d)). The second source for increased actual-false-positive-rates was low spike rates. During periods 5 and 6 and all parts of other periods in which the rate was low (5 ap/s), the actual-false-positive-rate increased dramatically for complexity 2 JS-patterns and up to 100% for complexity 3 patterns (Fig. 7(c)). In contrast, the actual-false-positive-rate for complexity 6 JSEs was close to zero because their existence was not detected due to the usage of exclusive binning. The reason for the dramatic increase in the actual-false-positive-rate is that the UE-method does not consider any trial-based variation in the frequency of occurrences. Instead, it tests if the total number of a particular JSE across all trials is significantly different from the expected total number, which is based on the assumption that the expected number is Poisson distributed. Since the significance estimation of the UE-method is only based on the variability of the expected total number and not on the variability of frequency of occurrences across trials, a few or even only one JSE might be evaluated as a significant excess if the expected total number is low enough. The latter is the case if spike rates are low or the complexity is large.

### 3.5 Test-power and false-positives rate changes co-occurring with JS activity

*σ*

_{t}= 200 ms, peak rate 70 ap/s), while period 4 contained a rate modulation with a steeper rising than falling slope (composed with two Gaussian waveforms, rising

*σ*

_{t}= 100 ms and falling

*σ*

_{t}= 300 ms, peak rate 70 ap/s). In addition, periods 2, 3, and 4 contained short epochs (Fig. 8(a)) with coordinated firing modeled by an MIP process (Kuhn et al. 2003). NeuroXidence was applied to the data with a sliding-window length of 100 ms. We evaluated the statistical significance for each occurring JS-pattern with complexities 2–6, derived the frequency of significant JS-patterns (Fig. 8(c)), and computed the percentage of significant JS-patterns per complexity (Fig. 8(d)). During periods with rate modulation, but lacking fine temporal structure, the percentage of false rejections of H0 was below the test level of 5% for all complexities. In contrast, periods that contained coordinated firing lead to an increased frequency of significant JS-patterns (Fig. 8(d)). However, in the case of an MIP process, since the rate of JSEs increases for an increasing spike rate, the test power itself was also modulated. Hence, the precise relation of the test-power and spike rate depends on the chosen model of coordinated firing. Irrespective of this precise relation, the simulated data demonstrates that NeuroXidence is capable of detecting periods of coordinated firing even in the presence of rate modulation of the same neurons that are firing synchronously.

### 3.6 Test-power of NeuroXidence for oscillatory processes

*r*

_{i}(

*t*), that is coherently-modulated across neurons by a sinusoidal oscillation with frequency

*v*:

*A*= 10 ap/s) to model spike trains by inhomogeneous Poisson processes and estimated numerically the test-power of complexity 2 JS-patterns. NeuroXidence, based on 50 trials of 800 ms-long analysis windows, was applied to 100 independent realizations of the data. The oscillation frequency ranged between 1 and 80 Hz, while the jitter-kernel width was kept fixed [

*τ*

_{r}= 21 ms in Fig. 9(a) and

*τ*

_{r}= 11 ms in Fig. 9(b)]. For oscillation frequencies below 6 Hz in the case of

*τ*

_{r}= 21 ms and 9 Hz for

*τ*

_{r}= 11 ms, the percentage of significant events stayed below the test-level of 5%. With increasing oscillation frequencies, the test-power increased and reached up to 100% at 50 Hz for

*τ*

_{r}= 21 ms and at 80 Hz for

*τ*

_{r}= 11 ms.

In addition to this numerical estimation of the NeuroXidence test-power, we used an analytical description of the expected difference, Δ*f*, between the frequency of JSEs in the original, *f*(org), and the jittered, *f*(sur), datasets to approximate the theoretically predicted test-power *A*′(*v*)^{c} (see Appendix 4). The agreement between the theoretical prediction of the test-power, in percent given by 100 *A*′(*v*)^{c}, and the numerical estimation based on simulated data, both for *c* = 2, indicates that the modulation of the test-power is well described by the analytical solution (Fig. 9). Therefore, *A*′(*v*)^{c} can be used to discuss the impact of coherent oscillations on the probability of rejecting H0.

### 3.7 Application to real data

We applied NeuroXidence to neuronal activity of 48 single units recorded in area 17 of an anesthetized cat in response to a drifting sinusoidal grating. Gratings were presented for 3.5 s and 20 trials with movement in directions of 30° and 120°. The direction of 30° was the optimal direction for eliciting strong responses in most of the cells, while 120° was chosen to be perpendicular (see Appendix 5 for details on experimental methods).

*τ*

_{c}= 5 ms,

*τ*

_{r}= 20 ms,

*S*= 50, bin size = 1 ms, Wilcoxon test). Up to 14,769 different and individual JS-patterns per sliding window (length 200 ms, window shift 100 ms) were detected (Fig. 11(a,

*2*and b,

*2*)). JS-patterns occurred significantly more often than expected by chance only during the period of stimulation, and their complexities ranged between 2 and 8. No significant patterns were found of complexity 9 or higher (Fig. 11(a,

*3*and

*b*, 3)).

Interestingly, the time-course of the significant JS-patterns partially followed the temporal rhythm of the stimulation. The number of patterns recurred with a period of 820 ms, which corresponded with the timing of the stripes of the grating crossing the receptive fields (Fig. 11(a, *2* and *3*, b, *2* and *3*), compare to PSTHs in Electronic Supplementary Fig. 3). Also remarkable is the steep increase and decrease at the beginning and end of the period of stimulation. Notably, both the number of JS-patterns as well as the number of significant JS-patterns indicate stimulus specificity of joint-spike activity. This confirms previous findings restricted to pairwise analysis (Samonds et al. 2003). To express this in numbers, we averaged the number of patterns per complexity between 0.1 and 3.5 s after stimulus onset for each condition. Then, we computed the difference, for each individual complexity, between the mean across times for both movement directions and normalized the difference to the average for movement at 30°. The mean number of JS-patterns for 30° movement exceeds that for 120° by an average of 4.16 and by a maximum of 5.43, while the corresponding numbers for significant JS-patterns are 1.71 and 3.03. Compared to the modulation of the rate response for the same period, which amounts to 0.18, the average modulation of 1.73 of the frequency of significant JS-patterns indicates that joint-spike activity has higher discriminative power than rate.

To validate the results given by the number of JS-patterns that exhibited a significant increase of JSEs, we also applied NeuroXidence to a dataset derived by jittering the original dataset. Thus, we destroyed any coordinated-firing by jittering spikes with ±10 ms (Fig. 11(a, *4* and b, *4*)). This jittering of the data is the same as that used by NeuroXidence to derive the difference between the real data and H0 (*j* = ±10 ms, *η* = 4). Hence, the application of NeuroXidence on the jittered dataset allows the derivation of the chance frequency of significant JS-patterns (H0 against H0). The number of significant JS-patterns that occurred in the jittered dataset amounted to 0.2% of the number observed in the original dataset. Thus, the error rate was much smaller than the nominal alpha value of 0.01, indicating that the test is conservative. The positive side is that 99.8% of the significant JS-patterns could not be explained by chance (Fig. 11(a, *3* and b, *3*)).

## 4 Discussion

The distinction between rate codes and temporal codes in the cerebral cortex has been the origin of heated discussions over recent years (von der Malsburg 1981; Aertsen et al. 1989; Gerstein et al. 1989; Shadlen and Newsome 1994; Riehle et al. 1997; Fetz 1997; Brody 1999; Baker and Lemon 2000; Aertsen et al. 2001; Grün et al. 2003; Ikegaya et al. 2004). The rate hypothesis assumes that information is encoded in the discharge frequency of individual cells, while the temporal code assumes that information is encoded in the precise timing relations between individual discharges of distributed neurons. The latter can be tested by a statistical hypothesis test that checks whether fine temporal cross-structure represented by joint-spike events (JSE) exists and is beyond chance level. Therefore, the statistical hypothesis test is crucially dependent on the statistical formulation of the chance level of JSEs under the null hypothesis (H0: rate-hypothesis). To distinguish the temporal from the rate code hypothesis, H0 has to consider all features of the data that might be causally related to the rate hypothesis. Only then is a rejection of H0 interpretable in favor of the temporal coding hypothesis and provides evidence for cooperative coding based on joint-spike activity.

We next outline the properties of the NeuroXidence method in detail, starting with statistical testing and the basic idea of timescale separation for NeuroXidence. Then we describe the distinction between rare, spurious, and reliably reoccurring events, the choice of appropriate window lengths, the impact of oscillatory and coherent rate modulations, and the differences between NeuroXidence and other methods that are used for detecting coordinated firing. Last, we discuss results obtained form real data recorded from area 17 in an anesthetized cat.

### 4.1 Statistical testing and timescale separation of NeuroXidence

NeuroXidence implements a statistical hypothesis test for the analysis of spiking activity utilizing a large number of neurons, which allows for a dissociation of the temporal-code and the rate-code hypotheses. In contrast to earlier methods (Abeles and Gerstein 1988; Grün et al. 1999, 2002a), NeuroXidence is a non-parametric statistical hypothesis test and does not assume any model about the spike-train generating process. The estimation of the statistical significance of JS-patterns is based on a comparison of the frequency of JSEs between the recorded dataset and one or more surrogate datasets, which are derived from the recorded dataset but lack any coordinated-firing. Random shifting of spike trains was utilized to destroy fine temporal cross-structure while keeping the complete auto-structure in the surrogate dataset intact. Therefore, NeuroXidence estimates the chance frequency of JSEs under H0 by allowing for the consideration of rate modulations or for history dependencies, which might exist in each individual spike train.

To this end, NeuroXidence defines two timescales, *τ*_{r} and *τ*_{c}. The fast timescale, *τ*_{c}, corresponds to fine temporal joint-activity across neurons and is believed to be in the millisecond range. The second timescale, *τ*_{r}, is a lower bound of rate modulations that are coherent across neurons and that should be considered as rate, not as fine temporal joint-activity. The scale between *τ*_{r} and *τ*_{c} is given by *η*. As we have shown, small amounts of shifting with *η* = 3 lead to the considerable test-power of the NeuroXidence method. Thus, even in cases of very small differences between the timescales *τ*_{r} and *τ*_{c} (e.g. *τ*_{c} = 15 ms for *τ*_{r} = 5 ms and *η* = 3), NeuroXidence can detect a significant excess or deficiency of JSEs and, therefore, allows for a clear distinction between the rate- and the temporal-code hypotheses.

In conclusion, the concept of timescale separation, the reliability and sensitivity of NeuroXidence, and the high temporal precision for detecting periods of coordinated firing in short sliding windows all make NeuroXidence ideally suited for the analysis of neuronal activity that might exhibit both rate changes and coordinated firing.

### 4.2 Synchronous events versus spatiotemporal patterns

NeuroXidence is optimized to detect synchronous firings. However, it can also be applied to the analysis of spatiotemporal patterns (Abeles and Gerstein 1988), each consisting of one spike per individual neuron. This requires mapping each spatiotemporal pattern onto a corresponding synchronous event by shifting each spike train by its corresponding temporal delay. Since this mapping is required for each individual spatiotemporal pattern, the computational complexity increases linearly with the number of spatiotemporal patterns of interest.

### 4.3 Complexity of pattern versus order of coupling

NeuroXidence cannot investigate the nature or the cause of observed JS-patterns. Hence, higher-complexity JS-patterns detected by NeuroXidence can be either due to pairwise coupling or to higher-order interactions of more than pairs. To infer to what degree pairwise interactions are sufficient, Schneidman et al. (2006) introduced an analysis based on the maximum entropy concept and applied it to data recorded from the retina. They concluded that most of the observed patterns are explained by pairwise couplings between the neurons (Schneidman et al. 2006). However, this method assumes for the independence assumption that the recorded spike trains can be described by stationary Bernoulli processes. Both this assumption and the assumption of the independence of firing of a given neuron over time, are likely to be false (Brody 1998; Baker and Gerstein 2001).

The importance of the auto-structure on the joint probability of sets of neurons firing coincidently is highlighted by data presented in this paper. We have shown that the number of individual patterns existing in sets of spike trains can be different by a factor of 10, given the same rate but different regularities of the spiking process (Fig. 6, periods 4 and 5). Therefore, any method based on the assumption of independent Bernoulli processes, such as the maximum entropy method used in Schneidan et al. (2006), is likely to fail since the estimated probability of a JS-pattern does not account for auto-structure such as oscillations, regularity, or burstiness of the individual spiking activity. In addition, the maximum entropy approach requires much more data than NeuroXidence. The study reported in (Schneidman et al. 2006) was based ∼1 h of recording to estimate interactions among a set of ten neurons. Given the amount of data that can be recorded for behavioral experiments as well as stability and stationarity issues, it appears to be difficult to obtain sufficient amount of data to allow reliable application of these methods in behavioral paradigms. In contrast, as demonstrated by the analysis of real data in this paper, 20 trials, each with 500 ms sliding windows (in total 10 s), are sufficient to estimate the statistical significance of JS-patterns involving 9 of the 42 recorded cells.

### 4.4 Induced versus evoked JS-patterns

One might want to distinguish two kinds of JS-patterns: evoked and induced. The first kind consists of JS-patterns that are evoked by a stimulus and, therefore, are tightly time-locked to it with a certain delay. JS-patterns of the second kind, reflect the internal dynamics of the system and thus, are not time-locked to the stimulus. NeuroXidence cannot distinguish between evoked and induced JS-patterns as long as their temporal precision is of the same order. However, since the required precision of JS-patterns, *τ*_{c}, is usually on the order of a few milliseconds, this distinction is of a rather theoretical nature. Evoked JS-patterns would require, first, that changes of the stimulus are on the same order as or faster than *τ*_{c}, and, second, that the temporal precision is maintained over the whole period from stimulus onset until the occurrence of JS-patterns. Hence, the distinction is of practical use only for a very few stimuli such as flickered or flashed visual stimuli and then only shortly after stimulus onset. In those few cases, trial shuffling, rather than the timescale separation, can be used as an alternative for generating surrogate data to distinguish between induced and evoked JS-patterns. However, since trial shuffling destroys the correspondence of simultaneously recorded trials, it also destroys rate co-variations in the range of the temporal precision of stimulus locking. As discussed by Brody and others (Brody 1999; Grün et al. 2003; Pipa and Grün 2003), this might lead to false positives. Therefore, trial shuffling should be used as an alternative for the concept of timescale separation only if the changes of the stimulus are of the same order as *τ*_{c}. If this is not the case, it can be inferred that JS-patterns detected by NeuroXidence are induced rather than evoked.

### 4.5 Rare and spurious events versus reliably reoccurring events

NeuroXidence differentiates between rare and spurious events and reoccuring events. Only if events are reliably reoccurring across trials might they be classified as JS-patterns with a significant excess or diminution of JSEs. This allows NeuroXidence to estimate the statistical significance reliably even for JS-patterns of complexities higher than 2, which are expected to be rare. In contrast, other methods, which are based only on the total frequency across trials or the frequency in a single trial, might reject H0 if the expected frequency is so low that a few occurrences or even a single JSE is sufficient to be rated as a significant event. This is only likely to be the case for higher complexity JSEs, for which the expectancy is exponentially decreasing. Therefore, methods based on single trials or on pooled data from all trials are likely to fail to differentiate between rare and spurious events, on one side, and events that are reliable and consistent, on the other side.

### 4.6 Analysis window length

NeuroXidence can be applied to short time windows to track transient neuronal states. In contrast to other methods, the NeuroXidence window length is not restricted by any assumption on the stationarity of data, since the full auto-structure including any rate modulation is considered by the statistical hypothesis test. Thus, the length of the NeuroXidence analysis window can be freely chosen by the operator to match the assumed duration of neuronal states of interest.

### 4.7 Oscillatory processes

Since oscillatory processes are frequently observed in neuronal data, we studied the impact of oscillatory and, across neurons, coherent rate modulation by using simulated data based on inhomogeneous Poisson processes, whose rates were modulated sinusoidally. Based on this simulated data, we estimated test-power numerically and also derived an approximate analytical solution of the test-power (see Appendix 4). Both the analytical approximation and the numerical estimation were in agreement and demonstrate that the test-power increases with increasing frequency of the coherent rate modulations. This behavior is expected since NeuroXidence discriminates between fast processes that are defined on a timescale *τ*_{c}, corresponding to the precision of JSEs, and processes that are slower than a timescale *τ*_{r}. Hence, if coherent rate modulations occur on a timescale that is comparable with *τ*_{c}, NeuroXidence is expected to reject H0 and to support H1. This illustrates that the definition of coordinated firing and fine temporal cross-structure includes explicitly fast and, across neurons, coherent changes of the instantaneous spike rate, which occur on a timescale comparable to *τ*_{c}.

### 4.8 NeuroXidence compared to other methods

In contrast to other methods like the cross-correlation analysis (Gerstein and Perkel 1969; König 1994) and the joint peri-stimulus time histogram (JPSTH; Aertsen et al. 1989), NeuroXidence allows for the analysis of interactions between larger sets of neurons than just pairs. To compare the statistical properties of NeuroXidence with the UE-method and the shuffle-corrected cross-correlogram, we applied all three methods to the same simulated dataset based on 18 simultaneous Poisson or Gamma processes. The simulated data contained epochs with constant rates between 5 and 40 ap/s, rate modulations between 5 and 45 ap/s, latency variability, and trial-by-trial variability of rate-levels. Only NeuroXidence did not exhibit an increased false-positives rate for any of the epochs of the simulated data, indicating that a consideration of the auto-structure, as well as a trial-based evaluation, is crucial for dissociating the rate and the temporal code hypothesis.

However, if NeuroXidence and the UE-method use analysis windows of equal length, the test-power of the UE-method is higher. This has two reasons. First, the UE-method is a parametric approach and assumes a model, which improves the test-power if the model is correct. The disadvantage is that a model-based approach is only valid as long as the model is describing approximately well the real observed data. Second, NeuroXidence incorporates the trial-by-trial variability of the frequency of occurrence of JSEs. Therefore, it requires that the frequency of JSEs across all trials is reliable and, for most of the trials, is higher in the real than in the surrogate dataset. In contrast, the UE-method considers only the total frequency of JSEs across trials. If the expected frequency is very low, a few occurrences or even one single JSE across all trials might lead to a rejection of H0. This increases the sensitivity, as reflected by the higher test-power, but on the other hand, this also increases the probability of false rejections in the case of rare events.

### 4.9 Significant joint-spike activity in real data

We applied NeuroXidence to real data recorded in visual area 17 in an anesthetized cat. Our results demonstrate that JSEs, with a precision of 5 ms, occurred significantly more often than expected by chance for complexity 2–8 JS-patterns. We investigated the strength of neuronal synchrony across two stimulation conditions and compared the measures obtained by NeuroXidence (Fig. 11) to the measures obtained by a classical pairwise method, which employs the computation of cross-correlation histograms (Fig. 10; König 1994; Samonds et al. 2003). For pairwise interactions (complexity 2), both measures produce similar results. Neuronal synchrony is strongest in stimulation conditions in which the orientation of the sinusoidal grating matches the preferred orientation of most of the multi-units included in the analysis (condition 30°; orientation preferences are shown in Electronic Supplemental Fig. 6), while synchrony is weakest under the condition of 120° when the neurons are stimulated least optimally. Results obtained with NeuroXidence indicate that interactions of higher complexities show even stronger task- and stimulus-specific changes in the strength of neuronal synchrony than pairwise interactions (Fig. 11). Thus, the comparison of the relative modulation of rate responses between conditions 30° and 120°, reveals that the JS-activity has the higher discriminative power.

We validated our result by analyzing a jittered version of the original data. This validation was equivalent to a comparison of H0 against H0 and served as the estimation of the chance frequency of JS-patterns with excess JSEs. Since only 0.2% of JS-patterns with excess JSEs remained significant after jittering, we conclude that the increase of the frequency of JS-patterns with an excess of JSEs is highly significant for complexities ranging between 2–8. This is, to our knowledge, the first demonstration of JS-activity and its stimulus specificity that considers the complete auto-structure and trial-by-trial variability of spike trains when calculating JSE frequencies in real data.

These results illustrate that NeuroXidence allows for an evaluation of JS-patterns independent of rate modulations, auto-structure, and trial-by-trial variability. This highlights the strength of NeuroXidence for investigating the complex interactions amoung neurons in terms of fine temporal structure that might coexist with rate modulations of individual neurons.

## 5 Conclusion

In conclusion, we have presented the new analysis tool NeuroXidence (www.NeuroXidence.com) for detecting temporally coordinated firing events in spike trains. We have demonstrated its performance, reliability, and applicability, and compared it to currently used methods. The results obtained in the analysis of real data provide strong evidence that precise temporal relations between the discharges of distributed neurons contain information about the stimulus configurations and, therefore, support the temporal coding hypothesis.

## Acknowledgements

The authors wish to thank to Sonja Grün and Emery Brown for the fruitful discussions on this project. Also, Gordon Pipa would like to thank to his wife Gabriela Pipa and his family for the great support. This study was partially supported by Hertie foundation.