Mapping input noise to escape noise in integrate-and-fire neurons: a level-crossing approach

Noise in spiking neurons is commonly modeled by a noisy input current or by generating output spikes stochastically with a voltage-dependent hazard rate (“escape noise”). While input noise lends itself to modeling biophysical noise processes, the phenomenological escape noise is mathematically more tractable. Using the level-crossing theory for differentiable Gaussian processes, we derive an approximate mapping between colored input noise and escape noise in leaky integrate-and-fire neurons. This mapping requires the first-passage-time (FPT) density of an overdamped Brownian particle driven by colored noise with respect to an arbitrarily moving boundary. Starting from the Wiener–Rice series for the FPT density, we apply the second-order decoupling approximation of Stratonovich to the case of moving boundaries and derive a simplified hazard-rate representation that is local in time and numerically efficient. This simplification requires the calculation of the non-stationary auto-correlation function of the level-crossing process: For exponentially correlated input noise (Ornstein–Uhlenbeck process), we obtain an exact formula for the zero-lag auto-correlation as a function of noise parameters, mean membrane potential and its speed, as well as an exponential approximation of the full auto-correlation function. The theory well predicts the FPT and interspike interval densities as well as the population activities obtained from simulations with colored input noise and time-dependent stimulus or boundary. The agreement with simulations is strongly enhanced across the sub- and suprathreshold firing regime compared to a first-order decoupling approximation that neglects correlations between level crossings. The second-order approximation also improves upon a previously proposed theory in the subthreshold regime. Depending on a simplicity-accuracy trade-off, all considered approximations represent useful mappings from colored input noise to escape noise, enabling progress in the theory of neuronal population dynamics.


Introduction
Neurons in the brain must operate under highly non-stationary conditions.In fact, most behaviorally relevant sensory stimuli as well as internal signals are rarely constant in time but may change rapidly.In the presence of noise, such dynamic stimuli can be reliably encoded in the time-dependent population activity of a large population of spiking neurons [1].The time-dependent population activity also provides a concise, coarse-grained description of the collective dynamics of interacting spiking neurons.Therefore, theories that predict the population activity in response to a time-dependent signal have been of fundamental interest in theoretical neuroscience [2][3][4][5].
The population activity of noisy spiking neurons can be mathematically described by population density equations [6,7].The form of the population density equation depends on the noise model.Two popular ways to model neuronal noise consist of modeling noise either in the input or in the output of the neuron [1].In the first model class (input noise), noise enters the dynamical equations of the membrane potential, currents or conductances leading to stochastic differential equations.If the noise is Gaussian white noise, the subthreshold dynamics becomes a diffusion process and the input noise is also called diffusive noise [3].The corresponding population density equation is a Fokker-Planck equation and the population activity can be obtained as the probability flux across the threshold [8-10, 6, 11, 4].Models based on diffusive noise naturally appear as the result of modeling biophysical processes such as synaptic shot-noise or ion channel noise.In particular, a frequently considered source of noise is background synaptic input modeled as external Poisson processes [12,13].The fluctuating part of this external shot noise leads, via a diffusion approximation [1], to Gaussian white noise driving the synaptic input current or conductance.Besides its biophysical interpretability, input noise has the advantage that it permits modeling both temporal [10,14] and spatial [15] correlations of synaptic inputs and enables mean-field theories for recurrent networks of sparsely-connected integrate-and-fire neurons [9,12].
In the second model class (called output noise or escape noise [3]), the dynamical equations for the state variables are deterministic while spikes ("output") are generated stochastically through a hazard rate or conditional intensity [3,[16][17][18][19][20][21][22][23][24].This hazard rate depends on the state variables via a link function.For example, it may be given as λ (t) = Ψ (u(t), t(t)), where u(t) is the membrane potential and t(t) is the last spike time of the neuron at time t.If the neuron model is a non-homogeneous renewal or quasirenewal [20] process, the corresponding population density equation is a renewal integral equation or, equivalently, a refractory density equation [1,3,20,[25][26][27][28].Although output noise is of phenomenological nature without a quantitative link to biophysical mechanisms, it has several advantages [28] owing to its simpler mathematical tractability: First, the refractory density or integral equation admits an extension to finite numbers of neurons [5,[28][29][30].This extension allows to account for finite-size fluctuations of the population activity at the mesoscopic scale.Second, models with output noise provide analytical expressions for the likelihood function, and thus model parameters can be efficiently fitted to experimental data of single neuron recordings [16-18, 23,31-33].And third, the state space for models with output noise remains approximately one-dimensional even for multi-dimensional conductance-based neuron models [25].The one-dimensional description permits highly efficient numerical solutions, in contrast to Fokker-Planck equations [34], which become intractable and computationally inefficient for several state variables.
In view of the wide use of biologically interpretable input noise and the mathematical advantages of output noise, an intriguing question is whether input noise can be approximately mapped to output noise, so as to take full advantage of both noise models.Mathematically, such a map requires the specification of the hazard rate λ (t) in terms of a link function Ψ , which depends on some dynamical variables and defines the escape-noise model.Unfortunately, a standard method to derive such a link function does not exist.To see this, let us consider the example of nonhomogeneous renewal processes as a popular class of neuron models.In these models, the probability density P(t|t) to fire the next spike at time t given a spike at time t, t < t, does not depend on the state of the model before time t, i.e. the memory of renewal neurons only reaches back to its last spike.An important example of nonhomogeneous renewal models in neuroscience are one-dimensional integrate-andfire neurons driven by white input noise [1].For this model class one can formally construct the hazard rate via the formula λ (t|t) = P(t|t)/ 1 − t t P(s|t) ds [1].However, there are two obstacles: first, in order to apply this formula, the "interspike interval (ISI) density" P(t|t) would be needed in analytical form for arbitrary, time-dependent input currents {I(t )} t ∈(t,t) that occurred since the last spike.However, the calculation of the ISI density for time-dependent inputs is equivalent to a first-passage-time (FPT) problem with timedependent parameters or boundary.The solution of this FPT problem requires the solution of the Fokker-Planck equation with moving absorbing boundary, which is known to be a hard theoretical problem [35][36][37].Second, even if one succeeds to derive an approximate formula for the hazard rate λ (t|t), it is still challenging to represent the hazard rate in the form of a link function Ψ u(t), {z(t)}, t that depends on some voltage-like variable u(t), the last spike time t and possibly further dynamical variables {z(t)} locally in time (as opposed to a "non-local" functional of {u(t ), z(t )} t ∈(t,t) ).
Several theoretical studies have suggested approximate local hazard rates for leaky integrate-and fire (LIF) models driven by white [38,39,25] or exponentially-correlated [26] Gaussian noise, or quasi-static (frozen) noise [40].In this paper, we explore an alternative approach to the hazard rate and the first-passage-time density based on the theory of level crossings [41].In Sec. 2, we introduce the LIF model with time-dependent driving and constant threshold and map this process an equivalent model with constant input and moving barrier.In Sec. 3, we consider the level crossing statistics with respect to this moving barrier and use the Wiener-Rice series and approximations thereof to provide formal expressions for the FPT density.These expressions form the starting point for deriving approximate hazard rates that are local in time.This derivation reveals some unexpected results concerning the correlations of level-crossings of Gaussian processes at small time lags (Sec.3.4).Then, we turn to the LIF model and the problem of mapping input noise to escape noise (Secs.4) and apply this map to predict the time-dependent population activity of LIF neurons with colored input noise (Sec.5).Each of the sections 3, 4 and 5 closes with a comparison of the level-crossing theory with simulations and a previous theory by Chizhov and Graham [26].Detailed derivations are provided in the Appendix.

Leaky integrate-and-fire models and the associated first-passage-time problem
As a spiking neuron model with input noise, we consider a leaky integrate-and-fire model driven by synaptically filtered ("colored") noise [42,1,43].In this model, spikes are emitted whenever the membrane potential V (t) reaches a threshold V T .The subthreshold dynamics for V < V T can be written as where τ m is the membrane time constant and µ(t) = V rest + RI(t) is the mean neuronal drive consisting of a constant resting potential V rest and a time-dependent input current I(t) (R denotes the membrane resistance).Furthermore, η(t) is a colored noise modeled as a one-dimensional Ornstein-Uhlenbeck process with correlation time τ s and variance σ 2 η , and ξ (t) is a zero-mean Gaussian white noise with autocorrelation function ξ (t)ξ (t ) = δ (t −t ).The colored noise captures the effect of various intrinsic and extrinsic noise sources, such as fluctuations of synaptic background activity in vivo (shot noise due to random spike arrival from background neurons).After threshold crossing and spike emission, V (t) is reset to a reset potential V R , V R < V T , and the subthreshold dynamics Eq. ( 1) resumes after an absolute refractory period of length t ref following the reset.
We are seeking a corresponding spiking neuron model with escape-noise [3] given by a hazard rate (conditional intensity) of the form Ψ u(t), u(t), {z i (t)},t − t .Here, u(t) is a membrane-potential variable that obeys the noiseless membrane dynamics of the LIF model between spikes: Furthermore, we allow an explicit dependence on the speed of the membrane potential u(t) (in accordance with previous studies [38,39,25,40]), the time since the last spike t − t, and possibly further auxiliary variables {z i } whose dynamics between spikes is given by ordinary differential equations.Given these variables at time t, a spike is fired independently in the next time step with probability where dt is a small step size.This probabilistic firing rule is the counterpart of the firing rule with a hard threshold in the LIF model with input noise.After a spike, u(t) is reset to V R and the auxiliary variables {z i } are also reset to some suitable fixed reset value.During an absolute refractory period of length t ref , the variables are clamped to their reset values and the hazard rate is set to zero.Because all memory is erased upon resetting, the escape-noise model is a non-homogeneous renewal process .The main goal is to map the model with colored input noise, Eq. ( 1) to the model with escape noise, Eq. ( 2).Strictly speaking, mapping the two models is an ill-posed problem because the model with input noise is a non-renewal process, whereas the escape-noise model is a (non-homogeneous) renewal process.In fact, the temporal correlations of the colored noise in Eq. ( 1) introduces memory that is not erased upon spiking.This memory leads to correlations between interspike intervals (ISIs) [44,42,14].However, if the correlation time τ s of the colored noise is much smaller than the mean interspike interval, these correlations will be small and the model with input noise can be regarded as approximately renewal.In this case, it is sufficient to match the ISI densities of the two models in order to obtain an approximate mapping.Therefore, our goal of mapping the two models can be phrased more modestly as follows: Can we find a link function Ψ of the escape-noise model such that for an arbitrary given stimulus µ(t) the time-dependent ISI densities P(t|t) of the two models approximately match for all t and t < t?We emphasize that this definition of the mapping rests on the assumption of sufficiently small correlation times of the colored input noise.Biologically, this assumption seems to be reasonable given that typical time scales of excitatory and inhibitory postsynaptic currents are often only on the order of a few milliseconds [1].
To derive the link function Ψ that maps input to output noise, one needs to solve a first-passage-time (FPT) problem: As mentioned in the introduction, the hazard rate can be obtained from the ISI density of the model with input noise, Eq. ( 1).In this model, the interspike interval is determined by the "first-passage time" that is needed for the membrane potential to travel from the reset potential to the threshold.Thus, the ISI density P(t|t) is equivalent to the FPT density (apart from a time shift due to the deterministic absolute refractory period).To compute the FPT density, one needs to choose suitable initial conditions for the colored noise η(t).The ISI starting at the last spike time t is composed of the initial absolute refractory period of length t ref and the stochastic FPT t * .We thus need the initial value η(t + t ref ) of the noise at the starting time t + t ref of the stochastic motion.At the firing time t, the distribution of the noise p fire (η, t) is biased towards positive values of η [44,42,45,14], in contrast to the stationary distribution p st (η) of the Ornstein-Uhlenbeck noise, which has zero mean.During the absolute refractory period, the noise distribution relaxes towards the stationary distribution.Even though the noise at time t + t ref may not be fully stationary yet, it is reasonable to assume stationary initial conditions, where η(t + t ref ) ∼ N (0, σ 2 η ) is drawn from a normal distribution with variance σ 2 η .This initial condition is justified because the noise correlation time τ s has been assumed to be much smaller than the mean ISI; hence, we do not expect that the precise shape of the initial noise distribution has a significant effect on the FPT density.
Because in the following we focus on the FPT starting at t +t ref , we will conveniently choose the time origin such that t + t ref = 0. Furthermore, since we are only interested in the first threshold crossing after time t = 0, we can omit the voltage resetting for t > 0 without changing the FPT statistics.The resulting non-resetting process V (t) is the freely evolving solution of Eq. ( 1) without reset and with initial conditions V (0) = V R , η(0) ∼ N (0, σ 2 η ) (Fig. 1a).This nonresetting process will be useful for the level-crossing approach below.
For mathematical convenience, we will now reformulate the FPT problem in terms of a time-homogeneous process x(t) and a moving boundary b(t), so as to eliminate the time-dependent parameter µ(t) in Eq. (1) (Fig. 1b).This is achieved by subtracting the mean non-resetting membrane potential V (t) = u(t): where u(t) is given by Eq. (2a) with initial condition u(0) = V R .Furthermore, setting y = η/τ m , γ = 1/τ m , τ y = τ s and D = τ s σ 2 η /τ 2 m , we find the Langevin equation ẋ = −γx + y (5a) with initial conditions The dynamics of x(t) can be interpreted as an overdamped motion of Brownian particle in a parabolic potential subject to a colored noise y(t) (Ornstein-Uhlenbeck process).Here, D and τ y are intensity and the correlation time of the noise, respectively, and γ is the friction coefficient.As before, ξ (t) is a zero-mean Gaussian white noise.At time t = 0, the random initial condition for the colored noise y corresponds to a stationary Gaussian distribution with mean zero and variance σ 2 y = D/τ y .By construction, the domain of the particle is bounded from above by the time-dependent boundary b(t), where b(0) > 0 and b(t) is a differentiable function of time.The FPT t * is defined as the time when x(t) exits the domain, i.e. when it reaches the boundary, for the first time.The FPT density will be denoted by P(t), i.e.P(t)dt = Prob(t * ∈ [t,t + dt)) for an infinitesimal time interval of length dt.We emphasize again that the FPT density of the Brownian particle x(t) with moving boundary b(t) is the same as the FPT density of the membrane potential V (t) with respect to the constant threshold V T .
Beyond neuroscience, the escape of the doubly low-pass filtered process, Eq. ( 5), from a domain with moving boundary b(t) may serve as a simple archetypal model for nonstationary FPT problems.One prominent example are reaction times of bimolecular chemical reactions [46].If x(t) is interpreted as a reaction coordinate and the domain x < b(t) corresponds to the reactant state, the boundary b(t) can be interpreted as a time-dependent energy barrier that needs to be surpassed to reach the product state.Accordingly, the first-passage time can be interpreted as the reaction time.
3 Level-crossing theory for a moving barrier

Hazard-rate representation of first-passage-time density
To find approximations to the FPT density from approximate hazard rates, we use concepts from renewal theory, especially the notion of hazard rate and survival probability [47].Because the process Eq. ( 5) starts at time 0, the hazard rate λ (t) is defined here as the conditional probability per small time interval dt to find a boundary crossing in the interval (t,t + dt) given the absence of crossings in the interval (0,t).On the other hand, the survival probability S(t) is defined as the probability of an absence of crossings in (0,t).The two definitions imply that S(t +dt) = S(t)(1 − λ (t)dt), hence dS(t)/dt = −λ (t)S(t).Because the survival probability is unity at time t = 0, we thus obtain S(t) = exp − t 0 λ (s) ds for t > 0. The probability to find the first crossing after time 0 in the interval (t,t +dt) is equal to the probability to find a crossing in (t,t + dt) and to have no crossings in (0,t).Hence, the FPT density is given by the product P(t) = λ (t)S(t), or Given the hazard rate λ (t) for t > 0, Eq. ( 7) provides a simple formula for the FPT density.An advantage of this representation is that the exponential factor can be turned into a first-order differential equation, Thus, if the hazard rate λ (t) can be efficiently computed for t > 0, this representation permits an efficient numerical integration of the first-passage-time density forward in time.Therefore, the main strategy in this paper is to derive computationally efficient approximations for the hazard rate.
In general, the calculation of the hazard rate is as difficult as the calculation of the FPT density itself.However, finding approximations for λ (t) has several advantages over direct approximations of P(t).Firstly, as a probability density, P(t) must satisfy the normalization to unity.Thus, the (a) (b) Fig. 1 First-passage time of an integrate-and-fire neuron model and an equivalent model with moving boundary.(a) At time t = 0, different realizations of the non-resetting membrane potential V (t) (colored thin lines) are released from the reset potential V R .The non-resetting membrane potential follows a Gaussian process with time-dependent mean V (t) (gray thick line).Shown are three realizations (green, red, blue lines) that have an identical threshold crossing at time t = t * (blue circle), which is not necessarily the first crossing (indicated by an arrow).(b) Transformation to an equivalent time-homogeneous process x(t) with moving boundary b(t), in which the positions of threshold crossings are preserved.Parameters: value of the FPT density at different times cannot be calculated independently.In particular, the value of P(t) strongly depends on the values for t ∈ (0,t).By contrast, λ (t) is not a probability density and can thus, in principle, be arbitrary as long as it is non-negative and S(t) = exp − t 0 λ (s) ds converges to zero as t → ∞.Thus, if we are able to find any approximation for λ (t), the normalization of P(t) is guaranteed by Eq. (7).
Secondly, the character of the hazard rate is more local in time than the FPT density, and thus, we expect more efficient approximations for the hazard rate.The non-local character of P(t) has been already mentioned above.Moreover, the non-locality becomes particularly evident by the integral in Eq. (7), which accumulates the history of hazard rates.The exponential factor S(t) shaped by this integral thus contributes a trivial history-dependence of the FPT density P(t), which is present already for time-homogeneous processes.By contrast, this trivial history-dependence is divided out in the hazard rate λ (t) = P(t)/S(t).The remaining time-dependence of the hazard rate singles out effects of non-stationarity and explicit time-dependence of the system, which can be captured by local variables.Thirdly, because of the locality in time, time-dependent rates are interesting in its own right as they are often the natural choice to model escape processes in terms of a Markovian dynamics and master equations.
From the above considerations it becomes clear that the hazard rate representation, Eq. (7), is only useful if we succeed to derive approximations for λ (t) that are local in time.This means that we are seeking an approximation of the hazard rate in the form which may depend on time explicitly and through a few variables such as the value and its derivative of the timedependent boundary, b(t) and ḃ(t), respectively, and possibly through a few auxiliary variables z i (t) that obey simple ordinary differential equations.Note that we use the notations Φ for the boundary-dependent hazard rate of the model Eq. ( 5) and Ψ for the voltage-dependent hazard rate of the model Eq.(2b).The two functions are related in a simple way, see Sec. 4.1.

Wiener-rice series
Our approach to tackle the time-dependent FPT problem is to employ the level-crossing statistics of a Gaussian process [48,49,41,50,51].To this end, let us consider the sub-set of all realizations of x(t) that cross the barrier b(t) from below in the time interval (t * ,t * + ∆t), a so-called "up-crossing" (Fig. 1b).The up-crossing at time t * is not necessarily the first one but could be the second, third (and so on) up-crossing (e.g.green and red lines in Fig. 1b).To compute the density of the first up-crossing, one can make use of the statistics of repeated up-crossing events.These events form a point process in the time interval [0,t * ] where N(t * ) denotes the (random) number of up-crossings in that interval, {t i } i=1,...,N(t * ) are the up-crossing times and δ (•) is the Dirac δ -function.The statistics of the point process can be fully described by the set of moment functions . .and non-coinciding time arguments t i [52,53].The moment functions can be interpreted such that for a small time step ∆t the quantity f k (t 1 , . . .,t k )∆t k yields the probability to find an up-crossing events in each of the non-overlapping intervals (t 1 ,t 1 + ∆t), ..., (t k ,t k + ∆t).For instance, f 1 (t) yields the rate of upcrossings at time t, and is the conditional rate of an upcrossing at time t 2 given an up-crossing at time t 1 .
For level-crossings of Gaussian processes, the distribution functions f k can be calculated explicitly, both for stationary and non-stationary processes (see appendix, Sec.A.3).The distribution functions f k allow for an exact series expression of the FPT density, sometimes called Wiener-Rice series [41,50]: A detailed explanation of this formula is given in reference [41].In brief, it counts -for a large ensemble of trajectories -the number of those trajectories that have a crossing in [t,t + dt) but no crossing in (0,t).Starting with the fraction f 1 (t)dt of all trajectories that cross the boundary at time t (k = 0 term), the fraction with no previous crossing can be computed by subtracting those trajectories that crossed the boundary before time t.The second term t 0 f 2 (t 1 ,t)dt 1 in Eq. ( 11) accounts for these trajectories but overestimates their number because some trajectories are counted multiply.This corresponds to trajectories that cross the boundary more than once before time t (e.g.red line in Fig. 1).To correct for the excessive subtraction, one needs to add the fraction of trajectories with two or more crossings before t.This is taken into account by the third term 1 2 t 0 t 0 f 3 (t 1 ,t 2 ,t)dt 1 dt 2 which computes the mean number of crossing pairs {t 1 , t2 } per trajectory (e.g. in Fig. 1, the blue and green curve contributes zero and the red curve contributes one such pair; the factor 1  2 accounts for permutations of t1 and t2 ).Again, this term overestimates the fraction of trajectories with double crossing events because trajectories with more than two crossings are multiply counted (e.g. a trajectory with three crossings gives rise to three pairs {t 1 , t2 }, {t 1 , t3 }, {t 2 , t3 }).Continuing this correction procedure for trajectories with arbitrary number of crossings leads to the infinite series expression Eq. ( 11).
An alternative statistical description of the point process s(t) is given by the k-th order cumulant functions g k (t 1 , . . .,t k ) (see [52,53] and Sec.A.1), which remove the dependence on lower-order moment functions: for instance, g . The probability to find no event in the interval (0,t) (i.e. the survival probability) is related to the cumulant functions by [52,53] From this expression, the Wiener-Rice series for the FPT density, Eq. ( 11) is recovered by P(t) = −dS(t)/dt.Similarly, the hazard rate can be obtained by λ (t) = −d(ln S)/dt.As infinite series expressions, Eq. ( 11) and Eq. ( 12) are of no practical use for direct computations of the FPT density.However, these formal expressions are used as a starting point for further approximations.

Decoupling approximations
The series expression for the survival probability, Eq. ( 12), simplifies considerably if higher-order cumulant functions g k are approximated in terms of lower-order cumulant functions, thereby neglecting higher-order dependencies between up-crossings.In this section, we review two approximations based on such a decoupling of (temporal) interactions between events [52]: a first-order decoupling approximation, where all up-crossing events are assumed to be independent, and a second-order decoupling approximation, in which higherorder interactions are modeled in terms of pairwise interactions.While the first-order approximation readily results in local hazard rates, the more accurate pairwise interaction approximation is highly non-local and therefore not useful for practical calculations.However, as we shall show in Sec.3.5, the pairwise interaction model can be used as a starting point for deriving an efficient local approximation of the hazard rate (second-order decoupling approximation) that accounts for higher-order interactions between up-crossings.

Independent upcrossings
If the correlation time of the process x(t) is much smaller than the (typical) intervals between upcrossings, up-crossing events can be regarded as independent, i.e. the series of upcrossing events is an inhomogeneous Poisson process with rate f 1 (t).Mathematically, this corresponds to neglecting higher-order cumulants except for the first one: g 1 (t) = f 1 (t) and g k ≈ 0 for all k ≥ 2 [52].In this case, Eq. ( 12) reduces to S(t) = exp − t 0 f 1 (τ) dτ , and hence the FPT density reads From this expression, we see that the hazard rate is simply given by the upcrossing rate of the freely evolving process x(t): λ (t) ≈ f 1 (t).The upcrossing rate f 1 (t) can be calculated analytically in terms of the current value of the boundary b(t) and its derivative ḃ(t) (see Appendix A.3 and A.4).
Similar expressions for the level-crossing density in the time-inhomogeneous case have been derived in previous studies [49,54].

Upcrossings correlated in pairs
If the average time between upcrossings 1/ f 1 (t) is on the order of or smaller than the correlation time of x(t) given by τ cor = γ −1 + τ y , upcrossing events cannot be regarded as being independent anymore.To account for correlations between upcrossings, we follow a decoupling approximation (DA) of higher-order correlation functions g k (t 1 , . . .,t k ), k ≥ 3, proposed by Stratonovich [52,55].This approximation assumes that higher-order correlations are governed by the same time scales as pair-wise correlations and can therefore be expressed in terms of the first two correlation functions f 1 (t) and g 2 (t 1 ,t 2 ).Specifically, correlation functions with k ≥ 2 are approximated by the ansatz [52,55] (17) Here, the function R(t,t ) describes the pairwise interactions between events at time t and t , and {• • • } sym denotes the operation of symmetrization (i.e. the arithmetic mean of all permutations of the time arguments).As suggested in [52,55], we choose R(t,t ) as the normalized auto-correlation function which makes the ansatz Eq. ( 17) exact for k = 2.Note that compared to [52,55], we use an opposite sign in the definition of R for mathematical convenience.The auto-correlation function R(t,t ) can be interpreted as the conditional probability density of an event at time t given an event at time t normalized by the unconditional probability density f 1 (t ) and shifted by the mean such that R(t,t ) = 0 if events at time t and t are independent.For stationary point processes, R(t,t ) = R(|t − t |) only depends on the time difference.In analogy to the common use for spatial point processes, R(t −t ) will be called pair correlation function in this case.We expect the following behavior of the auto-correlation function: firstly, if events are far apart, |t − t | τ cor , they occur independently, hence f 2 (t,t ) ≈ f 1 (t) f 1 (t ).This implies a vanishing auto-correlation function R(t,t ) ≈ 0. Secondly, the behavior when t and t are close depends on the correlations between events: if close events occur independently as in the case of an inhomogeneous Poisson process, R(t,t ) vanishes.In contrast, a positive pair correlation function R(t,t ) > 0 at small time lag indicates that events are attractive and tend to cluster.Conversely, for a negative pair correlation function R(t,t ) < 0 at small time lag, events are repulsive, i.e. the occurrence of close events is less frequent than expected for a Poisson process.In particular, if a point process exhibits a refractory period after each event ("hardcore interaction"), we find that f 2 (t,t ) = 0 and hence R(t,t ) = −1 if t and t fall within a refractory period.Similarly, non-approaching random points [52] are characterized by R(t,t) = −1 in the limit of vanishing time lag.Interestingly, it has been assumed by some authors that level crossings of differentiable processes are non-approaching events with R(t,t) = −1 [41,56].In Sec.3.4 we shall investigate this assumption in more detail.
While the decoupling approximation (DA), Eq. ( 17), is exact for k = 2 by construction, it must be considered as a physically motivated, heuristic ansatz for k ≥ 3, which in general is not expected to be exact.Nevertheless, the ansatz and the above-described behavior of R(t,t ) ensure some important properties of the higher-order correlation functions g k : first, the DA is exact for an inhomogeneous Poisson process because in this case R(t,t ) ≡ 0 and thus Eq. ( 17) recovers the expected result g k ≡ 0 for all k ≥ 2. Second, g k does not depend on the order of the time arguments because of the symmetrization operation in Eq. (17).Third, g k (t 1 , . . .,t k ) ≈ 0 if the time difference of two arguments is much larger than τ cor because their pair correlation vanishes.And forth, it is known that for a system of non-approaching random points g k (t, . . .,t) = (−1) k (k − 1)! f k 1 (t) [55], which is consistent with Eq. ( 17) and R(t,t) = −1.
Substituting the DA, Eq. ( 17), into the general expression for the survival probability, Eq. ( 12), yields [52,55,41,57] where is a measure of upcrossing correlations on the time scale t.The formula Eq. ( 19) has been termed Stratonovich approximation [41].Comparing the Stratonovich approximation with the first-order decoupling approximation, Eq. ( 13), we observe that the upcrossing rate f 1 (τ) is multiplied by a correction factor ln(1 + q)/q.However, this correction factor depends explicitly on time t, which precludes a direct interpretation of the integrand in Eq. ( 19) as the hazard rate (but see [57] for a hazard rate approximation of the integrand in the time-homogeneous case).For the Stratonovich approximation to be applicable, one has to require that q(t, τ) > −1 (21) for all t and τ so as to keep the argument of the logarithm positive [41].
In practice, Eq. ( 19) is not useful as a computational tool.A numerical evaluation is highly inefficient because Eq. ( 19) contains nested integrals on three levels: for each τ of the outer integral, the integral q(t, τ) needs to be evaluated independently for each time t.Furthermore, the numerical integration of q(t, τ) is itself computationally complex because R(τ, τ ) involves a further integration (taking already into account that one of the two integrals in the definition of f 2 , Eq. (81), Sec.A.5, can be evaluated analytically [49,41]; we note that f 2 can also be expressed in terms of Owen's T function [57]).Therefore, we will further simplify Eq. ( 19) by deriving a local approximation of the hazard rate.

The auto-correlation function of level crossings for small time lags
We now proceed with calculating the auto-correlation function R(t,t + τ) in the limit of small time lags τ.Based on the zero-lag limit we then propose a rough estimation of the temporal correlation structure for τ > 0, which will be required for the simplification of the Stratonovich approximation in the next section.While the rate of level-crossings has been studied extensively (e.g.[48,55,41,58]), the calculation of second-order statistics such as the auto-correlation function has not received much attention.To the best of our knowledge, closed-form analytical formulas for the autocorrelation function of non-stationary level crossings have not been published previously.In the Appendix Sec.A.5.2, we also provide formulas for the auto-correlation function of general Gaussian level-crossing processes in the stationary state (see also [59] for special cases and [60] for the related but distinct result for the stationary auto-correlation function of the two-state process triggered by level crossings).
According to Eq. ( 18), the auto-correlation function at zero time lag is given by where f 2 (t,t) ≡ lim τ→0 f 2 (t,t + τ) is defined through the limiting procedure τ → 0. This corresponds to the continuous part of the auto-correlation function, i.e. f 2 (t,t) excludes the singular self-correlation of points given by f 1 (t)δ (τ).
The correlations between upcrossing in the limit of vanishing lag can be calculated within a saddle-point approximation (see Appendix, Sec.A.5).The result is It is instructive to discuss the stationary case, ḃ = 0 and t → ∞, in which the pair correlation function R(τ) for vanishing time lag τ obtains the simple form with the numerical constant β = (3 √ 3 − π)/9 ≈ 0.228284.For any fixed value of γτ y this expression becomes minimal at b = 0 (Fig. 2c, blue dashed line).From this we infer that R 0 is always positive if γτ y < 0.0583757 ("white noise regime") or γτ y > 17.1304 (strong friction or large noise correlation time).In this case, upcrossings tend to cluster.In the wide intermediate range 0.0583757 < γτ y < 17.1304, the sign of R 0 depends on the ratio |b|/σ x of barrier height to standard deviation of x(t).For vanishing or low barrier height such that |b|/σ x is below the critical value the pair correlation function will be negative at small time lags, i.e. upcrossings tend to repel each other.Closer inspection of Eq. (25) shows that for any barrier height b, R 0 becomes minimal (i.e.most negative) if γτ y = 1.The absolute achievable minimum is found as R 0 = −0.543431.Therefore, the value R 0 = −1 expected for non-approaching points is never realized for level crossings of a doubly lowpass-filtered white noise such as Eq. ( 1) and Eq. ( 5) for the membrane potential and overdamped Brownian particle driven by a one-dimensional Ornstein-Uhlenbeck noise, respectively.This result is in marked contrast to the assumption of non-approaching level crossings made in previous studies [41,56].
On the other hand, for high barriers such that |b| > b crit , the pair correlation function is positive at small time lags, implying that upcrossing events tend to cluster.Intuitively, upcrossings are mediated by large fluctuations of x(t) in order to reach the high barrier.Once the barrier is reached, x(t) persists at high values for some period because values of x(t) are positively correlated at short time lags.During this period the probability to cross the barrier for a second time is strongly increased.That is, upcrossings tend to cluster in periods on the order of the correlation time of x(t).This clustering corresponds to a positive pair correlation R 0 3.5 Local hazard function.
From the Stratonovich approximation, Eq. ( 19), we obtain the corresponding hazard rate by differentiating − ln S(t) with respect to t.Using Eq. ( 20), the result can be written as where F(q) = ln(1+q)/q.Because of the integral in Eq. ( 27), the hazard rate is still non-local in time.In order to obtain a local approximation, we make two ad hoc approximations.First, Eq. ( 27) can be considerably simplified if F (q(t, τ)) only weakly depends on τ such that we can pull this function out of the integral.Under this assumption and using again Eq. ( 20), the hazard rate reduces to the particularly simple form where we used the short-hand notation The above ad-hoc approximation seems plausible because the pair-correlation function R(t, τ) is different from zero only in a region of width |τ − t| ∼ τ corr around its peak at the integration boundary τ = t, where τ corr is the correlation time defined in Eq. ( 31) below (Fig. 2a,b).On this time scale, q(t, τ) represents indeed a slowly varying function of τ since it results from an integration over R (cf.Eq. ( 20)).
Note that an alternative approximation has been suggested in [57], which neglects the second term in Eq. ( 27).The formula Eq. ( 28) reveals a simple relation between the upcrossing rate and the hazard rate, which is the relevant quantity for the FPT: In the absence of correlations between upcrossings, q = 0, the two rates are equal, while negative correlations (repulsion of up-crossings) increase the hazard rate and positive correlations (attraction or clustering of upcrossings) decreases the hazard rate compared to the upcrossing rate f 1 .
Second, to find a local estimation of q(t) we need to turn the integral in Eq. ( 29) into a differential equation for q.A simple way to achieve this is to use an exponential approximation for the pair correlation function where R 0 (t) = f 2 (t,t)/ f 2 1 (t) − 1 is the limit of vanishing time lag τ → 0. Accordingly, the function f 2 (t,t) has to be understood as the limit lim τ→0 f 2 (t,t + τ), which has been calculated analytically in the previous section.Furthermore, τ corr is the typical correlation time with which correlations between upcrossings decay as function of their temporal distance.As a rough approximation, this correlation time is given by the correlation time of the stationary process x(t) itself: Here, C xx (τ) is the auto-correlation function of x(t) in the stationary state.In fact, comparison of the exponential approximation with numerical evaluation of the exact quadrature formula of the correlation function confirms our choice of τ corr and also shows that that the exponential ansatz is reasonable as long as R 0 is significantly different from zero (Fig. 2 a,b, left and right panels).In the crossover region from negative to positive R 0 when the barrier height b is increased, the auto-correlation function has both positive and negative phases that are not captured by an exponential function (Fig. 2 a,b, middle panels).However, these deviations are less significant because absolute correlations are small in this case.
Inserting the exponential ansatz Eq. ( 30) into Eq.( 29), we can pull R 0 (t) in front of the integral and obtain: where defines a new auxiliary variable that satisfies the differential equation with z(0) = 0. We note that the slightly different ansatz R(t,t ) ≈ R 0 (t ) exp − t−t τ corr yields slightly different equations with similarly good results.In Sec.5.2, we will thus only show the results for the above ansatz, Eq. (30).We note that in the limit of vanishing correlations between upcrossings, R 0 (t) ≡ 0, the first-order DA λ (t) ≈ f 1 (t) is recovered from Eq. (28).Thus, the first-order approximation, Eq. ( 14), is expected to be valid if for all t > 0. In summary, the local hazard rate in the second-order DA is given by .
(35) Here, Φ 1 is given by Eq. ( 14) and R0 b, ḃ,t = f2 b, ḃ,t is the zero-lag correlation between up-crossings, Eq. ( 22), where f2 is given by Eq. (23).In contrast to the first-order approximation Φ 1 , the hazard rate Φ 2 depends on the additional local variable z that obeys Together with Eq. ( 8), this ordinary differential equation provides an update rule for the numerical evaluation of the FPT density P(t) forward in time.

First-passage-time densities
Being equipped with local approximations of the hazard rate, the FPT density P(t) can be easily obtained from Eq. ( 8).To test the performance of our theory, we compare the first-and second-order decoupling approximations (DA) with simulations and an alternative hazard-rate theory proposed by Chizhov and Graham [26].An extended variant of the Chizhov-Graham (C&G) theory is presented in Appendix B, Eq. ( 101).
For concreteness, we consider a periodically moving boundary: (Fig. 3, top panels).The case, where the amplitude of the oscillating boundary is smaller than unity, α < 1, corresponds to the subthreshold firing regime of LIF neurons.In this case, both the first-and second-order DA (Eq.( 8) with λ (t) given by Eq. ( 14) and ( 35), respectively) yield excellent agreements with simulations (Fig. 3a).In contrast, the C&G theory (Eq.( 8) with λ (t) given by Eq. ( 101)), shows clear deviations from simulations at the peaks of the FPT density and during the time spans when the boundary is increasing ( ḃ > 0), i.e. when the boundary moves away from zero.In these regions, the drift component, Eq. ( 96), of the C&G hazard rate is set to zero, leaving only diffusion as a source of threshold crossings.The rectification of the drift component also leads to a characteristic kink at the local extrema of the boundary ( ḃ(t) = 0).The case of large amplitude oscillations of the boundary (α > 1) is equivalent to a LIF model that is periodically driven into the supra-threshold regime.In this case, the first-order DA performs significantly worse than the secondorder approximation and the C&G theory, which both agree well with simulation results (Fig. 3b).In particular, the first peak in the FPT density (green dotted line in Fig. 3b) is underestimated if correlations between upcrossings are neglected.The underestimation is caused by a reduced hazard rate, which can be understood from the simple formula Eq. ( 28): in the first order approximation, the hazard rate is given by the level-crossing rate λ (t) ≈ f 1 (t), while in the second-order approximation λ (t) ≈ f 1 (t)/[1 +q(t)] with q(t) = R 0 (t)z(t).The factor 1/(1 + q) accounts for the correlations between upcrossings.At the peak, the boundary b(t) is close to zero.In this case, the zero lag pair correlation R 0 is negative representing the reduced probability of nearby crossings ("repulsion", Fig. 2, left panels).Since z is positive, we have −1 < q < 0 and thus the factor 1/[1 + q] is larger than unity (note that q > −1 by the assumption Eq. ( 21)).Therefore, correlations between upcrossings lead to an increased hazard rate and thus a stronger first peak of the FPT density.
4 Mapping colored input noise to escape noise in the leaky integrate-and-fire model

Link function
We now come back to our initial motivation to map colored noise in the input to escape noise in the output of a LIF neuron.Having derived the hazard rate Φ for the FPT with moving boundary b(t), it is easy to formulate the link function Ψ in Eq. ( 2) that provides the escape-noise model corresponding to the LIF model with input noise Eq. (1).To this end, we only need to shift time such that the FPT starts at time t +t ref instead of t = 0, enforce a zero hazard rate during the absolute refractory period, and express the moving threshold b(t) in terms of the mean membrane potential u(t) for t > t + t ref using Eq. ( 4).Accordingly, we also replace the temporal derivative of the moving boundary by for t > t + t ref .
The last expression shows that, instead of the two functions u(t) and u(t), one can also use the two functions u(t) and µ(t) if the stimulus µ(t) is known.With these changes, we obtain the link function in the first-order DA as Here, θ (t) = 1 t≥0 is the Heaviside step function and Φ 1 is given by Eq. ( 14).Note that in the first-order DA, the link function Ψ (u, u, z, τ) = Ψ 1 (u, u, τ) does not depend on an auxiliary variable z.In contrast, the 2nd-order DA exhibits an additional auxiliary variable z.Taking the last spike time and the absolute refractory period into account, its dynamics reads ż = − z with initial condition z(t) = 0. We can now write the link function Ψ in the second-order DA as where Φ 2 is given by Eq. ( 35).

Comparison with simulation and C&G theory
To judge the performance of the level-crossing theory given by the link functions Ψ 1 and Ψ 2 , we compared ISI densities, survival functions and hazard rates with Monte-Carlo simulations of the LIF model with colored input noise, Eq. ( 1), and the C&G theory.These functions are obtained from the link functions as where for the first-order decoupling approximation(DA) and for the second-order DA, with Ψ 1 and Ψ 2 given by Eq. ( 40) and Eq. ( 42), respectively.In Eq. ( 45) and ( 46), we have introduced the membrane potential and the auxiliary variable as deterministic functions of t and t.For t > t +t ref , these functions obey the first-order dynamics with initial conditions u(t The time-dependent stimulus µ(t), shown in Fig. 4 (top panels), was obtained as µ(t) = µ 0 + µ 1 (t), where µ 1 (t) is a fixed realization of a band-limited white-noise process with a cut-off frequency of 100 Hz.Without loss of generality, we also choose the last spike time as the time origin, t = 0.The membrane potential u(t|0) is shown in Fig. 4 (second panel from top).Note that in simulations and figures, we measured voltages in units of V T − V R and chose the arbitrary reference potential such that V R = 0, and hence V T = 1.For subthreshold stimuli (Fig. 4A), u(t|0) < V T , both the first-and second-order decoupling approximations agree well with the interval distribution obtained from simulations of the model with colored input noise.As in the case of periodic subthreshold driving (Fig. 3a), the C&G theory exhibits again clear deviations at the peaks of the ISI density and in periods where the slope of the mean membrane potential is negative, u(t|0) < 0, (Fig. 4A, middle panel).The overall performance is better visible in the survival function (Fig. 4A, second panel from bottom), which is related to the cumulative ISI distribution via S(t|t) = 1 − t t P(s|t) ds.It confirms the excellent performance of both decoupling approximations in the subthreshold regime.For completeness, we also compared the hazard rates (Fig. 4A, bottom panel).Note that the initial transient of u(t|0) from reset to resting potential µ 0 realizes a relative refractory period, where the the probability to fire is low.
For suprathreshold stimuli, where the mean membrane potential exceeds the threshold, the first-order DA deviates significantly from simulation results (Fig. 4B).This is because level crossings occur more frequently when u is close to the threshold and thus exhibit stronger (negative) correlations.In this case, the assumption of independent upcrossing is no longer valid.Again, the underestimation of the first peak in the ISI density and the hazard rate (dotted lines in Fig. 4B, middle and bottom panel) if correlations are neglected can be understood from the simple formula Eq. ( 28): under the assumption of independent upcrossings, the hazard rate is given by the level-crossing rate λ (t|0) ≈ f 1 (t), while correlations between upcrossings are accounted for in the second-order approximation as λ (t|0) ≈ f 1 (t)/[1 + R 0 (t)z(t|0)].We have seen that if u is close to the threshold (corresponding to b = 0), the zero lag pair correlation R 0  4 First-passage-time density, survivor function and hazard rate under non-stationary driving of a neuron that fired its last spike at time t = 0. (A) Weak subthreshold stimulus µ(t) (top panel) leads to a mean membrane potential response u(t|0) below threshold at V T = 1 (second panel).The first-passage-time density P(t|0) for the first threshold crossing of V (t) is shown in the third panel (gray circles: MC simulations of 10 6 trials; red solid line: Chizhov-Graham theory, Eq. ( 7), (101); blue dashed line: first-order decoupling approximation (independent upcrossings), Eq. ( 45), (43); blue solid line: second-order decoupling approximation (correlated upcrossings), Eq. ( 46), (43).The survival probability S(t|0) = −dP(t|0)/dt and the corresponding hazard rate λ (t|0) are shown in the two bottom panels, respectively.For MC simulations, the hazard rate is computed from the ratio λ (t|0) = P(t|0)/S(t|0).(B) The same for a suprathreshold stimulus, for which the mean membrane potential u(t|0) reaches the threshold.In both figures, τ s = 4 ms, τ m = 10 ms and σ η is such that the standard deviation of V is σ V = 0.25.
is negative representing the reduced probability of nearby crossings ("repulsion", Fig. 2, left panels).Since z is positive, the factor 1/[1 + R 0 z] is larger than unity (Note that q ≡ R 0 z > −1 by assumption Eq. ( 21) for the applicability of the Stratonovich approximation).Therefore, correlations between upcrossings lead to an increased hazard rate (2ndorder DA) as compared to the theory with independent upcrossings (1st-order DA) (blue solid vs. blue dotted line in Fig. 4B, bottom).
To characterize the error of the theoretical approximations more systematically, we compare theory and simulations as a function of the stimulus properties (Fig. 5).To this end, we model µ(t) as a complex stimulus sampled from a stationary Ornstein-Uhlenbeck process with correlation time τ µ , mean μ and variance (1 + τ m /τ µ ) σ 2 .This parametrization has been chosen such that the non-resetting membrane potential V has mean μ and standard deviation σ in the stationary state.For a given realization µ(t), we quantify the deviation of the theoretical ISI distribution P µ (t|0) from the simulated one Pµ (t|0) using the Kolmogorov-Smirnov (KS) statistics [61].This statistics is then averaged over the stimulus ensemble (the subscript µ indicates the dependence on a given realization µ(t)).Explicitly, the mean KS statistics is defined as where • µ denotes the ensemble average over realizations µ(t).Thus, the KS statistics can also be interpreted as the largest absolute difference between the survival function S µ (t|0) and the simulated survival function Ŝµ (t) (see Fig. 4, second panels from bottom).The analysis confirms our previous observations that the decoupling approximations perform best in the subthreshold regime ( μ < 1) at small stimulus variations σ (Fig. 5); they both become worse in the tonically-firing regime ( μ > 1).Although the qualitative dependence on the stimulus parameters is similar between the 1st-and 2nd-order DA, the er- ror is considerably smaller for the 2nd-order DA throughout stimulus parameters.On the other hand, the Chizhov-Graham (C&G) theory has an opposite dependence, it generally performs well in the tonically-firing regime ( μ > 1) and shows small weaknesses in the subthreshold regime (Fig. 5b, μ < 1), but it exhibits a good overall performance.For all three approximations, the error is larger for a rapidly changing stimulus (Fig. 5a).Interestingly, in the strongly meandriven regime ( μ > 1), a constant or weakly-fluctuating stimulus ( σ 1) turns out to more difficult for the 2nd-order DA than a more strongly fluctuating stimulus (Fig. 5b,c).
5 Population activity of LIF neurons (time-dependent firing rate)

Integral equation
As an application of the noise mapping, we consider the dynamics of the time-dependent firing rate, or equivalently the population activity of LIF neurons with colored input noise.Being in possession of an approximate hazard rate, it is straightforward to use the renewal integral equation [3,1] (or equivalently, the refractory density equation [25][26][27][28]62]) to compute the population activity forward in time.To this end, let us consider a population of N uncoupled LIF neurons with colored input noise, Eq. ( 1).The spike train X i (t) of a given neuron i, i = 1, . . ., N is defined as ), where {t i,k } k∈Z are the spike times of that neu-ron.The population activity is defined as the total number of spikes in a small time bin (t,t + ∆t) divided by the number of neurons N and the time step ∆t.In the limit of infinitely many neurons and infinitesimally small time steps, we obtain the deterministic population activity Note that this expression can also be interpreted as an ensemble or trial average of a single neuron spike train, i.e.A(t) is equivalent to the time-dependent firing rate of a single neuron measured over many trials or realizations of a statistical ensemble.The evolution of the population activity is given by the renewal equation [47, 1] where P(t|t) is given by Eq. ( 43) and t + 0 denotes the rightsided limit.In Eq. ( 51), we assumed that the population is initialized with a spike of each neuron at time t 0 ("synchronized initial condition").The first term P(t|t 0 ) represents the contribution from neurons that fire at time t for the first time after the initial spike at t 0 .The integral equation ( 51) can be efficiently solved numerically [63].In particular, for numerical solutions, it is useful to turn the exponential factor into a differential equation as in Eq. ( 8): for all t < t.

Comparison with simulations and C&G theory
As an example, we studied the response of the population activity to the complex stimulus µ(t) shown in Fig. 6Ai and Bi.In the subthreshold regime, where the membrane potential remains below threshold (Fig. 6A), the level-crossing theory well predicts the population activity obtained from simulations, while the C&G prediction exhibits small deviations as expected from the deviations of the ISI density in the subthreshold regime discussed above (Fig. 3 and Fig. 4).The agreement is good for both strong and weak noise.For suprathreshold stimuli, where the membrane potential exceeds the threshold, the first-order decoupling approximation shows clear deviations (Fig. 6B).However, accounting for correlations between level-crossings in the secondorder approximation recovers the population activity of simulated neurons for both strong and weak noise.Similarly, the C&G theory shows an excellent agreement with simulations.

Discussion
We developed a level-crossing theory for the hazard rate of a leaky integrate-and-fire neuron driven by colored input noise.To this end, we generalized the Stratonovich approximation for the first-passage-time (FPT) density [55,41,57] to the time-inhomogeneous case, where the stimulus or boundary is time-dependent, and derived a simplification that is local in time.Because higher-order correlations between upcrossings are approximated through their pair-wise correlations, we referred to this theory as the second-order decoupling approximation (DA).Besides the mean membrane potential u(t), the simplified hazard rate depends on the speed u and one additional variable z(t), which accounts for correlations between level crossings.Therefore, the escapenoise model defined by this hazard rate consists of only one extra first-order differential equation, Eq. ( 41), besides the dynamics of u, Eq. (2a).Our simulation results for the timedependent interspike-interval (ISI) density and population activity show that the mapped LIF model with escape-noise well matches the LIF model with colored input noise.Thus, the hazard rate in the 2nd-order DA (link function Eq. ( 42) and dynamics of z, Eq. ( 41)) provides a novel map from input noise to escape noise.We note that the dependence on the speed u is important and qualitatively differs from commonly used escape-noise models, where the link function only depends on the mean membrane potential u.Given the extensive theoretical literature on population models with simple link functions Ψ (u) [3,64,30], it will be an interesting question for further studies how the mean-field dynamics is influenced by an additional dependence on the membrane potential speed u.
The map based on the 2nd-order DA should be compared to the 1st-order DA, which neglects any correlations between upcrossings and represents a time-dependent generalization of the Hertz approximation [41], and the previously proposed map by Chizhov and Graham (C&G) [26].The generalized Hertz approximation (1st-order DA) involves less ad-hoc approximations compared to the 2nd-order DA (cf.Eqs. ( 28) and ( 30)), and performs well in the fluctuationdriven (subthreshold) firing regime at low firing rates.On the other hand, its region of validity, Eq. ( 34), is rather limited, especially transiently large firing rates and mean-driven (suprathreshold) firing are not well described by the firstorder approximation.Furthermore, the gain in numerical efficiency compared to the 2nd-order DA is minor: e.g., simulating the firing rate trajectory of 200ms in Fig. 6B (middle) took 134ms for the 1st-order DA versus 165ms for the 2ndorder DA (Julia code run on an Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz).
On the contrary, the C&G map exhibits some weaknesses in the fluctuation-driven regime, while it has an excellent performance for short, mean-driven firing-rate transients.This   4A leads to a mean membrane potential response u(t|t 0 ) below threshold at V T = 1 (ii).The resulting population activity A(t) is shown in (iii) and (iv) for strong (σ V = 0.25) and weak (σ V = 0.1) background noise, respectively.Gray circles: MC simulations of 10 6 trials; red solid line: Chizhov-Graham theory, Eq. ( 7), (101); blue dashed line: level-crossing theory with independent upcrossings (1st-order decoupling approximation), Eq. ( 45), (43); blue solid line: level-crossing theory with correlated upcrossings (2nd-order decoupling approximation), Eq. ( 46), (43).(B) The same for a suprathreshold stimulus as in Fig. 4B, for which the mean membrane potential u(t) reaches the threshold.In both panels, τ s = 4 ms, τ m = 10 ms, t ref = 4 ms and the population was initialized at time t 0 = −25 ms (initial transient not shown).
behavior is expected because the theory represents an interpolation between two limit cases, where the theory is expected to work well: strong positive drift towards the threshold without diffusion (cf.also [40]) and pure diffusion without drift.During short mean-driven periods the drift-induced firing dominates and diffusion effects can be safely neglected.An advantage of the C&G hazard rate, Eq. ( 103), is its simpler mathematical form and thus easier numerical implementation than the hazard rates based on the level-crossing theory (1st-and 2nd-order DA).Furthermore, the C&G theory permits to take the white-noise limit, τ s → 0, whereas the level-crossing theory is not well defined in this limit: for τ s → 0, the upcrossing rate f 1 diverges [48,55] (cf.Eq. ( 87)).Despite the divergence in the white-noise limit, we found in simulations that the 2nd-order DA performs well in the physiologically relevant range of synaptic time scales including synaptic time constants as small as τ s = 1 ms (relative to τ m = 10 ms, data not shown).On the other hand, the numerical efficiency of the C&G and the 2nd-order DA are comparable (e.g.175ms and 165ms run time, respectively, for the 200ms firing rate trajectory in Fig. 6B, middle).Over-all, the C&G theory represents a good compromise between simplicity and accuracy.
Apart from the mapping of input noise to escape noise, the analysis performed in this paper also provided some analytical insights into the Stratonovich approximation.First, the ansatz of Stratonovich, Eq. ( 17), has been originally proposed for a system of "non-approaching" random points (level crossings) [52,55].In our terminology, this means that the pair correlation function at zero time lag is R(t,t) = −1.Put differently, the conditional rate ν cond (t, τ) = f 2 (t,t +τ)/ f 1 (t) of an upcrossing to occur a time lag τ after a crossing at time t vanishes for τ → 0 if upcrossings are non-approaching.However, we found that in our case of the membrane potential driven by an exponentially-correlated Gaussian noise, i.e. a doubly low-pass-filtered white noise (cf.Eq. ( 1) or ( 5)), the upcrossings do not form a system of non-approaching points.The conditional rate ν cond at zero time lag has a nonvanishing minimum (corresponding to a reduced probability of close upcrossings, ν cond < f 1 ) and can even exceed the stationary upcrossing rate, ν cond > f 1 , (the probability of an upcrossing is increased by an immediately preceding upcrossing, as already noted by [59] for stationary levelcrossings).Given the excellent agreement of the 2nd-order DA with simulations, the ansatz Eq. ( 17) seems to be more general and not limited to systems of non-approaching random points.
Based on the assumption of non-approaching level crossings, the threshold-crossing process has been frequently used as an analytically tractable model of neural spike generation.
Examples include the calculation of information rates [65], pairwise correlations and synchronization of neurons due to shared inputs [58,66,59] and stochastic resonance [67].The intuition behind this assumption is that level crossings exhibit refractoriness [56] or a silence period [58] because it takes some time for a trajectory to re-cross the threshold from below.While this intuition is true for sufficiently smooth Gaussian processes [58,66] (auto-correlation function must be at least four times differentiable at 0), it fails if the velocity of the process is not differentiable (third derivative of auto-correlation at 0 does not exist), as in the present study and in [41,68,54,56].Because neurons exhibit some degree of refractoriness, the Gaussian processes of thresholdcrossing neurons should be sufficiently smooth to be useful as a spiking neuron model.By mapping input noise to escape noise we could apply the renewal integral equation to predict the time-dependent population activity of infinitely many LIF neurons with colored input noise.This detour via an approximate escapenoise model allowed us to circumvent the direct numerical solution of the two-dimensional Fokker-Planck equation associated with the LIF model Eq. ( 1).An intriguing question is whether the same indirect approach could be used to solve the important problem of finitely many neurons with input noise.Neural circuits in the brain are often modeled by networks of integrate-and-fire neurons driven by Poissonian input noise (e.g.[69,13,70]).In these network models, the number of neurons per cell type range from about hundred to a few thousand cells, consistent with experimental estimations [71].On this mesoscopic scale, finite-size fluctuations of the population activity cannot be neglected.It is, however, unknown how to generalize the Fokker-Planck equation to a stochastic population equation in the case of finite neuron numbers, so as to account for finite-size fluctuations.On the other hand, the problem of finite-size neural population equations has been recently solved for LIF neurons with escape noise in the form of a stochastic integral equation [5,30].In the original paper [5], we have applied the stochastic integral equation to the cortical microcircuit model of [13] by roughly fitting an escape-noise model with the simple link function Ψ (u) = ce β u to match mean population activities of simulation data.However, with the map derived in this paper, where Ψ depends on u and u, it should be possible to directly use the stochastic integral equation as a new mesoscopic population model for finite-size populations of LIF neurons driven by colored input noise.
A FPT density from level-crossing statistics

A.1 General expression for survivor function
The sequence of upward crossings of the freely evolving, non-resetting membrane potential across the threshold, or shortly the set of "upcrossings", forms a point process {t 1 ,t 2 , . . .} in time with t i > 0. Thus, the upcrossing times are defined by V (t i ) = V T and V (t i ) > 0. As any point process, the upcrossing times for t > 0 can be fully characterized by the joint distribution functions f 1 (t 1 ), f 2 (t 1 ,t 2 ), f 3 (t 1 ,t 2 ,t 3 ), ... (see, e.g.[52,53]).These functions are defined such that is the probability to find an upcrossing in each of the non-overlapping intervals [t 1 ,t 1 + dt 1 ), ..., [t k ,t k + dt k ), with sufficiently small intervals dt 1 , . . ., dt k < dt and non-coinciding arguments t i = t j for all i = j.In the case of coinciding arguments t i = t j for some i = j, the function f k is understood to be its limit value for t i → t j .
For our purpose, it will be more convenient to use the correlation functions g 1 (t 1 ), g 2 (t 1 ,t 2 ), g 3 (t 1 ,t 2 ,t 3 ), ... (see, e.g.[52,53]).Similar to the joint distribution functions { f k }, the system of correlation functions {g k } completely characterizes the statistics of the upcrossing times.To define the correlation functions, we first introduce the generating functional for the f k given by where v(t) is a test function [52,53].It can be shown that expanding the generating functional in powers of v(t) yields i.e. the functions f k are the expansion coefficients of the generating functional.Therefore, the joint distribution functions f k can be uniquely generated by functional differentiation of L [v].In analogy to the cumulants of a random variable that are generated from the logarithm of the moment generating function, the correlation functions g k can be obtained from ln L as follows: In particular, the first two correlation functions read By means of the correlation functions, the survivor function S(t), i.e. the probability for having no upcrossing in the interval [0,t), can be expressed as Eq.(12).

A.2 Moments and correlation functions of the Gaussian process
In contrast to the vanishing first moments x = y = 0 and the stationary variance σ 2 y = y 2 (t) , the second moments σ 2 x (t) = x 2 (t) and σ xy (t) = x(t)y(t) are time-dependent.They obey the differential equation [72] d with σ 2 y = D/τ y , τ−1 = γ + τ −1 y and σ 2 x (0) = σ xy (0) = 0.The explicit solution is For large t, the process [x(t), y(t)] becomes stationary with the following constant moments

A.3 Joint distribution functions for upcrossings
Let us denote the point process of the upcrossings by {t i } i=1,2,... .The corresponding spike train can be written as Note that this equation can be seen as an extension of the Kac-Rice formula [51] to moving boundaries.The joint distribution function is defined as (for t i = t j for i, j = 1, . . ., k, i = j).Substituting Eq. ( 63) into Eq.( 64) and taking the average yields (65) where b i and ḃi is short-hand for b(t i ) and ḃ(t i ), respectively.Furthermore, p (x, ẋ) 2k (x 1 , . . ., x k , ẋ1 , . . ., ẋk ) is the joint probability density for the variables x i = x(t i ) and ẋi = ẋ(t i ).In our case of the two-dimensional Ornstein-Uhlenbeck process, Eq. ( 5), p (x, ẋ) 2k can be simply expressed by the joint probability density p 2k (x 1 , . . ., x k , y 1 , . . ., y k ) of the variables x i = x(t i ) and y i = y(t i ): (66) Inserting this expression into Eq.( 65) yields (67) where we made the substitution ẋi = ḃi + w i with new integration variables w i .We note, however, that for higher-dimensional models, it is generally more convenient to directly compute the density p (x, ẋ) 2k and use Eq.(65).For example, for a (n + 1)-dimensional Gaussian process x(t) = [x(t), y 1 (t), . . ., y n (t)] T , this density is determined by the time-dependent correlation functions x(t)x(t + τ) , x(t) ẋ(t + τ) , ẋ(t)x(t + τ) and ẋ(t) ẋ(t + τ) , which can be obtained from the timedependent covariance matrix of x(t) in a straightforward manner.

A.4 Uprossing rate f 1 (t)
Using the moments σ 2 x (t), σ xy (t) and σ 2 y derived in Sec.A.2, the joint probability density of x and y is given by the bivariate Gaussian distribution with xy .This allows us to calculate the upcrossing rate f 1 (t) from Eq. ( 67).The integration can be performed analytically resulting in the formula Eq. ( 14).

A.5 Correlations between upcrossings for small time lag
Here, we are interested in the probability that two upcrossings occur very close to each other.More precisely, we want to calculate the probability density f 2 (t,t + τ) in the limit when the distance τ between upcrossings goes to zero.

A.5.1 Time-dependent boundary
To this end, we need the probability density of the four-dimensional vector z = [x(t), x(t + τ), y(t), y(t + τ)] T , which is given by the multivariate Gaussian distribution This distribution is determined by the correlation matrix C 4 with elements (C 4 ) i j = z i z j : where we used the notations C xx (t, τ) ≡ x(t)x(t + τ) , C xy (t, τ) ≡ x(t)y(t + τ) and C yx (t, τ) ≡ y(t)x(t + τ) .Furthermore, note that σ 2 y = y 2 (t) and C yy (τ) ≡ y(t)y(t + τ) do not depend on time because of the stationarity of y(t).The correlation functions for τ > 0 can be computed from the regression theorem [72]: C xy (t, τ) = G yx (τ)σ 2 x (t) + G yy (τ)σ xy (t), C yy (τ) = G yy (τ)σ 2 y (t), where we used the elements of the Green's function of the Ornstein-Uhlenbeck process Eq. ( 5).Using the negative drift matrix of the Ornstein-Uhlenbeck process A = In Eq. ( 70) we also need the time-shifted moments σ 2 x (t + τ) and σ xy (t +τ).These can be obtained from σ 2 x (t) and σ xy (t) by propagating Eq. (59) Because we are interested in the limit τ → 0, we can expand the moving threshold at time t to linear order such that b(t + τ) = b(t) + ḃ(t)τ + O(τ 2 ). (80) The two-point joint density follows from Eq. ( 67) and ( 69): Combining all factors yields the two-point upcrossing density in the limit of zero lag given by Eq. ( 23).
In the stationary case, ḃ = 0 and t → ∞, the formula for f 2 (t,t) reduces to  with the stationary upcrossing rate This expression results in the pair correlation function Eq. (25).

A.5.2 Auto-correlation function of up-crossings for stationary, differentiable Gaussian processes
In the stationary case, the rate of upcrossings f 1 is constant and the second order distribution function as well as the auto-correlation function of x only depend on the time difference, hence f 2 (t,t + τ) = f 2 (τ) and C xx (t,t + τ) = C xx (τ).A classical result for the upcrossing rate is [48] Here, we derive the asymptotic behavior of f 2 (τ) for small time lag τ.
To this end, we expand C xx (τ) where xx (0) denotes the k-th right-sided derivative of the correlation function at zero time lag.Here, we have taken into account that the auto-correlation function is an even function.Furthermore, we have not included the first-order term c 1 |τ| because the derivative C xx (0) = C x ẋ(0) must be zero for differentiable processes x(t), i.e. for velocities ẋ with finite variance.For example, the one-dimensional Ornstein-Uhlenbeck process is excluded because it exhibits a kink in its auto-correlation function C xx (τ) ∼ e −|τ|/τcor at zero lag (i.e.c 1 < 0) implying an infinite variance of the velocity, σ 2 ẋ = −c 2 = ∞, and hence a diverging up-crossing rate, Eq. ( 88).This divergence arises for any one-dimensional Langevin dynamics, for which the velocity ẋ exhibits a white noise component, and reflects the fractal nature of Markovian diffusion processes [60].In the following, we distinguish three cases, all of which have occurred in previous studies: (i) c 3 = 0 corresponding to a kink in the velocity correlation function C ẋ ẋ(τ ) = C xx (τ).This case is considered in the present work as well as in previous models [41,68,54,56].(ii) c 3 = 0 and c 5 = 0 corresponding to a kink in the correlation function of the acceleration ẍ(t), as in [41].And (iii), c 3 = 0 and c 5 = 0 which occurs, e.g., for smooth correlation functions as used in [58,66].
In the first case, c 3 = 0, i.e. when C xx (τ) has a kink at zero lag and thus the acceleration ẍ has infinite variance as in our model Eq.(1), we find in lowest-order in τ This expression recovers a previous result obtained in [59].Furthermore, the case c 3 = 0, c 5 = 0, yields the following lowest-order behavior To the best of our knowledge, this expression is a novel result.Finally, the third case c 3 = 0 and c 5 = 0, yields in lowest-order which has been reported before [59].In the derivation of Eqs.(90)-(92), we have used the Gaussian integral

B Chizhov-Graham theory
An elegant approximation for the FPT problem has been put forward by Chizhov and Graham [25,26], which we will state here without proof.

Fig. 2
Fig. 2 Correlations of level crossings of a stationary process x(t).(a) Normalized auto-correlation function R(τ) ≡ R(t,t + τ) as a function on the time lag τ (in units of τ 1 def = τ m = γ −1 , τ = 0) for constant barriers b (as indicated on top) and small time constant τ y = 0.4τ m .The solid magenta lines show the exact semi-analytical result obtained from numerical integration of Eq. (81) and the blue dashed lines shows the exponential approximation, Eq. (30), respectively.(b) Same as (a) but with τ y = 2.5τ m .(c) Correlations in the limit of vanishing time lag, R(0) = lim τ→0 R(τ), as a function of the time scale ratio τ 2 /τ 1 = τ y /τ m for three different constant ( ḃ = 0) threshold levels b (as indicated).(d) Correlations for vanishing time lag as a function of the instantaneous threshold level b(t) for three different slopes ḃ(t) (at τ y = 0.4τ m ): decreasing thresholds lower probability of observing two infinitesimally close level crossings (blue dashed line), whereas increasing threshold increase this probability (finely dashed red line) compared to constant thresholds (solid green line).In all panels, black dotted lines indicate the zero baseline corresponding to a Poisson statistics.
Fig.4First-passage-time density, survivor function and hazard rate under non-stationary driving of a neuron that fired its last spike at time t = 0. (A) Weak subthreshold stimulus µ(t) (top panel) leads to a mean membrane potential response u(t|0) below threshold at V T = 1 (second panel).The first-passage-time density P(t|0) for the first threshold crossing of V (t) is shown in the third panel (gray circles: MC simulations of 10 6 trials; red solid line: Chizhov-Graham theory, Eq. (7), (101); blue dashed line: first-order decoupling approximation (independent upcrossings), Eq. (45),(43); blue solid line: second-order decoupling approximation (correlated upcrossings), Eq. (46),(43).The survival probability S(t|0) = −dP(t|0)/dt and the corresponding hazard rate λ (t|0) are shown in the two bottom panels, respectively.For MC simulations, the hazard rate is computed from the ratio λ (t|0) = P(t|0)/S(t|0).(B) The same for a suprathreshold stimulus, for which the mean membrane potential u(t|0) reaches the threshold.In both figures, τ s = 4 ms, τ m = 10 ms and σ η is such that the standard deviation of V is σ V = 0.25.

Fig. 5
Fig. 5 Error of the theoretical approximations for different stimulus properties.The error is measured as the Kolmogorov-Smirnov distance D between the theoretical and simulated ISI distribution.The stimulus µ(t) driving the LIF model is sampled from an Ornstein-Uhlenbeck process with mean μ, standard deviation 1 + τ m /τ µ σ and correlation time τ µ .(a) Color-coded value of D as a function of μ and σ for a rapidly varying stimulus, τ µ = 1 ms (left: 1st-order DA , middle: 2nd-order DA, right: Chizhov-Graham theory).(b) Same as (a) but for a moderately fast stimulus, τ µ = 10 ms.(c) Same as (a) but for a slow stimulus, τ µ = 100 ms.Other parameters as in Fig. 4. indep.upcross.correl.upcross.