Emergent dynamics in a model of visual cortex

This paper proposes that the network dynamics of the mammalian visual cortex are highly structured and strongly shaped by temporally localized barrages of excitatory and inhibitory firing we call ‘multiple-firing events’ (MFEs). Our proposal is based on careful study of a network of spiking neurons built to reflect the coarse physiology of a small patch of layer 2/3 of V1. When appropriately benchmarked this network is capable of reproducing the qualitative features of a range of phenomena observed in the real visual cortex, including spontaneous background patterns, orientation-specific responses, surround suppression and gamma-band oscillations. Detailed investigation into the relevant regimes reveals causal relationships among dynamical events driven by a strong competition between the excitatory and inhibitory populations. It suggests that along with firing rates, MFE characteristics can be a powerful signature of a regime. Testable predictions based on model observations and dynamical analysis are proposed. Electronic supplementary material The online version of this article (doi:10.1007/s10827-013-0445-9) contains supplementary material, which is available to authorized users.

connected to orientation domains of similar orientation-preference within other hypercolumns. We will refer to connections between cluster C jk and clusters C j k (i.e., clusters within the same hypercolumn) as local connections, and connections between C jk and C jk , k = k, as long-range connections. See Fig 1 in the main text for an illustration of our network architecture.
Each cluster has N E excitatory and N I inhibitory neurons. In order to focus clearly on causal neuronneuron interactions we model each neuron using single-compartment leaky integrate-and-fire equations in which the state-variables are membrane-potential V and a slowly decaying excitatory synaptic conductance g syn,E slow (intended to represent NMDA-type excitation [61][62][63][64][65][66][67][68][69][70]). We also include typical fast synaptic currents associated with AMPA-type excitation and GABA-A type inhibition, but these currents are modeled with instantaneous rise and infinitely fast decay, thus obviating the need for additional state-variables to track these quantities. We have made this simplification within our model in order to clearly distinguish the fastsynaptic dynamics from the slower dynamics associated with leakage currents and slow-synaptic currents, thus accentuating the causal relationship between a cell and its post-synaptic targets. Nevertheless, our major results remain valid even when the fast-synaptic time-scales are not infinitely short (see, e.g., section 2.1).
The state-variable V of a typical neuron n evolves according to a set of ODEs which can be described as follows: In these equations V L = 0, V E = 14/3, V I = −2/3 (expressed in non-dimensional units) are reversal potentials associated with the leakage-, excitatory-and inhibitory-currents, and the leakage-timescale τ V = 20ms [71]. The voltage V of neuron n evolves continuously until V (t) crosses a membrane-potential threshold V T = 1, at which point this neuron fires, and its voltage V (t) is reset to V L = 0, and held there for a time τ ref .
The term g input fast models the excitatory conductance associated with feedforward input to neuron n (see Eq 5 below). The term g syn,E slow models the slow excitatory synaptic conductance, and the term g syn,Q fast models the fast Q-type synaptic conductance, Q ∈ {E, I}.
The slow-excitatory conductance g syn,E slow for the neuron n is given by where W nn slow describes the coupling strength between neuron n and neuron n, m n is a sum of delta-functions representing the firing activity of neuron n , and α E slow (t) is an alpha-function with instantaneous rise-time and a decay-time of τ E slow ∼ 50 − 150ms. In a similar fashion, the fast Q-type conductance g syn,Q fast , Q ∈ {E, I}, is given by The alpha-function α Q fast (t) has instantaneous rise-time and an infinitely fast decay-time of τ Q fast = 0. Thus, each time a neuron fires, the fast synaptic-conductances create an instantaneous jump in the voltage of the postsynaptic neuron and multiple neurons can be driven to fire at any instantaneous point in time.
Now we must emphasize that, due to the singular nature of Eq 4 (as well as Eq 5 below), the Eq 1 is not actually a well-posed ordinary differential equation. We model the instantaneous currents g syn,Q fast by starting with fast time-scales τ Q fast which are not zero, and then taking the formal limit as these time-scales, as well as τ ref , tend to 0 + . The order in which we take these limits affects the dynamics. We assume that τ ref τ Q fast , so each neuron can only fire once at each instant in time, and τ I fast ∼ τ E fast as τ E fast , τ I fast → 0 + . This assumption allows for a biologically realistic competition between excitatory and inhibitory populations.
The coupling strengths W nn fast and W nn slow encode the connectivity of the network, n being the transmitting and n the receiving neuron. We assume that the neurons are sparsely connected, so many of these weights will be 0. Since our model incorporates a slow excitatory synaptic current but not a slow inhibitory synaptic current, W nn slow = 0 when n is inhibitory. Also, since only excitatory cells broadcast long-range connections, W nn fast = W nn slow = 0 when n is inhibitory and in a different hypercolumn than n. If two neurons are connected, then we assume that their connection strengths depend only on (i) their types, (ii) whether or not they share the same orientation preference, and (iii) whether or not they lie in the same hypercolumn. That is to say, given two connected neurons n and n of types Q and Q in clusters C jk and C j k respectively, W nn fast and W nn slow depend only on Q and Q , δ jj and δ kk . Thus for pairs of neurons with δ kk = 1, there are 6 local-coupling parameters, which we denote by S Q,Q fast and S Q,E slow . If δ jj = 1, there are 4 long-range-coupling parameters, which we denote by L Q,E fast and L Q,E slow . For ease of notation we will later use S QQ to refer to S Q,Q fast , and L Q to refer to L Q,E slow . We will also refer to g syn,E slow as g slow and τ E slow as τ slow .

Two types of input
We will consider two general classes of input g input fast for this system: (i) an input which models the unstimulated or 'background' state of the cortex, and (ii) an input which models a 'stimulus' intended to represent a drifting grating of some diameter. Both the background drive and the stimulus to neuron-n are modeled by an input of the form where the spike-times T drive h are chosen from a Poisson process with rate η Q,j,k,l (t) that depends on Q, j and k, as well as on the neuron's index l. In background the rate of this Poisson input is chosen to be constant across all orientations j and hypercolumns k, as well as all indices l. The stimulus is modeled by increasing this Poisson input drive to selected clusters. For example, in order to model a drifting grating stimulus of orientation θ j with a small spatial diameter, we increase the drive to the j th cluster in a single hypercolumn 1 . If, on the other hand, we wish to model a drifting grating stimulus of orientation θ j with a very large spatial diameter, we would increase the drive to the j th cluster of every hypercolumn. Thus, in general, the drive, i.e., the rate of the Poisson input, to each cluster is given by a cluster-independent background drive plus a cluster-dependent stimulus-specific drive: We simulate the type of input a 'simple'-like cell would receive under a drifting-grating stimulus by setting η stim Q,j,k,l (t) = C Q,j,k η bkg Q (1 + sin (ω (t − φ l ))) .
In this expression C Q,j,k represents the effective contrast of the stimulus as perceived by the neurons in question, and is high only for those clusters j, k which would be influenced given the orientation and size of the drifting-grating. We choose C E,k,j = 3C I,k,j to simulate the excess of orientation-tuned feedforward input to the excitatory population which may arise from feedback from other layers [33]. The frequency ω (which we take to be 4Hz) is the temporal-frequency at which the grating passes over the cell's receptive field center, and is related to the spatial-frequency of the grating, as well as the grating's drift speed. The phase φ l is related to the spatial-phase preference of the cell, and is randomly distributed within each cluster [8,69,73,74]. We will not consider the distinction between simple and complex cells, except to note that our main conclusions do not change when we take η stim (t) = C [75].

Constraints from physiology
The following are parameters in our model: (a) network size (J, K, N Q ), (b) sparsity of connections, (c) synaptic coupling strengths (S QQ fast , S QQ slow , L Q,E fast , L Q,E slow ), (d) time-scale of slow-excitation τ slow , and the ratio τ I fast /τ E fast , (e) input (S Q drive , η bkg Q , η stim Q,j,k ). Many of these parameters are constrained in one way or another by the physiology and architecture of the real V1 as we now discuss.
(a) and (b): With regard to network size and connectivity, a typical small patch of layer 2/3 of V1 with a radius comparable to an ocular dominance column comprises ∼ 5 − 10 hypercolumns, each comprising a continuum of orientation domains [7,22,76,77]. To grossly capture the facts that (i) each orientation domain is influenced locally by domains of differing orientation preference, and (ii) each orientation domain is influenced laterally by multiple nearby orientation domains of similar orientation preference, we choose K = 8 and J = 3. Our results do not change qualitatively if we include more hypercolumns or resolve orientationspace more finely 2 . One feature not incorporated is the spatial structure of the long-range connections across scales of several mm -in our model long-range connections between pairs of hypercolumns are statistically identical. A typical orientation-domain within layer 2/3 of cat V1 has ∼ 700 excitatory cells and ∼ 200 inhibitory cells [5,[78][79][80][81] 3 and many types of inhibitory neurons are known to arborize densely [82][83][84][85]. We use N E = N I = 128 for the simulations and parameter-sweeps presented in this section; too small an N Q (e.g. N Q = 8) will not produce a dynamic regime with statistics that are physiologically acceptable. We take N I comparable to N E to reflect the fact that inhibitory neurons arborize more densely The regimes (and underlying mechanisms) discussed here persist for larger values of N Q . For an example, see section 4.3.
Experiments also indicate that excitatory neurons are not connected together in an all-to-all fashion [32]. We use a network connectivity of about 25%.
(c) and (d): The synaptic strengths are constrained by several electrophysiological experiments within V1, which indicate that typical EPSPs and IPSPs are not much larger than 1.5mV [4,9,86,87]. Thus a typical cell near reset will require ∼ 10 − 20 synaptic excitatory 'kicks' (in quick succession) to be driven over threshold. This constraint places an upper bound on the local and long-range coupling strengths S QQ fast and L Q,E fast . The time-integrated slow-excitatory EPSPs observed in the cortex are not much larger than the time-integrated fast EPSPs [64,68,70,88]; that too will impose constraints in our parameter choices. The time-scale of slow-excitation in V1 varies from ∼ 40 − 120ms [61,63,65]. In our model we use τ E slow = 128ms for most of our simulations, but this choice is not critical to our results. With regard to the fast time scales, the ratio τ I f ast /τ E f ast , which determines the order in which events are resolved, is taken to be 1; our results are not especially sensitive to this ratio provided it is not too large or too small.
(e): The input to our system is similarly constrained by experiments. Within the real V1 the EPSPs due to input are rarely large, and consist of many micro-EPSPs as well as many feedback and feedforward connections from other layers [89]. Moreover, it is observed that when V1 is driven by stimulus, a large portion of the feedforward input is orientation-tuned and may also preferentially target the excitatory population [33,90]. Thus, we constrain our system by requiring that (i) the magnitude of S Q drive is small (say, < S Q,E fast ), and (ii) the additional input associated with an oriented visual stimulus should affect the excitatory population more than the inhibitory population (i.e., η stim E,j,k > η stim I,j,k via C E,j,k > C I,j,k when cluster j, k is stimulated). In V1 it is observed that typical dynamic regimes in background produce excitatory firing-rates of a few Hz, with inhibitory firing-rates that are not much higher [27,[91][92][93]. This places a lower bound on the input 2 While our discretization into J = 3 orientation domains per hypercolumn is somewhat of an idealization, the local interaction between these clusters qualitatively captures the continuum of orientation-domains in V1, and we are not sure if a choice of J > 3 would be substantially more realistic. Specifically, while the division of V1 into orientation domains is well documented, the local architecture associated with these domains is not as well documented. The 'messiness' in the local architecture seems far greater than that assumed in a linearly parametrized ring model [32]. Moreover, the orientation tuning seen in many cells in V1 is quite broad [96,134], with a width at half-height comparable to pi/3. Thus, there are probably no more than 4 − 7 clearly distinguishable orientation domains in the real V1. Our choice J = 3 seems like it can coarsely capture the 'activated', and 'inactivated' sections of any given hypercolumn under strong grating stimulus, while allowing us to coarsely canvass the emergence of background patterns. Increasing J in our model would require the introduction of extra parameters, which we did not feel could be appropriately constrained. 3 There are roughly 40, 000 neurons per mm 3 in the cat V1, and layer 2/3 is about 0.38mm thick. Thus, there are around 14, 000 cells per mm 2 in layer 2/3. There are between 4-7 orientation pinwheels per mm 2 in the cat, so there are around 2800 neurons per orientation pinwheel. With each pinhweel divided into 3 orientation domains, there would be roughly 700 excitatory neurons and 200 inhibitory neurons per orientation domain. current to our system. If the combination S E drive · η bkg E is too low, then our system will not fire sufficiently frequently in background. Also, even though our model is structured specifically to allow many neurons to fire concurrently, we do not consider parameter regimes which admit synchronous firing or firing events that involve too a large fraction of the population, as that is unbiological. See Fig 1. Last but not least, many different experiments have demonstrated that V1 typically operates in a regime of high-gain [44][45][46][47][48][49][50][51][52][53][54][55][56][57][58][59][60]. We build that into our model by requiring that the background input to the system places the network in a state that can be easily excited by stimulus. Note that, unlike the restriction associated with background firing-rate (which constrains S E drive ·η bkg E ), the restriction associated with a high-gain regime places an upper bound on S E drive . If S E drive is too large, then the system will respond linearly as η E,j,k increases [94]. Within a high gain regime S E drive is small, η E,j,k is relatively large, and each neuron receives many relatively weak input EPSPs. This type of input ensures that the membrane-potential of each neuron is often close to V T , and that a small additional input can drive many neurons to fire.
How good is the model described above? One way to find out is to try to replicate some of the empirical observations of V1 that are not direct consequences of the modeling, i.e., they are not directly implied by our model architecture or by the rules of dynamics. The following sections discuss several such empirical observations. In each case, we will demonstrate (via simulations) that the phenomenon occurs for suitable choices of parameters, give a sense of how they constrain the network, and discuss briefly the main features of the dynamical regimes. Our aim is to produce, at the end of section 5, regions in parameter space each point of which corresponds to a model which exhibits all of the phenomena considered.
Recall that S QQ fast is abbreviated as S QQ , and L QE slow as L Q .

Fast time scales and MFEs
In the main text of this paper we often refer to MFEs (multiple firing events), which are idealizations of a concept that can only be defined precisely in the limit τ E → 0, for in this limit, no other forces can intervene between the start and completion of a synaptic event. Taking τ E to be a small but strictly positive number will introduce a small amount of fuzziness, in that it may not always be possible to determine whether a spiking is 'directly caused' by another spike. As long as τ E and τ I are significantly smaller than the other timescales in the system, multiple firing events should remain clearly identifiable, if not precisely defined. In Fig 2 we show that MFEs can manifest within more realistic clusters of conductance-based LIF neurons for which the fast synaptic conductances have timescales τ E , τ I > 0.

Conduction velocity and synaptic delays
The horizontal conduction velocity in mammalian visual cortex has been measured to lie between 0.1 and 0.4 mm/ms, with a mean of roughly 0.3 mm/ms. In other words, the delay due to horizontal transmission is roughly 3 − 10 ms/mm [4,18,97,98].
For simplicity, the model we discuss in the main text does not include any transmission delay between hypercolumns, nevertheless our main results hold even if we were to include such a delay. Specifically, the dynamic regime of our model continues to generate plenty of MFEs even after we incorporate transmission delays of 5 − 10 ms between hypercolumns (note that with the relatively small spatial extent of our system different hypercolumns are ∼ 1 − 2mm apart). The reason why our results are robust with regards to transmission delay is that, in the regime we discuss in the main text, the majority of the MFEs result from fast synaptic interactions within individual orientation domains. The long-range horizontal connections are important for setting the mean level of input to the clusters, but play a subsidiary role in the actual nucleation and execution of most MFEs. Shown in Fig 3 is a dynamic regime associated with our model when we incorporate long-range transmission delays of 9ms between hypercolumns. The coupling strengths of the system have been adjusted by ±10% so that the regime still retains the same dynamical properties as discussed in the main text. The individual orientation domains still generate MFEs, and the MFEs are still responsible for the large-scale dynamic phenomena (e.g., spontaneous background patterns and surround suppression).

Orientation selectivity
A large fraction of the excitatory and inhibitory cells in real V1 are orientation-selective, and V1 itself is organized into iso-orientation domains [10,96,[99][100][101]. Each iso-orientation domain is ∼ 0.25mm in diameter and is modeled in our network via a single cluster C jk . Two commonly observed features of orientation selectivity are that (i) excitatory cells tend to be sharply tuned for orientation, and (ii) inhibitory cells tend to be weakly to moderately tuned for orientation. We will try to capture the gross phenomenon of orientation selectivity in our model in the following way: we assume that a parameter regime exhibits physiologically reasonable orientation selectivity if (i) a stimulation of a subset of clusters corresponding to a given orientation (say, θ 1 ) does not increase the excitatory firing-rates within the other orientations (i.e., θ 2 , . . . , θ J ) above their unstimulated background level, and (ii) when a particular orientation is stimulated, the inhibitory neurons associated with that orientation fire more than the inhibitory neurons associated with other unstimulated orientations.

Localized receptive fields
In real V1, it is a fact that presentation of a grating at a cell's preferred orientation, but away from its receptive field center, will not produce additional spiking activity; there may be associated subthreshold activity but it does not give rise to firing-events [33,104,105]. A commonly held assumption which can rationalize this phenomenon is that the connectivity within V1 is dominated by local circuitry, implying that activation of the lateral/long-range circuitry alone will not be sufficient to cause firing events (although long-range excitation can give rise to a measurable increase in the average subthreshold voltage of areas far from the stimulated site -see [23,33,106]). For our model, this translates into the following: we assume that a parameter regime exhibits localized receptive fields in a physiologically reasonable manner if, when a particular cluster is stimulated, excitatory neurons in other clusters corresponding to the same orientation are not driven to fire appreciably more than they would in background.

Sample results and constraints on network parameters
To illustrate the ideas above, we first present a sample of simulation results showing the excitatory and inhibitory firing rates in a cluster we call C jk , i.e., a cluster with orientation j and located in hypercolumn k. Caption for Table 1. Here we present steady-state excitatory and inhibitory firing-rates for a typical cluster C jk in our model, recorded under a variety of stimulus paradigms. 'Unstim' means no cluster is stimulated. 'Stim C mn ' means C mn is the only cluster stimulated, and in the data shown, j = j, k = k. In each case involving stimulation the contrast (or stimulus magnitude) is relatively low, and the responses of our system have not saturated.
The data shown illustrate the following: Both excitatory and inhibitory firing in C jk are elevated significantly when C jk is directly stimulated (from 2.7 to 31 for E, from 6.6 to 36 for I). Excitatory firing rate in C jk when C j k , j = j, is stimulated is significantly lower than when no cluster is stimulated (1.0 vs 2.7), illustrating the fact that excitatory neurons are sharply tuned for orientation. Inhibitory firing rate in C jk when C j k is stimulated is lower than when C j,k itself is stimulated (5.4 vs 36), illustrating the fact that inhibitory neurons are weakly tuned. Finally, excitatory and inhibitory firing rates in C jk in background are not significantly lower than when C jk , a cluster of the same orientation but in a different hypercolumn, is stimulated (3,7 vs 2,12), illustrating the idea of localized receptive fields [107].
We discuss next how the properties discussed above restrict network parameters: With respect to connectivity, they tell us that, within a hypercolumn, there must be connections across orientation domains; at the same time, these connections should be no denser than within a cluster. This is the case in our model.
With respect to constraints on synaptic couplings, the properties discussed in this section do not impose strong constraints on our system beyond those already imposed at the end of Section 1, and they are mild compared to those to be imposed in the next two sections. Mainly, the orientation selectivity properties above suggest that local inhibition extends to other clusters within the same hypercolumn, whereas local excitation alone cannot cause other clusters within the same hypercolumn to fire. This suggests that S EE should not be large relative to S EI and S IE . The receptive field property suggests that (i) direct stimulation affects E-neurons more than it does E-neurons in other hypercolumns, i.e. L E cannot be too large relative to L I , and (ii) the net effect of long-range excitation should be suppressive, i.e. L I > L E . These restrictions are not severe; they make the viable parameter corridor in panel

Sharp tuning
Finally, we comment on an an additional phenomenon regarding orientation tuning -sometimes referred to as 'sharpening'. The orientation tuning observed in the real V1 is rather sharp and various theoretical investigations have pointed out that this sharp tuning can manifest even when the feedforward inputs into V1 are not sharply tuned. Moreover, modeling work has indicated that the recurrent connectivity within V1 itself may be sufficient to give rise to such a phenomenon (see, e.g., [138]). Our model, as it currently stands, only coarsely resolves orientation space and is not quite detailed enough to clearly investigate this sharpening; we cannot easily disambiguate sharply tuned inputs/responses from broadly tuned inputs/responses.
Here we show that, by increasing the number of orientations in our model, we can indeed qualitatively capture sharpening. We present an example in which we increase J from 3 to 4, and retune the local coupling strengths to maintain the same type of regime discussed in the main text. This J = 4 model has fewer inhibitory cells per cluster (64 instead of 128), but otherwise has the same architecture as the J = 3 model we discuss in the rest of the paper. In terms of the dynamical regime, we retune the coupling strengths to ensure that (i) MFEs still manifest, and (ii) these MFEs still give rise to spontaneous background patterns, surround suppression, the appropriately structured LFP, etc. As shown in Fig 4A, the MFE distribution for this model has a long tail -MFEs of small to moderate size are typical.
With this new J = 4 model we can present weakly tuned input, and measure how sharply tuned the output firing rates are. More precisely, we choose one orientation, say, θ 3 , and stimulate 3 clusters of this orientation (i.e., we choose η stim Q,3,k,l as described in section 1.2 with strength C E,3,k = 0.75 and C I,3,k = 0.25, for k = 1, 2, 3, and with strength 0 for k > 3). We then stimulate the first 3 clusters of the 'adjacent' orientations θ 2 and θ 4 half as strongly (i.e., with C E,2,k = C E,4,k = 0.375 and C I,2,k = C I,4,k = 0.125 for k = 1, 2, 3). We do not stimulate the final 'orthogonal' orientation θ 1 at all -the only stimulus received by θ 1 is the background η bkg Q . This input is broadly tuned, as can be seen in the Fig 4B. We then measure the time-averaged output firing-rates across the first three clusters of each orientation. This data is also shown in Fig 4B, and is much more sharply tuned than the input. This behavior is typical for this model, and persists when we change the number of clusters stimulated or the overall strength of the stimulus.

Spontaneous correlated background activity
It has been observed in several mammals (e.g., ferret, cat, and monkey) that, when unstimulated, i.e., in background, the activity in the visual cortex is not homogeneous, but has spatial and temporal structure. The spatial correlations are on the order of 1-2mm and the temporal correlations are on the order of 50-500ms [35-37, 108, 109]. It has also been found that, much of the time, the distinct cortical regions which exhibit correlated behavior are orientation domains of similar orientation preference separated by ∼ 0.5 − 2mm [23,35].

Model regimes exhibiting realistic V1 background activity
By interpreting a collection of experiments [33,[35][36][37][38][39][40]108], we propose that the following dynamic picture of our network correctly represents the phenomenon observed: When unstimulated, our network should admit epochs of sustained activity across a collection of orientation domains with equivalent orientation, i.e., a collection of clusters {C jk , j = θ} for some θ. When a pattern is activated, the sustained activity corresponds to increases in both mean firing-rate and mean subthreshold voltage, and clusters with a different θ are usually not concurrently activated during this period (though some concurrent patterns are permitted from time to time). Any given sustained activity pattern can persist for ∼ 50 − 500ms before either decaying to a homogeneous state or evolving into another sustained pattern associated with a different θ. Fig 2 in the main text illustrates a typical background dynamic regime for our model. Here we also replot the psd of the LFP for this background regime on a log-log scale (see Fig 5).
Notice the following characteristics of the dynamical regimes we propose as candidates for exhibiting realistic background activity: First, the durations of the patterns are quite random and sometimes ill-defined, but they have a characteristic persistent time τ persist bkgrnd which can be considerably longer than τ slow ; this is consistent with several experimental observations [35,108,109]. In the regime shown, τ persist bkgrnd ∼ 300 ms, compared with τ slow = 128 ms. It is not hard to make τ persist bkgrnd last as long as 1-2 s by increasing L E and decreasing the local couplings One may be concerned that τ persist bkgrnd may increase with system size, and therefore τ persist bkgrnd may become unphysically large if the cluster size in our model were increased to match that of the real V1. This is a legitimate concern, as such increases in switching times are predicted in a number of theoretical studies [110,112,113]. In our model, however, there are several parameters that can be tuned to control the values of τ persist bkgrnd . We demonstrate below (section 4.3) that by tuning these parameters, switching times can be made longer or shorter as desired for clusters 4 times the size of those used in Fig 2 of the main text. Also notice that, following the onset of each pattern, activity among E-cells in the orientation domain in question continues well past the point when their inhibitory counterparts have been fully activated. This characteristic, which is very much in evidence in panel [A] and also reflected in the shapes of the crosscovariance curves in panel [B], is consistent with experimental measurements [39,40].
One reason we draw attention to the two characteristics in the last paragraph is that they are at odds with [111], which previously proposed a different network regime also exhibiting correlated background activity. Unlike ours, their regime is highly inhibition-driven. The model network described in Section 1 is, in fact, flexible enough that it can also support the scenario described in [111] -for a different set of parameters than the ones proposed here, of course. We believe the scenario presented in this subsection fits better with experimental data.

Constraints on network parameters
The phenomenon associated with the background regimes discussed above places quite a strong constraint on the parameters of our model. The emergence of patterns hinges on the ability of the E-population to nucleate MFEs (which however must not involve too large a fraction of the cluster since such synchronous firing is not typically observed in functional regimes of real V1). That in turn depends on a number of factors, notably the strengths of the background drives S Q drive and η bkg Q (see Section 1.2), and the local couplings S QQ fast . Other things being equal, the closer to threshold a system operates, the more readily it generates MFEs. We have chosen the background drive in such a way the system is operating quite close to threshold. Longrange fast excitation L E fast must not be too small as it is responsible for recruiting clusters in multiple hypercolumns. Once recruitment takes place, τ bkgrnd persist depends strongly on the ratio of L I slow to L E slow (S E slow and S I slow are fixed for definiteness). If L I slow is too small and L E slow is too large, then an orientation which has recruited will dominate nearly indefinitely (i.e., τ bkgrnd persist → ∞). In general, for the regimes we have investigated, a larger ratio of L I /L E leads to a more rapid interference with the E-dominated MFEs, hence a shorter τ bkgrnd persist . Finally, S QQ fast and the local connectivity across clusters have the greatest influence on non-iso-oriented exclusion, which also affects τ bkgrnd persist and the patterned activity. It is obviously impossible to balance all of the relationships in the last paragraph 'by hand'. To locate the relevant parameter regions, we performed many parameter sweeps, using the results of simulations to help identify parameters with desired properties. See Fig 6, which tracks the same two panels shown in Fig 1.

Background patterns in larger systems
The switching time τ bkgrnd persist of the regime exhibiting background patterns shown in section 4 does not necessarily increase with the system size. If the sizes N E , N I of the clusters in our model are increased, then the network parameters can be retuned slightly to maintain qualitatively similar background fluctuations with a comparable switching timescale τ bkgrnd persist . In Fig 10A we show background patterns obtained from a system very similar to the one shown in Fig 2 of the main text, except that N E and N I are increased to 512 and the coupling parameters have been retuned slightly. As can be seen in Fig 10A, τ bkgrnd persist for this regime is ∼ 300ms. By taking this regime and modifying the local coupling strengths S IE + S EI and S EE by ±50%, as done in Fig 6, we can obtain qualitatively different regimes for which τ bkgrnd persist can be larger or smaller than 300ms. Examples of these other regimes are shown in Fig 10B,C.

Surround suppression of inhibitory and excitatory firing-rates
It has been observed that the firing rate of many excitatory cortical cells is lower when stimulated with an optimally-oriented full-field grating than with an optimally-oriented small grating subtending only the neuron's classical receptive field [31,42,43,117,118]. Typical theories regarding this type of effect invoke long-range or lateral connectivity (or feedback), as well as a strong inhibitory agent which is triggered by the activation of the surrounding hypercolumns, and which serves to suppress excitation within the central/measured hypercolumn [34,119]. As there is no strong evidence for long-range inhibition within the cortex, the typical assumption is that the inhibitory mechanism leading to surround suppression must involve long-range excitation of the local inhibitory circuit. As such, many theories regarding surround suppression predict that, as the surround is stimulated, local inhibitory firing rates in the center would increase, thus suppressing excitatory firing-rates in the center.
Recent results of Ozeki et al. [42,120] challenge the view expressed at the end of the last paragraph: their intracellular measurements infer that, when the surround and center are both stimulated with an optimallyoriented grating, both inhibitory and excitatory firing rates are lower than the corresponding rates when only the center is stimulated.
We mention also another set of results, by Haider et al. [43], which finds, using images that are rich in spatio-temporal structure, an increase in the firing-rate of fast-spiking interneurons when one stimulates the surround in addition to the center.
In this section we will focus primarily on iso-oriented surround suppression a la [42]. A few related results including contrast dependence and other kinds of stimuli are discussed in section 5.2.

Network regimes exhibiting surround-suppression a la [42]
We fix in our network an orientation j and a hypercolumn k. It will be assumed throughout that all stimulations take the form of drifting gratings with orientation j. We view the kth hypercolumn as the 'center', and any subset of the other hypercolumns as the 'surround'. (Recall from Section 1 that connectivity between all pairs of hypercolumns are statistically identical in our model.) We will compare the firing rates of excitatory and inhibitory neurons in the center, i.e. in the cluster C jk , in the following two scenarios: In Scenario 1, we stimulate C jk alone; in Scenario 2, we stimulate some subcollection of {C jk , ∀ k } including C jk ; for simplicity, we denote the latter by ∪ k C jk .
Following Ozeki et al. [42], we say a regime exhibits surround-suppression if, under strong stimulation, there is decreased firing for both excitatory and inhibitory neurons in C jk as one goes from Scenario 1 to Scenario 2. All stimuli considered in section 5.1 are assumed to have high contrast.
We will demonstrate that there are parameter regions for which the system exhibits surround-suppression in the sense above. A sample of results is shown in Table 2, and a more detailed illustration of this phenomenon is presented in Fig 5 in  Caption for Table 2. Here we present steady-state excitatory and inhibitory firing-rates for a typical cluster C jk in our model, recorded under a variety of stimulus paradigms. 'Unstim' means no cluster is stimulated. 'Stim C jk ' means C jk is the only cluster stimulated, and 'Stim ∪ k C jk ' means that several clusters of the same orientation (including C jk ) are stimulated. Fig 7 shows two panels analogous to those in Fig 6, illustrating the parameter constraints placed on the network with regard to surround-suppression a la [42]. One sees from Fig 7 that, in each of the panels, the intersection of the grey region with the domain bounded by the solid black curve is nonempty. This is the set of parameters which simultaneously satisfy all requirements in the studies reported in the main text of this paper. To get a rough idea of the size of this viable parameter region, see Fig 7, in which the highlighted square region shows a two-fold variation in parameters.

Contrast dependence
We report here on some very preliminary results regarding the response to stimuli of low-versus highconstrast, for optimally oriented drifting gratings of increasing diameters. We think of the constant C E,k,j in our external input (see Sect. 1.2) as varying from 0 to 1; with 0.125 corresponding to low and 0.5 to high contrast; these numbers are chosen to elicit roughly desirable firing rates as seen in real V1. Since there is no notion of geometric distance among hypercolumns in our model, we represent the 'diameter' of a stimulus by the total number of clusters stimulated. That is to say, we fix an orientation j, think of C j1 as the 'center' or cluster 1, and study the firing rates in cluster 1 when clusters 1 through n, i.e., C j1 , . . . , C jn are stimulated. Plots of E-firing rates as functions of n for a few different contrasts are shown in Fig. 3A in the main text. These results show some resemblance to the experimental data in [41].
The effects of a high contrast stimulation are discussed in previous sections. For very low contrast, one may rationalize the small increase in firing rate in cluster 1 as n goes from 1 to 2 as follows: For a system that operates very close to threshold but in a low firing regime (such as in background or when very weakly stimulated), there is a tendency to nucleate MFEs when any amount of additional excitation is received, and MFEs lead to the formation and reinforcement of a theta-pattern, which leads to increased firing rates. On the other hand, long-range effects are, on average, suppressive. It appears that the tendency to form MFEs outweighs this suppressive effect for a system in a low enough firing regime, but the scale is tipped (quickly) as one raises its level of excitation. That would explain the initial increase in firing rate, as well as the decrease for n > 2 for the regime shown in Fig 3A in the main text. This thinking is corroborated by simulations.

Regimes in which I-firing increases with grating diameter
For nearly all reasonable parameters (not necessarily close to the patch considered in this paper) that we have investigated, excitatory firing rates at the center drop very definitively as we stimulate the surround in addition to the center. This, however, is not true for inhibitory firing rates, which can go in either direction. That is to say, there exist sizable regions in parameters away from the patch we consider on which our model exhibits behavior opposite to that described in Sect. 5.1 with respect to inhibitory firing. One way to get into such a regime is to start from our reference parameters, and alter the coupling strengths S EI + S EI as well as the input η bkg I in order to decrease the relative fraction of I-firing that is triggered by MFEs. The fact that our model supports such a range of behaviors is testimony to its versatility; it also points to the importance of constraining parameters by experimental data and the need for further work.

Stimulus type influences surround-suppression
Our model is not capable of fully capturing the results of Haider et al. in part because their work studied fastspiking interneurons, which are not well modeled by the integrate-and-fire neurons we use. More importantly, the stimulus used in Haider et al. is in the form of a 'natural movie', with broadband spatio-temporal structures, and not a drifting grating with a fixed orientation (which is the only type of stimulus used in all of our results so far). However, for the same set of parameters used to generate the inhibitory surroundsuppression shown in Fig 5 of the main text, we have seen that a full-field stimulus that affects multiple orientations produces significantly lower firing rates in the center compared to stimuli of comparable strengths directed at a single orientation. This has motivated us to crudely approximate the 'natural' stimuli of the kind used by Haider by a drive of the form: η stim Q,j,k,l (t) = C Q,j,k η bkg Q (1 + sin (θ j + ω (t − φ l ))) , where j varies across all angles, and k varies across the hypercolumns directly affected by the stimulus. Using this 'natural' stimulus, we have found that our reference parameter-set (shown in Fig 5 of the main text) produces a rather different response; the inhibitory cells show a slight surround facilitation, whereas the excitatory cells remain suppressed. These observations are consistent with, respectively, the fast-spiking interneurons and regular-spiking pyramidal cells observed by [43]. An example of this is shown in Fig 8. 6 Comparison with related works A variety of theories have been proposed to explain background phenomena and surround suppression in V1. We compare and contrast our findings with a few previous results.

Rate models
We mention first some results involving firing-rate models which can explain, to varying degrees, spontaneous background activity within V1. Work by [110] applies linear stability analysis to a nonlinear firing-rate model to reveal marginally stable states. By suitably balancing a noisy background input, their system can exhibit a background regime which fluctuates between different marginal modes with biologically realistic timescales. Further work by [110,112,121,122] has investigated linear firing-rate models which are designed to amplify weak inputs to produce transient impulse responses. Using background drives with the appropriate spatio-temporal correlations these models can also exhibit background-fluctuations with biologically realistic spatio-temporal scales.
An important difference between our results and [112] is that our regime does not rely on correlations in the drive to produce patterned activity. While background drive is likely correlated, there is evidence that these correlations may not be directly responsible for the patterns observed in V1 (see [108]). We also differ from [110] in that, though not the most common scenario, it is not rare for multiple orientations to be represented within the same sustained pattern. This phenomenon, present in our regime but excluded in [110], is consistent with [38]. In addition to these differences, the regimes produced by our model exhibit MFEs (see main text). This self-organized collaborative activity among neurons (i.e., the interplay of MFEs and the characteristic times between them) are not readily captured by current firing-rate or populationdynamics models. Because our network does not coarse-grain over individual firing-events, our model can capture the structure of correlated firing-events, both in terms of magnitude and distribution [39,40]. These correlated firing-events may be important for coding [123][124][125][126][127][128][129][130][131] and they cannot be readily captured by firing-rate models.

Network Models
There are network-based models of V1 that are closer in spirit to our approach than the firing-rate models discussed above. One such network model is included in Murphy and Miller [112]; this model exhibits dynamics which are well captured by the firing-rate model discussed in the same paper. Another network model of V1 is Cai et al. [111]. This model, which was used earlier to investigate spontaneous background activity, has influenced some of the thinking in the present paper. The regimes that we regard as correctly reflecting experimental observations, however, differ substantially from those proposed in [111], and we would like to discuss that in some detail.
The most significant difference between the regime proposed in [111] and ours is that the former is largely inhibition-dominated, while the regime proposed here is balanced in a way that permits serious sustained competition between its excitatory and inhibitory populations; in that sense, our regime is poised in a position of much higher gain. These differences manifest themselves clearly in the firing patterns in background: In [111], the initiation of a pattern, called a 'recruitment event', takes the form of a single burst of excitatory firing. Following a recruitment, the regime invokes strong slow-excitation to sustain θ-specific inhibitory firing. Further excitatory firing beyond that one initial burst is prohibited in these post-recruitment periods of θ-specific activity, which often conclude with the initiation of a new recruitment-event within a different θ. See Fig 9. As a result of this dynamic mechanism, the persistence timescale for background patterns in [111] is directly controlled by τ slow , and can never be substantially longer than τ slow . Another byproduct of its mostly-inhibitory firing in background is that this regime has unrealistically low firing rates both in background and under drive.
Our model is flexible enough that it has the capability to produce, in different parameter regions, both the regimes we propose and the scenario in [111]; for the latter, take L I f ast or L I slow >> L E slow . We have a number of reasons, however, for believing that the regimes we propose in this paper more accurately reflect real V1 dynamics: (1) The parameter regions associated with the inhibitory-dominated regime similar to [111] do not exhibit biologically-realistic firing-rates, whereas our regimes do, as ensured by the benchmarking process in section 1.3; see Fig 1. (2) Sustained excitatory firing in background is supported by direct experimental data from [39]. (3) We know of no evidence to suggest that the time scale τ bkgrnd persist should be directly tied to τ slow . On the contrary, spontaneous activity in eye-closed ferret V1 can last many seconds, which is longer than the persistence timescales of many (but not all) types of NMDA receptors [108], and certain types of spontaneous cortical activity persist with similar timescales even under NMDA blockade [115]. (4) Our fourth reason has to do with another V1 phenomenon, namely surround suppression, which we discussed in section 5. Due to their extreme inhibitory suppression, regimes we have found that exhibit background patterns similar to those in [111] exhibit classical surround-suppression of the excitatory population but not surround-suppression of the excitatory and inhibitory population as is observed in recent experiments [42].
For these reasons, we believe that the regimes presented in the main text of this paper are more realistic than the regimes presented in [111], and that the mechanisms discussed in this paper and in [95] are relevant to the functioning of V1.

Appendix
All of the parameters in our model are identified in Sects. 1.2 and 1.3. Many sweeps were conducted to locate regions in this rather high dimensional parameter space. Our search was semi-systematic: We began with some initial educated guesses. At the same time, we attempted to identify 'opposing forces' in the system that need to be balanced. Suppose X and Y form such a pair (examples of which are given below). We pitched them against one another in simulations of regimes corresponding to 2D panels in which both X and Y are varied and other parameters are guessed. Computation results were used to identify pairs (X, Y ) that manifest the desired phenomena. Information on this 2D study (especially when it failed to produce reasonable parameters) gave feedback to our initial guesses, which were successively improved.
Via these parameter sweeps, we located an open region in parameter space with all of the desired properties. This region is reasonably sized: in most directions, one can vary the parameter in question by 40% and remain in the viable set. A specific set of parameters in the interior of this region is fixed and is used in many of the simulations shown. We will refer to this set of parameters as our 'reference set'; its precise values are given below. Our search is by no means exhaustive, and we do not claim that parameters in this region are the only viable ones. Among the choices that we made, the most important are parameters related to background drive: we set them to put our system in a 'high-gain' mode, so it responds sensitively to changes in input (see Sect. 1.3). Two sets of 'opposing forces' we have found to require the most delicate balancing are, not surprisingly, (i) local fast excitation vs inhibition, i.e. the coupling strengths S QQ , and (ii) long-range slow excitation to E vs I-neurons, i.e. the coupling strengths L E vs L I .
Other parameters are relatively stable, which is not to say that they can be arbitrarily fixed (e.g. too small a L EE fast will hamper the generation of MFEs discussed in the main text). They are relatively stable in the sense that once appropriately chosen, varying them by reasonable amounts will not lead to dramatic qualitative changes in the dynamics, though precise values of other parameters have to be adjusted.
With regard to (i), we set S II = 0, our rationale being that a positive S II amounts to weaker inhibition. To further simplify the situation, we chose to vary S EI and S IE together, viewing the combined value of these two parameters as a measure of the effectiveness of the inhibitory population. Thus in (i), we pitch S EE , representing the effectiveness of the local fast excitation, against S EI and S IE , which are varied simultaneously. Among all of the 2D panels we have studied, this is quite possibly the one that placed the most serious constraints on the model. The panel L E vs L I also revealed important constraints, especially with regard to producing data that fit with experimental observations in unstimulated background activity (see Section 4). Constraints on parameters in these two 2D-slices are tracked in successive steps in Figs 1, 6 and 7.

Reference parameter set
The network size is given by:  . To the right of the black curve the system exhibits frequent strong synchronous firing-events, which we regard as unbiological. Finally, the two vertical green lines within panel [A] indicate the boundaries of the region where the fast local EPSPs lie within a biologically plausible range (i.e, 0.5 → 1.5mV). Thus, the parameters in the grey region are the only ones that satisfy all of the constraints above, and we will henceforth consider only parameters in this region. Each panel shows typical trajectories for one of a family of networks. Each network in this family is a single cluster of N E = 1024 excitatory neurons and N I = 1024 inhibitory neurons driven by a uniform Poisson background input. Each neuron is modeled using the conductance-based integrate-and-fire equations with a nonzero τ E , τ I . The ratio of τ I /τ E is fixed to be 2 in accordance with physiologically observed timescales for GABA-type inhibition and AMPA-type excitation (the precise ratio here is not critical to our results). The different networks are identical except for the value of τ I , which ranges from 1ms (in panel-A) to 4ms (in panel-C). In the top two strips of each panel we show the voltage-distributions for the neurons in the network. Each vertical strip represents the voltage histogram for the EX or IN neurons in the network at a particular time within a 128ms-long trajectory. The bottom two strips in each panel show the raster plots for the EX and IN neurons in the network. Note that the presence of MFEs is very apparent, even when τ I is large. Moreover, the width of the MFEs is ≤ 10ms, and the larger MFEs are clearly distinguishable. Other than the addition of these delays, the network architecture is identical to the model discussed in the main text. The coupling strengths have been altered by ±10% to preserve the dynamical regime. On the right we plot the MFE distribution for this regime. Here we construct a model which resolves orientation space more finely (i.e., with J = 4). We retune this model so that the dynamic regime is qualitatively similar to the J = 3 model discussed in the main text. [A]. MFE distribution. [B]. Illustration of orientation tuning for this model. On the left we present the broadly tuned feedforward input we use to stimulate the model (rates shown for excitatory neurons). We measure this strength in terms of the timeaveraged Poisson input rate associated with a typical neuron of a given orientation within the stimulated hypercolumns. The background rate is indicated with a dashed line. On the right we plot the time-averaged output excitatory firing-rates for the stimulated hypercolumns. The background firing-rate is indicated with a dashed line. Note that the output firing-rates are much more sharply tuned than the input firing-rates.  In each of these panels, the grey region shown is a (large) subset of the grey region in Fig 1, with some additional parameters excluded due to inadequate performance with respect to orientation selectivity and/or receptive fields (see Sect. 3). Parameters in the region bounded by the black curves are admissible for purposes of background activity. In panel [A], parameters near 'a' are excluded due to too much synchronous firing (S EE too large relative to S IE , S EI ). Parameters near 'b' are excluded because heightened activity in one orientation domain tends to spill over into other orientation domains (local couplings too strong). Parameters near 'c' are excluded because these regimes show no discernible patterns in background (strong inhibition prevents successful recruitment and dominance by any one θ). Parameters near 'd' are excluded because the patterns switch too slowly (insufficient inhibition to suppress E-activity once it is aroused). In panel [B], patterns switch too slowly for parameters next to 'e', too fast for parameters near 'f'. In the region above 'g' (i.e. larger L I still), the regime becomes increasingly inhibition driven, showing characteristics of the regimes proposed in [111]. Parameter constraints imposed to satisfy surround suppression. The 2D-slices of parameter space shown here are identical to those in Figs 1 and 6, with identical labeling of axes. In each panel, the grey region is obtained by intersecting the corresponding grey region in Fig 6 with the region bounded by the black curves there, that is to say, parameters within the grey regions shown here meet all previous conditions. Fig 7 pertains to the effects of strong stimulation. Requiring that excitatory firing rate is lowered in the center as one stimulates the surround imposes no further constraint on parameters. In each of the panels here, the region bounded by the solid black curve consist of parameters with the property that when strongly stimulated, inhibitory firing is lower by ≥ 45% in Scenario 2 compared to Scenario 1, and the dashed black curve bounds the region where inhibitory firing in Scenario 2 is lower than or equal to that in Scenario 1. Note that the parameter region which satisfies all of our constraints is not small -the highlighted square region indicates a two-fold variation in parameters. Caption for Fig 8: In panels A and B we show the firing-events produced by our reference parameter set under the 'natural' drive described in section 5.2.3. Panel-A illustrates the sequence of E-and I-population spike counts (collected over 1ms bins) produced by one of the clusters C jk in our network when hypercolumn k alone is stimulated. Panel-B illustrates these spike-counts when 2 other hypercolumns are also stimulated.
In Panel-C we show time-averaged firing-rates for the E-(red) and I-(blue) populations as a function of the number of clusters stimulated. We did not divide the I-populations into MFE-spikes and nonMFE-spikes (as done in Fig 5 of the main text), since the firing-rates were quite low and the '95%' rule was not a good qualitative assessment of the degree of collaborative activity in this regime. In Panel-D we show the histograms of MFE magnitudes for E-(top) and I-cells (bottom) for the regimes shown in panels A (black) and B (colored) respectively.  [111]. The format follows that of panel [A] in Fig 2 of the main text, with two scale bars for the excitatory and inhibitory populations. Note that the dynamics in this regime are strongly stereotyped: sharp excitatory MFEs (called recruitments in [111]) initiate inhibitory-dominated epochs of firing which preclude further excitatory firing -in contrast to the sustained competition between E and I in our model following the initiation of each pattern. Also, as the inhibitory epochs in [111] are driven by g slow , the inhibitory firing-rate during these epochs follows a predictable decay. Panel [B]: (i) cross-covariances, as defined in the methods section of the main text, for Q = E (red) and Q = I (blue) over 1800ms (left) and ∼800ms (right); note the strong time-asymmetry at time 0. Finally, note the sharp contrast between the cross-covariances exhibited by this inhibitory dominated regime and the cross-covariances exhibited by the regime presented in  . In panel-B we show background activity for a system equivalent to that shown in panel-A, except that the local coupling strengths S IE + S EI have been decreased by ∼ 20% and S EE has been increased by ∼ 20%. For this regime the typical MFEs are slightly larger than in panel-A (as can be seen by inspecting the tail end of the MFE distribution) and the switching time is longer. In panel-C we show background activity for another system equivalent to that shown in panel-A, except that the local coupling strengths have been modified in the opposite direction -S EE has been decreased and S IE + S EI have been increased. For this regime the typical MFEs are slightly smaller than in panel-A, and the switching time is shorter. In panel-D we reproduce part of the dynamic trace shown in Fig 2 of the main text, which corresponds to a system with cluster size N E = N I = 128.