Joint Statistics of Strongly Correlated Neurons via Dimensional Reduction

The relative timing of action potentials in neurons recorded from local cortical networks often shows a non-trivial dependence, which is quantified by cross-correlation functions. Theoretical models emphasize that such spike train correlations are an inevitable consequence of two neurons being part of the same network and sharing some synaptic input. For non-linear neuron models, however, explicit correlation functions are difficult to compute analytically, and perturbative methods work only for weak shared input. To treat strong correlations, we suggest here an alternative, non-perturbative method. Specifically, we study the case of two leaky integrate-and-fire neurons with strong shared input. Correlation functions derived from simulated spike trains fit our theoretical predictions very accurately. Using our method, we computed the non-linear correlation transfer as well as correlation functions that are asymmetric due to inhomogeneous intrinsic parameters or unequal input.


I. INTRODUCTION
Electrical activity generated by different neurons in the brain is often strongly correlated [1][2][3][4]. These correlations arise as a result of shared input, or of input components that are themselves correlated. Correlated activity can be a consequence of shared background fluctuations [5], but strong correlations might also reflect synchronous action potentials at the input, indicative of temporal coding. However, a clear-cut dichotomy between decorrelated and synchronized dynamics does not exist [6][7,8]. Rather, one should consider these two extremes as two faces of the same coin. Recent high-precision measurements reported very low average correlations, suggesting a mechanism of active decorrelation in cortical networks [9][10][11][12]. At the same time, intracellular measurements showed that nearby neurons receive very similar input [2][3][4].
Several studies of pair correlations in neural networks relate structure and dynamics assuming fluctuating dynamics about a fixed point that is characterized by asynchronous (A) population activity and irregular (I) spike trains [11,13-15]. They employ essentially linear perturbation theory [16,17] to compute correlation functions. Nevertheless, some of these works push the limits of existing methods. First of all, a qualitatively different AI state was observed in simulations of spiking neural networks with stronger couplings [18]. Secondly, a partial extension of the theory to the strongly correlated regime was based on numerically determined spike response functions [14]. Thirdly, pair correlation studies were generalized to higher-order correlations in recurrent networks by accounting for certain network connectivity motifs [19]. These studies exploit and extend existing methods, but they also demonstrate the need for a new approach.
The main assumption of perturbation theory is that the common drive of the two neurons is weak. Yet, this criterion depends on the background state and strength of interactions in a given network. We showed previously that low background rates, for example, may lead to a breakdown of perturbation theory even for low correlation coefficients c [20]. This makes non-perturbative methods indispensable for modeling and analysis of correlations as low spike rates are typical in experiments [21]. All in all, a proper treatment of strong correlations must take the non-linear correlation transfer into account, which appears to play an important role in sensory processing [22]. However, a unified and transparent framework to calculate correlations of all strengths for neuron models of the integrate-and-fire type does not exist.
The pitfall of previously suggested non-perturbative methods is their immense computational cost due to the high dimensionality of the discretized problem [23]. This makes computations practically impossible for a large range of parameters. For instance, the numerical effort of computing the pair correlation problem scales as N^4, where N is the number of grid points used to approximate single neuron membrane potentials. Although limiting the grid size is possible [24], too coarse a voltage grid fails to properly reflect the statistics of leaky integrate-and-fire neurons with Poisson input. The precision issue gets even more severe for correlations of higher order, which are needed to parametrize the joint statistics of multiple neurons. With our methods, in contrast, we observed that joint membrane potential distributions of even strongly correlated neurons can be reduced to a small set of principal vectors via singular value decomposition (SVD). This suggests that strong correlations can be computed with high precision by resorting to subspaces of relatively low dimension. In this work, specifically, we devise an SVD-based method that allows computing spike correlation functions of two leaky integrate-and-fire neurons with high accuracy. Similar problems were studied analytically for arbitrary input correlations of the stochastic dynamics of neural oscillators [25], and for level-crossings of correlated Gaussian processes [26]. Related numerical work considered strong input correlations for integrate-and-fire neurons receiving white noise input [27] or shot noise input with nontrivial temporal correlations [28,29]. The problem of analytically calculating the stationary distributions conditional on a spike from the exit current at the threshold is also discussed in the case of colored noise [28]. A method to deal with very strong input correlations (c ≈ 1) in a specific input model (correlated Poisson processes) was suggested by [30].
Our study further suggests a novel technique, extending [24], to solve 2D jump equations for leaky integrate-and-fire neurons by mapping it to a Markov chain. This method provides an accurate estimate of the steady state joint distribution of membrane potentials.
We test our method for large input correlations c and demonstrate its power for different types of correlation asymmetries by comparing our semi-analytical approach to correlations extracted from simulated neuronal spike trains. We cover the full range of input correlations and provide an example of a non-linear correlation transfer function. Our method can be extended to more general integrate-and-fire models, higher-order input correlations and third-order output correlations; a detailed analysis of such cases, however, is deferred to future work.

II. METHODS

A. Two neurons with shared Poisson input
The leaky integrate-and-fire neuron model with postsynaptic potentials of finite amplitude was studied previously in [24]. The stochastic equation that describes the membrane potential dynamics of one particular neuron is given as

τ_m dV/dt = -V(t) + τ_m [ h_ex Σ_i S_i^ex(t) - h_in Σ_j S_j^in(t) ],

where h_ex and h_in represent the amplitudes of individual EPSPs and IPSPs, respectively, and S_i^ex(t) and S_j^in(t) are the spike trains of excitatory and inhibitory presynaptic neurons, with each of their spikes represented by a Dirac delta function, S_i^ex(t) = Σ_k δ(t - t_k^i). An analogous definition holds for inhibitory neurons. In both cases, if the membrane potential reaches the firing threshold V_th, a spike is elicited and the voltage is reset to its resting value at 0.
In order to study correlations between the spike trains of two neurons we consider two coupled stochastic equations, describing the membrane potentials of two neurons that share a certain fraction of their excitatory and inhibitory input spikes,

τ_m dV_a/dt = -V_a(t) + τ_m [ h_ex ( Σ_i S_{a,i}^ex(t) + Σ_m S_{s,m}^ex(t) ) - h_in ( Σ_j S_{a,j}^in(t) + Σ_n S_{s,n}^in(t) ) ],  a = 1, 2,

where the shared excitatory and inhibitory input spike trains are denoted as Σ_m S_{s,m}^ex(t) and Σ_n S_{s,n}^in(t), respectively. Matching the parameters of the jump model to shared white noise input implicitly specifies the firing rates of 6 independent Poisson processes (Fig. 1). For notational convenience, the shared input rates r_ex,c and r_in,c are computed from a shared Wiener process ξ_c with zero mean. Setting h_ex = h and h_in = gh, the rates of the independent excitatory and inhibitory processes follow from the prescribed mean µ, variance σ² and input correlation c. Here we note that some combinations of (µ, σ) do not correspond to any combination of non-negative Poisson rates, as the mapping must guarantee r_in ≥ 0 for the inhibitory rates.

B. Discretized Markov operators for the LIF model
In this section we summarize the discrete approximation to the dynamics of a LIF neuron, as developed in [24]. The membrane potential is coarse-grained into bins of size ∆V, with grid points v_i = i ∆V. The probability density function P(v) of the membrane potential then becomes a vector p = (p_i) with p_i = P(v_i) ∆V. As we impose a cut-off lower boundary V_- on the voltage scale, the dimension of the discrete state space is N = (V_th - V_-)/∆V. The temporal evolution of the membrane potential distribution is now described in terms of a Markov process. The Markov propagator can be expressed as a composition of three operators: a decay operator D describing leaky integration, a jump operator J accounting for the action of synaptic inputs, and a threshold-and-reset operator T that implements spike generation upon threshold crossing. The discrete decay operator D is derived from its continuous counterpart by propagating each bin center deterministically under the leak dynamics, v → v e^{-∆t/τ_m}, and reassigning the result to the nearest grid point; each column of the resulting decay matrix thus contains a single non-zero entry. The jump distribution, which underlies the jump operator J, can be derived from the count distribution of excitatory and inhibitory synaptic events in a small time interval ∆t.
Assuming that excitatory and inhibitory events arrive with Poisson statistics (maximum entropy) at rates r_ex and r_in, respectively, the joint count distribution in a bin of size ∆t is

P(m, n) = e^{-a} a^m/m! · e^{-b} b^n/n!,

where a = r_ex ∆t and b = r_in ∆t are the corresponding expected event counts, and the dummy indices m and n refer to excitatory and inhibitory counts. A pair of counts (m, n) displaces the membrane potential by γ = (m - gn)h, so the jump distribution is obtained by summing P(m, n) over all pairs with the same net displacement. The jump operator J is then the matrix that shifts probability mass between voltage bins according to this distribution. The threshold-and-reset operator T takes all the states above threshold and reinserts them at the reset potential. Finally, the time evolution matrix of the corresponding Markov chain is the product of the three operators described above, M = T J D. The discrete stationary distribution is the solution of M p_0 = p_0, and the corresponding stationary rate follows from the probability flux across threshold per time step.
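The construction summarized above can be sketched in a few lines. The following is a minimal illustration of the operator product M = T J D and the stationary rate for a single LIF neuron; all parameter values (grid size, rates, amplitudes) are illustrative assumptions, not the values used in the paper.

```python
import numpy as np
from scipy.stats import poisson

# Illustrative sketch of the discretized propagator M = T J D.
N      = 100              # voltage bins between the cut-off V_- and V_th
pad    = 20               # extra bins above threshold for supra-threshold jumps
dt     = 1e-4             # time step (s)
tau_m  = 0.01             # membrane time constant (s)
i0     = 20               # bin index of the reset potential V = 0
k, l   = 1, 4             # EPSP / IPSP amplitudes in units of dV
r_ex, r_in = 9000.0, 1000.0   # private Poisson input rates (1/s)
Ntot   = N + pad

# Decay operator D: the leak maps v -> v * exp(-dt/tau_m) deterministically;
# each bin center is reassigned to the nearest grid point.
v = np.arange(Ntot) - i0                    # voltage in units of dV
D = np.zeros((Ntot, Ntot))
for j in range(Ntot):
    tgt = i0 + int(round(v[j] * np.exp(-dt / tau_m)))
    D[np.clip(tgt, 0, Ntot - 1), j] = 1.0

# Jump operator J: Poisson counts (m, n) in dt displace the state by
# gamma = (m*k - n*l) bins; the count distribution factorizes.
a, b = r_ex * dt, r_in * dt
J = np.zeros((Ntot, Ntot))
for m in range(11):
    for n in range(11):
        p = poisson.pmf(m, a) * poisson.pmf(n, b)
        gamma = m * k - n * l
        for j in range(Ntot):
            J[int(np.clip(j + gamma, 0, Ntot - 1)), j] += p

# Threshold-and-reset operator T: mass at or above bin N is reinserted
# at the reset bin i0.
T = np.eye(Ntot)
T[N:, N:] = 0.0
T[i0, N:] = 1.0

M = T @ J @ D                               # one-step Markov propagator

# Stationary distribution: Perron eigenvector of M; the stationary rate
# follows from the probability flux across threshold per time step.
w, V = np.linalg.eig(M)
p0 = np.real(V[:, np.argmax(np.real(w))])
p0 /= p0.sum()
rate = (J @ D @ p0)[N:].sum() / dt          # firing rate (1/s)
```

Note that the Poisson sums are truncated at 10 events per bin, which is adequate only for the small expected counts a, b used here.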

C. Spike train correlations
The cross-covariance function of two stationary spike trains S_a(t) = Σ_l δ(t - t_l^a) (a = 1, 2) is defined as

C_21(τ) = ⟨S_2(t) S_1(t + τ)⟩ - r_2 r_1,

where ⟨S_a(t)⟩ = r_a, with ⟨·⟩ indicating the ensemble average. For the model we study here, stationary rates were computed using Eq. 21. The cross-covariance function can be expressed in terms of the conditional firing rate r_1|2(τ), the two stationary rates r_1 and r_2, and the amplitude r_0 of a δ-function at zero lag,

C_21(τ) = r_2 [ r_1|2(τ) - r_1 ] + r_0 δ(τ).

Given the spiking neuron model considered here, Eq. 21 can be derived from the stationary joint membrane potential distribution P_0(V_1, V_2). Conditional on a spike at time t_0 in the second neuron, one has to find the instantaneous distribution of the membrane potential of the first neuron. The probability of observing a consecutive spike is given by the flux P_flux(V_1, θ_2) with the normalization in Eq. 23. The conditional flux is computed for the Markov approximation as in Eq. 47. In general, we compute the exit rate at the threshold, distributed over V_1, by solving the initial value problem

p(t + ∆t) = M_1 p(t),  p(0) = P_flux(·, θ_2),

where M_1 is the discrete time evolution matrix of the neuron model given by Eq. 17; M_1 generates a (forward) time evolution in steps of ∆t. The instantaneous conditional rate r_1|2(t) in Eq. 21 is computed with a formula similar to Eq. 19. Finally, the covariance function C_21(τ) in Eq. 21 is obtained from r_1|2(t), the two stationary rates r_1 and r_2, and the amplitude r_0. Note that the method described here can be generalized to higher-order correlations as well.
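The initial-value iteration above can be sketched with a toy single-neuron propagator. In this illustration a biased random walk with threshold-and-reset stands in for the LIF matrix M_1, and a distribution shifted toward threshold stands in for the conditional flux distribution; all numbers are illustrative assumptions.

```python
import numpy as np

# Toy illustration of the conditional-rate computation r_{1|2}(t).
N, i0, dt = 50, 10, 1e-3
p_up, p_down = 0.30, 0.25             # illustrative transition probabilities

M1 = np.zeros((N, N))
flux_row = np.zeros(N)                # prob. of crossing threshold from bin j
for j in range(N):
    M1[j, j] += 1.0 - p_up - p_down
    M1[max(j - 1, 0), j] += p_down
    if j + 1 < N:
        M1[j + 1, j] += p_up
    else:
        M1[i0, j] += p_up             # crossing from the top bin -> reset
        flux_row[j] += p_up

# Stationary rate r1 from the Perron vector of M1.
w, V = np.linalg.eig(M1)
p_stat = np.real(V[:, np.argmax(np.real(w))])
p_stat /= p_stat.sum()
r1 = (flux_row @ p_stat) / dt

# Stand-in for the flux distribution conditional on a spike in neuron 2:
# a distribution shifted toward threshold.
p = np.zeros(N)
p[N // 2:] = 1.0
p /= p.sum()

r_cond = []                           # r_{1|2}(t) on the grid t = k * dt
for _ in range(2000):
    r_cond.append((flux_row @ p) / dt)
    p = M1 @ p
r_cond = np.array(r_cond)

# Continuous part of the covariance for tau > 0 (the delta amplitude r0
# is handled separately): C_21(tau) = r2 * (r_{1|2}(tau) - r1).
r2 = r1                               # symmetric toy case
C21 = r2 * (r_cond - r1)
```

As t grows, r_{1|2}(t) relaxes toward the stationary rate r_1 and the covariance decays to zero, as expected for a mixing Markov chain.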

D. Correlated jump distribution in 2D
We now use a 2-dimensional state space describing the joint membrane potential evolution of two neurons. Correlated and uncorrelated Poisson jumps push the 2D membrane potential vector (V_1, V_2) in three independent directions, (1, 0), (0, 1) and (1, 1), with jumps in the positive (excitatory) and negative (inhibitory) orientation in each case. Hence, the jump distribution from an initial position (U_1, U_2) to a new position (V_1, V_2) in state space, driven by 6 independent Poisson processes, is obtained from a 2D convolution of the three constituent jump distributions. Inserting the expressions for the jump distributions in each direction, collecting all terms and using Eq. 13, we obtain a closed expression. It is also valid for more general input statistics models that rely on a decomposition into statistically independent parts [30,31]. Here we consider the shared Poisson input model as described by Eq. 4. Integrating with respect to Z and inserting the mean event counts a_i = r_ex,i ∆t and b_i = r_in,i ∆t yields a compact form. We can simplify this expression further by choosing a regular grid according to h_e = n∆V and h_i = m∆V, for integers m and n. In practice, the resulting sum is truncated at a given tolerance. In the following subsection we will use the matrix form of the discretized operators, which is equivalent to the above expression.
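The structure of this 2D convolution can be sketched directly on a grid of integer displacements: private jumps act on each axis separately (an outer product), while shared jumps shift both coordinates together along the diagonal. Rates and amplitudes below are illustrative assumptions.

```python
import numpy as np
from scipy.stats import poisson

dt, k, l = 1e-4, 1, 4                  # h_e = k*dV, h_i = l*dV (assumed)

def jump_pmf(r_ex, r_in, span=30, nmax=12):
    """pmf of the net displacement gamma = m*k - n*l, on [-span, span]."""
    q = np.zeros(2 * span + 1)
    for m in range(nmax + 1):
        for n in range(nmax + 1):
            g = m * k - n * l
            if abs(g) <= span:
                q[span + g] += poisson.pmf(m, r_ex * dt) * poisson.pmf(n, r_in * dt)
    return q

q1 = jump_pmf(6000.0, 800.0)           # private input of neuron 1
q2 = jump_pmf(6000.0, 800.0)           # private input of neuron 2
qs = jump_pmf(3000.0, 400.0)           # shared input
span = (len(qs) - 1) // 2

K = np.outer(q1, q2)                   # independent (separable) part
Q = np.zeros((len(q1) + 2 * span, len(q2) + 2 * span))
for j, p in enumerate(qs):             # convolve along the (1, 1) direction
    Q[j: j + len(q1), j: j + len(q2)] += p * K
```

By construction the marginal of Q along either axis is the 1D convolution of the corresponding private kernel with the shared kernel, as required for consistency with the single-neuron dynamics.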

E. Linear operators for correlated dynamics in 2D
Here we discuss the action of operators on state vectors of the discretized (V_1, V_2) space, assuming N bins in each dimension. We write the stationary joint voltage distribution in the product form P_0 = X Ω Y^T, where X and Y are two suitable orthogonal bases of R^N and Ω is a coupling matrix. We will define a specific choice for X and Y later in Section II F. As there is no crosstalk between the two neurons except for the shared input leading to a correlated jump distribution, the threshold and decay operators (tensors) are separable, e.g. T_2D = T_1 ⊗ T_2 and D_2D = D_1 ⊗ D_2. Separability would also apply to the jump distribution for an input correlation coefficient of c = 0, corresponding to two neurons without shared input. In the case of non-zero correlation, however, this relation holds only for a single path among the many connecting two points in state space. Therefore, every path must be taken into account by considering the contribution of each operator to a movement in the oblique (1, 1) direction.
Once the correct 2D Markov matrix M_2D = T_2D J_2D D_2D has been computed, one can find the stationary joint membrane potential distribution as its Perron vector, i.e. the eigenvector associated with the eigenvalue 1. Regarding the correlated jump distribution, there are two ways of constructing the operator J_2D.
One possibility is described in Eq. 26. Alternatively, we construct linear Markov jump operators in 2D by exploiting the commutativity of the infinitesimal operators involved; here I denotes the identity matrix. Using the properties of the operator product ⊗ and the commutativity of the individual factors, the expression can be simplified. In order to expand the third term we define U and D as up and down transition matrices, where U corresponds to a 1-step up transition and D to a 1-step down transition. Discrete approximations to the infinitesimal generators of the private components are then given in terms of the matrix powers U^k and D^l, with exponents defined by h_e = k∆V and h_i = l∆V for simplicity. (This restrictive assumption can be generalized easily by computing the continuous jump distribution in Eq. 14 and then discretizing it, which leads to the same result.) On the other hand, we need to be careful with correlated spikes, which trigger jumps in the oblique direction for m excitatory and n inhibitory events, displacing both neurons simultaneously. As noted before, the construction is quite general and can be applied easily to general amplitude distributions. We only need to consider a discrete amplitude distribution c_mn in a given time bin of size ∆t, as described above; see also [31,32].
A final remark on the method described in this section concerns the commutativity of the matrices involved. This property leads to a numerically more economic expression, in which the set of integers J is defined as the list of all net jump numbers j = mk - nl, and the coefficient c(j) is the probability of a displacement by j bins. The jump generator O is then defined in terms of the corresponding matrix powers of U and D.

F. Operator decomposition and SVD basis reduction
The expansion method described above is straightforward, but rather cumbersome to implement. We will now introduce a basis for the expansion of correlated jump operators that is suitable to reduce the cost of the computations involved, and helpful to increase the accuracy of a truncation. Compared to other approaches, our method solves linear equations of lower dimensionality while achieving a better approximation of the correlation function.
Some further arguments for selecting the singular value decomposition are discussed in the results section. The SVD of a matrix is given as M = X Σ Y^T, with orthogonal X and Y and a diagonal matrix Σ of singular values. In order to project J_1 (T_1) and J_2 (T_2) to this subspace, we also extend the orthogonal bases X and Y to supra-threshold transitions, such that X is an M × K and Y an N × L matrix, respectively, with identity blocks I_m and I_n, where m and n are the maximal supra-threshold jump numbers induced by both private and shared inputs. The combined action of J and T for c = 0 is then expressed in terms of reduced operators. Below we use the same dimensional reduction for the correlated operators. In Eq. 37 the correlated jump operators are expressed as sums of separable terms; a dimensional reduction is then achieved by applying Eq. 42 to each term. In order to find P_0 in Eq. 27 we need to solve Eq. 32. Projecting the left hand side of Eq. 32 onto the reduced bases, we obtain a fixed-point equation for the coupling matrix Ω with a tensor Q acting on it. The dimensionally reduced problem in Eq. 43 can then be solved in practice by reindexing the tensor indices (i, j, k, l) → (I, K).
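The basic mechanics of such a reduction can be illustrated on a synthetic joint distribution (a product form perturbed by a diagonal ridge, standing in for weak shared input): a rank-K truncation of the SVD captures the distribution up to the first omitted singular value, and operators projected onto the leading subspace shrink from N × N to K × K. All quantities here are illustrative, not the solution of the full 2D problem.

```python
import numpy as np

N, K = 120, 10
v = np.linspace(0.0, 1.0, N)
p1 = np.exp(-(v - 0.4) ** 2 / 0.02); p1 /= p1.sum()
p2 = np.exp(-(v - 0.5) ** 2 / 0.03); p2 /= p2.sum()
ridge = np.exp(-(v[:, None] - v[None, :]) ** 2 / 0.01)   # mimics shared input
P0 = np.outer(p1, p2) * (1.0 + 0.5 * ridge)
P0 /= P0.sum()

X, s, Yt = np.linalg.svd(P0)
# Rank-K truncation; by the Eckart-Young theorem the spectral-norm error
# of the truncation equals the first omitted singular value s[K].
P_red = (X[:, :K] * s[:K]) @ Yt[:K, :]
err = np.linalg.norm(P0 - P_red, 2)

# A single-neuron operator projected to the reduced left basis becomes
# a K x K matrix instead of N x N.
A = np.diag(np.exp(-v))                 # toy operator acting on V1
A_red = X[:, :K].T @ A @ X[:, :K]
```

The fast decay of the singular values is what makes the reduction pay off: the truncation error drops by orders of magnitude within the first few modes.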

G. Conditional flux distribution
Using the decomposed 2D stationary distribution obtained by the reduction, one can compute the flux distribution with the help of matrix operators, using the 2D decay and jump operators with thresholding imposed only at one of the boundaries. A scheme illustrating the situation is shown in Fig. 1c. Here we explain how to obtain the flux distribution conditional on a spike in one neuron. In the general case of correlated neurons, the action of the full operators J_2D and D_2D yields a matrix P_J of dimension (M + m) × (N + n). The implicitly summed subspace indices, P_0,kl = Σ_{m,n} X^T_{k,m} Ω_{mn} Y_{n,l}, are not shown, and A_j, B_j and c(j) are defined in Eq. 37. This equation can be written in a concise form with implicit indices. Again, for practical reasons, the computations were reduced by projecting J_2D D_2D to extended subspaces X̃ and Ỹ, similar to Eq. 42. In order to compute the probability of jumps we need to sum the probabilities for a jump over the threshold V_1 > θ_1 or V_2 > θ_2. The flux distribution obtained in this way is defined for k < M and l < N, and the expressions are normalized such that Σ_k P_flux,k = 1.
The amplitude of the delta singularity at the reset potential is obtained from the rate r_ex,c of the shared excitatory spike trains, which can drive both neurons across threshold simultaneously. Once we have found the conditional flux distribution, we can solve the initial value problem defined in Eq. 23 in order to determine the conditional rates r_1|2(t) or r_2|1(t).

H. Diffusion approximation vs. finite PSPs
We compare the exit rate of the stochastic system with post-synaptic potentials of finite amplitude to the analytic result obtained in the diffusion approximation [?]. For small enough PSPs the difference between the rates of the two models is small. We use the absolute difference between the two rates to assess the accuracy of a specific space-time grid (∆V, ∆t).
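Such a comparison can be sketched as follows: the stationary LIF rate in the diffusion approximation is given by the well-known Siegert formula (with zero refractory period), and the finite-PSP rate is estimated by directly simulating the jump process. All parameter values are illustrative assumptions.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import erf

tau_m, theta, V_r = 0.01, 20.0, 0.0        # s, mV, mV (assumed values)
h, g = 0.1, 4.0                            # EPSP amplitude (mV), inh. ratio
r_ex, r_in = 28000.0, 3000.0               # Poisson input rates (1/s)
mu    = tau_m * h * (r_ex - g * r_in)                  # mean input (mV)
sigma = np.sqrt(tau_m * h**2 * (r_ex + g**2 * r_in))   # noise amplitude (mV)

def siegert_rate(mu, sigma):
    """Stationary LIF rate in the diffusion approximation."""
    lo, hi = (V_r - mu) / sigma, (theta - mu) / sigma
    integral, _ = quad(lambda u: np.exp(u * u) * (1.0 + erf(u)), lo, hi)
    return 1.0 / (tau_m * np.sqrt(np.pi) * integral)

r_diff = siegert_rate(mu, sigma)

# Direct simulation of the finite-PSP jump process (vectorized trials).
rng = np.random.default_rng(1)
n_tr, steps, dt = 500, 20000, 1e-4
V = np.zeros(n_tr)
spikes = 0
for _ in range(steps):
    V *= np.exp(-dt / tau_m)               # deterministic leak
    V += h * rng.poisson(r_ex * dt, n_tr) - g * h * rng.poisson(r_in * dt, n_tr)
    crossed = V >= theta
    spikes += int(crossed.sum())
    V[crossed] = V_r                       # threshold-and-reset
r_sim = spikes / (n_tr * steps * dt)
```

For PSPs this small relative to the threshold, the two rates should agree within a few percent; the residual difference is the kind of discrepancy used in the text to gauge the quality of a space-time grid.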

I. Correlation coefficient and comparison to diffusion approximation
The correlation coefficient in Fig. 10a is computed from the stationary rate r (Eq. 49), the amplitude r_0 of the δ-function in Eq. 48, and the coefficient of variation of the inter-spike intervals, CV_ISI, which is computed with the equation given in [13]. Thereby erf(x) denotes the error function [33]. Correlation coefficients for the diffusion approximation of the finite-PSP system (i.e. a 2D Fokker-Planck equation with the same c and σ's) were computed in [20].

J. Direct numerical simulations and data smoothing
We used the neural simulation tool NEST [34] to perform numerical simulations of input and output spike trains in the scenario described above. All analyses were based on discretized voltage data obtained during simulations of 1 000 s duration using a time resolution of ∆t = 10 −4 s.
Empirical voltage distributions were obtained by normalizing histograms appropriately.
Further smoothing using a simple moving average was performed before comparing these distributions to the analytically obtained stationary distribution. We also performed the comparison using cumulative distributions, as the implicit integration very efficiently reduces the noise in the data. Two 2D distributions are compared via visual inspection of contour lines. We also directly compare spike train cross-correlation functions to assess efficiency and accuracy of the method.
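The smoothing and comparison steps described above can be sketched as follows; the "recorded" voltages here are synthetic stand-ins drawn from a Gaussian, and window and bin sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
edges = np.linspace(-5.0, 20.0, 251)            # voltage bin edges (mV)
samples = rng.normal(10.0, 3.0, 100_000)        # stand-in for recorded voltages
hist, _ = np.histogram(samples, bins=edges)
p_emp = hist / hist.sum()                       # normalized histogram

win = 5                                         # moving-average window (bins)
p_smooth = np.convolve(p_emp, np.ones(win) / win, mode="same")

# Comparison via cumulative distributions: the implicit integration
# strongly suppresses bin noise.
centers = 0.5 * (edges[:-1] + edges[1:])
p_theo = np.exp(-(centers - 10.0) ** 2 / (2 * 3.0 ** 2))
p_theo /= p_theo.sum()
ks = np.max(np.abs(np.cumsum(p_emp) - np.cumsum(p_theo)))   # sup distance
```

The sup distance between the two cumulative distributions plays the role of a simple scalar summary; in the paper the corresponding comparison is done by visual inspection of contour lines for the 2D case.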

K. Numerical evaluation of cross-correlation functions
We compute cross-correlation functions of spike trains from conditional PSTHs. One can express this as an integral over the two variables τ = t_1 - t_2 and s = t_1 + t_2 with bin size ∆, averaged over the observation window T.
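A minimal binned-count version of this estimator looks as follows. The two spike trains are synthetic, with a shared Poisson component that produces a covariance peak at zero lag; rates, bin size and window are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
T, dbin = 200.0, 1e-3                 # observation window (s), bin size (s)
nbins = int(T / dbin)

r_priv, r_shared = 8.0, 4.0           # private and shared rates (1/s)
shared = rng.poisson(r_shared * dbin, nbins)
n1 = rng.poisson(r_priv * dbin, nbins) + shared
n2 = rng.poisson(r_priv * dbin, nbins) + shared
r1, r2 = n1.sum() / T, n2.sum() / T   # empirical stationary rates

# C(tau) ~ <n1(t+tau) n2(t)>/dbin^2 - r1*r2, averaged over the window.
maxlag = 50
C = np.empty(2 * maxlag + 1)
for i, lag in enumerate(range(-maxlag, maxlag + 1)):
    if lag >= 0:
        prod = float(n1[lag:] @ n2[: nbins - lag])
    else:
        prod = float(n1[: nbins + lag] @ n2[-lag:])
    C[i] = prod / ((nbins - abs(lag)) * dbin ** 2) - r1 * r2
```

For this input model the expected covariance is r_shared/∆ at zero lag and zero elsewhere, so the estimator should show a single sharp central peak on a noisy baseline.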

L. Convergence and error bounds
The direct singular value decomposition of a 2D membrane potential distribution shows that only a few singular values deviate significantly from 0 (Fig. 4). This behavior does not depend strongly on the chosen discretization, but it does depend on the input correlation coefficient c. Although singular vectors are not probability distributions in their own right, the singular vectors X_i derived from the neuronal dynamics matrix (except the first few) have the property Σ_k (X_i)_k ≈ 0, provided the discretization is fine enough. This behavior is demonstrated in Fig. 6. The contribution of each term to the overall normalization is progressively small, where Σ_1k and Σ_2l are the sums over the k-th and l-th singular vectors of the first and the second matrix, for k ≥ m and l ≥ n, respectively. This shows that the sum converges rather quickly. This error measure is related to projections of the 1D discrete stationary distribution P_0 (satisfying M_1 P_0 = P_0) onto the singular vectors. All other eigenvectors P_i of the Markov matrix (M_1 P_i = λ P_i with |λ| < 1) satisfy Σ_k (P_i)_k = 0. We want to avoid underestimating the total probability mass as a result of the truncated sum in Eq. 27; the argument above justifies that the remainder of the P_0 projections after truncation can be omitted up to a certain precision. In order to describe the cumulative contribution of singular vectors, on the other hand, we look at the L_1 distance of the omitted remainder, which describes how well the method converges self-consistently. Note that this error measure is not re-normalized after every added term; it relies on the fast convergence of the P_0 projections measured by Eq. 58, so the first few error terms can be misleading.

III. RESULTS
In order to treat strong correlations we devised a robust numerical method to study the joint statistics of membrane potentials and spike trains of integrate-and-fire model neurons.
The case study reported here covers the leaky integrate-and-fire (LIF) model with Poisson input spike trains. However, our method can be easily generalized to non-linear leak functions [35], conductance-based synaptic inputs [36] and more complex input correlation models [30,31], although we have to leave the details of such generalizations open. In this section we will explain how to select a 'good basis for expansion', and we will give numerical examples that demonstrate the power of the method.

A. SVD of joint probability distributions and choice of expansion basis
We started from a simple observation: the stationary joint membrane potential distribution of two neurons with independent input is given by the product P_0(V_1, V_2) = P_1(V_1) P_2(V_2), where P_1(V_1) and P_2(V_2) are the stationary membrane potential distributions of the two independent neurons, as described in Eq. 4. On a discretized voltage grid, the analogous relation is an outer product of the discrete distributions. For the case of shared input this simple relation is not valid any more. On the other hand, we observed that a value of the parameter c close to 0 will in practice recruit only a small number of additional principal components for any given precision, cf. Fig. 4. Here we perform a singular value decomposition of the full solution of Eq. 32, generalizing the product form of the independent case, to reconstruct the joint membrane potential distribution for neurons with shared input. As demonstrated in Fig. 4, convergence is rather quick, even for moderate values of c.
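The observation can be made concrete with synthetic distributions: an outer product is exactly rank one (a single non-zero singular value), and a weak shared-input component, mimicked here by a diagonal ridge, recruits additional singular vectors only gradually. All distributions below are illustrative stand-ins.

```python
import numpy as np

N = 150
v = np.linspace(0.0, 1.0, N)
p1 = np.exp(-(v - 0.3) ** 2 / 0.02); p1 /= p1.sum()
p2 = np.exp(-(v - 0.6) ** 2 / 0.05); p2 /= p2.sum()

# Independent neurons: joint distribution is the outer product, rank 1.
P_indep = np.outer(p1, p2)
s_indep = np.linalg.svd(P_indep, compute_uv=False)

# Weak 'shared input': mix in a small diagonal ridge.
c = 0.1
ridge = np.exp(-(v[:, None] - v[None, :]) ** 2 / 0.005)
P_corr = (1.0 - c) * P_indep + c * P_indep * ridge
P_corr /= P_corr.sum()
s_corr = np.linalg.svd(P_corr, compute_uv=False)
rank_eff = int(np.sum(s_corr > 1e-6 * s_corr[0]))   # effective rank
```

The effective rank of the perturbed distribution stays far below the grid dimension N, which is the property the dimensional reduction exploits.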
Another aim of our study was to gain some understanding of the influence of the space-time grid. We observed that left and right singular vectors consist of a quasi-continuous part, reflected by X_c and Y_c, coupled through the matrix Ω as defined in Eq. 27. Here we need to make sure that the emerging singularity at the reset bin does not cause numerical problems. One first needs to choose a small time step ∆t and adapt the spatial step ∆V accordingly. A more thorough discussion of a suitable coarse-graining strategy, however, is postponed to a later section of this paper.
Here we suggest SVD as a method to achieve a dimensional reduction of the full system. As it is a numerical method, its convergence and efficiency need to be addressed.
Generally, there are several different options for selecting a basis. Specifically, we use the right singular vectors of the single-neuron Markov matrices. As demonstrated in Fig. 4, right singular vectors lead to an expansion that converges faster for coarse grids (e.g. ∆V = 0.5 mV). Although for finer grids (e.g. ∆V = 0.05 mV) the difference is less prominent (Figs. 5a and b), right singular vectors still converge slightly faster than left singular vectors (Fig. 5d). As reported previously [20], we may also use a direct analytical approach based on the eigenbasis of the single-neuron Fokker-Planck operator and its adjoint basis, where X and Y are the left eigenvectors of the single-neuron operators. However, the discrete adjoint basis blows up at the lower boundary; the effect of this on our approximation is demonstrated in Fig. 3. In general, SVD eliminates the kernel of singular matrices.
In our treatment of the 2D Fokker-Planck equation, which is the infinitesimal limit of the theory considered above, we used the basis and adjoint basis to project linear operators to a subspace. This has certain advantages as it satisfies constraints for marginal distributions and preserves the Markov property to some extent. Positivity of the solution in the subspace is not guaranteed, but time evolution is probability preserving ( i P i (t) = 1).
First, SVD is computationally convenient, because it leads to a real orthogonal basis which resembles the eigenbasis of M. Second, numerical instabilities due to an ill-conditioned basis are avoided.

C. Application 1: Non-linear correlation transfer

Two neurons that are driven by correlated input current will exhibit correlated output spike trains. This transfer of correlation reflects an important property of neuronal dynamics, which is of particular relevance for understanding the contribution of neurons to network dynamics. Recently, we were able to demonstrate, by exact analytical treatment, that the correlation transfer for leaky integrate-and-fire neurons is strongly non-linear [20]. Only for weak input correlation can it be described by perturbative methods, and deviations from linear response theory depend on the background firing rate. In the present work we demonstrate the same non-linear correlation transfer, cf. Fig. 10. There we also demonstrate how the parameters of the spatial and temporal coarse graining affect the precision of the Markov approximation. Our main conclusion is that dimensional reduction via SVD subspace projections makes it possible to achieve superior precision with small bin sizes. Fine enough grids, however, could not be dealt with on typical computers without the reduction suggested here.

D. Application 2: Asymmetric cross-covariance functions
Neurons in biological networks have widely distributed parameters, and this heterogeneity may also influence information processing [38][39][40]. Moreover, robust asymmetries in spike correlations could lead to asymmetric synaptic efficacies, if they are subject to spike timing dependent plasticity [41,42]. Our approach reveals a generic temporal asymmetry in cross-covariance functions, related to the heterogeneity of intrinsic neuron parameters and input variables. Such temporal asymmetry is more pronounced for larger values of c, especially in the non-perturbative regime that we address in this work.
We document here an application of our method to four types of asymmetries [20,40]: µ asymmetry, σ asymmetry, τ_m asymmetry and V_th asymmetry. We quantified the asymmetry by a = χ_1/χ_2 (specific values given in Table II), where χ stands for the respective parameter. Asymmetric correlations have been described previously; they were studied by numerical simulations and experimentally in [38,40,43,44].
FIG. 9. Results are presented for different asymmetric parameters: (a) µ asymmetry, (b) σ asymmetry, (c) τ_m asymmetry, which implies an asymmetry in µ and σ as well, as private spike train input rates are the same, (d) V_th asymmetry. For specific parameters, see Table II.

FIG. 10. (a) Correlation transfer function as a function of input correlation. We compare analytical results (solid curves) described in a recent paper [20] with numerical results (orange symbols) obtained with the methods described in this paper. Non-perturbative correlation transfer functions C_out(C_in) from [20] for symmetric parameters and for high and low firing rates, respectively (blue: r_b = 15.2 Hz, CV² = 0.5; green: r_g = 1.13 Hz, CV² = 0.98). Slopes of the light blue and light green lines (corresponding to dC_out/dC_in at C_in = 0) are computed using perturbation theory [37]. Note that the points C_out(0) = 0 and C_out(1) = 1 were added to the plot by hand.

FIG. 11. Fixing the space bin (here ∆V = 0.02 mV), the choice of the time bin affects the firing rate estimation [24]. The deviation of correlation coefficients for small rates in Fig. 10a (orange triangles vs. dark green curve) results from a poor estimation of conditional rates. The variance of the input (σ²) is crucial in determining the appropriate temporal bin size, while the mean input (µ) is less influential. With increasing variance one observes an increasing firing rate error. From these plots, we conclude that for a space bin ∆V = 0.02 mV, a time bin between ∆t = 0.2 ms and ∆t = 0.1 ms defines a range of good choices. All other parameters are as specified in Table I.

IV. DISCUSSION

Exact analytical results for spike correlations (such as the relation of input and output correlations) are difficult to derive. Previous analytical approaches have employed perturbation theory [16,17] to study pairwise correlations under the assumption of weak input correlation [37,45].
As a consequence of this technical convenience, we still lack a systematic approach that allows us to deal with a broad range of correlations, and to gain an understanding of their implications for network dynamics.
B. Extensions of the theory

All computations described in our paper can be applied to more general integrate-and-fire models with a non-linear leak function Ψ(V). We only need to rewrite the decay matrix as a discrete approximation of the deterministic flow τ_m dV/dt = Ψ(V). Other scenarios of interest are reflected by an altered amplitude distribution of the inputs.
This is a natural consequence if individual synapses have different PSP amplitudes. It could also arise, however, if the population of input neurons has a non-trivial correlation structure.
In particular, higher-order correlations have been treated in terms of specific amplitude distributions [31,46]. The method described in the present paper can be adapted to such scenarios by simply using a modified definition of the jump matrix [30].
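To make the adaptation concrete, here is a minimal sketch (our own illustration, not the paper's code) of how a jump matrix could be assembled from a discrete PSP amplitude distribution on a membrane-potential grid. An input spike of amplitude a shifts probability up by round(a/∆V) bins; for simplicity, suprathreshold jumps are mapped to a reset bin here, and the function name and arguments are hypothetical.

```python
import numpy as np

def jump_matrix(n_bins, dV, amplitudes, probs):
    """Column-stochastic jump matrix for input spikes drawn from a
    discrete PSP amplitude distribution (hypothetical helper).
    A spike of amplitude a shifts the potential by round(a/dV) bins;
    jumps beyond the grid are mapped to the reset bin (bin 0)."""
    J = np.zeros((n_bins, n_bins))
    for a, p in zip(amplitudes, probs):
        shift = int(round(a / dV))
        for i in range(n_bins):
            j = i + shift
            if j < n_bins:
                J[j, i] += p      # jump stays within the domain
            else:
                J[0, i] += p      # suprathreshold jump -> reset bin
    return J

# two PSP amplitudes with probabilities 0.7 and 0.3 (arbitrary values)
Jm = jump_matrix(200, 0.02, amplitudes=[0.1, 0.2], probs=[0.7, 0.3])
```

A non-trivial input correlation structure, or higher-order input correlations as in [31,46], would simply enter through a different choice of `amplitudes` and `probs`.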
Higher-order statistics on the output side are also compatible with our method, describing the joint response behavior of three or more neurons driven by shared input. Third-order correlations can be computed in practice, because the projections work in the same way, now with a 3D jump operator. This operator is again transformed with a basis derived from an SVD. The procedure is computationally more demanding, as additional paths need to be considered, although the scaling is not exponential. Assuming homogeneous shared input (the same jump amplitudes driven by shared input in all directions) leads to an expression similar to Eq. 37. Assuming joint stationarity of all three spike trains S_1(t), S_2(t) and S_3(t), we need to find the joint third moment of the spike train statistics. As shown above, second moments can be computed with our method (Fig. 9). In order to obtain the covariance function from the stationary 3D flux, the time evolution of the 2D conditional flux at times (τ_1, τ_2) is needed. A direct computation is demanding, as the numerical effort scales as O(N^6); however, the problem can be projected to the subspace, with a time-dependent coupling matrix Ω evolving as ∆_t Ω(t) = Q_12 Ω(t).
This form has advantages over finite-difference methods as suggested, e.g., in [23]. The computation of the third-order moment defined in Eq. 70 requires a solution of Eq. 72 at τ_2 to find the second conditional distribution P_{1|2|3}(τ_2). Then we need to find the 1D conditional distribution (e.g., for neuron 1) at t = τ_1 by solving a system similar to Eq. 23. This provides us with the conditional rate r_{1|2|3}(τ_1, τ_2), from which the third moment follows. Details of this computation have to be deferred to future work, though.
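The general idea behind conditional rates can be illustrated in a simplified 1D setting. The sketch below is our own toy construction, not the paper's operators: a substochastic transition matrix propagates a conditional membrane-potential distribution, the probability mass lost per step across threshold is re-injected at a (hypothetical) reset bin, and the conditional firing rate is read off as that flux per time step.

```python
import numpy as np

def conditional_rate_series(M_sub, reset_bin, p0, n_steps, dt):
    """Propagate a conditional membrane-potential distribution.

    M_sub is a substochastic transition matrix of the subthreshold
    dynamics (column sums <= 1); the missing mass per column is the
    per-step spike probability, which is re-injected at the reset bin.
    Returns the conditional rate r(k*dt) for k = 1..n_steps.
    Illustrative sketch only, not the operators used in the paper."""
    spike_prob = 1.0 - M_sub.sum(axis=0)   # per-column threshold flux
    p, rates = p0.copy(), []
    for _ in range(n_steps):
        fired = spike_prob @ p             # mass crossing threshold
        p = M_sub @ p
        p[reset_bin] += fired              # reset of fired trajectories
        rates.append(fired / dt)           # conditional rate in 1/time
    return np.array(rates)

# toy two-bin example: probability mass is conserved step by step
M_sub = np.array([[0.90, 0.00],
                  [0.05, 0.90]])
r = conditional_rate_series(M_sub, 0, np.array([1.0, 0.0]), 50, 0.1)
```

By construction, total probability (subthreshold mass plus re-injected spikes) is conserved at every step, which mirrors the probability-conserving Markovian discretization used in the paper.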

C. Boundary conditions and singularity
The joint membrane potential distribution has a singularity at the origin (V_1, V_2) = (0, 0) due to the coordinated reset caused by shared input spikes. There are also line discontinuities at V_1 = 0 and at V_2 = 0, again due to the reset boundary condition. These singularities are reflected in the right singular vectors X and Y, which is precisely why we selected them as a basis to expand operators and joint distributions.
We observed that a singularity (a δ-function) emerges when ∆V is small relative to ∆t. This is an issue already for the discrete 1D problem, and it is even more severe in 2D, as the amplitude of the singularity scales quadratically with ∆V.
This phenomenon occurs only if PSPs have a finite amplitude: as the PSP gets larger relative to ∆V, reset currents remain finite even in continuous time [24]. As a consequence, the continuum limit must be taken with care, in particular for c > 0.
The δ-singularity does not exist in the diffusion approximation [20]. However, the definition of the current at the origin fails there as well, since the derivative is discontinuous in both the V_1 and V_2 directions; the infinitesimal limit of the jump equation must be taken with care.
The jump equation itself is well-defined, as the flux at the boundary is non-local. However, the infinitesimal limit is problematic for correlated neurons (c > 0), as the flux is not defined at the boundary of the 2D domain.

D. Precision, computational efficiency and grid selection
The selection of an appropriate grid in space and time is crucial for correlation computations. The small residual offset between direct simulation and our new semi-analytical computation (cf. Figs. 8 and 9), for example, can be considered a discretization artifact. Although this issue would deserve a more systematic treatment, we report here some observations that can guide grid selection: (i) For discrete solutions of the heat equation based on a central difference scheme, convergence of the 1D time evolution requires ∆t/(∆V)² < τ/σ² [47]. A similar rule also applies in the 2D case considered here. In general, explicit discretization schemes for the second-order differential operators arising in the study of diffusion require positivity and stability conditions of the order ∆t = O(τ(∆V_1 + ∆V_2)²/σ²) [23]. In this work, we followed a discretization scheme that approximately conserves probability, a Markovian approximation [24]. However, we note once more that some grids may lead to a violation of the Markov property for too large ∆t, as a result of boundary effects; this may create issues when the largest eigenvalue exceeds 1.
(ii) To adequately reflect small expected bin counts (especially for c ≈ 1), one needs a larger ∆t and a smaller ∆V, which conflicts with rule (i). Moreover, we observe that a smaller ∆V for fixed ∆t leads to a better firing-rate approximation, up to a point (Fig. 11).
(iii) A finer grid requires more computational effort to achieve a smooth correlation function, and the SVD reduction does not alter this behavior. Other dependencies and limiting factors are indicated in Fig. 10. Two constraining factors are determined by the selected precision of the approximation. One is the extent of the jump distribution, which determines the number of terms to be accumulated (the size of the set J in Eq. 37); for a fixed grid and a selected precision, this number increases with σ_c. The second is the size of the SVD subspace: as c approaches 1 and C(τ) gets steeper, more singular components must be included.
In Fig. 10 we illustrate how coarse graining affects the shape of the cross-covariance function C(τ). Although the precision of the approximation is limited by the subspace projections implied by the SVD, the grid parameters ∆t and ∆V are the most important factors for getting the shape of the function right. However, for a fixed dimension of the SVD subspace, even the finest grid cannot capture the singularity at zero time lag (τ = 0).
The grid effectively limits the precision of the approximation due to the reduced number of degrees of freedom.
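As a rough illustration, the stability rule (i) above can be turned into a quick numerical check. The helper below is hypothetical (names and the example parameter values are ours, not from the paper's tables); it computes the largest time step satisfying ∆t/(∆V)² < τ/σ² and tests a candidate ∆t against it.

```python
def max_stable_dt(dV, tau, sigma):
    """Largest time step satisfying the explicit-scheme stability
    condition dt / dV**2 < tau / sigma**2 from rule (i).
    A heuristic guide only; it does not guarantee that the Markov
    property of the discretized dynamics is preserved."""
    return tau * dV**2 / sigma**2

def is_stable(dt, dV, tau, sigma):
    """Check a candidate time step against rule (i)."""
    return dt / dV**2 < tau / sigma**2

# hypothetical parameter values: dV in mV, tau in ms, sigma in mV
dt_max = max_stable_dt(dV=0.02, tau=10.0, sigma=1.0)
```

Note that, as stated in rule (ii), this bound pulls in the opposite direction from the bin-count requirement, so in practice the grid is a compromise between the two.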

V. CONCLUSION
We developed a novel numerical method to compute the joint statistics and correlation functions of two LIF model neurons driven by shared input. Our approach can deal with the full range of input correlations c, and the expansion converges quickly. Moreover, the method generalizes readily to other biologically relevant scenarios.
We observed in previous work [20] that low output firing rates generally require a non-perturbative treatment. If output rates are high, in contrast, and for high input firing rates with small PSPs (the diffusion regime), the approximation derived from linear response theory [37] is reasonably precise. We conclude that it is possible to compute correlation functions (as opposed to estimating them from simulations) for a wide range of models with finite PSP amplitudes, and also for a wide range of parallel spike-train input models. Although there is currently no conclusive theory for the selection of an appropriate spatio-temporal grid, we were able to devise some heuristics; even the precision of the first moment (the firing rate) depends on the grid.
Specifically, if c is close to 1, the temporal component of the correlation function resembles a delta function; to capture this, the grid must be sufficiently fine.
The innovation in our work is not only the formulation of correlation functions based on a Markov chain approximation, but also the dimensional reduction, which allows us to compute joint membrane potential distributions efficiently. We showed that the number of SVD components needed to represent the single-neuron dynamical evolution matrices is small. This also means that the computations can presumably be generalized to higher-order correlations with only moderately increased computational effort.
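The claim that few SVD components suffice can be illustrated with a toy model. The sketch below (our own construction; it omits drift, threshold and reset, which the paper's evolution matrices include) builds a 1D random-walk propagator over many time steps on a membrane-potential grid and counts the singular values above a tolerance: the effective rank is far below the grid size.

```python
import numpy as np

# toy 1D diffusion propagator on an N-bin potential grid
# (illustrative only; not the paper's evolution operator)
N = 100
M = np.zeros((N, N))
for i in range(N):
    M[i, i] = 0.5                       # probability of staying put
    if i > 0:
        M[i - 1, i] = 0.25              # hop down one bin
    if i < N - 1:
        M[i + 1, i] = 0.25              # hop up one bin
M[0, 0] += 0.25                         # reflecting boundaries keep
M[N - 1, N - 1] += 0.25                 # columns stochastic

P = np.linalg.matrix_power(M, 200)      # propagator over 200 steps
s = np.linalg.svd(P, compute_uv=False)
n_keep = int(np.sum(s > 1e-6 * s[0]))   # components above tolerance
# n_keep is much smaller than N, so a truncated SVD basis suffices
```

The smoothing action of the propagator is what makes the truncation effective; the same mechanism underlies the fast convergence of the expansion reported above.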