Multivariate autoregressive model
Granger causality was originally defined for two channels; however, in a later study [17] Granger pointed out that the causality principle holds only if no other channels influence the process. To account for the whole multivariate structure of a k-channel process, the multivariate autoregressive (MVAR) model has to be considered.
For a multivariate k-channel process X(t):
$$ {\bf X}(t) = (X_{1} (t), \, X_{2} (t), \ldots , \, X_{k} (t)). $$
(5)
The model takes the form
$$ {\bf X}(t) = \sum\limits_{j = 1}^{p} {\bf A} (j){\bf X}(t - j) + {\bf E}(t), $$
(6)
where E(t) is a vector of size k, the coefficients A(j) are k × k matrices, and p is the model order.
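As an illustration of Eq. 6, the following minimal sketch simulates a small MVAR process and recovers its coefficients by ordinary least squares. The function names (`simulate_mvar`, `fit_mvar`), the example coefficients and the choice of least squares are assumptions made for this example; the text itself does not prescribe a particular estimation algorithm.

```python
import numpy as np

def simulate_mvar(A, n_samples, rng, burn_in=200):
    """Generate X(t) = sum_j A(j) X(t-j) + E(t) with unit-variance white noise E.

    A has shape (p, k, k); A[j-1] is the matrix A(j) of Eq. 6.
    """
    p, k, _ = A.shape
    total = n_samples + burn_in
    E = rng.standard_normal((total, k))
    X = np.zeros((total, k))
    for t in range(p, total):
        X[t] = E[t] + sum(A[j - 1] @ X[t - j] for j in range(1, p + 1))
    return X[burn_in:]  # drop the initial transient

def fit_mvar(X, p):
    """Least-squares estimate of A(1..p) and of the residual covariance."""
    n, k = X.shape
    # Row r of Z holds [X(t-1), X(t-2), ..., X(t-p)] for t = p + r.
    Z = np.hstack([X[p - j:n - j] for j in range(1, p + 1)])
    Y = X[p:]
    B, *_ = np.linalg.lstsq(Z, Y, rcond=None)      # shape (p*k, k)
    A = B.T.reshape(k, p, k).transpose(1, 0, 2)    # A[j-1][i, m] = A_im(j)
    resid = Y - Z @ B
    V = resid.T @ resid / (n - p)
    return A, V

# Hypothetical 2-channel, order-2 example: channel 1 drives channel 2.
A_true = np.array([[[0.5, 0.0], [0.4, 0.3]],
                   [[-0.2, 0.0], [0.0, 0.0]]])
X = simulate_mvar(A_true, 5000, np.random.default_rng(0))
A_est, V_est = fit_mvar(X, p=2)
```

With enough samples, `A_est` should be close to `A_true` and `V_est` close to the identity matrix used for the simulated noise.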
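Equation 6 can be easily transformed to describe relations in the frequency domain. After moving the autoregressive terms to the left-hand side (which changes the sign of A(j) for j ≥ 1 and sets A(0) equal to the identity matrix) and applying the Z transform we get: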
$$ \begin{aligned} {\bf E}(f) &= {\bf A}(f){\bf X}(f) \hfill \\ {\bf X}(f) &= {\bf A}^{ - 1} (f) {\bf E}(f) = {\bf H}(f){\bf E}(f) \hfill \\ {\bf H}(f) &= \left( {\sum\limits_{m = 0}^{p} {{\bf A}(m)\exp ( - 2\pi imf\Updelta t)} } \right)^{ - 1} . \hfill \\ \end{aligned} $$
(7)
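A minimal sketch of how the transfer matrix of Eq. 7 could be evaluated numerically is given here. It assumes the sign convention A(0) = I and A(m) replaced by -A(m) for m ≥ 1; the function name `transfer_matrix` and the parameter `dt` (sampling interval Δt) are illustrative choices.

```python
import numpy as np

def transfer_matrix(A, freqs, dt=1.0):
    """Return H(f) for each frequency; A has shape (p, k, k), A[m-1] = A(m) of Eq. 6."""
    p, k, _ = A.shape
    H = np.empty((len(freqs), k, k), dtype=complex)
    for n, f in enumerate(freqs):
        Af = np.eye(k, dtype=complex)            # the m = 0 term, A(0) = I
        for m in range(1, p + 1):
            # sign of A(m) changed, as described for Eq. 7
            Af -= A[m - 1] * np.exp(-2j * np.pi * m * f * dt)
        H[n] = np.linalg.inv(Af)
    return H

# Hypothetical order-1 example in which channel 1 drives channel 2.
A = np.array([[[0.5, 0.0],
               [0.4, 0.3]]])
freqs = np.linspace(0.0, 0.5, 33)   # up to the Nyquist frequency for dt = 1
H = transfer_matrix(A, freqs)
```

Because channel 2 does not influence channel 1 in this example, the element `H[:, 0, 1]` vanishes at every frequency, while `H[:, 1, 0]` does not.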
From the form of the above equations we see that the model can be considered as a linear filter with white noise E(f) at its input and the signals X(f) at its output. The matrix of filter coefficients H(f) is called the transfer matrix of the system. It contains information about all relations between data channels in the given set, including the phase relations between signals. From the transfer matrix, cross-spectra and partial coherences can be found. Partial coherence is given by the formula:
$$ C_{ij} (f) = {\frac{{{\bf M}_{ij} (f)}}{{\sqrt {{\bf M}_{ii} (f){\bf M}_{jj} (f)} }}}, $$
(8)
where $M_{ij}$ is a minor of the spectral matrix (the matrix of spectra and cross-spectra) with the i-th row and j-th column removed. Partial coherence is non-zero only when the given relation between channels is direct. If a signal in a given channel can be explained by a linear combination of some other signals of the set, the partial coherence between them will be low.
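The minor-based formula of Eq. 8 can be sketched as follows for a single frequency. The toy spectral matrix is a hypothetical, real-valued example corresponding to a chain 1 → 2 → 3 (each channel equals the previous one plus independent unit-variance noise); the function names are illustrative.

```python
import numpy as np

def minor(S, i, j):
    """Determinant of S with row i and column j removed."""
    sub = np.delete(np.delete(S, i, axis=0), j, axis=1)
    return np.linalg.det(sub)

def partial_coherence(S):
    """C_ij = M_ij / sqrt(M_ii * M_jj) for one spectral matrix S (Eq. 8)."""
    k = S.shape[0]
    M = np.array([[minor(S, i, j) for j in range(k)] for i in range(k)])
    d = np.sqrt(np.real(np.diag(M)))   # diagonal minors are real and positive
    return M / np.outer(d, d)

# Toy spectral matrix for the chain 1 -> 2 -> 3 (indices are 0-based in the code).
S = np.array([[1.0, 1.0, 1.0],
              [1.0, 2.0, 2.0],
              [1.0, 2.0, 3.0]])
C = partial_coherence(S)

# Ordinary coherence between channels 1 and 3, for comparison.
ordinary_13 = S[0, 2] / np.sqrt(S[0, 0] * S[2, 2])
```

In this example the ordinary coherence between channels 1 and 3 is clearly non-zero, whereas the partial coherence `C[0, 2]` is zero, because channel 3 is fully explained by channel 2. The directly connected pairs keep non-zero partial coherence (`C[0, 1]` equals 0.5).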
The estimation of the MVAR coefficients is based on the calculation of the covariance matrix; therefore additional correlations between channels should not be introduced. The signals should be referenced to a channel that is not involved in the model estimation (e.g., “linked ears”). Common average, bipolar derivation or the Hjorth transform must not be used, since they disturb the correlation structure between the signals. The mean value of each signal should be subtracted, and it is recommended to divide each signal by the square root of its variance.
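The normalization recommended above (mean subtraction and division by the square root of the variance) can be sketched as follows; the function name `standardize` and the samples-by-channels array layout are assumptions of this example.

```python
import numpy as np

def standardize(X):
    """Subtract each channel's mean and divide by the square root of its variance.

    X has shape (n_samples, k); each column is one channel.
    """
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0)

# Hypothetical data: three channels with different offsets and amplitudes.
rng = np.random.default_rng(1)
raw = rng.standard_normal((1000, 3)) * [1.0, 5.0, 20.0] + [0.0, 3.0, -7.0]
Xn = standardize(raw)
```

Note that this step acts on each channel separately, so, unlike re-referencing schemes such as the common average, it does not mix channels.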
The MVAR model acts as a kind of filter that separates noise from the signal. This property follows directly from Eq. 7. Therefore, MVAR is especially suitable for the analysis of noisy data. For excessively smoothed time series, in which the random component is suppressed, difficulties in fitting the model may occur. The AR spectral estimates have better statistical properties than FFT estimates, which is easy to see by comparing smooth AR power spectral estimates with the fluctuating estimates obtained by means of the Fourier transform. Owing to the properties of the model, the measures of connectivity derived from the MVAR are also very robust with respect to noise. It was reported in [20] that for a 3-channel model the propagations were correctly estimated by means of DTF when the amplitude of the noise was three times as large as the signal itself. For biomedical time series, where the contribution of noise is quite high, estimates of connectivity based on MVAR are recommended.
Directed transfer function
Based on the properties of the transfer function of MVAR, DTF was introduced [20] in the form:
$$ {\text{DTF}}_{j \to i}^{2} (f) = {\frac{{\left| {H_{ij} (f)} \right|^{2} }}{{\sum\limits_{m = 1}^{k} {\left| {H_{im} (f)} \right|^{2} } }}}. $$
(9)
The DTF describes the causal influence of channel j on channel i at frequency f. The above equation defines a normalized version of DTF, which takes values from 0 to 1 and expresses the ratio of the inflow from channel j to channel i to all the inflows to channel i.
The non-normalized DTF, which is directly related to the coupling strength [22], is defined as:
$$ \theta_{ij}^{2} (f) = \left| {H_{ij} (f)} \right|^{2} . $$
(10)
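Equations 9 and 10 can be sketched numerically as follows. The helper `transfer_matrix` repeats the definition of H(f) from Eq. 7 (with A(0) = I and the sign of A(m) changed for m ≥ 1); the function names and the example coefficients are assumptions made for illustration.

```python
import numpy as np

def transfer_matrix(A, freqs, dt=1.0):
    """H(f) of Eq. 7 for MVAR coefficients A of shape (p, k, k)."""
    p, k, _ = A.shape
    phase = np.exp(-2j * np.pi * np.outer(freqs, np.arange(1, p + 1)) * dt)
    Af = np.eye(k) - np.einsum('nm,mij->nij', phase, A)
    return np.linalg.inv(Af)          # inverts each k x k matrix separately

def dtf(H):
    """Return (normalized DTF^2 of Eq. 9, non-normalized theta^2 of Eq. 10)."""
    theta2 = np.abs(H) ** 2
    # For each target channel i, normalize over all source channels m.
    dtf2 = theta2 / theta2.sum(axis=2, keepdims=True)
    return dtf2, theta2

# Hypothetical order-1 model: channel 1 drives channel 2, no flow back.
A = np.array([[[0.5, 0.0],
               [0.6, 0.5]]])
freqs = np.linspace(0.0, 0.5, 65)
dtf2, theta2 = dtf(transfer_matrix(A, freqs))
```

Here `dtf2[:, i, j]` holds the normalized flow from channel j to channel i (0-based indices), so each row sums to 1 at every frequency. In this example `dtf2[:, 1, 0]` is positive while `dtf2[:, 0, 1]` is zero.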
The DTF has found many applications, e.g., localization of epileptic foci [14], estimation of EEG propagation in different sleep stages and wakefulness [21], determination of transmission between brain structures of an animal during a behavioral test [24], estimation of cortical connectivity [2], [3], and many others.
The DTF shows not only direct but also cascade flows; namely, in the case of propagation 1 → 2 → 3 it also shows propagation 1 → 3. In order to distinguish direct from indirect flows, the direct Directed Transfer Function (dDTF) was introduced [25].
The dDTF is defined as the product of a modified DTF and partial coherence. The modification of the DTF consists in normalizing the function in such a way that the denominator is independent of frequency. The dDTF, $\chi_{ij}(f)$, showing direct propagation from channel j to i, is defined as:
$$ \begin{aligned} \chi_{ij}^{2} (f) &= F_{ij}^{2} (f)C_{ij}^{2} (f) \hfill \\ F_{ij}^{2} (f) &= {\frac{{\left| {H_{ij} (f)} \right|^{2} }}{{\sum\limits_{f} {\sum\limits_{m = 1}^{k} {\left| {H_{im} (f)} \right|^{2} } } }}}, \hfill \\ \end{aligned} $$
(11)
where $C_{ij}(f)$ is the partial coherence. $\chi_{ij}(f)$ has a non-zero value when both functions $F_{ij}^{2}(f)$ and $C_{ij}^{2}(f)$ are non-zero; in that case there exists a direct causal relation between channels j → i.
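The following sketch evaluates Eq. 11 for a hypothetical chain 1 → 2 → 3. Two assumptions are made that are not spelled out in the text: the spectral matrix is computed with the standard relation S(f) = H(f) V H(f)^†, where V denotes the noise covariance (a symbol introduced here), and the squared partial coherence is obtained from the inverse of S(f), which has the same magnitude as the minor-based formula of Eq. 8.

```python
import numpy as np

def transfer_matrix(A, freqs, dt=1.0):
    """H(f) of Eq. 7 for MVAR coefficients A of shape (p, k, k)."""
    p, k, _ = A.shape
    phase = np.exp(-2j * np.pi * np.outer(freqs, np.arange(1, p + 1)) * dt)
    return np.linalg.inv(np.eye(k) - np.einsum('nm,mij->nij', phase, A))

def ddtf(H, V):
    """Return (chi^2, F^2, C^2) of Eq. 11, each of shape (n_freqs, k, k)."""
    S = H @ V @ np.conj(np.swapaxes(H, 1, 2))      # spectral matrix per frequency
    P = np.linalg.inv(S)
    d = np.real(np.einsum('nii->ni', P))           # diagonal of the inverse
    C2 = np.abs(P) ** 2 / (d[:, :, None] * d[:, None, :])
    theta2 = np.abs(H) ** 2
    # Denominator: sum over all frequencies and all sources m, per target i.
    F2 = theta2 / theta2.sum(axis=(0, 2))[None, :, None]
    return F2 * C2, F2, C2

# Hypothetical chain 1 -> 2 -> 3 with unit-variance, uncorrelated noise.
A = np.array([[[0.5, 0.0, 0.0],
               [0.6, 0.5, 0.0],
               [0.0, 0.6, 0.5]]])
freqs = np.linspace(0.0, 0.5, 64)
chi2, F2, C2 = ddtf(transfer_matrix(A, freqs), np.eye(3))
```

In this example the modified DTF `F2[:, 2, 0]` is non-zero, reflecting the cascade flow 1 → 3, while `chi2[:, 2, 0]` vanishes because the partial coherence between channels 1 and 3 is zero. The direct connections 1 → 2 and 2 → 3 keep non-zero dDTF values.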
Distinguishing direct from indirect transmission is essential in the case of signals from implanted electrodes; for EEG signals recorded by scalp electrodes it is not really important [27].
The DTF and dDTF show propagation only when there is a phase difference between signals from different derivations; otherwise their value is zero. Volume conduction is a zero-phase propagation and therefore generates no phase difference between channels, so in theory it should not have any influence on DTF results. In practice it has some minor influence, e.g., it increases the noise level; however, this influence is not critical and is much less important than in the case of other methods.
In [2] functional connectivity was evaluated by applying DTF to the cortical signals estimated by means of the linear inverse procedure [18]. The procedure returned the amplitude values of EEG on the cortex; however, the phases of the signals were changed by the inverse procedure, which influenced the results. The results therefore show the causal dependencies between the cortical signals, not exactly the direction of the propagating EEG activity.
PDC
The partial directed coherence (PDC) was defined by Baccala and Sameshima in 2001 [4] in the following form:
$$ P_{ij} (f) = {\frac{{A_{ij} (f)}}{{\sqrt {{\bf a}_{j}^{*} (f){\bf a}_{j} (f)} }}}. $$
(12)
In the above equation, $A_{ij}(f)$ is an element of A(f), the Fourier transform of the MVAR model coefficients A(t); ${\bf a}_{j}(f)$ is the j-th column of A(f), and the asterisk denotes the transpose and complex conjugate operation. Although PDC is a function operating in the frequency domain, the dependence of A(f) on frequency does not correspond directly to the power spectrum. From the normalization condition it follows that the magnitude of PDC takes values in the interval [0, 1]. PDC shows only direct flows between channels. Unlike DTF, PDC is normalized to show the ratio of the outflow from channel j to channel i to all the outflows from the source channel j, so it emphasizes the sinks rather than the sources.
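A minimal sketch of Eq. 12 is shown here for a hypothetical chain 1 → 2 → 3. It assumes that A(f) is built with the same convention as in Eq. 7 (A(0) = I, sign of A(m) changed for m ≥ 1); the function name `pdc` and the example coefficients are illustrative.

```python
import numpy as np

def pdc(A, freqs, dt=1.0):
    """PDC of Eq. 12; A has shape (p, k, k). Result has shape (n_freqs, k, k)."""
    p, k, _ = A.shape
    phase = np.exp(-2j * np.pi * np.outer(freqs, np.arange(1, p + 1)) * dt)
    Af = np.eye(k) - np.einsum('nm,mij->nij', phase, A)
    # sqrt(a_j^* a_j): Euclidean norm of column j of A(f), one value per source j.
    col_norm = np.sqrt((np.abs(Af) ** 2).sum(axis=1, keepdims=True))
    return Af / col_norm

# Hypothetical chain 1 -> 2 -> 3 (order 1).
A = np.array([[[0.5, 0.0, 0.0],
               [0.6, 0.5, 0.0],
               [0.0, 0.6, 0.5]]])
freqs = np.linspace(0.0, 0.5, 64)
P = pdc(A, freqs)
```

Here `P[:, i, j]` describes the flow from channel j to channel i (0-based indices). The squared magnitudes in each column sum to 1, which is the normalization over the outflows from source j. The element `P[:, 2, 0]` is zero because there is no direct connection 1 → 3, while `P[:, 1, 0]` and `P[:, 2, 1]` are non-zero.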
In neurophysiological applications the sources, rather than the sinks, are of primary interest; therefore an estimator called Generalized Partial Directed Coherence (GPDC) was later proposed [5], in which a normalization factor in the denominator similar to the one applied in DTF was introduced. GPDC is given by the formula:
$$ {\text{GPDC}}_{j \to i} (f) = {\frac{{A_{ij} (f)}}{{\sum\limits_{i = 1}^{k} {\left| {A_{ij} (f)} \right|}^{2} }}}. $$
(13)
Independently, a normalization similar to the one given by formula (13) was proposed in [38]. It was pointed out in [38] that PDC without renormalization has several drawbacks, namely: (i) PDC is decreased when multiple signals are emitted from a given source, (ii) PDC is not scale-invariant, since it depends on the units of measurement of the source and target processes, and (iii) PDC does not allow conclusions on the absolute strength of the coupling. These disadvantages are alleviated in the case of GPDC. PDC and GPDC, similarly to DTF, are insensitive to volume conduction.
Bivariate versus multivariate estimators of connectivity
The differences between multivariate and bivariate estimators of connectivity may be illustrated by a simple example. Let us consider the common situation in which a signal is measured at different distances from the source. Such a case corresponds to the simulation scheme shown at the top of Fig. 1. The EEG signal from channel 1 propagates with different delays D to channels 2, 3, 4, and 5. At each step random white noise is added. For a bivariate measure of connectivity (DTF based on a two-channel AR model), flows of activity are observed in every case where there is a delay between two channels. This observation is true for any bivariate measure, no matter how it is calculated [7], [27]. In contrast, for DTF estimated from an MVAR model encompassing all channels, only the propagation from channel 1 is observed, in agreement with the simulation scheme.
It is therefore not surprising that bivariate measures yield very dense patterns of propagation, e.g., [11], in which it is practically impossible to find the sources of propagation. In contrast, the DTF results usually show a few significant sites from which activity is propagating, e.g., [8], [9], [14], [21], [28], [29]. This point is also illustrated in Fig. 2. More examples comparing different methods of directionality estimation, including an application to real experimental data, may be found in [27]. There, transmission patterns of EEG were estimated for the awake state with eyes closed. It is known that in this state the activity propagates from the posterior structures of the brain, with some weaker sources in front. This kind of pattern was found by means of DTF calculated from MVAR. For the bivariate measure the pattern was disorganized, and even a reversal of propagation was observed [27].
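The common-source effect can be reproduced with a simplified, hypothetical three-channel simulation (not the five-channel scheme of Fig. 1): channel 1 feeds channel 2 with a delay of one sample and channel 3 with a delay of two samples, and there is no connection between channels 2 and 3. The coefficients, the model order and the least-squares fit are assumptions of this sketch.

```python
import numpy as np

def fit_mvar(X, p):
    """Least-squares MVAR coefficients, shape (p, k, k)."""
    n, k = X.shape
    Z = np.hstack([X[p - j:n - j] for j in range(1, p + 1)])
    B, *_ = np.linalg.lstsq(Z, X[p:], rcond=None)
    return B.T.reshape(k, p, k).transpose(1, 0, 2)

def dtf2(A, freqs, dt=1.0):
    """Normalized DTF^2 (Eq. 9) from MVAR coefficients."""
    p, k, _ = A.shape
    phase = np.exp(-2j * np.pi * np.outer(freqs, np.arange(1, p + 1)) * dt)
    H = np.linalg.inv(np.eye(k) - np.einsum('nm,mij->nij', phase, A))
    power = np.abs(H) ** 2
    return power / power.sum(axis=2, keepdims=True)

rng = np.random.default_rng(0)
n = 20000
e = rng.standard_normal((n, 3))
x1 = np.zeros(n)
for t in range(1, n):
    x1[t] = 0.6 * x1[t - 1] + e[t, 0]     # source signal
x2 = e[:, 1].copy()
x3 = e[:, 2].copy()
x2[1:] += 0.8 * x1[:-1]                    # channel 1 -> 2, delay 1
x3[2:] += 0.8 * x1[:-2]                    # channel 1 -> 3, delay 2
X = np.column_stack([x1, x2, x3])
X = (X - X.mean(axis=0)) / X.std(axis=0)   # normalize each channel

freqs = np.linspace(0.0, 0.5, 65)
full = dtf2(fit_mvar(X, 2), freqs)             # model with all three channels
pair = dtf2(fit_mvar(X[:, [1, 2]], 2), freqs)  # model with channels 2 and 3 only

flow_full = full[:, 2, 1].mean()   # 2 -> 3 in the multivariate model
flow_pair = pair[:, 1, 0].mean()   # 2 -> 3 in the bivariate model
```

In this sketch `flow_pair` is substantial although no connection 2 → 3 exists, because channel 2 receives the common signal earlier than channel 3. In the full model `flow_full` stays close to zero, since the delayed activity of channel 1 already explains channel 3.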
The DTF and PDC first found applications in the analysis of EEG or ECoG. More recently, methods based on MVAR have also started to be applied to fMRI signals [36] and to multimodal integration of EEG and fMRI [3]. However, in the case of fMRI signals, because of the poor time resolution, only the main connections acting during a whole task may be identified. The fast dynamical transmissions involved in information processing may be identified by means of time-varying estimators based on MVAR.