Effective Connectivity for Investigating the Neurophysiological Basis of Cognitive Processes



$$ Y(t)={\left[{y}_1(t),{y}_2(t),\dots, {y}_N(t)\right]}^T $$

(1)

where t refers to time and N is the number of electrodes or cortical areas considered.

Supposing that the following MVAR process is an adequate description of the dataset Y:



$$ \sum_{k=0}^{p}\Lambda(k)\,Y\left(t-k\right)=E(t)\kern1em \text{with}\ \Lambda(0)=I, $$

(2)
where Y(t) is the data vector in time, E(t) = [e_1(t), …, e_N(t)]^T is a vector of zero-mean uncorrelated white noise processes, Λ(1), Λ(2), …, Λ(p) are the N × N matrices of model coefficients, and p is the model order, usually chosen by means of the Akaike Information Criterion (AIC) for MVAR processes [23].
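As a minimal numerical sketch of this estimation step (the function names, the least-squares fitting route, and the particular AIC form `ln det(Σ_p) + 2pN²/T` are illustrative assumptions, not prescribed by the text):

```python
import numpy as np

def fit_mvar(Y, p):
    """Least-squares fit of an MVAR(p) model.  With the convention of Eq. 2,
    Y(t) = -sum_{k=1..p} Lambda(k) Y(t-k) + E(t), so the returned A[k-1]
    equals -Lambda(k).  Y has shape (N, T): N channels, T time samples."""
    N, T = Y.shape
    # Regressors: the p past samples of every channel, stacked lag by lag
    X = np.vstack([Y[:, p - k:T - k] for k in range(1, p + 1)])   # (N*p, T-p)
    target = Y[:, p:]                                             # (N, T-p)
    B, *_ = np.linalg.lstsq(X.T, target.T, rcond=None)            # (N*p, N)
    A = B.T.reshape(N, p, N).transpose(1, 0, 2)                   # (p, N, N)
    resid = target - B.T @ X
    Sigma = resid @ resid.T / (T - p)        # residual noise covariance
    return A, Sigma

def aic_order(Y, p_max):
    """Model order minimizing a common MVAR form of the AIC:
    AIC(p) = ln det(Sigma_p) + 2 p N^2 / T."""
    N, T = Y.shape
    scores = []
    for p in range(1, p_max + 1):
        _, Sigma = fit_mvar(Y, p)
        _, logdet = np.linalg.slogdet(Sigma)
        scores.append(logdet + 2 * p * N**2 / T)
    return 1 + int(np.argmin(scores))
```

On a long simulated MVAR(1) series, `fit_mvar(Y, 1)` recovers the generating coefficient matrix to within sampling error.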

Once an MVAR model is adequately estimated, it becomes the basis for subsequent spectral analysis. To investigate the spectral properties of the examined process, Eq. 2 is transformed to the frequency domain:



$$ \Lambda (f) Y(f)= E(f) $$

(3)
where:



$$ \Lambda(f)=\sum_{k=0}^{p}\Lambda(k)\,\mathrm{e}^{-j2\pi f\Delta t k} $$

(4)
and Δt is the temporal interval between two samples.
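Eq. 4 can be evaluated directly as a weighted sum of the coefficient matrices; the sketch below assumes the coefficients are supplied as an array of shape (p + 1, N, N) with the identity in position 0 (function and argument names are illustrative):

```python
import numpy as np

def lambda_f(coeff, freqs, dt):
    """Frequency-domain transform of the MVAR coefficients (Eq. 4):
    Lambda(f) = sum_{k=0..p} Lambda(k) exp(-j 2 pi f dt k).
    `coeff` has shape (p+1, N, N) with coeff[0] = I, as required by Eq. 2."""
    k = np.arange(coeff.shape[0])
    out = []
    for f in np.atleast_1d(freqs):
        phase = np.exp(-2j * np.pi * f * dt * k)        # one factor per lag
        out.append(np.tensordot(phase, coeff, axes=1))  # sum_k phase[k]*coeff[k]
    return np.array(out)                                # (F, N, N)
```

At f = 0 every phase factor equals 1, so Λ(0) reduces to the plain sum of the coefficient matrices, a convenient sanity check.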



2.2 Partial Directed Coherence


The PDC [10] is a fully multivariate spectral measure, used to determine the directed influences between pairs of signals in a multivariate data set. PDC is a frequency-domain representation of the multivariate relationships between simultaneously analyzed time series that allows the inference of functional relationships between them. This estimator was demonstrated to be a frequency version of the concept of Granger causality [1], according to which a time series x[n] can be said to have an influence on another time series y[n] if the knowledge of past samples of x significantly reduces the prediction error for the present sample of y.

It is possible to define PDC as



$$ \pi_{ij}(f)=\frac{\Lambda_{ij}(f)}{\sqrt{\sum_{k=1}^{N}\Lambda_{kj}(f)\Lambda_{kj}^{*}(f)}} $$

(5)
which owes its name to its relation to the well-known concept of partial coherence [24].

The PDC from node j to node i, π_ij(f), describes the directional flow of information from signal y_j[n] to y_i[n]; the common effects produced by the other signals y_k[n] on the latter are subtracted, leaving a description of the influence that is exclusively from y_j[n] to y_i[n]. Squared PDC values lie in the interval [0, 1] and the normalization condition



$$ \sum_{n=1}^{N}{\left|\pi_{nj}(f)\right|}^2=1 $$

(6)
holds. According to this normalization, π_ij(f) represents the fraction of the information flow of node j directed to node i, compared with all of j's interactions with the other nodes.
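Given Λ(f), the column-normalized PDC of Eq. 5 is a one-line computation, and the normalization of Eq. 6 can be verified numerically (a sketch; the array layout is an assumption):

```python
import numpy as np

def pdc(Lf):
    """Column-normalized PDC (Eq. 5).  `Lf` is Lambda(f) with shape
    (F, N, N); entry (f, i, j) of the result is pi_ij at frequency f."""
    # Denominator: sqrt of the column power, sum over k of |Lambda_kj(f)|^2
    norm = np.sqrt(np.sum(np.abs(Lf) ** 2, axis=1, keepdims=True))
    return Lf / norm
```

For any Λ(f), the squared PDC values summed over each column equal 1, exactly the condition stated in Eq. 6.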

Although this formulation derives directly from information theory, the original definition was modified in order to give a better physiological interpretation to the results achieved on electrophysiological data. In particular, a new type of normalization, already used for another connectivity estimator, the DTF [8], was introduced. This normalization consists in dividing each estimated value of PDC by the square root of the sum of squares of all the elements of the corresponding row, yielding the following definition:



$$ \pi_{ij}^{\mathrm{row}}(f)=\frac{\Lambda_{ij}(f)}{\sqrt{\sum_{k=1}^{N}\Lambda_{ik}(f)\Lambda_{ik}^{*}(f)}} $$

(7)

In this formulation too, squared PDC values lie in the range [0, 1], but the normalization condition becomes:



$$ \sum_{n=1}^{N}{\left|\pi_{in}^{\mathrm{row}}(f)\right|}^2=1 $$

(8)

Moreover, a squared formulation of PDC has been introduced and can be defined as follows for the two types of normalization:



$$ \mathrm{sPDC}_{ij}^{\mathrm{col}}(f)=\frac{{\left|\Lambda_{ij}(f)\right|}^2}{\sum_{k=1}^{N}{\left|\Lambda_{kj}(f)\right|}^2} $$

(9)




$$ \mathrm{sPDC}_{ij}^{\mathrm{row}}(f)=\frac{{\left|\Lambda_{ij}(f)\right|}^2}{\sum_{k=1}^{N}{\left|\Lambda_{ik}(f)\right|}^2} $$

(10)

The main difference with respect to the original formulation lies in the interpretation of these estimators. Squared PDC can be related to the power density of the investigated signals and can be interpreted as the fraction of the ith signal's power density due to the jth signal. The higher performance of the squared methods with respect to simple PDC has been demonstrated in a simulation study [25]. This study revealed higher accuracy for the methods based on the squared formulation of PDC (i) in the estimation of connectivity patterns on data characterized by different lengths and signal-to-noise ratios (SNRs) and (ii) in the distinction between direct and indirect pathways.
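Both squared variants (Eqs. 9 and 10) differ only in the axis over which the power is normalized, which a short sketch makes explicit (array layout and function name are illustrative):

```python
import numpy as np

def spdc(Lf, norm="col"):
    """Squared PDC (Eqs. 9 and 10).  `Lf` is Lambda(f), shape (F, N, N);
    `norm` selects the column (Eq. 9) or row (Eq. 10) normalization."""
    power = np.abs(Lf) ** 2
    # sum over k of |Lambda_kj|^2 (columns) or |Lambda_ik|^2 (rows)
    axis = 1 if norm == "col" else 2
    return power / np.sum(power, axis=axis, keepdims=True)
```

Column-normalized sPDC sums to 1 over each column, row-normalized sPDC over each row, mirroring Eqs. 6 and 8.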


2.3 Adaptive Partial Directed Coherence


The original formulation of PDC is based on the hypothesis that the signals included in the estimation process are stationary. This hypothesis entails a complete loss of information about the temporal evolution of the estimated information flows.

To overcome this limitation, a time-varying adaptation of squared PDC was introduced [25]. The adaptation consists in modifying the original formulation of PDC by including time dependence in the MVAR coefficients. Thus, the adaptive squared PDC estimator can be defined as follows:



$$ \mathrm{sPDC}_{ij}^{\mathrm{row}}(f,t)=\frac{{\left|\Lambda_{ij}(f,t)\right|}^2}{\sum_{k=1}^{N}{\left|\Lambda_{ik}(f,t)\right|}^2} $$

(11)




$$ \mathrm{sPDC}_{ij}^{\mathrm{col}}(f,t)=\frac{{\left|\Lambda_{ij}(f,t)\right|}^2}{\sum_{k=1}^{N}{\left|\Lambda_{kj}(f,t)\right|}^2}, $$

(12)
where t refers to the time dependence of the MVAR coefficients and Λ_ij(f, t) represents the ij entry of the matrix of model coefficients Λ at frequency f and time t.

The estimation of time-varying MVAR parameters can be performed by means of two approaches currently available: the recursive least squares (RLS) algorithm and the General Linear Kalman Filter (GLKF). The GLKF, whose higher accuracy in following the temporal dynamics of the investigated connectivity patterns, even in the presence of a high number of sources, has already been demonstrated [22], is described below.


2.4 The General Linear Kalman Filter


In the GLKF an adaptation of the Kalman filter to the case of multi-trial time series is provided. In particular:



$$ \begin{array}{l} Q_t=G_{t-1}Q_{t-1}+V_t \\ O_t=H_t Q_t+W_t \end{array} $$

(13)
where O_t represents the observation, Q_t is the state process, H_t and G_t are the transition matrices, and V_t and W_t are the additive noises. To obtain the connection with the time-varying MVAR model, it is necessary to make the following associations:



$$ Q_t=\left[\begin{array}{c} \Lambda_1(t)^N \\ \vdots \\ \Lambda_p(t)^N \end{array}\right],\qquad O_t=\left(\begin{array}{ccc} y_1^{(1)}(t) & \cdots & y_M^{(1)}(t) \\ \vdots & \ddots & \vdots \\ y_1^{(N)}(t) & \cdots & y_M^{(N)}(t) \end{array}\right)=Y_t $$

(14)




$$ G_{t-1}=I_{dp},\qquad H_t=\left( O_{t-1},\ \cdots,\ O_{t-p} \right), $$

(15)
where N denotes the number of trials and M is the dimension of the measured process [19]. In particular, for t = p + 1, …, T the following steps are repeated:



$$ {Q}_{t-1}^{+}={Q}_{t-1}+{K}_t\left({O}_t-{H}_t{Q}_{t-1}\right) $$

(16)




$$ {Q}_t={G}_{t-1}{Q}_{t-1}^{+} $$

(17)




$$ K_t=P_{t-1}H_t^T S_t^{-1} $$

(18)




$$ S_t=H_t P_{t-1}H_t^T+\mathrm{tr}\left(\overline{W}_t\right) I_k\qquad \text{where}\quad \overline{W}_t=E\left[W_t W_t^T\right] $$

(19)




$$ {P}_{t-1}^{+}=\left({I}_{dp}-{K}_t{H}_t\right){P}_{t-1} $$

(20)




$$ P_t=G_{t-1}P_{t-1}^{+}G_{t-1}^T+\overline{V}_t\qquad \text{where}\quad \overline{V}_t=E\left[V_t V_t^T\right] $$

(21)

The rationale behind this set of equations is the demand for a linear and recursive estimator of the state, Q_t(Q_{t−1}, O_t), as a function of the previous state Q_{t−1} and of the current observation matrix O_t. In particular, starting from the coefficient matrix at time t−1, its estimate is updated to Q_{t−1}^+ by means of Eq. 16 and used to estimate the value at the following time sample, Q_t, by means of Eq. 17. K_t is called the Kalman gain matrix and it weights the prediction error; its formulation is reported in Eq. 18, where P_{t−1} is the covariance matrix whose evolution is estimated by means of Eqs. 20 and 21.
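A compact sketch of the recursion in Eqs. 16–21 follows; it is a simplification, not the reference implementation of [19, 22]: the state transition is taken as G = I (random walk), and both noise covariances are collapsed into a single isotropic constant `uc`, an assumption standing in for the adaptation constants discussed below.

```python
import numpy as np

def glkf(Y, p, uc=1e-3):
    """Sketch of the GLKF recursion (Eqs. 16-21) for multi-trial data
    Y of shape (trials, M, T).  Returns, for every t, the stacked lag
    matrices Q_t of shape (M*p, M) such that O_t ~ H_t Q_t."""
    Ntr, M, T = Y.shape
    Q = np.zeros((M * p, M))              # state: stacked MVAR coefficients
    P = np.eye(M * p)                     # state covariance
    out = np.zeros((T, M * p, M))
    for t in range(p, T):
        # H_t collects the p previous observations across trials (Eq. 15)
        H = np.hstack([Y[:, :, t - k] for k in range(1, p + 1)])  # (Ntr, M*p)
        O = Y[:, :, t]                                            # observation
        S = H @ P @ H.T + uc * np.eye(Ntr)     # innovation covariance (Eq. 19)
        K = P @ H.T @ np.linalg.inv(S)         # Kalman gain (Eq. 18)
        Q = Q + K @ (O - H @ Q)                # state update (Eqs. 16-17, G = I)
        P = (np.eye(M * p) - K @ H) @ P + uc * np.eye(M * p)  # Eqs. 20-21
        out[t] = Q
    return out
```

On a stationary multi-trial MVAR(1) simulation the filter settles around the generating coefficients; on non-stationary data the same recursion tracks their evolution, which is the point of the method.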

The quality of the estimation depends on two parameters, c1 and c2, which regulate the trade-off between estimation accuracy and the speed of adaptation to transitions. A description of the application range of c1 and c2, together with suggested values for EEG signals under different conditions of SNR and numbers of trials, is available in [22].


2.5 Statistical Validation of Connectivity Patterns


Random correlations between signals, induced by environmental noise or arising by chance, can lead to spurious links in the connectivity estimation process. In order to assess the significance of the estimated patterns, the functional connectivity value obtained by computing PDC for a given pair of signals and each frequency sample has to be statistically compared with a threshold level corresponding to the absence of transmission between the considered signals. Such a threshold can be inferred by means of two different approaches. The first approach, mainly used in stationary applications, extracts the statistical threshold from an empirical distribution representing the null case for the PDC estimator; this threshold thus represents the lower limit above which the estimated link cannot be attributed to chance. The second approach, mainly used in time-varying applications, infers the statistical threshold directly from the baseline condition, a period in which the subject is exposed, without performing the task, to all the external stimuli due to environmental noise and to the paradigm administration. In this way, all the PDC values estimated during the baseline period form the reference distribution against which the values estimated during the task condition are compared. In particular, once the reference distribution has been obtained, the threshold is computed as its percentile at a significance level of 5%. A threshold value is thus obtained for each pair, direction, and frequency band. A schematic representation of the process is reported in Fig. 1.
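The baseline-percentile approach can be sketched numerically as follows (array shapes and function names are illustrative assumptions; the baseline PDC values are taken as already pooled within the frequency band of interest):

```python
import numpy as np

def baseline_threshold(pdc_baseline, alpha=0.05):
    """Percentile threshold from the baseline reference distribution:
    for each pair and direction, the (1 - alpha) percentile of the PDC
    values estimated during baseline.  Shape: (samples, N, N)."""
    return np.percentile(pdc_baseline, 100 * (1 - alpha), axis=0)

def significant_links(pdc_task, threshold):
    """Binary mask of task-condition links exceeding the baseline threshold."""
    return pdc_task > threshold
```

A link whose task-condition PDC stays below the 95th percentile of its own baseline distribution is discarded as indistinguishable from noise.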



Fig. 1
Schematic representation of validation process for time-varying connectivity


2.6 Reducing the Occurrence of Type I Errors in Assessment of Connectivity Patterns


The statistical validation process has to be applied to each pair of signals, for each direction, and for each frequency sample. This leads to a high number of simultaneous univariate statistical tests, with obvious consequences for the occurrence of type I errors. Statistical theory provides several techniques that can usefully be applied in the assessment of connectivity patterns in order to avoid the occurrence of false positives.

The family-wise error rate (FWER) represents the probability of observing one or more false positives when carrying out simultaneous univariate tests. Suppose we have m null hypotheses H_1, H_2, …, H_m, each of which can be declared significant or non-significant by means of a statistical test. Table 1 summarizes the situation after multiple significance tests are applied simultaneously.


Table 1
Table explaining the concept of family-wise error rate

                           Null hypothesis   Alternative           Total
                           is true           hypothesis is true
Declared significant       V                 S                     R
Declared non-significant   U                 T                     m − R
Total                      m_0               m − m_0               m


where:

m_0 is the number of true null hypotheses

m − m_0 is the number of true alternative hypotheses

V is the number of false positives (type I errors)

S is the number of true positives

T is the number of false negatives (type II errors)

U is the number of true negatives

R is the number of rejected null hypotheses

The FWER is the probability of making even one type I error in the family:



$$ \mathrm{ FWER}= \Pr \left( V\ge 1\right) $$

(22)

Many methodologies are available for preventing type I errors [26], but in the following sections we limit the discussion to the False Discovery Rate (FDR) and Bonferroni adjustments, which are the most widely used methodologies in the neuroscience field.


Bonferroni Adjustment


The Bonferroni adjustment [27] starts from the consideration that if we perform N univariate tests, each at significance level α, the probability p that at least one of the tests is significant satisfies [28]:



$$ p < N\alpha $$

(23)

In other words, if N = 20 tests are performed at the usual level α = 0.05, on average one of them will turn out statistically significant by chance alone. The Bonferroni adjustment requires that the probability p of this event (i.e., at least one result statistically significant by chance alone) be equal to α. By Eq. 23, each single test is then performed at the probability



$$ {\beta}^{*}=\alpha / N $$

(24)

This β* is the actual probability at which each statistical test is performed in order to conclude that all of the tests reach the α level of statistical significance, Bonferroni adjusted for multiple comparisons. The Bonferroni adjustment is quite flexible, since it does not require the hypothesis of independence of the data, but it is very conservative: it greatly reduces the number of false positives but, at the same time, introduces many false negatives. For this reason, in 1995 a new approach called the False Discovery Rate, less conservative than the Bonferroni method, was introduced. Its capability to prevent both type I and type II errors was demonstrated in [29].
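The adjustment of Eq. 24 amounts to comparing every p-value against α/N, as in this minimal sketch:

```python
def bonferroni(p_values, alpha=0.05):
    """Bonferroni adjustment: each of the N tests is carried out at the
    corrected level beta* = alpha / N (Eq. 24)."""
    beta_star = alpha / len(p_values)
    return [p <= beta_star for p in p_values]
```

With three tests and α = 0.05, the corrected level is 0.05/3 ≈ 0.0167, so a raw p-value of 0.04 that would pass an uncorrected test is no longer declared significant.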


False Discovery Rate


The false discovery rate (FDR), suggested by Benjamini and Hochberg, is the expected proportion of erroneous rejections among all rejections [30]. With V the number of false positives and S the number of true positives, the FDR is given by:



$$ \mathrm{ FDR}= E\left[\frac{V}{V+ S}\right] $$

(25)
where E[·] denotes the expected value.

In the following we report the FDR controlling procedure described by Benjamini and Hochberg in 1995. Let H_1, H_2, …, H_m be the null hypotheses, with m the number of univariate tests to be performed, and p_1, p_2, …, p_m their corresponding p-values. Sort the p-values in ascending order, p_(1) ≤ p_(2) ≤ … ≤ p_(m), and select the largest i for which the condition



$$ p_{(i)}\le \frac{i}{m}\alpha $$

(26)
holds. Only the null hypotheses corresponding to the i smallest p-values, H_(1), …, H_(i), are rejected.
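The procedure can be sketched as follows (the step-up logic of Eq. 26; the function name and return convention are illustrative):

```python
import numpy as np

def fdr_bh(p_values, alpha=0.05):
    """Benjamini-Hochberg procedure: sort the p-values, find the largest
    rank i with p_(i) <= (i/m) * alpha (Eq. 26), and reject the hypotheses
    with the i smallest p-values.  Returns a boolean rejection mask in
    the original order."""
    p = np.asarray(p_values)
    m = p.size
    order = np.argsort(p)
    thresh = alpha * np.arange(1, m + 1) / m      # (i/m) * alpha per rank
    below = p[order] <= thresh
    reject = np.zeros(m, dtype=bool)
    if below.any():
        i_max = np.max(np.nonzero(below)[0])      # largest i satisfying Eq. 26
        reject[order[:i_max + 1]] = True
    return reject
```

Note the step-up behavior: every hypothesis ranked below the largest i satisfying Eq. 26 is rejected, even if its own p-value momentarily exceeds its rank threshold.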

In the case of independent tests, an approximation for evaluating the corrected significance level has been introduced [30, 31]:



$$ {\beta}^{*}=\frac{\left( m+1\right)}{2 m}\alpha $$

(27)

In this case the new level of significance is β*. This value guarantees that the family of tests is performed at the imposed significance level α.


2.7 Graph Theory Approach


Methodological advances in the functional connectivity field are leading to the description of the neurological mechanisms underlying complex cerebral processes that involve a high number of sources. Once the connectivity pattern achieved for the investigated condition has been qualitatively described, a quantitative characterization of its main properties is necessary in order to synthesize the huge amount of information derived from the application of such advanced methodologies. The extraction of indexes describing the global and local properties of the investigated networks can open the way to several different applications in the neuroscience field.

A graph is a mathematical object consisting of a set of vertices (or nodes) and edges (or connections) indicating the presence of some sort of interaction between the vertices. The adjacency matrix A contains the information about the connectivity structure of the graph and can be derived directly from the investigated network. When a directed edge exists from node j to node i, the corresponding entry of the adjacency matrix is A_ij = 1 in binary graphs, or A_ij = v (where v is the value achieved by the estimator) in weighted graphs; otherwise A_ij = 0. The adjacency matrix can be used for the extraction of salient information about the characteristics of the investigated network by defining several indexes based on its elements.

In relation to the estimator used for building the network, the associated graph could be:





  • undirected → if the estimator extracts only the strength of the information flows and not their direction. In this case the adjacency matrix is symmetric (A_ij = A_ji).


  • directed → if the estimator reconstructs not only the magnitude but also the direction of the connection. In this case the adjacency matrix is asymmetric (A_ij ≠ A_ji).

If the estimator used for the analysis is based on a multivariate approach, as in the case of PDC [10], the corresponding graph is directed and can be either binary or weighted.


Adjacency Matrix Extraction


Once the functional connectivity pattern has been estimated, it is necessary to define an associated adjacency matrix for each network, from which salient indexes characterizing the network properties can be extracted. The generic ijth entry of a directed binary adjacency matrix is equal to 1 if a functional link directed from the jth to the ith signal exists, and to 0 if no link exists. The adjacency matrix can be constructed by comparing each estimated connectivity value with its corresponding threshold value. In particular:



$$ G_{ij}=\left\{\begin{array}{l} 1\to A_{ij}\ge \tau_{ij} \\ 0\to A_{ij}<\tau_{ij} \end{array}\right. $$

(28)
where G_ij and A_ij represent the (i, j) entries of the adjacency matrix G and the connectivity matrix A, respectively, and τ_ij is the corresponding threshold.

Different approaches have been developed for evaluating the threshold values, most of them based on qualitative assumptions that fix the number of edges or the degree of some nodes, or that maximize some properties of the investigated networks. The selection of the threshold used for the extraction of the adjacency matrices is a crucial step of the graph theory approach; in fact, the type of threshold used in the process can affect the structure and the topological properties of the investigated networks. Recently, it was demonstrated that the use of statistical thresholds computed on the null-case distribution or on the baseline condition prevents erroneous descriptions of the networks' main properties [32].
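The thresholding of Eq. 28 is a single element-wise comparison, for both the binary and the weighted variants (a sketch; the keyword argument is an illustrative convenience):

```python
import numpy as np

def adjacency(conn, thresh, weighted=False):
    """Adjacency matrix extraction (Eq. 28): entry (i, j) is 1 when
    A_ij >= tau_ij and 0 otherwise; in the weighted variant the surviving
    entries keep the estimator value.  `thresh` may be a scalar or an
    (N, N) matrix of per-link thresholds tau."""
    conn = np.asarray(conn, dtype=float)
    mask = conn >= thresh
    return np.where(mask, conn, 0.0) if weighted else mask.astype(int)
```

Passing the per-link statistical thresholds of Sect. 2.5 as the `thresh` matrix connects the validation step directly to the graph construction.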


Graph Theory Indexes


Different indexes can be defined on the basis of the adjacency matrix extracted from a given connectivity pattern. The most commonly used are described in the following:

Global Efficiency. The global efficiency is the average of the inverse of the geodesic length and represents the efficiency of the communication between all the nodes in the network [33]. It can be defined as follows



$$ E_{\mathrm{g}}=\frac{1}{N\left(N-1\right)}\sum_{i\ne j}\frac{1}{d_{ij}} $$

(29)
where N represents the number of nodes in the graph and d_ij the geodesic distance between nodes i and j (defined as the length of the shortest path between them).

Local Efficiency. The local efficiency is the average of the global efficiencies computed on each sub-graph G_i of the network and represents the efficiency of the communication among the nodes around node i [33]. It can be defined as follows



$$ E_{\mathrm{l}}=\frac{1}{N}\sum_{i=1}^{N}E_{\mathrm{g}}\left(G_i\right), $$

(30)
where N represents the number of nodes in the graph and G_i the sub-graph obtained by deleting the ith row and the ith column from the original adjacency matrix.
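Both indexes can be sketched for a binary directed graph as follows. Note two assumptions: geodesic distances are computed with a plain Floyd-Warshall pass over the adjacency matrix (using the document's convention A_ij = edge from j to i), and the sub-graphs G_i follow the definition given here (node i deleted), which differs from some other local-efficiency conventions in the literature.

```python
import numpy as np

def geodesic_distances(A):
    """All-pairs shortest path lengths for a binary directed graph
    (Floyd-Warshall on the adjacency matrix; unreachable pairs -> inf)."""
    N = A.shape[0]
    D = np.where(A > 0, 1.0, np.inf)
    np.fill_diagonal(D, 0.0)
    for k in range(N):
        # relax every pair (i, j) through intermediate node k
        D = np.minimum(D, D[:, k:k + 1] + D[k:k + 1, :])
    return D

def global_efficiency(A):
    """Eq. 29: average inverse geodesic distance over all ordered pairs."""
    N = A.shape[0]
    D = geodesic_distances(A)
    inv = 1.0 / D[~np.eye(N, dtype=bool)]     # off-diagonal entries; 1/inf = 0
    return inv.sum() / (N * (N - 1))

def local_efficiency(A):
    """Eq. 30: mean global efficiency of the sub-graphs G_i obtained by
    deleting the ith row and column of the adjacency matrix."""
    N = A.shape[0]
    sub = lambda i: np.delete(np.delete(A, i, axis=0), i, axis=1)
    return np.mean([global_efficiency(sub(i)) for i in range(N)])
```

For a fully connected graph every geodesic distance is 1, so both efficiencies equal 1; for sparser graphs the unreachable pairs contribute 0 through the 1/∞ convention.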
