where a, b ∈ R, a ≠ 0 are the scale and translation parameters, respectively. As a increases the wavelet becomes more narrow and by varying b, the mother wavelet is displaced in time. Thus, the wavelet family gives a unique pattern and its replicas at different scales and with variable localization in time.
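Therefore, the CWT of a signal x(t) at time b and scale a is defined as:
$$W_x(a, b) = \int_{-\infty}^{\infty} x(t)\, \psi^{*}_{a,b}(t)\, dt$$
where ψ*_{a,b}(t) denotes the complex conjugate of the scaled and translated wavelet defined above.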
Figure 1 illustrates an example of a commonly used mother wavelet, the Morlet wavelet on the time-frequency plane, and the effect of the different combinations of time shifts and scales of the wavelets on its shape.
Fig. 1
Variations of time and frequency resolution of the Morlet wavelet at different locations on the time-frequency plane. The horizontal axis τ represents the time shifts of the mother wavelet, while the axis S represents the different scales of the mother wavelet
2.2 Morlet Wavelet
The Morlet wavelet [51] is a complex wavelet, comprising real and imaginary sinusoidal oscillations, that is multiplied by a Gaussian envelope so that the wavelet magnitude is largest at its center and tapers toward its edges (Fig. 2). The wavelet’s Gaussian distribution around its center time point has an SD of σ_t. The wavelet also has a Gaussian-shaped spectral bandwidth around its center frequency, f_0, with an SD of σ_f. It is mathematically formed as follows:
$$w(t, f_0) = A \exp\left(-\frac{t^2}{2\sigma_t^2}\right) \exp\left(2 i \pi f_0 t\right)$$
Fig. 2
Morlet wavelet in time and frequency domain
Wavelets are normalized so that their total energy is 1, the normalization factor A being equal to (σ_t √π)^(−1/2). The temporal SD, σ_t, is inversely proportional to σ_f (the exact relationship between them is σ_f = 1/(2πσ_t)), consistent with the Heisenberg uncertainty principle that as temporal precision increases (i.e., shorter σ_t) frequency precision decreases (i.e., larger σ_f). Furthermore, a wavelet is defined by the ratio of the center frequency, f_0, to σ_f (i.e., f_0/σ_f = c), such that σ_f and σ_t vary with the center frequency, f_0. This ratio is usually constant [52], ensuring a constant number of cycles in the wavelet for all the frequencies used in the analysis. However, the open source Matlab-based software EEGLAB [53] offers the possibility of using a different number of cycles throughout the range of frequencies, beginning with a small number of cycles for the low frequencies and gradually reaching a larger number of cycles for the highest frequencies. The advantage of this technique is higher resolution for both low and high frequencies.
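As an illustration of these relationships, the following Python sketch samples a complex Morlet wavelet and checks numerically that its energy is close to 1. It assumes the usual unit-energy normalization A = (σ_t √π)^(−1/2); the ratio c = 7, the 1,000 Hz sampling grid, the truncation at ±4σ_t, and the function name are choices made for the example, not values taken from the study.

```python
import cmath
import math


def morlet_wavelet(f0, c=7.0, srate=1000.0, n_sd=4.0):
    """Sample w(t, f0) = A * exp(-t^2 / (2 sigma_t^2)) * exp(2j pi f0 t).

    c = f0 / sigma_f is the constant ratio from the text. The default c,
    the sampling rate, and the +/- n_sd * sigma_t support are assumptions.
    """
    sigma_f = f0 / c                              # spectral SD in Hz
    sigma_t = 1.0 / (2.0 * math.pi * sigma_f)     # temporal SD in seconds
    A = (sigma_t * math.sqrt(math.pi)) ** -0.5    # unit-energy normalization
    half = int(round(n_sd * sigma_t * srate))     # samples on each side of t = 0
    times = [k / srate for k in range(-half, half + 1)]
    wavelet = [
        A * math.exp(-t * t / (2.0 * sigma_t ** 2)) * cmath.exp(2j * math.pi * f0 * t)
        for t in times
    ]
    return times, wavelet, sigma_t, sigma_f


SRATE = 1000.0
times, w, sigma_t, sigma_f = morlet_wavelet(f0=40.0, srate=SRATE)

# Riemann-sum approximation of the integral of |w(t)|^2 dt.
energy = sum(abs(v) ** 2 for v in w) / SRATE
```

With these settings `energy` should come out very close to 1, and the product 2π·σ_t·σ_f equals 1 by construction. Choosing a larger c narrows σ_f and lengthens the wavelet in time.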
2.3 Event-Related Power Measures
Power reflects the magnitude of the neuroelectric oscillations at specific frequencies. When EEG oscillations are assumed to be stable (stationary) over time, the traditional FFT is often used to spectrally decompose this time-invariant EEG. Nevertheless, when EEG activity cannot be assumed to be stationary over the time period of interest, as in the case of ERPs, a time-frequency decomposition is necessary. When the magnitude values are squared for each time-frequency data point and then averaged over trials the result is a 2-dimensional matrix containing total power of the EEG at each frequency and time point. Total power captures the magnitude of the oscillations irrespective of their phase angles and it comprises two sources of event-related oscillatory power, evoked and induced power.
Evoked power expresses changes in EEG power that are phase-locked with respect to stimulus onset across trials [44]. It is isolated by averaging the event-locked EEG epochs in the time domain prior to decomposing the signal in both time and frequency. The averaging process does not affect oscillations that are phase-locked with respect to event onset across repeated trials; therefore they are still present in the average ERP. This is not the case for oscillations that are out of phase with respect to the stimulus onset across trials, which cancel out toward zero after the time-domain averaging. Induced power reflects those changes in EEG power that are time-locked, but not phase-locked, with respect to stimulus onset.
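The distinction between total and evoked power can be sketched at a single time-frequency point. Because the wavelet transform is linear, decomposing the time-domain average gives the same complex coefficient as averaging the single-trial coefficients, so evoked power is written here as the squared magnitude of the trial-averaged coefficient. The simulated coefficients, the trial count, and the function names are hypothetical.

```python
import cmath
import math
import random


def total_power(coeffs):
    """Average over trials of the squared magnitude; phase is ignored."""
    return sum(abs(c) ** 2 for c in coeffs) / len(coeffs)


def evoked_power(coeffs):
    """Squared magnitude of the trial-averaged complex coefficient."""
    mean_coeff = sum(coeffs) / len(coeffs)
    return abs(mean_coeff) ** 2


rng = random.Random(0)
n_trials = 200

# Phase-locked activity: same amplitude and same phase on every trial.
locked = [cmath.rect(2.0, 0.3) for _ in range(n_trials)]

# Non-phase-locked activity: same amplitude, phase drawn uniformly per trial.
unlocked = [cmath.rect(2.0, rng.uniform(-math.pi, math.pi)) for _ in range(n_trials)]

total_locked, evoked_locked = total_power(locked), evoked_power(locked)
total_unlocked, evoked_unlocked = total_power(unlocked), evoked_power(unlocked)
```

Both simulated sets have the same total power, but only the phase-locked set retains it after averaging; for the random-phase set the evoked power shrinks toward zero as the number of trials grows.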
In order to calculate induced power, evoked power needs to be removed from the total power estimate. The authors in [54] propose a time-frequency decomposition applied to each single trial, followed by an averaging of the powers across trials, in order to identify non-phase-locked activity; however, this technique yields the total oscillatory power of the EEG, without discarding the evoked activity. In contrast, the authors in [55] claim to calculate the induced power by subtracting the estimated evoked responses from the corresponding single-trial data, without providing any further information about how this subtraction is implemented. Zervakis et al. [56] introduced two novel measures, the Phase Intertrial Coherence (PIC) and the Phase-shift Intertrial Coherence (PsIC), in order to evaluate the phase-locked and non-phase-locked oscillatory activity, respectively. Nevertheless, the authors point out that these measures are useful for qualitatively evaluating the phase-locked or non-phase-locked nature of oscillations, via the time-frequency maps of the measures, since the PsIC reflects mixed phase-locked and non-phase-locked intertrial coherence. The focus of their study was not on the changes in energy with respect to a baseline or pre-stimulus period; therefore a quantitative evaluation of the extracted induced activity cannot be carried out. Moreover, in [57] it is pointed out that removing the mean ERP (i.e., the evoked power) from each epoch before time-frequency analysis would involve an implicit assumption that event-related brain dynamics can be modeled as a sum of a stable ERP and a reliable pattern of EEG amplitude modulation.
Such a model would not take into account EEG phase-dependent interactions, whereas the actual effect of subtracting the evoked oscillatory activity from each epoch prior to the wavelet transform would be relatively small, particularly at frequencies above 10 Hz, where auditory ERP amplitudes are 15 dB or more below mean EEG amplitudes. Apparently, there is great controversy in the field on whether or how this subtraction should be performed; therefore we chose not to include the induced oscillatory activity in our analysis. Figure 3 illustrates the properties of evoked and induced powers, where averaging clearly preserves the phase-locked evoked oscillations while it eliminates the non-phase-locked induced activity.
Fig. 3
Example of evoked (phase-locked) versus induced (non-phase-locked) EEG oscillations
Apart from the aforementioned power measures, event-related phase consistency across trials can be calculated, defined as the partial or exact synchronization of activity at a particular latency and frequency to a set of experimental events to which EEG data trials are time locked [53]. This measure was first introduced by Tallon-Baudry et al. in [58] as “Phase-locking factor”; however, we use the term “Inter-Trial Phase Coherence” (ITPC) throughout this paper, in accordance with the definition in [53]. Typically, for n trials, if F_k(f, t) is the spectral estimate of trial k at frequency f and time t, then the ITPC is defined as:
$$\mathrm{ITPC}(f, t) = \frac{1}{n} \sum_{k=1}^{n} \frac{F_k(f, t)}{\left| F_k(f, t) \right|}$$
where ∣⋅∣ represents the complex norm.
The magnitude of the ITPC takes values between 0, which represents absence of synchronization between EEG data and time-locking events, and 1, which represents perfect synchronization. The time-frequency analysis yields complex vectors in the 2-D phase space, each of which is represented by the magnitude and phase of the spectral estimate. In order to calculate the ITPC, the length of each trial activity vector is normalized to 1, and then the complex average of the normalized vectors is computed. Thus, only the information about the phase of the spectral estimate of each trial is taken into account.
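The normalization-then-averaging procedure described above can be sketched for one time-frequency point as follows. The function name and the example coefficients are hypothetical, and the sketch assumes that no trial has a spectral estimate of exactly zero, since such a vector has no defined phase.

```python
import cmath
import math


def itpc(coeffs):
    """Magnitude of the average of the unit-length vectors F_k / |F_k|."""
    unit_vectors = [c / abs(c) for c in coeffs]   # keep phase, discard magnitude
    return abs(sum(unit_vectors) / len(unit_vectors))


# Same phase on every trial, very different magnitudes.
aligned = [cmath.rect(m, 0.7) for m in (0.5, 1.0, 3.0, 8.0)]

# Phases spread evenly around the circle, identical magnitudes.
spread = [cmath.rect(1.0, 2.0 * math.pi * k / 8) for k in range(8)]

itpc_aligned = itpc(aligned)
itpc_spread = itpc(spread)
```

The aligned set gives a value of essentially 1 despite the differing magnitudes, while the evenly spread set gives a value of essentially 0, which matches the two extremes described in the text.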
3 Application of Wavelet Analysis on Auditory PSP EEG Data
3.1 Participants and Experimental Setup
The participants in this study included 12 healthy controls (mean age = 42.17, SD = 7.16) and 12 schizophrenia patients (mean age = 42.17, SD = 10.50). Schizophrenia subjects were recruited from the outpatient clinics at Wayne State University. They were medicated and clinically stable, without change in medications for at least 1 month. Agreement between the chart diagnosis and a Structured Clinical Interview for DSM-IV diagnosis was required for inclusion in the study. An initial telephone screening was carried out in order to establish the basic eligibility of the participants.
Healthy subjects had no history of psychiatric or neurological disorders, or of drug or alcohol abuse. They had no first-degree relatives with any Axis-I psychiatric diagnoses, including drug use or dependence, and were not on any CNS-active medications. Exclusion criteria for both groups included: (a) history of a seizure disorder or any other neurological problems, including head injury leading to loss of consciousness of any length of time; (b) currently meeting DSM-IV criteria for drug or alcohol dependence; (c) positive urine test for any drugs of abuse; and (d) pregnancy in women. Written informed consent was obtained from all subjects. This study was carried out following guidelines for proper human research conduct in accordance with the Declaration of Helsinki. The protocol and study procedures were approved by the Institutional Review Board at Wayne State University.
In the current study the PSP (“sensory gating” protocol) was used, which consists of the presentation of two consecutive stimuli, S1 and S2. The stimuli were identical clicks with a duration of 4 ms each, an intensity of 85 dB, and a frequency of 1,000 Hz, with a rise/fall time of 1 ms. The interstimulus and interpair intervals were set to vary randomly within 500 ± 25 ms and 8 ± 1 s, respectively. Subjects were instructed to relax, stay awake, and try to avoid too much eye movement during the recording session. For each subject 120 trials were recorded.
3.2 EEG Data Acquisition and Preprocessing
EEG data were recorded from 19 sites (Fp1, Fp2, Fz, F3, F4, F7, F8, Cz, C3, C4, T7, T8, Pz, P3, P4, P7, P8, O1, O2), according to the international 10–20 electrode placement system, and left and right mastoids. Vertical and horizontal electrooculograms were also recorded. A linked ears reference was chosen for acquisition of the data with an analog band-pass filter of 0.3–100 Hz and analog-to-digital sampling rate of 1,000 Hz. An electrode attached to the forehead was used as the ground electrode. Electrode impedances were kept below 5 kΩ.
No additional digital filtering was applied to the data offline. Eye movement and blink artifacts were removed using Independent Component Analysis [59], and trials containing artifacts exceeding ±75 μV were rejected. After trial rejection the groups did not differ significantly in the number of final trials included: all subjects had at least 75 % of trials accepted (i.e., 90 out of 120 trials). The 200 ms preceding the S1 stimulus were used for baseline correction of the remaining epoch. All preprocessing steps were performed using the open source Matlab software toolbox FieldTrip [60].
3.3 Time-Frequency Analysis and Classification
For each subject and for all 19 channels that were recorded, all single trials were convolved with a complex Morlet wavelet. The decomposition was performed over the frequency range 8–60 Hz, using 4-Hz steps, and the wavelet length was increased linearly from 1 cycle at 8 Hz to 4 cycles at 60 Hz, yielding a time range from −130 ms (pre-stimulus) to 901 ms (post-stimulus), divided into 150 time points. The time-frequency decompositions were performed using the open source Matlab software EEGLAB [53]. Figures 4 and 5 illustrate two examples of time-frequency representations of relative-to-baseline changes of total power, ITPC, and evoked power, averaged over all sensors, for a representative healthy subject and a schizophrenia patient, respectively.
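The frequency grid and the linearly increasing wavelet length described above can be sketched as follows. This is a plain linear interpolation written for illustration; the function name is hypothetical, and the way the toolbox itself spaces the cycles internally may differ.

```python
def wavelet_cycles(f_min=8.0, f_max=60.0, step=4.0, c_min=1.0, c_max=4.0):
    """Return the analysis frequencies and a linearly interpolated cycle count."""
    n_freqs = int(round((f_max - f_min) / step)) + 1
    freqs = [f_min + i * step for i in range(n_freqs)]
    cycles = [
        c_min + (c_max - c_min) * (f - f_min) / (f_max - f_min)
        for f in freqs
    ]
    return freqs, cycles


freqs, cycles = wavelet_cycles()

# Wavelet duration in milliseconds at each frequency (cycles / frequency).
durations_ms = [1000.0 * c / f for f, c in zip(freqs, cycles)]
```

With the stated settings this gives 14 frequencies from 8 to 60 Hz, with 1 cycle at the lowest frequency and 4 cycles at the highest.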
Fig. 4
Time-frequency representation of total power, ITPC, and evoked power of a healthy subject, averaged over all sensors. Color bars indicate relative-to-baseline changes in power
Fig. 5
Time-frequency representation of total power, ITPC, and evoked power of a subject suffering from schizophrenia, averaged over all sensors. Color bars indicate relative-to-baseline changes in power
For each one of the computed time-frequency representations (total power, evoked power, ITPC) the time-frequency bins were averaged over four frequency bands (alpha: 8–13 Hz, beta: 13–30 Hz, low gamma: 30–45 Hz, medium gamma: 45–60 Hz) and over 52 time epochs of 20 ms length (with the exception of the first one, which lasted from −130 to −119 ms). For each new time-frequency point of interest and for each channel, the two groups (schizophrenia patients and healthy controls) were compared using a nonparametric permutation test, based on the t-statistic with Monte Carlo randomization. The nonparametric statistical test is performed in the following way:
(a) Collect the trials of the two groups of subjects in a single set.
(b) Randomly draw as many trials from this combined dataset as there were trials in group 1 and place those trials into subset 1. Place the remaining trials in subset 2. The result of this procedure is called random partition.
(c) Calculate the test statistic (here, unpaired t-test) on this random partition.
(d) Repeat steps (b) and (c) a large number of times and construct a histogram of the test statistics.
(e) From the test statistic that was actually observed and the constructed histogram, calculate the proportion of random partitions that resulted in a larger test statistic than the observed one. This proportion is the Monte Carlo significance probability, which is also called the permutation p-value.
(f) If the p-value is smaller than a critical alpha value, then conclude that the data in the two groups are significantly different.
The accuracy of the Monte Carlo p-value increases with the number of draws from the permutation distribution. In the current study, the number of random partitions was set to 1,000 and the significance level to 5 %. To control for increased type I error rate due to multiple comparisons, Bonferroni correction was performed.
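Steps (a) to (f) can be sketched in Python as follows. This is a minimal illustration rather than the toolbox routine used in the study: the pooled-variance form of the t-statistic, the two-sided comparison on absolute values, the fixed random seed, the example data, and all function names are assumptions.

```python
import math
import random
import statistics


def t_statistic(x, y):
    """Unpaired two-sample t-statistic with pooled variance."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * statistics.variance(x)
              + (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(pooled * (1.0 / nx + 1.0 / ny))


def permutation_p_value(group1, group2, n_partitions=1000, seed=0):
    """Monte Carlo estimate of the permutation p-value."""
    rng = random.Random(seed)
    observed = abs(t_statistic(group1, group2))
    combined = list(group1) + list(group2)        # step (a)
    n1 = len(group1)
    larger = 0
    for _ in range(n_partitions):                 # step (d)
        rng.shuffle(combined)                     # step (b): random partition
        t = abs(t_statistic(combined[:n1], combined[n1:]))   # step (c)
        if t > observed:
            larger += 1
    return larger / n_partitions                  # step (e)


def bonferroni_alpha(alpha, n_tests):
    """Per-test threshold after Bonferroni correction."""
    return alpha / n_tests


# Hypothetical feature values for two groups of 12 subjects.
low = [0.1 * i for i in range(12)]
high = [5.0 + 0.1 * i for i in range(12)]
shifted = [0.05 + 0.1 * i for i in range(12)]

p_different = permutation_p_value(low, high)
p_similar = permutation_p_value(low, shifted)
```

Step (f) then compares each p-value with the chosen threshold, for example `bonferroni_alpha(0.05, n_tests)` when `n_tests` comparisons are made. Note that with 1,000 partitions the smallest nonzero p-value that can be resolved is 0.001, which limits how strict a corrected threshold can meaningfully be.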
Furthermore, in order to compare groups in terms of sensory gating, the averaged time-frequency bins from 501 to 901 ms post-stimulus (S2 response) were subtracted from the corresponding time-frequency bins from 1 to 401 ms (S1 response), for each one of the measures and the channels, yielding time-frequency gating representations of 4 frequency bands and 20 time epochs (Fig. 6). We chose the S1–S2 amplitude difference as the gating measure because some subjects’ responses to S1 or S2 stimuli in time-frequency state-space were negative, making the S2/S1 ratio difficult to interpret and impractical in statistical analysis.
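The pairing of S1 and S2 epochs and the difference measure can be sketched as follows. The nominal 500 ms offset between paired epochs follows from the time ranges above; the small example grid and the names are hypothetical.

```python
BIN_MS = 20      # epoch length in ms
N_EPOCHS = 20    # epochs per response window
ISI_MS = 500     # nominal offset between S1 and S2 windows

# Epoch boundaries in ms relative to S1 onset.
s1_epochs = [(1 + BIN_MS * i, 1 + BIN_MS * (i + 1)) for i in range(N_EPOCHS)]
s2_epochs = [(start + ISI_MS, stop + ISI_MS) for start, stop in s1_epochs]


def gating(s1, s2):
    """Element-wise S1 - S2 difference over a bands x epochs grid."""
    return [[a - b for a, b in zip(row1, row2)] for row1, row2 in zip(s1, s2)]


# Hypothetical relative-to-baseline values for 2 bands x 3 epochs.
s1_demo = [[1.5, 0.5, -0.25], [2.0, 1.0, 0.5]]
s2_demo = [[0.5, -0.5, 0.25], [1.0, 1.0, 0.25]]
gate_demo = gating(s1_demo, s2_demo)
```

In the second epoch of the first band the S1 value is 0.5 and the S2 value is −0.5, so an S2/S1 ratio would be −1 and hard to read as a degree of suppression, whereas the difference of 1.0 remains directly interpretable.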
Fig. 6
Time-frequency representation of S1 response, S2 response, and sensory gating (S1–S2) after averaging over specific frequency bands and time ranges
The same statistical analysis as described above was performed for the gating representations. The features that were found to be statistically significant (p < 0.05) for certain channels and time-frequency bins, as a result of the aforementioned statistical analysis, were used as features in the classification schemes between schizophrenia patients and healthy controls.
The features were first normalized to the range [0,1] in order to prevent features with larger numeric ranges from dominating those with smaller numeric ranges. Then, the normalized features were introduced into a wrapper for feature subset selection [61], in order to find an optimal feature subset in terms of classification results. An SVM classifier with an RBF kernel, a fivefold cross-validation scheme, and a best-first search engine were used for the estimation of the accuracy of the feature subsets.
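The rescaling to [0,1] can be sketched as a per-feature min-max transform. The function name and example values are hypothetical, and mapping a constant feature to 0 is a convention chosen here, not something stated in the text.

```python
def min_max_normalize(column):
    """Rescale one feature (a list of values across subjects) to [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Constant feature: no spread to rescale, map everything to 0.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


# Two hypothetical features on very different numeric scales.
feature_a = [2.0, 4.0, 6.0]
feature_b = [-1000.0, 0.0, 3000.0]

norm_a = min_max_normalize(feature_a)
norm_b = min_max_normalize(feature_b)
```

After rescaling, both features span the same range, so a distance-based kernel such as the RBF weighs them comparably. When cross-validation is used, computing the minimum and maximum on the training portion only is one common way to avoid information leaking from the test data.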
The resulting optimal subset was used as a feature set in three different classifiers:
A linear discriminant analysis (LDA) classifier.
A support vector machine (SVM) classifier with an RBF kernel.
A normalized Gaussian RBF network, which uses the k-means clustering algorithm to provide the basis functions and learns a linear regression on top of that.
For each one of the classifiers we used two different testing modes:
A tenfold cross-validation
Training on the 66 % of the data and testing on the remainder (percentage-split mode)
The wrapper for the feature subset selection, and the SVM and RBF network classifiers were applied using the open source data mining software Weka 3.6.7 [62]. For the LDA classifier the corresponding function of the Statistics Toolbox of Matlab R2010a was utilized.
4 Results
4.1 Statistical Analysis: Schizophrenia Patients–Healthy Subjects
The time bins that yielded significant differences (p < 0.05) between schizophrenia patients (SP) and healthy subjects (HS) in the statistical analysis are given below. In order to facilitate the comparison between S1 and S2 responses, the time bins between 501 and 901 ms after S1 are converted to 1–401 ms after S2.
−19 to 1 ms: SP showed significantly higher S1 total response than HS in low gamma band at F8 channel (p = 0.044).
1–21 ms: SP showed significantly lower S1 evoked response than HS in beta band at the F3 channel (p = 0.041), and lower S1 ITPC response in alpha band at the F7 channel (p = 0.038), as well as lower gating of ITPC response in beta band, at the channels Fz and F3 (p = 0.042, p = 0.037, respectively).
61–81 ms: SP showed significantly lower S1 evoked and ITPC responses (p = 0.032, p = 0.027, respectively) in medium gamma band at the C3 channel. Additionally, SP showed significantly higher S1 total response in medium gamma band, at T7 channel (p = 0.041).
81–101 ms: SP showed significantly lower S2 evoked and ITPC responses (p = 0.037, p = 0.033, respectively) in medium gamma band at Fz.
101–121 ms: SP showed significantly lower S1 evoked response in medium gamma band at Fp1 (p < 0.02), lower gating of evoked response in beta band at Fz (p < 0.023), and lower gating of ITPC response in beta band at T8 (p = 0.03) and in low gamma band, at P3 (p < 0.029).
141–161 ms: SP showed significantly lower S1 evoked response in low gamma band at T8 (p = 0.031) and lower S2 ITPC response in medium gamma band at Fp2 (p = 0.03).
161–181 ms: SP showed significantly higher S1 total response in low gamma band at C3 and Pz (p = 0.025, p = 0.028, respectively) and gating of total response in low gamma band at C3, Pz, and P3 (p = 0.014, p = 0.017, p = 0.018, respectively).
181–201 ms: SP showed significantly higher S1 total response in beta band at P4 (p = 0.026) and gating of total response in beta at F4, C3, and P4 (p = 0.013, p = 0.018, p = 0.02, respectively) and in low gamma band at Cz, Pz, C3, C4, and P3 (p = 0.014, p = 0.014, p = 0.02, p = 0.021, p = 0.018, respectively). Additionally, SP showed significantly higher gating of ITPC response in medium gamma band at T8 (p = 0.031).