for Seizure Detection and Prediction: An Overview



Fig. 1
Schematic representation of seizure detection system



Most studies present their solution to the problem of seizure detection in the context of a decision support system for the expert neurologist. As there are many types of seizures, this is sometimes a difficult task, given the nature, temporal length, and singularities of each seizure type. When a patient experiences seizures of different types, EEG ictal periods need to be categorized into a specific type, although some epileptic syndromes are difficult to assign to a specific category. A more demanding task, which is still considered an open scientific question, is the prediction of a seizure [5], which would profoundly improve the quality of life of people suffering from severe seizures. Moreover, the underlying mechanisms leading to seizures and the origin of a seizure in each case are still not fully understood.

To this end, many EEG analysis algorithms have been proposed. Linear analysis, based mainly on synchronization features, has been widely used as a primary and straightforward approach. Although these methods can in some cases reveal the existence of epileptic seizures, they have their limits once the nature of real human EEG data is taken into account [6, 7]. From this perspective, EEG signals can be interpreted as the result of a system containing highly nonlinear elements. The study of nonlinear EEG dynamics can reveal hidden information and provide a more complete picture of underlying brain processes [8, 9]. Nonlinear analysis has been used with increased accuracy over the last decade in the area of seizure detection and prediction.



2 Methods


In this section, linear methods (highlighting time–frequency methods), nonlinear methods (highlighting measures derived from information theory), methods based on the signal’s morphological characteristics, and vision-based methods are presented.


2.1 Linear Methods


Linear methods have been widely used in the area of epilepsy detection mainly due to their simplicity and versatility. One of the simplest linear statistic metrics is the variance of the signal. It offers an insight into dynamics underlying the EEG and is usually calculated in consecutive windows. A further linear method is based on the autocorrelation function, exploiting the periodic nature of seizures. Liu et al. [10], using Scored Autocorrelation Moment (SAM) analysis, distinguished EEG epochs containing seizures with an accuracy of 91.4 % although signals did not present differences in their spectral properties.
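The windowed variance mentioned above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function name `windowed_variance` and the use of non-overlapping windows are assumptions made here, not details taken from the cited studies.

```python
def windowed_variance(x, window):
    """Population variance of x in consecutive, non-overlapping windows."""
    out = []
    for start in range(0, len(x) - window + 1, window):
        seg = x[start:start + window]
        mean = sum(seg) / window
        out.append(sum((v - mean) ** 2 for v in seg) / window)
    return out


# A flat segment followed by an oscillating one: the variance jumps.
signal = [0.0, 0.0, 0.0, 0.0, 1.0, -1.0, 1.0, -1.0]
print(windowed_variance(signal, 4))
```

A detector built on this feature would typically compare each window's variance against a baseline, which is not shown here.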

Furthermore, seizure onset and offset determination may be achieved using linear prediction filters (LPF) [11]. An LPF estimates the spectral characteristics of a signal, with its accuracy depending on the stationarity of the latter. When there are spikes, sharp waves or rapid changes, the filter’s prediction error increases, leading to an identification of a possible seizure.

Discrete wavelet transform (DWT), which is a transformation extracting scale-frequency components from data (each component with resolution matched to its scale) may also be applied in seizure detection. In [12], normal and seizure signals were classified with an accuracy of 99.5 % using DWT and a linear classifier. Another linear feature, the relative fluctuation index [13], can measure the intensity of the fluctuation of EEG signals, which is defined as:



$$ {F}_i={\displaystyle \sum_{j=1}^{M-1}\bigg|{a}_i\left( j+1\right)-{a}_i(j)\bigg|}, $$
where $a_i$ denotes the amplitude of the filtered EEG signal at the $i$th band with length $M$. During a seizure, there is a larger fluctuation in the EEG signals than during a seizure-free period. Therefore, values of the fluctuation index during a seizure are usually larger than during resting EEG. Using this index, along with other features, a study by Yuan et al. [13] achieved 94.90 % mean accuracy in a segment-based aspect and 93.85 % mean accuracy in an event-based aspect.
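The fluctuation index is simply the sum of absolute first differences of one band-limited signal. A minimal sketch follows; the function name `fluctuation_index` and the toy inputs are illustrative assumptions, and the band-pass filtering step is assumed to have been done beforehand.

```python
def fluctuation_index(a):
    """Sum of |a(j+1) - a(j)| over one filtered band of length M."""
    return sum(abs(a[j + 1] - a[j]) for j in range(len(a) - 1))


# Hypothetical band signals: a quiet segment and a strongly fluctuating one.
quiet = [0.0, 0.1, 0.0, 0.1, 0.0]
active = [0.0, 2.0, -2.0, 2.0, 0.0]
print(fluctuation_index(quiet), fluctuation_index(active))
```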


Time–Frequency Methods


Time–frequency features have also been employed in various seizure detection studies [14]. Hassanpour et al. [15] used time–frequency patterns as signatures in order to detect seizures. Tzallas et al. [16–18] performed an extensive investigation of well-known time–frequency distributions, extracting features from the Power Spectral Density (PSD) time–frequency grid followed by artificial neural network (ANN) classification. With this methodology, an accuracy between 89 and 100 % for three different datasets was achieved using reduced interference time–frequency distribution and ANNs. A time–frequency matched filter was introduced in [19, 20] in order to reveal seizure patterns. Rankine et al. [21] proposed a related methodology analyzing changes in preictal, ictal, and postictal states. Moreover, an improved time–frequency dictionary in terms of reconstruction accuracy and discrimination between seizure and non-seizure states is presented in [22].


2.2 Nonlinear Methods


Epileptic seizures can be seen as manifestations of intermittent spatiotemporal transitions of the human brain from chaos to order [23]. Nonlinear analysis of EEG has attracted increasing interest from many research groups, mainly because it accommodates the non-stationary nature of a signal. It treats brain mechanisms as part of a macroscopic system in order to understand its spatiotemporal dynamic properties. The underlying information revealed in the ongoing EEG leads to promising results not only in the detection but also in the prediction of upcoming seizures [24].


Fractal Dimension


Fractal dimension is a nonlinear time domain measure characterizing the complexity of a time series. The degree of complexity increases as the fractal dimension increases. Various algorithms have been developed [25] in order to calculate the fractal dimension, such as box counting [26], Katz’s algorithm [27], Petrosian’s algorithm [28], and Higuchi’s algorithm [29]. According to the last, the time series $x(i),\ i=1,2,\ldots,N$ is used to form the vectors



$$ {X}_m^k=\left\{ x(m), x\left( m+ k\right),\ldots, x\left( m+\left\lfloor \frac{N- m}{k}\right\rfloor \cdot k\right)\right\}, $$

where $k$ is the time lag, $m = 1, 2, \ldots, k$, and $\lfloor y \rfloor$ denotes the argument $y$ rounded down to the nearest integer. For each $X_m^k$, the average length is formed:



$$ {L}_m(k)=\frac{1}{k}\left[\frac{\left( N-1\right){\displaystyle \sum_{i=1}^{\left\lfloor \frac{N- m}{k}\right\rfloor}\left| x\left( m+ ik\right)- x\left( m+\left( i-1\right) k\right)\right|}}{\left\lfloor \frac{N- m}{k}\right\rfloor \cdot k}\right] $$

Finally, the mean of these lengths over $m$ is calculated as



$$ L(k)=\frac{1}{k}{\displaystyle \sum_{m=1}^k{L}_m(k)} $$

The linear estimation of the slope of the curve ln(L(k)) versus ln(1/k) is an estimate of the fractal dimension.
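The procedure can be sketched as follows. This is a minimal illustration that follows Higuchi's usual normalization, in which each curve length carries a 1/k factor and L(k) is the mean over m, so that the slope of ln L(k) against ln(1/k) gives the dimension directly. The function name `higuchi_fd`, the default `k_max = 8`, and the test signals are assumptions made here, and the signal is assumed to be non-constant.

```python
import math
import random


def higuchi_fd(x, k_max=8):
    """Estimate the fractal dimension of the sequence x (Higuchi's method)."""
    n = len(x)
    if n < 2 * k_max:
        raise ValueError("signal too short for the chosen k_max")
    log_inv_k, log_len = [], []
    for k in range(1, k_max + 1):
        lengths = []
        for m in range(1, k + 1):            # m is 1-based, as in the text
            n_steps = (n - m) // k           # floor((N - m) / k)
            dist = sum(abs(x[m - 1 + i * k] - x[m - 1 + (i - 1) * k])
                       for i in range(1, n_steps + 1))
            lengths.append(dist * (n - 1) / (n_steps * k) / k)   # L_m(k)
        log_inv_k.append(math.log(1.0 / k))
        log_len.append(math.log(sum(lengths) / k))               # ln L(k)
    # Ordinary least-squares slope of ln L(k) against ln(1/k).
    mean_x = sum(log_inv_k) / k_max
    mean_y = sum(log_len) / k_max
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_inv_k, log_len))
    den = sum((a - mean_x) ** 2 for a in log_inv_k)
    return num / den


random.seed(0)
ramp = [float(i) for i in range(200)]                 # smooth curve
noise = [random.gauss(0.0, 1.0) for _ in range(1000)]  # irregular curve
print(higuchi_fd(ramp), higuchi_fd(noise))
```

A straight ramp yields a dimension of 1, whereas white noise yields a value close to 2, in line with the statement that complexity grows with the fractal dimension.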


Lyapunov Exponent


Lyapunov exponent (λ) is a nonlinear metric measuring the exponential divergence of two time series trajectories in phase space. Considering the $m$-dimensional time vector of a time series $X = \{x(t), x(t+1), \ldots, x(t+m-1)\}$ and two neighboring points $X_{t_0}$ and $X_t$ in phase space at times $t_0$ and $t$ respectively, the distances of the points in the $i$th direction are $\left. dX_i\right|_{t_0}$ and $\left. dX_i\right|_t$ respectively. Given the following equation,



$$ {\left. d{X}_i\right|}_t\approx {\mathrm{ e}}^{\lambda_i t}{\left. d{X}_i\right|}_{t_0} $$
the Lyapunov exponents are $\lambda_i$.

Finally, the maximal Lyapunov exponent can be defined as



$$ {\lambda}_{\max }=\lim_{t\to \infty}\ \lim_{{\left. d{X}_i\right|}_{t_0}\to 0}\frac{1}{t} \ln \frac{{\left. d{X}_i\right|}_t}{{\left. d{X}_i\right|}_{t_0}} $$
and measures the largest growth rate of an error in the initial conditions. Lyapunov exponents characterize the chaotic nature of a time series, i.e., a slight shift in initial conditions can lead to a large, practically unpredictable difference in the phase space trajectory. Using Lyapunov exponents and recurrent neural networks, Guler et al. [30] achieved a 96.79 % accuracy rate in the detection of epileptic seizures.

On the other hand, Kannathal et al. [31] tested nonlinear measures including the correlation dimension (CD), maximal Lyapunov exponent, Hurst exponent (H), and Kolmogorov–Sinai entropy (K–S entropy) in order to distinguish epileptic from normal EEG activity. All measures showed high discriminating ability, with slightly better results being reported for the CD (p-value: 0.0001) and K–S entropy (p-value: 0.0001).


2.3 Information Theory Based Analysis and Entropy


Entropy is a physical measure derived from thermodynamics, describing the order or disorder of a physical system. High entropy values correspond to high levels of disorder of a system, whereas low values describe a more ordered system, capable of producing more work. Signal processing and analysis research disciplines borrowed entropy from information theory in order to address and describe the irregularity, complexity, or unpredictability characteristics of a signal. Given these properties, entropy has been widely used towards automatic seizure detection [3, 32, 33].


Shannon Entropy


After some initial approaches by H. Nyquist and R. Hartley, research leaders at Bell Labs, Shannon quantitatively established the foundations of information theory in 1948 [34]. According to these, a signal is divided into $J$ non-overlapping value bins and the ratios of the samples falling into the $j$th bin to the total number of samples $N$ are calculated:



$$ H=-{\displaystyle \sum_{j=1}^J{p}_j\ { \log}_2\left({p}_j\right)}\kern1em \mathrm{ where}\kern0.5em {p}_j=\frac{N\left({x}_j\right)}{N}, $$
where $N(x_j)$ is the number of samples that fall into bin $j$ of the $J$ bins and $N$ is the total number of samples. EEG Shannon entropy has been correlated with desflurane effect compartment concentrations [35]. It has also been used to analyze long-term EEG from patients with frontal lobe epilepsy [36].
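A minimal sketch of this histogram-based estimate is given below. The function name `shannon_entropy` and the choice of equal-width bins spanning the signal's minimum and maximum are assumptions made for illustration; the text only requires non-overlapping value bins.

```python
import math


def shannon_entropy(x, n_bins=8):
    """Shannon entropy (in bits) of the amplitude histogram of x."""
    lo, hi = min(x), max(x)
    if hi == lo:                      # constant signal: a single occupied bin
        return 0.0
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in x:
        j = min(int((v - lo) / width), n_bins - 1)   # clamp the maximum value
        counts[j] += 1
    n = len(x)
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)


# Samples spread evenly over 8 bins reach the maximum of log2(8) = 3 bits.
print(shannon_entropy([float(v) for v in range(8)], n_bins=8))
```

The result depends on the number of bins, so entropy values are only comparable when the same binning is used throughout.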


Spectral Entropy


Spectral entropy was introduced by Inouye [37, 38] and measures the proportional contribution of each spectral component to the total spectral distribution [39]:



$$ H=-{\displaystyle \sum_{j=1}^J{p}_j\ { \log}_2\left({p}_j\right)}\kern1em \mathrm{ where}\kern0.5em {p}_j=\frac{S_j}{S}, $$
where $S$ is the total spectrum and $S_j$ is the spectrum at frequency bin $j$ of the $J$ bins.
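The definition above can be sketched with a plain discrete Fourier transform. This is an illustrative example only: the function names, the direct O(N²) DFT (used to stay within the standard library), and the decision to skip the DC bin are assumptions made here.

```python
import cmath
import math
import random


def power_spectrum(x):
    """Power at the positive-frequency DFT bins 1..N/2 (DC bin skipped)."""
    n = len(x)
    spec = []
    for f in range(1, n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n))
        spec.append(abs(coeff) ** 2)
    return spec


def spectral_entropy(x):
    """Shannon entropy (bits) of the normalized power spectrum of x."""
    s = power_spectrum(x)
    total = sum(s)
    if total == 0.0:
        raise ValueError("signal has no power outside the DC bin")
    return -sum((sj / total) * math.log2(sj / total) for sj in s if sj > 0.0)


random.seed(0)
n = 64
sine = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]   # one spectral line
noise = [random.gauss(0.0, 1.0) for _ in range(n)]             # broad spectrum
print(spectral_entropy(sine), spectral_entropy(noise))
```

A pure sinusoid concentrates its power in one bin and gives an entropy near zero, while a broadband signal spreads power across bins and gives a larger value.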

A traditional approach to estimating spectral entropy is through the Fourier power spectrum [40, 41], which is applicable mainly where a signal’s stationarity conditions are satisfied, e.g., the resting EEG. However, the signals in many clinical applications are highly non-stationary, with transient and rapid changes in their spectral distributions. In addition, a time-varying entropy index is necessary in some cases [42]. This can be partially addressed with the short time Fourier transform (STFT), which reveals spectral distributions over successive windows [40]. However, this approach faces the intrinsic problem of window size selection that arises from the Heisenberg–Fourier uncertainty principle [43]:



$$ \Delta t\Delta f\ge \frac{1}{4\cdot \pi} $$

Due to this, a small window size increases temporal resolution but makes spectral resolution poor whereas a wide window size achieves the opposite effect. It is considered that the optimal distribution is a Gaussian that minimizes the product of time–frequency variances [44].

To overcome these limitations, Quiroga et al. [45] introduced wavelet entropy (WE), which is based on multi-resolution decomposition by means of the wavelet transform (WT). This technique has already been applied in EEG/ERP signal analysis [46–48]. The problem with this approach is that the results are strongly dependent on the selection of the mother wavelet function. WE was efficiently applied in order to discriminate between EEG signals of controls and epileptic patients [49–51]. Rosso et al. [52] compared the Gabor transform and the wavelet transform, claiming the superiority of the second because a variable window is used for the analysis. Subsequently, the time evolution of wavelet entropy and relative wavelet entropy was investigated, showing a significant decrease during ictal periods. However, different wavelet basis functions can produce different results, making their interpretation sometimes ambiguous.

In order to yield an optimal time–frequency distribution and subsequently a time–frequency spectral entropy, adaptive algorithms are used. The Adaptive Optimal Kernel (AOK) time–frequency representation [53] is an effective method for representing signals in the time–frequency plane. The main advantage of an adaptive signal-dependent method is that in each case the kernel is selected according to how well it suits the signal’s characteristics. The method is adjusted by the choice of the kernel, which involves a compromise between cross term reduction, loss of time–frequency resolution, and maintenance of certain properties of the distribution [44, 54]. Spectral entropy based on this method was presented in [55].


Approximate Entropy


Approximate entropy (ApEn) was introduced by Pincus [56] to quantify the regularity and predictability of time series data from physiological signals. Being a modification of the Kolmogorov–Sinai entropy [57], it was especially developed for determining the regularity of biological signals in the presence of white noise [58].

Given a time series X(n) = {x(n)} = {x(1), x(2), …, x(N)} of N samples, the ApEn value is calculated through the following steps:

1. The vector sequences $X_m(i) = \{x(i), x(i+1), \ldots, x(i+m-1)\}$, which represent $m$ consecutive values commencing with the $i$th point, are formed for $i = 1, \ldots, N-m+1$.

2. The distance between $X_m(i)$ and $X_m(j)$ is calculated, defined by

$$ d\left[{X}_m(i),{X}_m(j)\right]=\mathop{\max}\limits_{1\le k\le m}\left\{\left| x\left( i+ k-1\right)- x\left( j+ k-1\right)\right|\right\} $$

3. For each $X_m(i)$, the number $N_i^m(r)$ of vectors $X_m(j)$ satisfying

$$ d\left[{X}_m(i),{X}_m(j)\right]\le r $$

is counted, with $r$ representing the noise filter level. Then, the parameters $C_i^m(r)$ are estimated as

$$ {C}_i^m(r)=\frac{N_i^m(r)}{N- m+1} $$

4. $\phi^m(r)$ is defined as the mean value of the logarithms of the parameters $C_i^m(r)$:

$$ {\phi}^m(r)=\frac{{\displaystyle \sum_{i=1}^{N- m+1} \ln {C}_i^m(r)}}{N- m+1} $$

5. $\mathrm{ApEn}(m, r, N)$ is calculated using $\phi^m(r)$ and $\phi^{m+1}(r)$ as

$$ \mathrm{ ApEn}\left( m, r, N\right)={\phi}^m(r)-{\phi}^{m+1}(r) $$
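The five steps above can be sketched directly in code. This is a minimal, unoptimized illustration: the function names are hypothetical, `r` is passed as an absolute tolerance, and setting it to 0.2 times the standard deviation of the whole series is only a commonly used convention assumed for the example. Self-matches are counted, so every logarithm is finite.

```python
import math
import random
import statistics


def _phi(x, m, r):
    """Mean of ln C_i^m(r) over all N - m + 1 vectors of length m."""
    vectors = [x[i:i + m] for i in range(len(x) - m + 1)]
    total = 0.0
    for vi in vectors:
        # Chebyshev (maximum) distance; the self-match vi == vj is included.
        count = sum(1 for vj in vectors
                    if max(abs(a - b) for a, b in zip(vi, vj)) <= r)
        total += math.log(count / len(vectors))
    return total / len(vectors)


def approximate_entropy(x, m=2, r=0.2):
    """ApEn(m, r, N) = phi^m(r) - phi^(m+1)(r)."""
    return _phi(x, m, r) - _phi(x, m + 1, r)


random.seed(0)
periodic = [0.0, 1.0] * 50
noise = [random.gauss(0.0, 1.0) for _ in range(200)]
for name, sig in (("periodic", periodic), ("noise", noise)):
    r = 0.2 * statistics.pstdev(sig)      # assumed rule of thumb for r
    print(name, approximate_entropy(sig, m=2, r=r))
```

The regular signal gives a value close to zero and the irregular one a clearly larger value, which matches the interpretation of ApEn as a regularity measure.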

 

ApEn has already been used in many applications such as analysis of heart rate variability [59–62], analysis of the endocrine hormone release pulsatility [63], and detection of epilepsy [64, 65]. The majority of studies indicate that during ictal periods ApEn presents a significant decrease in comparison with EEG during normal periods [3, 65, 66]. ApEn was also used in order to classify EEG signals among five different states (including an ictal state) with an increased accuracy [67].

The calculation of ApEn depends on the embedding dimension (m), the noise filter level (r), and the data length (N). There is no specific guideline for their optimal determination, even though most research studies use the parameters described in [56, 61] as a rule of thumb. Moreover, it is debatable whether the standard deviation used to set the noise filter level should be calculated from the original data series or from the individual selected EEG segments. In addition, ApEn statistics lack relative consistency [61], leading to problems in hypothesis formulation and testing. As signals of different sources and pathologies can have quite different properties, these parameters should be determined based on the specific use. The need for a consistent determination of parameters was studied in a recent work [68], where a preliminary analysis of these parameters was carried out.


Sample Entropy


Sample entropy (SampEn), which is presented in [61], also estimates complexity in time series providing an unbiased measure regarding the length of time series.



$$ H=- \ln \left(\frac{A^m(r)}{B^m(r)}\right) $$

The calculation of sample entropy starts with the steps 1 and 2 already described for the ApEn calculation. The following steps are given below:

3. For each $X_m(i)$, the number $N_i^m(r)$ of vectors $X_m(j)$, $j \ne i$, satisfying

$$ d\left[{X}_m(i),{X}_m(j)\right]\le r $$

is counted, with $r$ representing the noise filter level. Then, the parameters $B_i^m(r)$ and $B^m(r)$ are defined as

$$ {B}_i^m(r)=\frac{N_i^m(r)}{N- m-1} $$

$$ {B}^m(r)=\frac{1}{N- m}{\displaystyle \sum_{i=1}^{N- m}{B}_i^m(r)} $$

4. The dimension is incremented to $m+1$ and the number $N_i^{m+1}(r)$ of vectors $X_{m+1}(j)$, $j \ne i$, is counted so that

$$ d\left[{X}_{m+1}(i),{X}_{m+1}(j)\right]\le r $$

Then, the parameters $A_i^m(r)$ and $A^m(r)$ are defined as

$$ {A}_i^m(r)=\frac{N_i^{m+1}(r)}{N- m-1} $$

$$ {A}^m(r)=\frac{1}{N- m}{\displaystyle \sum_{i=1}^{N- m}{A}_i^m(r)} $$

5. Finally, sample entropy is defined as

$$ \mathrm{ SampEn}\left( m, r\right)= \ln \left[\frac{B^m(r)}{A^m(r)}\right] $$
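A minimal sketch of these steps is shown below. Because $A^m(r)$ and $B^m(r)$ share the same normalization factors, their ratio reduces to a ratio of raw match counts, which the code exploits. The function names, the absolute tolerance `r`, and the choice to return infinity when no matches of length m + 1 exist are assumptions made for this illustration.

```python
import math
import random
import statistics


def _count_matches(x, length, r, n_templates):
    """Number of ordered pairs (i, j), i != j, whose templates are within r."""
    count = 0
    for i in range(n_templates):
        for j in range(n_templates):
            if i != j and max(abs(x[i + k] - x[j + k]) for k in range(length)) <= r:
                count += 1
    return count


def sample_entropy(x, m=2, r=0.2):
    """SampEn(m, r) = ln(B / A), using N - m templates for both lengths."""
    n_templates = len(x) - m
    b = _count_matches(x, m, r, n_templates)        # matches of length m
    a = _count_matches(x, m + 1, r, n_templates)    # matches of length m + 1
    if a == 0 or b == 0:
        return float("inf")                         # undefined for this r
    return math.log(b / a)


random.seed(0)
periodic = [0.0, 1.0] * 50
noise = [random.gauss(0.0, 1.0) for _ in range(200)]
print(sample_entropy(periodic, m=2, r=0.1))
print(sample_entropy(noise, m=2, r=0.2 * statistics.pstdev(noise)))
```

For the strictly periodic signal every match of length m extends to length m + 1, so the entropy is zero, whereas the noisy signal gives a clearly positive value.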

 

The advantage of SampEn is that its value is largely independent of the time series length, because it excludes self-matches, and that it uses a simpler calculation algorithm, reducing execution time [61]. However, despite its advantages, SampEn is not widely used [69]. Sample entropy was used as a feature for automatic seizure detection in [70]. It was also applied in [71], combined with Lempel–Ziv complexity, as an indicator to discriminate focal myoclonic events and localize the myoclonic focus.


Kullback–Leibler Entropy


Kullback–Leibler entropy (K–L entropy) measures how much one probability distribution differs from another and can be interpreted as a method quantifying differences in information content [72]. K–L entropy was applied to intracranial multichannel EEG recordings and showed an ability to detect seizure onset based on spectral distribution properties [73].


Lempel–Ziv Complexity


The Lempel–Ziv (LZ) measure estimates the rate of recurrence of patterns along a time series, reflecting a signal’s complexity. LZ complexity has been applied to epileptic EEG signals, showing increased values during ictal periods [74]. Another work applied LZ complexity to E-ICA and ST-ICA transformed signals in an attempt to isolate seizure activity [75]. Both of these studies were evaluated on limited datasets, pointing to the need for a more thorough evaluation.


Permutation Entropy


Permutation entropy is a measure of complexity introduced by Bandt [76]. Its application to absence epilepsy in rats indicated superiority in the prediction of epileptic seizures and the identification of preictal periods (54 % detection rate) compared with sample entropy [77]. The same study achieved 98.6 % correct identification of interictal periods.


Order Index


Order index is another nonlinear feature that was proposed by Ouyang [78] measuring the irregularity of non-stationary time series. In a recent work [79], a comparative analysis of order index along with other linear and nonlinear features was performed.


2.4 Morphological Analysis


Most algorithms in the literature select features based on amplitude, spectral properties, and synchronization of ongoing EEG in order to identify a seizure. However, little progress has been made in incorporating a neurologist’s experience in analyzing waveform morphology and shape when deciding on optimal epilepsy treatment. Some studies working towards shape analysis of epileptic seizures give quite promising results, not only through the techniques themselves but also through the prospect of integrating the present and future perception of neurologists’ expertise. Along these lines, Deburchgraeve et al. [80] extracted segments that morphologically resemble seizures by means of a combined nonlinear energy operator and a spikiness index. Then a detector exploiting the repetitive nature of a seizure is applied. The spike and wave complexes of epileptic syndromes can also be extracted by a two-stage algorithm, the first stage enhancing the existence of spikes and the second applying patient-specific template matching [81]. Interictal spikes have also been detected using the Walsh transform in addition to the fulfillment of clinical criteria establishing a simulated epileptic spike [82, 83].


2.5 Vision-Based Analysis


In some cases, epilepsy monitoring is performed with synchronized video and EEG recordings. Epileptic syndromes are evaluated based not only on scalp recordings but also on human motion features extracted from video sequences. Analysis involves mainly detection of myoclonic jerks, eye motion (eyeball doze, eyeball upwards roll, eyelid movements), head jerks and movements, and facial expressions (mouth, lips malformations). However, a seizure-specific organization and combination of motion features should be applied in order to provide better results [84]. This promising area of research gives expert neurologists a more complete picture, helping them avoid false alarms and leading to decision support with increased accuracy. A thorough review can be found in [85]. Vision-based analysis in epilepsy can be divided into two categories, marker-based and marker-free techniques. Marker-based techniques track objects/markers placed on representative parts of the human body that convey information related to the epileptic manifestation. On the other hand, marker-free techniques use advanced image processing and computer vision tools to extract motion-related information directly from the image sequences in the video. Both techniques return time-varying signals, which form the basis for further feature extraction in the time and frequency domains. The extracted features finally feed a classifier such as an ANN or a decision tree with the aim of detecting epileptic seizures.


3 Seizure Prediction


Nowadays, seizure detection is considered practically solved with satisfactory accuracy. On the other hand, seizure prediction remains an open scientific problem, in that there is no consistent approach for predicting a seizure accurately within a significant amount of time before it occurs. However, many algorithms have been tested for their ability to forecast seizures.


3.1 Early Approaches


The notion of seizure prediction was first mentioned in 1975 [86] based on spectral analysis of EEG data collected from two electrodes. In 1981, Rogowski et al. [87] investigated preictal periods using pole trajectories of an autoregressive model. Gotman et al. [88] investigated rates of interictal spiking as indicators of upcoming seizures.


3.2 Linear Methods



Statistical Measures


Among other measures, Mormann et al. [89] investigated the statistical moments of the EEG amplitudes in order to detect the preictal state. Other linear measures, such as power [90] and signal variance [91], have been used to predict seizure onset.


Hjorth Parameters


Hjorth parameters, namely, activity, mobility, and complexity, are time domain parameters useful for the quantitative evaluation of EEG [92]. The parameter of activity represents the variance of the signal’s amplitude, the mobility represents the square root of the ratio between the variances of the first derivative and the amplitude, and the complexity is derived as the ratio between the mobility of the first derivative of the EEG and the mobility of the EEG itself. Mormann et al. used Hjorth parameters among others as features for seizure prediction [89]. Mobility has also been used, followed by SVM classification, achieving better false positive rates (fpr) in comparison with plain spectral analysis [93].
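The three definitions above translate directly into code. In this minimal sketch the derivative is approximated by the first difference of the samples, which is an assumption made here, as are the function names; the signal is assumed to be non-constant so that no variance is zero.

```python
import math


def _variance(v):
    mean = sum(v) / len(v)
    return sum((a - mean) ** 2 for a in v) / len(v)


def _diff(v):
    """First difference, used as a discrete stand-in for the derivative."""
    return [v[i + 1] - v[i] for i in range(len(v) - 1)]


def hjorth_parameters(x):
    """Return (activity, mobility, complexity) of the sequence x."""
    dx = _diff(x)
    ddx = _diff(dx)
    activity = _variance(x)
    mobility = math.sqrt(_variance(dx) / activity)
    mobility_dx = math.sqrt(_variance(ddx) / _variance(dx))
    complexity = mobility_dx / mobility
    return activity, mobility, complexity


# A pure sinusoid: its complexity is close to 1, the minimum for this measure.
n = 1000
sine = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
print(hjorth_parameters(sine))
```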


Accumulated Energy


The accumulated energy is computed from the average energy across all values of the signal $x$ in a window $k$:



$$ {E}_k=\frac{1}{N_{\mathrm{ w}}}{\displaystyle \sum_{i=1}^{N_{\mathrm{ w}}}}{x_k}^2(i), $$
where $N_{\mathrm{w}}$ is the window length.

Then, the average of ten successive window energies is added to the running accumulated energy:



$$ {\mathrm{ AE}}_m=\frac{1}{10}{\displaystyle \sum_{k=10 m-9}^{10 m}}{E}_k+{\mathrm{ AE}}_{m-1}, $$
where $m = 1, 2, \ldots, N/N_{\mathrm{w}}$ and $\mathrm{AE}_0 = 0$.

This measure can be considered as a running sum of averaged window energies. Accumulated energy has been used in various studies of seizure prediction [94–96].
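The two formulas above can be sketched as follows. The function names are hypothetical, and the windows are taken as non-overlapping, which is an assumption made here because the text does not state how successive windows are positioned.

```python
def window_energies(x, n_w):
    """Average energy E_k of each consecutive window of length n_w."""
    n_windows = len(x) // n_w
    return [sum(v * v for v in x[k * n_w:(k + 1) * n_w]) / n_w
            for k in range(n_windows)]


def accumulated_energy(x, n_w, group=10):
    """AE_m = mean of `group` successive E_k values + AE_(m-1), with AE_0 = 0."""
    energies = window_energies(x, n_w)
    ae, out = 0.0, []
    for m in range(len(energies) // group):
        ae += sum(energies[m * group:(m + 1) * group]) / group
        out.append(ae)
    return out


# A unit-amplitude signal adds exactly 1 to the accumulated energy per step.
print(accumulated_energy([1.0] * 200, n_w=10))
```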


AR Modelling


In [97], a feature extraction and classification system was proposed based on Autoregressive Models, SVM and Kalman Filtering. Its performance regarding false positives rates per hour is quite promising with a mean prediction time ranging from 5 to 92 min.


3.3 Nonlinear Methods


Most of the nonlinear methods exploit the reconstruction of a time series $x(i),\ i = 1, 2, \ldots, N$ in the phase space domain, forming the $m$-dimensional time-delayed vectors



$$ {X}_m(i)=\left\{ x(i), x\left( i+1\cdot \tau \right),\ldots, x\left( i+\left( m-1\right)\cdot \tau \right)\right\}, $$
where m is the embedding dimension and τ is the time delay.

This reconstruction conveys important information about the nonlinear dynamics of a signal and is used in many methods, some of which are described below.
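The time-delay embedding defined above is straightforward to sketch. The function name `delay_embed` is an illustrative assumption, and indices are 0-based in the code while the text counts from 1.

```python
def delay_embed(x, m, tau):
    """Return the m-dimensional delay vectors of x with time delay tau."""
    n_vectors = len(x) - (m - 1) * tau
    return [[x[i + j * tau] for j in range(m)] for i in range(n_vectors)]


# Six samples, m = 3, tau = 2: only two complete vectors can be formed.
print(delay_embed([0, 1, 2, 3, 4, 5], m=3, tau=2))
```

The number of vectors, N − (m − 1)τ, shrinks as the embedding dimension or the delay grows, which limits the parameter choices for short EEG segments.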


Lyapunov Exponent


The calculation method of Lyapunov exponents was analyzed in the previous section of this chapter. Iasemidis et al. [98–100] applied for the first time nonlinear analysis to seizure prediction. The idea behind this approach is that the transition from normal to epileptic EEG is reflected by a transition from chaotic to a more ordered state, and therefore, the spatiotemporal dynamical properties of the epileptic brain are different for different clinical states.


Dynamical Similarity Index


Dynamical similarity index is a method introduced in [101] which calculates brain state dynamics through phase space reconstruction and compares a running window state against a reference window with the use of the cross correlation integral. Various studies using this index have shown promising results even in the detection of preictal states of temporal lobe epilepsy [102, 103] and neocortical partial epilepsy [104].


Correlation Dimension


In order to calculate the correlation dimension, the correlation integral defined by Grassberger and Procaccia [105] is needed:



$$\begin{array}{lll} {C}_m\left(\varepsilon, {N}_m\right) =\frac{2}{N_m\left({N}_m-1\right)}{\displaystyle \sum_{i=1}^{N_m}{\displaystyle \sum_{j= i+1}^{N_m}\Theta}}\left(\varepsilon -\left\Vert {x}_i-{x}_j\right\Vert \right),\cr{N}_m = N-\left( m-1\right)\cdot \tau,\end{array} $$
where Θ is the Heaviside function. This integral counts the pairs $(x_i, x_j)$ whose distance is smaller than ε. Then, the correlation dimension $D$ is defined by



$$ D={\mathop{\lim}\limits_{\varepsilon \to 0\atop N\to \mathit{\infty}}}\frac{\partial \ln {C}_m\left(\varepsilon, {N}_m\right)}{\partial \ln \varepsilon} $$

In preictal states, drops in correlation dimension were observed making this measure able to identify states preceding seizures [95, 106, 107].
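The correlation integral and a crude finite-difference estimate of its log-log slope can be sketched as follows. The function names, the Euclidean norm, and the two-point slope estimate are assumptions made for illustration; in practice the slope is fitted over a scaling region of ε rather than taken between two values.

```python
import math


def correlation_integral(vectors, eps):
    """Fraction of distinct pairs of vectors closer than eps."""
    n = len(vectors)
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(vectors[i], vectors[j]) < eps:
                count += 1
    return 2.0 * count / (n * (n - 1))


def slope_estimate(vectors, eps_small, eps_large):
    """Two-point estimate of d ln C / d ln eps."""
    c1 = correlation_integral(vectors, eps_small)
    c2 = correlation_integral(vectors, eps_large)
    return (math.log(c2) - math.log(c1)) / (math.log(eps_large) - math.log(eps_small))


# Points lying on a straight line in the plane form a one-dimensional set.
line = [[i / 400.0, i / 400.0] for i in range(400)]
print(slope_estimate(line, 0.05, 0.1))
```

For points on a line the estimate is close to 1, as expected for a one-dimensional set.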


Entropies


Zandi et al. [108] used an entropic measure of positive zero-crossing intervals, achieving a false positive rate of 0.28 and an average prediction time of 25 min.

Jun 25, 2017 | Posted by in PATHOLOGY & LABORATORY MEDICINE | Comments Off on for Seizure Detection and Prediction: An Overview
