Thursday, February 19, 2009

Support Vector Machine (SVM) as Post Classifier for Epilepsy Risk Level Classifications from Fuzzy based EEG Signal Parameters

Benefits of using Virtual Instrumentation
Virtual instrumentation combines mainstream computer technologies with innovative software
and flexible hardware, which makes it possible to develop computer-based instrumentation
solutions. The concept of virtual instrumentation enables students, engineers and scientists
to build powerful applications that increase productivity and performance by reducing
programming complexity. Features such as virtual instrument simulation make it user friendly,
so that errors can be identified and rectified easily.
Products Used
LabVIEW 8.2
DAQ: NI 6259
[Bus P,Pc ,Express; Analog Input: 32; Analog Output: 4; Resolution: 16 bit; Max Update Rate:
2.8 MS/s; Digital Input: 48]
Problem to be solved
Epilepsy is caused by the cumulative firing of neurons in the brain. Once a diagnosis of
epilepsy is established, it is important to begin treatment right away. The longer treatment is
delayed, the more difficult the epilepsy is to treat. This paper presents the software and
hardware details of a prototype PC-based monitoring unit for diagnosing the risk level of
epilepsy, which allows the physician to monitor the patient’s epilepsy risk level and decide on
appropriate therapeutic measures.
Solution to the problem
Support Vector Machine (SVM), like multilayer perceptrons and Radial Basis Function
networks, is used for pattern classification and non-linear regression. SVM is now regarded
as an important example of ‘Kernel Methods’. The main idea of SVM is to construct a
hyperplane as the decision surface in such a way that the margin of separation between positive
and negative examples is maximized. The SVM is an approximate implementation of the method of
structural risk minimization. With SVM we investigate the optimization of fuzzy outputs in the
classification of epilepsy risk levels from EEG (Electroencephalogram) signals. The fuzzy
techniques are applied as a first-level classifier to classify the risk levels of epilepsy based on
extracted parameters such as energy, variance, peaks, sharp and spike waves, duration, events and
covariance from the EEG signals of the patient.
Introduction
Support Vector Machine (SVM) is used as a post classifier to obtain the
optimized risk level characteristics of an epileptic patient. Epileptic attacks often go
unnoticed or are confused with other events, such as a stroke, which also cause falls or
migraines. In India the number of persons suffering from epilepsy is increasing every year.
Diagnosis and therapy are complex and need to be cost effective. Airports, amusement parks, and
shopping malls are just a few of the places where computers are used to diagnose a person’s
epilepsy risk level if a life-threatening condition occurs. In some situations trained doctors
and neuroscientists are not always on hand. This project is intended to synthesize a
cost-effective SVM mechanism that classifies the epilepsy risk level of patients and mimics a
doctor’s or neuroscientist’s diagnosis.
The EEG (Electroencephalogram) signals of 20 patients were collected from Sri
Ramakrishna Hospitals at Coimbatore, and their epilepsy risk levels were identified after
converting the EEG signals to code patterns by fuzzy systems. This type of classification helps
doctors and neurosurgeons to give appropriate therapeutic measures to the patients. The
project is carried out in order to help save a patient’s life when a life-threatening condition
occurs, and also to create public awareness about the risks of epilepsy.
The project can be further improved by collecting EEG signals of another 10 patients, so that
the patients’ risk levels can be classified on a larger data set and a cost-effective therapy
device can be designed for doctors. Since the work is done off line, further improvement is
needed to diagnose on line. With an on-line method, minute-to-minute diagnosis can be obtained
for higher risk level epilepsy patients, and mass screening for epilepsy becomes possible.
The block diagram of the epilepsy classifier is shown in Figure 1. The classification is
accomplished as follows:
1. Fuzzy classification of the epilepsy risk level at each channel from the EEG signals and their
parameters.
2. The results of each channel are optimized, since they are at different risk levels.
3. The performance of the fuzzy classification before and after the SVM optimization is
analyzed.
Figure 1 SVM-Fuzzy Classification System (block diagram: EEG Signal Parameter -> Fuzzy
System -> Code Patterns -> SVM -> Risk level output)
Hardware description
The Electroencephalogram signals from epileptic patients are collected from
hospitals. The EEG signals are then converted to code patterns by fuzzy systems. Figure 2
below shows how the EEG signals are converted for processing.
Figure 2 EEG Signal Conversion
The output of the fuzzy system represents a wide space of risk levels. This is due to sixteen
different channels of input to the system in three epochs, which yields a total of forty-eight
input-output pairs. Since we deal with known cases of epileptic patients, it is essential to find
the exact risk level of the patient. SVM optimization will also aid in the development of
automated systems that can precisely classify the risk level of the epileptic patient under
observation. Hence an optimization of the outputs of the fuzzy system is initiated. This will
improve the classification of the patient’s state and can provide the EEGer with a clear picture.
Application description
The output of the fuzzy system spans a wide range of risk levels, because sixteen input
channels are evaluated over three epochs, giving forty-eight input-output pairs. Since we deal
with known cases of epileptic patients, it is essential to find the exact risk level of each
patient. Because the fuzzy system alone yields a low performance index (40%) and a low quality
value (6.25), its output must be optimized. Hence we move to SVM classification, which gives a
performance index of 98% and a quality value of 22.94.
The following tasks are carried out to classify the risk levels by SVM:
1. First, the simplest case is analyzed, with a hyperplane as the decision function for known
linear data.
2. A non-linear classification is done for the codes obtained from a particular patient by using
quadratic discrimination.
3. Then k-means clustering is performed for large data with different sets of clusters, with a
centroid for each.
4. The centroids obtained are mapped by the kernel function to obtain a proper shape.
5. A linear separation is obtained by using SVM with the kernel and k-means clustering.
The parameters derived from the EEG signal are stored as data sets. The fuzzy
technique is then used to obtain the risk level at every EEG channel. The objective was to
classify the risk levels perfectly with a high rate of classification. Since it is impossible to
obtain perfect performance under all these conditions, some compromises have been made. A
classification rate of above 98% for the epilepsy risk level is possible with our method. The
number of cases has to be increased beyond the present twenty patients for better testing of the
system. From this method we can infer how frequently high risk levels occur and the possible
medication for the patients. Optimizing each region’s data separately can also address the focal
epilepsy problem. Future research is directed towards a comparison between heuristic
optimization models and SVM.
Fuzzy techniques arrive at many suboptimal solutions. These solutions have to be
optimized to arrive at a better solution for identifying the patient’s epilepsy risk level. The
Support Vector Machine (SVM) method is chosen for the optimization of the fuzzy outputs.
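The following solution steps are followed:
Step 1: The linearization and convergence are done using quadratic optimization. The primal
minimization problem is transformed into its dual optimization problem of maximizing the dual
Lagrangian $L_D$ with respect to $\alpha_i$:
$$\max_{\alpha} L_D = \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) \quad (1)$$
subject to
$$\sum_{i=1}^{l} \alpha_i y_i = 0 \quad (2)$$
$$\alpha_i \ge 0, \quad i = 1, \dots, l \quad (3)$$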
Step 2: The optimal separating hyperplane is constructed by solving the quadratic programming
problem defined by (1)-(3). In this solution, the points that have non-zero Lagrangian
multipliers ($\alpha_i > 0$) are termed support vectors.
Step 3: Support vectors lie closest to the decision boundary. Consequently, the optimal hyper
plane is only determined by the support vectors in the training data.
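As a small worked illustration of Steps 1 to 3, the sketch below evaluates the standard
hard-margin dual objective, L_D = sum(alpha_i) - 1/2 * sum_i sum_j alpha_i alpha_j y_i y_j (x_i . x_j),
for a two-point toy problem. The function names and the toy data are assumptions made for
illustration only; they are not taken from the original LabVIEW/MATLAB implementation.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def dual_objective(alphas, xs, ys):
    """Value of the dual Lagrangian L_D for given multipliers (linear kernel)."""
    n = len(xs)
    quad = sum(
        alphas[i] * alphas[j] * ys[i] * ys[j] * dot(xs[i], xs[j])
        for i in range(n)
        for j in range(n)
    )
    return sum(alphas) - 0.5 * quad


def weight_vector(alphas, xs, ys):
    """Hyperplane normal w = sum_i alpha_i * y_i * x_i."""
    dim = len(xs[0])
    return [sum(a * y * x[d] for a, y, x in zip(alphas, ys, xs)) for d in range(dim)]


# Toy data (hypothetical): one positive and one negative example on a line.
xs = [(1.0,), (-1.0,)]
ys = [1, -1]

# The constraint sum(alpha_i * y_i) = 0 forces alpha_1 = alpha_2 here.
for a in (0.3, 0.4, 0.5, 0.6):
    print(a, dual_objective([a, a], xs, ys))

# Points with alpha_i > 0 are the support vectors; both points qualify here.
print(weight_vector([0.5, 0.5], xs, ys))
```

On this toy problem the objective reduces to 2a - 2a^2, which peaks at a = 0.5, and both points
end up as support vectors. A real solver would search over the multipliers with a quadratic
programming routine rather than a hand-picked grid.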
Step 4: k-means clustering is done for the given set of data. The k-means function forms
a group of clusters according to the conditions given in Step 2 and Step 3. For a group of 3
clusters, for example, the k-means function randomly chooses 3 centre points from the given
set. Each centre point then acquires the values that lie around it.
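The clustering in Step 4 can be sketched with the textbook k-means (Lloyd) iteration. This is a
minimal illustration, not the original implementation: the function name, the optional
`initial_centres` parameter and the sample points are all assumptions. When no initial centres
are given, they are drawn at random from the data, as the step describes.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, initial_centres=None, max_iters=100, seed=None):
    """Return k centre points found by Lloyd's algorithm."""
    if initial_centres is None:
        # Random choice of k centre points from the given set.
        centres = random.Random(seed).sample(points, k)
    else:
        centres = list(initial_centres)
    for _ in range(max_iters):
        # Assignment: each point joins the cluster of its nearest centre.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: squared_distance(p, centres[c]))
            clusters[nearest].append(p)
        # Update: each centre moves to the mean of the points around it.
        new_centres = []
        for c, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                mean = tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
                new_centres.append(mean)
            else:
                new_centres.append(centres[c])  # keep an empty cluster's centre
        if new_centres == centres:
            break
        centres = new_centres
    return centres


# Hypothetical two-dimensional code patterns forming three groups.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0), (10, 1)]
centres = kmeans(points, 3, initial_centres=[(0, 0), (5, 5), (10, 0)])
print(centres)
```

With random initialization the result can depend on the starting centres, so in practice the
procedure is often repeated from several random starts.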
Step 5: Now there will be six centre points, three from each epoch, and the SVM training
process is then done by kernel methods. Thus, only the kernel function is used in the training
algorithm, and one does not need to know the explicit form of $\Phi$. Some of the commonly used
kernel functions are:
Polynomial function: $K(x, y) = (x \cdot y + 1)^d$
Radial Basis Function: $K(x, y) = \exp(-\gamma \|x - y\|^2)$
Sigmoid function: $K(x, y) = \tanh(\kappa \, (x \cdot y) + \theta)$
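The three kernels named in Step 5 can be written down directly in their common textbook forms.
The sketch below is illustrative only: the parameter names (`degree`, `coef0`, `gamma`, `kappa`,
`theta`) and their default values are assumptions, not settings reported for this project.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """K(x, y) = (x . y + coef0) ** degree"""
    return (dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    """K(x, y) = exp(-gamma * ||x - y||^2)"""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def sigmoid_kernel(x, y, kappa=1.0, theta=0.0):
    """K(x, y) = tanh(kappa * (x . y) + theta)"""
    return math.tanh(kappa * dot(x, y) + theta)


x, y = (1.0, 2.0), (3.0, 4.0)
print(polynomial_kernel(x, y))  # (3 + 8 + 1) ** 2
print(rbf_kernel(x, x))         # identical points give the maximum similarity, 1.0
print(sigmoid_kernel((0.0, 0.0), y))
```

Each function only needs the two input vectors, which is the point of the kernel approach: the
mapping into the feature space never has to be computed explicitly.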
The hyperplane and support vectors are used to separate linearly separable and non-linearly
separable data. Figures 3 and 4 below show the VI simulation of our project.
Figure 3 VI Simulation
Figure 4 VI Simulation
Kernel Functions
One of the major tricks of SVM learning is the use of kernel functions to extend the class
of decision functions to the non-linear case. This is done by mapping the data from the input
space $X$ into a high-dimensional feature space $H$ by a function $\Phi: X \to H$ and solving the
linear learning problem in $H$. The actual function $\Phi$ does not need to be known; it suffices
to have a kernel function $k$ which calculates the inner product in the feature space,
$k(x, y) = \Phi(x) \cdot \Phi(y)$.
It was noticed by Schölkopf that the kernel function defines a distance measure $d$ on the input
space by
$$d(x, y)^2 = \|\Phi(x) - \Phi(y)\|^2 \quad (4)$$
$$d(x, y)^2 = k(x, x) - 2 k(x, y) + k(y, y) \quad (5)$$
This shows that the kernel function can be interpreted as a measure of similarity between the
examples $x$ and $y$.
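The kernel-induced distance can be checked numerically. The sketch below assumes the standard
identity d(x, y)^2 = k(x, x) - 2 k(x, y) + k(y, y); the helper names and sample vectors are
hypothetical. For the linear kernel the result coincides with the squared Euclidean distance.

```python
import math


def linear_kernel(x, y):
    """k(x, y) = x . y"""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    """k(x, y) = exp(-gamma * ||x - y||^2)"""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def kernel_distance_sq(k, x, y):
    """Squared feature-space distance computed from kernel values alone."""
    return k(x, x) - 2.0 * k(x, y) + k(y, y)


x, y = (1.0, 2.0), (4.0, 6.0)
print(kernel_distance_sq(linear_kernel, x, y))  # equals (4-1)^2 + (6-2)^2
print(kernel_distance_sq(rbf_kernel, x, y))     # bounded above by 2 for an RBF kernel
```

Because an RBF kernel gives k(x, x) = 1 for every x, its induced squared distance is
2 - 2 k(x, y), so similar examples (kernel value near 1) are close in the feature space.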
Linear kernel
The linear kernel $k(x, y) = x \cdot y$ is the simplest kernel function. The decision function
takes the form $f(x) = w \cdot x + b$. When one uses the linear kernel to predict time series, i.e.
$x_t = f(x_{t-1}, \dots, x_{t-k})$, the resulting model is a statistical
autoregressive model of order k (AR[k]). With this kernel, time series are taken to be similar
if they are generated by the same AR model.
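To make the link between the linear kernel and an AR[k] model concrete, the sketch below builds
the lagged input windows (x_{t-1}, ..., x_{t-k}) and applies a linear decision function
f(x) = w . x + b to them. The function names, the toy series and the weights are assumptions
chosen for illustration.

```python
def lagged_windows(series, k):
    """Split a series into inputs (x_{t-1}, ..., x_{t-k}) and targets x_t."""
    inputs, targets = [], []
    for t in range(k, len(series)):
        inputs.append(tuple(series[t - 1 - i] for i in range(k)))
        targets.append(series[t])
    return inputs, targets


def linear_predict(w, b, window):
    """Linear decision function f(x) = w . x + b, i.e. an AR[k] prediction."""
    return sum(wi * xi for wi, xi in zip(w, window)) + b


# Hypothetical series: an arithmetic progression obeys x_t = 2*x_{t-1} - x_{t-2}.
series = [1.0, 2.0, 3.0, 4.0, 5.0]
inputs, targets = lagged_windows(series, k=2)
predictions = [linear_predict((2.0, -1.0), 0.0, window) for window in inputs]
print(inputs)
print(predictions, targets)
```

In an SVM regression setting the weights would be learned from such windows rather than fixed
by hand as they are here.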
RBF kernels
Radial basis kernels take the form $k(x, y) = \exp(-\gamma \|x - y\|^2)$. Clearly, the similarity
of two examples is simply judged by their Euclidean distance. In terms of time series, this has a
parallel in the so-called phase space representation. Assume the time series is generated by a
function $g$ such that $x_t = g(x_{t-1}, \dots, x_{t-k})$. If one takes the time series and plots
the (k+1)-dimensional vectors $(x_t, x_{t-1}, \dots, x_{t-k})$, the resulting plot is a part of the
graph of $g$, so the function $g$ can be estimated from the time series. In particular, assuming
that the function is linear and the data is generated by
$x_t = a x_{t-1} + \epsilon_t$,
where $\epsilon_t$ is Gaussian noise (i.e. the time series model is AR[1]), it can
be shown that most of the data lies in an ellipsoid defined by the mean of the time series and the
variance of $\epsilon_t$. This is used in the phase space procedure for finding outliers in the
time series. It shows that information about a window of a time series can be obtained from other
windows of the time series that are similar in terms of the Euclidean distance, which makes the
RBF kernel promising for time series analysis.
Fourier Kernel
A common transformation for the analysis of time series data is the Fourier
transform (see Figure 4). This representation is useful if the information of the time series
does not lie in the individual values at each time point but in the frequency of some events. It
was noted by Vapnik that the inner product of the Fourier expansions of two time series can be
directly calculated by the regularized kernel function
$$k(x, y) = \frac{1 - q^2}{2 \left(1 - 2 q \cos(x - y) + q^2\right)}$$
METHODOLOGY
The hyperplane and support vectors are used to separate linearly separable and non-linearly
separable data. In this project we used the Radial Basis Function (RBF) kernel [4] for this
non-linear classification. RBF is a curve-fitting approximation in a higher dimensional space.
From this viewpoint, learning is equivalent to finding a surface in a multidimensional space that
provides a best fit to the training data, and generalization is equivalent to the use of this
multidimensional surface to interpolate the test data. It draws upon traditional strict
interpolation in multidimensional space. Thus the RBF provides a set of functions that act
as a “basis” for the input patterns when they are expanded into the hidden space. From the set of
RBF testing values, the Mean Square Error (MSE) and the Average MSE are calculated. The tools
used in this study are MATLAB v7.2 and LabVIEW 8.2.
An important factor in the choice of a classification method for a given problem is the
available a-priori knowledge. During the last few years support vector machines (SVM) have been
shown to be widely applicable and successful, particularly in cases where the a-priori knowledge
consists of labeled learning data. If more knowledge is available, it is reasonable to incorporate
and model this knowledge in order to improve the classification results or to require less
training data. Therefore, much active research deals with adapting the general SVM methodology to
cases where additional a-priori knowledge is available. We have focused on the common case where
the variability of the data can be modeled by transformations which leave the class membership
unchanged. If these transformations can be modeled by mathematical groups of transformations,
one can incorporate this knowledge independently of the classifier during the feature extraction
stage by group integration, normalization, etc. This leads to invariant features, on which any
classification algorithm can be applied.
One of the main assumptions of SVM is that all samples in the training set are
independent and identically distributed (i.i.d.). However, in many practical engineering
applications, the obtained training data is often contaminated by noise. Further, some samples in
the training data set are placed on the wrong side by accident. These are known as outliers. In
this case, the standard SVM training algorithm makes the decision boundary deviate severely
from the optimal hyperplane, so that the SVM is very sensitive to noise, and especially to those
outliers that are close to the decision boundary. This makes the standard SVM no longer sparse,
that is, the number of support vectors increases significantly due to outliers. In this project,
we present a general method that follows the main idea of SVM, using an adaptive margin for each
data point to formulate the minimization problem, together with the RBF kernel trick. The
classification functions obtained by minimizing the MSE are not sensitive to outliers in the
training set. The reason that classical MSE is immune to outliers is that it is an averaging
algorithm: a particular sample in the training set contributes only a little to the final result.
The effect of outliers can be eliminated by averaging over the samples. That is why averaging is
a simple yet effective tool to tackle outliers.
In order to avoid outliers we utilized the RBF kernel functions and also decision
functions for determining the margin of each class. The twenty epilepsy patients are analyzed
through leave-one-out and ten-fold cross validation. Based on the MSE and Average MSE values
of the SVM models, the classifications of epilepsy risk levels are validated. Figure 5 depicts
the training and testing MSE of the SVM models. The outlier problem is solved through the
Average MSE method, which is shown in Figure 6.
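The validation scheme can be sketched as follows. This is a minimal illustration under stated
assumptions: the source does not spell out how the MSE is computed, so the usual definition
(mean of squared differences between target and predicted risk levels) is assumed, and the
function names and sample numbers are hypothetical.

```python
def mean_squared_error(targets, predictions):
    """Mean of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


def leave_one_out(n_patients):
    """Yield (training_indices, held_out_index) for each patient in turn."""
    for held_out in range(n_patients):
        training = [i for i in range(n_patients) if i != held_out]
        yield training, held_out


def average_mse(per_patient_mse):
    """Average the per-patient MSE values, which damps the effect of outliers."""
    return sum(per_patient_mse) / len(per_patient_mse)


# Twenty patients, as in the study: each split trains on 19 and tests on 1.
splits = list(leave_one_out(20))
print(len(splits), len(splits[0][0]))

# Hypothetical per-patient testing errors; one outlier barely moves the average.
errors = [0.002] * 19 + [0.02]
print(average_mse(errors))
```

A ten-fold scheme works the same way, except that each split holds out a tenth of the data
instead of a single patient.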
Figure 5. MSE of Training and Testing of SVM Models (x-axis: Patients; y-axis: MSE of SVM
Models; series: Series1, testing)
Figure 6. Average MSE under Testing of SVM Models (x-axis: Patients; y-axis: Average MSE)
Figure 7 shows the Perfect Classification (PC) obtained as a function of the amount of training
data. Up to 20% of the training data set, a perfect classification of 100% is obtained. When the
training includes the outliers, the PC of the epilepsy risk level slips to the 95% level, and
finally, when all the sets of data are trained, the PC settles at 98%.
Figure 7. Training of Data with Perfect Classification (x-axis: Percentage of Training Data;
y-axis: Perfect classification)
Test Results
With SVM the perfect classification is about 97.39%, which is very high when
compared with fuzzy logic, where it is only 50%. The sensitivity and specificity of SVM are also
higher than those of the latter. The missed classification of SVM is 1.458%, whereas it is about
20% for the fuzzy system, and the value of PI is 97.07 for SVM and 40 for fuzzy. Table I gives
the detailed results of the fuzzy and SVM methods.
TABLE I. PERFORMANCE INDEX
Methods            Perfect Classification   Missed Classification   False Alarm   Performance Index
Fuzzy logic        50                       20                      10            40
SVM Optimization   97.39                    1.458                   1.385         97.07
The PI calculated for the aforesaid classification methods using (8) is 97.07 for SVM
optimization, which is higher than for the fuzzy technique. It is evident that the optimization
gives a better performance than the fuzzy techniques due to its lower false alarms and missed
classifications. This optimization model is evaluated in terms of its receiver operating
characteristics (ROC) curve for test data sets. This enables the user to evaluate a model in
terms of the trade-off between sensitivity and specificity. ROC matrices are used to show
how changing the detection threshold affects detections versus false alarms. If the threshold is
set too high then the system will miss too many detections. Conversely, if the threshold is very
low then there will be heavy false alarms. The percentage of detections classified correctly is
plotted against the percentage of non-detections incorrectly classified as detections (i.e.
false alarms) as a function of the detection threshold. ROC is the best way to evaluate a
detector.
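The performance index values in Table I can be reproduced with a short calculation. The formula
used below, PI = (PC - MC - FA) / PC * 100, is an assumption: the defining equation is not
included in this text, but this form matches both rows of the table to within rounding.

```python
def performance_index(perfect, missed, false_alarm):
    """Assumed form: PI = (PC - MC - FA) / PC * 100, all inputs in percent."""
    return (perfect - missed - false_alarm) / perfect * 100.0


# Values taken from Table I.
print(performance_index(50, 20, 10))             # fuzzy logic row
print(performance_index(97.39, 1.458, 1.385))    # SVM optimization row
```

The fuzzy row gives 40 and the SVM row gives a value within about 0.01 of the tabulated 97.07,
which shows how the index penalizes both missed classifications and false alarms relative to the
perfect classification rate.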
The performance of classification for the test data set is assessed by calculating the area
under the ROC curve, $A_z$. The value of $A_z$ ranges from 0.5 to 1, with 1 corresponding to a
perfect classifier. A good trade-off is observed between detections and false alarms. The ROC
curve for the fuzzy classifier with SVM optimization is shown in Figure 8.
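The quantities behind an ROC curve can be sketched in a few lines. This is an illustrative
example only: the counts and curve points are hypothetical, and the area is computed with the
trapezoidal rule, which is one common choice and not necessarily the method used in the study.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of actual detections that are classified correctly."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of non-detections that are correctly rejected."""
    return true_neg / (true_neg + false_pos)


def area_under_roc(false_alarm_rates, detection_rates):
    """Trapezoidal area under the curve of detection rate vs. false alarm rate."""
    pts = sorted(zip(false_alarm_rates, detection_rates))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


print(sensitivity(true_pos=15, false_neg=1))
print(area_under_roc([0.0, 1.0], [0.0, 1.0]))            # chance-level diagonal
print(area_under_roc([0.0, 0.0, 1.0], [0.0, 1.0, 1.0]))  # perfect classifier
```

The two limiting cases give areas of 0.5 and 1, which correspond to the two ends of the range
of the area under the ROC curve mentioned above.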
Figure 8. ROC of Fuzzy and SVM Classifiers (panel title: ROC of SVM Post Classifier; axes:
Specificity and Sensitivity)
In order to compare different classifiers we need a measure that reflects the overall
quality of the classifier. The quality is determined by three factors: classification rate,
classification delay and false alarm rate. The quality value $Q_V$ is defined as
$$Q_V = \frac{C}{(R_{fa} + 0.2) \cdot (T_{dly} \cdot P_{dct} + 6 \cdot P_{msd})} \quad (6)$$
where $C$ is the scaling constant,
$R_{fa}$ is the number of false alarms per set,
$T_{dly}$ is the average delay of the onset classification in seconds,
$P_{dct}$ is the percentage of perfect classification and
$P_{msd}$ is the percentage of perfect risk level missed.
The constant C is empirically set to 10 because this scales the value of QV to an
easy-to-read range. The higher the value of QV, the better the classifier; among different
classifiers, the one with the highest QV should be the best. Figure 9 depicts the quality
values for each patient. Table II shows the comparison of the fuzzy and SVM optimization
techniques. It is observed from Table II that the SVM method performs well, with the
highest performance index and quality values.
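The quality value can be evaluated directly from its definition,
QV = C / ((Rfa + 0.2) * (Tdly * Pdct + 6 * Pmsd)). The sketch below is a hedged illustration:
it assumes that the percentages and the false alarm figure are entered as fractions (for example
97.39% as 0.9739), which the text does not state explicitly, and the function name is
hypothetical.

```python
def quality_value(r_fa, t_dly, p_dct, p_msd, c=10.0):
    """QV = C / ((Rfa + 0.2) * (Tdly * Pdct + 6 * Pmsd)).

    Assumption: r_fa, p_dct and p_msd are given as fractions, t_dly in seconds.
    """
    return c / ((r_fa + 0.2) * (t_dly * p_dct + 6.0 * p_msd))


# Averages from Table II, converted to fractions under the assumption above.
qv_svm = quality_value(r_fa=0.01389, t_dly=2.031, p_dct=0.9739, p_msd=0.01458)
qv_fuzzy = quality_value(r_fa=0.10, t_dly=4.0, p_dct=0.50, p_msd=0.20)
print(qv_svm, qv_fuzzy)
```

Under this assumption the averaged inputs give roughly 22.6 for SVM (reported: 22.94) and
roughly 10.4 for the fuzzy system (reported: 6.25). The reported figures are averages of
per-patient quality values, so exact agreement with a value computed from averaged inputs is not
expected; the example is meant to show how lower delay, fewer misses and fewer false alarms
raise QV.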
Figure 9: Quality value for Data set (x-axis: Patient, 1 to 20; y-axis: Quality Value)
TABLE II: COMPARISON RESULTS OF CLASSIFIERS TAKEN AS AVERAGE OF ALL TEN PATIENTS
Parameters                  Fuzzy Techniques Without Optimization   Optimization With SVM Technique
Perfect Classification (%)  50                                      97.39
Missed Classification (%)   20                                      1.458
False Alarm                 10                                      1.389
Weighted Delay in secs      4                                       2.031
Performance Index (%)       40                                      97.07
Sensitivity                 83.33                                   98.59
Specificity                 71.42                                   98.52
Quality Value               6.25                                    22.94
CONCLUSION
This project investigates the performance of SVM in optimizing the epilepsy risk level of
epileptic patients from EEG signals. The parameters derived from the EEG signal are stored as
data sets. The fuzzy technique is then used to obtain the risk level from each epoch at every EEG
channel. The objective was to classify the risk levels perfectly with a high rate of
classification, a short delay from onset, and a low false alarm rate. Since it is impossible to
obtain perfect performance under all these conditions, some compromises have been made. As a
high false alarm rate ruins the effectiveness of the system, a low false alarm rate is most
important. SVM optimization techniques are used to optimize the risk level by incorporating the
above goals. A classification rate of above 98% for the epilepsy risk level is possible with our
method. The missed classification is about 1.458% for a short delay of 2.031 seconds. The number
of cases has to be increased beyond the present twenty patients for better testing of the system.
From this method we can infer how frequently high risk levels occur and the possible medication
for the patients. Optimizing each region’s data separately can also address the focal epilepsy
problem.
Sources:
1. Leon D. Iasemidis et al., Adaptive Epileptic Seizure Prediction System, IEEE Transactions on
Biomedical Engineering, May 2003, 50(5): 616-627.
2. K. P. Adlassnig, Fuzzy Set Theory in Medical Diagnosis, IEEE Transactions on Systems, Man,
and Cybernetics, March 1986, 16: 260-265.
3. Alison A. Dingle et al., A Multistage System to Detect Epileptiform Activity in the EEG, IEEE
Transactions on Biomedical Engineering, 1993, 40(12): 1260-1268.
4. Hao Qu and Jean Gotman, A Patient-Specific Algorithm for the Detection of Seizure Onset in
Long-Term EEG Monitoring: Possible Use as a Warning Device, IEEE Transactions on Biomedical
Engineering, February 1997, 44(2): 115-122.
5. Arthur C. Guyton, Textbook of Medical Physiology, Prism Books Pvt. Ltd., Bangalore, 9th
Edition, 1996.
6. J. Seunghan Park et al., TDAT Domain Analysis Tool for EEG Analysis, IEEE Transactions on
Biomedical Engineering, August 1990, 37(8): 803-811.
7. Donna L. Hudson, Fuzzy Logic in Medical Expert Systems, IEEE EMB Magazine,
November/December 1994, 13(6): 693-698.
8. R. Harikumar and B. Sabarish Narayanan, Fuzzy Techniques for Classification of Epilepsy Risk
Level from EEG Signals, Proceedings of IEEE Tencon 2003, 14-17 October 2003, Bangalore,
India, 209-213.
9. Mark van Gils, Signal Processing in Prolonged EEG Recordings during Intensive Care, IEEE
EMB Magazine, November/December 1997, 16(6): 56-63.
10. Celement C. et al., A Comparison of Algorithms for Detection of Spikes in the
Electroencephalogram, IEEE Transactions on Biomedical Engineering, April 2003, 50(4):
521-26.
11. Pamela McCauley-Bell and Adedeji B. Badiru, Fuzzy Modeling and Analytic Hierarchy
Processing to Quantify Risk Levels Associated with Occupational Injuries, Part I: The
Development of Fuzzy-Linguistic Risk Levels, IEEE Transactions on Fuzzy Systems, 1996,
4(2): 124-31.
12. Joel J. et al., Detection of Seizure Precursors from Depth EEG Using a Sign Periodogram
Transform, IEEE Transactions on Biomedical Engineering, April 2004, 51(4): 449-458.
13. S. Haykin, Neural Networks: A Comprehensive Foundation, Prentice-Hall Inc., 2nd Ed., 1999.
14. Mu-Chun Su and Chien-Hsing Chou, A Modified Version of the k-Means Clustering Algorithm with
a Distance Based on Cluster Symmetry, IEEE Transactions on Pattern Analysis and Machine
Intelligence, June 2001, 23(6): 674-680.
15. Rangaraj M. Rangayyan, Biomedical Signal Analysis: A Case-Study Approach, IEEE Press /
John Wiley & Sons Inc., New York, 2002.
16. Sathish Kumar, Neural Networks: A Classroom Approach, McGraw-Hill, New York, 2004.
17. Richard O. Duda, David G. Stork, Peter E. Hart, Pattern Classification, second edition, A
Wiley-Interscience Publication, John Wiley and Sons, Inc., 2003.
18. Jehan Zeb Shah and Naomie bt Salim, Neural Networks and Support Vector Machines Based
Bio-Activity Classification, Proceedings of the 1st Conference on Natural Resources Engineering &
Technology 2006, 24-25 July 2006, Putra Jaya, Malaysia, 484-491.
19. Qing Song, Wenjie Hu, and Wenfang Xie, Robust Support Vector Machine with Bullet Hole
Image Classification, IEEE Transactions on SMC Part C, 2002, 32(4): 440-448.
20. V. Vapnik, Statistical Learning Theory, Wiley, Chichester, GB, 19
