8.3.3 SVM algorithm for the analysis of mental functions

$\begin{array}{l} H_{11} = y_{1} y_{1} {(1 + X_{1}^{T} X_{1})}^{2} = (1)(1) {\left(1 + \left[\begin{matrix} 1 & -1 \end{matrix}\right] \left[\begin{array}{c} 1 \\ -1 \end{array}\right]\right)}^{2} = 9 \\ H_{12} = y_{1} y_{2} {(1 + X_{1}^{T} X_{2})}^{2} = (1)(1) {\left(1 + \left[\begin{matrix} 1 & -1 \end{matrix}\right] \left[\begin{array}{c} -1 \\ 1 \end{array}\right]\right)}^{2} = 1 \end{array}$

If all the $H_{ij}$ values are calculated in this way, the following matrix is obtained:

$H = \left[\begin{matrix} 9 & 1 & -1 & -1 \\ 1 & 9 & -1 & -1 \\ -1 & -1 & 9 & 1 \\ -1 & -1 & 1 & 9 \end{matrix}\right]$
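The matrix can be reproduced numerically. The following is a minimal sketch, not the book's own code. It assumes the four training points $X_{1} = (1, -1)$, $X_{2} = (-1, 1)$, $X_{3} = (1, 1)$, $X_{4} = (-1, -1)$ with labels $y = (1, 1, -1, -1)$; the last two points and the labels are not stated in this excerpt and are inferred from the feature vectors used for w. All variable names are illustrative.

```python
import numpy as np

# Training points and labels (X_3, X_4 and the labels are inferred, see lead-in).
X = np.array([[1, -1], [-1, 1], [1, 1], [-1, -1]], dtype=float)
y = np.array([1, 1, -1, -1], dtype=float)

# Second-degree polynomial kernel K(x_i, x_j) = (1 + x_i^T x_j)^2.
K = (1.0 + X @ X.T) ** 2

# H_ij = y_i * y_j * K(x_i, x_j).
H = np.outer(y, y) * K
print(H)
```

Building H from the full Gram matrix `X @ X.T` avoids computing the sixteen entries one by one.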

Step (8–15) To obtain the vector a at this stage, the linear system $\mathbf{1} - Ha = 0$ must be solved, that is, $\left[\begin{array}{c} 1 \\ 1 \\ 1 \\ 1 \end{array}\right] - \left[\begin{matrix} 9 & 1 & -1 & -1 \\ 1 & 9 & -1 & -1 \\ -1 & -1 & 9 & 1 \\ -1 & -1 & 1 & 9 \end{matrix}\right] a = 0$. The solution is $a_{1} = a_{2} = a_{3} = a_{4} = 0.125$. Since every $a_{i}$ is nonzero, all the samples are accepted as support vectors. This result fulfills the condition presented in eq. (8.8), namely $\sum_{i = 1}^{4} a_{i} y_{i} = a_{1} + a_{2} - a_{3} - a_{4} = 0$.
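As a quick check of this step, the system $Ha = \mathbf{1}$ can be solved directly. This is a hedged sketch using NumPy's general linear solver; the label vector `y` is an assumption inferred from the constraint $a_{1} + a_{2} - a_{3} - a_{4} = 0$.

```python
import numpy as np

H = np.array([[9, 1, -1, -1],
              [1, 9, -1, -1],
              [-1, -1, 9, 1],
              [-1, -1, 1, 9]], dtype=float)
y = np.array([1, 1, -1, -1], dtype=float)  # assumed labels

# Solve 1 - H a = 0, i.e. H a = 1.
a = np.linalg.solve(H, np.ones(4))

# Equality constraint of the dual problem: sum_i a_i * y_i should be 0.
constraint = float(a @ y)
print(a, constraint)
```

Each row of H sums to 8, which is why every multiplier comes out as 1/8 = 0.125.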

To find the weight vector w, $w = \sum_{i = 1}^{4} a_{i} y_{i} \phi (x_{i})$ is calculated:

$\begin{array}{l} w = (0.125) \left\{ (1) \left[\begin{matrix} 1 & \sqrt{2} & -\sqrt{2} & -\sqrt{2} & 1 & 1 \end{matrix}\right] + (1) \left[\begin{matrix} 1 & -\sqrt{2} & \sqrt{2} & -\sqrt{2} & 1 & 1 \end{matrix}\right] \right. \\ \left. + (-1) \left[\begin{matrix} 1 & \sqrt{2} & \sqrt{2} & \sqrt{2} & 1 & 1 \end{matrix}\right] + (-1) \left[\begin{matrix} 1 & -\sqrt{2} & -\sqrt{2} & \sqrt{2} & 1 & 1 \end{matrix}\right] \right\} \\ = (0.125) \left[\begin{matrix} 0 & 0 & 0 & -4 \sqrt{2} & 0 & 0 \end{matrix}\right] \\ = \left[\begin{matrix} 0 & 0 & 0 & -\frac{\sqrt{2}}{2} & 0 & 0 \end{matrix}\right] \end{array}$

In this case, the classifier is obtained as follows:

$w^{T} \phi (x) = \left[\begin{matrix} 0 & 0 & 0 & -\frac{\sqrt{2}}{2} & 0 & 0 \end{matrix}\right] \left[\begin{array}{c} 1 \\ \sqrt{2} x_{1} \\ \sqrt{2} x_{2} \\ \sqrt{2} x_{1} x_{2} \\ x_{1}^{2} \\ x_{2}^{2} \end{array}\right] = - x_{1} x_{2}$
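The weight vector and the resulting classifier can be verified with a short sketch. The feature map `phi` below is the explicit map of the kernel $(1 + x^{T} z)^{2}$ for two-dimensional inputs, in the same component order as the vectors in the text; the training points and labels are again inferred from those vectors rather than stated in this excerpt, and the helper names are illustrative.

```python
import numpy as np

def phi(x):
    """Explicit feature map of the kernel (1 + x^T z)^2 for a 2-D input."""
    x1, x2 = x
    r2 = np.sqrt(2.0)
    return np.array([1.0, r2 * x1, r2 * x2, r2 * x1 * x2, x1 ** 2, x2 ** 2])

X = np.array([[1, -1], [-1, 1], [1, 1], [-1, -1]], dtype=float)  # inferred points
y = np.array([1, 1, -1, -1], dtype=float)                        # inferred labels
a = np.full(4, 0.125)                                            # multipliers from the text

# w = sum_i a_i * y_i * phi(x_i)
w = sum(a_i * y_i * phi(x_i) for a_i, y_i, x_i in zip(a, y, X))

def decision(x):
    """Value of w^T phi(x); it should equal -x1 * x2."""
    return float(w @ phi(x))

print(w)
print([decision(x) for x in X])
```

Only the fourth component of w is nonzero, so the decision value depends on the product $x_{1} x_{2}$ alone.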

In the classification of the four-class MS dataset through the SVM polynomial kernel algorithm, a test accuracy rate of 88.9% has been obtained.

8.3.3 SVM algorithm for the analysis of mental functions

As presented in Table 2.19, the WAIS-R dataset contains 200 samples belonging to the patient group and 200 samples belonging to the healthy control group. The attributes of the control group are data regarding school education, gender and D.M (see Chapter 2, Table 2.18). The data are made up of a total of 21 attributes. For each of the 400 individuals, it is known whether they belong to the patient or the healthy group. Using the WAIS-R test attributes (such as school education, gender and D.M), how can we classify which individuals are patients and which are healthy? The D matrix has a dimension of 400 × 21: it holds the WAIS-R data of 400 individuals along with their 21 attributes (see Table 2.19). To classify the D matrix through SVM, the training procedure is employed as the first step. For the training procedure, 66.66% of the D matrix can be split off as the training dataset (267 × 21) and 33.33% as the test dataset (133 × 21).
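The split described above can be sketched as follows. The matrix `D` here is a random placeholder with the stated shape, not the WAIS-R data, and the label coding (+1 patient, -1 healthy), the seed and the variable names are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed, chosen only for reproducibility

# Placeholder standing in for the 400 x 21 WAIS-R matrix D (NOT real data).
D = rng.normal(size=(400, 21))
labels = np.repeat([1, -1], 200)  # 200 patient (+1), 200 healthy (-1); coding assumed

# Two thirds of the rows go to training: round(266.67) = 267, leaving 133 for testing.
n_train = round(400 * 2 / 3)
idx = rng.permutation(400)
train_idx, test_idx = idx[:n_train], idx[n_train:]

D_train, D_test = D[train_idx], D[test_idx]
y_train, y_test = labels[train_idx], labels[test_idx]
print(D_train.shape, D_test.shape)
```

Shuffling the row indices once and slicing keeps the training and test sets disjoint.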

After the SVM algorithm has been trained on the training dataset, the test dataset can be classified (Figure 8.10).

Figure 8.10: Binary (linear) support vector machine algorithm for the analysis of WAIS-R.

To do classification with the linear SVM algorithm, 33.33% of the WAIS-R dataset is randomly chosen as the test data.

The WAIS-R dataset has two classes. Let us analyze Example 8.4 to understand how the classifier is found for the WAIS-R dataset when it is trained by the linear (binary) SVM algorithm.

Example 8.4 For the WAIS-R dataset with patient and healthy classes, suppose the patient class has the values (0,0) and (0,1) and the healthy class has the value (1,1). Let us find a function that separates these two classes from each other linearly.

The relevant vectors can be defined in the following way:

$x_{1} = \left[\begin{array}{c} 0 \\ 0 \end{array}\right], x_{2} = \left[\begin{array}{c} 0 \\ 1 \end{array}\right], x_{3} = \left[\begin{array}{c} 1 \\ 1 \end{array}\right], y = \left[\begin{array}{c} 1 \\ 1 \\ -1 \end{array}\right]$

The Lagrange function has the form presented in eq. (8.8). If the given values are substituted, the Lagrange function can be calculated as follows:

Step (1–5)

$\begin{array}{l} d (X^{T}) = a_{1} + a_{2} + a_{3} - \frac{1}{2} \left( a_{1} a_{1} y_{1} y_{1} x_{1}^{T} x_{1} + a_{1} a_{2} y_{1} y_{2} x_{1}^{T} x_{2} + a_{1} a_{3} y_{1} y_{3} x_{1}^{T} x_{3} \right. \\ + a_{2} a_{1} y_{2} y_{1} x_{2}^{T} x_{1} + a_{2} a_{2} y_{2} y_{2} x_{2}^{T} x_{2} + a_{2} a_{3} y_{2} y_{3} x_{2}^{T} x_{3} \\ \left. + a_{3} a_{1} y_{3} y_{1} x_{3}^{T} x_{1} + a_{3} a_{2} y_{3} y_{2} x_{3}^{T} x_{2} + a_{3} a_{3} y_{3} y_{3} x_{3}^{T} x_{3} \right) \end{array}$

Since $x_{1} = \left[\begin{array}{c} 0 \\ 0 \end{array}\right]$, every term containing $x_{1}$ vanishes, and the expression above can be written as follows:

$d (X^{T}) = a_{1} + a_{2} + a_{3} - \frac{1}{2} (a_{2}^{2} y_{2}^{2} x_{2}^{T} x_{2} + a_{2} a_{3} y_{2} y_{3} x_{2}^{T} x_{3} + a_{3} a_{2} y_{3} y_{2} x_{3}^{T} x_{2} + a_{3}^{2} y_{3}^{2} x_{3}^{T} x_{3})$

$\begin{array}{l} d (X^{T}) = a_{1} + a_{2} + a_{3} - \frac{1}{2} \left( a_{2}^{2} {(1)}^{2} \left[\begin{matrix} 0 & 1 \end{matrix}\right] \left[\begin{array}{c} 0 \\ 1 \end{array}\right] + a_{2} a_{3} (1) (-1) \left[\begin{matrix} 0 & 1 \end{matrix}\right] \left[\begin{array}{c} 1 \\ 1 \end{array}\right] \right. \\ \left. + a_{3} a_{2} (-1) (1) \left[\begin{matrix} 1 & 1 \end{matrix}\right] \left[\begin{array}{c} 0 \\ 1 \end{array}\right] + a_{3}^{2} {(-1)}^{2} \left[\begin{matrix} 1 & 1 \end{matrix}\right] \left[\begin{array}{c} 1 \\ 1 \end{array}\right] \right) \\ d (X^{T}) = a_{1} + a_{2} + a_{3} - \frac{1}{2} \left( a_{2}^{2} - 2 a_{2} a_{3} + 2 a_{3}^{2} \right) \end{array}$
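The simplification can be checked numerically by comparing the general dual objective $\sum_{i} a_{i} - \frac{1}{2} a^{T} H a$, with $H_{ij} = y_{i} y_{j} x_{i}^{T} x_{j}$, against the closed form obtained above. This is a minimal sketch; the function names are illustrative and the test vectors are arbitrary.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 1]], dtype=float)  # x_1, x_2, x_3
y = np.array([1, 1, -1], dtype=float)

# H_ij = y_i * y_j * x_i^T x_j (linear kernel).
H = np.outer(y, y) * (X @ X.T)

def dual(a):
    """General form: sum_i a_i - 1/2 * a^T H a."""
    a = np.asarray(a, dtype=float)
    return float(a.sum() - 0.5 * a @ H @ a)

def dual_closed_form(a):
    """Simplified form derived in the text."""
    a1, a2, a3 = a
    return a1 + a2 + a3 - 0.5 * (a2 ** 2 - 2 * a2 * a3 + 2 * a3 ** 2)

for a in ([0.0, 2.0, 2.0], [1.0, 0.5, 1.5], [0.3, 0.7, 1.0]):
    print(a, dual(a), dual_closed_form(a))
```

The first row and column of H are zero because $x_{1}$ is the zero vector, which is why $a_{1}$ appears only in the linear part.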

Step (6–11) However, since $\sum_{i = 1}^{3} y_{i} a_{i} = 0$, the relation $a_{1} + a_{2} - a_{3} = 0$ gives $a_{1} + a_{2} = a_{3}$. Substituting this value into $d (X^{T})$ yields $d (X^{T}) = 2 a_{3} - \frac{1}{2} \left( a_{2}^{2} - 2 a_{2} a_{3} + 2 a_{3}^{2} \right)$. To find the values of $a_{2}$ and $a_{3}$, the partial derivatives of this function are taken and set equal to zero.

$\begin{array}{l} \frac{\partial d (X^{T})}{\partial a_{2}} = - a_{2} + a_{3} = 0 \\ \frac{\partial d (X^{T})}{\partial a_{3}} = 2 + a_{2} - 2 a_{3} = 0 \end{array}$

If these last two equations are solved, $a_{2} = 2$ and $a_{3} = 2$ are obtained, and $a_{1} = a_{3} - a_{2} = 0$ follows as well. This means the result can be written as $a = \left[\begin{array}{c} 0 \\ 2 \\ 2 \end{array}\right]$. Now, we can find the w and b values.

$\begin{array}{l} w = a_{2} y_{2} x_{2} + a_{3} y_{3} x_{3} \\ = 2 (1) \left[\begin{array}{c} 0 \\ 1 \end{array}\right] + 2 (-1) \left[\begin{array}{c} 1 \\ 1 \end{array}\right] = \left[\begin{array}{c} 0 \\ 2 \end{array}\right] - \left[\begin{array}{c} 2 \\ 2 \end{array}\right] = \left[\begin{array}{c} -2 \\ 0 \end{array}\right] \\ b = \frac{1}{2} \left( \left( \frac{1}{y_{2}} - x_{2}^{T} \left[\begin{array}{c} -2 \\ 0 \end{array}\right] \right) + \left( \frac{1}{y_{3}} - x_{3}^{T} \left[\begin{array}{c} -2 \\ 0 \end{array}\right] \right) \right) \\ = \frac{1}{2} \left( \left( \frac{1}{1} - \left[\begin{matrix} 0 & 1 \end{matrix}\right] \left[\begin{array}{c} -2 \\ 0 \end{array}\right] \right) + \left( \frac{1}{(-1)} - \left[\begin{matrix} 1 & 1 \end{matrix}\right] \left[\begin{array}{c} -2 \\ 0 \end{array}\right] \right) \right) = 1 \end{array}$
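The multipliers, w and b can be recomputed with a few lines of NumPy. This is a sketch of the hand calculation only, not a general SVM solver: it solves the two stationarity conditions as a 2 × 2 linear system (the first one in the equivalent reduced form $-a_{2} + a_{3} = 0$), and the tolerance used to pick the support vectors is an arbitrary choice.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 1]], dtype=float)  # x_1, x_2, x_3
y = np.array([1, 1, -1], dtype=float)

# Stationarity conditions in (a2, a3):
#   -a2 +   a3 =  0
#    a2 - 2*a3 = -2
A = np.array([[-1.0, 1.0], [1.0, -2.0]])
a2, a3 = np.linalg.solve(A, np.array([0.0, -2.0]))
a = np.array([a3 - a2, a2, a3])  # a1 follows from a1 + a2 = a3

# w = sum_i a_i * y_i * x_i
w = (a * y) @ X

# b averaged over the support vectors (a_i > 0): mean of 1/y_i - x_i^T w.
sv = a > 1e-9
b = float(np.mean(1.0 / y[sv] - X[sv] @ w))
print(a, w, b)
```

With these values every training point satisfies $y_{i} (w^{T} x_{i} + b) \geq 1$, which is the margin condition of the hard-margin SVM.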

Step (12–15) As a result, the classification of new observation values $\{x_{i}\}$ will be as follows:

$\begin{array}{l} f (x) = \operatorname{sgn} (w^{T} x + b) \\ = \operatorname{sgn} \left( \left[\begin{matrix} -2 & 0 \end{matrix}\right] \left[\begin{array}{c} x_{1} \\ x_{2} \end{array}\right] + 1 \right) \\ = \operatorname{sgn} (- 2 x_{1} + 1) \end{array}$

In this case, if we want to classify the new observation $x_{4} = \left[\begin{array}{c} 0 \\ 4 \end{array}\right]$, we see that $- 2 x_{1} + 1 = 1 > 0$, so $\operatorname{sgn} (- 2 x_{1} + 1) = +1$. Therefore, the observation in question is in the positive zone (patient). For $x_{1} > \frac{1}{2}$, and in particular for $x_{1} = 1$, all the observations will be in the negative zone, namely in the healthy area.
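The decision rule can be written as a small function. This sketch hard-codes $w = (-2, 0)$ and $b = 1$ from the example; the function name `classify` and the mapping of +1 to patient and -1 to healthy follow the labels used in the example.

```python
import numpy as np

w = np.array([-2.0, 0.0])
b = 1.0

def classify(x):
    """Return +1 (patient) or -1 (healthy) from sgn(w^T x + b)."""
    return int(np.sign(w @ np.asarray(x, dtype=float) + b))

print(classify([0, 4]))                                 # new observation x_4
print([classify(x) for x in ([0, 0], [0, 1], [1, 1])])  # the three training points
```

Note that `np.sign` returns 0 exactly on the decision boundary $x_{1} = \frac{1}{2}$, so a point there is left unassigned by this sketch.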

In this way, the two-class WAIS-R samples are shown to be linearly separable. The classification of the WAIS-R dataset through the polynomial SVM algorithm yielded a test accuracy rate of 98.8%.

SVM kernel functions can be applied to our datasets (MS Dataset, Economy (U.N.I.S.) Dataset and WAIS-R Dataset). The classification accuracy rates obtained can be seen in Table 8.1.

Table 8.1: The classification accuracy rates of SVM kernel functions.

As presented in Table 8.1, the classification of linearly separable datasets such as the WAIS-R data through SVM kernel functions can be said to be more accurate than the classification of multi-class datasets such as the MS Dataset and the Economy Dataset.
