Support vector classifiers – SVC

Support vector machines can be applied to classification, anomaly detection, and regression problems. Let's first dive into support vector classifiers.

The binary SVC

The first classifier to be evaluated is the binary (2-class) support vector classifier. The implementation uses the LIBSVM library created by Chih-Chung Chang and Chih-Jen Lin from the National Taiwan University [8:9].

LIBSVM

The library was originally written in C before being ported to Java. It can be downloaded from http://www.csie.ntu.edu.tw/~cjlin/libsvm as a .zip or tar.gz file. The library includes the following classifier modes:

  • Support vector classifiers (C-SVC, ν-SVC, and one-class SVC)
  • Support vector regression (ν-SVR and ε-SVR)
  • RBF, linear, sigmoid, polynomial, and precomputed kernels

LIBSVM has the distinct advantage of using Sequential Minimal Optimization (SMO), which reduces the time complexity of training a model of n observations to O(n²). The LIBSVM documentation covers both the theory and implementation of hard and soft margins and is available at http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf.

Note

Why LIBSVM?

There are alternatives to the LIBSVM library for learning and experimenting with SVMs. David Soergel from the University of California, Berkeley, refactored and optimized the Java version [8:10]. Thorsten Joachims' SVMLight is another option [8:11]. Spark/MLlib 1.0 includes two Scala implementations of the SVM using resilient distributed datasets (refer to the Apache Spark section in Chapter 12, Scalable Frameworks). However, LIBSVM is the most commonly used SVM library.

The implementation of the different support vector classifiers and the support vector regression in LIBSVM is broken down into the following five Java classes:

  • svm_model: This defines the parameters of the model created during training
  • svm_node: This models the element of the sparse matrix Q, which is used in the maximization of the margins
  • svm_parameters: This contains the different models for support vector classifiers and regressions, the five kernels supported in LIBSVM with their parameters, and the weights vectors used in cross-validation
  • svm_problem: This configures the input to any of the SVM algorithms (the number of observations, input vector data x as a matrix, and the vector of labels y)
  • svm: This implements algorithms used in training, classification, and regression

The library also includes template programs for training, prediction, and normalization of datasets.
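To illustrate how these classes fit together, here is a minimal sketch of driving the raw LIBSVM Java API directly from Scala. The xs observations, ys labels, and the parameter values are placeholders and are not part of the library's sample programs:

import libsvm.{svm, svm_model, svm_node, svm_parameter, svm_problem}

// Sketch: training a C-SVC with an RBF kernel through the raw LIBSVM Java API
def rawTrain(xs: Array[Array[Double]], ys: Array[Double]): svm_model = {
  val param = new svm_parameter
  param.svm_type = svm_parameter.C_SVC      // C-SVC formulation
  param.kernel_type = svm_parameter.RBF     // RBF kernel
  param.gamma = 0.5
  param.C = 1.0
  param.eps = 1e-3
  param.cache_size = 256.0

  val problem = new svm_problem
  problem.l = xs.length                     // number of observations
  problem.y = ys                            // labels or expected values
  problem.x = xs.map(_.zipWithIndex.map { case (v, j) =>
    val node = new svm_node                 // one (index, value) pair per feature
    node.index = j
    node.value = v
    node
  })
  svm.svm_train(problem, param)             // training returns an svm_model
}

The verbosity of this direct invocation is one of the motivations for the Scala wrapper introduced in the following sections.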

Note

The LIBSVM Java code

The Java version of LIBSVM is a direct port of the original C code. It does not support generic types and is not easily configurable (the code uses switch statements instead of polymorphism). For all its limitations, LIBSVM is a fairly well-tested and robust Java library for SVMs.

Let's create a Scala wrapper to the LIBSVM library to improve its flexibility and ease of use.

Design

The implementation of the support vector machine algorithm uses the design template for classifiers (refer to the Design template for classifier section in the Appendix A, Basic Concepts).

The key components of the implementation of a SVM are as follows:

  • A model, SVMModel, of the Model type is initialized through training during the instantiation of the classifier. The model class is an adapter to the svm_model structure defined in LIBSVM.
  • An SVMAdapter object interfaces with the internal LIBSVM data structures and methods.
  • The SVM support vector machine class is implemented as an implicit data transformation of the ITransform type. It has three parameters: the configuration wrapper of the SVMConfig type, the features/time series of the XVSeries type, and the target or labeled values, DblVector.
  • The configuration (the SVMConfig type) consists of three distinct elements: SVMExecution that defines the execution parameters such as the maximum number of iterations or convergence criteria, SVMKernel that specifies the kernel function used during training, and SVMFormulation that defines the formula (C, epsilon, or nu) used to compute a nonseparable case for the support vector classifier and regression.

The key software components of the support vector machine are described in the following UML class diagram:

The UML class diagram for the support vector machine

The UML diagram omits the helper traits and classes such as Monitor or the Apache Commons Math components.

Configuration parameters

LIBSVM exposes a large number of parameters for the configuration and execution of any of the SVM algorithms. Any SVM algorithm is configured with three categories of parameters, which are as follows:

  • Formulation (or type) of the SVM algorithms (the multiclass classifier, one-class classifier, regression, and so on) using the SVMFormulation class
  • The kernel function used in the algorithm (the RBF kernel, Sigmoid kernel, and so on) using the SVMKernel class
  • Training and executing parameters (the convergence criteria, number of folds for cross-validation, and so on) using the SVMExecution class

The SVM formulation

The instantiation of the configuration consists of initializing the param LIBSVM parameter by the SVM type, kernel, and the execution context selected by the user.

Each of the SVM parameter case classes extends the generic SVMConfigItem trait:

trait SVMConfigItem { def update(param: svm_parameter): Unit }

The classes inherited from SVMConfigItem are responsible for updating the list of the SVM parameters, svm_parameter, defined in LIBSVM. The update method encapsulates the configuration of LIBSVM.

The formulation of the SVM algorithm is defined by a class hierarchy with SVMFormulation as the base trait:

sealed trait SVMFormulation extends SVMConfigItem {   
  def update(param: svm_parameter): Unit 
}

The list of formulations for the SVM (C, nu, and eps for regression) is completely defined and known. Therefore, the hierarchy should not be altered, and the SVMFormulation trait has to be declared sealed. Here is an example of the CSVCFormulation class, which defines the C-SVM model:

class CSVCFormulation (c: Double) extends SVMFormulation {   
   override def update(param: svm_parameter): Unit = {      
     param.svm_type = svm_parameter.C_SVC
     param.C = c
  }
}

The other formulation classes, NuSVCFormulation, OneSVCFormulation, and SVRFormulation, implement the ν-SVM, one-class SVM, and ε-SVM (regression) formulations, respectively.
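For instance, the ν-SVM formulation could be wrapped in the same manner as CSVCFormulation (a sketch, not necessarily the library's exact code):

class NuSVCFormulation(nu: Double) extends SVMFormulation {
  override def update(param: svm_parameter): Unit = {
    param.svm_type = svm_parameter.NU_SVC   // ν-SVC formulation in LIBSVM
    param.nu = nu                           // upper bound on the fraction of margin errors
  }
}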

The SVM kernel function

Next, you need to specify the kernel functions by defining and implementing the SVMKernel trait:

sealed trait SVMKernel extends SVMConfigItem {
  override def update(param: svm_parameter): Unit 
}

Once again, there are a limited number of kernel functions supported in LIBSVM. Therefore, the hierarchy of kernel functions is sealed. The following code snippet configures the radial basis function kernel, RbfKernel, as an example of a kernel definition class:

class RbfKernel(gamma: Double) extends SVMKernel {
  override def update(param: svm_parameter): Unit = {
    param.kernel_type = svm_parameter.RBF
    param.gamma = gamma
  }
}

The fact that the LIBSVM Java byte code library is not very extensible does not prevent you from defining a new kernel function in the LIBSVM source code. For example, the Laplacian kernel can be added by performing the following steps:

  1. Create a new kernel type in svm_parameter, such as svm_parameter.LAPLACE = 5.
  2. Add the kernel function name to kernel_type_table in the svm class.
  3. Add the kernel_type != svm_parameter.LAPLACE condition to the svm_check_parameter method.
  4. Add the implementation of the kernel function for two observations, x[i] and x[j], to the svm.kernel_function method (Java code):
    case svm_parameter.LAPLACE:
      double sum = 0.0;
      for(int k = 0; k < x[i].length; k++) {
        final double diff = x[i][k].value - x[j][k].value;
        sum += diff*diff;
      }
      return Math.exp(-gamma*Math.sqrt(sum));
  5. Add the implementation of the Laplace kernel function in the svm.k_function method by modifying the existing implementation of RBF (distanceSqr).
  6. Rebuild the libsvm.jar file. The new kernel can then be exposed through the Scala wrapper, as sketched after this list.
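A wrapper class for the new kernel could follow the RbfKernel pattern (a sketch; LAPLACE is the hypothetical constant created in step 1):

class LaplaceKernel(gamma: Double) extends SVMKernel {
  override def update(param: svm_parameter): Unit = {
    param.kernel_type = svm_parameter.LAPLACE   // hypothetical constant added in step 1
    param.gamma = gamma
  }
}

As SVMKernel is a sealed trait, the new class has to be declared in the same source file as the other kernel wrappers.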

The SVM execution

The SVMExecution class defines the configuration parameters for the execution of the training of the model, namely the eps convergence factor for the optimizer (line 2), the size of the cache, cacheSize (line 1), and the number of folds, nFolds, used during cross-validation:

class SVMExecution(cacheSize: Int, eps: Double, nFolds: Int) 
     extends SVMConfigItem {
  override def update(param: svm_parameter): Unit = { 
    param.cache_size = cacheSize //1
    param.eps = eps //2
  }
}

The cross-validation is performed only if the nFolds value is greater than 1.

We are finally ready to create the SVMConfig configuration class, which hides and manages all of the different configuration parameters:

class SVMConfig(formula: SVMFormulation, kernel: SVMKernel,
     exec: SVMExecution) {
  val param = new svm_parameter
  formula.update(param) //3
  kernel.update(param)  //4
  exec.update(param)  //5
}

The SVMConfig class delegates the selection of the formula to the SVMFormulation class (line 3), selection of the kernel function to the SVMKernel class (line 4), and the execution of parameters to the SVMExecution class (line 5). The sequence of update calls initializes the LIBSVM list of configuration parameters.

Interface to LIBSVM

We need to create an adapter object to encapsulate the invocation to LIBSVM. The SVMAdapter object hides the LIBSVM internal data structures: svm_model and svm_node:

object SVMAdapter {
  type SVMNodes = Array[Array[svm_node]]
  class SVMProblem(numObs: Int, expected: DblArray) //6
   
  def createSVMNode(dim: Int, x: DblArray): Array[svm_node] //7
  def predictSVM(model: SVMModel, x: DblArray): Double //8
  def crossValidateSVM(problem: SVMProblem, //9
     param: svm_parameter, nFolds: Int, expected: DblArray) 
  def trainSVM(problem: SVMProblem,  //10
     param: svm_parameter): svm_model 
}

The SVMAdapter object is a single entry point to LIBSVM for training, validating an SVM model, and executing predictions:

  • SVMProblem wraps the definition of the training objective or problem in LIBSVM, using the labels or expected values (line 6)
  • createSVMNode creates a new computation node for each observation x (line 7)
  • predictSVM predicts the outcome of a new observation x given a model, svm_model, generated through training (line 8); a minimal sketch follows this list
  • crossValidateSVM validates the model, svm_model, with nFolds training and validation sets (line 9)
  • trainSVM executes the problem training configuration (line 10)
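Of these methods, only predictSVM is not listed later in this section. A minimal sketch, assuming it simply delegates to the svm.svm_predict method of LIBSVM and reuses createSVMNode, could be written as follows:

def predictSVM(model: SVMModel, x: DblArray): Double =
  svm.svm_predict(model.svmmodel, createSVMNode(x.length, x))  // LIBSVM prediction on one observation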

Note

svm_node

The LIBSVM svm_node Java class is defined as a pair consisting of the index of the feature in the observation array and its value:

public class svm_node implements java.io.Serializable {   
  public int index;    
  public double value;
}

The SVMAdapter methods are described in the next section.

Training

The model for the SVM is defined by the following two components:

  • svm_model: This is the SVM model parameters defined in LIBSVM
  • accuracy: This is the accuracy of the model computed during cross-validation

The code will be as follows:

case class SVMModel(val svmmodel: svm_model, 
     val accuracy: Double) extends Model {
  lazy val residuals: DblArray = svmmodel.sv_coef(0)
}

The residuals, that is, r = y – f(x) are computed in the LIBSVM library.

Note

Accuracy in the SVM model

You may wonder why the value of the accuracy is a component of the model. The accuracy component provides the client code with a quality metric associated with the model. Integrating the accuracy into the model allows the user to make an informed decision about accepting or rejecting the model. The accuracy is stored in the model file for subsequent analysis.

Next, let's create the first support vector classifier for the two-class problems. The SVM class implements the ITransform monadic data transformation that implicitly generates a model from a training set, as described in the Monadic data transformation section in Chapter 2, Hello World! (line 11).

The constructor for the SVM follows the template described in the Design template for immutable classifiers section in the Appendix A, Basic Concepts:

class SVM[T <% Double](config: SVMConfig, xt: XVSeries[T], 
   expected: DblVector) extends ITransform[Array[T]](xt) {//11

  type V = Double   //12
  val normEPS = config.eps*1e-7  //13
  val model: Option[SVMModel] = train  //14

  def accuracy: Option[Double] = model.map( _.accuracy) //15
  def mse: Option[Double]  //16
  def margin: Option[Double]  //17
}

The implementation of the ITransform abstract class requires the definition of the output value of the predictor as a Double (line 12). The normEPS value is used to handle rounding errors in the computation of the margin (line 13). The model of the SVMModel type is generated through training by the SVM constructor (line 14). The last three methods compute the accuracy of the model (line 15), the mean square of errors, mse (line 16), and the margin (line 17).
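The mse method is declared but not listed in this section. A minimal sketch, assuming the predictSVM method of SVMAdapter and the implicit conversion from T to Double, could look like this:

def mse: Option[Double] = model.map(m => {
  val sse = xt.zip(expected).map { case (x, y) =>
    val diff = predictSVM(m, x.map(t => t: Double)) - y   // prediction error for one observation
    diff*diff
  }.sum
  sse/xt.size    // mean of the squared errors over the training set
})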

Let's take a look at the training method, train:

def train: Option[SVMModel] = Try {
  val problem = new SVMProblem(xt.size, expected.toArray) //18
  val dim = dimension(xt)

  xt.zipWithIndex.foreach{ case (_x, n) =>  //19
      problem.update(n, createSVMNode(dim, _x))
  }
  new SVMModel(trainSVM(problem, config.param), accuracy(problem))   //20
}._toOption("SVM training failed", logger)

The train method creates SVMProblem that provides LIBSVM with the training components (line 18). The purpose of the SVMProblem class is to manage the definition of training parameters implemented in LIBSVM, as follows:

class SVMProblem(numObs: Int, expected: DblArray) {
  val problem = new svm_problem  //21
  problem.l = numObs
  problem.y = expected 
  problem.x = new SVMNodes(numObs)

  def update(n: Int, node: Array[svm_node]): Unit = 
    problem.x(n) = node  //22
}

The arguments of the SVMProblem constructor, the number of observations, and the labels or expected values are used to initialize the corresponding svm_problem data structure in LIBSVM (line 21). The update method maps each observation, which is defined as an array of svm_node to the problem (line 22).

The createSVMNode method creates an array of svm_node from an observation. An svm_node in LIBSVM is a pair consisting of the index j of a feature in an observation (line 23) and its value, y (line 24):

def createSVMNode(dim: Int, x: DblArray): Array[svm_node] = {
   val newNode = new Array[svm_node](dim)
   x.zipWithIndex.foreach{ case (y, j) => {
      val node = new svm_node
      node.index = j  //23
      node.value = y  //24
      newNode(j) = node
   }}
   newNode
}
The mapping between an observation and a LIBSVM node is illustrated in the following diagram:

Indexing of observations using LIBSVM

The trainSVM method pushes the training request with a well-defined problem and configuration parameters to LIBSVM by invoking the svm_train method (line 26):

def trainSVM(problem: SVMProblem, 
     param: svm_parameter): svm_model =
   svm.svm_train(problem.problem, param) //26

The accuracy is the ratio of the true positive plus the true negative over the size of the test sample (refer to the Key quality metrics section in Chapter 2, Hello World!). It is computed through cross-validation only if the number of folds initialized in the SVMExecution configuration class is greater than 1. Practically, the accuracy is computed by invoking the cross-validation method, svm_cross_validation, in the LIBSVM package, and then computing the ratio of the number of predicted values that match the labels over the total number of observations:

def accuracy(problem: SVMProblem): Double = { 
  if( config.isCrossValidation ) {
    val target = new Array[Double](expected.size)
    crossValidateSVM(problem, config.param,  //27
        config.nFolds, target)

    target.zip(expected)
       .filter{case(x, y) =>Math.abs(x- y) < config.eps}  //28
       .size.toDouble/expected.size
  }
  else 0.0
}

The call to the crossValidateSVM method of SVMAdapter forwards the configuration and execution of the cross validation with config.nFolds (line 27):

def crossValidateSVM(problem: SVMProblem, param: svm_parameter, 
    nFolds: Int, expected: DblArray) {
  svm.svm_cross_validation(problem.problem, param, 
    nFolds, expected)
}

The Scala filter weeds out the observations that were poorly predicted (line 28). This minimalist implementation is good enough to start exploring the support vector classifier.

Classification

The implementation of the |> classification method for the SVM class follows the same pattern as the other classifiers. It invokes the predictSVM method in SVMAdapter that forwards the request to LIBSVM (line 29):

override def |> : PartialFunction[Array[T], Try[V]] =  {
   case x: Array[T] if(x.size == dimension(xt) && isModel) =>
      Try( predictSVM(model.get.svmmodel, x) )  //29
}

C-penalty and margin

The first evaluation consists of understanding the impact of the penalty factor C on the margin in the generation of the classes. Let's implement the computation of the margin. The margin is defined as 2/||w|| and implemented as a method of the SVM class, as follows:

def margin: Option[Double] = 
  if(isModel) {
    val wNorm = model.get.residuals./:(0.0)((s,r) => s + r*r)
    if(wNorm < normEPS) None else Some(2.0/Math.sqrt(wNorm))
  }
  else None

The first instruction computes the sum of the squares, wNorm, of the residuals r = y – f(x|w). The margin is ultimately computed if the sum of squares is significant enough to avoid rounding errors.

The margin is evaluated using an artificially generated time series and labeled data. First, we define the method to evaluate the margin for a specific value of the penalty (inverse regularization coefficient) factor C:

val GAMMA = 0.8
val CACHE_SIZE = 1<<8
val NFOLDS = 1
val EPS = 1e-5

def evalMargin(features: Vector[DblArray], 
    expected: DblVector, c: Double): Option[String] = {
  val execEnv = SVMExecution(CACHE_SIZE, EPS, NFOLDS)
  val config = SVMConfig(new CSVCFormulation(c), 
     new RbfKernel(GAMMA), execEnv)
  val svc = SVM[Double](config, features, expected)
  svc.margin.map(_.toString)     //30
}

The evalMargin method uses the CACHE_SIZE, EPS, and NFOLDS execution parameters. The execution displays the value of the margin for different values of C (line 30). The method is invoked iteratively to evaluate the impact of the penalty factor on the margin extracted from the training of the model. The test uses a synthetic time series to highlight the relation between C and the margin. The synthetic time series created by the generate method consists of two training sets of an equal size, N:

  • Data points generated as y = x(1 + r/5) for the label 1, r being a random number over the range [0,1] (line 31)
  • Randomly generated data points y = x·r for the label -1 (line 32)

Consider the following code:

def generate: (Vector[DblArray], DblArray) = {
  val z = Vector.tabulate(N)(i => {
    val ri = i*(1.0 + 0.2*Random.nextDouble)
    Array[Double](i, ri)  //31
  }) ++
  Vector.tabulate(N)(i => Array[Double](i, i*Random.nextDouble))
  (z, Array.fill(N)(1.0) ++ Array.fill(N)(-1.0))  //32
}

The evalMargin method is executed for different values of C ranging from 0.1 to 5.0:

generate.map(y =>
  (0.1 until 5.0 by 0.1)
    .flatMap(evalMargin(y._1, y._2.toVector, _)).mkString("\n")
)

Note

val versus final val

There is a difference between a val and a final val. A nonfinal value can be overridden in a subclass. Overriding a final value produces a compiler error, as follows:

class A { val x = 5;  final val y = 8 } 
class B extends A { 
  override val x = 9 // OK    
  override val y = 10 // Error 
}

The following chart illustrates the relation between the penalty or cost factor C and the margin:

The margin value versus the C-penalty factor for a support vector classifier

As expected, the value of the margin decreases as the penalty term C increases. The C penalty factor is related to the L2 regularization factor λ as C ~ 1/λ. A model with a large value of C has a high variance and a low bias, while a small value of C will produce lower variance and a higher bias.

Note

Optimizing C penalty

The optimal value for C is usually evaluated through cross-validation, by varying C in incremental powers of 2: 2ⁿ, 2ⁿ⁺¹, … [8:12].
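A hypothetical grid search over powers of 2, reusing the SVM class and the cross-validated accuracy, could look like the following sketch; features and expected are placeholders for a training set and its labels:

// Sketch: select the C value with the highest cross-validated accuracy
val cCandidates = (-4 to 8).map(n => Math.pow(2.0, n))
val best = cCandidates.map { c =>
  val config = SVMConfig(new CSVCFormulation(c),
     new RbfKernel(GAMMA), SVMExecution(CACHE_SIZE, EPS, 10))   // 10-fold cross-validation
  (c, SVM[Double](config, features, expected).accuracy.getOrElse(0.0))
}.maxBy(_._2)    // (C, accuracy) pair with the best accuracy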

Kernel evaluation

The next test consists of comparing the impact of the kernel function on the accuracy of the prediction. Once again, a synthetic time series is generated to highlight the contribution of each kernel. The test code uses the runtime prediction or classification method, |>, to evaluate the different kernel functions. Let's create a method to evaluate and compare these kernel functions. All we need is the following (line 33):

  • An xt training set of the Vector[DblArray] type
  • A test set, test, of the Vector[DblArray] type
  • A set of labels for the training set that takes the value 0 or 1
  • A kF kernel function

Consider the following code:

val C = 1.0
def evalKernel(xt: Vector[DblArray],  test: Vector[DblArray], 
     labels: DblVector, kF: SVMKernel): Double = { //33
  
  val config = SVMConfig(new CSVCFormulation(C), kF) //34
  val svc = SVM[Double](config, xt, labels)
  val pfnSvc = svc |>  //35
  test.zip(labels).count{case(x, y) =>pfnSvc(x).get == y}
    .toDouble/test.size  //36
}

The config configuration of the SVM uses the C penalty factor 1, the C-formulation, and the default execution environment (line 34). The predictive pfnSvc partial function (line 35) is used to compute the predictive values for the test set. Finally, the evalKernel method counts the number of successes for which the predictive values match the labeled or expected values. The accuracy is computed as the ratio of the successful prediction over the size of the test sample (line 36).

In order to compare the different kernels, let's generate three datasets of the size 2N for a binomial classification using the pseudo-random genData data generation method:

def genData(variance: Double, mean: Double): Vector[DblArray] = {
  val rGen = new Random(System.currentTimeMillis)
  Vector.tabulate(N)( _ => { 
    rGen.setSeed(rGen.nextLong)
    Array[Double](rGen.nextDouble, rGen.nextDouble)
      .map(variance*_ - mean)  //37
  })
}

The random value is computed through a transformation f(x) = variance*x - mean (line 37). The training and test sets consist of the aggregate of two classes of data points:

  • Random data points with the variance a and mean b associated with the label 0.0
  • Random data points with the variance a and mean 1-b associated with the label 1.0

Consider the following code for the training set:

val trainSet = genData(a, b) ++ genData(a, 1-b)
val testSet = genData(a, b) ++ genData(a, 1-b)

The a and b parameters are selected from two groups of training data points with various degrees of separation to illustrate the separating hyperplane.

The following chart describes the high margin; the first training set generated with the parameters a = 0.6 and b = 0.3 illustrates the highly separable classes with a clean and distinct hyperplane:

The scatter plot for training and testing sets with a = 0.6 and b = 0.3

The following chart describes the medium margin; the parameters a = 0.8 and b = 0.3 generate two groups of observations with some overlap:

The scatter plot for training and testing sets with a = 0.8 and b = 0.3

The following chart describes the low margin; the two groups of observations in this last training set are generated with a = 1.4 and b = 0.3 and show a significant overlap:

The scatter plot for training and testing sets with a = 1.4 and b = 0.3

The test set is generated in a similar fashion as the training set, as they are extracted from the same data source:

val GAMMA = 0.8; val COEF0 = 0.5; val DEGREE = 2 //38
val N = 100

def compareKernel(a: Double, b: Double) {
  val labels = Vector.fill(N)(0.0) ++ Vector.fill(N)(1.0)
  evalKernel(trainSet, testSet,labels,new RbfKernel(GAMMA)) 
  evalKernel(trainSet, testSet, labels, 
      new SigmoidKernel(GAMMA)) 
  evalKernel(trainSet, testSet, labels, LinearKernel) 
  evalKernel(trainSet, testSet, labels, 
      new PolynomialKernel(GAMMA, COEF0, DEGREE))
}

The parameters for each of the four kernel functions are arbitrarily selected from textbooks (line 38). The evalKernel method defined earlier is applied to the three training sets: the high margin (a = 0.6), medium margin (a = 0.8), and low margin (a = 1.4), with each of the four kernels (RBF, sigmoid, linear, and polynomial). The accuracy is assessed by counting the number of observations correctly classified for all of the classes for each invocation of the predictor, |>:

A comparative chart of kernel functions using synthetic data

Although the different kernel functions do not differ significantly in terms of their impact on the accuracy of the classifier, you can observe that the RBF and polynomial kernels produce slightly more accurate results. As expected, the accuracy decreases as the margin decreases. A decreasing margin indicates that the cases are not easily separable, which affects the accuracy of the classifier:

The impact of the margin value on the accuracy of RBF and Sigmoid kernel functions

Note

A test case design

The test to compare the different kernel methods is highly dependent on the distribution or mixture of data in the training and test sets. The synthetic generation of data in this test case is used for illustrating the margin between classes of observations. Real-world datasets may produce different results.

In summary, there are four steps required to create an SVC-based model (a condensed sketch follows the list):

  1. Select a features set.
  2. Select the C-penalty (inverse regularization).
  3. Select the kernel function.
  4. Tune the kernel parameters.
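The following sketch condenses the four steps, reusing the synthetic genData generator and the classes defined earlier; the parameter values are arbitrary and chosen only for illustration:

// 1. features set: two clusters generated with the same variance and complementary means
val xs: Vector[DblArray] = genData(0.8, 0.3) ++ genData(0.8, 0.7)
val labels: DblVector = Vector.fill(N)(0.0) ++ Vector.fill(N)(1.0)
val config = SVMConfig(
  new CSVCFormulation(4.0),                 // 2. C-penalty (inverse regularization)
  new RbfKernel(0.8),                       // 3. kernel function and 4. its gamma parameter
  SVMExecution(CACHE_SIZE, EPS, 2))
val svc = SVM[Double](config, xs, labels)
svc.accuracy.foreach(a => println(s"accuracy: $a"))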

As mentioned earlier, this test case relies on synthetic data to illustrate the concept of the margin and compare kernel methods. Let's use the support vector classifier for a real-world financial application.

Applications in risk analysis

The purpose of the test case is to evaluate the risk for a company to curtail or eliminate its quarterly or yearly dividend. The features selected are financial metrics relevant to a company's ability to generate a cash flow and pay out its dividends over the long term.

We need to select any subset of the following financial technical analysis metrics (refer to Appendix A, Basic Concepts):

  • Relative change in stock prices over the last 12 months
  • Long-term debt-equity ratio
  • Dividend coverage ratio
  • Annual dividend yield
  • Operating profit margin
  • Short interest (ratio of shares shorted over the float)
  • Cash per share-share price ratio
  • Earnings per share trend

The earnings trend has the following values:

  • -2 if earnings per share decline by more than 15 percent over the last 12 months
  • -1 if earnings per share decline between 5 percent and 15 percent
  • 0 if earnings per share are maintained within 5 percent
  • +1 if earnings per share increase between 5 percent and 15 percent
  • +2 if earnings per share increase by more than 15 percent

The values are normalized into the range [0, 1], as sketched below.
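For instance, a hypothetical mapping of the raw trend values from [-2, +2] into [0, 1] could be:

// Hypothetical normalization of the earnings-per-share trend from [-2, +2] into [0, 1]
def normalizeEpsTrend(trend: Int): Double = (trend + 2)/4.0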

The labels or expected output (dividend changes) is categorized as follows:

  • -1 if the dividend is cut by more than 5 percent
  • 0 if the dividend is maintained within 5 percent
  • +1 if the dividend is increased by more than 5 percent

Let's combine two of these three labels {-1, 0, 1} to generate two classes for the binary SVC (a relabeling sketch follows this list):

  • Class C1 = stable or decreasing dividends and class C2 = increasing dividends—training set A
  • Class C1 = decreasing dividends and class C2 = stable or increasing dividends—training set B
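A possible relabeling, assuming a rawLabels vector holding the {-1, 0, +1} dividend changes (a hypothetical placeholder for the loaded label column), is:

// Hypothetical relabeling of the {-1, 0, +1} dividend changes into two binary problems
val labelsA = rawLabels.map(y => if (y > 0.0) 1.0 else 0.0)  // set A: C2 = increasing dividends
val labelsB = rawLabels.map(y => if (y < 0.0) 0.0 else 1.0)  // set B: C1 = decreasing dividends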

The different tests are performed with a fixed set of C and GAMMA configuration parameters and a 2-fold validation configuration:

val path = "resources/data/chap8/dividends2.csv"
val C = 1.0
val GAMMA = 0.5
val EPS = 1e-2
val NFOLDS = 2

val extractor = relPriceChange :: debtToEquity :: 
    dividendCoverage :: cashPerShareToPrice :: epsTrend :: 
    shortInterest :: dividendTrend :: 
    List[Array[String] =>Double]()  //39

val pfnSrc = DataSource(path, true, false,1) |> //40
val config = SVMConfig(new CSVCFormulation(C), 
     new RbfKernel(GAMMA), SVMExecution(EPS, NFOLDS))

for {
  input <- pfnSrc(extractor) //41
  obs <- getObservations(input)  //42
  svc <- SVM[Double](config, obs, input.last.toVector)
} yield {
  show(s"${svc.toString}\naccuracy ${svc.accuracy.get}")
}

The first step is to define the extractor (which is the list of fields to be retrieved from the dividends2.csv file) (line 39). The pfnSrc partial function generated by the DataSource transformation class (line 40) converts the input file into a set of typed fields (line 41). An observation is an array of fields. The obs sequence of observations is generated from the input fields by transposing the matrix observations x features (line 42):

def getObservations(input: Vector[DblArray]):
     Try[Vector[DblArray]] = Try {
  transpose( input.dropRight(1).map(_.toArray) ).toVector
}

The test computes the model parameters and the accuracy from the cross-validation during the instantiation of the SVM.

Note

LIBSVM scaling

LIBSVM supports feature normalization, known as scaling, prior to training. The main advantage of scaling is to avoid attributes in greater numeric ranges dominating those in smaller numeric ranges. Another advantage is to avoid numerical difficulties during the calculation. In our examples, the time series is normalized before training; therefore, the scaling flag in LIBSVM is disabled. A minimal sketch of such a normalization is shown below.
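This sketch scales each feature independently into [0, 1] and stands in for the normalization method used in the examples (an assumption, not the book's verbatim code):

// Min-max normalization of each feature of a time series into [0, 1]
def normalize(xt: Vector[DblArray]): Vector[DblArray] = {
  val dim = xt.head.length
  val mins = Array.tabulate(dim)(j => xt.map(_(j)).min)   // per-feature minimum
  val maxs = Array.tabulate(dim)(j => xt.map(_(j)).max)   // per-feature maximum
  xt.map(x => Array.tabulate(dim)(j =>
    if (maxs(j) > mins(j)) (x(j) - mins(j))/(maxs(j) - mins(j)) else 0.0))
}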

The test is repeated with a different set of features and consists of comparing the accuracy of the support vector classifier for different features sets. The features sets are selected from the content of the .csv file by assembling the extractor with different configurations, as follows:

val extractor =  … :: dividendTrend :: …

Let's take a look at the following graph:

A comparative study of trading strategies using the binary SVC

The test demonstrates that the selection of the proper features set is the most critical step in applying the support vector machine, and any other model for that matter, to classification problems. In this particular case, the accuracy is also affected by the small size of the training set. The increase in the number of features also reduces the contribution of each specific feature to the loss function.

Note

The N-fold cross-validation

The cross-validation in this test example uses only two folds because the number of observations is small, and you want to make sure that any class contains at least a few observations.

The same process is repeated for test B, whose purpose is to classify companies with decreasing dividends and companies with stable or increasing dividends, as shown in the following graph:

A comparative study of trading strategies using the binary SVC

The difference in terms of accuracy of prediction between the first three feature sets and the last two feature sets in the preceding graph is more pronounced in test A than in test B. In both tests, the eps (earnings per share) trend feature improves the accuracy of the classification. It is a particularly good predictor for companies with increasing dividends.

The problem of predicting the distribution (or not) of dividends can be restated as evaluating the risk that a company dramatically reduces its dividends.

What is the risk if a company eliminates its dividend altogether? Such a scenario is rare, and these cases are actually outliers. A one-class support vector classifier can be used to detect outliers or anomalies [8:13].
