Feature selection

Feature selection is one of the hardest parts of building a financial model. It can be done statistically or by applying domain knowledge. Here we discuss only a few of the statistical feature selection methods used in finance.

Removing irrelevant features

Data may contain highly correlated features, and a model generally does better without them. The caret R package provides a method for finding highly correlated features from the correlation matrix between the features, which the following example computes.

The first few rows of the data used for correlation analysis and multiple regression analysis are displayed by executing the following code:

> DataMR <- read.csv("C:/Users/prashant.vats/Desktop/Projects/BOOK R/DataForMultipleRegression.csv")
> head(DataMR)

  StockYPrice StockX1Price StockX2Price StockX3Price StockX4Price
1       80.13        72.86         93.1         63.7         83.1
2       79.57        72.88         90.2         63.5           82
3       79.93        71.72           99         64.5         82.8
4       81.69        71.54         90.9         66.7         86.5
5       80.82           71         90.7         60.7         80.8
6       81.07        71.78         93.1         62.9         84.2

The preceding output shows the five variables in DataMR: StockYPrice, StockX1Price, StockX2Price, StockX3Price, and StockX4Price. Here StockYPrice is the dependent variable and the other four are independent variables. Understanding the dependence structure among these variables is important before going deeper into the analysis.

The following command calculates the correlation matrix between the first four columns, which are StockYPrice, StockX1Price, StockX2Price, and StockX3Price:

> correlationMatrix <- cor(DataMR[, 1:4])

Figure 3.11: Correlation matrix table

The preceding correlation matrix shows which variables are highly correlated. Features are then selected so that no pair of highly correlated features remains in the model.
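As a minimal sketch of how such a selection might be automated, the following base R code flags pairs of independent variables whose absolute correlation exceeds a cutoff. It rebuilds only the six rows printed by head(DataMR), since the full CSV is not reproduced here, so the numbers it yields are illustrative only. The names featureCor, cutoff, highPairs, and flagged, and the cutoff value of 0.9, are assumptions made for this example and do not come from the text.

```r
# Six rows copied from the head(DataMR) output; the full data set is not available here.
DataMR <- data.frame(
  StockYPrice  = c(80.13, 79.57, 79.93, 81.69, 80.82, 81.07),
  StockX1Price = c(72.86, 72.88, 71.72, 71.54, 71.00, 71.78),
  StockX2Price = c(93.1, 90.2, 99.0, 90.9, 90.7, 93.1),
  StockX3Price = c(63.7, 63.5, 64.5, 66.7, 60.7, 62.9),
  StockX4Price = c(83.1, 82.0, 82.8, 86.5, 80.8, 84.2)
)

# Correlation matrix of the independent variables only (columns 2 to 5).
featureCor <- cor(DataMR[, 2:5])

# Hypothetical cutoff (an assumption); choose a value that suits the data.
cutoff <- 0.9

# Look only at the upper triangle so each pair is reported once
# and the diagonal (always 1) is ignored.
highPairs <- which(abs(featureCor) > cutoff & upper.tri(featureCor),
                   arr.ind = TRUE)

# Table of flagged pairs with their correlations (may have zero rows).
flagged <- data.frame(
  feature1    = rownames(featureCor)[highPairs[, "row"]],
  feature2    = colnames(featureCor)[highPairs[, "col"]],
  correlation = featureCor[highPairs]
)

print(round(featureCor, 2))
print(flagged)
```

One feature from each flagged pair would then be a candidate for removal. The caret package offers findCorrelation() for a similar purpose; the base R version is used here only to keep the sketch free of extra dependencies.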
