Improving boosting

Given its out-of-sample performance and the frequency with which the boosting model buys and sells, we can assume it is overfitting the training data. Therefore, we will try to regularize its learning. The first step is to limit the maximum depth of the individual trees, starting with an upper limit of 2 (max_depth=2). This slightly improves the model, yielding an MSE of 19.14 and a Sharpe value of 0.17. Further limiting the model's capacity to overfit by using only 10 base learners (n_estimators=10) yields additional improvement.

The MSE is reduced to 16.39 and the Sharpe value increases to 0.21. Adding an L1 regularization term of 0.5 (reg_alpha=0.5) reduces the MSE only marginally, to 16.37. We have reached a point where further fine-tuning will not contribute much to the model's performance. At this point, our regressor looks like this:

lr = XGBRegressor(max_depth=2, n_estimators=10, reg_alpha=0.5)
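
If you want to reproduce this progression, a loop along the following lines can fit and score each configuration. This is a minimal sketch: x_train, y_train, x_test, and y_test are hypothetical names standing in for the chapter's existing train/test split.

from sklearn.metrics import mean_squared_error
from xgboost import XGBRegressor

# Each configuration from the progression above
configs = [{},                                                      # xgb defaults
           {'max_depth': 2},                                        # md=2
           {'max_depth': 2, 'n_estimators': 10},                    # md=2/ne=10
           {'max_depth': 2, 'n_estimators': 10, 'reg_alpha': 0.5}]  # md=2/ne=10/reg=0.5

for params in configs:
    model = XGBRegressor(**params)
    model.fit(x_train, y_train)
    preds = model.predict(x_test)
    print(params, 'MSE: %.2f' % mean_squared_error(y_test, preds))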

Given the capabilities of XGBoost, we will try to increase the amount of information available to the model. We will increase the number of feature lags to 30 and add, as additional features, the lags of a 15-point rolling mean of the percentage differences. To do this, we modify the feature creation section of the code as follows:

def create_x_data(lags=1):
    # diffs and diff_len are defined earlier in the chapter:
    # the percentage differences of the closing price and their length
    diff_data = np.zeros((diff_len, lags))
    ma_data = np.zeros((diff_len, lags))

    # 15-point rolling mean of the percentage differences
    diff_ma = (data.Close.diff()/data.Close).rolling(15).mean().fillna(0).values[1:]

    for lag in range(1, lags+1):
        # Lagged percentage differences
        this_data = diffs[:-lag]
        diff_data[lag:, lag-1] = this_data

        # Lagged rolling-mean values
        this_data = diff_ma[:-lag]
        ma_data[lag:, lag-1] = this_data

    return np.concatenate((diff_data, ma_data), axis=1)

x_data = create_x_data(lags=30)*100
y_data = diffs*100
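
Note that with lags=30, each row of x_data now holds 60 features: the 30 lagged percentage differences, followed by the 30 corresponding lags of their 15-point rolling mean.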

This increases the trading performance of our model, achieving a Sharpe value of 0.32 (the highest of all the models), although it also increases the MSE to 16.78. The trades generated by this model are depicted in the following figure and table. It is interesting to note that the number of buys has been greatly reduced, a behavior that bagging also exhibited when we improved its performance as an investment strategy:

Final boosting model performance

Metric   md=2/ne=10/reg=0.5+data   md=2/ne=10/reg=0.5   md=2/ne=10   md=2    xgb
MSE      16.78                     16.37                16.39        19.14   19.20
Sharpe   0.32                      0.21                 0.21         0.17    0.13

Metrics for all boosting models
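
For reference, a Sharpe value like the ones reported above can be derived from the model's predictions roughly as follows. The sign-based entry rule shown here is an illustrative assumption, not necessarily the exact trading logic used in this chapter, and x_test and y_test again stand in for the chapter's test split.

import numpy as np

def sharpe_ratio(preds, actual):
    # Hypothetical entry rule: go long when the model predicts a positive
    # return, short when it predicts a negative one
    positions = np.sign(preds)
    strategy_returns = positions * actual  # per-period strategy returns
    return strategy_returns.mean() / strategy_returns.std()

print('Sharpe: %.2f' % sharpe_ratio(lr.predict(x_test), y_test))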