Using XGBoost for classification

For classification, the corresponding class is implemented in XGBClassifier. Its constructor accepts the same parameters as the regression implementation. For our example, we use the handwritten digit classification problem. We set the n_estimators parameter to 100 and n_jobs to 4. The rest of the code follows the usual template:

# --- SECTION 1 ---
# Libraries and data loading
from sklearn.datasets import load_digits
from xgboost import XGBClassifier
from sklearn import metrics
import numpy as np
digits = load_digits()
train_size = 1500
train_x, train_y = digits.data[:train_size], digits.target[:train_size]
test_x, test_y = digits.data[train_size:], digits.target[train_size:]
np.random.seed(123456)

# --- SECTION 2 ---
# Create the ensemble
ensemble_size = 100
ensemble = XGBClassifier(n_estimators=ensemble_size, n_jobs=4)

# --- SECTION 3 ---
# Train the ensemble
ensemble.fit(train_x, train_y)

# --- SECTION 4 ---
# Evaluate the ensemble
ensemble_predictions = ensemble.predict(test_x)
ensemble_acc = metrics.accuracy_score(test_y, ensemble_predictions)

# --- SECTION 5 ---
# Print the accuracy
print('Boosting: %.2f' % ensemble_acc)

The ensemble classifies the test set with 89% accuracy, the highest accuracy achieved by any of the boosting algorithms presented so far.
