For classification, the corresponding class is XGBClassifier. Its constructor accepts the same parameters as the regression implementation. For our example, we use the hand-written digit classification problem, setting the n_estimators parameter to 100 and n_jobs to 4. The rest of the code follows the usual template:
# --- SECTION 1 ---
# Libraries and data loading
from sklearn.datasets import load_digits
from xgboost import XGBClassifier
from sklearn import metrics
import numpy as np
digits = load_digits()
train_size = 1500
train_x, train_y = digits.data[:train_size], digits.target[:train_size]
test_x, test_y = digits.data[train_size:], digits.target[train_size:]
np.random.seed(123456)
# --- SECTION 2 ---
# Create the ensemble
ensemble_size = 100
ensemble = XGBClassifier(n_estimators=ensemble_size, n_jobs=4)
# --- SECTION 3 ---
# Train the ensemble
ensemble.fit(train_x, train_y)
# --- SECTION 4 ---
# Evaluate the ensemble
ensemble_predictions = ensemble.predict(test_x)
ensemble_acc = metrics.accuracy_score(test_y, ensemble_predictions)
# --- SECTION 5 ---
# Print the accuracy
print('Boosting: %.2f' % ensemble_acc)
The ensemble correctly classifies the test set with 89% accuracy, the highest achieved by any of the boosting algorithms.