Testing the base learners

To gauge how well the base learners perform on their own, we first benchmark each of them individually, before we train and evaluate the ensemble as a whole. We load the libraries and the dataset (using pandas to import the CSV file), rescale the Time and Amount columns, and then split the data, with 70% in the train set and 30% in the test set:

# --- SECTION 1 ---
# Libraries and data loading
import numpy as np
import pandas as pd

from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn import metrics

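# Fix the seed so that the train-test split is reproducible
np.random.seed(123456)
data = pd.read_csv('creditcard.csv')
# Rescale Time (shifted by its minimum) and Amount (centered on its mean)
# by their respective standard deviations
data.Time = (data.Time-data.Time.min())/data.Time.std()
data.Amount = (data.Amount-data.Amount.mean())/data.Amount.std()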

# Train-Test split of 70%-30%
x_train, x_test, y_train, y_test = train_test_split(
data.drop('Class', axis=1).values, data.Class.values, test_size=0.3)

After loading the libraries and data, we train each classifier and print the required metrics from the sklearn.metrics package. F1 score is implemented by the f1_score function and recall is implemented by the recall_score function. The decision tree is restricted to a maximum depth of three (max_depth=3), in order to avoid overfitting:

# --- SECTION 2 ---
# Base learners evaluation
base_classifiers = [('DT', DecisionTreeClassifier(max_depth=3)),
                    ('NB', GaussianNB()),
                    ('LR', LogisticRegression())]

for name, clf in base_classifiers:
    # Fit the learner on the train set and score it on the test set
    clf.fit(x_train, y_train)

    predictions = clf.predict(x_test)
    print(name+' f1', metrics.f1_score(y_test, predictions))
    print(name+' recall', metrics.recall_score(y_test, predictions))

The results are shown in the following table. The decision tree achieves the highest F1 score of the three learners. Naive Bayes has a higher recall score, but its F1 score is considerably worse than the decision tree's:

Learner	Metric	Value
Decision Tree	F1	0.770
Decision Tree	Recall	0.713
Naive Bayes	F1	0.107
Naive Bayes	Recall	0.824
Logistic Regression	F1	0.751
Logistic Regression	Recall	0.632
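
Because F1 is the harmonic mean of precision and recall, a high recall paired with a very low F1 score, as seen for Naive Bayes, points to low precision (many false alarms). The following minimal sketch illustrates this on made-up toy labels (the labels are an assumption for illustration, not taken from the credit card dataset), and also backs out the precision implied by the Naive Bayes row of the table:

```python
from sklearn import metrics

# Toy labels (made up for illustration): 4 frauds among 20 transactions
y_true = [1, 1, 1, 1] + [0] * 16

# A "trigger-happy" classifier: it catches 3 of the 4 frauds,
# but also raises 12 false alarms on legitimate transactions
y_pred = [1, 1, 1, 0] + [1] * 12 + [0] * 4

recall = metrics.recall_score(y_true, y_pred)        # TP / (TP + FN)
precision = metrics.precision_score(y_true, y_pred)  # TP / (TP + FP)
f1 = metrics.f1_score(y_true, y_pred)                # harmonic mean of both

print('recall', recall)
print('precision', precision)
print('f1', round(f1, 3))

# Solving F1 = 2PR / (P + R) for P gives P = F1 * R / (2R - F1).
# Plugging in the Naive Bayes values from the table:
nb_f1, nb_recall = 0.107, 0.824
nb_precision = nb_f1 * nb_recall / (2 * nb_recall - nb_f1)
print('implied Naive Bayes precision', round(nb_precision, 3))
```

In other words, the reported Naive Bayes scores are consistent with a model that flags most frauds, but only at the cost of flagging many legitimate transactions as well.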

We can also experiment with the number of features present in the dataset. By examining each feature's correlation to the target, we can filter out the features that are only weakly correlated with it. The following depicts each feature's correlation to the target:

Correlation between each variable and the target

By filtering out any feature whose absolute correlation to the target is lower than 0.1, we hope that the base learners will be able to better detect the fraudulent transactions, as the dataset's noise will be reduced.
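
The idea can be sketched on a small synthetic DataFrame before applying it to the real data. Everything in this sketch is hypothetical (the column names signal, anti_signal, and noise, and the way they are generated); only the 0.1 threshold and the Class target name follow the text:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 5000

# Hypothetical toy data: 'signal' and 'anti_signal' depend on the target,
# while 'noise' is generated independently of it
target = rng.integers(0, 2, size=n)
toy = pd.DataFrame({
    'signal': target + rng.normal(scale=0.5, size=n),
    'anti_signal': -target + rng.normal(scale=0.5, size=n),
    'noise': rng.normal(size=n),
    'Class': target,
})

threshold = 0.1
correlations = toy.corr()['Class'].drop('Class')

# Keep the features whose absolute correlation exceeds the threshold.
# Taking the absolute value retains strongly negative correlations too.
selected = list(correlations[correlations.abs() > threshold].index)

print(correlations.round(3))
print(selected)
```

Note that filtering on the absolute value matters: a feature that is strongly negatively correlated with the target is just as informative as a positively correlated one.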

To test this hypothesis, we repeat the experiment, but remove any columns from the DataFrame whose absolute correlation to the target is lower than 0.1, as indicated by fs = list(correlations[(abs(correlations)>threshold)].index.values).

Here, fs holds all column names with an absolute correlation greater than the indicated threshold:

# --- SECTION 3 ---
# Filter features according to their correlation to the target
np.random.seed(123456)
threshold = 0.1

correlations = data.corr()['Class'].drop('Class')
fs = list(correlations[(abs(correlations)>threshold)].index.values)
fs.append('Class')
data = data[fs]

x_train, x_test, y_train, y_test = train_test_split(data.drop('Class', axis=1).values, data.Class.values, test_size=0.3)

for name, clf in base_classifiers:
    # Refit each learner on the filtered features and score it again
    clf.fit(x_train, y_train)

    predictions = clf.predict(x_test)
    print(name+' f1', metrics.f1_score(y_test, predictions))
    print(name+' recall', metrics.recall_score(y_test, predictions))

Again, we present the results in the following table. As we can see, the decision tree has increased its F1 score (from 0.770 to 0.785), while its recall has decreased (from 0.713 to 0.699). Naive Bayes has improved on both metrics, while the logistic regression model has become slightly worse on both:

Learner	Metric	Value
Decision Tree	F1	0.785
	Recall	0.699
Naive Bayes	F1	0.208
	Recall	0.846
Logistic Regression	F1	0.735
	Recall	0.610

Performance metrics for the three base learners for the filtered dataset
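
Since the same evaluation loop runs on both the full and the filtered dataset, it could be factored into a small helper. The following is a minimal sketch, not the code used to produce the tables: the function name evaluate_learners is hypothetical, and a synthetic imbalanced dataset from make_classification stands in for creditcard.csv so that the example runs on its own:

```python
from sklearn import metrics
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier


def evaluate_learners(classifiers, x_train, x_test, y_train, y_test):
    """Fit each (name, estimator) pair and return {name: (f1, recall)}."""
    scores = {}
    for name, clf in classifiers:
        clf.fit(x_train, y_train)
        predictions = clf.predict(x_test)
        scores[name] = (metrics.f1_score(y_test, predictions),
                        metrics.recall_score(y_test, predictions))
    return scores


# Synthetic, imbalanced stand-in for the credit card data (an assumption)
x, y = make_classification(n_samples=2000, n_features=10,
                           weights=[0.9, 0.1], random_state=123456)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=123456)

base_classifiers = [('DT', DecisionTreeClassifier(max_depth=3)),
                    ('NB', GaussianNB()),
                    ('LR', LogisticRegression())]

scores = evaluate_learners(base_classifiers, x_train, x_test, y_train, y_test)
for name, (f1, recall) in scores.items():
    print(name, 'f1', round(f1, 3), 'recall', round(recall, 3))
```

With such a helper, comparing feature sets amounts to calling it once per train-test split and placing the returned dictionaries side by side.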
