Evaluating model performance

Finally, after building the model, it is important to evaluate its performance against the ground truth so that we can adjust the model if needed, compare different models, and report the results to others. The appropriate evaluation method depends on the structure of the target variable being predicted.

Often, the first step in evaluating a model is to build a 2 x 2 contingency table, an example of which is shown as follows (Preventive Medicine, 2016). A 2 x 2 contingency table cross-tabulates the actual outcomes against the predicted outcomes, splitting all of the observations into four categories, which are discussed next:

For binary-valued target variables (that is, binary classification problems), there are four types of observations:

Those that had a positive outcome for which we predicted a positive outcome
Those that had a positive outcome for which we predicted a negative outcome
Those that had a negative outcome for which we predicted a positive outcome
Those that had a negative outcome for which we predicted a negative outcome

These four classes of observations are referred to respectively as:

True positives (TP)
False negatives (FN)
False positives (FP)
True negatives (TN)
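
As a minimal sketch of how these four counts could be tallied, the following Python snippet compares actual and predicted labels pair by pair. The function name confusion_counts, the positive parameter, and the example labels are all hypothetical and chosen only for illustration; they are not taken from the text.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Tally TP, FN, FP, and TN for a binary target.

    `positive` is the label treated as the positive outcome
    (an assumption of this sketch; any other label counts as negative).
    """
    counts = Counter()
    for actual, predicted in zip(y_true, y_pred):
        if actual == positive:
            # Actual positive: correct if predicted positive, else a miss
            counts["TP" if predicted == positive else "FN"] += 1
        else:
            # Actual negative: a false alarm if predicted positive
            counts["FP" if predicted == positive else "TN"] += 1
    # Return the counts in the same order as the list above
    return {key: counts[key] for key in ("TP", "FN", "FP", "TN")}


# Made-up labels for illustration only (1 = positive, 0 = negative)
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
print(counts)  # {'TP': 3, 'FN': 1, 'FP': 1, 'TN': 3}
```

The four counts always sum to the total number of observations, which is a quick sanity check on any contingency table.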

Various performance measures can then be calculated from these four quantities. We will cover the popular ones in the following sections.
