D.4. Cross-fit training

Another approach to the train/test split question is cross-validation, or k-fold cross-validation (see figure D.3). The concept behind cross-validation is very similar to the rough splits we just covered, but it gives you a path to use the entire labeled set for training. The process involves dividing your labeled dataset into k equal sets, or folds. You then train your model with k-1 of the folds as a training set and validate it against the remaining k-th fold. You then restart training from scratch, this time holding out one of the k-1 folds that was used for training on the first attempt as your validation set; the other k-1 folds become your new training set. You repeat this until each fold has served as the validation set exactly once.

Figure D.3. K-fold cross-validation
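
The fold rotation described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not a reference implementation: the function name k_fold_indices is hypothetical, and it assumes the samples have already been shuffled so that contiguous slices make reasonable folds.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs, one pair per fold.

    Hypothetical helper for illustration; assumes the data is already shuffled.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]                 # held-out fold
        train_idx = indices[:start] + indices[start + size:]  # other k-1 folds
        yield train_idx, val_idx
        start += size


folds = list(k_fold_indices(n_samples=10, k=5))
for train_idx, val_idx in folds:
    print("train:", train_idx, "validate:", val_idx)
```

Each sample lands in a validation fold exactly once and in a training set k-1 times, which is what lets every labeled example contribute to training.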

This technique is valuable for analyzing the structure of the model and finding hyperparameters that perform well against a variety of validation data. Once your hyperparameters are chosen, you still have to select the trained model that performed the best, and that selection is susceptible to the bias described in the previous section, so holding a test set out from this process is still advisable.
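
As a rough sketch of that workflow, the following plain-Python example tunes one hyperparameter with 5-fold cross-validation and only then scores the chosen setting on a test set that was held out from the whole process. Everything here is an assumption made for illustration: the toy 1-D dataset, the tiny nearest-neighbor classifier, and the names knn_predict, accuracy, and cross_val_score are hypothetical and not taken from any library.

```python
import random


def knn_predict(train, x, n_neighbors):
    """Majority vote among the n_neighbors training points closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:n_neighbors]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def accuracy(train, val, n_neighbors):
    """Fraction of validation points classified correctly."""
    hits = sum(knn_predict(train, x, n_neighbors) == y for x, y in val)
    return hits / len(val)


def cross_val_score(data, n_neighbors, k=5):
    """Mean validation accuracy over k folds (len(data) divisible by k)."""
    fold_size = len(data) // k
    scores = []
    for i in range(k):
        val = data[i * fold_size:(i + 1) * fold_size]
        train = data[:i * fold_size] + data[(i + 1) * fold_size:]
        scores.append(accuracy(train, val, n_neighbors))
    return sum(scores) / k


rng = random.Random(0)
# Hypothetical toy data: label is 1 when x > 0.5, with about 10% label noise.
data = []
for _ in range(120):
    x = rng.random()
    y = int(x > 0.5)
    if rng.random() < 0.1:
        y = 1 - y
    data.append((x, y))

# The test set is split off first and never touched during tuning.
test_set, cv_set = data[:20], data[20:]

candidates = [1, 5, 15]  # hyperparameter values to compare
cv_scores = {n: cross_val_score(cv_set, n) for n in candidates}
best_n = max(cv_scores, key=cv_scores.get)

# Final, less biased estimate: use all cross-validation data, score on test.
test_accuracy = accuracy(cv_set, test_set, best_n)
print(cv_scores, best_n, test_accuracy)
```

The cross-validation scores guide the choice of hyperparameter, while the test accuracy is the number to report, because the test set played no part in that choice.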

This approach also gives you some new information about the reliability of your model. Because you now have k validation scores rather than one, you can compute a p-value that estimates how likely it is that the relationship your model discovered between the input features and the output predictions is the result of random chance rather than statistically significant. This is a substantial new piece of information, provided your training dataset is truly a representative sample of the real world.
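
The text does not say how such a p-value is computed. One common option, shown here purely as an assumption, is a permutation test: shuffle the labels many times, rerun the cross-validation each time, and count how often the shuffled data scores at least as well as the real data. The toy dataset, the 1-D nearest-centroid classifier, and the function names below are all hypothetical.

```python
import random


def centroid_accuracy(train, val):
    """Fit a 1-D nearest-centroid classifier on train and score it on val."""
    means = {}
    for label in (0, 1):
        xs = [x for x, y in train if y == label]
        if xs:  # skip a class that happens to be absent from this fold
            means[label] = sum(xs) / len(xs)
    hits = 0
    for x, y in val:
        pred = min(means, key=lambda label: abs(x - means[label]))
        hits += pred == y
    return hits / len(val)


def cv_score(data, k=5):
    """Mean validation accuracy over k folds (len(data) divisible by k)."""
    fold = len(data) // k
    scores = []
    for i in range(k):
        val = data[i * fold:(i + 1) * fold]
        train = data[:i * fold] + data[(i + 1) * fold:]
        scores.append(centroid_accuracy(train, val))
    return sum(scores) / k


rng = random.Random(42)
# Hypothetical toy data: class 0 centered at 0.0, class 1 centered at 2.0.
data = [(rng.gauss(2.0 * label, 1.0), label) for label in [0, 1] * 30]
observed = cv_score(data)

xs = [x for x, _ in data]
ys = [y for _, y in data]
n_permutations = 200
at_least_as_good = 0
for _ in range(n_permutations):
    shuffled = ys[:]
    rng.shuffle(shuffled)  # break any real link between features and labels
    if cv_score(list(zip(xs, shuffled))) >= observed:
        at_least_as_good += 1

# The +1 terms count the unshuffled run, so the estimate is never exactly 0.
p_value = (at_least_as_good + 1) / (n_permutations + 1)
print(observed, p_value)
```

A small p-value suggests the cross-validated score would be hard to reach if the features carried no information about the labels. It says nothing about whether the dataset itself is representative.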

The cost of this extra confidence in your model is that k-fold cross-validation takes roughly k times as long to train. So if you want to get the 90% answer to your problem, you can often simply train once on a single train/validation split. This is equivalent to running just one round of k-fold cross-validation, and it is the same training set and validation set split that we did earlier. You won’t have 100% confidence in the reliability of your model as a description of the real-world dynamics, but if it works well on your test set, you can be reasonably confident that it is a useful model for predicting your target variable. This is the practical approach that makes sense for most business applications of machine learning models.
