Reviewing methods to prevent overfitting in CNNs

Overfitting occurs when a model fits the training data too closely but fails to generalize to unseen data. For example, a CNN might memorize the specific traffic sign images in the training set instead of learning general patterns. This can be dangerous: a self-driving car may fail to recognize signs under ever-changing conditions, such as weather, lighting, or viewing angles different from those in the training set. To recap, here's what we can do to reduce overfitting:

  • Collecting more training data (if feasible) so that the model sees a wider variety of inputs.
  • Using data augmentation, which synthesizes new training examples from existing ones when time or cost does not allow us to collect more data.
  • Employing dropout, which reduces complex co-adaptations among neighboring neurons.
  • Adding a Lasso (L1) and/or Ridge (L2) penalty, which prevents model coefficients from fitting the training data so perfectly that overfitting arises.
  • Reducing the complexity of the network architecture. Recall from the last chapter that adding more hidden layers does not necessarily boost model performance, but it does increase the chance of overfitting.
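Two of the techniques above, dropout and L1/L2 penalties, can be sketched in a few lines of NumPy. This is a minimal illustration rather than a production implementation (in practice you would use your deep learning framework's built-in layers and regularizers); the function names and the toy shapes are mine, not from any library:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, rate=0.5, training=True):
    """Inverted dropout: zero out a fraction `rate` of units at training
    time and rescale the survivors so the expected activation is unchanged.
    At inference time it is the identity function."""
    if not training or rate == 0.0:
        return activations
    mask = rng.random(activations.shape) >= rate
    return activations * mask / (1.0 - rate)

def l1_penalty(weights, lam=1e-3):
    """Lasso (L1) term added to the loss: lam * sum(|w|)."""
    return lam * np.sum(np.abs(weights))

def l2_penalty(weights, lam=1e-3):
    """Ridge (L2) term added to the loss: lam * sum(w^2)."""
    return lam * np.sum(weights ** 2)

# Toy hidden-layer activations and next-layer weights
h = np.ones((4, 8))          # batch of 4 examples, 8 hidden units
w = rng.normal(size=(8, 3))

h_train = dropout(h, rate=0.5, training=True)   # some units zeroed, survivors scaled by 2
h_eval = dropout(h, rate=0.5, training=False)   # unchanged at inference time

data_loss = 0.0  # imagine the usual cross-entropy loss here
total_loss = data_loss + l1_penalty(w) + l2_penalty(w)
```

The rescaling by `1 / (1 - rate)` is what makes this "inverted" dropout: it keeps the expected magnitude of the activations the same in training and inference, so no extra scaling is needed at test time.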