The modularity tradeoff

This chapter has shown that it is possible, and often useful, to aid a machine learning model with a rule-based system. You might also have noticed that the images in the dataset were all cropped to show only one plant.

We could have built a model that locates the plants in addition to classifying them, or even a system that directly outputs the treatment each plant should receive. This raises the question of how modular we should make our systems.

End-to-end deep learning was all the rage for several years. Given a huge amount of data, a single deep learning model can learn a task that would otherwise require a carefully engineered system of many components. However, end-to-end deep learning has several drawbacks:

  • End-to-end deep learning needs huge amounts of data. Because these models have so many parameters, a lot of data is required to avoid overfitting.
  • End-to-end deep learning is hard to debug. If you replace your entire system with one black-box model, you have little hope of finding out why certain things happened.
  • Some things are hard to learn but easy to write down in code, especially sanity-check rules (see the sketch after this list).
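
For instance, here is a minimal sketch of wrapping a learned classifier in hand-written sanity checks. The class names, the min_confidence threshold, and the predict_proba-style model interface are all assumptions for illustration, not a real API:

```python
import numpy as np

# A minimal sketch of hand-written sanity checks around a learned
# classifier. The class names, thresholds, and predict_proba-style
# model interface are illustrative assumptions.
CLASSES = ["healthy", "mildew", "rust"]

def classify_with_rules(model, image, min_confidence=0.6):
    """Return a class label, or defer to a human when a rule fires."""
    probs = model.predict_proba(image[np.newaxis, ...])[0]
    best = int(np.argmax(probs))

    # Rule 1: an almost entirely dark image is probably a bad capture,
    # not a plant -- trivial to write down, awkward to learn.
    if image.mean() < 0.05:
        return "needs_human_review"

    # Rule 2: never act on a low-confidence prediction.
    if probs[best] < min_confidence:
        return "needs_human_review"

    return CLASSES[best]
```

Each rule takes one line to write and would take many labeled examples to learn reliably.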

Recently, researchers have begun to make their models more modular. A great example is Ha and Schmidhuber's World Models, which can be read here: https://worldmodels.github.io/. In it, they encode visual information, predict the future, and choose actions with three separate models.
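
To make that split concrete, here is a toy sketch of the three-model structure, with trivial stand-ins instead of the authors' actual VAE, RNN, and evolved controller; the Gym-style env interface is also an assumption:

```python
import numpy as np

# Toy stand-ins for the three World Models components; not the authors'
# code. V compresses observations, M predicts the next latent state,
# and C picks actions from the latent code plus memory state.

class Vision:                                  # V: observation -> latent code
    def encode(self, obs):
        return np.resize(np.asarray(obs, float), 8)  # pretend compression

class Memory:                                  # M: predicts the future latent
    def initial_state(self):
        return np.zeros(8)
    def step(self, z, action, h):
        return 0.9 * h + 0.1 * z + 0.01 * action

class Controller:                              # C: latent + memory -> action
    def act(self, z, h):
        return float(np.tanh(z @ h))

def run_episode(env, v, m, c, n_steps=100):
    """Run one episode; env is assumed to expose a classic Gym-style
    reset()/step(action) -> (obs, reward, done, info) interface."""
    obs = env.reset()
    h = m.initial_state()
    total = 0.0
    for _ in range(n_steps):
        z = v.encode(obs)   # each stage's output can be inspected
        a = c.act(z, h)     # and debugged on its own
        h = m.step(z, a, h)
        obs, reward, done, _ = env.step(a)
        total += reward
        if done:
            break
    return total
```

Because each stage produces an inspectable intermediate output, you can debug or swap out one component without retraining the whole system.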

On the practical side, we can take a look at Airbnb, which combines structural modeling with machine learning for its pricing engine. You can read more about it here: https://medium.com/airbnb-engineering/learning-market-dynamics-for-optimal-pricing-97cffbcc53e3. The modelers knew that bookings roughly follow a Poisson distribution and that there are also seasonal effects. So, rather than predicting bookings directly, Airbnb built a model that predicts the parameters of the distribution together with the seasonality.
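
The following is a minimal sketch of that idea, not Airbnb's actual system; the features, weights, and seasonal multipliers are invented for illustration:

```python
import numpy as np

# Predict the parameters of a Poisson distribution instead of the
# booking counts themselves: a learned baseline rate per listing,
# scaled by a multiplicative seasonal factor.

def poisson_rate(features, weights, month, seasonal_factors):
    """lambda = exp(w . x) * s_month: baseline demand scaled by season."""
    baseline = np.exp(features @ weights)
    return baseline * seasonal_factors[month]

def poisson_nll(bookings, lam):
    """Mean negative log-likelihood of counts under Poisson(lam);
    the constant log(k!) term is dropped as it doesn't affect fitting."""
    return np.mean(lam - bookings * np.log(lam))

# Tiny example: three listings, two features each, bookings seen in June.
x = np.array([[1.0, 0.2], [0.5, 1.0], [0.0, 0.7]])
w = np.array([0.3, 0.8])
seasons = {"jun": 1.4, "nov": 0.6}   # assumed seasonal multipliers
lam = poisson_rate(x, w, "jun", seasons)
print(poisson_nll(np.array([3, 2, 1]), lam))
```

Fitting the weights and seasonal factors by minimizing this loss yields interpretable intermediate quantities, baseline demand and seasonality, instead of a single opaque prediction.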

If you have only a small amount of data, then your algorithm's performance has to come from human insight. If some subtasks can be easily expressed in code, it's usually better to express them in code. If you need explainability and want to see why certain choices were made, a modular setup with clearly interpretable intermediate outputs is a good choice. However, if a task is hard, you don't know exactly which subtasks it entails, and you have lots of data, then an end-to-end approach is often better.

It's very rare to use a truly pure end-to-end approach. Images, for example, are always preprocessed on the camera chip; you never really work with raw data.

Being smart about dividing a task can boost performance and reduce risk.
