Goal

The overall goal is to create a machine learning application that can train models that respect a given investment strategy and deploy these models as a callable service that processes incoming loan applications. The service will decide whether to accept a given loan application and compute an interest rate for it. We can define our intentions with a top-down approach, starting from the business requirements. Remember, a good data scientist has a firm understanding of the question(s) being asked, which depends on understanding the business requirement(s), which are as follows:

We need to define what the investment strategy means and how it influences the way we create and evaluate our machine learning models. Then, we will take the model's findings and apply them to our portfolio of loans to optimize our profits under the specified investment strategy.
We need to define how the expected return is computed under the investment strategy, and the application should report the expected return for a lender. This is an important loan attribute for investors, since it directly connects the loan application, the investment strategy (that is, risk), and the possible profit. We should keep this in mind because, in real life, modeling pipelines are used by people who are not experts in data science or statistics and who are more interested in a high-level interpretation of the modeling outputs.
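To make the link between risk and profit concrete, here is a minimal sketch of one possible expected-return computation. It is not the definition used later in the chapter: it assumes a simple two-outcome, single-period model in which the borrower either repays in full with interest or defaults. The function name `expected_return` and the parameters `p_default` and `recovery_rate` are hypothetical and chosen only for illustration.

```python
def expected_return(loan_amount, interest_rate, p_default, recovery_rate=0.0):
    """Expected profit of one loan under an assumed two-outcome model.

    Assumption: with probability (1 - p_default) the lender earns the full
    interest; with probability p_default the lender loses the principal,
    minus whatever fraction is recovered.
    """
    gain_if_repaid = loan_amount * interest_rate
    loss_if_default = loan_amount * (1.0 - recovery_rate)
    return (1.0 - p_default) * gain_if_repaid - p_default * loss_if_default


# Example: a 10,000 loan at 10% interest with a 5% estimated default risk.
profit = expected_return(10_000, 0.10, 0.05)
print(round(profit, 2))
```

Under these assumptions, a higher estimated default probability lowers the expected return, so a stricter investment strategy would accept fewer loans, or accept them only at a higher rate.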
Furthermore, we need a way to design and implement a loan prediction pipeline, which consists of the following:
- A model that, based on the loan application data and the investment strategy, decides the loan status: whether the loan should be accepted or rejected.
  - The model needs to be robust enough to reject all bad loans (that is, loans that would lead to an investment loss) while not missing any good loans (that is, investment opportunities).
  - The model should be interpretable: it should provide an explanation as to why a loan was rejected. Interestingly, there is a lot of research on this subject, namely on making models interpretable for key stakeholders who want something more tangible than just "the model said so."

For those interested in further reading regarding model interpretability, Zachary Lipton (UCSD) has an outstanding paper titled The Mythos of Model Interpretability (https://arxiv.org/abs/1606.03490), which directly addresses this topic. This is an especially useful paper for those data scientists who are constantly in the hot seat of explaining all their magic!

- Another model that recommends the interest rate for accepted loans. Based on the specified loan application, the model should decide the best interest rate: not so high that we lose the borrower, but not so low that we miss out on profit.
- Finally, we need to decide how to deploy this complex, multi-faceted machine learning pipeline. Much like the pipeline in our previous chapter, which combines multiple models, it will take all the inputs in our dataset (which, as we will see, are of very different types) and perform processing, feature extraction, model prediction, and recommendations based on our investment strategy: a tall order, but one that we will accomplish in this chapter!
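The two-stage structure described above (first decide on the loan, then price it) can be sketched as follows. This is only an illustration under stated assumptions, not the pipeline built in this chapter: the models are passed in as plain functions, the investment strategy is reduced to a single hypothetical `max_default_probability` limit, and `LoanDecision`, `decide_loan`, and the toy models with their `dti` field are made-up names.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class LoanDecision:
    accepted: bool
    interest_rate: Optional[float]  # None when the loan is rejected
    reason: str                     # human-readable explanation


def decide_loan(
    application: dict,
    default_model: Callable[[dict], float],  # returns estimated P(default)
    rate_model: Callable[[dict], float],     # returns recommended interest rate
    max_default_probability: float,          # assumed stand-in for the strategy
) -> LoanDecision:
    """Stage 1: accept or reject. Stage 2: price accepted loans only."""
    p_default = default_model(application)
    if p_default > max_default_probability:
        reason = (
            f"estimated default probability {p_default:.2f} "
            f"exceeds the limit {max_default_probability:.2f}"
        )
        return LoanDecision(False, None, reason)
    return LoanDecision(True, rate_model(application), "within the risk limit")


# Toy stand-ins for trained models (hypothetical logic, for illustration only).
def toy_default_model(app: dict) -> float:
    return 0.5 if app["dti"] > 0.4 else 0.05


def toy_rate_model(app: dict) -> float:
    return 0.08 + 0.1 * app["dti"]


good = decide_loan({"dti": 0.2}, toy_default_model, toy_rate_model, 0.1)
bad = decide_loan({"dti": 0.6}, toy_default_model, toy_rate_model, 0.1)
print(good)
print(bad)
```

Returning a `reason` alongside the decision is a small nod to the interpretability requirement: a rejected application carries an explanation rather than just a flag.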
