Training the model parameters

Once we have our final data frame, we can frame the machine learning problem as minimizing an error function. Our aim is to make the best possible predictions on unseen patients/encounters; in other words, we want to minimize the difference between the predicted values and the observed values. For example, if we are trying to predict cancer onset, we want the predicted likelihood of cancer to be high for patients who developed cancer and low for patients who did not. In machine learning, a measure of the difference between the predicted and observed values is known as an error function or cost function. Cost functions can take various forms, and machine learning practitioners often experiment with them while modeling.

To minimize the cost function, we need to know what weights to assign to each feature. In general, features that are more strongly associated with the outcome variable should be given more mathematical importance than features that are less strongly associated with it. In a simplistic sense, we can refer to these measures of importance as weights, or parameters. A central goal of supervised machine learning is to find the set of parameters, or weights, that minimizes the cost function. Almost every machine learning algorithm has its own way of assigning weights to different features. We will study this part of the pipeline in greater detail for the logistic regression, random forest, and neural network algorithms in Chapter 7, Making Predictive Models in Healthcare.
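To make the idea concrete, here is a minimal sketch (using NumPy and synthetic data invented for illustration, not patient data from this book) of how logistic regression searches for the weights that minimize a cross-entropy cost function via gradient descent:

```python
import numpy as np

# Hypothetical toy data: 100 "patients" with two features each and a
# binary outcome (for example, cancer onset: 1 = developed, 0 = did not).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=100) > 0).astype(float)

def predict(X, w, b):
    """Sigmoid of a weighted sum: predicted probability of the outcome."""
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))

def cost(X, y, w, b):
    """Cross-entropy cost: large when confident predictions are wrong."""
    p = predict(X, w, b)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# Gradient descent: repeatedly nudge the weights in the direction
# that reduces the cost.
w, b = np.zeros(2), 0.0
learning_rate = 0.1
for _ in range(500):
    p = predict(X, w, b)
    w -= learning_rate * (X.T @ (p - y)) / len(y)
    b -= learning_rate * np.mean(p - y)
```

After training, `cost(X, y, w, b)` is lower than it was at the all-zero starting weights, and features more strongly associated with the outcome (here, the first column of `X`) end up with larger weights. Real algorithms differ in their cost functions and optimization details, but this search for cost-minimizing parameters is the common core.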
