Summary

In this chapter, we fit several regression models, transforming input variables to an appropriate scale and encoding categorical features correctly. In interpreting the coefficients of these models, we examined cases where the classical assumptions of linear regression hold and cases where they are violated. For the latter, we examined generalized linear models, generalized estimating equations (GEE), mixed effects models, and time series models as alternatives. To improve the accuracy of our regression model, we fit both simple and regularized linear models. We also examined tree-based regression models and how to optimize parameter choices when fitting them. Finally, we worked through an example of random forest in PySpark, which can be applied to larger datasets.

In the next chapter, we will examine data that has a discrete categorical outcome, instead of a continuous response. In the process, we will examine in more detail how the likelihood functions of different models are optimized, as well as various algorithms for classification problems.
