Statistical-based feature selection

Statistics provides us with relatively quick and easy methods of interpreting both quantitative and qualitative data. We have used some statistical measures in previous chapters to gain new knowledge and perspective on our data; in particular, we used the mean and standard deviation to calculate z-scores and scale our data. In this chapter, we will rely on two new concepts to help us with feature selection:

  • Pearson correlations
  • Hypothesis testing

Both of these methods are known as univariate methods of feature selection, meaning that they evaluate each feature individually. This makes them quick and handy when we want to select single features, one at a time, in order to create a better dataset for our machine learning pipeline.
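
To make the two ideas concrete, here is a minimal sketch of univariate selection on synthetic data. It is an illustration under stated assumptions rather than the procedure used later in the chapter: the feature names, the correlation cutoff of 0.2, and the significance level of 0.05 are all hypothetical choices, and the hypothesis test shown is the standard test of zero correlation returned by SciPy's pearsonr.

```python
import numpy as np
from scipy.stats import pearsonr

# Synthetic data (an assumption for illustration): a binary response and
# three candidate features, only one of which is related to the response.
rng = np.random.default_rng(0)
n = 200
y = rng.integers(0, 2, size=n).astype(float)
features = {
    "informative": y + rng.normal(0.0, 0.3, size=n),
    "noise_a": rng.normal(size=n),
    "noise_b": rng.normal(size=n),
}

CORR_THRESHOLD = 0.2  # hypothetical cutoff on |r|
ALPHA = 0.05          # hypothetical significance level

# Score each feature on its own (univariate): pearsonr returns the
# Pearson correlation coefficient and the p-value for the null
# hypothesis that the true correlation is zero.
results = {}
for name, values in features.items():
    r, p_value = pearsonr(values, y)
    results[name] = (r, p_value)

# Selection by correlation strength.
selected_by_corr = [
    name for name, (r, _) in results.items() if abs(r) > CORR_THRESHOLD
]

# Selection by hypothesis test: keep features whose p-value is below ALPHA.
selected_by_test = [
    name for name, (_, p) in results.items() if p < ALPHA
]

for name, (r, p) in results.items():
    print(f"{name}: r={r:.3f}, p={p:.4f}")
```

Because each feature is scored independently of the others, the loop runs in time proportional to the number of features, which is why these methods are quick. Note that with a 0.05 significance level, a pure-noise feature will still pass the test about one time in twenty by chance.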
