Introduction

Most datasets contain features (attributes or variables) that are highly redundant. To remove irrelevant and redundant data, reduce the computational cost, and avoid overfitting, you can reduce the features to a smaller subset without a significant loss of information. This mathematical procedure is known as dimension reduction: the process of converting data with many dimensions into data with fewer, more relevant dimensions.

Dataset sizes have exploded in recent years due to the continuous collection of data by sensors, cameras, GPS receivers, set-top boxes, phones, and so on. With more data in many dimensions, processing becomes difficult, and some dimensions may not be relevant to every case study.

Dimension reduction gives the following benefits:

  • Reducing the number of features increases the efficiency of data processing, as faster computation becomes possible
  • Data is compressed, which reduces storage requirements
  • Model performance improves because redundant data is removed
  • It plays an important role in noise removal, which also improves model performance

Dimension reduction is, therefore, widely used in the fields of pattern recognition, text retrieval, and machine learning. Dimension reduction can be divided into two parts: feature extraction and feature selection. Feature extraction is a technique that uses a lower dimension space to represent data in a higher dimension space. Feature selection is used to find a subset of the original variables.

The objective of feature selection is to select a set of relevant features with which to construct the model. The techniques for feature selection can be categorized into feature ranking and feature subset selection. Feature ranking scores features by a certain criterion and then selects those above a defined threshold. Feature subset selection, on the other hand, searches for the optimal subset in the space of possible feature subsets.
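As a minimal sketch of feature ranking in Python, assuming scikit-learn is available: each feature is scored with an ANOVA F-test against the class labels, and features scoring above a threshold are kept. The iris dataset and the threshold of 100 are illustrative choices, not values from the text.

```python
# Feature ranking sketch: score features, keep those above a threshold.
from sklearn.datasets import load_iris
from sklearn.feature_selection import f_classif

X, y = load_iris(return_X_y=True)
scores, _ = f_classif(X, y)          # one relevance score per feature
threshold = 100.0                    # illustrative cut-off, not a standard value
selected = [i for i, s in enumerate(scores) if s > threshold]
X_reduced = X[:, selected]           # data restricted to the ranked-in features
print(X.shape, "->", X_reduced.shape)
```

Feature subset selection would instead search over combinations of features (for example, with recursive feature elimination), which is more expensive but can capture interactions that per-feature scores miss.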

In feature extraction, the problem can be categorized as linear or nonlinear. A linear method searches for an affine space that best explains the variation in the data distribution. In contrast, a nonlinear method is a better option for data that is distributed on a highly nonlinear curved surface.

Here are some common linear methods:

  • PCA: Principal component analysis maps data to a lower dimension so that the variance of the data in the low-dimensional representation is maximized.
  • MDS: Multidimensional scaling lets you visualize how near (in terms of pattern proximity) objects are to each other and produces a representation of your data in a lower dimension space. PCA can be regarded as the simplest form of MDS if the distance measure used in MDS equals the covariance of the data.
  • SVD: Singular value decomposition removes redundant features that are linearly correlated, from the perspective of linear algebra. PCA can also be regarded as a specific case of SVD.
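The linear methods above can be sketched in Python, assuming scikit-learn and NumPy: project the 4-dimensional iris data onto its two leading principal components with PCA, then verify that the same directions fall out of an SVD of the centered data, which illustrates the PCA-as-a-case-of-SVD relationship. The dataset choice is illustrative.

```python
# PCA sketch, plus its connection to SVD.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)            # 150 x 2 projection of the data
print(pca.explained_variance_ratio_)   # variance captured per component

# PCA as a special case of SVD: the right singular vectors of the
# centered data matrix are the principal components (up to sign).
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
```

Here `Vt[:2]` matches `pca.components_` up to sign, since scikit-learn's PCA is computed via exactly this kind of SVD on centered data.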

Here are some common nonlinear methods:

  • ISOMAP: ISOMAP can be viewed as an extension of MDS that uses geodesic distance as its distance metric. Here, geodesic distance is approximated by shortest-path distances on a neighborhood graph.
  • LLE: Locally linear embedding performs local PCA and a global eigen-decomposition. LLE is a local approach: it preserves the geometry of each point's local neighborhood. In contrast, ISOMAP is a global approach: it preserves the geodesic distances between all pairs of points.
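The two nonlinear methods can be sketched in Python, assuming scikit-learn: embed a Swiss-roll surface, a standard example of data on a curved manifold, into 2 dimensions with both ISOMAP and LLE. The sample size and neighborhood size are illustrative choices.

```python
# Nonlinear dimension reduction sketch: ISOMAP vs. LLE on a Swiss roll.
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap, LocallyLinearEmbedding

X, _ = make_swiss_roll(n_samples=500, random_state=0)   # 500 x 3 manifold data
iso = Isomap(n_neighbors=10, n_components=2)            # geodesics on a k-NN graph
lle = LocallyLinearEmbedding(n_neighbors=10, n_components=2, random_state=0)
X_iso = iso.fit_transform(X)
X_lle = lle.fit_transform(X)
print(X_iso.shape, X_lle.shape)
```

A linear method such as PCA would flatten the roll onto itself here; both manifold methods instead unroll it, ISOMAP by preserving graph geodesics and LLE by preserving local linear reconstructions.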

In this chapter, we will first discuss how to perform feature ranking and selection. Next, we will focus on the topic of feature extraction and cover recipes in performing dimension reduction with both linear and nonlinear methods. For linear methods, we will look at how to perform PCA, determine the number of principal components, and discuss its visualization. We then move on to MDS and SVD. Furthermore, we will discuss the application of SVD to compress images. For nonlinear methods, we will look at how to perform dimension reduction with ISOMAP and LLE.
