With well-prepared data, we can move on to the training set generation stage. Typical tasks in this stage fall into two major categories: data preprocessing and feature engineering.
To begin, data preprocessing usually involves categorical feature encoding, feature scaling, feature selection, and dimensionality reduction.
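As a rough illustration of three of these tasks, the following minimal sketch uses only the Python standard library. The toy data and the helper names (`one_hot_encode`, `min_max_scale`, `drop_constant_features`) are hypothetical and chosen for this example only. Dropping constant columns stands in for feature selection as one very simple strategy among many, and dimensionality reduction is omitted for brevity.

```python
from statistics import pvariance


def one_hot_encode(values):
    """Categorical feature encoding: one 0/1 indicator column per category."""
    categories = sorted(set(values))
    encoded = [[1.0 if v == c else 0.0 for c in categories] for v in values]
    return encoded, categories


def min_max_scale(column):
    """Feature scaling: rescale a numeric column to the [0, 1] range."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no spread; map it to zeros.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def drop_constant_features(rows):
    """Simple feature selection: keep only columns with non-zero variance."""
    columns = list(zip(*rows))
    kept = [i for i, col in enumerate(columns) if pvariance(col) > 0.0]
    return [[row[i] for i in kept] for row in rows], kept


# Hypothetical toy dataset: one categorical, one numeric, one constant feature.
colors = ["red", "green", "red", "blue"]
ages = [20, 30, 40, 60]
flags = [1, 1, 1, 1]

encoded, categories = one_hot_encode(colors)
scaled_ages = min_max_scale(ages)

# Assemble one row per sample: indicator columns, scaled age, constant flag.
rows = [enc + [age, float(flag)]
        for enc, age, flag in zip(encoded, scaled_ages, flags)]

selected, kept = drop_constant_features(rows)
print(categories)
print(scaled_ages)
print(kept)
```

In practice, a library such as scikit-learn would typically handle these steps. The hand-written versions here are only meant to make each operation explicit.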