With well-prepared data, it is safe to move on to the training set generation stage. Typical tasks in this stage fall into two major categories: data preprocessing and feature engineering.
Data preprocessing usually involves categorical feature encoding, feature scaling, feature selection, and dimensionality reduction.
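
To make the first two of these tasks concrete, here is a minimal sketch in plain Python of categorical feature encoding (one-hot) and feature scaling (min-max). The helper names `one_hot_encode` and `min_max_scale` and the toy `colors` and `ages` columns are assumptions made for illustration and do not come from any particular library. Feature selection and dimensionality reduction are not shown, and in practice a library such as scikit-learn would typically handle all of these steps.

```python
def one_hot_encode(values):
    """Map each categorical value to a binary indicator vector."""
    # Sort the distinct categories so the column order is deterministic.
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        encoded.append(row)
    return categories, encoded


def min_max_scale(values):
    """Linearly rescale numeric values into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical toy data: one categorical column and one numeric column.
colors = ["red", "green", "blue", "green"]
ages = [20, 30, 40, 60]

categories, color_features = one_hot_encode(colors)
age_features = min_max_scale(ages)

# Combine the encoded and scaled columns into one feature row per sample.
training_rows = [enc + [age] for enc, age in zip(color_features, age_features)]

for row in training_rows:
    print(row)
```

Each row in `training_rows` now holds three indicator columns (one per color, in the order given by `categories`) followed by the scaled age, so every feature lies in the range 0 to 1.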