The final step is to extract the NumPy array underlying each pandas DataFrame, since that array is what will be passed directly into the machine learning algorithm. First, we save the final column names, which will help us when we assess variable importance later on:
X_train_cols = X_train.columns
X_test_cols = X_test.columns
Now, we use the values attribute of the pandas DataFrames to access the underlying NumPy array for each DataFrame:
X_train = X_train.values
X_test = X_test.values
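Because a NumPy array carries no column labels, the saved column names are the only link back to the original features. The following is a minimal sketch of that relationship on a toy DataFrame. The data, the `X_train_arr` and `col_index` names, and the use of `to_numpy()` (which the pandas documentation recommends over `values`) are illustrative assumptions, not part of the original workflow:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real X_train
X_train = pd.DataFrame({
    "age": [25, 32, 47],
    "income": [40000.0, 52000.0, 61000.0],
})

# Save the column labels before dropping down to NumPy
X_train_cols = X_train.columns

# to_numpy() returns the same kind of array as .values for this frame
X_train_arr = X_train.to_numpy()

# Column i of the array corresponds to X_train_cols[i],
# so a name-to-position lookup recovers any feature by label
col_index = {name: i for i, name in enumerate(X_train_cols)}
income = X_train_arr[:, col_index["income"]]
```

The same positional mapping is what lets a per-column importance score from a fitted model be matched back to a feature name later on.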
We are now ready to build the models.