To execute this recipe, you need a working Spark environment. We will also build on the no_outliers DataFrame created in the Handling outliers recipe, so we assume you have followed the steps to handle duplicates, missing observations, and outliers.
No other prerequisites are required.