Now, let's train the model using the training portion of the dataset:
- Let's start by importing the LinearRegression class from scikit-learn:
from sklearn.linear_model import LinearRegression
- Then, let's instantiate the linear regression model and train it using the training dataset:
regressor = LinearRegression()
regressor.fit(X_train, y_train)
- Now, let's predict the results using the test portion of the dataset:
y_pred = regressor.predict(X_test)
- Next, let's evaluate the predictions by calculating the RMSE on the test data:
from sklearn.metrics import mean_squared_error
from math import sqrt
sqrt(mean_squared_error(y_test, y_pred))
- Running the preceding code generates the following output:
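As discussed in the preceding section, RMSE can be interpreted as the standard deviation of the prediction errors. If the errors are roughly normally distributed around zero, an RMSE of 4.36 indicates that about 68.2% of predictions will fall within 4.36 of the actual value of the target variable.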
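To make this interpretation concrete, the following sketch computes the RMSE by hand and then measures what fraction of test predictions actually land within one RMSE of the true value. It does not use the dataset from this chapter: the data is synthetic (a linear signal plus Gaussian noise), and the coefficients, noise level, and split ratio are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: three features, linear signal, Gaussian noise.
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(2000, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(0, 4.0, size=2000)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Same steps as in the text: fit on the training data, predict on the test data.
regressor = LinearRegression()
regressor.fit(X_train, y_train)
y_pred = regressor.predict(X_test)

# RMSE by hand: square root of the mean squared residual.
residuals = y_test - y_pred
rmse = float(np.sqrt(np.mean(residuals ** 2)))

# The same quantity via scikit-learn, for comparison.
rmse_sklearn = float(np.sqrt(mean_squared_error(y_test, y_pred)))

# Share of test predictions whose absolute error is at most one RMSE.
within_one_rmse = float(np.mean(np.abs(residuals) <= rmse))

print(f"RMSE (manual):  {rmse:.2f}")
print(f"RMSE (sklearn): {rmse_sklearn:.2f}")
print(f"Fraction within one RMSE: {within_one_rmse:.3f}")
```

Because the noise here is Gaussian by construction, the printed fraction should come out close to 68%. On real data with skewed or heavy-tailed errors, the fraction can differ noticeably, so it is worth checking rather than assuming.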