To measure the performance of a regression model, a good measure is the distance between the predicted outputs and the actual outputs.
Rattle offers us a good way to compare predicted values with actual values: the Predicted versus Observed plot. To try this plot, you first need to create a regression model. You can download a sample dataset from the UCI Machine Learning Repository (http://archive.ics.uci.edu/ml; Irvine, CA: University of California, School of Information and Computer Science) or from Kaggle (http://www.kaggle.com/). On some websites, such as the UCI Machine Learning Repository, the datasets are classified by the task you want to perform with them.
Imagine we have to create a model to predict the price of a house. Click on the Evaluate tab:
Rattle's Evaluate tab offers us two good options for a regression model as shown in the preceding screenshot:
After creating the model, go to the Evaluate tab, select your model, the Validation dataset, and the Pr v Ob option, then press Execute. Rattle will build a Predicted vs. Observed plot for you, as shown here:
This plot shows a set of points; each point represents an observation, with the predicted value on the y axis and the actual value on the x axis. We can also see a dotted line; this line represents a perfect prediction, where the predicted values equal the actual values. The other line is a linear fit to the points.
Finally, Pseudo R-square is an approximation of R-square, which measures the proportion of variance explained by the model. R-square ranges from 0 to 1; a value close to 1 means that the model has strong predictive power, while a value close to 0 means that the model doesn't provide a good prediction. In the same way, a Pseudo R-square close to 1 is good, and a value close to 0 means low performance.
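One common way to define a pseudo R-square is as the squared correlation between the predicted and observed values; whether Rattle uses exactly this formula is an assumption here, and the data values are again hypothetical:

```python
import numpy as np

def pseudo_r_square(observed, predicted):
    """Squared correlation between observed and predicted values.

    A sketch of one common pseudo R-square definition; it equals 1
    when predictions correlate perfectly with the actual values.
    """
    r = np.corrcoef(observed, predicted)[0, 1]
    return r ** 2

# Hypothetical observed house prices and model predictions
observed = np.array([120, 150, 180, 200, 240, 300], dtype=float)
predicted = np.array([130, 145, 190, 195, 250, 280], dtype=float)

print(round(pseudo_r_square(observed, predicted), 3))
```

For these sample values the result is close to 1, which is consistent with the points lying near the perfect-prediction line.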