If your dataset doesn't have many variables, it is a good idea to view all the pairwise scatter plots for your data. You can do this with one function call from either seaborn or pandas. Both functions display a matrix of plots, with kernel density estimation plots or histograms on the diagonal.
import pandas as pd
from dautil import data
from dautil import ts
import matplotlib.pyplot as plt
import seaborn as sns
import matplotlib as mpl
df = data.Weather.load()
df = ts.groupby_yday(df).mean()
df.columns = [data.Weather.get_header(c) for c in df.columns]
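For readers who do not have dautil installed, the averaging step can be approximated in plain pandas. The following is a minimal sketch that assumes ts.groupby_yday() groups the rows by the day of the year of a DatetimeIndex; that is an assumption about dautil, not something confirmed here. The column names and values are synthetic placeholders, not the book's weather data.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the weather data: three years of daily values.
# Column names and values are made up for illustration.
rng = np.random.default_rng(0)
index = pd.date_range("2001-01-01", "2003-12-31", freq="D")
raw = pd.DataFrame(
    {
        "temp": rng.normal(10, 5, len(index)),
        "pressure": rng.normal(1015, 8, len(index)),
    },
    index=index,
)

# Assumed equivalent of ts.groupby_yday(df).mean():
# average every column over all rows that share a day of the year.
by_yday = raw.groupby(raw.index.dayofyear).mean()

print(by_yday.shape)
print(by_yday.head())
```

The result has one row per day of the year, which keeps the scatter plots readable because each point is a multi-year average rather than a single daily measurement.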
Plot with the seaborn pairplot() function, which plots histograms on the diagonal by default:
%matplotlib inline
# Seaborn plotting, issues due to NaNs
sns.pairplot(df.fillna(0))
The following plots are the result:
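Replacing NaNs with zeros keeps every row, but each replaced value then appears in the plots as a real observation at 0. Whether that is acceptable depends on the data. The sketch below uses synthetic data with hypothetical column names to compare fillna(0) with dropna(), which discards incomplete rows instead.

```python
import numpy as np
import pandas as pd

# Synthetic data (hypothetical column names): y depends linearly on x.
rng = np.random.default_rng(42)
x = rng.normal(20, 2, 500)
y = 3 * x + rng.normal(0, 1, 500)
frame = pd.DataFrame({"x": x, "y": y})

# Knock out 10% of the y values to simulate missing measurements.
missing = frame.sample(frac=0.1, random_state=0).index
frame.loc[missing, "y"] = np.nan

filled = frame.fillna(0)   # as in the recipe: each NaN becomes the value 0
dropped = frame.dropna()   # alternative: discard incomplete rows

print("rows:", len(filled), "vs", len(dropped))
print("mean of y:", filled["y"].mean(), "vs", dropped["y"].mean())
```

If the zeros would form a misleading cluster in the scatter plots, passing the dropna() result to the plotting function is one alternative, at the cost of losing the incomplete rows.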
Plot with the pandas scatter_matrix() function and request kernel density estimation plots on the diagonal:
sns.set({'figure.figsize': '16, 12'})
mpl.rcParams['axes.linewidth'] = 9
mpl.rcParams['lines.linewidth'] = 2
# scatter_matrix() has lived in pandas.plotting since pandas 0.20
plots = pd.plotting.scatter_matrix(df, marker='o', diagonal='kde')
plt.show()
Refer to the following plots for the end result:
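The pandas function returns the grid of subplots as a NumPy array of matplotlib Axes, so individual panels can be adjusted after plotting. The following is a small sketch on synthetic data with made-up column names. It uses diagonal="hist" only so that it runs without SciPy, which the kernel density estimation option requires.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic data with made-up column names.
rng = np.random.default_rng(1)
demo = pd.DataFrame(rng.normal(size=(200, 3)), columns=["a", "b", "c"])

# diagonal="hist" avoids the SciPy dependency that diagonal="kde" needs.
axes = pd.plotting.scatter_matrix(demo, marker="o", diagonal="hist")

# One Axes per pair of columns: an n-by-n array for n columns.
print(axes.shape)

# Example tweak: rotate the x tick labels in the bottom row.
for ax in axes[-1, :]:
    ax.tick_params(axis="x", labelrotation=45)

fig = axes[0, 0].get_figure()  # all panels share a single Figure
```

Indexing the array as axes[i, j] gives the panel in row i and column j, which is handy for changing labels or limits on one panel without touching the rest.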
The complete code is available in the scatter_matrix.ipynb file in this book's code bundle.