Rebinning data

Often, the data we have is not structured the way we want to use it. One structuring technique is (statistical) data binning, also called bucketing. This strategy replaces the values within an interval (a bin) with one representative value. In the process we may lose information; however, the data becomes easier to control and more efficient to work with.
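
As a minimal sketch of the idea (not part of the recipe itself), the pandas function pd.cut replaces each value with a label for the interval it falls into. The sample values, bin edges, and labels below are made up for illustration:

```python
import pandas as pd

# Hypothetical sample values; not taken from the weather dataset.
values = pd.Series([0.5, 2.3, 4.1, 7.8, 9.9])

# Bin edges and labels are illustrative assumptions.
edges = [0, 3, 6, 10]
labels = ['low', 'medium', 'high']

# Each value is replaced by the label of the interval (left, right]
# that contains it, so five distinct numbers collapse to three labels.
binned = pd.cut(values, bins=edges, labels=labels)
print(binned.tolist())
```

The five original numbers can no longer be recovered from the labels, which is the information loss mentioned above.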

In the weather dataset, we have wind direction in degrees and wind speed in m/s, both of which can be represented in a different way. In this recipe, I chose to present wind direction as cardinal directions (north, south, and so on). For the wind speed, I used the Beaufort scale (see https://en.wikipedia.org/wiki/Beaufort_scale).
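
To make the two mappings concrete, here is a rough sketch using only NumPy. It is not the dautil implementation used in the recipe: the function names to_cardinal and to_beaufort are hypothetical, the choice of eight compass points is an assumption, and the Beaufort thresholds are the commonly cited lower bounds in m/s, which you should check against the table linked above.

```python
import numpy as np

# Eight compass points, clockwise from north; each covers a 45-degree sector.
# (Assumption: the recipe's helper may use a different number of sectors.)
COMPASS = np.array(['N', 'NE', 'E', 'SE', 'S', 'SW', 'W', 'NW'])

# Commonly cited lower bounds (m/s) of Beaufort numbers 1 to 12.
# (Assumption: verify against the Beaufort scale table before relying on them.)
BEAUFORT_LOWER = [0.3, 1.6, 3.4, 5.5, 8.0, 10.8,
                  13.9, 17.2, 20.8, 24.5, 28.5, 32.7]


def to_cardinal(degrees):
    """Map wind directions in degrees to one of eight compass points."""
    degrees = np.asarray(degrees, dtype=float)
    # Shift by half a sector so that north covers 337.5 to 22.5 degrees,
    # then wrap around at 360 and divide into 45-degree sectors.
    idx = np.floor(((degrees + 22.5) % 360) / 45).astype(int)
    return COMPASS[idx]


def to_beaufort(speed):
    """Map wind speeds in m/s to Beaufort numbers 0 to 12."""
    # np.digitize counts how many lower bounds are <= each speed.
    return np.digitize(np.asarray(speed, dtype=float), BEAUFORT_LOWER)


print(to_cardinal([0, 45, 225, 350]))
print(to_beaufort([0.1, 5.0, 12.0, 40.0]))
```

Both functions turn a continuous measurement into a small set of ordered or named categories, which is what makes the count plots later in the recipe possible.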

How to do it...

Follow these instructions to rebin the data:

  1. The imports are as follows:
    import dautil as dl
    import seaborn as sns
    import matplotlib.pyplot as plt
    import pandas as pd
    import numpy as np
    from IPython.display import HTML
  2. Load and rebin the data as follows (wind direction is in degrees, from 0 to 360; we rebin it to cardinal directions such as north, southwest, and so on):
    df = dl.data.Weather.load()[['WIND_SPEED', 'WIND_DIR']].dropna()
    categorized = df.copy()
    categorized['WIND_DIR'] = dl.data.Weather.categorize_wind_dir(df)
    categorized['WIND_SPEED'] = dl.data.Weather.beaufort_scale(df)
  3. Show distributions and countplots with the following code:
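    # context is not defined in this excerpt; it is assumed to be the
    # notebook context object created earlier in rebinning_data.ipynb.
    sp = dl.plotting.Subplotter(2, 2, context)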
    sns.distplot(df['WIND_SPEED'], ax=sp.ax)
    sp.label(xlabel_params=dl.data.Weather.get_header('WIND_SPEED'))
    
    sns.distplot(df['WIND_DIR'], ax=sp.next_ax())
    sp.label(xlabel_params=dl.data.Weather.get_header('WIND_DIR'))
    
    sns.countplot(x='WIND_SPEED', data=categorized, ax=sp.next_ax())
    sp.label()
    
    sns.countplot(x='WIND_DIR', data=categorized, ax=sp.next_ax())
    sp.label()
    plt.tight_layout()
    HTML(dl.report.HTMLBuilder().watermark())

Refer to the following screenshot for the end result (the code is in the rebinning_data.ipynb file in this book's code bundle):
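
The count plots display how many observations fall into each category. As a small sketch with made-up data (the column name mirrors the recipe, but the values are hypothetical), the same counts can be computed with the pandas value_counts method:

```python
import pandas as pd

# Hypothetical categorized wind directions; not the real weather data.
categorized = pd.DataFrame(
    {'WIND_DIR': ['N', 'SW', 'SW', 'E', 'N', 'SW']})

# One row per category with the number of observations in it;
# these are the bar heights a count plot would draw.
counts = categorized['WIND_DIR'].value_counts()
print(counts)
```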
