Transforming data with logarithms

When data varies by orders of magnitude, transforming it with logarithms is an obvious strategy. In my experience, the opposite transformation, using an exponential function, is less common. When exploring, we usually visualize paired variables in a log-log or semi-log scatter plot.
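As a quick illustration of why the transform helps, the following minimal sketch uses a handful of made-up values (not World Bank data) and only the standard library. It shows how the base-10 logarithm compresses a range spanning several orders of magnitude, and that raising 10 to the transformed values undoes the transformation:

```python
import math

# Made-up values spanning several orders of magnitude (illustration only)
values = [3, 40, 500, 6000, 70000]

# Base-10 logarithm compresses the range
logged = [math.log10(v) for v in values]

# The opposite (exponential) transformation recovers the original values
restored = [10 ** v for v in logged]

range_before = max(values) - min(values)
range_after = max(logged) - min(logged)

print('Range before: {}, after: {:.2f}'.format(range_before, range_after))
```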

To demonstrate this transformation, we will use World Bank data on the infant mortality rate per 1,000 live births and Gross Domestic Product (GDP) per capita for the available countries. If we apply the base-10 logarithm to both variables, the slope of the line fitted to the transformed data has a useful property: a one percent increase in GDP per capita corresponds to a percentage change in infant mortality that is approximately equal to the slope.
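The slope property can be checked on synthetic data. The sketch below is not part of the recipe: the power law, its constants, and the helper name `loglog_slope` are assumptions chosen for illustration. It fits a least-squares line to the log-transformed values and compares the slope with the effect of a one percent increase in x:

```python
import math


def loglog_slope(xs, ys):
    """Least-squares slope of log10(y) against log10(x)."""
    lx = [math.log10(v) for v in xs]
    ly = [math.log10(v) for v in ys]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    sxx = sum((a - mean_x) ** 2 for a in lx)
    return sxy / sxx


# Hypothetical power law y = 5000 * x ** -0.7 (made-up numbers)
xs = [500, 1000, 5000, 20000, 80000]
ys = [5000 * x ** -0.7 for x in xs]

slope = loglog_slope(xs, ys)

# Percentage change in y when x increases by one percent
pct_change = (1.01 ** slope - 1) * 100

print('slope={:.3f}, percent change={:.3f}'.format(slope, pct_change))
```

For an exact power law the fitted slope equals the exponent, and the percentage change is close to, but not exactly, the slope. The approximation holds for small relative changes.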

How to do it...

Transform the data using logarithms with the following procedure:

  1. The imports are as follows:
    import dautil as dl
    import matplotlib.pyplot as plt
    import numpy as np
    from IPython.display import HTML
  2. Download the data for 2010 with the following code:
    wb = dl.data.Worldbank()
    countries = wb.get_countries()[['name', 'iso2c']]
    inf_mort = wb.get_name('inf_mort')
    gdp_pcap = wb.get_name('gdp_pcap')
    df = wb.download(country=countries['iso2c'],
                     indicator=[inf_mort, gdp_pcap],
                     start=2010, end=2010).dropna()
  3. Apply the log transform with the following snippet:
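    # Element-wise base-10 logarithm; equivalent to df.applymap(np.log10),
    # which is deprecated in recent pandas versions
    loglog = np.log10(df)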
    x = loglog[gdp_pcap]
    y = loglog[inf_mort]
  4. Plot the data before and after the transformation:
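    # NOTE: context is not defined in this excerpt; it is assumed to be
    # created earlier in the notebook (see transforming_down.ipynb)
    sp = dl.plotting.Subplotter(2, 1, context)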
    xvar = 'GDP per capita'
    sp.label(xlabel_params=xvar)
    sp.ax.set_ylim([0, 200])
    sp.ax.scatter(df[gdp_pcap], df[inf_mort])
    
    sp.next_ax()
    sp.ax.scatter(x, y, label='Transformed')
    dl.plotting.plot_polyfit(sp.ax, x, y)
    sp.label(xlabel_params=xvar)
    plt.tight_layout()
    HTML(dl.report.HTMLBuilder().watermark())

Refer to the following screenshot for the end result (refer to the transforming_down.ipynb file in this book's code bundle):
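One caveat for step 3: the logarithm is only defined for strictly positive values, so zeros or negative values must be handled before transforming. The recipe itself only drops missing values. The following minimal sketch is an assumption-laden illustration, not part of the recipe: the helper name `log10_positive_rows`, the column names, and the sample rows are all hypothetical. It keeps only rows where every selected value is present and positive:

```python
import math


def log10_positive_rows(rows, columns):
    """Return log10-transformed copies of the rows in which every
    value in `columns` is present and strictly positive."""
    result = []
    for row in rows:
        if all(row[c] is not None and row[c] > 0 for c in columns):
            result.append({c: math.log10(row[c]) for c in columns})
    return result


# Made-up rows; the second and third cannot be log-transformed
rows = [
    {'gdp': 1000.0, 'mort': 50.0},
    {'gdp': 0.0, 'mort': 10.0},
    {'gdp': 40000.0, 'mort': None},
    {'gdp': 100000.0, 'mort': 1.0},
]

clean = log10_positive_rows(rows, ['gdp', 'mort'])
print(len(clean))
```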
