Resampling

A separate form of aggregation is time-based resampling. You can think of it as grouping by time period, except that statistics are also filled in for periods that have no data.
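To make the difference from plain grouping concrete, here is a minimal sketch on a toy series. The values and dates are made up for illustration and are not part of the battles dataset. Grouping by month returns only the months that contain data, while resampling returns every month in the range:

```python
import pandas as pd

# Toy data (made up for illustration): three events, none of them in February
events = pd.Series(
    [10, 20, 5],
    index=pd.to_datetime(['2020-01-05', '2020-01-20', '2020-03-02']),
)

# Plain grouping by calendar month: only months that have data appear
grouped = events.groupby(events.index.to_period('M')).sum()

# Resampling by month ('MS' labels each bin by the month's first day):
# every month in the range appears, including the empty February
resampled = events.resample('MS').sum()

print(grouped)
print(resampled)
```

In this sketch, the empty month shows up in the resampled result with a sum of zero, whereas it is simply absent from the grouped one.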

For example, let's count casualties for each year of the war, treating the end of each battle as its point in time. For that, we first have to set the datetime column as the index. For the sake of simplicity, let's create a copy of the dataframe to work on:

ts = data[['axis killed', 'allies killed', 'end']].copy()
ts = ts.set_index('end').sort_index()

Now, all we need to do is define the frequency and aggregation method, and we're good to go:

>>> timeline = ts.resample('1Y').agg('sum')
>>> timeline
            axis killed  allies killed
end
1939-12-31      23727.0       166092.0
1940-12-31      36682.0         2741.0
1941-12-31     226230.0      1644334.0
1942-12-31     346949.0      2300836.0
1943-12-31    1110704.0      1456498.0
1944-12-31     640690.0       770208.0
1945-12-31     684689.0       622996.0
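If you want to experiment with other frequencies without the battles dataset at hand, the same pipeline can be sketched on a small stand-in frame. The frame `toy` and all of its values are hypothetical; only the column names mirror the ones used above. The sketch uses the month-start alias 'MS' rather than '1Y', partly because some newer pandas releases warn about the 'Y' alias, so check which aliases your version expects:

```python
import pandas as pd

# Hypothetical stand-in for the battles data; all values are made up
toy = pd.DataFrame({
    'axis killed': [100.0, 250.0, 50.0],
    'allies killed': [80.0, 300.0, 20.0],
    'end': pd.to_datetime(['1940-01-15', '1940-01-28', '1940-04-03']),
})

# Same preparation as above: the datetime column becomes a sorted index
toy_ts = toy.set_index('end').sort_index()

# Monthly bins instead of yearly ones; months without battles are kept
monthly = toy_ts.resample('MS').agg('sum')
print(monthly)
```

The result has one row per month from January to April 1940: the two January battles are summed together, and the months without battles appear with zeros.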

Moreover, for any dataframe or series with a datetime index, pandas plots line charts automatically, one line per column. Running timeline.plot() produces the following diagram, from which we can estimate the number of casualties on both sides for every year:

Starting with version 0.25.0, pandas offers a unified specification for visualizations. This means that any visualization library that supports certain methods can be registered and used instead of matplotlib. This is a brand-new feature and, as far as we know, there are no alternatives to matplotlib just yet. That being said, later in this chapter, we'll work with altair, another library for visualization. It wouldn't surprise us if it soon becomes possible to register altair as a renderer and get altair charts instead of matplotlib ones, while keeping the same plotting interface we used in the preceding examples!
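As a rough sketch of how this mechanism is exposed, pandas keeps the active renderer in the `plotting.backend` option. The snippet below only inspects and temporarily sets that option; it assumes pandas 0.25.0 or later and does not draw anything. Switching to a different library would mean passing the name of an installed backend module instead of 'matplotlib', and no such module is assumed here:

```python
import pandas as pd

# The option holds the name of the module that pandas delegates plotting to
default_backend = pd.get_option('plotting.backend')
print(default_backend)

# The option can be changed globally with pd.set_option, or temporarily
# with a context manager; here we simply re-select the default backend
with pd.option_context('plotting.backend', 'matplotlib'):
    active_backend = pd.get_option('plotting.backend')
    print(active_backend)
```

With an alternative backend selected, a call such as timeline.plot() would keep the same signature and the chosen library would do the rendering.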

The resulting time series is remarkable! It also reminds us that, aside from time, we have location coordinates for most of the battles. Can we use those to create maps? You bet!
