Visualization

While we have all the stats for the two types of islands, it is hard to make sense of the raw numbers. To make them easier to read, let's visualize our data as a set of timelines. To do so, we'll use arguably the most popular (and one of the oldest) libraries for data visualization in Python, Matplotlib. We'll use Matplotlib extensively in the second part of this book, through more elegant interfaces, but, for now, let's keep it easy.

Here are the steps we need to take:

  1. First, we'll import the library and prepare the Notebook to show plots within the Notebook itself rather than in a separate window (inline, as the library calls it).
  2. Next, we set up the 538 style for the visualization. This step is optional and the pick is arbitrary. Here is what it will look like in code:
# 1. sets jupyter to show plots within the notebook
%matplotlib inline

# 2. import matplotlib's most popular interface
from matplotlib import pyplot as plt

# 3. (optional) style plots using "538" website style
plt.style.use('fivethirtyeight')
In the preceding code, we set the plotting style to 538. This step is completely optional: it merely changes the visual style (background and shape colors) from the default. There are plenty of built-in styles (https://matplotlib.org/gallery/style_sheets/style_sheets_reference.html), and we can always define our own.
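To see which styles are on offer, we can ask Matplotlib directly. The following is a minimal sketch, not part of the original notebook: it lists the registered style names and applies a style to a single figure through a context manager instead of changing the global settings. The Agg backend line is an assumption that only matters when running outside a notebook, and the plotted numbers are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; drop this line inside a notebook
from matplotlib import pyplot as plt

# Names of all style sheets registered in this Matplotlib installation
available_styles = sorted(plt.style.available)
print(available_styles)

# Apply a style to one figure only, leaving the global settings untouched
with plt.style.context("fivethirtyeight"):
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2], [0, 1, 4])  # arbitrary demo data
```

Using plt.style.context is handy when you want to compare a few styles side by side without resetting anything afterward.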

Now, Matplotlib is ready to visualize the data. 

For the next step, we need to create the chart objects to draw on. We have four metrics to show: population, average age, average skill, and percentage of animals with a good skillset. We also have two sources, one for each kind of island. The way Matplotlib works, we'll need to iterate over the stats for each island and plot a line (remember, we're drawing timelines) by passing a pair of arrays: one for the years (x axis) and one for the given metric (y axis). To make the code cleaner, let's put the two lists of stats in a dictionary and make another dictionary for the corresponding colors. This is how to do that:

datas = {
    'Heaven Islands': stats,
    'Harsh Islands': h_stats
}

colors = {
    'Heaven Islands': 'blue',
    'Harsh Islands': 'red'
}
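Judging by how the stats are read in the plotting code, each list holds one dictionary per island, keyed by year, and each year maps to a dictionary of metrics. The sketch below illustrates that layout with a single made-up island and shows how a pair of arrays is pulled out of it and drawn as one line. The numbers are invented, and the idea that years without animals carry only the 'pop' key is an assumption based on the use of .get() for the other metrics.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; drop this line inside a notebook
from matplotlib import pyplot as plt

# Hypothetical history of one island: year -> metrics for that year
island = {
    0: {"pop": 10, "mean_age": 2.0},
    1: {"pop": 4, "mean_age": 2.5},
    2: {"pop": 0},  # assumed: no animals left, so no average age recorded
}

years = list(island.keys())                                # x axis
population = [st["pop"] for st in island.values()]         # y axis, always present
mean_age = [st.get("mean_age") for st in island.values()]  # None where the metric is missing

fig, ax = plt.subplots()
ax.plot(years, population, c="red")
ax.plot(years, mean_age, c="blue")  # the None entry leaves a gap in the line
```

Returning None for a missing metric keeps every y array the same length as the list of years, which is what plot expects.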

Finally, we can plot the visualization itself. The following needs to be in the same cell; we'll split the code in order to explain it, then show it as a whole, one more time:

  1. On the first line, we are creating one chart with eight subplots: four rows (one per metric) and two columns (one per island type). The size of the chart is defined by the figsize argument. The sharex parameter makes all subplots share the same x axis, so its tick labels are shown only once, on the bottom row. The axes variable is now a 4-by-2 NumPy array: four rows with two axes (subplot) objects in each. Having them, we can start adding marks and properties:
fig, axes = plt.subplots(4, 2, figsize=(10,10), sharex=True)
  2. Next, we'll set a y-axis title for each row (on the left-hand chart only) and limit the x axis of every chart to 15 years:
for i, title in enumerate(('Population', 'Average age', 'Average Survival Skill', '% with SSK > 75')):
    axes[i][0].set_ylabel(title)

    axes[i][0].set_xlim(0, 15)
    axes[i][1].set_xlim(0, 15)
  3. Now, we will loop over the two types of islands and over the statistics of every island. For each island, we will pull a pair of arrays and send them to plot as a polyline. We'll also add titles to the plots in the first row. Using the colors dictionary, we'll pass a corresponding color to each line. The very low alpha value makes every single line almost transparent, so color builds up only where many islands overlap:
for i, (k, v) in enumerate(datas.items()):
    axes[0][i].set_title(k, fontsize=14)

    for s in v:  # for each island
        years = list(s.keys())

        axes[0][i].plot(years, [st['pop'] for st in s.values()],
                        c=colors[k], alpha=.007)
        axes[1][i].plot(years, [st.get('mean_age', None) for st in s.values()],
                        c=colors[k], alpha=.007)
        axes[2][i].plot(years, [st.get('mean_skill', None) for st in s.values()],
                        c=colors[k], alpha=.007)
        axes[3][i].plot(years, [st.get('75_skill', None) for st in s.values()],
                        c=colors[k], alpha=.007)
  4. The following is the same code, pulled together:
fig, axes = plt.subplots(4, 2, figsize=(10,10), sharex=True)

for i, title in enumerate(('Population', 'Average age', 'Average Survival Skill', '% with SSK > 75')):
    axes[i][0].set_ylabel(title)

    axes[i][0].set_xlim(0, 15)
    axes[i][1].set_xlim(0, 15)

for i, (k, v) in enumerate(datas.items()):
    axes[0][i].set_title(k, fontsize=14)

    for s in v:  # for each island
        years = list(s.keys())

        axes[0][i].plot(years, [st['pop'] for st in s.values()],
                        c=colors[k], label=k, alpha=.007)
        axes[1][i].plot(years, [st.get('mean_age', None) for st in s.values()],
                        c=colors[k], label=k, alpha=.007)
        axes[2][i].plot(years, [st.get('mean_skill', None) for st in s.values()],
                        c=colors[k], label=k, alpha=.007)
        axes[3][i].plot(years, [st.get('75_skill', None) for st in s.values()],
                        c=colors[k], label=k, alpha=.007)
  5. Now, run the cell. If everything is fine, it will take a few seconds to execute. After that, you'll get your visualization. Here is what it looks like for us:
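A side note on the cell above: the four nearly identical plot calls can also be written as a loop over the metric keys. The following is a sketch of that variation, not the code used to produce the book's chart. The tiny datas dictionary is a made-up stand-in for the real simulation output, the metrics list is a helper introduced here for illustration, and alpha is raised to 0.5 because only one island per type is drawn.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; drop this line inside a notebook
from matplotlib import pyplot as plt

# Hypothetical stand-in data: one island per type, two years each
datas = {
    "Heaven Islands": [{
        0: {"pop": 10, "mean_age": 2.0, "mean_skill": 50.0, "75_skill": 0.2},
        1: {"pop": 20, "mean_age": 2.4, "mean_skill": 51.0, "75_skill": 0.2},
    }],
    "Harsh Islands": [{
        0: {"pop": 10, "mean_age": 2.0, "mean_skill": 50.0, "75_skill": 0.2},
        1: {"pop": 3, "mean_age": 2.8, "mean_skill": 80.0, "75_skill": 0.9},
    }],
}
colors = {"Heaven Islands": "blue", "Harsh Islands": "red"}

# (y-axis title, key in the yearly stats), one entry per row of the grid
metrics = [
    ("Population", "pop"),
    ("Average age", "mean_age"),
    ("Average Survival Skill", "mean_skill"),
    ("% with SSK > 75", "75_skill"),
]

fig, axes = plt.subplots(len(metrics), len(datas), figsize=(10, 10), sharex=True)
print(axes.shape)  # (4, 2): rows are metrics, columns are island types

for col, (name, islands) in enumerate(datas.items()):
    axes[0, col].set_title(name, fontsize=14)
    for row, (title, key) in enumerate(metrics):
        if col == 0:
            axes[row, col].set_ylabel(title)  # label only the left-hand column
        for island in islands:
            years = list(island.keys())
            values = [st.get(key) for st in island.values()]
            axes[row, col].plot(years, values, c=colors[name], alpha=0.5)
```

Keeping the titles and keys in one list means that adding a fifth metric takes one new entry rather than another copied plot call.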

What can we get from this chart? Let's discuss every metric. 

Please refer to the graphics bundle of the book to view all the images in color.

As you can see on the first chart, Heaven Islands have no constraints other than the maximum population cap, so the animal population quickly grows to the maximum. For the Harsh Islands population, this is not the case. In fact, many islands have no animals at all (see the thick red line at zero, starting in year 4).
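The share of empty islands can also be computed directly instead of being read off the chart. The following sketch assumes the same layout as in the plotting code (one dictionary per island, keyed by year, with a 'pop' entry) and, in addition, that every island has an entry for every year, even after it dies out. The three islands below are invented, and extinct_share is a hypothetical helper, not a function from the simulation.

```python
# Hypothetical stand-in for h_stats: three islands, years 0 to 2
h_stats = [
    {0: {"pop": 10}, 1: {"pop": 3}, 2: {"pop": 0}},
    {0: {"pop": 10}, 1: {"pop": 0}, 2: {"pop": 0}},
    {0: {"pop": 10}, 1: {"pop": 6}, 2: {"pop": 9}},
]

def extinct_share(islands, year):
    """Return the fraction of islands with zero population in the given year."""
    extinct = sum(1 for island in islands if island[year]["pop"] == 0)
    return extinct / len(islands)

# Share of extinct islands for every year recorded on the first island
shares = {year: extinct_share(h_stats, year) for year in h_stats[0]}
print(shares)
```

Run against the real h_stats, the same dictionary would show in which year the islands start to empty out.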

Next, the average age seems to behave similarly in both scenarios: it seems to bump into the maximum age and then stay at half of the maximum age, which makes sense. Perhaps the trend would be different if the initial skill window were narrower.

Both charts for average skill and percentage of animals with skill above a certain threshold tell us the same story. On the Heaven Islands there is no trend, but deviation starts to accumulate: on some islands, the share of skilled animals even falls below 20 percent, since skill has no impact on anything there. For Harsh Islands, the picture is drastically different: it seems that in most cases, unskilled animals were killed in the first year (this can be confirmed by the decline in the population for the first two years). All those that survived started to breed, so the skill skyrocketed from the get-go. For most of the Harsh Islands, 100% of the population had survival skills beyond 75 after one to two years. In other words, our natural selection did indeed work.

Of course, the model we created is very simplistic. It is also driven by a number of arbitrary values and decisions—fertility rate, maximum age, harsh weather conditions, maximum population, mutation_drift level, initial conditions, as well as uniform integer distribution of weather conditions and order of computations on each stage. It is no surprise that the system worked as we expected. There is, however, room for deeper research and experimentation. For example, we could discuss the pace of improvement, or the probability of island extinction with different initial values and assumptions. Alternatively, we could create another type of animal (carnivores), whose survival depends on killing herbivores (say, by having a larger survival skill), and research the dynamics of the two species. We encourage the reader to play with this code, adding custom rules and characteristics.

In the meantime, let's proceed to the next chapter, where we'll talk about some non-Python tools that are essential for a productive workflow.
