Histograms, Herstograms, Yourstograms, and Mystograms

Another simple layout is the histogram, which simplifies creating bar charts when the data forms a continuous series. We can also use it to bin a series of values, so that we don't have to do as many gymnastics with Array.prototype.reduce and Array.prototype.map.

In this instance, we will create an ordinal scale of episodes and seasons and use that to create a histogram. In doing so, we're going to use a new dataset, which I've included in the data/ directory, GoT-deaths-by-season.json. This includes all the deaths in the show in the following format:

    {
      "name": "Will",
      "role": "Ranger of the Night's Watch",
      "death": {
        "season": 1,
        "episode": 1
      },
      "execution": "Beheaded for desertion by Ned Stark",
      "likelihoodOfReturn": "0%"
    },

The only data we're really concerned with here is the death object, which we'll use to create an ordinal scale.

Start by resetting main.js by commenting out all the westerosChart lines, then add the following:

westerosChart.init('histogram', 'data/GoT-deaths-by-season.json');

Go back to chapter7/index.js and add the following:

westerosChart.histogram = function histogram(_data) {
  // Flatten each death object into one sortable number: season * 100 + episode
  const data = _data.data.map(d =>
    Object.assign(d, { death: (d.death.season * 100) + d.death.episode }))
    .sort((a, b) => a.death - b.death);
  const episodesPerSeason = 10;
  const totalSeasons = 6;
  // Every episode code from 101 to 610, whether or not anyone died in it
  const allEpisodes = d3.range(1, totalSeasons + 1).reduce((episodes, s) =>
    episodes.concat(d3.range(1, episodesPerSeason + 1).map(e => (s * 100) + e)), []);

This replaces the death object with a number in which the season is multiplied by 100 and added to the episode number (so season 1, episode 1 becomes 101), then sorts the data by that number, which orders it first by season, then by episode. It also creates an array of 60 elements in the same format, one for every episode.
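To make the encoding concrete, here is a minimal sketch in plain JavaScript. The helper names encodeEpisode and decodeEpisode are hypothetical (the chapter's code inlines the arithmetic), and the scheme assumes a season never has 100 or more episodes.

```javascript
// Hypothetical helpers; the chapter inlines this arithmetic instead.
const encodeEpisode = (season, episode) => (season * 100) + episode;

const decodeEpisode = code => ({
  season: Math.floor(code / 100),
  episode: code % 100,
});

// The same 60 codes as allEpisodes, built without d3.range.
const episodeCodes = [];
for (let season = 1; season <= 6; season += 1) {
  for (let episode = 1; episode <= 10; episode += 1) {
    episodeCodes.push(encodeEpisode(season, episode));
  }
}

console.log(encodeEpisode(1, 1)); // 101
console.log(decodeEpisode(610)); // { season: 6, episode: 10 }
console.log(episodeCodes.length); // 60
```

Because the season occupies the hundreds place, sorting the codes numerically is the same as sorting by season and then by episode.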

Creating this array is optional, but we want to show all the episodes (even those without any deaths), so we will use it to populate the domain of the x-scale. This is a bit different from how histograms are generally used, which is with a continuous series. Alas, we don't really have any of those in this dataset, so this will have to suffice.

Next, we instantiate our x-scale:

  const x = d3.scaleBand()
    .range([0, this.innerWidth])
    .domain(allEpisodes)
    .paddingOuter(0)
    .paddingInner(0.25);

We use an ordinal band scale here and set the domain to the allEpisodes array we have just created.
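If it helps to see what those padding settings mean geometrically, the following is a rough sketch of how a band scale with no outer padding might divide its range. The bandLayout function is a hypothetical stand-in rather than d3's own code, and the range of 375 pixels is an assumed value chosen to keep the numbers round.

```javascript
// A simplified stand-in for a band scale with no outer padding; not d3's code.
function bandLayout(domain, rangeStart, rangeEnd, paddingInner) {
  const n = domain.length;
  // n bands plus (n - 1) gaps, where each gap is paddingInner * step wide,
  // add up to step * (n - paddingInner).
  const step = (rangeEnd - rangeStart) / Math.max(1, n - paddingInner);
  const bandwidth = step * (1 - paddingInner);
  const position = new Map(domain.map((value, i) => [value, rangeStart + (step * i)]));
  return { step, bandwidth, position };
}

const layout = bandLayout([101, 102, 103, 104], 0, 375, 0.25);
console.log(layout.step); // 100
console.log(layout.bandwidth); // 75
console.log(layout.position.get(103)); // 200
```

With an inner padding of 0.25, a quarter of every step is left empty between neighbouring bars, and the last bar ends exactly at the end of the range.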

Next, we create our histogram layout generator and pass our data to it:

  const histogram = d3.histogram()
    .value(d => d.death)
    .thresholds(x.domain());
  const bins = histogram(data);

We set the value accessor to return our death value and use the histogram.thresholds() method to set the extents of each bin. The histogram.thresholds() method expects an array containing a series of values defining the edges of the bins: the first bin is situated between the first and second array elements, the second bin between the second and third elements, and so on.

This returns an array of bins. Each bin is itself an array containing all the data assigned to it, so its length property is the number of items in the bin. Each bin also has x0 and x1 properties, corresponding to the edges of the bin as explained above. The lower bound (x0) is inclusive, whereas the upper bound (x1) is exclusive (except for the last bin).
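The shape of a bin can be easier to grasp with a hand-rolled version. The sketch below follows the description above rather than d3's actual source, and binByEdges is a hypothetical name; it only aims to show what x0, x1, and length mean.

```javascript
// Illustrative only: follows the description above, not d3's source.
function binByEdges(values, edges) {
  const bins = [];
  for (let i = 0; i < edges.length - 1; i += 1) {
    const bin = []; // a plain array, so bin.length is the count
    bin.x0 = edges[i]; // inclusive lower edge
    bin.x1 = edges[i + 1]; // exclusive upper edge, except for the last bin
    bins.push(bin);
  }
  const lastIndex = bins.length - 1;
  values.forEach((value) => {
    const index = bins.findIndex((bin, i) =>
      value >= bin.x0 && (value < bin.x1 || (i === lastIndex && value === bin.x1)));
    if (index !== -1) bins[index].push(value);
  });
  return bins;
}

const sampleBins = binByEdges([101, 101, 102, 104], [101, 102, 103, 104]);
sampleBins.forEach(bin => console.log(bin.x0, bin.x1, bin.length));
// 101 102 2
// 102 103 1
// 103 104 1
```

Note how the value 104 lands in the last bin even though it equals that bin's x1, which is the one exception to the exclusive upper bound.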

Next, we create a y-scale:

  const y = d3.scaleLinear()
    .domain([0, d3.max(bins, d => d.length)])
    .range([this.innerHeight - 10, 0]);

This is pretty straightforward; we get the maximum length of an item in our bins by running d3.max() on them, and set the range to 10 pixels less than the innerHeight (our x axis will be double-decked to accommodate the number of labels, and each line is 10 pixels high).

Time to add the bars:

  const bar = this.container.selectAll('.bar')
    .data(bins)
    .enter()
    .append('rect')
    .attr('x', d => x(d.x0))
    .attr('y', d => y(d.length))
    .attr('fill', 'tomato')
    .attr('width', () => x.bandwidth())
    .attr('height', d => (this.innerHeight - 10) - y(d.length));

We make everything 10 pixels shorter to accommodate the double-decked axis labels; in every other respect, it's like any bar chart. We use bandwidth() to set the width. If we weren't using a band scale here, we could instead compute the width of each bar as the distance between its scaled edges, x(d.x1) - x(d.x0).
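The arithmetic behind the bar height, and behind the bar width when a continuous scale is used, can be sketched with a tiny linear mapping. Everything here is an assumption made for illustration: linearScale is a hypothetical stand-in for d3.scaleLinear(), and the chart dimensions and domains are made-up round numbers.

```javascript
// Hypothetical stand-in for d3.scaleLinear(): maps a domain onto a range.
function linearScale([d0, d1], [r0, r1]) {
  return value => r0 + (((value - d0) * (r1 - r0)) / (d1 - d0));
}

const chartHeight = 210; // assumed value of this.innerHeight
const chartWidth = 600; // assumed value of this.innerWidth
const baseline = chartHeight - 10; // leave 10 pixels for the axis labels

// Inverted range: a count of 0 maps to the baseline, the maximum to the top.
const yScale = linearScale([0, 20], [baseline, 0]);
const barTop = yScale(5);
const barHeight = baseline - yScale(5);

// With a continuous x-scale, a bin's width is the distance between its scaled edges.
const xScale = linearScale([0, 60], [0, chartWidth]);
const sampleBin = { x0: 12, x1: 13 };
const barWidth = xScale(sampleBin.x1) - xScale(sampleBin.x0);

console.log(barTop, barHeight, barWidth); // 150 50 10
```

Because SVG's y coordinates grow downwards, the bar starts at the scaled value and extends down to the baseline, which is why the height is the baseline minus y(d.length).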

Almost done! We're going to add the x axis next:

  const xAxis = this.container.append('g')
    .attr('class', 'axis x')
    .attr('transform', `translate(0, ${this.innerHeight - 10})`)
    .call(d3.axisBottom(x).tickFormat(
      d => `S${(d - (d % 100)) / 100}E${d % 100}`));
  xAxis.selectAll('text')
    .each(function (d, i) {
      const yVal = d3.select(this).attr('y');
      d3.select(this).attr('y', i % 2 ? yVal : (yVal * 2) + 2);
    });
  xAxis.selectAll('line')
    .each(function (d, i) {
      const y2 = d3.select(this).attr('y2');
      d3.select(this).attr('y2', i % 2 ? y2 : y2 * 2);
    });

This is like any other x axis, except we're going a bit crazy here and making it double-decked: every other tick label is moved down to slightly more than twice its y-value (note that it's inside a group, so this value is relative to that group), and its tick line is extended accordingly. We use selection.each() to run this operation on each element in the selection, which sets the this context to the current element. Because of this, we use regular functions instead of fat arrow functions; an arrow function keeps the this of its enclosing scope and would never see the element provided by selection.each().
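The difference between the two kinds of function is easy to check outside of D3. In this sketch, eachElement is a hypothetical miniature of selection.each() that operates on plain objects instead of DOM nodes, and the tick values are made up.

```javascript
// Hypothetical miniature of selection.each(): invokes the callback with
// `this` set to each element in turn.
function eachElement(elements, callback) {
  elements.forEach((element, i) => callback.call(element, element.datum, i));
}

const ticks = [{ datum: 101, y: 9 }, { datum: 102, y: 9 }];

// A regular function receives the element as `this`.
eachElement(ticks, function (d, i) {
  this.y = i % 2 ? this.y : (this.y * 2) + 2;
});
console.log(ticks.map(t => t.y)); // [ 20, 9 ]

// An arrow function ignores the `this` supplied by call() and keeps the
// `this` of the scope in which it was defined.
const outer = { name: 'outer' };
const seen = [];
(function () {
  eachElement(ticks, () => seen.push(this));
}).call(outer);
console.log(seen.every(value => value === outer)); // true
```

Only the even-indexed tick is pushed down, which is the same alternation that produces the double-decked axis.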

Lastly, we set up our trusty tooltip generator:

  bar.call(tooltip(d =>
    `${d.x0}: ${d.length} deaths`, this.container));

Click on save, and you should have a nifty bar chart showing the number of deaths in each episode.

Wow, season six goes in pretty hard, doesn't it?

Wait a minute, was this a proper use of a histogram?
Well, no, not really; a histogram is normally used to take a bunch of samples and break them into discrete chunks across a specified output range. It could be argued that this would have made more sense as a simple bar chart with an ordinal scale for the x-axis. However, in this particular case at least, using a histogram is perhaps a bit easier, as it does all the totaling for us, even if we did have to do a bit of contrived data wrangling to get the x-axis to make any sense. As a rule of thumb, if you're not entirely sure what your x-axis ticks should be, and you have a lot of data you want to display in a bar-chart-like manner, you should consider a histogram. If your data is already classified such that you could get away with using an ordinal scale for your x-axis, do that instead.
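For comparison, the totaling that the histogram did for us is not much work to do by hand when the categories are already known. The following sketch uses made-up sample records with placeholder names (not rows from the real dataset) and assumes the death values have already been flattened to episode codes.

```javascript
// Made-up sample records; the names are placeholders, not real data.
const sampleDeaths = [
  { name: 'Character A', death: 101 },
  { name: 'Character B', death: 101 },
  { name: 'Character C', death: 109 },
];

// The categories we want on the x-axis, including ones with no deaths.
const sampleEpisodes = [101, 102, 109];

// Start every episode at zero, then tally one per death.
const counts = sampleDeaths.reduce(
  (tally, d) => tally.set(d.death, (tally.get(d.death) || 0) + 1),
  new Map(sampleEpisodes.map(code => [code, 0])));

const rows = sampleEpisodes.map(code => ({ episode: code, deaths: counts.get(code) }));
console.log(rows);
// [ { episode: 101, deaths: 2 },
//   { episode: 102, deaths: 0 },
//   { episode: 109, deaths: 1 } ]
```

An array like rows could then be bound to rectangles in an ordinary bar chart, using the band scale for episode and the linear scale for deaths.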