Static dashboards

Despite the name, static dashboards are not static per se; they are not just still images. Here, static refers to the fact that the dashboard is served as a flat HTML file: all of the interaction happens in the client's browser. As a result, the dashboard can be uploaded anywhere on the web (say, to an S3 bucket or a similar service) and stay there almost for free, with little maintenance required. It is also easy to update the dashboard or its data, with essentially no downtime. And, of course, this approach means you won't need to think about the scalability or performance of the dashboard.

Obviously, this approach has its downsides as well. First of all, it is limited in the amount of data it can use, and the dataset will be directly available to everyone. If your dashboard requires complex queries or real-time aggregation, this approach will not work. It would also be hard to add authentication or to customize the dashboard for a specific user. In a nutshell, this type of dashboard is a perfect fit when the following apply:

  • It serves a wide audience.
  • It uses a relatively small dataset that is fine to share with everyone.
  • The data is updated only occasionally, and definitely not in real time (the computation may take a lot of resources).

One obstacle to going down that path for many backend developers and data scientists is JavaScript itself, which is virtually the only option for writing interactive web applications. At the same time, most data scientists and Python developers don't know JavaScript well enough to use it in production and, often, don't even want to write it. There are a few ways to dodge that, for example, compiling your code to WebAssembly (which browsers can also run), but that, at least for now, is a hard task in its own right and is huge overkill for our purposes.

Another, arguably better, alternative is to use one of the existing Python tools and packages that will generate both the HTML and the JavaScript code for us. Earlier, we mentioned the difference between visualizing in a notebook and on a dashboard; this kind of tool can generate charts for both cases.

In the previous chapter, we built a pipeline that collects data on 311 calls every day and then generates a report. Now, let's build a static dashboard for this data, using the altair library we used to plot interactive visualizations in the notebook. We will start in the same way, in the notebook; then we will store the dashboard as HTML; finally, we will redirect the visualization to use an external dataset, the one we're scheduled to update.

Let's start preparing our notebook and loading the dataset:

import pandas as pd
import altair as alt

# lift altair's default 5,000-row limit on embedded datasets
alt.data_transformers.disable_max_rows()

data = pd.read_csv('./data/top5.csv', parse_dates=['date']).fillna(0)
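All of the transforms that follow assume this dataset is in long format, with date, boro, metric, and value columns. As a quick illustration (the column names are taken from the charts below; the sample values are made up), a minimal sanity check might look like this:

```python
# sanity-check the long-format schema the charts below rely on;
# the sample values are synthetic, for illustration only
import pandas as pd

def check_schema(df: pd.DataFrame) -> None:
    expected = {'date', 'boro', 'metric', 'value'}
    missing = expected - set(df.columns)
    assert not missing, f"missing columns: {missing}"
    assert df['value'].notna().all(), "fillna(0) should have removed NaNs"

# a tiny synthetic frame in the same shape
sample = pd.DataFrame({
    'date': pd.to_datetime(['2019-06-01', '2019-06-01']),
    'boro': ['BROOKLYN', 'NYC'],
    'metric': ['complaints', 'complaints'],
    'value': [120.0, 450.0],
})
check_schema(sample)
```

Running a check like this before plotting makes schema drift in the daily pipeline fail loudly, instead of producing an empty chart.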

Now, what would we want to have on the dashboard? Usually, a primary goal is to highlight any temporal abnormalities: say, a day that was skipped in data collection, or a day when the number of complaints deviated significantly. One way to do that is to show a line chart of the total number of complaints, say, split by borough:

timeline = alt.Chart(data, width=800).mark_line().encode(
    x='date',
    y='value',
    color='boro'
).transform_filter(
    alt.datum.metric == 'complaints'
)

The code results in this diagram:

And already, we see some interesting stuff: missing values for June 7 and some peaks in January and February. This is a good example of the type of insight a quick graphical overview can give. We can also see different levels of complaints for different boroughs; Brooklyn has, for some reason, more than the others.

Now, it would be great to see what this is all about: which types of complaint are the most popular within a given interval of time. Let's first build a bar chart of the top complaint types for the entire period:

barchart = alt.Chart(data, width=800).mark_bar().encode(
    x='svalue:Q',
    y=alt.Y(
        'metric:N',
        sort=alt.EncodingSortField(
            field='svalue',      # the field to use for the sort
            order='descending'   # the order to sort in
        )
    ),
    color=alt.value('purple'),
    tooltip=['metric', 'svalue:Q']
).transform_filter(
    "datum.metric != 'complaints'"
).transform_filter(
    "datum.boro == 'NYC'"
).transform_aggregate(
    svalue='sum(value)',
    groupby=['metric']
).transform_window(
    rank='rank(svalue)',
    sort=[alt.SortField('svalue', order='descending')]
).transform_filter('datum.rank <= 10')

Here, we have to filter for NYC (to avoid counting each metric twice) and filter out the complaints metric, for the same reason. As we want to drop the long tail, we also have to generate a rank for each row and then filter by its value. The following is the result:
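To make the in-browser transform chain more tangible, here is a rough pandas equivalent of what Vega-Lite computes: filter, aggregate (sum per metric), rank, then keep the top rows. This is an illustration under the column names used above, not what the dashboard actually runs (the browser does the work):

```python
# pandas sketch of the Vega-Lite chain:
# filter -> transform_aggregate -> transform_window (rank) -> filter
import pandas as pd

def top_complaint_types(df: pd.DataFrame, n: int = 10) -> pd.DataFrame:
    # keep only NYC totals and drop the overall 'complaints' metric
    subset = df[(df['metric'] != 'complaints') & (df['boro'] == 'NYC')]
    # sum values per complaint type, like transform_aggregate
    agg = (subset.groupby('metric', as_index=False)['value']
                 .sum()
                 .rename(columns={'value': 'svalue'}))
    # rank descending and drop the long tail, like transform_window
    agg['rank'] = agg['svalue'].rank(ascending=False, method='first')
    return agg[agg['rank'] <= n].sort_values('svalue', ascending=False)
```

If a chart ever looks wrong, reproducing the transforms in pandas like this is a handy way to verify what the numbers should be.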

Finally, we want to combine the two charts: selecting a time period on the timeline and seeing the distribution of complaint types for that period. This is just a combination of the two, with a brush selection added:

brush = alt.selection_interval(encodings=['x'], empty='all')

T = timeline.add_selection(brush)

B = barchart.transform_filter(brush)

dash = alt.vconcat(T, B, data=data)

Here, the dash variable represents a combined chart that knows how to filter the bars based on the interval selected on the timeline. Feel free to play around and see how the top complaints change over time! Of course, there are plenty of features we could add (for example, showing different complaint types for a particular time of day), but such features and transformations quickly grow too complex for rapid design; that is the downside of using the Vega stack and computing everything in the browser, in general.

On the following diagram, you can see a screenshot of the resulting dashboard:

The gray area on the timeline represents the selected range. The bar chart then shows the overall number of complaints for the top 10 complaint types within the period. This interactivity allows us to dive deeper into the data, exploring more nuanced trends of a particular time period.

We could imagine linking our dashboard to an API as an alternative to serving flat files. This way, the dashboard would show the data up to the current moment; it is also possible to connect Altair/Vega to a data stream, so that the dashboard updates in real time.

Working with Altair is great, as it makes it easy to create a beautiful visualization with advanced interaction, except for the cases when it won't work. In the next section, let's talk about the ways to debug our plots and understand what is going wrong.
