Most business requirements can be fulfilled with the traditional charts that we reviewed in the previous chapters. Nevertheless, certain scenarios demand more complex visualizations for a complete understanding of the story hidden behind the data. In the next pages, we will embark on a journey that will take us to a land beyond "the classics" in the pursuit of insightful and stunning dashboards by discussing:
One of the main disadvantages of using aggregators such as avg()
is that they can hide interesting behaviors in the data. Picture the following scenario: you have access to the scores of five thousand students regarding four subjects—Science, Mathematics, English, and Literature. Each subject is graded with the help of two exams. In order to condense all these records and show only high-level figures, you decide to calculate an average. However, after analyzing the results, you start wondering: are there some extremely good students raising the global average, or are there alarmingly bad ones bringing it down? Are they all consistent? Can we separate them into groups (good, normal, or bad)? How did most of them fare?
One of the best ways to deal with this kind of questions is to create a histogram, a visualization focused on distributions instead of magnitudes. In this chart, the x-axis represents the exam grade, while the y-axis counts the number of students that scored it. As you can see, the data adopts a shape that gives us a better perspective of the situation:
Far to the right, we can find the top students (there aren't a lot of them, but their scores are pretty high). At the other end, we find those who might need a little help (grades below 400 points), and in the middle lies the majority of pupils.
We can also appreciate that the curve is skewed to the right as most of the students have scored between 425 and 585 points. Remember, a higher bar represents more occurrences. For example, the orange bar (the highest of all the histogram) represents the 296 boys that got grades between 500 and 505 points.
Now, let's take a look at how to create this chart in QlikView:
Science
. Create a new bar chart. However, this time, we don't have an existing dimension to rely on. Instead, we need to create dynamic clusters that group the students depending on their scores; so, click on the Add Calculated Dimension… button:=class( aggr(avg(Science), Student) , 5)
The final grade for each student is calculated by averaging the scores of two exams (midterm and final exams). Therefore, it is necessary to create a virtual table that uses Student
as the dimension:
aggr(XXX, Student)
This also must calculate the average of both the tests:
aggr(avg(Science), Student)
Based on these scores, we must create a set of clusters in order to group the students. In this example, we used the class()
function with a bin size of 5
. Therefore, if a student scored an average of 456 points in both exams, she would be located in the 455 <= x < 460 cluster.
Score
.Students
based on this formula:count(DISTINCT Student)
455 <= x < 460
, we will only display the bin's lower limit—455
. In order to do this, we need to use only the first three characters of the string. Therefore, edit the Calculated Dimension formula by adding a left()
function to it:=left(class( aggr(avg(Science), Student) , 5), 3)
5
) to a variable that the users can edit. Create a new variable in the script called BinSize
:LET BinSize = 5;
=left(class( aggr(avg(Science), Student) , BinSize), 3)
BinSize
variable.LET HidePrefix = '_'; Menu: LOAD * INLINE [ _Menu Science Math English Literature ];
HidePrefix
will be treated as System Fields. Therefore, they will not appear in the current selections box (a desirable behavior for a navigation field like this one)._Menu
field._Menu
field:=left(class( aggr(avg($(=_Menu)), Student) , BinSize), 3)
=_Menu
270
degrees so that the text appears vertically.