Appendix A. Introduction to statistics and analytic concepts 233
Individual data points are grouped together into ranges in order to visualize how
frequently data in each range occurs within the data set. High bars indicate more
data in a given range, and low bars indicate less data. In the histogram shown in
Figure A-3, the peak is in the 20-39 range, where there are five points.
Figure A-3 Histogram
The popularity of a histogram comes from its intuitive, easy-to-read picture of the
location and variation in a data set. There are, however, two weaknesses of
histograms that you should bear in mind:
• Histograms can be manipulated to show different pictures. If too few or too
many bars are used, the histogram can be misleading. Choosing the number of
bars requires some judgment, and perhaps some experimentation, based on the
analyst's experience.
• Histograms can also obscure the time differences among data sets. For
example, if you looked at a histogram of the number of births per day in the
United States in 1996, you would miss any variations over time, such as
seasonal peaks or any peaks around the times of full moon.
Likewise, in quality control, a histogram of a process run tells only one part of
a long story. You need to keep reviewing the histograms and control
charts for consecutive process runs over an extended time to gain useful
knowledge about a process.
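Both weaknesses can be illustrated with a minimal Python sketch. The helper name bin_counts and the sample values are hypothetical and are not the data behind Figure A-3. The sketch assumes non-negative integer measurements and ranges that start at zero.

```python
from collections import Counter


def bin_counts(values, width):
    """Count how many values fall into each range of the given width.

    Each key in the result is the lower bound of a range, so with width 20
    the key 20 stands for the range 20-39.
    """
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))


# Hypothetical measurements, listed in the order they were recorded.
series = [12, 15, 18, 22, 25, 31, 35, 38, 44, 47, 52, 58]

# Weakness 1: the same data gives a different picture with a different
# number of bars.
coarse = bin_counts(series, 60)  # a single wide bar hides all variation
fine = bin_counts(series, 20)    # three bars show where the values cluster

# Weakness 2: the histogram ignores the order in which values occurred,
# so reversing the series in time leaves the picture unchanged.
reordered = list(reversed(series))
same_picture = bin_counts(reordered, 20) == fine

print(coarse)        # {0: 12}
print(fine)          # {0: 3, 20: 5, 40: 4}
print(same_picture)  # True
```

With a width of 60 every value lands in one bar, while a width of 20 reveals the cluster in the 20-39 range. The reversed series produces exactly the same counts, which is why a trend over time cannot be read from a histogram alone.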
A.1.15.1 Equi-width histograms
An equi-width histogram is a special case of the histogram described above, with the
requirement that the x-axis ranges all have the same width, while the y-axis
represents the frequency of data within each of those x-axis ranges.
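As a rough sketch of this idea in Python, the function below splits the span between the smallest and largest value into a chosen number of equal-width ranges and counts the values in each. The function name equi_width_histogram, the sample data, and the choice of five ranges are assumptions made for illustration only.

```python
def equi_width_histogram(values, n_bins):
    """Split [min, max] into n_bins ranges of equal width and count values.

    Returns the list of range edges (n_bins + 1 entries) and the list of
    counts (n_bins entries).
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; range width would be zero")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value belongs to the last range, not to a range
        # past the end, so the index is capped at n_bins - 1.
        index = min(int((v - lo) / width), n_bins - 1)
        counts[index] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts


# Hypothetical sample data.
data = [0, 5, 12, 21, 24, 27, 33, 38, 45, 61, 79, 100]

edges, counts = equi_width_histogram(data, 5)

# Print one text bar per range: equal widths, varying heights.
for i, count in enumerate(counts):
    print(f"{edges[i]:6.1f} to {edges[i + 1]:6.1f} | {'#' * count}")
```

For this sample every range is 20 units wide and the counts are 3, 5, 1, 2 and 1, so only the bar heights differ from range to range.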
A.1.15.2 Equi-height histograms
An equi-height histogram is similar to the typical histogram discussed previously
in that it graphically represents the distribution of data, but it uses a slightly
different scale. In an equi-height histogram, the height of each bar is equal (each
range contains approximately the same number of data points) and the width of
the x-axis ranges varies.
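One simple way to approximate an equi-height histogram is to sort the data and cut it into buckets that hold the same number of points. The Python sketch below takes that approach. The function name equi_height_histogram and the sample data are hypothetical, and the sketch reports the smallest and largest value observed in each bucket rather than contiguous range boundaries. It also makes no special provision for duplicate values that straddle a bucket boundary.

```python
def equi_height_histogram(values, n_bins):
    """Return one (low, high, count) tuple per range.

    The sorted values are cut into n_bins consecutive buckets whose sizes
    differ by at most one, so every bar has (nearly) the same height.
    """
    ordered = sorted(values)
    n = len(ordered)
    ranges = []
    for i in range(n_bins):
        start = i * n // n_bins
        end = (i + 1) * n // n_bins
        bucket = ordered[start:end]
        if bucket:  # skip empty buckets when n_bins exceeds len(values)
            ranges.append((bucket[0], bucket[-1], len(bucket)))
    return ranges


# Hypothetical sample data.
data = [0, 5, 12, 21, 24, 27, 33, 38, 45, 61, 79, 100]

# Print one text bar per range: equal heights, varying widths.
for low, high, count in equi_height_histogram(data, 4):
    print(f"{low:3d} to {high:3d} | {'#' * count}")
```

With four ranges over these twelve values, every bar holds three points, while the ranges cover 0 to 12, 21 to 27, 33 to 45 and 61 to 100. The narrow ranges mark where the data is dense and the wide ones where it is sparse.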