How to do it...

Note that the import statement is omitted in the following screenshots:

  1. The mean(data) function returns the arithmetic mean (the ordinary average) of a sequence or iterator:

    • Line 3 shows the mean of integers.
    • Line 4 shows the mean of floats.
    • Lines 6 and 8 show that fractions and decimals can be averaged as well.
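As a minimal sketch of the four cases described above, the following calls mean() on integers, floats, fractions, and decimals. The sample values are made up for illustration and are not the ones from the original screenshot.

```python
from decimal import Decimal
from fractions import Fraction
from statistics import mean

# Illustrative data only; any numeric sequence of a single type works.
int_mean = mean([1, 2, 3, 4, 5])                      # integers
float_mean = mean([1.5, 2.5, 3.5])                    # floats
fraction_mean = mean([Fraction(1, 2), Fraction(3, 4), Fraction(1, 4)])
decimal_mean = mean([Decimal("0.5"), Decimal("0.75"), Decimal("0.25")])

print(int_mean, float_mean, fraction_mean, decimal_mean)
```

Note that the result keeps the type of the input where it can, so averaging Fraction or Decimal values does not silently fall back to floats.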
  1. The harmonic_mean(data) function returns the harmonic mean of a sequence or iterator. The harmonic mean is the reciprocal of the arithmetic mean of the reciprocals of the data, and is typically used when the average of rates or ratios is needed.

For example, if a car traveled a given distance at 60 mph and then the same distance back at 50 mph, its average speed would be the harmonic mean of 60 and 50, that is, 2/(1/60 + 1/50) ≈ 54.5 mph:
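The calculation can be checked with a short sketch that compares the hand-written formula against harmonic_mean(). The variable names are illustrative.

```python
from statistics import harmonic_mean, mean

speeds = [60, 50]  # mph out, mph back, over the same distance

# Reciprocal of the arithmetic mean of the reciprocals, written out by hand.
by_formula = len(speeds) / sum(1 / s for s in speeds)

# The same quantity from the standard library.
by_library = harmonic_mean(speeds)

print(round(by_formula, 1))   # hand-written formula, rounded to one decimal
print(round(by_library, 1))   # library result, rounded the same way
print(mean(speeds))           # arithmetic mean, for comparison
```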

This is very close to the regular mean of 55 mph, so let's look at a larger difference, say 20 mph and 80 mph:

The harmonic mean is more appropriate in this example because the arithmetic mean doesn't account for the time required to cover the same distance: it takes four times longer to travel a given distance at 20 mph than at 80 mph.

If the distance were 120 miles, it would take six hours to travel at 20 mph but only one and a half hours at 80 mph. The total distance traveled would be 240 miles and the total time would be 7.5 hours: 240 miles / 7.5 hours = 32 miles per hour.
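The same reasoning can be sketched in code: compute the time for each leg, divide total distance by total time, and compare the result with harmonic_mean(). The 120-mile distance is the figure used in the text, and any other distance should give the same average speed.

```python
from statistics import harmonic_mean, mean

distance = 120      # miles each way
speeds = [20, 80]   # mph out, mph back

times = [distance / s for s in speeds]    # hours spent on each leg
total_distance = distance * len(speeds)   # miles covered in total
total_time = sum(times)                   # hours spent in total

average_speed = total_distance / total_time

print(times)                   # time per leg
print(average_speed)           # distance-over-time average
print(harmonic_mean(speeds))   # agrees with the distance-over-time average
print(mean(speeds))            # arithmetic mean overstates the average speed
```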

  1. The median(data) function returns the middle value of a sequence or iterator:
    • Line 19 demonstrates that the average of the two middle values is returned when the number of data points is even.
    • When the number of data points is odd (line 20), the middle value itself is returned.
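A minimal sketch of both cases follows, using made-up data rather than the values from the original screenshot.

```python
from statistics import median

even_data = [7, 1, 5, 3]      # input does not need to be sorted
odd_data = [1, 3, 5, 7, 9]

even_median = median(even_data)   # average of the two middle values, 3 and 5
odd_median = median(odd_data)     # the single middle value

print(even_median, odd_median)
```

With an even count, the result (here 4.0) is not itself a member of the dataset.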
  1. The median_low(data) function returns the low median of a sequence or iterator. It is used when the dataset contains discrete values and the returned value must be a member of the dataset:
    • If the dataset has an odd number of values (line 21), the middle value is returned, just like a normal median.
    • If the dataset has an even number of values (line 22), the smaller of the two middle values is returned.
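The difference from the ordinary median can be sketched as follows; the data is illustrative only.

```python
from statistics import median, median_low

odd_data = [1, 3, 5, 7, 9]
even_data = [1, 3, 5, 7]

low_odd = median_low(odd_data)     # same as median() for an odd count
low_even = median_low(even_data)   # smaller of the two middle values (3 and 5)
plain_even = median(even_data)     # interpolated value, not in the dataset

print(low_odd, low_even, plain_even)
```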
  1. The median_high(data) function returns the high median of a sequence or iterator. It is used when the dataset contains discrete values and the returned value must be a member of the dataset:
    • Line 23 shows that the larger of the two middle values is returned if the dataset has an even number of values.
    • Line 24 shows that the normal median is returned when there is an odd number of values in the data.
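A short sketch with made-up data shows the high median next to the low median for comparison.

```python
from statistics import median_high, median_low

even_data = [2, 4, 6, 8]
odd_data = [2, 4, 6]

high_even = median_high(even_data)   # larger of the two middle values (4 and 6)
low_even = median_low(even_data)     # smaller of the two, for comparison
high_odd = median_high(odd_data)     # the single middle value

print(high_even, low_even, high_odd)
```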
  1. The median_grouped(data, interval=1) function returns the median of grouped continuous data, calculated as the 50th percentile using interpolation:

In this screenshot, the groups are 5–15, 15–25, 25–35, and 35–45, with the values shown being the midpoints of those groups. The middle value falls in the 15–25 group, so the median must be interpolated within that group. Changing the interval argument, which sets the class interval, changes the interpolated result.
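The effect of the interval argument can be sketched with a small dataset. The values below are hypothetical (they are not taken from the original screenshot) and were chosen so that the middle value, 20, is the midpoint of the 15–25 group when the interval is 10.

```python
from statistics import median_grouped

# Hypothetical class midpoints; 20 occurs twice and is the middle value.
data = [10, 20, 20, 30, 40]

# With interval=10, the median class spans 15-25 and the result is
# interpolated inside it.
wide = median_grouped(data, interval=10)

# With the default interval of 1, the median class spans 19.5-20.5.
narrow = median_grouped(data)

print(wide, narrow)
```

Under these assumptions the wide interval gives 22.5 and the default interval gives 20.25, so the same data yields a different median depending on how wide each class is taken to be.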

  1. The mode(data) function returns the most common value from data, and assumes data is discrete. It can be used for numeric or non-numeric data:
    • Line 30 shows that if there isn't a single value with the largest count, an error is raised. (From Python 3.8 onward, mode() instead returns the first such value encountered.)
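The following sketch shows mode() on numeric and non-numeric data, plus a tie. Because the handling of ties depends on the Python version (StatisticsError before 3.8, the first mode encountered from 3.8 onward), the tie is wrapped in a try/except so the example runs either way. The data is illustrative.

```python
from statistics import StatisticsError, mode

numeric_mode = mode([1, 1, 2, 3, 3, 3, 4])            # 3 occurs most often
text_mode = mode(["red", "blue", "blue", "green"])    # works on strings too

tied = [1, 1, 2, 2]   # no single most common value
try:
    tied_mode = mode(tied)
except StatisticsError as exc:
    # Older interpreters refuse to pick between equally common values.
    tied_mode = None
    print("no unique mode:", exc)
else:
    # Newer interpreters return the first mode encountered.
    print("first mode encountered:", tied_mode)

print(numeric_mode, text_mode)
```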
  1. The pstdev(data, mu=None) function returns the population standard deviation. If mu is not provided, the mean of the dataset is calculated automatically:
    • Line 1 shows a basic standard deviation. However, a precomputed mean of the dataset can be passed into the function so that a separate calculation isn't required (lines 32-34).
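A minimal sketch of both calling styles follows, using a made-up dataset whose population standard deviation works out to a round number.

```python
from statistics import mean, pstdev

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative values; the mean is 5

# Let pstdev() work out the mean itself.
sd_auto = pstdev(data)

# Reuse a mean that has already been computed elsewhere.
mu = mean(data)
sd_with_mu = pstdev(data, mu)

print(mu, sd_auto, sd_with_mu)
```

Both calls should agree, since mu is the true mean of the data; passing it in simply avoids computing the mean a second time.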
  1. The pvariance(data, mu=None) function returns the variance of a population dataset. The same conditions for arguments as in pstdev apply. Decimals and fractions are supported:

The mu argument should be the actual mean of the dataset; passing in any other value may produce an incorrect result (this also applies to pstdev).
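As a sketch of these points, the following computes the population variance of integers (with and without a precomputed mean), decimals, and fractions. The integer data is made up; the Decimal and Fraction inputs follow the examples in the standard library documentation.

```python
from decimal import Decimal
from fractions import Fraction
from statistics import mean, pvariance

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative values; the mean is 5

var_auto = pvariance(data)               # mean computed internally
var_with_mu = pvariance(data, mean(data))  # correct mean supplied by the caller

# Decimal and Fraction inputs give results of the same type.
var_decimal = pvariance(
    [Decimal("27.5"), Decimal("30.25"), Decimal("30.25"),
     Decimal("34.5"), Decimal("41.75")]
)
var_fraction = pvariance([Fraction(1, 4), Fraction(5, 4), Fraction(1, 2)])

print(var_auto, var_with_mu, var_decimal, var_fraction)
```

The sketch deliberately does not pass a wrong mu: how far the result drifts in that case depends on the Python version, so the safest habit is to pass either the true mean or nothing at all.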

  1. The stdev(data, xbar=None) function provides the same functionality as pstdev but is designed for use with samples drawn from a population, rather than entire populations.
  2. The variance(data, xbar=None) function provides the same functionality as pvariance but should only be used with samples rather than populations.
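The difference between the sample and population versions can be sketched by running all four functions on the same made-up data. The sample functions divide the sum of squared deviations by n - 1 rather than n, so they return slightly larger values.

```python
from statistics import mean, pstdev, pvariance, stdev, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative values; treat as a sample
xbar = mean(data)                  # sample mean, optional for both functions

pop_var = pvariance(data)          # sum of squared deviations / n
sample_var = variance(data, xbar)  # sum of squared deviations / (n - 1)

pop_sd = pstdev(data)
sample_sd = stdev(data, xbar)

print(pop_var, sample_var)
print(pop_sd, sample_sd)
```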