Statistical operators

NumPy provides functions for common statistical operations on arrays, such as the mean, median, variance, and standard deviation. These aggregates can be calculated for an entire array as shown in the following code:

In [16]: array_x = np.array([[0, 1, 2], [3, 4, 5]])
In [17]: np.mean(array_x)
Out[17]: 2.5
In [18]: np.median(array_x)
Out[18]: 2.5
In [19]: np.var(array_x)
Out[19]: 2.9166666666666665
In [20]: np.std(array_x)
Out[20]: 1.707825127659933
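
As a cross-check on the outputs above, the following minimal sketch recomputes the variance and standard deviation directly from their definitions. It relies on NumPy's default of dividing by N (the population form, ddof=0); the names flat, manual_var, manual_std, and sample_var are illustrative only and do not come from the text.

```python
import numpy as np

array_x = np.array([[0, 1, 2], [3, 4, 5]])

# Flatten to one dimension, as the aggregate functions do by default.
flat = array_x.ravel()

# Population variance: mean of squared deviations from the mean (divide by N).
manual_var = np.sum((flat - flat.mean()) ** 2) / flat.size
manual_std = np.sqrt(manual_var)

# Both should agree with NumPy's built-in results.
print(np.isclose(manual_var, np.var(array_x)))
print(np.isclose(manual_std, np.std(array_x)))

# Passing ddof=1 divides by N - 1 instead (the sample variance).
sample_var = np.var(array_x, ddof=1)
print(sample_var)
```

If a sample estimate rather than a population value is needed, ddof=1 is the usual choice; np.std accepts the same argument.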

By default, these statistics are computed over the flattened array. To compute them along a particular axis, pass the axis argument when calling these functions. Let's look at this behavior with the mean function as an example:

In [27]: np.mean(array_x, axis = 0)
Out[27]: array([1.5, 2.5, 3.5])
In [28]: np.mean(array_x, axis = 1)
Out[28]: array([1., 4.])
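
To make the axis behavior more concrete, here is a small sketch that inspects the shapes of the results and uses the keepdims argument of np.mean to center each row. The variable names (col_means, row_means, row_means_2d, centered) are assumptions made for illustration.

```python
import numpy as np

array_x = np.array([[0, 1, 2], [3, 4, 5]])  # shape (2, 3)

# axis=0 collapses the rows, leaving one value per column.
col_means = np.mean(array_x, axis=0)
# axis=1 collapses the columns, leaving one value per row.
row_means = np.mean(array_x, axis=1)

print(col_means.shape)  # (3,)
print(row_means.shape)  # (2,)

# keepdims=True retains the reduced axis with length 1, which lets the
# result broadcast against the original array, e.g. to center each row.
row_means_2d = np.mean(array_x, axis=1, keepdims=True)  # shape (2, 1)
centered = array_x - row_means_2d
print(centered)
```

The axis named in the call is the one that disappears from the result, which is why axis=0 on a (2, 3) array returns three values.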

There are special implementations of these functions that handle arrays with missing values (NaNs) by ignoring them. These functions are nanmean, nanmedian, nanstd, and nanvar:

In [30]: nan_array = np.array([[5, 6, np.nan], [19, 3, 2]])

# The regular function returns only nan with a warning
In [31]: np.median(nan_array)
C:\Users\...\Anaconda3\lib\site-packages\numpy\lib\function_base.py:3250: RuntimeWarning: Invalid value encountered in median
  r = func(a, **kwargs)
Out[31]: nan
In [32]: np.nanmedian(nan_array)
Out[32]: 5.0
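
One way to see what the nan-aware functions do is to drop the NaN entries by hand and compare. The sketch below does this for nanmean using a Boolean mask; valid and manual_mean are illustrative names, and the comparison is a sanity check rather than a description of how NumPy implements these functions internally.

```python
import numpy as np

nan_array = np.array([[5, 6, np.nan], [19, 3, 2]])

# Boolean mask selecting the entries that are not NaN.
valid = ~np.isnan(nan_array)

# Masking returns a flat array of the valid values only.
manual_mean = nan_array[valid].mean()
print(np.isclose(manual_mean, np.nanmean(nan_array)))

# The nan-aware functions also accept the axis argument;
# each column is averaged over its non-NaN entries.
col_means = np.nanmean(nan_array, axis=0)
print(col_means)
```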

The corrcoef and cov functions compute Pearson's correlation coefficients and the covariance matrix for a given array or for two given arrays. By default, each row is treated as a variable and each column as an observation:

In [35]: array_corr = np.random.randn(3,4)
In [36]: array_corr
Out[36]:
array([[-2.36657958, -0.43193796,  0.4761051 , -0.11778897],
       [ 0.52101041,  1.11562216,  0.61953044,  0.07586606],
       [-0.17068701, -0.84382552,  0.86449631,  0.77080463]])
In [37]: np.corrcoef(array_corr)
Out[37]:
array([[ 1.        , -0.00394547,  0.48887013],
       [-0.00394547,  1.        , -0.76641267],
       [ 0.48887013, -0.76641267,  1.        ]])
In [38]: np.cov(array_corr)
Out[38]:
array([[ 1.51305796, -0.00207053,  0.48931189],
       [-0.00207053,  0.18201613, -0.26606154],
       [ 0.48931189, -0.26606154,  0.66210821]])
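
The two matrices are related: each correlation coefficient is the covariance divided by the product of the two standard deviations. The following sketch checks that relationship on freshly generated data (not the array above, whose random values cannot be reproduced). The seed and the names data, std, manual_corr, and r are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for repeatability
data = rng.standard_normal((3, 4))  # 3 variables (rows), 4 observations each

cov = np.cov(data)

# Per-variable standard deviations are the square roots of the diagonal.
std = np.sqrt(np.diag(cov))

# Pearson correlation: r_ij = cov_ij / (std_i * std_j).
manual_corr = cov / np.outer(std, std)
print(np.allclose(manual_corr, np.corrcoef(data)))

# np.cov divides by N - 1 by default, so its diagonal matches
# np.var computed along each row with ddof=1.
print(np.allclose(np.diag(cov), np.var(data, axis=1, ddof=1)))

# Passing two 1-D arrays yields a 2x2 matrix; the off-diagonal entry
# is the correlation between them.
r = np.corrcoef(data[0], data[1])[0, 1]
print(np.isclose(r, manual_corr[0, 1]))
```

Note the differing defaults: np.var divides by N unless ddof is given, whereas np.cov divides by N - 1.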