With the NumPy package, we can handle many kinds of data processing tasks without writing explicit loops. This keeps the code shorter and easier to maintain, and it usually improves the performance of the program as well. In this part, we introduce some mathematical and statistical functions.
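As a minimal sketch of what "without writing loops" means in practice, the snippet below computes a sum of squares twice: once with an explicit Python loop and once with a single NumPy expression. The array contents and variable names are illustrative assumptions, not part of the original text.

```python
import numpy as np

values = np.arange(1, 6)  # array([1, 2, 3, 4, 5]), illustrative data

# Explicit Python loop: accumulate the squares one element at a time
loop_total = 0
for v in values:
    loop_total += v * v

# Vectorized equivalent: square every element, then reduce with np.sum
vectorized_total = np.sum(values ** 2)

print(loop_total, vectorized_total)  # 55 55
```

Both versions give the same result; the vectorized form pushes the iteration into NumPy's compiled code, which is where the performance benefit tends to come from on larger arrays.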
See the following table for a listing of mathematical and statistical functions:
| Function | Description | Example |
|---|---|---|
| np.sum | Calculate the sum of all the elements in an array or along a given axis | >>> a = np.array([[2,4], [3,5]]) >>> np.sum(a, axis=0) array([5, 9]) |
| np.prod | Compute the product of array elements over the given axis | >>> np.prod(a, axis=1) array([8, 15]) |
| np.diff | Calculate the discrete difference along the given axis | >>> np.diff(a, axis=0) array([[1,1]]) |
| np.gradient | Return the gradient of an array (one result per axis) | >>> np.gradient(a) [array([[1., 1.], [1., 1.]]), array([[2., 2.], [2., 2.]])] |
| np.cross | Return the cross product of two arrays | >>> b = np.array([[1,2], [3,4]]) >>> np.cross(a,b) array([0, -3]) |
| np.std, np.var | Return standard deviation and variance of arrays | >>> np.std(a) 1.118033988749895 >>> np.var(a) 1.25 |
| np.mean | Calculate arithmetic mean of an array | >>> np.mean(a) 3.5 |
| np.where | Return elements, either from x or y, depending on whether a condition is satisfied | >>> np.where([[True, True], [False, True]], [[1,2],[3,4]], [[5,6],[7,8]]) array([[1,2], [7, 4]]) |
| np.unique | Return the sorted unique values in an array | >>> id = np.array(['a', 'b', 'c', 'c', 'd']) >>> np.unique(id) array(['a', 'b', 'c', 'd'], dtype='<U1') |
| np.intersect1d | Compute the sorted, common elements of two arrays | >>> a = np.array(['a', 'b', 'a', 'c', 'd', 'c']) >>> b = np.array(['a', 'xyz', 'klm', 'd']) >>> np.intersect1d(a,b) array(['a', 'd'], dtype='<U3') |
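Several of the functions in the table take an axis argument, and its meaning is easy to mix up. The following sketch, which reuses the same small array as the table, shows how the result changes with the axis. The ddof argument of np.var and the computed condition passed to np.where are additions for illustration and do not appear in the table.

```python
import numpy as np

a = np.array([[2, 4], [3, 5]])

# axis=0 collapses the rows (one result per column);
# axis=1 collapses the columns (one result per row).
col_sums = np.sum(a, axis=0)   # array([5, 9])
row_sums = np.sum(a, axis=1)   # array([6, 8])
total = np.sum(a)              # 14, no axis: reduce over every element

# np.var divides by N by default; ddof=1 divides by N - 1 (sample variance)
population_var = np.var(a)         # 1.25
sample_var = np.var(a, ddof=1)     # 5 / 3

# np.where with a computed condition: keep values above the mean, zero the rest
above_mean = np.where(a > np.mean(a), a, 0)   # array([[0, 4], [0, 5]])

print(col_sums, row_sums, total)
print(population_var, sample_var)
print(above_mean)
```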
We can also save data to disk and load it back, in either text or binary format, by using the file functions that NumPy provides.
By default, the np.save function saves an array in an uncompressed raw binary format, with the file extension .npy:

    >>> a = np.array([[0, 1, 2], [3, 4, 5]])
    >>> np.save('test1.npy', a)
If we want to store several arrays in a single file in the uncompressed .npz format, we can use the np.savez function, as shown in the following example:

    >>> a = np.arange(4)
    >>> b = np.arange(7)
    >>> np.savez('test2.npz', arr0=a, arr1=b)
The .npz file is a zipped archive of files named after the variables they contain. When we load an .npz file, we get back a dictionary-like object that can be queried for the arrays it contains:

    >>> dic = np.load('test2.npz')
    >>> dic['arr0']
    array([0, 1, 2, 3])
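The round trip through an .npz archive can be sketched end to end as follows. This version writes into a temporary directory so that it leaves no files behind, and it lists the stored names through the files attribute of the loaded object. The temporary path and the variable names are assumptions made for the example.

```python
import os
import tempfile

import numpy as np

a = np.arange(4)
b = np.arange(7)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "test2.npz")
    np.savez(path, arr0=a, arr1=b)

    # Opening the archive in a with-block closes the file when we are done
    with np.load(path) as archive:
        names = sorted(archive.files)    # names given as keywords to np.savez
        restored_a = archive["arr0"]     # each array is read when accessed
        restored_b = archive["arr1"]

print(names)  # ['arr0', 'arr1']
print(np.array_equal(a, restored_a), np.array_equal(b, restored_b))  # True True
```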
Another way to save array data to a file is to use the np.savetxt function, which allows us to set format properties in the output file:

    >>> x = np.arange(4)
    >>> # e.g., set comma as separator between elements
    >>> np.savetxt('test3.out', x, delimiter=',')
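To make the "format properties" more concrete, here is a small sketch that combines the delimiter with two other np.savetxt parameters, fmt and header, and then reads the file back as plain text to show what was written. The two-row array, the column names, and the temporary path are illustrative assumptions.

```python
import os
import tempfile

import numpy as np

table = np.array([[1, 2, 3], [4, 5, 6]])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "test3.out")
    # fmt controls how each number is written; header adds a first line,
    # which np.savetxt prefixes with "# " by default (the comments parameter)
    np.savetxt(path, table, fmt="%d", delimiter=",", header="x,y,z")
    with open(path) as handle:
        text = handle.read()

print(text)
# # x,y,z
# 1,2,3
# 4,5,6
```

Without fmt="%d", the values would be written in the default scientific notation, which is usually harder to read for integer data.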
Two common functions, np.load and np.loadtxt, correspond to the saving functions and load an array back from a file:

    >>> np.load('test1.npy')
    array([[0, 1, 2], [3, 4, 5]])
    >>> np.loadtxt('test3.out', delimiter=',')
    array([0., 1., 2., 3.])
Similar to the np.savetxt function, the np.loadtxt function also has a lot of options for loading an array from a text file.
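A few of those options can be sketched as follows. The example skips a header line with skiprows and selects columns with usecols. The file contents are hypothetical, and an in-memory StringIO object stands in for a file on disk so that the snippet is self-contained.

```python
from io import StringIO

import numpy as np

# Hypothetical file contents; StringIO stands in for a text file on disk
raw = StringIO("id,height,weight\n1,1.70,65.0\n2,1.82,80.5\n")

# skiprows drops the header line; usecols keeps only the last two columns
data = np.loadtxt(raw, delimiter=",", skiprows=1, usecols=(1, 2))

print(data.shape)   # (2, 2)
print(data[:, 1])   # the weight column
```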