There's more...

Writing an algorithm for a reduction operation using PyCUDA can be quite complex. For this purpose, Numba provides the @reduce decorator for converting simple binary operations into reduction kernels.

Reduction operations reduce a set of values to a single value. A typical example of a reduction operation is to calculate the sum of all the elements of an array. As an example, consider the following array of elements: 1, 2, 3, 4, 5, 6, 7, 8.

The sequential algorithm operates in the way shown in the diagram, that is, adding the elements of the array one after the other:

Sequential sum

A parallel algorithm operates according to the following schema:

Parallel sum

It is clear that the latter has the advantage of shorter execution time.

By using Numba and the @reduce decorator, we can write an algorithm, in a few lines of code, for the parallel sum on an array of integers ranging from 1 to 10,000:

import numpy 
from numba import cuda 
 
@cuda.reduce 
def sum_reduce(a, b): 
    return a + b 
 
A = (numpy.arange(10000, dtype=numpy.int64)) + 1
print(A) got = sum_reduce(A)
print(got)

The previous example can be performed by typing the following command:

(base) C:>python reduceNumba.py

The following result is provided:

vector to reduce = [ 1 2 3 ... 9998 9999 10000]
result = 50005000
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset