How to do it...

In the following example, we show the basic steps for building an application with PyOpenCL. The task to be performed is the sum of two vectors. To keep the output readable, we consider two vectors of 100 elements each: the i-th element of the resulting vector is the sum of the i-th element of vector_a and the i-th element of vector_b.

Let's start by importing all the necessary libraries:

import numpy as np 
import pyopencl as cl 
import numpy.linalg as la

We define the size of the vectors to be added, as follows:

vector_dimension = 100

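Here, the input vectors, vector_a and vector_b, are defined. The default integer width of np.random.randint depends on the platform, so we cast the vectors to 32-bit integers to match the int type used by the kernel:

vector_a = np.random.randint(vector_dimension, size=vector_dimension).astype(np.int32)
vector_b = np.random.randint(vector_dimension, size=vector_dimension).astype(np.int32)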
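The kernel used in this recipe declares its arguments as int, which is 32 bits wide in OpenCL C, so the host arrays should hold 32-bit integers as well. The following is a minimal sketch, using only NumPy, of how the element type can be inspected and converted; the names raw and as_int32 are illustrative and do not appear in the recipe:

```python
import numpy as np

vector_dimension = 100

# The default integer type of randint depends on the platform and NumPy version.
raw = np.random.randint(vector_dimension, size=vector_dimension)

# Convert explicitly so that each element occupies exactly 32 bits.
as_int32 = raw.astype(np.int32)

print(raw.dtype)       # platform dependent
print(as_int32.dtype)  # int32
```

The values fit comfortably in 32 bits because they are all smaller than vector_dimension, so the conversion does not change them.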

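In sequence, we define the platform, device, context, and queue. Index 1 selects the second OpenCL platform installed on the machine; use index 0 if only one platform is available: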

platform = cl.get_platforms()[1] 
device = platform.get_devices()[0] 
context = cl.Context([device]) 
queue = cl.CommandQueue(context)
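The platform and device indices depend on the drivers installed on each machine. The following sketch lists what is available before hardcoding an index. The helper name list_opencl_devices is hypothetical, and the function returns an empty list if PyOpenCL cannot be imported or no OpenCL platform is found:

```python
def list_opencl_devices():
    """Return (platform_index, device_index, platform_name, device_name) tuples."""
    try:
        import pyopencl as cl
    except ImportError:
        # PyOpenCL is not installed in this environment.
        return []

    found = []
    try:
        for p_idx, platform in enumerate(cl.get_platforms()):
            for d_idx, device in enumerate(platform.get_devices()):
                found.append((p_idx, d_idx, platform.name, device.name))
    except cl.Error:
        # Raised, for example, when no OpenCL platform is installed;
        # keep whatever was collected before the error.
        pass
    return found


for entry in list_opencl_devices():
    print(entry)
```

The first two fields of each tuple are the indices to use in cl.get_platforms()[...] and platform.get_devices()[...].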

Now, it's time to organize the memory areas that will contain the input vectors:

mf = cl.mem_flags 
a_g = cl.Buffer(context, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=vector_a) 
b_g = cl.Buffer(context, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=vector_b)

Next, we build the application kernel by using the Program class:

program = cl.Program(context, """ 
__kernel void vectorSum(__global const int *a_g, __global const int *b_g, __global int *res_g) { 
  int gid = get_global_id(0); 
  res_g[gid] = a_g[gid] + b_g[gid]; 
} 
""").build()

Then, we allocate the device memory for the resulting vector:

res_g = cl.Buffer(context, mf.WRITE_ONLY, vector_a.nbytes)
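The third argument of cl.Buffer is a size in bytes, which is why the recipe passes vector_a.nbytes rather than the number of elements. The following NumPy-only sketch shows how the two relate, assuming 32-bit integer elements; the variable names are illustrative:

```python
import numpy as np

vector_dimension = 100
vector = np.zeros(vector_dimension, dtype=np.int32)

# nbytes is the element count multiplied by the size of one element.
element_count = vector.size
bytes_per_element = vector.itemsize
buffer_size = vector.nbytes

print(element_count, bytes_per_element, buffer_size)  # 100 4 400
```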

Then, we call the kernel function:

program.vectorSum(queue, vector_a.shape, None, a_g, b_g, res_g)
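The call above launches one work item per element: the global size vector_a.shape is (100,), and passing None as the local size leaves the choice of work-group size to the OpenCL implementation. The following pure-Python sketch emulates that execution model sequentially. The functions vector_sum_work_item and launch are hypothetical helpers written for illustration and are not part of PyOpenCL:

```python
def vector_sum_work_item(gid, a, b, res):
    """Emulate one work item of the vectorSum kernel."""
    res[gid] = a[gid] + b[gid]


def launch(kernel, global_size, *args):
    """Run the kernel once for every global id, one after another.

    A real OpenCL device may run these invocations in parallel
    and in any order.
    """
    for gid in range(global_size):
        kernel(gid, *args)


a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
res = [0] * len(a)

launch(vector_sum_work_item, len(a), a, b, res)
print(res)  # [11, 22, 33, 44]
```

Because each work item writes only to its own index, the result does not depend on the order in which the work items run.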

The memory space used to store the result is allocated in the host memory area (res_np):

res_np = np.empty_like(vector_a)

We copy the result of the computation from the device buffer into the host memory area just created:

cl.enqueue_copy(queue, res_np, res_g)

Finally, we print the results:

print ("PyOPENCL SUM OF TWO VECTORS") 
print ("Platform Selected = %s" %platform.name ) 
print ("Device Selected = %s" %device.name) 
print ("VECTOR LENGTH = %s" %vector_dimension) 
print ("INPUT VECTOR A") 
print (vector_a) 
print ("INPUT VECTOR B") 
print (vector_b) 
print ("OUTPUT VECTOR RESULT A + B ") 
print (res_np)

Then, we perform a simple check in order to verify that the sum operation is correct:

assert la.norm(res_np - (vector_a + vector_b)) < 1e-5
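Since the vectors contain integers, the result can also be compared for exact equality instead of through a norm and a tolerance. The following sketch uses only NumPy; the small arrays stand in for the data of the recipe, and the names expected, wrong, and mismatches are illustrative:

```python
import numpy as np

vector_a = np.array([1, 2, 3], dtype=np.int32)
vector_b = np.array([4, 5, 6], dtype=np.int32)
res_np = np.array([5, 7, 9], dtype=np.int32)  # stand-in for the device result

expected = vector_a + vector_b
assert np.array_equal(res_np, expected)

# On a mismatch, report which indices differ instead of only failing.
wrong = res_np.copy()
wrong[1] = 0
mismatches = np.flatnonzero(wrong != expected)
print(mismatches)  # [1]
```

Listing the differing indices can make it easier to tell a wrong element type or buffer size from an error in the kernel logic.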
