As explained previously, the test runs the same calculation task twice: first on the CPU through the test_cpu_vector_sum function, and then on the GPU through the test_gpu_vector_sum function.
Both functions report the execution time.
The CPU test function, test_cpu_vector_sum, consists of two nested loops over 10000 vector elements, with the inner loop repeating the same addition 10000 times:
    cpu_start_time = time()
    for i in range(10000):
        for j in range(10000):
            c_cpu[i] = a[i] + b[i]
    cpu_end_time = time()
The total CPU time is the difference between the end and start timestamps:
CPU Time = cpu_end_time - cpu_start_time
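The timing pattern above can be tried in isolation with a minimal sketch. The function name timed_cpu_vector_sum, its parameters, and the reduced sizes are assumptions chosen for illustration, not the original test_cpu_vector_sum. The sketch also swaps time for perf_counter, which is generally better suited to measuring short intervals.

```python
from time import perf_counter


def timed_cpu_vector_sum(a, b, repeats):
    """Add a and b element by element, redoing each addition `repeats` times.

    Hypothetical helper: it mirrors the nested loops of the CPU test,
    with the sizes passed in instead of being fixed at 10000.
    """
    c_cpu = [0.0] * len(a)
    cpu_start_time = perf_counter()
    for i in range(len(a)):
        for j in range(repeats):  # same addition repeated to add load
            c_cpu[i] = a[i] + b[i]
    cpu_end_time = perf_counter()
    return c_cpu, cpu_end_time - cpu_start_time


# Small sizes (assumed) so that the sketch finishes quickly
a = [float(i) for i in range(1000)]
b = [2.0 * i for i in range(1000)]
c_cpu, cpu_time = timed_cpu_vector_sum(a, b, repeats=100)
print("CPU Time: {0} s".format(cpu_time))
```

Raising the vector length and repeats to 10000 reproduces the workload of the test, at the cost of a much longer run.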
As for the test_gpu_vector_sum function, you can see the following by looking at the execution kernel:
    __kernel void sum(__global const float *a,
                      __global const float *b,
                      __global float *c){
        int i=get_global_id(0);
        int j;
        for(j=0;j< 10000;j++){
            c[i]=a[i]+b[i];
        }
    }
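Here, the sum of the two vectors needs only a single calculation loop: the outer loop over i is gone, because each work item handles the one element whose index it receives from get_global_id(0).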
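To see why the kernel needs only one loop, the work-item model can be emulated in plain Python. This is a sketch of the idea, not PyOpenCL code: sum_work_item and run_ndrange are hypothetical names, and the sequential for loop in run_ndrange only stands in for the work items that an OpenCL device would run in parallel.

```python
def sum_work_item(global_id, a, b, c, repeats):
    """Body of one emulated work item.

    global_id plays the role of get_global_id(0) in the kernel.
    """
    i = global_id
    for j in range(repeats):  # the only loop left inside the kernel
        c[i] = a[i] + b[i]


def run_ndrange(global_size, a, b, c, repeats):
    """Call the kernel body once per global id.

    Sequential here for illustration; a device would run these in parallel.
    """
    for global_id in range(global_size):
        sum_work_item(global_id, a, b, c, repeats)


a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
c = [0.0] * len(a)
run_ndrange(len(a), a, b, c, repeats=5)
print(c)
```

The loop over elements has moved out of the kernel body and into the launch, which is the part the device spreads across its compute units.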
The result, as can be imagined, is a substantial reduction in the execution time for the test_gpu_vector_sum function:
(base) C:>python testApplicationPyopencl.py
============================================================
OpenCL Platforms and Devices
============================================================
Platform - Name: NVIDIA CUDA
Platform - Vendor: NVIDIA Corporation
Platform - Version: OpenCL 1.2 CUDA 10.1.152
Platform - Profile: FULL_PROFILE
--------------------------------------------------------
Device - Name: GeForce 840M
Device - Type: GPU
Device - Max Clock Speed: 1124 Mhz
Device - Compute Units: 3
Device - Local Memory: 48 KB
Device - Constant Memory: 64 KB
Device - Global Memory: 2 GB
Device - Max Buffer/Image Size: 512 MB
Device - Max Work Group Size: 1024
============================================================
Platform - Name: Intel(R) OpenCL
Platform - Vendor: Intel(R) Corporation
Platform - Version: OpenCL 2.0
Platform - Profile: FULL_PROFILE
--------------------------------------------------------
Device - Name: Intel(R) HD Graphics 5500
Device - Type: GPU
Device - Max Clock Speed: 950 Mhz
Device - Compute Units: 24
Device - Local Memory: 64 KB
Device - Constant Memory: 64 KB
Device - Global Memory: 3 GB
Device - Max Buffer/Image Size: 808 MB
Device - Max Work Group Size: 256
--------------------------------------------------------
Device - Name: Intel(R) Core(TM) i7-5500U CPU @ 2.40GHz
Device - Type: CPU
Device - Max Clock Speed: 2400 Mhz
Device - Compute Units: 4
Device - Local Memory: 32 KB
Device - Constant Memory: 128 KB
Device - Global Memory: 8 GB
Device - Max Buffer/Image Size: 2026 MB
Device - Max Work Group Size: 8192
CPU Time: 39.505873918533325 s
GPU Kernel evaluation Time: 0.013606592 s
GPU Time: 0.019981861114501953 s
Even if the test is not computationally expensive, it provides a useful indication of the potential of a GPU card.
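As a rough reading of the reported figures, the ratio between the CPU and GPU times can be worked out directly. The values are copied from the sample run; the variable names are illustrative. Note that the ratio compares an interpreted Python loop with a compiled kernel, so it is an indication rather than a pure hardware benchmark.

```python
cpu_time = 39.505873918533325    # "CPU Time" from the sample run, in seconds
gpu_kernel_time = 0.013606592    # "GPU Kernel evaluation Time", in seconds
gpu_time = 0.019981861114501953  # "GPU Time", in seconds

# How many times faster the GPU run was, using each of the two GPU figures
speedup_total = cpu_time / gpu_time
speedup_kernel = cpu_time / gpu_kernel_time

# Gap between the two GPU figures, in seconds
gpu_gap = gpu_time - gpu_kernel_time

print("Speedup (GPU Time): {0:.0f}x".format(speedup_total))
print("Speedup (kernel only): {0:.0f}x".format(speedup_kernel))
print("GPU Time minus kernel time: {0:.6f} s".format(gpu_gap))
```

On this run, the GPU total comes out close to 2000 times faster than the CPU loop, and the kernel alone close to 2900 times faster.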