To use a GPU in your TensorFlow program, just type the following:
with tf.device("/gpu:0"):
This is followed by the operations to set up. The statement creates a context manager that tells TensorFlow to perform those operations on the GPU.
Let's consider the following example, in which we want to compute the sum of two matrix powers, Aⁿ + Bⁿ.
Define the basic imports:
import numpy as np
import tensorflow as tf
import datetime
We can configure the program to report which devices its operations and tensors are assigned to. To do this, we'll create a session with the log_device_placement parameter set to True:
log_device_placement = True
Then we fix the parameter n, that is, the exponent of the matrix power (the number of matrix multiplications to perform):
n=10
Then we build two large random matrices using NumPy's rand function:
A = np.random.rand(10000, 10000).astype('float32')
B = np.random.rand(10000, 10000).astype('float32')
A and B are each of size 10000×10000.
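As a side note (a back-of-the-envelope sketch, not part of the original program), each 10000×10000 float32 matrix occupies about 400 MB, which must fit in device memory:

```python
# Estimate the memory footprint of one 10000 x 10000 float32 matrix.
rows, cols = 10000, 10000
bytes_per_float32 = 4  # float32 uses 4 bytes per element
matrix_bytes = rows * cols * bytes_per_float32
print(matrix_bytes / 1e6, "MB")  # 400.0 MB
```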
The following list will be used to store the results:
c1 = []
Here, we define matpow, the matrix power function whose multiplications will be performed by the GPU:
def matpow(M, n):
    # Compute the matrix power M^n by repeated matrix multiplication
    if n <= 1:
        return M
    return tf.matmul(M, matpow(M, n - 1))
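The recursion can be checked outside TensorFlow with a small NumPy analogue (an illustrative sketch; np_matpow is a hypothetical helper, not part of the program):

```python
import numpy as np

def np_matpow(M, n):
    # Recursive matrix power: M^n via repeated matrix multiplication.
    if n <= 1:
        return M
    return M @ np_matpow(M, n - 1)

M = np.array([[2.0, 0.0],
              [0.0, 3.0]])
result = np_matpow(M, 3)
# For a diagonal matrix, the power acts entry-wise: diag(2^3, 3^3)
print(result)
```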
As previously explained, we must assign each device the operations it is to perform.
The GPU will compute Aⁿ and Bⁿ and store the results in c1:
with tf.device('/gpu:0'):
    a = tf.placeholder(tf.float32, [10000, 10000])
    b = tf.placeholder(tf.float32, [10000, 10000])
    c1.append(matpow(a, n))
    c1.append(matpow(b, n))
If the above code does not work, use /job:localhost/replica:0/task:0/cpu:0 as the device string instead (the computation will then be executed on the CPU).
The addition of all elements in c1, that is, Aⁿ + Bⁿ, is performed by the CPU, so we define it as follows:
with tf.device('/cpu:0'):
    sum = tf.add_n(c1)
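tf.add_n performs an element-wise sum over a list of same-shaped tensors; here is a NumPy sketch of the equivalent operation (for illustration only, not the TensorFlow implementation):

```python
import numpy as np

def add_n(arrays):
    # Element-wise sum of a list of same-shaped arrays, like tf.add_n.
    out = np.zeros_like(arrays[0])
    for a in arrays:
        out = out + a
    return out

x = np.ones((2, 2))
y = 2 * np.ones((2, 2))
total = add_n([x, y])  # every entry equals 3.0
```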
The datetime module allows us to measure the computation time:
t1_1 = datetime.datetime.now()
with tf.Session(config=tf.ConfigProto(
        log_device_placement=log_device_placement)) as sess:
    sess.run(sum, {a: A, b: B})
t2_1 = datetime.datetime.now()
The computation time is then displayed using:
print("GPU computation time: " + str(t2_1-t1_1))
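The same timing pattern works for any block of code; here is a minimal standard-library sketch (the workload below is a stand-in for the real computation, and time.perf_counter would be a higher-resolution alternative):

```python
import datetime

t1 = datetime.datetime.now()
# ... the computation to be measured goes here ...
workload = sum(i * i for i in range(100000))  # stand-in workload
t2 = datetime.datetime.now()

elapsed = t2 - t1  # a datetime.timedelta object
print("computation time: " + str(elapsed))
```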
Using a GeForce 840M graphics card, the results are as follows:
GPU computation time: 0:00:13.816644