We proceed with the recipe as follows:
1. Start by importing a few modules:
import sys
import numpy as np
import tensorflow as tf
from datetime import datetime
2. Read from the command line the type of processing unit you want to use (either "gpu" or "cpu") and the size of the square matrix to multiply:
device_name = sys.argv[1]  # Choose device from cmd line. Options: gpu or cpu
shape = (int(sys.argv[2]), int(sys.argv[2]))
if device_name == "gpu":
    device_name = "/gpu:0"
else:
    device_name = "/cpu:0"
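As an optional safeguard (not part of the original recipe), you could fall back to the CPU when TensorFlow cannot actually see a GPU. The short sketch below continues from the snippet above and assumes the TensorFlow 1.x helper tf.test.is_gpu_available():

# Optional check (an assumption, not in the original recipe): if the user asked
# for the GPU but TensorFlow cannot find one, fall back to the CPU so the
# script still runs instead of failing at session time.
if device_name == "/gpu:0" and not tf.test.is_gpu_available():
    print("No GPU visible to TensorFlow, falling back to /cpu:0")
    device_name = "/cpu:0"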
3. Execute the matrix multiplication on either the GPU or the CPU. The key instruction is with tf.device(device_name): it opens a context manager that tells TensorFlow to place the operations defined inside it on the chosen device:
with tf.device(device_name):
    random_matrix = tf.random_uniform(shape=shape, minval=0, maxval=1)
    dot_operation = tf.matmul(random_matrix, tf.transpose(random_matrix))
    sum_operation = tf.reduce_sum(dot_operation)

startTime = datetime.now()
with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as session:
    result = session.run(sum_operation)
    print(result)
4. Print some timing information to verify the difference between the CPU and the GPU runs:
print("Shape:", shape, "Device:", device_name)
print("Time taken:", datetime.now() - startTime)