Let's allocate the compute cluster that will train our model. The following code is fairly standard and is listed here for convenience; the basic idea is to provision a CPU-based, two-node cluster:
# compute target
from azureml.core.compute import ComputeTarget, BatchAiCompute
from azureml.core.compute_target import ComputeTargetException

# choose a name for your cluster
batchai_cluster_name = "traincluster-w3"

try:
    # look for the existing cluster by name
    compute_target = ComputeTarget(workspace=ws,
                                   name=batchai_cluster_name)
    if type(compute_target) is BatchAiCompute:
        print('found compute target {}, just use it.'.format(
            batchai_cluster_name))
    else:
        print('{} exists but it is not a Batch AI cluster. '
              'Please choose a different name.'.format(batchai_cluster_name))
except ComputeTargetException:
    print('creating a new compute target...')
    compute_config = BatchAiCompute.provisioning_configuration(
        vm_size="STANDARD_D2_V2",     # small CPU-based VM
        # vm_priority='lowpriority',  # optional
        autoscale_enabled=True,
        cluster_min_nodes=0,
        cluster_max_nodes=2)

    # create the cluster
    compute_target = ComputeTarget.create(ws, batchai_cluster_name,
                                          compute_config)

    # can poll for a minimum number of nodes and for a specific timeout;
    # if no min node count is provided it uses the scale settings for
    # the cluster
    compute_target.wait_for_completion(show_output=True,
                                       min_node_count=None,
                                       timeout_in_minutes=20)

    # use the 'status' property to get a detailed status for the
    # current cluster
    print(compute_target.status.serialize())
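One detail worth noting in the listing is the check `type(compute_target) is BatchAiCompute`, which matches only that exact class. The more idiomatic `isinstance` also accepts subclasses. The following sketch illustrates the difference with hypothetical stand-in classes (these are not the real azureml types; they simply mirror the inheritance relationship):

```python
# Stand-in classes mirroring the ComputeTarget/BatchAiCompute relationship
# (hypothetical; not the real azureml.core.compute classes).
class ComputeTarget:
    pass

class BatchAiCompute(ComputeTarget):
    pass

target = BatchAiCompute()

# The listing's exact-type check: True only for BatchAiCompute itself.
print(type(target) is BatchAiCompute)      # True
print(type(target) is ComputeTarget)       # False

# isinstance also accepts subclasses, so a BatchAiCompute instance
# still counts as a ComputeTarget.
print(isinstance(target, ComputeTarget))   # True
```

Either form works for the listing's purpose; `isinstance` is simply the more common Python idiom when subclasses should be treated the same as the base class.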
Azure ML has recently introduced the ability to work with FPGA accelerators. For more information, see https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-deploy-fpga-web-service.