The field of machine learning is an ever-changing and advancing landscape. For new and experienced machine learning (ML) practitioners alike, there is always more to learn. This learning and exploring can be done in Azure, marrying new advancements with a sophisticated platform and toolset. Machine learning is not new to Azure, however. Azure Machine Learning Studio has been available since 2015, providing a platform for the development and deployment of ML models from within a drag-and-drop UI. Today, the offerings have advanced and become even more integrated across services in Azure (VMs, AKS, Databricks, Storage, etc.). They provide more ways for developers and data scientists to develop, deploy, and monitor custom ML models with SDKs or by using an integrated drag-and-drop interface that needs very little coding.
The first part of this chapter is a gentle introduction to ML and deep learning, with examples in image analysis and a discussion of how to work with and think about data in the context of ML. Following this, the Data Science Virtual Machine, Jupyter notebooks, Azure ML, and Databricks are explored for data science tasks. Many code samples and walkthroughs are provided.
Introduction to Machine Learning and Deep Learning
At its core, ML is the process of mapping input to output without hardcoded rules. In other words, rather than using rules like “if, else, case, switch,” ML uses systems that have seen and analyzed past data (or even streaming data) to make more “educated” guesses or predictions of future events.
An ML model is produced through a process called training. Training usually includes an evaluation of the model on a part of the original dataset that was not used in training, sometimes called the held-out or test dataset. Once a satisfactory trained model has been produced (according to the evaluation), it is used to gain insights on new data in a process called inference or scoring.
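The train, evaluate, and score sequence can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the toy dataset, the 80/20 split, and the threshold "model" are all hypothetical and stand in for a real dataset and algorithm.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows and hold out a fraction of them for evaluation."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(threshold, rows):
    """Fraction of rows whose label matches the rule 'feature > threshold'."""
    return sum(int(x > threshold) == y for x, y in rows) / len(rows)


# Hypothetical toy data: the label is 1 when the feature is above 5
data = [(x, int(x > 5)) for x in range(20)]
train, test = train_test_split(data)

# "Training": pick the threshold that best fits the training rows
best_threshold = max(range(20), key=lambda t: accuracy(t, train))

# Evaluation on the held-out rows, which were never used for training
test_accuracy = accuracy(best_threshold, test)

# Inference (scoring) on a new, unseen value
prediction = int(12 > best_threshold)
```

The important design point is that `test` never influences the choice of `best_threshold`, so `test_accuracy` is an honest estimate of how the model behaves on new data.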
In supervised learning, labels are provided at the beginning of training a model. Figure 16-1 depicts supervised learning, where data and the labels are provided to produce a model for tasks like classification or regression (see Figure 16-2). For instance, determining whether a patient has retinopathy, a dangerous eye condition, based on images of retinas from healthy patients and patients with this condition, is an example of a classification task. The labels for this task are binary: retinopathy or no retinopathy. Time-series analysis, where future events such as the demand for a product in a grocery store are predicted, is an example of a regression task. The demand for a product could be quantified as the number of items sold in a day. The future demand could then be predicted based on the demand from past days, or historical data. Semi-supervised learning is where a portion of the training data has labels, and the rest does not.
Unsupervised learning aims to understand the inner structure or organization of the input data; thus, the input data does not need to be labeled. One common task is called clustering, where datapoints are grouped into clusters that could answer questions such as how to group people who attend movies, given input features such as age, gender, and income. Here, no label is being predicted; rather, the goal is to find patterns and structure in the data. Based on how the data clusters, new characteristics may be inferred from the age ranges, income ranges, and gender characteristics of the clusters.
Dimensionality reduction is a method to take several features and combine them in such a way that the number of features (dimensions) becomes reduced. An example of dimensionality reduction is taking a dataset such as pixel intensities from commonly sized images of the digits 0 to 9 (each pixel location is a feature or dimension, and each data point is an image). The fact that the image is a 3 or a 7 does not matter in dimensionality reduction; only the pixel intensities at each pixel location for each image. The feature count, also called feature space, is equal to the number of total pixels of the images in this case (e.g., for a set of images with a typical size of 28×28 pixels, they have 784 features each, which equates to 784 dimensions!). Dimensionality reduction reduces the number of dimensions to a smaller number. It is commonly used in conjunction with other types of ML, which, for one, often can help speed up training.
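To make the feature count concrete, the following sketch flattens a 28×28 image into 784 features and then combines neighboring pixels to get fewer features. The block averaging used here is a deliberately crude stand-in chosen for brevity, not the method a practitioner would normally reach for (techniques such as principal component analysis are more typical), and the synthetic image is hypothetical.

```python
def flatten(image):
    """Turn a 2D image (list of rows) into one long list of features."""
    return [pixel for row in image for pixel in row]


def average_pool(image, block=2):
    """Combine each block x block group of pixels into a single averaged value.

    Assumes a square image whose side length is divisible by `block`.
    """
    size = len(image)
    return [
        [
            sum(image[r + dr][c + dc] for dr in range(block) for dc in range(block))
            / block**2
            for c in range(0, size, block)
        ]
        for r in range(0, size, block)
    ]


# Hypothetical 28x28 grayscale image with intensities between 0 and 255
image = [[(r * 28 + c) % 256 for c in range(28)] for r in range(28)]

features = flatten(image)               # one feature per pixel location
reduced = flatten(average_pool(image))  # neighboring pixels combined
```

Here `features` has 784 entries and `reduced` has 196, so a model trained on `reduced` has a quarter as many input dimensions to process.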
In reinforcement learning , a player (called an agent) learns how to optimally operate in an environment based on rewards and penalties assigned to it after taking actions in an iterative manner. For instance, imagine there is a maze that an agent is trying to get through to gain a big reward at the end, maybe a pile of gold or a large sum of points, but along the way, there are traps that take away points. The environment is the configuration or instance of this particular maze.
We begin with the agent at the entry-point to the maze—the initial state or location of the agent. Each turn by the agent through the maze counts as an action. At each turn, there is either a reward (award 1 point due to no traps or dead-ends encountered) or penalty (a trap or dead-end has been encountered—1 point removed). The summation of the points is the reward function in this case that the agent is trying to maximize. If the agent, through taking incremental actions (each turn or step), reaches the pile of gold, the agent has exited the maze successfully, and a new structure or set of conditions is created (new environment). The agent then begins again with some knowledge of how mazes work. After a certain number of rounds through different mazes, the agent has now learned strategies to make it through a maze to get the most points (it has maximized the reward function).
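The reward accounting described above can be sketched as follows. The maze layout, the value of the gold, and the two candidate paths are hypothetical, and the sketch only scores fixed action sequences; an actual reinforcement learning agent would discover good sequences by trial and error rather than being handed them.

```python
# Hypothetical maze: S = start, G = gold, T = trap, # = wall, . = open cell
MAZE = [
    "S.T",
    ".#.",
    "..G",
]
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
GOLD_REWARD = 10  # assumed size of the big reward at the end


def run_episode(actions):
    """Apply the actions from the start cell; return (total_reward, reached_gold)."""
    row, col = 0, 0
    total = 0
    for action in actions:
        d_row, d_col = MOVES[action]
        new_row, new_col = row + d_row, col + d_col
        inside = 0 <= new_row < len(MAZE) and 0 <= new_col < len(MAZE[0])
        if not inside or MAZE[new_row][new_col] == "#":
            total -= 1  # dead end: penalty, and the agent stays where it is
            continue
        row, col = new_row, new_col
        cell = MAZE[row][col]
        if cell == "T":
            total -= 1  # trap: penalty
        elif cell == "G":
            return total + GOLD_REWARD, True
        else:
            total += 1  # safe step: reward
    return total, False


safe_path = ["down", "down", "right", "right"]
risky_path = ["right", "right", "down", "down"]

# The agent's goal is the action sequence with the highest summed reward
best_path = max([safe_path, risky_path], key=lambda path: run_episode(path)[0])
```

Both paths reach the gold, but the risky one passes through a trap and therefore ends with a lower total, which is exactly the signal the agent uses to prefer one strategy over another.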
The examples in the next few sections focus on image analysis, as it is a good way to understand ML and important to many current scenarios. Text/natural-language, time-series, audio, speech, and other types of data are left for your exploration by using the resources provided on the book website at https://harris-soh-copeland-puca.github.io/.
Data Discussion
ML algorithms that operate on other types of data, such as text and speech, also need the data to be represented numerically, as arrays of numbers.
A preprocessing technique called image augmentation adds variation to the dataset by morphing a subset of the images (flipping, blurring, applying geometric transformations, lighting differences, etc.), which helps produce a model that generalizes to more diverse situations, usually with higher confidence. Data augmentation, more broadly, can increase the size of and add variation to a dataset. It is a very useful technique in the context of classical ML and deep learning for tasks of all kinds (image classification, sentiment analysis on text, acoustic event detection, etc.). Other types of preprocessing include extrapolation, tagging, aggregation, imputation, and probability techniques.
In supervised ML, it is extremely important to have a balanced dataset, that is, equal representation from all classes. This prevents class bias in the training process. Along those lines, the data should match the domain of the question that the ML model is meant to answer. For example, to create a text-based system to answer questions from customers regarding how to file and monitor insurance claims for a particular firm, not all insurance-related documents across the Internet should be used to train the model; rather, only insurance claim questions and answers from that particular firm should be used. Also, always ensure that you have the correct permissions to use the dataset to answer data science questions. As a final note, it is common to encounter biased datasets, ones that have inherent, built-in human prejudices. A dataset should be examined carefully, labels or feature characteristics should be checked (e.g., Do number ranges make sense? Are all classes represented equally and fairly?), and exploratory data analysis techniques should be used before using it in ML experiments.
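As a small illustration of image augmentation, the sketch below derives two variants of an image by flipping it and by changing its brightness. The image is represented as nested lists of grayscale values for simplicity; both helper functions and the tiny image are hypothetical, and real projects usually rely on an imaging or deep learning library for this step.

```python
def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def adjust_brightness(image, delta):
    """Shift every pixel by delta, clamped to the valid 0-255 range."""
    return [[min(255, max(0, pixel + delta)) for pixel in row] for row in image]


# Hypothetical 2x3 grayscale image
image = [
    [0, 50, 100],
    [150, 200, 250],
]

# The original plus two augmented variants: three training examples from one
augmented = [image, flip_horizontal(image), adjust_brightness(image, 20)]
```

Each variant keeps the original label, so the dataset grows in size and variety without any new labeling effort.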
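One of the checks mentioned above, whether all classes are represented equally, is quick to run before any training. The following is a minimal sketch; the label counts are made up for illustration and reuse the retinopathy example from earlier in the chapter.

```python
from collections import Counter


def class_balance(labels):
    """Return the fraction of the dataset that each class makes up."""
    counts = Counter(labels)
    return {label: count / len(labels) for label, count in counts.items()}


def imbalance_ratio(labels):
    """Ratio of the largest class to the smallest; 1.0 means perfectly balanced."""
    counts = Counter(labels).values()
    return max(counts) / min(counts)


# Hypothetical labels: far more healthy examples than retinopathy examples
labels = ["retinopathy"] * 10 + ["no retinopathy"] * 90

balance = class_balance(labels)
ratio = imbalance_ratio(labels)
```

A ratio of 9.0, as here, is a prompt to collect more examples of the rare class, to augment it, or to account for the imbalance during training and evaluation.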
Traditional ML
Traditional ML, also called classical ML, generally consists of simpler approaches than deep learning to solving similar data science problems (deep learning is addressed in a subsequent section). Traditional ML models often take less time to train, and the algorithms are generally easier to understand and, thus, to explain to others. The cost is usually much more manual preprocessing to address certain limitations in the algorithms.
Take, for instance, using classical ML to create an object detector. In this example, a window slides across an image incrementally, and at each increment, a classifier determines the types of objects in it. Object detection consists of two parts: localization (find the object; i.e., slide the window) and classification (name the object; i.e., run that portion of the image through a classification model).
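The two parts of this detector, localization by sliding a window and classification of each window, can be sketched as follows. The brightness-based `classify` function is a hypothetical stand-in for a trained classifier, and the window size, stride, and image are assumptions chosen to keep the example small.

```python
def sliding_windows(height, width, window, stride):
    """Yield the (top, left) corner of every window position (localization)."""
    for top in range(0, height - window + 1, stride):
        for left in range(0, width - window + 1, stride):
            yield top, left


def detect(image, window, stride, classify):
    """Run the classifier on each window and collect the positive detections."""
    detections = []
    for top, left in sliding_windows(len(image), len(image[0]), window, stride):
        patch = [row[left:left + window] for row in image[top:top + window]]
        label = classify(patch)  # classification step
        if label is not None:
            detections.append((top, left, label))
    return detections


def classify(patch):
    """Hypothetical stand-in classifier: call a patch 'bright' if its mean is high."""
    pixels = [pixel for row in patch for pixel in row]
    return "bright" if sum(pixels) / len(pixels) > 128 else None


# Hypothetical 4x4 grayscale image with a bright object in the lower-right corner
image = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]

detections = detect(image, window=2, stride=2, classify=classify)
```

Swapping `classify` for a trained model changes what is recognized without touching the localization loop, which is why the two parts are usually described separately.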
Support vector classifiers (SVCs), which are support vector machines (SVMs) used for classification, are known to be good at dealing with larger feature spaces such as images with a height and width of 64 pixels (equating to 4096 features). The SVM is one of the dominant classification algorithms among supervised learning techniques.
For a Python Jupyter notebook code sample of Histogram of Oriented Gradients with an SVM for object detection, go to https://github.com/harris-soh-copeland-puca/SampleCode/blob/master/Ch16/Ch16.Extra_Simple_Object_Detection_HOG.ipynb. See documentation for more details on the HOG algorithm at https://scikit-image.org/docs/dev/auto_examples/features_detection/plot_hog.html and SVMs at https://scikit-learn.org/stable/modules/svm.html.
In the supervised learning case, as shown here, the ML model is trained by comparing the model output (the prediction) to the ground-truth (real) value and measuring how far apart they are. This measurement, computed with a mathematical equation called a loss function, serves as feedback to the model that it is either getting better or worse at the task on which it is being trained. After some time, an ML model should perform better at the task as it converges on small loss values, an indication that the guesses are improving over passes through the training data (called epochs) in the training process. Other types of ML models are trained similarly (iteratively and with feedback) such that a function gets minimized or maximized, which helps lead the training process to create a good model.
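The feedback loop just described can be seen in a tiny example. The sketch below fits a one-parameter model, prediction = w × x, using mean squared error as the loss function and gradient descent as the update rule. The data, the learning rate, and the number of epochs are assumptions picked for illustration; the choice of mean squared error and gradient descent is also an assumption, since the text does not name a specific loss or optimizer.

```python
# Hypothetical training data where the ground truth is y = 2 * x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]


def mse(w):
    """Loss function: mean squared distance between predictions and ground truth."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient(w):
    """Derivative of the loss with respect to w; its sign says which way to move."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


w = 0.0               # initial guess for the model parameter
learning_rate = 0.01  # how far to move on each update
losses = []

for epoch in range(100):
    losses.append(mse(w))               # feedback: how wrong is the model now?
    w -= learning_rate * gradient(w)    # adjust the parameter to reduce the loss
```

After training, `w` is very close to 2 and the recorded `losses` shrink toward zero, which is the convergence on small loss values that the paragraph describes.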
Neural Networks
There are three standard classes of neural network architectures, which are briefly touched upon here: feedforward, convolutional, and recurrent neural networks. Just as with traditional ML, a model is trained by looking at the output (e.g., a class label or bounding box) at each step of the training process and comparing it to a ground-truth value.
Feedforward neural networks have a simple structure compared to other types of neural networks. They are often used for more manageable tasks like the classification of large amounts of tabular data. Feedforward networks take input and process it through one or more layers of nodes, stacked like a multilayered cake; with many layers, the network is said to be deep. Each layer can have as many nodes as the ML practitioner wishes to add, but if too many nodes are added, there is a chance that the network overfits, that is, becomes less widely applicable and understands only what it has been trained on. The layers can be fully connected, or dense, in which case every node in one layer is connected to every node in the next. A typical example of a feedforward network is the multilayer perceptron.
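A forward pass through a small fully connected network looks like the following sketch. The weights are hand-picked, hypothetical values (in practice they are learned during training), and the ReLU and softmax functions are common choices assumed here for illustration rather than anything the chapter prescribes.

```python
import math


def dense(inputs, weights, biases):
    """Fully connected layer: every input feeds every node in the layer."""
    return [
        sum(w * x for w, x in zip(node_weights, inputs)) + bias
        for node_weights, bias in zip(weights, biases)
    ]


def relu(values):
    """Activation that passes positive values and zeroes out negative ones."""
    return [max(0.0, value) for value in values]


def softmax(values):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(values)
    exps = [math.exp(value - largest) for value in values]
    total = sum(exps)
    return [value / total for value in exps]


# Hypothetical weights: 3 input features -> 2 hidden nodes -> 2 classes
W1 = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
b1 = [0.0, 0.1]
W2 = [[1.0, -1.0], [-1.0, 1.0]]
b2 = [0.0, 0.0]


def forward(features):
    """Pass the features through the hidden layer, then the output layer."""
    hidden = relu(dense(features, W1, b1))
    return softmax(dense(hidden, W2, b2))


probabilities = forward([1.0, 2.0, 3.0])
predicted_class = probabilities.index(max(probabilities))
```

Adding nodes means adding rows to a weight matrix, and adding layers means chaining more `dense` calls, which is how these networks grow wide or deep.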
Convolutional neural networks (CNNs) are made up of layers, like the feedforward neural networks, but they are a bit more complicated. These neural networks apply filters to images in a way that is akin to scanning the image incrementally with a flashlight. Unlike an ordinary flashlight, however, this one gets smarter as it scans an image and remembers parts of the image it has seen before, storing that information in a matrix of numbers. That matrix, in turn, is scanned again for more information in the next step or layer of the network. Often, different types of layers (pooling layers, for instance) are applied to help the CNN generalize better and speed up calculations. The first layers of CNNs are generally capable of picking up more coarse patterns; for images, this could equate to edges or circles. In subsequent layers, the network may learn more complex patterns such as ears or eyes and eventually, in the final convolutional layers, entire faces. Usually, a CNN ends with a fully connected layer or two, which helps decide what class should be assigned to the input data. A typical example of a CNN is a ResNet-50 network, which is usually used for image classification and has 50 layers.
Recurrent neural networks (RNNs) are commonly trained for tasks around text analysis (sentiment analysis, named entity recognition, summarization, question answering, etc.). The structure is very different from a CNN. In RNNs, there is the concept of the cell. Inside this cell are different types of neural network layers. In the simplest case, the cell takes two inputs: the text data in numerical form (e.g., as a word embedding) and a hidden state. More complex RNNs can have more input and output types, but all in all, the cell passes information from one instance of itself to another instance, which is what makes it recurrent. This process continues until the architecture is out of input data. RNNs have many variations, such as many-to-one, which is where an entire piece of text gives just one score, as is the case with sentiment analysis. Another form of an RNN is many-to-many, for which one example is machine translation, where one language is translated to another. A typical example of an RNN is the long short-term memory algorithm, or LSTM.
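The scanning and pooling steps can be sketched in plain Python. The filter below is a hand-set, hypothetical vertical-edge detector (a CNN learns its filter values during training), and the 6×6 image is made up so that the edge is easy to see in the output.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and record one response per position.

    This is the sliding dot product that deep learning frameworks call a
    convolution (strictly speaking, a cross-correlation).
    """
    k = len(kernel)
    out_height = len(image) - k + 1
    out_width = len(image[0]) - k + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(k) for j in range(k))
            for c in range(out_width)
        ]
        for r in range(out_height)
    ]


def max_pool(feature_map, size=2):
    """Keep only the strongest response in each size x size block."""
    return [
        [
            max(feature_map[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]


# Hypothetical image: dark on the left, bright on the right (a vertical edge)
image = [[0, 0, 0, 9, 9, 9] for _ in range(6)]

# Hand-set filter that responds where brightness increases from left to right
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, kernel)  # 4x4 matrix of filter responses
pooled = max_pool(feature_map)           # 2x2 summary of the strongest responses
```

The feature map is large only in the columns where the edge sits, and pooling shrinks it while keeping that information, which is what lets later layers work on a smaller matrix.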
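The way a cell hands its hidden state to the next instance of itself can be sketched with scalars. Everything numeric here is hypothetical: real cells operate on vectors (such as word embeddings), their weights are learned, and an LSTM cell contains several more internal layers than this single tanh.

```python
import math


def rnn_cell(x, hidden, w_x=0.5, w_h=0.8, bias=0.0):
    """One step: combine the current input with the previous hidden state."""
    return math.tanh(w_x * x + w_h * hidden + bias)


def many_to_many(sequence):
    """Return the hidden state after every input (one output per step)."""
    hidden = 0.0
    outputs = []
    for x in sequence:
        hidden = rnn_cell(x, hidden)  # the state is passed to the next step
        outputs.append(hidden)
    return outputs


def many_to_one(sequence):
    """Return only the final hidden state, e.g., a single sentiment score."""
    outputs = many_to_many(sequence)
    return outputs[-1] if outputs else 0.0


# Hypothetical numeric encodings of two short pieces of text
score_a = many_to_one([1.0, -1.0])
score_b = many_to_one([-1.0, 1.0])
```

The two sequences contain the same values in a different order and produce different scores, which shows that the hidden state carries information about what came earlier.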
There are other types of neural networks, such as deep reinforcement learning networks (trains models that can play the complex game of Go, for instance) and generative adversarial networks (trains models that can perform style transfer, for instance), which are left to you to explore further (resources are on the book’s website).
To learn more about the differences between ML and deep learning, go to https://docs.microsoft.com/en-us/azure/machine-learning/concept-deep-learning-vs-machine-learning. Other helpful and in-depth resources on deep learning are on our website at https://harris-soh-copeland-puca.github.io.
Transfer Learning
Transfer learning is the process of taking a trained model (pre-trained) and fine-tuning it on new data with a different set, a subset, or entirely new class labels. The pre-trained or base model is trained on a high-quality, large dataset, such as ImageNet (1000 image classes) or the English Wikipedia corpus (over 3.6 billion words). In the fine-tuning phase of learning, less input data is needed as the model has general knowledge from the original, large dataset, upon which it was previously trained.
Transfer learning is commonly used across all types of ML tasks, from vision to language and speech. Additionally, transfer learning is used when there isn’t enough data to train a network from scratch, or there’s a pre-trained network that already exists for a similar task that has been trained on a very large dataset (see https://builtin.com/data-science/transfer-learning for more on this topic).
The Data Science Process
Just like creating and improving upon an application through cycles of development and productionizing, data science is an iterative process, as shown in Figure 16-8. The iterations of data science are a little different from those of application development, however, in the types of steps and resources needed. In fact, application development can very well include a data science process.
The most important part of data science is asking the right question. The question usually comes from a business need or scientific study. A flawed question would be, “How much money will I make next month?” A better question is, “By how many points will my Microsoft and Apple stocks increase over the next 28 days, and what will the mean predicted gain be at that time?” Almost equally important is having data that is relevant to the question at hand.
After data acquisition, a good data science experiment begins with an exploratory data analysis step, which includes visualizing data such as the distribution of values or checking for outliers that could indicate mistakes in labels (which can happen on occasion) or calculations.
Usually, data is preprocessed in a certain way for the ML model to be trained, such as removing NaNs (for “not a number”), one type of data cleaning step. Data transformations are important, such as shaping an image from a 2D array to a long 1D array or making an RGB image grayscale. Another transformation type in the context of a time series experiment is log transforming the signal. It is said that data science is 80%–90% data processing and exploring, so becoming comfortable with data processing languages (e.g., SQL, Python, R) and ETL tools is very beneficial. Once the data is clean and transformed in a way that the algorithm can take as quality input, a model can be trained.
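Three of the steps named above (removing NaNs, converting RGB to grayscale, and log transforming a signal) are sketched below in plain Python. The helper names are hypothetical, the grayscale conversion uses a simple channel average as an assumption (weighted formulas are also common), and the log transform uses log(1 + x) so that zero values are handled.

```python
import math


def drop_nans(values):
    """Data cleaning: remove entries that are NaN ('not a number')."""
    return [value for value in values if not math.isnan(value)]


def to_grayscale(rgb_image):
    """Transformation: collapse each (r, g, b) pixel into one intensity value."""
    return [[sum(pixel) / 3 for pixel in row] for row in rgb_image]


def log_transform(signal):
    """Transformation: compress large values in a time series with log(1 + x)."""
    return [math.log1p(value) for value in signal]


# Hypothetical raw inputs
readings = [1.0, float("nan"), 3.0]
rgb_image = [[(30, 60, 90), (0, 0, 0)]]
daily_sales = [0.0, 9.0, 99.0]

clean_readings = drop_nans(readings)
gray_image = to_grayscale(rgb_image)
log_sales = log_transform(daily_sales)
```

Keeping each step as its own small function makes it easy to reorder, test, or drop steps when the training results send you back to the cleaning stage.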
During the training process, different models may be tried, different algorithm parameters may be iterated over (hyperparameter tuning), and different ways of slicing and shuffling the dataset may be explored. Sometimes there is a need to go back to the cleaning and transformation data steps.
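Hyperparameter tuning can be as simple as a loop over candidate values. The sketch below tries three learning rates on a toy one-parameter model and keeps the one with the lowest loss. The data, the candidate values, and the epoch count are assumptions, and for brevity the loss is measured on the training data; in practice the comparison would be made on held-out data.

```python
# Hypothetical training data where the ground truth is y = 2 * x
XS = [1.0, 2.0, 3.0, 4.0]
YS = [2.0, 4.0, 6.0, 8.0]


def mse(w):
    """Mean squared error of the model 'prediction = w * x'."""
    return sum((w * x - y) ** 2 for x, y in zip(XS, YS)) / len(XS)


def train(learning_rate, epochs=20):
    """Train with gradient descent and return the learned parameter."""
    w = 0.0
    for _ in range(epochs):
        gradient = sum(2 * (w * x - y) * x for x, y in zip(XS, YS)) / len(XS)
        w -= learning_rate * gradient
    return w


# Hyperparameter tuning: the same training run, repeated per candidate value
candidates = [0.001, 0.01, 0.05]
scores = {learning_rate: mse(train(learning_rate)) for learning_rate in candidates}
best_learning_rate = min(scores, key=scores.get)
```

With only 20 epochs available, the smallest learning rate barely moves the parameter, so the largest candidate wins here; a different epoch budget or dataset could change that outcome, which is why the search is rerun as the experiment evolves.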
In summary, the data science process allows exploration and analysis of data and machine learning approaches in an iterative manner with improvements and better understanding along the way.
Prerequisites for Becoming a Successful Data Scientist
Git and GitHub for version control
Python and/or R proficiency
Jupyter notebook and a script editor (e.g., VS Code) experience
Knowledge of how to preprocess data (image, text, time series, etc.)
An understanding of classical ML and neural networks (completed!)
Overview of the Data Science Virtual Machine
Data science tooling and frameworks can take days or even weeks to set up on a machine. Azure offers a set of virtual machine SKUs that have most of the popular tools that a data scientist needs to train, test, and deploy ML applications. This includes all the common classical and deep learning frameworks like LightGBM and PyTorch, the popular tools to write Python and R code like Jupyter notebooks and RStudio, as well as appropriate hardware and libraries for GPU acceleration of training and inference code. The Azure Machine Learning Python SDK also comes installed.
There are three different operating systems supported by the Data Science Virtual Machine (DSVM): Windows Server (2016 and 2019), Ubuntu (16.04 LTS and 18.04 LTS), and CentOS (7.4) (see https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/overview documentation for more details). In general, data science tools are more commonly used in conjunction with Unix-based systems, so here, the Ubuntu 16.04 version of the DSVM is shown in more detail.
Deep learning frameworks, such as PyTorch, CNTK, and TensorFlow
The NVIDIA driver for GPU acceleration (when choosing N-series VM)
TensorFlow Serving, MXNet Model Server, and TensorRT for test inferencing
CRAN R
Anaconda Python
JupyterHub with sample notebooks
Spark local with PySpark and SparkR Jupyter kernels
Azure command-line interface
Visual Studio Code, IntelliJ IDEA, PyCharm and Atom
H2O, Deep Water, and Sparkling Water
Julia
Vowpal Wabbit for online learning
XGBoost for gradient boosting
SQL Server
Intel Math Kernel Library
A Jupyter Notebook Overview
Quick and iterative prototyping
Easy annotation that is presentation quality
Shareable code and instructions for collaboration
Notes and code combined in one place
The ability to run anywhere the data science language (e.g., Python) and a browser are installed
Integration into tools such as Visual Studio Code
Jupyter notebooks offer many different kernels for different languages and are predominantly used with data science languages like Python, R, F#, and Scala (a kernel is what interfaces with the programming language). Simply put, they provide a method of both annotating one’s work (in the Markdown language or plain text) as well as writing code in blocks or small sections. The code blocks are run one at a time, which allows quick and easy prototyping and experimentation. Once a user is satisfied with the code, it may be exported to various formats, one of which is a script file (e.g., the Python .py file). The script file may then be run as one entire program at that point if so desired.
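Under the hood, a notebook file is a JSON document containing a list of cells, each marked as Markdown or code. The sketch below builds a tiny notebook in memory and converts it to script text, keeping annotations as comments. It is only meant to show the idea of the export; the `notebook_to_script` helper is hypothetical, and Jupyter's own export feature is the tool to use in practice.

```python
import json

# A minimal, hypothetical notebook with one Markdown cell and one code cell
notebook_json = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Read an image\n"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["x = 1\n", "print(x)"],
        },
    ],
})


def notebook_to_script(text):
    """Concatenate code cells; keep Markdown cells as comment lines."""
    notebook = json.loads(text)
    chunks = []
    for cell in notebook["cells"]:
        source = cell["source"]
        if isinstance(source, list):  # source may be a list of lines
            source = "".join(source)
        if cell["cell_type"] == "code":
            chunks.append(source)
        elif cell["cell_type"] == "markdown":
            chunks.append("\n".join("# " + line for line in source.splitlines()))
    return "\n\n".join(chunks) + "\n"


script = notebook_to_script(notebook_json)
```

The resulting `script` text could be saved as a .py file and run as one program, which mirrors the prototype-then-export workflow described above.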
Figure 16-9 depicts a Jupyter notebook with free-form text annotation at the beginning (white background) followed by a Python code block (gray background). The example comes from a public source created by the Azure ML team.
Notebooks are used by the Azure ML team to demonstrate how to use its functionality with annotated examples. The Azure ML team maintains an official collection of Jupyter notebooks on GitHub at https://github.com/Azure/MachineLearningNotebooks.
Jupyter usage is explained in the next section, as well as how to write a simple Python program.
Hands-on with the Data Science Virtual Machine
- 1. Go to the Azure portal and search for Data Science Virtual Machine. Select the Ubuntu version.
- 2. When provisioning, ensure you select Password as the Authentication type in the Administrator account section on the Basics tab (the first page after clicking Create). Also, it is very important that the Username for the account is all lowercase due to a limitation with JupyterHub.
- 3. Once provisioning has completed, find the public IP address listed on the Overview pane of the DSVM resource in the portal and note it down. Firefox and Chrome browsers are best for use with JupyterHub. Direct a browser window to the following URL: https://THE_PUBLIC_IP_ADDRESS:8000. IMPORTANT: A certificate error may pop up; if so, simply continue to proceed to the site (this is a known issue). Next, the login screen shown in Figure 16-12 should appear. Provide your administrator username and password to proceed to the Jupyter notebooks. JupyterHub is a multitenant system that can support multiple users, but by default, it is set up for only one user. More may be added later with Unix commands.
- 4. Open a new Python 3 Azure ML notebook by clicking the drop-down menu on the upper right and selecting Python 3.6 – Azure ML or the latest Python version with Azure ML.
- 5. Git clone the https://github.com/harris-soh-copeland-puca/SampleCode repo by typing the following on the command line.
git clone https://github.com/harris-soh-copeland-puca/SampleCode.git
- 6. Upload the Ch16/Ch16.01_Image_Manipulation.ipynb file to the DSVM through JupyterHub, confirm the upload, and then click the notebook file to open it.
- 7. Follow along with the notebook and read through the comments to understand how an image is read in Python. To run a code cell, use the Run button at the top or use the keyboard shortcut Shift+Enter when the cursor is in a cell. Go through each code cell of the notebook and try to understand what is going on. In addition, the notebook contains code cells on how to separate the color channels (RGB) of the image and plot them.
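To preview what separating color channels means before opening the notebook, here is a plain Python sketch that does not depend on any imaging library. It is not the notebook's code; the `split_channels` helper and the tiny image of (r, g, b) tuples are hypothetical.

```python
def split_channels(rgb_image):
    """Return the (red, green, blue) planes of an image of (r, g, b) pixels."""
    return tuple(
        [[pixel[channel] for pixel in row] for row in rgb_image]
        for channel in range(3)
    )


# Hypothetical 2x2 color image
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (10, 20, 30)],
]

red, green, blue = split_channels(image)
```

Each resulting plane is a 2D array of intensities with the same height and width as the original image, which is why each channel can be plotted as its own grayscale picture.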
The Azure DSVM is not your only option, however, for working in the data science domain. For those getting started, the following article describes other options for setup: https://rheartpython.github.io/navigating-ml/setup/.
Overview of Azure Machine Learning
When training a model, think of Azure ML as your experimentation and deployment platform with integration into common frameworks like Scikit-Learn for classical ML or PyTorch and TensorFlow for deep learning. The ML practitioner has the choice to do their experimentation locally, on cloud VMs (like the DSVM) or Spark clusters like Databricks.
When the practitioner is satisfied with the model as measured by an appropriate metric (or “ruler” like accuracy), they may validate it on new, unseen data (part of our “data science process”). At this point, if the results are satisfactory, the practitioner or DevOps professional may deploy the model.
In Azure ML, there are a few different deployment paradigms. One is a cloud service and another an IoT edge device (e.g., an ARM32 IoT device like the Vision AI DevKit https://azure.github.io/Vision-AI-DevKit-Pages/). The cloud service can further be consumed by a Power BI dashboard.
Finally, a model may be monitored using Azure ML integrated with Azure Monitor for things like the health of the service. If it is an IoT Edge deployment, the device may be monitored with Azure IoT Hub from which messages and logs may then be analyzed.
User roles
Compute targets
Pipelines
Datasets
Registered models
Deployment endpoints
The underlying Azure resources include an Azure Storage account, Azure Container Registry, Azure Key Vault, and Azure Application Insights. These resources are provisioned at the same time an Azure ML workspace is created.
Azure ML currently has SDKs in the R and Python programming languages. It can also be utilized with an interface called Designer for more of a drag-and-drop, no-code experience (however, in Designer, you can also write custom code as modules). In addition to the SDKs and Designer, Azure ML has a REST API and a CLI extension to the main Azure CLI that integrates directly with the Azure ML workspace from the command line.
Almost any Python or R training code may be converted to Azure ML-friendly code, which allows the use of the Azure ML cloud compute to train models and services like Azure Kubernetes Service for deploying models. Not only does Azure ML have linked compute for training/deploying, but it also has mechanisms for taking a snapshot and versioning the code to train the model. Also, versioning the model itself is done by a process called registering a model. This is part of model management. Once registered, a model is associated with the Azure ML workspace for later retrieval simply by using the SDK.
Additionally, Azure ML provides automated ML (Auto ML) capabilities. Three types of tasks are automated: preprocessing (e.g., featurizing), choosing algorithms that are appropriate to use, and hyper-parameter tuning. Currently, regression, classification, and time-series forecasting are supported in Azure ML’s Auto ML capabilities.
Hands-on with Azure Machine Learning: Training a Model
- 1. Provision an Azure ML workspace in the Azure portal, with the Azure ML SDK or CLI.
In the portal, search for machine learning, select Machine Learning, and click Create when prompted. Fill out the information for provisioning, selecting Basic as the workspace edition (it can be changed later). Note, the Basic edition does not support Auto ML and a few other features. Create this resource.
- 2. Start a Jupyter notebook system. Choices include
- a. JupyterHub on the DSVM
- b. Azure Notebooks (best if attached to a DSVM) at https://notebooks.azure.com
- c. Locally, if Python 3 and Jupyter are installed. Start the Jupyter notebook locally with the following command on the command line: jupyter notebook
- a. The kernel may need to be set to the correct Python version (on the DSVM, use Python 3.6 – Azure ML or the latest Python version with Azure ML).
- 3. In JupyterHub, open a terminal by selecting New ➤ Terminal.
- 4. Change directory into the notebooks folder and git clone the following repo, if not done already, by typing the following into the terminal window.
cd notebooks
git clone https://github.com/harris-soh-copeland-puca/SampleCode.git
There should now be a Ch16 folder available.
- 5. Back in the main window of Jupyter (to get there, click the Jupyter logo in the upper-left corner), navigate to the Ch16 folder, and open Ch16.02a_Train_Digits_Classifer_Sklearn_AzureML.ipynb. The Python 3.6 – Azure ML kernel, if on the DSVM, should be chosen if prompted. If a switch of kernels is needed, the Kernel option in the menu is the correct place to go, as shown in Figure 16-22.
- 6. From the Ch16 folder, upload the config.json configuration file downloaded earlier so that the notebook has access.
- 7. Read through the introduction and then run the Import packages cell (a shortcut to run the code in a cell is to press Shift+Enter).
The notebook trains a classical ML model (logistic regression) on the MNIST handwritten digits dataset that is now a part of Azure Open Datasets and accessible through the Python SDK. The original dataset is also accessible at http://yann.lecun.com/exdb/mnist/.
- 8. Run the Connect to workspace cell and log in to Azure by following the interactive authentication instructions.
- 9. Run the Create experiment cell to create an Experiment for all the training runs for the scenario.
- 10. Run the Create or Attach existing compute resources cell to create a managed Azure ML Compute service for training ML models in this Experiment or others. This may take some time because it is provisioning a cluster of Linux VMs on Azure with the specification shown in Figure 16-23.
- 11. Run Display some sample images to view a subset of the MNIST handwritten digits dataset. Here are ten labeled sample images from this dataset.
- 12. Run the Create a directory and Create a training script cells. Note that the Jupyter cell magic %%writefile $script_folder/train.py appears at the beginning of the cell. This tells Jupyter to write a physical file to the path specified. Look in the folder with the Jupyter notebook to ensure that train.py was written to the sklearn-mnist subfolder. Notice the code that takes care of connecting the training process to Azure ML (Run context) as well as the code that trains a logistic regression classifier within this cell.
- 13. Run all the cells in the Create an estimator section to define the Python environment and create the Estimator, a wrapper object that understands the compute, training script, script folder, parameters for the training script, and Python environment.
- 14. Run the Submit the job to the cluster cell to begin training the model on the MNIST digits dataset. This creates a Docker image, scales the cluster, runs the training job, saves any created assets to an output folder, and sends them to the workspace upon completion.
- 15. Run the Jupyter widget cell to see the detailed progress of the job. The first time the job is submitted, it takes up to 10 minutes as it needs to create the Docker image and scale the cluster to the right number of nodes. After running this cell, a link to the Azure portal is provided. Click this link to see how the run appears in the Azure portal for monitoring. If not using a Jupyter notebook, but rather scripts, it is helpful to know about this view.
- 16. Run Get log results upon completion to view the logs; this also prevents other code from running until the run is complete.
- 17. Run Display run results to see the accuracy of the model on the test dataset.
- 18. The model is now available in the output folder as a Python pickle file. It is associated with the run in that workspace. Run the Register model cells to register the model and make it accessible programmatically (useful in future scripts that need to access/use this model). Note that the model now has a name and a version number.
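For readers working from scripts rather than the notebook, steps 8 through 18 can be condensed into something like the following sketch. It is not the notebook's code. It assumes the version 1 Azure ML Python SDK (the azureml-core and azureml-train-core packages) and a config.json file in the working directory, and the experiment name, cluster name, VM size, node count, and model file path are all hypothetical placeholders. The SDK imports sit inside the function so that the sketch can be loaded on a machine without the SDK; calling the function requires the SDK and an Azure subscription.

```python
def train_on_azure_ml(script_folder="sklearn-mnist", cluster_name="cpu-cluster"):
    """Condensed, hypothetical version of the notebook's training steps."""
    # Assumption: Azure ML SDK v1 is installed (azureml-core, azureml-train-core)
    from azureml.core import Experiment, Workspace
    from azureml.core.compute import AmlCompute, ComputeTarget
    from azureml.train.estimator import Estimator

    # Connect to the workspace described by config.json (interactive login)
    workspace = Workspace.from_config()

    # Create an experiment to group the training runs
    experiment = Experiment(workspace=workspace, name="sklearn-mnist")

    # Create a managed compute cluster (VM size and node count are assumptions)
    compute_config = AmlCompute.provisioning_configuration(
        vm_size="STANDARD_D2_V2", max_nodes=4
    )
    compute_target = ComputeTarget.create(workspace, cluster_name, compute_config)
    compute_target.wait_for_completion(show_output=True)

    # Wrap the compute, script folder, training script, and Python environment
    estimator = Estimator(
        source_directory=script_folder,
        entry_script="train.py",
        compute_target=compute_target,
        conda_packages=["scikit-learn"],
    )

    # Submit the job and block until it finishes, streaming the logs
    run = experiment.submit(estimator)
    run.wait_for_completion(show_output=True)
    print(run.get_metrics())

    # Register the trained model so that it gets a name and a version number
    return run.register_model(
        model_name="sklearn_mnist",
        model_path="outputs/sklearn_mnist_model.pkl",  # hypothetical file name
    )
```

Wrapping the steps in one function is a design convenience for this sketch; in the notebook, each step lives in its own cell so that it can be inspected and rerun independently.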
Hands-on with Azure Machine Learning: Deploying a Model
- 1.
In the main window of Jupyter (to get there, click on the Jupyter logo in the upper left corner), navigate to the Ch16 folder, again, and open Ch16.02b_Deploy_Digits_Classifer_Sklearn_AzureML.ipynb. The Python 3.6 – Azure ML kernel, if on the DSVM, should be chosen if prompted.
- 2.
Run the Set up the environment cells to import the packages and retrieve the registered model (physically downloads it with the SDK) that was trained in part 1.
- 3. Run the cells in the Test model locally section to download, load, and predict the test data locally before the service is created to ensure the code works as expected. If there are errors, try restarting the kernel and rerunning all cells up to this point. (Restart a kernel by going in the menu to Kernel ➤ Restart & Clear Output.) After running the Examine the confusion matrix section, the plot shown in Figure 16-26 should be visible.
- 4.
Run the cells in the Deploy as a web service section to create the scoring (prediction) script and the environment file for scoring, both of which are physical files. Create the deployment configuration for Azure Container Instances (ACI), and deploy the model as a service in the cloud. This may take a few minutes because it creates a Docker image with the necessary files and packages and then pulls it down to ACI. Take note of the scoring web service’s HTTP endpoint for REST calls. At this point, any language that can make REST calls may be used, given the proper input data format and header.
- 5.
Run the cells in the Test deployed service section to call the web service with the SDK and with a raw HTTP request.
- 6.
To conserve Azure resources, delete the service when it is no longer needed, as shown in the last code cell. This deletes only the service, not the workspace.
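The scoring script and the raw HTTP request from the steps above can be sketched as follows. Azure ML scoring scripts conventionally define an `init()` function that loads the model once and a `run()` function that handles each request; the model here is a hypothetical stand-in, and the `data` key in the JSON body and the endpoint URL are assumptions that must match whatever the deployed scoring script actually expects. The request is built but not sent, so the sketch runs without a deployed service.

```python
import json
import urllib.request

model = None  # populated by init(), as in a typical scoring script


def init():
    """Load the model once when the service starts (stand-in model here)."""
    global model

    def argmax_model(rows):
        # Hypothetical model: predicts the index of the largest input value.
        return [row.index(max(row)) for row in rows]

    model = argmax_model


def run(raw_data):
    """Parse the JSON request body, predict, and return a JSON-friendly list."""
    rows = json.loads(raw_data)["data"]
    return model(rows)


def build_scoring_request(scoring_uri, rows):
    """Build (but do not send) an HTTP POST request for the web service."""
    body = json.dumps({"data": rows}).encode("utf-8")
    return urllib.request.Request(
        scoring_uri,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Local check of the scoring logic before any deployment.
init()
payload = json.dumps({"data": [[0.1, 0.7, 0.2], [0.9, 0.05, 0.05]]})
local_predictions = run(payload)

# Placeholder endpoint; replace it with the URI reported after deployment.
request = build_scoring_request("http://example.invalid/score", [[0.1, 0.7, 0.2]])
# To call a real service: urllib.request.urlopen(request).read()
```

Testing `run()` locally with the same JSON shape that the client sends catches format mismatches before the slower deployment step.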
Use Case: Image Classification with a Deep Neural Network and Azure Machine Learning
In this section, image classification with a neural network is discussed. In the sample code, the PyTorch deep learning framework is used (https://pytorch.org).
Hands-on with Azure Machine Learning and PyTorch
- 1. Start a Jupyter notebook system. Some of the choices include the following.
JupyterHub on the DSVM
Azure Notebooks (https://notebooks.azure.com; best if attached to a DSVM)
A local installation, if Python 3 and Jupyter are installed. Start the Jupyter notebook locally with the following command on the command line.
jupyter notebook
- 2.
In JupyterHub, open a terminal by selecting New ➤ Terminal.
- 3. Change directory into the notebooks folder and clone, if not already done, the SampleCode repository by typing the following into the terminal window.
cd notebooks
git clone https://github.com/harris-soh-copeland-puca/SampleCode.git
There should now be a Ch16 folder available.
- 4.
Navigate to the Ch16 folder and open the Ch16.03_Train_Behavior_Classifier_PyTorch_AzureML.ipynb notebook. This notebook is based on an image classification tutorial from the PyTorch public documentation (https://pytorch.org/tutorials/intermediate/quantized_transfer_learning_tutorial.html). It uses transfer learning, in which a model pretrained on a large dataset is adapted to a new task, so a smaller dataset is enough to train it. Ensure the config.json for the workspace is also in the Ch16 folder.
- 5.
Run each cell in the notebook as was done in the “Hands-on with Azure Machine Learning: Training a Model” section.
- 6.
In the Evaluation section, the accuracy is reported. Even if it is low, adding more data to this training experiment generally improves the accuracy. Changing the training parameters, a process called hyperparameter tuning, can also help greatly.
- 7.
Check the workspace in the Azure portal for the Experiment, Run, and registered Model. A “Completed” status for the Run indicates success.
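Hyperparameter tuning, mentioned in the Evaluation step, can be sketched as a simple grid search. The `validation_accuracy` function below is a hypothetical stand-in with a made-up formula; in practice it would train the network with the given settings and return the accuracy measured on a validation set. The parameter names and value ranges are also assumptions chosen for illustration.

```python
import itertools


def validation_accuracy(learning_rate, batch_size):
    """Hypothetical stand-in for training a model and scoring it.

    A real version would train with these settings and return the
    accuracy measured on a validation set.
    """
    # Toy surface that peaks at learning_rate=0.01 and batch_size=32.
    lr_penalty = abs(learning_rate - 0.01) * 10
    batch_penalty = abs(batch_size - 32) / 256
    return 0.9 - lr_penalty - batch_penalty


search_space = {
    "learning_rate": [0.001, 0.01, 0.1],
    "batch_size": [16, 32, 64],
}

# Try every combination of settings and record its score.
results = []
for lr, bs in itertools.product(*search_space.values()):
    results.append(((lr, bs), validation_accuracy(lr, bs)))

# Keep the combination with the highest validation accuracy.
best_params, best_score = max(results, key=lambda item: item[1])
```

Grid search is the simplest strategy. Because every combination requires a full training run, random or adaptive sampling is often preferred when the search space is large.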
The next step is to create a scoring script and deploy it. An example is at https://github.com/harris-soh-copeland-puca/Azure-AI-Camp/blob/master/day2/1.1.ImageClassificationAmlCompute/Deploy-PyTorch-AzureML-Compute.ipynb.
IoT Devices and the Intelligent Edge
The Internet of Things, or IoT, refers to an Internet-connected ecosystem of computing devices embedded into everyday objects that continuously send and receive different types of data to and from other devices or the cloud. Examples of IoT devices are smartwatches, smart thermostats, network/Internet-connected cameras that monitor front doors, and temperature sensors that send data to alerting systems through the network. Devices may be controlled or monitored from a cell phone, for instance, or an onsite computer system. They may be Internet-connected and monitored through a system like a Power BI dashboard in Azure. Some devices connect to the Internet only occasionally, and some never connect at all. Azure still helps IoT developers build entirely disconnected intelligent device systems.
IoT Edge modules are Docker containers with custom code, optionally supporting either Azure ML-built AI models or custom ML models trained elsewhere. Modules do not have to include any ML, but it is becoming a more popular option. They can run on laptops and servers, as well as on constrained Windows and Linux devices, like cell phones and ARM-based computing systems.
The intelligent edge allows the creation and deployment of modules for AI such as object detection and optical character recognition (e.g., license plate recognition), audio analysis (e.g., a dog barking), and speech to text and translation (e.g., helping medical professionals with charting). It is supported by the Azure IoT Edge runtime and tight integration with the Azure cloud through IoT Hub. More information and tutorials are on the IoT Edge documentation pages at https://docs.microsoft.com/en-us/azure/iot-edge/.
As a continuation of the projects in this chapter, deploy an Azure IoT module to an Azure Edge VM (VM in Azure specifically preinstalled with Azure IoT Edge runtime) by running this tutorial: https://github.com/harris-soh-copeland-puca/Azure-AI-Camp/tree/master/day2/2.IoTEdgeModule.
Overview of Spark and Databricks
When discussing Spark, it’s helpful to know the difference between scale-up and scale-out. Scale-up and scale-out are ways to meet the increasing demand for computing resources from increasingly large workloads.
In its basic form, scale-up refers to adding more resources, such as more memory or a more powerful processor, to a single machine or set of independent machines.
With scale-out, there is a set of machines that share a common workload. Scale-out, in its basic form, is adding more machines to the pool that work on a single workload in unison. Parts of a workload and partitions of data are separated onto different worker nodes, which each perform calculations and, when finished with a task, return an answer to a cluster manager. This organization is shown in Figure 16-30.
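The scale-out pattern just described can be sketched in a few lines of Python. This is only an analogy under stated assumptions: threads in one process stand in for worker nodes, the `scale_out_sum_of_squares` function plays the role of the cluster manager, and all function names are hypothetical. A real Spark cluster distributes partitions across separate machines.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data, num_partitions):
    """Split data into roughly equal, contiguous partitions."""
    size = -(-len(data) // num_partitions)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def worker_task(chunk):
    """Work done independently on one partition: a partial sum of squares."""
    return sum(x * x for x in chunk)


def scale_out_sum_of_squares(data, num_workers):
    """Manager role: hand out partitions, then combine the partial answers."""
    chunks = partition(data, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_results = list(pool.map(worker_task, chunks))
    return sum(partial_results)


data = list(range(1, 101))
total = scale_out_sum_of_squares(data, num_workers=4)
```

Adding workers only helps when the work on each partition is independent, as it is here. The final combining step is the part that must wait for every worker to report back.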
Spark is part of the wider Apache ecosystem, or zoo, an open source collection of tools and libraries for scaling out with distributed computing. An older approach to scale-out was Hadoop MapReduce. Spark is a newer take on this and, for some in-memory workloads, has been reported to run up to 100 times faster than MapReduce. Unlike MapReduce, Spark aims to keep as much data in memory as possible to avoid I/O trips to and from the data source, such as a Hadoop Distributed File System or SQL database.
Azure Databricks is a Spark-based analytics service in Azure. Its features include the following.
An easy-to-use UI and portal accessible through the Azure portal
Notebooks as the developer tool (very similar to Jupyter notebooks)
Easy Blob Storage integration
Simple cluster creation and management
Integrated tightly with Azure and other Azure resources
Ability to schedule jobs (daily, weekly, etc.)
Secure with role-based access backed by Azure Active Directory
Multiple programming language support
Easy installation and management of libraries across the entire cluster
Version control on Notebooks
In the next two sections, a quick walkthrough on how to set up a Databricks workspace and a tutorial using Databricks notebooks for image featurization and classification are shown.
Auto ML with Azure Databricks and Azure Machine Learning
Auto ML, or automated machine learning, is a feature of Azure ML that aids in choosing the right ML model for the task (classification, regression, time series, etc.) by testing out many different algorithms and hyperparameters automatically within one run and measuring against a metric of choice. Models get ranked according to the metric, and the best one wins (see more at https://docs.microsoft.com/en-us/azure/machine-learning/concept-automated-ml).
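The idea of trying many candidates in one run and ranking them by a metric can be sketched as follows. This is a toy illustration, not the Azure ML Auto ML API: the candidate "models" are simple hypothetical rules, the data is made up, and accuracy is assumed as the metric of choice.

```python
def accuracy(predict, rows, labels):
    """Fraction of rows for which predict(row) equals the label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)


def make_threshold_rule(feature_index, threshold):
    """Candidate model: predict 1 when one feature exceeds a threshold."""
    return lambda row: 1 if row[feature_index] > threshold else 0


# Toy validation data: two features per row and a binary label.
rows = [(0.2, 5.0), (0.4, 1.0), (0.6, 4.0), (0.8, 2.0), (0.9, 3.0), (0.1, 6.0)]
labels = [0, 0, 1, 1, 1, 0]

# Candidate "algorithms" and "hyperparameters" to try within one run.
candidates = {"always_zero": lambda row: 0}
for feature_index in (0, 1):
    for threshold in (0.5, 3.5):
        name = f"feature{feature_index}_gt_{threshold}"
        candidates[name] = make_threshold_rule(feature_index, threshold)

# Score every candidate on the chosen metric and rank them, best first.
scores = [(name, accuracy(model, rows, labels)) for name, model in candidates.items()]
leaderboard = sorted(scores, key=lambda entry: entry[1], reverse=True)
best_name, best_accuracy = leaderboard[0]
```

The leaderboard is the useful output. Looking past the winner to the runners-up shows how sensitive the result is to the choice of algorithm and settings.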
MLflow comes by default with Databricks to keep a data science project organized and standardized, from building models to deployment within Databricks. It is also compatible with Azure ML.
Hands-on with Azure Databricks and Auto ML
- 1.
Navigate to the Azure portal, click Create a new resource, and search for databricks, selecting Azure Databricks.
- 2. Click Create and fill out the template.
- a.
Use a new resource group because many resources are created when provisioning Azure Databricks. Having them all in one clean resource group allows easier management.
- b.
In the Pricing Tier, select Premium so that role-based access and security are enabled.
- 3. To get the code, it is easiest to download the entire repository that contains the Databricks notebooks from GitHub as a zip file. Navigate to https://github.com/harris-soh-copeland-puca/Azure-AI-Camp, download the repository as a zip file, and unzip it on the local computer.
- 4.
In the Azure portal, navigate to the Azure Databricks service, and while in the Overview, click Launch Workspace to sign in to the Azure Databricks workspace using AAD credentials of the current Azure user.
- 5. In the Databricks menu on the left side, click Workspace, use the drop-down arrow to select Import, browse to the files under Azure-AI-Camp-master/day1/2.AutoMLDatabricks/, and select Model Training Classification III.dbc to import. This is a Databricks compressed file, which can contain multiple files and folders. Such files may also be created in a Databricks workspace and exported. Note that many different types of files may be imported into Databricks, including plain Jupyter notebooks. Once imported, a single notebook should be visible under the Workspace menu item.
- 6. A cluster is needed to run the notebook. Navigate to Clusters in the menu, and click Create Cluster.
- a.
Give the cluster a descriptive name as there might be other users in the workspace who use the cluster (concurrently or later).
- b.
Keep the cluster mode on Standard unless multiple users will use this cluster concurrently, in which case it is a good idea to select High Concurrency.
- c.
The pool can remain as None (this is for higher availability of nodes for the cluster).
- d.
In Databricks Runtime Version, select Runtime: 6.4 (Scala 2.11, Spark 2.4.5) (non-GPU and non-ML version) or the default.
- e.
In Autopilot Options, make sure that Enable autoscaling and Terminate after are selected. It is recommended to set Terminate after to 30 or 60 minutes to prevent getting charged for any unnecessary compute resource time.
- f.
Ensure the Worker Type is set to the default (as of this writing, that is Standard_DS3_v2) with 2 Min Workers and 4 Max Workers. Note that a fixed number of workers may also be specified.
- g. The Driver Type can remain at Same as worker.
- h.
Click Create Cluster to begin the provisioning process. This process could take several minutes, depending upon the size of the cluster specified. If the cluster stops, navigate back to the Clusters in the menu, pick the cluster, and click Start when ready to use.
- 7. The Azure ML Python SDK needs to be installed on the cluster. Navigate to Workspace in the menu and use the drop-down menu to select Create and then Library. For a Python package, select PyPI and type azureml-sdk[automl] (a version may be pinned with ==, such as azureml-sdk[automl]==1.2.0, to ensure compatibility). Check the GitHub repo for the latest. Click Create to install the library to a cluster.
- 8.
The next step is to ensure that the library is installed on the cluster. There should now be a Status on running clusters pane. Check the small checkbox next to the running cluster of choice (if it is not running, start the cluster). After checking the box, click Install.
- 9.
Navigate back to the Model Training Classification III under Workspace and open it.
- 10. Fill in the cell that has placeholders for Azure subscription ID, resource group, Azure ML workspace name, and region of the Azure ML service started in an earlier section.
One of the next steps is an interactive login to Azure using a device login code.
This notebook should feel very familiar if you worked through the previous Jupyter notebook hands-on sections because the controls are very similar, as is the process of using Azure ML to train and deploy. Use the instructions in the notebook as a guide to train the model and deploy it to ACI for testing.
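The cluster settings from step 6 and the placeholder cell from step 10 can be written down as plain Python dictionaries, which makes them easy to review before anything is provisioned. This is a sketch under assumptions: the cluster field names are modeled on the Databricks Clusters REST API and should be checked against its documentation, the cluster name is made up, and the workspace variable names are hypothetical and may differ from those in the notebook.

```python
import json

# Cluster settings from step 6, in the style of a Databricks cluster
# specification. The cluster name is a made-up example.
cluster_spec = {
    "cluster_name": "automl-demo-cluster",
    "spark_version": "6.4.x-scala2.11",
    "node_type_id": "Standard_DS3_v2",
    "autoscale": {"min_workers": 2, "max_workers": 4},
    "autotermination_minutes": 30,
}

# Workspace details from step 10. The key names are assumptions, and the
# values in angle brackets are placeholders that must be replaced.
workspace_config = {
    "subscription_id": "<subscription-id>",
    "resource_group": "<resource-group>",
    "workspace_name": "<workspace-name>",
    "workspace_region": "<region>",
}


def unfilled_placeholders(config):
    """Return the keys whose values are empty or still look like <placeholders>."""
    return sorted(
        key
        for key, value in config.items()
        if not value or (value.startswith("<") and value.endswith(">"))
    )


missing = unfilled_placeholders(workspace_config)
if missing:
    print("Fill in these values before running the notebook:", ", ".join(missing))

# Serialized form, for example to keep alongside the notebook for reference.
cluster_json = json.dumps(cluster_spec, indent=2)
```

A check like `unfilled_placeholders` fails fast with a readable message, which is easier to diagnose than an authentication error later in the notebook.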
Use Case: Azure Databricks for Data Scientists
The surveillance video scenario for suspicious behavior may also be explored in Databricks. Databricks has a subset of the CAVIAR database preinstalled for tutorials and testing. Two notebooks are available as a Databricks compressed file in the Azure-AI-Camp/day1/3.AzureMLSparkMLDatabricks/ folder from the repository used in the previous section on Auto ML.
A second cluster with the ML Databricks features (MLlib) needs to be created alongside the one created in the previous section, which did not have Spark ML libraries. (Azure ML is not compatible with the ML Spark cluster runtimes; however, some features from this runtime are needed in the first sample notebook of this section). MLlib is a Spark ML library and has built-in algorithms and featurizers. The notebooks for this section take you through (1) data preprocessing (using MLlib to featurize an image dataset, created from videos) using the ML Databricks runtime and (2) training and using a logistic regression classifier to predict normal and suspicious behavior with Azure ML on a non-ML Databricks runtime.
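To make the second notebook's classifier less of a black box, the following sketch trains a logistic regression model with plain gradient descent. It is a stand-in written with the Python standard library only, not the MLlib or Azure ML implementation, and the two-dimensional feature vectors are made up; in the notebooks, the features come from featurizing video frames.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(features, labels, learning_rate=0.5, epochs=500):
    """Fit weights and a bias with batch gradient descent on log loss."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            # Error is the predicted probability minus the true label.
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Return 1 (suspicious) when the predicted probability is at least 0.5."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(z) >= 0.5 else 0


# Made-up feature vectors standing in for featurized video frames.
features = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.8, 0.9), (0.9, 0.7), (0.7, 0.8)]
labels = [0, 0, 0, 1, 1, 1]  # 0 = normal, 1 = suspicious

weights, bias = train_logistic_regression(features, labels)
predictions = [predict(weights, bias, x) for x in features]
```

A library implementation adds regularization, convergence checks, and distributed execution, but the underlying model is the same weighted sum passed through a sigmoid.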
Summary
This chapter laid a foundation of knowledge in machine learning, with examples in image data processing and analysis. The Data Science Virtual Machine, Jupyter notebooks, Azure ML, and Databricks for data science tasks were discussed. Images are not the only things that can be analyzed with machine learning, so you are encouraged to continue to learn and explore ways to use ML on Azure—whether on a local computer with the Azure ML SDKs, a DSVM or in an Azure Databricks workspace. Plentiful references and resources are on this book’s website at https://harris-soh-copeland-puca.github.io. Machine learning can truly be a fascinating and lifelong study.