© Julian Soh, Marshall Copeland, Anthony Puca, and Micheleen Harris 2020
J. Soh et al., Microsoft Azure, https://doi.org/10.1007/978-1-4842-5958-0_14

14. Azure Storage

Julian Soh (1), Marshall Copeland (2), Anthony Puca (3), and Micheleen Harris (1)
(1) Washington, WA, USA
(2) Texas, TX, USA
(3) Colorado, CO, USA
 

Over the years, two foundational components have defined Azure as a hyperscale cloud, yet they often go unmentioned: compute and storage. They are the pillars of every private and public cloud, and infrastructure as a service is largely about these two. We focused on compute in Chapter 2; in this chapter, we dedicate time to cloud-based storage.

Cloud-based storage is not as groundbreaking and exciting as emerging technologies like artificial intelligence and machine learning, the Internet of Things (IoT), or even Big Data, but these technologies all rely on storage to exist. In fact, they rely on a lot of storage.

The Difference Between Azure Storage and Azure Databases

Azure Storage, sometimes referred to as Azure datastores, is primarily designed for semi-structured or unstructured storage of content. It is separate from compute, which is what makes Azure Storage inexpensive compared to Azure databases.

Azure databases are generally used for more structured data, and they are more compute dependent.

Cloud Storage and Storage Accounts

When we think of storage, the traditional hard drives, solid-state drives, thumb drives, and memory cards come to mind.

For enterprises, we think of storage area networks (SANs) and network-attached storage (NAS).

Although cloud storage is physically backed by similar technologies, it is offered as cloud services that are more in line with today’s needs.

Storage in Azure is organized by objects called storage accounts. A storage account is a logical container for the different types of cloud storage.

As seen in Figure 14-1, the types of storage that can share a storage account are
  • Azure Blob storage

  • Azure Files

  • Azure Tables

  • Azure Queues

Other types of storage may also be organized by storage accounts but would not be able to share the same storage account. For example, Azure Data Lake Store is another storage offering in Azure that is also housed in storage accounts but cannot share the same storage account with the other types of storage.

Figure 14-1. Azure storage account Overview pane

Azure Blob Storage

Azure Blob storage is the most economical and abundant storage that is available today. It derives its name from the term BLOB, which stands for binary large object, although it is usually written in all lowercase.

The definition of a blob is a large file, typically an image or any form of unstructured data. We talk about the different data types in Chapter 17, but for now, think of blobs as large files that are not suited to reside in databases. The hard drive on your personal computer can be considered blob storage because it houses different types and sizes of files that are organized in folders.

The following are the big differences between the storage on your personal computer and blob storage in Azure.
  • Azure Blob storage scales on demand, and its capacity is virtually limitless.

  • Azure Blob storage is physically backed by at least three sets of infrastructure—from drives to power supplies. This is the default deployment and is known as locally redundant storage (LRS). LRS is the minimal deployment model (see Figure 14-1).

  • Azure Blob storage can also be replicated to a remotely paired datacenter in Azure, which is another three sets of infrastructure in that remote location. This is known as geo-redundant storage (GRS). Therefore, in a GRS configuration, data is replicated across six sets of hardware spanning two geo-locations more than 500 miles apart.

  • Azure Blob storage is the least expensive type of storage in Azure. To provide some context, at the time of writing, Azure Blob storage prices are $0.03/GB/month for the hot tier and as little as $0.01/GB/month for the archive tier. This pricing is competitive among cloud providers, and Microsoft has demonstrated the willingness to keep the price of storage competitive. Azure Blob storage should be considered enterprise-level storage, so it is more economical to provide cloud storage in Azure than to maintain on-premises hardware.
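To put the prices and redundancy figures above into numbers, here is a minimal Python sketch. It assumes the two per-GB prices quoted in this chapter and a hypothetical 5,000 GB dataset. It ignores transaction, egress, and redundancy surcharges, so treat the result as a rough estimate rather than a bill.

```python
# Prices quoted in this chapter (USD per GB per month, at the time of writing).
PRICE_PER_GB_MONTH = {"hot": 0.03, "archive": 0.01}

# Sets of hardware that hold the data under each redundancy option described above.
COPIES = {"LRS": 3, "GRS": 6}


def monthly_cost(size_gb: float, tier: str) -> float:
    """Return the estimated monthly storage charge in USD for one tier."""
    return round(size_gb * PRICE_PER_GB_MONTH[tier], 2)


if __name__ == "__main__":
    size_gb = 5000  # hypothetical dataset size, chosen only for illustration
    for tier in PRICE_PER_GB_MONTH:
        print(f"{tier:8s} ${monthly_cost(size_gb, tier):,.2f} per month")
    for option, copies in COPIES.items():
        print(f"{option}: data is kept on {copies} sets of hardware")
```

Real pricing varies by region, redundancy option, and date, so check the current Azure pricing page before you budget.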

This chapter looks at two types of Azure Blob storage.
  • Block Blob storage is ideal for files up to 200 GB in size. Block Blob storage is normally used for unstructured data of varying sizes, such as videos, photos, and other binary files.

  • Page Blob storage is optimized to hold files that are used for random read and write operations. Therefore, page blobs are often used to store the virtual hard disk (VHD) images of virtual machines in Azure.

To read, write, download, and upload files to Azure Blob storage, HTTPS PUT and GET methods are employed. To facilitate that, Azure provides public URL endpoints to access Azure Blob storage. In addition, Microsoft announced the Private Link service, which allows a public endpoint in Azure to be exposed through a private IP address that you own. Azure Private Link is discussed elsewhere in this book.

The URL to access blob storage in Azure usually ends with the .blob.core.windows.net suffix. Thus, the URL of a blob stored in a container of a storage account looks something like this:

https://<StorageAccountName>.blob.core.windows.net/<ContainerName>/<BlobName>
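The URL pattern above can be assembled in a few lines of Python. This is a minimal sketch: the function name blob_url and the sample account, container, and blob names are ours, chosen only for illustration.

```python
from urllib.parse import quote


def blob_url(account: str, container: str, blob_name: str) -> str:
    """Build the HTTPS endpoint URL for a blob from its three name parts."""
    # Percent-encode unsafe characters, but keep "/" so that any
    # virtual folder path inside the blob name stays readable.
    return (
        f"https://{account}.blob.core.windows.net/"
        f"{quote(container)}/{quote(blob_name, safe='/')}"
    )


if __name__ == "__main__":
    # Hypothetical storage account name, for illustration only.
    print(blob_url("mystorageacct", "stage", "seattle.jpg"))
```

Building the URL does not grant access to the blob. Whether a request to that URL succeeds depends on the container's access level and on the credentials that are sent with the request.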

For more information about Azure Blob storage, we have forked Microsoft’s extensive and evergreen documentation for Azure Blob storage to our GitHub repo at https://github.com/harris-soh-copeland-puca/azure-docs/blob/master/articles/storage/blobs/storage-blobs-overview.md.

Hands-on: Deploying Azure Blob Storage

In this exercise, you deploy an Azure storage account to host Azure Blob storage. You explore how to transfer files to and from Azure Blob storage and how to secure it.

As you start the deployment process, remember the relationship of the storage account to the blob containers and blobs, as seen in Figure 14-2.

Figure 14-2. Structure and relationship of the storage account, blob containers, and blobs

As with all the other hands-on exercises, we assume that you have an Azure subscription or know how to sign up for a free trial.
  1. Go to your Azure portal at https://portal.azure.com and sign in.

  2. Click Create a resource. Type storage account in the search box. Select Storage account – blob, file, table, queue when it appears in the search results.

  3. Click Create.

  4. Select the subscription and resource group to put the storage account in, or create a new resource group for it.

  5. Provide a name for the storage account. The name is used as part of the URL endpoint, so it must be globally unique. It must also be 3 to 24 characters long and contain only lowercase letters and numbers.

  6. Select the location closest to you.

  7. Select a performance level. Standard performance is backed by high-performance enterprise-level hard disk drives. Premium performance is backed by enterprise-level solid-state drives.

  8. Leave the Account kind at StorageV2. StorageV2 accounts are recommended for almost all storage scenarios and incorporate all the functionality of the older Storage (general-purpose v1) and BlobStorage account kinds. Both of those kinds are still provided mainly for backward compatibility, for example, if there is a need to access the storage account using the classic model rather than the Azure Resource Manager (ARM) method. See https://docs.microsoft.com/en-us/azure/storage/common/storage-account-overview#recommendations for more details.

  9. For this exercise, leave the default access tier as Hot unless you know for sure that this storage account will generally hold files that can tolerate some latency, like near-line access. Blobs that are uploaded to this storage account are assigned this tier by default, and you can move a blob to the cool or archive tier after the upload.

  10. Click Next : Networking.

  11. Select Public endpoint (all networks), and then click Next : Advanced.

  12. Always keep the Secure transfer required option set to Enabled to enforce HTTPS communication.

  13. Large file shares are disabled because you did not pick Premium as the storage type in step 7. Large file shares, if enabled, allow you to create file shares that are up to 100 TiB in size.

  14. Leave Blob soft delete disabled. You can enable soft delete for additional protection against accidental deletes. If you enable it, deleted blobs are preserved for the retention period that you specify.

  15. Leave the Data Lake Storage Gen2 Hierarchical namespace disabled because you are not deploying Azure Data Lake Store with this storage account. Enabling Hierarchical namespace designates this storage account as an Azure Data Lake Store account, so you would not be able to deploy other storage types to it.

  16. Click Review + create.

  17. Click Create.

  18. After the storage account is deployed, click Go to resource.

     

You have just deployed an Azure storage account, but you have not deployed Azure Blob storage or any storage type.
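The portal rejects storage account names that break the naming rules, and you can run the same check before you open the portal. The sketch below assumes the rules as we understand them: 3 to 24 characters, lowercase letters and numbers only. The function name is ours. Global uniqueness cannot be verified offline, so a name that passes this check can still be taken.

```python
import re

# 3 to 24 characters, lowercase letters and digits only.
_ACCOUNT_NAME = re.compile(r"[a-z0-9]{3,24}")


def is_valid_account_name(name: str) -> bool:
    """Check the local naming rules for a storage account name."""
    return _ACCOUNT_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    # Hypothetical candidate names, for illustration only.
    for candidate in ("mystorage01", "MyStorage", "my-storage", "ab"):
        verdict = "ok" if is_valid_account_name(candidate) else "rejected"
        print(f"{candidate:12s} {verdict}")
```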

In the following steps, you deploy Azure Blob storage by specifying the first container.
  1. Go to the Overview pane of the Azure storage account created in the previous exercise if you are not already there.

  2. Select Containers in the Overview pane, as seen in Figure 14-3.

Figure 14-3. Containers (Blob) option in Azure storage account

  3. Click + Container, which is located at the top of the pane.

  4. Type raw for the container name. This is because you are using this Azure Blob storage and container for exercises in later chapters (e.g., Chapter 19).

  5. Leave the public access level as Private (no anonymous access).

  6. Click OK.

  7. Click + Container again and create another container named stage.

  8. Your Azure storage account with Azure Blob storage should now look similar to what is shown in Figure 14-4.

Figure 14-4. Azure Blob storage with two containers deployed

You have now deployed Azure Blob storage; next, you need to store and retrieve files.

Hands-on: Using Azure Blob Storage

As part of this exercise, you download some content and use Azure Storage Explorer to upload it to Azure Blob storage. You then confirm that the content was uploaded properly.
  1. Download two sample files, customer.csv and users.csv, to your computer. These files are located on our GitHub repo at https://github.com/harris-soh-copeland-puca/SampleFiles/blob/master/customer.csv and https://github.com/harris-soh-copeland-puca/SampleFiles/blob/master/users.csv, respectively. You will reuse these files for other exercises.

  2. Download a picture file, seattle.jpg, from https://github.com/harris-soh-copeland-puca/SampleFiles/blob/master/seattle.jpg. You will use this file for exercises in later chapters.

  3. Download Azure Storage Explorer from https://azure.microsoft.com/is-is/features/storage-explorer/ and install it on your computer. This is the application that you will use to upload, download, delete, and browse containers and blobs in Azure Blob storage.

  4. Go to the Azure portal and open the Azure storage account created in the previous exercise.

  5. Under Settings, click Access keys.

  6. Note the storage account name, and then copy the connection string associated with key1, as seen in Figure 14-5.

Figure 14-5. Storage account connection string

  7. Launch Azure Storage Explorer.

  8. Click the Connect icon and select Use a connection string, as seen in Figure 14-6. Then click Next.

Figure 14-6. Use a connection string to connect to Azure Blob storage.

  9. Paste the connection string that you copied in step 6 into the Connection string field. Notice that the display name is automatically populated with the Azure storage account name that you noted in step 6. This is the display name in Azure Storage Explorer. You can leave it as is or change it if desired. Click Next.

  10. Click Connect.

  11. Upon successful connection, you should see the storage account listed under Local & Attached ➤ Storage Accounts, as seen in Figure 14-7.

Figure 14-7. Storage account connected via Azure Storage Explorer
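The connection string that you pasted into Azure Storage Explorer is a list of key=value pairs separated by semicolons. The sketch below splits one apart so that you can see what it carries. It assumes the commonly seen layout (DefaultEndpointsProtocol, AccountName, AccountKey, EndpointSuffix). The sample string and its key are fake and serve only as an illustration.

```python
def parse_connection_string(conn_str: str) -> dict:
    """Split a storage connection string into a dictionary of settings."""
    settings = {}
    for part in conn_str.split(";"):
        if not part.strip():
            continue  # tolerate a trailing semicolon
        # Split on the first "=" only: Base64 keys often end in "=" padding.
        name, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"not a key=value pair: {part!r}")
        settings[name.strip()] = value.strip()
    return settings


if __name__ == "__main__":
    # Fake account name and key, for illustration only.
    sample = (
        "DefaultEndpointsProtocol=https;AccountName=mystorageacct;"
        "AccountKey=ZmFrZWtleQ==;EndpointSuffix=core.windows.net"
    )
    for name, value in parse_connection_string(sample).items():
        shown = "<hidden>" if name == "AccountKey" else value
        print(f"{name} = {shown}")
```

Because the account key is embedded in the string, treat a connection string like a password and keep it out of source control.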

Self-Guided Exercises

The Azure Storage Explorer user interface is very intuitive. If you have ever used an SFTP or FTP client, this is no different. Try these exercises on your own. The step-by-step guide is posted on our GitHub repo at https://github.com/harris-soh-copeland-puca.
  1. Expand the storage account, and then expand Blob Containers. You should see the two containers created in the previous exercise.

  2. Upload users.csv and customer.csv to the raw container. Upload seattle.jpg to the stage container.

  3. Right-click customer.csv and change it to the archive tier.

  4. Look for folder statistics and the activity history under Activities.

  5. Try to access seattle.jpg from a browser. Change the container’s public access level if you receive an error. Change it back after you are done experimenting. (Do not refresh the page. Reload it in a separate browser.)

  6. Try connecting to the storage account using a storage account name and a key instead of a connection string.

  7. Try connecting to the storage account by adding an Azure subscription and not using a key.

  8. Check out Storage Explorer from the Azure portal (in preview at the time of writing).

  9. Create an empty folder.

     
Note

Blobs in Azure Blob storage are stored in a flat system. The names of “folders” that blobs are stored in are part of the blob’s file name. Therefore, if there are no files in a folder, the folder cannot exist. That is why when you try to create a folder, Azure Storage Explorer reminds you that folders are virtual in Azure Blob storage. The next section discusses hierarchical namespace (HNS) support in Azure Data Lake Store.
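The flat namespace described in this note can be imitated with a plain list of names. The following Python sketch is an in-memory model, not a call to Azure: it derives the "folders" at one level purely from blob names that share a prefix. The blob names are hypothetical.

```python
def list_level(blob_names, prefix="", delimiter="/"):
    """Return (virtual_folders, blobs) found directly under a prefix."""
    folders, blobs = set(), []
    for name in blob_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # Something follows a delimiter, so "head/" acts as a folder.
            folders.add(prefix + head + delimiter)
        else:
            blobs.append(name)
    return sorted(folders), sorted(blobs)


if __name__ == "__main__":
    # Hypothetical blob names, for illustration only.
    names = ["users.csv", "2020/customer.csv", "2020/q1/sales.csv"]
    print(list_level(names))
    print(list_level(names, prefix="2020/"))
    # Remove the only blob under 2020/q1/ and that "folder" disappears too.
    names.remove("2020/q1/sales.csv")
    print(list_level(names, prefix="2020/"))
```

The last call shows why an empty folder cannot exist in a flat namespace: once no blob name contains the prefix, there is nothing left to list.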

Next Steps: Azure Blob Storage

Microsoft’s documentation on Azure Blob storage is on GitHub at https://github.com/harris-soh-copeland-puca/azure-docs/tree/master/articles/storage/blobs.

Azure Data Lake Store (ADLS)

Closely related to Azure Blob storage is the Azure Data Lake Store. Azure Data Lake Store is the only service that cannot share an Azure storage account with the other Azure storage options discussed in this chapter.

The Azure Data Lake Store covered in this book is the second generation of the service, often called Azure Data Lake Store Gen2 or ADLS Gen2.

Azure Data Lake Store Gen2 is built on Azure Blob storage with a few differentiating features, as follows.
  • ADLS Gen2 is better suited for certain scenarios involving analytics because it works better with text files than Azure Blob storage (this applies to analytics involving text files and not video, of course).

  • ADLS Gen2 supports a hierarchical namespace, which is the ability to have a folder structure that is independent of the content. That means you can now have empty folders!

  • ADLS Gen2 costs more than Azure Blob storage and does not have an archive tier.

Provisioning an Azure Data Lake Store follows the same steps as deploying an Azure storage account, except that you enable the hierarchical namespace option on the Advanced tab of the provisioning process, as seen in Figure 14-8.

Figure 14-8. Provisioning ADLS Gen2 by specifying HNS

For more information regarding Azure Data Lake Store, please see the Microsoft documentation on this topic at our GitHub repo at https://github.com/harris-soh-copeland-puca/azure-docs/tree/master/articles/data-lake-store.

Azure Tables

Azure Tables is a nonrelational, key/value pair storage system. It does not require a schema and is a form of structured NoSQL datastore. Azure Tables is designed to be lightweight and optimized for simple and fast inserts and reads. Use case scenarios for Azure Tables include storing flexible datasets like user data for web applications, storage of lookup data or metadata, and so forth.

Note

Azure Tables are not like the tables in a relational database. Therefore, you cannot do unions or joins between tables. The tables in Azure Tables are stand-alone.

What does schema-less mean? For example, if you have an online training website catered to members looking for a coach, the member table in a NoSQL datastore vs. the member table in a relational database would be different, as seen in Figure 14-9.

Figure 14-9. Difference between a schema-less table vs. a schema-enforced table
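One way to picture the difference is with Python dictionaries. In the sketch below, each member is a dictionary that carries only the properties it actually has, which is the schema-less view. The as_rows function then forces the same data into fixed columns, the way a relational table would. The member data and property names are hypothetical and are not taken from Figure 14-9.

```python
# Schema-less: each entity lists only the properties it has.
members = [
    {"FirstName": "Ann", "LastName": "Lee", "Birthdate": "1990-04-12"},
    {"FirstName": "Raj", "LastName": "Patel", "PreferredCoach": "Kim"},
]


def property_names(entities):
    """Collect every property name that appears in any entity."""
    names = set()
    for entity in entities:
        names.update(entity)
    return sorted(names)


def as_rows(entities):
    """Project entities onto fixed columns, filling the gaps with None."""
    columns = property_names(entities)
    return [[entity.get(column) for column in columns] for entity in entities]


if __name__ == "__main__":
    print(property_names(members))
    for row in as_rows(members):
        print(row)
```

In the schema-enforced view, every row must account for every column, so missing values show up as None. The schema-less entities simply omit what they do not have.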

Anatomy of Azure Tables

The architecture of Azure Tables is similar to Azure Blob because they both reside in an Azure storage account. Using the example of our online training website for members and coaches, this is graphically represented in Figure 14-10.

Figure 14-10. Structure of Azure Tables in Azure storage accounts

Entities in Azure Tables can have any number of properties. A property of an entity in Azure Tables is akin to a field. Every Azure Table entity has three mandatory system properties.
  • PartitionKey

  • RowKey

  • Timestamp

Using our online training website example, the member Azure Table has the entities shown in Figure 14-11.

Figure 14-11. Azure Table entities with system properties and user-defined properties

But wait! Why did we not create a property for sport in the member table? To answer this question, you need to understand the mandatory system properties and how they are used.

A PartitionKey (PK) value can be left blank and does not need to be unique.

RowKeys (RK) must be unique within a partition. Therefore, only one entity in each partition can have a blank RowKey.

Timestamps are automatically added to an entity upon its creation.

The primary key in an Azure Table is what uniquely identifies an entity (row), and that primary key is the combination of the entity’s PartitionKey and RowKey.
  • Azure Table primary key = PK + RK

All entities in an Azure Table are sorted by PartitionKey, followed by RowKey. Therefore, for efficient Azure Table operations, you should select a PartitionKey that best organizes the data. So, going back to our member table, the best PartitionKey would be Sport. RowKeys are then uniquely assigned to each member in the same PartitionKey. Figure 14-12 depicts the member table when taking the PartitionKey and RowKey into consideration.

Figure 14-12. Member table with Sport as the PartitionKey and unique RowKeys within the same partition
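The way PartitionKey and RowKey work together can be modeled with a small in-memory class. This is a sketch of the concept only and does not use the Azure SDK. The class name, the member data, and the choice to raise KeyError on a duplicate are our assumptions, and the Timestamp system property is left out for brevity.

```python
class MemberTable:
    """Toy model of a table whose primary key is PartitionKey + RowKey."""

    def __init__(self):
        self._entities = {}  # (PartitionKey, RowKey) -> properties

    def insert(self, partition_key, row_key, **properties):
        key = (partition_key, row_key)
        if key in self._entities:
            # The same RowKey may not be used twice in one partition.
            raise KeyError(f"entity {key} already exists")
        self._entities[key] = properties

    def scan(self):
        """Return all entities sorted by PartitionKey, then RowKey."""
        return [(pk, rk, self._entities[(pk, rk)])
                for pk, rk in sorted(self._entities)]

    def partition(self, partition_key):
        """Return only the entities that share one PartitionKey."""
        return [row for row in self.scan() if row[0] == partition_key]


if __name__ == "__main__":
    table = MemberTable()
    # Hypothetical members, with Sport used as the PartitionKey.
    table.insert("Tennis", "002", FirstName="Ann")
    table.insert("Golf", "001", FirstName="Raj")
    table.insert("Tennis", "001", FirstName="Kim")
    for pk, rk, props in table.scan():
        print(pk, rk, props)
```

The RowKey 001 appears in both the Golf and Tennis partitions without conflict because uniqueness is enforced on the pair, not on the RowKey alone.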

Like Azure Blob storage, Azure Tables are accessed via a REST API endpoint over HTTPS; however, unlike Azure Blob storage, there is no option to allow anonymous access to Azure Tables.

Hands-on: Using Azure Tables

This exercise continues to use Azure Storage Explorer to manage tables and entities to visualize Azure Table operations.
  1. Launch Azure Storage Explorer and navigate to the Azure storage account created in the previous exercise.

  2. Expand Tables. You see several system tables that already exist.

  3. Right-click Tables and select Create table, and then type member for the name of the table.

  4. After you have created the table, select it, and click + Add, as seen in Figure 14-13.

Figure 14-13. Adding an entity to an Azure Table

  5. Add the first entity, as seen in Figure 14-11. Use the value of Sport for each member’s PartitionKey and a unique RowKey. Click Add Property to add the first name, last name, and birthdate properties. You should only have to do this with the first entity. Subsequent entities remember those properties. If you delete a property for a new entity, it shows up as null.

     

Self-Guided Exercises

Try these exercises on your own. The step-by-step guide is posted on our GitHub repo at https://github.com/harris-soh-copeland-puca.
  1. Explore how PartitionKeys and RowKeys are used by using Query in Azure Storage Explorer.

  2. Try to create an entity with a non-unique RowKey in a partition.

  3. Edit an existing entity.

  4. Sort by columns in Azure Storage Explorer.

  5. Export and import a table. What format does it take?

Next Steps: Azure Tables

While you can use Azure Storage Explorer as a UI for Azure Tables, the service is generally used by applications as a rapid write and retrieve datastore. Data used by websites is a very common use case scenario. A Microsoft tutorial on Azure Tables using the .NET SDK is a good resource (see https://docs.microsoft.com/en-us/azure/cosmos-db/tutorial-develop-table-dotnet). When you first look at this tutorial, you might wonder why it refers to Cosmos DB instead of Azure Tables. The reason is that Cosmos DB is a multimodel database that uses the same Azure Table API. You do not need to follow the portion of the tutorial that tells you to deploy an Azure Cosmos DB. Instead, use the .NET SDK with the Azure Tables that you created in this chapter.

Note

In Chapter 17, we explore Azure Cosmos DB, a schema-less datastore with Table APIs and similar characteristics. The main difference between Azure Cosmos DB and Azure Tables is that the latter is a subset of Cosmos DB. Azure Cosmos DB is a multimodel database (one of the APIs for Cosmos DB is the Azure Table API) with the option to replicate globally for performance reasons. In contrast, Azure Tables can only be geo-redundant to another region more than 500 miles away primarily for disaster recovery and business continuity. A Microsoft Azure Tips and Tricks article has a good summary of the differences (see https://microsoft.github.io/AzureTipsAndTricks/blog/tip166.html).

You can find Microsoft’s documentation on Azure Tables at https://github.com/harris-soh-copeland-puca/azure-docs/tree/master/articles/storage/tables.

Azure Files

Azure Files offers shared storage for applications using the SMB 3.0 protocol. Azure Files only supports SMB 3.0 because it is an Internet-secure protocol.

Traditionally, file shares are drives attached to servers and shared on the network, except in the case of Azure Files, there are no servers involved. As such, Azure Files is one of the simplest forms of serverless service to understand.

The easiest way to describe Azure Files is that it is a shared location that you can map a network drive letter to, and then use that drive letter in file explorer to access files.

The easiest use case for Azure Files is to connect a file share to virtual machines in Azure.

Hands-on: Using Azure Files

  1. In Azure Storage Explorer, go to the Azure storage account created earlier in this chapter.

  2. Right-click File Shares and select Create File Share.

  3. Give the File Share a name. The File Share name is part of a URL, so it must be a valid DNS name. If the name is not acceptable, you see a red exclamation mark next to it.

  4. After the File Share is created, select it, and upload a file.

  5. Click the Connect VM option from the top menu in Azure Storage Explorer and copy the net use command provided in the popup prompt. Notice that the key is embedded in the command.

  6. Remote to a VM in Azure and open a command prompt. Paste the command copied from step 5, and replace the drive letter in the command with the drive letter that you wish to use.
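To see what goes into the net use command from steps 5 and 6, the following Python sketch assembles one from its parts. The layout shown (drive letter, UNC path under file.core.windows.net, an AZURE\account user name, then the key) is our assumption about the usual form. The command that Azure Storage Explorer generates may differ in detail, so prefer the copied command when in doubt. The account, share, and key values are placeholders.

```python
def build_net_use(drive: str, account: str, share: str, key: str) -> str:
    """Assemble a Windows 'net use' command that maps an Azure file share."""
    unc_path = f"\\\\{account}.file.core.windows.net\\{share}"
    # The storage account key doubles as the password, so the finished
    # command is as sensitive as the key itself.
    return f"net use {drive.upper()}: {unc_path} /user:AZURE\\{account} {key}"


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    print(build_net_use("z", "mystorageacct", "docs", "<storage-account-key>"))
```

Changing the first argument is the programmatic equivalent of editing the drive letter by hand in step 6.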

     

Next Steps: Azure Files

Azure Files offers an easy way to create File Shares that VMs in Azure can use. However, if you want to connect to File Shares via a drive letter from your on-premises computers, special network and security considerations must be addressed, including opening port 445 in the firewall. Most ISPs block this port for security reasons: although SMB 3.0 is considered an Internet-secure protocol, older versions of SMB are not, and they use the same port 445.

For more information about connecting to Azure Files from on-premises networks and computers, see https://docs.microsoft.com/en-us/azure/storage/files/storage-files-networking-overview.

As with all the other topics, Microsoft’s extensive documentation on Azure Files is at https://github.com/harris-soh-copeland-puca/azure-docs/tree/master/articles/storage/files.

Azure Queues

Azure Queues is another easy-to-understand serverless storage service that supports messaging. Message queues support asynchronous application-to-application communication, so like Azure Tables, Azure Queues are generally used by applications.

To facilitate a common communication protocol between applications and services, authenticated HTTPS is the access method to Azure Queues. A message added to Azure Queues is stored for a certain amount of time or until it has been processed by the receiving application and deleted.

The retention period of a message in a queue is specified by the application sending the message, and it can be in days, hours, minutes, or seconds.
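The retention behavior described above can be sketched with a small in-memory queue. This is a conceptual model only, not the Azure Queues API: the class and method names are ours, and time is passed in explicitly as a number of seconds so that the example is repeatable. Real queues have further behavior that is not modeled here.

```python
from collections import deque


class ExpiringQueue:
    """Toy message queue in which every message has a time to live."""

    def __init__(self):
        self._messages = deque()  # (expires_at, text) in arrival order

    def add(self, text, ttl_seconds, now):
        """Store a message that expires ttl_seconds after 'now'."""
        self._messages.append((now + ttl_seconds, text))

    def _drop_expired(self, now):
        self._messages = deque(m for m in self._messages if m[0] > now)

    def peek(self, now):
        """Return the oldest message that has not expired, or None."""
        self._drop_expired(now)
        return self._messages[0][1] if self._messages else None

    def delete(self, now):
        """Remove the oldest live message once it has been processed."""
        self._drop_expired(now)
        if self._messages:
            self._messages.popleft()


if __name__ == "__main__":
    queue = ExpiringQueue()
    queue.add("Message should stay for seven seconds", ttl_seconds=7, now=0)
    print(queue.peek(now=6))  # still in the queue
    print(queue.peek(now=7))  # expired, so nothing is returned
```

A message leaves the queue in one of two ways, matching the text above: the receiver deletes it after processing, or its retention period runs out.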

Hands-on: Using Azure Queues

  1. In Azure Storage Explorer, go to the Azure storage account created earlier in this chapter.

  2. Right-click Queues and select Create Queue.

  3. Provide a name for the queue and press Enter.

  4. After the queue has been created, click Add Message, as seen in Figure 14-14. Type "Message should stay for seven seconds" in the Message text field and set the message to expire in 7 seconds. Then click OK.

Figure 14-14. Creating and setting message options in Azure Queues

Next Steps: Azure Queues

Getting into the development aspect using Azure Queues is beyond the scope of this chapter, but queues have served as a messaging method for a long time, so they are easily understood.

Microsoft’s documentation on Azure Queues is at https://github.com/harris-soh-copeland-puca/azure-docs/tree/master/articles/storage/queues.

To quickly start using Azure Queues, read the documentation at https://docs.microsoft.com/en-us/azure/storage/queues/storage-dotnet-how-to-use-queues.

Summary

This chapter was written to provide you with a good primer on the different Azure storage options. As a reminder, the comprehensive documentation for all Microsoft Azure services is on our GitHub repo at https://github.com/harris-soh-copeland-puca/azure-docs.
