Chapter 5: Choosing the Right Database and Storage

Now that you've chosen your compute option, it's time to move on to the next decision: choosing the right database and storage for your cloud application.

Google Cloud's world of managed services is huge, and their library of database and storage options is very diverse as well. To put this in numbers, the Google Cloud ecosystem has well over 100 services and tools for nearly all aspects of cloud software development, and more than a dozen of these are dedicated solely to storage and databases. The goal behind such a large range of options is to give developers more customizability and agility. More importantly, developers are not required to choose a single service and stick with it—rather, they can choose multiple storage and database options together to meet their application's requirements; in fact, this is common practice.

To new developers who are diving straight into cloud computing, data storage might seem a secondary concern and far less important than, say, compute. However, data storage is an integral part of cloud computing and plays a big role in costs, agility, and security. In this chapter, we'll explore the different database and storage options available in Google Cloud and the ways in which you can use them. By the end of this chapter, we will have covered the following topics:

Storage and database options on Google Cloud: the big five
Additional storage and database options
Security and flexibility

Storage and database options on Google Cloud – the big five

As we said, there are a lot of options available for database and storage. While we will talk about all of these services, this chapter focuses on a few key ones because, as with compute options, Google Cloud offers fully managed platforms for storage that help new developers achieve more while doing less. It's also unlikely that most new cloud developers would need every service that Google Cloud provides. Most developers only need a few (or in some cases, only one) storage and database services to get things off the ground. These services, which we'll focus on first, are Google Cloud Storage (GCS), Google Cloud SQL, Google Firestore, Google Spanner, and Google Bigtable. We'll also talk about Filestore, BigQuery, Memorystore, and Persistent Disk.

GCS – basics

Before we dive deeper, let's clear up some of the basics, including concepts and terms that we'll use throughout the chapter, such as structured and unstructured data. Structured data comes in different formats and types, but the basic idea is that it can be organized into rows and columns. All of the major Google Cloud storage options (Cloud SQL, Cloud Datastore, Cloud Bigtable) support structured data except one: GCS. GCS stores a different form of data known as unstructured data, because it does not organize your data into rows and columns. It only stores a sequence of bytes, exactly the way you uploaded it. However, this isn't as big a limitation as you might think.

In the real world, this means that you can still store data that has an internal structure, such as a ZIP archive, a Joint Photographic Experts Group (JPEG) image, or even a comma-separated values (CSV) file with rows and columns. The only difference is that GCS won't know of any internal structure, and it will return the exact same sequence of bytes that you uploaded, every time, reliably.

Another term you'll see in GCS is object. Because GCS stores data as unstructured (that is, only as a sequence of bytes, without any information about internal structure), it calls each stored item an object. In other words, objects are very much like files, but because GCS does not have the traditional naming conventions and hierarchical structure that most filesystems have, it calls them objects.

These objects are stored in what is called a bucket. Think of a bucket as a container, but for your files and not your microservices. Developers create multiple buckets for different projects where they can then upload files (that will be stored as objects) and download them as well.
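To make the idea of buckets and objects concrete, here is a minimal sketch in Python. It does not use the real GCS client library; ToyBucket is a hypothetical in-memory stand-in that we made up for illustration. It shows the two points made above: the bucket hands back exactly the bytes it was given, and any internal structure (here, CSV rows and columns) is interpreted by your own code.

```python
import csv
import io


class ToyBucket:
    """Hypothetical in-memory stand-in for a bucket (not the GCS API)."""

    def __init__(self, name):
        self.name = name
        # A flat mapping of object name -> bytes; there are no directories.
        self._objects = {}

    def upload(self, object_name, data):
        # The bucket never inspects the payload; it only keeps the bytes.
        self._objects[object_name] = bytes(data)

    def download(self, object_name):
        return self._objects[object_name]


bucket = ToyBucket("my-example-bucket")
csv_bytes = b"name,plan\nada,free\ngrace,pro\n"

# The slashes in the object name are ordinary characters, not folders.
bucket.upload("reports/2021/users.csv", csv_bytes)
returned = bucket.download("reports/2021/users.csv")

# Making sense of rows and columns is the client's job, not the bucket's.
rows = list(csv.DictReader(io.StringIO(returned.decode("utf-8"))))
```

A real bucket behaves the same way from your application's point of view: what you upload is what you download, and parsing happens on your side.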

Here is a diagram that outlines some of the GCS options available on Google Cloud:

Figure 5.1 – GCS classes

Finally, Google Cloud also has storage classes that determine various characteristics of your buckets, including the cost. Storage classes are based on availability and access frequency (that is, how often you'll access the stored objects). Additionally, buckets can be regional or multi-regional. The former is a good option if you're only serving a small area such as the US West Coast, and it will offer the best latency possible in that region. If your app's audience is more spread out, choose multi-regional buckets; these can cover numerous regions (and entire continents) and optimize how objects are served depending on where they are accessed from to minimize latency.

GCS

GCS is an object storage platform and the default object storage option on Google Cloud. It does almost everything that most developers need from an object storage option and is a particularly good option for a new cloud developer, as it is a fully managed platform with some really handy features.

The highlights of GCS include unlimited storage: perhaps not a strict requirement for all of the readers of this book, but if you're creating enterprise-level applications or applications that are expected to scale really fast, on-premises disk storage just won't cut it. Additionally, unlimited storage comes with no minimum object size.

It also offers a wide range of object storage classes with different use cases and priorities. Storage classes also have different pricing so that you can configure your storage to be the most cost-effective. Additionally, GCS has a feature called Object Lifecycle Management (OLM) that allows you to define conditions for your data that, when met, will automatically move your data to a lower storage class to help reduce monthly bills.

Getting started with GCS

Developers can get started with GCS and begin storing their data in a few ways including using Cloud Console as well as the gsutil command-line tool (part of the Google Cloud software development kit (SDK)).

To start off, you first need to create a bucket, and one of the easiest ways to do that is to use Cloud Console and configure it. You can reach the configuration screen by clicking on Storage from the menu on the left and then clicking on Create Bucket.

Once on the configuration screen, you'll be asked to name your bucket and choose from one of the four available storage classes, outlined as follows:

Standard: This is the storage class developers need if their data is accessed frequently. It's a high-performance, high-frequency storage class (as with every GCS class, it is designed for 99.999999999% annual durability), and if your data is accessed more than a few times a month (or millions of times a week), this is the only option.
Nearline: Nearline (like all of the remaining classes) is a long-term storage option, meant for data that is accessed less than once a month. Using Nearline reduces storage costs.
Coldline: If your data will be accessed even less frequently, Coldline is great and will significantly reduce storage costs.
Archive: The Archive storage class is ideal for cold data storage and disaster recovery (DR). It has a minimum storage duration of 365 days and also the lowest storage costs.
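The choice between the four classes can be sketched as a small helper. This is only an illustration under stated assumptions: the function name suggest_storage_class is hypothetical, and the 30, 90, and 365-day thresholds are our own rule of thumb, chosen to line up with the minimum storage durations of Nearline, Coldline, and Archive. Your own cut-offs should also take retrieval fees into account.

```python
def suggest_storage_class(days_between_accesses):
    """Suggest a storage class from how often an object is read.

    The thresholds are assumptions for illustration, not official guidance.
    """
    if days_between_accesses < 30:
        return "STANDARD"   # accessed more than once a month
    if days_between_accesses < 90:
        return "NEARLINE"   # accessed less than once a month
    if days_between_accesses < 365:
        return "COLDLINE"   # accessed even less frequently
    return "ARCHIVE"        # cold data and disaster recovery


# Example: a backup that is read roughly once every four months.
backup_class = suggest_storage_class(120)
```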

After you've chosen your storage class, you'll also need to choose the location type. You can either choose a single region and give users from that region the best performance or you can choose a multi-regional location that is a cluster of individual regions and provide optimal latency to a geographically spread-out user base.

Furthermore, you can customize other parameters such as access control. By default, all data on GCS is encrypted and is only accessible to users who have access to your project. GCS offers highly granular access control using access control lists (ACLs) that enable you to grant roles to different users, with each role having different permissions. Developers can grant the following roles:

Readers
Writers
Owners

You can also make some Quality of Life (QOL) changes to your GCS by turning on features such as Object Versioning. Apart from creating multiple versions of an object, Object Versioning also means you do not run the risk of overwriting objects and losing older versions. Instead, with Object Versioning, your previous version simply gets archived (although this will consume additional storage). If you make a lot of changes to objects and don't want to lose any data, your monthly bills will certainly rise, and there is a way to counteract that as well: with OLM.

OLM is a feature that enables developers to define parameters for their data, and upon fulfillment, your data will automatically be migrated to a lower storage class, reducing your storage costs.

There is another way to keep storage costs under control and at the same time get rid of unnecessary files and keep your storage easily manageable: object lifecycles. As your application continues to grow and change, you'll continue collecting data, some of which may become useless. At this point, you can use object lifecycles to define conditions to automatically delete certain files. There are a number of conditions you can set, including per-object age, fixed date cut-off, version history, and latest version.
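As a rough sketch, the lifecycle conditions above can be written down as a small configuration document. The Python snippet below only builds and prints JSON; it does not call Google Cloud. The rule values (30 days, three newer versions, the 2020-01-01 cut-off) are made-up examples, and the field names follow the lifecycle configuration format as we understand it, so check them against the current documentation before relying on them.

```python
import json

# Example rules only; every number and date here is an assumption.
lifecycle_config = {
    "rule": [
        {
            # Per-object age: move Standard objects to Nearline after 30 days.
            "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
            "condition": {"age": 30, "matchesStorageClass": ["STANDARD"]},
        },
        {
            # Version history: drop archived versions once 3 newer ones exist.
            "action": {"type": "Delete"},
            "condition": {"isLive": False, "numNewerVersions": 3},
        },
        {
            # Fixed date cut-off: delete anything created before this date.
            "action": {"type": "Delete"},
            "condition": {"createdBefore": "2020-01-01"},
        },
    ]
}

lifecycle_json = json.dumps(lifecycle_config, indent=2)
print(lifecycle_json)
```

If you save the printed JSON to a file, it can typically be applied to a bucket with the gsutil lifecycle set command, followed by the file name and the bucket URL.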

Finally, this brings us to the question: what should I use GCS for? Well, quite frankly, GCS can be used to store any unstructured data, which for most users will be everything, as long as they don't need their storage platform to identify and leverage an internal structure, in which case we have a handful of other options now. However, if you don't need to make use of an internal structure to store data, GCS offers extremely reliable storage at very affordable costs and extreme scalability.

GCS is an incredibly diverse platform that will take care of most of your general storage needs. However, if you're looking for more specific capabilities, the following storage options should provide that. We'll start with Cloud SQL.

Cloud SQL

Cloud SQL is a fully managed relational database service that's hosted on Google Compute Engine (GCE). Using Cloud SQL, you can create virtual machines (VMs) (Cloud SQL instances) running a version of MySQL (or PostgreSQL) of your choice while carrying over all of the important capabilities of Structured Query Language (SQL), including the following:

Rich query language
Primary and secondary indexes
Atomicity, Consistency, Isolation, Durability (ACID) transactions
Relational integrity
Stored procedures

If you already have experience with MySQL and are looking for a relational database, Cloud SQL is the perfect option because it pretty much is MySQL underneath. The only difference is that Google hosts MySQL on its servers and takes care of updates, operating systems (OSes), configurations, backups, and other administrative tasks. If you already have a MySQL database and are looking to migrate to Google Cloud, fairly simple and straightforward options exist for migrating your existing database to Cloud SQL as well.

Additionally, Cloud SQL also doubles as a fully managed platform for managing your PostgreSQL relational database. Similar to MySQL, most of the features of PostgreSQL have made their way into Cloud SQL, with a few exceptions. For instance, Cloud SQL currently does not support certain features that require SUPERUSER privileges, custom background workers, and a few other parameters. Otherwise, a PostgreSQL instance on Cloud SQL is very similar to a locally hosted PostgreSQL instance. Cloud SQL supports a wide range of PostgreSQL extensions, procedural languages, and powerful custom machine types (up to 624 gigabytes (GB) of random-access memory (RAM), 96 central processing units (CPUs), and 30 terabytes (TB) of storage).

Cloud SQL is a relational database, meaning it can store highly structured data with a complete schema, and if these two things match your requirements, you can shortlist Cloud SQL to be your database. Relational databases are also generally very good at asking complex questions, and Cloud SQL in particular has great query capabilities. In fact, it is one of the most complex (and capable) options if you're going to be using a lot of queries. On the other hand, there are other options available if your requirements are not as complex.
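The relational capabilities listed earlier (schemas, secondary indexes, ACID transactions, relational integrity, and rich queries) are easier to appreciate in a small example. The sketch below uses Python's built-in sqlite3 module as a local stand-in, because running against Cloud SQL would require a live instance; the table and column names are hypothetical. The same ideas carry over to MySQL and PostgreSQL on Cloud SQL, although the exact SQL dialect differs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this for integrity checks

# A predefined schema with a primary key, a foreign key, and a secondary index.
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    total REAL NOT NULL
);
CREATE INDEX idx_orders_customer ON orders(customer_id);
""")

# A transaction that succeeds and is committed as a unit.
with conn:
    conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
    conn.executemany(
        "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
        [(1, 20.0), (1, 5.5)],
    )

# A transaction that fails: customer 99 does not exist, so the whole
# transaction (including the valid first insert) is rolled back.
try:
    with conn:
        conn.execute("INSERT INTO orders (customer_id, total) VALUES (1, 10.0)")
        conn.execute("INSERT INTO orders (customer_id, total) VALUES (99, 1.0)")
    rolled_back = False
except sqlite3.IntegrityError:
    rolled_back = True

# A query that joins and aggregates across tables.
summary = conn.execute("""
    SELECT c.name, COUNT(o.id), SUM(o.total)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id
""").fetchall()
```

The failed transaction leaves no trace in the summary, which is the atomicity part of ACID at work.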

Furthermore, if you used Google App Engine (GAE) in Chapter 4, Choosing the Right Compute Option, as your compute option, you can use Cloud SQL with GAE with support for a wide range of programming languages, such as the following:

Java
Python
PHP: Hypertext Preprocessor (PHP)
Node.js
Go
Ruby

Now that you have a good understanding of what Cloud SQL is capable of, let's see it in action.

Getting started with Cloud SQL

Getting started with Cloud SQL is really simple and straightforward and, as with GCS, can be done either through Cloud Console or through the Cloud SDK. To create your first Cloud SQL instance, simply go to Cloud Console and select SQL from the left-hand menu. You can choose either a first-generation SQL instance or a second-generation one. We recommend choosing a second-generation instance as this provides better performance and tools such as automatic storage increase. After that, complete the basic configuration, including name, password, region, and zone (you can leave the zone at the default setting, Any, and Google will find the zone with the least latency).

There are additional configuration options that you can access by clicking on Show Configuration Options to change things such as MySQL version, instance size, high availability, and to toggle features such as auto-update. Note that Cloud SQL can spin up instances supporting three versions of MySQL, as follows:

By default, your Cloud SQL instances will have version 5.7 and multiple options for machine types (up to 416 GB of RAM and 30 TB of data storage). Additionally, don't worry about provisioning resources and selecting the machine type at the start as you can always scale up and down by changing the computing power (going from a single core to a four-core machine type, for instance) and disk performance (changing the memory capacity so you can do more or fewer tasks at once).

Click Create, and Google will spin up your new SQL instance (this may take up to 90 seconds).

With the instance ready, you can now click on Detail and further configure it, including connecting it with the compute option you chose in Chapter 4, Choosing the Right Compute Option. You can also connect your SQL instance to other services by authorizing Internet Protocol (IP) addresses. If you didn't customize your instance and change the settings, you can change them now (or anytime). You can also do numerous other things to make your SQL instance exactly the way you need it to be.

For instance, you can change access control by specifying which IP addresses have access to your instance or connecting over Secure Sockets Layer (SSL), which adds an additional layer of security to communication on an unsecured connection. You can also set up maintenance schedules and windows by specifying a day and time at which Google can do maintenance each week.
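The idea behind authorizing IP addresses can be sketched with Python's standard ipaddress module. This is not how Cloud SQL is configured (that happens in Cloud Console or the Cloud SDK); it only illustrates the check that an allowlist implies. The function name is hypothetical and the addresses come from ranges reserved for documentation.

```python
import ipaddress
from typing import Iterable


def is_authorized(client_ip: str, authorized_networks: Iterable[str]) -> bool:
    """Return True if client_ip falls inside any authorized network."""
    address = ipaddress.ip_address(client_ip)
    return any(
        address in ipaddress.ip_network(network)
        for network in authorized_networks
    )


# Hypothetical allowlist: one office range and one single server address.
authorized = ["203.0.113.0/24", "198.51.100.7/32"]

office_ok = is_authorized("203.0.113.42", authorized)
stranger_ok = is_authorized("198.51.100.8", authorized)
```

Keeping the list as narrow as possible (single addresses rather than wide ranges) limits who can even attempt a connection.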

To reiterate, Cloud SQL is a great option if your application uses highly structured data and complex queries—in fact, it's one of the best options if this is your biggest priority. That said, it's still based on GCE, which means while it does offer a lot of control over VMs, it doesn't scale very fast, nor is it the most scalable option available out there—it's somewhere in the middle. The pricing structure is similar to GCE too, and the bulk of your monthly bills would go toward running those VMs.

Cloud Firestore (previously Datastore)

If you're aware of SQL, you'll know that it has always suffered from one drawback—it is not very extensible with predefined schemas, and the fast-paced nature of today's market, along with the constant possibility of viral marketing, means that extensibility may not be something you want to ignore.

As a result, an alternative to SQL databases was developed—NoSQL databases, and Google Cloud has one too, called Google Cloud Datastore. However, a second generation of Datastore is available now, called Firestore. Firestore is relatively new (although it was launched in beta in early 2018). Firestore is a document storage solution, which is a non-relational form of a database. One of the key differences between a relational database such as Cloud SQL and a non-relational database such as Firestore is that the latter, while much more scalable, has far weaker query capabilities. This is mostly due to the fact that document storage does not use rows and columns; instead, it uses documents with properties. It can still do simple scans and lookups by a single key and it can do these very fast, on a large scale, and consistently.

Firestore is fully managed and serverless, which is part of the reason why it's so scalable and flexible. Being serverless also means you get to avoid any maintenance scheduling or downtime, which you might not be able to do in Cloud SQL. Furthermore, it still carries some of the capabilities of SQL, such as ACID transactions and relational integrity.

Getting started with Cloud Firestore

To get started with Firestore, there are two basic terms you need to be aware of. The first is entities. An entity is where you actually store data and is Firestore's equivalent of a document. When you start with Cloud Firestore, the first thing you'll be asked to create is an entity that will become your collection of unique identifiers called keys (we'll talk about keys in a second). An entity also supports a wide range of data types (primitives), including the following:

Booleans
Strings
Integers
Floating-point numbers
Dates or times
Binary data

Keys are comparable to unique IDs that we find in tables; they both represent a unique identifier. However, unlike table IDs, in order to create a key, developers don't need to create a table and then a row, as the key takes on both roles. In real-world usage, this means that you can simply start storing things without creating tables and add additional columns without an ALTER TABLE statement.
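Here is a minimal sketch of what storing data by key without a predefined schema looks like. ToyEntityStore is a hypothetical in-memory class written for this illustration, not the Firestore or Datastore client library. The point it makes is the one above: a key identifies an entity on its own, and a new property can appear on one entity without any ALTER TABLE step.

```python
class ToyEntityStore:
    """Hypothetical in-memory model of a key-based, schemaless store."""

    def __init__(self):
        self._entities = {}

    def put(self, kind, entity_id, properties):
        # The key plays the role of both the "table" (kind) and the row ID.
        key = (kind, entity_id)
        self._entities[key] = dict(properties)
        return key

    def get(self, key):
        entity = self._entities.get(key)
        return dict(entity) if entity is not None else None

    def delete(self, key):
        self._entities.pop(key, None)


store = ToyEntityStore()
ada_key = store.put("User", "ada", {"name": "Ada", "active": True})

# A later entity simply carries an extra property; nothing is migrated.
grace_key = store.put(
    "User", "grace", {"name": "Grace", "active": True, "signup_year": 2021}
)

ada = store.get(ada_key)
grace = store.get(grace_key)
```

The flexibility cuts both ways: because nothing enforces a shape, your application code has to cope with entities that lack a given property.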

Open up Cloud Console and follow these steps to create our first entity:

To start, select Firestore from the left-hand menu.
At the Start screen, you'll be prompted to create your first entity.
Then, select a location (once selected, this cannot be changed for an entity).

Once that's done, you can begin storing data by creating the first entity of its kind. You can then add a property and click Create to create your first entity and begin storing data (leave the indexed box checked to allow you to see this data in searches). You'll note that you didn't have to create a schema to start storing data, and that's because Firestore does not require you to define a schema in advance, but you still want to think about what you'll store, how you'll retrieve it, and the overall data model.

Another thing you might have noticed is that you didn't have to do any operational tasks such as creating VMs. That's because Firestore is a fully managed, serverless database that does most operations for you, which means you don't have to specify instance and cluster sizes or wait for an instance to spin up.

There are two aspects to Cloud Firestore: storage and operations. We've already looked at storage, so let's talk about operations. Operations are what you can do with the stored data, and as we said in the beginning, Firestore isn't the most complex database available. Therefore, your options are limited to basic operations such as the following:

get: Retrieve an entity by its key.
put: Save/update an entity.
delete: Delete an admin operation.
describe: Retrieve information about admin operations.
list: List pending admin operations along with their status.
cancel: Cancel a running admin operation.

Finally, let's talk about pricing. Since there are no VMs in Firestore, Google determines the monthly bill on the basis of storage and operations. Storage costs are calculated in dollars per GB stored. Operation costs are determined by the number of operations each month. Operations are of three types, as outlined next, and each one has a different cost:

Read
Write
Delete

Write operations cost the most, while delete operations are significantly cheaper.
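The billing model described above (storage plus per-operation charges) can be sketched as a small estimator. Every price in PLACEHOLDER_PRICES is a made-up number for illustration only; they merely preserve the ordering stated above (writes cost the most, deletes the least). Look up the current price list before doing any real budgeting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriceSheet:
    per_gb_month: float
    per_100k_reads: float
    per_100k_writes: float
    per_100k_deletes: float


# Hypothetical prices in dollars; NOT real Google Cloud prices.
PLACEHOLDER_PRICES = PriceSheet(
    per_gb_month=0.20,
    per_100k_reads=0.05,
    per_100k_writes=0.15,
    per_100k_deletes=0.02,
)


def estimate_monthly_cost(stored_gb, reads, writes, deletes,
                          prices=PLACEHOLDER_PRICES):
    """Estimate a monthly bill from storage size and operation counts."""
    storage = stored_gb * prices.per_gb_month
    operations = (
        reads * prices.per_100k_reads
        + writes * prices.per_100k_writes
        + deletes * prices.per_100k_deletes
    ) / 100_000
    return round(storage + operations, 2)


# 10 GB stored, 1M reads, 200k writes, 50k deletes in a month.
example_bill = estimate_monthly_cost(10, 1_000_000, 200_000, 50_000)
```

Because there are no VMs to pay for, an idle database in this model costs only its storage.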

Wrapping up, is Cloud Firestore for you? Well, if you're looking for an incredibly scalable database without the requirement of complex queries, Cloud Firestore is a great option. Even without the need for high scalability, Cloud Firestore is a decent option due to its reliability (zero maintenance) but won't be as cost-effective as some other options.

However, if neither of the options we've discussed so far is completely suitable for your project and you're looking for a middle ground, Cloud Spanner is worth checking out.

Cloud Spanner

We've just discussed two very different storage options, each with its own strengths and drawbacks. SQL databases such as Cloud SQL have richer queries, while NoSQL databases such as Firestore have raw extensibility. There is some middle ground, in that both of these platforms offer a few SQL capabilities such as full ACID semantics (Cloud SQL, of course, offers more of them), but the trade-offs have always been there. To get around these trade-offs and enjoy the benefits of both rich queries and impressive scalability, a third type of database was created: a NewSQL database, and on Google Cloud, it is known as Cloud Spanner.

Cloud Spanner was created due to Google's growing storage and relational semantics requirements, which ruled out both MySQL and Firestore (back then known as Megastore). The result was Cloud Spanner, a NewSQL database that has most of the SQL features of a relational database and unlimited scale for all intents and purposes (for the average business anyway). So, what is Cloud Spanner? Cloud Spanner is a fully managed relational database (NewSQL) with unlimited scalability and excellent reliability. Here are some of the highlights of Cloud Spanner:

It offers the best of Firestore (scalability) and Cloud SQL (relational semantics).
Like Firestore, it has very low downtime and maintenance periods, with 99.999% availability.
Because Cloud Spanner uses instances, it can easily move data across different VMs by resizing data chunks in case of a failure (very rare, even on extremely high loads). Your application always stays online and remains up to date.

With the fundamental concepts clear, let's take a look at how to get started with Cloud Spanner.

Getting started with Cloud Spanner

When getting started with Cloud Spanner, one of the most important concepts you need to know is that of an instance. An instance can be thought of as a container that stores your data as well as the compute required for operations. The closest equivalent would be the VMs we used in Cloud SQL, although entities from Firestore are similar too (they just don't have compute capabilities). However, unlike VMs in Cloud SQL, Cloud Spanner instances can replicate automatically within the region you configure (instead of zones). This gives Cloud Spanner the flexibility to quickly move data across different instances in different zones to avoid any downtime in the case of a failure.

The next important concept to understand is that of a node. Nodes are responsible for replicating data across zones, as well as carrying out operations such as retrieving data. Multiple individual nodes make up an instance, and each node is replicated for zones that are in your region.

Finally, as with other relational databases, Cloud Spanner has databases and tables. To store your data on Cloud Spanner, you will first have to create an instance and a database. Let's take a look at how to do this, as follows:

Again, let's use Google Cloud Console to start Cloud Spanner.
When you begin, you'll be prompted to create an instance and will find a similar configuration screen to when we created VMs in Cloud SQL. You'll need to name your instance, give it a unique identifier, and choose between Regional and Multi-regional (this is the same as what we learned in the GCS – basics section).
Choose the number of nodes. If you have one node in one region, it gets replicated thrice (one in each zone). Remember that you'll be billed for each node (a total of three here), so keep your nodes at 1, especially if you're just testing things out (and delete the instance when you're done).

With your instance created, you'll now have the option to create a database. Click on the Create database button to proceed, and enter the details of your database by choosing a name and filling in a schema. You can leave the schema empty and proceed to create a table by entering a name and clicking on Create.

Tables in Cloud Spanner are similar to tables in other relational databases for the most part, and you can use a text editor as well as a schema-building tool to get started. However, differences do arise when you proceed further. For instance, unlike some other databases, Cloud Spanner did not originally have an INSERT SQL query but rather a separate application programming interface (API) to insert data using keys (SQL data manipulation statements were added later). We will not go into more detail about adding data, querying data, altering schemas, and other advanced topics just yet. The goal of this section is to help you understand whether Cloud Spanner is the database for you. So, is it?

Cloud Spanner solves two major problems in two different databases, but it does so by using a lot of resources. As a result, it offers a highly structured relational database, full SQL capabilities, reliability (guarantee on uptime), as well as raw scalability. However, it all comes at a cost, and a relatively high one at that. Cloud Spanner does almost everything, but it is not a cost-effective option. But the truth is, the majority of storage options are enterprise-focused, and thus achieving cost-efficiency isn't their top priority—performance is. And on top of this performance hierarchy of storage options sits Cloud Bigtable.

Cloud Bigtable

So far, we've talked about a few options, each offering something unique but having some trade-offs. But what if the scale isn't something you wanted to trade for and none of the platforms we've discussed so far seemed as though it offered enough scale? What if you wanted truly massive scale? Well, there is an option for that as well, and it's called Cloud Bigtable.

Bigtable is a fully managed, highly scalable NoSQL database that's specifically architected for extremely large workloads (over a TB of structured data). It was first described in a paper published by Google in 2006, which led the open source community to build many similar systems, including Apache HBase in the Hadoop ecosystem. Due to the similarities, Cloud Bigtable can be integrated into the Hadoop ecosystem through the HBase API, with a surprising amount of functionality.

In a nutshell, Cloud Bigtable is the database you want to choose if you need to store over 1 TB of data, require consistent read and write latency within 10 milliseconds (ms), or if you're already using HBase and want to migrate to Google Cloud in a simple and straightforward manner.

Getting started with Cloud Bigtable

As with the other options we've discussed so far, Cloud Bigtable also has some core concepts that are crucial to understanding how this platform works and making the most out of it. The first of these concepts is row keys. Since Cloud Bigtable is a non-relational NoSQL database, it uses keys to find data, similar to Firestore. However, in this case, these are called row keys. Row keys are extremely important because they are the only unique identifiers we have in Bigtable. Unlike SQL databases, there are no secondary indexes such as additional columns that you can use during lookups, so you must rely on row keys. Therefore, it's a good idea to carefully think about how you're going to structure your row keys and organize your data model to avoid overwriting data and inefficient queries.
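To see why row key design matters, here is a small sketch in plain Python. It does not use the Bigtable client; it only relies on the fact that rows are kept sorted by key, so related rows are retrieved with a prefix scan. The key scheme (a device ID, a # separator, and a reversed timestamp so that the newest reading sorts first) is a hypothetical example, and MAX_TS is an arbitrary bound we picked for the reversal.

```python
import bisect

MAX_TS = 10**10  # assumed upper bound on epoch seconds, used to reverse order


def make_row_key(device_id, epoch_seconds):
    """Build a key that groups rows by device and sorts newest first."""
    reversed_ts = MAX_TS - epoch_seconds
    return f"{device_id}#{reversed_ts:010d}"


def prefix_scan(sorted_keys, prefix):
    """Return all keys starting with prefix from a sorted list of keys."""
    start = bisect.bisect_left(sorted_keys, prefix)
    end = bisect.bisect_left(sorted_keys, prefix + "\uffff")
    return sorted_keys[start:end]


readings = [
    ("sensor-a", 1_600_000_000),
    ("sensor-b", 1_600_000_050),
    ("sensor-a", 1_600_000_100),
]
keys = sorted(make_row_key(device, ts) for device, ts in readings)

# All rows for one device sit next to each other, newest reading first.
sensor_a_keys = prefix_scan(keys, "sensor-a#")
```

With a key like this, "latest readings for one device" is a cheap contiguous scan, whereas a question such as "all readings above a threshold" would need a full scan, because there is no secondary index to help.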

The second major concept in Bigtable is the infrastructure hierarchy. At the highest level, we have an instance—inside an instance we can have multiple clusters, and inside each cluster there can be multiple nodes. An instance can be thought of as the main storage unit. However, because Bigtable is a fully managed platform, you do not need to spend any time provisioning resources and creating VMs yourself; you can spin up new instances with a few clicks.

Inside each instance are clusters. There can be up to four clusters in an instance, and inside the clusters, we have nodes. As you may remember from the Cloud Spanner section, nodes in Spanner replicate across different zones; in Bigtable, however, nodes replicate in the same zone to keep the latency at a minimum. You can configure how many nodes each cluster has (up to 30) to control the number of operations, throughput, and cost. Here's a statistic that puts Bigtable's scale into perspective: a 30-node cluster is capable of performing 300,000 operations per second with a throughput of 300 megabytes (MB) per second. At that rate, 300 MB per second is about 26 TB per day, which adds up to roughly 9.5 petabytes (PB) in a year.
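The scale figures above are simple to verify. The short calculation below assumes decimal units (1 TB = 1,000,000 MB and 1 PB = 1,000 TB) and a 365-day year; the function names are our own.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MB_PER_TB = 1_000_000   # decimal units assumed
TB_PER_PB = 1_000


def daily_terabytes(mb_per_second):
    """Convert a sustained MB/s throughput into TB per day."""
    return mb_per_second * SECONDS_PER_DAY / MB_PER_TB


def yearly_petabytes(mb_per_second, days=365):
    """Convert a sustained MB/s throughput into PB per year."""
    return daily_terabytes(mb_per_second) * days / TB_PER_PB


# Figures quoted for a 30-node cluster.
ops_per_node = 300_000 // 30          # operations per second per node
tb_per_day = daily_terabytes(300)     # just under 26 TB per day
pb_per_year = yearly_petabytes(300)   # just under 9.5 PB per year
```

The per-node figure is also what makes the three-node minimum work out to 30,000 operations per second.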

With these concepts out of the way, we can create our first Bigtable instance. Follow the next steps to get started with Bigtable:

We'll once again use Cloud Console to start up Bigtable.
Once there, click on Create instance to configure your instance and cluster.
As before, enter the instance name, instance ID, and cluster ID, and then select a zone. The zone should ideally be close to any VMs that will access your database.

You'll also be asked to specify the number of nodes you want. Again, keep this at a minimum (which is 3 in this case) if you're doing a test, and don't forget to delete the instance after you're done because, as with Cloud Spanner, Bigtable can get quite expensive. But is it worth it?

Well, it depends. Bigtable wasn't built for small applications. Even at its lowest configuration, it still has three nodes and is capable of 30,000 operations per second, which is more than what's needed for almost every small business out there. More importantly, unlike some of the other Google Cloud services, Bigtable pricing isn't based on usage. Even if you leave the three nodes unused, you'll still be billed the same amount because each node is reserved for sole use by your application.

So, what are the main takeaways? Well, Bigtable is fairly unstructured compared to databases such as Cloud SQL and isn't capable of richer queries (it only has row keys). But it provides extremely low latency, some of the highest throughput out of all databases, and it is extremely scalable. On the downside, it costs a lot and isn't really ideal for small-scale usage.

Wrapping up the big five

With the information presented so far, you should have a good idea of the capabilities of the five database/storage options that we've discussed so far, as well as the ideal use cases for them. That said, to make things a little bit easier, here is a decision tree that you can use to put all five of these options in perspective:

Figure 5.2 – Decision tree
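The same reasoning can be written as a few lines of code. This sketch is our own reading of the chapter's text, not a transcription of the figure, and the three yes/no questions it asks (is the data structured, do you need relational semantics, do you need massive scale) are a deliberate simplification.

```python
def choose_storage(structured, relational=False, massive_scale=False):
    """Pick one of the five services from three yes/no questions."""
    if not structured:
        # Opaque bytes such as images, archives, and backups.
        return "Cloud Storage"
    if relational:
        # Rich queries and schemas; Spanner when scale rules out Cloud SQL.
        return "Cloud Spanner" if massive_scale else "Cloud SQL"
    # Key-based lookups; Bigtable for very large, low-latency workloads.
    return "Cloud Bigtable" if massive_scale else "Cloud Firestore"


# Example: a small web app with a conventional relational schema.
small_app_choice = choose_storage(structured=True, relational=True)
```

In practice, cost belongs in the decision too: the two massive-scale options are also the most expensive ones.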

These were the major cloud storage options that Google Cloud provides, and hopefully you found the one that fits your needs the best. However, if you didn't, don't worry—we've not yet exhausted our options. The Google Cloud ecosystem has numerous other cloud storage and database options that might be a better fit for you. Let's take a look.

Additional storage and database options

The Google Cloud ecosystem is huge, and it's not possible to cover every service in a single chapter. The five services that we've already covered make up the largest part of Google Cloud's storage and database services and are, in many ways, the pillars. However, there are some additional services that you'll find useful as your journey in cloud application development continues. Here are four such services.

BigQuery

BigQuery is a data warehouse that you query with SQL and that doubles as a big data analytics platform. It's under the Big Data section of Cloud Console instead of the Storage section because its main purpose is to query extremely large datasets extremely quickly, scanning terabytes of data within minutes.

The reason we didn't include BigQuery in our main list of database options was that most small businesses and developers who are just starting out won't be working with datasets that are measured in terabytes and petabytes. However, you might find BigQuery to be an extremely useful tool when you need to analyze large datasets such as public records and archives.

Filestore

Filestore takes the opposite approach to a storage solution such as GCS, which stores unstructured data as objects with no internal hierarchy to manage them. Filestore is a fully managed file storage solution that works well for applications that need a filesystem interface. Rather than treating each item as an opaque sequence of bytes, it exposes directories and files, which makes it a good option for tasks such as media rendering, analytics, and shared content.
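As a rough illustration of what a filesystem interface means in practice, the sketch below uses ordinary Python file operations. A temporary directory stands in for a Filestore share mounted on a VM; the directory layout, the function name render_frames, and the file names are all hypothetical.

```python
import tempfile
from pathlib import Path


def render_frames(mount_point: Path, project: str, frame_count: int) -> list[str]:
    """Write frame files into a project directory and return their sorted names."""
    project_dir = mount_point / "renders" / project
    project_dir.mkdir(parents=True, exist_ok=True)
    for i in range(frame_count):
        (project_dir / f"frame_{i:04d}.txt").write_text(f"frame {i}\n")
    # Append to a shared log in place, which a filesystem allows directly.
    with open(project_dir / "render.log", "a") as log:
        log.write(f"rendered {frame_count} frames\n")
    return sorted(p.name for p in project_dir.glob("frame_*.txt"))


# Assumption: on a real VM, this path would be the mounted share instead.
with tempfile.TemporaryDirectory() as tmp:
    names = render_frames(Path(tmp), "demo", 3)
    print(names)
```

Nothing in the function is specific to the cloud, and that is the point: code written against directories and files can run unchanged once the share is mounted.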

Persistent disks/local solid-state drive (SSD) (block storage)

Block storage can be thought of like the physical hard drives that we use in our regular computers. They can be removed from one computer and plugged into another, and they would still have all of the data. Google offers a similar high-performance and high-availability service called Persistent Disk. Persistent disks can be added to your VM instances running on GCE or GKE. Two of the block storage options are outlined as follows:

Standard persistent disk: This is the low-cost version that uses hard disk drives (HDDs).
Local SSD: This is the high-performance option, with better speed and lower latency. Unlike a persistent disk, a local SSD is physically attached to the host machine, so its data does not survive once the VM is stopped or deleted.

Persistent Disk is a cost-efficient cloud storage option that gives developers a great deal of flexibility and control over their data.

Memorystore

Memorystore is a database option that keeps data in memory instead of on traditional HDDs and SSDs. Since memory is considerably faster than any SSD available, Memorystore is able to significantly reduce latency, which can be crucial in applications that need to show users real-time updates measured in milliseconds (competitive e-sports, bidding, stream analytics, and so on). Memorystore is a managed service for Redis and Memcached that automates monitoring, updates, and failover. As with other Google Cloud storage options, Memorystore is scalable, with support for clusters of up to 5 TB.
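One common way to use an in-memory store is as a cache in front of a slower database, often called the cache-aside pattern. The chapter does not prescribe this pattern; the sketch below is an illustration only, with a plain dictionary standing in for Redis and with the class name, function names, and time-to-live value all chosen for the example.

```python
import time


class InMemoryCache:
    """Dictionary-backed stand-in for an in-memory store such as Redis."""

    def __init__(self, ttl_seconds: float):
        self._ttl = ttl_seconds
        self._data = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # entry is stale, so drop it
            return None
        return value

    def set(self, key, value):
        self._data[key] = (time.monotonic() + self._ttl, value)


def get_with_cache(cache, key, load_from_database):
    """Return the cached value, loading and caching it on a miss."""
    value = cache.get(key)
    if value is None:
        value = load_from_database(key)
        cache.set(key, value)
    return value


def slow_lookup(user_id):
    """Hypothetical database call; the sleep simulates disk and network time."""
    time.sleep(0.05)
    return {"id": user_id, "score": 42}


cache = InMemoryCache(ttl_seconds=30)
first = get_with_cache(cache, "user:1", slow_lookup)   # miss: hits the database
second = get_with_cache(cache, "user:1", slow_lookup)  # hit: served from memory
```

The second call skips the simulated database entirely, which is the latency saving the text describes. The time-to-live bounds how long a stale value can be served.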

Security and flexibility

There is no definite pattern that you need to follow when choosing your storage and database options. Different database and storage options can be used together—and often are—to complete tasks that have different priorities; the cloud gives you that flexibility.

That said, some services are more flexible than others. For instance, GCS—despite supporting only unstructured data—can be used for a large number of tasks due to its low cost, high scalability, low latency, and reliability. The only task you shouldn't use GCS for is querying your data. Instead, you can use GCS with a relational database that is capable of rich queries, such as Cloud SQL or Cloud Spanner.
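The pairing described above can be sketched as follows. This is an illustration under stated assumptions, not real client code: a dictionary stands in for a GCS bucket, an in-memory SQLite database stands in for Cloud SQL, and the table name, columns, and function names are hypothetical.

```python
import sqlite3

# Stand-ins: a dict for a GCS bucket, in-memory SQLite for Cloud SQL.
bucket = {}
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE uploads ("
    "object_name TEXT PRIMARY KEY, owner TEXT, size_bytes INTEGER)"
)


def upload(object_name: str, owner: str, data: bytes) -> None:
    """Store the raw bytes in the bucket and the searchable metadata in SQL."""
    bucket[object_name] = data
    db.execute(
        "INSERT OR REPLACE INTO uploads VALUES (?, ?, ?)",
        (object_name, owner, len(data)),
    )


def objects_owned_by(owner: str, min_size: int = 0) -> list[str]:
    """Answer the query from the database and return object names to fetch."""
    rows = db.execute(
        "SELECT object_name FROM uploads "
        "WHERE owner = ? AND size_bytes >= ? ORDER BY object_name",
        (owner, min_size),
    )
    return [name for (name,) in rows]


upload("photos/cat.jpg", "alice", b"\xff\xd8" * 500)
upload("photos/dog.jpg", "bob", b"\xff\xd8" * 50)
upload("docs/notes.txt", "alice", b"hello")

for name in objects_owned_by("alice", min_size=100):
    print(name, len(bucket[name]))
```

The object store never has to answer a question about its contents. The database decides which objects are relevant, and the bucket is only asked for objects by name, which is the one thing it does cheaply.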

Finally, there is data security. While data storage in the cloud is safer than most on-premises solutions, there are still steps you should take to ensure the safety of your data. For starters, all of the data stored in Google Cloud is encrypted by default. That said, developers can further improve security by managing their own encryption keys, either with the built-in Cloud Key Management Service that Google Cloud provides or with an on-premises system. Furthermore, almost all the storage options we've discussed offer some form of failover protection to ensure your data isn't wiped out in the case of a failure.

Summary

Google Cloud has an expansive list of storage and database solutions. We've covered most of them, and you should now be able to find the right service for you. However, remember that one of the benefits of using Google Cloud services is that most of them can be used in tandem with each other, which opens up numerous other possibilities.

We also briefly touched on cloud security, but we'll take a much closer look at it in the final chapter of Part 2 (Chapter 7, Implementing Cloud Native Security), where we'll dive a lot deeper into the world of cloud security.
