Chapter 10. Programming for the NoSQL database service: DynamoDB

This chapter covers

The DynamoDB NoSQL database service
Creating tables and secondary indexes
Integrating DynamoDB into your service stack
Designing a key-value optimized data model
Tuning performance

Scaling a traditional, relational database is difficult because transactional guarantees (atomicity, consistency, isolation, and durability, also known as ACID) require communication among all nodes of the database. The more nodes you add, the slower your database becomes, because more nodes must coordinate transactions between each other. The way to tackle this has been to use databases that don’t adhere to these guarantees. They’re called NoSQL databases.

There are four types of NoSQL databases—document, graph, columnar, and key-value store—each with its own uses and applications. Amazon provides a NoSQL database service called DynamoDB. Unlike RDS, which effectively provides several common RDBMS engines like MySQL, Oracle Database, Microsoft SQL Server, and PostgreSQL, DynamoDB is a fully managed, proprietary, closed source key-value store. If you want to use a different type of NoSQL database—a document database like MongoDB, for example—you’ll need to spin up an EC2 instance and install MongoDB directly on that. Use the instructions in chapters 3 and 4 to do so. DynamoDB is highly available and highly durable. You can scale from one item to billions and from one request per second to tens of thousands of requests per second.

This chapter looks in detail at how to use DynamoDB: both how to administer it like any other service and how to program your applications to use it. Administering DynamoDB is simple. You can create tables and secondary indexes, and there’s only one option to tweak: its read and write capacity, which directly affects its cost and performance.

We’ll look at the basics of DynamoDB and demonstrate them by walking through a simple to-do application called nodetodo, the Hello World of modern applications. Figure 10.1 shows the to-do application nodetodo in action.

Figure 10.1. You can manage your tasks with the command-line to-do application nodetodo.

Examples are 100% covered by the Free Tier

The examples in this chapter are totally covered by the Free Tier. As long as you don’t run the examples longer than a few days, you won’t pay anything for it. Keep in mind that this applies only if you created a fresh AWS account for this book and there are no other things going on in your AWS account. Try to complete the chapter within a few days, because you’ll clean up your account at the end of the chapter.

Before you get started with nodetodo, you need to know about DynamoDB 101.

10.1. Operating DynamoDB

DynamoDB doesn’t require administration like a traditional relational database; instead, you have other tasks to take care of. Pricing depends mostly on your storage usage and performance requirements. This section also compares DynamoDB to RDS.

10.1.1. Administration

With DynamoDB, you don’t need to worry about installation, updates, servers, storage, or backups:

DynamoDB isn’t software you can download. Instead, it’s a NoSQL database as a service. Therefore, you really can’t install DynamoDB like you install MySQL or MongoDB. This also means you don’t have to update your database; the software is maintained by AWS.
DynamoDB runs on a fleet of servers operated by AWS. They take care of the OS and all security-related questions. From a security perspective, it’s your job to grant the right permissions in IAM to the users of your DynamoDB tables.
DynamoDB replicates your data among multiple servers and across multiple data centers. There’s no need for a backup from a durability point of view—the backup is already in the database.

Now you know some administrative tasks that are no longer necessary if you use DynamoDB. But you still have things to consider when using DynamoDB in production: creating tables (see section 10.4), creating secondary indexes (section 10.6), monitoring capacity usage, and provisioning read and write capacity (section 10.9).

10.1.2. Pricing

If you use DynamoDB, you pay the following monthly:

$0.25 per used GB of storage (secondary indexes consume storage as well)
$0.47 per provisioned write-capacity unit of throughput (throughput is explained in section 10.9)
$0.09 per provisioned read-capacity unit of throughput

These prices are valid for the Northern Virginia (us-east-1) region. No additional traffic charges apply if you use AWS resources like EC2 servers to access DynamoDB in the same region.
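To make the arithmetic concrete, here is a minimal sketch that estimates a monthly bill from the three prices listed above. The function name estimateMonthlyCost and the example workload are illustrative assumptions; the prices are the us-east-1 figures quoted in this section and may no longer be current.

```javascript
// Monthly prices from this section (us-east-1), kept in cents so that the
// sums stay free of floating-point rounding surprises.
const CENTS_PER_GB = 25;         // $0.25 per used GB of storage
const CENTS_PER_WRITE_UNIT = 47; // $0.47 per provisioned write-capacity unit
const CENTS_PER_READ_UNIT = 9;   // $0.09 per provisioned read-capacity unit

// Hypothetical helper: returns the estimated monthly cost in dollars.
function estimateMonthlyCost(storageGb, writeUnits, readUnits) {
  const cents = storageGb * CENTS_PER_GB +
    writeUnits * CENTS_PER_WRITE_UNIT +
    readUnits * CENTS_PER_READ_UNIT;
  return cents / 100;
}

// Assumed workload: 10 GB of data, 5 write units, 5 read units.
console.log(estimateMonthlyCost(10, 5, 5)); // 5.3
```

Remember that every secondary index adds its own storage and provisioned capacity to these numbers.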

10.1.3. RDS comparison

Table 10.1 compares DynamoDB and RDS. Keep in mind that this is like comparing apples and oranges; the only thing DynamoDB and RDS have in common is that both are called databases.

Table 10.1. Differences between DynamoDB and RDS

Task	DynamoDB	RDS
Creating a table	Management Console, SDK, or CLI aws dynamodb create-table	SQL CREATE TABLE statement
Inserting, updating, or deleting data	SDK	SQL INSERT, UPDATE, or DELETE statement, respectively
Querying data	If you query the primary key: SDK. Querying non-key attributes isn’t possible, but you can add a secondary index or scan the entire table.	SQL SELECT statement
Increasing storage	No action needed: DynamoDB grows with your items.	Provision more storage.
Increasing performance	Horizontal, by increasing capacity. DynamoDB will add more servers under the hood.	Vertical, by increasing instance size; or horizontal, by adding read replicas. There is an upper limit.
Installing the database on your machine	DynamoDB isn’t available for download. You can only use it as a service.	Download MySQL, Oracle Database, Microsoft SQL Server, or PostgreSQL, and install it on your machine.
Hiring an expert	Search for special DynamoDB skills.	Search for general SQL skills or special skills, depending on the database engine.

10.2. DynamoDB for developers

DynamoDB is a key-value store that organizes your data in tables. Each table contains items (values) that are identified by keys. A table can also maintain secondary indexes for data look-up in addition to the primary key. In this section, you’ll look at these basic building blocks of DynamoDB, ending with a brief comparison of NoSQL databases.

10.2.1. Tables, items, and attributes

A DynamoDB table has a name and organizes a collection of items. An item is a collection of attributes. An attribute is a name-value pair. The attribute value can be scalar (number, string, binary, boolean), multivalued (number set, string set, binary set), or a JSON document (object, array). Items in a table aren’t required to have the same attributes; there is no enforced schema.
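As an illustration of these attribute types, the following sketch converts plain JavaScript values into the typed attribute-value format used by the low-level DynamoDB API (S, N, BOOL, SS, NS, M, L). The toAttributeValue and toItem helpers are hypothetical and cover only a subset of cases: binary values, null, and empty sets are left out.

```javascript
// Hypothetical helper: maps a JavaScript value to a DynamoDB attribute value.
// Numbers travel as strings in the low-level API.
function toAttributeValue(value) {
  if (typeof value === 'string') return { S: value };
  if (typeof value === 'number') return { N: String(value) };
  if (typeof value === 'boolean') return { BOOL: value };
  if (value instanceof Set) {
    const members = Array.from(value);
    if (members.length === 0) throw new Error('empty sets are not handled in this sketch');
    if (members.every((m) => typeof m === 'string')) return { SS: members };
    if (members.every((m) => typeof m === 'number')) return { NS: members.map(String) };
    throw new Error('sets must contain only strings or only numbers');
  }
  if (Array.isArray(value)) return { L: value.map(toAttributeValue) };
  if (value !== null && typeof value === 'object') {
    const map = {};
    for (const [name, v] of Object.entries(value)) map[name] = toAttributeValue(v);
    return { M: map };
  }
  throw new Error('unsupported type in this sketch');
}

// An item is a collection of attributes (name-value pairs).
function toItem(obj) {
  return toAttributeValue(obj).M;
}

// Two items in the same table may carry different attributes.
const itemA = toItem({ uid: 'michael', tasks: 2, tags: new Set(['home', 'work']) });
const itemB = toItem({ uid: 'andreas', admin: false });
console.log(JSON.stringify(itemA));
console.log(JSON.stringify(itemB));
```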

You can create a table with the Management Console, CloudFormation, SDKs, or the CLI. The following example shows how you create a table with the CLI (don’t try to run this command now—you’ll create a table later in the chapter):

If you plan to run multiple applications that use DynamoDB, it’s good practice to prefix your tables with the name of your application. You can also add tables via the Management Console. Keep in mind that you can’t change the name of a table and the key schema. But you can add attribute definitions and change the provisioned throughput.

10.2.2. Primary keys

A primary key is unique within a table and identifies an item. You need the primary key to look up an item. The primary key is either a hash or a hash and a range.

Hash keys

A hash key uses a single attribute of an item to create a hash index. If you want to look up an item based on its hash key, you need to know the exact hash key. A user table could use the user’s email as a hash primary key. A user then can be retrieved if you know the hash key (email, in this case).

Hash and range keys

A hash and range key uses two attributes of an item to create a more powerful index. The first attribute is the hash part of the key, and the second part is the range. To look up an item, you need to know the exact hash part of the key, but you don’t need to know the range part. The range part is sorted within the hash. This allows you to query the range part of the key from a certain starting point. A message table can use a hash and range as its primary key; the hash is the email of the user, and the range is a timestamp. You can now look up all messages of a user that are newer than a specific timestamp.
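The lookup rules for the two key types can be illustrated with a small in-memory model. This is not how DynamoDB is implemented; HashRangeTable is a hypothetical class that only mimics the behavior described above: an exact match on the hash part, and a sorted range part that can be queried from a starting point.

```javascript
// Toy model of a table with a hash and range key (not DynamoDB's real design).
class HashRangeTable {
  constructor() {
    this.partitions = new Map(); // hash value -> entries sorted by range value
  }

  put(hash, range, item) {
    const entries = (this.partitions.get(hash) || [])
      .filter((entry) => entry.range !== range); // same key replaces the old item
    entries.push({ range, item });
    entries.sort((a, b) => a.range - b.range);   // range part stays sorted
    this.partitions.set(hash, entries);
  }

  // The hash part must match exactly; the range part is an optional lower bound.
  queryNewerThan(hash, minRange = -Infinity) {
    return (this.partitions.get(hash) || [])
      .filter((entry) => entry.range > minRange)
      .map((entry) => entry.item);
  }
}

// Messages keyed by the user's email (hash) and a timestamp (range).
const messages = new HashRangeTable();
messages.put('a@example.com', 3000, 'third');
messages.put('a@example.com', 1000, 'first');
messages.put('a@example.com', 2000, 'second');
messages.put('b@example.com', 1500, 'other user');

console.log(messages.queryNewerThan('a@example.com', 1000)); // [ 'second', 'third' ]
```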

10.2.3. NoSQL comparison

Table 10.2 compares DynamoDB to several NoSQL databases. Keep in mind that all of these databases have pros and cons, and the table shows only a high-level comparison of how they can be used on top of AWS.

Table 10.2. Differences between DynamoDB and some NoSQL databases

Task	DynamoDB Key-value store	MongoDB Document store	Neo4j Graph store	Cassandra Columnar store	Riak KV Key-value store
Run the database on AWS in production.	One click: it’s a managed service.	Cluster of EC2 instances, self-maintained.	Cluster of EC2 instances, self-maintained.	Cluster of EC2 instances, self-maintained.	Cluster of EC2 instances, self-maintained.
Increase available storage while running.	Not necessary. The database grows automatically.	Add more EC2 instances (replica set).	Not possible (the increasing size of EBS volumes requires downtime).	Add more EC2 instances.	Add more EC2 instances.

10.2.4. DynamoDB Local

Imagine a team of developers working on a new app using DynamoDB. During development, each developer needs an isolated database so as not to corrupt the other team members’ data. They also want to write unit tests to make sure their app is working. You could create a unique set of DynamoDB tables with a CloudFormation stack per developer to separate them, or you could use a local DynamoDB. AWS provides a Java mockup of DynamoDB, which is available for download at http://mng.bz/27h5. Don’t run it in production! It’s only made for development purposes and provides the same functionality as DynamoDB, but it uses a different implementation: only the API is the same.

10.3. Programming a to-do application

To minimize the overhead of a programming language, you’ll use Node.js/JavaScript to create a small to-do application that can be used via the terminal on your local machine. Let’s call the application nodetodo. nodetodo will use DynamoDB as a database. With nodetodo, you can do the following:

Create and delete users
Create and delete tasks
Mark tasks as done
Get a list of all tasks with various filters

nodetodo supports multiple users and can track tasks with or without a due date. To help users deal with many tasks, a task can be assigned to a category. nodetodo is accessed via the terminal. Here’s how you would use nodetodo via the terminal to add a user (don’t try to run this command now—it’s not yet implemented):

To add a new task, you would do the following (don’t try to run this command now—it’s not yet implemented):

You would mark a task as finished as follows (don’t try to run this command now—it’s not yet implemented):

# node index.js task-done <uid> <tid>
$ node index.js task-done michael 1432187491647
=> task completed with tid 1432187491647

You should also be able to list tasks. Here’s how you would use nodetodo to do that (don’t try to run this command now—it’s not yet implemented):

# node index.js task-ls <uid> [<category>] [--overdue|--due|...]
$ node index.js task-ls michael
=> tasks [...]

To implement an intuitive CLI, nodetodo uses docopt, a command-line interface description language, to describe the CLI interface. The supported commands are as follows:

user-add —Adds a new user to nodetodo
user-rm —Removes a user
user-ls —Lists users
user —Shows the details of a single user
task-add —Adds a new task to nodetodo
task-rm —Removes a task
task-ls —Lists user tasks with various filters
task-la —Lists tasks by category with various filters
task-done —Marks a task as finished

In the rest of the chapter, you’ll implement those commands. The following listing shows the full CLI description of all the commands, including parameters.

Listing 10.1. CLI description language docopt: using nodetodo (cli.txt)

DynamoDB isn’t comparable to a traditional relational database in which you create, read, update, or delete data with SQL. You’ll access DynamoDB with an SDK to call the HTTP REST API. You must integrate DynamoDB into your application; you can’t take an existing application that uses a SQL database and run it on DynamoDB. To use DynamoDB, you need to write code!

10.4. Creating tables

A table in DynamoDB organizes your data. You aren’t required to define all the attributes that table items will have. DynamoDB doesn’t need a static schema like a relational database, but you must define the attributes that are used as the primary key in your table. In other words, you must define the table’s primary key. To do so, you’ll use the AWS CLI. The aws dynamodb create-table command has four mandatory options:

table-name —Name of the table (can’t be changed).
attribute-definitions —Name and type of attributes used as the primary key. A definition AttributeName=attr1,AttributeType=S can be repeated multiple times, separated by a space character. Valid types are S (String), N (Number), and B (Binary).
key-schema —Name of attributes that are part of the primary key (can’t be changed). Contains a single AttributeName=attr1,KeyType=HASH entry or two separated by spaces for the hash and range key. Valid types are HASH and RANGE.
provisioned-throughput —Performance settings for this table defined as ReadCapacityUnits=5,WriteCapacityUnits=5 (you’ll learn about this in section 10.9).
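The same four settings appear when a table is created through the SDK instead of the CLI. The sketch below builds the parameter object for the SDK’s createTable operation. buildCreateTableParams is a hypothetical helper, the table and attribute names are placeholders, and the capacity values of 5 are the ones shown in the option description above.

```javascript
// Hypothetical helper: builds parameters for the createTable operation.
// hashKey and rangeKey are objects like { name: 'id', type: 'S' }.
function buildCreateTableParams(tableName, hashKey, rangeKey) {
  const keys = rangeKey ? [hashKey, rangeKey] : [hashKey];
  return {
    TableName: tableName,
    // Only attributes that are part of the primary key are declared.
    AttributeDefinitions: keys.map((key) => ({
      AttributeName: key.name,
      AttributeType: key.type, // S (String), N (Number), or B (Binary)
    })),
    KeySchema: keys.map((key, position) => ({
      AttributeName: key.name,
      KeyType: position === 0 ? 'HASH' : 'RANGE',
    })),
    ProvisionedThroughput: { ReadCapacityUnits: 5, WriteCapacityUnits: 5 },
  };
}

// Placeholder tables: one with a hash key only, one with a hash and range key.
const hashOnly = buildCreateTableParams('app-example', { name: 'id', type: 'S' });
const hashAndRange = buildCreateTableParams(
  'app-events', { name: 'id', type: 'S' }, { name: 'ts', type: 'N' });
console.log(JSON.stringify(hashAndRange.KeySchema));
```

With the AWS SDK for JavaScript (v2), such an object would be passed to the createTable method of an AWS.DynamoDB client together with a callback.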

You’ll now create a table for the users of the nodetodo application and a table that will contain all the tasks.

10.4.1. Users with hash keys

Before you create a table for nodetodo users, you must think carefully about the table’s name and primary key. We suggest that you prefix all your tables with the name of your application. In this case, the table name is todo-user. To choose a primary key, you have to think about the queries you’ll make in the future and whether there is something unique about your data items. Users will have a unique ID, called uid, so it makes sense to choose the uid attribute as the primary key. You must also be able to look up users based on the uid to implement the user command. If you want a single attribute to be your primary key, you can always create a hash index: an unordered index based on the hash key. The following example shows a user table where uid is used as the primary hash key:

Because users will only be looked up based on the known uid, it’s fine to use a hash key. Next you’ll create the user table, structured like the previous example, with the help of the AWS CLI:

Creating a table takes some time. Wait until the status changes to ACTIVE. You can check the status of a table as follows:

10.4.2. Tasks with hash and range keys

Tasks always belong to a user, and all commands that are related to tasks include the user’s ID. To implement the task-ls command, you need a way to query the tasks based on the user’s ID. In addition to the hash key, you can use a hash and range key. Because all interactions with tasks require the user’s ID, you can choose uid as the hash part and a task ID (tid), the timestamp of creation, as the range part of the key. Now you can make queries that include the user’s ID and, if needed, the task’s ID.

Note

This solution has one limitation: users can add only one task per timestamp. The timestamp has millisecond resolution, so collisions should be rare. But if a user must be able to add two tasks at exactly the same time, you need to handle that case explicitly, because both tasks would end up with the same primary key.

A hash and range key uses two of your table attributes. For the hash part of the key, an unordered hash index is maintained; the range part is kept in a sorted range index. The combination of the hash and the range uniquely identifies the item. The following data set shows the combination of unsorted hash parts and sorted range parts:

nodetodo offers the ability to get all tasks for a user. If the tasks have only a primary hash key, this will be difficult, because you need to know the key to extract them from DynamoDB. Luckily, the hash and range key makes things easier, because you only need to know the hash portion of the key to extract the items. For the tasks, you’ll use uid as the known hash portion. The range part is tid. The task ID is defined as the timestamp of task creation. You’ll now create the task table, using two attributes to create a hash and range index:

Wait until the table status changes to ACTIVE when you run aws dynamodb describe-table --table-name todo-task. When both tables are ready, you’ll add some data.

10.5. Adding data

You have two tables up and running. To use them, you need to add some data. You’ll access DynamoDB via the Node.js SDK, so it’s time to set up the SDK and some boilerplate code before you implement adding users and tasks.

Installing and getting started with Node.js

Node.js is a platform to execute JavaScript in an event-driven environment so you can easily build network applications. To install Node.js, visit https://nodejs.org and download the package that fits your OS.

After Node.js is installed, you can verify that everything works by typing node --version into your terminal. Your terminal should respond with something similar to v0.12.*. Now you’re ready to run JavaScript examples like nodetodo for AWS.

To get started with Node.js and docopt, you need some magic lines to load all the dependencies and do some configuration work. Listing 10.2 shows how this can be done.

Where is the code located?

As usual, you’ll find the code in the book’s code repository on GitHub: https://github.com/AWSinAction/code. nodetodo is located in /chapter10/.

Docopt is responsible for reading all the arguments passed to the process. It returns a JavaScript object, where the arguments are mapped to the described parameters in the CLI description.

Listing 10.2. nodetodo: using docopt in Node.js (index.js)

Next you’ll implement the features of nodetodo. You can use the putItem SDK operation to add data to DynamoDB like this:

The first step is to add data to nodetodo.

10.5.1. Adding a user

You can add a user to nodetodo by calling nodetodo user-add <uid> <email> <phone>. In Node.js, you do this using the code in the following listing.

Listing 10.3. nodetodo: adding a user (index.js)

When you make a call to the AWS API, you always do the following:

1. Create a JavaScript object (map) filled with the needed parameters (the params variable).

2. Invoke the function on the AWS SDK.

3. Check whether the response contains an error, or process the returned data.

Therefore you only need to change the content of params if you want to add a task instead of a user.
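The three steps above can be sketched as follows. This is an illustration under assumptions, not the book’s listing: the attribute names uid, email, and phone are taken from the command’s arguments, buildUserParams and addUser are hypothetical helpers, and db stands for any client object exposing putItem(params, callback), such as an AWS.DynamoDB instance from the AWS SDK for JavaScript (v2).

```javascript
// Step 1: create the parameter object.
function buildUserParams(uid, email, phone) {
  return {
    TableName: 'todo-user',
    Item: {
      uid: { S: uid },
      email: { S: email },
      phone: { S: phone },
    },
    // Assumption: refuse to overwrite an existing user with the same uid.
    ConditionExpression: 'attribute_not_exists(uid)',
  };
}

// Steps 2 and 3: invoke the SDK function, then check for an error.
function addUser(db, uid, email, phone, done) {
  db.putItem(buildUserParams(uid, email, phone), (err) => {
    if (err) {
      done(err);
    } else {
      done(null, 'user added with uid ' + uid);
    }
  });
}
```

Passing the client in as an argument keeps the sketch independent of how the SDK is configured and makes it easy to exercise with a stub in unit tests.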

10.5.2. Adding a task

You can add a task to nodetodo by calling nodetodo task-add <uid> <description> [<category>] [--dueat=<yyyymmdd>]. In Node.js, you do this with the code shown in the following listing.
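A sketch of what the parameters for a task could look like, assuming the key design from section 10.4.2 (uid as the hash part, the creation timestamp tid as the range part). The helper buildTaskParams, the attribute names description, category, and due, and the attribute types are assumptions made for illustration.

```javascript
// Hypothetical helper: builds putItem parameters for a new task.
// `now` can be injected to make the generated task ID predictable.
function buildTaskParams(uid, description, category, dueat, now = Date.now()) {
  const params = {
    TableName: 'todo-task',
    Item: {
      uid: { S: uid },          // hash part of the primary key
      tid: { N: String(now) },  // range part: timestamp of creation
      description: { S: description },
    },
  };
  // Optional attributes are only stored when a value was given;
  // there is no schema that would require them.
  if (category) params.Item.category = { S: category };
  if (dueat) params.Item.due = { N: String(dueat) };
  return params;
}

const taskParams = buildTaskParams('michael', 'plan lunch', 'work', 20150525, 1432187491647);
console.log(JSON.stringify(taskParams.Item));
```

Only the content of the parameter object differs from adding a user; the call to putItem and the error handling stay the same.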

Listing 10.4. nodetodo: adding a task (index.js)

Now you can add users and tasks to nodetodo. Wouldn’t it be nice if you could retrieve all this data?

10.6. Retrieving data

DynamoDB is a key-value store. The key is usually the only way to retrieve data from such a store. When designing a data model for DynamoDB, you must be aware of that limitation when you create tables (you did so in section 10.4). If you can use only one key to look up data, you’ll sooner or later experience difficulties. Luckily, DynamoDB provides two other ways to look up items: a secondary index key lookup and the scan operation. You’ll start by retrieving data with its primary key and continue with more sophisticated methods of data retrieval.

DynamoDB Streams

DynamoDB lets you retrieve changes to a table as soon as they’re made. A stream provides all write (create, update, delete) operations to your table items. The order is consistent within a hash key:

If your application polls the database for changes, DynamoDB Streams solves the problem in a more elegant way.
If you want to populate a cache with the changes made to a table, DynamoDB Streams can help.
If you want to replicate a DynamoDB table to another region, DynamoDB Streams can do it.

10.6.1. Getting by key

The simplest form of data retrieval is looking up a single item by its primary key. The getItem SDK operation to get a single item from DynamoDB can be used like this:

The command nodetodo user <uid> must retrieve a user by the user’s ID (uid). Translated to the Node.js AWS SDK, this looks like the following listing.

Listing 10.5. nodetodo: retrieving a user (index.js)

You can also use the getItem operation to retrieve data by primary hash and range key. The only change is that Key has two entries instead of one. getItem returns one item or no items; if you want to get multiple items, you need to query DynamoDB.
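The two shapes of Key can be sketched as follows, assuming a string uid and a numeric tid as in the table design from section 10.4. The helper names are hypothetical, and db again stands for any client exposing getItem(params, callback), such as an AWS.DynamoDB instance from the AWS SDK for JavaScript (v2).

```javascript
// getItem parameters for a table with a hash key only.
function buildGetUserParams(uid) {
  return { TableName: 'todo-user', Key: { uid: { S: uid } } };
}

// getItem parameters for a table with a hash and range key:
// Key has two entries instead of one.
function buildGetTaskParams(uid, tid) {
  return {
    TableName: 'todo-task',
    Key: { uid: { S: uid }, tid: { N: String(tid) } },
  };
}

// Hypothetical wrapper: hands the item to `done`, or undefined if none exists.
function getUser(db, uid, done) {
  db.getItem(buildGetUserParams(uid), (err, data) => {
    if (err) return done(err);
    done(null, data.Item); // Item is missing from the response when nothing matches
  });
}
```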

10.6.2. Querying by key and filter

If you want to retrieve not a single item but a collection of items, you must query DynamoDB. Retrieving multiple items by primary key only works if your table has a hash and range key. Otherwise, the hash will only identify a single item. The query SDK operation to get a collection of items from DynamoDB can be used like this:

The query operation also lets you specify an optional FilterExpression. The syntax of FilterExpression works like KeyConditionExpression, but no index is used for filters. Filters are applied to all matches that KeyConditionExpression returns.

To list all tasks for a certain user, you must query DynamoDB. The primary key of a task is the combination of the uid hash part and the tid range part. To get all tasks for a user, KeyConditionExpression only requires the equality of the hash part of the primary key. The implementation of nodetodo task-ls <uid> [<category>] [--overdue|--due|--withoutdue|--futuredue] is shown next.
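A minimal sketch of such a query, restricted to the user ID and the optional category (the due-date flags are left out). The helper buildTaskQueryParams is hypothetical, and the attribute name category is an assumption.

```javascript
// Hypothetical helper: builds query parameters for all tasks of one user,
// optionally filtered by category.
function buildTaskQueryParams(uid, category) {
  const params = {
    TableName: 'todo-task',
    // Equality on the hash part of the primary key is all the query needs.
    KeyConditionExpression: 'uid = :uid',
    ExpressionAttributeValues: { ':uid': { S: uid } },
  };
  if (category) {
    // The filter is applied to the items matched by the key condition;
    // no index is used for it.
    params.FilterExpression = 'category = :category';
    params.ExpressionAttributeValues[':category'] = { S: category };
  }
  return params;
}

console.log(JSON.stringify(buildTaskQueryParams('michael', 'work')));
```

With the AWS SDK for JavaScript (v2), the object would be passed to the query method of an AWS.DynamoDB client, and the matching items arrive in the Items array of the response.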

Listing 10.6. nodetodo: retrieving tasks (index.js)

Two problems arise with the query approach:

Depending on the result size from the primary key query, filtering may be slow. Filters work without an index: every item must be inspected. Imagine you have stock prices in DynamoDB, with a primary hash and range key: the hash is AAPL, and the range is a timestamp. You can make a query to retrieve all stock prices of Apple (AAPL) between two timestamps (20100101 and 20150101). But if you only want to return prices on Mondays, you need to filter over all prices to return only 20% of them. That’s wasting a lot of resources!
You can only query the primary key. Returning a list of all tasks that belong to a certain category for all users isn’t possible, because you can’t query the category attribute.

You can solve those problems with secondary indexes. Let’s look at how they work.

10.6.3. Using secondary indexes for more flexible queries

A secondary index is a projection of your original table that’s automatically maintained by DynamoDB. You can query a secondary index like you query the index containing all the primary keys of a table. You can imagine a global secondary index as a read-only DynamoDB table that’s automatically updated by DynamoDB: whenever you change the parent table, all indexes are asynchronously (eventually consistent!) updated as well. Figure 10.2 shows how a secondary index works.

Figure 10.2. A secondary index contains a copy (projection) of your table’s data to provide fast lookup on another key.

A secondary index comes at a price: the index requires storage (the same cost as for the original table). You must provision additional write-capacity units for the index as well because a write to your table will cause a write to the secondary index.

A huge benefit of DynamoDB is that you can provision capacity based on your workload. If one of your table indexes gets tons of read traffic, you can increase the read capacity of that index. You can fine-tune your database performance by provisioning sufficient capacity for your tables and indexes. You’ll learn more about that in section 10.9.

Back to nodetodo. To implement the retrieval of tasks by category, you’ll add a secondary index to the todo-task table. This will allow you to make queries by category. A hash and range key is used: the hash is the category attribute, and the range is the tid attribute. The index also needs a name: category-index. You can find the following CLI command in the README.md file in nodetodo’s code folder:

A global secondary index takes some time to be created. You can use the CLI to find out if the index is active:

$ aws dynamodb describe-table --table-name=todo-task \
  --query "Table.GlobalSecondaryIndexes"

The following listing shows how the implementation of nodetodo task-la <category> [--overdue|...] uses the query operation.
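Querying the index differs from querying the table in one parameter only: IndexName. The following sketch assumes the index name category-index and the category hash key described above; buildCategoryQueryParams is a hypothetical helper, and the filter flags are left out.

```javascript
// Hypothetical helper: builds query parameters that read from the
// global secondary index instead of the table's primary key.
function buildCategoryQueryParams(category) {
  return {
    TableName: 'todo-task',
    IndexName: 'category-index',
    // The key condition now refers to the hash key of the index.
    KeyConditionExpression: 'category = :category',
    ExpressionAttributeValues: { ':category': { S: category } },
  };
}

console.log(JSON.stringify(buildCategoryQueryParams('work')));
```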

Listing 10.7. nodetodo: retrieving tasks from a category index (index.js)

But there are still situations where a query doesn’t work: you can’t retrieve all users. Let’s look at what a table scan can do for you.

10.6.4. Scanning and filtering all of your table’s data

Sometimes you can’t work with keys; instead, you need to go through all the items in the table. That’s not efficient, but in some situations, it’s okay. DynamoDB provides the scan operation to scan all items in a table:

The next listing shows the implementation of nodetodo user-ls [--limit=<limit>] [--next=<id>]. A paging mechanism is used to prevent too many items from being returned.

Listing 10.8. nodetodo: retrieving all users with paging (index.js)

The scan operation reads all items in the table. This example didn’t filter any data, but you can use FilterExpression as well. Note that you shouldn’t use the scan operation too often—it’s flexible but not efficient.
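The paging mechanism can be sketched like this: Limit caps the number of items read per request, and the LastEvaluatedKey of one response becomes the ExclusiveStartKey of the next request. The helper names are hypothetical, and db stands for any client exposing scan(params, callback), such as an AWS.DynamoDB instance from the AWS SDK for JavaScript (v2).

```javascript
// Hypothetical helper: builds scan parameters for one page of users.
// `limit` must be a number; `nextUid` is the uid returned by the previous page.
function buildUserScanParams(limit, nextUid) {
  const params = { TableName: 'todo-user', Limit: limit };
  if (nextUid) params.ExclusiveStartKey = { uid: { S: nextUid } };
  return params;
}

// Reads one page and reports the uid to pass as --next for the following page.
function listUsers(db, limit, nextUid, done) {
  db.scan(buildUserScanParams(limit, nextUid), (err, data) => {
    if (err) return done(err);
    // LastEvaluatedKey is only present when more items remain.
    const next = data.LastEvaluatedKey ? data.LastEvaluatedKey.uid.S : undefined;
    done(null, { users: data.Items, next });
  });
}
```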

10.6.5. Eventually consistent data retrieval

DynamoDB doesn’t support transactions the same way a traditional database does. You can’t modify (create, update, delete) multiple items in a single transaction; the atomic unit in DynamoDB is a single item.

In addition, DynamoDB is eventually consistent. That means it’s possible that if you create an item (version 1), update that item to version 2, and then get that item, you may see the old version 1; if you wait and get the item again, you’ll see version 2. Figure 10.3 shows this process. The reason for this behavior is that the item is persisted on multiple servers in the background. Depending on which server answers your request, the server may not have the latest version of the item.

Figure 10.3. Eventually consistent reads can return old values after a write operation until the change is propagated to all servers.

You can prevent eventually consistent reads by adding "ConsistentRead": true to the DynamoDB request to get strongly consistent reads. Strongly consistent reads are supported by the getItem, query, and scan operations. But a strongly consistent read takes longer and consumes more read capacity than an eventually consistent read. Reads from a global secondary index are always eventually consistent because the index itself is eventually consistent.
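As a small illustration, the flag can be added to any existing getItem, query, or scan parameter object. withConsistentRead is a hypothetical helper that copies the parameters instead of changing them in place.

```javascript
// Hypothetical helper: returns a copy of the request parameters that asks
// for a strongly consistent read.
function withConsistentRead(params) {
  return Object.assign({}, params, { ConsistentRead: true });
}

const eventual = { TableName: 'todo-user', Key: { uid: { S: 'michael' } } };
const strong = withConsistentRead(eventual);
console.log(strong.ConsistentRead);   // true
console.log(eventual.ConsistentRead); // undefined
```

Because a global secondary index is itself eventually consistent, the flag is not an option for requests that set IndexName to such an index.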

10.7. Removing data

Like the getItem operation, the deleteItem operation requires that you specify the primary key you want to delete. Depending on whether your table uses a hash or a hash and range key, you must specify one or two attributes.

You can remove a user with nodetodo by calling nodetodo user-rm <uid>. In Node.js, this is as shown in the following listing.

Listing 10.9. nodetodo: removing a user (index.js)

Removing a task is similar: nodetodo task-rm <uid> <tid>. The only change is that the item is identified by a hash and range key and the table name, as shown in the next listing.
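Both variants can be sketched side by side. As before, the helper names are hypothetical, the attribute types are assumptions, and db stands for any client exposing deleteItem(params, callback), such as an AWS.DynamoDB instance from the AWS SDK for JavaScript (v2).

```javascript
// deleteItem parameters for the user table: a hash key only.
function buildDeleteUserParams(uid) {
  return { TableName: 'todo-user', Key: { uid: { S: uid } } };
}

// deleteItem parameters for the task table: a hash and range key.
function buildDeleteTaskParams(uid, tid) {
  return {
    TableName: 'todo-task',
    Key: { uid: { S: uid }, tid: { N: String(tid) } },
  };
}

// Hypothetical wrapper around the SDK call for removing a task.
function removeTask(db, uid, tid, done) {
  db.deleteItem(buildDeleteTaskParams(uid, tid), (err) => {
    if (err) return done(err);
    done(null, 'task removed with tid ' + tid);
  });
}
```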

Listing 10.10. nodetodo: removing a task (index.js)

You’re now able to create, read, and delete items in DynamoDB. The only operation missing is updating.

10.8. Modifying data

You can update an item with the updateItem operation. You must identify the item you want to update by its key; you can also provide an UpdateExpression to specify the updates you want to perform. You can use one or a combination of the following update actions:

Use SET to override or create a new attribute. Examples: SET attr1 = :attr1val, SET attr1 = attr2 + :attr2val, SET attr1 = :attr1val, attr2 = :attr2val.
Use REMOVE to remove an attribute. Examples: REMOVE attr1, REMOVE attr1, attr2.

In nodetodo, you can mark a task as done by calling nodetodo task-done <uid> <tid>. To implement this feature, you need to update the task item, as shown in Node.js in the following listing.
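A sketch of such an update, using a SET action. The helper buildTaskDoneParams is hypothetical, and both the attribute name completed and the choice to store the completion date as a yyyymmdd number are assumptions made for illustration.

```javascript
// Hypothetical helper: builds updateItem parameters that mark a task as done
// by setting a completion date on the item.
function buildTaskDoneParams(uid, tid, completedOn) {
  return {
    TableName: 'todo-task',
    // The item to update is identified by its full primary key.
    Key: { uid: { S: uid }, tid: { N: String(tid) } },
    // SET creates the attribute if it is missing and overrides it otherwise.
    UpdateExpression: 'SET completed = :completed',
    ExpressionAttributeValues: { ':completed': { N: String(completedOn) } },
  };
}

console.log(JSON.stringify(buildTaskDoneParams('michael', 1432187491647, 20150521)));
```

With the AWS SDK for JavaScript (v2), the object would be passed to the updateItem method of an AWS.DynamoDB client; a REMOVE action on the same attribute would undo the change.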

Listing 10.11. nodetodo: updating a task as done (index.js)

That’s it! You’ve implemented all of nodetodo’s features.

10.9. Scaling capacity

When you create a DynamoDB table or a global secondary index, you must provision throughput. Throughput is divided into read and write capacity. DynamoDB uses ReadCapacityUnits and WriteCapacityUnits to specify the throughput of a table or global secondary index. But how is a capacity unit defined? Let’s start by doing some experimentation with the command-line interface:

More abstract rules for throughput consumption are as follows:

An eventually consistent read takes half the capacity compared to a strongly consistent read.
A strongly consistent getItem requires one read capacity unit if the item isn’t larger than 4 KB. If the item is larger than 4 KB, you need additional read capacity units. You can calculate the required read capacity units using roundUP(itemSize / 4).
A strongly consistent query requires one read capacity unit per 4 KB of item size. This means if your query returns 10 items, and each item is 2 KB, the item size is 20 KB and you need 5 read units. This is in contrast to 10 getItem operations, for which you would need 10 read capacity units.
A write operation needs one write capacity unit per 1 KB of item size. If your item is larger than 1 KB, you can calculate the required write capacity units using roundUP(itemSize).
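The rules above translate into a few lines of arithmetic. The function names readUnits and writeUnits are hypothetical; sizes are given in KB, and the half-price rule for eventually consistent reads is applied exactly as stated in the first rule.

```javascript
// Read capacity units for reading `sizeKb` kilobytes in one operation.
function readUnits(sizeKb, stronglyConsistent = true) {
  const units = Math.max(1, Math.ceil(sizeKb / 4)); // one unit per started 4 KB
  return stronglyConsistent ? units : units / 2;
}

// Write capacity units for writing one item of `sizeKb` kilobytes.
function writeUnits(sizeKb) {
  return Math.max(1, Math.ceil(sizeKb)); // one unit per started 1 KB
}

// A query returning 10 items of 2 KB each reads 20 KB in one operation.
console.log(readUnits(20));        // 5
// Ten separate getItem operations on 2 KB items are rounded up one by one.
console.log(10 * readUnits(2));    // 10
// The same query with an eventually consistent read.
console.log(readUnits(20, false)); // 2.5
// Writing a 1.5 KB item.
console.log(writeUnits(1.5));      // 2
```

The comparison between the first two results shows why a query is cheaper than fetching the same small items one at a time.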

If capacity units aren’t your favorite unit, you can use the AWS Simple Monthly Calculator at http://aws.amazon.com/calculator to calculate your capacity needs by providing details of your read and write workload.

The provisioned throughput of a table or a global secondary index is defined per second. If you provision five read capacity units with ReadCapacityUnits=5, you can make five strongly consistent getItem requests per second for that table, as long as each item isn’t larger than 4 KB. If you make more requests than are provisioned, DynamoDB will first throttle your requests. If you make many more requests than are provisioned, DynamoDB will reject your requests.

It’s important to monitor how many read and write capacity units you require. Fortunately, DynamoDB sends some useful metrics to CloudWatch every minute. To see the metrics, open the AWS Management Console, navigate to the DynamoDB service, and select one of the tables. Figure 10.4 shows the CloudWatch metrics for the todo-user table.

Figure 10.4. Monitoring provisioned and consumed capacity units of the DynamoDB table

You can modify the provisioned throughput whenever you like, but you can only decrease the throughput capacity of a single table four times a day.

Cleaning up

Don’t forget to delete your DynamoDB tables after you finish this section. Use the Management Console to do so.

10.10. Summary

DynamoDB is a NoSQL database service that removes all the operational burdens from you, scales well, and can be used in many ways as the storage back end of your applications.
Looking up data in DynamoDB is based on keys. A hash key can only be looked up if you know the key. But DynamoDB also supports hash and range keys, which combine the power of a hash key with another key that is sorted.
You can retrieve a single item by its key with the getItem operation.
Strongly consistent reads (getItem, query, and scan) can be enforced if needed. Reads from a global secondary index are always eventually consistent.
DynamoDB doesn’t support SQL. Instead, you must use the SDK to communicate with DynamoDB from your application. This also implies that you can’t use an existing application to run with DynamoDB without touching the code.
DynamoDB uses expressions to make more complex interactions with the database possible, such as when you update an item.
Monitoring consumed read and write capacity is important if you want to provision enough capacity for your tables and indices.
DynamoDB charges per gigabyte of storage and per provisioned read or write capacity unit.
You can use the query operation to query primary keys or secondary indexes.
The scan operation is flexible but not efficient and shouldn’t be used too often.
