E.1. Steps to create your AWS GPU instance

  1. Go to http://aws.amazon.com to sign up for an account or sign in to an existing one. Once you are logged in, go to the AWS Management Console (http://console.aws.amazon.com), shown in figure E.1.
    Figure E.1. AWS Management Console

  2. Select EC2 under All Services; you can also find the EC2 service in the Services menu at the top of the page. The EC2 Dashboard provides summary information about existing EC2 instances (see figure E.2).
    Figure E.2. Creating a new AWS instance

  3. In the EC2 Dashboard, click the blue Launch Instance button to start the instance setup wizard, a sequence of screens where you can configure the virtual machine you want to launch.
  4. This screen (figure E.3) shows the server hard drive images or ISOs you can install on your virtual machine. These are called Amazon Machine Images (AMIs) on Amazon.[1] Some AMIs come with deep learning frameworks already installed. That way, you don’t need to install and configure the CUDA and BLAS libraries or Python packages such as TensorFlow, NumPy, and Keras. To find a free preconfigured deep learning AMI, click the Amazon Marketplace or Community AMIs tab on the left side and search for “deep learning.”[2] You must still choose hardware that can make use of the software features a particular AMI provides.

    [1] ISO is short for ISO-9660, an open standard from the International Organization for Standardization for writing disk images so that they can be transported and installed elsewhere, not only on one proprietary cloud service such as AWS.

    [2] At the time of this writing, one such image under the Amazon Marketplace had an AMI ID of ami-f1d51489.

    Figure E.3. Selecting an AWS Machine Image

  5. Some of the neural network code in this book was tested on the Deep Learning AMI (Ubuntu), which is designed to take advantage of any GPU hardware present on your virtual machine. Click the blue Select button next to the AMI you want to use. If you’ve selected an Amazon Marketplace image, you’ll be presented with an estimate of the prices for running the AMI on various EC2 instance types that have a GPU (see figure E.4).
    Figure E.4. Cost overview for the machine image and the available instance types in your AWS region

  6. Many open source AMIs, like the Deep Learning Ubuntu AMI, are free, so the Software cost column on the More Info page for Amazon Marketplace shows $0. Other AMIs under the AWS Marketplace tab, such as the RocketML AMI, may have software costs associated with them. Regardless of the software cost, you’ll need to pay for server instance power-on time if it exceeds your “free tier” allowance. GPU instances aren’t covered under the free tier, so make sure your pipeline has been fully tested on a low-cost CPU machine before running it on a more expensive instance. Click the blue Continue button if you’re viewing this price list (see figure E.4). If you’ve returned to the AMI lists on Amazon Marketplace, you can click the blue Select button next to the AMI you would like to install on your EC2 instance, which will take you to “Step 2: Choose an Instance Type” (see figure E.5).
  7. In this step, you select the server type for your virtual machine (see figure E.5). The smallest GPU instance—g2.2xlarge—is a good value. Amazon’s dark pattern UI will preselect a much more expensive type, so you’ll have to manually select the g2.2xlarge instance if that’s the one you want. Also, you’ll find that virtual machines are much cheaper if you’ve selected US West 2 (Oregon) as your region rather than other US regions. You can find this selection in the menu at the upper-right corner of the page near your account name.
    Figure E.5. Choosing your instance type

  8. Once you’ve selected the instance type you’d like to use, you can launch your machine by clicking the blue Review and Launch button. But for your first instance, you should work your way through all the setup wizard steps so you can see what your options are, even if you decide to accept the defaults on each of these screens. To proceed to the next step, click the gray Next: Configure Instance Details button.
  9. Here you can configure the instance details (see figure E.6). If you are already using AWS machines on an existing virtual private cloud (VPC), you can assign your GPU machine to your existing VPC. Machines on the same VPC can use the same gateway or bastion servers on that VPC to access your machine. But if this is your first EC2 instance or you don’t have a “bastion server,”[3] you don’t need to worry about this.

    [3] Amazon has a tutorial on best practices for a bastion host (https://docs.aws.amazon.com/quickstart/latest/linux-bastion/architecture.html).

    Figure E.6. Adding storage to your instance

  10. Selecting “Protect against accidental termination” makes it harder for you to accidentally terminate your machine. On Amazon Web Services, “terminate” means to power off a machine and wipe its storage. “Stop” means to power down or suspend the machine while retaining any training checkpoints you may have saved to persistent storage on that machine.
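The same stop/terminate distinction applies if you manage instances from the command line. The sketch below assumes you have installed and configured the AWS CLI, which this appendix doesn't cover. The function names and the AWS_CMD variable are made up for illustration, and the instance ID is a placeholder. Setting AWS_CMD=echo prints each command instead of running it, so you can try the example without touching your account.

```shell
# AWS_CMD defaults to the real AWS CLI; set AWS_CMD=echo for a dry run.
AWS_CMD="${AWS_CMD:-aws}"

# Stop: power down the instance but keep its persistent storage.
stop_instance() {
    "$AWS_CMD" ec2 stop-instances --instance-ids "$1"
}

# Terminate: power off the instance and wipe its storage. Not reversible.
terminate_instance() {
    "$AWS_CMD" ec2 terminate-instances --instance-ids "$1"
}

# Dry run with a placeholder instance ID; nothing is sent to AWS.
AWS_CMD=echo
stop_instance i-0123456789abcdef0
```

With AWS_CMD=echo, the last line only prints the command that would be run. If you enabled termination protection in this step, a terminate request should be rejected until you turn the protection off again.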
  11. To continue, click the Next: Add Storage button.
  12. In this step (figure E.7), you can add storage if you plan to work with large corpora. But you may be better off proceeding with a minimal amount of “local” storage on your EC2 instance and waiting to mount an Amazon “S3 Bucket” or other cloud storage service after your EC2 instance is up and running. This will allow you to share large datasets across multiple servers or training runs (between instance terminations). Amazon Web Services will charge you for any “local” EC2 storage above the 30 GB free tier allowance. The AWS UX has a lot of dark patterns that make it hard to avoid racking up charges.
    Figure E.7. Adding persistent storage to your instance

  13. Click the Next buttons to proceed through the next steps and review the default tags and security groups assigned to your EC2 instance. The final Next button sends you to the review step (see figure E.8).
    Figure E.8. Reviewing your instance setup before launching

  14. On the review screen (see figure E.8), Amazon Web Services shows you the details of your instance in one overview.
  15. Before clicking the Launch button, confirm that the instance details are what you want, particularly the type (RAM and CPU), the AMI (Deep Learning Ubuntu), and the storage (enough GB for your data). Once you click Launch, AWS will power up your virtual machine and start loading your software image onto it.
  16. If you haven’t previously created an instance with AWS, you’ll be asked to create a new key pair (see figure E.9). The key pair allows you to ssh into the machine without a password. By default, EC2 instances don’t allow password login, so you’ll need to save the .pem file in your $HOME/.ssh/ folder and keep a copy of it in a safe place (such as your password manager). Without it, you won’t be able to access your running server and will have to start over.
    Figure E.9. Creating a new instance key (or downloading an existing one)

  17. After saving your key pair (if you created a new key pair), AWS confirms that the instance is launched. On rare occasions, the Amazon data center may not have the resources you requested and you’ll receive an error, requiring you to start over.
  18. Click the instance ID that starts with i-... (see figure E.10). The link sends you to the overview of all your EC2 instances, where you’ll see your instance with its state indicated as “running” or “initializing.”
    Figure E.10. AWS launch confirmation

  19. You’ll want to record the public IP address for your instance (see figure E.11). A good place to store it is in your password manager, alongside the .pem file for the key pair you generated earlier. You’ll also want to add it to your $HOME/.ssh/config file so that you can refer to your instance by a host name instead of looking up the IP address each time.
    Figure E.11. EC2 Dashboard showing the newly created instance

    A typical config file looks something like the following listing. Change the HostName value to the public IP address (from the EC2 Dashboard) or the fully qualified domain name (from your “Route 53” Dashboard on AWS) of the EC2 instance you just launched.
    Listing E.1. $HOME/.ssh/config
    Host totalgood
        User ubuntu
        HostName INSTANCE_PUBLIC_IP                                     1
        Port 22
        IdentityFile ~/.ssh/nlp-in-action.pem                           2
        # ssh -i ~/.ssh/nlp-in-action.pem ubuntu@INSTANCE_PUBLIC_IP     3

    • 1 Replace INSTANCE_PUBLIC_IP with your public IP address.
    • 2 The path to the .pem file you downloaded goes here.
    • 3 You can leave notes as comments in your config file.
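The entry in listing E.1 can also be generated from the shell, which may be handy if you relaunch instances and the public IP address changes. This is a minimal sketch rather than part of the original setup: the function name ssh_host_entry is hypothetical, and 203.0.113.10 is a reserved documentation address standing in for your instance's public IP.

```shell
# Print an ssh config entry in the same shape as listing E.1.
# Arguments: host alias, public IP (or domain name), path to the .pem file.
ssh_host_entry() {
    printf 'Host %s\n' "$1"
    printf '    User ubuntu\n'
    printf '    HostName %s\n' "$2"
    printf '    Port 22\n'
    printf '    IdentityFile %s\n' "$3"
}

# Preview the entry before adding it to your config file.
ssh_host_entry totalgood 203.0.113.10 '~/.ssh/nlp-in-action.pem'
```

To save the entry, redirect the output with >> $HOME/.ssh/config. Once the entry is in place, ssh totalgood should connect with the user, address, and key file named in it.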
  20. Before logging into the AWS instance, ssh requires that the private key file (.pem file in your $HOME/.ssh directory) can be read only by you and the root superuser on your system. You can set the appropriate permissions by executing the following bash commands:[4]

    [4] A bash shell, such as Cygwin or Git Bash, must be installed for these bash and ssh commands to work on a Windows system.

    $ chown -R $USER:users $HOME/.ssh
    $ chmod 700 $HOME/.ssh                       1
    $ chmod 600 $HOME/.ssh/nlp-in-action.pem     2
    $ chmod -R 600 $HOME/.ssh/*                  3

    • 1 This ensures that only you can list, create, and delete files in the $HOME/.ssh directory.
    • 2 This ensures that only you can write and read the .pem file you downloaded.
    • 3 This ensures that only you can read and write any of the key files in your $HOME/.ssh directory, like the default id_rsa and id_rsa.pub files that may have been generated when your account was created.
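To confirm that the permissions took effect, you can inspect the mode string that ls prints for the key file. This is a sketch that assumes the key is named nlp-in-action.pem, as in the commands above. The helper name key_is_private is made up for illustration.

```shell
# Succeed only if the file exists and its mode is exactly rw------- (600).
key_is_private() {
    # Keep the first 10 characters of the ls output, e.g. "-rw-------".
    mode="$(ls -ld "$1" 2>/dev/null | cut -c1-10)"
    [ "$mode" = "-rw-------" ]
}

if key_is_private "$HOME/.ssh/nlp-in-action.pem"; then
    echo "key permissions OK"
else
    echo "key missing or too permissive; rerun chmod 600 on it"
fi
```

ssh refuses to use a private key that other users can read, so a quick check like this can save a confusing login failure in the next step.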
  21. After you’ve set the appropriate file permissions and set up your config file, execute the following bash command to attempt to log into your EC2 instance:
    $ ssh -i ~/.ssh/nlp-in-action.pem ubuntu@INSTANCE_PUBLIC_IP
  22. If the Amazon Machine Image is Ubuntu-based, the user name is usually ubuntu. But each AMI will have documentation on the user name and ssh port number required to log into it.
  23. The first time you log in, you’re warned that the fingerprint of the machine is unknown (see figure E.12). Confirm with yes to go ahead with the login process.[5]

    [5] If you see this warning again later and you haven’t changed the instance’s IP address, someone may be attempting to spoof the IP address or domain name of your machine and hack into your instance with a man-in-the-middle attack. This is extremely rare.

    Figure E.12. Confirmation request to exchange ssh credentials

  24. After a successful login, you see a welcome screen (see figure E.13).
    Figure E.13. Welcome screen after a successful login

  25. As the final step, you need to activate your preferred development environment. The machine image provides various environments, including PyTorch, TensorFlow, and CNTK. Because we use TensorFlow and Keras in this book, you should activate the tensorflow_p36 environment. This loads a virtual environment with Python 3.6, Keras, and TensorFlow installed (see figure E.14):
    $ source activate tensorflow_p36
    Figure E.14. Activating your pre-installed Keras environment

    Now that you’ve activated your TensorFlow environment, head over to an IPython shell with
    $ ipython
    Now you’re ready to train your deep learning NLP models. Have fun!
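Before starting a long training run, it can be worth checking that your shell is picking up the tools you expect. The following is a small sketch, not something taken from the AMI's documentation. The helper name report_tools is hypothetical, and it assumes the AMI includes NVIDIA's nvidia-smi utility.

```shell
# Report where each tool resolves on the PATH of the current shell.
report_tools() {
    for tool in python ipython nvidia-smi; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: $(command -v "$tool")"
        else
            echo "$tool: not found"
        fi
    done
}

report_tools
```

After you run source activate tensorflow_p36, the python and ipython paths should point inside the activated environment rather than at the system Python. A "not found" for nvidia-smi may indicate that you launched an instance type without a GPU.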

E.1.1. Cost control

Running a GPU instance on a cloud service like AWS can quickly get expensive. The smallest GPU instance in the US-West 2 region costs $0.65 per hour at the time of this writing. Training a simple sequence-to-sequence model can take a few hours, and then you might want to iterate on your model parameters. These iterations can quickly add up to a sizable monthly bill. You can minimize surprises with a few precautions (see figures E.15 and E.16):

  • Turn off idle GPU machines. When you stop (not terminate) your machine, the last state of the storage (except your /tmp folder) will be preserved and you can return to it. In-memory data will be lost, so make sure to save all your model checkpoints before stopping the machine.
    Figure E.15. AWS Billing Dashboard

    Figure E.16. AWS Budget Console

  • Check your EC2 instance summary page for running instances.
  • Review your AWS bill summary regularly to catch charges from instances you forgot to stop.
  • Create an AWS Budget with spending alarms. Once you’ve configured a budget, AWS will alert you when you are exceeding it.
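To get a feel for how quickly the hourly rate adds up, you can run a rough estimate before launching anything. The sketch below uses the $0.65 per hour figure quoted above. The run length (3 hours), the number of iterations (20), and the function name estimate_cost are illustrative assumptions, not measurements.

```shell
# Estimated cost in dollars: hourly rate x hours per run x number of runs.
estimate_cost() {
    awk -v rate="$1" -v hours="$2" -v runs="$3" \
        'BEGIN { printf "%.2f\n", rate * hours * runs }'
}

# Twenty 3-hour training runs at $0.65 per hour.
estimate_cost 0.65 3 20     # prints 39.00

# One instance left running 24 hours a day for 30 days.
estimate_cost 0.65 24 30    # prints 468.00
```

The second figure is the one to watch: an idle instance that you forget to stop costs far more over a month than the training runs themselves.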