Chapter 2
IN THIS CHAPTER
Obtaining and using Python
Downloading and installing the datasets and example code
Running an application
Writing Python code
As mentioned in Chapter 1, Python is a flexible language that supports multiple coding styles, including an implementation of the functional programming paradigm. However, Python’s implementation is impure because it does support the other coding styles. Consequently, you choose between flexibility and the features that functional programming can provide when you choose Python. Many developers choose flexibility (and therefore Python), but there is no right or wrong choice — just the choice that works best for you. This chapter helps you set up, configure, and become familiar with Python so that you can use it in the book chapters that follow.
You could download and install Python 3.6.4 to work with the examples in this book. Doing so would still allow you to gain an understanding of how functional programming works in the Python environment. However, using the pure Python installation will also increase the amount of work you must perform to have a good coding experience and even potentially reduce the amount you learn because your focus will be on making the environment work, rather than seeing how Python implements the functional programming paradigm. Consequently, this book relies on the Jupyter Notebook Integrated Development Environment (IDE) (or user interface or editor, as you might prefer) of the Anaconda tool collection to perform tasks for the reasons described in the following sections.
A good IDE contains a certain amount of intelligence. For example, the IDE can suggest alternatives when you type the incorrect keyword, or it can tell you that a certain line of code simply won’t work as written. The more intelligence that an IDE contains, the less hard you have to work to write better code. Writing better code is essential because no one wants to spend hours looking for errors, called bugs.
Finding bugs (errors) in your code involves a process called debugging. Even the most expert developer in the world spends time debugging. Writing perfect code on the first pass is nearly impossible. When you do, it’s cause for celebration because it won’t happen often. Consequently, the debugging capabilities of your IDE are critical. Unfortunately, the debugging capabilities of the native Python tools are almost nonexistent. If you spend any time at all debugging, you quickly find the native tools annoying because of what they don’t tell you about your code.
Most IDEs look like fancy text editors, and that’s precisely what they are. Yes, you get all sorts of intelligent features, hints, tips, code coloring, and so on, but at the end of the day, they’re all text editors. Nothing is wrong with text editors, and this chapter isn’t telling you anything of the sort. However, given that Python developers often focus on scientific applications that require something better than pure text presentation, using notebooks instead can be helpful.
This book uses the Anaconda tool collection because it provides you with a great Python coding experience, but also because it helps you discover the enormous potential of literate programming techniques. If you spend a lot of time performing scientific tasks, Anaconda and products like it are essential. In addition, Anaconda is free, so you get the benefits of the literate programming style without the cost of other packages.
As mentioned in the previous section, Anaconda doesn’t come with your Python installation. With this in mind, the following sections help you obtain and install Anaconda on the three major platforms supported by this book.
The basic Anaconda package comes as a free download that you obtain at https://www.anaconda.com/download/. Simply click the symbol for your operating system, such as the window icon for Windows, and then click Download in the platform’s section of the page to obtain access to the free product. (Depending on the Anaconda server load, the download can require a while to complete, so you may want to get a cup of coffee while waiting.) Anaconda supports the following platforms:
The free product is all you need for this book. However, when you look on the site, you see that many other add-on products are available. These products can help you create robust applications. For example, when you add Accelerate to the mix, you obtain the capability to perform multicore and GPU-enabled operations. The use of these add-on products is outside the scope of this book, but the Anaconda site gives you details on using them.
You have to use the command line to install Anaconda on Linux; you’re given no graphical installation option. Before you can perform the installation, you must download a copy of the Linux software from the Continuum Analytics site. You can find the required download information in the “Obtaining Analytics Anaconda” section, earlier in this chapter. The following procedure should work fine on any Linux system, whether you use the 32-bit or 64-bit version of Anaconda:
Open a copy of Terminal.
The Terminal window appears.
Change directories to the downloaded copy of Anaconda on your system.
The name of this file varies, but normally it appears as Anaconda3-5.1.0-Linux-x86.sh for 32-bit systems and Anaconda3-5.1.0-Linux-x86_64.sh for 64-bit systems. The version number is embedded as part of the filename. In this case, the filename refers to version 5.1.0, which is the version used for this book. If you use some other version, you may experience problems with the source code and need to make adjustments when working with it.
Type bash Anaconda3-5.1.0-Linux-x86.sh (for the 32-bit version) or bash Anaconda3-5.1.0-Linux-x86_64.sh (for the 64-bit version) and press Enter.
An installation wizard starts that asks you to accept the licensing terms for using Anaconda.
Read the licensing agreement and accept the terms using the method required for your version of Linux.
The wizard asks you to provide an installation location for Anaconda. The book assumes that you use the default location of ~/anaconda3. If you choose some other location, you may have to modify some procedures later in the book to work with your setup.
Provide an installation location (if necessary) and press Enter (or click Next).
The application extraction process begins. After the extraction is complete, you see a completion message.
Add the installation path to your PATH statement using the method required for your version of Linux.
You're ready to begin using Anaconda.
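Once PATH is updated, a quick check from Python can confirm that the system can find the Anaconda tools. The following is a minimal sketch and not part of the book's downloadable source. The helper name find_tool is hypothetical, and the sketch assumes that the installer placed a conda executable in a folder that now appears on PATH.

```python
import shutil


def find_tool(name):
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)


# "conda" is the package manager that ships with Anaconda.
location = find_tool("conda")
if location is None:
    print("conda was not found on PATH; recheck the PATH setting.")
else:
    print("conda found at:", location)
```

If the check reports that conda was not found, open a new Terminal window before trying again, because PATH changes usually apply only to sessions started after the change.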
The Mac OS X installation comes in only one form: 64-bit. Before you can perform the install, you must download a copy of the Mac software from the Continuum Analytics site. You can find the required download information in the “Obtaining Analytics Anaconda” section, earlier in this chapter.
The installation files come in two forms. The first depends on a graphical installer; the second relies on the command line. The command-line version works much like the Linux version described in the preceding section of this chapter, “Installing Anaconda on Linux.” The following steps help you install Anaconda 64-bit on a Mac system using the graphical installer:
Locate the downloaded copy of Anaconda on your system.
The name of this file varies, but normally it appears as Anaconda3-5.1.0-MacOSX-x86_64.pkg. The version number is embedded as part of the filename. In this case, the filename refers to version 5.1.0, which is the version used for this book. If you use some other version, you may experience problems with the source code and need to make adjustments when working with it.
Double-click the installation file.
An introduction dialog box appears.
Click Continue.
The wizard asks whether you want to review the Read Me materials. You can read these materials later. For now, you can safely skip the information.
Click Continue.
The wizard displays a licensing agreement. Be sure to read through the licensing agreement so that you know the terms of usage.
Click I Agree if you agree to the licensing agreement.
You see a Standard Install dialog box where you can choose to perform a standard installation, change the installation location, or customize your setup. The standard installation is the one you should use for this book. Making changes could cause some steps within the book to fail unless you know how to modify the instructions to suit your setup.
Click Install.
The installation begins. A progress bar tells you how the installation process is progressing. When the installation is complete, you see a completion dialog box.
Click Continue.
You’re ready to begin using Anaconda.
Anaconda comes with a graphical installation application for Windows, so getting a good installation means using a wizard, as you would for any other installation. Of course, you need a copy of the installation file before you begin, and you can find the required download information in the “Obtaining Analytics Anaconda” section, earlier in this chapter. The following procedure (which can require a while to complete) should work fine on any Windows system, whether you use the 32-bit or 64-bit version of Anaconda:
Locate the downloaded copy of Anaconda on your system.
The name of this file varies, but normally it appears as Anaconda3-5.1.0-Windows-x86.exe for 32-bit systems and Anaconda3-5.1.0-Windows-x86_64.exe for 64-bit systems. The version number is embedded as part of the filename. In this case, the filename refers to version 5.1.0, which is the version used for this book. If you use some other version, you may experience problems with the source code and need to make adjustments when working with it.
Double-click the installation file.
(You may see an Open File – Security Warning dialog box that asks whether you want to run this file. Click Run if you see this dialog box pop up.) You see an Anaconda3 5.1.0 Setup dialog box.
Click Next.
The wizard displays a licensing agreement. Be sure to read through the licensing agreement so that you know the terms of usage.
Click I Agree if you agree to the licensing agreement.
You're asked what sort of installation type to perform (personal or for everyone). In most cases, you want to install the product just for yourself. The exception is if you have multiple people using your system and they all need access to Anaconda.
Choose one of the installation types and then click Next.
The wizard asks where to install Anaconda on disk, as shown in Figure 2-1. The book assumes that you use the default location. If you choose some other location, you may have to modify some procedures later in the book to work with your setup.
Choose an installation location (if necessary) and then click Next.
You see the Advanced Installation Options, shown in Figure 2-2. These options are selected by default, and no good reason exists to change them in most cases. You might need to change them if Anaconda won’t provide your default Python 3.6.4 setup. However, the book assumes that you’ve set up Anaconda using the default options.
Change the advanced installation options (if necessary) and then click Install.
You see an Installing dialog box with a progress bar. The installation process can take a few minutes, so get yourself a cup of coffee and read the comics for a while. When the installation process is over, you see a Next button enabled.
Click Next.
The wizard presents you with an option to install Microsoft VSCode. Installing this feature can cause problems with the book examples, so the best idea is not to install it. The book doesn’t make use of this feature.
Click Skip.
The wizard tells you that the installation is complete. You see options for learning more about Anaconda Cloud and getting started with Anaconda.
Choose the desired learning options and then click Finish.
You’re ready to begin using Anaconda.
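The installer filenames in the preceding procedures all follow the same pattern: product, version, platform, and architecture. As a rough sketch, you can pull those parts out of a filename to confirm that a download matches the 5.1.0 version used for this book. The pattern and the helper name parse_installer_name are assumptions made for illustration; they cover only the filename forms that appear in this chapter.

```python
import re

# Pattern for names such as Anaconda3-5.1.0-Linux-x86_64.sh
INSTALLER_PATTERN = re.compile(
    r"Anaconda3-(?P<version>\d+\.\d+\.\d+)"
    r"-(?P<platform>[A-Za-z]+)"
    r"-(?P<arch>x86(?:_64)?)"
    r"\.(?P<ext>sh|pkg|exe)"
)


def parse_installer_name(filename):
    """Return a dict of the filename's parts, or None if it doesn't match."""
    match = INSTALLER_PATTERN.fullmatch(filename)
    return match.groupdict() if match else None


for name in ("Anaconda3-5.1.0-Linux-x86_64.sh",
             "Anaconda3-5.1.0-MacOSX-x86_64.pkg",
             "Anaconda3-5.1.0-Windows-x86.exe"):
    print(parse_installer_name(name))
```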
The Anaconda package contains a number of applications, only one of which you use with this book. Here is a quick rundown on the tools you receive:
pip, and doing other command line-related tasks.
This book is about using Python to perform functional programming tasks. Of course, you can spend all your time creating the example code from scratch, debugging it, and only then discovering how it relates to learning about the wonders of Python, or you can take the easy way and download the prewritten code from the Dummies site as described in the book’s Introduction so that you can get right to work.
The following sections show how to work with Jupyter Notebook, one of the tools found in the Anaconda package. These sections emphasize the capability to manage application code, including importing the downloadable source and exporting your amazing applications to show friends.
To make working with the code in this book easier, you use Jupyter Notebook. This IDE lets you easily create Python notebook files that can contain any number of examples, each of which can run individually. The program runs in your browser, so which platform you use for development doesn’t matter; as long as it has a browser, you should be okay.
Most platforms provide an icon to access Jupyter Notebook. Just click this icon to start it. For example, on a Windows system, you choose Start ⇒ All Programs ⇒ Anaconda3 ⇒ Jupyter Notebook. Figure 2-3 shows how the interface looks when viewed in a Firefox browser. The precise appearance on your system depends on the browser you use and the kind of platform you have installed.
No matter how you start Jupyter Notebook (or just Notebook, as it appears in the remainder of the book), the system generally opens a command prompt or terminal window to host Jupyter Notebook. This window contains a server that makes the application work. After you close the browser window when a session is complete, select the server window and press Ctrl+C or Ctrl+Break to stop the server.
The code you create and use in this book will reside in a repository on your hard drive. Think of a repository as a kind of filing cabinet where you put your code. Notebook opens a drawer, takes out the folder, and shows the code to you. You can modify it, run individual examples within the folder, add new examples, and simply interact with your code in a natural manner. The following sections get you started with Notebook so that you can see how this whole repository concept works.
It pays to organize your files so that you can access them more easily later. This book keeps its files in the FPD (Functional Programming For Dummies) folder. Use these steps within Notebook to create a new folder:
Choose New ⇒ Folder.
Notebook creates a new folder named Untitled Folder. The folder appears in alphanumeric order, so you may not initially see it. You must scroll down to the correct location.
Click Rename at the top of the page.
You see a Rename Directory dialog box like the one shown in Figure 2-4.
Type FPD and click Rename.
Notebook changes the name of the folder for you.
Click the new FPD entry in the list.
Notebook changes the location to the FPD folder in which you perform tasks related to the exercises in this book.
Every new notebook is like a file folder. You can place individual examples within the file folder, just as you would sheets of paper into a physical file folder. Each example appears in a cell. You can put other sorts of things in the file folder, too, but you see how these things work as the book progresses. Use these steps to create a new notebook:
Click New ⇒ Python 3.
A new tab opens in the browser with the new notebook, as shown in Figure 2-5. Notice that the notebook contains a cell and that Notebook has highlighted the cell so that you can begin typing code in it. The title of the notebook is Untitled right now. That’s not a particularly helpful title, so you need to change it.
Click Untitled on the page.
Notebook asks what you want to use as a new name, as shown in Figure 2-6.
Type FPD_02_Sample and press Enter.
The new name tells you that this is a file for Functional Programming For Dummies, Chapter 2, Sample.ipynb. Using this naming convention lets you easily differentiate these files from other files in your repository.
Of course, the Sample notebook doesn't contain anything just yet. Place the cursor in the cell, type print('Python is really cool!'), and then click the Run button. You see the output shown in Figure 2-7. The output is part of the same cell as the code (the code resides in a square box and the output resides outside that square box, but both are within the cell). However, Notebook visually separates the output from the code so that you can tell them apart. Notebook creates a new cell for you.
When you finish working with a notebook, shutting it down is important. To close a notebook, choose File ⇒ Close and Halt. You return to Notebook’s Home page, where you can see that the notebook you just created is added to the list.
Creating notebooks and keeping them all to yourself isn’t much fun. At some point, you want to share them with other people. To perform this task, you must export your notebook from the repository to a file. You can then send the file to someone else, who will import it into a different repository.
The previous section shows how to create a notebook named FPD_02_Sample.ipynb in Notebook. You can open this notebook by clicking its entry in the repository list. The file reopens so that you can see your code again. To export this code, choose File ⇒ Download As ⇒ Notebook (.ipynb). What you see next depends on your browser, but you generally see some sort of dialog box for saving the notebook as a file. Use the same method for saving the Notebook file as you use for any other file you save by using your browser. Remember to choose File ⇒ Close and Halt when you finish so that the application shuts down.
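An exported .ipynb file is a JSON document, so ordinary Python tools can read it. The sketch below builds a tiny in-memory stand-in for a notebook rather than reading your actual download. The dictionary shows only a few of the fields that a real file contains, and the helper name code_cells is hypothetical.

```python
import json

# A pared-down stand-in for the contents of an .ipynb file.
notebook_text = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {},
         "source": ["# This is a level 1 heading"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["print('Python is really cool!')"]},
    ],
})


def code_cells(text):
    """Return the source of every code cell in notebook JSON text."""
    notebook = json.loads(text)
    return ["".join(cell["source"])
            for cell in notebook["cells"]
            if cell["cell_type"] == "code"]


print(code_cells(notebook_text))
```

To try this on a real export, you would read the file's text with open() and pass it to the same function.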
Sometimes notebooks get outdated or you simply don’t need to work with them any longer. Rather than allow your repository to get clogged with files that you don’t need, you can remove these unwanted notebooks from the list. Use these steps to remove the file:
Select the check box next to the FPD_02_Sample.ipynb entry.
Click the trash can icon (Delete) at the top of the page.
You see a Delete notebook warning message like the one shown in Figure 2-8.
Click Delete.
The file gets removed from the list.
To use the source code from this book, you must import the downloaded files into your repository. The source code comes in an archive file that you extract to a location on your hard drive. The archive contains a list of .ipynb (IPython Notebook) files containing the source code for this book (see the Introduction for details on downloading the source code). The following steps tell how to import these files into your repository:
Click Upload at the top of the page.
What you see depends on your browser. In most cases, you see some type of File Upload dialog box that provides access to the files on your hard drive.
Highlight one or more files to import and click the Open (or other, similar) button to begin the upload process.
You see the file added to an upload list, as shown in Figure 2-9. The file isn't part of the repository yet — you’ve simply selected it for upload.
Click Upload.
Notebook places the file in the repository so that you can begin using it.
This book uses a number of datasets, all of which appear in the Scikit-learn library. These datasets demonstrate various ways in which you can interact with data, and you use them in the examples to perform a variety of tasks. The following list provides a quick overview of the function used to import each of the datasets into your Python code:
load_boston(): Regression analysis with the Boston house-prices dataset
fetch_olivetti_faces(): Olivetti faces dataset from AT&T
make_blobs(): Generates isotropic Gaussian blobs used for clustering
The technique for loading each of these datasets is the same across examples. The following example shows how to load the Boston house-prices dataset. You can find the code in the FPD_02_Dataset_Load.ipynb notebook.
from sklearn.datasets import load_boston
Boston = load_boston()
print(Boston.data.shape)
To see how the code works, click Run. The output from the print call is (506, 13). You can see the output shown in Figure 2-10.
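The same import, call, and inspect pattern applies to the other dataset functions. As a hedged sketch (not the contents of FPD_02_Dataset_Load.ipynb), the following uses make_blobs(), which generates its data locally. The parameter values and the helper name blob_shapes are arbitrary choices for illustration, and the import guard exists only so that the code still runs when Scikit-learn is absent. Note also that load_boston() was removed in Scikit-learn 1.2, so the Boston example needs an older library version, such as the one that ships with the Anaconda release used in this book.

```python
try:
    from sklearn.datasets import make_blobs
except ImportError:  # Scikit-learn isn't installed in this environment.
    make_blobs = None


def blob_shapes():
    """Generate a small clustering dataset and return its array shapes."""
    if make_blobs is None:
        return None
    # 100 samples, 2 features each, grouped around 3 centers.
    X, y = make_blobs(n_samples=100, n_features=2, centers=3,
                      random_state=0)
    return X.shape, y.shape


print(blob_shapes())
```

Here X holds the generated points and y holds the cluster label assigned to each point, so the first dimension of both arrays equals n_samples.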
Actually, you've already created your first Anaconda application by using the steps in the “Creating a new notebook” section, earlier in this chapter. The print() function may not seem like much, but you use it quite often. However, the literate programming approach provided by Anaconda requires a little more knowledge than you currently have. The following sections don’t tell you everything about this approach, but they do help you gain an understanding of what literate programming can provide in the way of functionality. However, before you begin, make sure you have the FPD_02_Sample.ipynb file open for use because you need it to explore Notebook.
If Notebook were a standard IDE, you wouldn't have cells. What you’d have is a document containing a single, contiguous series of statements. To separate various coding elements, you need separate files. Cells are different because each cell is separate. Yes, the results of things you do in previous cells matter, but if a cell is meant to work alone, you can simply go to that cell and run it. To see how this works for yourself, type the following code into the next cell of the FPD_02_Sample file:
myVar = 3 + 4
print(myVar)
Now click Run (the right-pointing arrow). The code executes, and you see the output, as shown in Figure 2-11. The output is 7, as expected. However, notice the In [1]: entry. This entry tells you that this is the first cell executed during this session. If you want to start a new session (and therefore restart the numbers at 1), you choose Kernel ⇒ Restart (or one of the other restart options).
Note that the first cell also has an In [1]: entry. This entry is still from the previous session. Place your cursor in that cell and click Run. Now the cell contains In [2]:, as shown in Figure 2-12. However, note that the next cell hasn't been selected and still contains the In [1]: entry.
Now place the cursor in the third cell (the one that is currently blank) and type print("This is myVar: ", myVar). Click Run. The output in Figure 2-13 shows that the cells have executed in anything but a rigid order, but that myVar is global to the notebook. What you do in other cells with data affects every other cell, no matter in what order the execution takes place.
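One way to picture this behavior: every cell runs against the same set of names, in whatever order you click Run. The sketch below imitates that with plain Python by executing strings in one shared dictionary. It is an analogy under simplifying assumptions, not how Notebook is implemented, and the run_cell helper is hypothetical.

```python
shared_names = {}      # Stands in for the notebook's global namespace.
execution_count = 0    # Stands in for the In [n]: counter.


def run_cell(source):
    """Execute one 'cell' in the shared namespace and return its number."""
    global execution_count
    execution_count += 1
    exec(source, shared_names)
    return execution_count


cell_a = "myVar = 3 + 4"
cell_b = "message = 'This is myVar: ' + str(myVar)"

print("In [%d]" % run_cell(cell_a))
print("In [%d]" % run_cell(cell_b))   # Sees myVar from the earlier cell.
print("In [%d]" % run_cell(cell_a))   # Rerunning bumps the counter again.
print(shared_names["message"])
```

Running cell_b before cell_a in a fresh namespace would raise a NameError, which is the same thing that happens in Notebook when you run a cell that depends on a name that no earlier run has defined.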
Cells come in a number of different forms. This book doesn't use them all. However, knowing how to use the documentation cells can come in handy. Select the first cell (the one currently marked with a 2). Choose Insert ⇒ Insert Cell Above. You see a new cell added to the notebook. Note the drop-down list that currently shows the word Code. This list allows you to choose the kind of cell to create. Select Markdown from the list and type # This is a level 1 heading. Click Run (which may seem like an extremely odd thing to do, but give it a try). You see the text change into a heading, as shown in Figure 2-14. However, notice also that the cell lacks the In [x] entry beside it, as the code cells have.
About now, you might be thinking that these special cells act just like HTML pages, and you’d be right. Choose Insert ⇒ Insert Cell Below, select Markdown in the drop-down list, and then type ## This is a level 2 heading. Click Run. As you can see, the number of hashes (#) you add to the text affects the heading level, but the hashes don’t show up in the actual heading.
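The relationship between hashes and heading levels is simple enough to express in a few lines of Python. The following sketch uses a simplified rule (one to six hashes followed by a space) and a hypothetical helper named heading_level. It illustrates the idea only and is not the parser that Notebook uses.

```python
def heading_level(line):
    """Return the heading level of a Markdown line, or 0 if not a heading."""
    stripped = line.lstrip()
    hashes = len(stripped) - len(stripped.lstrip("#"))
    # A heading needs one to six hashes followed by a space.
    if 1 <= hashes <= 6 and stripped[hashes:hashes + 1] == " ":
        return hashes
    return 0


for text in ("# This is a level 1 heading",
             "## This is a level 2 heading",
             "print(myVar)"):
    print(heading_level(text), text)
```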
This chapter (and book) doesn’t demonstrate all the kinds of cell content that you can see by using Notebook. However, you can add things like graphics to your notebooks, too. When the time comes, you can output (print) your notebook as a report and use it in presentations of all sorts. The literate programming technique is different from what you may have used in the past, but it has definite advantages, as you see in upcoming chapters.
The code you create using Notebook is still code and not some mystical unique file that only Notebook can understand. When working with any file, such as FPD_02_Sample, you can choose File ⇒ Download As ⇒ Python (.py) to output the Notebook as a Python file. Try it and you end up with FPD_02_Sample.py.
To see the code run as it would using Python directly, open an Anaconda Prompt, which, on a Windows machine, you do by choosing Start ⇒ All Programs ⇒ Anaconda3 ⇒ Anaconda Prompt. The Anaconda Prompt has special features that make accessing the Python interpreter easy. Use the Change Directory (CD) command for your system to change directories to the one that holds the source code file. Type python FPD_02_Sample.py and press Enter. Your code will execute as shown in Figure 2-15.
This book doesn't spend much time using this approach because, as you can see, it’s harder to use and understand than working with Notebook. However, it’s still a perfectly acceptable way to execute your own code.
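If you want to see the command-line approach without leaving Python, the standard subprocess module can launch the interpreter on a script for you. The sketch below writes a small stand-in script (not your exported FPD_02_Sample.py) to a temporary folder and runs it. The filename sample_script.py and its contents are assumptions for illustration.

```python
import os
import subprocess
import sys
import tempfile

# Write a small stand-in script to a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    script_path = os.path.join(folder, "sample_script.py")
    with open(script_path, "w") as script:
        script.write("myVar = 3 + 4\nprint(myVar)\n")

    # Equivalent to typing: python sample_script.py
    result = subprocess.run([sys.executable, script_path],
                            stdout=subprocess.PIPE,
                            universal_newlines=True)

print(result.stdout.strip())
```

Using sys.executable runs the script with the same interpreter that is running the sketch, which avoids any dependence on how PATH is set.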
As you work through the examples in this book, you see that certain lines are indented. In fact, the examples also provide a fair amount of white space (such as extra lines between lines of code). Python ignores extra lines for the most part, but relies on indentation to show certain coding elements (so the use of indentation is essential). For example, the code associated with a function is indented under that function so that you can easily see where the function begins and ends. The main reason to add extra lines is to provide visual cues about your code, such as the end of a function or the beginning of a new coding element.
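As a small illustration of indentation marking where a function begins and ends, consider the following sketch. The function name greet and its message are made up for this example.

```python
def greet(name):
    # These indented lines belong to the greet() function.
    message = "Hello, " + name + "!"
    return message


# Back at the left margin: this code is outside the function.
print(greet("Python"))
```

The blank lines after the return statement are optional as far as Python is concerned; they serve only as a visual cue that the function has ended.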
The various uses of indentation will become more familiar as you go through the examples in the book. However, you should know at the outset why indentation is used and how it gets put in place. To that end, it’s time for another example. The following steps help you create a new example that uses indentation to make the relationship between application elements a lot more apparent and easier to figure out later:
Choose New ⇒ Python 3.
Jupyter Notebook creates a new notebook for you. The downloadable source uses the filename FPD_02_Indentation.ipynb, but you can use any name you want.
Type print("This is a really long line of text that will " +.
You see the text displayed normally onscreen, just as you expect. The plus sign (+) tells Python that there is additional text to display. Adding text from multiple lines together into a single long piece of text is called concatenation. You learn more about using this feature later in the book, so you don’t need to worry about it now.
Press Enter.
The insertion point doesn’t go back to the beginning of the line, as you might expect. Instead, it ends up directly under the first double quote, as shown in Figure 2-16. This feature is called automatic indentation and is one of the features that differentiates a regular text editor from one designed to write code.
Type "appear on multiple lines in the source code file.") and press Enter.
Notice that the insertion point goes back to the beginning of the line. When Notebook senses that you have reached the end of the code, it automatically outdents the text to its original position.
Click Run.
You see the output shown in Figure 2-17. Even though the text appears on multiple lines in the source code file, it appears on just one line in the output. The line does break because of the size of the window, but it’s actually just one line.
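The following sketch repeats the text from the preceding steps and checks that the result really is a single line. It also shows a second form, in which Python joins adjacent string literals without the plus sign. The variable names joined and adjacent are chosen for this illustration only.

```python
# Explicit concatenation with +, using the text from the steps above.
joined = ("This is a really long line of text that will " +
          "appear on multiple lines in the source code file.")

# Adjacent string literals are also joined automatically by Python.
adjacent = ("This is a really long line of text that will "
            "appear on multiple lines in the source code file.")

print(joined)
print(joined == adjacent)   # Both forms build the same string.
print("\n" in joined)       # No newline character is present.
```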
People create notes for themselves all the time. When you need to buy groceries, you look through your cabinets, determine what you need, and write it down on a list. When you get to the store, you review your list to remember what you need. Using notes comes in handy for all sorts of needs, such as tracking the course of a conversation between business partners or remembering the essential points of a lecture. Humans need notes to jog their memories. Comments in source code are just another form of note. You add comments to the code so that you can remember what task the code performs later. The following sections describe comments in more detail. You can find these examples in the FPD_02_Comments.ipynb file in the downloadable source.
Computers need some special way to determine that the text you're writing is a comment, not code to execute. Python provides two methods of defining text as a comment and not as code. The first method is the single-line comment. It uses the hash, also called the number sign (#), like this:
# This is a comment.
print("Hello from Python!") #This is also a comment.
Python doesn’t actually support a multiline comment directly, but you can create one using a triple-quoted string. A multiline comment both starts and ends with three double quotes (""") or three single quotes (''') like this:
"""
Application: Comments.py
Written by: John
Purpose: Shows how to use comments.
"""
You typically use multiline comments for longer explanations of who created an application, why it was created, and what tasks it performs. Of course, no hard rules exist for precisely how to use comments. The main goal is to tell the computer precisely what is and isn’t a comment so that it doesn’t become confused.
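Because a triple-quoted comment is really a string, Python treats it differently from a hash comment. One visible consequence, sketched below, is that a triple-quoted string placed as the first statement in a function is kept as that function's documentation, whereas a hash comment is discarded. The function name add is made up for this example.

```python
def add(first, second):
    """Return the sum of the two arguments."""
    # Python discards this comment entirely.
    return first + second


print(add(3, 4))
print(add.__doc__)   # The triple-quoted string is still available.
```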
A lot of people don’t really understand comments—they don’t quite know what to do with notes in code. Keep in mind that you might write a piece of code today and then not look at it for years. You need notes to jog your memory so that you remember what task the code performs and why you wrote it. Here are some common reasons to use comments in your code:
You can use comments in a lot of other ways, too, but these are the most common ways. Look at how comments are used in the examples in the book, especially as you get to later chapters where the code becomes more complex. As your code becomes more complex, you need to add more comments and make the comments pertinent to what you need to remember about it.
Developers also sometimes use the commenting feature to keep lines of code from executing (referred to as commenting out). You might need to do this to determine whether a line of code is causing your application to fail. As with any other comment, you can use either single-line commenting or multiline commenting. However, when using multiline commenting, you do see the code that isn’t executing as part of the output (and it can actually be helpful to see where the code affects the output). Here is an example of both forms of commenting out:
# print("This print statement won't print")
"""
print("This print statement appears as output")
"""
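The triple-quoted version shows up in Notebook because Notebook displays the value of the last expression in a cell, and here that expression is the string itself. The code inside the string never runs. As a sketch of that distinction, the following executes the same two forms as an ordinary script and captures whatever they print. The variable names are assumptions for illustration.

```python
import contextlib
import io

# The same two forms of commenting out, held as script text.
source = '''# print("This print statement won't print")
"""
print("This print statement appears as output")
"""
'''

buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    exec(compile(source, "<script>", "exec"))
script_output = buffer.getvalue()

# Neither print() call ran, so nothing was captured.
print(repr(script_output))
```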
After you have used the File ⇒ Close and Halt command to close each of the notebooks you have open (the individual browser windows), you can simply close the browser window showing the Notebook Home page to end your session. However, the Notebook server (a separate part of Notebook) continues to run in the background. Normally, a Jupyter Notebook window opens when you start Notebook, like the one shown in Figure 2-19. This window remains open until you stop the server. Simply press Ctrl+C to end the server session, and the window will close.
You have access to a wealth of Python resources online, and many of them appear in this book in the various chapters. However, the one resource you need to know about immediately is Anaconda Navigator. You start this application by choosing the Anaconda Navigator entry in the Anaconda3 folder. The application requires a few moments to start, so be patient.
The Community tab, shown in Figure 2-21, provides access to events, forums, and social entities. Some of this content changes over time, especially the events. To get a quick overview of an entry, hover the mouse over it. Reading an overview is especially helpful when deciding whether you want to learn more about events. Forums differ from social media by the level of formality and the mode of access. For example, Stack Overflow allows you to ask Python-related questions, and Twitter allows you to rave about your latest programming feat.