Nearly essential tools of the trade

This section is about the tools used in the preparation of this book. They aren't essential to Haskell or data analysis, but they deserve a mention.

Version control software – Git

If you have ever been in a situation where you needed to update an old file while keeping that old file, you may have been tempted to name the files MyFileVersion1 and MyFileVersion2. In this instance, you used manual version control. Instead, you should use version control software.

Git is a distributed version control system that allows teams of programmers to work on a single project, track their changes, branch a project, merge project branches, and roll back mistakes if necessary. Git scales from a team of one to hundreds of members.

If you already have a favorite software package for version control, we encourage you to use it while working through the examples in this book. If not, we will quickly demonstrate how to use Git.

First, you need to install Git. On a Debian-based system such as ours, use the following command:

$ sudo apt-get install git

Git requires you to set up a repository in your working directory. Navigate to your folder for your Haskell project and create a repository:

$ git init

Once your repository is created, you can add files to it. Create a file called LearningDataAnalysis01.hs. At this point, the file should be blank. Let's add the blank file to our repository:

$ git add LearningDataAnalysis01.hs

Now, we'll commit the change:

$ git commit -m 'Add chapter 1 file'
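On a fresh installation, Git may refuse to commit until it knows who you are. The following is a minimal sketch of the whole sequence, run in a throwaway directory so that your real project is untouched. The name and e-mail address are placeholders (our assumption, not part of the original steps), and the last two commands simply confirm that the commit was recorded:

```shell
# Work in a scratch directory created just for this example.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Placeholder identity, stored only in this repository's configuration.
git config user.name "Your Name"
git config user.email "you@example.com"

# Create the blank chapter file, stage it, and commit it.
touch LearningDataAnalysis01.hs
git add LearningDataAnalysis01.hs
git commit -q -m 'Add chapter 1 file'

# Show the history (one line per commit) and any uncommitted changes.
git log --oneline
git status --short
```

If everything worked, git log lists a single commit and git status prints nothing, which means the working directory matches the repository.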

Take a moment to revisit the LearningDataAnalysis01.hs file and make a change that damages it. We can do this with the following command:

$ echo "It was a mistake to add this line." >> LearningDataAnalysis01.hs

The added line represents work that you contributed to a file but later realized was a mistake. The program will no longer compile with this change. You may wish that you could remember the contents of the original file. You are in luck. Everything that you have committed to version control is stored in the repository. Rename your damaged file to LearningDataAnalysis01Damaged.hs. We will then restore our file to its state at the last commit:

$ git checkout -- LearningDataAnalysis01.hs

The blank LearningDataAnalysis01.hs file will be restored to your folder. When you inspect the file, you will see that the damaging line is gone. Hurray!
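Before throwing changes away, it can help to look at exactly what changed. The sketch below replays the damage-and-restore steps in a scratch repository and adds two inspection commands, git status and git diff. It copies the damaged file rather than renaming it, which is a small variation of ours, and the identity values are placeholders:

```shell
# Set up a scratch repository containing the committed blank file.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Your Name"
git config user.email "you@example.com"
touch LearningDataAnalysis01.hs
git add LearningDataAnalysis01.hs
git commit -q -m 'Add chapter 1 file'

# Damage the file, as in the text.
echo "It was a mistake to add this line." >> LearningDataAnalysis01.hs

# Inspect the damage: which files changed, and which lines.
git status --short
git diff

# Keep a copy of the damaged version, then restore the committed one.
cp LearningDataAnalysis01.hs LearningDataAnalysis01Damaged.hs
git checkout -- LearningDataAnalysis01.hs
```

Git 2.23 and later also provide git restore LearningDataAnalysis01.hs, which does the same job as the checkout command used here.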

If you have a project consisting of at least one file, you should use version control. Here is the general workflow for branchless version control:

  1. Think.
  2. Write some code.
  3. Test that code.
  4. Commit that code.
  5. Go to step 1.
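One pass through steps 2 to 4 might look like the following sketch. The run_tests function is a hypothetical stand-in for whatever check suits your project (compiling the file, running a test suite, and so on), and the identity values are placeholders. The commit only happens when the check succeeds:

```shell
# Scratch repository with the blank chapter file already committed.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Your Name"
git config user.email "you@example.com"
touch LearningDataAnalysis01.hs
git add LearningDataAnalysis01.hs
git commit -q -m 'Add chapter 1 file'

# Hypothetical test step: replace the body with your real check.
run_tests() {
    grep -q 'module LearningDataAnalysis01' LearningDataAnalysis01.hs
}

# Step 2: write some code.
echo "module LearningDataAnalysis01 where" >> LearningDataAnalysis01.hs

# Steps 3 and 4: test the code, and commit it only if the test passes.
if run_tests; then
    git add LearningDataAnalysis01.hs
    git commit -q -m 'Declare the chapter 1 module'
else
    echo "Tests failed; nothing committed." >&2
fi

# The history now shows one commit per completed pass through the loop.
git log --oneline
```

Tying the commit to a passing test keeps every commit in the history at a point you would be happy to roll back to.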

It doesn't take long to see the benefits of version control. Mistakes happen, and version control is there to save you. This workflow will be sufficient for small projects. We will not remind you again to use version control, but you should make a practice of committing your code at least after each chapter, and probably more frequently than that.

Tmux

Tmux is a terminal multiplexer: an application that runs multiple terminals within a single terminal. A collection of terminals can be detached and reattached to other terminal connections, programs can be kept running in the background while you monitor their progress, and you can jump back and forth between terminals. For example, while writing this book, we typically kept tmux running with the following terminals:

  • A terminal for the interactive Haskell command line
  • A terminal running our favorite text editor while working on the code for a chapter
  • A terminal running a text editor with mental notes to ourselves and snippets of code
  • A terminal running a text editor containing the text of the chapter we were currently writing
  • A terminal running the terminal web browser elinks in order to read the Haskell documentation

The prized feature (in our opinion) of tmux is its ability to detach from a terminal (even one that has lost its connection) and reattach to the terminal you are currently connected from. Our work environment is a remote virtual private server running Debian Linux. With tmux, we can log in to our server from any computer with an Internet connection and an ssh client, reattach the current tmux session, and return to testing and writing code.

We will begin by installing tmux:

$ sudo apt-get install tmux

Now, let's start tmux:

$ tmux

You will see the screen refresh with a new terminal. You are now inside a pseudoterminal. While in this terminal, start the interactive Haskell compiler (ghci). At the prompt, perform a calculation. Let's add 2 and 2 using prefix notation rather than the usual infix notation. In Haskell, operators are ordinary functions that can also be written infix; wrapping an operator in parentheses, as in (+), lets us call it like any other function:

$ ghci
GHCi, version 7.4.1: http://www.haskell.org/ghc/ :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
> (+) 2 2
4

The interactive Haskell compiler runs continuously. On your keyboard, type Ctrl + B, followed by C (for create). This command creates a new terminal. You can cycle forward through the chain of open terminals by using the Ctrl + B command, followed by N (for next). You now have two terminals running on the same connection.

Imagine that you are working on a remote server. On your keyboard, press Ctrl + B followed by D. The screen will return to the way it looked just before you called tmux, and the word [detached] will appear. You will no longer be able to see the interactive Haskell compiler, but it is still running in the background. You can reattach the session to this terminal window by using the following command:

$ tmux attach -d

Your windows will be restored with all of your applications running and the content on the screen the same as it was when you left it. Cycle through the terminals until you find the Haskell interactive command line (Ctrl + B followed by P cycles to the previous terminal). The application never stopped running. Once you are finished with your multiplexed session, close each command line in the manner that you normally would (either by using Ctrl + D or by typing exit). Every terminal that is closed will return you to another open terminal. Tmux itself will exit once the last terminal opened within it is closed.
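The same steps can also be driven from the command line instead of the keyboard shortcuts, which is handy when you want to check what is running before you reattach. The following is a rough sketch rather than a required setup; the session name book-demo is a hypothetical name chosen for this example:

```shell
# Hypothetical session name used only in this example.
session=book-demo

# Start a session in the background (-d: do not attach this terminal to it).
tmux new-session -d -s "$session"

# Open a second window in that session, like pressing Ctrl + B followed by C.
# The trailing colon tells tmux the target is the session, not a window.
tmux new-window -t "$session:"

# Count and list the windows that are currently open in the session.
windows=$(tmux list-windows -t "$session" | wc -l | tr -d ' ')
echo "Open windows: $windows"
tmux list-windows -t "$session"

# To work in the session interactively, you would run:
#   tmux attach -t book-demo
# Here we close the session instead so that the example cleans up after itself.
tmux kill-session -t "$session"
```

Naming a session with -s makes it easy to tell sessions apart in the output of tmux ls when more than one is running on the same server.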
