Chapter 4
Configuration and Setup

When starting to use Git, it's important to configure it so that it works properly in your particular environment. You'll also want to be able to manage your content and your interactions with Git in a way that you prefer. In this chapter, you will learn how to configure your Git environment, and explore the different considerations that come into play. You'll look at some of the key required items such as line endings, as well as some of the more significant optional settings. You'll also learn how to define settings within the different scopes that Git allows.

In the “Advanced Topics” section, I'll describe how the init command works, offer more detail about what's actually in the underlying repository, and show you how to create aliases that take parameters and can run small programs.

EXECUTING COMMANDS IN GIT

As I previously mentioned, this book focuses on the Git command line to provide the most universally applicable way to use the tool. The general form of commands is as follows:

git <git-options>  <command>  <command-options>  <operands>

Table 4.1 describes the different parts of this form.

Table 4.1 Components of a Git Command Line Invocation

Element | Description | Example(s) | Notes
git | Command to run Git | git |
<git-options> | Global options for Git itself. These options may also specify a function to execute. | git --work-tree=<path>, git --version | Some of these options may be intended for standalone operation (for example, --version), while others modify values used for other commands (for example, --work-tree).
<command> | Git command to execute | git push |
<command-options> | Options to the specified command | git commit -m "comment" | May have default options if none are specified. Options may also have values that can be selected to further qualify the option.
<operands> | Items for the command to operate on | git add *.c | Particular to the command being executed. Examples include files in the working directory, branches or SHA1s in a repository, or a particular setting or value.

Operand Types

As referenced in Table 4.1, Git can take different kinds of operands, which are specifications of objects to operate on. The two most common operands are the SHA1 value of a commit (or a named branch or tag that refers to such a commit) and a path specification to a file or directory on the disk. For many commands, either or both of these value types may be specified—or neither. When neither operand is specified, the command will operate against all eligible items that it finds in the scope of the repository, staging area, or working directory tree.

The primary reason to specify both commit references and paths would be to select certain paths that are part of, or in the scope of, the snapshot associated with the commit. Because Git operates at the granularity of a snapshot (tree), you may not always want to do the operation against all items in the snapshot. However, that's what would happen if you just specified the commit | tag | branch. To indicate that the operation should only be done against certain files or paths in the scope of the snapshot, you need to add specific filenames or paths.

When both types are specified, if there is a possibility of Git not being able to tell the difference between a commit | branch | tag and one or more of the filenames or paths, then you can separate the two types using the special separation symbol “--”. Normally, this won't be needed if a commit is expressed as a SHA1 value, but it may be needed if branch or tag names could be mistaken as names for files or paths.

As an example, the command git <command> a1b2c3d4 file1.txt might be clear enough, but git <command> my-tag-name my-file-name could be ambiguous enough when parsed to require the “--” separator symbol, as in git <command> my-tag-name -- my-file-name.
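To make the separator concrete, here is a minimal sketch in a throwaway repository. The name release (used for both a file and a tag), the identity values, and the temporary directory are invented for illustration; the only point is that one word can name both a tag and a path.

```shell
set -e
# Throwaway repository; every name below is hypothetical
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Create a file and a tag that share the name "release"
echo "notes" > release
git add release
git -c user.name="Example User" -c user.email="user@example.com" \
    commit -q -m "Add a file named release"
git tag release

# Without the separator, Git cannot tell whether "release" is a tag or a path
if git log --oneline release >/dev/null 2>&1; then
    ambiguous=no
else
    ambiguous=yes
fi
echo "ambiguous without separator: $ambiguous"

# Anything before "--" is read as a revision; anything after it is read as a path
git log --oneline release --      # history reachable from the tag
git log --oneline -- release      # history that touched the file
```

In this sketch both forms list the same single commit, but they reach it by different routes: the first starts from the tag, and the second filters history by the path.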

Porcelain versus Plumbing Commands

In this section, command represents any of the commands available in Git, such as the ones I talked about for moving content between the levels of Git in Chapter 3 (add, commit, push, and so on). In Git, there are two categories for the types of commands: porcelain and plumbing. Those names may sound strange, but essentially, the porcelain commands are intended to be user-facing, more commonly used, and more convenient. They also typically provide a higher level of functionality. The commands that I previously mentioned in conjunction with the Git promotion model are examples of porcelain commands.

The plumbing commands function at a lower level and are not expected to be used by the average user. These commands are typically targeted at extracting or modifying content and information more directly from the repository. Examples are the git cat-file and git ls-files commands, which provide ways to look at the contents of a file or directory within the repository if you know how to reference those elements.

Certain functionality in Git can be accomplished using either porcelain commands or plumbing commands. However, it would usually take several very specific plumbing commands to accomplish what one porcelain command can do. The porcelain commands are based on the plumbing commands. They aggregate the functionality of plumbing commands and certain options and sequences in order to make things simpler for the typical Git user.

Table 4.2 shows a categorization of the porcelain (user-friendly) commands that are available in Git.

Table 4.2 Porcelain Commands in Git

Command | Purpose
add | Add file contents to the index.
bisect | Find by binary search the change that introduced a bug.
branch | List, create, or delete branches.
checkout | Switch branches or restore working tree files.
cherry | Find commits yet to be applied to upstream (branch on the remote).
cherry-pick | Apply the changes introduced by some existing commits.
clone | Clone a repository into a new directory.
commit | Record changes to the repository.
config | Get and set repository or global options.
diff | Show changes between commits, commits and working tree, and so on.
fetch | Download objects and refs from another repository.
grep | Print lines matching a pattern.
help | Display help information.
log | Show commit logs.
merge | Join two or more development histories together.
mv | Move or rename a file, directory, or symlink.
pull | Fetch from, or integrate with, another repository or a local branch.
push | Update remote refs along with associated objects.
rebase | Forward-port local commits to the updated upstream head.
rerere | Reuse recorded resolution for merged conflicts.
reset | Reset current HEAD to the specified state.
revert | Revert some existing commits.
rm | Remove files from the working tree and from the index.
show | Show various types of objects.
status | Show the working tree status.
submodule | Initialize, update, or inspect submodules.
subtree | Merge subtrees and split repositories into subtrees.
tag | Create, list, delete, or verify a tagged object.
worktree | Manage multiple working trees.

Table 4.3 shows the same categorization for the plumbing commands. These commands have names that indicate an action and an object to operate against as opposed to the simpler naming of the porcelain commands.

Table 4.3 Plumbing Commands in Git

Command | Purpose
cat-file | Provide content or type and size information for repository objects.
commit-tree | Create a new commit object.
count-objects | Count an unpacked number of objects and their disk consumption.
diff-index | Compare a tree to the working tree or index.
for-each-ref | Output information on each ref.
hash-object | Compute object ID and optionally create a blob from a file.
ls-files | Show information about files in the index and the working tree.
merge-base | Find as good common ancestors as possible for a merge.
read-tree | Read tree information into the index.
rev-list | List commit objects in reverse chronological order.
rev-parse | Pick out and massage parameters.
show-ref | List references in a local repository.
symbolic-ref | Read, modify, and delete symbolic refs.
update-index | Register file contents in the working tree to the index.
update-ref | Update the object name stored in a ref safely.
verify-pack | Validate packed Git archive files.
write-tree | Create a tree object from the current index.

The descriptions for the commands in these tables are taken directly from the Git help. Some of the terms are more Git-specific at this point. However, as I use commands through the remainder of this book, I'll simplify their definitions and the terminology so it all makes sense.

The point of this section is that unless you have a specific need to deep-dive into the repository, you can simply use the porcelain commands and accomplish what you need to in Git.
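To illustrate how much one porcelain command wraps up, the following sketch records a first commit using only plumbing commands from Table 4.3. It is one possible sequence, not the exact steps Git performs internally; the file name, commit message, identity values, and temporary directory are assumptions made for the example.

```shell
set -e
# Throwaway repository; names and identity below are hypothetical
repo=$(mktemp -d)
cd "$repo"
git init -q .
echo "hello" > greeting.txt

# Porcelain equivalent of everything below:
#   git add greeting.txt && git commit -m "Add greeting"

# 1. Store the file contents as a blob object and capture its ID
blob=$(git hash-object -w greeting.txt)
# 2. Register that blob in the index under the file's path
git update-index --add --cacheinfo 100644,"$blob",greeting.txt
# 3. Write the current index out as a tree object
tree=$(git write-tree)
# 4. Wrap the tree in a commit object (the message is read from standard input)
commit=$(echo "Add greeting" | git -c user.name="Example User" \
    -c user.email="user@example.com" commit-tree "$tree")
# 5. Point the branch that HEAD refers to at the new commit
git update-ref "$(git symbolic-ref HEAD)" "$commit"

git cat-file -t "$commit"   # prints the object type: commit
git cat-file -p "$blob"     # prints the stored contents: hello
git log --oneline
```

Five plumbing steps stand in for two porcelain commands here, which is the trade-off described above: finer control in exchange for more work.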

Specifying Arguments

Arguments supplied either to Git or to Git commands can be abbreviated as a single letter or spelled out as words. One important note here is that if the argument is spelled out, you must precede it with two hyphens, as in --global. If the argument is abbreviated, only one hyphen is required, as in -a. Abbreviated arguments may be passed together, as in -am instead of -a -m. When arguments are combined in this way, the ordering is important. If the first argument requires a value, then the second argument may be taken as the required value instead of an additional argument.
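The ordering rule is easiest to see with git commit, where -m requires a value and -a does not. The sketch below uses a throwaway repository with a hypothetical file and identity.

```shell
set -e
# Throwaway repository; file name and identity are hypothetical
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

echo "one" > notes.txt
git add notes.txt
git commit -q -m "First version"

# -a -m "..." and -am "..." mean the same thing: -m comes last,
# so the next word on the command line is its value
echo "two" >> notes.txt
git commit -q -am "Second version"

# With the order reversed, -m takes the letter "a" as its value,
# so the commit message becomes "a" and -a is never seen as an option
echo "three" >> notes.txt
git add notes.txt
git commit -q -ma

git log --format=%s
```

The last commit in this sketch ends up with the one-letter message a, which is rarely what anyone intends.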

Auto-complete

When you start typing a command or an argument to a command, Git has a helpful auto-completion feature (if enabled) that can do two things:

  • Provide valid values for the commands or arguments that could complete the text you're typing—if there is more than one valid option.
  • Automatically complete the command or argument that you're typing—if there is only one valid option.

Following are a couple of examples. The first one is for a command. If you type git c and then press the Tab key, nothing happens because there's more than one command that starts with c.

If you press the Tab key a second time (before typing anything else in between), Git helpfully displays all of the commands that start with c. In this case, it also scrolls that list up and leaves you at a prompt where you can continue typing the chosen command.

$ git c
checkout      citool        commit
cherry        clean         config
cherry-pick   clone

$ git c

Here's another example, where you narrow the available commands with more letters.

$ git co <TAB><TAB>
commit   config

$ git co

If you type enough letters to uniquely identify only one possible choice, then pressing the Tab key auto-completes the command for you because there's only one option. For example, git con <TAB> yields git config.

This also works for arguments to commands. Typing git config --l <TAB> <TAB> gives the suggestions: --list --local. Because --l matches both, you need one more letter to get a unique match: typing git config --li <TAB> yields git config --list.

Enabling Auto-complete If You Don't Have It

As noted earlier, auto-complete is already enabled in Git for Windows and some other distributions. For other versions (Linux, OS X) where it is not enabled, you can download scripts that implement this feature for different shells from https://github.com/git/git/tree/master/contrib/completion.

Once you understand tools like git pull, you can use them to retrieve these scripts via Git. Until then, or as an alternate approach, a simple way is just to click the desired script and then find the button labeled Raw on that page. Click that button to go to a web page with just the contents of that file. Then, you can download that script to your local system (through the browser) and add it into the appropriate init file in your home directory or into the appropriate directory for auto-completion for all users if your shell supports that.

Let's work through a quick example of how to install this feature for a bash environment.

Here's the direct link for the raw version: https://raw.githubusercontent.com/git/git/master/contrib/completion/git-completion.bash

After getting the raw version of the file, you can download that page as the file git-completion.bash to your local system. Once the script is downloaded, you add a line like the following into your .bashrc file (create the file if needed):

source ~/git-completion.bash

To extend this functionality for all users, you'll need to find out where your particular OS stores and expects to find auto-completion scripts and put the downloaded file there. For most bash systems, there is a /etc/bash_completion.d directory where scripts like this can be stored to be loaded. If you're not sure where the location is, try searching for completion on your file system, or consult Google.

Auto-completion and the Windows Command Prompt

In the Windows command prompt, auto-complete functionality is not built in, and the method in the previous section doesn't work because it is based on a Linux script. However, there is a utility called clink that you can search for, download, and install on Windows that will provide command auto-completion for Git (as well as other functionality). The use is the same—suggestions or completion via the tab key.

Note, however, that this does not provide suggestions or auto-completion for arguments to the commands.

Now that you understand how to invoke Git commands and pass arguments, let's see how you can use this feature to accomplish one of the most basic and essential parts of using Git: configuration.

CONFIGURING GIT

To set configuration values in Git, you use the config command. Here's the syntax:

git config [<file-option>] [type] [--show-origin] [-z|--null] name [value [value_regex]]
git config [<file-option>] [type] --add name value
git config [<file-option>] [type] --replace-all name value [value_regex]
git config [<file-option>] [type] [--show-origin] [-z|--null] --get name [value_regex]
git config [<file-option>] [type] [--show-origin] [-z|--null] --get-all name [value_regex]
git config [<file-option>] [type] [--show-origin] [-z|--null] [--name-only] --get-regexp name_regex [value_regex]
git config [<file-option>] [type] [-z|--null] --get-urlmatch name URL
git config [<file-option>] --unset name [value_regex]
git config [<file-option>] --unset-all name [value_regex]
git config [<file-option>] --rename-section old_name new_name
git config [<file-option>] --remove-section name
git config [<file-option>] [--show-origin] [-z|--null] [--name-only] -l | --list
git config [<file-option>] --get-color name [default]
git config [<file-option>] --get-colorbool name [stdout-is-tty]
git config [<file-option>] -e | --edit

Now here's an example of the most common syntax:

$ git config --global user.name "Joe Gituser"

Let's dissect the various parts of this command. The first two pieces are simply issuing the config command from git. After that is an option, global, (preceded by two hyphens because you are spelling it out). I'll be talking in more detail about this option shortly. Next comes the configuration setting that you're updating: user.name. Git uses a “.” notation to separate out the two pieces of a configuration setting—in this case, user and name. Think of this as setting the name value of the user section in the configuration. And finally, you have the actual value that you're setting this configuration setting to. Notice that because you have spaces in the value, you need to enclose the entire string in quotes.

Here's another example:

$ git config --global user.email "<email address>"

One additional note: Git configuration settings are stored in text files. It is possible to change these settings by editing the associated text files, but this is highly discouraged because it's easy to make a mistake and also to accidentally modify other settings.
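To see how the section and name parts of a setting map onto the stored text, the sketch below sets two values and then displays the resulting file. It points HOME at a temporary directory so that your real configuration is not touched; that sandboxing and the e-mail address are assumptions made for the example.

```shell
set -e
# Sandbox: use a temporary HOME so the real ~/.gitconfig is left alone
export HOME=$(mktemp -d)
unset XDG_CONFIG_HOME

git config --global user.name "Joe Gituser"
git config --global user.email "joe@example.com"   # hypothetical address

# The global settings are stored as plain text, grouped under a [user] section
cat "$HOME/.gitconfig"
```

The file shows a [user] section header followed by name and email entries, which is the same split that the “.” in user.name expresses on the command line.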

Telling Git Who You Are

Referring to the two earlier examples, one of the first things that you need to configure in Git is who you are, in terms of the username and e-mail address. Git expects you to set these two values, regardless of what interface or version of Git you use. This is because Git is a source management system. Because its purpose is to track changes by users over time, it wants to know who is making those changes so that it can record them.

If you don't specify these values, then Git will interpolate them from the signed-on userid and machine name (user@system). Chances are this is not what you want to have the system ultimately use. If you forget to set these values initially on a new system, and commits are recorded with the interpolated values, there is a way to go back and correct this information, using the commit command with the --amend and --reset-author options.

The values can be set via the same commands as shown in the previous section: git config --global user.name <name> and git config --global user.email <email address>.
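As a rough sketch of the correction mentioned above, the following sequence records a commit under an unwanted identity and then rewrites it. The identities, file name, and sandboxed HOME are hypothetical, and --no-edit is added only so that the existing commit message is kept without opening an editor.

```shell
set -e
# Sandbox: temporary HOME and throwaway repository (all names hypothetical)
export HOME=$(mktemp -d)
unset XDG_CONFIG_HOME
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Simulate a commit recorded under an identity you did not intend
echo "draft" > readme.txt
git add readme.txt
git -c user.name="jdoe" -c user.email="jdoe@build-host" \
    commit -q -m "Initial commit"
git log -1 --format='%an <%ae>'

# Set the intended identity, then reset the author on the most recent commit
git config --global user.name "Joe Gituser"
git config --global user.email "joe@example.com"
git commit -q --amend --reset-author --no-edit
git log -1 --format='%an <%ae>'
```

Note that this form only rewrites the most recent commit; older commits keep whatever identity they were recorded with.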

Configuration Scope

In the previous examples, I used the --global option as part of the configuration step. The global option is a way of telling Git how broadly this configuration setting should be used—which repositories it should apply to.

Recall that the Git model is designed for many, smaller repositories instead of fewer, monolithic ones. Because users may normally be working with multiple repositories, it would be inconvenient and subject to error to have to configure the same settings in each repository. As a result, Git provides options to simplify choosing the scope for configuration values. There are three levels available for configuration: system, global, and local.

System

Configuration at the system level means that a configuration value applies to all repositories on a given system (machine) unless it's overridden at a lower level. These settings apply regardless of the particular user.

To ensure that a configuration value applies at the system level, you specify the --system option for the config command, as in git config --system core.autocrlf true.

These settings are usually stored in a gitconfig file in either /usr/etc or /usr/local/etc. On a Windows system, if you're using Git for Windows, the system file is in C:\ProgramData\Git\config. In other systems, look in the directory where Git was installed.

Global

Configuration at the global level implies that a configuration value applies to all of the repositories for a particular user, unless overridden at the local level. Unless you need repository-specific settings, this is the most common level for users to work with because it saves the effort of having to set values for each repository. An example of setting values at the global level would be the configuration I did earlier for user.name and user.email where the --global option was incorporated. These settings are stored in a file named .gitconfig in each user's home directory.

Local

Setting a configuration value at the local level means that the setting only applies in the context of that one repository. This can be useful in cases where you need to specify unique settings that are particular to one repository. It can also be useful if you need to temporarily override a higher-level setting.

An example could be overriding the global end of line settings because content in a repository is targeted for a different platform. To update settings at this level, you can specify the --local option or just omit any of the local, global, or system options for the configuration.

As an example of this last point, the following two commands are equivalent: git config --local core.autocrlf true and git config core.autocrlf true.

The local repository's configuration is stored within the local Git repository, in .git/config (or in config under wherever your Git directory is configured to be).

These scope options (--local, --global, and --system) can be applied to other options and forms of the git config command to indicate the scope to be referenced for that command.

Settings Hierarchy

When determining what configuration setting to use, Git uses a particular search order to find these settings. First, it looks for a setting in the local repository configuration, then in the global configuration, and finally in the system configuration. If a specific value is found in that search order, then that value is used. Beyond that, the union of all of the levels (unique local + unique global + unique system) forms the superset of configuration values used when working with a repository.

Figure 4.1 summarizes the different configuration scopes in Git and how to work with them.

A schematic diagram of the different configuration scopes in Git and how to work with them.

Figure 4.1 Understanding the scopes of Git configuration files
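The search order can be checked directly with the --show-origin option, which reports the file each value was read from. This is a minimal sketch in a sandboxed HOME and a throwaway repository; the values are placeholders.

```shell
set -e
# Sandbox: temporary HOME and throwaway repository (values are placeholders)
export HOME=$(mktemp -d)
unset XDG_CONFIG_HOME
repo=$(mktemp -d)
cd "$repo"
git init -q .

git config --global user.name "Global user"
git config --global core.editor "nano"
git config --local user.name "Local user"

# Same setting at two levels: the local value wins
git config --show-origin --get user.name
# Setting present only at the global level: the global value is used
git config --show-origin --get core.editor
# Each level can still be queried on its own
git config --global --get user.name
```

Each --show-origin line starts with the file the value came from, so it is easy to tell which level supplied the effective setting.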

Seeing Configuration Values

To see what value a particular configuration setting has, you can use git config <setting> as in git config user.name.

Git then prints the value associated with that setting. Because I didn't specify one of the scope options (--system, --global, --local), Git first checks to see if there is a local setting, and if so, it displays that value. If there is no explicit local setting, then it looks for a global setting, and, if one is found, displays the global value. If there is no global setting specified, Git looks for a system setting and displays that value. This is an example of the search order that I outlined earlier.

You can also use the scope options to specifically direct the config command to a particular level, as I did when setting configuration values earlier.

To better understand how this works at a practical level, consider the following sequence:

$ git config --global user.name "Global user"
$ git config user.name 

This returns the value Global user because there was no local value defined; Git looked for a global setting and found this one.

On the other hand, say you were to use this sequence:

$ git config user.name "Local user"
$ git config user.name 

This returns the value Local user because the local option was implied in setting the value and thus it finds a local value defined.

Undoing a Configuration Setting

Occasionally, you may need to remove a setting at a particular level. Git provides the unset option for this, and it's pretty straightforward:

$ git config --unset <other options> <setting to remove>

Other options here would generally refer to one of the scope options. Continuing the earlier example,

$ git config --unset --global user.name
$ git config --global user.name

In this case, nothing is returned because I just removed this value.

Listing Configuration Settings

Another option related to viewing configuration values is --list. Supplying the list option to git config results in a list of all configuration settings being dumped. By default, this list includes local, global, and system settings without qualification. So, if you have both a local and global value for the same setting, you will see both.

$ git config --list
…
user.name=Global user
…
user.name=Local user

If the settings have the same values, this can be confusing (and potentially misleading) if you're not aware of the reasons behind it. To work around seeing these multiple values, you can refine the list by specifying one of the scope options.

$ git config --local --list
…
user.name=Local user

One-Shot Configuration

There is one additional way to set a configuration value: as a one-shot, one-time configuration for the current operation. This is done through one of the global options that can be passed to Git directly: -c.

The format for this is git -c <configuration setting>=<value> <rest of command line>.

Notice that this format requires the “=” sign between the setting and the value. Using this option effectively creates an override for the duration of the current operation.
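A short sketch shows that the override lasts only for the one command. The stored and one-shot values are placeholders, and HOME is sandboxed so that no real settings are involved.

```shell
set -e
# Sandbox: temporary HOME and throwaway repository (values are placeholders)
export HOME=$(mktemp -d)
unset XDG_CONFIG_HOME
repo=$(mktemp -d)
cd "$repo"
git init -q .

git config user.name "Stored user"

# -c applies only to this single invocation
one_shot=$(git -c user.name="One-shot user" config user.name)
# The next invocation sees the stored value again
stored=$(git config user.name)

echo "during the one-shot command: $one_shot"
echo "afterward: $stored"
```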

Now that you understand how configuration settings are specified and managed in Git, let's look at configuration for some of the most common settings and behaviors that users deal with.

Default Editor

The default editor is primarily used when you need to type in a message while making a commit into the repository. If you don't supply the message in the command line when you do the commit, Git will bring up the default editor on your system to allow you to type one in.

If you would rather use a different editor, you can use the following config command to specify which one to use: git config --global core.editor <editor name or path + name> <optional options for the editor>.

The --global option is not required, but most users want to use the same editor for all of their repositories. Here again, you can break down core.editor as the editor value in the core section of the configuration.

If the editor is already in the path that Git knows about, then the path isn't required. Here are some examples of configuring editors:

$ git config core.editor vim  (Linux)

$ git config --global core.editor "nano"  (OS X)

C:\> git config core.editor "'C:\Program Files\Windows NT\Accessories\wordpad.exe'"  (Windows)

$ git config --global core.editor "'C:/Program Files (x86)/Notepad++/notepad++.exe' -multiInst -noSession -notabbar"  (Bash shell on Git for Windows)

Note the different uses for single quotes and double quotes in the respective examples. Also, in the last example, -multiInst, -noSession, and -notabbar are all options to Notepad++ to make it simpler to use. (multiInst tells Notepad++ to allow multiple instances to run; noSession tells it not to remember the session state, that is, not to load the last file you were working on; and notabbar just avoids displaying the tabbed selection bar at the top.)

End of Line Settings

Now, let's look at one of the key settings users need to manage with Git: handling end of line (EOL) values. Git manages the two types of line endings: carriage returns/line feeds (CRLF) for Windows and line feeds (LF) for OS X/Linux.

In the context of Git, there are two options that are controlled by the EOL setting:

  • How line endings are stored in content when it is committed into the repository
  • How line endings are updated (or not) when content is checked out of the repository onto a local disk

The first item refers to whether or not Git normalizes line endings in the repository. Normalizing refers to stripping out CRs and only storing files with LFs.

For the second item, when content is checked out of Git, Git can update line endings in text files. This option allows you to specify whether or not Git updates line endings in files after checkout, and, if it does, which type it sets them to.

At a user or repository level, how Git handles these options is controlled by a configuration setting named core.autocrlf. As before, the “.” is a separator, and you can think of the first part as the section of the configuration, and the second part as the specific value being set in that section. The crlf part here obviously stands for carriage return, line-feed—meaning the common EOL sequence for files on a Windows environment. The auto part refers to automatically inserting CRLF sequences in files when they are checked out.

There are three possible values for the core.autocrlf setting:

  1. core.autocrlf=true. This value tells Git to normalize line endings to just LFs when storing files in the repository and to automatically insert CRLFs when files are checked out. If users are working on a Windows environment, this is the recommended value. It allows them to get CRLFs in files when checked out from Git, but doesn't store the CRs in the repository.
  2. core.autocrlf=input. This value tells Git to normalize line endings to just LFs when storing files in the repository but not to change anything when files are checked out. If users are working in a Unix environment, this is the recommended value because Unix expects just LFs.
  3. core.autocrlf=false. This default value tells Git not to change anything when files are being checked in or checked out. This is the primary value for the setting that can get users into trouble. Suppose you have two users working on code for the same repository, one in a Windows environment and one in a Unix environment. If both users have specified the core.autocrlf=false value in their configurations, then when they commit changes, the files from Windows will have CRLFs and those from Unix will have just LFs. If the respective users later each check out the other's files, then the files will have the wrong line endings for their system. For this reason, this value should not be used when mixed environments are being used in a project.

In general, it's a best practice to set the core.autocrlf setting to one of the values other than false, depending on which environment you're working in.

It should also be noted that there are other configuration settings that can contribute to how line endings are handled. However, these settings are more obscure and broader in terms of what they affect. Also, their default values generally work well for what most users need to do.
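To watch normalization happen, the sketch below stages a file that has Windows-style line endings with core.autocrlf set to input, then compares the size of the working file with the size of the content Git stored. The file name and contents are made up for the example, and HOME is sandboxed so that no other EOL-related settings interfere.

```shell
set -e
# Sandbox: temporary HOME and throwaway repository (file name is hypothetical)
export HOME=$(mktemp -d)
unset XDG_CONFIG_HOME
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config core.autocrlf input

# Two lines, each ending in CR+LF: 10 bytes in total
printf 'one\r\ntwo\r\n' > notes.txt
git add notes.txt 2>/dev/null   # Git may warn that CRLF will be replaced by LF

working_size=$(wc -c < notes.txt | tr -d ' ')
stored_size=$(git cat-file -s :notes.txt)   # size of the staged content

echo "working file: $working_size bytes"
echo "stored blob:  $stored_size bytes"
```

The stored content is two bytes smaller than the working file because one CR was stripped from each of the two lines, while the file on disk is left as it was.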

Aliases

Configuration in Git also supports the concept of configuring aliases for command strings. The format for defining an alias is git config <scope option> alias.<name> <command string>.

In this context:

  • <scope option> can be one of --system, --global, or --local. (Or it can be omitted, to default to local.)
  • <name> is the name you want to use for the alias. Once set, this can be used just like any other Git command.
  • <command string> is the string of a command and any arguments that the alias will substitute for.

There are two main reasons that aliases are convenient to create and use:

  • To save typing frequently used strings of commands and arguments
  • To create a more familiar command for a Git command

As an example of the first case, the git log command displays history in Git and has many options. Here's an example log command:

$ git log --pretty=format:"%h %ad | %s%d [%an]" --graph --date=short

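Because this is a long command string, it can be difficult to type each time you want to use it. So, you can create an alias instead. When defining the alias, leave off the leading git and quote the whole command string so that it's stored as a single value.

$ git config --global alias.hist "log --pretty=format:'%h %ad | %s%d [%an]' --graph --date=short"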

With this alias in place, you can now just type git hist instead of the longer, complicated command.

As an example of the second use case, suppose a user is more accustomed to typing checkin from using other SCMs instead of commit. If they want, they can create an alias by using git config --global alias.checkin commit.

After this, the user can type git checkin instead of git commit. (Note that, while this sort of alias can be created, it is not recommended because it is not universal and obscures Git's native commands.)

The format of the command to create an alias is consistent with the other config syntax. When you see alias.<name>, you can think of this as creating a value in the alias section of the configuration file. The alias information is stored in the configuration file for the specified scope.
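Here is a small end-to-end sketch of defining and using an alias. It works in a sandboxed HOME and a throwaway repository; the commit, file name, and identity are invented so that the alias has some history to display.

```shell
set -e
# Sandbox: temporary HOME and throwaway repository (names are hypothetical)
export HOME=$(mktemp -d)
unset XDG_CONFIG_HOME
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > greeting.txt
git add greeting.txt
git commit -q -m "Add greeting"

# The alias value is one quoted argument and omits the leading "git"
git config --global alias.hist \
    "log --pretty=format:'%h %ad | %s%d [%an]' --graph --date=short"

git config --get alias.hist   # shows the stored command string
git hist                      # runs the aliased log command

# The alias lives in an [alias] section of the global configuration file
cat "$HOME/.gitconfig"
```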

Windows Filesystem Cache

The underlying filesystem layer on Windows is fundamentally different from the filesystem layer on Linux. Git's filesystem access is optimized for the Linux filesystem, and in the past, some operations on Git for Windows were noticeably slower. To compensate, Git has added a filesystem cache that does bulk reads of file system data and stores it in memory. This speeds up many operations, once the cache is initially populated. In recent versions of the install for Git for Windows, this option is turned on by default. To set it manually, you change the core.fscache value to true via git config --global core.fscache true.

INITIALIZING A REPOSITORY

Now that you understand how to configure the Git environment, I'll move on to setting up a local environment. Recall that a local environment consists of the three levels I discussed in the previous chapter: working directory, staging area, and local repository.

There are two ways to get a local environment for use with Git:

  • Creating a new environment from a set of existing files in a local directory, via the git init command
  • Seeding a local environment by copying an existing remote repository, via the git clone command

I'll discuss each of these methods in turn.

Git Init

The git init command is used for creating a new, empty Git repository in the local directory. The syntax for the command is shown below.

git init [-q | --quiet] [--bare] [--template=<template_directory>]
    [--separate-git-dir <git dir>]
    [--shared[=<permissions>]] [directory]

When this command is run, a new subdirectory named .git is created in the directory where the command was run, and populated with a skeleton repository. (Like many open-source applications, Git stores metadata in a subdirectory named for the tool and preceded by a dot.) This local environment is now ready for tracking and storing new content. Note that this command can be run at any time in a directory that does not already have a Git environment associated with it to create one, no matter how many or what types of files are already in the directory.

The basic syntax for invoking init is git init. Before running git init, you should be at the top level of the tree you want to put under Git control. You also want to make sure this is done at an appropriate level of granularity. Recall that Git is intended to work with multiple, smaller repositories, not very large ones. So, running git init at your home directory level, for example, is not usually a good idea because this sets Git up to try to act on all files and subdirectories under your home directory for future operations, which is probably beyond the scope you intended.
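As a small illustration of the points above, the following sketch initializes a repository in a scratch directory that already contains a file. The directory name myproject and the file readme.txt are hypothetical; everything happens under a temporary directory so nothing of yours is affected.

```shell
# Create a scratch project directory with one pre-existing file.
project="$(mktemp -d)/myproject"
mkdir -p "$project"
echo "hello" > "$project/readme.txt"

# Initialize from the top level of the tree to be tracked.
cd "$project"
git init --quiet

# The skeleton repository lives in the new .git subdirectory.
ls -a

# The existing file is untouched and shows up as untracked (??).
status_line="$(git status --short)"
echo "$status_line"
```

The existing file is left alone by init; Git only starts tracking it once you add it.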

Git Clone

Whereas the init command is used when you want to create a new, empty repository and begin adding content, the clone command is used to populate a local repository from an existing remote repository. The syntax for the command is shown below.

git clone [--template=<template_directory>]
       [-l] [-s] [--no-hardlinks] [-q] [-n] [--bare] [--mirror]
       [-o <name>] [-b <name>] [-u <upload-pack>] [--reference <repository>]
       [--dissociate] [--separate-git-dir <git dir>]
       [--depth <depth>] [--[no-]single-branch]
       [--recursive | --recurse-submodules] [--] <repository>
       [<directory>]

To use the clone command, you specify a remote repository location to clone from and Git does the following:

  • Creates a local directory with the same name as the last component of the remote repository's path
  • Within that directory, creates a .git subdirectory and copies the appropriate parts of the remote repository down to that .git directory
  • Checks out the most recent version of a branch (usually the default master branch) into the local directory. This checked-out version with the flat files is what the user usually sees and works with immediately after the clone.

The basic syntax for cloning a repository is git clone <url> where <url> is a path to a remote repository. Here's an example:

$ git clone ssh://[email protected]:path-to-repo.git 

I will discuss this command more in Chapter 12.
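If you want to try the clone steps without access to a server, a path on the local disk can stand in for the remote location. The sketch below is only an approximation of a real remote: it builds a small bare repository named calc2.git (a name borrowed from the examples later in this chapter) and clones it. The identity values passed with -c are placeholders.

```shell
work="$(mktemp -d)"
cd "$work"

# Stand-in for a remote: a bare repository seeded with one commit.
git init --quiet seed
cd seed
echo "v1" > file.txt
git add file.txt
git -c user.name="Git User" -c user.email="user@example.com" \
    commit --quiet -m "first commit"
cd ..
git clone --quiet --bare seed calc2.git

# Clone it. The new directory is named after the last path component,
# with the .git suffix removed.
git clone --quiet "$work/calc2.git"

# The clone has its own .git directory plus the checked-out files.
ls -a calc2
cloned_file="$(cat calc2/file.txt)"
echo "$cloned_file"
```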

What's in the Repository

Whether the local environment is created by a git init or git clone command, the structure within the .git subdirectory is the same. Essentially, a Git repository is a content-addressable data store. That is, you put something into the repository, Git calculates a hash value (SHA1) for the content, and you can then use that value later to get something out. Figure 4.2 shows an outline of a .git repository on disk.

Figure 4.2 Tree listing of a .git directory (local repository)

The HEAD file keeps track of which branch the user is currently working with. The description file is used by GitWeb, a browser application for viewing Git repositories. The config file is the local repository's configuration file. The objects directory and its pack subdirectory are the areas where content is actually stored. You can find more information about the files and content stored in the local repository in the optional steps of Connected Lab 2.
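You can look at these files directly. The following sketch creates an empty repository in a temporary directory and prints a few of the items just described. The exact branch name recorded in HEAD depends on your Git version and settings, so treat the output as indicative only.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# HEAD is a small text file naming the current branch.
head_contents="$(cat .git/HEAD)"
echo "$head_contents"

# config is the local (repository-scope) configuration file.
cat .git/config

# objects is where content is stored once something is committed.
ls .git/objects
```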

ADVANCED TOPICS

In this section, I'll look at several topics. The first is a quick note about how the init command works. Second is a further explanation about what's in a Git repository. The third is how Git config statements map to the text of the configuration files. Finally, I'll look at a way to create even more useful aliases that can have arguments passed to them and do multiple steps.

While this information is not necessary for using Git, sometimes it's helpful to understand how Git works behind the scenes. The first two sections apply this approach to a couple of areas.

Git Init Demystified

If you're wondering how git init gets the initial content for the skeleton repository, the answer is that there's a template area containing the repository skeleton. This is installed when you install Git. If you're interested in looking at it, you can search for git-core on your filesystem in the area where you installed Git. On Windows, this is usually in a location such as C:\Program Files\Git\mingw64\share\git-core\templates (if you installed the Git for Windows package). On a Linux system, it may be in a location such as /usr/share/git-core/templates.

On some installations, you may also see a contrib folder in the same area with items such as hooks that users have contributed over time that are now included as optional pieces that can be put in place as desired. I'll talk more about setting up hooks in Chapter 15.

Running Git Init Twice on the Same Repository

Running init twice may seem counterintuitive, but there are cases where it provides value. The good news is that it does not delete or modify any content that you have added or committed into the repository, or your local configuration. It also does not overwrite files that are already in place; what it does is copy in any files from the templates area discussed previously that the repository does not yet have.

So what would be a use case to deliberately run init twice? Suppose you have multiple Git repositories on your system and you want to add a hook to all of them to provide some functionality, such as sending e-mails after a commit. You could add the hook to the templates area discussed earlier, and then run git init in each of the repositories to have the new hook put in place in each one. (If a repository already has a hook file of the same name, init leaves that file alone, so changing an existing hook means replacing the file in the repository yourself.)
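Here is a minimal sketch of that use case. Rather than modifying the installed templates area, it points init at a hypothetical template directory with the --template option shown in the syntax earlier. The hook contents and all paths are assumptions for illustration.

```shell
work="$(mktemp -d)"

# A hypothetical template area holding one hook.
mkdir -p "$work/templates/hooks"
printf '#!/bin/sh\necho "commit recorded"\n' > "$work/templates/hooks/post-commit"
chmod +x "$work/templates/hooks/post-commit"

# An existing repository with a local setting already in place.
git init --quiet "$work/repo"
cd "$work/repo"
git config --local user.name "Git User (local)"

# Run init a second time, pointing at the template area.
git init --quiet --template="$work/templates"

# The hook has been copied in, and the local setting is still there.
ls .git/hooks/post-commit
name_after="$(git config --local --get user.name)"
echo "$name_after"
```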

Looking Further into a Git Repository

As I've previously mentioned, a local Git repository is housed in the .git directory of the working directory. It is essentially a content-addressable database, meaning you supply a value (typically a SHA1) and you get content back out.
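One way to get a feel for content addressing is the hash-object plumbing command, which is not otherwise covered in this chapter. The sketch below only computes checksums (nothing is stored) and shows that identical content always yields the same value, while different content yields a different one. The sample strings are arbitrary.

```shell
# hash-object computes the checksum Git would use for some content.
first="$(printf 'some content\n' | git hash-object --stdin)"
second="$(printf 'some content\n' | git hash-object --stdin)"
other="$(printf 'different content\n' | git hash-object --stdin)"

# The first two values match; the third does not.
echo "$first"
echo "$second"
echo "$other"
```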

Figure 4.3 shows the relationship and transformation of content from the working directory into the Git repository.

A schematic diagram of the relationship and transformation of content from the working directory into the Git repository with screenshots for Working directory, Git snapshot, and Local repository file layout.

Figure 4.3 Mapping files and directories to Git repositories

Starting at the left side of the figure, files and directories first exist as normal OS items on the disk in the working directory. Git does not know anything about them. It does not track them until the user adds them to the staging area. Once they are tracked by Git, a new snapshot is created with metadata in the form of a commit record. Once committed, the pieces are stored in their respective areas in the underlying repository.

As shown in the middle section of the figure, the pieces that Git stores are defined as one of three types: blob, tree, or commit. Blobs are essentially anonymous containers for files—anonymous in the sense that they don't contain filenames. Trees can be thought of as containers for directories that point to blobs for files and contain the filenames. Commits can be thought of as the header records with meta-information that Git uses for tracking.

Internally, Git computes SHA1 checksums for each of these pieces and stores them referenced by those checksums. The checksums can be seen in the parts of the middle section and then in the tree view of the actual repository directory on the right side of the figure.

As shown in that view, Git stores these internal objects in directories that start with the first two characters of the checksum. The filename is made up of the remaining characters. The files may be changed over time when certain events trigger Git to do further compression and rearrange content to efficiently store very similar versions.

The only checksum that most Git commands are concerned with (and the only one you need to be concerned with) is the checksum that is specifically associated with the commit record, not the ones for trees or blobs.

By referencing that one checksum for the commit record, Git pulls in the underlying tree and blob content.

Once you actually have a repository with content stored in it, you can change into the repository directories to see the stored objects or use this shortcut (on Linux systems): find .git/objects -type f.

From there, you can use the cat-file plumbing command to examine objects. As an example, git cat-file -p <sha1 or branch name or reference> tells Git to figure out the type of object and neatly display its contents. A similar command, git cat-file -t <sha1 or branch name or reference>, returns the type of an object: commit, tree, or blob.
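The following sketch ties these pieces together in a scratch repository: it makes one commit, locates the commit object on disk by its checksum, and then asks cat-file about each kind of object. The file name greeting.txt and the identity values are placeholders, and the HEAD^{tree} and HEAD:<path> forms are standard Git revision syntax that this chapter has not introduced.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
echo "hello" > greeting.txt
git add greeting.txt
git -c user.name="Git User" -c user.email="user@example.com" \
    commit --quiet -m "add greeting"

# Checksum of the commit record that HEAD points to.
commit_sha="$(git rev-parse HEAD)"

# Objects sit under a directory named for the first two characters;
# the remaining characters form the filename.
dir_part="$(printf '%s' "$commit_sha" | cut -c1-2)"
file_part="$(printf '%s' "$commit_sha" | cut -c3-)"
ls ".git/objects/$dir_part/$file_part"

# List everything that was stored.
find .git/objects -type f

# Ask for the type of each kind of object, then print the blob.
commit_type="$(git cat-file -t HEAD)"
tree_type="$(git cat-file -t 'HEAD^{tree}')"
blob_type="$(git cat-file -t HEAD:greeting.txt)"
echo "$commit_type $tree_type $blob_type"
blob_contents="$(git cat-file -p HEAD:greeting.txt)"
echo "$blob_contents"
```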

Connected Lab 2 contains several optional steps that you can work through to understand what's happening in the underlying repository during an init, add, and commit sequence. It also further explains the files that are in the underlying repository tree at various points.

Mapping Config Commands to Configuration Files

In this chapter, I described the various configuration files that Git uses, as well as how to set values via the git config command. If you see something in a config file that you want to emulate or use, it can be helpful to understand how the config commands map to the file structure. This section will explain that.

Suppose you configure a two-part value such as user.name in your local configuration with a command like git config --local user.name "Git User".

This translates into setting the name value of the user section, written into the .git/config file as follows:

[user]
        name = Git User

If you need to configure a given value for a named section, you can use a three-part value such as the following:

$ git config --local remote.myremote.url http://github.com/brentlaster/calc2

$ cat .git/config
…
[remote "myremote"]
        url = http://github.com/brentlaster/calc2

Anything beyond three parts is still treated as three parts: the first piece is the section, the last piece is the setting name, and everything in between is made part of the named section.

$ git config --local remote.myremote.new.url http://github.com/brentlaster/calc2

$ cat .git/config
…
[remote "myremote.new"]
        url = http://github.com/brentlaster/calc2

Note that the git config operation also takes a --file option instead of --local, --global, or --system. This allows for writing configuration options to a file in a different location, such as for test purposes.

$ git config --file test.config remote.myremote.test http://github.com/brentlaster/calc2git

$ cat test.config
[remote "myremote"]
        test = http://github.com/brentlaster/calc2git
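To compare the two-part, three-part, and longer forms side by side, the following sketch writes one of each into a single scratch file and prints it. It reuses the names from the examples above; the temporary file location is an assumption.

```shell
cfg="$(mktemp -d)/test.config"

# Two parts: section.name
git config --file "$cfg" user.name "Git User"

# Three parts: section.named-section.name
git config --file "$cfg" remote.myremote.url http://github.com/brentlaster/calc2

# Four parts: everything between the first and last piece
# becomes the named section.
git config --file "$cfg" remote.myremote.new.url http://github.com/brentlaster/calc2

cat "$cfg"

# Count the section headers that were written.
has_user="$(grep -c '^\[user\]$' "$cfg")"
has_sub="$(grep -c '^\[remote "myremote.new"\]$' "$cfg")"
```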

As one last tip, git config includes a --get-regexp option to find configuration values matching a specified pattern. I'll use this option in the next section so you can see how it works.
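As a quick preview, here is a sketch of --get-regexp run against a scratch file. The settings and the e-mail address are made-up values; the pattern ^user\. is matched against setting names, so only the user entries are listed.

```shell
cfg="$(mktemp -d)/regexp.config"
git config --file "$cfg" user.name "Git User"
git config --file "$cfg" user.email "user@example.com"
git config --file "$cfg" core.autocrlf false

# List every setting whose name matches the pattern.
matches="$(git config --file "$cfg" --get-regexp '^user\.')"
echo "$matches"
```

Each matching line shows the full setting name followed by its value.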

Creating Parameterized Aliases

Earlier in this chapter, I showed how to create simple aliases for specific Git command lines, such as git config alias.ci commit. It is certainly useful to alias fixed command strings, but only to the extent that arguments and options included in the alias never change.

What if you want to create an alias that takes a parameter that is not normally part of a command? Or that may change over time? Or that may perform extra steps or processing—especially with system commands?

As it turns out, on Linux systems you can do this with Git fairly easily. You just need to have your alias string take this form:

"! f() { do some processing }; f"

The ! at the beginning tells Git you are going to the shell. The “f() {}; f” allows you to define a function as part of the alias and then run that function when the alias is invoked.

Values that you pass in as arguments are treated as positional parameters (for example, $1, $2, and so on). When including these parameters in the alias definition, a backslash needs to precede the $, as in “\$1”. This is to ensure the parameter is included as part of the definition and not interpreted when you are defining the alias.

Let's work through a couple of examples. First, I'll create a simple alias that takes an argument and lists out any matching global and local settings prefixed by an appropriate header for each section. The config command is used in this example. What I am doing in this command line is defining a local alias named scopelist, which does the following:

  1. Echoes out a global settings header
  2. Uses git config's --get-regexp option with a global qualifier to search for the value that is passed in
  3. Echoes out a local settings header
  4. Uses git config's --get-regexp option with a local qualifier to search for the values that are passed in

Here's the command. (Pay attention to the quotes, semicolons, double hyphens, and backslashes.)

$ git config --local alias.scopelist "! f() { echo 'global settings'; git config --global --get-regexp \$1; echo 'local settings'; git config --local --get-regexp \$1; }; f"

Here is an example of running the alias:

$ git scopelist name
global settings
user.name Git User (global)
local settings
user.name Git User (local)
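If you would like to try this alias without touching your real settings, the following sketch sets everything up in a throwaway location. It assumes a fresh shell session, because it points HOME at a temporary directory so that --global writes go to a scratch file. The user.name values mirror the sample output above.

```shell
# Use a throwaway HOME so the real global configuration is not touched.
work="$(mktemp -d)"
export HOME="$work/home"
mkdir -p "$HOME"
unset XDG_CONFIG_HOME GIT_CONFIG_GLOBAL

git init --quiet "$work/repo"
cd "$work/repo"
git config --global user.name "Git User (global)"
git config --local user.name "Git User (local)"

# Inside double quotes, \$1 stores a literal $1 in the alias.
git config --local alias.scopelist "! f() { echo 'global settings'; git config --global --get-regexp \$1; echo 'local settings'; git config --local --get-regexp \$1; }; f"

# Show what was stored, then run the alias.
git config --local --get alias.scopelist
output="$(git scopelist name)"
echo "$output"
```

Printing the stored alias first is a handy check that $1 survived the definition step intact.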

The following example will show you a simple way to dump out the contents of a particular scope into a file. This illustrates having two positional parameters. In this case, the alias will do the following:

  1. Echo out a header.
  2. Issue a git config command at the appropriate scope.
  3. Dump the values from step 2 into a separate file.

Here's the command to define this alias. (Again, pay attention to the punctuation characters that are used.)

$ git config --local alias.dumpvalues "! f() { echo 'copying config' \$1; git config --list --\$1 > \$2; }; f"

Here is an example of running this alias and looking at the results:

$ git dumpvalues global global_values.out
copying config global

$ cat global_values.out
alias.hist=log --pretty=format:"%h %ad | %s%d [%an]" --graph --date=short
push.default=simple
core.autocrlf=false
core.editor='C:/Program Files (x86)/Notepad++/notepad++.exe' -multiInst -noSession -notabbar
gitreview.remote=origin
user.name=Git User (global)
[email protected]

Obviously, these examples don't cover all the possibilities of bad or missing input. However, they'll give you an idea of how to use this functionality if you ever need it.
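As one possible direction, here is a sketch of a dumpvalues variant that rejects missing or unknown arguments before doing any work. This is my own illustrative variation, not a standard recipe. It is defined with single quotes, so the $ characters are stored as-is and no backslashes are needed. The usage text and error messages are invented for the example.

```shell
work="$(mktemp -d)"
git init --quiet "$work/repo"
cd "$work/repo"

# Check the argument count and the scope name before running config.
git config --local alias.dumpvalues '! f() { if [ $# -ne 2 ]; then echo "usage: git dumpvalues <scope> <file>" >&2; return 1; fi; case "$1" in global|local|system) ;; *) echo "unknown scope: $1" >&2; return 1 ;; esac; echo "copying config $1"; git config --list "--$1" > "$2"; }; f'

# Bad input is rejected with a non-zero exit status.
if git dumpvalues bogus "$work/out.txt" 2>/dev/null; then
    rejected=no
else
    rejected=yes
fi
echo "bad scope rejected: $rejected"

# Good input writes the file.
git dumpvalues local "$work/local_values.out"
cat "$work/local_values.out"
```

An absolute path is used for the output file because Git runs this kind of alias from the top level of the repository.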

SUMMARY

In this chapter, I discussed the form and structure of Git commands and related topics such as auto-completion. I introduced basic configuration for Git and described how to create local environments. I covered the different scopes of configuration settings you can use and how to specify values for each scope. I also covered how to create aliases to simplify interacting with Git. I then described the two different ways to create local environments with Git: initializing a new environment from existing files or cloning down an existing repository. Finally, I offered a brief description of what's inside a .git repository.

In the section on advanced topics, you took a closer look at how the init command works, the contents of a Git repository, and how to look at individual objects. Then you learned how configuration commands map to the actual configuration text files. Finally, you saw how to create advanced aliases that can run operating system commands and allow you to work with positional parameters.

In the next chapter, you'll start putting content into Git and go over the commands to start promoting it up through the levels.
