Chapter 6. Rollbacks, Reverts, Resets, and Rebasing

This is otherwise known as the “Rrrrgh!” chapter. Bad things happen to good people. Fortunately, Git can help you undo some of those past mistakes by traveling back in time. There are several commands in Git that vary in their degree of severity—making minor adjustments of a commit message all the way through to obliterating history. Mistakes are typically committed and removed from a personal repository, but the way you deal with them can impact how others interact with the code base. Ensuring you are always dealing with problems in the most polite way possible will help your team work more efficiently.

By the end of this chapter, you will be able to:

  • Amend a commit to add new work

  • Restore a file to a previous state

  • Restore your working directory to a previously committed state

  • Revert previously made changes

  • Reshape your commit history using rebase

  • Remove a file from your repository

  • Remove commits added to a branch from an incorrect merge

Throughout the chapter you will be learning techniques that feel invisible, but have huge implications. Take the time to slow down, and draw a diagram of how you want things to appear after you have run the sequence of commands. This will help you to select the right subcommand and the right parameters. It will also help you to recall information the next time you need to perform the same task again.

Those who learn best by following along with video tutorials will benefit from Collaborating with Git (O’Reilly), the companion video series for this book.

Best Practices

In this chapter you are going to be learning to manipulate the history of your repository. While the exercises in this book are easy to follow, there will come a time when you are a little under pressure and a little unpracticed and you will panic and think you’ve lost your work. Take a deep breath. It will be okay. If you’ve committed something into the repository, it will (almost) always be there if you are willing to do some digging. It’s very difficult to completely remove work from a repository in Git; it is, however, relatively easy to lose work and not be able to find it again. So before you learn how to muck about with history, let’s make sure you’ve got some good recovery tools to help you MacGyver your way out of difficult situations.

Describing Your Problem

There are a lot of ways to undo work in Git, and each method is exactly right some of the time. In order to choose the correct method, you need to know exactly what you want to change—and how it should be different after you are finished. When I was first learning version control, I would often draw a quick sketch of what I was trying to accomplish to ensure I was using the right command for the job. Figure 6-1 shows the three concepts you need to be aware of: the working directory (the files currently visible on your filesystem); the staging area (the index of changes that will be written to the repository after the next commit); and the repository (which stores files and records the changes made to the files over time).

Three circles form a Venn diagram. The repository and the working directory are on the outside. The staging area is the overlap between the two. Repository = what is stored. Working directory = what can be seen. Staging area = the difference between what is stored and what can be seen.
Figure 6-1. The working directory, staging area, and repository each contain different information about your files

The Staging Area is Not Automatically Updated

Figure 6-1 is a bit of a lie, as you need to explicitly place things into the staging area using the command add, but it’s a decent working model to start from.
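If it helps to see the three places in action, here is a minimal sketch that walks one file through the working directory, the staging area, and the repository. The scratch directory, the file name notes.txt, and the author details are all made up for illustration.

```shell
# Scratch repository so nothing real is touched (path and names are hypothetical).
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

echo "first draft" > notes.txt
git status --short      # "?? notes.txt": the file exists only in the working directory

git add notes.txt
git status --short      # "A  notes.txt": the file is now in the staging area

git commit -q -m "Add notes"
git status --short      # no output: repository, staging area, and working directory agree

echo "second draft" >> notes.txt
git status --short      # " M notes.txt": changed on disk, but not staged
```

The short status codes are a compact way to check which of the three places a change currently lives in before you pick an undo command.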

Whenever you can separate your problem into the discrete places where Git is storing its information, you have a better chance of choosing the correct command sequence to return your work to the state you want it to be in. Table 6-1 contains a series of scenarios you might encounter while working with Git.

Table 6-1. Choosing the correct undo method

| You want to… | Notes | Solution |
|---|---|---|
| Discard changes you’ve made to a file in your working directory | Changed file is not staged, or committed | checkout -- filename |
| Discard all unsaved changes in the working directory | File is staged, but not committed | reset --hard |
| Combine several commits up to, but not including, a specific commit | | reset commit |
| Remove all unsaved changes, including untracked files | Changed files are not committed | clean -fd |
| Remove all staged changes and previously committed work up to a specific commit, but do not remove new files from the working directory | | reset --hard commit |
| Remove previous work, but keep the commit history intact (“roll forward”) | Branch has been published; working directory is clean | revert commit |
| Remove a single commit from a branch’s history | Changed files are committed; working directory is clean; branch has not been published | rebase --interactive commit |
| Keep previous work, but combine it with another commit | Select the squash option | rebase --interactive commit |

Figure 6-2 shows one diagram for the first scenario. Additional answers are available on the Git for Teams website.

Venn diagram: repository (older, correct version of the file) + working directory (bad edits you want to remove) overlap to form the staging area (unused). Arrow points from the repository to the working directory with the label: copy correct version of a file from the repository to the working directory: checkout -- filename.
Figure 6-2. You want to discard changes you’ve made to a file in your working directory; the incorrect copy of the file is not staged or committed
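The first scenario can be tried out safely in a throwaway repository. This is only a sketch: the scratch directory, the file README.txt, and its contents are assumptions made for the example.

```shell
# Scratch repository (hypothetical path and file names).
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

echo "correct version" > README.txt
git add README.txt
git commit -q -m "Add README"

echo "bad edit" >> README.txt   # an unstaged change I want to throw away
git checkout -- README.txt      # copy the committed version over the working copy
cat README.txt                  # prints: correct version
```

Versions of Git from 2.23 onward also offer the command restore for this job (git restore README.txt); the checkout form shown here works in both older and newer versions.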

As you can see in the examples outlined in Table 6-1, some commands have two different outcomes depending on the parameters used. Figure 6-3 contains a flowchart of the scenarios you may find yourself in. Redraw this chart digitally, or on paper. The act of re-creating the chart will reinforce the options you will be forced to deal with in Git, and it will give you a personal reference point, which is often easier to remember than a page in a book.

You may have your own types of changes you need to recover from as well. Create a list of all the problem scenarios you may want to recover from. The better you are able to describe what’s wrong, the more likely you are to find the correct solution. As you work through this chapter, you may choose to expand on the flowchart in Figure 6-3 or create your own diagrams. Please share your work on Twitter by using #gitforteams. I’d love to see what you come up with!

Using Branches for Experimental Work

On a tree, a branch is independent from its sibling branches. Although they may have a common ancestor, you can (typically) saw a branch off a tree without impacting the other branches. In Git, the commits you add to your repository are connected to one or more branches. If you check out a different branch and manipulate the commit objects in that new branch, they are assigned a new identifier, leaving the original commit objects tied to the original branch unchanged. This means it is always safer to do your work in a new private branch, and when you are happy with the results, merge your branch back into the main branch (Figure 6-4).

A decision flow chart outlining the circumstances for undoing history in Git. Decision points include: Is the change committed? Is the change staged? Are there changes you want to preserve in the working directory? Do you want to keep a reference to your change? Is the change in a shared branch? Is the change in the most recent commit(s)?
Figure 6-3. Create a flowchart to help you select the appropriate command
Two pane illustration: First pane: A tree diagram which shows diverging branches. Second pane: commits are copied from the first branch to the second.
Figure 6-4. Working in a branch protects you from unintended changes; merge your work back into the main branch only when it is correct and complete

Previously we’ve created and deleted branches using the ticket as a starting point. But what if you were working on a ticket, and you weren’t sure which of two approaches you should take? In this case you could create a branch off of your ticket branch, make your experimental changes (Example 6-1), and then merge your experimental branch into your ticket branch (Example 6-2) if you want to save the changes.

Example 6-1. Use an experimental branch to test changes
$ git checkout -b experimental_idea
  (do work)
$ git add --all
$ git commit

You may have one or more commits in your experimental branch. When you merge the two branches, you can optionally combine all of those commits into a single one at the time of the merge with the parameter --squash. If you use this parameter, you will still need to run the command commit separately to save the changes from the other branch. By merging the branch in this way, you will be unable to unmerge the branch later. As such, it’s appropriate to use --squash only when merging branches you wish had never been separate to begin with.

Example 6-2. Merge your experimental branch back into the main branch
$ git checkout master
$ git merge experimental_idea --squash

Squash commit -- not updating HEAD
Automatic merge went well; stopped before committing as requested

$ git commit

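After merging your experimental branch, you can delete it (Example 6-3). If you merged with the parameter --squash, Git has no record that the branch was merged, so --delete will refuse to remove it and you will need to use the parameter -D instead.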

Example 6-3. Delete your experimental branch
$ git branch --delete experimental_idea

If you want to discard your experimental ideas, complete the preceding steps, but omit the step where you merge your work into your main ticket branch. To delete an unmerged branch, you will need to use the parameter -D instead of --delete.
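Here is a rough sketch of the whole experimental-branch cycle in a scratch repository, including what happens when you try to delete a branch that Git does not consider merged. The directory, file name, and commit messages are invented for the example.

```shell
# Scratch repository (hypothetical path and file names).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch master regardless of local defaults
git config user.name "Example Author"
git config user.email "author@example.com"

echo "base" > file.txt
git add file.txt
git commit -q -m "Base"

git checkout -q -b experimental_idea
echo "idea" >> file.txt
git commit -q -am "Try an idea"

git checkout -q master
git merge -q --squash experimental_idea   # stages the combined changes, but does not commit
git commit -q -m "Add the idea as a single commit"

# After a squash merge the branch tip is not part of master's history,
# so the safe delete refuses and the forced delete is needed.
git branch --delete experimental_idea || echo "Git refused: branch is not fully merged"
git branch -D experimental_idea
```

The same -D step applies when you skip the merge entirely because the experiment did not work out.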

Subsequent sections of this chapter cover removing commits you made in a branch before you realized they were just experiments.

Rebasing Step by Step

Out of the three commands rebase, reset, and revert, rebase is the only command that is not exclusively focused on undoing work. Generally when we talk about rebasing, we are referring to the process of bringing a branch up to date with commits that have been made on its parent branch. This is typically a very straightforward process: from the branch you want to update, you run the command rebase along with the name of the parent branch. Git removes your commits from the child branch you have been working on, adds the new commits that were made on the parent branch to the tip of your branch, and then adds an updated copy of your commits to your branch. This makes it seem as though your commits were added after the new changes from the parent branch. It’s the Git equivalent to whistling innocently and pretending nothing happened when actually it has snuck a vase with flowers onto the table while you weren’t looking.

Although we often talk about rebasing as “replaying your history,” rebasing is perhaps more correctly defined as traveling back in time and then attempting to re-enact history. If you have seen Back to the Future (or a modern time travel equivalent) you know that history is never quite the same the second time around. This is the case with rebase as well. Although it appears as though the commits are simply dropped back onto a new branch tip, they are actually completely new commits with their own reference ID. As these new commits are applied to the time line, problems can arise if the new history conflicts with the work you are trying to apply. This will result in errors about being in a detached HEAD state. Mind blown? Here is another way to think of it: Git allows us to retell history, inserting new facts as it pleases us. It does not, however, actually allow us to change anything that has happened in the past. What’s done is done; all we can do is change the stories we tell about it.
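To see for yourself that a rebased commit is a new commit, you can compare its ID before and after. The following is a minimal sketch in a scratch repository; the branch name feature matches the example used in this section, while the directory, file names, and commit messages are made up.

```shell
# Scratch repository (hypothetical path and file names).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch master regardless of local defaults
git config user.name "Example Author"
git config user.email "author@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base"

git checkout -q -b feature
echo "feature work" > feature.txt
git add feature.txt
git commit -q -m "Feature work"
before=$(git rev-parse HEAD)     # ID of my commit before the rebase

git checkout -q master
echo "parent work" > parent.txt
git add parent.txt
git commit -q -m "Parent work"

git checkout -q feature
git rebase -q master
after=$(git rev-parse HEAD)      # ID of the re-enacted commit

echo "before: $before"
echo "after:  $after"
git log -1 --format=%s           # the message is unchanged: Feature work
```

The commit message and the changes are the same, but the two IDs differ because the rebased commit has a different parent.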

Most of the time, bringing a branch up-to-date with the command rebase is virtually instant and happens automatically. If, however, during the rebasing process there are conflicting changes in the work you have done and the work that you are trying to sneak onto the parent branch, the process will stop and Git will ask you to resolve the conflicts by hand before it proceeds. The conflicts can be in-file changes or deleted files (where one branch deletes a file that the other has edited). Git is, after all, just a simple content tracker. A mediated conflict resolution by you, the expert, always results in a better end product. Even if you would rather that Git just figured it out, it is good that it stops and asks for help. Think of it as a valuable life lesson: asking for help is okay.

The second cause of frustration is when rebase is used to force updates into a public branch. In this case a timeline will end up with the same code represented by two (or more) commit objects with distinct IDs. To help you choose whether you should be rebasing, or merging, please use the rebase or merge decision tree.

The remainder of this section describes the process of dealing with mid-rebase conflicts when bringing a branch up-to-date. In our example, the parent (or source) branch is named master and the branch we are attempting to bring up-to-date (the child branch) is named feature.

Begin Rebasing

Ensure your local copy of the parent branch is up to date with the most recent commits available from the main project repository:

$ git checkout master
$ git pull --rebase=preserve remote_nickname master

If It Helps, Be Explicit

When updating a local copy of a branch with the command pull, the parameters for the name of the remote and the name of the remote branch are typically optional. Occasionally, if I have more than one remote for a given repository, Git seems to miss that there are updates available. Adding the two additional parameters seems to help.

Change into the branch that is currently out of date from the main project, but which contains new work that hasn’t been introduced yet:

$ git checkout feature

Begin the rebasing process:

$ git rebase master

If there are no conflicts, Git will skip merrily through the process and spit you out the other end with no additional action required from you. See? Rebasing is easy! You should try it! However, sometimes there are conflicts…

Mid-Rebase Conflict from a Deleted File

A conflict in the rebasing process occurs when the changes you have made occur on the same line as the changes which are stored in one of the new commits on the parent branch. As a simple content tracker, Git doesn’t feel qualified to know whether our changes should be kept, or theirs. Instead of making guesses, Git stops and asks for your help. I think that’s actually quite considerate that Git perceives me to be more of an expert on the content than it is! Unfortunately the process isn’t called “asking you, the expert, for help”; it’s called “resolving conflict while in a detached HEAD state.” This is very scary language for a process that is actually quite respectful.

To resolve a conflict you will need to put on your content expert hat, and help Git make some decisions about what to do next.

This section covers an example of a mid-rebase conflict. The file ch10.asciidoc has been deleted in the source branch, master, but I’ve been making updates to it in feature. This is a problem Git doesn’t know how to resolve. Do I want to keep the file? Should it be deleted? Git has put me into a detached HEAD state so that I can explain to Git how I want to proceed:

First, rewinding head to replay your work on top of it...
Applying: CH10: Stub file added with notes copied from video recording lessons.
Using index info to reconstruct a base tree...
A	ch10.asciidoc
Falling back to patching base and 3-way merge...
CONFLICT (modify/delete): ch10.asciidoc deleted in HEAD and modified in CH10:
Stub file added with notes copied from video recording lessons.. Version CH10:
Stub file added with notes copied from video recording lessons. of ch10.asciidoc
left in tree.
Failed to merge in the changes.
Patch failed at 0001 CH10: Stub file added with notes copied from video
recording lessons.
The copy of the patch that failed is found in:
   /Users/emmajane/Git/1234000002182/.git/rebase-apply/patch

When you have resolved this problem, run "git rebase --continue".
If you prefer to skip this patch, run "git rebase --skip" instead.
To check out the original branch and stop rebasing, run "git rebase --abort".

The relevant piece of information from this output is:

When you have resolved this problem, run "git rebase --continue".

This tells me that I need to:

  1. Resolve the merge conflict.

  2. Once I think the merge conflict is resolved, run the command:

    git rebase --continue

I accomplish step 1 by opening the file in question in my designated file comparison tool:

$ git mergetool ch10.asciidoc

There are no merge conflict markers displayed in the file, so I quit the merge tool and proceed to the next step Git had identified:

$ git rebase --continue

The following message is returned from Git:

ch10.asciidoc: needs merge
You must edit all merge conflicts and then
mark them as resolved using git add

That’s not very helpful! I just looked at that file and there were no merge conflicts. I’ll ask Git what the problem is using the command status:

$ git status

The output from Git is as follows:

rebase in progress; onto 6ef4edb
You are currently rebasing branch 'ch10' on '6ef4edb'.
  (fix conflicts and then run "git rebase --continue")
  (use "git rebase --skip" to skip this patch)
  (use "git rebase --abort" to check out the original branch)

Unmerged paths:
  (use "git reset HEAD <file>..." to unstage)
  (use "git add/rm <file>..." as appropriate to mark resolution)

	deleted by us:   ch10.asciidoc

no changes added to commit (use "git add" and/or "git commit -a")

Aha! There are two clues here for me: the text Unmerged paths, and a little later the text deleted by us: ch10.asciidoc. I know I don’t want the file to be deleted; therefore I need to unstage Git’s change. Unstaging a change is effectively saying to Git, “That thing you were planning to do? Don’t do it. In fact, forget you were even thinking about doing anything with that file. Reset your HEAD, Git.”

Git tells me how to prevent this change from happening with the following text:

(use "git reset HEAD <file>..." to unstage)

Using this message as a guide, I run the following command:

$ git reset HEAD ch10.asciidoc

Now, what this command is actually doing is clearing the conflicted entry for this file out of the staging area and resetting it to match HEAD, the most recent known commit. Because I am knee-deep in a rebase, and in a detached HEAD state as opposed to in a branch, HEAD is the most recent state from the rebasing process; the copy of the file in my working directory is left untouched. In my case, this leaves me with the older version of the file, which is fine. As I proceed through the rebase, I will replace the contents of the file with the latest version from the branch feature. If I wanted to preserve their deletion of the file, I would skip this step and instead mark the file as removed with the command rm, as the hint git add/rm in the status output suggests.

With my chapter file replaced, let’s see what clues Git is giving me on how I should proceed:

$ git status

The output from Git is as follows:

rebase in progress; onto 6ef4edb
You are currently rebasing branch 'ch10' on '6ef4edb'.
  (all conflicts fixed: run "git rebase --continue")

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	ch10.asciidoc

nothing added to commit but untracked files present (use "git add" to track)

So I’ve still got the file (great!), but Git is still confused about what to do, because as far as it’s concerned, that file should have been deleted. I need to explicitly add the file back into the repository, which Git tells me to do by giving me the message:

Untracked files: (use "git add <file>..." to include in what will be
committed) ch10.asciidoc

The formatting is awkward if there is only one affected file but in the case of a longer list of files, the formatting is lovely.

Per Git’s request, I will now add the file ch10.asciidoc to the staging area:

$ git add ch10.asciidoc

Now at this point, I know that the command add is just the beginning of a process, and that I’m going to need to commit the file as well, but this is rebasing and the rules are different. I’m going to ask Git what to do next by checking the output of the command status again:

$ git status

The output from Git is as follows:

rebase in progress; onto 6ef4edb
You are currently rebasing branch 'ch10' on '6ef4edb'.
  (all conflicts fixed: run "git rebase --continue")

Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	new file:   ch10.asciidoc

Okay, it’s saying there are changes to be committed (yup, already knew that), but it doesn’t tell me to commit them! Instead it tells me to continue with the rebasing with the message:

all conflicts fixed: run "git rebase --continue"

I proceed with this command even though add is normally paired with commit to save changes:

$ git rebase --continue
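If you would like to rehearse this kind of conflict before it happens for real, the following sketch reproduces a deleted-file conflict in a scratch repository and resolves it by keeping the file. It takes a shortcut compared to the walkthrough above by going straight to the command add. The directory, file contents, and commit messages are assumptions, and the exact wording of Git’s messages will vary between versions.

```shell
# Scratch repository (hypothetical path, contents, and commit messages).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch master regardless of local defaults
git config user.name "Example Author"
git config user.email "author@example.com"

echo "stub" > ch10.asciidoc
git add ch10.asciidoc
git commit -q -m "Add chapter stub"

git checkout -q -b feature
echo "notes from the video lessons" >> ch10.asciidoc
git commit -q -am "CH10: add notes"

git checkout -q master
git rm -q ch10.asciidoc
git commit -q -m "Remove chapter stub"

git checkout -q feature
git rebase master || true        # stops with CONFLICT (modify/delete)
git status --short               # "DU ch10.asciidoc" is the short form of "deleted by us"

git add ch10.asciidoc            # keep my version of the file
GIT_EDITOR=true git rebase --continue   # GIT_EDITOR=true accepts the commit message without opening an editor
```

To accept the deletion instead, you would replace the add step with git rm ch10.asciidoc before continuing.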

Mid-Rebase Conflict from a Single File Merge Conflict

After restarting the rebasing process, Git has run into another conflict as it replays the commits. The output is as follows:

Applying: CH10: Stub file added with notes copied from video recording lessons.
Applying: TOC: Adding Chapter 10 to the book build.
Using index info to reconstruct a base tree...
M	book.asciidoc
Falling back to patching base and 3-way merge...
Auto-merging book.asciidoc
CONFLICT (content): Merge conflict in book.asciidoc
Recorded preimage for 'book.asciidoc'
Failed to merge in the changes.
Patch failed at 0002 TOC: Adding Chapter 10 to the book build.
The copy of the patch that failed is found in:
   /Users/emmajane/Git/1234000002182/.git/rebase-apply/patch

When you have resolved this problem, run "git rebase --continue".
If you prefer to skip this patch, run "git rebase --skip" instead.
To check out the original branch and stop rebasing, run "git rebase --abort".

Another conflict. You’re being high maintenance, Git! No wonder people complain about rebasing! Okay, okay, at least it’s a different file this time (CONFLICT (content): Merge conflict in book.asciidoc). I take a closer look at the output of the command status again to see if Git gives me additional clues:

$ git status

The output from Git is as follows:

rebase in progress; onto 6ef4edb
You are currently rebasing branch 'ch10' on '6ef4edb'.
  (fix conflicts and then run "git rebase --continue")
  (use "git rebase --skip" to skip this patch)
  (use "git rebase --abort" to check out the original branch)

Unmerged paths:
  (use "git reset HEAD <file>..." to unstage)
  (use "git add <file>..." to mark resolution)

	both modified:   book.asciidoc

no changes added to commit (use "git add" and/or "git commit -a")

Long sigh. Alright, Git. Let’s see what the conflict is in this file:

$ git mergetool book.asciidoc

Opening up the file in my favorite merge tool, I see there is indeed a merge conflict in this file. The conflict is displayed as three columns: one column for each of the two branches being merged, and one column displaying how the merge conflict should be resolved. I choose the hunk of text I want to keep, which resolves the conflict. I save the file, close the merge tool, and ask Git if it’s happy by using the command status, again:

$ git status

The output from Git is as follows:

rebase in progress; onto 6ef4edb
You are currently rebasing branch 'ch10' on '6ef4edb'.
  (fix conflicts and then run "git rebase --continue")
  (use "git rebase --skip" to skip this patch)
  (use "git rebase --abort" to check out the original branch)

Unmerged paths:
  (use "git reset HEAD <file>..." to unstage)
  (use "git add <file>..." to mark resolution)

	both modified:   book.asciidoc

no changes added to commit (use "git add" and/or "git commit -a")

The message is a little misleading because I have fixed the conflicts. At this point, I open the file to double check. Nope, no conflicts there. So now I move on to the next group of instructions: unmerged paths: use "git add <file> …" to mark resolution and then both modified: book.asciidoc:

$ git add book.asciidoc

And check the status again:

$ git status

The output from Git is as follows:

rebase in progress; onto 6ef4edb
You are currently rebasing branch 'ch10' on '6ef4edb'.
  (all conflicts fixed: run "git rebase --continue")

Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	modified:   book.asciidoc

As before, I don’t pair the command add with the command commit. Instead, Git instructs me as follows: all conflicts fixed: run "git rebase --continue", so I proceed with the rebasing process:

$ git rebase --continue

The output from Git is as follows:

Applying: TOC: Adding Chapter 10 to the book build.
Recorded resolution for 'book.asciidoc'.
Applying: CH10: Outline of GitHub topics

The rebasing procedure has been completed. My copy of the branch feature is now up to date with all changes that had been previously committed to the branch master.

There are a few different ways that rebasing can kick up a conflict. Take your time, read the instructions carefully, and if you aren’t getting useful information, try using the command status to see if there’s something more helpful that Git can offer. If you are really in a panic about what’s happening, you can always abort the process with the command git rebase --abort. This will return you to the state your branch was in right before you started the rebase.
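The escape hatch is worth practicing once while nothing is at stake. This sketch forces a conflict in a scratch repository, aborts the rebase, and then checks that the branch is back where it started. The directory, the file book.asciidoc, and its contents are invented for the example.

```shell
# Scratch repository (hypothetical path and file contents).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch master regardless of local defaults
git config user.name "Example Author"
git config user.email "author@example.com"

echo "table of contents" > book.asciidoc
git add book.asciidoc
git commit -q -m "Start the book"

git checkout -q -b feature
echo "table of contents, with chapter 10" > book.asciidoc
git commit -q -am "TOC: add chapter 10"

git checkout -q master
echo "table of contents, reorganized" > book.asciidoc
git commit -q -am "TOC: reorganize"

git checkout -q feature
start=$(git rev-parse HEAD)      # where the branch is before rebasing
git rebase master || true        # stops with a content conflict in book.asciidoc
git rebase --abort               # give up and go back

[ "$(git rev-parse HEAD)" = "$start" ] && echo "Back where I started"
```

After the abort, the branch, the staging area, and the working directory all match the state from before the rebase began.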

An Overview of Locating Lost Work

It is very difficult to completely remove committed work in Git. It is, however, pretty easy to misplace your work with the same frequency that I misplace my keys, my glasses, my wallet, and my family’s patience. If you think you have lost some work, the first thing you will need to do is locate the commit where the work was stored. The command log displays commits that have been made to a particular branch; the command reflog lists a history of everything that has happened in your local copy of the repository. This means that if you are working with a repository you cloned from a remote server, the reflog history will begin at the point where you cloned the repository to your local environment—whereas the log history will display all of the commit messages since the command init was used to create the repository.

If you haven’t already, get a copy of the project repository for this book, and compare the output of the two commands reflog and log (Example 6-4).

Example 6-4. Compare the output of log and reflog
$ git clone https://gitlab.com/gitforteams/gitforteams.git

Cloning into 'gitforteams'...
remote: Counting objects: 1084, done.
remote: Total 1084 (delta 0), reused 0 (delta 0)
Receiving objects: 100% (1084/1084), 12.07 MiB | 813.00 KiB/s, done.
Resolving deltas: 100% (628/628), done.
Checking connectivity... done.

$ git log --oneline

e8d6aff Updating diagram: Adding commit ID reference to rebase.
ae56a1f Adding workflow diagram for: reset, revert, rebase, checkout.
2480520 Merge pull request #5 from xrmxrm/1-markdown_fixes
ee46470 Fix some markdown Issue #1

$ git reflog

2f17715 HEAD@{1}: clone: from https://gitlab.com/gitforteams/gitforteams.git

If the only thing you have done is clone the repository, you will only see one line of history in the reflog. As you do more things, the reflog will start to grow. Following is a sample of the output from this book’s repository:

fdd19dc HEAD@{157}: merge drafts: Fast-forward
af9e2c8 HEAD@{158}: checkout: moving from drafts to master
fdd19dc HEAD@{159}: merge ch04: Merge made by the 'recursive' strategy.
af9e2c8 HEAD@{160}: checkout: moving from ch04 to drafts
e296faa HEAD@{161}: commit (amend): CH04: first draft complete
dd87941 HEAD@{162}: commit: CH04: first draft complete

This is a private history. Only you can see it, thank goodness! It will contain everything that you have done including things that have no impact on the code, such as checking out a branch.

Both of the commands log and reflog show you the commit ID for a particular state that is stored in the repository. So long as you can find this commit ID, you can check it out (Example 6-5), temporarily restoring the state of the code base at that point in time.

Example 6-5. Check out a specific commit in your repository
$ git checkout commit
Checking out files: 100% (2979/2979), done.
Note: checking out 'a94b4c4'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b new_branch_name

HEAD is now at a94b4c4... Fixing broken URL to the slides from the main README file.
Was missing the end round bracket.

When you check out a commit, you will be detaching from the connected history for a particular branch. It’s not really as scary as it sounds, though. Normally when we work in Git we are working in a linear representation of history. When we check out a single commit, we are working in a suspended state (Figure 6-5).

A single commit 'leaf' is magnified away from the tree. The in-diagram caption reads 'Examining a commit in isolation from its branch'.
Figure 6-5. In a detached HEAD state, you are temporarily disconnected from the linear history of a branch

This is typically where people start to freak out a bit—understandably—your HEAD is DETACHED! Following the instructions Git provides will set you right. If you want to save the state you are in, check out a new branch and your state will be recorded in that new branch:

$ git checkout -b restoring_old_commit
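As a rough rehearsal of the whole recovery, the sketch below deliberately misplaces a commit, finds it again through the reflog, and saves it in a new branch. The directory, file names, and commit messages are assumptions; the branch name restoring_old_commit is the one used above.

```shell
# Scratch repository (hypothetical path and file names).
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

echo "keep" > keep.txt
git add keep.txt
git commit -q -m "Work I want to keep"

echo "oops" > lost.txt
git add lost.txt
git commit -q -m "Work I am about to lose"

git reset -q --hard HEAD~1       # the second commit disappears from the branch
git log --oneline                # only one commit is listed
git reflog                       # the reflog still lists the misplaced commit

lost=$(git rev-parse 'HEAD@{1}') # where HEAD was one step ago, according to the reflog
git checkout -q "$lost"          # detached HEAD at the misplaced commit
git checkout -q -b restoring_old_commit
git log --oneline                # both commits are back, on the new branch
```

In a real repository you would read the commit ID from the reflog output rather than assuming it is exactly one step back.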

At this point you can continue to add a few fix-ups in the new branch if there’s anything missing you want to add (or old work that is no longer relevant and that you want to remove). Once you are finished, you will need to decide how you want to incorporate the new branch back into your working branch. You could choose to merge the new branch into an existing branch, or just cherry-pick a few commit(s) that you want to keep. Let’s start with a merge, because this is something you should already be familiar with from Chapter 5:

$ git checkout working_branch
$ git merge restoring_old_commit

With the merge complete, you should now tidy up your local repository by deleting the temporary branch:

$ git branch --delete restoring_old_commit

If you have published the temporary branch and wish to delete it from the remote repository, you will need to do that explicitly:

$ git push --delete remote_nickname restoring_old_commit

This method has the potential to make an absolute mess of things if the temporary branch contains a lot of unrelated work. In this case, it may be more appropriate to use the command cherry-pick (Example 6-6). It can be used in a number of different ways—check the documentation for this command with git help cherry-pick. I tend to use the commit ID that I want to copy into my current branch. The optional parameter -x appends a line to the commit message letting you know this commit was cherry-picked from somewhere else, as opposed to having been originally created on this branch at this point in history. This addition makes it easier to identify the commit later.

Example 6-6. Copying commits onto a new branch with cherry-pick
$ git cherry-pick -x commit

Assuming the commit was cleanly applied to your current branch, you will see a message such as the following:

[master 6b60f9c] Adding office hours reminder.
 Date: Tue Jul 22 08:36:54 2014 -0700
 1 file changed, 2 insertions(+)
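To see what the parameter -x adds to the commit message, you can try a cherry-pick in a scratch repository. This is only a sketch; the directory, the branch name side, and the file hours.txt are made up for the example.

```shell
# Scratch repository (hypothetical path, branch, and file names).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch master regardless of local defaults
git config user.name "Example Author"
git config user.email "author@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base"

git checkout -q -b side
echo "Office hours: Tuesday" > hours.txt
git add hours.txt
git commit -q -m "Adding office hours reminder."
picked=$(git rev-parse HEAD)     # the commit I want to copy

git checkout -q master
git cherry-pick -x "$picked"
git log -1 --format=%B           # the message ends with "(cherry picked from commit ...)"
```

The extra line records the ID of the original commit, which is what makes the copy easy to trace later.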

If things don’t go well, you may need to resolve a merge conflict. The output for that would be as follows:

error: could not apply 9d7fbf3... Lesson 9: Removing lesson stubs from
subsequent lessons.
hint: after resolving the conflicts, mark the corrected paths
hint: with 'git add <paths>' or 'git rm <paths>'
hint: and commit the result with 'git commit'

Merge conflicts are covered in more detail in Chapter 7. Skip ahead to that chapter if you encounter a conflict while cherry-picking a commit.

Another output you may encounter is when the commit you want to incorporate is actually a merge commit. You will need to select the parent branch in this case. You can recognize this case by the following output from Git when you attempt to cherry-pick a commit:

error: Commit 0075f7eda6 is a merge but no -m option was given.
fatal: cherry-pick failed

Confirm that the parent branch you want to keep is the first lane in the graphed output of your log (counting from left to right):

$ git log --oneline --graph

Then, run the command cherry-pick again, this time identifying the parent branch to keep with the parameter --mainline:

$ git cherry-pick -x commit --mainline 1
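
The merge-commit case can be reproduced in a scratch repository. This sketch (branch and file names are hypothetical) creates a true merge, shows that cherry-pick refuses it without a parent number, and then applies it with --mainline 1.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/trunk
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base"

# A side branch with one commit, merged back with a true merge commit.
git checkout -q -b topic
echo "topic work" > topic.txt
git add topic.txt
git commit -q -m "Topic work"
git checkout -q trunk
echo "trunk work" > trunk.txt
git add trunk.txt
git commit -q -m "Trunk work"
git merge -q --no-ff --no-edit topic
merge_commit=$(git rev-parse HEAD)

# A third branch, started before the merge, that never saw the topic work.
git checkout -q -b elsewhere HEAD~1

# Without --mainline, Git cannot tell which parent to compare against.
if git cherry-pick -x "$merge_commit" 2> /dev/null; then
  refused=no
else
  refused=yes
fi

# With --mainline 1, the change relative to the first parent is applied.
git cherry-pick -x --mainline 1 "$merge_commit" > /dev/null
git log -1 --format=%B
```

Here parent 1 is the trunk side of the merge, so the cherry-picked change is everything the topic branch contributed.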

Finally, if you decide you don’t want to keep the recovered work, you can obliterate the changes:

$ git reset --merge ORIG_HEAD

Published History Should Not Be Altered

The command reset should not be used on a shared branch to remove commits that have already been published. Undoing changes on shared branches is covered later in this chapter.

If you have worked on each of the examples in this section, you should now be able to check out a single commit, create a new branch to recover from a detached HEAD state, merge changes from one branch into another, cherry-pick commits into a branch, and delete local branches.

Restoring Files

You are working along and you just deleted the wrong file. You actually wanted to keep the file. Or perhaps you edited a file that shouldn’t have been edited. Before the changes are locked into place (or committed), you can check out the files. This will restore the contents of the file to what was stored in the last known commit for the branch you are on:

$ rm README.md
$ git status

On branch master
Changes not staged for commit:
  (use "git add/rm <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	deleted:    README.md

no changes added to commit (use "git add" and/or "git commit -a")

The status message explains how to reverse the changes and recover your deleted file:

$ git checkout -- README.md

If you have already staged the file, you will need to unstage it with the command reset before you can restore it. To try this, first delete a file, then use the command add to add the change to the staging area, and finally use the command status to verify your next action:

$ rm README.md
$ git add README.md
$ git status

On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	deleted:    README.md

At this point, the command you used previously, checkout, will not work. Instead, follow the instructions Git provides to unstage the file you want to restore. Instead of selecting a specific commit, use the Git short form HEAD, which refers to the most recent commit on the current branch:

$ git reset HEAD README.md

Once the file is unstaged, you can use the command checkout as you did previously to restore the deleted file:

$ git checkout -- README.md

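If you prefer, you can combine these two commands into one by checking out the file directly from HEAD, which updates both the staging area and the working directory (the command reset --hard does not accept individual file paths):

$ git checkout HEAD -- README.md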

If you want to undo all of the changes in your working directory, restoring the version of the files that was saved in the previous commit, you don’t need to make the changes one at a time. You can do it in bulk:

$ git reset --hard HEAD

You should now be able to restore a deleted file in the working directory.
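
The whole restore sequence can be rehearsed safely in a scratch repository. This is a minimal sketch, assuming a temporary directory and a one-line README.md invented for the purpose; it restores a staged deletion first in two steps and then in one.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "# Project" > README.md
git add README.md
git commit -q -m "Adding README"

# Delete the file and stage the deletion.
rm README.md
git add --all

# Two steps: unstage the deletion, then restore from the last commit.
git reset -q HEAD README.md
git checkout -- README.md
restored_two_step=$(cat README.md)

# One step: restore both the staging area and the working copy from HEAD.
rm README.md
git add --all
git checkout HEAD -- README.md
restored_one_step=$(cat README.md)

git status --short
```

If you are running Git 2.23 or newer, the hints printed by status will suggest git restore --staged and git restore instead; the reset and checkout forms shown here continue to work.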

Working with Commits

A commit is a snapshot within your repository that contains the state of all of the files at that point in time. Each of these commits can be manipulated within your history. You can remove the commit entirely with the command reset, you can reverse the effects of a commit (but maintain it in your history) with the command revert, and you can change the order of the commits with the command rebase. Altering the history of your repository is a big no-no if you’ve already published the commits. Even the slightest change will result in a new commit SHA being stored in the repository, even if the code itself is exactly the same at the tip of the branch, and Git assumes that all new commit IDs contain new information that must be incorporated, regardless of the contents of the files stored in those commits.

In this section, it is assumed you are working with commits that have not been shared with others yet (i.e., you haven’t pushed your branch). Tips for working on changing history for shared branches are covered separately.

Amending Commits

If you realize a commit you’ve just made is missing one little fix, you can amend the commit to include additional files, or update the message that was used for the commit. I use this command frequently to convert my terse one-line commit messages into well-formed summaries of the work I’ve completed.

Do Not Change Shared History

If you have already pushed the work, it is considered bad form to go back and “fix” shared history.

If you have made any changes to the files in your working directory, you will need to add the files to the staging area before amending your commit (Example 6-7). If you are just updating the commit message, and there are no new or modified files, you can omit the command add, and jump straight to the command commit.

Example 6-7. Updating the previous commit with --amend
$ git add --all
$ git commit --amend

Your new changes will be added to the previous commit, and a new ID will be assigned to the revised commit object.
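
To see that an amended commit really is a new object, you can compare the ID before and after. The sketch below uses a scratch repository with made-up file names, and passes the new message with -m so that no editor opens.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "wip"
before=$(git rev-parse HEAD)

# Stage a forgotten file, then fold it into the previous commit
# and replace the terse message at the same time.
echo "forgotten" > extra.txt
git add --all
git commit -q --amend -m "Adding notes and the extra file"
after=$(git rev-parse HEAD)

git log --oneline
count=$(git rev-list --count HEAD)
echo "before: $before"
echo "after:  $after"
```

The log still shows a single commit, but its ID has changed, which is exactly why amending is reserved for work you have not pushed yet.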

Even More Commit Options Are Available

There are even more ways to construct your commit object. I’ve outlined the options I use most frequently. You may find additional gems by reading the relevant manual page for commit. This information is accessible by running the command: git help commit.

If you want to amend more than just the previous commit, though, you will need to use either reset or rebase.

Combining Commits with Reset

The command reset appears in many different forms during the undo process. In this example, we will use it to mimic the effects of squash in rebasing. The most basic explanation of what reset does is essentially a pointing game. Wherever you point your finger is what Git is going to treat as the current HEAD (or tip) of your branch.

Reset Alters Your Logged History

This is going to alter history because it removes references to commits. If someone were to merge their old copy of the branch, they would reintroduce the commits you had tried to remove. As a result, it’s best to only use reset to alter the history of branches that are not shared with others (this means you created the branch locally, and you haven’t pushed it to the server yet).

Previously you used the command reset to unstage work before making a commit. This time you are using reset to remove commit objects from your branch’s history. Think of a string of beads. Let’s say the string is 20 beads long. Holding the fourth bead, allow the first three beads to slide off the string. You now have a shorter string of beads as well as three loose beads. The parameters you use when issuing the reset command are part of what determines the fate of those beads.

If you want to discard the content contained in the commit objects you removed, you need to use reset with the mode hard. This mode is enabled by using the parameter --hard. When you use the mode hard, the commit objects will be removed, and the working directory will be updated so that all content stored in those commit objects is also removed. If you do not use --hard when you reset your work, Git keeps the content of the working directory the same, but throws away the commit objects back to the reference point. It will be as if you typed all of the changes from the previous commits into one giant piece of work. It’s now waiting to be added and committed.
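
The difference between the two modes is easiest to see side by side. In this sketch (a scratch repository with a hypothetical file.txt that gains one line per commit), the same commit is removed twice: once with the default mode and once with --hard.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "one" > file.txt
git add file.txt
git commit -q -m "First"
echo "two" >> file.txt
git commit -q -am "Second"
echo "three" >> file.txt
git commit -q -am "Third"

# Default reset: the commit leaves the branch, the edit stays on disk.
git reset -q HEAD~1
lines_after_default=$(grep -c "" file.txt)

# Put the commit back so the comparison starts from the same place.
git commit -q -am "Third, again"

# --hard: the commit leaves the branch and the edit is discarded too.
git reset -q --hard HEAD~1
lines_after_hard=$(grep -c "" file.txt)

echo "default: $lines_after_default lines, hard: $lines_after_hard lines"
```

After the default reset the third line is still in the file, waiting to be added and committed; after --hard it is gone from the working directory.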

Reset Reestablishes the Tip of a Branch

Somewhere along the way, I got it stuck in my head that reset ought to reverse the action applied in a given commit. This definition is correct for the command revert, but not reset. The command reset resets the tip of the branch to the specified commit. Perhaps if it were named “restore” or “promote” or even just “set” my brain would have made a better separation between the two commands. Remember: the target of reset is what’s being kept, and the target of revert is what is lost.

Using our previous bead example, let’s say you wanted to reset your string of beads so that the most recent three beads were replaced by a single big bead. You would use the command reset to point the new end for your string to the fourth bead from the end. You would then slide the three beads off the end of the string. (If you used the parameter --hard, these beads would be discarded.) Instead, we’re going to remold these beads, and put them back on the string as a new commit.

Commits Must Be Consecutive, and End with the Most Recent Commit

For this operation to work, you need to be compressing consecutive commits leading up to your most recent commit. What we are doing is essentially a stepping stone to interactive rebasing. With this use of reset you will be limited to the most recent commits. With rebasing, you will be able to select any range of commits.

Using the command log, identify the most recent commit that you want to keep. This will become the new tip for your branch:

$ git log --oneline

699d8e0 More editing second file
eabb4cc Editing the second file
d955e17 Adding second file
eppb98c Editing the first file
ee3e63c Adding first file

Sticking with the three-bead analogy, the bead that I want to have as the new tip of my necklace is eppb98c. (This is the fourth bead from the end—not entirely intuitive if you are completely focused on removing three beads.) We’re going to put our finger on the bead we want to keep, and slide the rest off of the string:

$ git reset eppb98c

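There are now three loose beads rattling around. These beads will appear as uncommitted changes in your working directory: edits to files Git already tracks show up as modified, and files that were first added in the removed commits show up as untracked. The content of the files will not have changed.

You can view the changes to tracked files that will be in your new commit by using the command diff (untracked files are not included in this output; they are listed by the command status instead):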

$ git diff

To combine all of the edits that were made in those three commits into a single commit, use the command add to capture the changes in the staging area:

$ git add --all

Ensure the files are now staged and ready to be saved:

$ git status

Now that the files have been staged, the command diff will no longer show you what you are about to commit to your repository. Instead, you will need to examine the staged version of the changes:

$ git diff --staged

Staging Is Also Caching

The parameter --staged is an alias of --cached. I choose to use the aliased version because it more closely matches the terms I use when talking about staging changes. If you are searching for more documentation about this parameter, be sure to also look for the original parameter name.

Once you are satisfied with the contents of your new commit, you can go ahead and complete the commit process:

$ git commit -m "Replacing three small beads with this single, giant bead."

The three commits will now be combined into one single commit.
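
The full squash-by-reset sequence can be sketched end to end as follows. The scratch repository and file contents are assumptions; the commit messages mirror the log shown earlier in this section. Instead of typing a commit ID, the script records the ID of the commit to keep in a shell variable.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "first" > first.txt
git add first.txt
git commit -q -m "Adding first file"
echo "edit" >> first.txt
git commit -q -am "Editing the first file"
keep=$(git rev-parse HEAD)   # the bead we put our finger on

echo "second" > second.txt
git add second.txt
git commit -q -m "Adding second file"
echo "edit" >> second.txt
git commit -q -am "Editing the second file"
echo "more" >> second.txt
git commit -q -am "More editing second file"

# Slide the three newest commits off the branch, keeping their content.
git reset -q "$keep"

# Stage everything, review it, and save it as one commit.
git add --all
git diff --staged --stat
git commit -q -m "Replacing three small beads with this single, giant bead."

git log --oneline
count=$(git rev-list --count HEAD)
```

The branch ends up with three commits instead of five, and second.txt contains all of the edits from the commits that were combined.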

If you are having a hard time with the word reset and having to go one past the commit you are looking for, I encourage you to use relative history instead of commit IDs. For example, if you wanted to compress three commits from your branch into one, you would use the following command:

$ git reset HEAD~3

This version of the command puts your repository into the same state as the previous example, but it’s as if the pointer was using another language. Either approach is fine. Use whichever one makes more sense to you. I personally find if there are more than a handful of commits that I want to reset, using the commit ID is a lot easier than counting backward.
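
If the off-by-one counting feels slippery, you can ask Git to resolve the relative reference before you reset anything. A small sketch, assuming a scratch repository with five numbered commits:

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

for n in 1 2 3 4 5; do
  echo "$n" >> log.txt
  git add log.txt
  git commit -q -m "Commit $n"
done

# The fourth entry in the log, counting the tip as the first...
by_position=$(git log --format=%H | sed -n 4p)
# ...is the same commit that HEAD~3 refers to.
by_relative=$(git rev-parse HEAD~3)

git log -1 --oneline "$by_relative"
```

Running git rev-parse or git log -1 against HEAD~3 first is a cheap way to confirm which commit will become the new tip.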

If you’ve been following along with the examples in this section, you should now be able to restore a file that was deleted, and combine several smaller commits into one.

Altering Commits with Interactive Rebasing

Rebasing is one of those topics that has gained both a strong positive following and strong opponents. While I have no technical problems using the command, I openly admit that I don’t like what it does. Rebasing is primarily used to change the way history is recorded, often without changing the content of the files in your working directory. Used incorrectly, this can cause chaos on shared branches as new commit objects with different IDs are used to store identical work. But my complaints are more to do with the idea that it’s okay to rewrite history to suit your fancy. In the nonsoftware world historical revisionism is wrong.

Complaints aside, rebasing is simply the model Git has decided on and so it fits quite well into many workflows. (I use it when it is appropriate to do so—even for my solo projects where its use is not being enforced by an outside team.) One of the times it is appropriate to use rebasing is when bringing a branch up-to-date (as was discussed in “Rebasing Step by Step” and in Chapter 3); the second is before publishing your work—interactive rebasing allows you to curate the commits into an easier-to-read history. In this section you will learn about the latter of these two methods.

Interactive rebasing can be especially useful if you’ve been committing micro thoughts—leaving you with commits in your history that only capture partial ideas. Interactive rebasing is also useful if you have a number of commits that, due to a peer review or sober second thought, you’ve decided were not the correct approach. Cleaning up your history so there are only good, intentional commits will make it easier to use the command bisect in Chapter 9. To help explain the concept, I created a simple animation showing the basic principles of squashing several small commits into one whole idea.

The first thing you need to do is select a commit in your history that you want to have as your starting point (I often choose one commit older than what I think I’ll need—just in case). Let’s say your branch’s history has the following commits:

d1dc647 Revert "Adding office hours reminder."
50605a1 Correcting joke about horses and baths.
eed5023 Joke: What goes 'ha ha bonk'?
77c00e2 Adding an Easter egg of bad jokes.
0f187d8 Added information about additional people to be thanked.
c546720 Adding office hours reminder.
3184b5d Switching back to BADCamp version of the deck.
bd5c178 Added feedback request; formatting updates to pro-con lists
876e951 Removing feedback request; added Twitter handle.

You have decided that the three commits about jokes should be collapsed into a single commit. Looking to the commit previous to this, you select 0f187d8 as your starting point. You are now ready to begin the rebasing process:

$ git rebase --interactive 0f187d8
pick 77c00e2 Adding an Easter egg of bad jokes.
pick eed5023 Joke: What goes 'ha ha bonk'?
pick 50605a1 Correcting joke about horses and baths.
pick d1dc647 Revert "Adding office hours reminder."

# Rebase 0f187d8..d1dc647 onto 0f187d8
#
# Commands:
#  p, pick = use commit
#  r, reword = use commit, but edit the commit message
#  e, edit = use commit, but stop for amending
#  s, squash = use commit, but meld into previous commit
#  f, fixup = like "squash", but discard this commit's log message
#  x, exec = run command (the rest of the line) using shell
#
# These lines can be re-ordered; they are executed from top to bottom.
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
# However, if you remove everything, the rebase will be aborted.
#
# Note that empty commits are commented out

The list of commits has been reversed and the oldest commit is now at the top of the list. Edit the list and replace the second and third use of the word pick with squash. In my case, the edited list would appear as follows:

pick 77c00e2 Adding an Easter egg of bad jokes.
squash eed5023 Joke: What goes 'ha ha bonk'?
squash 50605a1 Correcting joke about horses and baths.
pick d1dc647 Revert "Adding office hours reminder."

Save and quit your editor to proceed. A new commit message editor window will open. You will now need to craft a new commit message that represents all of the commits you are combining. The current messages are provided as a starting point:

# This is a combination of 3 commits.
# The first commit's message is:
Adding an Easter egg of bad jokes.

You should add your bad jokes too.

# This is the 2nd commit message:

Joke: What goes 'ha ha bonk'?

# This is the 3rd commit message:

Correcting joke about horses and baths.

# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
#
# Date:      Wed Sep 10 06:12:01 2014 -0400
#
# rebase in progress; onto 0f187d8
# You are currently editing a commit while rebasing branch 'practice_rebasing'
on '0f187d8'.
#
# Changes to be committed:
#       new file:   badjokes.md
#

In this case, it is appropriate to update the commit message as follows:

Adding an Easter egg of bad jokes.

- New Joke: What goes 'ha ha bonk'?

You don’t need to remove lines starting with #. I have done this to make it a little easier to read.

When you are happy with the new commit message, save and quit the editor to proceed:

[detached HEAD 1c10178] Adding an Easter egg of bad jokes.
 Date: Wed Sep 10 06:12:01 2014 -0400
 1 file changed, 7 insertions(+)
 create mode 100644 badjokes.md
Successfully rebased and updated refs/heads/practice_rebasing.

The rebasing procedure is now complete. Your revised log will appear as follows:

$ git log --oneline

ef4409f Revert "Adding office hours reminder."
1c10178 Adding an Easter egg of bad jokes.
0f187d8 Added information about additional people to be thanked.
c546720 Adding office hours reminder.
3184b5d Switching back to BADCamp version of the deck.
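
For practice, the same squash can be run without opening an editor. This sketch is not how you would normally work; it stands in for the hand edits by pointing the environment variable GIT_SEQUENCE_EDITOR at a sed command that changes pick to squash on the second and third lines, and it sets GIT_EDITOR to true so the combined commit message is accepted as offered. The repository contents and the final commit message are invented for the example.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/practice_rebasing
git config user.name "Example User"
git config user.email "user@example.com"

echo "thanks" > README.md
git add README.md
git commit -q -m "Added information about additional people to be thanked."
base=$(git rev-parse HEAD)

echo "joke 1" > badjokes.md
git add badjokes.md
git commit -q -m "Adding an Easter egg of bad jokes."
echo "joke 2" >> badjokes.md
git commit -q -am "Joke: What goes 'ha ha bonk'?"
echo "joke 2, corrected" >> badjokes.md
git commit -q -am "Correcting joke about horses and baths."
echo "more thanks" >> README.md
git commit -q -am "Updating the thank-you list."

# Lines 2 and 3 of the to-do list become "squash"; lines 1 and 4 stay "pick".
GIT_SEQUENCE_EDITOR='sed -i.bak -e "2,3s/^pick/squash/"' GIT_EDITOR=true \
  git rebase --interactive "$base" > /dev/null 2>&1

git log --oneline
count=$(git rev-list --count HEAD)
```

The three joke commits collapse into one, leaving three commits on the branch, and badjokes.md still contains all three lines.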

In the second example, we are going to separate changes that were made in a single commit so they are available as two commits instead. This would be useful if you made several changes to a single file and committed all of those changes as a single commit, but they should actually have been saved as two separate commits.

To separate a commit into several, begin the same way as you did before. This time when presented with the list of options, change pick to edit for one of the commits. When you save and close the editor this time, you will be presented with the option to amend your commit (you know how to do this! yay!), and then proceed with the rebase process:

Stopped at 0f187d831260b8e93d37bad11be1f41aaeca835e... Added information
about additional people to be thanked.
You can amend the commit now, with

	git commit --amend

Once you are satisfied with your changes, run

	git rebase --continue

At this point you are in a detached HEAD state (you’ve been here before! it’s okay!), but the files are all committed. You need to reset the working directory so that it has uncommitted files that you can work with. Do you remember the command we used previously to accomplish this? It’s reset! Instead of selecting a specific commit, it’s okay to use the shorthand for “one commit ago,” which is HEAD~1:

$ git reset HEAD~1

Unstaged changes after reset:
M	README.md

Now you have an uncommitted file in your working directory that needs to be added before you can continue the rebasing.

At this point, you can stage your files interactively by adding the parameter --patch when you add your files. This allows you to separate changes saved into one file into two (or more) commits. You do this by adding one hunk of the change to the staging area, committing the change, and then adding a new hunk to the staging area:

$ git add --patch README.md

You will be asked if you want to stage each of the hunks in the file:

diff --git a/README.md b/README.md
index 291915b..2eceb48 100644
--- a/README.md
+++ b/README.md
@@ -49,3 +49,5 @@ Emma is grateful for the support she received while employed at
 Drupalize.Me (Lullabot) for the development of this material.
 The first version of the reveal.js slides for this work were posted at
 [workflow-git-workshop](https://github.com/DrupalizeMe/workflow-git-workshop).
+
+Emma is also grateful to you for watching her git tutorials!
Stage this hunk [y,n,q,a,d,/,e,?]?

If you want to include the hunk, choose y; otherwise, choose n. If it’s a big hunk and you want to only include some of it, choose s (this option isn’t available if the hunk is too small). Proceed through each of the changes in the file and select the appropriate option. When you get to the end of the list of changes, you will be returned to the prompt. Use the command git status, and assuming you staged some hunks but not others, you will see your file listed twice: once under changes to be committed, and once under changes not staged for commit:

$ git status

rebase in progress; onto bd5c178
You are currently splitting a commit while rebasing branch 'practice_rebasing'
on 'bd5c178'.
  (Once your working directory is clean, run "git rebase --continue")

Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	modified:   README.md

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   README.md

Proceed by committing your staged changes:

$ git commit

If the remainder of the changes can all be included in the same commit, you can omit the parameter --patch and add and commit the file to the repository:

$ git add README.md
$ git commit

With all of your changes committed, you are ready to proceed with the rebase. It seems like there aren’t any hints, but if you check the status, Git will remind you that you are not done yet:

$ git status

rebase in progress; onto bd5c178
You are currently editing a commit while rebasing branch 'practice_rebasing'
on 'bd5c178'.
  (use "git commit --amend" to amend the current commit)
  (use "git rebase --continue" once you are satisfied with your changes)

nothing to commit, working directory clean

To complete the rebase, follow the command as Git has described in the status message:

$ git rebase --continue

Successfully rebased and updated refs/heads/practice_rebasing.

Phew! You did it! That was a lot of steps, but they were all concepts you have previously tried; this time they were chained together. Well done, you.
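
The chain of steps can be rehearsed in a scratch repository. This sketch makes two simplifying assumptions so that it runs without prompts: the to-do list is edited by a sed command supplied through GIT_SEQUENCE_EDITOR rather than by hand, and the oversized commit touches two separate (hypothetical) files, so each can be staged whole instead of hunk by hunk with --patch.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/practice_rebasing
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base"

# One commit that really contains two unrelated ideas.
echo "thanks" > thanks.md
echo "jokes" > jokes.md
git add --all
git commit -q -m "Adding thanks and jokes"

echo "later" > later.txt
git add later.txt
git commit -q -m "Later work"

# Mark the first commit in the to-do list for editing; the rebase stops there.
GIT_SEQUENCE_EDITOR='sed -i.bak -e "1s/^pick/edit/"' \
  git rebase --interactive HEAD~2 > /dev/null 2>&1

# Undo the commit but keep its content, then save it in two parts.
git reset -q HEAD~1
git add thanks.md
git commit -q -m "Adding thanks"
git add jokes.md
git commit -q -m "Adding jokes"

# Replay the remaining commit on top of the two new ones.
GIT_EDITOR=true git rebase --continue > /dev/null 2>&1

git log --oneline
count=$(git rev-list --count HEAD)
```

The branch now has four commits where it previously had three, with the later work replayed on top of the split pair.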

If you have followed each of the examples in this section, you should now be able to amend commits, and alter the history of a branch using interactive rebasing.

Unmerging a Branch

Mistakes can happen when you are merging branches. Maybe you had the wrong branch checked out when you performed the merge; or maybe you were supposed to use the --no-ff parameter when merging, but you forgot. So long as you haven’t published the branch, it can be quite easy to “unmerge” your branches.

There Is No Such Thing as an Unmerge

“Inconceivable!” he cried. “I do not think that word means what you think it means,” the other replied. With apologies to The Princess Bride, it’s true; there’s no six-fingered man in Git, and there’s not really a way to “unmerge” something. You can, however, reverse the effects of a merge by resetting the tip of your branch to the point immediately before you used the command merge. Hopefully this doesn’t happen to you often, because it’s possible it will take years off your life just like The Machine does to our hero, Westley.

Ideally, you will notice you have incorrectly merged a branch immediately after doing it. This is the easiest scenario to reverse. Git knows some of its commands are more dangerous than others, so it stores a pointer to the most recent commit right before it performs the dangerous maneuver. Git considers a merge to be dangerous, and so you can easily undo a merge right after it occurs by running reset, and pointing the tip of your branch back to the commit right before the merge took place:

$ git reset --merge ORIG_HEAD
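
Here is a minimal sketch of the immediate undo, using a scratch repository and hypothetical branch names. It records the tip of the branch before the merge so the result of the reset can be checked afterward.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/trunk
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base"

git checkout -q -b wrong_branch
echo "oops" > oops.txt
git add oops.txt
git commit -q -m "Work that should not be merged yet"

git checkout -q trunk
before_merge=$(git rev-parse HEAD)

# The mistaken merge, followed immediately by the undo.
git merge -q --no-ff --no-edit wrong_branch
git reset --merge ORIG_HEAD
after_reset=$(git rev-parse HEAD)

git log --oneline
```

The branch that was merged by mistake is untouched; only the tip of the branch you merged into has moved back.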

If you did not notice your mistake right away, you will need to ask yourself a few more questions before proceeding. Figure 6-6 summarizes the considerations you will need to make in order to select the correct commands to unmerge your work.

You will need to think carefully about what work you may want to retain, and what work can be thrown out, before proceeding. If you have deleted the branch you are removing, you may wish to create a backup copy of the commits in a separate branch. This will save you from having to dig through the reflog to find the lost commits.

Let’s say the branch you are working on is named master, and you want to create a backup branch named preservation_branch:

$ git checkout master
$ git checkout -b preservation_branch

Flow chart diagram. All commands identified in the diagram are described in text form.
Figure 6-6. Before unmerging your branch, consider what may happen to the lost commits

You now have a branch with the good commits and the bad commits, and you can proceed with removing the bad commits. This assumes there are no additional commits you want to save on the branch that needs cleaning:

$ git checkout master
$ git reset --merge ORIG_HEAD

If you do want to save some of the commits, you can now cherry-pick them back from the backup branch you created.

$ git cherry-pick commit_to_restore
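
Putting the backup, the reset, and the cherry-pick together might look like the following sketch. The repository, the branch named bad_branch, and the file names are assumptions; the scenario is a bad merge followed by one good commit that you want to keep.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/trunk
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base"

git checkout -q -b bad_branch
echo "bad" > bad.txt
git add bad.txt
git commit -q -m "Bad idea"

git checkout -q trunk
git merge -q --no-ff --no-edit bad_branch
echo "good" > good.txt
git add good.txt
git commit -q -m "Good follow-up work"
good=$(git rev-parse HEAD)

# Keep a copy of everything, good and bad, before cleaning up.
git checkout -q -b preservation_branch
git checkout -q trunk

# Move the tip back to where it was before the merge...
git reset --merge ORIG_HEAD

# ...then copy the one commit worth keeping back from the backup.
git cherry-pick "$good" > /dev/null

git log --oneline
```

This works here because the only thing that happened after the merge was an ordinary commit, which leaves ORIG_HEAD alone; the backup branch still holds the merge in case you need to consult it.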

The method of using ORIG_HEAD as a reference point will only work if you notice right away that you need to unmerge the bad branch. If you have been working on other things, it’s possible that Git will have already established a new ORIG_HEAD. In this case, you will need to select the specific commit ID you want to return to. Because this form of reset does not use --hard or --merge, the content from the removed commits stays in your working directory as uncommitted changes:

$ git reset last_correct_commit

As Figure 6-6 shows, there are a few different scenarios for unmerging branches. Take your time and remember, the reflog keeps track of everything, so if something disappears, you can always go back and check out a specific commit to center yourself and figure out what to do without losing any of your work.

Undoing Shared History

This chapter has been focused on altering the unpublished history of your repository. As soon as you start publishing your work you will eventually publish something that needs to be fixed up. There are lots of reasons why this can happen—new requirements from a client; you notice a bug; someone else notices a bug. There is nothing to be ashamed of if you need to make a change and share it with others, and you almost certainly don’t need to hide your learning! Sometimes, however, it’s appropriate to clean up a commit history that has already been shared. For example, lots of minor fixes can make debugging tools, such as bisect, less efficient; and a clean commit history is easier to read. The most polite way to modify shared history is to not modify it at all. Instead of a “roll back” to recover a past working state, think of your actions as “rolling forward” to a future working state. You can do this by adding new commits, or by using the command revert. In this section you will learn how to fix up a shared history without frustrating your teammates.

Reverting a Previous Commit

If there was a commit in the past that was incorrect, it is possible to apply a new commit that is the exact opposite of what you had previously using the command revert. If you are into physics, revert is kind of like noise-canceling headphones. The command applies the exact opposite sound as the background noise, and the net effect to your ears is a silent nothingness.

When you use the command revert, you will notice that your history is not altered. Commits are not removed; rather, a new commit is applied to the tip of your branch. For example, if the commit you are reverting applied three new lines, and removed one line, the revert will remove the three new lines and add back the deleted line.

For example, you have the following history for your branch:

50605a1 Correcting joke about horses and baths.
eed5023 Joke: What goes 'ha ha bonk'?
77c00e2 Adding an Easter egg of bad jokes.
0f187d8 Added information about additional people to be thanked.
c546720 Adding office hours reminder.
3184b5d Switching back to BADCamp version of the deck.
bd5c178 Added feedback request; formatting updates to pro-con lists

You decide that you want to remove the commit made about the reminder for the office hours, because that message was only relevant for that particular point in time. This message was added at c546720:

$ git revert c546720

The commit message editor will open. A default message is provided, so you can save and quit to proceed:

[master d1dc647] Revert "Adding office hours reminder."
 1 file changed, 2 deletions(-)

Your logged history now includes a new commit to undo the changes that were added in c546720:

d1dc647 Revert "Adding office hours reminder."
50605a1 Correcting joke about horses and baths.
eed5023 Joke: What goes 'ha ha bonk'?
77c00e2 Adding an Easter egg of bad jokes.
0f187d8 Added information about additional people to be thanked.
c546720 Adding office hours reminder.
3184b5d Switching back to BADCamp version of the deck.

Repeat for each commit that you want to revert.
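
A revert is easy to try in a scratch repository. In this sketch the file names and contents are invented, and --no-edit is used so the default message is accepted without opening the editor.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "slides" > deck.md
git add deck.md
git commit -q -m "Switching back to BADCamp version of the deck."

printf 'Office hours: Tuesday\nBring questions\n' >> deck.md
git commit -q -am "Adding office hours reminder."
to_revert=$(git rev-parse HEAD)

echo "thanks" > thanks.md
git add thanks.md
git commit -q -m "Added information about additional people to be thanked."

# Apply the opposite of the reminder commit as a new commit.
git revert --no-edit "$to_revert" > /dev/null

git log --oneline
count=$(git rev-list --count HEAD)
```

The log grows by one commit rather than shrinking, the original reminder commit is still in the history, and deck.md is back to its earlier content.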

If you have followed along with each of the examples in this section, you should now be able to reverse the changes that were implemented in a previous commit.

Unmerging a Shared Branch

Previously in this chapter you learned how to unmerge two branches using the command reset. This command deletes commits from a branch’s history. As a result, Git will treat them as new commits if it encounters them again. This happens if people merge their (now out of date) branch into the main repository.

To know which commands to use, you will first need to determine what kind of merge it is. Figure 6-7 compares a fast-forward merge and a true merge. A fast-forward merge is aligned with the commits from the branch it was merged into; a true merge, however, is displayed as a hump on the graph and includes a commit where the merge was performed.

Using the command log, look for the point where the incorrect branch was merged in (Example 6-8). If there is a merge commit, you’re in luck! If there is no merge commit, you are going to have to do a lot more work to get the branch unmerged.

A fast-forward merge contains a single string of commit objects. A true merge contains two strings, which are joined with a commit.
Figure 6-7. When graphed, a fast-forward merge loses the visual of a branch; a true merge maintains it.

Example 6-8. The graphed log of your commit history will show you if it’s a true merge
$ git log --oneline --graph

*   4f2eaa4 Merge branch 'ch07' into drafts
|\
| * c10fbdd CH07: snapshot after editing draft in LibreOffice
| * 9716e7b CH07: snapshot before LibreOffice editing
| * 8373ad7 App01: moving version check to the appendix from CH07
| * d602e51 CH7: Stub file added with notes copied from video recording lessons.
* | 1ae7de0 CH08: Incorrect heading formatting was creating new chapter
* | 7907650 CH08: Draft chapter. Based on ALA article.
* | ad6c422 CH8: Stub file added with notes copied from video recording lessons.

You may also want to look at a single commit using the command show to confirm whether it is a true merge. For a merge commit, the output includes a Merge: line listing the abbreviated SHA-1 IDs of the parent commits that were merged:

$ git show 90249389
commit 902493896b794d7bc6b19a1130240302efb1757f
Merge: 54a4fdf c077a62
Author: Joe Shindelar <[email protected]>
Date:   Mon Jan 26 18:30:55 2015 -0700

    Merge branch 'dev' into qa

Thanks, Joe, for this tip!

Being Consistent Makes It Easier to Search Successfully

The default commit message for a merge commit is “Merge branch incoming into current,” which makes it easier to spot when reading through the output from the log command. Your team might choose to use a different commit message template; however, you can add the optional parameters --merges and --no-merges to further filter the logged history.
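As a minimal sketch of those two filters, the following script builds a throwaway repository, performs one true merge, and then lists the merge commits and the ordinary commits separately. The branch names echo Example 6-8; the file names, commit messages, and author identity are placeholders invented for this illustration.

```shell
# Build a disposable repository; nothing here touches your real work.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/drafts       # name the first branch "drafts"
git config user.name "Example Author"          # placeholder identity
git config user.email "author@example.com"

echo "chapter 8" > ch08.txt
git add ch08.txt
git commit -q -m "CH08: stub file"

git checkout -q -b ch07
echo "chapter 7" > ch07.txt
git add ch07.txt
git commit -q -m "CH07: stub file"

# --no-ff forces a true merge, so a merge commit is recorded.
git checkout -q drafts
git merge -q --no-ff -m "Merge branch 'ch07' into drafts" ch07

# Only the merge commits (commits with more than one parent).
merges=$(git log --merges --format=%s)
echo "Merge commits:"
echo "$merges"

# Everything except the merge commits.
ordinary=$(git log --no-merges --format=%s)
echo "Ordinary commits:"
echo "$ordinary"
```

If the first list comes back empty for the branch you are investigating, there is no merge commit to revert and you are looking at a fast-forward merge.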

Once you know if there is a merge commit present, you can choose the appropriate set of commands. Figure 6-8 summarizes these options as a flowchart.

If the branch was merged using a true merge, and not a fast-forward merge, the undo process is as follows: use the command revert to reverse the effects of the merge commit (Example 6-9). This command takes one additional parameter, --mainline. This parameter tells Git which of the branches it should keep while undoing the merge. Take a look at your graphed log and count the lanes from left to right. The first lane is 1. You almost always want to keep the leftmost lane, and so the number to use is almost always 1.

Example 6-9. Reversing a merge commit
$ git checkout branch_to_clean_up
$ git log --graph --oneline
$ git revert --mainline 1 4f2eaa4

The commit message editor will open. A default commit message is provided indicating a revert is being performed, and including the commit message from the commit it is reversing (Example 6-10). I generally leave this message in place due to sheer laziness; however, the upside is that it is quite easy to search through my recorded history and find any commits where I’ve reverted a merge.

Flow chart diagram. If the merge was performed with a fast-forward strategy, there is no merge commit and you will need to deal with the commits separately. If there was a merge commit, you can revert this commit to un-merge the branch.
Figure 6-8. Depending on how your branch was merged, you will use different commands to unmerge the shared branch
Example 6-10. Sample commit message for a revert of a merge commit
Revert "Merge branch 'video-lessons' into integration_test"

This reverts commit 0075f7eda67326f174623eca9ec09fd54d7f4b74, reversing
changes made to 0f187d831260b8e93d37bad11be1f41aaeca835e.

# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
# On branch master
# Your branch and 'origin/master' have diverged,
# and have 23 and 2 different commits each, respectively.
#   (use "git pull" to merge the remote branch into yours)
#
# Changes to be committed:
#       deleted:    lessons/01-intro/README.md
#       deleted:    lessons/02-getting-started/README.md
#       deleted:    lessons/03-clone-remote/README.md
#       deleted:    lessons/04-config/README.md
(etc)
#
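To watch the whole sequence on a repository you can safely throw away, here is a hedged sketch. It recreates a true merge, confirms the merge commit has two parents, and then reverts it while keeping lane 1. The branch names are borrowed from Example 6-10; the file names and author identity are placeholders, and --no-edit is used only so the script does not stop to open the commit message editor.

```shell
# Build a disposable repository with one true merge in it.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/integration_test
git config user.name "Example Author"          # placeholder identity
git config user.email "author@example.com"

echo "base" > README.md
git add README.md
git commit -q -m "Initial commit"

git checkout -q -b video-lessons
echo "lesson" > lesson.md
git add lesson.md
git commit -q -m "Add lesson"

git checkout -q integration_test
git merge -q --no-ff -m "Merge branch 'video-lessons' into integration_test" video-lessons

# A merge commit is listed with its own ID followed by two parent IDs.
parents=$(git rev-list --parents -n 1 HEAD)
echo "$parents"

# Keep lane 1 (integration_test) and undo what the merge brought in.
# --no-edit accepts the default message instead of opening the editor.
git revert --no-edit --mainline 1 HEAD

subject=$(git log -1 --format=%s)
echo "$subject"

# lesson.md arrived with the merge, so the revert removes it again.
ls
```

The merge commit itself stays in the history; the revert adds a new commit on top of it that removes the merged-in work.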

Occasionally you will run into conflicts when running a revert. No reason to panic. Simply treat it as any other merge conflict and follow Git’s on-screen instructions:

$ git revert --mainline 1 a1173fd
error: could not revert a1173fd... Merge branch 'unmerging'
hint: after resolving the conflicts, mark the corrected paths
hint: with 'git add <paths>' or 'git rm <paths>'
hint: and commit the result with 'git commit'
Resolved 'README.md' using previous resolution.

Something went wrong—check the status message to see which files need reviewing:

$ git status

On branch master
Your branch and 'origin/master' have diverged,
and have 20 and 2 different commits each, respectively.
  (use "git pull" to merge the remote branch into yours)
You are currently reverting commit a1173fd.
  (fix conflicts and run "git revert --continue")
  (use "git revert --abort" to cancel the revert operation)

Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	deleted:    badjokes.md
	modified:   slides/slides/session-oscon.html

Unmerged paths:
  (use "git reset HEAD <file>..." to unstage)
  (use "git add <file>..." to mark resolution)

	both modified:   README.md

The message about the repository being out of sync with origin is unrelated to this issue. Skip that, and keep reading. The first useful bit of information starts at: You are currently reverting. You are given the options on how to proceed, and on how to abort the process. Don’t give up! Keep reading. The next bit looks like a regular ol’ dirty working directory with some files that are staged, and some that aren’t. If you were just making edits to your files, you would know how to deal with this. First you add your changes to the staging area, and then you commit them:

$ git add README.md
$ git commit -m "Reversing the merge commit a1173fd."
[master 291dabe] Reversing the merge commit a1173fd.
 2 files changed, 2 insertions(+), 7 deletions(-)
 delete mode 100644 badjokes.md

If there is no merge commit, you will need to deal with each of the commits you want to undo individually. This is going to be especially frustrating because a fast-forward merge does not have any visual clues in the graphed log about which commits were in the offending branch. (After the first time unpicking an incorrect merge, you’ll begin to see the logic in using a --no-ff strategy when merging branches.)

Consider Your Options by Talking to Your Team

Before unpicking the commits one at a time, you may want to check if there is anyone on the team with an unpublished, unsullied version of the branch they can share. Sometimes it is easier to break history with a well-placed push --force.

The first thing you need to do is get a sense of where the bad commits are. If you are not entirely sure how things went wrong, you can get a list of all the branches a commit is contained within by using the command branch with the parameter --contains:

$ git branch --contains commit

Assuming the merged-in branch hasn’t been deleted, you should be able to use the information to figure out which branch you are trying to unmerge, and what commits were applied to that branch that you might want to remove. Remember, though, the commits are going to be in both branches, so you won’t be able to run a comparison to find which commits are different. This step isn’t necessary if you already know which commits you are targeting.
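One possible shortcut, offered as a sketch and not as the only approach: the reflog records where a branch pointed before it was fast-forwarded. Assuming the fast-forward happened in your local repository and was the most recent update to the receiving branch, master@{1}..master names the commits the merge brought in. The branch names, file names, and author identity below are placeholders.

```shell
# Build a disposable repository and perform a fast-forward merge.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"          # placeholder identity
git config user.email "author@example.com"

echo "base" > README.md
git add README.md
git commit -q -m "Initial commit"

git checkout -q -b feature
echo "one" > one.txt
git add one.txt
git commit -q -m "Feature: first change"
echo "two" > two.txt
git add two.txt
git commit -q -m "Feature: second change"

git checkout -q master
git merge -q --ff-only feature     # fast-forward: no merge commit is created

# Both branches contain the newest commit, so --contains lists them both.
containing=$(git branch --contains HEAD)
echo "$containing"

# master@{1} is where master pointed before the fast-forward.
brought_in=$(git log --format=%s "master@{1}..master")
echo "$brought_in"
```

The reflog is local to each repository, so this only helps in the clone where the merge was actually performed.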

If the commits you need to revert are sequential, you’re in luck! The command revert can accept a single commit, or a series of commits. Remember, though, that a revert is going to make a new commit for each commit it is reversing. This could get very noisy in your commit history, so instead of reversing each commit individually, you can group them into a single reversal by opting to save your commit message to the very end:

$ git revert --no-commit last_commit_to_keep..newest_commit_to_reject

After running this command you will end up with a dirty working directory with all of the files reverted back. Review the changes. Then, complete the revert process:

$ git revert --continue

Review the commit message and make any necessary updates to improve the clarity of the message. By default the message will be “Revert” followed by the quoted text of what was in the newest of the commits you are reversing. Often this will be sufficient, but you may want to be more descriptive if the original message was subpar.
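The grouped reversal can be rehearsed on a scratch repository before you try it on shared work. This sketch makes two unwanted commits, stages the reversal of both, and records a single commit. The file names, commit messages, and author identity are placeholders, and GIT_EDITOR=true is an assumption made so the script accepts the default message without opening an editor.

```shell
# Build a disposable repository with two sequential commits to undo.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"          # placeholder identity
git config user.email "author@example.com"

echo "base" > README.md
git add README.md
git commit -q -m "Initial commit"
keep=$(git rev-parse HEAD)                     # last_commit_to_keep

echo "one" > one.txt
git add one.txt
git commit -q -m "Bad change one"
echo "two" > two.txt
git add two.txt
git commit -q -m "Bad change two"              # newest_commit_to_reject is HEAD

before=$(git rev-list --count HEAD)

# Stage the reversal of both commits without committing anything yet.
git revert --no-commit "$keep"..HEAD
git status --short                             # review what is staged

# Record one commit; GIT_EDITOR=true accepts the default message.
GIT_EDITOR=true git revert --continue

after=$(git rev-list --count HEAD)
echo "commits before: $before, after: $after"
```

The history grows by one commit instead of two, and the files match the state of the last commit you wanted to keep.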

If the commits are not sequential, you will need to revert the offending commits one at a time. Send me a tweet at @emmajanehw and I will commiserate and cheerlead.

$ git revert commit

Unmerging a merged branch is not something Git is designed to do unless a very specific workflow has been followed. Your team may never need to unmerge a branch. I have definitely had the occasional bad merge on a personal project where I was a solo developer and opted to swear a bit, and then shrug and move on. Sometimes history doesn’t really matter all that much; sometimes it does. With experience and hindsight, you will know for sure which commands you should have been using.

Really Removing History

In this chapter, you’ve learned about updating the history of your repository, and especially retrieving information you thought was lost. There may be times when you actually do want to lose part of your history—for example, if you accidentally commit a very large data file or a configuration file that contains a password. Hopefully you never need to use this section, but just in case your “friend” ever needs help, I’ve included the instructions. You know, just in case.

Published History Is Public History

If you have published content to a publicly available remote repository, you should make the assumption that someone out there cloned a copy of your repository and has access to the secrets you did not mean to publish. Update any passwords and API keys that were published in the repository immediately.

If you need to do your cleanup on a published branch, you should notify your team members as soon as you realize you need to clean the repository. You should let them know you are going to be doing the cleanup, and will be “force pushing” a new history into the repository. Developers will need to evaluate their local repository and decide which state it is in. Have each of the developers search for the offending file to see if their repository is tainted:

  • If the file you are trying to remove is not in their local repository, they will not be affected by your cleanup.

  • If their repository does have the file, in any of their local branches, it is tainted. However, if they have not done any of their own work since the file was introduced, they will not be affected by your cleanup. This may be true for QA managers who are not also local developers. In this case, have them remove their local copy of the repository and re-clone the repository once the cleanup is done.

  • If their repository is tainted, and they do have local work that was built from a branch that includes the tainted history, they will need to bring these branch(es) up-to-date through rebasing. If they use merge to bring their branches up-to-date, they will reintroduce the problem files back into the repository and your work will have been for naught. This can be a little scary for people if they are not familiar with rebasing, so you may want to suggest that they push any branches that have work they need to keep so that you can clean it up for them. (Have them clone a new repository once the cleanup is done.)
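One way each developer might run that search is with log and a path filter, sketched below on a throwaway repository. The file name SECRET.md, the branch names, and the author identity are placeholders. The point of the example is that a file can be missing from the working directory and still be present in the history of a local branch.

```shell
# Build a disposable repository where a secret was added and later deleted.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"          # placeholder identity
git config user.email "author@example.com"

echo "notes" > README.md
git add README.md
git commit -q -m "Initial commit"

git checkout -q -b experiment
echo "hunter2" > SECRET.md
git add SECRET.md
git commit -q -m "Add settings"
git rm -q SECRET.md
git commit -q -m "Remove settings"
git checkout -q master

# Search only the current branch: master never contained the file.
on_master=$(git log --format=%s -- SECRET.md)

# --all searches every local branch and tag, not only the current one.
tainted=$(git log --all --format=%s -- SECRET.md)

if [ -n "$tainted" ]; then
  echo "This repository is tainted by these commits:"
  echo "$tainted"
else
  echo "This repository is clean."
fi
```

An empty result from the --all search means the developer falls into the first group and can carry on as usual.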

While you are working on the cleanup, your coworkers could have a sword fight or something.

With everyone on the team notified, and with a plan of what will happen before, during, and after the cleanup on everyone else’s repositories, you are ready to proceed.

For this procedure, you will use the command filter-branch. This command allows you to rewrite branch histories and tags. The examples provided in the Git documentation are interesting, and worth reading. You can, for example, use this command to permanently remove any code submitted by a specific author. I cannot think of an instance when I would choose to remove everything from someone without reviewing the implications, but it’s interesting that the command can be used in this way. (Perhaps you know exactly how it would be useful, though?)

Assuming the file you want to remove is named SECRET.md, the command would be as follows (this is a single command, but it’s long; the \ allows you to wrap onto two lines):

$ git filter-branch --index-filter \
  'git rm --cached --ignore-unmatch SECRET.md' HEAD

With the file completely removed from the repository, add it to your .gitignore file so that it doesn’t accidentally sneak in again. Instructions on working with .gitignore are available in Appendix C.

Unlike the other methods in this chapter, we are aiming to permanently remove the offending content from your repository. For a brief period of time the commits will still be available by using the command reflog. When you are sure you do not need the commits anymore, you can obliterate them from your system by cleaning out the local history as well and doing a little garbage collection (gc):

$ git reflog expire --expire=now --all
$ git gc --prune=now
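If you want proof that the content is really gone, the following sketch runs the whole procedure on a scratch repository and then asks Git whether the object holding the secret still exists. Two details are additions to the steps above and should be read as assumptions about your setup: filter-branch keeps a backup of the old history under refs/original/, which is deleted here so garbage collection can reach the old commits, and FILTER_BRANCH_SQUELCH_WARNING is set because newer versions of Git pause to print a warning before filter-branch runs. The file names and author identity are placeholders.

```shell
# Build a disposable repository that contains a secret in its history.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"          # placeholder identity
git config user.email "author@example.com"

echo "notes" > README.md
echo "hunter2" > SECRET.md
git add README.md SECRET.md
git commit -q -m "Initial commit"
echo "more notes" >> README.md
git commit -q -a -m "Update notes"

blob=$(git rev-parse HEAD:SECRET.md)           # object ID of the secret's contents

# Skip the warning pause that newer Git versions add to filter-branch.
export FILTER_BRANCH_SQUELCH_WARNING=1
git filter-branch --index-filter \
  'git rm --cached --ignore-unmatch SECRET.md' HEAD

# Delete the backup refs; otherwise the old commits stay reachable.
git for-each-ref --format='%(refname)' refs/original/ |
  xargs -n 1 git update-ref -d

git reflog expire --expire=now --all
git gc -q --prune=now

# cat-file -e exits with 0 only if the object still exists.
if git cat-file -e "$blob" 2>/dev/null; then
  result="still present"
else
  result="removed"
fi
echo "Secret object: $result"
```

Record the object ID before you rewrite history; once the file has been filtered out, there is no path left to look it up by.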

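Your repository is now cleaned, and you are ready to push the new version to your remote repositories. Git does not accept --all and --tags in the same push, so push the branches first and the tags second:

$ git push origin --force --all
$ git push origin --force --tags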

Once the new version of history is available from the shared repository, you can tell your coworkers to update their work. Depending on the conversation you’ve had previously, they will incorporate your sanitized changes into their work by one of the following methods:

  • Cloning the repository again from scratch. This method is better for teams that are not currently using rebasing and are intimidated by it.

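  • Updating their branches with rebase. This method is better for teams that are already comfortable with rebasing because it is faster than starting a new clone, and allows them to keep any work they have locally. The preserve option keeps local merge commits intact during the rebase; it was removed in Git 2.34, so on newer versions substitute --rebase=merges: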

$ git pull --rebase=preserve

Both GitHub and Bitbucket offer articles on how to do this cleanup for repositories stored on their sites. Both are worth reading because they cover slightly different scenarios.

Now that you know Git’s built-in way of sanitizing a repository, check out this stand-alone package, BFG Repo Cleaner. It delivers the same outcome as filter-branch, but it is much faster to use, and once it is installed, it’s much easier, too. If you are dismayed by the amount of time a cleanup is taking with filter-branch, you should definitely try using BFG.

Command Reference

Table 6-2 lists the commands covered in this chapter.

Table 6-2. Git commands for undoing work
Command Use

git checkout -b branch

Create a new branch with the name branch

git add filename(s)

Stage files in preparation for committing them to the repository

git commit

Save the staged changes to the repository

git checkout branch

Switch to the specified branch

git merge branch

Incorporate the commits from the branch branch into the current branch

git branch --delete

Remove a local branch

git branch -D

Remove a local branch whose commits are not incorporated elsewhere

git clone URL

Create a local copy of a remote repository

git log

Read the commit history for this branch

git reflog

Read the extended history for this branch

git checkout commit

Check out a specific commit; puts you into a detached HEAD state

git cherry-pick commit

Copy a commit from one branch to another

git reset --merge ORIG_HEAD

Remove from the current branch all commits applied during a recent merge

git checkout -- filename

Restore a file that was changed, but has not yet been committed

git reset HEAD filename

Unstage a file that is currently staged so that its changes are not saved during the next commit

git reset --hard HEAD

Restore all changed files to the previously stored state

git reset commit

Unstage all of the changes that were previously committed up to the commit right before this point

git rebase --interactive commit

Edit, or squash commits since commit

git rebase --continue

After resolving a merge conflict, continue with the rebasing process

git revert commit

Unapply changes stored in the identified commit; this creates a sharing-friendly reversal of history

git log --oneline --graph

Display the graphed history for this branch

git revert --mainline 1 commit

Reverse a merge commit

git branch --contains commit

List all branches that contain a specific commit object

git revert --no-commit last_commit_to_keep..newest_commit_to_reject

Reverse a group of commits in a single commit, instead of creating an object for every commit that is being undone

git filter-branch

Remove files from your repository permanently

git reflog expire

Prune entries from the reflog (the extended history), leaving only the regular commit history

git gc --prune=now

Run the garbage collector and immediately delete any objects that are no longer reachable from a branch, tag, or reflog entry

Summary

Throughout this chapter you learned how to work with the history of your Git repository. We covered common scenarios for some of the commands in Git which are often considered “advanced” by new Git users. By drawing diagrams summarizing the state of your repository, and the changes you wanted to make, you were able to efficiently choose the correct Git command to run for each of the scenarios outlined. You learned how to use the three “R”s of Git:

Reset

Moves the tip of your branch to a previous commit. This command does not require a commit message and it may return a dirty working directory if the parameter --hard is not used.

Rebase

Allows you to alter the way the commits are stored in the history of a branch. Commonly used to squash multiple commits into a single commit to clean up a branch; and to bring a branch up-to-date with another.

Revert

Reverses the changes made in a particular commit on a branch that has been shared with others. This command is paired with a commit and it returns a clean working directory.

In the next chapter, you will take the lessons you’ve been working on in your local repository and start integrating them with the rest of the team’s work.
