Chapter 7. Teams of More than One

The first few times you work with others on a project will shape how you approach version control. If your collaborators are patient and empathetic, you are more likely to use version control with confidence. Empathetic teammates will document the procedure they want you to use, and support you with questions (updating the documentation as necessary). If you are responsible for starting a project, think of that scene when Jerry Maguire says to his star player, “help me help you.” As a project lead, this should be your mantra. Find the sticking points and remove them. Where you want consistency, provide detailed instructions, templates, and automated scripts. When something comes in that is not up to your standard, consider it a process problem that is yours to solve.

In this chapter, we have the culmination of everything covered in this book so far. In Part I, you learned about the different considerations for setting up a project. Now you will learn how to implement those decisions. In Chapters 5 and 6, you learned how to run the commands you’ll use on a daily basis as a developer. In this chapter, you will learn how to set up a connection to a remote project, and share your work with others.

By the end of this chapter, you will be able to:

  • Set up a new project on a code hosting system

  • Download a remote repository with clone

  • Upload your changes to a project with push

  • Refresh the list of branches available from the remote repository with fetch

  • Incorporate changes from the remote repository with pull

  • Explain the implications of updating your branches with pull, rebase, and merge

Where possible, this chapter includes templates you can use to help onboard new developers. The easier it is for people to contribute usable work, the more likely they are to enjoy working on your project. Even if it’s just a job, there’s no reason we shouldn’t all have a little more delight in our lives.

Those who learn best by following along with video tutorials will benefit from Collaborating with Git (O’Reilly), the companion video series for this book.

Setting Up the Project

The context for your project will dictate a lot of how the repository will be set up. A super-secret internal-only covert code base will be set up so as to ensure privacy; a free and open source code library will be set up for transparency and probably participation. Once the project is established, the commands the developers use daily will likely be quite similar.

This section covers the basic process for creating a new project on a code hosting system. The specifics for GitHub, Bitbucket, and GitLab are covered in Part III (Chapters 10, 11, and 12, respectively).

Creating a New Project

In order to share your work with your team, you will need to establish a new project in your code hosting system of choice. These days most code hosting systems offer more than a place to dump a shared repository. They also include ticketing systems, basic workflow enhancements, project documentation repositories, and more! In the communities and teams I participate in, one of the following three services is generally used: GitHub (typically used by open source projects), Bitbucket (typically used by internal teams and small teams who need free hosting for private projects), and GitLab (typically used by medium-sized companies that need to host their code in house for security reasons).

No matter which system you choose, the basics of setting up a project are going to be the same. The first question you’ll need to ask yourself is: which account should you use to create the repository? The standard format for project URLs on a web-based system is as follows: https://<hosting-url.com>/<project-owner’s-name>/<project-name>. If the project is really and truly yours—for example, the repository for your personal blog—it’s appropriate for the URL to include your username. If, however, the project belongs to an agency of developers, it would be more appropriate for the project owner’s name to be the name of the agency. And finally, if the project belongs to a number of agencies, such as an open source software project, the most appropriate project owner name would be the name of the software project.

The decisions you make here may also affect who is able to write directly to the project, and may depend on the code hosting system you’re using. For example, if you choose to start the project under your personal name, you might not want to allow “just anyone” to write to the project without a review from you, especially for public projects where others could be evaluating the body of work under the assumption it was yours.

What’s in a Name?

The support repository for this book has existed in a number of different places over the years, including my personal account, a team account, and three different code hosting systems (for a total of six different repositories that need to be maintained). Although the work has been developed by me, it becomes a question of branding on which URL I want to distribute. If I want others to think of the repository as theirs (such as in a set of abstract learning materials where people don’t have direct access to me), I might use the project URL; but when I want people to think of me as the author because it’s also a promotional piece, I might give people my personal URL. It’s quite possible I overthink this, but you should give the naming of things at least a little consideration.

You are probably reading this book as a member of a team (even if it’s a very tiny team of one!), and so you’ll want to select the name of your company, agency, or team as the project owner, or the name of the project if you are working on an open source project. Fortunately, you can move the code base to a new name or even a new code hosting platform very easily, so it’s not absolutely critical to get it right from the beginning. It is, however, more difficult to transfer any of the metadata for your project from one account to another. Metadata could include the history of tickets for your project, and any documentation stored outside of the repository.

With the project owner selected, go ahead and create a new empty project under this account. Don’t worry about uploading files just yet.

Establishing Permissions

There are two types of permissions you will need to set for your project: who can see the project (“read”); and who can commit to the project (“write”)—this was discussed in greater detail in Chapter 2. If you are an ultra-transparent team, the project should be visible to the world. Otherwise, create a private project.

The Cost of a Free Service

Some code hosting services will charge a small fee for private repositories, and some provide this service for free. If your code and its history are important, consider paying for hosting. You might choose to pay with your time and self-host the code internally, or you may choose to pay a small monthly fee to a third-party service. The advantage of paying is that the hosting company is more likely to be accountable to you as a customer, and you are more likely to keep them in business by helping to pay their expenses. Of course, if you can’t afford to pay the fee, there are plenty of free options available—and there’s no sense feeling guilty if a company has chosen to offer a free service. Do what you can.

Additionally, some hosting systems will allow you to set per-branch restrictions. At this time Bitbucket and GitLab offer this functionality. Configuration options are described in Chapters 11 and 12, respectively.

As a distributed version control system, Git is inherently good at dealing with incoming requests for changes to a repository. Generally, team projects will have a single repository that is considered The Project, and many spin-off projects that contain the work of the individual developers for the project. If your project is internal, you may choose to have everyone working directly in The Project repository; but if you prefer to maintain a cleaner central repository, you may choose to have each of your developers work in a fork of The Project.

The Project

Throughout this chapter, you will see reference made to “The Project.” I use this shorthand to refer to the canonical, or official, repository for a software project. This is the repository that the community has agreed to use for official releases of the software. Git itself has no internal hierarchy that forces one repository to be more important than another—only the declaration by the community makes a repository the official one.

Based on the decisions you made about your team structure in Chapter 2, assign the appropriate permissions for any contributors who should be allowed write access to The Project—additional contributions can be accepted from non-authorized developers via pull requests (these are also referred to as merge requests by some services).

Uploading the Project Repository

As a distributed version control system, Git is a bit of a social butterfly. It loves to connect with all kinds of repositories. It loves sharing stories, and making new friends along the way. Git maintains its connections with its faraway friends through a stored connection referred to as a remote. A local repository may have zero, one, or many remote connections. It is typical for Git repositories to have only one remote connection—the origin. You’ve probably seen this term used before. It’s the nickname assigned to the remote repository from which you downloaded, or cloned, your local copy. It’s just a nickname, though. You can use whatever names you like for your remote connections.

When you first start a new project, you may have no code written, or some code written. (Seems obvious, right?) If you have no code written, you may choose to start your project by following the instructions from your code hosting system and cloning the empty project to your local development environment. If, however, you already have some code locally, you will want to upload what you’ve already got. To do this, you will need to make a new connection from your local repository to the project hosting service.

From your local copy of the project repository, take a look to see if you already have a remote connection set up:

$ git remote --verbose

If you started locally, you won’t see any remotes listed, so it’s okay if nothing shows up at this point. If you do have a remote set up for this repository, you will see something like the following:

origin  https://github.com/emmajane/gitforteams.git (fetch)
origin  https://github.com/emmajane/gitforteams.git (push)

Each line begins with the nickname for the remote connection (origin), as well as the source for the remote repository. These lines will always appear in pairs: the first line of the pair indicates where you will retrieve new work from (fetch), and the second indicates where you will upload new work to (push).

Project owners will need to have a connection to the official copy of a project; they may also have a connection to a fork of a project if they require themselves to go through a peer review process before incorporating their own work (peer reviews are covered in Chapter 8). As soon as you start adding multiple remote repositories for a project, the default nickname (origin) can get a bit confusing. As a result, I tend to name my remotes according to their purpose; for example, official and personal, which have meaning to me. When I upload work, I then decide between these two options. The standard Git terms for my nicknames are upstream and origin, although origin is assigned to the source of a cloned repository by default, regardless of whether or not you can write to it.

Name It to Claim It

I’ve been working with Git a very long time, and I still screw up the command git remote show origin on an embarrassingly regular basis. Four words. It shouldn’t be that hard for me to remember the order, right? I can never seem to get the order of show and origin right. By assigning my own names to the remote repositories, I am more likely to make more sense of the command, and thus get the order right. git remote show official just seems to make better sense to my brain. You may never have this problem, but if you struggle to remember this command, you might want to personalize your remote names and change the name origin to something that resonates.

To add a new remote connection, you will first need to know the URL for the project. The structure is generally https://<hosting-url.com>/<project-owner’s-name>/<project-name>.git. In newer versions of Git, the protocol https will be available to you, but in older documentation the first block may be replaced with something like git@<hosting-url.com>:. Once you know the URL for the remote repository, you can make a connection to it (Example 7-1).

Example 7-1. Add a connection to a remote repository
$ git remote add nickname project-url

After a connection is made to a remote, you should see two new lines when you list your remote connections. If you want to use Git’s terminology, you would use the nickname upstream for the official project repository; if you are using my naming convention, you would use official. This name will never be published, and there are no Git police so you can use whatever you want and no one will ever know. (You could even call it cookies or coffee if that made you happy. It really doesn’t matter.)

For example, if I was a participant in a project named Mounties, and it was run by the agency Oh, Canada, I might have a series of remotes as follows:

$ git remote --verbose

official https://github.com/ohcanada/mounties.git (fetch)
official https://github.com/ohcanada/mounties.git (push)
personal https://github.com/emmajane/mounties.git (fetch)
personal https://github.com/emmajane/mounties.git (push)

You can easily hook up as many new remote connections as you like. For example, you might have remote connections for devserver, staging, and production; or you may log directly in to those machines and pull code from The Project repository, instead of pushing code directly to those locations.
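As a rough sketch of the commands above, the following script adds two remotes to a fresh repository and lists them. The bare repositories in a temporary directory are stand-ins for hosted projects (an assumption made so the script runs without a network connection); with a real host you would pass the project URL instead of a local path.

```shell
set -e
work=$(mktemp -d)            # scratch area; nothing here touches a real project

# Two bare repositories stand in for the hosted copies (hypothetical paths).
git init -q --bare "$work/ohcanada-mounties.git"
git init -q --bare "$work/emmajane-mounties.git"

# A local repository with no remotes yet.
git init -q "$work/mounties"
cd "$work/mounties"
git remote --verbose         # prints nothing: no remotes are configured

# Add one connection per purpose, then list them.
git remote add official "$work/ohcanada-mounties.git"
git remote add personal "$work/emmajane-mounties.git"
git remote --verbose         # one fetch line and one push line per remote
```

The nicknames official and personal follow the naming convention described earlier; any other names would work the same way.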

If you already have a remote connection set up in your local repository that you no longer need, you can easily delete it (Example 7-2).

Example 7-2. Remove a remote connection
$ git remote remove nickname

Tip

You can easily rename remotes, and even set up default remotes for each of the branches in your local repository. Git’s built-in documentation for this command is easy to understand. You should read through the documentation if you want to personalize your list of remotes even further.
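A minimal sketch of both ideas from this tip, assuming a local bare repository as a stand-in for the hosted project: the remote is renamed, and the first push records which remote branch the local branch should use by default. The paths and the committer identity are placeholders.

```shell
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

git init -q --bare "$work/hosted.git"      # stand-in for the hosted project
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git commit -q --allow-empty -m "Initial commit"

git remote add origin "$work/hosted.git"
git remote rename origin official          # give the remote a name with meaning

# Push, and record official/master as the default remote branch for master.
git push -q --set-upstream official master
git status --short --branch                # first line names the tracked branch

# The built-in documentation is available with: git help remote
```

Once the upstream is recorded, a plain `git push` or `git pull` on that branch no longer needs the remote name spelled out.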

With the remote connection established for your project, you can now upload your local copy of the repository to the remote server:

$ git push nickname branch_name

If you want to share all local branches with others, you can update this command as follows:

$ git push --all nickname

Once you have uploaded your work, navigate to the project page to ensure the repository was uploaded as expected. By default, most code hosting systems will display the branch master if there is more than one branch present in the repository. If your local repository uses nonstandard branch names, check to see if your code hosting system allows you to assign the default branch for the repository. This branch is typically the most stable version of the project, with experimental work existing in other branches. Every project is a little different, though. Your project may use the master branch as the fire hose of new work and it might not be the most stable version of your software. Be explicit in your documentation.

To upload a local branch under a new name on the remote server, use the following syntax:

$ git push nickname branch_local:branch_remote

For example, if you wanted to upload your branch main to the remote repository official and rename it to master in the remote repository, you would use the following command:

$ git push official main:master

Your local repository should now be uploaded to the remote project repository with the desired branch names.
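The push variations in this section can be tried without a hosting account. This sketch assumes a local bare repository named official.git as the remote and a hypothetical second branch called experiment; it pushes main under the name master, pushes every local branch, and then asks the remote what it holds.

```shell
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

git init -q --bare "$work/official.git"    # stand-in for the hosted project
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/main      # the local branch is called main
git commit -q --allow-empty -m "Initial commit"
git branch experiment                      # a second local branch (hypothetical)
git remote add official "$work/official.git"

git push -q official main:master           # upload main, published as master
git push -q --all official                 # upload every local branch as-is
git ls-remote --heads official             # lists experiment, main, and master
```

Note that the rename happens only on the remote side: the local branch is still called main, and pushing with `--all` afterward also publishes it under its own name.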

Document the Project in a README

When you navigate to your project page, you will notice most code hosting systems will display the contents of the file README if one is present in your project. This file should be used to give people an overview of the project. If it is a development project with dependencies, those should be listed here. If there are installation instructions, those should be listed here as well (or a link should be provided to a more complete installation guide). If you would like people to contribute to the project, or report bugs to the project, those instructions should be listed here, too.

The following projects have excellent README files that clearly explain what the repository is about, how you can use the code within it, and how you can contribute to it:

Apply a License to Your Project

There is no single international copyright law. As a result, any project that does not include an explicit license is assumed to be fully copyrighted, and not intended for reuse. I openly admit that a number of my projects do not include licenses. This is usually because I simply haven’t made the decision of how I want others to use my work. (I’m typically producing training materials in environments where copyright ownership is more restricted than in code communities where open licensing is more prevalent.) The license for a given repository is typically located in a file named LICENSE or LICENSE.txt.

If your local repository didn’t already have a README file, now would be a good time to add one! Today, new projects tend to use Markdown format for the README file, and therefore rename the file to README.md to ensure the file is correctly formatted.

With the project uploaded and the instructions established, it is now time to start onboarding contributors to your project. The process you use in the remainder of this chapter should be added to your project repository as documentation. This will allow developers to have a copy locally, and will allow them easier access to the information instead of having to refer to an external wiki page.

Now that your project is in place, it’s time to flip the tables and look at things from a contributor’s perspective.

Setting Up the Developers

When you think about projects from a developer’s perspective, it’s not always entirely clear what the participation level is going to be. When it comes to publicly available projects, a developer might engage in three levels of participation:

  • Download a zipped package of the project, never to return to the project page again. This might be seen in true forks of a project where the downstream developers have no intention of checking back to see how the code has progressed. It might also be used for projects that are designed to be a starting point—where the intention is to hack up the code and modify the source significantly.

  • Clone the project repository with the intention of keeping the code up to date locally, but without the intention of making modifications. This could be true of any developer who is incorporating an open source library into his or her project. The developers might extend the library, and perhaps make little changes to the cloned library, but for the most part they are using the project code as is, relying on upstream developers for enhancements and security updates.

  • Clone the project repository with the intention of contributing work back. This will be true for open source project volunteers and staff, in-house developers on a software project, as well as staff at an agency who are contributing to a build for a particular project.

The main distinction between the latter two options is that a noncontributor will typically clone The Project directly, whereas a contributor will likely have a personal remote repository in addition to the project repository. The rationale for these choices was described in greater detail in Chapter 2.

Consumers Versus Contributors

Forward-thinking (intermediate to advanced) developers will always assume they are going to contribute back to a project at some point and create their own intermediate remote repository. Most novice developers, however, will aim to streamline their workflow where possible and omit the intermediate step of creating their own remote repository. This also means they perceive themselves only as consumers of, rather than potential contributors to, your project.

Once developers identify themselves as consumers or contributors (including primary maintainers), they will be ready to choose a method to download your project repository.

Consumers

Consumers have no intention to contribute back to a project. They don’t expect to have write access to the code base, and they can’t imagine a possible future where they would want to upload their changes to a project. This type of developer might download your repository in one of two ways:

  • As a zipped package.

  • As a clone of the repository directly from The Project page.

A zipped package has no connection back to The Project, and contains no history of the changes that have happened over time. A clone, on the other hand, maintains a connection to the project, and can be updated to the latest version by running a few Git commands. The structure to clone a remote repository is as follows:

$ git clone https://<hosting-url.com>/<project-owner's-name>/<project-name>.git

For example, if you wanted to download a copy of the project repository for the Git for Teams workshop, you would issue the following command:

$ git clone https://github.com/gitforteams/gitforteams.git

To update your local copy of the repository, first you would need to fetch the latest changes to The Project (for now, we’ll assume you have only one remote connection):

$ git fetch --all

Once you’ve fetched the changes, you can compare what’s changed in the latest version to what you have locally before choosing to update your local copy.

First, get a list of all branches in your repository:

$ git branch --all

You will see two groups of branches: your local branches and the remote tracking branches. The currently checked-out branch will be marked with *. My personal copy of the project repository cloned previously is as follows:

  gh-pages
* master
  video-lessons
  remotes/personal/gh-pages
  remotes/personal/master
  remotes/personal/video-lessons

This list shows three local branches as well as three branches connected to a remote that has been nicknamed personal.

For even more detail for each branch, use the parameter --verbose:

$ git branch --all --verbose

The output includes the commit message as well as the status for each branch compared to its remote repository:

  gh-pages                             629b54f Resolving merge conflict;  ...
* master                               2db982d Changes to "Undo" graphic: ...
  video-lessons                        7798eb1 [ahead 11] Lesson 00:      ...
  remotes/personal/gh-pages            629b54f Resolving merge conflict;  ...
  remotes/personal/master              2db982d Changes to "Undo" graphic  ...
  remotes/personal/video-lessons       653f875 Lesson 7: Added intro on   ...
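To see the fetch-then-list sequence on a small scale, the following sketch builds a throwaway stand-in for The Project, clones it, lets the original move ahead, and then refreshes the clone. It assumes the default remote nickname origin (rather than personal, as in the listing above), and the repository paths are hypothetical.

```shell
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

# A stand-in for The Project, with one commit on master.
git init -q "$work/project"
cd "$work/project"
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "First commit"

# The consumer clones it; the remote nickname defaults to origin.
git clone -q "$work/project" "$work/consumer"

# Meanwhile, The Project gains a commit and a new branch.
git commit -q --allow-empty -m "Second commit"
git branch video-lessons

# The consumer refreshes the remote-tracking branches, then lists everything.
cd "$work/consumer"
git fetch -q --all
git branch --all                 # local master plus remotes/origin/* entries
git branch --all --verbose       # master is reported as behind its remote
```

Fetching changes only the remote-tracking branches; the local master stays where it was until you explicitly update it.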

To see a history of the changes that have been added to the repository on the branch master, you can use the command log:

$ git log personal/master

To compare your local copy of a branch to what was just downloaded, you can add the parameter --patch to see the per-commit changes, or use the command diff to see a summary of all changes:

$ git log --patch personal/master
$ git diff master personal/master

This will show you all of the changes in patch format. Look for lines that have been added (marked with +), or deleted (marked with -). If you prefer to check out the code base as a whole, you can check out the branch tip:

$ git checkout personal/master

This will put you into a detached HEAD state. To return to the local copy of the master branch, check it out:

$ git checkout master

Once you’ve reviewed the changes, you can update your local copy of the master branch by rebasing to add the new changes:

$ git rebase personal/master

Using the command rebase provides a cleaner graphed history; however, if your team has opted to use merging, you can use the command merge to bring your local branch up to date:

$ git merge personal/master

If you have multiple local branches that you want to update, you will need to check out each one individually and then use this same procedure to incorporate the changes. This needs to be done one branch at a time because if there are conflicts between the two copies of the branch, Git needs to give you a working directory to resolve the conflicts.
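The review-then-update procedure can be rehearsed end to end with a throwaway repository. In this sketch the remote is called origin (the clone default) instead of personal, and the file notes.txt is a hypothetical example; the consumer has made no local commits, so the rebase simply moves master forward.

```shell
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

# Stand-in for The Project, cloned by a consumer, then updated upstream.
git init -q "$work/project"
cd "$work/project"
git symbolic-ref HEAD refs/heads/master
echo "one" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git clone -q "$work/project" "$work/consumer"
echo "two" >> notes.txt
git commit -q -a -m "Extend notes"

cd "$work/consumer"
git fetch -q origin
git log --oneline master..origin/master   # commits you do not have yet
git diff --stat master origin/master      # summary of what would change

git checkout -q origin/master             # look around; HEAD is now detached
git checkout -q master                    # back to the local branch

git rebase -q origin/master               # or: git merge origin/master
```

If the local branch had commits of its own, the rebase would replay them on top of the fetched work, whereas a merge would join the two histories with a merge commit.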

These few commands are the only ones that a consumer of a project will need to use. If, however, the developer makes a little change to her copy of the repository locally, and wants to contribute that change back to the project, she will be limited to submitting a patch, or requesting access as a developer (which is probably not appropriate to grant for one-off contributors). Although it is possible to submit patches, it is not preferred. (Yes, there are some projects that still use patches, including Git itself!) Instead, many projects have come to prefer pull requests. Originally used by GitHub, this term has become popular on other systems as well. A pull request is a meta feature—it is not something built into Git itself, but rather it is a feature of software that sits alongside Git. It provides a visual prompt for a project maintainer to incorporate a branch of work from a remote repository. The connection between the two repositories exists only for that one particular request; it is not a persistent connection like a developer would set from his or her local workstation to a remote repository.

Contributors

So you think you’re interested in contributing to a software project. Cool! (This is where, as the author of this book, I let out a huge sigh of relief. If you’ve made it this far into the book and weren’t interested in working on a software project, I’d feel really bad.) As a distributed version control system, Git is focused on what you can do locally. The built-in tools for direct collaboration on shared repositories are extremely coarse—either you have full write access to a project, or you have none. There are no per-branch permissions, and indeed, without the support of SSH, there’s no authentication system at all in Git. Git relies on wrapper software to provide the access control.

In order for wrapper software to make the connection between two repositories, it needs them to both be accessible from the same place. The easiest way to design for this is to have developers upload their changes to the same system that hosts The Project repository. GitHub, as well as every other web-based system, does this by having you create a clone, or a fork, of The Project, and upload your changes to the copied repository. Then, you use the wrapper software to request that your changes be pulled into The Project repository.

Using GitHub terms:

  1. An aspiring contributing developer (The Developer) forks The Project repository.

  2. The Developer then makes her proposed changes in her copy of The Project.

  3. When finished, The Developer initiates a pull request from a branch in her copy of the project to a branch in The Project repository.

  4. Using comments in GitHub’s web interface, a conversation will take place between The Developer and The Maintainer. Sometimes additional updates will be required by The Developer before The Maintainer is ready to accept the proposed changes into The Project.

  5. When the proposed changes are deemed worthy, The Maintainer will incorporate the pull request into The Project.
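The command-line side of steps 1 to 3 might look like the following sketch. It makes several assumptions: the fork is simulated with `git clone --bare` (on a real hosting service you would use its fork feature), the repositories live in a temporary directory, and the branch name fix-typo and the file fix.txt are hypothetical. The pull request itself, and the review in steps 4 and 5, happen in the hosting service's web interface and are not shown.

```shell
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

# Build a stand-in for The Project with one commit.
git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "Initial commit"
git clone -q --bare "$work/seed" "$work/official.git"

# Step 1: the fork, simulated here as a second bare copy.
git clone -q --bare "$work/official.git" "$work/fork.git"

# Step 2: The Developer clones her fork and works on a branch.
git clone -q "$work/fork.git" "$work/developer"
cd "$work/developer"
git remote add upstream "$work/official.git"   # optional link to The Project
git checkout -q -b fix-typo
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix typo"

# Step 3 starts here: publish the branch to the fork, then open the
# pull request from fork:fix-typo in the web interface.
git push -q origin fix-typo
```

The branch exists only on the fork at this point; The Project is unchanged until a Maintainer accepts the request.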

GitHub Does Not Require a Local Clone of The Project

GitHub now allows developers to make minor edits directly to files through a web interface; however, many developers will choose to clone their copy of The Project so they can work on it locally. Then, when they have completed their work, they will push their updates to their own copy of the project and initiate a pull request from their copy of the project to the main project repository.

The process for submitting a pull request will vary slightly depending on the wrapper software being used (e.g., GitHub, Bitbucket, GitLab, etc.); however, the basic process is covered in Part III.

Maintainers

A developer who has direct commit access to The Project repository is a special kind of developer, known as The Maintainer. Depending on how your team is structured, The Maintainers might be only those on the quality assurance team, or they may be handpicked developers from the community. For smaller internal projects, The Maintainers may be everyone who is working on the project.

In Chapter 2, you learned a little bit about project governance models. The way The Maintainer will interact with the project is a political, not technical, decision. Git doesn’t actually care how you structure your project, and so you will need to develop a system that works best for you. Defining the workflow for Consumers and Contributors is relatively easy because you aren’t really working with Git, but rather the workflow defined by the wrapper software (in the case of Consumers, they’re not even really working with Git at all).

If everyone on your team is a Maintainer (i.e., they are allowed to commit directly into the repository), it’s your choice as to whether you require developers to create a separate clone of the repository. The only limitation would be if your code hosting system does not have the capacity to accept incoming branches for merging from within a single repository. Check with your system of choice to see if it has a recommended workflow.

Generally I work with teams of fewer than 10 developers. Some of these teams I’ve worked with have opted for separate remote repositories for each developer, and some have allowed developers to commit their in-progress work directly to The Project repository. In the Drupal project, where there are thousands of developers, only a handful of people can commit into the main project repository; however, there are an additional 30,000 contributed modules, each with its own maintainers who have direct access to the project repository.

The Only Rules Are the Ones You Document

If there are no documented rules, your project will become anarchic, so write down the exact steps you would like people to follow when contributing to the project.

Project maintainers will need to have at least a clone of The Project repository locally. If you were the developer who started the project, you already have a local clone of this repository. If you aren’t, you will need to clone the repository using the following:

$ git clone https://<hosting-url.com>/<project-owner's-name>/<project-name>.git

You learned how to create a clone of a project repository as a team of one in Chapter 6 with the following command:

$ git clone https://gitlab.com/gitforteams/gitforteams.git

This will create a local copy of the repository, with the remote nickname origin.

If your project requires it, you may also need to create a clone of The Project on the code hosting system. This is covered in the previous section, or you may wish to follow the more detailed instructions available in Part III. Once you’ve created the remote clone, you can add this remote connection to your local repository. This will allow you to switch between the two from within the same directory. If you prefer, you can keep two local directories, but I personally enjoy the efficiency of not having to jump around as much. You are welcome to use your own naming conventions for the remotes. The syntax for adding a new remote is as follows:

$ git remote add nickname https://<hosting-url.com>/<your-name>/<project>.git

If I were to add my personal clone from GitLab, to follow the previous example, I would use the following command. Because this connection was being made to my personal copy of the repository, I would choose to use the nickname personal here:

$ git remote add personal https://gitlab.com/emmajane/gitforteams.git

To avoid confusion, I might also choose to rename the nickname for The Project remote from origin to official:

$ git remote rename origin official

These nicknames are completely arbitrary and are personal to your system. They will not be shared with others, so use whatever names make sense to you. Generally the convention is to use origin for the remote copy that most closely resembles your local work, and upstream for the copy of the repository that has the most new features being added by other developers that you might want to incorporate into your own work.

Once you’ve set up the remote connections to the project, and to your own personal copy of the repository, you should verify the names and URLs are what you are expecting:

$ git remote --verbose

In my case, the output is as follows:

official	git@gitlab.com:gitforteams/gitforteams.git (fetch)
official	git@gitlab.com:gitforteams/gitforteams.git (push)
personal	git@gitlab.com:emmajane/gitforteams.git (fetch)
personal	git@gitlab.com:emmajane/gitforteams.git (push)

You are now ready to work on your project as both a Contributor and a Maintainer.
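The remote setup above can be rehearsed without touching a real hosting system. The following is a minimal sketch, assuming two local bare repositories (project.git and personal.git, both hypothetical names) stand in for the hosted copies; the identity values are throwaway placeholders.

```shell
#!/bin/sh
set -e
# Throwaway identity so Git works on machines without a global config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"

# Two bare repositories stand in for the hosted copies
# (hypothetical local paths, used here instead of real hosting URLs).
git init -q --bare project.git
git init -q --bare personal.git

# Cloning The Project gives the first remote the nickname "origin".
# The warning about cloning an empty repository is expected, so hide it.
git clone -q project.git local 2>/dev/null
cd local

# Add the personal copy, then rename "origin" to "official".
git remote add personal "$work/personal.git"
git remote rename origin official

# Verify the names and URLs are what you expect.
remotes=$(git remote)
git remote --verbose
```

Because the nicknames live only in your local configuration, you can repeat this experiment with any names you like before settling on a convention for your team's documentation.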

Participating in Development

There are four main activities you will engage in when working with Git: working on new proposed changes, keeping your branches up to date, reviewing proposed changes, and publishing completed work. Inevitably, you will also need to work on resolving conflicts when you update your branches, or when you attempt to incorporate proposed changes into The Project.

Constructing the Perfect Commit

There are two basic approaches to commits: demonstrate the thinking process and present the final solution. When I’m programming in a language I’m not very familiar with, I think in small increments focusing on little pieces of the system at a time. As I work, I commit snapshots of my work as I get to critical points. These snapshots act as lifelines, allowing me to track how I thought through a problem. If you were to read my commit messages when I code, you would be able to easily unpack my thinking. Commits might represent units of work in increments as small as 15–30 minutes of effort. The commit messages are unlikely to explain why I’ve done something. The initial commit might include a docblock of code comments which outline what I’m about to do, the next commit might have the scaffolding for what I was about to build, and it would proceed from there. The commit messages would add very little value above and beyond what is shown in the diff for each commit.

When I’m working on a task I feel more confident about, I’m more likely to make radical changes to the working directory without those tiny lifeline commits. Then, when my work is finished, I’ll take a look at the overall changes, and shape smaller, relevant commits. This might be done by committing single changed files at a time, or perhaps I might make an even more granular commit using the --patch mode to add hunks of each file at a time to the staging area in preparation for a commit. These curated commits will be much more useful to me later if I need to dig through history using the command bisect. For example, in order to use a function, it must already be created somewhere, so I might choose to separate the creation and use of a function into two separate commits even if I wrote them at the same time.

I hesitate to refer to these two approaches as novice and advanced, but that phrasing does ring true. Different source control management systems will have different ways of presenting commits in the history of your project. Git is very granular in how it shows you the commit history, and as a result, thinking in tiny commit increments gets messy and frustrating to work with. This is why we say that as you mature with Git, you will be more likely to adopt the second approach.

You don’t need to give up your tiny commits though. You can use rebase to combine many little unpublished commits into a history that is more like the second version. Work the way you want to work, then reshape history so that it stores information in a useful way.

Rewriting History

Yes, I hate with a screaming passion that Git allows you to rewrite history, and then tells you how dangerous it is. To me it feels too much like arrogant history revisionism. But that’s the model that Git uses. To work effectively with Git, I set aside my frustrations and adopt the techniques that the original software set out as best practices. I’m not afraid of rebasing; I just don’t like that it exists to begin with. I give you permission not to like it either; however, not liking what it represents isn’t a valid reason for not using it. It’s deeply ingrained in the philosophy of how Git stores metadata about code’s history. Have a cookie, it’ll be okay.

If you accidentally do too much work between commits, you don’t need to forgo a granular commit history. Previously you learned to add individual files to the staging area. You can get even more granular, assigning edits within a single file to multiple commits. To add a partial change within a file, instead of the whole file, use the command git add --patch filename. This command will walk through your file, hunk by hunk, and ask you if you would like to include each group of changed lines in the commit you are building.

Rewriting History as It Happens

If you have a culture of showing work in progress on a centralized server, you will need to be careful in how you rebase your work. When a commit is rebased, the metadata for any commit object that is altered is assigned a new identifier. For example, if you are bringing a branch up to date, your local commits now have new parents and get a new ID. If you are trying to clean up the history of a branch, and you squash two commits, a new ID will be assigned to the resulting commit object even though the content is identical! This dual timeline can confuse Git and cause conflicts. To avoid these conflicts, limit your use of interactive rebasing to short-lived branches, such as ticket branches.
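The change of identifier is easy to observe for yourself. The sketch below is an illustration only: the branch name ticket and the file names are hypothetical, and everything happens in a temporary local repository. It records the commit ID of a piece of work before and after a rebase, and uses git patch-id to confirm that the change itself is the same.

```shell
#!/bin/sh
set -e
# Throwaway identity so commits work without a global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master   # name the first branch master

echo base > a.txt
git add a.txt
git commit -q -m "Base"

# Work on a short-lived branch (hypothetical name).
git checkout -q -b ticket
echo work > b.txt
git add b.txt
git commit -q -m "Ticket work"
before=$(git rev-parse HEAD)

# Meanwhile, master moves ahead.
git checkout -q master
echo more > c.txt
git add c.txt
git commit -q -m "Someone else's work"

# Bring the ticket branch up to date by rebasing.
git checkout -q ticket
git rebase -q master
after=$(git rev-parse HEAD)

# patch-id fingerprints the change itself, ignoring where it sits in history.
patch_before=$(git show "$before" | git patch-id | cut -d' ' -f1)
patch_after=$(git show "$after" | git patch-id | cut -d' ' -f1)

echo "commit ID before rebase: $before"
echo "commit ID after rebase:  $after"
```

If the first version of the commit had already been pushed to a shared server, your teammates would now hold a commit that no longer exists in your copy of the branch, which is the source of the dual timeline.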

Excellent commit objects have the following characteristics:

  • Contain only related code. No scope creep, no “just fixing white space issues too.”

  • Conform to coding standards for your project, including in-code documentation.

  • Are just the right size. Perhaps this is 100 lines of code. Or perhaps it’s a mega refactoring where a function name changed and 1,000 lines of code were affected.

  • Describe the work in the best-ever commit message (see the next section).

The best rule of thumb I’ve heard for commit messages is “Whatever it takes to make future me not get pissed off at past me for being lazy.”

Your commit messages should include:

  • A terse description (fewer than 60 characters) in a standard format to make it easy to scan logs.

  • A longer explanation of why the current code is problematic, and the rationale for why the change is important.

  • A high-level description of how the change addresses the issue at hand.

  • An outline of the potential side effects the change may have.

  • A summary of the changes made, so that reading the diff of the code confirms the commit message, but reading the diff is not guesswork on what/why something has changed.

  • A ticket number, or other reference to sources where discussion about the proposed change can/has/will happen.

  • Who will be affected by the change (e.g., an optimization for developers; a speed improvement for users).

  • A list of places where the documentation will need to be updated.
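One way to nudge contributors toward this checklist is a commit message template. The following is a sketch, not a prescribed format: the file name .gitmessage and the wording of the prompts are assumptions you should adapt to your project. The setting commit.template is a standard Git configuration option, and lines beginning with # are stripped from the message when the editor closes.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo

# Hypothetical template file; the prompts mirror the checklist above.
cat > .gitmessage <<'EOF'
[#ticket] Terse description in fewer than 60 characters

# Why is the current code a problem, and why does this change matter?

# How does the change address the issue (high level)?

# Possible side effects:

# Who is affected (developers, site visitors, ...)?

# Documentation that needs updating:

# Resolves #ticket
EOF

# Each contributor points Git at the template once per clone.
git config commit.template "$(pwd)/.gitmessage"

template=$(git config commit.template)
echo "git commit will now open the editor with: $template"
```

If a contributor saves the template without changing it, Git aborts the commit, which is a gentle reminder to write a real message.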

A bad commit message would be as follows:

$ git commit -am "rewrote entire site in angular.js - it's faster now, I'm sure"

This commit is insufficient for the following reasons:

  • By using the -a parameter, all files will be committed as part of this commit en masse, and without consideration of whether or not they should be included.

  • By using the -m flag, the tendency will always be to write only a terse message that does not describe why the change is necessary, and how the change addresses this necessary change.

  • The commit message does not reference a ticket number, so it’s impossible to know which issue(s) are now resolved and can be closed in the ticket tracker.

To compare, a good commit message would be as follows:

$ git commit

[#321] Stop clipping trainer meta-data on video nodes at small screen size.

- Removes an unnecessary overflow: hidden that was causing some clipping.

Resolves #321

This is a good message for the following reasons:

  • It includes the ticket number, in square brackets, at the beginning of the terse commit message, making it easier to read the logs later.

  • The terse description (for the short log view) explains the symptom that was seen by site visitors.

  • A detailed explanation explains the technical implementation that was used to resolve the problem.

  • The final line of the commit message (Resolves #321) will be captured by the ticketing system and move the ticket from open to needs review.

When making a proposed change, you should keep the proposal small, and focused on solving a single problem. This will make it easier for The Maintainer of the project to review your submission, and accept your work. For example, if you are fixing a specific bug in one part of the code base, don’t also fix an extra line ending you found elsewhere in the code. While projects likely have naming conventions for their branches, if you are donating a drive-by fix that doesn’t already have an identified issue in The Project repository, name your branch using a terse description of the problem you are solving—perhaps, for example, css_button_padding or improved_test_coverage (Example 7-3).

Example 7-3. Make a change to the code base
$ git checkout -b terse_description
(edit files)

$ git add filename(s)
$ git commit

At this point, the commit message editor will open and you will need to provide the best commit message you’ve ever written.

With the proposed change in place, you can now publish it to your copy of the repository using the command push:

$ git push

Your personal branch has been uploaded, so it is now time to work with a team member to have your changes incorporated into the main branch for the project.

Keeping Branches Up to Date

Branches stored in Git can generally be thought of as one of two things: official project branches or short-lived suggestion branches. Shared project branches are used to integrate reviewed and approved code from multiple developers and contain the official history of a project’s code. Your local copy of these branches should always be up to date and should always be used as the base branch for your ticket branches. By convention, it is not appropriate to write new commits to the local copy of an official branch. Instead, you would create a new branch, complete your work, and then merge that branch back into the official branch. Several branching strategies are discussed in Chapter 3—you may want to go back and review that chapter if your team doesn’t already have a branching strategy. The second type of branch is essentially a developer’s sandbox. This is where you test out new ideas and get your code ready for review. These short-lived work branches must also be kept up to date, but they need a slightly different approach.

Rebase Versus Merge…Again

There are still no rebasing police who are going to show up at your team meetings. You’ll need to figure out, as a team, how you’re going to tackle bringing branches up to date. (I still think you need to do whatever is best for your team, but I’m going to show you the instructions for rebasing so that you can see it’s not significantly more difficult to use this method.) Regardless of what you choose, document your solution carefully, and support those who are new to Git to ensure they are able to perform the commands consistently. The easiest way I’ve found to ensure this consistency is to provide copy/paste-friendly documentation, and have people work at the command line. Additionally, flowcharts can be quite effective.

To reduce the number of conflicts you need to deal with when bringing short-lived branches together, you should regularly update your working branch from the project branch you will eventually be merging into. How often is “regularly”? I recommend updating your branches at least as often as you drink coffee. If you don’t drink coffee, I would recommend you update your working branches at least daily using the commands in Example 7-4. Yes, this is going to seem tedious, but it can save you a lot of time in the long run to keep your work as up to date as possible.

Example 7-4. Update your local copy of this project’s branches
$ git checkout master
$ git pull --rebase=preserve

Git will update your local copy of the master branch to incorporate the changes from the upstream repository.

Once the project branches are up to date, you can now update your work branches. When you are bringing your work branches up to date, however, there will not be an upstream branch that you can pull your changes from like you used for the shared project branches. So how do you know if you should be merging or rebasing at this point? The rule of thumb is as follows: if you started your work right now would the change you’re about to incorporate into your work branch already be in place? If it’s a feature you wrote, it wouldn’t already be in the branch you’re bringing up to date and therefore you should merge the branch to incorporate the new work. If it’s a feature someone else wrote, you almost definitely want to rebase (if you are on Team Rebase). Another helpful tip is to match the names. If the changes you want to incorporate are coming from a branch with the same name, but on a different remote, you almost definitely want to rebase.

In Git, rebasing results in a linear timeline because it replays your commits onto the work that was done in a different branch (a fast-forward merge afterward keeps that timeline linear). As each commit is replayed, there is the potential for a merge conflict, which needs to be resolved. As a result, developers who are less confident in their ability to deal with a merge conflict will opt to simplify the process, and use the merge command to bring their work up to date. Using merge does make your historical record more difficult to read; it is, however, also technically less complicated because it generally involves fewer merge conflicts.

If you are working with a complicated code base and it is important to be able to run debugging tools quickly, you should spend the time to get a clean history by using the command rebase to bring your work branches up to date. If, however, it is more important for contributions to be as easy as possible, you may want to allow your developers to use the merge command to bring their work up to date. (The Gittiest of Git readers just gritted their teeth while reading that last bit. But you know what? There are no Git police who will show up at your door if your team decides they just want things to be easier. Promise. Insert picture of a honey badger not caring here, and let’s move on.)

The first thing you need to do when bringing your work branches up to date is to ensure your project branches are up to date. Keeping a shared branch up to date is typically done with the command pull (which uses the optional parameter --rebase). To bring your personal work branch up to date, you will need to remember the source branch where you initially branched from and copy the changes made to this branch over to your work branch. If you are following the GitFlow model described in Chapter 3, this will likely be the branch dev or development.

For example, if your work branch was named 2378-add-test and your source branch was named development, the commands would be as follows:

$ git checkout development
$ git pull --rebase=preserve
$ git checkout 2378-add-test
$ git rebase development

Each of the commits you have made in your work branch will now be reapplied as if the new commits from the branch development had always been in place. These commits may apply cleanly, or you may need to deal with merge conflicts. Because rebasing is the preferred method in Git for keeping a branch up to date, I will passive-aggressively omit giving you the commands for how to merge a branch. I am hopeful you will forgive me.
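If you want to confirm that a work branch really is up to date, you can ask Git how many commits it is missing from its source branch. The sketch below assumes a temporary local repository, so the pull step is replaced by a commit made directly on development to simulate a teammate's work; the branch names follow the example above.

```shell
#!/bin/sh
set -e
# Throwaway identity so commits work without a global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/development

echo base > a.txt
git add a.txt
git commit -q -m "Base"

# Start the work branch from development.
git checkout -q -b 2378-add-test
echo test > test.txt
git add test.txt
git commit -q -m "Add test"

# Simulate a teammate's work arriving on development
# (in a real project this would arrive via pull).
git checkout -q development
echo feature > feature.txt
git add feature.txt
git commit -q -m "A teammate's reviewed work"

git checkout -q 2378-add-test

# How many commits on development are missing from the work branch?
behind_before=$(git rev-list --count 2378-add-test..development)

git rebase -q development

behind_after=$(git rev-list --count 2378-add-test..development)

# Exit status 0 means development is fully contained in the work branch.
up_to_date=no
git merge-base --is-ancestor development 2378-add-test && up_to_date=yes

echo "behind before rebase: $behind_before, after: $behind_after"
```

A count of zero after the rebase tells you every commit from development is now part of your work branch's history.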

In addition to keeping your branches up to date, you should also remember to update your personal repositories whenever your own work is incorporated into The Project because its main branch will now contain new commits. This will be helpful when you are responsible for reviewing someone else’s work and merging it into the master branch. The commands you run are exactly as they were described previously:

$ git checkout master
$ git pull --rebase=preserve

Regardless of how you choose to keep your branches up to date, I hope you’ll at least try to incorporate rebasing into your workflow. As frustrating as it can be, it will help you to have a cleaner history if you need to use the debugging techniques described in Chapter 9.

Reviewing Work

In order to review someone else’s work, you must first get a local copy of that work into your own repository. This might be work that has already been incorporated into the official project branches, or it might be a new feature, or a bug fix that a colleague has asked you to review and merge into the main project.

Peer reviewing new work is a multistep process and is covered in greater detail in Chapter 8. The basic process is as follows:

  1. Add a remote connection to the relevant repository.

  2. Fetch the available branches for that repository.

  3. Create a local copy of any branch you want to examine in depth.

  4. Incorporate any changes from the other branch that you would like to adopt into your own work.

  5. Push the revised branch back to the relevant remote repository.

The first thing you will need to do is find the repository that holds the work you want to incorporate. To list each of the remote repositories, use the remote subcommand show (Example 7-5). Just like listing branches, all available remotes will be listed as the output to the command. In Example 7-5, the two remotes I added in the previous section are displayed. This gives me a quick reminder of which repository I want to look at in more depth.

Example 7-5. A terse list of remote repositories
$ git remote show

official
personal

Once you have the name of the repository, you can get a full listing for the remote by adding the name of the nickname to the previous command (Example 7-6).

Example 7-6. Full details about the remote repository, personal
$ git remote show personal

* remote personal
 Fetch URL: git@gitlab.com:emmajane/gitforteams.git
 Push  URL: git@gitlab.com:emmajane/gitforteams.git
 HEAD branch: master
 Remote branches:
   2-bad_jokes   tracked
   master        tracked
   sandbox       tracked
   video-lessons tracked
 Local branch configured for 'git pull':
   master merges with remote master
 Local ref configured for 'git push':
   master pushes to master (up to date)

Here I can see there are four branches stored in the remote repository, all of which I have a copy of locally (this is indicated by the word tracked).

Update Your Local List of Branches

If you already have a connection to the remote repository, and you don’t see the branch your partner has asked you to review, ensure the list of remote branches is up to date by first running the command git fetch.

If you don’t want the extra overhead of getting all the information about the remote repository, you can choose to show only remote branches by using the command branch and adding the parameter --remotes (Example 7-7). This will allow you to locate the branch with the work you need to review. I like using this variation for branch instead of the --all parameter because it gives the actual name of the branch, instead of adding on the reference information of remotes.

Example 7-7. Listing remote branches
$ git branch --remotes

Branches Group Commits

A branch is a line of development that links individual commit objects. Different instances of a branch may have commits made by different developers, and therefore repositories are not identical until they are synced. It’s basically anarchy, but limited to each little repository. The conventions we establish as software teams are what bring order to the chaos and allow us to share our work in a sane manner. Remember the branching strategies we learned in Chapter 3? They’ll keep the work sorted into logical thought streams. Remember the permission strategies from Chapter 2? They’ll keep people locked into the right repository, unable to make changes without the community gatekeeper’s help.

If you add the parameter --verbose to branch, the one-line commit message for the tip of the branch will be included in the output. For example, I had several active work branches, an integration branch, and the official branch for the project (Example 7-8). Although I uploaded my commits occasionally to the remote server, mostly I just worked in the chapter branches, incorporating my work into the integration branch, drafts, and then the main branch, master.

Example 7-8. Selected output from git branch --verbose while working on this chapter
  ch02   7313755 CH02: Adding patching workflow diagram.
  ch04   69a3ded CH4: Stub file added with notes copied from Drupalize.Me.
* ch05   80b5200 [official/ch05: ahead 2] CH05: Fixing URL for image 05fig01.
  drafts 80b5200 CH05: Fixing URL for image 05fig01.
  master 319bb53 [official/master] Merge branch 'drafts'. Updates for CH05.

The first column contains the branch name, the second column contains the commit ID, and the third column contains the first line of the most recent commit message. If the branch is tracked remotely, the name of the remote branch is included in square brackets between the commit ID and the commit message.

Once you’ve located the remote branch that contains the work you want to review, you can either copy the branch into your local repository (Example 7-9), or examine the reference to it with the commands log and diff (Example 7-10).

Example 7-9. Copy a remote branch into your local repository
$ git checkout --track remote_nickname/branch

Example 7-10. Examine a remote branch without creating a working copy
$ git log --oneline remote_nickname/branch
$ git diff current_branch...remote_nickname/branch

Assuming the work passes review, it’s time to merge it into the main project branch.

Merging Completed Work

Before merging the new work into your project branch, you will need to first ensure all branches are up to date. This is necessary because Git won’t allow you to push your copy of a remote repository if the destination branch (on the remote) contains commits which are not in your local copy.

When uploading new work to a remote server, Git will only accept work as a fast forward merge. This means you don’t have to worry about having a merge conflict when you push your work. Because of this restriction, your local branch needs to contain all of the remote commits before you can push your branch. To update your work, you will need to use the command pull to retrieve the changes from the remote server and incorporate any new work into your local branches.
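The fast-forward restriction can be rehearsed locally. This is a sketch under stated assumptions: a bare repository named hub.git plays the role of the remote server, and two clones (alice and bob, hypothetical names) play two developers. Bob's first push is refused because the remote has a commit he has not yet incorporated; after pulling, the push goes through.

```shell
#!/bin/sh
set -e
# Throwaway identity so commits work without a global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"

# A bare repository stands in for the remote server.
git init -q --bare hub.git
git --git-dir=hub.git symbolic-ref HEAD refs/heads/master

# Alice publishes the first commit.
git clone -q hub.git alice 2>/dev/null
cd alice
git symbolic-ref HEAD refs/heads/master
echo one > a.txt
git add a.txt
git commit -q -m "Alice: first commit"
git push -q -u origin master
cd ..

# Bob clones, then Alice publishes again before Bob pushes.
git clone -q hub.git bob
cd alice
echo two >> a.txt
git commit -q -am "Alice: second commit"
git push -q

cd ../bob
echo bob > b.txt
git add b.txt
git commit -q -m "Bob: add b.txt"

# Bob's branch is missing Alice's second commit, so the push is refused.
if git push -q 2>/dev/null; then first_push=accepted; else first_push=rejected; fi

# Incorporate the remote commits, then try again.
git pull -q --rebase
if git push -q 2>/dev/null; then second_push=accepted; else second_push=rejected; fi

bob_tip=$(git rev-parse HEAD)
hub_tip=$(git --git-dir="$work/hub.git" rev-parse master)
echo "first push: $first_push, second push: $second_push"
```

The refusal is Git protecting the shared history, not a sign that anything is broken in your local repository.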

First, update your local copy of the destination branch (Example 7-11) by using the command pull with the parameter --rebase.

Example 7-11. Incorporate updates from a project branch
$ git checkout master
$ git pull --rebase=preserve

Once the public branch is up to date, you will need to bring the feature branch up to date as well (Example 7-12).

Example 7-12. Bring the ticket branch up to date with the public project branch
$ git checkout 2378-add-test
$ git rebase master

Finally, you can merge the ticket branch into the main project branch (Example 7-13).

Example 7-13. Merge the completed ticket branch into the public project branch
$ git checkout master
$ git merge --no-ff 2378-add-test

If the changes being introduced do not overlap with previous work that had been completed, the merge will now be complete; however, if there was overlapping work in the same area, Git will not know how to complete the merge and will ask for your guidance. The language is a little scary: in Git terminology, this request for help is known as a merge conflict.

Resolving Merge and Rebase Conflicts

Conflict sounds hard and scary, but in Git, a merge conflict is actually a very small problem and you won’t need to spend a lot of money on a mediator or a therapist to resolve it. Any time a file is changed in exactly the same place, Git can be unsure of which version is the correct version, so it will ask you to make that decision. Git refers to this uncertainty as a conflict.

When you bring together two branches, there is always a chance that you will have changes in both our and their version of the code on the exact same lines within a file.

Git will add three lines into any file that has lines with conflicting information at exactly the same point:

<<<<<<<
=======
>>>>>>>

These markers enclose our code and their code, separated by a dividing row of =. To resolve a conflict you will need to edit the files, select the appropriate content to keep, and remove the markers. When you open the file to examine the conflict, look at the surrounding areas as well. Sometimes Git will have misjudged where to put the markers, so you shouldn’t just delete one whole section, or the other whole section. Read carefully, and you may find you need to take a little bit from each when you look at the surrounding code:

<<<<<<< HEAD
    $p++;
}
=======
}

>>>>>>> 2378-add-test

We don’t have enough information to resolve this merge conflict without understanding what the code update is trying to accomplish. Probably the end brace should be kept because it’s in both sides of the conflict, but what about the new line? And what about the increment of the variable? If you run into merge conflicts you are not sure how to resolve, you should talk to the author of the original code if you cannot figure it out just from reading the code itself. Misunderstanding the code and deleting too much (or too little) may end up unintentionally adding new bugs to the code if you resolve the conflict incorrectly.

Resolving Merges Step by Step from Very Divergent Branches

There is a complementary program, git-imerge, which works to merge the commits leading up to the tip of the two branches you are attempting to merge. Working with the incremental commits can make it easier to see how the conflict should be resolved because there is less to compare at each point. This is not part of Git core, and you will need to download and install the software separately. Check your favorite package manager if you want to reduce the install hassle. I installed my copy via OS X’s Brew.

When your edits are complete, you can remove the markers Git placed into the file and continue using the on-screen instructions which Git provides in its status message:

$ git status

If you were completing a merge, you will need to add the updated files and commit them to your repository:

$ git add filename(s)

By adding the files one at a time, you can use the status command as a TODO list of files with outstanding merge conflicts that need to be resolved:

$ git status

Once all the merge conflicts have been cleaned up in each of the files, you can commit your staged changes:

$ git commit

At this point, the default text editor for Git will open with additional information about the commit you are completing. When you have finished writing your message, save the changes and quit the editor to resume.
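The whole merge resolution cycle can be practiced in a scratch repository. The sketch below is an illustration with invented content: the file button.css and its values are hypothetical, the conflict is resolved by overwriting the file rather than editing it by hand, and --no-edit accepts the default merge message so the script can run unattended.

```shell
#!/bin/sh
set -e
# Throwaway identity so commits work without a global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master

echo "padding: 4px;" > button.css
git add button.css
git commit -q -m "Add button styles"

# Both branches change the same line.
git checkout -q -b 2378-add-test
echo "padding: 8px;" > button.css
git commit -q -am "Increase padding"

git checkout -q master
echo "padding: 6px;" > button.css
git commit -q -am "Tweak padding"

# The merge stops and asks for guidance.
git merge 2378-add-test > /dev/null 2>&1 || true

# Your TODO list: files with unresolved conflicts.
conflicted=$(git diff --name-only --diff-filter=U)
markers=$(grep -c -E '^(<<<<<<<|=======|>>>>>>>)' button.css)

# Resolve: keep the content you want and remove the markers.
echo "padding: 8px;" > button.css
git add button.css
remaining=$(git diff --name-only --diff-filter=U)

# Conclude the merge with the default message.
git commit -q --no-edit
parents=$(git rev-list --parents -n 1 HEAD | wc -w)

echo "conflicted: $conflicted, marker lines found: $markers"
```

The final count includes the merge commit itself plus its two parents, which confirms the merge was recorded rather than abandoned.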

If you were attempting a rebase when the merge conflict occurred, you may be in the middle of a multistep process. In this case, you’ll need to proceed with the rebasing procedure:

$ git rebase --continue

If, before starting the merge, you know without a doubt that you will always want to use either the incoming work (theirs) or your own work (ours), you can preemptively instruct Git on how you want to address the proposed changes from the two branches. For example, if you wanted to merge in a branch that you knew contained fixes for the problem you were having, you could force Git to use the other branch when making its updates to your own branch:

$ git checkout branch_to_update
$ git merge --strategy-option=theirs incoming_branch
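It is worth knowing that this option only decides the parts of a file where the two branches actually collide; changes that do not overlap are still combined from both sides. The sketch below demonstrates this in a scratch repository. The file names and contents are invented for illustration, and --no-edit is added so the merge message does not open an editor.

```shell
#!/bin/sh
set -e
# Throwaway identity so commits work without a global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master

printf 'padding: 4px;\n' > button.css
printf 'Project notes\n' > README.txt
git add button.css README.txt
git commit -q -m "Initial files"

# The incoming branch contains the fix we trust.
git checkout -q -b incoming_branch
printf 'padding: 8px;\n' > button.css
git commit -q -am "Fix button padding"

# Our branch changes the same line, plus an unrelated file.
git checkout -q -b branch_to_update master
printf 'padding: 6px;\n' > button.css
printf 'Local note\n' >> README.txt
git commit -q -am "Local experiments"

# Conflicting lines are taken from the incoming branch automatically.
git merge -q --no-edit --strategy-option=theirs incoming_branch

css=$(cat button.css)
readme_lines=$(wc -l < README.txt)
echo "button.css now contains: $css"
```

The conflicting padding value comes from the incoming branch, while the unrelated README change made on our branch survives the merge.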

Publishing Work

The first time you upload your changes for a given branch, you will need to specify the remote repository that you want to use, as well as the branch name. The convention is to keep the branch names the same on the local and remote repositories. You will need to include the nickname for the remote repository. In Example 7-14, it is assumed the name of the remote is origin.

Example 7-14. Upload your branch with the proposed changes to your remote repository
$ git push --set-upstream origin branch

Once you’ve set up the branch for the remote repository, you can upload your work to the same remote again using the command push:

$ git push

If you have multiple remotes set for your repository, you will need to explicitly push to each of the remote repositories separately. By default, origin is used:

$ git push remote_nickname
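To check which remote branch a plain git push will use, you can ask Git for the upstream of your current branch. The following is a minimal sketch, assuming a local bare repository (hub.git, a hypothetical name) in place of a hosted remote and a branch called css_button_padding.

```shell
#!/bin/sh
set -e
# Throwaway identity so commits work without a global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"

# A bare repository stands in for the hosted remote.
git init -q --bare hub.git
git clone -q hub.git local 2>/dev/null
cd local
git symbolic-ref HEAD refs/heads/master

echo base > a.txt
git add a.txt
git commit -q -m "Base"
git push -q --set-upstream origin master

# First upload of a new branch: name the remote and the branch.
git checkout -q -b css_button_padding
echo "padding: 8px;" > button.css
git add button.css
git commit -q -m "Adjust button padding"
git push -q --set-upstream origin css_button_padding

# Ask Git where this branch now uploads to.
upstream=$(git rev-parse --abbrev-ref '@{upstream}')

# Later uploads need no extra arguments.
echo "margin: 2px;" >> button.css
git commit -q -am "Adjust button margin"
git push -q

# Commits that exist locally but not on the remote branch.
unpushed=$(git rev-list --count '@{upstream}..HEAD')
echo "upstream: $upstream, unpushed commits: $unpushed"
```

An unpushed count of zero means everything on your local branch has reached the remote copy.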

The next part of the procedure will depend on the hosting system you’re using. Generally, though, you navigate to The Project page where you will locate a link for pull requests (the language may be slightly different on your system of choice). From this link you should be able to initiate a request to have your proposed updates included in the project. The system should already know which of your repositories was cloned from The Project, and it should include a list of all the branches you’ve worked on in your copy that might include proposed changes for The Project. You’ll select the branch you want to submit for inclusion and walk through any additional steps necessary. This process is covered in depth in Part III.

Once your pull request has been submitted, The Maintainer will review your proposed update. He may accept your work as is, or request changes and ask you to resubmit your work. If additional changes are needed, repeat the steps outlined in this section until the pull request is accepted.

To publish new work into a shared branch, the first thing you should do is check that the branch you are going to be merging into is up to date. This will ensure you can push your work after merging your changes. If the branch isn’t up to date, you will not be able to upload the revised copy of the shared branch until you have downloaded the new updates and incorporated them into the branch:

$ git checkout master
$ git pull --rebase=preserve

Once your local copy of the main project branch is up to date, you should ensure these changes are also copied into the feature branch you have been working on so that there is the smallest amount of difference between the two branches before the merge is performed:

$ git checkout 2378-add-test
$ git rebase master

Once the working branch is up to date, you are ready to merge in the reviewed and accepted changes:

$ git checkout master
$ git merge --no-ff 2378-add-test
$ git push

The work branch can now be deleted from your local repository and any remote repositories you have write access to:

$ git branch --delete 2378-add-test
$ git push remote_nickname --delete 2378-add-test

Your branches should now be up to date and ready for your teammates to download.

What happens next will vary greatly depending on the type of software you are building. Web developers who want to connect Git with a continuous integration build server may benefit from watching Lorna Mitchell’s videos Git Fundamentals for Web Developers (O’Reilly).

Sample Workflows

The remainder of this chapter serves as a template for working with teams. You should discuss with your team how they would like to work, and write down the commands each contributor and maintainer will need to use during the project.

Sprint-Based Workflow

This process is more or less what I’ve used for several teams working in a sprint-based release cycle. It is a variation on GitFlow and it works well for weekly website deployments. The schedule for the sprint follows a weekly routine (as opposed to the more “traditional” two-week sprint). This encourages granular tickets and helps the developers see their work in production as fast as possible. Some tickets will take several “sprints” to complete if they are larger in scope.

The repository is set up with five different types of branches: development, ticket, qa, master, and hotfix (Table 7-1). These branches are used either as single-issue development branches, or as integration branches.

Table 7-1. Branch types in a weekly deployment workflow
| Branch name / convention | Type of branch | Description | Branched from |
| --- | --- | --- | --- |
| dev | Integration | Used to collate peer-reviewed code | ticket branches |
| ticket#-descriptive-name | Development | Used to complete work identified in tickets | dev |
| qa | Integration | Used for quality assurance testing at the end of each sprint; code that does not pass QA testing is removed from the branch | dev |
| master | Integration | Used to deploy fully tested code | qa |
| hotfix-ticket#-description | Development | Used to develop solutions for urgent problems identified on production | latest release tag on master |

For the developers, every day is a development day. In addition, there are three days in the week when all team members rally toward the same goal.

The workflow is not overly complex (Example 7-15) for developers: all work begins on a fresh ticket branch from the parent branch dev. Once completed, the work in a ticket branch is pushed up to the shared project repository. Branches are kept up-to-date through rebasing, which allows for a cleaner branch history than merging.

Example 7-15. Git commands to work on tickets

In this example, substitute origin for the name of your remote, and 1234-new_ticket_branch for the name of your ticket branch:

$ git checkout dev
$ git pull --rebase=preserve origin dev
$ git checkout -b 1234-new_ticket_branch
# do work
$ git add --all
$ git commit

Before sharing the work, ensure the branch contains any new commits:

$ git checkout dev
$ git pull --rebase=preserve
$ git checkout 1234-new_ticket_branch
$ git rebase dev

Finally, share the new work with others:

$ git push origin 1234-new_ticket_branch

Once completed, a ticket branch is reviewed by another person on the team (Example 7-16). If the code passes review, the reviewer merges the ticket branch into the development branch and removes the ticket branch from the main repository. The review process is covered in depth in Chapter 8.

Example 7-16. Git commands to complete a peer review
$ git checkout dev
$ git pull --rebase=preserve
$ git checkout 1234-new_ticket_branch
# review process goes here
$ git checkout dev
$ git merge --no-ff 1234-new_ticket_branch
$ git push origin dev
$ git branch --delete 1234-new_ticket_branch
$ git push --delete origin 1234-new_ticket_branch

Quality Assurance (Monday–Tuesday):

  • Automated test suite is run on dev to catch any regressions that may have snuck in while feature branches were being added up to this point.

  • All work in the branch dev is merged into the branch qa for testing (Example 7-17). Development work continues in the branch dev.

  • A sprint checklist is created in a shared document, such as Google Docs, by copying and pasting the user stories from the tickets that were merged into the qa branch. Typically, this is the first line of the ticket description—a convention that should be adopted to make the QA process faster.

  • All team members are responsible for running through the list of tickets to be tested in the shared document. In addition to the weekly tickets, there may be rolling tests that need to be completed by a person.

  • Anything that fails quality assurance has a new ticket created so that it can be fixed, or reverted, prior to release (Example 7-18).
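
If every ticket branch is merged into dev with --no-ff, the merge commits themselves give you a starting list for the sprint checklist. The following is a rough sketch in a scratch repository, assuming it is run just before the merge in Example 7-17; the ticket names are made up, and the user stories would still need to be copied from your ticketing system.

```shell
#!/bin/sh
# Scratch-repository sketch: list the ticket merges that are on dev but not yet on qa.
# Branch and ticket names are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Example Dev"
git config user.email "dev@example.com"

# One shared starting point for dev and qa.
git symbolic-ref HEAD refs/heads/dev
echo "base" > base.txt
git add base.txt
git commit --quiet -m "Initial commit"
git branch qa

# Two reviewed tickets are merged into dev during the week.
for ticket in 101-login-form 102-search-box; do
    git checkout --quiet -b "$ticket" dev
    echo "$ticket" > "$ticket.txt"
    git add "$ticket.txt"
    git commit --quiet -m "$ticket: do the work"
    git checkout --quiet dev
    git merge --quiet --no-ff -m "Merge $ticket" "$ticket"
done

# Merge commits on dev's own line of history that qa does not have yet, oldest first.
checklist=$(git log --merges --first-parent --reverse --format=%s qa..dev)
echo "$checklist"
```

The range qa..dev limits the list to work that has not been through QA, so the same command can be reused each week.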

Example 7-17. Commands to set up the qa branch
$ git checkout dev
$ git pull --rebase=preserve
$ git checkout qa
$ git merge --no-ff dev
$ git push

Example 7-18. Commands to remove tickets that have failed to pass QA in time for release
$ git log --oneline --grep ticket-number
(locate the commits that need to be reversed)

$ git revert commit
(reverses a single, ordinary commit)

$ git revert --mainline 1 merge_commit
(reverses a merge commit; ideally, you are merging work branches with --no-ff, which forces
a merge commit that can be easily undone)
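
To see why a --no-ff merge is easy to undo, it can help to try the revert in a scratch repository first. The following is a minimal sketch; the branch name, file names, and identity are made up for illustration.

```shell
#!/bin/sh
# Scratch-repository sketch: revert a ticket that was merged with --no-ff.
# All names below are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Baseline commit on the integration branch.
git symbolic-ref HEAD refs/heads/qa
echo "base" > base.txt
git add base.txt
git commit --quiet -m "Initial commit"

# Ticket work happens on its own branch.
git checkout --quiet -b 1234-broken-feature
echo "feature" > feature.txt
git add feature.txt
git commit --quiet -m "1234: add feature"

# Merge with --no-ff so the whole ticket is represented by one merge commit.
git checkout --quiet qa
git merge --quiet --no-ff -m "Merge 1234-broken-feature" 1234-broken-feature

# Locate the merge commit by ticket number, then reverse it.
# --mainline 1 says "keep the first parent's side", which is qa here.
merge_commit=$(git log --merges --grep "1234" --format=%H -n 1)
git revert --no-edit --mainline 1 "$merge_commit" >/dev/null

# Only the baseline file is left; the merge and its revert both stay in the history.
remaining_files=$(git ls-files)
echo "$remaining_files"
```

The revert adds a new commit rather than rewriting history, so it is safe to use on a shared branch.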

Release Day (Wednesday):

  • The branch qa is merged into the branch master and tagged (Example 7-19).

  • From the live site, the repository is updated to use the tagged commit for release.

  • The work for the next week is prioritized with the development team.

Example 7-19. Commands to prepare for deployment
$ git checkout master
$ git merge qa
$ git tag
(locate the latest tag so that you can determine the next tag's number)

$ git tag --annotate -m "tag message" tag_name
$ git push --tags

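When the tag is added, the --annotate parameter creates an annotated tag (a full object that records the tagger, the date, and a message), and the message itself is supplied with the -m parameter. This ensures the tag will not be ignored by commands such as git describe, which skip lightweight tags by default. Note that --annotate does not sign the tag; signing requires the --sign parameter and a GPG key.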
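
The practical difference between a lightweight tag and an annotated tag is easiest to see side by side. This is a small sketch in a scratch repository; the tag names and message are made up for illustration.

```shell
#!/bin/sh
# Scratch-repository sketch: lightweight versus annotated tags.
# Tag names and messages are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "release" > site.txt
git add site.txt
git commit --quiet -m "Release candidate"

# A lightweight tag is only a name that points at a commit.
git tag lightweight-1.4.0

# With only a lightweight tag present, git describe finds nothing to report.
if git describe >/dev/null 2>&1; then
    lightweight_seen=yes
else
    lightweight_seen=no
fi

# An annotated tag is its own object with a tagger, a date, and a message.
git tag --annotate -m "Sprint week 4 release" 1.4.0
described=$(git describe)

lightweight_type=$(git cat-file -t lightweight-1.4.0)
annotated_type=$(git cat-file -t 1.4.0)

echo "describe saw the lightweight tag: $lightweight_seen"
echo "describe after annotating: $described"
echo "object types: $lightweight_type (lightweight), $annotated_type (annotated)"
```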

Announcement Day (Thursday):

  • A public announcement is made to the community of users about the changes that were launched on the previous day. The extra day gives the team a chance to deal with any unexpected regressions, or bugs, when the code was moved to the production environment.

  • Development continues on the new list of priorities established on the previous day.

In the unlikely event that a serious bug or regression is introduced to the production environment, a hotfix is completed. Serious is, of course, a relative term. In this system, deployments are made weekly, so a hotfix, generally speaking, is an update that cannot wait a week to be deployed.

Each deployment is tagged as such, so the first step is to get a list of all tags and locate the current live version of the code base (Example 7-20). A new branch is created from this point, the updated code is applied, and then uploaded for review before deployment.

Example 7-20. Commands to create a hotfix branch
$ git checkout master
$ git tag
(review list of tags to determine the currently live tag)

$ git checkout -b hotfix-issue-description tag_name
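
If your tags follow a numeric pattern, you can ask Git to sort them rather than reading the list by eye. The following is a sketch in a scratch repository, assuming the highest-numbered tag is the one that is currently live; the tag values are made up for illustration.

```shell
#!/bin/sh
# Scratch-repository sketch: find the newest release tag and branch a hotfix from it.
# Assumes the highest-numbered tag is the live release; tag values are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Example Dev"
git config user.email "dev@example.com"
git symbolic-ref HEAD refs/heads/master

# Three tagged releases; note that 1.10.0 is newer than 1.9.0.
for tag in 1.9.0 1.10.0 1.10.1; do
    echo "$tag" > release.txt
    git add release.txt
    git commit --quiet -m "Release $tag"
    git tag --annotate -m "Release $tag" "$tag"
done

# Sort tags as version numbers, newest first, and keep the first one.
latest_tag=$(git tag --list --sort=-version:refname | head -n 1)
echo "Latest tag: $latest_tag"

# Start the hotfix branch from that tag.
git checkout --quiet -b hotfix-issue-description "$latest_tag"
current_branch=$(git rev-parse --abbrev-ref HEAD)
echo "Now on: $current_branch"
```

Sorting by version:refname matters because a plain alphabetical sort would place 1.9.0 after 1.10.1.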

The hotfix branch would then be worked on as if it were a regular development branch, undergoing a peer review and quality assurance test. When it passes testing, it would then be immediately incorporated back into the master branch and tagged for deployment (Example 7-21).

Example 7-21. Commands to prepare a hotfix for deployment
$ git checkout master
$ git merge --no-ff hotfix-issue-description
$ git tag --annotate -m "tag message" new_tag_name
$ git push --tags

In this system, semantic versioning is not used. Instead, tag names are incremented using the format <launch_version>.<sprint_week>.<hotfix>. For example, 1.4.3 would be used to represent the third hotfix on the fourth week of development (in other words: a bad week for the team!).
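
Working out the next tag name by hand is easy to get wrong on a busy release day. The function below is a hypothetical helper, not part of Git, sketched under the assumption that each new sprint release resets the hotfix counter to 0.

```shell
#!/bin/sh
# Hypothetical helper: compute the next tag in <launch_version>.<sprint_week>.<hotfix> form.
# Assumption (not stated in the text): a new sprint release resets the hotfix counter to 0.
next_tag() {
    current=$1   # the latest existing tag, for example 1.4.2
    kind=$2      # "sprint" for a weekly release, "hotfix" for an urgent fix

    # Split the tag on dots into its three numeric fields.
    old_ifs=$IFS
    IFS=.
    set -- $current
    IFS=$old_ifs
    launch=$1
    sprint=$2
    hotfix=$3

    case $kind in
        sprint) echo "$launch.$((sprint + 1)).0" ;;
        hotfix) echo "$launch.$sprint.$((hotfix + 1))" ;;
        *) echo "usage: next_tag <tag> sprint|hotfix" >&2; return 1 ;;
    esac
}

next_tag 1.4.2 hotfix   # prints 1.4.3
next_tag 1.4.3 sprint   # prints 1.5.0
```

The result can then be passed as the tag name when you create the annotated tag.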

Trusted Developers with No Peer Review

While writing this book, I worked with the O’Reilly automated build tool, Atlas. This system also has a web-based GUI that allows editors to work on book files directly. Saved files are immediately committed to the master branch. Due to the GUI, there is no peer review process because anyone on my team is able to make edits directly to a file. My preference, however, is to work locally, and not through a web GUI. I had been keeping the branch overhead low locally and had just been working in master as well. It only took me one local merge conflict to alter the way I was working locally.

When I wanted to update my work, I would use the command fetch to see if any changes had been made by my editors. With the fetch completed, I would compare my copy of the master branch with their copy of the master branch (origin/master). Assuming I agreed with all their edits, I would merge in their copy of the branch. If I disagreed, I would merge in their branch with the option ours, which keeps my version wherever our edits conflict while letting Git treat the two branches as up to date:

$ git checkout master
$ git fetch origin
$ git diff origin/master

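Depending on whether or not I wanted to keep the changes, I would merge the work in one of three ways: combine all work, prefer my work wherever the two versions conflicted, or prefer their work wherever the two versions conflicted.

To combine all work (true merge):

$ git merge origin/master

To keep my own work wherever the two versions conflict (edits that do not conflict are still merged from both sides; the separate strategy -s ours would discard the other side's changes entirely):

$ git merge -X ours origin/master

To discard my own work in favor of the reviewer’s wherever the two versions conflict: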

$ git merge -X theirs origin/master
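
The option -X ours and the strategy -s ours are easy to confuse, so it is worth trying both in a scratch repository. The following is a minimal sketch; the branch named editor stands in for origin/master, and the file names and contents are made up for illustration.

```shell
#!/bin/sh
# Scratch-repository sketch: compare "git merge -X ours" with "git merge -s ours".
# The "editor" branch stands in for origin/master; all names are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Example Author"
git config user.email "author@example.com"

git symbolic-ref HEAD refs/heads/master
echo "original sentence" > ch01.txt
git add ch01.txt
git commit --quiet -m "Base"

# The editor rewrites the same line (a conflict) and adds a new file (no conflict).
git checkout --quiet -b editor
echo "editor's sentence" > ch01.txt
echo "editor's note" > notes.txt
git add ch01.txt notes.txt
git commit --quiet -m "Editor changes"

# Meanwhile I rewrite the same line on master.
git checkout --quiet master
echo "my sentence" > ch01.txt
git commit --quiet -am "My changes"
before_merge=$(git rev-parse HEAD)

# -X ours: conflicts go my way, but the editor's non-conflicting file still arrives.
git merge --quiet -X ours -m "Merge, preferring my side on conflicts" editor
x_ours_text=$(cat ch01.txt)
x_ours_has_notes=$( [ -f notes.txt ] && echo yes || echo no )

# Rewind, then try -s ours: the editor's side is recorded as merged but contributes nothing.
git reset --quiet --hard "$before_merge"
git merge --quiet -s ours -m "Merge, discarding the editor's side" editor
s_ours_text=$(cat ch01.txt)
s_ours_has_notes=$( [ -f notes.txt ] && echo yes || echo no )

echo "-X ours: ch01='$x_ours_text', notes.txt present: $x_ours_has_notes"
echo "-s ours: ch01='$s_ours_text', notes.txt present: $s_ours_has_notes"
```

In both cases Git considers the editor's branch fully merged afterward, so choose between them based on whether the non-conflicting edits should survive.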

This can be done on a per-commit basis, or if there is a merge conflict, it can be done on a very granular change-by-change basis with a merge tool. (It feels a bit passive-aggressive to be throwing stuff out, but really it’s just the limitation of a single branch system where you don’t have the ability to talk about the proposed changes in a separate branch.) Depending on the granularity of the commits, I might also choose to cherry-pick some commits to keep them, while discarding other commits. Cherry-picking commits was covered in Chapter 6.

Finally, I would upload the new version of the book to the repository, and update my local working branch drafts:

$ git push origin master
$ git checkout drafts
$ git rebase master

Then I started getting reviews as marked-up PDFs and realized, once again, I had another way that I wanted to separate work. I wanted to be able to write a chapter and keep those commits nice and tidy, but sometimes I was mid-chapter when an edit came in that I wanted to address immediately. Instead of intermingling these commits I set up the following structure for my branches: master, drafts, and one branch per chapter:

$ git checkout ch04
# write chapter
$ git add ch04.asciidoc
$ git commit
$ git checkout drafts
$ git merge ch04

The branch drafts gave me a place to integrate all of the work that I’d been doing. It was kept up to date by merging in chapters as they were completed, or by rebasing it onto the master branch if changes had been made by one of my editors. When I was first writing chapters on my own, without others contributing, multiple branches would have been a lot of overhead to maintain, but as more contributors started offering different kinds of contributions, more granularity in branches allowed me to pick and choose how I wanted the manuscript to progress.

Untrusted Developers with Independent Quality Assurance

If your team is mostly trusted developers, but you have a few contractors as well, you might want to have your contractors working in a fork of the repository, instead of giving them write access to the main project. For some types of software, this split might even be a requirement for your own staff. For example, if you were working on firmware for a medical device, you might have very strict government regulations you need to follow on who is allowed to check in work, and how that work must be reviewed before it is added to a repository.

This model is the same as what was described for Contributors (as opposed to Maintainers) earlier in this chapter.

A second example was given in the description of the forking strategy in Chapter 2, where I described how I offered a patch back to the reveal.js project. To do this, I made a fork of the project, and then cloned the project so that I could edit the files at my workstation. I then reversed the chain to send my changes back to the original project: first a push to upload my work, and then a pull request to submit my work for review.

Based on your reading to date, put together the commands that would be necessary for these workflows. Hint: there’s nothing here that you haven’t read about already in this chapter. Start by drawing yourself a diagram, then add arrows to show the progression of work through the process, and finally, add the Git commands for each of the arrows.

Summary

To work on a new project, you must first decide on the governance structure for the project. This will inform whether or not developers need to create a remote clone of the project, or just a local clone of the project. The way Consumers, Contributors, and Maintainers set up their access to the project may prevent them from doing some tasks; however, by adding remote repository connections, you can easily promote a Developer into a Maintainer.
