Chapter 10. Builds Don’t Have To Be Slow and Unreliable

Jenn Strater

A while back, I was working at an early-stage start-up where the codebase and development team were growing every day. As we added more and more tests, the builds were taking longer and longer to run. I started to notice it at around the eight-minute mark, which is why I remember that specific number; from there, build times nearly doubled. At first, it was kind of nice. I would kick off a build, go grab a coffee, and chat with coworkers on other teams. But after a few months, it became irritating. I'd had enough coffee and I knew what everyone was working on, so I would check Twitter or help other developers on my team while waiting for my builds to finish. I would then have to context switch when I went back to my own work.

The build was also unreliable. As is normal for any software project, we had a number of flaky tests. The first, albeit naive, solution was to turn off the failing tests (e.g., by marking them with @Ignore). Eventually, it got to the point where it was easier to push changes and rely on the continuous integration (CI) server than to run the tests locally. The problem with this tactic was that it moved the problem further down the line. If a test failed at the CI stage, it took much longer to debug. And if a flaky test passed initially and only started failing after merging, it blocked the entire team until we determined whether the failure was a legitimate issue.
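For a JUnit 4 suite, that naive fix looked something like the sketch below. The test class and scenario are hypothetical, but the pattern is the one I mean: the flaky test gets an @Ignore annotation and simply stops running.

```kotlin
import org.junit.Assert.assertTrue
import org.junit.Ignore
import org.junit.Test

class CheckoutServiceTest {

    // A hypothetical flaky test: its outcome depends on wall-clock timing,
    // so it passes or fails depending on how fast the suite happens to run.
    @Ignore("Flaky: fails intermittently on CI; ignoring it hides the symptom")
    @Test
    fun discountIsAppliedBeforePromotionExpires() {
        val promotionEndsAt = System.currentTimeMillis() + 50
        Thread.sleep(40) // timing-sensitive setup is what makes this unreliable
        assertTrue(System.currentTimeMillis() < promotionEndsAt)
    }
}
```

Every test ignored this way quietly shrinks the safety net the suite is supposed to provide, which is exactly how the problem gets pushed further down the line.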

Frustrated, I tried to fix some of the problematic tests. One test in particular stands out in my mind. Its failure only appeared when the entire test suite ran, so each time I made a change, I had to wait 15-plus minutes for feedback. These incredibly long feedback cycles, combined with a general lack of relevant data, meant I wasted days tracking down that one bug.

This isn’t just about one company, though. One of the advantages of being a job hopper is that I’ve seen the way many different teams work. I thought these issues were normal until I started at a company where we work on exactly these problems.

Teams that follow Developer Productivity Engineering, the practice and philosophy of improving developer experience through data, are able to improve their slow and unreliable builds. These teams are happier and have higher throughput, making the business happier too.

No matter what build tool they are using, the people responsible for developer productivity can effectively measure build performance and track outliers and regressions for both local and CI builds. They spend time analyzing the results and finding bottlenecks in the build process. When something does go wrong, they share the reports (e.g., Gradle build scans) with teammates and compare failing and passing builds to pinpoint the exact problem—even if they can’t reproduce the issues on their own machines.
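As a concrete example, a Gradle build can publish a scan for a single run when invoked with the --scan flag, and teams that want this data for every build usually apply the build scan plugin in their settings file. The snippet below is a minimal sketch in the Gradle Kotlin DSL, assuming the Gradle Enterprise plugin (since renamed Develocity); the plugin version and terms-of-service values are illustrative.

```kotlin
// settings.gradle.kts: a minimal sketch; the plugin id and version are
// illustrative, and newer releases ship under the Develocity name.
plugins {
    id("com.gradle.enterprise") version "3.13"
}

gradleEnterprise {
    buildScan {
        // Publish a scan for every build, not only when --scan is passed.
        publishAlways()

        // Required when publishing to the public scans.gradle.com service.
        termsOfServiceUrl = "https://gradle.com/terms-of-service"
        termsOfServiceAgree = "yes"
    }
}
```

With something like this in place, every local and CI build produces a shareable report, so a teammate can inspect a failing build even when they can't reproduce it on their own machine.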

With all this data, they can actually do something to optimize the process and reduce the frustration developers are facing. This work is never done, so they keep iterating to maintain developer productivity. It’s not an easy task, but the teams who work at it are able to prevent the problems I described from happening in the first place.
