Chapter 10. Manage Usage of Third-Party Code

Oscar Wilde: “I wish I had said that.” Whistler: “You will, Oscar; you will.”

James McNeill Whistler

Best Practice:

  • Manage the usage of third-party code well.

  • Determine the specific advantages of using an external codebase, and keep third-party code up-to-date.

  • This improves the development process because using third-party code saves time and effort, and proper third-party code management makes the system’s own behavior more predictable.

Third-party code is code that you have not written yourself. It comes in several variants: sometimes in the form of a single library, sometimes in the form of a complete framework. The code can be open source and maintained by a (volunteer) community, it can be a paid product that derives from open source code, or it can be a paid product using only proprietary code.

There is a lot of third-party code and there are good reasons to use it. However, managing the use of third-party code requires a policy based on the right requirements for your team and/or organization.

Motivation

While using third-party code may feel like being out of control (you are not the writer or maintainer of the code), it comes with a lot of benefits. For one, it saves you from having to “reinvent the wheel,” and so saves you development effort. Usually, third-party code has also passed some form of quality control.

Using third-party code is, however, not trivial: external code can become outdated or raise security concerns, and may therefore require additional maintenance. In complex cases the number of dependencies of your own codebase can explode. The decision to use third-party code is essentially a risk versus reward decision: the risk of security concerns or added maintenance versus the reward of reduced development time. So the usage of third-party code should be managed to ensure predictable behavior of that code.

Using Third-Party Code Saves Time and Effort

Using third-party code spares the development team the time and effort of reinventing the wheel. It relieves them from writing custom code for which standard solutions are already present.

This is relevant as many applications share the same type of behavior. Consider basic needs for almost all systems: UI interaction, database access, manipulation of common data structures, administrating settings, or security measures. Using third-party libraries is especially helpful for such generic functionality, as it avoids unnecessary over-engineering. Widely known examples include web application frameworks such as Spring for Java, Django for Python, and AngularJS for JavaScript. For testing, there are the JUnit framework and its corresponding ports for other languages. For database communication there are also numerous frameworks available, such as NHibernate and the Entity Framework. So there is plenty of third-party code to choose from, saving you time and effort in the end.

Third-Party Code Has at Least Base-Level Quality

Third-party code is widely available as open source code next to commercial alternatives. All variations provide some form of quality control. Free open source solutions are maintained by the community: when that community is sufficiently large and active, it maintains and controls the source code of the product. Although this is not a quality guarantee, the popularity of an open source product suggests that many users benefit from it. When many users disagree with the quality or content of that product, typically this will lead to someone fixing the issues or someone making a development fork (a separately maintained version).

Typically, within paid subscriptions of open source derivatives, you have the possibility to request support in installing/using the products. In some cases you can communicate with the developers to report bugs or request improvements.

The commercial variations are hopefully quality controlled by the developers/organization itself. They clearly have the incentive to provide proper quality because they depend on satisfied clients and clients may expect proper quality. What is more, they constitute a professional organization that has some level of maturity, and so applies certain standards before a new version is released.

Managing Usage of Third-Party Code Makes a System’s Behavior More Predictable

Having control over external dependencies makes the behavior of those dependencies more predictable. By extension, this also applies to your system as a whole: to know the expected behavior of third-party code helps to predict the behavior of your own system. With a clear overview of external dependencies, developers do not need to make assumptions about which library versions are used where. This gives a fair indication of how third-party libraries and frameworks affect the behavior of your own system.

It is rather common that different versions of a library are used within a system. This introduces unnecessary maintenance complexity. There may be, for example, conflicting behavior of libraries that perform the same functions. Therefore, the starting point for managing third-party code is to have an overview of its usage. That is, gaining insight on the specific frameworks and libraries and the versions that are used.
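As a minimal sketch of such an overview, the following Python snippet groups library usage by version and reports libraries that occur in more than one version. The module names, library names, and version numbers are hypothetical; in practice the inventory would come from your build files or dependency management tool.

```python
from collections import defaultdict

# Hypothetical inventory of (module, library, version) triples.
USAGE = [
    ("billing", "json-lib", "2.4.0"),
    ("billing", "http-client", "1.9.2"),
    ("reporting", "json-lib", "2.1.3"),
    ("reporting", "http-client", "1.9.2"),
    ("frontend", "json-lib", "2.4.0"),
]


def versions_per_library(usage):
    """Map each library to the versions in use and the modules using each version."""
    overview = defaultdict(lambda: defaultdict(list))
    for module, library, version in usage:
        overview[library][version].append(module)
    return {
        library: {version: sorted(modules) for version, modules in versions.items()}
        for library, versions in overview.items()
    }


def conflicting_libraries(usage):
    """Return only the libraries that are used in more than one version."""
    return {
        library: versions
        for library, versions in versions_per_library(usage).items()
        if len(versions) > 1
    }
```

Calling conflicting_libraries(USAGE) on this sample reports only json-lib, because it is the one library used in two different versions.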

How to Apply the Best Practice

Making decisions on how to use third-party code is essentially a matter of standardization. It includes decisions on a general approach (policy, for which you should list general requirements) and specific choices (e.g., listing advantages and disadvantages for a particular library). Requirements include an update policy and maintenance requirements. We elaborate on these points in this section.

Determine the Specific Maintainability Advantages of Using an External Codebase

Using third-party code for general functionality is a best practice. That is, insofar as it decreases the actual maintenance burden on your own source code. Using third-party code is a good choice when it offers much functionality that can be delegated away from the system while it requires reasonable effort to adapt your system to it. For utility functionality this is generally straightforward, which makes it a great opportunity. You should also use libraries for functionality that is complex but widely available as a library. This is especially true for general security functionality (such as the algorithms for setting up secure connections between systems). You should never write such functionality yourself, as it is hard to guarantee the security of that code.

In order to determine the quality of a library or framework, answer or give estimates for the following questions:

Replacing functionality

Is the functionality that you are trying to implement utility functionality that is widely used by other systems? Is that functionality complex and specialized, i.e., is coding it yourself error-sensitive?

  • Expectations: Is it likely that functionality in the library will be expanded soon such that you can use it also to replace other functionality that you are coding yourself now?

Maintenance

Does the specific third-party code have a reasonably active community with frequent updates? How widely is it used (e.g., number of downloads, number of forum topics, or mentions on popular developer forums)?

  • Experience/knowledge: Does the development team already have experience with the third-party code? Is knowledge readily available—is the code well-documented, either in the form of a book or online tutorials/forums?

Compatibility/reliability

Is the third-party code compatible with other technologies used in the system and the deployment environment?

  • Trustworthiness: Has the source code been audited with good results? This is especially relevant for security functionality.

  • Licensing: Is the source code licensed in a way that it is compatible with your form of licensing?

  • Intrusiveness: Can the third-party code be used in such a way that your own code is relatively loosely coupled to the framework you use? Will upgrading not break your own code because of coupling?

Every “yes” to these questions implies an advantage and argument for using that particular library/framework.
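One way to keep such an assessment comparable between candidate libraries is to record the answers in a small structure. The sketch below is only an illustration: the criterion identifiers mirror the questions above, but the class and its methods are hypothetical and not a prescribed scoring method.

```python
from dataclasses import dataclass, field

# Criterion names taken from the list of questions; the identifiers are illustrative.
CRITERIA = (
    "replacing_functionality",
    "expectations",
    "maintenance",
    "experience_knowledge",
    "compatibility_reliability",
    "trustworthiness",
    "licensing",
    "intrusiveness",
)


@dataclass
class LibraryAssessment:
    name: str
    # Maps a criterion to True ("yes") or False ("no"); unanswered criteria are absent.
    answers: dict = field(default_factory=dict)

    def advantages(self):
        """Criteria answered with "yes"; each one is an argument for the library."""
        return [c for c in CRITERIA if self.answers.get(c) is True]

    def open_questions(self):
        """Criteria that have not been answered or estimated yet."""
        return [c for c in CRITERIA if c not in self.answers]
```

The number of advantages is not a verdict in itself; a single "no" on licensing or trustworthiness can outweigh several "yes" answers.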

Keep Third-Party Code Up-to-Date

Updates are needed to stay up to speed with bug fixes and improved functionality. However, without a policy, updating tends to lag behind. Checking and updating libraries costs time and the advantages may not seem evident. Therefore, before creating a policy the following must be known:

Effort

The amount of work required to perform updates each check/update cycle. By all means, automate as much as possible with tooling.

Priority

What has highest priority in the system’s development when it comes to updates? Typically, security has high priority, so policies tend to prescribe updating libraries immediately when security flaws are patched. Keep in mind that such a security emergency is especially hard to resolve when libraries lag several major versions behind.

Then the policy should define how/when to check, and how/when to update. Note that the policy execution can be managed and automated with CI/dependency management tooling. (Examples of such tooling are Nexus, JFrog, Maven, and Gradle.)

How to check

Automatically scan recency of updates daily or manually check it (e.g., at the start of a release cycle).

When to update

Updating immediately when updates are available, or bundling all update work in a certain release cycle.

Update to what exactly

Updating to the newest versions, even if that is a beta version, or to the latest stable version.

  • Staying behind: You may choose to consistently wait a certain amount of time before updating, for example, to see whether the community experiences updating problems. You might therefore choose to always stay one version behind the latest stable release.

  • Not updating at all: You may choose to remain at the current version and not update, for example, when updates introduce a notable instability concern with respect to your own codebase.

Note that libraries can become unsupported because users have moved to an alternative. Unsupported libraries run risks for compatibility (interacting with other technologies that are updated in the meantime) and security (because new flaws are not being fixed).
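The "update to what exactly" choices can be made explicit in code, which helps when automating the policy. The following sketch assumes a hypothetical release list ordered from oldest to newest, where each entry is marked as stable or not; the enum and function names are illustrative and do not belong to any particular dependency management tool.

```python
from enum import Enum


class UpdateTarget(Enum):
    NEWEST = "newest"                # newest release, even if it is a beta
    LATEST_STABLE = "latest_stable"  # newest stable release
    ONE_BEHIND = "one_behind"        # stay one stable release behind the latest
    PINNED = "pinned"                # do not update at all


def choose_version(current, releases, target):
    """Pick the version to use.

    releases is a list of (version, is_stable) pairs, ordered oldest to newest.
    The function never selects a version older than the current one.
    """
    if target is UpdateTarget.PINNED:
        return current
    versions = [version for version, _ in releases]
    if target is UpdateTarget.NEWEST:
        candidates = versions
    else:
        candidates = [version for version, is_stable in releases if is_stable]
        if target is UpdateTarget.ONE_BEHIND:
            candidates = candidates[:-1]  # drop the latest stable release
    if not candidates:
        return current
    candidate = candidates[-1]
    if current in versions and versions.index(candidate) <= versions.index(current):
        return current  # the policy would not move us forward
    return candidate
```

Writing the policy down this way forces a decision on edge cases, such as what "one version behind" means when you are already on the latest stable release.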

Ensure Quick Response to Dependency Updates

Regardless of your update strategy, you should be able to detect and perform library updates quickly. To this end, the intrusiveness of a library or framework is very relevant. If your codebase is tightly coupled with library code, it becomes harder to perform updates because you have to fix a lot of your own code to make sure it works as it did before. This is yet another reason to take unit tests seriously: when you update a library and several unit tests fail, there is a good chance that the update caused it.

Important

If you decide that you want to keep up with certain versions of libraries, do the work as soon as you can. Postponing this increases the effort required to adjust your own code when the next update arrives.

Do Not Let Developers Change Library Source Code

Developers should be able to change and update libraries with minimal effort. Therefore, agree with developers that they do not make changes to the source code of a third-party library. If developers do that, the library code has become part of your own codebase and that defeats the purpose of third-party code. Updates of changed libraries are especially cumbersome and can easily lead to bugs. It requires developers to analyze exactly what has been changed in the library code and how that impacts the adjusted code. If a library does not perfectly fit the functionality you need (but it solves part of a difficult problem), it is easier to use it anyway and write custom code around it.

Important

For large or complex functionality, it is well worth it to consider adjusting functionality to fit it to third-party code, instead of building a custom solution (or adjusting third-party libraries).

Manage the Usage and Versions of Libraries and Frameworks Centrally

To keep libraries up-to-date, you need an overview of what versions are used. Dependency management tooling can help with this. To facilitate library updates, a best practice is to use a central repository with all used libraries. The process can be fully automated. In that case, when a developer creates a new build, a dependency management tool retrieves and imports the latest versions of the required libraries. You can also manually document the usage of types/versions of libraries and frameworks centrally, but in practice this is a huge maintenance issue. If the list cannot be fully relied upon, it loses its effectiveness completely.

Measuring Your Dependency Management

Suppose that you have a dependency management tool in place that automatically checks for and notifies you about new versions of libraries that the development team uses. You want to make sure that the team keeps updating their dependencies, but you also want to make sure that updates do not cost too much effort. Consider the following GQM model, to give you insight into the problems of the team and for checking whether your dependency management is done right:

  • Goal A: To understand how the team manages the use of third-party code by assessing the effort required to keep the system up-to-date with external dependencies.

    • Question 1: How much time does the team spend on updating external dependencies and fixing potential bugs introduced to the system?

      • Metric 1a: Number of bugs found after updating a library. This metric will not be useful directly, but will provide useful insight over time. This is because it tells you something about how the team uses specific libraries: some will introduce more bugs than others. Expect the trend line to stabilize per library. If some library updates require a lot of work, you can investigate whether the team is, for example, behind with updating. A cause could be that issues are accumulating because developers do not have the time for updates or because the library is altered very often. Or possibly, the library is not very suitable for the system, in which case you should consider switching to another library.

      • Metric 1b: Number of versions the team is behind per library. This number may indicate trouble with updating when the team is working with libraries that have already had two major updates, for example. It could be that the team postpones it because they think it is too much work or that they cannot allocate time. Or they may have run into technical issues trying to update. In any case, when the metric signals trouble, you should find out what the problem is.

If you use this model to assess your library usage, you may discover that you need to update your standards for using third-party code, or that you are better off by switching to other libraries.
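Metric 1b is straightforward to compute once the released versions of a library are known. The sketch below assumes version strings of the form major.minor.patch; the threshold of more than one major update is an assumption chosen for illustration, loosely based on the "two major updates" example above.

```python
def parse_version(text):
    """Parse a 'major.minor.patch' string into a tuple of integers (assumed format)."""
    return tuple(int(part) for part in text.split("."))


def versions_behind(in_use, released):
    """Count the releases and the major updates that are newer than the version in use."""
    current = parse_version(in_use)
    newer = sorted(parse_version(v) for v in released if parse_version(v) > current)
    major_updates = len({v[0] for v in newer if v[0] > current[0]})
    return {"releases_behind": len(newer), "major_updates_behind": major_updates}


def needs_attention(in_use, released, max_major_updates=1):
    """Signal trouble when the team is more than max_major_updates major updates behind."""
    return versions_behind(in_use, released)["major_updates_behind"] > max_major_updates
```

Tracked per library over time, these numbers show whether the team is catching up or falling further behind.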

A good way to gain insight into your usage of third-party code is to create a chart of your dependencies that shows how far behind you are with them. For instance, the chart in Figure 10-1 indicates the status of support for a list of libraries. For each library or framework the latest version is shown, together with the version that is currently in use. The colors show how far behind you are: blue means the framework is stable, while gray indicates that you should update to a newer version. Burgundy indicates an unsupported framework, meaning that you either should update as soon as possible, or consider switching to another framework.

Figure 10-1. An example of library freshness

Suppose now that you need to select a new library for encryption functionality. This is a definite example of functionality you should never try to write yourself because of its complexity and impact of dysfunction. You may have found a few open source libraries, but want to select the right one. These considerations can be answered in a measurable manner:

  • Goal B: To select the most suitable library by measuring its community activity and trustworthiness.

    • Question 2: For which of these libraries are issues fixed the fastest?

      • Metric 2: Per library, the issue resolution time. Consider this metric as an indicator for the level of support you can expect from the maintainers. That is, when you report a bug or an issue, how fast the maintainers will respond and solve the bug or issue. Of course, a lower issue resolution time is usually better. A codebase that releases very often can probably also fix bugs very fast.

    • Question 3: Which of the libraries is most actively maintained?

      • Metric 3a: Number of contributions per week, per library. This assumes that the libraries are under publicly accessible version control, so that you can see the activity. Here, the number of contributions is a signal of maintenance activity.

      • Metric 3b: Number of contributors per library. A higher number of contributors indicates an active community. Having several contributors work on one library also makes it more likely that some form of quality control is used. Be cautious, however, of too many contributors as this could also signify a distorted library. Too many contributors may also lead to forking when the contributors disagree about the contents.

This GQM model can help you in deciding which third-party codebase is most suitable for you. Remember that finding the best library is a trade-off: you should neither choose a library when you do not have the capacity to keep up with its pace, nor a library that is stagnant, as this may leave bugs unfixed for a long time.
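Metrics 2, 3a, and 3b can be derived from issue tracker and version control data. The sketch below assumes you have already exported that data into plain Python values; how you obtain it depends on the hosting platform and is not shown. The median is used for resolution time as an assumption, because a few very old issues can distort an average.

```python
from datetime import date
from statistics import median


def median_resolution_days(issues):
    """Metric 2: median number of days between opening and closing an issue.

    issues is a list of (opened, closed) dates; issues that are still open
    (closed is None) are skipped.
    """
    durations = [(closed - opened).days for opened, closed in issues if closed is not None]
    return median(durations) if durations else None


def contributions_per_week(contribution_dates, weeks):
    """Metric 3a: average number of contributions per week over the observed period."""
    return len(contribution_dates) / weeks


def contributor_count(authors):
    """Metric 3b: number of distinct contributors."""
    return len(set(authors))
```

Comparing these values across candidate libraries over the same period gives a rough, like-for-like picture of community activity.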

When you have decided to use an external component, you may want to track the time you save on custom implementation and the time you spend on adapting to a framework or library. In this way, you gain insight into the benefits of specific third-party code. So you could use the following GQM model for this:

  • Goal C: To determine the time gain of a specific framework by measuring the time spent on adapting to the framework versus the time saved on implementing custom functionality.

    • Question 4: How much time did we save by implementing functionality using the framework?

      • Metric 4: For each functionality that is using the framework, the estimated time it would take to implement it from scratch minus the time it took to implement using the framework. This metric gives a raw estimate of the time gain in using the framework. Therefore you should expect this value to at least be positive, otherwise it would mean that building it from scratch is faster!

    • Question 5: How much time is spent on learning about and staying up-to-date with the framework?

      • Metric 5: Time spent specifically on updates, or studying the framework. This time should be relatively high when you start using the framework (unless developers already know it, of course), and should gradually drop. The point is that it contributes negatively to the time you save by using the framework.

When you know how much time goes into using the framework itself and the time you save by delegating custom implementations, it is easy to predict your savings: just determine the break-even point and decide if it works for you. If you notice that it is very hard to save time on it, think about the other reasons for using the framework. If there are no convincing reasons, then decide whether to keep using it or to switch to another framework.
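The break-even point can be found by accumulating, per period, the time saved (Metric 4) minus the time invested (Metric 5). The following sketch assumes both metrics are recorded in the same unit (for example, hours per sprint); the function names are illustrative.

```python
from itertools import accumulate


def net_savings(saved_per_period, invested_per_period):
    """Time saved minus time invested, for each period."""
    return [saved - invested for saved, invested in zip(saved_per_period, invested_per_period)]


def break_even_period(saved_per_period, invested_per_period):
    """Return the first period (1-based) in which cumulative net savings reach zero or more.

    Returns None when the investment has not been recovered in the observed periods.
    """
    cumulative = accumulate(net_savings(saved_per_period, invested_per_period))
    for period, total in enumerate(cumulative, start=1):
        if total >= 0:
            return period
    return None
```

For example, with 5, 20, 30, and 30 hours saved and 30, 10, 5, and 2 hours invested over four periods, the cumulative net savings are -25, -15, 10, and 38 hours, so the break-even point falls in the third period.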

Common Objections to Third-Party Code Metrics

Controlling the usage of third-party code is important from a maintainability perspective. However, common objections concern the trustworthiness of third-party libraries, their maintenance benefits, and the inability to update them.

Objection: Safety and Dependability of Third-Party Libraries

“We cannot know whether third-party libraries are dependable and safe. Should we test them?”

As libraries are imported pieces of code, you cannot unit test them directly the way you do with normal production code. Also, once you have determined that the library is dependable and adequate, you should avoid efforts to test it yourself and trust it is working. However, testing is possible in the following ways:

  • You could use the unit test code of the library itself to find bugs that may arise because your system has uncommon requirements or input. However, this mainly enables you to find bugs that the community should have found as well. Proper documentation of the library should clarify what it can and cannot do.

  • If you have further concerns on a library, you could test it while abstracting the library behind an interface. This does cost you extra work and may lead to unnecessary code, so you should only do this when you have particular concerns about the library. It does, however, give you the possibility of easily switching to another library.
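Abstracting a library behind an interface could look like the sketch below. Python's standard json module stands in for a third-party library here, purely as an assumption for the sake of a self-contained example; the Serializer interface and the adapter class are hypothetical names.

```python
import json
from typing import Protocol


class Serializer(Protocol):
    """The interface your own code depends on, instead of on the library itself."""

    def dumps(self, data: dict) -> str: ...

    def loads(self, text: str) -> dict: ...


class JsonSerializer:
    """Adapter around the library; the only place in the codebase that imports it."""

    def dumps(self, data: dict) -> str:
        return json.dumps(data, sort_keys=True)

    def loads(self, text: str) -> dict:
        return json.loads(text)


def check_round_trip(serializer: Serializer) -> bool:
    """A test written against the interface, so it also applies to a replacement library."""
    sample = {"id": 7, "tags": ["a", "b"], "nested": {"ok": True}}
    return serializer.loads(serializer.dumps(sample)) == sample
```

Because the check only uses the interface, the same test can be rerun after a library update or after swapping in another library behind a new adapter.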

Objection: We Cannot Update a Particular Library

“We cannot update a certain library—doing so leads to trouble/regression in another system.”

If the usage of a certain library leads to problems in another system, there is a deeper problem at work. The library may have changed significantly in a way that it is no longer compatible. In most cases, failing unit tests should signal what kind of functionality is in trouble. There could also be other reasons: young frameworks usually introduce a lot of breaking changes. If you do not have the capacity to keep up with such changes, this is all the more reason to switch libraries.

Objection: Does Third-Party Code Actually Lead to Maintenance Benefits?

“How can we determine whether using third-party code leads to benefits in maintenance?”

As third-party code is not built by yourself, you do not know exactly how much effort was spent building it. But it is clear that for highly complex functionality it is much easier to import it than it is to code, test, and maintain it yourself. You do need to make sure that your system remains compatible with that library (which requires some testing and maintenance) and that effects of library changes are isolated in your code.

The important consideration is whether you are using your time to write code that makes your system unique and useful to your goals, or you are using that time to implement functionality that is already available in the form of a framework or library. It turns out in practice that a lot of functionality is very common among systems, so that most of the time there are already solutions readily available.

There are lots of ways in which you can determine maintenance benefits: you can use stories or function points to determine how much work you save on implementing those, while at the same time considering the investment you make to understand and use a third-party component. You can use the metrics provided in the previous section to gain insight into these times.

Metrics Overview

As a recap, Table 10-1 shows an overview of the metrics discussed in this chapter, with their corresponding goals.

Table 10-1. Summary of metrics and goals in this chapter

Metric # in text  Metric description                                              Corresponding goal
TPC 1a            Number of bugs found after updating a library                   Third-party code effort
TPC 1b            Number of versions behind on a library                          Third-party code effort
TPC 2             Issue resolution time per library-related issue                 Library selection
TPC 3a            Number of contributions per week per library                    Library selection
TPC 3b            Number of contributors per library                              Library selection
TPC 4             Difference between time effort for implementing functionality   Framework effectiveness
                  from scratch and using a framework
TPC 5             Time invested in studying or updating a framework               Framework effectiveness

This is the penultimate chapter that has dealt with standardization issues. The following and last best practice chapter covers (code) documentation.
