Chapter 13. Scaling Down with Custom Runtime Images

Now that you’ve seen all the tools and processes to work with modular applications, there’s one more exciting opportunity to explore. In “Linking Modules”, you got a taste of creating runtime images tailored to a specific application. Only the modules required to run the application become part of the image. A minimal runtime image can be automatically generated with jlink by using explicit dependency information available in modules.

Creating a custom runtime image is beneficial for several reasons:

Ease of use

jlink delivers a self-contained distribution of your application and the JVM, ready to be shipped.

Reduced footprint

Only the modules that your application uses are linked into the runtime image.

Performance

A custom runtime potentially runs faster by virtue of link-time optimizations that are otherwise too costly or impossible.

Security

With only the minimum required platform modules in a custom runtime image, the attack surface decreases.

Even though creating a custom runtime image is an optional step, having a smaller binary distribution that runs faster is a compelling motivation—especially when applications target resource-constrained devices, such as embedded systems; or, when they run in the cloud, where everything is metered. Putting a custom runtime image into a Docker container is a good way to create resource-efficient cloud deployments.

Tip

Other initiatives are improving Java’s support for containers. For example, OpenJDK 9 now also offers an Alpine Linux port. Running the JDK or a custom runtime image on top of this minimalistic Linux distribution is another way to reduce the footprint of the deployment.

Another advantage of distributing the Java runtime with your application: there’s no more mismatch between already installed Java versions and the version your application needs. Traditionally, either a Java Runtime Environment (JRE) or Java Development Kit (JDK) must be installed before running Java applications.

Note

The JRE is a subset of the JDK, designed to run Java applications rather than to develop Java applications. It has always been offered as a separate download geared toward end users.

A custom runtime image is fully self-contained. It bundles the application modules with the JVM and everything else it needs to execute your application. No other Java installation (JDK/JRE) is necessary. Distributing, for example, JavaFX-based desktop applications becomes easier this way. By creating a custom runtime image, you can offer a single download containing the application and the runtime environment. On the flip side, an image is not portable, because it targets a specific OS and architecture. In “Cross-Targeting Runtime Images”, we’ll discuss how to create images for different target systems.

It’s time to take a deep dive into what jlink is capable of. Before you look at the tool itself, we’ll first discuss linking and how it opens up new possibilities for Java applications.

Static Versus Dynamic Linking

Creating a custom runtime image can also be characterized as a form of static linking for modules. Linking is the process of bringing together compiled artifacts into an efficiently executable form. Traditionally, Java has always employed dynamic linking at the class level. Classes are lazily loaded at run-time, and wired together dynamically where necessary at any point in time. The just-in-time (JIT) compiler of the virtual machine is then responsible for compilation to native code at run-time. During this process, the JVM applies optimizations to the resulting ensemble of classes. Although this model allows for great flexibility, optimizations that are straightforward in more static situations are harder (or even impossible) to apply in this dynamic landscape.

Other languages make different trade-offs. Go, for example, prefers to statically link all code into a single binary. In C++, you can choose to link statically or dynamically. With the introduction of the module system and jlink, we now have that choice in Java as well. Classes are still dynamically loaded and linked in custom runtime images. However, the available modules from which classes can be loaded can be statically predetermined.

An advantage of static linking is that it affords optimizations across the whole application ahead of time. Effectively, this means optimizations can be applied across class and module boundaries, taking into account the whole application. This is possible only because through the module system we have up-front knowledge about what the whole application actually entails. All modules are known, from the root module (the entry point of an application), to the libraries, all the way through to the required platform modules. A resolved module graph represents the whole application.

Warning

Linking is not the same as ahead-of-time (AOT) compilation. A runtime image generated with jlink still consists of bytecode, not native code.

Examples of whole-program optimizations are dead-code elimination, constant folding, and inlining. It’s beyond the scope of this book to describe these optimizations. Fortunately, much literature is available.1

Dead Code by Oliver Widder link:http://geek-and-poke.com/geekandpoke/2010/8/5/simply-explained-dead-code.html, CC-BY link:https://creativecommons.org/licenses/by/3.0/deed.en_US

The applicability and effectiveness of many of these optimizations hinges on the assumption that all relevant code is available at the same time. The linking phase is that time, and jlink is the tool to do so.

Using jlink

Back in “Linking Modules”, we created a custom runtime image consisting of just a helloworld module and java.base. Toward the end of Chapter 3, we created a much more interesting application: EasyText, with its multiple analyses and CLI/GUI frontends. Just as a reminder, running the GUI frontend on top of a full JDK is achieved by setting up the right module path and starting the right module:

$ java -p mods -m easytext.gui

Assuming the mods directory contains the modular JARs for EasyText, this results in the situation depicted in Figure 13-1 at run-time.

Modules resolved at run-time when running `easytext.gui`. Greyed out modules are available in the JDK, but not resolved.
Figure 13-1. Modules resolved at run-time when running easytext.gui. Grayed-out modules are available in the JDK but not resolved.

At JVM startup, the module graph is constructed. Starting with the root module easytext.gui, all dependencies are recursively resolved. Both application modules and platform modules are part of the resolved module graph. However, as Figure 13-1 shows, many more platform modules are available in the JDK than strictly necessary for this application. The grayed-out modules are only the tip of the iceberg, because about 90 platform modules exist. Only nine of them are necessary to run easytext.gui with its JavaFX UI.

Let’s get rid of this dead weight by creating a custom runtime image for EasyText. The first choice we must make is, what exactly should be part of the image? Just the GUI, just the CLI, or both? You could create multiple images targeting distinct groups of users of your application. There’s no general right or wrong answer to this question. Linking is explicitly composing modules together into a coherent whole.

For now, let’s settle on creating a runtime image just for the GUI version of EasyText. We invoke jlink with easytext.gui as the root module:

$ jlink --module-path mods/:$JAVA_HOME/jmods                   1
        --add-modules easytext.gui                             2
        --add-modules easytext.algorithm.coleman              
        --add-modules easytext.algorithm.kincaid              
        --add-modules easytext.algorithm.naivesyllablecounter 
        --launcher easytext=easytext.gui                       3
        --output image 4
1

Set the module path where jlink can find modules, including platform modules in the JDK.

2

Add root modules to be included in the runtime image. Besides easytext.gui, service provider modules are added as root modules as well.

3

Define the name of a launcher script to be part of the runtime image, indicating the module it should run.

4

Set the output directory where the image is generated into.

Tip

The jlink tool lives in the bin directory of the JDK installation. It is not added to the system path by default, so in order to use it as shown in the preceding example, you must add it to the path first.

It’s possible to provide multiple root modules separated by commas as well, instead of the multiple --add-modules used here.

The specified module path explicitly includes the JDK directory containing platform modules ($JAVA_HOME/jmods). This is different from what you’ve seen when using java and javac: there the platform modules are assumed to come from the same JDK you run java or javac from. In “Cross-Targeting Runtime Images”, you’ll see why this is different for jlink.

As discussed in “Services and Linking”, service provider modules must be added as root modules as well. Resolving the module graph happens only through requires clauses; jlink does not follow uses and provides dependencies by default. In “Finding the Right Service Provider Modules”, we’ll show you how to find the right service provider modules to add.

Note

You can add the --bind-services flag to jlink. This instructs jlink to take into account uses/provides as well when resolving modules. However, this also binds all services between platform modules. Because java.base already uses a lot of (optional) services, this quickly leads to a larger set of resolved modules than strictly necessary.

Each of these root modules is resolved, and these root modules along with their recursively resolved dependencies become part of the generated image in ./image, as shown in Figure 13-2.

A custom runtime image contains only the modules necessary for the application.
Figure 13-2. A custom runtime image contains only the modules necessary for the application

The generated image has the following directory layout, which is similar to the JDK:

image
├── bin
├── conf
├── include
├── legal
└── lib

In the bin directory of the generated runtime image, you’ll find an easytext launcher script. It’s created because of the --launcher easytext=easytext.gui flag. The first argument is the name of the launcher script, and the second argument is the module it starts. This script is an executable convenience wrapper that starts the JVM directly with easytext.gui as the initial module to run. You can directly execute it from the command line by invoking imageineasytext. On other platforms, similar scripts are generated (see “Cross-Targeting Runtime Images” for how to target other platforms). Windows runtime images get batch files instead of shell scripts for Unix-like targets.

A launcher script can be created for modules containing a class with an entry point (static main method). That’s the case if we create the easytext.gui modular JAR as follows:

jar --create                                      
    --file mods/easytext.gui.jar                  
    --main-class=javamodularity.easytext.gui.Main 
    -C out/easytext.gui .

You can also build a runtime image from exploded modules. In that case, there’s no main class attribute from the modular JAR, so it must be explicitly added to the jlink invocation:

--launcher easytext=easytext.gui/javamodularity.easytext.gui.Main

This approach also works if there are multiple classes with main methods in a modular JAR. Regardless of whether specific launcher scripts are created, you can always start the application by using the image’s java command:

image/bin/java -m easytext.gui/javamodularity.easytext.gui.Main

There’s no need to set the module path when running the java command from the runtime image. All necessary modules are already in the image through the linking process.

We can show that the runtime image indeed contains the bare minimum of modules by running the following:

$ image/bin/java --list-modules
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
java.base@9
java.datatransfer@9
java.desktop@9
java.prefs@9
java.xml@9
javafx.base@9
javafx.controls@9
javafx.graphics@9
jdk.jsobject@9

This corresponds to the modules shown in Figure 13-2.

The bin directory can contain other executables besides the launcher scripts discussed so far. In the EasyText GUI image, the keytool and appletviewer binaries are also added. The former is always present, because it originates from java.base. The latter is part of the image because the included java.desktop module exposes applet functionality. Because other well-known JDK command-line tools such as jar, rmic, and javaws all depend on modules that are not in this runtime image, jlink is smart enough to omit them.

Module Resolution During Linking

Although the module path and module resolution in jlink appear to behave similarly to other tools you’ve seen so far, important differences exist. One exception you’ve already seen is that platform modules need to be added explicitly to the module path.

Another important exception involves automatic modules. When you put a nonmodular JAR on the module path with java or javac, it is treated as valid module for all intents and purposes (see “Automatic Modules”). However, jlink does not recognize nonmodular JARs on the module path as automatic modules. Only when your application is fully modularized, including all libraries, can you use jlink.

The reason is automatic modules can read from the classpath, bypassing the module system’s explicit dependencies. In a custom runtime image, there is no predefined classpath. Automatic modules in the resulting image could therefore cause run-time exceptions. When they rely on classes from the classpath that won’t be there, the application blows up at run-time. Allowing this situation voids the reliable configuration guarantees of the module system.

Of course, when you’re absolutely sure an automatic module is well behaved (i.e., it requires only other modules and nothing from the classpath), this situation is unfortunate. In that case, you can work around this limitation by turning the automatic module into an explicit module. You can use jdeps to generate the module descriptor (as discussed in “Creating a Module Descriptor”) and add it to the JAR. Now you have a module that can be put on jlink’s module path. This is not an ideal situation; patching other people’s code never is. Automatic modules are really a transitional feature to aid migration. When you run into this jlink limitation, it’s a good time to contact the maintainer of the library in question to urge them to modularize it.

Last, the same caveats with respect to modules using reflection apply during linking, as with running modules on a full JDK. If a module uses reflection and does not list those dependencies in the module descriptor, the resolver cannot take this into account. Consequently, the module containing the code that is reflected upon may not be included in the image. To prevent this, add the modules manually with --add-modules, so they end up in the runtime image anyway.

jlink for Classpath-Based Applications

It seems as though jlink is usable only with fully modularized applications. That’s true only in part. You can also use jlink to create custom images of the Java platform without involving any application modules. For example, you can run

jlink --module-path $JAVA_HOME/jmods --add-modules java.logging --output image

and get an image containing just the java.logging (and the mandatory java.base) module. That may seem only marginally useful, but this is where things get interesting. If you have an existing classpath-based application, there’s no way to use jlink directly on application code to create a custom runtime image for that application. There are no module descriptors for jlink to resolve any modules.

However, what you can do is use a tool such as jdeps to find the minimal set of platform modules necessary to run the application. If you then construct an image containing just these modules, you can later start your classpath-based application on this image without issues.

That probably sounds a bit abstract, so let’s look at a simple example. In “Using the Modular JDK Without Modules”, Example 2-5, we already had a simple NotInModule class that uses the java.logging platform module. We can inspect this class with jdeps to verify that is the only dependency:

$ jdeps out/NotInModule.class

NotInModule.class -> java.base
NotInModule.class -> java.logging
   <unnamed>                        -> java.lang                    java.base
   <unnamed>                        -> java.util.logging            java.logging

For larger applications, you would analyze the JARs of the application and its libraries in the same manner. Now you know which platform modules are necessary to run the application. For our example, that’s just java.logging. We already created the image containing only the java.logging module earlier in this section. By putting the NotInModule class on the classpath of the runtime image, we can run this classpath-based application on a stripped-down Java distribution:

image/bin/java -cp out NotInModule

The NotInModule class is on the classpath (assuming it’s in the out directory), so it is in the unnamed module. The unnamed module reads all other modules, which, in the case of this custom runtime image, is the tiny set of two modules: java.base and java.logging. Through these steps, you can create custom runtime images even for classpath-based applications. Had EasyText been a classpath-based application, the same steps would result in an image containing the nine platform modules shown in Figure 13-1.

There are some caveats. When your application tries to load classes from modules that are not in the runtime image, NoClassDefFoundError is raised at run-time. This is similar to the situation where your classpath lacks a necessary JAR. It is up to you to list the right set of modules for jlink to put into the image. At link-time, jlink doesn’t even see your application code, so it cannot help with module resolution as it can in the case of a fully modularized application. jdeps is helpful to get a first estimation of the required modules, but use of reflection in your application can, for example, not be detected by the static analysis of jlink. Testing your application on the resulting image is therefore important.

Furthermore, not all performance optimizations discussed in the subsequent sections are applicable to this classpath-based scenario. Many optimizations work because all code is available at link-time—which is not the case in the scenario we just discussed. The application code is never seen by the linker. In this scenario, jlink works only with the platform modules you add to the image explicitly, and their resolved platform module dependencies.

This use of jlink shows how the lines between the JDK and JRE have blurred in a modular world. Linking allows you to create a Java distribution with any desired set of platform modules. You’re not confined to the options offered by the platform vendors.

Reducing Size

Now that we have the basic use of jlink covered, it’s time to turn our attention to the optimizations we were promised. jlink uses a plug-in-based approach to support different optimizations. All flags introduced in this section and the next section are handled by jlink plug-ins. The aim is not to give an exhaustive overview of all plug-ins, but rather to highlight some plug-ins to illustrate the possibilities. The number of jlink plug-ins is expected to grow steadily over time, both by the JDK team and the community at large.

You can get an overview of all currently available plug-ins by running jlink --list-plugins. Some plug-ins are enabled by default; this is indicated in the output of this command. In this section, we’ll look at plug-ins that can reduce the disk size of runtime images. The next section covers run-time performance improvements.

Creating a custom runtime image as we did for EasyText already reduces the size on disk relative to a full JDK by leaving out unnecessary platform modules. Still, there’s more to be gained. You can use several flags to trim the image size even further.

The first one is --strip-debug. As the name implies, it removes native debug symbols and strips debug information from classes. For a production build, this is what you want. However, it is not enabled by default. Empirical evidence shows roughly a 10 percent reduction in image size with this flag enabled.

You can also compress the resulting image with the --compress=n flag. Currently, n can be 0, 1, or 2. It does two things on the highest setting. First, it creates a table of all string literals used in classes, sharing the representation across the whole application. Then, it applies a generic compression algorithm on the modules.

The next optimization affects the JVM that is put into the runtime image. With --vm=<vmtype>, different types of VMs can be selected. Valid options for vmtype are server, client, minimal, and all (which is the default). If a small footprint is your most important concern, choose the minimal option. It offers a VM with only a single garbage collector, one JIT compiler, and no serviceability or instrumentation support. At the moment, the minimal VM option is available only for Linux.

The last optimization concerns locales. Normally, the JDK ships with many locales to suit different date/time formats, currencies, and other locale-sensitive information. The default English locales are part of java.base, and so are always available. All other locales are part of the module jdk.localedata. Because this is a separate module, you can choose to add it to a runtime image or not. Services are used to expose the locale functionality from java.base and jdk.localedata. This means you must use --add-module jdk.localedata during linking if your application needs a non-English locale. Otherwise, locale-sensitive code simply falls back to the default English locale, because no other locales are available in the absence of jdk.localedata. Remember, service provider modules are not resolved automatically, unless --bind-services is used.

Tip

A similar situation arises when your application uses character sets that are not part of the default sets (i.e., US-ASCII, ISO-8859-1, UTF-8, UTF-16). Traditionally, these nondefault character sets were available through charsets.jar in the full JDK. With the modular JDK, add the jdk.charsets module to your image in order to use nondefault character sets.

However, jdk.locale is quite large. Adding this module easily increases the size on disk with about 15 megabytes. In many cases, you don’t need all locales in the image. If that’s the case, you can use the --include-locales flag of jlink. It takes a list of language tags as an argument (see the java.util.Locale JavaDoc for more information on valid language tags). The jlink plug-in then strips the jdk.locale module from all other locales. Only the resources for the specified locales will remain in the jdk.locale module that ends up in the image.

Improving Performance

The previous section presented a range of optimizations pertaining to the size of the image on disk. What’s even more interesting is the ability of jlink to optimize run-time performance. Again, this is achieved by plug-ins, some of which are enabled by default. Keep in mind that jlink and its plug-ins are still at an early stage and mostly aimed at improving startup time of applications. Many of the plug-ins discussed in this section are still experimental in JDK 9. Most of the performance optimizations available require in-depth knowledge of how the JDK works.

A lot of the experimental plug-ins generate code at link-time to improve startup performance. One optimization that is enabled by default is the pre-creation of a platform module descriptor cache. The idea is that when building the image, you know exactly which platform modules are part of the module graph. By creating a combined representation of their module descriptors at link-time, the original module descriptors don’t have to be parsed individually at run-time. This decreases the JVM startup time.

In addition, other plug-ins perform bytecode rewriting at link-time to achieve run-time performance gains. An example is the --class-for-name optimization. It rewrites instructions of the form Class.forName("pkg.SomeClass") to static references to that class, thus avoiding the overhead of reflectively searching for the class at run-time. Another example is a plug-in that pre-generates method handles invocation classes (extending java.lang.invoke.MethodHandle), which would otherwise be generated at run-time. While this may sound esoteric, the implementation of Java lambdas makes heavy use of the method handles machinery. By absorbing the cost of class generation at link-time, applications that use lambdas start faster. Unfortunately, using this plug-in currently requires highly specialized knowledge of the way method handles work.

As you can see, many plug-ins offer quite specialized performance-tuning scenarios. There are so many possible optimizations, a tool such as jlink is never done. Some optimizations are better suited to some applications than others. This is one of the prime reasons jlink features a plug-in-based architecture. You can even write your own jlink plug-ins, although the plug-in API itself is marked experimental in the Java 9 release.

Optimizations that have traditionally been too costly to perform just-in-time by the JVM are now viable at link-time. What’s remarkable is that jlink can optimize code coming from any module, whether it’s an application module, a library module you use, or a platform module. However, because jlink plug-ins allow arbitrary bytecode rewriting, their usefulness extends beyond performance enhancements. Many tools and frameworks currently use JVM agents to perform bytecode rewriting at run-time. Think of instrumentation agents or bytecode enhancing agents from Object-Relation Mappers such as OpenJPA. In some cases, these transformations can be applied up front at link-time. A jlink plug-in can be a good alternative (or complement) to some of these JVM agent implementations.

Tip

Keep in mind, this is the territory of advanced libraries and tools. You are unlikely to write your own jlink plug-in as part of a typical application development process, just as you are unlikely to write your own JVM agent.

Cross-Targeting Runtime Images

A custom runtime image created by jlink runs on only a specific OS and architecture. This is similar to how JDK or JRE distributions differ for Windows, Linux, macOS, and so forth. They contain platform-specific native binaries necessary to run the JVM. You download a Java runtime for your specific OS, and on top of that you run your portable Java applications.

In the examples so far, we’ve built images by using jlink. jlink is part of the JDK (which is platform-specific), but it can create images for different platforms. Doing so is straightforward. Instead of adding the platform modules of the JDK currently running jlink to the module path, you point at the platform modules of the OS and architecture you want to target. Obtaining these alternative platform modules is as easy as downloading the JDK for that platform and extracting it.

Let’s say we are running on macOS and want to create an image for Windows (32-bit). First, download the correct JDK for 32-bit Windows, and extract it into ~/jdk9-win-32. Then, use the following jlink invocation:

 $ jlink --module-path mods/:~/jdk9-win-32/jmods  ...

The resulting image contains your application modules from mods bundled with platform modules from the Windows 32-bit JDK. Additionally, the /bin directory of the image contains Windows batch files instead of macOS scripts. All that’s left to do now is distributing the image to the correct target!

Warning

Runtime images built with jlink don’t update automatically. You are responsible for building updated runtime images when a new Java version is released. Ensure that you distribute only runtime images created from an up-to-date Java version to prevent security issues.

You have seen how jlink can create lean runtime images. When your application is modularized, using jlink is straightforward. Linking is an optional step, which can make a lot of sense when targeting resource-constrained environments.

1 A good language-agnostic introduction can be found in “Whole-Program Optimization of Object-Oriented Languages” by Craig Chambers et al.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset