Maven's standard packaging plugins generate an output structure that fits most needs, but in some cases we have to produce an output that differs from that structure; in such cases, the appropriate choice is Maven Assembly Plugin.
Our case study deals with the deployment of a batch application. This batch application, called BatchHandler, has to read data from its sources, write its output to files, and send those files via FTP to be stored in other locations. In order to be deployed, our batch has to satisfy some specifications:
To meet these specifications without resorting to ad hoc operations to create the package, we use maven-assembly-plugin.
As described before, our batch application is a simple JAR with some dependencies, so we add the assembly plugin to our project's POM.
The structure within the ZIP file should be like the one shown in the following screenshot:
All the directories contain some specific files as described here:
The bin directory contains the script to launch the application in an environment-agnostic fashion (Windows, Linux, or Mac OS X).
The conf directory contains all the configurations for correct application execution.
The etc directory contains all the configurations related to the environment variables and database connections.
The lib directory contains all the libraries that are needed for batch execution.
The libRun directory contains the batch application's JAR.
The log directory contains some log configurations.
The backup directory contains the log configuration backup.
In order to generate the whole structure in a single step, we use a custom generation option for maven-assembly-plugin using the assembly descriptor, as described previously.
The common meaning of assembly is to merge a group of files, directories, or dependencies into an archive format and distribute it to someone or into some environment.
The Assembly plugin can be used to aggregate a project with its dependencies, source code, documentation, and other files, such as configuration files, into a single archive.
If we simply need to aggregate our project with the most common files present in a project structure, we can use a predefined model. When the predefined descriptor can't accomplish a specific user's target, a more powerful tool to manage the assembly architecture and files comes to the rescue: the descriptor file.
In all the examples, we used Version 2.4 of maven-assembly-plugin. This version provides one main goal, single.
All other goals are deprecated and will be removed in future versions of the plugin.
Goals such as assembly:assembly, assembly:attached, assembly:directory, and assembly:directory-inline are deprecated because they break normal build processes and promote nonstandard build practices.
Since the assembly:single-directory goal is redundant, it has been deprecated in favor of the dir assembly format. Moreover, the assembly:format and assembly:unpack goals have been deprecated in favor of the far more comprehensive Maven Dependency Plugin.
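As a rough illustration of what replaces the deprecated assembly:unpack goal, the following sketch uses the unpack-dependencies goal of Maven Dependency Plugin. The plugin version, the phase binding, the execution ID, and the output directory are assumptions chosen for the example; they do not come from the BatchHandler project:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-dependency-plugin</artifactId>
  <!-- Assumed version; use whichever release your build standardizes on -->
  <version>2.8</version>
  <executions>
    <execution>
      <!-- Hypothetical execution ID -->
      <id>unpack-runtime-dependencies</id>
      <phase>prepare-package</phase>
      <goals>
        <goal>unpack-dependencies</goal>
      </goals>
      <configuration>
        <!-- Only dependencies needed at runtime are extracted -->
        <includeScope>runtime</includeScope>
        <!-- Assumed target directory for the extracted contents -->
        <outputDirectory>${project.build.directory}/unpacked</outputDirectory>
      </configuration>
    </execution>
  </executions>
</plugin>
```

The extracted files can then be picked up by a fileSet in an assembly descriptor if they need to end up in the final archive.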
The most important piece of Maven Assembly Plugin's configuration is the descriptor, which is referenced through descriptorRef when a predefined one is used. It is the key to all packaging operations because it describes the structure to be created in the output.
As we said before, with descriptorRef, you can specify a predefined descriptor; Maven provides a set of descriptors covering the most common usages:
With the jar-with-dependencies descriptor, the plugin can generate a single JAR that bundles the project with its dependencies, much as Maven Shade Plugin does.
The bin descriptor allows the creation of a redistributable archive, starting from your project. Such archives can be in any of the three formats: ZIP, tar.gz, or tar.bz2. The resulting project JAR is included, and other files such as readme, license, or notice can be included as well.
The src descriptor produces an output packaging similar to the bin descriptor output. This descriptor adds content from the src project directory, which enables you to redistribute source code in conjunction with the executable. The output formats are the same as those of the bin descriptor.
The project descriptor consists of the sum of the bin and src descriptors. This descriptor creates an archive containing all the elements from our project structure. Only the target directory is excluded from packaging. In this case too, the output formats are ZIP, tar.gz, and tar.bz2.
The predefined descriptors cover almost all the common needs; when specific behavior is needed, the assembly plugin accepts a custom XML descriptor as input. This descriptor file specifies how to create the output archive and which contents to include.
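To make the use of a predefined descriptor concrete, here is a minimal sketch of a plugin configuration that references jar-with-dependencies by name. The execution ID is an arbitrary choice for the example:

```xml
<plugin>
  <artifactId>maven-assembly-plugin</artifactId>
  <version>2.4</version>
  <configuration>
    <descriptorRefs>
      <!-- The name of one of the predefined descriptors, not a file path -->
      <descriptorRef>jar-with-dependencies</descriptorRef>
    </descriptorRefs>
  </configuration>
  <executions>
    <execution>
      <!-- Hypothetical execution ID -->
      <id>make-fat-jar</id>
      <phase>package</phase>
      <goals>
        <goal>single</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```

With this configuration, running mvn package should produce an additional JAR whose name carries the jar-with-dependencies suffix next to the regular project JAR.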
The descriptor file has different sections to describe various interactions with a project's files. We will describe the principal sections in order to understand the descriptor structure and functionality:
The assembly element defines how to build an archive (tar, tar.gz, or ZIP) starting from the original project. For example, it's possible to create a compressed file containing the project's JAR artifact, a directory with the dependencies, usually called lib, and another directory called bin with the scripts needed to execute the application in standalone mode.
The containerDescriptorHandler element is used to filter the files headed into the assembly archive so that they can be aggregated. It's possible to aggregate different types of descriptor fragments, such as XML files for project configuration.
With moduleSet, it is possible to include sources or binaries from different modules that are declared in a project's pom.xml file.
The sources element allows us to define configuration options to add a project's source code to our assembly file.
The fileSet element allows us to include a group of files in the assembly.
The binaries element is useful for including a project's module binary files in the resultant package.
We use dependencySet to manage project dependencies by including them in or excluding them from the output assembly package.
The unpackOptions element lets us control how items are extracted from an archive, in order to filter, exclude, or include resources.
The file element allows us to specify the inclusion of individual files. It also permits us to change the destination filename.
The groupVersionAlignment element gives us the possibility to align a group of artifacts to a specific version, passed as a configuration parameter.
The repository element is particularly useful whenever we need to deploy archives to internal repositories. It allows us to reorganize the project's dependencies into a small Maven repository and include it in the output archive.
Thanks to the information provided within the descriptor file, we can create different structures for the plugin output.
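Two of the sections above, file and unpackOptions, are easier to grasp with a small example. The following descriptor is only a sketch: the ID, the source file name, and the output directories are hypothetical and are not part of the BatchHandler project:

```xml
<assembly xmlns="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.0 http://maven.apache.org/xsd/assembly-1.1.0.xsd">
  <!-- Hypothetical ID for this sketch -->
  <id>sketch</id>
  <formats>
    <format>zip</format>
  </formats>
  <files>
    <file>
      <!-- Hypothetical sample file, renamed while it is copied -->
      <source>src/main/resources/app.properties.sample</source>
      <outputDirectory>conf</outputDirectory>
      <destName>app.properties</destName>
    </file>
  </files>
  <dependencySets>
    <dependencySet>
      <!-- Hypothetical directory receiving the extracted dependency contents -->
      <outputDirectory>classes</outputDirectory>
      <unpack>true</unpack>
      <unpackOptions>
        <excludes>
          <!-- Skip signature files while extracting each dependency -->
          <exclude>META-INF/*.SF</exclude>
        </excludes>
      </unpackOptions>
    </dependencySet>
  </dependencySets>
</assembly>
```

The file section copies a single file under a new name, while unpackOptions only takes effect because unpack is set to true in the same dependencySet.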
In our project, the Assembly plugin configuration has the following form:
<plugin>
  <artifactId>maven-assembly-plugin</artifactId>
  <version>2.4</version>
  <configuration>
    <!-- Custom descriptor files are listed under descriptors;
         descriptorRefs accepts only the names of predefined descriptors -->
    <descriptors>
      <descriptor>${descriptorDir}/assembly-descriptor.xml</descriptor>
    </descriptors>
  </configuration>
  <executions>
    <execution>
      <id>make-assembly</id>
      <phase>package</phase>
      <goals>
        <goal>single</goal>
      </goals>
    </execution>
  </executions>
</plugin>
We bind the execution to the package phase and pass our custom descriptor through the descriptor element, since descriptorRef is reserved for the names of predefined descriptors.
The ${descriptorDir} variable stores the path to the directory that contains the descriptor file used by the assembly plugin.
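The ${descriptorDir} property has to be defined somewhere in the POM. A minimal sketch follows, assuming the descriptor is kept under src/main/assembly; this location is a common convention and an assumption here, since the project's actual value is not shown:

```xml
<properties>
  <!-- Assumed location of assembly-descriptor.xml; adjust to the real project layout -->
  <descriptorDir>${project.basedir}/src/main/assembly</descriptorDir>
</properties>
```

With the property in place, running mvn package triggers the make-assembly execution together with the regular JAR packaging.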
In the preceding section, we described all the components of a descriptor file. Now, we will understand how it works. The following is the first element that we have:
<id>run</id>
It sets the ID of the assembly, a symbolic name for this particular set of files from the project. The ID value is also appended to the name of the generated file, and it acts as the artifact's classifier at deploy time. The next element is the following:
<includeBaseDirectory>false</includeBaseDirectory>
As the tag name suggests, this option tells the plugin whether to wrap the archive contents in a top-level base directory, which by default is named after the project's final name. In our example, we set it to false, so the bin, conf, and other directories sit at the root of the archive.
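For comparison, the opposite choice can be sketched as follows. The directory name batch-handler is hypothetical; the baseDirectory element overrides the default name of the wrapping directory:

```xml
<!-- Hypothetical variant: wrap everything in a single top-level directory -->
<includeBaseDirectory>true</includeBaseDirectory>
<!-- Assumed directory name; when omitted, the project's final name is used -->
<baseDirectory>batch-handler</baseDirectory>
```

With these settings, the archive would unpack into batch-handler/bin, batch-handler/conf, and so on, instead of placing those directories at the root.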
With the following tag, we select ZIP as the output format:
<formats> <format>zip</format> </formats>
As we said before, this option accepts many formats, such as zip, tar, tar.gz, tar.bz2, jar, dir, and war. All of these are well-known file formats except dir, which is not a file format but a directive to create an exploded directory instead of a compressed file or a Java archive.
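Since formats is a list, several outputs can be requested from the same descriptor. The combination below is only an illustration and is not what the BatchHandler project uses:

```xml
<formats>
  <!-- One archive is generated per listed format -->
  <format>zip</format>
  <format>tar.gz</format>
  <!-- dir produces an exploded directory instead of an archive file -->
  <format>dir</format>
</formats>
```

The dir entry can be handy for inspecting the resulting layout without unpacking an archive.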
In the fileSets tag, we specify which files we want to put in the output directories. In the case of the bin directory, the entire contents of the source directory are copied into the destination directory:
<fileSet> <directory>src/main/resources/scripts</directory> <outputDirectory>bin</outputDirectory> </fileSet>
The same operation was performed for the log and etc directories. In the conf directory, we included only a subset of the files present in the source directory. We configure this behavior through the include directive:
<fileSet>
  <directory>src/main/resources</directory>
  <outputDirectory>conf</outputDirectory>
  <includes>
    <include>app.properties</include>
    <include>log4j.xml</include>
    <include>extract_ldap.param</include>
    <include>extract_ldap.param.sample.INT</include>
    <include>extract_ldap.param.sample.INT1</include>
    <include>extract_ldap.param.sample.PREPROD</include>
    <include>extract_ldap.param.sample.PROD</include>
  </includes>
</fileSet>
All the JAR files related to the libraries and to the batch application are included using the dependencySets tag. As with file sets, we can use the include and exclude directives to manage the set of JAR files that we want to copy:
<dependencySets>
  <dependencySet>
    <outputDirectory>libRun</outputDirectory>
    <unpack>false</unpack>
    <scope>runtime</scope>
    <outputFileNameMapping>${artifactId}.jar</outputFileNameMapping>
    <includes>
      <include>${artifact}</include>
    </includes>
  </dependencySet>
  <dependencySet>
    <outputDirectory>lib</outputDirectory>
    <unpack>false</unpack>
    <scope>runtime</scope>
    <excludes>
      <exclude>${artifact}</exclude>
    </excludes>
  </dependencySet>
</dependencySets>
The includes directive defines a set of files and directories to be included in the output archive. If no pattern is specified, all the files are included. Similarly, the excludes tag represents a set of files to exclude; if it is empty, no files are excluded.
When both are specified, the excludes tag takes priority over the includes tag. If we leave both elements empty, all the files are included in the output archive.
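The interplay between the two directives can be sketched with a file set that uses both. The patterns and the -local naming convention are hypothetical and serve only to show the precedence rule:

```xml
<fileSet>
  <directory>src/main/resources</directory>
  <outputDirectory>conf</outputDirectory>
  <includes>
    <!-- Start from every properties file under the source directory -->
    <include>**/*.properties</include>
  </includes>
  <excludes>
    <!-- Hypothetical naming convention: developer-only overrides are left out,
         even though they also match the include pattern above -->
    <exclude>**/*-local.properties</exclude>
  </excludes>
</fileSet>
```

A file such as db-local.properties matches both patterns, and because excludes wins, it does not reach the archive.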
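In the preceding snippet, we can see two different dependency sets. The first set specifies, through the appropriate tags, that its output goes into the libRun directory, that artifacts are copied without being unpacked, that only runtime-scope artifacts are considered, and that the copied file is named ${artifactId}.jar.
Since the include statement specifies the ${artifact} pattern, only the artifacts matching that pattern are placed inside the libRun directory, where outputFileNameMapping determines the name of the copied file. In our example, the pattern matches only the generated JAR file.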
The second dependencySet looks very similar to the first one. The main difference lies in the usage of the excludes statement in order to exclude a set of dependencies. Using the ${artifact} pattern, we exclude only the batch application JAR.
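As an aside, a dependency set can also leave out the project's own artifact through the useProjectArtifact flag rather than through an exclude pattern. This is an alternative sketch, not the approach taken in the BatchHandler descriptor:

```xml
<dependencySet>
  <outputDirectory>lib</outputDirectory>
  <unpack>false</unpack>
  <scope>runtime</scope>
  <!-- Alternative to the exclude pattern: do not add the artifact built by this project -->
  <useProjectArtifact>false</useProjectArtifact>
</dependencySet>
```

The flag defaults to true, which is why the project's JAR is a candidate for both dependency sets in the first place.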
Putting together all the configurations that we saw previously and completing the descriptor file with its header results in the following code:
<assembly xmlns="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.0 http://maven.apache.org/xsd/assembly-1.1.0.xsd">
  <id>run</id>
  <includeBaseDirectory>false</includeBaseDirectory>
  <formats>
    <format>zip</format>
  </formats>
  <dependencySets>
    <dependencySet>
      <outputDirectory>libRun</outputDirectory>
      <unpack>false</unpack>
      <scope>runtime</scope>
      <outputFileNameMapping>${artifactId}.jar</outputFileNameMapping>
      <includes>
        <include>${artifact}</include>
      </includes>
    </dependencySet>
    <dependencySet>
      <outputDirectory>lib</outputDirectory>
      <unpack>false</unpack>
      <scope>runtime</scope>
      <excludes>
        <exclude>${artifact}</exclude>
      </excludes>
    </dependencySet>
  </dependencySets>
  <fileSets>
    <fileSet>
      <directory>src/main/resources/config</directory>
      <outputDirectory>etc</outputDirectory>
    </fileSet>
    <fileSet>
      <directory>src/main/resources</directory>
      <outputDirectory>conf</outputDirectory>
      <includes>
        <include>app.properties</include>
        <include>log4j.xml</include>
        <include>extract_ldap.param</include>
        <include>extract_ldap.param.sample.INT</include>
        <include>extract_ldap.param.sample.INT1</include>
        <include>extract_ldap.param.sample.PREPROD</include>
        <include>extract_ldap.param.sample.PROD</include>
      </includes>
    </fileSet>
    <fileSet>
      <directory>src/main/resources/scripts</directory>
      <outputDirectory>bin</outputDirectory>
    </fileSet>
    <fileSet>
      <directory>src/main/tmp</directory>
      <outputDirectory>tmp</outputDirectory>
    </fileSet>
    <fileSet>
      <directory>src/main/resources/log</directory>
      <outputDirectory>log/backup</outputDirectory>
    </fileSet>
  </fileSets>
</assembly>
As a result, we generate a single file named BatchHandler-run.zip containing all the directories and libraries that we need to put our batch into action.