Chapter 8. Performance and Production

8.0. Introduction

You’ve spent all of this time developing your next big thing: what’s next but to ship it to the wild? Whether it is a product, internal service, or library, the last (and most important) step is delivering the fruits of your labor to your audience.

It’s easy for developers to forget that code-complete is only the beginning of an actual application’s life cycle. A successful project will spend much more time in production than it will in development, and stability and maintainability are premium features.

This is a chapter all about really finishing your work and building something that will run as painlessly as possible for years to come. Be the task performance, logging, release, or long-term maintenance, it’s all important in shipping something that is truly excellent. There is certainly a lot to worry about when your baby finally leaves home; we hope these recipes help you get it right.

8.1. AOT Compilation

Problem

You want to deliver your code as precompiled JVM bytecode in .class files, rather than as Clojure source code.

Solution

Use the :aot (ahead of time) compilation key in your project’s project.clj file to specify which namespaces should be compiled to .class files. The value of the :aot key is a vector of either symbols indicating specific namespaces to be compiled, or regular expression literals specifying that any namespace with a matching name should be compiled. Alternatively, instead of a vector, you can use the keyword :all as a value, which will AOT-compile every namespace in the project:

:aot [foo.bar foo.baz]

;; or...
:aot [#"foo\.b.+"] ; Compile all namespaces starting with "foo.b"

;; or...
:aot :all

Note that if your project has specified a :main namespace, Leiningen will AOT-compile it by default, regardless of whether it is present in an :aot directive.

Once your project is configured for AOT compilation, you can compile it by invoking lein compile at the command line. All emitted classes will be placed in the target/classes directory, unless you’ve overridden the output directory with the :target-path or :compile-path options.

Discussion

It’s important to understand that AOT compilation does not change how the code actually runs. It’s no faster or different. All Clojure code is compiled to the same bytecode before execution; AOT compilation merely means that it happens at a singular, defined point in time instead of on demand as the program loads and runs.

However, although it isn’t any faster, it can be a great tool in the following situations:

  • You want to deliver the application binary without including the original source code.
  • You want to marginally speed up an application’s start time (since the Clojure code won’t have to be compiled on the fly).
  • You need to generate classes loadable directly from Java for interop purposes.
  • You are targeting a platform (such as Android) that does not support custom class loaders for running new bytecode at runtime.
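
As a rough illustration of the Java interop case, the following sketch shows a namespace that asks for a named class with one static method. The names foo.greeter, foo.Greeter, and greet are hypothetical and chosen only for this example:

```clojure
(ns foo.greeter
  (:gen-class
    :name foo.Greeter                              ; hypothetical class name
    :methods [^:static [greet [String] String]]))  ; static String greet(String)

;; The leading dash ties this function to the generated greet method
(defn -greet [who]
  (str "Hello, " who "!"))
```

Once this namespace is AOT-compiled, Java code should be able to call foo.Greeter.greet("World") like any other static method. Outside of compilation, :gen-class does nothing, and -greet remains an ordinary Clojure function.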

You may observe that there is more than one emitted class file for each AOT-compiled namespace. In fact, there will be separate Java classes for each function, the namespace itself, and any additional gen-class, deftype, or defrecord forms. This is actually not dissimilar from Java itself; it has always been the case that inner classes are compiled to separate class files, and Clojure functions are effectively anonymous inner classes from the JVM’s point of view.
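
You can see the one-class-per-function arrangement without AOT compiling anything. This small sketch (the function square is just an example) asks a function for its JVM class at the REPL:

```clojure
(defn square [x] (* x x))

;; Every Clojure function is an instance of its own JVM class,
;; named after the namespace and the function
(.getName (class square))
;; -> "user$square" when evaluated in the user namespace

(instance? clojure.lang.IFn square)
;; -> true
```

AOT compilation writes these same classes to disk, which is why names such as foo/core$_main.class show up in a compiled project.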

8.2. Packaging a Project into a JAR File

Problem

You want to package a project into an executable JAR.

Solution

Use the Leiningen build tool to package your application as an uberjar, a JAR file that includes an application and all of its dependencies.

To follow along with this recipe, create a new Leiningen project:

$ lein new foo

Configure the project to be executable by adding :main and :aot parameters to the project’s project.clj file:

(defproject foo "0.1.0-SNAPSHOT"
  :description "FIXME: write description"
  :url "http://example.com/FIXME"
  :license {:name "Eclipse Public License"
            :url "http://www.eclipse.org/legal/epl-v10.html"}
  :dependencies [[org.clojure/clojure "1.5.1"]]
  :main foo.core
  :aot :all)

To finish making the project executable, add a -main function and :gen-class declaration to src/foo/core.clj. Remove the existing foo function:

(ns foo.core
  (:gen-class))

(defn -main [& args]
  (->> args
       (interpose " ")
       (apply str)
       (println "Executed with the following args: ")))

Run the application using the lein run command to verify it is functioning correctly:

$ lein run 1 2 3

To package the application with all of its dependencies included, invoke lein uberjar:

$ lein uberjar
Created /tmp/foo/target/uberjar/foo-0.1.0-SNAPSHOT.jar
Created /tmp/foo/target/foo-0.1.0-SNAPSHOT-standalone.jar

Execute the generated target/foo-0.1.0-SNAPSHOT-standalone.jar file by passing it as the -jar option to the java executable:

$ java -jar target/foo-0.1.0-SNAPSHOT-standalone.jar 1 2 3
Executed with the following args:  1 2 3

Discussion

Executable JAR files provide an excellent method to package a program so it can be provided to users, called by cron jobs, combined with other Unix tools, or used in any other scenario where command-line invocation is useful.

Under the hood, an executable JAR is like any other JAR file in that it contains a collection of program resources such as class files, Clojure source files, and classpath resources. Additionally, an executable JAR contains metadata indicating which class contains the main method as a Main-Class tag in its internal manifest file.

A Leiningen uberjar is a JAR file that contains not only your program, but all the dependencies bundled in as well. When Leiningen builds an uberjar, it can detect from the :main entry in project.clj that your program supplies a -main function and writes an appropriate manifest that will ensure that the emitted JAR file is executable.

The :gen-class in your namespace and the :aot Leiningen option are required to precompile your Clojure source file into a JVM class file, since the “Main-Class” manifest entry doesn’t know how to reference or compile Clojure source files.

Packaging JARs Without Their Dependencies

Not only does Leiningen make it possible to package a project with its dependencies, it also makes it possible to package it without its dependencies.

The jar command packages a project’s code without any of its upstream dependencies. Not even Clojure itself is included in the JAR file—you’ll need to BYOC.[25]

By invoking the command lein jar in the foo project, you’ll generate target/foo-0.1.0-SNAPSHOT.jar:

$ lein jar
Created /tmp/foo/target/foo-0.1.0-SNAPSHOT.jar

Listing the contents of the JAR file using the unzip command,[26] you can see that very little is packaged—just a Maven .pom file, generated JVM class files, and the project’s miscellany:

$ unzip -l target/foo-0.1.0-SNAPSHOT.jar
Archive:  target/foo-0.1.0-SNAPSHOT.jar
  Length     Date   Time    Name
 --------    ----   ----    ----
      113  12-06-13 10:26   META-INF/MANIFEST.MF
     2595  12-06-13 10:26   META-INF/maven/foo/foo/pom.xml
       91  12-06-13 10:26   META-INF/maven/foo/foo/pom.properties
      292  12-06-13 10:26   META-INF/leiningen/foo/foo/project.clj
      292  12-06-13 10:26   project.clj
      229  12-06-13 10:26   META-INF/leiningen/foo/foo/README.md
    11220  12-06-13 10:26   META-INF/leiningen/foo/foo/LICENSE
        0  12-06-13 10:26   foo/
     1210  12-06-13 10:26   foo/core$_main.class
     1304  12-06-13 10:26   foo/core$fn__16.class
     1492  12-06-13 10:26   foo/core$loading__4910__auto__.class
     1755  12-06-13 10:26   foo/core.class
     2814  12-06-13 10:26   foo/core__init.class
      162  12-04-13 14:54   foo/core.clj
 --------                   -------
    23569                   14 files

The target/foo-0.1.0-SNAPSHOT-standalone.jar listing, on the other hand, includes over 3,000 files.[27]

Since the packaged pom.xml file includes a listing of the project’s dependencies, build tools like Leiningen or Maven can resolve these dependencies on their own. This allows for efficient packaging of libraries. Can you imagine if each and every Clojure library included the entirety of its dependencies? It would be a bandwidth nightmare.

Because of this property, lean JAR files such as this are what is deployed to remote repositories when you use the lein deploy command.[28]

Without its dependencies included—namely, Clojure—you’ll need to do a bit more work to run the foo application. First, download Clojure 1.5.1. Then invoke foo.core via the java command, including clojure-1.5.1.jar and foo-0.1.0-SNAPSHOT.jar on the classpath (via the -cp option):

# Download Clojure
$ wget \
  http://repo1.maven.org/maven2/org/clojure/clojure/1.5.1/clojure-1.5.1.zip
$ unzip clojure-1.5.1.zip

# Execute the application
$ java -cp target/foo-0.1.0-SNAPSHOT.jar:clojure-1.5.1/clojure-1.5.1.jar \
       foo.core \
       1 2 3
Executed with the following args:  1 2 3

8.3. Creating a WAR File

Problem

You want to deploy a Clojure web application built using Ring as a standard web archive (WAR) file in a commonly used Java EE container such as Tomcat, JBoss, or WebLogic.

Solution

Assuming you are using Ring or a framework based on Ring (such as Compojure), the easiest way to structure your project to build as a WAR file is to use the lein-ring plug-in for Leiningen. Say that your project has a Ring handler function defined in a namespace called warsample.core,[29] like so:

(ns warsample.core)

(defn handler [request]
  {:status 200
   :headers {"content-type" "text/html"}
   :body "<h1>Hello, world!</h1>"})

To configure the project with lein-ring, add the following key/value pairs to your Leiningen project.clj file:

:plugins [[lein-ring "0.8.8"]]
:ring {:handler warsample.core/handler}

You’ll also need to make sure that your application declares a dependency on the javax.servlet/servlet-api library. Most web app libraries do include a transitive dependency, which you can verify by running lein deps :tree. If no other library you’re using includes it, you can include it yourself by adding [javax.servlet/servlet-api "2.5"] to the :dependencies key in project.clj.

The :plugins key specifies that the project uses the lein-ring plug-in, and the map under the :ring key specifies configuration options specific to lein-ring. The only required option is :handler, which indicates the var name of the application’s primary Ring handler function.

lein-ring provides a handy way to run your application locally, for development and testing. At the command line, simply type:

$ lein ring server

An embedded Jetty server will be started, serving your Ring application (on port 3000 by default, though you can change this in the lein-ring options). It will also open your operating system’s default browser to that page.
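
Because a Ring handler is an ordinary function from a request map to a response map, you can also exercise it directly at the REPL before starting any server. The following is a minimal sketch; sample-request is a hand-built stand-in for illustration, not something lein-ring provides:

```clojure
(ns warsample.core)

(defn handler [request]
  {:status 200
   :headers {"content-type" "text/html"}
   :body "<h1>Hello, world!</h1>"})

;; A Ring request is just a map; this one carries only the bare minimum
(def sample-request {:request-method :get, :uri "/"})

(:status (handler sample-request))
;; -> 200

(:body (handler sample-request))
;; -> "<h1>Hello, world!</h1>"
```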

Once you think the application is running correctly, you can build a WAR file using the lein ring war or lein ring uberwar commands. Both take the name of the WAR file to emit:

$ lein ring war warsample.war
$ lein ring uberwar warsample-with-deps.war

lein ring war builds a WAR file containing only your application code, not any transitive dependencies, whereas lein ring uberwar will build a WAR file containing bundled JAR files for every dependency as well.

Both these commands will generate all the necessary configuration and wiring (such as a WEB-INF directory and a web.xml file) before building the WAR. See the discussion section for some options you can pass to lein ring that will influence how these artifacts are generated.

After issuing the WAR build command, you will find the WAR file you created in your project’s target directory. This is a perfectly normal WAR file that you can deploy just as if it were a standard J2EE WAR. Every application server is different, so check the documentation for your preferred system to see how to deploy a WAR file. If you have an operations team responsible for production deployments, you will definitely want to check with them to make sure you adhere to their processes and best practices.

Discussion

It is crucial to understand the difference between a bare WAR file generated using lein ring war and an “uberwar” generated by lein ring uberwar, and when to use each.

A bare WAR file does not contain any of your project’s dependencies; it contains only the application code itself. This means that your program will not work unless you make sure that each and every JAR file your program depends on, including Clojure itself, is present on your web application’s shared library path. Exactly how to do this depends on the application server you’re using—you’ll have to refer to your system’s documentation to determine how to make them available.

An “uberwar,” on the other hand, includes all the JARs your program depends on in the WAR archive as a bundled library under the WEB-INF/lib subfolder. Compliant application servers are capable of running each application (each deployed WAR file) in its own class loader context and will make the bundled JARs available only to their applications.

Typically, an uberwar is a safer choice. It spares you from much of the effort of manually curating your libraries, and better reflects how your application’s classpath probably looked in development.

The cost of an uberwar, however, is that a single library may be loaded multiple times if it is bundled by multiple applications. If you are running 10 applications, all of which use (say) Compojure, the server will actually load the Compojure code into the JVM’s class space 10 times, once for each application. Some organizations running resource-constrained or high-performance deployments prefer to ensure that there is minimal redundancy in application dependencies. If this is the case, then you may have to fall back to using a non-uber WAR file and managing your dependencies in your application server’s shared library pool by hand.

Other lein-ring options

lein-ring provides some additional options you can set in the :ring configuration map in project.clj to fine-tune how WAR files are generated. For an exhaustive description, see the lein-ring project page.

A few of the more useful ones are shown in Table 8-1.

Table 8-1. lein-ring WAR options

Key               Description                                                     Default
:war-exclusions   A sequence of regexes of files to exclude from the target WAR   All hidden files
:servlet-class    The name of the generated Servlet class
:servlet-name     The name of the servlet in web.xml                              The name of the handler function
:url-pattern      The URL of the servlet mapping in web.xml                       /*
:web-xml          A specific web.xml file to use instead of the generated one
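
To give a feel for how these options fit together, here is a sketch of a :ring map in project.clj. Every value is a placeholder invented for illustration, and the value types (a vector of regexes, plain strings) are assumptions based on the descriptions in Table 8-1; check the lein-ring project page before relying on them:

```clojure
;; project.clj fragment; all values below are hypothetical
:ring {:handler        warsample.core/handler
       ;; keep a (hypothetical) scratch directory out of the WAR
       :war-exclusions [#"scratch/.*"]
       ;; name used for the servlet entry in the generated web.xml
       :servlet-name   "warsample"
       ;; mount the servlet under /app instead of the default /*
       :url-pattern    "/app/*"}
```

The :web-xml option is left out on purpose: it replaces the generated web.xml, so it would not normally be combined with options that only affect the generated file.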

Building WAR files from scratch

If you aren’t using Ring, or if you have a good reason not to use the lein-ring plug-in, you can still create a WAR file, but the process is much more hands-on. Fortunately, a WAR file is essentially a JAR file with a different extension and some additional internal structure and configuration files, so you can use the standard lein jar tool to generate one—provided you add the following files at the appropriate locations in the archive.

You’ll also need to define some AOT classes implementing javax.servlet.Servlet yourself, and have these call into your Clojure application. Then you’ll need to wire them up to the application server using a deployment descriptor (web.xml).

The structure of a WAR file is:

<war root>
|-- <static resources>
|-- WEB-INF
    |-- web.xml
    |-- <app-server-specific deployment descriptors>
    |-- lib
    |   |-- <bundled JAR libraries>
    |-- classes
        |-- <AOT compiled .class files for servlets, etc.>
        |-- <.clj source files>

A full explanation of all of these elements is beyond the scope of this recipe. For more information, see Oracle’s J2EE tutorial on packaging web archives.

Other web server libraries (for example, Pedestal Server) that include tooling for Leiningen will also often have a utility for building WAR files—check the documentation of the library you’re using.

8.4. Running an Application as a Daemon

Problem

You want to run a Clojure application as a daemon (i.e., you want your application to run in the background) in another system process.

Solution

Use the Apache Commons Daemon library to write applications that can be executed in a background process. Daemon consists of two parts: the Daemon interface, which your application must implement, and a system application[30] that runs Daemon-compliant applications as daemons.

Begin by adding the Daemon dependency to your project’s project.clj file. If you don’t have an existing project, create a new one with the command lein new my-daemon. Since Daemon is a Java-based system, enable AOT compilation so that class files are generated:

(defproject my-daemon "0.1.0-SNAPSHOT"
  :description "FIXME: write description"
  :url "http://example.com/FIXME"
  :license {:name "Eclipse Public License"
            :url "http://www.eclipse.org/legal/epl-v10.html"}
  :dependencies [[org.clojure/clojure "1.5.1"]
                 [org.apache.commons/commons-daemon "1.0.9"]]
  :main my-daemon.core
  :aot :all)

To implement the org.apache.commons.daemon.Daemon interface, add the appropriate :gen-class declaration and interface functions to one of your project’s namespaces. For a minimally functional daemon, implement -init, -start, and -stop. For best results, provide a -main function to enable smoke testing of your application without touching the Daemon interface:

(ns my-daemon.core
  (:import [org.apache.commons.daemon Daemon DaemonContext])
  (:gen-class
    :implements [org.apache.commons.daemon.Daemon]))

;; A crude approximation of your application's state
(def state (atom {}))

(defn init [args]
  (swap! state assoc :running true))

(defn start []
  (while (:running @state)
    (println "tick")
    (Thread/sleep 2000)))

(defn stop []
  (swap! state assoc :running false))

;; Daemon implementation

(defn -init [this ^DaemonContext context]
  (init (.getArguments context)))

(defn -start [this]
  (future (start)))

(defn -stop [this]
  (stop))

(defn -destroy [this])

;; Enable command-line invocation
(defn -main [& args]
  (init args)
  (start))

Package all of the necessary dependencies and generated classes by invoking the Leiningen uberjar command:

$ lein uberjar
Compiling my-daemon.core
Created /tmp/my-daemon/target/my-daemon-0.1.0-SNAPSHOT.jar
Created /tmp/my-daemon/target/my-daemon-0.1.0-SNAPSHOT-standalone.jar

Before proceeding, test your application by running it with java:

$ java -jar target/my-daemon-0.1.0-SNAPSHOT-standalone.jar
tick
tick
tick
# ... Type Ctrl-C to stop the madness

Once you’ve verified your application works correctly, install jsvc.[31] Finally, the moment of truth. Run your application as a daemon by invoking jsvc with all of the requisite parameters—the absolute path of your Java home directory, the uberjar, the output log file, and the namespace where your Daemon implementation resides:[32]

$ sudo jsvc -java-home "$JAVA_HOME" \
            -cp "$(pwd)/target/my-daemon-0.1.0-SNAPSHOT-standalone.jar" \
            -outfile "$(pwd)/out.txt" \
            my_daemon.core
# Nothing!

$ sudo tail -f out.txt
tick
tick
tick
# ... Ctrl-C to exit

# Quit the daemonized process by adding the -stop flag
$ sudo jsvc -java-home "$JAVA_HOME" \
            -cp "$(pwd)/target/my-daemon-0.1.0-SNAPSHOT-standalone.jar" \
            -stop \
            my_daemon.core

If all is well, out.txt should now contain a couple of ticks. Congratulations! Daemon can be a little hard to get set up, but once you have it running, it works fantastically. If you encounter any problems launching a daemon using jsvc, use the -debug flag to output more detailed diagnostic information.

Note

You’ll find a full working copy of the my-daemon project at https://github.com/clojure-cookbook/my-daemon.

Discussion

Have no illusions, daemonizing Java-based services is hard; yet, for over 10 years, Java developers have been using Apache Commons Daemon to this end. Why reinvent the wheel with a separate Clojure tool? One of Clojure’s core strengths is its ability to breathe new life into old tunes, and Daemon is one such “old tune.”

Not all tunes are created equal, however. Where some Java libraries require a little Java interop, Daemon requires a lot. Daemonizing an application with Apache Commons Daemon requires getting two parts just right. The first part is creating a class that implements the Daemon interface and packaging it as a JAR file. The Daemon interface consists of four methods, called at different points in a daemonized application’s life cycle:

init(DaemonContext context)
Invoked as your application is initializing. This is where you should set up any initial state for your application.
start()
Invoked after init. This is where you should begin performing work. jsvc expects start() to complete quickly, so you should kick off work in a future or Java Thread.
stop()
Invoked when a daemon has been instructed to stop. This is where you should halt whatever processing you began in start.
destroy()
Invoked after stop, but before the JVM process exits. In a traditional Java program, this is where you would free any resources you had acquired. You may be able to skip this method in Clojure applications if you’ve properly structured your application. It doesn’t hurt to include an empty function to prevent jsvc from complaining.

It’s easy enough to create a record (with defrecord) that implements the Daemon interface—but that isn’t enough. jsvc expects a Daemon-implementing class to exist on the classpath. To provide this, you must do two things: first, you need to enable ahead-of-time (AOT) compilation for your project—setting :aot :all in your project.clj will accomplish this. Second, you need to commandeer a namespace to produce a class via the :gen-class namespace directive. More specifically, you need to generate a class that implements the Daemon interface. This is accomplished easily enough using :gen-class in conjunction with the :implements directive:

(ns my-daemon.core
  ;; ...
  (:gen-class
    :implements [org.apache.commons.daemon.Daemon]))

Having set up my-daemon.core to generate a Daemon-implementing class upon compilation, the only thing left is to implement the methods themselves. Prefacing a function with a dash (e.g., -start) indicates to the Clojure compiler that a function is in fact a Java method. Further, since the Daemon methods are instance methods, each function includes one additional argument, the present Daemon instance. This argument is traditionally denoted with the name this.

In our simple my-daemon example, most of the method implementations are rather plain, taking no arguments other than this and delegating work to regular Clojure functions. -init deserves a bit more attention, though:

(defn -init [this ^DaemonContext context]
  (init (.getArguments context)))

The -init method takes an additional argument: a DaemonContext. This argument captures the command-line arguments the daemon was started with in its .getArguments property. As implemented, -init invokes the .getArguments method on context, passing its return value along to the regular Clojure function init.

On that topic, why delegate every Daemon implementation to a separate Clojure function? By separating participation in the Daemon interface from the inner workings of your application, you retain the ability to invoke it in other ways. With this separation of concerns, it becomes much easier to test your application, via either integration tests or direct invocation. The -main function utilizes these Clojure functions to allow you to verify that your application behaves correctly in isolation of daemonization.
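
To make the benefit concrete, here is a sketch of driving the same life cycle by hand, with no jsvc and no Daemon instance involved. It repeats the plain functions from the Solution, with one change that is purely an assumption for the sake of a quick check: start takes its sleep interval as a parameter and does not print.

```clojure
;; A crude approximation of your application's state
(def state (atom {}))

(defn init [args]
  (swap! state assoc :running true))

(defn start [^long interval-ms]
  ;; Loop until stop flips the :running flag
  (while (:running @state)
    (Thread/sleep interval-ms)))

(defn stop []
  (swap! state assoc :running false))

;; Walk through the life cycle: init, start on a background thread, stop
(init [])
(def worker (doto (Thread. #(start 10)) .start))
(stop)
(.join worker 1000) ; wait up to one second for the loop to notice

(.isAlive worker)
;; -> false
```

A test suite could wrap the same steps in assertions, which is much harder to do once the logic lives inside the Daemon methods themselves.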

With all of the groundwork for a Daemon-compliant application laid, the only remaining step is packaging the application. Leiningen’s uberjar command completes all of the necessary preparations for running your application as a daemon: compiling my-daemon.core to a class, gathering dependencies, and packaging them all into a standalone JAR file.

Last but not least, you need to run the darn thing. Since JVM processes don’t generally play nicely with low-level system calls, Daemon provides system applications, jsvc and procrun, that act as intermediaries between the JVM and your computer’s operating system. These applications, generally written in C, are capable of invoking the appropriate system calls to fork and execute your application in a background process. For simplicity, we’ll limit our discussion to the jsvc tool for the remainder of the recipe.

Both of these tools have a dizzying number of configuration options, but only a handful of them are actually necessary for getting the ball rolling. At a minimum, you must provide the location of your standalone JAR (-cp), your Java installation (-java-home), and the desired class to execute (the final argument). Other relevant options include -pidfile, -outfile, and -errfile; these specify where the process’s ID, STDOUT, and STDERR output will be written to, respectively. Any arguments following the name of the class to invoke will be passed into -init as a DaemonContext.

A more complete example:

$ sudo jsvc -java-home "$JAVA_HOME" \
            -cp "$(pwd)/target/my-daemon-0.1.0-SNAPSHOT-standalone.jar" \
            -pidfile /var/run/my-daemon.pid \
            -outfile "/var/log/my-daemon.out" \
            -errfile "/var/log/my-daemon.err" \
            my_daemon.core \
            "arguments" "to" "my-daemon.core"

Note

Once you’ve started a daemon with jsvc, you can halt it by re-running jsvc with the -stop option included.

Since jsvc relaunches your application in a completely new process, it carries none of its original execution context. This means no environment variables, no current working directory, nothing; the process may not even be running as the same user. Because of this, it is extremely important to specify arguments to jsvc with their absolute paths and correct permissions in place.

For our sample, we’ve opted to use sudo to make this a less painful experience, but in production you should set up a separate user with more limited permissions. The running user should have write access to the .pid, .out, and .err files, and read access to Java and the classpath.

jsvc and its ilk can be fickle beasts—the slightest misconfiguration will cause your daemon to fail silently, without warning. We highly suggest using the -debug and -nodetach flags while developing and configuring your daemon until you’re sure things work correctly.

Once you’ve nailed an appropriate configuration, the final step is to automate the management of your daemon by writing a daemon script. A good daemon script captures configuration parameters, file paths, and common operations, exposing them in a clean, noise-free skin. Instead of the long jsvc commands you executed before, you would simply invoke my-daemon start or my-daemon stop. In fact, many Linux distributions use similar scripts to manage system daemons. To implement your own jsvc daemon script, we suggest reading Sheldon Neilson’s “Creating a Java Daemon (System Service) for Debian using Apache Commons Jsvc”.

8.5. Alleviating Performance Problems with Type Hinting

Problem

You have functions that get called very often, and you want to optimize the performance of those functions.

Solution

One of the easiest ways to increase performance for a given function is to eliminate Java reflection. Enable warn-on-reflection to diagnose excessive reflection:

(defn column-idx
  "Return the index number of a column in a CSV header row"
  [header-cols col]
  (.indexOf (vec header-cols) col))

(def headers (clojure.string/split "A,B,C" #","))
(column-idx headers "B")
;; -> 1

(set! *warn-on-reflection* true)

(defn column-idx
  "Return the index number of a column in a CSV header row"
  [header-cols col]
  (.indexOf (vec header-cols) col))
;; Reflection warning, NO_SOURCE_PATH:1:1 - call to indexOf can't be resolved.

;; 100,000 non-hinted executions...
(time (dotimes [_ 100000] (column-idx headers "B")))
;; "Elapsed time: 329.258 msecs"

Once you’ve identified reflection, add type hints to your argument list preceding each argument in the form <^Type> <arg>:

(defn column-idx
  "Return the index number of a column in a CSV header row"
  [^java.util.List header-cols col]
  (.indexOf header-cols col))

;; 100,000 properly hinted executions
(time (dotimes [_ 100000] (column-idx headers "B")))
;; "Elapsed time: 27.779 msecs"

When you have groups of functions that interact together, you may see reflection warnings in spite of your properly hinted arguments. Add type hints to the argument list itself to hint the types of the functions’ return values:

;; As a simple example, imagine you want to compare the result
;; of two function calls

(defn some-calculation [x] 42)

(defn same-calc? [x y] (.equals (some-calculation x)
                                (some-calculation y)))
;; Reflection warning, NO_SOURCE_PATH:1:24 - call to equals can't be resolved.

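;; Now type-hint the return value of some-calculation. The literal 42 is a
;; java.lang.Long, so the hint must be ^Long; ^Integer would compile but
;; fail with a ClassCastException when same-calc? is called.
(defn some-calculation ^Long [x] 42)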

(defn same-calc? [x y] (.equals (some-calculation x)
                                (some-calculation y)))

;; Look Ma, no reflection warnings!

Discussion

In performance-critical code, you will often fall back to Java for speed. There is an impedance mismatch between Clojure and Java, however: Java is statically typed, whereas Clojure is dynamically typed. Because of this, (almost) every time you invoke a Java method from Clojure without type information, Clojure needs to reflect on the types of the target object and the provided arguments at runtime in order to select the appropriate Java method to invoke. For seldom-invoked methods, this isn’t too big of a deal, but for methods executed frequently, the cost of reflection can pile up quickly.

Type hinting short-circuits this reflection. Once you’ve provided enough hints for the compiler to resolve a Java method call, it no longer emits reflective code; instead, the call site invokes the appropriate Java method directly. Of course, if you’ve gotten your types wrong, your functions may not work properly; an improperly hinted function may throw a ClassCastException at runtime.
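
The failure mode is easy to reproduce. In this small sketch (str-length is a made-up example function), the hint promises a String, so handing the function anything else fails at the call to .length:

```clojure
(defn str-length [^String s]
  (.length s))

(str-length "abc")
;; -> 3

;; The hint compiles to a cast, so a non-String argument is rejected
(try
  (str-length 42)
  (catch ClassCastException _
    :wrong-type))
;; -> :wrong-type
```

Without the hint, the same call would instead go through reflection and fail with a different error, because Long has no length method.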

What about when you have an array of primitive values, all of a uniform type? Clojure provides a number of special hints for these cases, namely ^ints, ^floats, ^longs, and ^doubles. Hinting an argument with one of these lets you pass whole primitive arrays to Java methods, and work on them with functions such as aget, without provoking reflection.
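
As a brief sketch of an array hint in use, the following function (sum-doubles is a name invented for this example) hints its argument as ^doubles and its return value as ^double, then walks the array with the built-in areduce macro:

```clojure
(defn sum-doubles
  "Sum a primitive double array without reflection"
  ^double [^doubles xs]
  ;; areduce binds i to each index and acc to the running total
  (areduce xs i acc 0.0 (+ acc (aget xs i))))

(sum-doubles (double-array [1.0 2.0 3.0]))
;; -> 6.0
```

With *warn-on-reflection* enabled, removing the ^doubles hint should produce a reflection warning for the aget call, which is a quick way to confirm the hint is doing its job.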

8.6. Fast Math with Primitive Java Arrays

Problem

You need to perform fast numerical operations over significant amounts of data.

Solution

Primitive Java arrays are the canonical way to store large collections of numbers compactly and do math over them quickly (often 100 times faster than Clojure sequences). The hiphip (array) library is a quick and easy way to manipulate primitive arrays of double, long, float, or int members.

Before starting, add [prismatic/hiphip "0.1.0"] to your project’s dependencies or start a REPL using lein-try:

$ lein try prismatic/hiphip

Use one of hiphip’s amap macros to perform fast math on typed arrays. amap uses a parallel binding syntax similar to doseq:

(require 'hiphip.double)

(defn map-sqrt [xs]
  (hiphip.double/amap [x xs] (Math/sqrt x)))

(seq (map-sqrt (double-array [4.0 9.0 16.0])))
;; -> (2.0 3.0 4.0)

(defn pointwise-product
  "Produce a new double array with the product of corresponding elements of
  xs and ys"
  [xs ys]
  (hiphip.double/amap [x xs y ys] (* x y)))
(seq (pointwise-product (double-array [1.0 2.0 3.0])
                        (double-array [2.0 3.0 4.0])))
;; -> (2.0 6.0 12.0)

To modify an array in place, use one of hiphip’s afill! macros:

(defn add-in-place!
  "Modify xs, incrementing each element by the corresponding element of ys"
  [xs ys]
  (hiphip.double/afill! [x xs y ys] (+ x y)))

(let [xs (double-array [1.0 2.0 3.0])]
  (add-in-place! xs (double-array [0.0 1.0 2.0]))
  (seq xs))
;; -> (1.0 3.0 5.0)

For faster reduce-like operations, use one of hiphip’s areduce and asum macros:

(defn dot-product [ws xs]
  (hiphip.double/asum [x xs w ws] (* x w)))

(dot-product (double-array [1.0 2.0 3.0])
             (double-array [2.0 3.0 4.0]))
;; -> 20.0

Warning

We’d love to throw in a quick time benchmark to demonstrate the gains, but the JVM is a fickle beast when it comes to optimizations. We suggest using Criterium when benchmarking to avoid common pitfalls.

To see Criterium benchmarks of hiphip, see w01fe’s bench.clj gist.

Discussion

Most of the time, Clojure’s sequence abstraction is all you need to get the job done. The preceding dot-product can be written succinctly in ordinary Clojure, and this is generally what you should try first:

(defn dot-product [ws xs]
  (reduce + (map * ws xs)))

Once you identify a bottleneck in your mathematical operations, however, primitive arrays may be the only way to go. The preceding dot-product implementation can be made more than 100 times faster by using asum, primarily because map produces sequences of boxed Java Double objects. In addition to the cost of constructing an intermediate sequence, all arithmetic operations on boxed numbers are significantly slower than on their primitive counterparts.

hiphip’s amap, afill!, areduce, and asum macros (among others) are available for int, long, float, and double types. If you wanted to use areduce over an array of floats, for example, you would use hiphip.float/areduce. These macros define the appropriate type hints and optimizations per type.

Clojure also comes with built-in functions for operating on arrays, although greater care must be taken to ensure maximal performance (via appropriate type hints and use of *unchecked-math*):

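;; Skip overflow checks for the arithmetic compiled below
(set! *unchecked-math* true)

(defn map-inc [^doubles xs]
  ;; amap stores the value of the body into ret at index i
  (amap xs i ret (inc (aget xs i))))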
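
In the same spirit, the dot product from the Solution can be sketched with nothing but clojure.core, assuming both arguments are double arrays of equal length (the name dot-product-arrays is hypothetical):

```clojure
(defn dot-product-arrays
  "Dot product of two double arrays using only clojure.core.
  Assumes ws and xs have the same length."
  [^doubles ws ^doubles xs]
  ;; areduce binds i to each index and acc to the running total
  (areduce xs i acc 0.0
           (+ acc (* (aget ws i) (aget xs i)))))

(dot-product-arrays (double-array [1.0 2.0 3.0])
                    (double-array [2.0 3.0 4.0]))
;; -> 20.0
```

The type hints on both arguments are what keep aget from falling back to reflection; leaving either one off is an easy mistake to make.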

Working with primitive arrays in Clojure isn’t for the faint of heart: if you don’t get everything right, you can easily end up with code that’s both much uglier and no faster (or even slower) than the straightforward sequence version. The biggest issue to watch out for is reflection, which can easily bring you from 100 times faster to 10 times slower with one small typo or missing type hint.

If you’re up to the challenge, you should keep these tips in mind:

  • Use *warn-on-reflection* religiously, but be aware that it won’t warn you about many of the ways your code can be slow.
  • A solid profiler, or at least a comprehensive benchmark suite, is a must; otherwise you won’t know which function is using 99% of your runtime.
  • Especially if you’re not using hiphip, experiment with *unchecked-math*; it almost always makes your code faster, if you’re willing to give up the safety of overflow checks.
  • If you want your array code to go fast under Leiningen, you probably want to add the following to your project.clj: :jvm-opts ^:replace [].

See Also

8.7. Simple Profiling with Timbre

Problem

You want fine-grained statistics on the running time and invocation counts of your code.

Solution

Use Timbre to insert profiling macros into your code that won’t incur a performance penalty in production.

Before starting, add [com.taoensso/timbre "2.6.3"] to your project’s dependencies or start a REPL using lein-try:

$ lein try com.taoensso/timbre

Use the macros in the taoensso.timbre.profiling namespace to collect benchmarking metrics in development:

(require '[taoensso.timbre.profiling :as p])

(defn bench-me [f]
  (p/p :bench/bench-me
    (let [_ (p/p :bench/sleep
              (Thread/sleep 10))
          n (p/p :bench/call-f-once
              (f))
          _ (p/p :bench/call-f-10-times-outer
              (dotimes [_ 10]
                (p/p :bench/call-f-10-times-inner
                  (f))))]
      (iterate f n))))

(p/profile :info :Bench-f
  (bench-me
    (fn ([] (p/p :bench/no-arg-f) 100)
        ([a] (p/p :bench/one-arg-f) +))))

Here we define a Clojure function bench-me, which is called with a higher-order function f that takes zero or one argument.

Timbre outputs rich profiling information in a convenient table:

2013-Aug-25 ... Profiling :taoensso.timbre.profiling/Bench-f
                        Name  Calls    Min    Max   MAD   Mean Time%   Time
             :bench/bench-me      1   13ms   13ms   0ns   13ms    95   13ms
                :bench/sleep      1   11ms   11ms   0ns   11ms    76   11ms
:bench/call-f-10-times-outer      1  970μs  970μs   0ns  970μs     7  970μs
          :bench/call-f-once      1  610μs  610μs   0ns  610μs     4  610μs
:bench/call-f-10-times-inner     10   20μs  214μs  35μs   39μs     3  394μs
             :bench/no-arg-f     11    5μs  163μs  26μs   20μs     2  215μs
                [Clock] Time                                     100   14ms
              Accounted Time                                     186   26ms

Discussion

Profiling with Timbre is a great solution for Clojure-only profiling. Standard JVM profiling tools like YourKit and JVisualVM provide more comprehensive information on Java methods but come with a greater performance penalty.

Timbre’s profiling is most useful when profiling a specific area of code, rather than using profiling as an exploratory tool for tuning performance. As profiling markers are just macros, they are flexible. For example, you could record how many times a particular if branch was taken, all without leaving Clojure or suffering from mangled Clojure function names via YourKit or JVisualVM.

If profiling is deemed useful enough to keep in your code base, it is good practice to use the profiling macros via a namespace alias. p, while conveniently named, is prone to being shadowed by local bindings if used without an explicit namespace. In the solution we used the alias p, so each call to p becomes p/p.

Remember, you should not be hesitant to add profiling statements: there is little to no performance penalty for code involving taoensso.timbre.profiling/p if profiling is not enabled. This means you can leave profiling code in production, which is useful if you want to profile the same code later, or if the profiling markers make your code clearer.

8.8. Logging with Timbre

Problem

You want to add logging to your application code.

Solution

Use Timbre to configure your logger and add logging messages to your code.

Before starting, add [com.taoensso/timbre "2.7.1"] to your project’s dependencies or start a REPL using lein-try:

$ lein try com.taoensso/timbre

To write a function that writes log messages, use Timbre’s logging macros info, error, and so on:

(require '[taoensso.timbre :as log])

(defn div-4 [n]
  (log/info "Starting")
  (try
    (/ 4 n)
    (catch Throwable t
      (log/error t "oh no!"))
    (finally
      (log/info "Ending"))))

The div-4 function takes a single argument and returns 4/n.

The log/info calls will create a log message output at the “info” level. Similarly, the log/error call will create a log message at the “error” output level. Passing an exception as the first argument will cause the stack trace to be printed as well.

If you call div-4 with values that will succeed or throw an error, you will see output like the following in your REPL:

(div-4 2)
;; -> 2
;; *out*
;; 2013-Nov-22 10:34:11 -0500 laptop INFO [user] - Starting
;; 2013-Nov-22 10:34:11 -0500 laptop INFO [user] - Ending

(div-4 0)
;; -> nil
;; *out*
;; 2013-Nov-22 10:34:21 -0500 laptop INFO [user] - Starting
;; 2013-Nov-22 10:34:21 -0500 laptop ERROR [user] -
;;   oh no! java.lang.ArithmeticException: Divide by zero
;; ... Exception stacktrace
;; 2013-Nov-22 10:34:21 -0500 laptop INFO [user] - Ending

Discussion

Timbre is a great way to get started with logging in your code. Using a log library allows you to specify later where the output will go, possibly to more than one location or filtered by namespace.

Timbre writes logs to any number of configured “appenders” (output destinations). By default, a single appender is configured to write to standard out.

For example, to add a second appender for a file, you can dynamically modify the configuration by enabling the preconfigured spit appender:

;; Turn it on
(log/set-config! [:appenders :spit :enabled?] true)
;; Set the log file location
(log/set-config! [:shared-appender-config :spit-filename] "out.log")

Note that the output file’s directory must exist and the user must be able to write to the file. Once this configuration has been completed, any log messages will be written to both the console and the file.

The available logging levels are :trace, :debug, :info, :warn, :error, and :fatal. The default log level is set to :debug, so all logging levels greater than or equal to :debug will be recorded (everything but :trace).

To change the logging level at runtime, change the configuration:

(log/set-level! :warn)

While Timbre is an excellent library for simple logging in your Clojure app, it may not be sufficient if you are integrating with many Java libraries. There are a variety of popular Java logging frameworks and logging facades. If you wish to leverage the existing Java logging infrastructure, you might find the tools.logging framework more suitable.

8.9. Releasing a Library to Clojars

Problem

You’ve built a library in Clojure, and you want to release it to the world.

Solution

One of the easiest places to release libraries to is Clojars, a community repository for open source libraries. To get started, sign up for an account. If you don’t already have an SSH key, the GitHub guide “Generating SSH Keys” is an excellent resource.

Once you have an account set up, you’re ready to publish any Leiningen-based project. If you don’t have a project to publish, generate one with the command lein new my-first-project-<firstname>-<lastname>, replacing <firstname> and <lastname> with your own name.

You can now use the command lein deploy clojars to release your library to Clojars:

$ lein deploy clojars
WARNING: please set :description in project.clj.
WARNING: please set :url in project.clj.
No credentials found for clojars (did you mean `lein deploy clojars`?)
See `lein help deploy` for how to configure credentials.
Username: # 1
Password: # 2
Wrote .../my-first-project-ryan-neufeld/pom.xml
Created .../my-first-project-ryan-neufeld-0.1.0-SNAPSHOT.jar
Could not find metadata my-first-project-ryan-neufeld:
    .../0.1.0-SNAPSHOT/maven-metadata.xml 
    in clojars (https://clojars.org/repo/)
Sending .../my-first-project-ryan-neufeld-0.1.0-20131113.123334-1.pom (3k)
    to https://clojars.org/repo/
Sending .../my-first-project-ryan-neufeld-0.1.0-20131113.123334-1.jar (8k)
    to https://clojars.org/repo/
Could not find metadata my-first-project-ryan-neufeld:.../maven-metadata.xml 
    in clojars (https://clojars.org/repo/)
Sending my-first-project-ryan-neufeld/.../0.1.0-SNAPSHOT/maven-metadata.xml (1k)
    to https://clojars.org/repo/
Sending my-first-project-ryan-neufeld/.../maven-metadata.xml (1k)
    to https://clojars.org/repo/

1. Enter your Clojars username, then press Return.
2. Enter your Clojars password, then press Return.

After this command has completed, your library will be available both on the Web (https://clojars.org/my-first-project-ryan-neufeld) and as a Leiningen dependency ([my-first-project-ryan-neufeld "0.1.0-SNAPSHOT"]).

Discussion

Releasing a library doesn’t get much easier than this; just create an account and press the Big Red Button. Together, Leiningen and Clojars make it trivially easy for members of the Clojure community such as yourself to release their libraries to the masses.

In this example, you released a simple, uniquely named library with little care for versioning, release strategies, or adequate metadata. In a real project, you should pay attention to these matters to be a good open source citizen.

The easiest change is adding appropriate metadata and a website. In your project.clj file, add an accurate :description and :url. If you don’t have a website for your project, consider linking to your project’s GitHub page (or other public SCM “landing page”).
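
For illustration, the relevant part of project.clj might look like the following. The description text and URL are placeholders, not values from a real project:

```clojure
(defproject my-first-project-ryan-neufeld "0.1.0-SNAPSHOT"
  ;; Short summary of what the library does (placeholder text)
  :description "An example library published while following this recipe"
  ;; Project home page or repository landing page (placeholder URL)
  :url "https://github.com/<username>/my-first-project-ryan-neufeld")
```

With both keys set, the two WARNING lines printed by lein deploy clojars in the Solution should no longer appear.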

Less easy is having consistent version numbers for your project. We suggest a scheme called Semantic Versioning, or “semver.” The semver scheme prescribes a version number of three parts, major, minor, and patch, joined with periods. This ends up looking like “0.1.0” or “1.4.2”. Each version position indicates a certain level of stability and consistency across releases. Releases sharing a major version should be API-compatible; bumping the major version says, “I have fundamentally changed the API of this library.” The minor version indicates when new, backward-compatible functionality has been added. Finally, the patch version indicates when bug fixes have been made.
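
To make the three positions concrete, here is a small sketch of how a version number changes for each kind of release. The helper names parse-version and bump are hypothetical, and the sketch assumes a plain "major.minor.patch" string with no qualifier such as -SNAPSHOT:

```clojure
(require '[clojure.string :as str])

(defn parse-version
  "Split a \"major.minor.patch\" string into a map of integers."
  [s]
  (let [[major minor patch] (map #(Long/parseLong %) (str/split s #"\."))]
    {:major major :minor minor :patch patch}))

(defn bump
  "Return the next version string for a change of kind
  :major (breaking), :minor (new, compatible), or :patch (bug fix)."
  [version kind]
  (let [{:keys [major minor patch]} (parse-version version)]
    (case kind
      :major (str (inc major) ".0.0")                 ; lower positions reset
      :minor (str major "." (inc minor) ".0")
      :patch (str major "." minor "." (inc patch)))))

(bump "1.4.2" :patch) ;; -> "1.4.3"
(bump "1.4.2" :minor) ;; -> "1.5.0"
(bump "1.4.2" :major) ;; -> "2.0.0"
```

Note that bumping a position resets every position to its right to zero.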

It certainly takes discipline to follow Semantic Versioning, but when you do, you make it easier for your fellow developers to understand your library versions and trust them to behave in a way they expect.

Code signing is another important concern in the deployment process. Signing the artifacts you release lets your users know the artifacts were created by someone they trust (you) and contain exactly what you intended (i.e., they have not been tampered with). Leiningen includes the facilities to sign release artifacts using GPG and include the relevant .asc signature files in lein deploy publications. Enabling code signing is described in the GNU Privacy Guard (GPG) section of Leiningen’s deploying libraries guide.

See Also

  • The Clojars wiki, a bountiful source of information on releasing libraries to Clojars
  • Leiningen’s own deploying libraries guide, which covers code signing and how to deploy to repositories other than Clojars
  • The output of the lein help deploy command

8.10. Using Macros to Simplify API Deprecations

Problem

You want to use Clojure macros to deprecate API functions and report existing deprecations.

Solution

When maintaining a library that other programmers rely on to get their work done, it behooves you to be thoughtful when making changes. In the process of fixing bugs and making improvements to your library, you will eventually wish to change its public interface. Making changes to a public-facing portion of your library is no small matter, but assuming that you’ve determined its necessity, then you’ll want to deprecate the functions that are obsolete. The term “deprecate” basically means that a given function should be avoided in favor of some other, newer function.

For an example, take the case of the Clojure contrib library core.memoize. Without going into detail about what core.memoize does, it’s fine to know that at one point a segment of its public-facing API was a function named memo-fifo that looked like the following:

(defn memo-fifo
  ([f] ... )
  ([f limit] ... )
  ([f limit base] ... ))

Obviously, the implementation has been elided to highlight only the parts that were planned for change in a later version—namely, the function’s name and its available argument vectors. The details of the new API are not important, but they were different enough to cause potential confusion to the users. In a case like this, simply making the change without due notice in a new version would have been bad form and genuine cause for bitterness.

Therefore, the question arises: what can you do in the case where a feature is planned for deprecation that not only supports existing code in the interim, but also provides fair warning to the users of your library of a future breaking change? In this section, we’ll discuss using macros to provide a nice mechanism for deprecating library functions and macros with minimal fuss.

In the case of the planned deprecation of memo-fifo, the new function, named simply fifo, was changed not only in name but also in its provided arities. When deprecating portions of a library, it’s often a good idea to print warning messages that point to the new, preferred function to use instead. Therefore, to start on the way to deprecating memo-fifo, the following function, !!, was created to print a warning:

(defn ^:private !! [c]
  (println "WARNING - Deprecated construction method for"
           c
           "cache; preferred way is:"
           (str "(clojure.core.memoize/" c
                " function <base> <:"
                c "/threshold num>)")))

When passed a symbol, the !! function prints a message like this one:

(!! 'fifo)

;; WARNING - Deprecated construction method for fifo cache;
;; preferred way is:
;; (clojure.core.memoize/fifo function <base> <:fifo/threshold num>)

Not only does the deprecation message indicate that the function called is deprecated, but it also points to the function that should be used instead. As far as deprecation messages go, this one is solid, although your own purposes may call for something different. In any case, to insert this warning on every call to memo-fifo, we can create a simple macro to inject the call to !! into the body of the function’s definition, as shown here:

(defmacro defn-deprecated [nom _ alt ds & arities]
  `(defn ~nom ~ds                                    ; 1
     ~@(for [[args body] arities]                    ; 2
         (list args `(!! (quote ~alt)) body))))      ; 3

1. Create a defn call with the given name and docstring.
2. Loop through the given function arities.
3. Insert a call to !! as the first part of the body.

We’ll talk a bit about the goals of the defn-deprecated macro in the following discussion section, but for now, you can see how it works:

(defn-deprecated memo-fifo :as fifo
  "DEPRECATED: Please use clojure.core.memoize/fifo instead."
  ([f] ... )
  ([f limit] ... )
  ([f limit base] ... ))

The only changes to the definition of memo-fifo are the use of the defn-deprecated macro instead of defn directly, the use of the :as fifo directive, and the addition (or change) of the docstring to describe the deprecation. The defn-deprecated macro takes care of assembling the parts in the macro body to print the warning on use:

(def f (memo-fifo identity 32))
;; WARNING - Deprecated construction method for fifo cache;
;; preferred way is:
;; (clojure.core.memoize/fifo function <base> <:fifo/threshold num>)

The warning message will only display once for every call to memo-fifo, and due to the nature of that function, that should be sufficient.
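
Because the bodies of memo-fifo are elided above, the following self-contained sketch repeats the !! function and the defn-deprecated macro and applies them to a hypothetical function, old-double, whose replacement is assumed to be called new-double:

```clojure
(defn ^:private !! [c]
  (println "WARNING - Deprecated construction method for"
           c
           "cache; preferred way is:"
           (str "(clojure.core.memoize/" c
                " function <base> <:"
                c "/threshold num>)")))

(defmacro defn-deprecated [nom _ alt ds & arities]
  `(defn ~nom ~ds
     ~@(for [[args body] arities]
         (list args `(!! (quote ~alt)) body))))

;; Hypothetical deprecated function with real (non-elided) bodies
(defn-deprecated old-double :as new-double
  "DEPRECATED: Please use new-double instead."
  ([x] (* 2 x))
  ([x y] (* 2 (+ x y))))

(old-double 21)
;; prints the WARNING line, mentioning new-double
;; -> 42
```

Each arity of the generated function begins with a call to !!, so the warning is printed regardless of how many arguments the caller passes.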

Discussion

There are different ways to handle the same situation besides using macros. For example, the !! function could have taken a function and a symbol and wrapped the function, inserting a deprecation warning in passing:

(defn depr [fun alt]
  (fn [& args]                                        ; 1
    (println
      "WARNING - Deprecated construction method for"
      alt
      "cache; preferred way is:"
      (str "(clojure.core.memoize/" alt
           " function <base> <:"
           alt "/threshold num>)"))
    (apply fun args)))                                ; 2

1. Return a function that prints the deprecation message before calling the deprecated function.
2. Call the deprecated function.

This alternative to !!, here named depr, would be used in the following way:

(def memo-fifo (depr old-memo-fifo 'fifo))

Thereafter, calling the memo-fifo function will print the deprecation message. Using a higher-order function like this is a reasonable way to avoid the potential complexities of using a macro. However, we chose the macro version for a number of reasons, explained in the following sections.

Preserving stack traces

Let’s be honest: the exception stack traces that Clojure can produce can at times be painful to deal with. If you decide to use a higher-order function like depr, be aware that if an exception occurs in its execution, another layer of stack trace will be added. By using a macro like defn-deprecated that delegates its operation directly to defn, you are ensured that the stack trace will remain unadulterated (so to speak).

Metadata

Using a near 1-for-1 replacement macro like defn-deprecated allows you to preserve the metadata on a function. Observe:

(defn-deprecated ^:private memo-foo :as bar
  "Does something."
  ([] 42))

(memo-foo)
;; WARNING - Deprecated construction method for bar cache;
;; preferred way is:
;; (clojure.core.memoize/bar function <base> <:bar/threshold num>)
;;=> 42

Because defn-deprecated defers the bulk of its behavior to defn, any metadata attached to its elements automatically gets forwarded on and attached as expected:

(meta #'memo-foo)

;;=> {:arglists ([]), :ns #<Namespace user>,
;;    :name memo-foo, :private true, :doc "Does something.",
;;    ...}

Using the higher-order approach does not automatically preserve metadata:

(def baz (depr memo-foo 'bar))

(meta #'baz)
;;=> {:ns #<Namespace user>, :name baz, ...}

Of course, you could copy over the metadata if so desired, but why do so when the macro approach takes care of it for you?
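
If you do go the higher-order route, copying the metadata takes one call to alter-meta!. A minimal sketch, using a simplified depr with a shortened message and the hypothetical names old-inc, new-inc, and wrapped-inc:

```clojure
;; Simplified wrapper: warn, then delegate to the wrapped function
(defn depr [fun alt]
  (fn [& args]
    (println "WARNING - Deprecated; preferred way is:" alt)
    (apply fun args)))

(defn old-inc
  "Increments n."
  [n]
  (inc n))

(def wrapped-inc (depr old-inc 'new-inc))

;; Copy the docstring and argument lists from the original var
(alter-meta! #'wrapped-inc merge
             (select-keys (meta #'old-inc) [:doc :arglists]))

(:doc (meta #'wrapped-inc))      ;; -> "Increments n."
(:arglists (meta #'wrapped-inc)) ;; -> ([n])
```

This is extra bookkeeping that has to be repeated for every wrapped function, which is the cost the macro approach avoids.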

Faster call site

The depr function, because it’s required to handle any function that you give it, needed to use apply at its core. While in the case of the core.memoize functions this was not a problem, it may become so in the case of functions requiring higher performance. In reality, though, the use of println will likely overwhelm the cost of the apply, so if you really need to deprecate a high-performance function, then you might want to consider the following approach instead.

Compile-time warnings

The operation of defn-deprecated is such that the deprecation warning is printed every time the function is called. This could be problematic if the function requires high speed.

Very few things slow a function down like a console print. Therefore, we can change defn-deprecated slightly to report its warning at compile time rather than runtime:

(defmacro defn-deprecated [nom _ alt ds & arities]
  (!! alt)                     ; 1
  `(defn ~nom ~ds ~@arities))  ; 2

1. Print the warning when the macro is expanded.
2. Delegate function definition to defn without adulteration.

Observe the compile-time warning:

(defn-deprecated ^:private memo-foo :as bar
  "Does something."
  ([] 42))

;; WARNING - Deprecated construction method for bar cache;
;; preferred way is:
;; (clojure.core.memoize/bar function <base> <:bar/threshold num>)
;;=> #'user/memo-foo

(memo-foo)
;;=> 42

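This approach will work well if you distribute libraries as source code rather than as compiled programs. With AOT-compiled code, the macro has already been expanded when the library was built, so users who load the resulting .class files will never see the warning.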

Turning it off

The real beauty of macros is not that they allow you to change the semantics of your programs, but that they allow you to avoid doing so whenever it’s not appropriate. For example, when using macros, you can run any code available to Clojure at compile time. Thankfully, the full Clojure language is available at compile time. Therefore, we can check a Boolean flag attached to a namespace as metadata to decide whether to report a compile-time deprecation warning. We can change the newest defn-deprecated to illustrate this technique:

(defmacro defn-deprecated
  [nom _ alt ds & arities]
  (let [silence? (:silence-deprecations (meta clojure.core/*ns*))] ; 1
    (when-not silence?  ; 2
      (!! alt)))
  `(defn ~nom ~ds ~@arities))

1. Look up the metadata on the current namespace.
2. Only report the deprecation warning if the flag is not set to silence mode.

The defn-deprecated macro checks the status of the :silence-deprecations metadata property on the current namespace and reports (or not) the deprecation warning based on it. If you wind up using this approach, then you can turn off the deprecation warning on a per-namespace basis by adding the following to your ns declaration:

(ns ^:silence-deprecations my.awesome.lib)

Now, any use of defn-deprecated in that namespace will not print the warning. Future versions of Clojure may provide a cleaner way of creating and managing compile-time flags, but for now this is a decent compromise.

See Also



[25] Bring your own Clojure!

[26] Available on most Unix-based systems.

[27] All of which we won’t be committing to print. Take a look for yourself with the command lein uberjar && unzip -l target/foo-0.1.0-SNAPSHOT-standalone.jar.

[28] See Recipe 8.9, “Releasing a Library to Clojars”, for more information on releasing libraries.

[29] If you don’t happen to already have a similarly named project, and you want to follow along, create a new one with lein new warsample.

[30] jsvc on Unix systems; procrun on Windows.

[31] On OS X we suggest using Homebrew to brew install jsvc. If you’re using Linux, you’ll likely find a jsvc package in your favorite package manager. Windows users will need to install and use procrun.

[32] Don’t worry, we’ll capture all this in a shell script soon.

[33] To accurately measure performance improvements from unchecked-math, we suggest using a tool like Criterium. Benchmarking code with time can be tricky and often yields misleading results (or none at all).
