Clojure Makes Java Seq-able

The seq abstraction of first/rest applies to anything that there can be more than one of. In the Java world, that includes the following:

  • The Collections API
  • Regular expressions
  • File system traversal
  • XML processing
  • Relational database results

Clojure wraps these Java APIs, making the sequence library available for almost everything you do.

Seq-ing Java Collections

If you try to apply the sequence functions to Java collections, you’ll find that they behave as sequences. Collections that can act as sequences are called seq-able. For example, arrays are seq-able:

 ; String.getBytes returns a byte array
 (first (.getBytes "hello"))
 -> 104
 
 (rest (.getBytes "hello"))
 -> (101 108 108 111)
 
 (cons (int \h) (.getBytes "ello"))
 -> (104 101 108 108 111)

Hashtables and Maps are also seq-able:

 ; System.getProperties returns a Hashtable
 (first (System/getProperties))
 -> #object[java.util.Hashtable$Entry 0x12468a38
 "java.runtime.name=Java(TM) SE Runtime Environment"]
 
 (rest (System/getProperties))
 -> (#object[java.util.Hashtable$Entry 0x5b239d7d
 "sun.boot.library.path=/Library/... etc. ...

Remember that sequences are immutable, even when the underlying Java collection is mutable. So, you can’t update the system properties by consing a new item onto (System/getProperties). cons will return a new sequence; the existing properties are unchanged.
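The same point can be checked with any mutable Java collection. The following is a minimal sketch that uses a java.util.ArrayList in place of the system properties; the names java-list and extended are made up for this illustration:

```clojure
(import 'java.util.ArrayList)

;; A mutable Java list holding two elements (illustrative data)
(def java-list (ArrayList. [1 2]))

;; cons returns a new sequence; java-list itself is not modified
(def extended (cons 0 java-list))

extended
;; => (0 1 2)

(vec java-list)
;; => [1 2]
```

The ArrayList still holds only its original two elements after the call to cons.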

Since strings are sequences of characters, they also are seq-able:

 (first "Hello")
 -> \H
 
 (rest "Hello")
 -> (\e \l \l \o)
 
 (cons \H "ello")
 -> (\H \e \l \l \o)

Clojure will automatically obtain a sequence from a collection, but it won’t automatically convert a sequence back to the original collection type. With most collection types this behavior is intuitive, but with strings you’ll often want to convert the result to a string. Consider reversing a string. Clojure provides reverse:

 ; probably not what you want
 (reverse "hello")
 -> (\o \l \l \e \h)

To convert a sequence back to a string, use (apply str seq):

 (apply str (reverse "hello"))
 -> "olleh"
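For strings specifically, the clojure.string library offers shortcuts that may read more directly. This sketch assumes the library is aliased as str, an alias chosen only for this example:

```clojure
(require '[clojure.string :as str])

;; clojure.string/reverse works on the string directly and returns a string
(str/reverse "hello")
;; => "olleh"

;; clojure.string/join concatenates the items of any sequence into a string
(str/join (reverse "hello"))
;; => "olleh"
```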

The Java collections are seq-able, but for most scenarios, they don’t offer advantages over Clojure’s built-in collections. Prefer the Java collections only in interop scenarios where you’re working with legacy Java APIs.

Seq-ing Regular Expressions

Clojure’s regular expressions use the java.util.regex library under the hood. At the lowest level, this exposes the mutable nature of Java’s Matcher. You can use re-matcher to create a Matcher for a regular expression and a string and then loop on re-find to iterate over the matches.

 (re-matcher regexp string)
 ; don't do this!
 (let [m (re-matcher #"\w+" "the quick brown fox")]
   (loop [match (re-find m)]
     (when match
       (println match)
       (recur (re-find m)))))
 | the
 | quick
 | brown
 | fox
 -> nil

Much better is to use the higher-level re-seq.

 (re-seq regexp string)

re-seq exposes an immutable seq over the matches. This gives you the power of all of Clojure’s sequence functions. Try these expressions at the REPL:

 (re-seq #"\w+" "the quick brown fox")
 -> ("the" "quick" "brown" "fox")
 
 (sort (re-seq #"\w+" "the quick brown fox"))
 -> ("brown" "fox" "quick" "the")
 
 (drop 2 (re-seq #"\w+" "the quick brown fox"))
 -> ("brown" "fox")
 
 (map clojure.string/upper-case (re-seq #"\w+" "the quick brown fox"))
 -> ("THE" "QUICK" "BROWN" "FOX")

re-seq is an example of how good abstractions reduce code bloat. Regular expression matches are not a special thing, requiring special methods to deal with them. They are sequences, just like everything else. Thanks to the number of sequence functions, you get more functionality “for free” than you would likely end up with after a misguided foray into writing regexp-specific functions.
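As one more illustration of getting functionality "for free": when the pattern contains capture groups, each item that re-seq returns is a vector of the whole match followed by the groups, so ordinary destructuring applies. The input string and the names pairs and settings below are invented for this sketch:

```clojure
;; Each match is a vector: [whole-match group-1 group-2]
(def pairs (re-seq #"(\w+)=(\d+)" "width=80 height=24"))

pairs
;; => (["width=80" "width" "80"] ["height=24" "height" "24"])

;; Destructure each match and collect the two groups into a map
(def settings (into {} (map (fn [[_ k v]] [k v]) pairs)))

settings
;; => {"width" "80", "height" "24"}
```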

Seq-ing the File System

You can seq over the file system. For starters, you can call java.io.File directly:

 (import 'java.io.File)
 (.listFiles (File. "."))
 -> [Ljava.io.File;@1f70f15e

The [Ljava.io.File... is Java’s toString representation for an array of Files. Sequence functions would call seq on this automatically, but the REPL doesn’t.

So, seq it yourself:

 (seq (.listFiles (File. ".")))
 -> (#object[java.io.File 0x44fe9319 "./clojurebreaker"] ...)

If the default print format for files doesn’t suit you, you could map them to a string form with getName:

 ; overkill
 (map #(.getName %) (seq (.listFiles (File. "."))))
 -> ("clojurebreaker" "data" ...)

Once you decide to use a function like map, calling seq is redundant. Sequence library functions call seq for you, so you don’t have to. The previous code simplifies to this:

 (map #(.getName %) (.listFiles (File. ".")))
 -> ("clojurebreaker" "data" ...)

Often, you want to recursively traverse the entire directory tree. Clojure provides a depth-first walk via file-seq. If you file-seq from the sample code directory for this book, you will see a lot of files:

 (count (file-seq (File. ".")))
 -> 169
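The seq returned by file-seq includes directories as well as files. If only regular files matter, a filter on java.io.File's isFile method is one way to narrow it down. The helper name regular-files is hypothetical, and the count you get depends on your own directory:

```clojure
(import 'java.io.File)

(defn regular-files
  "Returns a lazy seq of the entries under dir that are files, not directories."
  [^File dir]
  (filter (fn [^File f] (.isFile f)) (file-seq dir)))

;; How many regular files live under the current directory?
(count (regular-files (File. ".")))
```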

What if you want to see only the files that have been changed recently? Write a predicate recently-modified? that checks to see whether a file was touched in the last half hour:

 (defn minutes-to-millis [mins] (* mins 1000 60))
 
 (defn recently-modified? [file]
   (> (.lastModified file)
      (- (System/currentTimeMillis) (minutes-to-millis 30))))

Give it a try:

 (filter recently-modified? (file-seq (File. ".")))
 -> (./sequences ./sequences/sequences.clj)

Note that your results will vary from those shown here.
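The half-hour window is hard-coded in the predicate. One possible variation, sketched here under the assumption that you want the window to be adjustable, takes the number of minutes as an argument. The name modified-within? is invented for this example:

```clojure
(import 'java.io.File)

(defn modified-within?
  "True if file was last modified during the past mins minutes."
  [mins ^File file]
  (> (.lastModified file)
     (- (System/currentTimeMillis) (* mins 60 1000))))

;; partial fixes the window, yielding a one-argument predicate for filter
(filter (partial modified-within? 5) (file-seq (File. ".")))
```

Because the window is the first argument, partial turns the function back into a predicate that filter can use directly.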

Seq-ing a Stream

In Java, a Reader provides a stream of characters. You can seq over the lines of any Java Reader using line-seq. To get a Reader, you can always use Clojure’s clojure.java.io library. The clojure.java.io library provides a reader function that returns a reader on a stream, file, URL, or URI.

 (require '[clojure.java.io :refer [reader]])
 ; leaves reader open...
 (take 2 (line-seq (reader "src/examples/utils.clj")))
 -> ("(ns examples.utils" " (:import [java.io BufferedReader InputStreamReader]))")

Since readers can represent non-memory resources that need to be closed, you should wrap reader creation in a with-open. Create an expression that uses the sequence function count, to count the number of lines in a file, and uses with-open to correctly close the reader when the body is complete:

 (with-open [rdr (reader "src/examples/utils.clj")]
   (count (line-seq rdr)))
 -> 64
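This works because count consumes the whole sequence before with-open closes the reader. Since line-seq is lazy, returning the sequence itself from with-open would defer the reading until after the reader is closed. A sketch of one way around that is to realize the lines you need inside the body, for example with vec. The function name first-lines is hypothetical, and a StringReader stands in for a file so the example is self-contained:

```clojure
(require '[clojure.java.io :refer [reader]])
(import 'java.io.StringReader)

(defn first-lines
  "Returns the first n lines of source as a vector, realized before the reader closes."
  [source n]
  (with-open [rdr (reader source)]
    ;; vec forces the lazy seq while rdr is still open
    (vec (take n (line-seq rdr)))))

(first-lines (StringReader. "one\ntwo\nthree\n") 2)
;; => ["one" "two"]
```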

To make the example more useful, add a filter to count only non-blank lines:

 (with-open [rdr (reader "src/examples/utils.clj")]
   (count (filter #(re-find #"\S" %) (line-seq rdr))))
 -> 55

Using seqs both on the file system and on the contents of individual files, you can quickly create interesting utilities. Create a program that defines these three predicates:

  • non-blank? detects non-blank lines.
  • non-svn? detects files that are not Subversion metadata.
  • clojure-source? detects Clojure source code files.

Then, create a clojure-loc function that counts the lines of Clojure code in a directory tree, using a combination of sequence functions along the way: reduce, for, count, and filter.

 (require '[clojure.java.io :refer [reader]])
 (require '[clojure.string :refer [blank?]])
 
 (defn non-blank? [line] (not (blank? line)))
 
 (defn non-svn? [file] (not (.contains (.toString file) ".svn")))
 
 (defn clojure-source? [file] (.endsWith (.toString file) ".clj"))
 
 (defn clojure-loc [base-file]
   (reduce
    +
    (for [file (file-seq base-file)
          :when (and (clojure-source? file) (non-svn? file))]
      (with-open [rdr (reader file)]
        (count (filter non-blank? (line-seq rdr)))))))

Now let’s use clojure-loc to find out how much Clojure code is in Clojure itself:

 (clojure-loc (java.io.File. "/home/abedra/src/opensource/clojure/clojure"))
 -> 38716

The clojure-loc function is very task specific, but because it’s built out of sequence functions and simple predicates, you can easily adapt it to very different tasks.
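As a sketch of what such an adaptation might look like, the version below takes the file predicate and the line predicate as arguments instead of fixing them. The names matching-loc, comment-line?, and clj-file? are all hypothetical and exist only for this illustration:

```clojure
(require '[clojure.java.io :refer [reader]]
         '[clojure.string :as str])

(defn matching-loc
  "Sums the lines satisfying line-pred over the files under base-file
  that satisfy file-pred."
  [base-file file-pred line-pred]
  (reduce
   +
   (for [f (file-seq base-file)
         :when (and (.isFile f) (file-pred f))]
     (with-open [rdr (reader f)]
       (count (filter line-pred (line-seq rdr)))))))

;; Example predicates: comment lines in Clojure source files
(defn comment-line? [line] (str/starts-with? (str/triml line) ";"))

(defn clj-file? [f] (str/ends-with? (.getName f) ".clj"))

;; Counts comment lines in .clj files under the current directory
(matching-loc (java.io.File. ".") clj-file? comment-line?)
```

Swapping in different predicates changes what gets counted without touching the traversal or the file handling.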
