One of the key features of the Lisp family of languages is their orientation around lists—not just as data structures, but as a structural metaphor for algorithms and execution flow. Recursive algorithms, for example, can be very cleanly structured around lists in Lisp variants.
Unfortunately, in most Lisps, this metaphor is tightly bound to the actual implementation of a singly-linked list, which has performance characteristics that make it unsuitable for many purposes.
To resolve this problem, Clojure introduced a new abstraction around the concept of a list, called a sequence, which is shared by ClojureScript. A sequence is a logical list, similar to those in most Lisps, with a well-defined set of operations. However, Clojure sequences are not a concrete type, but rather an abstract contract that may be satisfied concretely by a variety of different types of objects. All of ClojureScript’s collections, as well as many other types of logical collections in JavaScript, can be used as sequences. This allows ClojureScript code to be constructed in an idiomatic list-based Lisp style, while using whatever data structure is actually most appropriate for the job.
Many common operations in ClojureScript are part of the sequence API. Functions from the sequence API are used to select items from sequences, add items to sequences, and produce, consume, and transform sequences. Understanding how sequences work will give you a major leg up in understanding and writing idiomatic ClojureScript code and, once you get the hang of it, will help you write functions of your own that are highly general and composable with other sequence functions.
The basic definition of a sequence is very simple. All sequences
have two elements: a first, which is the first
element, and a rest, a sequence of the remaining
elements. An empty first and
rest in ClojureScript are directly analogous to the
car and cdr of older Lisps. They
are renamed to be clearer and to emphasize that they are abstract
concepts, not inherently bound to any particular implementation of a
sequence. A nil rest signals the
end of the sequence. You can obtain the first or rest of a sequence by
using the first
or rest
functions.
Anything that can be represented as a first and a rest can be a
sequence. The seq
function is used to polymorphically obtain
a sequence view of any object that supports it. Calling seq
explicitly to convert a collection is rarely necessary, however, since
sequence functions (including first
and rest
)
call seq
on their argument for you.
Lists and vectors are the most obvious sequences, being naturally ordered collections of elements:
(first [:a :b :c]) ;=> :a (rest [:a :b :c]) ;=> (:b :c) (first '(1 2 3)) ;=> 1 (rest '(1 2 3)) ;=> (2 3)
Sets are also sequences. Although they are unordered, you can still use sequence functions on them; the elements will just be cast into an arbitrary (though consistent, for the same set) order:
(first #{:b :c :a}) ;=> :a (rest #{:b :c :a}) ;=> (:b :c)
Maps are also sequences of key-value pairs, represented as two-element vectors and returned in arbitrary order:
(first {:b 2 :a 1 :c 3}) ;=> [:a 1] (rest {:b 2 :a 1 :c 3}) ;=> ([:c 3] [:b 2])
Other items which can be viewed as a sequence (sequable objects) include native JavaScript arrays and strings (as sequences of their constituent characters).
Despite their obvious utility for working with data structures, one
of the advantages of sequences is that they don’t
need to be backed by an actual data structure in
memory. All they have to do is implement first
and
rest
meaningfully.
This leads to an interesting and extremely useful feature: it is
possible to have sequences with a rest that isn’t actually created until
you call the rest
function. Sequences with a nonrealized rest
are known as lazy sequences. Because lazy sequences
don’t have to fully exist in memory all at once, they can be arbitrarily
large, even infinite.
For example, in ClojureScript, it is possible to construct an
infinite sequence of every positive integer using the iterate
function:
(do (def i (iterate inc 0)) nil)
As an aside, note that the def
is wrapped in a
do
, which returns nil
. This is only to prevent
the value of the i
symbol from being printed back at the
REPL, which ClojureScript does by default. i
can exist as a
lazy sequence using hardly any memory, but if you try to print it, it will
try to print the entire thing to the REPL. Obviously, this is impossible.
This is one thing to be careful of when using infinite lazy sequences:
don’t do something that would cause them to be printed! This will almost
certainly crash your process and force you to restart.
iterate
is a higher-order function that takes two
arguments, a function and an initial value. It returns a lazy sequence
with a first element of the initial value. Its rest sequence is lazily
constructed, and in turn has an first value of the function applied to the
initial value. Its rest, also lazy, is the result of applying the function
to the previous value, and so on.
So, by starting with the inc
function and an initial
value of 0
, the above expression constructs a sequence where
each successive element is constructed by incrementing the previous
element; this is a sequence of the positive integers. But the calculation
is only performed when the sequence is actually realized, so it doesn’t
end up using an infinite amount of memory.
Lazy sequences are cached, so each element is realized only once, and then stored instead of being recalculated. This means that expensive computations won’t be run unnecessarily. It also means that it’s safe to have sequence generating functions with side effects—they may not happen at all, if the sequence is never realized, but they’ll never be executed repeatedly.
The downside is that lazy sequences do consume memory, once they’ve been realized. For this reason, if you’re processing a very large or infinite sequence, it’s important not to maintain a reference to the head of the sequence as you iterate over it. That way, the earlier parts of the sequence can be garbage collected each time you move forward on the sequence. But if you maintain a reference to the entire sequence, its elements will be cached, not garbage collected, and you’re likely to get out-of-memory errors.
ClojureScript has a large library of functions that operate on
sequences. Although you don’t have to know them all, familiarity with the
basic ones covered here is critical to writing idiomatic functional code.
In particular, you should become very comfortable with map
,
reduce
, and filter
if you’re not already; these
are the staples of the functional programming style.
map
is a higher order function that takes a function
(which takes a single argument) and a sequence, and returns a sequence
of items resulting from applying the function to each item in the input
sequence. It is lazy; the input sequence is only consumed and the
mapping function applied when the resulting sequence is realized. As
such, it can be used on an infinite sequence to return a new infinite
sequence.
For example, the following expression applies a function to obtain the square of every number in the input sequence:
(map (fn [n] (* n n)) [1 2 3 4 5]) ;=> (1 4 9 16 25)
map
can also take more than one sequence. In this
case, the function provided must take the same number of arguments as
there are sequences, and the function is applied to the first member of
each sequence, then the second, third, and so on. Processing stops at
the end of the shortest sequence. For example:
(map (fn [a b] (+ a b)) [1 2 3] [10 20 30 40]) ;=> (11, 22, 33)
reduce
is a function that takes a function and a
sequence and uses the provided function to consume the entire sequence
and return a single value. As such, it is not lazy and should not be
used on infinite sequences.
The supplied function must take two arguments; it is first invoked on the first two elements of the sequence, then invoked again with the resulting value and the third item, then with the result of that and the fourth item, and so on until the sequence is consumed.
The following example uses the +
function to obtain a
sum of the numbers in a sequence:
(reduce + [1 2 3 4]) ;=> 10
You can also invoke reduce
with three arguments,
supplying an initial value as the second argument to
reduce
. It will be used along with the first element from
the sequence in the first function invocation, instead of using the
first two values from the sequence.
filter
takes a function and a sequence, and returns a
sequence consisting only of the items for which applying the function to
items in the input sequence returns logical true. It is fully lazy,
although consuming a single item in the resulting sequence may “jump”
forward in the input sequence until it finds one that meets the
specified criteria.
The following example uses the built-in integer?
function to return a filtered sequence only of numbers that are
integers:
(filter integer? [1 2.71 3.14 5 42]) ;=> (1 5 42)
Although there are a great many other sequence functions you’ll want to explore, here are some of the most frequently used:
Takes an item and a sequence, and returns a new sequence with the item as its first and the sequence as its rest.
Takes a sequence as its argument, and returns the length of a sequence. It must realize the entire sequence to do so, so don’t call it on an infinite sequence as it will never return.
Takes a sequence and an index (starting from zero), and
returns the item at that location in the sequence. For example,
(nth x 5)
returns the sixth item
in x
.
Takes a number n
and a sequence, and returns a
new sequence consisting of the first n
items in the
input sequence. It is fully lazy; it will only realize the input
sequence when and if the returned sequence is realized.
Takes a number n
and a sequence, and returns a
new sequence of all the items in the input sequence
except for the first n
items. It
is also lazy.
Takes any number of sequences as arguments, and returns a single sequence containing all the items in each of the input sequences. It is lazy.
Takes a single sequence and returns a sequence of its items in reverse order. It cannot be lazy, since it needs to realize the entire input sequence in order to get its last item.
Finally, there is a special sequence generator that’s actually a macro, not a function:
This is the basic, low-level way to create lazy sequences.
It is a macro that takes any number of forms as its body. The
body, when evaluated, should return a sequence or
nil
. However, the body isn’t evaluated right
away. Instead, it’s stored in a closure, and lazy-seq
returns an unrealized lazy sequence. Only if
the sequence is eventually realized will the body be invoked,
returning the result of the body as the realized sequence.
By making recursive function calls in the body of
lazy-seq
, it’s possible to construct lazy sequences of
arbitrary or infinite length.
Normally, it’s not necessary to use lazy-seq
directly; usually, one of the provided sequence-generating functions is
more suitable. iterate
, in particular, is very flexible.
However, when you need it, lazy-seq
can be a powerful tool
to create arbitrary lazy sequences.