D3 is a JavaScript library for manipulating the DOM tree in order to represent information visually. This makes it different from other graphics or plotting libraries: a conventional graphics library operates on a “canvas” and places lines, circles, and other graphical objects directly onto this canvas. But because D3 uses the DOM tree to display information, it must provide capabilities to operate on the DOM tree—in addition to the customary management of shapes, coordinates, colors, and so on. Specifically, it must allow the user to:
Specify where in the DOM tree a change should take place and which elements will be affected; the user must be able to select a node (or set of nodes).
Associate individual records from the data set with specific elements or nodes in the DOM tree; enable the user to bind or join a data set to a selection of nodes.
Change the size, position, and appearance of DOM elements according to the values of the data associated with them.
The first and last items in this list are common activities in contemporary web development, and users familiar with the jQuery library, for instance, should feel quite at home. (But if you are not familiar with jQuery and the particular style of programming popularized by it, D3 can seem very peculiar indeed!)
The second item, however, is different. The idea to establish a tight association between individual data records and individual DOM elements, so that the appearance of the DOM elements can change according to the data bound to them, seems fairly unique to D3. This idea, and the particular way it is implemented, is central to D3.
The operations to select nodes, bind data to them, and update their appearance are bundled in the Selection abstraction. A thorough understanding of its concepts and capabilities is essential to be productive with D3. (The Selection abstraction also contains functionality to associate event handlers with DOM elements. We will treat this topic in Chapter 4 in the context of animating graphics.)
Selections are ordered collections of DOM elements, wrapped in a Selection abstraction. The Selection abstraction provides an API to query and modify the elements it contains. The Selection API is declarative and supports method chaining, making it possible to manipulate the DOM tree without explicit looping over individual nodes.
You typically obtain an initial selection instance by using one of the selection functions that operate on the global d3 object (see Table 3-1). You can then create subsets of this initial selection using member functions that operate on a Selection instance (see Table 3-2).
Selection methods accept a CSS selector string (see “CSS Selectors”) and return a collection of matching DOM elements in document order. The select(selector) functions return only the first matching element, while the selectAll(selector) methods return all matching elements. Both functions return an empty collection if no matching elements are found or if the selector is null or undefined. (Using a CSS selector string as the selection criterion is the common case; we will discuss some additional options later in this chapter.)
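Function | Description |
---|---|
d3.select(selector) | Returns a selection containing the first element in the document that matches the selector. |
d3.selectAll(selector) | Returns a selection containing all elements in the document that match the selector, in document order. |
d3.create(name) | Given the name of an element, creates a detached element of that name and returns a selection containing it. |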
As already indicated, selection methods nest when chained: any subsequent selection action will only act on the results of the previous one. (Technically, the previous action returns a new selection object, which becomes the target for the next action, and so on.) For example, the following code (taken straight from the D3 Reference Documentation) will select the first bold element in every paragraph in the document:

    bs = d3.selectAll("p").select("b");

whereas the following snippet will select all <circle> elements inside the element with the ID id123:

    cs = d3.select("#id123").selectAll("circle");

You can obtain an initial selection by querying the global d3 object, which amounts to searching the entire document. Of course, Selection objects can be assigned to variables (as in the snippets just shown), passed to functions (as in Example 2-6), and so on.
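To make the nesting behavior concrete, here is a minimal sketch in plain JavaScript that mimics it on a toy tree of objects instead of the real DOM. The helpers findAll, selectAllIn, and selectIn are hypothetical stand-ins, not part of D3; they only illustrate that a chained select() yields at most one match per element of the preceding selection.

```javascript
// Toy document: each node has a tag name and a list of children.
// (Hypothetical stand-in for the DOM; this is not how D3 is implemented.)
const doc = {
  tag: "body", children: [
    { tag: "p", children: [{ tag: "b", id: "b1", children: [] },
                           { tag: "b", id: "b2", children: [] }] },
    { tag: "p", children: [{ tag: "i", id: "i1", children: [] }] },
    { tag: "p", children: [{ tag: "b", id: "b3", children: [] }] }
  ]
};

// All descendants of `node` with the given tag, in document order.
function findAll(node, tag) {
  const out = [];
  for (const child of node.children) {
    if (child.tag === tag) out.push(child);
    out.push(...findAll(child, tag));
  }
  return out;
}

// selectAll analogue: every match below each node of the current selection.
function selectAllIn(nodes, tag) {
  return nodes.flatMap(n => findAll(n, tag));
}

// select analogue: only the first match below each node of the current
// selection. (D3 itself keeps an empty slot where nothing matches;
// this sketch simply drops it.)
function selectIn(nodes, tag) {
  return nodes.map(n => findAll(n, tag)[0]).filter(n => n !== undefined);
}

// Analogue of d3.selectAll("p").select("b")
const firstBold = selectIn(selectAllIn([doc], "p"), "b");
console.log(firstBold.map(n => n.id));   // b1 and b3
```

The second paragraph contributes nothing because it contains no bold element, and b2 is skipped because it is not the first match within its paragraph.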
Function | Description |
---|---|
|
|
|
|
|
Similar to |
Three types of selectors can be used with the select() and selectAll() functions:
A CSS selector string (see “CSS Selectors”). This is the common case.
The selection functions on the global d3 object accept either a node (for select()) or a collection of nodes (for selectAll()).
When selection functions are invoked on a Selection object, they can accept an accessor function as the selector. For select(), this function must return a DOM Element instance or null if there is no match; for selectAll(), it must return an array of elements, possibly empty if there are no matches; for filter(), it must return a Boolean value indicating whether the current element should be retained.
Using a node as the selector may seem strange: why call select() if I already have a node in hand? If this is the case, then you call select() not to select a node, but as a convenient way to wrap the node in a Selection object, to make the Selection API available for it. This is commonly done inside of event handlers, when you want to bring the Selection API to work on the current node that has received the event. (You already saw an example of this in Example 2-8; further examples can be found in Figures 4-2 and 4-4.)
Most D3 programming begins with making a selection of elements from a document; this selection will then be modified. Selections are therefore the base material when working with D3. Two questions present themselves at this point:
What actually is a selection?
What can I do with a selection?
Technically, a Selection is a JavaScript wrapper around an ordered collection of DOM elements together with a set of methods to manipulate this collection. The API is declarative and supports method chaining, hence it is generally unnecessary to handle the elements of the collection explicitly.
Conceptually, a selection is a handle on (all or part of) the DOM tree, together with a set of operations that constitute the three activities listed earlier: selecting elements, binding data, and modifying appearance and behavior.
It is probably best to think of selections as opaque abstractions and operate on them only through the provided API. Besides functions to create selections based on specified criteria, most operations in the Selection API can be grouped into two sets: those that operate on the elements of a selection (for example, by changing the attributes of an element), and those that operate on the entire selection itself (for example, by adding or removing elements). Last, but not least, there is a small set of operations to manage the association between a data set and the elements of a selection. We will turn to that topic next.
The data() method accepts an array of arbitrary values or objects and attempts to establish a one-to-one correspondence between the entries of this array and the elements in the current selection. (Remember that data() must be called through a Selection object, which defines the “current selection.”)
Unless a key has been provided (described later), the data() function will attempt to match up data entries and selection elements by their position in their respective containers: the first data point with the first selected DOM element, the second data point with the second DOM element, and so on (see Figure 3-1). There is no requirement that the number of elements in both collections must match; in fact, they commonly do not. If the numbers do not agree, there will either be a surplus of data points, or a surplus of DOM elements. (We will come back to this point.)
If a data point has been associated with a DOM element, the data point itself is stored in the __data__ property of the selection element. The relationship between data point and selection element is therefore persistent and will continue until explicitly overwritten (by calling data() again with a different data set as argument). Because the data point is stored inside the DOM element, the data is available to the methods that modify the attributes and appearance of the DOM element.
The data() method returns a new Selection object containing those elements that were successfully bound to entries in the data set. The data() method also populates the so-called “enter” and “exit” selections, which contain the unmatched (surplus) data points or DOM elements, respectively (see Table 3-3).
If the number of data points and DOM elements is not equal, there will be a surplus of unmatched items, either of data points or DOM elements, but not both. Because items are matched starting at the beginning, the unmatched items will always be the trailing items in their respective collections. If items are joined on a key (see the next section), matching can fail in additional ways.
After a call to data(data), collections of any unmatched items are accessible through the enter() and exit() methods (see Figure 3-2).1 Without a preceding call to data(), these methods return empty collections. The exit() method actually returns a Selection of DOM elements, but the enter() method only returns a collection of placeholder elements (by construction, there can be no actual nodes for surplus data points).
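The positional matching described above can be sketched in a few lines of plain JavaScript. This is only a simplified model under stated assumptions: plain objects stand in for DOM elements, and the function name joinByIndex and the shape of its return value are made up for illustration; D3's real implementation differs in detail.

```javascript
// Join data to nodes by position. A sketch, not D3's actual implementation.
function joinByIndex(nodes, data) {
  const update = [], enter = [], exit = [];
  const n = Math.max(nodes.length, data.length);
  for (let i = 0; i < n; i++) {
    if (i < nodes.length && i < data.length) {
      nodes[i].__data__ = data[i];        // bind: the datum is stored on the node
      update.push(nodes[i]);
    } else if (i < data.length) {
      enter.push({ __data__: data[i] });  // placeholder: no node exists yet
    } else {
      exit.push(nodes[i]);                // surplus node without a datum
    }
  }
  return { update, enter, exit };
}

// Three existing nodes, five data points: two data points are left over.
const nodes = [{ name: "c0" }, { name: "c1" }, { name: "c2" }];
const result = joinByIndex(nodes, [10, 20, 30, 40, 50]);
console.log(result.update.length, result.enter.length, result.exit.length);
```

Because matching starts at the front, only one of the two surplus collections can be nonempty at a time, and its members are always the trailing items.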
The collections of surplus items can be used to create or remove items as necessary to make the graph match up with the data set. Given a set of surplus data items (as returned by enter()), you can create the needed DOM elements:

    d3.select("svg").selectAll("circle")
        .data(data).enter()
        .append("circle").attr("fill", "red");

Similarly, given a selection of surplus DOM elements (as returned by exit()), you can remove these items as follows:

    d3.select("svg").selectAll("circle")
        .data(data).exit()
        .remove();

The data() method itself returns the Selection of DOM elements that were successfully bound to data points; it can be used to update the appearance of these DOM elements. All three activities are used together in the General Update Pattern that will be discussed later in this chapter.
Function | Description |
---|---|
|
|
|
|
|
Sometimes joining data and nodes by simply aligning them in order is not enough. In particular, when updating an existing set of nodes with new data, it may matter that the correct DOM elements receive the new values.
Figure 3-3 shows an example: when the user clicks into the graph, the positions of the circles are updated with new data. Because the graph uses a smooth, animated transition to move the circles from their old positions to new ones, it is important that each circle receives the new data associated with it. Example 3-1 shows the commands to create Figure 3-3.

    function makeKeys() {
        var ds1 = [["Mary", 1], ["Jane", 4], ["Anne", 2]];
        var ds2 = [["Anne", 5], ["Jane", 3]];

        var scX = d3.scaleLinear().domain([0, 6]).range([50, 300]),
            scY = d3.scaleLinear().domain([0, 3]).range([50, 150]);

        var j = -1, k = -1;

        var svg = d3.select("#key");

        svg.selectAll("text")
            .data(ds1).enter().append("text")
            .attr("x", 20).attr("y", d => scY(++j)).text(d => d[0]);

        svg.selectAll("circle").data(ds1).enter().append("circle")
            .attr("r", 5).attr("fill", "red")
            .attr("cx", d => scX(d[1]))
            .attr("cy", d => scY(++k) - 5);

        svg.on("click", function() {
            var cs = svg.selectAll("circle").data(ds2, d => d[0]);

            cs.transition().duration(1000)
                .attr("cx", d => scX(d[1]));
            cs.exit().attr("fill", "blue");
        });
    }

The original data set.
The new data set. Note that it is incomplete: only two out of the three items will be updated with new data. Furthermore, the order of the items that are present is different than in the original data set.
Integers to track the vertical position of the text label and circle.
The active <svg> element as Selection, assigned to a variable for future reference.
Create the text labels…
… and the circles at their initial positions.
Inside the click event handler, the new data set is bound to the selection of circle elements. Notice the second argument to the data() function: this function defines the key on which data items will be joined.
A smooth, animated transition from the old positions to the new ones.
The exit() selection is now populated with Mary’s node, since in the last call to data(), no data point was bound to this node. We give this circle a different color to make it stand out.
This example shows how joining on a key is accomplished: you simply supply a second argument to the data(data, key) function. This additional argument must be an accessor function, which returns the desired key value as a string for each node or data point. This function will be evaluated for all items in the data set and for the data points bound to all nodes in the current selection, and items with matching keys will be bound to each other. Nonmatching items, either in the data set or the selection, populate the enter() and exit() selections as usual. If there are duplicate keys, either in the data set or the selection, only the first occurrence of the key (in collection order) is bound. Duplicates in the data set are placed in the enter() selection, duplicates in the current selection end up in the exit() selection. (See Figure 3-4.)
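The matching rules just described, including the treatment of duplicates, can be modeled in plain JavaScript. The following is a simplified sketch rather than D3's code: plain objects with a __data__ property stand in for DOM nodes, and joinByKey is a hypothetical helper introduced only for this illustration.

```javascript
// Join data to nodes on a key. A simplified sketch, not D3's implementation.
function joinByKey(nodes, data, key) {
  const update = [], enter = [], exit = [];
  const nodeByKey = new Map();

  // Index the existing nodes by the key of the datum bound to them.
  for (const node of nodes) {
    const k = String(key(node.__data__));
    if (nodeByKey.has(k)) exit.push(node);   // duplicate key in the selection
    else nodeByKey.set(k, node);
  }

  // Match each data point against the indexed nodes.
  for (const d of data) {
    const k = String(key(d));
    const node = nodeByKey.get(k);
    if (node) {
      node.__data__ = d;                     // rebind with the new datum
      update.push(node);
      nodeByKey.delete(k);                   // later duplicates fall into enter
    } else {
      enter.push({ __data__: d });           // placeholder for a missing node
    }
  }

  // Whatever was never claimed by a data point is surplus.
  for (const node of nodeByKey.values()) exit.push(node);
  return { update, enter, exit };
}

// The data sets from Example 3-1, with plain objects standing in for circles.
const circles = [["Mary", 1], ["Jane", 4], ["Anne", 2]].map(d => ({ __data__: d }));
const joined = joinByKey(circles, [["Anne", 5], ["Jane", 3]], d => d[0]);
console.log(joined.update.map(n => n.__data__[0]));  // Anne, Jane
console.log(joined.exit.map(n => n.__data__[0]));    // Mary
```

Note that Anne's node receives Anne's new value even though the order of the records changed, which is exactly what a positional join would get wrong.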
A particular situation arises when an existing graph must be updated repeatedly with new data—for example, because data is only becoming available over time or because the graph must respond to user input. In this case, it is not enough to simply create additional graph elements corresponding to new inputs; the new elements must also be merged back with the existing elements to get the whole graph ready for the next iteration. The complete sequence of steps is therefore:
Bind new data to an existing selection of elements.
Remove any surplus items that do not have matching data associated anymore (the exit() selection).
Create and configure all items associated with data points that did not exist before (the enter() selection).
Merge the remaining items from the original selection with the newly created items from the enter() selection.
Update all items in the combined selection based on the current values of the bound data set.
The example in Example 3-2 defines two data sets; clicking into the graph area replaces the current data set with the other one and updates the graph accordingly.

    function makeUpdate() {
        var ds1 = [[2, 3, "green"], [1, 2, "red"], [2, 1, "blue"], [3, 2, "yellow"]];
        var ds2 = [[1, 1, "red"], [3, 3, "black"], [1, 3, "lime"], [3, 1, "blue"]];

        var scX = d3.scaleLinear().domain([1, 3]).range([100, 200]),
            scY = d3.scaleLinear().domain([1, 3]).range([50, 100]);

        var svg = d3.select("#update");

        svg.on("click", function() {
            [ds1, ds2] = [ds2, ds1];

            var cs = svg.selectAll("circle").data(ds1, d => d[2]);

            cs.exit().remove();

            cs = cs.enter().append("circle")
                .attr("r", 5).attr("fill", d => d[2])
                .merge(cs);

            cs.attr("cx", d => scX(d[0])).attr("cy", d => scY(d[1]));
        });

        svg.dispatch("click");
    }

The two data sets. Each entry consists of the x and y coordinates, followed by the color. We will also use this color string as key when binding the data.
Scales to map the data values to screen coordinates.
Obtain a convenient handle on the <svg> element.
Register a click event handler for this <svg> element. All relevant action will take place inside this event handler.
In response to a user click, swap the data sets, replacing the current one with its alternate.
Bind the new data set to the (existing) <circle> elements in the graph, using the color name as key.
Remove those elements that are no longer bound to data (the exit() selection).
Create new elements for those data points that are new in this data set (the enter() selection).
Merge the existing elements retained from the earlier selection into the selection of newly created elements, and treat the combination as the “current” selection going forward.
Update all the elements in the combined selection using the bound data values.
This statement generates a synthetic click event. This triggers the event handler and therefore populates the graph when the page is first loaded. (We will revisit events in Chapter 4.)
The purpose of the merge() function may not be immediately apparent. But keep in mind that the enter and update selections (as returned by enter() and data(), respectively) are separate selections. If we want to operate on their elements without duplicating code, then we must combine them first. Moreover, a plain concatenation of the enter and update selections would not be appropriate; the merge() operation is designed to operate on the particular data representation of these two selections.2
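Why merging differs from concatenation can be illustrated with two sparse arrays. The sketch below is a plain JavaScript model built on the rough description of the internal representation given in this chapter; the function mergeGroups and the use of color names in place of DOM nodes are assumptions made for illustration only.

```javascript
// Merge two sparse arrays position by position (a sketch).
// Where both have an entry in the same position, the first array wins.
function mergeGroups(first, second) {
  const n = Math.max(first.length, second.length);
  const merged = new Array(n);
  for (let i = 0; i < n; i++) {
    const node = first[i] != null ? first[i] : second[i];
    if (node != null) merged[i] = node;
  }
  return merged;
}

// New data set: red, black, lime, blue. Only red and blue existed before,
// so the update array is populated at positions 0 and 3 ...
const updateGroup = [];
updateGroup[0] = "red";
updateGroup[3] = "blue";

// ... and the enter array at the complementary positions 1 and 2.
const enterGroup = [];
enterGroup[1] = "black";
enterGroup[2] = "lime";

console.log(mergeGroups(enterGroup, updateGroup));  // red, black, lime, blue
```

A concatenation would instead produce eight slots, half of them empty, and would lose the correspondence between array position and data point.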
The Selection abstraction contains many operations to either operate on the elements of a selection individually, or to manipulate the overall selection itself.
The Selection abstraction provides several methods to manipulate aspects of the individual DOM elements contained in the selection (see Table 3-4).
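Function | Description |
---|---|
attr(name, value) | Sets the attribute named name to the supplied value. |
style(name, value, priority) | Sets the named style property to the supplied value. An override priority may be specified by supplying the string "important" as the third argument. |
property(name, value) | Sets the named property to the supplied value. (This is intended for HTML elements that have properties that are not accessible as attributes, such as the checkbox checked property.) |
classed(names, value) | The names argument is a string of whitespace-separated class names; the classes are added to each element if value is truthy and removed otherwise. |
text(value) | Sets the “text content” to the supplied value: use this to set the actual text of an element. |
html(value) | Sets the “inner HTML” to the supplied value: this is the HTML with its markup inside of, but not including, the current element. (This method replicates the corresponding innerHTML property on the DOM Element interface.) |
datum(value) | Sets the data bound to this element to the supplied value. |
each(function) | Invokes the supplied accessor function for each element in the selection. |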
All function arguments are optional (as always in JavaScript).
The functions in Table 3-4 can be used to set, get, or clear an attribute (or property,5 or style, …), depending on the value argument:
If no value is supplied, the function returns the current value for the first nonnull element in the selection.
If a null value is supplied, the attribute (or property) is removed from the element.
If a constant value is supplied and not null, the attribute (or property) is set to the supplied value.
If a function is supplied, it is evaluated for every element in the selection and its return value is used to modify the current element.
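The four cases can be modeled with a toy accessor that works on plain objects. This is a hedged sketch, not D3's code: the function attr below and the { attrs, __data__ } shape of the fake elements are assumptions chosen to mirror the rules in the list above.

```javascript
// A toy attr()-style accessor operating on plain objects.
// Each fake element is { attrs: {...}, __data__: ... }. Not D3's code.
function attr(elements, name, value) {
  if (arguments.length < 3) {
    // Getter: current value for the first nonnull element.
    const first = elements.find(el => el != null);
    return first ? first.attrs[name] : undefined;
  }
  elements.forEach((el, i) => {
    if (el == null) return;
    // A function is evaluated per element with (datum, index, elements).
    const v = typeof value === "function"
      ? value.call(el, el.__data__, i, elements)
      : value;
    if (v == null) delete el.attrs[name];   // null clears the attribute
    else el.attrs[name] = v;                // anything else sets it
  });
  return elements;                          // returned to allow further calls
}

const els = [{ attrs: {}, __data__: 3 }, { attrs: {}, __data__: 7 }];
attr(els, "fill", "red");      // constant: same value on every element
attr(els, "r", d => 2 * d);    // function: value computed from the bound datum
attr(els, "fill", null);       // null: attribute removed again
console.log(attr(els, "r"));   // value on the first element
```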
With the exception of the last two entries, the functions in Table 3-4 are thin wrappers around the equivalent functionality in the DOM API; you may want to compare the appropriate reference documentation for the precise semantics of some of the terms. (See, for example, the MDN Node Reference and the MDN Element Reference.)
The Selection abstraction provides several functions to operate on the entire selection, for example, to add, remove, or reorder elements (see Table 3-5).
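Function | Description |
---|---|
append(type) | Appends a new element of the given type as the last child of each element in the selection and returns a selection containing the new elements. |
insert(type, before) | Like append(), but inserts the new element before the first child element matching the before selector. |
merge(selection) | Merges the current selection with the supplied selection and returns the combined selection. The two selections are expected to have undefined elements in complementary positions; if both selections have a nonnull entry in the same position, the current selection prevails. This function is not intended to concatenate arbitrary selections; instead, its primary purpose is to merge the enter and update selections after a data join. |
remove() | Removes the selected elements from the document and returns a selection containing the removed elements. |
sort(comparator) | Takes a function of two arguments (the data values bound to two elements) and reorders the elements of the selection, and their positions in the document, according to its return value. |
call(function, arguments...) | Invokes the supplied function exactly once, passing in the current selection and any supplied arguments. Returns the current selection and so enables method chaining (the primary application for this method). |
nodes() | Returns the nonnull nodes in this selection as an array of DOM Node instances. |
node() | Returns the first nonnull node in this selection as a DOM Node instance. |
size() | Returns the number of nonnull elements in this selection. |
empty() | Returns true if there are no nonnull elements in this selection. |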
The node() and nodes() functions are a way to obtain a reference to the actual DOM Node instances in the selection. This is occasionally useful; we will see examples in Chapter 4.6
We have seen the append() function in action several times before. The next example will demonstrate how to use insert() and sort(). Initially, an unordered list is populated from a static data set. If you mouse over the list, two more items are inserted into it. If you then mouse click into the list, the items are sorted, in descending order, according to their text content (see Example 3-3).

    function makeSort() {
        var data = ["Jane", "Anne", "Mary"];

        var ul = d3.select("#sort");
        ul.selectAll("li").data(data).enter().append("li")
            .text(d => d);

        // insert on mouse enter
        var once;
        ul.on("mouseenter", function() {
            if (once) { return; }
            once = 1;
            ul.insert("li", ":nth-child(2)")
                .datum("Lucy").text("Lucy");
            ul.insert("li", ":first-child")
                .datum("Lisa").text("Lisa");
        });

        // sort on click
        ul.on("click", function() {
            ul.selectAll("li").sort((a, b) => (a < b ? 1 : b < a ? -1 : 0));
        });
    }

An unordered list is populated from the data set.
The variable once makes sure that the new items are only added to the list once.
Register the first of two event handlers for the list: if the mouse pointer enters the area occupied by the list, the callback is invoked.
The position where the new list items are to be inserted is specified through pseudo-classes. The :nth-child() pseudo-class starts counting at 1 (so that :nth-child(1) equals :first-child). Observe that we need to both set the data bound to each element (using datum()) and the visible text (using text()) separately when using insert().
Another element is added in front of the entire list, pushing the previously added element from the second to the third position. (Positions in pseudo-classes are evaluated at the time the pseudo-class is applied.)
A click event handler is registered as the second event handler on the list.
Upon a mouse click, the list elements are sorted, in descending order, based on the value of the data bound to them.
Remember that insert() does not bind data; an explicit call to datum() is required to add data to elements added using insert().
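To trace what Example 3-3 does to the list, the following plain JavaScript sketch models the list items as an array of their bound data values. The array operations are only an analogy for the two insert() calls and the sort() call; no DOM or D3 is involved here.

```javascript
// Model the list from Example 3-3 as an array of bound data values.
const items = ["Jane", "Anne", "Mary"];

// insert("li", ":nth-child(2)"): the new item goes before the current second item.
items.splice(1, 0, "Lucy");
// insert("li", ":first-child"): the new item goes before the current first item.
items.splice(0, 0, "Lisa");
console.log(items.join(", "));   // Lisa, Jane, Lucy, Anne, Mary

// The comparator from the click handler: larger values sort first.
const descending = (a, b) => (a < b ? 1 : b < a ? -1 : 0);
const sorted = items.slice().sort(descending);
console.log(sorted.join(", "));  // Mary, Lucy, Lisa, Jane, Anne
```

Since the comparator sees the bound data rather than the visible text, an element inserted without a datum() call would have nothing meaningful to compare.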
As a rule, methods that operate on individual elements of a selection (Table 3-4) return the current selection, whereas methods that operate on the entire selection (Table 3-5) return a new selection—but there are some exceptions. For reference, here are the functions that return new selections:
Selections maintain one additional piece of information that is not exposed explicitly in the API, namely which members of the selection share a common parent in the previous selection (not necessarily in the document). This is best understood through an example. Consider the following HTML table:

    <table>
      <tr>
        <td>A</td><td>B</td>
      </tr>
      <tr>
        <td>C</td><td>D</td>
      </tr>
    </table>

and select all cells within all rows (for example, to color them):

    d3.selectAll("tr").selectAll("td").attr(..., (d, i, ns) => { ... });

The first argument d to the accessor function is obviously the data bound to each cell. But what should the index i refer to? Because the cells were selected as elements of their respective rows, the index i holds the position of each cell within its row; in other words, its column. This makes it exceedingly easy to shade the cells by columns, for example. In the same spirit, the third argument ns contains the elements (nodes) for the current row (not for the entire table).
Let me repeat that this information about shared parents refers to the originating selection, not the document. For example, instead of creating a selection of rows first, you could select the cells directly from the document:

    d3.selectAll("td").attr(..., (d, i, ns) => { ... });

Now, the index i will be the running number of the cells (from 0 to 3, in this case), and ns is the collection of all cells. Selections maintain only a single level of ancestry. If each cell contained an unordered list, then in the following snippet:

    d3.selectAll("tr").selectAll("td")
        .selectAll("li").attr(..., (d, i) => { ... });

the index i would be the position of each list item within its list. The column information would be lost at that point. (It would, of course, still be available in the second selection in this chain, the one containing the table cells.)
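The effect of grouping on the index can be reproduced with nested arrays. In this minimal sketch, a selection is modeled as an array of groups, and the helper eachNode is a hypothetical stand-in for the way accessor functions are invoked; the real internal representation may differ.

```javascript
// A selection modeled as an array of groups; each group holds the nodes
// that share a parent in the preceding selection. (Sketch only.)
function eachNode(groups, callback) {
  for (const group of groups) {
    group.forEach((node, i) => callback(node, i, group));
  }
}

// Analogue of d3.selectAll("tr").selectAll("td"): one group per row.
const byRow = [["A", "B"], ["C", "D"]];
// Analogue of d3.selectAll("td"): a single group holding all cells.
const flat = [["A", "B", "C", "D"]];

const nested = [], running = [];
eachNode(byRow, (cell, i) => nested.push(cell + i));
eachNode(flat, (cell, i) => running.push(cell + i));
console.log(nested.join(" "));   // A0 B1 C0 D1
console.log(running.join(" "));  // A0 B1 C2 D3
```

In the grouped case the index restarts for every row and therefore identifies the column; in the flat case it simply counts the cells.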
All of this is extremely straightforward and intuitive—the less you think about it, the easier it is. By and large, D3 simply does what you expect it to do. None of this functionality is exposed explicitly (through individual functions), but you may find in code and documentation references to “groups”—the name for the internal representation of the common parent information. (All children of a common parent form a group.)
More detail on this topic can be found in two dedicated blog posts by Mike Bostock, which are recommended reading: https://bost.ocks.org/mike/nest/ and https://bost.ocks.org/mike/selection/.
1 The names of these functions can be understood if you think of binding as going through three phases. During the “entry” phase, DOM elements are created for any unmatched data items. During the “update” phase, DOM elements are styled based on the data bound to them. During the “exit” phase, surplus DOM elements (that have no data bound to them) are removed from the graph. If you get confused, just remember that enter() returns the surplus data items and exit() returns the surplus DOM elements.
2 It may aid comprehension to provide a rough sketch of what is happening under the covers. The exit selection contains an array having the same number of elements as the old data set, but with only those entries populated for which there is no corresponding data point in the new data set. Both enter and update selections contain arrays having an entry for each data point in the new data set, but with only those entries populated that are newly added or retained from previously, respectively. The merge() function combines these complementary arrays into one. If both arrays have a nonnull entry in the same position, one of them will be clobbered. All of this is implemented using the JavaScript Array type, which allows for “holes” of unassigned, undefined values at arbitrary index positions. It is probably also best to consider all of this information as implementation detail and subject to change. Just remember the role of the merge() function as part of the General Update Pattern.
3 Strictly speaking, the nodes argument contains the current group, and i is the index within the group. We will discuss groups toward the end of this chapter.
4 If you want to access this in an accessor, you must use the function keyword; you cannot use an arrow function.
5 The distinction between attributes and properties is subtle. Basically, an HTML element has attributes, a JavaScript Node object has properties. Most attributes map to properties and vice versa, but the names don’t always agree exactly, and the current value associated with an attribute or property may depend on the dynamic state of the page and therefore may differ between the two representations. The SVG specification distinguishes between properties as attributes that can be modified through CSS, in contrast to attributes in general, which cannot.
6 Be warned that additional concerns arise when accessing the DOM API directly, rather than through D3. In particular, XML namespaces often need to be taken into account explicitly; see Chapter 6 for an example.