Data joins

Given an array of data and a D3 selection, we can attach or ‘join’ each array element to the corresponding element of the selection.

This creates a close relationship between your data and graphical elements which makes data-driven modification of the elements straightforward.

For example if we have some SVG circles:

<circle r="40" />
<circle r="40" cx="120" />
<circle r="40" cx="240" />
<circle r="40" cx="360" />
<circle r="40" cx="480" />

and some data:

var scores = [
  {
    "name": "Andy",
    "score": 25
  },
  {
    "name": "Beth",
    "score": 39
  },
  {
    "name": "Craig",
    "score": 42
  },
  {
    "name": "Diane",
    "score": 35
  },
  {
    "name": "Evelyn",
    "score": 48
  }
]

we can select the circles and then join the array to them:

d3.selectAll('circle')
  .data(scores);

We can now manipulate the circles according to the joined data:

d3.selectAll('circle')
  .attr('r', function(d) {
    return d.score;
  });

The above code sets the radius of each circle to the corresponding person’s score.

Making a data join

Given an array myData and a selection s, a data join is created using the .data method:

var myData = [ 10, 40, 20, 30 ];

var s = d3.selectAll('circle');

s.data(myData);

The array can contain values of any type, e.g. objects:

var cities = [
  { name: 'London', population: 8674000},
  { name: 'New York', population: 8406000},
  { name: 'Sydney', population: 4293000}
];

var s = d3.selectAll('circle');

s.data(cities);

Although a couple of things occur when .data is called (see Under the Hood and Enter/Exit) you probably won’t notice much change after joining your data.

The real magic happens when you want to modify the elements in your selection according to your data.

Data-driven modification of elements

Once we’ve joined data to a selection we can modify elements by passing a function into the likes of .style and .attr (which we covered in Selections):

d3.selectAll('circle')
  .attr('r', function(d) {
    return d;
  });

For each element in the selection D3 will call this function, passing in the element’s joined data as the first argument d. The function’s return value is used to set the style or attribute value.

For example, given some circles:

<circle />
<circle />
<circle />
<circle />
<circle />

and some data:

var myData = [ 10, 40, 20, 30, 50 ];

let’s perform the data join:

var s = d3.selectAll('circle');

// Do the join
s.data(myData);

Now let’s update the radius of each circle in the selection to be equal to the corresponding data values:

s.attr('r', function(d) {
  return d;
});

The function that’s passed into .attr is called 5 times (once for each element in the selection). The first time round d will be 10 and so the circle’s radius will be set to 10. The second time round it’ll be 40 and so on.

In the above example the function simply returns d, so the first circle’s radius is set to 10, the second’s to 40 and so on.
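The calling pattern described above can be sketched in plain JavaScript. This is only a simplified model, not D3's implementation: plain objects stand in for the circle elements so that it runs without a browser, and setAttrFromData is a hypothetical helper that mimics what s.attr does when given a function.

```javascript
// Plain objects stand in for the five <circle> elements (an assumption made
// so the sketch runs without a browser or D3).
var circles = [{}, {}, {}, {}, {}];
var myData = [10, 40, 20, 30, 50];

// Simplified stand-in for the join: pair elements and data by index.
circles.forEach(function(circle, i) {
  circle.__data__ = myData[i];
});

// Hypothetical helper mimicking s.attr(name, fn): call fn once per element,
// passing the joined datum and the index, and store the return value.
function setAttrFromData(elements, name, fn) {
  elements.forEach(function(element, i) {
    element[name] = fn(element.__data__, i);
  });
}

var seen = [];
setAttrFromData(circles, 'r', function(d) {
  seen.push(d); // record each value of d as the function is called
  return d;
});

// Logs the values 10, 40, 20, 30, 50 in that order
console.log(seen);
```

The function runs once per element, and each call receives that element's own datum, which is why the first circle ends up with a radius of 10 and the second with 40.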

We can return anything we like from the function, so long as it’s a valid value for the style, attribute etc. that we’re modifying. (It’s likely that some expression involving d will be returned.)

For example we can set the radius to twice d using:

s.attr('r', function(d) {
  return 2 * d;
});

Now let’s set a class on each element if the value is greater than or equal to 40:

s.classed('high', function(d) {
  return d >= 40; // returns true or false
});

and finally we’ll position the circles horizontally using the i argument (see Selections):

s.attr('cx', function(d, i) {
  return i * 120;
});

Putting this all together we get:

var myData = [ 10, 40, 20, 30, 50 ];

var s = d3.selectAll('circle');

// Do the data join
s.data(myData);

// Modify the selected elements
s.attr('r', function(d) {
    return d;
  })
  .classed('high', function(d) {
    return d >= 40;
  })
  .attr('cx', function(d, i) {
    return i * 120;
  });

Arrays of objects

If we have an array of objects we can join it in the usual manner:

var cities = [
  { name: 'London', population: 8674000},
  { name: 'New York', population: 8406000},
  { name: 'Sydney', population: 4293000},
  { name: 'Paris', population: 2244000},
  { name: 'Beijing', population: 11510000}
];

var s = d3.selectAll('circle');

s.data(cities);

Now when we modify elements based on the joined data, d will represent the joined object. Thus for the first element in the selection, d will be { name: 'London', population: 8674000}.

Let’s set each circle’s radius in proportion to the city’s population:

s.attr('r', function(d) {
    var scaleFactor = 0.000005;
    return d.population * scaleFactor;
  })
  .attr('cx', function(d, i) {
    return i * 120;
  });
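To see what radii this scale factor produces, the arithmetic can be run on its own, without D3 or a browser. This is just a check of the numbers, using the same cities array and scaleFactor as above; the radii variable is introduced here for illustration.

```javascript
var cities = [
  { name: 'London', population: 8674000},
  { name: 'New York', population: 8406000},
  { name: 'Sydney', population: 4293000},
  { name: 'Paris', population: 2244000},
  { name: 'Beijing', population: 11510000}
];

var scaleFactor = 0.000005;

// Same calculation as the function passed into .attr('r', ...)
var radii = cities.map(function(d) {
  return d.population * scaleFactor;
});

// Log each radius rounded to the nearest whole number:
// London: 43, New York: 42, Sydney: 21, Paris: 11, Beijing: 58
cities.forEach(function(d, i) {
  console.log(d.name + ': ' + Math.round(radii[i]));
});
```

Every radius comes out below 60, which is half of the 120 spacing used for cx, so neighbouring circles do not overlap.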

Of course, we’re not restricted to modifying circle elements. Supposing we had some rect and text elements, we can build a simple bar chart using what we’ve learnt:

var cities = [
  { name: 'London', population: 8674000},
  { name: 'New York', population: 8406000},
  { name: 'Sydney', population: 4293000},
  { name: 'Paris', population: 2244000},
  { name: 'Beijing', population: 11510000}
];

// Join cities to rect elements and modify height, width and position
d3.selectAll('rect')
  .data(cities)
  .attr('height', 19)
  .attr('width', function(d) {
    var scaleFactor = 0.00004;
    return d.population * scaleFactor;
  })
  .attr('y', function(d, i) {
    return i * 20;
  });

// Join cities to text elements and modify content and position
d3.selectAll('text')
  .data(cities)
  .attr('y', function(d, i) {
    return i * 20 + 13;
  })
  .attr('x', -4)
  .text(function(d) {
    return d.name;
  });

Under the hood

When D3 performs a data join it adds a property named __data__ to each DOM element in the selection and assigns the joined data to it.

We can inspect this in Google Chrome by right clicking on an element and choosing Inspect.

This opens Chrome’s developer tools. Look for a tab named ‘Properties’ and open it. Expand the element then expand the __data__ property. This is the data that D3 has joined to the element. (See screencast.)

Being able to check the joined data in this manner is particularly useful when debugging as it allows us to check whether our data join is behaving as expected.
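The idea of storing the joined data on the element itself can be modelled in a few lines of plain JavaScript. This is a simplified sketch rather than D3's actual code: plain objects stand in for DOM elements, joinByIndex is a hypothetical helper, and the arrays are assumed to be the same length.

```javascript
// Plain objects stand in for three <circle> DOM elements (an assumption so
// the sketch runs outside a browser).
var elements = [{ tag: 'circle' }, { tag: 'circle' }, { tag: 'circle' }];

var cities = [
  { name: 'London', population: 8674000},
  { name: 'New York', population: 8406000},
  { name: 'Sydney', population: 4293000}
];

// Hypothetical helper: store each array item on the element at the same
// index, using the same property name that D3 uses.
function joinByIndex(els, data) {
  els.forEach(function(el, i) {
    el.__data__ = data[i];
  });
  return els;
}

joinByIndex(elements, cities);

console.log(elements[0].__data__.name); // London
console.log(elements[2].__data__.name); // Sydney
```

On a real page the same check can be made from the browser console after a join, for example with document.querySelector('circle').__data__.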

What if our array’s longer (or shorter) than the selection?

So far we’ve looked at data joins where the selection is exactly the same length as the data array. Clearly this won’t always be the case and D3 handles this using enter and exit. To learn more see the enter and exit section.
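As a rough illustration of the mismatch, the sketch below counts what is left over when elements and array items are paired by index, which is how the joins in this section behave. The joinCounts helper and its property names are hypothetical and are not part of D3.

```javascript
// Hypothetical helper: given the number of elements in the selection and the
// number of items in the data array, count the index-based pairings and the
// leftovers on either side.
function joinCounts(elementCount, dataCount) {
  return {
    // elements that receive a datum
    paired: Math.min(elementCount, dataCount),
    // array items with no element (these are what enter deals with)
    unmatchedData: Math.max(0, dataCount - elementCount),
    // elements with no array item (these are what exit deals with)
    unmatchedElements: Math.max(0, elementCount - dataCount)
  };
}

console.log(joinCounts(5, 5)); // everything pairs up, nothing left over
console.log(joinCounts(3, 5)); // two array items have no element
console.log(joinCounts(5, 3)); // two elements have no array item
```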

What’s .datum for?

There are a few instances (such as when dealing with geographic visualisations) where it’s useful to join a single bit of data with a selection (usually containing a single element). Supposing we have an object:

var featureCollection = {type: 'FeatureCollection', features: features};

we can join it to a single element using .datum:

d3.select('path#my-map')
  .datum(featureCollection);

This just adds a __data__ property to the element and assigns the joined data (featureCollection in this case) to it. See the geographic visualisations section for a deeper look at this.

Most of the time .data will be used for data joins. .datum is reserved for special cases such as the above.
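The difference between the two can be modelled in plain JavaScript. This is a simplified sketch under a few assumptions: plain objects stand in for DOM elements, modelData and modelDatum are hypothetical helpers rather than D3 functions, and features is an empty placeholder array.

```javascript
// Hypothetical model of .data: each element gets the array item at its index.
function modelData(elements, array) {
  elements.forEach(function(el, i) {
    el.__data__ = array[i];
  });
}

// Hypothetical model of .datum: every element gets the same value, whole.
function modelDatum(elements, value) {
  elements.forEach(function(el) {
    el.__data__ = value;
  });
}

var features = []; // placeholder; real GeoJSON features would go here
var featureCollection = {type: 'FeatureCollection', features: features};

var circles = [{}, {}, {}];  // stand-ins for three <circle> elements
var mapPath = [{}];          // stand-in for a single <path> element

modelData(circles, [10, 40, 20]);
modelDatum(mapPath, featureCollection);

console.log(circles[1].__data__);      // 40
console.log(mapPath[0].__data__.type); // FeatureCollection
```

With modelData the array is split up, one item per element, while with modelDatum the object is handed to the element as a single value.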
