## Scale functions

Scale functions are JavaScript functions that:

- take an input (usually a number, date or category) and
- return a value (such as a coordinate, a colour, a length or a radius)

They’re typically used to transform (or ‘map’) data values into visual variables (such as position, length and colour).

For example, suppose we have some data:

We can create a scale function using:

D3 creates a function `myScale` which accepts input between 0 and 10 (the **domain**) and maps it to output between 0 and 600 (the **range**).

We can use `myScale` to calculate positions based on the data:
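As a rough sketch of what such a scale computes, the following plain JavaScript mirrors the mapping described above. The `linearScale` helper and the `data` values are assumptions made for illustration and are not D3 code; with D3 itself the scale would be created with `d3.scaleLinear().domain([0, 10]).range([0, 600])`.

```javascript
// Hand-written stand-in for a linear scale (illustrative, not D3's implementation).
// D3 equivalent: d3.scaleLinear().domain([0, 10]).range([0, 600])
function linearScale([d0, d1], [r0, r1]) {
  return (x) => r0 + ((x - d0) * (r1 - r0)) / (d1 - d0);
}

const data = [0, 2, 3, 5, 7.5, 9, 10]; // hypothetical sample data
const myScale = linearScale([0, 10], [0, 600]);

// Map each data value to a position between 0 and 600.
const positions = data.map(myScale);
console.log(positions); // [0, 120, 180, 300, 450, 540, 600]
```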

Scales are mainly used for transforming data values to visual variables such as position, length and colour.

For example they can transform:

- data values into lengths between 0 and 500 for a bar chart
- data values into positions between 0 and 200 for line charts
- % change data (+4%, +10%, -5% etc.) into a continuous range of colours (with red for negative and green for positive)
- dates into positions along an x-axis.

### Constructing scales

(In this section we’ll just focus on linear scales as these are the most commonly used scale type. We’ll cover other types later on.)

To create a linear scale we use:

Version 4 uses a different naming convention to v3. We use `d3.scaleLinear()` in v4 and `d3.scale.linear()` in v3.

As it stands the above function isn’t very useful so we can configure the input bounds (the `domain`) as well as the output bounds (the `range`):

Now `myScale` is a function that accepts input between 0 and 100 and linearly maps it to between 0 and 800.
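To make the idea of a configurable scale function concrete, here is a minimal plain-JavaScript sketch of a function with chainable `domain` and `range` setters. The `makeLinearScale` helper and its `[0, 1]` starting bounds are assumptions of this sketch rather than D3's source; the corresponding D3 call is shown in the comment.

```javascript
// Illustrative stand-in showing how a configurable scale function could be built.
// D3 equivalent: const myScale = d3.scaleLinear().domain([0, 100]).range([0, 800]);
function makeLinearScale() {
  let domain = [0, 1]; // input bounds (assumed starting value for this sketch)
  let range = [0, 1];  // output bounds (assumed starting value for this sketch)

  function scale(x) {
    const [d0, d1] = domain;
    const [r0, r1] = range;
    return r0 + ((x - d0) * (r1 - r0)) / (d1 - d0);
  }
  // Setters return the scale itself so calls can be chained.
  scale.domain = (d) => { domain = d; return scale; };
  scale.range = (r) => { range = r; return scale; };
  return scale;
}

const myScale = makeLinearScale().domain([0, 100]).range([0, 800]);
console.log(myScale(0), myScale(50), myScale(100)); // 0 400 800
```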

Try experimenting with scale functions by copying code fragments and pasting them into the console or using a web-based editor such as JS Bin.

### D3 scale types

D3 has around 12 different scale types (scaleLinear, scalePow, scaleQuantize, scaleOrdinal etc.) and broadly speaking they can be classified into 3 groups:

- scales with continuous input and continuous output
- scales with continuous input and discrete output
- scales with discrete input and discrete output

We’ll now look at these 3 categories one by one.

### Scales with continuous input and continuous output

In this section we cover scale functions that map from a **continuous input domain** to a **continuous output range**.

#### scaleLinear

Linear scales are probably the most commonly used scale type as they are the most suitable scale for transforming data values into positions and lengths. If there’s one scale type to learn about this is the one.

They use a linear function (`y = m * x + b`) to interpolate across the domain and range.

Typical uses are to transform data values into positions and lengths, so when creating bar charts, line charts (as well as many other chart types) they are the scale to use.

The output range can also be specified as colours:
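A colour range works the same way as a numeric one, with each colour channel interpolated separately. The sketch below is plain JavaScript with a hypothetical `linearColourScale` helper and colours written as RGB triples; in D3 the colours can be given as strings, as in the commented call.

```javascript
// Illustrative sketch: a linear scale whose range is a pair of RGB colours.
// D3 equivalent: d3.scaleLinear().domain([0, 10]).range(['yellow', 'red'])
function linearColourScale([d0, d1], [c0, c1]) {
  return (x) => {
    const t = (x - d0) / (d1 - d0);
    // Interpolate each RGB channel separately and round to whole numbers.
    const [r, g, b] = c0.map((v, i) => Math.round(v + (c1[i] - v) * t));
    return `rgb(${r}, ${g}, ${b})`;
  };
}

const yellow = [255, 255, 0];
const red = [255, 0, 0];
const colourScale = linearColourScale([0, 10], [yellow, red]);

console.log(colourScale(0));  // rgb(255, 255, 0)
console.log(colourScale(5));  // rgb(255, 128, 0)
console.log(colourScale(10)); // rgb(255, 0, 0)
```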

This can be useful for visualisations such as choropleth maps, but also consider `scaleQuantize`, `scaleQuantile` and `scaleThreshold`.

#### scalePow

Included more for completeness than for practical usefulness, the power scale interpolates using a power function (`y = m * x^k + b`). The exponent `k` is set using `.exponent()`:
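A minimal plain-JavaScript sketch of the power mapping follows. The `powScale` helper, the exponent of 2 and the domain and range values are assumptions chosen for illustration; the comment shows the matching D3 call.

```javascript
// Illustrative power scale (not D3's implementation).
// D3 equivalent: d3.scalePow().exponent(2).domain([0, 10]).range([0, 100])
function powScale(exponent, [d0, d1], [r0, r1]) {
  const p0 = Math.pow(d0, exponent);
  const p1 = Math.pow(d1, exponent);
  // Raise the input to the exponent, then interpolate linearly.
  return (x) => r0 + ((Math.pow(x, exponent) - p0) * (r1 - r0)) / (p1 - p0);
}

const powerScale = powScale(2, [0, 10], [0, 100]);
console.log(powerScale(0), powerScale(5), powerScale(10)); // 0 25 100
```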

#### scaleSqrt

The `scaleSqrt` scale is a special case of the power scale (where *k* = 0.5) and is useful for sizing circles by area (rather than radius). (When using circle size to represent data, it’s considered better practice to set the area, rather than the radius proportionally to the data.)
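The following sketch shows why a square-root mapping sizes circles by area. It is plain JavaScript with a hypothetical `sqrtScale` helper and made-up domain and range values; the D3 call it mirrors is in the comment.

```javascript
// Illustrative square-root scale: a power scale with exponent 0.5.
// D3 equivalent: d3.scaleSqrt().domain([0, 100]).range([0, 30])
function sqrtScale([d0, d1], [r0, r1]) {
  const s0 = Math.sqrt(d0);
  const s1 = Math.sqrt(d1);
  return (x) => r0 + ((Math.sqrt(x) - s0) * (r1 - r0)) / (s1 - s0);
}

const radiusScale = sqrtScale([0, 100], [0, 30]);
const rSmall = radiusScale(25);  // 15
const rLarge = radiusScale(100); // 30

// The data ratio is 25 : 100 = 1 : 4, and so is the ratio of the circle areas.
const areaRatio = (Math.PI * rSmall ** 2) / (Math.PI * rLarge ** 2);
console.log(rSmall, rLarge, areaRatio); // 15 30 0.25
```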

#### scaleLog

Log scales interpolate using a log function (`y = m * log(x) + b`) and can be useful when the data has an exponential nature to it.
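As a sketch, a log scale can be thought of as taking the logarithm of the input and then mapping linearly. The `logScale` helper and the domain and range below are illustrative assumptions in plain JavaScript; the equivalent D3 call appears in the comment.

```javascript
// Illustrative log scale (base 10 here; not D3's implementation).
// D3 equivalent: d3.scaleLog().domain([10, 100000]).range([0, 600])
function logScale([d0, d1], [r0, r1]) {
  const l0 = Math.log10(d0);
  const l1 = Math.log10(d1);
  // Take the log of the input, then interpolate linearly.
  return (x) => r0 + ((Math.log10(x) - l0) * (r1 - r0)) / (l1 - l0);
}

const myLogScale = logScale([10, 100000], [0, 600]);
console.log(myLogScale(10), myLogScale(1000), myLogScale(100000)); // 0 300 600
```

Each tenfold increase in the input moves the output by the same amount (150 here), which is what makes the scale suit exponentially growing data.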

#### scaleTime

`scaleTime` is similar to `scaleLinear` except the domain is expressed as an array of dates. (It’s **very** useful when dealing with time series data.)
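One way to picture a time scale is as a linear scale applied to timestamps. The sketch below uses a hypothetical `timeScale` helper and example dates; UTC dates are an assumption made here to keep the arithmetic independent of the local time zone.

```javascript
// Illustrative time scale: dates are converted to timestamps, then mapped linearly.
// D3 equivalent (local time): d3.scaleTime().domain([new Date(2016, 0, 1), new Date(2017, 0, 1)]).range([0, 700])
function timeScale([d0, d1], [r0, r1]) {
  const t0 = d0.getTime();
  const t1 = d1.getTime();
  return (date) => r0 + ((date.getTime() - t0) * (r1 - r0)) / (t1 - t0);
}

// UTC dates keep the arithmetic independent of the local time zone.
const start = new Date(Date.UTC(2016, 0, 1));
const end = new Date(Date.UTC(2017, 0, 1));
const myTimeScale = timeScale([start, end], [0, 700]);

console.log(myTimeScale(start));                          // 0
console.log(myTimeScale(new Date(Date.UTC(2016, 6, 2)))); // 350 (midpoint of the leap year)
console.log(myTimeScale(end));                            // 700
```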

#### scaleSequential

`scaleSequential` is used for mapping continuous values to an output range determined by a preset (or custom) **interpolator**. (An interpolator is a function that accepts input between 0 and 1 and outputs an interpolated value between two numbers, colours, strings etc.)

D3 provides a number of preset interpolators including many colour ones. For example we can use `d3.interpolateRainbow` to create the well known rainbow colour scale:

Note that the interpolator determines the output range so you don’t need to specify the range yourself.
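The relationship between the domain and the interpolator can be sketched in plain JavaScript as follows. Both `sequentialScale` and the custom `interpolateGrey` interpolator are hypothetical names used for illustration; with D3 a preset such as `d3.interpolateRainbow` would be passed instead, as in the comment.

```javascript
// Illustrative sequential scale: normalise the input to [0, 1], then hand it to an interpolator.
// D3 equivalent: d3.scaleSequential().domain([0, 100]).interpolator(d3.interpolateRainbow)
function sequentialScale([d0, d1], interpolator) {
  return (x) => interpolator((x - d0) / (d1 - d0));
}

// Hypothetical custom interpolator: black at t = 0, white at t = 1.
function interpolateGrey(t) {
  const v = Math.round(255 * t);
  return `rgb(${v}, ${v}, ${v})`;
}

const greyScale = sequentialScale([0, 100], interpolateGrey);
console.log(greyScale(0));   // rgb(0, 0, 0)
console.log(greyScale(50));  // rgb(128, 128, 128)
console.log(greyScale(100)); // rgb(255, 255, 255)
```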

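D3 provides other colour interpolators too, including `d3.interpolateViridis`, `d3.interpolateInferno`, `d3.interpolateMagma`, `d3.interpolatePlasma`, `d3.interpolateWarm`, `d3.interpolateCool` and `d3.interpolateCubehelixDefault`.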

There’s also a plug-in d3-scale-chromatic which provides the well known ColorBrewer colour schemes.

#### Clamping

By default `scaleLinear`, `scalePow`, `scaleSqrt`, `scaleLog`, `scaleTime` and `scaleSequential` allow input outside the domain. For example:

In this instance the scale function uses extrapolation for values outside the domain.

If we’d like the scale function to be restricted to input values inside the domain we can ‘clamp’ the scale function using `.clamp()`:
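The difference between extrapolating and clamping can be sketched in plain JavaScript. The `linearScale` helper with its `clamp` flag is an assumption for illustration (it also assumes an ascending domain); in D3 clamping is switched on with `.clamp(true)`, as in the comment.

```javascript
// Illustrative linear scale with an optional clamp flag (not D3's implementation).
// D3 equivalent: d3.scaleLinear().domain([0, 10]).range([0, 100]).clamp(true)
function linearScale([d0, d1], [r0, r1], clamp = false) {
  return (x) => {
    // When clamping, pull the input back inside the domain first.
    if (clamp) x = Math.min(Math.max(x, d0), d1);
    return r0 + ((x - d0) * (r1 - r0)) / (d1 - d0);
  };
}

const unclamped = linearScale([0, 10], [0, 100]);
console.log(unclamped(20), unclamped(-10)); // 200 -100 (extrapolated)

const clamped = linearScale([0, 10], [0, 100], true);
console.log(clamped(20), clamped(-10)); // 100 0
```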

We can switch off clamping using `.clamp(false)`.

#### Nice

If the domain has been computed automatically from real data (e.g. by using `d3.extent`) the start and end values might not be round figures. This isn’t necessarily a problem, but if using the scale to define an axis, it can look a bit untidy:

Therefore D3 provides a function `.nice()` on the scales in this section which will round the domain to ‘nice’ round values:
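To give a feel for the rounding, here is a deliberately simplified plain-JavaScript sketch. The `niceDomain` helper, its power-of-ten step and the `data` values are assumptions; D3 derives the step from its own tick logic, so `.nice()` may produce different bounds for other inputs.

```javascript
// Simplified sketch of 'nicing' a domain; D3 derives its step from its tick
// logic, so its results can differ from this power-of-ten version.
// D3 usage: d3.scaleLinear().domain(d3.extent(data)).nice()
function niceDomain([min, max]) {
  const step = Math.pow(10, Math.floor(Math.log10(max - min)));
  // Extend the domain outwards to the nearest multiples of the step.
  return [Math.floor(min / step) * step, Math.ceil(max / step) * step];
}

const data = [243, 310, 402, 584]; // hypothetical measurements
const extent = [Math.min(...data), Math.max(...data)]; // [243, 584]
console.log(niceDomain(extent)); // [200, 600]
```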

Note that `.nice()` must be called each time the domain is updated.

#### Multiple segments

The domain and range of `scaleLinear`, `scalePow`, `scaleSqrt`, `scaleLog` and `scaleTime` usually consist of two values, but if we provide 3 or more values the scale function is subdivided into multiple segments:
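A two-segment scale that separates negative from positive values can be sketched as below. The `piecewiseColourScale` helper and the RGB triples are illustrative assumptions in plain JavaScript; the D3 call in the comment takes colour strings directly.

```javascript
// Illustrative piecewise-linear colour scale with two segments (not D3's implementation).
// D3 equivalent: d3.scaleLinear().domain([-10, 0, 10]).range(['red', '#ddd', 'blue'])
function piecewiseColourScale(domain, range) {
  return (x) => {
    // Find the segment containing x; values outside use the first or last segment.
    let i = 0;
    while (i < domain.length - 2 && x >= domain[i + 1]) i++;
    const t = (x - domain[i]) / (domain[i + 1] - domain[i]);
    const [r, g, b] = range[i].map((v, k) => Math.round(v + (range[i + 1][k] - v) * t));
    return `rgb(${r}, ${g}, ${b})`;
  };
}

const red = [255, 0, 0], grey = [221, 221, 221], blue = [0, 0, 255];
const segmentedScale = piecewiseColourScale([-10, 0, 10], [red, grey, blue]);

console.log(segmentedScale(-10)); // rgb(255, 0, 0)
console.log(segmentedScale(0));   // rgb(221, 221, 221)
console.log(segmentedScale(10));  // rgb(0, 0, 255)
```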

Typically multiple segments are used for distinguishing between negative and positive values (such as in the example above). We can use as many segments as we like as long as the domain and range are of the same length.

#### Inversion

The `.invert()` method allows us to determine a scale function’s **input** value given an **output** value (provided the scale function has a numeric domain):

A common use case is when we want to convert a user’s click along an axis into a domain value:
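Inversion simply runs the mapping backwards. In the plain-JavaScript sketch below, the `linearScale` helper and the `clickX` pixel value are hypothetical; in D3 the same result comes from calling `.invert()` on the scale, as in the comment.

```javascript
// Illustrative linear scale with an invert helper (not D3's implementation).
// D3 equivalent: const myScale = d3.scaleLinear().domain([0, 100]).range([0, 800]); myScale.invert(400)
function linearScale([d0, d1], [r0, r1]) {
  const scale = (x) => r0 + ((x - d0) * (r1 - r0)) / (d1 - d0);
  // Run the mapping backwards: from an output value to the input that produces it.
  scale.invert = (y) => d0 + ((y - r0) * (d1 - d0)) / (r1 - r0);
  return scale;
}

const myScale = linearScale([0, 100], [0, 800]);
console.log(myScale.invert(400)); // 50

// Hypothetical click position, in pixels along the axis.
const clickX = 200;
console.log(myScale.invert(clickX)); // 25
```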

### Scales with continuous input and discrete output

#### scaleQuantize

`scaleQuantize` accepts continuous input and outputs a number of discrete quantities defined by the range.
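A minimal sketch of the quantizing behaviour, written in plain JavaScript with a hypothetical `quantizeScaleOf` helper; the domain of 0 to 100 and the four colours match the values discussed in this section, and the D3 call is shown in the comment.

```javascript
// Illustrative quantize scale (not D3's implementation).
// D3 equivalent: d3.scaleQuantize().domain([0, 100]).range(['lightblue', 'orange', 'lightgreen', 'pink'])
function quantizeScaleOf([d0, d1], range) {
  const n = range.length;
  return (x) => {
    // Index of the equal-sized chunk containing x, kept within the range array.
    const i = Math.floor(((x - d0) / (d1 - d0)) * n);
    return range[Math.min(Math.max(i, 0), n - 1)];
  };
}

const quantizeScale = quantizeScaleOf([0, 100], ['lightblue', 'orange', 'lightgreen', 'pink']);
console.log(quantizeScale(10));  // lightblue
console.log(quantizeScale(30));  // orange
console.log(quantizeScale(90));  // pink
console.log(quantizeScale(-10)); // lightblue
console.log(quantizeScale(110)); // pink
```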

Each range value is mapped to an equal sized chunk in the domain so in the example above:

- 0 ≤ *u* < 25 is mapped to ‘lightblue’
- 25 ≤ *u* < 50 is mapped to ‘orange’
- 50 ≤ *u* < 75 is mapped to ‘lightgreen’
- 75 ≤ *u* < 100 is mapped to ‘pink’

where *u* is the input value.

Note also that input values outside the domain are clamped so in our example `quantizeScale(-10)` returns ‘lightblue’ and `quantizeScale(110)` returns ‘pink’.

#### scaleQuantile

`scaleQuantile` maps continuous numeric input to discrete values. The domain is defined by **an array of numbers**:
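The sketch below illustrates the idea in plain JavaScript. The `quantileScaleOf` helper, the 15-value `myData` array and the linearly interpolated split points are assumptions made for illustration; the corresponding D3 call is in the comment.

```javascript
// Illustrative quantile scale (not D3's implementation).
// D3 equivalent: d3.scaleQuantile().domain(myData).range(['lightblue', 'orange', 'lightgreen'])
function quantileScaleOf(values, range) {
  const sorted = [...values].sort((a, b) => a - b);
  const n = range.length;
  // Split points: linearly interpolated quantiles of the sorted data.
  const thresholds = [];
  for (let k = 1; k < n; k++) {
    const pos = ((sorted.length - 1) * k) / n;
    const lo = Math.floor(pos);
    const hi = Math.min(lo + 1, sorted.length - 1);
    thresholds.push(sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo));
  }
  // The output is chosen by counting how many split points the input has reached.
  const scale = (x) => range[thresholds.filter((t) => x >= t).length];
  scale.quantiles = () => thresholds;
  return scale;
}

// Hypothetical data set of 15 values.
const myData = [0, 5, 7, 10, 20, 30, 35, 40, 60, 62, 65, 70, 80, 90, 100];
const quantileScale = quantileScaleOf(myData, ['lightblue', 'orange', 'lightgreen']);

console.log(quantileScale(20)); // lightblue
console.log(quantileScale(30)); // orange
console.log(quantileScale(65)); // lightgreen
console.log(quantileScale.quantiles()); // roughly [26.67, 63]
```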

The (sorted) domain array is divided into *n* equal sized groups where *n* is the number of range values.

Therefore in the above example the domain array is split into 3 groups where:

- the first 5 values are mapped to ‘lightblue’
- the next 5 values to ‘orange’ and
- the last 5 values to ‘lightgreen’.

The split points of the domain can be accessed using `.quantiles()`:

If the range contains 4 values `quantileScale` computes the **quartiles** of the data. In other words, the lowest 25% of the data is mapped to `range[0]`, the next 25% of the data is mapped to `range[1]` etc.

#### scaleThreshold

`scaleThreshold` maps continuous numeric input to discrete values defined by the range. *n-1* domain split points are specified where *n* is the number of range values.

In the following example we split the domain at `0`, `50` and `100`:

- *u* < 0 is mapped to ‘#ccc’
- 0 ≤ *u* < 50 to ‘lightblue’
- 50 ≤ *u* < 100 to ‘orange’
- *u* ≥ 100 to ‘#ccc’

where *u* is the input value.
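The split points listed above can be sketched in plain JavaScript as follows. The `thresholdScaleOf` helper is a hypothetical stand-in; with D3 the same configuration is written as in the comment.

```javascript
// Illustrative threshold scale (not D3's implementation).
// D3 equivalent: d3.scaleThreshold().domain([0, 50, 100]).range(['#ccc', 'lightblue', 'orange', '#ccc'])
function thresholdScaleOf(splitPoints, range) {
  // The number of split points at or below x selects the output value.
  return (x) => range[splitPoints.filter((s) => x >= s).length];
}

const thresholdScale = thresholdScaleOf([0, 50, 100], ['#ccc', 'lightblue', 'orange', '#ccc']);
console.log(thresholdScale(-10)); // #ccc
console.log(thresholdScale(20));  // lightblue
console.log(thresholdScale(70));  // orange
console.log(thresholdScale(110)); // #ccc
```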

### Scales with discrete input and discrete output

#### scaleOrdinal

`scaleOrdinal` maps discrete values (specified by an array) to discrete values (also specified by an array). The domain array specifies the possible input values and the range array the output values. The range array will repeat if it’s shorter than the domain array.

By default if a value that’s not in the domain is used as input, the scale will implicitly add the value to the domain:

If this isn’t the desired behaviour we can specify an output value for unknown values using `.unknown()`:
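The repeating range, the implicit domain and the unknown value can all be illustrated with one plain-JavaScript sketch. The `ordinalScaleOf` helper, its `options` argument and the weekday values are assumptions for illustration; the D3 calls they mirror are in the comments.

```javascript
// Illustrative ordinal scale (not D3's implementation).
// D3 equivalent: d3.scaleOrdinal().domain(['Mon', 'Tue', 'Wed']).range(['orange', 'lightblue'])
// and, for the second scale, the same chain followed by .unknown('Not a day')
function ordinalScaleOf(domain, range, options = {}) {
  const index = new Map(domain.map((d, i) => [d, i]));
  return (value) => {
    if (!index.has(value)) {
      // Either return the configured fallback or extend the domain implicitly.
      if ('unknown' in options) return options.unknown;
      index.set(value, index.size);
    }
    // The range repeats when it is shorter than the domain.
    return range[index.get(value) % range.length];
  };
}

const days = ['Mon', 'Tue', 'Wed']; // hypothetical domain
const implicitScale = ordinalScaleOf(days, ['orange', 'lightblue']);
console.log(implicitScale('Mon'), implicitScale('Tue'), implicitScale('Wed')); // orange lightblue orange
console.log(implicitScale('Thu')); // lightblue ('Thu' was added to the domain)

const strictScale = ordinalScaleOf(days, ['orange', 'lightblue'], { unknown: 'Not a day' });
console.log(strictScale('Thu')); // Not a day
```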

#### scaleBand

When creating bar charts `scaleBand` helps to determine the geometry of the bars, taking into account padding between each bar. The domain is specified as an array of values (one value for each band) and the range as the minimum and maximum extents of the bands (e.g. the total width of the bar chart).

In effect `scaleBand` will split the range into *n* bands (where *n* is the number of values in the domain array) and compute the positions and widths of the bands taking into account any specified padding.

The width of each band can be accessed using `.bandwidth()`:

Two types of padding may be configured:

- `paddingInner` which specifies (as a fraction of the step, i.e. the distance between the starts of adjacent bands) the amount of padding between each band
- `paddingOuter` which specifies (as a fraction of the step) the amount of padding before the first band and after the last band

Let’s add some inner padding to the example above:
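The band geometry, with and without inner padding, can be sketched in plain JavaScript. The `bandScaleOf` helper, the weekday domain and the range of 0 to 190 are assumptions chosen so the numbers come out round; the sketch also assumes every input is in the domain. The matching D3 call is shown in the comment.

```javascript
// Illustrative band scale (not D3's implementation); padding values are fractions of the step.
// D3 equivalent: d3.scaleBand().domain(days).range([0, 190]).paddingInner(0.25)
function bandScaleOf(domain, [r0, r1], paddingInner = 0, paddingOuter = 0) {
  const n = domain.length;
  // Step: distance between the starts of adjacent bands.
  const step = (r1 - r0) / Math.max(1, n - paddingInner + paddingOuter * 2);
  // Assumes value is present in the domain.
  const scale = (value) => r0 + step * paddingOuter + step * domain.indexOf(value);
  scale.bandwidth = () => step * (1 - paddingInner);
  return scale;
}

const days = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri']; // hypothetical domain

const plain = bandScaleOf(days, [0, 190]);
console.log(plain('Tue'), plain.bandwidth()); // 38 38

const padded = bandScaleOf(days, [0, 190], 0.25);
console.log(padded('Tue'), padded.bandwidth()); // 40 30
```

With inner padding of 0.25 each band shrinks from 38 to 30 units wide, and the last band still ends at 190.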

Putting this all together we can create this bar chart:

#### scalePoint

`scalePoint` creates scale functions that map from a discrete set of values to equally spaced points along the specified range:

The distance between the points can be accessed using `.step()`:

Outside padding can be specified as the ratio of the padding to point spacing. For example, for the outside padding to be a quarter of the point spacing use a value of 0.25:
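The point positions, the step and the outside padding can be sketched together in plain JavaScript. The `pointScaleOf` helper, the weekday domain and the range of 0 to 450 are assumptions for illustration, and the sketch assumes every input is in the domain; the D3 call it mirrors is in the comment.

```javascript
// Illustrative point scale (not D3's implementation).
// D3 equivalent: d3.scalePoint().domain(days).range([0, 450]).padding(0.25)
function pointScaleOf(domain, [r0, r1], padding = 0) {
  const n = domain.length;
  // Step: distance between adjacent points; padding is a fraction of it.
  const step = (r1 - r0) / Math.max(1, n - 1 + padding * 2);
  // Assumes value is present in the domain.
  const scale = (value) => r0 + step * padding + step * domain.indexOf(value);
  scale.step = () => step;
  return scale;
}

const days = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri']; // hypothetical domain

const plain = pointScaleOf(days, [0, 450]);
console.log(plain('Mon'), plain('Fri'), plain.step()); // 0 450 112.5

const padded = pointScaleOf(days, [0, 450], 0.25);
console.log(padded('Mon'), padded('Fri'), padded.step()); // 25 425 100
```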