more documentation
mbostock committed Nov 5, 2023
1 parent 67b3e3b commit 54bf3fe
Showing 2 changed files with 87 additions and 24 deletions.
63 changes: 39 additions & 24 deletions docs/marks/difference.md
@@ -2,9 +2,8 @@

import * as Plot from "@observablehq/plot";
import * as d3 from "d3";
import {computed, ref, shallowRef, onMounted} from "vue";
import {computed, shallowRef, onMounted} from "vue";

const shift = ref(365);
const aapl = shallowRef([]);
const gistemp = shallowRef([]);
const tsa = shallowRef([{Date: new Date("2020-01-01")}]);
@@ -19,11 +18,11 @@ onMounted(() => {

</script>

# Difference mark
# Difference mark <VersionBadge pr="1896" />

The **difference mark** compares a metric. Like the [area mark](./area.md), the region between two lines is filled; unlike the area mark, alternating color shows when the primary metric is above or below the secondary metric.
The **difference mark** puts a metric in context by comparing it to something. Like the [area mark](./area.md), the region between two lines is filled; unlike the area mark, alternating color shows when the metric is above or below the comparison value.

In the simplest case, the difference mark compares a metric to a constant, often zero. For example, the plot below shows the [global surface temperature anomaly](https://data.giss.nasa.gov/gistemp/) from 1880–2016; 0° represents the 1951–1980 average; above-average temperatures are in <span style="border-bottom: solid var(--vp-c-red) 3px;">red</span>, while below-average temperatures are in <span style="border-bottom: solid var(--vp-c-blue) 3px;">blue</span>. (It’s getting hotter.)
In the simplest case, the difference mark compares a metric to a constant. For example, the plot below shows the [global surface temperature anomaly](https://data.giss.nasa.gov/gistemp/) from 1880–2016; 0° represents the 1951–1980 average; above-average temperatures are in <span style="border-bottom: solid var(--vp-c-red) 3px;">red</span>, while below-average temperatures are in <span style="border-bottom: solid var(--vp-c-blue) 3px;">blue</span>. (It’s getting hotter.)

:::plot
```js
@@ -37,7 +36,7 @@ Plot.differenceY(gistemp, {
```
:::

Applying a 24-month [moving average](../transforms/window.md) improves readability by smoothing the noise.
A 24-month [moving average](../transforms/window.md) improves readability by smoothing out the noise.
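
To make the smoothing concrete, here is a minimal sketch of a trailing moving mean in plain JavaScript. It illustrates the idea only and is not how Plot's window transform is implemented; `movingMean` is a hypothetical helper, the input values are made up, and a window of 3 stands in for 24 to keep the example short.

```js
// Hypothetical helper (not part of Plot): trailing moving mean over k values.
// Positions with fewer than k preceding values are left as NaN.
function movingMean(values, k) {
  return values.map((_, i) => {
    if (i < k - 1) return NaN; // not enough history yet
    let sum = 0;
    for (let j = i - k + 1; j <= i; ++j) sum += values[j];
    return sum / k;
  });
}

const smoothed = movingMean([1, 2, 3, 4, 5], 3);
console.log(smoothed); // [NaN, NaN, 2, 3, 4]
```

A larger window removes more noise but also hides short-lived swings, so the window size is a trade-off.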

:::plot
```js
@@ -54,25 +53,23 @@ Plot.differenceY(
```
:::

More powerfully, the difference mark compares two metrics.

Comparing metrics is most convenient when the data has a column for each. For example, the plot below shows the number of daily travelers through TSA checkpoints in 2020 compared to 2019. In the first two months of 2020, there were on average <span style="border-bottom: solid #01ab63 3px;">more travelers</span> per day than 2019; yet when COVID-19 hit, there were many <span style="border-bottom: solid #4269d0 3px;">fewer travelers</span> per day, dropping almost to zero.
More powerfully, the difference mark compares two metrics. For example, the plot below shows the number of travelers per day through TSA checkpoints in 2020 compared to 2019. (This in effect compares a metric against itself, but as the data represents each year as a separate column, we can think of it as two separate metrics.) In the first two months of 2020, there were on average <span style="border-bottom: solid #01ab63 3px;">more travelers</span> per day than 2019; yet when COVID-19 hit, there were many <span style="border-bottom: solid #4269d0 3px;">fewer travelers</span> per day, dropping almost to zero.

:::plot
```js
Plot.plot({
x: {tickFormat: "%b"},
y: {grid: true, label: "Travelers"},
marks: [
Plot.axisY({label: "Daily travelers (thousands, 2020 vs. 2019)", tickFormat: (d) => d / 1000}),
Plot.axisY({label: "Travelers per day (thousands, 2020 vs. 2019)", tickFormat: (d) => d / 1000}),
Plot.ruleY([0]),
Plot.differenceY(tsa, {x: "Date", y1: "2019", y2: "2020", tip: {format: {x: "%B %-d"}}})
]
})
```
:::

If the data is “tall” rather than “wide” — TK explain what this means — you can use the [group transform](../transforms/group.md) with the [find reducer](../transforms/group.md#find): group the rows by date, and then for the two output columns **y1** and **y2**, find the desired corresponding row. The plot below shows daily minimum temperature for San Francisco compared to San Jose. The insulating effect of the fog keeps San Francisco warmer in winter and cooler in summer, reducing seasonal variation.
If the data is “tall” rather than “wide” — that is, if the two metrics we wish to compare are represented by separate *rows* rather than separate *columns* — we can use the [group transform](../transforms/group.md) with the [find reducer](../transforms/group.md#find): group the rows by **x** (date), then find the desired **y1** and **y2** for each group. The plot below shows daily minimum temperature for San Francisco compared to San Jose. Notice how the insulating fog keeps San Francisco <span style="border-bottom: solid #01ab63 3px;">warmer</span> in winter and <span style="border-bottom: solid #4269d0 3px;">cooler</span> in summer, reducing seasonal variation.
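
The group-then-find step can be sketched in plain JavaScript without Plot. This is only an illustration of the reshaping, not Plot's group transform: `pivot` is a hypothetical helper, and the field names (`date`, `station`, `tmin`) and temperature values are assumptions made up for the example.

```js
// Made-up "tall" data: one row per (date, station) pair.
const tall = [
  {date: "2020-01-01", station: "SF", tmin: 8},
  {date: "2020-01-01", station: "SJ", tmin: 5},
  {date: "2020-01-02", station: "SF", tmin: 9},
  {date: "2020-01-02", station: "SJ", tmin: 4}
];

// Hypothetical helper: group rows by date, then find y1 and y2 in each group.
function pivot(rows) {
  const groups = new Map();
  for (const row of rows) {
    if (!groups.has(row.date)) groups.set(row.date, []);
    groups.get(row.date).push(row);
  }
  return Array.from(groups, ([date, group]) => ({
    x: date,
    y1: group.find((d) => d.station === "SJ")?.tmin, // the comparison
    y2: group.find((d) => d.station === "SF")?.tmin // the metric
  }));
}

const wide = pivot(tall);
console.log(wide); // one {x, y1, y2} object per date
```

Each output object has one value per metric, which is the "wide" shape that **y1** and **y2** expect.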

:::plot
```js
@@ -103,32 +100,50 @@ Plot.plot({
```
:::

The difference mark can also be used to compare a metric *to itself* using the [shift transform](../transforms/shift.md). This is especially useful for time series that exhibit [periodicity](https://en.wikipedia.org/wiki/Seasonality) — which is most of them, and certainly ones that involve human behavior. In this way a difference mark can show week-over-week or year-over-year growth.

<p>
<label class="label-input" style="display: flex;">
<span style="display: inline-block; width: 7em;">Shift:</span>
<input type="range" v-model.number="shift" min="0" max="1000" step="1">
<span style="font-variant-numeric: tabular-nums;">{{shift}}</span>
</label>
</p>
The difference mark can also be used to compare a metric to itself using the [shift transform](../transforms/shift.md). The chart below shows year-over-year growth in the price of Apple stock.

:::plot
```js
Plot.differenceY(aapl, Plot.shiftX(`${shift} days`, {x: "Date", y: "Close"})).plot({y: {grid: true}})
Plot.differenceY(aapl, Plot.shiftX("+1 year", {x: "Date", y: "Close"})).plot({y: {grid: true}})
```
:::

TK Something about if you sold Apple stock after holding it for a year, you’d tend to do pretty well. But if you hold it for less time, you see more blue. And even if you held it for a year, you could have still lost money if you sold in most of 2016. Even the unluckiest person would have made money if they held Apple stock for 780+ days (in 2015–2018).
For most of the covered time period, you would have <span style="border-bottom: solid #01ab63 3px;">made a profit</span> by holding Apple stock for a year; however, if you bought in 2015 and sold in 2016, you would likely have <span style="border-bottom: solid #4269d0 3px;">lost money</span>.

## Difference options

TK
The following channels are required:

* **x2** - the horizontal position of the metric; bound to the *x* scale
* **y2** - the vertical position of the metric; bound to the *y* scale

In addition to the [standard mark options](../features/marks.md#mark-options), the following optional channels are supported:

* **x1** - the horizontal position of the comparison; bound to the *x* scale
* **y1** - the vertical position of the comparison; bound to the *y* scale

If **x1** is not specified, it defaults to **x2**. If **y1** is not specified, it defaults to **y2** — TODO that’s not right, because **y1** defaults to zero for differenceY. These defaults facilitate sharing *x* or *y* coordinates between the metric and its comparison.

TODO

* **fill**
* **positiveFill**
* **negativeFill**
* **fillOpacity**
* **positiveFillOpacity**
* **negativeFillOpacity**
* **stroke**
* **strokeOpacity**

TODO

* **z**
* **clip**

## differenceY(*data*, *options*) {#differenceY}

```js
Plot.differenceY(gistemp, {x: "Date", y: "Anomaly"})
```

TK
TODO
48 changes: 48 additions & 0 deletions docs/transforms/shift.md
@@ -0,0 +1,48 @@
<script setup>

import * as Plot from "@observablehq/plot";
import * as d3 from "d3";
import {ref, shallowRef, onMounted} from "vue";

const shift = ref(365);
const aapl = shallowRef([]);

onMounted(() => {
d3.csv("../data/aapl.csv", d3.autoType).then((data) => (aapl.value = data));
});

</script>

# Shift transform <VersionBadge pr="1896" />

The **shift transform** is a specialized [map transform](./map.md) that derives an output **x1** channel by shifting the **x** channel; it can be used with the [difference mark](../marks/difference.md) to show change over time. For example, the chart below shows the price of Apple stock. The <span style="border-bottom: solid #01ab63 3px;">green region</span> shows when the price went up over the given interval, while the <span style="border-bottom: solid #4269d0 3px;">blue region</span> shows when the price went down.

<p>
<label class="label-input" style="display: flex;">
<span style="display: inline-block; width: 7em;">Shift (days):</span>
<input type="range" v-model.number="shift" min="0" max="1000" step="1">
<span style="font-variant-numeric: tabular-nums;">{{shift}}</span>
</label>
</p>

:::plot hidden
```js
Plot.differenceY(aapl, Plot.shiftX(`${shift} days`, {x: "Date", y: "Close"})).plot({y: {grid: true}})
```
:::

```js-vue
Plot.differenceY(aapl, Plot.shiftX("{{shift}} days", {x: "Date", y: "Close"})).plot({y: {grid: true}})
```

When looking at year-over-year growth, the chart is mostly green, implying that you would make a profit by holding Apple stock for a year. However, if you bought in 2015 and sold in 2016, you would likely have lost money. Try adjusting the slider to a shorter or longer interval: how does that affect the typical return?

## shiftX(*interval*, *options*) {#shiftX}

```js
Plot.shiftX("7 days", {x: "Date", y: "Close"})
```

Derives an **x1** channel from the input **x** channel by shifting values by the given *interval*. The *interval* may be specified as: a name (*second*, *minute*, *hour*, *day*, *week*, *month*, *quarter*, *half*, *year*, *monday*, *tuesday*, *wednesday*, *thursday*, *friday*, *saturday*, *sunday*) with an optional number and sign (*e.g.*, *+3 days* or *-1 year*); or as a number; or as an implementation — such as d3.utcMonth — with *interval*.floor(*value*), *interval*.offset(*value*), and *interval*.range(*start*, *stop*) methods.

The shiftX transform also aliases the **x** channel to **x2** and applies a domain hint to the **x2** channel such that by default the plot shows only the intersection of **x1** and **x2**. For example, if the interval is *+1 year*, the first year of the data is not shown.
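
As a rough sketch of these two steps in plain JavaScript (an illustration under stated assumptions, not Plot's actual implementation): `shiftDays` and `intersect` are hypothetical helpers, the dates are made up, and a 7-day shift stands in for the interval.

```js
// Hypothetical helper (not part of Plot): shift a date by whole days in UTC.
function shiftDays(date, days) {
  const shifted = new Date(date);
  shifted.setUTCDate(shifted.getUTCDate() + days);
  return shifted;
}

// Hypothetical helper: the overlap of the extents of two arrays of dates.
function intersect(a, b) {
  const lo = Math.max(Math.min(...a), Math.min(...b));
  const hi = Math.min(Math.max(...a), Math.max(...b));
  return [new Date(lo), new Date(hi)];
}

const x = [new Date("2020-01-01"), new Date("2020-01-08"), new Date("2020-01-15")];
const x2 = x; // the original positions
const x1 = x.map((d) => shiftDays(d, 7)); // the shifted positions
const domain = intersect(x1, x2);
console.log(domain.map((d) => d.toISOString().slice(0, 10))); // ["2020-01-08", "2020-01-15"]
```

With a positive shift, the start of the overlap moves forward by the interval, which is why the first week (or, for *+1 year*, the first year) drops out of view.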
