
Issue/870 affine xform clarification #872


Open: wants to merge 9 commits into base: master
1 change: 1 addition & 0 deletions src/reference-manual/transforms.qmd
@@ -331,6 +331,7 @@ $$
The default value for the offset $\mu$ is $0$ and for the multiplier $\sigma$ is
$1$ when either is not specified.

For a container variable, the affine transform is applied to each element of that variable.
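
Written out elementwise, as an illustrative restatement for a vector variable $x$ of size $N$ with scalar offset and multiplier (the symbols $x^{\text{raw}}$ and $N$ are notation introduced here, not taken from the manual), each element is obtained from its unconstrained counterpart as

$$
x_n = \mu + \sigma \cdot x^{\text{raw}}_n, \qquad n = 1, \dots, N.
$$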

### Affine inverse transform {-}

50 changes: 45 additions & 5 deletions src/reference-manual/types.qmd
@@ -438,8 +438,8 @@ $1$ and multiplier $2$.
real<offset=1, multiplier=2> x;
```

As an example, we can give `x` a normal distribution with non-centered
parameterization as follows.
As an example, we can give `x` a normal distribution with non-centered parameterization.
In this program, the affine transform is applied to every element of vector `x`.

```stan
parameters {
@@ -450,7 +450,7 @@ model {
}
```

Recall that the centered parameterization is achieved with the code
Recall the Stan code for the centered parameterization of this model.

```stan
parameters {
@@ -461,17 +461,57 @@ model {
}
```

or equivalently
Adding the offset and multiplier transform yields the equivalent non-centered parameterization.

```stan
parameters {
real<offset=0, multiplier=1> x;
real<offset=mu, multiplier=sigma> x;
}
model {
x ~ normal(mu, sigma);
}
```

Sampling is done on the unconstrained parameters.
Under the affine transform, the unconstrained parameter is `(x - mu) / sigma`,
which has a standard normal distribution under this model,
so the above model is equivalent to the hand-coded non-centered parameterization.

```stan
parameters {
real x_raw;
}
transformed parameters {
real x = mu + x_raw * sigma;
}
model {
x_raw ~ std_normal();
}
```
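
To see why the two programs agree, here is a short derivation, assuming `mu` and `sigma` are fixed with $\sigma > 0$ (the symbol $x^{\text{raw}}$ for the unconstrained value is notation introduced here). The transform $x = \mu + \sigma \, x^{\text{raw}}$ has Jacobian $\sigma$, so the density on the unconstrained scale is

$$
p(x^{\text{raw}})
= \textsf{normal}(\mu + \sigma x^{\text{raw}} \mid \mu, \sigma) \cdot \sigma
= \frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2} \left(x^{\text{raw}}\right)^2\right)
= \textsf{normal}(x^{\text{raw}} \mid 0, 1).
$$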

Use of the affine transform removes the overhead of declaring additional transformed parameters
and directly expresses the hierarchical relationship between parameters.

For a container variable, the affine transform is applied to each element of that variable.
As an example, the non-centered parameterization of Neal's Funnel in the
[Stan User's Guide reparameterization section](https://mc-stan.org/docs/stan-users-guide/reparameterization.html),
$$
p(y,x) = \textsf{normal}(y \mid 0,3) \times \prod_{n=1}^9
\textsf{normal}(x_n \mid 0,\exp(y/2))
$$
can be written as:

```stan
parameters {
real<multiplier=3> y;
vector<multiplier=exp(0.5 * y)>[9] x;
}
model {
y ~ normal(0, 3);
  x ~ normal(0, exp(0.5 * y));
}
```
where the affine transform is applied to every element of vector `x`.

### Expressions as bounds and offset/multiplier {-}

Bounds (and offset and multiplier)
52 changes: 30 additions & 22 deletions src/stan-users-guide/efficiency-tuning.qmd
@@ -266,8 +266,6 @@ funnel's neck is particularly sharp because of the exponential
function applied to $y$. A plot of the log marginal density of $y$
and the first dimension $x_1$ is shown in the following plot.



The funnel can be implemented directly in Stan as follows.

```stan
@@ -295,13 +293,13 @@ inefficient in the body. This can be seen in the following plots.
![](img/funnel-fit.png)


Neal's funnel. (Left) The marginal density of Neal's funnel for the upper-level variable $y$ and one lower-level variable $x_1$ (see the text for the formula). The blue region has log density greater than -8, the yellow region density greater than -16, and the gray background a density less than -16.
(Right) 4000 draws are taken from a run of Stan's sampler with default settings.
Both plots are restricted to the shown window of $x_1$ and $y$ values;
some draws fell outside of the displayed area as would be expected given
the density. The samples are consistent with the marginal density
$p(y) = \textsf{normal}(y \mid 0,3)$, which has mean 0 and standard
deviation 3.
Neal's funnel. (Left) The marginal density of Neal's funnel for the upper-level variable $y$ and
one lower-level variable $x_1$ (see the text for the formula). The blue region has log density
greater than -8, the yellow region density greater than -16, and the gray background a density less
than -16. (Right) 4000 draws are taken from a run of Stan's sampler with default settings.
Both plots are restricted to the shown window of $x_1$ and $y$ values; some draws fell outside of
the displayed area as would be expected given the density. The samples are consistent with the
marginal density $p(y) = \textsf{normal}(y \mid 0,3)$, which has mean 0 and standard deviation 3.
:::

In this particular instance, because the analytic form of the density
@@ -314,11 +312,8 @@ parameters {
vector[9] x_raw;
}
transformed parameters {
real y;
vector[9] x;

y = 3.0 * y_raw;
x = exp(y/2) * x_raw;
real y = 3.0 * y_raw;
vector[9] x = exp(0.5 * y) * x_raw;
}
model {
y_raw ~ std_normal(); // implies y ~ normal(0, 3)
@@ -327,14 +322,27 @@ model {
```

In this second model, the parameters `x_raw` and `y_raw` are
sampled as independent standard normals, which is easy for Stan. These
are then transformed into samples from the funnel. In this case, the
same transform may be used to define Monte Carlo samples directly
based on independent standard normal samples; Markov chain Monte Carlo
methods are not necessary. If such a reparameterization were used in
Stan code, it is useful to provide a comment indicating what the
distribution for the parameter implies for the distribution of the
transformed parameter.
sampled as independent standard normals, which is easy for Stan,
and then transformed into samples from the funnel.
When this transform is used in Stan code, a comment indicating what the
distribution for the parameter implies for the distribution of the transformed parameter
improves readability and maintainability.
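
Because the transform maps independent standard normals onto the funnel, draws can also be generated directly, without Markov chain Monte Carlo. The following is a minimal sketch, not part of the proposed change: it assumes the program is run with Stan's fixed-parameter sampler, since it declares no parameters, and it reuses the names `y` and `x` from the model above.

```stan
generated quantities {
  // Upper-level variable: 3 times a standard normal draw, so y ~ normal(0, 3).
  real y = 3 * std_normal_rng();
  // Lower-level variables: each element scales its own standard normal
  // draw by exp(y / 2), so x[n] ~ normal(0, exp(y / 2)) given y.
  vector[9] x;
  for (n in 1:9) {
    x[n] = exp(0.5 * y) * std_normal_rng();
  }
}
```

Each iteration then yields one independent draw of `y` and `x` from the funnel.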

As of Stan release v2.19.0, this program can be written using Stan's
[affinely transformed real type](https://mc-stan.org/docs/reference-manual/types.html#affine-transform.section).
The affine transform on the vector `x` is applied to each element of `x`.

```stan
parameters {
real<multiplier=3> y;
vector<multiplier=exp(0.5 * y)>[9] x;
}
model {
y ~ normal(0, 3);
  x ~ normal(0, exp(0.5 * y));
}
```


### Reparameterizing the Cauchy {-}
