From 959e44a15e84a3d5d859387789a548a0d6316374 Mon Sep 17 00:00:00 2001
From: Ashish Bora
Date: Mon, 6 Jun 2022 14:19:28 -0700
Subject: [PATCH 1/2] Fix typo

---
 posts/2018-10-13-flow-models/index.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/posts/2018-10-13-flow-models/index.html b/posts/2018-10-13-flow-models/index.html
index 4c7972d..a91f60d 100644
--- a/posts/2018-10-13-flow-models/index.html
+++ b/posts/2018-10-13-flow-models/index.html
@@ -474,7 +474,7 @@

RealNVP for more details on the multi-scale architecture.

NICE

From 07e68d352ee77eee31c3228cb1a655485df91124 Mon Sep 17 00:00:00 2001
From: Ashish Bora
Date: Mon, 6 Jun 2022 14:36:52 -0700
Subject: [PATCH 2/2] Add function parameter

---
 posts/2018-10-13-flow-models/index.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/posts/2018-10-13-flow-models/index.html b/posts/2018-10-13-flow-models/index.html
index a91f60d..c533e67 100644
--- a/posts/2018-10-13-flow-models/index.html
+++ b/posts/2018-10-13-flow-models/index.html
@@ -501,7 +501,7 @@

Glow

It performs an affine transformation using a scale and bias parameter per channel, similar to batch normalization, but it works for a mini-batch size of 1. The parameters are trainable but initialized so that the first mini-batch of data has mean 0 and standard deviation 1 after actnorm.
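A minimal sketch of such a layer in PyTorch, assuming the data-dependent initialization described above (the `ActNorm` class and its attribute names are illustrative, not the official Glow implementation):

```python
import torch
import torch.nn as nn

class ActNorm(nn.Module):
    """Per-channel affine transform with data-dependent initialization.

    A sketch based on the description above, not the official Glow code.
    """
    def __init__(self, num_channels):
        super().__init__()
        # One scale and one bias parameter per channel.
        self.log_scale = nn.Parameter(torch.zeros(1, num_channels, 1, 1))
        self.bias = nn.Parameter(torch.zeros(1, num_channels, 1, 1))
        self.initialized = False

    def forward(self, x):
        # x: (batch, channels, height, width)
        if not self.initialized:
            # Initialize so the first mini-batch has zero mean and
            # unit standard deviation per channel after actnorm.
            with torch.no_grad():
                mean = x.mean(dim=(0, 2, 3), keepdim=True)
                std = x.std(dim=(0, 2, 3), keepdim=True)
                self.log_scale.data = -torch.log(std)
                self.bias.data = -mean / std
            self.initialized = True
        y = x * torch.exp(self.log_scale) + self.bias
        # Log-determinant of the Jacobian: each channel's log-scale is
        # counted once per spatial location.
        h, w = x.shape[2], x.shape[3]
        log_det = h * w * self.log_scale.sum()
        return y, log_det
```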

Substep 2: Invertible 1x1 conv

Between layers of the RealNVP flow, the ordering of channels is reversed so that all the data dimensions have a chance to be altered. A 1×1 convolution with an equal number of input and output channels is a generalization of any permutation of the channel ordering.
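To see the generalization concretely: initializing the $c \times c$ weight matrix of a 1×1 convolution to a permutation matrix reproduces channel reversal exactly, while a general invertible matrix mixes channels more freely. A small sketch of the special case (illustrative variable names; `F.conv2d` is standard PyTorch):

```python
import torch
import torch.nn.functional as F

c, h, w = 4, 8, 8
x = torch.randn(1, c, h, w)

# A permutation matrix that reverses the channel order is one particular
# (and trivially invertible) choice of 1x1 convolution weights.
P = torch.eye(c).flip(0)

# conv2d expects weights of shape (out_channels, in_channels, 1, 1).
y = F.conv2d(x, P.reshape(c, c, 1, 1))

# The 1x1 conv with weight P is exactly channel reversal.
assert torch.allclose(y, x.flip(1))
```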

-Say, we have an invertible 1x1 convolution of an input $h \times w \times c$ tensor $\mathbf{h}$ with a weight matrix $\mathbf{W}$ of size $c \times c$. The output is a $h \times w \times c$ tensor, labeled as $f = \texttt{conv2d}(\mathbf{h}; \mathbf{W})$. In order to apply the change of variable rule, we need to compute the Jacobian determinant $\vert \det\partial f / \partial\mathbf{h}\vert$.

+Say, we have an invertible 1x1 convolution of an input $h \times w \times c$ tensor $\mathbf{h}$ with a weight matrix $\mathbf{W}$ of size $c \times c$. The output is a $h \times w \times c$ tensor, labeled as $f(\mathbf{h}) = \texttt{conv2d}(\mathbf{h}; \mathbf{W})$. In order to apply the change of variable rule, we need to compute the Jacobian determinant $\vert \det\partial f / \partial\mathbf{h}\vert$.
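This determinant factorizes neatly: since the 1x1 convolution applies the same $\mathbf{W}$ at every spatial position, $\log \vert \det\partial f / \partial\mathbf{h} \vert = h \cdot w \cdot \log \vert \det \mathbf{W} \vert$, as derived below. A quick numerical check of this identity, as a sketch (the brute-force Jacobian and variable names are illustrative; `torch.autograd.functional.jacobian` and `torch.linalg.slogdet` are standard PyTorch):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
c, h, w = 3, 4, 5

# A random c x c weight matrix; with probability 1 it is invertible.
W = torch.randn(c, c, dtype=torch.float64)
x = torch.randn(1, c, h, w, dtype=torch.float64)

def f(inp):
    # 1x1 convolution: the channel vector at each spatial position
    # is multiplied by W.
    return F.conv2d(inp, W.reshape(c, c, 1, 1))

# Brute-force the full (c*h*w) x (c*h*w) Jacobian of f at x.
J = torch.autograd.functional.jacobian(f, x).reshape(c * h * w, c * h * w)

# Up to a permutation, J is block-diagonal with h*w copies of W,
# so log|det J| = h * w * log|det W|.
_, logabsdet_J = torch.linalg.slogdet(J)
_, logabsdet_W = torch.linalg.slogdet(W)
assert torch.allclose(logabsdet_J, h * w * logabsdet_W)
```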

Both the input and output of the 1x1 convolution here can be viewed as a matrix of size $h \times w$. Each entry $\mathbf{x}_{ij}$ ($i=1,\dots,h$; $j=1,\dots,w$) in $\mathbf{h}$ is a vector of $c$ channels, and each entry is multiplied by the weight matrix $\mathbf{W}$ to obtain the corresponding entry $\mathbf{y}_{ij}$ in the output matrix. The derivative of each entry is $\partial \mathbf{x}_{ij} \mathbf{W} / \partial\mathbf{x}_{ij} = \mathbf{W}$, and there are $h \times w$ such entries in total:

$$