Merge pull request #259 from lulmer/patch-1
Fixing Typos and Latex expressions on hyena.mdx
merveenoyan authored Apr 25, 2024
2 parents b47ba80 + aa381c0 commit 4082561
6 changes: 3 additions & 3 deletions chapters/en/unit13/hyena.mdx
@@ -63,13 +63,13 @@ However, unlike the attention mechanism, which typically uses a single dense lay

The core idea is to repeatedly apply linear operators that are fast to evaluate to an input sequence \\(u \in \mathbb{R}^{L}\\) with \\(L\\) the length of the sequence.
Because global convolutions have a large number of parameters, they are expensive to train. A notable design choice is the use of **implicit convolutions**.
-Unlike standard convolutional layers, the convolution filter \\(h\\) is learned implicitly with a small neural network \\(gamma_{\theta}\\) (also called the Hyena Filter).
-This network takes the positional index and potentially positional encodings as inputs. From the outputs of \\(gamma_theta\\) one can construct a Toeplitz matrix \\(T_h\\).
+Unlike standard convolutional layers, the convolution filter \\(h\\) is learned implicitly with a small neural network \\(\gamma_{\theta}\\) (also called the Hyena Filter).
+This network takes the positional index and potentially positional encodings as inputs. From the outputs of \\(\gamma_{\theta}\\) one can construct a Toeplitz matrix \\(T_h\\).

This implies that instead of learning the values of the convolution filter directly, we learn a mapping from a temporal positional encoding to the values, which is more computationally efficient, especially for long sequences.

<Tip>
-It's important to note that the mapping function can be conceptualized within various abstract models, such Neural Field or State Space Models (S4) as discussed in <a href="https://arxiv.org/abs/2212.14052">H3 Paper</a>.
+It's important to note that the mapping function can be conceptualized within various abstract models, such as Neural Field or State Space Models (S4) as discussed in <a href="https://arxiv.org/abs/2212.14052">H3 Paper</a>.
</Tip>

### Implicit convolutions
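The implicit parametrization described in the changed lines can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: `gamma` below is a fixed sinusoidal stand-in for the small learned network \\(\gamma_{\theta}\\) (the Hyena Filter), and the names `gamma`, `T`, and `L` are illustrative, not taken from the Hyena codebase.

```python
import math

L = 8  # sequence length

def gamma(t):
    """Map an integer positional index t to one filter value.

    In Hyena this mapping is a small learned network; here it is a fixed
    sinusoidal feature map with hand-picked readout weights, purely for
    illustration.
    """
    x = t / L  # normalize the position to [0, 1)
    hidden = [math.sin(2 * math.pi * k * x) for k in (1, 2, 3)]
    weights = [0.5, -0.25, 0.1]
    return sum(w * feat for w, feat in zip(weights, hidden))

# The filter h is *evaluated* at each position rather than stored as
# free parameters, so its cost does not grow with L.
h = [gamma(t) for t in range(L)]

# Lower-triangular Toeplitz matrix T_h realizing the causal convolution y = T_h u.
T = [[h[i - j] if i >= j else 0.0 for j in range(L)] for i in range(L)]

u = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, 3.0, 1.0]  # toy input sequence
y = [sum(T[i][j] * u[j] for j in range(L)) for i in range(L)]
```

In practice the Toeplitz matrix is never materialized; the product \\(T_h u\\) is computed with FFT-based convolution in \\(O(L \log L)\\).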
