Merge pull request #259 from lulmer/patch-1
Fixing Typos and Latex expressions on hyena.mdx
merveenoyan authored Apr 25, 2024
2 parents b47ba80 + aa381c0 commit 4082561
6 changes: 3 additions & 3 deletions chapters/en/unit13/hyena.mdx
@@ -63,13 +63,13 @@ However, unlike the attention mechanism, which typically uses a single dense lay

The core idea is to repeatedly apply linear operators that are fast to evaluate to an input sequence \\(u \in \mathbb{R}^{L}\\) with \\(L\\) the length of the sequence.
Because global convolutions have a large number of parameters, they are expensive to train. A notable design choice is the use of **implicit convolutions**.
-Unlike standard convolutional layers, the convolution filter \\(h\\) is learned implicitly with a small neural network \\(gamma_{\theta}\\) (also called the Hyena Filter).
-This network takes the positional index and potentially positional encodings as inputs. From the outputs of \\(gamma_theta\\) one can construct a Toeplitz matrix \\(T_h\\).
+Unlike standard convolutional layers, the convolution filter \\(h\\) is learned implicitly with a small neural network \\(\gamma_{\theta}\\) (also called the Hyena Filter).
+This network takes the positional index and potentially positional encodings as inputs. From the outputs of \\(\gamma_{\theta}\\) one can construct a Toeplitz matrix \\(T_h\\).

This implies that instead of learning the values of the convolution filter directly, we learn a mapping from a temporal positional encoding to the values, which is more computationally efficient, especially for long sequences.

<Tip>
-It's important to note that the mapping function can be conceptualized within various abstract models, such Neural Field or State Space Models (S4) as discussed in <a href="https://arxiv.org/abs/2212.14052">H3 Paper</a>.
+It's important to note that the mapping function can be conceptualized within various abstract models, such as Neural Field or State Space Models (S4) as discussed in <a href="https://arxiv.org/abs/2212.14052">H3 Paper</a>.
</Tip>

### Implicit convolutions
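The implicit parametrization described in the changed lines can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: `gamma` below is a fixed sinusoidal stand-in for the small learned network \\(\gamma_{\theta}\\) (the Hyena Filter), and the names `gamma`, `T`, and `L` are illustrative, not taken from the Hyena codebase.

```python
import math

L = 8  # sequence length

def gamma(t):
    """Map an integer positional index t to one filter value.

    In Hyena this mapping is a small learned network; here it is a fixed
    sinusoidal feature map with hand-picked readout weights, purely for
    illustration.
    """
    x = t / L  # normalize the position to [0, 1)
    hidden = [math.sin(2 * math.pi * k * x) for k in (1, 2, 3)]
    weights = [0.5, -0.25, 0.1]
    return sum(w * feat for w, feat in zip(weights, hidden))

# The filter h is *evaluated* at each position rather than stored as
# free parameters, so its cost does not grow with L.
h = [gamma(t) for t in range(L)]

# Lower-triangular Toeplitz matrix T_h realizing the causal convolution y = T_h u.
T = [[h[i - j] if i >= j else 0.0 for j in range(L)] for i in range(L)]

u = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, 3.0, 1.0]  # toy input sequence
y = [sum(T[i][j] * u[j] for j in range(L)) for i in range(L)]
```

In practice the Toeplitz matrix is never materialized; the product \\(T_h u\\) is computed with FFT-based convolution in \\(O(L \log L)\\).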
