diff --git a/chapters/en/unit5/generative-models/variational_autoencoders.mdx b/chapters/en/unit5/generative-models/variational_autoencoders.mdx
index 6c045f94b..a73074032 100644
--- a/chapters/en/unit5/generative-models/variational_autoencoders.mdx
+++ b/chapters/en/unit5/generative-models/variational_autoencoders.mdx
@@ -6,11 +6,12 @@ Autoencoders are a class of neural networks primarily used for unsupervised lear
 * **Decoder:** The decoder, on the other hand, takes the compressed representation produced by the encoder and attempts to reconstruct the original input data. Like the encoder, it often consists of one or more layers, but in the reverse order, gradually increasing the dimensions.
 
 ![Vanilla Autoencoder Image - Lilian Weng Blog](https://huggingface.co/datasets/hf-vision/course-assets/resolve/main/generative_models/autoencoder.png)
-\\(x)\\
-This encoder model consists of an encoder network (represented as \\(\g_\phi)\\) and a decoder network (represented as \\(f_\theta)\\ ).
-The low-dimensional representation is learned in the bottleneck layer as z and the reconstructed output is represented as \\( x'=f_\theta(g_\phi(x)))\\ with the goal as \\(x\approx x' \\).
-A common loss function used in such vanilla autoencoders is \\(L(\theta, \phi) = \frac{1}{n}\sum_{i=1}^n (\mathbf{x}^{(i)} - f_\theta(g_\phi(\mathbf{x}^{(i)})))^2 \\) with tries to minimize the error between the original image and the reconstructed one and is also known as the `reconstruction loss`
-Autoencoders are useful for tasks such as data denoising, feature learning, and compression. However, traditional autoencoders lack the probabilistic nature that makes VAEs particularly intriguing and also useful for generational tasks
+
+This encoder model consists of an encoder network (represented as \\(g_\phi\\)) and a decoder network (represented as \\(f_\theta\\)). The low-dimensional representation is learned in the bottleneck layer as \\(z\\) and the reconstructed output is represented as \\(x' = f_\theta(g_\phi(x))\\) with the goal of \\(x \approx x'\\).
+
+A common loss function used in such vanilla autoencoders is \\(L(\theta, \phi) = \frac{1}{n}\sum_{i=1}^n (\mathbf{x}^{(i)} - f_\theta(g_\phi(\mathbf{x}^{(i)})))^2\\), which tries to minimize the error between the original image and the reconstructed one and is also known as the `reconstruction loss`.
+
+Autoencoders are useful for tasks such as data denoising, feature learning, and compression. However, traditional autoencoders lack the probabilistic nature that makes VAEs particularly intriguing and also useful for generative tasks.
 
 ## Variational Autoencoders (VAEs) Overview
 Variational Autoencoders (VAEs) address some of the limitations of traditional autoencoders by introducing a `probabilistic approach` to encoding and decoding. The motivation behind VAEs lies in their ability to generate new data samples by sampling from a learned distribution in the latent space rather than from a latent vector as was the case with Vanilla Autoencoders which makes them suitable for generation tasks.
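
To make the reconstruction objective in the edited passage concrete, here is a minimal PyTorch sketch of the vanilla autoencoder it describes: an encoder \\(g_\phi\\), a decoder \\(f_\theta\\), and the mean squared reconstruction loss. The `VanillaAutoencoder` name, the 784/128/32 layer sizes, and the dummy batch are illustrative assumptions, not part of the chapter.

```python
# A minimal sketch, assuming PyTorch; layer sizes are illustrative.
import torch
import torch.nn as nn

class VanillaAutoencoder(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        # Encoder g_phi: compresses x down to the bottleneck representation z
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 128),
            nn.ReLU(),
            nn.Linear(128, latent_dim),
        )
        # Decoder f_theta: mirrors the encoder, expanding z back to x'
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128),
            nn.ReLU(),
            nn.Linear(128, input_dim),
        )

    def forward(self, x):
        z = self.encoder(x)       # z = g_phi(x)
        x_hat = self.decoder(z)   # x' = f_theta(g_phi(x))
        return x_hat

model = VanillaAutoencoder()
x = torch.randn(16, 784)          # a dummy batch of flattened images
x_hat = model(x)
# Reconstruction loss: mean squared error between x and x'
loss = nn.functional.mse_loss(x_hat, x)
```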
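Similarly, the `probabilistic approach` mentioned in the VAE overview can be sketched in a few lines: the encoder outputs the mean and log-variance of a Gaussian over the latent space, and \\(z\\) is drawn from that distribution via the standard reparameterization trick. The `VAEEncoder` name and layer sizes are assumptions for illustration; this is not the chapter's implementation.

```python
# A minimal sketch of a probabilistic (VAE) encoder, assuming PyTorch.
import torch
import torch.nn as nn

class VAEEncoder(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        self.hidden = nn.Sequential(nn.Linear(input_dim, 128), nn.ReLU())
        self.fc_mu = nn.Linear(128, latent_dim)      # mean of q(z|x)
        self.fc_logvar = nn.Linear(128, latent_dim)  # log-variance of q(z|x)

    def forward(self, x):
        h = self.hidden(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        # Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I),
        # which keeps the sampling step differentiable w.r.t. mu and logvar
        eps = torch.randn_like(mu)
        z = mu + torch.exp(0.5 * logvar) * eps
        return z, mu, logvar

z, mu, logvar = VAEEncoder()(torch.randn(16, 784))
```

Because the encoder learns a distribution rather than a single latent vector, new samples can be generated at inference time by drawing \\(z\\) from the prior and passing it through the decoder, which is what makes VAEs suitable for generation tasks.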