diff --git a/posts/2024/2024_07_19_Inigo_week_8.rst b/posts/2024/2024_07_19_Inigo_week_8.rst
index 0436fb0a..b512125f 100644
--- a/posts/2024/2024_07_19_Inigo_week_8.rst
+++ b/posts/2024/2024_07_19_Inigo_week_8.rst
@@ -15,21 +15,18 @@ This week I continued training the VAE model with the FiberCup dataset, this tim
    :alt: Vanilla Variational AutoEncoder reconstruction results after 120 epochs of training.
    :width: 600

-I also looked at the theoretical and technical implications of implementing the `beta-VAE architecture `_ for my experiments, which could help in disentangling the latent space representation of the streamlines according to features learnt in an
-unsupervised manner.
+I also looked at the theoretical and technical implications of implementing the `beta-VAE architecture `_ for my experiments, which could help in disentangling the latent space representation of the streamlines according to features learnt in an unsupervised manner.

 Shortly, applying a weight (bigger than 1) to the KL loss component of the VAE loss function encourages the model to learn a version of the latent space where features that can be perceived in the data space are aligned with the latent space dimensions.
 This way, one can modulate the generative process according to the learnt 'perceivable' features, once they are identified and located in the latent space.

-However, increasing the beta weight compromises the reconstruction quality, which is what basically makes streamlines look reasonable. Finding a good beta weight is as 'simple' as running a hyperparameter search while constraining the parameter to be higher than one, and to try to prioritize
-the MSE (Mean Squared Error, reconstruction loss) in the search algorithm.
+However, increasing the beta weight compromises the reconstruction quality, which is basically what makes the streamlines look reasonable. Finding a good beta weight is as 'simple' as running a hyperparameter search while constraining the parameter to be higher than one and trying to prioritize the MSE (Mean Squared Error, the reconstruction loss) in the search algorithm.

 From the technical side implementing a beta-VAE is very straightforward, by just adding the beta weight in the loss equation and storing the parameter for traceability, so this did not take a lot of time.

 What is coming up next week
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~

-Next week I wanted to tinker around a bit with this parameter to see how it affects the quality of the reconstructions and the organization of the latent space, but I don't think this is an effective strategy, nor the priority. Thus, I will
-start implementing the conditional VAE, which will allow me to generate new streamlines by conditioning the latent space with a specific continuous variable.
+Next week I wanted to tinker around a bit with this parameter to see how it affects the quality of the reconstructions and the organization of the latent space, but I don't think this is an effective strategy, nor is it the priority. Thus, I will start implementing the conditional VAE, which will allow me to generate new streamlines by conditioning the latent space with a specific continuous variable.

 This is a bit more complex than the vanilla VAE, but I think I will be able to implement it on time because the main components are already there and I just need to add the conditioning part, based on this `paper `_.

 Did I get stuck anywhere
diff --git a/posts/2024/2024_07_26_Inigo_week_9.rst b/posts/2024/2024_07_26_Inigo_week_9.rst
index b2dc435e..2ca854a9 100644
--- a/posts/2024/2024_07_26_Inigo_week_9.rst
+++ b/posts/2024/2024_07_26_Inigo_week_9.rst
@@ -18,9 +18,10 @@ As a refresher, the main idea behind conditional generative models is being able
 For example, imagine our VAE learned a latent representation of images of cats with different fur lengths. If we do not condition our latent space on the fur length, our model might not learn about this distinctive feature found in the data space, and cats with drastically different fur lengths may be closely clustered together in the latent space. But with conditioning, we can tell the model to cluster the images along a "fur-length" dimension, so if we sample 2 images from a line that varies along that dimension but in opposite sides, we get a cat with very short fur, and another one, with very long fur. This results in a generative process that can be tuned on demand!

 However, there are many methods to condition a Variational AutoEncoder, and they usually depend on the type of variable we want to condition on, so the methods for categoric variables (cat vs. dog, bundle_1_fiber vs. bundle_2_fiber, etc.) and continuous ones (age of the person, length of a streamline) are normally not applicable to both types.
+
 In the case of the FiberCup dataset, I chose to condition the latent space on the length of the streamlines, which is a continuous variable and it is a fairly easy thing to learn from the morphology of the streamlines.

-After implementing the conditional VAE as in the provided reference and training it for 64 epochs (early stopped due to lack of improvement in the MSE) I got a very crappy looking reconstruction, but the latent space seems to be organized differently compared to the vanilla VAE, which suggests that the conditioning is doing something (good or not, we will see...).
+After implementing the conditional VAE as in the provided reference and training it for 64 epochs (early stopped due to lack of improvement in the MSE), I did not get a decent reconstruction, but the latent space seems to be organized differently compared to the vanilla VAE, which suggests that the conditioning is doing something (good or not, we will see...).

 .. image:: /_static/images/gsoc/2024/inigo/cvae_first_reconstruction_result.png
    :alt: First reconstruction of the training data of the conditional VAE (cVAE).

@@ -50,7 +51,7 @@ After discussing with my mentors, we decided to take two steps:

 - Visual checking fiber generation for specific bundles. Knowing that different bundles have different fiber lengths, we try to generate fibers of specific length, and see whether the generated fibers belong to the desired bundle (no matter if they are plausible or implausible). Having length as the conditioning variable allows us to perform this trick, what would not be so intuitive to check if we had used Fractional Anisotropy or other DTI-derived metrics, as these are not visually as intuitive as length.
-2. To try out an adversarial framework, which is 1) easier to implement 2) easier to understand, and 3) likely to also work (we'll see if better or not). The idea is to have a discriminator that tries to predict the conditioning variable from the latent space, and the encoder tries to fool the discriminator. This way, the encoder learns to encode the conditioning variable in the latent space, and the discriminator learns to predict it. This is a very common approach in GANs, and it is called "Conditional GAN" (cGAN). As a result, we would have what I would call a conditional adversarial VAE (CA-VAE). You can read more about adversarial VAEs `in this work `_ or `in this one `_
+2. To try out an adversarial framework, which is 1) easier to implement, 2) easier to understand, and 3) likely to also work (we'll see whether it works better or not). The idea is to have a discriminator that tries to predict the conditioning variable from the latent space, while the encoder tries to fool the discriminator. This way, the encoder learns to encode the conditioning variable in the latent space, and the discriminator learns to predict it. This is a very common approach in GANs, and it is called a "Conditional GAN" (cGAN). As a result, we would have what I would call a conditional adversarial VAE (CA-VAE). You can read more about adversarial VAEs `in this work `_ or `in this one `_.

 Did I get stuck anywhere
 ~~~~~~~~~~~~~~~~~~~~~~~~
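Since the week 9 diff above talks about conditioning the latent space on a continuous variable (streamline length), here is a rough sketch of one common way to do it: concatenating the conditioning value to the latent code before decoding. This is an illustration under assumptions, not the architecture of the referenced paper; ``ConditionalDecoder``, the layer sizes, and the streamline representation (a fixed number of 3D points) are made up for the example.

.. code-block:: python

   import torch
   import torch.nn as nn

   class ConditionalDecoder(nn.Module):
       """Hypothetical decoder conditioned on a continuous variable.

       The conditioning value (e.g. normalized streamline length) is
       concatenated to the latent vector, so the decoder can be steered
       at generation time.
       """

       def __init__(self, latent_dim=32, n_points=256, hidden=128):
           super().__init__()
           self.net = nn.Sequential(
               nn.Linear(latent_dim + 1, hidden),  # +1 for the conditioning scalar
               nn.ReLU(),
               nn.Linear(hidden, n_points * 3),    # x, y, z per streamline point
           )
           self.n_points = n_points

       def forward(self, z, condition):
           # condition: shape (batch, 1), e.g. normalized streamline length.
           z_cond = torch.cat([z, condition], dim=1)
           out = self.net(z_cond)
           return out.view(-1, self.n_points, 3)

   # Generating streamlines of a chosen length from random latent samples:
   # decoder = ConditionalDecoder()
   # z = torch.randn(8, 32)
   # lengths = torch.full((8, 1), 0.7)  # target (normalized) length
   # fake_streamlines = decoder(z, lengths)

Sampling ``z`` from a standard normal while fixing ``condition`` to a chosen length is how one would try the "generate fibers of a specific length and check which bundle they land in" test described in the first step above.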
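For the adversarial alternative in point 2, one possible reading is sketched below: a small discriminator regresses the conditioning variable from the latent code, and the encoder is trained against that objective. This is only a guess at how the cited adversarial VAE works could translate to this project; ``ca_vae_step``, the module interfaces, the loss weights, and the sign of the adversarial term are assumptions for illustration.

.. code-block:: python

   import torch
   import torch.nn.functional as F

   # Assumed interfaces (illustrative only): `encoder(x)` returns (z, mu, logvar),
   # `decoder(z)` reconstructs streamlines, and `discriminator(z)` regresses the
   # conditioning variable (streamline length) from the latent code.

   def ca_vae_step(encoder, decoder, discriminator, x, length,
                   enc_opt, dec_opt, disc_opt, beta=1.0, adv_weight=0.1):
       """One hypothetical training step of a conditional adversarial VAE (CA-VAE)."""
       # --- Discriminator step: learn to predict the length from the latent code.
       with torch.no_grad():
           z, _, _ = encoder(x)
       disc_opt.zero_grad()
       disc_loss = F.mse_loss(discriminator(z), length)
       disc_loss.backward()
       disc_opt.step()

       # --- VAE step: reconstruction + KL, plus an adversarial term that pushes
       # the encoder to fool the (now fixed) discriminator.
       enc_opt.zero_grad()
       dec_opt.zero_grad()
       z, mu, logvar = encoder(x)
       recon = decoder(z)
       mse = F.mse_loss(recon, x)
       kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
       adv = -F.mse_loss(discriminator(z), length)  # maximize the discriminator's error
       loss = mse + beta * kl + adv_weight * adv
       loss.backward()
       enc_opt.step()
       dec_opt.step()
       return loss.item(), disc_loss.item()

Whether the adversarial term should hide the conditioning variable from the latent space or reinforce it is exactly the kind of design decision the post leaves open ("good or not, we will see"), so the sign and weight above would need to be revisited against the cited references.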