Training the model from scratch #9

MohammedZidane opened this issue Mar 18, 2024 · 8 comments

@MohammedZidane

MohammedZidane commented Mar 18, 2024

Hi,
Does the code have the option to train the model from scratch? I am learning about foundation models and would like to monitor the training process itself. I could not figure out if the code allows training it from scratch.

Could you let me know if this option exist?

Thanks!

@wehos
Contributor

wehos commented Mar 21, 2024

Hello, thanks for your interest.

We have not released the pretraining code yet; however, the loss functions are preserved in the codebase. In the forward pass here, the loss is returned. You may deploy an optimizer on these losses.
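
For illustration, a minimal pretraining-style loop on top of a returned loss could look like the toy sketch below (the model, batch format, and hyperparameters are placeholders made up for this example, not the released API):

```python
import torch
import torch.nn as nn

# Toy stand-in: the real backbone is assumed to return a dict that includes the pretraining loss.
class ToyModel(nn.Module):
    def __init__(self, n_genes=2000, hidden=128):
        super().__init__()
        self.encoder = nn.Linear(n_genes, hidden)
        self.decoder = nn.Linear(hidden, n_genes)

    def forward(self, x_dict):
        x = x_dict['x_seq']
        recon = self.decoder(torch.relu(self.encoder(x)))
        # A simple reconstruction loss stands in for the actual pretraining objectives.
        return {'loss': nn.functional.mse_loss(recon, x), 'recon': recon}

model = ToyModel()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
batches = [{'x_seq': torch.rand(32, 2000)} for _ in range(10)]  # dummy data loader

model.train()
for batch in batches:
    out = model(batch)        # forward pass returns the loss among its outputs
    loss = out['loss']
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```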

Feel free to discuss here if you encounter any specific issues.

Best,
Hongzhi

@MohammedZidane
Author

Thank you so much, Hongzhi. I really appreciate how responsive you are.

As I mentioned, I am learning more about foundation models. I noticed that you mask some of the x_seq even during a downstream task like cell type annotation. If that is true, I cannot understand why. Shouldn't the masking be used only in the SSL implementation for the pretraining process?

Thanks

@wehos
Contributor

wehos commented Apr 22, 2024

Thanks for your question. When the downstream objective is cell type annotation, the masking acts in a similar way to input dropout. In many deep learning models, input dropout serves as a lightweight data augmentation, and its ratio may differ from that of hidden dropout. This technique generally works well (here is an example).
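
As a rough illustration (a generic sketch, not this repo's exact implementation; the mask ratio is an assumed hyperparameter), input masking as augmentation can be as simple as:

```python
import torch

def mask_input(x, mask_ratio=0.2, training=True):
    """Randomly zero a fraction of input features, acting like input dropout.

    Applied only during training; at inference the input passes through unchanged.
    """
    if not training or mask_ratio <= 0:
        return x
    keep = (torch.rand_like(x) >= mask_ratio).float()
    return x * keep

# Example: mask roughly 20% of gene-expression entries in a batch of 8 cells x 500 genes.
x_seq = torch.rand(8, 500)
x_masked = mask_input(x_seq, mask_ratio=0.2)
```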

That said, feel free to remove it if it hurts the performance!

@MohammedZidane
Author

Got it! Thank you so much :)

@MohammedZidane
Author

Hi Hongzhi,

You suggested before that I could deploy an optimizer in the cellformer.py file to do the pretraining of the model, which makes sense. But isn't the imputation.py file more like an SSL implementation? In other words, is it possible to use that file for the pretraining?

Thanks

@MohammedZidane
Author

Hi Hongzhi,
I have one more question. In the imputation downstream task, in the zinb.py file in the objective folder:

The input data in the notebook has 407 genes, and you use x_dict['input_gene_mask'] to select only those 407 genes out of the 19374 pretraining genes; from these you then get 307 genes whose values will be predicted, correct?

If my understanding is correct, I cannot work out the meaning of the 'mean' values obtained from the other zinb.py in the decoder folder. The 'mean' has 19374 values, which are first filtered to 407 and then reduced to 307. I cannot figure out what these values represent; I can only tell that the 307 values could be the predicted values for the imputation task.
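
To make the flow I am describing concrete, here is a rough sketch of the indexing as I understand it (shapes and mask construction are my own assumptions for illustration, not the actual code):

```python
import torch

# Shapes follow my reading of the notebook; the masks here are made up for illustration.
mean_full = torch.rand(19374)                        # decoder 'mean' over all pretraining genes

input_gene_mask = torch.zeros(19374, dtype=torch.bool)
input_gene_mask[:407] = True                         # the 407 genes present in the input data
mean_input = mean_full[input_gene_mask]              # 407 values aligned with the input genes

target_gene_mask = torch.zeros(407, dtype=torch.bool)
target_gene_mask[:307] = True                        # the 307 genes whose values are imputed
mean_target = mean_input[target_gene_mask]           # 307 predicted values for imputation

print(mean_full.shape, mean_input.shape, mean_target.shape)
```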

Thanks

@wehos
Contributor

wehos commented May 7, 2024

Hi Mohammed.

I would love to help, but I'm traveling for a few conferences these days. I'll get back to you as soon as I am available.

Best,
Hongzhi

@MohammedZidane
Author

Thank you so much for your reply. Good luck :)
