
How to train the model? #102

Open
lishu2006ll opened this issue Jun 14, 2023 · 6 comments

Comments

@lishu2006ll

The released code appears to cover inference only. Where are the training code and the raw dataset?

@pjrenzov

pjrenzov commented Aug 1, 2023

I would like to know the same.

@jianjiabailu

I would like to know the same as well.

@Pearces1997

Bumping this; I would also like to see the training code and dataset.

@HaiderSaleem

For anyone looking to fine-tune Shap-E or Point-E: another project, Cap3D, provides fine-tuning code.

cap3D/text-to-3D/finetune_shapE.py at main · crockwell/Cap3D

@chenyg59

chenyg59 commented Jun 27, 2024

> For anyone looking to fine-tune Shap-E or Point-E: another project, Cap3D, provides fine-tuning code.
>
> cap3D/text-to-3D/finetune_shapE.py at main · crockwell/Cap3D

Does finetune_shapE.py train the whole Shap-E model? It seems to me that the code only trains the transformer and diffusion parts, omitting the first two layers of cross attention and the patch embedding. Am I understanding this correctly? I also see that the data loaded during training is a latent_code. How is this latent_code obtained?

@Aut0matas

Aut0matas commented Aug 1, 2024

> It seems to me that the code only trains the transformer and diffusion parts, omitting the first two layers of cross attention and the patch embedding.

That is the point of "fine-tuning": only the weights of the qkv projections are updated, while the rest of the network stays frozen.

> I see that the data loaded during training is a latent_code. How is this latent_code obtained?

Through the "3D Encoder", as described in Fig. 2 of the paper: each 3D asset is passed through the pretrained encoder once, and the resulting latent code is saved and reused as the training target for the diffusion model.
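A minimal sketch of the fine-tuning scheme described above, with a tiny stand-in attention block rather than the actual Shap-E model (the module names here, such as qkv_proj, are illustrative assumptions, not Shap-E's real parameter names): freeze every parameter, re-enable gradients only on the qkv projections, and build the optimizer from the trainable subset.

```python
import torch
import torch.nn as nn

class TinyBlock(nn.Module):
    """Stand-in for one transformer block; not the real Shap-E module."""
    def __init__(self, dim=16):
        super().__init__()
        self.qkv_proj = nn.Linear(dim, 3 * dim)  # the only part we fine-tune
        self.out_proj = nn.Linear(dim, dim)      # stays frozen
        self.mlp = nn.Linear(dim, dim)           # stays frozen

    def forward(self, x):
        q, k, v = self.qkv_proj(x).chunk(3, dim=-1)
        attn = torch.softmax(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)
        return self.mlp(self.out_proj(attn @ v))

model = TinyBlock()

# Freeze everything, then re-enable only the qkv projection weights.
for name, p in model.named_parameters():
    p.requires_grad = "qkv_proj" in name

# Optimizer sees only the trainable (qkv) parameters.
opt = torch.optim.Adam((p for p in model.parameters() if p.requires_grad), lr=1e-3)

# One dummy training step: gradients flow only into qkv_proj.
x = torch.randn(4, 8, 16)
loss = model(x).pow(2).mean()
opt.zero_grad()
loss.backward()
opt.step()

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(trainable)
```

Running this prints only the qkv projection's weight and bias as trainable; everything else receives no updates, which mirrors the "only the qkv projections are updated" description above.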

7 participants