Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

push_to_hub_keras should support multiple models #533

Open
merveenoyan opened this issue Dec 14, 2021 · 9 comments
Open

push_to_hub_keras should support multiple models #533

merveenoyan opened this issue Dec 14, 2021 · 9 comments
Assignees

Comments

@merveenoyan
Copy link
Contributor

merveenoyan commented Dec 14, 2021

It's a minor UX improvement for push_to_hub_keras.

Some applications like RL or non-transformers encoder-decoder networks (CNN encoder RNN decoder) have multiple models instead of one model.

We could pass a list to model, get the model names and save those models individually under different folders and push. (check if list or a model with isinstance first)

An alternative is git pull everytime we push with push_to_hub_keras (which is cumbersome in a notebook) and call push_to_hub_keras repeatedly for every model.

cc: @osanseviero

@merveenoyan merveenoyan self-assigned this Dec 14, 2021
@merveenoyan
Copy link
Contributor Author

merveenoyan commented Mar 1, 2022

I'm currently working on this for GAN sprint in which you'd have to push two models to one repository. (leaving notes to myself)
First argument of push_to_hub_keras() should be a model or a list of model objects, if list, it will check all the models to see if they're built, and also create a custom model card for both models (maybe append? which is not that aesthetic).
There would be two folders with separate models (that can be named with the order of the list, e.g. model_1 which I find sloppy), and from_pretrained_keras() should support loading them separately. (or like model_1, model_2 = from_pretrained_keras(...))
Looks like

> model_1 
>> model_1 files (including plot)
> model_2
>> model_2 files (including plot)
> README.md

The widget will never work anyway if the pipeline is not overridden so it's fine imo.

@merveenoyan
Copy link
Contributor Author

merveenoyan commented Mar 3, 2022

I think following is better so people can name the folders (when we do list it's more like model_0, model_1 etc)

push_to_hub_keras(
    model = {"generator":generator, "discriminator":discriminator},
    repo_path_or_name="merve/multiple-models-...",
...)

Which looks like this:
Ekran Resmi 2022-03-03 17 23 00
However the readme looks ugly, I append them. You can see the readme's repo for example pushed multiple model.

If no one has strong design opinions (I'd rather have you have them 😅) I'll fix the plot path in the model card and write tests.
cc: @nateraw

@osanseviero
Copy link
Contributor

I feel only the following sections are needed:

  1. Model desc (for full thing)
  2. Intended uses and limitations
  3. Training and eval data
  4. Train procedure for model A
  5. Model Plot for model A
  6. Training procedure for model B
  7. Model Plot for model B

Maybe before 4, we can have a section saying "this repository consists of X models". Then before 4 and before 6, have a heading with "Further information for model A|B". WDYT? We could do this by breaking the readme function a bit I feel.

@merveenoyan
Copy link
Contributor Author

Makes sense, I will change the card generation procedure. Thanks a lot for the input.

@nateraw
Copy link
Contributor

nateraw commented Mar 3, 2022

Sure, I like this idea 😄 . Would be very helpful especially in the case of GANs. I'd really be curious to know if other folks want this feature.

As for model card, I think we can either do what @osanseviero is describing, or just leave it as-is. I'm guessing this would only be used when models are trained together. People probably shouldn't be dumping unrelated models in the same repo and we don't want to encourage that.

@osanseviero
Copy link
Contributor

Yes, good point @nateraw. TBH since this is not a feature for which I expect significantly high usage, I think we can go as-is and iterate if we see significant usage

@Wauplin
Copy link
Contributor

Wauplin commented Aug 17, 2022

@merveenoyan (cc @osanseviero @nateraw) I'm jumping into the discussion, hope it's not outdated 😬

Since push_to_hub_keras is now http-based (#847 is merged) and do not require to download the full repository anymore, calling multiple times push_to_hub_keras (once per model) is not so cumbersome anymore, right ? Or would it still be a nice feature to have ?
I saw that #809 has been abandoned. Should I just close this issue ?

@merveenoyan
Copy link
Contributor Author

@Wauplin we decided not to go with it, it wasn't cumbersome anyway it was just a matter of pairing two models that were trained according to each other. assume you run experiments with a pipeline that has CNN model and an RNN model, it's a little cumbersome if someone is serializing these models in two different repositories, as in, you do different experiments, so your monday CNN should be with monday RNN and from the repository you should be able to tell that.
so you have multiple repositories.
these two belong together:

  • merve/CNN_I_trained_on_monday
  • merve/RNN_I_trained_on_monday
    and these two belong together and user should be able to tell that:
  • merve/CNN_I_trained_on_tuesday
  • merve/RNN_I_trained_on_tuesday

so when user is loading these models, they'll load them separately. My solution have one repository that has two models:

  • merve/CNN_RNN_I_trained_on_monday

So when someone loads it, they can load it with one line of code.
I think it's good if people were to version using revisions, e.g. have monday revision for both model repositories and tuesday revision in parallel. (I don't know if I'm clear, you can stop me if I am 😅) So I don't know if we should close this discussion or not.

Also pinging @sayakpaul since we discussed about this for dreambooth event (a case where there's two models that belong together).

@Wauplin
Copy link
Contributor

Wauplin commented Jan 30, 2023

Hi @merveenoyan , thanks for reopening the topic! Explanation are clear to me 😄
It reminds me this discussion initiated in diffusers a while ago. Their problematic is a bit different but still similar. Diffusers pipeline can be made of different components (scheduler, denoiser, decoder) that could in theory be taken from different repos/version. So they also need this possibility to load/push only components instead of the entire pipeline.

That said, not sure what should live in huggingface_hub and what should be specific to integrations. In the push_to_hub_keras situation, I definitely agree that a single repo merve/CNN_RNN_I_trained make sense and use git tags (create_tag) to pin revisions monday, tuesday,... (or most likely experiment_1, experiment_2,...)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants