Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

exp push: studio integration #9074

Closed
11 tasks done
dberenbaum opened this issue Feb 23, 2023 · 36 comments
Closed
11 tasks done

exp push: studio integration #9074

dberenbaum opened this issue Feb 23, 2023 · 36 comments
Labels
A: experiments Related to dvc exp epic Umbrella issue (high level). Include here: list of tasks/PRs, details, Qs, timelines, etc p1-important Important, aka current backlog of things to do product: Studio Integration with Studio

Comments

@dberenbaum
Copy link
Collaborator

dberenbaum commented Feb 23, 2023

To make exp push/pull more useful, users need to be able to see and manage the exp refs from within Studio (since those exp refs are ignored by GitHub and other Git providers). One of the challenges is that there is not even any trigger by which the Git providers can alert Studio that new exp refs have been pushed.

To enable Studio support for pushed experiments, DVC needs to:

Tasks

  1. A: experiments p2-medium product: VSCode
  2. A: experiments enhancement skip-changelog
  3. A: experiments p1-important
    skshetry
  4. A: experiments

Blocked/out of scope issues:

Tasks

  1. A: experiments p2-medium product: Studio
    amritghimire
  2. A: experiments p3-nice-to-have product: Studio

Related:

@dberenbaum dberenbaum added p2-medium Medium priority, should be done, but less important product: VSCode Integration with VSCode extension A: experiments Related to dvc exp labels Feb 23, 2023
@mattseddon
Copy link
Member

Support for saving Studio token to global DVC config or somewhere else that's reasonably secure, persistent, and ideally globally available

Github CLI has gh auth set of commands. The most useful for me has been gh auth login.

@dberenbaum
Copy link
Collaborator Author

Edited to add:

@daavoo
Copy link
Contributor

daavoo commented Mar 9, 2023

Return links to experiments in Studio so user knows exactly where to see what was pushed

What would this link to? Is it the default project view or something else? Afaik there is no real concept of a view for a single experiment in Studio

@dberenbaum
Copy link
Collaborator Author

What would this link to? Is it the default project view or something else? Afaik there is no real concept of a view for a single experiment in Studio

I think it's discussed in https://github.com/iterative/studio/issues/5150.

  • When pushing refs, alert Studio that new ref is available so Studio knows to parse the new refs

See https://github.com/iterative/studio/issues/5273#issuecomment-1462105892. @skshetry Looks like we already have the API needed for this.

@skshetry
Copy link
Member

skshetry commented Mar 10, 2023

Looks like we already have the API needed for this.

I'll leave that upto the Studio team to decide. :)
At the moment, what we need is a friendly ping to Studio to go and fetch new exp refs and parse. The existing API seems to only use the data from the payload, does not fetch or parse anything other than that. So either a new API or changes to API is needed.

Also there are open questions how Studio should pull new refs. Webhooks don't work for refs.

@skshetry
Copy link
Member

Return links to experiments in Studio so user knows exactly where to see what was pushed

This may be a bit tricky with just STUDIO_TOKEN and repo_url to figure out as a user can have multiple views.

@skshetry
Copy link
Member

Some updates here: I am waiting for Studio to parse and show the experiment refs. We already have the functionality to notify Studio of the refs. The API endpoint does refresh and tries to parse new commits.

Support for saving Studio token to global DVC config or somewhere else that's reasonably secure, persistent, and ideally globally available

I am thinking of implementing support in dvc config for token and calling this item done here for now. Secure storage implementation can be done in the future.

Return links to experiments in Studio so user knows exactly where to see what was pushed

Looks like this was deprioritized. So I don't see it happening any time soon.


I have 2 questions regarding the workflow:

  1. If we did set a config, will it always by default push refs to Studio? Or should it be a separate config?
  2. Should there be a separate config to push or not push refs to the Studio?
    Eg: --studio and --no-studio perhaps ?

@dberenbaum
Copy link
Collaborator Author

Sounds good @skshetry, thanks!

I have 2 questions regarding the workflow:

1. If we did set a config, will it always by default push refs to Studio? Or should it be a separate config?

I think it should push by default. It's a lightweight ping, and if it fails we can make it not too alarming. I don't see much reason not to push in most cases. We already will need to figure out how to differentiate this from the live sharing to studio that's enabled currently by the env var, so let's not add more options than necessary yet.

2. Should there be a separate config to push or not push refs to the Studio?
   Eg: `--studio` and `--no-studio` perhaps ?

Is it needed? Similar to above, is there a situation where it's harmful to send a ping to Studio if the token exists?

Should we return an error code on failures to ping Studio? My only worry would be not breaking CI jobs if the ping to Studio fails.

@dberenbaum
Copy link
Collaborator Author

I see we might end up trying to avoid having a token according to https://github.com/iterative/studio/issues/5050#issuecomment-1479601463:

One is to optimize the flow of trigger. Currently, user need to have the studio token to make webhook trigger. We can remove the auth part and use existing ip/user rate limit as safeguards. If the refs sent by the dvc exists in git forge, then only we can trigger the parse.

In that case, I can see more why we may need some way to enable/disable sending to Studio. It might be fine to ping by default and quietly fail since it's a pretty lightweight request, but I'd agree we probably need some way to opt out of this behavior.

@dberenbaum
Copy link
Collaborator Author

@skshetry Added #9227

@dberenbaum dberenbaum added p1-important Important, aka current backlog of things to do and removed p2-medium Medium priority, should be done, but less important labels Mar 22, 2023
@dberenbaum
Copy link
Collaborator Author

@skshetry We also may need to show some warning if the ping to Studio is successful but Studio cannot show it because the parent commit has not been pushed. See https://github.com/iterative/studio/issues/4940#issuecomment-1481247133. Should we add it as a task?

@dberenbaum

This comment was marked as resolved.

@skshetry

This comment was marked as outdated.

@skshetry
Copy link
Member

FYI you may have to sync with main for this to work. We changed the endpoint.

@skshetry
Copy link
Member

skshetry commented Mar 28, 2023

The only thing that's required for the release is #9295, but it's not a blocker on dvc side.

@daavoo
Copy link
Contributor

daavoo commented Mar 28, 2023

The only thing that's required for the release is #9295, but it's not a blocker on dvc side.

For dvc exp run, when live updates are enabled, we should discuss if we expect dvc to also call exp push here:

finally:
post_live_metrics(
"done",
info.baseline_rev,
info.name,
"dvc",
experiment_rev=dvc.experiments.scm.get_ref(EXEC_BRANCH),
metrics=get_in(dvc.metrics.show(), ["", "data"]),
)

DVCLive should do the same order of calls when save_dvc_exp=True and people run with python dvclive_script.py

@daavoo
Copy link
Contributor

daavoo commented Mar 28, 2023

we should discuss if we expect dvc to also call exp push here:

We could also remove the metrics argument if we push the full exp ref on completion

@dberenbaum
Copy link
Collaborator Author

@skshetry Could you open a docs issue/PR for this work 🙏 ?

@dberenbaum
Copy link
Collaborator Author

dberenbaum commented Mar 28, 2023

@skshetry I added this item:

Also, are you still planning on this item?

  • Return links to experiments in Studio so user knows exactly where to see what was pushed - sufficient to return link to project for now

Even if it's a project-level URL, I think it would be helpful to both indicate the ping was successful and make it easy to navigate to Studio to see the experiments.

@skshetry
Copy link
Member

Also, are you still planning on this item?

Even if it's a project-level URL, I think it would be helpful to both indicate the ping was successful and make it easy to navigate to Studio to see the experiments.

It can be implemented later on when Studio starts returning URL.

@dberenbaum

This comment was marked as outdated.

@skshetry
Copy link
Member

For dvc exp run, when live updates are enabled, we should discuss if we expect dvc to also call exp push here:

I need your opinion here, @dberenbaum. Note that we have DVC_EXP_AUTO_PUSH envvar that does this already.

I'd prefer being explicit here with dvc exp run --push. I don't really like to have too much magic with just STUDIO_TOKEN set.

@dberenbaum
Copy link
Collaborator Author

Note that we have DVC_EXP_AUTO_PUSH envvar that does this already.

I don't really like to have too much magic with just STUDIO_TOKEN set.

Maybe the problem is that we are using STUDIO_TOKEN for everything. We did it for simplicity at the start, but it's always felt like an odd way to enable/disable sharing to me. Should we use DVC_EXP_AUTO_PUSH to enable/disable live sharing instead? It aligns with the discussion in #9221 to move away from checkpoints.

It could work like:

  1. Set DVC_EXP_AUTO_PUSH to get automatic sharing.
  2. While the experiment runs, each DVCLive step will push updates to Studio using the Studio token (either from env var or the config).
  3. At the end/when a commit happens, DVC will do dvc exp push.

@shcheklein
Copy link
Member

Are there any other ways to control besides using env vars (personal bias - I never liked things like export SOMETHING or remember to do SOMETHING=true command - all looks like hacks to me, when you need to override something temporarily vs a proper interface.

@dberenbaum
Copy link
Collaborator Author

If we use DVC_EXP_AUTO_PUSH, I see no reason we can't create a DVC config option with the same name.

@shcheklein
Copy link
Member

Can it be / should it be controlled on the code level as well? E.g. enabled by default is token is set + an option to opt out.?

@daavoo daavoo added the epic Umbrella issue (high level). Include here: list of tasks/PRs, details, Qs, timelines, etc label Mar 31, 2023
@dberenbaum
Copy link
Collaborator Author

@shcheklein Can we agree on the schema and follow up on the exact workflow? Any thoughts on using both STUDIO_TOKEN and DVC_EXP_AUTO_PUSH to control this behavior?

Using both of those options as described above unifies the live sharing and exp push workflow, and it gives flexibility to tweak the workflows. We can save the token and either share or not, and can change the default behavior (we could enable DVC_EXP_AUTO_PUSH by default when saving the token if we want). We can also set them from CLI, code, in environment variables, etc.

@shcheklein
Copy link
Member

@dberenbaum could you please clarify the details a bit? I think it's hard to understand how all the things fit together.

So, we have a token in the DVC config now (who is reading it and how do we pass it downstream in DVC). Af far as I understand it's also possible to set STUDIO_TOKEN, we do this in VS Code for example.

We don't use DVC_EXP_AUTO_PUSH yet (I guess push is enabled by default or controlled via code).

Anyways, I think we need to clarify the whole picture tbh here. It's hard to agree w/o knowing the details (or may be I don't know them enough.

@dberenbaum
Copy link
Collaborator Author

Here's a proposal:

  • STUDIO_TOKEN is used to save the token and has no other effect.
  • DVC_EXP_AUTO_PUSH is used to enable/disable automatic sharing of experiments. This would push exp refs whenever they are saved locally, and if there is a valid STUDIO_TOKEN, it would also share live updates.
  • Both could be configured through different interfaces with a typical priority order: CLI/API flag -> environment variable -> local config -> project config -> user/global config -> system config.

@shcheklein
Copy link
Member

okay, I see. So it's just names for configs, not env vars. So, the question here is to decide on names and see if those two are more or less enough and seem reasonable from the user perspective (assuming we can change and fine tune their exact behavior, etc), right?

and if there is a valid STUDIO_TOKEN, it would also share live updates.
STUDIO_TOKEN is used to save the token and has no other effect.

It's a bit of a contradiction?


A few things to think, from the top of my head:

  • do we have other env / global configs similar to this? how do we name them (e.g. should be name all the options DVC_something, or other way around no DVC_ prefix). E.g. should it be DVC_STUDIO_TOKEN? or ITERATIVE_STUDIO_TOKEN?
  • can it happen that we would need DVC_EXP_PUSH_MODE={auto,default,no} (just making this up a bit- no need to spend too much on it- just feels a bit more future-proof?). (the answer here is probably no, since the name is confusing ... just in case)

@dberenbaum
Copy link
Collaborator Author

and if there is a valid STUDIO_TOKEN, it would also share live updates.
STUDIO_TOKEN is used to save the token and has no other effect.

It's a bit of a contradiction?

Maybe you took me a bit too literally 😄. Do you have confusion over how it will work in some scenario?

  • do we have other env / global configs similar to this? how do we name them (e.g. should be name all the options DVC_something, or other way around no DVC_ prefix). E.g. should it be DVC_STUDIO_TOKEN? or ITERATIVE_STUDIO_TOKEN?

See https://github.com/iterative/dvc/blob/main/dvc/env.py. Agree that it should probably be DVC_STUDIO_TOKEN. We probably need a docs page listing all those.

There is also https://dvc.org/doc/dvclive/env.

@skshetry Maybe you have thoughts on sections/naming for both options in the config?

  • can it happen that we would need DVC_EXP_PUSH_MODE={auto,default,no} (just making this up a bit- no need to spend too much on it- just feels a bit more future-proof?). (the answer here is probably no, since the name is confusing ... just in case)

Not sure I get what each option there would mean. I think we could adjust the "default" behavior via the UX in #9265. For example:

$ dvc studio login
Redirecting you to studio.iterative.ai...
Token saved to `~/.config/dvc/config`
Would you like to enable automatic experiment sharing? (Y/n)

In other words, rather than have an option like exp_auto_push that defaults to true when unset, we can stick to typical convention (unset means false) but adjust default behavior through the UX.

@dberenbaum
Copy link
Collaborator Author

dberenbaum commented Apr 11, 2023

@skshetry Let's wrap this up by:

Edit: updated the post at the top.

@omesser
Copy link
Contributor

omesser commented Apr 14, 2023

@dberenbaum can you please clarify where did we land on with the experience? I'm a bit worried we're adding friction to the user for the happy flow

@dberenbaum
Copy link
Collaborator Author

dberenbaum commented Apr 14, 2023

@omesser There's a lot of related discussion in #9295. For now, here's how it will work:

  • Setting STUDIO_TOKEN will enable dvclive sharing.
  • If either STUDIO_TOKEN or the dvc config option studio.token are set, dvc exp push will ping Studio in addition to pushing to the git and/or dvc remotes.

Nothing else is planned for now. We discussed auto-pushing exp refs and the corresponding dvc cached outputs, but after the discussion in #9295, I think we should wait. We have taken some steps to make live updates feel connected to the rest of the experiments workflow (like replacing the live row with the exp ref row in Studio). Live sharing is a different scenario than dvc exp push where it's unclear whether you want to push the exp ref or dvc cache automatically, and it makes explaining dvclive sharing more complicated (it pushes some lightweight stuff to Studio in realtime, and then pushes everything to the Git remote and DVC remote at the end, but only if you are using some option to save a DVC experiment, and if you don't want to push to the Git remote or DVC remote you can opt out by...).

@skshetry skshetry added release-blocker Blocks a release and removed release-blocker Blocks a release labels Apr 17, 2023
@skshetry skshetry removed their assignment Apr 24, 2023
@dberenbaum
Copy link
Collaborator Author

Closing this one since all tasks are completed. Great work @skshetry @daavoo and everyone else who helped here!

@dberenbaum
Copy link
Collaborator Author

Whoops, thanks @skshetry!

@daavoo daavoo added product: Studio Integration with Studio and removed product: VSCode Integration with VSCode extension labels May 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A: experiments Related to dvc exp epic Umbrella issue (high level). Include here: list of tasks/PRs, details, Qs, timelines, etc p1-important Important, aka current backlog of things to do product: Studio Integration with Studio
Projects
No open projects
Archived in project
Development

No branches or pull requests

6 participants