Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move TrialStatus out to its own file (preliminary cleanup) #3325

Closed

Conversation

lena-kashtelyan
Copy link
Contributor

Summary:
The goal of this stack was to make trials store arms directly, and not have a weird intermediate step where in fact trials store generator runs, which in turn store arms. We didn't like that setup because:
a. it was confusing and we always had to caveat it,
b. it made it awkward to add or remove arms from (batch)trials,
c. it made us create things like GeneratorRun(generator_run_type="MANUAL"), which is pretty ugly,
d. it made us return things like "list of lists of generator runs" from GenerationStrategy.gen (since we need that call to produce GRs for multiple trials, where each trial itself might need multiple GRs, e.g. in OnlineExp case).
More detail on the motivation in this proposal.

{F1974983717}

Once saitcakmak and I started thinking through how this might work (related discussion in this doc), we concluded that we liked the ideas of a) storing all GRs on the Experiment, using unique indices and b) leveraging those unique indices everywhere else to connect other abstractions to GRs, e.g. Arms or Trials (if needed). We do something similar with Trials on Experiment now and thing that works pretty well.

However, we encountered a conundrum: if we were to return Arms from GS.gen like we previously wanted (whereby we'd returned list of lists of arms from that call, with each list of arms intended as a trial), we also needed to keep track of which GenNodes produced those Arms (or which GRs those arms came from, since the GRs have GenNodes' names stored on them. And if we were to capture this info during GS.gen (for which we will need to know what index a given GR will have on the Experiment), we'd end up adding GRs to the Experiment during that call, which is a pretty big and likely unintuitive side effect (and we might end up storing "garbage/debugging" GRs we generated manually and never intended to add to any trial). Sigh.

And then we realized that there might be a way out of this: what if GS.gen actually returned Trials? –– that is the core idea of this RfC stack After all, we always find ourselves saying things like "GS generated a new trial –– jk, a GR that is destined to become a Trial." So what it it could actually make Trials? The only change this would require is that those Trial-s are not automatically added to an Experiment, but rather that can be done in a separate step (this is drafted in step 3 below and will need much beautifying before it can go in).

One aspect to pay attention to: for the internal machinery of GenerationStrategy to function in a well-oiled way, it needs to know which Trials have arms that come from given GenerationNodes (many TransitionCriteria are based on this information, e.g. "move to BO after 5 Sobol trials.") To this end, we also capture a mapping from each Trial's Arms, to indices of GRs that produced those arms (and the GRs know which GenNode made them).

Contents of this stack:

  1. [D69223588][THIS DIFF] Preliminary cleanup, pulls TrialStatus out into its own file. It's still importable from where it used to live (base_trial.py)
  2. [D69221377] Add Experiment._generator_runs: this will track all GRs produced for this experiment, arms from which made it to trials attached to the experiment (aside: this should allow us to no longer track all GRs on the GS!)
  3. [D69221378] Removes the requirement that trials are created only as they are attached to the experiment; i.e. allows us to make trials that are not yet attached to anything. Step 5 will make use of this.
  4. [D69223783] We have an old attach_trial method on the experiment, used only internally. I think we could rename it to attach_custom_trial or something else –– the meaning of this method is "attach this purely manually generated parameterization." Maybe a better name will be attach_custom_arm_as_trial or something like that.
  5. [D69223800] Implements a new Experiment.attach_trial which, given a trial that is not yet attached to anything, adds it to the experiment.
  6. [D69223911] Adds a GenerationStrategy.gen_next_trials method (the main gen to rule them all), which creates trials and not GRs like the GS always did. These trials are not automatically attached to the experiment and their GRs are only added to Experiment._generator_runs iff the trial actually gets attached via Experiment.attach_trial.
  7. (More like 7-N; will add if we like the proposal): Storage updates and other downstream updates related to this core change.

Differential Revision: D69223588

@facebook-github-bot facebook-github-bot added the CLA Signed Do not delete this pull request or issue due to inactivity. label Feb 7, 2025
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D69223588

@codecov-commenter
Copy link

codecov-commenter commented Feb 7, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.70%. Comparing base (0809eb3) to head (076ad23).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3325   +/-   ##
=======================================
  Coverage   95.70%   95.70%           
=======================================
  Files         533      534    +1     
  Lines       52537    52540    +3     
=======================================
+ Hits        50278    50281    +3     
  Misses       2259     2259           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D69223588

lena-kashtelyan pushed a commit to lena-kashtelyan/Ax that referenced this pull request Feb 7, 2025
…k#3325)

Summary:
Pull Request resolved: facebook#3325

**The goal of this stack was to make trials store arms directly, and not have a weird intermediate step where in fact trials store generator runs, which in turn store arms.** We didn't like that setup because:
a. it was confusing and we always had to caveat it,
b. it made it awkward to add or remove arms from (batch)trials,
c. it made us create things like `GeneratorRun(generator_run_type="MANUAL")`, which is pretty ugly,
d. it made us return things like "list of lists of generator runs" from `GenerationStrategy.gen` (since we need that call to produce GRs for multiple trials, where each trial itself might need multiple GRs, e.g. in OnlineExp case).
*More detail on the motivation in [this proposal](https://fb.workplace.com/groups/659814077526772/permalink/2499116463596515/)*.

{F1974983717}

Once saitcakmak and I started thinking through how this might work ([related discussion in this doc](https://docs.google.com/document/d/1MbHzClqVISjKjQ81kuTQWopzX0pzCb7GeIpxRAvGT7E/edit?tab=t.0)), we concluded that **we liked the ideas of a) storing all `GR`s on the `Experiment`, using unique indices and b) leveraging those unique indices everywhere else to connect other abstractions to `GR`s**, e.g. `Arms` or `Trial`s (if needed). We do something similar with `Trial`s on `Experiment` now and thing that works pretty well.

However, we encountered a conundrum: **if we were to return `Arm`s from `GS.gen` like we previously wanted** (whereby we'd returned list of lists of arms from that call, with each list of arms intended as a trial), we also needed to keep track of which `GenNodes` produced those `Arm`s (or which `GR`s those arms came from, since the `GR`s have `GenNode`s' names stored on them. And if we were to capture this info during `GS.gen` (for which we will need to know what index a given `GR` will have on the `Experiment`), **we'd end up adding `GR`s to the `Experiment` during that call, which is a pretty big and likely unintuitive side effect** (and we might end up storing "garbage/debugging" `GR`s we generated manually and never intended to add to any trial). *Sigh*.

And then we realized that there might be a way out of this: **what if `GS.gen` actually returned `Trial`s? –– that is the core idea of this RfC stack** After all, we always find ourselves saying things like "GS generated a new ~trial~ –– jk, a `GR` that is destined to become a `Trial`." So what it it could actually make `Trial`s? The only change this would require is that **those `Trial`-s are not automatically added to an `Experiment`, but rather that can be done in a separate step** (this is drafted in step 3 below and will need much beautifying before it can go in).

One aspect to pay attention to: for the internal machinery of `GenerationStrategy` to function in a well-oiled way, it needs to know which `Trial`s have arms that come from given `GenerationNodes` (many `TransitionCriteria` are based on this information, e.g. "move to BO after 5 Sobol trials.") To this end, we also capture a mapping from each `Trial`'s `Arm`s, to indices of `GR`s that produced those arms (and the `GR`s know which `GenNode` made them).

Contents of this stack:

1. [D69223588]**[THIS DIFF] Preliminary cleanup, pulls `TrialStatus` out into its own file.** It's still importable from where it used to live (base_trial.py)
2. [D69221377] Add `Experiment._generator_runs`: this will track all GRs produced for this experiment, arms from which made it to trials attached to the experiment (aside: this should allow us to no longer track all GRs on the GS!)
3. [D69221378] Removes the requirement that trials are created only as they are attached to the experiment; i.e. allows us to make trials that are not yet attached to anything. Step 5 will make use of this.
4. [D69223783] We have an old `attach_trial` method on the experiment, used only internally. I think we could rename it to `attach_custom_trial` or something else –– the meaning of this method is "attach this purely manually generated parameterization." Maybe a better name will be `attach_custom_arm_as_trial` or something like that.
5. [D69223800] Implements a new `Experiment.attach_trial` which, given a trial that is not yet attached to anything, adds it to the experiment.
6. [D69223911] **Adds a `GenerationStrategy.gen_next_trials` method (the main `gen` to rule them all), which creates trials and not GRs like the GS always did.** These trials are not automatically attached to the experiment and their GRs are only added to `Experiment._generator_runs` iff the trial actually gets attached via `Experiment.attach_trial`.
7. (More like 7-N; will add if we like the proposal): Storage updates and other downstream updates related to this core change.

Differential Revision: D69223588
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D69223588

lena-kashtelyan pushed a commit to lena-kashtelyan/Ax that referenced this pull request Feb 7, 2025
…k#3325)

Summary: Pull Request resolved: facebook#3325

Reviewed By: saitcakmak

Differential Revision: D69223588
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D69223588

lena-kashtelyan pushed a commit to lena-kashtelyan/Ax that referenced this pull request Feb 7, 2025
…k#3325)

Summary: Pull Request resolved: facebook#3325

Reviewed By: saitcakmak

Differential Revision: D69223588
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D69223588

1 similar comment
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D69223588

lena-kashtelyan pushed a commit to lena-kashtelyan/Ax that referenced this pull request Feb 10, 2025
…k#3325)

Summary: Pull Request resolved: facebook#3325

Reviewed By: saitcakmak

Differential Revision: D69223588
…k#3325)

Summary: Pull Request resolved: facebook#3325

Reviewed By: saitcakmak

Differential Revision: D69223588
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D69223588

@facebook-github-bot
Copy link
Contributor

This pull request has been merged in 78f9c09.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed Do not delete this pull request or issue due to inactivity. fb-exported Merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants