Even more robust sweeping #302

kylebgorman · 2025-02-06T22:26:16Z

In UDTube, the sweep script loads an input YAML configuration file specified by --config and connects to W&B using the normal flags. Then, for each run, it inserts the W&B hyperparameter suggestions (overwriting if necessary) into the config and shells out to udtube fit. Its sweep.py parses just four flags: three specifying details for W&B and one for the config file. All other flags are passed directly to udtube fit with the help of argparse.ArgumentParser.parse_known_args.
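To make the flag handling concrete, here is a rough sketch of the idea, not the actual UDTube code. The three W&B flag names (--entity, --project, --sweep_id), the helper names, and the dotted-key convention for hyperparameter names are all assumptions for illustration. Only the use of parse_known_args and the "overwrite if necessary" behavior come from the description above.

```python
import argparse


def parse_sweep_args(argv):
    """Parses the four sweep flags; returns (known args, passthrough list)."""
    parser = argparse.ArgumentParser(description="Sweep wrapper (sketch).")
    # Hypothetical names for the three W&B flags; the real ones may differ.
    parser.add_argument("--entity", required=True)
    parser.add_argument("--project", required=True)
    parser.add_argument("--sweep_id", required=True)
    # The YAML configuration file to be modified for each run.
    parser.add_argument("--config", required=True)
    # Anything unrecognized is returned untouched in the second element,
    # so it can be appended to the fit command later.
    return parser.parse_known_args(argv)


def apply_suggestions(config, suggestions):
    """Inserts suggestions into a nested config dict, overwriting if needed.

    Assumes (hypothetically) that hyperparameter names are dotted paths
    such as "model.dropout".
    """
    for dotted_key, value in suggestions.items():
        node = config
        *parents, leaf = dotted_key.split(".")
        for key in parents:
            # Creates intermediate sections that do not exist yet.
            node = node.setdefault(key, {})
        node[leaf] = value
    return config
```

With this shape, the wrapper never needs to know which flags the fit command accepts; it only forwards the passthrough list unchanged.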

This design could easily be adapted for Yoyodyne and is more or less independent of the LightningCLI migration. One advantage of this design is that when the shell call returns, the OS frees all the associated memory. Right now, we try to empty the CUDA cache, but it's not obvious to me whether this works as expected. Using a shell call instead ought to add a tiny bit of overhead, but given that we're just running a single process at a time anyway, the cost is pretty minimal, and it should be more robust in terms of freeing memory.
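A minimal sketch of the per-run shell call, assuming a hypothetical helper named run_trial that receives the full command as a list (for example, the fit command plus the passthrough flags). The point is only that each run lives in its own process, so whatever it allocated is released when that process exits.

```python
import subprocess


def run_trial(command):
    """Runs one sweep trial in a child process and returns its exit code.

    The child's memory belongs to the child process, so the OS reclaims
    it on exit regardless of what the trial did internally.
    """
    # check=False so that one failed trial does not abort the whole sweep;
    # the caller can inspect the return code instead.
    completed = subprocess.run(command, check=False)
    return completed.returncode
```

Returning the exit code rather than raising lets the sweep log a failed run and move on to the next suggestion.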

As described in CUNY-CL#302, this moves to a model where each sweep run is a subprocess, which should ensure the OS reclaims memory at the end of each run.

kylebgorman added the enhancement New feature or request label Feb 6, 2025

kylebgorman linked a pull request Feb 14, 2025 that will close this issue

Robust subprocess sweep #308

Open
