
v0.12.0

github-actions released this 24 Aug 00:06
· 162 commits to main since this release

What's new

Added 🎉

  • Step resources:
    • Added a step_resources parameter to the Step class for describing the computational resources required to run a step.
      Executor implementations can use this information. For example, if your step needs 2 GPUs, set
      step_resources=StepResources(gpu_count=2) (or "step_resources": {"gpu_count": 2} in the configuration language).
    • Added a Step.resources property. By default it returns the value specified by the step_resources parameter.
      If your step implementation always requires the same resources, you can override this property instead of passing
      the step_resources parameter.
  • Step execution:
    • Added an executor field to the tango.yml settings. You can use this to define the executor you want to use by default.
    • Added a Beaker Executor to the Beaker integration, registered as an Executor with the name "beaker".
      To use this executor, add these lines to your tango.yml file:
      executor:
        type: beaker
        beaker_workspace: ai2/my-workspace
        clusters:
          - ai2/general-cirrascale
      See the docs for the BeakerExecutor for more information on the input parameters.
  • Step class:
    • Added a metadata field to the Step class API. This can be set through the class
      variable METADATA or through the constructor argument step_metadata.
  • Weights & Biases integration:
    • You can now change the artifact kind for step result artifacts by adding a field
      called "artifact_kind" to a step's metadata.
      For models, setting "artifact_kind" to "model" will add the corresponding artifact to W&B's new model zoo.
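
The interplay between the step_resources parameter and the resources property from the "Step resources" item above can be sketched as follows. This is a minimal illustration, not Tango's implementation: StepResources and Step below are simplified stand-ins defined in the snippet itself, TrainStep and TwoGpuStep are hypothetical step names, and falling back to an empty StepResources when nothing is passed is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StepResources:
    """Stand-in for Tango's StepResources; only gpu_count is modeled here."""

    gpu_count: Optional[int] = None


class Step:
    """Stand-in for Tango's Step base class, reduced to resource handling."""

    def __init__(self, step_resources: Optional[StepResources] = None):
        self._step_resources = step_resources

    @property
    def resources(self) -> StepResources:
        # Default behavior: return whatever was passed as step_resources.
        # Returning an empty StepResources when nothing was passed is an assumption.
        return self._step_resources or StepResources()


class TrainStep(Step):
    """Hypothetical step whose caller declares the resources it needs."""


class TwoGpuStep(Step):
    """Hypothetical step that always needs 2 GPUs, so it overrides resources."""

    @property
    def resources(self) -> StepResources:
        return StepResources(gpu_count=2)


# Resources supplied per instance through the constructor argument.
explicit = TrainStep(step_resources=StepResources(gpu_count=2))
# Resources fixed by the class, so no argument is needed.
implicit = TwoGpuStep()
print(explicit.resources, implicit.resources)
```

Overriding the property suits steps whose requirements never vary, while the constructor argument leaves the decision to whoever configures the experiment.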

Changed ⚠️

  • CLI:
    • The tango run command will raise an error if you have uncommitted changes in your repository, unless
      you use the --allow-dirty flag.
    • The tango run command will use the lightweight base executor (single process) by default.
      To use the multi-process executor, set -j/--parallelism to 1 or higher, or to -1 to use all available CPU cores.
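
The -j/--parallelism rule above can be read as a small decision function. The sketch below is only an illustration of that rule: resolve_workers is a hypothetical helper, not part of Tango, and rejecting 0 and values below -1 is an assumption, since the notes do not say how those are handled.

```python
import os
from typing import Optional


def resolve_workers(parallelism: Optional[int]) -> Optional[int]:
    """Map a -j/--parallelism value to a worker count.

    Returns None when the flag is unset, meaning the single-process
    base executor; otherwise returns the number of worker processes
    for the multi-process executor.
    """
    if parallelism is None:
        # Flag not given: lightweight base executor.
        return None
    if parallelism == -1:
        # -1 means "use all available CPU cores".
        return os.cpu_count() or 1
    if parallelism >= 1:
        return parallelism
    # Assumption: other values are treated as invalid in this sketch.
    raise ValueError(f"unsupported parallelism value: {parallelism}")


for value in (None, 4, -1):
    print(value, "->", resolve_workers(value))
```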

Fixed ✅

  • Fixed a bug where StepInfo environment and platform metadata could be out-of-date when a step is rerun after a failure.
  • Fixed a bug where an unfortunate combination of early stopping and decreasing model performance could result in a crash in the torch trainer.

Commits

befb00a Add workspace_metadata arg to Step class, allow changing artifact kind in W&B workspace (#363)
5ab1c2a Fix undefined behavior with TorchTrainStep (#366)
bf3c1a0 Update filelock requirement from <3.8,>=3.4 to >=3.4,<3.9 (#354)
b4e48a7 Update jsonpickle requirement from <2.2.0,>=2.1.0 to >=2.1.0,<2.3.0 (#351)
1c491f0 Update wandb requirement from <0.13,>=0.12 to >=0.12,<0.14 (#350)
93d5eb4 Bump allenai/setup-beaker from 1 to 2 (#359)
dc0f89a Fix #355 - ensure git metadata is up-to-date (#361)
258e880 Raise better error msg from step_result_for_run() (#360)
43916d1 Print debugging information about the repo used. (#353)
928aa7a Add BeakerExecutor (#340)