Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add how-to for callbacks #591

Merged
merged 2 commits into from
Nov 19, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions docs/source/changes.rst
Original file line number Diff line number Diff line change
Expand Up @@ -71,6 +71,9 @@ Improvements
- Adds formal support for Python 3.12. See :pull:`555`
- Several new methods were added to :class:`~hydra_zen.ZenStore`, including the abilities to copy, update, and merge stores. As well as remap the groups of a store's entries and delete individual entries. See :pull:`569`

Documentation
-------------
- Added How-To: :ref:`callbacks`

.. _v0.11.0:

Expand Down
1 change: 1 addition & 0 deletions docs/source/how_to/configure_hydra.rst
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
.. meta::
:description: Configuring Hydra.

.. _HydraConf:

===============================
Customize Hydra's Configuration
Expand Down
217 changes: 217 additions & 0 deletions docs/source/how_to/use_callbacks.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,217 @@
.. meta::
:description: Using callbacks with hydra-zen.

.. _callbacks:

=======================================================
Use Hydra's Callbacks to Run Code Before and After Jobs
=======================================================

Hydra's `callback system <https://hydra.cc/docs/experimental/callbacks/>`_ lets us run
custom code that is triggered by events, such as a job starting and a job completing.
This enables us to do things like upload a job's results to cloud storage or
turn on performance profiling in a configurable and modular way. These callbacks can be
used across applications - independent of our task function and its config.

In this How-To, we will write toy versions of two such callbacks and will incorporate
them in our hydra-zen code. First we will hardcode our application to use these
callbacks, and afterwards we will see how they can be enabled from the CLI.


Adding basic callback support to an application
===============================================

Here we define two callbacks - `TimeIt` and `UploadResultsCallback` [1]_ - and manually
add them to :ref:`Hydra's config <HydraConf>`.

.. code-block:: python
:caption: Contents of `my_app.py`- two callbacks are defined and added to Hydra's config.

import time
from dataclasses import dataclass
from hydra.experimental.callback import Callback

# The config for our task function
@dataclass
class Config:
x: int

# Our task function
def task(x: int): # just an example task function - no important details
print(f".. running task({x=})")
import random
time.sleep(random.random())

# Defining our callbacks
class TimeIt(Callback):
def on_job_start(self, **kw) -> None:
self._start = time.time()

def on_job_end(self, **kw) -> None:
print(f"TimeIt: Took {round(time.time() - self._start, 2)} seconds")


class UploadResultsCallback(Callback):
def __init__(self, *, bucket: str = "s3:/") -> None:
self.bucket = bucket

def on_job_end(self, config: Config, **kwargs) -> None:
# Leverage access to the job's config to create a distinct file path.
path = f"file_{config.x}.txt"
print(f"UploadResultsCallback: Job ended, uploading results to {self.bucket}/{path}")

if __name__ == "__main__":
from hydra.conf import HydraConf
from hydra_zen import make_custom_builds_fn, zen, ZenStore

fbuilds = make_custom_builds_fn(populate_full_signature=True)

store = ZenStore()
# Add our callbacks directly to Hydra's config and add it to our
# config store.
store(
HydraConf(
callbacks={
"upload": fbuilds(UploadResultsCallback),
"timeit": fbuilds(TimeIt),
},
)
)
# Add our task function's config to the store
store(Config, name="task")
store.add_to_hydra_store()

# Expose CLI for running `task`
zen(task).hydra_main(
config_path=None,
config_name="task",
version_base="1.3",
)

Now when we run `my_app` we should see that both of our callbacks are running.
Let's do a multirun over two values of `x`.

.. code-block:: console
:caption: Running our application using the default config.

$ python my_app.py x=1,2 -m
[2023-11-19 13:54:22,232][HYDRA] Launching 2 jobs locally
[2023-11-19 13:54:22,232][HYDRA] #0 : x=1
.. running task(x=1)
TimeIt: Took 0.13 seconds
UploadResultsCallback: Job ended, uploading results to s3://file_1.txt

[2023-11-19 13:54:22,481][HYDRA] #1 : x=2
.. running task(x=2)
TimeIt: Took 0.72 seconds
UploadResultsCallback: Job ended, uploading results to s3://file_2.txt


We can override the default bucket for `UploadResultsCallback`.

.. code-block:: console
:caption: Running with `UploadResultsCallback(bucket='gcp:/')`.

$ python my_app.py x=1,2 hydra.callbacks.upload.bucket='gcp:/' -m
[2023-11-19 14:00:46,350][HYDRA] Launching 2 jobs locally
[2023-11-19 14:00:46,350][HYDRA] #0 : x=1
.. running task(x=1)
TimeIt: Took 0.49 seconds
UploadResultsCallback: Job ended, uploading results to gcp://file_1.txt

[2023-11-19 14:00:46,981][HYDRA] #1 : x=2
.. running task(x=2)
TimeIt: Took 0.9 seconds
UploadResultsCallback: Job ended, uploading results to gcp://file_2.txt


We can disable the `TimeIt` callback.

.. code-block:: console
:caption: Disabling `TimeIt` from the CLI.

$ python my_app.py x=1,2 ~hydra.callbacks.timeit -m
[2023-11-19 14:01:42,093][HYDRA] Launching 2 jobs locally
[2023-11-19 14:01:42,093][HYDRA] #0 : x=1
.. running task(x=1)
UploadResultsCallback: Job ended, uploading results to s3://file_1.txt

[2023-11-19 14:01:42,256][HYDRA] #1 : x=2
.. running task(x=2)
UploadResultsCallback: Job ended, uploading results to s3://file_2.txt


Enabling callbacks from the CLI
===============================

Suppose we do not want any to be callbacks enabled by default, and that we would like to
turn callbacks on from the CLI. To do this, we can add our callbacks to a 'callbacks'
group in our :class:`~hydra_zen.ZenStore`, and then leverage Hydra's `group@pkg`
`override <https://hydra.cc/docs/advanced/override_grammar/basic/>`_.


.. code-block:: python
:caption: Modifying __main__ in `my_app.py`

# Config, TimeIt, UploadResultsCallback, and task are unchanged

if __name__ == "__main__":
from hydra_zen import zen, ZenStore

store = ZenStore()

# Create configs for our callbacks and store them under the 'callbacks' group
store(UploadResultsCallback, name="upload", group="callbacks")
store(TimeIt, name="timeit", group="callbacks")

store(Config, name="task")
store.add_to_hydra_store()

zen(task).hydra_main(
config_path=None,
config_name="task",
version_base="1.3",
)


By default, running our app no longer includes any callbacks.

.. code-block:: console
:caption: Running my_app without any callbacks.

$ python my_app.py x=1,2 -m
[2023-11-19 14:01:42,093][HYDRA] Launching 2 jobs locally
[2023-11-19 14:01:42,093][HYDRA] #0 : x=1
.. running task(x=1)

[2023-11-19 14:01:42,256][HYDRA] #1 : x=2
.. running task(x=2)

Let's enable both callbacks from the CLI *and* configure `UploadResultsCallback(bucket='gcp:/')`.

.. code-block:: console
:caption: Running my_app with both callbacks enabled and `UploadResultsCallback(bucket='gcp:/')`.

$ python my_app.py x=1,2 \
[email protected]=timeit \
[email protected]=upload \
hydra.callbacks.upload.bucket=gcp:/ \
-m
[2023-11-19 14:15:41,282][HYDRA] Launching 2 jobs locally
[2023-11-19 14:15:41,282][HYDRA] #0 : x=1 [email protected]=timeit [email protected]=upload
.. running task(x=1)
UploadResultsCallback: Job ended, uploading results to gcp://file_1.txt
TimeIt: Took 0.21 seconds

[2023-11-19 14:15:41,617][HYDRA] #1 : x=2 [email protected]=timeit [email protected]=upload
.. running task(x=2)
UploadResultsCallback: Job ended, uploading results to gcp://file_2.txt
TimeIt: Took 0.39 seconds

While the input here isn't all that concise it is nonetheless important to see that
callbacks can be enabled and configured without having to modify one's code.


Footnotes
=========
.. [1] See `this code <https://github.com/facebookresearch/hydra/blob/809718cdcd64f9cd930d26dea69f2660a6ffa833/hydra/experimental/callback.py#L13-L65>`_ for the full `Callback` API.
1 change: 1 addition & 0 deletions docs/source/how_tos.rst
Original file line number Diff line number Diff line change
Expand Up @@ -25,5 +25,6 @@ objectives.

how_to/configuring_experiments
how_to/configure_hydra
how_to/use_callbacks
how_to/partial_config
how_to/beartype