updateing docs & prod service file (#465)
* Update prefect.rst and  hedwig_listener_prod.service - fixups to docs / daemon post release
philipmac authored Mar 22, 2024
1 parent d3d621f commit ea5d4d6
Showing 2 changed files with 46 additions and 22 deletions.
54 changes: 35 additions & 19 deletions docs/source/prefect.rst
@@ -1,31 +1,19 @@
Managing Prefect Server
=======================

Prefect Server is running, but workpool is not available
--------------------------------------------------------

In HPC, you can use the following command to create a work pool.

.. code-block::
prefect work-pool create "workpool"
**Manual instruction:**

If the prefect server is running, but the workpool is not available, then you can create a workpool by going to the website where the prefect server is hosted. Go to `Work Pools` tab and create a workpool with the name `workpool`. This is the name of the workpool that is defined in [prefect.yaml > definitions > work_pools > name](https://github.com/niaid/image_portal_workflows/pull/353/files#diff-b49a6f022232810a70f1a0c2feffbbe84d018b2418a7996e52430c6063ada3a3R23) file.

Continuous Deployment (dev to qa to prod)
-----------------------------------------

Assuming that `dev` environment is working as expected, we can promote the environment to `qa` and `prod` as well.
Server images are promoted from one environment to the next, i.e. `dev` -> `qa` -> `prod`.

The first thing we need to do is promote the image from `dev` to `qa`.
For example to promote the image from `dev` to `qa`:

.. code-block::
spaces task -f hedwig.spaces-solution.yaml promote-image -- dev qa
Afterwards, we can deploy the aws infrastructure for `qa` as,
We can then deploy the aws infrastructure for `qa`:

.. code-block::
@@ -52,7 +40,18 @@ Make sure the configurations are correct:
# change it to dev or qa, based on your environment
2. Check prefect config with view
2. Check HPC worker daemon:

.. code-block::
systemctl status hedwig_listener_prod
Certain scenarios require the daemon to be restarted or reloaded, although typically this step is not needed (see the helper_scripts/.service file). `systemd` should restart the worker if it is killed or crashes.



3. Check prefect config with view

.. code-block::
@@ -62,7 +61,7 @@ Make sure the configurations are correct:
export PREFECT_API_KEY=xyz
export PREFECT_API_URL=abc.com
3. Deploy flows with prefect deploy
4. Deploy flows with prefect deploy

.. code-block::
@@ -71,6 +70,23 @@ Make sure the configurations are correct:
# However, this will also deploy pytest_runner workflow in other envs (where it's not needed)
# prefect deploy --all
4. Run worker (properly via the helper_scripts/.service file)
The service file should restart the worker when it is killed. Normally, we would not need to do this step.
Troubleshooting:
--------------------------------------------------------

- Prefect Server is running, but workpool is not available

In HPC, you can use the following command to create a work pool.
`prefect work-pool create "workpool"`
Ensure the Prefect server is running. If the workpool is not available, create one by going to the website where the Prefect server is hosted. Go to the `Work Pools` tab and create a workpool with the name `workpool`. This is the name of the workpool that is defined in the prefect.yaml file under definitions > work_pools > name.

- IMOD unable to find `env`

.. code-block::
Unable to run command.
Cannot run program "env" (in directory "?"): error=2, No such file or directory
Note the `directory "?"`: this implies that something is trying to run in a directory that does not exist. Ensure that the daemon is taken down, that `ps aux | grep hedwig` does not list any processes that may still be running, that the service file is correct, and that the daemon is `reloaded` and `started`.
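
The two troubleshooting items above could be scripted roughly as follows. This is only a sketch under stated assumptions: the helper names (`run`, `ensure_work_pool`, `restart_listener`) and the `SERVICE` and `DRY_RUN` variables are hypothetical and not part of the repository, "reloaded" is taken to mean `systemctl daemon-reload`, and `prefect work-pool inspect` is assumed to exit non-zero when the pool does not exist. Commands that change the daemon are only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only: SERVICE, run, ensure_work_pool and restart_listener are
# hypothetical helpers, not part of the repository.

SERVICE="${SERVICE:-hedwig_listener_prod}"

# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create the work pool only if `prefect work-pool inspect` cannot find it.
# Assumption: inspect exits non-zero for a missing pool.
ensure_work_pool() {
    pool="${1:-workpool}"
    if prefect work-pool inspect "$pool" >/dev/null 2>&1; then
        echo "work pool '$pool' already exists"
    else
        prefect work-pool create "$pool"
    fi
}

# Stop the daemon, look for stray processes, reload unit files, start again.
restart_listener() {
    run sudo systemctl stop "$SERVICE"
    # "[h]edwig" keeps grep from matching its own process entry.
    run sh -c 'ps aux | grep "[h]edwig"'
    run sudo systemctl daemon-reload
    run sudo systemctl start "$SERVICE"
    run systemctl status --no-pager "$SERVICE"
}
```

After sourcing the file, calling `restart_listener` prints the planned commands for review, and `DRY_RUN=0 restart_listener` executes them. `ensure_work_pool workpool` talks to whichever server `PREFECT_API_URL` points at.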

14 changes: 11 additions & 3 deletions helper_scripts/hedwig_listener_prod.service
@@ -1,16 +1,24 @@

[Unit]
Description=Starts the Production listener "Agent" which reaches out to workflow API.
Description=Starts the Production listener worker, which reaches out to workflow API.
After=network.target


[Service]
Type=simple
User=hedwig_prod
Group=hedwig_prod
ExecStart=/gs1/home/hedwig_prod/image_portal_workflows/helper_scripts/hedwig_reg_listen.sh listen
WorkingDirectory=/gs1/home/hedwig_prod
Environment="HEDWIG_ENV=prod"
Environment="REQUESTS_CA_BUNDLE=/etc/pki/tls/certs/ca-bundle.crt"
Environment="PREFECT_API_URL=https://prefect2.hedwig-workflow-api.niaidprod.net/api"
Environment="IMOD_DIR=/opt/rml/imod"
WorkingDirectory=/gs1/home/hedwig_prod/image_portal_workflows
# current setting on prod
# WorkingDirectory=/gs1/home/hedwig_prod

ExecStart=/gs1/home/hedwig_prod/prod/bin/prefect worker start --pool workpool
Restart=always
RestartSec=60s

[Install]
WantedBy=multi-user.target
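
One possible way to check and install the unit file shown above is sketched here. The commit does not describe the installation procedure, so everything in this block is an assumption: the install location `/etc/systemd/system`, the helper names `run` and `install_unit`, and the `UNIT_SRC`, `UNIT_DIR` and `DRY_RUN` variables are hypothetical. Commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only: UNIT_SRC, UNIT_DIR, run and install_unit are hypothetical names.

UNIT_SRC="${UNIT_SRC:-helper_scripts/hedwig_listener_prod.service}"
UNIT_DIR="${UNIT_DIR:-/etc/systemd/system}"   # assumed install location

# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Verify the unit file, copy it into place, reload systemd, then enable and start it.
install_unit() {
    unit_name="$(basename "$UNIT_SRC")"
    run systemd-analyze verify "$UNIT_SRC"
    run sudo cp "$UNIT_SRC" "$UNIT_DIR/$unit_name"
    run sudo systemctl daemon-reload
    run sudo systemctl enable --now "$unit_name"
}
```

Running `install_unit` from the repository root prints the four commands for review. `systemd-analyze verify` is placed first so that syntax problems in the unit file surface before anything is copied.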
