This repository has been archived by the owner on Dec 11, 2024. It is now read-only.

Build a coffea-casa-dask version with pinned packages for EAF compatibility #45

Closed
mapsacosta opened this issue Aug 31, 2021 · 5 comments

Comments

@mapsacosta
Collaborator

mapsacosta commented Aug 31, 2021

I would like to enable the use of the coffea-casa-dask image for Dask clusters launched from the EAF's dask-gateway server.

Dask is very cranky when client and server versions of some packages differ (see issue here). We have not yet been able to use the dask-cc7 image out of the box in production, as I need to set up the client with exactly the same versions as the image used for the workers and scheduler.
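As a rough illustration of the client-side check described above, the sketch below compares the versions installed in the current environment against a set of pins expected on the worker image. The names `installed_version`, `find_mismatches` and `WORKER_IMAGE_PINS` are hypothetical, and the pin values are placeholders rather than the actual contents of any image.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose local version differs to (expected, found)."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


# Placeholder pins (assumption); in practice these would be read from the image.
WORKER_IMAGE_PINS = {"dask": "2.30.0", "distributed": "2.30.1"}

if __name__ == "__main__":
    for package, (wanted, found) in find_mismatches(WORKER_IMAGE_PINS).items():
        print(f"{package}: image has {wanted}, client has {found}")
```

With a live cluster, `distributed.Client.get_versions(check=True)` should perform a comparable check across client, scheduler and workers; the sketch above only covers the local side and needs nothing beyond the standard library.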

EAF images have pinned versions to avoid issues caused by mismatched distributed versions, but for the coffea-casa-dask image, the pinned versions in the code seem to change extremely fast.

I noticed this repo has a GH Actions workflow that changes pinned versions constantly, which is not operationally sustainable for us, nor does it fit the production CI pipeline for the EAF. There are multiple ways to solve this, but basically we need to make sure that at least dask, distributed and dask-gateway are pinned in a "stable" way.
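One possible way to guard this in CI, sketched under the assumption that the pins live in a requirements-style list of `name==version` lines, is to fail the build when any of the three packages lacks an exact pin. The helper names `parse_pins` and `missing_pins` are made up for illustration and are not part of this repository.

```python
import re

# Matches an exact pin such as "dask==2.30.0"; ranges like ">=" do not count.
PIN_PATTERN = re.compile(r"^\s*([A-Za-z0-9_.\-]+)\s*==\s*([^\s\\;#]+)")


def parse_pins(requirements_text):
    """Return {package: version} for every exactly pinned line."""
    pins = {}
    for line in requirements_text.splitlines():
        match = PIN_PATTERN.match(line)
        if match:
            # Normalize names so "Dask_Gateway" and "dask-gateway" compare equal.
            name = match.group(1).lower().replace("_", "-")
            pins[name] = match.group(2)
    return pins


def missing_pins(requirements_text,
                 required=("dask", "distributed", "dask-gateway")):
    """List the required packages that have no exact '==' pin."""
    pins = parse_pins(requirements_text)
    return [name for name in required if name not in pins]


if __name__ == "__main__":
    example = "dask==2.30.0\ndistributed==2.30.1\ndask-gateway>=0.9.0\n"
    print(missing_pins(example))
```

A check like this only verifies that pins exist; how often they are bumped would still be a matter of process.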

Any ideas on how to make this happen?

CC. @holzman

Edit: Typo

@holzman

holzman commented Aug 31, 2021

+1. Having more stability in the release process is a good idea generally (not just to keep server and client versions in sync).

@oshadura
Member

oshadura commented Sep 1, 2021

Thanks for your report!

Do you know if dask and distributed have a recommended version to use with dask-gateway? Looking at the dask-gateway releases, there have been no new releases during the last year: https://github.com/dask/dask-gateway/tags, while dask and distributed have a monthly release. I think you have a valid point, but we probably need to report it in the dask-gateway repository and ask the Dask community to test and release matching versions. If you want, I can open an issue to understand how we can proceed.

Another thing I noticed is that you are using mismatched versions, which is strongly discouraged by Dask:

dask==2.30.0 \
distributed==2.30.1 \

(also please take into account that dask 2.30.0 was released almost one year ago: https://docs.dask.org/en/latest/changelog.html#id19)

We have pinned the dask and distributed packages according to their releases, since there are continuous improvements and bug fixes (including security fixes). Sadly, we don't have an image with Dask 2.30.0, since we created this repository much later...

As a small comment from the coffea-casa perspective: if we don't update the dask/distributed versions often (say every 2-3 months), it is really hard to update all versions together afterwards.

@holzman @mapsacosta I would also like to propose that we all meet and discuss what we can do to make this repository more usable for your case.

@mapsacosta
Collaborator Author

mapsacosta commented Sep 1, 2021

Hi @oshadura thanks for the quick reply!

There are three places where ideally, things should match:

  • dask-gateway-server -- Running in Kubernetes, curated by Dask developers
  • EAF dask environment -- Running in Kubernetes for users via JupyterHub, curated by EAF developers (so far me)
  • coffea-casa-dask -- Will run as the base image for Dask schedulers/clusters launched by Users via dask-gateway, curated by casa developers
# Versions on the Dask-Gateway image, from Dockerhub: https://hub.docker.com/r/daskgateway/dask-gateway-server/tags

$ docker run -it --entrypoint /opt/conda/bin/python --name dask-gateway --rm daskgateway/dask-gateway:0.9.0 -c "from distributed.versions import get_versions;print(get_versions())"
{'host': {'python': '3.8.3.final.0', 'python-bits': 64, 'OS': 'Linux', 'OS-release': '3.10.0-1160.25.1.el7.x86_64', 'machine': 'x86_64', 'processor': '', 'byteorder': 'little', 'LC_ALL': 'None', 'LANG': 'None'}, 'packages': {'python': '3.8.3.final.0', 'dask': '2.30.0', 'distributed': '2.30.1', 'msgpack': '1.0.0', 'cloudpickle': '1.6.0', 'tornado': '6.1', 'toolz': '0.11.1', 'numpy': '1.19.2', 'lz4': None, 'blosc': None}}

Note that we use upstream images for the dask-gateway API server and thus, use the same package versions released officially by Dask. This image has the mismatched versions of Dask and distributed you pointed out in our notebook code above, so if it's undesirable behavior, it's probably something we should feed upstream :)
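To make that comparison repeatable, the printed dictionary can be parsed and checked automatically. This is a minimal sketch: `SAMPLE_OUTPUT` is an abbreviated copy of the output above, and the helper names `packages_from_output` and `same_release` are hypothetical.

```python
import ast

# Abbreviated copy of the get_versions() output printed by the container.
SAMPLE_OUTPUT = (
    "{'host': {'python': '3.8.3.final.0'}, "
    "'packages': {'python': '3.8.3.final.0', 'dask': '2.30.0', "
    "'distributed': '2.30.1', 'tornado': '6.1', 'lz4': None}}"
)


def packages_from_output(printed):
    """Parse the printed dict literal and return its 'packages' mapping."""
    report = ast.literal_eval(printed.strip())
    return report["packages"]


def same_release(packages, names=("dask", "distributed")):
    """Return True when all named packages report the same version string."""
    versions = {packages.get(name) for name in names}
    return len(versions) == 1


if __name__ == "__main__":
    packages = packages_from_output(SAMPLE_OUTPUT)
    print(packages["dask"], packages["distributed"], same_release(packages))
```

`ast.literal_eval` is used instead of `eval` because the input is a plain dict literal and should never be executed as code. Whether a patch-level difference between dask and distributed actually causes problems is a question for upstream; the check only flags that the two strings differ.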

As a small comment from the coffea-casa perspective: if we don't update the dask/distributed versions often (say every 2-3 months), it is really hard to update all versions together afterwards.

2-3 months is a bit too long; we have controlled builds, but they run every week or so. I agree, we should not let versions fall behind... My concern really comes down to an automated process committing pinned versions to the code, rather than to the automated builds themselves :) In the end, all we need is some middle ground.

Let's meet to discuss and in the meantime, I will reach out to Dask developers about the versioning mismatch and potentially outdated images.

Edit: formatting

@mapsacosta
Collaborator Author

Fortunately, this seems to be known in the dask-gateway community and a release is on the horizon: dask/dask-gateway#381 (comment)

Here are a couple of issues that mention these same versioning discrepancies; I will follow them closely:
dask/dask-gateway#376
dask/dask-gateway#161

@oshadura
Member
