feat: support watching a specific namespace only #594

Closed

Conversation

@maxime1907 (Contributor) commented Aug 16, 2022

Proposal

Currently, dask-gateway fetches custom resources (DaskCluster, IngressRoute, ...) across all namespaces of the cluster.
This behavior is fine, but if we want to restrict it to a specific namespace, the current state of the stack does not allow it.
I have therefore thought of a way to adapt it with the following implementation, without introducing any breaking change.

Implementation

  • Add an environment variable WATCH_NAMESPACE that lets dask operators specify a namespace so that all Kubernetes calls launched within dask-gateway target only that specific namespace.
    If this environment variable is not set, dask-gateway starts normally in cluster-wide mode.
  • Change the Helm chart to support the WATCH_NAMESPACE environment variable via the following fields:
    • runInClusterScope: Run the whole stack either at the cluster scope or at the namespace scope
    • watchNamespace: Watch only a specific namespace
  • In namespaced mode, disable this traefik option: "--providers.kubernetescrd.allowCrossNamespace=true"
  • Change the RBAC files to reflect the requirements of this new behavior:
    • Cluster wide config: ServiceAccount + ClusterRole + ClusterRoleBinding
    • Namespaced config: ServiceAccount + Role + RoleBinding
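As a sketch, the Helm values described above might look like this; the field names come from the PR description, but the nesting and defaults are assumptions, not the chart's final layout:

```yaml
# values.yaml sketch (names from the PR description; nesting/defaults assumed)
runInClusterScope: true   # default: keep the current cluster-wide behavior
watchNamespace: ""        # empty means watch all namespaces

# A namespaced deployment would instead set, for example:
#   runInClusterScope: false
#   watchNamespace: dask-gateway   # hypothetical namespace name
```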

Behind the scenes

Cluster wide

From the user's and dask operator's point of view, there is no impact on the current behavior of dask-gateway, which fetches Kubernetes information cluster-wide (all namespaces).

Namespaced

From the user's and dask operator's point of view, everything is restricted to the namespace where the stack is deployed.
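The RBAC split between the two modes can be sketched as follows. The rule contents are illustrative only: `gateway.dask.org` is dask-gateway's CRD API group, but the exact resources and verbs in the chart's manifests may differ.

```yaml
# Cluster-wide config: ClusterRole (bound via ClusterRoleBinding), sketch only
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: dask-gateway
rules:
  - apiGroups: ["gateway.dask.org"]
    resources: ["daskclusters"]
    verbs: ["get", "list", "watch"]
---
# Namespaced config: the same rules as a Role, scoped to a single namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: dask-gateway
  namespace: dask-gateway   # hypothetical namespace
rules:
  - apiGroups: ["gateway.dask.org"]
    resources: ["daskclusters"]
    verbs: ["get", "list", "watch"]
```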

@maxime1907 maxime1907 changed the title fix: migrate cluster wide permissions to namespaced ones feat: support watching a specific set of namespaces Aug 16, 2022
@maxime1907 maxime1907 changed the title feat: support watching a specific set of namespaces feat: support watching a specific set of namespaces only Aug 16, 2022
@maxime1907 maxime1907 force-pushed the rbac-namespaced branch 3 times, most recently from 83b6491 to 805a856 Compare August 16, 2022 16:39
@maxime1907 maxime1907 changed the title feat: support watching a specific set of namespaces only feat: support watching a specific namespace only Aug 16, 2022
@maxime1907 maxime1907 force-pushed the rbac-namespaced branch 3 times, most recently from 3f2344d to bb0aebf Compare August 16, 2022 16:55
@maxime1907 (Contributor, Author):
cc @consideRatio

@consideRatio (Collaborator):
Hi @maxime1907!

Can you document what you have done and why in the PR description?

Ideally for changes made:

  • we come to an agreement that a change is worth pursuing (not done ahead of time in this case), which is often coupled with how it could be implemented and documented practically
  • describe the proposed/provided implementation: what is changed to accomplish the result? Focus both on summarizing user-facing changes and behind-the-scenes changes.

Try to lead with a very easy-to-understand summary, then go into more detail in the following parts.

@maxime1907 (Contributor, Author):
Hi @maxime1907!

Can you document what you have done and why in the PR description?

Ideally for changes made:

  • we come to an agreement that a change is worth pursuing (not done ahead of time in this case), which is often coupled with how it could be implemented and documented practically
  • describe the proposed/provided implementation: what is changed to accomplish the result? Focus both on summarizing user-facing changes and behind-the-scenes changes.

Try to lead with a very easy-to-understand summary, then go into more detail in the following parts.

Hi @consideRatio,
Thanks for the guidance, and sorry for not doing it properly in the first place.
I did my best to explain what this PR is about in its description 😉

@holzman (Contributor) commented Oct 24, 2022

@maxime1907: I've been looking at the same issue in #626 - my fault for not checking for open PRs first!

The only issue I see with this is that it doesn't look at gateway.backend.namespace. My thought: rather than add more configuration values, what about the following:

If gateway.backend.namespace is not set (or set to the same as the release namespace), only watch the release namespace; otherwise, keep things in cluster scope.

Of course this changes the default behavior (since gateway.backend.namespace is null by default), but I think that's a good thing and the default behavior is overly permissive; most users don't want to grant cluster-wide permission to read secrets.
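A minimal sketch of that rule as a Helm helper (the helper name and file are hypothetical, not taken from the actual chart):

```yaml
# _helpers.tpl sketch: outputs the namespace to watch, or nothing for cluster scope
{{- define "dask-gateway.watchNamespace" -}}
{{- $backendNs := .Values.gateway.backend.namespace | default .Release.Namespace -}}
{{- if eq $backendNs .Release.Namespace -}}
{{- /* backend runs in the release namespace: watch only that namespace */ -}}
{{- .Release.Namespace -}}
{{- else -}}
{{- /* a separate backend namespace: empty output, meaning cluster-wide watching */ -}}
{{- end -}}
{{- end -}}
```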

@maxime1907 (Contributor, Author) commented Oct 26, 2022

@maxime1907: I've been looking at the same issue in #626 - my fault for not checking for open PRs first!

The only issue I see with this is that it doesn't look at gateway.backend.namespace. My thought: rather than add more configuration values, what about the following:

If gateway.backend.namespace is not set (or set to the same as the release namespace), only watch the release namespace; otherwise, keep things in cluster scope.

Of course this changes the default behavior (since gateway.backend.namespace is null by default), but I think that's a good thing and the default behavior is overly permissive; most users don't want to grant cluster-wide permission to read secrets.

Hi @holzman,

Nice idea about putting the configuration in gateway.backend

About changing the default behavior, I prefer to keep it as it currently is (cluster-wide), because changing it would be a breaking change for all users.

I'd prefer something like:
If gateway.backend.namespace is not set, then cluster-wide (default behavior).
If gateway.backend.namespace is set, then it is restricted to that specific namespace (even if it is the same namespace as the release, specify it).

I am waiting for @consideRatio's thoughts on the subject.

@holzman (Contributor) commented Oct 26, 2022

I'd prefer something like:
If gateway.backend.namespace is not set, then cluster-wide (default behavior).
If gateway.backend.namespace is set, then it is restricted to that specific namespace (even if it is the same namespace as the release, specify it).

Well, this doesn't quite make sense to me: if gateway.backend.namespace is unset, we don't need cluster-wide permissions since everything will be in the same namespace. OTOH,
if gateway.backend.namespace is set and different from the release namespace, we will need cluster-level permissions.

Not my preference, but if we really want to preserve the current behavior, then we could add the runInClusterScope helm value, but catch the case where runInClusterScope is false while the specified namespace differs from the release namespace, something like:

{{- if and (not .Values.runInClusterScope) (ne .Values.gateway.backend.namespace .Release.Namespace) }}
{{- fail "runInClusterScope is false, but a namespace different than the release namespace was specified" }}
{{- end }}

Maybe also emit a warning for the inverse case as well: cluster permissions were configured but are probably over-permissive.
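Helm has no first-class warning mechanism, so such a notice would typically be surfaced via NOTES.txt. A sketch under the same assumed values (runInClusterScope, gateway.backend.namespace), complementing the fail-check above:

```yaml
# templates/NOTES.txt sketch (hypothetical; value names assumed as above)
{{- $backendNs := .Values.gateway.backend.namespace | default .Release.Namespace }}
{{- if and .Values.runInClusterScope (eq $backendNs .Release.Namespace) }}
WARNING: runInClusterScope is true, but everything runs in the release
namespace; namespaced permissions (Role/RoleBinding) would be sufficient.
{{- end }}
```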

@maxime1907 (Contributor, Author):
I'd prefer something like:
If gateway.backend.namespace is not set, then cluster-wide (default behavior).
If gateway.backend.namespace is set, then it is restricted to that specific namespace (even if it is the same namespace as the release, specify it).

Well, this doesn't quite make sense to me: if gateway.backend.namespace is unset, we don't need cluster-wide permissions since everything will be in the same namespace. OTOH, if gateway.backend.namespace is set and different from the release namespace, we will need cluster-level permissions.

Not my preference, but if we really want to preserve the current behavior, then we could add the runInClusterScope helm value, but catch the case where runInClusterScope is false while the specified namespace differs from the release namespace, something like:

{{- if and (not .Values.runInClusterScope) (ne .Values.gateway.backend.namespace .Release.Namespace) }}
{{- fail "runInClusterScope is false, but a namespace different than the release namespace was specified" }}
{{- end }}

Maybe also emit a warning for the inverse case as well: cluster permissions were configured but are probably over-permissive.

Ah, my bad on this one, I don't know what I was thinking, you are actually right! 🙂

Then we just need to determine whether @consideRatio wants to keep the current behavior or is open to changing it.

@holzman (Contributor) commented Dec 16, 2022

@consideRatio: gentle reminder that Maxime and I would like feedback about changing the default behavior before we update this PR and #626 and resubmit.

Current behavior:

  • gateway.backend.namespace unset: DaskClusters deploy in current namespace, dask-gateway monitors all namespaces
  • gateway.backend.namespace set: DaskClusters deploy in selected namespace, dask-gateway monitors all namespaces

Suggested change:

  • gateway.backend.namespace unset: DaskClusters deploy in current namespace, dask-gateway monitors current namespace
  • gateway.backend.namespace set: DaskClusters deploy in selected namespace, dask-gateway monitors selected namespace

Thinking about it more, this is also a good way to prevent a corner case: in the unlikely event that multiple namespaces are running dask-gateway, they may currently see each other's DaskClusters, which is almost certainly not intended behavior.

@consideRatio (Collaborator):
@holzman @maxime1907 I opened #661 to discuss this and left some suggestions on the path forward at the bottom of the text!

I'm low on maintenance capacity, and this functionality is quite a big deal with regard to the complexity of this project. Because of that, I felt a need to think through the configuration options we ought to provide, to ensure that maintenance of the feature is sustainable going forward.

@consideRatio (Collaborator) commented Dec 21, 2022

For my own sanity, since I struggle to hold discussions about this in multiple locations, I'll close this and the other PR, which you are most welcome to re-create with fresh changes, perhaps just as they are. I just want to reset the discussion a bit, because I felt it jumped into technical details before the choice of how to configure things was decided.

Let's focus discussion in #661; once we arrive at a configuration API that seems OK, we can quickly get a PR merged implementing what's agreed on.
