Enhance/Replace k8s python client. #1291

Open
BalaBalaYi opened this issue Oct 12, 2024 · 1 comment
BalaBalaYi (Collaborator) commented Oct 12, 2024

Background

Currently, DLRover uses the official Kubernetes Python client to interact with the Kubernetes API server. This part of the implementation is important because it manages the lifecycle of training workers. However, the Python client has inherent limitations and lags behind the Go (and Java) clients (for example, it lacks an informer implementation), which occasionally causes unexpected issues in certain scenarios. Therefore, we intend to do one of the following:

[Option 1] Replace the current Python client with the Go client (requires CFFI).
[Option 2] Enhance the Kubernetes client implementation in Python.

Requirement

The enhancement/replacement should ensure compatibility with all existing Kubernetes-related calls while also adapting the usage in dlrover/python/scheduler/kubernetes.py.

  1. Ensure compatibility with all related features (no regression).
  2. Reimplement the watch mechanism (dlrover/python/master/watcher/k8s_watcher.py) using an informer.
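
To make the informer requirement concrete, here is a minimal sketch of the list-then-watch pattern with a local cache. It is not DLRover's code and not an API of the official client: `SimpleInformer`, `list_func`, `watch_func`, and the fake data are all hypothetical names chosen for illustration, and objects are assumed to be plain dicts shaped like Kubernetes JSON.

```python
import threading
from typing import Any, Callable, Dict, Iterable, List, Optional, Tuple

# Assumption: objects are plain dicts with metadata.name and
# metadata.resourceVersion, mirroring the Kubernetes JSON shape.
Obj = Dict[str, Any]
Event = Dict[str, Any]  # {"type": "ADDED" | "MODIFIED" | "DELETED", "object": Obj}


class SimpleInformer:
    """List once, then watch from the returned resourceVersion, keeping a cache."""

    def __init__(
        self,
        list_func: Callable[[], Tuple[List[Obj], str]],
        watch_func: Callable[[str], Iterable[Event]],
        on_event: Optional[Callable[[str, Obj], None]] = None,
    ):
        self._list_func = list_func
        self._watch_func = watch_func
        self._on_event = on_event or (lambda event_type, obj: None)
        self._cache: Dict[str, Obj] = {}
        self._lock = threading.Lock()
        self.resource_version = ""

    def get(self, name: str) -> Optional[Obj]:
        with self._lock:
            return self._cache.get(name)

    def names(self) -> List[str]:
        with self._lock:
            return sorted(self._cache)

    def sync_once(self) -> None:
        """Run one list+watch cycle; a long-running loop would repeat this."""
        items, self.resource_version = self._list_func()
        with self._lock:
            self._cache = {o["metadata"]["name"]: o for o in items}
        for event in self._watch_func(self.resource_version):
            self._apply(event)

    def _apply(self, event: Event) -> None:
        obj = event["object"]
        name = obj["metadata"]["name"]
        with self._lock:
            if event["type"] == "DELETED":
                self._cache.pop(name, None)
            else:
                self._cache[name] = obj
            self.resource_version = obj["metadata"]["resourceVersion"]
        # Handlers run outside the lock so they can read the cache safely.
        self._on_event(event["type"], obj)


# Fake list/watch sources so the sketch runs without a cluster.
def _pod(name: str, rv: str, phase: str = "Running") -> Obj:
    return {"metadata": {"name": name, "resourceVersion": rv}, "status": {"phase": phase}}


def fake_list() -> Tuple[List[Obj], str]:
    return [_pod("worker-0", "10")], "10"


def fake_watch(resource_version: str) -> Iterable[Event]:
    yield {"type": "ADDED", "object": _pod("worker-1", "11", "Pending")}
    yield {"type": "MODIFIED", "object": _pod("worker-1", "12")}
    yield {"type": "DELETED", "object": _pod("worker-0", "13")}


seen = []
informer = SimpleInformer(
    fake_list, fake_watch, lambda t, o: seen.append((t, o["metadata"]["name"]))
)
informer.sync_once()
print(informer.names(), informer.resource_version)
```

With the official Python client, `watch_func` could plausibly wrap `kubernetes.watch.Watch().stream(...)` around a list call such as `list_namespaced_pod`, passing `resource_version`. A production informer would also need to re-list when the watch expires (HTTP 410 Gone) and resync periodically, which this sketch omits.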

The replacement scheme requires evaluation of:

  1. Limitations on different system platforms.
  2. Issues with cross-language object transfer.
  3. Potential performance overhead.
  4. Additional costs for building and deployment.
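
As one way to reason about items 2 and 3, the sketch below assumes the Go side would hand objects to Python as JSON bytes, and times only the Python-side decoding. The Go shim is hypothetical and is simulated here by `fake_go_list_pods`; `PodSummary` and `decode_pods` are likewise illustrative names, not part of any existing library.

```python
import json
import time
from dataclasses import dataclass
from typing import List


@dataclass
class PodSummary:
    name: str
    phase: str


def fake_go_list_pods(count: int) -> bytes:
    """Stand-in for a hypothetical Go function exported through CFFI.

    A real shim would return a C buffer holding JSON; here we just build bytes.
    """
    items = [
        {"metadata": {"name": f"worker-{i}"}, "status": {"phase": "Running"}}
        for i in range(count)
    ]
    return json.dumps({"items": items}).encode("utf-8")


def decode_pods(payload: bytes) -> List[PodSummary]:
    """Convert the JSON payload into typed Python objects."""
    doc = json.loads(payload)
    return [
        PodSummary(name=item["metadata"]["name"], phase=item["status"]["phase"])
        for item in doc["items"]
    ]


payload = fake_go_list_pods(1000)
start = time.perf_counter()
pods = decode_pods(payload)
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"decoded {len(pods)} pods from {len(payload)} bytes in {elapsed_ms:.2f} ms")
```

A real evaluation would also have to account for serialization on the Go side, the buffer copy across the FFI boundary, and who frees the returned memory, none of which this measures.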

The enhancement scheme requires evaluation of:

  1. Feasibility and complexity of implementation.

BalaBalaYi changed the title from "Replace k8s python client with go client." to "Enhance k8s python client." on Oct 12, 2024
BalaBalaYi changed the title from "Enhance k8s python client." to "Enhance/Replace k8s python client." on Oct 12, 2024

Mukku27 commented Oct 26, 2024

Hello @BalaBalaYi,
Please assign this issue to me.
I plan to enhance the Python Kubernetes client in dlrover/python/scheduler/kubernetes.py to address #1291 by adding an informer-like watch mechanism, improved error handling with retries, and resource caching to reduce API calls. This approach should improve performance and preserve compatibility without moving to the Go client.
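
For the "error handling with retries" part, a minimal sketch of a retry decorator with exponential backoff and jitter might look like the following. The `retry` decorator, its parameters, and `flaky_get_pod` are hypothetical; nothing here is taken from DLRover or the Kubernetes client. The `sleep` parameter is injectable only so the example runs instantly.

```python
import random
import time
from functools import wraps


def retry(max_attempts=3, base_delay=0.5, max_delay=8.0,
          retry_on=(Exception,), sleep=time.sleep):
    """Retry the wrapped call on the given exceptions with exponential backoff."""

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except retry_on:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the last error
                    # Delay doubles per attempt, capped at max_delay,
                    # plus up to 50% random jitter.
                    delay = min(max_delay, base_delay * 2 ** (attempt - 1))
                    sleep(delay + random.uniform(0, delay / 2))

        return wrapper

    return decorator


# Demo: a call that fails twice before succeeding. Delays are recorded
# instead of slept so the example finishes immediately.
calls = {"count": 0}
delays = []


@retry(max_attempts=4, base_delay=0.1, retry_on=(ConnectionError,), sleep=delays.append)
def flaky_get_pod():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("API server unavailable")
    return {"metadata": {"name": "worker-0"}}


result = flaky_get_pod()
print(result["metadata"]["name"], "after", calls["count"], "attempts")
```

With the official client, `retry_on` would likely be the client's `ApiException`, and the wrapper would probably need to inspect the status code so that non-transient errors such as 404 are not retried.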
