Blog Post: Introducing JobSet #45759

Open
wants to merge 2 commits into
base: main
from

Conversation

danielvegamyhre
Member

We would like to publish a blog post introducing JobSet, a K8s native API for distributed ML training and HPC workloads.

cc @ahg-g @kannon92 I think we still need to align on one example and ideally make it more concrete and polished. We should also explain the user story above it.

@k8s-ci-robot k8s-ci-robot added area/blog Issues or PRs related to the Kubernetes Blog subproject cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. language/en Issues or PRs related to English language sig/docs Categorizes an issue or PR as relevant to SIG Docs. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 2, 2024

netlify bot commented Apr 2, 2024

Pull request preview available for checking

Built without sensitive environment variables

Name Link
🔨 Latest commit 95ac8fd
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-io-main-staging/deploys/6795206ba4e5380008256cb0
😎 Deploy Preview https://deploy-preview-45759--kubernetes-io-main-staging.netlify.app

@kannon92
Contributor

kannon92 commented Apr 2, 2024

/cc @haircommander

Trying to find an impartial reviewer ;)

@haircommander
Contributor

I have one note, but I found this informative, and it does a good job of laying the groundwork needed.

LGTM (assuming the note is unaddressable)

Contributor

@sftim sftim left a comment

Some partial feedback (not yet a full review)

@sftim
Contributor

sftim commented Apr 2, 2024

(if this is not yet ready for review by the blog team, please change the title to start with [WIP])

@sftim
Contributor

sftim commented Apr 2, 2024

/hold

pending assignment of publication date

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 2, 2024
@danielvegamyhre danielvegamyhre changed the title from "Blog Post: Introducing JobSet" to "[WIP] Blog Post: Introducing JobSet" Apr 2, 2024
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 2, 2024
@danielvegamyhre
Member Author

> (if this is not yet ready for review by the blog team, please change the title to start with [WIP])

Thanks Tim, added [WIP] to the title.

@sftim
Contributor

sftim commented Apr 3, 2024

I propose the 26th of April as publication date. Does that work?

@sftim
Contributor

sftim commented May 3, 2024

Let's pick a new publication date. How about the 7th of May?

@sftim
Contributor

sftim commented May 3, 2024

You should remove [WIP] from the PR title @danielvegamyhre if / when you think this is ready to be reviewed.

@ahg-g
Member

ahg-g commented May 31, 2024

/lgtm

Thanks @danielvegamyhre !

@danielvegamyhre
Member Author

@ahg-g I saw your comment that you were planning to clone and submit the PR registering the JobSet annotations/labels since the original author stopped responding. If that's done, we should merge this next.

---
layout: blog
title: "Introducing JobSet"
date: 2024-04-02
Member

What publication date should we target?

Contributor

Let's go for 2025-01-10 - does that work?

Member

@ahg-g ahg-g Jan 6, 2025

yeah, that works

Member Author

I set the publication date to 2025-01-13, since it took me a few days to find time to address these last comments and I want to leave sufficient time for a final review.

Comment on lines 80 to 81
Replicated Jobs
: In modern data centers, hardware accelerators like GPUs and TPUs allocated
Member

The headers look strange. Should we move this line up and use a bold font for the header?

Contributor

IMO this is the right markdown

If the style doesn't look right: we'll take a PR (or you can file an issue).

@sftim sftim self-requested a review January 6, 2025 13:08
@sftim sftim dismissed their stale review January 6, 2025 13:08

Review was stale

@sftim
Contributor

sftim commented Jan 6, 2025

@danielvegamyhre if you're comfortable rebasing, please rebase this against main; we've changed the site since this PR was first opened.

@sftim
Contributor

sftim commented Jan 6, 2025

See #45759 (comment) about the proposed publication date.

@ahg-g
Member

ahg-g commented Jan 6, 2025

@danielvegamyhre if you don't have time to address the remaining comments, I can take over the PR, please let me know.


linux-foundation-easycla bot commented Jan 9, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: danielvegamyhre / name: Daniel Vega-Myhre (95ac8fd, 0ec6407)

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. and removed cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jan 9, 2025
@danielvegamyhre
Member Author

@ahg-g @sftim I addressed the latest comments and rebased on main. Excited to get this published!

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Jan 9, 2025
@ahg-g
Member

ahg-g commented Jan 10, 2025

Thanks Daniel!

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 10, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ahg-g
Once this PR has been reviewed and has the lgtm label, please ask for approval from sftim. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

---
layout: blog
title: "Introducing JobSet"
date: 2024-01-13
Member

@sftim we missed the date; what new date do you suggest we put here?

Contributor

Ah, sorry about that.

@danielvegamyhre can you add draft: true into the front matter and a new future & weekday publication date? Then we can merge it as a draft and the process to fix things up is simpler.

Member Author

> Ah, sorry about that.
>
> @danielvegamyhre can you add draft: true into the front matter and a new future & weekday publication date? Then we can merge it as a draft and the process to fix things up is simpler.

Sure, done. I chose Feb 3rd this time to give us some breathing room.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 25, 2025
@k8s-ci-robot k8s-ci-robot requested a review from ahg-g January 25, 2025 17:33
@ahg-g
Member

ahg-g commented Jan 25, 2025

/lgtm
@sftim please take a look, hopefully one last time.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 25, 2025
Contributor

@sftim sftim left a comment

This PR needs more work.

If you're not sure how to check it, try:

# assumes a clean working directory with no uncommitted changes etc
git fetch <remote> pull/45759/head:pr-45759
git switch main
make container-image
git pull --ff-only
git switch -c pr-45759-merge-preview
git merge pr-45759 --no-ff
make container-serve
xdg-open http://localhost:1313/blog # might just be "open" not "xdg-open"

and then you should have a new browser window showing a preview. Most edits show up live. You need a container runtime like Docker or Podman; see the Makefile for hints.
You can also do something similar on Windows, but we haven't yet documented how.

different names.

<figure>
<img src="jobset_diagram.svg">
Contributor

This doesn't render when I preview the article. I'm afraid that working images are a prerequisite for a merge.

Please consider using a figure shortcode with an alt attribute set.

Member

@ahg-g ahg-g Jan 27, 2025

Would replacing this block with the following work?

{{< figure src="jobset_diagram.svg" alt="JobSet Architecture" class="diagram-large" clicktozoom="true" >}}

or

![JobSet Architecture](jobset_diagram.svg)

@sftim
Contributor

sftim commented Jan 26, 2025

/lgtm cancel
See #45759 (comment)

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 26, 2025
@k8s-ci-robot k8s-ci-robot requested a review from sftim January 26, 2025 14:11
Comment on lines +86 to +89
<figure>
<img src="jobset_diagram.svg">
<figcaption><h4>JobSet Architecture</h4></figcaption>
</figure>
Member

Suggested change
- <figure>
- <img src="jobset_diagram.svg">
- <figcaption><h4>JobSet Architecture</h4></figcaption>
- </figure>
+ {{< figure src="jobset_diagram.svg" alt="JobSet Architecture" class="diagram-large" clicktozoom="true" >}}

Member

can you please just apply the change?

Contributor

I'm happy for someone to pick this up, but I really don't want to be the single load-bearing part in getting blog articles published. Because of how much I value distributing the work, I don't plan to do this step myself. @ahg-g you are welcome to ask other contributors to help out here.

Labels
area/blog Issues or PRs related to the Kubernetes Blog subproject cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. language/en Issues or PRs related to English language sig/docs Categorizes an issue or PR as relevant to SIG Docs. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
Status: Requires update
8 participants