BGD-6060 Flush events chunks before final heartbeat #236

Merged 1 commit on Oct 30, 2024

@abunghez commented Oct 29, 2024

Jira ticket

https://spotinst.atlassian.net/browse/BGD-6060

Description

Update spark-watcher to 0.6.1. Before sending the final heartbeat to an application, make sure that the event cache is flushed to persistent storage.
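
The ordering described above can be illustrated with a minimal sketch. It is not the spark-watcher implementation: `EventCache`, `finalize`, and `send_heartbeat` are hypothetical names, and a plain dict stands in for persistent storage. The sketch only shows the idea that buffered event chunks are written out before the final heartbeat is sent.

```python
from dataclasses import dataclass, field


@dataclass
class EventCache:
    """Hypothetical in-memory buffer of event chunks, keyed by application ID."""

    store: dict  # stands in for persistent storage (assumption)
    pending: dict = field(default_factory=dict)

    def add(self, app_id: str, chunk: str) -> None:
        # Buffer a chunk until the next flush.
        self.pending.setdefault(app_id, []).append(chunk)

    def flush(self, app_id: str) -> None:
        # Move every buffered chunk for this application into the store.
        chunks = self.pending.pop(app_id, [])
        self.store.setdefault(app_id, []).extend(chunks)


def finalize(app_id: str, cache: EventCache, send_heartbeat) -> None:
    """Flush first, then send the final heartbeat (hypothetical callback)."""
    cache.flush(app_id)
    send_heartbeat(app_id, final=True)
```

With this ordering, anything that reacts to the final heartbeat can already read the complete set of event chunks from storage, which matters most for applications that finish quickly.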

Demo

The apps have the following IDs:

Labels:  bigdata.spot.io/application-id=andrei-beefed-pi-24dfa-hangs
         bigdata.spot.io/application-internal-id=53f827b4-8426-44f2-a2f6-2caf3d4ae617

Labels:  bigdata.spot.io/application-id=andrei-beefed-pi-29e7a-facts
         bigdata.spot.io/application-internal-id=e474666f-52f9-446a-ab86-fdcbf96b016a
                                                                         

Checklist

  • I have added a Jira ticket link
  • I have filled in the test plan
  • I have executed the tests and filled in the test results
  • I have updated/created relevant documentation

How to test

To test this change, you need a DP cluster. Run a very short application. After the application finishes, its logs should be in the archive bucket (and downloadable, if the feature is enabled in the UI).

Test plan and results

Short app test

  1. Run a short app.
  2. Watch the spark-watcher logs for the "Deleting application from event cache" message for this application.
  3. After the message appears, wait for the kubelogs pipeline to be finalized on the control plane (indicated by a message in Kibana).

Expected result:

After the app is finalized in the CP, the archive buckets should contain the event logs (and if the feature is enabled in the UI, the tabulated file should be available for download).

| Test | Description | Result | Notes |
|------|-------------|--------|-------|
| 1 | Short app test | Pass | application-internal-id=53f827b4-8426-44f2-a2f6-2caf3d4ae617 (on prod) |
| 2 | Short app test | Pass | application-internal-id=e474666f-52f9-446a-ab86-fdcbf96b016a (on prod) |

@abunghez requested a review from a team as a code owner October 29, 2024 15:13

@crezvoy left a comment

LGTM

@abunghez merged commit 7b1f644 into main Oct 30, 2024
2 checks passed
@abunghez deleted the BGD-6060-flush-events-on-final-hb branch October 30, 2024 11:03