[BP-1.20][FLINK-34227][runtime] Makes the JobMaster close procedure more robust to IO thread leaks from the scheduler side #26115

Merged: 2 commits merged into apache:release-1.20 on Feb 7, 2025

@XComp XComp commented Feb 6, 2025

This is a combined backport of the two parent PRs #24489 and #26095.

There was one minor conflict, related to the Time/Duration refactoring that happened in 2.0.

XComp and others added 2 commits February 6, 2025 11:50
…pache#24489)

* Makes stopping the job execution in JobMaster run in the main thread
* Makes JobMaster#closeAsync more robust to disconnect calls during shutdown
* Adds test to cover FLINK-34227 scenario
…duler instance (i.e. the JobMaster)

- Adds test for checking whether the scheduler closing leaks an IO thread via the CheckpointsCleaner to the *SchedulerTests
- Makes CheckpointsCleaner available in AdaptiveBatchSchedulerFactory.createScheduler
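
The leak check described in the commit notes above can be pictured with a small, self-contained sketch. It does not use Flink's classes: `TrackingCleaner`, `cleanAsync`, `pendingCleanups` and `pendingAfterClose` are hypothetical names invented for illustration, and the real tests in the *SchedulerTests may be structured quite differently. The idea shown is only that a close call should not complete while cleanup work is still queued on the IO executor.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative sketch only; none of these types are part of Flink. */
public class LeakCheckSketch {

    /** Hypothetical stand-in for a component that offloads cleanup to an IO executor. */
    static final class TrackingCleaner {
        private final AtomicInteger pending = new AtomicInteger();
        private CompletableFuture<Void> allCleanups = CompletableFuture.completedFuture(null);

        /** Runs the cleanup on the IO executor and tracks it until it has finished. */
        synchronized void cleanAsync(Runnable cleanup, ExecutorService ioExecutor) {
            pending.incrementAndGet();
            CompletableFuture<Void> tracked =
                    CompletableFuture.runAsync(cleanup, ioExecutor)
                            .whenComplete((ignored, error) -> pending.decrementAndGet());
            allCleanups = CompletableFuture.allOf(allCleanups, tracked);
        }

        /** Completes only after every cleanup submitted so far has finished. */
        synchronized CompletableFuture<Void> closeAsync() {
            return allCleanups;
        }

        int pendingCleanups() {
            return pending.get();
        }
    }

    /** Submits one slow cleanup, waits for close, and reports what is still pending. */
    static int pendingAfterClose() throws Exception {
        ExecutorService ioExecutor = Executors.newSingleThreadExecutor();
        try {
            TrackingCleaner cleaner = new TrackingCleaner();
            cleaner.cleanAsync(() -> sleep(50), ioExecutor);
            cleaner.closeAsync().get(5, TimeUnit.SECONDS);
            return cleaner.pendingCleanups();
        } finally {
            ioExecutor.shutdownNow();
        }
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("pending cleanups after close: " + pendingAfterClose());
    }
}
```

A test along these lines would fail if `closeAsync` returned before the IO task finished, which is the kind of leak the commit message refers to.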
@@ -204,6 +205,7 @@ public static AdaptiveBatchScheduler createScheduler(
ScheduledExecutorService futureExecutor,
ClassLoader userCodeLoader,
CheckpointRecoveryFactory checkpointRecoveryFactory,
CheckpointsCleaner checkpointsCleaner,
Time rpcTimeout,

Here's where the conflict appeared: for the backport, Duration needed to be changed to Time.

@XComp XComp changed the title on Feb 6, 2025
from: [FLINK-34227][runtime] Makes the JobMaster close procedure more robust to IO thread leaks from the scheduler side
to: [BP-1.20][FLINK-34227][runtime] Makes the JobMaster close procedure more robust to IO thread leaks from the scheduler side

flinkbot commented Feb 6, 2025

CI report:

@XComp XComp merged commit 9689af6 into apache:release-1.20 Feb 7, 2025
@XComp XComp deleted the FLINK-34227-1.20 branch February 7, 2025 10:37

XComp commented Feb 7, 2025

I merged this one without an approval because the code change was approved in the parent PRs, the conflict was minor, and CI was successful.
