Long boot time on V3 #1432

Open
AvihaiSam opened this issue Jan 16, 2025 · 2 comments

Comments

@AvihaiSam

Hi,

I'm trying to migrate from V2 (amazon-kclpy==2.1.5) to V3 (amazon-kclpy==3.0.1).
The migration process finished properly and the workers seem to work fine.

The issue is that it takes about 4 minutes from the moment KCL is launched until my Python process starts.
With V2 it took several seconds.
The kcl.properties file is the one provided in the samples (apart from the naming and the command that launches the Python process).

There are too many logs for me to work out what is making the boot so slow.

There are 4 shards on the stream and a single worker running in an EKS pod.

Would appreciate any help here...

@etspaceman
Contributor

I am experiencing this too. I think it has a lot to do with KCL's new method of distributing worker leases, since collecting that information takes time. I'd really like a way to make this friendlier to local environments, where my "workers" are simply threads on my local machine.

@etspaceman
Contributor

etspaceman commented Jan 23, 2025

Update on my end. I was able to solve the long boot time on local runs by setting the LeaseManagementConfig's failoverTimeMillis to a lower value. It seems that this value is multiplied by 2 (the multiplier is not configurable) and used as a delay when identifying workers to assign to leases.

That said - production deployments are bound to be slower because worker metrics need to be collected before leases are assigned. This does seem to be an intended effect of the new 3.x series, but I do think the initial lease assignment needs consideration for the case where no leases are active, so that initial launches are quicker.
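
For anyone who wants to try the workaround described above on a local run, here is a minimal Python sketch that lowers failoverTimeMillis in the text of a kcl.properties file. It assumes the `failoverTimeMillis` key in the properties file is what feeds the LeaseManagementConfig setting; the helper names (`set_property`, `estimated_assignment_delay_ms`), the sample file contents, and the value 1000 are all hypothetical. The 2x factor is taken from the observation in the comment above, not from the KCL documentation.

```python
import re


def set_property(properties_text: str, key: str, value: str) -> str:
    """Return the properties text with `key = value` set.

    Replaces the first uncommented `key = ...` line if one exists,
    otherwise appends a new line at the end.
    """
    # Match only uncommented lines; [ \t]* avoids swallowing blank lines.
    pattern = re.compile(rf"^[ \t]*{re.escape(key)}[ \t]*=.*$", re.MULTILINE)
    new_line = f"{key} = {value}"
    if pattern.search(properties_text):
        return pattern.sub(lambda _match: new_line, properties_text, count=1)
    separator = "" if not properties_text or properties_text.endswith("\n") else "\n"
    return f"{properties_text}{separator}{new_line}\n"


def estimated_assignment_delay_ms(failover_time_millis: int) -> int:
    """Rough delay before lease assignment, assuming the 2x factor
    reported in this thread (an observation, not a documented guarantee)."""
    return 2 * failover_time_millis


# Hypothetical excerpt of a kcl.properties file.
sample = (
    "executableName = sample_kclpy_app.py\n"
    "streamName = my-stream\n"
    "failoverTimeMillis = 10000\n"
)

patched = set_property(sample, "failoverTimeMillis", "1000")
print(patched)
print("estimated delay (ms):", estimated_assignment_delay_ms(1000))
```

A low value like this is only meant for local runs with a single worker. In a multi-worker deployment, a short failover time makes workers give up on each other's leases sooner, so it is a trade-off rather than a free speed-up.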
