fix: tcp_keepalive socket #3140
base: develop
This is also impacting me. Unfortunately we are invoking Lambda from ECS via AWS Batch, which doesn't support adding these new options in the task definition yet.
This issue is the same for me.
Experiencing similar issues.
Hi @nateprewitt / @jonathan343 / @alexgromero / @SamRemis, I have an Airflow instance running on AWS and I'm using the Airflow I’ve tested the changes of this PR, and it is working as expected, handling Lambda functions that take up to 15 minutes to run without issues. Is there anything else needed for the review and merging process? I would appreciate any feedback and updates on its status. Thank you!
bump. any movement on this PR? my 200s lambda sync invocations are constantly failing with
When using the `botocore.config.Config` option `tcp_keepalive=True`, the TCP socket is configured with the keepalive socket option (`socket.SO_KEEPALIVE`). By default, Linux sets the TCP keepalive time parameter to 7200 seconds, which exceeds the AWS NAT Gateway default timeout of 350 seconds [source]. This limitation leads to an inability to receive a response from a Lambda function under the following conditions:
Therefore, by configuring `socket.TCP_KEEPIDLE`, `socket.TCP_KEEPINTVL` and `socket.TCP_KEEPCNT` when `tcp_keepalive` is enabled during the `_compute_socket_options` function call, we can overcome this limitation. `socket.IPPROTO_TCP` is used to support cross-platform compatibility.

The code submitted automatically calculates these values based on the read timeout. Another option would be to have them supplied in the scope/client object.
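
For readers who want to see the shape of the idea, here is a minimal sketch using only the standard library. It is not the code in this PR: the helper names `keepalive_socket_options` and `apply_options`, the `probes` parameter, and the scaling rule (first probe at half the read timeout, capped at half of the 350-second NAT Gateway timeout) are all assumptions made for illustration.

```python
import socket

# Assumed ceiling: the 350-second NAT Gateway idle timeout cited above.
NAT_IDLE_TIMEOUT = 350


def keepalive_socket_options(read_timeout, probes=3):
    """Hypothetical helper: build (level, option, value) tuples for keepalive.

    The scaling rule is an illustrative assumption, not the PR's formula.
    """
    # Send the first probe well before either the read timeout or the NAT
    # idle timeout would expire; never go below one second.
    idle = max(1, min(int(read_timeout) // 2, NAT_IDLE_TIMEOUT // 2))
    # Space follow-up probes so they also fit inside the idle window.
    interval = max(1, idle // probes)

    options = [(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)]
    # Not every platform exposes all three constants, so add only those present.
    for name, value in (
        ("TCP_KEEPIDLE", idle),
        ("TCP_KEEPINTVL", interval),
        ("TCP_KEEPCNT", probes),
    ):
        constant = getattr(socket, name, None)
        if constant is not None:
            options.append((socket.IPPROTO_TCP, constant, value))
    return options


def apply_options(sock, options):
    """Apply each (level, option, value) tuple to an existing socket."""
    for level, option, value in options:
        sock.setsockopt(level, option, value)
```

With a 900-second read timeout (a 15-minute Lambda invocation), this sketch would start probing after 175 seconds of idle time and repeat every 58 seconds, which stays below the 350-second limit. Looking the constants up with `getattr` is one way to tolerate platforms that do not define all of them.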
Fixes issues: boto/boto3#2424, boto/boto3#2510 and #2916.
Fargate recently had a similar solution implemented to support this use case: https://aws.amazon.com/blogs/containers/announcing-additional-linux-controls-for-amazon-ecs-tasks-on-aws-fargate/.