Bump Ansible SSH ControlPersist param to 300s
We see the following error downstream [1]. [2] indicates that this error
occurs when the SSH socket Ansible creates to multiplex connections fails.

Rather than handling the socket failure, Ansible fails the task. The task
reportedly fails around 30% of the time on a very long-running CI job.

300s was selected because it is the value used by the openstack-ansible project [3] [4].

Jira: https://issues.redhat.com/browse/OSPRH-10719

[1]
```
TASK [reproducer : Ensure we can ping controller-0 from ctlplane _raw_params=ping -c2 controller-0.utility] ***
fatal: [hypervisor -> ceph-1(ceph-1.hypervisor)]: FAILED! => changed=false
  module_stderr: ''
  module_stdout: ''
  msg: |-
    MODULE FAILURE
    See stdout/stderr for the exact error
  rc: -13
```

[2] ansible/ansible#78344
[3]
https://opendev.org/openstack/openstack-ansible/src/commit/32c6aa2cec1a2145e2c20a37df23f8b4e4b93e4c/scripts/openstack-ansible.rc#L52
[4] https://opendev.org/openstack/openstack-ansible/commit/cbdba67ad0b5a3e29db390c8e6b66721719184c0
lewisdenny committed Nov 4, 2024
1 parent 599aab0 commit ab289f3
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion ansible.cfg
@@ -16,4 +16,4 @@ inventory = inventory.yml
 pipelining = True
 any_errors_fatal = True
 [ssh_connection]
-ssh_args = -o ControlMaster=auto -o ControlPersist=60s
+ssh_args = -o ControlMaster=auto -o ControlPersist=300s
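
For reference, a sketch of how the `[ssh_connection]` section reads once this change is applied. The comments are explanatory annotations added here, not part of the committed file, and they describe the general OpenSSH meaning of each option rather than anything specific to this repository.

```ini
[ssh_connection]
# ControlMaster=auto: reuse an existing master connection for the host if
# one is available, otherwise create a new one.
# ControlPersist=300s: keep the master connection (and its control socket)
# open in the background for up to 300 seconds of idleness after the
# initial client connection closes. The previous value was 60s.
ssh_args = -o ControlMaster=auto -o ControlPersist=300s
```

A longer persist window presumably means the control socket is less likely to have expired between widely spaced tasks in a long-running job, which is the failure mode described in [2].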
