gluster_volume task fails when removing brick from disconnected peer #32

Open
mtruneck opened this issue Oct 25, 2024 · 1 comment

mtruneck commented Oct 25, 2024

The scenario:

  • gluster_volume is correctly configured across an Ansible group
  • one of the group's peers disconnects permanently (the cluster was scaled down in our case)
  • the gluster_volume task is run again with the disconnected peer removed from the group

Expected result

  • gluster_volume task automatically removes the bricks from the disconnected peer

Actual behaviour

  • it fails with ValueError: invalid literal for int() with base 10.

Reason

The problem is on line 430 in gluster_volume.py (the if condition inside the loop):

def reduce_config(name, removed_bricks, replicas, force):
    out = run_gluster(['volume', 'heal', name, 'info'])
    summary = out.split("\n")
    for line in summary:
        if 'Number' in line and int(line.split(":")[1].strip()) != 0:
            module.fail_json(msg="Operation aborted, self-heal in progress.")

The code expects the output of gluster volume heal [name] info to look like this:

Brick 10.10.1.102:/opt/volume
Status: Connected
Number of entries: 0

But when the peer is disconnected, the output is:

Brick 10.10.1.102:/opt/volume
Status: Transport endpoint is not connected
Number of entries: -

So the condition on line 430 raises ValueError: after split(":")[1].strip() the value is '-', which int() cannot parse.
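One possible fix, as a minimal sketch rather than a tested patch: count only the lines whose entry count is a plain integer, and skip the "-" that gluster prints for an unreachable brick. The helper name pending_heal_entries and the sample strings are hypothetical and are not part of gluster_volume.py. The sketch also assumes it is acceptable to ignore the heal state of a disconnected brick, which seems reasonable when that brick is the one being removed.

```python
def pending_heal_entries(heal_info_output):
    """Sum the 'Number of entries' counts in `gluster volume heal <name> info` output.

    Counts that are not plain integers (for example "-", printed for a
    brick whose transport endpoint is not connected) are skipped.
    """
    total = 0
    for line in heal_info_output.split("\n"):
        if "Number of entries" not in line:
            continue
        # Split once so only the text after the first colon is examined.
        value = line.split(":", 1)[1].strip()
        if value.isdigit():
            total += int(value)
    return total


# Sample outputs modelled on the ones quoted in this issue.
connected = (
    "Brick 10.10.1.102:/opt/volume\n"
    "Status: Connected\n"
    "Number of entries: 0\n"
)
disconnected = (
    "Brick 10.10.1.102:/opt/volume\n"
    "Status: Transport endpoint is not connected\n"
    "Number of entries: -\n"
)

print(pending_heal_entries(connected))
print(pending_heal_entries(disconnected))
```

With a helper like this, the check in reduce_config could become a single comparison (pending_heal_entries(out) != 0) before calling module.fail_json, so a disconnected brick no longer aborts the task with a ValueError.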

Task used:

- name: Configure Gluster volume.
  gluster_volume:
    state: present
    name: "{{ brick_name }}"
    brick: "{{ brick_dir }}"
    replicas: "{{ groups[gluster_group] | length }}"
    cluster: "{{ groups[gluster_group] | map('extract', hostvars, 'ansible_all_ipv4_addresses') | map('select', 'search', '^10\\.') | map('first') | list }}"
    host: "{{ ansible_all_ipv4_addresses | select('search', '^10\\.') | first }}"
    force: yes
  run_once: true
@ckoliber

I have the same issue, and it seems this problem can be fixed with a small change.
Is there a maintainer for this project?
@pkesavap
