Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong units exposed by the IngressPerAppEndpoint #260

Closed
gboutry opened this issue Oct 3, 2023 · 6 comments · Fixed by #263
Closed

Wrong units exposed by the IngressPerAppEndpoint #260

gboutry opened this issue Oct 3, 2023 · 6 comments · Fixed by #263

Comments

@gboutry
Copy link

gboutry commented Oct 3, 2023

Bug Description

After a scenario of scale up and scale down, traefik exposed only the missing unit.

To Reproduce

  1. deploy traefik and an application
  2. scale up the application to 2 (wait for readiness)
  3. scale down the application back to 1
  4. Look at traefik routes

Environment

Traefik: 1.0/candidate 148
Juju: 3.1.5-genericlinux-amd64
Juju agent: 3.1.5
Microk8s: MicroK8s v1.28.0 revision 5802
OS: Ubuntu 22.04.3 LTS
ingress.py: 2.6

Relevant log output

Logs too long for github comments

Debug logs: https://pastebin.ubuntu.com/p/3JsfT9TGQp/
Traefik container logs: https://pastebin.ubuntu.com/p/ZVx2RK4XBR/

Additional context

Show unit:

  - relation-id: 15
    endpoint: ingress
    related-endpoint: ingress-public
    application-data:
      model: '"zaza-de71889d82db"'
      name: '"neutron"'
      port: "9696"
      redirect-https: "false"
      scheme: '"http"'
      strip-prefix: "false"
    related-units:
      neutron/0:
        in-scope: true
        data:
          host: '"neutron-0.neutron-endpoints.zaza-de71889d82db.svc.cluster.local"'

Traefik definition:

# root@traefik-public-0:/opt/traefik/juju# cat juju_ingress_ingress_15_neutron.yaml 
http:
  routers:
    juju-zaza-de71889d82db-neutron-router:
      entryPoints:
      - web
      rule: PathPrefix()
      service: juju-zaza-de71889d82db-neutron-service
    juju-zaza-de71889d82db-neutron-router-tls:
      entryPoints:
      - websecure
      rule: PathPrefix()
      service: juju-zaza-de71889d82db-neutron-service
      tls:
        domains:
        - main: 10.20.20.3
          sans:
          - '*.10.20.20.3'
  services:
    juju-zaza-de71889d82db-neutron-service:
      loadBalancer:
        servers:
        - url: http://neutron-1.neutron-endpoints.zaza-de71889d82db.svc.cluster.local:9696

Servers url changed from having neutron-0 and neutron-1 to neutron-1 only, and after some time back to neutron-0 only.

@PietroPasotti
Copy link
Collaborator

Can't seem to reproduce in unittests.
Will try next to reproduce in a local env

@simskij
Copy link
Member

simskij commented Oct 3, 2023

@gboutry can you please give us some repro instructions? thanks :)

@gboutry
Copy link
Author

gboutry commented Oct 3, 2023

I just redid the reproduction steps, and I could see that the departing unit was not removed from the servers.

After an update status, it was fixed.

I don't know what to say more on reproduction steps.
I literally scaled the app to 2 units, then back to one. I could see that, after ingress-relation-departed, the configuration file still had the unit that was departed.
Configuration was fixed after an update status.

@gboutry
Copy link
Author

gboutry commented Oct 4, 2023

After some investigation, it looks like loadbalancer's servers are not updated on scale down: https://github.com/canonical/traefik-k8s-operator/blob/main/lib/charms/traefik_k8s/v2/ingress.py#L247C9-L247C9

@PietroPasotti
Copy link
Collaborator

if you're hinting at the absence of a departed handler, that could be it. I thought I remembered -departed would be followed by a -changed, but not according to the docs.

However this should be easy to reproduce in unittests, which I didn't manage to do. Let me give another shot at it.

It might also be an instance of canonical/operator#888 but in a departed situation rather than broken.

@PietroPasotti
Copy link
Collaborator

found it! #263

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants