Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: handle invalid combination of external_hostname and routing_path #420

Merged
merged 3 commits into from
Nov 15, 2024

Conversation

ca-scribner
Copy link
Contributor

Issue

Previously, this charm accepted the following configuration:

  • routing_mode=subdomain
  • external_hostname="" # (unset)

When external_hostname is unset, the url provided for any application related by ingress uses the LoadBalancer's external address, which may be an IP. In these cases, it would provide charms urls like model.app.1.2.3.4, which are invalid URLs (the last segment of a URL cannot be all-numeric). This led to an uncaught pydantic validation error when calling ipa.publish_url() because that method includes validation of the URL, putting the charm in Error state. An example of this is shown in #361

Fixes #361

Solution

Ideally, the fix here would be to validate the charm configuration+LoadBalancer details and halt charm execution if the configuration was invalid, putting the charm into BlockedStatus until resolved. The problem is that the current architecture of this charm makes that solution challenging. The charm is designed to atomically handle events (doing only the work a particular event needs) rather than holistically (recomputing the world on each event), meaning that skipping or losing track of events leads to undesired charm states. Also, ipa.publish_url() is called deep in most (all?) event handlers, making it difficult to properly handle these errors at the charm level without a major refactor of the charm.

As a compromise, the following has been done:

  • the traefik_k8s/v2/ingress.py library's publish_url method has been updated to catch the pydantic validation error cited in Unable to use Load Balancer's IP address for the ingress gateway #361 and log it rather than raise it to the charm. The library then writes ingress=None to the databag instead of the invalid URL, giving a soft indication to the user that the url is invalid.
  • the config.yaml descriptions for routing_mode and external_hostname have been updated to explain the incompatibility in these settings
  • config validation has been added to the init of the charm for routing_mode and external_hostname. If routing_mode==subdomain and hostname is unset, the charm will log warnings for the user about the possible incompatibility (but will not block the charm)

The upshot of these changes is that this charm will:

  • not go into an unresponsive error state
  • as best it can given the current charm architecture, warn the user of the misconfiguration
  • not risk losing events through defer or getting event sequencing wrong

Context

#361

Testing Instructions

Everything added here is tested in unit tests. Integration tests were not added since it can be fully tested at the unit level.

To manually integration test:

# clone this branch

# pack traefik
charmcraft pack

# pack a tester charm
mkdir -p tests/integration/testers/ipa/lib/charms
cp -r lib/charms/traefik_k8s tests/integration/testers/ipa/lib/charms
cp -r lib/charms/observability_libs tests/integration/testers/ipa/lib/charms
cd tests/integration/testers/ipa
charmcraft pack
cd -

# deploy and relate
juju add-model traefik 
juju deploy ./traefik-k8s_ubuntu-20.04-amd64.charm --resource traefik-image=docker.io/ubuntu/traefik:2-22.04 --trust
juju deploy ./tests/integration/testers/ipa/ipa-requirer-mock_ubuntu-20.04-amd64.charm

juju relate traefik-k8s ipa-requirer-mock

# wait for everything to settle

# inspect relation data and see `ingress` url is set
juju show-unit ipa-requirer-mock/0 | yq '.ipa-requirer-mock/0.relation-info[0].application-data.ingress'
# {"url": "http://10.64.140.43/traefik-ipa-requirer-mock"}

# update to subdomain routing and confirm `ingress` url is now unset
juju config traefik-k8s routing_mode=subdomain
# wait to settle
juju show-unit ipa-requirer-mock/0 | yq '.ipa-requirer-mock/0.relation-info[0].application-data.ingress'
# null

# add an external_hostname and confirm `ingress` url is again set
juju config traefik-k8s external_hostname=foo.com
# wait to settle
{"url": "http://traefik-ipa-requirer-mock.foo.com/"}

Upgrade Notes

no effect on upgrade

…outing_path

Previously, this charm accepted the following configuration:
* routing_mode=subdomain
* external_hostname="" # (unset)

When external_hostname is unset, the url provided for any application related by ingress uses the LoadBalancer's external address, which may be an IP.  In these cases, it would provide charms urls like `model.app.1.2.3.4`, which are invalid URLs (the last segment of a URL cannot be all-numeric).  This led to an uncaught pydantic validation error when calling `ipa.publish_url()` because that method includes validation of the URL, putting the charm in Error state.  An example of this is shown in #361

Ideally, the fix here would be to validate the charm configuration+LoadBalancer details and halt charm execution if the configuration was invalid, putting the charm into BlockedStatus until resolved.  The problem is that the current architecture of this charm makes that solution challenging.  The charm is designed to atomically handle events (doing only the work a particular event needs) rather than holistically (recomputing the world on each event), meaning that skipping or losing track of events leads to undesired charm states.  Also, `ipa.publish_url()` is called deep in most (all?) event handlers, making it difficult to properly handle these errors at the charm level without a major refactor of the charm.

As a compromise, the following has been done:
* the traefik_k8s/v2/ingress.py library's `publish_url` method has been updated to catch the pydantic validation error cited in #361 and log it rather than raise it to the charm.  The library then writes ingress=None to the databag instead of the invalid URL, giving a soft indication to the user that the url is invalid.
* the config.yaml descriptions for routing_mode and external_hostname have been updated to explain the incompatibility in these settings
* config validation has been added to the __init__ of the charm for routing_mode and external_hostname.  If routing_mode==subdomain and hostname is unset, the charm will log warnings for the user about the possible incompatibility (but will not block the charm)

The upshot of these changes is that this charm will:
* not go into an unresponsive error state
* as best it can given the current charm architecture, warn the user of the misconfiguration
* not risk losing events through defer or getting event sequencing wrong

Fixes #361
@ca-scribner ca-scribner merged commit f824025 into main Nov 15, 2024
14 checks passed
@ca-scribner ca-scribner deleted the openg-2438 branch November 15, 2024 16:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unable to use Load Balancer's IP address for the ingress gateway
2 participants